LabWindows/CVI

Does CVI support the new, dual-core processors?

And here's yet another instance of serial i/o problems using the NI rs-232 library on a multi-core PC -

A DLL I wrote using CVI 6.x has worked fine for years.  We recently started using it on a new system, and fairly regularly the serial I/O would mysteriously hang up.  The DLL is single threaded, and uses the NI rs-232 library with a "built in" com port on a P4 hyperthreaded micro/motherboard.

I forced the new system to run as single core (changed boot.ini using msconfig) and now it runs without hanging up.

Is there a flaw in the NI rs-232 library?  There have been several separately reported instances of applications that use the ni rs-232 library failing when run on a multi-core microprocessor. 

Somebody at NI understands this issue I am quite sure.  Why else would NI advertise a serial board (843x) as being "hyperthread compatible"?  Why wouldn't you expect a CVI app using the rs-232 library to work on a multi-core system?  Why would you have to procure a "hyperthread compatible" com port card?

I have used the 843x boards, the NI 1.8 serial library, and the same rs-232 library and this combo runs just fine on multi-core systems.

Can I use the NI 1.8 serial driver on the built-in com ports on a multi-core PC?  Will this fix the problem?

Menchar

Message 11 of 20

Menchar,

First of all, as always, I would recommend using the latest NI Serial driver. However, in your case, nothing for Windows serial has changed since NI Serial version 1.8, so I would first suggest upgrading to this version to see if this helps.

You are correct in saying that NI advertises our boards as hyperthread compatible (link). That said, I'm not sure if our older drivers supported this technology. This would lead me to recommend that you try the latest drivers and the latest version of the runtime engine, because the latest releases are optimized to work with each other.

With that in mind, I don't think that there is a problem with NI's serial library because you stated yourself that the application works fine with NI serial boards and NI's serial library. When you are having problems, it sounds like you are using built in serial ports, which may not be hyperthread compatible themselves (even though the drivers themselves are).

Furthermore, is your program hanging from the start? Or is it hanging on the first serial execution? From the way you described it, it sounds like this is not the case, but rather that it hangs randomly during execution. If so, it sounds to me like NI's software is working fine, at least for a portion of the time. I'm not sure why your program would be hanging; however, I don't know all of the specifics of your built-in serial port.

As a temporary workaround, you noted that you could force your computer to function on one processor. This, obviously, is not optimal for every other application on your computer; however, you can actually force a thread to function on just one processor. This is described in the document Multithreading in LabWindows/CVI. If you can apply this technique to your applications that are not using NI Serial boards on multiprocessor systems it should help it not hang.

Matt Mueller
National Instruments

Message 12 of 20
Matt -
 
Thanks for the reply, I appreciate it.
 
I'm not using the NI serial driver or add-in serial ports at all on the systems that show the problem.  I am using the NI rs-232 library, which, by my understanding, runs on top of the HAL in user address space and uses the native Win32 serial driver (or other installed serial driver I imagine).  In all cases we're running on a built-in com port (implemented in the processor chip set I believe), typically "COM1".  When I set the NUM PROC flag to 1 in boot.ini using msconfig the application works reliably.
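
For anyone wanting to reproduce the single-processor setting, the boot.ini entry ends up looking roughly like the fragment below. The /NUMPROC=1 switch is the part msconfig adds (BOOT.INI tab, Advanced Options). The ARC path and the description string shown here are typical XP defaults and are assumptions, so keep whatever your own file already contains.

```ini
[operating systems]
; Assumed default ARC path and description; only /NUMPROC=1 is the relevant change.
multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect /NUMPROC=1
```

The setting takes effect at the next reboot and applies to the whole machine, not just to the application using the com port.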
 
I would not normally expect that the com ports built into multi-core Pentium motherboards would be faulty but I guess that's a possibility.  I've seen the problem on several P4 hyperthreaded systems here at Raytheon and if you search the discussion forums I think you'll see that several others have reported what appears to be the same problem. 
 
If the MB's / chipsets / comports are faulty on Hyperthreaded P4's then there's nothing I would expect NI to be able to do about it.
 
If the native Windows serial driver is faulty, (or the NI rs-232 library / native driver combination is faulty) then I could see that using the NI serial driver against a built-in com port would help.  I take it that I can install and use the NI 1.8 serial driver on built-in com ports, even if I don't have an NI add-in serial board?  Is the NI 1.8 serial driver a kernel mode driver (i.e. below the HAL)?
 
The more relevant question to me in re the NI 843x product line is why are they advertised to be "hyperthread compatible"?  Is it because other NI serial port add-in boards are not hyperthread compatible?  Does com port hardware need specific features in order to be hyperthread compatible?
 
If one vendor's board/driver/library combo works together, that doesn't prove that the constituent parts necessarily are all kosher when considered separately or in other combinations. 
 
I've asked before about the runtime engine, and I think it's something of a canned answer - do you know of any specific hyperthreading-related issue that later CVI rte's addressed?
 
The program does not hang from the start, but at apparently random points in execution during serial I/O.  This is characteristic of thread coordination problems - a vulnerability exists that doesn't get exposed until a pathological combination of thread execution, I/O, etc. occurs.
 
I'm familiar with setting processor affinity for a thread or process, and it isn't as simple as it seems.  Constraining a particular process or thread to a single core doesn't prevent any other process or thread from, for example, using the NI rs-232 library / serial driver from multiple threads that are scheduled on any of the cores.  On the other hand, forcing everything onto a single core does guarantee that all threads using the rs-232 library / serial driver (whomever's) / com port (whomever's) aren't concurrent at the hardware level in the micro.  So as a quick fix I'm using a single core.  If I had the time I suppose I could experiment to see if processor affinity cures it as well.
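
For reference, here is a minimal sketch of the processor affinity experiment mentioned above. It assumes a Win32 build for the actual pinning; the helper names single_core_mask and pin_process_to_core are made up for illustration. SetProcessAffinityMask and GetCurrentProcess are standard Win32 calls. On any other platform the pinning function simply reports that it did nothing.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Bit mask selecting exactly one logical processor (0-based index).
   Returns 0 if the index does not fit in the mask. */
unsigned long single_core_mask(unsigned core)
{
    if (core >= sizeof(unsigned long) * 8)
        return 0UL;
    return 1UL << core;
}

/* Restrict every thread of the current process to one logical processor.
   Returns 1 on success, 0 on failure or a bad index,
   and -1 when not compiled for Windows. */
int pin_process_to_core(unsigned core)
{
    unsigned long mask = single_core_mask(core);

    if (mask == 0UL)
        return 0;
#ifdef _WIN32
    return SetProcessAffinityMask(GetCurrentProcess(), (DWORD_PTR)mask) ? 1 : 0;
#else
    (void)mask;   /* nothing to do outside Windows in this sketch */
    return -1;
#endif
}
```

A process-wide mask also covers worker threads that libraries create inside the process, which a per-thread mask on the main thread would not. It still does nothing about other processes or about kernel-mode driver work, which is the limitation described above.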
 
I'm interested in finding out where the problem is - are we buying faulty motherboards or is it something else?
 
Thanks for your input, Matt.
 
Menchar
 
Message Edited by menchar on 09-19-2006 03:20 PM

Message 13 of 20
Hi Menchar,

From what you describe, my guess is that whatever driver is being used for your built-in serial port (likely some driver that shipped as a part of the OS) is not multithread/hyperthread safe.  The CVI RS-232 library is implemented via Win32 API calls, which in turn use the "best" driver it can find to actually control the hardware.  For NI serial boards, this would be the NI-Serial driver, but for the built-in ports it would likely be some other driver.  If that driver is not properly written to handle being called concurrently (and with multiple cores, this is true concurrency; not the simulated concurrency that comes with multiple threads running on a single processor), you could conceivably get deadlocks, which would manifest as hangs.  The fact that the same program runs without hangs on the NI hardware strongly indicates that the flaw does not lie in the RS-232 library.  Also, as a side note, even if your application is single-threaded, the RS-232 library uses worker threads for each port that pass queued data to the Win32 API.  You can try disabling the output queue (pass -1 as the output queue size in OpenComConfig) to prevent using the worker threads, but that is not guaranteed to fix whatever deadlock may be occurring.

I would suspect that if you could manage to use the NI-Serial driver for the built-in ports it would in fact solve the problem, but I'm not sure if that is possible.  It would likely require that your built-in ports had an NI-Serial-compatible UART, and then you would probably have to muck around with some .inf files and registry keys.  Unfortunately, this is not something I have the first idea about how to do.

Hope this helps.

Mert A.
National Instruments


Message 14 of 20
Matt -
 
I think I understand concurrency pretty well - we've deployed concurrent apps built with CVI for many years. 
 
I had never seen this problem until we tried using hyperthreaded P4's.
 
Thanks for the insight on the multiple threads in the rs-232 library.  I often do defeat the output queue but it may not have been done in this particular app.  At one time as I recall even if you did defeat the queueing with OpenComConfig() the OS still used a default queue of 4096 bytes.  Well, constraining the system to a single core would also affect the behavior of the multi-threaded NI rs-232 library - the worker threads all run on a single core with pseudo concurrency.  It could be that the rs-232 library is not doing anything "wrong" - but it might be exposing some vulnerability in the hardware / native driver.
 
I won't try installing the NI serial driver on these systems - it wouldn't be used even if I did it.
 
These problems can be very subtle and hard to find.  
 
You may be interested to know that early versions of the NI-DAQ library had concurrency problems.  I was an early adopter of the newly "thread safe" NI-DAQ library.  I used it with two NI-DAQ boards in the same system and it could deadlock when called from multiple threads.    After a major NI-DAQ re-write things got better though it was too late for my project.
 
Thanks again Matt.
 
Menchar  
Message 15 of 20

Menchar,

Mert added some of the information that I didn't know and I think it answers most of your questions. As Mert indicated, I don't think it is a problem with your built-in com ports; rather, it is a problem with the drivers that RS-232 is using to talk to your com ports. I assume these drivers are not hyperthread compatible, so on single processor computers you would not encounter these deadlocks. I would try to find out which driver the RS-232 library is selecting to use with your com ports and attempt to find out if that driver is hyperthread compatible.

Just to touch on a few questions from a couple of posts ago:

You asked if other NI serial products are hyperthread compatible. The answer is: "All NI PCI and PXI serial interfaces offer multiprocessor and hyperthreading support..." 

You asked if I knew of any hyperthread additions in the runtime engine, and I don't know if there were any fixes specifically targeted at this. However, NI's Serial driver did change for Windows in version 1.8.

Hope this helps and thanks again for your continuing support of NI and its products - it really helps to have dedicated customers that help us ensure that our hardware and software is working correctly.

Matt Mueller
National Instruments

Message 16 of 20

Thanks Matt - dang, I was calling Mert Matt 😉  Thanks Mert too. 

Well, the serial driver being used is whatever installs from the WinXP Pro SP2 media.  And that may well be the culprit, though if the WinXP native serial driver were faulty under hyperthreading you'd think we'd have heard about it as such.

Menchar

Message 17 of 20
Menchar,

I believe the serial drivers that are being used are serial.sys and serenum.sys. I can't find anywhere that states whether they are or are not hyper/multithread compatible, but I would assume they are not.

With that said, have you tried downloading the latest service packs from Windows? I know on Windows 2000 there was a serial fix on one of the releases, so I wonder if the drivers were upgraded in some of the later service packs.

Matt Mueller
National Instruments
Message 18 of 20
Matt / Mert -
 
The one thing we know with certainty is that on Hyperthreaded P4's running XP Pro setting the NUM PROC = 1 in boot.ini solves the problem.  We've run all week now without any errors after making this change.
 
It's a very confusing issue - lots of variations.
 
It looked to me that in this last case we had, the hang occurred after a write as it waited for a response.   The timeout was 15 seconds.
 
I have queue sizes of 8K in both directions:
 
 #define CR            0x0D
 #define LF            0x0A
 #define SPACE         0x20
 #define BIT_RATE      115200
 #define PARITY        0      /* no parity */
 #define DATA_WIDTH    7
 #define STOP_BITS     1
 #define IN_QUEUE      8000   /* input queue size, bytes */
 #define OUT_QUEUE     8000   /* output queue size, bytes */
 #define COM_TIMEOUT   15     /* seconds */
 #define RESET_TIMEOUT 12
 #define BUFF_SIZE     5200
 
 /* Open the port with 8K queues in both directions, then check the library's last error. */
 OpenComConfig (ComPort, "", BIT_RATE, PARITY, DATA_WIDTH, STOP_BITS, IN_QUEUE, OUT_QUEUE);
 if ((iStatus = ReturnRS232Err()) < 0) {
     sprintf (szMessage, "InitializeXFLIR: Com port initialization failed for port %i.  Error is %i", ComPort, iStatus);
     writeLogMessageString (szMessage);
     return INIT_ERROR;
 }
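
Since the hang shows up while waiting for a response after a write, one way to at least turn a silent hang into a logged error is an application-level deadline around the read. The following is only a sketch under assumptions: read_until, poll_byte_fn, and the demo source are hypothetical names, and the callback stands in for whatever actually fetches a byte (in CVI it could plausibly wrap GetInQLen and ComRdByte). It will not help if the call into the driver itself never returns.

```c
#include <stddef.h>
#include <time.h>

#define READ_TIMEOUT  (-1)   /* no terminator seen before the deadline */
#define READ_ERROR    (-2)   /* callback failed or arguments were bad */
#define READ_OVERFLOW (-3)   /* buffer filled before the terminator */

/* Hypothetical callback: stores one byte in *out and returns 1,
   returns 0 if no byte is waiting, or a negative value on error. */
typedef int (*poll_byte_fn)(void *ctx, char *out);

/* Collect bytes until 'terminator' arrives or 'timeout_s' seconds pass.
   Returns the number of bytes stored (terminator excluded) or a READ_* code.
   buf is always NUL-terminated when it is valid. */
int read_until(poll_byte_fn poll, void *ctx, char *buf, size_t size,
               char terminator, double timeout_s)
{
    time_t start = time(NULL);   /* 1-second resolution; coarse but portable */
    size_t n = 0;

    if (poll == NULL || buf == NULL || size == 0)
        return READ_ERROR;

    for (;;) {
        char c;
        int got = poll(ctx, &c);

        if (got < 0) { buf[n] = '\0'; return READ_ERROR; }
        if (got > 0) {
            if (c == terminator) { buf[n] = '\0'; return (int)n; }
            if (n + 1 >= size)   { buf[n] = '\0'; return READ_OVERFLOW; }
            buf[n++] = c;
            continue;
        }
        /* Nothing waiting: give up once the deadline has passed.
           A real loop would sleep briefly here instead of spinning. */
        if (difftime(time(NULL), start) >= timeout_s) {
            buf[n] = '\0';
            return READ_TIMEOUT;
        }
    }
}

/* Demo source that replays a fixed string, standing in for a real port. */
struct demo_source { const char *data; size_t pos; };

int demo_poll(void *ctx, char *out)
{
    struct demo_source *src = (struct demo_source *)ctx;

    if (src->data[src->pos] == '\0')
        return 0;                 /* nothing more to deliver */
    *out = src->data[src->pos++];
    return 1;
}
```

With the settings above, the call would use CR as the terminator and COM_TIMEOUT as the deadline, and a READ_TIMEOUT result would be written to the log rather than leaving the caller blocked.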
 
I'd be surprised to learn that the standard WinXP serial driver isn't hyperthread compatible.  Microsoft says WinXP supports hyperthreading.  I suppose it's possible.
 
I'd more likely think that maybe some of these systems have early chip sets that have a vulnerability - the built-in com port(s) are usually implemented in the chip set I believe.
 
Then again, running everything on a single core slows the system down - you could imagine any number of race conditions in application code that could cause this, though that shouldn't be the case here.  The application is in VB6 and calls the DLL I implemented in CVI.  It turns out that the VB6 runtime is multi-threaded itself, though the VB app does not explicitly create any threads.  But I've seen the problem on apps implemented solely in CVI too.
 
Heck, it could even be a side effect of some other, apparently unrelated driver in the kernel that's behaving badly on multi-core systems.  We use a framegrabber that has a multi-threaded DLL on many of these systems.
 
Well, thanks for the support, I'll continue to monitor this issue and see if we don't eventually figure it out.  We have a bunch of dual AMD Opteron systems too but they use the NI 843x card (I used the 843x in part because of the hyperthread compatibility claim by NI) and we haven't seen the problem there.
 
Menchar

Message Edited by menchar on 09-21-2006 10:54 AM

Message 19 of 20
Menchar,

As you pointed out, you are configuring the port to use an 8K output queue.  I would try disabling it.  If the deadlock is occurring just after a write, it seems plausible that it is related to the worker thread and main thread accessing the driver simultaneously.  I believe that Windows does keep an output buffer of at least 4K regardless of the size passed to OpenComConfig, but if you pass a negative value, the RS-232 library performs writes synchronously instead of in a worker thread.  I would be interested to see if this is a viable workaround.
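
As a rough sketch of that change, using the same settings from your post with only the output queue size swapped for -1. The wrapper name open_port_sync_writes is made up for illustration. The stand-in OpenComConfig in the #else branch is not the real library; it exists only so the fragment compiles outside CVI (this assumes the CVI compiler defines _CVI_) and it just records the queue sizes it was given.

```c
#ifdef _CVI_
#include <rs232.h>            /* real LabWindows/CVI RS-232 library */
#else
/* Stand-in for builds outside CVI: records the queue sizes, opens nothing. */
static int g_last_in_queue;
static int g_last_out_queue;

static int OpenComConfig(int portNumber, const char *deviceName, long baudRate,
                         int parity, int dataBits, int stopBits,
                         int inputQueueSize, int outputQueueSize)
{
    (void)portNumber; (void)deviceName; (void)baudRate;
    (void)parity; (void)dataBits; (void)stopBits;
    g_last_in_queue  = inputQueueSize;
    g_last_out_queue = outputQueueSize;
    return 0;
}
#endif

#define BIT_RATE       115200
#define PARITY         0
#define DATA_WIDTH     7
#define STOP_BITS      1
#define IN_QUEUE       8000
#define OUT_QUEUE_SYNC (-1)   /* negative size: writes done synchronously, no worker thread */

/* Open the port exactly as before, except with the output queue disabled.
   Returns whatever OpenComConfig returns (negative values indicate an error). */
int open_port_sync_writes(int comPort)
{
    return OpenComConfig(comPort, "", BIT_RATE, PARITY, DATA_WIDTH,
                         STOP_BITS, IN_QUEUE, OUT_QUEUE_SYNC);
}
```

The existing ReturnRS232Err() check can stay as it is after the call. Note that with synchronous writes, ComWrt will block until the data has been handed off, so a long write holds up the calling thread.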

If you give it a shot, please do post back with your findings.

Mert A.
National Instruments
Message 20 of 20