12-22-2005 12:38 PM
While working around the debugger losing track of VI operations, I found that the TCP stream was corrupted. Yet no error is reported by the first TCP Read.vi, which is used to get the incoming message length. As a result, the second read eventually receives a length parameter for a message of effectively random size.
The TCP stream is guaranteed to be ordered and complete. The same TCP data source is feeding several other clients for chart display purposes without a problem.
Although no errors were flagged by TCP Read.vi, I suspected a possible overrun of the TCP/IP stack input buffer: the calling VI loop needed more CPU time to transfer TCP messages to the RT FIFO. I raised this VI's priority to high. As a result: no more TCP stream data corruption. Frame size is always 5126.
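Since LabVIEW block diagrams can't be shown in text, here is a minimal Python sketch of the two-read pattern described above: a first read for the message-length header, then a second read for the payload. The 4-byte big-endian header and the `read_exact`/`read_message` names are my assumptions for illustration, not from the original application; the 5126-byte frame size comes from the post. Note that a desynchronized stream shows up as a wildly wrong length value, not as a TCP-level error, which matches the symptom described.

```python
import socket
import struct

EXPECTED_FRAME_SIZE = 5126  # constant frame size reported in the post

def read_exact(sock, n):
    """Read exactly n bytes, looping until done."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)

def read_message(sock):
    # First read: the length header (assumed 4-byte big-endian here)
    (length,) = struct.unpack(">I", read_exact(sock, 4))
    # Sanity check: an out-of-sync stream produces a garbage length
    # here rather than a TCP error, so validate before the second read.
    if length != EXPECTED_FRAME_SIZE:
        raise ValueError(f"unexpected message length {length}")
    # Second read: the payload itself
    return read_exact(sock, length)
```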
The application has been running fine for 20 hours now.
Issue solved. Unfortunately, I can't observe CPU load margins; I find the Execution Trace Toolkit is not easy to use with a desktop PC target (lots of CPU operations).
If my observations are correct, here are some resulting suggestions:
1-Is it possible to check and possibly improve the TCP library error messages on the Pharlap-based LabVIEW RT, so that overrun errors are not missed?
2-It could be worth looking into why the RT thread seems to stop and the debugger loses track when TCP Read.vi is called in a subroutine with an improper data size.
3-I also changed the TCP read mode to buffered. IMHO, it is important to do so to avoid losing track of the stream in case of a timeout. Reading an incomplete packet would provide incomplete data and lose the position of the next data-packet-length message.
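To illustrate the reasoning behind point 3, here is a hedged Python analogue of a buffered read mode: bytes received before a timeout are kept in an internal buffer rather than returned as a short read, so the length/payload framing stays in sync. The `FramedReader` class and the 4-byte header are assumptions for illustration, not the actual TCP Read.vi implementation.

```python
import socket
import struct

class FramedReader:
    """Keeps partial data across timeouts so the stream stays in sync
    (a sketch of what 'buffered' read mode provides)."""

    def __init__(self, sock):
        self.sock = sock
        self.buf = bytearray()

    def _fill(self, n):
        # Return True once the buffer holds at least n bytes. On a timeout,
        # bytes read so far stay in self.buf instead of being handed back
        # as a short read, so the stream position is never lost.
        while len(self.buf) < n:
            try:
                chunk = self.sock.recv(4096)
            except socket.timeout:
                return False  # try again later; position preserved
            if not chunk:
                raise ConnectionError("peer closed the connection")
            self.buf.extend(chunk)
        return True

    def read_message(self):
        if not self._fill(4):
            return None  # length header not complete yet
        (length,) = struct.unpack(">I", self.buf[:4])
        if not self._fill(4 + length):
            return None  # payload not complete yet
        msg = bytes(self.buf[4:4 + length])
        del self.buf[:4 + length]
        return msg
```

With an unbuffered read, a timeout mid-message would return the partial bytes, and the next header read would land in the middle of the old payload.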
Thank you for your help.
12-27-2005 04:37 PM
Hi bobp,
It looks like you've investigated this quite a bit. However, I still have a few questions for you:
1. What do you mean by the "debugger losing track of vi operations"? Are you referring to the execution trace toolkit?
2. How is the TCP stream getting corrupted? Is it missing data?
3. You say that the first TCP Read does not return an error -- does it return anything? I'm assuming it has some data, since the second read depends on that when reading the data stream (I'm assuming you're using the TCP Msg Read.vi from the example you cited)
4. TCP is a handshaking protocol, so nothing should be overwritten. How is your TCP stream getting corrupted? Is it missing data? What makes you suspect an overrun of the TCP-IP stack in Pharlap?
Thanks in advance,
01-03-2006 08:51 AM
01-03-2006 03:04 PM
Hi bobp,
I agree with your feeling that TCP/IP is "not optimal" on Pharlap.
I guess you have already set the interface to Half Duplex if it's not connected through a HUB or equivalent, as recommended somewhere at ni.com (I can't remember the exact place, but I've seen it in at least two places, one as a performance recommendation).
I have experienced the problem with MAX and Full Duplex...
I suspect that the TCP/IP stack was coded before today's modern hardware was available, and therefore has some bug/deadlock that becomes critical when connecting point-to-point with full duplex.
01-06-2006 11:58 AM
Hi bobp,
Thanks for the additional info. It sounds to me as though one of your loops is starving out the other (as you've already alluded to). Since execution highlighting shows one loop always executing while the other never runs, it is clear that the first is taking priority and starving the second. You have appropriately remedied this by raising the priority of the affected loop.
The fact that the first read does not report an error, yet returns erroneous data is troubling to me. I cannot immediately comment on what could be causing this other than a possible error in the way Pharlap handles that TCP stack. I can, however, ping our R&D team for more information.
Thanks,