TCP Read Fail

LLindenbauer · ‎08-19-2022

@Mark_Yedinak wrote:

No, don't use power shell to create the connection and pass data to LabVIEW via that.

I would not recommend that either. I do, however, recommend splitting the task into smaller pieces: either find the problem by reduction, or expand stepwise from a minimal working example toward the solution. I'm sorry I was unclear on that.

@Froboz wrote:

I did download VS Code with Powershell and tried running a modified version of the link you sent. It's complaining the connected host has failed to respond 192.168.100.118:9760. Address and port look right. Something else to try to figure out.

PowerShell comes with standard Windows installations, so there should be no need to download anything (which is what makes it such a useful tool).

I think it would be a step forward if you could rule out your computer's network configuration (software firewall etc.) as the problem. You can do this by setting up another computer and instructing it to pretend to be your instrument. It does not matter much whether you use PowerShell for this or the LabVIEW TCP examples. Use whatever you are comfortable with.
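If neither PowerShell nor the LabVIEW examples appeal, a stand-in instrument can also be sketched in a few lines of Python. This is only an illustration: the function names, the fixed "ACK" reply, and the one-command-one-reply behaviour are my assumptions, not the real device protocol. Port 9760 is taken from the address quoted above.

```python
import socket


def make_listener(host: str = "0.0.0.0", port: int = 9760) -> socket.socket:
    """Open a listening TCP socket (hypothetical helper; 9760 is the port from the thread)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def serve_once(listener: socket.socket, reply: bytes = b"ACK\r\n") -> bytes:
    """Accept one connection, read one chunk, answer with a fixed reply.

    The reply is a placeholder; the real instrument's protocol is not
    known from the thread.
    """
    conn, _addr = listener.accept()
    with conn:
        command = conn.recv(1024)  # whatever the client sent first
        conn.sendall(reply)
    return command
```

Run `serve_once(make_listener())` on the second computer and point the LabVIEW VI at that machine's address. If the VI gets the reply from the stand-in but not from the real device, the PC's network configuration is probably not the culprit.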

I agree with @Mark_Yedinak that it is far more likely that the problem is outside of LabVIEW. I am no expert on TCP, so I am wildly guessing here: I am confused by the source and destination ports listed in the Wireshark trace. What do you see when checking the open connections with "Perfmon /Res" (https://superuser.com/questions/1025252/how-to-list-open-ports-and-application-using-them-in-windows...)?

Froboz · ‎08-19-2022

A little back story to this. Our company has been using Python 2 to run functional tests on manufactured products. Python 2 is no longer supported and IT considers it a security risk to maintain, so we are being pushed to Python 3, which differs from its predecessor in a number of areas. I am looking for a more stable test platform where we will not have to rewrite code with a major rev change. I've used Labview before, just not with TCP/IP.

The individual who wrote the Python 2 script ran into some timing issues (could not receive packets back after transmitting commands - sound familiar?) and so made his code multi-threaded, with one thread handling the commands and the other thread handling the listening part. This seemed to work for Python 2. Not for Python 3, however. And I suspect Labview has the same problem, that it is not fast enough to pick up the response on the same line and simply times out. As a test for Python 3 (not tried on Labview yet), I put a 200 ms delay in the firmware after the DUT received the command before it would respond over the same line, and that seemed to fix the problem. But we would rather not have to modify firmware on all products to accommodate this.

Since Python 2 was able to operate normally with the given firmware set, there must be a solution that will work for other platforms. I was hoping that platform would be Labview, as it is (in my opinion) more appropriate for functional testing in a manufacturing space than Python.

Wireshark does pick up the data return, and I was looking to see if I could pick up any examples of Labview acting as a sniffer for TCP. Unfortunately, the examples that do exist are dated and the .dlls (winpcap) are no longer supported. There are no Labview examples for the newer npcap, which I think current Wireshark uses, and for some reason I run into installation issues on my Win10 machine.

Mark_Yedinak · ‎08-19-2022

I have not found a good example of LabVIEW functioning as a packet sniffer. This would have to be implemented using npcap. You would have to look at the API for that.

In terms of LabVIEW TCP being able to keep up with your device, you should have no problem with well-written code. I have written some very heavy network applications and have had no issues with keeping up with the communications. In fact, I have written network stress tests for other devices using LabVIEW. Without seeing your specific communication protocol it is hard to give specific advice. Also, while I am not a huge fan of Python, a decently written Python application should be up to the task as well.

I have not checked in a while, but the LabVIEW TCP read and write functions were blocking calls. So calling them in parallel really didn't buy much, since the read would block when the write was active and vice versa. Not sure if that has changed in newer releases, but I doubt it. I would work around this with a basic state machine that alternates between reads and writes. The read task would read blocks of any available data and pass them to a separate task for parsing/processing. The read in this case was dumb: it simply read any available data. The write would write any available data passed to it. If there was nothing to write, it would simply read. You would have to determine whether the write or the read has the higher priority if you have multiple things to transmit. But most protocols are command/response protocols, so it is fairly easy to implement that.
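For readers who think better in text code, the alternating read/write state machine described above could look roughly like this in Python. This is a sketch of the idea only, not the LabVIEW implementation. The function name `pump`, the two queues, and the fixed cycle count are assumptions made to keep the example short.

```python
import queue
import socket


def pump(sock: socket.socket, outgoing: queue.Queue, incoming: queue.Queue,
         cycles: int, poll_s: float = 0.05) -> None:
    """Alternate between a write step and a 'dumb' read step.

    outgoing: byte strings waiting to be sent (filled by another task).
    incoming: raw chunks handed to a separate parsing task.
    cycles and poll_s are illustrative values, not recommendations.
    """
    sock.settimeout(poll_s)
    for _ in range(cycles):
        # Write state: send one pending message, if there is one.
        try:
            sock.sendall(outgoing.get_nowait())
        except queue.Empty:
            pass
        # Read state: take whatever bytes are available, without parsing.
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            continue  # nothing arrived within poll_s, go around again
        if not chunk:
            break  # peer closed the connection
        incoming.put(chunk)
```

Giving the write step the first slot in each cycle is one way of settling the priority question. A real application would loop until told to stop rather than for a fixed number of cycles.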

However, based on your error code you need to determine why the peer is closing the connection. You also need to look at the device protocol to make sure you are properly implementing it.

As I stated, I am quite confident that LabVIEW is capable of meeting your needs. I have implemented large, distributed systems based on network messaging and the message broker can handle a very significant message load.

Mark Yedinak
Certified LabVIEW Architect
LabVIEW Champion

"Does anyone know where the love of God goes when the waves turn the minutes to hours?"
Wreck of the Edmund Fitzgerald - Gordon Lightfoot

rolfk · ‎08-19-2022

@Froboz wrote:

The individual who wrote the Python 2 script ran into some timing issues (could not receive packets back after transmitting commands - sound familiar?) and so made his code multi-threaded, with one thread handling the commands and the other thread handling the listening part. This seemed to work for Python 2. Not for Python 3, however. And I suspect Labview has the same problem, that it is not fast enough to pick up the response on the same line and simply times out. As a test for Python 3 (not tried on Labview yet), I put a 200 ms delay in the firmware after the DUT received the command before it would respond over the same line, and that seemed to fix the problem. But we would rather not have to modify firmware on all products to accommodate this.

That definitely sounds backwards somehow. The LabVIEW TCP/IP nodes have multiple possible modes, and the only situation I can imagine in which this could occur is if you use Immediate mode with a very short timeout and/or CRLF mode, and there is already some data in the TCP/IP input buffer from a previous response from the device. It could be, for instance, that the device responded with two CRLF pairs to the previous command, the second without any prepended data. HTTP uses this, for instance, to indicate the end of the HTTP header, after which the HTTP body (the actual HTML document) follows. In that case you might have sent a command and your device responded with a CRLF-terminated response and another empty CRLF line. You only read back the first and forget about the second, empty line.

TCP/IP is a streaming protocol, and the TCP Read in LabVIEW will only return if

1) there is an error on the socket, in which case you will receive an error in the cluster

2) the timeout has expired, in which case you receive a timeout error (56)

3) the termination condition has been encountered (this depends on the mode you tell the Read to use)

4) the requested number of bytes has been retrieved

There is no other reason for TCP Read to return, so delaying the response to make it magically work sounds more than just a little weird. Either you need to play with the mode for the TCP Read node, or there is something else at play that you haven't told us yet. TCP Read specifically CAN'T return badly because the data arrives too early (unless you use Immediate mode maybe; I have practically never used that one so far, and it is generally not the right mode to use).
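To make the leftover-CRLF scenario concrete, here is a small Python sketch of a CRLF-terminated read on top of a byte stream. It only mirrors the idea of a CRLF read mode and is not how LabVIEW implements it. The class name `LineReader` and its methods are made up for this example, and the "OK 1" / "OK 2" responses in the usage note are invented data.

```python
import socket


class LineReader:
    """Buffers a TCP byte stream and hands out CRLF-terminated lines.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, sock: socket.socket, timeout_s: float = 2.0) -> None:
        self.sock = sock
        self.sock.settimeout(timeout_s)
        self.buffer = b""

    def read_line(self) -> bytes:
        """Return the next line without its CRLF; may be empty."""
        while b"\r\n" not in self.buffer:
            chunk = self.sock.recv(4096)  # raises socket.timeout on timeout
            if not chunk:
                raise ConnectionError("peer closed the connection")
            self.buffer += chunk
        line, _, self.buffer = self.buffer.partition(b"\r\n")
        return line

    def read_response(self) -> bytes:
        """Return the next non-empty line, skipping stale empty ones."""
        while True:
            line = self.read_line()
            if line:
                return line
```

If a device answers each command with something like "OK 1" followed by an extra empty CRLF line, the second call to `read_line()` returns that stale empty line instead of the answer to the second command. `read_response()` is one way to skip over it. Whether the device in this thread actually behaves like that is unknown.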

Since Python 2 was able to operate normally with the given firmware set, there must be a solution that will work for other platforms. I was hoping that platform would be Labview, as it is (in my opinion) more appropriate for functional testing in a manufacturing space than Python.

Wireshark does pick up the data return, and I was looking to see if I could pick up any examples of Labview acting as a sniffer for TCP. Unfortunately, the examples that do exist are dated and the .dlls (winpcap) are no longer supported. There are no Labview examples for the newer npcap, which I think current Wireshark uses, and for some reason I run into installation issues on my Win10 machine.

Winpcap has indeed been discontinued for quite some time. The LabVIEW examples that use the Winpcap library are very basic and limited. It would not be that difficult to update them to work with npcap, but the helper DLL needs to be recompiled and relinked with the npcap import library, and the DLL source code might need minor modifications. However, npcap also lets you select a mode during installation that installs a Winpcap-compatible wrapper, so that the LabVIEW library would still work.

Rolf Kalbermatter My Blog

DEMO, Electronic and Mechanical Support department, room 36.LB00.390

rolfk · ‎08-19-2022

@Mark_Yedinak wrote:

I have not checked in a while, but the LabVIEW TCP read and write functions were blocking calls. So calling them in parallel really didn't buy much, since the read would block when the write was active and vice versa. Not sure if that has changed in newer releases, but I doubt it.

Actually, the LabVIEW TCP VIs are asynchronous! But a TCP session (and a UDP session too) has an internal mutex that blocks access to that resource for other operations while a node is busy doing work on it. For TCP Write this basically means that as long as it is busy copying the data into the underlying socket buffer, no other TCP node can access the socket.

For TCP Read things are a little more complicated. It uses the socket's select() or poll() functionality to wait for incoming data. The mutex is acquired for the verification of the session and the preparation of the wait, but released while the Read waits on the event from the socket. The TCP Read still appears to block, and won't return before an error or timeout has occurred, the termination condition is fulfilled, or the requested number of bytes has been retrieved. But the VI itself does not block, and lets other code execute, including TCP/IP nodes on this and other sessions.

Of course, if you use dataflow to chain the read after the write you won't have such parallelism, but read and write can very well operate in parallel if you want them to. How desirable that usually is I will leave open: most communication is command-response based, and it is often neither desirable nor even possible to receive anything before you have sent off the command that causes the response.

And yes, this asynchronous operation is implemented in LabVIEW itself. It worked like that even before LabVIEW acquired multithreading support, and it still works nowadays if you configure LabVIEW to start up with only a single thread. Most LabVIEW nodes that have any chance of needing to wait on something are still fully implemented using this old LabVIEW asynchronous operation (Wait ms, Wait until next multiple, etc.). In the case of VISA you can even configure the nodes to work asynchronously (using this LabVIEW cooperative multithreading) or synchronously, in which case each node consumes a native OS thread for the duration of its execution. With VISA there used to be some weird scheduling interactions that could make the asynchronous mode perform badly, as it was somehow fighting the underlying VISA mechanism for its asynchronous API, which was also trying to do its own scheduling to improve performance.

Froboz · ‎08-19-2022

Well, it is not a timing issue. While placing a delay in FW helped in the case of Python 3, no such luck for Labview, so it is something entirely different. Result for Python 3 with the 200 ms delay written into FW: the fact that it responds with an ACK means the data was received.

LLindenbauer · ‎08-19-2022

While I enjoyed the background history, I did hope for the debug information. I'll make my hunch more specific: The commands and replies are sent along two different connections. The computer connects to 9760 on the device, the device tries to connect back to the computer on 9760. Checking the open connections while your reference implementation is running would be able to confirm this.

Create a second VI, wire up a TCP Listen on local port 9760 and TCP Read. Keep it running, then launch your first VI with the TCP Write. Then check back and see if anything happened to the TCP Listen in the second VI.
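The same check can be done outside LabVIEW. Below is a rough Python counterpart of the "second VI": it listens on a local port and reports whether anything connects and what it sends first. The function names and the timeout are my own choices. Port 9760 comes from the hunch above and is not a confirmed fact about the device.

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 9760) -> socket.socket:
    """Listen on the port the device is suspected to connect back to."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def wait_for_back_connection(listener: socket.socket, timeout_s: float = 10.0):
    """Return (peer_address, first_bytes), or None if nobody connected in time."""
    listener.settimeout(timeout_s)
    try:
        conn, peer = listener.accept()
    except socket.timeout:
        return None
    with conn:
        conn.settimeout(timeout_s)
        try:
            data = conn.recv(4096)
        except socket.timeout:
            data = b""  # connected, but sent nothing within the timeout
    return peer, data
```

Start this first, then send the command from the existing VI or script. A result other than `None` means something did connect to the PC on that port, and the peer address shows who it was.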

Then again, since you have the reference implementation in Python (and, as it seems, the source for the firmware), is studying the source code not an option?

Froboz · ‎08-19-2022

Well, damn, that worked! So what would be preventing it from working from a single VI? It only works for a single cycle but I was able to read one time. I get an Error 62 on the write end after the data is read, meaning the connection was aborted for some reason after the read occurred.

rolfk · ‎08-20-2022

That back channel sounds like a very bad idea in today's world of firewalls and whatever else. It's one of the reasons the FTP protocol is strongly discouraged nowadays: in its active download mode the server opens a reverse channel for the data transfer, and that just doesn't fly well with firewalls. They obviously can't prevent outgoing connections without some more intelligent deep packet inspection to detect potentially dangerous actors, but they are generally very diligent about blocking every single incoming connection unless that port number is on a special whitelist.

Your device seems to expect to be able to open such a reverse connection and likely perform a specific transaction on it, and if that doesn't succeed it simply drops the original connection too. I can't imagine what something like this is really good for, but if it is for security it is totally misguided in today's internet world.

LLindenbauer · ‎08-20-2022

@Froboz wrote:

Well, damn, that worked! So what would be preventing it from working from a single VI?

Good to hear! There is nothing preventing that. Just copy the contents of both VIs into a single one. You will have to get the timing of the operations right, just as you did manually before. The most important thing is that you need two TCP connection wires. One for the outgoing command, one for the incoming data.
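In text form, the ordering of the two connections might look like the Python sketch below. Treat it as an illustration under assumptions: the device accepts a command on one connection and then connects back to the PC to deliver the data, which is what the two-VI experiment suggests but does not prove. All names here are hypothetical.

```python
import socket


def open_listener(host: str, port: int) -> socket.socket:
    """Connection wire no. 2: where the device is assumed to connect back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener


def exchange(device_addr, listener: socket.socket, command: bytes,
             timeout_s: float = 5.0) -> bytes:
    """Send a command on the outgoing connection, read the reply on the incoming one.

    The listener must already exist before the command goes out, so the
    device's connect-back cannot arrive before the PC is ready for it.
    """
    listener.settimeout(timeout_s)
    # Connection wire no. 1: outgoing command channel.
    with socket.create_connection(device_addr, timeout=timeout_s) as out_conn:
        out_conn.sendall(command)
        # Wait for the device to open the reverse connection.
        back_conn, _peer = listener.accept()
        with back_conn:
            back_conn.settimeout(timeout_s)
            return back_conn.recv(4096)
```

Here both connections are closed after one exchange, which is the simplest thing to write but may not be what the device expects. Keeping the listener open across several calls to `exchange` at least avoids re-creating it for every command.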

It only works for a single cycle but I was able to read one time.

Maybe the connections get closed right afterwards? Only you can tell. I have pointed you to tools and steps that can help you diagnose this. Have you tried them? It will probably also show up in the complete Wireshark trace.

Since this turned out to be a problem related to TCP, I would highly encourage you to read up on that protocol and play around with it. I am sure it will help with troubleshooting this device in the future. The whole purpose is having a two-way communication, not hanging up the phone after every sentence.
