Need TCP Error Explanation

One thread is doing hairy math problems (500x500 CDB matrix inversions and stuff), and posting results into a queue. After it posts results, it proceeds with the next math problem.

Another thread picks up the results from the queue, opens a TCP connection to another thread (on the same machine or a different machine), and transmits the results, after converting them to a string.

The results might be 50-100 kBytes.


The transmit thread opens a connection, then transmits a header and a block (of 1024 bytes), a header and a block, a header and a block, etc. until done.

The receive thread waits on a connection, then receives a header (fixed size), then a block (described by the header), a header and a block, a header and a block, etc., until done.



All TCP operations use a timeout of 200 mSec.
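For reference, the framing scheme described above (fixed-size header, then a block, repeated until done) can be sketched outside LabVIEW. This is a minimal Python stand-in, not the poster's actual VI: the 4-byte big-endian length header, the function names, and the 1024-byte block size are assumptions for illustration, while the 200 ms read timeout mirrors the post.

```python
import socket
import struct

BLOCK_SIZE = 1024           # payload bytes per block, per the post
HEADER_FMT = ">I"           # hypothetical 4-byte big-endian length header
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def send_message(conn: socket.socket, payload: bytes) -> None:
    """Transmit payload as a series of (header, block) pairs."""
    for off in range(0, len(payload), BLOCK_SIZE):
        block = payload[off:off + BLOCK_SIZE]
        conn.sendall(struct.pack(HEADER_FMT, len(block)) + block)

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise socket.timeout / ConnectionError."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf

def recv_message(conn: socket.socket, total: int) -> bytes:
    """Receive total bytes as (header, block) pairs, 200 ms per read."""
    conn.settimeout(0.2)    # mirrors the 200 mSec LabVIEW TCP timeout
    out = b""
    while len(out) < total:
        (length,) = struct.unpack(HEADER_FMT, recv_exact(conn, HEADER_SIZE))
        out += recv_exact(conn, length)
    return out
```

Note that nothing in this loop tells the sender whether the receiver's reads succeeded; a timeout in `recv_message` is invisible on the sending side, which is exactly the symptom described below.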


The trouble is, I'm getting receiver errors (56 = timeout) that the transmit side doesn't see. The transmit side is set to detect an error, and re-do the whole thing later if an error occurs. That has previously proven to work. But now, I have cases where the receive side reports an error (56), but the transmit side doesn't know about it, so my code fails.

I thought that an error on the receive side would be reflected back to the transmit side. (guaranteed delivery?)

Should I jack up the timeout value and hope for the best?

Should I implement an acknowledge reply scheme?

Other ideas?
Steve Bird
Culverson Software - Elegant software that is a pleasure to use.
Culverson.com


LinkedIn

Blog for (mostly LabVIEW) programmers: Tips And Tricks

Message 1 of 4
Hi!

First (in my opinion), feedback from the receiver must be organized. The logic would be the following:
The transmitter transmits a header and a block, waits for feedback, transmits a header and a block, waits for feedback, and so on...
The receiver waits on a connection, receives a header, then a block, then sends feedback, and so on...

Second, the timeout may not be large enough. Unfortunately, the TCP implementation in LabVIEW is very strange. I have analyzed the packets sent by LabVIEW: it splits the data into 32 kB sections, which makes fast data transfer impossible. For example, you sometimes need more than 300 ms to transfer 0.5 MB of data over Gigabit(!) Ethernet. The best way is to use the standard Winsock API for sending data; the transfer speed will increase significantly.

Third, not all errors on the receiver side will be reflected back to the transmit side. Only with an acknowledgement from the receiver can you get a 100% guarantee that the data was received and processed correctly...
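The per-block acknowledge scheme suggested in the first point could be sketched in Python like this; the single `ACK` byte, the header format, and the function names are assumptions for illustration, not anything LabVIEW or the posters define:

```python
import socket
import struct

HEADER_FMT = ">I"             # hypothetical 4-byte big-endian length header
HEADER_SIZE = struct.calcsize(HEADER_FMT)
ACK = b"\x06"                 # hypothetical 1-byte acknowledge token

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise on close/timeout."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf

def send_block_acked(conn: socket.socket, block: bytes) -> None:
    """Transmit one (header, block) pair, then wait for the receiver's ack."""
    conn.sendall(struct.pack(HEADER_FMT, len(block)) + block)
    if recv_exact(conn, len(ACK)) != ACK:
        raise IOError("receiver did not acknowledge block")

def recv_block_acked(conn: socket.socket) -> bytes:
    """Receive one (header, block) pair and acknowledge it."""
    (length,) = struct.unpack(HEADER_FMT, recv_exact(conn, HEADER_SIZE))
    block = recv_exact(conn, length)
    conn.sendall(ACK)
    return block
```

The cost, as the next reply points out, is one round trip per block, which is why a single end-to-end acknowledge may be preferable here.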

with best regards...
Message 2 of 4
The transmitter transmits a header and a block, waits for feedback, transmits a header and a block, waits for feedback, and so on...
The receiver waits on a connection, receives a header, then a block, then sends feedback, and so on...


I really don't want to do it that way. I thought the job of the TCP/IP layer was to handle this feedback. For the most part, the data moves really fast: 10 mSec or less for the whole thing. I don't want to bog that down with a handshake on every block. I suspect that the data is being delivered from OS to OS, but the problem is getting it into LabVIEW. That would explain why there's no error showing up on the transmit side.

I'm thinking of implementing an overall feedback: if I received the whole 50k-100k bytes successfully, I'll send a feedback packet down the same connection. If the transmitter doesn't get the feedback within X mSec after sending the last packet, it will put it back into the queue for re-transmission.

In my case it's not important to get it there soon, but important to know that it gets there.
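That overall-feedback idea (one acknowledge for the whole message, with re-queue on timeout) might be sketched in Python like this. Everything here is hypothetical stand-in detail: the `DONE_ACK` token, the 0.5 s ack timeout standing in for "X mSec", the retry queue, and the header format are illustrative assumptions, not the poster's actual design.

```python
import socket
import struct
from collections import deque

BLOCK_SIZE = 1024
HEADER_FMT = ">I"            # hypothetical 4-byte big-endian length header
HEADER_SIZE = struct.calcsize(HEADER_FMT)
DONE_ACK = b"OK"             # hypothetical end-of-message acknowledge
ACK_TIMEOUT = 0.5            # hypothetical "X mSec" wait for the ack

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise on close/timeout."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf

def send_with_final_ack(conn: socket.socket, payload: bytes,
                        retry_queue: deque) -> None:
    """Send the whole message; if no final ack arrives in time, requeue it."""
    for off in range(0, len(payload), BLOCK_SIZE):
        block = payload[off:off + BLOCK_SIZE]
        conn.sendall(struct.pack(HEADER_FMT, len(block)) + block)
    conn.settimeout(ACK_TIMEOUT)
    try:
        if recv_exact(conn, len(DONE_ACK)) != DONE_ACK:
            retry_queue.append(payload)       # bad ack: retransmit later
    except (socket.timeout, ConnectionError):
        retry_queue.append(payload)           # no ack: retransmit later

def recv_with_final_ack(conn: socket.socket, total: int) -> bytes:
    """Receive total bytes, then send the single end-to-end acknowledge."""
    out = b""
    while len(out) < total:
        (length,) = struct.unpack(HEADER_FMT, recv_exact(conn, HEADER_SIZE))
        out += recv_exact(conn, length)
    conn.sendall(DONE_ACK)
    return out
```

This keeps the fast path fast (one extra round trip per 50k-100k message instead of one per 1024-byte block) while still detecting, at the application level, the case where the receiver timed out and the OS-level transmit succeeded.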
Steve Bird

Message 3 of 4
Hi Coastal,

I am not sure we can blame TCP/IP. You are probably correct in suspecting the LV interface to the protocol stack.

I think your higher level check is a wise move.

Increasing the timeout may also help avoid those occasions when the network is so busy that it cannot get the data through.

I seem to remember that someone re-wrote this interface from scratch to speed things up. I think I read about it on Info-LabVIEW. Sorry I cannot be more specific than that.

Just my thoughts,

Ben
Retired Senior Automation Systems Architect with Data Science Automation, LabVIEW Champion, Knight of NI and Prepper
Message 4 of 4