LabVIEW

Linux RT FTP Transfer randomly stops file transfer with Error 425

We are in the process of switching from PharLap to Linux RT. For legacy reasons we need the RT application to pull files from a remote FTP server (about 300-400 files, around 50 MB total).

 

The transfer uses active mode on purpose, since we can't open any more ports in the customer's firewall. As far as I know, PharLap doesn't use a firewall anyway, so that's the obvious choice here.

 

The process has been working fine for the last couple of years. The exact same code behaves very inconsistently on Linux RT, with the following outcomes:

 

1. The transfer finishes just fine.

 

2. A random number of files are transferred, then the transfer suddenly stops. About a minute later it returns with error 425 "Can't Open Data Connection". The current file is skipped. After that, the transfer either finishes or gets stuck at another random file, which is again skipped with error 425. This repeats until the list of files is finished.

 

The FTP server is the built-in IIS FTP server running on Windows 10.

 

I can't wrap my head around what is actually happening here. So far I've tried:

 

- Using different ports. No change.

- Switching to passive mode. Works fine as long as I open the firewall for the configured passive port range, which I can't do on the customer's network.

- Using a different FTP server (Quick ‘n Easy FTP Server Lite) on the same port. Works fine, but it is not feasible at the customer's site either.

- Using different clients to download the exact same files to my work machine. Works fine.

 

iptables -L on the RT target shows ACCEPT on all chains, so I am guessing everything is open there as well.

 

Any help would be appreciated.

Screenshot 2025-05-12 121600.png

 

 

Message 1 of 6

Just a thought, but maybe add a wait between file transfers. A 100 ms wait will add 30-40 seconds to your total transfer time, but assuming that's not a problem, it might be worth trying to find out whether the firewall is periodically blocking you because you are hammering it with new transfers as quickly as you can.

Message 2 of 6

Thank you for the suggestion. Unfortunately I already tried this after noticing that the file transfer on Linux RT (when it worked) was way faster than on PharLap. I've tried delays of up to 1 s. No change in behaviour.

 

Even reducing the number of files doesn't help. 10 files and a couple of retries is enough to trigger this.

 

 

Message 3 of 6

What about a single file? Can you programmatically zip up the files before the transfer and then unzip them afterwards? I think there are a couple of zip toolkits out there, including the OpenG Zip Library.

Message 4 of 6

I can't zip the files from the RT side. I also tried opening and closing a new session for every single file. The error still occurs sooner or later.

 

I am still not sure which side is causing the problem here. 
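
One way to narrow down which side is at fault might be to run the same active-mode download loop from a non-LabVIEW client on another Linux machine and log which files fail with 425. The sketch below uses Python's standard `ftplib`; the function name `download_all` and the retry count are made up for illustration, and it assumes the caller passes in an already logged-in `ftplib.FTP` object.

```python
import ftplib

def download_all(ftp, names, retries=2):
    """Fetch each file in active mode; return {name: number of 425 failures}.

    `ftp` is assumed to be a connected, logged-in ftplib.FTP instance.
    The downloaded bytes are discarded; only the failures are of interest.
    """
    failures = {}
    ftp.set_pasv(False)  # active mode: the server opens the data connection
    for name in names:
        for _attempt in range(retries + 1):
            chunks = []
            try:
                ftp.retrbinary("RETR " + name, chunks.append)
                break  # this file transferred, go on to the next one
            except ftplib.error_temp as exc:
                # ftplib raises error_temp for all 4xx replies; only count 425.
                if not str(exc).startswith("425"):
                    raise
                failures[name] = failures.get(name, 0) + 1
    return failures
```

If a plain Linux client shows the same pattern against the IIS server, the RT target itself is probably not the culprit; if it never fails, that points back at the target's network stack or at how the FTP VIs use it.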

Message 5 of 6

Sounds like a potential socket/port range limit. Every TCP/IP connection is characterized by four elements: the local IP address and port, and the remote IP address and port. TCP/IP does not allow more than one connection where all four are the same. But TCP/IP typically also leaves a closed socket lingering around (the TIME_WAIT state) for a minute or more, to catch delayed packets that arrive out of order after the connection is closed. If only a limited number of sockets/ports is available, a new connection request may fail to get initiated because no socket in the permitted port range can be allocated.
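
To check this theory on the Linux RT target, it may help to count sockets per TCP state while the transfer is running. The Python sketch below parses text in the format of `/proc/net/tcp`; the function name and the sample rows are invented for illustration, and it assumes Python is available on the target. Without Python, reading `/proc/net/tcp` by hand and counting the rows whose `st` column is `06` gives the same information.

```python
from collections import Counter

# Linux kernel TCP state codes as shown in the "st" column of /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}

def count_tcp_states(proc_net_tcp_text):
    """Return a Counter of socket states found in /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines():
        fields = line.split()
        # Data rows start with an index such as "0:"; this skips the header.
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts

# Made-up sample: one listener, one live connection, two sockets in TIME_WAIT.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n"
    "   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001\n"
    "   1: 0A01A8C0:C350 1401A8C0:0015 01 00000000:00000000 00:00000000 00000000     0        0 1002\n"
    "   2: 0A01A8C0:C351 1401A8C0:0014 06 00000000:00000000 03:00000BB8 00000000     0        0 0\n"
    "   3: 0A01A8C0:C352 1401A8C0:0014 06 00000000:00000000 03:00000BB8 00000000     0        0 0\n"
)

sample_counts = count_tcp_states(SAMPLE)
```

On the real target, the text would come from `open("/proc/net/tcp").read()`. TIME_WAIT accumulates on whichever side closes a connection first, so a low count on the target would suggest looking at the server side instead.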

 

There are two mechanisms that control this and related aspects: one is allowing address reuse on sockets, the other is the linger timeout. Unfortunately the LabVIEW network refnum has no built-in access to these socket options. You would have to experiment with something along these lines:

 

https://knowledge.ni.com/KnowledgeArticleDetails?id=kA00Z0000019NZfSAM

 

But it is a little outdated. While they updated it to correctly use the built-in TCP Get Raw Net Object.vi, they did not adapt the Call Library Node to use a pointer-sized integer for the socket file descriptor. This might not actually be a problem on Linux, since file descriptors there are typically small integers, but on other platforms the handle could be a pointer, and using an int32 for that on a 64-bit platform might truncate the socket handle value.

 

As for setting SO_REUSEADDR, SO_REUSEPORT, or SO_LINGER, you would have to adjust the corresponding values passed to the Call Library Node in that VI. But there are additional limitations here. SO_REUSEADDR, for instance, only has an effect if it is set before the socket is bound, and binding happens inside the TCP Open Connection node. You can't change that, because you only get a LabVIEW network refnum after the connection has been created, and the bind() call has already been made at that point.
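
For reference, this is roughly what the ordering looks like with plain BSD sockets outside LabVIEW. It is a minimal Python sketch, not what the LabVIEW TCP nodes do internally; the helper name `make_socket` and the 5-second linger value are made up for illustration. The address-reuse option is spelled `SO_REUSEADDR` in the sockets API.

```python
import socket
import struct

def make_socket(linger_seconds=None):
    """Create a TCP socket with its options set BEFORE bind() is called."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Address reuse only has an effect if set before the socket is bound.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if linger_seconds is not None:
        # struct linger { int l_onoff; int l_linger; } packed as two C ints.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                     struct.pack("ii", 1, linger_seconds))
    return s

sock = make_socket(linger_seconds=5)
sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
reuse_enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
linger_onoff, linger_time = struct.unpack(
    "ii", sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii")))
sock.close()
```

In LabVIEW the equivalent `setsockopt()` call can only be made through the Call Library Node once a refnum exists, which is why the address-reuse option comes too late there.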

 

Read this about why using SO_LINGER with a timeout of 0 may seem like a solution but is almost always a bad idea: https://stackoverflow.com/questions/3757289/when-is-tcp-option-so-linger-0-required

 

It may actually be a better idea to check whether the IIS configuration makes it hold on to sockets unnecessarily after they have been closed.

 

As an example, a browser knows from the Content-Length HTTP header when it has read all data and can initiate the close. (I know that in HTTP 1.1 it will keep it open for a while for a possible reuse, and then close it.)

This is likely going to get you into the depths of socket programming and the corresponding configuration, which most users would absolutely want to avoid.

Rolf Kalbermatter
DEMO, Electronic and Mechanical Support department, room 36.LB00.390
Message 6 of 6