08-26-2022 11:13 AM
Controller: cRIO-9056
Software: LabVIEW and RT module 2020
This is something I've noticed with regular TCP servers as well as ModbusTCP servers. Whenever I'm running a VI that deals with TCP, if I stop the VI and restart it immediately, or even wait 10 seconds or so before restarting it, I get an error. I have to wait a couple of minutes before restarting it, and then all is fine.
I remember reading somewhere, maybe on here or Reddit, that you're supposed to wait 5 minutes or so before trying to reconnect to or reuse the same TCP port. Is that what's going on here?
I'm concerned that this could put my TCP servers into a permanent error state. Is there any way of handling this, instead of just having to be aware that you have to wait before starting everything up again?
I can't remember exactly what the error is for a regular TCP server, but I know that when this error occurs on my ModbusTCP server, the Daemon Status says "Error".
08-28-2022 02:55 PM
There are some cases where TCP can get into a state that delays a server from starting or a client from establishing a connection, but this generally occurs only if there are too many connection requests to a server in a very short period of time. TCP has a TIME_WAIT state at the end of a connection to make sure that all data is transferred. In addition, the stack has a finite amount of resources. If a server is slammed with open requests that close immediately afterward, the server can get starved of resources, and you will need to wait for all of the connections in the TIME_WAIT state to finally close. However, this is a sign of a poor implementation or a denial-of-service attack. A well-written client/server should rarely, if ever, encounter this condition. I have written many MODBUS client/server applications, all of which support automatic reconnect, and have not encountered this issue.
On the server side, the main listener should do nothing more than accept a connection and then spawn a subtask to handle that connection. If it does more than this, it can create a bottleneck for servicing new connections.
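Since LabVIEW code is graphical, here is a rough text sketch of that listener pattern in Python rather than a VI. The echo handler and the names `serve` and `handle_connection` are assumptions made for illustration; a real Modbus server would put its request processing where the echo is.

```python
import socket
import threading


def handle_connection(conn: socket.socket) -> None:
    """Per-connection worker (illustrative): echo bytes until the peer closes."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the peer closed the connection
                break
            conn.sendall(data)


def serve(listener: socket.socket, stop: threading.Event) -> None:
    """Listener loop: accept, hand the connection to a worker, accept again."""
    listener.settimeout(0.2)  # wake up regularly so the stop flag is noticed
    while not stop.is_set():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue
        conn.settimeout(None)  # the worker uses plain blocking reads
        # All per-connection work happens in the worker thread, so the
        # listener goes straight back to accepting new connections.
        threading.Thread(
            target=handle_connection, args=(conn,), daemon=True
        ).start()
```

The only work done in the accept loop is starting the worker, so a slow or stalled client cannot hold up new connections.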
The client side should retry when it loses a connection, using a reasonable connection timeout (at least several seconds over the Internet, down to 1 or 2 seconds on a local network) along with a short delay between connection requests. I have seen clients implement the reconnect by immediately trying to open a connection on an error. This floods the client stack, and possibly the server stack, with connection requests. When an error is detected, wait a brief time before trying to open the connection again. It is also good practice to increase this wait time between requests if you are unable to re-establish the connection and are stuck trying to reconnect. You should set a maximum delay between attempts so you don't end up effectively waiting forever. How long this maximum should be depends on the system itself. A mission-critical system may cap the delay at one or two seconds; something less critical may go as long as 30 seconds between requests.
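A minimal sketch of that retry policy, again in Python rather than LabVIEW. The function names, the 0.5 second initial delay and the doubling factor are assumptions chosen for illustration; the 2 second connect timeout and the 30 second cap are taken from the ranges mentioned above.

```python
import socket
import time


def backoff_delays(initial=0.5, factor=2.0, maximum=30.0):
    """Yield wait times that grow by `factor` and never exceed `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


def connect_with_retry(host, port, connect_timeout=2.0,
                       max_attempts=None, sleep=time.sleep):
    """Open a TCP connection, waiting longer after each failed attempt.

    With max_attempts=None this keeps trying until it succeeds; otherwise
    the last connection error is re-raised after that many attempts.
    """
    delays = backoff_delays()
    attempt = 0
    while True:
        attempt += 1
        try:
            return socket.create_connection((host, port), timeout=connect_timeout)
        except OSError:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            # Never retry immediately: wait first, and wait longer next time.
            sleep(next(delays))
```

A caller on a local network might use something like `sock = connect_with_retry("192.168.1.10", 502)`, where the address is only a placeholder. For a mission-critical link, the cap passed to `backoff_delays` would be lowered to a second or two.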
If you are encountering this situation frequently, then I suspect that the server, the client, or both are poorly implemented.