cRIO TCP/IP stops working without reason.

woutert · ‎04-13-2007

Hi,

I developed a turbojet simulation module on a cRIO-9004.

Every now and then (every n hours) the controlling computer loses its connection, so the front panel I view on my laptop stops updating.

Even a ping to the IP address of the cRIO controller fails.

Other computers, with DCS screens, using the same ethernet connection continue to work fine on the same local network.

The program on the cRIO continues to run fine, as does the FPGA, but user interaction is no longer possible.

How could I regain control from my remote pc?

Is this a bug?

How can we restart the connection?

Even a hard reset doesn't work. I have to shut the cRIO down by unplugging the power supply before redeploying and restarting the program.

Did I do something wrong?

I programmed a real-time simulator for power plants in C++, then translated the RT sim to NI components and software (LabVIEW).

My RT PXI turbine simulator for simulating grid incidents was successfully used in a nuclear plant in 2006.

Look at http://sine.ni.com/cs/app/doc/p/id/cs-755

nathand · ‎04-13-2007

When I've had problems like this it's usually because some high-priority loop is taking up all the cRIO processor time, preventing it from responding to any other requests. Is it possible you're leaking memory (constantly enlarging an array, for example) in such a way that the RIO needs to make a new copy every time through your main loop, causing it to slow down over time? If you increase your main loop time, does it take longer before you lose communication?

pierreR · ‎04-15-2007

I totally agree with what nathand said. Use the Real-Time System Manager to diagnose the CPU and RAM resources used by your application. If you own the Execution Trace Toolkit, use it to record a trace of your application. If your time-critical code uses a lot of resources, PharLap considers TCP/IP less critical than your code, which is why you lose communication.
Regards

Pierre R...

Certified LabVIEW Developer

koutnym · ‎03-10-2008

Hello,

I have much the same experience: after some hours (about 35) of operation, my cRIO application causes the cRIO-9012 controller to hang its TCP/IP operations. It becomes unreachable via FTP, System Manager... even ping does not work and the device is not found on the network, while the RT application keeps running (DIO lines, the USER LED, and log files are evidence of this). The time between failures is always approximately the same.

I have watched the cRIO with System Manager for some time to make sure a memory leak is not causing the problem described above. The application sits close to the limit of available memory, but that alone does not explain why it should hang once it has been running successfully. The long-term monitored parameters are:
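To pin down how regular the interval between failures really is, the host PC can timestamp each loss of contact. Below is a minimal host-side sketch in Python (not LabVIEW), offered as an illustration only. It assumes the controller accepts TCP connections on the FTP port (21), since FTP access is mentioned above; the function names, the log file name, and the polling period are all hypothetical.

```python
import socket
import time

def is_reachable(host, port=21, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 21 (FTP) is an assumption; use any port the controller listens on.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def uptime_before_failures(samples):
    """Given (timestamp_s, reachable) samples in time order, return the
    duration in seconds of each reachable stretch that ended in a failure."""
    uptimes = []
    up_since = None
    for t, ok in samples:
        if ok and up_since is None:
            up_since = t                    # start of a reachable stretch
        elif not ok and up_since is not None:
            uptimes.append(t - up_since)    # stretch ended in a failure
            up_since = None
    return uptimes

def monitor(host, period_s=10.0, log_path="crio_link.log"):
    """Poll the controller forever and append one line per sample to a log."""
    with open(log_path, "a") as log:
        while True:
            ok = is_reachable(host)
            log.write(f"{time.time():.0f} {int(ok)}\n")
            log.flush()
            time.sleep(period_s)
```

Feeding the logged samples to `uptime_before_failures` gives the list of run times before each dropout, which makes it easy to see whether the "always approximately the same" interval holds.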

Total Memory: 60785 KB
Used Memory: 55674 KB
Available Memory: 5111 KB
Contiguous Memory: 3984 KB
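As a quick sanity check on these figures, here is a small Python sketch that parses the four values as listed and derives the memory usage percentage and the share of free memory that is contiguous. It is only an offline illustration of the arithmetic; the parsing function and variable names are hypothetical and this is not how System Manager reports the data.

```python
import re

REPORT = """\
Total Memory: 60785 KB
Used Memory: 55674 KB
Available Memory: 5111 KB
Contiguous Memory: 3984 KB
"""

def parse_memory_report(text):
    """Return a dict such as {'total': 60785, ...} with values in KB."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+) Memory:\s*(\d+)\s*KB", line)
        if m:
            fields[m.group(1).lower()] = int(m.group(2))
    return fields

mem = parse_memory_report(REPORT)

# Used + available should add up to the total if the snapshot is consistent.
consistent = mem["used"] + mem["available"] == mem["total"]

# Percentage of total memory in use.
used_pct = 100.0 * mem["used"] / mem["total"]

# Share of the free memory that is available as one contiguous block.
largest_block_share = mem["contiguous"] / mem["available"]

print(f"consistent={consistent} used={used_pct:.1f}% "
      f"contiguous/available={largest_block_share:.2f}")
```

With the numbers above, the snapshot is self-consistent, about 91.6% of memory is in use, and roughly 78% of the free memory is in a single contiguous block.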

Note that the cRIO is under continuous TCP/IP communication, producing a ~600 KByte/s synchronous data stream to the host computer.

The RT application is written in LabVIEW 8.5, with NI-RIO 2.3.1.

In addition, is there a simple way to get the amount of available memory on the RT controller from inside the running code (so that I can log it to a file)?

Thanks for any hints. Please note I am quite experienced with LabVIEW (cRIO, FPGA) programming.

Martin Koutny

LJMartin · ‎04-01-2008

I too am having the exact same problem. My cRIO-9012 stops talking (no ping or anything) but keeps running its program (FPGA too), as evidenced by the flashing LED, the log file, and the process still being under control. Of course, the operator cannot see or do anything on his display.

After (re)booting the controller, the network functions run for hours to days before stopping.

My CPU load is stable at about 40% and free contiguous memory is stable at about 3MB (low, but it is completely stable... plus the program does keep running).

I'm running LabVIEW RT 8.2.1 and use shared variables (hosted on the RIO) and send or receive about 30 variables per second.

Has anyone resolved this problem?

koutnym · ‎04-01-2008

Hi,

After some weeks spent fruitlessly testing different methods of TCP/IP communication and searching for mistakes in my code, I tried updating NI-RIO 2.3.1 to NI-RIO 2.4 as a last resort, even at the risk of "new" bugs coming with it. I did it three days ago and my system has worked since then without any trouble. For me this is evidence of a bug in NI-RIO 2.3.1. Try installing NI-RIO 2.4 on your computer (it comes with LabVIEW 8.5.1, but you don't need to update LabVIEW itself to that version) and then use MAX to update the software on your remote target.

Martin Koutny

Bassett_Hound · ‎04-01-2008

Hi LJMartin,

There were a couple of Ethernet bug fixes that went into both 8.5.1 and RIO 2.4, and I believe they address what you're seeing. A while back, this was reported to R&D (# 66376) and was fixed in LabVIEW 8.5.1 and NI RIO 2.4. Here is a link to the NI RIO 2.4 release. If you're not able to upgrade to LabVIEW 8.5.x, please contact our technical support group for additional support.

Let me know if you have any questions,

Basset Hound

LJMartin · ‎04-01-2008

Thanks Martin and Basset Hound,

I'll try upgrading to LV RT 8.5 and NI-RIO 2.4 in the next couple of days. Thanks for the link to NI-RIO 2.4, since I've only got CDs up to version 2.3.1. I'll make a post once I'm finished to let you know how it worked out.

Thanks,
L. John Martin.

LJMartin · ‎04-10-2008

Hi... As promised, an update of our progress:

- I haven't upgraded to LV8.5 and NI RIO 2.4 yet. While I was upgrading my laptop, people at the plant set up my switch as a VLAN so that I don't get all the traffic from the existing control system at the plant. The existing system uses multicast for all its communications and was flooding the network with traffic. Putting my switch on its own VLAN has eliminated the extra traffic from my ports and seems to have solved the problem. We've run for a week without the dreaded "controller stops its network functions while otherwise keeping running" error. I read in an FAQ somewhere that the cRIOs (or the O/S on them) don't support UDP multicast... perhaps the existence of all this traffic was the source of our problems.

I do still plan to upgrade to LV 8.5.1 (I just got the CDs for this yesterday) and NI RIO 2.4 in late April or early May when I am next on-site. I'm quite hesitant to try this across the VPN as it involves formatting drives and such. If something went wrong, we'd be down for a few weeks until I made it back to site. Also, I saw somewhere that NI RIO 2.4 doesn't support 64 bit O/Ss and my laptop (on which I develop and manage the cRIOs) is an AMD Turion 64x2. I'm hoping that this isn't a problem.

I'll make another post after I've upgraded to LV 8.5.1 and NI RIO 2.4.

Long John Martin.

rex1030 · ‎06-05-2008

Yeah, update to the newer version of RT.

Also make sure you have waits in all your loops that aren't already timed. If a loop is running as fast as possible, it will sometimes take priority over less important system functions, like the TCP/IP connection. By the way, I am pretty sure pinging your controller will never work because the RT OS doesn't have that functionality.
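The idea of putting a wait in every untimed loop can be sketched in text form. This is a Python analogy, not LabVIEW code: `time.sleep` stands in for a LabVIEW wait function, and `run_paced` and its parameters are hypothetical names. The point is only that each iteration gives the processor back until the next deadline instead of spinning.

```python
import time

def run_paced(step, period_s, iterations):
    """Call step() once per period and sleep for the rest of each period.

    Sleeping (rather than spinning) is what leaves CPU time for
    lower-priority work such as network servicing.
    Returns the total elapsed time in seconds.
    """
    start = time.monotonic()
    deadline = start
    for _ in range(iterations):
        step()
        deadline += period_s
        remaining = deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)   # yield the processor until the next period
    return time.monotonic() - start

if __name__ == "__main__":
    counter = {"n": 0}

    def work():
        counter["n"] += 1           # placeholder for the real loop body

    elapsed = run_paced(work, period_s=0.01, iterations=5)
    print(counter["n"], f"{elapsed:.3f}s")
```

Scheduling against a running deadline rather than sleeping a fixed amount keeps the loop period steady even when the loop body takes a variable time.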

---------------------------------
[will work for kudos]
