LabVIEW

How can I diagnose a LabVIEW RT program crash on cRIO 9073?

I have 2 cRIO 9073 controllers purchased about 1 year apart.  Both are running the same real-time application.  The first cRIO runs fine.  The 2nd cRIO crashes intermittently while executing 1 particular VI that contains 2 timed loops running in parallel (1 collecting data and the other controlling machine movement).  When it crashes, the app stops, the web server and FTP stop responding, and MAX cannot communicate with the device (I must perform a hard reset).  After the reset, viewing the error log in MAX shows no errors.  I've added message logging to see if it stops in a particular place, but I see no patterns.  In some cases, the device seems to restart continually (every 2-3 minutes) until it finally hangs.  In many instances, my configuration files (used to store runtime variables) have been corrupted or erased.  When trying to deploy the app on this cRIO, I generally must try multiple times because I receive the error "Error deploying on target".  I have tried formatting the flash and reinstalling the OS many times from different sources.

I'm running LabVIEW 2009 SP1 with the FPGA option.

 

Any ideas on how to diagnose this problem?  Are there any diagnostic tools to test this device?

Message 1 of 12

Swap controllers to make sure it is not an external issue.  If the problem follows the controller, then image the good controller and deploy that image to the bad controller.  If the problem persists, then you probably have some bad hardware.  You could also fire up the Distributed System Manager to watch CPU and memory usage for critical conditions.

Message 2 of 12

Since the last post:

 

1) Swapped controllers.  Problem followed the controller.

2) cRIO crashed and I could not communicate with the device (Measurement and Automation Explorer could not connect to this device).

3) Sent the cRIO in for repair.  NI indicated they could not find a problem with this device, but since it took so long to diagnose, they replaced the device with a different unit (refurbished?).

4) Reinstalled the exact same software that was running on the first cRIO (which has now been running continuously for more than 6 months).

5) System crashed in less than 1 day.

6) I eliminated one timed loop and replaced another with a while loop.  This while loop still calls a VI that has 2 timed loops running concurrently.  With this change, the system crashes at random spots rather than during the VI with the 2 timed loops.

 

If anyone has any suggestions on how to diagnose this kind of problem, it would be greatly appreciated.

 

Message 3 of 12

Did you monitor the CPU and memory usage?

Message 4 of 12

If you are running the same codebase and NI sent you a 2nd tested unit then the problem stems from some unique circumstance the 'bad' controller finds itself in.  Your descriptions are vague so it is hard to understand what your setup is and what your code looks like.

Message 5 of 12

I have monitored memory usage, and available memory has never fallen below 5 MB.  I have not monitored CPU usage, but will give this a try.

 

Message 6 of 12

Here's a better description of what I'm trying to do.  This system is used to control the head position on a test machine.  I used the LV RT wizard to create the base VI with 1 deterministic loop and 1 non-deterministic loop.  The deterministic loop schedules 2 different tests.  Test 1 runs every 5 minutes (collect analog data, read temperatures, calculate a new head position based on temperature, drive stepper motors to the new position, collect data after moving, then dismiss).  Test 2 runs once a day with a duration of about 4.5 hours (drive stepper motors to a user-defined position, simultaneously collect data at various rates from 5 Hz to 0.01 Hz, drive the head to the next user-defined position, etc.).  This test has 2 timed loops running at different rates: one collects data, and the other moves the head and acts as a timer to know when to move to the next position.

 

If I never run Test 2, the system never crashes, leading me to believe the problem is in the Test 2 VI.  The crashes don't necessarily occur in the Test 2 VI.  On some occasions, 1-2 hours after Test 2 has completed, the cRIO will start rebooting itself (every 2-3 minutes).  This may happen 4-5 times until it finally hangs completely.

 

Since the crashes happen randomly (it may run for 2-3 days before crashing), I'm trying to find some way of trapping errors or exceptions that would give me some clue as to what the problem may be.
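One way to make message logging survive a hard reset is to force every checkpoint to disk before continuing, so the last complete line in the file shows how far the application got. The following is a minimal sketch in Python, used here only as a textual stand-in for the equivalent LabVIEW file I/O; the names `BreadcrumbLog`, `mark`, and `last_checkpoint`, as well as the line format, are hypothetical and not part of any NI API.

```python
import os
import time


class BreadcrumbLog:
    """Append-only checkpoint log that is forced to disk on every write."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def mark(self, checkpoint, **details):
        # One line per checkpoint: "<timestamp> <name> key=value ...".
        # The checkpoint name must not contain spaces.
        fields = " ".join(f"{k}={v}" for k, v in sorted(details.items()))
        line = f"{time.time():.3f} {checkpoint} {fields}".rstrip()
        self._file.write(line + "\n")
        # Flush Python's buffer, then ask the OS to commit to storage,
        # so the line is not lost if the controller hangs right after.
        self._file.flush()
        os.fsync(self._file.fileno())
        return line

    def close(self):
        self._file.close()


def last_checkpoint(path):
    """Return the name of the last fully written checkpoint, or None."""
    try:
        with open(path, encoding="utf-8") as f:
            text = f.read()
    except FileNotFoundError:
        return None
    # Everything after the final newline is a partial line from an
    # interrupted write (or empty), so it is dropped.
    complete = [ln for ln in text.split("\n")[:-1] if ln]
    if not complete:
        return None
    return complete[-1].split(" ")[1]
```

After a crash, calling `last_checkpoint` on the log file at startup gives the last step that completed. Syncing on every write is slow, so checkpoints like these belong in the non-deterministic loop or at coarse step boundaries rather than inside a fast timed loop.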

Message 7 of 12

Have you used the Distributed System Manager?  It can monitor CPU usage and set up threshold conditions for fault generation and logging.  You can also use the RT fault VIs to publish errors to the DSM.  I use the Set Fault VI when any critical timed loop does not meet its timing constraints; you can also trap any VI errors and generate a published fault.  Also, you still are not describing your system sufficiently... what is going on with the cRIO TCP/IP?  Are you using remote panels, NSVs, etc.?
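The idea of raising a fault whenever a timed loop misses its timing constraint can be sketched as follows. This is a Python illustration of the logic only, not the LabVIEW timed loop or the RT fault VIs; `run_periodic`, `work`, and `on_late` are hypothetical names, and `on_late` stands in for whatever publishes or logs the fault.

```python
import time


def run_periodic(period_s, iterations, work, on_late,
                 clock=time.monotonic, sleep=time.sleep):
    """Run work(i) once per period and report iterations that finish late.

    Returns the number of iterations that missed their deadline.
    """
    late_count = 0
    next_deadline = clock() + period_s
    for i in range(iterations):
        work(i)
        now = clock()
        if now > next_deadline:
            # Deadline missed: report how late this iteration finished.
            late_count += 1
            on_late(i, now - next_deadline)
        else:
            # Finished early: wait out the rest of the period.
            sleep(next_deadline - now)
        # Deadlines stay on a fixed grid, so lateness does not accumulate
        # silently; a long overrun shows up on following iterations too.
        next_deadline += period_s
    return late_count
```

For example, `run_periodic(0.2, 50, collect_sample, log_fault)` would call a hypothetical `collect_sample` at 5 Hz and invoke `log_fault(i, seconds_late)` for each overrun. Because deadlines are kept on a fixed grid, a single long overrun can be reported as several consecutive late iterations while the loop catches up.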

Message 8 of 12

I noticed that the older cRIO has a BIOS version 2.5 while the newer cRIO has a BIOS version of 2.4. 

 

Where can I get the newer BIOS version?  I assume that I can use MAX to update the BIOS.

Message 9 of 12

In MAX, find your cRIO system, right-click on the Software item and choose Install Software.  There's an "Update BIOS" button in the dialog box that appears, if you have a newer version available.

Message 10 of 12