04-07-2015 01:33 PM
Would you be able to provide a system spec with hardware and software used for your system?
All modules and software installed on the cRIO would be helpful information.
I haven’t been able to tie the error back to anything specific at this point. I’m trying to see if I can tie the crash back to a known bug in LabVIEW.
I also noticed a specific error code in one of the crash logs: -63184. Is that code reported in a dialog at the time of the crash?
To confirm, the controller only crashes on a Linux RT target and you have gotten the application to run successfully on a VxWorks target (9024)?
Also, did you uninstall/reinstall the RIO driver on the host before reformatting the target?
04-07-2015 08:26 PM
Hi BIGMACK,
The C-Series modules present in the cRIO are below in order from slots 1 to 8. Note that the cRIO chassis is configured for pure Scan Engine mode.
NI 9265
NI 9265
NI 9265
NI 9208
NI 9208
NI 9476
NI 9425
NI 9213
The software configuration was given in my original post, but I've since reformatted the cRIO and reinstalled its software, so it may be slightly different. The current configuration is below:
The error (-63184) seemed to manifest itself as an issue during deployment. The code which I'm writing has the ability to run an FPGA, dynamically loaded by an 'FPGA' class. This code is unused at the moment, and is not even called from the main application, but wasn't letting me deploy for whatever reason. To get around the deployment error I had to run the offending VI in my FPGA class, but because the chassis is configured for Scan Engine mode, it probably generated that error. I suspect it's unrelated to the cRIO crashing issue, as it isn't present in the other log or in msmeulers' log.
I have only ever seen this crash on the 9067, which is the first Linux RT controller I've used. I didn't see the crash once when developing the same code on the 9024.
I haven't tried uninstalling/reinstalling the RIO software on the host due to time constraints, but I will give that a try when I have a chance.
04-08-2015 05:03 PM
Hi Michael,
By any chance, when you moved to the 9024 controller, did you create a new project? Is there a chance that some files in the original project are somehow corrupted? I’ve seen previous “unexpected cases/errors” where moving to a new project and adding items progressively solved the issue.
Regards,
AGJ
04-13-2015 12:04 AM
For us, neither a new project nor a format and fresh installation helped. The system still crashes hard after the TCP connection is closed.
04-14-2015 11:07 AM
I’ve found that at least one service request has been created for this issue in our system. The service request will be the fastest method of resolving the current issue. I will monitor the work on that service request and try to post anything that seems relevant to this discussion.
Is there a minimalistic version of your code that reproduces the issue?
Also, can you confirm that you are not using any of these functions in your Real-Time code?
Unsupported LabVIEW Features on NI Linux Real-Time Targets - http://digital.ni.com/public.nsf/allkb/1C16BFEA5262E33E86257A46006FC92B
Thank you for providing the modules and software set. I don’t see any concerns with the modules and compatibility. I do know that the NI-RIO 14.6 patch was released recently. Have you installed this patch? We identified a few critical bugs in the NI-RIO 14.5 drivers that were fixed in this patch. It’s possible your issue was tied to this update.
http://www.ni.com/download/compactrio-module-software-14.6/5256/en/
I’m escalating this issue internally and will post any relevant findings separate from those discovered on the support ticket currently in work.
04-15-2015 12:30 AM - edited 04-15-2015 12:45 AM
Hello Will,
The 14.6 update didn't resolve the issue. I hadn't considered this update because the listed fixes were only module related, and the RT part has been reduced so that no C Series modules are included. But just to be sure I updated the system, without any improvement.
I went to local support (which escalated this to the US branch) because this post wasn't initiated by me, and I wasn't sure if this is exactly the same problem, since the error message isn't the same. I didn't want to end up without a solution after some time, because the deadline of this project is approaching fast.
I attached the reduced code; for RT there wasn't much to reduce. The GUI is now the network transceiver in the project. If you deploy the RT VI and try to connect using the network transceiver, you will see the message counter increasing while it's connected. If I press exit, most of the time a project pop-up appears stating that the connection has been lost. Reconnecting will show the error (0x661).
The application start time and any errors generated by the application are now logged in a file at the following location on the controller: :\home\lvuser\natinst\nisyslog : (<hostname>.txt)
I'm not using the functions listed in 'Unsupported LabVIEW Features on NI Linux Real-Time Targets'. In fact the same (STM) setup is used in another application, which is running successfully, but under 2013 SP1. I will try to dig this one up, reduce it and see if it shows the same problem in 2014. However, most of the STM code was copied and modified in this project. I also tried to reprogram the complete RT application, and this resulted in the same behaviour.
UPDATE: With the 2013 SP1 code upgraded to 2014 it took a few attempts, but then the same problem appeared. Again: this code was used in another project and has also been running on 3 different 9068 controllers for months without any problems.
Best regards,
04-15-2015 06:12 AM
Hello there,
I am experiencing the same error on my cRIO.
Over the past week the system has given the error (Hex 0x661): the LabVIEW Real-Time process encountered an unexpected error and restarted automatically.
The strange part is that during the first weeks of testing there was no problem, whether the system was running standalone or connected to the HMI.
The setup is the following :
cRIO 9066
modules :
NI 9205
NI 9211
NI 9264
NI 9472
After reading all the info in this topic I was able to get the log files from the system, and I have attached them to this post.
I want to know what is going wrong when it reboots automatically, so I can find a solution for what is causing this problem.
Kind regards
Joris Willems
Test Bench Engineer
04-15-2015 10:43 AM - edited 04-15-2015 11:12 AM
Joris,
Are you by chance using the STM library provided by systems engineering?
04-15-2015 11:33 AM
MSMuelers,
I've been monitoring the internal support ticket, but I'll post the solution we think we may have found (at least for a workaround). The original behavior still isn't completely explainable, and this behavior will be reported to R&D for investigation.
One of the engineers internally was able to reproduce the issue with no changes other than the hardcoded IP addresses. When running the code in Highlight Execution mode, the error was not reported. The Data Sender loop was entering the Exit case while the Data Receiver loop was in the Read Data case. When running at execution speed, the read might be occurring after the connection was closed. A 500 ms wait was added to the Exit case of the Data Sender loop before the connection closes, and this seems to have resolved the issue.
Can you reproduce this on your end with the wait function or by using a notifier or other communication method to prevent the Exit case from running without all other TCP/STM functions completed?
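For anyone following along without LabVIEW open, the handshake idea (the notifier alternative to a fixed 500 ms wait) could be sketched in Python roughly as below. This is only an illustration of the ordering, not the STM library or the attached project. The names `run_demo`, `stop_requested` and `receiver_done` are made up for the example, and a local socket pair stands in for the TCP connection between the host and the cRIO.

```python
import socket
import threading


def run_demo(messages=5):
    """Send a few messages, then shut down without closing under an active read."""
    # A local socket pair stands in for the host <-> RT target TCP connection.
    sender_sock, receiver_sock = socket.socketpair()
    receiver_sock.settimeout(0.1)  # short timeout so the reader can poll the stop flag

    stop_requested = threading.Event()  # raised by the sender's "Exit" case
    receiver_done = threading.Event()   # raised once the reader will not read again
    received = []

    def receiver():
        try:
            while True:
                try:
                    data = receiver_sock.recv(1024)
                except socket.timeout:
                    # Nothing pending: only now is it safe to honour the stop request.
                    if stop_requested.is_set():
                        break
                    continue
                if not data:
                    break  # peer closed the connection
                received.append(data)
        finally:
            receiver_done.set()

    reader = threading.Thread(target=receiver)
    reader.start()

    for i in range(messages):
        sender_sock.sendall(b"msg%d;" % i)

    # "Exit" case: ask the reader to stop, then wait for its acknowledgement
    # instead of sleeping for a fixed time before closing the connection.
    stop_requested.set()
    closed_cleanly = receiver_done.wait(timeout=2.0)
    reader.join()

    sender_sock.close()
    receiver_sock.close()
    return closed_cleanly, b"".join(received)


if __name__ == "__main__":
    print(run_demo(3))
```

The point of the sketch is only the ordering: the close happens after the reader has confirmed it is finished, so the result does not depend on how long the last read takes. In LabVIEW the two events would map to a notifier (or similar) shared between the Data Sender and Data Receiver loops.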
MichaelBalzar,
Are you also using TCP/IP or the STM library?
Can you post a simplified version of your code that reproduces the error?
If there is no simplified version available, I can try and reproduce the error internally and see if we can pinpoint a similar error source.
04-15-2015 06:52 PM
Hi BIGMACK,
Thanks for escalating this issue.
The crash I'm seeing doesn't appear to be as reproducible as with the code msmuelers has provided. I'd be lucky to see it once or twice a week while developing on the cRIO full time. When the crash does happen, it seems to happen with a cluster of events (deployment issues, LV crashes, other internal faults), though they may just be coincidental. So I don't have any code, simplified or otherwise, which can reproduce the error with any degree of reliability. It's just a matter of developing full time on the cRIO and it eventually pops up.
The code I have makes heavy use of the Current Value Table (CVT), several classes with multiple levels of inheritance, and programmatic access to shared variables and IO variables. When I posted the initial crash, I don't think the code used shared variable access, but it did use programmatic access to Scan Engine IO. I'm not using the STM library or any direct TCP/IP access, but I guess any variable access hits the TCP/IP stack at some point.
I just had a thought - I've been doing a lot of performance testing and monitoring via the System State Publisher data in the DSM, which no doubt uses TCP/IP too, so that may be a contributing factor.