cRIO Scan Engine; Missing points in RT

viScience · ‎02-23-2012

I believe the scan engine is a fixed bitfile that actualizes itself as a small FPGA microcontroller that is configured by a file called variables.xml on the cRIO. The variables.xml file is generated by the project or system replication tools but can also be manipulated by LV properties of the IO.var class. The FPGA scan engine then works cooperatively with the RT SVE components to pipe data up to your LV diagram.

Jed_Davidow · ‎02-23-2012

Thank you for that answer. The big question is: are there different versions of the ScanEngine bitfile that are appropriate for different FPGAs (specifically, the older LX45 and the newer LX30)? Is it possible that by not configuring my project correctly and then swapping hardware, I pushed an inappropriate ScanEngine onto my 9076 and caused the trouble I was seeing? (Trouble that has since been eliminated with a reformat and reinstall of the target.)

Jed_Davidow · ‎02-23-2012

Here is a snippet of what I am doing (Generally; There are some missing details, but you get the picture).

I have found that I can compile this to an RTEXE and then FTP it to the startup folder and the targets run it. I currently have code in there to handle 9205, and 9234-9237 modules.

I also swapped in a 9401 just to see if it detected on the target, and there was no problem (never had the 9401 in during Dev) detecting.

That's a Scanned Variable Read, not a Direct Variable read in the timed loop.

What do you guys think?

Jed_Davidow · ‎02-24-2012

Trying to attach the snippet. Very strange- it's showing up perfectly well inline on the machine I posted it from, but not the others.

ColdenR · ‎02-24-2012

Jed,

That's a pretty cool way to access variables from different modules that you can just swap in and out. This certainly seems like it should work (especially given that you've already tested it). This document talks about the RIO Scan Interface under the hood:

http://zone.ni.com/devzone/cda/tut/p/id/7693

Specifically, there is a relevant paragraph under "Inside the RIO Scan Interface":

The RIO Scan Interface contains several components that enable the flexibility and performance it provides. Each I/O module communicates directly with a cartridge controller responsible for detecting the module type and communicating I/O data to and from the module. The cartridge controller is a “soft-core” eight-bit microcontroller, which is instantiated in the FPGA, allowing you to use any supported I/O module without compiling.

So basically the RIO scan interface is a bit file that gets loaded onto the FPGA when you're using the scan interface.

However, the bit file is specific to the type of FPGA that you're using (as specified by the project). So that's why you were seeing all those problems with your 9076. It sounds like you loaded a RIO scan interface bit file intended for the FPGA in the 9012/9112 setup onto the FPGA in the 9076. I'm actually surprised that the program allowed you to do this, and that it worked at all. But in the future you will definitely need to specify the correct cRIO type in the project - changing the IP for the project only works if you're using the same cRIO model for both IPs.

The easiest way to handle this though is to just create a new project that references all the same files from the 9012/9112 project. Then you can use the 9076 FPGA in one project and the 9012/9112 in the other project, but you don't need separate files.

Colden

Jed_Davidow · ‎02-24-2012

> The easiest way to handle this though is to just create a new project that references all the same files from the 9012/9112 project.

Yeah- I haven't done that before and I was a little worried about what the implications would be. BUT it appears that not only does my customer want me to support this code on the 9076 and 9012/9112, but he wants a special version that runs a specific set of modules on the 9076 to run as fast as possible without Scan Mode. So looks like I will have several projects.

Thanks!

StephenB · ‎02-24-2012

You could be dropping samples because of the way your FIFO read timing is set up. The 0 timeout on your FIFO reads means it will not wait for data to actually be present. Instead the execution will always move to the case structure.

The wait in the case structure is not synchronized to the scan engine and you could potentially get out of phase or even have your waiting preempted.

A better timing would be to have the bottom loop block on the RT FIFO read. That way it executes whenever the data shows up and you're sure not to miss anything unless your FIFO overflows because the bottom loop can't keep up with the top loop. It's easy to catch that: set the FIFO write timeout in the top loop to 0 and put a counter on the "timed out?" output terminal. If it gets above 0, your FIFO is overflowing because the consumer isn't keeping up.
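Since a LabVIEW diagram can't be shown as text, here is a rough Python sketch of the same idea. It is only an analogy: `queue.Queue` stands in for the RT FIFO, `put_nowait` for a write with a 0 timeout, and the names `run_demo`, `overflow_count`, and `SENTINEL` are made up for illustration.

```python
import queue
import threading

def run_demo(n_samples=2000, depth=1024):
    """Producer writes with zero timeout; consumer blocks on the read."""
    fifo = queue.Queue(maxsize=depth)   # stand-in for the RT FIFO
    overflow_count = 0                  # counter on the write's "timed out?" output
    logged = []                         # stand-in for the disk log
    SENTINEL = None                     # hypothetical shutdown marker

    def consumer():
        while True:
            item = fifo.get()           # blocks until data is actually present
            if item is SENTINEL:
                break
            logged.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()

    for sample in range(n_samples):
        try:
            fifo.put_nowait(sample)     # zero-timeout write in the "top loop"
        except queue.Full:
            overflow_count += 1         # consumer isn't keeping up: point dropped

    fifo.put(SENTINEL)                  # blocking put, so shutdown is never dropped
    worker.join()
    return overflow_count, logged
```

If `overflow_count` ever comes back above 0, points were lost between the two loops; if it stays at 0, every sample that was written also shows up in `logged`.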

Stephen B

StephenB · ‎02-24-2012

"waiting preempted" doesn't make any sense. Never mind that. But you could be out of phase.

Either way the timed out counter on your writes will be good debugging.

Stephen B

Jed_Davidow · ‎02-24-2012

Hi Stephen,

Thanks for the analysis. But the Zero-timeout-to-a-wait is on purpose. First off, the consumer loop doesn't need to be in sync with the acquisition. It's just logging, no control or logic on the data.

-If the FIFOs are empty, then the consumer loop enters the wait state. Assuming that the system isn't fundamentally over-burdened, it is possible that the top loop will write to the FIFO multiple times during that wait. But that FIFO is 1024 points deep, so it should be able to handle a backup every so often. Basically that's enough to handle any DAQ speed up until the point where it cannot keep up overall.

-If the FIFOs are not empty, data is read from the FIFO and the other case executes. This case always calls a disk write, and (in my actual code) calls UDP write every N loops so that a remote listener can pick up a subset of the data. If there is still another point in there, it loops without a wait. So it eventually catches up.
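The two cases above can be sketched roughly as follows in Python. This is an analogy, not the actual diagram: `queue.Queue` stands in for the RT FIFO, and `drain_or_wait`, `handle`, and the 10 ms default for `wait_s` are assumptions made for illustration (the real wait time isn't stated here).

```python
import queue
import time

def drain_or_wait(fifo, handle, wait_s=0.01):
    """One pass of the consumer loop: zero-timeout read, wait only if empty.

    Returns True if a point was read and handled, False if the FIFO was empty.
    """
    try:
        item = fifo.get_nowait()   # zero-timeout read
    except queue.Empty:
        time.sleep(wait_s)         # "empty" case: wait and give the CPU back
        return False
    handle(item)                   # "data" case: log the point, no wait
    return True
```

Calling this in a `while` loop gives the catch-up behaviour: as long as points are queued, passes run back to back with no wait, and the loop only sleeps once the FIFO is empty.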

(I also have a couple other FIFOs being fed by the producer loop in the main code, including one that also passes the Timed Loop Global End Time (or whatever the ns timer is called) to the consumer loop. This FIFO is written to last in the error chain, so it's the FIFO I check to see if there is data ready in the FIFOs. If that one has data, they all have data. Any error in the chain and this one would not have data. I'll admit that my error handling is not all that robust, but I have not yet seen any errors in the FIFO portions of the code, ever.)

Also, there are a couple other loops running in parallel to these: a couple to broadcast status over UDP, as well as one to listen for commands from the remote control station. Lastly, there is another loop listening to a ~4kB/sec serial stream for specific flags. That loop is crucial for syncing this data to another DAQ system in the lab, so I need to keep the CPU utilization low... What I have noticed is that if I call the FIFO reads with non-zero timeouts, the RT CPU gets bogged down considerably, if not completely (at 250Hz acquisition, the entire program utilizes >90% of the CPU, WITH the zero timeouts). This is the primary reason that I use a zero-timeout + wait state.

The old LabVIEW DAQ drivers (pre-DAQmx) used to be the same way for reads from the buffer: calling the buffer read with a non-zero numData when it was empty killed the processor. So you called it with numData=0 and checked the buffer utilization. If it was non-zero, it was safe to read that amount of data. It's actually kind of annoying that most of the basic examples use the timeouts that kill CPU, both in RT and non-RT applications. I end up recoding that for many a customer... OK. Maybe it's not such a bad idea. 🙂

So the upshot is that if the FIFOs build up (every so often, when I monitor them, they shoot up to 70-80 points, then drop to zero when running at 250-400Hz), the UDP writes might fall 1/4-1/2 second behind, which is no big deal. The data still gets logged without missing any.
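As a quick sanity check on those numbers, here is the arithmetic in Python. The helper name `seconds_of_data` is made up for illustration; the depth (1024), backlog (70-80 points), and rates (250-400Hz) are the figures quoted in this post.

```python
def seconds_of_data(points, rate_hz):
    """How many seconds of acquisition a given number of queued points represents."""
    return points / rate_hz

for rate_hz in (250, 400):
    headroom = seconds_of_data(1024, rate_hz)   # full FIFO depth
    lag_low = seconds_of_data(70, rate_hz)      # observed backlog, low end
    lag_high = seconds_of_data(80, rate_hz)     # observed backlog, high end
    print(f"{rate_hz} Hz: headroom {headroom:.2f} s, lag {lag_low:.3f}-{lag_high:.3f} s")
```

So a 70-80 point backlog is roughly 0.2 to 0.3 seconds of data at these rates, and the consumer would have to stall for about 2.5 to 4 seconds before a 1024-point FIFO fills up.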

I have tried buffering the disk writes and not writing to the USB flash drive on every loop with data; cutting that down by even 50x doesn't seem to affect the performance appreciably, so I removed that code for simplicity's sake.

StephenB · ‎02-25-2012

I understand better now, thank you.

I still don't like the idea of a hard coded wait and a zero timeout. This should be avoided by using a non zero timeout on the FIFO read. You are correct, your CPU usage will go to 100% while blocking on the FIFO nodes... But only if the obtain FIFO node has its read/write mode configured for "polling". If you need to save CPU, you should change that to "blocking". See the detailed help on the open FIFO node.

Assuming the file IO nodes to be working- there are two places the data could be lost:
1) it is never appearing in the time critical loop in the first place
2) it is failing to transfer between the time critical loop and the logging loop

We can easily check #2 by putting counters on the "timed out" output terminal of the FIFO write in the timed loop. You can get one in the signal processing -> point by point -> other palette. If the number(s) are ever over 0... you're dropping points.

Stephen B
