Real-Time Measurement and Control


Host to Target DMA FIFO Read Timeout in cRIO-9049

Hi there! (sorry for the long description and for the lack of attached VIs, no permission here)

I'm using LabVIEW 2019 (32-bit) and a cRIO-9049 to implement an XYZ motion control system that must follow trajectory references. Closed-loop control (feedback readings + control law calculation + actuator commands) is implemented in the FPGA at 20 kHz (50 µs period). Trajectory references discretized at 20 kHz are read from a single CSV file by the RT application and streamed to the FPGA using a single Host to Target DMA FIFO channel.

 

The problem: sometimes I get a read timeout on the FPGA side, and I can't explain it, nor solve it. For the same trajectory file, this error occurs in roughly 4 out of 10 attempts, so most of the time it runs without problems, but it eventually fails at some random sample of the trajectory, with no regularity observed.

 

I can't ignore this error by waiting until a new sample arrives on the FPGA side, because this would distort the motion profile.

 

A "trajectory sample" consists of 8 elements (SGL type) :

  • 1 trigger (irrelevant to this discussion)
  • 3 position references (X, Y and Z)
  • 3 feedforward control efforts (X, Y and Z)
  • 1 dummy element, to make total size a power of 2 (required by the "Number of elements per read" FIFO setting)

As suggested here and here, to make efficient use of the DMA FIFO, I write the data in blocks whose size is configurable before starting the streaming. I've mostly been working with a 200 ms block size, which corresponds to (200 ms/block) * (20 kHz) * (8 elements/sample) = 32000 elements/block, although I tested other sizes and that didn't solve the problem.
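Just to make the numbers above concrete, here is a quick Python sketch of the sample layout and block sizing (the constants and names are mine, purely for illustration; the real implementation is a LabVIEW diagram):

    # Illustrative only: sample layout and block sizing arithmetic.
    ELEMENTS_PER_SAMPLE = 8       # trigger, X/Y/Z position refs, X/Y/Z feedforward, 1 dummy
    SAMPLE_RATE_HZ = 20_000       # FPGA controller rate (50 us period)
    BLOCK_DURATION_S = 0.200      # block size I've mostly been using

    samples_per_block = int(BLOCK_DURATION_S * SAMPLE_RATE_HZ)      # 4000 samples
    elements_per_block = samples_per_block * ELEMENTS_PER_SAMPLE    # 32000 elements

    print(samples_per_block, elements_per_block)  # 4000 32000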

 

The FPGA consumes these blocks sample-wise (8 elements per read) at the 20 kHz controller rate. On the RT side, I implement a loop that sends one block per iteration at a rate equal to the block duration. This way, I try to keep the "mean flow of writes" equal to the "mean flow of reads". For example, with a 200 ms block size, my RT loop runs with a 200 ms iteration period. To give the RT side some head start, I only start reading blocks in the FPGA after two blocks have been written, i.e., I enable FPGA FIFO reading (using an FPGA front-panel Boolean) only at the end of the second iteration of the RT loop.
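In pseudocode terms, the RT write loop behaves roughly like this (Python here just to describe the logic, not the actual LabVIEW code; write_block_to_fifo and set_fpga_read_enable are placeholders for the DMA FIFO Write method and the front-panel Boolean):

    import time

    BLOCK_PERIOD_S = 0.200  # loop period equals the block duration

    def stream_trajectory(blocks, write_block_to_fifo, set_fpga_read_enable):
        """One block per iteration; FPGA reading is enabled once two blocks are queued."""
        next_deadline = time.monotonic()
        for i, block in enumerate(blocks):
            write_block_to_fifo(block)       # 32000 elements per call
            if i == 1:
                set_fpga_read_enable(True)   # RT is now two blocks ahead of the FPGA
            next_deadline += BLOCK_PERIOD_S
            time.sleep(max(0.0, next_deadline - time.monotonic()))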

 

I've tried different implementations of this RT loop, all of which I'm pretty confident can run every iteration faster than the specified loop time/block duration.

 

First, I tried different loop structures and timing monitoring, such as:

 

  • While loop timed with Wait Until Next ms Multiple and monitored using different Tick Counts and Flat Sequences
  • Timed Loop, monitored with Tick Counts and left node readings, such as Finished Late? and Iteration Duration
  • Timed Loop assigned to a dedicated CPU, monitored with Tick Counts and left node readings, such as Finished Late? and Iteration Duration

For these options, I've tried:

 

  1. Reading each block from the CSV file inside each iteration of the RT loop, which is less recommended due to File I/O overhead/indeterminism, but still didn't result in any "late iteration"
  2. Reading the entire file beforehand and only then starting the loop, which uses a lot of memory in RT but drastically reduces the execution time of each iteration (sketched below)

For a 200 ms block size, the first approach resulted in iterations of ~130 ms, and the second approach around 200 µs. The FIFO read timeout in FPGA occurred with all combinations from the two lists above.
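For the second approach, the idea is simply to pay the File I/O cost up front so the loop only indexes pre-loaded blocks; a minimal sketch (Python again, with a made-up helper, assuming one 8-element sample per CSV row):

    import csv

    def load_trajectory(path, elements_per_block=32000):
        """Read the whole CSV once and split it into FIFO-sized blocks."""
        with open(path, newline="") as f:
            flat = [float(v) for row in csv.reader(f) for v in row if v]
        return [flat[i:i + elements_per_block]
                for i in range(0, len(flat), elements_per_block)]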

 

I've also played with buffer sizes. On the RT side, the FIFO depth is set to 160768 elements, corresponding to ~1 second of trajectory (8 elements/sample * 20 kHz), and I never got even close to filling it. On the FPGA side, I've tried the following sizes:

 

  • 1029 (minimum configurable, recommended here)
  • 32815 (enough for 1 block of 200 ms)
  • 65583 (enough for 2 blocks of 200 ms)

Other things I've tried, without success:

 

  • Setting different timeouts on the FPGA FIFO Read method, from 0 to 40 ticks (1 µs at the 40 MHz FPGA clock); anything above that would start to compromise my closed-loop implementation
  • Setting Number of Elements Per Read in the FIFO configuration to 1 (instead of 8) and running the FIFO Read method inside a For Loop 8 times
  • Starting to read the FIFO in the FPGA only after the third block was written by RT, instead of the second

Since all iterations of the RT loop seem to run on time (based on tick counts and Timed Loop left node readings) and RT is always at least one block ahead of the FPGA, I started to suspect the FPGA side or the DMA FIFO engine itself.

 

I started monitoring both FIFOs using "Empty Elements Remaining" (RT) and "Get Number of Elements to Read" (FPGA) Invoke Methods and noticed that:

 

  1. When no problem occurs, they indicate that the number of elements in both FIFOs remains roughly constant, confirming the proper balance between the flows of writes and reads, as expected. The RT FIFO stays close to 64000 elements (two blocks of 200 ms), and the FPGA FIFO close to its configured size;
  2. A couple of seconds before a read timeout occurs in the FPGA, the number of elements on the RT side starts to fall at a rate of ~2 kHz, while the number of elements in the FPGA FIFO remains roughly the same, until the RT FIFO is empty. At that moment, the number of elements in the FPGA FIFO starts to fall at a rate of 22 kHz, which is 2 kHz above the expected 20 kHz rate

I've checked these 2 kHz and 22 kHz figures by reading the number of elements in both FIFOs periodically, using different methods in RT and post-analysis, and it really seems that the FPGA suddenly increases the rate at which it reads the FIFO. This could explain why I get the timeout, but then I have no explanation for how it could occur.
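The post-analysis basically amounts to logging both counts with timestamps and differentiating them offline; roughly (Python sketch, with get_rt_empty_elements and get_fpga_elements_to_read standing in for the two Invoke Methods):

    import time

    def log_fifo_levels(get_rt_empty_elements, get_fpga_elements_to_read,
                        period_s=0.010, duration_s=30.0):
        """Sample both FIFO levels periodically so fill/drain rates can be estimated offline."""
        samples = []
        t_end = time.monotonic() + duration_s
        while time.monotonic() < t_end:
            samples.append((time.monotonic(),
                            get_rt_empty_elements(),
                            get_fpga_elements_to_read()))
            time.sleep(period_s)
        return samples  # rate ~ d(elements)/dt between consecutive entries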

 

I double-checked (by using, for example, a dedicated Timed Loop and monitoring its Iteration Durations) whether this sudden rate increase could just be a problem in the RT implementation of this measurement, and I'm confident it's not a false conclusion.

 

The FPGA loop responsible for reading the FIFO at 20 kHz is implemented simply with a While Loop, a Flat Sequence and a Loop Timer with a constant input, as recommended in many examples. I tried configuring this Loop Timer in ticks and in µs, with no difference observed. And when I check the difference of its output between iterations, it stays constant and equal to its input value, so there is no indication that the loop rate increased.
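For reference, the structure of that FPGA loop, translated into Python-style pseudocode (loop_timer_wait, fifo_read and apply_control are placeholders; the real thing is a While Loop with a Flat Sequence and a Loop Timer):

    LOOP_PERIOD_TICKS = 2000  # 50 us at the 40 MHz FPGA clock

    def fpga_control_loop(read_enable, loop_timer_wait, fifo_read, apply_control):
        while True:
            loop_timer_wait(LOOP_PERIOD_TICKS)   # Loop Timer with a constant input
            if read_enable():                    # front-panel Boolean set by RT
                sample, timed_out = fifo_read(n_elements=8, timeout_ticks=0)
                if timed_out:
                    raise RuntimeError("FIFO read timeout")  # the error in question
                apply_control(sample)            # feedback + feedforward at 20 kHz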

 

Lastly, both the RT and FPGA implementations have a lot going on besides this data streaming, and we considered that this could also explain the problem, although all loops had their iteration durations checked, CPU loads in RT are fine, and the most-used resources in the FPGA are slices (71%) and LUTs (52%).

 

In summary:

 

  1. In some attempts to stream data from Host (RT) to Target (FPGA) using a DMA FIFO, I get a read timeout on the FPGA side;
  2. I can't ignore it or wait for a new sample to arrive, because that would compromise the motion trajectories;
  3. Data is sent periodically in chunks, to improve streaming efficiency. Different implementations of the RT streaming loop were tested and all respect the iteration deadline, so the FIFO buffer should neither fill up nor empty;
  4. Different configurations of FIFO sizes, timeouts and loop timings were tested, but none solved the problem;
  5. RT streaming is always at least 2 chunks ahead of the FPGA reading (except when the problem occurs);
  6. I have evidence (checked using different approaches) that the timeout occurs because the FPGA suddenly starts to read the FIFO faster than expected (22 kHz instead of 20 kHz; the 2 kHz difference is the same rate at which the RT FIFO starts to empty before a read timeout occurs);
  7. The application runs a lot of stuff in parallel both in RT and FPGA, although proper loop timings were repeatedly confirmed by different methods

I've been stuck on this problem for at least 2 weeks already and need to find a solution as soon as possible. If anyone has any ideas about something I'm missing or new tests I could try, I would really appreciate it! Again, I wish I could post some VIs, but at the moment I have no permission.

 

Best Regards,


Gabriel O. Brunheira

Mechatronics Engineer
Brazilian Synchrotron Light Source

Message 1 of 5

Update: I created a version of the application in RT and FPGA containing only the code responsible for the data streaming, and the timeout error still occurred, so the parallel execution of the other stuff mentioned in the previous post seems to have nothing to do with it.

Message 2 of 5

Gabriel,

This post may not directly match your problem, but there is a lot of useful detail about DMA FIFO setup, some of which is fairly obscure, and I know our LV developer used some of the insights from it to solve a problem we were having with losing data (rather than timeouts):

https://forums.ni.com/t5/LabVIEW/DMA-FIFO-switching-beteen-channels-after-FPGA-sends/td-p/2556251

 

Good luck,

Andy

Consultant Control Engineer
www-isc-ltd.com
Message 3 of 5

Hi Andy!

 

Thank you very much for sending this information, I'll take a look!

 

Since my last message, I was able to get new information about the problem. I modified the main VIs in RT and FPGA to toggle two digital outputs: one that toggles in the RT write loop (DIO1), and the other in the FPGA read loop (DIO2). Then I monitored them with an oscilloscope (see below). The red curve is an analog output that reproduces the trajectory reference:

 

[Attached oscilloscope capture: gabrielbrunheira_0-1741978969986.png]

 

As expected, the first signal (RT) toggles with a period of 200 ms, and the second one at 50 µs (the 20 kHz FPGA controller rate). But when the problem occurs, the RT period simply increases to around 220 ms (~18 kHz) and stays there. This way, the "mean flow of writes" decreases, so the number of elements in the RT buffer slowly decreases until it's empty. At that moment, the FPGA buffer starts to empty until an underflow occurs. This 18 kHz explains the "2 kHz question" I mentioned in my original post.
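The numbers are consistent with that: a 200 ms block holds 4000 samples, so writing one block every ~220 ms is an effective supply rate of roughly 18 kHz, about 2 kHz short of what the FPGA consumes (a quick sanity check below, using the figures measured above):

    samples_per_block = 4000          # 200 ms of trajectory at 20 kHz
    observed_write_period_s = 0.220   # RT toggle period when the problem occurs

    effective_supply_hz = samples_per_block / observed_write_period_s   # ~18182 Hz
    deficit_hz = 20_000 - effective_supply_hz                           # ~1818 Hz, the "~2 kHz"
    print(round(effective_supply_hz), round(deficit_hz))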

 

I was monitoring the RT loop rate using different "software indicators", including Tick Count differences, Iteration Duration and Finished Late? from the Timed Loop version, and none of these indicated that RT was taking more time than expected, not even once. Actually, it was even stranger: when the problem occurs, the measured Iteration Duration decreases by a couple of milliseconds!

 

To test these software indicators, I induced a delay in a specific iteration of the RT write loop by using a Wait VI when the iteration number was equal to a chosen value. In this case, I was indeed able to see the RT digital output taking longer to toggle on that iteration (see below), and the delay also showed up in my software indicators (tick counts, Iteration Duration and Finished Late?):

 

[Attached oscilloscope capture: gabrielbrunheira_1-1741978969994.png]

 

The only explanation I came up with is that somehow the clock used by RT as a timebase reduces its frequency (maybe due to overheating?), so RT gets slower relative to the FPGA without noticing it. I believe this could even explain why the Iteration Duration decreased (as mentioned above) when the problem occurs.

 

Anyway, knowing the problem is on the RT side, I've now changed the streaming strategy, and I no longer write a block at a period equal to the "block duration". Instead, I'm writing the blocks as soon as there's space available in the FIFO, by setting a non-zero timeout on the Write method and removing the timing of the write loop. I'm still analyzing the impact this has on CPU load; maybe I'll have to include some strategy to avoid CPU starvation.
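In pseudocode, the new write loop looks roughly like this (Python again, with write_block_to_fifo standing in for the DMA FIFO Write method; the Write method's own timeout now provides the pacing instead of an explicit wait):

    WRITE_TIMEOUT_S = 0.5  # non-zero timeout: the write blocks until space is available

    def stream_trajectory_free_running(blocks, write_block_to_fifo):
        """Write each block as soon as the host FIFO has room; no per-iteration wait."""
        for block in blocks:
            ok = write_block_to_fifo(block, timeout_s=WRITE_TIMEOUT_S)
            if not ok:
                raise RuntimeError("Host FIFO write timed out; FPGA is not draining it")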

 

This seems to solve the problem, but I still have no explanation for why RT suddenly takes more time to run its loop without being able to notice it (using the so-called "software indicators").

 

What do you think?

 

Thanks again!

Message 4 of 5

Great to hear you have identified the problem and developed a workaround.


Consultant Control Engineer
www-isc-ltd.com
Message 5 of 5