Consumer loop - unpredictable queue rate

Riv · ‎10-15-2013

Hello everyone,

I'm developing a producer-consumer structure on RT and I have a strange problem. I receive data from the FPGA in the producer loop and send it through a queue to the consumer loop, which should then forward it over a network stream to the host. Because the consumer didn't keep up with the producer (the queue overflowed), I started to slow the producer down (the cost is that each iteration sends more data). It didn't help, so I added a case structure in the producer that sends only half of the data to the queue (the iterations where the shift register is TRUE).

This is the moment when I realized that closing the LabVIEW environment and opening it again sometimes results in faster queue transmission. Then I can even speed up the producer loop and my queue doesn't overflow! And I can stop and run my program again and it still works well.

I end up closing and opening LabVIEW and running my program all over again, waiting for it to "click", because when it finally happens, I have no problem until the next LabVIEW session.

Does anyone have any idea what I might be doing wrong, or how I could fix it?

0utlaw · ‎10-15-2013

Hi Owca,

Are you running your RT application interactively, or deploying it as a standalone RT executable? Interactive execution will definitely slow down your RT code and network communication speeds.

Regards,

Tom L.

Riv · ‎10-15-2013

I'm using interactive execution. I don't have my RIO with me now, but I'll try to build a standalone application tomorrow and I'll report how it works. Thanks!

aputman · ‎10-15-2013

Couple questions...

Why do you have a timeout on your dequeue operation and a timeout on your "write stream" vi? Looking at your code, I don't see a reason why you need a timeout. I would set the dequeue to wait forever for another element to appear.

Why do those 2 VIs run parallel to each other? Why not dequeue the element and immediately stream it in the same loop iteration?
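To make the second question concrete: LabVIEW is graphical, so the following is only a text-based Python sketch of the pattern, not the poster's code. It assumes a consumer that waits indefinitely on the queue and forwards each block in the same iteration. The `STOP` sentinel, `write_stream` callback, and block sizes are hypothetical names and values chosen for illustration.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel that tells the consumer to exit


def consumer(q, write_stream):
    """Wait for each block and forward it in the same loop iteration."""
    while True:
        block = q.get()          # no timeout: wait until an element arrives
        if block is STOP:
            break                # clean exit without polling on a timeout
        write_stream(block)      # stand-in for the network stream write


def run_demo(n_blocks=5):
    """Push n_blocks through a bounded queue and return what was forwarded."""
    q = queue.Queue(maxsize=8)
    forwarded = []
    worker = threading.Thread(target=consumer, args=(q, forwarded.append))
    worker.start()
    for i in range(n_blocks):
        q.put([i] * 4)           # producer side: one block per iteration
    q.put(STOP)
    worker.join()
    return forwarded
```

The sentinel is one way to stop a consumer that waits without a timeout; the right shutdown mechanism in a LabVIEW application may differ.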

aputman

ValkoA · ‎10-16-2013

Dear Owca,

I see no problems in the code you've sent, and the speeds you're working with are fairly low, too. This leads me to assume that other parts of the code might be causing the issue you're experiencing. Let me make a few suggestions:

- Check processor usage when the VI is running. Since the consumer is running at normal priority (as it should), other high-priority tasks might starve it.
- Approximately how much data is there in an array? Another good test is to monitor memory usage to see if fragmentation is causing any trouble.
- Are you using a fixed-size queue? If not, I do recommend it to avoid dynamic memory allocations.
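As a rough illustration of the fixed-size queue idea, here is a Python sketch (not LabVIEW code). It only shows the bounded-capacity behaviour and how overflow can be counted; Python's `queue.Queue` does not preallocate memory the way a fixed-size queue on an RT target is meant to. The capacity of 4 and the block size of 20,000 are assumed values.

```python
import queue


def enqueue_blocks(q, blocks):
    """Try to enqueue every block; return how many were dropped on overflow."""
    dropped = 0
    for block in blocks:
        try:
            q.put_nowait(block)      # never grows the queue past its maxsize
        except queue.Full:
            dropped += 1             # overflow is visible instead of silent
    return dropped


def demo():
    """Offer six blocks to a four-slot queue with no consumer running."""
    q = queue.Queue(maxsize=4)       # assumed capacity, for illustration only
    blocks = [[0.0] * 20_000 for _ in range(6)]
    dropped = enqueue_blocks(q, blocks)
    return q.qsize(), dropped
```

Counting dropped blocks this way makes it easy to see whether the consumer is falling behind, which is the symptom described in this thread.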

Disabling debugging features also helps quite a bit. Please get back to me if you have results.

Kind regards:

Andrew Valko

NI Hungary


Riv · ‎10-18-2013

Hello everyone,

first of all, I'd like to apologise for such a late response. Something important came up and took a while, but I'm back now.

Firstly, aputman:

- If I wire "-1" to the timeout, I won't be able to stop the loop if an error occurs once the loop is executing. Maybe I care too much, but it shouldn't cause trouble here.

- I use two loops because I had some problems streaming data over the network with just one timed loop, and I found a thread on the LAVA forum about someone's problem with TCP/IP. The advice there was to insert a queue between receiving data from the FPGA and streaming it to the host. That problem was related to overflowing, so I tried implementing the advice.

ValkoA,

Reading your post I decided to change my code a bit. It turned out that, although I want my timed loop to execute every 200 ms (= collect 20'000 samples during each iteration), it collects between 10'000 and 30'000 samples FXP <14,5> (because of the feedback from Elements Remaining in FIFO, I think). So I changed "Number of Elements" in the FIFO to 20'000. I gave the queue buffer a fixed size, too. The queue worked well then, and my program was able to send data from RT to the host at 20'000 samples per 200 ms, FXP <14,5>. However, that's too slow for me, and I need better quality than <14,5>.

Now I'm wondering whether the network stream can somehow slow the whole process down. When I tried a structure with just one Timed Loop with FIFO_Out and Write_to_stream inside it (so without any queues or producer/consumer), it couldn't send my arrays with FXP <20,5> fast enough: I received them on the host (network streaming) with latency, regardless of how fast my Timed Loop was. For example, my RT loop was on its 1000th iteration while my host was only taking the 700th array of data. Then the reader on the host suddenly stops returning anything, and after a while it starts returning data very fast. There's a moment when the reader endpoint on the host gives me arrays of zeros, and finally the program sometimes stops. Maybe the network stream overflows.

Sorry for such a long description... 🙂

And now here is my question: can I do something to speed up the network stream a little? A single Timed Loop with FIFO_Out and Write_to_stream inside doesn't work at all for FXP <20,5>, regardless of how fast I execute the Timed Loop (the program crashes). And the producer/consumer idea doesn't work here either (the queue overflows). The FPGA sends data at a frequency of 100 kHz. I fixed the size of the Network Stream.

Thank you

Texas_Diaz · ‎10-18-2013

I think I might see your problem. Or should I say, problems.

Don't write to front-panel objects from within a timed loop. On an RT controller this has an even more detrimental effect than on the desktop; writing to a front-panel object sends a TCP message to the host (if there's a host connected) which can/will seriously affect the loop rate of the timed loop. Just as a "haha", when doing this on an RT target, grab the front-panel window and drag it around while running the application and see what that does to your timed loop (absolutely brings it to its knees, if it had any). If you want to update a front panel object, send the data via a FIFO to a while loop and update the front panel object there.
Speaking of the FIFO, try to never use a normal FIFO inside a timed loop. Always use an RT FIFO. The RT FIFO preallocates memory and has several other efficiencies that boost performance and guarantee that it doesn't affect determinism in the timed loop. The regular FIFO does none of this. The regular FIFO allocates memory for every new queue item it creates, which means it may have to fight with a lower-priority thread for the memory manager (and if running on Windows that's the Windows memory manager, which is constantly being hammered by everybody including LabVIEW), and that seriously affects performance.

I really like your well-documented and well-organized code, that made it easy to find these two issues you need to address.

-Danny

Riv · ‎10-19-2013

Hi Danny,

thank you for your reply. I don't have a problem with the Timed Loop but with the Consumer Loop. The Consumer Loop is predictable now (thanks ValkoA!) but it doesn't keep up with the Timed Loop for FXP with a 20-bit word length. I changed the Queue to an RT FIFO and removed all unneeded indicators from the Consumer Loop, but it was still slow. So I decided to measure the execution time of the RT Queue and of "Write Single Element to Network Stream" in the Consumer Loop, and it turned out that the RT Queue executed in ~2 ms while the Network Stream took ~400 ms each time to write these 20k elements (FXP, 20-bit word length)... I don't think that's how it should be, but this is my first experience with network streaming, so I may be wrong.
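The numbers in this post can be checked with a little arithmetic. The Python sketch below is illustrative only (the function names are made up for this example). It compares the rate the stream would need to sustain (100 kHz from the FPGA, as stated earlier in the thread) with the rate implied by writing 20,000 elements in about 400 ms.

```python
def elements_per_second(elements_per_block, write_time_ms):
    """Throughput implied by writing one block in write_time_ms milliseconds."""
    return elements_per_block * 1000 / write_time_ms


def block_period_ms(elements_per_block, sample_rate_hz):
    """How often a full block is produced at the given sample rate."""
    return elements_per_block * 1000 / sample_rate_hz


def keeps_up(sample_rate_hz, elements_per_block, write_time_ms):
    """True if the writer is at least as fast as the producer."""
    return elements_per_second(elements_per_block, write_time_ms) >= sample_rate_hz


required_rate = 100_000                                 # FPGA sample rate in Hz
measured_rate = elements_per_second(20_000, 400)        # from the ~400 ms figure
period = block_period_ms(20_000, required_rate)         # time budget per block
print(measured_rate, period, keeps_up(required_rate, 20_000, 400))
```

With these figures the write achieves about 50,000 elements per second against a required 100,000, and each 400 ms write has only a 200 ms budget, which is consistent with the consumer falling behind.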

I attached new image for Producer/Consumer on RT.

I also tried the structure with just one Timed Loop and Write Single Element to Network Stream inside it again. It looks like the Reader Endpoint on the host doesn't take data fast enough: Available Elements for Reading (Property Node) on the host is always zero, and Total Elements Read on the host doesn't keep up with Total Elements Written on RT.

I attached an image of the single Timed Loop version on RT, too.

I wonder whether some process can slow Network Streaming down, or whether this is normal and I just can't send my data with a 20-bit word length (the FPGA collects data at 100 kHz). The queue, however, seems to work great with those arrays and that speed.

ValkoA · ‎10-21-2013

Dear Owca,

I've made a simple benchmark using your data size and fixed-point configuration, as well as a 9076 cRIO. My results were about 60% better, so on average about 140-160 ms was spent per iteration. I'm attaching the project so that you'll be able to test it on your end.

If this is still above 200 ms, I believe we have to look for a workaround. A few suggestions:

- Simplifying the network topology so that we have lower response times and less traffic from other devices.
- Deleting front panel elements can also help, as they need periodic updates (TCP messages) from/to the development PC.
- If response time is the problem, we might look into a solution using UDP. UDP is very fast because it does not need an acknowledgement for every message. The drawback is that we have to implement this data check on the host side.
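One common way to do the host-side data check mentioned for UDP is to prefix each datagram with a sequence number and look for gaps on arrival. The Python sketch below illustrates that idea only; it is an assumption about what such a check could look like, not the approach used in the attached project. The 4-byte header layout and the function names are hypothetical, and no sockets are opened.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical 4-byte big-endian sequence number


def make_datagram(seq, payload):
    """Sender side: prefix the payload bytes with its sequence number."""
    return HEADER.pack(seq) + payload


def find_missing(datagrams):
    """Host side: return sequence numbers absent from the received datagrams."""
    seen = {HEADER.unpack_from(d)[0] for d in datagrams}
    if not seen:
        return []
    # Anything between the lowest and highest number seen should be present.
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


# Simulate five datagrams where number 2 is lost and 3 and 4 arrive swapped.
sent = [make_datagram(i, b"samples") for i in range(5)]
received = [sent[0], sent[1], sent[4], sent[3]]
print(find_missing(received))
```

A gap check like this detects loss and tolerates reordering, but deciding whether to request a resend or simply flag the missing block is a separate design choice.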

Let me know your results.

Regards:

Andrew Valko

NIH


Riv · ‎10-23-2013

Hi ValkoA,

thank you very much for your help, I think I finally diagnosed my problems!

1. There has to be a feedback node connected to the Signal_OUT FIFO in the Timed Loop. If not, the queue is slowed down significantly and it starts to overflow.

2. I let my FIFO buffer on the FPGA overflow before I started to read it on RT. However, I'm sure my RT buffer didn't overflow, because I started it in the state (of the state machine) just before the one with the Producer/Consumer structure. I resolved the problem by adding a case structure on the FPGA and writing TRUE (= write to Signal_OUT) to a Property Node on RT just before the state with the Producer/Consumer structure.

To be honest, I don't know why these two affect my program in this way, but the queue works well now.

And this one affected my Network Stream speed:
3. I had another loop with shared variables. I limited the speed of this loop.

Thank you very much for your help, I truly appreciate it 🙂

Owca
