10-15-2013 11:34 AM - edited 10-15-2013 11:38 AM
Hello everyone,
I'm developing a producer-consumer structure on RT and I have a strange problem. I receive data from the FPGA in the producer loop and send it through a queue to the consumer loop. From there, the data should go through a network stream to the Host. Because the consumer didn't keep up with the producer (the queue overflowed), I started slowing down the producer (the cost of this is sending more data). It didn't help, so I added a case structure in the producer to send just half of the data to the queue (the ones on TRUE in the shift register).
This is when I realized that closing the LabVIEW environment and opening it again sometimes speeds up the queue transmission. Then I can even speed up the producer loop and my queue doesn't overflow! I can also stop and rerun my program and it still works well.
I end up closing and reopening LabVIEW and running my program over and over, waiting for it to "click", because once it finally happens, I have no problems until the next LabVIEW session.
Does anyone have any idea what I could be doing wrong, or how I could fix it?
10-15-2013 11:45 AM
Hi Owca,
Are you running your RT application interactively, or deploying it as a standalone RT executable? Interactive execution will definitely slow down your RT code and network communication speeds.
Regards,
10-15-2013 11:56 AM
I'm using interactive execution. I don't have my RIO with me now, but I'll try to build a standalone application tomorrow and report how it works. Thanks!
10-15-2013 04:08 PM
Couple questions...
Why do you have a timeout on your dequeue operation and a timeout on your "write stream" vi? Looking at your code, I don't see a reason why you need a timeout. I would set the dequeue to wait forever for another element to appear.
Why do those 2 VIs run parallel to each other? Why not dequeue the element and immediately stream it in the same loop iteration?
10-16-2013 06:14 AM - edited 10-16-2013 06:15 AM
Dear Owca,
I see no problems within the code you've sent, and the speeds that you work with are fairly low, too. This leads me to the assumption that other parts of the code might be causing the issue you're experiencing. Let me make a few suggestions:
Disabling debugging features also helps quite a bit. Please get back to me if you have results.
Kind regards:
Andrew Valko
NI Hungary
10-18-2013 11:18 AM - edited 10-18-2013 11:19 AM
Hello everyone,
At the beginning I'd like to apologise for such a late response. Something important came up and took a while, but I'm back now.
Firstly, aputman:
- If I set the timeout to "-1", I wouldn't be able to stop the loop if an error occurs while the loop is running. Maybe I care too much, but it shouldn't cause trouble here.
- I use 2 loops because I had some problems streaming data over the network with just one timed loop, and I found a thread on the LAVA forum about someone's problem with TCP/IP. The advice there was to insert a queue between receiving data from the FPGA and streaming it to the Host. That problem was related to overflow, so I tried implementing the advice.
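To illustrate the timeout point outside LabVIEW, here is a minimal Python sketch of the same idea: a consumer that dequeues with a finite timeout so it can wake up and notice a stop request. It is only an analogy, not the LabVIEW code from this thread; the names `consumer`, `run_demo`, `timeout_s` and the buffer size of 8 are assumptions made up for the example.

```python
import queue
import threading

def consumer(q, stop, out, timeout_s=0.1):
    """Drain q into out, waking at least every timeout_s to check the stop flag."""
    while True:
        stopping = stop.is_set()  # sample the flag BEFORE the dequeue
        try:
            # Finite timeout: the analogue of not wiring -1 to the dequeue.
            block = q.get(timeout=timeout_s)
        except queue.Empty:
            if stopping:
                return  # stop was requested and nothing is left to send
            continue
        out.append(block)  # stand-in for the network stream write

def run_demo(n_blocks=5):
    q = queue.Queue(maxsize=8)  # fixed-size buffer between the two loops (size is arbitrary)
    stop = threading.Event()
    received = []
    t = threading.Thread(target=consumer, args=(q, stop, received))
    t.start()
    for i in range(n_blocks):  # stand-in for the producer (FPGA read) loop
        q.put([i] * 4)
    stop.set()  # e.g. an error occurred: ask the consumer to finish
    t.join(timeout=2.0)
    return received, t.is_alive()

received, still_running = run_demo()
print(len(received), still_running)
```

With an infinite wait, the consumer in this sketch would block forever once the producer stops enqueuing. The usual alternatives are to enqueue a sentinel element or to destroy the queue so the blocked dequeue returns.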
Reading your post, I decided to change my code a bit. It turned out that although I want my timed loop to execute every 200 ms (= collect 20,000 samples during each iteration), it collects between 10,000 and 30,000 FXP <14,5> samples (because of the feedback from Elements Remaining in the FIFO, I think). So I changed "Number of Elements" in the FIFO to 20,000. I gave the queue buffer a fixed size, too. The queue worked well then, and my program was able to send data from RT to the Host at 20,000 samples per 200 ms with FXP <14,5>. However, that's too slow for me, and I need better quality than <14,5>.
Now I'm wondering whether the network stream could somehow be slowing down the whole process. When I tried a structure with just one Timed Loop with FIFO_Out and Write_to_stream inside it (so without any queues or producer/consumer), it couldn't send my FXP <20,5> arrays fast enough: I received them on the Host (network streaming) with latency, regardless of how fast my Timed Loop was. For example, my RT loop was on its 1000th iteration while my host was only taking the 700th array of data. Then the reader on the host suddenly stops returning anything, and after a while it starts returning data very fast. At some point the reader endpoint on the host gives me arrays of zeros, and sometimes the program finally stops. Maybe the network stream overflows.
Sorry for such long description... 🙂
And now here is my question: can I do something to speed up the network stream a little? A single Timed Loop with FIFO_Out and Write_to_stream inside doesn't work at all for FXP <20,5>, regardless of how fast I execute the Timed Loop (the program crashes). The producer/consumer idea doesn't work here either (the queue overflows). The FPGA sends data at 100 kHz. I fixed the size of the Network Stream.
Thank you
10-18-2013 04:11 PM
I think I might see your problem. Or should I say, problems.
I really like your well-documented and well-organized code; that made it easy to find the two issues you need to address.
-Danny
10-19-2013 09:35 AM - edited 10-19-2013 09:37 AM
Hi Danny,
Thank you for your reply. I don't have a problem with the Timed Loop but with the Consumer Loop. The Consumer Loop is predictable now (thanks ValkoA!) but it doesn't keep up with the Timed Loop for FXP with a 20-bit word length. I changed the Queue to an RT FIFO and removed all unneeded indicators from the Consumer Loop, but it was still slow. So I decided to measure the execution time of the RT Queue and "Write Single Element to Network Stream" in the Consumer Loop, and it turned out that the RT Queue took ~2 ms while the Network Stream took ~400 ms each time to write these 20k elements (FXP, 20-bit word length)... I don't think that's how it should be, but this is my first experience with network streaming, so I could be wrong.
I attached a new image for Producer/Consumer on RT.
I also tried again the structure with just one Timed Loop and Write Single Element to Network Stream inside it. It looks like the Reader Endpoint on the Host doesn't take data fast enough: Available Elements for Reading (Property Node) on the Host is always zero, and Total Elements Read on the Host doesn't keep up with Total Elements Written on RT.
I attached an image for the idea with a single Timed Loop on RT, too.
I wonder whether some process could be slowing Network Streaming down, or whether this is normal and I just can't send my data with a 20-bit word length (the FPGA collects data at 100 kHz). The queue, however, seems to work great with those arrays and that speed.
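The timing figures reported in this post can be turned into a quick back-of-the-envelope check of why the queue overflows. The Python sketch below uses only numbers from the thread (100 kHz sampling, a 200 ms loop period, ~400 ms per network stream write). The queue capacity of 10 blocks is a made-up value for illustration, and the function names are hypothetical.

```python
def samples_per_iteration(sample_rate_hz, period_ms):
    """Samples the producer collects in one loop period."""
    return sample_rate_hz * period_ms // 1000

def consumer_rate_hz(block_size, write_time_ms):
    """Samples per second the consumer can send, given the time one block write takes."""
    return block_size * 1000 / write_time_ms

def backlog_growth_hz(sample_rate_hz, block_size, write_time_ms):
    """Samples per second piling up in the queue (0 if the consumer keeps up)."""
    return max(0.0, sample_rate_hz - consumer_rate_hz(block_size, write_time_ms))

def seconds_until_overflow(capacity_blocks, block_size, growth_hz):
    """Time until a queue of capacity_blocks fills, or infinity if it never does."""
    if growth_hz <= 0:
        return float("inf")
    return capacity_blocks * block_size / growth_hz

block = samples_per_iteration(100_000, 200)      # 20,000 samples per 200 ms
growth = backlog_growth_hz(100_000, block, 400)  # consumer sends only 50,000 S/s
# capacity of 10 blocks is an assumption, not a value from the thread
print(block, growth, seconds_until_overflow(10, block, growth))
```

Under these assumptions the consumer handles half the producer's rate, so any fixed-size queue eventually fills. The write has to finish in under 200 ms per 20,000-sample block for the queue to stay bounded.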
10-21-2013 08:08 AM - edited 10-21-2013 08:09 AM
Dear Owca,
I've made a simple benchmark using your data size and fixed-point configuration on a 9076 cRIO. My results were about 60% better: on average, each iteration took about 140-160 ms. I'm attaching the project so that you'll be able to test it on your end.
If this is still above 200ms, I believe we have to look for a workaround. A few suggestions:
Let me know your results.
Regards:
Andrew Valko
NIH
10-23-2013 03:20 AM - edited 10-23-2013 03:21 AM
Hi ValkoA,
thank you very much for your help, I think I finally diagnosed my problems!
1. There has to be a feedback node connected to the Signal_OUT FIFO in the Timed Loop. If not, the queue is slowed down significantly and it starts to overflow.
2. I let my FIFO buffer on the FPGA overflow before I started to read it on RT. However, I'm sure my RT buffer hadn't overflowed, because I started it in the state (state machine) just before the one with the Producer/Consumer structure. I resolved the problem by adding a case structure on the FPGA and writing TRUE (= write to Signal_OUT) to a Property Node on RT just before the state with Producer/Consumer.
I don't know why these two affect my program in this way, to be honest, but the queue works well now.
And this one affected my Network Stream speed:
3. I had another loop with shared variables. I limited the speed of this loop.
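As a rough illustration of point 2, here is a toy Python model of a bounded FIFO whose writer either starts immediately or is gated until the reader is ready. It is a deliberate simplification and does not model real DMA FIFO behaviour on the FPGA. The function name `simulate` and all the numbers (depth, tick counts) are assumptions chosen for the example.

```python
from collections import deque

def simulate(depth, total_ticks, reader_start_tick, gate_writes):
    """Count samples lost to overflow in a bounded FIFO.

    One sample is written per tick. The reader starts at reader_start_tick
    and then removes one sample per tick. If gate_writes is True, the writer
    stays idle until the reader has started (the "write enable" fix).
    """
    fifo = deque()
    overflowed = 0
    for tick in range(total_ticks):
        reader_ready = tick >= reader_start_tick
        # Writer side: skipped while gated and the reader is not ready yet.
        if reader_ready or not gate_writes:
            if len(fifo) < depth:
                fifo.append(tick)
            else:
                overflowed += 1  # FIFO full: this sample is lost
        # Reader side: only active once the reader loop is running.
        if reader_ready and fifo:
            fifo.popleft()
    return overflowed

lost_ungated = simulate(depth=100, total_ticks=1000, reader_start_tick=300, gate_writes=False)
lost_gated = simulate(depth=100, total_ticks=1000, reader_start_tick=300, gate_writes=True)
print(lost_ungated, lost_gated)
```

In this model the ungated writer fills the FIFO before the reader starts, and a reader that only matches the write rate never drains the backlog, so the buffer stays close to full. The gated writer never loses a sample.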
Thank you very much for your help, I truly appreciate it 🙂
Owca