Transferring data from memory LUT to DMA FIFO on FPGA

itayshom · 12-29-2013

Hi everyone,

I'm using a 7833R FPGA. In my application, I need to input a time-domain digital signal into a buffer at a high 80 MHz rate. For every time bin the data is summed across multiple runs.

So I use a single-cycle timed loop for the acquisition. I want to use address-based access so I need to use a memory (not FIFO). Since I need to increase the current value at each bin I'm forced to use a LUT-type memory (read-increase-write).
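The read-increase-write accumulation described above can be modelled in software. This is only a minimal Python sketch, not the LabVIEW diagram from the attached VI: the function name `accumulate_runs` is made up for illustration, and wrapping the sum at 255 is an assumption based on the U8 data type mentioned in this thread.

```python
def accumulate_runs(runs, num_bins=8192):
    """Sum each time bin across several acquisition runs.

    Every sample costs one read, one add, and one write to the same
    address, which is why address-based memory is needed rather than a FIFO.
    """
    memory = [0] * num_bins
    for run in runs:
        for address, sample in enumerate(run):
            current = memory[address]                    # read
            memory[address] = (current + sample) & 0xFF  # increase, then write (assumed U8 wrap)
    return memory


if __name__ == "__main__":
    # Two short runs over three time bins.
    print(accumulate_runs([[1, 0, 1], [1, 1, 0]], num_bins=3))
```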

After the acquisition cycle is over, I naturally need to transfer the buffer contents to the host, so I use a DMA target-to-host FIFO. Now because the LUT cannot be used in multiple clock domains, I need to use another single-cycle timed loop for transferring from memory to FIFO.

The problem is that I get timing violations due to huge routing delays (12.5 ns missed by 24 ns). Everything compiles fine without the LUT-->FIFO part.
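For reference, 12.5 ns is the period of the 80 MHz clock. The arithmetic can be sketched as below, assuming "missed by 24 ns" means the worst path exceeded the period by 24 ns (so about 36.5 ns in total). That reading is an assumption, and the helper names are purely illustrative.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def slack_ns(freq_mhz, path_delay_ns):
    """Positive slack means the path meets timing; negative means a violation."""
    return clock_period_ns(freq_mhz) - path_delay_ns


if __name__ == "__main__":
    print(clock_period_ns(80))   # period of the 80 MHz clock
    print(slack_ns(80, 36.5))    # assumed worst-path delay, see the note above
```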

I tried converting the memory to Block RAM and using feedback nodes (i.e. pipelining) when inputting the data, but I got an error telling me that all outputs must be wired directly to feedback nodes, even though they all were.

I also use pipelining for the transfer to the FIFO, but it doesn't help a bit.

Memory and FIFO data-type is U8. Memory has 8192 elements, and the FIFO has 8191 elements.

I'm attaching my vi. It's pretty complicated but maybe someone can suggest an optimization. The memory is called "Transmitted". The FIFO is called "Transmitted FIFO". The transfer from memory to FIFO occurs at the 3rd frame of the sequence.

Any help would be greatly appreciated.

Thank you,

Itay.

david1147 · 12-30-2013

Here is what I've tried to make the compilation pass:

1) Change the memory type to Block RAM

2) In the 2nd frame, insert pipelining both after the Memory Read and after the "AND FFFFE000" (this is one of the critical paths);

3) Wire a different memory named control to the Memory Write node and point it to "Transmitted". This should eliminate the error you see. (I believe this is a LV FPGA bug.)

Let me know if this solves your problem.

itayshom · 12-30-2013

Hi David,

Thank you very much. I'll try that, but I've been rethinking my problem and may try a whole new approach with no memory at all, using only writes and no reads. I wanted to avoid pipelining if possible because it complicates things.

A few remarks:

1) How do you know whether something is in the critical path?

2) Concerning your point 3, it's right. I just forgot that the 'memory out' output of the read method is just another output that needs to be pipelined.

Thank you again,

Itay.

david1147 · 12-30-2013

>> 1) How do you know whether something is in the critical path?

There is a "Highlight Critical Path" or similar button on the compilation dialog page when your compilation fails to meet your requested clock rate. You can click on that to find your critical path. See screenshot attached.

>> 2) Concerning your point 3, it's right. I just forgot that the 'memory out' output of the read method is just another output that needs to be pipelined.

I don't think you can do that. It seems that LV FPGA does not allow pipelining of named control wires.

BrowningG · 12-30-2013

FYI: the Read (Memory Method) help topic goes over the times when this method requires a pipelining register on the data output. LabVIEW forces you to use the pipeline register when accessing block RAM memory since the memory read implementation always requires a full clock cycle.

More details on the timing analysis window tool are here: Fixing Timing Violations.

The Memory In/Out and FIFO In/Out reference wires do not have to adhere to the same exact dataflow constructs as the data/address/handshaking signals. They are used only to indicate which FPGA resource the method will use.

The iteration terminal of the single-cycle timed loop will behave like the iteration terminal on a while loop and count up until it hits 2,147,483,647 and stay there until the loop is restarted. Does that matter in this application?
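To put a number on that: a minimal Python sketch of a counter that saturates at 2,147,483,647, plus the time it takes to get there. The 80 MHz rate comes from the original post; the function names are illustrative and this models only the counting behaviour described above, not LabVIEW itself.

```python
I32_MAX = 2_147_483_647  # largest value the iteration count reaches


def next_iteration(count):
    """Count up by one per loop iteration, then hold at I32_MAX."""
    return count if count >= I32_MAX else count + 1


def seconds_to_saturate(clock_hz):
    """Time for a counter starting at 0 to reach I32_MAX at one count per cycle."""
    return I32_MAX / clock_hz


if __name__ == "__main__":
    print(next_iteration(I32_MAX - 1))   # reaches the limit
    print(next_iteration(I32_MAX))       # stays at the limit
    print(round(seconds_to_saturate(80e6), 2), "seconds at 80 MHz")
```

At 80 MHz the counter stops advancing after roughly 27 seconds, so any logic that derives an address or a state from it needs the loop to restart before then.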

A LUT memory of 8192 elements is going to be pretty big. Once the FPGA starts getting pretty full, routing large LUT memories can be tough. Block memory is optimized for larger storage items (like your 8192-element memory) and can ease your timing concerns. It does add a pipelining stage, meaning you would need to queue up a read one cycle before you need the data. Maybe use one case structure to read the memory item when the bitshift operation equals 1 and another to write when the operation equals 0 (one cycle later, when the read data is available).
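The alternating read/write idea can be sketched as a cycle-by-cycle software model. This is a toy Python model under stated assumptions, not the LabVIEW implementation: `BlockRam` and `accumulate` are hypothetical names, the one-cycle read latency follows the description above, and the U8 wrap is assumed from the data type given earlier in the thread.

```python
class BlockRam:
    """Toy memory whose read data appears one clock cycle after the request."""

    def __init__(self, size):
        self.cells = [0] * size
        self.read_data = None  # registered output, valid the cycle after a read

    def clock(self, read_addr=None, write=None):
        """Advance one cycle with an optional read request and an optional write."""
        next_read = self.cells[read_addr] if read_addr is not None else None
        if write is not None:
            addr, value = write
            self.cells[addr] = value & 0xFF  # assumed U8 storage
        self.read_data = next_read


def accumulate(ram, samples):
    """Add one run of samples into the memory, two cycles per bin."""
    cycles = 0
    for address, sample in enumerate(samples):
        ram.clock(read_addr=address)                         # cycle A: queue the read
        ram.clock(write=(address, ram.read_data + sample))   # cycle B: data valid, write sum
        cycles += 2
    return cycles


if __name__ == "__main__":
    ram = BlockRam(4)
    accumulate(ram, [1, 0, 1, 1])
    accumulate(ram, [1, 1, 0, 1])
    print(ram.cells)
```

In this simple form each bin costs two cycles. Overlapping the read of the next address with the write of the current one would bring that back to one bin per cycle, at the cost of more bookkeeping.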

Is the access pattern always the same order or do you need to hop around the memory? Using a FIFO as part of your buffer could help your storage mechanism if the additional pipelining scheme adds other complexities.

Most of your logic is pretty simple in terms of FPGA timing. Boolean operations and selectors are fairly cheap in routing and logic. Comparison and arithmetic operations are more expensive (multiplies being the most expensive of those). I would suspect that the LUT memories are the main culprit for your middle-loop timing failures. The image david1147 showed had something called a tunnel controller and a non-diagram component taking 6 ns of the 12 ns path.

Also, using a flat sequence structure adds some combinatorial delay to a VI, because LabVIEW needs extra logic to enforce dataflow through the frames on the FPGA. One way to remove this additional delay is to implement the entire application in one single-cycle timed loop using a state machine. Doing so would also let you remove the external while loop. Each frame is essentially a state in your state machine.
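As a rough illustration of the "one loop, one state per frame" idea, here is a minimal Python sketch in which each pass of the `while` loop stands for one iteration of the single-cycle timed loop. The state names and the contents of each state are assumptions made for illustration (the thread only says that acquisition and the memory-to-FIFO transfer happen in separate frames), and the input is assumed to be non-empty.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()      # hypothetical setup frame
    ACQUIRE = auto()   # sum samples into memory
    TRANSFER = auto()  # copy memory to the FIFO
    DONE = auto()


def run_state_machine(samples):
    """Run all states in one loop; returns (fifo contents, loop iterations used)."""
    memory = [0] * len(samples)
    fifo = []
    state = State.INIT
    index = 0
    cycles = 0
    while state is not State.DONE:
        cycles += 1  # one pass = one loop iteration
        if state is State.INIT:
            index = 0
            state = State.ACQUIRE
        elif state is State.ACQUIRE:
            memory[index] += samples[index]
            index += 1
            if index == len(samples):
                index = 0
                state = State.TRANSFER
        elif state is State.TRANSFER:
            fifo.append(memory[index])
            index += 1
            if index == len(samples):
                state = State.DONE
    return fifo, cycles


if __name__ == "__main__":
    print(run_state_machine([3, 1, 2]))
```

Only one state's logic is active per iteration, and the transition conditions replace the hand-off logic that the sequence structure would otherwise generate.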

Regards,
Browning G
FlexRIO R&D
