Using LUT memory item to store accumulated value in SCTL

nkmath · ‎01-11-2021

Hello,

I have a large set of stored data (> 100 GB) from a continuous intensity time series. My goal is to obtain many FFTs of the data, each averaged over a typical duration of 100 - 1000 ms. Since the data files are very large, I would like to use LabVIEW FPGA to perform the FFT, as I expect a considerable reduction in processing time compared with the host processor.

To start, I was able to modify the example /Hardware Input and Output/FlexRIO/FPGA Fundamentals/FFT/FFT(1 channel, 1 sample).lvproj to get the FFT of the stored data. This appeared to work well: after each sequential input of 1024 samples, the FPGA sends back to the host the FFT of that sample set, corresponding to 1024 frequency bins. The output looks good; I was able to match these FFTs to FFTs of the same data computed on the host processor.

However, I am struggling to produce the accumulated FFT value within the FPGA. On the host processor this is not hard: I obtain many sequential FFTs of the data and take the average at each frequency bin. To implement this in the FPGA, I thought to use a memory item (either LUT or Block Memory) where the address corresponds to the "data index" (i.e. the frequency bin index). On each clock cycle, I read from the memory item (which stores the accumulated value), add the value obtained from the FFT Express VI, and write the sum back to the same address. I realized that with Block Memory it is necessary to include feedback nodes to compensate for the latency of the read method, but with the LUT implementation (currently used) this should not be an issue. I've also included a "Reset" read/write control, so that the accumulated value can be reset to zero from the host interface.
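For reference, the per-bin read-modify-write described above can be modeled on the host in a few lines of Python. This is only a sketch of the intended behavior, not the FPGA code: the function name `accumulate_frames` is made up for illustration, the memory is assumed to have zero read latency, and "Reset" is assumed to replace the stored value with zero before the addition.

```python
def accumulate_frames(frames, n_bins, reset_flags=None):
    """Model of the per-bin accumulator: read, add, write back to the same address.

    frames      -- list of FFT frames, each a list of n_bins values
    reset_flags -- optional list of booleans, one per frame (assumed Reset behavior:
                   the stored value is treated as zero while Reset is True)
    """
    memory = [0] * n_bins  # one accumulator per frequency bin
    for frame_idx, frame in enumerate(frames):
        reset = bool(reset_flags[frame_idx]) if reset_flags else False
        for bin_idx, value in enumerate(frame):
            stored = 0 if reset else memory[bin_idx]  # read (zero latency assumed)
            memory[bin_idx] = stored + value          # add and write back
    return memory


frames = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
summed = accumulate_frames(frames, n_bins=3)
last_only = accumulate_frames(frames, n_bins=3, reset_flags=[False, False, True])
```

With Reset held True the memory simply follows the most recent frame, which matches the single-FFT output; with Reset False each address should hold the running sum for its bin.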

Now, what I see is that when "Reset" == False, the 'accumulated' output is incorrect. However, if I set "Reset" == True, I get the expected output. So my assumption is that something is going wrong either in the addition of the current FFT value to the stored value, or in the indexing of the memory item. Perhaps there is some delay between the read and write of the memory item that is not accounted for, but my impression is that all the code between them should execute in the same clock cycle.

I've attached a snippet of the FPGA code containing most of the relevant functions (I hope). If more is needed I can zip the file and share it, but first wanted to get opinions on this approach, and if there are any obvious issues.

The real/imag values from the FFT Express VI are output as FXP <+/-, 32, 19>. The memory items and the FIFO elements are configured for FXP <+/-, 64, 38>. The high throughput addition is configured to take the <+/-, 32, 19> value as the 'x' input and the <+/-, 64, 38> value as the 'y' input, and its output is <+/-, 64, 38>.
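To make the formats above concrete: in LabVIEW's FXP notation the two numbers are the word length and the integer word length, so <+/-, 32, 19> has 13 fractional bits and <+/-, 64, 38> has 26. The Python sketch below works out resolution and range and shows how the two binary points are aligned before adding. The helper names (`fxp_info`, `to_raw`, `add_aligned`) are hypothetical and only model the arithmetic; they are not LabVIEW functions.

```python
from fractions import Fraction


def fxp_info(word_length, integer_length):
    """Return (fractional bits, resolution, min, max) of a signed FXP type."""
    frac_bits = word_length - integer_length
    delta = Fraction(2) ** -frac_bits                      # smallest step
    lowest = -(Fraction(2) ** (integer_length - 1))        # most negative value
    highest = Fraction(2) ** (integer_length - 1) - delta  # most positive value
    return frac_bits, delta, lowest, highest


def to_raw(value, frac_bits):
    """Encode a real number as the underlying integer (round to nearest)."""
    return round(Fraction(value) * 2 ** frac_bits)


def add_aligned(x_raw, x_frac, y_raw, y_frac):
    """Add x to y after shifting x up to y's binary point (needs y_frac >= x_frac)."""
    return (x_raw << (y_frac - x_frac)) + y_raw


x_frac = fxp_info(32, 19)[0]  # fractional bits of the FFT output type
y_frac = fxp_info(64, 38)[0]  # fractional bits of the accumulator type
total_raw = add_aligned(to_raw(1.5, x_frac), x_frac, to_raw(2.25, y_frac), y_frac)
total = Fraction(total_raw, 2 ** y_frac)  # decode the sum back to a real number
```

The accumulator has 38 - 19 = 19 more integer bits than the FFT output, so even in the worst case of every frame being full-scale with the same sign, it can take 2^19 = 524288 additions before overflowing.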

Let me know if there are any questions, and thanks in advance for any advice.

Best Regards

N.

Jorn_Deruyck · ‎01-14-2021

While it is true that reading from a block ram executes in a single tick, it doesn't necessarily mean that the value on the output corresponds to the register address supplied during that tick.

There is a "read latency" that you have to take care of.

You can check what the required latency is by looking at the property dialog of the block ram configuration:

You can compensate for this latency by adding extra feedback registers, either by chaining multiple feedback registers end to end or simply by configuring a single feedback register with a multi-cycle delay (using the property dialog of the feedback register):

As is shown in the image above, you can do the same for additional signals to 'align' them for further processing.
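The alignment idea can be illustrated with a small cycle-by-cycle model in Python. It is a sketch under stated assumptions, not a description of the actual block RAM: the memory is assumed to sample the stored value when the address is presented and to deliver it `read_latency` cycles later, and the names `DelayLine` and `accumulate` are made up for this example. The delay line plays the role of the feedback registers placed on the address and data signals.

```python
from collections import deque


class DelayLine:
    """Shift register: step() returns the value pushed `cycles` calls earlier."""

    def __init__(self, cycles):
        self.buf = deque([None] * cycles)

    def step(self, value):
        self.buf.append(value)
        return self.buf.popleft()


def accumulate(frames, n_bins, read_latency, compensate):
    """Accumulate frames per bin through a memory with the given read latency."""
    memory = [0] * n_bins
    read_pipe = DelayLine(read_latency)                       # the memory's own latency
    side_pipe = DelayLine(read_latency if compensate else 0)  # registers on address/data
    stream = [(i, v) for frame in frames for i, v in enumerate(frame)]
    stream += [None] * read_latency                           # idle cycles to flush
    for item in stream:
        stored = read_pipe.step(memory[item[0]] if item else None)
        delayed = side_pipe.step(item)
        if stored is None or delayed is None:
            continue                                          # pipeline not yet valid
        write_addr, write_value = delayed
        memory[write_addr] = stored + write_value
    return memory


frames = [[1, 2, 3, 4], [10, 20, 30, 40]]
aligned = accumulate(frames, n_bins=4, read_latency=2, compensate=True)
misaligned = accumulate(frames, n_bins=4, read_latency=2, compensate=False)
```

In this model the compensated version produces the per-bin sums, while the uncompensated one adds each new value to data read for a different address. The sketch also assumes the number of bins is larger than the read latency, so a bin is never read again before its previous write has landed.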

nkmath · ‎01-14-2021

Hi Jorn,

Thanks for your reply.

For my memory item, I am using the LUT implementation (see attached photo), which I thought does not have any read/write latency. As shown, the cycles of read latency are grayed out with a value of zero. Perhaps there are some issues with the LUT implementation that I am not considering?

I initially was using block memory, but have now switched over to the LUT implementation. When I used block memory, I indeed used extra feedback registers to attempt to synchronize the timing but with a similar result to what I describe in my original post.

Best

N.

Jorn_Deruyck · ‎01-14-2021

Hi nkmath,

You are correct, I overlooked that detail in your description and image.

To be honest, I can't see any other obvious cause of the behavior you're describing.

Perhaps some strange behavior due to attempting to read/write to the same address within the same clock cycle?

nkmath · ‎01-14-2021

"Perhaps some strange behavior due to attempting to read/write to the same address within the same clock cycle?"

I have tried inserting a feedback register at the input of the memory write, for both the address and the data, and it unfortunately made no difference.

My feeling is that it may have more to do with the FXP configuration. When I have "Reset" == True (i.e. the memory items for the real/imag values are set to zero), the corresponding FFT output looks like:

where the y axis is log_10(|z|), with z being the complex output of the FFT as a function of the frequency bin. When I set "Reset" == False (i.e., in principle accumulating the real/imag value at a given frequency bin), I get:

The overall baseline value has shifted (immediately), and the peak at approximately bin #325 has disappeared. The peak at that bin is expected; it is a known RF noise frequency in our system. I expect that peak to grow in significance when the FFT is accumulated over many cycles. The DC value (i.e. at bin # = 0) is still strong in the FFT when the accumulation is done.
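One thing that may be worth ruling out here (it is not established anywhere in this thread) is the difference between accumulating the complex real/imag values and accumulating magnitudes. If the phase of a tone at a given bin varies from frame to frame, the complex sum grows much more slowly than the sum of magnitudes, so a peak that is clear in a magnitude average can shrink in a real/imag accumulation. The Python sketch below illustrates this for a single bin. It assumes a unit-amplitude tone whose phase is uniformly random in each frame, which may or may not be true of the RF noise here, and `accumulate_bin` is a made-up name.

```python
import cmath
import math
import random


def accumulate_bin(n_frames, seed=0):
    """Accumulate one FFT bin over n_frames, both as complex values and as magnitudes.

    Assumption: the bin holds a unit-amplitude tone with a random phase per frame.
    Returns (|sum of complex values|, sum of magnitudes).
    """
    rng = random.Random(seed)
    complex_sum = 0j
    magnitude_sum = 0.0
    for _ in range(n_frames):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        z = cmath.rect(1.0, phase)  # complex bin value for this frame
        complex_sum += z            # accumulate real and imag parts separately
        magnitude_sum += abs(z)     # accumulate |z| instead
    return abs(complex_sum), magnitude_sum


coherent, incoherent = accumulate_bin(1000)
```

Under this assumption the magnitude sum grows in proportion to the number of frames, while the magnitude of the complex sum grows only roughly as its square root. A bin whose phase is the same in every frame, such as DC for a signal with a constant offset, adds up linearly either way.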

I will continue to try and debug this, but let me know of any other ideas.

Also, I've changed the FXP configuration. The output of the FFT Express VI is still <+/-, 32, 19>, but I now convert that to <+/-, 32, 32> before adding it to the value stored in the memory block, which is also <+/-, 32, 32>. In the host code this is then converted to DBL.
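A note on this conversion: in LabVIEW's FXP notation <+/-, 32, 32> has an integer word length equal to the word length, so it has no fractional bits and a resolution of 1, whereas <+/-, 32, 19> resolves steps of 2^-13. The Python sketch below shows what that does to small values. It assumes truncation toward negative infinity and saturation on overflow; the rounding and overflow modes actually configured on the conversion may differ, and `quantize` is a hypothetical helper.

```python
import math


def quantize(value, word_length, integer_length):
    """Quantize a real number to a signed FXP type.

    Assumptions: truncation (floor) for rounding, saturation for overflow.
    """
    frac_bits = word_length - integer_length
    scale = 2.0 ** frac_bits
    raw = math.floor(value * scale)        # drop bits below the resolution
    lowest = -(1 << (word_length - 1))
    highest = (1 << (word_length - 1)) - 1
    raw = max(lowest, min(highest, raw))   # saturate to the word length
    return raw / scale


small_32_19 = quantize(0.37, 32, 19)       # keeps 13 fractional bits
small_32_32 = quantize(0.37, 32, 32)       # no fractional bits
negative_32_32 = quantize(-0.37, 32, 32)   # floor pushes this down to -1
large_32_32 = quantize(1234.56, 32, 32)
```

With these assumptions, any FFT value with magnitude below 1 becomes 0 or -1 before it reaches the accumulator, so bins with small values lose their information and negative values pick up a bias that adds up over many frames.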

Best

N.
