cRIO-9057 FPGA DSP48 usage too high - optimization?

Alright, I'll have to admit that I was a bit uninformed. The Multiply and High Throughput Multiply functions do seem to use DSP modules for their calculations. The problem is that your fixed-point data type is too large.

Per this post:

https://forums.ni.com/t5/LabVIEW/Reducing-DSP48s-usage-in-Labview-FPGA/m-p/4150266/highlight/true#M1...

If you use a fixed-point number with a very high resolution (many bits), each multiplication requires more DSP modules.

0 Kudos
Message 11 of 19
(2,880 Views)

If you look up UG479 (v1.10), March 27, 2018 (the Xilinx/AMD 7 Series DSP48E1 Slice User Guide), you can see the specs of the DSP slices used in the 9057's FPGA.

 

I recommend a range analysis of your code. That is, go through each stage of your code and review the actual range (and significant digits) it needs. This tells you how many bits are actually required. With fixed-point math it is easy to get "bit creep", where each stage adds bits to its output even though the values passing through do not need all of them.
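As a rough illustration of such a range analysis, here is a minimal Python sketch (not LabVIEW code). The helper names `fxp_format` and `full_precision_product` and the example range of ±10 with 0.001 resolution are hypothetical. It assumes the usual fixed-point convention where the integer word length of a signed type includes the sign bit, and that a full-precision product has a word length equal to the sum of the two input word lengths.

```python
import math

def fxp_format(lo, hi, resolution):
    """Smallest fixed-point format covering [lo, hi] with steps <= resolution.

    Returns (signed, word_length, integer_word_length). For signed formats
    the integer word length includes the sign bit (assumed convention).
    """
    frac_bits = max(0, math.ceil(-math.log2(resolution)))
    lsb = 2.0 ** -frac_bits
    signed = lo < 0
    int_bits = 1 if signed else 0
    while True:
        if signed:
            bottom = -(2.0 ** (int_bits - 1))
            top = 2.0 ** (int_bits - 1) - lsb
        else:
            bottom = 0.0
            top = 2.0 ** int_bits - lsb
        if bottom <= lo and hi <= top:
            break
        int_bits += 1
    return signed, int_bits + frac_bits, int_bits

def full_precision_product(a, b):
    """Bit growth of a full-precision multiply: (word, integer) lengths add."""
    return (a[0] + b[0], a[1] + b[1])

# Hypothetical signal: range -10..10 with 0.001 resolution
signed, word, integer = fxp_format(-10, 10, 0.001)
print(signed, word, integer)

# Bit creep: chaining two multiplies without trimming the intermediate result
stage1 = full_precision_product((word, integer), (word, integer))
stage2 = full_precision_product(stage1, (word, integer))
print(stage1, stage2)
```

In this example a 15-bit signal grows to 30 and then 45 bits after two untrimmed multiplies, even though the physical quantity may need far fewer. Comparing the format each stage actually needs against the format it has grown to shows where bits can be dropped.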


Certified LabVIEW Architect, Certified Professional Instructor
ALE Consultants

Introduction to LabVIEW FPGA for RF, Radar, and Electronic Warfare Applications
0 Kudos
Message 12 of 19
(2,853 Views)

Josh,

I do not use any of the built-in PID controllers, but I do use lead-lag form PID controllers in my code (shown in fpga3 image). Unfortunately, these are necessary for the function of this program. This is where I use two z^-1 functions... does this affect DSP48 usage?

Thank you for your help!

0 Kudos
Message 13 of 19
(2,827 Views)

Terry, thanks again for your response. I am using a NI9264 AO module and a NI9401 DIO module with a sample rate of 9 MHz (using 8 input channels). I am hoping to run the controller loop of my FPGA program (the slowest loop) at 20 kHz max. Will this allow me to do anything with clock timing to decrease DSP usage?

I went through my program and have tried to give a thorough overview of the signal process: In this FPGA VI I have three while loops:

- The first while loop is purely logic gates and integer math to interpolate the six quadrature encoders I am using for position feedback. 

- The second while loop (shown in fpga1) takes the position outputs of the first while loop and computes the x, y, z, theta x, theta y, theta z positions of the controlled object. I downgraded all of the math from high throughput math (HT) to regular math operators. There are 11 multiply functions in this loop.
- The third loop is the controller and output loop. For one degree of freedom (DOF), I am using a VI to generate a trapezoidal velocity trajectory profile using nested case structures inside of a while loop. The output represents a position setpoint. I have again downgraded all of this math from HT to regular math operators. There are 13 multiply functions and 9 exponent functions. I attached a snip of this code in (fpga4). The other DOFs receive a static position setpoint.
These position setpoints are fed into a PID controller (fpga3). The P controller is just a single HT multiplier. The I controller utilizes a single cycle timed loop (default timing) and HT math. The D controller utilizes a Z delay function, a single cycle timed loop (default timing) and HT math. There are a total of 5 multiply functions per controller and 6 controllers.
The control efforts are then sent to a VI to map the six control efforts to the 8 actuators I am using. This VI uses all HT math, a total of 25 multiply functions, 4 divide functions, and I have included a snippet in (fpga5). The resulting actuator commands are then each multiplied by a slide integer (fpga2). 

In total, there are 75 multiply functions, 9 exponent functions, and 4 divide functions, among many adds/subtracts. Please let me know if you have any advice to reduce DSP48 usage! Thanks.

0 Kudos
Message 14 of 19
(2,824 Views)

That is 75 multipliers, but depending on the bit widths of each multiplication you may need more than one DSP per multiply.

I am not familiar with your target, but the maximum for a single hardware multiplier is usually around 25x18 bits or something like that. I see you're using quite a few 32-bit numbers, and multiplying two 32-bit numbers will definitely require more than a single DSP.

You will need to either decrease your bit widths or significantly restructure your code to allow for multiplexing multiple calculations over a single set of DSPs.
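To get a feel for how bit width drives DSP usage, here is a rough Python estimate. It is only a sketch: the function name `dsp48e1_estimate` is hypothetical, and the tiling rule is an assumption based on a 25x18 signed multiplier per slice, where each slice contributes one bit less than its nominal width once a multiply is split. The actual compiler may use more or fewer slices, and may build small or constant multiplies out of LUTs instead.

```python
import math

def dsp48e1_estimate(a_bits, b_bits, a_max=25, b_max=18):
    """Rough DSP slice count for one signed a_bits x b_bits multiply.

    Assumes each slice offers an a_max x b_max signed multiplier and that
    split operands lose one usable bit per slice (assumed tiling rule).
    """
    def tiles(x, y):
        cols = max(1, math.ceil((x - 1) / (a_max - 1)))
        rows = max(1, math.ceil((y - 1) / (b_max - 1)))
        return cols * rows
    # Try both operand orientations and keep the cheaper one
    return min(tiles(a_bits, b_bits), tiles(b_bits, a_bits))

for widths in [(16, 16), (25, 18), (32, 32)]:
    print(widths, dsp48e1_estimate(*widths))

# If all 75 multiplies in the design were 32x32:
print(75 * dsp48e1_estimate(32, 32))
```

Under these assumptions a 32x32 multiply costs about four slices, so 75 of them would ask for roughly 300 slices, while the same 75 multiplies at 25x18 or narrower would need about 75.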

0 Kudos
Message 15 of 19
(2,808 Views)

Intaris, this is very helpful. It sounds like the bit widths are the main culprit in taking up all of these DSPs. 

I am curious to know what you mean about "multiplexing multiple calculations over a single set of DSPs." I have been trying to understand the DSP48E1 function built into LabVIEW and whether this might reduce some DSP usage... can you speak to this at all? Thanks again.

0 Kudos
Message 16 of 19
(2,800 Views)

Terry, I am absolutely dealing with bit creep. The math is yielding FXP values with many more bits than I need, but I am unsure how to reduce this effect. Do you have tips or resources for dealing with this? Is there a way to restrict the output of math functions to a certain number of bits? I assume the FXP conversion function won't impact this. Please let me know!

0 Kudos
Message 17 of 19
(2,792 Views)

If you structure your code very differently, you can iterate over a single DSP instance and feed it with different data each cycle, collecting the results and sending them on their way. This way the DSP can be executed at a faster speed, with it alternating between different actual calculations. It's important to take note of any latency issues so that your data doesn't become desynced.
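As a conceptual sketch of this idea (plain Python, not LabVIEW, with a hypothetical function name `shared_multiplier` and an assumed pipeline latency of 3 clock cycles), one multiplier is fed a new operand pair every cycle and each result is tagged with the index of the calculation it belongs to, so the outputs stay in sync despite the latency:

```python
from collections import deque

def shared_multiplier(pairs, latency=3):
    """Simulate one pipelined multiplier shared by many calculations.

    One operand pair enters per clock cycle. Each product leaves the
    pipeline `latency` cycles later, tagged with its original index.
    Returns (results in original order, total clock cycles used).
    """
    pipeline = deque([None] * latency)
    results = [None] * len(pairs)
    total_cycles = len(pairs) + latency
    for cycle in range(total_cycles):
        if cycle < len(pairs):
            a, b = pairs[cycle]
            pipeline.append((cycle, a * b))  # tag keeps data aligned
        else:
            pipeline.append(None)  # bubble while the pipeline drains
        done = pipeline.popleft()
        if done is not None:
            tag, product = done
            results[tag] = product
    return results, total_cycles

results, cycles = shared_multiplier([(2, 3), (4, 5), (6, 7)], latency=3)
print(results, cycles)

# Time budget, assuming a 40 MHz FPGA clock and the 20 kHz control loop
ticks_per_period = 40_000_000 // 20_000
print(ticks_per_period)
```

With those assumed numbers there are 2000 clock ticks per control period, so pushing all 75 multiplies through one shared multiplier would take about 78 cycles under this model, trading throughput that the loop does not need for a much smaller DSP count.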

Message 18 of 19
(2,782 Views)

@Intaris wrote:

If you structure your code very differently, you can iterate over a single DSP instance and feed it with different data each cycle, collecting the results and sending them on their way. This way the DSP can be executed at a faster speed, with it alternating between different actual calculations. It's important to take note of any latency issues so that your data doesn't become desynced.


This is what I am also suggesting.


0 Kudos
Message 19 of 19
(2,738 Views)