What are the best practices for implementing efficient arithmetic operations in HLS?

Hey hey
What are your go-to strategies for implementing fast and resource-efficient arithmetic operations in HLS? Think fixed vs floating point, operator selection, custom IP, or optimizing the datapath.

Looking to level up beyond just “it works.” Appreciate any hard-won tips or patterns you rely on!

HLS tools support built-in arithmetic operations and functions. These are optimized IPs that you can use in the source code either as function calls or simply as operators (*, +, -). Some of these IPs are available in the hlslibs/ac_math repository on GitHub (the Algorithmic C Math Library).

Apart from this, as a designer it is better to use logical operations and shifters whenever possible to improve QoR, for example replacing a multiplication or division by a power of two with a shift.

Another important point is to apply the right rounding and overflow mode at the right place. For example, using saturation and rounding on the accumulator in a fully unrolled loop leads to area explosion, because the rounding and saturation logic is replicated in every iteration of the loop. The smarter approach is to accumulate without rounding or saturation, in an accumulator wide enough to hold the full sum, and apply them once outside the loop to the final result.
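To make the accumulator point concrete, here is a minimal sketch in plain C++ that models the two styles with standard integers. In a real HLS flow you would more likely use a fixed-point type such as `ac_fixed` with its `AC_RND` and `AC_SAT` modes. The Q8 format, the loop length `N`, and the function names here are all hypothetical, chosen only for illustration.

```cpp
#include <cstdint>

constexpr int N = 8;          // hypothetical loop trip count
constexpr int FRAC_BITS = 8;  // hypothetical Q8 fixed-point format

// Saturate a wide value to the signed 16-bit output range.
inline int16_t saturate16(int64_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Drop frac_bits fractional bits, rounding half up.
// Assumes arithmetic right shift for negative values
// (guaranteed from C++20, true on mainstream compilers before that).
inline int64_t round_shift(int64_t v, int frac_bits) {
    return (v + (int64_t{1} << (frac_bits - 1))) >> frac_bits;
}

// Costly style: a rounder and a saturator sit inside the loop body,
// so a fully unrolled loop instantiates N copies of each.
int16_t mac_sat_each_iter(const int16_t a[N], const int16_t b[N]) {
    int16_t acc = 0;
    for (int i = 0; i < N; ++i) {
        int64_t prod = round_shift(int64_t{a[i]} * b[i], FRAC_BITS);
        acc = saturate16(int64_t{acc} + prod);
    }
    return acc;
}

// Preferred style: accumulate at full precision, then round and
// saturate once. In hardware the accumulator would be sized to about
// 32 + log2(N) bits rather than a full 64.
int16_t mac_sat_once(const int16_t a[N], const int16_t b[N]) {
    int64_t acc = 0;
    for (int i = 0; i < N; ++i) {
        acc += int64_t{a[i]} * b[i];
    }
    return saturate16(round_shift(acc, FRAC_BITS));
}
```

Note that the two versions are not always bit-identical: if an intermediate sum overflows and later terms bring it back into range, the per-iteration version has already clipped, while the wide accumulator keeps the true sum. Check that difference against your numerical requirements before switching.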

For the best QoR (power, performance, area), keep in mind that you are always designing hardware, even though you are using a higher-level language such as C++ or SystemC.