How can I improve performance via a high-level approach when implementing long equations in C++

孤街浪徒 2021-01-30 19:34

I am developing some engineering simulations. This involves implementing some long equations, such as this equation to calculate stress in a rubber-like material:



        
10 Answers
  • 2021-01-30 20:00

    It looks like you have a lot of repeated operations going on.

    pow(l1 * l2 * l3, -0.1e1 / 0.3e1)
    pow(l1 * l2 * l3, -0.4e1 / 0.3e1)
    

    You could pre-calculate those so that you are not repeatedly calling the pow function, which can be expensive.

    You could also pre-calculate

    l1 * l2 * l3
    

    as you use that term repeatedly.
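
    A minimal sketch of that idea, assuming l1, l2 and l3 are positive doubles (so their product is positive). The struct and function names are hypothetical and only for illustration; they are not from the question. Note that the -4/3 power can be derived from the -1/3 power with one division, so a single root evaluation is enough:

    ```cpp
    #include <cmath>

    // Hypothetical container for the terms that recur in the long expression.
    struct VolumeTerms {
        double j;      // l1 * l2 * l3
        double j_m13;  // j^(-1/3)
        double j_m43;  // j^(-4/3)
    };

    // Compute the shared terms once. Assumes l1 * l2 * l3 > 0.
    inline VolumeTerms volume_terms(double l1, double l2, double l3) {
        const double j = l1 * l2 * l3;
        const double j_m13 = 1.0 / std::cbrt(j);  // same value as pow(j, -1.0 / 3.0) for j > 0
        const double j_m43 = j_m13 / j;           // j^(-4/3) = j^(-1/3) / j
        return {j, j_m13, j_m43};
    }
    ```

    The long expression can then use t.j, t.j_m13 and t.j_m43 (with t = volume_terms(l1, l2, l3)) wherever the pow calls appeared. An optimizing compiler may already merge identical pow calls, so it is worth measuring before and after.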

  • 2021-01-30 20:00

    If you have an NVIDIA CUDA-capable graphics card, you could consider offloading the calculations to it, since a GPU is well suited to running the same heavy computation over many data points in parallel.

    https://developer.nvidia.com/how-to-cuda-c-cpp

    If not, you may want to consider multiple threads for calculations.
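
    As a rough sketch of the multi-threaded route, assuming the formula is evaluated independently at many points (which the question does not confirm): the index range is split into contiguous chunks, one per std::thread. evaluate_point is a hypothetical stand-in for the real expression, and evaluate_all is an illustrative name.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <thread>
    #include <vector>

    // Hypothetical stand-in for the expensive per-point formula.
    inline double evaluate_point(double x) { return x * x + 1.0; }

    // Fill out[i] = evaluate_point(in[i]), one contiguous chunk per thread.
    inline void evaluate_all(const std::vector<double>& in, std::vector<double>& out,
                             unsigned num_threads = std::thread::hardware_concurrency()) {
        out.resize(in.size());
        if (num_threads == 0) num_threads = 1;  // hardware_concurrency() may return 0
        const std::size_t chunk = (in.size() + num_threads - 1) / num_threads;
        std::vector<std::thread> workers;
        for (unsigned t = 0; t < num_threads; ++t) {
            const std::size_t begin = std::min(in.size(), t * chunk);
            const std::size_t end = std::min(in.size(), begin + chunk);
            if (begin == end) break;  // no work left for further threads
            // Each thread writes a disjoint slice of out, so no locking is needed.
            workers.emplace_back([&in, &out, begin, end] {
                for (std::size_t i = begin; i < end; ++i) out[i] = evaluate_point(in[i]);
            });
        }
        for (auto& w : workers) w.join();
    }
    ```

    This only pays off when there are many points and each evaluation is costly enough to outweigh the thread start-up cost. On some toolchains you need to link with the threading library (for example -pthread).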

  • 2021-01-30 20:04

    By any chance, could you supply the calculation symbolically? If there are vector operations, you might really want to investigate using BLAS or LAPACK, which in some cases can run operations in parallel.

    It is conceivable (at the risk of being off-topic?) that you might be able to use Python with NumPy and/or SciPy. To the extent that this is possible, your calculations might be more readable.

  • 2021-01-30 20:06

    This may be a little terse, but I've actually found good speedup for polynomials (interpolation of energy functions) by using Horner Form, which basically rewrites ax^3 + bx^2 + cx + d as d + x(c + x(b + x(a))). This will avoid a lot of repeated calls to pow() and stops you from doing silly things like separately calling pow(x,6) and pow(x,7) instead of just doing x*pow(x,6).

    This is not directly applicable to your current problem, but if you have high order polynomials with integer powers it can help. You might have to watch out for numerical stability and overflow issues since the order of operations is important for that (although in general I actually think Horner Form helps for this, since x^20 and x are usually many orders of magnitude apart).
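
    To make the Horner rewrite concrete, here is a small sketch. The function name horner and the choice to store coefficients lowest order first are my own assumptions for illustration:

    ```cpp
    #include <array>
    #include <cstddef>

    // Evaluate c[0] + c[1]*x + ... + c[N-1]*x^(N-1) with Horner's scheme:
    // N-1 multiplications and N-1 additions, and no calls to pow().
    template <std::size_t N>
    double horner(const std::array<double, N>& c, double x) {
        static_assert(N > 0, "need at least one coefficient");
        double result = c[N - 1];  // start from the highest-order coefficient
        for (std::size_t i = N - 1; i-- > 0;) {
            result = result * x + c[i];
        }
        return result;
    }
    ```

    For ax^3 + bx^2 + cx + d you would pass the coefficients as {d, c, b, a}, which evaluates d + x(c + x(b + x(a))).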

    Also, as a practical tip, if you haven't done so already, try to simplify the expression in Maple first. You can probably get it to do most of the common subexpression elimination for you. I don't know how much it affects the code generator in that program in particular, but I know that in Mathematica, doing a FullSimplify before generating the code can make a huge difference.
