How can I improve performance via a high-level approach when implementing long equations in C++

后端未结

关注

 10  1826

I am developing some engineering simulations. This involves implementing some long equations such as this equation to calculate stress in a rubber like material:


                      
              相关标签:


      
      
        
          10条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  闹比i        
                
              
                            
                2021-01-30 20:00
              
            
            
                                                                       
It looks like you have a lot of repeated operations going on.

pow(l1 * l2 * l3, -0.1e1 / 0.3e1)
pow(l1 * l2 * l3, -0.4e1 / 0.3e1)


You could pre-calculate those so you are not repeatedly calling the pow function which can be expensive.

You could also pre-calutate 

l1 * l2 * l3


as you use that term repeatedly.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  半阙折子戏        
                
              
                            
                2021-01-30 20:00
              
            
            
                                                                       
If you have a Nvidia CUDA graphics card, you could consider offloading the calculations to the graphics card - which itself is more suitable for computationally complicated calculations.

https://developer.nvidia.com/how-to-cuda-c-cpp

If not, you may want to consider multiple threads for calculations.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  生来不讨喜        
                
              
                            
                2021-01-30 20:04
              
            
            
                                                                       
By any chance, could you supply the calculation symbolically. If there are vector operations, you might really want to investigate using blas or lapack which in some cases can run operations in parallel.

It is conceivable (at the risk of being off-topic?) that you might be able to use python with numpy and/or scipy. To the extent that it was possible, your calculations might be more readable.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  清酒与你        
                
              
                            
                2021-01-30 20:06
              
            
            
                                                                       
This may be a little terse, but I've actually found good speedup for polynomials (interpolation of energy functions) by using Horner Form, which basically rewrites ax^3 + bx^2 + cx + d as d + x(c + x(b + x(a))). This will avoid a lot of repeated calls to pow() and stops you from doing silly things like separately calling pow(x,6) and pow(x,7) instead of just doing x*pow(x,6).

This is not directly applicable to your current problem, but if you have high order polynomials with integer powers it can help. You might have to watch out for numerical stability  and overflow issues since the order of operations is important for that (although in general I actually think Horner Form helps for this, since x^20 and x are usually many orders of magnitude apart).

Also as a practical tip, if you haven't done so already, try to simplify the expression in maple first. You can probably get it to do most of the common subexpression elimination for you. I don't know how much it affects the code generator in that program in particular, but I know in Mathematica doing a FullSimplify before generating the code can result in a huge difference.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     上一页
1
2
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复