Can I make my compiler use fast-math on a per-function basis?


Suppose I have

template  void foo(float* data, size_t length);

and I want to compile one instantiation with


                      
2 Answers

囚心锁ツ, 2021-01-12 10:21
                                                                       
As of CUDA 7.5 (the latest version I am familiar with, although CUDA 8.0 is currently shipping), nvcc does not support function attributes that allow programmers to apply specific compiler optimizations on a per-function basis.

Since optimization configurations set via command line switches apply to the entire compilation unit, one possible approach is to use as many different compilation units as there are different optimization configurations, as already noted in the question; source code may be shared and #include-ed from a common file.
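
The one-compilation-unit-per-configuration approach could look roughly like the sketch below. It assumes a single source file that is built twice, once per configuration. The file name, the FOO_FAST macro, the foo_fast and foo_precise wrappers, the bool template parameter, and the loop body are all hypothetical, since the question does not show them.

```cpp
// foo_unit.cpp (hypothetical name). Build it once per configuration, e.g.
//   g++ -O2             -DFOO_FAST=0 -c foo_unit.cpp -o foo_precise.o
//   g++ -O2 -ffast-math -DFOO_FAST=1 -c foo_unit.cpp -o foo_fast.o
// With nvcc, --use_fast_math would take the place of -ffast-math.
#include <cassert>
#include <cmath>
#include <cstddef>

#ifndef FOO_FAST
#define FOO_FAST 0  // default: the precise configuration
#endif

namespace {
// Shared body. In a larger project this would live in a common file that
// each compilation unit #include-s. The anonymous namespace gives it
// internal linkage, so the two object files cannot clash at link time.
template <bool Fast>
void foo_impl(float* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    for (std::size_t i = 0; i < length; ++i) {
        data[i] = 1.0f / std::sqrt(data[i]);  // placeholder math
    }
}
}  // namespace

// Each build exports exactly one wrapper, compiled with that build's flags.
#if FOO_FAST
void foo_fast(float* data, std::size_t length) { foo_impl<true>(data, length); }
#else
void foo_precise(float* data, std::size_t length) { foo_impl<false>(data, length); }
#endif
```

A caller would declare both wrappers and link both object files, then pick foo_fast or foo_precise at each call site.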

With nvcc, the command line switch --use_fast_math basically controls three areas of functionality:


Flush-to-zero mode is enabled, that is, denormal support is disabled (as with --ftz=true)
Single-precision reciprocal, division, and square root are switched to approximate versions (as with --prec-div=false and --prec-sqrt=false)
Certain standard math functions are replaced by equivalent, lower-precision intrinsics (for example, sinf() becomes __sinf())


You can apply some of these changes with per-operation granularity by using appropriate intrinsics, others by using PTX inline assembly.
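
As a rough illustration of per-operation granularity, the sketch below wraps two CUDA device intrinsics, __fdividef() and __sinf(), so that only the call sites that use the wrappers get the approximate versions. The wrapper names fast_div and fast_sin and the HD macro are hypothetical. The intrinsic branch only takes effect in device code compiled by nvcc; on the host, or with an ordinary C++ compiler, the wrappers fall back to regular arithmetic.

```cpp
#include <cassert>
#include <cmath>

// HD marks a function as callable from host and device under nvcc and
// expands to nothing for an ordinary C++ compiler.
#if defined(__CUDACC__)
#define HD __host__ __device__
#else
#define HD
#endif

// Approximate division at this call site only, independent of --use_fast_math.
HD inline float fast_div(float a, float b) {
#if defined(__CUDA_ARCH__)
    return __fdividef(a, b);  // CUDA device intrinsic
#else
    assert(b != 0.0f);        // host-only sanity check
    return a / b;             // host fallback: ordinary division
#endif
}

// Approximate sine at this call site only.
HD inline float fast_sin(float x) {
#if defined(__CUDA_ARCH__)
    return __sinf(x);         // CUDA device intrinsic
#else
    return std::sin(x);       // host fallback: standard library sine
#endif
}
```

Code that needs full precision keeps using the plain / operator and sinf(), so both variants can coexist in one compilation unit built without --use_fast_math.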
                                                                        
                                                        
            
            
              
                
鱼传尺愫, 2021-01-12 10:30
                                                                       
With GCC, you can declare a function like this:

__attribute__((optimize("-ffast-math")))
double
myfunc(double val)
{
    return val / 2;
}


This is a GCC-only feature.
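
Because the attribute is specific to GCC, one way to keep the code portable is to hide it behind a macro, as in the sketch below. FAST_MATH_FN, sum_fast, and sum_default are hypothetical names chosen for illustration. Note that the GCC documentation describes the optimize attribute as intended for debugging rather than production use, so treat this as an experiment, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>

// Expands to the GCC attribute where it is supported and to nothing elsewhere,
// in which case the function is built with the compilation unit's own flags.
#if defined(__GNUC__) && !defined(__clang__)
#define FAST_MATH_FN __attribute__((optimize("-ffast-math")))
#else
#define FAST_MATH_FN
#endif

// Requested to be compiled with fast-math semantics (GCC only).
FAST_MATH_FN
float sum_fast(const float* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i < length; ++i) acc += data[i];
    return acc;
}

// Compiled with whatever flags the compilation unit uses.
float sum_default(const float* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i < length; ++i) acc += data[i];
    return acc;
}
```

For inputs whose partial sums are exactly representable, both functions return the same value; differences only show up where reordering or other relaxed floating-point rules change rounding.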

See a working example here: https://gcc.gnu.org/ml/gcc/2009-10/msg00385.html

It seems that GCC does not validate the arguments to optimize(), so a typo such as "-ffast-match" may be silently ignored.
                                                                        
                                                        
            
            
              
                