Why does adding a small float to a large float just drop the small one?

后端未结

关注

 3  391

Say I have:

float a = 3            // (gdb) p/f a   = 3
float b = 299792458    // (gdb) p/f b   = 299792448

then

float sum


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  感动是毒        
                
              
                            
                2020-11-29 12:39
              
            
            
                                                                       
Floating-point number have limited precision. If you're using a float, you're only using 32 bits. However some of those bits are reserved for defining the exponent, so that you really only have 23 bits to use. The number you give is too large for those 23 bits, so the last few digits are ignored.

To make this a little more intuitive, suppose all of the bits except 2 were reserved for the exponent. Then we can represent 0, 1, 2, and 3 without trouble, but then we have to increment the exponent. Now we need to represent 4 through 16 with only 2 bits. So the numbers that can be represented will be somewhat spread out: 4 and 5 won't both be there. So, 4+1 = 4.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  囚心锁ツ        
                
              
                            
                2020-11-29 12:45
              
            
            
                                                                       
All you really need to know about the mechanics of rounding is that the result you get is the closest float to the correct answer (with some extra rules that decide what to do if the correct answer is exactly between two floats). It just so happens that the smaller number you added is less than half the distance between two floats at that scale, so the result is indistinguishable from the larger number you added. This is correct, to within the limits of float precision. If you want a better answer, use a better-precision data type, like double.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  夕颜        
                
              
                            
                2020-11-29 13:05
              
            
            
                                                                       
32-bit floats only have 24 bits of precision. Thus, a float cannot hold b exactly - it does the best job it can by setting some exponent and then mantissa to get as close as possible.

When you then consider the floating point representation of b and a, and try and add them, the addition operation will shift the small number a's mantissa downwards as it tries to match b's exponent, to the point where the value (3) falls off the end and you're left with 0. Hence, the addition operator ends up adding floating point zero to b.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复