integrate() gives totally wrong number

后端未结

关注

 4  1191

integrate() gives horribly wrong answer:

integrate(function (x) dnorm(x, -5, 0.07), -Inf, Inf, subdivisions = 10000L)
# 2.127372e-23 with absolute error <


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  说谎        
                
              
                            
                2021-01-12 00:38
              
            
            
                                                                       
Interesting workaround: not too surprisingly, integrate does well when the values sampled (on (-Inf,Inf), no less) are closer to the "center" of the data. You can reduce this by using your function but hinting at a center:

Without adjustment:

t(sapply(-10:10, function(i) integrate(function (x) dnorm(x, i, 0.07), -Inf, Inf, subdivisions = 10000L)))
#       value        abs.error    subdivisions message call      
#  [1,] 0            0            1            "OK"    Expression
#  [2,] 1            4.611403e-05 10           "OK"    Expression
#  [3,] 6.619713e-19 1.212066e-18 2            "OK"    Expression
#  [4,] 7.344551e-71 0            2            "OK"    Expression
#  [5,] 3.389557e-06 6.086176e-06 3            "OK"    Expression
#  [6,] 2.127372e-23 3.849798e-23 2            "OK"    Expression
#  [7,] 1            3.483439e-05 8            "OK"    Expression
#  [8,] 1            6.338078e-07 11           "OK"    Expression
#  [9,] 1            3.408389e-06 7            "OK"    Expression
# [10,] 1            6.414833e-07 8            "OK"    Expression
# [11,] 1            7.578907e-06 3            "OK"    Expression
# [12,] 1            6.414833e-07 8            "OK"    Expression
# [13,] 1            3.408389e-06 7            "OK"    Expression
# [14,] 1            6.338078e-07 11           "OK"    Expression
# [15,] 1            3.483439e-05 8            "OK"    Expression
# [16,] 2.127372e-23 3.849798e-23 2            "OK"    Expression
# [17,] 3.389557e-06 6.086176e-06 3            "OK"    Expression
# [18,] 7.344551e-71 0            2            "OK"    Expression
# [19,] 6.619713e-19 1.212066e-18 2            "OK"    Expression
# [20,] 1            4.611403e-05 10           "OK"    Expression
# [21,] 0            0            1            "OK"    Expression


If we add a "centering" hint, though, we get more consistent results:

t(sapply(-10:10, function(i) integrate(function (x, offset) dnorm(x + offset, i, 0.07), -Inf, Inf, subdivisions = 10000L, offset = i)))
#       value abs.error    subdivisions message call      
#  [1,] 1     7.578907e-06 3            "OK"    Expression
#  [2,] 1     7.578907e-06 3            "OK"    Expression
#  [3,] 1     7.578907e-06 3            "OK"    Expression
#  [4,] 1     7.578907e-06 3            "OK"    Expression
#  [5,] 1     7.578907e-06 3            "OK"    Expression
#  [6,] 1     7.578907e-06 3            "OK"    Expression
#  [7,] 1     7.578907e-06 3            "OK"    Expression
#  [8,] 1     7.578907e-06 3            "OK"    Expression
#  [9,] 1     7.578907e-06 3            "OK"    Expression
# [10,] 1     7.578907e-06 3            "OK"    Expression
# [11,] 1     7.578907e-06 3            "OK"    Expression
# [12,] 1     7.578907e-06 3            "OK"    Expression
# [13,] 1     7.578907e-06 3            "OK"    Expression
# [14,] 1     7.578907e-06 3            "OK"    Expression
# [15,] 1     7.578907e-06 3            "OK"    Expression
# [16,] 1     7.578907e-06 3            "OK"    Expression
# [17,] 1     7.578907e-06 3            "OK"    Expression
# [18,] 1     7.578907e-06 3            "OK"    Expression
# [19,] 1     7.578907e-06 3            "OK"    Expression
# [20,] 1     7.578907e-06 3            "OK"    Expression
# [21,] 1     7.578907e-06 3            "OK"    Expression


I recognize this is mitigation for heuristics, presumes knowing something about your distribution before integration, and is not a perfect "generic" solution. Just offering another perspective.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤城傲影        
                
              
                            
                2021-01-12 00:50
              
            
            
                                                                       
Try package cubature.

library(cubature)

hcubature(function (x) dnorm(x, -5, 0.07), -Inf, Inf)
#$integral
#[1] 1
#
#$error
#[1] 9.963875e-06
#
#$functionEvaluations
#[1] 405
#
#$returnCode
#[1] 0


Note that function pcubature in the same package also returns 0.

From vignette("cubature"), section Introduction. My emphasis.  


  This R cubature package exposes both the hcubature and pcubature
  routines of the underlying C cubature library, including the
  vectorized interfaces.  
  
  Per the documentation, use of pcubature is advisable only for smooth
  integrands in dimensions up to three at most. In fact, the pcubature
  routines perform significantly worse than the vectorized hcubature
  in inappropriate cases. So when in doubt, you are better off using
  hcubature.  


Since in this case the integrand is the normal density, a smooth and 1-dimensional function, there would be reasons to prefer pcubature. But it doesn't give the right result. The vignette concludes the following.


  
  Vectorized hcubature seems to be a good starting point.  
  For smooth integrands in low dimensions (≤3),  pcubature might be worth trying out. Experiment before using in a production package.
  

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  既然无缘        
                
              
                            
                2021-01-12 00:53
              
            
            
                                                                       
Expanding a little further on @r2evan's and @Limey's comments:

@Limey: for very general problems like this, there is simply no way to guarantee a generic solution.

One way to solve such problem is to use more knowledge of the properties of the integrand (@r2evans's answer); the answer referenced by @Limey goes into detail for a different problem.

One "gotcha" that you may not have thought of is that trying out a bunch of generic methods, tuning settings, etc. may mislead you into concluding that some settings/methods are generically better than the first one you tried that failed to get the right answer. (Methods that work may work better because they're generically better, but trying them on one example doesn't prove it!)

As an example, the description of pcubature() (in ?cubature::pcubature says


  This algorithm is often superior to h-adaptive integration for
       smooth integrands in a few (<=3) dimensions, but is a poor choice
       in higher dimensions or for non-smooth integrands. 


However, recall that pcubature() happens to fail for your example, which is a smooth low-dimensional case - exactly where pcubature() is supposed to perform better - which suggests that it may be just luck that hcubature() works and pcubature() doesn't in this case.

An illustration of how sensitive the results can be to parameters (lower/upper limits in this case):

library(emdbook)
cc <- curve3d(integrate( dnorm, mean=-5, sd=0.07,
        lower=x, upper=y, subdivisions=1000L)$value,
       xlim=c(-30,-10), ylim=c(0,30), n = c(61, 61),
       sys3d="image", col=c("black", "white"),
    xlab="lower", ylab="upper")


White squares are successful (integral=1), black squares are bad (integral=0).


                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  我寻月下人不归        
                
              
                            
                2021-01-12 01:00
              
            
            
                                                                       
From the online documentation:  "Like all numerical integration routines, these evaluate the function on a finite set of points. If the function is approximately constant (in particular, zero) over nearly all its range it is possible that the result and error estimate may be seriously wrong."

I take this to mean "caveat emptor".  I notice that in your example, the absolute error is greater than value of the integral.  Given that you know f(x) > 0 for all x, at least it's giving you the chance to spot that something has gone wrong.  It's down to you to take the opportunity.

integrate( function(x) dnorm(x, -5, 0.07), -20, 10, subdivisions=1000L) 


Gives

1 with absolute error < 9.8e-07


The warning in the online doc says to me that, given your apparent definition of buggy, the answer to your question is "no, there is no reliable numerical intergration method.  Not in R or any other language".  No numerical integration technique should be used blindly.  The user needs to check their inputs are sensible and the output is reasonable.  It's no good believing an answer just because the computer gave it to you.

See also this post.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复