Scipy interpolation with masked data?


I am trying to interpolate a 2D array that contains masked data. I have used some of the SciPy module's available methods, including interp2d, bisplrep/b


                      
3 answers

谎友^, 2021-01-03 02:46
              
            
            
                                                                       
You can actually use any function that accepts x, y, z (which is the case for interp2d and probably the others as well) with your masked data, but you need to build the coordinate grid explicitly with np.mgrid:

z = ... # Your data
x, y = np.mgrid[0:z.shape[0], 0:z.shape[1]]


Then drop the masked entries from all three arrays:

x = x[~z.mask]
y = y[~z.mask]
z = z[~z.mask]


With these final x, y, z you can call any of the functions you listed, provided it accepts incomplete (scattered) grids, so RectBivariateSpline won't work. Note, however, that some of them interpolate within local boxes, so if the mask discards too large a region, the interpolation fails there (returning np.nan or 0). You may be able to compensate by tweaking the parameters if that happens.

For example:

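import numpy as np
from scipy.interpolate import interp2d  # deprecated in SciPy 1.10, removed in 1.14

data = np.random.randint(0, 10, (5,5))
mask = np.random.uniform(0,1,(5,5)) > 0.5
z = np.ma.array(data, mask=mask)
# index grid, then keep only the unmasked samples
x, y = np.mgrid[0:z.shape[0], 0:z.shape[1]]
x1 = x[~z.mask]
y1 = y[~z.mask]
z1 = z[~z.mask]
interp2d(x1, y1, z1)(np.arange(z.shape[0]), np.arange(z.shape[1]))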

array([[  1.1356716 ,   2.45313727,   3.77060294,   6.09790177, 9.31328935],
       [  3.91917937,   4.        ,   4.08082063,   3.98508121, 3.73406764],
       [ 42.1933738 ,  25.0966869 ,   8.        ,   0.        , 0.        ],
       [  1.55118338,   3.        ,   4.44881662,   4.73544593, 4.        ],
       [  5.        ,   8.        ,  11.        ,   9.34152525, 3.58619652]])


You can see a small area of 0s where many neighbouring values were masked:

mask
array([[False,  True,  True,  True, False],
       [False, False,  True, False, False],
       [ True,  True, False,  True,  True],
       [False,  True, False,  True,  True],
       [False,  True, False, False,  True]], dtype=bool)

data
array([[2, 4, 4, 5, 5],
       [1, 4, 1, 3, 8],
       [9, 1, 8, 0, 9],
       [7, 2, 0, 3, 4],
       [9, 6, 0, 4, 4]])
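
Because interp2d was deprecated in SciPy 1.10 and removed in 1.14, the same "keep only the unmasked points" idea may be easier to run today with scipy.interpolate.griddata, which takes scattered points directly. The following is only a sketch of that substitution, not the method used above: it reuses the data and mask arrays printed in this answer, and the names valid, points, values and filled are made up for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

# Values and mask copied from the example printed above.
data = np.array([[2, 4, 4, 5, 5],
                 [1, 4, 1, 3, 8],
                 [9, 1, 8, 0, 9],
                 [7, 2, 0, 3, 4],
                 [9, 6, 0, 4, 4]], dtype=float)
mask = np.array([[False, True, True, True, False],
                 [False, False, True, False, False],
                 [True, True, False, True, True],
                 [False, True, False, True, True],
                 [False, True, False, False, True]])
z = np.ma.array(data, mask=mask)

# Full index grid, then keep only the unmasked samples.
x, y = np.mgrid[0:z.shape[0], 0:z.shape[1]]
valid = ~np.ma.getmaskarray(z)
points = np.column_stack((x[valid], y[valid]))  # (n_valid, 2) coordinates
values = z.data[valid]                          # matching sample values

# Linear interpolation back onto the full grid. Grid points outside the
# convex hull of the valid samples come back as NaN instead of 0.
filled = griddata(points, values, (x, y), method="linear")
print(filled)
```

With method="linear" the result stays within the range of the valid samples, so large masked regions show up as NaN or as smooth fills rather than the overshoot seen in the interp2d output.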

                                                                        
                                                        
            
            
              
                
无人及你, 2021-01-03 02:47
              
            
            
                                                                       
The problem with the approaches outlined by @MSeifert is that the regular grid structure is lost, which makes the interpolation inefficient. That is justified only when the goal is to fill in missing data by interpolation, not for the typical case of interpolating from one grid to another, where missing data should stay missing.

In that case, filling missing values with np.nan is the simplest approach. The NaNs propagate through the calculation, so the resulting array has NaN wherever a missing value was used for interpolation.

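import numpy as np
from scipy.interpolate import RegularGridInterpolator

# fast interpolator that uses the regular grid structure
# (x and y are 1D arrays; z_masked has shape (len(y), len(x)))
z = z_masked.filled(np.nan)
zinterp = RegularGridInterpolator((x, y), z.T)

# new grid to interpolate on (x2 and y2 are 1D arrays)
X2, Y2 = np.meshgrid(x2, y2)
newpoints = np.array((X2, Y2)).T  # shape (len(x2), len(y2), 2)

# actual interpolation; z2 has shape (len(x2), len(y2))
z2 = zinterp(newpoints)
z2_masked = np.ma.array(z2, mask=np.isnan(z2))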
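
To make the NaN-propagation behaviour concrete, here is a small self-contained sketch on a made-up grid. The grid sizes, the test function X + 10*Y and the single masked sample are assumptions chosen only so the result is easy to check by eye. The target points are stacked along the last axis, so the output keeps the (len(y2), len(x2)) layout.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical source grid: z_masked has shape (len(y), len(x)).
x = np.linspace(0.0, 4.0, 5)
y = np.linspace(0.0, 3.0, 4)
X, Y = np.meshgrid(x, y)
z_masked = np.ma.masked_array(X + 10.0 * Y, mask=np.zeros(X.shape, dtype=bool))
z_masked[1, 1] = np.ma.masked  # one missing sample at x=1, y=1

# Replace masked entries with NaN and build the regular-grid interpolator.
z = z_masked.filled(np.nan)
zinterp = RegularGridInterpolator((x, y), z.T)

# Target grid kept inside the source bounds: by default the interpolator
# raises ValueError for points outside the grid.
x2 = np.linspace(0.25, 3.75, 8)
y2 = np.linspace(0.25, 2.75, 6)
X2, Y2 = np.meshgrid(x2, y2)
newpoints = np.stack((X2, Y2), axis=-1)  # shape (len(y2), len(x2), 2)

z2 = zinterp(newpoints)               # NaN wherever the missing sample was used
z2_masked = np.ma.masked_invalid(z2)  # turn the NaNs back into a mask
print(z2_masked)
```

Only the target points whose surrounding cell has the missing sample as a corner end up masked; everything else is ordinary bilinear interpolation.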


For completeness, another approach is to interpolate a second array holding the mask (1 where data is missing, 0 elsewhere) and use it to flag missing values on the new grid.

# fast interpolators that use the regular grid structure (x and y are 1D arrays);
# z is assumed to hold finite placeholder values where data is missing, and mask is True there
zinterp = RegularGridInterpolator((x, y), z.T)
minterp = RegularGridInterpolator((x, y), (mask+0.).T)

# actual interpolation
z2 = zinterp(newpoints)
# threshold on the interpolated mask: > 0 removes every point touched by a missing value;
# a larger cutoff such as > 0.5 removes only points dominated by missing values
mask2 = minterp(newpoints) > 0
z2[mask2] = np.nan  # fill with NaN or whatever missing-data flag you use
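
The effect of the threshold is easier to see on a toy grid. The sketch below is illustrative only: the grid, the placeholder value -999.0 and the names weight, strict and lenient are assumptions, not part of the answer. It interpolates the mask as 0/1 floats and compares a cutoff of 0 with a cutoff of 0.5.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Hypothetical source grid with one bad sample at x=1, y=1.
x = np.linspace(0.0, 4.0, 5)
y = np.linspace(0.0, 3.0, 4)
X, Y = np.meshgrid(x, y)
z = X + 10.0 * Y
mask = np.zeros(z.shape, dtype=bool)
mask[1, 1] = True
z[mask] = -999.0  # finite placeholder standing in for the missing value

zinterp = RegularGridInterpolator((x, y), z.T)
minterp = RegularGridInterpolator((x, y), mask.astype(float).T)

x2 = np.linspace(0.25, 3.75, 8)
y2 = np.linspace(0.25, 2.75, 6)
X2, Y2 = np.meshgrid(x2, y2)
newpoints = np.stack((X2, Y2), axis=-1)  # shape (len(y2), len(x2), 2)

z2 = zinterp(newpoints)
weight = minterp(newpoints)  # share of each result contributed by the bad sample

strict = weight > 0     # drop anything the bad sample touched
lenient = weight > 0.5  # drop only results dominated by the bad sample
z2_strict = np.where(strict, np.nan, z2)

print(strict.sum(), lenient.sum())
```

With the lenient cutoff, the surviving points near the bad sample still contain some share of the placeholder, so a looser threshold only makes sense if the placeholder is itself a plausible value.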


Note that both approaches should also work with RectBivariateSpline, if spline interpolation is desired. Either way, this should be much faster than using interp2d.
                                                                        
                                                        
            
            
              
                
無奈伤痛, 2021-01-03 03:00
              
            
            
                                                                       
I typically follow the approach described by @mseifert, but add the following refinement if I am wary of interpolation error across the masked areas. That seems to be one of your concerns, @hurrdrought? The idea is to propagate the mask to the interpolated result. A simple example for 1D data is:

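import numpy as np

def ma_interp(newx, x, y, mask, propagate_mask=True):
    """Interpolate y(x) at newx using only unmasked samples (mask is a boolean array)."""
    newy = np.interp(newx, x[~mask], y[~mask])  # interpolate valid data only
    if propagate_mask:
        # interpolate the mask as 0/1 floats, without modifying the caller's mask,
        # and mask results that lie mostly inside masked regions
        newmask = np.interp(newx, x, mask.astype(float))
        newy = np.ma.masked_array(newy, newmask > 0.5)
    return newy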
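
A short usage sketch of the same idea may help. It defines its own stand-alone helper, interp_propagating_mask (a hypothetical name, with the 0.5 cutoff exposed as a threshold parameter), and runs it on made-up data with a two-sample gap.

```python
import numpy as np

def interp_propagating_mask(newx, x, y, mask, threshold=0.5):
    """Interpolate over valid samples, then mask results near masked input."""
    valid = ~mask
    newy = np.interp(newx, x[valid], y[valid])
    # Interpolating the 0/1 mask measures how close each new point is
    # to masked samples; values above the threshold get masked.
    newmask = np.interp(newx, x, mask.astype(float)) > threshold
    return np.ma.masked_array(newy, mask=newmask)

x = np.arange(6.0)
y = 2.0 * x
mask = np.array([False, False, True, True, False, False])  # gap at x=2 and x=3
newx = np.array([0.5, 1.25, 2.5, 3.75, 4.5])

result = interp_propagating_mask(newx, x, y, mask)
print(result)
```

Here only the point at 2.5, which sits in the middle of the gap, is masked. The points at 1.25 and 3.75 are closer to valid samples than to masked ones, so they are kept.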

                                                                        
                                                        
            
            
              
                