Extract interpolated values from a 2D array based on a large set of xy points

不问归期 posted on 2020-07-22 04:55:50

Question


I have a reasonably large 1000 x 4000 pixel xr.DataArray returned from an OpenDataCube query, and a large set (> 200,000) of xy points. I need to sample the array to return the value under each xy point, and the values should be interpolated (e.g. if a point lands halfway between a pixel with value 0 and a pixel with value 1.0, the value returned should be 0.5).

The DataArray's interp method lets me easily sample interpolated values, but it returns a huge matrix containing every combination of the x and y values, rather than just the value for each xy point itself. I've tried using np.diagonal to extract just the xy point values, but this is slow and very quickly runs into memory issues. It also feels inefficient, given that I still have to wait for every combination of values to be interpolated.
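The behaviour described above can be reproduced on a small scale. The following is a minimal sketch, assuming a hypothetical 3 x 4 array named `small` and five made-up sample points (none of these names or values come from the question). Passing plain NumPy arrays to `interp` treats x and y as independent axes, so the result holds every x/y combination and only its diagonal contains the wanted values.

```python
import numpy as np
import xarray as xr

# Hypothetical 3 x 4 grid whose value is 4*y + x, so the expected
# interpolated value at any point is easy to check by hand
small = xr.DataArray(
    np.arange(12, dtype=np.float32).reshape(3, 4),
    coords={"y": [0.0, 1.0, 2.0], "x": [0.0, 1.0, 2.0, 3.0]},
    dims=["y", "x"],
)

# Five made-up xy points
xs = np.array([0.5, 1.5, 2.5, 0.25, 2.0])
ys = np.array([0.5, 1.0, 1.5, 0.75, 0.0])

# Plain arrays are interpolated orthogonally: one value per x/y combination
grid = small.interp(x=xs, y=ys)
print(grid.shape)

# Only the diagonal pairs xs[i] with ys[i]
wanted = np.diagonal(grid.values)
print(wanted)

# The intermediate grid grows with the square of the number of points
n = 200_000
print(f"{n * n:,} interpolated values for {n:,} points")
```

With n points the intermediate result has n x n entries, which is why this approach becomes slow and memory-hungry long before 200,000 points.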

Reproducible example

(using just 10,000 sample points; ideally, I need something that can scale to 200,000 or more):

import numpy as np
import xarray as xr

# Create sample array
width, height = 1000, 4000
val_array = xr.DataArray(data=np.random.randint(0, 10, size=(height, width)).astype(np.float32),
                         coords={'x': np.linspace(3000, 5000, width),
                                 'y': np.linspace(-3000, -5000, height)}, dims=['y', 'x'])

# Create sample points
n = 10000
x_points = np.random.randint(3000, 5000, size=n)
y_points = np.random.randint(-5000, -3000, size=n)

Current approach

%%timeit

# ATTEMPT 1
np.diagonal(val_array.interp(x=x_points, y=y_points).squeeze().values)
32.6 s ± 1.01 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

Does anyone know of a faster or more memory efficient way to achieve this?


Answer 1:


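To avoid interpolating the full grid, wrap the x and y points in DataArrays that share a new dimension (called z here), so that interp pairs them up point by point: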

x = xr.DataArray(x_points, dims='z')
y = xr.DataArray(y_points, dims='z')
val_array.interp(x=x, y=y)

This gives you an array along just the new z dimension:

<xarray.DataArray (z: 10000)>
array([4.368132, 2.139781, 5.693636, ..., 3.7505  , 3.713589, 2.28494 ])
Coordinates:
    x        (z) int64 4647 4471 4692 3942 3468 ... 3040 3993 3027 4427 3749
    y        (z) int64 -3744 -4074 -3634 -3289 -3221 ... -4195 -4131 -4814 -3362
Dimensions without coordinates: z

36.9 ms ± 1.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

There's a nice example in the xarray docs on Advanced Interpolation.
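As a sanity check, the pointwise result can be compared against the diagonal of the full grid on a small input. This is only a sketch under stated assumptions: the 3 x 4 array `small`, the five sample points, and the two-pixel array `step` are hypothetical and chosen so the expected values are easy to verify by hand.

```python
import numpy as np
import xarray as xr

# Hypothetical 3 x 4 grid whose value is 4*y + x
small = xr.DataArray(
    np.arange(12, dtype=np.float64).reshape(3, 4),
    coords={"y": [0.0, 1.0, 2.0], "x": [0.0, 1.0, 2.0, 3.0]},
    dims=["y", "x"],
)

# Five made-up xy points
xs = np.array([0.5, 1.5, 2.5, 0.25, 2.0])
ys = np.array([0.5, 1.0, 1.5, 0.75, 0.0])

# Pointwise: both indexers share the new dimension "z"
pointwise = small.interp(
    x=xr.DataArray(xs, dims="z"),
    y=xr.DataArray(ys, dims="z"),
)

# Orthogonal: plain arrays give every x/y combination, then take the diagonal
diagonal = np.diagonal(small.interp(x=xs, y=ys).values)

print(pointwise.dims)
print(np.allclose(pointwise.values, diagonal))

# The case from the question: halfway between a 0 pixel and a 1.0 pixel
step = xr.DataArray(
    [[0.0, 1.0], [0.0, 1.0]],
    coords={"y": [0.0, 1.0], "x": [0.0, 1.0]},
    dims=["y", "x"],
)
halfway = step.interp(
    x=xr.DataArray([0.5], dims="z"),
    y=xr.DataArray([0.0], dims="z"),
)
print(float(halfway.values[0]))
```

Both routes should agree on the values, but the pointwise call only ever computes one value per point, so its cost grows linearly with the number of points rather than quadratically.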



Source: https://stackoverflow.com/questions/55034347/extract-interpolated-values-from-a-2d-array-based-on-a-large-set-of-xy-points
