Indexing xarray data with variable length DataArray

Submitted by 自闭症网瘾萝莉.ら on 2020-08-05 10:23:21

Question


I am trying to extract data from an xarray dataset using DataArray indexing. My goal is to obtain the data along different line segments that overlap the array. To do that, I have computed the indices of each line; the index arrays differ in size depending on the length of the line.

For example, line 1 has x = [1,2,3], y = [7,8,9], line 2 has x = [1,4,5,6,8], y = [0,2,7,9,6], and so on. Some of the lines are 100 x 2. I have tried the following:

df=xarray_dataset
indx=xr.DataArray([[1,2,3],[1,4,5,6,8],[2,3]])
indy=xr.DataArray([[7,9,8],[0,2,7,9,6],[4,5]])
dx_sel=df.isel(x=indx,y=indy)

However, as I understand it, the index arrays all need to have the same length. Is there a way to handle this? These indices represent the x and y coordinates of different segments within the dataset, and I want the mean of each segment. I have hundreds of such segments. With only a few I could loop over the segment indices, but looping over every segment is not computationally efficient.

The same issue arises with NumPy arrays. Is there a way to pass NaN or something similar in the index, so that the index arrays have equal shape but no data is extracted for the padded positions?


Answer 1:


You can use the set_index -> unstack mechanism, which is based on pd.MultiIndex.

In [1]: import numpy as np
In [2]: import xarray as xr

In [4]: df = xr.DataArray(np.arange(110).reshape(10, 11),
   ...:                   dims=['x', 'y'])
In [5]: indx=xr.DataArray([1,2,3, 1,4,5,6,8, 2,3], 
   ...:                   dims=['index'],  
   ...:                   coords={'i': ('index', [0,0,0, 1,1,1,1,1, 2,2]), 
   ...:                           'j': ('index', [0,1,2, 0,1,2,3,4, 0,1])}) 
   ...:  
   ...: indy=xr.DataArray([7,9,8, 0,2,7,9,6, 4,5], dims=['index'], 
   ...:                   coords={'i': ('index', [0,0,0, 1,1,1,1,1, 2,2]), 
   ...:                           'j': ('index', [0,1,2, 0,1,2,3,4, 0,1])})       

In [8]: df.isel(x=indx, y=indy).set_index(index=['i', 'j']).unstack('index')                                         
Out[8]: 
<xarray.DataArray (i: 3, j: 5)>
array([[18., 31., 41., nan, nan],
       [11., 46., 62., 75., 94.],
       [26., 38., nan, nan, nan]])
Coordinates:
  * i        (i) int64 0 1 2
  * j        (j) int64 0 1 2 3 4

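Here, indx and indy have non-dimensional coordinates, i and j, which record each index's original position in the 2-dimensional layout: i is the segment number and j is the position within that segment. After unstacking, positions that a shorter segment does not have are filled with NaN.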
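Since the question has hundreds of segments and asks for the mean of each, the flat index arrays and the i and j coordinates probably should not be typed out by hand. The following is a minimal sketch of one way to build them from ragged lists and then take per-segment means. The names seg_x, seg_y, lengths, padded and segment_means are illustrative and do not appear in the original answer; the data array is the same toy array used above.

```python
import numpy as np
import xarray as xr

# Hypothetical ragged input: one list of x and y indices per segment.
seg_x = [[1, 2, 3], [1, 4, 5, 6, 8], [2, 3]]
seg_y = [[7, 9, 8], [0, 2, 7, 9, 6], [4, 5]]

lengths = [len(s) for s in seg_x]
# i: segment number of each point; j: position of the point within its segment.
i = np.repeat(np.arange(len(seg_x)), lengths)
j = np.concatenate([np.arange(n) for n in lengths])
coords = {"i": ("index", i), "j": ("index", j)}

# Flatten all segments into one 1-D indexer per axis.
indx = xr.DataArray(np.concatenate(seg_x), dims="index", coords=coords)
indy = xr.DataArray(np.concatenate(seg_y), dims="index", coords=coords)

# Same toy data as in the answer.
df = xr.DataArray(np.arange(110).reshape(10, 11), dims=["x", "y"])

# Pointwise selection, then reshape to (segment, position), padded with NaN.
padded = df.isel(x=indx, y=indy).set_index(index=["i", "j"]).unstack("index")

# mean() skips NaN by default for float data, so the padding is ignored.
segment_means = padded.mean("j")
print(segment_means.values)
```

Because the reduction runs over the padded j dimension, no Python loop over segments is needed once the index arrays are built.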



Source: https://stackoverflow.com/questions/62757416/indexing-xarray-data-with-variable-length-dataarray
