Question
I have an xarray Dataset of daily data with a number of variables. I want to extract the maximum
q_routed for every year, along with the values of the other variables on the day that maximum
occurs.
<xarray.Dataset>
Dimensions: (latitude: 1, longitude: 1, param_set: 1, time: 17167)
Coordinates:
* time (time) datetime64[ns] 1970-01-01 ...
* latitude (latitude) float32 44.5118
* longitude (longitude) float32 -111.435
* param_set (param_set) |S1 b''
Data variables:
ppt (time, param_set, latitude, longitude) float64 ...
pet (time, param_set, latitude, longitude) float64 ...
obsq (time, param_set, latitude, longitude) float64 ...
q_routed (time, param_set, latitude, longitude) float64 ...
The command below gives me the annual maximum of q_routed, but not the values of the other variables on that day, so it's not what I want.
ncdat['q_routed'].groupby('time.year').max()
Trial
I tried this
ncdat.groupby('time.year').argmax('time')
which leads to this error:
ValueError: All-NaN slice encountered
How can I do this?
Answer 1:
For this sort of operation, you probably want to use a custom reduce function:
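def my_func(ds, dim=None):
    # select, for every variable, the position along `dim` where q_routed peaks
    return ds.isel(**{dim: ds['q_routed'].argmax(dim)})

# newer xarray releases prefer .map(...) over .apply(...) for the same call
new = ncdat.groupby('time.year').apply(my_func, dim='time')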
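To see the custom reduce function at work, here is a minimal sketch on a made-up dataset. The names `demo` and `select_at_max`, the single-point layout, and the hand-placed peak values are all assumptions for illustration and are not part of the original `ncdat`. It uses `.map`, which newer xarray versions offer as the replacement for `.apply` on groupby objects.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for ncdat: two years of daily values at one point.
time = pd.date_range("1970-01-01", "1971-12-31", freq="D")
q = np.zeros(time.size)
q[40] = 5.0    # hand-placed peak inside 1970
q[500] = 7.0   # hand-placed peak inside 1971
demo = xr.Dataset(
    {
        "q_routed": ("time", q),
        # ppt simply counts days, so the selected value reveals which day was picked
        "ppt": ("time", np.arange(time.size, dtype=float)),
    },
    coords={"time": time},
)


def select_at_max(ds, dim="time"):
    # Position of the q_routed maximum within this year's slice
    idx = ds["q_routed"].argmax(dim)
    # Take that same position from every variable in the group
    return ds.isel({dim: idx})


# One row per year, with all variables taken from the day q_routed peaked
annual = demo.groupby("time.year").map(select_at_max)
print(annual)
```

The result is indexed by `year` rather than `time`, and the `ppt` values come from the same days as the `q_routed` peaks. The `time` coordinate of the result should then carry the date of each year's peak.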
Now, argmax doesn't play nicely with an array that is entirely NaN (which is what raises the ValueError above), so you may want to either apply this function only to locations with data or pre-fill the existing NaNs. Something like this could work:
# determine where you have valid data
mask = ncdat['q_routed'].isel(time=0).notnull()
# fill nans with a missing flag of some kind
ncdat2 = ncdat.fillna(-9999)
# do the groupby operation/reduction and reapply the mask
new = ncdat2.groupby('time.year').apply(my_func, dim='time').where(mask)
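The fill-and-mask workaround can be checked on a small made-up grid in which one location has no data at all. This is only a sketch under stated assumptions: the `loc` dimension, the `demo` dataset and the `select_at_max` helper are hypothetical, and `.map` is used as the newer spelling of `.apply`. The mask is built with `drop=True` so that it carries no scalar `time` coordinate of its own.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical two-location dataset; location 1 is entirely NaN.
time = pd.date_range("1970-01-01", "1971-12-31", freq="D")
q = np.zeros((time.size, 2))
q[40, 0] = 5.0     # 1970 peak at location 0
q[500, 0] = 7.0    # 1971 peak at location 0
q[:, 1] = np.nan   # no valid q_routed at location 1
ppt = np.tile(np.arange(time.size, dtype=float)[:, None], (1, 2))
demo = xr.Dataset(
    {"q_routed": (("time", "loc"), q), "ppt": (("time", "loc"), ppt)},
    coords={"time": time},
)


def select_at_max(ds, dim="time"):
    # Per-location position of the q_routed maximum, applied to all variables
    return ds.isel({dim: ds["q_routed"].argmax(dim)})


# True where the first time step holds valid data
mask = demo["q_routed"].isel(time=0, drop=True).notnull()
# Replace NaNs with a flag value so argmax never sees an all-NaN slice
filled = demo.fillna(-9999)
# Reduce per year, then blank out the locations that had no data
annual = filled.groupby("time.year").map(select_at_max).where(mask)
print(annual)
```

Like the original snippet, this treats a location as valid if its first time step is not NaN, so it assumes missing data shows up as a fully empty series rather than as scattered gaps.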
Source: https://stackoverflow.com/questions/50498645/how-can-i-find-the-maximum-across-all-variables-corrresponding-to-the-max-in-one