python-xarray

NetCDF Time series slice with Python 3

纵饮孤独 posted on 2019-12-24 19:48:56
Question: I'm trying to plot a week of time series data from NetCDF files and am running into some problems. I'm using the following packages:

    import netCDF4
    from matplotlib import pyplot as plt
    import numpy as np
    import xarray as xr
    import dask

First I import two .nc files:

    ds1 = xr.open_dataset('ERA5_forecast_100V_247_2008.nc')
    ds2 = xr.open_dataset('ERA5_analysis_100V_247_2008.nc')

Then I select time and grid location using xarray:

    dsloc1 = ds1.sel(time='2008-02-10',longitude=2.2,latitude=48.7,method=
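One possible way to get a full week at a single grid point is to split the selection into two steps: a nearest-neighbour lookup for the location, then a label-based time slice. The sketch below runs on a synthetic dataset rather than the ERA5 files from the question; the variable name `v100` and the grid spacing are assumptions made for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the ERA5 file: hourly data for February 2008 on a small grid.
# The variable name "v100" and the 0.25 degree spacing are assumptions for illustration.
time = np.arange("2008-02-01", "2008-03-01", dtype="datetime64[h]")
lat = np.arange(48.0, 50.0, 0.25)
lon = np.arange(1.5, 3.0, 0.25)
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"v100": (("time", "latitude", "longitude"),
              rng.normal(size=(time.size, lat.size, lon.size)))},
    coords={"time": time, "latitude": lat, "longitude": lon},
)

# Step 1: pick the grid cell closest to the requested location.
point = ds.sel(longitude=2.2, latitude=48.7, method="nearest")

# Step 2: slice one week by label; both end dates are included in full.
week = point.sel(time=slice("2008-02-10", "2008-02-16"))
```

Keeping the two `sel` calls separate means `method="nearest"` only applies to the location lookup, and `week["v100"].plot()` would then draw the series if matplotlib is available.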

python-xarray copy mask from one DataArray to another

空扰寡人 posted on 2019-12-24 11:14:50
Question: I got this to work for a simple case:

    arr2 = xr.DataArray((np.arange(16)-8).reshape(4, 4), dims=['x', 'y'])
    arr3 = xr.DataArray(np.arange(16).reshape(4, 4), dims=['x', 'y'])

    <xarray.DataArray (x: 4, y: 4)>
    array([[ nan,  nan,  nan,  nan],
           [ nan,  nan,  nan,  nan],
           [ nan,   9.,  10.,  11.],
           [ 12.,  13.,  14.,  15.]])
    Dimensions without coordinates: x, y

However, I'm having trouble applying this to NetCDF files. I have two datasets: significant wave height (Hs) and wind speed (ws). I would like to use the mask
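The output shown in the excerpt is consistent with keeping `arr3` wherever `arr2` is positive, so a minimal sketch of transferring a mask with `DataArray.where` might look like this. The `hs` and `ws` arrays are hypothetical stand-ins for the wave-height and wind-speed fields, not the asker's data.

```python
import numpy as np
import xarray as xr

arr2 = xr.DataArray((np.arange(16) - 8).reshape(4, 4), dims=["x", "y"])
arr3 = xr.DataArray(np.arange(16).reshape(4, 4), dims=["x", "y"])

# Keep arr3 where arr2 is positive; every other cell becomes NaN.
masked = arr3.where(arr2 > 0)

# Hypothetical stand-ins: hs already carries NaNs, ws is fully populated.
hs = arr2.where(arr2 > 0)
ws = arr3.astype(float)

# Reuse the missing-data pattern of hs as the mask for ws.
ws_masked = ws.where(hs.notnull())
```

For two variables read from NetCDF files the same pattern should apply as long as both share dimension names and coordinates, since `where` aligns its arguments by label before masking.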

How to use xr.apply_ufunc with changing dimensions

我们两清 posted on 2019-12-24 08:07:25
Question: I have a climate dataset with 3 dimensions loaded with xarray:

    climate = xr.open_dataset(data_file)
    climate

    <xarray.Dataset>
    Dimensions:  (lat: 621, lon: 1405, time: 424)
    Coordinates:
      * time     (time) datetime64[ns] 2017-11-01 2017-11-02 2017-11-03 ...
      * lon      (lon) float64 -125.0 -125.0 -124.9 -124.9 -124.8 -124.8 -124.7 ...
      * lat      (lat) float64 49.92 49.87 49.83 49.79 49.75 49.71 49.67 49.62 ...
    Data variables:
        tmean    (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
        status   (time)
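When the wrapped function consumes one dimension and returns a different one, `xr.apply_ufunc` can be told so through `input_core_dims` and `output_core_dims`. The sketch below is an illustration on a small random array, not the asker's function: the `percentiles` helper and the `percentile` dimension name are hypothetical.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
tmean = xr.DataArray(rng.normal(size=(10, 4, 5)), dims=("time", "lat", "lon"))

def percentiles(a):
    # apply_ufunc moves the core dim ("time") to the last axis of `a`.
    # np.percentile puts the new axis first, so move it back to the end.
    return np.moveaxis(np.percentile(a, [10, 50, 90], axis=-1), 0, -1)

out = xr.apply_ufunc(
    percentiles,
    tmean,
    input_core_dims=[["time"]],         # dimension consumed by the function
    output_core_dims=[["percentile"]],  # hypothetical name for the new dimension
)
out = out.assign_coords(percentile=[10, 50, 90])
```

The result has dimensions `(lat, lon, percentile)`: core dimensions always end up last, so a `transpose` may be needed afterwards if a particular order matters.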

Read multiple coordinates with xarray

99封情书 posted on 2019-12-24 07:51:40
Question: I'm using xarray to read single-point data from an OPeNDAP server, then I convert the xarray object to a dataframe. This works fine. I would like to read multiple points in a single call, but I don't know which is the best approach to do so. This is the code I'm using for a single point:

    import pandas as pd
    import xarray as xr

    url = 'http://nomads.ncep.noaa.gov:9090/dods/gfs_0p25/gfs20161111/gfs_0p25_00z'
    lats = [40.1,40.5,42.3]
    lons = [1.02,1.24,1.84]
    vars = ['dswrfsfc', 'tmp2m', 'pressfc']
    ds = xr
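One way to pull several points in a single call is xarray's vectorized (pointwise) indexing: wrap the latitude and longitude lists in DataArrays that share a new dimension. The sketch below uses a small synthetic dataset instead of the remote server; the coordinate names `lat` and `lon` and the dimension name `points` are assumptions.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the remote dataset; coordinate names are assumed.
lat = np.arange(40.0, 43.0, 0.25)
lon = np.arange(0.0, 3.0, 0.25)
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"tmp2m": (("time", "lat", "lon"), rng.normal(size=(4, lat.size, lon.size)))},
    coords={"time": np.arange(4), "lat": lat, "lon": lon},
)

# Indexers that share the "points" dimension are paired element by element,
# instead of selecting the full 3 x 3 outer product.
lats = xr.DataArray([40.1, 40.5, 42.3], dims="points")
lons = xr.DataArray([1.02, 1.24, 1.84], dims="points")
pts = ds.sel(lat=lats, lon=lons, method="nearest")

# One row per (time, point) pair.
df = pts.to_dataframe()
```

Passing plain lists instead would return every lat/lon combination, which is usually more data than needed when only a handful of stations are of interest.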

Join along a non-coordinate dimension in xarray

怎甘沉沦 posted on 2019-12-24 06:00:31
Question: I'm trying to join a set of values from one DataArray to another. They should join based on a non-dimension coordinate of the first. I think this should be easy but I can't seem to work it out. The first array:

    In [4]: primary=xr.DataArray(np.random.rand(4), dims=list('a'))
       ...: primary.coords['group'] = (('a',), [0,0,1,1])
       ...: primary
       ...:
    Out[4]:
    <xarray.DataArray (a: 4)>
    array([ 0.27772841,  0.06126117,  0.51753086,  0.35994987])
    Coordinates:
      * a        (a) int64 0 1 2 3
        group    (a) int64 0 0 1 1
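The second array is cut off in the excerpt, so the sketch below assumes a hypothetical `secondary` array indexed by `group`. Under that assumption, selecting from it with the `group` coordinate of `primary` acts as a lookup that broadcasts the per-group values onto dimension `a`.

```python
import numpy as np
import xarray as xr

primary = xr.DataArray(np.array([0.1, 0.2, 0.3, 0.4]), dims="a")
primary.coords["group"] = ("a", [0, 0, 1, 1])

# Hypothetical second array: one value per group label.
secondary = xr.DataArray([10.0, 20.0], dims="group", coords={"group": [0, 1]})

# Vectorized selection: the indexer has dimension "a", so the result does too.
joined = secondary.sel(group=primary["group"])

# The looked-up values now line up element by element with primary.
result = primary + joined
```

This behaves like a left join on `group`; a label present in `primary` but missing from `secondary` would raise a `KeyError` rather than produce NaN.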

randomly mask/set nan x% of data points in huge xarray.DataArray

前提是你 posted on 2019-12-24 04:01:15
Question: I have a huge (~2 billion data points) xarray.DataArray. I would like to randomly delete (either mask or replace with np.nan) a given percentage of the data, where the probability of every data point being chosen for deletion/masking is the same across all coordinates. I could convert the array to a numpy.array, but I would prefer to keep it in dask chunks for speed. My data looks like this:

    >> data
    <xarray.DataArray 'stack-820860ba63bd07adc355885d96354267' (variable: 8, time: 228,
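A simple approach is to draw one uniform random number per element and keep only the elements whose draw exceeds the target fraction. The sketch below uses a small NumPy-backed array with made-up dimension names. For a dask-backed array, drawing the random field with `dask.array.random.random` using matching chunks would presumably keep everything lazy, but that variant is an assumption and not shown.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(42)
# Small stand-in for the real array; shape and dimension names are illustrative.
data = xr.DataArray(rng.normal(size=(8, 50, 100)), dims=("var", "time", "x"))

frac = 0.2  # fraction of points to replace with NaN

# Independent uniform draw per element: each point is dropped with probability frac.
keep = rng.random(data.shape) >= frac
masked = data.where(keep)
```

The masked fraction is only approximately `frac`, because each element is decided independently. Hitting an exact count would need sampling without replacement over flat indices, which is harder to do chunk by chunk.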

Why is 'invalid value encountered in greater' warning thrown in python xarray for nan? Shouldn't the nan commute without any issues?

邮差的信 posted on 2019-12-24 01:18:20
Question: The following is a philosophical question aimed at figuring out why xarray is the way that it is. I'm having trouble figuring out the xarray way to do the following:

    positive_values = values.where(values > 0)

It follows xarray syntax and computes what I want, but throws this runtime warning:

    RuntimeWarning: invalid value encountered in greater
      if not reflexive

My question is: how am I abusing xarray? I'd like to make the case that nans are excellent in the sense
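For NumPy-backed arrays the warning comes from the underlying NumPy comparison rather than from xarray itself, and whether it appears at all seems to depend on the NumPy version. A comparison with NaN evaluates to False, so the result is the same either way. A minimal sketch for silencing it locally, using an example array that is not from the question:

```python
import numpy as np
import xarray as xr

values = xr.DataArray([-1.0, np.nan, 2.0, 3.0], dims="x")

# NaN > 0 is False, so NaN entries are masked along with the non-positive ones.
# errstate only silences the floating-point "invalid" warning inside this block.
with np.errstate(invalid="ignore"):
    positive_values = values.where(values > 0)
```

The context manager affects eager NumPy computation only. With dask-backed data the comparison runs later, at compute time, so the warning would have to be handled where the computation is triggered.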

Remove a dimension from some variables in an xarray Dataset

无人久伴 posted on 2019-12-23 21:37:28
Question: I have an xarray Dataset where some variables have more dimensions than necessary (e.g., a 3D dataset where the "latitude" and "longitude" variables also vary along time). How do I remove the extra dimensions? For example, in the dataset below, 'bar' is a 2D variable along the x and y axes, with constant values along the x axis. How do I remove the x dimension from 'bar' but not 'foo'?

    >>> ds = xr.Dataset({'foo': (('x', 'y'), np.random.randn(2, 3))}, {'x': [1, 2], 'y': [1, 2, 3], 'bar': (('x',
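Since the dataset definition is truncated in the excerpt, the sketch below rebuilds a similar one with made-up values for `bar`. One way to drop the redundant dimension from a single variable is to take the first index along `x` with `drop=True` and write the result back.

```python
import numpy as np
import xarray as xr

# Reconstruction of a similar dataset; the values of "bar" are made up.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    coords={
        "x": [1, 2],
        "y": [1, 2, 3],
        "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]]),
    },
)

# Take the first slice along x (bar is constant along x) and drop the scalar
# x coordinate, so the replacement depends on y alone.
ds = ds.assign_coords(bar=ds["bar"].isel(x=0, drop=True))
```

Only `bar` is replaced, so `foo` keeps both dimensions. Checking first that the variable really is constant along `x` avoids silently discarding information.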

xarray too slow for performance critical code

让人想犯罪 __ posted on 2019-12-23 15:44:50
Question: I planned to use xarray extensively in some numerically intensive scientific code that I am writing. So far, it makes the code very elegant, but I think I will have to abandon it as the performance cost is far too high. Here is an example, which creates two arrays and multiplies parts of them together using xarray (with several indexing schemes) and numpy. I used num_comp=2 and num_x=10000:

    Line #      Hits         Time  Per Hit   % Time  Line Contents
         4                                           @profile
         5                                           def xr_timing(num_comp, num_x):
         6         1         4112

Can I parallelize `numpy.bincount` using `xarray.apply_ufunc`?

╄→尐↘猪︶ㄣ posted on 2019-12-23 04:00:14
Question: I want to parallelize the numpy.bincount function using the apply_ufunc API of xarray, and the following code is what I've tried:

    import numpy as np
    import xarray as xr

    da = xr.DataArray(np.random.rand(2,16,32), dims=['time', 'y', 'x'], coords={'time': np.array(['2019-04-18', '2019-04-19'], dtype='datetime64'), 'y': np.arange(16), 'x': np.arange(32)})
    f = xr.DataArray(da.data.reshape((2,512)),dims=['time','idx'])
    x = da.x.values
    y = da.y.values
    r = np.sqrt(x[np.newaxis,:]**2 + y[:,np.newaxis]*
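`np.bincount` only accepts 1-D input, so one way to apply it across the `time` dimension is to declare `idx` as a core dimension and let `apply_ufunc` loop over the rest with `vectorize=True`. The sketch below assumes the truncated line computes a radial distance that is then binned to integers. The `radial_sum` helper and the output dimension name `r` are hypothetical.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
da = xr.DataArray(
    rng.random((2, 16, 32)),
    dims=["time", "y", "x"],
    coords={"y": np.arange(16), "x": np.arange(32)},
)
f = xr.DataArray(da.data.reshape((2, 512)), dims=["time", "idx"])

# Assumed completion of the truncated line: integer radial bin for each (y, x) cell.
x = da.x.values
y = da.y.values
r = np.sqrt(x[np.newaxis, :] ** 2 + y[:, np.newaxis] ** 2).astype(int).ravel()
nbins = int(r.max()) + 1

def radial_sum(values, bins, nbins):
    # Sum of `values` falling into each integer bin; minlength fixes the output size.
    return np.bincount(bins, weights=values, minlength=nbins)

out = xr.apply_ufunc(
    radial_sum,
    f,
    xr.DataArray(r, dims="idx"),
    kwargs={"nbins": nbins},
    input_core_dims=[["idx"], ["idx"]],
    output_core_dims=[["r"]],  # hypothetical name for the bin dimension
    vectorize=True,            # loop over "time" with np.vectorize
)
```

`vectorize=True` is a convenience loop, not real parallelism. With dask-backed inputs, `dask="parallelized"` plus a declared output size would be the natural next step, which this sketch does not cover.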