I've been using the .append() method to concatenate two tables (with the same fields) in pandas. Unfortunately, this method does not exist in xarray.
Xarray doesn't have an append method because its data structures are built on top of NumPy's non-resizable arrays, so we cannot append new elements without copying the entire array. Instead, you should use xarray.concat.
A common pattern is to accumulate Dataset/DataArray objects in a list and concatenate once at the end:
    import xarray

    datasets = []
    for example in examples:
        ds = create_an_xarray_dataset(example)
        datasets.append(ds)
    combined = xarray.concat(datasets, dim='example')
You don't want to concatenate inside the loop -- each call would copy everything accumulated so far, which makes your code run in quadratic time.
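For a version of this pattern that runs as-is, here is a minimal sketch. The helper make_dataset is a hypothetical stand-in for create_an_xarray_dataset, and the example ids and the 2x3 shape are made up for illustration:

```python
import numpy as np
import xarray as xr


def make_dataset(example):
    # Hypothetical stand-in for create_an_xarray_dataset:
    # a single 2x3 variable filled with the example id.
    data = np.full((2, 3), float(example))
    return xr.Dataset({"my_variable": (("x", "y"), data)})


examples = [10, 20, 30]  # made-up example ids

# Collect the pieces in a plain Python list (cheap), then concatenate once.
datasets = [make_dataset(example) for example in examples]
combined = xr.concat(datasets, dim="example")

# The new dimension has no labels yet; attach the ids as a coordinate.
combined = combined.assign_coords(example=examples)
```

After this, combined.sel(example=20) should pull out the piece built for that id.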
Alternatively, you could allocate a single Dataset/DataArray for the result, and fill in the values with indexing, e.g.,
    import numpy as np
    import xarray

    dims = ('example', 'x', 'y')
    combined = xarray.Dataset(
        data_vars={'my_variable': (dims, np.zeros((len(examples), 100, 200)))},
        coords={'example': examples})
    for example in examples:
        combined.loc[dict(example=example)] = create_an_xarray_dataset(example)
(Note that you always need to use indexing with square brackets like [] or .loc[] -- assigning with sel() and isel() doesn't work.)
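Here is a small self-contained sketch of the preallocate-and-fill approach. It uses a single DataArray rather than a Dataset to keep things short, and the labels, shape, and fill values are made up for illustration:

```python
import numpy as np
import xarray as xr

examples = ["a", "b", "c"]  # made-up labels

# Preallocate the full result once; the (3, 2, 4) shape is arbitrary.
combined = xr.DataArray(
    np.zeros((len(examples), 2, 4)),
    dims=("example", "x", "y"),
    coords={"example": examples},
    name="my_variable",
)

for i, example in enumerate(examples):
    # Hypothetical per-example result: a 2x4 block filled with i + 1.
    block = np.full((2, 4), float(i + 1))
    # Label-based assignment through .loc writes into the existing array.
    combined.loc[dict(example=example)] = block

# Positional assignment with square brackets also works in place.
combined[dict(example=0)] = -1.0
```

Both assignments modify combined in place, so no intermediate copies of the full array are made.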
These two approaches are equally efficient -- it's really a matter of taste which one looks better to you or works better for your application.
For what it's worth, pandas has the same limitation: the append method does indeed copy entire dataframes each time it is used. This is a perpetual surprise and source of performance issues for new users. So I do think that we made the right design decision not including it in xarray.
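The same accumulate-then-concatenate pattern works on the pandas side, which matters more now that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0. A minimal sketch, with made-up column names and values:

```python
import pandas as pd

frames = []
for i in range(3):
    # Hypothetical per-iteration table; every piece has the same columns.
    frames.append(pd.DataFrame({"id": [i, i], "value": [i * 1.0, i * 2.0]}))

# One concatenation at the end instead of one full copy per iteration.
table = pd.concat(frames, ignore_index=True)
```

Passing ignore_index=True renumbers the rows from 0 instead of keeping each piece's own index.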
You can use either xarray.concat() or xarray.merge(). See the documentation.
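Roughly, concat stacks objects along a dimension, while merge combines different variables that share coordinates. A small sketch of the difference, with made-up variable names and values:

```python
import numpy as np
import xarray as xr

x = [0, 1, 2]
temperature = xr.Dataset(
    {"temperature": ("x", np.array([10.0, 11.0, 12.0]))}, coords={"x": x}
)
pressure = xr.Dataset(
    {"pressure": ("x", np.array([1.0, 1.1, 1.2]))}, coords={"x": x}
)

# merge: different variables on the same coordinates -> one Dataset with both.
merged = xr.merge([temperature, pressure])

# concat: the same variable, with more data along an existing dimension.
more_temperature = xr.Dataset(
    {"temperature": ("x", np.array([13.0, 14.0]))}, coords={"x": [3, 4]}
)
longer = xr.concat([temperature, more_temperature], dim="x")
```

For two tables with the same fields, as in the question, concat is the closer match to what append did in pandas.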