I am using data from multiple netcdf files (in a folder on my computer). Each file holds data for the entire USA, for a time period of 5 years. Locations are referenced based on
Nice start. I would recommend the following to help solve your issues.
First, check out ncrcat to quickly concatenate your individual netCDF files into a single file. I highly recommend downloading NCO for netCDF manipulations, especially in this instance where it will ease your Python coding later on.
Let's say the files are named precip_1.nc, precip_2.nc, precip_3.nc, and precip_4.nc. You could concatenate them along the record dimension to form a new precip_all.nc, with a record dimension of length 58400, using:
ncrcat -O precip_1.nc precip_2.nc precip_3.nc precip_4.nc precip_all.nc
In Python we now just need to read in that new single file and then extract and store the time series for the desired grid cells. Something like this:
import netCDF4
import numpy as np
# y and x indices of the grid cells to extract (paired element by element)
yindexlist = [1,2,3]
xindexlist = [4,5,6]
ngridcell = len(xindexlist)
ntimestep = 58400
# Define an empty 2D array to store time series of precip for a set of grid cells
timeseries_per_grid_cell = np.zeros([ngridcell, ntimestep])
ncfile = netCDF4.Dataset('path/to/file/precip_all.nc', 'r')
# Note that precip is 3D (time, y, x), so need to read in all dimensions
precip = ncfile.variables['precip'][:,:,:]
for i in range(ngridcell):
    # Full time series for grid cell i
    timeseries_per_grid_cell[i,:] = precip[:, yindexlist[i], xindexlist[i]]
ncfile.close()
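Reading precip[:,:,:] pulls the whole grid for every time step into memory. If that turns out to be too large, one option is to index the netCDF variable itself, once per grid cell, so that only the requested series are read from disk. A minimal sketch, assuming the variable is laid out as (time, y, x) as in the code above; extract_cells is a hypothetical helper name, not part of netCDF4:

```python
def extract_cells(var, yindexlist, xindexlist):
    """Return one time series per (y, x) pair from a (time, y, x) variable.

    `var` can be a netCDF4 variable (e.g. ncfile.variables['precip']) or
    any array-like object that supports var[:, y, x] indexing.
    """
    # Slicing the variable directly reads only the requested cell,
    # rather than loading the full 3D array first.
    return [var[:, y, x] for y, x in zip(yindexlist, xindexlist)]
```

You would call it as extract_cells(ncfile.variables['precip'], yindexlist, xindexlist) before ncfile.close(), and get back a list with one 1D series per grid cell.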
If you have to use Python only, you'll need to keep track of the chunks of time indices that the individual files form to make the full time series. 58400/4 = 14600 time steps per file. So you'll have another loop to read in each individual file and store the corresponding slice of times, i.e. the first file will populate 0-14599, the second 14600-29199, etc.
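The Python-only bookkeeping described above could be sketched roughly as follows. This assumes the files are passed in chronological order and that you know the number of time steps in each one (14600 here); time_slices and read_all_files are hypothetical helper names, and the file names are the example ones from earlier.

```python
def time_slices(steps_per_file):
    """Map each file to the slice of the full time axis that it fills."""
    slices = []
    start = 0
    for nsteps in steps_per_file:
        slices.append(slice(start, start + nsteps))
        start += nsteps
    return slices


def read_all_files(paths, steps_per_file, yindexlist, xindexlist, varname="precip"):
    """Fill one (ngridcell, ntimestep) array from several files, in order."""
    # Third-party imports live here so time_slices() has no dependencies.
    import netCDF4
    import numpy as np

    ngridcell = len(xindexlist)
    ntimestep = sum(steps_per_file)
    timeseries_per_grid_cell = np.zeros([ngridcell, ntimestep])

    for path, tslice in zip(paths, time_slices(steps_per_file)):
        ncfile = netCDF4.Dataset(path, "r")
        precip = ncfile.variables[varname]
        for i in range(ngridcell):
            # Each file fills only its own block of time indices.
            timeseries_per_grid_cell[i, tslice] = precip[:, yindexlist[i], xindexlist[i]]
        ncfile.close()
    return timeseries_per_grid_cell
```

For the four example files this would be called as read_all_files(['precip_1.nc', 'precip_2.nc', 'precip_3.nc', 'precip_4.nc'], [14600] * 4, yindexlist, xindexlist). If the files do not all have the same length, you would need to read each file's time dimension first and pass those lengths instead.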