Combine multiple NetCDF files into timeseries multidimensional array python

轮回少年 2021-02-10 13:37

I am using data from multiple netCDF files (in a folder on my computer). Each file holds data for the entire USA for a 5-year period. Locations are referenced based on

3 answers
  • 2021-02-10 14:14

    Nice start. I would recommend the following to help solve your issues.

    First, check out ncrcat, part of NCO (the netCDF Operators), to quickly concatenate your individual netCDF files into a single file. I highly recommend installing NCO for netCDF manipulation, especially here, where it will simplify your Python code later on.

    Let's say the files are named precip_1.nc, precip_2.nc, precip_3.nc, and precip_4.nc. You could concatenate them along the record dimension to form a new precip_all.nc with a record dimension of length 58400 with

    ncrcat precip_1.nc precip_2.nc precip_3.nc precip_4.nc -O precip_all.nc
    

    In Python we now just need to read in that new single file and then extract and store the time series for the desired grid cells. Something like this:

    import netCDF4
    import numpy as np
    
    yindexlist = [1,2,3]
    xindexlist = [4,5,6]
    ngridcell = len(xindexlist)
    ntimestep = 58400
    
    # Define an empty 2D array to store time series of precip for a set of grid cells
    timeseries_per_grid_cell = np.zeros([ngridcell, ntimestep])
    
    ncfile = netCDF4.Dataset('path/to/file/precip_all.nc', 'r')
    
    # Note that precip is 3D, so need to read in all dimensions
    precip = ncfile.variables['precip'][:,:,:]
    
    for i in range(ngridcell):
        timeseries_per_grid_cell[i, :] = precip[:, yindexlist[i], xindexlist[i]]
    
    ncfile.close()
    

    If you have to use Python only, you'll need to keep track of which block of time indices each file occupies in the full time series. With 58400 / 4 = 14600 time steps per file, you'll need another loop that reads each file and stores its data in the corresponding time slice: the first file fills indices 0-14599, the second 14600-29199, and so on.
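
    The bookkeeping for the Python-only route could look roughly like the sketch below. The helper names time_slices and read_timeseries are made up for illustration, and the code assumes the variable is called precip with time as its first dimension, as in the snippet above. Instead of hard-coding 14600, it reads each file's length, so files of unequal length should work too.

```python
from itertools import accumulate


def time_slices(steps_per_file):
    """Return one slice per file giving its position on the full time axis."""
    stops = list(accumulate(steps_per_file))
    starts = [0] + stops[:-1]
    return [slice(start, stop) for start, stop in zip(starts, stops)]


def read_timeseries(paths, yindexlist, xindexlist, varname="precip"):
    """Fill a (grid cell, time) array by reading the files one at a time."""
    # Imported here so time_slices() stays usable without these packages.
    import netCDF4
    import numpy as np

    # First pass: find out how many time steps each file holds.
    steps_per_file = []
    for path in paths:
        with netCDF4.Dataset(path, "r") as ncfile:
            steps_per_file.append(ncfile.variables[varname].shape[0])

    out = np.zeros([len(xindexlist), sum(steps_per_file)])

    # Second pass: copy each file's data into its slice of the time axis.
    for path, time_slice in zip(paths, time_slices(steps_per_file)):
        with netCDF4.Dataset(path, "r") as ncfile:
            precip = ncfile.variables[varname]
            for i, (y, x) in enumerate(zip(yindexlist, xindexlist)):
                out[i, time_slice] = precip[:, y, x]
    return out


# Four files of 14600 time steps each, as in the example above.
slices = time_slices([14600] * 4)
print([(s.start, s.stop - 1) for s in slices])
# [(0, 14599), (14600, 29199), (29200, 43799), (43800, 58399)]
```

    The list passed as paths must already be in chronological order, since the slices are assigned in the order the files are given.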

  • 2021-02-10 14:20

    You can easily combine multiple netCDF files into one dataset using the netCDF4 package in Python. See the example below.

    Say I have four netCDF files: 1.nc, 2.nc, 3.nc, and 4.nc. The command below opens all four as a single dataset.

    import netCDF4
    
    dataset = netCDF4.MFDataset(['1.nc','2.nc','3.nc','4.nc'])
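
    One thing to watch with an explicit file list: as far as I know, MFDataset joins the files in the order given, so the list should be in chronological order. A plain sorted() would put 10.nc before 2.nc. The sketch below uses two made-up helpers, numeric_sort_key and open_in_time_order, and assumes the numbers in the file names follow the time order of the data. Also note that MFDataset only reads NETCDF3 and NETCDF4_CLASSIC formatted files, not full NETCDF4.

```python
import re


def numeric_sort_key(filename):
    """Sort key that compares embedded numbers by value (2.nc before 10.nc)."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", filename)]


def open_in_time_order(filenames):
    """Open the files as one read-only dataset, in numeric file name order."""
    # Imported here so the sorting helper can be used without netCDF4 installed.
    import netCDF4
    return netCDF4.MFDataset(sorted(filenames, key=numeric_sort_key))


files = ["10.nc", "2.nc", "1.nc", "3.nc"]
print(sorted(files, key=numeric_sort_key))
# ['1.nc', '2.nc', '3.nc', '10.nc']
```

    The returned object can then be indexed like a regular Dataset, for example dataset.variables['precip'][:, y, x] for the full time series at one grid cell.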
    
  • 2021-02-10 14:22

    As an alternative to N1B4's answer, you can also concatenate the 4 files along their time dimension using CDO from the command line:

    cdo mergetime precip1.nc precip2.nc precip3.nc precip4.nc merged_file.nc 
    

    or with wildcards

    cdo mergetime precip?.nc merged_file.nc 
    

    and then proceed to read it in as per that answer.

    You can add another step from the command line to extract the location of choice by using

    cdo remapnn,lon=X/lat=Y merged_file.nc my_location.nc
    

    This picks out the grid cell nearest to your specified lon/lat (X, Y) coordinate. If you prefer, you can use bilinear interpolation instead:

    cdo remapbil,lon=X/lat=Y merged_file.nc my_location.nc 
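
    If you would rather do the nearest-cell lookup in Python, for instance to fill the yindexlist and xindexlist used in the first answer, a rough equivalent is sketched below. This is not how CDO implements remapnn. It assumes a regular grid with 1D latitude and longitude axes, and it ignores longitude wrap-around. The helper name nearest_index and the coordinate values are made up for illustration.

```python
def nearest_index(coords, target):
    """Return the index of the coordinate value closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


# Hypothetical 1D coordinate axes of a regular 0.5-degree grid.
lats = [30.0, 30.5, 31.0, 31.5]
lons = [-100.0, -99.5, -99.0, -98.5]

yindex = nearest_index(lats, 30.8)   # latitude Y = 30.8
xindex = nearest_index(lons, -99.4)  # longitude X = -99.4
print(yindex, xindex)
# 2 1
```

    In practice the two axes would come from the coordinate variables of the merged file rather than being typed in by hand.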
    