python netcdf: making a copy of all variables and attributes but one

南笙 2020-12-09 10:42

I need to process a single variable in a netcdf file that contains many attributes and variables. I think it is not possible to update a netcdf file (see question Ho

5 answers
  • 2020-12-09 11:16

    If you just want to copy the file picking out variables, nccopy is a great tool as submitted by @rewfuss.

    Here's a Pythonic (and more flexible) solution with python-netcdf4. It lets you open the file for processing and other calculations before writing the result to a new file.

    import netCDF4

    # file1 is the path of the existing input file, file2 the output file to create
    with netCDF4.Dataset(file1) as src, netCDF4.Dataset(file2, "w") as dst:

      # recreate every dimension; unlimited dimensions stay unlimited (size None)
      for name, dimension in src.dimensions.items():
        dst.createDimension(name, len(dimension) if not dimension.isunlimited() else None)

      for name, variable in src.variables.items():

        # take out the variable you don't want
        if name == 'some_variable':
          continue

        # createVariable returns the new variable, so write the data through it
        x = dst.createVariable(name, variable.datatype, variable.dimensions)
        x[:] = variable[:]
    

    This does not take variable attributes, such as fill values, into account. You can add that easily by following the documentation.
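    As a rough sketch of the attribute handling mentioned above: netCDF4-python only accepts `_FillValue` when a variable is created, so it has to be passed through the `fill_value` keyword of `createVariable` rather than set afterwards. The helper name `copy_variable` is hypothetical; `src_var` and `dst` are assumed to be an open netCDF4 variable and a writable Dataset whose dimensions already exist.

```python
def copy_variable(src_var, dst, name):
    """Copy one variable (definition, attributes, data) into the dataset `dst`.

    Hypothetical helper: assumes the dimensions used by `src_var`
    have already been created in `dst`.
    """
    # collect the variable attributes into a plain dict
    attrs = {key: src_var.getncattr(key) for key in src_var.ncattrs()}
    # _FillValue must be given at creation time, so take it out of the dict
    fill_value = attrs.pop("_FillValue", None)
    dst_var = dst.createVariable(
        name, src_var.datatype, src_var.dimensions, fill_value=fill_value)
    # set the remaining attributes before writing the data
    dst_var.setncatts(attrs)
    dst_var[:] = src_var[:]
    return dst_var
```

    Inside the variable loop this would replace the `createVariable` call and the data assignment, for example `copy_variable(variable, dst, name)`.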

    Be careful: changes written to a netCDF4 file this way cannot be undone. Once you modify a variable, the change is written to the file at the end of the with statement, or when you call .close() on the Dataset.

    Of course, if you wish to process the variables before writing them, you have to be careful about which dimensions to create. In a new file, never write to variables without creating them first. Also, never create variables before their dimensions are defined, as noted in python-netcdf4's documentation.
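    One way to respect that ordering is to create, just before each `createVariable` call, whatever dimensions the variable needs and the output file still lacks. The sketch below assumes `src` and `dst` are open netCDF4 Datasets; the function name `ensure_dimensions` is made up for illustration.

```python
def ensure_dimensions(src, dst, variable):
    """Create in `dst` every dimension that `variable` uses and `dst` lacks.

    Hypothetical helper; returns the names of the dimensions it created.
    """
    created = []
    for dim_name in variable.dimensions:
        if dim_name in dst.dimensions:
            continue  # already defined in the output file
        dimension = src.dimensions[dim_name]
        # unlimited dimensions are created with size None
        size = None if dimension.isunlimited() else len(dimension)
        dst.createDimension(dim_name, size)
        created.append(dim_name)
    return created
```

    Calling it per variable means dimensions used only by excluded variables are never written to the new file.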

  • 2020-12-09 11:22

    This answer builds on the one from Xavier Ho (https://stackoverflow.com/a/32002401/7666), but with the fixes I needed to complete it:

    import netCDF4 as nc

    toexclude = ["TO_REMOVE"]
    with nc.Dataset("orig.nc") as src, nc.Dataset("filtered.nc", "w") as dst:
        # copy global attributes
        for name in src.ncattrs():
            dst.setncattr(name, src.getncattr(name))
        # copy dimensions; isunlimited must be called, otherwise the bound
        # method is always truthy and every dimension becomes unlimited
        for name, dimension in src.dimensions.items():
            dst.createDimension(
                name, (len(dimension) if not dimension.isunlimited() else None))
        # copy all file data except for the excluded variables
        for name, variable in src.variables.items():
            if name not in toexclude:
                x = dst.createVariable(name, variable.datatype, variable.dimensions)
                dst.variables[name][:] = src.variables[name][:]
    
  • 2020-12-09 11:30

    Here's what I just used, and it worked: @arne's answer, updated for Python 3 and extended to copy variable attributes as well:

    import netCDF4 as nc
    toexclude = ['ExcludeVar1', 'ExcludeVar2']
    
    with nc.Dataset("in.nc") as src, nc.Dataset("out.nc", "w") as dst:
        # copy global attributes all at once via dictionary
        dst.setncatts(src.__dict__)
        # copy dimensions
        for name, dimension in src.dimensions.items():
            dst.createDimension(
                name, (len(dimension) if not dimension.isunlimited() else None))
        # copy all file data except for the excluded
        for name, variable in src.variables.items():
            if name not in toexclude:
                x = dst.createVariable(name, variable.datatype, variable.dimensions)
                # copy variable attributes all at once via dictionary, before the
                # data, so that any scale_factor/add_offset applies on writing
                dst[name].setncatts(src[name].__dict__)
                dst[name][:] = src[name][:]
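
    Since the original goal was to process one variable while copying the rest, the same loop can be wrapped in a function that takes a per-variable transform. This is only a sketch under assumptions: `copy_dataset` and its `transforms` argument are hypothetical names, `src` and `dst` are open netCDF4 Datasets, and `_FillValue` is passed at creation time because it cannot be set afterwards.

```python
def copy_dataset(src, dst, exclude=(), transforms=None):
    """Copy `src` into `dst`, skipping `exclude` and transforming some data.

    Hypothetical helper: `transforms` maps a variable name to a function
    that receives the variable's data and returns the processed data.
    """
    transforms = transforms or {}
    # global attributes
    dst.setncatts({key: src.getncattr(key) for key in src.ncattrs()})
    # dimensions (unlimited ones are created with size None)
    for name, dimension in src.dimensions.items():
        dst.createDimension(
            name, None if dimension.isunlimited() else len(dimension))
    for name, variable in src.variables.items():
        if name in exclude:
            continue
        attrs = {key: variable.getncattr(key) for key in variable.ncattrs()}
        # _FillValue has to be supplied when the variable is created
        fill_value = attrs.pop("_FillValue", None)
        out = dst.createVariable(
            name, variable.datatype, variable.dimensions, fill_value=fill_value)
        out.setncatts(attrs)
        data = variable[:]
        if name in transforms:
            data = transforms[name](data)  # process before writing
        out[:] = data
```

    With two open datasets, a call such as `copy_dataset(src, dst, exclude=toexclude, transforms={"temp": lambda data: data * 3})` would copy everything and triple the values of a variable named `temp` (an illustrative name).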
    
  • 2020-12-09 11:31

    The nccopy utility in C netCDF versions 4.3.0 and later includes an option to list which variables are to be copied (along with their attributes). Unfortunately, it doesn't include an option for which variables to exclude, which is what you need.

    However, if the list of (comma-delimited) variables to be included doesn't cause the nccopy command-line to exceed system limits, this would work. There are two variants for this option:

    nccopy -v var1,var2,...,varn input.nc output.nc
    nccopy -V var1,var2,...,varn input.nc output.nc
    

    The first (-v) includes all the variable definitions, but only data for the named variables. The second (-V) doesn't include definitions or data for unnamed variables.
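
    Because nccopy only offers an include list, the exclusion can be emulated by building that list from all variable names minus the unwanted ones. The sketch below only assembles the command line; `build_nccopy_command` is a hypothetical helper, and the variable names are assumed to come from elsewhere (for example the keys of `netCDF4.Dataset(infile).variables`).

```python
def build_nccopy_command(variable_names, exclude, infile, outfile):
    """Return an nccopy argument list that copies everything except `exclude`.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    excluded = set(exclude)
    keep = [name for name in variable_names if name not in excluded]
    if not keep:
        raise ValueError("no variables left to copy")
    # -V copies definitions and data for the named variables only
    return ["nccopy", "-V", ",".join(keep), infile, outfile]
```

    The resulting list can be handed to `subprocess.run(cmd, check=True)`. Passing the arguments as a list avoids shell quoting, although a very long variable list can still hit the system's command-line length limit mentioned above.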

  • 2020-12-09 11:32

    I know this is an old question, but as an alternative, you can use the library netcdf and shutil:

    import shutil
    from netcdf import netcdf as nc
    
    def processing(infile, variable, outfile):
        shutil.copyfile(infile, outfile)
        with nc.loader(infile) as in_root, nc.loader(outfile) as out_root:
            data = nc.getvar(in_root, variable)
            # do your processing with data and save them as memory "values"...
            values = data[:] * 3
            new_var = nc.getvar(out_root, variable, source=data)
            new_var[:] = values
    