Deleting variables from a .mat file

無奈伤痛 2021-01-04 02:05

Does anyone here know how to delete a variable from a MATLAB .mat file? I know that you can add variables to an existing .mat file using save with the -append option.

4 answers
  • 2021-01-04 02:18

    Interestingly enough, you can use the -append option with SAVE to effectively erase data from a .mat file. Note this excerpt from the documentation (bold added by me):

    For MAT-files, -append adds new variables to the file or replaces the saved values of existing variables with values in the workspace.

    In other words, if a variable in your .mat file is called A, you can save over that variable with a new copy of A (that you've set to []) using the -append option. There will still be a variable called A in the .mat file, but it will be empty and thus reduce the total file size.

    Here's an example:

    >> A = rand(1000);            %# Create a 1000-by-1000 matrix of random values
    >> save('savetest.mat','A');  %# Save A to a file
    >> whos -file savetest.mat    %# Look at the .mat file contents
      Name         Size                Bytes  Class     Attributes
    
      A         1000x1000            8000000  double
    

    The file size will be about 7.21 MB. Now do this:

    >> A = [];                              %# Set the variable A to empty
    >> save('savetest.mat','A','-append');  %# Overwrite A in the file
    >> whos -file savetest.mat              %# Look at the .mat file contents
      Name      Size            Bytes  Class     Attributes
    
      A         0x0                 0  double
    

    And now the file size will be around 169 bytes. The variable is still in there, but it is empty.

  • 2021-01-04 02:22

    10 GB of data? Updating multi-variable MAT files could get expensive due to MAT format overhead. Consider splitting the data up and saving each variable to a different MAT file, using directories for organization if necessary. Even if you had a convenient function to delete variables from a MAT file, it would be inefficient. The variables in a MAT file are laid out contiguously, so replacing one variable can require reading and rewriting much of the rest. If they're in separate files, you can just delete the whole file, which is fast.

    To see this in action, try this code, stepping through it in the debugger while using something like Process Explorer (on Windows) to monitor its I/O activity.

    function replace_vars_in_matfile
    
    x = 1;
    % Random dummy data; zeros would compress really well and throw off results
    y = randi(intmax('uint8')-1, 100*(2^20), 1, 'uint8');
    
    tic; save test.mat x y; toc;
    x = 2;
    tic; save -append test.mat x; toc;
    y = y + 1;
    tic; save -append test.mat y; toc;
    

    On my machine, the results look like this. (Read and Write are cumulative, Time is per operation.)

                        Read (MB)      Write (MB)       Time (sec)
    before any write:   25             0
    first write:        25             105              3.7
    append x:           235            315              3.6
    append y:           235            420              3.8
    

    Notice that updating the small x variable costs more I/O than updating the large y: appending x reads about 210 MB and writes about 210 MB, while appending y reads nothing and writes about 105 MB. Much of this I/O activity is "redundant" housekeeping work to keep the MAT file format organized, and will go away if each variable is in its own file.

    Also, try to keep these files on the local filesystem; it'll be a lot faster than network drives. If they need to go on a network drive, consider doing the save() and load() on local temp files (maybe chosen with tempname()) and then copying them to/from the network drive. Matlab's save and load tend to be much faster with local filesystems, enough so that local save/load plus a copy can be a substantial net win.
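
    As a rough illustration of the local-temp-file approach described above, here is a minimal sketch. The names networkDir, networkFile, localFile and localCopy are made up for this example, and a temporary directory stands in for the network share so the snippet can run anywhere; substitute your real network path.

    ```matlab
    % Stand-in for the network share (hypothetical; replace with a real path)
    networkDir = tempname();
    mkdir(networkDir);
    networkFile = fullfile(networkDir, 'data.mat');

    % Saving: write to a local temp file first, then copy to the destination
    x = magic(4);
    localFile = [tempname() '.mat'];
    save(localFile, 'x');
    copyfile(localFile, networkFile);
    delete(localFile);

    % Loading: copy to a local temp file first, then load from there
    localCopy = [tempname() '.mat'];
    copyfile(networkFile, localCopy);
    S = load(localCopy);
    delete(localCopy);
    ```

    Whether this actually beats saving directly to the network drive depends on your setup, so it is worth timing both with tic/toc.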


    Here's a basic implementation that will let you save variables to separate files using the familiar save() and load() signatures. They're prefixed with "d" to indicate they're the directory-based versions. They use some tricks with evalin() and assignin(), so I thought it would be worth posting the full code.

    function dsave(file, varargin)
    %DSAVE Like save, but each var in its own file
    %
    % dsave filename var1 var2 var3...
    if nargin < 1 || isempty(file); file = 'matlab';  end
    [tfStruct,loc] = ismember({'-struct'}, varargin);
    args = varargin;
    args(loc(tfStruct)) = [];
    if ~all(cellfun(@isvarname, args))
        error('Invalid arguments. Usage: dsave filename <-struct> var1 var2 var3 ...');
    end
    if tfStruct
        structVarName = args{1};
        s = evalin('caller', structVarName);
    else
        varNames = args;
        if isempty(args)
            w = evalin('caller','whos');
            varNames = { w.name };
        end
        captureExpr = ['struct(' ...
            join(',', cellfun(@(x){sprintf('''%s'',{%s}',x,x)}, varNames)) ')'];
        s = evalin('caller', captureExpr);
    end
    
    % Use Java checks to avoid partial path ambiguity
    jFile = java.io.File(file);
    if ~jFile.exists()
        ok = mkdir(file);
        if ~ok; 
            error('failed creating dsave dir %s', file);
        end
    elseif ~jFile.isDirectory()
        error('Cannot save: destination exists but is not a dir: %s', file);
    end
    names = fieldnames(s);
    for i = 1:numel(names)
        varFile = fullfile(file, [names{i} '.mat']);
        varStruct = struct(names{i}, {s.(names{i})});
        save(varFile, '-struct', 'varStruct');
    end
    
    function out = join(Glue, Strings)
    Strings = cellstr(Strings);
    if length( Strings ) == 0
        out = '';
    elseif length( Strings ) == 1
        out = Strings{1};
    else
        Glue = sprintf( Glue ); % Support escape sequences
        out = strcat( Strings(1:end-1), { Glue } );
        out = [ out{:} Strings{end} ];
    end
    

    Here's the load() equivalent.

    function out = dload(file,varargin)
    %DLOAD Like load, but each var in its own file
    if nargin < 1 || isempty(file); file = 'matlab'; end
    varNames = varargin;
    if ~exist(file, 'dir')
        error('Not a dsave dir: %s', file);
    end
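    if isempty(varNames)
        % Use dir() rather than ls(): ls returns a padded char matrix on
        % Windows and a single char vector on Unix, so it is not portable
        d = dir(fullfile(file, '*.mat'));
        varNames = regexprep({d.name}, '\.mat$', '');
    end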
    
    out = struct;
    for i = 1:numel(varNames)
        name = varNames{i};
        tmp = load(fullfile(file, [name '.mat']));
        out.(name) = tmp.(name);
    end
    
    if nargout == 0
        for i = 1:numel(varNames)
            assignin('caller', varNames{i}, out.(varNames{i}));
        end
        clear out
    end
    

    dwhos() is the equivalent of whos('-file'), except that it only returns the variable names.

    function out = dwhos(file)
    %DWHOS List variable names in a dsave dir
    if nargin < 1 || isempty(file); file = 'matlab'; end
    d = dir(fullfile(file, '*.mat'));   % dir() is portable; ls() output differs by OS
    out = regexprep({d.name}, '\.mat$', '');
    

    And ddelete() to delete the individual variables like you asked.

    function ddelete(file,varargin)
    %DDELETE Delete variables from a dsave dir
    if nargin < 1 || isempty(file); file = 'matlab'; end
    varNames = varargin;
    for i = 1:numel(varNames)
        delete(fullfile(file, [varNames{i} '.mat']));
    end
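
    To show the one-file-per-variable idea end to end without depending on the functions above, here is a small self-contained sketch. The directory and variable names (storeDir, vars, remaining) are invented for the example, and it works in a temporary directory that it removes afterwards.

    ```matlab
    % Create a throwaway directory to act as the variable store
    storeDir = tempname();
    mkdir(storeDir);

    % Save each field of a struct to its own .mat file
    vars = struct('x', 1, 'y', rand(3));
    names = fieldnames(vars);
    for i = 1:numel(names)
        one = struct(names{i}, {vars.(names{i})});
        save(fullfile(storeDir, [names{i} '.mat']), '-struct', 'one');
    end

    % "Deleting" a variable is just deleting its file
    delete(fullfile(storeDir, 'y.mat'));

    % List what is left, then clean up
    listing = dir(fullfile(storeDir, '*.mat'));
    remaining = {listing.name};
    rmdir(storeDir, 's');
    ```

    After the delete, only x.mat remains in the directory, and no other variable's data had to be read or rewritten.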
    
  • 2021-01-04 02:28

    The only way of doing this that I know is to use the MAT-file API function matDeleteVariable. It would, I guess, be quite easy to write a Fortran or C routine to do this, but it does seem like a lot of effort for something that ought to be much easier.

  • 2021-01-04 02:32

    I suggest you load the variables from the .mat file you want to keep, and save them to a new .mat file. If necessary, you can load and save (using '-append') in a loop.

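    %# variablesYouWantToKeep is a cell array of names, e.g. {'A','B'}
    S = load(filename, '-mat', variablesYouWantToKeep{:});
    %# '-struct' takes the NAME of the struct variable, so pass 'S' as a string
    save(newFilename, '-struct', 'S');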
    %# then you can delete the old file
    delete(filename)
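
    For files too large to load in one go, the loop with '-append' mentioned above could look roughly like this. It is only a sketch: srcFile, dstFile, toDrop and keep are names made up for the example, and the source file is generated on the spot so the snippet is self-contained.

    ```matlab
    % Build a small example file with three variables
    srcFile = [tempname() '.mat'];
    dstFile = [tempname() '.mat'];
    A = 1; B = 2; C = 3;
    save(srcFile, 'A', 'B', 'C');

    % Work out which variables to keep
    toDrop = {'B'};
    info = whos('-file', srcFile);
    keep = setdiff({info.name}, toDrop);

    % Copy the kept variables over one at a time to limit memory use
    for i = 1:numel(keep)
        S = load(srcFile, keep{i});
        if i == 1
            save(dstFile, '-struct', 'S');             % creates the new file
        else
            save(dstFile, '-struct', 'S', '-append');  % adds to it
        end
    end

    % Once the new file has been checked, the old one can be removed
    delete(srcFile);
    ```

    You could then use movefile to rename the new file to the original name.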
    