How should I delete hash elements while iterating?

醉话见心 2021-01-11 11:50

I have a fairly large hash (some 10M keys) and I would like to delete some elements from it.

I usually don't like to use delete or splice, a

4 Answers
  • 2021-01-11 12:40

    How about this:

    my %to_delete;
    
    foreach my $key (keys %hash) {
        if (should_be_deleted($key)) {
            $to_delete{$key}++;
        }
        # add some other keys the same way...
    }
    
    delete @hash{keys %to_delete};
    
  • 2021-01-11 12:41

    You can mark the hash elements to be deleted by setting their values to undef. That avoids spending memory on a separate list of keys to delete, and it lets the loop skip elements that are already marked. It only works if undef is never a legitimate value in the hash. It is also less wasteful to iterate with each than with a for loop over keys %hash, because keys builds a list of all the hash keys before the loop starts.

    Like this:

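    while ( my ($key, $val) = each %hash ) {
    
        next unless defined $val and should_be_deleted($key);
    
        # Mark only keys that already exist: inserting new keys while
        # each() is iterating can make it skip or repeat entries.
        for my $k ( $key, $key.'a', 'kkk'.$key ) {
            $hash{$k} = undef if exists $hash{$k};
        }
    }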
    
    while ( my ($key, $val) = each %hash ) {
        delete $hash{$key} unless defined $val;
    }
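
    For anyone who wants to try the mark-and-sweep idea end to end, here is a minimal sketch. The sample data and the should_be_deleted predicate are made up for illustration, and it assumes that undef is never a real value in the hash.

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data and predicate, for illustration only.
    my %hash = map { $_ => 1 } qw(foo fooa kkkfoo bar baz);
    sub should_be_deleted { $_[0] eq 'foo' }

    # Pass 1: mark. Assigning to an existing key neither adds nor
    # removes entries, so the each() iterator is not disturbed.
    while ( my ($key, $val) = each %hash ) {
        next unless defined $val and should_be_deleted($key);
        for my $k ( $key, $key.'a', 'kkk'.$key ) {
            $hash{$k} = undef if exists $hash{$k};
        }
    }

    # Pass 2: sweep. Deleting the entry most recently returned by
    # each() is the one deletion that is safe during iteration.
    while ( my ($key, $val) = each %hash ) {
        delete $hash{$key} unless defined $val;
    }

    print join(',', sort keys %hash), "\n";   # prints "bar,baz"
    ```

    The trade-off is two full passes over the hash in exchange for not holding a second list of keys in memory.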
    
  • 2021-01-11 12:46

    I recommend doing two passes because it's more robust. Hash order is effectively random, so there are no guarantees that you'll see the "primary" keys before the related ones. For example, if should_be_deleted() only detects the primary keys that aren't wanted and the related ones are calculated, you could end up processing unwanted data. A two-pass approach avoids this issue.

    my @unwanted;
    foreach my $key (keys %hash) {
        if (should_be_deleted($key)) {
             push @unwanted, $key;
             # push any related keys onto @unwanted
        }
    }
    
    delete @hash{@unwanted};
    
    foreach my $key (keys %hash) {
        # do something
    }
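
    A self-contained version of the same two-pass idea might look like the sketch below. The sample data, the predicate, and the related_keys helper are all hypothetical; the related-key naming ($key.'a', 'kkk'.$key) is only a guess at what "related" means for your data.

    ```perl
    use strict;
    use warnings;

    # Hypothetical data and helpers, for illustration only.
    my %hash = map { $_ => 1 } qw(foo fooa kkkfoo bar baz);
    sub should_be_deleted { $_[0] eq 'foo' }
    sub related_keys      { my ($key) = @_; return ( $key.'a', 'kkk'.$key ) }

    # Pass 1: collect the unwanted keys without touching the hash.
    my @unwanted;
    foreach my $key (keys %hash) {
        next unless should_be_deleted($key);
        push @unwanted, $key, related_keys($key);
    }

    # Pass 2: delete them all at once with a hash slice.
    # Keys that are not present in the hash are silently ignored.
    delete @hash{@unwanted};

    print join(',', sort keys %hash), "\n";   # prints "bar,baz"
    ```

    Because nothing is deleted until the first loop has finished, the result does not depend on the order in which the keys come out of the hash.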
    
  • 2021-01-11 12:49

    Based on the example in the question, you could use a grep to filter out the keys that match your $key token.
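
    As a rough sketch of that grep idea, assuming the unwanted keys are the ones that contain the token as a literal substring (the data and the $token value here are invented):

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample data and token, for illustration only.
    my %hash  = map { $_ => 1 } qw(foo fooa kkkfoo bar baz);
    my $token = 'foo';

    # Select every key that contains the token, then delete the
    # selection in one go with a hash slice.
    my @doomed = grep { index($_, $token) >= 0 } keys %hash;
    delete @hash{@doomed};

    print join(',', sort keys %hash), "\n";   # prints "bar,baz"
    ```

    Note that keys %hash builds the full key list in memory, which is worth keeping in mind for a hash with millions of entries.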

    Update

    Your comment has clarified your need. My suggestion is to determine the indexes of the keys that match your requirement and update your @keys list accordingly. The idea is to shrink @keys while looping over it, so that unnecessary iterations are avoided.

    I've implemented the simple grep as a customizable function here.

    sub matches { $_[0] =~ /$_[1]/ ? 1 : 0 }  # Simple grep implemented here
    
    my @keys = keys %hash;  # @keys should initially contain all keys
    
    while ( @keys ) {
    
        my $key = shift @keys;
        next unless should_be_deleted ($key);  # Skip keys that are wanted
    
        delete $hash{$key};                    # Remove the unwanted key itself
    
        # Related keys are assumed to contain $key as a literal substring
        my @indexes_to_delete = grep { matches ($keys[$_], qr/\Q$key\E/) } 0 .. $#keys;
    
        delete @hash { @keys[@indexes_to_delete] };     # Remove the related keys
    
        # Splice from the highest index down so that the remaining
        # indexes stay valid. Avoids needless iterations.
        splice @keys, $_, 1 foreach reverse @indexes_to_delete;
    }
    