perl to remove lines from file

春和景丽 2021-01-29 12:22

I have a file that looks like this:

ATOM 2517 O   VAL 160 8.337  12.679  -2.487
ATOM 2518 OXT VAL 160 7.646  12.461  -0.386
TER 
ATOM 2519 N   VAL 161 -14.431  5.789 -2         


        
4 Answers
  • 2021-01-29 12:44

    I realized too late that this was supposed to be written in Perl; by then I had already written it in Python. I'm posting it anyway in case it proves useful.

    #!/usr/bin/env python3
    import sys
    import glob
    import os
    
    try:
        directory = sys.argv[1]
    except IndexError:
        print("Usage: " + sys.argv[0] + " dir")
        print("Example: " + sys.argv[0] + " /home/user/dir/")
        sys.exit(1)
    
    for path in glob.glob(os.path.join(directory, 'File*_*MINvac.pdb')):
        with open(path, "r") as fin:
            content = fin.readlines()
    
        # A while loop is used because the list shrinks as lines are deleted.
        i = 0
        while i < len(content):
            if "TER" in content[i]:
                # Drop the three lines that follow the line after TER ...
                del content[i+2:i+5]
                # ... then TER itself and the line before it (if there is one).
                start = max(i - 1, 0)
                del content[start:i+1]
                i = start + 1  # continue after the kept line
            else:
                i += 1
    
        with open(path, "w") as fout:
            fout.writelines(content)
    

    Edit: Added support for multiple files, like the OP wanted.

  • 2021-01-29 12:49

    A simple line-by-line script.

    Usage: perl -i.bak script.pl fileglob

    E.g. perl -i.bak script.pl File*MINvac.pdb

    This will alter the original files in place and save a backup of each one with the extension .bak. Note that a TER line too close to the end of a file will cause warnings. On the other hand, so will the other solutions presented.

    If you do not wish to save backups (use caution, since changes are irreversible!), use -i instead.

    Code:

    #!/usr/bin/perl
    use v5.10;
    use strict;
    use warnings;
    
    my $prev;
    while (<>) {
        if (/^TER/) {
            print scalar <>;  # print next line
            <> for 1 .. 3;    # skip 3 lines
            $prev = undef;    # remove previous line
        } else {
            print $prev if defined $prev;
            $prev = $_;
        }
        if (eof) {  # New file next iteration?
            print $prev if defined $prev;
            $prev = undef;
        }
    }
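
For comparison, here is a minimal Python sketch of the same line-by-line idea, including the "keep a .bak backup" behaviour described above. The function names `filter_ter` and `rewrite_with_backup` are made up for this illustration, and the rule it applies (drop the line before TER, TER itself, and the three lines after the line that follows TER) is inferred from the answers in this thread rather than stated in the question.

```python
import shutil


def filter_ter(lines):
    """Return the lines with every TER block removed (hypothetical helper)."""
    out = []
    prev = None                      # held back until we know TER does not follow
    it = iter(lines)
    for line in it:
        if line.startswith("TER"):
            prev = None              # drop the line before TER
            kept = next(it, None)    # keep the line right after TER, if any
            if kept is not None:
                out.append(kept)
            for _ in range(3):       # drop the next three lines, if present
                next(it, None)
        else:
            if prev is not None:
                out.append(prev)
            prev = line
    if prev is not None:             # flush the last held-back line
        out.append(prev)
    return out


def rewrite_with_backup(path, suffix=".bak"):
    """Copy path to path+suffix, then overwrite path with the filtered lines."""
    backup = path + suffix
    shutil.copy2(path, backup)
    with open(backup, "r") as src:
        lines = src.readlines()
    with open(path, "w") as dst:
        dst.writelines(filter_ter(lines))
```

Because `next(it, None)` returns a default instead of raising, a TER line close to the end of the file is handled quietly in this sketch.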
    
  • 2021-01-29 12:56

    So, for each set of 6 consecutive lines, you want to discard all but the third line if the second line is a TER?

    TIMTOWTDI, but this should work:

    my @queue;
    while (<>) {
        push @queue, $_;
        @queue = $queue[2]  if @queue == 6 and $queue[1] =~ /^TER\s*$/;  # tolerate trailing spaces
        print shift @queue  if @queue == 6;
    }
    print @queue;  # assume no TERs in last 4 lines
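
The six-line window rule stated above can also be sketched in Python with a `collections.deque`. This is only an illustration of the same idea; the function name `drop_ter_windows` is invented here, and, like the Perl version, it assumes no TER appears in the last four lines of the input.

```python
from collections import deque


def drop_ter_windows(lines):
    """Slide a six-line window over the input (hypothetical helper).

    When the second line of a full window is TER, only the third line
    of that window is kept; otherwise the oldest line is emitted.
    """
    window = deque()
    out = []
    for line in lines:
        window.append(line)
        if len(window) == 6:
            if window[1].strip() == "TER":
                kept = window[2]
                window.clear()
                window.append(kept)
            else:
                out.append(window.popleft())
    out.extend(window)   # flush whatever is left in the window
    return out
```

Comparing with `strip()` rather than an exact match means a TER line with trailing spaces is still recognised.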
    
  • 2021-01-29 13:03

    use strict;
    use warnings;
    use Tie::File;
    
    my @array;
    
    tie @array, 'Tie::File', 'myFile.txt' or die "Unable to tie file";
    
    my %unwanted = map  { $_ => 1 }                # Hashify ...
                   map  { $_-1, $_, $_+2 .. $_+4 } # ... the five lines ...
                   grep { $array[$_] =~ /^TER/ }   # ... around 'TER'  ...
                   0 .. $#array ;                  # ... in the file
    
    # Remove the unwanted lines
    @array = map { $array[$_] } grep { ! $unwanted{$_} } 0 .. $#array;
    
    untie @array;  # The end
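
The index-based approach above (collect the positions to discard, then keep everything else) translates fairly directly to Python. The sketch below works on an in-memory list rather than a tied file, and `remove_by_index` is a name chosen for this example only.

```python
def remove_by_index(lines):
    """Drop the lines at offsets -1, 0, +2, +3 and +4 around each TER line."""
    unwanted = set()
    for i, line in enumerate(lines):
        if line.startswith("TER"):
            # Offset +1 is deliberately absent: that line is kept.
            unwanted.update([i - 1, i, i + 2, i + 3, i + 4])
    # Indices outside the list (e.g. -1) simply never match.
    return [line for i, line in enumerate(lines) if i not in unwanted]
```

Since out-of-range indices are just ignored by the final filter, a TER on the first line or near the end of the input needs no special handling here.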
    