Extract multiple columns from a txt file in Perl

I have a txt file like this:

#Genera columnA columnB columnC columnD columnN
x1       1       3       7      0.9      2
x2       5       3       13     7


                      
3 Answers

有刺的猬, 2021-01-17 06:13

You can simplify this by using hash slices.

#!/usr/bin/env perl
use strict;
use warnings;

my @wanted = ( '#Genera', qw( columnA columnC columnN ) );

open my $input, '<', 'file.txt' or die "Cannot open file.txt: $!";

# Read the header line; split ' ' also discards the trailing newline
my @header = split ' ', <$input>;

print join( "\t", @wanted ), "\n";
while ( <$input> ) {
    my %row;
    @row{@header} = split;    # hash slice: header name => field value
    print join( "\t", @row{@wanted} ), "\n";
}


Which outputs:

#Genera columnA columnC columnN
x1  1   7   2
x2  5   13  5
x3  0.1 7   0.4
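
To see what the hash slice does in isolation, here is a minimal sketch that works on in-memory lines instead of a file. The sub name pick_columns and the sample rows are made up for illustration; they are not part of the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: given the wanted column names, a header line and
# some data lines, return one tab-joined string per data line.
sub pick_columns {
    my ( $wanted, $header_line, @lines ) = @_;

    my @header = split ' ', $header_line;
    my @out;

    for my $line (@lines) {
        my %row;

        # Hash slice on the left: each field is stored under its header name.
        @row{@header} = split ' ', $line;

        # Hash slice on the right: values come back in the order of @$wanted.
        push @out, join( "\t", @row{@$wanted} );
    }
    return @out;
}

# Illustrative data only.
my @result = pick_columns(
    [qw( name b d )],
    'name a b c d',
    'r1 1 2 3 4',
    'r2 5 6 7 8',
);
print "$_\n" for @result;
```

Because the values are looked up by name, the wanted columns can be listed in any order, and the output follows that order rather than the order in the file.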


If you want to exactly match your indentation then add sprintf to the mix:

E.g.:

print join( "\t", map { sprintf "%8s", $_ } @wanted ), "\n";
while ( <$input> ) {
    my %row;
    @row{@header} = split;
    print join( "\t", map { sprintf "%8s", $_ } @row{@wanted} ), "\n";
}


Which then gives:

 #Genera     columnA     columnC     columnN
      x1           1           7           2
      x2           5          13           5
      x3         0.1           7         0.4
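
The sample in the question looks left-aligned, so a left-justified format may match it more closely. A small sketch, assuming a fixed column width of 8 characters and no tab separator (the helper name format_row is hypothetical): %-8s pads on the right, whereas %8s pads on the left.

```perl
use strict;
use warnings;

# Hypothetical helper: pad every field to a fixed width and concatenate.
# A negative width such as %-8s left-justifies; %8s right-justifies.
sub format_row {
    my ( $width, @fields ) = @_;
    my $line = join '', map { sprintf "%-${width}s", $_ } @fields;
    $line =~ s/\s+\z//;    # drop the padding after the last field
    return $line;
}

print format_row( 8, qw( x1 1 7 2 ) ), "\n";
```

Note that sprintf does not truncate: a field longer than the width is printed in full and pushes the following columns to the right.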

                                                                        
                                                        
            
            
              
                
青春惊慌失措, 2021-01-17 06:14

There are command line switches that are used for this kind of application:

perl -lnae 'print join "\t", @F[1,3,5]' file.txt


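The -a switch (autosplit, used together with -n) splits each input line on whitespace into the array @F. So @F[1,3,5] is an array slice of elements 1, 3, and 5. Indexes are zero-based, so element 0 is the #Genera column; use @F[0,1,3,5] if you want to keep it as well.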

The downside of this, of course, is that you have to use the column numbers instead of the names. 
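
For readers unfamiliar with the switches, the one-liner behaves roughly like the script below, shown as a sketch. Reading from an in-memory string, the sample rows, and the index list 0, 1, 3, 5 (which keeps the first column as well) are choices made here for illustration.

```perl
use strict;
use warnings;

# Illustrative input only; the one-liner reads file.txt instead.
my $data = <<'END';
#id colA colB colC colD colN
r1 1 3 7 0.9 2
r2 5 3 13 7 5
END

open my $in, '<', \$data or die "Cannot open in-memory file: $!";

my @lines;
while ( my $line = <$in> ) {     # -n: loop over every input line
    chomp $line;                 # -l: strip the line ending on input
    my @F = split ' ', $line;    # -a: autosplit on whitespace into @F
    push @lines, join( "\t", @F[ 0, 1, 3, 5 ] );
}
close $in;

print "$_\n" for @lines;         # -l: print adds the line ending back
```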
                                                                        
                                                        
            
            
              
                
有刺的猬, 2021-01-17 06:16

This program does as you ask. It expects the path to the input file as a parameter on the command line, which is then read using the empty "diamond operator" <> without opening it explicitly.

Each non-blank line of the file is split into fields, and the header line is identified by its first field starting with a hash symbol #.

A call to map converts the @wanted_fields array into a list of indexes into @fields where those column headers appear, and stores it in the array @idx, with index 0 prepended so that the first column is always kept.

This array is then used to slice the wanted columns from @fields for every line of input. The fields are printed, separated by tabs.

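use strict;
use warnings 'all';

use List::Util 'first';

my @wanted_fields = qw/ columnA columnC columnN /;

my @idx;    # column indexes to print; empty until the header has been seen

while ( <> ) {
    next unless /\S/;    # skip blank lines

    my @fields = split;

    # The header line is the one whose first field starts with '#'
    if ( $fields[0] =~ /^#/ ) {

        # Always keep column 0, then look up the index of each wanted header
        @idx = ( 0, map {
            my $wanted = $_;
            first { $fields[$_] eq $wanted } 0 .. $#fields;
        } @wanted_fields );
    }

    print join( "\t", @fields[@idx] ), "\n" if @idx;
}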


output

#Genera columnA columnC columnN
x1  1   7   2
x2  5   13  5
x3  0.1 7   0.4
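
One thing to watch for: if a wanted header is missing from the file, first returns undef and the slice then produces warnings rather than a clear error. A possible variation, sketched here with a hypothetical sub column_indexes, builds a name-to-index hash once and dies with an explicit message instead.

```perl
use strict;
use warnings;

# Hypothetical helper: map wanted column names to their positions in the
# header, dying if any name is absent.
sub column_indexes {
    my ( $header, $wanted ) = @_;

    my %index_of;
    @index_of{@$header} = 0 .. $#$header;    # header name => position

    my @missing = grep { !exists $index_of{$_} } @$wanted;
    die "Missing column(s): @missing\n" if @missing;

    return @index_of{@$wanted};
}

my @header = ( '#Genera', qw( columnA columnB columnC columnD columnN ) );
my @idx    = ( 0, column_indexes( \@header, [qw( columnA columnC columnN )] ) );
print "@idx\n";
```

With the header from the question this prints 0 1 3 5, the same indexes the map/first version computes. If a header name appears twice, this sketch keeps the last position, whereas first keeps the earliest.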

                                                                        
                                                        
            
            
              
                