How to improve grep efficiency in Perl when the number of files is huge

梦谈多话 2021-01-23 23:53

I want to grep some log information from log files in the following directory structure using Perl: $jobDir/jobXXXX/host.log where XXXX is

2 answers
  • 2021-01-24 00:27

    While it would be more elegant to use the matching built into Perl (see the other answer), calling the grep command can be faster, especially when there is a lot of data but only a few matches. However, the way you call it first runs grep and collects all of its output, and only then scans through that output. This needs more memory because the whole output is held at once, and you get no results until grep has finished. It is better to process each line as soon as grep produces it:

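    # Read grep's output through a pipe, one line at a time, instead of collecting it all first.
    open(my $fh, '-|', 'grep', '-r', 'information', $jobDir) or die "Cannot run grep: $!";
    while (<$fh>) {
        if (/\((\d+)\)(.*)\((\d+)\)/) {
            Output(xxxxxxxx);
        }
        $Num++;    # count every line grep matched
    }
    close $fh;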
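
    As a self-contained illustration of the streaming approach above, here is a minimal sketch. The helper name `count_with_grep` and the throwaway directory tree are made up for the demo, and it assumes a `grep` that supports `-r` is on the PATH. It also checks grep's exit status after `close`, since grep exits with 1 when nothing matched and with a larger value on a real error.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    # Hypothetical helper: stream `grep -r` output and count the matching lines.
    sub count_with_grep {
        my ($pattern, $dir) = @_;
        # List-form pipe open: no shell is involved, so $dir needs no quoting.
        open(my $fh, '-|', 'grep', '-r', '-e', $pattern, $dir)
            or die "Cannot run grep: $!";
        my $count = 0;
        while (my $line = <$fh>) {
            $count++;    # each line is handled as soon as it is read from the pipe
        }
        close $fh;       # waits for grep and sets $? to its wait status
        my $exit = $? >> 8;
        die "grep failed with exit status $exit" if $exit > 1;    # 1 only means "no matches"
        return $count;
    }

    # Demo on a temporary tree that mimics $jobDir/jobXXXX/host.log.
    my $grep_demo_dir = tempdir(CLEANUP => 1);
    for my $job (qw(job0001 job0002)) {
        mkdir "$grep_demo_dir/$job" or die "mkdir failed: $!";
        open(my $out, '>', "$grep_demo_dir/$job/host.log") or die "open failed: $!";
        print {$out} "information (1) ok (2)\n", "unrelated line\n";
        close $out;
    }
    my $grep_count = count_with_grep('information', $grep_demo_dir);
    print "$grep_count matching lines\n";
    ```

    Only one line is held in memory at a time, so memory use does not grow with the number of log files.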
    
  • 2021-01-24 00:43

    You should search the log files one by one and scan each file line by line, instead of reading the whole output of grep into memory, which can cost a lot of memory and slow down your program or even your whole system:

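    # untested script
    
    my $Num = 0;
    # glob expands the wildcard; each log is then opened and read one line at a time
    foreach my $log (glob("$jobDir/job*/host.log")) {
        open my $logfh, '<', $log or die "Cannot open $log: $!";
        while (<$logfh>) {
            if (m/information/) {
                if (m/\((\d+)\)(.*)\((\d+)\)/) {
                    Output(xxx);
                }
                $Num++;    # count lines containing "information"
            }
        }
        close $logfh;
    }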
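
    The per-file scan above can be tried out with the following minimal sketch. The helper name `scan_logs`, the sample log lines, and the temporary directory are made up for the demo. It also assumes the directory path contains no whitespace, because `glob` splits its argument on spaces. The cheap `index` test for the fixed string runs before the regex, and the regex is compiled once with `qr//`. Both are small optimizations of my own, not part of the answer above.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    # Hypothetical helper: scan every job*/host.log under $dir without spawning grep.
    sub scan_logs {
        my ($dir) = @_;
        my $detail = qr/\((\d+)\)(.*)\((\d+)\)/;    # compiled once, reused for every line
        my $lines  = 0;                              # lines containing "information"
        my @hits;                                    # [first number, last number] pairs
        for my $log (glob("$dir/job*/host.log")) {
            open(my $fh, '<', $log) or die "Cannot open $log: $!";
            while (my $line = <$fh>) {
                # Fixed-string test first; lines without the keyword are skipped here.
                next if index($line, 'information') < 0;
                $lines++;
                push @hits, [$1, $3] if $line =~ $detail;
            }
            close $fh;
        }
        return ($lines, \@hits);
    }

    # Demo on a temporary tree that mimics $jobDir/jobXXXX/host.log.
    my $scan_demo_dir = tempdir(CLEANUP => 1);
    for my $job (qw(job0001 job0002)) {
        mkdir "$scan_demo_dir/$job" or die "mkdir failed: $!";
        open(my $out, '>', "$scan_demo_dir/$job/host.log") or die "open failed: $!";
        print {$out} "information (12) step (34)\n",
                     "information without numbers\n",
                     "unrelated line\n";
        close $out;
    }
    my ($scan_lines, $scan_hits) = scan_logs($scan_demo_dir);
    print "$scan_lines lines mention the keyword, ", scalar(@$scan_hits), " carry numbers\n";
    ```

    Because each file is read line by line, memory use stays flat no matter how many job directories exist. Only the list of file names returned by `glob` grows with the number of files.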
    