How to improve grep efficiency in Perl when the number of files is huge

Asked 2021-01-23 23:53

I want to grep some log information from the log files located in the following directory structure using Perl: $jobDir/jobXXXX/host.log where XXXX is

2 Answers
  • 2021-01-24 00:27

    While it would be more elegant to use Perl's built-in matching (see the other answer), calling the external grep command can be more efficient, especially when there is a lot of data but only a few matches. However, the way you currently call it first runs grep to completion, collects all of its output, and only then scans through it. That needs more memory, and you have to wait for the entire output before processing the first line. It is better to process each line as soon as grep emits it, by reading from a pipe:

    # The list form of a pipe open streams grep's output line by line
    # and avoids any shell quoting issues.
    my $Num = 0;
    open( my $fh, '-|', 'grep', '-r', 'information', $jobDir )
        or die "Cannot start grep: $!";
    while (<$fh>) {
        if (/\((\d+)\)(.*)\((\d+)\)/) {
            Output(xxxxxxxx);    # placeholder from the question
        }
        $Num++;    # line count
    }
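
    One detail the loop above omits is closing the pipe. When reading from a command pipe, close returns false if the child exits with a nonzero status, and grep exits with status 1 merely because nothing matched, so only higher statuses indicate a real failure. A minimal sketch of that check:

    # grep's exit convention: 0 = matches found, 1 = no matches, >1 = error
    unless (close $fh) {
        my $exit = $? >> 8;
        warn "grep failed (exit status $exit)\n" if $exit > 1;
    }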
    
  • 2021-01-24 00:43

    You should search those log files one by one and scan each file line by line, instead of reading grep's entire output into memory (which can cost a lot of memory and slow down your program, or even your whole system):

    # untested script
    
    my $Num = 0;
    foreach my $log (<$jobDir/job*/host.log>) {    # glob one level of job directories
        open my $logfh, '<', $log or die "Cannot open $log: $!";
        while (<$logfh>) {
            if (m/information/) {                  # cheap substring filter first
                if (m/\((\d+)\)(.*)\((\d+)\)/) {   # then extract the fields
                    Output(xxx);                   # placeholder from the question
                }
                $Num++;
            }
        }
        close $logfh;
    }
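
    If the job directories can nest deeper than the single job*/host.log level, a fixed glob pattern will miss them; core Perl's File::Find can walk the whole tree instead. The following is a minimal sketch under that assumption (the /information/ filter and the Output() placeholder are carried over from the answers above):

    use strict;
    use warnings;
    use File::Find;

    my $jobDir = shift @ARGV or die "Usage: $0 <jobDir>\n";
    my $Num    = 0;

    find(sub {
        return unless $_ eq 'host.log';    # only the per-job log files
        # find() has chdir'd into the file's directory, so open by basename
        open my $logfh, '<', $_ or die "Cannot open $File::Find::name: $!";
        while (my $line = <$logfh>) {
            next unless $line =~ /information/;
            # Output(...) would be called here, as in the loop above
            $Num++;
        }
        close $logfh;
    }, $jobDir);

    print "Matched $Num lines\n";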
    