How do I count the characters, words, and lines in a file, using Perl?

What is a good/best way to count the number of characters, words, and lines of a text file using Perl (without using wc)?

10 Answers

不要未来只要你来

2020-12-31 03:32

Reading the file in fixed-size chunks may be more efficient than reading line-by-line. The wc binary does this.

#!/usr/bin/env perl
use strict;
use warnings;

use constant BLOCK_SIZE => 16384;

for my $file (@ARGV) {
    open my $fh, '<', $file or do {
        warn "couldn't open $file: $!\n";
        next;    # skip files that cannot be opened
    };

    my ($chars, $words, $lines) = (0, 0, 0);

    # True when the previous block ended in the middle of a word.
    my $in_word = 0;
    while (my $size = sysread $fh, my $buf, BLOCK_SIZE) {
        $chars += $size;                   # bytes read
        $lines += () = $buf =~ /\n/g;      # newline characters
        $words += () = $buf =~ /\S+/g;     # runs of non-whitespace
        # A word that straddles two blocks was counted in both.
        $words-- if $in_word && $buf =~ /\A\S/;
        $in_word = $buf =~ /\S\z/ ? 1 : 0;
    }

    print "\t$lines\t$words\t$chars\t$file\n";
}


花落未央

2020-12-31 03:41

Here's the Perl code. Counting words can be somewhat subjective, but I just say a word is any string of characters that isn't whitespace.

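open(my $fh, '<', 'file.txt') or die "Could not open file: $!";

my ($lines, $words, $chars) = (0, 0, 0);

while (my $line = <$fh>) {
    $lines++;
    $chars += length($line);
    # split ' ' skips leading whitespace, so indented lines
    # do not produce an extra empty field.
    my @fields = split ' ', $line;
    $words += @fields;
}
close($fh);

print("lines=$lines words=$words chars=$chars\n");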


萌比男神i

2020-12-31 03:48
A variation on bmdhacks' answer that will probably produce better results is to use \s+ (or, even better, \W+) as the delimiter. Consider the string "The  quick  brown fox" (with additional spaces between the first three words, in case that is not obvious). Using a single whitespace character as the delimiter gives a word count of six, not four. So, try:
```
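open(my $fh, '<', 'file.txt') or die "Could not open file: $!";

my ($lines, $words, $chars) = (0, 0, 0);

while (my $line = <$fh>) {
    $lines++;
    $chars += length($line);
    # Drop the empty leading field that split returns when a line
    # starts with a non-word character (indentation, a quote, ...).
    $words += grep { length } split(/\W+/, $line);
}
close($fh);

print("lines=$lines words=$words chars=$chars\n");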
```
Using \W+ as the delimiter stops punctuation (amongst other things) from being counted as words, though it also splits a word such as "don't" in two.
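To make the difference between the delimiters concrete, here is a small sketch that splits one sample string three ways and prints how many fields each delimiter yields. The sample string is made up for illustration (it has two spaces after "The" and after "quick", plus a stand-alone dash); it is not taken from the answer above.

```perl
use strict;
use warnings;

# Sample text (an assumption): doubled spaces and a stand-alone dash.
my $text = "The  quick  brown fox - jumps";

my @by_space   = split / /,   $text;   # every single space is a delimiter
my @by_ws_run  = split /\s+/, $text;   # runs of whitespace are one delimiter
my @by_nonword = split /\W+/, $text;   # runs of non-word characters

printf "single space: %d fields\n", scalar @by_space;
printf "\\s+:          %d fields\n", scalar @by_ws_run;
printf "\\W+:          %d fields\n", scalar @by_nonword;
```

With this input, the single-space split produces empty fields for the doubled spaces, \s+ still counts the dash as a word, and \W+ counts only the five real words.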
情书的邮戳

2020-12-31 03:48
This may be helpful to Perl beginners. I tried to simulate Microsoft Word's counting features and added one more count that wc on Linux does not report.
- number of lines
- number of words
- number of characters with spaces
- number of characters without spaces (wc does not give this in its output, but Microsoft Word shows it)
Here is the link: Counting words, characters and lines in a file
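The four counts listed above could be computed roughly as follows. This is only an illustrative sketch, not the code behind the link: the subroutine name `text_stats` and the sample text are hypothetical, "space" is assumed to mean any whitespace character, and lines are counted as newline characters (which may differ from how Microsoft Word counts them).

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference with the four counts.
sub text_stats {
    my ($text) = @_;
    my %stats;
    $stats{lines}          = () = $text =~ /\n/g;    # newline characters
    $stats{words}          = () = $text =~ /\S+/g;   # runs of non-whitespace
    $stats{chars}          = length $text;           # including whitespace
    $stats{chars_no_space} = () = $text =~ /\S/g;    # excluding whitespace
    return \%stats;
}

# Sample input (an assumption); a real script would read the file instead.
my $stats = text_stats("Hello brave new world\nsecond line\n");
printf "%-15s %d\n", $_, $stats->{$_}
    for qw(lines words chars chars_no_space);
```

For a file, the same subroutine can be called on the whole contents read into a string, which is fine for small files; the chunked approach from the first answer is more appropriate for very large ones.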