Succinct way to print all lines up until the last line that matches a given pattern

只谈情不闲聊 submitted on 2019-12-23 08:50:08

Question


I'm trying to find a succinct shell one-liner that'll give me all the lines in a file up until some pattern.

The use case is dumping all the lines in a log file until I spot some marker indicating that the server has been restarted.

Here's a stupid shell-only way to do that:

tail_file_to_pattern() {
    pattern=$1
    file=$2

    tail -n$((1 + $(wc -l $file | cut -d' ' -f1) - $(grep -E -n "$pattern" $file | tail -n 1 | cut -d ':' -f1))) $file
}

A slightly more reliable Perl way that takes the file on stdin:

perl -we '
    push @lines => $_ while <STDIN>;
    my $pattern = $ARGV[0];
    END {
        my $last_match = 0;
        for (my $i = @lines; $i--;) {
            $last_match = $i and last if $lines[$i] =~ /$pattern/;
        }
        print @lines[$last_match..$#lines];
    }
'

And of course you could do that more efficiently by opening the file, seeking to the end, and seeking back until you found a matching line.

It's easy to print everything as of the first occurrence, e.g.:

sed -n '/PATTERN/,$p'

But I haven't come up with a way to print everything as of the last occurrence.


Answer 1:


Here's a sed-only solution. To print every line in $file starting with the last line that matches $pattern:

sed -e "H;/${pattern}/h" -e '$g;$!d' "$file"

Note that like your examples, this only works properly if the file contains the pattern. Otherwise, it outputs the entire file.

Here's a breakdown of what it does, with sed commands in brackets:

  • [H] Append every line to sed's "hold space" but do not echo it to stdout [d].
  • When we encounter the pattern, [h] throw away the hold space and start over with the matching line.
  • When we get to the end of the file, copy the hold space to the pattern space [g] so it will echo to stdout.
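
The steps above can be traced on a small input. This is only a sketch: the log contents and the variable names are made up for illustration, and the sed command is the one from this answer.

```shell
# Sample log with two restart markers (contents invented for this example).
log=$(mktemp)
printf '%s\n' 'boot' 'RESTART 1' 'req a' 'RESTART 2' 'req b' 'req c' > "$log"

pattern='RESTART'

# H appends each line to the hold space, h resets the hold space to the
# matching line, and on the last line g copies the hold space back to the
# pattern space so that it is printed. $!d discards every other line.
last_section=$(sed -e "H;/${pattern}/h" -e '$g;$!d' "$log")
printf '%s\n' "$last_section"

rm -f "$log"
```

With this input the hold space is reset twice, so only the lines from "RESTART 2" onward survive to the end.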

Also note that it's likely to get slow with very large files, since any single-pass solution will need to keep a bunch of lines in memory.




Answer 2:


Load the data into an array line by line, and throw the array away when you find a pattern match. Print out whatever is left at the end.

 while (<>) {
     @x=() if /$pattern/;
     push @x, $_;
 }
 print @x;

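As a one-liner (the pattern is passed as the first argument, because the shell does not expand $pattern inside single quotes):

 perl -ne 'BEGIN{$pattern=shift} @x=() if /$pattern/;push @x,$_;END{print @x}' "$pattern" input-file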



Answer 3:


Alternatively: tac "$file" | sed -n '/PATTERN/,$p' | tac

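EDIT: If you don't have tac, emulate it by defining:

tac() {
    # Number the lines, sort by number descending, then strip the numbers.
    # cut -f2- keeps the whole line even if it contains tabs.
    cat -n | sort -nr | cut -f2-
}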

Ugly but POSIX.
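
It may help to see what the reversed pipeline selects. The sketch below assumes a tac command is available (for example from GNU coreutils) and uses an invented sample log. The range form prints from the start of the file through the last match, while swapping in sed's q command prints from the last match to the end of the file.

```shell
# Sample log (contents invented for this example).
log=$(mktemp)
printf '%s\n' 'boot' 'RESTART 1' 'req a' 'RESTART 2' 'req b' > "$log"

# Start of file through the last matching line:
# in the reversed stream, print from the first match to the end.
head_part=$(tac "$log" | sed -n '/RESTART/,$p' | tac)

# Last matching line through end of file:
# in the reversed stream, quit right after the first match.
tail_part=$(tac "$log" | sed '/RESTART/q' | tac)

printf 'up to last match:\n%s\n' "$head_part"
printf 'from last match:\n%s\n' "$tail_part"

rm -f "$log"
```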




Answer 4:


I suggest a simplification of your shell script:

tail -n +$(grep -En "$pattern" "$file" | tail -1 | cut -d: -f1) "$file"

It's substantially more concise because it:

  • Uses tail's + option to print from the given line to the end, rather than having to calculate the distance from there to the end.
  • Uses more concise ways of expressing command line options.

And it fixes a bug by quoting $file (so it will work on files whose names contain spaces).
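
One case the one-liner does not cover is a file with no match: grep then prints nothing and tail receives a bare + as its line count. A possible guard is sketched below. The function name print_from_last_match and the sample log are hypothetical, and returning a non-zero status on no match is just one reasonable choice.

```shell
# Hypothetical helper: print from the last matching line to the end of the file.
print_from_last_match() {
    pattern=$1
    file=$2
    # Line number of the last matching line; empty if nothing matched.
    start=$(grep -En "$pattern" "$file" | tail -n 1 | cut -d: -f1)
    if [ -z "$start" ]; then
        return 1    # no match: print nothing and report failure
    fi
    tail -n "+$start" "$file"
}

# Sample log (contents invented for this example).
log=$(mktemp)
printf '%s\n' 'boot' 'RESTART 1' 'req a' 'RESTART 2' 'req b' > "$log"

matched=$(print_from_last_match 'RESTART' "$log")
status=0
print_from_last_match 'NO SUCH MARKER' "$log" || status=$?

rm -f "$log"
```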




Answer 5:


Sed's q command will do the trick:

sed "/$pattern/q" "$file"

That prints every line up to and including the first line that matches the pattern, and then sed quits.




Answer 6:


This question's title and description don't match.

For the question's title, +1 for @David W.'s answer. Also:

sed -ne '1,/PATTERN/p'

For the question in the body, you've already got some solutions.

Note that tac is probably specific to Linux. It doesn't seem to exist in BSD or OSX. If you want a solution that's multi-platform, don't rely on tac.

Of course, just about any solution is going to require that your data either be spooled in memory, or submitted once for analysis and a second time for processing. For example:

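#!/usr/local/bin/bash

# Spool stdin to a temporary file so it can be read twice.
tmpfile="/tmp/$(basename "$0"),$$"
trap 'rm -f "$tmpfile"' 0 1 2 5
cat > "$tmpfile"

# Count the lines from the last match through the end of the input.
n=$(awk '/PATTERN/{n=NR}END{print NR-n+1}' "$tmpfile")

tail -$n "$tmpfile"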

Note that my use of tail is for FreeBSD. If you use Linux, you'll probably need tail -n $n $tmpfile instead.
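
When the input is a regular file rather than a stream, the "analyze once, process once" idea can also be sketched without a temporary file by letting awk read the same file twice. This is an illustration under that assumption, not a drop-in replacement for the script above; the sample log and variable names are invented.

```shell
# Sample log (contents invented for this example).
log=$(mktemp)
printf '%s\n' 'boot' 'RESTART 1' 'req a' 'RESTART 2' 'req b' > "$log"

# The file is named twice on the command line.
# First pass (NR == FNR): remember the line number of the last match.
# Second pass: print every line from that number onward.
from_last=$(awk '
    NR == FNR { if (/RESTART/) last = FNR; next }
    FNR >= last
' "$log" "$log")
printf '%s\n' "$from_last"

rm -f "$log"
```

If nothing matches, last stays unset and the second pass prints the whole file, which is the same fallback behaviour as the script above.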




Answer 7:


Rob Davis pointed out to me that what you said you wanted isn't what you actually asked for:

You said:

I'm trying to find a succinct shell one-liner that'll give me all the lines in a file up until some pattern.

but then at the very end of your post, you said:

But I haven't come up with a way to print everything as of the last occurrence.

I've already given you the answer to your first question. Here's a one-line answer to your second question, printing from the first line that matches a regular expression to the end of the file:

awk '{ if ($0 ~ /'"$pattern"'/) { flag = 1 } if (flag == 1) { print $0 } }' "$file"

A similar Perl one-liner:

export pattern="<regex>"
export file="<file>"
perl -ne '$flag=1 if /$ENV{pattern}/;print if $flag;' "$file"
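
Splicing $pattern into the awk program text stops working if the pattern contains a slash. A possible alternative, sketched here with an invented sample log, is to hand the pattern to awk as a variable with -v and match against it dynamically. Note that awk processes backslash escapes in -v values, so patterns containing backslashes may need extra care.

```shell
# Sample log (contents invented for this example).
log=$(mktemp)
printf '%s\n' 'boot' 'RESTART 1' 'req a' 'RESTART 2' 'req b' > "$log"

pattern='RESTART [0-9]+'

# Set the flag on the first matching line, then print every line once it is set.
from_first=$(awk -v pat="$pattern" '$0 ~ pat { flag = 1 } flag' "$log")
printf '%s\n' "$from_first"

rm -f "$log"
```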


Source: https://stackoverflow.com/questions/8962629/succinct-way-to-print-all-lines-up-until-the-last-line-that-matches-a-given-patt
