Unix sed: substitute nth occurrence malfunction?

抹茶落季 2021-01-24 19:07

Let's say I have a string which contains multiple occurrences of the letter Z. For example: aaZbbZccZ. I want to print parts of that string, each time until the nex

3 answers
  • 2021-01-24 19:33

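    The sed command below comes close to what you are looking for, except that the first substitution also removes the last Z, so the second substitution appends it again. It relies on GNU sed, where combining a number N with the g flag replaces the Nth match and every match after it.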

    $ echo aaZbbZccZdd | sed -e 's/Z[^Z]*//1g;s/$/Z/'
    aaZ
    
    $ echo aaZbbZccZdd | sed -e 's/Z[^Z]*//2g;s/$/Z/'
    aaZbbZ
    
    $ echo aaZbbZccZdd | sed -e 's/Z[^Z]*//3g;s/$/Z/'
    aaZbbZccZ
    
    $ echo aaZbbZccZdd | sed -e 's/Z[^Z]*//4g;s/$/Z/'
    aaZbbZccZddZ
    

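    Edit: Modified according to Aaron's suggestion.

    Edit 2: If you don't know how many Z characters the string contains, the command below is safer, because it appends a Z only when the line does not already end in one. With the first version, an extra Z is added whenever N exceeds the number of matches.
    -r enables extended regular expressions (GNU sed).
    -e adds a script expression; chaining several -e options is equivalent to separating commands with ; but easier to read, in my opinion.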

    $ echo aaZbbZccZddZ | sed -r -e 's/Z[^Z]*//1g' -e 's/([^Z])$/\1Z/'
    aaZ
    
    $ echo aaZbbZccZddZ | sed -r -e 's/Z[^Z]*//2g' -e 's/([^Z])$/\1Z/'
    aaZbbZ
    
    $ echo aaZbbZccZddZ | sed -r -e 's/Z[^Z]*//3g' -e 's/([^Z])$/\1Z/'
    aaZbbZccZ
    
    $ echo aaZbbZccZddZ | sed -r -e 's/Z[^Z]*//4g' -e 's/([^Z])$/\1Z/'
    aaZbbZccZddZ
    
    $ echo aaZbbZccZddZ | sed -r -e 's/Z[^Z]*//5g' -e 's/([^Z])$/\1Z/'
    aaZbbZccZddZ
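
To print every prefix in one go, the command above could be wrapped in a loop over N. The following is only a sketch: the function name `print_prefixes` is made up for illustration, and it assumes GNU sed (for `-r` and for combining a number with the `g` flag) and an input string that ends in Z.

```shell
# Hypothetical helper, not part of the original answer.
# Prints each prefix of the argument that ends at a Z.
print_prefixes() {
    s=$1
    # Count the Z characters: delete everything else and measure what is left.
    count=$(printf '%s' "$s" | tr -cd 'Z' | wc -c)
    # Normalize the number (some wc implementations pad it with spaces).
    count=$((count))
    n=1
    while [ "$n" -le "$count" ]; do
        # Remove the nth "Z plus following non-Z run" and all later ones,
        # then put back the Z that the first substitution removed.
        printf '%s\n' "$s" | sed -r -e "s/Z[^Z]*//${n}g" -e 's/([^Z])$/\1Z/'
        n=$((n + 1))
    done
}

print_prefixes 'aaZbbZccZ'
```

This starts one sed process per Z, so it is only reasonable for short strings.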
    
  • 2021-01-24 19:40

    This should do what you expect (see comments) unless your string can contain line breaks:

    # -n will prevent default printing
    echo 'aaZbbZccZ' | sed -n '{
        # Add a line break after each Z
        s/Z/Z\
    /g
        # Print it and consume it in the next sed command
        p
    }' | sed -n '{
        # Store the first line in the hold space without printing it
        1 {
            h
        }
        # For every following line
        2,$ {
            # Exchange pattern space and hold space to bring back the accumulated text
            x
            # Remove the line break added by the previous H
            s/\n//
            # Print the accumulated prefix
            p
            # Exchange again so the current line is back in the pattern space
            x
            # Append the current line to the hold space (H inserts a newline first)
            H
        }
    }'
    

    The idea is to break the string into multiple lines, each terminated with Z, which the second sed command then processes one by one.

    In the second sed command the hold space remembers the lines seen so far. For each new line we print the accumulated result, append the new line, and remove the line break that the append introduced.

    And the output is

    aaZ
    aaZbbZ
    aaZbbZccZ
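
The same hold-space idea can be written more compactly. This is a sketch rather than the answer's own script: the function name `prefixes_hold` is hypothetical, and it assumes the input ends in Z and contains no line breaks. Each non-empty line is appended to the hold space with `H`, copied back with `g`, stripped of its newlines, and printed.

```shell
# Hypothetical helper illustrating the split-then-accumulate approach.
prefixes_hold() {
    printf '%s\n' "$1" |
        # First pass: put a line break after every Z.
        sed 's/Z/Z\
/g' |
        # Second pass, for every non-empty line:
        #   H        append the line to the hold space
        #   g        copy the hold space into the pattern space
        #   s/\n//g  drop the newlines that H inserted
        #   p        print the accumulated prefix
        sed -n '/./{H;g;s/\n//g;p;}'
}

prefixes_hold 'aaZbbZccZ'
```

Because `g` copies the hold space instead of exchanging it, the first line needs no special case.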
    
  • 2021-01-24 19:41

    This might work for you (GNU sed):

    sed -n 's/Z/&\n/g;:a;/\n/P;s/\n\(.*Z\)/\1/;ta' file
    

    Use sed's grep-like option -n so that output is printed only explicitly. Append a newline after each Z. If no substitution took place (the line contains no Z), nothing is printed. Otherwise, print up to the first newline, remove that newline if a Z still follows it, and repeat until the substitution fails.
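
For readers who find the one-liner dense, here is the same script spread over several lines with one comment per command. It is a sketch that assumes GNU sed (for `\n` in the replacement); the wrapper function `prefixes_loop` is a made-up name and reads from an argument instead of a file.

```shell
# Hypothetical wrapper around the one-liner above.
prefixes_loop() {
    printf '%s\n' "$1" | sed -n '
        # Append a newline after every Z
        s/Z/&\n/g
        :a
        # If the pattern space contains a newline, print up to the first one
        /\n/P
        # Remove the first newline, but only if a Z still follows it
        s/\n\(.*Z\)/\1/
        # Branch back to label a when that substitution succeeded
        ta
    '
}

prefixes_loop 'aaZbbZccZ'
```

The loop ends once only the trailing newline is left, because no Z follows it and the substitution fails.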
