How to use sed/grep to extract text between two words?

春和景丽 2020-11-22 05:25

I am trying to output a string that contains everything between two words of a string:

input:

"Here is a String"

output:

12 Answers
  • 2020-11-22 06:06

    The accepted answer does not remove text that could be before Here or after String. This will:

    sed -e 's/.*Here\(.*\)String.*/\1/'
    

    The main difference is the addition of .* immediately before Here and after String.
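
To make the difference concrete, here is a minimal sketch that runs both sed commands on the same line. The sample line and the variable names `line`, `partial`, and `exact` are made up for illustration; only the two sed expressions come from the answers in this thread.

```shell
# Hypothetical sample line with extra text on both sides of the delimiters.
line='abc Here is a String xyz'

# Without the surrounding .*, only the two delimiter words are removed.
partial=$(printf '%s\n' "$line" | sed -e 's/Here\(.*\)String/\1/')

# With .* on both sides, the whole line is replaced by the captured group.
exact=$(printf '%s\n' "$line" | sed -e 's/.*Here\(.*\)String.*/\1/')

printf '[%s]\n' "$partial"   # [abc  is a  xyz]
printf '[%s]\n' "$exact"     # [ is a ]
```

The brackets in the output only make the leading and trailing spaces visible.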

  • 2020-11-22 06:09
    sed -e 's/Here\(.*\)String/\1/'
    
  • 2020-11-22 06:10

    GNU grep with the -P option also supports positive and negative lookahead and lookbehind. For your case, the command would be:

    echo "Here is a string" | grep -o -P '(?<=Here).*(?=string)'
    

    If there are multiple occurrences of Here and string, you can choose whether to match from the first Here to the last string or to match each pair individually. In regex terms, these are a greedy match (first case) and a non-greedy match (second case).

    $ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*(?=string)' # Greedy match
     is a string, and Here is another 
    $ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*?(?=string)' # Non-greedy match (note the '?' after '*' in .*?)
     is a 
     is another 
    
  • 2020-11-22 06:10

    If you have a long file with many multi-line occurrences, it is useful to number the lines first:

    cat -n file | sed -n '/Here/,/String/p'
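
As a rough sketch of what that pipeline prints, the snippet below builds a small throwaway file and runs the same command on it. The file contents and the variable names `tmpfile` and `numbered` are assumptions for illustration only.

```shell
# Hypothetical sample file; its contents are made up for this demo.
tmpfile=$(mktemp)
printf '%s\n' 'intro' 'Here is' 'a multi-line' 'String block' 'outro' > "$tmpfile"

# cat -n prefixes every line with its line number; sed -n '/Here/,/String/p'
# then prints only the range from a line matching Here through the next
# line matching String.
numbered=$(cat -n "$tmpfile" | sed -n '/Here/,/String/p')

# Prints lines 2 through 4 of the file, each with its number in front.
printf '%s\n' "$numbered"

rm -f "$tmpfile"
```

Note that sed looks for the end pattern starting on the line after the one that opened the range, so a line containing both Here and String does not close the range by itself.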
    
  • 2020-11-22 06:13

    Using GNU awk:

    $ echo "Here is a string" | awk -v FS="(Here|string)" '{print $2}'
     is a 
    

    grep with the -P (--perl-regexp) option supports \K, which discards the characters matched so far from the output. In our case, the previously matched string was Here, so it is dropped from the final output.

    $ echo "Here is a string" | grep -oP 'Here\K.*(?=string)'
     is a 
    $ echo "Here is a string" | grep -oP 'Here\K(?:(?!string).)*'
     is a 
    

    If you want the output to be is a, without the surrounding spaces, you could try the following:

    $ echo "Here is a string" | grep -oP 'Here\s*\K.*(?=\s+string)'
    is a
    $ echo "Here is a string" | grep -oP 'Here\s*\K(?:(?!\s+string).)*'
    is a
    
  • 2020-11-22 06:15

    The greedy solutions above fall short when the closing search string is repeated elsewhere in the input. I found it best to write a bash function.

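        function str_str {
          local str
          # Strip everything up to and including the first occurrence of $2.
          # Quoting $2 makes it a literal string rather than a glob pattern.
          str="${1#*"$2"}"
          # Strip everything from the first occurrence of $3 onward.
          str="${str%%"$3"*}"
          # printf avoids the portability quirks of echo -n.
          printf '%s' "$str"
        }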
    
        # test it ...
        mystr="this is a string"
        str_str "$mystr" "this " " string"
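
The case this answer is aimed at can be sketched as follows, assuming an input in which the end word appears twice. The function is repeated so the snippet runs on its own; the sample string and the variables `first` and `greedy` are hypothetical, and the sed line is only there for comparison with a greedy match.

```shell
# Same idea as str_str above, repeated so this snippet is self-contained.
str_str() {
  local str
  str="${1#*"$2"}"      # drop everything through the first occurrence of $2
  str="${str%%"$3"*}"   # drop everything from the first occurrence of $3 onward
  printf '%s' "$str"
}

# Hypothetical input in which the end word occurs twice.
mystr='this is a string and another string'

# Parameter expansion stops at the first " string".
first=$(str_str "$mystr" 'this ' ' string')

# A greedy sed match runs on to the last " string".
greedy=$(printf '%s\n' "$mystr" | sed -e 's/.*this \(.*\) string.*/\1/')

printf '[%s]\n' "$first"    # [is a]
printf '[%s]\n' "$greedy"   # [is a string and another]
```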
    