How to match a string not followed by a word using sed

-上瘾入骨i 2021-01-24 10:11

I need to delete every occurrence of a hyphen followed by a space, but only when that space is not followed by the word "og". Example file:

Kult         


        
4 Answers
  • 2021-01-24 10:43

    This might work for you (GNU sed):

    sed -r 's/(- (og|eller))|- /\1/g' file
    

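    This relies on alternation. When the first alternative matches "- og" or "- eller", the backreference \1 puts the matched text back unchanged. When only the second alternative "- " matches, \1 is empty, so the match is deleted.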
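
A minimal sketch of that behaviour on a shortened sample line. The variable names are only illustrative, and -E is used as an equivalent spelling of -r that GNU sed accepts:

```shell
# Shortened sample line: one protected "- og" and one hyphenation to remove.
input='Kultur- og idrettsavdelinga nyska- pande kunst'

# "- og" / "- eller" are captured and written back via \1;
# a bare "- " leaves \1 empty, so it is deleted.
result=$(printf '%s\n' "$input" | sed -E 's/(- (og|eller))|- /\1/g')

printf '%s\n' "$result"
# prints: Kultur- og idrettsavdelinga nyskapande kunst
```

Note that the pattern as written also protects longer words that merely start with og or eller, so it may need tightening if such words occur in the input.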

  • 2021-01-24 10:48

    Given this input file (I added some "- eller" cases, since you said in a comment that you need to handle them too):

    $ cat file
    Kultur- og idrettsavdelinga skapar- eller nyska- pande kunst og utvik- lar- eller samfunnet
    

    here's the common, idiomatic sed approach:

    $ sed 's/a/aA/g; s/- og/aB/g; s/- eller/aC/g; s/- //g; s/aC/- eller/g; s/aB/- og/g; s/aA/a/g' file
    Kultur- og idrettsavdelinga skapar- eller nyskapande kunst og utviklar- eller samfunnet
    

    The above works by first turning every a (or any other character you like that does not occur in your target strings) into aA. We can then turn the strings we want to keep, "- og" and "- eller", into a followed by some other character, e.g. aB and aC. At that point the only occurrences of aB and aC in the line are the newly transformed "- og" and "- eller", since every original a is now aA.

    Now we can just remove all remaining "- " sequences, then convert each aC back to "- eller" and each aB back to "- og", and finally turn every aA back into the original a.
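
To make the intermediate states visible, here is a rough step-by-step sketch of the same idea on a shortened line, protecting only "- og". The step1..step4 variables are just for illustration; in practice the single chained command is what you would run:

```shell
line='Kultur- og nyska- pande'

# 1. Escape every a, so that "aB" cannot already occur in the text.
step1=$(printf '%s\n' "$line" | sed 's/a/aA/g')
# 2. Protect the string we want to keep.
step2=$(printf '%s\n' "$step1" | sed 's/- og/aB/g')
# 3. Remove every remaining "- ".
step3=$(printf '%s\n' "$step2" | sed 's/- //g')
# 4. Restore the protected string, then undo the escaping.
step4=$(printf '%s\n' "$step3" | sed 's/aB/- og/g; s/aA/a/g')

printf '%s\n' "$step1" "$step2" "$step3" "$step4"
# prints:
# Kultur- og nyskaA- paAnde
# KulturaB nyskaA- paAnde
# KulturaB nyskaApaAnde
# Kultur- og nyskapande
```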

  • 2021-01-24 10:55

    You can also use a sed chain: first replace "- og" with something nonsensical that will not occur in the text (like booogabooga), then delete the remaining "- " sequences, then turn booogabooga back into "- og".

    sed -e 's/- og/booogabooga/g; s/- //g; s/booogabooga/- og/g'
    

    Some versions of sed may need:

    sed -e 's/- og/booogabooga/g' -e 's/- //g' -e 's/booogabooga/- og/g'
    

    This can be slower and more painful, especially if you have multiple replacements as @Kusalananda suggests, but it is easier to understand.
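
One possible variation, assuming GNU sed: use a newline as the placeholder instead of a made-up word. When sed reads input line by line, the pattern space normally contains no newline, so the placeholder cannot collide with real text. This relies on GNU sed treating \n in the replacement as a newline, which other sed implementations may not do:

```shell
line='Kultur- og nyska- pande'

# Assumes GNU sed: \n in the replacement inserts a newline.
# 1. "- og" -> newline, 2. delete remaining "- ", 3. newline -> "- og".
result=$(printf '%s\n' "$line" | sed 's/- og/\n/g; s/- //g; s/\n/- og/g')

printf '%s\n' "$result"
# prints: Kultur- og nyskapande
```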

  • 2021-01-24 11:03

    sed has no lookahead, but you can spell out every allowed continuation instead (anything other than o, end of line, or an o that is not followed by g):

    sed -e 's/\(- \(- \)*\)\([^o]\|$\|o\([^g]\|$\)\)/\3/g'
    

    You can test it with the input "- - - - og - - oa - o", which should give "- og oa o".
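
That check can be run as follows. This is only a sketch; note that \| inside a basic regular expression is a GNU sed extension, so it assumes GNU sed:

```shell
input='- - - - og - - oa - o'

# Group 3 captures whatever follows the run of "- " and is written back,
# so only the hyphen-space run itself is dropped.
result=$(printf '%s\n' "$input" |
  sed -e 's/\(- \(- \)*\)\([^o]\|$\|o\([^g]\|$\)\)/\3/g')

printf '%s\n' "$result"
# prints: - og oa o
```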
