How to match a string not followed by a word using sed

后端未结

关注

 4  491

I need to delete all strings consisting of a hyphen followed by a whitespace, but only when the whitespace is not followed by the word \"og\". Example file:

Kult


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  悲哀的现实        
                
              
                            
                2021-01-24 10:43
              
            
            
                                                                       
This might work for you (GNU sed):

sed -r 's/(- (og|eller))|- /\1/g' file


This relies on alternation to re-replace specific cases and the empty backreference to replace the general case.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  面向向阳花        
                
              
                            
                2021-01-24 10:48
              
            
            
                                                                       
Given this input file (I added - ellers since you said in a comment you need to handle them too):

$ cat file
Kultur- og idrettsavdelinga skapar- eller nyska- pande kunst og utvik- lar- eller samfunnet


here's the common sed idiomatic approach:

$ sed 's/a/aA/g; s/- og/aB/g; s/- eller/aC/g; s/- //g; s/aC/- eller/g; s/aB/- og/g; s/aA/a/g' file
Kultur- og idrettsavdelinga skapar- eller nyskapande kunst og utviklar- eller samfunnet


The above works by turning all as (or whatever other char you like that's not in your target strings) into aA so we can then turn the strings we're interested in, - og and - eller, into a<some other character>, e.g. aB and aC and at that point we know the only occurrences of aB and aC in the input are the newly transformed - og and - eller since all of the existing as are now aA.

Now we can just remove all remaining -s from the file and then convert the aCs back to - eller and aBs back to - ogs and finally all aAs back to the original as.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  情书的邮戳        
                
              
                            
                2021-01-24 10:55
              
            
            
                                                                       
You can also use a sed chain, first replacing - og with something nonsensical (like booogabooga), then performing the replacement, then reversing the booogabooga.

sed -e 's/- og/booogabooga/g; s/- //g; s/booogabooga/- og/g'


Some versions of sed may need:

sed -e 's/- og/booogabooga/g' -e 's/- //g' -e 's/booogabooga/- og/g'


This can be slower and more painful, especially if you have multiple replacements as @Kusalananda suggests, but it is easier to understand.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南旧        
                
              
                            
                2021-01-24 11:03
              
            
            
                                                                       
The lookahead feature isn't available with sed, but you can describe all possibilities:

sed -e 's/\(- \(- \)*\)\([^o]\|$\|o\([^g]\|$\)\)/\3/g'


You can test it with: - - - - og - - oa - o => - og oa o
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复