How to replace paired square brackets with other syntax with sed?


I want to replace all pairs of square brackets in a file, e.g., [some text], with \macro{some text}, e.g.:

This is some [text].
Th


                      


      
      
        
4 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  没有蜡笔的小新        
                
              
                            
                2020-12-06 02:03
              
            
            
                                                                       
It took a little doing, but here:
sed -i.bkup  's/\[\([^]]*\)\]/\\macro{\1}/g' test.txt
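As a rough sketch of what the -i.bkup option does, the snippet below runs the same command on a throwaway copy of a one-line file. The temporary directory and the variable names (workdir, edited, backup) are only assumptions for illustration; the attached-suffix form of -i is accepted by both GNU and BSD sed.

```shell
# Work in a scratch directory so no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'This is some [text].' > "$workdir/test.txt"

# Edit in place; the untouched original is kept as test.txt.bkup.
sed -i.bkup 's/\[\([^]]*\)\]/\\macro{\1}/g' "$workdir/test.txt"

edited=$(cat "$workdir/test.txt")
backup=$(cat "$workdir/test.txt.bkup")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

# Clean up the scratch directory.
rm -r "$workdir"
```

The edited file should contain the \macro{...} form, while the .bkup copy still holds the bracketed original.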

Let's see if I can explain this regular expression:

1. The \[ matches an opening square bracket. Since [ is a special ("magic") regular expression character, the backslash makes it match the literal character.
2. The \(...\) is a capture group. It captures the part of the match I want to keep. I can have many capture groups, and in sed I can reference them as \1, \2, etc.
3. Inside the capture group \(...\), I have [^]]*.

   3.1. The [^...] syntax means any character except the ones listed.
   3.2. The [^]] means any character except a closing square bracket.
   3.3. The * means zero or more of the preceding. That means I am capturing zero or more characters that are not closing square brackets.

4. The \] matches the closing square bracket.

Let's look at the line this is [some] more [text]

1. In #1 above, I match the first opening square bracket, the one in front of the word some. It is not in a capture group, but it is the first character I'm going to substitute.
2. I now start a capture group. I am capturing according to 3.2 and 3.3 above: starting with the letter s in some, as many characters as possible that are not closing square brackets. This means I am matching [some, but only capturing some.
3. In #4, I have ended my capture group. So far I've matched [some for substitution purposes, and now I match the closing square bracket, so the whole match is [some]. Note that regular expressions are normally greedy. I'll explain below why this is important.
4. Now I can write the replacement string. This is much easier: it's \\macro{\1}. The \1 is replaced by my capture group, and the \\ is just a literal backslash. Thus, I'll replace [some] with \macro{some}.
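The walkthrough above can be checked with a minimal sketch that feeds the sample line through the same substitution. The variable names line and result are just placeholders chosen for this example.

```shell
# Sample line from the walkthrough; printf '%s\n' avoids echo's escape handling.
line='this is [some] more [text]'

# Same expression as in the answer: literal [, capture non-] characters, literal ].
result=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')

printf '%s\n' "$result"
# prints: this is \macro{some} more \macro{text}
```

Because of the g flag, both bracket pairs on the line are rewritten in one pass.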

It would be much easier if I could be guaranteed a single set of square brackets in each line. Then I could have done this:
sed -i.bkup 's/\[\(.*\)\]/\\macro{\1}/g' test.txt

The capture group now says "anything between two square brackets". The problem is that regular expressions are greedy, which means I would have captured everything from the s in some all the way to the final t in text. The x's below show the capture group. The [ and ] show the square brackets I'm matching on:
 this is [some] more [text]
         [xxxxxxxxxxxxxxxx]
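To see the greedy behaviour concretely, here is a small comparison of the .* version against the [^]]* version on the same line. This is only an illustrative sketch; the variable names greedy and bounded are made up for the example.

```shell
line='this is [some] more [text]'

# Greedy: .* runs from the first [ to the LAST ] on the line.
greedy=$(printf '%s\n' "$line" | sed 's/\[\(.*\)\]/\\macro{\1}/g')

# Bounded: [^]]* cannot cross a closing bracket, so each pair matches separately.
bounded=$(printf '%s\n' "$line" | sed 's/\[\([^]]*\)\]/\\macro{\1}/g')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
# greedy:  this is \macro{some] more [text}
# bounded: this is \macro{some} more \macro{text}
```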

This became more complex because I had to match characters that have special meaning in regular expressions, so we see a lot of backslashes. Plus, I had to account for regular expression greediness, which is where the odd-looking negated class [^]]* comes from: it matches anything that is not a closing bracket. Add the literal square brackets before and after, \[[^]]*\], and don't forget the \(...\) capture group: \[\([^]]*\)\]. And you get one big mess of a regular expression.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  难免孤独        
                
              
                            
                2020-12-06 02:08
              
            
            
                                                                       
sed -e 's/\[\([^]]*\)\]/\\macro{\1}/g' file.txt


This looks for an opening bracket, then any number of characters that are not a closing bracket, then a closing bracket. The text between the brackets is captured by the escaped parentheses \(...\) and inserted into the replacement expression as \1.
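If the backslashed parentheses feel noisy, the same idea can be written with extended regular expressions. This is a sketch that assumes a sed supporting the -E option (GNU sed and BSD sed both do); it is a variant, not the command from the answer, and the variable name ere is only illustrative.

```shell
# With -E, ( ) group without backslashes; the brackets still need escaping.
ere=$(printf '%s\n' 'This is some [text].' | sed -E 's/\[([^]]*)\]/\\macro{\1}/g')

printf '%s\n' "$ere"
# prints: This is some \macro{text}.
```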
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  说谎        
                
              
                            
                2020-12-06 02:10
              
            
            
                                                                       
Use a capture group (here with | as the s command delimiter instead of /):

sed 's|\[\([^]]*\)\]|\\macro{\1}|g' file

                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  轻奢々        
                
              
                            
                2020-12-06 02:14
              
            
            
                                                                       
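The following expression matches a pair of square brackets whose contents consist only of letters (a-z, A-Z) and spaces, and replaces it with \macro{<whatever was between the []>}. Bracketed text containing digits or punctuation is left unchanged.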

sed -e 's/\[\([a-zA-Z ]*\)\]/\\macro{\1}/g'


In the expression, the \( ... \) form a match group that can be referenced later in the substitution as \1.
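A quick sketch of how this letters-and-spaces version behaves on mixed input follows. The sample text and the variable names input and restricted are invented for the example.

```shell
# One bracket pair holds only letters and a space, the other also holds digits.
input='see [some text] and [item 42]'

restricted=$(printf '%s\n' "$input" | sed -e 's/\[\([a-zA-Z ]*\)\]/\\macro{\1}/g')

printf '%s\n' "$restricted"
# prints: see \macro{some text} and [item 42]
```

The second pair is skipped because 4 and 2 are outside the [a-zA-Z ] class, which may or may not be what you want.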
                                                                        
                                                        
            
            
              
                