replace multiple lines identifying end character

后端未结

关注

 3  724

I have the below code

CREATE TABLE Table1(
        column1 double NOT NULL,
        column2 varchar(60) NULL,
        column3 varchar(60) NULL,
        column4


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  南旧        
                
              
                            
                2021-01-25 02:39
              
            
            
                                                                       
This may look scary but it is not difficult to grasp with a bit of explanation: 

SED_DELIM=$(echo -en "\001")
START=' CONSTRAINT Index1 PRIMARY KEY CLUSTERED'
END='GO' 
sed -n $'\x5c'"${SED_DELIM}${START}${SED_DELIM},"$'\x5c'"${SED_DELIM}${END}${SED_DELIM}{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p" test2.txt


The sed has the following form you may be more familiar with:
 
 sed /regex1/,/regex2/{commands} 

First it uses the SOH non-printable as the delimiter \001

Sets the START and END tags for sed multiline match
 
Then performs the sed command: 

-n do not print by default 

$'\x5c' is a Bash string literal that corresponds to backslash \

The backslashes are necessary to escape the non-printable delimiter on the multiline range match.
 
{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p:
 
   s${SED_DELIM}GO${SED_DELIM})${SED_DELIM} replace the line that matches GO with ) 

   t a; if there is a successful substitution in the prior statement then branch to the :a label
 
      d if there is no subsitution then delete the line

      p print whatever the result is after the commands
branch to the 

I didn't see their answers prior to posting this - this answer is the same as FredPhil/hek2mgl - except in this manner you have a mechanism to be more dynamic on the LHS since you can change the delimiter to a character that is much less likely to appear in the dataset.  
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  慢半拍i        
                
              
                            
                2021-01-25 02:46
              
            
            
                                                                       
Update:

You can use the following sed command to replace even the last , before the CONSTRAINT block:

sed -r '/,/{N;/CONSTRAINT/{:a;N;/GO/!ba;s/([^,]+).*/\1\n)/};/CONSTRAINT/!n}' input.sql


Let me explain it as a multiline script:

# Search for a comma
/,/ {
  # If a command was found slurp in the next line
  # and append it to the current line in pattern buffer
  N
  # If the pattern buffer does not contain the word CONSTRAINT
  # print the pattern buffer and go on with the next line of input
  # meaning start searching for a comma
  /CONSTRAINT/! n

  # If the pattern CONSTRAINT was found we loop until we find the 
  # word GO
  /CONSTRAINT/ {
    # Define a start label for the loop 
    :a
    # Append the next line of input to the pattern buffer
    N
    # If GO is still not found in the pattern buffern
    # step to the start label of the loop
    /GO/! ba

    # The loop was exited meaning the pattern GO was found.
    # We keep the first line of the pattern buffer - without
    # the comma at the end and replace everything else by a )
    s/([^,]+).*/\1\n)/
  }
}


You can save the above multiline script in a file and execute it using

sed -rf script.sed input.sql




You can use the following sed command:

sed '/CONSTRAINT/{:a;N;/GO/!ba;s/.*/)/}' input.sql


The pattern searches for a line containing /CONSTRAINT/. If the pattern is found a block of commands is started wrapped between { }. In the block we first define a label a through :a. The we get the next line of input through N and append it to the pattern buffer. Unless we find the pattern /GO/! we'll continue at label a using the branch command b. If the pattern /GO/ is found we simply replace the buffer by a ).



An alternative can be using using a range like FredPhil suggested:

sed '/CONSTRAINT/,/GO/{s/GO/)/;te;d;:e}'

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  谎友^        
                
              
                            
                2021-01-25 02:53
              
            
            
                                                                       
With GNU awk for multi-char RS and assuming you want to get rid of the comma before the "CONSTRAINT":

$ cat tst.awk
BEGIN{ RS="^$"; ORS="" }
{
    gsub(/\<GO\>/,"\034")
    gsub(/,\s*CONSTRAINT[^\034]+\034/,")")
    gsub(/\034/,"GO")
    print
}
$ gawk -f tst.awk file
CREATE TABLE Table1(
        column1 double NOT NULL,
        column2 varchar(60) NULL,
        column3 varchar(60) NULL,
        column4 double NOT NULL)
GO


The above works by replacing every stand-alone "GO" with a control char that's unlikely to appear in your input (in this case I used the same value as the default SUBSEP) so we can use that char in a negated character list in the middle gsub() to create a regexp that ends with the first "GO" after "CONSTRAINT". This is one way to do "non-greedy" matching in awk.

If there is no char that you KNOW cannot appear in your input, you can create one like this:

$ cat tst.awk
BEGIN{ RS="^$"; ORS="" }
{
    gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/\<GO\>/,"b")
    gsub(/,\s*CONSTRAINT[^b]+b/,")")
    gsub(/b/,"GO"); gsub(/aB/,"b"); gsub(/aA/,"a")
    print
}
$ 
$ gawk -f tst.awk file
CREATE TABLE Table1(
        column1 double NOT NULL,
        column2 varchar(60) NULL,
        column3 varchar(60) NULL,
        column4 double NOT NULL)
GO


The above initially converts all "a"s to "aA" and "b"s to "aB" so that


there are no longer any "b"s in the record, and
since all original "a"s now have an "A" after them, the only occurrences of
"aB" represent where "bs" were originally located


and that means that we can now convert all "GO"s to "b"s just like we converted them to "\034" in the first script above. Then we do the main gsub() and then unroll our initial gsub()s.

This idea of gsub()ing to create chars that cannot previously exist, using those chars, then unrolling the initial gsub()s is an extremely useful idiom to learn and remember, e.g. see https://stackoverflow.com/a/13062682/1745001 for another application.

To see it working one step at a time:

$ cat file                                                                                                   
foo bar Hello World World able bodies

$ awk '{gsub(/a/,"aA")}1' file                                                                               
foo baAr Hello World World aAble bodies

$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB")}1' file                                                               
foo aBaAr Hello World World aAaBle aBodies

$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b")}1' file                                            
foo aBaAr Hello b b aAaBle aBodies

$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The")}1' file                         
foo aBaAr We Are The b aAaBle aBodies

$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World")}1' file 
foo aBaAr We Are The World aAaBle aBodies

$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b")}1' file
foo baAr We Are The World aAble bodies

$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b"); ; gsub(/aA/,"a")}1' file
foo bar We Are The World able bodies

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复