replace multiple lines identifying end character

后端 未结 3 718
眼角桃花
眼角桃花 2021-01-25 02:03

I have the below code

CREATE TABLE Table1(
        column1 double NOT NULL,
        column2 varchar(60) NULL,
        column3 varchar(60) NULL,
        column4          


        
相关标签:
3条回答
  • 2021-01-25 02:39

    This may look scary but it is not difficult to grasp with a bit of explanation:

    SED_DELIM=$(echo -en "\001")
    START=' CONSTRAINT Index1 PRIMARY KEY CLUSTERED'
    END='GO' 
    sed -n $'\x5c'"${SED_DELIM}${START}${SED_DELIM},"$'\x5c'"${SED_DELIM}${END}${SED_DELIM}{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p" test2.txt
    

    The sed has the following form you may be more familiar with:
    sed /regex1/,/regex2/{commands}

    First it uses the SOH non-printable as the delimiter \001
    Sets the START and END tags for sed multiline match
    Then performs the sed command:
    -n do not print by default
    $'\x5c' is a Bash string literal that corresponds to backslash \
    The backslashes are necessary to escape the non-printable delimiter on the multiline range match.
    {s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p:
    s${SED_DELIM}GO${SED_DELIM})${SED_DELIM} replace the line that matches GO with )
    t a; if there is a successful substitution in the prior statement then branch to the :a label
    d if there is no subsitution then delete the line
    p print whatever the result is after the commands branch to the

    I didn't see their answers prior to posting this - this answer is the same as FredPhil/hek2mgl - except in this manner you have a mechanism to be more dynamic on the LHS since you can change the delimiter to a character that is much less likely to appear in the dataset.

    0 讨论(0)
  • 2021-01-25 02:46

    Update:

    You can use the following sed command to replace even the last , before the CONSTRAINT block:

    sed -r '/,/{N;/CONSTRAINT/{:a;N;/GO/!ba;s/([^,]+).*/\1\n)/};/CONSTRAINT/!n}' input.sql
    

    Let me explain it as a multiline script:

    # Search for a comma
    /,/ {
      # If a command was found slurp in the next line
      # and append it to the current line in pattern buffer
      N
      # If the pattern buffer does not contain the word CONSTRAINT
      # print the pattern buffer and go on with the next line of input
      # meaning start searching for a comma
      /CONSTRAINT/! n
    
      # If the pattern CONSTRAINT was found we loop until we find the 
      # word GO
      /CONSTRAINT/ {
        # Define a start label for the loop 
        :a
        # Append the next line of input to the pattern buffer
        N
        # If GO is still not found in the pattern buffern
        # step to the start label of the loop
        /GO/! ba
    
        # The loop was exited meaning the pattern GO was found.
        # We keep the first line of the pattern buffer - without
        # the comma at the end and replace everything else by a )
        s/([^,]+).*/\1\n)/
      }
    }
    

    You can save the above multiline script in a file and execute it using

    sed -rf script.sed input.sql
    

    You can use the following sed command:

    sed '/CONSTRAINT/{:a;N;/GO/!ba;s/.*/)/}' input.sql
    

    The pattern searches for a line containing /CONSTRAINT/. If the pattern is found a block of commands is started wrapped between { }. In the block we first define a label a through :a. The we get the next line of input through N and append it to the pattern buffer. Unless we find the pattern /GO/! we'll continue at label a using the branch command b. If the pattern /GO/ is found we simply replace the buffer by a ).


    An alternative can be using using a range like FredPhil suggested:

    sed '/CONSTRAINT/,/GO/{s/GO/)/;te;d;:e}'
    
    0 讨论(0)
  • 2021-01-25 02:53

    With GNU awk for multi-char RS and assuming you want to get rid of the comma before the "CONSTRAINT":

    $ cat tst.awk
    BEGIN{ RS="^$"; ORS="" }
    {
        gsub(/\<GO\>/,"\034")
        gsub(/,\s*CONSTRAINT[^\034]+\034/,")")
        gsub(/\034/,"GO")
        print
    }
    $ gawk -f tst.awk file
    CREATE TABLE Table1(
            column1 double NOT NULL,
            column2 varchar(60) NULL,
            column3 varchar(60) NULL,
            column4 double NOT NULL)
    GO
    

    The above works by replacing every stand-alone "GO" with a control char that's unlikely to appear in your input (in this case I used the same value as the default SUBSEP) so we can use that char in a negated character list in the middle gsub() to create a regexp that ends with the first "GO" after "CONSTRAINT". This is one way to do "non-greedy" matching in awk.

    If there is no char that you KNOW cannot appear in your input, you can create one like this:

    $ cat tst.awk
    BEGIN{ RS="^$"; ORS="" }
    {
        gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/\<GO\>/,"b")
        gsub(/,\s*CONSTRAINT[^b]+b/,")")
        gsub(/b/,"GO"); gsub(/aB/,"b"); gsub(/aA/,"a")
        print
    }
    $ 
    $ gawk -f tst.awk file
    CREATE TABLE Table1(
            column1 double NOT NULL,
            column2 varchar(60) NULL,
            column3 varchar(60) NULL,
            column4 double NOT NULL)
    GO
    

    The above initially converts all "a"s to "aA" and "b"s to "aB" so that

    1. there are no longer any "b"s in the record, and
    2. since all original "a"s now have an "A" after them, the only occurrences of "aB" represent where "bs" were originally located

    and that means that we can now convert all "GO"s to "b"s just like we converted them to "\034" in the first script above. Then we do the main gsub() and then unroll our initial gsub()s.

    This idea of gsub()ing to create chars that cannot previously exist, using those chars, then unrolling the initial gsub()s is an extremely useful idiom to learn and remember, e.g. see https://stackoverflow.com/a/13062682/1745001 for another application.

    To see it working one step at a time:

    $ cat file                                                                                                   
    foo bar Hello World World able bodies
    
    $ awk '{gsub(/a/,"aA")}1' file                                                                               
    foo baAr Hello World World aAble bodies
    
    $ awk '{gsub(/a/,"aA"); gsub(/b/,"aB")}1' file                                                               
    foo aBaAr Hello World World aAaBle aBodies
    
    $ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b")}1' file                                            
    foo aBaAr Hello b b aAaBle aBodies
    
    $ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The")}1' file                         
    foo aBaAr We Are The b aAaBle aBodies
    
    $ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World")}1' file 
    foo aBaAr We Are The World aAaBle aBodies
    
    $ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b")}1' file
    foo baAr We Are The World aAble bodies
    
    $ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b"); ; gsub(/aA/,"a")}1' file
    foo bar We Are The World able bodies
    
    0 讨论(0)
提交回复
热议问题