I have the below code
CREATE TABLE Table1(
column1 double NOT NULL,
column2 varchar(60) NULL,
column3 varchar(60) NULL,
column4
This may look scary but it is not difficult to grasp with a bit of explanation:
SED_DELIM=$(echo -en "\001")
START=' CONSTRAINT Index1 PRIMARY KEY CLUSTERED'
END='GO'
sed -n $'\x5c'"${SED_DELIM}${START}${SED_DELIM},"$'\x5c'"${SED_DELIM}${END}${SED_DELIM}{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p" test2.txt
The sed has the following form you may be more familiar with:
sed /regex1/,/regex2/{commands}
First it uses the SOH non-printable as the delimiter \001
Sets the START and END tags for sed multiline match
Then performs the sed command:
-n
do not print by default
$'\x5c'
is a Bash string literal that corresponds to backslash \
The backslashes are necessary to escape the non-printable delimiter on the multiline range match.
{s${SED_DELIM}GO${SED_DELIM})${SED_DELIM};t a;d;:a;};p
:
s${SED_DELIM}GO${SED_DELIM})${SED_DELIM}
replace the line that matches GO with )
t a;
if there is a successful substitution in the prior statement then branch to the :a
label
d
if there is no subsitution then delete the line
p
print whatever the result is after the commands
branch to the
I didn't see their answers prior to posting this - this answer is the same as FredPhil/hek2mgl - except in this manner you have a mechanism to be more dynamic on the LHS since you can change the delimiter to a character that is much less likely to appear in the dataset.
Update:
You can use the following sed
command to replace even the last ,
before the CONSTRAINT
block:
sed -r '/,/{N;/CONSTRAINT/{:a;N;/GO/!ba;s/([^,]+).*/\1\n)/};/CONSTRAINT/!n}' input.sql
Let me explain it as a multiline script:
# Search for a comma
/,/ {
# If a command was found slurp in the next line
# and append it to the current line in pattern buffer
N
# If the pattern buffer does not contain the word CONSTRAINT
# print the pattern buffer and go on with the next line of input
# meaning start searching for a comma
/CONSTRAINT/! n
# If the pattern CONSTRAINT was found we loop until we find the
# word GO
/CONSTRAINT/ {
# Define a start label for the loop
:a
# Append the next line of input to the pattern buffer
N
# If GO is still not found in the pattern buffern
# step to the start label of the loop
/GO/! ba
# The loop was exited meaning the pattern GO was found.
# We keep the first line of the pattern buffer - without
# the comma at the end and replace everything else by a )
s/([^,]+).*/\1\n)/
}
}
You can save the above multiline script in a file and execute it using
sed -rf script.sed input.sql
You can use the following sed
command:
sed '/CONSTRAINT/{:a;N;/GO/!ba;s/.*/)/}' input.sql
The pattern searches for a line containing /CONSTRAINT/
. If the pattern is found a block of commands is started wrapped between { }
. In the block we first define a label a
through :a
. The we get the next line of input through N
and append it to the pattern buffer. Unless we find the pattern /GO/!
we'll continue at label a
using the branch command b
. If the pattern /GO/
is found we simply replace the buffer by a )
.
An alternative can be using using a range like FredPhil suggested:
sed '/CONSTRAINT/,/GO/{s/GO/)/;te;d;:e}'
With GNU awk for multi-char RS and assuming you want to get rid of the comma before the "CONSTRAINT":
$ cat tst.awk
BEGIN{ RS="^$"; ORS="" }
{
gsub(/\<GO\>/,"\034")
gsub(/,\s*CONSTRAINT[^\034]+\034/,")")
gsub(/\034/,"GO")
print
}
$ gawk -f tst.awk file
CREATE TABLE Table1(
column1 double NOT NULL,
column2 varchar(60) NULL,
column3 varchar(60) NULL,
column4 double NOT NULL)
GO
The above works by replacing every stand-alone "GO" with a control char that's unlikely to appear in your input (in this case I used the same value as the default SUBSEP) so we can use that char in a negated character list in the middle gsub() to create a regexp that ends with the first "GO" after "CONSTRAINT". This is one way to do "non-greedy" matching in awk.
If there is no char that you KNOW cannot appear in your input, you can create one like this:
$ cat tst.awk
BEGIN{ RS="^$"; ORS="" }
{
gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/\<GO\>/,"b")
gsub(/,\s*CONSTRAINT[^b]+b/,")")
gsub(/b/,"GO"); gsub(/aB/,"b"); gsub(/aA/,"a")
print
}
$
$ gawk -f tst.awk file
CREATE TABLE Table1(
column1 double NOT NULL,
column2 varchar(60) NULL,
column3 varchar(60) NULL,
column4 double NOT NULL)
GO
The above initially converts all "a"s to "aA" and "b"s to "aB" so that
and that means that we can now convert all "GO"s to "b"s just like we converted them to "\034" in the first script above. Then we do the main gsub() and then unroll our initial gsub()s.
This idea of gsub()
ing to create chars that cannot previously exist, using those chars, then unrolling the initial gsub()
s is an extremely useful idiom to learn and remember, e.g. see https://stackoverflow.com/a/13062682/1745001 for another application.
To see it working one step at a time:
$ cat file
foo bar Hello World World able bodies
$ awk '{gsub(/a/,"aA")}1' file
foo baAr Hello World World aAble bodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB")}1' file
foo aBaAr Hello World World aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b")}1' file
foo aBaAr Hello b b aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The")}1' file
foo aBaAr We Are The b aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World")}1' file
foo aBaAr We Are The World aAaBle aBodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b")}1' file
foo baAr We Are The World aAble bodies
$ awk '{gsub(/a/,"aA"); gsub(/b/,"aB"); gsub(/World/,"b"); gsub(/Hello[^b]+b/,"We Are The"); gsub(/b/,"World"); gsub(/aB/,"b"); ; gsub(/aA/,"a")}1' file
foo bar We Are The World able bodies