Question
I am creating a search-and-replace function for my application and running a test scenario with three files: array, script, and test.
I am trying to escape double quotation marks, but it won't work.
The script file contains:
variableName=$1
sed "s#data\-field\=\"${variableName}\.name\"#data\-field\=${variableName}\.name data\-type\=dropdown data\-dropdown\-type\=${variableName}#g" test
The test file contains:
data-field=“fee_category.name”
data-field=“tax_type.name”
The array file contains:
fee_category
tax_type
There is no error; the output is identical to the input because the sed command does not find what it is looking for. If I remove the double quotes around ${variableName} from the pattern and also remove them from the test file, the function works fine.
Answer 1:
Following mklement0's comment, I am writing this answer only to share some findings in case a literal match of your special double quotes is needed. It may be useful to other users.
Your quoted text fee_category.name has a Unicode Left Double Quotation Mark (U+201C) on the left side and a Unicode Right Double Quotation Mark (U+201D) on the right side.
These non-standard quotation marks are encoded as follows:
Unicode Left Double Quotation Mark U+201c
UTF-8 (hex) 0xE2 0x80 0x9C (e2809c)
UTF-16 (hex) 0x201C (201c)
Unicode Right Double Quotation Mark U+201d
UTF-8 (hex) 0xE2 0x80 0x9D (e2809d)
UTF-16 (hex) 0x201D (201d)
Analyzing your data with the od utility confirms that these UTF-8 byte sequences are present:
$ echo data-field=“fee_category.name” |od -w40 -t x1c
0000000 64 61 74 61 2d 66 69 65 6c 64 3d e2 80 9c 66 65 65 5f 63 61 74 65 67 6f 72 79 2e 6e 61 6d 65 e2 80 9d 0a
d a t a - f i e l d = 342 200 234 f e e _ c a t e g o r y . n a m e 342 200 235 \n
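The same check can be run against a whole file without reading a hex dump. This is a minimal sketch, assuming a POSIX shell and grep; the sample file and the names lq, rq, sample and curly_lines are made up for illustration. It builds the two quotes from the octal byte values that od printed above (342 200 234 and 342 200 235) and counts the lines containing either one:

```shell
# Curly quotes built from their UTF-8 bytes (octal escapes are POSIX printf).
lq=$(printf '\342\200\234')   # U+201C left double quotation mark
rq=$(printf '\342\200\235')   # U+201D right double quotation mark

# Hypothetical sample file: one line with curly quotes, one with ASCII quotes.
sample=$(mktemp)
printf 'data-field=%sfee_category.name%s\n' "$lq" "$rq" > "$sample"
printf 'data-field="tax_type.name"\n' >> "$sample"

# Count the lines that contain at least one curly quote.
curly_lines=$(grep -c -e "$lq" -e "$rq" "$sample")
echo "lines with curly quotes: $curly_lines"

rm -f "$sample"
```

A non-zero count suggests the file was edited by a tool that "smartens" quotes, which is why a pattern written with ASCII quotes does not match.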
Interestingly, Bash can print these characters either by their Unicode code point or by their UTF-8 byte sequence:
$ echo -e "\u201c test \u201d"
“ test ”
$ echo -e "\xe2\x80\x9c test \xe2\x80\x9d"
“ test ”
Accordingly, we can make sed match these special characters like this:
$ string=$(echo -e "\u201c test \u201d");echo "$string"
“ test ”
$ lq=$(echo -ne "\u201c");rq=$(echo -ne "\u201d")
$ sed -E "s/($lq)(.+)($rq)/**\2**/" <<<"$string"
** test **
This also seems to work, without the need for helper variables:
$ sed -E "s/(\xe2\x80\x9c)(.+)(\xe2\x80\x9d)/**\2**/" <<<"$string"
** test **
This means that the hex sequence \xe2\x80\x9c (or \xe2\x80\x9d for the right quote) can be used directly by sed to match these special quotes literally.
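Applied to the line from the question, the helper-variable approach might look like the sketch below. It assumes a POSIX shell; lq, rq, line and result are illustrative names, and the quotes are built with octal printf escapes rather than \x so that the snippet does not depend on GNU-specific escape handling:

```shell
# Curly quotes built from their UTF-8 bytes.
lq=$(printf '\342\200\234')   # U+201C
rq=$(printf '\342\200\235')   # U+201D

variableName=fee_category
line="data-field=${lq}fee_category.name${rq}"

# Same substitution as in the question, but the pattern now contains
# the curly quotes that are actually present in the data.
result=$(printf '%s\n' "$line" |
  sed "s#data-field=${lq}${variableName}\.name${rq}#data-field=${variableName}.name data-type=dropdown data-dropdown-type=${variableName}#g")

printf '%s\n' "$result"
# prints: data-field=fee_category.name data-type=dropdown data-dropdown-type=fee_category
```

Because the quotes are embedded as literal bytes, sed only has to match them verbatim.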
You could also preprocess your files and convert all the non-standard quotes to standard ones. Note that a comma inside a bracket expression is a literal character, so [\xe2\x80\x9c,\xe2\x80\x9d] would also replace commas; alternation avoids that:
$ sed -E "s/\xe2\x80\x9c|\xe2\x80\x9d/\x22/g" <<<"$string"
" test " #Special quotes replaced with classic ASCII quotes.
The tests above were done on Debian Testing with Bash 4.4 and GNU sed 4.4; these techniques may not work with other sed flavors.
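Once the quotes are normalized, a pattern written with ASCII quotes, as in the question, should match. The following is a rough sketch of that two-step flow, assuming a POSIX shell; input, normalized and replaced are illustrative names, and two separate s commands are used instead of \x escapes to stay independent of GNU extensions:

```shell
# Curly quotes built from their UTF-8 bytes.
lq=$(printf '\342\200\234')   # U+201C
rq=$(printf '\342\200\235')   # U+201D

variableName=tax_type
input="data-field=${lq}tax_type.name${rq}"

# Step 1: turn both curly quotes into plain ASCII double quotes.
normalized=$(printf '%s\n' "$input" |
  sed -e "s/${lq}/\"/g" -e "s/${rq}/\"/g")

# Step 2: the ASCII-quoted pattern from the question now finds its target.
replaced=$(printf '%s\n' "$normalized" |
  sed "s#data-field=\"${variableName}\.name\"#data-field=${variableName}.name data-type=dropdown data-dropdown-type=${variableName}#g")

printf '%s\n' "$normalized"   # data-field="tax_type.name"
printf '%s\n' "$replaced"     # data-field=tax_type.name data-type=dropdown data-dropdown-type=tax_type
```

Normalizing first keeps the main substitution readable, at the cost of changing every curly quote in the file, including any that were intentional.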
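Answer 2:
If in doubt, you can match the quotes with a wildcard. The . matches any single character, so this assumes a locale in which each curly quote counts as one character (for example a UTF-8 locale):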
variableName="fee_category"
sed "s#data-field=.${variableName}\.name.#& data-type=dropdown data-dropdown-type=${variableName}#g" test
# Or, when you do not want those quotes back in your output
sed "s#\(data-field=\).\(${variableName}\)\(\.name\).#\1\2\3 data-type=dropdown data-dropdown-type=\2#g" test
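The question also mentions an array file with one name per line. A possible way to run the substitution for every name is sketched below, assuming a POSIX shell; the temporary directory and the names workdir, lq, rq and result are made up for illustration. The sketch matches the curly quotes literally rather than with ., because in the C locale . matches a single byte and each curly quote is three bytes long. Names containing regex metacharacters or # would need escaping first.

```shell
# Curly quotes built from their UTF-8 bytes.
lq=$(printf '\342\200\234')   # U+201C
rq=$(printf '\342\200\235')   # U+201D

# Recreate the question's array and test files in a scratch directory.
workdir=$(mktemp -d)
printf 'fee_category\ntax_type\n' > "$workdir/array"
printf 'data-field=%sfee_category.name%s\n' "$lq" "$rq" > "$workdir/test"
printf 'data-field=%stax_type.name%s\n' "$lq" "$rq" >> "$workdir/test"

# Apply the substitution once per name listed in the array file.
while IFS= read -r variableName; do
  [ -n "$variableName" ] || continue   # skip empty lines
  sed "s#data-field=${lq}${variableName}\.name${rq}#data-field=${variableName}.name data-type=dropdown data-dropdown-type=${variableName}#g" \
    "$workdir/test" > "$workdir/test.new" && mv "$workdir/test.new" "$workdir/test"
done < "$workdir/array"

result=$(cat "$workdir/test")
printf '%s\n' "$result"
rm -rf "$workdir"
```

Writing to a temporary file and moving it into place avoids relying on sed -i, whose syntax differs between GNU and BSD sed.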
Source: https://stackoverflow.com/questions/43458809/escaping-double-quotation-marks-in-sed