Question
I have the following JSON file:
{ "last_modified": {
"type": "/type/datetime",
"value": "2008-04-01T03:28:50.625462" },
"type": { "key": "/type/author" },
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
"key": "/authors/OL2108538A",
"revision": 1 }
The name value has a double quote and I only want to replace this double quote with a single quote (not any other double quote). How can I do it?
Answer 1:
If you want to replace all occurrences of a single character, you can also use the tr command, which is simpler than sed or awk:
cat myfile.txt | tr \" \'
Notice that both quotes are escaped so the shell passes them to tr literally. For characters other than quotes, no escaping is needed:
cat myfile.txt | tr a A
Edit: Note that after the question was edited this answer is no longer valid: it replaces all double quotes, not only the one inside the Name property.
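To make that caveat concrete, here is a minimal sketch run on a shortened, made-up record (the sample and converted variable names are illustrative and not from the question). Because tr maps characters one-for-one, the quotes that delimit the JSON strings are translated along with the stray one:

```shell
# Shortened, hypothetical record with one stray quote inside the name value.
sample='{"name": "Puerto Rico"s Economy.", "revision": 1}'

# tr translates every double quote on the line, not only the stray one.
converted=$(printf '%s\n' "$sample" | tr '"' "'")

printf '%s\n' "$converted"
# prints: {'name': 'Puerto Rico's Economy.', 'revision': 1}
```

The result is no longer valid JSON, which is why this approach only fits the original, unedited wording of the question.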
Answer 2:
Adding some other weird error cases to your input
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
"has multiple " internal " quotes": 1,
}
this Perl program that corrects unescaped internal double-quotes using the heuristic that a string's actual closing quote is followed by optional whitespace and either a colon, comma, semicolon, or curly brace
#! /usr/bin/perl -p
s<"(.+?)"(\s*[:,;}])> {
my($text,$terminator) = ($1,$2);
$text =~ s/(?<!\\)"/'/g; # " oh, the irony!
qq["$text"] . $terminator;
}eg;
produces the following output:
$ ./fixdqs input.json
{ "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
"type": {"key": "/type/author"},
"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
"key": "/authors/OL2108538A",
"revision": 1,
"has \" escaped quote": 1,
"has \" escaped quotes \"": 1,
"has multiple ' internal ' quotes": 1,
}
Delta from input to output:
$ diff -ub input.json <(./fixdqs input.json)
--- input.json
+++ /dev/fd/63
@@ -1,9 +1,9 @@
 { "last_modified": {"type": "/type/datetime", "value": "2008-04-01T03:28:50.625462"},
 "type": {"key": "/type/author"},
-"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico"s Economy.",
+"name": "National Research Council. Committee on the Scientific and Technologic Base of Puerto Rico's Economy.",
 "key": "/authors/OL2108538A",
 "revision": 1,
 "has \" escaped quote": 1,
 "has \" escaped quotes \"": 1,
-"has multiple " internal " quotes": 1,
+"has multiple ' internal ' quotes": 1,
 }
Answer 3:
I think it would be better to use sed, something like this:
sed "s/\"/'/g" yourfile
Answer 4:
If you mean just the double quote in 'Rico"s', you can use:
sed "s/Rico\"s/Rico's/"
as in:
pax> echo '{"name": "National Res...rto Rico"s Economy.", "key": "blah"}' | sed "s/Rico\"s/Rico's/"
{"name": "National Res...rto Rico's Economy.", "key": "blah"}
Answer 5:
Assuming your data is exactly like you showed and the extra double quotes only appear in the name value field:
Update:
I made the script slightly more robust (handling ', ' inside fields).
BEGIN {
q = "\""
FS = OFS = q ", " q
}
{
split($1, arr, ": " q)
gsub(q, "'", arr[2])
print arr[1] ": " q arr[2], $2, $3
}
Put this script in a file (say dequote.awk) and run it with awk -f dequote.awk input.json > output.json.
Update 2:
Okay, so your input is extremely difficult to process. The only other thing I can think of is this:
{
start = match($0, "\"name\": ") + 8
stop = match($0, "\", \"key\": ")
if (start == 8 || stop == 0) {
print
next
}
pre = substr($0, 1, start)
post = substr($0, stop)
name = substr($0, start + 1, stop - start - 1)
gsub("\"", "'", name)
print pre name post
}
Explanation: I try to chop the line in three parts:
- Up to the first double quote for the "name" value field;
- the "name" value field minus the double quotes;
- the closing double quote and the rest of the line.
In part 2 I replace all double quotes by single quotes. Then I glue the three parts back together and print them.
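As a rough usage sketch, the three-part split can be tried from the shell on a single-line record. The record below is a shortened, made-up sample, and the record and fixed variable names are illustrative. It assumes "name" and "key" sit on the same line, separated exactly by ", ", which is what the two match() calls look for. The single quote is written as the octal escape \047 so the awk program can stay inside shell single quotes.

```shell
# Hypothetical one-line record; the script needs "name" and "key" on one line.
record='{"name": "Puerto Rico"s Economy.", "key": "/authors/OL2108538A"}'

fixed=$(printf '%s\n' "$record" | awk '
{
    start = match($0, "\"name\": ") + 8              # index of the opening quote of the value
    stop  = match($0, "\", \"key\": ")               # index of the closing quote of the value
    if (start == 8 || stop == 0) { print; next }     # pattern not found: pass the line through
    pre  = substr($0, 1, start)                      # part 1: up to and including the opening quote
    name = substr($0, start + 1, stop - start - 1)   # part 2: the value without its delimiters
    post = substr($0, stop)                          # part 3: the closing quote and the rest
    gsub("\"", "\047", name)                         # \047 is the octal escape for a single quote
    print pre name post
}')

printf '%s\n' "$fixed"
# prints: {"name": "Puerto Rico's Economy.", "key": "/authors/OL2108538A"}
```

Lines where either pattern is missing are printed unchanged, so a pretty-printed file with "name" and "key" on separate lines would pass through unmodified.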
Answer 6:
awk '{for(i=1;i<=NF;i++) if($i~/name/) { gsub("\042","\047",$(i+1)) } }1' file
Answer 7:
If you only want to change the quotes around "name", you can use sed from the command line or in a bash script:
sed -i 's/ "name"/ '\'name\''/g' filename.json
Tested, works.
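The quoting in that command is easy to misread, so here is a small sketch of what it does, run on a made-up line and without -i so no file is touched (the line and result variable names are illustrative). The shell joins three pieces, 's/ "name"/ ' then \'name\' then '/g', so sed receives the script s/ "name"/ 'name'/g:

```shell
# Hypothetical input line; the leading space matters because the pattern starts with one.
line=' "name": "Puerto Rico"s Economy.",'

# Same substitution as above, applied to stdin instead of editing a file in place.
result=$(printf '%s\n' "$line" | sed 's/ "name"/ '\'name\''/g')

printf '%s\n' "$result"
# prints:  'name': "Puerto Rico"s Economy.",
```

Only the quotes around the key are changed; the stray quote inside the value is left as it was.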
Source: https://stackoverflow.com/questions/3422103/how-to-replace-text-using-sed-or-awk