Question
My data is in the following format:
{"Foo":"ABC","Bar":"20090101100000","Quux":"{\"QuuxId\":1234,\"QuuxName\":\"Sam\"}"}
I need it to be in this format:
{"Foo":"ABC","Bar":"20090101100000","Quux":{"QuuxId":1234,"QuuxName":"Sam"}}
I'm trying to use Pig's REPLACE function to get it into the format I need. So I tried:
"LOGS = LOAD 'inputloc' USING TextStorage() as unparsedString:chararray;;" +
"REPL1 = foreach LOGS REPLACE($0, '"{', '{');" +
"REPL2 = foreach REPL1 REPLACE($0, '}"', '}');"
"STORE REPL2 INTO 'outputlocation';"
It throws an error: Unexpected token '{' in expression or statement.
So based on an answer here, I tried:
"REPL1 = foreach LOGS REPLACE($0, '"\\{', '\\{');"
Now it gives a different error: Unexpected token '\\' in expression or statement.
Any help is sincerely appreciated.
Thanks
Answer 1:
Works for me:
REPL1 = FOREACH LOGS GENERATE REPLACE($0, '"\\{', '\\{');
Your code is missing the GENERATE keyword, and the double quotes at the beginning and end of each statement should not be there.
Answer 2:
Please try the code below.
LOGS = load 'inputlocation' as unparsedString:chararray;
REPL1 = foreach LOGS generate REPLACE($0, '"\\{', '\\{');
REPL2 = foreach REPL1 generate REPLACE($0, '}"', '}');
STORE REPL2 INTO 'outputlocation';
Hopefully this works for you.
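To sanity-check what the two replacements are meant to achieve, here is a minimal sketch in Python rather than Pig, run against the sample record from the question. It assumes Python's `re` module is a close enough stand-in for the regex handling in Pig's REPLACE for these simple patterns. The helper name `unwrap_embedded_object` is hypothetical, and the third step (unescaping the inner `\"`) is an extra step that is not part of the Pig script above; the sample record appears to need it to reach the requested format.

```python
import json
import re

# Sample record from the question: the Quux value is a JSON object
# that has been serialized as an escaped string.
raw = r'{"Foo":"ABC","Bar":"20090101100000","Quux":"{\"QuuxId\":1234,\"QuuxName\":\"Sam\"}"}'


def unwrap_embedded_object(line: str) -> str:
    """Hypothetical helper mirroring the REPLACE steps on one line."""
    # Step 1: drop the quote that opens the embedded object ("{ becomes {).
    line = re.sub(r'"\{', '{', line)
    # Step 2: drop the quote that closes it (}" becomes }).
    line = re.sub(r'\}"', '}', line)
    # Step 3 (assumption, not in the Pig script): unescape inner quotes.
    line = line.replace('\\"', '"')
    return line


fixed = unwrap_embedded_object(raw)
print(fixed)

# The result should now parse as nested JSON.
parsed = json.loads(fixed)
print(parsed["Quux"]["QuuxName"])
```

If the backslashes also need to go in Pig, a third REPLACE would presumably be required; the exact escaping inside a Pig string literal is not shown here because it has not been verified.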
Answer 3:
Load the data using a comma as the delimiter, as shown below:
sam = load 'sampledata' using PigStorage(',');
sam1 = foreach sam generate $0,$1,CONCAT(REPLACE($2,'([^A-Za-z0-9:"{]+)',''),REPLACE($3,'([^A-Za-z0-9:"}]+)',''));
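This produces the output below. Note that the quotes around the Quux object are still present and the comma between its two fields is dropped, so it does not exactly match the requested format.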
({"Foo":"ABC","Bar":"20090101100000","Quux":"{"QuuxId":1234"QuuxName":"Sam"}"})
Source: https://stackoverflow.com/questions/31470995/replace-character-in-pig