问题
I have a .CSV file which has few records with numbers in them which are enclosed in double quotes (such as in "455,365.44") and commas in between the quotes. I need to remove the comma from the numeric values("455,365.44" should look like 455365.44 after processing) of the records so I could use them in the further processing of the file.
Here is a example of the file
column 1, column 2, column 3, column 4, column 5, column 6, column 7
12,"455,365.44","string with quotes, and with a comma in between","4,432",6787,890,88
432,"222,267.87","another, string with quotes, and with two comma in between","1,890",88,12,455
11,"4,324,653.22","simple string",77,777,333,22
and I need the result look like:
column 1, column 2, column 3, column 4, column 5, column 6, column 7
12,455365.44,"string with quotes, and with a comma in between",4432,6787,890,88
432,222267.87,"another, string with quotes, and with two comma in between",1890,88,12,455
11,4324653.22,"simple string",77,777,333,22
P.S: I need only the values which are numeric to be converted like this and the string values should remain same.
Please help...
回答1:
To remove the quotes (replace the number with the quotes with the number without them):
s/"(\d[\d.,]*)"/\1/g
See on rubular
For the commas I could only think of a lookahead and lookbehind, if thats supported by your regex implementation (replace commas with nothing if before and after is a number within quotes):
s/(?<="[\d,]+),(?=[\d,.]+")//g
You would have to execute this before removing the quotes.
It might also work without lookbehind:
s/,(?=[\d,.]*\d")//g
See on rubular
In a shell script you might want use perl e.g. execute:
cat test.csv | perl -p -e 's/,(?=[\d,.]*\d")//g and s/"(\d[\d,.]*)"/\1/g'
Explanation of the regex:
first execute:
s/,(?=[\d,.]*\d")//g
This will remove all commas that are followed by a number ([\d,.]*\d
) and a quote, thus removing only commas from numbers within quotes
next execute
s/"(\d[\d,.]*)"/\1/g
This will replace all numbers that are within quotes by the value without the quotes
来源:https://stackoverflow.com/questions/18624784/remove-double-quotes-and-comma-from-a-numeric-value-of-a-csv-file