I have a .CSV file which has few records with numbers in them which are enclosed in double quotes (such as in "455,365.44") and commas in between the quotes. I need to remove the comma from the numeric values("455,365.44" should look like 455365.44 after processing) of the records so I could use them in the further processing of the file.
Here is a example of the file
column 1, column 2, column 3, column 4, column 5, column 6, column 7
12,"455,365.44","string with quotes, and with a comma in between","4,432",6787,890,88
432,"222,267.87","another, string with quotes, and with two comma in between","1,890",88,12,455
11,"4,324,653.22","simple string",77,777,333,22
and I need the result look like:
column 1, column 2, column 3, column 4, column 5, column 6, column 7
12,455365.44,"string with quotes, and with a comma in between",4432,6787,890,88
432,222267.87,"another, string with quotes, and with two comma in between",1890,88,12,455
11,4324653.22,"simple string",77,777,333,22
P.S: I need only the values which are numeric to be converted like this and the string values should remain same.
Please help...
To remove the quotes (replace the number with the quotes with the number without them):
s/"(\d[\d.,]*)"/\1/g
See on rubular
For the commas I could only think of a lookahead and lookbehind, if thats supported by your regex implementation (replace commas with nothing if before and after is a number within quotes):
s/(?<="[\d,]+),(?=[\d,.]+")//g
You would have to execute this before removing the quotes.
It might also work without lookbehind:
s/,(?=[\d,.]*\d")//g
See on rubular
In a shell script you might want use perl e.g. execute:
cat test.csv | perl -p -e 's/,(?=[\d,.]*\d")//g and s/"(\d[\d,.]*)"/\1/g'
Explanation of the regex:
first execute:
s/,(?=[\d,.]*\d")//g
This will remove all commas that are followed by a number ([\d,.]*\d
) and a quote, thus removing only commas from numbers within quotes
next execute
s/"(\d[\d,.]*)"/\1/g
This will replace all numbers that are within quotes by the value without the quotes
来源:https://stackoverflow.com/questions/18624784/remove-double-quotes-and-comma-from-a-numeric-value-of-a-csv-file