Question
How can I delete columns from a CSV file in which some values are double-quoted strings that contain commas? I have a file 44.csv with 4 lines, including the header, in the format below:
column1, column2, column3, column 4, column5, column6
12,455,"string with quotes, and with a comma in between",4432,6787,890,88
4432,6787,"another, string with quotes, and with two comma in between",890,88,12,455
11,22,"simple string",77,777,333,22
I need to keep only columns 1, 2 and 3 of the file, so I used the cut command as below:
cut -d"," -f1,2,3 44.csv > 444.csv
The output I get is:
column1, column2, column3
12,455,"string with quotes
4432,6787,"another
11,22,"simple string"
But I need the output to be:
column1, column2, column3
12,455,"string with quotes, and with a comma in between"
4432,6787,"another, string with quotes, and with two comma in between"
11,22,"simple string"
Any help is greatly appreciated.
Thanks Dhruuv.
Answer 1:
With GNU awk version 4 or later, you can use FPAT to describe what a field looks like, rather than what separates the fields.
gawk '{print $1, $2, $3}' FPAT="([^,]+)|(\"[^\"]+\")" OFS="," 44.csv
Test:
$ gawk '{print $1, $2, $3}' FPAT="([^,]+)|(\"[^\"]+\")" OFS="," 44.csv
column1, column2, column3
12,455,"string with quotes, and with a comma in between"
4432,6787,"another, string with quotes, and with two comma in between"
11,22,"simple string"
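To make the field-by-content idea concrete, here is a minimal sketch of the same pattern in Python rather than gawk. The function name first_fields and the sample line are illustrative assumptions, not part of the answer. One difference is worth noting: Python's re module takes the first alternative that matches instead of the longest match, so the quoted alternative has to be listed first.

```python
import re

# Quoted alternative first: Python's re stops at the first alternative that
# matches, so the order is swapped relative to the FPAT string above.
FIELD = re.compile(r'"[^"]+"|[^,]+')

def first_fields(line, count=3):
    """Return the first `count` fields of one CSV line, rejoined with commas."""
    return ",".join(FIELD.findall(line)[:count])

sample = '12,455,"string with quotes, and with a comma in between",4432,6787,890,88'
print(first_fields(sample))
```

Like the FPAT pattern it mirrors, this sketch requires at least one character per field, so empty fields (as in a,,b) are skipped, and quotes escaped inside a quoted field are not handled.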
Answer 2:
I had the same issue as you, Dhruuv. The solution proposed by jaypal singh is correct, but it did not work for all my cases. I recommend csvquote: https://github.com/dbro/csvquote (it enables common Unix utilities like cut, head and tail to work correctly with CSV data containing delimiters and newlines). This worked for me.
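The general idea behind this kind of tool can be sketched in a few lines of Python: temporarily replace the commas inside quoted fields with a placeholder character, split naively on commas, then put the commas back. This is only an illustration of the approach, not csvquote's actual code. The placeholder character and all function names are assumptions, and the sketch handles single-line records only.

```python
import re

PLACEHOLDER = "\x1f"  # assumed stand-in character, unlikely to occur in text data

def sanitize(line):
    """Replace commas inside double-quoted fields with PLACEHOLDER."""
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace(",", PLACEHOLDER), line)

def restore(line):
    """Put the original commas back."""
    return line.replace(PLACEHOLDER, ",")

def cut_fields(line, count=3):
    """Naive comma split, similar in spirit to cut -d"," -f1-3."""
    return ",".join(line.split(",")[:count])

sample = '4432,6787,"another, string with quotes, and with two comma in between",890,88,12,455'
print(restore(cut_fields(sanitize(sample))))
```

Because the quoted commas are hidden during the split, the naive splitter in the middle needs no knowledge of CSV quoting, which is why the same trick works with cut, head or tail.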
Answer 3:
You can probably do it with cut in this special case by using " as your delimiter, but I'd strongly advise against it. Even if you could make it work in this case, you might later get a string with an escaped double quote in it, e.g. \", which would fool that too. Or more of your columns might be quoted (which is a perfectly valid CSV-ism).
A smarter tool is required! The simplest to obtain might well be Perl and the Text::CSV module. You've almost certainly got Perl installed, and depending on your environment, installing Text::CSV as a package, with CPAN.pm, or with cpanminus ought to be straightforward.
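The answer names Perl's Text::CSV. As a sketch of the same "use a real CSV parser" approach, here is an equivalent using Python's standard csv module instead (a swapped-in tool, not the one the answer recommends). The function name keep_columns and the inline sample data are assumptions for illustration.

```python
import csv
import io

def keep_columns(text, count=3):
    """Parse CSV text with a real parser and keep the first `count` columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow(row[:count])
    return out.getvalue()

data = (
    '12,455,"string with quotes, and with a comma in between",4432,6787,890,88\n'
    '11,22,"simple string",77,777,333,22\n'
)
print(keep_columns(data), end="")
```

With the writer's default settings, a field is quoted on output only when it needs to be, so "simple string" comes back without its quotes while the field containing a comma stays quoted. For a real file, the StringIO objects would be replaced by files opened with newline="".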
Source: https://stackoverflow.com/questions/17199311/how-to-delete-a-column-columns-of-a-csv-file-which-has-cell-values-with-a-string