How can I delete columns from a CSV file when some values are strings enclosed in double quotes that contain commas? I have a file 44.csv with 4 lines, including the header, in the format below:
column1, column2, column3, column 4, column5, column6
12,455,"string with quotes, and with a comma in between",4432,6787,890,88
4432,6787,"another, string with quotes, and with two comma in between",890,88,12,455
11,22,"simple string",77,777,333,22
I need to keep only columns 1, 2 and 3 of the file, so I used the cut command as below:
cut -d"," -f1,2,3 44.csv > 444.csv
I am getting the output as
column1, column2, column3
12,455,"string with quotes
4432,6787,"another
11,22,"simple string"
But I need the output to be
column1, column2, column3
12,455,"string with quotes, and with a comma in between"
4432,6787,"another, string with quotes, and with two comma in between"
11,22,"simple string"
Any help is greatly appreciated.
Thanks Dhruuv.
With GNU awk version 4 or later, you can use FPAT to define the field patterns.
gawk '{print $1, $2, $3}' FPAT="([^,]+)|(\"[^\"]+\")" OFS="," 44.csv
Test:
$ gawk '{print $1, $2, $3}' FPAT="([^,]+)|(\"[^\"]+\")" OFS="," mycsv.csv
column1, column2, column3
12,455,"string with quotes, and with a comma in between"
4432,6787,"another, string with quotes, and with two comma in between"
11,22,"simple string"
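For anyone without gawk 4, here is a rough Python sketch of the same "describe the field, not the separator" idea. It is not a translation of gawk's internals: the function name first_fields is made up for illustration, and the quoted alternative is listed first because Python's re module tries alternatives in order rather than picking the longest match. Like the FPAT pattern above, it skips empty fields and does not understand escaped quotes.

```python
import re

# A field is either a double-quoted string or a run of non-comma characters.
# The quoted alternative must come first: Python's re takes the first
# alternative that matches, not the longest one.
FIELD = re.compile(r'"[^"]+"|[^,]+')

def first_fields(line, count=3):
    """Return the first `count` fields of one CSV line, joined by commas.

    Hypothetical helper for illustration; empty fields are skipped,
    just as they are with the FPAT pattern in the answer.
    """
    return ",".join(FIELD.findall(line)[:count])

if __name__ == "__main__":
    sample = '12,455,"string with quotes, and with a comma in between",4432,6787,890,88'
    print(first_fields(sample))
```

If your data can contain empty fields or quotes inside quoted strings, a real CSV parser is a safer choice than any pattern like this.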
I had the same issue as you, Dhruuv. The solution proposed by jaypal singh is correct, but it didn't work for all my cases. I recommend csvquote (https://github.com/dbro/csvquote), which enables common Unix utilities like cut, head and tail to work correctly with CSV data containing delimiters and newlines. This worked for me.
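To give an idea of why that approach works, here is a small Python sketch of the general trick: temporarily swap the commas that sit inside quotes for a character that does not occur in the data, split on the remaining commas, then swap them back. This is only my illustration of the idea, not csvquote's actual code; the names mask, unmask and cut_fields are hypothetical, the "\x1f" stand-in character is an assumption, and unlike csvquote this sketch does not handle newlines inside quoted fields.

```python
SUB = "\x1f"  # stand-in character; assumed never to appear in the data

def mask(line):
    """Replace commas inside double quotes with SUB (illustrative helper)."""
    out = []
    in_quotes = False
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes  # entering or leaving a quoted field
        out.append(SUB if (ch == "," and in_quotes) else ch)
    return "".join(out)

def unmask(line):
    """Restore the commas that mask() hid."""
    return line.replace(SUB, ",")

def cut_fields(line, count=3):
    """Rough equivalent of `cut -d, -f1-count` on a masked line."""
    return unmask(",".join(mask(line).split(",")[:count]))

if __name__ == "__main__":
    row = '4432,6787,"another, string with quotes, and with two comma in between",890,88,12,455'
    print(cut_fields(row))
```

Once the quoted commas are hidden, a plain split on "," is safe, which is why cut, head and tail can be used unchanged in the middle of such a pipeline.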
You can probably do it with cut in this special case, by using " as your delimiter, but I'd strongly advise against it -- even if you could make it work in this case, you might later get a string with an escaped double quote in it, e.g. \", which would fool that too. Or, more of your columns might be quoted (which is a perfectly valid CSV-ism).
A smarter tool is required! The simplest to obtain might well be Perl and the Text::CSV module -- you've almost certainly got Perl installed, and depending on your environment installing Text::CSV as a package, with CPAN.pm, or with cpanminus ought to be straightforward.
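If Perl is not an option, the same "use a real CSV parser" approach can be sketched with Python's standard csv module. To be clear, this swaps in Python for Text::CSV and is not the Perl code the answer has in mind; the function name keep_columns and its zero-based indexes parameter are made up for the example.

```python
import csv
import io

def keep_columns(text, indexes=(0, 1, 2)):
    """Return CSV text containing only the columns at `indexes` (zero-based).

    Hypothetical helper: parses with csv.reader, so commas inside
    double-quoted fields do not split the field.
    """
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([row[i] for i in indexes])
    return out.getvalue()

if __name__ == "__main__":
    data = (
        '12,455,"string with quotes, and with a comma in between",4432,6787,890,88\n'
        '11,22,"simple string",77,777,333,22\n'
    )
    print(keep_columns(data), end="")
```

One thing to be aware of: with its default settings the writer only quotes fields that need it, so "simple string" comes back without its quotes. The result is still equivalent CSV, but it is not byte-for-byte the output asked for in the question.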
Source: https://stackoverflow.com/questions/17199311/how-to-delete-a-column-columns-of-a-csv-file-which-has-cell-values-with-a-string