How do I extract a column from a CSV file with quoted commas, using the shell?

清酒与你 2021-01-26 09:19

I have a CSV file, but unlike in related questions, it has some columns containing double-quoted strings with commas, e.g.

foo,bar,baz,quux
11,"first line, seco
1 Answer
  • 2021-01-26 09:53

    Here is a quick-and-dirty Python csvcut. Python's csv module already handles quoting and the various CSV dialects, so all you need is a thin wrapper around it.
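To illustrate why the csv module is worth using here, the following minimal sketch compares naive comma splitting with csv.reader. The sample data is hypothetical, made up for this illustration only.

```python
import csv
import io

# Hypothetical sample data: the second field contains a quoted comma.
sample = 'foo,bar,baz\n11,"first, second",33\n'

# Naive splitting cuts the quoted field in two and keeps the quote characters.
naive = sample.splitlines()[1].split(",")

# csv.reader keeps the quoted field together and strips the quotes.
parsed = list(csv.reader(io.StringIO(sample)))[1]

print(naive)
print(parsed)
```

The naive split yields four pieces for a three-column row, while csv.reader returns the three fields intact.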

    The first argument to csvcut is the 1-based index of the field you wish to extract, for example

    csvcut 3 sample.csv
    

    to extract the third column from the (possibly quoted) CSV file sample.csv.

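    #!/usr/bin/env python3

    import csv
    import sys

    writer = csv.writer(sys.stdout)
    # The command line uses 1-based column numbers; Python indexing is zero-based
    col = int(sys.argv[1]) - 1
    for filename in sys.argv[2:]:
        # newline='' lets the csv module handle line breaks inside quoted fields
        with open(filename, newline='') as handle:
            for row in csv.reader(handle):
                # writerow() expects a sequence of fields, not a bare string
                writer.writerow([row[col]])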
    

    To do: error handling, extraction of multiple columns. (Not hard per se; use row[2:5] to extract columns 3, 4, and 5; but I'm too lazy to write a proper command-line argument parser.)
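One possible way to tackle the multiple-column part of the to-do list is sketched below. The helper names parse_columns and extract, the column spec syntax (such as 1,3-5), and the sample data are all assumptions made for this illustration; they are not part of the script above. Short rows are padded with empty strings rather than raising an error, which is only one of several reasonable choices.

```python
import csv
import io
import sys


def parse_columns(spec):
    """Turn a 1-based spec such as '1,3-5' into zero-based indices [0, 2, 3, 4].

    Hypothetical helper; it does not validate the spec.
    """
    indices = []
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-", 1)
            # The range is inclusive on both ends in the 1-based spec.
            indices.extend(range(int(start) - 1, int(end)))
        else:
            indices.append(int(part) - 1)
    return indices


def extract(rows, indices):
    """Yield the selected fields of each row; missing fields become ''."""
    for row in rows:
        yield [row[i] if i < len(row) else "" for i in indices]


# Hypothetical sample data, for illustration only.
sample = 'foo,bar,baz,quux\n11,"first, second",33,44\n'

writer = csv.writer(sys.stdout)
writer.writerows(extract(csv.reader(io.StringIO(sample)), parse_columns("2-3")))
```

In the csvcut script, sys.argv[1] would be passed to parse_columns and each opened file handed to csv.reader in place of the in-memory sample. Because the output goes through csv.writer, a field that contains a comma is quoted again on the way out.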
