Removing in-field quotes in csv file

心在旅途 2021-01-29 04:43

Let's say we have a comma-separated (CSV) file like this:

"name of movie","starring","director","release year"
"dark knight rises","christian bale,


        
3 answers
  • 2021-01-29 04:47

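    With awk you can do something like this (note that it treats every comma as a field separator, so it only works when no field contains a comma):

    awk -v Q='"' '{ gsub(/["\047]/, ""); gsub(/,/, Q "," Q); print Q $0 Q }' fixquote.csv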
    
  • 2021-01-29 04:56

    How about

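    import csv

    def remove_quotes(s):
        # Drop every single and double quote character from a field.
        return ''.join(c for c in s if c not in ('"', "'"))

    # newline="" lets the csv module handle line endings itself (Python 3).
    with open("fixquote.csv", newline="") as infile, open("fixed.csv", "w", newline="") as outfile:
        reader = csv.reader(infile)
        # QUOTE_ALL wraps every output field in double quotes.
        writer = csv.writer(outfile, quoting=csv.QUOTE_ALL)
        for line in reader:
            writer.writerow([remove_quotes(elem) for elem in line])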
    

    which produces

    ~/coding$ cat fixed.csv 
    "name of movie","starring","director","release year"
    "dark knight rises","christian bale, anna hathaway","christopher nolan","2012"
    "the dark knight","christian bale, heath ledger","christopher nolan","2008"
    "The day when earth stood still","Michael Rennie,the strong man","robert wise","1951"
    "the gladiator","russel the awesome crowe","ridley scott","2000"
    

    BTW, you might want to check the spelling of some of those names.

  • 2021-01-29 04:59

    Split the values into an array. Iterate through the array removing any quotes, other than the first and last character. Hope it helps.
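
    A minimal sketch of that idea in Python, under some assumptions that are not in the original post: every field is wrapped in double quotes, and a comma separates two fields only when it sits directly between a closing and an opening quote. The names `strip_inner_quotes` and `fix_line` and the sample line are hypothetical, chosen for illustration.

```python
import re


def strip_inner_quotes(field):
    """Remove quote characters from a field, keeping its first and last character."""
    if len(field) < 2:
        return field
    inner = field[1:-1].replace('"', "").replace("'", "")
    return field[0] + inner + field[-1]


def fix_line(line):
    # Assumption: a field boundary is a comma with a double quote on both sides,
    # so commas inside a field (e.g. "christian bale, anna hathaway") are kept.
    fields = re.split(r'(?<="),(?=")', line.rstrip("\n"))
    return ",".join(strip_inner_quotes(f) for f in fields)


if __name__ == "__main__":
    # Hypothetical input line with stray quotes inside the second field.
    sample = '"the gladiator","russel "the awesome" crowe","ridley scott","2000"'
    print(fix_line(sample))
```

    This breaks if a stray quote sits right next to an in-field comma, because that comma then looks like a field boundary. For input like that, a parser-based approach may be the safer choice.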
