Getting “newline inside string” while reading the csv file in Python?

时光说笑 2021-01-02 13:14

I have this utils.py file in a Django project:

def range_data(ip):
    r = []
    f = open(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ', 

3 Answers
  • 2021-01-02 13:51

    I had a similar problem earlier today: a closing quote was missing from one line. The solution was to instruct the reader to perform no special processing of quote characters (quoting=csv.QUOTE_NONE).
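
A minimal sketch of that setting, assuming Python 3 and using an in-memory sample instead of the real file. The sample rows and the helper name `read_without_quoting` are made up for illustration. With `quoting=csv.QUOTE_NONE` the reader treats `"` as an ordinary character, so the quote characters stay in the field values.

```python
import csv
import io

# Hypothetical sample: the first row is missing the closing quote after AU.
SAMPLE = '"1.0.0.0","1.0.0.255","AU\n"1.0.1.0","1.0.3.255","CN"\n'

def read_without_quoting(text):
    """Parse CSV text while treating quote characters as plain data."""
    # newline='' keeps the line endings untouched for the csv module.
    stream = io.StringIO(text, newline='')
    return list(csv.reader(stream, quoting=csv.QUOTE_NONE))

rows = read_without_quoting(SAMPLE)
for row in rows:
    print(row)
```

The trade-off is that a comma inside a quoted field would now split that field, so this fits files whose fields never contain the delimiter.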

  • 2021-01-02 13:59
    1. You can preprocess the CSV file by converting its \r\n line endings to \n, as shown below.

      import csv
      
      content = open("GeoIPCountryWhois.csv", "r").read().replace('\r\n','\n')
      
      with open("GeoIPCountryWhois2.csv", "w") as g:
          g.write(content)
      

      Then use GeoIPCountryWhois2.csv with the csv reader.

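    2. A wild guess: passing a lineterminator may solve your problem. Note, however, that the csv documentation states that the reader ignores lineterminator and always recognises '\r' or '\n' as end-of-line, so this may have no effect.

      for num, row in enumerate(csv.reader(f, lineterminator='\n')):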
      

      See also: http://docs.python.org/lib/csv-fmt-params.html

  • 2021-01-02 14:02

    You must open your files as binary:

    def range_data(ip):
        r = []
        f = open(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ', 
                              'GeoIPCountryWhois.csv'), 'rb')
        for num,row in enumerate(csv.reader(f)):
            # Your things.
    

    Note the 'rb' mode there; otherwise the file could be opened with native line endings, and the CSV reader doesn't handle the various forms very well. Certainly the copy of GeoIPCountryWhois.csv that I downloaded has clean \n line endings.

    This is documented for the .reader() method:

    If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
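
The 'rb' advice applies to Python 2. On Python 3, csv.reader expects text rather than bytes, and the csv documentation recommends opening the file with `newline=''` instead. A minimal sketch under that assumption follows; the function name `read_rows`, the UTF-8 encoding, and the throwaway demo file are hypothetical and stand in for the real GeoIPCountryWhois.csv path.

```python
import csv
import os
import tempfile

def read_rows(path):
    """Read every CSV row from path, leaving newline handling to csv."""
    # newline='' disables newline translation, so the csv module sees the
    # raw line endings, including any inside quoted fields.
    with open(path, newline='', encoding='utf-8') as f:
        return list(csv.reader(f))

# Demo with a temporary file that uses Windows-style \r\n line endings.
with tempfile.NamedTemporaryFile('w', suffix='.csv', newline='',
                                 delete=False, encoding='utf-8') as tmp:
    tmp.write('"1.0.0.0","1.0.0.255","AU"\r\n"1.0.1.0","1.0.3.255","CN"\r\n')
    demo_path = tmp.name

demo_rows = read_rows(demo_path)
os.remove(demo_path)
print(demo_rows)
```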

    If, however, your csv file is so corrupted as to still contain unexpected newline characters in unexpected places, use this file subclass instead as a stop-gap measure:

    class CleanlinesFile(file):
        def next(self):
            line = super(CleanlinesFile, self).next()
            return line.replace('\r', '').replace('\n', '') + '\n'
    

    This class guarantees there will be no newlines anywhere in the returned results except as the very last character (just the way the csv module wants it). Use it instead of the open call; the 'rb' mode modifier becomes optional in this case:

    def range_data(ip):
        r = []
        f = CleanlinesFile(os.path.join(settings.PROJECT_ROOT, 'static', 'csv ', 
                              'GeoIPCountryWhois.csv'))
        for num,row in enumerate(csv.reader(f)):
            # Your things.
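
The `file` built-in that CleanlinesFile subclasses exists only in Python 2. A rough Python 3 counterpart, offered as a sketch rather than a drop-in replacement, is a generator that wraps any iterable of lines, since csv.reader accepts any iterable that yields strings. The name `clean_lines` and the sample data are hypothetical.

```python
import csv

def clean_lines(lines):
    """Strip carriage returns and newlines from each line, then append one newline."""
    for line in lines:
        yield line.replace('\r', '').replace('\n', '') + '\n'

# Hypothetical sample: the first line contains a stray carriage return
# and the second ends with \r\n.
raw_lines = ['"1.0.0.0","1.0.\r0.255","AU"\n',
             '"1.0.1.0","1.0.3.255","CN"\r\n']

cleaned_rows = list(csv.reader(clean_lines(raw_lines)))
print(cleaned_rows)
```

An open text file can be passed to `clean_lines` in place of the list, because file objects also iterate line by line.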
    