Python import csv to list

后悔当初 2020-11-22 06:15

I have a CSV file with about 2000 records.

Each record has a string and a category associated with it:

This is the firs         


        
13 answers
  • 2020-11-22 06:38

    Pandas is pretty good at dealing with data. Here is one example of how to use it:

    import pandas as pd
    
    # Read the CSV into a pandas data frame (df)
    #   With a df you can do many things
    #   most important: visualize data with Seaborn
    df = pd.read_csv('filename.csv', delimiter=',')
    
    # Or export it in many ways, e.g. a list of tuples
    tuples = [tuple(x) for x in df.values]
    
    # or export it as a list of dicts (one dict per row)
    dicts = df.to_dict('records')
    

    One big advantage is that pandas deals automatically with header rows.

    If you haven't heard of Seaborn, I recommend having a look at it.

    See also: How do I read and write CSV files with Python?

    Pandas #2

    import pandas as pd
    
    # Get example data (mpu is a third-party package);
    # for a real file, use pd.read_csv('filename.csv') instead
    import mpu.pd
    df = mpu.pd.example_df()
    
    # Convert
    dicts = df.to_dict('records')
    

    The content of df is:

         country   population population_time    EUR
    0    Germany   82521653.0      2016-12-01   True
    1     France   66991000.0      2017-01-01   True
    2  Indonesia  255461700.0      2017-01-01  False
    3    Ireland    4761865.0             NaT   True
    4      Spain   46549045.0      2017-06-01   True
    5    Vatican          NaN             NaT   True
    

    The content of dicts is:

    [{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
     {'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
     {'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
     {'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
     {'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
     {'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]
    

    Pandas #3

    import pandas as pd
    
    # Get example data (mpu is a third-party package);
    # for a real file, use pd.read_csv('filename.csv') instead
    import mpu.pd
    df = mpu.pd.example_df()
    
    # Convert
    lists = [[row[col] for col in df.columns] for row in df.to_dict('records')]
    

    The content of lists is:

    [['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
     ['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
     ['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
     ['Ireland', 4761865.0, NaT, True],
     ['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
     ['Vatican', nan, NaT, True]]
    
  • 2020-11-22 06:42

    Using the csv module:

    import csv
    
    with open('file.csv', newline='') as f:
        reader = csv.reader(f)
        data = list(reader)
    
    print(data)
    

    Output:

    [['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]
    

    If you need tuples:

    import csv
    
    with open('file.csv', newline='') as f:
        reader = csv.reader(f)
        data = [tuple(row) for row in reader]
    
    print(data)
    

    Output:

    [('This is the first line', 'Line1'), ('This is the second line', 'Line2'), ('This is the third line', 'Line3')]
    

    Old Python 2 answer, also using the csv module:

    import csv
    with open('file.csv', 'rb') as f:
        reader = csv.reader(f)
        your_list = list(reader)
    
    print your_list
    # [['This is the first line', 'Line1'],
    #  ['This is the second line', 'Line2'],
    #  ['This is the third line', 'Line3']]
    
  • 2020-11-22 06:43

    Here is a simple way in Python 3.x to import a CSV into a multidimensional array (a list of lists). It's only a few lines of code and needs no imports. Note that it splits on every comma, so it does not handle quoted fields that contain commas.

    # Pull a CSV into a multidimensional array without importing anything
    
    L = []                              # Create an empty list for the main array
    with open('log.txt') as f:          # Open the file; it is closed automatically
        for line in f:                  # Read the file line by line
            x = line.rstrip()           # Strip the \n (and trailing whitespace)
            L.append(x.split(','))      # Split each line into a list and add it
                                        # to the multidimensional array
    print(L)
    
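To illustrate the limitation of the line-splitting approach above, here is a minimal sketch comparing it with csv.reader. The sample text is made up for illustration (it is not the OP's file), and io.StringIO stands in for an opened file.

```python
import csv
import io

# Hypothetical sample: the first field contains a comma inside quotes.
sample = '"Hello, world",Line1\nPlain text,Line2\n'

# Naive approach: split every line on commas.
naive = [line.rstrip("\n").split(",") for line in io.StringIO(sample)]

# csv.reader applies the CSV quoting rules.
parsed = list(csv.reader(io.StringIO(sample)))

print(naive[0])   # ['"Hello', ' world"', 'Line1']
print(parsed[0])  # ['Hello, world', 'Line1']
```

If the data never contains quoted fields, both approaches give the same result. Otherwise the csv module is the safer choice.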
  • 2020-11-22 06:49

    As already mentioned in the comments, you can use the csv library in Python. CSV stands for comma-separated values, which seems to be exactly your case: a label and a value separated by a comma.

    Since each record is a category and a value, I would rather use a dictionary than a list of tuples.

    Anyway, the code below shows both ways: d is the dictionary and l is the list of tuples.

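    import csv
    
    file_name = "test.txt"
    d = {}  # category -> string (a repeated category overwrites the earlier entry)
    l = []  # list of (string, category) tuples
    try:
        # The with block closes the file even if reading fails
        with open(file_name, newline='') as csvfile:
            for row in csv.reader(csvfile, delimiter=","):
                d[row[1]] = row[0]
                l.append((row[0], row[1]))
    except FileNotFoundError:
        print("File not found")
    print(d)
    print(l)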
    
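One caveat with a dictionary keyed by category: if several records share a category, only the last one survives. A possible variation, sketched here under the assumption that categories repeat (the question does not say either way), collects all strings per category with collections.defaultdict. The sample rows are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical data: two strings share the category "Line1".
sample = "first string,Line1\nsecond string,Line2\nthird string,Line1\n"

by_category = defaultdict(list)
for text, category in csv.reader(io.StringIO(sample)):
    # Append instead of assigning, so repeated categories are kept.
    by_category[category].append(text)

print(dict(by_category))
# {'Line1': ['first string', 'third string'], 'Line2': ['second string']}
```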
  • 2020-11-22 06:51

    Unfortunately I find none of the existing answers particularly satisfying.

    Here is a straightforward and complete Python 3 solution, using the csv module.

    import csv
    
    with open('../resources/temp_in.csv', newline='') as f:
        reader = csv.reader(f, skipinitialspace=True)
        rows = list(reader)
    
    print(rows)
    

    Notice the skipinitialspace=True argument. This is necessary since, unfortunately, OP's CSV contains whitespace after each comma.

    Output:

    [['This is the first line', 'Line1'], ['This is the second line', 'Line2'], ['This is the third line', 'Line3']]
    
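A small sketch of what skipinitialspace changes. The sample text is an assumption modelled on the output shown above, with a space after each comma, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Assumed input: a space follows each comma.
sample = "This is the first line, Line1\nThis is the second line, Line2\n"

default_rows = list(csv.reader(io.StringIO(sample)))
stripped_rows = list(csv.reader(io.StringIO(sample), skipinitialspace=True))

print(default_rows[0])   # ['This is the first line', ' Line1']
print(stripped_rows[0])  # ['This is the first line', 'Line1']
```

Without the argument, the leading space becomes part of the second field.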
  • 2020-11-22 06:52

    The following code uses the csv module but extracts the contents of file.csv into a list of dicts, using the first line as the header of the CSV table.

    import csv
    
    def csv2dicts(filename):
        """Return the rows of a CSV file as a list of dicts keyed by the header row.
    
        Returns None if there is no data row, the header is empty,
        or a row's length differs from the header's.
        """
        with open(filename, newline='') as f:
            lines = list(csv.reader(f))
        if len(lines) < 2:
            return None
        names = lines[0]
        if len(names) < 1:
            return None
        dicts = []
        for values in lines[1:]:
            if len(values) != len(names):
                return None
            # Pair each header name with the value in the same column
            dicts.append(dict(zip(names, values)))
        return dicts
    
    if __name__ == '__main__':
        your_list = csv2dicts('file.csv')
        print(your_list)
    
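For comparison, the standard library's csv.DictReader performs a similar header-to-dict mapping. This is a minimal sketch with made-up column names (text, category), not a drop-in replacement: unlike csv2dicts, DictReader does not reject rows whose length differs from the header. It fills missing values with restval and collects extra ones under restkey.

```python
import csv
import io

# Hypothetical file contents with a header row.
sample = (
    "text,category\n"
    "This is the first line,Line1\n"
    "This is the second line,Line2\n"
)

# Each row becomes a mapping from header name to value.
rows = [dict(row) for row in csv.DictReader(io.StringIO(sample))]

print(rows)
```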