Python import csv to list

后悔当初 2020-11-22 06:15

I have a CSV file with about 2000 records.

Each record has a string and a category assigned to it:

This is the firs         


        
13 answers
  •  悲&欢浪女
    2020-11-22 06:38

    Pandas is pretty good at dealing with tabular data. Here is one example of how to use it:

    import pandas as pd
    
    # Read the CSV into a pandas DataFrame (df).
    #   With a df you can do many things,
    #   most importantly: visualize data with Seaborn.
    df = pd.read_csv('filename.csv', delimiter=',')
    
    # Or export it in many ways, e.g. as a list of tuples (one per row)
    tuples = [tuple(x) for x in df.values]
    
    # or export it as a list of dicts (one per row, keyed by column name)
    dicts = df.to_dict('records')
    

    One big advantage is that pandas deals automatically with header rows.
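To make the header handling concrete, here is a minimal sketch. The CSV text, its column names (`text`, `category`), and the sample values are hypothetical and held in memory only so the snippet is self-contained; with a real file you would pass the path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical two-column CSV with a header row (illustration only).
csv_text = "text,category\nfirst sentence,cat_a\nsecond sentence,cat_b\n"

# By default read_csv treats the first line as the header.
df = pd.read_csv(io.StringIO(csv_text))

# The header row became the column labels; it is not part of the data.
columns = list(df.columns)

# Each data row as a plain tuple, without the index.
rows = list(df.itertuples(index=False, name=None))

print(columns)
print(rows)
```

If the file has no header row, passing `header=None` to `pd.read_csv` keeps the first line as data.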

    If you haven't heard of Seaborn, I recommend having a look at it.

    See also: How do I read and write CSV files with Python?
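For comparison, the same list of tuples can be built without pandas using the standard library `csv` module. This is only a sketch under the same assumptions as before: the in-memory CSV text and its column names are hypothetical, and with the standard library the header row has to be skipped by hand.

```python
import csv
import io

# Hypothetical two-column CSV with a header row (illustration only).
csv_text = "text,category\nfirst sentence,cat_a\nsecond sentence,cat_b\n"

reader = csv.reader(io.StringIO(csv_text))

# Unlike pandas, csv.reader does not treat the first row specially,
# so pull the header off explicitly.
header = next(reader)

# Every remaining row is a list of strings; convert each to a tuple.
rows = [tuple(row) for row in reader]

print(header)
print(rows)
```

When reading from a real file, the `csv` documentation recommends opening it with `newline=''`.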

    Pandas #2

    import pandas as pd
    
    # Get data: an example DataFrame here; for a real file use pd.read_csv
    import mpu.pd
    df = mpu.pd.example_df()
    
    # Convert
    dicts = df.to_dict('records')
    

    The content of df is:

         country   population population_time    EUR
    0    Germany   82521653.0      2016-12-01   True
    1     France   66991000.0      2017-01-01   True
    2  Indonesia  255461700.0      2017-01-01  False
    3    Ireland    4761865.0             NaT   True
    4      Spain   46549045.0      2017-06-01   True
    5    Vatican          NaN             NaT   True
    

    The content of dicts is:

    [{'country': 'Germany', 'population': 82521653.0, 'population_time': Timestamp('2016-12-01 00:00:00'), 'EUR': True},
     {'country': 'France', 'population': 66991000.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': True},
     {'country': 'Indonesia', 'population': 255461700.0, 'population_time': Timestamp('2017-01-01 00:00:00'), 'EUR': False},
     {'country': 'Ireland', 'population': 4761865.0, 'population_time': NaT, 'EUR': True},
     {'country': 'Spain', 'population': 46549045.0, 'population_time': Timestamp('2017-06-01 00:00:00'), 'EUR': True},
     {'country': 'Vatican', 'population': nan, 'population_time': NaT, 'EUR': True}]
    

    Pandas #3

    import pandas as pd
    
    # Get data: an example DataFrame here; for a real file use pd.read_csv
    import mpu.pd
    df = mpu.pd.example_df()
    
    # Convert
    lists = [[row[col] for col in df.columns] for row in df.to_dict('records')]
    

    The content of lists is:

    [['Germany', 82521653.0, Timestamp('2016-12-01 00:00:00'), True],
     ['France', 66991000.0, Timestamp('2017-01-01 00:00:00'), True],
     ['Indonesia', 255461700.0, Timestamp('2017-01-01 00:00:00'), False],
     ['Ireland', 4761865.0, NaT, True],
     ['Spain', 46549045.0, Timestamp('2017-06-01 00:00:00'), True],
     ['Vatican', nan, NaT, True]]
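The nested comprehension in Pandas #3 can probably be shortened with `df.values.tolist()`. The sketch below checks that both spellings agree on a small hand-built frame; the frame is a hypothetical stand-in for the example data, not the output of `mpu.pd.example_df()`.

```python
import pandas as pd

# Small stand-in frame (hypothetical data, built by hand for the demo).
df = pd.DataFrame({"country": ["Germany", "France"], "EUR": [True, True]})

# The conversion from Pandas #3: one inner list per row, in column order.
via_records = [[row[col] for col in df.columns] for row in df.to_dict("records")]

# A shorter spelling: the underlying array converted to nested lists.
via_values = df.values.tolist()

print(via_records)
print(via_values)
```

With columns of mixed types, the exact element types can differ between the two approaches, so it is worth checking the result on your own data.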
    
