Taking a list of data frames and grouping by a variable and using that variable as the key to a dictionary

难免孤独 2021-01-28 15:11

I am relatively new to Python programming. I have a list of pandas DataFrames that all have the column 'Year'. I am trying to group by that column and convert the result to a dictionary keyed by year.

2 Answers
  • 2021-01-28 16:03

    Other answers have missed the mark so far, so I'll give you an alternative. Assuming you have CSV files (since your variable is named that way):

    from collections import defaultdict
    
    import pandas as pd
    
    yearly_dfs = defaultdict(list)
    for csv in list_of_csv_files:
        df = pd.read_csv(csv)
        for yr, yr_df in df.groupby("Year"):
            yearly_dfs[yr].append(yr_df)
    

    Assuming you have DataFrames already:

    from collections import defaultdict
    
    yearly_dfs = defaultdict(list)
    # list_of_csv_files holds DataFrames here, despite its name
    for df in list_of_csv_files:
        for yr, yr_df in df.groupby("Year"):
            yearly_dfs[yr].append(yr_df)
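
To see what this produces, here is a minimal runnable sketch. The sample frames, the `value` column, and the names `list_of_dfs` and `yearly` are made up for illustration; the final `pd.concat` step is an optional extra, not part of the answer above, in case one DataFrame per year is preferred over a list of pieces.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample frames standing in for the asker's list of DataFrames.
list_of_dfs = [
    pd.DataFrame({"Year": [2019, 2020], "value": [1, 2]}),
    pd.DataFrame({"Year": [2020, 2021], "value": [3, 4]}),
]

# Same loop as above: each year maps to a list of per-frame pieces.
yearly_dfs = defaultdict(list)
for df in list_of_dfs:
    for yr, yr_df in df.groupby("Year"):
        yearly_dfs[yr].append(yr_df)

# Optional: merge the pieces so each year maps to a single DataFrame.
yearly = {
    yr: pd.concat(parts, ignore_index=True) for yr, parts in yearly_dfs.items()
}
```

After this runs, `yearly_dfs[2020]` is a list of two small frames (one from each input), while `yearly[2020]` is a single frame holding both rows.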
    
  • 2021-01-28 16:11

    First, read the files and combine them into a single dataframe:

    list_of_dfs = [pd.read_csv(filename, index_col=False) for filename in list_of_csv_files]
    df = pd.concat(list_of_dfs, sort=True)

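    Then group the combined dataframe by 'Year' and build a dictionary keyed by year. Calling .apply(list) on whole groups returns each group's column names rather than its rows, so iterate over the groups instead:

    grouped_dict = {year: group for year, group in df.groupby('Year')}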
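
A minimal sketch of the concat-then-group approach with in-memory frames. The sample data and the `value` column are hypothetical. It builds the dictionary with a dict comprehension over the groupby object, and also shows the column-wise `.apply(list).to_dict()` pattern, which works when a single column is selected first.

```python
import pandas as pd

# Hypothetical sample frames; in practice these would come from pd.read_csv.
list_of_dfs = [
    pd.DataFrame({"Year": [2019, 2020], "value": [1, 2]}),
    pd.DataFrame({"Year": [2020, 2021], "value": [3, 4]}),
]

# Combine everything into one DataFrame.
df = pd.concat(list_of_dfs, ignore_index=True)

# Year -> sub-DataFrame containing all rows for that year.
grouped_dict = {year: group for year, group in df.groupby("Year")}

# Year -> plain list of one column's values.
values_by_year = df.groupby("Year")["value"].apply(list).to_dict()
```

Which form fits depends on whether whole rows or just one column's values are needed per year.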

    This question is a duplicate of "GroupBy results to dictionary of lists".
