Question
I am relatively new to Python programming. I have a list of pandas DataFrames that all have the column 'Year'. I am trying to group by that column and convert the result to a dictionary where each key is a value of 'Year' and each value is a list of the DataFrames for that year. Is this possible in Python?
I tried this:
grouped_dict = list_of_csv_files.groupby(by = 'Year').to_dict()
I believe I will have to loop through each dataframe? I did not provide any data because I am hoping it is a somewhat simple solution.
I also tried this:
grouped_dict = list_of_csv_files.groupby(by = 'Year').apply(lambda dfg: dfg.to_dict(orient='list')).to_dict()
Any guidance would be greatly appreciated!
Answer 1:
Other answers have missed the mark so far, so I'll give you an alternative. Assuming you have CSV files (since your variable is named that way):
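from collections import defaultdict
import pandas as pd
yearly_dfs = defaultdict(list)
for csv_path in list_of_csv_files:
    df = pd.read_csv(csv_path)
    # Split this file's rows by year and store each piece under its year.
    for yr, yr_df in df.groupby("Year"):
        yearly_dfs[yr].append(yr_df)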
Assuming you have DataFrames already:
from collections import defaultdict
yearly_dfs = defaultdict(list)
for df in list_of_csv_files:
    for yr, yr_df in df.groupby("Year"):
        yearly_dfs[yr].append(yr_df)
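For reference, here is a minimal, self-contained sketch of the loop above. The two small DataFrames and the `value` column are made-up sample data, not something from the question:

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data standing in for the asker's DataFrames.
list_of_dfs = [
    pd.DataFrame({"Year": [2017, 2017, 2018], "value": [1, 2, 3]}),
    pd.DataFrame({"Year": [2018, 2019], "value": [4, 5]}),
]

yearly_dfs = defaultdict(list)
for df in list_of_dfs:
    # Iterating a groupby yields (key, sub-DataFrame) pairs, one per distinct Year.
    for yr, yr_df in df.groupby("Year"):
        yearly_dfs[yr].append(yr_df)

# Summarize: how many DataFrames were collected for each year.
counts = {int(yr): len(dfs) for yr, dfs in yearly_dfs.items()}
print(counts)  # {2017: 1, 2018: 2, 2019: 1}
```

Each value in `yearly_dfs` is a list with one DataFrame per input that contained that year, which seems to be the structure the question asks for.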
Answer 2:
First, read the files and concatenate them into a single DataFrame:
list_of_dfs = [pd.read_csv(filename, index_col=False) for filename in list_of_csv_files]
df = pd.concat(list_of_dfs, sort=True)
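Then group the combined DataFrame by 'Year' and collect the groups into a dictionary. Each value is a single DataFrame holding all rows for that year (not a list of DataFrames):
grouped_dict = {year: group for year, group in df.groupby('Year')}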
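A small runnable sketch of this concatenate-then-group idea, on made-up in-memory data instead of CSV files (the `value` column is hypothetical). It builds the dictionary with a comprehension over the groups, so each value is one combined DataFrame for that year rather than a list of DataFrames:

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents.
list_of_dfs = [
    pd.DataFrame({"Year": [2017, 2018], "value": [1, 2]}),
    pd.DataFrame({"Year": [2018, 2019], "value": [3, 4]}),
]

# Stack everything into one DataFrame; ignore_index gives a fresh 0..n-1 index.
df = pd.concat(list_of_dfs, ignore_index=True)

# Iterating a groupby yields (key, sub-DataFrame) pairs.
grouped_dict = {int(year): group for year, group in df.groupby("Year")}

print(sorted(grouped_dict))                  # [2017, 2018, 2019]
print(grouped_dict[2018]["value"].tolist())  # [2, 3]
```

If you need to know which file each row came from, this approach loses that information unless you add a column for it before concatenating.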
This question is a duplicate of "GroupBy results to dictionary of lists".
Source: https://stackoverflow.com/questions/55692550/taking-a-list-of-data-frames-and-grouping-by-a-variable-and-using-that-variable