Sub-select a multi-index pandas dataframe to create multiple subsets (using a dictionary)

Question

I have a dataset similar to the following:

import numpy as np
import pandas as pd

df_lenght = 240
df = pd.DataFrame(np.random.randn(df_lenght, 2), columns=['a', 'b'])
df['datetime'] = pd.date_range('23/06/2017', periods=df_lenght, freq='H')
unique_jobs = ['job1', 'job2', 'job3']
# Repeat the job list once per block of rows, then sort so each job's rows are contiguous
job_id = [unique_jobs for i in range(df_lenght // len(unique_jobs))]
df['job_id'] = sorted(val for sublist in job_id for val in sublist)
# append=True keeps the original integer index as the first index level
df.set_index(['job_id', 'datetime'], append=True, inplace=True)

print(df[:5]) returns:

                                     a         b
  job_id datetime                               
0 job1   2017-06-23 00:00:00 -0.067011 -0.516382
1 job1   2017-06-23 01:00:00 -0.174199  0.068693
2 job1   2017-06-23 02:00:00 -1.227568 -0.103878
3 job1   2017-06-23 03:00:00 -0.847565 -0.345161
4 job1   2017-06-23 04:00:00  0.028852  3.111738

How can I create multiple dataframes, one for each value of job_id? Can they be stored in a dictionary so that each one is easy to retrieve? Thanks.

Answer 1:

You could unpack a groupby object into a dictionary:

dfs = {job: group for job, group in df.groupby(level='job_id')}
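
As a rough end-to-end sketch of how the resulting dictionary might be used, the snippet below rebuilds a frame shaped like the one in the question and then looks up one job. The names n_rows and dfs_flat are illustrative and not from the original post, the job_id column is built with np.repeat instead of the nested list, and the hourly spacing is passed as a pd.Timedelta rather than a frequency alias; those are substitutions made here for brevity, not part of the answer.

```python
import numpy as np
import pandas as pd

# Rebuild a frame with the same shape as the question's (values are random).
n_rows = 240  # illustrative name, not from the original post
df = pd.DataFrame(np.random.randn(n_rows, 2), columns=["a", "b"])
df["datetime"] = pd.date_range(
    "2017-06-23", periods=n_rows, freq=pd.Timedelta(hours=1)
)
# 80 consecutive rows per job, equivalent to the sorted nested list in the question.
df["job_id"] = np.repeat(["job1", "job2", "job3"], n_rows // 3)
df.set_index(["job_id", "datetime"], append=True, inplace=True)

# One sub-frame per job_id value, keyed by that value.
dfs = {job: group for job, group in df.groupby(level="job_id")}

# Retrieve a single job's rows by key.
job1 = dfs["job1"]
print(job1.head())

# Optional variant: drop the now-constant job_id level from each sub-frame.
dfs_flat = {
    job: group.droplevel("job_id")
    for job, group in df.groupby(level="job_id")
}
```

Each value in dfs keeps the full three-level index, so job_id is still present in every sub-frame. The dfs_flat variant removes that level when it is no longer needed.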

Source: https://stackoverflow.com/questions/44725105/sub-select-a-multi-index-pandas-dataframe-to-create-multiple-subsets-using-a-di

Tags

python

pandas

multi-index
