Pandas split column of lists into multiple columns

爱一瞬间的悲伤 2020-11-21 06:28

I have a pandas DataFrame with one column:

import pandas as pd

df = pd.DataFrame(
    data={
        "teams": [
            


        
8 Answers
  •  醉梦人生
    2020-11-21 07:03

    Building on the previous answers, here is another solution that returns the same result as df2.teams.apply(pd.Series) but runs much faster:

    pd.DataFrame([{x: y for x, y in enumerate(item)} for item in df2['teams'].values.tolist()], index=df2.index)
    

    Timings:

    In [1]:
    import pandas as pd
    d1 = {'teams': [['SF', 'NYG'],['SF', 'NYG'],['SF', 'NYG'],
                    ['SF', 'NYG'],['SF', 'NYG'],['SF', 'NYG'],['SF', 'NYG']]}
    df2 = pd.DataFrame(d1)
    df2 = pd.concat([df2]*1000).reset_index(drop=True)
    
    In [2]: %timeit df2['teams'].apply(pd.Series)
    
    8.27 s ± 2.73 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    In [3]: %timeit pd.DataFrame([{x: y for x, y in enumerate(item)} for item in df2['teams'].values.tolist()], index=df2.index)
    
    35.4 ms ± 5.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
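
The claim that both expressions return the same result can be checked on a small frame. This is a minimal sketch, not part of the original answer: the three-row data is illustrative, `dict(enumerate(item))` stands in for the equivalent dict comprehension, and the column names `team1` and `team2` are hypothetical.

```python
import pandas as pd

# Small illustrative frame (assumed data, same shape as the timing setup).
df2 = pd.DataFrame({"teams": [["SF", "NYG"], ["SF", "NYG"], ["SF", "NYG"]]})

# Slow baseline: constructs one Series per row.
slow = df2["teams"].apply(pd.Series)

# Dict-per-row approach; dict(enumerate(item)) builds the same
# {position: value} mapping as the comprehension in the answer.
fast = pd.DataFrame(
    [dict(enumerate(item)) for item in df2["teams"].tolist()],
    index=df2.index,
)

# Compare cell values and column labels of the two results.
same_values = slow.values.tolist() == fast.values.tolist()
same_columns = list(slow.columns) == list(fast.columns)

# Optionally rename the positional columns (hypothetical names)
# and attach them to the original frame.
fast.columns = ["team1", "team2"]
result = df2.join(fast)
print(same_values, same_columns)
print(result)
```

Passing `index=df2.index` matters when the original frame does not have a default index, since it keeps the new columns aligned with the source rows for the `join`.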
    
