Creating dataframe from a dictionary where entries have different lengths

生来不讨喜 2020-11-22 14:04

Say I have a dictionary with 10 key-value pairs. Each value is a NumPy array, but the arrays do not all have the same length.

How can I create a DataFrame from this dictionary?

9 Answers
  • 2020-11-22 14:51

    In Python 3.x:

    import pandas as pd
    import numpy as np
    
    d = dict( A = np.array([1,2]), B = np.array([1,2,3,4]) )
        
    pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in d.items() ]))
    
    Out[7]: 
        A  B
    0   1  1
    1   2  2
    2 NaN  3
    3 NaN  4
    

    In Python 2.x:

    replace d.items() with d.iteritems().
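
For anyone wondering why the shorter column gets padded, here is a minimal sketch of the same idea written with a dict comprehension. The variable names are only illustrative. As far as I can tell, pandas aligns the Series on their integer index and fills the missing positions with NaN, which is also why the padded column ends up as float64 while the full-length column stays integer.

```python
import numpy as np
import pandas as pd

# Illustrative input: two arrays of different lengths.
d = {"A": np.array([1, 2]), "B": np.array([1, 2, 3, 4])}

# Wrapping each array in a Series lets pandas align the columns on the
# index and pad the shorter one with NaN instead of raising an error.
df = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})

print(df.shape)   # (4, 2)
print(df.dtypes)  # A is float64 because it holds NaN; B keeps its integer dtype
```

If you need the original values back, df["A"].dropna() gives the column without the padding.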

  • 2020-11-22 14:52

    Both of the following lines work:

    pd.DataFrame.from_dict(df, orient='index').transpose() #A
    
    pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in df.items() ])) #B (Better)
    

    When I timed both with %timeit in Jupyter, B ran about 4x faster than A, which is significant on a large data set (especially one with many columns/features).
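
The timing comparison above is easy to reproduce outside Jupyter. The sketch below is one way to do it with the standard timeit module. The test dictionary, the helper names via_from_dict and via_series, and the repeat count are all assumptions made for illustration, and the ratio you get will depend on your pandas version and the shape of your data.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 200 columns whose lengths vary between 1 and 50.
rng = np.random.default_rng(0)
data = {f"col{i}": rng.random(int(rng.integers(1, 51))) for i in range(200)}


def via_from_dict(d):
    # Approach A: one row per key, then transpose so keys become columns.
    return pd.DataFrame.from_dict(d, orient="index").transpose()


def via_series(d):
    # Approach B: one Series per key, aligned on the index by pandas.
    return pd.DataFrame({k: pd.Series(v) for k, v in d.items()})


# Build each frame once to confirm both approaches give the same layout.
a = via_from_dict(data)
b = via_series(data)

# Time 20 runs of each approach; absolute numbers vary by machine.
t_a = timeit.timeit(lambda: via_from_dict(data), number=20)
t_b = timeit.timeit(lambda: via_series(data), number=20)
print(f"A: {t_a:.3f}s  B: {t_b:.3f}s")
```

Both helpers return a frame with one column per key and as many rows as the longest array, so they can be swapped freely once you know which one is faster for your case.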

  • 2020-11-22 14:54

    Here's a simple way to do that:

    In[20]: my_dict = dict( A = np.array([1,2]), B = np.array([1,2,3,4]) )
    In[21]: df = pd.DataFrame.from_dict(my_dict, orient='index')
    In[22]: df
    Out[22]: 
       0  1   2   3
    A  1  2 NaN NaN
    B  1  2   3   4
    In[23]: df.transpose()
    Out[23]: 
        A  B
    0   1  1
    1   2  2
    2 NaN  3
    3 NaN  4
    