Converting numpy ndarray of dictionaries to DataFrame

情歌与酒 2021-01-16 01:23

I've searched Stack Overflow for a solution to this, but all the solutions I found differ slightly from what I need.

I have a large ndarray (roughly 107 million rows) lets c

1 answer
  •  生来不讨喜
    2021-01-16 01:56

    I think your input data is structured differently, for example as a nested list with one dictionary per row:

    import pandas as pd

    # Nested list: each row is a one-element list holding a dict
    L = [[{'A': 5, 'C': 3, 'D': 3}],
         [{'A': 7, 'B': 9, 'F': 5}],
         [{'B': 4, 'C': 7, 'E': 6}]]

    print(pd.DataFrame(L))
                              0
    0  {'A': 5, 'C': 3, 'D': 3}
    1  {'A': 7, 'B': 9, 'F': 5}
    2  {'B': 4, 'C': 7, 'E': 6}
    

    One possible solution is to flatten the nested list:

    from itertools import chain

    # Flatten one level so each row is a plain dict
    df = pd.DataFrame(chain.from_iterable(L)).sort_index(axis=1)
    print(df)
         A    B    C    D    E    F
    0  5.0  NaN  3.0  3.0  NaN  NaN
    1  7.0  9.0  NaN  NaN  NaN  5.0
    2  NaN  4.0  7.0  NaN  6.0  NaN
    

    If the input data is a numpy array, use the solution from the comment by @Code Different:

    import numpy as np

    arr = np.array([{'A': 5, 'C': 3, 'D': 3},
                    {'A': 7, 'B': 9, 'F': 5},
                    {'B': 4, 'C': 7, 'E': 6}])

    # tolist() turns the object array into a plain list of dicts
    df = pd.DataFrame(arr.tolist()).sort_index(axis=1)
    print(df)
         A    B    C    D    E    F
    0  5.0  NaN  3.0  3.0  NaN  NaN
    1  7.0  9.0  NaN  NaN  NaN  5.0
    2  NaN  4.0  7.0  NaN  6.0  NaN
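
If you are not sure which of the two shapes you have, the two approaches above can probably be folded into one helper. This is only a sketch: `dicts_to_frame` is a hypothetical name of mine, not a pandas or numpy function, and it assumes every element is a dict and the input is either a flat sequence of dicts or a nested one like `L`.

```python
import numpy as np
import pandas as pd


def dicts_to_frame(data):
    """Build a DataFrame from dicts held in a list or an object ndarray.

    Hypothetical helper: assumes every element of `data` is a dict.
    """
    # dtype=object keeps each dict as a single array element
    arr = np.asarray(data, dtype=object)
    # ravel() flattens shape (n, 1) to (n,); tolist() gives a list of dicts
    records = arr.ravel().tolist()
    # Missing keys become NaN; sort_index(axis=1) orders the columns
    return pd.DataFrame(records).sort_index(axis=1)


nested = [[{'A': 5, 'C': 3, 'D': 3}],
          [{'A': 7, 'B': 9, 'F': 5}],
          [{'B': 4, 'C': 7, 'E': 6}]]
flat = np.array([row[0] for row in nested])

df_nested = dicts_to_frame(nested)
df_flat = dicts_to_frame(flat)
```

Both calls should give the same frame as in the outputs above. I have not tried this on an array with around 107 million rows, so memory use at that size is untested.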
    
