Converting numpy ndarray of dictionaries to DataFrame

后端未结

关注

 1  1220

醉梦人生

I\'ve searched stackoverflow for a solution to this -> but all solutions are slightly different to my needs.

I have a large ndarray (roughly 107 million rows) lets c

相关标签:

1条回答

死守一世寂寞

2021-01-16 01:37

I think input data are different:

L =  [[{'A': 5, 'C': 3, 'D': 3}],
     [{'A': 7, 'B': 9, 'F': 5}],
     [{'B': 4, 'C': 7, 'E': 6}]]

print (pd.DataFrame(L))
                          0
0  {'A': 5, 'C': 3, 'D': 3}
1  {'A': 7, 'B': 9, 'F': 5}
2  {'B': 4, 'C': 7, 'E': 6}

Possible solution is flattening:

from  itertools import chain
df = pd.DataFrame(chain.from_iterable(L)).sort_index(axis=1)
print (df)
     A    B    C    D    E    F
0  5.0  NaN  3.0  3.0  NaN  NaN
1  7.0  9.0  NaN  NaN  NaN  5.0
2  NaN  4.0  7.0  NaN  6.0  NaN

If input datais numpy array use solution from comment by @Code Different:

arr = np.array([{'A': 5, 'C': 3, 'D': 3},
                {'A': 7, 'B': 9, 'F': 5},
                {'B': 4, 'C': 7, 'E': 6}])

df = pd.DataFrame(arr.tolist()).sort_index(axis=1)
print (df)
     A    B    C    D    E    F
0  5.0  NaN  3.0  3.0  NaN  NaN
1  7.0  9.0  NaN  NaN  NaN  5.0
2  NaN  4.0  7.0  NaN  6.0  NaN

0 讨论(0)