I have data saved in a PostgreSQL database. I am querying this data using Python 2.7 and turning it into a Pandas DataFrame. However, the last column of this dat
You can use join with pop + tolist. Performance is comparable to concat with drop + tolist, but some may find this syntax cleaner:
res = df.join(pd.DataFrame(df.pop('b').tolist()))
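To make the one-liner above concrete, here is a minimal sketch that runs it step by step on the small sample frame used in the benchmark. The intermediate name expanded and the explicit index=df.index argument are additions for illustration, not part of the original one-liner; the index argument is there on the assumption that df may not have a default 0..n-1 index, since join aligns rows by index.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3],
                   'b': [{'c': 1}, {'d': 3}, {'c': 5, 'd': 6}]})

# pop removes column 'b' from df in place and returns it as a Series.
# tolist turns that Series into a list of dicts, which the DataFrame
# constructor expands into one column per key (missing keys become NaN).
# index=df.index is an illustrative addition so the rows still line up
# if df does not have a default RangeIndex.
expanded = pd.DataFrame(df.pop('b').tolist(), index=df.index)

# join aligns on the index and appends the expanded columns to what is left of df.
res = df.join(expanded)
print(res)
```

The result has columns a, c and d, with NaN wherever a dict lacked that key. Note that pop mutates df, so work on a copy if the original frame is still needed.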
Benchmarking with other methods:
import pandas as pd

df = pd.DataFrame({'a':[1,2,3], 'b':[{'c':1}, {'d':3}, {'c':5, 'd':6}]})

def joris1(df):
    return pd.concat([df.drop('b', axis=1), df['b'].apply(pd.Series)], axis=1)

def joris2(df):
    return pd.concat([df.drop('b', axis=1), pd.DataFrame(df['b'].tolist())], axis=1)

def jpp(df):
    return df.join(pd.DataFrame(df.pop('b').tolist()))

df = pd.concat([df]*1000, ignore_index=True)
%timeit joris1(df.copy()) # 1.33 s per loop
%timeit joris2(df.copy()) # 7.42 ms per loop
%timeit jpp(df.copy()) # 7.68 ms per loop
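The timings only matter if the three functions return the same thing. As a rough sanity check, the sketch below re-declares them so it runs on its own, outside IPython, and compares their outputs on the small sample frame. The names sample and results are illustrative. The comparison deliberately ignores column order and dtype, on the assumption that these could differ between pandas versions.

```python
import pandas as pd

def joris1(df):
    return pd.concat([df.drop('b', axis=1), df['b'].apply(pd.Series)], axis=1)

def joris2(df):
    return pd.concat([df.drop('b', axis=1), pd.DataFrame(df['b'].tolist())], axis=1)

def jpp(df):
    return df.join(pd.DataFrame(df.pop('b').tolist()))

sample = pd.DataFrame({'a': [1, 2, 3],
                       'b': [{'c': 1}, {'d': 3}, {'c': 5, 'd': 6}]})

# Each call gets its own copy because jpp pops column 'b' from its argument.
results = [func(sample.copy()) for func in (joris1, joris2, jpp)]

# Compare values only: check_like ignores row/column order,
# check_dtype=False ignores int/float differences.
for other in results[1:]:
    pd.testing.assert_frame_equal(results[0], other,
                                  check_like=True, check_dtype=False)
```

If the loop finishes without raising, the three approaches agree on this input.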