I realize DataFrame takes a map of {'series_name': Series(data, index)}. However, it automatically sorts that map even if the map is an OrderedDict().
Is there a simpl
Build the list of series:
import pandas as pd
import numpy as np
> series = [pd.Series(np.random.rand(3), name=c) for c in list('abcdefg')]
First method, pd.DataFrame.from_items:
> pd.DataFrame.from_items([(s.name, s) for s in series])
a b c d e f g
0 0.071094 0.077545 0.299540 0.377555 0.751840 0.879995 0.933399
1 0.538251 0.066780 0.415607 0.796059 0.718893 0.679950 0.502138
2 0.096001 0.680868 0.883778 0.210488 0.642578 0.023881 0.250317
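Note that from_items was deprecated in pandas 0.23 and removed in 1.0, so the call above only runs on older versions. On newer versions, a dict comprehension appears to give the same result. This is a minimal sketch, assuming Python 3.7+ (where a plain dict keeps insertion order) and that every Series has a unique name; df_from_dict is just an illustrative variable name:

```python
import numpy as np
import pandas as pd

# Same list of named Series as above.
series = [pd.Series(np.random.rand(3), name=c) for c in list('abcdefg')]

# A plain dict keeps insertion order on Python 3.7+, and the DataFrame
# constructor uses that order for the columns (assumes unique names).
df_from_dict = pd.DataFrame({s.name: s for s in series})

print(list(df_from_dict.columns))
```

If two Series share a name, the later one silently replaces the earlier one in the dict, so pd.concat is the safer choice in that case.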
Second method, pd.concat:
> pd.concat(series, axis=1)
a b c d e f g
0 0.071094 0.077545 0.299540 0.377555 0.751840 0.879995 0.933399
1 0.538251 0.066780 0.415607 0.796059 0.718893 0.679950 0.502138
2 0.096001 0.680868 0.883778 0.210488 0.642578 0.023881 0.250317
You could use pandas.concat:
import pandas as pd
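# rands(n) returns a random n-character string. pandas.util.testing was
# deprecated in pandas 1.0 and removed in 2.0, so this import needs an older pandas.
from pandas.util.testing import rands
data = [pd.Series([rands(4) for j in range(6)],
                  index=pd.date_range('1/1/2000', periods=6),
                  name='col'+str(i)) for i in range(4)]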
df = pd.concat(data, axis=1, keys=[s.name for s in data])
print(df)
yields
col0 col1 col2 col3
2000-01-01 GqcN Lwlj Km7b XfaA
2000-01-02 lhNC nlSm jCYu XLVb
2000-01-03 sSRz PFby C1o5 0BJe
2000-01-04 khZb Ny9p crUY LNmc
2000-01-05 hmLp 4rVp xF2P OmD9
2000-01-06 giah psQb T5RJ oLSh
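The keys argument matters most when the Series have no names, or when you want column labels that differ from the names. A minimal sketch with made-up data (s1, s2 and the labels 'first' and 'second' are illustrative, not from the example above):

```python
import pandas as pd

# Two unnamed Series sharing the same default index.
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([4, 5, 6])

# Without keys, unnamed Series get positional column labels 0 and 1.
no_keys = pd.concat([s1, s2], axis=1)

# With keys, the given labels become the column names, in list order.
with_keys = pd.concat([s1, s2], axis=1, keys=['first', 'second'])

print(list(no_keys.columns))
print(list(with_keys.columns))
```

When every Series already carries the name you want, as in the example above, passing keys is optional.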
a = pd.Series(data=[1,2,3])
b = pd.Series(data=[4,5,6])
a.name = 'a'
b.name = 'b'
pd.DataFrame(zip(a,b), columns=[a.name, b.name])
Or just concatenate DataFrames:
pd.concat([pd.DataFrame(a),pd.DataFrame(b)], axis=1)
In [53]: %timeit pd.DataFrame(zip(a,b), columns=[a.name, b.name])
1000 loops, best of 3: 362 us per loop
In [54]: %timeit pd.concat([pd.DataFrame(a),pd.DataFrame(b)], axis=1)
1000 loops, best of 3: 808 us per loop
Simply passing the list of Series to DataFrame and then transposing seems to work too. It will also fill in any indices that are missing from one or the other Series.
import pandas as pd
from pandas.util.testing import rands
data = [pd.Series([rands(4) for j in range(6)],
                  index=pd.date_range('1/1/2000', periods=6),
                  name='col'+str(i)) for i in range(4)]
df = pd.DataFrame(data).T
print(df)
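To see the fill-in behaviour, here is a small sketch with two Series whose indices only partly overlap. The names x, y and df_t are made up for illustration:

```python
import pandas as pd

# Two named Series whose indices only partly overlap (label 1 is shared).
x = pd.Series([1.0, 2.0], index=[0, 1], name='x')
y = pd.Series([3.0, 4.0], index=[1, 2], name='y')

# Each Series becomes a row labelled by its name; transposing turns
# the rows into columns.
df_t = pd.DataFrame([x, y]).T

# Labels present in only one Series show up as NaN in the other column.
print(df_t)
```

Because NaN is a float, integer Series with missing labels would come out as floats after this step.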
Check out DataFrame.from_items too (available in pandas versions before 1.0).
You can first create an empty DataFrame and then call append() on it.
df = pd.DataFrame()
then:
df = df.append(list_series)
I also like to make sure the previous script that created list_series won't mess up my DataFrame with duplicate rows:
df.drop_duplicates(inplace=True)
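Two caveats: DataFrame.append was removed in pandas 2.0, and appending a list of Series adds each Series as a row, not a column. A rough sketch of the same idea without append(), assuming list_series looks like the hypothetical stand-in below:

```python
import pandas as pd

# Hypothetical stand-in for list_series; the first two Series are identical.
list_series = [
    pd.Series({'a': 1, 'b': 2}),
    pd.Series({'a': 1, 'b': 2}),
    pd.Series({'a': 3, 'b': 4}),
]

# Passing the list straight to the constructor makes each Series one row,
# which is what append() did with a list of Series.
df = pd.DataFrame(list_series)

# Drop rows whose values repeat an earlier row (the index is ignored).
df = df.drop_duplicates()

print(df)
```

If you want the Series as columns instead, transpose the result with df.T or use pd.concat(list_series, axis=1).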