Question
I would like to take a Pandas Series with a single-level index and split on that index into a dataframe with multiple columns. For instance, for input:
s = pd.Series(range(10,17), index=['a','a','b','b','c','c','c'])
s
a 10
a 11
b 12
b 13
c 14
c 15
c 16
dtype: int64
What I would like as an output is:
     a    b   c
0   10   12  14
1   11   13  15
2  NaN  NaN  16
I cannot directly use the unstack command because it requires a multiindex and I only have a single-level index. I tried adding a dummy index level whose entries all had the same value, but I got the error "ReshapeError: Index contains duplicate entries, cannot reshape".
I know that this is a little unusual because 1) pandas doesn't like ragged arrays, so there will need to be padding, 2) the index needs to be arbitrarily reset, and 3) I can't really "initialize" the dataframe until I know how long the longest column is going to be. But this still seems like something that I should be able to do somehow. I also thought about doing it via groupby, but there doesn't seem to be anything like grouped_df.values() that works without some kind of aggregating function, probably for the reasons above.
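For reference, the dummy-index failure described above can be reproduced with a short sketch. The variable names (dummy, message) are illustrative only, and the exception type is an assumption: in the recent pandas versions I am aware of, the duplicate-entries error is raised as a ValueError rather than the ReshapeError quoted above.

```python
import pandas as pd

s = pd.Series(range(10, 17), index=['a', 'a', 'b', 'b', 'c', 'c', 'c'])

# Add a constant dummy level in front of the original labels.
# The pairs (0, 'a'), (0, 'a'), ... are still duplicated.
dummy = pd.MultiIndex.from_arrays([[0] * len(s), s.index])

try:
    pd.Series(s.to_numpy(), index=dummy).unstack()
except ValueError as exc:  # assumed exception type on recent pandas
    message = str(exc)
    print(message)
```

The constant level adds no information, so each (dummy, label) pair is still repeated and unstack has no unique cell to place each value in.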
Answer 1:
You can use groupby, apply, and reset_index to create a multiindex Series, and then call unstack:
import pandas as pd
s = pd.Series(range(10,17), index=['a','a','b','b','c','c','c'])
df = s.groupby(level=0).apply(pd.Series.reset_index, drop=True).unstack(0)
print(df)
output:
     a    b   c
0   10   12  14
1   11   13  15
2  NaN  NaN  16
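A variation on the same idea, offered as a sketch rather than as part of the original answer: instead of apply, number the rows within each label using groupby(...).cumcount() and use that counter as the second index level. The names pos and idx are just illustrative.

```python
import pandas as pd

s = pd.Series(range(10, 17), index=['a', 'a', 'b', 'b', 'c', 'c', 'c'])

# Position of each row within its label group: 0, 1, 0, 1, 0, 1, 2
pos = s.groupby(level=0).cumcount()

# Pairing (position, label) makes every index entry unique
idx = pd.MultiIndex.from_arrays([pos.to_numpy(), s.index])

# unstack pivots the last level (the labels) into columns,
# padding the shorter groups with NaN
df = pd.Series(s.to_numpy(), index=idx).unstack()
print(df)
```

Because the padded columns contain NaN, the result is stored as floats.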
Answer 2:
I'm not sure how generalizable this is. I call this the groupby-via-concat pattern. It is essentially an apply, but with control over exactly how the results are combined.
In [24]: s = pd.Series(range(10,17), index=['a','a','b','b','c','c','c'])
In [25]: df = pd.DataFrame(dict(key = s.index, value = s.values))
In [26]: df
Out[26]:
  key  value
0   a     10
1   a     11
2   b     12
3   b     13
4   c     14
5   c     15
6   c     16
In [27]: pd.concat(dict([ (g, pd.Series(grp['value'].values)) for g, grp in df.groupby('key') ]), axis=1)
Out[27]:
     a    b   c
0   10   12  14
1   11   13  15
2  NaN  NaN  16
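The same pattern can probably be written without the intermediate key/value DataFrame by grouping the Series on its own index. This is a sketch of that shortcut, not the code from the answer above. The name pieces is illustrative.

```python
import pandas as pd

s = pd.Series(range(10, 17), index=['a', 'a', 'b', 'b', 'c', 'c', 'c'])

# One sub-Series per label, each renumbered from 0
pieces = {key: grp.reset_index(drop=True) for key, grp in s.groupby(level=0)}

# The dict keys become column names, and the rows are aligned on the
# new 0..n-1 index, so the shorter groups are padded with NaN
df = pd.concat(pieces, axis=1)
print(df)
```

Building the dict explicitly keeps control over how each group is shaped before the pieces are combined, which is the point of the concat pattern.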
Source: https://stackoverflow.com/questions/17432793/split-a-pandas-series-without-a-multiindex