Pandas long to wide

Posted by 喜欢而已 on 2019-12-24 00:45:13

Question


Using pandas, I want to convert a long data frame to wide but the usual pivot method is not as flexible as I need.

Here is the long data:

import pandas as pd

raw = {
'sample':[1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
'gene':['G1', 'G2', 'G3', 'G3', 'G1', 'G2', 'G2', 'G2', 'G3', 'G3'],
'type':['HIGH', 'HIGH', 'LOW', 'MED', 'HIGH', 'LOW', 'LOW', 'LOW', 'MED', 'LOW']}
df = pd.DataFrame(raw)

which produces

gene  sample  type
G1       1  HIGH
G2       1  HIGH
G3       1   LOW
G3       1   MED
G1       2  HIGH
G2       2   LOW
G2       3   LOW
G2       3   LOW
G3       3   MED
G3       3   LOW

What I want is a data frame with genes as rows and samples as columns, where each cell is filled with the "greatest" type according to HIGH > MED > LOW > NONE (NONE marking a gene/sample pair with no rows), i.e. it should look like

casted = {
'gene':['G1', 'G2', 'G3'],
'1':['HIGH', 'HIGH', 'MED'],
'2':['HIGH', 'LOW', 'NONE'],
'3':['NONE', 'LOW', 'MED']
}
dfCast = pd.DataFrame(casted)

which makes

1     2     3      gene
HIGH  HIGH  NONE   G1
HIGH  LOW   LOW    G2
MED   NONE  MED    G3

Naively, and incorrectly, my long-to-wide command would look like

df = df.pivot(index='gene', columns = 'sample', values='type')

but of course this doesn't account for the hierarchy I want to impose, where HIGH > MED > LOW > NONE.
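To make the problem concrete: the long data contains repeated (gene, sample) pairs, and pivot has no rule for combining them, so it raises a ValueError instead of choosing a value. A minimal sketch with the data above (the names dup_pairs and error are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "sample": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
    "gene": ["G1", "G2", "G3", "G3", "G1", "G2", "G2", "G2", "G3", "G3"],
    "type": ["HIGH", "HIGH", "LOW", "MED", "HIGH", "LOW", "LOW", "LOW", "MED", "LOW"],
})

# Rows whose (gene, sample) pair occurs more than once in the long data
dup_pairs = df[df.duplicated(["gene", "sample"], keep=False)]
print(dup_pairs)

# pivot cannot reshape when an index/column pair is repeated
error = None
try:
    df.pivot(index="gene", columns="sample", values="type")
except ValueError as exc:
    error = exc
print(type(error).__name__, error)
```

Any solution therefore needs an aggregation step that decides which of the duplicated values survives.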

When casting, how can I control what the cell value is?


Answer 1:


You can use pivot_table, which takes an aggfunc parameter to aggregate values that share the same index/column pair. To rank the keywords HIGH, MED and LOW in the order you need, make them the keys of a dictionary whose values increase monotonically, then pick the extreme value with min/max as the aggregation function:

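cat = {"HIGH": 3, "MED": 2, "LOW": 1}  # larger number = "greater" type
# fill_value="NONE" labels gene/sample pairs that have no rows
df.pivot_table("type", "gene", "sample", aggfunc=lambda x: max(x, key=cat.get), fill_value="NONE")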
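As a check, the dictionary approach can be run end to end on the question's data and compared with the table the asker wants. This is a sketch rather than a definitive implementation: the names rank, wide and expected are mine, and fill_value="NONE" is an assumption about how the empty cells should be labelled.

```python
import pandas as pd

df = pd.DataFrame({
    "sample": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
    "gene": ["G1", "G2", "G3", "G3", "G1", "G2", "G2", "G2", "G3", "G3"],
    "type": ["HIGH", "HIGH", "LOW", "MED", "HIGH", "LOW", "LOW", "LOW", "MED", "LOW"],
})

# Larger number = "greater" type
rank = {"HIGH": 3, "MED": 2, "LOW": 1}

wide = df.pivot_table(
    values="type",
    index="gene",
    columns="sample",
    # Among duplicated (gene, sample) rows, keep the highest-ranked label
    aggfunc=lambda s: max(s, key=rank.get),
    # Assumption: pairs with no rows at all are labelled "NONE"
    fill_value="NONE",
)
print(wide)

# The table requested in the question, with gene as the index
expected = pd.DataFrame(
    {1: ["HIGH", "HIGH", "MED"], 2: ["HIGH", "LOW", "NONE"], 3: ["NONE", "LOW", "MED"]},
    index=["G1", "G2", "G3"],
)
matches = wide.to_numpy().tolist() == expected.to_numpy().tolist()
print(matches)
```

Note that the resulting column labels are the integer sample ids, not the strings '1', '2', '3' used in the question's dfCast.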


Another option is to convert the type column to an ordered categorical dtype and then use pivot_table with aggfunc='max':

df['type'] = pd.Categorical(df['type'], ["LOW", "MED", "HIGH"], ordered=True)
df.pivot_table("type", "gene", "sample", aggfunc='max')
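The categorical idea can also cover the empty cells if "NONE" is treated as the lowest category. The sketch below is a variant rather than the answer's exact method: it pivots the integer category codes, so the fill value is a plain 0, and then maps the codes back to labels. The levels list, the level_code column and the choice to include NONE as a category are all my assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    "sample": [1, 1, 1, 1, 2, 2, 3, 3, 3, 3],
    "gene": ["G1", "G2", "G3", "G3", "G1", "G2", "G2", "G2", "G3", "G3"],
    "type": ["HIGH", "HIGH", "LOW", "MED", "HIGH", "LOW", "LOW", "LOW", "MED", "LOW"],
})

# Assumption: NONE is the lowest level, so its category code is 0
levels = ["NONE", "LOW", "MED", "HIGH"]
ctype = pd.Categorical(df["type"], categories=levels, ordered=True)

# Work on the integer codes (hypothetical helper column "level_code")
coded = df.assign(level_code=ctype.codes)
wide_codes = coded.pivot_table(
    values="level_code",
    index="gene",
    columns="sample",
    aggfunc="max",   # highest code = "greatest" type
    fill_value=0,    # code 0 corresponds to "NONE"
)

# Map the winning codes back to their labels
wide = wide_codes.apply(lambda col: col.map(lambda c: levels[int(c)]))
print(wide)
```

Pivoting plain integers sidesteps any question of how a categorical column handles a fill value, at the cost of one extra mapping step.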



Source: https://stackoverflow.com/questions/42310781/pandas-long-to-wide
