How to create multiple columns from multiple columns in a pandas data frame

Submitted by 淺唱寂寞╮ on 2019-12-11 19:57:29

Question


I am building a repository of clean, non-hard-coded function templates (that is, functions that do not use the data frame column names internally). The templates cover four cases: one new column from one existing column, many new columns from one existing column, one new column from many existing columns, and finally many-to-many.

The first 3 look like this and work:

In [97]:
import pandas as pd
from pandas import DataFrame

data={'level1':[20,19,20,21,25,29,30,31,30,29,31],
      'level2': [10,10,20,20,20,10,10,20,20,10,10]}
index= pd.date_range('12/1/2014', periods=11)
frame=DataFrame(data, index=index)

In [98]:
def nonhardcoded_1to1(x):
    y=x+2
    return y
frame['test1to1']=frame['level1'].map(nonhardcoded_1to1)#works

def nonhardcoded_2to1(x,y):
    z=x+y
    return z
frame['test2to1']=frame[['level1','level2']].apply(lambda s: nonhardcoded_2to1(*s), axis=1)#works

def nonhardcoded_1to2(x):
    y=x+12
    z=x-12
    return y, z
frame['test1to2a'], frame['test1to2b'] = zip(*frame['level1'].map(nonhardcoded_1to2))#works

Now, for the many-to-many function I get errors. I am trying to stitch it together from the '2to1' and '1to2' patterns above, but they don't work together:

def nonhardcoded_2to2(x,y):
    z1=x+y
    z2=x-y
    return z1, z2
frame['test2to2a'], frame['test2to2b']=zip(*frame[['level1','level2']].apply(lambda s: nonhardcoded_2to2(*s), axis=1))

ValueError: too many values to unpack

So I tried to dig into the function call:

test=frame[['level1','level2']].apply(lambda s: nonhardcoded_2to2(*s), axis=1)

which returned this, so in theory this at least looks usable:

Out[104]:
level1  level2
2014-12-01  30  10
2014-12-02  29  9
2014-12-03  40  0
2014-12-04  41  1
2014-12-05  45  5
2014-12-06  39  19
2014-12-07  40  20
2014-12-08  51  11
2014-12-09  50  10
2014-12-10  39  19
2014-12-11  41  21

Then I tried:

test=zip(*frame[['level1','level2']].apply(lambda s: nonhardcoded_2to2(*s), axis=1))
test

which returned a sequence of tuples. For some reason it seems to take the column headers of the result and turn them into pairs of characters. I am not sure why:

[('l', 'l'), ('e', 'e'), ('v', 'v'), ('e', 'e'), ('l', 'l'), ('1', '2')]
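The character pairs can be reproduced in isolation. The following is a minimal sketch, assuming the intermediate result is a DataFrame with columns `level1` and `level2` as in Out[104]; the small frame `df` is a made-up stand-in, not the original data. Iterating over a DataFrame yields its column labels, so `zip(*df)` zips the two label strings character by character.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame shown in Out[104] (first two rows only)
df = pd.DataFrame({'level1': [30, 29], 'level2': [10, 9]})

# Iterating over a DataFrame yields its column labels, not its rows
labels = list(df)

# So zip(*df) is zip('level1', 'level2'): it pairs up the characters of the labels
pairs = list(zip(*df))
print(labels)
print(pairs)

# Unpacking the underlying 2-D array instead iterates over rows,
# so zipping the rows gives one tuple per column
columns = list(zip(*df.values))
print(columns)
```

This would also be consistent with the `ValueError` above: the zip produces six tuples, while the assignment expects exactly two.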

How should I create and call this function so it works?
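One possible way to get the many-to-many case working, offered as a sketch rather than a definitive answer: skip `apply` and zip the input columns directly, so the shape of the intermediate result does not depend on how `apply` packs tuples. The data and `nonhardcoded_2to2` are taken from the question; the helper variable `results` is introduced here for illustration.

```python
import pandas as pd

data = {'level1': [20, 19, 20, 21, 25, 29, 30, 31, 30, 29, 31],
        'level2': [10, 10, 20, 20, 20, 10, 10, 20, 20, 10, 10]}
index = pd.date_range('12/1/2014', periods=11)
frame = pd.DataFrame(data, index=index)

def nonhardcoded_2to2(x, y):
    z1 = x + y
    z2 = x - y
    return z1, z2

# Call the function once per row by zipping the input columns;
# `results` (a name assumed for this sketch) is a plain list of (z1, z2) tuples
results = [nonhardcoded_2to2(x, y)
           for x, y in zip(frame['level1'], frame['level2'])]

# Transpose the list of row tuples into one tuple per output column
frame['test2to2a'], frame['test2to2b'] = zip(*results)
print(frame)
```

The column names appear only at the call site, so the function itself stays free of hard-coded names. An alternative under the same constraint is to have the function return a `pd.Series`, which `apply(..., axis=1)` expands into columns.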

Source: https://stackoverflow.com/questions/28398345/how-to-create-multiple-columns-from-multiple-columns-in-a-pandas-data-frame
