Pandas: Shift one column by other column value

Submitted by 半世苍凉 on 2019-12-12 15:26:31

Question


I'm trying to use one column's values to shift another column's values by that amount. Per the documentation, pandas shift() takes an integer, but is there a way to pass a Series instead?

Current Code:

import pandas as pd

df = pd.DataFrame({ 'a':[1,2,3,4,5,6,7,8,9,10],
                    'b':[0,0,0,0,4,4,4,0,0,0]})

df['a'] = df['a'].shift(df['b'])

...which is of course not working.

Desired output:

    a  b
0   1  0
1   2  0
2   3  0
3   4  0
4   1  4
5   2  4
6   3  4
7   8  0
8   9  0
9  10  0

If it makes it easier, the shift will always be the same, so theoretically the 'b' series could be True / False or some other binary trigger, and the .shift() could still be an integer. Feels a little hacky going that route, but it would get the job done.


Answer 1:


We can use a Numba solution:

import numpy as np
from numba import jit

@jit
def dyn_shift(s, step):
    assert len(s) == len(step), "[s] and [step] should have the same length"
    assert isinstance(s, np.ndarray), "[s] should have [numpy.ndarray] dtype"
    assert isinstance(step, np.ndarray), "[step] should have [numpy.ndarray] dtype"
    N = len(s)
    res = np.empty(N, dtype=s.dtype)
    for i in range(N):
        # row i takes the value found step[i] rows earlier;
        # if i - step[i] is negative, the index wraps around to the end of s
        res[i] = s[i-step[i]]
    return res

Result:

In [302]: df['new'] = dyn_shift(df['a'].values, df['b'].values)
# NOTE: we should pass Numpy arrays:   ^^^^^^^         ^^^^^^^

In [303]: df
Out[303]:
    a  b  new
0   1  0    1
1   2  0    2
2   3  0    3
3   4  0    4
4   5  4    1
5   6  4    2
6   7  4    3
7   8  0    8
8   9  0    9
9  10  0   10
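
If Numba is not available, the same per-row lookup can probably be done with plain NumPy fancy indexing. The following is only a sketch, not part of the original answer: the helper name shift_by_column is hypothetical, and unlike the loop above it assumes that rows whose source index falls outside the column should become NaN rather than wrap around.

```python
import numpy as np
import pandas as pd

def shift_by_column(values: pd.Series, steps: pd.Series) -> pd.Series:
    """Hypothetical helper: row i takes values[i - steps[i]], or NaN if out of range."""
    vals = values.to_numpy()
    # source position for every row: i - steps[i]
    src = np.arange(len(vals)) - steps.to_numpy()
    # assumption: positions outside the column yield NaN instead of wrapping around
    valid = (src >= 0) & (src < len(vals))
    out = np.full(len(vals), np.nan)
    out[valid] = vals[src[valid]]
    return pd.Series(out, index=values.index)

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'b': [0, 0, 0, 0, 4, 4, 4, 0, 0, 0]})
df['new'] = shift_by_column(df['a'], df['b'])
```

Because NaN is used as the fill value, the new column comes back as float; cast it back with astype(int) if no row falls out of range.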



Answer 2:


Figured it out:

df.loc[df['b'] == 4, 'a'] = df['a'].shift(4)

...this is the 'hacky' version I referred to above. The first 4 is really just a trigger and the second 4 would be hard-coded.
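
If 'b' can hold more than one distinct shift amount, the same masking idea might be extended by looping over the unique values in 'b', so nothing has to be hard-coded. This is a sketch under that assumption, not something from the original answer; the variable names result and rows are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'b': [0, 0, 0, 0, 4, 4, 4, 0, 0, 0]})

result = df['a'].copy()
for step in df['b'].unique():
    if step == 0:
        continue  # a shift of zero leaves the row unchanged
    rows = df['b'] == step
    # replace only the rows that ask for this shift amount
    result = result.mask(rows, df['a'].shift(int(step)))

df['a'] = result
```

This calls shift() once per distinct value in 'b', so it stays cheap when there are only a few different shift amounts. The values may come back as floats, since shift() introduces NaN.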



Source: https://stackoverflow.com/questions/45023685/pandas-shift-one-column-by-other-column-value
