Here is my code:
import pandas as pd
a = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]], columns=['A', 'B'])
print(a)
a['C'] = 1 # or np.nan or is there a way to
From your code:
# Variable a BEFORE apply
   A   B
0  1   2
1  3   4
2  5   6
3  7   8
4  9  10
# Variable a AFTER apply
   A   B   C
0  1   2   4
1  3   4   8
2  5   6  12
3  7   8  16
4  9  10  20
Assuming this output is really what you want, then:
a = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]], columns=['A', 'B'])
a['C'] = a['A'] + a['B'] + 1
I'm a little confused as to why you would want to access a['C'].shift(1)
since all the values are the same anyway, and you are trying not to initialize it.
If you want a working example of using df.shift(n), try:
a['Shift'] = a['A'] + a['B'].shift(1)
Which would give you:
   A   B   C  Shift
0  1   2   4    NaN
1  3   4   8    5.0
2  5   6  12    9.0
3  7   8  16   13.0
4  9  10  20   17.0
This would give you A(i) + B(i-1), where i is the row number, because shift(1) moves column B down by one row. The first row has no previous B value, so the first sum is NaN.
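To make the direction of the shift concrete, here is a small sketch on the same frame. The variable names prev_b, next_b and prev_b_filled are just illustrative, and using fill_value=0 is an assumption about how you might want to treat the missing first row, not something the question asked for.

```python
import pandas as pd

a = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]], columns=['A', 'B'])

# shift(1) moves values down one row, so row i sees the value from row i-1.
prev_b = a['B'].shift(1)

# shift(-1) moves values up one row, so row i sees the value from row i+1.
next_b = a['B'].shift(-1)

# fill_value replaces the NaN created at the edge (assumed choice: 0).
prev_b_filled = a['B'].shift(1, fill_value=0)

print(prev_b.tolist())
print(next_b.tolist())
print((a['A'] + prev_b_filled).tolist())
```

With a fill value the first row is no longer NaN, so the sum column can stay integer instead of being upcast to float.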
I guess you need this:
a['C'] = a['A'] + a['B']
a['D'] = a['C'].cumsum()
because adding each value to the running total of the previous rows is a cumulative sum.
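As a quick check of that claim, the sketch below compares cumsum() against an explicit loop. It assumes the recurrence you have in mind is D[i] = D[i-1] + A[i] + B[i] with a starting total of 0; the names running and expected are only for illustration.

```python
import pandas as pd

a = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]], columns=['A', 'B'])
a['C'] = a['A'] + a['B']
a['D'] = a['C'].cumsum()

# Explicit loop for the assumed recurrence D[i] = D[i-1] + C[i], starting at 0.
running = 0
expected = []
for value in a['C']:
    running += value
    expected.append(running)

print(a['D'].tolist())  # [3, 10, 21, 36, 55]
print(expected)         # same values as the cumsum column
```

If the running total should start from something other than 0, adding that constant to the cumsum result should give the same effect as seeding the loop with it.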