How can I subtract a Series from a DataFrame while keeping the DataFrame structure intact?
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((5,3)))
s = pd.Series(np.ones(5))
df - s  # matches s's index against df's columns, not its rows
Maybe:
>>> df = pd.DataFrame(np.zeros((5,3)))
>>> s = pd.Series(np.ones(5))
>>> df.sub(s,axis=0)
   0  1  2
0 -1 -1 -1
1 -1 -1 -1
2 -1 -1 -1
3 -1 -1 -1
4 -1 -1 -1
[5 rows x 3 columns]
or, for a more interesting example:
>>> s = pd.Series(np.arange(5))
>>> df.sub(s,axis=0)
   0  1  2
0  0  0  0
1 -1 -1 -1
2 -2 -2 -2
3 -3 -3 -3
4 -4 -4 -4
[5 rows x 3 columns]
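To see why the plain `df - s` from the question behaves differently, here is a small sketch comparing the two calls. By default the arithmetic operators match the Series index against the DataFrame's column labels, while `axis=0` matches it against the row index. The names `by_columns` and `by_rows` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((5, 3)))
s = pd.Series(np.arange(5))

# Default: the Series index (0..4) is matched against the column labels (0..2),
# so the result has columns 0..4, and columns 3 and 4 are entirely NaN.
by_columns = df - s

# axis=0: the Series index is matched against the row index instead,
# so row i of every column has s[i] subtracted from it.
by_rows = df.sub(s, axis=0)

print(by_columns.shape)  # (5, 5)
print(by_rows.shape)     # (5, 3)
```

So the DataFrame keeps its shape only in the `axis=0` case.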
If a1 is a DataFrame with n columns and a2 is another DataFrame with just one column, you can subtract a2 from each column of a1 using NumPy:
np.subtract(a1, a2)
You can achieve the same result if a2 is a Series, as long as you convert it to a DataFrame first:
np.subtract(a1, a2.to_frame())
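Whether `np.subtract` on two DataFrames works on the raw values or aligns the labels first depends on the pandas version (my understanding is that newer releases align, which would give NaN here because the column names differ). As a hedged alternative, not the method above, the positional subtraction can be spelled out by converting the right-hand side to a NumPy column. The variable `column` is just an illustrative name.

```python
import pandas as pd

a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
a2 = pd.Series([1, 2], name='c')

# Shape (2, 1): one value per row, broadcast across the columns of a1.
# The labels of a2 are discarded, so only row positions matter.
column = a2.to_numpy().reshape(-1, 1)
result = a1 - column

print(result)
```

The result keeps the index and columns of a1, since a2 contributes values only.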
Before computing this, you should probably make sure the indices of the two DataFrames are consistent with each other. The operations above were written against a pandas version in which np.subtract works on the underlying values, so they also run when a1 and a2 have the same number of rows but different indices (newer pandas versions may align on labels instead). You can try
a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a','b'])
a2 = pd.DataFrame([[1], [2]], columns=['c'])
np.subtract(a1, a2)
and
a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a','b'])
a2 = pd.DataFrame([[1], [2]], columns=['c'], index=[3,4])
np.subtract(a1,a2)
will give you the same result.
For this reason, to make sure the two DataFrames are coherent, you could preprocess using something like:
def align_dataframes(df1, df2):
    # join_axes was removed from pd.concat in pandas 1.0; reindex keeps df1's rows
    r = pd.concat([df1, df2], axis=1).reindex(df1.index)
    return r.loc[:, df1.columns], r.loc[:, df2.columns]
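A quick check of what such an alignment helper does when the indices only partly overlap. This sketch redefines the helper with `reindex` (since `join_axes` is no longer accepted by `pd.concat` in pandas 1.0 and later) and assumes the two frames have no column names in common. The example data is made up.

```python
import pandas as pd

def align_dataframes(df1, df2):
    # Put both frames side by side, then keep exactly the rows of df1.
    r = pd.concat([df1, df2], axis=1).reindex(df1.index)
    return r.loc[:, df1.columns], r.loc[:, df2.columns]

a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])     # index 0, 1
a2 = pd.DataFrame([[1], [2]], columns=['c'], index=[1, 2])  # index 1, 2

a1_aligned, a2_aligned = align_dataframes(a1, a2)

# a2 has no row labelled 0, so that row is NaN after alignment
# and stays NaN in the difference.
result = a1_aligned.sub(a2_aligned['c'], axis=0)
print(result)
```

Rows that exist only in a2 (label 2 here) are dropped, and rows missing from a2 show up as NaN instead of being silently paired with the wrong values.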
I'll throw in an example that modifies a subset of a DataFrame:
df = pd.DataFrame(np.arange(20).reshape((2,10)),columns=list('abcdefghjk'))
>>> df
    a   b   c   d   e   f   g   h   j   k
0   0   1   2   3   4   5   6   7   8   9
1  10  11  12  13  14  15  16  17  18  19
# Series to be subtracted
dif = df['g'] - df['h']
>>> dif
0   -1
1   -1
dtype: int32
# subtract the Series from columns 'g','h','j','k'
df.loc[:,'g':] = df.loc[:,'g':].subtract(dif,axis='rows')
#df.loc[:,'g':] = df.loc[:,'g':].subtract(dif,axis=0)
>>> df
    a   b   c   d   e   f   g   h   j   k
0   0   1   2   3   4   5   7   8   9  10
1  10  11  12  13  14  15  17  18  19  20