Subtract a Series from a DataFrame while keeping the DataFrame structure intact

无人及你 2020-12-10 15:05

How can I subtract a Series from a DataFrame while keeping the DataFrame structure intact?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((5,3)))
s = pd.Series(np.ones(5))

df - s


        
3 Answers
  • 2020-12-10 16:06

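    Use DataFrame.sub with axis=0, which aligns the Series index with the DataFrame's row index instead of its columns: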

    >>> df = pd.DataFrame(np.zeros((5,3)))
    >>> s = pd.Series(np.ones(5))
    >>> df.sub(s,axis=0)
       0  1  2
    0 -1 -1 -1
    1 -1 -1 -1
    2 -1 -1 -1
    3 -1 -1 -1
    4 -1 -1 -1
    
    [5 rows x 3 columns]
    

    or, for a more interesting example:

    >>> s = pd.Series(np.arange(5))
    >>> df.sub(s,axis=0)
       0  1  2
    0  0  0  0
    1 -1 -1 -1
    2 -2 -2 -2
    3 -3 -3 -3
    4 -4 -4 -4
    
    [5 rows x 3 columns]
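
    For context on why the plain df - s from the question does not do this: by default a Series is aligned against the DataFrame's column labels, not its rows. The following is a minimal sketch using the question's own df and s; the names by_columns, by_rows and expected are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((5, 3)))
s = pd.Series(np.ones(5))

# Default operator: the Series index (0..4) is matched against the
# column labels (0..2), so the result gains columns 3 and 4 filled with NaN.
by_columns = df - s

# axis=0 matches the Series index against the row index instead.
by_rows = df.sub(s, axis=0)

# Same numbers as NumPy broadcasting a column vector across the columns.
expected = df.to_numpy() - s.to_numpy()[:, None]
```

    Here by_columns has shape (5, 5), while by_rows keeps the original (5, 3) shape.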
    
  • 2020-12-10 16:09

    If a1 is a DataFrame with n columns and a2 is another DataFrame with just one column, you can subtract a2 from each column of a1 using NumPy:

    np.subtract(a1, a2)
    

    You can achieve the same result if a2 is a Series, provided you convert it to a DataFrame first:

    np.subtract(a1, a2.to_frame()) 
    

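    Before doing this, make sure the indices of the two DataFrames are consistent, because in older pandas versions np.subtract ignores the labels: the operations above work by position whenever a1 and a2 have the same number of rows, even if their indices differ. (pandas 2.0 and later align the operands on their labels first, so there you would pass a2.to_numpy() to get the positional behaviour.) You can try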

    a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a','b'])
    a2 = pd.DataFrame([[1], [2]], columns=['c'])
    
    np.subtract(a1, a2)
    

    and

    a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a','b'])
    a2 = pd.DataFrame([[1], [2]], columns=['c'], index=[3,4])
    
    np.subtract(a1,a2)
    

    will give you the same result.
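
    If positional subtraction is what you actually want, it may be safer to say so explicitly rather than rely on how a given pandas version treats NumPy ufuncs. A minimal sketch, assuming the same a1 and a2 as above; by_position is just an illustrative name:

```python
import pandas as pd

a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
a2 = pd.DataFrame([[1], [2]], columns=['c'], index=[3, 4])

# Convert the single column to a plain NumPy array so its labels (3, 4)
# are dropped; axis=0 then applies it down the rows of a1 by position.
by_position = a1.sub(a2['c'].to_numpy(), axis=0)
```

    The result keeps a1's index and columns. Either way, a2's labels are silently ignored here.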

    For this reason, to make sure the two DataFrames are coherent, you could preprocess using something like:

    def align_dataframes(df1, df2):
        # join_axes was removed in pandas 1.0; reindex to df1's index instead
        r = pd.concat([df1, df2], axis=1).reindex(df1.index)
        return r.loc[:,df1.columns], r.loc[:,df2.columns]
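
    pandas also has a built-in DataFrame.align that can play the same role as the helper above. A rough sketch, assuming the mismatched a1 and a2 from the earlier example; left, right and result are illustrative names:

```python
import pandas as pd

a1 = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
a2 = pd.DataFrame([[1], [2]], columns=['c'], index=[3, 4])

# Align rows only (axis=0), keeping a1's index; row labels that are
# missing from a2 show up as NaN in the aligned copy.
left, right = a1.align(a2, join='left', axis=0)

# Label-based subtraction now exposes the mismatch as NaN instead of hiding it.
result = left.sub(right['c'], axis=0)
```

    Because a1 and a2 share no row labels in this example, every value in result is NaN, which makes the inconsistency visible instead of producing numbers computed by position.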
    
  • 2020-12-10 16:09

    I'll throw in an example that modifies a subset of a DataFrame:

    df = pd.DataFrame(np.arange(20).reshape((2,10)),columns=list('abcdefghjk'))
    
    >>> df
        a   b   c   d   e   f   g   h   j   k
    0   0   1   2   3   4   5   6   7   8   9
    1  10  11  12  13  14  15  16  17  18  19
    
    # Series to be subtracted    
    dif = df['g'] - df['h']
    
    >>> dif
    0   -1
    1   -1
    dtype: int32
    
    # subtract the Series from columns 'g','h','j','k'
    df.loc[:,'g':] = df.loc[:,'g':].subtract(dif,axis='rows')
    # equivalent: df.loc[:,'g':] = df.loc[:,'g':].subtract(dif,axis=0)
    
    >>> df
        a   b   c   d   e   f   g   h   j   k
    0   0   1   2   3   4   5   7   8   9  10
    1  10  11  12  13  14  15  17  18  19  20
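
    The same idea should also work when the columns are not a contiguous slice. A small sketch, assuming an explicit list of column names (cols is a name introduced here for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(20).reshape((2, 10)), columns=list('abcdefghjk'))
dif = df['g'] - df['h']  # -1 in both rows

# Pick the target columns by name instead of slicing from 'g' onwards.
cols = ['g', 'h', 'j', 'k']
df[cols] = df[cols].sub(dif, axis=0)
```

    Columns not listed in cols are left untouched.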
    