numpy diff on a pandas Series

难免孤独 2021-02-07 22:17

I want to use numpy.diff on a pandas Series. Am I right that this is a bug? Or am I doing it wrong?

In [163]: s = Series(np.arange(10))

In [164]: np.diff(s)
Out         


        
1 Answer
  •  有刺的猬
    2021-02-07 22:58

    Pandas implements diff like so:

    In [3]: s = pd.Series(np.arange(10))
    
    In [4]: s.diff()
    Out[4]:
    0   NaN
    1     1
    2     1
    3     1
    4     1
    5     1
    6     1
    7     1
    8     1
    9     1
    

    Using np.diff directly:

    In [7]: np.diff(s.values)
    Out[7]: array([1, 1, 1, 1, 1, 1, 1, 1, 1])
    
    In [8]: np.diff(np.array(s))
    Out[8]: array([1, 1, 1, 1, 1, 1, 1, 1, 1])
    

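    So why doesn't np.diff(s) work? Because NumPy calls np.asanyarray() on the Series before computing the difference. Here that call hands the Series back unchanged rather than a plain array, so the subtraction of the two shifted slices inside np.diff is aligned on index labels instead of positions. Like so: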

    In [25]: a = np.asanyarray(s)
    
    In [26]: a 
    Out[26]:
    0    0
    1    1
    2    2
    3    3
    4    4
    5    5
    6    6
    7    7
    8    8
    9    9
    
    In [27]: np.diff(a)
    Out[27]:
    0   NaN
    1     0
    2     0
    3     0
    4     0
    5     0
    6     0
    7     0
    8     0
    9   NaN
    
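The label alignment described above can be reproduced without calling np.diff on the Series at all. The following is a minimal sketch, not taken from the original answer: the use of `iloc` for the shifted slices and `to_numpy()` for the conversion are assumptions chosen for illustration, and the result of `np.diff(s)` itself may differ between pandas versions, so the sketch does not call it.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10))

# Subtracting two shifted slices of a Series matches values by index label,
# not by position. Labels 0 and 9 exist on only one side, so they become NaN;
# labels 1..8 are subtracted from themselves, giving 0.
aligned = s.iloc[1:] - s.iloc[:-1]

# Converting to a plain NumPy array first drops the index,
# so the difference is taken by position.
positional = np.diff(s.to_numpy())

# The pandas method gives the same values, with a leading NaN
# for the first element, which has no predecessor.
pandas_diff = s.diff()

print(aligned)
print(positional)
print(pandas_diff)
```

If a Series result is wanted, `s.diff()` keeps the index; if a plain array is wanted, converting first and then calling `np.diff` avoids the alignment step.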
