Autocorrelation to estimate periodicity with numpy

傲寒 2021-02-06 06:40

I have a large set of time series (> 500), and I'd like to select only the ones that are periodic. I did a bit of literature research and found out that I should look for autocor

1 answer
  •  北海茫月
    2021-02-06 06:59

    I would use mode='same' instead of mode='full'. With mode='full' we also get covariances for extreme shifts, where just one array element overlaps the original and the rest is zero padding. Those are not going to be interesting. With mode='same', at least half of the shifted array overlaps the original one.
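To make the overlap argument concrete, here is a small sketch that correlates an all-ones array with itself, so each output value is simply the number of overlapping elements at that shift. The array length `n = 6` is an arbitrary choice for illustration.

```python
import numpy as np

n = 6  # arbitrary illustrative length
ones = np.ones(n)

# Correlating an all-ones array with itself counts the overlapping
# elements at every shift.
overlap_full = np.correlate(ones, ones, mode='full')
overlap_same = np.correlate(ones, ones, mode='same')

print(overlap_full)  # [1. 2. 3. 4. 5. 6. 5. 4. 3. 2. 1.]
print(overlap_same)  # [3. 4. 5. 6. 5. 4.]

# In 'same' mode the zero-lag peak sits at index n//2; the positive lags
# follow it, and their overlap sizes match np.arange(n-1, n//2, -1).
print(overlap_same[n//2 + 1:])      # [5. 4.]
print(np.arange(n - 1, n//2, -1))   # [5 4]
```

In 'full' mode the outermost shifts overlap in a single element, while in 'same' mode the smallest overlap is n//2.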

    Also, to get the true correlation coefficient (r) you need to divide by the size of the overlap, not by the size of the original x (in my code the overlap sizes are np.arange(n-1, n//2, -1)). Each of the outputs will then be approximately between -1 and 1.

    A glance at the Durbin–Watson statistic, which is approximately 2(1-r), suggests that people consider values below 1 to be a significant indication of autocorrelation, which corresponds to r > 0.5. So this is the threshold I use below. For a statistically sound treatment of the significance of autocorrelation, refer to the statistics literature; a starting point would be to have a model for your time series.
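The relation between the Durbin–Watson statistic and r can be checked numerically. The sketch below is only an illustration under some assumptions: the statistic is normally computed on regression residuals, whereas here it is applied to a mean-removed sine wave; the helper names `durbin_watson` and `lag1_autocorr` are made up for this example; and r is the plain lag-1 estimate normalized by the full sum of squares, not the overlap-normalized value used in the answer's code.

```python
import numpy as np

def durbin_watson(e):
    """Sum of squared successive differences over the sum of squares."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def lag1_autocorr(e):
    """Lag-1 autocorrelation, normalized by the full sum of squares."""
    e = np.asarray(e, dtype=float)
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

# Hypothetical test signal: a sine wave with a period of 20 samples.
t = np.arange(200)
x = np.sin(2 * np.pi * t / 20)
e = x - x.mean()

d = durbin_watson(e)
r = lag1_autocorr(e)
print(d, 2 * (1 - r))  # the two values differ only by a small edge term
```

Expanding the squared differences shows that d equals 2(1-r) minus the first and last squared samples divided by the sum of squares, so the approximation tightens as the series gets longer.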

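    import numpy as np

    def autocorr(x):
        n = x.size
        norm = x - np.mean(x)  # remove the mean before correlating
        # With mode='same' the zero-lag term sits at index n//2.
        result = np.correlate(norm, norm, mode='same')
        # Keep lags 1, 2, ... and divide each by variance * overlap size (n - lag).
        acorr = result[n//2 + 1:] / (x.var() * np.arange(n-1, n//2, -1))
        lag = np.abs(acorr).argmax() + 1  # lag with the largest |r|
        r = acorr[lag-1]
        if np.abs(r) > 0.5:
            print('Appears to be autocorrelated with r = {}, lag = {}'.format(r, lag))
        else:
            print('Appears to be not autocorrelated')
        return r, lag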
    

    Output for your two toy examples:

    Appears to be not autocorrelated
    Appears to be autocorrelated with r = 1.0, lag = 4
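For selecting series out of a larger collection, a non-printing variant may be more convenient. The following is a minimal sketch, not part of the original answer: `max_autocorr` is a hypothetical helper that repeats the computation of `autocorr`, and the two series in the dictionary are made-up stand-ins for real data.

```python
import numpy as np

def max_autocorr(x):
    """Return (r, lag) for the lag with the largest |r|, without printing."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    result = np.correlate(centered, centered, mode='same')
    # Lags 1, 2, ... normalized by variance * overlap size (n - lag).
    acorr = result[n//2 + 1:] / (x.var() * np.arange(n - 1, n//2, -1))
    lag = int(np.abs(acorr).argmax()) + 1
    return acorr[lag - 1], lag

# Hypothetical collection: one strictly periodic series and one noise series.
rng = np.random.default_rng(0)
series = {
    'periodic': np.tile([0.0, 1.0, 0.0, -1.0], 25),
    'white_noise': rng.normal(size=500),
}

# Keep only the series whose strongest autocorrelation exceeds the 0.5 threshold.
selected = [name for name, x in series.items()
            if abs(max_autocorr(x)[0]) > 0.5]
print(selected)
```

A constant series has zero variance and would cause a division by zero here, so such series should be filtered out beforehand.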
