How to limit cross correlation window width in Numpy?

和自甴很熟 submitted on 2019-11-29 07:23:33

Here are a couple of functions that compute auto- and cross-correlation with a limited number of lags. The order of multiplication (and conjugation, in the complex case) was chosen to match the corresponding behavior of numpy.correlate.

import numpy as np
from numpy.lib.stride_tricks import as_strided


def _check_arg(x, xname):
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError('%s must be one-dimensional.' % xname)
    return x

def autocorrelation(x, maxlag):
    """
    Autocorrelation with a maximum number of lags.

    `x` must be a one-dimensional numpy array.

    This computes the same result as
        numpy.correlate(x, x, mode='full')[len(x)-1:len(x)+maxlag]

    The return value has length maxlag + 1.
    """
    x = _check_arg(x, 'x')
    p = np.pad(x.conj(), maxlag, mode='constant')
    T = as_strided(p[maxlag:], shape=(maxlag+1, len(x) + maxlag),
                   strides=(-p.strides[0], p.strides[0]))
    return T.dot(p[maxlag:].conj())


def crosscorrelation(x, y, maxlag):
    """
    Cross correlation with a maximum number of lags.

    `x` and `y` must be one-dimensional numpy arrays with the same length.

    This computes the same result as
        numpy.correlate(x, y, mode='full')[len(x)-maxlag-1:len(x)+maxlag]

    The return value has length 2*maxlag + 1.
    """
    x = _check_arg(x, 'x')
    y = _check_arg(y, 'y')
    py = np.pad(y.conj(), 2*maxlag, mode='constant')
    T = as_strided(py[2*maxlag:], shape=(2*maxlag+1, len(y) + 2*maxlag),
                   strides=(-py.strides[0], py.strides[0]))
    px = np.pad(x, maxlag, mode='constant')
    return T.dot(px)

For example,

In [367]: x = np.array([2, 1.5, 0, 0, -1, 3, 2, -0.5])

In [368]: autocorrelation(x, 3)
Out[368]: array([ 20.5,   5. ,  -3.5,  -1. ])

In [369]: np.correlate(x, x, mode='full')[7:11]
Out[369]: array([ 20.5,   5. ,  -3.5,  -1. ])

In [370]: y = np.arange(8)

In [371]: crosscorrelation(x, y, 3)
Out[371]: array([  5. ,  23.5,  32. ,  21. ,  16. ,  12.5,   9. ])

In [372]: np.correlate(x, y, mode='full')[4:11]
Out[372]: array([  5. ,  23.5,  32. ,  21. ,  16. ,  12.5,   9. ])

(It would be nice to have such a feature in numpy itself.)
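To see what the as_strided call in autocorrelation builds, here is a minimal sketch on a three-element input with maxlag = 1. The tiny input and variable names are chosen purely for illustration. The negative row stride makes each row start one element earlier in the padded buffer, so row i holds the signal shifted right by i samples, and a single matrix-vector product then yields all the requested lags.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([1, 2, 3])
maxlag = 1

# Zero-pad both ends so the shifted rows read zeros rather than stray memory.
p = np.pad(x, maxlag, mode='constant')          # [0, 1, 2, 3, 0]

# Row i starts i elements *before* p[maxlag], hence the negative row stride.
T = as_strided(p[maxlag:], shape=(maxlag + 1, len(x) + maxlag),
               strides=(-p.strides[0], p.strides[0]))
# T is [[1, 2, 3, 0],
#       [0, 1, 2, 3]]

result = T.dot(p[maxlag:])                      # lags 0 and 1
reference = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + maxlag]
```

Because as_strided does no bounds checking, the padding is what keeps every row inside the allocated buffer.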

matplotlib.pyplot provides MATLAB-like syntax for computing and plotting cross-correlation, autocorrelation, etc.

You can use xcorr, which accepts a maxlags parameter.

    import matplotlib.pyplot as plt
    import numpy as np

    data = np.arange(0, 2*np.pi, 0.01)
    y1 = np.sin(data)
    y2 = np.cos(data)

    coeff = plt.xcorr(y1, y2, maxlags=10)
    print(*coeff)


[-10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
   8   9  10] [ -9.81991753e-02  -8.85505028e-02  -7.88613080e-02  -6.91325329e-02
  -5.93651264e-02  -4.95600447e-02  -3.97182508e-02  -2.98407146e-02
  -1.99284126e-02  -9.98232812e-03  -3.45104289e-06   9.98555430e-03
   1.99417667e-02   2.98641953e-02   3.97518558e-02   4.96037706e-02
   5.94189688e-02   6.91964864e-02   7.89353663e-02   8.86346584e-02
   9.82934198e-02] <matplotlib.collections.LineCollection object at 0x00000000074A9E80> Line2D(_line0)
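The small magnitudes in the output above are consistent with a normalised correlation. As a rough sketch, assuming the scaling is a division by sqrt(dot(x, x) * dot(y, y)) with no detrending, the same numbers can be reproduced with NumPy alone and without drawing a plot. The helper name normalized_xcorr is hypothetical, not part of matplotlib.

```python
import numpy as np

def normalized_xcorr(x, y, maxlags):
    """Hypothetical helper: lag-limited, normalised cross-correlation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    full = np.correlate(x, y, mode='full')
    # Assumed normalisation: a signal correlated with itself gives 1 at lag 0.
    full = full / np.sqrt(np.dot(x, x) * np.dot(y, y))
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, full[n - 1 - maxlags:n + maxlags]

data = np.arange(0, 2 * np.pi, 0.01)
lags, c = normalized_xcorr(np.sin(data), np.cos(data), maxlags=10)
```

Note that the full correlation is still computed before slicing, so this saves nothing over np.correlate in running time.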

Until numpy implements a maxlag argument, you can use the function ucorrelate from the pycorrelate package. ucorrelate operates on numpy arrays and has a maxlag keyword. It implements the correlation using a for-loop and optimizes the execution speed with numba.

Example - autocorrelation with 3 time lags:

import numpy as np
import pycorrelate as pyc

x = np.array([2, 1.5, 0, 0, -1, 3, 2, -0.5])
c = pyc.ucorrelate(x, x, maxlag=3)
c

Result:

Out[1]: array([20,  5, -3])
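The values above look like 20.5, 5 and -3.5 (what np.correlate gives for this x) with the fractional part dropped, which suggests an integer output array. For comparison, here is a plain NumPy sketch of the for-loop approach that keeps a float result. It is not pycorrelate's code: the name loop_correlate and the convention C[lag] = sum over n of t[n] * u[n + lag] are assumptions made for illustration.

```python
import numpy as np

def loop_correlate(t, u, maxlag):
    """Hypothetical helper: C[lag] = sum_n t[n] * u[n + lag], lag = 0 .. maxlag - 1."""
    t = np.asarray(t, dtype=float)
    u = np.asarray(u, dtype=float)
    out = np.zeros(maxlag, dtype=float)   # float output keeps fractional values
    for lag in range(maxlag):
        n = min(len(t), len(u) - lag)     # number of overlapping samples
        if n <= 0:
            break                         # no overlap left; remaining lags stay 0
        out[lag] = np.dot(t[:n], u[lag:lag + n])
    return out

x = np.array([2, 1.5, 0, 0, -1, 3, 2, -0.5])
c = loop_correlate(x, x, maxlag=3)
```

Each iteration touches only the overlapping samples, so the cost scales with maxlag times the signal length rather than with the square of the signal length.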

The pycorrelate documentation contains a notebook showing a perfect match between pycorrelate.ucorrelate and numpy.correlate.

I think I have found a solution, as I was facing the same problem:

If you have two vectors x and y of the same length N, and want a cross-correlation limited to a fixed number of lags, window, you can do:

import numpy as np

x = <some_data>
y = <some_data>
window = <max_lag>  # maximum lag to keep on each side

# Trim your variables
x_short = x[window:]
y_short = y[window:]

# do two xcorrelations, lagging x and y respectively
left_xcorr = np.correlate(x, y_short)  # defaults to 'valid'
right_xcorr = np.correlate(x_short, y) # defaults to 'valid'

# combine the xcorrelations
# note the first value of right_xcorr is the same as the last of left_xcorr
# np.concatenate takes a single sequence of arrays, hence the tuple
xcorr = np.concatenate((left_xcorr, right_xcorr[1:]))

Remember that you may need to normalise the variables if you want a bounded correlation.
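A runnable sketch of the trimming approach, with made-up random data and window = 5 chosen only for illustration. The lags array is an addition that labels each output value, assuming the np.correlate convention that lag k pairs x[n + k] with y[n].

```python
import numpy as np

rng = np.random.default_rng(0)      # illustrative data, not from the answer
x = rng.standard_normal(50)
y = rng.standard_normal(50)
window = 5                          # maximum lag on each side

x_short = x[window:]
y_short = y[window:]

left_xcorr = np.correlate(x, y_short)    # 'valid' mode: lags -window .. 0
right_xcorr = np.correlate(x_short, y)   # 'valid' mode: lags 0 .. window

# Both halves contain the lag-0 term, so drop one copy before joining.
xcorr = np.concatenate((left_xcorr, right_xcorr[1:]))
lags = np.arange(-window, window + 1)
```

Every lag here is summed over the same N - window samples, so the values generally differ from the corresponding slice of np.correlate(x, y, mode='full'), which uses all overlapping samples at each lag.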

Here is another answer, sourced from here, that seems marginally faster than np.correlate and has the benefit of returning a normalised correlation:

import numpy as np

def rolling_window(a, window):
    # View of `a` whose last axis holds every length-`window` slice (no copy).
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def xcorr(x, y):
    # Normalised correlation of y with each len(y)-wide window of the longer x.
    x = np.asarray(x)
    y = np.asarray(y)
    N = len(x)
    M = len(y)
    meany = np.mean(y)
    stdy = np.std(y)
    tmp = rolling_window(x, M)
    c = np.sum((y - meany) * (tmp - np.reshape(np.mean(tmp, -1), (N - M + 1, 1))), -1) / (M * np.std(tmp, -1) * stdy)

    return c
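The same quantity can be sketched with sliding_window_view, which checks the window size for you instead of relying on hand-computed strides. This assumes NumPy 1.20 or later; the name sliding_pearson and the test signal are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

def sliding_pearson(x, y):
    """Hypothetical helper: Pearson correlation of y with every len(y)-wide window of x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    windows = sliding_window_view(x, len(y))             # shape (N - M + 1, M)
    wc = windows - windows.mean(axis=-1, keepdims=True)  # centre each window
    yc = y - y.mean()
    denom = len(y) * windows.std(axis=-1) * y.std()
    return (wc * yc).sum(axis=-1) / denom

x = np.array([2, 1.5, 0, 0, -1, 3, 2, -0.5])
y = x[3:7]                      # template cut from x itself
c = sliding_pearson(x, y)
```

Since y was cut from x starting at index 3, the correlation at that position is 1. A window with zero variance would make the denominator zero, so constant stretches need separate handling.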

As I answered here, https://stackoverflow.com/a/47897581/5122657, matplotlib's xcorr has a maxlags parameter. It is actually a wrapper around numpy.correlate, so there is no performance saving. Nevertheless, it gives exactly the same result as Matlab's cross-correlation function. Below I edited the code from matplotlib so that it returns only the correlation. The reason is that matplotlib's xcorr, used as is, draws the plot as well. The problem is that if we pass complex data to it, we get a "casting complex to real datatype" warning when matplotlib tries to draw the plot.


import numpy as np
import matplotlib.pyplot as plt

def xcorr(x, y, maxlags=10):
    Nx = len(x)
    if Nx != len(y):
        raise ValueError('x and y must be equal length')

    c = np.correlate(x, y, mode='full')

    if maxlags is None:
        maxlags = Nx - 1

    if maxlags >= Nx or maxlags < 1:
        raise ValueError('maxlags must be None or strictly positive < %d' % Nx)

    c = c[Nx - 1 - maxlags:Nx + maxlags]

    return c
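Since the trimmed function no longer returns the lags, here is a small sketch of how to pair each value with its lag and read off a delay. The random signal, the 4-sample delay and the variable names are made up for illustration; the slicing is the same as in the function above.

```python
import numpy as np

rng = np.random.default_rng(1)          # illustrative data
x = rng.standard_normal(200)
delay = 4
y = np.concatenate((np.zeros(delay), x[:-delay]))   # y is x delayed by 4 samples

maxlags = 10
n = len(x)
c = np.correlate(y, x, mode='full')[n - 1 - maxlags:n + maxlags]
lags = np.arange(-maxlags, maxlags + 1)              # lag for each entry of c

best_lag = lags[np.argmax(c)]           # lag at which y lines up best with x
```

With np.correlate(y, x), a positive peak lag means y trails x by that many samples; swapping the arguments flips the sign.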