I have to compare two time-vs-voltage waveforms. Because of the peculiarity of the sources of these waveforms, one of them can be a time shifted version of the other.
SciPy provides a correlation function, signal.correlate, which works fine for small inputs and is also the right choice if you want non-circular correlation, meaning that the signal does not wrap around. Note that in mode='full', the size of the array returned by signal.correlate is the sum of the signal sizes minus one (i.e. len(a) + len(b) - 1), so zero lag sits at index len(b) - 1 and the value from argmax is off by (signal size - 1 = 20) from what you seem to expect.
from scipy import signal, fftpack
import numpy
a = numpy.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = numpy.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])
numpy.argmax(signal.correlate(a, b))  # -> 16
numpy.argmax(signal.correlate(b, a))  # -> 24
The two different values correspond to whether the shift is in a or b: subtracting the zero-lag index 20 gives -4 and +4, i.e. b is a delayed by 4 samples.
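The index-to-lag conversion described above can be sketched as follows. The helper name signed_lag is made up for this illustration; it only subtracts the zero-lag index len(y) - 1 from the argmax. Recent SciPy versions also provide signal.correlation_lags, which should return the same lag axis.

```python
import numpy as np
from scipy import signal

a = np.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = np.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])

def signed_lag(x, y):
    """Lag (in samples) at which the full cross-correlation of x and y peaks.

    Hypothetical helper, not part of SciPy.
    """
    corr = signal.correlate(x, y, mode="full")
    # In 'full' mode, index len(y) - 1 corresponds to zero lag.
    lags = np.arange(corr.size) - (len(y) - 1)
    return int(lags[np.argmax(corr)])

lag_ab = signed_lag(a, b)  # negative: the pattern appears earlier in a than in b
lag_ba = signed_lag(b, a)  # positive: same shift seen from the other side
```

The two calls differ only in sign, which is the "shift in a or shift in b" ambiguity mentioned above.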
If you want circular correlation, or if the signals are large, you can use the convolution theorem (via the Fourier transform), with the caveat that correlation is very similar to, but not identical to, convolution.
A = fftpack.fft(a)
B = fftpack.fft(b)
Ar = -A.conjugate()
Br = -B.conjugate()
numpy.argmax(numpy.abs(fftpack.ifft(Ar*B)))  # -> 4
numpy.argmax(numpy.abs(fftpack.ifft(A*Br)))  # -> 17
Again, the two values correspond to whether you're interpreting a shift in a or a shift in b.
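Because the FFT-based result is circular, a peak index in the upper half of the array is really a negative shift (17 - 21 = -4 for the arrays above). A minimal sketch of that wrap-around, assuming equal-length real signals; the name circular_lag is hypothetical, and numpy.fft is used here instead of fftpack:

```python
import numpy as np

def circular_lag(a, b):
    """Signed circular shift k such that b is approximately np.roll(a, k).

    Hypothetical helper; assumes real signals of equal length.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("circular correlation needs equal-length signals")
    n = a.size
    # Circular cross-correlation: c[k] = sum_n a[n] * b[(n + k) mod N]
    c = np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real
    k = int(np.argmax(c))
    # Indices above n // 2 wrap around to negative shifts.
    return k if k <= n // 2 else k - n

a = np.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = np.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])
shift_ab = circular_lag(a, b)
shift_ba = circular_lag(b, a)
```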
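The conjugation is needed because convolution flips one of the functions, whereas correlation does not. You can undo the flip either by reversing one of the signals before taking the FFT, or by taking the FFT and then conjugating it. For a real signal the two agree up to a one-sample circular shift: A.conjugate() is the FFT of a reversed circularly, i.e. numpy.roll(a[::-1], 1), while fft(a[::-1]) carries an extra linear phase factor exp(2j*pi*k/N). The leading minus sign in the code above only flips the sign of the result, which numpy.abs hides, so it is not required.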
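The relation between conjugation and time reversal is easy to check numerically. The sketch below uses an arbitrary real test signal (an assumption for illustration) and numpy.fft rather than fftpack. It checks that the conjugate spectrum equals the FFT of the circularly reversed signal, and that a plain a[::-1] differs from it by a linear phase.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(16)          # arbitrary real test signal
N = a.size
k = np.arange(N)
A = np.fft.fft(a)

# Circular reversal: element n becomes a[(-n) mod N].
circ_reversed = np.roll(a[::-1], 1)
same_as_conj = np.allclose(np.fft.fft(circ_reversed), np.conj(A))

# Plain reversal is the circular reversal shifted by one sample,
# which shows up as a linear phase factor in the spectrum.
phase = np.exp(2j * np.pi * k / N)
plain_reversed_ok = np.allclose(np.fft.fft(a[::-1]), np.conj(A) * phase)
```

In a correlation, that extra phase only moves the peak by one sample, so either route works as long as the lag axis is adjusted accordingly.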
This function is probably more efficient for real-valued signals. It uses rfft and zero pads the inputs to a power of 2 large enough to ensure linear (i.e. non-circular) correlation:
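import numpy as np

def rfft_xcorr(x, y):
    M = len(x) + len(y) - 1             # length of the linear correlation
    N = 2 ** int(np.ceil(np.log2(M)))   # next power of 2 >= M
    X = np.fft.rfft(x, N)               # zero-padded FFTs
    Y = np.fft.rfft(y, N)
    cxy = np.fft.irfft(X * np.conj(Y), N)
    # keep the non-negative lags, then the negative lags; drop the padding
    cxy = np.hstack((cxy[:len(x)], cxy[N-len(y)+1:]))
    return cxy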
The return value has length M = len(x) + len(y) - 1 (hacked together with hstack to remove the extra zeros from rounding up to a power of 2). The non-negative lags are cxy[0], cxy[1], ..., cxy[len(x)-1], while the negative lags are cxy[-1], cxy[-2], ..., cxy[-len(y)+1].
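If a monotonically increasing lag axis is more convenient than the wrapped layout described above, the two pieces can be stacked in the opposite order. This is a sketch under that assumption; xcorr_lag_order is a hypothetical variant of the function, and the comparison against np.correlate is only a sanity check.

```python
import numpy as np

def xcorr_lag_order(x, y):
    """Linear cross-correlation via rfft, ordered from most negative to most positive lag.

    Hypothetical variant of the zero-padded rfft approach.
    """
    M = len(x) + len(y) - 1
    N = 2 ** int(np.ceil(np.log2(M)))
    cxy = np.fft.irfft(np.fft.rfft(x, N) * np.conj(np.fft.rfft(y, N)), N)
    # Negative lags live at the end of the circular buffer; move them to the front.
    return np.hstack((cxy[N - len(y) + 1:], cxy[:len(x)]))

x = np.array([2.0, -3.0, 9.0, 1.5, 3.0, 4.5, 6.0, 7.5, 0.0, 3.0, 8.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
lags = np.arange(-(len(y) - 1), len(x))   # lag axis matching the output
c = xcorr_lag_order(x, y)
agrees = np.allclose(c, np.correlate(x, y, mode="full"))
best_lag = int(lags[np.argmax(c)])
```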
To match a reference signal, I'd compute rfft_xcorr(x, ref) and look for the peak. For example:
def match(x, ref):
    cxy = rfft_xcorr(x, ref)
    index = np.argmax(cxy)
    if index < len(x):
        return index
    else:  # negative lag
        return index - len(cxy)
In [1]: ref = np.array([1,2,3,4,5])
In [2]: x = np.hstack(([2,-3,9], 1.5 * ref, [0,3,8]))
In [3]: match(x, ref)
Out[3]: 3
In [4]: x = np.hstack((1.5 * ref, [0,3,8], [2,-3,-9]))
In [5]: match(x, ref)
Out[5]: 0
In [6]: x = np.hstack((1.5 * ref[1:], [0,3,8], [2,-3,-9,1]))
In [7]: match(x, ref)
Out[7]: -1
It's not a robust way to match signals, but it is quick and easy.
(A very late answer.) To find the time shift between two signals, use the time-shift property of the Fourier transform, which allows shifts smaller than the sample spacing, then compute the squared difference between the time-shifted waveform and the reference waveform and keep the shift that minimizes it. This can be useful when you have n shifted waveforms whose shifts are multiples of a common step, like n equally spaced receivers recording the same incoming wave. You can also correct for dispersion by replacing the static time shift with a function of frequency.
The code goes like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft, ifft, fftshift, fftfreq
from scipy import signal
# generating a test signal
dt = 0.01
t0 = 0.025
n = 512
freq = fftfreq(n, dt)
time = np.linspace(-n * dt / 2, n * dt / 2, n)
y = signal.gausspulse(time, fc=10, bw=0.3) + np.random.normal(0, 1, n) / 100
Y = fft(y)
# time-shift of 0.235; could be a dispersion curve, so y2 would be dispersive
Y2 = Y * np.exp(-1j * 2 * np.pi * freq * 0.235)
y2 = ifft(Y2).real
# scan possible time-shifts
error = []
timeshifts = np.arange(-100, 100) * dt / 2 # could be dispersion curves instead
for ts in timeshifts:
    Y2_shifted = Y2 * np.exp(1j * 2 * np.pi * freq * ts)
    y2_shifted = ifft(Y2_shifted).real
    error.append(np.sum((y2_shifted - y) ** 2))
# show the results
ts_final = timeshifts[np.argmin(error)]
print(ts_final)
Y2_shifted = Y2 * np.exp(1j * 2 * np.pi * freq * ts_final)
y2_shifted = ifft(Y2_shifted).real
plt.subplot(221)
plt.plot(time, y, label="y")
plt.plot(time, y2, label="y2")
plt.xlabel("time")
plt.legend()
plt.subplot(223)
plt.plot(time, y, label="y")
plt.plot(time, y2_shifted, label="y_shifted")
plt.xlabel("time")
plt.legend()
plt.subplot(122)
plt.plot(timeshifts, error, label="error")
plt.xlabel("timeshifts")
plt.legend()
plt.show()
If one is a time-shifted copy of the other, you will see a peak in the correlation. Since computing the correlation directly is expensive (O(n^2) for signals of length n), it is better to use the FFT. So, something like this should work:
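import numpy as np
af = np.fft.fft(a)
bf = np.fft.fft(b)
c = np.fft.ifft(af * np.conj(bf))  # circular cross-correlation of a and b
time_shift = np.argmax(np.abs(c))  # peak index = circular shift in samples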
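If wrap-around is not wanted, one option is to let SciPy do the zero padding: signal.correlate accepts a method argument, and method="fft" computes the linear correlation through FFTs. The test data below (a random signal delayed by 37 samples) is made up for the illustration.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(2)
a = rng.standard_normal(2000)
b = np.concatenate((np.zeros(37), a[:-37]))   # b is a delayed by 37 samples

# Same linear correlation, computed through FFTs and directly.
c_fft = signal.correlate(b, a, mode="full", method="fft")
c_direct = signal.correlate(b, a, mode="full", method="direct")

# Index len(a) - 1 is zero lag in 'full' mode.
lag = int(np.argmax(c_fft)) - (len(a) - 1)
```

Both methods should agree to floating-point precision; the FFT route mainly pays off as the signals get longer.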
It depends on the kind of signal you have (periodic?…), on whether both signals have the same amplitude, and on what precision you are looking for.
The correlation function mentioned by highBandWidth might indeed work for you. It is simple enough that you should give it a try.
Another, more precise option is the one I use for high-precision spectral line fitting: you model your "master" signal with a spline and fit the time-shifted signal with it (while possibly scaling the signal, if need be). This yields very precise time shifts. One advantage of this approach is that you do not have to study the correlation function. You can for instance create the spline easily with interpolate.UnivariateSpline() (from SciPy). SciPy returns a function, which is then easily fitted with optimize.leastsq().
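A rough sketch of the spline approach, under several assumptions that are not in the answer: the helper name fit_shift, the choice s=0 (an interpolating spline), a Gaussian test pulse, and a starting guess of zero shift and unit scale. It may need a better initial guess or edge handling for real data, since the spline is extrapolated outside the sampled range.

```python
import numpy as np
from scipy import interpolate, optimize

def fit_shift(t, master, shifted, shift0=0.0):
    """Fit shifted(t) ~ scale * master(t - shift); returns (shift, scale).

    Hypothetical helper; s=0 makes the spline pass through every sample.
    """
    spline = interpolate.UnivariateSpline(t, master, s=0)

    def residuals(params):
        shift, scale = params
        return scale * spline(t - shift) - shifted

    (shift, scale), _ = optimize.leastsq(residuals, x0=[shift0, 1.0])
    return shift, scale

# Made-up test data: a Gaussian pulse, delayed by 0.3 and scaled by 0.8.
t = np.linspace(-5.0, 5.0, 201)
master = np.exp(-t ** 2)
shifted = 0.8 * np.exp(-(t - 0.3) ** 2)
shift, scale = fit_shift(t, master, shifted)
```

Because the fit works on a continuous model of the master signal, the recovered shift is not restricted to multiples of the sample spacing.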
Here's another option:
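import numpy as np
from scipy import signal

def get_max_correlation(original, match):
    # cross-correlation = convolution with the time-reversed match
    z = signal.fftconvolve(original, match[::-1])
    # index match.size - 1 corresponds to zero lag
    lags = np.arange(z.size) - (match.size - 1)
    return lags[np.argmax(np.abs(z))]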