Matlab: Aligning data by using cross covariance

后端 未结 3 1691
再見小時候
再見小時候 2021-01-26 06:07

I want to get the offset in samples between two datasets in Matlab (getting them synced in time), a quite common issue. Therefore I use the cross correlation function xcorr or t

相关标签:
3条回答
  • 2021-01-26 06:42

    Your data has a minor peak around 5 and a major peak around 101.

    If I knew something about my data then I could might window around an acceptable range of offsets as shown below.

    Code for initial exploration:

    figure; clc;
    subplot(2,1,1) 
    plot(1:numel(b), b);
    hold on
    plot(1:numel(c), c, 'r');
    legend('b','c')
    
    subplot(2,1,2)
    plot(crossCorr,'.b-')
    hold on
    plot(peakIndex,crossCorr(peakIndex),'or')
    legend('crossCorr','peak')
    

    Initial Image:

    enter image description here

    If you zoom into the first peak you can see that it is not only high around 5, but it is polynomial "enough" to allow sub-element offsets. That is convenient.

    Image showing :

    enter image description here

    Here is what the curve-fitting tool gives as the analytic for a cubic:

    Linear model Poly3:
         f(x) = p1*x^3 + p2*x^2 + p3*x + p4
    Coefficients (with 95% confidence bounds):
           p1 =  8.515e-013  (8.214e-013, 8.816e-013)
           p2 = -3.319e-011  (-3.369e-011, -3.269e-011)
           p3 =  2.253e-010  (2.229e-010, 2.277e-010)
           p4 = -4.226e-012  (-7.47e-012, -9.82e-013)
    
    Goodness of fit:
      SSE: 2.799e-024
      R-square: 1
      Adjusted R-square: 1
      RMSE: 6.831e-013
    

    You can note that the SSE fits to roundoff. If you compute the root (near n=4) you use the following matlab code:

    % Coefficients
           p1 =  8.515e-013
           p2 = -3.319e-011
           p3 =  2.253e-010
           p4 = -4.226e-012
    % Linear model Poly3:
    syms('x')
    f = p1*x^3 + p2*x^2 + p3*x + p4
    
    xz1=fzero(@(y) subs(diff(f),'x',y), 4)
    

    and you get the analytic root at 4.01420240431444.

    EDIT: Hmmm. How about fitting a gaussian mixture model to the convolution? You sweep through a good range of component count, you do between 10 and 30 repeats, and you find which component count has the best/lowest BIC. So you fit a gmdistribution to the lower subplot of the first figure, then test the covariance at the means of the components in decreasing order.

    I would try the offset at the means, and just look at sum squared error. I would then pick the offset that has the lowest error.

    Procedure:

    • compute cross correlation
    • fit cross correlation to Gaussian Mixture model
      • sweep a reasonable range of components (start with 1-10)
      • use a reasonable number of repeats (10 to 30 depending on run-to-run variation)
      • compute Bayes Information Criterion (BIC) for each level, pick the lowest because it indicates a reasonable balance of error and parameter count
    • each component is going to have a mean, evaluate that mean as a candidate offset and compute sum-squared error (sse) when you offset like that.
    • pick the offset of the component that gives best SSE

    Let me know how well that works.

    0 讨论(0)
  • 2021-01-26 06:47

    If the two signals misalign by non-integer number of samples, e.g. 3.7 samples, then the xcorr method may find the max value at 4 samples, it won't be able to find the accurate time shift. In this case, you should try a method called "unified change detection". The web-link for the paper is: [http://www.phmsociety.org/node/1404/]

    Good Luck.

    0 讨论(0)
  • 2021-01-26 06:51

    There's nothing wrong with your method in principle, I used exactly the same approach successfully for temporally aligning different audio recordings of the same signal.

    However, it appears that for your time series, correlation (or covariance) is simply not the right measure to compare shifted versions – possibly because they contain components of a time scale comparable to the total length. An alternative is to use residual variance, i.e. the variance of the difference between shifted versions. Here is a (not particularly elegant) implementation of this idea:

    lags = -1000 : 1000;
    v = nan(size(lags));
    for i = 1 : numel(lags)
        lag = lags(i);
        if lag >= 0
            v(i) = var(b(1 + lag : end) - c(1 : end - lag));
        else
            v(i) = var(b(1 : end + lag) - c(1 - lag : end));
        end
    end
    [~, ind] = min(v);
    minlag = lags(ind);
    

    For your (longer) data set, this results in minlag = 169. Plotting residual variance over lags gives:

    lag-vs-var

    0 讨论(0)
提交回复
热议问题