Question
I calculated a cross-correlation of two time series using ccf()
in R. I know how to derive the confidence limits as:
ccf1 <- ccf(x=x,y=y,lag.max=5,na.action=na.pass, plot=F)
upperCI <- qnorm((1+0.95)/2)/sqrt(ccf1$n.used)
lowerCI <- -qnorm((1+0.95)/2)/sqrt(ccf1$n.used)
But what I really need is the p-value of the maximum correlation.
ind.max <- which(abs(ccf1$acf[1:11])==max(abs(ccf1$acf[1:11])))
max.cor <- ccf1$acf[ind.max]
lag.opt <- ccf1$lag[ind.max]
How do I calculate this p-value? I have searched high and low but can't find a good answer anywhere.
Answer 1:
Getting the p-value is straightforward.
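Under the null hypothesis that the true correlation is 0, the sample cross-correlation is approximately normally distributed (for a large number of observations) with mean 0 and standard deviation 1/sqrt(ccf1$n.used):
Z ~ N(0, sd = 1/sqrt(ccf1$n.used))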
So for your observed maximum correlation max.cor, its two-sided p-value is the probability Pr(|Z| > |max.cor|), which can be computed by:
2 * (1 - pnorm(abs(max.cor), mean = 0, sd = 1/sqrt(ccf1$n.used)))
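As a minimal sketch of the calculation above, the following wraps it in a small helper and applies it to simulated data. The function name `ccf_pvalue` and the simulated series are hypothetical, introduced only for illustration. It uses `lower.tail = FALSE`, which is equivalent to `1 - pnorm(...)` but numerically more stable for very small p-values, and it relies on the same large-sample normal approximation for a single lag.

```r
# Hypothetical helper: two-sided p-value for one cross-correlation value,
# using the large-sample approximation r ~ N(0, sd = 1/sqrt(n)).
ccf_pvalue <- function(r, n) {
  2 * pnorm(abs(r), mean = 0, sd = 1 / sqrt(n), lower.tail = FALSE)
}

set.seed(1)                      # simulated data, for illustration only
x <- rnorm(200)
y <- rnorm(200)

ccf1 <- ccf(x = x, y = y, lag.max = 5, na.action = na.pass, plot = FALSE)

ind.max <- which.max(abs(ccf1$acf))   # position of the largest |correlation|
max.cor <- ccf1$acf[ind.max]
lag.opt <- ccf1$lag[ind.max]

p.value <- ccf_pvalue(max.cor, ccf1$n.used)
```

Here `which.max(abs(ccf1$acf))` replaces the `which(... == max(...))` idiom from the question; it returns the first position of the maximum, so it always yields a single index.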
Follow-up
Is it really that simple? The ccf is computing many correlations at once!
Are you saying that ccf is computing correlations at different lags? Well, provided you have a large number of observations N, the standard deviation of the cross-correlation at each lag is the same: 1/sqrt(N). That is why the confidence limits are two horizontal lines.
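The constant standard deviation can be checked empirically with a small simulation. This is a sketch under the assumption that both series are independent white noise (an assumption of this example, not something stated in the answer); the variable names and the choices N = 500 and 1000 replications are arbitrary.

```r
set.seed(42)                     # simulated data, for illustration only
N <- 500                         # length of each series
n.sim <- 1000                    # number of simulated pairs

# Each column holds the 11 cross-correlations (lags -5..5) of one pair
# of independent white-noise series.
sim.ccf <- replicate(n.sim, {
  x <- rnorm(N)
  y <- rnorm(N)
  as.vector(ccf(x, y, lag.max = 5, plot = FALSE)$acf)
})

emp.sd <- apply(sim.ccf, 1, sd)  # empirical standard deviation at each lag
theo.sd <- 1 / sqrt(N)           # large-sample value used for the limits

# Compare the empirical and theoretical values lag by lag
round(rbind(lag = -5:5, empirical = emp.sd, theoretical = theo.sd), 4)
```

Under these assumptions the empirical values should sit close to 1/sqrt(N) at every lag, which is why a single pair of horizontal lines serves for all lags.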
Source: https://stackoverflow.com/questions/38173544/how-to-calculate-p-values-from-cross-correlation-function-in-r