Resampling not producing expected result of principal component analysis

前提是你, posted on 2019-12-06 07:42:06

Given that the sign of an eigenvector is not defined (you can flip the configuration and get the same result), it doesn't make sense to form a confidence interval on the signed value of the loading.

Instead compute the confidence interval on the absolute value of the loading, not the signed value.

Think about what happens to your interval when the loading for, say, Sepal.Length flips from ~ -0.3 to ~ +0.3. In absolute size the loading is similar in both cases. The signed value, however, can be expected to average out to roughly 0, because you are averaging a lot of ~ -0.3s and ~ +0.3s.
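To see that the sign really is arbitrary, here is a minimal sketch, assuming the loadings come from prcomp() on iris[1:4] as in the question. The names S, v and lambda are mine, chosen for illustration. Both v and -v satisfy the eigenvector equation for the covariance matrix, so either one is an equally valid set of PC1 loadings.

```r
# Illustrative check: an eigenvector and its negation are equally valid.
mydf   <- iris[1:4]
pca    <- prcomp(mydf)
S      <- cov(mydf)             # covariance matrix whose eigenvectors are the loadings
v      <- pca$rotation[, 1]     # loadings on PC1
lambda <- pca$sdev[1]^2         # variance along PC1, i.e. the matching eigenvalue

# S v = lambda v holds for v ...
ok_pos <- all.equal(as.vector(S %*% v),    as.vector(lambda * v))
# ... and just as well for -v
ok_neg <- all.equal(as.vector(S %*% (-v)), as.vector(lambda * (-v)))

ok_pos
ok_neg
```

Nothing in the decomposition prefers one sign over the other, so which one prcomp() returns for a given resample is not something to build an interval on.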

To visualise why your original attempt failed, run:

set.seed(1)
mydf <- iris[1:4]
times <- 1000
ll <- vector(mode = "list", length = times)
for (i in seq_len(times)) {
  tempdf  <- mydf[sample(nrow(mydf), replace = TRUE), ]
  ll[[i]] <- prcomp(tempdf)$rotation
}

This is effectively your code, modified to suit my sensibilities. Now extract the loading for Sepal.Length on PC1 and draw a histogram of the values:

hist(sapply(ll, `[`, 1, 1))

which produces a histogram of the signed loadings (figure not reproduced here).
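The sign flipping can also be checked numerically rather than by eye. This is a rough sketch that repeats the loop above so it runs on its own; the names pc1_sl and sign_counts are mine. If the sign of the loading changes between resamples, both -1 and 1 will show up in the table, and the mean of the signed values will sit closer to 0 than the mean of the absolute values.

```r
# Re-run the bootstrap from above and inspect the sign of one loading.
set.seed(1)
mydf  <- iris[1:4]
times <- 1000
ll <- vector(mode = "list", length = times)
for (i in seq_len(times)) {
  tempdf  <- mydf[sample(nrow(mydf), replace = TRUE), ]
  ll[[i]] <- prcomp(tempdf)$rotation
}

# Loading of Sepal.Length (row 1) on PC1 (column 1) in every resample
pc1_sl <- sapply(ll, `[`, 1, 1)

sign_counts <- table(sign(pc1_sl))  # how many resamples gave each sign
sign_counts

mean(pc1_sl)       # signed mean: pulled towards 0 if the sign flips
mean(abs(pc1_sl))  # mean magnitude: unaffected by sign flips
```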

For example, take abs() of the loadings inside the resampling loop:

set.seed(1)
mydf <- iris[1:4]
times <- 1000
ll <- vector(mode = "list", length = times)
for (i in seq_len(times)) {
  tempdf  <- mydf[sample(nrow(mydf), replace = TRUE), ]
  ll[[i]] <- abs(prcomp(tempdf)$rotation) ## NOTE: abs(...)
}

Taking the 2.5% and 97.5% quantiles of each loading across the resamples then gives:

> data.frame(apply(simplify2array(ll), 1:2, quantile, probs = 0.025))
                    PC1         PC2        PC3       PC4
Sepal.Length 0.33066830 0.578558222 0.45955051 0.2252653
Sepal.Width  0.05211013 0.623424084 0.49591685 0.2351746
Petal.Length 0.84823899 0.133137927 0.01226608 0.4607265
Petal.Width  0.34284824 0.007403214 0.44932031 0.6780493

> data.frame(apply(simplify2array(ll), 1:2, quantile, probs = 0.975))
                   PC1       PC2       PC3       PC4
Sepal.Length 0.3891499 0.7443276 0.6690553 0.3898237
Sepal.Width  0.1186205 0.7988607 0.7010495 0.4083784
Petal.Length 0.8653324 0.2153410 0.1450756 0.4933340
Petal.Width  0.3742441 0.1645692 0.6350899 0.8154254
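Both bounds can also be pulled out in one pass. The following is a sketch under the same assumptions as the code above (iris[1:4], 1000 resamples, percentile intervals); the names arr and ci are mine. Passing two probabilities to quantile() makes apply() return a 2 x 4 x 4 array, with the first dimension indexing the lower and upper bound.

```r
# Bootstrap the absolute loadings, then take both quantiles at once.
set.seed(1)
mydf  <- iris[1:4]
times <- 1000
ll <- vector(mode = "list", length = times)
for (i in seq_len(times)) {
  tempdf  <- mydf[sample(nrow(mydf), replace = TRUE), ]
  ll[[i]] <- abs(prcomp(tempdf)$rotation)
}

arr <- simplify2array(ll)   # variables x components x resamples
ci  <- apply(arr, 1:2, quantile, probs = c(0.025, 0.975))

ci["2.5%", , ]    # lower bounds, one per variable and component
ci["97.5%", , ]   # upper bounds
```

Exact numbers can differ from the output shown above depending on the R version, because the default behaviour of sample() changed in R 3.6.0.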