I am trying to do a bootstrapped correlation in R. I have two variables Var1 and Var2 and I want to get the bootstrapped p.value of the Pearson correlation.
If you want to bootstrap your correlation test, your bootstrap statistic function only needs to return the correlation coefficient. Bootstrapping the p-value of the correlation test is not appropriate here, because the p-value alone discards the direction (sign) of the correlation.
Check this question on CrossValidated for some nice answers on performing bootstrap hypothesis tests: https://stats.stackexchange.com/questions/20701/computing-p-value-using-bootstrap-with-r
library("boot")
# read.csv() already returns a data frame; name its two columns x and y
data <- read.csv("~/Documents/stack/tmp.csv", header = FALSE)
colnames(data) <- c("x", "y")
set.seed(1)
b3 <- boot(data,
  statistic = function(data, i) {
    cor(data[i, "x"], data[i, "y"], method='pearson')
  },
  R = 1000
)
b3
#>
#> ORDINARY NONPARAMETRIC BOOTSTRAP
#>
#>
#> Call:
#> boot(data = data, statistic = function(data, i) {
#>     cor(data[i, "x"], data[i, "y"], method = "pearson")
#> }, R = 1000)
#>
#>
#> Bootstrap Statistics :
#>      original        bias    std. error
#> t1* 0.1279691 -0.0004316781    0.314056
boot.ci(b3, type = c("norm", "basic", "perc", "bca")) #bootstrapped CI.
#> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
#> Based on 1000 bootstrap replicates
#>
#> CALL :
#> boot.ci(boot.out = b3, type = c("norm", "basic", "perc", "bca"))
#>
#> Intervals :
#> Level      Normal              Basic
#> 95%   (-0.4871,  0.7439 )   (-0.4216,  0.7784 )
#>
#> Level     Percentile            BCa
#> 95%   (-0.5225,  0.6775 )   (-0.5559,  0.6484 )
#> Calculations and Intervals on Original Scale
plot(density(b3$t))
abline(v = 0, lty = "dashed", col = "grey60")
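In this case, even without a p-value, it's quite safe to say there is no evidence of a non-zero correlation: all four 95% intervals include zero, and the density plot shows zero sitting well inside the bulk of the bootstrap distribution.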
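If a bootstrap p-value is still wanted, one simple option is to shift the bootstrap replicates so they are centred on the null value of zero and count how often a shifted replicate is at least as extreme as the observed correlation. The sketch below is only an illustration of that idea, not necessarily the method recommended in the linked CrossValidated thread. It uses simulated data, and the names `sim`, `cor_stat`, `null_t` and `p_two_sided` are made up for this example. It also assumes the shape of the bootstrap distribution is a reasonable stand-in for the distribution under the null.

```r
library(boot)

# Simulated stand-in data (hypothetical; swap in your own x and y columns)
set.seed(1)
sim <- data.frame(x = rnorm(20))
sim$y <- 0.1 * sim$x + rnorm(20)

# Statistic: Pearson correlation of the resampled rows
cor_stat <- function(d, i) cor(d[i, "x"], d[i, "y"], method = "pearson")

b <- boot(sim, statistic = cor_stat, R = 2000)

# Centre the replicates on the null value (0), then compare them with the
# observed correlation b$t0. The +1 terms keep the estimate from being exactly 0.
null_t <- b$t[, 1] - mean(b$t[, 1])
p_two_sided <- (sum(abs(null_t) >= abs(b$t0)) + 1) / (b$R + 1)
p_two_sided
```

Comparing absolute values makes this a two-sided test. For a one-sided test you would keep the sign and compare `null_t >= b$t0` (or `<=`) instead, which is exactly the directional information a bootstrapped p-value from `cor.test` would throw away.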