Novice needs to loop lm in R

Submitted by 会有一股神秘感。 on 2019-12-19 10:44:00

Question


I'm a PhD student in genetics and I am trying to do an association analysis of some genetic data using linear regression. In the table below I'm regressing each 'trait' against each 'SNP'. There is also an interaction term, included as 'var'.

I've only used R for 2 weeks and I don't have any programming background so please explain any help provided as I want to understand.

This is a sample of my data:

Sample ID   var trait 1 trait 2 trait 3 SNP1    SNP2    SNP3
77856517    2   188      3       2        1      0       0
375689755   8   17      -1      -1        1     -1      -1
392513415   8   28       14      4        1      1       1
393612038   8   85       14      6        1      1       0
401623551   8   152      11     -1        1      0       0
348466144   7   -74      11      6        1      0       0
77852806    4   81       16      6        1      1       0
440614343   8   -93      8       0        0      1       0
77853193    5   3        6       5        1      1       1

and this is the code I've been using for a single regression:

result1 <-lm(trait1~SNP1+var+SNP1*var, na.action=na.exclude)

I want to run a loop where every trait is tested against each SNP.

I've been trying to modify codes I've found online but I always run into some error that I don't understand how to solve.

Thank you for any and all help.


Answer 1:


Personally, I don't find this problem so easy, especially for an R novice.

Here is a solution based on building the regression formula dynamically. The idea is to use the paste function to assemble the formula terms, y ~ x + var + x*var, and then coerce the resulting string to a formula using as.formula. Here y and x are the dynamic terms of the formula: y in c(trait1, trait2, ...) and x in c(SNP1, SNP2, ...). I use lapply to loop; note that each pass pairs trait i with SNP i.

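# dat is the data frame holding the columns trait1..trait3, SNP1..SNP3 and var
lapply(1:3, function(i) {
  y <- paste0('trait', i)                   # response name, e.g. "trait1"
  x <- paste0('SNP', i)                     # predictor name, e.g. "SNP1"
  factor1 <- x
  factor2 <- 'var'
  factor3 <- paste(x, 'var', sep = '*')     # interaction term, e.g. "SNP1*var"
  listfactor <- c(factor1, factor2, factor3)
  # Join the terms with "+" and turn the string into a formula object
  form <- as.formula(paste(y, "~", paste(listfactor, collapse = "+")))
  lm(formula = form, data = dat)
})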
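
To try the loop above, dat has to exist as a data frame. Here is a minimal sketch, assuming the sample from the question is typed in by hand and the trait columns are named trait1 to trait3 without spaces (the loop implies these names, but the question does not show how the data were loaded). The name fits and the SampleID column name are my own choices for illustration. Values of -1 are kept exactly as they appear in the sample.

```r
# Sample rows from the question; column names without spaces are an assumption.
dat <- data.frame(
  SampleID = c(77856517, 375689755, 392513415, 393612038, 401623551,
               348466144, 77852806, 440614343, 77853193),
  var    = c(2, 8, 8, 8, 8, 7, 4, 8, 5),
  trait1 = c(188, 17, 28, 85, 152, -74, 81, -93, 3),
  trait2 = c(3, -1, 14, 14, 11, 11, 16, 8, 6),
  trait3 = c(2, -1, 4, 6, -1, 6, 6, 0, 5),
  SNP1   = c(1, 1, 1, 1, 1, 1, 1, 0, 1),
  SNP2   = c(0, -1, 1, 1, 0, 0, 1, 1, 1),
  SNP3   = c(0, -1, 1, 0, 0, 0, 0, 0, 1)
)

# Same idea as the loop above, but the list of models is stored and named.
fits <- lapply(1:3, function(i) {
  form <- as.formula(paste0("trait", i, " ~ SNP", i, " + var + SNP", i, "*var"))
  lm(form, data = dat)
})
names(fits) <- paste0("trait", 1:3, "_SNP", 1:3)

# Coefficient table (estimate, std. error, t value, p value) of the second model.
summary(fits[["trait2_SNP2"]])$coefficients
```

Storing the list means each model can be inspected later by name instead of being printed once and lost. With only nine rows some coefficients may not be estimable, so this is only meant to show the mechanics.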

I hope someone comes up with an easier solution, or a more R-ish one :)

EDIT

Thanks to @DWin's comment, we can simplify the formula to just y ~ x*var, since x*var already expands to x, var and their interaction x:var.
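
This equivalence can be checked directly, because terms() lists the individual terms R expands a formula into. A small sketch using the placeholder names y, x and var from the text (long and short are names I made up; no data are needed):

```r
# Expand both formulas into their terms; "x:var" denotes the interaction alone.
long  <- attr(terms(y ~ x + var + x * var), "term.labels")
short <- attr(terms(y ~ x * var), "term.labels")

print(short)            # the three terms of the short form
identical(long, short)  # TRUE when both formulas describe the same model
```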

So the original loop simplifies to:

lapply(1:3, function(i) {
  y <- paste0('trait', i)
  x <- paste0('SNP', i)
  # Right-hand side of the formula, e.g. "SNP1*var"
  RHS <- paste(x, 'var', sep = '*')
  form <- as.formula(paste(y, "~", RHS))
  lm(formula = form, data = dat)
})
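
The loops above pair trait i with SNP i only. If the aim is to test every trait against every SNP, as the question puts it, one possible extension is to enumerate all pairs with expand.grid() first. This is a sketch and not part of the original answer: the helper name fit_pair, the names pairs and p_int, and the small simulated dat are made up for illustration and should be replaced by the real data.

```r
# Simulated stand-in for the real data (hypothetical values, illustration only).
set.seed(1)
n <- 50
dat <- data.frame(
  var    = sample(1:8, n, replace = TRUE),
  trait1 = rnorm(n), trait2 = rnorm(n), trait3 = rnorm(n),
  SNP1   = sample(0:2, n, replace = TRUE),
  SNP2   = sample(0:2, n, replace = TRUE),
  SNP3   = sample(0:2, n, replace = TRUE)
)

# One row per (trait, SNP) combination: 3 traits x 3 SNPs = 9 pairs.
pairs <- expand.grid(
  trait = paste0("trait", 1:3),
  snp   = paste0("SNP", 1:3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fit trait ~ snp * var for one pair of column names.
fit_pair <- function(trait, snp) {
  form <- as.formula(paste(trait, "~", snp, "* var"))
  lm(form, data = dat, na.action = na.exclude)
}

# Map() walks the two columns in parallel and returns a list of lm objects.
fits <- Map(fit_pair, pairs$trait, pairs$snp)
names(fits) <- paste(pairs$trait, pairs$snp, sep = "_")

# p-value of the interaction term (row "SNPk:var") from every model.
p_int <- mapply(function(fit, snp) {
  summary(fit)$coefficients[paste0(snp, ":var"), "Pr(>|t|)"]
}, fits, pairs$snp)
```

Here fits[["trait2_SNP3"]] gives the model for one pair, and p_int is a named numeric vector with one interaction p-value per pair. Looking up the coefficient row by name assumes the interaction could be estimated in every model.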


Source: https://stackoverflow.com/questions/15733977/novice-needs-to-loop-lm-in-r
