Repeat regression with varying dependent variable

北海茫月 2021-01-22 17:58

I've searched both Stack Overflow and Google for a solution, but found none that solves my problem.

I have about 40 dependent variables, for which I aim to obtain adjusted means (lsmea

3 answers
  • 2021-01-22 18:00

    Your code had a few typos and other small problems, but I think this is what you want:

    # Exemplified here with 2 outcome variables
    outcome1 <- c(2, 4, 6, 8, 10, 12, 14, 16)
    outcome2 <- c(1, 2, 3, 4, 5, 6, 7, 8)
    var1 <- c("a", "a", "a", "a", "b", "b", "b", "b")
    var2 <- c(10, 11, 12, 9, 14, 9, 5, 8)
    var3 <- c(100, 101, 120, 90, 140, 90, 50, 80)
    
    df <- data.frame(outcome1, outcome2, var1, var2, var3)
    
    dependents <- c("outcome1", "outcome2")
    
    library(lsmeans) #install.packages("lsmeans")
    
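    results <- list()
    for (i in seq_along(dependents)) {
      # build the formula as a string, then convert it just before fitting
      eq <- paste(dependents[i], "~ var1 + var2 + var3")
      fit <- lm(as.formula(eq), data = df)
      # named `res` so the variable does not mask the summary() function
      res <- summary(lsmeans(fit, "var1"))
      res$outcome <- i  # index of the outcome in `dependents`
      results[[i]] <- res
    }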
    
  • 2021-01-22 18:06

    Here is another option using lapply.

    dependents <- c('outcome1', 'outcome2')
    lst <- lapply(dependents, function(x) {
             fit <- lm(paste(x,'~', 'var1+var2+var3'), data=df)
             summary(lsmeans(fit, 'var1', data=df))})
    Map(cbind, lst, outcome = seq_along(dependents))
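
    If one long table is more convenient than a list, the per-outcome results can be stacked with do.call(rbind, ...). The sketch below is base R only and swaps lsmeans for predict(): it assumes that, for a model like this one (a single factor plus numeric covariates, no interactions), the adjusted mean of each var1 level is the model prediction with var2 and var3 held at their sample means. The helper adjusted_means() and the column name adj_mean are hypothetical names chosen for illustration.

    ```r
    # Same example data as in the answers above
    df <- data.frame(
      outcome1 = c(2, 4, 6, 8, 10, 12, 14, 16),
      outcome2 = c(1, 2, 3, 4, 5, 6, 7, 8),
      var1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
      var2 = c(10, 11, 12, 9, 14, 9, 5, 8),
      var3 = c(100, 101, 120, 90, 140, 90, 50, 80)
    )
    dependents <- c("outcome1", "outcome2")

    # Reference grid: each level of var1, covariates held at their sample means
    grid <- data.frame(
      var1 = unique(df$var1),
      var2 = mean(df$var2),
      var3 = mean(df$var3)
    )

    # adjusted_means() is a hypothetical helper, not part of any package
    adjusted_means <- function(outcome) {
      fit <- lm(as.formula(paste(outcome, "~ var1 + var2 + var3")), data = df)
      data.frame(
        outcome = outcome,
        var1 = grid$var1,
        adj_mean = unname(predict(fit, newdata = grid))
      )
    }

    # One long data frame: one row per outcome and var1 level
    all_means <- do.call(rbind, lapply(dependents, adjusted_means))
    all_means
    ```

    Unlike summary(lsmeans(...)), this sketch returns point estimates only, without standard errors or confidence limits.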
    
  • 2021-01-22 18:12

    In more recent R code, the lazyeval package provides convenient functions for working with formulas, such as f_lhs() for reading or setting the left-hand side.

    Here's my version of your code:

    #load libs
    library(tidyverse)
    library(lazyeval)
    library(lsmeans)
    
    #make data
    df = tibble(
      y1 = c(2, 4, 6, 8, 10, 12, 14, 16),
      y2 = c(1, 2, 3, 4, 5, 6, 7, 8),
      var1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
      var2 = c(10, 11, 12, 9, 14, 9, 5, 8),
      var3 = c(100, 101, 120, 90, 140, 90, 50, 80)
    )
    
    #outcomes
    outcomes = c("y1", "y2")
    
    #fit
    results <- list()
    for (i in seq_along(outcomes)) {
      #make a formula; the left-hand side `i` is only a placeholder
      f = i ~ var1 + var2 + var3
      
      #replace the left-hand side with the outcome, which must be a symbol
      f_lhs(f) = as.symbol(outcomes[i])
      
      #fit
      fit <- lm(f, data = df)
      
      #save; named `res` so the variable does not mask summary()
      res <- summary(lsmeans(fit, "var1"))
      results[[i]] = res
    }
    
    #set outcome names
    names(results) = outcomes
    
    #print results
    results
    

    The last line prints:

    $y1
     var1 lsmean   SE df lower.CL upper.CL
     a       5.5 1.38  4     1.68     9.32
     b      12.5 1.38  4     8.68    16.32
    
    Confidence level used: 0.95 
    
    $y2
     var1 lsmean    SE df lower.CL upper.CL
     a      2.75 0.688  4     0.84     4.66
     b      6.25 0.688  4     4.34     8.16
    
    Confidence level used: 0.95 
    

    In general, it is easier to build the model as a string and convert it to a formula just before fitting; here I manipulated the formula object directly instead.
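
    As a minimal sketch of the string-based route, base R's reformulate() (in the stats package) builds a formula from character vectors, so no extra package is needed. The data are the same as above, stored in a plain data.frame; the names predictors, formulas and fits are illustrative only.

    ```r
    # Same example data as above, as a base data.frame
    df <- data.frame(
      y1 = c(2, 4, 6, 8, 10, 12, 14, 16),
      y2 = c(1, 2, 3, 4, 5, 6, 7, 8),
      var1 = c("a", "a", "a", "a", "b", "b", "b", "b"),
      var2 = c(10, 11, 12, 9, 14, 9, 5, 8),
      var3 = c(100, 101, 120, 90, 140, 90, 50, 80)
    )

    outcomes <- c("y1", "y2")
    predictors <- c("var1", "var2", "var3")

    # Build one formula per outcome from plain strings
    formulas <- lapply(outcomes, function(y) reformulate(predictors, response = y))
    names(formulas) <- outcomes

    # Fit one model per formula; the list keeps the outcome names
    fits <- lapply(formulas, function(f) lm(f, data = df))

    formulas$y1
    ```

    Each element of fits can then be passed to lsmeans(fit, "var1") exactly as in the loop above.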
