R: extract regression coefficients from multiple regressions via lapply

天命终不由人 2021-01-03 13:01

I have a large dataset with several variables, one of which is a state variable, coded 1-50 for each state. I'd like to run a regression of 28 variables on the remaining 2

2 Answers
  • 2021-01-03 13:41

    This is another example of the classic Split-Apply-Combine problem, which can be addressed using the plyr package by @hadley. In your problem, you want to

    1. Split data frame by state
    2. Apply regressions for each subset
    3. Combine coefficients into data frame.

    I will illustrate it with the Cars93 dataset from the MASS package. We want to estimate the relationship between horsepower and engine size separately for each country of origin.

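    # Load libraries (MASS provides Cars93, plyr provides dlply/ldply)
    library(MASS); library(plyr)
    
    # Split Cars93 by Origin and fit the same lm() to each piece
    regressions <- dlply(Cars93, .(Origin), lm, formula = Horsepower ~ EngineSize)
    # Combine the coefficients into one data frame, one row per Origin
    coefs <- ldply(regressions, coef)
    coefs
    
    #    Origin (Intercept) EngineSize
    # 1     USA    33.13666   37.29919
    # 2 non-USA    15.68747   55.39211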
    

    EDIT: For your example, substitute PUF for Cars93, state for Origin, and fm for the formula.
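
    If you would rather not depend on plyr, the same three steps can be sketched in base R. This is only an illustration under my own naming: by_origin, fits and coefs_base are hypothetical names, not anything from the question.

    ```r
    library(MASS)  # provides the Cars93 data set

    # Split: one data frame per level of Origin
    by_origin <- split(Cars93, Cars93$Origin)

    # Apply: fit the same model to each subset
    fits <- lapply(by_origin, function(d) lm(Horsepower ~ EngineSize, data = d))

    # Combine: stack the coefficient vectors into a matrix, one row per group
    coef_mat <- do.call(rbind, lapply(fits, coef))

    # Turn the row names into an explicit Origin column
    coefs_base <- data.frame(Origin = rownames(coef_mat), coef_mat,
                             check.names = FALSE)
    rownames(coefs_base) <- NULL
    print(coefs_base)
    ```

    For the data in the question, the analogous call would split PUF on PUF$state and pass fm to lm().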

  • 2021-01-03 13:43

    I've cleaned up your code slightly:

    fm <- z ~ class1+class2+class3+class4+class5+class6+class7+
              xtot+e00200+e00300+e00600+e00900+e01000+p04470+e04800+
              e09600+e07180+e07220+e07260+e06500+e10300+
              e59720+e11900+e18425+e18450+e18500+e19700
    
    PUFsplit <- split(PUF, PUF$state)
    # 'd' is one state's subset (named 'd' so it is not confused with the response z in fm)
    mod <- lapply(PUFsplit, function(d) lm(fm, data=d))
    
    Beta <- sapply(mod, coef)
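
    To see what shape Beta comes back in, here is a small sketch on a built-in data set. Since PUF is not available here, mtcars stands in for PUF and cyl for state; fm_demo, mod_demo and Beta_demo are hypothetical names used only for this illustration. As long as every fit returns the same set of coefficients, sapply() should give a matrix with coefficients in rows and groups in columns.

    ```r
    # Stand-ins (assumptions): mtcars plays the role of PUF, cyl the role of state
    fm_demo <- mpg ~ wt + hp

    # One lm() fit per group
    mod_demo <- lapply(split(mtcars, mtcars$cyl),
                       function(d) lm(fm_demo, data = d))

    # Coefficients in rows, groups in columns
    Beta_demo <- sapply(mod_demo, coef)
    print(dim(Beta_demo))

    # Transpose to get one row per group instead
    print(t(Beta_demo))
    ```

    Transposing is handy if you want one row per state, for example before converting the result with as.data.frame().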
    

    If you wanted, you could even put this all in one line:

    Beta <- sapply(lapply(split(PUF, PUF$state), function(d) lm(fm, data=d)), coef)
    