How to run a linear regression model for each industry-year, excluding firm i's observations, in R?

Submitted by 白昼怎懂夜的黑 on 2020-04-30 06:57:07

Question


Here is the dput output of my dataset in R:

data1<-structure(list(Year = c(1998, 1999, 1999, 2000, 1996, 2001, 1998, 
1999, 2002, 1998, 2005, 1998, 1999, 1998, 1997, 1998, 2000), 
    `Firm name` = c("A", "A", "B", "B", "C", "C", "D", "D", "D", 
    "E", "E", "F", "F", "G", "G", "H", "H"), Industry = c("AUTO", 
    "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", "AUTO", 
    "Pharma", "Pharma", "Pharma", "Pharma", "Pharma", "Pharma", 
    "Pharma", "Pharma"), X = c(1, 2, 5, 6, 7, 9, 10, 11, 12, 
    13, 15, 16, 17, 18, 19, 20, 21), Y = c(30, 31, 34, 35, 36, 
    38, 39, 40, 41, 42, 44, 45, 46, 47, 48, 49, 50), Z = c(23, 
    29, 47, 53, 59, 71, 77, 83, 89, 95, 107, 113, 119, 125, 131, 
    137, 143)), row.names = c(NA, -17L), class = c("tbl_df", 
"tbl", "data.frame"), na.action = structure(c(`1` = 1L), class = "omit"))

I am trying to regress Y ~ X + Z for each industry-year, excluding firm i's own observations. For each firm, I want to estimate the linear regression model using all observations of its industry peer firms but none of the firm's own observations. For example, for firm A, I want to regress Y ~ X + Z using all observations of its industry peers (B, C and D) across time, excluding firm A's observations. Similarly, for firm B I want to use all observations of firms A, C and D (which are in the same industry as B) across time, excluding firm B's observations, and the same goes for firms C and D. I want to do this for every firm within each industry. Please help.


Answer 1:


As mentioned by @bonedi, you can use a nested loop to accomplish this. If you want to create models for individual industry-year combinations, you will need to subset your data by Industry and Year. Within each subset, you can loop over Firm name and exclude that firm before fitting the model. The results can be stored in a list whose elements are named industry-year-firm. It's not a pretty solution, but it should get you closer.

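# Fitted models, keyed by "industry-year-firm"
lst <- list()

for (ind in unique(data1$Industry)) {
  # Years observed within this industry
  for (year in unique(data1[data1$Industry == ind, ]$Year)) {
    # Firms observed in this industry-year
    for (firm in unique(data1[data1$Industry == ind & data1$Year == year, ]$`Firm name`)) {
      # Peer observations: same industry and year, excluding the firm itself
      sub_data <- data1[data1$Industry == ind & data1$Year == year & data1$`Firm name` != firm, ]
      # Skip firms that have no peers in this industry-year
      if (nrow(sub_data) > 0) {
        name <- paste(ind, year, firm, sep = '-')
        lst[[name]] <- lm(Y ~ X + Z, data = sub_data)
      }
    }
  }
}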
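
The loop above fits one model per industry-year, while the question's wording ("across time") suggests pooling all years of a firm's industry peers. The following is a minimal sketch of that pooled variant, not a definitive implementation. It assumes the column names from the posted data, rebuilds the data as a plain data.frame so the block runs on its own, and uses a hypothetical helper name, fit_excluding.

```r
# Rebuild the posted data as a plain data.frame so the sketch is self-contained.
data1 <- data.frame(
  Year = c(1998, 1999, 1999, 2000, 1996, 2001, 1998, 1999, 2002,
           1998, 2005, 1998, 1999, 1998, 1997, 1998, 2000),
  `Firm name` = c("A", "A", "B", "B", "C", "C", "D", "D", "D",
                  "E", "E", "F", "F", "G", "G", "H", "H"),
  Industry = rep(c("AUTO", "Pharma"), times = c(9, 8)),
  X = c(1, 2, 5, 6, 7, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21),
  Y = c(30, 31, 34, 35, 36, 38, 39, 40, 41, 42, 44, 45, 46, 47, 48, 49, 50),
  Z = c(23, 29, 47, 53, 59, 71, 77, 83, 89, 95, 107, 113, 119, 125, 131,
        137, 143),
  check.names = FALSE  # keep the space in the `Firm name` column
)

# Hypothetical helper: fit Y ~ X + Z on every observation of the firm's
# industry peers (all years pooled), leaving out the firm's own rows.
fit_excluding <- function(firm, data) {
  ind <- unique(data$Industry[data$`Firm name` == firm])
  peers <- data[data$Industry %in% ind & data$`Firm name` != firm, ]
  lm(Y ~ X + Z, data = peers)
}

# One model per firm, stored in a list named by firm.
firms <- unique(data1$`Firm name`)
models <- lapply(firms, fit_excluding, data = data1)
names(models) <- firms
```

For example, coef(models[["A"]]) returns the coefficients estimated from the observations of B, C and D only. Note that in the posted sample Z is an exact linear function of X (Z = 6X + 17), so lm() should report NA for one of the two slopes; that is a property of the toy data rather than of the approach.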



Answer 2:


The code as displayed isn't easy to read, but from what you describe I'd recommend a nested loop, e.g.:

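for (y in unique(data1$Year)) {
  for (comp in unique(data1$`Firm name`[data1$Year == y])) {
    # Industry that comp belongs to
    ind <- data1$Industry[data1$`Firm name` == comp][1]
    # Keep only companies in this industry and year, but exclude comp
    peers <- data1[data1$Year == y & data1$Industry == ind & data1$`Firm name` != comp, ]
    # Fit only when at least one peer observation remains
    if (nrow(peers) > 0) print(coef(lm(Y ~ X + Z, data = peers)))
  }
}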
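
Both answers split the data by year, which can leave very few peer observations per fit. As a rough check before running the regressions, the sketch below counts how many peer rows each firm has in its industry-year cell. It is only an illustration under stated assumptions: the identifying columns are copied from the posted data, and the names ids, cell_size, own_rows, n_peers and estimable are made up for this example.

```r
# Only the identifying columns of the posted data are needed for counting.
ids <- data.frame(
  Year = c(1998, 1999, 1999, 2000, 1996, 2001, 1998, 1999, 2002,
           1998, 2005, 1998, 1999, 1998, 1997, 1998, 2000),
  `Firm name` = c("A", "A", "B", "B", "C", "C", "D", "D", "D",
                  "E", "E", "F", "F", "G", "G", "H", "H"),
  Industry = rep(c("AUTO", "Pharma"), times = c(9, 8)),
  check.names = FALSE  # keep the space in the `Firm name` column
)

row_id <- seq_len(nrow(ids))

# Number of rows in each row's industry-year cell.
cell_size <- ave(row_id, ids$Industry, ids$Year, FUN = length)

# Number of rows the firm itself contributes to that cell.
own_rows <- ave(row_id, ids$Industry, ids$Year, ids$`Firm name`, FUN = length)

# Peer rows left after excluding the firm's own observations.
ids$n_peers <- cell_size - own_rows

# Y ~ X + Z estimates three coefficients (intercept, X, Z), so fewer than
# three peer rows cannot identify the model.
ids$estimable <- ids$n_peers >= 3
```

On the posted sample, no firm has more than three peer rows in any industry-year cell, so the industry-year fits are at best exactly determined, with no residual degrees of freedom. Pooling across years, as the question describes, leaves more observations per regression.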


Source: https://stackoverflow.com/questions/61380812/how-to-run-linear-regression-model-for-each-industry-year-excluding-firm-i-obser
