lm() called within mutate()

遥遥无期 2021-01-19 02:49

I wonder if it is possible to use lm() within mutate() from the dplyr package. Currently I have a data frame with the columns "date", "company", "return" and "market.ret", reproducible as

2 Answers
  • 2021-01-19 03:35

    You seem to want to calculate a daily market return across all companies, and then regress return on market return for each company, across all days. If so, here's a solution using data.table, which is likely to be faster on very large datasets.

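    library(data.table) ## 1.9.2+
    # daily market return: mean return across companies on each date
    setDT(x)[ , market.ret := mean(return), by = date]
    # per-company slope of return on market.ret, added as a new column
    x[, beta := coef(lm(return ~ market.ret, data = .SD))[[2]], by = company]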
    

    where x is as shown below (using set.seed for reproducibility):

    set.seed(1L)     # for reproducible example
    n.dates <- 60
    n.stocks <- 2
    date <- seq(as.Date("2011-07-01"), by = 1, length.out = n.dates)
    symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
    x <- expand.grid(date, symbol)
    x$return <- rnorm(n.dates*n.stocks, 0, sd = 0.05)
    names(x) <- c("date", "company", "return")
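
    As a cross-check that does not depend on data.table, the same two steps can be sketched in base R. This is only an illustration: it swaps in ave() and split() for the grouped := calls, and beta_by_company and beta_closed_form are names made up here. It also compares the lm() slope with cov(return, market.ret) / var(market.ret), which is what a one-regressor least-squares slope reduces to.

    ```r
    set.seed(1L)     # same toy data as above, rebuilt so the block is self-contained
    n.dates  <- 60
    n.stocks <- 2
    date   <- seq(as.Date("2011-07-01"), by = 1, length.out = n.dates)
    symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
    x <- expand.grid(date = date, company = symbol)
    x$return <- rnorm(n.dates * n.stocks, 0, sd = 0.05)

    # Step 1: daily market return = mean return across companies on each date
    x$market.ret <- ave(x$return, x$date, FUN = mean)

    # Step 2: slope of return ~ market.ret, fitted separately for each company
    beta_by_company <- sapply(split(x, x$company), function(d) {
      coef(lm(return ~ market.ret, data = d))[[2]]
    })

    # The same slope in closed form: cov(y, x) / var(x)
    beta_closed_form <- sapply(split(x, x$company), function(d) {
      cov(d$return, d$market.ret) / var(d$market.ret)
    })

    # Attach each company's beta to its rows, mirroring the beta column above
    x$beta <- unname(beta_by_company[as.character(x$company)])
    ```

    With only two companies and an equally weighted market return, the two betas add up to 2, which is a handy sanity check on the output.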
    
  • 2021-01-19 03:45

    This seems to work for me:

    library(dplyr)
    # assumes x already has a market.ret column, as in the question
    group_by(x, company) %>%
        do(data.frame(beta = coef(lm(return ~ market.ret, data = .))[2])) %>%
        left_join(x, ., by = "company")
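
    Since the question asks specifically about mutate(), here is a rough sketch of calling lm() inside a grouped mutate() directly. It relies on lm() finding the columns through the formula's environment, which I believe holds for current dplyr versions but is worth checking on yours. The tickers AAA and BBB and the name result are made up for the example.

    ```r
    library(dplyr)

    set.seed(1L)
    x <- expand.grid(
      date    = seq(as.Date("2011-07-01"), by = 1, length.out = 60),
      company = c("AAA", "BBB"),   # hypothetical tickers, not from the question
      stringsAsFactors = FALSE
    )
    x$return <- rnorm(nrow(x), 0, sd = 0.05)

    result <- x %>%
      group_by(date) %>%
      mutate(market.ret = mean(return)) %>%   # daily mean across companies
      group_by(company) %>%                   # replaces the grouping by date
      mutate(beta = coef(lm(return ~ market.ret))[[2]]) %>%  # one slope per company
      ungroup()
    ```

    Each row of result then carries its company's beta, so no join is needed afterwards. The do() version above is the safer choice if this does not behave the same on an older dplyr.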
    