Find top deciles from dataframe by group

花落未央 2021-01-23 07:59

I am attempting to create new variables using a function and lapply rather than working right in the data with loops. I used to use Stata and would have solved this

3 Answers
  •  借酒劲吻你
    2021-01-23 08:19

    The idiomatic way to do this kind of thing in R would be to use a combination of split and lapply. You're halfway there with your use of lapply; you just need to use split as well.

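    # Split the data by group (v1), then process each group's data frame.
    lapply(split(data, data$v1), function(df) {
        # 80th and 90th percentile cutoffs of v2 within this group
        cutoff <- quantile(df$v2, c(0.8, 0.9))
        # 10 = top decile, 20 = second decile, NA = everything else
        top_pct <- ifelse(df$v2 > cutoff[2], 10, ifelse(df$v2 > cutoff[1], 20, NA))
        # Keep only the customers that fall in the top two deciles
        na.omit(data.frame(id = df$custID, top_pct))
    })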
    

    Quantiles are computed with quantile(); in the code above it returns the 80th and 90th percentile cutoffs of v2 for each group.
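
    To see what this produces, here is a small sketch on made-up data. The values are hypothetical and only the column names (custID, v1, v2) come from the code above; the final do.call(rbind, ...) step is an optional extra of mine that stacks the per-group results into one data frame.

    ```r
    # Hypothetical sample data: 20 customers in two groups of 10.
    data <- data.frame(
        custID = 1:20,
        v1 = rep(c("a", "b"), each = 10),
        v2 = c(1:10, seq(10, 100, by = 10))
    )

    # Cutoffs for group "a" alone (default quantile type, i.e. type 7).
    quantile(data$v2[data$v1 == "a"], c(0.8, 0.9))

    # Same split/lapply approach as in the answer.
    top <- lapply(split(data, data$v1), function(df) {
        cutoff <- quantile(df$v2, c(0.8, 0.9))
        top_pct <- ifelse(df$v2 > cutoff[2], 10, ifelse(df$v2 > cutoff[1], 20, NA))
        na.omit(data.frame(id = df$custID, top_pct))
    })

    # Optional: combine the list of per-group data frames into one.
    result <- do.call(rbind, top)
    print(result)
    ```

    With this data, customers 9 and 10 are flagged in group a and customers 19 and 20 in group b. Note that the comparison is a strict >, so a value exactly equal to a cutoff falls into the lower bucket.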
