Subset values with matching criteria in R

Submitted by 爷,独闯天下 on 2020-01-25 10:13:38

Question


I had a similar question here but this one is slightly different.

I would like to return the values in one column that correspond to cut scores in another column. If a cut score is not present in the variable, I would like to take the closest larger value for the first and second cuts, and the closest smaller value for the third cut. Here is a snapshot of the dataset:

ids <- c(1,2,3,4,5,6,7,8,9,10)
scores.a <- c(512,531,541,555,562,565,570,572,573,588)
scores.b <- c(12,13,14,15,16,17,18,19,20,21)
data <- data.frame(ids, scores.a, scores.b)
> data
   ids scores.a scores.b
1    1      512       12
2    2      531       13
3    3      541       14
4    4      555       15
5    5      562       16
6    6      565       17
7    7      570       18
8    8      572       19
9    9      573       20
10  10      588       21

cuts <- c(531, 560, 571)

I would like to grab the scores.b value corresponding to the first cut score (531), which is 13. The second cut score (560) is not in scores.a, so I would like to use the scores.a value 562 (the closest larger value to 560), whose corresponding scores.b value is 16. Lastly, for the third cut score (571), I would like to get 18, which is the scores.b value corresponding to the closest smaller value (570).

Here is what I would like to get.

       scores.b
cut.1  13
cut.2  16
cut.3  18

Any thoughts? Thanks


Answer 1:


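library(dplyr)  # provides %>%, mutate(), group_by(), summarise(), slice()

data %>% 
   # bin scores.a at the cut scores: [512,531), [531,560), [560,571), [571,588]
   mutate(cts = Hmisc::cut2(scores.a, cuts = cuts)) %>% 
   group_by(cts) %>% 
   summarise( mn = min(scores.b),
              mx = max(scores.b)) %>% 
   # keep the two middle bins, then pick both minima and the second maximum
   slice(-c(1,4)) %>% unlist() %>% .[c(3,4,6)] %>% 
   data.frame() %>% 
   magrittr::set_colnames("scores.b") %>% 
   magrittr::set_rownames(c("cut.1", "cut.2", "cut.3"))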

      scores.b
cut.1       13
cut.2       16
cut.3       18
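
The binning logic behind the pipeline above can be sketched in base R, which may make the hard-coded index `c(3,4,6)` easier to follow. This is only an illustration, not the answerer's code: it swaps in `findInterval()` for `Hmisc::cut2()`, and it assumes that scores.b increases with scores.a, so that the minimum or maximum of scores.b within a bin belongs to the smallest or largest scores.a in that bin.

```r
scores.a <- c(512, 531, 541, 555, 562, 565, 570, 572, 573, 588)
scores.b <- c(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
cuts <- c(531, 560, 571)

# Bin index per row: 0 = below the first cut,
# k = scores.a falls in [cuts[k], cuts[k + 1])
bins <- findInterval(scores.a, cuts)

# Smallest and largest scores.b within each bin (named by bin index)
mn <- sapply(split(scores.b, bins), min)
mx <- sapply(split(scores.b, bins), max)

# Cuts 1 and 2: first value at or above the cut = minimum of the bin
# that starts at the cut.
# Cut 3: last value below the cut = maximum of the bin that ends at it.
picked <- c(cut.1 = mn[["1"]], cut.2 = mn[["2"]], cut.3 = mx[["2"]])
picked
```

Taking the maximum of the bin that ends at the third cut only gives the intended answer when 571 itself is absent from scores.a; if the cut score were present, its row would fall in the next bin.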



Answer 2:


Using tidyverse:

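library(dplyr)  # provides %>%, mutate(), group_by(), summarise(), first()

# Note: the third break is 570, not the cut score 571, so that the last
# interval starts at the closest smaller value in scores.a.
data %>% 
  mutate(cuts_new = cut(scores.a, breaks = c(531, 560, 570, 1000), right = FALSE)) %>% 
  group_by(cuts_new) %>% 
  summarise(first_sb = first(scores.b)) %>% 
  ungroup()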

results in:

# A tibble: 4 x 2
  cuts_new    first_sb
  <fct>          <dbl>
1 [531,560)         13
2 [560,570)         16
3 [570,1e+03)       18
4 NA                12
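
The rule stated in the question (closest larger value for the first two cuts, closest smaller value for the third) can also be written out directly, without adjusting the breaks by hand. The following is a minimal base R sketch under that reading of the question; `lookup_cut` and the `directions` vector are hypothetical names introduced here for illustration and do not come from either answer.

```r
ids <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scores.a <- c(512, 531, 541, 555, 562, 565, 570, 572, 573, 588)
scores.b <- c(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
data <- data.frame(ids, scores.a, scores.b)
cuts <- c(531, 560, 571)

# Hypothetical helper: return the y value paired with the x value nearest
# to `cut` on the requested side.
#   "up"   = smallest x that is >= cut
#   "down" = largest x that is <= cut
# An exact match is returned in either direction.
lookup_cut <- function(cut, x, y, direction = c("up", "down")) {
  direction <- match.arg(direction)
  if (direction == "up") {
    candidates <- which(x >= cut)
    if (length(candidates) == 0) return(NA_real_)
    idx <- candidates[which.min(x[candidates])]
  } else {
    candidates <- which(x <= cut)
    if (length(candidates) == 0) return(NA_real_)
    idx <- candidates[which.max(x[candidates])]
  }
  y[idx]
}

# One direction per cut, as described in the question (an assumption
# about how the rule generalises beyond these three cuts)
directions <- c("up", "up", "down")

result <- data.frame(
  scores.b = mapply(lookup_cut, cut = cuts, direction = directions,
                    MoreArgs = list(x = data$scores.a, y = data$scores.b)),
  row.names = paste0("cut.", seq_along(cuts))
)
result
```

Unlike the binning approaches, this version does not require scores.a to be sorted, and it returns NA when no value exists on the requested side of a cut.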


Source: https://stackoverflow.com/questions/59882916/subset-values-with-matching-criteria-in-r
