Return values with matching conditions in R

时光毁灭记忆、已成空白 submitted on 2021-02-10 06:14:43

Question


I would like to return the values in one column that correspond to cut scores in another column. If a cut score does not appear in that column, I would like to use the closest larger value instead. Here is a snapshot of the dataset:

ids <- c(1,2,3,4,5,6,7,8,9,10)
scores.a <- c(512,531,541,555,562,565,570,572,573,588)
scores.b <- c(12,13,14,15,16,17,18,19,20,21)
data <- data.frame(ids, scores.a, scores.b)
> data
   ids scores.a scores.b
1    1      512       12
2    2      531       13
3    3      541       14
4    4      555       15
5    5      562       16
6    6      565       17
7    7      570       18
8    8      572       19
9    9      573       20
10  10      588       21

cuts <- c(531, 560, 571)

I would like to grab the scores.b value corresponding to the first cut score (531), which is 13. The second cut score (560) is not in scores.a, so I would like to use the closest larger scores.a value, 562, whose corresponding scores.b value is 16. Lastly, for the third cut score (571), I would like to get 19, which corresponds to the closest larger scores.a value, 572.

Here is what I would like to get.

       scores.b
cut.1  13
cut.2  16
cut.3  19

Any thoughts? Thanks


Answer 1:


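We can use a rolling join. With roll = -Inf, a cut that has no exact match in scores.a is matched to the next larger value (next observation carried backward). Note that setDT converts data to a data.table by reference.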

library(data.table)
setDT(data)[data.table(cuts = cuts), .(ids = ids, cuts, scores.b), 
          on = .(scores.a = cuts), roll = -Inf]
#   ids cuts scores.b
#1:   2  531       13
#2:   5  560       16
#3:   8  571       19
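
To make the direction of the roll visible, here is a small sketch that runs the same join with roll = -Inf and roll = Inf and also returns the scores.a value that was actually matched. The names dt, lookup, up, down and matched are illustrative and not part of the original answer; the sketch assumes a data.table version that supports the i. and x. prefixes in j.

```r
library(data.table)

# Rebuild the example data as a fresh data.table (leaves `data` untouched)
dt <- data.table(
  scores.a = c(512, 531, 541, 555, 562, 565, 570, 572, 573, 588),
  scores.b = c(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
)
lookup <- data.table(cuts = c(531, 560, 571))

# roll = -Inf: a cut without an exact match takes the next larger scores.a
up <- dt[lookup, .(cut = i.cuts, matched = x.scores.a, scores.b),
         on = .(scores.a = cuts), roll = -Inf]

# roll = Inf: a cut without an exact match takes the previous smaller scores.a
down <- dt[lookup, .(cut = i.cuts, matched = x.scores.a, scores.b),
           on = .(scores.a = cuts), roll = Inf]

print(up)
print(down)
```

Only the roll = -Inf variant gives the "closest larger value" behaviour asked for in the question; roll = Inf is shown purely for contrast.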

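Another option is findInterval from base R. By default findInterval matches downward (to the largest value at or below each cut), so we negate and reverse both vectors to make it match upward, then map the positions back to the original order: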

with(data, scores.b[rev(nrow(data) + 1 - findInterval(rev(-cuts), rev(-scores.a)))])
#[1] 13 16 19
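
A possibly easier-to-read variant of the same idea, sketched here under the assumption that scores.a is sorted in increasing order and that R is at least version 3.3.0 (where the left.open argument of findInterval was introduced). The variable names below and idx are illustrative.

```r
scores.a <- c(512, 531, 541, 555, 562, 565, 570, 572, 573, 588)
scores.b <- c(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
cuts <- c(531, 560, 571)

# With left.open = TRUE, findInterval counts the scores.a values that are
# strictly below each cut ...
below <- findInterval(cuts, scores.a, left.open = TRUE)

# ... so the next position holds the first scores.a value >= the cut
idx <- below + 1

result <- scores.b[idx]
print(result)
# [1] 13 16 19
```

If a cut is larger than every value in scores.a, idx points one past the end of the vector and the corresponding result is NA.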



Answer 2:


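This keeps the other columns instead of returning only scores.b, which makes it easier to check that the right rows were matched. findInterval labels each scores.a value with the index of the last cut at or below it, and match then picks the first row carrying each label: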

df1 <- data[match(seq_along(cuts), findInterval(data$scores.a, cuts)), ]
rownames(df1) <- paste("cuts", seq_along(cuts), sep = ".")

> df1
       ids scores.a scores.b
cuts.1   2      531       13
cuts.2   5      562       16
cuts.3   8      572       19
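
One caveat with the match/findInterval approach: if no scores.a value falls between two consecutive cuts, the lower cut gets NA even though a larger value exists. The following sketch spells the lookup out per cut to avoid that. The helper closest_larger and the second cut vector cuts2 are hypothetical names introduced for illustration only.

```r
scores.a <- c(512, 531, 541, 555, 562, 565, 570, 572, 573, 588)
scores.b <- c(12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
data <- data.frame(ids = 1:10, scores.a, scores.b)

# Row index of the smallest x that is >= cut, or NA if no such value exists
closest_larger <- function(cut, x) {
  candidates <- which(x >= cut)
  if (length(candidates) == 0) return(NA_integer_)
  candidates[which.min(x[candidates])]
}

cuts <- c(531, 560, 571)
idx <- vapply(cuts, closest_larger, integer(1), x = data$scores.a)
data$scores.b[idx]
# [1] 13 16 19

# Two cuts (560 and 561) that share the same closest larger value, 562
cuts2 <- c(531, 560, 561)
idx2 <- vapply(cuts2, closest_larger, integer(1), x = data$scores.a)
data$scores.b[idx2]
# [1] 13 16 16

# The match/findInterval approach returns NA for the second cut here
idx2_match <- match(seq_along(cuts2), findInterval(data$scores.a, cuts2))
```

This version does not require scores.a to be sorted, at the cost of scanning the vector once per cut.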


Source: https://stackoverflow.com/questions/59570555/return-values-with-matching-conditions-in-r
