case_number <- c("1", "1", "2", "2", "2", "3", "3")
type <- c("STD", "STD2", "STD", "STD3", "STD2", "STD", "STD2")
date <- as.Date
Assuming every case_number contains both values, another option is to check the positions of "STD" and "STD2" within each group and keep the groups where those positions differ by exactly 1.
check_fun <- function(x) {
  # which.max() on a logical vector returns the index of the first TRUE
  abs(diff(c(which.max(x == "STD"), which.max(x == "STD2")))) == 1
}
library(dplyr)
data %>% group_by(case_number) %>% filter(check_fun(type))
# case_number type date
# <fct> <fct> <date>
#1 1 STD 2008-11-01
#2 1 STD2 2009-03-25
#3 3 STD 2015-03-14
#4 3 STD2 2015-04-15
Or, if you only need the unique case_number values:
data %>%
  group_by(case_number) %>%
  filter(check_fun(type)) %>%
  pull(case_number) %>%
  unique()
#[1] 1 3
#Levels: 1 2 3
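The assumption that every group contains both values matters, because `which.max()` returns 1 when nothing matches. The sketch below shows that edge case on a made-up vector and adds a guarded variant. Note that `check_fun_safe` and `no_std` are hypothetical names introduced here for illustration, not part of the answer above.

```r
# check_fun as given above
check_fun <- function(x) {
  abs(diff(c(which.max(x == "STD"), which.max(x == "STD2")))) == 1
}

# Hypothetical guarded variant (an assumption, not the original method):
# match() returns NA for a value that is absent from the group
check_fun_safe <- function(x) {
  pos <- match(c("STD", "STD2"), x)  # first position of each value, NA if missing
  !anyNA(pos) && abs(diff(pos)) == 1
}

no_std <- c("STD3", "STD2")  # made-up group with no "STD"
check_fun(no_std)       # TRUE: which.max() falls back to index 1
check_fun_safe(no_std)  # FALSE
```

Like the original, the guarded variant only looks at the first occurrence of each value, so it may be enough when each type appears at most once per case.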
Here's a shot:
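myfilter <- function(x) {
  # run-length encode the rows whose type is "STD" or "STD2"
  r <- rle(x %in% c("STD", "STD2"))
  # keep the group if any run of matching rows is longer than one
  any(r$lengths[r$values] > 1)
}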
library(dplyr)
data %>%
  group_by(case_number) %>%
  filter(myfilter(type)) %>%
  ungroup()
# # A tibble: 4 x 3
# case_number type date
# <fct> <fct> <date>
# 1 1 STD 2008-11-01
# 2 1 STD2 2009-03-25
# 3 3 STD 2015-03-14
# 4 3 STD2 2015-04-15
It does not care about order; it only checks whether "STD" and/or "STD2" appear in a run of two (or more) consecutive rows.
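To see what the run-length check does, here is a small sketch on plain character vectors. `case2` mirrors the types of case_number 2 in the sample data, and the other inputs are made up for illustration.

```r
myfilter <- function(x) {
  r <- rle(x %in% c("STD", "STD2"))
  any(r$lengths[r$values] > 1)
}

case2 <- c("STD", "STD3", "STD2")  # types for case_number 2 in the sample data
rle(case2 %in% c("STD", "STD2"))$lengths  # 1 1 1: no run longer than one
myfilter(case2)                # FALSE: "STD3" breaks the chain
myfilter(c("STD2", "STD"))     # TRUE: reversed order still matches
myfilter(c("STD", "STD"))      # TRUE: a repeated value also forms a run
```

If a repeated "STD" should not count, the filter would need an extra check that both values occur inside the run.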