I have a df (“df”) containing multiple time series (value ~ time) whose observations are grouped by 3 factors: temp, rep, and species. These data need to be trimmed at the lower
We can find the indices of the rows we want to exclude using mapply:
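# Row indices of df whose value lies outside [min_value, max_value] for its group
drop_idx <- unlist(with(df_thresholds,
  mapply(function(x, y, z, min_x, max_x)
    which(df$species == x & df$temp == y & df$rep == z &
          (df$value < min_x | df$value > max_x)),
    species, temp, rep, min_value, max_value, SIMPLIFY = FALSE)), use.names = FALSE)
# Note: if drop_idx is empty, df[-drop_idx, ] returns zero rows
df[-drop_idx, ]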
# species temp rep time value
#2 A 10 1 2 4
#3 A 10 1 3 8
#6 A 20 1 2 4
#7 A 20 1 3 9
#9 A 10 2 1 2
#10 A 10 2 2 4
#11 A 10 2 3 10
#12 A 10 2 4 16
#......
In mapply, we pass all the columns of df_thresholds, filter df accordingly, find the indices of the rows whose value falls outside the min and max for each threshold row, and exclude them from the original data frame.
The row indices collected from the mapply call are
#[1] 1 4 5 8 25 28
which are the rows we want to exclude from df, since their values fall out of range.
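For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The toy df and df_thresholds below are made up for illustration (they are not the data from the question). The sketch also assumes it is worth guarding against two edge cases: mapply returning a list when groups have different numbers of out-of-range rows, and negative indexing with an empty vector, which would drop every row.

```r
# Toy data, hypothetical: two groups of one species, four time points each
df <- data.frame(
  species = rep("A", 8),
  temp    = rep(c(10, 20), each = 4),
  rep     = 1,
  time    = rep(1:4, times = 2),
  value   = c(1, 4, 8, 20, 2, 4, 9, 30),
  stringsAsFactors = FALSE
)

# One row of thresholds per species/temp/rep combination (also hypothetical)
df_thresholds <- data.frame(
  species   = c("A", "A"),
  temp      = c(10, 20),
  rep       = c(1, 1),
  min_value = c(3, 3),
  max_value = c(10, 10),
  stringsAsFactors = FALSE
)

# For each threshold row, collect the indices of df rows in that group
# whose value is below min_value or above max_value.
# SIMPLIFY = FALSE keeps the result a list regardless of group sizes.
drop_list <- with(df_thresholds, mapply(
  function(x, y, z, min_x, max_x)
    which(df$species == x & df$temp == y & df$rep == z &
            (df$value < min_x | df$value > max_x)),
  species, temp, rep, min_value, max_value,
  SIMPLIFY = FALSE
))
drop_idx <- unlist(drop_list, use.names = FALSE)

# df[-integer(0), ] would return zero rows, so only subset when needed
trimmed <- if (length(drop_idx) > 0) df[-drop_idx, ] else df
trimmed
```

With this toy data, rows 1, 4, 5 and 8 are out of range, so trimmed keeps rows 2, 3, 6 and 7.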