Suppose I have a data frame (df) that looks like below:
options(stringsAsFactors = FALSE)
cars <- c("Car1", "Car2", "Car3", "Car4", "Car5", "Car6
Here is an option with rowSums; the logic is to check whether any value in the row differs (NA comparisons don't count) from one of the columns you are interested in (here the first test column, df[[2]]):
df[rowSums(df[-1] != df[[2]], na.rm = TRUE) != 0,]
# cars test1 test2 test3 test4 test5 test6 test7
#1 Car1 0 0 1 2 0 1 3
#3 Car3 3 2 5 2 1 1 2
#5 Car5 4 0 2 2 0 1 0
#7 Car7 1 2 6 1 1 3 1
#8 Car8 3 5 7 1 3 4 1
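To make the rowSums logic easier to follow, here is a minimal sketch that breaks it into steps. The data frame `toy` and its values are made up for illustration (they are not the data from the question), and the intermediate names `diff_mat` and `n_diff` are only there for readability.

```r
# Hypothetical toy data, not the data from the question
toy <- data.frame(
  cars  = c("A", "B", "C", "D"),
  test1 = c(1, 2, NA, 3),
  test2 = c(1, 5, 4, 3),
  test3 = c(1, 2, 4, NA),
  stringsAsFactors = FALSE
)

# Compare every test column against the reference column test1.
# The vector is recycled column by column, so each column is
# compared element-wise with test1; comparisons involving NA give NA.
diff_mat <- toy[-1] != toy[["test1"]]

# Count the differences per row, dropping the NA comparisons
n_diff <- rowSums(diff_mat, na.rm = TRUE)
# A: 0, B: 1, C: 0, D: 0

# Keep the rows with at least one real difference
result <- toy[n_diff != 0, ]
result$cars
# "B"
```

Note that row C is dropped even though its reference value is NA: every comparison against NA is NA, so nothing is counted. If the reference column can be NA in your data, that is worth keeping in mind.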
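Another option is apply: for each row, drop the NAs and keep the row unless exactly one distinct value remains:
keep <- apply(df[2:8], 1, function(x) length(unique(x[!is.na(x)])) != 1)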
df[keep, ]
# cars test1 test2 test3 test4 test5 test6 test7
#1 Car1 0 0 1 2 0 1 3
#3 Car3 3 2 5 2 1 1 2
#5 Car5 4 0 2 2 0 1 0
#7 Car7 1 2 6 1 1 3 1
#8 Car8 3 5 7 1 3 4 1
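The apply approach and the rowSums approach agree on the data shown, but they can differ when NAs are involved. The sketch below uses a made-up data frame `toy` (hypothetical values, chosen to hit the edge cases) to show where: a row whose reference column is NA, and a row that is entirely NA. The name `keep_strict` is my own variant, not part of the original answer.

```r
# Hypothetical toy data chosen to exercise the NA edge cases
toy <- data.frame(
  cars  = c("A", "B", "C", "D"),
  test1 = c(1, 2, NA, NA),
  test2 = c(1, 5, 4, NA),
  test3 = c(1, 2, 7, NA),
  stringsAsFactors = FALSE
)

# Number of distinct non-NA values per row
n_vals <- apply(toy[-1], 1, function(x) length(unique(x[!is.na(x)])))
# A: 1, B: 2, C: 2, D: 0

# Condition from the answer above: an all-NA row has 0 distinct
# values, and 0 != 1, so row D is kept as well
keep_original <- n_vals != 1
toy$cars[keep_original]
# "B" "C" "D"

# Assumed variant: require at least two distinct values,
# which also drops the all-NA row
keep_strict <- n_vals > 1
toy$cars[keep_strict]
# "B" "C"

# For comparison, the rowSums approach against test1 drops row C,
# because every comparison with an NA reference value is NA
by_reference <- rowSums(toy[-1] != toy[["test1"]], na.rm = TRUE) != 0
toy$cars[by_reference]
# "B"
```

Which behaviour is right depends on what "different" should mean for rows with missing values, so it is probably worth deciding that before picking one of the approaches.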
We can also use Map with Reduce:
df[c(Reduce(`+`, Map(function(x,y) x != y & !is.na(x), df[-1], list(df[2]))) != 0),]
# cars test1 test2 test3 test4 test5 test6 test7
#1 Car1 0 0 1 2 0 1 3
#3 Car3 3 2 5 2 1 1 2
#5 Car5 4 0 2 2 0 1 0
#7 Car7 1 2 6 1 1 3 1
#8 Car8 3 5 7 1 3 4 1
Or using the tidyverse:
library(tidyverse)
df %>%
filter_at(vars(starts_with("test")), any_vars((. != test1)))
# cars test1 test2 test3 test4 test5 test6 test7
#1 Car1 0 0 1 2 0 1 3
#2 Car3 3 2 5 2 1 1 2
#3 Car5 4 0 2 2 0 1 0
#4 Car7 1 2 6 1 1 3 1
#5 Car8 3 5 7 1 3 4 1
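In recent dplyr versions filter_at() and any_vars() are superseded (they still work). A possible equivalent, assuming dplyr 1.0.4 or later where if_any() is available, is sketched below on a made-up data frame `toy` (hypothetical values, not the data from the question).

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
toy <- data.frame(
  cars  = c("A", "B", "C", "D"),
  test1 = c(1, 2, NA, 3),
  test2 = c(1, 5, 4, 3),
  test3 = c(1, 2, 4, NA),
  stringsAsFactors = FALSE
)

# Keep rows where any "test" column differs from test1.
# filter() drops rows whose condition evaluates to NA, so rows
# without a definite difference are removed.
result <- toy %>%
  filter(if_any(starts_with("test"), ~ .x != test1))

result$cars
# "B"
```

As with filter_at(), the row names of the result are renumbered from 1, which is why the tidyverse output above shows 1 to 5 rather than the original row numbers.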