I have huge data sets that contain millions of rows and have some peculiar attributes. I need to filter the data while retaining its other properties.
My data i
You may try
library(data.table)
setDT(df1)[, .SD[any(Prop1!=Prop2)], ID]
# ID Prop1 Prop2 TotalProp
# 1: 56892558 A61 G02 4
# 2: 56892558 A61 A61 4
# 3: 56892558 G02 A61 4
# 4: 56892558 A61 A61 4
# 5: 56892552 B61 B61 3
# 6: 56892552 B61 B61 3
# 7: 56892552 B61 A61 3
# 8: 56892559 B61 G61 3
# 9: 56892559 B61 B61 3
#10: 56892559 B61 B61 3
Or as @Frank suggested
setDT(df1)[, if(any(Prop1!=Prop2)) .SD, ID]
Similar option using dplyr
library(dplyr)
df1 %>%
  group_by(ID) %>%
  filter(any(Prop1 != Prop2))
Or using ave from base R
df1[with(df1, ave(Prop1!=Prop2, ID, FUN=any)),]
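All of the options above apply the same rule: a group is kept whole if at least one of its rows has Prop1 different from Prop2, and dropped whole otherwise. Here is a minimal sketch of that on a made-up four-row df1 (the IDs and values are hypothetical, not the data from the question), using the base R version so that no packages are needed:

```r
# Hypothetical toy data: ID 1 contains one mismatching row, ID 2 contains none
df1 <- data.frame(
  ID    = c(1, 1, 2, 2),
  Prop1 = c("A61", "A61", "B61", "B61"),
  Prop2 = c("G02", "A61", "B61", "B61"),
  stringsAsFactors = FALSE
)

# For every row, ave() returns whether its ID group has any Prop1 != Prop2
keep <- with(df1, ave(Prop1 != Prop2, ID, FUN = any))

# Both rows of ID 1 are kept (including the matching one); ID 2 is dropped
res <- df1[keep, ]
res
```

Note that the matching row of ID 1 survives, because the test is applied per group and not per row. If you only wanted the mismatching rows themselves, df1[df1$Prop1 != df1$Prop2, ] would be enough.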