Within ID, check for matches/differences

自闭症患者 2021-01-12 00:11

I have a large dataset of over 1.5 million rows from 600k unique subjects, so many subjects have multiple rows. I am trying to find the cases where one of the subj

4 Answers
  •  小蘑菇 (OP)
    2021-01-12 00:28

    One approach using plyr:

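    library(plyr)
    # Count the distinct DOB values recorded for each ID
    zz <- ddply(test, "ID", summarise, dups = length(unique(DOB)))
    # Keep only the IDs that have more than one distinct DOB
    zz[zz$dups > 1, ]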
    

    If you prefer base R, aggregate() gives the same result:

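    # The DOB column of the result holds the number of distinct DOB values per ID
    zzz <- aggregate(DOB ~ ID, data = test, FUN = function(x) length(unique(x)))
    # Keep only the IDs that have more than one distinct DOB
    zzz[zzz$DOB > 1, ]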
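
To try the base R approach end to end, here is a minimal sketch on made-up data. The toy `test` data frame and the names `counts`, `bad_ids` and `mismatches` are assumptions for illustration; only the `ID` and `DOB` column names come from the answer above. It also goes one step further and pulls every original row for the flagged IDs, which may make the conflicting values easier to inspect.

```r
# Toy data, invented for illustration: ID 2 has two different DOB values.
test <- data.frame(
  ID  = c(1, 1, 2, 2, 3),
  DOB = c("1990-01-01", "1990-01-01", "1985-05-05", "1985-06-05", "1970-12-31"),
  stringsAsFactors = FALSE
)

# Count the distinct DOB values per ID (same call as in the answer above).
counts <- aggregate(DOB ~ ID, data = test, FUN = function(x) length(unique(x)))

# IDs whose rows disagree on DOB.
bad_ids <- counts$ID[counts$DOB > 1]

# All original rows for those IDs, so the conflicting values can be compared.
mismatches <- test[test$ID %in% bad_ids, ]
print(mismatches)
```

Note that the formula interface of aggregate() drops rows with a missing DOB by default (na.action = na.omit), so an ID with one valid DOB and one NA would not be flagged by this sketch.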
    
