Determining different rows between two data sets in R

Question

I have two data files in tab-separated CSV format. The files have the following layout:

EP Code    EP Name    Address    Region    ...
101654    Alpha     York Street    Northwest    ...
103628    Beta    5th Avenue    South    ...

EP codes are unique. I want to compare the two files by EP code, find the rows that differ, and write them to a new file.

For example, file1.csv has 800 rows and file2.csv has 850 rows. file2 could be a file completely including file1 plus 50 rows; or it could be file1 - 10 rows + 60 rows. I want to determine the differences between two data sets. I'm not interested in the mutual rows.

How can I do that in R?

Answer 1:

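There are many ways to do this, including setdiff, intersect, the %in% operator, and is.element. Note that setdiff and intersect return EP code values, not row positions, so they cannot be used directly as row indices. Instead, build a logical index with %in% and negate it with ! to keep the rows whose code is missing from the other data frame:

diff1 <- file1[!(file1$ep.code %in% file2$ep.code), ]

diff2 <- file2[!(file2$ep.code %in% file1$ep.code), ]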
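
For the full workflow asked about in the question (compare by EP code, keep only the non-shared rows, write them to a new file), a minimal sketch might look like the following. The sample data frames, the source column, and the output file name are all hypothetical. With real files, something like read.delim("file1.csv") would do the import, and the "EP Code" header would then typically become EP.Code rather than ep.code.

```r
# Hypothetical sample data standing in for the two imported files.
# With real files, e.g.: file1 <- read.delim("file1.csv", stringsAsFactors = FALSE)
file1 <- data.frame(
  ep.code = c(101654, 103628, 104001),
  ep.name = c("Alpha", "Beta", "Gamma"),
  stringsAsFactors = FALSE
)
file2 <- data.frame(
  ep.code = c(103628, 104001, 105777, 106888),
  ep.name = c("Beta", "Gamma", "Delta", "Epsilon"),
  stringsAsFactors = FALSE
)

# Rows whose EP code appears in only one of the two data frames
only_in_file1 <- file1[!(file1$ep.code %in% file2$ep.code), ]
only_in_file2 <- file2[!(file2$ep.code %in% file1$ep.code), ]

# Label each row with its origin (rep() also copes with zero rows),
# then stack the two sets into one data frame
only_in_file1$source <- rep("file1", nrow(only_in_file1))
only_in_file2$source <- rep("file2", nrow(only_in_file2))
differences <- rbind(only_in_file1, only_in_file2)

# Write the result as a tab-separated file (path is an assumption)
out_path <- file.path(tempdir(), "differences.csv")
write.table(differences, out_path, sep = "\t", quote = FALSE, row.names = FALSE)
```

Because the comparison uses only the EP code, rows that share a code but differ in other columns are treated as common and are not reported.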

Source: https://stackoverflow.com/questions/3132778/determining-different-rows-between-two-data-sets-in-r

Tags

csv

comparison

rows

import-from-csv
