How to compare two data frames/tables and extract data in R?

Submitted by 你说的曾经没有我的故事 on 2019-12-01 05:38:50

This question has the data.table tag, so here's my attempt using that package. The first step is to convert the row names to a column, since data.table doesn't keep row names. Then rbind the two tables with an id column marking the source data set, melt to long format, keep only the (rn, variable) groups that have more than one unique value, and dcast back to wide format:

library(data.table)  
setDT(dfA, keep.rownames = TRUE) 
setDT(dfB, keep.rownames = TRUE)   

dcast(melt(rbind(dfA, 
                 dfB, 
                 idcol = TRUE), 
           id.vars = 1:2   # columns 1:2 are .id and rn
           )[, 
             if(uniqueN(value) > 1L) .SD, 
             by = .(rn, variable)], 
      rn + variable ~ .id)

#      rn variable  1  2
# 1: snp2  animal3 TT TB
# 2: snp4  animal2 CA DF
# 3: snp4  animal3 CA DF
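
For anyone who wants to follow the chain one step at a time, here is a rough sketch of the same idea split into intermediate objects. The input data is hypothetical (the original dfA/dfB aren't shown in this answer), and the names dtA, dtB, stacked, long, changed, source, animal and genotype are my own. It uses as.data.table() instead of setDT() so that dfA and dfB are not modified in place.

```r
library(data.table)

# Hypothetical sample data: 4 SNPs x 3 animals, differing in three cells
dfA <- data.frame(animal1 = c("AA", "TT", "AG", "CA"),
                  animal2 = c("AA", "TB", "AG", "CA"),
                  animal3 = c("AA", "TT", "AG", "CA"),
                  row.names = c("snp1", "snp2", "snp3", "snp4"),
                  stringsAsFactors = FALSE)
dfB <- dfA
dfB["snp2", "animal3"] <- "TB"
dfB["snp4", c("animal2", "animal3")] <- "DF"

# 1. Row names become a regular column called "rn"
dtA <- as.data.table(dfA, keep.rownames = TRUE)
dtB <- as.data.table(dfB, keep.rownames = TRUE)

# 2. Stack both tables; "source" records which table each row came from
stacked <- rbindlist(list(A = dtA, B = dtB), idcol = "source")

# 3. Long format: one row per (source, rn, animal)
long <- melt(stacked, id.vars = c("source", "rn"),
             variable.name = "animal", value.name = "genotype")

# 4. Keep only the cells whose value differs between the two tables
changed <- long[, if (uniqueN(genotype) > 1L) .SD, by = .(rn, animal)]

# 5. Back to wide: one column per source table
wide <- dcast(changed, rn + animal ~ source, value.var = "genotype")
print(wide)
```

Naming the list elements in rbindlist() gives the final columns readable labels (A, B) instead of 1 and 2.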

Here is a base R solution using the array indices (row/column positions) of the cells that differ:

i.arr <- which(dfA != dfB, arr.ind=TRUE)

data.frame(snp=rownames(dfA)[i.arr[,1]], animal=colnames(dfA)[i.arr[,2]],
           A=dfA[i.arr], B=dfB[i.arr])
#   snp  animal  A  B
#1 snp4 animal2 CA DF
#2 snp2 animal3 TT TB
#3 snp4 animal3 CA DF
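
To make this easy to try, here is a self-contained sketch with made-up input. The original dfA/dfB aren't shown, so the data below is hypothetical, and it assumes both data frames have the same dimensions and the same row/column order. It also prints the intermediate index matrix, which is what arr.ind = TRUE buys you.

```r
# Hypothetical sample data: 4 SNPs x 3 animals, differing in three cells
dfA <- data.frame(animal1 = c("AA", "TT", "AG", "CA"),
                  animal2 = c("AA", "TB", "AG", "CA"),
                  animal3 = c("AA", "TT", "AG", "CA"),
                  row.names = c("snp1", "snp2", "snp3", "snp4"),
                  stringsAsFactors = FALSE)
dfB <- dfA
dfB["snp2", "animal3"] <- "TB"
dfB["snp4", c("animal2", "animal3")] <- "DF"

# dfA != dfB is a logical matrix; arr.ind = TRUE returns one (row, col)
# pair per TRUE cell instead of a single linear index
i.arr <- which(dfA != dfB, arr.ind = TRUE)
print(i.arr)

# A two-column index matrix picks out exactly those cells
diffs <- data.frame(snp    = rownames(dfA)[i.arr[, 1]],
                    animal = colnames(dfA)[i.arr[, 2]],
                    A      = dfA[i.arr],
                    B      = dfB[i.arr],
                    stringsAsFactors = FALSE)
print(diffs)
```

The rows come out in column-major order (all differences in animal2 first, then animal3), which is why snp4 appears before snp2.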

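This can also be done with dplyr/tidyr, using an approach similar to the one in @David Arenburg's post: stack the two data sets, reshape to long format, keep the groups with more than one distinct value, and spread back to wide format.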

library(dplyr)
library(tidyr)
bind_rows(add_rownames(dfA), add_rownames(dfB)) %>% 
          gather(Var, Val, -rowname) %>%
          group_by(rowname, Var) %>%
          filter(n_distinct(Val)>1) %>% 
          mutate(id = 1:2) %>% 
          spread(id, Val)
#  rowname     Var     1     2
#    (chr)   (chr) (chr) (chr)
#1    snp2 animal3    TT    TB
#2    snp4 animal2    CA    DF
#3    snp4 animal3    CA    DF
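
In current dplyr/tidyr releases add_rownames() is deprecated and gather()/spread() are superseded, so here is a sketch of how the same pipeline might look with tibble::rownames_to_column(), pivot_longer() and pivot_wider(). The input data is hypothetical, and the column names source, snp, animal and genotype are my own choices.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical sample data: 4 SNPs x 3 animals, differing in three cells
dfA <- data.frame(animal1 = c("AA", "TT", "AG", "CA"),
                  animal2 = c("AA", "TB", "AG", "CA"),
                  animal3 = c("AA", "TT", "AG", "CA"),
                  row.names = c("snp1", "snp2", "snp3", "snp4"),
                  stringsAsFactors = FALSE)
dfB <- dfA
dfB["snp2", "animal3"] <- "TB"
dfB["snp4", c("animal2", "animal3")] <- "DF"

res <- bind_rows(A = rownames_to_column(dfA, "snp"),
                 B = rownames_to_column(dfB, "snp"),
                 .id = "source") %>%           # source is "A" or "B"
  pivot_longer(cols = c(-source, -snp),
               names_to = "animal", values_to = "genotype") %>%
  group_by(snp, animal) %>%
  filter(n_distinct(genotype) > 1) %>%         # keep cells that differ
  ungroup() %>%
  pivot_wider(names_from = source, values_from = genotype) %>%
  arrange(snp, animal)

print(res)
```

Because the source label comes from bind_rows(.id = ), there is no need for the mutate(id = 1:2) step, which only works when every group has exactly two rows.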