Get the list of items in Venn diagram

。_饼干妹妹 submitted on 2019-12-05 06:37:46

Take a look at the ?intersect, ?union and ?setdiff functions to extract the different regions of the Venn diagram.

I have created list versions of these three functions to make it easier to get the elements in the different compartments:

Intersect <- function (x) {  
  # Multiple set version of intersect
  # x is a list
  if (length(x) == 1) {
    unlist(x)
  } else if (length(x) == 2) {
    intersect(x[[1]], x[[2]])
  } else if (length(x) > 2){
    intersect(x[[1]], Intersect(x[-1]))
  }
}

Union <- function (x) {  
  # Multiple set version of union
  # x is a list
  if (length(x) == 1) {
    unlist(x)
  } else if (length(x) == 2) {
    union(x[[1]], x[[2]])
  } else if (length(x) > 2) {
    union(x[[1]], Union(x[-1]))
  }
}

Setdiff <- function (x, y) {
  # Remove the union of the y's from the common x's. 
  # x and y are lists of characters.
  xx <- Intersect(x)
  yy <- Union(y)
  setdiff(xx, yy)
}
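
For comparison, the same three helpers can be sketched more compactly with base R's Reduce, which folds a two-argument function over a list. The names intersect_all, union_all and only_in are hypothetical, and the small sets below are made-up illustrative data rather than the data from the question.

```r
# Hypothetical helper names; base R only.
# Reduce() folds a binary function over the list, left to right.
intersect_all <- function(sets) Reduce(intersect, sets)
union_all     <- function(sets) Reduce(union, sets)

# Elements present in every set of `inside` and in no set of `outside`.
only_in <- function(inside, outside) {
  setdiff(intersect_all(inside), union_all(outside))
}

# Small fixed sets (illustrative data, not from the original post).
sets <- list(A = c("a", "b", "c", "d"),
             B = c("b", "c", "d", "e"),
             C = c("c", "d", "f"))

common   <- intersect_all(sets)                     # in A, B and C
everyone <- union_all(sets)                         # in at least one set
ab_only  <- only_in(sets[c("A", "B")], sets["C"])   # in A and B but not in C

print(common)
print(everyone)
print(ab_only)
```

Using fixed sets instead of sample() keeps the result independent of the R version, since the default sampling algorithm changed in R 3.6.0.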

So, if we want to see the elements common to all sets (i.e. the intersection of A, B, C, and D), or the ones in both C and D but in neither A nor B, in your example we do something like the following.

set.seed(1)
xx.1 <- list(A = sample(LETTERS, 15), 
             B = sample(LETTERS, 15), 
             C = sample(LETTERS, 15), 
             D = sample(LETTERS, 15))
Intersect(xx.1)
#[1] "E" "L"
Setdiff(xx.1[c("C", "D")], xx.1[c("A", "B")])
#[1] "O" "P" "K" "H"

Hope this helps!

Edit: Systematically get all components

With some (I think) clever use of the combn function, indexing, and a good understanding of lapply, we can get all elements systematically:

# Create a list of all the combinations
combs <- 
  unlist(lapply(seq_along(xx.1),
                function(j) combn(names(xx.1), j, simplify = FALSE)),
         recursive = FALSE)
names(combs) <- sapply(combs, function(i) paste0(i, collapse = ""))
str(combs)
#List of 15
# $ A   : chr "A"
# $ B   : chr "B"
# $ C   : chr "C"
# $ D   : chr "D"
# $ AB  : chr [1:2] "A" "B"
# $ AC  : chr [1:2] "A" "C"
# $ AD  : chr [1:2] "A" "D"
# $ BC  : chr [1:2] "B" "C"
# $ BD  : chr [1:2] "B" "D"
# $ CD  : chr [1:2] "C" "D"
# $ ABC : chr [1:3] "A" "B" "C"
# $ ABD : chr [1:3] "A" "B" "D"
# $ ACD : chr [1:3] "A" "C" "D"
# $ BCD : chr [1:3] "B" "C" "D"
# $ ABCD: chr [1:4] "A" "B" "C" "D"

# "A" means "everything in A and in none of the other sets"
# c("A", "B") means "everything in both A and B and in none of the other sets", and so on
elements <- 
  lapply(combs, function(i) Setdiff(xx.1[i], xx.1[setdiff(names(xx.1), i)]))

n.elements <- sapply(elements, length)
print(n.elements)
#   A    B    C    D   AB   AC   AD   BC   BD   CD  ABC  ABD  ACD  BCD ABCD 
#   2    2    0    0    1    2    2    0    3    4    4    1    1    2    2 
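
As an alternative sketch (not the method above), the regions can also be obtained in one pass by labelling every element with the sets that contain it and then splitting on that label. The variable names and the small fixed sets are assumptions made for illustration. Note that, unlike the combn approach, empty regions simply do not appear in the result.

```r
# Small fixed sets (illustrative data, not from the original post).
sets <- list(A = c("a", "b", "c", "d"),
             B = c("b", "c", "d", "e"),
             C = c("c", "d", "f"))

# Every distinct element that occurs in at least one set.
universe <- unique(unlist(sets, use.names = FALSE))

# Logical matrix: one row per element, one column per set.
membership <- vapply(sets, function(s) universe %in% s,
                     logical(length(universe)))
rownames(membership) <- universe

# Label each element with the names of the sets containing it, e.g. "AB".
signature <- apply(membership, 1,
                   function(row) paste(colnames(membership)[row], collapse = ""))

# Group elements by label; each group is one exclusive Venn region.
regions <- split(universe, signature)
region_sizes <- lengths(regions)

print(regions)
print(region_sizes)
```

Because every element gets exactly one label, the region sizes add up to the size of the union, which is a quick sanity check on the result.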
al-ash

You can also use venn from the gplots package to get the list of items in each section of the Venn diagram ('ItemsList'). Given your list xx.1, it would be:

library(gplots)
ItemsList <- venn(xx.1, show.plot = FALSE)

ItemsList contains:

  1. a matrix of all diagram sections and the counts of items in these sections and
  2. the list of items in each Venn diagram section.

To get the counts:

lengths(attributes(ItemsList)$intersections)
# A       B     A:B     A:C     A:D     B:D     C:D   A:B:C   A:B:D   A:C:D   B:C:D A:B:C:D 
# 2       2       1       2       2       3       4       4       1       1       2       2
Sheng Qin

The VennDiagram package has a function called calculate.overlap.

library(VennDiagram)
overlap <- calculate.overlap(xx.1)

And the overlap is what you want:

$a6
[1] "C"

$a12
[1] "Z" "D" "R"

$a11
[1] "Y" "O" "V"

$a5
[1] "X" "B"

$a7
[1] "H" "F" "P" "S"

$a15
[1] "I"

$a4
[1] "L" "K" "G"

$a10
[1] "W" "J"

$a13
[1] "U"

$a8
character(0)

$a2
character(0)

$a9
character(0)

$a14
[1] "N" "M"

$a1
[1] "E"

$a3
[1] "Q" "A" "T"