Question
Thank you for your kind reply to my previous questions. I have two lists: list1 and list2. I would like to know if each object of list1 is contained in each object of list2. For example:
> list1
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
> list2
[[1]]
[1] 1 2 3
[[2]]
[1] 2 3
[[3]]
[1] 2 3
Here are my questions:
1.) How do I ask R to check whether one object in a list is a subset of another? For instance, I would like to check whether list2[[3]] = {2,3} is contained in (is a subset of) list1[[2]] = {2}. When I run list2[[3]] %in% list1[[2]], I get [1] TRUE FALSE. However, this is not what I want. I just want to check whether list2[[3]] is a subset of list1[[2]], i.e. whether {2,3} \subset {2} in the set-theoretic sense. I do not want the elementwise check that R performs with the %in% operator. Any suggestions?
2.) Is there an efficient way to make all pairwise subset comparisons, i.e. to test whether list1[[i]] is a subset of list2[[j]] for all combinations of i and j? Would something like outer(list1, list2, func.subset) work once question 1 is answered?
Thank you for your feedback!
Answer 1:
setdiff compares unique values, so an empty result means the first argument is a subset of the second:
length(setdiff(5, 1:5)) == 0
Alternatively, all(x %in% y) will work nicely.
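To make the difference between the elementwise `%in%` result and a set-level test concrete, here is a minimal sketch using the data from the question. The helper name `is_subset` is hypothetical and not part of base R.

```r
# Data from the question
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)

# Hypothetical helper: TRUE when every element of x also occurs in y
is_subset <- function(x, y) all(x %in% y)

# Elementwise: one logical per element of list2[[3]]
list2[[3]] %in% list1[[2]]

# Set-level: a single logical for "is {2, 3} a subset of {2}?"
is_subset(list2[[3]], list1[[2]])

# The reverse direction: "is {2} a subset of {2, 3}?"
is_subset(list1[[2]], list2[[3]])

# The setdiff() form of the first set-level test
length(setdiff(list2[[3]], list1[[2]])) == 0
```

Both forms treat an empty vector as a subset of anything, since `all(logical(0))` is `TRUE`, which matches the set-theoretic convention.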
To do all comparisons, something like this would work:
dt <- expand.grid(list1,list2)
dt$subset <- apply(dt,1, function(.v) all(.v[[1]] %in% .v[[2]]) )
  Var1    Var2 subset
1    1 1, 2, 3   TRUE
2    2 1, 2, 3   TRUE
3    3 1, 2, 3   TRUE
4    1    2, 3  FALSE
5    2    2, 3   TRUE
6    3    2, 3   TRUE
7    1    2, 3  FALSE
8    2    2, 3   TRUE
9    3    2, 3   TRUE
Note that expand.grid isn't the fastest way to do this when dealing with a lot of data (DWin's solution is better in that regard), but it lets you quickly check visually whether this is doing what you want.
Answer 2:
You can use the sets package as follows:
library(sets)
is.subset <- function(x, y) as.set(x) <= as.set(y)
outer(list1, list2, Vectorize(is.subset))
#       [,1]  [,2]  [,3]
# [1,]  TRUE FALSE FALSE
# [2,]  TRUE  TRUE  TRUE
# [3,]  TRUE  TRUE  TRUE
@Michael's or @DWin's base version of is.subset will work just as well, but for part two of your question, I'd maintain that outer is the way to go.
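For readers without the sets package, the same outer() call can be sketched with a base R helper. This is only an illustration under the assumption that `all(x %in% y)` is an acceptable subset test. The name `is_subset` is hypothetical.

```r
# Data from the question
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)

# Hypothetical base R helper: TRUE when every element of x occurs in y
is_subset <- function(x, y) all(x %in% y)

# outer() hands the helper the fully expanded lists in one call, so the
# helper is wrapped in Vectorize() to be applied pair by pair.
res <- outer(list1, list2, Vectorize(is_subset))

# res[i, j] answers "is list1[[i]] a subset of list2[[j]]?"
res[1, 2]
```

Without `Vectorize()`, the helper would receive two length-9 lists and return a single value, which outer() cannot reshape into a 3 x 3 matrix.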
Answer 3:
is.subset <- function(x,y) {length(setdiff(x,y)) == 0}
First the combos of list1 elements that are subsets of list2 items:
> sapply(1:length(list1), function(i1) sapply(1:length(list2),
    function(i2) is.subset(list1[[i1]], list2[[i2]]) ) )
      [,1] [,2] [,3]
[1,]  TRUE TRUE TRUE
[2,] FALSE TRUE TRUE
[3,] FALSE TRUE TRUE
Then, unsurprisingly, none of the list2 items (all of length > 1) is a subset of any list1 item (all of length 1):
> sapply(1:length(list1), function(i1) sapply(1:length(list2),
    function(i2) is.subset(list2[[i2]], list1[[i1]]) ) )
      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,] FALSE FALSE FALSE
[3,] FALSE FALSE FALSE
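In the nested sapply() results, each inner call fills one column, so rows follow list2 and columns follow list1. As a small sketch of how the orientation could be made explicit, the matrix can be given dimnames. The labels `list1_1`, `list2_1`, and so on are made up for illustration.

```r
# Data from the question and the helper from this answer
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)
is.subset <- function(x, y) length(setdiff(x, y)) == 0

# seq_along() is a safer spelling of 1:length() when a list may be empty
m <- sapply(seq_along(list1), function(i1)
  sapply(seq_along(list2), function(i2)
    is.subset(list1[[i1]], list2[[i2]])))

# Rows correspond to list2 items, columns to list1 items
dimnames(m) <- list(paste0("list2_", seq_along(list2)),
                    paste0("list1_", seq_along(list1)))

# "Is list1[[1]] a subset of list2[[2]]?"
m["list2_2", "list1_1"]
```

Transposing with `t(m)` gives the row-per-list1 layout that outer(list1, list2, ...) would produce.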
Answer 4:
Adding to @Michael's answer, here's a neat way to avoid the messiness of expand.grid by using the AsIs function I():
list2 <- list(1:3,2:3,2:3)
a <- data.frame(list1 = 1:3, I(list2))
a$subset <- apply(a, 1, function(.v) all(.v[[1]] %in% .v[[2]]) )
  list1   list2 subset
1     1 1, 2, 3   TRUE
2     2    2, 3   TRUE
3     3    2, 3   TRUE
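Note that this data frame pairs the elements by position (row i compares the i-th value with list2[[i]]) rather than forming all combinations. As a rough sketch of the same position-wise check without building a data frame, mapply() can walk both lists in parallel. Using mapply() here is a substitution for illustration, not something this answer itself proposes.

```r
# Data from the question
list1 <- list(1, 2, 3)
list2 <- list(1:3, 2:3, 2:3)

# Compare list1[[i]] with list2[[i]] for each i; one logical per position
pairwise <- mapply(function(x, y) all(x %in% y), list1, list2)
pairwise
```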
Source: https://stackoverflow.com/questions/14410050/identify-which-objects-of-list-are-contained-subset-of-in-another-list-in-r