Remove duplicated elements from list

Question

I have a list of character vectors:

my.list <- list(e1 = c("a","b","c","k"),e2 = c("b","d","e"),e3 = c("t","d","g","a","f"))

I'm looking for a function that, for any character that appears more than once across the list's vectors (within a single vector a character can appear only once), keeps only the first appearance.

The result list for this example would therefore be:

res.list <- list(e1 = c("a","b","c","k"),e2 = c("d","e"),e3 = c("t","g","f"))

Note that an entire vector may be eliminated, so the resulting list does not necessarily have the same number of elements as the input list.

Answer 1:

We can unlist the list, flag repeated values with duplicated, use relist to give the negated logical vector the same structure as 'my.list', and then subset each element of 'my.list' by its logical index.

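# flatten the list into a single character vector
un <- unlist(my.list)
# reshape the "not a duplicate" flags like my.list and subset each vector with them
res <- Map(`[`, my.list, relist(!duplicated(un), skeleton = my.list))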
identical(res, res.list)
#[1] TRUE
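
When every element of a vector has already appeared earlier, the approach above leaves an empty character(0) in the result rather than dropping it. The following is a minimal sketch of one way to remove such entries afterwards; my.list2, un2 and res2 are hypothetical names introduced only for this illustration, and lengths() assumes R 3.2.0 or later.

```r
# Hypothetical input: every element of e2 already appears in e1
my.list2 <- list(e1 = c("a", "b"), e2 = c("b", "a"), e3 = c("c", "a"))

# Same approach as above: keep only first appearances
un2 <- unlist(my.list2)
res2 <- Map(`[`, my.list2, relist(!duplicated(un2), skeleton = my.list2))

# e2 is now character(0); keep only the non-empty vectors
res2 <- res2[lengths(res2) > 0]
```

Here res2 retains only e1 and e3, so the result is shorter than the input list.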

Answer 2:

Here is an alternative using mapply with setdiff and Reduce.

# make a copy of my.list
res.list <- my.list
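# take set difference between contents of list elements and accumulated elements
# SIMPLIFY = FALSE keeps the result a list even when all differences have the same length
res.list[-1] <- mapply(setdiff, res.list[-1],
                       head(Reduce(c, my.list, accumulate = TRUE), -1),
                       SIMPLIFY = FALSE)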

The first element of the list is kept unchanged. Each subsequent element is compared with the cumulative vector of all preceding elements, which Reduce produces with c and the accumulate=TRUE argument. head(..., -1) drops the final list item, which contains all elements, so that the lengths align.

This returns

res.list
$e1
[1] "a" "b" "c" "k"

$e2
[1] "d" "e"

$e3
[1] "t" "g" "f"

Note that in Reduce, we could replace c with function(x, y) unique(c(x, y)) and accomplish the same ultimate output.
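
As a rough sketch of that variant (the names seen and res.unique are introduced here for illustration and are not part of the original answer), the accumulator below stores only the distinct elements seen so far. Map is used in place of mapply so that the result is always a list.

```r
my.list <- list(e1 = c("a", "b", "c", "k"),
                e2 = c("b", "d", "e"),
                e3 = c("t", "d", "g", "a", "f"))

# Accumulate only the distinct elements seen so far
seen <- Reduce(function(x, y) unique(c(x, y)), my.list, accumulate = TRUE)

# Compare each vector after the first with everything seen before it
res.unique <- my.list
res.unique[-1] <- Map(setdiff, my.list[-1], head(seen, -1))
```

The output should match the result shown above; the only difference is that the intermediate accumulated vectors carry no repeated values.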

Source: https://stackoverflow.com/questions/45318034/remove-duplicated-elements-from-list

Tags

list

redundancy