How to concisely deal with subsets when their lengths become zero?

被刻印的时光 ゝ submitted on 2020-06-27 16:53:13

Question


To exclude elements from a vector x,

x <- c(1, 4, 3, 2)

we can index with a negated vector of positions:

excl <- c(2, 3)
x[-excl]
# [1] 1 2

This also works dynamically,

(excl <- which(x[-which.max(x)] > quantile(x, .25)))
# [1] 2 3
x[-excl]
# [1] 1 2

until excl is of length zero:

excl.nolength <- which(x[-which.max(x)] > quantile(x, .95))
length(excl.nolength)
# [1] 0
x[-excl.nolength]
# numeric(0)
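
The reason is that negating a zero-length vector leaves it zero-length, so the call above is the same as indexing with an empty positive index, which selects nothing. A minimal check (the name empty is only for illustration):

```r
x <- c(1, 4, 3, 2)
empty <- integer(0)

# Negating a zero-length vector still gives a zero-length vector,
# so R sees an empty index rather than "exclude nothing".
identical(-empty, empty)
# [1] TRUE
length(x[-empty])
# [1] 0
length(x[empty])
# [1] 0
```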

I could rewrite that particular expression, but excl is applied to many objects, for example:

letters[1:4][-excl.nolength]
# character(0)

I know I could use setdiff, but that's rather long and hard to read:

x[setdiff(seq(x), excl.nolength)]
# [1] 1 4 3 2

letters[1:4][setdiff(seq(letters[1:4]), excl.nolength)]
# [1] "a" "b" "c" "d"

Now, I could exploit the fact that nothing is excluded if the element number is greater than the number of elements:

length(x)
# [1] 4
x[-5]
# [1] 1 4 3 2
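
The same appears to hold when an out-of-range position is mixed with valid ones: only the positions that exist are dropped. A small check, reusing x from above:

```r
x <- c(1, 4, 3, 2)

# Position 2 exists and is dropped; position 5 does not exist and is ignored.
x[-c(2, 5)]
# [1] 1 3 2
```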

To generalize that I should probably use .Machine$integer.max:

tmp <- which(x[-which.max(x)] > quantile(x, .95))
excl <- if (!length(tmp) == 0) tmp else .Machine$integer.max
x[-excl]
# [1] 1 4 3 2

Wrapped into a function,

e <- function(x) if (!length(x) == 0) x else .Machine$integer.max

that's quite handy and clear:

x[-e(excl)]
# [1] 1 2

x[-e(excl.nolength)]
# [1] 1 4 3 2

letters[1:4][-e(excl.nolength)]
# [1] "a" "b" "c" "d"

But it seems a little fishy to me...

Is there a better equally concise way to deal with a subset of length zero in base R?

Edit

excl is the dynamic result of an earlier function call (as shown with which above) and may or may not be of length zero. If length(excl) == 0, nothing should be excluded. Ideally, subsequent lines of code such as x[-excl] should not have to change at all, or as little as possible.


Answer 1:


You can overwrite [ with your own function.

"["  <- function(x,y) {if(length(y)==0) x else .Primitive("[")(x,y)}

x <- c(1, 4, 3, 2)
excl <- c(2, 3)
x[-excl]
#[1] 1 2
excl <- integer()
x[-excl]
#[1] 1 4 3 2

rm("[") #Go back to normal mode
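
One caveat: this two-argument override accepts only a single index, so something like m[1, 2] on a matrix would fail while it is active. A possible way to limit the risk, sketched here as an assumption rather than part of the answer, is to define the override inside local() so it never reaches the global environment:

```r
x <- c(1, 4, 3, 2)
excl <- integer(0)

res <- local({
  # Same override as above, but visible only inside this block.
  `[` <- function(x, y) if (length(y) == 0) x else .Primitive("[")(x, y)
  x[-excl]
})
res
# [1] 1 4 3 2

# Outside the block, the built-in behaviour is unchanged.
length(x[-excl])
# [1] 0
```

With this layout there is nothing to clean up afterwards, since the override disappears with the block.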



Answer 2:


I would argue this is somewhat opinion-based.

For example, I find:

x <- x[-if(length(excl <- which(x[-which.max(x)] > quantile(x, .95))) == 0) .Machine$integer.max else excl]

very unreadable, but some people like one-liners. In package code you will often find this split up instead, along the lines of one of your own suggestions:

excl <- which(x[-which.max(x)] > quantile(x, .95))
if(length(excl) != 0)
    x <- x[-excl]

Alternatively, you could avoid which and simply use the logical vector for subsetting, which most people would likely consider cleaner:

x <- x[!x[-which.max(x)] > quantile(x, .95)]

This avoids the zero-length index problem, at the cost of some efficiency.
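
To see why a logical index is safe here, consider a condition that matches nothing: it is a full-length vector of FALSE, not a zero-length vector, so negating it keeps every element. A minimal sketch with made-up thresholds (2.5 and 10 are illustrative, not from the question):

```r
x <- c(1, 4, 3, 2)

drop_some <- x > 2.5   # FALSE TRUE TRUE FALSE
drop_none <- x > 10    # all FALSE, but still length 4

x[!drop_some]
# [1] 1 2
x[!drop_none]
# [1] 1 4 3 2
```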

As a side note, the example used above and in the question seems somewhat off. First, which.max returns only the first index equal to the maximum value; in addition, the indices are offset for every value removed. More likely the intended example would be:

x <- x[!(x > quantile(x, .95))[-which(x == max(x))]]



Answer 3:


How about this?

a <- letters[1:3]
excl1 <- c(1,3)
excl2 <- c()

a[!(seq_along(a) %in% excl1)]
a[!(seq_along(a) %in% excl2)]
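
If this pattern is needed in many places, it could be wrapped in a small helper. The sketch below assumes a hypothetical name, drop_at, which is not a base R function; it shows that an empty exclusion vector (either c() or integer(0)) leaves the input untouched:

```r
# Hypothetical helper: drop the positions in `excl`, or nothing if it is empty.
drop_at <- function(v, excl) v[!(seq_along(v) %in% excl)]

a <- letters[1:3]
drop_at(a, c(1, 3))
# [1] "b"
drop_at(a, c())
# [1] "a" "b" "c"
drop_at(a, integer(0))
# [1] "a" "b" "c"
```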


Source: https://stackoverflow.com/questions/59406950/how-to-concisely-deal-with-subsets-when-their-lengths-become-zero
