问题
I'm looking into a way of efficiently creating a derangement (and conversely specific permutations) of a vector in R. As far as I've seen, there's no base function that does that and also there's not much about it here on SO.
An obvious start is sample
which creates a permutation of a vector. But I need this permutation to have no fixed points, hence be a derangement of the vector. For a nice explanation of this topic, see this Cross Validated post.
This is my first approach:
derangr <- function(x){
while(TRUE){
xp <- sample(x)
if(sum(xp == x) == 0) break
}
return(xp)
}
So within a while
loop, I'm checking if there's a fixed point between a vector x
and a given permutation of x
called xp
. If there is none, I break the loop and return the vector.
As the results show, it works fine:
> derangr(1:10)
[1] 4 5 6 10 7 2 1 9 3 8
> derangr(LETTERS)
[1] "C" "O" "L" "J" "A" "I" "Y" "M" "G" "T" "S" "R" "Z" "V" "N" "K" "D" "Q" "B" "H" "F" "E" "X" "W" "U" "P"
So I'm wondering if there's a better way of doing that, potentially with substituting while
by a vectorization of some kind. I also want to keep an eye on scalability.
Here's the microbenchmark
for both examples:
library(microbenchmark)
> microbenchmark(derangr(1:10),times = 10000)
Unit: microseconds
expr min lq mean median uq max neval
derangr(1:10) 8.359 15.492 40.1807 28.3195 49.4435 6866.453 10000
> microbenchmark(derangr(LETTERS),times = 10000)
Unit: microseconds
expr min lq mean median uq max neval
derangr(LETTERS) 24.385 31.123 34.75819 32.4475 34.3225 10200.17 10000
The same question applies to the converse, producing permutations with a given number of fixed points n
:
arrangr <- function(x,n){
while(TRUE){
xp <- sample(x)
if(sum(xp == x) == n) break
}
return(xp)
}
回答1:
If you don't have only unique values, you could rearrange an index like and use it for subsetting the input vector in a new order. In this case if you have for example rep(LETTERS, 2)
the first A
and the second A
would be interchangeable. The derangr()
function proposed in the Q would also rearrange these.
derangr2 <- function(x){
ind <- seq_along(x)
while(TRUE){
indp <- sample(ind)
if(sum(indp == ind) == 0) break
}
return(x[indp])
}
Some Benchmark results:
microbenchmark(derangr(rep(LETTERS, 4)),
derangr2(rep(LETTERS, 4)), times = 1000)
# Unit: microseconds
# expr min lq mean median uq max neval
# derangr(rep(LETTERS, 4)) 6.258 113.4895 441.831094 251.724 549.384 5837.143 1000
# derangr2(rep(LETTERS, 4)) 6.542 7.3960 23.173800 12.800 22.755 4645.936 1000
However, if you face only unique values, this approach doesn't hold a lot of improvement.
microbenchmark(derangr(1:1000), derangr2(1:1000), times = 1000)
# Unit: microseconds
# expr min lq mean median uq max neval
# derangr(1:1000) 19.341 21.333 61.55154 40.959 78.0775 2770.382 1000
# derangr2(1:1000) 23.608 25.884 72.76647 46.079 84.1930 2674.243 1000
来源:https://stackoverflow.com/questions/45459623/efficiently-create-derangement-of-a-vector-in-r