问题
I'm looking to permute (or combine) c("a","b","c")
within six positions under the condition to have always sequences with alternate elements, e.g abcbab
.
Permutations could easily get with:
abc<-c("a","b","c")
permutations(n=3,r=6,v=abc,repeats.allowed=T)
I think is not possible to do that with gtools, and I've been trying to design a function for that -even though I think it may already exist.
回答1:
Since you're looking for permutations, expand.grid
can work as well as permutations
. But since you don't want like-neighbors, we can shorten the dimensionality of it considerably. I think this is legitimate random-wise!
Up front:
r <- replicate(6, seq_len(length(abc)-1), simplify=FALSE)
r[[1]] <- c(r[[1]], length(abc))
m <- t(apply(do.call(expand.grid, r), 1, cumsum) %% length(abc) + 1)
m[] <- abc[m]
dim(m)
# [1] 96 6
head(as.data.frame(cbind(m, apply(m, 1, paste, collapse = ""))))
# Var1 Var2 Var3 Var4 Var5 Var6 V7
# 1 b c a b c a bcabca
# 2 c a b c a b cabcab
# 3 a b c a b c abcabc
# 4 b a b c a b babcab
# 5 c b c a b c cbcabc
# 6 a c a b c a acabca
Walk-through:
- since you want all recycled permutations of it, we can use
gtools::permutations
, or we can useexpand.grid
... I'll use the latter, I don't know if it's much faster, but it does a short-cut I need (more later) - when dealing with constraints like this, I like to expand on the indices of the vector of values
however, since we don't want neighbors to be the same, I thought that instead of each row of values being the straight index, we
cumsum
them; by using this, we can control the ability of the cumulative sum to re-reach the same value ... by removing0
andlength(abc)
from the list of possible values, we remove the possibility of (a) never staying the same, and (b) never increasing actually one vector-length (repeating the same value); as a walk-through:head(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), n = 6) # Var1 Var2 Var3 Var4 Var5 Var6 # 1 1 1 1 1 1 1 # 2 2 1 1 1 1 1 # 3 3 1 1 1 1 1 # 4 1 2 1 1 1 1 # 5 2 2 1 1 1 1 # 6 3 2 1 1 1 1
Since the first value can be all three values, it's
1:3
, but each additional is intended to be 1 or 2 away from it.head(t(apply(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), 1, cumsum)), n = 6) # Var1 Var2 Var3 Var4 Var5 Var6 # [1,] 1 2 3 4 5 6 # [2,] 2 3 4 5 6 7 # [3,] 3 4 5 6 7 8 # [4,] 1 3 4 5 6 7 # [5,] 2 4 5 6 7 8 # [6,] 3 5 6 7 8 9
okay, that doesn't seem that useful (since it goes beyond the length of the vector), so we can invoke the modulus operator and a shift (since modulus returns 0-based, we want 1-based):
head(t(apply(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), 1, cumsum) %% 3 + 1), n = 6) # Var1 Var2 Var3 Var4 Var5 Var6 # [1,] 2 3 1 2 3 1 # [2,] 3 1 2 3 1 2 # [3,] 1 2 3 1 2 3 # [4,] 2 1 2 3 1 2 # [5,] 3 2 3 1 2 3 # [6,] 1 3 1 2 3 1
To verify this works, we can do a
diff
across each row and look for0
:m <- t(apply(expand.grid(1:3, 1:2, 1:2, 1:2, 1:2, 1:2), 1, cumsum) %% 3 + 1) any(apply(m, 1, diff) == 0) # [1] FALSE
to automate this to an arbitrary vector, we enlist the help of
replicate
to generate the list of possible vectors:r <- replicate(6, seq_len(length(abc)-1), simplify=FALSE) r[[1]] <- c(r[[1]], length(abc)) str(r) # List of 6 # $ : int [1:3] 1 2 3 # $ : int [1:2] 1 2 # $ : int [1:2] 1 2 # $ : int [1:2] 1 2 # $ : int [1:2] 1 2 # $ : int [1:2] 1 2
and then
do.call
to expand it.one you have the matrix of indices,
head(m) # Var1 Var2 Var3 Var4 Var5 Var6 # [1,] 2 3 1 2 3 1 # [2,] 3 1 2 3 1 2 # [3,] 1 2 3 1 2 3 # [4,] 2 1 2 3 1 2 # [5,] 3 2 3 1 2 3 # [6,] 1 3 1 2 3 1
and then replace each index with the vector's value:
m[] <- abc[m] head(m) # Var1 Var2 Var3 Var4 Var5 Var6 # [1,] "b" "c" "a" "b" "c" "a" # [2,] "c" "a" "b" "c" "a" "b" # [3,] "a" "b" "c" "a" "b" "c" # [4,] "b" "a" "b" "c" "a" "b" # [5,] "c" "b" "c" "a" "b" "c" # [6,] "a" "c" "a" "b" "c" "a"
and then we
cbind
the united string (viaapply
andpaste
)
Performance:
library(microbenchmark)
library(dplyr)
library(tidyr)
library(stringr)
microbenchmark(
tidy1 = {
gtools::permutations(n = 3, r = 6, v = abc, repeats.allowed = TRUE) %>%
data.frame() %>%
unite(united, sep = "", remove = FALSE) %>%
filter(!str_detect(united, "([a-c])\\1"))
},
tidy2 = {
filter(unite(data.frame(gtools::permutations(n = 3, r = 6, v = abc, repeats.allowed = TRUE)),
united, sep = "", remove = FALSE),
!str_detect(united, "([a-c])\\1"))
},
base = {
r <- replicate(6, seq_len(length(abc)-1), simplify=FALSE)
r[[1]] <- c(r[[1]], length(abc))
m <- t(apply(do.call(expand.grid, r), 1, cumsum) %% length(abc) + 1)
m[] <- abc[m]
},
times=10000
)
# Unit: microseconds
# expr min lq mean median uq max neval
# tidy1 1875.400 2028.8510 2446.751 2165.651 2456.051 12790.901 10000
# tidy2 1745.402 1875.5015 2284.700 2000.051 2278.101 50163.901 10000
# base 796.701 871.4015 1020.993 919.801 1021.801 7373.901 10000
I tried the infix (non-%>%
) tidy2 version just for kicks, and though I was confident it would theoretically be faster, I didn't realize it would shave over 7% off the run-times. (The 50163 is likely R garbage-collecting, not "real".) The price we pay for readability/maintainability.
回答2:
There are probably cleaner methods, but here ya go:
abc <- letters[1:3]
library(tidyverse)
res <- gtools::permutations(n = 3, r = 6, v = abc, repeats.allowed = TRUE) %>%
data.frame() %>%
unite(united, sep = "", remove = FALSE) %>%
filter(!str_detect(united, "([a-c])\\1"))
head(res)
united X1 X2 X3 X4 X5 X6
1 ababab a b a b a b
2 ababac a b a b a c
3 ababca a b a b c a
4 ababcb a b a b c b
5 abacab a b a c a b
6 abacac a b a c a c
If you want a vector, you can use res$united
or add %>% pull(united)
as an additional step at the end of the pipes above.
来源:https://stackoverflow.com/questions/53566191/permutations-of-3-elements-within-6-positions