I have an interesting (only for me, perhaps, :)) question. I have text like:
"abbba"
The question is to find all possible substrings of
We may use
x <- "abbba"
allsubstr <- function(x, n) unique(substring(x, 1:(nchar(x) - n + 1), n:nchar(x)))
allsubstr(x, 2)
# [1] "ab" "bb" "ba"
allsubstr(x, 3)
# [1] "abb" "bbb" "bba"
where substring extracts a substring from x starting and ending at specified positions. We exploit the fact that substring is vectorized and pass 1:(nchar(x) - n + 1) as starting positions and n:nchar(x) as ending positions.
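To make the vectorization concrete, here is a small sketch that builds the start and end vectors for n = 2 and pairs them up. The names starts, ends, pieces and allsubstr_safe are introduced only for illustration and are not part of the answer above; the guarded variant is one possible way to handle an n outside 1..nchar(x), where 1:(nchar(x) - n + 1) would count down to c(1, 0) instead of being empty.

```r
x <- "abbba"
n <- 2

# Start positions: every index at which a window of length n still fits.
starts <- 1:(nchar(x) - n + 1)   # 1 2 3 4
# End positions: each start shifted by n - 1.
ends <- n:nchar(x)               # 2 3 4 5

# substring() works element-wise over both vectors: one piece per (start, end) pair.
pieces <- substring(x, starts, ends)
pieces
# [1] "ab" "bb" "bb" "ba"

# unique() then drops the repeated "bb".
unique(pieces)
# [1] "ab" "bb" "ba"

# Hypothetical guarded variant (an assumption, not from the original answer):
# seq_len() yields an empty vector rather than counting downward.
allsubstr_safe <- function(x, n) {
  len <- nchar(x)
  if (n < 1 || n > len) return(character(0))
  starts <- seq_len(len - n + 1)
  unique(substring(x, starts, starts + n - 1))
}

allsubstr_safe(x, 3)
# [1] "abb" "bbb" "bba"
```

For valid n the guarded version gives the same result as allsubstr; it only differs when n is out of range.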
With combn, all combinations of the vector's elements are arranged by column. Splitting the string into single characters first and transposing the result gives a matrix with one combination per row. The columns can then be pasted together by calling do.call(paste0, ...) on the matrix converted to a data frame:
mat <- unique(t(combn(strsplit(x, "")[[1]],2)))
do.call(paste0, as.data.frame(mat))
#[1] "ab" "aa" "bb" "ba"
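The intermediate steps of that one-liner can be inspected separately. This is only an unrolled sketch of the same calls; the names chars and pairs are introduced here for illustration.

```r
x <- "abbba"
chars <- strsplit(x, "")[[1]]   # "a" "b" "b" "b" "a"

# combn() returns one combination per column: 2 rows, choose(5, 2) = 10 columns.
pairs <- combn(chars, 2)
dim(pairs)
# [1]  2 10

# Transpose so each row is a combination, then drop duplicate rows.
mat <- unique(t(pairs))
dim(mat)
# [1] 4 2

# Each data frame column becomes one argument to paste0(),
# so the characters are pasted together row by row.
do.call(paste0, as.data.frame(mat))
# [1] "ab" "aa" "bb" "ba"
```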
Update
We can also pass a function to combn through its FUN argument so that each combination is pasted together directly, which gives a shorter syntax (@docendo):
unique(combn(strsplit(x, "")[[1]],3, FUN=paste, collapse=""))
edit
Use this solution only if you are seeking all combinations. If you are only seeking a rolling split, use Julius' answer.
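The difference between the two approaches shows up for length 3. The comparison below is a sketch that reuses the calls from both answers; the names chars, rolling and combos are introduced here for illustration.

```r
x <- "abbba"
chars <- strsplit(x, "")[[1]]

# Rolling split: contiguous windows of length 3.
rolling <- unique(substring(x, 1:(nchar(x) - 2), 3:nchar(x)))
rolling
# [1] "abb" "bbb" "bba"

# Combinations: any 3 characters kept in their original order,
# not necessarily adjacent in the string.
combos <- unique(combn(chars, 3, FUN = paste, collapse = ""))
combos
# [1] "abb" "aba" "bbb" "bba"

# "aba" skips characters of x, so it is a combination but not a substring.
setdiff(combos, rolling)
# [1] "aba"
```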