Find all possible substrings of length n

予麋鹿 2020-12-19 13:26

I have an interesting (only for me, perhaps, :)) question. I have text like:

"abbba"

The question is to find all possible substrings of length n.

2 Answers
  • 2020-12-19 13:58

    We may use

    x <- "abbba"
    allsubstr <- function(x, n) unique(substring(x, 1:(nchar(x) - n + 1), n:nchar(x)))
    allsubstr(x, 2)
    # [1] "ab" "bb" "ba"
    allsubstr(x, 3)
    # [1] "abb" "bbb" "bba"
    

    where substring extracts a substring from x starting and ending at specified positions. We exploit the fact that substring is vectorized and pass 1:(nchar(x) - n + 1) as starting positions and n:nchar(x) as ending positions.
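
    To make the vectorization concrete, the sketch below spells out the start and end vectors for n = 2. It also adds a guard, because 1:(nchar(x) - n + 1) counts downward when n exceeds nchar(x) (for example, 1:0 is c(1, 0)). The name allsubstr_safe is hypothetical and not part of the original answer.

    ```r
    x <- "abbba"
    n <- 2

    # One start position per window, and the matching end position
    starts <- seq_len(nchar(x) - n + 1)   # 1 2 3 4
    ends   <- starts + n - 1              # 2 3 4 5
    substring(x, starts, ends)
    # [1] "ab" "bb" "bb" "ba"

    # Hypothetical guarded variant: returns character(0) when no window of length n fits
    allsubstr_safe <- function(x, n) {
      len <- nchar(x)
      if (n < 1 || n > len) return(character(0))
      starts <- seq_len(len - n + 1)
      unique(substring(x, starts, starts + n - 1))
    }
    allsubstr_safe(x, 3)
    # [1] "abb" "bbb" "bba"
    allsubstr_safe(x, 6)
    # character(0)
    ```

    Whether an out-of-range n should return an empty vector or raise an error is a design choice; the empty vector is assumed here.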

  • combn arranges all combinations of a vector's elements by column. Splitting the string into single characters first and transposing the result gives a matrix with one combination per row. After converting the matrix to a data frame, the columns can be pasted together with do.call(paste0, ...):

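    x <- "abbba"
    # one row per distinct combination of 2 characters (original order kept)
    mat <- unique(t(combn(strsplit(x, "")[[1]], 2)))
    # paste the columns together row by row
    do.call(paste0, as.data.frame(mat))
    # [1] "ab" "aa" "bb" "ba"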
    

    Update

    A shorter alternative is to pass the pasting function to combn directly through its FUN argument (@docendo):

    unique(combn(strsplit(x, "")[[1]], 3, FUN = paste, collapse = ""))
    # [1] "abb" "aba" "bbb" "bba"
    

    edit

    Use this solution only if you want all combinations of characters, which need not be adjacent in the original string. If you only want contiguous (rolling) substrings, use Julius' answer.
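
    As a rough illustration of that difference, the sketch below runs both approaches on the same input for n = 2 and lists what only the combination approach returns. The variable names rolling and combos are made up for this example.

    ```r
    x <- "abbba"
    n <- 2
    chars <- strsplit(x, "")[[1]]

    # Contiguous substrings (rolling window), as in the substring-based answer
    rolling <- unique(substring(x, 1:(nchar(x) - n + 1), n:nchar(x)))

    # Order-preserving combinations of n characters, as in this answer
    combos <- unique(combn(chars, n, FUN = paste, collapse = ""))

    # Results that are combinations but not contiguous substrings
    setdiff(combos, rolling)
    # [1] "aa"
    ```

    Here "aa" is built from the first and last characters, so it never occurs as a contiguous piece of "abbba".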
