Trying to return a specified number of characters from a gene sequence in R

忘了有多久 2021-01-14 07:34

I have a DNA sequence like: cgtcgctgtttgtcaaagtcg....

that is possibly 1000+ letters long.

However, I only want to look at letters 5 to 200, f

3 Answers
  • 2021-01-14 07:45

    Try

    substr("cgtcgctgtttgtcaa[...]", 5, 200)
    

    See substr().
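
    In practice the sequence usually lives in a variable rather than a literal. The following is a minimal sketch under that assumption; `dna`, `window`, and `short` are hypothetical names, and the sequence is made up for illustration. It also shows that positions are 1-based and inclusive, and that substr() quietly returns a shorter string when the sequence ends before the requested stop.

    ```r
    # Hypothetical example sequence: 16 letters repeated 70 times = 1120 characters
    dna <- paste(rep("cgtcgctgtttgtcaa", 70), collapse = "")

    # Letters 5 to 200; both ends are included
    window <- substr(dna, 5, 200)
    nchar(window)   # 200 - 5 + 1 = 196

    # A sequence shorter than the stop position is not an error:
    # substr() returns whatever exists from the start position onward
    short <- substr("cgtcgctgtt", 5, 200)
    short           # "gctgtt"
    ```

    Checking nchar() on the result is a cheap way to notice when the input was shorter than expected.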

  • 2021-01-14 07:45

    Use the substr() function:

    > tmp.string <- paste(LETTERS, collapse="")
    > tmp.string <- substr(tmp.string, 4, 10)
    > tmp.string
    [1] "DEFGHIJ"
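
    Base R also has a closely related function, substring(), which is vectorised over the start and stop positions and lets you omit the stop to read through to the end. A small sketch reusing the same alphabet string; `pieces` and `tail3` are hypothetical names:

    ```r
    tmp.string <- paste(LETTERS, collapse = "")

    # Several windows in one call: letters 1-3 and letters 4-10
    pieces <- substring(tmp.string, c(1, 4), c(3, 10))
    pieces   # "ABC" "DEFGHIJ"

    # With no stop position, substring() runs to the end of the string
    tail3 <- substring(tmp.string, 24)
    tail3    # "XYZ"
    ```

    For a single fixed window such as letters 5 to 200, substr() and substring() give the same result.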
    
  • 2021-01-14 07:59

    See also the Bioconductor package Biostrings, which is a good choice if you need to handle large biological sequences or sets of sequences.

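    # One-time install (biocLite is retired; BiocManager replaces it):
    # install.packages("BiocManager"); BiocManager::install("Biostrings")
    library(Biostrings)
    # Build a 320-letter example sequence (16 letters repeated 20 times)
    s <- paste(rep("gtcgctgtttgtcaac", 20), collapse = "")
    d <- DNAString(s)
    d[5:200]                 # DNAString holding letters 5 to 200
    as.character(d[5:200])   # the same range as a plain character string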
    