Search for motifs with degenerate positions

Submitted by 元气小坏坏 on 2019-12-12 02:53:36

Question


I have a 15-mer nucleotide motif that contains degenerate (ambiguous) nucleotide codes. Example: ATNTTRTCNGGHGCN.

I would like to search a set of sequences for occurrences of this motif. However, the sequences to be searched are exact, i.e. they contain no ambiguity codes.

I have tried looping over the sequences to search for the motif, but I have not been able to do non-exact searches. The code I use is modeled after the code in the Biopython cookbook.

# m is a Bio.motifs motif; this search only reports exact matches to its instances
for pos, seq in m.instances.search(test_seq):
    print(pos, seq)

I would like to search for all possible exact instances of the non-exact 15-mer. Is there a function available, or would I have to resort to defining my own function for that? (I'm okay doing the latter, I just wanted to triple-check with the world that I'm not duplicating someone else's efforts before I go ahead. I have already browsed through what I thought were the relevant parts of the docs.)


Answer 1:


Use Biopython's nt_search (in Bio.SeqUtils). It looks for a subsequence in a DNA sequence, expanding each ambiguity code to the nucleotides allowed at that position. Example:

>>> from Bio import SeqUtils
>>> pat = "ATNTTRTCNGGHGCN"
>>> SeqUtils.nt_search("CCCCCCCATCTTGTCAGGCGCTCCCCCC", pat)
['AT[GATC]TT[AG]TC[GATC]GG[ACT]GC[GATC]', 7]

It returns a list whose first item is the search pattern with the ambiguity codes expanded, followed by the 0-based start positions of the matches.
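If Biopython is not available, or if the matched 15-mers themselves are wanted rather than only their positions, the same idea can be sketched with the standard library alone. This is a minimal illustration, not Biopython's implementation: the helper names degenerate_to_regex and find_instances are hypothetical, and the table assumes the standard IUPAC DNA ambiguity codes. A lookahead is used so that overlapping hits are reported, which a plain re.finditer would skip.

```python
import re

# Standard IUPAC DNA ambiguity codes mapped to the bases they stand for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degenerate_to_regex(motif):
    """Build a regex with one character class per ambiguous position (hypothetical helper)."""
    parts = []
    for code in motif.upper():
        bases = IUPAC_DNA[code]  # raises KeyError on an unknown symbol
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


def find_instances(motif, sequence):
    """Return (0-based position, matched text) for every hit, overlapping ones included."""
    # Wrapping the pattern in a lookahead makes each match zero-width,
    # so the scan advances one base at a time and overlapping hits are kept.
    pattern = re.compile("(?=(" + degenerate_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    motif = "ATNTTRTCNGGHGCN"
    # Made-up test sequences; the first is the one used in the answer above.
    sequences = {
        "seq1": "CCCCCCCATCTTGTCAGGCGCTCCCCCC",
        "seq2": "GGGGGGGGGGGG",
    }
    for name, seq in sequences.items():
        for pos, hit in find_instances(motif, seq):
            print(name, pos, hit)  # prints: seq1 7 ATCTTGTCAGGCGCT
```

The position printed for seq1 agrees with the 7 reported by nt_search, and the second field is the exact instance of the degenerate motif found at that position.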



Source: https://stackoverflow.com/questions/18522093/search-for-motifs-with-degenerate-positions
