How to select only certain Substrings

喜欢而已 submitted on 2019-12-24 18:12:45

Question


From a string, say dna = 'ATAGGGATAGGGAGAGAGCGATCGAGCTAG', I extracted substrings, say dna.format = 'ATAGGGATAG', 'GGGAGAGAG'. I only want to print the substrings whose length is divisible by 3. How can I do that? I'm using modulo, but it's not working!

import re
if mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG'
print re.findall("ATA"(.*?)"AGA" , mydna)
if len(mydna)%3 == 0
   print mydna

Corrected code:

import re
mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG'
re.findall("ATA"(.*?)"AGA" , mydna.format)
if len(mydna.format)%3 == 0:
   print mydna.format

This still doesn't give me the substrings whose length is divisible by three. Any idea what's wrong?

I'm expecting only the substrings whose length is divisible by three to be printed.


Answer 1:


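You can also do this with the regular expression itself:

re.findall('ATA((?:...)*?)AGA', mydna)

The inner group matches three letters at a time, so the captured text always has a length divisible by 3. It is written as a non-capturing group, (?:...), because with a second capturing group re.findall returns tuples instead of plain strings.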
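
A minimal sketch of this approach, assuming Python 3 and the mydna string from the question. The inner group is written as non-capturing here so that re.findall returns plain strings; the name triplet_pattern is only illustrative.

```python
import re

mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG'

# Lazily match zero or more whole triplets between the ATA and AGA markers.
# (?:...) groups three characters without creating an extra capture group.
triplet_pattern = re.compile(r'ATA((?:...)*?)AGA')

matches = triplet_pattern.findall(mydna)
print(matches)  # ['GGGATAGGG']
```

Because the pattern can only consume three characters at a time, no separate modulo check is needed. Note that re.findall reports non-overlapping matches only.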




Answer 2:


To include overlapping substrings, I have the following lengthier version. The idea is to find every start and end marker and compute the distance between them. Each result runs from the start marker up to, but not including, the end marker.

import re
mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG'
[mydna[start.start():end.start()] for start in re.finditer('(?=ATA)',mydna) for end in re.finditer('(?=AGA)',mydna) if end.start()>start.start() and (end.start()-start.start())%3 == 0]
['ATAGGGATAGGG', 'ATAGGG']

To show all substrings, including the overlapping ones, drop the modulo check:

[mydna[start.start():end.start()] for start in re.finditer('(?=ATA)',mydna) for end in re.finditer('(?=AGA)',mydna) if end.start()>start.start()]
['ATAGGGATAGGG', 'ATAGGGATAGGGAG', 'ATAGGGATAGGGAGAGAGC', 'ATAGGG', 'ATAGGGAG', 'ATAGGGAGAGAGC']
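
The same idea can be sketched as a small function, which may be easier to read than the one-line comprehension. This assumes Python 3; the function name find_marked_substrings and its parameters are hypothetical, and each slice stops just before the end marker, which is what the outputs listed above correspond to.

```python
import re

def find_marked_substrings(dna, start_mark='ATA', end_mark='AGA', multiple=3):
    """Return every slice from a start marker up to (not including) a later
    end marker whose length is a multiple of `multiple`."""
    # Zero-width lookaheads find every marker position, even overlapping ones.
    starts = [m.start() for m in re.finditer('(?=%s)' % re.escape(start_mark), dna)]
    ends = [m.start() for m in re.finditer('(?=%s)' % re.escape(end_mark), dna)]
    return [dna[s:e]
            for s in starts
            for e in ends
            if e > s and (e - s) % multiple == 0]

mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG'
print(find_marked_substrings(mydna))              # ['ATAGGGATAGGG', 'ATAGGG']
print(len(find_marked_substrings(mydna, multiple=1)))  # 6
```

Passing multiple=1 disables the length filter, so every start/end pairing is returned.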



Answer 3:


Using modulo is the correct procedure. If it's not working, you're doing it wrong. Please provide an example of your code in order to debug it.




Answer 4:


re.findall() returns a list of the matching strings. You need to iterate over that list and apply the modulo check to the length of each string to get what you want.
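
A minimal sketch of that loop, assuming Python 3 and the pattern from the question. The helper name triplet_matches and the second test string are made up for illustration.

```python
import re

def triplet_matches(dna):
    """Return the text between ATA and AGA markers, keeping only the
    matches whose length is divisible by 3."""
    candidates = re.findall(r'ATA(.*?)AGA', dna)  # list of captured strings
    # Test the length of each match, not the length of the whole input.
    return [s for s in candidates if len(s) % 3 == 0]

mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG'
print(triplet_matches(mydna))       # ['GGGATAGGG']
print(triplet_matches('ATAGGAGA'))  # [] because 'GG' has length 2
```

The key difference from the code in the question is that the modulo is applied to each string returned by re.findall() rather than to the original sequence.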



Source: https://stackoverflow.com/questions/8390913/how-to-select-only-certain-substrings
