Translating DNA to Protein

无人及你 2021-02-10 05:57

I am a biology graduate student, and over the past few months I have taught myself a very limited amount of Python to deal with some data I have. I am not asking for homework help, this

2 Answers
  • 2021-02-10 06:31

    There is one more problem in your code: when you use stop = sequencestart.find('TAA'), you ignore the open reading frame, so a TAA that straddles two codons is treated as a stop codon. In the code below I split the sequence into triplets and use itertools.takewhile to handle that, but it can be done with a loop as well:

    from itertools import takewhile

    def translate_dna(sequence, codontable, stop_codons=('TAA', 'TGA', 'TAG')):
        start = sequence.find('ATG')
        if start == -1:
            # find() returns -1 when there is no start codon; nothing to translate
            return ''

        # Take the sequence from the first start codon
        trimmed_sequence = sequence[start:]

        # Split it into triplets so that we stay in the reading frame
        codons = [trimmed_sequence[i:i+3] for i in range(0, len(trimmed_sequence), 3)]

        # Take all complete codons until the first stop codon
        coding_sequence = takewhile(lambda x: x not in stop_codons and len(x) == 3, codons)

        # Translate and join into a string
        protein_sequence = ''.join(codontable[codon] for codon in coding_sequence)

        # Append '_' to mark the stop codon; this assumes the sequence
        # always contains a stop codon in frame
        return "{0}_".format(protein_sequence)
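
    For comparison, here is a minimal sketch of the loop-based alternative mentioned above. The name translate_dna_loop and the four-entry CODON_TABLE are hypothetical and exist only for this illustration; a real codon table needs all 64 codons. Unlike the function above, this sketch does not append a stop marker, and returning an empty string when there is no ATG is an assumption.

```python
# Hypothetical, partial codon table used only for this illustration.
CODON_TABLE = {'ATG': 'M', 'GCT': 'A', 'AAA': 'K', 'TGG': 'W'}
STOP_CODONS = ('TAA', 'TGA', 'TAG')


def translate_dna_loop(sequence, codontable, stop_codons=STOP_CODONS):
    """Translate from the first ATG, reading in frame, until a stop codon."""
    start = sequence.find('ATG')
    if start == -1:
        return ''  # assumption: no start codon means no protein

    protein = []
    # Step three bases at a time from the start codon so we stay in frame;
    # stopping at len(sequence) - 2 skips any incomplete trailing codon.
    for i in range(start, len(sequence) - 2, 3):
        codon = sequence[i:i + 3]
        if codon in stop_codons:
            break
        protein.append(codontable[codon])
    return ''.join(protein)


# 'TAA' occurs at index 5, which is out of frame; the in-frame stop is 'TGA'.
example = 'ATGGCTAAATGGTGA'
print(example.find('TAA'))                       # 5
print(translate_dna_loop(example, CODON_TABLE))  # MAKW
```

    In this example str.find reports the out-of-frame TAA at index 5, while the in-frame loop reads ATG GCT AAA TGG and only stops at TGA.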
    
  • 2021-02-10 06:41

    Your problem stems from the line

    if cds[n:n+3] in codontable == True
    

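    This always evaluates to False, and thus you never append to proteinsequence. Python chains comparison operators, so the expression is read as cds[n:n+3] in codontable and codontable == True, and a dictionary never equals True. Just remove the == True portion like so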

    if cds[n:n+3] in codontable
    

    and you will get the protein sequence. Also, make sure to return proteinsequence in translate_dna().
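
    The chaining behaviour can be checked in isolation with a short sketch. The one-entry codontable and the variable names here are hypothetical and chosen only for the demonstration.

```python
# Hypothetical one-entry codon table, for illustration only.
codontable = {'ATG': 'M'}
codon = 'ATG'

# Chained form: parsed as (codon in codontable) and (codontable == True)
chained = codon in codontable == True

# Parenthesised form: compares the membership result itself to True
parenthesised = (codon in codontable) == True

# Idiomatic form: the membership test is already a boolean
plain = codon in codontable

print(chained, parenthesised, plain)  # False True True
```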
