Question
I am trying to generate N- and C-terminal slices of varying length (1, 2, 3, 4, 5, 6, 7). Before I get there, though, I am having problems just reading in my FASTA file. I was following the 'Random subsequences' tutorial from https://biopython.org/wiki/SeqIO, but in that case there is only one sequence, so maybe that is where I went wrong. The code, example sequences, and my errors are below. Any help would be much appreciated. It looks like others have come across a lot of similar problems, so I imagine I am making a basic mistake because I do not fully understand the SeqRecord structure. Thanks!
Two example sequences in my file domains.fasta:
>GA98
TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTLKDEIKTFTVTE
>GB98
TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTYKDEIKTFTVTE
My code that is not working:
from Bio import SeqIO
from Bio.SeqRecord import SeqRecord
# Load data:
domains = list(SeqIO.parse("domains.fa",'fasta'))
#set up receiving arrays
home=[]
num=1
#slice data
for i in range(0, 6):
    num = num+1
    domain = domains
    seq_n = domains.seq[0:num]
    seq_c = domains.seq[len(domain)-num:len(domain)]
    name = domains.id
    record_d = SeqRecord(domain,'%s' % (name), '', '')
    home.append(record_d)
    record_n = SeqRecord(seq_n,'%s_n_%i' % (name,num), '', '')
    home.append(record_n)
    record_c = SeqRecord(seq_c,'%s_c_%i' % (name,num), '', '')
    home.append(record_c)
SeqIO.write(home, "domains_variants.fasta", "fasta")
The error I get is:
Traceback (most recent call last):
  File "~/fasta_nc_sequences.py", line 20, in <module>
    seq_n = domains.seq[0:num]
AttributeError: 'list' object has no attribute 'SeqRecord'
When I print out 'domains = list(SeqIO.parse("domains.fa",'fasta'))' I get this:
[SeqRecord(seq=Seq('TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTLKDEIKTFTVTE', SingleLetterAlphabet()), id='GA98', name='GA98', description='GA98', dbxrefs=[]), SeqRecord(seq=Seq('TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTYKDEIKTFTVTE', SingleLetterAlphabet()), id='GB98', name='GB98', description='GB98', dbxrefs=[])]
I am not sure why I cannot access what is inside the SeqRecord. Maybe it is because I wrapped the SeqIO.parse call in a list; before I did that, I was getting a different error:
AttributeError: 'generator' object has no attribute 'seq'
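Both errors seem to come from asking a container for `.seq` rather than asking a single record. Here is a minimal sketch of that distinction, assuming a hypothetical stand-in class `FakeRecord` and a hypothetical generator `fake_parse` (neither is part of Biopython) so that it runs without a FASTA file. The sequences are truncated to the first ten residues of the examples above.

```python
from dataclasses import dataclass


@dataclass
class FakeRecord:
    """Hypothetical stand-in for a SeqRecord (illustration only)."""
    id: str
    seq: str


def fake_parse():
    """Hypothetical generator that yields records one at a time."""
    yield FakeRecord("GA98", "TTYKLILNLK")
    yield FakeRecord("GB98", "TTYKLILNLK")


records = list(fake_parse())

# A generator and a list are containers; neither has a .seq attribute.
generator_has_seq = hasattr(fake_parse(), "seq")
list_has_seq = hasattr(records, "seq")

# Individual elements do have .seq, reached by indexing or by looping.
first_seq = records[0].seq
ids = [record.id for record in records]
```

With a real file, the same idea would mean writing `domains[0].seq` or looping with `for record in domains:` instead of `domains.seq`.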
Answer 1:
I was working one level too low in my for loop, so I was not iterating through the sequences. There were also problems accessing the C-terminal sequence: `len(domain)` measured the list of records rather than the sequence itself. Now the code works.
from Bio import SeqIO
from Bio.SeqRecord import SeqRecord

# Load data:
domains = list(SeqIO.parse("examples/data/domains.fa", 'fasta'))
# Set up the receiving list
home = []
# Subset data: loop over each record, then over each slice length
for record in domains:
    num = 0
    domain = record.seq
    name = record.id
    # Keep the full-length domain first
    record_d = SeqRecord(domain, '%s' % (name), '', '')
    home.append(record_d)
    for i in range(0, 6):
        num = num + 1
        # First num residues (N terminus)
        seq_n = record.seq[0:num]
        # Last num residues (C terminus)
        seq_c = record.seq[len(record.seq)-num:len(record.seq)]
        record_n = SeqRecord(seq_n, '%s_n_%i' % (name, num), '', '')
        home.append(record_n)
        record_c = SeqRecord(seq_c, '%s_c_%i' % (name, num), '', '')
        home.append(record_c)
SeqIO.write(home, "domains_variants.fasta", "fasta")
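The slicing step on its own can be sketched with plain Python strings, which slice the same way `Seq` objects do. The helper name `terminal_slices` is hypothetical, not a Biopython function. Note that the question asks for lengths 1 to 7, whereas `range(0, 6)` above stops at 6, so this sketch takes the maximum length as a parameter. For 1 <= n <= len(seq), `seq[-n:]` selects the same residues as `seq[len(seq)-n:len(seq)]`.

```python
def terminal_slices(seq, max_len):
    """Return (N-terminal slices, C-terminal slices) for lengths 1..max_len."""
    lengths = range(1, max_len + 1)
    n_slices = [seq[:n] for n in lengths]   # first n residues
    c_slices = [seq[-n:] for n in lengths]  # last n residues
    return n_slices, c_slices


# GA98 from the question, as a plain string
ga98 = "TTYKLILNLKQAKEEAIKELVDAGTAEKYFKLIANAKTVEGVWTLKDEIKTFTVTE"
n_slices, c_slices = terminal_slices(ga98, 7)
```

Here `n_slices` runs from `"T"` to `"TTYKLIL"` and `c_slices` from `"E"` to `"KTFTVTE"`. Passing `record.seq` instead of a string should give `Seq` slices that can be wrapped in `SeqRecord` as in the code above.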
Source: https://stackoverflow.com/questions/60144261/attributeerror-list-object-has-no-attribute-seqrecord-while-trying-to-sli