Question
I'm new to Python, and especially to Biopython. I'm trying to fetch a record as XML with Entrez.efetch
and then parse it. Last week this script worked well:
from Bio import Entrez
handle = Entrez.efetch(db="Protein", id="YP_008872780.1", retmode="xml")
records = Entrez.read(handle)
But now I'm getting an error:
> Bio.Entrez.Parser.ValidationError: Failed to find tag 'GBSeq_xrefs' in
the DTD. To skip all tags that are not represented in the DTD, please
call Bio.Entrez.read or Bio.Entrez.parse with validate=False.
So I ran this instead:
records = Entrez.read(handle, validate=False)
But I still get an error:
TypeError: 'str' object does not support item assignment
After some research I realized that NCBI recently changed RefSeq,
which introduced new tags in the GenPept XML output.
Do I need to change something in the DTD to support these new tags?
Answer 1:
It appears that my DTD file was out of date.
A new version can be found here or here.
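To check whether a local DTD copy is the one that is out of date, it can help to list which tags in the fetched XML the DTD does not declare. The following is only a rough sketch using the standard library, not part of Biopython: the helper names declared_elements and undeclared_tags are made up for illustration, the DTD and XML strings are heavily trimmed stand-ins rather than real NCBI files, and the regex assumes plain `<!ELEMENT name ...>` declarations.

```python
import re
import xml.etree.ElementTree as ET


def declared_elements(dtd_text):
    """Return the set of element names declared with <!ELEMENT ...> in a DTD."""
    return set(re.findall(r"<!ELEMENT\s+([^\s>]+)", dtd_text))


def undeclared_tags(xml_text, dtd_text):
    """Return the sorted tags used in the XML that the DTD does not declare."""
    declared = declared_elements(dtd_text)
    used = {element.tag for element in ET.fromstring(xml_text).iter()}
    return sorted(used - declared)


# Hypothetical, trimmed stand-in for an old DTD that predates GBSeq_xrefs.
OLD_DTD = """
<!ELEMENT GBSet (GBSeq*)>
<!ELEMENT GBSeq (GBSeq_locus)>
<!ELEMENT GBSeq_locus (#PCDATA)>
"""

# Hypothetical, trimmed stand-in for a record that uses the newer tag.
NEW_XML = (
    "<GBSet><GBSeq>"
    "<GBSeq_locus>YP_008872780</GBSeq_locus>"
    "<GBSeq_xrefs/>"
    "</GBSeq></GBSet>"
)

print(undeclared_tags(NEW_XML, OLD_DTD))  # ['GBSeq_xrefs']
```

With real data, xml_text would be the raw text returned by Entrez.efetch and dtd_text the contents of the local DTD file. A non-empty result suggests the DTD is older than the XML format being served.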
Source: https://stackoverflow.com/questions/23476331/the-new-refseq-release-from-ncbi-is-compatible-with-bio-entrez-parser