Question
I am experiencing some problems using the Bio.Restriction methods, and I am not sure whether they are due to Python, Biopython, or my poor understanding of Python.
When I try to create a RestrictionBatch following the cookbook, I want to use enzymes from a dictionary (read from files). The cookbook says:
You can initiate a restriction batch by passing it a list of enzymes or enzymes name as argument.
The Python documentation for dict.keys says:
Return a copy of the dictionary’s list of keys
So I tried this:
rb = RestrictionBatch(Enzymes.keys())
But I get an error: ValueError: <type 'list'> is not a RestrictionType
To find where the error could be, I wrote this code to check whether it really is a list:
from Bio.Seq import Seq
Enzymes = {'XhoI': Seq('CTCGAG'), 'BsmBI': Seq('CGTCTC'), 'SceI': Seq('AGTTACGCTAGGGATAACAGGGTAATATAG'), 'BamHI': Seq('GGATCC'), 'BsaI': Seq('GGTCTC'), 'SacI': Seq('GAGCTC'), 'BbsI': Seq('GAAGAC'), 'AarI': Seq('CACCTGC'), 'EcoRI': Seq('GAATTC'), 'SpeI': Seq('ACTAGT'), 'CeuI': Seq('TTCGCTACCTTAGGACCGTTATAGTTACG')}
print Enzymes.keys() is list #prints False
print isinstance(Enzymes.keys(), list) #prints True
print type(Enzymes.keys()) #prints <type 'list'>
Why this behaviour? And how can I use the dictionary to build the RestrictionBatch?
I am using:
Python 2.7.3 |EPD 7.3-2 (64-bit)| (default, Apr 11 2012, 17:52:16)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-44)] on linux2
import Bio
print(Bio.__version__)
1.59
Minor question: How can I check whether an enzyme is in the Restriction database? Is there any way to add an enzyme to this database (assuming I have the information needed)?
Answer 1:
The cookbook uses the word "list" loosely. It means a list of valid enzyme names, which are already defined when you import Bio.Restriction. You can list all of them (along with other utilities) with:
from Bio import Restriction as rst
dir(rst)
But a RestrictionType is a bit more complex than a dict of names and sequences. This is the full definition for "EcoRI":
rest_dict["EcoRI"] = {
'compsite' : '(?P<EcoRI>GAATTC)|(?P<EcoRI_as>GAATTC)',
'results' : None,
'site' : 'GAATTC',
'substrat' : 'DNA',
'fst3' : -1,
'fst5' : 1,
'freq' : 4096,
'size' : 6,
'opt_temp' : 37,
'dna' : None,
'inact_temp' : 65,
'ovhg' : -4,
'scd3' : None,
'suppl' : ('B', 'C', 'F', 'H', 'I', 'J', 'K', 'M', 'N', 'O', 'Q', 'R'),
'scd5' : None,
'charac' : (1, -1, None, None, 'GAATTC'),
'ovhgseq' : 'AATT',
}
Plus a set with the suppliers, e.g.
suppliers["B"] = (
'Invitrogen Corporation',
['MluI', 'HpaII', 'SalI', 'NcoI', 'ClaI', 'DraI', 'SstII', 'AvaI', ...])
And the typedict:
typedict["212"] = (
('NonPalindromic', 'OneCut', 'Ov5', 'Defined', 'Meth_Dep', ...),
['BssHII', 'BsrFI', 'DpnII', 'MluI', 'NgoMIV', 'HpaII', 'TspMI', ...],
)
These definitions are in Bio.Restriction.Restriction_Dictionary.
Using the code I previously posted in another answer, you can build the enzymes from these definitions:
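from functools import reduce  # builtin on Python 2, must be imported on Python 3
from Bio.Restriction import Restriction as rst
from Bio.Restriction.Restriction_Dictionary import rest_dict, typedict

def create_enzyme(name):
    # Find the tuple of type names whose enzyme list contains this name
    e_types = [x for t, (x, y) in typedict.items() if name in y][0]
    # Resolve the type names to the classes defined in Restriction
    enzyme_types = tuple(getattr(rst, x) for x in e_types)
    # RestrictionType is a metaclass: calling it creates the enzyme class
    return rst.RestrictionType(name, enzyme_types, rest_dict[name])

enzyme_list = ["EcoRI", "MstI"]
# Adding enzymes together yields a RestrictionBatch
rb = reduce(lambda x, y: x + y, map(create_enzyme, enzyme_list))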
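Note that create_enzyme indexes the search result with [0], so a name that is missing from typedict raises an IndexError. Here is a minimal sketch of a more defensive lookup, assuming typedict keeps the layout shown earlier (each value is a pair of type names and enzyme names). The helper find_type_names and the toy_typedict data are hypothetical and only for illustration:

```python
def find_type_names(name, typedict):
    """Return the tuple of type names for `name`, or None if it is not listed.

    `typedict` is assumed to map a key to (type_names, enzyme_names),
    as in the typedict excerpt shown earlier.
    """
    for type_names, enzyme_names in typedict.values():
        if name in enzyme_names:
            return type_names
    return None


# Toy data in the same shape (not the real Biopython table).
toy_typedict = {
    "1": (("TypeA", "TypeB"), ["EnzymeOne", "EnzymeTwo"]),
    "2": (("TypeC",), ["EnzymeThree"]),
}

found = find_type_names("EnzymeThree", toy_typedict)    # ("TypeC",)
missing = find_type_names("FakeEnzyme", toy_typedict)   # None
```

With Biopython, the same function could presumably be called with the real typedict imported from Bio.Restriction.Restriction_Dictionary, so that an unknown name gives None instead of an exception.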
When the cookbook says "by passing it a list of enzymes or enzymes name", it is simplifying things. As you can see in the source, /Bio/Restriction/Restriction.py, when a RestrictionBatch object is initialized, __init__ calls self.format, and self.format checks that every item in the "list" is an instance of RestrictionType.
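The check described above can be pictured with a simplified stand-in. This is not Biopython's source: EnzymeType and Batch are hypothetical names. The sketch only assumes that RestrictionType is a metaclass (which is why create_enzyme calls it with a name, a tuple of bases and a dict) and that the batch rejects items that are not instances of it:

```python
class EnzymeType(type):
    """Hypothetical stand-in for a metaclass such as RestrictionType."""


class Batch(set):
    """Hypothetical stand-in for a batch that validates its items."""

    def __init__(self, first=()):
        # Validate every item before storing it in the underlying set.
        set.__init__(self, [self.format(item) for item in first])

    def format(self, item):
        # Accept only classes created by the EnzymeType metaclass.
        if isinstance(item, EnzymeType):
            return item
        raise ValueError("%s is not an EnzymeType" % item.__class__)


# Calling the metaclass with (name, bases, dict) creates a new class.
EcoRI = EnzymeType("EcoRI", (object,), {"site": "GAATTC"})

ok = Batch([EcoRI])           # accepted: EcoRI is an instance of EnzymeType

try:
    Batch([["EcoRI"]])        # a list nested inside the list is rejected
    rejected = False
except ValueError:
    rejected = True
```

In this toy version, any item that is not a class built by the metaclass triggers a ValueError that names the offending item's type, which is the general shape of the message in the question.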
The answer to the minor question is:
>>> from Bio import Restriction as rst
>>> hasattr(rst, "EcoRI")
True
>>> hasattr(rst, "FakeEnzyme")
False
Or
>>> from Bio.Restriction.Restriction_Dictionary import rest_dict
>>> "EcoRI" in rest_dict.keys()
True
>>> "FakeEnzyme" in rest_dict.keys()
False
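Building on the membership checks, here is a small sketch of how the names from a dictionary could be split into those the module knows and those it does not, before building a batch. split_known and resolve are hypothetical helpers, and namespace stands for any object that exposes the enzymes as attributes (for example the Bio.Restriction module). FakeModule is a stand-in so the snippet runs without Biopython:

```python
def split_known(names, namespace):
    """Partition `names` into (known, unknown) by attribute lookup."""
    known, unknown = [], []
    for name in names:
        (known if hasattr(namespace, name) else unknown).append(name)
    return known, unknown


def resolve(names, namespace):
    """Turn names into the objects stored under those attributes."""
    return [getattr(namespace, name) for name in names]


class FakeModule(object):
    """Stand-in namespace; with Biopython this would be Bio.Restriction."""
    EcoRI = "EcoRI object"
    BamHI = "BamHI object"


known, unknown = split_known(["EcoRI", "SceI", "BamHI"], FakeModule)
enzymes = resolve(known, FakeModule)
```

With Biopython installed, the intended use would be to pass the module itself, for instance split_known(Enzymes.keys(), rst), and then hand resolve(known, rst) to RestrictionBatch. The names left in unknown would still need to be renamed or defined separately.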
Source: https://stackoverflow.com/questions/22687656/list-and-restrictiontype-from-biopython