biopython

Reverse complement of DNA strand using Python

懵懂的女人 submitted on 2020-01-10 19:44:08
Question: I have a DNA sequence and would like to get its reverse complement using Python. The sequence is in one column of a CSV file, and I'd like to write the reverse complement to another column in the same file. The tricky part is that a few cells contain something other than A, T, G and C. I was able to get the reverse complement with this piece of code: def complement(seq): complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'} bases = list(seq) bases = [complement[base] for base in bases] return '
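The dictionary lookup in the excerpt raises a KeyError as soon as a cell contains a character outside A, C, G and T. A minimal sketch of one way around that, using a translation table, follows. The choice to pass unknown characters through unchanged is an assumption (the question does not say how such cells should be treated), and the name `reverse_complement` is illustrative only.

```python
# Translation table for the four bases, in upper and lower case.
# Characters not listed here (N, gaps, stray text) are left unchanged,
# which is an assumption about how odd cells should be handled.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement, leaving non-ACGT characters as they are."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
print(reverse_complement("ATNGC"))
```

Biopython's `Seq` objects also offer a `reverse_complement()` method; the plain-string version above is only meant to show how cells with unexpected characters can be tolerated.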

return outside function [closed]

北城余情 submitted on 2020-01-01 03:57:06
Question: It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 8 years ago. Hi, I am getting the following error in Biopython: 'return' outside function (filename.. line 26). Below is the code of my file. Please help. # File Name RandonProteinSequences.py # standard library import os import
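Python reports this error when a `return` statement sits at module level, typically because the body of a `def` lost its indentation. The excerpt is cut off before the offending line, so the following is only a generic sketch using made-up source strings, not the asker's file.

```python
# Hypothetical source: 'return' is at module level, outside any function.
bad_source = "x = 1\nreturn x\n"

# The same statement, indented under a def, is accepted.
good_source = "def get_x():\n    x = 1\n    return x\n"


def syntax_error_message(source):
    """Compile source and return the SyntaxError message, or None if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as err:
        return err.msg
    return None


print(syntax_error_message(bad_source))
print(syntax_error_message(good_source))
```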

How to calculate the entropy of a DNA sequence in a FASTA file

孤人 submitted on 2019-12-25 07:15:22
Question: I need to calculate the entropy of a DNA sequence in a FASTA file, from base 10,000 to base 11,000. Here is what I have so far, but I need the entropy of the sequence between the 10,000th and 11,000th bases: from math import log def logent(x): if x<=0: return 0 else: return -x*log(x) def entropy(lis): return sum([logent(elem) for elem in lis]) for i in SeqIO.parse("hsvs.fasta", "fasta"): lisfreq1=[i.seq.count(base)*1.0/len(i.seq) for base in ["A", "C","G","T"]] entropy(lisfreq1) Answer 1:
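The code in the question computes base frequencies over the whole record. One way to restrict it to a window is to slice the sequence before counting. The sketch below works on a plain string (for a Biopython record one would pass `str(record.seq)`), treats the positions as 1-based and inclusive (an assumption about what the asker means), and uses the natural logarithm as in the question. The function names are illustrative.

```python
from collections import Counter
from math import log


def shannon_entropy(seq, alphabet="ACGT"):
    """Shannon entropy (natural log) of the base frequencies in seq."""
    counts = Counter(seq.upper())
    # Only bases in the alphabet contribute; other characters are ignored.
    total = sum(counts[base] for base in alphabet)
    if total == 0:
        return 0.0
    entropy = 0.0
    for base in alphabet:
        p = counts[base] / total
        if p > 0:
            entropy -= p * log(p)
    return entropy


def window_entropy(seq, start, end):
    """Entropy of the bases from position start to end (1-based, inclusive)."""
    return shannon_entropy(seq[start - 1:end])


# For the question's window this would be window_entropy(sequence, 10000, 11000).
print(window_entropy("AAAAACGT", 5, 8))
```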

How to select only certain Substrings

喜欢而已 submitted on 2019-12-24 18:12:45
Question: From a string, say dna = 'ATAGGGATAGGGAGAGAGCGATCGAGCTAG', I got substrings, say dna.format = 'ATAGGGATAG','GGGAGAGAG'. I only want to print the substrings whose length is divisible by 3. How can I do that? I'm using modulo but it's not working! import re if mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG' print re.findall("ATA"(.*?)"AGA" , mydna) if len(mydna)%3 == 0 print mydna Corrected code: import re mydna = 'ATAGGGATAGGGAGAGAGCAGATCGAGCTAG' re.findall("ATA"(.*?)"AGA" , mydna.format) if len(mydna.format)%3 ==
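In both attempts the length test is applied to the whole string rather than to each match, and the pattern is split into separate quoted pieces, which is not valid Python. A minimal sketch of the apparent intent, assuming the goal is the text between "ATA" and the next "AGA", filters the list returned by `re.findall`. The function name is illustrative.

```python
import re

# The whole pattern is one string; the group captures the text between the markers.
PATTERN = re.compile(r"ATA(.*?)AGA")


def in_frame_substrings(dna):
    """Return captured substrings whose length is a multiple of 3."""
    matches = PATTERN.findall(dna)
    # Test each match individually, not the full input string.
    return [match for match in matches if len(match) % 3 == 0]


mydna = "ATAGGGATAGGGAGAGAGCAGATCGAGCTAG"
for substring in in_frame_substrings(mydna):
    print(substring)
```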

Using search terms with Biopython to return accession numbers

无人久伴 submitted on 2019-12-24 17:53:21
Question: I am trying to use Biopython (Entrez) with search terms that will return the accession number (and not the GI*). Here is a tiny excerpt of my code: from Bio import Entrez Entrez.email = 'myemailaddress' search_phrase = 'Escherichia coli[organism]) AND (complete genome[keyword])' handle = Entrez.esearch(db='nuccore', term=search_phrase, retmax=100, rettype='acc', retmode='text') result = Entrez.read(handle) handle.close() gi_numbers = result['IdList'] print(gi_numbers) '745369752', '910228862'

Storing the Output to a FASTA file

。_饼干妹妹 submitted on 2019-12-24 15:54:47
Question: from Bio import SeqIO from Bio import SeqRecord from Bio import SeqFeature for rec in SeqIO.parse("C:/Users/Siva/Downloads/sequence.gp","genbank"): if rec.features: for feature in rec.features: if feature.type =="Region": seq1 = feature.location.extract(rec).seq print(seq1) SeqIO.write(seq1,"region_AA_output1.fasta","fasta") I am trying to write the output to a FASTA file but I am getting an error. Can anybody help me? This is the error I got: Traceback (most recent call last): File "C:\Users

Sort rps-blast results by position of the hit

a 夏天 submitted on 2019-12-24 14:12:56
Question: I'm just starting with Biopython and I have a question about parsing results. I followed a tutorial to get started, and here is the code that I used: from Bio.Blast import NCBIXML for record in NCBIXML.parse(open("/Users/jcastrof/blast/pruebarpsb.xml")): if record.alignments: print "Query: %s..." % record.query[:60] for align in record.alignments: for hsp in align.hsps: print " %s HSP,e=%f, from position %i to %i" \ % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end) Part of the
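One way to order the hits by position is to collect the four values the loop already prints into a list and sort that list before printing. The sketch below assumes those values have been gathered into simple records. The `Hit` type, the function name and the sample values are made up for illustration and are not real rps-blast output.

```python
from collections import namedtuple

# Stand-in for the values read from align.hit_id, hsp.expect,
# hsp.query_start and hsp.query_end in the question's loop.
Hit = namedtuple("Hit", ["hit_id", "expect", "query_start", "query_end"])


def sort_by_position(hits):
    """Order hits by where they start on the query, then by where they end."""
    return sorted(hits, key=lambda hit: (hit.query_start, hit.query_end))


# Made-up values for illustration only.
hits = [
    Hit("hit_c", 1e-5, 300, 420),
    Hit("hit_a", 2e-10, 12, 150),
    Hit("hit_b", 0.003, 12, 90),
]

for hit in sort_by_position(hits):
    print("%s HSP, e=%g, from position %i to %i" % hit)
```

Sorting on a tuple key breaks ties on the start position by the end position; sorting by e-value instead would only require changing the key.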

Errno 13 Permission denied with Django on a directory I don't want to use

前提是你 submitted on 2019-12-24 04:16:11
Question: I have this error appearing in my Django app on my production server: [Errno 13] Permission denied: '/var/www/.config' I never asked to access this nonexistent file or directory in my code. The server is running in a different directory defined in my httpd.conf, and I haven't defined the use of any /var/www/ elements in my Django settings. In my case I'm using the biopython library with Django: from Bio import Entrez Entrez.email = "my@email" handle = Entrez.efetch("taxonomy", id="123, 1")

Transform DNA alignment into numpy array using Biopython

邮差的信 submitted on 2019-12-23 18:32:31
Question: I have several DNA sequences that have been aligned, and I would like to keep only the bases at positions that are variable. This could perhaps be done by first transforming the alignment into an array. I tried using the code in the Biopython tutorial but it gives an error. import numpy as np from Bio import AlignIO alignment = AlignIO.parse("ma-all-mito.fa", "fasta") align_array = np.array([list(rec) for rec in alignment], np.character) print("Array shape %i by %i" % align_array.shape
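A possible cause of the error is that `AlignIO.parse` yields whole alignments rather than records, whereas `AlignIO.read` returns a single alignment whose rows can be turned into lists; the excerpt is cut off before the traceback, so this is only a guess. Independently of how the rows are loaded, finding the variable positions comes down to checking each column for more than one distinct character. The sketch below does that on plain strings without numpy; the function names and the toy sequences are illustrative.

```python
def variable_columns(sequences):
    """Return the indices of alignment columns containing more than one character."""
    # zip(*sequences) walks the alignment column by column; aligned
    # sequences are assumed to have equal length.
    columns = zip(*sequences)
    return [index for index, column in enumerate(columns) if len(set(column)) > 1]


def keep_columns(sequences, indices):
    """Rebuild each sequence from the selected column indices only."""
    return ["".join(seq[i] for i in indices) for seq in sequences]


# Toy alignment for illustration only.
aligned = ["ACGT", "ACGA", "TCGT"]
variable = variable_columns(aligned)
print(variable)
print(keep_columns(aligned, variable))
```

With a numpy array of shape (rows, columns), the same idea can be expressed by comparing every row with the first one and keeping the columns where any row differs.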

Biopython pairwise alignment results in segmentation fault when run in loop

江枫思渺然 submitted on 2019-12-23 17:39:18
Question: I am trying to run the pairwise global alignment method in Biopython in a loop for about 10,000 pairs of strings. Each string is on average 20 characters long. Running the method for a single pair of sequences works fine, but running it in a loop, for as few as 4 pairs, results in a segmentation fault. How can this be solved? from Bio import pairwise2 def myTrial(source,targ): if source == targ: return [source,targ,source] alignments = pairwise2.align.globalmx(source, targ,1,-0.5) return