bioperl

Parsing GenBank file

生来就可爱ヽ(ⅴ<●) submitted on 2020-01-15 10:58:05
Question: Basically, a GenBank file consists of gene entries (each announced by 'gene' and followed by its corresponding 'CDS' entry, only one per gene), like the two I show below. I would like to get locus_tag vs. product in a tab-delimited, two-column file. 'gene' and 'CDS' are always preceded and followed by spaces. If this task can easily be performed using an existing tool, please let me know. Input file: gene complement(8972..9094) /locus_tag="HAPS_0004" /db_xref="GeneID:7278619" CDS complement
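One possible way to approach the locus_tag/product extraction described in this question is sketched below. It is plain Python using only the standard library, not BioPerl, and it rests on assumptions the truncated question does not confirm: that the file follows the usual GenBank feature-table layout (feature key after five leading spaces, qualifiers on indented lines starting with "/"), and that each CDS carries a /product qualifier. The function names and the sample product value are hypothetical and used for illustration only.

```python
import re

# Feature key line: exactly five leading spaces, then the key, then the location.
FEATURE_KEY = re.compile(r"^ {5}(\S+) +\S")
# Only the two qualifiers of interest are captured.
QUALIFIER = re.compile(r'/(locus_tag|product)="([^"]*)"')


def _extract(block):
    """Return (locus_tag, product) from the collected lines of one CDS feature."""
    text = " ".join(block)  # re-join qualifier values wrapped over several lines
    found = dict(QUALIFIER.findall(text))
    return found.get("locus_tag", ""), found.get("product", "")


def cds_locus_products(lines):
    """Collect (locus_tag, product) pairs for every CDS feature in the given lines."""
    rows = []
    block = None  # lines of the CDS feature being collected, or None
    for line in lines:
        line = line.rstrip("\n")
        match = FEATURE_KEY.match(line)
        if match or not line.startswith(" "):
            # A new feature or a new top-level section ends the current CDS.
            if block is not None:
                rows.append(_extract(block))
            block = [] if (match and match.group(1) == "CDS") else None
        elif block is not None:
            block.append(line.strip())
    if block is not None:
        rows.append(_extract(block))
    return rows


def to_tsv(rows):
    """Format the pairs as tab-delimited, two-column text."""
    return "".join(f"{tag}\t{product}\n" for tag, product in rows)


# Illustrative input only: the locus_tag and gene location come from the question;
# the CDS location and the product value are placeholders, since the original
# entry is cut off.
SAMPLE = [
    " " * 5 + "gene".ljust(16) + "complement(8972..9094)",
    " " * 21 + '/locus_tag="HAPS_0004"',
    " " * 21 + '/db_xref="GeneID:7278619"',
    " " * 5 + "CDS".ljust(16) + "complement(8972..9094)",
    " " * 21 + '/locus_tag="HAPS_0004"',
    " " * 21 + '/product="hypothetical',
    " " * 21 + 'protein"',
]

print(to_tsv(cds_locus_products(SAMPLE)), end="")
```

For a real file, the same function can be given an open file handle instead of SAMPLE, for example cds_locus_products(open(path)), where path is whatever GenBank file is being processed. Reading qualifiers only from CDS features avoids emitting each locus_tag twice, since the matching gene feature repeats it.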

How can I download the entire GenBank file with just an accession number?

廉价感情. submitted on 2020-01-05 08:07:43
Question: I've got an array full of accession numbers, and I'm wondering if there's a way to automatically save GenBank files using BioPerl. I know you can grab sequence information, but I want the entire GenBank record. #!/usr/bin/env perl use strict; use warnings; use Bio::DB::GenBank; my @accession; open (REFINED, "./refine.txt") || die "Could not open: $!"; while(<REFINED>){ if(/^(\D+)\|(.*?)\|/){ push(@accession, $2); } } close REFINED; foreach my $number(@accession){ my $db_obj = Bio::DB::GenBank

How to run 'Build installdeps' to install missing prerequisites

让人想犯罪 __ submitted on 2019-12-23 13:56:29
Question: I am trying to run a Build.PL file and get the following, not uncommon, error message: Checking prerequisites... build_requires: ! Test::Most is not installed recommends: * HTML::TableExtract is not installed * Math::Random is not installed * YAML is not installed ERRORS/WARNINGS FOUND IN PREREQUISITES. You may wish to install the versions of the modules indicated above before proceeding with this installation Run 'Build installdeps' to install missing prerequisites. However, when I run: perl Build

Extract overlapping regions

∥☆過路亽.° submitted on 2019-12-23 05:06:16
Question: I have a file characterizing genomic regions that looks like this:
chrom  chromStart  chromEnd  PGB
chr1   12874       28371     2
chr1   15765       21765     1
chr1   15795       28371     2
chr1   18759       24759     1
chr1   28370       34961     1
chr3   233278      240325    1
chr3   239279      440831    2
chr3   356365      362365    1
Basically, PGB describes the category of the genomic region, which is characterised by its chromosome (chrom), start (chromStart) and end (chromEnd) coordinates. I wish to collapse the overlapping regions such that overlapping regions of PGB = 1

I want to replace a sequence name in a FASTA file with another name

会有一股神秘感。 submitted on 2019-12-14 03:28:25
Question: I have one FASTA file and one text file. The FASTA file contains sequences in FASTA format, and the text file contains gene names. I want to replace the sequence names in the FASTA file (the text after the '>' sign) with the gene names from the text file. I am new to Perl. I have written a script, but I don't know why it's not working. Can anyone help me with that, please? The following is my script: print"Enter annotated file..."; $f1=<STDIN>; print"Enter sequence file..."; $f2=<STDIN>; open(FILE1,$f1) || die"Can't open

How do I get gene features in FASTA nucleotide format from NCBI using Perl?

与世无争的帅哥 submitted on 2019-12-13 13:11:34
Question: I am able to manually download a FASTA file that looks like: >lcl|CR543861.1_gene_1... ATGCTTTGGACA... >lcl|CR543861.1_gene_2... GTGCGACTAAAA... by clicking "Send to" and selecting "Gene Features". FASTA Nucleotide is the only option on this page (which is fine, because that's all I want). With a script like this: #!/usr/bin/env perl use strict; use warnings; use Bio::DB::EUtilities; my $factory = Bio::DB::EUtilities->new(-eutil => 'efetch', -db => 'nucleotide', -id => 'CR543861', -rettype =>

How to compare and merge multiple files?

送分小仙女□ submitted on 2019-12-13 11:25:10
Question:
reference file
chr1  288598   288656
chr1  779518   779576
chr2  2569592  2569660
chr3  5018399  5018464
chr4  5182842  5182882
file1
chr1  288598   288656   12
chr1  779518   779576   14
chr2  2569592  2569660  26
chr3  5018399  5018464  27
chr4  5182842  5182882  37
file2
chr1  288598   288656   35
chr2  2569592  2569660  348
chr3  5018399  5018464  4326
chr4  5182842  5182882  68
I have six similar files, not counting the reference file. In each of them, the first three fields correspond to those of the reference file. Therefore, I would like to export only the 4th

How do I install the latest BioPerl version when using perlbrew?

烂漫一生 submitted on 2019-12-11 11:45:40
Question: I'm using perlbrew and I would like to install the latest BioPerl version. Should I use cpanm or git? If git, do I just install as usual (i.e. git clone ... then make and build), or should I do anything special? UPDATE: Specifically, I'm not sure I understand the following excerpt from the BioPerl Using Git manual: Tell perl where to find BioPerl (assuming you checked out the code in $HOME/src; set this in your .bash_profile, .profile, or .cshrc): bash: $ export PERL5LIB="$HOME/src/bioperl-live:

Remove Perl modules from CPAN on Mac

白昼怎懂夜的黑 submitted on 2019-12-08 09:47:55
Question: As far as I know, on a Mac it is required to run CPAN with sudo (sudo perl -MCPAN -e shell) to install new modules. Theoretically, a module can be removed by deleting it from the Perl folders. My question is: where are Perl modules put when installed from CPAN with 'sudo' and without 'sudo'? I installed BioPerl both ways and it seemed to work. Did I mess anything up by installing it both with and without sudo? Thank you for a little help in the confusing Perl world. Answer 1: You can see where a module got

Installing Bio::DB::Sam perl module

╄→尐↘猪︶ㄣ submitted on 2019-12-01 05:15:12
I am trying to install the Perl module Bio::DB::Sam in my home directory on a remote server. I downloaded the module, extracted the files, and ran: perl Build.pl prefix=~/local This is what happens next: This module requires samtools 0.1.10 or higher (samtools.sourceforge.net). Please enter the location of the bam.h and compiled libbam.a files: **/some_places/samtools-0.1.19** Found /some_places/samtools-0.1.19/bam.h and /some_places/samtools-0.1.19/libbam.a. Created MYMETA.yml and MYMETA.json Creating new 'Build' script for 'Bio-SamTools' version '1.39' Next, when I try to run: ./Build this is