Reading .fasta sequences to extract nucleotide data, and then writing to a TabDelimited file

前端 未结 1 1471
耶瑟儿~
耶瑟儿~ 2021-01-15 04:12

Before I continue, I thought I\'d refer readers to my previous problems with Perl, being a beginner to all of this.

These were my posts over the past few days, in ch

相关标签:
1条回答
  • 2021-01-15 04:17

    You have an extra brace right near the end. This should work:

    #!/usr/bin/perl -w
    # This script reads several sequences and computes the relative content of G+C of each sequence.
    
    use strict; 
    
    my $infile = "Lab1_seq.fasta";                               # This is the file path
    open INFILE, $infile or die "Can't open $infile: $!";        # This opens file, but if file isn't there it mentions this will not open
    my $outfile = "Lab1_SeqOutput.txt";             # This is the file's output
    open OUTFILE, ">$outfile" or die "Cannot open $outfile: $!"; # This opens the output file, otherwise it mentions this will not open
    
    my $sequence = ();  # This sequence variable stores the sequences from the .fasta file
    my $GC = 0;         # This variable checks for G + C content
    
    my $line;                             # This reads the input file one-line-at-a-time
    
    while ($line = <INFILE>) {
        chomp $line;                      # This removes "\n" at the end of each line (this is invisible)
    
        if($line =~ /^\s*$/) {         # This finds lines with whitespaces from the beginning to the ending of the sequence. Removes blank line.
            next;
    
        } elsif($line =~ /^\s*#/) {        # This finds lines with spaces before the hash character. Removes .fasta comment
            next; 
        } elsif($line =~ /^>/) {           # This finds lines with the '>' symbol at beginning of label. Removes .fasta label
            next;
        } else {
            $sequence = $line;
        }
    
        $sequence =~ s/\s//g;               # Whitespace characters are removed
        print OUTFILE $sequence;
    }
    

    Also I edited your return line. Return will exit your loop. I suspect what you want is to print it to a file, so I have done that. You may need to do some further transformation first to get it into a tab separated format.

    0 讨论(0)
提交回复
热议问题