How do I change this to “idiomatic” Perl?

名媛妹妹 2021-02-09 03:46

I am beginning to delve deeper into Perl, but am having trouble writing "Perl-ly" code instead of writing C in Perl. How can I change the following code to use more Perl idioms?

6 answers
  •  感情败类
    2021-02-09 04:36

    I would always advise looking at CPAN for existing solutions or examples of how to do things in Perl. Have you looked at Algorithm::NeedlemanWunsch?

    The documentation for this module includes an example of matching DNA sequences. Here is an example using the similarity matrix from Wikipedia.

    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use Inline::Files;                 #multiple virtual files inside code
    use Algorithm::NeedlemanWunsch;    # refer CPAN - good style guide
    
    # Read DNA sequences
    my @a = read_DNA_seq("DNA_SEQ_A");
    my @b = read_DNA_seq("DNA_SEQ_B");
    
    # Read Similarity Matrix (held as a Hash of Hashes)
    my %SM = read_Sim_Matrix();
    
    # Define scoring based on "Similarity Matrix" %SM
    sub score_sub {
        if ( !@_ ) {
            return -3;                 # gap penalty (same as Wikipedia)
        }
        return $SM{ $_[0] }{ $_[1] };    # Similarity Value matrix
    }
    
    my $matcher = Algorithm::NeedlemanWunsch->new( \&score_sub, -3 );
    my $score = $matcher->align( \@a, \@b, { align => \&check_align, } );
    
    print "\nThe maximum score is $score\n";
    
    sub check_align {
        my ( $i, $j ) = @_;              # @a[i], @b[j]
        print "seqA pos: $i, seqB pos: $j\t base '$a[$i]'\n";
    }
    
    sub read_DNA_seq {
        my $source = shift;
        my @data;
        while (<$source>) {
            push @data, /[ACGT-]/g;
        }
        return @data;
    }
    
    sub read_Sim_Matrix {
    
        # Read DNA similarity matrix (scores per Wikipedia)
        # from the __SIMILARITY_MATRIX__ virtual file below
        my ( @AoA, %HoH );
        while (<SIMILARITY_MATRIX>) {
            push @AoA, [/(\S+)/g];
        }
    
        # Row 0 and column 0 hold the base labels; 1..4 hold the scores
        for my $row ( 1 .. 4 ) {
            for my $col ( 1 .. 4 ) {
                $HoH{ $AoA[0][$col] }{ $AoA[$row][0] } = $AoA[$row][$col];
            }
        }
        return %HoH;
    }
    
    __DNA_SEQ_A__
    A T G T A G T G T A T A G T
    A C A T G C A
    __DNA_SEQ_B__
    A T G T A G T A C A T G C A
    __SIMILARITY_MATRIX__
    -  A  G  C  T
    A  10  -1  -3  -4
    G  -1  7  -5  -3
    C  -3  -5  9  0
    T  -4  -3  0  8
    

    And here is some sample output:

    seqA pos: 7, seqB pos: 2  base 'G'
    seqA pos: 6, seqB pos: 1  base 'T'
    seqA pos: 4, seqB pos: 0  base 'A'
    
    The maximum score is 100
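
As a self-contained illustration of the hash-of-hashes lookup that score_sub relies on, here is a minimal sketch that parses the same similarity matrix from a heredoc with a hash slice instead of nested index loops. It does not use Inline::Files or Algorithm::NeedlemanWunsch. The names parse_matrix and $matrix_text are made up for this example, and the hash is keyed row-first, which gives the same lookups as the code above only because this particular matrix is symmetric.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Similarity matrix copied from the example above; the first row is the header.
my $matrix_text = <<'END_MATRIX';
-  A  G  C  T
A  10  -1  -3  -4
G  -1  7  -5  -3
C  -3  -5  9  0
T  -4  -3  0  8
END_MATRIX

# parse_matrix is a hypothetical helper, not part of any CPAN module.
sub parse_matrix {
    my ($text) = @_;
    my @lines = grep { /\S/ } split /\n/, $text;      # skip blank lines
    my ( undef, @cols ) = split ' ', shift @lines;    # drop the corner cell
    my %score;
    for my $line (@lines) {
        my ( $row, @values ) = split ' ', $line;

        # Hash slice: fill one whole row in a single assignment.
        @{ $score{$row} }{@cols} = @values;
    }
    return %score;
}

my %score = parse_matrix($matrix_text);
print "A vs A: $score{A}{A}\n";    # prints "A vs A: 10"
print "G vs T: $score{G}{T}\n";    # prints "G vs T: -3"
```

Assigning through a hash slice removes the hard-coded 1..4 bounds, so the same helper would handle a matrix with more symbols, assuming every row has one value per header column.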
    
