perl quick switch from quaternary to decimal

Posted by 99封情书 on 2019-12-25 19:05:30

Question


I'm representing the nucleotides A, C, G, T as 0, 1, 2, 3, and afterwards I need to convert the resulting quaternary (base-4) sequence to decimal. Is there a way to do this in Perl? I'm not sure whether pack/unpack can do it.


Answer 1:


Each base-4 digit takes exactly 2 bits, so the conversion is easy to handle efficiently.

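# Number of bits in this Perl's native unsigned integer (32 or 64).
my $uvsize = length(pack('J>', 0)) * 8;
# Map each base-4 digit to its zero-padded two-bit form: 0 => '00', ..., 3 => '11'.
my %base4to2 = map { $_ => sprintf('%02b', $_) } 0..3;

sub base4to10 {
   my ($s) = @_;
   $s =~ s/(.)/$base4to2{$1}/sg;                  # base-4 digits -> bit string
   $s = substr(("0" x $uvsize) . $s, -$uvsize);   # left-pad to the full integer width
   return unpack('J>', pack('B*', $s));           # bits -> big-endian native integer
}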

This allows inputs of up to 16 digits on builds of Perl with 32-bit integers, and up to 32 digits on builds with 64-bit integers.
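To see why two bits per digit is enough, here is a minimal sketch that expands a short base-4 string into its bit string and reads it back as a number. The helper name base4_bits is made up for this illustration and is not part of the answer's code.

```perl
use strict;
use warnings;

# Hypothetical helper: expand each base-4 digit into its two-bit form.
sub base4_bits {
    my ($s) = @_;
    (my $bits = $s) =~ s/([0-3])/sprintf('%02b', $1)/ge;
    return $bits;
}

my $bits = base4_bits('123');
print "$bits\n";              # 011011
print oct("0b$bits"), "\n";   # 27 = 1*16 + 2*4 + 3
```

Each digit contributes exactly two bits, which is where the limits of 16 and 32 digits come from.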

It's possible to support slightly larger numbers by using floating-point values: 26 digits on builds with IEEE doubles, 56 digits on builds with IEEE quads. This would require a different implementation.

Anything larger than that requires a module such as Math::BigInt so that Perl can store the result.
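As a rough sketch of the Math::BigInt route, the digits can be accumulated one at a time with the module's bmul and badd methods. The function name base4to10_big and the input check are assumptions added for this example, and the loop favors clarity over speed.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: convert a base-4 digit string of any length.
sub base4to10_big {
    my ($s) = @_;
    die "not a base-4 string: $s\n" unless $s =~ /\A[0-3]+\z/;
    my $n = Math::BigInt->new(0);
    for my $digit (split //, $s) {
        $n->bmul(4)->badd($digit);   # bmul and badd modify $n in place
    }
    return $n;
}

print base4to10_big('3' x 40), "\n";   # 40 digits, i.e. 4**40 - 1
```

The return value is a Math::BigInt object, which stringifies to the full decimal number when printed.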


Faster and simpler:

my %base4to16 = (
   '0' => '0',   '00' => '0',   '20' => '8',
   '1' => '1',   '01' => '1',   '21' => '9',
   '2' => '2',   '02' => '2',   '22' => 'A',
   '3' => '3',   '03' => '3',   '23' => 'B',
                 '10' => '4',   '30' => 'C',
                 '11' => '5',   '31' => 'D',
                 '12' => '6',   '32' => 'E',
                 '13' => '7',   '33' => 'F',
);

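sub base4to10 {
   my ($s) = @_;
   $s = "0$s" if length($s) % 2;     # pad to an even length so digits pair up from the right
   $s =~ s/(..)/$base4to16{$1}/sg;   # each pair of base-4 digits becomes one hex digit
   return hex($s);
}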



Answer 2:


I've never used it, but it looks like the Convert::BaseN module would be a good choice. Its description reads: "Convert::BaseN - encoding and decoding of base{2,4,8,16,32,64} strings".




Answer 3:


It is very simple to convert a base-4 string to decimal by processing each digit in a loop.

Note that, on a Perl built with 32-bit integers, a sequence longer than sixteen bases will not fit in a native integer.

This code shows the idea:

use strict;
use warnings;

print seq2dec('ACGTACGTACGTACGT');

sub seq2dec{
  my ($sequence) = @_;
  my $n = 0;
  for (map {index 'ACGT', $_} split //, $sequence) {
    $n = $n * 4 + $_;
  }
  return $n;
}

Output:

454761243
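One caveat with the loop above: index returns -1 for any character that is not in 'ACGT' (an ambiguity code such as N, for example), which silently corrupts the result. A minimal sketch of a stricter variant follows; the name seq2dec_strict and the choice to die on bad input are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical variant of seq2dec that rejects characters other than A, C, G, T.
sub seq2dec_strict {
    my ($sequence) = @_;
    die "unexpected character in sequence: $sequence\n"
        unless $sequence =~ /\A[ACGT]*\z/;
    (my $digits = $sequence) =~ tr/ACGT/0123/;   # nucleotides -> base-4 digits
    my $n = 0;
    $n = $n * 4 + $_ for split //, $digits;      # accumulate one digit at a time
    return $n;
}

print seq2dec_strict('ACGTACGTACGTACGT'), "\n";   # 454761243, same as above
```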


Source: https://stackoverflow.com/questions/12520462/perl-quick-switch-from-quaternary-to-decimal
