I need to search for a pattern in the header lines of my file and concatenate the following lines with Perl

Submitted by 主宰稳场 on 2019-12-08 08:43:11

Question


My multi-FASTA file is in this format:

>miRNA65 dvex2345
CGATGCTAGATGCTATGACAACGATGCCTCG-G
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG

I am new to Perl, and I need to find the records whose '>' header lines are identical and concatenate the lines that follow them, so that their sequences are joined under a single header.

I'm expecting the following output for the above file:

>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG.CGATGCTAGATGCTATGACAACGATGCCTCG-G

What is the best way to get this done?


Answer 1:


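use strict;
use warnings;

my %hash;    # miRNA ID => [ header line, [ sequences ] ]
while (<DATA>) {
        # A header line starts with '>' followed by the miRNA ID.
        if (/^>(miRNA\d+)/) {
                $hash{$1}[0] = $_;        # keep the header, newline included
                my $seq = <DATA>;         # the sequence is on the next line
                last unless defined $seq; # header at end of input, no sequence
                chomp $seq;
                # Prepend, so later sequences come first as in the expected output.
                unshift @{ $hash{$1}[1] }, $seq;
        }
}

# Print each header once, followed by its sequences joined with '.'.
for my $k (sort keys %hash) {
        print $hash{$k}[0], join('.', @{ $hash{$k}[1] }), "\n";
}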
__DATA__
>miRNA65 dvex2345
CGATGCTAGATGCTATGACAACGATGCCTCG-G
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG

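The code above assumes that every sequence occupies exactly one line and that the data sits in the script's own DATA section. As a minimal sketch under different assumptions (a sequence may wrap over several lines, and the whole header line is used as the grouping key), the same idea can be wrapped in a subroutine that accepts any filehandle. The name merge_fasta and the default '.' separator are illustrative choices, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: merge records that share an identical header line.
# Returns the merged records as "header\nsequence" strings, sorted by header.
sub merge_fasta {
    my ($fh, $sep) = @_;
    $sep = '.' unless defined $sep;    # assumed separator, as in the question

    my (%seqs, $header);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';           # skip blank lines
        if ($line =~ /^>/) {
            $header = $line;
            # Start a new, empty record in front of the earlier ones.
            unshift @{ $seqs{$header} }, '';
        }
        elsif (defined $header) {
            # Append (possibly wrapped) sequence lines to the current record.
            $seqs{$header}[0] .= $line;
        }
    }
    return map { "$_\n" . join($sep, @{ $seqs{$_} }) } sort keys %seqs;
}

# In-memory filehandle holding the sample data from the question.
my $input = <<'END';
>miRNA65 dvex2345
CGATGCTAGATGCTATGACAACGATGCCTCG-G
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
END
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
my @records = merge_fasta($fh);
close $fh;
print "$_\n" for @records;
```

To run it on a real file, open that file instead of the in-memory string and pass the resulting handle to merge_fasta. Because new records are prepended, the sequence that appears last in the input comes first in the merged line, which matches the order shown in the question.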

Source: https://stackoverflow.com/questions/11821962/i-need-search-a-pattern-in-a-header-line-of-my-file-and-concatenates-the-next-li
