Question
My multi-FASTA file is in this format:
>miRNA65 dvex2345
CGATGCTAGATGCTATGACAACGATGCCTCG-G
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
I am new to Perl. I need to find the '>' header lines that are identical and concatenate the sequence lines that follow them, so that each header appears once with its sequences joined.
I'm expecting the following output for the above file:
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG.CGATGCTAGATGCTATGACAACGATGCCTCG-G
What is the best way to get this done?
Answer 1:
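use strict;
use warnings;

my %hash;
while (<DATA>) {
    if (/^>(miRNA\d+)/) {
        # Key on the miRNA id; keep the full header line (newline included) for printing.
        $hash{$1}[0] = $_;
        # The line after a header is its sequence.
        chomp(my $n = <DATA>);
        # unshift puts the most recently read sequence first.
        unshift @{ $hash{$1}[1] }, $n;
    }
}
for my $k (sort keys %hash) {
    # Sequences are joined with ','; use '.' to get the separator shown in the question.
    print $hash{$k}[0], join(',', @{ $hash{$k}[1] }), "\n";
}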
__DATA__
>miRNA65 dvex2345
CGATGCTAGATGCTATGACAACGATGCCTCG-G
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
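
The answer above assumes every sequence fits on a single line and keys records on the miRNA id alone. As a minimal sketch of a variant, the code below keys on the whole header line, tolerates sequences wrapped over several lines, and joins with '.' as in the question's expected output. The sub name merge_fasta and its $sep parameter are hypothetical names chosen for illustration, and the newest-first ordering is an assumption taken from the expected output and the answer's use of unshift.

```perl
use strict;
use warnings;

# Hypothetical helper: merge records that share an identical header line.
# Returns the merged FASTA text, headers sorted, sequences joined with $sep.
sub merge_fasta {
    my ($fh, $sep) = @_;
    my (%seqs, $header);
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        if ($line =~ /^>/) {
            $header = $line;
            # Start a new record; unshift keeps the most recently read record first.
            unshift @{ $seqs{$header} }, '';
        }
        elsif (defined $header) {
            # Append to the current record, so wrapped sequence lines are rejoined.
            $seqs{$header}[0] .= $line;
        }
    }
    my $out = '';
    for my $h (sort keys %seqs) {
        $out .= "$h\n" . join($sep, @{ $seqs{$h} }) . "\n";
    }
    return $out;
}

my $input = <<'END';
>miRNA65 dvex2345
CGATGCTAGATGCTATGACAACGATGCCTCG-G
>miRNA60 dvex1234
T-TAA-ACTCATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
>miRNA65 dvex2345
T-TAA-ACTTATCATCATCATACTCATCATCATCATCAGCATATTAACAAG
END

# Read from an in-memory filehandle here; a real file opened with open() works the same way.
open my $fh, '<', \$input or die "Cannot open in-memory input: $!";
print merge_fasta($fh, '.');
close $fh;
```

To process a file on disk, open it with a three-argument open and pass that filehandle to merge_fasta instead of the in-memory one.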
Source: https://stackoverflow.com/questions/11821962/i-need-search-a-pattern-in-a-header-line-of-my-file-and-concatenates-the-next-li