Merging a specific number of lines
Basically, there are many commands that can do this:
pr
- convert text files for printing
pr -at16 <file
Try:
pr -a -t -16 < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
xargs
- build and execute command lines from standard input
... and executes the command (default is /bin/echo) ...
xargs -n 16 <file
Try:
xargs -n 16 < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
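One caveat worth illustrating: by default xargs splits its input on any whitespace and also interprets quotes and backslashes, so a line containing a blank counts as several items. The sketch below assumes GNU xargs, whose -d option sets the input delimiter; other implementations may not have it.

```bash
# Plain xargs splits on blanks, so the line "a b" becomes two arguments:
printf '%s\n' 'a b' c d | xargs -n 2

# Assumption: GNU xargs. With -d '\n' each input line is exactly one argument:
printf '%s\n' 'a b' c d | xargs -d '\n' -n 2
```

The first command groups a with b and c with d; the second groups the whole line "a b" with c and leaves d on its own line.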
paste
- merge lines of files
printf -v pasteargs '%*s' 16
paste -d ' ' ${pasteargs// /- } < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
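The same idea can be wrapped in a small helper so the group size becomes a parameter. This is only a sketch: pastegrp is a made-up name, and it builds the list of "-" arguments in the positional parameters instead of relying on word splitting.

```bash
# pastegrp N: join every N input lines with single spaces.
# (pastegrp is a hypothetical helper name, not a standard tool.)
pastegrp() {
    local n=$1
    set --                      # reuse the positional parameters as the argument list
    while [ "$n" -gt 0 ]; do
        set -- "$@" -           # one "-" (meaning stdin) per output column
        n=$((n - 1))
    done
    paste -d ' ' "$@"
}

seq 1 10 | pastegrp 4
```

Note that paste still emits the delimiters for missing fields, so a last, partial row ends with trailing spaces.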
sed
- stream editor for filtering and transforming text
printf -v sedstr 'N;s/\\n/ /;%.0s' {2..16};
sed -e "$sedstr" < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
awk
- pattern scanning and processing language
awk 'NR%16{printf "%s ",$0;next}1;END{if(NR%16)print ""}' < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
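For reuse, the group size can be passed in as an awk variable. The following is a sketch under that assumption (awkgrp is a made-up name); it writes the separator before each field rather than after it, and only closes the last line when a partial group is pending, so it produces neither a trailing space nor an extra line.

```bash
# awkgrp N: join every N input lines with single spaces.
# (awkgrp is a hypothetical helper name.)
awkgrp() {
    awk -v n="$1" '
        # Separator goes before every field except the first one of a group.
        { printf "%s%s", ((NR % n == 1 || n == 1) ? "" : " "), $0 }
        # A full group ends the output line.
        NR % n == 0 { print "" }
        # A trailing partial group still needs its newline.
        END { if (NR % n) print "" }
    '
}

seq 1 42 | awkgrp 16
```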
But you could also use pure bash:
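group=()
while read -r line; do
    group+=("$line")
    ((${#group[@]} > 15)) && {
        echo "${group[*]}"      # quoted, so a line like "*" is not glob-expanded
        group=()
    }
done < <(seq 1 42)
# Print the last, partial group only if there is one
((${#group[@]})) && echo "${group[*]}"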
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
or as a function:
lgrp () {
    local group=() line
    while read -r line; do
        group+=("$line")
        ((${#group[@]} >= $1)) && {
            echo "${group[*]}"
            group=()
        }
    done
    # Flush the last, partial group if any lines are left
    if ((${#group[@]})); then echo "${group[*]}"; fi
}
or
lgrp () { local g=() l;while read -r l;do g+=("$l");((${#g[@]}>=$1))&&{
echo "${g[*]}";g=();};done;if ((${#g[@]}));then echo "${g[*]}";fi;}
then
lgrp 16 < <(seq 1 42)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40 41 42
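Another pure bash variant reads a whole group per iteration with the mapfile builtin instead of one line at a time. This is a sketch that assumes bash 4 or later; lgrp2 is a made-up name, and it was not part of the timings in this answer.

```bash
# lgrp2 N: like lgrp, but reads up to N lines per loop iteration.
# (lgrp2 is a hypothetical name; mapfile requires bash >= 4.)
lgrp2() {
    local g
    # -t strips the trailing newlines, -n stops after N lines.
    # An empty array after the read means the input is exhausted.
    while mapfile -t -n "$1" g && ((${#g[@]})); do
        printf '%s\n' "${g[*]}"     # "${g[*]}" joins the elements with a space
    done
}

lgrp2 16 < <(seq 1 42)
```

Unlike read -r without IFS=, mapfile keeps leading and trailing blanks of each line as they are.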
OK, that is more than three commands, so let's do a little benchmark!
Here is how I did it:
lgrp () { local g=() l;while read -r l;do g+=("$l");((${#g[@]}>=$1))&&{
echo ${g[@]};g=();};done;[ "$g" ] && echo ${g[@]};}
export -f lgrp
printf -v sedcmd '%*s' 15
sedcmd=${sedcmd// /N;s/\\n/ /;}
export sedcmd
{
printf "%s\n" cmd cnt$'\n'time{,,,,}
for cmd in 'paste -d " " -{,,,}{,,,}' 'pr -at16' 'sed -e "$sedcmd"' \
$'awk \47NR%16{printf "%s ",$0;next;}1;END{print}\47' \
'lgrp 16' 'xargs -n 16'
do
echo ${cmd%% *}
for length in 100 1000 10000 100000 1000000
do
echo "$(bash -c "TIMEFORMAT=%R;
time $cmd < <(seq 1 $length) | wc -l" 2>&1)";
done
done
} | pr -at11
This produces, on my computer:
cmd    cnt  time   cnt  time   cnt  time   cnt   time   cnt    time
paste  7    0.002  63   0.003  625  0.002  6250  0.016  62500  0.052
pr     7    0.002  63   0.002  625  0.003  6250  0.017  62500  0.125
sed    7    0.002  63   0.010  625  0.006  6250  0.059  62500  0.457
awk    7    0.003  63   0.003  626  0.008  6251  0.049  62501  0.501
lgrp   7    0.004  63   0.027  625  0.256  6250  2.701  62500  26.84
xargs  7    0.012  63   0.049  625  0.387  6250  4.209  62500  41.75
Here is the same benchmark on my Raspberry Pi:
cmd    cnt  time   cnt  time   cnt  time   cnt   time   cnt    time
paste  7    0.104  63   0.101  625  0.130  6250  0.297  62500  2.241
pr     7    0.107  63   0.112  625  0.166  6250  0.821  62500  7.076
sed    7    0.137  63   0.145  625  0.415  6250  2.868  62500  28.06
awk    7    0.198  63   0.179  626  0.620  6251  5.426  62501  51.20
lgrp   7    0.231  63   2.715  625  15.76  6250  150.3  62500  1544.
xargs  7    0.270  63   1.845  625  16.37  6250  165.3  62500  1648.
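Fortunately, the line counts are the same for every command except awk: the END{print} in the benchmarked awk command emits one extra line whenever the input length is a multiple of 16 (626, 6251 and 62501 instead of 625, 6250 and 62500).
paste is clearly the quickest, followed by pr. The pure bash function is not slower than xargs (I'm surprised!).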