I have a reference to an array (called $intervals
) and I would like to sort the values in this array. It\'s possible that there could be a huge number of values
The second example will create a new reference:
$x = [ qw( a c b ) ]; say $x; @$x = sort @$x; say $x; $x = [sort @$x]; say $x
Reference.
Perl allows arrays to be sorted in-place with the idiom @arr = sort @arr
. Contrary to the normal behavior of the assignment operator, no copies will be made in this case. However, this optimization is limited to normal array variables; it won't work with array references:
Let's look under the hood by using the -MO=Concise
option. First, we do normal in-place sorting to see what we'd expect:
$ perl -E'say $^V'
v5.18.2
$ perl -MO=Concise -e'my @arr; @arr = sort @arr'
8 <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
3 <0> padav[@arr:1,2] vM/LVINTRO ->4
4 <;> nextstate(main 2 -e:1) v:{ ->5
- <1> ex-aassign vKS/64 ->8
- <1> ex-list lK ->-
5 <0> pushmark s ->6
7 <@> sort lK/INPLACE ->8
6 <0> padrange[@arr:1,2] l/1 ->7
- <0> padav[@arr:1,2] lRM* ->7
- <1> ex-list lK ->-
- <0> ex-pushmark s ->-
- <0> ex-padav lRM* ->-
-e syntax OK
Interesting: <@> sort lK/INPLACE ->8
, which seems to sort in place. Now let's do the same thing with references:
$ perl -MO=Concise -e'my $ref; @$ref = sort @$ref'
e <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
3 <0> padsv[$ref:1,2] vM/LVINTRO ->4
4 <;> nextstate(main 2 -e:1) v:{ ->5
d <2> aassign[t4] vKS/COMMON ->e
- <1> ex-list lK ->a
5 <0> pushmark s ->6
9 <@> sort lK ->a
6 <0> pushmark s ->7
8 <1> rv2av[t3] lK/1 ->9
7 <0> padsv[$ref:1,2] s ->8
- <1> ex-list lK ->d
a <0> pushmark s ->b
c <1> rv2av[t2] lKRM*/1 ->d
b <0> padsv[$ref:1,2] sM/DREFAV ->c
-e syntax OK
I do not see an inplace flag in <@> sort lK ->a
. So the optimization only seems to work when using the same variable, not when using the same array. But this means we can sort array references in place if we alias an array variable to the array referenced by some scalar (using Data::Alias):
perl -MData::Alias -MO=Concise -e'my $ref; alias my @arr = @$ref; @arr = sort @arr'
e <@> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
3 <0> padsv[$ref:1,3] vM/LVINTRO ->4
4 <;> nextstate(main 2 -e:1) v:{ ->5
- <1> entersub vKS/INARGS ->a
...
a <;> nextstate(main 3 -e:1) v:{ ->b
- <1> ex-aassign vKS/64 ->e
- <1> ex-list lK ->-
b <0> pushmark s ->c
d <@> sort lK/INPLACE ->e
c <0> padrange[@arr:2,3] l/1 ->d
- <0> padav[@arr:2,3] lRM* ->d
- <1> ex-list lK ->-
- <0> ex-pushmark s ->-
- <0> ex-padav lRM* ->-
-e syntax OK
… and the inplace-flag is there again <@> sort lK/INPLACE ->e
:-)
This means that Eric Strom's answer is incorrect.
Since Perl 5.8.4, the in-place sort @a = sort @a
is optimized. See the links below for details:
Performance Enhancements in perl584delta
http://perl5.git.perl.org/perl.git/commit/fe1bc4cf71e7b04d33e679798964a090d9fa7b46?f=pp_sort.c
+ /* The optimiser converts "@a = sort @a" to "sort \@a";
+ * in case of a tied @a, pessimise: push (@a) onto stack, then assign
+ * the result back to @a at the end of this function */
So you should be able to write:
@$intervals = sort by_position @$intervals
And in Perl's later than 5.8.3 you will see reduced memory usage (and preservation of aliasing for the rare times it matters).