Question
I often have to filter an array of strings for the elements that contain some substring (e.g. a single character). Since this can be done either by matching a regex or with the .contains method, I decided to run a small test to confirm that .contains is faster (and therefore more appropriate).
my @array = "aa" .. "cc";
my constant $substr = 'a';
my $time1 = now;
my @a_array = @array.grep: *.contains($substr);
my $time2 = now;
@a_array = @array.grep: * ~~ /$substr/;
my $time3 = now;
my $time_contains = $time2 - $time1;
my $time_regex = $time3 - $time2;
say "contains: $time_contains sec";
say "regex: $time_regex sec";
Then I change the size of @array and the length of $substr, and compare the time each method took to filter @array. In most cases (as expected), .contains is much faster than the regex, especially if @array is large. But with a small @array (as in the code above), the regex is slightly faster.
contains: 0.0015010 sec
regex: 0.0008708 sec
Why does this happen?
Answer 1:
In an entirely unscientific experiment I just switched the regex version and the contains version around and found that the difference in performance you're measuring is not "regex vs contains" but in fact "first thing versus second thing":
When contains comes first:
contains: 0.001555 sec
regex: 0.0009051 sec
When regex comes first:
regex: 0.002055 sec
contains: 0.000326 sec
Benchmarking properly is a difficult task. It's really easy to accidentally measure something different from what you wanted to figure out.
When I want to compare the performance of multiple things, I will usually run each thing in a separate script, or maybe have a shared script but only run one of the tasks at once (for example using a multi sub MAIN("task1") approach). That way any startup work gets shared.
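A minimal sketch of that shared-script idea, reusing the array and substring from the question. The file name bench.p6 and the task names "contains" and "regex" are assumptions chosen for illustration, not something prescribed by the answer:

```perl6
# bench.p6 (hypothetical file name): only one task runs per invocation,
# so neither measurement benefits from the other having run first.
my @array = "aa" .. "cc";
my constant $substr = 'a';

# Selected with the (assumed) task name "contains" on the command line.
multi sub MAIN("contains") {
    my $start = now;
    my @result = @array.grep: *.contains($substr);
    my $elapsed = now - $start;
    say "contains: $elapsed sec";
}

# Selected with the (assumed) task name "regex" on the command line.
multi sub MAIN("regex") {
    my $start = now;
    my @result = @array.grep: * ~~ /$substr/;
    my $elapsed = now - $start;
    say "regex: $elapsed sec";
}
```

It would be run once per task, e.g. `perl6 bench.p6 contains` and then `perl6 bench.p6 regex`, so that each invocation pays the same startup cost before its timed section.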
In the #perl6 IRC channel on freenode we have a bot called benchable6 which can do benchmarks for you. Read the section "Comparing Code" on its wiki page to find out how it can compare two pieces of code for you.
Source: https://stackoverflow.com/questions/47051427/filtering-with-regex-and-contains-in-perl-6