Question
I want to run multiple shell processes, but when I try to run more than 63, they hang. When I reduce max_threads in the thread pool to n, it hangs after running the nth shell command.
As you can see in the code below, the problem is not in start blocks per se, but in start blocks that contain the shell command:
#!/bin/env perl6
my $*SCHEDULER = ThreadPoolScheduler.new( max_threads => 2 );

my @processes;

# The Promises generated by this loop work as expected when awaited
for @*ARGS -> $item {
    @processes.append(
        start { say "Planning on processing $item" }
    );
}

# The nth Promise generated by the following loop hangs when awaited (where n = max_threads)
for @*ARGS -> $item {
    @processes.append(
        start { shell "echo 'processing $item'" }
    );
}

await(@processes);
Running ./process_items foo bar baz gives the following output, hanging after processing bar, which is just after the nth (here 2nd) thread has run using shell:
Planning on processing foo
Planning on processing bar
Planning on processing baz
processing foo
processing bar
What am I doing wrong? Or is this a bug?
Perl 6 distributions tested on CentOS 7:
Rakudo Star 2018.06
Rakudo Star 2018.10
Rakudo Star 2019.03-RC2
Rakudo Star 2019.03
With Rakudo Star 2019.03-RC2, use v6.c versus use v6.d did not make any difference.
Answer 1:
The shell and run subs use Proc, which is implemented in terms of Proc::Async. This uses the thread pool internally. By filling up the pool with blocking calls to shell, the thread pool becomes exhausted, and so cannot process events, resulting in the hang.
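To illustrate (a minimal sketch, not from the original answer): starting each child process via Proc::Async directly returns a Promise without parking a pool thread for the lifetime of the process, so even a two-thread pool does not hang:

```raku
#!/bin/env perl6
# Each Proc::Async.start returns a Promise that is kept when the
# child process exits; no pool thread blocks while the process runs.
my @processes = @*ARGS.map: -> $item {
    Proc::Async.new('echo', "processing $item").start
};
await @processes;
```

Note this launches all the processes at once; the code later in this answer shows how to cap how many run concurrently.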
It would be far better to use Proc::Async directly for this task. The approach of using shell and a load of real threads won't scale well; every OS thread has memory overhead, GC overhead, and so forth. Since spawning a bunch of child processes is not CPU-bound, this is rather wasteful; in reality, just one or two real threads are needed. So, in this case, perhaps the implementation pushing back on you when doing something inefficient isn't the worst thing.
I notice that one of the reasons for using shell and the thread pool is to try and limit the number of concurrent processes. But this isn't a very reliable way to do it; just because the current thread pool implementation sets a default maximum of 64 threads does not mean it always will do so.
Here's an example of a parallel test runner that runs up to 4 processes at once, collects their output, and envelopes it. It's a little more than you perhaps need, but it nicely illustrates the shape of the overall solution:
my $degree = 4;
my @tests = dir('t').grep(/\.t$/);
react {
    sub run-one {
        my $test = @tests.shift // return;
        my $proc = Proc::Async.new('perl6', '-Ilib', $test);
        my @output = "FILE: $test";
        whenever $proc.stdout.lines {
            push @output, "OUT: $_";
        }
        whenever $proc.stderr.lines {
            push @output, "ERR: $_";
        }
        my $finished = $proc.start;
        whenever $finished {
            push @output, "EXIT: {.exitcode}";
            say @output.join("\n");
            run-one();
        }
    }
    run-one for 1..$degree;
}
The key thing here is the call to run-one when a process ends, which means that you always replace an exited process with a new one, maintaining - so long as there are things to do - up to 4 processes running at a time. The react block naturally ends when all processes have completed, due to the fact that the number of events subscribed to drops to zero.
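As a more compact alternative (a sketch not in the original answer, using a hard-coded item list in place of the test files above), Supply.throttle can cap the number of in-flight promises:

```raku
my $degree = 4;
my @items = <foo bar baz qux quux>;
react {
    # throttle runs the block on at most $degree items at a time;
    # a Promise returned from the block counts as "in flight"
    # until it is kept, so at most $degree processes run at once.
    whenever Supply.from-list(@items).throttle($degree, -> $item {
        Proc::Async.new('echo', "processing $item").start
    }) { }
}
```

This trades the explicit run-one bookkeeping for throttle's built-in back-pressure, at the cost of less control over how each process's output is collected.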
Source: https://stackoverflow.com/questions/55265176/why-dont-all-the-shell-processes-in-my-promises-start-blocks-run-is-this-a