Get response from multiple jobs in Gearman, but abort after a timeout

Question

In a nutshell: I want to have an overall timeout for a call to runTasks() in a Gearman client.

I feel like I can't be the first person to want this, but I can't find an example of how to put it together.

Here's what I want to achieve:

1. In a PHP script, use the Gearman client to start a series of jobs in parallel.
2. Each job will produce some search results, which the PHP script will need to process.
3. Some of the jobs may take some time to run, but I don't want to wait for the slowest. Instead, after N milliseconds, I want to process the results from all of the jobs that have completed, and abort or ignore those that haven't.

Requirements 1 and 2 are simple enough using the PHP GearmanClient's addTask() and runTasks() methods, but runTasks() blocks until all the submitted jobs are completed, so it doesn't meet requirement 3.

Here are some approaches I've tried so far:

- The timeout set with setTimeout() measures the time the connection has been idle, which isn't what I'm interested in.
- Using background jobs or tasks, there doesn't seem to be any way of retrieving the data returned by the worker. There are several questions already covering this.
- The custom polling loop in the example for addTaskStatus() is almost what I need, but it uses background jobs, so again I can't see any results. It also includes the vague comment "a better method would be to use event callbacks", without explaining which callbacks it means, or which part of the example they'd replace.
- The client options include a GEARMAN_CLIENT_NON_BLOCKING mode, but I don't understand how to use this, or whether a non-blocking runTasks() is any different from using addTaskBackground() instead of addTask().

I've seen suggestions that return communication could just use a different mechanism, like a shared data store, but in that case, I might as well ditch Gearman and build a custom solution with RabbitMQ.

Answer 1:

I think I've found a workable solution, although I'd still be interested in alternatives.

The key is that calling runTasks() again after an I/O timeout continues to wait for the previous synchronous tasks, so you can build a polling loop out of these parts:

- Synchronous, parallel tasks set up with addTask().
- A completion callback set with setCompleteCallback(), which tracks which tasks have finished and how many are still pending.
- A low I/O timeout set with setTimeout(), which acts as your polling frequency.
- Repeated calls to runTasks() in a loop, exiting when either all tasks are done or an overall timeout is reached. This could also have more complex exit conditions, like "after N seconds, or at least X results", etc.

The big downside is that the timeouts issue a PHP Warning, so you have to squash that with a custom error handler or the @ operator.
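If you'd rather not use the @ operator, a temporary error handler can swallow only warnings and leave other errors visible. Here is a minimal sketch, assuming a helper named runIgnoringWarnings() (a hypothetical name, not part of the Gearman extension). trigger_error() stands in for the warning that runTasks() raises on an I/O timeout, so the block runs without a job server:

```php
<?php
/**
 * Runs $operation with a temporary error handler that records warnings
 * instead of emitting them. Returns [result, list of warning messages].
 * Hypothetical helper; not part of the Gearman extension.
 */
function runIgnoringWarnings(callable $operation): array
{
    $warnings = [];
    set_error_handler(
        function (int $errno, string $errstr) use (&$warnings): bool {
            $warnings[] = $errstr;
            return true; // mark as handled so PHP's built-in handler is skipped
        },
        E_WARNING | E_USER_WARNING
    );
    try {
        $result = $operation();
    } finally {
        // Always put the previous handler back, even if $operation throws
        restore_error_handler();
    }
    return [$result, $warnings];
}

// Stand-in for a timed-out runTasks() call: E_USER_WARNING simulates
// the E_WARNING that the extension would raise.
[$result, $warnings] = runIgnoringWarnings(function () {
    trigger_error('simulated timeout', E_USER_WARNING);
    return false;
});
```

In a polling loop, the suppressed call could then be written as runIgnoringWarnings([$client, 'runTasks']), which also gives you the warning text if you want to log it.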

Here's a fully tested example:

// How long should we wait each time around the polling loop if nothing happens
define('LOOP_WAIT_MS', 100);
// How long in total should we wait for responses before giving up
define('TOTAL_TIMEOUT_MS', 5000);

$client = new GearmanClient();
$client->addServer();

// This will fire as each job completes.
// In real code, this would save the data for later processing and record
// which jobs completed; this example just decrements a simple counter.
$client->setCompleteCallback(function(GearmanTask $task) use (&$pending) {
        $pending--;
        echo "Complete!\n";
        echo $task->data();
});

// This array can be used to track the tasks created. This example just counts them.
$tasks = [];
// Sample tasks; the workers sleep for the specified number of seconds before returning some data.
$tasks[] = $client->addTask('wait', '2');
$tasks[] = $client->addTask('wait', '2');
$tasks[] = $client->addTask('wait', '2');
$tasks[] = $client->addTask('wait', '2');
$tasks[] = $client->addTask('wait', '2');
$tasks[] = $client->addTask('wait', '2');

$pending = count($tasks);

// This is the key polling loop; runTasks() here acts as "wait for a notification from the server"
$client->setTimeout(LOOP_WAIT_MS);
$start = microtime(true);
do {
        // This will abort with a PHP Warning if no data is received in LOOP_WAIT_MS milliseconds
        // We ignore the warning, and try again, unless we've reached our overall time limit
        @$client->runTasks();
} while (
        // Exit the loop once we run out of time
        microtime(true) - $start < TOTAL_TIMEOUT_MS / 1000
        // Additional loop exit if all tasks have been completed
        // This counter is decremented in the complete callback
        && $pending > 0
);

echo "Finished with $pending tasks unprocessed.\n";
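The complete callback above only counts. To keep the results for later processing, and to support an exit condition such as "stop once at least X results are in", the bookkeeping can move into a small class. This is a sketch under assumptions: TaskTracker is a hypothetical class, not part of the Gearman extension, and the Gearman wiring appears only in comments so the block runs without a job server.

```php
<?php
/** Hypothetical bookkeeping class; not part of the Gearman extension. */
final class TaskTracker
{
    /** @var array<string, string> results keyed by task unique ID */
    private array $results = [];
    private int $expected;
    private float $startedAt;

    public function __construct(int $expected)
    {
        $this->expected = $expected;
        $this->startedAt = microtime(true); // the overall clock starts here
    }

    /** Call this from the complete callback. */
    public function onComplete(string $uniqueId, string $data): void
    {
        $this->results[$uniqueId] = $data;
    }

    public function pending(): int
    {
        return $this->expected - count($this->results);
    }

    public function results(): array
    {
        return $this->results;
    }

    /** True when all tasks are done, enough results are in, or time is up. */
    public function shouldStop(int $totalTimeoutMs, int $minResults): bool
    {
        $elapsedMs = (microtime(true) - $this->startedAt) * 1000;
        return $this->pending() <= 0
            || count($this->results) >= $minResults
            || $elapsedMs >= $totalTimeoutMs;
    }
}

// Assumed wiring into the polling loop (needs the gearman extension):
// $tracker = new TaskTracker(count($tasks));
// $client->setCompleteCallback(function (GearmanTask $task) use ($tracker) {
//     $tracker->onComplete($task->unique(), $task->data());
// });
// do { @$client->runTasks(); } while (!$tracker->shouldStop(TOTAL_TIMEOUT_MS, 4));

// Stand-alone demonstration with made-up IDs and data:
$tracker = new TaskTracker(3);
$tracker->onComplete('job-a', 'result A');
```

Create the tracker immediately before the loop, since its constructor starts the overall timer.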

Answer 2:

Your use case sounds like what CAN_DO_TIMEOUT was created for:

CAN_DO_TIMEOUT

 Same as CAN_DO, but with a timeout value on how long the job
 is allowed to run. After the timeout value, the job server will
 mark the job as failed and notify any listening clients.

 Arguments:
 - NULL byte terminated Function name.
 - Timeout value.

So for any (Worker, Function) tuple you can define a maximum time the Worker may spend processing a Job; if the Job runs longer, the job server marks it as failed.
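In the PHP extension, GearmanWorker::addFunction() accepts a timeout in seconds as its fourth argument, which, as far as I can tell, is how a worker requests this behaviour from the job server. The following is only a sketch, assuming a 'wait' function like the one in the question. handleWait() and registerWait() are hypothetical names, and their parameters are left untyped so the functions can be exercised without a running server.

```php
<?php
// Hypothetical handler for a 'wait' function: sleeps for the number of
// seconds given in the workload, then returns some data.
// $job is expected to be a GearmanJob.
function handleWait($job): string
{
    $seconds = (int) $job->workload();
    sleep($seconds);
    return "waited {$seconds}s";
}

// Registers the handler with a per-function timeout.
// $worker is expected to be a GearmanWorker.
function registerWait($worker, int $timeoutSeconds): bool
{
    // Arguments: function name, callback, context, timeout in seconds
    return $worker->addFunction('wait', 'handleWait', null, $timeoutSeconds);
}

// Assumed usage against a real job server (needs the gearman extension):
// $worker = new GearmanWorker();
// $worker->addServer();
// registerWait($worker, 3);
// while ($worker->work());
```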

Unfortunately there appears to be a bug in the C Server where the timeout is hard-coded at 1000 seconds.

One workaround, if your setup allows it, is to implement the timeout logic outside of Gearman. For example, if the worker uses cURL, SOAP, sockets, etc., you can often achieve the desired effect by tuning the timeout settings of those libraries.

Source: https://stackoverflow.com/questions/49036819/get-response-from-multiple-jobs-in-gearman-but-abort-after-a-timeout

Tags

php

gearman