Using Pool class in PHP7 pthreads extension

梦毁少年i 2021-01-21 08:18

I took the most basic demo of pthreads PHP7 extension that uses Pool class (this demo https://github.com/krakjoe/pthreads#polyfill) and extended it a little so I ca

2 answers
  • 2021-01-21 08:46

    As you have quite correctly noted, the code you have copied targets pthreads v2 (for PHP 5.x).

    The problem boils down to the fact that the garbage collector in pthreads is not deterministic. It does not behave predictably, so it cannot be relied on to fetch data from the tasks that the pool has executed.

    One way you could fetch this data would be to pass in Threaded objects into the tasks being submitted to the pool:

    <?php
    
    $pool = new Pool(4);
    $data = [];
    
    foreach (range(1, 8) as $i) {
        $dataN = new Threaded();
        $dataN->i = $i;
    
        $data[] = $dataN;
    
        $pool->submit(new class($dataN) extends Threaded {
            public $data;
    
            public function __construct($data)
            {
                $this->data = $data;
            }
    
            public function run()
            {
                echo "Hello World\n";
                $this->data->i *= 2;
            }
        });
    }
    
    while ($pool->collect());
    
    $pool->shutdown();
    
    foreach ($data as $dataN) {
        var_dump($dataN->i);
    }
    

    There are a few things to note about the above code:

    • Collectable (which is now an interface in pthreads v3) is implemented by the Threaded class already, so there's no need to implement it yourself.
    • Once a task has been submitted to the pool, it is already considered to be garbage, and so there is no need to handle this part yourself. Whilst you still have the ability to override the default garbage collector, this should not be needed in the vast majority of cases (including yours).
    • I still invoke the collect method (in a loop that blocks the main thread until all tasks have finished executing) so that the tasks can be garbage collected (using pthreads' default collector) to free up memory whilst the pool is executing tasks.
  • 2021-01-21 08:47

    I had a similar problem, where the collect loop would finish almost instantly. It turns out that collect stops looping once all work is in progress, not once all work is completed. For tasks that were still running, the collector callback was never even invoked.

    So if I had a pool size of 4 and submitted just 3 tasks, the collector would never run and the script would continue immediately. Example:

    define ("CRLF", "\r\n");
    
    class AsyncWork extends Thread {
      private $done = false;
      private $id;
    
      public function __construct($id) {
        $this->id = $id;
      }
    
      public function id() {
        return $this->id;
      }
    
      public function isCompleted() {
        return $this->done;
      }
    
      public function run() {
        echo '[AsyncWork] ' . $this->id . CRLF;
        sleep(rand(1,5));
        echo '[AsyncWork] sleep done ' . $this->id . CRLF;
        $this->done = true;
      }
    }
    
    $pool = new Pool(4);
    
    for($i=1;$i<=3;$i++) {
      $pool->submit(new AsyncWork($i));
    }
    
    while ($pool->collect(function(AsyncWork $work){
        echo 'Collecting ['.$work->id().']: ' . ($work->isCompleted()?1:0) . CRLF;
        return $work->isGarbage();
    })) continue;
    
    echo 'ALL DONE' . CRLF;
    
    $pool->shutdown();
    

    would output

    [AsyncWork] 1
    [AsyncWork] 2
    ALL DONE
    [AsyncWork] 3
    [AsyncWork] sleep done 2
    [AsyncWork] sleep done 3
    [AsyncWork] sleep done 1
    

    If I changed the above code to submit more work than the pool size, it would collect only until all work was in progress. For example:

    for($i=1;$i<=10;$i++) {
      $pool->submit(new AsyncWork($i));
    }
    
    //results:
    
    [AsyncWork] 1
    [AsyncWork] 2
    [AsyncWork] 3
    [AsyncWork] 4
    [AsyncWork] sleep done 4
    [AsyncWork] 8
    Collecting [4]: 1
    [AsyncWork] sleep done 1
    Collecting [1]: 1
    [AsyncWork] 5
    [AsyncWork] sleep done 3
    Collecting [3]: 1
    [AsyncWork] 7
    [AsyncWork] sleep done 2
    Collecting [2]: 1
    [AsyncWork] 6
    [AsyncWork] sleep done 6
    Collecting [6]: 1
    [AsyncWork] 10
    [AsyncWork] sleep done 7
    Collecting [7]: 1
    [AsyncWork] sleep done 8
    Collecting [8]: 1
    [AsyncWork] sleep done 5
    Collecting [5]: 1
    ALL DONE
    [AsyncWork] 9
    [AsyncWork] sleep done 9
    [AsyncWork] sleep done 10
    

    As you can see, it never collects the last tasks and it returns before the work is done.

    The only way I could solve this was to handle completion tracking myself, by keeping a list of the submitted tasks.

    $pool = new Pool(4);
    
    $worklist = [];
    for($i=1;$i<=10;$i++) {
      $work = new AsyncWork($i);
      $worklist[] = $work;
      $pool->submit($work);
    }
    
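    // Poll the task list until every task reports completion.
    do {
      $alldone = true;
      foreach($worklist as $i=>$work) {
        if (!$work->isCompleted()) {
          $alldone = false;
        } else {
          echo 'Completed: '. $work->id(). CRLF;
          unset($worklist[$i]);
        }
      }
    
      if ($alldone) {
        break;
      }
    
      usleep(10000); // brief pause so the loop does not spin at full CPU
    } while(true);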
    
    while ($pool->collect(function(AsyncWork $work){
        echo 'Collecting ['.$work->id().']: ' . ($work->isCompleted()?1:0) . CRLF;
        return $work->isGarbage();
    })) continue;
    
    echo 'ALL DONE' . CRLF;
    
    $pool->shutdown();
    

    This was the only way I could make sure ALL DONE was printed only once everything was, in fact, done.

    [AsyncWork] 1
    [AsyncWork] 2
    [AsyncWork] 3
    [AsyncWork] 4
    [AsyncWork] sleep done 1
    [AsyncWork] 5
    Completed: 1
    [AsyncWork] sleep done 2
    Completed: 2
    [AsyncWork] 6
    [AsyncWork] sleep done 4
    [AsyncWork] 8
    Completed: 4
    [AsyncWork] sleep done 6
    [AsyncWork] sleep done 3
    [AsyncWork] 7
    Completed: 6
    Completed: 3
    [AsyncWork] sleep done 5
    Completed: 5
    [AsyncWork] 10
    [AsyncWork] 9
    [AsyncWork] sleep done 9
    Completed: 9
    [AsyncWork] sleep done 8
    Completed: 8
    [AsyncWork] sleep done 7
    Completed: 7
    [AsyncWork] sleep done 10
    Completed: 10
    Collecting [1]: 1
    Collecting [5]: 1
    Collecting [9]: 1
    Collecting [2]: 1
    Collecting [6]: 1
    Collecting [10]: 1
    Collecting [3]: 1
    Collecting [7]: 1
    Collecting [4]: 1
    Collecting [8]: 1
    ALL DONE
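
The polling loop from this answer can be pulled out into a small reusable helper. What follows is only a sketch under stated assumptions: `waitForTasks`, `$pollMicros` and `$timeoutSecs` are hypothetical names that are not part of pthreads, and the stand-in task objects exist only so the snippet runs without the extension loaded. It keeps the same idea (track the submitted tasks yourself and check each one's completion flag) and adds a short sleep plus a timeout so the loop cannot spin forever.

```php
<?php
// Hypothetical helper, not part of pthreads: poll a list of tasks until every
// one reports completion. Returns the task keys in the order they finished.
function waitForTasks(array $tasks, callable $isDone, int $pollMicros = 10000, float $timeoutSecs = 30.0): array
{
    $finishedOrder = [];
    $deadline = microtime(true) + $timeoutSecs;

    while ($tasks) {
        // foreach iterates over a copy, so unsetting entries here is safe.
        foreach ($tasks as $key => $task) {
            if ($isDone($task)) {
                $finishedOrder[] = $key;
                unset($tasks[$key]);
            }
        }
        if (!$tasks) {
            break;
        }
        if (microtime(true) >= $deadline) {
            throw new RuntimeException(count($tasks) . ' task(s) still running after timeout');
        }
        usleep($pollMicros); // avoid pegging a CPU core while waiting
    }

    return $finishedOrder;
}

// Stand-in tasks (an assumption for illustration): each one reports "done"
// after a fixed number of polls, mimicking AsyncWork::isCompleted().
$fake = function (int $pollsNeeded) {
    return new class($pollsNeeded) {
        private $left;
        public function __construct(int $left) { $this->left = $left; }
        public function isCompleted(): bool { return $this->left-- <= 0; }
    };
};

$order = waitForTasks(
    ['a' => $fake(2), 'b' => $fake(0), 'c' => $fake(1)],
    function ($task) { return $task->isCompleted(); },
    1000
);
echo implode(',', $order), "\n";
```

With the `AsyncWork` class from this answer, the call would presumably look like `waitForTasks($worklist, function (AsyncWork $w) { return $w->isCompleted(); });`, placed between the submit loop and the `collect()` / `shutdown()` calls.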
    