GNU Parallel as job queue processor


Question


I have a worker.php file as below:

<?php

// GNU Parallel passes one line of the input file as the first CLI argument
$data = $argv[1];

// ...then some time-consuming processing of $data

and I run this as a poor man's job queue using GNU Parallel:

while read LINE; do echo $LINE; done < very_big_file_10GB.txt  | parallel -u php worker.php 

which kind of works by forking 4 PHP processes when I am on a 4-CPU machine.

But it still feels pretty synchronous to me, because read LINE still reads one line at a time.

Since it is a 10 GB file, I am wondering whether I can somehow get parallel to read the same file in parallel by splitting it into n parts (where n = the number of my CPUs), which would ideally make my import n times faster.


Answer 1:


No need to do the while business:

parallel -u php worker.php :::: very_big_file_10GB.txt

-u Ungroup output. Only use this if you are not going to use the output, as output from different jobs may mix.

:::: File input source. Equivalent to -a.
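Since :::: is equivalent to -a, the same command can also be written as follows (an equivalent form shown for illustration, not quoted from the original answer):

parallel -u -a very_big_file_10GB.txt php worker.php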

I think you will benefit from reading at least chapter 2 (Learn GNU Parallel in 15 minutes) of "GNU Parallel 2018". You can buy it at http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html or download it at: https://doi.org/10.5281/zenodo.1146014
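As a further option for the "split into n parts" idea from the question: GNU Parallel's --pipepart can chunk the file on disk and feed each chunk to a worker over STDIN. The sketch below is only an illustration and assumes worker.php is adapted to loop over lines read from STDIN instead of taking a single $argv[1] (worker_stdin.php is a hypothetical name for that variant):

# split the file into roughly one chunk per jobslot (--block -1)
# and pipe each chunk to a STDIN-reading worker
parallel --pipepart -a very_big_file_10GB.txt --block -1 php worker_stdin.php

Because --pipepart hands each worker a whole chunk rather than starting one php process per line, it avoids the per-line startup overhead when each line is cheap to process.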



Source: https://stackoverflow.com/questions/52945244/gnu-parallel-as-job-queue-processor
