I'm scraping data from the web, and I have several processes of my scraper running in parallel.
I want the output of each of these processes to end up in the same file.
One possibly interesting thing you could do is use GNU Parallel: http://www.gnu.org/s/parallel/ For example, if you were spidering the sites:
stackoverflow.com, stackexchange.com, fogcreek.com
you could do something like this:
(echo stackoverflow.com; echo stackexchange.com; echo fogcreek.com) | parallel -k your_spider_script
and the output is buffered by parallel and, because of the -k option, returned to you in the order of the site list above. A real example (basically copied from the second parallel screencast):
~ $ (echo stackoverflow.com; echo stackexchange.com; echo fogcreek.com) | parallel -k ping -c 1 {}
PING stackoverflow.com (64.34.119.12): 56 data bytes
--- stackoverflow.com ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
PING stackexchange.com (64.34.119.12): 56 data bytes
--- stackexchange.com ping statistics ---
1 packets transmitted, 0 packets received, 100.0% packet loss
PING fogcreek.com (64.34.80.170): 56 data bytes
64 bytes from 64.34.80.170: icmp_seq=0 ttl=250 time=23.961 ms
--- fogcreek.com ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 23.961/23.961/23.961/0.000 ms
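Applied to your scraping case, a minimal sketch along the same lines (sites.txt and combined_output.txt are placeholder names here, and your_spider_script stands in for whatever command fetches one site and prints to stdout):

~ $ parallel -k your_spider_script {} < sites.txt > combined_output.txt

Because parallel buffers each job's output until that job finishes, and -k keeps the output in input order, the results from the different processes won't interleave in combined_output.txt even though the jobs run concurrently.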
Anyway, ymmv