How to make gnu-parallel split multiple input files

Posted by 送分小仙女 on 2020-01-15 09:16:10

Question


I have a script which takes three arguments and is run like this:

myscript.sh input1.fa input2.fa out.txt

The script reads one line each from input1.fa and input2.fa, does some comparison, and writes the result to out.txt. The two inputs are required to have the same number of lines, and out.txt will also have the same number of lines after the script finishes.

Is it possible to parallelize this using GNU parallel?

I do not care that the output has a different order from the inputs, but I do need to compare the ith line of input1.fa with the ith line of input2.fa. Also, it is acceptable if I get multiple output files (like output{#}) instead of one -- I'll just cat them together.

I found this topic, but the answer wasn't quite what I wanted. I know I can split the two input files and process them in parallel in pairs using xargs, but I would like to do this in one line if possible...


Answer 1:


If you can change myscript.sh so that it reads from a pipe and writes to a pipe, you can do:

paste input1.fa input2.fa | parallel --pipe myscript.sh > out.txt

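paste joins the ith line of input1.fa and the ith line of input2.fa into a single TAB-separated line, and parallel --pipe splits that stream into blocks on line boundaries (about 1 MB each by default), passing each block to its own myscript.sh instance. The paired lines therefore stay together. In myscript.sh you then need to read from STDIN and split each line on TAB to recover the input from input1.fa and input2.fa.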
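As a rough sketch of what the pipe-based version of the script could look like: the function below reads TAB-separated pairs from STDIN and writes one result line per pair to STDOUT. The name compare_pairs and the same/different comparison are hypothetical placeholders, since the question does not show what myscript.sh actually compares. It also assumes that no input line contains a TAB and that no line is empty, because bash's read treats TAB as whitespace and would collapse an empty first field.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a pipe-based myscript.sh.
# The real comparison is not shown in the question; equality is used here
# purely as a placeholder.
compare_pairs() {
    # Each input line is "<line from input1.fa><TAB><line from input2.fa>".
    while IFS=$'\t' read -r left right; do
        if [ "$left" = "$right" ]; then
            printf 'same\n'
        else
            printf 'different\n'
        fi
    done
}

# Small demonstration with two made-up pairs.
printf 'ACGT\tACGT\nACGT\tACGA\n' | compare_pairs
```

In a real myscript.sh the demonstration line would be replaced by a bare call to compare_pairs, so that the script consumes whatever block parallel --pipe hands it on STDIN. If the output order should match the input order after all, parallel's -k (--keep-order) option can be added to the command above.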



Source: https://stackoverflow.com/questions/18622295/how-to-make-gnu-parallel-split-multiple-input-files
