How do I use parallel programming/multi threading in my bash script?

流过昼夜 提交于 2019-12-08 15:46:59

问题


This is my script:

#!/bin/bash
#script to loop through directories to merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest

for f in $sourcedir/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    zcat $f/*R1*.fastq.gz | gzip > $destdir/"$fbase"_R1.fastq.gz
    zcat $f/*R2*.fastq.gz | gzip > $destdir/"$fbase"_R2.fastq.gz
done

Here there are about 30 sub-directories in the directory 'source'. Each sub-directory has certain R1.fastq.gz files and R2.fastq.gz that I want to merge into one R1.fastq.gz and R2.fastq.gz file, then save the merged file to the destination directory. My code works fine but I need to speed it up because of the amount of data. I just want to know is there any way I can implement multi threaded programming in my script? How can I run my script so that multiple jobs run in parallel? New to bash scripting, so any help would be appreciated.


回答1:


The simplest way is to execute the commands in the background, by adding & to the end of the command:

#!/bin/bash
#script to loop through directories to merge fastq files
sourcedir=/path/to/source
destdir=/path/to/dest

for f in $sourcedir/*
do
    fbase=$(basename "$f")
    echo "Inside $fbase"
    zcat $f/*R1*.fastq.gz | gzip > $destdir/"$fbase"_R1.fastq.gz &
    zcat $f/*R2*.fastq.gz | gzip > $destdir/"$fbase"_R2.fastq.gz &
done

From the bash manual:

If a command is terminated by the control operator ‘&’, the shell executes the command asynchronously in a subshell. This is known as executing the command in the background. The shell does not wait for the command to finish, and the return status is 0 (true). When job control is not active (see Job Control), the standard input for asynchronous commands, in the absence of any explicit redirections, is redirected from /dev/null.




回答2:


I am not sure but you can try using & at the end of the command like this

zcat $f/*R1*.fastq.gz | gzip > $destdir/"$fbase"_R1.fastq.gz &
zcat $f/*R2*.fastq.gz | gzip > $destdir/"$fbase"_R2.fastq.gz &


来源:https://stackoverflow.com/questions/18384505/how-do-i-use-parallel-programming-multi-threading-in-my-bash-script

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!