How can I run multiple runs of pipeline with different config files - issue with lock on .snakemake directory

Submitted by 雨燕双飞 on 2020-01-15 11:26:06

Question


I am running a snakemake pipeline from the same working directory but with different config files, and the inputs and outputs are in different directories too. Although both runs use data in different folders, snakemake locks the pipeline folder itself, because of the .snakemake folder and the locks folder within it. Is there a way to force separate .snakemake folders? Code example below:

Both runs are started from within /home/pipelines/qc_pipeline :

run 1:

/home/apps/miniconda3/bin/snakemake -p -k -j 999 --latency-wait 10 --restart-times 3 --use-singularity --singularity-args "-B /pipelines_test/QC_pipeline/PE_trimming/,/clusterTMP/testingQC/,/home/www/codebase/references" --configfile /clusterTMP/testingQC/config.yaml --cluster-config QC_slurm_roadsheet.json --cluster "sbatch --job-name {cluster.name} --mem-per-cpu {cluster.mem-per-cpu} -t {cluster.time} --output {cluster.output}"   

run 2:

/home/apps/miniconda3/bin/snakemake -p -k -j 999 --latency-wait 10 --restart-times 3 --use-singularity --singularity-args "-B /pipelines_test/QC_pipeline/SE_trimming/,/clusterTMP/testingQC2/,/home/www/codebase/references" --configfile /clusterTMP/testingQC2/config.yaml --cluster-config QC_slurm_roadsheet.json --cluster "sbatch --job-name {cluster.name} --mem-per-cpu {cluster.mem-per-cpu} -t {cluster.time} --output {cluster.output}"   

error:

Directory cannot be locked. Please make sure that no other Snakemake process is trying to create the same files in the following directory:
/home/pipelines/qc_pipeline
If you are sure that no other instances of snakemake are running on this directory, the remaining lock was likely caused by a kill signal or a power loss. It can be removed with the --unlock argument.
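For context, a minimal sketch (not part of the original question): the lock that this error refers to lives under .snakemake/locks inside the working directory, which is why two runs launched from the same directory collide even when their inputs and outputs differ. WORKDIR below is a placeholder for the pipeline directory.

```shell
# Sketch: a non-empty .snakemake/locks under the working directory means
# some run (live or crashed) holds the lock; a stale one can be cleared
# with "snakemake --unlock" as the error message suggests.
workdir="${WORKDIR:-.}"                 # e.g. /home/pipelines/qc_pipeline
if [ -n "$(ls -A "$workdir/.snakemake/locks" 2>/dev/null)" ]; then
    lock_state="locked"
else
    lock_state="unlocked"
fi
echo "$lock_state"
```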

Answer 1:


Maarten-vd-Sande correctly points to the --nolock option (+1), but in my opinion it's a very bad idea to use --nolock routinely.

As the error says, two snakemake processes are trying to create the same file. Unless the error is a bug in snakemake, I wouldn't blindly proceed and overwrite files.

I think it would be safer to give each snakemake execution its own working directory, like:

topdir=$(pwd)

mkdir -p run1 
cd run1
snakemake --configfile /path/to/config1.yaml ... 
cd $topdir

mkdir -p run2
cd run2 
snakemake --configfile /path/to/config2.yaml ...
cd $topdir

mkdir -p run3
etc... 

EDIT

Actually, it should be less clunky, and probably better, to use the --directory/-d option:

snakemake -d run1 --configfile /path/to/config1.yaml ...
snakemake -d run2 --configfile /path/to/config2.yaml ...
...
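The two runs from the question could then be launched along these lines. This is a sketch: run_testingQC and run_testingQC2 are made-up directory names derived from the config paths, and the trailing ... stands for the cluster options from the original commands.

```shell
# Sketch: derive a private working directory per config file, so each
# snakemake invocation creates its own .snakemake/ (and lock) inside it.
for cfg in /clusterTMP/testingQC/config.yaml /clusterTMP/testingQC2/config.yaml; do
    rundir="run_$(basename "$(dirname "$cfg")")"   # run_testingQC, run_testingQC2
    mkdir -p "$rundir"
    # snakemake -d "$rundir" --configfile "$cfg" ... &   # launch concurrently
done
# wait   # uncomment along with the backgrounded snakemake lines above
```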



Answer 2:


As long as the different pipelines do not generate the same output files, you can do it with the --nolock option:

snakemake --nolock [rest of the command]

See the Snakemake documentation for a short note on --nolock.



Source: https://stackoverflow.com/questions/59642199/how-can-i-run-multiple-runs-of-pipeline-with-different-config-files-issue-with
