Run Snakemake rule one sample at a time

Submitted by 十年热恋 on 2021-01-28 10:33:48

Question


I'm creating a Snakemake workflow that wraps some of the tools in the NVIDIA Clara Parabricks pipelines. Because these tools run on GPUs, they can typically handle only one sample at a time; otherwise the GPU runs out of memory. However, Snakemake sends all the samples to Parabricks at once, seemingly unaware of the GPU memory limits. One solution would be to tell Snakemake to process one sample at a time, hence the question:

How do I get Snakemake to process one sample at a time?

Because Parabricks is a licensed product (and therefore not necessarily reproducible), I will show the Parabricks rule I am trying to run (pbrun fq2bam), as well as a minimal reproducible example using open-source software (fastqc) that we can work from.

My Parabricks rule - pbrun fq2bam

Snakefile:

# Define samples from fastq dir using wildcards
SAMPLES, = glob_wildcards("../fastq/{sample}_1.filt.fastq.gz")

rule all:
    input:
        expand("{sample}_recalibrated.bam", sample = SAMPLES)

rule pbrun_fq2bam:
    input:
        R1 = "../fastq/{sample}_1.filt.fastq.gz",
        R2 = "../fastq/{sample}_2.filt.fastq.gz"
    output:
        bam = "{sample}_recalibrated.bam",
        recal = "{sample}_recal.txt"
    shell:
        "pbrun fq2bam --ref human_g1k_v37_decoy.fasta --in-fq {input.R1} {input.R2} --knownSites dbsnp_138.b37.vcf --out-bam {output.bam} --out-recal {output.recal}"

Run command:

snakemake -j 32 --use-conda

Error when four samples/exomes are present in the ../fastq/ directory:

GPU-BWA mem
ProgressMeter   Reads           Base Pairs Aligned
cudaSafeCall() failed at ParaBricks/src/samGenerator.cu:782 : out of memory
cudaSafeCall() failed at ParaBricks/src/samGenerator.cu:782 : out of memory
cudaSafeCall() failed at ParaBricks/src/chainGenerator.cu:185 : out of memory
cudaSafeCall() failed at ParaBricks/src/chainGenerator.cu:185 : out of memory
cudaSafeCall() failed at ParaBricks/src/chainGenerator.cu:185 : out of memory
cudaSafeCall() failed at ParaBricks/src/chainGenerator.cu:183 : out of memory
cudaSafeCall() failed at ParaBricks/src/chainGenerator.cu:185 : out of memory
cudaSafeCall() failed at ParaBricks/src/chainGenerator.cu:183 : out of memory

Minimal example - fastqc

Get data:

mkdir ../fastq/
gsutil cp -r gs://genomics-public-data/gatk-examples/example1/NA19913/* ../fastq/

Snakefile:

SAMPLES, = glob_wildcards("../fastq/{sample}_1.filt.fastq.gz")

rule all:
    input:
        expand(["{sample}_1.filt_fastqc.html", "{sample}_2.filt_fastqc.html"], sample = SAMPLES),
        expand(["{sample}_1.filt_fastqc.zip", "{sample}_2.filt_fastqc.zip"], sample = SAMPLES)

rule fastqc:
    input:
        R1 = "../fastq/{sample}_1.filt.fastq.gz",
        R2 = "../fastq/{sample}_2.filt.fastq.gz"
    output:
        html = ["{sample}_1.filt_fastqc.html", "{sample}_2.filt_fastqc.html"],
        zip = ["{sample}_1.filt_fastqc.zip", "{sample}_2.filt_fastqc.zip"]
    conda:
        "fastqc.yaml"
    shell:
        "fastqc {input.R1} {input.R2} --outdir ."

fastqc.yaml:

channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - bioconda::fastqc =0.11.9

Run command:

snakemake -j 32 --use-conda

Thanks in advance for any pointers!!


Answer 1:


I would like to expand on the answer of @jafors. Rather than limiting memory, it is probably better to define a custom gpu resource:

rule pbrun_fq2bam:
...
    resources:
        gpu=1

Then run Snakemake with --resources gpu=1.

This way you can still use memory and threads for other rules, and each resource name describes what it actually limits.
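
Applied to the Snakefile from the question, the idea might look like the sketch below. The rule, paths, and pbrun command are copied from the question. The resource name gpu is an arbitrary user-defined label, not something Snakemake recognises as hardware. It is assumed here that the machine has a single GPU, so the global limit is set to 1.

```snakemake
# Define samples from fastq dir using wildcards
SAMPLES, = glob_wildcards("../fastq/{sample}_1.filt.fastq.gz")

rule all:
    input:
        expand("{sample}_recalibrated.bam", sample = SAMPLES)

rule pbrun_fq2bam:
    input:
        R1 = "../fastq/{sample}_1.filt.fastq.gz",
        R2 = "../fastq/{sample}_2.filt.fastq.gz"
    output:
        bam = "{sample}_recalibrated.bam",
        recal = "{sample}_recal.txt"
    resources:
        # Each job of this rule claims one unit of the user-defined "gpu" resource.
        gpu = 1
    shell:
        "pbrun fq2bam --ref human_g1k_v37_decoy.fasta --in-fq {input.R1} {input.R2} --knownSites dbsnp_138.b37.vcf --out-bam {output.bam} --out-recal {output.recal}"
```

This would be run as snakemake -j 32 --use-conda --resources gpu=1. The global limit on the command line matters: in local execution, a resource with no limit given is not constrained, so the jobs would still all start at once.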




Answer 2:


You could try adding threads: 32 to your rule. With -j 32, each job then claims all the available cores, so Snakemake runs only one sample at a time.

Memory can also be restricted using something like

resources:
    mem_mb=100

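in the rule and --resources mem_mb=100 in the snakemake call. Snakemake then only starts jobs whose combined mem_mb stays within 100, so at most one job of this rule runs at a time. Note that in local execution this is a scheduling constraint only; Snakemake does not itself cap the memory a job actually uses.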
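
To make the accounting concrete, here is a toy model in Python of how a global resource limit decides which jobs may start together. It is only a greedy illustration and not Snakemake's actual scheduler; the function pick_jobs and the job names are hypothetical.

```python
def pick_jobs(jobs, limits):
    """Greedily select jobs whose summed resource demands stay within limits.

    Resources that are missing from `limits` are treated as unlimited.
    """
    used = {}
    selected = []
    for name, needs in jobs:
        fits = all(
            used.get(res, 0) + amount <= limits.get(res, float("inf"))
            for res, amount in needs.items()
        )
        if fits:
            for res, amount in needs.items():
                used[res] = used.get(res, 0) + amount
            selected.append(name)
    return selected


# Hypothetical pending jobs: two samples, each with a GPU-bound job that
# declares mem_mb=100 and a CPU-only fastqc job.
jobs = [
    ("pbrun_fq2bam:A", {"cores": 1, "mem_mb": 100}),
    ("pbrun_fq2bam:B", {"cores": 1, "mem_mb": 100}),
    ("fastqc:A", {"cores": 1}),
    ("fastqc:B", {"cores": 1}),
]

# With a global mem_mb limit of 100, only one pbrun_fq2bam job fits.
print(pick_jobs(jobs, {"cores": 32, "mem_mb": 100}))

# Without a mem_mb limit, every job may start at once.
print(pick_jobs(jobs, {"cores": 32}))
```

In this model the rules that do not declare mem_mb are unaffected by the limit, which is why a dedicated resource serialises only the rule that needs it.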



Source: https://stackoverflow.com/questions/63733419/run-snakemake-rule-one-sample-at-a-time
