snakemake: how to deal with variable number of output from a rule

眉间皱痕 submitted on 2020-01-03 21:12:22

Question


I want to run bcl2fastq to generate fastq files from bcl format.

Depending on the sequencing setup (the sequencing mode and how many indexes were used), it can generate either read1, read2, index1 or read1, read2, index1, index2, etc.

What I want to do is list the read IDs in the config.yaml file like this:

readids: ['I1','I2','R1','R2']

and let the rule figure out automatically how many read outputs (fastq.gz files) it should generate.

How do I write the output section to achieve it?

Below is what I have. It only produces one file each time the rule runs, so the rule actually runs 4 times, once each for I1, I2, R1 and R2, which is not what I want. How can I fix line 45? There, {readid} is supposed to be one of I1, I2, R1, R2.

 39 rule bcl2fastq:                                                                                                                                                 
 40     input:
 41         "/data/MiniSeq/test"
 42     params:
 43         prefix="0_fastq"
 44     output:
 45         "0_fastq/{runid}_S0_L001_{readid}_001.fastq.gz"
 46     log:
 47         "0_fastq/bcl2fastq_log.txt"
 48     shell:
 49         """
 50         bcl2fastq -R {input} -o {params.prefix} --create-fastq-for-index-reads --barcode-mismatches 1 --use-bases-mask {config[bcl2mask]} --minimum-trimmed-read-length 1 --mask-short-adapter-reads 1 --no-bgzf-compression &> {log}
 52        
 53         """
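
To illustrate why the rule above runs once per file, here is a minimal sketch in plain Python. It is not Snakemake's actual code: `pattern_to_regex` and `jobs_for` are hypothetical helpers, and `run1` is a made-up run ID. The sketch assumes each wildcard behaves like a `.+` regex group, so every requested file matches the single-output pattern with its own `readid` and becomes its own job.

```python
import re

# Output pattern from line 45 of the rule, with two wildcards.
PATTERN = "0_fastq/{runid}_S0_L001_{readid}_001.fastq.gz"

def pattern_to_regex(pattern):
    """Hypothetical helper: turn each {name} into a named '.+' group."""
    # re.split with one capture group alternates literal text and names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        regex += f"(?P<{part}>.+)" if i % 2 else re.escape(part)
    return re.compile(regex + "$")

def jobs_for(targets, pattern=PATTERN):
    """Return one wildcard assignment (one job) per requested file."""
    regex = pattern_to_regex(pattern)
    return [regex.match(target).groupdict() for target in targets]

# 'run1' is an assumed run ID used only for illustration.
targets = [
    f"0_fastq/run1_S0_L001_{readid}_001.fastq.gz"
    for readid in ["I1", "I2", "R1", "R2"]
]
jobs = jobs_for(targets)
for job in jobs:
    print(job)
```

Each of the four files yields a different `readid` value, which is why a rule declaring only one output per wildcard combination ends up being scheduled four times.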

Answer 1:


You are looking for the expand() function, which fills in the given variables and returns a list of output files. You just need to be careful to escape the wildcards that should "survive the formatting" (use double curly brackets):

So in your case

output:
      expand("0_fastq/{{runid}}_S0_L001_{readid}_001.fastq.gz", readid=config['readids'])

This will replace readid with each of the values given in config['readids'] and keep the runid wildcard.
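
The escaping can be tried outside Snakemake with a rough sketch. It uses Python's built-in `str.format` as a stand-in for `expand()`, not Snakemake's own implementation; `expand_readids`, the `READIDS` list (standing in for `config['readids']`), and the run ID `run1` are assumptions for illustration only.

```python
# Stand-in for config['readids'] from config.yaml.
READIDS = ["I1", "I2", "R1", "R2"]

# Double braces around runid protect it from the first substitution.
PATTERN = "0_fastq/{{runid}}_S0_L001_{readid}_001.fastq.gz"

def expand_readids(pattern, readids):
    """Hypothetical helper: fill {readid} once per value.

    str.format replaces {readid} and collapses {{runid}} to a
    literal {runid}, so the runid wildcard survives.
    """
    return [pattern.format(readid=readid) for readid in readids]

outputs = expand_readids(PATTERN, READIDS)
for path in outputs:
    print(path)

# The surviving {runid} can be filled later; 'run1' is made up.
concrete = [path.format(runid="run1") for path in outputs]
```

The first pass produces four patterns that still contain `{runid}`, so the rule declares all four files as outputs of a single job per run.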

Andreas



Source: https://stackoverflow.com/questions/40685908/snakemake-how-to-deal-with-variable-number-of-output-from-a-rule
