Snakemake - Problem trying to use glob_wildcards (TypeError: expected str, got list)

Posted by 感情迁移 on 2021-01-27 18:30:03

Question


I'm new to Snakemake and not a Python expert either, so the answer might be quite obvious. Everything in my workflow worked fine in my tests until I tried to use glob_wildcards to generate FastQC reports for all of the fastq.gz files in one directory (FASTQDIR).

The sample names in the SAMPLES list are correct, but I get an error saying that a string was expected instead of a list (I assume the list is my SAMPLES list), and I don't know what to change in my Snakefile to correct it. I understand that it is probably linked to my use of glob_wildcards, but I don't understand where the problem is. Do you have any idea how I can fix it?

Here is my Snakefile:

FASTQDIR = "/fastq/files/directory/"
WDIR = "/my/working/directory/"
SAMPLES, = glob_wildcards(FASTQDIR + "{sample}.fastq.gz")

rule all:
    input:
        expand(WDIR + "Fastqc/{sample}_fastqc.html", sample=SAMPLES),
        expand(WDIR + "Fastqc/{sample}_fastqc.zip", sample=SAMPLES)

#Generates fastqc file for the sample fastq.gz file in the Fastqc directory
rule fastqc_generate_qc:
    input:
        expand(FASTQDIR + "{sample}.fastq.gz", sample=SAMPLES)
    output:
        expand(WDIR + "Fastqc/{sample}_fastqc.html", sample=SAMPLES),
        expand(WDIR + "Fastqc/{sample}_fastqc.zip", sample=SAMPLES)
    shell:
        "fastqc --outdir Fastqc/ {input}"

Here is the full traceback:

Traceback (most recent call last):
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/__init__.py", line 420, in snakemake
    force_use_threads=use_threads)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/workflow.py", line 480, in execute
    success = scheduler.schedule()
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/scheduler.py", line 215, in schedule
    self.run(job)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/scheduler.py", line 229, in run
    error_callback=self._error)
  File "/home/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/executors.py", line 59, in run
    self._run(job)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/executors.py", line 120, in _run
    super()._run(job)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/executors.py", line 66, in _run
    self.printjob(job)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/executors.py", line 85, in printjob
    msg=job.message,
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/jobs.py", line 175, in message
    self.rule.message else None)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/jobs.py", line 542, in format_wildcards
    return format(string, **_variables)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/site-packages/snakemake/utils.py", line 259, in format
    return fmt.format(_pattern, *args, **variables)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/string.py", line 187, in format
    return self.vformat(format_string, args, kwargs)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/string.py", line 191, in vformat
    result, _ = self._vformat(format_string, args, kwargs, used_args, 2)
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/string.py", line 201, in _vformat
    self.parse(format_string):
  File "/home/miniconda3/envs/snakemake-tutorial/lib/python3.5/string.py", line 285, in parse
    return _string.formatter_parser(format_string)
TypeError: expected str, got list

Thank you in advance for your help


Answer 1:


You are not using wildcards here.

Your rule fastqc_generate_qc takes ALL the fastq files as input and declares ALL the fastqc files as output, in a single job.
One thing to remember in Snakemake: expand produces a list of files. You don't want that here:

rule fastqc_generate_qc:
    input:
        FASTQDIR + "{sample}.fastq.gz"
    output:
        WDIR + "Fastqc/{sample}_fastqc.html",
        WDIR + "Fastqc/{sample}_fastqc.zip"
    shell:
        "fastqc --outdir Fastqc/ {input}"

Here sample is a wildcard. It is your rule all that lists the concrete file names to produce. Snakemake then matches each requested file against the output pattern of fastqc_generate_qc, which sets the value of sample, and runs the rule once per sample.
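To make the difference concrete, here is a minimal plain-Python sketch of the two mechanisms. It does not import Snakemake: expand_sketch and match_wildcards are hypothetical stand-ins written for illustration, and the sample names are made up. Snakemake's real implementation is more involved, so treat this as a mental model only.

```python
import re
from itertools import product


def expand_sketch(pattern, **values):
    """Simplified stand-in for expand: fill the pattern with every
    combination of the given values and return a LIST of strings."""
    keys = list(values)
    combos = product(*(values[k] for k in keys))
    return [pattern.format(**dict(zip(keys, combo))) for combo in combos]


def match_wildcards(pattern, target):
    """Simplified stand-in for wildcard resolution: match one requested
    file against a rule's output pattern and return the wildcard values."""
    # re.split with a capture group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = "".join(
        re.escape(part) if i % 2 == 0 else "(?P<%s>.+)" % part
        for i, part in enumerate(parts)
    )
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


SAMPLES = ["sampleA", "sampleB"]  # hypothetical sample names

# What rule all asks for: a list of concrete file names.
targets = expand_sketch("Fastqc/{sample}_fastqc.html", sample=SAMPLES)
print(targets)

# What a wildcard rule does with each requested file: infer sample from it.
resolved = [match_wildcards("Fastqc/{sample}_fastqc.html", t) for t in targets]
print(resolved)
```

The first print shows a list with one path per sample, which is what belongs in rule all. The second shows one wildcard assignment per requested file, which is why the per-sample rule should use the bare pattern rather than expand.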

For information, if you want to keep a wildcard inside an expand call, you have to double its braces: expand("path/{{study}}/{sample}", sample=SAMPLES)
Here, study is a wildcard and sample is not: the values of sample come from the keyword argument passed to expand.
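The effect of the doubled braces can be reproduced with Python's own str.format, which treats {{ and }} as escaped literal braces. The sketch below uses str.format as a stand-in on the assumption that expand handles escaped braces the same way, and the sample names are hypothetical.

```python
SAMPLES = ["s1", "s2"]  # hypothetical sample names

pattern = "path/{{study}}/{sample}"

# {sample} is filled in now; {{study}} collapses to {study} and stays
# in the result as a wildcard to be resolved later.
partially_filled = [pattern.format(sample=s) for s in SAMPLES]
print(partially_filled)  # ['path/{study}/s1', 'path/{study}/s2']
```

Each resulting string still contains {study}, so it can be used in a rule whose output defines the study wildcard.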



Source: https://stackoverflow.com/questions/60526646/snakemake-problem-trying-to-use-global-wildcards-typeerror-expected-str-got
