I'm trying to first generate 4 files, one for each LETTERS x NUMS combination, and then combine them over NUMS to obtain one file per element of LETTERS:
    LETTERS = ["A", "B"]
    NUMS = ["1", "2"]


    rule all:
        input:
            expand("combined_{letter}.txt", letter=LETTERS)

    rule generate_text:
        output:
            "text_{letter}_{num}.txt"
        shell:
            """
            echo "test" > {output}
            """

    rule combine_text:
        input:
            expand("text_{letter}_{num}.txt", num=NUMS)
        output:
            "combined_{letter}.txt"
        shell:
            """
            cat {input} > {output}
            """
Executing this snakefile results in the following error:
    WildcardError in line 19 of /tmp/Snakefile:
    No values given for wildcard 'letter'.
      File "/tmp/Snakefile", line 19, in <module>
It seems that a partial expand is not possible. Is this a limitation of expand? If so, how can I work around it?
It seems that this is not a limitation of expand, but a limitation of my familiarity with the way string formatting works in Python. I need to use double braces for the wildcard that should not be expanded:
    LETTERS = ["A", "B"]
    NUMS = ["1", "2"]


    rule all:
        input:
            expand("combined_{letter}.txt", letter=LETTERS)

    rule generate_text:
        output:
            "text_{letter}_{num}.txt"
        shell:
            """
            echo "test" > {output}
            """

    rule combine_text:
        input:
            # {{letter}} is escaped, so expand leaves it as the wildcard {letter}
            expand("text_{{letter}}_{num}.txt", num=NUMS)
        output:
            "combined_{letter}.txt"
        shell:
            """
            cat {input} > {output}
            """
Executing this Snakefile now generates the expected files:
text_A_2.txt
text_A_1.txt
text_B_2.txt
text_B_1.txt
combined_A.txt
combined_B.txt
Indeed, braces need to be escaped when you want expand to leave them alone. It relies on str.format, and hence any rules that apply to str.format apply to expand as well.
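The escaping rule can be checked in plain Python, without Snakemake. The following is only a minimal sketch: the list comprehension is a hypothetical stand-in for what expand does with num=NUMS, not Snakemake's actual implementation, and the final .format(letter="A") call merely illustrates that a single-brace placeholder survives the first pass.

```python
# Doubled braces are the str.format escape for a literal brace.
template = "text_{{letter}}_{num}.txt"

# Hypothetical stand-in for expand(template, num=NUMS): format once per value.
NUMS = ["1", "2"]
partially_filled = [template.format(num=n) for n in NUMS]
print(partially_filled)  # ['text_{letter}_1.txt', 'text_{letter}_2.txt']

# The surviving {letter} placeholder can still be filled in a later pass.
filled = partially_filled[0].format(letter="A")
print(filled)  # text_A_1.txt

# Without escaping, str.format demands a value for every field,
# which mirrors the "No values given for wildcard 'letter'" error.
try:
    "text_{letter}_{num}.txt".format(num="1")
    error = None
except KeyError as exc:
    error = exc
print(repr(error))  # KeyError('letter')
```

The first pass consumes one level of braces, which is why the pattern passed to expand needs {{letter}} while the output pattern of the same rule keeps the single-brace {letter}.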
Source: https://stackoverflow.com/questions/40398091/how-to-do-a-partial-expand-in-snakemake