snakemake

Can Snakemake work if a rule's shell command is a cluster job?

Submitted by 不打扰是莪最后的温柔 on 2020-06-13 09:03:05
Question: In the example below, if the shell script shell_script.sh sends a job to the cluster, is it possible to make Snakemake aware of that cluster job's completion? That is, file a should first be created by shell_script.sh, which submits its own job to the cluster, and then, once that cluster job has completed, file b should be created. For simplicity, let's assume that Snakemake is run locally, meaning that the only cluster job originates from shell_script.sh and not from Snakemake itself. localrules: that_job rule
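Snakemake itself cannot track a job that shell_script.sh submits on its own; a common workaround is to make the submission call block until the cluster job finishes. Below is a minimal sketch assuming a Slurm cluster, where sbatch --wait keeps the shell command in the foreground until the submitted job exits. The rule name that_job, the script shell_script.sh, and files a and b come from the question; make_b is a hypothetical follow-up rule.

localrules: that_job

rule that_job:
    output:
        "a"
    # sbatch --wait returns only after the cluster job has finished, so
    # Snakemake checks for "a" (which shell_script.sh is expected to write)
    # only once the job is really done.
    shell:
        "sbatch --wait shell_script.sh"

rule make_b:
    input:
        "a"
    output:
        "b"
    shell:
        "touch {output}"

If blocking submission is not an option, the alternative is to let Snakemake itself submit jobs (e.g. via a cluster profile) instead of hiding the submission inside the rule's shell command.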

Snakemake InputFunctionException. AttributeError: 'Wildcards' object has no attribute

Submitted by 对着背影说爱祢 on 2020-04-17 04:26:10
Question: I have a list object with ChIP-seq single-end fastq file names, allfiles=['/path/file1.fastq','/path/file2.fastq','/path/file3.fastq']. I'm trying to use that object, allfiles, as a wildcard; I want it to drive the input of the fastqc rule (and others such as mapping, but let's keep it simple). I tried what is shown in the code below (lambda wildcards: data.loc[(wildcards.sample),'read1']). This, however, gives me the error "InputFunctionException in line 118 of Snakefile: AttributeError:
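This error usually means that the rule using the lambda has no {sample} wildcard in its output, so wildcards.sample is never defined. A minimal sketch of the intended setup, assuming a hypothetical pandas table named data with a read1 column and illustrative FastQC output naming:

import pandas as pd

# Hypothetical samplesheet built from the allfiles list: index = sample name,
# 'read1' = path to the single-end fastq file.
data = pd.DataFrame(
    {"read1": ["/path/file1.fastq", "/path/file2.fastq", "/path/file3.fastq"]},
    index=["file1", "file2", "file3"],
)

rule all:
    input:
        expand("fastqc/{sample}_fastqc.html", sample=data.index)

rule fastqc:
    input:
        # The lambda can only see wildcards.sample because {sample} appears
        # in this rule's output below.
        lambda wildcards: data.loc[wildcards.sample, "read1"]
    output:
        "fastqc/{sample}_fastqc.html"
    shell:
        "fastqc {input} --outdir fastqc"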

Snakemake - dynamically derive the targets from input files

Submitted by 纵饮孤独 on 2020-03-23 04:01:57
Question: I have a large number of input files organized like this:

data/
├── set1/
│   ├── file1_R1.fq.gz
│   ├── file1_R2.fq.gz
│   ├── file2_R1.fq.gz
│   ├── file2_R2.fq.gz
│   :
│   └── fileX_R2.fq.gz
├── another_set/
│   ├── asdf1_R1.fq.gz
│   ├── asdf1_R2.fq.gz
│   ├── asdf2_R1.fq.gz
│   ├── asdf2_R2.fq.gz
│   :
│   └── asdfX_R2.fq.gz
:
└── many_more_sets/
    ├── zxcv1_R1.fq.gz
    ├── zxcv1_R2.fq.gz
    :
    └── zxcvX_R2.fq.gz

If you are familiar with bioinformatics, these are of course fastq files from paired-end sequencing runs.
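One way to derive the targets from whatever files are present is glob_wildcards. A minimal sketch, assuming the desired targets are BAM files under a hypothetical results/ directory and that bwa and samtools are available; the target pattern, reference ref.fa, and mapping command are placeholders to be adapted to the actual pipeline:

# Recover every (dataset, sample) pair from the directory layout above.
DATASETS, SAMPLES = glob_wildcards("data/{dataset}/{sample}_R1.fq.gz")

rule all:
    input:
        # zip keeps dataset and sample paired instead of taking the product.
        expand("results/{dataset}/{sample}.bam",
               zip, dataset=DATASETS, sample=SAMPLES)

rule map_reads:
    input:
        r1="data/{dataset}/{sample}_R1.fq.gz",
        r2="data/{dataset}/{sample}_R2.fq.gz",
    output:
        "results/{dataset}/{sample}.bam"
    shell:
        # ref.fa is a placeholder reference genome.
        "bwa mem ref.fa {input.r1} {input.r2} | samtools sort -o {output} -"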

Snakemake: Generic input function for different file locations

Submitted by 生来就可爱ヽ(ⅴ<●) on 2020-02-06 11:00:47
Question: I have two locations where my huge data can be stored: /data and /work. /data is the folder that (intermediate) results are moved to after quality control; it is mounted read-only for the standard user. /work is the folder that new results are written to, and it is obviously writable. I do not want to copy or link data from /data to /work. So I run my Snakemake workflow from within the /work folder and want my input function to first check whether the required file exists in /data (and return the
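A sketch of such an input function, assuming the workflow runs with /work as the working directory so that a relative path resolves to /work. The file names trimmed.fq.gz and result.txt, the rule name, and the cp command are illustrative; note that the existence check runs when Snakemake builds the DAG, not when the job executes.

import os

def prefer_data(wildcards):
    # Hypothetical layout: the same relative path exists either under
    # /data (already quality-controlled, read-only) or under /work (workdir).
    relpath = f"{wildcards.sample}/trimmed.fq.gz"
    data_path = os.path.join("/data", relpath)
    # Take the /data copy if it is already there; otherwise return the
    # relative path, which resolves to /work because we run from /work.
    return data_path if os.path.exists(data_path) else relpath

rule use_file:
    input:
        prefer_data
    output:
        "{sample}/result.txt"
    shell:
        "cp {input} {output}"   # placeholder command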

snakemake group files together by wildcard

Submitted by 此生再无相见时 on 2020-02-05 06:17:06
Question: I have a Snakemake file containing rules to concatenate files listed in a samplesheet. The samplesheet looks like:

sample  unit   fq1            fq2
A       lane1  A.l1.1.R1.txt  A.l1.1.R2.txt
A       lane1  A.l1.2.R1.txt  A.l1.2.R2.txt
A       lane2  A.l2.R1.txt    A.l2.R2.txt
B       lane1  B.l1.R1.txt    B.l1.R2.txt
B       lane2  B.l2.R1.txt    B.l2.R2.txt

My goal is to merge the fq1 files from the same sample and same unit and put them in {sample}/fastq/, and then to merge the resulting files from one sample (the ones in {sample}/fastq) into {sample}/bam/. It
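A sketch of the first merge step, assuming the samplesheet is a tab-separated file named samples.tsv and that the merged-file naming is illustrative: an input function selects all fq1 (or fq2) rows for one sample/unit pair and the rule concatenates them.

import pandas as pd

# Hypothetical samplesheet file (tab-separated) with the columns shown above.
units = pd.read_csv("samples.tsv", sep="\t", dtype=str)
PAIRS = units[["sample", "unit"]].drop_duplicates()

rule all:
    input:
        expand("{u.sample}/fastq/{u.sample}_{u.unit}_R{read}.txt",
               u=list(PAIRS.itertuples()), read=["1", "2"])

def fastqs(wildcards):
    # All fq1 (read=1) or fq2 (read=2) files of one sample/unit combination.
    sel = (units["sample"] == wildcards.sample) & (units["unit"] == wildcards.unit)
    return sorted(units.loc[sel, "fq" + wildcards.read])

rule merge_fastq:
    input:
        fastqs
    output:
        "{sample}/fastq/{sample}_{unit}_R{read}.txt"
    wildcard_constraints:
        read="1|2"
    shell:
        "cat {input} > {output}"

The second, per-sample merge into {sample}/bam/ would follow the same pattern, with an input function that returns all merged per-unit files belonging to one sample.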

Snakemake report: How to show results in pipeline order

Submitted by 核能气质少年 on 2020-01-25 10:15:39
Question: I want to use the report() functionality of Snakemake, as it looks really promising. Under the Results section, I'd like to present the results in the same order in which the pipeline was executed. For example, in a (mostly) serial QC pipeline, I would then see where poor-quality samples were excluded. However, by default, results are sorted by the name of the reported output files. Ways I could achieve a better sorting are, for example: use caption to insert a timestamp for the execution and
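One possible workaround, assuming report categories are rendered in lexicographic order: give each pipeline step its own category with a numeric prefix so the sections follow execution order. The rule names, file paths, and cp commands below are illustrative placeholders; only the report(..., category=...) usage is the point.

rule qc_summary:
    input:
        "qc/{sample}.metrics.txt"
    output:
        # Numeric prefixes ("01 ...", "02 ...") keep the sorted report
        # categories in pipeline order.
        report("report/{sample}.qc_summary.txt",
               category="01 Quality control")
    shell:
        "cp {input} {output}"   # placeholder; the real rule would summarize QC

rule filtering_summary:
    input:
        "filtered/{sample}.metrics.txt"
    output:
        report("report/{sample}.filtering_summary.txt",
               category="02 Sample filtering")
    shell:
        "cp {input} {output}"   # placeholder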