Snakemake InputFunctionException. AttributeError: 'Wildcards' object has no attribute

对着背影说爱祢 submitted on 2020-04-17 04:26:10

Question


I have a list object with ChIP-seq single-end fastq file names: allfiles=['/path/file1.fastq','/path/file2.fastq','/path/file3.fastq']. I'm trying to use that object, allfiles, as a wildcard: I want it to be the input of the fastqc rule (and of others such as mapping, but let's keep it simple). I tried what is shown in the code below (lambda wildcards: data.loc[(wildcards.sample),'read1']). This, however, gives me the error

"InputFunctionException in line 118 of Snakefile:
AttributeError: 'Wildcards' object has no attribute 'sample'
Wildcards:
" 

Does someone know exactly how to define it? It seems I am close: I get the general idea, but I am failing to get the syntax right and make it run. Thank you!

Code:

import pandas as pd
import numpy as np

# Read in config file parameters
configfile: 'config.yaml'
sampleFile = config['samples'] # three columns: sample ID , /path/to/chipseq_file_SE.fastq , /path/to/chipseq_input.fastq
outputDir = config['outputdir'] # output directory

outDir = outputDir + "/MyExperiment"
qcDir = outDir + "/QC"

# Read in the samples table
data = pd.read_csv(sampleFile, header=0, names=['sample', 'read1', 'inputs']).set_index('sample', drop=False)
samples = data['sample'].unique().tolist() # sample IDs
read1 = data['read1'].unique().tolist() # ChIP-treatment file single-end file
inplist= data['inputs'].unique().tolist() # the ChIP-input files
inplistUni= data['inputs'].unique().tolist() # the ChIP-input files (unique)
allfiles = read1 + inplistUni

# Target rule
rule all:
    input:
        expand(f'{qcDir}' + '/raw/{sample}_fastqc.html', sample=samples),
        expand(f'{qcDir}' + '/raw/{sample}_fastqc.zip', sample=samples),

# fastqc report generation
rule fastqc:
    input: lambda wildcards: data.loc[(wildcards.sample), 'read1']
    output:
        html=expand(f'{qcDir}' + '/raw/{sample}_fastqc.html',sample=samples) ,
        zip=expand(f'{qcDir}' + '/raw/{sample}_fastqc.zip',sample=samples)
    log: expand(f'{logDir}' + '/qc/{sample}_fastqc_raw.log',sample=samples)
    threads: 4
    wrapper: "fastqc {input} 2>> {log}"

Answer 1:


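Currently, the output files of rule fastqc don't contain any wildcards once expand() has resolved them. That is, the Snakefile defines a single fastqc job that tries to produce the output files for all samples at once, so there is no sample wildcard left for the input function to read, which is why wildcards.sample raises the AttributeError.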
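
To see why no wildcard survives, here is a minimal plain-Python sketch. expand_like is a hypothetical stand-in that only mimics the single-wildcard substitution that expand() performs; it is not Snakemake's implementation, and the directory and sample names are made up for illustration.

```python
def expand_like(pattern, **wildcards):
    """Hypothetical stand-in for Snakemake's expand(); handles one wildcard only."""
    (name, values), = wildcards.items()
    return [pattern.replace("{" + name + "}", value) for value in values]


qc_dir = "results/QC"         # assumed value, for illustration only
samples = ["s1", "s2", "s3"]  # assumed sample IDs

expanded = expand_like(qc_dir + "/raw/{sample}_fastqc.html", sample=samples)
print(expanded)

# Every placeholder has been filled in, so a rule that lists these paths as
# outputs has no {sample} left for Snakemake to match against.
has_wildcard = any("{sample}" in path for path in expanded)
print(has_wildcard)
```

Since the expanded list holds only concrete paths, a rule declaring them as output is a single job with an empty set of wildcards.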

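However, it appears you would like to run rule fastqc separately for each sample. In that case, the rule needs to be generalized as below, where {sample} is the wildcard and the outputs and log are plain patterns instead of expand() calls. The command is also placed under shell rather than wrapper, since it is a shell command line and not a wrapper reference: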

rule fastqc:
    input: lambda wildcards: data.loc[(wildcards.sample), 'read1']
    output:
        html=qcDir + '/raw/{sample}_fastqc.html',
        zip=qcDir + '/raw/{sample}_fastqc.zip'
    log: logDir + '/qc/{sample}_fastqc_raw.log'
    threads: 4
    shell: "fastqc {input} 2>> {log}"
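
The way the input function behaves in both situations can be imitated outside Snakemake. This is only a sketch under assumptions: SimpleNamespace stands in for Snakemake's Wildcards object, a dict with made-up paths stands in for the pandas table, and fastqc_input is a hypothetical name for the lambda used in the rule.

```python
from types import SimpleNamespace

# Hypothetical sample table: sample ID -> ChIP read file (made-up paths).
read1_by_sample = {
    "s1": "/path/file1.fastq",
    "s2": "/path/file2.fastq",
}


def fastqc_input(wildcards):
    """Same lookup as the lambda in the rule, with a dict instead of pandas."""
    return read1_by_sample[wildcards.sample]


# Output pattern contains {sample}: the wildcards object carries 'sample'.
matched = SimpleNamespace(sample="s2")
resolved = fastqc_input(matched)
print(resolved)

# Outputs fully expanded: nothing was matched, so the attribute is missing.
empty = SimpleNamespace()
try:
    fastqc_input(empty)
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__
print(error_name)
```

The second call fails in the same way as the original Snakefile: the function asks for an attribute that was never set because the output contained no wildcard.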


Source: https://stackoverflow.com/questions/61216641/snakemake-inputfunctionexception-attributeerror-wildcards-object-has-no-attr
