rna-seq

Snakemake, RNA-seq: How can I execute one subpart of a pipeline or another based on the characteristics of the sample being analysed?

跟風遠走 posted on 2021-02-08 03:41:53
Question: I am using Snakemake to design an RNA-seq data analysis pipeline. While I've managed to do that, I want to make my pipeline as adaptable as possible, so that it can handle single-end (SE) and paired-end (PE) data within the same run instead of analysing SE data in one run and PE data in another. My pipeline is supposed to be designed like this: dataset download that gives 1 file (SE data) or 2 files (PE data) --> set of rules A specific to 1 file OR set of rules
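The branching described in the question can be sketched in plain Python, assuming the layout is inferred only from how many FASTQ files the download step produced. The names `detect_layout` and `choose_rule_set`, the example file names, and the label "B" for the second rule set are illustrative assumptions; the excerpt only names "set of rules A".

```python
def detect_layout(fastq_files):
    """Return "SE" for one FASTQ file and "PE" for two (hypothetical helper)."""
    n = len(fastq_files)
    if n == 1:
        return "SE"
    if n == 2:
        return "PE"
    raise ValueError(f"expected 1 or 2 FASTQ files, got {n}")


def choose_rule_set(sample, downloads):
    """Pick the rule set for a sample: "A" for SE; "B" for PE is an assumed label."""
    return "A" if detect_layout(downloads[sample]) == "SE" else "B"


# Hypothetical download results: sample name -> list of downloaded files.
downloads = {
    "sample1": ["sample1.fastq.gz"],
    "sample2": ["sample2_1.fastq.gz", "sample2_2.fastq.gz"],
}

for sample in downloads:
    print(sample, detect_layout(downloads[sample]), choose_rule_set(sample, downloads))
```

In a Snakemake workflow, a helper like this would typically be called from an input function so that each sample requests only the files of its own branch; that wiring is not shown here.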

Plotting all genes in pseudotime in Monocle

吃可爱长大的小学妹 posted on 2020-03-25 19:13:05
Question: I would like to use code similar to the code just below, which works for the lung dataset in the Monocle package:
options(stringsAsFactors = FALSE)
library("monocle")
lung <- load_lung()
diff_test_res <- differentialGeneTest(lung, fullModelFormulaStr = "~genotype")
ordering_genes <- diff_test_res[diff_test_res$qval < 0.01, "gene_id"]
lung <- setOrderingFilter(lung, ordering_genes)
plot_ordering_genes(lung)
lung <- reduceDimension(lung, max_components = 2, method = 'DDRTree')
lung <-

ViSEAGO tutorial: visualising topGO object

我们两清 posted on 2020-01-25 10:14:12
Question: Earlier, I posted a question and, after some help, was able to load my data successfully and create a topGO object. I'm trying to visualise GO terms that are significantly associated with the list of differentially expressed genes that I have from mouse RNA-seq data. Now I'd like to raise a concern about ViSEAGO's tutorial. The tutorial initially specifies loading two files: 'selection.txt' and 'background.txt'. The origin of these files is not clearly stated. However, after a lot of

Error when I load my own data using ViSEAGO create_topGO

余生颓废 posted on 2020-01-25 08:09:30
Question: I have trouble creating a topGO object using my own data. I'm wondering if someone can help me with this! I'm following a couple of tutorials and the steps mentioned in the original ViSEAGO paper. Here are chunks from the tutorial and their links. From the publication: ViSEAGO offers all statistical tests and algorithms developed in the Bioconductor topGO R package, taking into account the topology of the GO graph, by using the ViSEAGO::create_topGOdata method followed by the topGO::runTest method. Under '

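xgene: WGS, mutations and cancer, RNA-seq, WES

不想你离开。 posted on 2019-12-22 02:13:36
Human whole-genome sequencing 06
SNP (single nucleotide polymorphism): once the coverage depth exceeds 10x, SNP calls become quite reliable. Compared with the hg19 reference genome, the genome of a typical East Asian individual carries about 3.5 million SNPs. About 20,000 of these fall in exons, and about 9,000 are non-synonymous. A non-synonymous SNP is one that changes the protein sequence.
indel (insertion & deletion): small insertions and deletions shorter than 50 bp. Compared with hg19, the genome of a typical East Asian individual carries about 500,000 indels, of which roughly 1,000 fall in exons. An indel that falls in the coding sequence of an exon changes the protein sequence. If it causes a frameshift, every amino acid after the frameshift site differs from the original sequence. If the gene keeps its original reading frame, the protein still gains or loses one or more amino acids.
SV (structural variation): chromosomal structural variation. 1. Translocation within a chromosome 2. Translocation between chromosomes 3. Deletion of a large fragment 4. Insertion of a large fragment 5. Duplication of a large fragment 6. Inversion of a large fragment
CNV (copy number variation): variation in the copy number of chromosome segments, including both copy-number gains and copy-number losses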
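The frameshift rule described above can be illustrated with a minimal Python sketch. It assumes VCF-style `ref` and `alt` allele strings and an indel that lies entirely within coding sequence; the function name `classify_indel` and the returned labels are hypothetical. The 50 bp cutoff comes from the definition in the text.

```python
MAX_INDEL_BP = 50  # the text defines indels as shorter than 50 bp


def classify_indel(ref, alt):
    """Classify a coding indel as "frameshift" or "in-frame" (hypothetical helper).

    A net length change that is not a multiple of 3 shifts the reading frame;
    a multiple of 3 keeps the frame but adds or removes whole codons.
    """
    size = abs(len(ref) - len(alt))
    if size == 0:
        raise ValueError("ref and alt have the same length; not an indel")
    if size >= MAX_INDEL_BP:
        return "too large for an indel"
    return "in-frame" if size % 3 == 0 else "frameshift"


# Illustrative alleles, not taken from any real call set.
print(classify_indel("A", "ATG"))    # 2 bp insertion
print(classify_indel("A", "ATGC"))   # 3 bp insertion
print(classify_indel("ATGCA", "A"))  # 4 bp deletion
```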

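[Repost] RNA-seq sequencing methods

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-05 22:07:45
This article comes from http://www.bioinfo-scrounger.com; please credit the source when reposting.
RNA-seq sequencing methods. When sequencing mRNA, the rRNA must first be removed. Taking human as an example, 95% of the extracted total RNA is rRNA, 2% is mRNA, and the rest is lncRNA, microRNA, siRNA and so on. rRNA is highly conserved across humans and very stable across tissues and organs, so sequencing it is of no use for our research. mRNA is the more important fraction of the RNA. Illumina's TruSeq RNA library preparation method is the most widely used one. Taking a standard eukaryotic transcriptome as an example:
1. Exploiting the poly(A) tail of mRNA (a feature specific to higher organisms), magnetic beads carrying poly(T) probes are hybridised with the total RNA so that the mRNA binds to the beads.
2. The beads are recovered and the poly(A) mRNA is eluted from them.
3. The eluted mRNA is fragmented with a magnesium-ion solution; the fragments are reverse-transcribed with random primers into first-strand cDNA, and the second strand is then synthesised, giving double-stranded cDNA.
4. The ends of the double-stranded cDNA are repaired, an A is added, and adapters are ligated.
5. Fragment size selection, PCR amplification and purification (if the sample contains contaminants, further purification with a kit is needed).
Note: this library preparation method places high demands on mRNA integrity. If the mRNA is degraded, the beads can capture only the fragments near the 3' end, and the remaining fragments are lost during the enrichment step, which introduces some bias into the final data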
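As a small worked example of the proportions quoted above, the following Python sketch splits a total RNA input into its fractions. The 3% figure for "other" is simply what remains after 95% rRNA and 2% mRNA; the function name `rna_masses` and the 1000 ng input are illustrative assumptions, and real yields after poly(A) capture will differ.

```python
# Percentages from the text; "other" (lncRNA, microRNA, siRNA, ...) is the remainder.
COMPOSITION_PCT = {"rRNA": 95, "mRNA": 2, "other": 3}


def rna_masses(total_ng):
    """Split a total RNA mass (ng) by the quoted percentages (hypothetical helper)."""
    return {name: total_ng * pct / 100 for name, pct in COMPOSITION_PCT.items()}


# Illustrative input of 1000 ng total RNA.
for name, mass in rna_masses(1000).items():
    print(f"{name}: {mass:.1f} ng")
```

This makes clear why enrichment matters: under these proportions only a small part of the input is the mRNA that the library is meant to capture.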

Cell theory|Bulk RNA-seq|Cellular heterogeneity|Micromanipulation|Limiting dilution|LCM|FACS|MACS|Droplet|10X genomics|Human cell atlas|Spatially resolved transcriptomes|ST|Slide-seq|SeqFISH|MERFISH

99封情书 posted on 2019-12-03 17:08:56
Bioinformatics. Cell theory: 7 key points
1. All known living things are made up of one or more cells.
2. All living cells arise from pre-existing cells by division.
3. The cell is the fundamental unit of structure and function in all living organisms.
4. The activity of an organism depends on the total activity of independent cells.
5. Energy flow (metabolism and biochemistry) occurs within cells.
6. Cells contain DNA, which is found specifically in the chromosome, and RNA, which is found in the cell nucleus and cytoplasm.
7. All cells are basically the same in chemical composition in organisms of similar species.
Genotype-Transcription-gene