  • 8. Transcriptome Assembly

    Created by Benjamin M Goetz, last modified on Jun 29, 2015

    Assembly of RNA-seq short reads into a transcriptome. 

    1. Quality Assessment

    Data quality is assessed with FastQC.

    • Deliverables
      • Reports generated by FastQC.
    • Tools Used
      • FastQC (Andrews 2010): used to generate quality summaries of the data:
        • Per base sequence quality report: useful for deciding whether trimming is necessary.
        • Sequence duplication levels: evaluation of library complexity. Higher levels of sequence duplication may be expected for high-coverage RNA-seq data.
        • Overrepresented sequences: evaluation of adapter contamination.
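
    To make the first two reports more concrete, here is a minimal sketch of the quantities behind them: the mean Phred score at each read position and the fraction of exactly duplicated reads. It is not FastQC's implementation (FastQC samples and bins the data differently); the function names are hypothetical, and Phred+33 quality encoding is an assumption.

```python
from collections import Counter
from statistics import mean


def parse_fastq(lines):
    """Yield (sequence, quality_string) pairs from FASTQ lines (4 lines per record)."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield seq, qual


def per_base_mean_quality(records, offset=33):
    """Mean Phred score at each read position (Phred+33 encoding is assumed)."""
    columns = {}
    for _, qual in records:
        for pos, ch in enumerate(qual):
            columns.setdefault(pos, []).append(ord(ch) - offset)
    return [mean(columns[pos]) for pos in sorted(columns)]


def duplication_fraction(records):
    """Fraction of reads that are exact copies of another read already counted."""
    counts = Counter(seq for seq, _ in records)
    total = sum(counts.values())
    return 0.0 if total == 0 else 1 - len(counts) / total


if __name__ == "__main__":
    example = [
        "@r1", "ACGT", "+", "IIII",
        "@r2", "ACGT", "+", "II5!",
        "@r3", "TTTT", "+", "IIII",
    ]
    records = list(parse_fastq(example))
    print(per_base_mean_quality(records))  # quality drops toward the 3' end
    print(duplication_fraction(records))   # one of three reads is a duplicate
```

    A per-position mean that falls off toward the end of the reads is the usual sign that trimming may help.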

    2. Assembly

    We use Trinity to generate a de novo assembly. Assembly is computationally demanding and may not finish within the time limits imposed on compute jobs at TACC, especially for large data sets. To increase the chance of obtaining an assembly, we run two assemblies: one on the original data, and one with in silico normalization to 50x coverage applied before the main assembly starts. If the run on the non-normalized data does not complete, the run on the normalized data may.

    • Deliverables
      • FASTA file of assembly from full data (if it finishes).

      • FASTA file of assembly with in silico normalization to 50x coverage (if it finishes).

      • If neither assembly run finishes, there is no charge.

    • Tools Used
      • Trinity (Grabherr et al. 2011): one of the best-known and most widely used de novo transcriptome assemblers.
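
    The idea behind in silico normalization can be illustrated with a toy sketch: estimate each read's coverage from the abundance of its k-mers, then keep the read with a probability that caps the expected coverage at the target (50x here). This is a simplified illustration under stated assumptions, not Trinity's implementation; the function names, the k-mer size, and the fixed random seed are hypothetical choices made for the example.

```python
import random
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_reads(reads, target_cov=50, k=25, seed=0):
    """Downsample reads so that highly covered regions approach target_cov.

    A read's coverage is estimated as the median abundance of its k-mers
    across the whole data set. The read is kept with probability
    min(1, target_cov / coverage), so low-coverage reads are always kept.
    """
    rng = random.Random(seed)  # fixed seed keeps the toy example reproducible
    counts = Counter(km for read in reads for km in kmers(read, k))
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            kept.append(read)  # read shorter than k: no estimate, keep it
            continue
        coverage = median(counts[km] for km in read_kmers)
        if rng.random() < min(1.0, target_cov / coverage):
            kept.append(read)
    return kept


if __name__ == "__main__":
    abundant = "ACGTACGGTCAATGCCTAGGATCCA"
    rare = "TTTTTGGGGGCCCCCAAAAATTGGCC"
    reads = [abundant] * 1000 + [rare]
    kept = normalize_reads(reads, target_cov=50, k=5)
    print(len(reads), "reads in,", len(kept), "reads kept")
```

    Reads from highly expressed transcripts are thinned out while reads from rare transcripts survive, which is why the normalized run is the more likely of the two to finish within the job time limit.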

    3. Optional: Homology Against Standard Databases

    For an additional charge, we can take a completed assembly and search it with BLAST against UniProt or with HMMER against Pfam. These homology searches give some indication of what the assembled transcripts represent.

    • Deliverables
      • Table of BLAST results against UniProt, with the option of appending the best hits to the FASTA file tags.

      • Table of HMMER results against Pfam, with the option of appending the best hits to the FASTA file tags.

    • Tools Used
      • BLASTX (Altschul et al. 1997) for nucleotide-to-protein homology search against the UniProt protein database.
      • hmmscan (Eddy 1998) for HMM-based homology search against the Pfam database of proteins and protein domains.
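
    Appending best hits to the FASTA tags could look roughly like the sketch below. It assumes the BLAST results are in the standard 12-column tabular format (`-outfmt 6`) and picks the hit with the highest bit score per query. The function names and the `best_hit=` / `evalue=` tag layout are hypothetical, not the format of the actual deliverable.

```python
def best_hits(blast_tab_lines):
    """Best hit per query (highest bit score) from BLAST tabular output.

    Columns of -outfmt 6: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    best = {}
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


def annotate_fasta(fasta_lines, hits):
    """Append the best hit, if any, to each FASTA header line."""
    out = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            if seq_id in hits:
                subject, evalue, _ = hits[seq_id]
                line = f"{line} best_hit={subject} evalue={evalue:g}"
        out.append(line)
    return out


if __name__ == "__main__":
    blast = [
        "TR1\tsp|P1|A_HUMAN\t90.0\t100\t5\t0\t1\t300\t1\t100\t1e-50\t200",
        "TR1\tsp|P2|B_MOUSE\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-30\t150",
    ]
    fasta = [">TR1 len=300", "ACGTACGT", ">TR2 len=120", "TTGGCCAA"]
    print("\n".join(annotate_fasta(fasta, best_hits(blast))))
```

    Transcripts without a hit keep their original header, so the annotated FASTA file remains a drop-in replacement for the assembly.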
     