  • NGS (HISAT, StringTie, Ballgown)

    #HISAT (hierarchical indexing for spliced alignment of transcripts) is a highly efficient system for aligning reads from RNA sequencing experiments. HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, employing two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments. HISAT's hierarchical index for the human genome contains 48,000 local FM indexes, each representing a genomic region of ~64,000 bp. Tests on real and simulated data sets showed that HISAT is the fastest system currently available, with equal or better accuracy than any other method. Despite its large number of indexes, HISAT requires only 4.3 gigabytes of memory. HISAT supports genomes of any size, including those larger than 4 billion bases.

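    HISAT is a fast and sensitive spliced alignment program for mapping RNA-seq reads. In addition to one global FM index that represents the whole genome, HISAT uses a large set of small FM indexes that collectively cover the whole genome (each index represents a genomic region of ~64,000 bp, and ~48,000 indexes are needed to cover the human genome). These small indexes (called local indexes), combined with several alignment strategies, enable efficient alignment of RNA-seq reads, in particular reads spanning multiple exons. The memory footprint of HISAT is relatively low (~4.3 GB for the human genome). HISAT was developed on top of the Bowtie2 implementation, which handles most of the operations on the FM index.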
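As a rough sanity check on the figures above, the number of local indexes multiplied by the region each one represents should land near the size of the human genome. The sketch below does only that arithmetic; the constant and function names are illustrative, and any overlap between neighbouring regions is ignored.

```python
# Back-of-the-envelope check of the hierarchical index figures quoted above.
# Names are illustrative; overlap between neighbouring regions is ignored.
LOCAL_INDEX_COUNT = 48_000   # approximate number of local FM indexes (human)
REGION_SIZE_BP = 64_000      # approximate bases represented by each local index


def covered_bases(index_count: int, region_size_bp: int) -> int:
    """Total bases covered if every local index spans one region."""
    return index_count * region_size_bp


total = covered_bases(LOCAL_INDEX_COUNT, REGION_SIZE_BP)
print(f"{total:,} bp (~{total / 1e9:.2f} Gbp)")
```

The product is about 3.07 billion bases, which is on the order of the roughly 3 billion bases of the human genome, so the two quoted numbers are consistent with each other.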

    #Methods used to sequence the transcriptome often produce more than 200 million short sequences. We introduce StringTie, a computational method that applies a network flow algorithm originally developed in optimization theory, together with optional de novo assembly, to assemble these complex data sets into transcripts. When used to analyze both simulated and real data sets, StringTie produces more complete and accurate reconstructions of genes and better estimates of expression levels, compared with other leading transcript assembly programs including Cufflinks, IsoLasso, Scripture and Traph. For example, on 90 million reads from human blood, StringTie correctly assembled 10,990 transcripts, whereas the next best assembly was of 7,187 transcripts by Cufflinks, which is a 53% increase in transcripts assembled. On a simulated data set, StringTie correctly assembled 7,559 transcripts, which is 20% more than the 6,310 assembled by Cufflinks. As well as producing a more complete transcriptome assembly, StringTie runs faster on all data sets tested to date compared with other assembly software, including Cufflinks.
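The two percentage claims in the abstract can be reproduced from the transcript counts it quotes. This is a minimal check, with a helper name chosen here for illustration:

```python
# Recompute the relative gains quoted in the StringTie abstract
# from the transcript counts it reports.
def percent_increase(new: int, baseline: int) -> float:
    """Relative increase of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


blood = percent_increase(10_990, 7_187)      # StringTie vs Cufflinks, human blood
simulated = percent_increase(7_559, 6_310)   # StringTie vs Cufflinks, simulated data

print(f"human blood: {blood:.1f}%")
print(f"simulated:   {simulated:.1f}%")
```

Both values round to the figures stated in the abstract (53% and 20%).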

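    StringTie is a fast and highly efficient assembler of RNA-seq alignments into potential transcripts. It uses a novel network flow algorithm, as well as an optional de novo assembly step, to assemble and quantify full-length transcripts representing multiple splice variants for each gene locus. Its input can include not only the alignments of raw reads used by other transcript assemblers, but also alignments of longer sequences that have been assembled from those reads. To identify differentially expressed genes between experiments, StringTie's output can be processed by specialized software such as Ballgown, Cuffdiff, or other programs (DESeq2, edgeR, etc.).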

    #Analysis of raw reads from RNA sequencing (RNA-seq) makes it possible to reconstruct complete gene structures, including multiple splice variants, without relying on previously established annotations. Downstream statistical modeling of summarized gene or transcript expression data output from these pipelines is facilitated by the Bioconductor project.

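    Ballgown is a software package designed to facilitate flexible differential expression analysis of RNA-seq data. It also provides functions to organize, visualize, and analyze the expression measurements for your transcriptome assembly.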
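To show how the three tools are typically chained, here is a hedged sketch that only builds the command lines for one sample and prints them; it does not run anything. It assumes the `hisat2` command (the successor to the original HISAT), `samtools` for sorting, and StringTie's `-e -B` mode, which writes the table files that Ballgown reads. The sample name, index prefix, annotation file, and read file names are all hypothetical, and the assembly and merge steps of a full workflow are left out.

```python
import shlex
from pathlib import Path


def pipeline_commands(sample, index, annotation, reads1, reads2,
                      threads=8, out_dir="ballgown"):
    """Build align -> sort -> quantify commands for one paired-end sample.

    All paths are placeholders supplied by the caller; nothing is executed.
    """
    sam = f"{sample}.sam"
    bam = f"{sample}.bam"
    gtf = str(Path(out_dir) / sample / f"{sample}.gtf")
    return [
        # Spliced alignment; --dta tailors the output for transcript assemblers.
        ["hisat2", "-p", str(threads), "--dta", "-x", index,
         "-1", reads1, "-2", reads2, "-S", sam],
        # Coordinate-sort the alignments, as StringTie expects sorted input.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # Estimate abundances for the annotated transcripts only (-e)
        # and write Ballgown table files (-B) next to the output GTF.
        ["stringtie", "-e", "-B", "-p", str(threads),
         "-G", annotation, "-o", gtf, bam],
    ]


# Hypothetical inputs, for illustration only.
commands = pipeline_commands(
    sample="sample1",
    index="genome_index",
    annotation="genes.gtf",
    reads1="sample1_1.fastq.gz",
    reads2="sample1_2.fastq.gz",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments makes it straightforward to pass to `subprocess.run` later without shell quoting problems. Ballgown itself is an R/Bioconductor package, so the differential expression step would follow in R on the directory written by the last command.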

  • Original article: https://www.cnblogs.com/wangprince2017/p/9937593.html