  • Comparative assessment of long-read error-correction software applied to RNA-sequencing data

    Leandro Lima, Camille Marchet, Ségolène Caboche, Corinne Da Silva, Benjamin Istace, Jean-Marc Aury, Hélène Touzet, Rayan Chikhi
     

    Abstract

    Motivation: Long-read sequencing technologies offer promising alternatives to high-throughput short-read sequencing, especially in the context of RNA-sequencing. However, these technologies are currently hindered by high error rates in the output data that affect analyses such as the identification of isoforms, exon boundaries, and open reading frames, and the creation of gene catalogues. Due to the novelty of such data, computational methods are still actively being developed, and options for the error-correction of RNA-sequencing long reads remain limited.

    Results: In this article, we evaluate the extent to which existing long-read DNA error-correction methods are capable of correcting cDNA Nanopore reads. We provide an automatic and extensive benchmark tool that reports not only classical error-correction metrics but also the effect of correction on gene families, isoform diversity, bias towards the major isoform, and splice site detection. We find that long-read error-correction tools that were originally developed for DNA are also suitable for the correction of RNA-sequencing data, especially in terms of increasing base-pair accuracy. Yet investigators should be warned that the correction process perturbs gene family sizes and isoform diversity. This work provides guidelines on which (or whether) error-correction tools should be used, depending on the application type.

    Benchmarking software: https://gitlab.com/leoisl/LR_EC_analyser

  • Original post: https://www.cnblogs.com/wangprince2017/p/13755970.html