  • 1. Question: prep_reads.info vs. align_summary.txt

    ### Reference: https://www.biostars.org/p/163356/

    I used TopHat to map my reads against their respective reference genome.


    When I look inside prep_reads.info, I see:

    • left_min_read_len=90
    • left_max_read_len=90
    • left_reads_in =24995053
    • left_reads_out=24994132
    • right_min_read_len=90
    • right_max_read_len=90
    • right_reads_in =24995053
    • right_reads_out=24994422
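As a quick check on the numbers above, here is a minimal Python sketch that parses the key=value lines and reports how many reads were dropped per mate. The helper name `parse_prep_reads` is made up for illustration, and the file content is pasted in as a string rather than read from the TopHat output directory.

```python
def parse_prep_reads(text):
    """Parse 'key=value' lines (tolerating spaces around '=') into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank or unrelated lines
        key, value = line.split("=", 1)
        stats[key.strip()] = int(value)
    return stats


# Content of prep_reads.info as quoted in the question.
PREP_READS_INFO = """\
left_min_read_len=90
left_max_read_len=90
left_reads_in =24995053
left_reads_out=24994132
right_min_read_len=90
right_max_read_len=90
right_reads_in =24995053
right_reads_out=24994422
"""

stats = parse_prep_reads(PREP_READS_INFO)

# Reads that went into the preparation step but did not come out of it.
filtered = {
    side: stats[f"{side}_reads_in"] - stats[f"{side}_reads_out"]
    for side in ("left", "right")
}
print(filtered)  # {'left': 921, 'right': 631}
```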

    Then when I open align_summary.txt, I see:

    Left reads:
                   Input:  24995053
                 Mapped:  22715900 (90.9% of input)
                of these:   2106892 ( 9.3%) have multiple alignments (89 have >20)
    Right reads:
                   Input:  24995053
                  Mapped:  22310498 (89.3% of input)
                of these:   2088630 ( 9.4%) have multiple alignments (148 have >20)
    90.1% overall read alignment rate.

    Aligned pairs:  21074559
         of these:   1469415 ( 7.0%) have multiple alignments
              and:    107380 ( 0.5%) are discordant alignments
    83.9% concordant pair alignment rate.
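The percentages in this summary can be reproduced from the raw counts. The sketch below is only an arithmetic check on the quoted numbers. The formulas for the overall and concordant rates are assumptions that happen to match the printed values, not something taken from the TopHat source.

```python
# Counts copied from align_summary.txt above.
input_reads = 24995053        # "Input", identical for left and right mates
left_mapped = 22715900
right_mapped = 22310498
aligned_pairs = 21074559
discordant_pairs = 107380

left_rate = 100 * left_mapped / input_reads
right_rate = 100 * right_mapped / input_reads

# Assumption: overall rate = all mapped mates / all input mates.
overall_rate = 100 * (left_mapped + right_mapped) / (2 * input_reads)

# Assumption: concordant rate = (aligned pairs - discordant pairs) / input pairs.
concordant_rate = 100 * (aligned_pairs - discordant_pairs) / input_reads

print(f"left {left_rate:.1f}%, right {right_rate:.1f}%")
print(f"overall {overall_rate:.1f}%, concordant {concordant_rate:.1f}%")
```

Note that every rate is taken relative to "Input", which equals the `_reads_in` value from prep_reads.info, not the `_reads_out` value.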


    In align_summary.txt, I understand that the "Input" and "Mapped" numbers differ because some reads could not be mapped to the reference genome. That part is fine.

    But for prep_reads.info, I do not know why the "_reads_out" numbers differ from the "_reads_in" numbers. And if this difference is due to unmapped reads, why is it not equal to the difference between the "Input" and "Mapped" numbers in align_summary.txt?

    Differences:

              prep_reads.info                  align_summary.txt
    left      24995053 - 24994132 = 921        24995053 - 22715900 = 2279153
    right     24995053 - 24994422 = 631        24995053 - 22310498 = 2684555



    The difference is due to filtering for things such as read length. Some reads are too short, so they're excluded. This occurs before any mapping takes place. 

            I see. I did not know that. I thought short reads could only be removed with Trimmomatic (MINLEN). I did not know mapping tools also discard some reads.

     

    Well, "things such as read length". It's filtering for other things too. In your case, one of these "other things" is what's causing additional reads to get dropped, since your input is all 90 bases.
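To make the relationship between the two files concrete, the following Python sketch splits the reads missing from "Mapped" into two groups: those dropped by the pre-mapping filter and those that passed the filter but did not align. It assumes that only the `_reads_out` reads are handed to the aligner, which is what the answer above describes. The variable names are illustrative only.

```python
# Numbers quoted from prep_reads.info and align_summary.txt in the question.
reads_in = {"left": 24995053, "right": 24995053}
reads_out = {"left": 24994132, "right": 24994422}
mapped = {"left": 22715900, "right": 22310498}

breakdown = {}
for side in ("left", "right"):
    filtered = reads_in[side] - reads_out[side]    # dropped before mapping
    unaligned = reads_out[side] - mapped[side]     # passed the filter, did not map
    total_missing = reads_in[side] - mapped[side]  # "Input" minus "Mapped"
    # The two groups together account for every read missing from "Mapped".
    assert filtered + unaligned == total_missing
    breakdown[side] = (filtered, unaligned, total_missing)

for side, (filtered, unaligned, total_missing) in breakdown.items():
    print(f"{side}: {filtered} filtered + {unaligned} unaligned = {total_missing}")
```

So the two differences were never expected to match: the prep_reads.info difference is a small part of the align_summary.txt difference, and the rest consists of reads that were kept but could not be aligned.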

  • Original post: https://www.cnblogs.com/renping/p/7851803.html