  • Filtering low-quality Illumina reads with Trimmomatic

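    1. Download and install

    Download the binary release from the official site. After unpacking, the trimmomatic-<version>.jar file (for example trimmomatic-0.36.jar) is the program we need.

    Official site: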

    http://www.usadellab.org/cms/index.php?page=trimmomatic

    wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.38.zip

    unzip Trimmomatic-0.38.zip

    wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.36.zip
    unzip Trimmomatic-0.36.zip 
    [Trimmomatic-0.38]# tree
    .
    ├── adapters
    │   ├── NexteraPE-PE.fa
    │   ├── TruSeq2-PE.fa
    │   ├── TruSeq2-SE.fa
    │   ├── TruSeq3-PE-2.fa
    │   ├── TruSeq3-PE.fa
    │   └── TruSeq3-SE.fa
    ├── LICENSE
    └── trimmomatic-0.38.jar
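
    A quick way to confirm that the archive unpacked as shown in the listing above is to look for the jar and the adapters directory. The following is only a sketch: check_trimmomatic_dir is a hypothetical helper written for this note, not something shipped with Trimmomatic.

```shell
#!/usr/bin/env bash
# Sketch: verify that an unpacked Trimmomatic directory has the expected layout.
# check_trimmomatic_dir is a hypothetical helper, not part of Trimmomatic.
check_trimmomatic_dir() {
    local dir="$1"
    local jar
    # The jar name carries the version number, so match it with a glob.
    for jar in "$dir"/trimmomatic-*.jar; do
        if [ -f "$jar" ] && [ -d "$dir/adapters" ]; then
            # Print the path of the jar so the caller can reuse it.
            printf '%s\n' "$jar"
            return 0
        fi
    done
    echo "no trimmomatic-*.jar with an adapters/ directory under $dir" >&2
    return 1
}
```

    For example, jar=$(check_trimmomatic_dir ./Trimmomatic-0.38) stores the jar path for later use and fails if the directory is incomplete.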
     

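    2. Running the software

    Trimmomatic only applies the steps listed on the command line. In most cases the commonly used settings shown below are sufficient; for details see the official site: http://www.usadellab.org/cms/?page=trimmomatic
    Run the program with these settings: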

    sudo java -jar trimmomatic-0.36.jar PE \
      -phred33 ~/SRR733/SRR2854733_1.fastq ~/SRR733/SRR2854733_2.fastq \
       ~/SRR733/clsseq/SRR2854733_1_paired.fq ~/SRR733/clsseq/SRR2854733_1_unpaired.fq \
       ~/SRR733/clsseq/SRR2854733_2_paired.fq ~/SRR733/clsseq/SRR2854733_2_unpaired.fq \
       ILLUMINACLIP:/usr/local/src/Trimmomatic/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:30:10 \
       LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 HEADCROP:8 MINLEN:36
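
    Because the paired-end command takes two inputs and four outputs in a fixed order, it can help to assemble it from variables and inspect it before running. The sketch below only prints the command. The function name build_pe_cmd and the array name PE_CMD are illustrative assumptions; the paths and steps are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Sketch: assemble the paired-end command from variables and print it
# instead of running it. build_pe_cmd and PE_CMD are hypothetical names.
build_pe_cmd() {
    local jar="$1" sample="$2" in_dir="$3" out_dir="$4" adapters="$5"
    # Argument order for PE mode: two inputs, then paired/unpaired outputs
    # for read 1, then paired/unpaired outputs for read 2, then the steps.
    PE_CMD=(
        java -jar "$jar" PE -phred33
        "$in_dir/${sample}_1.fastq" "$in_dir/${sample}_2.fastq"
        "$out_dir/${sample}_1_paired.fq" "$out_dir/${sample}_1_unpaired.fq"
        "$out_dir/${sample}_2_paired.fq" "$out_dir/${sample}_2_unpaired.fq"
        "ILLUMINACLIP:${adapters}:2:30:10"
        LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 HEADCROP:8 MINLEN:36
    )
}

build_pe_cmd trimmomatic-0.36.jar SRR2854733 "$HOME/SRR733" "$HOME/SRR733/clsseq" \
    /usr/local/src/Trimmomatic/Trimmomatic-0.36/adapters/TruSeq3-PE.fa

# Print the command for inspection; to actually run it use: "${PE_CMD[@]}"
printf '%q ' "${PE_CMD[@]}"; echo
```

    Keeping the command in an array preserves each argument exactly, even if a path contains spaces.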
    

    Output:

    Input Read Pairs: 23396043 
    Both Surviving: 20842668 (89.09%)
    Forward Only Surviving: 2537100 (10.84%)
    Reverse Only Surviving: 13969 (0.06%)
    Dropped: 2306 (0.01%)
    TrimmomaticPE: Completed successfully
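
    The percentages in this summary are each category's count divided by the number of input read pairs, and the four categories should add up to the input total. A small sketch to recompute them, where pct is a hypothetical helper and the counts are copied from the output above:

```shell
#!/usr/bin/env bash
# Sketch: recompute the percentages in the summary from the raw counts.
# pct is a hypothetical helper; the counts are copied from the run output.
pct() {
    # $1 = count, $2 = total; prints the percentage with two decimals.
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", 100 * n / t }'
}

total=23396043
both=20842668
fwd=2537100
rev=13969
dropped=2306

echo "Both surviving:         $(pct "$both" "$total")%"
echo "Forward only surviving: $(pct "$fwd" "$total")%"
echo "Reverse only surviving: $(pct "$rev" "$total")%"
echo "Dropped:                $(pct "$dropped" "$total")%"

# The four categories should account for every input pair.
echo "Sum of categories: $(( both + fwd + rev + dropped )) of $total"
```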

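    3. Common parameters

    PE/SE
        Selects paired-end or single-end mode; the input and output file arguments differ slightly between the two.
    -threads
        Number of threads to use.
    -phred33
        Sets the base quality encoding; the alternative is -phred64.
    ILLUMINACLIP:TruSeq3-PE.fa:2:30:10
        Removes adapter sequences. The fields after the step name are, in order: the FASTA file of adapter sequences : the maximum number of seed mismatches allowed : the match score threshold in palindrome mode : the match score threshold in simple mode.
    LEADING:3
        Removes bases from the start of the read while their quality is below 3.
    TRAILING:3
        Removes bases from the end of the read while their quality is below 3.
    SLIDINGWINDOW:4:15
        Slides a window along the read from the 5' end; once the average base quality within the window drops below the threshold, the read is cut at that point. Here the window is 4 bases wide and the read is cut when the window's average quality is below 15.
    MINLEN:50
        Minimum read length; shorter reads are dropped (50 here; the command above uses MINLEN:36).
    CROP:<length>
        Keeps the read up to the specified length, removing bases from the end.
    HEADCROP:<length>
        Removes the specified number of bases from the start of the read.
    TOPHRED33
        Converts base qualities to phred33 encoding.
    TOPHRED64
        Converts base qualities to phred64 encoding.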
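
    To make the -phred33 encoding and the SLIDINGWINDOW:4:15 step more concrete, here is a simplified model applied to a single quality string. It is a sketch under stated assumptions, not Trimmomatic's own code: it keeps everything before the first window whose average falls below the threshold, and the real tool may treat edge cases differently. The function names phred33 and sliding_window_keep are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: a simplified model of SLIDINGWINDOW on one Phred+33 quality string.
# Not Trimmomatic's own code; its handling of edge cases may differ.

# Decode one quality character to its Phred score (ASCII code minus 33).
phred33() {
    local code
    printf -v code '%d' "'$1"
    echo $(( code - 33 ))
}

# Print how many leading bases to keep: everything before the first window
# whose average quality is below the threshold.
# $1 = quality string, $2 = window size, $3 = required average quality
sliding_window_keep() {
    local qual="$1" win="$2" min="$3"
    local len=${#qual} i j sum
    for (( i = 0; i + win <= len; i++ )); do
        sum=0
        for (( j = i; j < i + win; j++ )); do
            sum=$(( sum + $(phred33 "${qual:j:1}") ))
        done
        # Compare sums to avoid integer division: avg < min <=> sum < min * win
        if (( sum < min * win )); then
            echo "$i"
            return 0
        fi
    done
    # No window failed, so the whole read is kept.
    echo "$len"
}

# 'I' decodes to 40 and '#' to 2 in Phred+33.
sliding_window_keep 'IIIIIIII####' 4 15
```

    In this example the first window whose average drops below 15 starts at position 7, so the model keeps the first 7 bases.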

    Question: Which TruSeq adapter file should Trimmomatic use when removing TruSeq adapters?
    It depends mostly on which TruSeq protocol was used (V2, which is old at this stage and usually means data from the GAII, or V3, which covers everything from the HiSeq or later machines) and on whether the data is single-end or paired-end (SE or PE). The only exception is TruSeq3-PE, which has two sets: TruSeq3-PE.fa works fine for high-quality libraries, but TruSeq3-PE-2.fa contains some additional sequences that find partial adapters in unusual locations or orientations.
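
    The rule of thumb in this answer can be written as a small lookup. This is only a sketch: pick_adapters and its arguments are hypothetical, and the file names are those found in the adapters directory of the Trimmomatic release.

```shell
#!/usr/bin/env bash
# Sketch: map protocol version and library layout to an adapter file name.
# pick_adapters is a hypothetical helper, not part of Trimmomatic.
# $1 = TruSeq protocol version (2 or 3)
# $2 = layout (SE or PE)
# $3 = optional "extended" to pick the larger TruSeq3 paired-end set
pick_adapters() {
    local version="$1" layout="$2" variant="${3:-}"
    case "${version}-${layout}" in
        2-SE) echo "TruSeq2-SE.fa" ;;
        2-PE) echo "TruSeq2-PE.fa" ;;
        3-SE) echo "TruSeq3-SE.fa" ;;
        3-PE)
            if [ "$variant" = "extended" ]; then
                # Extra sequences for partial adapters in unusual positions.
                echo "TruSeq3-PE-2.fa"
            else
                echo "TruSeq3-PE.fa"
            fi
            ;;
        *)
            echo "unknown combination: version=$version layout=$layout" >&2
            return 1
            ;;
    esac
}

pick_adapters 3 PE
```

    The result can then be joined with the adapters directory path and passed to the ILLUMINACLIP step.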

    ref:
    https://www.jianshu.com/p/7b5591673255
    https://www.biostars.org/p/323087/

     
     
  • Original article: https://www.cnblogs.com/emanlee/p/10325255.html