  • 【Workflows】 WGS/WES Mapping to Variant Calls


    WGS/WES Mapping to Variant Calls - Version 1.0

    This is a WGS/WES workflow published on the official htslib website. The Sanger website explains how htslib, samtools, and bcftools relate to one another:

    HTSlib is a software library for manipulating various sequencing and variant file formats: SAM, BAM, CRAM, VCF, and BCF. SAMtools and BCFtools are applications built around HTSlib, performing format conversion, file merging and splitting, sorting, variant calling, and much more.

    The workflow has three main steps:

    • Mapping
    • Improvement
    • Variant Calling

    Mapping

    bwa index <ref.fa>
    bwa mem -R '@RG\tID:foo\tSM:bar\tLB:library1' <ref.fa> <read1.fq> <read2.fq> > <lane.sam>  # the official page lists <read1.fa> twice; for paired-end data this should be read1.fq and read2.fq
    samtools fixmate -O bam <lane.sam> <lane_fixmate.bam>
    samtools sort -O bam -o <lane_sorted.bam> -T </tmp/lane_temp> <lane_fixmate.bam>
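
    The three mapping commands above can also be chained so that no intermediate SAM file is written. The following is a minimal sketch, not the official recipe: make_rg and map_lane are hypothetical helper names, and it assumes bwa and samtools are on the PATH and that fixmate and sort read and write "-" as standard input/output. Note that bwa mem expects the two-character sequence \t in the -R string and converts it to a tab itself.

```shell
#!/usr/bin/env bash
# make_rg and map_lane are hypothetical helper names used for illustration only.

# Print a read-group line with literal "\t" separators; bwa mem converts them to tabs.
make_rg() {
    printf '@RG\\tID:%s\\tSM:%s\\tLB:%s' "$1" "$2" "$3"
}

# Map one lane and stream the alignments straight into fixmate and sort.
# Arguments: reference, read 1, read 2, output BAM, read-group string.
map_lane() {
    local ref=$1 r1=$2 r2=$3 out=$4 rg=$5
    bwa mem -R "$rg" "$ref" "$r1" "$r2" \
        | samtools fixmate -O bam - - \
        | samtools sort -O bam -o "$out" -T "${out%.bam}_temp" -
}

# Show the read-group string that would be passed to bwa mem.
make_rg foo bar library1; echo
# prints: @RG\tID:foo\tSM:bar\tLB:library1
```

    A call would then look like: map_lane ref.fa read1.fq read2.fq lane_sorted.bam "$(make_rg foo bar library1)"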
    

    Improvement

    # realign gapped alignment
    java -Xmx2g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R <ref.fa> -I <lane.bam> -o <lane.intervals> --known <bundle/b38/Mills1000G.b38.vcf>
    java -Xmx4g -jar GenomeAnalysisTK.jar -T IndelRealigner -R <ref.fa> -I <lane.bam> -targetIntervals <lane.intervals> --known <bundle/b38/Mills1000G.b38.vcf> -o <lane_realigned.bam>
    
    # BQSR
    java -Xmx4g -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R <ref.fa> -knownSites <bundle/b38/dbsnp_142.b38.vcf> -I <lane.bam> -o <lane_recal.table>
    java -Xmx2g -jar GenomeAnalysisTK.jar -T PrintReads -R <ref.fa> -I <lane.bam> -BQSR <lane_recal.table> -o <lane_recal.bam>
    
    # MarkDuplicates
    java -Xmx2g -jar MarkDuplicates.jar VALIDATION_STRINGENCY=LENIENT INPUT=<lane_1.bam> INPUT=<lane_2.bam> INPUT=<lane_3.bam> OUTPUT=<library.bam>
    
    samtools merge <sample.bam> <library1.bam> <library2.bam> <library3.bam>
    samtools index <sample.bam>
    
    # realign around indels at the sample level (optional)
    java -Xmx2g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R <ref.fa> -I <sample.bam> -o <sample.intervals> --known <bundle/b38/Mills1000G.b38.vcf>
    java -Xmx4g -jar GenomeAnalysisTK.jar -T IndelRealigner -R <ref.fa> -I <sample.bam> -targetIntervals <sample.intervals> --known <bundle/b38/Mills1000G.b38.vcf> -o <sample_realigned.bam>
    
    samtools index <sample_realigned.bam>
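
    MarkDuplicates takes one INPUT= argument per lane, so the command line grows with the number of lanes in a library. As a rough sketch, the command can be assembled from a list of lane BAMs and printed for review before it is run. markdup_cmd is a hypothetical helper name; the jar name and options are copied from the listing above, and the sketch assumes file names without spaces.

```shell
#!/bin/sh
# markdup_cmd is a hypothetical helper; it only prints the command, it does not run it.
# Usage: markdup_cmd <library.bam> <lane_1.bam> [<lane_2.bam> ...]
markdup_cmd() {
    out=$1
    shift
    printf 'java -Xmx2g -jar MarkDuplicates.jar VALIDATION_STRINGENCY=LENIENT'
    # One INPUT= argument per lane-level BAM.
    for bam in "$@"; do
        printf ' INPUT=%s' "$bam"
    done
    printf ' OUTPUT=%s\n' "$out"
}

markdup_cmd library.bam lane_1.bam lane_2.bam lane_3.bam
# prints: java -Xmx2g -jar MarkDuplicates.jar VALIDATION_STRINGENCY=LENIENT INPUT=lane_1.bam INPUT=lane_2.bam INPUT=lane_3.bam OUTPUT=library.bam
```

    Printing the command first makes it easy to check that every lane of the library is included before the library-level BAMs are merged per sample.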
    

    Variant Calling

    bcftools mpileup -Ou -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam> | bcftools call -vmO z -o <study.vcf.gz>
    
    # keep the intermediate BCF so it can be examined (optional)
    bcftools mpileup -Ob -o <study.bcf> -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam>
    bcftools call -vmO z -o <study.vcf.gz> <study.bcf>
    
    tabix -p vcf <study.vcf.gz>
    
    bcftools stats -F <ref.fa> -s - <study.vcf.gz> > <study.vcf.gz.stats>
    mkdir plots
    plot-vcfstats -p plots/ <study.vcf.gz.stats>
    
    bcftools filter -O z -o <study_filtered.vcf.gz> -s LOWQUAL -i'%QUAL>10' <study.vcf.gz>
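
    The filter step above is a soft filter: records that fail the QUAL>10 test are kept but labelled LOWQUAL in the FILTER column instead of being dropped. A quick way to see its effect is to tally that column. The sketch below is an illustration only: count_filters is a hypothetical helper name, it reads uncompressed VCF text on standard input, and the three records are made-up sample data.

```shell
#!/bin/sh
# count_filters is a hypothetical helper: tally the FILTER column (field 7) of VCF text.
count_filters() {
    awk -F '\t' '!/^#/ { n[$7]++ } END { for (f in n) print f, n[f] }' | sort
}

# Made-up records, only to show the shape of the output.
{
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
    printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\n'
    printf 'chr1\t200\t.\tC\tT\t5\tLOWQUAL\t.\n'
    printf 'chr1\t300\t.\tG\tA\t30\tPASS\t.\n'
} | count_filters
# prints:
# LOWQUAL 1
# PASS 2
```

    On real data the input would come from something like: bcftools view <study_filtered.vcf.gz> | count_filters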
    
  • Original post: https://www.cnblogs.com/jessepeng/p/12579674.html