  • 【Workflows】 WGS/WES Mapping to Variant Calls


    WGS/WES Mapping to Variant Calls - Version 1.0

    This is a WGS/WES workflow published on the HTSlib website. For how HTSlib, SAMtools, and BCFtools relate to one another, see the explanation on the Sanger website:

    HTSlib is a software library for manipulating various sequencing and variant file formats: SAM, BAM, CRAM, VCF, and BCF. SAMtools and BCFtools are applications built around HTSlib, performing format conversion, file merging and splitting, sorting, variant calling, and much more.

    The workflow has three main steps:

    • Mapping
    • Improvement
    • Variant Calling
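
Before running the three steps, it can help to confirm that the programs they call are installed. The following is a minimal sketch, not part of the original workflow: the helper name `missing_tools` is hypothetical, and the tool list is simply collected from the commands used in this post.

```shell
# Print each required tool that is not found on PATH, one per line.
# missing_tools is a hypothetical helper name used for illustration.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tool names taken from the commands in this post.
missing=$(missing_tools bwa samtools bcftools tabix java plot-vcfstats)
if [ -n "$missing" ]; then
    printf 'missing tools:\n%s\n' "$missing" >&2
fi
```

The check only looks for executables on `PATH`; the GATK and Picard jar files (`GenomeAnalysisTK.jar`, `MarkDuplicates.jar`) are passed to `java -jar` by path and would need a separate existence test.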

    Mapping

    bwa index <ref.fa>
    bwa mem -R '@RG\tID:foo\tSM:bar\tLB:library1' <ref.fa> <read1.fq> <read2.fq> > lane.sam  # the official page lists <read1.fa> twice; presumably read1.fq and read2.fq are meant
    samtools fixmate -O bam <lane.sam> <lane_fixmate.bam>
    samtools sort -O bam -o <lane_sorted.bam> -T </tmp/lane_temp> <lane_fixmate.bam>
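
The `-R` argument of `bwa mem` above hard-codes the read group (`ID:foo`, `SM:bar`, `LB:library1`). When several lanes and libraries are mapped, the header line can be built per lane instead. This is a small sketch under that assumption; the function name `make_rg` and the example IDs are hypothetical.

```shell
# Build a read-group header line for `bwa mem -R`.
# The fields are joined by a literal backslash followed by "t",
# which bwa itself converts into tab characters.
# make_rg is a hypothetical helper name used for illustration.
make_rg() {
    # $1 = read group ID, $2 = sample name, $3 = library name
    printf '@RG\\tID:%s\\tSM:%s\\tLB:%s' "$1" "$2" "$3"
}

# Intended usage (placeholders as in the workflow above):
#   bwa mem -R "$(make_rg lane1 sampleA library1)" <ref.fa> <read1.fq> <read2.fq> > lane.sam
```

Keeping `SM` identical across all lanes of one sample and `LB` identical across all lanes of one library matters later on: duplicate marking works per library, and variant calling groups reads per sample.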
    

    Improvement

    # realign gapped alignment
    java -Xmx2g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R <ref.fa> -I <lane.bam> -o <lane.intervals> --known <bundle/b38/Mills1000G.b38.vcf>
    java -Xmx4g -jar GenomeAnalysisTK.jar -T IndelRealigner -R <ref.fa> -I <lane.bam> -targetIntervals <lane.intervals> --known <bundle/b38/Mills1000G.b38.vcf> -o <lane_realigned.bam>
    
    # BQSR
    java -Xmx4g -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R <ref.fa> -knownSites <bundle/b38/dbsnp_142.b38.vcf> -I <lane.bam> -o <lane_recal.table>
    java -Xmx2g -jar GenomeAnalysisTK.jar -T PrintReads -R <ref.fa> -I <lane.bam> --BQSR <lane_recal.table> -o <lane_recal.bam>
    
    # MarkDuplicates
    java -Xmx2g -jar MarkDuplicates.jar VALIDATION_STRINGENCY=LENIENT INPUT=<lane_1.bam> INPUT=<lane_2.bam> INPUT=<lane_3.bam> OUTPUT=<library.bam>
    
    samtools merge <sample.bam> <library1.bam> <library2.bam> <library3.bam>
    samtools index <sample.bam>
    
    # realign indels (optional)
    java -Xmx2g -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R <ref.fa> -I <sample.bam> -o <sample.intervals> --known <bundle/b38/Mills1000G.b38.vcf>
    java -Xmx4g -jar GenomeAnalysisTK.jar -T IndelRealigner -R <ref.fa> -I <sample.bam> -targetIntervals <sample.intervals> --known <bundle/b38/Mills1000G.b38.vcf> -o <sample_realigned.bam>
    
    samtools index <sample_realigned.bam>
    

    Variant Calling

    bcftools mpileup -Ou -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam> | bcftools call -vmO z -o <study.vcf.gz>
    
    # keep the intermediate BCF so it can be examined (optional)
    bcftools mpileup -Ob -o <study.bcf> -f <ref.fa> <sample1.bam> <sample2.bam> <sample3.bam>
    bcftools call -vmO z -o <study.vcf.gz> <study.bcf>
    
    tabix -p vcf <study.vcf.gz>
    
    bcftools stats -F <ref.fa> -s - <study.vcf.gz> > <study.vcf.gz.stats>
    mkdir plots
    plot-vcfstats -p plots/ <study.vcf.gz.stats>
    
    bcftools filter -O z -o <study_filtered.vcf.gz> -s LOWQUAL -i'%QUAL>10' <study.vcf.gz>
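
With `-s LOWQUAL`, `bcftools filter` does not drop the records that fail `-i'%QUAL>10'`; it keeps them and writes `LOWQUAL` into the FILTER column. A quick way to see the effect is to tally the FILTER values. The sketch below does this with `awk` on uncompressed VCF text; the function name `count_filters` is hypothetical.

```shell
# Tally the values of the FILTER column (column 7) of a VCF read from stdin.
# Header lines starting with "#" are skipped.
# count_filters is a hypothetical helper name used for illustration.
count_filters() {
    awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' | sort
}

# Intended usage (placeholder as in the workflow above):
#   bcftools view <study_filtered.vcf.gz> | count_filters
```

Each output line holds a FILTER value and its record count, separated by a tab, so a large `LOWQUAL` share relative to `PASS` is easy to spot.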
    
  • Original post: https://www.cnblogs.com/jessepeng/p/12579674.html