zoukankan      html  css  js  c++  java
  • ASE分析

    1、Prepare necessary input files(可以参考上次的博客http://www.cnblogs.com/renping/p/7391028.html)

         1)对fq1和fq2合并

               cat fq1 fq2

         2)对bam 文件转换成psl格式

              /share/nas2/genome/biosoft/Python/2.7.8/bin/python /share/nas1/wenyh/develop/tools/Au-public-master/iron/utilities/sam_to_psl.py  -r transcript.fa T16.bam   >T16.psl

         3)gtf format convert to gpd format

             /share/nas1/wenyh/develop/tools/gtfToGenePred transcript.gtf -genePredExt transcript.gpd.tmp

             awk '{print 0" "$0}'transcript.gpd.tmp >transcript.gpd.tmp2 

             /share/nas1/wenyh/develop/pacbio/IDP-ASE/julia/bin/julia /home/wenyh/.julia/v0.4/IDPASE/scripts/convert_gpd.jl transcript.gpd.tmp2 >transcript.gpd.tmp3 

         4)vcf注释和选杂合的vcf文件

                 注释vcf文件。(参考博客:http://www.cnblogs.com/renping/p/7467348.html)

                 awk '$10!~/1/1/;$10!~/././{print}'|le >final.snp.anno.vcf1                   ##筛选杂合

                le final.snp.anno.vcf1|grep -v '#'|cut -f 1 |sort |uniq -c | awk '{print $2,$1}'|less -S|sort -k 2nr|le >Snp.distribution

    2、Prepare Gene level data

         1) mkdir temp/; mkdir gene_files; mkdir isoform_files; mkdir gene_out; mkdir isoform_out;

         2) for i in `le snp.distribution |awk '$1<10 {print $2}'|le`; do echo "/share/nas1/yangch/tools/julia/bin/julia -p 4 /home/yangch/.julia/v0.4/IDPASE/src/prep_runs.jl

           -a /share/nas1/yangch/RENPP/out/T19.psl

           -g /share/nas1/yangch/RENPP/out/transcript.gpd.tmp3

           -v /share/nas1/yangch/RENPP/out/final.snp.anno.vcf1

           -q /share/nas1/yangch/RENPP/out/T19.fq

           -d /share/nas1/yangch/RENPP/out/temp

           -c ${i}

           -f 1

                -o /share/nas1/yangch/RENPP/out/gene_files/

                -p T19 "; done >A1.sh                                                          #####Prepare Gene level data

        3) for i in `ls /share/nas1/yangch/RENPP/out/gene_files/|perl -lne '{next if /^s+$/;/T19_(reads|true)_(.*).txt/;print $2}'|sort|uniq|less`;

                    do echo "/share/nas1/yangch/tools/julia/bin/julia -p 4 /home/yangch/.julia/v0.4/IDPASE/scripts//phase_by_loci_sub.jl

                    -t /share/nas1/yangch/RENPP/out/gene_files/T19_true_${i}.txt

                    -a /share/nas1/yangch/RENPP/out/gene_files/T19_reads_${i}.txt

                    -o /share/nas1/yangch/RENPP/out/gene_out/

                    -l 1

                    -r ${i}

                    -i 10000

                    -b 1000

                    -c 4

                   -d /home/yangch/.julia/v0.4/IDPASE/scripts/

                   -n SGS

                   -m 1 0

                   -s 1.0"; done >to_run_curr.sh                                                 #### Get commands to run each gene individually

                   

          4)  Concatenate all gene level results

                  find gene_out/ -name "REAL*" | xargs cat > gene_out/gene_results.txt

          

  • 相关阅读:
    基础语法 -实验楼
    JavaSE案例-Bank
    初识Java
    Java学习大纲-0412更新
    增量法
    蛮力法
    Host‘116.77.33.xx’is not allowed to connect to this MySQL server
    Maven坐标
    HotSpot虚拟机对象创建
    程序计数器为什么是线程私有的?
  • 原文地址:https://www.cnblogs.com/renping/p/7488170.html
Copyright © 2011-2022 走看看