  • Similarities and differences among the Nr, GenBank, RefSeq, and UniProt databases

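    Some papers align reads to the RefSeq transcriptome when doing differential expression (DEG) analysis. I have not yet worked out how this differs from aligning directly to a conventional transcriptome.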

    Paper: Single-Cell Transcriptome Analysis Reveals Dynamic Changes in lncRNA Expression during Reprogramming

    Methods: For differential expression analysis, we aligned reads against the refSeq mouse transcriptome using Bowtie version 0.12.7 (Langmead et al., 2009). Expression levels were then estimated using eXpress (Roberts and Pachter, 2013) (version 1.3.0), with gene-level effective counts and RPKM values derived from the sum of the corresponding values for all isoforms of a gene.
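The last step of the quoted methods (gene-level values as the sum over all isoforms of a gene) can be sketched in a few lines. This is only an illustration of the arithmetic, not the paper's actual pipeline: the function name `sum_isoforms_per_gene`, the transcript and gene identifiers, and the numbers are all hypothetical.

```python
from collections import defaultdict


def sum_isoforms_per_gene(isoform_values, transcript_to_gene):
    """Sum per-isoform values (e.g. effective counts or RPKM) into gene-level totals."""
    gene_totals = defaultdict(float)
    for transcript_id, value in isoform_values.items():
        gene_id = transcript_to_gene[transcript_id]
        gene_totals[gene_id] += value
    return dict(gene_totals)


# Hypothetical toy data: two isoforms of GeneA and one isoform of GeneB.
isoform_eff_counts = {"tx1": 10.0, "tx2": 5.5, "tx3": 2.0}
transcript_to_gene = {"tx1": "GeneA", "tx2": "GeneA", "tx3": "GeneB"}

gene_counts = sum_isoforms_per_gene(isoform_eff_counts, transcript_to_gene)
print(gene_counts)
```

The same function would apply unchanged to per-isoform RPKM values, since the methods text describes both as plain sums.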

    What does the RefSeq database look like?

    ftp://ftp.ncbi.nlm.nih.gov/refseq/

    Go into the mouse directory:

    mRNA_Prot

    mRNA_Prot directory
       Contents: organisms-specific RefSeq transcript and protein data

         {org-name}.files.installed:
             reports the md5checksum and files included in the directory
             For example: /refseq/H_sapiens/mRNA_Prot/human.files.installed

       File Name Conventions:
         File name formats are as follows:
             common_name.#.molecule_type.format_type
         Multiple files may be provided for any given molecule and format type and file
         names include a numerical increment.  Files with the same numerical increment
         are related by content.

         For example, the files provided for human are named as:
             human.#.rna.fna.gz       -- fasta report for transcript records
             human.#.protein.faa.gz   -- fasta report for protein records
             human.#.rna.gbff.gz      -- flatfile report for transcript records
             human.#.protein.gpff.gz  -- flatfile report for protein records
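The naming convention in the README above (`common_name.#.molecule_type.format_type`) can be turned into a small parser. This is a minimal sketch: the regular expression only covers the four molecule/format combinations listed in the README, and the helper names `parse_refseq_file_name` and `group_by_increment` are hypothetical, not part of any NCBI tool.

```python
import re

# Matches names such as human.1.rna.fna.gz, following the README convention.
FILE_NAME_RE = re.compile(
    r"^(?P<common_name>[^.]+)\."
    r"(?P<increment>\d+)\."
    r"(?P<molecule_type>rna|protein)\."
    r"(?P<format_type>fna|faa|gbff|gpff)\.gz$"
)


def parse_refseq_file_name(file_name):
    """Split a file name into its parts; return None if it does not match."""
    match = FILE_NAME_RE.match(file_name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["increment"] = int(parts["increment"])
    return parts


def group_by_increment(file_names):
    """Group files sharing a numerical increment (related by content per the README)."""
    groups = {}
    for name in file_names:
        parts = parse_refseq_file_name(name)
        if parts is not None:
            groups.setdefault(parts["increment"], []).append(name)
    return groups


example_names = [
    "human.1.rna.fna.gz",
    "human.1.protein.faa.gz",
    "human.2.rna.fna.gz",
    "human.files.installed",  # not a data file, so it is skipped
]
parsed = parse_refseq_file_name("human.1.rna.fna.gz")
groups = group_by_increment(example_names)
print(parsed)
print(groups)
```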
    

      

    Download one of the rna.fna files; its contents look like this:

    >NM_001013372.2 Mus musculus neural regeneration protein (Nrp), mRNA
    CGGTCCAAGGAATTTTTCTGACAAACGCAATAGGCCGACCAGTACTGGAACGCAGTGCGCTTAGCCCCTTTATGGCGGAG
    GCTGCCATGTTAAAACGGAATGAATCGAAACCCTGGAGTCGTGACCCCGGAAGAACCTGCCAGAGCCGGAATTTCGAGTT
    CTGCTTCCGGGCCAAACTGTTGGCAGCCTCGAGATGGGGAAGATGGCGGCTGCTGTGGCTTCATTAGCCACGCTGGCTGC
    AGAGCCCAGAGAGGATGCTTTCCGGAAGCTTTTCCGCTTCTACCGGCAGAGCCGGCCGGGGACAGCGGACCTGGGAGCCG
    TCATCGACTTCTCAGAGGCGCACTTGGCTCGGAGCCCGAAGCCCGGCGTGCCCCAGGTAGGAAAGGAGGAGTAGTGTGTG
    CCAGCCTAGCGGCCGACTGGGCCACCCGAGACTGGGCCGCCTCCGGGCCGGCTTTGGAGGGAAGCCCCTGCTGGGCCTGT
    CCAGTGAGCTGTAATGTCGAGCGATGAGCGACCAGCTGCCTCGCTGTCCCAACGCTCTGGCCACGGCTTGTGCCTTGCCG
    CCATTTCCCCCAACCCACGCGGGCCACGGCTTGTGCCCTGCCGCCATTTCCCCCAACCCACGCGACCTTGCTAAAAAAAA
    AAAAAGAAAGAAAAGAAAAGAAAGAAAGAAAGAAAAAAATCTGGAAATTGCTTGTACCTCCTTAACTATCTGTTTAATAC
    TAATACGATATTTTGTGTAAAGCTCAGAAGAACATCTTCGTGGACGTTAGGGTGGCCTCATAACTTCAGATAAAAGCAGC
    CATTTAATAAGTCTCAAACCGTTAATCCGTTGGGCCTGAGACTCGATCGACCCTGTCTTCTCTGAGGCTTTGAAAGTAAA
    GGTAAAATTAGCAGGTTTTTTTCCTGAGAATCTAGGAGCCTGGAGAGATAGCTCAGTAATTAAGAGCATTTACCTACTGG
    TGTTCCCAAGAACACCAAGTAGATTTGGTTCCTTGCAGCCACGTGGCAGCTCACAGCCTTCTTGTAACTCTTCCGGAGGA
    TCAGACACCCTCTCTTGAGCTCCACAGGAGAGCACTCGTAGACATGTAAATAAACTTCTAAGCTAAATCTAAACAATTTA
    TGTACCCTCCCTATTTCTTCGTGATGAGAAGAAAGGGGCCAGAGGGTATG
    >NR_046233.2 Mus musculus 45S pre-ribosomal RNA (Rn45s), ribosomal RNA
    ACTGACACGCTGTCCTTTCCCTATTAACACTAAAGGACACTATAAAGAGACCCTTTCGATTTAAGGCTGTTTTGCTTGTC
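One visible feature of the excerpt above is the accession prefix in each header: `NM_` marks an mRNA record and `NR_` a non-coding RNA record, while `XM_` and `XR_` are the corresponding predicted (model) records. The sketch below reads FASTA headers and labels each accession by prefix. The sequences in the example are truncated to the first ten bases of the records shown above, and the function names are hypothetical.

```python
import io

# RefSeq RNA accession prefixes and what they denote.
REFSEQ_RNA_PREFIXES = {
    "NM_": "curated mRNA",
    "NR_": "curated non-coding RNA",
    "XM_": "predicted (model) mRNA",
    "XR_": "predicted (model) non-coding RNA",
}


def iter_fasta_headers(handle):
    """Yield (accession, description) for every header line in a FASTA stream."""
    for line in handle:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            accession, _, description = header.partition(" ")
            yield accession, description


def classify(accession):
    """Label an accession by its three-character prefix."""
    return REFSEQ_RNA_PREFIXES.get(accession[:3], "other")


# Headers copied from the excerpt; sequences truncated for brevity.
example = io.StringIO(
    ">NM_001013372.2 Mus musculus neural regeneration protein (Nrp), mRNA\n"
    "CGGTCCAAGG\n"
    ">NR_046233.2 Mus musculus 45S pre-ribosomal RNA (Rn45s), ribosomal RNA\n"
    "ACTGACACGC\n"
)
records = [(acc, classify(acc)) for acc, _ in iter_fasta_headers(example)]
print(records)
```

For a downloaded file, the same generator could be fed from `gzip.open(path, "rt")` instead of the in-memory example.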
    

      

    I still cannot see any difference!

    The RefSeq transcripts are a subset of the transcripts obtained from the GTF annotation.
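One way to check the subset claim for a given pair of files is to compare the accessions in the RefSeq FASTA with the `transcript_id` values in the GTF. The sketch below uses hypothetical toy data and assumes both files use the same identifier scheme; if the GTF carries identifiers from another source, an ID mapping step would be needed first.

```python
import re

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def gtf_transcript_ids(gtf_lines):
    """Collect transcript_id values from the attribute column of GTF lines."""
    ids = set()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID_RE.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def fasta_accessions(fasta_lines):
    """Collect the first word of every FASTA header line."""
    return {line[1:].split()[0] for line in fasta_lines if line.startswith(">")}


# Hypothetical toy annotation and FASTA; identifiers are made up.
gtf_lines = [
    "#!hypothetical toy annotation\n",
    'chrT\ttoy\texon\t1\t100\t.\t+\t.\tgene_id "GeneA"; transcript_id "TX_A.1";\n',
    'chrT\ttoy\texon\t1\t80\t.\t+\t.\tgene_id "GeneA"; transcript_id "TX_A.2";\n',
    'chrT\ttoy\texon\t200\t300\t.\t-\t.\tgene_id "GeneB"; transcript_id "TX_B.1";\n',
]
fasta_lines = [">TX_A.1 toy transcript\n", "ACGT\n", ">TX_B.1 toy transcript\n", "GGCC\n"]

annotation_ids = gtf_transcript_ids(gtf_lines)
refseq_ids = fasta_accessions(fasta_lines)
is_subset = refseq_ids <= annotation_ids
only_in_annotation = sorted(annotation_ids - refseq_ids)
print(is_subset, only_in_annotation)
```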
    

      

    I will expand on this in more detail later.

  • Original post: https://www.cnblogs.com/leezx/p/8654421.html