  • 2016-6-15 de novo literature reading

    I plan to read four de novo genome papers:

    1. Nature Biotechnology(2015) - [Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement](http://www.nature.com/nbt/journal/v33/n5/full/nbt.3207.html)
      
    2. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history
    3. Genomic analyses identify distinct patterns of selection in domesticated pigs and Tibetan wild boars
    4. Ground tit genome reveals avian adaptation to living at high altitudes in the Tibetan plateau

    1. Tetraploid Upland cotton genome

    Terms: allotetraploid (a tetraploid that combines two distinct ancestral genomes); Upland cotton (the common name of G. hirsutum)
    Vocabulary: structural rearrangements; gene loss; disrupted genes; sequence divergence; asymmetric evolution
    Methods:

    1. whole-genome shotgun (WGS) reads
    2. bacterial artificial chromosome (BAC)-end sequences
    3. genotype-by-sequencing genetic maps

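    Background: although sequencing has already been done, the exact donor species that led to the formation of the tetraploid cotton species 1–2 million years ago (MYA) no longer exists.
    Sequencing challenge: discriminating between homoeologous sequences (telling apart the corresponding sequences of the two subgenomes).
    Conventional approach: use the genome sequences of the ancestral diploids to guide assembly of the homoeologous chromosomes of the allopolyploid.
    Problem: the homoeologous relationships of many contigs and scaffolds remain ambiguous.
    Core strength of this study: long scaffolds (N50 = 1,600 kb), compared with Brassica napus (N50 = 764 kb), Nicotiana tabacum (N50 = 345–386 kb), wheat (contig N50 = 515–4,297 bp), and G. arboreum (N50 = 666 kb).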
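Since the comparison above rests entirely on N50, here is a minimal sketch of how that statistic is usually computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not values from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (not data from the paper); total length is 30.
toy_scaffolds = [2, 3, 4, 5, 6, 10]
print(n50(toy_scaffolds))  # 10 + 6 = 16 >= 15, so the N50 is 6
```

A larger N50 means that half of the assembled bases sit in longer pieces, which is why a scaffold N50 of 1,600 kb is presented as an advantage.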

    Methods:
    Genome sequencing data generation
    DNA was sheared with a Bioruptor sonication device for short-insert paired-end (PE) library construction,
    and with a Hydroshear DNA Shearing Device (Genomic Solutions Inc., Ann Arbor, MI, USA) for mate-pair library construction.
    Short-insert paired-end (180, 300, and 500 bp) and large-insert mate-pair libraries (2, 5, 10 kb) were prepared
    All libraries were sequenced at 2 × 100 bp on an Illumina HiSeq 2000 platform. In total, 843 Gb of DNA sequencing read data were generated for the genome assembly, representing approximately 337-fold coverage of the raw genome.
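As a quick sanity check on the figures just quoted, fold coverage is simply total sequenced bases divided by genome size, so the two reported numbers imply a genome size. The sketch below only rearranges that ratio; the function names are illustrative assumptions, and the only inputs are the 843 Gb and 337-fold values from the text.

```python
def fold_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a data volume and a reported fold coverage."""
    return total_bases / coverage


TOTAL_BASES = 843e9      # 843 Gb of read data, as quoted above
REPORTED_COVERAGE = 337  # approximate fold coverage, as quoted above

size = implied_genome_size(TOTAL_BASES, REPORTED_COVERAGE)
print(f"implied genome size: {size / 1e9:.2f} Gb")  # about 2.50 Gb
```

In other words, the quoted coverage corresponds to a genome of roughly 2.5 Gb.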

    BAC library construction and BAC end sequencing
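    Reads are short, so their position in the genome cannot be determined directly; what they yield is only contigs.
    A BAC library helps assemble longer sequences. In addition, library construction has some bias, some sequences have very high or very low GC content, and there are also repeat sequences, so a BAC library also helps recover these regions. The current mainstream approach is: mate-pair libraries + PE libraries with multiple insert sizes + a combination of sequencing platforms + FISH or BioNano optical maps.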
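To make the GC-content remark concrete, the sketch below computes GC fraction in sliding windows and flags windows that fall outside a chosen range. It is only an illustration: the function names, window size, and the 0.25/0.65 thresholds are assumptions of mine, not parameters from the paper.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq, window, step):
    """Return (start, gc_fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def extreme_windows(seq, window=100, step=50, low=0.25, high=0.65):
    """Windows whose GC fraction is below `low` or above `high` (assumed cutoffs)."""
    return [(s, gc) for s, gc in windowed_gc(seq, window, step) if gc < low or gc > high]


# Hypothetical toy sequence: an AT-only stretch followed by a GC-only stretch.
print(windowed_gc("AAAAGGGG", window=4, step=4))
```

Windows flagged this way are the kind of regions where short-read libraries tend to be underrepresented and where BAC sequences can help.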

    Genome size estimation

    Genome assembly, scaffolding and gap-closing

    Linkage map construction using the TM-1 × Hai7124 mapping population

    Correction of the TM-1 assembly using the SNP map and pseudomolecule chromosome construction

    Assessment of genome assembly quality by PE reads

    TM-1 assembly validation using mRNA sequences from the G. raimondii and G. hirsutum genomes

    TM-1 assembly validation using 36 completely sequenced BACs

    Gene prediction and annotation

    Transcription factors annotation

    Noncoding RNAs annotation

    TE annotation

    Identification of homoeologous gene sets and orthologous gene sets

    Estimation of divergence time

    Phylogenetic tree construction and evolution rate estimation

    Syntenic analysis and whole-genome alignment

    Positively selected genes (PSGs)

    Gene loss

    Genes involved in the ongoing process of gene loss

    2. Snub-nosed monkey genome

    3. Tibetan pig genome

    4. Ground tit genome

  • Original source: https://www.cnblogs.com/leezx/p/5587592.html