  • GTF Files

    I. GTF file format

     Fields must be tab-separated. Also, all but the final field in each feature line must contain a value; "empty" columns should be denoted with a '.'

      1.seqname - name of the chromosome or scaffold; chromosome names can be given with or without the 'chr' prefix. Important note: the seqname must be one used within Ensembl, i.e. a standard chromosome name or an Ensembl identifier such as a scaffold ID, without any additional content such as species or assembly. See the example GFF output below.

      2.source - name of the program that generated this feature, or the data source (database or project name)

      3.feature - feature type name, e.g. Gene, Variation, Similarity

      4.start - Start position of the feature, with sequence numbering starting at 1.

      5.end - End position of the feature, with sequence numbering starting at 1.

      6.score - A floating point value.

      7.strand - defined as + (forward) or - (reverse).

      8.frame - One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.

      9.attribute - A semicolon-separated list of tag-value pairs, providing additional information about each feature.
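
The nine fields above map naturally onto a small record type. The following is a minimal sketch, not a reference parser: the names `GtfRecord`, `parse_gtf_line` and `feature_length` are hypothetical, the sample line is invented for illustration, and the sketch assumes that `start` and `end` are both inclusive (so length is `end - start + 1`) and that lines beginning with `#` are comments.

```python
from typing import NamedTuple, Optional

GTF_COLUMNS = 9  # number of tab-separated fields in a feature line


class GtfRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int               # 1-based
    end: int                 # 1-based, assumed inclusive
    score: Optional[float]   # None when the column is '.'
    strand: str
    frame: Optional[int]     # None when the column is '.'
    attribute: str           # left unparsed in this sketch


def parse_gtf_line(line: str) -> Optional[GtfRecord]:
    """Parse one feature line; return None for blank lines and '#' comments."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != GTF_COLUMNS:
        raise ValueError(
            f"expected {GTF_COLUMNS} tab-separated fields, got {len(fields)}"
        )
    seqname, source, feature, start, end, score, strand, frame, attribute = fields
    return GtfRecord(
        seqname=seqname,
        source=source,
        feature=feature,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),  # '.' marks an empty column
        strand=strand,
        frame=None if frame == "." else int(frame),
        attribute=attribute,
    )


def feature_length(record: GtfRecord) -> int:
    """Length in bases, assuming 1-based inclusive coordinates."""
    return record.end - record.start + 1


# Invented example line, not taken from a real annotation file.
example = 'chrX\texample_source\texon\t100\t250\t.\t+\t0\tgene_id "GENE0001";'
record = parse_gtf_line(example)
print(record.feature, feature_length(record))  # exon 151
```

Returning `None` for an empty score or frame keeps the '.' placeholder from being confused with a real value of 0.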

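      1. Chromosome name.

      2. Source of the annotation, such as "Genescan" or "Genbank". It may be empty, in which case it is written as a "." (dot).

      3. Type of the annotation, such as Gene, cDNA or mRNA, or the corresponding SO (Sequence Ontology) accession number.

      4, 5. Start and end positions.

      7. Strand of the sequence: + for the sense strand, - for the antisense strand, ? for unknown.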

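      8. Reading frame: one of 0, 1 or 2. 0 means the first base of the feature is the first base of a codon, 1 means the second base of the feature is the first base of a codon, and 2 means the third base is.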

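      9. Annotation details given as a list of key-value pairs. In the convention described here (the one used by GFF3), a key and its value are joined by "=", pairs are separated by ";", and a key may have several values separated by ",". If the text itself contains a tab or any of the characters ",=;", it must be escaped using URL escaping rules, e.g. a tab is written as %09. Keys are case-sensitive; keys that begin with an uppercase letter are predefined and may be referenced by other annotation records later on. Note that GTF files themselves usually write each pair as a tag followed by a double-quoted value and a semicolon (tag "value";) rather than with "=".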
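
The key=value convention described in item 9 can be sketched as follows. This is an illustration under assumptions rather than a complete parser: the helper name `parse_attributes` and the sample string are hypothetical, and the sketch handles only the "=" form. It does not handle the quoted `tag "value";` form that GTF files typically use.

```python
from typing import Dict, List
from urllib.parse import unquote


def parse_attributes(column: str) -> Dict[str, List[str]]:
    """Split a 'key=value;key=v1,v2' attribute column into a dict of value lists."""
    attributes: Dict[str, List[str]] = {}
    for pair in column.strip().split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing ';'
        key, separator, value = pair.partition("=")
        if not separator:
            raise ValueError(f"attribute without '=': {pair!r}")
        # Split on ',' first, then undo URL escaping, so an escaped comma
        # (%2C) inside a value is not mistaken for a value separator.
        attributes[unquote(key)] = [unquote(item) for item in value.split(",")]
    return attributes


# Invented example: 'Note' contains an escaped ';' (%3B) and an escaped tab (%09).
parsed = parse_attributes("ID=gene1;Alias=a,b;Note=x%3By%09z")
print(parsed["Alias"])  # ['a', 'b']
```

Keys are kept exactly as written, since they are case-sensitive.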
     
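     The source column identifies the organization or pipeline that produced the annotation, for example ensembl, ensembl_havana, havana, insdc or mirbase. For background on these databases, see this blog post: https://www.cnblogs.com/always-fight/p/9002252.html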
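
To see which providers appear in a given file, one option is to tally the second column. The sketch below assumes tab-separated lines with `#` comment lines; the function name `count_sources` and the sample lines are hypothetical and invented for illustration.

```python
from collections import Counter
from typing import Iterable


def count_sources(lines: Iterable[str]) -> Counter:
    """Count how often each value appears in the source column (column 2)."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a feature line
        counts[fields[1]] += 1
    return counts


# Invented sample lines; a real file would be passed as an open file handle.
sample = [
    "# example header\n",
    '1\thavana\tgene\t100\t900\t.\t+\t.\tgene_id "GENE0001";\n',
    '1\thavana\texon\t100\t250\t.\t+\t.\tgene_id "GENE0001";\n',
    '1\tensembl\tgene\t2000\t2600\t.\t-\t.\tgene_id "GENE0002";\n',
]
print(count_sources(sample).most_common())  # [('havana', 2), ('ensembl', 1)]
```

For a file on disk, `count_sources(open(path))` would work the same way, since a file handle iterates line by line.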
  • Original article: https://www.cnblogs.com/always-fight/p/9318332.html