E-value identity bitscore

E-value:

The E-value (expect value) is the number of matches with an equal or better score that would be expected to occur purely by chance in a database of the given size. The lower the E-value, the less likely the database match is a result of random chance and therefore the more significant the match is.

Empirical interpretation of the E-value is as follows:

If E-value < 1e-50 (that is, 1 × 10^-50), there should be an extremely high confidence that the database match is a result of homologous relationships.

If E-value is between 0.01 and 1e-50, the match can be considered a result of homology.

If E-value is between 10 and 0.01, the match is considered not significant, but may hint at a tentative remote homology relationship. Additional evidence is needed to confirm the tentative relationship.

If E-value > 10, the sequences under consideration are either unrelated or related by extremely distant relationships that fall below the limit of detection with the current method.
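The rule-of-thumb ranges above can be sketched as a small lookup. This is only an illustration: the function name `interpret_evalue` and the category strings are made up for this example, and the handling of values that fall exactly on a boundary (0.01, 10, 1e-50) is an assumption, since the text does not say which side they belong to.

```python
def interpret_evalue(evalue: float) -> str:
    """Map an E-value to the empirical categories listed above.

    Hypothetical helper for illustration; boundary handling is an assumption.
    """
    if evalue < 0:
        raise ValueError("an E-value cannot be negative")
    if evalue < 1e-50:
        return "extremely high confidence of homology"
    if evalue < 0.01:
        return "homology likely"
    if evalue <= 10:
        return "not significant; possible remote homology"
    return "unrelated or too distant to detect"


if __name__ == "__main__":
    for e in (0.0, 3e-72, 2e-8, 0.5, 42.0):
        print(f"{e:g}: {interpret_evalue(e)}")
```

These cutoffs are guidelines rather than hard limits, so a hit near a boundary deserves a look at the alignment itself.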

Because the E-value scales in proportion to the database size, an obvious problem is that as the database grows, the E-value for a given sequence match also increases.

Because the genuine evolutionary relationship between the two sequences remains constant, the decrease in credibility of the sequence match as the database grows means that one may "lose" previously detected homologs as the database enlarges. Thus, an alternative to E-value calculations is needed.
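To make the database-size effect concrete, here is a minimal sketch based on the standard Karlin-Altschul relation E = m × n × 2^(-bit score), where m is the query length and n is the total database length. That relation is not stated in the text above; it is brought in here as an assumption, and real BLAST uses edge-corrected "effective" lengths, so reported values will differ. The function name and the example numbers are hypothetical.

```python
def expected_hits(bit_score: float, query_len: float, db_len: float) -> float:
    """E = m * n * 2**(-bit_score), using raw (uncorrected) lengths.

    Simplified sketch: BLAST itself applies effective-length corrections.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    score = 60.0   # illustrative bit score for one fixed alignment
    query = 300    # illustrative query length in residues
    for db in (1e8, 1e9, 1e10):
        e = expected_hits(score, query, db)
        print(f"database of {db:.0e} residues -> E-value {e:.3g}")
```

The alignment and its score never change in this loop; only the database length does, yet the E-value grows tenfold at each step.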

In short, the E-value is a key indicator: the lower it is, the better.

bitscore:

A bit score is another prominent statistical indicator reported alongside the E-value in BLAST output. The bit score measures sequence similarity independent of query sequence length and database size and is normalized based on the raw pairwise alignment score. The bit score (S') is determined by the following formula: S' = (λ × S − ln K) / ln 2, where λ is the Gumbel distribution constant, S is the raw alignment score, and K is a constant associated with the scoring matrix used. Clearly, the bit score (S') is linearly related to the raw alignment score (S). Thus, the higher the bit score, the more significant the match. The bit score provides a constant statistical indicator for searching databases of different sizes or for searching the same database at different times as the database grows.
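The bit-score formula can be tried out with a few lines of code. In this sketch the function name is hypothetical, and the λ and K values are illustrative numbers of the kind BLAST prints for a gapped protein search; they are assumptions here, and the actual values should be read from the statistics footer of your own BLAST output.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalize a raw alignment score: (lambda * S - ln K) / ln 2."""
    if lam <= 0 or k <= 0:
        raise ValueError("lambda and K must be positive")
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    lam, k = 0.267, 0.041  # illustrative values, not taken from the text
    for raw in (50, 100, 200):
        print(f"raw score {raw} -> bit score {bit_score(raw, lam, k):.1f}")
```

Because the relationship is linear, each extra unit of raw score adds the same λ / ln 2 bits, regardless of the database being searched.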

identity:

An identity of 35% means that 35% of the amino acids (AA) in the aligned region of your sequence are identical to those of the matched database sequence. There is no such thing as an "acceptable percentage". It always depends on what you are looking for:

If you have an unknown protein sequence and you would like to find homologous sequences, information about identity (even 35%) is valuable.

If you have a known protein and you need to confirm its sequence, an identity of 35% is low and may suggest that something went wrong during your analysis.
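For reference, percent identity can be computed from a pairwise alignment as identical positions divided by the alignment length, gap columns included, which is how BLAST reports its "Identities" figure. The sketch below assumes both sequences are already aligned to the same length with "-" marking gaps; the function name and the two short example sequences are invented for illustration.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percentage of alignment columns holding the same residue in both rows.

    Assumes pre-aligned, equal-length strings with '-' as the gap character.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


if __name__ == "__main__":
    query = "MKT-AYIAKQR"    # made-up aligned query row
    subject = "MKTLAYLGKQR"  # made-up aligned subject row
    print(f"identity: {percent_identity(query, subject):.1f}%")
```

Note that the value describes only the aligned region, so a high identity over a short alignment says little about the rest of the protein.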

Original article: https://www.cnblogs.com/0820LL/p/11352294.html
