zoukankan      html  css  js  c++  java
  • topGO 精选资料

    topGO

    http://www.bio-info-trainee.com/699.html

    https://blog.csdn.net/weixin_43569478/article/details/83743975

    https://blog.csdn.net/wt141643/article/details/105162324

    https://www.omicshare.com/forum/thread-1466-1-6123.html

    http://www.zxzyl.com/archives/767

    https://datacatz.wordpress.com/2018/01/19/gene-set-enrichment-analysis-with-topgo-part-1/

    https://www.biostars.org/p/325421/

    https://www.biostars.org/p/116212/

    https://www.biostars.org/p/350710/#350712

     Author: Kevin Blighe 

    The function can be used in different ways but the input gene list should be a name-value pairing. The values can be a rank (#1, highest) or a list of p-values, e.g., from differential expression. In this example, I supply a list of ranked genes:

    1 obtain input genes

    rankedGenes <- c(1:10)
    names(rankedGenes) <- c("BRCA1","BRCA2","TP53", "ATM", "ERCC1", "BRCC3", "TP63", "ERCC2", "ERCC3", "ERCC4")
    
    rankedGenes
    BRCA1 BRCA2  TP53   ATM ERCC1 BRCC3  TP63 ERCC2 ERCC3 ERCC4 
        1     2     3     4     5     6     7     8     9    10
    

    So, in my genes, BRCA1 is ranked highest... ERCC4 is lowest.

    Edit April 25, 2019:

    • One can also supply name-value pairings with p-values, in which case the lowest p-value is ranked highest.
    • You can also just supply a vector where everything has the same value, in which case everything has equal rank

    2 set-up the enrichment object based on your genes

    require(topGO)
    require(org.Hs.eg.db)
    
    selection <- function(x) TRUE
    allGO2genes <- annFUN.org(whichOnto="BP", feasibleGenes=NULL, mapping="org.Hs.eg.db", ID="symbol")
    GOdata <- new("topGOdata", ontology="BP", allGenes=rankedGenes, annot=annFUN.GO2genes, GO2genes=allGO2genes, geneSel=selection, nodeSize=5)
    

    Depending on the format of your input genes, you can change the value of ID, which is passed to annFUN.org(). This is to allow for gene mapping to GO terms to take place.

    Also note that whichOnto and ontology can be BP, MF, or CC, relating to the GO categories.

    It's up to you to ensure that the gene list that you're providing is the correct list of genes. The geneSel parameter of new() (above) can be used to filter your list in real time, e.g., filtering on P<0.01 if supplying genes and their respective p-values; otherwise, you can just set this to geneSel=TRUE and use everything that's supplied (as I have).

    3 perfom gene enrichment. Here, we make use of the rank information by using the Kolmogorov-Smirnov (K-S) test on the enriched terms

    results.ks <- runTest(GOdata, algorithm="classic", statistic="ks")
    goEnrichment <- GenTable(GOdata, KS=results.ks, orderBy="KS", topNodes=10)
    goEnrichment <- goEnrichment[goEnrichment$KS<0.05,]
    goEnrichment <- goEnrichment[,c("GO.ID","Term","KS")]
    goEnrichment
    
    goEnrichment
            GO.ID                                        Term    KS
    1  GO:0010212              response to ionizing radiation 0.015
    2  GO:0007154                          cell communication 0.041
    3  GO:0007165                         signal transduction 0.041
    4  GO:0008630 intrinsic apoptotic signaling pathway in... 0.041
    5  GO:0016569             covalent chromatin modification 0.041
    6  GO:0016570                        histone modification 0.041
    7  GO:0023052                                   signaling 0.041
    8  GO:0030330 DNA damage response, signal transduction... 0.041
    9  GO:0035556           intracellular signal transduction 0.041
    10 GO:0042127            regulation of cell proliferation 0.041
    

    See a related thread where I also produce an enrichment plot using ggplot2A: GO analysis using topGO

    Author: Kevin Blighe Kevin

    Plot the results (the enrichment score is just negative log (base 10) of the enrichment P-value):

    require(ggplot2)
    ggplot(goEnrichment, aes(x=Term, y=-log10(KS))) +
        stat_summary(geom = "bar", fun.y = mean, position = "dodge") +
        xlab("Biological process") +
        ylab("Enrichment") +
        ggtitle("Title") +
        scale_y_continuous(breaks = round(seq(0, max(-log10(goEnrichment$KS)), by = 2), 1)) +
        theme_bw(base_size=24) +
        theme(
            legend.position='none',
            legend.background=element_rect(),
            plot.title=element_text(angle=0, size=24, face="bold", vjust=1),
            axis.text.x=element_text(angle=0, size=18, face="bold", hjust=1.10),
            axis.text.y=element_text(angle=0, size=18, face="bold", vjust=0.5),
            axis.title=element_text(size=24, face="bold"),
            legend.key=element_blank(),     #removes the border
            legend.key.size=unit(1, "cm"),      #Sets overall area/size of the legend
            legend.text=element_text(size=18),  #Text size
            title=element_text(size=18)) +
        guides(colour=guide_legend(override.aes=list(size=2.5))) +
        coord_flip()
    

      Author: Kevin Blighe

  • 相关阅读:
    MySQL事件(定时任务)
    MySQL存储过程
    WebSocket小记
    Python计算给定日期位于当年第几周
    报错解决——Failed building wheel for kenlm
    计算机基础系列之压缩算法
    计算机基础系列之内存、磁盘、二进制
    计算机基础系列之CPU
    常用正则表达大全
    报错解决——TypeError: LoadLibrary() argument 1 must be str, not None
  • 原文地址:https://www.cnblogs.com/emanlee/p/13983746.html
Copyright © 2011-2022 走看看