zoukankan      html  css  js  c++  java
  • 在甲基化ChAMP包中加入协变量

    最近在分析甲基化数据,发现ChAMP包的输入端是beta(甲基化数据)和pheno(表型数据),找不到校正协变量(比如年龄、性别、批次等)的输入端。

    如下代码所示:

    champ.DMP(beta = myNorm,
                  pheno = myLoad$pd$Sample_Group,
                  compare.group = NULL,
                  adjPVal = 0.05,
                  adjust.method = "BH",
                  arraytype = "450K")
    

    查了一下ChAMP包的代码,发现ChAMP包调用的是limma函数。

    因此,这事就比较好解决了。

    只需要在原代码的基础上修改model就行了。

    下面介绍一下如何修改model。

    1创建甲基化数据集和表型、协变量数据集

    beta <- matrix(runif(60,min=0,max=1),10,6)
    rownames(beta) <- c("cg18478105","cg09835024","cg14361672","cg01763666","cg12950382","cg02115394","cg25813447","cg07779434","cg13417420","cg12480843")
    colnames(beta) <-paste("sample",1:6)
    da=data.frame(pheno=c("1","1","1","2","2","2"),SEX=c("1","2","1","1","1","2"),AGE=c(54,23,58,43,68,36))
    rownames(da) <- paste("sample",1:6)
    

    甲基化数据如下所示:

    注意:beta数据是随机产生的,因此每个人生成的数据是不一样的。

    表型(pheno)、协变量(SEX、AGE)的数据如下:

    2 修改model、校正协变量

    在这里我们假定要校正的协变量有SEX和AGE。

    我在tian yuan创作的champ.DMP函数的基础上增加了cov1=da$cov1cov2=da$cov2两个输入参数。
    在champ.DMP函数里面修改了

    design <- model.matrix( ~ 0+ factor(p)+cov1+cov2 )
    colnames(design) <- c("control", "case","cov1","cov2")
    contrast.matrix <- makeContrasts(contrasts=paste(colnames(design)[2:1],collapse="-"), levels = design)
    

    最后修改完的函数如下所示:

    champ.DMP1 <- function(beta = myNorm,
                          pheno = da$Sample_Group,
                          cov1=da$cov1,
                          cov2=da$cov2,
                          adjPVal = 0.05,
                          adjust.method = "BH",
                          compare.group = NULL,
                          arraytype = "EPIC")
    {
        message("[===========================]")
        message("[<<<<< ChAMP.DMP START >>>>>]")
        message("-----------------------------")
    
        if(is.null(pheno) | length(unique(pheno))<=1)
        {
            stop("pheno parameter is invalid. Please check the input, pheno MUST contain at least two phenotypes.")
        }else
        {
            message("<< Your pheno information contains following groups. >>")
            sapply(unique(pheno),function(x) message("<",x,">:",sum(pheno==x)," samples."))
            message("[The power of statistics analysis on groups contain very few samples may not strong.]")
        }
    	
        if(is.null(compare.group))
        {
            message("You did not assign compare groups. The first two groups: <",unique(pheno)[1],"> and <",unique(pheno)[2],">, will be compared automatically.")
            compare.group <- unique(pheno)[1:2]
        }else if(sum(compare.group %in% unique(pheno))==2)
        {
            message("As you assigned, champ.DMP will compare ",compare.group[1]," and ",compare.group[2],".")
        }else
        {
            message("Seems you did not assign correst compare groups. The first two groups: <",unique(pheno)[1],"> and <",unique(pheno)[2],">, will be compared automatically.")
            compare.group <- unique(pheno)[1:2]
        }
        
        p <- pheno[which(pheno %in% compare.group)]
        beta <- beta[,which(pheno %in% compare.group)]
       design <- model.matrix( ~ 0+ factor(p)+cov1+cov2 )
    	colnames(design) <- c("control", "case","cov1","cov2")
        contrast.matrix <- makeContrasts(contrasts=paste(colnames(design)[2:1],collapse="-"), levels = design)
        message("
    << Contrast Matrix >>")
        print(contrast.matrix)
    
        message("
    << All beta, pheno and model are prepared successfully. >>")
    	
    	fit <- lmFit(beta, design)
    	fit2 <- contrasts.fit(fit,contrast.matrix)
    	tryCatch(fit3 <- eBayes(fit2),
          warning=function(w) 
          {
          	stop("limma failed, No sample variance.
    ")
          })
        DMP <- topTable(fit3,coef=1,number=nrow(beta),adjust.method=adjust.method)
        message("You have found ",sum(DMP$adj.P.Val <= adjPVal), " significant MVPs with a ",adjust.method," adjusted P-value below ", adjPVal,".")
        message("
    << Calculate DMP successfully. >>")
    
        if(arraytype == "EPIC") data(probe.features.epic) else data(probe.features)
        com.idx <- intersect(rownames(DMP),rownames(probe.features))
        avg <-  cbind(rowMeans(beta[com.idx,which(p==compare.group[1])]),rowMeans(beta[com.idx,which(p==compare.group[2])]))
        avg <- cbind(avg,avg[,2]-avg[,1])
        colnames(avg) <- c(paste(compare.group,"AVG",sep="_"),"deltaBeta")
        DMP <- data.frame(DMP[com.idx,],avg,probe.features[com.idx,])
    
        message("[<<<<<< ChAMP.DMP END >>>>>>]")
        message("[===========================]")
        message("[You may want to process DMP.GUI() or champ.GSEA() next.]
    ")
        return(DMP)
    }
    
    

    3、使用champ.DMP1分析甲基化数据

    champ.DMP1 <- champ.DMP1(beta =beta,pheno = da$pheno, cov1=da$SEX,cov2=da$AGE, adjust.method = "BH", arraytype = "EPIC")

    beta是第一步生成的甲基化数据,da数据框上的pheno、SEX、AGE也是在第一步生成的数据;
    arraytype = "EPIC"表示甲基化数据类型为850K。如果是"450K"数据则将"EPIC"改为"450K";

    4、修改技巧

    上面举得例子是校正两个协变量,如果想校正三个协变量的话则做如下三个改动:

    第一个改动:

    champ.DMP1 <- function(beta = myNorm,
                          pheno = da$Sample_Group,
                          cov1=da$cov1,
                          cov2=da$cov2,
                          cov3=da$cov3,
                          adjPVal = 0.05,
                          adjust.method = "BH",
                          compare.group = NULL,
                          arraytype = "EPIC")
    
    

    第二个改动:

    design <- model.matrix( ~ 0+ factor(p)+cov1+cov2+cov3 )
    colnames(design) <- c("control", "case","cov1","cov2","cov3")
    contrast.matrix <- makeContrasts(contrasts=paste(colnames(design)[2:1],collapse="-"), levels = design)
    

    第三个改动:

    champ.DMP1 <- champ.DMP1(beta =beta,pheno = da$pheno, cov1=da$SEX,cov2=da$AGE, cov3=da$cov3,adjust.method = "BH", arraytype = "EPIC")

    5、致谢

    感谢原作者tian yuan(这个包很全能!);

    感谢健明分享的甲基化分析入门练习:甲基化芯片的一般分析流程

    建议各位刚入门甲基化的同学们可以看看健明在B站的视频,讲的很详细。

  • 相关阅读:
    【青橙商城-管理后台开发】2. 商品服务模块搭建
    【青橙商城-管理后台开发】1.公共模块搭建
    Oracle对象-视图和索引
    Oracle的查询-分页查询
    Oracle的查询-子查询
    Oracle的查询-自连接概念和联系
    Oracle的查询-多表查询中的一些概念
    Oracle的查询-分组查询
    Oracle的查询-多行查询
    dotnet 开源库
  • 原文地址:https://www.cnblogs.com/chenwenyan/p/12920024.html
Copyright © 2011-2022 走看看