This article was first published on "生信补给站": https://mp.weixin.qq.com/s/8kz2oKvUQrCR2_HWYXQT4g
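If you have a MAF file, you can draw the waterfall plot directly with the maftools package, which offers a range of display and summary options; see the earlier posts "maftools | Drawing a publication-quality oncoplot (waterfall plot) from scratch" and "maftools | Summarizing, analyzing, and visualizing TCGA tumor mutation data". If all you have is an Excel sheet recording whether each gene is mutated in each sample, don't worry: the ComplexHeatmap package can draw the plot too.
ComplexHeatmap is a powerful package; this post only gives a brief introduction to drawing a genomic landscape plot (waterfall plot).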
1 Load the R packages and data
#if (!requireNamespace("BiocManager", quietly = TRUE))
# install.packages("BiocManager")
#BiocManager::install("ComplexHeatmap")
#install.packages("openxlsx")
#install.packages("circlize")
# Once installed, simply load the packages
library(openxlsx)
library(ComplexHeatmap)
library(circlize)
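# Read in the data; "突变信息" is the mutation sheet and "临床信息" is the clinical sheet
mut <- read.xlsx("TCGA_data.xlsx",sheet = "突变信息")
cli <- read.xlsx("TCGA_data.xlsx",sheet = "临床信息")
Inspect the mutation data
# oncoPrint() expects genes in rows and samples in columns
rownames(mut) <- mut$sample
# Drop the name column and convert to a character matrix
mat <- as.matrix(mut[,-1])
# Empty string means "no alteration"
mat[is.na(mat)]<-""
mat[1:6,1:6]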
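For readers without the Excel file, here is a minimal sketch of the layout that oncoPrint() accepts as a character matrix: genes in rows, samples in columns, an empty string for no alteration, and several alterations in one cell joined by a separator such as `;`. The gene and sample names are made up for illustration and are not the article's data. The frequency line is only a hand computation of the share of altered samples per gene, the same kind of figure as the percentages shown beside the genes.

```r
# Toy character matrix (hypothetical genes and samples, not the article's data)
toy <- matrix(
  c("mutation", "",      "mutation;indel", "",
    "",         "indel", "",               "",
    "mutation", "",      "",               ""),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC"),
                  c("S1", "S2", "S3", "S4"))
)

# Percentage of samples carrying at least one alteration, per gene
altered_pct <- rowMeans(toy != "") * 100
altered_pct

# Alteration types present in the data (a cell may hold several types)
types <- sort(unique(unlist(strsplit(toy[toy != ""], ";", fixed = TRUE))))
types
```

The values in `types` are the names that the color vector and the drawing functions in section 2.1 have to cover.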
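2 Draw the mutation landscape plot
2.0 Draw a first-pass waterfall plot
oncoPrint(mat)
This already displays the result, but a figure for a paper needs a few more adjustments!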
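2.1 Specify the color, shape, and size of each alteration type
# Set the colors; change the color codes as needed
col <- c( "mutation" = "blue" , "indel" = "green")
# Define how each alteration is drawn: x, y give the cell position; w, h give its width and height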
alter_fun <- list(
  background = function(x, y, w, h) {
    grid.rect(x, y, w-unit(0.5, "mm"), h-unit(0.5, "mm"),
              gp = gpar(fill = "#CCCCCC", col = NA))
  },
  mutation = function(x, y, w, h) {
    grid.rect(x, y, w-unit(0.5, "mm"), h-unit(0.5, "mm"),
              gp = gpar(fill = col["mutation"], col = NA))
  },
  indel = function(x, y, w, h) {
    grid.rect(x, y, w-unit(0.5, "mm"), h*0.33,
              gp = gpar(fill = col["indel"], col = NA))
  }
)
# Legend labels for the alteration types; `at` must match the types in the data
heatmap_legend_param <- list(title = "Alterations",
at = c("mutation","indel"),
labels = c( "mutation","indel"))
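Because the names in `col`, `alter_fun`, and the legend all have to match the alteration strings in the matrix, a quick consistency check before plotting can save some debugging. The sketch below uses a made-up toy matrix containing a hypothetical extra type, `amp`, that has no color assigned; with real data you would run the same two lines on `mat`.

```r
# Hypothetical toy matrix; "amp" is deliberately left out of `col`
toy <- matrix(c("mutation", "", "mutation;indel", "amp"), nrow = 2,
              dimnames = list(c("GeneA", "GeneB"), c("S1", "S2")))
col <- c("mutation" = "blue", "indel" = "green")

# Split multi-alteration cells and collect the distinct types
types_in_data <- unique(unlist(strsplit(toy[toy != ""], ";", fixed = TRUE)))

# Types that appear in the data but have no color defined
missing_types <- setdiff(types_in_data, names(col))
missing_types
```

An empty result means every type in the data has a color; anything listed needs an entry in `col` and a matching function in `alter_fun`.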
Draw the landscape plot
# Set the title
column_title <- "This is Oncoplot "
# Draw the plot with the custom colors, shapes, and legend
oncoPrint(mat,
alter_fun = alter_fun, col = col,
column_title = column_title,
heatmap_legend_param = heatmap_legend_param)
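2.2 Simple adjustments
oncoPrint(mat,
alter_fun = alter_fun, col = col,
column_title = column_title,
remove_empty_columns = TRUE, # drop samples (columns) with no alterations
remove_empty_rows = TRUE, # drop genes (rows) with no alterations
row_names_side = "left", # gene names on the left
pct_side = "right", # percentages on the right
heatmap_legend_param = heatmap_legend_param)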
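3 Add annotations
3.1 Specify the clinical annotations
pdata <- cli
head(pdata)
# Keep only patients present in the mutation matrix, then put the columns in the same order
pdata <- subset(pdata,pdata$sampleID %in% colnames(mat))
mat <- mat[, pdata$sampleID]
# Define the annotations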
ha<-HeatmapAnnotation(Age=pdata$age,
Gender=pdata$gender,
GeneExp_Subtype = pdata$GeneExp_Subtype ,
censor = pdata$censor,
os = pdata$os,
show_annotation_name = TRUE,
annotation_name_gp = gpar(fontsize = 7))
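The annotation tracks are matched to the matrix columns purely by position, so the matching step above only works if both tables end up in the same sample order. As a minimal sketch with made-up sample IDs (the toy objects stand in for `mat` and `pdata`), the alignment and a sanity check could look like this:

```r
# Hypothetical toy inputs standing in for `mat` and `pdata`
toy_mat <- matrix("", nrow = 2, ncol = 3,
                  dimnames = list(c("GeneA", "GeneB"), c("S1", "S2", "S3")))
toy_cli <- data.frame(sampleID = c("S3", "S9", "S1"),
                      age = c(61, 48, 55),
                      stringsAsFactors = FALSE)
all_samples <- colnames(toy_mat)

# Keep only samples present in both tables, in the clinical table's order
shared <- intersect(toy_cli$sampleID, colnames(toy_mat))
toy_cli <- toy_cli[match(shared, toy_cli$sampleID), ]
toy_mat <- toy_mat[, shared, drop = FALSE]

# The annotation only lines up if both orders are identical
stopifnot(identical(colnames(toy_mat), toy_cli$sampleID))

# Samples in the matrix that had no clinical record and were dropped
dropped <- setdiff(all_samples, shared)
dropped
```

Printing `dropped` makes it visible when samples disappear from the plot because they lack clinical information.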
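3.2 Waterfall plot + clinical annotations
oncoPrint(mat,
bottom_annotation = ha, # annotations at the bottom
alter_fun = alter_fun, col = col,
column_title = column_title, heatmap_legend_param = heatmap_legend_param )
The annotations here use the default colors, which can be hard to tell apart and change from one run to the next.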
# Custom sample order
s <- pdata[order(pdata$censor,pdata$GeneExp_Subtype),]
sample_order <- as.character(s$sampleID)
# Custom colors
# Color mapping for the continuous variable, defined outside HeatmapAnnotation()
col_os = colorRamp2(c(0, 4000), c("white", "red"))
ha<-HeatmapAnnotation(Age=pdata$age,
Gender=pdata$gender,
GeneExp_Subtype = pdata$GeneExp_Subtype ,
censor = pdata$censor,
os = pdata$os,
# Specify the colors
col = list(censor = c("death" = "red", "alive" = "blue"),
GeneExp_Subtype = c("Classical" = "orange","Mesenchymal" = "green","Neural" = "skyblue" ),
os = col_os),
show_annotation_name = TRUE,
annotation_name_gp = gpar(fontsize = 7))
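To see what the colorRamp2() mapping used for `os` does, the sketch below applies the same breaks and colors to a few made-up values. The input numbers are hypothetical and chosen only to show interpolation and what happens beyond the last break; with real data you may prefer to set the upper break from the observed maximum.

```r
library(circlize)

# Same breaks and colors as col_os in the article
col_os <- colorRamp2(c(0, 4000), c("white", "red"))

# Hypothetical values; 5000 lies beyond the last break
os_values <- c(0, 2000, 4000, 5000)
os_cols <- col_os(os_values)
os_cols

# Values past the last break get the same color as the break itself
os_cols[4] == os_cols[3]
```

If many samples exceed the upper break they all share one color, so the range is worth checking against the data before fixing the breaks.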
Draw the waterfall plot
oncoplot_anno = oncoPrint(mat,bottom_annotation = ha,
alter_fun = alter_fun, col = col,
column_order = sample_order,
remove_empty_columns = TRUE, # drop samples (columns) with no alterations
remove_empty_rows = TRUE, # drop genes (rows) with no alterations
column_title = column_title, heatmap_legend_param = heatmap_legend_param)
oncoplot_anno
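Note: these colors are not necessarily attractive; the point is that you can define your own when the defaults are too similar or when there are specific requirements.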
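3.3 Adjust the position of the annotation legend
draw(oncoplot_anno ,annotation_legend_side = "bottom")
Moving the annotation legend to the bottom makes it easier to assemble multi-panel figures later.
More parameters: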
https://github.com/jokergoo/ComplexHeatmap