Reproducing CNS-quality figures with ggheatmap
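We previously featured a new R package from a rising developer at Southern Medical University in the post "Strengthen your heatmaps with ggheatmap!". The package has now been officially accepted on CRAN, the R community's package repository, so you can use it with confidence.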
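Preface: since ggheatmap was made public, I have received a lot of encouragement from readers, and I am truly grateful for the support. If you run into problems while using it, or there is a feature you would like to see, feel free to contact me. Every suggestion is a chance for me to learn and improve, and I hope to have the opportunity to discuss and learn together with you.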
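Below I walk through two examples that show what ggheatmap can be used for, in the hope that they spark better ideas of your own (in essence, it is just a heatmap plus plot composition).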
Setup
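Note: the data below are not real; they were randomly generated by the author. The code fixes the random seed with set.seed(123), but results can still differ between machines, for example across R versions (the default sampling algorithm changed in R 3.6.0). If in doubt, you can verify the heatmap with the pheatmap package.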
# ggheatmap is on CRAN, so install.packages("ggheatmap") also works;
# the line below installs the development version from GitHub.
devtools::install_github("XiaoLuo-boy/ggheatmap")
library(ggheatmap)
library(ggplot2)  # ggplot(), aes() and theme() for the side panels below
library(aplot)    # insert_right() and insert_top()

# Random 50 x 12 "expression" matrix with made-up gene and sample names
set.seed(123)
df <- matrix(runif(600, 0, 10), ncol = 12)
colnames(df) <- paste("sample", 1:12, sep = "")
rownames(df) <- sapply(1:50, function(x) paste(sample(LETTERS, 3, replace = FALSE), collapse = ""))
df[1:4, 1:4]

# Row (gene) and column (sample) annotations, keyed by the matrix dimnames
row_metaData <- data.frame(exprtype = sample(c("Up", "Down"), 50, replace = TRUE),
                           genetype = sample(c("Metabolism", "Immune", "None"), 50, replace = TRUE))
rownames(row_metaData) <- rownames(df)
col_metaData <- data.frame(tissue = sample(c("Normal", "Tumor"), 12, replace = TRUE),
                           risklevel = sample(c("High", "Low"), 12, replace = TRUE))
rownames(col_metaData) <- colnames(df)

# Named color vectors, one per annotation variable
exprcol <- c("#EE0000FF", "#008B45FF")
names(exprcol) <- c("Up", "Down")
genecol <- c("#EE7E30", "#5D9AD3", "#D0DFE6FF")
names(genecol) <- c("Metabolism", "Immune", "None")
tissuecol <- c("#98D352", "#FF7F0E")
names(tissuecol) <- c("Normal", "Tumor")
riskcol <- c("#EEA236FF", "#46B8DAFF")
names(riskcol) <- c("High", "Low")
col <- list(exprtype = exprcol, genetype = genecol, tissue = tissuecol, risklevel = riskcol)

# Per-sample counts for the stacked bar chart in Example 2
stackDat <- data.frame(sample = rep(paste0("sample", 1:12), each = 2),
                       count = sample(1:10, 24, replace = TRUE),
                       type = rep(c("Negative", "Positive"), times = 12))
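Data description:
df: expression matrix
row_metaData: row annotation data
col_metaData: column annotation data
stackDat: total number of up-regulated and down-regulated genes in each sample, stored in the type column as "Negative"/"Positive" (for convenience, these data are also randomly generated)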
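For readers who want a sanity check that does not depend on any plotting package, the row scaling and cluster cuts can be approximated in base R. This is only a sketch: it assumes that scale = "row" means row-wise z-scoring and that clustering uses Euclidean distance with complete linkage. Both are assumptions about ggheatmap's defaults and should be checked against the package documentation.

```r
# Rebuild the same random matrix so this block runs on its own.
set.seed(123)
df <- matrix(runif(600, 0, 10), ncol = 12)
colnames(df) <- paste0("sample", 1:12)
rownames(df) <- sapply(1:50, function(x) paste(sample(LETTERS, 3, replace = FALSE), collapse = ""))

# Row-wise z-scores: each gene is centered on 0 and scaled to unit SD.
# Assumption: this mirrors what scale = "row" does in the heatmap call.
z <- t(scale(t(df)))

# Hierarchical clustering (Euclidean distance, complete linkage: assumed defaults),
# cut into 5 row groups and 3 column groups as in cluster_num = c(5, 3).
row_groups <- cutree(hclust(dist(z)), k = 5)
col_groups <- cutree(hclust(dist(t(z))), k = 3)

# Group sizes to compare against the colored dendrogram branches
table(row_groups)
table(col_groups)
```

If the group sizes disagree with the dendrogram in the plot, the likely cause is a different distance or linkage method rather than a plotting error.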
Example 1
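**Figure source:** this heatmap comes from a paper in Cell (https://doi.org/10.1016/j.cell.2020.05.032). I first saw the figure in a post from the "木舟笔记" account, thought the heatmap looked great, and did my best to reproduce it (copying the form only, not the meaning).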
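Usage: annotating specific genes or samples with multiple, possibly overlapping labels. For example, some genes are immune genes, some are autophagy-related genes, and some are tumor driver genes. A single gene may be both an immune gene and an autophagy-related gene, in which case a color-bar annotation is no longer suitable, because each bar can assign only one category per row. Drawing one column of dots per category handles the overlap.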
Code
# Main heatmap: row-scaled, clustered on both axes, cut into 5 row / 3 column clusters.
# Stored as ggheatmap1 so the object does not shadow the ggheatmap() function.
ggheatmap1 <- ggheatmap(df, color = colorRampPalette(c("#66b032", "white", "#ff3800"))(100),
                        cluster_rows = TRUE, cluster_cols = TRUE, scale = "row",
                        cluster_num = c(5, 3),
                        tree_color_rows = c("#3B4992FF", "#EE0000FF", "#008B45FF", "#631879FF", "#008280FF"),
                        tree_color_cols = c("#1F77B4FF", "#FF7F0EFF", "#2CA02CFF"),
                        annotation_rows = row_metaData,
                        annotation_cols = col_metaData,
                        annotation_color = col
)

# Random membership table: 1 = gene belongs to the category, NA = it does not
dat <- data.frame(Glycolysis = sample(c(1, NA), 50, replace = TRUE),
                  TAC = sample(c(1, NA), 50, replace = TRUE),
                  gene = rownames(df))

# One column of dots per category; na.rm = TRUE drops the NA rows without a warning
p1 <- ggplot(dat, aes(x = Glycolysis, y = gene)) +
  geom_point(color = "#d40749", size = 3, na.rm = TRUE) + theme_classic() +
  theme(line = element_blank(), axis.text = element_blank(), axis.title.y = element_blank(),
        axis.title.x = element_text(colour = "#d40749", face = "bold", size = 10)) +
  xlab("Glycolysis") + scale_x_discrete(position = "top")
p2 <- ggplot(dat, aes(x = TAC, y = gene)) +
  geom_point(color = "#0092db", size = 3, na.rm = TRUE) + theme_classic() +
  theme(line = element_blank(), axis.text = element_blank(), axis.title.y = element_blank(),
        axis.title.x = element_text(colour = "#0092db", face = "bold", size = 10)) +
  xlab("TAC") + scale_x_discrete(position = "top")

# Attach both dot columns to the right of the heatmap, aligned on the gene axis
ggheatmap1 %>% insert_right(p1, width = 0.1) %>% insert_right(p2, width = 0.1)
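The dat table in this example is random. With real data, the same 1/NA membership table could be derived from gene-set lists. The sketch below uses base R only; the gene identifiers and gene sets are hypothetical placeholders, not taken from the Cell paper.

```r
# Hypothetical gene identifiers and gene sets, for illustration only.
genes <- c("GENE1", "GENE2", "GENE3", "GENE4", "GENE5")
gene_sets <- list(
  Glycolysis = c("GENE1", "GENE3"),
  TAC        = c("GENE3", "GENE4")
)

# One column per gene set: 1 where the gene is a member, NA otherwise.
# This is the same coding as the random `dat` table in Example 1.
membership <- lapply(gene_sets, function(set) ifelse(genes %in% set, 1, NA))
dat_real <- data.frame(membership, gene = genes)
dat_real
```

Here GENE3 belongs to both sets and therefore gets a dot in both columns, which is exactly the overlap a single color bar cannot show.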
Example 2
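**Figure source:** this figure comes from a data.world visualization project, Energy Use at 10 Downing St in 2017, where columns are months and rows are days (copying the form only, not the meaning).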
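Usage: showing how a feature of the row or column elements changes. For example, if samples are ordered by the score from a model you built and you also want to show the expression of a particular class of genes, this layout is a good starting point. You can also swap the bar chart for a line chart, and so on.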
Code
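# Heatmap with row annotations only; columns are still clustered,
# but the column dendrogram is hidden to make room for the bar chart.
ggheatmap2 <- ggheatmap(df, cluster_rows = TRUE, cluster_cols = TRUE, scale = "row",
                        cluster_num = c(5, 3),
                        tree_color_rows = c("#3B4992FF", "#EE0000FF", "#008B45FF", "#631879FF", "#008280FF"),
                        tree_color_cols = c("#1F77B4FF", "#FF7F0EFF", "#2CA02CFF"),
                        annotation_rows = row_metaData,
                        annotation_color = col, show_cluster_cols = FALSE
)

# Stacked bar chart of per-sample counts, with the x axis stripped
# so that it sits flush on top of the heatmap columns
p3 <- ggplot(stackDat, aes(x = sample, y = count, fill = type)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(values = c("#24c1ff", "#ffbd24")) +
  theme_classic() +
  theme(axis.text.x = element_blank(),
        axis.title.x = element_blank(),
        axis.line.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_y_continuous(expand = c(0, 0), position = "right")

# Place the bar chart above the heatmap, aligned on the sample axis
ggheatmap2 %>% insert_top(p3, height = 0.1)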
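The code above orders samples by clustering. To order them by a model score instead, as the usage note suggests, one option is to sort the matrix columns beforehand. The sketch below uses base R only; score is a hypothetical random vector standing in for real model output.

```r
# Rebuild the matrix so this block runs on its own.
set.seed(123)
df <- matrix(runif(600, 0, 10), ncol = 12)
colnames(df) <- paste0("sample", 1:12)

# Hypothetical per-sample model score; replace with real scores.
score <- setNames(runif(12), colnames(df))

# Sample names from lowest to highest score, and the matrix in that order
ord <- names(sort(score))
df_ordered <- df[, ord]

# Give the bar-chart data the same sample order through factor levels
stackDat <- data.frame(sample = rep(colnames(df), each = 2),
                       count = sample(1:10, 24, replace = TRUE),
                       type = rep(c("Negative", "Positive"), times = 12))
stackDat$sample <- factor(stackDat$sample, levels = ord)
```

df_ordered would then be passed to ggheatmap() with cluster_cols = FALSE, so that clustering does not reorder the columns again. Setting the factor levels may be redundant if aplot already aligns the bar chart to the heatmap's column order, but it keeps the standalone bar chart consistent.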