学习微生物组数据比较成熟的R包microbiome
learning_microbiome_2
这两个包的安装比较麻烦
无法下载得到github包,或者无法安装后,将github包手动下载下来,解压之后定位文件夹名称后安装
这部分用来学习微生物组成的操作及其可视化
主体函数为:plot_composition,可以按照丰度和otu门类信息排序展示; 这里注意函数core,可以将otu过滤得到大部分样品中都存在的otu。
之前我写了个函数,用来排序门类的柱状图,这里也写了个函数,用来排序,但是同我的排序不同 sample.sort = “Actinobacteria”, otu.sort = “abundance”,用来选择按照某一个门类的相对丰富进行排序。好像没有这样的需求?
library(microbiome)
library(dplyr)
data(dietswap)
# Make sure we use functions from correct package
transform <- microbiome::transform
# ?core
# Just use prevalent taxa to speed up examples
# (not absolute counts used in this example)
pseq <- core(dietswap, detection = 50, prevalence = 50/100)
# Pick sample subset
library(phyloseq)
pseq2 <- subset_samples(pseq, group == "DI" & nationality == "AFR" & timepoint.within.group == 1)
# Normal western adults
data(atlas1006)
pseq3 <- atlas1006 %>%
subset_samples(DNA_extraction_method == "r") %>%
aggregate_taxa(level = "Phylum") %>%
microbiome::transform(transform = "compositional")
# Try another theme
# from https://github.com/hrbrmstr/hrbrthemes
# library(hrbrthemes)
# devtools::install_github("hrbrmstr/hrbrthemes")
library(gcookbook)
# install.packages("gcookbook")
library(tidyverse)
theme_set(theme_bw(21))
p <- pseq3 %>%
plot_composition(sample.sort = "Bacteroidetes", otu.sort = "abundance") +
# Set custom colors
scale_fill_manual(values = default_colors("Phylum")[taxa(pseq3)])
print(p)
这里没有进行排序,只是根据不同注释进行分类上色(sample.sort = “nationality”)
average_by = “bmi_group”:参数用于选择不同的样品分组
这个函数算是很强大:既可以选择门类水平,也可以选择分组水平,还可以进行排序,即使是按照某一个门类进行排序
# Limit the analysis on core taxa and specific sample group
p <- plot_composition(pseq2,
taxonomic.level = "Genus",
sample.sort = "nationality",
x.label = "nationality") +
guides(fill = guide_legend(ncol = 1)) +
# scale_y_percent() +
labs(x = "Samples", y = "Relative abundance (%)",
title = "Relative abundance data",
subtitle = "Subtitle",
caption = "Caption text.") #+
# theme_ipsum(grid="Y")
print(p)
# Averaged by group
p <- plot_composition(pseq2,
average_by = "bmi_group", transform = "compositional")
print(p)
p <- NULL
plot_composition 函数一共有三种图形,包括堆叠柱状图,热图,还有线图
p <- plot_composition(microbiome::transform(pseq, "compositional"),
plot.type = "heatmap",
sample.sort = "neatmap",
otu.sort = "neatmap")
print(p)
plot_composition 函数一共有三种图形,包括堆叠柱状图,热图,还有线图
p <- plot_composition(microbiome::transform(pseq, "compositional"),
plot.type = "lineplot",
sample.sort = "neatmap",
otu.sort = "neatmap")
print(p)
# ?plot_composition
Plot taxa prevalence
这里按照不同分类统计otu出现的频率,方便我们后续分析按进行过滤;比如:多少条count足够表征微生物多样性呢,这里将给出答案 This function allows you to have an overview of OTU prevalences alongwith their taxonomic affiliations. This will aid in checking if you filter OTUs based on prevalence, then what taxonomic affliations will be lost.
data(atlas1006)
# Use sample and taxa subset to speed up example
p0 <- subset_samples(atlas1006, DNA_extraction_method == "r")
# Define detection and prevalence thresholds to filter out rare taxa
p0 <- core(p0, detection = 10, prevalence = 0)
# For the available taxonomic levels
plot_taxa_prevalence(p0, "Phylum", detection = 10)
# Example data
library(microbiome)
# Try another theme
# from https://github.com/hrbrmstr/hrbrthemes
# you can install these if you don't have it already.
# library(hrbrthemes)
library(microbiomeutilities)
library(gcookbook)
library(tidyverse)
library(dplyr)
data("zackular2014")
#data("DynamicsIBD") #This data is currently unavialable due to size limitations
ps1 <- zackular2014
colnames(tax_table(ps1))## [1] "Domain" "Phylum" "Class" "Order" "Family" "Genus" "Species"
这里将注释表格做了一个优化,还挺好的,rank1-7 替换为对应的界门纲目等
taxic <- as.data.frame(ps1@tax_table) 提取注释文件非常慢,这里我进行了优化
# First change the column names of the taxonomy table in phyloseq to following:
colnames(tax_table(ps1)) <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species" )
tax_table(ps1)[tax_table(ps1)[,"Kingdom"]== "NA", "Kingdom" ] <- "Unidentified_Kingdom"
tax_table(ps1)[tax_table(ps1)[,"Phylum"]== "p__", "Phylum" ] <- "p__Unidentified_Phylum"
# make a dataframe for taxonomy information.
vegan_tax <- function(physeq){
tax <- tax_table(physeq)
return(as(tax,"matrix"))
}
taxic = as.data.frame(vegan_tax(ps1))
head(taxic)## Kingdom Phylum Class Order
## d__denovo1773 k__Bacteria p__Bacteroidetes c__Bacteroidia o__Bacteroidales
## d__denovo1771 k__Bacteria p__Bacteroidetes c__Bacteroidia o__Bacteroidales
## d__denovo1776 k__Bacteria p__Firmicutes c__Clostridia o__Clostridiales
## d__denovo1777 k__Bacteria p__Bacteroidetes c__Bacteroidia o__Bacteroidales
## d__denovo1775 k__Bacteria p__Firmicutes c__Clostridia o__Clostridiales
## d__denovo2639 k__Bacteria p__Bacteroidetes c__Bacteroidia o__Bacteroidales
## Family Genus Species
## d__denovo1773 f__Bacteroidaceae g__Bacteroides s__
## d__denovo1771 f__Bacteroidaceae g__Bacteroides s__
## d__denovo1776 f__Ruminococcaceae g__ s__
## d__denovo1777 f__Bacteroidaceae g__Bacteroides s__
## d__denovo1775 f__Lachnospiraceae g__Blautia s__
## d__denovo2639 f__Bacteroidaceae g__Bacteroides s__# taxic <- as.data.frame(ps1@tax_table)
otu.df <- abundances(ps1)
# make a dataframe for OTU information.
otu.df <- as.data.frame(otu.df)
# check the rows and columns
# head(otu.df)
# Add the OTU ids from OTU table into the taxa table at the end.
taxic$OTU <- row.names.data.frame(otu.df)
# You can see that we now have extra taxonomy levels.
colnames(taxic)## [1] "Kingdom" "Phylum" "Class" "Order" "Family" "Genus" "Species"
## [8] "OTU"# convert it into a matrix.
taxmat <- as.matrix(taxic)
# convert into phyloseq compaitble file.
new.tax <- tax_table(taxmat)
# incroporate into phyloseq Object
tax_table(ps1) <- new.tax