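RNAvelocity2: Estimating RNA Velocity with Seurat and scVelo
This tutorial shows how to use scVelo to analyze RNA velocity quantifications stored in a Seurat object. If you use scVelo in your work, please cite: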
Generalizing RNA velocity to transient cell states through dynamical modeling
Volker Bergen, Marius Lange, Stefan Peidli, F. Alexander Wolf & Fabian J. Theis
doi: https://doi.org/10.1101/820936
Website: https://scvelo.readthedocs.io/
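Prerequisites
First install the following R packages:
Seurat SeuratDisk SeuratWrappers
scVelo is a Python package rather than an R package, so install it separately in your Python environment (for example with pip install scvelo).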
In R
Load the required R packages
library(Seurat)
library(SeuratDisk)
library(SeuratWrappers)
Download the example data
# If you don't have velocyto's example mouse bone marrow dataset, download with the CURL command
curl::curl_download(url = 'http://pklab.med.harvard.edu/velocyto/mouseBM/SCG71.loom', destfile = '~/Downloads/SCG71.loom')
Read the loom file and convert it to a Seurat object
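# ReadVelocity loads the spliced, unspliced, and ambiguous count matrices from the loom file
ldat <- ReadVelocity(file = "~/Downloads/SCG71.loom")
bm <- as.Seurat(x = ldat)
# Use the spliced counts as the RNA assay for normalization and clustering
bm[["RNA"]] <- bm[["spliced"]]
bm <- SCTransform(bm)
bm <- RunPCA(bm)
bm <- RunUMAP(bm, dims = 1:20)
bm <- FindNeighbors(bm, dims = 1:20)
bm <- FindClusters(bm)
# Switch back to the RNA assay, then save as h5Seurat and convert to h5ad for scVelo
DefaultAssay(bm) <- "RNA"
SaveH5Seurat(bm, filename = "mouseBM.h5Seurat")
Convert("mouseBM.h5Seurat", dest = "h5ad")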
In Python
Load the required Python packages
import scvelo as scv
Read the converted Seurat object
adata = scv.read("mouseBM.h5ad")
adata
## AnnData object with n_obs × n_vars = 6667 × 24421
## obs: 'orig.ident', 'nCount_spliced', 'nFeature_spliced', 'nCount_unspliced', 'nFeature_unspliced', 'nCount_ambiguous', 'nFeature_ambiguous', 'nCount_RNA', 'nFeature_RNA', 'nCount_SCT', 'nFeature_SCT', 'SCT_snn_res.0.8', 'seurat_clusters'
## var: 'features', 'ambiguous_features', 'spliced_features', 'unspliced_features'
## obsm: 'X_umap'
## layers: 'ambiguous', 'spliced', 'unspliced'
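Before running scVelo, it can help to confirm that the conversion carried over the layers that velocity estimation relies on. The sketch below is a plain-Python illustration only: missing_layers is a hypothetical helper, not part of scVelo, and the layer names are copied from the summary printed above. With a real object you could pass adata.layers.keys() instead of the hard-coded list.

```python
# Hypothetical helper (not part of scVelo): report which required layers are absent.
REQUIRED_LAYERS = ("spliced", "unspliced")


def missing_layers(available, required=REQUIRED_LAYERS):
    """Return the required layer names that are not in `available`, in order."""
    present = set(available)
    return [name for name in required if name not in present]


# Layer names copied from the AnnData summary printed above.
layers_from_output = ["ambiguous", "spliced", "unspliced"]
print(missing_layers(layers_from_output))  # prints []
```

An empty list means both count layers are available; a non-empty list names what was lost during conversion.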
Filtering and visualization
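# Filter genes by shared counts, normalize, and keep the 2000 top variable genes
scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
# Compute neighborhood-averaged moments of the spliced and unspliced abundances
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
# Estimate velocities and build the velocity graph
scv.tl.velocity(adata)
scv.tl.velocity_graph(adata)
# Project the velocities onto the UMAP embedding computed in Seurat
scv.pl.velocity_embedding_stream(adata, basis="umap", color="seurat_clusters")
scv.pl.velocity_embedding(adata, basis="umap", color="seurat_clusters", arrow_length=3, arrow_size=2, dpi=120)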
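For intuition about what the velocity step estimates: in the simplest steady-state formulation of RNA velocity, a gene's velocity in a cell is its unspliced abundance minus a ratio gamma times its spliced abundance, with gamma fit across cells. The following is a simplified sketch with made-up numbers, assuming a least-squares fit through the origin over all cells. It is not scVelo's implementation, which by default uses a stochastic model on the neighborhood-smoothed moments, and the function names are hypothetical.

```python
def fit_gamma(unspliced, spliced):
    """Least-squares slope through the origin for unspliced ~ gamma * spliced."""
    numerator = sum(u * s for u, s in zip(unspliced, spliced))
    denominator = sum(s * s for s in spliced)
    if denominator == 0:
        raise ValueError("spliced abundances are all zero; gamma is undefined")
    return numerator / denominator


def steady_state_velocity(unspliced, spliced):
    """Per-cell residual u - gamma * s for a single gene."""
    gamma = fit_gamma(unspliced, spliced)
    return [u - gamma * s for u, s in zip(unspliced, spliced)]


# Toy abundances for one gene across four cells (illustrative values only).
spliced = [2.0, 4.0, 4.0, 2.0]
unspliced = [1.0, 3.0, 1.0, 1.0]

print(fit_gamma(unspliced, spliced))              # prints 0.5
print(steady_state_velocity(unspliced, spliced))  # prints [0.0, 1.0, -1.0, 0.0]
```

A positive value means the cell carries more unspliced transcript than the fitted steady state predicts, which is read as the gene being up-regulated; a negative value suggests down-regulation.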
# Fit the dynamical model, then derive a latent time for each cell
scv.tl.recover_dynamics(adata)
scv.tl.latent_time(adata)
scv.pl.scatter(adata, color="latent_time", color_map="gnuplot")
# Plot the 300 genes with the highest fit likelihood, with cells ordered by latent time
top_genes = adata.var["fit_likelihood"].sort_values(ascending=False).index[:300]
scv.pl.heatmap(adata, var_names=top_genes, sortby="latent_time", col_color="seurat_clusters", n_convolve=100)