
生信探索
2023/04/21
CellChat cell-cell communication analysis (preprocessing)
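Installing packages
using is a function I defined in $HOME/.Rprofile, so it is available every time R starts. It loads several packages at once and wraps the calls in suppressPackageStartupMessages, so the messages normally printed while packages load are hidden.
I use pak to manage R packages. It can install from Bioconductor, CRAN, GitHub, local directories, and URLs, so a single tool replaces the separate installers each source would otherwise need.
My connection to GitHub is unstable, so I mirrored the CellChat repository from GitHub to jihulab.com, cloned it from there, and installed it with pak::local_install. If your network allows it, you can install directly with pak::pkg_install("sqjin/CellChat").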
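install.packages("pak")
using(pak)
# Packages installed before CellChat
pkgs <- c("BiocNeighbors", "ComplexHeatmap", "circlize", "NMF")
pak::pkg_install(pkgs, upgrade = FALSE, dependencies = TRUE)
# Option 1: install directly from GitHub
pak::pkg_install("sqjin/CellChat")
# Option 2: clone the mirror (run git clone in a shell, not in R), then install the local copy
git clone https://jihulab.com/BioQuest/cellchat.git
pak::local_install("cellchat")
# pak::local_install("CellChat-master", dependencies = TRUE)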
Python environment
mamba create -n SC && mamba activate SC
mamba install -y -c conda-forge python=3.10 notebook ipywidgets pandas numpy seaborn matplotlib ipykernel openpyxl pyarrow scanpy python-igraph leidenalg pytables jaxlib
which python
# /opt/homebrew/Caskroom/mambaforge/base/envs/SC/bin/python
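Pipeline script: cell-cell communication network for each group
This was my first time using CellChat. The processing steps appear to be fixed, whereas the visualization is tailored to each analysis. Because processing takes a while, I wrapped it in a script so that all groups can be processed in the background at the same time.
Meaning of the parameters
- w: working directory; defaults to the current directory
- i: file identifier, used to name both the input and the output files. From i the script derives two inputs:
  {i}.arrow: normalized gene expression data in Arrow format. The first column, index, holds the cell barcodes and the remaining columns are genes (one row per cell); the script transposes this into the genes-by-cells matrix that CellChat expects.
  {i}.csv: cell type information in CSV format. The first column, Cell, holds the cell barcodes; the second column, CellType, holds the cell type.
- s: species, either human or mouse; defaults to human
- n: number of threads; defaults to 8
- y: path to the Python executable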
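Since the script runs unattended in the background, a quick check of the metadata file beforehand can save a failed run. The sketch below is not part of the original script: check_meta_csv is a hypothetical helper name, and it assumes a plain comma-separated file whose header starts with Cell,CellType as described in the list above.

```shell
# Hypothetical helper (not in the original post): verify that a metadata CSV
# starts with the columns the R script expects, Cell and then CellType.
check_meta_csv() {
  local csv="$1" header=""
  if [ ! -f "$csv" ]; then
    echo "missing file: $csv" >&2
    return 1
  fi
  # Read only the first line; strip quotes and a Windows carriage return.
  header=$(head -n 1 "$csv" | tr -d '"\r')
  case "$header" in
    Cell,CellType|Cell,CellType,*) return 0 ;;
    *) echo "unexpected header in ${csv}: ${header}" >&2; return 1 ;;
  esac
}

# Example call (assumes C.csv and T.csv are in the current directory):
# check_meta_csv C.csv && check_meta_csv T.csv
```

The function only inspects the header line, so it is cheap even for large files; it does not validate the barcodes themselves.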
Usage
nohup Rscript CellChat_1.R -i C -y /opt/homebrew/Caskroom/mambaforge/base/envs/SC/bin/python &> C.log &
nohup Rscript CellChat_1.R -i T -y /opt/homebrew/Caskroom/mambaforge/base/envs/SC/bin/python &> T.log &
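With more than a couple of groups, the two commands above can be wrapped in a loop. This is a minimal sketch rather than part of the original workflow: launch_group and the DRY_RUN switch are hypothetical additions, the group IDs C and T and the Python path are copied from the commands above, and the redirection is written in the portable `> file 2>&1` form instead of `&>`.

```shell
# Hypothetical launcher (not in the original post).
# PYTHON_BIN defaults to the path used in the commands above.
# DRY_RUN=1 (the default here, an assumption) only prints the command.
PYTHON_BIN="${PYTHON_BIN:-/opt/homebrew/Caskroom/mambaforge/base/envs/SC/bin/python}"
DRY_RUN="${DRY_RUN:-1}"

launch_group() {
  local id="$1" f
  # The R script reads <id>.arrow and <id>.csv from the working directory.
  for f in "${id}.arrow" "${id}.csv"; do
    if [ ! -f "$f" ]; then
      echo "skip ${id}: missing ${f}" >&2
      return 1
    fi
  done
  if [ "$DRY_RUN" = "1" ]; then
    echo "nohup Rscript CellChat_1.R -i ${id} -y ${PYTHON_BIN} > ${id}.log 2>&1 &"
  else
    nohup Rscript CellChat_1.R -i "${id}" -y "${PYTHON_BIN}" > "${id}.log" 2>&1 &
  fi
}

# Try every group; a skipped group does not stop the others.
for id in C T; do
  launch_group "$id" || echo "group ${id} not started" >&2
done
```

Run it once as-is to preview the commands, then with DRY_RUN=0 to start the jobs in the background.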
Directory contents after running
├── CellChat_1.R
├── C.arrow
├── C.csv
├── T.arrow
├── T.csv
├── cc_C.rds
├── cc_T.rds
├── estimationNumCluster_C_functional_dataset_single.pdf
├── estimationNumCluster_C_structural_dataset_single.pdf
├── estimationNumCluster_T_functional_dataset_single.pdf
└── estimationNumCluster_T_structural_dataset_single.pdf
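Because the jobs run in the background, it is easy to miss one that stopped early. The following sketch reports which outputs exist for each group. check_outputs is a hypothetical helper, and the file names are assumed to follow the pattern in the listing above (one .rds file plus a functional and a structural PDF per group).

```shell
# Hypothetical helper (not in the original post): list the expected outputs
# for one group ID and report whether each file exists.
check_outputs() {
  local id="$1" missing=0 f
  for f in "cc_${id}.rds" \
           "estimationNumCluster_${id}_functional_dataset_single.pdf" \
           "estimationNumCluster_${id}_structural_dataset_single.pdf"; do
    if [ -f "$f" ]; then
      echo "ok      $f"
    else
      echo "missing $f"
      missing=1
    fi
  done
  # Non-zero status when at least one file is absent.
  return "$missing"
}

for id in C T; do
  check_outputs "$id" || echo "group ${id} is incomplete"
done
```

If a group is reported as incomplete, its log file (C.log or T.log in the commands above) is the first place to look.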
Rscript
using(optparse, data.table, tidyverse, CellChat, arrow, reticulate)

# Command-line options. -i is required; if -y is omitted, reticulate picks its default Python.
option_list <- list(
  make_option(c("-w", "--workdir"), type = "character", action = "store", default = "./", help = "Working directory"),
  make_option(c("-i", "--id"), type = "character", action = "store", default = NULL, help = "File identifier"),
  make_option(c("-s", "--species"), type = "character", action = "store", default = "human", help = "Species"),
  make_option(c("-n", "--n_jobs"), type = "integer", action = "store", default = 8, help = "Number of threads"),
  make_option(c("-y", "--python"), type = "character", action = "store", default = NULL, help = "Path to Python")
)
opt <- parse_args(OptionParser(
  option_list = option_list, add_help_option = TRUE,
  usage = "Usage: %prog [options] \nDescription: CellChat pre!"
))
if (is.null(opt$id)) stop("Please provide a file identifier with -i/--id")
setwd(opt$workdir)
# Point reticulate at the Python environment created above
if (!is.null(opt$python)) reticulate::use_python(opt$python)
# ==============================================================================
# 1. Create the CellChat object
# ==============================================================================
# The Arrow file has one row per cell and one column per gene;
# transpose it to the genes-by-cells matrix that createCellChat expects.
mt <- arrow::read_ipc_file(str_glue("{opt$id}.arrow")) %>%
  column_to_rownames("index") %>%
  t() %>%
  as.matrix()
# Cell metadata: barcodes become row names, CellType is the grouping column
df <- data.table::fread(str_glue("{opt$id}.csv")) %>%
  column_to_rownames("Cell")
cc <- createCellChat(mt, meta = df, group.by = "CellType")
cc@DB <- switch(opt$species,
  human = CellChat::CellChatDB.human,
  mouse = CellChat::CellChatDB.mouse
)
# ==============================================================================
# 2. Preprocess the expression data + infer the cell-cell communication network
# ==============================================================================
future::plan("multisession", workers = opt$n_jobs)
cc <- subsetData(cc)
cc <- identifyOverExpressedGenes(cc)
cc <- identifyOverExpressedInteractions(cc)
cc <- projectData(cc, adjMatrix = switch(opt$species,
  human = PPI.human,
  mouse = PPI.mouse
))
cc <- computeCommunProb(cc, raw.use = FALSE, population.size = TRUE)
cc <- filterCommunication(cc, min.cells = 10)
cc <- computeCommunProbPathway(cc)
cc <- aggregateNet(cc)
cc <- netAnalysis_computeCentrality(cc, slot.name = "netP")
# Functional similarity: embed and cluster the signaling networks
cc <- computeNetSimilarity(cc, type = "functional")
cc <- netEmbedding(cc, type = "functional")
cc <- netClustering(cc, fig.id = opt$id, type = "functional", do.parallel = FALSE)
# Structural similarity: same three steps
cc <- computeNetSimilarity(cc, type = "structural")
cc <- netEmbedding(cc, type = "structural")
cc <- netClustering(cc, fig.id = opt$id, type = "structural", do.parallel = FALSE)
saveRDS(cc, file = str_glue("cc_{opt$id}.rds"))
Viewing the help
Rscript CellChat_1.R -h
Usage: CellChat_1.R [options]
Description: CellChat pre!
Options:
    -w WORKDIR, --workdir=WORKDIR
        Working directory
    -i ID, --id=ID
        File identifier
    -s SPECIES, --species=SPECIES
        Species
    -n N_JOBS, --n_jobs=N_JOBS
        Number of threads
    -y PYTHON, --python=PYTHON
        Path to Python
    -h, --help
        Show this help message and exit
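Notes
createCellChat: a CellChat object can be built in several ways. I used the approach above because my data were exported from AnnData; you can also build the object from Seurat. The expression input can be: a normalized (NOT count) data matrix (genes by cells), Seurat or SingleCellExperiment object.
netClustering: I set do.parallel = FALSE because running it with multiple threads raises an error.
The CellChat database (CellChat::CellChatDB.human or CellChat::CellChatDB.mouse) can also be subset with the subsetDB function.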
Reference
https://www.jianshu.com/p/da145cff3d41
https://www.jianshu.com/p/b3d26ac51c5a
https://cloud.tencent.com/developer/inventory/26535/article/1935670