Gene Regulatory Network (GRN) inference is the foundation of CellOracleR analysis. This vignette provides detailed documentation of the GRN inference pipeline.
CellOracleR requires a normalized expression matrix (genes × cells):
# From Seurat object
library(Seurat)
library(CellOracleR)
# Create Oracle object
oracle <- create_oracle(
  seurat_obj,
  cluster_col = "seurat_clusters",
  embedding_name = "umap"
)
# Extract expression data
expr_matrix <- oracle$get_expression()
dim(expr_matrix)  # genes × cells
Recommendations:
The TF-target dictionary specifies which TFs can potentially regulate each gene:
# Structure: list with gene names as keys
# Each element contains a character vector of potential TF regulators
TFdict <- list(
  Gene1 = c("TF_A", "TF_B", "TF_C"),
  Gene2 = c("TF_A", "TF_D"),
  Gene3 = c("TF_B", "TF_C", "TF_D", "TF_E")
)
Sources for TF-target priors:
GRNs are fitted separately for each cell cluster to capture cell-type-specific regulatory logic.
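To make the per-cluster setup concrete, the sketch below shows how a TF-target dictionary and a cluster label could be combined to build the regression inputs for one target gene. It uses base R only. The helper `build_design()`, the toy expression values, and the choice to drop unmeasured TFs and self-regulation are illustrative assumptions, not part of the CellOracleR API.

```r
# Toy inputs; all values are made up for illustration
set.seed(42)
genes <- c("Gene1", "TF_A", "TF_B", "TF_C")
expr <- matrix(
  rnorm(4 * 12), nrow = 4,
  dimnames = list(genes, paste0("cell", 1:12))
)
clusters <- rep(c("Cluster1", "Cluster2"), each = 6)
TFdict <- list(Gene1 = c("TF_A", "TF_B", "TF_Z"))  # TF_Z is not measured

# Hypothetical helper: predictors (TFs x cells) and response for one
# target gene, restricted to the cells of one cluster
build_design <- function(expr, clusters, TFdict, target, cluster) {
  cells <- which(clusters == cluster)
  tfs <- intersect(TFdict[[target]], rownames(expr))  # keep measured TFs only
  tfs <- setdiff(tfs, target)                         # no self-regulation
  list(
    X = expr[tfs, cells, drop = FALSE],
    y = expr[target, cells]
  )
}

design <- build_design(expr, clusters, TFdict, "Gene1", "Cluster1")
dim(design$X)  # TFs x cells in Cluster1
```

Repeating this for every target gene and every cluster yields one small regression problem per (gene, cluster) pair.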
The regularization parameter α controls the bias-variance trade-off: larger values shrink the coefficients more strongly toward zero.
Guidelines for choosing α:
| α value | Effect | Use case |
|---|---|---|
| 0.1-1 | Minimal shrinkage | High-quality data, few predictors |
| 1-10 | Moderate shrinkage | Standard scRNA-seq (recommended) |
| 10-100 | Strong shrinkage | Noisy data, many predictors |
| >100 | Heavy shrinkage | Highly correlated predictors |
Default: α = 10, which provides a good balance for typical scRNA-seq data.
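The effect of α can be illustrated with a closed-form ridge fit. The sketch below assumes α enters the objective as ‖y − Xβ‖² + α‖β‖², with centered predictors and an unpenalized intercept. The function is named `ridge_fit` only to match the placeholder used in the bagging pseudocode; it is a hypothetical helper on simulated data, not the CellOracleR implementation.

```r
# Minimal ridge sketch (assumption: closed-form normal equations)
# X: TF expression (TFs x cells); y: target expression (one value per cell)
ridge_fit <- function(X, y, alpha = 10) {
  Z <- t(X)                       # cells x TFs
  Z <- sweep(Z, 2, colMeans(Z))   # center predictors
  yc <- y - mean(y)               # center response
  beta <- solve(crossprod(Z) + alpha * diag(ncol(Z)), crossprod(Z, yc))
  setNames(drop(beta), rownames(X))
}

# Simulated data: Gene depends on TF_A and TF_B, not on TF_C
set.seed(1)
n_cells <- 300
X <- matrix(
  rnorm(3 * n_cells), nrow = 3,
  dimnames = list(c("TF_A", "TF_B", "TF_C"), NULL)
)
y <- 0.8 * X["TF_A", ] - 0.5 * X["TF_B", ] + rnorm(n_cells, sd = 0.3)

coef_weak <- ridge_fit(X, y, alpha = 1)      # close to least squares
coef_strong <- ridge_fit(X, y, alpha = 100)  # visibly shrunk toward zero
rbind(coef_weak, coef_strong)
```

With 300 cells, α = 1 barely changes the estimates, while α = 100 shrinks the overall coefficient norm.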
# Pseudocode for bagging
# X: TF expression (TFs × cells); y: target-gene expression (one value per cell)
# ridge_fit() is a placeholder for any function returning one coefficient per TF
bagging_coefficients <- function(X, y, n_bootstrap = 200, sample_ratio = 0.8, alpha = 10) {
  n_cells <- ncol(X)
  n_sample <- floor(n_cells * sample_ratio)
  coef_list <- vector("list", n_bootstrap)
  for (b in seq_len(n_bootstrap)) {
    # Random subsample of cells, drawn without replacement
    idx <- sample(n_cells, n_sample, replace = FALSE)
    # Fit Ridge regression on the subsample
    coef_list[[b]] <- ridge_fit(X[, idx, drop = FALSE], y[idx], alpha = alpha)
  }
  # Aggregate: per-TF median across fits (robust to outliers)
  final_coef <- apply(do.call(cbind, coef_list), 1, median)
  return(final_coef)
}
The Links class stores and analyzes the fitted GRN:
# Get Links object from Oracle
links <- oracle$get_links(
  cluster_name_for_GRN_unit = "Cluster1"
)
# Explore the network
links$filter(
  threshold_p = 0.001,   # p-value threshold
  threshold_coef = 0.1   # coefficient magnitude threshold
)
# Calculate network metrics
links$get_network_score()
# Export for visualization
links$get_igraph()
| Source (TF) | Target (Gene) | Coefficient | P-value |
|---|---|---|---|
| TF_A | Gene_1 | 0.85 | 0 |
| TF_A | Gene_2 | 0.42 | 0 |
| TF_B | Gene_1 | -0.61 | 0 |
| TF_B | Gene_3 | 0.33 | 0 |
| TF_C | Gene_2 | 0.71 | 0 |
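The edge table above can be explored with plain data-frame operations. The sketch below rebuilds it as a `data.frame` and applies the two thresholds by hand. `filter_links()` is a hypothetical stand-in written for illustration, and the comparison directions (`<=` for p-values, `>=` for absolute coefficients) are assumptions about how such a filter would behave, not the documented behavior of `links$filter()`.

```r
# Edge list copied from the example table
links_df <- data.frame(
  source = c("TF_A", "TF_A", "TF_B", "TF_B", "TF_C"),
  target = c("Gene_1", "Gene_2", "Gene_1", "Gene_3", "Gene_2"),
  coef   = c(0.85, 0.42, -0.61, 0.33, 0.71),
  p      = c(0, 0, 0, 0, 0)
)

# Hypothetical filter: keep significant edges with a large enough |coefficient|
filter_links <- function(df, threshold_p = 0.001, threshold_coef = 0.1) {
  keep <- df$p <= threshold_p & abs(df$coef) >= threshold_coef
  df[keep, , drop = FALSE]
}

strong <- filter_links(links_df, threshold_coef = 0.4)

# Out-degree per TF: the number of targets each TF regulates after filtering
out_degree <- table(strong$source)
out_degree
```

Filtering on the absolute value keeps strong repressive edges (negative coefficients, such as TF_B to Gene_1) alongside activating ones.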
| Component | Time Complexity | Memory |
|---|---|---|
| Ridge regression | O(n × p² + p³) per gene | O(p²) |
| Bagging (B iterations) | O(B × (n × p² + p³)) per gene | O(p²) |
| Full GRN | O(G × B × (n × p² + p³)) | O(G × p) |
Where: G = genes, n = cells, p = max TFs per gene, B = bootstrap iterations
CellOracleR uses the future package for parallel processing:
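Because each target gene is fitted independently, the per-gene loop parallelizes naturally. The sketch below shows one way such a loop could be written with `future.apply::future_lapply()`. How CellOracleR uses `future` internally is not shown here, so `fit_one_gene()` and the toy data are hypothetical, and the code falls back to `lapply()` when the packages are not installed.

```r
# Toy data; values are simulated for illustration only
set.seed(7)
tf_expr <- matrix(
  rnorm(3 * 50), nrow = 3,
  dimnames = list(c("TF_A", "TF_B", "TF_C"), NULL)
)
target_expr <- matrix(
  rnorm(4 * 50), nrow = 4,
  dimnames = list(paste0("Gene", 1:4), NULL)
)

# Hypothetical per-gene worker: closed-form ridge fit on centered data
fit_one_gene <- function(gene, alpha = 10) {
  Z <- t(tf_expr)
  Z <- sweep(Z, 2, colMeans(Z))
  y <- target_expr[gene, ] - mean(target_expr[gene, ])
  drop(solve(crossprod(Z) + alpha * diag(ncol(Z)), crossprod(Z, y)))
}

genes <- rownames(target_expr)

# The execution plan is chosen by the user; the default plan is sequential.
# Uncomment to run on two background R sessions:
# future::plan(future::multisession, workers = 2)

if (requireNamespace("future.apply", quietly = TRUE)) {
  coefs <- future.apply::future_lapply(genes, fit_one_gene)
} else {
  coefs <- lapply(genes, fit_one_gene)  # same result, no parallelism
}
names(coefs) <- genes
```

Leaving the `plan()` call to the user is the usual `future` convention: the same code runs sequentially, on multiple cores, or on a cluster without modification.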
Performance-critical operations are implemented in C++ via Rcpp:
The CellOracleR GRN inference pipeline:
This produces robust, interpretable gene regulatory networks suitable for perturbation simulation.
sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] patchwork_1.3.2 Matrix_1.7-5 ggplot2_4.0.3 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 jsonlite_2.0.0 dplyr_1.2.1 compiler_4.6.0
#> [5] tidyselect_1.2.1 jquerylib_0.1.4 splines_4.6.0 scales_1.4.0
#> [9] yaml_2.3.12 fastmap_1.2.0 lattice_0.22-9 R6_2.6.1
#> [13] labeling_0.4.3 generics_0.1.4 knitr_1.51 tibble_3.3.1
#> [17] maketools_1.3.2 bslib_0.11.0 pillar_1.11.1 RColorBrewer_1.1-3
#> [21] rlang_1.2.0 cachem_1.1.0 xfun_0.57 sass_0.4.10
#> [25] sys_3.4.3 S7_0.2.2 otel_0.2.0 viridisLite_0.4.3
#> [29] cli_3.6.6 mgcv_1.9-4 withr_3.0.2 magrittr_2.0.5
#> [33] digest_0.6.39 grid_4.6.0 nlme_3.1-169 lifecycle_1.0.5
#> [37] vctrs_0.7.3 evaluate_1.0.5 glue_1.8.1 farver_2.1.2
#> [41] buildtools_1.0.0 tools_4.6.0 pkgconfig_2.0.3 htmltools_0.5.9