Stem cells are characterized by their ability to differentiate into multiple cell types. At the molecular level, this pluripotency is reflected in the gene expression patterns:
The signaling entropy quantifies this “randomness” of information flow through the protein-protein interaction (PPI) network.
Given a gene expression profile \(\mathbf{x} = (x_1, ..., x_n)\) and an adjacency matrix \(\mathbf{A}\) of the PPI network, the SR is computed as follows:
For each gene \(j\), compute the probability of signaling to neighbor \(k\):
\[P_{jk} = \frac{A_{jk} \cdot x_k}{\sum_l A_{jl} \cdot x_l}\]
This represents the probability that a signal at gene \(j\) will transition to gene \(k\), weighted by the expression level of \(k\).
The stationary distribution \(\pi_j\) represents the long-term probability of finding a signal at gene \(j\):
\[\pi_j = \frac{x_j \cdot (\mathbf{A}\mathbf{x})_j}{Z}\]
where \(Z = \sum_j x_j \cdot (\mathbf{A}\mathbf{x})_j\) is the normalization constant.
The local entropy at gene \(j\) measures the uncertainty in signaling transitions:
\[S_j = -\sum_k P_{jk} \log P_{jk}\]
The signaling entropy rate is the weighted average of local entropies:
\[SR = \frac{\sum_j \pi_j S_j}{SR_{max}}\]
where \(SR_{max} = \log(\lambda_1)\) and \(\lambda_1\) is the largest eigenvalue of \(\mathbf{A}\).
# Network statistics
n_genes <- nrow(net13Jun12.m)
n_edges <- sum(net13Jun12.m) / 2
degrees <- rowSums(net13Jun12.m)
cat("Network Statistics:\n")
#> Network Statistics:
cat(" Genes:", n_genes, "\n")
#> Genes: 8434
cat(" Interactions:", n_edges, "\n")
#> Interactions: 303600
cat(" Mean degree:", round(mean(degrees), 2), "\n")
#> Mean degree: 71.99
cat(" Max degree:", max(degrees), "\n")
#> Max degree: 1030# Degree distribution
df_degree <- data.frame(degree = degrees)
ggplot(df_degree, aes(x = degree)) +
geom_histogram(bins = 50, fill = "#3498db", alpha = 0.7, color = "white") +
scale_x_log10() +
labs(
title = "PPI Network Degree Distribution",
subtitle = "Scale-free network property",
x = "Degree (log scale)",
y = "Count"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold"))set.seed(123)
# Create two contrasting expression patterns
n_genes_sim <- 5500
n_cells <- 20
# Pattern 1: Uniform expression (high entropy)
exp_uniform <- matrix(rep(5, n_genes_sim * n_cells), nrow = n_genes_sim)
rownames(exp_uniform) <- head(rownames(net13Jun12.m), n_genes_sim)
# Pattern 2: Focused expression (low entropy)
exp_focused <- matrix(1, nrow = n_genes_sim, ncol = n_cells)
exp_focused[1:500, ] <- 50 # High expression in subset
rownames(exp_focused) <- head(rownames(net13Jun12.m), n_genes_sim)
# Compute SR
integ_uniform <- DoIntegPPI(exp_uniform, net13Jun12.m)
integ_focused <- DoIntegPPI(exp_focused, net13Jun12.m)
sr_uniform <- CompSRana(integ_uniform)
sr_focused <- CompSRana(integ_focused)
cat("Uniform expression SR:", round(mean(sr_uniform$SR), 4), "\n")
#> Uniform expression SR: 0.9121
cat("Focused expression SR:", round(mean(sr_focused$SR), 4), "\n")
#> Focused expression SR: 0.8868df_patterns <- data.frame(
Pattern = c(rep("Uniform\n(Pluripotent-like)", n_cells),
rep("Focused\n(Differentiated-like)", n_cells)),
SR = c(sr_uniform$SR, sr_focused$SR)
)
ggplot(df_patterns, aes(x = Pattern, y = SR, fill = Pattern)) +
geom_boxplot(alpha = 0.7, outlier.shape = NA) +
geom_jitter(width = 0.2, alpha = 0.5, size = 2) +
scale_fill_manual(values = c("#e74c3c", "#3498db")) +
labs(
title = "SR Reflects Expression Pattern Entropy",
subtitle = "Higher SR = More pluripotent-like state",
x = "",
y = "Signaling Entropy Rate"
) +
theme_minimal() +
theme(
plot.title = element_text(face = "bold"),
legend.position = "none"
)CCAT (Correlation of Connectome And Transcriptome) is based on the observation that:
Pluripotent cells express hub genes (high-degree nodes) at higher levels
The CCAT score is simply the Pearson correlation between gene expression and network degree:
\[CCAT = cor(\mathbf{x}, \mathbf{k})\]
where \(\mathbf{k} = (k_1, ..., k_n)\) are the node degrees.
# Demonstrate CCAT
ccat_uniform <- CompCCAT(exp_uniform, net13Jun12.m)
ccat_focused <- CompCCAT(exp_focused, net13Jun12.m)
cat("Uniform expression CCAT:", round(mean(ccat_uniform), 4), "\n")
#> Uniform expression CCAT: NA
cat("Focused expression CCAT:", round(mean(ccat_focused), 4), "\n")
#> Focused expression CCAT: 0.4218The mathematical connection:
Therefore, SR and CCAT capture the same biological phenomenon from different angles.
set.seed(42)
exp_test <- matrix(rpois(5500 * 100, 5), nrow = 5500)
rownames(exp_test) <- head(rownames(net13Jun12.m), 5500)
integ_test <- DoIntegPPI(exp_test, net13Jun12.m)
sr_test <- CompSRana(integ_test)
ccat_test <- CompCCAT(exp_test, net13Jun12.m)
r <- cor(sr_test$SR, ccat_test)
cat("SR-CCAT correlation in random data: r =", round(r, 3), "\n")
#> SR-CCAT correlation in random data: r = 0.856
cat("(Original paper reports r ~ 0.78)\n")
#> (Original paper reports r ~ 0.78)sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] Matrix_1.7-5 ggplot2_4.0.3 SCENT_2.0.0 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 jsonlite_2.0.0 dplyr_1.2.1 compiler_4.6.0
#> [5] tidyselect_1.2.1 Rcpp_1.1.1-1.1 jquerylib_0.1.4 scales_1.4.0
#> [9] yaml_2.3.12 fastmap_1.2.0 lattice_0.22-9 R6_2.6.1
#> [13] labeling_0.4.3 generics_0.1.4 igraph_2.3.2 knitr_1.51
#> [17] tibble_3.3.1 maketools_1.3.2 bslib_0.11.0 pillar_1.11.1
#> [21] RColorBrewer_1.1-3 rlang_1.2.0 cachem_1.1.0 xfun_0.59
#> [25] sass_0.4.10 sys_3.4.3 S7_0.2.2 otel_0.2.0
#> [29] cli_3.6.6 withr_3.0.3 magrittr_2.0.5 digest_0.6.39
#> [33] grid_4.6.0 lifecycle_1.0.5 vctrs_0.7.3 evaluate_1.0.5
#> [37] glue_1.8.1 farver_2.1.2 buildtools_1.0.0 tools_4.6.0
#> [41] pkgconfig_2.0.3 htmltools_0.5.9