Package 'BioTransition'

Title: Dynamic Network Biomarker Analysis for Critical Transitions
Description: A comprehensive toolkit for detecting critical transitions and identifying dynamic network biomarkers (DNB) in biological systems. Critical transitions, characterized by sudden shifts between distinct states, are prevalent in complex biological processes including disease progression, cellular differentiation, and developmental transitions. This package implements seven complementary DNB methodologies: (1) conventional DNB (cDNB) based on the original DNB theory (Chen et al. 2012 <doi:10.1038/srep00342>); (2) topological DNB (tDNB), a novel approach utilizing network topology and scale-free properties; (3) landscape DNB (LDNB) for quantifying state transitions (Liu et al. 2019 <doi:10.1093/nsr/nwy162>); (4) local DNB (LcDNB) leveraging protein-protein interaction networks; (5) module-based DNB (MDNB) for modular analysis (Li et al. 2022 <doi:10.1016/j.xinn.2022.100364>); (6) time-series network module biomarker (TSNMB) for temporal dynamics (Zhong et al. 2022 <doi:10.1093/jmcb/mjac052>); and (7) time-series landscape entropy (TSLE) analysis (Liu et al. 2020 <doi:10.1093/bioinformatics/btz758>). Core computational routines are implemented in C++ via 'Rcpp' for optimal performance. Compatible with bulk RNA-seq, single-cell RNA-seq, and spatial transcriptomics data. Includes curated protein-protein interaction networks for human and mouse from the STRING database.
Authors: Zaoqu Liu [aut, cre] (ORCID: <https://orcid.org/0000-0002-0452-742X>), Chuhan Zhang [ctb] (MDNB implementation)
Maintainer: Zaoqu Liu <[email protected]>
License: GPL (>= 3) + file LICENSE
Version: 2.0.0
Built: 2026-05-22 08:42:04 UTC
Source: https://github.com/SolvingLab/BioTransition

Help Index


Conventional DNB analysis

Description

Performs conventional dynamic network biomarker (cDNB) analysis based on the original DNB theory proposed by Chen et al. (2012).

Usage

cDNB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  variation.method = "sd",
  min.size = 10,
  max.size = 2000,
  AddModuleSize = FALSE
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: the first holds the sample names and the second holds the group or time-point label of each sample.

state.levels

A vector giving the order of the states.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

variation.method

Method for calculating gene variation: "sd" (standard deviation) or "cv" (coefficient of variation). Default: "sd".

min.size

Minimum number of genes in a gene module. Default: 10.

max.size

Maximum number of genes in a gene module. Default: 2000.

AddModuleSize

Logical. Whether to account for gene module size when calculating the DNB score. Default: FALSE.
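To make the two variation.method choices concrete, the following base-R sketch computes a per-gene standard deviation and a per-gene coefficient of variation. The helper name gene_variation is hypothetical and is not part of the package; the package's internal routine may differ in detail.

```r
# Hypothetical helper mirroring the two documented variation.method choices.
# expr: numeric matrix with genes in rows and samples in columns.
gene_variation <- function(expr, method = c("sd", "cv")) {
  method <- match.arg(method)
  s <- apply(expr, 1, sd)                 # per-gene standard deviation
  if (method == "sd") s else s / rowMeans(expr)
}

x <- rbind(GeneA = c(10, 12, 14), GeneB = c(100, 120, 140))
sd_var <- gene_variation(x, "sd")   # GeneA = 2, GeneB = 20
cv_var <- gene_variation(x, "cv")   # both genes = 1/6
```

The toy data show why the choice matters: GeneB varies ten times more than GeneA in absolute terms, yet both have the same variation relative to their mean.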

Value

A list containing:

DNB.score

A data.frame with composite index (CI) values for each state, including V_in (mean variation), R_in (mean correlation within DNB), and R_out (mean correlation with non-DNB genes)

DNB.genes

Character vector of genes in the identified DNB module

CI_all

List of data.frames with CI scores for all candidate modules in each state

Gene_module

List of gene modules detected in each state

Candidate

Data.frame summarizing the best candidate module per state

Cor

List of correlation matrices for each state

V

List of gene variation values for each state
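The three components reported in DNB.score combine into the composite index of Chen et al. (2012), CI = V_in * R_in / R_out. The base-R sketch below computes these quantities for one state and one candidate module. The helper name composite_index and the use of absolute Pearson correlations are assumptions made for illustration; this is not the package's internal routine.

```r
# Hypothetical helper: composite index for one module in one state.
# expr_state: genes x samples numeric matrix for a single state.
# module: character vector of candidate DNB gene names.
composite_index <- function(expr_state, module) {
  inside   <- rownames(expr_state) %in% module
  cors     <- abs(cor(t(expr_state)))               # gene-gene |PCC|
  v_in     <- mean(apply(expr_state[inside, , drop = FALSE], 1, sd))
  in_block <- cors[inside, inside, drop = FALSE]
  r_in     <- mean(in_block[upper.tri(in_block)])   # pairs within the module
  r_out    <- mean(cors[inside, !inside])           # module vs. other genes
  c(V_in = v_in, R_in = r_in, R_out = r_out, CI = v_in * r_in / r_out)
}

set.seed(1)
toy <- matrix(rnorm(20 * 6, mean = 10, sd = 2), nrow = 20,
              dimnames = list(paste0("Gene", 1:20), paste0("S", 1:6)))
ci <- composite_index(toy, module = paste0("Gene", 1:5))
ci
```

A module approaching a tipping point is expected to show high V_in, high R_in and low R_out, which drives CI up.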

Author(s)

Zaoqu Liu; E-mail: [email protected]

References

Chen L, Liu R, Liu ZP, Li M, Aihara K. (2012). Detecting early-warning signals for sudden deterioration of complex diseases by dynamical network biomarkers. Scientific Reports, 2:342. doi:10.1038/srep00342

See Also

tDNB for topological DNB, LcDNB for local DNB with PPI, MDNB for module-based DNB

Examples

# Create example data
set.seed(42)
n_genes <- 100
n_samples <- 15

expr <- matrix(
  rnorm(n_genes * n_samples, mean = 10, sd = 2),
  nrow = n_genes, ncol = n_samples
)
rownames(expr) <- paste0("Gene", seq_len(n_genes))
colnames(expr) <- paste0("Sample", seq_len(n_samples))

state <- data.frame(
  sample_id = colnames(expr),
  state = rep(c("A", "B", "C"), each = 5)
)


result <- cDNB(
  expr = expr,
  state = state,
  state.levels = c("A", "B", "C"),
  min.size = 5,
  max.size = 50
)
result$DNB.score

Local Conventional DNB Analysis

Description

Performs local conventional dynamic network biomarker (LcDNB) analysis using protein-protein interaction (PPI) networks.

Usage

LcDNB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  variation.method = "sd",
  min.first.neighbor.size = 3,
  min.second.neighbor.size = 1,
  ppi = ppi_h,
  min.combined.score = 900,
  percent = TRUE,
  top.n = 30,
  top.p = 0.05,
  AddModuleSize = FALSE
)

Arguments

expr

A numeric matrix with genes in rows and samples in columns.

state

A data.frame with sample IDs and state labels (2 columns).

state.levels

Character vector specifying the order of states.

cor.method

Correlation method: "pearson", "spearman", or "kendall".

p.adjust.method

P-value adjustment method. Default: "BH".

variation.method

Method for variation: "sd" or "cv". Default: "sd".

min.first.neighbor.size

Minimum first-order neighbors. Default: 3.

min.second.neighbor.size

Minimum second-order neighbors. Default: 1.

ppi

PPI network data.frame with G1, G2, combined_score columns.

min.combined.score

Minimum STRING score. Default: 900.

percent

Select top genes by proportion (top.p, TRUE) or by absolute number (top.n, FALSE). Default: TRUE.

top.n

Number of top genes when percent=FALSE. Default: 30.

top.p

Proportion when percent=TRUE. Default: 0.05.

AddModuleSize

Weight by module size. Default: FALSE.

Value

A list containing DNB.score, DNB.genes, CI_all, Gene_module, Candidate, Cor, V, PPI.used, first.order.genes, second.order.genes.
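The roles of min.combined.score and the first- and second-order neighbor thresholds can be illustrated on a toy PPI table in the documented G1, G2, combined_score layout. The sketch below uses base R only. The helper neighbours is hypothetical, and the ">=" comparison against the score threshold is an assumption about how the cutoff is applied.

```r
# Toy PPI table in the documented layout (G1, G2, combined_score).
ppi_toy <- data.frame(
  G1 = c("A", "A", "B", "C", "D"),
  G2 = c("B", "C", "D", "E", "F"),
  combined_score = c(950, 920, 990, 400, 910),
  stringsAsFactors = FALSE
)

# Keep only edges at or above the score threshold (assumed inclusive).
edges <- ppi_toy[ppi_toy$combined_score >= 900, c("G1", "G2")]

# Hypothetical helper: undirected neighbours of a set of genes.
neighbours <- function(genes, edges) {
  hits <- c(edges$G2[edges$G1 %in% genes], edges$G1[edges$G2 %in% genes])
  setdiff(unique(hits), genes)
}

first_order  <- neighbours("A", edges)                        # "B" "C"
second_order <- setdiff(neighbours(first_order, edges),
                        c("A", first_order))                  # "D"
```

With these toy edges, center gene A has two first-order neighbors and one second-order neighbor, so it would fail a min.first.neighbor.size of 3 but pass a min.second.neighbor.size of 1.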

Author(s)

Zaoqu Liu

See Also

cDNB, LDNB, ppi_h

Examples

# See vignette for detailed examples

Landscape DNB Analysis

Description

Performs landscape dynamic network biomarker (LDNB) analysis for detecting critical transitions. This method uses sample-specific perturbation networks (SSPN) to identify tipping points in biological state transitions.

Usage

LDNB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  ppi = ppi_h,
  min.combined.score = 990,
  min.first.neighbor.size = 10,
  min.second.neighbor.size = 1,
  use.PCC.P.type = "FDR",
  use.PCC.P.cutoff = 0.05,
  percent = FALSE,
  top.n = 30,
  top.p = 0.05,
  nCores = max(1, parallel::detectCores() - 2)
)

Arguments

expr

A numeric matrix or data.frame with genes in rows and samples in columns. Row names should be gene symbols.

state

A data.frame with exactly two columns: sample identifiers and state labels. Must include a "ref" state for reference samples.

state.levels

A character vector specifying the order of states. The first level should be "ref" (reference state).

cor.method

Character string specifying correlation method. One of "pearson" (default), "spearman", or "kendall".

p.adjust.method

Character string specifying p-value adjustment method. Default: "BH".

ppi

A data.frame containing protein-protein interactions with columns G1, G2, and combined_score.

min.combined.score

Numeric. Minimum STRING combined score. Default: 990.

min.first.neighbor.size

Integer. Minimum first-order neighbors. Default: 10.

min.second.neighbor.size

Integer. Minimum second-order neighbors. Default: 1.

use.PCC.P.type

Character. Type of p-value for filtering: "FDR" (default) or "NP" (nominal p-value).

use.PCC.P.cutoff

Numeric. P-value cutoff. Default: 0.05.

percent

Logical. Use percentage (top.p) or absolute number (top.n). Default: FALSE.

top.n

Integer. Number of top genes when percent = FALSE. Default: 30.

top.p

Numeric. Proportion when percent = TRUE. Default: 0.05.

nCores

Integer. Number of CPU cores for parallel computation.

Details

LDNB analysis requires a reference state (labeled as "ref" in the state column) to construct background networks. The algorithm:

  1. Constructs sample-specific perturbation networks (SSPN)

  2. Calculates local landscape indices for each gene

  3. Aggregates to global indices per sample and state

  4. Identifies critical state and DNB genes
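Step 1 can be pictured with a single gene pair. One common way to build a single-sample perturbation is to compare the correlation in the reference samples with the correlation after one case sample is appended. The sketch below shows that idea in base R. The helper delta_pcc is hypothetical, and whether LDNB computes its perturbation exactly this way is an assumption.

```r
# Hypothetical helper: change in Pearson correlation of one gene pair
# when a single case sample is appended to the reference samples.
delta_pcc <- function(ref_x, ref_y, case_x, case_y) {
  cor(c(ref_x, case_x), c(ref_y, case_y)) - cor(ref_x, ref_y)
}

ref_x <- c(1, 2, 3, 4, 5)
ref_y <- c(2, 4, 6, 8, 10)     # perfectly correlated in the reference

# A case sample that follows the reference trend barely changes the PCC.
d_consistent <- delta_pcc(ref_x, ref_y, case_x = 6, case_y = 12)

# A case sample that breaks the trend produces a large negative shift.
d_perturbed <- delta_pcc(ref_x, ref_y, case_x = 6, case_y = 0)
```

Edges with a large shift for a given sample are the ones that would enter that sample's perturbation network.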

Value

A list containing:

state.GI

Data.frame with global indices per state

DNB.genes

Character vector of identified DNB genes

Gene.LI

Data.frame of genes ranked by landscape index

case.GI

Data.frame with global index per sample

case.LI.list

List of local indices per sample

SSPN

Sample-specific perturbation network results

Author(s)

Zaoqu Liu [email protected]

References

Liu X, et al. (2019). Detection for disease tipping points by landscape dynamic network biomarkers. National Science Review, 6(4):775-785. doi:10.1093/nsr/nwy162

See Also

SSPN1, LcDNB, SLE

Examples

# LDNB requires a reference state
# See vignette for detailed examples
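
The input layout described under Arguments (a two-column state table containing a "ref" state, with "ref" as the first level) can be prepared as in this minimal sketch. The sample names and the labels "T1" and "T2" are placeholders chosen for illustration.

```r
# Two-column state table: sample identifiers and state labels.
samples <- paste0("Sample", 1:9)
state <- data.frame(
  sample_id = samples,
  state = rep(c("ref", "T1", "T2"), each = 3),
  stringsAsFactors = FALSE
)

# "ref" comes first, followed by the remaining states in order.
state.levels <- c("ref", "T1", "T2")

# Quick checks before calling LDNB.
stopifnot(ncol(state) == 2,
          "ref" %in% state$state,
          state.levels[1] == "ref")

ref_samples <- state$sample_id[state$state == "ref"]
```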

Module-based Dynamic Network Biomarker (MDNB) Analysis

Description

Performs module-based dynamic network biomarker (MDNB) analysis using a PPI network.

Usage

MDNB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  ppi = ppi_h,
  min.combined.score = 900,
  PCC.min = 0.02,
  PCC.module = 0.02,
  numeber.module.QI = 10
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: sample names and group information.

state.levels

A vector for state sequence (e.g., cell groups, time points, treatment conditions).

cor.method

Specifies the method for correlation analysis.

ppi

Protein-protein interaction network; background network.

min.combined.score

Minimum combined score for determining protein-protein interaction.

PCC.min

Remove genes if their highest correlation with any other gene is less than this value.

PCC.module

A gene must maintain at least this minimum correlation with the seed gene in every group to be included in the module.

numeber.module.QI

Number of gene modules selected for QI calculation.
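The PCC.min filter described above can be sketched in base R as follows. The helper filter_by_max_pcc is hypothetical, and taking the absolute value of the correlation is an assumption; the package may filter on signed correlations or within each state separately.

```r
# Hypothetical helper: drop genes whose strongest absolute correlation
# with any other gene is below pcc_min.
filter_by_max_pcc <- function(expr, pcc_min) {
  cors <- abs(cor(t(expr)))
  diag(cors) <- 0                         # ignore self-correlation
  expr[apply(cors, 1, max) >= pcc_min, , drop = FALSE]
}

x <- rbind(
  G1 = c(1, 2, 3, 4),
  G2 = c(2, 4, 6, 8.5),                   # tracks G1 closely
  G3 = c(1, -1, -1, 1)                    # nearly uncorrelated with both
)
kept <- rownames(filter_by_max_pcc(x, pcc_min = 0.5))   # "G1" "G2"
```

With the default PCC.min of 0.02 the filter is very permissive; the 0.5 threshold here is used only so that the toy example removes a gene.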

Value

A list containing:

  • DNB.score: The DNB score values (QI values) calculated for different groups.

  • DNB.genes: Genes from the module with highest CI in the critical state.

  • CI_all: Module contribution index (CI) details for each state.

  • Gene_module: Generated co-expression gene modules.

  • QIplot: Line chart showing QI changes across states.

  • metadata: Additional information about the analysis.

Author(s)

Zaoqu Liu, Chuhan Zhang; Email: [email protected]

References

Li L, Xu Y, Yan L, et al. Dynamic network biomarker factors orchestrate cell-fate determination at tipping points during hESC differentiation. Innovation (Camb). 2022 Dec 20;4(1):100364. doi: 10.1016/j.xinn.2022.100364


Human Protein-Protein Interaction Network

Description

Protein-protein interaction network for human (Homo sapiens) from STRING database v12.0. This dataset contains interactions with confidence scores, suitable for network-based DNB analyses including LcDNB, MDNB, LDNB, TSNMB, and TSLE.

Usage

ppi_h

Format

A data frame with protein-protein interactions:

G1

Gene 1 symbol (character). First interacting gene.

G2

Gene 2 symbol (character). Second interacting gene.

combined_score

Interaction confidence score (numeric, 0-999). Higher scores indicate stronger evidence for interaction. Typical cutoffs: 400 (medium), 700 (high), 900 (highest).

Details

Data Source: STRING database v12.0 (https://string-db.org/)

Species: Homo sapiens (NCBI Taxonomy ID: 9606)

Recommended Score Cutoffs:

  • 400-600: Medium confidence (broad coverage)

  • 700-800: High confidence (balanced)

  • 900+: Highest confidence (most reliable)

Usage in DNB Analysis: Use this PPI network for human single-cell or bulk RNA-seq data with PPI-based DNB methods: LcDNB, MDNB, LDNB, TSNMB, TSLE.

Source

STRING database v12.0 https://string-db.org/ https://stringdb-downloads.org/

References

Szklarczyk D, et al. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023;51(D1):D638-D646.

See Also

ppi_m for mouse PPI network

Examples

# Load human PPI
data("ppi_h")

# Explore the network
dim(ppi_h)
head(ppi_h)

# Check gene coverage
all_genes <- unique(c(ppi_h$G1, ppi_h$G2))
length(all_genes)

# Filter by confidence
high_conf <- ppi_h[ppi_h$combined_score >= 700, ]
nrow(high_conf)

Mouse Protein-Protein Interaction Network

Description

Protein-protein interaction network for mouse (Mus musculus) from STRING database v12.0. This dataset contains interactions with confidence scores, suitable for network-based DNB analyses including LcDNB, MDNB, LDNB, TSNMB, and TSLE.

Usage

ppi_m

Format

A data frame with 12,684,354 protein-protein interactions:

G1

Gene 1 symbol (character). First interacting gene.

G2

Gene 2 symbol (character). Second interacting gene.

combined_score

Interaction confidence score (numeric, 0-999). Higher scores indicate stronger evidence for interaction. Typical cutoffs: 400 (medium), 700 (high), 900 (highest).

Details

Data Source: STRING database v12.0 (https://string-db.org/)

Species: Mus musculus (NCBI Taxonomy ID: 10090)

Statistics:

  • Total interactions: 12,684,354

  • Unique genes: 21,645

  • Score range: 150-999

Recommended Score Cutoffs:

  • 400-600: Medium confidence (broad coverage)

  • 700-800: High confidence (balanced)

  • 900+: Highest confidence (most reliable)

Usage in DNB Analysis: Use this PPI network for mouse single-cell or bulk RNA-seq data with PPI-based DNB methods: LcDNB, MDNB, LDNB, TSNMB, TSLE.

Source

STRING database v12.0 https://string-db.org/ https://stringdb-downloads.org/

References

Szklarczyk D, et al. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023;51(D1):D638-D646.

See Also

ppi_h for human PPI network

Examples

# Load mouse PPI
data("ppi_m")

# Explore the network
dim(ppi_m)
head(ppi_m)

# Check gene coverage
all_genes <- unique(c(ppi_m$G1, ppi_m$G2))
length(all_genes) # 21,645 genes

# Filter by confidence
high_conf <- ppi_m[ppi_m$combined_score >= 700, ]
nrow(high_conf)

## Not run: 
# Use in DNB analysis
result <- MDNB(
  expr = mouse_expr,
  state = sample_groups,
  state.levels = c("Control", "Treatment"),
  ppi = ppi_m,
  min.combined.score = 700
)

## End(Not run)

Single-sample landscape entropy analysis

Description

Performs single-sample landscape entropy (SLE) analysis.

Usage

SLE(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  ppi = ppi_h,
  min.combined.score = 900,
  min.first.neighbor.size = 1,
  percent = TRUE,
  top.n = 30,
  top.p = 0.05,
  nCores = parallel::detectCores() - 10
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: the first holds the sample names and the second holds the group or time-point label of each sample.

state.levels

A vector giving the order of the states.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

ppi

Protein-protein interaction network used as the background network. Default: ppi_h.

min.combined.score

Minimum combined score for a protein-protein interaction to be kept. Default: 900.

min.first.neighbor.size

Minimum number of first-order neighbor genes required for a center gene. Default: 1.

percent

Logical. Whether to select DNB genes by proportion (top.p) rather than by absolute number (top.n). Default: TRUE.

top.n

Used only when percent = FALSE. The top.n center genes with the highest DNB scores are defined as DNB genes. Default: 30.

top.p

Used only when percent = TRUE. The top proportion top.p of center genes, ranked by DNB score, are defined as DNB genes. Default: 0.05.

nCores

Number of CPU cores to use. The default, parallel::detectCores() - 10, evaluates to zero or a negative number on machines with 10 or fewer cores; set nCores explicitly in that case.
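The interplay of percent, top.n and top.p can be sketched as a small selection routine. The helper select_top is hypothetical, and rounding the proportion up with ceiling() is an assumption; the package may round differently.

```r
# Hypothetical helper: pick top-scoring center genes by proportion or count.
select_top <- function(scores, percent = TRUE, top.n = 30, top.p = 0.05) {
  k <- if (percent) ceiling(length(scores) * top.p) else top.n
  k <- min(k, length(scores))             # never ask for more than exist
  names(sort(scores, decreasing = TRUE))[seq_len(k)]
}

scores <- setNames(c(0.9, 0.1, 0.5, 0.7, 0.3), paste0("Gene", 1:5))
by_prop  <- select_top(scores, percent = TRUE,  top.p = 0.4)  # Gene1, Gene4
by_count <- select_top(scores, percent = FALSE, top.n = 3)    # Gene1, Gene4, Gene3
```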

Author(s)

Zaoqu Liu; E-mail: [email protected]


Single-sample network module biomarker analysis

Description

Performs single-sample network module biomarker (sNMB) analysis.

Usage

sNMB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  ppi = ppi_h,
  min.combined.score = 900,
  min.first.neighbor.size = 3,
  percent = TRUE,
  top.n = 30,
  top.p = 0.05,
  nCores = parallel::detectCores() - 10
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: the first holds the sample names and the second holds the group or time-point label of each sample.

state.levels

A vector giving the order of the states.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

ppi

Protein-protein interaction network used as the background network. Default: ppi_h.

min.combined.score

Minimum combined score for a protein-protein interaction to be kept. Default: 900.

min.first.neighbor.size

Minimum number of first-order neighbor genes required for a center gene. Default: 3.

percent

Logical. Whether to select DNB genes by proportion (top.p) rather than by absolute number (top.n). Default: TRUE.

top.n

Used only when percent = FALSE. The top.n center genes with the highest DNB scores are defined as DNB genes. Default: 30.

top.p

Used only when percent = TRUE. The top proportion top.p of center genes, ranked by DNB score, are defined as DNB genes. Default: 0.05.

nCores

Number of CPU cores to use. The default, parallel::detectCores() - 10, evaluates to zero or a negative number on machines with 10 or fewer cores; set nCores explicitly in that case.
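Because the default nCores expression subtracts 10 from the detected core count, it can be useful to compute an explicit value before calling the function. The sketch below borrows the max(1, ...) pattern from the LDNB default; the variable names are placeholders.

```r
n_detected <- parallel::detectCores()     # may be NA on some platforms

# The documented default; zero or negative on machines with <= 10 cores.
default_cores <- n_detected - 10

# A safer explicit choice: leave two cores free, but never go below one.
safe_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 2L)
safe_cores
```

The resulting value can then be passed as nCores = safe_cores.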

Author(s)

Zaoqu Liu; E-mail: [email protected]


Sample-specific perturbation network based on PPI network

Description

Performs sample-specific perturbation network (SSPN) analysis based on a PPI network.

Usage

SSPN1(
  expr,
  ref.samples,
  cor.method = "pearson",
  p.adjust.method = "BH",
  ppi = ppi_h,
  min.combined.score = 900,
  nCores = parallel::detectCores() - 10
)

Arguments

expr

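An expression data frame with genes in rows and samples in columns.

ref.samples

Samples used to construct the background reference network, given as column indices or column names of expr.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

ppi

Protein-protein interaction network used as the background network. Default: ppi_h.

min.combined.score

Minimum combined score for a protein-protein interaction to be kept. Default: 900.

nCores

Number of CPU cores to use. The default, parallel::detectCores() - 10, evaluates to zero or a negative number on machines with 10 or fewer cores; set nCores explicitly in that case.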
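Since ref.samples accepts either column indices or column names, the two forms can be checked against each other before the call. The following sketch uses a toy matrix; all object names are placeholders.

```r
set.seed(7)
expr <- matrix(rnorm(4 * 6), nrow = 4,
               dimnames = list(paste0("Gene", 1:4), paste0("S", 1:6)))

# Reference samples by name, and the equivalent column indices.
ref_by_name  <- c("S1", "S2", "S3")
ref_by_index <- match(ref_by_name, colnames(expr))

# Split the matrix into reference and case samples.
ref_expr  <- expr[, ref_by_name, drop = FALSE]
case_expr <- expr[, setdiff(colnames(expr), ref_by_name), drop = FALSE]
```

Either ref_by_name or ref_by_index could then be supplied as ref.samples.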

Author(s)

Zaoqu Liu; E-mail: [email protected]


Sample-specific perturbation network based on customized network

Description

Performs sample-specific perturbation network (SSPN) analysis based on a customized network.

Usage

SSPN2(
  expr,
  ref.samples,
  net,
  cor.method = "pearson",
  p.adjust.method = "BH",
  nCores = parallel::detectCores() - 10
)

Arguments

expr

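An expression data frame with genes in rows and samples in columns.

ref.samples

Samples used to construct the background reference network, given as column indices or column names of expr.

net

A customized network given as a two-column table of gene pairs.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

nCores

Number of CPU cores to use. The default, parallel::detectCores() - 10, evaluates to zero or a negative number on machines with 10 or fewer cores; set nCores explicitly in that case.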
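A customized network for the net argument can be assembled as a plain two-column data frame. In the sketch below the column names G1 and G2 are borrowed from the ppi_h layout as an assumption, and pre-filtering edges to measured genes is a precaution that the function may or may not require.

```r
# Genes present in the expression data (toy example).
genes <- paste0("Gene", 1:5)

# Two-column edge list; column names G1/G2 are assumed, not documented.
net <- data.frame(
  G1 = c("Gene1", "Gene2", "Gene3", "Gene9"),
  G2 = c("Gene2", "Gene3", "Gene1", "Gene1"),
  stringsAsFactors = FALSE
)

# Keep only edges whose two endpoints are both measured.
net_used <- net[net$G1 %in% genes & net$G2 %in% genes, ]
nrow(net_used)   # 3 edges remain; the Gene9 edge is dropped
```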

Author(s)

Zaoqu Liu; E-mail: [email protected]


Topological DNB analysis

Description

Performs topological dynamic network biomarker (tDNB) analysis using scale-free network topology. This method, developed by Liu Z., uses the topological overlap matrix (TOM) with the aim of more robust DNB detection.

Usage

tDNB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  variation.method = "sd",
  min.size = 10,
  max.size = 2000,
  AddModuleSize = FALSE,
  power.vec = c(seq_len(10), seq(from = 12, to = 20, by = 2)),
  network.type = "signed",
  RsquaredCut = 0.85
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: the first holds the sample names and the second holds the group or time-point label of each sample.

state.levels

A vector giving the order of the states.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

variation.method

Method for calculating gene variation: "sd" (standard deviation) or "cv" (coefficient of variation). Default: "sd".

min.size

Minimum number of genes in a gene module. Default: 10.

max.size

Maximum number of genes in a gene module. Default: 2000.

AddModuleSize

Logical. Whether to account for gene module size when calculating the DNB score. Default: FALSE.

power.vec

A vector of soft-thresholding powers for which the scale-free topology fit indices are calculated.

network.type

Network type. Allowed values are (unique abbreviations of) "unsigned" or "signed". Default: "signed".

RsquaredCut

Desired minimum scale-free topology fitting index R^2. Default: 0.85.
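The default power.vec and the effect of network.type can be inspected in base R. The adjacency formulas below are the usual weighted co-expression conventions (|cor|^power for unsigned, ((1 + cor) / 2)^power for signed); that tDNB applies exactly these formulas internally is an assumption.

```r
# The default grid of candidate soft-thresholding powers.
power.vec <- c(seq_len(10), seq(from = 12, to = 20, by = 2))
power.vec   # 1 2 3 4 5 6 7 8 9 10 12 14 16 18 20

# Effect of the network type on three example correlations.
cors  <- c(-0.8, 0, 0.8)
power <- 6
adj_unsigned <- abs(cors)^power          # treats -0.8 and 0.8 alike
adj_signed   <- ((1 + cors) / 2)^power   # strongly penalizes -0.8
```

In a signed network, strongly negative correlations receive adjacencies close to zero, so only positively co-expressed genes end up tightly connected.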

Value

A list containing:

DNB.score

A data.frame with composite index (CI) values for each state, including V_in (mean variation), R_in (mean correlation within DNB), and R_out (mean correlation with non-DNB genes)

DNB.genes

Character vector of genes in the identified DNB module

CI_all

List of data.frames with CI scores for all candidate modules in each state

Gene_module

List of gene modules detected in each state

Candidate

Data.frame summarizing the best candidate module per state

RawCor

List of correlation matrices for each state

Tom

List of topological overlap matrices for each state

V

List of gene variation values for each state

SFNet

List of scale-free network fitting results

Author(s)

Zaoqu Liu; E-mail: [email protected]

Examples

# Create example data
set.seed(42)
n_genes <- 100
n_samples <- 15

expr <- matrix(
  rnorm(n_genes * n_samples, mean = 10, sd = 2),
  nrow = n_genes, ncol = n_samples
)
rownames(expr) <- paste0("Gene", seq_len(n_genes))
colnames(expr) <- paste0("Sample", seq_len(n_samples))

state <- data.frame(
  sample_id = colnames(expr),
  state = rep(c("A", "B", "C"), each = 5)
)


result <- tDNB(
  expr = expr,
  state = state,
  state.levels = c("A", "B", "C"),
  min.size = 5,
  max.size = 50
)
result$DNB.score

Time-series landscape entropy analysis

Description

Performs time-series landscape entropy (TSLE) analysis.

Usage

TSLE(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  ppi = ppi_h,
  min.combined.score = 900,
  min.first.neighbor.size = 3,
  percent = TRUE,
  top.n = 30,
  top.p = 0.05
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: the first holds the sample names and the second holds the group or time-point label of each sample.

state.levels

A vector giving the order of the states.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

ppi

Protein-protein interaction network used as the background network. Default: ppi_h.

min.combined.score

Minimum combined score for a protein-protein interaction to be kept. Default: 900.

min.first.neighbor.size

Minimum number of first-order neighbor genes required for a center gene. Default: 3.

percent

Logical. Whether to select DNB genes by proportion (top.p) rather than by absolute number (top.n). Default: TRUE.

top.n

Used only when percent = FALSE. The top.n center genes with the highest DNB scores are defined as DNB genes. Default: 30.

top.p

Used only when percent = TRUE. The top proportion top.p of center genes, ranked by DNB score, are defined as DNB genes. Default: 0.05.
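For time-series input, the order given in state.levels determines how states are sequenced, regardless of the row order of the state table. The sketch below arranges a toy state table accordingly; the labels "Day0", "Day3" and "Day7" are placeholders.

```r
state <- data.frame(
  sample_id = paste0("S", 1:6),
  group = c("Day7", "Day0", "Day3", "Day0", "Day7", "Day3"),
  stringsAsFactors = FALSE
)
state.levels <- c("Day0", "Day3", "Day7")

# Order rows by the intended state sequence rather than alphabetically.
ordered_state <- state[order(factor(state$group, levels = state.levels)), ]

# Sample names grouped per state, in state.levels order.
samples_by_state <- split(ordered_state$sample_id,
                          factor(ordered_state$group, levels = state.levels))
```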

Author(s)

Zaoqu Liu; E-mail: [email protected]


Time-series network module biomarker analysis

Description

Performs time-series network module biomarker (TSNMB) analysis.

Usage

TSNMB(
  expr,
  state,
  state.levels,
  cor.method = "pearson",
  p.adjust.method = "BH",
  ppi = ppi_h,
  min.combined.score = 900,
  min.first.neighbor.size = 3,
  percent = TRUE,
  top.n = 30,
  top.p = 0.05
)

Arguments

expr

An expression data frame with genes in rows and samples in columns.

state

A data frame with two columns: the first holds the sample names and the second holds the group or time-point label of each sample.

state.levels

A vector giving the order of the states.

cor.method

Method used for correlation analysis. Default: "pearson".

p.adjust.method

P-value correction method, a character string; can be abbreviated. One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", or "none". Default: "BH".

ppi

Protein-protein interaction network used as the background network. Default: ppi_h.

min.combined.score

Minimum combined score for a protein-protein interaction to be kept. Default: 900.

min.first.neighbor.size

Minimum number of first-order neighbor genes required for a center gene. Default: 3.

percent

Logical. Whether to select DNB genes by proportion (top.p) rather than by absolute number (top.n). Default: TRUE.

top.n

Used only when percent = FALSE. The top.n center genes with the highest DNB scores are defined as DNB genes. Default: 30.

top.p

Used only when percent = TRUE. The top proportion top.p of center genes, ranked by DNB score, are defined as DNB genes. Default: 0.05.
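PPI-based methods can only use genes that appear both in the expression data and in the background network, so it may help to check that overlap before running the analysis. The sketch below uses a toy network in the G1, G2, combined_score layout; all gene names and scores are made up for illustration.

```r
# Genes measured in the expression data (toy example).
expr_genes <- c("GeneA", "GeneB", "GeneC", "GeneX")

# Toy background network in the documented PPI layout.
ppi_toy <- data.frame(
  G1 = c("GeneA", "GeneB", "GeneC"),
  G2 = c("GeneB", "GeneC", "GeneD"),
  combined_score = c(950, 930, 910),
  stringsAsFactors = FALSE
)

# Fraction of measured genes that the network can cover.
ppi_genes <- unique(c(ppi_toy$G1, ppi_toy$G2))
covered   <- intersect(expr_genes, ppi_genes)
coverage  <- length(covered) / length(expr_genes)   # 0.75
```

A low coverage value often points to mismatched gene identifiers, for example Ensembl IDs in the expression data against gene symbols in the network.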

Author(s)

Zaoqu Liu; E-mail: [email protected]