scGate - Marker-Based Cell Type Purification for Single-Cell Sequencing Data

A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. 'scGate' automates marker-based purification of specific cell populations, without requiring training data or reference gene expression profiles. Briefly, 'scGate' takes as input: i) a gene expression matrix stored in a 'Seurat' object and ii) a “gating model” (GM), consisting of a set of marker genes that define the cell population of interest. The GM can be as simple as a single marker gene, or a combination of positive and negative markers. More complex GMs can be constructed hierarchically, akin to the gating strategies employed in flow cytometry. 'scGate' evaluates the strength of signature marker expression in each cell using the rank-based method 'UCell', and then performs k-nearest neighbor (kNN) smoothing by calculating the mean 'UCell' score across neighboring cells. kNN smoothing aims to compensate for the high sparsity of scRNA-seq data. Finally, a universal threshold on the kNN-smoothed signature scores is applied in binary decision trees generated from the user-provided gating model, to annotate cells as either “pure” or “impure” with respect to the cell population of interest. See the related publication Andreatta et al. (2022) <doi:10.1093/bioinformatics/btac141>.
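The smoothing and thresholding steps described above can be illustrated with a minimal Python sketch. It is not the 'scGate' implementation (which is an R package built on 'Seurat' and 'UCell'): the function names `knn_smooth` and `gate`, the brute-force Euclidean neighbor search, the choice to count each cell among its own neighbors, and the threshold values are all assumptions made for illustration. The per-cell signature scores are taken as given.

```python
import numpy as np

def knn_smooth(scores, embedding, k=5):
    """Replace each cell's score by the mean score of its k nearest cells.

    Assumption: neighbors are found by brute-force Euclidean distance in
    `embedding` (cells x dimensions), and each cell counts as its own
    nearest neighbor (distance 0).
    """
    scores = np.asarray(scores, dtype=float)
    embedding = np.asarray(embedding, dtype=float)
    # Pairwise squared Euclidean distances between all cells.
    diff = embedding[:, None, :] - embedding[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Column indices of the k closest cells for every row.
    nn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return scores[nn].mean(axis=1)

def gate(pos_scores, neg_scores, embedding, k=5, pos_thr=0.2, neg_thr=0.2):
    """Single-level gate: high smoothed positive score, low smoothed negative score.

    The threshold values are illustrative placeholders.
    """
    pos = knn_smooth(pos_scores, embedding, k)
    neg = knn_smooth(neg_scores, embedding, k)
    return np.where((pos >= pos_thr) & (neg < neg_thr), "Pure", "Impure")

# Toy data: two well-separated groups of three cells each.
embedding = np.array([[0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
                      [10.0, 10.0], [10.0, 10.1], [10.1, 10.0]])
# Cell 1 has a raw positive score of 0 (for example, marker dropout).
pos_scores = np.array([0.5, 0.0, 0.4, 0.0, 0.0, 0.0])
neg_scores = np.zeros(6)

labels = gate(pos_scores, neg_scores, embedding, k=3)
print(labels)
```

In the toy data, the second cell has a raw score of zero but sits among high-scoring neighbors, so smoothing lifts it above the threshold. This is the sparsity compensation the description refers to.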

5.10 score · 1 dependent · 209 scripts · 503 downloads

BioTransition - Dynamic Network Biomarker Analysis for Critical Transitions

A comprehensive toolkit for detecting critical transitions and identifying dynamic network biomarkers (DNB) in biological systems. Critical transitions, characterized by sudden shifts between distinct states, are prevalent in complex biological processes including disease progression, cellular differentiation, and developmental transitions. This package implements seven complementary DNB methodologies: (1) conventional DNB (cDNB) based on the original DNB theory (Chen et al. 2012 <doi:10.1038/srep00342>); (2) topological DNB (tDNB), a novel approach utilizing network topology and scale-free properties; (3) landscape DNB (LDNB) for quantifying state transitions (Liu et al. 2019 <doi:10.1093/nsr/nwy162>); (4) local DNB (LcDNB) leveraging protein-protein interaction networks; (5) module-based DNB (MDNB) for modular analysis (Li et al. 2022 <doi:10.1016/j.xinn.2022.100364>); (6) time-series network module biomarker (TSNMB) for temporal dynamics (Zhong et al. 2022 <doi:10.1093/jmcb/mjac052>); and (7) time-series leading edge (TSLE) analysis (Liu et al. 2020 <doi:10.1093/bioinformatics/btz758>). Core computational routines are implemented in C++ via 'Rcpp' for optimal performance. Compatible with bulk RNA-seq, single-cell RNA-seq, and spatial transcriptomics data. Includes curated protein-protein interaction networks for human and mouse from the STRING database.
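As a rough illustration of the conventional DNB idea, the composite index usually associated with the original DNB theory multiplies the average standard deviation of a candidate gene module by the average absolute correlation within the module, then divides by the average absolute correlation between module genes and all other genes. The Python sketch below computes that quantity for one sample group. The function name `dnb_composite_index` and the input layout are hypothetical, and the package description does not state whether 'BioTransition' computes its cDNB score in exactly this form.

```python
import numpy as np

def dnb_composite_index(expr, module_idx):
    """Composite index SD_in * PCC_in / PCC_out for one candidate module.

    expr       : genes x samples expression matrix for a single state or time point
    module_idx : row indices of the candidate module (at least two genes,
                 and at least one gene must remain outside the module)
    Assumption: no gene is constant across samples, otherwise its
    correlations are undefined.
    """
    expr = np.asarray(expr, dtype=float)
    in_mask = np.zeros(expr.shape[0], dtype=bool)
    in_mask[list(module_idx)] = True

    # Absolute Pearson correlation between all pairs of genes (rows).
    corr = np.abs(np.corrcoef(expr))

    # Average standard deviation of the module genes.
    sd_in = expr[in_mask].std(axis=1, ddof=1).mean()

    # Average |PCC| over distinct gene pairs inside the module.
    inner = corr[np.ix_(in_mask, in_mask)]
    pcc_in = inner[np.triu_indices(inner.shape[0], k=1)].mean()

    # Average |PCC| between module genes and all remaining genes.
    pcc_out = corr[np.ix_(in_mask, ~in_mask)].mean()

    return sd_in * pcc_in / pcc_out

# Synthetic example: genes 0-2 fluctuate strongly and together, genes 3-9 are independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=50)
module = 5.0 * signal + 0.1 * rng.normal(size=(3, 50))
background = rng.normal(size=(7, 50))
expr = np.vstack([module, background])

ci_module = dnb_composite_index(expr, [0, 1, 2])
ci_control = dnb_composite_index(expr, [3, 4, 5])
print(ci_module, ci_control)
```

A module whose genes fluctuate strongly and in concert while decoupling from the rest of the network yields a large index, which is the signature DNB methods look for near a critical transition.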

software, statisticalmethod, network, systemsbiology, geneexpression, transcriptomics, singlecell, spatial, biomedicalinformatics, differentialexpression, cpp

3.00 score