| Title: | Markov Affinity-Based Graph Imputation of Cells |
|---|---|
| Description: | Native R implementation of MAGIC (Markov Affinity-based Graph Imputation of Cells) for denoising and imputation of single-cell RNA sequencing data. MAGIC uses diffusion geometry to denoise single-cell data and fill in missing transcripts, as described in van Dijk et al. (2018) <doi:10.1016/j.cell.2018.05.061>. This package provides a pure R implementation with optional C++ acceleration, parallel computing support, and seamless integration with Seurat (v4 and v5) objects. |
| Authors: | Zaoqu Liu [aut, cre] |
| Maintainer: | Zaoqu Liu <[email protected]> |
| License: | GPL-2 |
| Version: | 1.0.0 |
| Built: | 2026-05-25 07:29:49 UTC |
| Source: | https://github.com/Zaoqu-Liu/MAGICR |
Subset MAGIC result
## S3 method for class 'magic' x[i, j, ...]## S3 method for class 'magic' x[i, j, ...]
x |
MAGIC object |
i |
Row indices |
j |
Column indices |
... |
Additional arguments |
Create an animation showing how gene-gene relationships evolve with increasing diffusion time t.
animate_magic( data, gene_x, gene_y, gene_color = NULL, t_max = 20, delay = 2, operator = NULL, filename = NULL, width = 600, height = 500, point_size = 1, interval = 200, verbose = FALSE, ... )animate_magic( data, gene_x, gene_y, gene_color = NULL, t_max = 20, delay = 2, operator = NULL, filename = NULL, width = 600, height = 500, point_size = 1, interval = 200, verbose = FALSE, ... )
data |
Input data matrix (cells x genes). |
gene_x |
Gene for x-axis (name or index). |
gene_y |
Gene for y-axis (name or index). |
gene_color |
Gene for coloring points (optional). |
t_max |
Maximum t value. Default is 20. |
delay |
Number of initial frames before diffusion. Default is 2. |
operator |
Pre-computed magic object. Default is NULL. |
filename |
Output filename (.gif). Default is NULL (no save). |
width |
Plot width in pixels. Default is 600. |
height |
Plot height in pixels. Default is 500. |
point_size |
Point size. Default is 1. |
interval |
Milliseconds between frames. Default is 200. |
verbose |
Verbosity. Default is FALSE. |
... |
Additional arguments passed to magic(). |
This function creates an animation showing how MAGIC diffusion
progressively reveals gene-gene relationships. It requires the
gganimate and gifski packages for GIF output.
A list of ggplot objects (one per frame), or writes a GIF file.
## Not run: data(magic_testdata) # Create animation frames frames <- animate_magic(magic_testdata, gene_x = 1, gene_y = 2, t_max = 10) # Save as GIF (requires gganimate and gifski) animate_magic(magic_testdata, gene_x = 1, gene_y = 2, t_max = 10, filename = "magic_animation.gif") ## End(Not run)## Not run: data(magic_testdata) # Create animation frames frames <- animate_magic(magic_testdata, gene_x = 1, gene_y = 2, t_max = 10) # Save as GIF (requires gganimate and gifski) animate_magic(magic_testdata, gene_x = 1, gene_y = 2, t_max = 10, filename = "magic_animation.gif") ## End(Not run)
Computes the alpha-decaying kernel from kNN distances.
compute_alpha_kernel(knn_graph, decay = 1, threshold = 1e-04, verbose = FALSE)compute_alpha_kernel(knn_graph, decay = 1, threshold = 1e-04, verbose = FALSE)
knn_graph |
Result from |
decay |
Numeric. Alpha parameter. If NULL, uses unweighted kernel. Default is 1. |
threshold |
Numeric. Values below this are set to 0. Default is 1e-4. |
verbose |
Logical or integer. Verbosity level. |
The alpha-decaying kernel:
where is the distance to the k-th neighbor (adaptive bandwidth)
and controls the decay rate.
Reference: Moon, K.R., et al. (2019). Visualizing structure and transitions in high-dimensional biological data. Nature Biotechnology, 37(12), 1482-1492.
A sparse symmetric kernel matrix (dgCMatrix).
Functions for computing kNN-DREMI on single-cell data.
Convenience function to extract MAGIC-imputed data from a Seurat object.
GetMAGICData( object, assay = "MAGIC_RNA", genes = NULL, cells = NULL, as_matrix = TRUE )GetMAGICData( object, assay = "MAGIC_RNA", genes = NULL, cells = NULL, as_matrix = TRUE )
object |
Seurat object that has been processed with magic() |
assay |
Name of the MAGIC assay. Default is "MAGIC_RNA". |
genes |
Genes to extract. Default is NULL (all genes). |
cells |
Cells to extract. Default is NULL (all cells). |
as_matrix |
Logical. Return as matrix instead of data.frame? Default TRUE. |
Matrix or data.frame of MAGIC-imputed expression values (cells x genes).
## Not run: # After running magic() on Seurat object imputed_data <- GetMAGICData(seurat_obj) imputed_subset <- GetMAGICData(seurat_obj, genes = c("Gene1", "Gene2")) ## End(Not run)## Not run: # After running magic() on Seurat object imputed_data <- GetMAGICData(seurat_obj) imputed_subset <- GetMAGICData(seurat_obj, genes = c("Gene1", "Gene2")) ## End(Not run)
Returns TRUE if C++ functions compiled successfully.
has_cpp_acceleration()has_cpp_acceleration()
Logical
has_cpp_acceleration()has_cpp_acceleration()
Functions for kNN graph and diffusion kernel construction.
Calculates k-Nearest Neighbor conditional Density Resampled Estimate of Mutual Information as defined in van Dijk et al, 2018.
knnDREMI( x, y, k = 10, n_bins = 20, n_mesh = 3, n_jobs = 1, plot = FALSE, return_drevi = FALSE, ... )knnDREMI( x, y, k = 10, n_bins = 20, n_mesh = 3, n_jobs = 1, plot = FALSE, return_drevi = FALSE, ... )
x |
Numeric vector. Independent variable (gene X). |
y |
Numeric vector. Dependent variable (gene Y). |
k |
Integer. Number of nearest neighbors. Default is 10. |
n_bins |
Integer. Number of bins for density resampling. Default is 20. |
n_mesh |
Integer. Mesh density within bins. Default is 3. |
n_jobs |
Integer. Number of parallel workers (not used in R). Default is 1. |
plot |
Logical. Create diagnostic plots. Default is FALSE. |
return_drevi |
Logical. Also return DREVI density matrix. Default is FALSE. |
... |
Additional arguments passed to plot function. |
kNN-DREMI is an adaptation of DREMI (Krishnaswamy et al. 2014) for single cell RNA-sequencing data. DREMI captures the functional relationship between two genes across their entire dynamic range. The key change to kNN-DREMI is the replacement of the heat diffusion-based kernel-density estimator by a k-nearest neighbor-based density estimator.
Note: kNN-DREMI, like Mutual Information and DREMI, is not symmetric. Here we are estimating I(Y|X).
This implementation follows exactly the scprep Python implementation.
If return_drevi is FALSE, returns the DREMI value (numeric). If return_drevi is TRUE, returns a list with dremi and drevi (density matrix).
van Dijk, D., et al. (2018). Recovering gene interactions from single-cell data using data diffusion. Cell, 174(3), 716-729. doi:10.1016/j.cell.2018.05.061
Krishnaswamy, S., et al. (2014). Conditional density-based analysis of T cell signaling in single-cell data. Science, 346(6213), 1250689.
## Not run: # After running MAGIC result <- magic(data, t = 3) imputed <- as.matrix(result) # Compute kNN-DREMI between two genes dremi <- knnDREMI(imputed[, "GeneA"], imputed[, "GeneB"]) # With diagnostic plot dremi <- knnDREMI(imputed[, "GeneA"], imputed[, "GeneB"], plot = TRUE) ## End(Not run)## Not run: # After running MAGIC result <- magic(data, t = 3) imputed <- as.matrix(result) # Compute kNN-DREMI between two genes dremi <- knnDREMI(imputed[, "GeneA"], imputed[, "GeneB"]) # With diagnostic plot dremi <- knnDREMI(imputed[, "GeneA"], imputed[, "GeneB"], plot = TRUE) ## End(Not run)
Normalizes data by library size (total counts per cell).
library_size_normalize(data, verbose = FALSE)library_size_normalize(data, verbose = FALSE)
data |
Matrix (cells x genes). |
verbose |
Logical. Print progress. Default is FALSE. |
Formula:
Normalized matrix.
Applies log transformation with pseudo-count.
log_transform(data, pseudo_count = 1, base = exp(1), verbose = FALSE)log_transform(data, pseudo_count = 1, base = exp(1), verbose = FALSE)
data |
Matrix. |
pseudo_count |
Numeric. Default is 1. |
base |
Numeric. Log base. Default is exp(1). |
verbose |
Logical. Default is FALSE. |
Formula:
Transformed matrix.
Main function for MAGIC imputation of single-cell data.
Performs MAGIC (Markov Affinity-based Graph Imputation of Cells) for denoising and imputation of single-cell data.
Run MAGIC on a Seurat object and store results in a new assay.
Run MAGIC on a SingleCellExperiment object.
magic(data, ...) ## Default S3 method: magic( data, genes = NULL, knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... ) ## S3 method for class 'matrix' magic( data, genes = NULL, knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... ) ## S3 method for class 'dgCMatrix' magic(data, ...) ## S3 method for class 'data.frame' magic(data, ...) ## S3 method for class 'Seurat' magic( data, assay = NULL, slot = "data", genes = "all_genes", new_assay_name = NULL, knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... ) ## S3 method for class 'SingleCellExperiment' magic( data, assay_name = "logcounts", genes = "all_genes", new_assay_name = "MAGIC", knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... )magic(data, ...) ## Default S3 method: magic( data, genes = NULL, knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... ) ## S3 method for class 'matrix' magic( data, genes = NULL, knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... ) ## S3 method for class 'dgCMatrix' magic(data, ...) ## S3 method for class 'data.frame' magic(data, ...) ## S3 method for class 'Seurat' magic( data, assay = NULL, slot = "data", genes = "all_genes", new_assay_name = NULL, knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... ) ## S3 method for class 'SingleCellExperiment' magic( data, assay_name = "logcounts", genes = "all_genes", new_assay_name = "MAGIC", knn = 5, knn_max = NULL, decay = 1, t = 3, npca = 100, solver = c("exact", "approximate"), init = NULL, t_max = 20, knn_dist = c("euclidean", "cosine", "manhattan", "correlation"), n_jobs = 1, seed = NULL, verbose = TRUE, ... )
data |
A SingleCellExperiment object. |
... |
Additional arguments. |
genes |
Gene selection. Default is "all_genes". |
knn |
Number of neighbors. Default is 5. |
knn_max |
Maximum neighbors. Default is NULL. |
decay |
Alpha decay. Default is 1. |
t |
Diffusion steps. Default is 3. |
npca |
PCA components. Default is 100. |
solver |
Solver type. Default is "exact". |
init |
Previous result. Default is NULL. |
t_max |
Maximum t. Default is 20. |
knn_dist |
Distance metric. Default is "euclidean". |
n_jobs |
Parallel workers. Default is 1. |
seed |
Random seed. Default is NULL. |
verbose |
Verbosity. Default is TRUE. |
assay |
Character. Name of the assay to use as input. Default is the default assay of the object. |
slot |
Character. Slot to use for input data. Default is "data" (normalized data). For Seurat v5, this corresponds to the layer name. |
new_assay_name |
Name for the new assay. Default is "MAGIC". |
assay_name |
Character. Name of the assay to use. Default is "logcounts". |
MAGIC denoises single-cell data using diffusion on a cell similarity graph:
Optional PCA dimension reduction
Build k-nearest neighbor graph
Construct alpha-decaying kernel
Create row-stochastic diffusion operator
Apply diffusion operator t times
The imputation formula is:
This function:
Extracts normalized data from the specified assay
Transposes to cells x genes format
Runs MAGIC imputation
Creates a new assay with the imputed data
Stores MAGIC parameters in the object's misc slot
The function is compatible with both Seurat v4 and v5. For v4, it uses GetAssayData/SetAssayData. For v5, it uses LayerData where available.
A "magic" object (for matrices) or modified input object (for Seurat/SCE).
The Seurat object with a new assay containing MAGIC-imputed values. The new assay is named "MAGIC_<original_assay>" by default.
SingleCellExperiment object with MAGIC results in a new assay.
van Dijk, D., et al. (2018). Recovering gene interactions from single-cell data using data diffusion. Cell, 174(3), 716-729. doi:10.1016/j.cell.2018.05.061
## Not run: # Basic usage set.seed(42) data <- matrix(rpois(5000, 2), nrow = 500, ncol = 10) result <- magic(data, t = 3) imputed <- as.matrix(result) # Automatic t selection result <- magic(data, t = "auto") # Specific genes result <- magic(data, genes = c(1, 2, 3)) ## End(Not run) ## Not run: library(Seurat) # Create example Seurat object counts <- matrix(rpois(3000, lambda = 5), nrow = 30, ncol = 100) rownames(counts) <- paste0("Gene", 1:30) colnames(counts) <- paste0("Cell", 1:100) seurat_obj <- CreateSeuratObject(counts = counts) seurat_obj <- NormalizeData(seurat_obj) # Run MAGIC seurat_obj <- magic(seurat_obj, t = 3) # Access MAGIC results GetAssayData(seurat_obj, assay = "MAGIC_RNA", slot = "data") # Make MAGIC assay the default DefaultAssay(seurat_obj) <- "MAGIC_RNA" ## End(Not run)## Not run: # Basic usage set.seed(42) data <- matrix(rpois(5000, 2), nrow = 500, ncol = 10) result <- magic(data, t = 3) imputed <- as.matrix(result) # Automatic t selection result <- magic(data, t = "auto") # Specific genes result <- magic(data, genes = c(1, 2, 3)) ## End(Not run) ## Not run: library(Seurat) # Create example Seurat object counts <- matrix(rpois(3000, lambda = 5), nrow = 30, ncol = 100) rownames(counts) <- paste0("Gene", 1:30) colnames(counts) <- paste0("Cell", 1:100) seurat_obj <- CreateSeuratObject(counts = counts) seurat_obj <- NormalizeData(seurat_obj) # Run MAGIC seurat_obj <- magic(seurat_obj, t = 3) # Access MAGIC results GetAssayData(seurat_obj, assay = "MAGIC_RNA", slot = "data") # Make MAGIC assay the default DefaultAssay(seurat_obj) <- "MAGIC_RNA" ## End(Not run)
Creates the row-stochastic diffusion operator (Markov matrix).
magic_diffusion_operator(kernel, verbose = FALSE)magic_diffusion_operator(kernel, verbose = FALSE)
kernel |
Symmetric kernel matrix from |
verbose |
Logical or integer. Verbosity level. |
The diffusion operator is:
where D is the diagonal matrix of row sums.
A row-stochastic sparse matrix (each row sums to 1).
Applies the diffusion operator to impute and denoise data.
magic_impute( data, diff_op, t = 3, t_max = 20, threshold = 0.001, solver = c("exact", "approximate"), verbose = FALSE )magic_impute( data, diff_op, t = 3, t_max = 20, threshold = 0.001, solver = c("exact", "approximate"), verbose = FALSE )
data |
Matrix (cells x genes) to impute. |
diff_op |
Diffusion operator (Markov matrix). |
t |
Integer or "auto". Number of diffusion steps. |
t_max |
Integer. Maximum t for automatic selection. Default is 20. |
threshold |
Numeric. Convergence threshold for automatic t. Default is 0.001. |
solver |
Character. "exact" or "approximate". Default is "exact". |
verbose |
Logical or integer. Verbosity level. |
MAGIC imputation applies the diffusion operator repeatedly:
When t = "auto", the algorithm stops when the Procrustes disparity between consecutive iterations falls below the threshold.
A matrix of imputed values.
Constructs a kNN graph from input data.
magic_knn_graph( data, knn = 5, knn_max = NULL, distance = c("euclidean", "cosine", "manhattan", "correlation"), verbose = FALSE )magic_knn_graph( data, knn = 5, knn_max = NULL, distance = c("euclidean", "cosine", "manhattan", "correlation"), verbose = FALSE )
data |
Matrix (cells x features). |
knn |
Integer. Number of neighbors for bandwidth. Default is 5. |
knn_max |
Integer or NULL. Maximum neighbors. Default is 3 * knn. |
distance |
Character. Distance metric: "euclidean", "cosine", "manhattan", or "correlation". Default is "euclidean". |
verbose |
Logical or integer. Verbosity level. |
A list with idx (neighbor indices), dist (distances), and parameters.
Convenience function to compute kNN-DREMI directly from a MAGIC result object.
magic_knnDREMI(magic_result, gene_x, gene_y, ...)magic_knnDREMI(magic_result, gene_x, gene_y, ...)
magic_result |
A magic object from |
gene_x |
Name or index of gene X. |
gene_y |
Name or index of gene Y. |
... |
Additional arguments passed to |
DREMI value or list with DREMI and DREVI.
## Not run: result <- magic(data, t = 3) dremi <- magic_knnDREMI(result, "GeneA", "GeneB") ## End(Not run)## Not run: result <- magic(data, t = 3) dremi <- magic_knnDREMI(result, "GeneA", "GeneB") ## End(Not run)
Determines optimal diffusion steps using Procrustes analysis.
magic_optimal_t( data, diff_op, t_max = 20, threshold = 0.001, subsample_genes = 500, plot = FALSE, verbose = FALSE )magic_optimal_t( data, diff_op, t_max = 20, threshold = 0.001, subsample_genes = 500, plot = FALSE, verbose = FALSE )
data |
Matrix (cells x genes). |
diff_op |
Diffusion operator. |
t_max |
Integer. Maximum t to test. Default is 20. |
threshold |
Numeric. Convergence threshold. Default is 0.001. |
subsample_genes |
Integer or NULL. Genes to subsample. Default is 500. |
plot |
Logical. Create diagnostic plot. Default is FALSE. |
verbose |
Logical or integer. Verbosity level. |
A list with t_opt, error_vec, and threshold.
A small example dataset for demonstrating MAGICR functionality. Contains 500 cells and 100 genes with simulated count data.
magic_testdatamagic_testdata
A matrix with 500 rows (cells) and 100 columns (genes):
Cell identifiers (Cell_1 to Cell_500)
Gene identifiers (Gene_1 to Gene_100)
Integer count values (Poisson distributed with lambda=5)
Simulated data for package demonstration
## Not run: data(magic_testdata) dim(magic_testdata) # [1] 500 100 # Run MAGIC on example data result <- magic(magic_testdata, t = 3) ## End(Not run)## Not run: data(magic_testdata) dim(magic_testdata) # [1] 500 100 # Run MAGIC on example data result <- magic(magic_testdata, t = 3) ## End(Not run)
Returns current configuration settings.
magicr_config()magicr_config()
Named list of settings
magicr_config()magicr_config()
Create a scatter plot of two genes before and after MAGIC.
plot_magic_genes( data, magic_result, gene_x, gene_y, gene_color = NULL, point_size = 1 )plot_magic_genes( data, magic_result, gene_x, gene_y, gene_color = NULL, point_size = 1 )
data |
Original data matrix. |
magic_result |
MAGIC result object. |
gene_x |
Gene for x-axis. |
gene_y |
Gene for y-axis. |
gene_color |
Gene for coloring (optional). |
point_size |
Point size. Default is 1. |
A ggplot object or plots using base R.
## Not run: data(magic_testdata) result <- magic(magic_testdata, t = 3) plot_magic_genes(magic_testdata, result, 1, 2) ## End(Not run)## Not run: data(magic_testdata) result <- magic(magic_testdata, t = 3) plot_magic_genes(magic_testdata, result, 1, 2) ## End(Not run)
Configure package-wide options.
set_magicr_options(use_cpp = NULL, verbose_default = NULL)set_magicr_options(use_cpp = NULL, verbose_default = NULL)
use_cpp |
Logical. Whether to use C++ acceleration when available. |
verbose_default |
Integer. Default verbosity level (0, 1, or 2). |
Invisibly returns previous settings
## Not run: # Disable C++ acceleration set_magicr_options(use_cpp = FALSE) ## End(Not run)## Not run: # Disable C++ acceleration set_magicr_options(use_cpp = FALSE) ## End(Not run)
Applies square root transformation.
sqrt_normalize(data, verbose = FALSE)sqrt_normalize(data, verbose = FALSE)
data |
Matrix. |
verbose |
Logical. Default is FALSE. |
Transformed matrix.