scClustEval (Single Cell Clustering Evaluation) is an R package for evaluating and optimizing single-cell RNA-seq clustering results using self-projection machine learning approaches.
The package implements an iterative optimization strategy that:
# Create example data
set.seed(42)
n_cells <- 500
n_features <- 100
n_clusters <- 5
# Generate expression matrix with cluster structure
X <- matrix(0, nrow = n_cells, ncol = n_features)
labels <- character(n_cells)
for (i in 1:n_clusters) {
  idx <- ((i - 1) * 100 + 1):(i * 100)
  X[idx, ] <- matrix(rnorm(100 * n_features, mean = i), nrow = 100)
  labels[idx] <- paste0("Cluster_", i)
}
# Run assessment
result <- sc_assessment(
  X = X,
  labels = labels,
  classifier = "LR",
  n_per_class = 50,
  cv = 5
)
# Print result
print(result)

The optimization process works as follows:
# Start with over-clustering
seurat_obj <- FindClusters(seurat_obj, resolution = 2.0)
# Run optimization
seurat_obj <- RunOptimization(
  seurat_obj,
  cluster_col = "seurat_clusters",
  min_accuracy = 0.9,
  result_col = "optimized_clusters"
)
# Compare before and after
DimPlot(seurat_obj, group.by = c("seurat_clusters", "optimized_clusters"))

The package supports multiple classifiers:
| Classifier | Code | Description |
|---|---|---|
| Logistic Regression | "LR" | L1/L2 regularized (default) |
| Random Forest | "RF" | Using randomForest package |
| Ranger | "RANGER" | Fast random forest |
| SVM | "SVM" | Support Vector Machine |
| Naive Bayes | "NB" | Gaussian Naive Bayes |
| Decision Tree | "DT" | Using rpart |
| XGBoost | "XGB" | Gradient boosting |
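
Whichever classifier is selected, the evaluation rests on the same self-projection idea: train on a subsample of cells from each cluster, then predict the labels of the held-out cells. The following is a minimal base-R sketch of that idea, not the package's implementation. It uses a nearest-centroid classifier as a stand-in (it is not one of the classifiers in the table), and `self_projection()` and `n_train` are hypothetical names introduced only for illustration.

```r
# Self-projection sketch in base R with a nearest-centroid stand-in classifier.
set.seed(1)

# Toy data: 3 clusters of 60 cells each, 20 features, cluster means 0, 2, 4.
n_per_cluster <- 60
n_features <- 20
labels <- rep(paste0("Cluster_", 1:3), each = n_per_cluster)
X <- matrix(rnorm(length(labels) * n_features), ncol = n_features) +
  2 * (as.integer(factor(labels)) - 1)

# Hypothetical helper: a single train/test split with n_train cells per cluster.
self_projection <- function(X, labels, n_train = 20) {
  # Sample n_train training cells from every cluster.
  train_idx <- unlist(lapply(split(seq_along(labels), labels),
                             function(idx) sample(idx, n_train)))
  test_idx <- setdiff(seq_along(labels), train_idx)

  # "Train": one centroid (mean profile) per cluster.
  centroids <- rowsum(X[train_idx, , drop = FALSE], labels[train_idx]) / n_train

  # "Project": assign each held-out cell to its nearest centroid.
  pred <- apply(X[test_idx, , drop = FALSE], 1, function(cell) {
    rownames(centroids)[which.min(colSums((t(centroids) - cell)^2))]
  })

  truth <- labels[test_idx]
  list(
    accuracy = mean(pred == truth),
    confusion = table(truth = truth, predicted = pred)
  )
}

res <- self_projection(X, labels)
res$accuracy
res$confusion
```

On well-separated toy clusters like these, the held-out accuracy should be close to 1. In SCCAF-style workflows, off-diagonal counts in the confusion table point to clusters the classifier cannot tell apart, which makes them candidates for merging.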
You can constrain the optimization process using an under-clustering as a boundary:
# Create low and high resolution clusterings
# cluster.name (Seurat v5) sets the metadata column that stores the result
seurat_obj <- FindClusters(seurat_obj, resolution = 0.2, cluster.name = "low_res")
seurat_obj <- FindClusters(seurat_obj, resolution = 2.0, cluster.name = "high_res")
# Optimize with constraint
seurat_obj <- RunOptimization(
  seurat_obj,
  cluster_col = "high_res",
  under_cluster_col = "low_res", # Constraint
  min_accuracy = 0.95
)

sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] scClustEval_1.0.0 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] shape_1.4.6.1 gtable_0.3.6 xfun_0.57
#> [4] bslib_0.11.0 ggplot2_4.0.3 recipes_1.3.2
#> [7] lattice_0.22-9 vctrs_0.7.3 tools_4.6.0
#> [10] generics_0.1.4 stats4_4.6.0 parallel_4.6.0
#> [13] tibble_3.3.1 pkgconfig_2.0.3 ModelMetrics_1.2.2.2
#> [16] Matrix_1.7-5 data.table_1.18.4 RColorBrewer_1.1-3
#> [19] S7_0.2.2 lifecycle_1.0.5 compiler_4.6.0
#> [22] farver_2.1.2 stringr_1.6.0 codetools_0.2-20
#> [25] htmltools_0.5.9 sys_3.4.3 buildtools_1.0.0
#> [28] class_7.3-23 sass_0.4.10 glmnet_5.0
#> [31] yaml_2.3.12 prodlim_2026.03.11 pillar_1.11.1
#> [34] jquerylib_0.1.4 MASS_7.3-65 cachem_1.1.0
#> [37] gower_1.0.2 iterators_1.0.14 rpart_4.1.27
#> [40] foreach_1.5.2 nlme_3.1-169 parallelly_1.47.0
#> [43] lava_1.9.1 tidyselect_1.2.1 digest_0.6.39
#> [46] stringi_1.8.7 future_1.70.0 dplyr_1.2.1
#> [49] reshape2_1.4.5 purrr_1.2.2 listenv_0.10.1
#> [52] maketools_1.3.2 splines_4.6.0 fastmap_1.2.0
#> [55] grid_4.6.0 cli_3.6.6 magrittr_2.0.5
#> [58] survival_3.8-6 future.apply_1.20.2 withr_3.0.2
#> [61] scales_1.4.0 lubridate_1.9.5 timechange_0.4.0
#> [64] globals_0.19.1 igraph_2.3.1 otel_0.2.0
#> [67] nnet_7.3-20 timeDate_4052.112 evaluate_1.0.5
#> [70] knitr_1.51 hardhat_1.4.3 caret_7.0-1
#> [73] rlang_1.2.0 Rcpp_1.1.1-1.1 glue_1.8.1
#> [76] pROC_1.19.0.1 ipred_0.9-15 jsonlite_2.0.0
#> [79] R6_2.6.1 plyr_1.8.9

This package is an R implementation inspired by the SCCAF Python package: