This vignette provides recommendations for optimal use of MultiK and solutions to common issues.
Before running MultiK, ensure your data has been properly quality-controlled.
Recommended number of repetitions (`reps`) by dataset size:

| Dataset Size | Recommended reps |
|---|---|
| < 5,000 cells | 100-150 |
| 5,000-20,000 cells | 100 |
| > 20,000 cells | 50-100 |
Rule of thumb: more repetitions give more stable results, at the cost of a longer runtime.
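The table above can be encoded as a small lookup so the choice of `reps` follows the dataset size automatically. This is only a sketch: `recommend_reps` is a hypothetical helper written for this vignette, not a MultiK function, and the cut-offs are simply the ones listed in the table.

```r
# Hypothetical helper (not part of MultiK): map a cell count to the
# repetition range suggested in the table above.
recommend_reps <- function(n_cells) {
  stopifnot(is.numeric(n_cells), length(n_cells) == 1, n_cells > 0)
  if (n_cells < 5000) {
    c(min = 100, max = 150)
  } else if (n_cells <= 20000) {
    c(min = 100, max = 100)
  } else {
    c(min = 50, max = 100)
  }
}

recommend_reps(3000)   # small dataset: upper end of the range is affordable
recommend_reps(50000)  # large dataset: fewer repetitions keep runtime manageable
```

Assuming `seu` stores cells in columns, a call such as `MultiK(seu, reps = recommend_reps(ncol(seu))[["min"]])` would then pick the lower end of the suggested range.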
Subsampling proportion (`pSample`). For large datasets:

| Cells | Reps | Cores | Approximate Time |
|---|---|---|---|
| 2,000 | 100 | 4 | 5-10 min |
| 10,000 | 100 | 8 | 20-40 min |
| 50,000 | 50 | 16 | 1-2 hours |
Ideal scenario:

- Single peak in the K frequency distribution
- Low rPAC at that K
- Pareto-optimal point stands out
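To make "Pareto-optimal" concrete, the sketch below marks which K values are non-dominated when frequency should be high and rPAC should be low. It uses base R only; the numbers in `k_summary` are made up for illustration and are not MultiK output, and `is_pareto` is a hypothetical helper rather than a function from the package.

```r
# Toy values, made up for illustration only; not MultiK output.
k_summary <- data.frame(
  K    = c(2, 3, 4, 5, 6),
  freq = c(10, 40, 15, 30, 5),           # how often each K was observed
  rpac = c(0.30, 0.10, 0.25, 0.05, 0.40) # lower rPAC = more stable
)

# A K is Pareto-optimal if no other K is at least as frequent AND at
# least as stable, with a strict improvement in one of the two.
is_pareto <- function(freq, rpac) {
  vapply(seq_along(freq), function(i) {
    dominates_i <- freq >= freq[i] & rpac <= rpac[i] &
      (freq > freq[i] | rpac < rpac[i])
    !any(dominates_i)
  }, logical(1))
}

k_summary$pareto <- is_pareto(k_summary$freq, k_summary$rpac)
k_summary[k_summary$pareto, ]
```

With these toy values, K = 3 (most frequent) and K = 5 (most stable) are both non-dominated, which is the ambiguous case where biological context has to break the tie.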
When multiple K values appear Pareto-optimal:
```r
# Consider biological context
# Lower K: Major cell types
# Higher K: Subtypes/states

# Examine both
clusters_low <- getClusters(seu, optK = 3)
clusters_high <- getClusters(seu, optK = 5)

# Use SigClust to help decide
pval_low <- CalcSigClust(seu, clusters_low$clusters[, 1])
pval_high <- CalcSigClust(seu, clusters_high$clusters[, 1])
```

Cause: All clustering runs produced the same K.
Solutions:
Cause: Data may have continuous structure rather than discrete clusters.
Solutions:
Cause: Too few cells in a cluster.
Solutions:
Solutions for long runtimes:

```r
# Reduce resolution granularity
result <- MultiK(seu, resolution = seq(0.1, 2, 0.1))

# Use more cores
result <- MultiK(seu, cores = parallel::detectCores())

# Reduce reps (minimum ~50 for stability)
result <- MultiK(seu, reps = 50)

# Subsample large datasets first (set a seed so the subsample is reproducible)
set.seed(42)
seu_sub <- seu[, sample(ncol(seu), 10000)]
result <- MultiK(seu_sub, reps = 100)
```

When publishing results using MultiK, report:
```r
packageVersion("MultiK")
```

“Optimal cluster number was determined using the MultiK algorithm (v1.0.0, Liu 2025). We performed 100 subsampling iterations with 80% cell sampling, testing resolution parameters from 0.05 to 2.0 in increments of 0.05. The optimal K was selected based on the Pareto frontier of frequency and stability (rPAC). Cluster significance was validated using pairwise SigClust tests with 100 simulations.”
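One way to keep such a methods statement in sync with the analysis is to build it from the parameter values that were actually used. The sketch below is a hypothetical helper, not part of MultiK; its argument names (`version`, `reps`, `p_sample`, `res_min`, `res_max`, `res_step`) are assumptions chosen for illustration.

```r
# Hypothetical helper (not part of MultiK): assemble the opening of a
# methods statement from the parameters of the run being reported.
methods_text <- function(version, reps, p_sample, res_min, res_max, res_step) {
  sprintf(
    paste(
      "Optimal cluster number was determined using the MultiK algorithm (v%s).",
      "We performed %d subsampling iterations with %d%% cell sampling,",
      "testing resolution parameters from %s to %s in increments of %s."
    ),
    version,
    as.integer(reps),
    as.integer(round(100 * p_sample)),  # proportion -> percentage
    format(res_min), format(res_max), format(res_step)
  )
}

cat(methods_text("1.0.0", 100, 0.8, 0.05, 2, 0.05), "\n")
```

In a real analysis, `version` could be supplied as `as.character(packageVersion("MultiK"))` so the reported version always matches the installed package.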