--- title: "NNLS Solver Implementation Details" author: "Zaoqu Liu" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true toc_depth: 3 vignette: > %\VignetteIndexEntry{NNLS Solver Implementation Details} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5, fig.align = "center" ) ``` ## Overview CellProgramMapper implements two algorithms for solving the Non-Negative Least Squares (NNLS) problem: 1. **Coordinate Descent** (default, faster) 2. **Active Set Method** (Lawson-Hanson algorithm) Both solvers are implemented in C++ using RcppArmadillo for optimal performance. ## The NNLS Problem Given matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ and vector $\mathbf{b} \in \mathbb{R}^m$, NNLS solves: $$\min_{\mathbf{x} \geq 0} \|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2^2$$ ## Active Set Method (Lawson-Hanson) ### Algorithm Description The active set method partitions variables into two sets: - **P (Positive set)**: Variables that can be positive - **Z (Zero set)**: Variables constrained to zero ``` Algorithm: Lawson-Hanson NNLS Input: A (m x n matrix), b (m-vector) Output: x (n-vector) with x >= 0 1. Initialize: x = 0, P = empty, Z = {1,...,n} 2. Compute gradient: w = A'(b - Ax) 3. While Z is not empty and max(w[Z]) > 0: a. Move index of max(w[Z]) from Z to P b. Inner loop: - Solve unconstrained LS on P: z[P] = argmin ||A[:,P]z - b||^2 - While any z[P] <= 0: * Compute step alpha to boundary * Update x = x + alpha(z - x) * Move zero indices from P to Z c. Update gradient: w = A'(b - Ax) 4. Return x ``` ### Convergence Properties - **Finite termination**: Guaranteed to terminate in at most $\binom{n+m}{n}$ iterations - **Exact solution**: Returns the globally optimal solution - **Numerical stability**: Uses QR decomposition for least squares subproblems ### Implementation ```cpp // Simplified C++ implementation (actual code in src/nnls_solver.cpp) arma::vec nnls_single(const arma::mat& A, const arma::vec& b, int max_iter, double tol) { int n = A.n_cols; arma::vec x = arma::zeros(n); arma::vec w = A.t() * b; // Initial gradient arma::uvec P_set, Z_set = arma::regspace(0, n-1); while (Z_set.n_elem > 0 && iter < max_iter) { // Find index with max positive gradient in Z arma::uword t = Z_set(w.elem(Z_set).index_max()); if (w(t) <= tol) break; // Move t from Z to P P_set = join_cols(P_set, {t}); Z_set.shed_row(find_idx); // Inner loop: solve unconstrained subproblem while (true) { arma::vec z_P = solve(A.cols(P_set), b); if (all(z_P > 0)) { x.elem(P_set) = z_P; break; } // Handle negative values... } w = A.t() * (b - A * x); } return x; } ``` ## Coordinate Descent Method ### Algorithm Description Coordinate descent optimizes one variable at a time while keeping others fixed. For NNLS, we work with the normal equations: $$\mathbf{Q}\mathbf{x} = \mathbf{c}$$ where $\mathbf{Q} = \mathbf{A}^\top\mathbf{A}$ and $\mathbf{c} = \mathbf{A}^\top\mathbf{b}$. ``` Algorithm: Coordinate Descent NNLS Input: Q = A'A, c = A'b Output: x (n-vector) with x >= 0 1. Initialize: x = 0, g = -c (gradient at x=0) 2. Repeat until convergence: For j = 1 to n: a. Compute update: delta = -g[j] / Q[j,j] b. Project: x_new = max(0, x[j] + delta) c. Update gradient: g = g + (x_new - x[j]) * Q[:,j] d. x[j] = x_new 3. Return x ``` ### Advantages - **Simplicity**: Easy to implement and understand - **Efficiency**: Each iteration is $O(n)$ for gradient update - **Precomputation**: $\mathbf{Q} = \mathbf{A}^\top\mathbf{A}$ computed once for batch processing ### Implementation ```cpp // Simplified C++ implementation arma::vec nnls_coord_descent(const arma::mat& AtA, const arma::vec& Atb, int max_iter, double tol) { int n = AtA.n_cols; arma::vec x = arma::zeros(n); arma::vec grad = -Atb; // Gradient at x=0 for (int iter = 0; iter < max_iter; iter++) { double max_change = 0; for (int j = 0; j < n; j++) { if (AtA(j,j) < 1e-12) continue; // Skip zero diagonal double old_xj = x(j); double new_xj = std::max(0.0, old_xj - grad(j) / AtA(j,j)); if (new_xj != old_xj) { double delta = new_xj - old_xj; x(j) = new_xj; grad += delta * AtA.col(j); // Efficient gradient update max_change = std::max(max_change, std::abs(delta)); } } if (max_change < tol) break; } return x; } ``` ## Comparison ```{r comparison-table, echo=FALSE, results='asis'} df <- data.frame( Aspect = c("Time per iteration", "Iterations to converge", "Memory", "Convergence guarantee", "Best for"), ActiveSet = c("O(k^3)", "Usually fewer", "O(p*k)", "Finite", "Small k"), CoordinateDescent = c("O(k^2)", "Usually more", "O(k^2)", "Linear", "Large k") ) knitr::kable(df, col.names = c("Aspect", "Active Set", "Coordinate Descent")) ``` ## Benchmark ```{r benchmark, eval=TRUE, fig.cap="Solver performance comparison"} library(CellProgramMapper) # Benchmark function using CellProgramMapper benchmark_solver <- function(n_cells, n_genes, n_programs, method) { set.seed(42) X <- matrix(abs(rnorm(n_cells * n_genes)), nrow = n_cells) colnames(X) <- paste0("Gene", 1:n_genes) H <- matrix(abs(rnorm(n_programs * n_genes)), nrow = n_programs) colnames(H) <- paste0("Gene", 1:n_genes) rownames(H) <- paste0("GEP", 1:n_programs) start_time <- Sys.time() result <- CellProgramMapper( query = X, reference = H, method = method, verbose = FALSE ) end_time <- Sys.time() as.numeric(difftime(end_time, start_time, units = "secs")) } # Run benchmarks n_cells_vec <- c(100, 500, 1000) n_genes <- 200 n_programs <- 10 results <- data.frame() for (n_cells in n_cells_vec) { for (method in c("cd", "active_set")) { time_taken <- benchmark_solver(n_cells, n_genes, n_programs, method) results <- rbind(results, data.frame( n_cells = n_cells, method = method, time = time_taken )) } } # Plot par(mar = c(4, 4, 2, 1)) cd_times <- results$time[results$method == "cd"] as_times <- results$time[results$method == "active_set"] barplot(rbind(cd_times, as_times), beside = TRUE, names.arg = n_cells_vec, col = c("#1976d2", "#388e3c"), xlab = "Number of cells", ylab = "Time (seconds)", main = "NNLS Solver Performance") legend("topleft", legend = c("Coordinate Descent", "Active Set"), fill = c("#1976d2", "#388e3c"), bty = "n") ``` ## Choosing a Solver Use **Coordinate Descent** (default) when: - Processing large numbers of cells (>1000) - Number of programs is moderate (5-50) - Speed is the priority Use **Active Set** when: - High numerical precision is required - Number of programs is small (<10) - Guaranteed finite convergence is important ```r # Use coordinate descent (default) result <- CellProgramMapper(query, reference, method = "cd") # Use active set method result <- CellProgramMapper(query, reference, method = "active_set") ``` ## Numerical Considerations ### Tolerance Parameter The `tol` parameter controls convergence: - Smaller values: more iterations, higher precision - Larger values: fewer iterations, faster but less precise - Default: `1e-8` (good balance) ### Maximum Iterations The `max_iter` parameter prevents infinite loops: - Default: `1000` (sufficient for most problems) - Increase for very large programs or ill-conditioned problems ### Division by Zero Protection Both solvers include protection against division by zero: ```cpp if (AtA(j,j) < 1e-12) continue; // Skip near-zero diagonal ``` ## Session Info ```{r session} sessionInfo() ```