---
title: "Getting Started with SCEVAN"
author: "Zaoqu Liu"
date: "`r Sys.Date()`"
output: 
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{Getting Started with SCEVAN}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  echo = TRUE,
  message = FALSE,
  warning = FALSE,
  collapse = TRUE,
  comment = "#>",
  fig.width = 8,
  fig.height = 6
)
```

## Introduction

**SCEVAN** (**S**ingle **CE**ll **V**ariational **A**neuploidy a**N**alysis) is an R package for inferring copy number alterations (CNAs) from single-cell RNA sequencing (scRNA-seq) data. This vignette covers installation and a first end-to-end analysis.

### Key Capabilities

- **Automated Cell Classification**: Distinguishes malignant cells from tumor microenvironment (TME) cells
- **CNA Inference**: Infers copy number profiles from gene expression data
- **Subclone Detection**: Identifies clonal subpopulations with distinct copy number architectures
- **Cross-Platform**: Works on Windows, macOS, and Linux

## Installation

### From R-universe (Recommended)

```r
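# Listing CRAN as a second repository lets install.packages() resolve CRAN dependencies
install.packages(
  "SCEVAN",
  repos = c("https://zaoqu-liu.r-universe.dev", "https://cloud.r-project.org")
)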
```

### From GitHub

```r
# Install Bioconductor dependencies
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("EnsDb.Hsapiens.v86", "scran", "fgsea"))

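# Install remotes if it is missing, then yaGST
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("miccec/yaGST")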

# Install SCEVAN
remotes::install_github("Zaoqu-Liu/SCEVAN")
```

## Quick Start

### Load the Package

```{r load-package, eval=FALSE}
library(SCEVAN)
```

### Prepare Your Data

SCEVAN requires a **raw count matrix** with:

- **Rows**: Gene names (Gene Symbols or Ensembl IDs)
- **Columns**: Cell barcodes

```{r example-data, eval=FALSE}
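# Example: loading a count matrix from a CSV file (gene names in the first column).
# check.names = FALSE stops read.csv() from rewriting barcodes that contain "-".
count_mtx <- as.matrix(read.csv("your_counts.csv", row.names = 1, check.names = FALSE))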

# Or from a Seurat object (v4/v5 compatible)
count_mtx <- getCountMtxFromSeurat(seurat_obj, assay = "RNA")
```
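
Before running the pipeline, it can help to confirm that the matrix has the expected shape. The following is a minimal sketch, not part of SCEVAN: `check_count_mtx()` is a hypothetical helper, and `toy_counts` is simulated data standing in for a real count matrix. The checks reflect only the requirements stated above (raw counts, genes as row names, cell barcodes as column names).

```r
# Toy raw-count matrix standing in for real data (5 genes x 4 cells).
set.seed(1)
toy_counts <- matrix(
  rpois(20, lambda = 5),
  nrow = 5,
  dimnames = list(paste0("GENE", 1:5), paste0("CELL", 1:4))
)

# Hypothetical helper (not a SCEVAN function): basic sanity checks on the input.
# Named conditions in stopifnot() require R >= 4.0.
check_count_mtx <- function(mtx) {
  mtx <- as.matrix(mtx)
  stopifnot(
    "matrix must be numeric" = is.numeric(mtx),
    "gene names (rownames) are missing" = !is.null(rownames(mtx)),
    "cell barcodes (colnames) are missing" = !is.null(colnames(mtx)),
    "gene names must be unique" = !anyDuplicated(rownames(mtx)),
    "cell barcodes must be unique" = !anyDuplicated(colnames(mtx)),
    "counts must be non-negative" = all(mtx >= 0, na.rm = TRUE),
    "raw counts should be whole numbers" = all(mtx == round(mtx), na.rm = TRUE)
  )
  invisible(mtx)
}

checked <- check_count_mtx(toy_counts)
dim(checked)
```

A failure of the whole-number check usually means the matrix holds normalized or log-transformed values rather than raw counts.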

### Run the Analysis

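The main function `pipelineCNA()` runs the complete analysis. Here `sample` labels the run, `par_cores` sets the number of cores used in parallel, and `organism` selects the species:

```{r run-pipeline, eval=FALSE}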
results <- pipelineCNA(
  count_mtx,
  sample = "MySample",
  par_cores = 4,
  organism = "human"
)
```
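
The value `par_cores = 4` above is only an example. One possible way to pick a value for the current machine is sketched below, using `parallel::detectCores()` from base R. Leaving one core free is a convention assumed here, not a SCEVAN recommendation.

```r
# detectCores() can return NA on some systems, so fall back to a single core.
n_detected <- parallel::detectCores()
par_cores <- if (is.na(n_detected)) 1L else max(1L, n_detected - 1L)
par_cores
```

The resulting `par_cores` can then be passed to `pipelineCNA()` in place of the fixed value.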

### Understand the Output

The function returns a data frame with cell classifications:

| Column | Description |
|--------|-------------|
| `class` | Cell type: "tumor", "normal", or "filtered" |
| `confidentNormal` | Whether cell was used as normal reference |
| `subclone` | Subclone assignment (if detected) |
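
A quick way to summarize such a table is sketched below. `mock_results` is invented for illustration and only mimics the `class` and `subclone` columns described above. It also assumes that row names hold the cell barcodes, which should be checked against the actual return value.

```r
# Mock result table; all values are made up for illustration only.
mock_results <- data.frame(
  class = c("tumor", "tumor", "normal", "filtered", "tumor"),
  subclone = c(1, 2, NA, NA, 1),
  row.names = paste0("CELL", 1:5)
)

# Number of cells per class.
table(mock_results$class)

# Barcodes of the cells called malignant (assumes barcodes are the row names).
tumor_cells <- rownames(mock_results)[mock_results$class == "tumor"]
tumor_cells

# Subclone sizes among malignant cells; NA would mark cells without an assignment.
table(mock_results$subclone[mock_results$class == "tumor"], useNA = "ifany")
```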

Output files are saved to the `./output/` directory:

- `*heatmap.png` - CNA heatmap with cell classifications
- `*_CN.seg` - Segmentation results
- `*_CNAmtx.RData` - CNA matrix for downstream analysis
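
To see what a run produced, the output directory can be inspected from R. This sketch uses only base R and the file patterns listed above. It assumes `pipelineCNA()` was run from the current working directory, and it makes no assumption about the names of the objects stored in the `.RData` file: it loads them into a separate environment and prints whatever it finds.

```r
out_dir <- "output"  # assumed: relative to the working directory of the run

# Files matching the three patterns listed above; empty if nothing was written.
result_files <- list.files(
  out_dir,
  pattern = "heatmap\\.png$|_CN\\.seg$|_CNAmtx\\.RData$",
  full.names = TRUE
)
print(result_files)

# Load the first CNA matrix file into its own environment so that
# nothing in the global workspace is overwritten.
cna_files <- grep("_CNAmtx\\.RData$", result_files, value = TRUE)
if (length(cna_files) > 0) {
  cna_env <- new.env()
  stored <- load(cna_files[1], envir = cna_env)
  print(stored)  # names of the objects contained in the file
}
```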

## Next Steps

- **[Algorithm Details](algorithm.html)**: Understand the methodology behind SCEVAN
- **[Single-Sample Analysis](single-sample-analysis.html)**: Detailed walkthrough with real data
- **[Multi-Sample Comparison](multi-sample-analysis.html)**: Compare CNAs across samples
- **[Seurat Integration](seurat-integration.html)**: Integrate with Seurat workflows

## Citation

If you use SCEVAN in your research, please cite:

> De Falco, A., Caruso, F., Su, X.-D., Varone, A., & Ceccarelli, M. (2023). 
> A variational algorithm to detect the clonal copy number substructure of tumors from scRNA-seq data. 
> *Nature Communications*, 14, 1074. https://doi.org/10.1038/s41467-023-36790-9

## Session Info

```{r session-info, eval=FALSE}
sessionInfo()
```