| Title: | Single CEll Variational Aneuploidy aNalysis |
|---|---|
| Description: | SCEVAN automatically classifies cells in scRNA-seq data by separating the non-malignant cells of the tumor microenvironment from malignant cells. It also infers the copy number profiles of malignant cells, identifies subclonal structures, and analyzes the specific and shared alterations of each subpopulation. |
| Authors: | Zaoqu Liu [ctb, cre], A. De Falco [aut], M. Ceccarelli [aut] |
| Maintainer: | Zaoqu Liu <[email protected]> |
| License: | GPL-2 |
| Version: | 1.0.6 |
| Built: | 2026-05-27 07:43:59 UTC |
| Source: | https://github.com/Zaoqu-Liu/SCEVAN |
annotateGenes

Annotate genes with genomic coordinates with reference to hg38, using an Ensembl-based annotation package.

annotateGenes(mtx, organism = "human")

| Argument | Description |
|---|---|
| mtx | Count matrix with genes as row names (Ensembl ID or gene symbol) |
| organism | Organism to be analysed ("human" or "mouse", default "human") |

Returns the annotated matrix.

## Not run:
count_mtx_annot <- annotateGenes(count_mtx)
## End(Not run)
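The idea of attaching genomic coordinates to the rows of a count matrix can be sketched in base R. The annotation table below is a hypothetical stand-in with made-up coordinates, not the Ensembl-based annotation the package relies on, and `annotate_toy` is an illustrative helper, not a SCEVAN function.

```r
# Toy count matrix: genes on rows, cells on columns.
count_mtx <- matrix(
  c(5, 0, 3,
    1, 2, 0,
    0, 7, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GENE_B", "GENE_A", "GENE_C"), c("cell1", "cell2", "cell3"))
)

# Hypothetical annotation table (coordinates are made up for illustration).
toy_annot <- data.frame(
  gene  = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  chr   = c(1, 1, 2, 2),
  start = c(100, 500, 50, 900),
  end   = c(200, 650, 120, 990),
  stringsAsFactors = FALSE
)

annotate_toy <- function(mtx, annot) {
  # Keep only genes present in both the matrix and the annotation.
  annot <- annot[annot$gene %in% rownames(mtx), ]
  # Order genes by chromosome, then by start position.
  annot <- annot[order(annot$chr, annot$start), ]
  # Put the coordinates next to the counts, rows in genomic order.
  counts <- as.data.frame(mtx[annot$gene, , drop = FALSE])
  out <- cbind(annot[, c("chr", "start", "end")], counts)
  rownames(out) <- annot$gene
  out
}

annotated <- annotate_toy(count_mtx, toy_annot)
print(annotated)
```

Genes missing from the annotation table (or from the matrix) are silently dropped in this sketch; a real workflow would probably want to report how many were lost.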

annoteBandOncoHeat

Annotate with chromosome bands the data frame of differential copy number alterations between subclones.

annoteBandOncoHeat(mtx_annot, diffSub, nSub, organism = "human")

| Argument | Description |
|---|---|
| mtx_annot | Annotation matrix |
| diffSub | Data frame with the differential copy number alterations between subclones |
| nSub | Number of subclones |
| organism | Organism to be analysed (default "human") |

classifyCluster

Classify the two major clusters of the CNA matrix on the basis of confident normal cells.

classifyCluster(hcc2, norm_cell_names)

| Argument | Description |
|---|---|
| hcc2 | Two clusters from hierarchical clustering |
| norm_cell_names | Vector of confident normal cells |

Returns the classification of tumor and normal cells.
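A minimal base R sketch of this idea follows. It assumes a simple majority rule (the cluster holding most of the confident normal cells is labelled normal); that rule and the helper `classify_toy` are illustrative assumptions, not necessarily what `classifyCluster` does internally.

```r
set.seed(1)
# Toy CNA-like matrix: genes on rows, cells on columns.
# Cells n1..n5 hover around 0; cells t1..t5 carry a shifted profile.
normal <- matrix(rnorm(50, mean = 0, sd = 0.1), nrow = 10,
                 dimnames = list(NULL, paste0("n", 1:5)))
tumor  <- matrix(rnorm(50, mean = 1, sd = 0.1), nrow = 10,
                 dimnames = list(NULL, paste0("t", 1:5)))
cna_mtx <- cbind(normal, tumor)

# Hierarchical clustering of cells, cut into the two major clusters.
hc   <- hclust(dist(t(cna_mtx), method = "euclidean"), method = "ward.D2")
hcc2 <- cutree(hc, k = 2)

classify_toy <- function(hcc2, norm_cell_names) {
  # Count how many confident normal cells fall in each cluster.
  hits <- table(factor(hcc2[norm_cell_names], levels = c(1, 2)))
  normal_cluster <- as.integer(names(hits)[which.max(hits)])
  # Label the cluster with most confident normal cells as normal.
  labels <- ifelse(hcc2 == normal_cluster, "normal", "tumor")
  names(labels) <- names(hcc2)
  labels
}

pred <- classify_toy(hcc2, norm_cell_names = c("n1", "n2"))
print(table(pred))
```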

classifyTumorCells

Classify tumour and normal cells from the raw count matrix, using the normal cells in the matrix or, if the matrix contains no normal cells, by subtracting a synthetic baseline from it.

classifyTumorCells(count_mtx, annot_mtx, sample = "", distance = "euclidean", par_cores = 20, ground_truth = NULL, norm_cell_names = NULL, SEGMENTATION_CLASS = TRUE, SMOOTH = TRUE, beta_vega = 0.5, FIXED_NORMAL_CELLS = FALSE, output_dir = "./output")

| Argument | Description |
|---|---|
| count_mtx | Raw count matrix |
| annot_mtx | Matrix containing the gene annotations (rows: genes; columns: chr, start, end) |
| sample | Sample name (optional) |
| distance | Distance used in hierarchical clustering (default "euclidean") |
| par_cores | Number of cores (default 20) |
| norm_cell_names | Confident normal cells (optional) |
| SEGMENTATION_CLASS | Boolean value to perform segmentation before classification (default TRUE) |
| SMOOTH | Boolean value to perform smoothing (default TRUE) |
| beta_vega | Beta parameter for segmentation; a higher beta gives a more coarse-grained segmentation (default 0.5) |
| FIXED_NORMAL_CELLS | TRUE if the norm_cell_names vector is to be used as a fixed reference, when you are interested only in the clonal structure and not in the normal/tumor classification (default FALSE) |
| ground_truth | Ground truth of the classification (optional) |
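Using normal cells as a reference can be illustrated with a small base R sketch: the per-gene mean of the confident normal cells is subtracted from every cell, giving a relative expression matrix. This is one simple way to build such a reference and is an assumption made for illustration; the toy values and variable names are hypothetical and do not reproduce the internals of `classifyTumorCells`.

```r
# Toy log-expression matrix: genes on rows, cells on columns.
expr <- matrix(
  c(2.0, 2.2, 3.0, 3.2,
    1.0, 0.8, 0.1, 0.0,
    1.5, 1.5, 1.4, 1.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("g", 1:3), c("n1", "n2", "t1", "t2"))
)
norm_cell_names <- c("n1", "n2")

# Per-gene baseline: mean over the confident normal cells.
baseline <- rowMeans(expr[, norm_cell_names, drop = FALSE])

# Relative matrix: each cell minus the normal baseline, gene by gene.
rel <- sweep(expr, 1, baseline, FUN = "-")
print(round(rel, 2))
```

In the relative matrix the normal cells average to zero for every gene, so sustained positive or negative runs along the genome in other cells stand out as candidate gains or losses.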

computeCNAmtx

Compute the CNA matrix using the breakpoints obtained from segmentation.

computeCNAmtx(count_mtx, breaks, par_cores = 20, segmAlt)

| Argument | Description |
|---|---|
| count_mtx | Count matrix |
| par_cores | Number of cores for parallel computing (optional) |
| breaks | Breakpoints obtained from segmentation |

Returns the CNA matrix.
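To make the role of the breakpoints concrete, here is a minimal sketch that replaces each gene's value with the mean of its segment, cell by cell. The breakpoint encoding (segment start indices plus a closing index) and the use of the segment mean are assumptions chosen for illustration; `segment_means` is a hypothetical helper, not the package's implementation.

```r
# Toy relative-expression matrix: 6 genes in genomic order, 2 cells.
rel <- matrix(
  c(0.1, 0.0,
    0.3, 0.2,
    0.2, 0.1,
    1.0, 0.0,
    1.2, 0.2,
    0.8, 0.1),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("g", 1:6), c("cellA", "cellB"))
)

# Hypothetical breakpoints: start index of each segment plus a closing index.
breaks <- c(1, 4, 7)

segment_means <- function(mtx, breaks) {
  out <- mtx
  for (i in seq_len(length(breaks) - 1)) {
    idx <- breaks[i]:(breaks[i + 1] - 1)
    # Per-cell mean over the genes of this segment.
    seg_mean <- colMeans(mtx[idx, , drop = FALSE])
    # Assign that mean to every gene in the segment.
    out[idx, ] <- matrix(seg_mean, nrow = length(idx),
                         ncol = ncol(mtx), byrow = TRUE)
  }
  out
}

cna <- segment_means(rel, breaks)
print(cna)
```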

getBreaksVegaMC

Get the SCEVAN segmentation of the matrix.

getBreaksVegaMC(mtx, chr_vect, sample = "", beta_vega = 0.5, output_dir = "./output")

| Argument | Description |
|---|---|
| mtx | Count matrix |
| chr_vect | Vector specifying, for each gene, the chromosome on which it is located |
| sample | Sample name (optional) |
| beta_vega | Beta parameter for segmentation; a higher beta gives a more coarse-grained segmentation (default 0.5) |

Returns the breakpoints.

getConfidentNormalCells

Get at most the top 30 confident normal cells from the count matrix.

getConfidentNormalCells(mtx, sample = "", par_cores = 20, AdditionalGeneSets = NULL, SCEVANsignatures = TRUE, organism = "human", output_dir = "./output")

| Argument | Description |
|---|---|
| mtx | Count matrix |
| sample | Sample name (optional) |
| par_cores | Number of cores (default 20) |
| AdditionalGeneSets | List of additional signatures of normal cell types (optional) |
| SCEVANsignatures | FALSE if you want to use only the signatures specified in AdditionalGeneSets (default TRUE) |

getCountMtxFromSeurat

This function extracts the raw count matrix from a Seurat object, supporting both Seurat V4 and V5 data structures. It prioritizes the V4 format.

getCountMtxFromSeurat(seurat_obj, assay = "RNA", layer = "counts")

| Argument | Description |
|---|---|
| seurat_obj | A Seurat object |
| assay | Assay name to use (default "RNA") |
| layer | Layer name for Seurat V5 (default "counts") |

Returns the raw count matrix with genes on rows and cells on columns.

## Not run:
count_mtx <- getCountMtxFromSeurat(seurat_obj)
results <- pipelineCNA(count_mtx)
## End(Not run)

multiSampleComparisonClonalCN

Compare the clonal copy number of multiple samples.

multiSampleComparisonClonalCN(listCountMtx, listNormCells = NULL, analysisName = "all", organism = "human", par_cores = 20, plotTree = TRUE, output_dir = "./output")

| Argument | Description |
|---|---|
| listCountMtx | Named list of raw count matrices, one per sample |
| analysisName | Name of the analysis (default "all") |
| organism | Organism to be analysed (optional - "mouse" or "human" - default "human") |
| par_cores | Number of cores (default 20) |

pipelineCNA

Executes the entire SCEVAN pipeline: it classifies tumour and normal cells from the raw count matrix, infers the clonal profile of cancer cells, and looks for possible subclones in the tumour cell matrix, automatically analysing the specific and shared alterations of each subclone and performing a differential analysis of the pathways and genes expressed in each subclone.

pipelineCNA(count_mtx, sample = "", par_cores = 20, norm_cell = NULL, SUBCLONES = TRUE, beta_vega = 0.5, ClonalCN = TRUE, plotTree = TRUE, AdditionalGeneSets = NULL, SCEVANsignatures = TRUE, organism = "human", ngenes_chr = 5, perc_genes = 10, FIXED_NORMAL_CELLS = FALSE, output_dir = "./output")

| Argument | Description |
|---|---|
| count_mtx | Raw count matrix with genes on rows (both gene symbols and Ensembl IDs are allowed) and cells on columns |
| sample | Sample name used to save the results (optional) |
| par_cores | Number of cores used to run the pipeline (optional - default 20) |
| norm_cell | Vector of known normal cells to be used as confident normal cells (optional) |
| SUBCLONES | TRUE if you are interested in analysing the clonal structure, FALSE if you are only interested in the classification of malignant and non-malignant cells (optional - default TRUE) |
| beta_vega | Beta parameter for segmentation; a higher beta gives a more coarse-grained segmentation (optional - default 0.5) |
| ClonalCN | Infer the clonal CN profile from all tumour cells (optional - default TRUE) |
| plotTree | Plot the phylogenetic tree (optional - default TRUE) |
| AdditionalGeneSets | List of additional signatures of normal cell types (optional) |
| SCEVANsignatures | FALSE if you want to use only the signatures specified in AdditionalGeneSets (default TRUE) |
| organism | Organism to be analysed (optional - "mouse" or "human" - default "human") |
| ngenes_chr | Minimum number of genes expressed on a chromosome (optional - default 5) |
| perc_genes | Minimum percentage of genes expressed in each cell (optional - default 10) |
| FIXED_NORMAL_CELLS | TRUE if the norm_cell vector is to be used as a fixed reference, when you are only interested in the clonal structure and not in the normal/tumor classification (default FALSE) |

## Not run:
res_pip <- pipelineCNA(count_mtx)
## End(Not run)
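As a hedged illustration of a call with non-default settings, the sketch below builds a toy matrix in the expected orientation and defines a small wrapper around `pipelineCNA`. The wrapper `run_scevan` and its argument names `sample_name` and `cores` are hypothetical; only arguments listed in the usage above are forwarded. The wrapper is defined but deliberately not called, since a real run needs the SCEVAN package installed and a full-size dataset.

```r
# Toy raw count matrix in the expected orientation:
# genes on rows (symbols or Ensembl IDs), cells on columns.
set.seed(42)
count_mtx <- matrix(
  rpois(20, lambda = 3), nrow = 5,
  dimnames = list(paste0("GENE", 1:5), paste0("cell", 1:4))
)

# Hypothetical convenience wrapper; it only forwards documented arguments.
# Not executed here: the toy matrix is far too small for a meaningful analysis.
run_scevan <- function(count_mtx, sample_name, cores = 2) {
  SCEVAN::pipelineCNA(
    count_mtx,
    sample     = sample_name,   # used to name the saved results
    par_cores  = cores,         # lower than the default of 20
    SUBCLONES  = TRUE,          # also analyse the clonal structure
    beta_vega  = 0.5,           # raise for a coarser segmentation
    organism   = "human",
    output_dir = "./output"
  )
}

# Basic shape checks that are worth doing before a real run.
stopifnot(!is.null(rownames(count_mtx)), !is.null(colnames(count_mtx)))
```

Lowering `par_cores` is usually sensible on a laptop, because the default of 20 assumes a machine with many cores.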

plotAllClonalCN

plotAllClonalCN(samples, name)

| Argument | Description |
|---|---|
| samples | Vector with the names of the samples to be plotted |
| name | Analysis name |

plotAllSubclonalCN

Plot the copy number of each subclone of a sample.

plotAllSubclonalCN(sample, pathOutput = "./output/")

| Argument | Description |
|---|---|
| sample | Name of the sample |
| pathOutput | Path to the folder containing the output of pipelineCNA |

plotCNA_withAnnotCells

Generate a heatmap of the copy number profile of each cell, adding cell annotations as tracks on the heatmap.

plotCNA_withAnnotCells(SampleName, metadata, COLUMNS_TO_PLOT, outputPATH = "./output/", SUBCLONE = FALSE, hcc = NULL, plotNAME = "heatmap_with_annotation.png", par_cores = 20)

| Argument | Description |
|---|---|
| SampleName | Sample name used in pipelineCNA |
| metadata | data.frame with cells as row names and annotations as columns |
| COLUMNS_TO_PLOT | Columns of metadata to be added as tracks in the heatmap |
| outputPATH | Output folder of pipelineCNA (optional) |
| SUBCLONE | TRUE if you are interested in the CNA matrix of the subclones, FALSE if you are interested in the CNA matrix of all cells |
| hcc | Previously computed clustering for the heatmap, if available (optional - default NULL) |
| plotNAME | Name of the file in which the figure is saved (optional) |
| par_cores | Number of cores used for clustering (optional - default 20) |

## Not run:
plotCNA_withAnnotCells(SampleName, metadata, c("CellType","Tissue","Cluster"))
## End(Not run)
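The expected shape of `metadata` can be sketched as follows: one row per cell, with cell names as row names, and one column per annotation. The cell names and annotation values here are made up for illustration; in practice the row names would presumably need to match the cell names of the count matrix passed to `pipelineCNA`.

```r
cells <- paste0("cell", 1:4)

# Annotation table: one row per cell (row names), one column per annotation.
metadata <- data.frame(
  CellType = c("T cell", "Malignant", "Malignant", "Macrophage"),
  Tissue   = c("tumor", "tumor", "tumor", "tumor"),
  Cluster  = c(1, 2, 2, 3),
  row.names = cells,
  stringsAsFactors = FALSE
)

COLUMNS_TO_PLOT <- c("CellType", "Cluster")

# Sanity checks worth running before plotting.
stopifnot(all(COLUMNS_TO_PLOT %in% colnames(metadata)))
stopifnot(!anyDuplicated(rownames(metadata)))
print(metadata[, COLUMNS_TO_PLOT])
```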

preprocessingMtx

Pre-processing steps: cells with fewer than 200 expressed genes and genes expressed in too few cells (threshold set by perc_genes) are filtered out, and genes are annotated with genomic coordinates. Highly confident normal cells are sought in the matrix. Genes involved in the cell cycle pathway are removed. A log-Freeman–Tukey transformation is applied to stabilize the variance, followed by polynomial dynamic linear modeling (DLM) to smooth out outliers.

preprocessingMtx(count_mtx, sample, ngenes_chr = 5, perc_genes = 0.1, par_cores = 20, findConfident = TRUE, AdditionalGeneSets = NULL, SCEVANsignatures = TRUE, organism = "human", output_dir = "./output")

| Argument | Description |
|---|---|
| count_mtx | Raw count matrix |
| ngenes_chr | Minimum number of genes per chromosome (optional) |
| perc_genes | Percentage of cells in which each gene is to be expressed (optional) |
| par_cores | Number of cores (optional) |
| findConfident | Boolean value to search for normal cells (default TRUE) |
| AdditionalGeneSets | List of additional signatures to be used to search for normal cells (optional) |
| SCEVANsignatures | TRUE to use the internal SCEVAN signatures for normal cells, FALSE to use only the signatures specified in AdditionalGeneSets (default TRUE) |
| SMOOTH | Boolean value to perform smoothing (optional) |

Returns:

| Value | Description |
|---|---|
| count_mtx_smooth | Processed and smoothed matrix |
| count_mtx_annot | Annotated matrix |

## Not run:
res <- preprocessingMtx(count_mtx, sample = "test")
## End(Not run)
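The filtering and variance-stabilizing steps can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the gene filter uses a fraction of cells (0.1, mirroring the `perc_genes` default in the usage above), the transform is written as log(sqrt(x) + sqrt(x + 1)) with a natural logarithm, and the cell-cycle removal, normal-cell search and DLM smoothing are left out.

```r
set.seed(7)
# Toy raw counts: 300 genes x 6 cells; the last cell expresses few genes.
count_mtx <- matrix(rpois(300 * 6, lambda = 2), nrow = 300,
                    dimnames = list(paste0("g", 1:300), paste0("c", 1:6)))
count_mtx[51:300, "c6"] <- 0   # c6 has at most 50 expressed genes

min_genes  <- 200   # cells must express at least this many genes
perc_genes <- 0.1   # genes must be expressed in more than this fraction of cells

# 1. Drop cells with too few expressed genes.
genes_per_cell <- colSums(count_mtx > 0)
filtered <- count_mtx[, genes_per_cell >= min_genes, drop = FALSE]

# 2. Drop genes expressed in too few of the remaining cells.
frac_cells <- rowMeans(filtered > 0)
filtered <- filtered[frac_cells > perc_genes, , drop = FALSE]

# 3. Log-Freeman-Tukey transform to stabilise the variance (zero maps to zero).
log_ft <- log(sqrt(filtered) + sqrt(filtered + 1))

print(dim(log_ft))
```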

removeSyntheticBaseline

Removes a synthetic baseline from a pure tumour matrix.

removeSyntheticBaseline(count_mtx, par_cores = 20)

| Argument | Description |
|---|---|
| count_mtx | Count matrix |
| par_cores | Number of cores for parallel computing |

Returns the relative matrix.
The SCEVAN package
The SCEVAN functions ...

sortData

This function sorts a dataset file by the genomic position of the probes, which makes it easy to integrate VegaMC with the output of the PennCNV tool.

sortData(dataset, output_file_name = "")

| Argument | Description |
|---|---|
| dataset | Dataset file |
| output_file_name | Name of the file in which the sorted data are stored |

This function returns the input matrix ordered by the genomic position of the probes.

The input file must have the chromosome and the position in columns two and three, respectively. This format follows the standard output of PennCNV. An example file can be found in the inst/example folder.
Sandro Morganella
Morganella S., and Ceccarelli M. VegaMC: a R/bioconductor package for fast downstream analysis of large array comparative genomic hybridization datasets. Bioinformatics, 28(19):2512-4 (2012).
## Not run:
## Copy the example dataset in current folder
file.copy(system.file("example/breast_Affy500K.txt", package="VegaMC"), ".")
## Sort data and save results in sorted.txt file
sortData("breast_Affy500K.txt", "sorted.txt")
## End(Not run)
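The ordering itself is easy to sketch in base R on an in-memory table. The layout follows the description above (chromosome in column two, position in column three); the probe names in column one, the `sample1` column and the helper `sort_by_position` are assumptions added for illustration, and unlike `sortData` this sketch does not read or write files.

```r
# Toy dataset: column 2 = chromosome, column 3 = position.
dataset <- data.frame(
  probe   = c("p1", "p2", "p3", "p4"),
  chr     = c(2, 1, 1, 2),
  pos     = c(300, 900, 150, 100),
  sample1 = c(0.2, -0.1, 0.0, 0.5),
  stringsAsFactors = FALSE
)

sort_by_position <- function(df) {
  # Order by chromosome (column 2), then by position (column 3).
  df[order(df[[2]], df[[3]]), , drop = FALSE]
}

sorted <- sort_by_position(dataset)
print(sorted)
```

Chromosomes are numeric here; with labels such as "X" or "chr10" the column would need to be mapped to an ordered factor or numeric code first, otherwise the sort is alphabetical.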

top30classification

Get at most the top 30 confident normal cells.

top30classification(NES, pValue, FDR, pval_filter, fdr_filter, pval_cutoff, nes_cutoff, nNES)

| Argument | Description |
|---|---|
| NES | |
| pValue | |
| FDR | |
| pval_filter | |
| fdr_filter | |
| pval_cutoff | |
| nes_cutoff | |
| nNES | |
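Since the arguments are undocumented, the following is only a guess at the general pattern behind a "top 30" selection: filter per-cell enrichment results by a p-value and score cutoff, rank by score, and keep at most 30 cells. The data, the helper `top_confident` and its parameters are hypothetical and should not be read as the behaviour of `top30classification`.

```r
set.seed(3)
n_cells <- 100
# Hypothetical per-cell enrichment results for a normal-cell signature.
scores <- data.frame(
  cell   = paste0("cell", seq_len(n_cells)),
  NES    = rnorm(n_cells, mean = 1, sd = 1),
  pValue = runif(n_cells),
  stringsAsFactors = FALSE
)

top_confident <- function(scores, pval_cutoff = 0.05, nes_cutoff = 1,
                          max_cells = 30) {
  # Keep cells passing both the significance and the score threshold.
  keep <- scores[scores$pValue < pval_cutoff & scores$NES > nes_cutoff, ]
  # Rank by score, highest first, and cap the number of cells returned.
  keep <- keep[order(keep$NES, decreasing = TRUE), ]
  head(keep$cell, max_cells)
}

confident <- top_confident(scores, pval_cutoff = 0.5, nes_cutoff = 0)
print(length(confident))
```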