---
title: "Multi-Species Analysis"
author: "Zaoqu Liu"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
    fig_caption: true
vignette: >
  %\VignetteIndexEntry{Multi-Species Analysis}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 8,
  fig.height = 6,
  fig.align = "center",
  message = FALSE,
  warning = FALSE
)
```

## Introduction

NOVA supports cell-cell communication analysis across **21 species** through integration with NCBI HomoloGene. This enables:

- Analysis of model organism data (mouse, rat, zebrafish, etc.)
- Cross-species comparative studies
- Translational research applications

## Supported Species

```{r species}
library(NOVA)

# View all supported species
species <- supported_species()
print(species)
```

### Species Details

The table below lists six commonly used species; call `supported_species()` for the full list.

| Species | Common Name | Taxonomy ID | Model Use |
|---------|-------------|-------------|-----------|
| human | Human | 9606 | Clinical research |
| mouse | Mouse | 10090 | Disease models |
| rat | Rat | 10116 | Pharmacology |
| zebrafish | Zebrafish | 7955 | Development |
| fruitfly | Drosophila | 7227 | Genetics |
| nematode | C. elegans | 6239 | Neuroscience |

## Homology Mapping

### How It Works

NOVA uses NCBI HomoloGene to map gene symbols between species:

1. **Query species genes** → HomoloGene IDs
2. **HomoloGene IDs** → Target species orthologs
3. **Apply mapping** to ligand-receptor database

```{r homology_example}
# Get homology mapping from mouse to human
mapping <- GetHomologyMapping(from = "mouse", to = "human")
head(mapping)
cat("\nTotal mappings:", nrow(mapping), "\n")
```

### Converting Gene Symbols

```{r convert}
# Example mouse genes
mouse_genes <- c("Cd4", "Cd8a", "Ptprc", "Itgam", "Cd19")

# Convert to human symbols
human_genes <- ConvertGeneSymbols(mouse_genes, from = "mouse", to = "human")
print(data.frame(mouse = mouse_genes, human = human_genes))
```

## Analyzing Mouse Data

### Standard Workflow

```{r mouse_analysis, eval=FALSE}
set.seed(123)

# Simulate mouse single-cell data
n_genes <- 200
n_cells <- 300

# Create a sparse-like expression matrix: about 25% of entries are non-zero
expr <- matrix(0, nrow = n_genes, ncol = n_cells)
expressed <- sample(length(expr), size = length(expr) * 0.25)
expr[expressed] <- abs(rnorm(length(expressed), mean = 2, sd = 1))

# Get the human LR pairs and the human-to-mouse ortholog mapping
lr_db <- GetLRDatabase("lrc2p")
mouse_mapping <- GetHomologyMapping("human", "mouse")

# Map some human ligands/receptors to mouse
mouse_ligands <- mouse_mapping$to_symbol[match(lr_db$ligand[1:30], mouse_mapping$from_symbol)]
mouse_receptors <- mouse_mapping$to_symbol[match(lr_db$receptor[1:30], mouse_mapping$from_symbol)]

# Remove NAs (genes without a mouse ortholog) and duplicates
mouse_ligands <- unique(as.character(na.omit(mouse_ligands)))
mouse_receptors <- unique(as.character(na.omit(mouse_receptors)))

# Set gene names: mapped LR genes first, then placeholder genes.
# Using the actual number of mapped genes avoids NA or duplicated row names
# when fewer than 20 ligands or receptors map successfully.
lr_genes <- head(unique(c(mouse_ligands, mouse_receptors)), 40)
gene_names <- c(lr_genes, paste0("MouseGene", seq_len(n_genes - length(lr_genes))))
rownames(expr) <- gene_names
colnames(expr) <- paste0("Cell", 1:n_cells)

# Create annotation
clusters <- sample(c("T_cells", "B_cells", "Macrophages", "Fibroblasts"),
                   n_cells, replace = TRUE)
annotation <- data.frame(
  cell = colnames(expr),
  cluster = clusters
)

# Run analysis specifying mouse
result <- ExtractEdges(
  expression = Matrix::Matrix(expr, sparse = TRUE),
  annotation = annotation,
  species = "mouse",  # Specify species
  database = "lrc2p",
  min_pct = 0.05
)

print(result)
```

## Cross-Species Comparison

### Comparative Study Design

To compare communication across species, run the same analysis on each dataset and intersect the resulting ligand-receptor pairs:

```{r comparative, eval=FALSE}
# human_expr, human_ann, mouse_expr and mouse_ann are assumed to be loaded already

# Human analysis
human_result <- ExtractEdges(
  expression = human_expr,
  annotation = human_ann,
  species = "human",
  database = "lrc2p"
)

# Mouse analysis (genes auto-converted)
mouse_result <- ExtractEdges(
  expression = mouse_expr,
  annotation = mouse_ann,
  species = "mouse",
  database = "lrc2p"
)

# Compare conserved interactions
# (assumes both results report pairs in the same symbol namespace)
human_pairs <- paste(human_result$edges$ligand, human_result$edges$receptor, sep = "-")
mouse_pairs <- paste(mouse_result$edges$ligand, mouse_result$edges$receptor, sep = "-")

conserved <- intersect(human_pairs, mouse_pairs)
cat("Conserved LR interactions:", length(conserved), "\n")
```

## Gene ID Types

NOVA supports multiple gene identifier types:

```{r id_types}
# View supported ID types
id_types <- supported_id_types()
print(id_types)
```

### Converting Between ID Types

```{r id_convert, eval=FALSE}
# Convert Ensembl IDs to symbols
ensembl_ids <- c("ENSG00000153563", "ENSG00000010610")
symbols <- ConvertGeneIDs(ensembl_ids, from = "ensembl", to = "symbol", species = "human")
```

## Special Considerations

### 1. One-to-Many Mappings

Some genes have multiple orthologs in the target species, so the same source symbol appears in more than one row of the mapping:

```{r one_to_many}
# Check for duplicated mappings
mapping <- GetHomologyMapping("mouse", "human")
dup_genes <- mapping$from_symbol[duplicated(mapping$from_symbol)]
cat("Genes with multiple human orthologs:", length(unique(dup_genes)), "\n")
```

### 2. Missing Orthologs

Not all genes have orthologs:

```{r missing}
# Example: the last symbol is a made-up gene with no ortholog
all_mouse_genes <- c("Actb", "Gapdh", "NoOrtholog123")
converted <- ConvertGeneSymbols(all_mouse_genes, "mouse", "human")
print(data.frame(mouse = all_mouse_genes, human = converted))
```

### 3. Species-Specific Genes

Some genes are species-specific and won't have orthologs. These are automatically excluded from analysis.

## Best Practices

### Workflow Recommendations

1. **Start with the human database**: The LR database is human-centric
2. **Verify ortholog coverage**: Check how many genes map successfully
3. **Report unmapped genes**: Document genes that couldn't be mapped
4. **Validate key interactions**: Confirm important findings in species-specific literature

### Quality Control

```{r qc}
# Check ortholog mapping rate.
# The LR database uses human symbols, so map human -> mouse;
# from_symbol then holds the human symbols to match against.
mapping <- GetHomologyMapping("human", "mouse")
lr_db <- GetLRDatabase("lrc2p")

# How many unique ligands and receptors can be mapped?
ligands <- unique(lr_db$ligand)
receptors <- unique(lr_db$receptor)
ligand_mapped <- sum(ligands %in% mapping$from_symbol)
receptor_mapped <- sum(receptors %in% mapping$from_symbol)

cat("Ligands mappable to mouse:", ligand_mapped, "/", length(ligands), "\n")
cat("Receptors mappable to mouse:", receptor_mapped, "/", length(receptors), "\n")
```

## Session Info

```{r session}
sessionInfo()
```

## Author

**Zaoqu Liu**

- Email: liuzaoqu@163.com
- GitHub: [Zaoqu-Liu](https://github.com/Zaoqu-Liu)