Performs a virtual knock-out or knock-in simulation for a target gene in a single-cell dataset. The function identifies cells with high or low expression of the gene, matches group sizes and cell-type composition, and performs differential expression and enrichment analysis to infer potential functional effects.

Usage

silico_ko(
  all,
  gene,
  sko_mode = c("ko", "ki"),
  cell_col = "cell_anno",
  filter_cell_threshold = 10,
  pct_threshold = 0.1,
  abs_threshold = NULL,
  deg_method = "default",
  enrichment_method = c("ORA", "GSEA"),
  enrichment_bg = c("GO", "KEGG", "MKEGG", "Reactome"),
  simplify = TRUE
)

Arguments

all

A Seurat object containing expression data and cell metadata.

gene

Character(1). The target gene symbol (must exist in the expression matrix).

sko_mode

One of c("ko", "ki"). Specifies virtual knock-out ("ko") or knock-in ("ki") mode.

cell_col

Character(1). The metadata column indicating cell type.

filter_cell_threshold

Integer. Minimum number of cells per cell type to retain.

pct_threshold

Numeric. Fraction of expressing cells to extract per group (ignored if abs_threshold is set).

abs_threshold

Integer. Absolute number of cells per group to extract; overrides pct_threshold if provided.

deg_method

Character. Differential expression method: "default" or "MAST".

enrichment_method

Character vector. Enrichment methods to run; default is c("ORA", "GSEA").

enrichment_bg

Character vector. Background databases for enrichment; default is c("GO", "KEGG", "MKEGG", "Reactome").

simplify

Logical. Whether to simplify redundant enrichment terms via clusterProfiler::simplify. Defaults to TRUE. Set to FALSE if enrichment results are sparse (e.g., small datasets) to avoid errors.
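
The interaction between pct_threshold and abs_threshold described above can be illustrated with a minimal sketch. The helper group_size() below is hypothetical and not part of the package; the rounding rule and the cap at the number of available cells are assumptions, and the documentation does not state whether the size is resolved per cell type or across all expressing cells.

```r
# Hypothetical helper (not exported by the package): resolves how many
# cells go into each group from the two threshold arguments.
group_size <- function(n_expressing, pct_threshold = 0.1, abs_threshold = NULL) {
  if (!is.null(abs_threshold)) {
    # An absolute count, when supplied, takes precedence over the fraction
    n <- as.integer(abs_threshold)
  } else {
    # Assumed rounding rule: nearest whole cell
    n <- as.integer(round(n_expressing * pct_threshold))
  }
  # Assumption: never request more cells than are available
  min(n, n_expressing)
}

size_pct <- group_size(500)                      # fraction-based size
size_abs <- group_size(500, abs_threshold = 80)  # absolute size overrides the fraction
size_cap <- group_size(30, abs_threshold = 80)   # capped at the available cells
```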

Value

Invisibly returns a list containing:

  • high_cells: Vector of selected high-expression cells.

  • low_cells: Vector of selected low-expression cells.

  • cell_rato: Data frame summarizing cell-type composition.

  • deg_results: Differential expression results (data frame).

  • enrichments: Enrichment analysis results.
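
Because the list is returned invisibly, nothing is printed when the function is called at the console; the result has to be assigned to be inspected. The sketch below shows this behavior with a hypothetical stand-in, mock_result(), that only mimics part of the documented return shape and does not call silico_ko itself.

```r
# Hypothetical stand-in that mimics part of the documented return shape
mock_result <- function() {
  invisible(list(
    high_cells = c("cell_1", "cell_2"),
    low_cells  = c("cell_3", "cell_4")
  ))
}

mock_result()              # nothing is auto-printed at the console
res <- mock_result()       # assignment captures the list
res_names <- names(res)    # element names are available as usual
was_visible <- withVisible(mock_result())$visible
```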

Details

The pipeline includes:

  1. Filtering cells expressing the target gene.

  2. Retaining cell types with sufficient cell counts.

  3. Selecting top-expressing cells (high group) and matched low-expressing cells (low group).

  4. Running differential expression between the two groups using Seurat::FindMarkers.

  5. Performing enrichment analysis via leo.basic::leo_enrich.
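
Steps 1 to 3 above can be sketched in base R on toy data. This is an illustration under stated assumptions, not the package's implementation: the toy metadata, the per-cell-type selection, the floor-based group size, and drawing the low group from the lowest-expressing cells among those that express the gene are all assumptions made for the example.

```r
set.seed(1)

# Toy metadata (assumed layout): one row per cell with a cell-type label
# and the simulated expression of a single target gene
meta <- data.frame(
  cell      = paste0("cell_", 1:200),
  cell_anno = sample(c("T", "B", "Mono"), 200, replace = TRUE,
                     prob = c(0.6, 0.35, 0.05)),
  expr      = rexp(200) * rbinom(200, 1, 0.7),
  stringsAsFactors = FALSE
)

# Step 1: keep cells that express the gene
expressing <- meta[meta$expr > 0, ]

# Step 2: keep cell types with at least filter_cell_threshold cells
filter_cell_threshold <- 10
counts     <- table(expressing$cell_anno)
keep_types <- names(counts)[counts >= filter_cell_threshold]
expressing <- expressing[expressing$cell_anno %in% keep_types, ]

# Step 3: within each cell type, take the top fraction as the high group
# and an equal number of the lowest-expressing cells as the low group,
# so both groups share the same cell-type composition
pct_threshold <- 0.1
pick <- function(df) {
  df <- df[order(df$expr, decreasing = TRUE), ]
  n  <- max(1, floor(nrow(df) * pct_threshold))
  list(high = head(df$cell, n), low = tail(df$cell, n))
}
picked     <- lapply(split(expressing, expressing$cell_anno), pick)
high_cells <- unlist(lapply(picked, `[[`, "high"), use.names = FALSE)
low_cells  <- unlist(lapply(picked, `[[`, "low"), use.names = FALSE)

# Cell-type composition of each group
high_comp <- table(meta$cell_anno[match(high_cells, meta$cell)])
low_comp  <- table(meta$cell_anno[match(low_cells, meta$cell)])
```

Selecting within each cell type is one simple way to keep the two groups comparable, so that downstream differential expression is less likely to reflect differences in cell-type mixture rather than expression of the target gene.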

Examples

if (FALSE) { # \dontrun{
res <- silico_ko(
  all = srt, gene = "LRRK2",
  sko_mode = "ko", cell_col = "cell_anno",
  pct_threshold = 0.1, deg_method = "default"
)
} # }