K2Taxonomer is an R package built around a “top-down” recursive partitioning framework for unsupervised learning of nested “taxonomy-like” subgroups from high-throughput -omics data. The framework was designed to accommodate different data structures, so it can be applied to both bulk and single-cell data sets. In addition to implementing the algorithm, the package includes functionality to annotate estimated subgroups using gene- and pathway-level analyses.
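To make the idea of top-down recursive partitioning concrete, here is a minimal base-R sketch. It is not the K2Taxonomer algorithm: the function name `recursive_split`, the `min_size` stopping rule, and the use of Ward hierarchical clustering for each two-way split are all assumptions chosen for illustration, and the data are simulated.

```r
# Illustrative top-down recursive partitioning (NOT the K2Taxonomer algorithm).
# Assumption: observations are the columns of a numeric matrix, and each split
# is made by Ward hierarchical clustering cut into two groups.
recursive_split <- function(mat, min_size = 4) {
  stopifnot(min_size >= 2)
  ids <- colnames(mat)
  # Stop when the current group is too small to split further
  if (length(ids) < min_size) {
    return(list(members = ids))
  }
  # Partition the current group into exactly two subgroups
  hc <- hclust(dist(t(mat)), method = "ward.D2")
  grp <- cutree(hc, k = 2)
  # Recurse into each subgroup to build the nested structure
  list(
    members = ids,
    left = recursive_split(mat[, grp == 1, drop = FALSE], min_size),
    right = recursive_split(mat[, grp == 2, drop = FALSE], min_size)
  )
}

# Hypothetical toy data: 20 features x 8 samples, with samples 5-8 shifted
set.seed(1)
toy <- matrix(rnorm(20 * 8), nrow = 20,
              dimnames = list(paste0("g", 1:20), paste0("s", 1:8)))
toy[, 5:8] <- toy[, 5:8] + 5

tree <- recursive_split(toy)
tree$left$members   # members of the first top-level subgroup
tree$right$members  # members of the second top-level subgroup
```

Each node of the returned list holds its members and, unless it is a leaf, two child nodes, which is the nested "taxonomy-like" shape the paragraph above refers to.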
The recursive partitioning approach utilized by K2Taxonomer presents advantages over conventional unsupervised approaches.
The documentation of this package describes how K2Taxonomer can be applied to either bulk or single-cell gene expression data. For analyses of single-cell gene expression data, K2Taxonomer is designed to characterize nested subgroups of previously identified cell types, such as those estimated by scRNAseq clustering analysis.
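The single-cell use case treats cell type labels, rather than individual cells, as the units to be grouped. The base-R sketch below illustrates that idea only. It does not show how K2Taxonomer handles single-cell data: the label names, the simulated expression matrix, and the choice to average cells per label before clustering are all assumptions.

```r
# Hypothetical toy data: 30 genes x 60 cells with 4 pre-assigned cell type labels
set.seed(42)
labels <- rep(c("T_cell", "NK_cell", "Monocyte", "Macrophage"), each = 15)
expr <- matrix(rnorm(30 * 60), nrow = 30,
               dimnames = list(paste0("gene", 1:30), paste0("cell", 1:60)))

# Assumed structure: the two myeloid labels share elevated expression of 10 genes
myeloid <- labels %in% c("Monocyte", "Macrophage")
expr[1:10, myeloid] <- expr[1:10, myeloid] + 4

# Collapse cells to one mean expression profile per cell type label
type_means <- sapply(split(seq_along(labels), labels),
                     function(idx) rowMeans(expr[, idx, drop = FALSE]))

# Group the labels (not the cells) and take the top-level two-way split
hc <- hclust(dist(t(type_means)), method = "average")
top_split <- cutree(hc, k = 2)
top_split
```

In this toy setup the two myeloid labels fall on one side of the top split and the two lymphoid labels on the other. Further splits within each side would give the nested subgroups.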
A preprint of the manuscript describing K2Taxonomer is publicly available here.
You may install K2Taxonomer from GitHub directly using the devtools R package, or clone the repository and install from source. Typical download time is around 1 minute.
install.packages("devtools")
devtools::install_github("montilab/K2Taxonomer")
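If the installation is scripted, it can help to skip steps that are already done. The following is a minimal sketch of that pattern. The helper name `install_k2taxonomer` is hypothetical and not part of K2Taxonomer or devtools, and the function is only defined here, not called.

```r
# Install K2Taxonomer from GitHub only when it is not already available.
# `install_k2taxonomer` is a hypothetical helper name used for illustration.
install_k2taxonomer <- function(repo = "montilab/K2Taxonomer") {
  # Nothing to do if the package can already be loaded
  if (requireNamespace("K2Taxonomer", quietly = TRUE)) {
    return(invisible(FALSE))
  }
  # Install devtools first if it is missing
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::install_github(repo)
  invisible(TRUE)
}
```

Calling `install_k2taxonomer()` then installs the package when needed and invisibly returns whether an installation was attempted.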
Here we demonstrate the basic functionality of K2Taxonomer, which is described in more detail in the vignette, Running K2Taxonomer.
An alternative workflow for running K2Taxonomer for subgrouping cell type labels using single-cell expression data is described in the vignette, Running K2Taxonomer on single-cell RNA sequencing data.
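# Load K2Taxonomer and the example ExpressionSet shipped with Biobase
library(K2Taxonomer)
library(Biobase)
data(sample.ExpressionSet)

# Initialize the K2 object
K2res <- K2preproc(sample.ExpressionSet)

# Perform recursive partitioning
K2res <- K2tax(K2res,
               stabThresh=0.5)

# Run differential gene expression analysis for each partition
K2res <- runDGEmods(K2res)

# Create made-up gene sets for illustration.
# `genes` is assumed to be a character vector of gene identifiers defined beforehand.
genesetsMadeUp <- list(
    GS1=genes[1:50],
    GS2=genes[51:100],
    GS3=genes[101:150]
)

# Run gene set enrichment analysis
K2res <- runGSEmods(K2res,
                    genesets=genesetsMadeUp,
                    qthresh=0.1)

# Compute single-sample gene set scores with GSVA
K2res <- runGSVAmods(K2res,
                     ssGSEAalg="gsva",
                     ssGSEAcores=1,
                     verbose=FALSE)

# Run differential analysis of the single-sample scores
K2res <- runDSSEmods(K2res)

# Generate the dashboard report
K2dashboard(K2res,
            analysis_name="Example",
            output_dir=".")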
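The `genesetsMadeUp` list above is built by indexing a `genes` vector that is not defined in this section. As a hedged illustration of the expected shape (a named list of character vectors), the base-R sketch below builds an equivalent list from a stand-in vector. The identifiers `gene1` to `gene150` and the names `set_size`, `set_index`, and `genesets` are placeholders, not values from the package.

```r
# Stand-in for the `genes` vector used above (hypothetical identifiers)
genes <- paste0("gene", 1:150)

# Split the identifiers into consecutive sets of 50, named GS1, GS2, GS3
set_size <- 50
set_index <- ceiling(seq_along(genes) / set_size)
genesets <- split(genes, paste0("GS", set_index))

# Inspect the result: a named list with one character vector per gene set
str(genesets)
```

In a real analysis the identifiers would need to match those used in the expression data. With ten or more sets, `split()` orders the names alphabetically (`GS10` before `GS2`), which changes only the list order, not the contents.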