Pre-filter a dataset prior running candidate_search
to avoid
testing features that are too prevalent or too sparse across samples in
the dataset
prefilter_data(FS, max_cutoff = 0.6, min_cutoff = 0.03, verbose = FALSE)
a matrix of binary features or a SummarizedExperiment class object from SummarizedExperiment package where rows represent features of interest (e.g. genes, transcripts, exons, etc...) and columns represent the samples. The assay of FS contains binary (1/0) values indicating the presence/absence of ‘omics’ features.
a numeric value between 0 and 1 describing the absolute prevalence of a feature across all samples in the FS object which the feature will be filtered out. Default is 0.6 (feature that occur in 60 percent or more of the samples will be removed)
a numeric value between 0 and 1 describing the absolute prevalence of a feature across all samples in the FS object which the feature will be filtered out. Default is 0.03 (feature that occur in 3 percent or less of the samples will be removed)
a logical value indicates whether or not to print the
diagnostic messages. Default is FALSE
.
A SummarizedExperiment object with only the filtered-in features given the filtered thresholds
# Load pre-computed feature set
data(sim_FS)
# Filter out features having < 3% and > 60% prevalence across all samples
# by (default)
sim_FS_filt1 <- prefilter_data(FS = sim_FS)
# Change the min cut-off to 1% prevalence, instead of the default of 3%
sim_FS_filt2 <- prefilter_data(FS = sim_FS, min_cutoff = 0.01)
# Change the max cut-off to 65% prevalence, instead of the default of 60%
sim_FS_filt3 <- prefilter_data(FS = sim_FS, max_cutoff = 0.65)