Perform permutation-based testing on a sample of permuted input scores, using candidate_search as the main iterative function for each run.
CaDrA(
  FS,
  input_score,
  method = c("ks_pval", "ks_score", "wilcox_pval", "wilcox_score", "revealer", "custom"),
  method_alternative = c("less", "greater", "two.sided"),
  custom_function = NULL,
  custom_parameters = NULL,
  weights = NULL,
  search_start = NULL,
  top_N = 1,
  search_method = c("both", "forward"),
  max_size = 7,
  n_perm = 1000,
  perm_alternative = c("one.sided", "two.sided"),
  obs_best_score = NULL,
  smooth = TRUE,
  plot = FALSE,
  ncores = 1,
  cache = FALSE,
  cache_path = NULL,
  verbose = FALSE
)
FS: a matrix of binary features or a SummarizedExperiment class object from the SummarizedExperiment package, where rows represent features of interest (e.g. genes, transcripts, exons) and columns represent samples. The assay of FS contains binary (1/0) values indicating the presence/absence of omics features.
input_score: a vector of continuous scores representing a phenotypic readout of interest, such as protein expression or pathway activity.
NOTE: the input_score object must have names or labels that match the column names of the FS object.
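The name-matching requirement between input_score and FS can be checked before starting a search. The following is a minimal sketch in base R on made-up toy data (the feature names, sample names, and values are assumptions for illustration, not package data). Reordering the scores to follow the columns of FS is a defensive step; whether CaDrA reorders them internally is not stated here.

```r
# Toy feature set: 2 features (rows) x 4 samples (columns).
# All names and values are hypothetical, for illustration only.
FS <- matrix(
  c(1, 0, 1, 0,
    0, 1, 1, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("feat_A", "feat_B"), paste0("sample_", 1:4))
)

# Named score vector, deliberately in a different order than the FS columns
input_score <- c(sample_3 = 0.9, sample_1 = 0.4, sample_4 = -1.2, sample_2 = 0.1)

# TRUE when both objects describe exactly the same set of samples
names_match <- setequal(names(input_score), colnames(FS))

# Reorder the scores so they follow the column order of FS
input_score <- input_score[colnames(FS)]
```

For a SummarizedExperiment input, colnames(FS) would be taken from the object itself rather than from a plain matrix.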
method: a character string specifying the scoring method used in the search. There are 6 options: "ks_pval", "ks_score", "wilcox_pval", "wilcox_score", "revealer" (conditional mutual information from REVEALER), or "custom" (a user-defined scoring method). Default is "ks_pval".
method_alternative: a character string specifying the alternative hypothesis for testing ("two.sided", "greater", or "less"). Default is "less" for left-skewed significance testing.
NOTE: This argument only applies to the "ks_pval" and "wilcox_pval" methods.
custom_function: if method is "custom", specifies a user-defined function here. Default is NULL.
NOTE: custom_function must take FS and input_score as its input arguments, and it must return a vector of row-wise scores whose labels or names match the row names of the FS object.
custom_parameters: if method is "custom", specifies a list of additional arguments (excluding FS and input_score) to be passed to custom_function. For example: custom_parameters = list(alternative = "less"). Default is NULL.
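To make the custom_function contract concrete, here is a minimal sketch of a function that takes FS and input_score and returns one named score per row of FS. The function name mean_diff_score, its na_value argument, and the toy data are hypothetical and not part of CaDrA. The sketch assumes FS is a plain binary matrix, and it does not address whether higher or lower scores are treated as better by the search.

```r
# Hypothetical scoring function: for each feature, the mean score of samples
# carrying the feature minus the mean score of samples that do not.
mean_diff_score <- function(FS, input_score, na_value = 0) {
  scores <- apply(FS, 1, function(feature) {
    with_feature <- input_score[feature == 1]
    without_feature <- input_score[feature == 0]
    # A feature present in all or in no samples cannot be compared
    if (length(with_feature) == 0 || length(without_feature) == 0) {
      return(na_value)
    }
    mean(with_feature) - mean(without_feature)
  })
  # Row-wise scores must be labelled with the row names of FS
  names(scores) <- rownames(FS)
  scores
}

# Toy data (made up for illustration)
FS <- matrix(
  c(1, 1, 0, 0,
    0, 1, 0, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("feat_A", "feat_B"), paste0("sample_", 1:4))
)
input_score <- c(sample_1 = 2, sample_2 = 4, sample_3 = 1, sample_4 = 1)

toy_scores <- mean_diff_score(FS, input_score)
```

Under these assumptions, such a function would be supplied as custom_function = mean_diff_score with method = "custom", and its extra argument would go through custom_parameters = list(na_value = 0).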
weights: if method is "ks_score" or "ks_pval", specifying a vector of weights will perform weighted KS testing. Default is NULL.
NOTE: weights must have names or labels that match the labels of input_score.
search_start: a vector of character strings (separated by commas) specifying the names of the features in the FS object to start the search with. If search_start is provided, the top_N parameter is ignored, and vice versa. Default is NULL.
top_N: an integer specifying the number of features to start the search over. By default, the search starts with the single best-scoring feature (top_N = 1).
NOTE: If top_N is provided, the search_start parameter is ignored, and vice versa. Setting top_N > 10 may result in a longer search time.
search_method: a character string specifying the algorithm used to filter out the best candidates ("forward" or "both"). Default is "both" (i.e., backward and forward).
max_size: an integer specifying the maximum size that a meta-feature can extend to in a given search. Default is 7.
n_perm: an integer specifying the number of permutations to perform. Default is 1000.
perm_alternative: the type of alternative hypothesis used to calculate the permutation-based p-value. Options: "one.sided", "two.sided". Default is "one.sided".
obs_best_score: a numeric value corresponding to the best observed score. This value is compared against the n_perm permuted best scores. Default is NULL. If set to NULL, the observed best score is computed from the given parameters.
smooth: a logical value indicating whether or not to add a smoothing factor of 1 to the calculation of the permutation-based p-value. This option is used to avoid returning a p-value of 0. Default is TRUE.
plot: a logical value indicating whether or not to plot the empirical null distribution of the permuted best scores. Default is FALSE.
ncores: an integer specifying the number of cores to use for parallelizing the permutation-based testing. Default is 1.
cache: a logical value determining whether or not to cache the permuted best scores. Caching saves time on future runs by loading the stored scores instead of re-computing the permutation-based testing every time. Default is FALSE.
cache_path: a path in which to cache the permuted best scores. Default is NULL. If NULL, the cache path is set to the system home directory (e.g. $HOME/.Rcache) for future loading.
verbose: a logical value indicating whether or not to print diagnostic messages. Default is FALSE.
a list of 4 objects: key, perm_best_scores, obs_best_score, perm_pval
-key: a list of the parameters that were used to cache the results of the permutation-based testing. This is useful because the permuted best scores can be reused to save time on future runs.
-perm_best_scores: a vector of permuted best scores obtained by performing candidate_search over n_perm iterations of permuted input scores.
-obs_best_score: a user-provided best score or an observed best score obtained by performing candidate_search on a given dataset and input parameters. This value is later compared against the permuted best scores (perm_best_scores).
-perm_pval: a permutation-based p-value obtained by calculating sum(perm_best_scores > obs_best_score) / n_perm
NOTE: If smooth = TRUE, a smoothing factor of 1 is added to both the numerator and the denominator of perm_pval,
e.g. (sum(perm_best_scores > obs_best_score) + 1) / (n_perm + 1)
This prevents a returned p-value of 0.
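The perm_pval calculation described above can be worked through by hand. The sketch below uses made-up numbers (not CaDrA output) and covers only the one-sided form written out in this section; how the two.sided alternative is handled is not described here and is left out.

```r
# Illustrative values only; these are not results from CaDrA.
perm_best_scores <- c(1.2, 3.4, 0.8, 2.9, 4.1, 1.7, 2.2, 0.5, 3.8, 2.6)
obs_best_score <- 3.5
n_perm <- length(perm_best_scores)

# Number of permuted best scores that exceed the observed best score
n_exceed <- sum(perm_best_scores > obs_best_score)

# Without smoothing (smooth = FALSE): can be exactly 0 when nothing exceeds
perm_pval_raw <- n_exceed / n_perm

# With a smoothing factor of 1 (smooth = TRUE): never exactly 0
perm_pval_smooth <- (n_exceed + 1) / (n_perm + 1)
```

Here two of the ten permuted scores (4.1 and 3.8) exceed 3.5, so the unsmoothed value is 2/10 and the smoothed value is 3/11.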
# Load pre-computed feature set
data(sim_FS)
# Load pre-computed input-score
data(sim_Scores)
# Set seed for permutation
set.seed(21)
# Define additional parameters and start the function
cadra_result <- CaDrA(
  FS = sim_FS, input_score = sim_Scores, method = "ks_pval",
  weights = NULL, method_alternative = "less", top_N = 1,
  search_start = NULL, search_method = "both", max_size = 7,
  n_perm = 10, perm_alternative = "one.sided", plot = FALSE,
  smooth = TRUE, obs_best_score = NULL,
  ncores = 1, cache = FALSE, cache_path = NULL
)