Computes per-node Graphlet Degree Vector (GDV) distances between two networks (and optionally signed GDV distances), then assesses one-sided significance against a null distribution generated by [null_ggm()] via either permutation or bootstrap resampling.

Specifically, for each node and GDV metric, the observed distance between the two input networks is compared to a null distribution of distances computed on networks inferred from repeatedly permuted (or bootstrapped) data. The p-values are one-sided empirical probabilities \(P(\text{null} \ge \text{observed})\), with an add-one correction applied so that no p-value is exactly zero.

- If `sign = FALSE`: returns a p × 2 data frame with columns `GDV_distance_dist`, `GDV_distance_pval`
- If `sign = TRUE`: returns a p × 4 data frame adding `signed_GDV_distance_dist`, `signed_GDV_distance_pval`
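The one-sided p-value with add-one correction described above can be illustrated with a minimal base-R sketch. The helper name `empirical_pval` and its arguments are hypothetical, and the formula `(b + 1) / (B + 1)` is the conventional add-one form, assumed here rather than taken from the package source.

```r
# One-sided empirical p-value with add-one correction: P(null >= observed).
# `observed` is a single observed distance; `null_values` holds the same
# statistic computed on each null replicate (hypothetical names).
empirical_pval <- function(observed, null_values) {
  n_null <- length(null_values)               # B: number of null replicates
  n_extreme <- sum(null_values >= observed)   # b: nulls at least as large
  (n_extreme + 1) / (n_null + 1)
}

null_values <- c(0.2, 0.5, 0.9, 1.4)  # made-up distances from 4 replicates
empirical_pval(1.0, null_values)      # (1 + 1) / (4 + 1) = 0.4
empirical_pval(2.0, null_values)      # (0 + 1) / (4 + 1) = 0.2
```

Under this form, the smallest attainable p-value is `1 / (B + 1)`, so the number of null replicates bounds the resolution of the test.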

Usage

diff_gdv(
  obs_networks,
  dat,
  group_col = "phenotype",
  null_networks = NULL,
  sign = FALSE,
  inference_method = c("D-S_NW_SL", "B_NW_SL"),
  shuffle_method = c("permutation", "bootstrap"),
  shuffle_iter = 100,
  balanced = FALSE,
  filter = c("pval", "fdr", "none"),
  threshold = 0.05,
  n_cores = NULL,
  seed = NULL
)

Arguments

obs_networks

A **named** list of exactly two [`igraph`][igraph::igraph] objects corresponding to the two conditions to compare. The list names (e.g., `"healthy"`, `"disease"`) must match the unique values in `dat[[group_col]]`.

dat

A data frame used to generate the null networks when `null_networks` is `NULL`. Must include a grouping column defined by `group_col`.

group_col

Character scalar; the name of the column in `dat` indicating group labels. Default is `"phenotype"`.

null_networks

Optional; a list returned directly from [null_ggm()], which has elements `null_networks` (the list of null replicates) and `tag = "null_ggm"`. When provided, the function will **not regenerate** null networks, and the argument `shuffle_iter` is ignored.

sign

Logical; whether to also compute signed GDV distances using the edge attribute `"sign"`. Defaults to `FALSE`. If `TRUE`, both unsigned and signed GDV distances are computed and tested.

inference_method

Character; the network inference method passed to [null_ggm()]. One of `"D-S_NW_SL"` or `"B_NW_SL"`. Only used when `null_networks` is `NULL`.

shuffle_method

Character; resampling method used to generate the null networks via [null_ggm()]. One of `"permutation"` or `"bootstrap"`. Ignored if `null_networks` is provided.

shuffle_iter

Integer; number of null replicates to generate when constructing the null via [null_ggm()]. Default is `100`. Ignored if `null_networks` is provided.

balanced

Logical; whether to downsample to equal group sizes before permutation. Forwarded to [null_ggm()] and used only when `shuffle_method = "permutation"`.

filter

Character; one of `"pval"`, `"fdr"`, or `"none"`, defining which edge-filtering criterion to use in [null_ggm()]. Default is `"pval"`.

threshold

Numeric; edge inclusion threshold used when `filter != "none"`. Default is `0.05`.

n_cores

Integer or `NULL`; number of CPU cores used for parallel computation (both for GDV distance and null generation). If `NULL`, defaults to one fewer than the number of available cores.

seed

Optional integer random seed for reproducibility.

Value

A `data.frame` with rows corresponding to nodes (row names taken from the network vertex names) and interleaved columns by statistic:

- `*_dist`: observed GDV distance (unsigned or signed)
- `*_pval`: one-sided p-value (P(null ≥ observed))
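As an illustration of working with the returned table, the sketch below builds a mock data frame using the documented column names for `sign = FALSE` and keeps the nodes with small p-values. The node names and all numeric values are made up, and the 0.05 cutoff is only an example.

```r
# Mock result shaped like the documented output for sign = FALSE.
# Row names stand in for vertex names; all values are invented.
res <- data.frame(
  GDV_distance_dist = c(3.2, 0.4, 5.1),
  GDV_distance_pval = c(0.03, 0.60, 0.01),
  row.names = c("geneA", "geneB", "geneC")
)

# Keep nodes whose one-sided p-value is below 0.05 ...
sig <- res[res$GDV_distance_pval < 0.05, , drop = FALSE]

# ... and list them from largest to smallest observed distance.
sig <- sig[order(sig$GDV_distance_dist, decreasing = TRUE), , drop = FALSE]
rownames(sig)
```

Using `drop = FALSE` keeps the result a data frame even when only one row or column remains.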