Generating and Analyzing Brain Networks with Community Structures

Overview

This vignette shows the full BrainNetTest workflow on synthetic populations of brain networks with community (block) structure. We simulate two groups of graphs with different intra- and inter-community connection probabilities, run the global L1-distance ANOVA test, and identify the edges and nodes that drive the observed difference.

library(BrainNetTest)

Generating a single community-structured graph

generate_community_graph() samples a symmetric binary adjacency matrix from a stochastic block model: each edge within a community is present with probability intra_prob, and each edge between communities with probability inter_prob.

set.seed(42)
G <- generate_community_graph(
  n_nodes       = 40,
  n_communities = 4,
  intra_prob    = 0.8,
  inter_prob    = 0.2)

dim(G)
#> [1] 40 40
mean(G[upper.tri(G)])
#> [1] 0.3358974
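
For intuition, the sampling scheme described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: sample_block_graph() is a hypothetical helper, and it assumes equal-sized communities assigned to contiguous blocks of nodes. The last lines work out the edge density one would expect for the parameters used above.

```r
# Hypothetical helper (not part of BrainNetTest): sample a symmetric binary
# adjacency matrix with equal-sized, contiguous communities.
sample_block_graph <- function(n_nodes, n_communities, intra_prob, inter_prob) {
  # Community label of each node, e.g. 1, 1, ..., 2, 2, ...
  membership <- sort(rep(seq_len(n_communities), length.out = n_nodes))
  # Edge probability for every node pair: intra_prob within a community
  same <- outer(membership, membership, "==")
  prob <- ifelse(same, intra_prob, inter_prob)
  # Draw the upper triangle only, then mirror it to enforce symmetry
  A <- matrix(0L, n_nodes, n_nodes)
  upper <- upper.tri(A)
  A[upper] <- rbinom(sum(upper), size = 1, prob = prob[upper])
  A + t(A)
}

set.seed(1)
A <- sample_block_graph(40, 4, intra_prob = 0.8, inter_prob = 0.2)

# Expected density for 40 nodes in 4 communities of 10
n_intra <- 4 * choose(10, 2)   # 180 within-community pairs
n_total <- choose(40, 2)       # 780 pairs overall
expected_density <- (n_intra * 0.8 + (n_total - n_intra) * 0.2) / n_total
expected_density               # about 0.338
```

The expected density of about 0.338 is close to the 0.336 observed for G above, which is what one would hope for from a single draw.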

Visualising the network

A single community-structured adjacency matrix is best inspected directly with igraph. The package’s own visualisation helper, plot_critical_edges(), is designed to summarise the result of an analysis (per-population central graphs plus the identified critical edges) and is illustrated at the end of this vignette.
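
A minimal sketch of such an inspection is given below. It uses a small toy matrix so that it runs on its own; the igraph calls are standard, but the community labels are an assumption made for the example.

```r
library(igraph)

# Toy symmetric 0/1 adjacency matrix: two triangles joined by one edge
A <- matrix(0, 6, 6)
A[1, 2] <- A[1, 3] <- A[2, 3] <- 1
A[4, 5] <- A[4, 6] <- A[5, 6] <- 1
A[3, 4] <- 1
A <- A + t(A)

# Build an undirected graph, ignoring the diagonal
g <- graph_from_adjacency_matrix(A, mode = "undirected", diag = FALSE)

# Colour vertices by an assumed community label (nodes 1-3 vs. 4-6)
membership <- rep(1:2, each = 3)
plot(g, vertex.color = membership, vertex.label = NA)
```

To inspect the matrix G generated earlier, pass G in place of A. If the four communities occupy contiguous blocks of ten nodes (an assumption about generate_community_graph(), not something stated here), rep(1:4, each = 10) would serve as the colour vector.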

Simulating populations of graphs

generate_category_graphs() produces a list of graphs that share a common community structure but vary slightly in their edge probabilities across replicates, modelling natural between-subject variability.

control <- generate_category_graphs(
  n_graphs             = 20,
  n_nodes              = 20,
  n_communities        = 2,
  base_intra_prob      = 0.8,
  base_inter_prob      = 0.2,
  intra_prob_variation = 0.05,
  inter_prob_variation = 0.05,
  seed                 = 1)

patient <- generate_category_graphs(
  n_graphs             = 20,
  n_nodes              = 20,
  n_communities        = 2,
  base_intra_prob      = 0.6,
  base_inter_prob      = 0.4,
  intra_prob_variation = 0.05,
  inter_prob_variation = 0.05,
  seed                 = 2)

populations <- list(Control = control, Patient = patient)
lengths(populations)
#> Control Patient 
#>      20      20
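
One way to picture the between-subject variability is that each replicate receives its own pair of probabilities drawn near the base values. The sketch below assumes a uniform perturbation of at most the variation argument on either side of the base value, clipped to [0, 1]. The distribution actually used by generate_category_graphs() is not stated in this vignette, and jitter_prob() is a hypothetical helper.

```r
# Hypothetical helper: n probabilities scattered uniformly around `base`
jitter_prob <- function(base, variation, n) {
  p <- runif(n, min = base - variation, max = base + variation)
  pmin(pmax(p, 0), 1)   # keep the result a valid probability
}

set.seed(1)
# Per-replicate probabilities for a Control-like population of 20 graphs
replicate_probs <- data.frame(
  intra = jitter_prob(0.8, 0.05, n = 20),
  inter = jitter_prob(0.2, 0.05, n = 20))

range(replicate_probs$intra)   # stays within [0.75, 0.85]
range(replicate_probs$inter)   # stays within [0.15, 0.25]
```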

Global test statistic

T_obs <- compute_test_statistic(populations, a = 1)
T_obs
#>   Control 
#> -85.03753
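
The statistic is built on L1 distances between graphs, as noted in the overview. As a rough illustration of the ingredients only, the sketch below computes an entrywise-mean central graph for one group and compares the average L1 distance to it within the group and across the pooled sample. All helper names are hypothetical, the entrywise mean is an assumed definition of the central graph, and the normalisation, the role of a, and the sign convention of compute_test_statistic() are not reproduced.

```r
# Hypothetical helpers, not the package's internals
l1_distance <- function(A, B) sum(abs(A - B)[upper.tri(A)])
central_graph <- function(graphs) Reduce(`+`, graphs) / length(graphs)
mean_distance <- function(graphs, centre) {
  mean(vapply(graphs, l1_distance, numeric(1), B = centre))
}

# Build a symmetric 3-node graph from its edges (1,2), (1,3), (2,3)
make_graph <- function(edges) {
  A <- matrix(0, 3, 3)
  A[upper.tri(A)] <- edges
  A + t(A)
}
group_a <- list(make_graph(c(1, 1, 0)), make_graph(c(1, 1, 1)))
group_b <- list(make_graph(c(0, 0, 1)), make_graph(c(0, 1, 1)))

centre_a <- central_graph(group_a)
within_a <- mean_distance(group_a, centre_a)
pooled_a <- mean_distance(c(group_a, group_b), centre_a)
c(within = within_a, pooled = pooled_a)   # within = 0.5, pooled = 1.25
```

Here the graphs of group A sit much closer to their own central graph than the pooled sample does, which is the kind of contrast a distance-based ANOVA statistic is designed to pick up.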

Identifying critical edges

identify_critical_links() performs the full pipeline: marginal p-values for each edge, a permutation null distribution for the global statistic T, and iterative edge removal. It returns the edges whose removal eliminates the group-level difference.

result <- identify_critical_links(
  populations,
  alpha          = 0.05,
  method         = "fisher",
  n_permutations = 500,
  seed           = 42)

nrow(result$critical_edges)
#> [1] 50
head(result$critical_edges)
#>     node1 node2      p_value
#> 185    14    20 0.0004359198
#> 117    12    16 0.0021996412
#> 50      5    11 0.0033420517
#> 56      1    12 0.0033420517
#> 65     10    12 0.0033420517
#> 153    17    18 0.0033420517
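
The first step of the pipeline, marginal edge p-values, can be sketched with fisher.test() from base R. This illustrates the idea behind method = "fisher" under the assumption that each edge is tested on a 2 x 2 table of group by edge presence. edge_fisher_pvalues() and rand_graph() are hypothetical helpers, and the permutation and iterative-removal steps are left out.

```r
# Hypothetical helper: Fisher's exact test per edge between two groups
edge_fisher_pvalues <- function(group1, group2) {
  n_nodes <- nrow(group1[[1]])
  pairs <- t(combn(n_nodes, 2))   # one row per node pair (i < j)
  p <- apply(pairs, 1, function(ij) {
    # Number of graphs in each group in which this edge is present
    x1 <- sum(vapply(group1, function(A) A[ij[1], ij[2]], numeric(1)))
    x2 <- sum(vapply(group2, function(A) A[ij[1], ij[2]], numeric(1)))
    tab <- matrix(c(x1, length(group1) - x1,
                    x2, length(group2) - x2), nrow = 2)
    fisher.test(tab)$p.value
  })
  data.frame(node1 = pairs[, 1], node2 = pairs[, 2], p_value = p)
}

# Toy data: dense vs. sparse random graphs on 5 nodes
rand_graph <- function(p, n = 5) {
  A <- matrix(0, n, n)
  A[upper.tri(A)] <- rbinom(choose(n, 2), 1, p)
  A + t(A)
}
set.seed(1)
g1 <- replicate(15, rand_graph(0.8), simplify = FALSE)
g2 <- replicate(15, rand_graph(0.2), simplify = FALSE)

pvals <- edge_fisher_pvalues(g1, g2)
head(pvals[order(pvals$p_value), ])   # smallest p-values first
```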

Critical nodes

get_critical_nodes() aggregates the critical edges at the node level, reporting the critical degree (number of critical edges incident on each node):

get_critical_nodes(result)
#>    node critical_degree
#> 1    12              10
#> 2    16               8
#> 3    14               7
#> 4    17               7
#> 5    20               7
#> 6     1               6
#> 7     5               6
#> 8    19               6
#> 9     3               5
#> 10    6               5
#> 11    9               5
#> 12    2               4
#> 13    8               4
#> 14   11               4
#> 15   13               4
#> 16   15               4
#> 17    4               3
#> 18   10               2
#> 19   18               2
#> 20    7               1
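
The aggregation itself amounts to counting how often each node appears as an endpoint. A minimal base R sketch follows, using only the six edges printed by head(result$critical_edges) above, so the counts are partial. The ordering rule (degree descending, then node ascending) is an assumption that happens to match the printed table.

```r
# First six critical edges as printed above (not the full set of 50)
edges <- data.frame(
  node1 = c(14, 12, 5, 1, 10, 17),
  node2 = c(20, 16, 11, 12, 12, 18))

# Each edge contributes one unit to both of its endpoints
counts <- table(c(edges$node1, edges$node2))
critical_nodes <- data.frame(
  node            = as.integer(names(counts)),
  critical_degree = as.integer(counts))

# Sort by degree (descending), breaking ties by node id
critical_nodes <- critical_nodes[order(-critical_nodes$critical_degree,
                                       critical_nodes$node), ]
head(critical_nodes, 3)
```

Because every edge has two endpoints, the degrees always sum to twice the number of edges. The same identity holds in the full output above: the critical degrees sum to 100, twice the 50 critical edges.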

Visualising the result

plot_critical_edges() produces a multi-panel figure that summarises the analysis: one panel per population showing the (weighted) central graph, plus a final panel that highlights the critical edges on the chosen reference central graph. Community labels can be passed via communities to colour the vertices consistently across panels.

plot_critical_edges(
  populations,
  result,
  communities = rep(seq_len(2), each = 10),
  reference   = "Control")