TL;DR
- Hybrid dynamic networks let you zoom into one region at high resolution (farms) while keeping the rest of the country as regional supernodes. This is a multiscale coarsening strategy: reduce state space without cutting away context.[5]
- We work with discrete-time temporal graphs (daily snapshots). The key question is not just what is connected, but when connections appear and disappear.[1][2]
- To compare hybrid networks (or models that generate them), we use three families of higher-order statistics: TEA/TNA (edge/node turnover), π-mass (random-walk mass on Farms vs Regions), and magnetic Laplacian spectra (direction-sensitive spectral fingerprints).[1][10][12]
Interactive hybrid-network simulator
This embedded demo generates a 10-day hybrid network on the Netherlands COROP map (40 regions).[13] You can pick a focal COROP to keep at farm resolution, tune persistence and birth probabilities, and explore: TEA/TNA (edge/node turnover), π-mass (F vs R), and a magnetic Laplacian spectrum heatmap.
Why a hybrid representation?
Livestock movement data is naturally a temporal directed network: a shipment from premise u to premise v on day t is a time-stamped directed edge. Temporal ordering matters because disease, information, and material flows must respect time.[1]
In practice, national-scale premise networks can become very large. But many questions are regional: “What is the import pressure into this region?” “Which local farms are repeatedly connected to outside flows?” In livestock epidemiology, network-based summaries are routinely used to identify vulnerability, target surveillance, and compare intervention strategies.[4]
A hybrid network is a compromise: keep the focal region detailed (farm nodes), and contract everything else into regional “supernodes”. This is a form of graph coarsening tuned for interpretation and computation. We have fewer nodes outside, but still explicit cross-region flows.[5]
Formal definition: hybrid temporal graph
Start from daily shipment records $$r = (t, u, v, w, c_u, c_v),$$ where u and v are premises, w is a shipment weight (e.g., animals moved), and c_u, c_v are COROP codes of origin/destination.[13] Let $c^*$ be the focal COROP.
Define the hybrid node set as $V' = F \cup R$:
- Farms (F): all premises with COROP = $c^*$.
- Regions (R): one supernode per non-focal COROP $c \neq c^*$.
The contraction map $\varphi$ sends each endpoint to either its farm node if in the focal COROP or its region node otherwise:
$$ \varphi(u) = \begin{cases} u, & \text{if } c_u=c^* \\ c_u, & \text{if } c_u \neq c^* \end{cases} $$
Each day t yields a directed weighted snapshot $G'_t = (V', E'_t, W_t)$ by aggregating shipments with the same hybrid endpoints: $$ W_t(a,b)=\sum_{r:\,\varphi(u)=a,\,\varphi(v)=b} w. $$ Edges can be of four blocks: F→F (within focal), F→R (exports), R→F (imports), and R→R (outside-only superflows).
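As a concrete sketch, the contraction map $\varphi$ and the daily aggregation can be written in a few lines. The record tuples, COROP codes, and focal code below are hypothetical illustrations, not the simulator's actual data format:

```python
from collections import defaultdict

FOCAL = "CR22"  # hypothetical focal COROP code c*

def contract(premise, corop, focal=FOCAL):
    """phi: keep the premise id inside the focal COROP, else map to its region code."""
    return premise if corop == focal else corop

def hybrid_snapshot(records, day, focal=FOCAL):
    """Aggregate one day's shipments (t, u, v, w, c_u, c_v) into W_t over hybrid endpoints."""
    W = defaultdict(float)
    for t, u, v, w, c_u, c_v in records:
        if t != day:
            continue
        W[(contract(u, c_u, focal), contract(v, c_v, focal))] += w
    return dict(W)

records = [
    (1, "farmA", "farmB", 10, "CR22", "CR22"),  # F->F within focal
    (1, "farmA", "x1",     5, "CR22", "CR05"),  # F->R export
    (1, "x2",    "farmB",  3, "CR31", "CR22"),  # R->F import
    (1, "x3",    "x4",     7, "CR05", "CR31"),  # R->R superflow
    (1, "x5",    "x6",     2, "CR05", "CR31"),  # aggregates with the edge above
]
W1 = hybrid_snapshot(records, day=1)
print(W1)
# {('farmA', 'farmB'): 10.0, ('farmA', 'CR05'): 5.0, ('CR31', 'farmB'): 3.0, ('CR05', 'CR31'): 9.0}
```

Note how the two distinct premise-level shipments from CR05 to CR31 collapse into a single R→R superedge with summed weight, while the focal farms keep their identities.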
This “resolution switch” is similar to multilayer and multiscale network thinking: you keep multiple interaction scales in one representation rather than picking a single resolution everywhere.[3]
Why “higher-order” statistics?
Static summaries of the time-aggregated graph (degree, density, communities) can hide crucial temporal structure: bursts, periodicity, and edge turnover can change reachability and outbreak risk even when the aggregate looks similar.[1][2] Higher-order statistics here means: summaries that depend on time ordering, persistence, or directional circulation, not just the aggregated adjacency.
Statistic family A — TEA & TNA (temporal turnover)
Let $E_t$ be the set of directed edges present on day t (after aggregation), and let $V_t$ be the set of active nodes (nonzero in- or out-degree). TEA (Temporal Edge Appearance) compares $E_t$ to $E_{t-1}$:
- persist: $E_t \cap E_{t-1}$
- birth: $E_t \setminus E_{t-1}$
- churn: $E_{t-1} \setminus E_t$
Births are further split using a global memory set $E_{\le t-1}$: a birth is new if it has never appeared before, and reactivated if it reappears after absence. TNA (Temporal Node Appearance) repeats the same decomposition on active node sets $V_t$.
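The decomposition is pure set algebra, which makes it cheap to compute on daily snapshots. A minimal sketch (toy edge sets, not the simulator's internals):

```python
def tea(E_t, E_prev, E_memory):
    """TEA decomposition of today's edge set E_t against yesterday's E_prev
    and the global memory E_memory = all edges seen on days <= t-1.
    Births split into 'new' (never seen) and 'reactivated' (seen, then absent)."""
    persist = E_t & E_prev
    births = E_t - E_prev
    churn = E_prev - E_t
    new = births - E_memory          # first-ever appearances
    reactivated = births & E_memory  # reappearances after absence
    return {"persist": persist, "new": new, "reactivated": reactivated, "churn": churn}

E1 = {("a", "b"), ("b", "c")}
E2 = {("a", "b"), ("c", "d")}
E3 = {("b", "c"), ("c", "d")}
stats = tea(E3, E2, E_memory=E1 | E2)
# ('c','d') persists, ('b','c') is reactivated (seen on day 1), ('a','b') churns
```

TNA is the identical computation applied to active node sets $V_t$ instead of edge sets.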
Interpretation (livestock trade / epidemiology). Persistent edges often reflect stable trading relationships or logistics “channels”. High new edge rates can indicate exploration of new partners or market-driven rewiring, which increases the chance of connecting previously weakly linked parts of the system—relevant for introduction and spread risk.[4] Reactivation suggests periodic trade (e.g., production cycles), which can create predictable “windows” of connectivity.
Generic interpretation. TEA/TNA are useful whenever repeated interactions matter: email/contact networks, financial transfers, supply chains, or transportation flows. The “formation vs persistence” viewpoint is standard in statistical models for dynamic networks (e.g., STERGM).[6]
Statistic family B — π-mass (random-walk mass on Farms vs Regions)
TEA/TNA focus on turnover. π-mass focuses on where flow concentrates. One simple choice is a lazy random walk on the directed graph snapshot:
$$ P_{\mathrm{lazy}} = (1-\alpha)D^{-1}A + \alpha I, $$ where $A$ is the adjacency (unweighted or weighted), $D$ is the out-degree (or out-strength) diagonal, and $\alpha \in (0,1)$ adds a chance of waiting at the current node. Laziness is useful because it smooths day-to-day oscillations and helps break periodic behavior, but it does not by itself connect parts of a graph that are unreachable from one another.[9]
A different choice is teleporting PageRank: $$ P_{\mathrm{tele}} = (1-\beta)D^{-1}A + \beta\,\mathbf{1}v^\top, $$ where $v$ is a positive teleportation distribution. This is a broader restart mechanism: instead of sometimes waiting, the walk sometimes jumps to a new node. With positive teleportation weights, every node can still be reached through the restart channel, which gives a well-defined full-graph stationary distribution even when the raw directed snapshot is reducible. In plain language: lazy means “sometimes stay put”, whereas teleporting means “sometimes restart somewhere else”.[7][14]
Once a walk has been defined, π-mass aggregates stationary probability by node type: $$\pi_F(t)=\sum_{i:\,type(i)=farm} \pi_i,\quad \pi_R(t)=\sum_{i:\,type(i)=region} \pi_i.$$ The subtlety is the state space on which $\pi$ is computed.
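Under the teleporting choice, the full-graph stationary distribution and its F/R split can be sketched by power iteration. The toy adjacency and the farm/region labeling below are made up for illustration:

```python
import numpy as np

def pagerank_pi(A, beta=0.15, v=None, iters=200):
    """Stationary distribution of P_tele = (1-beta) D^{-1} A + beta 1 v^T.
    Rows with zero out-strength are replaced by the teleport distribution."""
    n = A.shape[0]
    v = np.full(n, 1.0 / n) if v is None else v / v.sum()
    out = A.sum(axis=1)
    safe = np.where(out[:, None] > 0, out[:, None], 1.0)
    P = np.where(out[:, None] > 0, A / safe, v)   # D^{-1} A with dangling fix
    P = (1 - beta) * P + beta * v                 # each row gets the restart term
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        pi = pi @ P
    return pi / pi.sum()

# toy hybrid snapshot: nodes 0-2 are farms, 3-4 are region supernodes (hypothetical)
A = np.array([
    [0, 2, 0, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 2],
    [3, 0, 0, 0, 1],
    [0, 0, 1, 1, 0],
], dtype=float)
is_farm = np.array([True, True, True, False, False])
pi = pagerank_pi(A)
pi_F, pi_R = pi[is_farm].sum(), pi[~is_farm].sum()  # the pi-mass split
```

Because teleportation makes the chain irreducible, this version is well defined on the full node set and sidesteps the state-space subtlety discussed next.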
A largest strongly connected component (LSCC, or LIC in the simulator labels) is a set of nodes that can all reach one another. But mutual reachability does not imply closure: probability may still leak out to nodes outside that component. A largest closed strongly connected class is both mutually reachable and non-escaping. That difference matters because in a finite Markov chain, stationary mass lives on essential closed classes, not on transient leaky ones.[9]
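The distinction between "strongly connected" and "closed" can be checked directly. A minimal sketch using Kosaraju's algorithm for SCCs plus an explicit leak test (toy graph, not simulator code):

```python
from collections import defaultdict

def sccs(nodes, edges):
    """Kosaraju's algorithm: strongly connected components of a directed graph."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v); rev[v].append(u)
    seen, order = set(), []
    def dfs(u, adj, out):
        stack = [(u, iter(adj[u]))]
        seen.add(u)
        while stack:
            node, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:
                stack.pop(); out.append(node)   # postorder finish
    for u in nodes:
        if u not in seen:
            dfs(u, fwd, order)
    seen, comps = set(), []
    for u in reversed(order):                   # reverse finish order, reversed graph
        if u not in seen:
            comp = []
            dfs(u, rev, comp)
            comps.append(set(comp))
    return comps

def is_closed(S, edges):
    """A class is closed iff no edge leaves it: probability cannot escape."""
    return all(v in S for u, v in edges if u in S)

# a <-> b is an SCC, but it leaks into c: mutually reachable, yet not closed
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "c")]
comps = sccs({"a", "b", "c"}, edges)
```

Here `{a, b}` is strongly connected but leaky, so a walk started there eventually drains into `{c}`, the closed class where stationary mass actually lives.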
The current simulator uses a lazy-walk π-mass computed on the largest strongly connected component because it is fast and easy to teach. That makes it a useful exploratory diagnostic, but it can overstate mass retained in a leaky class. For formal analyses, two safer choices are to use a teleporting PageRank on the full snapshot or to restrict the calculation to a largest closed strongly connected class.[9][14]
Interpretation (hybrid network quality). Once the walk definition and state space have been fixed, $\pi_F(t)$ and $\pi_R(t)$ summarize whether long-run flow sits mainly in focal farms or in external region supernodes. If contraction is too aggressive, mass can collapse onto region supernodes ($\pi_R \approx 1$), washing out focal structure; if the focal subgraph becomes too isolated, $\pi_F \approx 1$ and the outside context is almost gone. A “reasonable” hybrid representation tends to keep a meaningful split across time, reflecting both local circulation and external pressure.
Generic interpretation. Stationary mass is a flow-centrality concept (PageRank is the canonical example).[7] It is useful for prioritizing nodes for monitoring, intervention, caching, or auditing when you believe the system behaves like repeated movement on edges. For background on random walks, PageRank, and mixing, see Page et al., Lovász, Langville–Meyer, and Levin–Peres–Wilmer.[8][9][14]
Statistic family C — magnetic Laplacian spectra (direction-sensitive fingerprints)
Many spectral tools (e.g., the normalized Laplacian) are naturally defined for undirected graphs. For directed networks, a magnetic Laplacian encodes direction as a complex phase while producing a Hermitian operator with real eigenvalues. This allows spectral signatures that respond to directed circulation and asymmetry.[10][12]
The weighted construction used in the simulator starts from a directed adjacency $A$ and defines:
- symmetric magnitude $A_s = \tfrac{1}{2}(A + A^\top)$
- imbalance ratio $R_{uv} = \dfrac{A_{uv} - A_{vu}}{A_{uv} + A_{vu}}$ when $A_{uv}+A_{vu}>0$
- phase $\Theta_{uv} = 2\pi q\,R_{uv}$ (charge parameter $q$)
- magnetic adjacency $H_{uv} = (A_s)_{uv}\,e^{i\Theta_{uv}}$
- magnetic normalized Laplacian $L_q = I - D_s^{-1/2} H D_s^{-1/2}$, where $(D_s)_{uu}=\sum_v (A_s)_{uv}$
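The construction above translates directly into a few lines of numpy. This is a sketch of the weighted reciprocal-imbalance variant described here, on a made-up 3-node adjacency:

```python
import numpy as np

def magnetic_laplacian(A, q=0.25):
    """Magnetic normalized Laplacian with the weighted reciprocal-imbalance phase."""
    As = 0.5 * (A + A.T)                              # symmetric magnitude
    tot = A + A.T
    R = np.where(tot > 0, (A - A.T) / np.where(tot > 0, tot, 1.0), 0.0)
    Theta = 2 * np.pi * q * R                          # phase; R is antisymmetric
    H = As * np.exp(1j * Theta)                        # Hermitian magnetic adjacency
    d = As.sum(axis=1)
    Dm = np.diag(1.0 / np.sqrt(np.where(d > 0, d, 1.0)))
    return np.eye(len(A)) - Dm @ H @ Dm

A = np.array([[0, 3, 0],
              [1, 0, 2],
              [2, 0, 0]], dtype=float)
L = magnetic_laplacian(A, q=0.25)
evals = np.linalg.eigvalsh(L)   # real eigenvalues, since L is Hermitian
```

Because $R$ is antisymmetric and $A_s$ symmetric, $H$ is Hermitian by construction, so `eigvalsh` applies and the spectrum is real and confined to $[0, 2]$.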
Earlier magnetic constructions typically use a sign-based phase, so any reciprocal pair with the same dominant direction receives the same phase magnitude even if one pair is nearly balanced and the other is strongly one-sided.[10][12] In the simulator we instead use the weighted reciprocal-imbalance phase $\Theta_{uv}=2\pi q\,(A_{uv}-A_{vu})/(A_{uv}+A_{vu})$. To our knowledge this is an application-driven variant rather than a standard choice in the magnetic-Laplacian literature. Its practical advantage is that it separates nearly balanced reciprocal traffic from strongly one-sided reciprocal traffic, which a sign-only phase cannot do. That extra resolution is useful when both reciprocity and flow magnitude matter.[12][17]
In the full-quality setting, the simulator computes the first K eigenvalues of $L_q$ for each day on the full node set and shows them as a heatmap across time. The fast setting uses an active-node approximation and interpolates skipped days for responsiveness. In directed graphs, changes in this spectrum can reflect shifts in “circulation modes” (cycles, one-way backbones, and directional community structure). The heatmap is therefore best read as a direction-sensitive fingerprint. For formal goodness-of-fit, however, a quantitative per-day spectral distance between observed and simulated networks—such as a distance between magnetic spectra or magnetic spectral densities—would be stronger than visual inspection alone.[11][12][15]
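One simple quantitative option of this kind is a per-day distance between sorted eigenvalue vectors, averaged over the observation window. The helper names and toy spectra below are illustrative, not part of the simulator:

```python
import numpy as np

def spectral_distance(evals_obs, evals_sim, K=5):
    """Euclidean distance between the K smallest eigenvalues of two spectra.
    Sorting first makes the comparison invariant to node ordering."""
    a = np.sort(np.asarray(evals_obs, dtype=float))[:K]
    b = np.sort(np.asarray(evals_sim, dtype=float))[:K]
    return float(np.linalg.norm(a - b))

def mean_spectral_error(obs_days, sim_days, K=5):
    """Average per-day spectral distance across a sequence of daily spectra."""
    return float(np.mean([spectral_distance(o, s, K) for o, s in zip(obs_days, sim_days)]))

obs = [[0.0, 0.3, 0.8, 1.1], [0.0, 0.4, 0.9, 1.2]]   # toy daily magnetic spectra
sim = [[0.0, 0.3, 0.8, 1.1], [0.0, 0.5, 0.9, 1.2]]
err = mean_spectral_error(obs, sim, K=4)              # 0 on day 1, 0.1 on day 2
```

More refined alternatives, such as distances between magnetic spectral densities, follow the same pattern with a different per-day discrepancy.[12][15]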
Interpretation (livestock trade / epidemiology). Directionality matters because risk is not symmetric: imports into a region carry different implications than exports. Magnetic spectra are a compact “fingerprint” of directional structure and its evolution, complementing turnover (TEA/TNA) and flow concentration (π-mass).[4]
NetSpectra: the Hybrid Simulator project
The simulator you just used is packaged as a standalone, embeddable web app (D3 + Web Workers) for teaching and exploratory analysis. I maintain it alongside the broader HerdLink ecosystem.[4]
NetSpectra
An interactive, embeddable teaching tool for hybrid temporal networks: COROP map + 10-day dynamic graph + TEA/TNA, π-mass, and magnetic spectra.
Using these summaries for goodness of fit
These statistics and plots are most informative when treated as posterior-predictive or simulation-based model checks rather than as standalone pictures. The central question is not simply whether an observed curve “looks plausible”, but whether networks generated by the fitted model reproduce the observed trajectory of the same descriptors. In network analysis, that means comparing observed statistics with the distribution of those statistics on replicated networks generated from the fitted model.[16][18]
Practically, one can compute TEA/TNA summaries, π-mass, and a magnetic spectral discrepancy for the observed snapshot on each day and for many replicated snapshots. The visual layer should then show the observed trajectory against replicated envelopes, while the quantitative layer should summarize error over time—for example by average absolute deviation, calibration bands, or a snapshot-wise spectral distance. Visual comparison remains useful for intuition, but the envelope and the error summary are what make a goodness-of-fit claim reproducible and comparable across models.[12][15][16][18]
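The envelope step is mechanically simple once replicated trajectories exist. A sketch with synthetic replicates standing in for model-generated $\pi_F$ trajectories (all data below is simulated for illustration):

```python
import numpy as np

def predictive_envelope(replicates, lo=2.5, hi=97.5):
    """Pointwise percentile envelope of a statistic's trajectory over replicates.
    replicates: array of shape (n_reps, n_days)."""
    reps = np.asarray(replicates, dtype=float)
    return np.percentile(reps, lo, axis=0), np.percentile(reps, hi, axis=0)

def envelope_coverage(observed, replicates):
    """Fraction of days on which the observed trajectory lies inside the envelope."""
    lo, hi = predictive_envelope(replicates)
    obs = np.asarray(observed, dtype=float)
    return float(np.mean((obs >= lo) & (obs <= hi)))

rng = np.random.default_rng(0)
reps = rng.normal(0.5, 0.05, size=(200, 10))  # 200 replicated 10-day trajectories
obs = np.full(10, 0.5)                         # observed trajectory at the model mean
coverage = envelope_coverage(obs, reps)        # should be 1.0 here
```

Low coverage, or a large mean deviation from the replicate median, is then a reproducible signal of misfit rather than a visual impression.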
References
Numbers in brackets link to the sources used for the conceptual framing of this post.
1. Holme P, Saramäki J. 2012. Temporal networks. Physics Reports. doi:10.1016/j.physrep.2012.03.001
2. Holme P. 2015. Modern temporal network theory: a colloquium. Eur Phys J B. doi:10.1140/epjb/e2015-60657-4
3. Kivelä M, Arenas A, Barthelemy M, Gleeson JP, Moreno Y, Porter MA. 2014. Multilayer networks. Journal of Complex Networks. doi:10.1093/comnet/cnu016
4. Chaters GL, et al. 2019. Analysing livestock network data for infectious disease control: an argument for routine data collection in emerging economies. Phil Trans R Soc B. doi:10.1098/rstb.2018.0264
5. Chen J, Saad Y, Zhang Z. 2022. Graph coarsening: from scientific computing to machine learning. SeMA Journal. doi:10.1007/s40324-021-00282-x
6. Krivitsky PN, Handcock MS. 2014. A Separable Model for Dynamic Networks. J R Stat Soc Series B. doi:10.1111/rssb.12014
7. Page L, Brin S, Motwani R, Winograd T. 1999. The PageRank Citation Ranking: Bringing Order to the Web. Stanford InfoLab Technical Report.
8. Lovász L. 1993. Random walks on graphs: a survey. In: Combinatorics, Paul Erdős is Eighty (Vol. 2).
9. Levin DA, Peres Y, Wilmer EL. 2009/2017. Markov Chains and Mixing Times (2nd ed.). American Mathematical Society.
10. Fanuel M, Alaíz CM, Fernandez A, Suykens JAK. 2018. Magnetic Eigenmaps for the visualization of directed networks. Appl Comput Harmon Anal. doi:10.1016/j.acha.2017.01.004
11. Fanuel M, Alaíz CM, Suykens JAK. 2017. Magnetic eigenmaps for community detection in directed networks. Phys Rev E. doi:10.1103/PhysRevE.95.022302
12. de Resende BMF, Costa LdF. 2020. Characterization and comparison of large directed networks through the spectra of the magnetic Laplacian. Chaos. doi:10.1063/5.0006891
13. Statistics Netherlands (CBS). COROP region (definition and background). cbs.nl
14. Langville AN, Meyer CD. 2004. Deeper Inside PageRank. Internet Mathematics. doi:10.1080/15427951.2004.10129091
15. Shore J, Lubin B. 2015. Spectral goodness of fit for network models. Social Networks. doi:10.1016/j.socnet.2015.04.004
16. Vaca-Ramírez F, Peixoto TP. 2022. Systematic assessment of the quality of fit of the stochastic block model for empirical networks. Physical Review E. doi:10.1103/PhysRevE.105.054311
17. Yin M, Zhu L. 2016. Reciprocity in directed networks. Physica A. doi:10.1016/j.physa.2015.12.008
18. Hunter DR, Goodreau SM, Handcock MS. 2008. Goodness of fit of social network models. Journal of the American Statistical Association. doi:10.1198/016214507000000446