Networks and Graphs
A network encodes who is connected to whom — which hosts contact each other, which species eat which, which neurons wire together. Turning that web of relationships into a matrix lets us bring the full machinery of linear algebra to bear on ecology and disease transmission.
Nodes, edges, and the adjacency matrix
A network (or graph) is a set of nodes (vertices) joined by edges (links). Write for the number of nodes and for the number of edges. Edges may be undirected (a symmetric relationship, like a shared handshake) or directed (an ordered relationship, like “ infects ”), and each edge may carry a weight recording contact frequency, flow, or interaction strength.
The structure is captured compactly by the adjacency matrix , an array whose entries record which pairs are connected: For an undirected network is symmetric, , whereas a directed network generally has . Writing a network this way — see matrix notation — means that graph questions become matrix operations: the number of length- paths from to is the entry of , and the eigenvalues of govern spreading dynamics, with the largest eigenvalue setting the epidemic threshold on a contact network.
Degree and degree distribution
The degree of node is its number of connections, which for an undirected network is simply a row sum of the adjacency matrix: In a directed network we distinguish the in-degree (edges pointing in) from the out-degree (edges pointing out). The degree distribution is the fraction of nodes with degree , a probability distribution over the graph; its shape distinguishes homogeneous networks from heavy-tailed ones dominated by hubs. The mean degree for an undirected network is where the factor of appears because each of the edges contributes to the degree of both of its endpoints.
Paths, components, and clustering
A path is a sequence of edges leading from one node to another, and the shortest-path length between two nodes is the fewest edges needed to connect them. The largest shortest-path length over all pairs is the diameter of the network. A connected component is a maximal set of nodes all mutually reachable by paths; when one component contains a finite fraction of all nodes as grows, it is called the giant component, and its emergence marks the point at which a pathogen can reach much of the population.
Local cohesion is measured by the clustering coefficient (transitivity), the probability that two neighbours of a node are themselves neighbours — “the friend of my friend is also my friend.” Overall sparsity is summarised by the density (connectance), the fraction of possible edges that are actually present:
Worked example: a 5-node graph
Take an undirected network on nodes with edges , so . Its adjacency matrix is Summing each row gives the degrees , which sum to as they must. The mean degree is . The density is , so half of the ten possible edges are present. Node is the most connected, and the whole graph is a single connected component (you can walk from node to node ), so its giant component contains all five nodes.
In code
We build the example graph and read off its degrees, density, and components.
R
library(igraph)
g <- graph_from_literal(1-2, 1-3, 2-3, 3-4, 4-5)
degree(g) # 1:2 2:2 3:3 4:2 5:1
mean(degree(g)) # 2
edge_density(g) # 0.5
components(g)$no # 1 (one connected component)
$transitivity(g) # global clustering coefficient
Python
import networkx as nx
g = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])
print(dict(g.degree())) # {1:2, 2:2, 3:3, 4:2, 5:1}
print(2 * g.number_of_edges() / g.number_of_nodes()) # 2.0
print(nx.density(g)) # 0.5
print(nx.number_connected_components(g)) # 1
print(nx.transitivity(g)) # global clustering
{1: 2, 2: 2, 3: 3, 4: 2, 5: 1}
2.0
0.5
1
0.5
Julia
using Graphs
g = SimpleGraph(5)
for (i, j) in [(1,2),(1,3),(2,3),(3,4),(4,5)]
add_edge!(g, i, j)
end
degree(g) # [2, 2, 3, 2, 1]
2 * ne(g) / nv(g) # 2.0
2 * ne(g) / (nv(g) * (nv(g)-1)) # 0.5 (density)
length(connected_components(g)) # 1
global_clustering_coefficient(g)
Why it matters
Representing contact patterns, food webs, or gene-regulatory wiring as a graph turns qualitative “who interacts with whom” questions into quantitative ones. The degree distribution predicts how fast an epidemic spreads and who the likely super-connectors are; the giant component tells you whether an outbreak can become widespread; connectance and clustering summarise how tightly an ecological network is knit, which bears on its stability. Because a network is just its adjacency matrix, node-importance scores (see centrality) and spreading thresholds follow directly from linear algebra, and comparing a real network against a random-graph model reveals which features are surprising rather than expected by chance.