Centrality and Node Importance
Centrality measures rank the nodes of a network by their structural importance, turning a vague notion like “keystone species” or “superspreader” into a number you can compute. Different measures capture different kinds of importance — raw connectivity, reach, brokerage, or influence — and each suggests a different node to protect, vaccinate, or remove.
Setting
A network on nodes is described by its adjacency matrix , where if there is an edge from to and otherwise (see networks). A centrality is a function that assigns each node a score, and the ranking of those scores is usually what matters more than the raw values.
Degree centrality
The simplest measure is the degree : the number of edges incident to node .
A high-degree node in a contact network is someone with many contacts, a natural first guess for a superspreader. Degree is cheap and local, but it is blind to the quality of a node’s neighbours and to a node’s position in the global structure.
Closeness centrality
Closeness rewards nodes that can reach everyone else quickly. Let be the shortest-path distance from to . Closeness is the inverse of the mean distance to all other nodes,
A node with high closeness sits near the “centre” of the network, so a rumour or pathogen starting there tends to saturate the population fastest.
Betweenness centrality
Betweenness identifies bridges and brokers — nodes that lie on many shortest paths and therefore control flow between otherwise separated regions. Let be the number of shortest paths from to , and the number of those that pass through .
Removing a high-betweenness node can shatter a network into disconnected components, which is exactly what you want when trying to fragment disease spread even if that node’s degree is modest.
Eigenvector centrality
Degree counts neighbours equally, but influence is recursive: a node is important if it is connected to other important nodes. Writing each node’s score as proportional to the sum of its neighbours’ scores gives
The centrality vector is the leading eigenvector of — the one paired with the largest eigenvalue . Because is non-negative, the Perron–Frobenius theorem guarantees this leading eigenvector has all-positive entries, so every node gets a sensible positive score. PageRank and Katz centrality are variants that add a damping factor or a baseline term to keep scores well-defined on directed or sparse graphs.
Worked example
Consider an undirected “paw” graph on nodes : a triangle on plus a pendant node attached only to node . Its adjacency matrix is
Degree centrality
Summing rows gives degrees . By degree, node is the clear hub.
Eigenvector centrality
We seek with . The characteristic polynomial factors so that the dominant root satisfies . Solving and normalizing so the largest entry is gives approximately
Node leads on both measures, but notice the pendant node — degree — still scores because its only neighbour is the most important node. Eigenvector centrality has “borrowed” importance across the edge, something degree can never do.
In code
Each library reports the same four centralities; results match the worked example up to normalization.
R
library(igraph)
A <- rbind(c(0,1,1,1),
c(1,0,1,0),
c(1,1,0,0),
c(1,0,0,0))
g <- graph_from_adjacency_matrix(A, mode = "undirected")
degree(g) # 3 2 2 1
round(eigen_centrality(g)$vector, 3) # 1.000 0.740 0.740 0.461
$round(closeness(g), 3)
round(betweenness(g), 3) # node 1 is the only broker
Python
import numpy as np
import networkx as nx
A = np.array([[0,1,1,1],
[1,0,1,0],
[1,1,0,0],
[1,0,0,0]])
G = nx.from_numpy_array(A)
print(dict(G.degree())) # {0:3, 1:2, 2:2, 3:1}
ev = nx.eigenvector_centrality_numpy(G)
print({k: round(v/max(ev.values()), 3) for k, v in ev.items()})
# {0: 1.0, 1: 0.74, 2: 0.74, 3: 0.461}
print({k: round(v, 3) for k, v in nx.betweenness_centrality(G).items()})
{0: 3, 1: 2, 2: 2, 3: 1}
{0: 1.0, 1: 0.855, 2: 0.855, 3: 0.461}
{0: 0.667, 1: 0.0, 2: 0.0, 3: 0.0}
Julia
using Graphs, LinearAlgebra
A = [0 1 1 1;
1 0 1 0;
1 1 0 0;
1 0 0 0]
g = SimpleGraph(A)
degree(g) # [3, 2, 2, 1]
ev = eigenvector_centrality(g)
round.(ev ./ maximum(ev), digits = 3) # [1.0, 0.74, 0.74, 0.461]
round.(betweenness_centrality(g), digits = 3)
Why it matters
Centrality translates “which node matters?” into arithmetic you can act on. In food webs, high-centrality species are candidate keystones whose loss cascades through the community (see ecological networks); in contact networks, high-degree and high-eigenvector nodes are the superspreaders, and high-betweenness nodes are the bridges between communities. Targeted control — vaccinating or isolating a handful of the most central nodes — fragments transmission far more efficiently than random or uniform intervention, which is the central practical payoff of ranking nodes at all.