Random Walks and Brownian Motion

A drunkard staggering left or right, a pollen grain jittering in water, and a neutral allele drifting in a finite population all trace out the same kind of path: a sequence of small, independent, unpredictable steps. The random walk is the simplest stochastic process, and when you zoom out — many tiny steps over a long time — it converges to Brownian motion, the continuous process at the heart of diffusion, genetic drift, and continuous-time trait evolution. Its signature is a rule you will meet again and again: typical displacement grows not like time but like the square root of time.

Figure: Six sample paths of standard Brownian motion spreading away from the origin, bounded on average by the $\pm\sqrt{t}$ envelope.

The simple random walk

Start at the origin and take independent steps $X_1, X_2, \dots$, each $+1$ or $-1$ with probability $\tfrac{1}{2}$. Your position after $n$ steps is the partial sum

$$S_n = \sum_{i=1}^{n} X_i.$$

Each step is a random variable with $\mathbb{E}[X_i] = 0$ and $\operatorname{Var}(X_i) = 1$. Because the steps are independent and have mean zero, the walk has no preferred direction:

$$\mathbb{E}[S_n] = 0, \qquad \operatorname{Var}(S_n) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = n.$$

The expectation being zero is the easy part; the variance adding up is the important part. Variances of independent quantities sum, so the spread of $S_n$ is $\operatorname{SD}(S_n) = \sqrt{n}$.

This is the fundamental fact. After $n$ steps the walker is typically not $n$ steps from the origin; it is typically about $\sqrt{n}$ steps away. To wander twice as far you must wait four times as long. This slow, square-root spreading is what “diffusive” means, and it separates random motion from directed motion, where displacement would grow linearly in time.
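The square-root law can be checked exactly, without simulation. The sketch below is an illustration rather than part of the original text: the helper name `second_moment` is made up here, and it relies on the fact that if $k$ of the $n$ steps are $+1$, the walker sits at $2k - n$ with probability $\binom{n}{k} / 2^n$.

```python
from math import comb, sqrt

def second_moment(n: int) -> int:
    """Exact E[S_n^2] for the simple +/-1 walk (hypothetical helper for illustration)."""
    # If k of the n steps are +1, the position is k - (n - k) = 2k - n,
    # and that outcome has probability comb(n, k) / 2**n.
    total = sum((2 * k - n) ** 2 * comb(n, k) for k in range(n + 1))
    return total // 2 ** n  # exact division: the sum equals n * 2**n

for n in (25, 100, 400):
    rms = sqrt(second_moment(n))  # root-mean-square displacement
    print(f"n={n:>3}  RMS displacement = {rms:.1f}")
```

Each fourfold increase in $n$ doubles the root-mean-square displacement, which is the "wait four times as long to wander twice as far" rule in numbers.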

From random walk to Brownian motion

The $\sqrt{n}$ scaling tells us exactly how to zoom out. Rescale the walk in time by a factor $N$ and in space by $\sqrt{N}$, and look at

$$W_N(t) = \frac{S_{\lfloor N t \rfloor}}{\sqrt{N}}.$$

As $N \to \infty$ this rescaled walk converges in distribution to a limiting process $W(t)$ called Brownian motion (or the Wiener process). The central limit theorem supplies the reason: $S_{\lfloor Nt \rfloor}$ is a sum of about $Nt$ independent steps, so once divided by $\sqrt{N}$ it is approximately normal with mean $0$ and variance $t$. Any mean-zero step distribution with finite variance gives the same limit, up to a constant scale factor: the microscopic details of a single step wash out, leaving a universal Gaussian process. This is why Brownian motion appears whenever many small independent perturbations accumulate.
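A minimal sketch of the rescaling, assuming the $\pm 1$ walk from the previous section. The function name `rescaled_walk`, the choice $N = 400$, and the replicate count are illustrative assumptions. It counts the $+1$ steps among $\lfloor Nt \rfloor$ fair coin flips and checks that the variance of $W_N(t)$ is close to $t$.

```python
import random
from math import floor, sqrt
from statistics import fmean, pvariance

def rescaled_walk(N: int, t: float, rng: random.Random) -> float:
    """One draw of W_N(t) = S_{floor(N t)} / sqrt(N) for the +/-1 walk."""
    m = floor(N * t)                     # number of steps taken by time t
    if m == 0:
        return 0.0
    # m random bits are m fair coin flips; the count of 1-bits is the number of +1 steps.
    ups = bin(rng.getrandbits(m)).count("1")
    return (2 * ups - m) / sqrt(N)       # S_m = ups - (m - ups), then rescale space

rng = random.Random(0)
N, reps = 400, 5000                      # illustrative sizes
for t in (0.25, 1.0, 4.0):
    draws = [rescaled_walk(N, t, rng) for _ in range(reps)]
    print(f"t={t:<4}  mean={fmean(draws):+.3f}  var={pvariance(draws):.3f}  (limit {t})")
```

The sample variances should sit near $t$ itself, not near the number of steps, which is what dividing by $\sqrt{N}$ is meant to achieve.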

Defining properties

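Standard Brownian motion $W(t)$ is characterized by four properties: it starts at the origin, $W(0) = 0$; its increments over non-overlapping time intervals are independent; each increment $W(t) - W(s)$ with $s < t$ is normally distributed with mean $0$ and variance $t - s$; and its sample paths are continuous functions of $t$.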

The path is therefore continuous but infinitely wiggly, a fractal that looks statistically the same at every magnification.
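As a rough numerical illustration, the sketch below builds paths on a time grid from Gaussian increments and then inspects two of the standard properties of Brownian motion: an increment $W(t) - W(s)$ has variance $t - s$, and increments over non-overlapping intervals are uncorrelated. The grid size, replicate count, and the name `brownian_path` are assumptions made for this example.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def brownian_path(T: float, steps: int, rng: random.Random) -> list:
    """Standard Brownian motion sampled at `steps` equal intervals over [0, T]."""
    dt = T / steps
    path = [0.0]                                        # W(0) = 0
    for _ in range(steps):
        path.append(path[-1] + rng.gauss(0.0, sqrt(dt)))  # add an N(0, dt) increment
    return path

rng = random.Random(1)
T, steps, reps = 2.0, 40, 4000                          # illustrative sizes
paths = [brownian_path(T, steps, rng) for _ in range(reps)]

# Grid index i is time i * T / steps, so index 10 is t = 0.5, 20 is t = 1, 40 is t = 2.
early = [p[20] - p[10] for p in paths]                  # W(1) - W(0.5)
late = [p[40] - p[20] for p in paths]                   # W(2) - W(1)
cov = fmean(a * b for a, b in zip(early, late)) - fmean(early) * fmean(late)

print(f"Var[W(1) - W(0.5)] = {pvariance(early):.3f}  (theory 0.5)")
print(f"Var[W(2) - W(1)]   = {pvariance(late):.3f}  (theory 1.0)")
print(f"Cov of the two increments = {cov:+.3f}  (theory 0)")
```

Because the paths are constructed from independent Gaussian steps, this is a consistency check on the construction, not a proof of the properties.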

Drift and scaling

Real processes rarely have unit variance and zero trend, so we add two parameters. A drift $\mu$ imposes a directional tendency and a volatility (or diffusion) scale $\sigma$ stretches the fluctuations:

$$W_D(t) = \mu\, t + \sigma\, W(t).$$

Then $\mathbb{E}[W_D(t)] = \mu t$ grows linearly while $\operatorname{Var}(W_D(t)) = \sigma^2 t$ still grows linearly in time, so $W_D(t) \sim \mathcal{N}(\mu t,\ \sigma^2 t)$. Drift dominates at long times (it scales like $t$) while diffusion dominates at short times (its typical size scales like $\sqrt{t}$), and the crossover between “the trend” and “the noise” happens near $t \approx \sigma^2 / \mu^2$, where $\mu t = \sigma \sqrt{t}$.
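To see the crossover concretely, the following sketch compares the trend $\mu t$ with the noise scale $\sigma\sqrt{t}$ before, at, and after $t = \sigma^2/\mu^2$, and draws samples of $W_D(t)$ to confirm the mean and standard deviation. The parameter values $\mu = 0.5$, $\sigma = 2$ and the helper `sample_wd` are illustrative assumptions, not values from the text.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

def sample_wd(t: float, mu: float, sigma: float, rng: random.Random) -> float:
    """Draw W_D(t) = mu*t + sigma*W(t) exactly, using W(t) ~ N(0, t)."""
    return mu * t + sigma * rng.gauss(0.0, sqrt(t))

mu, sigma = 0.5, 2.0              # illustrative values (assumed)
t_cross = sigma**2 / mu**2        # time at which mu*t equals sigma*sqrt(t)
rng = random.Random(2)

for t in (1.0, t_cross, 256.0):
    draws = [sample_wd(t, mu, sigma, rng) for _ in range(20000)]
    print(f"t={t:6.1f}  trend mu*t={mu * t:7.2f}  noise sigma*sqrt(t)={sigma * sqrt(t):6.2f}  "
          f"sample mean={fmean(draws):7.2f}  sample SD={pstdev(draws):6.2f}")
```

With these values the crossover is at $t = 16$: before it the noise scale exceeds the trend, and well after it the trend is several times larger than the noise.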

Connection to the diffusion equation

Brownian motion is the probabilistic face of the heat (diffusion) equation. Let $p(x, t)$ be the density of $W_D(t)$ started at the origin; it is the Gaussian

$$p(x, t) = \frac{1}{\sqrt{2\pi \sigma^2 t}} \exp\!\left(-\frac{(x - \mu t)^2}{2 \sigma^2 t}\right),$$

and with no drift ($\mu = 0$) this density solves

$$\frac{\partial p}{\partial t} = D\, \frac{\partial^2 p}{\partial x^2}, \qquad D = \tfrac{1}{2}\sigma^2.$$

The variance $\operatorname{Var}(W_D(t)) = \sigma^2 t = 2Dt$ growing linearly in time is the probabilistic statement of Fick’s law: a spreading Gaussian is both an ensemble of Brownian particles and a solution of the diffusion equation. The same $D$ that scales a single particle’s wandering sets the rate at which a whole population’s density flattens out.
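The claim that the driftless Gaussian density solves the diffusion equation with $D = \tfrac{1}{2}\sigma^2$ can be checked numerically. The sketch below approximates both sides with central finite differences at a few arbitrary points. The value $\sigma = 3$, the step size `h`, and the test points are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

def density(x: float, t: float, sigma: float) -> float:
    """Gaussian density of sigma*W(t): mean 0, variance sigma^2 * t (no drift)."""
    var = sigma**2 * t
    return exp(-x**2 / (2 * var)) / sqrt(2 * pi * var)

sigma = 3.0                # illustrative value (assumed)
D = 0.5 * sigma**2         # diffusion coefficient D = sigma^2 / 2
h = 1e-3                   # finite-difference step in both x and t

residuals = []
for x, t in [(0.0, 1.0), (2.0, 1.0), (-5.0, 4.0)]:
    # Central difference in time for dp/dt.
    dp_dt = (density(x, t + h, sigma) - density(x, t - h, sigma)) / (2 * h)
    # Second central difference in space for d2p/dx2.
    d2p_dx2 = (density(x + h, t, sigma) - 2 * density(x, t, sigma)
               + density(x - h, t, sigma)) / h**2
    residuals.append(dp_dt - D * d2p_dx2)
    print(f"x={x:+.1f} t={t:.1f}  dp/dt={dp_dt:+.6f}  D*d2p/dx2={D * d2p_dx2:+.6f}")
```

The two columns agree up to finite-difference error, which is what satisfying the equation looks like on a grid.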

Mean-reverting and multiplicative relatives

Two variants appear constantly in biology and finance. Geometric Brownian motion models a quantity that varies multiplicatively rather than additively. Its logarithm is Brownian motion with drift,

$$\log Y(t) = \log Y(0) + \mu t + \sigma W(t),$$

so $Y(t)$ stays positive and its relative changes, not its absolute changes, are Gaussian; it is the standard model for exponentially growing populations or asset prices under noise. The Ornstein–Uhlenbeck process adds a restoring pull toward a mean $\theta$, following

$$dY = -\alpha\,(Y - \theta)\,dt + \sigma\, dW,$$

so instead of wandering off forever the process is mean-reverting: excursions decay at rate $\alpha$ and the variance saturates at $\sigma^2 / (2\alpha)$ rather than growing without bound. Ornstein–Uhlenbeck is the natural model for a trait held near an adaptive optimum by selection, and it is the mean-reverting counterpart to Brownian motion’s unbounded wandering.
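A small sketch of both relatives, under stated assumptions. For Ornstein–Uhlenbeck it uses the standard exact Gaussian transition law (mean $\theta + (y_0 - \theta)e^{-\alpha t}$, variance $\tfrac{\sigma^2}{2\alpha}(1 - e^{-2\alpha t})$), which is not derived in the text. The function names and all parameter values are illustrative.

```python
import random
from math import exp, log, sqrt
from statistics import fmean, pvariance

def gbm_draw(y0, mu, sigma, t, rng):
    """Geometric Brownian motion at time t: log Y(t) = log y0 + mu*t + sigma*W(t)."""
    return y0 * exp(mu * t + sigma * rng.gauss(0.0, sqrt(t)))

def ou_draw(y0, theta, alpha, sigma, t, rng):
    """Ornstein-Uhlenbeck at time t, sampled from its exact Gaussian transition law."""
    mean = theta + (y0 - theta) * exp(-alpha * t)                 # pull toward theta
    var = sigma**2 / (2 * alpha) * (1.0 - exp(-2 * alpha * t))    # saturating variance
    return rng.gauss(mean, sqrt(var))

rng = random.Random(3)
reps = 20000
theta, alpha, sigma = 2.0, 0.5, 1.0      # illustrative values (assumed)

for t in (0.5, 2.0, 20.0):
    ou = [ou_draw(0.0, theta, alpha, sigma, t, rng) for _ in range(reps)]
    print(f"t={t:5.1f}  OU mean={fmean(ou):.3f}  OU var={pvariance(ou):.3f}  "
          f"(limit {sigma**2 / (2 * alpha):.3f}; plain BM variance would be {sigma**2 * t:.1f})")

gbm = [gbm_draw(10.0, 0.1, 0.3, 5.0, rng) for _ in range(reps)]
log_mean = fmean(log(y) for y in gbm)
print(f"GBM: min={min(gbm):.3f} (positive)  mean log Y(5)={log_mean:.3f}  "
      f"(theory {log(10.0) + 0.1 * 5.0:.3f})")
```

The Ornstein–Uhlenbeck variance levels off near $\sigma^2/(2\alpha)$ while the Brownian variance $\sigma^2 t$ keeps growing, and every geometric Brownian draw is positive because it is an exponential.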

A worked example

A tagged fish disperses along a river as Brownian motion with no drift and diffusion scale $\sigma = 3\ \text{km}\,\text{day}^{-1/2}$, so its position at time $t$ days is $W_D(t) \sim \mathcal{N}(0,\ \sigma^2 t)$. After $t = 16$ days the position has standard deviation

$$\operatorname{SD}(W_D(16)) = \sigma \sqrt{t} = 3 \sqrt{16} = 12\ \text{km}.$$

A central 95% interval for its displacement is therefore $\pm 1.96 \times 12 \approx \pm 23.5\ \text{km}$, so it is about 95% likely to be found within roughly 24 km of the release point despite there being no average movement at all. Note the square-root law in action: to double the typical dispersal distance to 24 km, the fish must swim for $t = 64$ days, four times as long. The chance it has drifted more than 30 km downstream is

$$P(W_D(16) > 30) = P\!\big(Z > 30/12\big) = P(Z > 2.5) \approx 0.006.$$
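The arithmetic of the fish example can be reproduced with Python's standard-library `statistics.NormalDist`. This is a sketch for checking the numbers, with variable names chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

sigma = 3.0                     # km per sqrt(day), from the example
t = 16.0                        # days since release

sd = sigma * sqrt(t)            # standard deviation of position, in km
position = NormalDist(mu=0.0, sigma=sd)

z975 = NormalDist().inv_cdf(0.975)     # two-sided 95% normal quantile, about 1.96
half_width = z975 * sd                 # half-width of the central 95% interval
p_far = 1.0 - position.cdf(30.0)       # P(more than 30 km downstream)
t_double = (2 * sd / sigma) ** 2       # time needed for the SD to reach 2 * sd

print(f"SD after {t:.0f} days: {sd:.1f} km")
print(f"95% interval: +/- {half_width:.1f} km")
print(f"P(position > 30 km) = {p_far:.4f}")
print(f"Days needed to double the SD: {t_double:.0f}")
```

The printed values match the hand calculation: 12 km, about 23.5 km, about 0.006, and 64 days.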

In code

R

set.seed(7)
reps <- 2e5
for (n in c(10, 100, 1000)) {
  steps <- matrix(sample(c(-1, 1), reps * n, replace = TRUE), nrow = reps)
  Sn <- rowSums(steps)
  cat(sprintf("n=%4d  E[S_n]=%+.3f  Var[S_n]=%7.1f  (theory %d)\n",
              n, mean(Sn), var(Sn), n))
}

Python

import numpy as np
rng = np.random.default_rng(7)

reps = 200_000
for n in (10, 100, 1000):
    steps = rng.choice([-1, 1], size=(reps, n))      # simple +/-1 random walk
    Sn = steps.sum(axis=1)
    print(f"n={n:>4}  E[S_n]={Sn.mean():+.3f}  Var[S_n]={Sn.var():7.1f}  (theory {n})")

# Diffusive scaling: S_n / sqrt(n) is approximately standard normal, like W(1).
# Here W(1) is drawn directly from N(0, 1) rather than by rescaling a walk.
W1 = rng.normal(0.0, 1.0, size=reps)                 # standard BM at time t = 1
print(f"P(-1.96 < W(1) < 1.96) = {np.mean(np.abs(W1) < 1.96):.3f}  (theory 0.950)")

Output:

n=  10  E[S_n]=+0.005  Var[S_n]=   10.0  (theory 10)
n= 100  E[S_n]=-0.012  Var[S_n]=  100.1  (theory 100)
n=1000  E[S_n]=-0.036  Var[S_n]= 1000.3  (theory 1000)
P(-1.96 < W(1) < 1.96) = 0.950  (theory 0.950)

Julia

using Random, Statistics
Random.seed!(7)

reps = 200_000
for n in (10, 100, 1000)
    Sn = [sum(rand((-1, 1), n)) for _ in 1:reps]
    println("n=$n  E[S_n]=", round(mean(Sn), digits=3),
            "  Var[S_n]=", round(var(Sn), digits=1), "  (theory $n)")
end

Why it matters

Brownian motion is the default null model for aimless movement, and departures from it are how we detect directed processes. In movement ecology, animal dispersal is often modeled as diffusion, so a home range spreads like $\sqrt{t}$ and the diffusion coefficient $D$ converts tracking data into a colonization or spatial-spread rate. In population genetics, genetic drift turns allele frequencies into a random walk whose variance accumulates over generations, and the diffusion approximation, which treats frequency as a bounded Brownian-type process, underlies the classical theory of fixation and heterozygosity loss. In macroevolution, quantitative traits are modeled as Brownian motion along the branches of a phylogeny, so that the expected squared difference between two species grows in proportion to the time since their common ancestor, and an Ornstein–Uhlenbeck version captures stabilizing selection toward an optimum. The same mathematics that describes a pollen grain jittering under molecular bombardment thus describes a lineage’s trait wandering through deep time.