The Central Limit Theorem
The central limit theorem explains why the bell curve is everywhere: add up or average many independent effects and the result is approximately normal, no matter what the individual pieces look like. It is the reason normal-based inference works so broadly.
Statement
Let $X_1, \dots, X_n$ be iid with mean $\mu$ and finite variance $\sigma^2$. As $n \to \infty$, the standardized sample mean converges in distribution to a standard normal:
$$\frac{\bar X_n - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} N(0, 1).$$
Equivalently, $\bar X_n$ is approximately $N(\mu, \sigma^2 / n)$ for large $n$. The remarkable part: this holds regardless of the shape of the parent distribution (skewed, discrete, bimodal), as long as the variance is finite.
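To make the statement concrete, here is a minimal sketch that standardizes sample means from a discrete, strongly skewed parent and checks them against standard-normal benchmarks. The Bernoulli(0.1) parent, the sample size of 400, the seed, and the helper name `standardized_means` are all illustrative assumptions, not part of the theorem.

```python
import math
import random
import statistics

def standardized_means(draw, mu, sigma, n, reps, rng):
    """Return reps values of (mean of n draws - mu) / (sigma / sqrt(n))."""
    scale = sigma / math.sqrt(n)
    return [(statistics.fmean(draw(rng) for _ in range(n)) - mu) / scale
            for _ in range(reps)]

# Hypothetical parent: Bernoulli(p) with p = 0.1 (discrete and right-skewed).
p = 0.1
mu, sigma = p, math.sqrt(p * (1 - p))

def bernoulli(rng):
    return 1.0 if rng.random() < p else 0.0

rng = random.Random(0)
z = standardized_means(bernoulli, mu, sigma, n=400, reps=2000, rng=rng)

# A standard normal has mean 0, SD 1, and about 95% of its mass in [-1.96, 1.96].
z_mean = statistics.fmean(z)
z_sd = statistics.stdev(z)
inside = sum(abs(v) <= 1.96 for v in z) / len(z)
print(f"mean={z_mean:.3f}  sd={z_sd:.3f}  P(|Z|<=1.96)={inside:.3f}")
```

Because the parent is discrete, the standardized means take only finitely many values, so the agreement with 0.95 is approximate rather than exact.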
LLN vs. CLT
The law of large numbers says $\bar X_n \to \mu$ (the mean stops moving). The CLT is the finer statement: the leftover fluctuations $\bar X_n - \mu$, magnified by $\sqrt{n}$, are Gaussian. LLN gives the location; CLT gives the shape.
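The contrast between the two results can be seen numerically. The sketch below, which assumes an Exponential(rate 1) parent and arbitrary choices of sample sizes and seed, tracks the spread of the raw error (which should shrink, as the LLN says) alongside the spread of the error magnified by the square root of the sample size (which should settle near the parent SD, as the CLT says).

```python
import math
import random
import statistics

rng = random.Random(1)
mu = sigma = 1.0   # Exponential(rate=1) has mean 1 and SD 1
reps = 1000        # number of simulated sample means per n (arbitrary)

spread = {}
for n in (10, 100, 1000):
    errors = [statistics.fmean(rng.expovariate(1.0) for _ in range(n)) - mu
              for _ in range(reps)]
    raw_sd = statistics.stdev(errors)      # LLN: shrinks toward 0 as n grows
    scaled_sd = math.sqrt(n) * raw_sd      # CLT: stays close to sigma
    spread[n] = (raw_sd, scaled_sd)
    print(f"n={n:>4}  SD(mean - mu)={raw_sd:.4f}  "
          f"SD(sqrt(n) * (mean - mu))={scaled_sd:.3f}")
```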
Worked example
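Suppose service times are exponential with rate $\lambda$, so $\mu = 1/\lambda$ and $\sigma = 1/\lambda$, a strongly right-skewed parent. Average $n$ of them. The CLT says $\bar X_n \approx N\!\left(1/\lambda,\ 1/(n\lambda^2)\right)$. So for a threshold $c > 1/\lambda$, $P(\bar X_n > c) \approx 1 - \Phi\!\left(\sqrt{n}\,(\lambda c - 1)\right)$, which falls off quickly as $n$ grows, even though a single exponential draw exceeding $c$ has probability $e^{-\lambda c}$. Averaging tames the skew.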
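A calculation of this kind can be checked against the exact answer, because the sum of $n$ iid exponentials follows an Erlang (integer-shape gamma) distribution with a closed-form tail. The sketch below uses only the standard library. The specific numbers (rate 0.5, 36 observations, threshold 2.5) are illustrative assumptions and are not taken from the example above.

```python
import math

# Hypothetical inputs, chosen only for illustration.
lam = 0.5   # rate of the exponential parent, so mean = SD = 1 / lam = 2
n = 36      # number of service times averaged
c = 2.5     # threshold for the sample mean

mu = sigma = 1.0 / lam

# CLT approximation: mean ~ N(mu, sigma^2 / n), so use the upper normal tail.
z = (c - mu) / (sigma / math.sqrt(n))
normal_tail = 0.5 * math.erfc(z / math.sqrt(2))

# Exact tail: the sum of n Exp(lam) draws is Erlang(n, lam), and
# P(sum > s) = sum_{k=0}^{n-1} exp(-lam*s) * (lam*s)^k / k!.
x = lam * n * c
term = math.exp(-x)
exact_tail = term
for k in range(1, n):
    term *= x / k
    exact_tail += term

# Tail probability for a single draw exceeding the same threshold.
single_tail = math.exp(-lam * c)

print(f"normal approx P(mean > c) = {normal_tail:.4f}")
print(f"exact         P(mean > c) = {exact_tail:.4f}")
print(f"single draw   P(X > c)    = {single_tail:.4f}")
```

The gap between the first two lines is the price of the normal approximation at this sample size; the residual skew of the averaged exponentials makes the exact upper tail somewhat heavier than the normal one.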
Simulation
Take means of samples from a skewed parent (exponential) and watch the histogram of means become bell-shaped as $n$ grows.
R
set.seed(11)
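for (n in c(1, 5, 30)) {
  # 10,000 sample means, each from n exponential draws (mean 1, SD 1)
  means <- replicate(10000, mean(rexp(n, rate = 1)))
  hist(means, breaks = 40, main = paste("n =", n), xlab = "sample mean")
  # Empirical SD of the means against the theoretical sigma / sqrt(n)
  cat("n =", n, " skewness shrinks; SD =", round(sd(means), 3),
      " theory =", round(1 / sqrt(n), 3), "\n")
}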
Python
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(11)
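for i, n in enumerate((1, 5, 30)):
    # 10,000 sample means, each from n exponential draws (mean 1, SD 1)
    means = np.array([np.random.exponential(1.0, n).mean()
                      for _ in range(10000)])
    plt.subplot(1, 3, i + 1); plt.hist(means, bins=40)
    plt.title(f"n={n}")
    # Empirical SD of the means against the theoretical sigma / sqrt(n)
    print(f"n={n:>2} SD={means.std(ddof=1):.3f} theory={1/np.sqrt(n):.3f}")
plt.tight_layout()
plt.show()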
n= 1 SD=0.986 theory=1.000
n= 5 SD=0.453 theory=0.447
n=30 SD=0.181 theory=0.183
Julia
using Random, Statistics
Random.seed!(11)
for n in (1, 5, 30)
    # 10,000 sample means, each from n Exponential(1) draws
    means = [mean(randexp(n)) for _ in 1:10000]
    println("n=$n SD=", round(std(means), digits=3),
            " theory=", round(1 / sqrt(n), digits=3))
end
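The simulations above report the SD of the sample means; the shrinking skewness can be measured as well. The following standard-library sketch estimates the skewness of the sample means and compares it with $2/\sqrt{n}$, the theoretical skewness of a mean of $n$ Exponential(1) draws. The helper `skewness` and the seed are assumptions of this sketch, and it uses Python's `random` module rather than NumPy, so its draws differ from those in the Python block above.

```python
import math
import random
import statistics

def skewness(xs):
    """Moment-based sample skewness: mean of cubed standardized values."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(11)
skew = {}
for n in (1, 5, 30):
    # 10,000 sample means, each from n Exponential(rate=1) draws
    means = [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
             for _ in range(10000)]
    skew[n] = skewness(means)
    print(f"n={n:>2}  skewness={skew[n]:.3f}  theory={2 / math.sqrt(n):.3f}")
```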
Why it matters for statistics
The CLT is why $z$- and $t$-based confidence intervals and tests apply, at least approximately, to means from almost any population with finite variance, not just normal ones. It underwrites the normal approximation for proportions and sums, and it frames the question of how large a sample is “large enough” for inference to be trustworthy: the more skewed or heavy-tailed the parent, the larger $n$ needs to be. Nearly every classical procedure leans on it.
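One way to probe "large enough" is to simulate how often a nominal 95% interval actually covers the true mean. The sketch below assumes an Exponential parent with mean 1 and a $z$-interval that plugs in the sample SD for $\sigma$; the function name `z_interval_coverage`, the sample sizes, and the seed are illustrative choices.

```python
import math
import random
import statistics

def z_interval_coverage(n, reps, rng, true_mean=1.0, z=1.96):
    """Fraction of nominal-95% z-intervals that contain the true mean."""
    hits = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        centre = statistics.fmean(sample)
        # Half-width uses the sample SD as a stand-in for sigma.
        half_width = z * statistics.stdev(sample) / math.sqrt(n)
        hits += abs(centre - true_mean) <= half_width
    return hits / reps

rng = random.Random(3)
coverage = {n: z_interval_coverage(n, reps=2000, rng=rng) for n in (5, 30, 200)}
for n, cov in coverage.items():
    print(f"n={n:>3}  empirical coverage={cov:.3f}  nominal=0.950")
```

Coverage falls short of the nominal level when $n$ is small and the parent is skewed, and it approaches 95% as $n$ grows, which is the CLT at work.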