Moment Generating Functions
The moment generating function (MGF) is a single function that encodes an entire probability distribution. As its name promises, it manufactures the moments of a random variable through differentiation, and it turns sums of independent variables into products.
Definition
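For a random variable $X$, the moment generating function is $M_X(t) = E[e^{tX}]$, defined for values of $t$ in a neighborhood of $0$ where this expectation is finite. For a discrete variable with probability mass function $p$, $M_X(t) = \sum_x e^{tx}\,p(x)$, and for a continuous one with density $f$, $M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$. Not every distribution has an MGF (for heavy-tailed ones the expectation may diverge for every $t \neq 0$), but when it exists near $0$ it is extremely useful.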
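As a minimal sketch of the discrete case, the sum $M_X(t) = \sum_x e^{tx}\,p(x)$ can be evaluated directly from a probability mass function. The helper name `mgf_discrete` and the fair-die example are illustrative assumptions, not part of the original text.

```python
import math

def mgf_discrete(pmf, t):
    """Return M(t) = sum_x exp(t*x) * p(x) for a pmf given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# Hypothetical example distribution: a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}

m0 = mgf_discrete(die, 0.0)      # E[e^0] = 1 for every distribution
m_half = mgf_discrete(die, 0.5)  # a value of the MGF away from the origin
print(m0, m_half)
```

Because a die takes finitely many values, the sum is finite for every $t$, so this MGF exists on the whole real line.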
It generates moments
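Expand the exponential as a Taylor series, $e^{tX} = \sum_{n=0}^{\infty} \frac{(tX)^n}{n!}$, and take expectations term by term: $M_X(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\,E[X^n]$. The moments are the Taylor coefficients up to the factor $1/n!$, so differentiating $n$ times and setting $t = 0$ picks them off: $M_X^{(n)}(0) = E[X^n]$. In particular $M_X'(0) = E[X]$ and $M_X''(0) = E[X^2]$, from which the variance follows as $\operatorname{Var}(X) = M_X''(0) - M_X'(0)^2$.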
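The relation between derivatives at the origin and moments can be checked numerically. The sketch below, assuming a fair die as the example distribution, approximates the first two derivatives of the MGF at $0$ with central differences and compares them with moments computed directly from the probabilities.

```python
import math

# Hypothetical example distribution: a fair six-sided die.
die = {k: 1 / 6 for k in range(1, 7)}

def mgf(t):
    """MGF of the die: sum of exp(t*x) * p(x)."""
    return sum(math.exp(t * x) * p for x, p in die.items())

h = 1e-4
first = (mgf(h) - mgf(-h)) / (2 * h)                 # approximates M'(0) = E[X]
second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2    # approximates M''(0) = E[X^2]
variance = second - first**2                         # Var(X) = M''(0) - M'(0)^2

# Moments computed directly from the pmf, for comparison.
exact_mean = sum(x * p for x, p in die.items())        # 3.5
exact_second = sum(x * x * p for x, p in die.items())  # 91/6
print(first, exact_mean)
print(second, exact_second)
print(variance, exact_second - exact_mean**2)
```

Finite differences are used here only to keep the example dependency-free; a symbolic derivative would give the moments exactly.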
Two defining properties
Uniqueness. When the MGF exists in an open interval around $0$, it uniquely determines the distribution: two variables with the same MGF have the same law. This makes the MGF a fingerprint for identifying distributions.
Sums become products. If $X$ and $Y$ are independent, then $M_{X+Y}(t) = E[e^{tX} e^{tY}] = M_X(t)\,M_Y(t)$. Adding independent variables corresponds to multiplying their MGFs. This factorization is the engine behind one standard proof of the central limit theorem: the MGF of a standardized sum converges to $e^{t^2/2}$, the MGF of the standard normal.
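The product rule can be verified on a small case. The following sketch, assuming a fair die and a fair coin as two independent variables, builds the distribution of their sum by convolution and compares its MGF with the product of the individual MGFs. The helper names are illustrative.

```python
import math
from collections import defaultdict

def mgf(pmf, t):
    """MGF of a discrete distribution given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

def convolve(pmf_x, pmf_y):
    """Distribution of X + Y for independent X and Y."""
    out = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)

die = {k: 1 / 6 for k in range(1, 7)}   # hypothetical example: fair die
coin = {0: 0.5, 1: 0.5}                 # hypothetical example: fair coin

total = convolve(die, coin)
t = 0.3
lhs = mgf(total, t)                 # MGF of the sum
rhs = mgf(die, t) * mgf(coin, t)    # product of the MGFs
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding, which is the statement $M_{X+Y}(t) = M_X(t)\,M_Y(t)$ at one value of $t$.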
MGFs of standard distributions
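| Distribution | MGF | Valid for |
|---|---|---|
| Normal $N(\mu, \sigma^2)$ | $\exp\left(\mu t + \tfrac{1}{2}\sigma^2 t^2\right)$ | all $t$ |
| Poisson with mean $\lambda$ | $\exp\left(\lambda (e^t - 1)\right)$ | all $t$ |
| Exponential with rate $\lambda$ | $\dfrac{\lambda}{\lambda - t}$ | $t < \lambda$ |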
See the normal and Poisson distributions for the parent densities.
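The standard closed forms can be sanity-checked against the definition. The sketch below assumes the usual parameterizations: $\exp(\mu t + \sigma^2 t^2/2)$ for a normal with mean $\mu$ and variance $\sigma^2$, $\exp(\lambda(e^t - 1))$ for a Poisson with mean $\lambda$, and $\lambda/(\lambda - t)$ for an exponential with rate $\lambda$ and $t < \lambda$. The parameter values and the midpoint-rule helper are illustrative choices.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

t = 0.3  # evaluation point, inside the validity region of all three MGFs below

# Normal with mean mu and standard deviation sigma.
mu, sigma = 1.0, 2.0
def normal_integrand(x):
    density = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return math.exp(t * x) * density
normal_numeric = midpoint_integral(normal_integrand, -40.0, 40.0)
normal_closed = math.exp(mu * t + 0.5 * sigma**2 * t**2)

# Poisson with mean lam: sum exp(t*k) * P(X = k), updating the pmf term by term.
lam = 4.0
poisson_numeric, term = 0.0, math.exp(-lam)   # term starts at P(X = 0)
for k in range(150):
    poisson_numeric += math.exp(t * k) * term
    term *= lam / (k + 1)                     # P(X = k + 1) from P(X = k)
poisson_closed = math.exp(lam * (math.exp(t) - 1))

# Exponential with rate `rate`; the integral converges only for t < rate.
rate = 2.0
def expo_integrand(x):
    return math.exp(t * x) * rate * math.exp(-rate * x)
expo_numeric = midpoint_integral(expo_integrand, 0.0, 60.0)
expo_closed = rate / (rate - t)

print(normal_numeric, normal_closed)
print(poisson_numeric, poisson_closed)
print(expo_numeric, expo_closed)
```

The integration limits and the cutoff of 150 terms are truncations chosen so that the neglected tails are far below the printed precision for these parameter values.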
The cumulant generating function
Taking logs gives the cumulant generating function $K_X(t) = \log M_X(t)$, whose derivatives at $t = 0$ generate the cumulants: $K_X'(0)$ is the mean and $K_X''(0)$ is the variance. Cumulants are often more convenient than moments because $K_X$ simply adds over independent variables. The convex function $K_X$ is also the object whose Legendre transform yields the rate function of large-deviations theory.
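A small numerical sketch illustrates both claims: the second derivative of the log-MGF at the origin is the variance itself (no subtraction of the squared mean is needed), and the log-MGF adds over independent variables. The fair die, fair coin, and helper names are illustrative assumptions.

```python
import math
from collections import defaultdict

def cgf(pmf, t):
    """Cumulant generating function: log of the MGF of a discrete distribution."""
    return math.log(sum(math.exp(t * x) * p for x, p in pmf.items()))

def second_cumulant(pmf, h=1e-4):
    """Central-difference approximation of the second derivative of the CGF at 0."""
    return (cgf(pmf, h) - 2 * cgf(pmf, 0.0) + cgf(pmf, -h)) / h**2

def convolve(pmf_x, pmf_y):
    """Distribution of X + Y for independent X and Y."""
    out = defaultdict(float)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] += px * py
    return dict(out)

die = {k: 1 / 6 for k in range(1, 7)}   # variance 35/12
coin = {0: 0.5, 1: 0.5}                 # variance 1/4
total = convolve(die, coin)

var_die = second_cumulant(die)
var_coin = second_cumulant(coin)
var_total = second_cumulant(total)      # close to var_die + var_coin

t = 0.3
additivity_gap = cgf(total, t) - (cgf(die, t) + cgf(coin, t))  # close to 0
print(var_die, var_coin, var_total, additivity_gap)
```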
The discrete analogue
For nonnegative integer counts, the parallel tool is the probability generating function $G_X(s) = E[s^X] = \sum_{k=0}^{\infty} P(X = k)\,s^k$. It relates to the MGF by $M_X(t) = G_X(e^t)$, and it is the workhorse for branching processes, where composing $G_X$ with itself tracks successive generations.
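The sketch below illustrates both remarks on a hypothetical offspring distribution (0, 1, or 2 children with probabilities 0.25, 0.25, 0.5). It checks the identity $M(t) = G(e^t)$ at one point, then composes $G$ with itself starting from $s = 0$: the $n$-fold composition evaluated at $0$ is the probability that the process is extinct by generation $n$.

```python
import math

# Hypothetical offspring distribution for a branching process.
offspring = {0: 0.25, 1: 0.25, 2: 0.5}

def pgf(s):
    """Probability generating function G(s) = sum_k P(X = k) * s**k."""
    return sum(p * s**k for k, p in offspring.items())

def mgf(t):
    """Moment generating function of the same distribution."""
    return sum(p * math.exp(t * k) for k, p in offspring.items())

t = 0.4
identity_gap = mgf(t) - pgf(math.exp(t))   # M(t) = G(e^t), so this is close to 0

# Iterate s -> G(s) from s = 0: after n steps, s = P(extinct by generation n).
s = 0.0
for _ in range(200):
    s = pgf(s)
extinction_probability = s
print(identity_gap, extinction_probability)
```

For this distribution the fixed-point equation $G(s) = s$ reduces to $2s^2 - 3s + 1 = 0$, with roots $1/2$ and $1$; the iteration converges to the smaller root, $1/2$.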
Worked example: Poisson mean and variance
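Let $X \sim \text{Poisson}(\lambda)$ with $M(t) = \exp\left(\lambda(e^t - 1)\right)$. Differentiate using the chain rule: $M'(t) = \lambda e^t \exp\left(\lambda(e^t - 1)\right)$. At $t = 0$, $e^t = 1$ and the exponential factor is $\exp(0) = 1$, so $M'(0) = \lambda = E[X]$. Differentiate again: $M''(t) = \left(\lambda e^t + \lambda^2 e^{2t}\right) \exp\left(\lambda(e^t - 1)\right)$, so $M''(0) = \lambda + \lambda^2 = E[X^2]$. Therefore $\operatorname{Var}(X) = M''(0) - M'(0)^2 = \lambda + \lambda^2 - \lambda^2 = \lambda$, recovering the familiar fact that a Poisson variable has equal mean and variance.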
In code
R
# Check Poisson moments from the MGF against a large sample
lambda <- 4
set.seed(1)
x <- rpois(1e6, lambda)
mean(x) # ~ 4.00 -> M'(0) = lambda
var(x) # ~ 4.00 -> M''(0) - M'(0)^2 = lambda
# Numerical derivatives of M(t) = exp(lambda*(exp(t)-1)) at t = 0
M <- function(t) exp(lambda * (exp(t) - 1))
h <- 1e-4
(M(h) - M(-h)) / (2 * h) # ~ 4 (mean)
(M(h) - 2 * M(0) + M(-h)) / h^2 # ~ 20 (E[X^2] = lambda + lambda^2)
Python
import sympy as sp
t = sp.symbols('t', real=True) # MGF argument; evaluated at t = 0 below
lam = sp.symbols('lambda', positive=True) # Poisson rate
M = sp.exp(lam * (sp.exp(t) - 1)) # Poisson MGF
mean = sp.diff(M, t).subs(t, 0) # lambda
EX2 = sp.diff(M, t, 2).subs(t, 0) # lambda + lambda**2
var = sp.simplify(EX2 - mean**2) # lambda
print(mean, sp.simplify(EX2), var)
# Output: lambda lambda*(lambda + 1) lambda
Julia
using Symbolics
@variables t λ
M = exp(λ * (exp(t) - 1)) # Poisson MGF
D = Differential(t)
mean = substitute(expand_derivatives(D(M)), Dict(t => 0)) # λ
EX2 = substitute(expand_derivatives(D(D(M))), Dict(t => 0)) # λ^2 + λ
println(mean, " ", simplify(EX2 - mean^2)) # λ λ
Why it matters
The MGF is a Swiss-army knife: it extracts means, variances, and higher moments by differentiation, identifies distributions by uniqueness, and collapses convolutions of independent variables into simple products. Those properties make it the cleanest route to results like the central limit theorem and the additivity of normal and Poisson variables. Through its logarithm, the cumulant generating function, it also opens the door to large-deviations theory and the Legendre-transform duality at the heart of statistical mechanics.