Latin Hypercube Sampling
Latin Hypercube Sampling (LHS) is a space-filling design for sweeping a multi-dimensional parameter space efficiently. It is the workhorse of computer experiments and of uncertainty/sensitivity analysis for models — for example, sampling an SIR model’s transmission rate and recovery rate so that a few dozen runs cover the plausible space far better than plain Monte Carlo.
The idea: stratify, then permute
Suppose we want sample points in a -dimensional unit cube .
- Stratify. Cut each dimension’s range into equal-probability bins (intervals of width ).
- Permute. For each dimension, place exactly one sample in each bin, in a random order. Independently choosing a random permutation per dimension pairs the bins across dimensions.
The result is that every one of the bins of every input is used exactly once. This generalizes a Latin square (an grid with one mark per row and per column) to a “Latin hypercube” in dimensions.
A point is then drawn as where is the permutation for dimension and jitters the point within its bin. Setting centers the points instead.
Why it beats plain Monte Carlo
With independent uniform draws, the marginal coverage of each input is random — you can get clumps and gaps. LHS forces each input’s marginal to be perfectly stratified into strata, which reduces the variance of estimated means of functions that are close to additive in the inputs. At the same , LHS typically gives smoother, lower-variance estimates.
Maximin LHS
The permutations only control marginals, so points can still lie close together in the full cube. Maximin LHS searches over permutations to maximize the minimum pairwise distance , spreading points out for better joint coverage.
Arbitrary marginals via inverse CDFs
LHS lives on .
To give input any target distribution with CDF , apply the inverse-CDF (quantile) transform .
Because is monotonic, it maps the equal-probability bins of onto equal-probability strata of the target, so the Latin property is preserved.
For example qnorm gives a normal margin; scaling to gives a uniform margin.
Worked example: , two dimensions
Take inputs and samples. Each axis is split into 5 equal bins: .
Choose the identity permutation for input 1 and the permutation for input 2. Centering points at bin midpoints () gives the five design points:
| Sample | bin | bin | ||
|---|---|---|---|---|
| 1 | 1 | 3 | 0.1 | 0.5 |
| 2 | 2 | 5 | 0.3 | 0.9 |
| 3 | 3 | 1 | 0.5 | 0.1 |
| 4 | 4 | 4 | 0.7 | 0.7 |
| 5 | 5 | 2 | 0.9 | 0.3 |
Reading down each column, the five bins each appear exactly once — one point per row and per column of the grid, just like a Latin square. Plain random sampling could easily have left, say, bin 4 of input 2 empty; LHS cannot.
In code
R
# install.packages("lhs")
library(lhs)
set.seed(1)
n <- 20; k <- 2
U <- randomLHS(n, k) # plain LHS on the unit square
Um <- maximinLHS(n, k) # extra-spread variant
# Map columns to SIR parameter marginals via inverse CDFs:
beta <- qunif(U[, 1], min = 0.10, max = 0.60) # transmission rate
gamma <- qnorm(U[, 2], mean = 0.20, sd = 0.03) # recovery rate
R0 <- beta / gamma # basic reproduction number
head(round(cbind(beta, gamma, R0), 3))
# Each column's 20 values are perfectly stratified into 20 quantile bins.
Python
import numpy as np
from scipy.stats import norm
from scipy.stats.qmc import LatinHypercube, scale
np.random.seed(0)
n, d = 20, 2
sampler = LatinHypercube(d=d, seed=0)
U = sampler.random(n) # LHS on [0,1]^2
# Uniform margins via qmc.scale; normal margin via inverse CDF (ppf):
XU = scale(U, l_bounds=[0.10, 0.0], u_bounds=[0.60, 1.0])
beta = XU[:, 0] # Uniform(0.10, 0.60)
gamma = norm.ppf(U[:, 1], loc=0.20, scale=0.03)
R0 = beta / gamma
print(np.round(np.c_[beta, gamma, R0][:5], 3))
[[0.584 0.236 2.472]
[0.124 0.192 0.645]
[0.255 0.121 2.099]
[0.31 0.181 1.712]
[0.286 0.25 1.144]]
Julia
using QuasiMonteCarlo, Distributions, Random
Random.seed!(0)
n = 20
lb = [0.10, 0.0]; ub = [0.60, 1.0]
S = QuasiMonteCarlo.sample(n, lb, ub, LatinHypercubeSample()) # 2 x n matrix
beta = S[1, :] # Uniform(0.10, 0.60)
u2 = (S[2, :] .- lb[2]) ./ (ub[2] - lb[2]) # back to [0,1]
gamma = quantile.(Normal(0.20, 0.03), u2) # inverse-CDF transform
R0 = beta ./ gamma
println(round.(hcat(beta, gamma, R0)[1:5, :], digits = 3))
Why it matters for statistics
Many statistical and epidemiological questions boil down to “how does model output vary as uncertain inputs vary?” Estimating output means, quantiles, and sensitivity indices requires a design of input points. LHS gives low-variance, reproducible coverage at a fixed budget , which is decisive when each model run is expensive (large simulations, ODE solves, or Bayesian posterior evaluations). It is the standard front end for variance-based sensitivity analysis and for building emulators of costly models.