Global Sensitivity Analysis

Global sensitivity analysis (GSA) attributes the variation in a model’s output to its uncertain inputs. It answers “which knobs actually matter?” — essential for prioritizing data collection, simplifying models, and communicating which epidemiological parameters drive a forecast.

Local vs global

Local sensitivity looks at partial derivatives at a single nominal point, $S^{\text{loc}}_i = \frac{\partial y}{\partial x_i}\Big|_{x = x_0},$ a one-at-a-time (OAT) view. It is cheap but only describes the neighborhood of $x_0$ , ignores interactions, and depends on the (arbitrary) units and base point. See partial derivatives.

Global sensitivity varies all inputs across their full uncertainty ranges simultaneously, treating the inputs as random variables $X_1,\dots,X_k$ and studying how the output $Y = f(X)$ responds over the whole space. The input designs are typically generated by Latin Hypercube Sampling.

Morris elementary-effects screening

The Morris method is a cheap global screening method. It computes elementary effects along one-at-a-time steps but from many randomly scattered starting points: $EE_i = \frac{f(x_1,\dots,x_i+\Delta,\dots,x_k) - f(x)}{\Delta}.$ Over many trajectories, the mean of $|EE_i|$ (denoted $\mu_i^*$ ) ranks overall importance, while the standard deviation $\sigma_i$ flags nonlinearity and interactions. It uses few runs, so it is used to discard unimportant inputs before a more expensive analysis.

Variance-based Sobol indices

Sobol indices decompose the output variance by inputs. Using the law of total variance (see measures of variability): $\operatorname{Var}(Y) = \operatorname{Var}\!\big(\mathbb{E}[Y \mid X_i]\big) + \mathbb{E}\!\big[\operatorname{Var}(Y \mid X_i)\big].$

The first-order index measures the variance explained by $X_i$ alone: $S_i = \frac{\operatorname{Var}\!\big(\mathbb{E}[Y \mid X_i]\big)}{\operatorname{Var}(Y)}.$

The total-effect index measures all variance involving $X_i$ , including every interaction it participates in: $S_{Ti} = \frac{\mathbb{E}\!\big[\operatorname{Var}(Y \mid X_{\sim i})\big]}{\operatorname{Var}(Y)} = 1 - \frac{\operatorname{Var}\!\big(\mathbb{E}[Y \mid X_{\sim i}]\big)}{\operatorname{Var}(Y)},$ where $X_{\sim i}$ denotes all inputs except $X_i$ .

Interpretation:

$\sum_i S_i \le 1$ , with equality only when the model is purely additive (no interactions).
Always $S_{Ti} \ge S_i$ , and $S_{Ti} - S_i$ quantifies $X_i$ ’s involvement in interactions.
$S_{Ti} \approx 0$ means $X_i$ can be safely fixed.

Worked example

Let $Y = X_1 + X_2 + X_1 X_2$ with $X_1, X_2$ independent, each $\text{Uniform}(0,1)$ , so $\mu = \tfrac12$ and $\operatorname{Var}(X_i)=\tfrac{1}{12}$ .

Condition on $X_1 = x_1$ and take the expectation over $X_2$ (mean $\tfrac12$ ): $\mathbb{E}[Y \mid X_1 = x_1] = x_1 + \tfrac12 + \tfrac12 x_1 = \tfrac12 + \tfrac32 x_1 .$ Its variance is $\big(\tfrac32\big)^2 \operatorname{Var}(X_1) = \tfrac{9}{4}\cdot\tfrac{1}{12} = \tfrac{3}{16}$ .

For the total variance, add the residual term $\mathbb{E}[\operatorname{Var}(Y\mid X_1)]$ . Given $X_1=x_1$ , $Y = x_1 + (1+x_1)X_2$ , so $\operatorname{Var}(Y\mid X_1=x_1) = (1+x_1)^2/12$ and $\mathbb{E}[\operatorname{Var}(Y\mid X_1)] = \tfrac{1}{12}\mathbb{E}[(1+X_1)^2] = \tfrac{1}{12}\cdot\tfrac{7}{3} = \tfrac{7}{36}$ . Hence $\operatorname{Var}(Y) = \frac{3}{16} + \frac{7}{36} = \frac{55}{144}\approx 0.382, \qquad S_1 = S_2 = \frac{3/16}{55/144} = \frac{27}{55} \approx 0.491,$ so $S_1 + S_2 = \tfrac{54}{55} \approx 0.982 < 1$ . The remaining $\approx 1.8\%$ ( $=1/55$ ) is the pure interaction term $X_1 X_2$ , so each input’s total effect is $S_{T1} = S_{T2} = 1 - S_2 = 1 - \tfrac{27}{55} = \tfrac{28}{55}\approx 0.509 > S_i$ . Both inputs matter equally, and a small but nonzero share is interaction.

In code

R

# install.packages(c("sensitivity", "boot"))
library(sensitivity)
set.seed(1)

f <- function(X) X[, 1] + X[, 2] + X[, 1] * X[, 2]
n <- 10000
X1 <- data.frame(x1 = runif(n), x2 = runif(n))
X2 <- data.frame(x1 = runif(n), x2 = runif(n))

s <- sobolSalt(model = f, X1, X2, scheme = "A", nboot = 100)
print(s)     # First-order S_i ~ 0.49 each; total S_Ti ~ 0.51 each

# Morris screening:
m <- morris(model = f, factors = c("x1", "x2"), r = 50,
            design = list(type = "oat", levels = 6, grid.jump = 3))
print(m)     # mu.star ranks importance

Python

import numpy as np
from SALib.sample import sobol as sobol_sample
from SALib.analyze import sobol
np.random.seed(0)

problem = {"num_vars": 2, "names": ["x1", "x2"],
           "bounds": [[0, 1], [0, 1]]}

X = sobol_sample.sample(problem, 1024)          # Saltelli design
Y = X[:, 0] + X[:, 1] + X[:, 0] * X[:, 1]
Si = sobol.analyze(problem, Y)
print(Si["S1"])    # ~ [0.49, 0.49]
print(Si["ST"])    # ~ [0.51, 0.51]

Julia

using GlobalSensitivity, Statistics, Random
Random.seed!(0)

f(x) = x[1] + x[2] + x[1] * x[2]
bounds = [[0.0, 1.0], [0.0, 1.0]]

res = gsa(f, Sobol(order = [0, 1]), bounds; samples = 10_000)
println(res.S1)   # first-order  ~ [0.49, 0.49]
println(res.ST)   # total-effect ~ [0.51, 0.51]

Why it matters for statistics

GSA turns “uncertainty in the inputs” into a ranked, quantitative statement about “uncertainty in the output.” Sobol indices give a variance decomposition that is model-free (no linearity assumption), captures interactions, and is directly interpretable as a fraction of output variance. In epidemiology this tells you whether a forecast’s spread is driven by $\beta$ , by reporting rates, or by their interaction — guiding which parameters to measure more precisely.