Generalized Linear Models
Generalized linear models (GLMs) are a single framework that unifies linear, logistic, and Poisson regression. By separating the outcome’s distribution from the way its mean depends on predictors, GLMs let epidemiologists model continuous measurements, binary events, and event counts with the same fitting machinery.
Three components
Every GLM is built from three pieces:
- Random component: the outcome $Y_i$ follows an exponential-family distribution (normal, binomial, Poisson, gamma, …) with mean $\mu_i = E[Y_i]$.
- Linear predictor: a linear combination of covariates, $\eta_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} = x_i^\top \beta$.
- Link function: an invertible function $g$ connecting the mean to the linear predictor, $g(\mu_i) = \eta_i$.
The link is what frees the model from the constraints of ordinary linear regression: it maps a bounded mean (a probability in $(0, 1)$, a positive rate) onto the unbounded scale where a linear predictor can roam freely.
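As a rough illustration of how a link and its inverse behave, the sketch below implements the identity, logit, and log links with the standard library only. The `LINKS` dictionary and the grid of linear-predictor values are illustrative choices, not part of any particular GLM package.

```python
import math

# Each entry is (g, g_inverse): g maps a mean onto the unbounded
# linear-predictor scale, g_inverse maps a linear predictor back to a mean.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
}

# Arbitrary linear-predictor values, chosen only for illustration.
etas = [-5.0, -1.0, 0.0, 1.0, 5.0]

for name, (g, g_inv) in LINKS.items():
    means = [g_inv(eta) for eta in etas]
    print(name, [round(mu, 4) for mu in means])
```

However extreme the linear predictor, the inverse logit stays strictly between 0 and 1 and the inverse log stays positive, which is the property the link is chosen for.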
Three familiar special cases
Choosing the distribution and link recovers the standard models:
| Distribution | Link | Model |
|---|---|---|
| Normal | identity, $g(\mu) = \mu$ | linear regression |
| Binomial | logit, $g(\mu) = \log\frac{\mu}{1 - \mu}$ | logistic regression |
| Poisson | log, $g(\mu) = \log \mu$ | Poisson regression |
So the identity link with a normal outcome is exactly the linear model $\mu_i = x_i^\top \beta$, while the logit link with a binomial outcome is logistic regression.
Poisson regression for counts and rates
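For count outcomes we use the Poisson distribution with a log link:

$$\log \mu_i = x_i^\top \beta.$$

The log link guarantees a positive mean and makes covariate effects multiplicative: $\exp(\beta_j)$ is the rate ratio for a one-unit increase in the $j$-th covariate, holding the others fixed. To model a rate rather than a raw count, we add an offset $\log t_i$ with a fixed coefficient of $1$:

$$\log \mu_i = \log t_i + x_i^\top \beta,$$

so that $\log(\mu_i / t_i) = x_i^\top \beta$. The exposure $t_i$ is typically person-time or population size, turning the model into one for cases per unit of person-time or population.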
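To make the role of the offset concrete, the following sketch evaluates the Poisson mean under the model log(mean) = log(exposure) + b0 + b1 * x. The function name `expected_count` and the two coefficient values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def expected_count(exposure, x, beta0, beta1):
    """Poisson mean under log(mu) = log(exposure) + beta0 + beta1 * x."""
    return math.exp(math.log(exposure) + beta0 + beta1 * x)

# Hypothetical coefficients, for illustration only.
beta0 = math.log(0.002)  # baseline rate: 2 events per 1,000 units of exposure
beta1 = math.log(1.5)    # rate ratio of 1.5 per one-unit increase in x

mu_base = expected_count(1000, 0, beta0, beta1)
mu_double = expected_count(2000, 0, beta0, beta1)   # same rate, twice the exposure
mu_exposed = expected_count(1000, 1, beta0, beta1)  # same exposure, x raised by one

print(mu_base, mu_double, mu_exposed)
print(mu_exposed / mu_base)  # ratio of rates at equal exposure
```

Doubling the exposure doubles the expected count without touching the coefficients, while a one-unit change in the covariate multiplies the rate by the exponentiated coefficient.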
Fitting: deviance and IRLS
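GLMs are usually fit by maximum likelihood, most often with iteratively reweighted least squares (IRLS): each iteration solves a weighted linear regression of a linearized "working" response on the covariates, then updates the weights and working response from the new fitted means, until the coefficients converge. Goodness of fit is measured by the deviance, $D = 2\,[\ell_{\text{saturated}} - \ell_{\text{fitted}}]$, twice the gap between the log-likelihood of a saturated model (one parameter per observation) and that of the fitted model. For a normal outcome it is the residual sum of squares up to a scale factor, and differences in deviance drive likelihood-ratio tests between nested models.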
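The IRLS loop and the deviance can be sketched in a few lines for the simplest case, a Poisson model with a log link, an offset, an intercept, and one covariate. This is a minimal illustration rather than how any particular package implements it: the function names are hypothetical, and restricting to two coefficients lets the weighted least-squares step be solved in closed form. The data are the two-district counts and populations used in the code listings of this document.

```python
import math

def irls_poisson(x, y, exposure, tol=1e-10, max_iter=50):
    """Fit log(mu_i) = log(exposure_i) + b0 + b1 * x_i by IRLS."""
    offset = [math.log(t) for t in exposure]
    # Start from a common rate for all observations (b1 = 0).
    b0, b1 = math.log(sum(y) / sum(exposure)), 0.0
    for _ in range(max_iter):
        eta = [o + b0 + b1 * xi for o, xi in zip(offset, x)]
        mu = [math.exp(e) for e in eta]
        # Log link: weights equal the fitted means; the working response
        # (offset removed) is (eta - offset) + (y - mu) / mu.
        w = mu
        z = [(e - o) + (yi - m) / m for e, o, yi, m in zip(eta, offset, y, mu)]
        # Weighted normal equations for regressing z on (1, x).
        s_w = sum(w)
        s_wx = sum(wi * xi for wi, xi in zip(w, x))
        s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        s_wz = sum(wi * zi for wi, zi in zip(w, z))
        s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = s_w * s_wxx - s_wx ** 2
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = max(abs(new_b0 - b0), abs(new_b1 - b1)) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

def poisson_deviance(y, mu):
    """Twice the log-likelihood gap between the saturated and fitted models."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total

x = [0, 1]                 # 0 = District A, 1 = District B
y = [30, 60]               # case counts
exposure = [10000, 40000]  # populations

b0, b1 = irls_poisson(x, y, exposure)
mu_full = [t * math.exp(b0 + b1 * xi) for t, xi in zip(exposure, x)]
common_rate = sum(y) / sum(exposure)
mu_null = [t * common_rate for t in exposure]  # nested model: one shared rate

dev_full = poisson_deviance(y, mu_full)
dev_null = poisson_deviance(y, mu_null)
print(math.exp(b1), dev_full, dev_null)
```

With two observations and two coefficients the district model is itself saturated, so its deviance is essentially zero. The deviance of the common-rate model is then the likelihood-ratio statistic for the district effect, on one degree of freedom.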
Overdispersion
The Poisson model assumes the variance equals the mean. Real count data are often overdispersed: their variance exceeds the mean, and if this is ignored the standard errors come out too small and precision is overstated. Two common remedies are quasi-Poisson, which multiplies the variance by an estimated dispersion parameter $\phi$ so that $\operatorname{Var}(Y_i) = \phi\,\mu_i$, and the negative binomial, which adds an explicit extra-variance parameter by replacing the Poisson with a different count distribution (an exponential-family member only when that parameter is held fixed).
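One common way to estimate the dispersion parameter is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below does this for an intercept-only Poisson model, where the fitted mean is simply the sample mean. The counts are made up for illustration and the variable names are hypothetical.

```python
import math

# Made-up counts, for illustration only.
counts = [2, 5, 1, 12, 0, 9, 3, 16]

n = len(counts)
mu_hat = sum(counts) / n  # intercept-only Poisson fit: fitted mean = sample mean

# Pearson chi-square statistic over residual degrees of freedom (n - 1 here).
pearson = sum((y - mu_hat) ** 2 / mu_hat for y in counts)
phi_hat = pearson / (n - 1)

# Standard error of the log mean under Poisson, then after quasi-Poisson scaling.
se_poisson = math.sqrt(1.0 / sum(counts))
se_quasi = se_poisson * math.sqrt(phi_hat)

print(f"dispersion estimate: {phi_hat:.2f}")
print(f"SE (Poisson): {se_poisson:.3f}, SE (quasi-Poisson): {se_quasi:.3f}")
```

A dispersion estimate well above 1 signals overdispersion. Quasi-Poisson leaves the coefficient estimates unchanged and widens every standard error by the square root of that estimate.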
Worked example: case counts with a population offset
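Two districts report new cases over a year. District A has $30$ cases in a population of $10{,}000$; District B has $60$ cases in a population of $40{,}000$. Fit $\log \mu_i = \log(\text{pop}_i) + \beta_0 + \beta_1 x_i$, where $x_i$ indicates District B. The rates are $30/10{,}000 = 0.003$ and $60/40{,}000 = 0.0015$ per person-year. The rate ratio for B versus A is $0.0015 / 0.003 = 0.5$, so $\exp(\hat\beta_1) = 0.5$ and $\hat\beta_1 = \log 0.5 \approx -0.693$: District B has half the incidence rate. Without the offset, District B's larger raw count would misleadingly suggest a higher burden.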
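Because this model has one parameter per district, the estimates can be checked by hand before running any GLM routine. The sketch below uses the counts (30 and 60) and populations (10,000 and 40,000) from the code listings in this section. The Wald interval is an addition not discussed in the text: it assumes the usual large-sample standard error for a log rate ratio, the square root of the summed reciprocal counts.

```python
import math

cases_a, pop_a = 30, 10000
cases_b, pop_b = 60, 40000

rate_a = cases_a / pop_a
rate_b = cases_b / pop_b
rate_ratio = rate_b / rate_a      # equals exp(beta1) in the offset model

beta0 = math.log(rate_a)          # log baseline rate (District A)
beta1 = math.log(rate_ratio)      # log rate ratio, B versus A

# Assumed large-sample standard error of the log rate ratio.
se_beta1 = math.sqrt(1.0 / cases_a + 1.0 / cases_b)
lower = math.exp(beta1 - 1.96 * se_beta1)
upper = math.exp(beta1 + 1.96 * se_beta1)

print(rate_a, rate_b, rate_ratio)
print(beta0, beta1, se_beta1)
print(lower, upper)
```

The point estimates here should agree with the coefficients a fitted Poisson GLM reports for these data, since a model with one parameter per observation reproduces the observed rates exactly.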
In code
R
district <- c("A", "B")
cases <- c(30, 60)
pop <- c(10000, 40000)
fit <- glm(cases ~ district, family = poisson,
           offset = log(pop))
summary(fit)
exp(coef(fit)) # districtB rate ratio ~ 0.5
Python
import numpy as np
import statsmodels.api as sm
cases = np.array([30, 60])
pop = np.array([10000, 40000])
X = sm.add_constant([0, 1])  # columns: intercept, B indicator (0 = A, 1 = B)
fit = sm.GLM(cases, X, family=sm.families.Poisson(),
             offset=np.log(pop)).fit()
print(np.exp(fit.params))  # approximately [0.003, 0.5]: District A rate, then B-vs-A rate ratio
Julia
using GLM, DataFrames
df = DataFrame(cases = [30, 60],
               district = ["A", "B"],
               logpop = log.([10000, 40000]))
fit = glm(@formula(cases ~ district), df, Poisson(), LogLink();
          offset = df.logpop)
exp.(coef(fit)) # districtB rate ratio ~ 0.5
Why it matters
GLMs give analysts one coherent language for regression across continuous, binary, and count outcomes, so the same ideas of coefficients, standard errors, and deviance carry over unchanged. This unification is why incidence-rate models, dose-response curves, and risk models can all be fit, compared, and interpreted with a single well-understood toolkit.