Hardy–Weinberg Equilibrium

Hardy–Weinberg equilibrium (HWE) is the null model of population genetics: it says how genotype frequencies relate to allele frequencies when nothing interesting is happening. Deviations from it are how we detect inbreeding, natural selection, hidden population structure, and — very practically — genotyping errors in a sequencing pipeline.

The equilibrium

Consider a single biallelic locus with alleles $A$ and $a$. Let $p$ be the frequency of allele $A$ and $q = 1 - p$ the frequency of allele $a$. Under random mating, an individual’s two alleles are like two independent draws from the allele pool, so the genotype frequencies are the terms of $(p+q)^2$:

$$f(AA) = p^2, \qquad f(Aa) = 2pq, \qquad f(aa) = q^2.$$

These are the Hardy–Weinberg proportions, and they are reached after a single generation of random mating regardless of the starting genotype frequencies. Once reached, both the allele frequencies and the genotype frequencies stay constant generation after generation, hence “equilibrium”.
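The one-generation claim can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming an infinite population and the idealized model (so frequencies update deterministically); the function names and the starting frequencies are hypothetical choices for illustration.

```python
def allele_freq(f_AA, f_Aa, f_aa):
    """Frequency of allele A given genotype frequencies that sum to 1."""
    # Each AA individual contributes two A alleles, each Aa individual one.
    return f_AA + 0.5 * f_Aa


def random_mating(f_AA, f_Aa, f_aa):
    """Genotype frequencies one generation later under random mating."""
    p = allele_freq(f_AA, f_Aa, f_aa)
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


# Hypothetical starting population with no heterozygotes at all.
gen0 = (0.5, 0.0, 0.5)
gen1 = random_mating(*gen0)  # Hardy-Weinberg proportions for p = 0.5
gen2 = random_mating(*gen1)  # a second round of mating changes nothing

print(gen0, gen1, gen2)
```

Starting from a population made up only of homozygotes, the first call already produces the proportions 0.25, 0.5, 0.25, and applying the function again returns the same triple.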

Assumptions

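The result holds when the idealizing assumptions of the model are met: mating is random with respect to the locus, there is no selection, mutation, or migration, and the population is large enough that genetic drift can be ignored.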

Because the model treats an individual as two independent allele draws, the genotype probability $2pq$ for heterozygotes carries the factor $2$: the ordered outcomes $Aa$ and $aA$ are both heterozygous.
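The factor of two can be made concrete by enumerating the four ordered draws and pooling those that give the same genotype. This is an illustrative sketch only; the allele frequency 0.6 is an assumed value, and the variable names are not from the text.

```python
from itertools import product

p = 0.6      # hypothetical frequency of allele A
q = 1.0 - p  # frequency of allele a
allele_prob = {"A": p, "a": q}

genotype_prob = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
for first, second in product("Aa", repeat=2):
    # Sorting maps the ordered draws (A, a) and (a, A) to the same genotype "Aa".
    genotype = "".join(sorted((first, second)))
    genotype_prob[genotype] += allele_prob[first] * allele_prob[second]

print(genotype_prob)
```

The heterozygote entry accumulates two terms, one for each order of the draws, while each homozygote accumulates only one.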

Testing for HWE

Given observed genotype counts in a sample of $N$ individuals, we can test whether the population is consistent with HWE.

Estimating the allele frequency

Each individual carries two alleles, so with observed counts $n_{AA}$, $n_{Aa}$, $n_{aa}$ (summing to $N$) the allele-frequency estimate is

$$\hat p = \frac{2 n_{AA} + n_{Aa}}{2N}, \qquad \hat q = 1 - \hat p .$$

The chi-square goodness-of-fit test

Under the null hypothesis of HWE, the expected counts are $E_{AA} = \hat p^2 N$, $E_{Aa} = 2\hat p\hat q N$, and $E_{aa} = \hat q^2 N$. The Pearson chi-square statistic compares the observed counts $O_g$ with the expected counts $E_g$:

$$\chi^2 = \sum_{g \in \{AA,Aa,aa\}} \frac{(O_g - E_g)^2}{E_g}.$$

There are three genotype categories, but we lose one degree of freedom for the total count constraint and one more for estimating $\hat p$ from the data, leaving $3 - 1 - 1 = 1$ degree of freedom for a biallelic locus. A large statistic (small p-value relative to the $\chi^2_1$ distribution) is evidence against HWE.
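The degrees-of-freedom argument can also be checked empirically. The sketch below simulates samples drawn from exact Hardy–Weinberg proportions and counts how often the statistic exceeds 3.84, the 5% critical value for one degree of freedom. The allele frequency (0.3), sample size (500), number of replicates, and seed are all arbitrary assumptions, and the function names are hypothetical. If one degree of freedom is the right reference distribution, the rejection rate should land near 0.05.

```python
import random


def hwe_chisq(n_AA, n_Aa, n_aa):
    """Pearson chi-square statistic for HWE from three genotype counts."""
    n = n_AA + n_Aa + n_aa
    p_hat = (2 * n_AA + n_Aa) / (2 * n)
    q_hat = 1.0 - p_hat
    expected = (p_hat ** 2 * n, 2 * p_hat * q_hat * n, q_hat ** 2 * n)
    if min(expected) == 0:
        return 0.0  # monomorphic sample: nothing to test
    observed = (n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def rejection_rate(p=0.3, n=500, reps=2000, critical=3.84, seed=1):
    """Fraction of simulated HWE samples whose statistic exceeds `critical`."""
    rng = random.Random(seed)
    q = 1.0 - p
    weights = [p * p, 2 * p * q, q * q]  # true genotype probabilities
    rejections = 0
    for _ in range(reps):
        sample = rng.choices(("AA", "Aa", "aa"), weights=weights, k=n)
        stat = hwe_chisq(sample.count("AA"), sample.count("Aa"), sample.count("aa"))
        rejections += stat > critical
    return rejections / reps


rate = rejection_rate()
print(rate)
```

This is a sanity check rather than a proof: the simulated rate fluctuates from seed to seed, and the chi-square approximation itself degrades when expected counts are small.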

Worked example

Suppose we genotype $N = 200$ individuals and observe $n_{AA} = 90$, $n_{Aa} = 60$, $n_{aa} = 50$.

First estimate the allele frequency:

$$\hat p = \frac{2(90) + 60}{2(200)} = \frac{240}{400} = 0.6, \qquad \hat q = 0.4 .$$

Then the expected counts under HWE are

$$E_{AA} = 0.6^2 \cdot 200 = 72, \quad E_{Aa} = 2(0.6)(0.4)\cdot 200 = 96, \quad E_{aa} = 0.4^2 \cdot 200 = 32 .$$

The chi-square statistic is

$$\chi^2 = \frac{(90-72)^2}{72} + \frac{(60-96)^2}{96} + \frac{(50-32)^2}{32} = 4.5 + 13.5 + 10.125 = 28.125 .$$

Against $\chi^2_1$, the $5\%$ critical value is $3.84$, so $28.125$ is highly significant ($p \approx 10^{-7}$). The sample has far fewer heterozygotes than expected (60 observed against 96) and correspondingly more of both homozygotes: a clear heterozygote deficit.

What deviations mean

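A significant departure from HWE is a signal, not a diagnosis, and the direction is informative. A deficit of heterozygotes is what inbreeding, hidden population structure, or genotyping problems that miscall heterozygotes as homozygotes would produce. An excess of heterozygotes points instead toward causes such as selection favoring heterozygotes or a technical artifact. The test alone cannot tell these explanations apart.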
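The direction of a departure can be read off by comparing observed heterozygotes with their expected count. The sketch below is one hedged way to do this: it reports the ratio-based summary 1 - observed/expected, often written F and called the fixation index or inbreeding coefficient. That summary is brought in here for illustration and is not defined elsewhere in this text, and the function name is hypothetical.

```python
def het_deviation(n_AA, n_Aa, n_aa):
    """Return (expected heterozygotes, F estimate, direction label)."""
    n = n_AA + n_Aa + n_aa
    p_hat = (2 * n_AA + n_Aa) / (2 * n)
    expected_het = 2 * p_hat * (1 - p_hat) * n
    if expected_het == 0:
        raise ValueError("monomorphic sample: no heterozygotes expected")
    # F > 0 means fewer heterozygotes than expected, F < 0 means more.
    f_hat = 1 - n_Aa / expected_het
    if f_hat > 0:
        direction = "deficit"
    elif f_hat < 0:
        direction = "excess"
    else:
        direction = "none"
    return expected_het, f_hat, direction


# Counts from the worked example: 90 AA, 60 Aa, 50 aa.
expected_het, f_hat, direction = het_deviation(90, 60, 50)
print(expected_het, f_hat, direction)
```

For the worked example this gives a positive F (60 observed heterozygotes against 96 expected), matching the deficit noted there. The sign indicates direction only; it does not say which process caused it, nor whether the departure is statistically significant.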

In code

R

obs <- c(AA = 90, Aa = 60, aa = 50)
N <- sum(obs)
phat <- (2 * obs[["AA"]] + obs[["Aa"]]) / (2 * N)   # 0.6; [[ ]] drops the "AA" name so later vectors keep clean names
qhat <- 1 - phat
exp_freq <- c(AA = phat^2, Aa = 2 * phat * qhat, aa = qhat^2)
expected <- exp_freq * N                          # 72, 96, 32

chisq <- sum((obs - expected)^2 / expected)       # 28.125
pval  <- pchisq(chisq, df = 1, lower.tail = FALSE) # ~ 1.1e-07
c(chisq = chisq, pval = pval)

# chisq.test uses df = 2 by default (does not know p was estimated),
# so compute the 1-df p-value manually as above.

Python

import numpy as np
from scipy import stats

obs = np.array([90, 60, 50])          # AA, Aa, aa
N = obs.sum()
phat = (2 * obs[0] + obs[1]) / (2 * N)  # 0.6
qhat = 1 - phat
expected = np.array([phat**2, 2 * phat * qhat, qhat**2]) * N  # [72, 96, 32]

chisq = np.sum((obs - expected)**2 / expected)   # 28.125
pval = stats.chi2.sf(chisq, df=1)                # ~ 1.1e-07
print(chisq, pval)
# prints: 28.124999999999993 1.1372725656979712e-07

Julia

using Distributions

obs = [90, 60, 50]                # AA, Aa, aa
N = sum(obs)
phat = (2obs[1] + obs[2]) / (2N)  # 0.6
qhat = 1 - phat
expected = [phat^2, 2phat*qhat, qhat^2] .* N   # [72.0, 96.0, 32.0]

chisq = sum((obs .- expected).^2 ./ expected)  # 28.125
pval = ccdf(Chisq(1), chisq)                   # ~ 1.1e-7
println((chisq, pval))

Why it matters

Hardy–Weinberg equilibrium is the reference point against which almost every population-genetic observation is measured. Because it converts allele frequencies into expected genotype frequencies under a clean set of assumptions, any deviation localizes an interesting force — mating structure, selection, or subdivision — and its routine use as a quality-control filter keeps spurious markers out of downstream association analyses.
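As a rough illustration of the quality-control use, the sketch below flags markers whose HWE p-value falls under a cutoff. The marker names, genotype counts, and the cutoff of 1e-6 are all hypothetical; real pipelines choose thresholds to suit the study and often use an exact test rather than the chi-square approximation. The p-value uses the identity that, for one degree of freedom, the upper tail of the chi-square distribution at x equals erfc(sqrt(x / 2)).

```python
import math


def hwe_pvalue(n_AA, n_Aa, n_aa):
    """1-df chi-square p-value for HWE from three genotype counts."""
    n = n_AA + n_Aa + n_aa
    p_hat = (2 * n_AA + n_Aa) / (2 * n)
    q_hat = 1.0 - p_hat
    expected = (p_hat ** 2 * n, 2 * p_hat * q_hat * n, q_hat ** 2 * n)
    if min(expected) == 0:
        return 1.0  # monomorphic sample: no evidence against HWE
    observed = (n_AA, n_Aa, n_aa)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(chisq / 2))


# Hypothetical markers with genotype counts in the order (AA, Aa, aa).
markers = {
    "marker_1": (90, 60, 50),  # counts from the worked example
    "marker_2": (72, 96, 32),  # exactly the HWE expected counts
    "marker_3": (40, 90, 70),  # mild, non-significant deviation
}
THRESHOLD = 1e-6  # assumed cutoff, not a recommendation

flagged = [name for name, c in markers.items() if hwe_pvalue(*c) < THRESHOLD]
kept = [name for name, c in markers.items() if hwe_pvalue(*c) >= THRESHOLD]
print("flagged:", flagged)
print("kept:", kept)
```

Only the marker with the worked-example counts falls below the assumed cutoff. A flagged marker is a candidate for inspection, since the departure may reflect biology rather than a genotyping error.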