The Binomial Distribution

The binomial distribution counts how many “successes” occur in a fixed number of independent yes/no trials: how many of $n$ vaccinated people avoid infection, how many of $n$ tossed coins land heads, how many of $n$ patients respond to a treatment. It is the natural model behind testing a proportion.

Figure: the Binomial(10, 0.3) probability mass function; the mean is $np = 3$.

Definition

Let $X\sim\mathrm{Binomial}(n,p)$ count the successes in $n$ independent trials, each with success probability $p$. Its probability mass function is
$$P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}.$$

The binomial coefficient $\binom{n}{k}=\frac{n!}{k!\,(n-k)!}$ counts the number of ways to arrange $k$ successes among $n$ trials.
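As a quick check of the formula, the pmf can be evaluated directly from the binomial coefficient. The sketch below uses only Python's standard library; the function name `binomial_pmf` is an illustrative choice rather than a library function, and the parameters match the Binomial(10, 0.3) example.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), computed straight from the formula."""
    if not 0 <= k <= n:
        return 0.0  # outside the support {0, ..., n}
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(sum(pmf))                               # probabilities sum to 1 (up to rounding)
print(sum(k * q for k, q in enumerate(pmf)))  # mean, equal to n * p = 3 (up to rounding)
```

Summing the pmf over the whole support is a cheap sanity check that the coefficient and the two powers have been combined correctly.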

Sum of Bernoulli trials

A single trial with outcome $1$ (success, probability $p$) or $0$ (failure) is a Bernoulli random variable with mean $p$ and variance $p(1-p)$. A binomial variable is just the sum of $n$ independent Bernoulli trials:
$$X=\sum_{i=1}^{n}B_i,\qquad B_i\sim\mathrm{Bernoulli}(p).$$
Because expectation and (for independent terms) variance add, this immediately gives $\mathbb{E}[X]=np$ and $\mathrm{Var}(X)=np(1-p)$.
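The sum-of-Bernoullis construction can be tried out numerically. The following is a minimal sketch using only Python's standard library; `binomial_draw`, the seed, and the number of replications are arbitrary illustrative choices.

```python
import random
from statistics import fmean, pvariance

def binomial_draw(n: int, p: float, rng: random.Random) -> int:
    """One Binomial(n, p) draw built as a sum of n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # fixed seed so the run is repeatable
n, p = 20, 0.3
draws = [binomial_draw(n, p, rng) for _ in range(50_000)]

print(fmean(draws), n * p)                # sample mean vs. np
print(pvariance(draws), n * p * (1 - p))  # sample variance vs. np(1-p)
```

Each pair of printed numbers should be close but not identical, since the left-hand value is a simulation estimate.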

When it arises

The binomial arises whenever you count successes among a fixed number of independent, identical trials: proportion testing (fraction cured, fraction defective), survey responses (yes/no), and quality control. The sample proportion $\hat p=X/n$ is the basis for inference about an unknown $p$.
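To make the role of the sample proportion concrete, the sketch below computes $\hat p$ together with a plug-in standard error $\sqrt{\hat p(1-\hat p)/n}$, which follows from $\mathrm{Var}(X)=np(1-p)$. The counts are made-up illustrative numbers, and `proportion_summary` is a hypothetical helper name.

```python
from math import sqrt

def proportion_summary(successes: int, n: int) -> tuple[float, float]:
    """Return (p_hat, estimated standard error) for a binomial count out of n trials."""
    p_hat = successes / n
    # Var(X / n) = p(1 - p) / n; plugging in p_hat gives the estimated standard error.
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se

# Hypothetical data: 42 "yes" answers among 120 respondents.
p_hat, se = proportion_summary(42, 120)
print(p_hat, se)
```

The plug-in standard error is only one common choice; it becomes unreliable when $\hat p$ is near 0 or 1 or when $n$ is small.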

Extension: the multinomial

When each trial has more than two possible outcomes (say categories $1,\dots,m$ with probabilities $p_1,\dots,p_m$ summing to $1$), the counts follow the multinomial distribution
$$P(X_1=k_1,\dots,X_m=k_m)=\frac{n!}{k_1!\cdots k_m!}\,p_1^{k_1}\cdots p_m^{k_m},\qquad k_1+\dots+k_m=n,$$
the direct generalization of the binomial to several categories.
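The multinomial formula can be evaluated the same way as the binomial one. The sketch below is standard-library Python with illustrative numbers; `multinomial_pmf` is a hypothetical helper name, and the last lines check that the two-category case agrees with the binomial pmf.

```python
from math import comb, factorial, prod

def multinomial_pmf(counts: list[int], probs: list[float]) -> float:
    """P(X_1 = k_1, ..., X_m = k_m), with n = sum(counts) trials."""
    n = sum(counts)
    # Multinomial coefficient n! / (k_1! ... k_m!), computed exactly with integers.
    coefficient = factorial(n) // prod(factorial(k) for k in counts)
    return coefficient * prod(p**k for p, k in zip(probs, counts))

# Illustrative numbers: 10 trials spread over three categories.
print(multinomial_pmf([5, 3, 2], [0.5, 0.3, 0.2]))

# With m = 2 the formula reduces to the binomial pmf.
n, k, p = 10, 3, 0.3
print(multinomial_pmf([k, n - k], [p, 1 - p]))
print(comb(n, k) * p**k * (1 - p)**(n - k))
```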

In code

R

# pmf, cdf, quantile, and sampling for Binomial(n = 20, p = 0.3)
dbinom(6, size = 20, prob = 0.3)   # P(X = 6)
pbinom(6, size = 20, prob = 0.3)   # P(X <= 6)
qbinom(0.95, size = 20, prob = 0.3) # 95% quantile

set.seed(123)
x <- rbinom(10000, size = 20, prob = 0.3)  # random sample
hist(x, breaks = seq(-0.5, 20.5, 1), freq = FALSE)  # histogram
points(0:20, dbinom(0:20, 20, 0.3))                 # overlay the pmf

Python

import numpy as np
from scipy import stats

n, p = 20, 0.3
stats.binom.pmf(6, n, p)    # P(X = 6)
stats.binom.cdf(6, n, p)    # P(X <= 6)
stats.binom.ppf(0.95, n, p) # 95% quantile

rng = np.random.default_rng(123)
x = rng.binomial(n, p, size=10000)  # random sample
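import matplotlib.pyplot as plt  # assumed to be installed; needed only for the plot
plt.hist(x, bins=np.arange(-0.5, n + 1.5), density=True)  # one unit-width bin per integer 0..n
plt.plot(np.arange(n + 1), stats.binom.pmf(np.arange(n + 1), n, p), "o")  # overlay the pmf
plt.show()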

Julia

using Distributions, Random

d = Binomial(20, 0.3)   # Binomial(n, p)
pdf(d, 6)               # P(X = 6)   (pdf = pmf for discrete)
cdf(d, 6)               # P(X <= 6)
quantile(d, 0.95)       # 95% quantile

Random.seed!(123)
x = rand(d, 10_000)     # random sample
# histogram(x, normalize=:pdf); scatter!(0:20, pdf.(d, 0:20)) to overlay the pmf

Simulation

The empirical mean of many binomial draws converges to $np$ and the empirical variance to $np(1-p)$.

set.seed(7)
x <- rbinom(1e6, size = 20, prob = 0.3)
mean(x)  # ~ 6.0   (theoretical mean = np = 20 * 0.3)
var(x)   # ~ 4.2   (theoretical variance = np(1-p) = 20 * 0.3 * 0.7)

Why it matters for statistics

The binomial underpins inference for proportions: confidence intervals for $p$, tests comparing two proportions, and the reasoning behind p-values in exact tests. For large $n$, with $p$ not too close to $0$ or $1$, it is well approximated by a normal distribution with mean $np$ and variance $np(1-p)$. For large $n$ with small $p$, it approaches the Poisson distribution with rate $\lambda=np$.
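Both approximations can be compared with exact binomial probabilities. The sketch below is standard-library Python; the parameter values are illustrative, the normal approximation uses mean $np$ and variance $np(1-p)$ with a continuity correction (an added refinement, not something stated above), and the Poisson approximation uses rate $\lambda = np$.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k), summing the pmf over 0..k."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

# Normal approximation to P(X <= 35) for Binomial(100, 0.3).
n, p = 100, 0.3
exact_cdf = binom_cdf(35, n, p)
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
normal_cdf = approx.cdf(35.5)  # 35.5 rather than 35: continuity correction
print(exact_cdf, normal_cdf)

# Poisson approximation to P(X = 2) for Binomial(1000, 0.003).
n2, p2 = 1000, 0.003
lam = n2 * p2  # Poisson rate, lambda = np
exact_pmf = binom_pmf(2, n2, p2)
poisson_pmf = exp(-lam) * lam**2 / factorial(2)
print(exact_pmf, poisson_pmf)
```

In each printed pair the exact and approximate values should agree to roughly two or three decimal places for these parameters; the agreement degrades as $n$ shrinks or, for the Poisson case, as $p$ grows.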