Measures of Center

A measure of center summarizes a distribution or dataset with a single “typical” value. Choosing the right one — mean, median, or mode — depends on shape, skew, and how much you trust the extremes. Medians are prized in biology for just this reason: the median incubation period or median survival time is far more robust than the mean for the right-skewed distributions typical of infectious-disease data.

The three classic measures

Quantiles, percentiles, and order statistics

Sorting the data so that $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$ gives the order statistics; $x_{(1)}$ is the minimum and $x_{(n)}$ the maximum. A quantile at level $q \in [0,1]$ is a value below which a fraction $q$ of the data fall; percentiles are quantiles expressed in percent. The median is the $0.5$ quantile.
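The definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names `order_statistics` and `quantile_linear` are hypothetical, and the sketch assumes one particular convention (linear interpolation between neighbouring order statistics). Other conventions exist and can give slightly different values on small samples.

```python
import math


def order_statistics(data):
    """Return the data sorted ascending: x_(1) <= x_(2) <= ... <= x_(n)."""
    return sorted(data)


def quantile_linear(data, q):
    """Quantile at level q in [0, 1], interpolating linearly between order statistics.

    Assumed convention: the fractional 0-based position is q * (n - 1).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    xs = order_statistics(data)
    if not xs:
        raise ValueError("data must be non-empty")
    pos = q * (len(xs) - 1)   # fractional position in the sorted data
    lo = math.floor(pos)      # index of the order statistic just below
    hi = math.ceil(pos)       # index of the order statistic just above
    frac = pos - lo           # how far pos sits between the two
    return xs[lo] + frac * (xs[hi] - xs[lo])


x = [2, 4, 4, 5, 100]
xs = order_statistics(x)
print(xs[0], xs[-1])            # minimum and maximum
print(quantile_linear(x, 0.5))  # the median as the 0.5 quantile
```

Under this convention the levels 0 and 1 return the minimum and maximum, so the order statistics and the quantiles line up at the endpoints.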

When to use which

Worked example

Dataset: $\{2, 4, 4, 5, 100\}$, with $n = 5$.
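Working through the arithmetic for this dataset makes the three measures concrete. The notation $\bar{x}$ for the mean is an assumption of this sketch, and the median formula shown applies only because $n$ is odd.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
        = \frac{2 + 4 + 4 + 5 + 100}{5}
        = \frac{115}{5}
        = 23
\]
\[
\text{median} = x_{((n+1)/2)} = x_{(3)} = 4
\qquad (n = 5 \text{ is odd})
\]
\[
\text{mode} = 4
\qquad (\text{the only value that occurs more than once})
\]
```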

The lone value $100$ drags the mean up to $23$, far from every other point, while the median $4$ stays representative. This is robustness in action.
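One way to see this robustness directly is to vary the extreme value and watch which summary moves. The sketch below uses only Python's standard `statistics` module. The helper name `center_with_extreme` and the alternative extremes 6 and 10,000 are hypothetical choices made for illustration.

```python
from statistics import mean, median

# The four "ordinary" points from the worked example.
BASE = [2, 4, 4, 5]


def center_with_extreme(extreme):
    """Return (mean, median) of the base points plus one extreme value."""
    x = BASE + [extreme]
    return mean(x), median(x)


# Replace the fifth point with increasingly extreme values.
for extreme in (6, 100, 10_000):
    m, med = center_with_extreme(extreme)
    print(f"extreme={extreme:>6}  mean={m}  median={med}")
```

The mean tracks the extreme value, while the median is unchanged because the middle order statistic does not depend on how large the largest point is.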

Simulation

R

x <- c(2, 4, 4, 5, 100)
mean(x)                 # 23
median(x)               # 4
quantile(x, c(.25, .5, .75))
# mode: value with highest frequency
as.numeric(names(which.max(table(x))))  # 4

Python

import numpy as np
from statistics import mode

x = np.array([2, 4, 4, 5, 100])
print(np.mean(x))                    # 23.0
print(np.median(x))                  # 4.0
print(np.quantile(x, [.25, .5, .75]))  # [4. 4. 5.]
print(mode(x.tolist()))              # 4

Julia

using Statistics, StatsBase

x = [2, 4, 4, 5, 100]
println(mean(x))                     # 23.0
println(median(x))                   # 4.0
println(quantile(x, [.25, .5, .75]))
println(mode(x))                     # 4

Why it matters for statistics

The center you report shapes the story your data tell. The mean feeds directly into variances, standard errors, and the central limit theorem; the median and quantiles give robust, distribution-free summaries that survive outliers and skew. Knowing when each is appropriate is the difference between an honest summary and a misleading one.