Matrix Operations
Matrix arithmetic is how statistical models are actually computed: fitting a regression, propagating a covariance, or stepping a dynamical system all reduce to a few matrix operations. The one rule that trips everyone up is that matrix multiplication is not element-wise and is not commutative.
Addition, subtraction, and scalar multiplication
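Addition and subtraction act element-wise and require identical dimensions: for two $m \times n$ matrices $A$ and $B$,
$$(A \pm B)_{ij} = a_{ij} \pm b_{ij}.$$
Scalar multiplication scales every entry: $cA$ multiplies each $a_{ij}$ by $c$.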
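As a quick check of these rules, here is a minimal NumPy sketch. The matrices P, Q, and wide are arbitrary illustrative values, not taken from the text. One caveat: NumPy broadcasts some mismatched shapes (for example a matrix plus a single row) instead of raising, so the identical-dimensions rule is only enforced when the shapes cannot be broadcast, as in the case below.

```python
import numpy as np

# Illustrative matrices (assumed values, chosen only for this sketch)
P = np.array([[1, 2], [3, 4]])
Q = np.array([[10, 20], [30, 40]])

total = P + Q      # element-wise sum
diff = Q - P       # element-wise difference
scaled = 3 * P     # scalar multiplication: every entry times 3

# Shapes (2, 2) and (2, 3) cannot be broadcast together, so NumPy raises.
wide = np.ones((2, 3))
try:
    P + wide
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```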
Element-wise (Hadamard) vs matrix multiplication
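The Hadamard product multiplies corresponding entries and needs identical dimensions:
$$(A \circ B)_{ij} = a_{ij}\, b_{ij}.$$
Matrix multiplication is different. It requires conformability: an $m \times n$ matrix times an $n \times p$ matrix yields an $m \times p$ matrix. The $(i, j)$ entry is the dot product of row $i$ of $A$ with column $j$ of $B$:
$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}.$$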
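To make the two definitions concrete, the following is a minimal pure-Python sketch that spells out the index arithmetic on lists of rows. The helper names matmul and hadamard are hypothetical, introduced only for illustration; real code should use a linear-algebra library instead.

```python
def matmul(A, B):
    """Matrix product of two matrices given as lists of rows."""
    m, n = len(A), len(A[0])
    n_b, p = len(B), len(B[0])
    if n != n_b:
        # Conformability: columns of A must equal rows of B.
        raise ValueError(f"not conformable: {m}x{n} times {n_b}x{p}")
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
        for i in range(m)
    ]


def hadamard(A, B):
    """Element-wise product; both matrices must have the same shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("Hadamard product needs identical dimensions")
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


C = [[1, 2], [3, 4]]
D = [[5, 6], [7, 8]]
print(matmul(C, D))    # [[19, 22], [43, 50]]
print(hadamard(C, D))  # [[5, 12], [21, 32]]
```

The same pair of inputs gives two different results, which is why the two operations must not be confused.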
Transpose
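The transpose flips rows and columns: $(A^\top)_{ij} = a_{ji}$. Useful properties:
$$(A^\top)^\top = A, \qquad (A + B)^\top = A^\top + B^\top, \qquad (AB)^\top = B^\top A^\top.$$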
Note the order reversal in the last identity.
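The order reversal in the product rule, $(AB)^\top = B^\top A^\top$, can be checked numerically. This is a small sketch on random integer matrices; the shapes and seed are arbitrary assumptions. Integers are used so that the comparison is exact rather than subject to floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(2, 3))   # 2 x 3
B = rng.integers(-5, 5, size=(3, 4))   # 3 x 4

lhs = (A @ B).T    # transpose of the product: 4 x 2
rhs = B.T @ A.T    # product of transposes in reversed order: 4 x 2
reversal_holds = np.array_equal(lhs, rhs)

# Without the reversal the product is not even defined here:
# A.T is 3 x 2 and B.T is 4 x 3, so A.T @ B.T is not conformable.
try:
    A.T @ B.T
    unreversed_defined = True
except ValueError:
    unreversed_defined = False
```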
Worked example (by hand)
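Let
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \qquad B = \begin{pmatrix} 7 & 8 \\ 9 & 10 \\ 11 & 12 \end{pmatrix}.$$
Since $A$ is $2 \times 3$ and $B$ is $3 \times 2$, $AB$ is $2 \times 2$. Compute each entry:
$$\begin{aligned}
(AB)_{11} &= 1 \cdot 7 + 2 \cdot 9 + 3 \cdot 11 = 58, \\
(AB)_{12} &= 1 \cdot 8 + 2 \cdot 10 + 3 \cdot 12 = 64, \\
(AB)_{21} &= 4 \cdot 7 + 5 \cdot 9 + 6 \cdot 11 = 139, \\
(AB)_{22} &= 4 \cdot 8 + 5 \cdot 10 + 6 \cdot 12 = 154.
\end{aligned}$$
So
$$AB = \begin{pmatrix} 58 & 64 \\ 139 & 154 \end{pmatrix}.$$
Not commutative: here $BA$ is $3 \times 3$, a completely different object. Even for square matrices, $AB \neq BA$ in general.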
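The non-commutativity claim is easy to confirm numerically. The sketch below uses a 2 x 3 matrix A and a 3 x 2 matrix B with the entries 1 to 6 and 7 to 12, plus two small square matrices C and D chosen only for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # 2 x 3
B = np.array([[7, 8], [9, 10], [11, 12]])   # 3 x 2

AB = A @ B   # 2 x 2
BA = B @ A   # 3 x 3: a different shape altogether

# Square matrices: both orders are defined and have the same shape,
# but the results still differ in general.
C = np.array([[1, 2], [3, 4]])
D = np.array([[5, 6], [7, 8]])
CD = C @ D
DC = D @ C
commutes = np.array_equal(CD, DC)

print(AB.shape, BA.shape)   # (2, 2) (3, 3)
print(commutes)             # False
```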
Computing it
R
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE) # 2 x 3
B <- matrix(c(7, 8, 9, 10, 11, 12), nrow = 3, byrow = TRUE) # 3 x 2
A %*% B # matrix multiplication -> [[58, 64], [139, 154]]
t(A) # transpose -> 3 x 2
# Element-wise multiplication needs equal dimensions:
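C <- matrix(1:4, 2, 2); D <- matrix(5:8, 2, 2) # filled column-wise: C = [[1, 3], [2, 4]], D = [[5, 7], [6, 8]]
C * D # Hadamard product, NOT matrix mult -> [[5, 21], [12, 32]]
C %*% D # matrix multiplication (different result) -> [[23, 31], [34, 46]]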
Python
import numpy as np
A = np.array([[1, 2, 3], [4, 5, 6]]) # 2 x 3
B = np.array([[7, 8], [9, 10], [11, 12]]) # 3 x 2
A @ B # matrix multiplication -> [[58, 64], [139, 154]]
A.T # transpose -> 3 x 2
C = np.array([[1, 2], [3, 4]]); D = np.array([[5, 6], [7, 8]])
C * D # element-wise -> [[5, 12], [21, 32]]
C @ D # matrix multiplication -> [[19, 22], [43, 50]]
Julia
using LinearAlgebra
A = [1 2 3; 4 5 6] # 2 x 3
B = [7 8; 9 10; 11 12] # 3 x 2
A * B # matrix multiplication -> [58 64; 139 154]
A' # adjoint; equals the transpose for real matrices (transpose(A) otherwise) -> 3 x 2
C = [1 2; 3 4]; D = [5 6; 7 8]
C .* D # element-wise -> [5 12; 21 32]
C * D # matrix multiplication -> [19 22; 43 50]
Why it matters for statistics
The least-squares fit $\hat{\beta} = (X^\top X)^{-1} X^\top y$ is built from transposes and matrix products, together with a single inverse; the cross-product $X^\top X$ collapses an $n \times p$ dataset into a $p \times p$ summary.
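Confusing element-wise and matrix multiplication is a classic source of silently wrong results, because for square matrices both are defined and no error is raised. The operators also differ by language: in R and NumPy, * is element-wise (matrix multiplication is %*% and @ respectively), whereas in Julia * is matrix multiplication and .* is element-wise. Keep the operators straight.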
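The least-squares formula $\hat{\beta} = (X^\top X)^{-1} X^\top y$ can be sketched with nothing more than transposes and matrix products. The example below is illustrative only: the data are simulated, and the sizes, seed, and coefficients in beta_true are assumptions made for the demonstration. It solves the normal equations $X^\top X \hat{\beta} = X^\top y$ with np.linalg.solve rather than forming the inverse explicitly, and compares the result against np.linalg.lstsq.

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 50, 3

# Simulated design matrix: an intercept column plus two random predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([2.0, -1.0, 0.5])            # hypothetical coefficients
y = X @ beta_true + 0.1 * rng.normal(size=n)      # response with small noise

XtX = X.T @ X    # p x p cross-product summary of the n x p data
Xty = X.T @ y    # length-p vector

# Solve the normal equations instead of inverting XtX explicitly.
beta_hat = np.linalg.solve(XtX, Xty)

# Reference solution from NumPy's least-squares routine.
beta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)

print(XtX.shape)                          # (3, 3)
print(np.allclose(beta_hat, beta_ref))    # True
```

Solving the linear system is a common choice because it avoids computing an explicit inverse; for ill-conditioned designs a QR- or SVD-based routine such as np.linalg.lstsq is generally the safer option.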