Matrix Operations — A Complete Reference

From basic notation to real-world applications

1. What is a Matrix?


A matrix is a rectangular arrangement of numbers (or symbols) organized into rows and columns. An m×n matrix has m rows and n columns, and every entry is identified by its row index i and column index j, written Aᵢⱼ. A 3×4 matrix, for instance, contains 12 entries laid out in 3 rows of 4 columns each.

Matrices are far more than spreadsheets of numbers. They encode linear transformations: any matrix-vector product Ax represents stretching, rotating, shearing, projecting, or otherwise reshaping the vector x in a controlled, reversible-or-not way. This single idea unifies enormous swaths of science and engineering. A 2×2 matrix can rotate or scale a 2D image. A 3×3 matrix can transform a 3D model in a video game. A 1000×1000 matrix can describe how 1000 financial markets influence each other on a single trading day.

Why study matrices? They are the lingua franca of applied mathematics: computer graphics uses them for 3D-to-2D projection, machine learning stores every neural-network layer as a matrix of weights, economics models supply chains with input-output matrices, physics represents quantum observables as Hermitian matrices acting on state vectors, and statistics fits regression models by solving a matrix equation. Mastering matrix operations is therefore the gateway to nearly every quantitative discipline.

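In short, a matrix is more than a table of numbers. Multiplying a vector by a matrix applies a linear transformation that rotates, scales, shears, or projects it, and this single idea ties together computer graphics (3D transformations), machine learning (neural-network weights), economics (input-output models), physics (quantum states), and statistics (regression models).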

[Figure: the 3×3 matrix [[1, 2, 3], [4, 5, 6], [7, 8, 9]], with 9 entries arranged in rows and columns]

2. Notation & Special Matrices


Matrices are conventionally written with capital letters (A, B, M), and individual entries with the same letter in lowercase with subscripts: aᵢⱼ denotes the entry in row i, column j. Indexing starts at 1 in mathematics and 0 in most programming languages, a constant source of off-by-one bugs when translating formulas to code.
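The index shift can be isolated in one small helper. This is a minimal sketch in plain Python; the name `entry` and the list-of-rows storage are illustrative assumptions, not part of any library.

```python
def entry(A, i, j):
    """Return the mathematical entry a_ij (1-based) of a matrix stored as a list of rows."""
    if not (1 <= i <= len(A) and 1 <= j <= len(A[0])):
        raise IndexError(f"a_{i},{j} is outside a {len(A)}x{len(A[0])} matrix")
    return A[i - 1][j - 1]  # shift both indices down by one for 0-based storage


A = [[1, 2, 3],
     [4, 5, 6]]          # a 2x3 matrix

print(entry(A, 1, 1))    # row 1, column 1 in mathematical notation
print(entry(A, 2, 3))    # row 2, column 3 in mathematical notation
```

Keeping the shift in one place means the rest of the code can follow the formula's indices directly.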

Several special matrices appear so often they have dedicated names. The identity matrix Iₙ is an n×n square matrix with 1s on the main diagonal and 0s elsewhere; it satisfies AI = IA = A for any compatible A, playing the same role as the number 1 in scalar multiplication. The zero matrix has every entry equal to 0. A diagonal matrix has zeros everywhere off the main diagonal, which reduces inversion and taking powers to entry-by-entry operations on the diagonal. A symmetric matrix satisfies A = Aᵀ, meaning aᵢⱼ = aⱼᵢ; covariance matrices and many matrices in physics are symmetric. An upper-triangular matrix has zeros below the diagonal, and a lower-triangular matrix has zeros above it; together they are the two factors of the LU decomposition.

The transpose Aᵀ is formed by swapping rows and columns: the entry at row i, column j of Aᵀ equals aⱼᵢ. Transposing twice returns the original: (Aᵀ)ᵀ = A. The transpose distributes over addition, (A + B)ᵀ = Aᵀ + Bᵀ, but reverses order under multiplication: (AB)ᵀ = BᵀAᵀ. This reversal is one of the most-tested identities in linear algebra exams.
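The two transpose identities can be checked numerically on a small example. The sketch below uses plain Python lists; `transpose` and `matmul` are illustrative helper names, and the matrices are arbitrary.

```python
def transpose(A):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply A (m x n) by B (n x p) using row-by-column dot products."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must match")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[1, 2, 3],
     [4, 5, 6]]            # 2x3
B = [[1, 0],
     [2, 1],
     [0, 3]]               # 3x2

lhs = transpose(matmul(A, B))              # (AB)^T
rhs = matmul(transpose(B), transpose(A))   # B^T A^T

print(transpose(transpose(A)) == A)        # transposing twice returns A
print(lhs == rhs)                          # order reverses under multiplication
```

Note that the product of the transposes in the original order would not even be defined here in general, since the shapes only line up after the reversal.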

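In short, the identity matrix I is the identity element for multiplication, and the zero matrix is the identity element for addition. The transpose Aᵀ swaps rows and columns and reverses the order of a product: (AB)ᵀ = BᵀAᵀ. Symmetric matrices (A = Aᵀ) appear frequently as covariance matrices and in physics.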

3. Matrix Addition & Scalar Multiplication


To add two matrices, sum their corresponding entries: Cᵢⱼ = Aᵢⱼ + Bᵢⱼ. Both matrices must share the same dimensions: you cannot add a 2×3 matrix to a 3×2 matrix. Subtraction follows the same rule with a minus sign. Scalar multiplication scales every entry by a single number: (kA)ᵢⱼ = k · Aᵢⱼ.
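Both operations are a few lines of plain Python. This is a sketch with illustrative function names (`add`, `scale`) that makes the same-dimension requirement explicit.

```python
def add(A, B):
    """Entry-wise sum; both matrices must have identical dimensions."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(k, A):
    """Multiply every entry of A by the scalar k."""
    return [[k * a for a in row] for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

print(add(A, B))      # entry-wise sum of two 2x2 matrices
print(scale(3, A))    # every entry of A multiplied by 3
```

Calling `add` on a 2×3 and a 3×2 matrix raises `ValueError` instead of silently producing a wrong shape.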

Addition is commutative (A + B = B + A) and associative ((A + B) + C = A + (B + C)). Scalar multiplication distributes over both matrix addition (k(A + B) = kA + kB) and scalar addition ((k + l)A = kA + lA). These properties match those of ordinary number arithmetic, which is why addition rarely causes confusion — most matrix-related bugs surface around multiplication and inversion instead.

In image processing, adding two image matrices blends pixel values, useful for combining exposures or producing simple HDR effects. In machine learning, gradient updates during training are scalar-times-matrix additions of the form W ← W − η · ∇L: the new weight matrix W is the old one minus a small learning-rate η times the gradient matrix ∇L. Every step of stochastic gradient descent reduces to scalar multiplication followed by matrix subtraction, repeated millions of times.
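One such update step can be written out directly. The sketch below is plain Python with made-up weights, gradient, and learning rate; `sgd_step` is a hypothetical helper name, and a real training loop would compute the gradient from data.

```python
def sgd_step(W, grad, eta):
    """Return W - eta * grad, computed entry by entry."""
    return [[w - eta * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(W, grad)]


W = [[1.0, 2.0], [3.0, 4.0]]        # current weights (illustrative values)
grad = [[0.5, -1.0], [2.0, 0.0]]    # gradient of the loss (illustrative values)
eta = 0.5                           # learning rate (illustrative value)

W_new = sgd_step(W, grad, eta)
print(W_new)
```

Entries with a positive gradient decrease, entries with a negative gradient increase, and entries with zero gradient stay put.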

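In short, matrix addition sums corresponding entries and requires both matrices to have the same dimensions. Because the commutative, associative, and distributive laws all hold, it can be handled with the same intuition as ordinary addition. The gradient-descent weight update W ← W − η∇L in machine learning is essentially a scalar multiplication followed by a matrix subtraction, repeated many times.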

[Figure: [[1, 2], [3, 4]] + [[5, 6], [7, 8]] = [[6, 8], [10, 12]]; element-wise addition, same dimensions required]

4. Matrix Multiplication


Matrix multiplication is where the subject becomes genuinely powerful — and genuinely error-prone. For A of size m×n and B of size n×p, the product AB has size m×p, and each entry is a dot product of a row of A with a column of B:

(AB)ᵢⱼ = Σₖ aᵢₖ · bₖⱼ, where k runs from 1 to n

The inner dimensions must match: you can multiply (2×3)·(3×4), but never (2×3)·(2×4). This dimensional rule is the single most common source of runtime errors when working with libraries like NumPy or PyTorch.

Worked example. Compute the 2×2 product:

A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]]

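Each entry is one row-by-column dot product:

(AB)₁₁ = 1·5 + 2·7 = 19
(AB)₁₂ = 1·6 + 2·8 = 22
(AB)₂₁ = 3·5 + 4·7 = 43
(AB)₂₂ = 3·6 + 4·8 = 50

So AB = [[19, 22], [43, 50]].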
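The same computation can be sketched as a triple loop in plain Python. The function name `matmul` is illustrative; in practice a library routine would be used, but the explicit version shows where the inner-dimension rule comes from.

```python
def matmul(A, B):
    """Multiply A (m x n) by B (n x p); entry (i, j) is row i of A dotted with column j of B."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError(f"inner dimensions differ: {len(A[0])} vs {n}")
    p = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))   # reproduces the worked example
```

Passing a 2×3 matrix and a 2×4 matrix raises `ValueError`, because the index k has no common range to run over.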

Matrix multiplication is associative (A(BC) = (AB)C) and distributive over addition (A(B + C) = AB + AC), but emphatically not commutative: in general AB ≠ BA. Even when both products are defined they can be different kinds of object: if A is 2×3 and B is 3×2, AB is 2×2 while BA is 3×3. Sometimes only one direction is defined at all: if A is 2×3 and B is 3×4, AB exists but BA does not. This non-commutativity is exactly what makes matrices able to model the asymmetric reality of rotations, projections, and information flow in deep networks.
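Non-commutativity is easy to confirm on concrete matrices. This plain-Python sketch uses illustrative helpers (`matmul`, `shape`) and arbitrary example matrices.

```python
def matmul(A, B):
    """Row-by-column product; raises if the inner dimensions differ."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must match")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(M), len(M[0]))


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
AB = matmul(A, B)
BA = matmul(B, A)
print(AB == BA)                                   # square case: the entries differ

C = [[1, 0, 2], [0, 1, 3]]                        # 2x3
D = [[1, 0], [0, 1], [1, 1]]                      # 3x2
print(shape(matmul(C, D)), shape(matmul(D, C)))   # rectangular case: the shapes differ
```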

Every neural network layer is essentially y = Wx + b followed by a nonlinear activation, where W is a weight matrix and x is the previous layer’s output. Training a network with billions of parameters is therefore dominated by matrix multiplications, which is why GPUs (designed for parallel matrix math) revolutionized deep learning.

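In short, matrix multiplication has the form (m×n)·(n×p) = (m×p), and the inner dimensions must match. The associative and distributive laws hold, but the commutative law generally does not, so AB ≠ BA. Every layer of a deep neural network is built on a product of the form y = Wx + b, and parallel matrix arithmetic on GPUs opened the deep learning era.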

[Figure: row 1 of [[1, 2], [3, 4]] times column 1 of [[5, 6], [7, 8]] gives 1×5 + 2×7 = 19; each row-by-column dot product yields one entry]

5. Determinant


The determinant is a single number computed from a square matrix that captures crucial geometric and algebraic information. For a 2×2 matrix [[a, b], [c, d]], the determinant is simply det(A) = ad − bc. For a 3×3 matrix, the cofactor expansion along the first row gives det(A) = a₁₁(a₂₂a₃₃ − a₂₃a₃₂) − a₁₂(a₂₁a₃₃ − a₂₃a₃₁) + a₁₃(a₂₁a₃₂ − a₂₂a₃₁). General n×n determinants are computed via LU decomposition, which takes O(n³) operations; cofactor expansion works for any n, but a naive implementation takes O(n!) operations.

Geometric interpretation. In 2D, |det(A)| is the area of the parallelogram spanned by the matrix’s column vectors. In 3D, it’s the volume of the parallelepiped. The sign indicates orientation: a positive determinant preserves orientation (like a rotation), a negative determinant reverses it (like a reflection), and a zero determinant collapses the shape to a lower-dimensional figure — a line, point, or plane — meaning information has been irrecoverably lost.

Why it matters. A matrix A is invertible if and only if det(A) ≠ 0. When det(A) = 0, A is called singular: the linear system Ax = b either has no solution or infinitely many. Determinants also satisfy two beautiful identities, det(AB) = det(A)·det(B) and det(Aᵀ) = det(A), that make them useful theoretical tools even when they’re computationally expensive.

In practice, never compute large determinants by cofactor expansion (which has O(n!) complexity if implemented naively). For n > 4, use LU decomposition: det(A) = ±∏ uᵢᵢ, the product of the pivots, with the sign flipped once for every row swap made during elimination.
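The pivot-product rule can be sketched with ordinary Gaussian elimination. The code below is an illustrative plain-Python version, not a library routine: it uses `fractions.Fraction` for exact arithmetic (an assumption made to avoid rounding in the example) and takes the first nonzero pivot rather than the largest one that floating-point code would normally choose.

```python
from fractions import Fraction


def det(A):
    """Determinant via elimination: product of the pivots, sign flipped per row swap."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]   # exact working copy
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if U[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)                      # no pivot: the matrix is singular
        if pivot != c:
            U[c], U[pivot] = U[pivot], U[c]
            sign = -sign                            # each swap flips the sign
        for r in range(c + 1, n):
            factor = U[r][c] / U[c][c]
            U[r] = [x - factor * y for x, y in zip(U[r], U[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= U[i][i]
    return result


print(det([[1, 2], [3, 4]]))                    # matches ad - bc
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))   # matches the 3x3 cofactor formula
```

The elimination loop performs on the order of n³ arithmetic operations, in contrast to the factorial growth of naive cofactor expansion.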

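In short, the 2×2 determinant is det(A) = ad − bc. Geometrically, its absolute value is the factor by which the transformation scales area (2D) or volume (3D), and its sign indicates whether orientation is preserved (+) or flipped (−). det = 0 means the transformation has collapsed a dimension, so the matrix is non-invertible (singular). Determinants of large matrices should be computed by LU decomposition; cofactor expansion becomes computationally infeasible as n grows.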

[Figure: det [[a, b], [c, d]] = ad − bc; det = 0 means the matrix is singular (non-invertible)]

6. Inverse Matrix


The inverse A⁻¹ of a square matrix A is the matrix that satisfies A · A⁻¹ = A⁻¹ · A = I, where I is the identity matrix. Conceptually, A⁻¹ “undoes” the transformation A: if A rotates a vector 30° clockwise, A⁻¹ rotates it 30° counterclockwise back to where it started.

Existence. Not every matrix has an inverse. A square matrix is invertible if and only if det(A) ≠ 0; equivalently, if its columns are linearly independent, or its rank equals its size. Non-square matrices have no two-sided inverse, though every matrix has a Moore-Penrose pseudo-inverse A⁺, which solves least-squares problems.

2×2 closed form. For A = [[a, b], [c, d]] with det(A) = ad − bc ≠ 0:

A⁻¹ = (1 / det(A)) · [[d, −b], [−c, a]]

That is: swap the diagonal entries, negate the off-diagonal entries, and divide everything by the determinant. For larger matrices, the formula generalizes to A⁻¹ = (1 / det(A)) · adj(A), where adj(A) is the adjugate (transpose of the cofactor matrix). But when the cofactors are computed by expansion, the cost of this closed form grows factorially with n, so it is practical only for very small matrices.
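The 2×2 recipe translates almost word for word into code. This is a plain-Python sketch with an illustrative function name; it assumes integer entries so that `fractions.Fraction` can keep the result exact.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Invert [[a, b], [c, d]] (integer entries assumed): swap, negate, divide by det."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular (det = 0), so it has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A = [[1, 2], [3, 4]]
A_inv = inverse_2x2(A)

# Check the defining property A * A_inv = I entry by entry.
product = [[sum(A[i][k] * A_inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(A_inv)
print(product)
```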

Practical computation. For real applications, solve the linear system AX = I via Gaussian elimination or LU decomposition. Both run in O(n³) time, while the adjugate formula with cofactor expansion takes factorial time. In numerical software (NumPy np.linalg.inv, MATLAB inv), you almost never want the explicit inverse anyway: solving Ax = b directly via np.linalg.solve(A, b) is faster and numerically more stable.

Where it appears. In cryptography, the Hill cipher encrypts plaintext blocks as Ax and decrypts via A⁻¹y, with all arithmetic (including the inverse) taken modulo 26. In computer graphics, the inverse of a camera’s view matrix maps camera-space coordinates back to world coordinates. In robotics, the inverse Jacobian translates desired end-effector velocities into joint velocities.
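The Hill cipher makes a compact illustration of an inverse at work. The sketch below assumes a 2×2 key, uppercase A to Z text of even length, and arithmetic modulo 26; the function names and the key [[3, 3], [2, 5]] are illustrative choices, and `pow(det, -1, 26)` requires Python 3.8 or later.

```python
def hill_apply(key, text):
    """Multiply each 2-letter block of `text` (A-Z, even length) by the 2x2 `key`, mod 26."""
    nums = [ord(ch) - ord("A") for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x0, x1 = nums[i], nums[i + 1]
        out.append((key[0][0] * x0 + key[0][1] * x1) % 26)
        out.append((key[1][0] * x0 + key[1][1] * x1) % 26)
    return "".join(chr(n + ord("A")) for n in out)


def inverse_mod26(key):
    """Inverse of a 2x2 key modulo 26 via the adjugate formula."""
    (a, b), (c, d) = key
    det = (a * d - b * c) % 26
    det_inv = pow(det, -1, 26)   # raises ValueError if det is not invertible mod 26
    return [[(det_inv * d) % 26, (det_inv * -b) % 26],
            [(det_inv * -c) % 26, (det_inv * a) % 26]]


key = [[3, 3], [2, 5]]
cipher = hill_apply(key, "HELP")                  # encrypt: y = A x (mod 26)
plain = hill_apply(inverse_mod26(key), cipher)    # decrypt: x = A^-1 y (mod 26)
print(cipher, plain)
```

A key is usable only if its determinant is invertible modulo 26, which is the modular counterpart of the det(A) ≠ 0 condition.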

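In short, the inverse A⁻¹ "undoes" the transformation A. It exists only when the matrix is square and its determinant is nonzero; in the 2×2 case, swap the diagonal entries, flip the sign of the off-diagonal entries, and divide by the determinant. In practice the closed-form formula is inefficient for n > 3, and the inverse is computed by LU decomposition or Gaussian elimination. Typical applications include the Hill cipher in cryptography, camera coordinate transformations in computer graphics, and the inverse Jacobian in robotics.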

[Figure: A × A⁻¹ = [[1, 0], [0, 1]] = I, the identity; A⁻¹ = (1/det(A)) × adj(A)]

7. Systems of Linear Equations


A system of linear equations is a collection of equations sharing the same unknowns — for example, 2x + 3y = 8 and x − y = 1. Any such system can be packaged into a single matrix equation Ax = b, where A is the coefficient matrix, x is the vector of unknowns, and b is the constant vector. When A is square and invertible, the unique solution is x = A⁻¹b.

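Three possible outcomes. A linear system has either: (a) exactly one solution, which for a square A happens precisely when det(A) ≠ 0; (b) no solution, when the equations contradict each other geometrically (parallel non-intersecting planes); or (c) infinitely many solutions, when the equations are consistent but redundant (overlapping planes). The rank of A versus the rank of the augmented matrix [A | b] tells you which case you’re in: if rank(A) < rank([A | b]) there is no solution; if the two ranks are equal and match the number of unknowns the solution is unique; if they are equal but smaller than the number of unknowns there are infinitely many solutions.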
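The rank comparison can be sketched in a few lines. This is an illustrative plain-Python version using exact fractions; `rank` and `classify` are hypothetical helper names, and the example systems are arbitrary.

```python
from fractions import Fraction


def rank(M):
    """Number of pivots found by Gaussian elimination on an exact copy of M."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0                                        # index of the next pivot row
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                             # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def classify(A, b):
    """Compare rank(A) with rank([A | b]) to decide how many solutions Ax = b has."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    rank_a, rank_ab = rank(A), rank(augmented)
    if rank_a < rank_ab:
        return "no solution"
    if rank_a == len(A[0]):                      # rank equals the number of unknowns
        return "exactly one solution"
    return "infinitely many solutions"


print(classify([[2, 3], [1, -1]], [8, 1]))   # two lines crossing at one point
print(classify([[1, 1], [2, 2]], [1, 3]))    # parallel lines
print(classify([[1, 1], [2, 2]], [1, 2]))    # the same line written twice
```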

Gaussian elimination is the standard hand-computation algorithm: use row operations (swap two rows, multiply a row by a nonzero scalar, add a multiple of one row to another) to reduce A to upper-triangular form, then back-substitute. LU decomposition stores the row operations as a lower-triangular matrix L so that A = LU; once you have L and U, solving Ax = b for any new b is just two triangular solves, each O(n²) instead of O(n³).
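The elimination and back-substitution steps described above can be sketched as follows. This is an illustrative plain-Python version for square, invertible systems; it uses exact fractions and takes the first nonzero pivot, whereas floating-point code would normally pick the largest available pivot.

```python
from fractions import Fraction


def solve(A, b):
    """Solve Ax = b: reduce [A | b] to upper-triangular form, then back-substitute."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; no unique solution")
        M[c], M[pivot] = M[pivot], M[c]          # row swap (a no-op when pivot == c)
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]   # clear below the pivot
    x = [Fraction(0)] * n
    for i in reversed(range(n)):                 # back-substitution, last row first
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


# The system 2x + 3y = 8, x - y = 1
solution = solve([[2, 3], [1, -1]], [8, 1])
print(solution)
```

The result is x = 11/5 and y = 6/5, that is, x = 2.2 and y = 1.2.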

Real-world systems. Kirchhoff’s current and voltage laws in circuit analysis produce linear systems with as many unknowns as nodes and loops in the circuit. Traffic flow at an intersection network forms a sparse linear system where each node’s flow-in equals flow-out. Balancing chemical equations is a homogeneous linear system in the molecule coefficients. Even PageRank, the algorithm behind Google’s original search ranking, is the dominant eigenvector of a stochastic matrix; it can be found by power iteration or, equivalently, by solving the linear system (I − αM)x = (1−α)v.
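The PageRank system can be illustrated on a toy graph. In the sketch below everything concrete is an assumption for illustration: the three-page link graph, the damping factor α = 0.85, the uniform vector v, and the fixed iteration count. M is taken to be column-stochastic, so column j spreads page j's rank evenly over the pages it links to.

```python
# Links: page 0 -> 1 and 2, page 1 -> 2, page 2 -> 0.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
alpha = 0.85                # damping factor (illustrative value)
v = [1 / 3, 1 / 3, 1 / 3]   # uniform teleportation vector


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


# Power iteration: x <- alpha * M x + (1 - alpha) * v, starting from the uniform vector.
x = v[:]
for _ in range(200):
    Mx = matvec(M, x)
    x = [alpha * mx + (1 - alpha) * vi for mx, vi in zip(Mx, v)]

# At the fixed point, x also satisfies the linear system (I - alpha*M) x = (1 - alpha) v.
Mx = matvec(M, x)
residual = [xi - alpha * mx - (1 - alpha) * vi for xi, mx, vi in zip(x, Mx, v)]
print(x)
print(max(abs(r) for r in residual))
```

The residual shrinking toward zero shows that the iterate and the solution of the linear system coincide, which is why either route can be used.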

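In short, a system of linear equations is written in matrix form as Ax = b, and when A is square and invertible the solution x = A⁻¹b is unique. Gaussian elimination is the standard algorithm: use row operations to bring A to upper-triangular form, then back-substitute. LU decomposition is efficient when the same A must be solved against many different b. Kirchhoff circuit analysis, traffic flow, balancing chemical equations, and even Google’s PageRank all come down to solving linear systems.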

[Figure: Ax = b, so x = A⁻¹b; e.g. 2x + 3y = 8 and x − y = 1 give x = 2.2, y = 1.2]

8. Properties & Algebraic Laws


The reason matrices feel like generalized numbers is that many — but not all — familiar algebraic laws still hold. Knowing which laws survive and which fail is essential to manipulating matrix expressions without breaking them.

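What works: associativity, A(BC) = (AB)C; distributivity over addition, A(B + C) = AB + AC; and the identity law, AI = IA = A.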

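What does NOT work: commutativity (in general AB ≠ BA); cancellation (AB = AC does not imply B = C); and the zero-product rule (AB = 0 does not imply that A or B is zero).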

The “reverse-order” identities for both transpose and inverse, (AB)ᵀ = BᵀAᵀ and (AB)⁻¹ = B⁻¹A⁻¹, trip up nearly every student first encountering them. The intuition for the inverse: the product AB applies B first and then A, so to undo it you must undo the last step first, that is, undo A and then undo B.
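Two of the laws that fail for matrices, the zero-product rule and cancellation, have small counterexamples. The sketch below uses plain Python with an illustrative `matmul` helper and hand-picked matrices.

```python
def matmul(A, B):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


ZERO = [[0, 0], [0, 0]]

# Zero-product rule fails: neither factor is zero, yet the product is.
P = [[1, 0], [0, 0]]
Q = [[0, 0], [0, 1]]
print(matmul(P, Q) == ZERO)

# Cancellation fails: AB = AC although B != C, because A is singular.
A = [[1, 0], [0, 0]]
B = [[1, 2], [3, 4]]
C = [[1, 2], [5, 6]]
print(matmul(A, B) == matmul(A, C), B == C)
```

Both counterexamples rely on a singular matrix; when A is invertible, multiplying AB = AC on the left by A⁻¹ does give B = C.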

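In short, matrix multiplication obeys the associative and distributive laws, but in general not the commutative law. Unlike ordinary numbers, AB = AC does not let you conclude B = C (left cancellation fails), and AB = 0 does not mean that A or B is zero. The fact that both transpose and inverse reverse the order of a product, (AB)ᵀ = BᵀAᵀ and (AB)⁻¹ = B⁻¹A⁻¹, is among the most frequently tested identities.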

9. Common Mistakes & Pitfalls


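The most common mistakes are dimension mismatches (the inner dimensions in multiplication, both dimensions in addition), assuming commutativity, and multiplying by an inverse on the wrong side. In code, a direct linear solve (`solve(A, b)`) is numerically more stable than computing an explicit inverse (`inv(A) @ b`), and converting between mathematical (1-based) and programming (0-based) indexing also needs care.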

10. Real-World Applications & Related Concepts

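Matrices appear in nearly every quantitative discipline: computer graphics (3D transformations), machine learning (neural-network weights), engineering (finite element analysis), quantum mechanics (Hermitian operators), economics (input-output models), statistics (regression via the normal equations XᵀXβ = Xᵀy, and PCA), and cryptography (block ciphers and lattice-based schemes).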

Once you’re comfortable with matrices, the natural next steps are vectors (matrices’ column-shaped cousins), statistics (covariance and PCA in /docs/stat/), and eigenvalues and eigenvectors (the spectral theorem and diagonalization, foundational to spectral clustering, PageRank, and quantum mechanics). Modern applications also draw heavily on probability when matrices contain random entries — as in Markov chains and random matrix theory.

Ready to practice? Use C:Matrix for interactive drills on multiplication, determinants, inverses, and Gaussian elimination, with step-by-step solutions you can compare against.

In short, matrices show up across computer graphics, machine learning, engineering, quantum mechanics, economics, statistics, and cryptography. The next steps are vectors, statistics (covariance, PCA), and eigenvalues and eigenvectors (spectral decomposition); working with random matrices also calls for probability theory. For hands-on practice, use C:Matrix.

11. Explore Each Matrix Operation in Depth


Each core matrix operation has its own focused, worked-example guide. Use these as deep dives whenever you need the formula, a step-by-step example, and the common pitfalls for one specific operation: addition, subtraction, multiplication, rectangular multiplication, transpose, determinant, inverse, systems of equations, and rank.
