un — Geometry of Error Correcting Codes

The Parity Check Matrix

Every linear code over GF(2) — the field with just 0 and 1, arithmetic modulo 2 — has a parity check matrix H. For the (7,4) Hamming code, H is a 3×7 matrix whose columns are the binary representations of 1 through 7:

``H = [ 0 0 0 1 1 1 1 ] ← bit 2 of position number [ 0 1 1 0 0 1 1 ] ← bit 1 of position number [ 1 0 1 0 1 0 1 ] ← bit 0 of position number``

Column j of H is the binary representation of j. This is why the syndrome s = Hr directly names the error position.

The map H: GF(2)^7 → GF(2)^3 is a linear map (matrix multiplication mod 2). Its kernel ker(H) = { r : Hr = 0 } is exactly the set of valid codewords.

Dimension count: dim(ker H) = 7 − rank(H) = 7 − 3 = 4. So there are 2^4 = 16 codewords. This confirms 4 message bits.

Parity Check Matrix & Syndrome Table

The Code as a Kernel

A word c ∈ {0,1}^7 is a valid codeword if and only if Hc = 0 (mod 2). This is the defining equation of a linear code.

Example: c = 0011001. Check Hc mod 2:

``Row 1: 0·0+0·0+0·1+1·1+1·0+1·0+1·1 = 0+0+0+1+0+0+1 = 2 ≡ 0 Row 2: 0·0+1·0+1·1+0·1+0·0+1·0+1·1 = 0+0+1+0+0+0+1 = 2 ≡ 0 Row 3: 1·0+0·0+1·1+0·1+1·0+0·0+1·1 = 0+0+1+0+0+0+1 = 2 ≡ 0``

Hc = 000. So 0011001 is a valid codeword.

Verify that c = 1101011 is a codeword of the (7,4) Hamming code by computing Hc mod 2 using the matrix above. Show all three row computations.

Cosets of the Code

The codewords C = ker(H) form a subgroup of GF(2)^7. Every other word in GF(2)^7 belongs to exactly one coset of C.

Coset definition: for any word e, the coset C + e = { c + e : c ∈ C }.

Two words belong to the same coset if and only if their difference is a codeword — equivalently, if they have the same syndrome: H(r₁) = H(r₂) iff H(r₁ − r₂) = 0 iff r₁ − r₂ ∈ C.

The syndrome map H induces an isomorphism GF(2)^7 / C ≅ GF(2)^3. With |C| = 16 and |GF(2)^7| = 128: there are exactly 128/16 = 8 cosets, one for each syndrome value in GF(2)^3.

Standard array: list one coset leader per coset — the word of minimum Hamming weight in that coset. For single-error correction, each non-zero syndrome uniquely corresponds to a weight-1 error pattern (a single 1 in one of 7 positions). That is why 7 error patterns + 1 no-error = 8 cosets = 2^3.

Coset Structure

Decoding via Coset Leaders

Decoding algorithm: receive r → compute s = Hr → look up coset leader e_s (the canonical error for syndrome s) → output r − e_s = r + e_s (same in GF(2)).

For the (7,4) code, the 8 coset leaders are: 0000000 (syndrome 000), 1000000 (syndrome 001), 0100000 (syndrome 010), 0010000 (syndrome 011), 0001000 (syndrome 100), 0000100 (syndrome 101), 0000010 (syndrome 110), 0000001 (syndrome 111).

A received word r = 1110101 has syndrome s = Hr = 011. Using the coset leader table above (syndrome 011 → error pattern 0010000), decode r to obtain the transmitted codeword. Then compute Hr' mod 2 for your corrected word r' to verify it is in ker(H).

Why (7,4) is Perfect

A code C ⊆ GF(2)^n with minimum distance 3 corrects all single errors. Each codeword c has a Hamming ball of radius 1: the center c and its n immediate neighbors (weight-1 error patterns). Ball volume = 1 + n.

For n = 7: ball volume = 8. With 16 codewords: 16 × 8 = 128 = 2^7. The balls tile GF(2)^7 perfectly — no point left uncovered, no point covered twice.

This is the definition of a perfect code: M · Vol(n, t) = 2^n. Perfect single-error-correcting codes (t=1) exist only for specific parameters: (7,4), (15,11), (23,12) (the binary Golay code), and trivially n = 2^r − 1, k = n − r for any r ≥ 2.

The geometry: GF(2)^7 decomposes into exactly 8 orbits under the action of the code as a coset partition. Each orbit is a coset; each coset has a unique syndrome. The syndrome map H is a bijection from cosets to GF(2)^3.

The Counting Argument

For a perfect single-error-correcting code on n = 2^r − 1 bits with r parity check bits:

- Number of codewords: M = 2^k = 2^(n−r)

- Ball volume: Vol(n, 1) = 1 + n = 1 + 2^r − 1 = 2^r

- Total covered: M × Vol = 2^(n−r) × 2^r = 2^n ✓

The fact that Vol(n,1) = 2^r when n = 2^r − 1 is what makes perfect codes possible at these lengths. At other lengths, 1 + n is not a power of 2, and perfect single-error-correcting binary codes cannot exist.

Verify the perfect-code count for the (15,11) Hamming code: n = 15, k = 11, r = 4 parity bits. Compute M, Vol(15,1), and M × Vol. Then explain why perfect codes cannot exist for n = 10, t = 1. Show your reasoning about what Vol(10,1) would need to be.

Generator Matrix & Duality

The parity check matrix H defines the code via the kernel. Its dual: a generator matrix G whose rows form a basis for ker(H).

For the (7,4) code: G is a 4×7 matrix. Any codeword c = mG for some message vector m ∈ GF(2)^4. The rows of G span the code; the rows of H span the null space of the code.

Key relationship: HG^T = 0 (mod 2). Every row of G is in ker(H); every row of H annihilates every row of G.

Dual code: C⊥ = ker(G^T) = image(H^T). For the (7,4) code, the dual is a (7,3) code — a [7,3,4] code with minimum distance 4.

The geometry of duality: C and C⊥ are orthogonal subspaces of GF(2)^7. The entire space decomposes into the code, its cosets, and the dual structure that the syndrome map encodes.

The (7,4) Hamming code has C⊥ = a [7,3,4] code. What does minimum distance 4 of C⊥ tell you about error detection in C⊥? Then explain: why does the dual code have only 2^3 = 8 codewords while the original (7,4) code has 16?