un — Geometry of Unhamming

un

guest

1 / ?

back to lessons

Hamming's Geometric Intuition

Hamming Saw Geometry Everywhere

Hamming's Ch. 9 opens with a warning: geometric intuition collapses in high dimensions. At 3D, a sphere fills most of its enclosing cube. At 10D, the sphere occupies less than 0.2% of the cube's volume. At 100D, the fraction rounds to zero. Volume concentrates near the surface. Points cluster near corners, not the center.

His error-correcting codes exploited this directly. Hamming codes pack codewords into n-dimensional binary space such that every codeword is surrounded by a sphere of correctable errors. The geometry determines how many errors you can correct. Sphere packing in n-space is a math problem with real stakes: pack the spheres more densely, correct more errors.

The same geometric failure mode applies to algorithmic complexity. At small N, O(N²) and O(N) look nearly identical on a graph. The gap between them seems manageable. At large N, the gap explodes — exactly as the sphere's volume fraction collapses in high dimensions. What feels tractable at small scale becomes impossible at scale.

The Shape of Each Complexity Class

Drawing Complexity

Each complexity class has a geometric shape:

- O(1): a horizontal line. Same cost regardless of N. No slope.

- O(log N): a gentle upward curve that flattens. Doubles every time N squares. Grows proportionally to the number of digits in N.

- O(N): a diagonal line through the origin. Cost grows proportionally to N.

- O(N log N): slightly steeper diagonal. A line that curves very gently upward.

- O(N²): a parabola. At N=100: 10,000 operations. At N=1,000: 1,000,000 operations.

Complexity Growth Curves

The critical insight: the ratio between two complexity classes is itself a function of N. At N=10, O(N²) costs 10× more than O(N). At N=1,000, O(N²) costs 1,000× more. At N=1,000,000, it costs 1,000,000× more. The gap does not just grow — it grows at the rate of N itself.

This is the geometric argument for why MOAD-0001 patches matter. Moving a dependency resolver from O(N²) to O(N) is not a constant speedup. At N=50,000 packages, it is a 50,000× speedup. At N=100,000, it is a 100,000× speedup. The value of the patch grows with the problem size.

The Widening Gap

O(N²) and O(N) both produce correct results.

At N=10: O(N²) costs 100 operations, O(N) costs 10 operations. Ratio: 10×.

At N=1,000: O(N²) costs 1,000,000 operations, O(N) costs 1,000 operations.

How many times slower is O(N²) compared to O(N) at N=1,000? What is the geometric shape that describes the widening gap between these two curves as N grows?

Graphs as Geometry

The Directed Acyclic Graph

A DAG (directed acyclic graph) is a geometric structure: nodes connected by directed edges with no cycles. Software dependency graphs, build pipelines, and data flow architectures all form DAGs.

Factory Model DAG with Workaholic & Glutton Nodes

Each node carries geometric properties measured by counting its edges:

- In-degree: number of edges arriving at the node. High in-degree means many upstream dependencies feed this node.

- Out-degree: number of edges leaving the node. High out-degree means many downstream consumers depend on this node.

- Betweenness: in-degree + out-degree. Measures how many paths pass through this node. A high-betweenness node sits at a crossroads in the graph.

- Surge score: speedup × in-degree. Measures how much work floods downstream when this bottleneck clears.

The factory model gives these geometric properties physical meaning:

- High betweenness + high surge score = workaholic node. Remove this bottleneck without staging downstream = collapse.

- High out-degree + low surge score = glutton node. Consumes without producing. The machine that forgets to halt.

Surge & Betweenness in Practice

Reading a DAG

Consider a 5-node chain: A → B → C → D → E, plus an additional edge B → D.

- A: in-degree 0, out-degree 1, betweenness 1. Source node. Nothing feeds it. One consumer.

- B: in-degree 1 (from A), out-degree 2 (to C and to D), betweenness 3. Feeds two downstream nodes — a fan-out point.

- C: in-degree 1 (from B), out-degree 1 (to D), betweenness 2. A pass-through node.

- D: in-degree 2 (from B and from C), out-degree 1 (to E), betweenness 3. Receives from two paths.

- E: in-degree 1 (from D), out-degree 0, betweenness 1. Sink node.

B and D share the highest betweenness (3). B is the fan-out: it feeds two nodes. D is the convergence point: it receives from two paths. After a MOAD-0001 patch at C, D receives surge from both the B→D path and the C→D path simultaneously.

Calculating Node Metrics

A dependency graph: A → B → C → D → E (a chain), plus an additional edge B → D.

Node C has a measured speedup of 50× after a MOAD-0001 patch.

Calculate C's in-degree, out-degree, betweenness, and surge score. Which node faces the highest risk of MOAD-0005 after the patch, and why?

The Flatland Defect

MOAD-0007: Spatial Data Queried as a Flat List

MOAD-0007 (the Flatland Defect): spatial data queried as a flat list ignores the geometric structure of the data. Every query checks all N objects. O(N) per query. With M queries: O(M × N). At scale: catastrophic.

BVH vs Flat List Spatial Query

A real-time raycaster checks every object in a scene against every ray. At 60 frames per second, with 100 rays per frame and 10,000 scene objects: 60,000,000 intersection tests per second. All of them avoidable.

The geometric insight: space has structure. Objects cluster. A ray that misses the left half of the scene misses every object in the left half. A flat list cannot exploit this — it checks every object regardless of spatial location. A spatial data structure can.

The BVH: Binary Search in 3D

How a BVH Works

A BVH (Bounding Volume Hierarchy) decomposes 3D space into a tree of nested bounding boxes. Each internal node holds a bounding box that contains all of its children's geometry.

A raycast query:

1. Test the root bounding box. If the ray misses, exit immediately — the entire scene is pruned.

2. If the ray hits, recurse into children. Test each child's bounding box.

3. For each child that misses: prune that subtree. For each child that hits: recurse deeper.

4. At leaf nodes: test the actual geometry.

Geometry of pruning: at each level, non-intersecting branches get eliminated. With a balanced BVH of depth d: 2^d leaf nodes exist, but a single ray needs at most O(log N) comparisons for the pruned path.

This is the same geometric argument as binary search: each comparison halves the remaining search space. Binary search halves a sorted array. A BVH halves the visible 3D space. The structure is identical — a balanced binary tree with pruning at each node.

A MOAD-0007 fix: replace the flat list with a BVH. Move from O(N) per query to O(log N) per query. At N=1,024 objects, O(log₂ 1,024) = 10 comparisons instead of 1,024.

Calculating BVH Speedup

A game scene has 1,024 objects.

Without a BVH: a raycast checks all 1,024 objects.

With a balanced BVH of depth 10 (2^10 = 1,024 leaves): a raycast traverses at most 10 levels, with 2 child comparisons per level.

Estimate the maximum number of bounding box checks the BVH needs for one raycast, and calculate the speedup compared to the flat scan.