un — MOAD-0001: The Sedimentary Defect

un

guest

1 / ?

back to lessons

How Code Sediment Forms

Sedimentary rock forms by deposition, not explosion. Each layer: correct for its time. Each layer: older than the one above. The oldest layers calcified before the ecosystem matured around them. No error caused them. Time caused them.

MOAD-0001 works the same way.

MOAD-0001 Sediment Layers: code copied across decades as N grows

The Formation Story

A graph traversal written in 1993:

List<Node> visited = new ArrayList<>();
Stack<Node> stack = new Stack<>();
stack.push(start);
while (!stack.isEmpty()) {
    Node n = stack.pop();
    if (!visited.contains(n)) {  // O(N) linear scan
        visited.add(n);
        for (Node neighbor : n.getNeighbors())
            stack.push(neighbor);
    }
}

This code: correct. Tests: passing. In 1993, graphs had 50 nodes.

50 nodes: 50 × 25 avg scanned = 1,250 operations. Invisible.

The code entered version control. Got copied into tutorials. Got wrapped in libraries. Got shipped in build tools, package managers, & dependency resolvers. By 2024, the same pattern ran on dependency graphs with 50,000 nodes.

50,000 nodes: 50,000 × 25,000 avg scanned = 1,250,000,000 operations.
A 1-second build becomes 17 minutes.

The code did not change. The world around it grew. Every layer of the sediment: correct at deposition. Expensive at excavation.

Your Sediment

Sedimentary code does not contain a logic error. It contains a performance assumption that stopped holding as scale changed. The code produces correct results; only the cost changed.

This distinction matters for diagnosis. A logic error produces wrong answers. A sedimentary defect produces correct answers at unacceptable cost.

Sedimentary code: correct code that grows expensive as scale changes around it. Give a concrete example from software you have used or written where something worked at small scale & became a bottleneck at large scale. What changed: the code, the data size, or the usage pattern?

Five Forms of MOAD-0001

MOAD-0001 appears in five documented forms across 60+ software ecosystems. The structure: a membership test using a sequential container, nested inside a loop over the same or related collection.

The Five Forms

Domain	Pattern	Sequential Container	Correct Container
Graph traversal	`if (!stack.contains(n))` in DFS/Tarjan	ArrayList	HashSet
Route/event dedup	`TAILQ_FOREACH` strcmp per request	linked list	HashMap
Collision tracking	`findLinearSearch()` per physics tick	array	unordered_set
Resource allocation filter	`List.contains()` in stream filter	ArrayList	HashSet
A* pathfinding open-list	`LocalVector::find()` per neighbor	vector	stored heap index

All five share the same structure: a membership check nested in a loop, using a container that requires a linear scan to answer the membership question.

The Scanning Keyword List

Auditing for MOAD-0001 means grepping for membership-test keywords inside loops:

- Python: in with a list variable, .count(), list.index()

- Java: ArrayList.contains(), List.contains(), LinkedList.contains()

- JavaScript: Array.includes(), Array.indexOf(), Array.find()

- C++: std::vector::find(), manual loop with == comparison

- Go: slices.Contains(), manual loop over a slice

The fix in every case: replace the sequential container with a hash container. list → set. ArrayList → HashSet. Array → Set. vector → unordered_set. slice → map[T]struct{}.

One keyword. One substitution. Zero behavioral change. 1,000× speedup at N=1,000.

Audit a Code Fragment

Apply MOAD-0001 pattern recognition to a real code fragment.

You audit a JavaScript codebase & find this inside a loop that runs over all graph neighbors: `if (!openSet.includes(neighbor)) openSet.push(neighbor)`. Is this MOAD-0001? What would you replace `openSet` with, & what does the complexity change from O(?) to O(?)?

One Line, Four Languages

The fix for MOAD-0001 in four languages:

# Python
visited = []       # BEFORE: O(N) membership
visited = set()    # AFTER:  O(1) membership

// Java
List<Node> visited = new ArrayList<>();    // BEFORE
Set<Node> visited = new HashSet<>();       // AFTER

// JavaScript
const visited = [];      // BEFORE: Array.includes() O(N)
const visited = new Set(); // AFTER: Set.has() O(1)
// visited.push(n) → visited.add(n)
// visited.includes(n) → visited.has(n)

// Go
visited := []NodeID{}              // BEFORE: slices.Contains() O(N)
visited := make(map[NodeID]struct{}) // AFTER: map lookup O(1)
// _, ok := visited[n]  for membership check
// visited[n] = struct{}{}  for insertion

In every case: same behavior, same output, same tests passing. Only the growth rate changes.

What the Fix Does NOT Change

- The algorithm's logical behavior: depth-first still visits depth-first

- Output correctness: the same nodes get visited in the same order (within DFS)

- Test results: any test that checks correctness continues to pass

What the Fix DOES Change

- Membership test: O(N) → O(1)

- Loop total: O(N²) → O(N)

- At N=1,000: 1,000× faster

- At N=10,000: 10,000× faster

One Limit: Order

Hash containers (set, HashSet, unordered_set) do not preserve insertion order. In Python 3.7+, dict preserves insertion order; plain set does not. In Java, HashSet does not preserve order; LinkedHashSet does.

If order matters alongside membership: maintain two structures. A set (or HashSet) for O(1) membership tests. A separate list (or ArrayList) for ordered traversal. Or use LinkedHashSet in Java, which provides both.

For MOAD-0001 in graph traversal: visited answers a binary question (has this node been seen before?). Order does not matter for the membership question. The traversal order comes from the stack or queue, not from visited.

Membership vs Ordering

In Tarjan's strongly connected components algorithm, onStack tracks which nodes remain on the current DFS call stack. It must answer two questions: (1) membership — is this node currently on the stack? (2) at the end of a DFS path, pop nodes off in order until this node appears.

A naive implementation uses a List for onStack, answering the membership question with contains() — O(N) per check, O(N²) total for large graphs.

You fix a Tarjan SCC implementation by replacing `onStack = []` with `onStack = set()`. The tests pass. A code reviewer asks: 'but what if order matters for onStack?' How do you answer?

Why This Is a Disclosure, Not a Patch

MOAD-0001 exists in over 1,000 verified sites across 60+ software ecosystems. Graph traversal in build tools, route deduplication in network routing daemons, collision detection in game engines, resource allocation in operating system schedulers.

Each site: correct code. Each site: O(N²) growth waiting for N to cross a threshold.

The Disclosure Pipeline

A patch without upstream disclosure fixes one site. The pattern recurs in the next version, the next library, the next language port. Disclosure: name the pattern, document it as CWE-407 (Algorithmic Complexity: Inefficient Algorithmic Complexity), publish the recognition heuristics & the one-line fix. Then every developer who reads the disclosure can recognize & fix their own instance.

Hamming called this 'giving a fish vs teaching fishing.' The patch gives a fish. The disclosure — MOAD-0001 named, pattern documented, fix generalized — teaches the heuristic.

The Permacomputer Extension

Hamming worked on point solutions: fix this filter, improve this code. Unhamming adds the disclosure layer: name the pattern, publish the heuristic, & give it to the commons.

The compound-knowledge principle applies here at ecosystem scale. One researcher names a pattern. One hundred developers recognize it in their own codebases. One thousand lines of O(N²) code become O(N). The infrastructure gets faster for everyone who builds on it.

This is the dragon growing: same fire (work on important problems, compound knowledge, systems thinking), different flight (open disclosure, commons ownership, no patron required).