un — Algorithms in Practice: Code Review & Complexity Defects

un

guest

1 / ?

back to lessons

Reading Code for Growth Rates

Code Review for Correctness vs Code Review for Complexity

Code review for correctness catches logic errors: wrong conditions, off-by-one indexes, missing null checks. Complexity-aware code review catches a different class of defect: code that works correctly at N = 100 but grows catastrophically at N = 100,000.

MOAD-0001 Pattern: list seen O(N²) vs set seen O(N) — one-line fix

This lesson connects to a deeper investigation in the unhamming course: The Missing Chapter: Algorithmic Complexity covers Big O in the context of production defects, & MOAD-0001: The Sedimentary Defect maps the pattern across 60+ software ecosystems.

Two Review Heuristics

Nested loops multiply complexity. Two nested loops over N items produce O(N²). Three produce O(N³). When reviewing: count loop nesting depth first.

Sequential container inside a hot loop. Any .contains(), .includes(), .find(), or in list check inside a loop costs O(N) per check. Over N iterations: O(N²) total. This pattern appears in production code across dozens of ecosystems — the GHC Haskell compiler, Python pip, Apache Maven, Rust Cargo, TypeScript, Godot. Same mistake, different codebases.

Review: find_duplicates

Review the following Python function for complexity:

def find_duplicates(items):
    seen = []
    duplicates = []
    for item in items:
        if item in seen:
            duplicates.append(item)
        else:
            seen.append(item)
    return duplicates

Perform a complexity-aware code review: (a) What is the time complexity of this function? (b) Why? Identify the exact line responsible. (c) Rewrite it to O(N).

MOAD-0001: The Sedimentary Defect

The Same Defect, Sixty Ecosystems

The pattern if x not in list: list.append(x) inside a loop appears — has appeared — in production code across dozens of software ecosystems:

- GHC (Haskell compiler): dependency resolver, O(N²) at N = 50,000 packages, 17-minute builds

- Python pip: dependency conflict resolution

- Apache Maven: classpath deduplication

- Rust Cargo: feature unification

- TypeScript compiler: module resolution

- Godot / Redot game engine: node graph traversal

None of these teams made a mistake. They wrote correct code at small N. N grew. The code calcified. The pattern has a name: MOAD-0001, The Sedimentary Defect. Sediment: correct, harmless in thin layers. Over time, the layers accumulate and harden.

The Fix

In every case: replace the sequential container with a hash container. One line. Zero behavioral change at correct inputs. 100–1,000× speedup at real-world N.

The fix works because two operations need to stay fast:

1. Membership check: has this element been seen before?

2. Ordered output: return elements in the order they appeared (sometimes required, sometimes not).

A set handles (1) in O(1). If (2) also matters, keep a separate list for ordered output and a set for the membership check. Two data structures, each doing one job.

Responding to a Reviewer

A pull request fixes MOAD-0001 in a project's graph traversal function. A reviewer comments:

> "Sets don't preserve insertion order — this changes behavior."

How do you respond? Explain when the reviewer's concern applies, and when it does not. Propose a solution that satisfies both concerns when they do conflict.

The Interview Analysis Pattern

The Expected Format

Algorithmic complexity analysis appears in software engineering interviews. A strong answer follows a four-part pattern:

1. State the current complexity — O(?) for time, O(?) for space.

2. Explain why — identify the specific construct responsible (nested loop, linear scan in loop, recursive branching).

3. Propose an optimization — name the change and the data structure or algorithm it introduces.

4. State the new complexity — after the fix, what is the time & space complexity?

Example:

Code: [function that checks membership in a list inside a loop]
Current: O(N²) — `item in seen_list` is O(N) per check × N iterations
Optimization: replace seen_list with seen_set (hash set)
After: O(N) — set membership check is O(1)

This skill transfers beyond interviews: complexity-aware code review, architecture design, performance debugging, security audits. Any system that processes inputs of variable size benefits from it.

This lesson connects forward to the unhamming course, which applies this exact pattern to production defect investigation across 60+ open-source projects.

Interview: Analyze has_common_element

Apply the four-part interview format to this function:

def has_common_element(list_a, list_b):
    for item in list_a:
        for other in list_b:
            if item == other:
                return True
    return False

State current complexity, explain why, propose an optimization, state new complexity.