
Log-Log Plots & Saturation

Computing speed followed an exponential growth curve for 50 years. On a log-linear plot (log speed vs linear time), exponential growth appears as a straight line whose slope b is the growth rate in orders of magnitude per year.

Physical limits impose a horizontal ceiling: a maximum speed S_max determined by molecular size, light-speed, & heat constraints. As the exponential approaches S_max, growth must slow.

Logistic Saturation

A common model for growth with a ceiling:

S(t) = S_max / (1 + e^(−r(t − t₀)))

This is the logistic equation applied to technology. At early times (t << t₀): S(t) ≈ S_max × e^(r(t−t₀)) — pure exponential. Near the ceiling (t >> t₀): S(t) → S_max asymptotically.

The geometry: a straight line on log-linear coordinates bends over near the ceiling, producing an S-shape when viewed on linear-linear coordinates.
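This bend can be seen numerically. A minimal sketch, using illustrative parameters chosen to echo the lesson's numbers (slope 0.09 orders of magnitude per year, ceiling 10¹² ops/sec); the inflection time t₀ is an assumption, picked so that S(0) ≈ 1 op/sec:

```python
import math

# Illustrative parameters (assumptions for this sketch):
S_MAX = 1e12                    # ceiling, ops/sec
r = 0.09 * math.log(10)         # natural growth rate (~0.207 per year)
t0 = 12 * math.log(10) / r      # inflection chosen so S(0) ~ 1 op/sec

def logistic(t):
    """S(t) = S_max / (1 + e^(-r (t - t0)))."""
    return S_MAX / (1.0 + math.exp(-r * (t - t0)))

# Early: log10(S) climbs linearly (a straight line on log-linear axes);
# late: the curve bends over and flattens at log10(S_max) = 12.
for t in (0, 50, 100, 133, 200):
    print(f"t = {t:3d}  log10(S) = {math.log10(logistic(t)):6.2f}")
```

Printing log10(S) at evenly spaced times shows the straight-line phase (log10(S) increasing by the same amount per step) giving way to the flat asymptote.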

Hardware Geometry: Amdahl's Law & Light-Speed Sphere

When Does Growth Saturate?

Suppose single-processor speed grows at 10^(0.09t) starting from 10⁰ ops/sec in 1940. Physical ceiling: S_max = 10^(12) ops/sec (a rough estimate for a single-core processor, limited by thermal & light-speed constraints).

Using S(t) = 10^(0.09·t) (t = years since 1940), find the year when S(t) reaches 10¹⁰ (approaching the ceiling). Show the algebra. Then compute: how many doublings of speed occur between 1952 (IBM 701) and that ceiling year? Use doubling time = ln(2)/k, where k = 0.09 × ln(10) is the natural (base-e) growth rate corresponding to a slope of 0.09 orders of magnitude per year.
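After working the algebra by hand, a short script can check the arithmetic (all constants come from the problem statement):

```python
import math

b = 0.09           # growth rate, orders of magnitude per year
ceiling_oom = 10   # target: 10^10 ops/sec

# Solve 10^(b*t) = 10^10  =>  b*t = 10  =>  t = 10 / b
t_ceiling = ceiling_oom / b        # ~111 years after 1940
year_ceiling = 1940 + t_ceiling    # ~2051

# Speed doubles when b*t increases by log10(2),
# so the doubling time is ln(2) / (b * ln(10)).
t_double = math.log(2) / (b * math.log(10))   # ~3.34 years

doublings = (year_ceiling - 1952) / t_double  # ~29.6 doublings
print(year_ceiling, t_double, doublings)
```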

Maximum Communication Radius

A processor's clock speed determines the maximum radius within which it can communicate in one cycle. Signals travel at approximately 2×10⁸ m/s in copper.

For a clock period T (in seconds), the maximum one-way communication radius:

r_max = v × T / 2

(divide by 2 for round-trip: signal must go out and return within T)

As clock speed increases, T decreases, so r_max shrinks. This geometric constraint forces designers either to pack components closer together (shrinking the usable area) or to accept multi-cycle latency for off-chip communication.
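A small sketch of how r_max falls with clock frequency (the signal speed is the lesson's copper figure; the specific frequencies are arbitrary examples):

```python
V_SIGNAL = 2e8  # signal speed in copper, m/s (from the lesson)

def r_max(clock_hz):
    """Maximum one-way communication radius for a round trip in one cycle."""
    T = 1.0 / clock_hz      # clock period, seconds
    return V_SIGNAL * T / 2  # divide by 2: out and back within T

for f in (100e6, 1e9, 2e9, 4e9):
    print(f"{f/1e9:.1f} GHz -> r_max = {100 * r_max(f):.1f} cm")
```

At 100 MHz the radius is a full meter; at a few GHz it is down to a few centimeters, which is why fast logic must live on one chip.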

The Sphere of Influence

All components reachable within one clock cycle form a sphere of radius r_max centered on the processor. Volume: V = (4/3)π r_max³.

If component density is ρ (components/m³), the number of components reachable within one cycle: N = ρ × V = ρ × (4/3)π r_max³.

As r_max shrinks with increasing clock speed, N shrinks cubically — a 2× clock speed increase shrinks the reachable component count by (1/2)³ = 1/8.

Reachable Components per Clock Cycle

A 1993-era workstation runs at 100 MHz (T = 10 ns). Signal speed = 2×10⁸ m/s. Component density on a circuit board ≈ 10⁸ components/m³ (rough estimate including chips, resistors, capacitors).

A modern GPU runs at 2 GHz (T = 0.5 ns).

For the 100 MHz workstation (T=10 ns, v=2×10⁸ m/s, ρ=10⁸/m³): compute r_max, then V = (4/3)π r_max³, then N = ρ × V. Then compute the same N for a 2 GHz processor (same ρ, same v). What is the ratio N(100MHz) / N(2GHz)?
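The derivation is yours to show; a sketch of the computation, using only the given values, for checking your answer:

```python
import math

v = 2e8      # signal speed, m/s
rho = 1e8    # component density, components/m^3

def reachable(T):
    """Components within the one-cycle sphere for clock period T (seconds)."""
    r = v * T / 2                        # round-trip radius
    return rho * (4.0 / 3.0) * math.pi * r**3

n_workstation = reachable(10e-9)   # 100 MHz: r = 1 m
n_gpu = reachable(0.5e-9)          # 2 GHz:   r = 0.05 m
print(n_workstation, n_gpu, n_workstation / n_gpu)  # ratio = 20^3 = 8000
```

Since the radii differ by a factor of 20, the counts differ by 20³ = 8000, exactly the cubic scaling described above.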

The Parallel Speedup Limit

Single-processor speed approaches physical ceilings. The industry response: parallel architectures. Amdahl's Law quantifies the speedup achievable from parallelism.

Amdahl's Law

Suppose a fraction f of a program can be parallelized, and fraction (1−f) must run serially. With p processors:

Speedup(p) = 1 / ((1−f) + f/p)

As p → ∞: Speedup_max = 1 / (1−f)

The serial fraction (1−f) sets a hard ceiling on achievable speedup, regardless of how many processors you add.

Geometric insight: Speedup as a function of p follows a hyperbolic curve. The asymptote is 1/(1−f). For f = 0.9, max speedup = 10, even with infinite processors. For f = 0.99, max speedup = 100.
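The hyperbolic shape and its asymptote are easy to see numerically. A minimal sketch, using the f values from the text:

```python
def amdahl_speedup(f, p):
    """Amdahl's Law: speedup with parallel fraction f on p processors."""
    return 1.0 / ((1.0 - f) + f / p)

for f in (0.9, 0.99):
    ceiling = 1.0 / (1.0 - f)
    # Even huge p only creeps toward the asymptote 1/(1-f).
    print(f"f={f}: p=10 -> {amdahl_speedup(f, 10):.2f}, "
          f"p=1000 -> {amdahl_speedup(f, 1000):.2f}, "
          f"ceiling -> {ceiling:.0f}")
```

For f = 0.9, going from 10 to 1000 processors moves the speedup only from about 5.3 to about 9.9, against a ceiling of 10.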

Hamming's implicit lesson: the interest in parallel architectures was real, but the payoff depended entirely on how parallelizable the workload was — a fact many parallel computing optimists ignored.

Computing Parallel Speedup

A scientific simulation runs in 1000 seconds on one processor. Profiling reveals: 200 seconds in a serial initialization phase (cannot be parallelized); 800 seconds in a parallel computation phase.

Compute the parallel fraction f. Using Amdahl's Law, compute Speedup(4), Speedup(16), Speedup(∞). Show each formula application. Then interpret: is adding processors from 16 to ∞ worth the hardware cost? What does the geometry tell you about diminishing returns?
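After showing each formula application by hand, a short script can verify the numbers (all constants come from the problem statement):

```python
def speedup(f, p):
    """Amdahl's Law: 1 / ((1-f) + f/p)."""
    return 1.0 / ((1.0 - f) + f / p)

f = 800 / 1000   # parallel fraction from the profile
for p in (4, 16):
    print(f"Speedup({p}) = {speedup(f, p):.2f}")
print(f"Speedup(inf) = {1 / (1 - f):.1f}")   # serial ceiling
```

Note how close Speedup(16) already is to the infinite-processor ceiling, which is the geometric statement of diminishing returns.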