Derivation of the Logistic Equation

Hamming's S-curve has a precise mathematical derivation. Let P(t) be the fraction of the market that has adopted the technology by time t, so 0 < P < 1. Start with two observations about technology adoption:

1. Adoption rate accelerates with current adoption (word-of-mouth, network effects): dP/dt ∝ P

2. Adoption rate decelerates as the market saturates: dP/dt ∝ (1 − P)

Combine: dP/dt = r · P · (1 − P), where r > 0 is the growth-rate constant.

This is the logistic differential equation. It is separable: partial-fraction decomposition allows direct integration.

Derivation

Separate variables: dP / [P(1−P)] = r dt

Partial fractions: 1/[P(1−P)] = 1/P + 1/(1−P)

Integrate both sides: ln(P) − ln(1−P) = rt + C

ln[P/(1−P)] = rt + C

P/(1−P) = e^(rt+C) = e^C · e^(rt)

Let K = e^C. Solve for P: P = K·e^(rt) / (1 + K·e^(rt))

Equivalently: P(t) = 1 / (1 + e^(−r(t − t₀)))

where t₀ = −(ln K)/r is the inflection point. (Matching K·e^(rt) = e^(r(t − t₀)) requires ln K = −r·t₀.)
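As a sanity check on the derivation, the closed form can be compared against a direct numerical integration of dP/dt = r·P·(1 − P). The sketch below is illustrative only: the values r = 0.5 and P(0) = 0.01, the helper names, and the choice of a classical Runge-Kutta integrator are assumptions, not part of the lesson. Here t₀ is fixed by requiring the closed form to pass through P(0).

```python
import math

def logistic(t, r, t0):
    """Closed-form solution P(t) = 1 / (1 + exp(-r (t - t0)))."""
    return 1.0 / (1.0 + math.exp(-r * (t - t0)))

def integrate_logistic(p0, r, t_end, steps=10_000):
    """Integrate dP/dt = r P (1 - P) from t = 0 to t_end with classical RK4."""
    def f(p):
        return r * p * (1.0 - p)

    h = t_end / steps
    p = p0
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p

# Illustrative (assumed) parameters, not taken from the lesson.
r, p0 = 0.5, 0.01
# Choose t0 so that the closed form satisfies logistic(0) == p0.
t0 = math.log((1.0 - p0) / p0) / r

numeric = integrate_logistic(p0, r, t_end=20.0)
exact = logistic(20.0, r, t0)
print(f"numeric = {numeric:.8f}, closed form = {exact:.8f}")
```

The two values should agree to many decimal places, which is what one expects if the separation-of-variables steps are correct.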

Inflection Point

At t = t₀, P = 0.5 and the second derivative d²P/dt² = 0, so the growth rate dP/dt is at its maximum. Before t₀ the curve is concave up (accelerating); after t₀ it is concave down (decelerating).
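The inflection-point claims can be checked numerically. The sketch below assumes illustrative values r = 0.5 and t₀ = 10, and uses the expression d²P/dt² = r²·P·(1 − P)·(1 − 2P), which follows from differentiating r·P·(1 − P) with the chain rule; it is not stated in the lesson itself.

```python
import math

def logistic(t, r, t0):
    """P(t) = 1 / (1 + exp(-r (t - t0)))."""
    return 1.0 / (1.0 + math.exp(-r * (t - t0)))

def growth_rate(t, r, t0):
    """First derivative dP/dt = r P (1 - P)."""
    p = logistic(t, r, t0)
    return r * p * (1.0 - p)

def second_derivative(t, r, t0):
    """d2P/dt2 = r^2 P (1 - P) (1 - 2P), obtained via the chain rule."""
    p = logistic(t, r, t0)
    return r * r * p * (1.0 - p) * (1.0 - 2.0 * p)

# Illustrative (assumed) parameters.
r, t0 = 0.5, 10.0

for t in (t0 - 2.0, t0, t0 + 2.0):
    print(f"t = {t:5.1f}  P = {logistic(t, r, t0):.4f}  "
          f"dP/dt = {growth_rate(t, r, t0):.4f}  "
          f"d2P/dt2 = {second_derivative(t, r, t0):+.4f}")
```

At t = t₀ the growth rate equals r/4, and the sign of the second derivative flips from positive to negative across t₀.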

Computer Applications Geometry: Metcalfe & Optimization Landscape

Fitting the Logistic to Data

Given two data points on a logistic curve, you can solve for both r and t₀.
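One way to see why two points suffice: the log-odds ln[P/(1 − P)] = r(t − t₀) is a straight line in t, and two points determine a line. The sketch below is a minimal illustration with a hypothetical function name and made-up data points (P(0) = 0.2, P(10) = 0.8), chosen so the answer is easy to check by symmetry.

```python
import math

def logit(p):
    """Log-odds ln[p / (1 - p)], defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def fit_logistic_two_points(t1, p1, t2, p2):
    """Return (r, t0) for the logistic curve through (t1, p1) and (t2, p2).

    Uses logit(P) = r (t - t0): the slope of the line gives r,
    and the time at which the line crosses zero gives t0.
    """
    r = (logit(p2) - logit(p1)) / (t2 - t1)
    t0 = t1 - logit(p1) / r
    return r, t0

# Hypothetical data points, symmetric about P = 0.5.
r, t0 = fit_logistic_two_points(0.0, 0.2, 10.0, 0.8)
print(f"r = {r:.4f}, t0 = {t0:.4f}")
```

Because the two sample points are mirror images about P = 0.5, the fitted t₀ lands midway between them.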

Internet adoption: P(1995) = 0.01 (1% of US households), P(2005) = 0.70 (70%).

Using P(t) = 1/(1 + e^(−r(t−t₀))), take the log-odds of each data point to get two equations of the form ln[P/(1−P)] = r(t−t₀), one from P(1995)=0.01 and one from P(2005)=0.70. Subtract the two equations to eliminate t₀ and solve for r, then substitute r back into either equation to find t₀. Show all algebra. What do your r and t₀ values predict for P(2010)?

Network Value as a Geometric Count

Hamming noted that applications drove computing adoption more than hardware or software. Network-dependent applications follow a specific growth model: their value increases faster than their cost.

Metcalfe's Law

The value of a network with n users is proportional to the number of possible connections between users:

V(n) = k · n(n−1)/2 ≈ k · n²/2 (for large n)

where k is the value of one connection. Cost of a network: C(n) ∝ n (roughly linear in user count).

Value-to-cost ratio: V/C ∝ n²/n = n. As n grows, the ratio grows linearly. A network with 10x more users delivers roughly 100x more value at only 10x the cost.

Geometric Picture

With n nodes, the number of edges in a complete graph K_n is C(n,2) = n(n−1)/2. This is a combinatorial formula: choose 2 nodes from n. For n=10: C(10,2)=45. For n=100: C(100,2)=4950. For n=1000: C(1000,2)=499,500.
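The edge counts and the linear growth of the value-to-cost ratio can be reproduced in a few lines. In this sketch, `metcalfe_value` and `value_to_cost` are hypothetical helper names, and the per-connection value k and per-user cost c are both set to 1 purely for illustration.

```python
from math import comb

def metcalfe_value(n, k=1.0):
    """Network value V(n) = k * C(n, 2) = k * n (n - 1) / 2."""
    return k * comb(n, 2)

def value_to_cost(n, k=1.0, c=1.0):
    """Ratio V(n) / C(n), assuming a linear cost C(n) = c * n (c is hypothetical)."""
    return metcalfe_value(n, k) / (c * n)

for n in (10, 100, 1000):
    print(n, comb(n, 2), value_to_cost(n))
# 10 45 4.5
# 100 4950 49.5
# 1000 499500 499.5
```

With k = c = 1 the ratio is exactly (n − 1)/2, so each tenfold increase in users raises it roughly tenfold.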

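The S-curve and Metcalfe's Law interact: during Phase 2 rapid adoption, n grows rapidly, and V(n) grows as n². With V ∝ P² and dP/dt = r·P·(1 − P), the second derivative is d²V/dt² ∝ P²(1 − P)(2 − 3P), so the value inflection occurs at P = 2/3, after the adoption inflection at P = 1/2. Value is still accelerating when adoption growth has already begun to slow, which keeps pulling in more adoption in a positive feedback loop.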

Network Value at Different Adoption Levels

Email adoption: n = 100,000 users in 1985 and n = 30,000,000 users in 1995, with k = $0.01 per connection-year in both years.

Compute V(1985) = k · n(n−1)/2 and V(1995) = k · n(n−1)/2 using the given values. What is the ratio V(1995)/V(1985)? Then compute the user growth ratio n(1995)/n(1985). What does the ratio of value growth to user growth tell you about why email became indispensable so suddenly in the early 1990s?

Optimization as Geometry

Hamming's Boeing tape story describes an optimization failure with precise geometric meaning. Optimization of a function f(x) on a landscape requires:

1. A well-defined function f: the objective (drag, cost, time-to-market)

2. A fixed landscape: f evaluated at the same state each time

3. A gradient: the direction of steepest improvement

When the landscape changes between measurements, the gradient you estimate may point in a direction that no longer exists when you take the next step. You are computing gradient(f₁) but stepping in landscape f₂.

Gradient Descent

Standard gradient descent: x_{t+1} = x_t − α ∇f(x_t)

where α = step size (learning rate), ∇f = gradient vector (partial derivatives).
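The update rule can be sketched on a fixed landscape as follows. The objective f(x, y) = (x − 1)² + 2(y + 3)², the starting point, the step size α = 0.1, and the step count are all assumptions chosen for illustration; they do not come from Hamming's example.

```python
def gradient_descent(grad, x0, alpha, steps):
    """Iterate x_{t+1} = x_t - alpha * grad(x_t) for a fixed number of steps."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x

def grad_f(x):
    """Gradient of the hypothetical objective f(x, y) = (x - 1)^2 + 2 (y + 3)^2."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]

# The landscape never changes between steps, so the iterates
# settle at the true minimum (1, -3).
x_min = gradient_descent(grad_f, x0=[0.0, 0.0], alpha=0.1, steps=200)
print(x_min)
```

This works only because `grad_f` is the same function at every step, which is the "fixed landscape" requirement in the list above.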

The Boeing failure: at time t, the team measures f(x_t). At time t+1, the team changes x to x_t + Δx, but the shared database has also changed, so f is now f' ≠ f. The observed change is f'(x_t + Δx) − f(x_t). This is NOT the gradient of f: it includes a term from the landscape shift.

The Phantom Gradient

Measured change = [f'(x+Δx) − f(x)] = [f(x+Δx) − f(x)] + [f'(x+Δx) − f(x+Δx)]

≈ (true gradient × Δx) + landscape shift, to first order in Δx

If the landscape shift dominates, the team may move toward a minimum of f' that is not a minimum of f, and could even be a maximum of f. They optimize the wrong thing, possibly making their design worse while measurements show improvement.
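The decomposition of the measured change can be checked on a toy example. Both landscapes below are hypothetical one-dimensional functions invented for this sketch: f has its minimum at x = 0, and the shifted landscape f' has its minimum at x = 3. Here the distortion runs the other way from the case just described: a step that genuinely improves f is measured as a large deterioration, because the landscape shift dominates.

```python
def f(x):
    """Hypothetical landscape at measurement time; minimum at x = 0."""
    return x * x

def f_shifted(x):
    """Hypothetical landscape after another team's change; minimum at x = 3."""
    return (x - 3.0) ** 2

x, dx = 1.0, -0.1  # a small step toward the minimum of f

true_change = f(x + dx) - f(x)                   # what the step did on f
landscape_shift = f_shifted(x + dx) - f(x + dx)  # what the other team did
measured = f_shifted(x + dx) - f(x)              # what the team observes

print(f"true change     = {true_change:+.2f}")
print(f"landscape shift = {landscape_shift:+.2f}")
print(f"measured change = {measured:+.2f}")
```

The measured change is the sum of the other two terms, and its sign here is set by the shift, not by the team's own step.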

Quantifying Phantom Gradient Error

A team optimizes drag f(θ, s) where θ = wing angle, s = span. True gradient: ∂f/∂θ = −0.5 (drag decreases with θ), ∂f/∂s = +0.3 (drag increases with s).

Another team simultaneously reduces fuselage weight, which changes the drag function: f' = f − 0.8. (A lighter fuselage reduces drag by 0.8 units at all configurations.)

The first team measures: f'(θ+Δθ, s) − f(θ, s) = [f(θ+Δθ, s) − 0.8] − f(θ, s) ≈ −0.5·Δθ − 0.8 (to first order in Δθ).
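The same bookkeeping can be written as a small helper that splits a measured change into the team's own contribution and the phantom term. This is a sketch under a linear (first-order) assumption; the function name and the sample step Δθ = 2 are illustrative choices, while the sensitivity −0.5 and the offset −0.8 are the values given above.

```python
DF_DTHETA = -0.5  # true sensitivity of drag to wing angle (from the text)
SHIFT = -0.8      # uniform drag offset from the fuselage change (from the text)

def decompose_measured_change(dtheta):
    """Return (own, phantom, measured) for a wing-angle step of size dtheta.

    Assumes f is linear in theta over the step (first-order approximation).
    """
    own = DF_DTHETA * dtheta
    phantom = SHIFT
    return own, phantom, own + phantom

# Illustrative step size (an assumption, not a value from the lesson).
dtheta = 2.0
own, phantom, measured = decompose_measured_change(dtheta)
apparent_slope = measured / dtheta  # what the team would infer per unit of theta

print(f"own = {own}, phantom = {phantom}, measured = {measured}")
print(f"apparent slope = {apparent_slope}, true slope = {DF_DTHETA}")
```

Because the phantom term does not scale with Δθ, the apparent per-unit sensitivity depends on the step size, which is one symptom a team could look for.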

If the first team sets Δθ = 1 (changes wing angle by 1 unit), what is the measured change? What do they attribute it to? What is the actual contribution of their own wing-angle change versus the phantom contribution from the fuselage change? Show the arithmetic and interpret: could the phantom gradient cause the team to stop optimizing θ prematurely?