un — Geometry of Machine Reasoning: II

un

guest

1 / ?

back to lessons

Formal Proofs as Paths

A formal proof system defines a set of axioms & inference rules. Every theorem-proving program navigates this system as a search problem: starting at the given premises, apply inference rules to generate new statements, until you reach the goal.

Represent this as a directed graph:

Nodes: well-formed statements in the formal system.

Edges: single applications of inference rules (modus ponens, SAS congruence, etc.).

Proof: a directed path from the given premises to the desired conclusion.

Proof length: number of inference steps in the path.

The shortest proof of a theorem corresponds to the shortest path between the premise node & the conclusion node in this graph.

Proof as Path in Axiom Space

The geometry theorem proving program explored this graph by: (1) direct application of rules; (2) if stuck, introducing auxiliary constructions (which add new nodes to the search). The program found the selfcongruence proof by avoiding the auxiliary construction entirely — a shorter path existed that the classical approach missed.

Proof Length & Proof Search

Proof search faces the same exponential growth as game tree search. The branching factor at each node equals the number of applicable inference rules. Proof depth grows with theorem complexity.

The theorem-proving program used heuristics to prune the proof space, analogous to alpha-beta pruning in games.

Suppose a formal geometry system has 12 applicable inference rules at each step and you are searching for a proof. The classical proof of the isosceles triangle theorem requires 3 steps (given → construct → SAS → conclusion). The program's proof requires 2 steps (given → selfcongruence → conclusion). Compute the number of paths of each length that the search must explore in the worst case (before finding the proof). How much smaller is the 2-step search space relative to the 3-step space?

Points, Lines & Duality

The geometry program's selfcongruence proof of the isosceles triangle theorem uses a perspective that doesn't appear in classical Euclidean proofs. The insight: instead of comparing triangle ABC to a second constructed triangle, compare ABC to itself with the base vertices swapped — the correspondence A↔A, B↔C, C↔B.

This is a geometric symmetry argument: the isosceles triangle is symmetric under reflection across the altitude from the apex. The program didn't construct the reflection explicitly; it used the correspondence as an abstraction.

The general principle behind this is projective duality: in the projective plane, every theorem about points & lines has a dual theorem obtained by swapping the words 'point' and 'line' throughout.

Duality dictionary:

- Point ↔ Line

- Point lies on line ↔ Line passes through point

- Two points determine a unique line ↔ Two lines determine a unique point

- Collinear points ↔ Concurrent lines

A single proof of a theorem about points automatically yields a proof of the dual theorem about lines — and vice versa. The two proofs have the same structure, the same length, & the same inference steps. Finding the dual perspective often reveals a simpler proof of the original.

Applying Duality

Desargues' Theorem: If two triangles are in perspective from a point (the three lines through corresponding vertices all meet at one point), then they are in perspective from a line (the three intersections of corresponding sides all lie on one line).

This theorem is self-dual: swapping points and lines gives exactly the same theorem statement.

State the dual of the following theorem: 'Three points are collinear if and only if no two of them are distinct lines.' Wait — that statement is poorly formed. Instead, consider: 'Any two distinct points determine exactly one line.' State the dual theorem by swapping points and lines. Then state whether the dual theorem is true in the projective plane, and give a brief reason.

Sampling Rate & Frequency Space

The computer music system at Bell Labs rested on one mathematical theorem: the Nyquist-Shannon sampling theorem.

Statement: a bandlimited signal with maximum frequency f_max can be perfectly reconstructed from samples taken at a rate of at least 2 × f_max samples per second.

The geometric interpretation: a bandlimited signal lives in a finite-dimensional subspace of the space of all continuous functions. Sampling at rate 2f_max provides enough coordinates to uniquely identify a point in that subspace.

Aliasing: The Geometry of Sampling Failure

Below the Nyquist rate, frequencies above f_max alias — they appear as lower frequencies in the sampled signal. Two distinct signals become indistinguishable after sampling. Geometrically: the sampling operator projects the signal space onto a lower-dimensional space, causing different signals to collide.

For digital audio (CD quality): f_max = 22,050 Hz (slightly above the 20,000 Hz human hearing limit), sampling rate = 44,100 samples/second. For telephone: f_max = 4,000 Hz, sampling rate = 8,000 samples/second.

Nyquist Rate Calculations

The Nyquist theorem determines the minimum sampling rate needed to avoid information loss.

A voice-over-internet system needs to reproduce speech up to 8,000 Hz. What is the minimum sampling rate required? Then: to store 5 minutes of audio at this sampling rate with 16-bit samples (65,536 quantization levels), how many bytes does the recording require? Show all calculations.

Proof Space & Signal Space: Shared Geometry

The proof-as-path and the Nyquist sampling theorem share a common geometric structure: both involve finding the minimum representation of something complex.

Proof minimization: find the shortest path (fewest inference steps) through the proof graph from premises to conclusion. The selfcongruence proof minimized path length by exploiting symmetry.

Signal sampling: find the minimum number of samples (lowest sampling rate) that preserves all information in a bandlimited signal. The Nyquist theorem minimizes the representation by exploiting bandwidth limits.

Both problems live in spaces with structure that enables minimum-representation results. Both fail when that structure breaks down: proofs get longer when the axiom space is poorly organized; aliasing occurs when the signal is not bandlimited.

Both proof minimization and signal sampling exploit a structural property to achieve minimum representation. For proofs, the structure is the proof graph's connectivity. For signals, the structure is bandlimitedness. Identify one other domain where a minimum-representation result exists because of an underlying structural property. Name the structure, the representation, and what the minimum result says.