Programming in Absolute Binary

The first programmers wrote in absolute binary: every instruction & every address in raw binary digits. A single instruction might look like 01100101 00001010 — instruction code & memory address in binary.

The Spaghetti Code Problem

When an error required inserting a new instruction, programmers faced a dilemma. Inserting in place meant every subsequent instruction address shifted by one — requiring the programmer to update every address reference in the entire program. Catastrophic.

The solution: replace the instruction just before the insertion point with a jump to empty memory. At that empty location: write the overwritten instruction, add the new instructions, then jump back. When errors appeared in the corrections, apply the same trick again using other empty memory.
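The patch-by-jump trick can be sketched in miniature. This is an invented toy machine, not any historical instruction set; `trace` simply follows the control flow:

```python
# Sketch of patch-by-jump: inserting code without shifting any
# absolute address. The machine and opcodes are invented for illustration.

def trace(memory, start=0, limit=20):
    """Follow control flow from `start`, collecting executed instructions."""
    pc, path = start, []
    while pc is not None and len(path) < limit:
        op = memory[pc]
        if op is None:
            break
        if op[0] == "JMP":          # jumps redirect, they are not "work"
            pc = op[1]
            continue
        path.append(op)
        pc += 1
        if pc >= len(memory) or memory[pc] is None:
            break
    return path

# Addresses 0-3 hold the program; 10 onward is empty memory for patches.
memory = [("LOAD", "A"), ("ADD", "B"), ("STORE", "C"), ("HALT",)] + [None] * 10

# Patch: insert ("MUL", "D") after address 1 without moving addresses 2-3.
memory[10] = memory[1]          # copy the instruction we are about to overwrite
memory[11] = ("MUL", "D")       # the new instruction
memory[12] = ("JMP", 2)         # jump back into the original flow
memory[1] = ("JMP", 10)         # redirect into the patch area

print(trace(memory))
# Execution order now hops 0 -> 10 -> 11 -> 12 -> 2 -> 3: spaghetti.
```

Every later patch claims another stretch of empty memory, so the hops multiply with each round of corrections.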

Result: the execution path through the program jumped to seemingly random locations. Hamming called this 'a can of spaghetti.' The control flow path, drawn on paper, looked exactly like tangled spaghetti.

The Escape Routes

Two immediate improvements: octal notation (group binary digits in sets of 3) and hexadecimal (groups of 4, using A–F for values beyond 9). These reduced writing errors but did not solve the fundamental address problem.
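The grouping is mechanical, as a quick sketch shows for the 16-bit word used as the example above:

```python
# The same 16-bit word written in binary, octal (3-bit groups),
# and hexadecimal (4-bit groups).
word = 0b0110010100001010

binary = format(word, "016b")
octal  = format(word, "o")      # 011 001 010 000 101 0 -> grouped by 3 bits
hexed  = format(word, "04X")    # 0110 0101 0000 1010 -> grouped by 4 bits

print(binary)   # 0110010100001010
print(octal)    # 62412
print(hexed)    # 650A
```

Fewer digits to copy by hand means fewer transcription errors, but the digits still encode absolute addresses.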

Symbolic assembly (e.g. IBM's SAP — Symbolic Assembly Program — and SOAP — Symbolic Optimizing Assembly Program on the IBM 650) allowed programmers to write instruction names (ADD, MOVE) and symbolic address labels instead of binary. The assembler translated to binary at input time, automatically managing address assignments.
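The core of that translation is a two-pass scheme. The following toy assembler (not SAP or SOAP; the syntax is invented) shows the idea: pass one assigns an address to every label, pass two substitutes those addresses for label references:

```python
# Toy two-pass assembler sketch. Invented syntax: "LABEL: OP OPERAND".
source = [
    "START: LOAD X",
    "       ADD  Y",
    "       JMP  START",
]

# Pass 1: each line occupies one address; record label -> address.
symbols = {}
for addr, line in enumerate(source):
    if ":" in line:
        label, _ = line.split(":", 1)
        symbols[label.strip()] = addr

# Pass 2: emit (opcode, operand) pairs, resolving labels to addresses.
program = []
for line in source:
    text = line.split(":", 1)[-1]
    op, *rest = text.split()
    arg = rest[0] if rest else None
    program.append((op, symbols.get(arg, arg)))

print(symbols)   # {'START': 0}
print(program)   # [('LOAD', 'X'), ('ADD', 'Y'), ('JMP', 0)]
```

Because addresses exist only in the symbol table, inserting an instruction just shifts the table on the next assembly run; no hand-editing of references is needed.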

SOAP performed an additional optimization: it arranged instructions on the rotating drum so that each instruction arrived at the reading head just as the previous one finished executing — minimum latency coding. SOAP was even applied to itself: the assembler's own program was run through SOAP as data, producing an optimized version and showing how much the optimization improved SOAP itself.
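Minimum-latency placement reduces to a small calculation. The numbers below are assumptions for illustration (the real IBM 650 drum held 2000 words in bands of 50): if the instruction at position p takes t word-times to execute, the next one should sit t+1 positions further around the drum:

```python
# Sketch of minimum-latency drum placement. DRUM_SIZE and the execution
# times are illustrative assumptions, not real IBM 650 timings.
DRUM_SIZE = 50   # words per drum band (assumption)

def best_slot(position, exec_time):
    """Slot passing the read head exactly when execution finishes."""
    return (position + 1 + exec_time) % DRUM_SIZE

# Chain instructions whose execution times are 4, 2, 6 word-times,
# starting from position 0.
pos, layout = 0, [0]
for t in (4, 2, 6):
    pos = best_slot(pos, t)
    layout.append(pos)

print(layout)   # [0, 5, 8, 15]
```

A naive sequential layout would instead force a near-full drum revolution of waiting after every instruction.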


Libraries & Relocatable Code

Hamming noted that the idea of reusable software (mathematical libraries) came very early — Babbage had conceived it. The problem: an absolute-address library required every routine to occupy the same memory locations every time it was used. When the total library grew too large, programs competed for the same addresses.

The solution: relocatable code. The assembler generates instructions that reference memory relatively — offsets from a base address — rather than absolute addresses. A linker resolves the final addresses at load time.
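A minimal sketch of the scheme (the object-code format here is invented): the assembler emits offsets plus a relocation list, and the "linker" adds the load base wherever the list says an address lives:

```python
# Sketch of relocatable code: offsets + relocation list, resolved at load
# time. The object format and opcodes are invented for illustration.
routine = {
    "code": [("LOAD", 2), ("ADD", 3), ("JMP", 0)],  # offsets, not addresses
    "reloc": [0, 1, 2],   # indices of instructions with relocatable operands
}

def link(obj, base):
    """Return absolute code for a routine loaded at address `base`."""
    out = list(obj["code"])
    for i in obj["reloc"]:
        op, offset = out[i]
        out[i] = (op, base + offset)
    return out

# The same routine can now live anywhere in memory:
print(link(routine, 100))   # [('LOAD', 102), ('ADD', 103), ('JMP', 100)]
print(link(routine, 700))   # [('LOAD', 702), ('ADD', 703), ('JMP', 700)]
```

Two library routines no longer collide: the linker simply hands each a different base.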

Von Neumann's unpublished reports (widely circulated) described the necessary programming tricks. The first published programming book, The Preparation of Programs for an Electronic Digital Computer (Wilkes, Wheeler & Gill, 1951, for the EDSAC), codified these techniques.

Explain why absolute-address libraries created a scalability problem, and how relocatable code solved it. What specific property of absolute addresses caused the collision, and what does 'relocatable' mean technically?

The Language Design Fork

FORTRAN (1957, IBM) and ALGOL (1958, international committee) represent two design philosophies that produced radically different outcomes.

FORTRAN

John Backus led the FORTRAN (FORmula TRANslation) project at IBM. The design goal: make the language easy for scientists & engineers to use. FORTRAN accepted mathematical notation that felt natural to its users: A = B + C * D rather than ADD B, C; STORE T; MULTIPLY T, D; STORE A.

FORTRAN survived 60+ years. It remains in active use in scientific computing, fluid dynamics, climate modeling, & computational physics. Hamming noted this durability as proof of successful design.

ALGOL

ALGOL (ALGOrithmic Language) was designed by a committee of logicians & computer scientists aiming for mathematical rigor: a logically clean, formally definable language. The Backus-Naur Form (BNF) notation for describing grammars was invented to specify ALGOL.
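BNF describes a grammar as substitution rules. A familiar fragment, here for arithmetic expressions rather than anything taken from the ALGOL report, looks like:

```
<expr>   ::= <term> | <expr> "+" <term>
<term>   ::= <factor> | <term> "*" <factor>
<factor> ::= <number> | "(" <expr> ")"
```

Each rule says a symbol on the left may be rewritten as any alternative on the right; the recursion (an <expr> containing an <expr>) is what lets three short rules describe infinitely many expressions.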

ALGOL failed in practice. Despite its logical elegance & its enormous influence on subsequent language design (Pascal, C, & nearly every modern language descend from ALGOL's grammar concepts), ALGOL itself was never widely deployed. Hamming's verdict: logically designed, humanly unusable.

The Hierarchy of Languages

Hamming described a natural hierarchy from machine code up through assembly, higher-level languages, & ultimately a 'problem-oriented language' close to how practitioners think about their problem domain. Each level adds human readability at the cost of machine efficiency.

Hamming's Four Language Design Criteria

Hamming distilled the lesson of FORTRAN vs ALGOL into four criteria for a successful programming language:

1. Easy to learn — a novice can become productive quickly

2. Easy to use — routine tasks require minimal ceremony

3. Easy to debug — errors produce meaningful, locatable messages

4. Easy to use subroutines — reuse & abstraction do not require heroic effort

He added a structural observation: spoken language carries about 60% redundancy; written language about 40%. Low-redundancy languages (like APL) produce elegant one-liners that experts find beautiful & beginners find opaque — & that contain undetectable errors when a single character changes meaning.

The implication: a language designed for logical elegance optimizes for the wrong reader. The programmer is human; humans need redundancy to catch errors & communicate intent.

Apply Hamming's four criteria to a programming language you know well. Score each criterion 1–5 (5=excellent). Then identify which criterion, if strengthened, would most improve the language — and explain what a specific change would look like.

Psychological vs Logical Language Design

Hamming returned to the FORTRAN/ALGOL contrast as a lesson in institutional & human dynamics, not just language design.

FORTRAN was designed psychologically — for the humans who would use it, specifically scientists who thought in mathematical notation. ALGOL was designed logically — for formal correctness & theoretical elegance.

The paradox Hamming identified: a logically correct language that humans resist fails; a pragmatically designed language that humans adopt succeeds, even if it is logically messier.

He cited APL as the extreme case: logically elegant, one-liner expressible, with its own special character set. Experts loved it. Normal programmers found it unreadable. A single character change could silently transform a program's meaning. APL has a small devoted community & nearly zero mainstream use.

The human redundancy argument: spoken language is ~60% redundant (repeated context, clarifying words, predictable structure). Written language ~40% redundant. This redundancy serves error detection — humans are unreliable, so language evolved to carry enough repeated information to catch & correct errors. A low-redundancy language removes this safety net.
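Redundancy can be crudely estimated as the fraction of a text a compressor can squeeze out. This is a rough proxy, not Hamming's method, and zlib will not reproduce his 60%/40% figures; it only makes the concept concrete:

```python
# Rough redundancy estimate: 1 - compressed_size / original_size.
# zlib is a crude stand-in for true entropy measurement.
import zlib

def redundancy(text):
    data = text.encode("utf-8")
    return 1 - len(zlib.compress(data, 9)) / len(data)

# Highly repetitive, predictable text compresses well -> high redundancy.
english = "the quick brown fox jumps over the lazy dog " * 40
print(round(redundancy(english), 2))
```

A terse APL one-liner would score near zero on such a measure: almost every character carries unique, unrecoverable meaning, which is exactly why a one-character slip is undetectable.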

The Compiler Hierarchy

Hamming described the compiler/interpreter layering: a program can read in a higher-level language & translate it to a lower-level one. Stack these layers — each translates one level down. At the top: a domain-specific language that experts in a field (biology, finance, physics) write naturally. At the bottom: machine code. Each transition is a compiler or interpreter.
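The layering can be sketched as a chain of translators, each mapping one level down. The three mini-languages here are invented for illustration:

```python
# Sketch of stacked translators: domain language -> formula -> stack-machine
# ops -> numeric "machine code". All three levels are invented toys.

def domain_to_formula(spec):
    """Domain-specific phrase to a formula, via a toy lookup."""
    return {"double x": "x * 2"}[spec]

def formula_to_stack_ops(expr):
    """Infix formula 'a OP b' to stack-machine operations."""
    a, op, b = expr.split()
    return [("PUSH", a), ("PUSH", b), ({"*": "MUL", "+": "ADD"}[op],)]

def stack_ops_to_machine(ops):
    """Stack operations to numeric opcodes."""
    encoding = {"PUSH": 1, "MUL": 2, "ADD": 3}
    return [encoding[o[0]] for o in ops]

# Each function call below is one compiler/interpreter in the hierarchy:
program = stack_ops_to_machine(formula_to_stack_ops(domain_to_formula("double x")))
print(program)   # [1, 1, 2]
```

The design choice is that each translator only needs to know its own two levels, which is what makes the stack composable.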

Predicting Language Survival

By 1993, Hamming had watched many languages succeed & fail. FORTRAN (1957) survived. ALGOL (1958) failed. COBOL (1959) survived decades in business computing. LISP (1958) survived in AI research. PL/I (1964) tried to unify everything & failed.

Using Hamming's psychological vs logical design distinction and his four criteria, explain why one language you know thrived and one failed (or is failing). Your explanation should identify the specific human factors that drove adoption or rejection — not just technical properties.

The Recurring Pattern

Hamming's software history chapter contains a recurring structure:

1. A painful limitation exists (absolute addresses, binary notation, unmaintainable code)

2. Someone invents an abstraction layer that hides the limitation

3. The abstraction enables new scale, which creates new painful limitations

4. Repeat

Binary → octal/hex → symbolic assembly → FORTRAN → structured programming → object-oriented languages → domain-specific languages. Each layer resolves the predecessor's most acute pain while introducing a new class of problem.

The spaghetti code problem (absolute addresses) led to symbolic assembly. Large assembly programs led to FORTRAN. Large FORTRAN programs led to structured programming & then object orientation. Hamming's lecture ended before these later transitions, but the pattern continues.

His lesson for engineers: you are always solving the pain exposed by the previous abstraction. Understanding the layer you are currently on requires knowing why the layer below it exists.

Identify a software abstraction layer you work with regularly. What painful limitation in the layer below it does it hide? And what new class of problems does your current layer introduce — what pain will the next layer above need to solve?