The Object at the End of the Universe

This is Part 1 of a seven-part series on building a complete Smalltalk-80 virtual machine from scratch in C#.

There is a moment, early in learning Smalltalk-80, when you realize that the system is describing itself. Not metaphorically — literally. The class hierarchy isn’t just a convenient organization of types. It is the actual mechanism by which the system operates. Classes are objects. Metaclasses are objects. The compiler, the garbage collector, the syntax highlighter — all objects, all interacting by the same single rule: send a message, receive a reply.

This is not a novelty. It is the point.

I spent several months in 2025 building Wise Owl Smalltalk from scratch — compiler, heap, interpreter, and eventually a GPU-accelerated display system. This series is the account of how it was built and why everything about it was harder than it looked.

But before any of that — before the first line of C#, before the first parse failure, before the first garbage collection panic — you have to understand what Smalltalk-80 actually is. Not the surface syntax. The model.

The One Rule

Alan Kay’s original conception of object-oriented programming, developed at Xerox PARC through the 1970s, had a single organizing principle: objects communicate only by sending messages. There are no global procedures. There is no shared state. There is no notion of “calling” something. There is only: one object sends a message to another, and the receiving object decides what to do with it.

This sounds abstract until you see what it means for arithmetic.

In most languages, 3 + 4 is a special form. The + operator is baked into the language. The integer type is baked into the language. The compiler knows about integers and arithmetic and emits machine instructions directly. The + is not a method call.

In Smalltalk-80, 3 + 4 is a message send. The object 3 — an instance of SmallInteger — receives the message + with the argument 4. The SmallInteger class has a method named +. That method is looked up, found, and executed. The result is a new SmallInteger object, 7.

Here is what that method actually looks like in the original 1983 source file:

+ aNumber
    "Add the receiver to the argument and answer with the result if it is a
    SmallInteger. Fail if the argument or the result is not a SmallInteger.
    Essential. No Lookup. See Object documentation whatIsAPrimitive."

    <primitive: 1>
    ^super + aNumber

The <primitive: 1> annotation tells the interpreter to try the native, VM-level fast path first. If it succeeds, which is the common case, the bytecode never executes. If it fails (integer overflow, wrong argument type), control falls through to ^super + aNumber, which restarts the lookup for + in the superclass, Integer, and ends up in the large-integer fallback. A two-line method handles all of integer addition, including overflow, through the same message-dispatch mechanism as everything else.
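To make the try-the-primitive-then-fall-back flow concrete, here is a minimal sketch in Python rather than the C# of Wise Owl Smalltalk. Everything in it is an assumption for illustration: the names primitive_add, fallback_add, and send_plus are hypothetical, the 15-bit range is the one implied by the Blue Book's 16-bit object pointers, and the fallback only tags its result instead of building a real large integer.

```python
# Hypothetical sketch of "try the primitive, fall back to the method body".
# The 15-bit SmallInteger range is an assumption taken from the Blue Book's
# 16-bit object pointers; other implementations may use a wider range.
SMALL_MIN, SMALL_MAX = -(2 ** 14), 2 ** 14 - 1


class PrimitiveFailed(Exception):
    """Raised when the fast path cannot handle its operands."""


def primitive_add(receiver, argument):
    """Fast path: succeeds only if the argument and the sum are small."""
    if type(argument) is not int or not (SMALL_MIN <= argument <= SMALL_MAX):
        raise PrimitiveFailed("argument is not a SmallInteger")
    result = receiver + argument
    if not (SMALL_MIN <= result <= SMALL_MAX):
        raise PrimitiveFailed("result overflows SmallInteger")
    return result


def fallback_add(receiver, argument):
    """Stands in for the method body, `^super + aNumber`."""
    return ("large", receiver + argument)


def send_plus(receiver, argument):
    """Dispatch `receiver + argument`: primitive first, method body second."""
    try:
        return primitive_add(receiver, argument)
    except PrimitiveFailed:
        return fallback_add(receiver, argument)


small = send_plus(3, 4)          # primitive succeeds
overflow = send_plus(16383, 1)   # primitive fails, fallback runs
```

The point of the shape is that the caller never knows which path ran: both are reached through the same send.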

The fact that this ultimately compiles to machine arithmetic is an implementation detail. The model says: a message was sent to an object. The object responded. That’s all that happened.

The same rule applies everywhere:

anArray at: 3 put: 'hello'.        "sends at:put: to the array"
Window new.                         "sends new to the Window class"
x isNil ifTrue: [ y := 0 ].        "sends ifTrue: to a Boolean, with a block argument"

There is no special syntax for control flow. ifTrue: is just a method. The block [ y := 0 ] is just an object.
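A rough Python sketch of what that means in practice: the two Boolean objects each carry their own ifTrue: method, and the block is an ordinary callable passed as an argument. The class and function names here (TrueObject, FalseObject, is_nil, assign_zero) are made up for illustration and are not taken from any Smalltalk implementation.

```python
class TrueObject:
    """Stands in for Smalltalk's true."""

    def if_true(self, block):
        return block()  # true evaluates the block and answers its value


class FalseObject:
    """Stands in for Smalltalk's false."""

    def if_true(self, block):
        return None  # false never evaluates the block and answers nil


def is_nil(value):
    """Mimics `x isNil`: answers one of the two Boolean objects."""
    return TrueObject() if value is None else FalseObject()


state = {"x": None, "y": 5}


def assign_zero():
    """The block [ y := 0 ] as a plain callable."""
    state["y"] = 0
    return 0


# x isNil ifTrue: [ y := 0 ]
is_nil(state["x"]).if_true(assign_zero)
```

No branch instruction appears anywhere in the caller. The choice is made by which object receives the message.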

If you follow this rule to its conclusion, without compromise, you end up in a strange place. And Smalltalk-80 follows it to its conclusion.

Classes Are Objects

Most object-oriented languages have a split personality. Objects are values that your code manipulates at runtime. Classes are types that the compiler knows about at compile time. You can create instances of classes; you cannot send arbitrary messages to classes themselves. The class Integer is syntax, not data.

Smalltalk-80 has no such split. Classes are objects. This is not a figure of speech.

Consider: if you send the message new to OrderedCollection, you get a new empty OrderedCollection. You sent new to OrderedCollection. OrderedCollection is the receiver. It is an object that receives messages, just like every other object. The fact that the result of new is another object of a known type is the response — but the mechanism is identical to any other message send.

Since classes are objects, they are instances of something. A class is an instance of its metaclass. Every class in Smalltalk-80 has a corresponding metaclass, named by convention as “ClassName class” — so SmallInteger class, OrderedCollection class, Boolean class. The metaclass holds the class-side methods: the ones that respond to messages sent to the class itself, like new.

This is manageable until you ask the next question.

Metaclasses All the Way Down

Metaclasses are objects. Since every object is an instance of something, every metaclass is an instance of something. What?

The answer is Metaclass. There is a class called Metaclass and every metaclass — including SmallInteger class, Array class, Object class, all of them — is an instance of Metaclass. So if you send class to any metaclass, you get Metaclass.

And Metaclass is itself a class. So it has a metaclass: Metaclass class. And Metaclass class is an instance of Metaclass. The circle closes.

Here is the full cycle:

SmallInteger          is an instance of   SmallInteger class
SmallInteger class    is an instance of   Metaclass
Metaclass             is an instance of   Metaclass class
Metaclass class       is an instance of   Metaclass

That last line is where it bottoms out: Metaclass class is an instance of Metaclass, not of some further MetaMetaclass. The system is self-grounding.

The class hierarchy has its own circularity:

SmallInteger          is a subclass of    Integer
Integer               is a subclass of    Number
Number                is a subclass of    Magnitude
Magnitude             is a subclass of    Object

SmallInteger class    is a subclass of    Integer class
Integer class         is a subclass of    Number class
...
Object class          is a subclass of    Class
Class                 is a subclass of    ClassDescription
ClassDescription      is a subclass of    Behavior
Behavior              is a subclass of    Object

And there it is: Object class, the metaclass of the root class, is a direct subclass of Class, which descends through ClassDescription and Behavior to Object. The metaclass chain and the class chain rejoin at Object. The whole thing is a single connected loop.

Figure: the Smalltalk-80 metaclass hierarchy. Instance-of and subclass-of relationships form a single connected loop through SmallInteger, its metaclass chain, Metaclass, Class, Behavior, and Object.
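The two tables above can be written down as plain lookup tables and walked mechanically. The sketch below is in Python, not the C# of the project. The table and function names are my own, and it includes ClassDescription, which the Blue Book's kernel places between Class and Behavior.

```python
# instance-of links, copied from the first table
CLASS_OF = {
    "SmallInteger": "SmallInteger class",
    "SmallInteger class": "Metaclass",
    "Metaclass": "Metaclass class",
    "Metaclass class": "Metaclass",
}

# subclass-of links on the way from Object class back to Object
SUPERCLASS_OF = {
    "Object class": "Class",
    "Class": "ClassDescription",
    "ClassDescription": "Behavior",
    "Behavior": "Object",
    "Object": None,  # the root class has no superclass
}


def follow(start, links):
    """Follow one kind of link until it ends or revisits a name."""
    path, seen = [start], {start}
    current = links.get(start)
    while current is not None:
        path.append(current)
        if current in seen:  # the chain has closed on itself
            break
        seen.add(current)
        current = links.get(current)
    return path


instance_chain = follow("SmallInteger", CLASS_OF)
superclass_chain = follow("Object class", SUPERCLASS_OF)
```

The instance-of walk stops because it revisits Metaclass. The subclass-of walk stops because Object has no superclass. Those are the two ways the hierarchy grounds itself.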

When I first read this, I thought it was a clever philosophical puzzle. After spending months implementing it, I can tell you: it is not a puzzle. It is the most consistent design decision in the system. Every bit of the class hierarchy, every method lookup, every object creation — it all works by exactly the same mechanism. There is nothing special about new. There is nothing special about class-side methods. The metaclass circle is what makes that uniformity possible.

That said, the circle has practical consequences: allocating objects in a cold heap requires setting a class’s metaclass pointer before that metaclass exists. That’s Part 3.

Understanding the model is one thing. Implementing it requires a reference — and for Smalltalk-80, there is exactly one.

The Blue Book

The canonical reference for Smalltalk-80 is a book: Smalltalk-80: The Language and Its Implementation by Adele Goldberg and David Robson, published in 1983 by Addison-Wesley. In the Smalltalk community it is called the Blue Book, because it has a blue cover.

Figure: the three Smalltalk-80 books, namely the Blue Book (The Language and Its Implementation, Goldberg & Robson), the Orange Book (The Interactive Programming Environment, Goldberg), and the Green Book (Bits of History, Words of Advice, Krasner).

The Blue Book has four parts. Part I is the language: syntax, message expressions, block closures, the class hierarchy. Part II is the programming environment: the Model-View-Controller framework, the System Browser, workspaces, Inspectors. Part III is the implementation: object memory, the interpreter, the bytecode set, primitive operations. Part IV is the kernel: the actual Smalltalk-80 source code for all the system classes, reproduced in its entirety.

It is a remarkable document. It is also incomplete.

The grammar described in Part I does not cover all the constructs you will encounter in the Part IV source code. The Blue Book describes the language the way a native speaker describes their own grammar: with authority and with gaps they never noticed because they fill them in automatically. The actual specification of Smalltalk-80 grammar, in practice, is not the Blue Book. It is the 41,400 lines of source code in Part IV — a file in the file-out format, which looks like this:

Object subclass: #BitBlt
    instanceVariableNames: 'destForm sourceForm halftoneForm combinationRule
        destX destY width height sourceX sourceY clipX clipY clipWidth clipHeight '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Graphics-Support'!

!BitBlt methodsFor: 'accessing'!
destForm: aForm
    "Set the destination Form."
    destForm _ aForm!
combinationRule: anInteger
    "Set the combination rule. anInteger is in the range 0-15."
    combinationRule _ anInteger! !

Figure: the Smalltalk-80 file-out format, showing chunk separators, assignment arrows, and method category headers in the original 1983 source notation.
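Each ! in the listing above ends a chunk, so reading a file-out starts with splitting the text at those characters. Here is a minimal Python sketch of that first step only. It assumes the usual chunk-format convention that a doubled !! stands for one literal ! inside a chunk. The function name split_chunks is hypothetical, and this is not the reader used in Wise Owl Smalltalk.

```python
def split_chunks(text):
    """Split file-out text into chunks at single '!' characters.

    Assumption: a doubled '!!' is an escaped literal '!' inside a chunk.
    Empty chunks are kept, because they act as markers in the format.
    """
    chunks, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "!":
            if i + 1 < len(text) and text[i + 1] == "!":
                current.append("!")  # escaped bang stays in the chunk
                i += 2
                continue
            chunks.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:  # unterminated trailing text, kept rather than dropped
        chunks.append(tail)
    return chunks


sample = (
    "Object subclass: #Foo!\n\n"
    "!Foo methodsFor: 'accessing'!\n"
    "bar\n    ^bar!\n"
    "bar: x\n    bar _ x! !\n"
)
chunks = split_chunks(sample)
```

In this sketch the empty chunk before the methodsFor: header comes from its leading !, and the trailing empty chunk comes from the closing "! !". A real reader has to treat both as signals rather than discard them.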

The ! terminates each chunk, whether that chunk is a class definition, a methodsFor: header, or a method body.

The _ is the original Smalltalk-80 assignment operator. Smalltalk systems of the day often displayed the underscore as a left arrow (←), so the left arrow is also commonly considered the assignment operator. Since there is no left arrow on most keyboards, modern Smalltalk dialects adopted the two-character token := instead. Wise Owl Smalltalk accepts all three. The original 1983 source uses _ exclusively: 41,400 lines, 4,947 assignments, zero uses of :=.

The grammar that parses all of it, including every construct the Blue Book doesn’t mention, had to be discovered empirically. That’s where this story really starts. But first: the live image.

You Don’t Compile and Run

In most languages, the development cycle is: write source code, compile to a binary, run the binary, observe output, repeat. The source code is the artifact you keep. The binary is disposable; you regenerate it from source whenever you need it.

Smalltalk-80 has a different model.

A running Smalltalk-80 system is a live heap of objects: all the system classes, all their methods (compiled to bytecode), all global variables, all active execution contexts, everything. This heap is periodically serialized to a binary file called an image. Starting Smalltalk means loading this binary back into memory and resuming execution from exactly where it stopped. The image is the artifact you keep.

Source code exists in a separate role. You can file in source code to a running Smalltalk, which adds or modifies classes in the live system. You can file out a class or a set of classes from a running system to produce a text representation. But the file-out is a secondary artifact — a snapshot of part of the system’s code, not a complete record of its state.

This is a crucial distinction. A running Smalltalk system might have global variables that were set interactively during a session twenty years ago and never filed out to source. They exist in the image. They don’t exist anywhere in any text file. If you try to reconstruct the system from its file-out alone, those globals are simply absent — but the code that depends on them was written assuming they exist.

I hit this problem repeatedly, and it is one of the more interesting challenges of the whole project. We’ll get to it in detail in Part 5. For now, the point is: the Smalltalk-80 “source code” you can find in the Blue Book’s Part IV, or in the file-out distributed with various historical implementations, is not a complete description of a running system. It is a partial transcript.
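A toy illustration of the gap between image and file-out, in Python. It uses pickle purely as a stand-in for an image format, which a real Smalltalk-80 image is not, and the heap layout and the names snapshot, resume, file_out, and DefaultCounter are all invented for the example.

```python
import pickle

# Toy "live system": class definitions plus globals created interactively.
heap = {
    "classes": {"Counter": ["increment", "value"]},
    "globals": {},
}
# Typed into a workspace once, never written to any source file.
heap["globals"]["DefaultCounter"] = {"class": "Counter", "count": 41}


def snapshot(live_heap):
    """Serialize the entire heap, as saving an image would."""
    return pickle.dumps(live_heap)


def resume(image_bytes):
    """Bring the whole heap back, globals included."""
    return pickle.loads(image_bytes)


def file_out(live_heap):
    """Emit only class definitions, the way a file-out captures code."""
    return "\n".join(
        f"{name}: {' '.join(selectors)}"
        for name, selectors in sorted(live_heap["classes"].items())
    )


restored = resume(snapshot(heap))
source_text = file_out(heap)
```

The restored heap still has DefaultCounter. The file-out text never mentions it, which is exactly the kind of hole a rebuild from source falls into.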

What You’re Signing Up For

Here is what building a Smalltalk-80 implementation from scratch actually requires.

You need a compiler that can parse the Smalltalk-80 grammar (all of it, including the parts the Blue Book doesn’t mention) and produce something useful. Bare bytecode is not enough, because bytecode only means something inside a running Smalltalk system, and you don’t have one yet. So the compiler has to emit an intermediate representation that carries complete structural information about every class: its name, its superclass, its instance variables, its metaclass, every method, and every method’s compiled bytecode.

You need a heap allocator and a garbage collector. The Smalltalk-80 object memory model, with its object table, tagged object pointers (OOPs), and a specific memory layout for each kind of object, has to be implemented correctly before you can do anything else. The garbage collector turns out to be useful not just for memory management but as a correctness test: run a GC right after building the initial heap from your intermediate representation, and anything it collects is an object that some pointer you forgot to set should have been keeping alive.
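The GC-as-test idea needs only the mark phase. Here is a small Python sketch under simplifying assumptions: objects are named by strings, the heap is a dictionary from each object to the objects it points at, and the function names are hypothetical. It is not the collector from this project.

```python
def reachable(roots, references):
    """Mark phase only: the set of objects reachable from the roots."""
    marked, stack = set(), list(roots)
    while stack:
        oid = stack.pop()
        if oid in marked:
            continue
        marked.add(oid)
        stack.extend(references.get(oid, ()))
    return marked


def unreachable_after_load(roots, references):
    """Objects a collector would free right after the cold load."""
    return sorted(set(references) - reachable(roots, references))


# Toy heap: object -> objects it points at. Array's class pointer was
# never set, so nothing refers to "Array class".
toy_heap = {
    "Smalltalk": ["Object", "Array"],
    "Object": ["Object class"],
    "Object class": ["Metaclass"],
    "Metaclass": [],
    "Array": [],
    "Array class": [],
}
leaked = unreachable_after_load(["Smalltalk"], toy_heap)
```

A freshly loaded heap should have nothing to collect, so a non-empty result names the objects whose incoming pointer was never wired.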

You need a cold-start loader that reads your intermediate representation and populates the heap in the right order, setting up every class, every metaclass, every method dictionary, every global variable, correctly. This is harder than it sounds because the metaclass relationships have to be wired in a specific order, and the object table has to be consistent throughout.
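One common way around the ordering problem is to allocate every shell first and patch pointers afterwards. The Python sketch below shows that two-pass shape on a toy heap of dictionaries. It is an assumption about how such a loader could work, not a description of the loader built later in this series, and cold_load and SPECS are hypothetical names.

```python
def cold_load(class_specs):
    """Build a toy heap from (name, superclass_name) pairs, in any order."""
    heap = {}
    # Pass 1: allocate a shell for every class and metaclass, pointers unset.
    for name, _ in class_specs:
        for shell_name in (name, name + " class"):
            heap[shell_name] = {
                "name": shell_name, "class": None, "superclass": None,
            }
    # Pass 2: every target now exists, so circular links can be patched in.
    for name, super_name in class_specs:
        cls, meta = heap[name], heap[name + " class"]
        cls["class"] = meta
        meta["class"] = heap["Metaclass"]
        if super_name is None:
            meta["superclass"] = heap["Class"]  # root metaclass hangs off Class
        else:
            cls["superclass"] = heap[super_name]
            meta["superclass"] = heap[super_name + " class"]
    return heap


SPECS = [
    ("Point", "Object"),  # deliberately listed before its superclass
    ("Metaclass", "ClassDescription"),
    ("Class", "ClassDescription"),
    ("ClassDescription", "Behavior"),
    ("Behavior", "Object"),
    ("Object", None),
]
heap = cold_load(SPECS)
```

Because the second pass only links objects that already exist, the order of the specs stops mattering, and Metaclass class can point at Metaclass without any special case.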

You need a bytecode interpreter that correctly implements all the Smalltalk-80 bytecodes, including non-local block returns, the correct method lookup algorithm across class and metaclass chains, and primitive dispatch with fallback to Smalltalk methods.
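The lookup part can be sketched in a few lines of Python. The table below is a hand-picked slice of the hierarchy, with Point as the example class, and find_implementor is a hypothetical name. A real interpreter walks method dictionaries in the heap and sends doesNotUnderstand: where this sketch returns None.

```python
# class name -> (superclass name, selectors defined in that class)
CLASSES = {
    "Object": (None, {"printString"}),
    "Behavior": ("Object", {"new"}),
    "ClassDescription": ("Behavior", set()),
    "Class": ("ClassDescription", set()),
    "Object class": ("Class", set()),
    "Point": ("Object", {"+"}),
    "Point class": ("Object class", {"x:y:"}),
}


def find_implementor(classes, start, selector):
    """Walk the superclass chain from `start` until a class defines selector."""
    name = start
    while name is not None:
        superclass, selectors = classes[name]
        if selector in selectors:
            return name
        name = superclass
    return None  # a real interpreter would send doesNotUnderstand: here


# `Point new` starts the search in Point's metaclass, not in Point.
new_implementor = find_implementor(CLASSES, "Point class", "new")
# An instance-side send starts in the receiver's class.
plus_implementor = find_implementor(CLASSES, "Point", "+")
```

The same loop serves instance-side and class-side sends. The only difference is where the search starts, which is why new needs no special treatment.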

And then you need a display system, because a Smalltalk that can’t show you anything isn’t really a Smalltalk. The original display model is BitBlt-based, which presents its own challenges in 2025.

That’s the scope of the project. The following six parts cover each piece in the order it was built.

Before the first line of C#, one more piece of context. The Smalltalk-80 source — the Part IV file-out — is roughly 41,400 lines of code defining a complete programming environment: the class hierarchy from Object down to BitBlt, the MVC framework, the compiler (which is itself in Smalltalk), the garbage collector (ditto), the file system, the debugging tools, all of it. It is a complete computing environment, self-described in the language it implements.

Building a system that can run that environment, starting from nothing, is a genuinely strange kind of bootstrapping problem. You are building the substrate for a system that, once running, could in principle rebuild its own substrate. The Blue Book understood this. Part III of the book is not a spec for an abstract VM. It is a description of the specific implementation choices the Xerox PARC team made so that the system in Part IV would run correctly on the interpreter in Part III.

They were building toward self-description. We are building backward from it.