14,840 lines, zero parse errors
The compiler parsed the entire Smalltalk-80 system source without error.
Smalltalk-80.sources is the complete source code of the original 1980 system: the class library, the compiler, the browser, the process scheduler, the graphics layer. Everything. 14,840 lines. It’s written in Smalltalk-80 file-in format — a specific chunk-delimited syntax that’s not quite what any modern parser expects.
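To make the chunk format concrete, here is a minimal Python sketch of a chunk reader. It is not the project's actual file-in code. It assumes the usual convention that a single ! ends a chunk and a doubled !! stands for a literal ! inside one. The name split_chunks is hypothetical.

```python
def split_chunks(text):
    """Split file-in text into chunks (hypothetical helper, not the project's code).

    A single "!" ends a chunk; a doubled "!!" is a literal "!" inside the chunk.
    Empty chunks are kept, because file-in readers use them as markers.
    """
    chunks = []
    current = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "!":
            if i + 1 < len(text) and text[i + 1] == "!":
                # Escaped delimiter: keep one "!" and skip both characters.
                current.append("!")
                i += 2
                continue
            # Unescaped delimiter: close the current chunk.
            chunks.append("".join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    # Text after the last delimiter counts as a final chunk if non-blank.
    tail = "".join(current).strip()
    if tail:
        chunks.append(tail)
    return chunks
```

A reader like this runs before any expression parsing, so the expression grammar never sees a ! delimiter. For example, split_chunks("Transcript show: 'Hi!!'! 3 + 4!") yields the two chunks Transcript show: 'Hi!' and 3 + 4.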
The grammar has been through several rewrites since June. The first version was lifted directly from an ANTLR4 Smalltalk grammar found online. It handled arithmetic and simple message sends but choked on cascades, on non-local block returns, on the ! chunk delimiters, on binary messages that look like comparison operators. Each failure mode required understanding what the Blue Book actually specifies rather than what modern Smalltalk dialects do.
The breakthrough today was fixing the statements grammar rule to correctly handle period-prefixed blocks, a Blue Book-ism where a statement separator can appear before the first statement in a block. Every modern Smalltalk parser I looked at got this wrong in the same direction: they treated the period strictly as a separator between statements and rejected one that appears before the first statement. Once that rule was fixed, the last class of parse failures evaporated.
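The difference between the two readings of the rule can be shown with a toy statement splitter in Python. This is an illustration only, not the real grammar: tokens are plain strings, "." is the only separator, and nested blocks and literals are ignored. The name parse_statements and the allow_leading_period flag are both hypothetical.

```python
def parse_statements(tokens, allow_leading_period=True):
    """Split a flat token list into statements (toy model, hypothetical names).

    With allow_leading_period=True, periods before the first statement are
    skipped. With False, the period is strictly a separator and a leading
    one is a syntax error.
    """
    pos = 0
    if allow_leading_period:
        # Skip any separators that appear before the first statement.
        while pos < len(tokens) and tokens[pos] == ".":
            pos += 1
    statements = []
    while pos < len(tokens):
        current = []
        while pos < len(tokens) and tokens[pos] != ".":
            current.append(tokens[pos])
            pos += 1
        if not current:
            raise SyntaxError(f"expected a statement at token {pos}")
        statements.append(current)
        if pos < len(tokens):
            pos += 1  # consume the separator; this also permits one trailing period
    return statements
```

With the default setting, parse_statements([".", "x", ":=", "1", ".", "^", "x"]) returns two statements. With allow_leading_period=False the same input raises SyntaxError, which is the failure mode described above.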
The parser now produces a clean AST for every method in the system. All 128 single-character symbols are in the symbol table. Cascade compilation generates correct DUP placement. The chunk-delimiter ! is handled at the file-in level, not the expression level, which is how the original system works.
Parsing is not compiling. The AST nodes exist but no bytecodes have been generated yet. That’s the next phase — walking the AST and emitting Blue Book-compliant bytecode sequences. But a parser that correctly understands the whole system is the prerequisite for everything else.
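As a rough picture of what that walk might emit for one node type, here is a Python sketch of cascade emission. It is not the project's emitter. The instruction names (push, dup, send, pop) are symbolic placeholders rather than Blue Book opcode numbers, and emit_cascade and its argument shapes are hypothetical. The idea is that the receiver is pushed once and duplicated before every message except the last, with each intermediate result popped.

```python
def emit_cascade(receiver, messages):
    """Return symbolic instructions for `receiver m1; m2; ...; mN`.

    Hypothetical sketch: `messages` is a list of (selector, args) pairs and
    the instruction tuples are placeholders, not real bytecodes.
    """
    if not messages:
        raise ValueError("a cascade needs at least one message")
    code = [("push", receiver)]
    for selector, args in messages[:-1]:
        code.append(("dup",))                 # keep a copy of the receiver
        code.extend(("push", a) for a in args)
        code.append(("send", selector, len(args)))
        code.append(("pop",))                 # discard the intermediate result
    # The last message consumes the original receiver; its result is the
    # value of the whole cascade.
    selector, args = messages[-1]
    code.extend(("push", a) for a in args)
    code.append(("send", selector, len(args)))
    return code
```

For Transcript show: 'a'; cr this produces push Transcript, dup, push 'a', send show:, pop, send cr, leaving exactly one value on the stack.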
Lines parsed: 14,840. Parse errors: 0. Grammar rules fixed in final session: 3.