a2fpga-series — Part 4
Four Bugs That Lived in the Dark
Adding a second expansion-ROM-capable card to the A2FPGA exposed four latent bus timing bugs in the upstream codebase. They had been present for approximately two years and went undetected because only one emulated card had ever used C8 space.
Part 4 of the A2FPGA series. This part covers the bugs in the upstream A2FPGA codebase that surfaced during Videx development.
Every one of these bugs was present in the upstream A2FPGA codebase before I started. None of them caused visible problems in the existing configuration — Mockingboard, SuperSprite, and Disk II don’t use C8 expansion ROM space, and the Super Serial Card was the only card that did. A single C8-capable card in a well-behaved slot is a much simpler system than two C8-capable cards competing for the same shared address space.
Adding the Videx VideoTerm changed everything. Videx needed slot 3, which meant navigating the INTCXROM hardware. Videx needed C8-space, which meant contending with the SSC for it. Videx ran firmware that executed I/O writes from C8 space — exactly the scenario that exposed the most serious timing bug. All four bugs became visible within the first few days of Videx testing. Three were submitted as pull requests to the upstream repository; one was internal to the Videx implementation.
Bug 1: The CPLD Bus OE Timing Error
This was the most architecturally interesting bug, and the hardest to diagnose — a runtime crash in the SSC firmware that only manifested when both the Videx card and SSC were running simultaneously. (Note: this is distinct from the Pascal initialization hang described in Part 2, which was a C8-space ownership issue in the Videx cold-start path. This bug is a bus contention problem that caused Pascal’s disk loading to fail after initialization completed.)
The A2FPGA emulates multiple cards through a single XC9572XL CPLD that acts as a bus bridge between the Apple II and the FPGA fabric. When an emulated card responds to a bus read, it asserts rd_en_o. The top-level logic ORs all cards’ rd_en_o signals together and drives a2_bridge_bus_d_oe_n_o to tell the CPLD to drive data onto the Apple II bus.
The original implementation was purely combinational:
assign a2_bridge_bus_d_oe_n_o = ~(data_out_en_i & BUS_DATA_OUT_ENABLE);
The problem is the CDC denoise pipeline. The FPGA samples the phi0 signal from the Apple II bus and runs it through a CDC synchronizer to eliminate metastability: 2 flip-flop synchronization stages, 3 debounce stages, and 1 FIFO stage — approximately 6 clock cycles at 54MHz, which is about 111ns.
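To make the stage count concrete, here is a minimal sketch of a filter with that shape: two synchronizer flops, three debounce history flops, and one output register. It is not the A2FPGA source. The module name, the port names, and the all-samples-agree debounce rule are assumptions, and the output register stands in for the final stage of the pipeline.

```systemverilog
// Hypothetical sketch, not the A2FPGA source.
// 2 sync flops + 3 debounce flops + 1 output register = 6 clocks of latency.
module phi0_denoise_sketch (
    input  wire clk,         // 54 MHz logic clock (about 18.5 ns per cycle)
    input  wire reset_n,     // synchronous, active-low
    input  wire phi0_async,  // raw phi0 from the Apple II bus
    output reg  phi0_clean   // filtered phi0, delayed relative to the bus
);
    reg [1:0] sync_r;  // metastability synchronizer
    reg [2:0] hist_r;  // last three synchronized samples

    always @(posedge clk) begin
        if (!reset_n) begin
            sync_r     <= 2'b00;
            hist_r     <= 3'b000;
            phi0_clean <= 1'b0;
        end else begin
            sync_r <= {sync_r[0], phi0_async};
            hist_r <= {hist_r[1:0], sync_r[1]};
            // Change the output only when all three samples agree, so a
            // glitch spanning fewer than three samples never reaches phi0_clean.
            if (&hist_r)
                phi0_clean <= 1'b1;
            else if (~|hist_r)
                phi0_clean <= 1'b0;
        end
    end
endmodule
```

In this sketch a clean edge on phi0_async reaches phi0_clean six rising clock edges later, which at 54 MHz is 6 × 18.5 ns, or roughly 111 ns.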
When the Apple II’s phi0 drops (the bus cycle ends), the FPGA doesn’t see that drop for ~111ns. During those 111ns, the CPLD is still driving the Apple II data bus. On real hardware, a slot card’s 74LS245 bus transceiver would have released within ~10-20ns of phi0 dropping. The FPGA’s CPLD stays active for ~100ns longer than it should.
For most operations, this doesn’t matter. The 6502 has already latched the data before phi0 dropped, so the extended drive doesn’t affect the read. But for I/O write cycles ($C0xx) that originate from C8-space instruction fetches — exactly what the Videx and SSC firmware do constantly — the Apple II motherboard’s I/O decode logic is sensitive to bus conditions during this window. The extended CPLD drive creates contention that interferes with the I/O decode timing. The result was Pascal disk loading failures: the SSC’s FINIT routine, called from C8 expansion ROM, would crash partway through loading track 0.
The fix uses the existing phase counter (phase_cycles_r) — already present in the code, tracking position within the phi0 half-cycle — to cut off the CPLD drive before the CDC-delayed phi0 drop arrives:
localparam int CDC_DELAY = 6; // CDC pipeline stages
localparam int OE_MARGIN = 2; // Extra cycles past estimated real phi0 drop
localparam int OE_CUTOFF = PHASE_COUNT - CDC_DELAY + OE_MARGIN; // = 22
wire oe_early_cutoff = a2bus_if.phi0 && (phase_cycles_r >= OE_CUTOFF[5:0]);
assign a2_bridge_bus_d_oe_n_o = ~(data_out_en_i & BUS_DATA_OUT_ENABLE & ~oe_early_cutoff);
The real phi0 drop occurs at approximately cycle PHASE_COUNT - CDC_DELAY (= 20) from the start of the phi0 half-cycle. The cutoff fires at cycle 22, about 37ns after the estimated real phi0 edge — close enough to real card behavior to eliminate the contention, conservative enough not to cut off data too early.
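The numbers in this fix can be checked with a few lines of simulation-only code. This is an illustrative sketch rather than part of the A2FPGA source: the module name is made up, and PHASE_COUNT = 26 is inferred from the values quoted above (26 - 6 = 20 and 26 - 6 + 2 = 22), not copied from the code.

```systemverilog
// Hypothetical simulation-only check of the OE cutoff arithmetic.
module oe_cutoff_numbers;
    localparam int  PHASE_COUNT = 26; // assumed: inferred from "= 20" and "= 22"
    localparam int  CDC_DELAY   = 6;  // CDC pipeline stages
    localparam int  OE_MARGIN   = 2;  // extra cycles past the estimated real drop
    localparam int  REAL_DROP   = PHASE_COUNT - CDC_DELAY; // estimated real phi0 drop
    localparam int  OE_CUTOFF   = REAL_DROP + OE_MARGIN;
    localparam real CLK_NS      = 1000.0 / 54.0;           // one 54 MHz cycle in ns

    initial begin
        $display("real phi0 drop  : cycle %0d", REAL_DROP);
        $display("OE cutoff       : cycle %0d", OE_CUTOFF);
        $display("drive past edge : %0.1f ns with the cutoff, %0.1f ns without",
                 OE_MARGIN * CLK_NS, CDC_DELAY * CLK_NS);
        $display("cutoff lead     : %0d cycles before the FPGA sees phi0 fall",
                 PHASE_COUNT - OE_CUTOFF);
        $finish;
    end
endmodule
```

Under these assumptions the cutoff shortens the drive past the real phi0 edge from about 111 ns to about 37 ns, and it fires four clock cycles before the CDC-delayed phi0 drop reaches the logic.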
Registering the OE signal (the obvious alternative) does not work. Hardware testing confirmed: a registered OE extends the drive time by an additional cycle, making contention worse, not better. The combinational assignment must remain; the fix adds an early cutoff gate.
After this fix, Pascal 1.3 boots cleanly with all five emulated cards enabled.
Bug 2: INTC8ROM Permanently Breaks C8 Reads on the Apple ][+
This is a straightforward logic error with a severe effect.
INTC8ROM is an Apple IIe soft switch. On the IIe, accessing $C3xx with SLOTC3ROM=0 sets INTC8ROM=1, which routes $C800–$CFFF to the IIe’s internal 80-column firmware ROM rather than to external expansion cards. This is how the built-in 80-column card on the IIe works.
On the Apple ][+, SLOTC3ROM does not exist as a soft switch. It defaults to 0. The upstream code did not check what kind of Apple II it was running on:
// Original (buggy):
if (!a2mem_if.SLOTC3ROM && (a2bus_if.addr[15:8] == 8'hC3))
    INTC8ROM <= 1'b1;
On a ][+, SLOTC3ROM is always 0. Any access to $C3xx — including the very first PR#3 to activate the Videx card — permanently sets INTC8ROM=1. With INTC8ROM set, the slot framework’s io_strobe_n generation is suppressed for all cards. Every card that uses C8 expansion ROM stops working. This is a system-wide failure triggered by the first slot 3 access.
The fix adds a runtime IIe detection register. The IIe’s boot ROM writes to $C00x soft switches within milliseconds of startup. A ][+ never touches these addresses. A single register tracks whether we’ve seen those writes:
reg is_iie;
always @(posedge clk_logic or negedge system_reset_n) begin
    if (!system_reset_n)
        is_iie <= 1'b0;
    else if (!rw_n && phi1_posedge &&
             (addr[15:4] == 12'hC00) && !m2sel_n)
        is_iie <= 1'b1;
end
INTC8ROM is then gated by is_iie:
if (is_iie && !SLOTC3ROM && (addr[15:8] == 8'hC3))
    INTC8ROM <= 1'b1;
else
    INTC8ROM <= 1'b0;
On a ][+, is_iie stays 0 and INTC8ROM is never set — matching real ][+ hardware, which has no INTCXROM mechanism at all. On an IIe, is_iie is set within milliseconds of boot and INTC8ROM works exactly as before, with the addition of the correct else clear (which was also missing — on a real IIe, INTC8ROM clears when a non-slot-3 $Csxx is accessed).
Bug 3: SSC Expansion ROM Responds to the Wrong Slot
The Super Serial Card’s C8-space ownership flag (C8S2) is set when the CPU accesses $C2xx (slot 2’s ROM space) and cleared on $CFFF. The problem: C8S2 can remain set after a slot transition without an intervening $CFFF access. If C8S2 is stale and another slot’s expansion ROM is being accessed, the SSC still drives the data bus.
On real hardware with physical per-slot bus transceivers, stale ownership flags are harmless — each card has its own bus drivers, and the card’s drivers are only enabled when the card’s slot is actually selected. On the A2FPGA, all cards share one CPLD. A stale C8S2 causes the SSC to drive the bus during Videx’s C8-space reads, or during any other card’s expansion ROM access.
The fix adds a SLOTROM guard — a check that the most recently accessed $Csxx slot was slot 2:
// Before:
assign ENA_C8S = {(C8S2 & !INTCXROM), addr[15:11]} == 6'b111001;
// After:
assign ENA_C8S = ({(C8S2 & !INTCXROM), addr[15:11]} == 6'b111001)
                 && (SLOTROM == 3'd2);
SLOTROM is already maintained by apple_memory.sv and accurately tracks which slot’s $Csxx space was most recently accessed. The guard adds one 3-bit comparison.
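For readers who have not looked at apple_memory.sv, a slot tracker of this kind could look roughly like the sketch below. This is an assumption about the general shape, not the upstream code: the module and port names are hypothetical, and the phi0 qualification and reset value are choices made for the example.

```systemverilog
// Hypothetical sketch of a "most recently accessed slot" tracker.
// Not the upstream apple_memory.sv implementation.
module slotrom_tracker_sketch (
    input  wire        clk,
    input  wire        reset_n,  // synchronous, active-low
    input  wire        phi0,     // bus phase during which the address is valid
    input  wire [15:0] addr,
    output reg  [2:0]  slotrom   // 1..7 = last slot whose $Csxx page was accessed
);
    // $C100-$C7FF: addr[15:11] == 5'b11000 with a non-zero slot field
    wire slot_page = (addr[15:11] == 5'b11000) && (addr[10:8] != 3'd0);

    always @(posedge clk) begin
        if (!reset_n)
            slotrom <= 3'd0;        // no slot selected yet
        else if (phi0 && slot_page)
            slotrom <= addr[10:8];  // $C2xx -> 2, $C3xx -> 3, and so on
    end
endmodule
```

With a tracker like this, an access to $C3xx moves slotrom to 3, so the (SLOTROM == 3'd2) term deasserts ENA_C8S even if C8S2 is still set.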
Bug 4: SSC Phi0 Qualification Missing
The SSC’s C8-space ownership logic was not phi0-qualified. When any card de-asserts rd_en_o at the phi0→phi1 boundary, the PCB bus transceiver creates real address glitches — the signal lines ring as they transition. At 54MHz, the FPGA samples these glitches and can interpret them as valid addresses.
Without phi0 qualification, the SSC’s $CFFF detection fires on glitches during the phi0→phi1 transition, clearing C8S2 mid-execution. The SSC’s expansion ROM ownership disappears while the SSC firmware is running — the bus goes open, and the firmware crashes.
The fix is one line: gate the address detection with a2bus_if.phi0.
// After:
always @(posedge clk_logic) begin
    if (!system_reset_n) C8S2 <= 1'b0;
    else if (a2bus_if.phi0) begin // ← the fix
        case (addr[15:8])
            8'hC2: if (!INTCXROM) C8S2 <= 1'b1;
            8'hCF: if (!INTCXROM && addr[7:0] == 8'hFF) C8S2 <= 1'b0;
        endcase
    end
end
phi0 is the valid data phase of the Apple II bus. Address-sensitive logic should always be qualified by phi0 to avoid responding to transients during phi1. This is the same pattern applied to the Videx cfff_access detection, and should be the standard for any future card emulation that uses C8-space ownership.
A Note on Character ROM Storage
One more observation from the Videx implementation that belongs in the same conversation as these bugs: the Videx character ROM stores 256 characters, but the upper 128 ($80–$FF) are exactly the bit-inverse of the lower 128 ($00–$7F). This is not an approximation — it’s a documented property of the Videx character set, used to implement inverse video without additional hardware.
The original hardware uses two physical EPROMs to store both halves. In the FPGA, storing both halves costs 2 BSRAM blocks. Storing only the lower 128 and inverting in logic costs 1 BSRAM block plus one XOR gate per pixel. On a device with 46 BSRAM blocks total where the five-card build uses 45, this is not optional. It’s also not a bug in any conventional sense — it’s the kind of resource accounting that only becomes visible when you’re building for a device this close to its limits.
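A minimal sketch of the half-size ROM with inversion in logic follows. It is illustrative only: the module name, the port names, the 16-rows-per-glyph layout (128 × 16 × 8 bits = 2 KB), and the hex file name are assumptions, not details taken from the Videx implementation.

```systemverilog
// Hypothetical sketch: store only glyphs $00-$7F and derive $80-$FF by inversion.
module videx_charrom_sketch (
    input  wire       clk,
    input  wire [7:0] char_code,  // $00-$FF from display memory
    input  wire [3:0] row,        // scanline within the character cell (assumed 16 rows)
    output wire [7:0] pixels      // one row of glyph pixels
);
    // Lower 128 glyphs only: 128 x 16 rows x 8 bits = 2 KB
    reg [7:0] rom [0:2047];
    initial $readmemh("videx_chars_lower128.hex", rom); // hypothetical file name

    reg [7:0] rom_q;
    reg       invert_q;

    // Registered read so the array can map onto block RAM; bit 7 of the
    // character code is delayed by the same cycle to stay aligned with the data.
    always @(posedge clk) begin
        rom_q    <= rom[{char_code[6:0], row}];
        invert_q <= char_code[7];
    end

    // One XOR per pixel: codes $80-$FF are the bit-inverse of $00-$7F.
    assign pixels = rom_q ^ {8{invert_q}};
endmodule
```

The design choice is the usual trade of logic for memory: the XOR costs a handful of LUTs, while the second half of the table would cost a whole BSRAM block.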
The Pattern
Looking at these four bugs together, they share a structure. Bugs 1, 3, and 4, and aspects of bug 2, are variations on the same fundamental issue: the A2FPGA is a 54MHz system pretending to be a 1MHz system, and the places where that pretense breaks down are exactly the places where timing qualifications are missing.
The Apple II bus was designed for hardware that physically could not respond faster than phi0 allowed. The FPGA samples everything on every clock edge and has no such constraint. Without careful qualification — phi0 gating, phase counter cutoffs, ownership guards — the FPGA sees bus events that real hardware would have been physically incapable of seeing. Most of the time this is fine. In the specific scenarios where it isn’t, the failures are intermittent, timing-dependent, and reproducible only under exactly the conditions that expose them.
Adding a second C8-capable card created exactly those conditions. The Videx card and the SSC compete for the same address space, exercise C8-space ownership simultaneously, and generate the bus transceiver glitches that expose unqualified edge detection. Every one of these bugs required that specific combination to become visible.
The upstream pull requests (PRs #35–#38) were accepted. The fixes are now in the main A2FPGA codebase, where they benefit any future card emulation that uses expansion ROM space.
A Remaining HDMI Mystery
The AVI InfoFrame and control period fixes (PRs #35 and #38) resolved compatibility with most displays — after applying them, the A2FPGA produces clean HDMI output on a range of flat-panel monitors that previously rejected the signal entirely.
One display still does not cooperate: a Samsung Odyssey Neo G9. The G9 receives the A2FPGA’s HDMI signal and produces nothing — no image, no “no signal” message, just blank. The same signal routed through an HDMI switch first, then from the switch to the G9, displays perfectly.
The HDMI switch is not doing anything sophisticated. It is not upscaling, transcoding, or reformatting the signal. It is acting as a simple repeater with some signal conditioning circuitry, and that conditioning is apparently enough to satisfy whatever the G9 requires that the A2FPGA’s raw output does not provide.
The most likely explanation is an electrical issue rather than a protocol issue: HDMI signal levels, impedance matching, or TMDS pre-emphasis that is marginal from the Tang Nano 20K’s FPGA I/O pins and falls within acceptable range for most monitors but outside the G9’s tolerance. The HDMI switch regenerates the signal with its own output drivers, bringing it back within spec.
This is consistent with the G9 being a high-end gaming monitor with unusually strict HDMI compliance checking. Consumer TVs and general-purpose monitors tend to be more tolerant of marginal signals. The fix — HDMI switch in the chain — is inelegant but reliable. A proper fix would require either changes to the FPGA’s I/O drive strength and termination configuration or a hardware revision to the A2FPGA PCB’s HDMI output stage, neither of which is in scope for this fork.