Early attempts at AI coding were essentially "prompt and pray." We call this Vibe Coding. You open a chat window, paste a request, and hope the model understands. However, this approach hits a hard architectural limit: Bounded Attention.
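LLMs have a fixed context window. As a conversation grows, the oldest turns must be dropped or summarized to make room for new tokens, so the model effectively "forgets" the beginning. Even when everything still fits, the model can get "lost in the middle," attending less reliably to material buried deep in a long context.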
When the context window overflows, the model loses the initial system instructions. It starts hallucinating variables or reverting to generic coding styles because it literally cannot see the definitions you provided ten minutes ago.
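The overflow described above can be sketched in a few lines. This is a deliberately simplified model, not how any particular chat product works: the `visible_context` function, the word-count "tokenizer," and the 20-token budget are all assumptions made for illustration. It keeps only the most recent messages that fit the budget and shows the system instruction falling out of view.

```python
from collections import deque


def visible_context(messages, max_tokens):
    """Return the most recent messages that fit in max_tokens.

    Assumption: the oldest messages are dropped first, and a token
    is approximated by a whitespace-separated word.
    """
    kept = deque()
    used = 0
    for msg in reversed(messages):
        cost = len(msg.split())  # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break
        kept.appendleft(msg)
        used += cost
    return list(kept)


# A system instruction followed by five user turns.
history = ["SYSTEM: use snake_case and the Order dataclass"]
history += [f"turn {i}: please refactor module {i}" for i in range(1, 6)]

window = visible_context(history, max_tokens=20)
system_visible = any(m.startswith("SYSTEM") for m in window)

print(window)
print("system prompt visible:", system_visible)
```

With a large enough budget the whole history survives. Once the budget is tight, the first thing to go is the instruction that defined the rules.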
The Harness Architecture
To solve this, we stop treating the LLM as a "Brain" and start treating it as a "Processor." We wrap the model in a Harness.
Instead of one long chat, we break the process into specialized agents and distinct sessions. Crucially, the "memory" isn't stored in the chat history; it is stored in Persistent State Files.
[Diagram: Agent -> feature_list.json -> Agent]
In this architecture, agents are ephemeral. They wake up, read the State file, perform one task, update the State file, and then die. This ensures every step starts with a fresh, clean context window, so Context Rot, the gradual loss of early instructions as a session grows, never gets a chance to build up.
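The read, act, write, exit cycle might look like the sketch below. Only the file name `feature_list.json` comes from the article. The JSON schema (`features`, `status`, `result`), the `run_agent_once` function, and the `call_model` stub are hypothetical and stand in for a real state format and a real LLM call.

```python
import json
import tempfile
from pathlib import Path


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned result."""
    return f"implemented: {prompt}"


def run_agent_once(state_path: Path) -> bool:
    """One ephemeral session: read state, do one task, write state back."""
    state = json.loads(state_path.read_text())
    pending = [f for f in state["features"] if f["status"] == "pending"]
    if not pending:
        return False  # nothing left to do
    feature = pending[0]
    # The prompt is built only from the state file, never from chat history.
    feature["result"] = call_model(feature["name"])
    feature["status"] = "done"
    state_path.write_text(json.dumps(state, indent=2))
    return True  # the session ends here; nothing is carried over in memory


with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "feature_list.json"
    state_path.write_text(json.dumps({"features": [
        {"name": "login form", "status": "pending"},
        {"name": "password reset", "status": "pending"},
    ]}))

    sessions = 0
    while run_agent_once(state_path):  # each call is a fresh "agent"
        sessions += 1
    final_state = json.loads(state_path.read_text())

print(sessions, [f["status"] for f in final_state["features"]])
```

The design choice worth noticing is that `run_agent_once` takes no conversation argument at all. Everything an agent knows arrives through the file, so the size of its context depends on the state, not on how long the project has been running.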
The Mathematics of Reliability
Even with perfect context management, we face a second problem: Compounding Errors. If an agent is 95% accurate, that sounds excellent. But in a multi-step workflow, probabilities multiply.
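Assuming each step succeeds or fails independently, the chance that the whole workflow comes out right is

$$P_{\text{total}} = P^{N}$$

where P is the per-step accuracy and N is the number of steps. Reliability collapses quickly: at P = 0.95, a 10-step workflow succeeds about 60% of the time, and a 20-step workflow about 36% of the time.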
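To get a feel for how fast the multiplication bites, the short sketch below computes the end-to-end success rate as `p ** n` and then inverts it to ask how many steps fit under a given reliability target. It assumes independent errors at every step. The function names and the 50% target are illustrative choices, not part of the article.

```python
import math


def chain_reliability(p: float, n: int) -> float:
    """Probability that all n steps succeed, assuming independent errors."""
    return p ** n


def max_steps(p: float, target: float) -> int:
    """Largest n for which p ** n is still at least target."""
    return math.floor(math.log(target) / math.log(p))


for n in (1, 5, 10, 20, 50):
    print(f"{n:>3} steps: {chain_reliability(0.95, n):.1%}")

budget = max_steps(0.95, 0.5)
print("steps before dropping below 50%:", budget)
```

Read the other way around, the second function gives a step budget: the number of unchecked steps a workflow can run before its output is more likely wrong than right.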
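With Checkpoints, we stop the machine periodically. A human reviews the code (or an automated test suite runs). If errors are found, they are fixed before proceeding. To the extent the review catches every error, this resets the probability curve back to 100% at every checkpoint, so errors compound only across the steps between two checkpoints rather than across the whole workflow.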
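The reset effect can be sketched as a curve. The model below rests on one idealized assumption, flagged in the code: every checkpoint catches and fixes all errors made so far. The `reliability_curve` function and the checkpoint interval of 5 are hypothetical choices for illustration.

```python
def reliability_curve(p, steps, checkpoint_every=None):
    """Reliability after each step, recorded just before any checkpoint runs.

    Assumption: a checkpoint catches and fixes every error made so far,
    which resets reliability to 1.0. Real reviews are not perfect.
    """
    curve = []
    current = 1.0
    for step in range(1, steps + 1):
        current *= p
        curve.append(current)
        if checkpoint_every and step % checkpoint_every == 0:
            current = 1.0  # idealized reset at the checkpoint
    return curve


without_cp = reliability_curve(0.95, 20)
with_cp = reliability_curve(0.95, 20, checkpoint_every=5)

print(f"no checkpoints, after 20 steps: {without_cp[-1]:.1%}")
print(f"checkpoint every 5 steps, worst point: {min(with_cp):.1%}")
```

Under that assumption, the worst case is set by the distance between checkpoints rather than by the total length of the workflow, which is why shorter intervals buy reliability at the cost of more review time.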
The Harness doesn't make the AI smarter; it creates a safety net that allows us to trust the output of a probabilistic machine.