I rewrote 13 software engineering books into AGENTS.md rules.
I rewrote 13 software engineering books into AGENTS.md rules because context windows are finite and senior judgment is expensive. Most developers treat AI coding assistants like autocomplete on steroids, but that ignores the structural decay that sets in when you scale complexity without architectural guardrails. The problem isn't generating code; it's generating *correct* code that fits into a maintainable system. Distilling the core tenets of canonical engineering texts into machine-readable instruction sets turns the AI from a code monkey into a disciplined junior engineer who actually reads the documentation.
The Context Gap: Why LLMs Fail at Scale
Large Language Models are probabilistic engines, not deterministic compilers. When you ask an LLM to "write a REST API," it gives you the most statistically likely REST API it has seen in its training data. This works for a toy project. It fails catastrophically in a production environment where state management, transaction integrity, and domain boundaries matter.
The fundamental issue is that LLMs have no long-term memory of your specific codebase unless you explicitly feed it to them. Standard prompts are ephemeral; they vanish when the session ends. Without a persistent source of truth that dictates architectural constraints, every new chat session starts with the AI "forgetting" that you prefer Clean Architecture over Hexagonal, or that you strictly enforce SOLID principles.
This is where the `AGENTS.md` concept emerges as a critical infrastructure layer. It acts as a persistent system prompt that lives in your repository, ensuring that every time the AI initializes, it inherits the collective wisdom of decades of software engineering practice. It’s not just about syntax; it’s about enforcing a philosophy of design before a single line of code is written.
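To make this concrete, a minimal `AGENTS.md` might open like this. The specific rules below are illustrative placeholders I've invented for this sketch, not the full distilled set:

```markdown
# AGENTS.md

## Architecture
- Follow Clean Architecture layering: domain -> application -> infrastructure.
- Dependencies point inward; the domain layer imports nothing from outer layers.

## Design
- Enforce SOLID principles in every generated class.
- Prefer deep modules: minimal public APIs over substantial internal functionality.
```

Because the assistant reads this file on initialization, the constraints survive across sessions without being restated in every prompt.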
Decoding the Canon: From Books to Bytes
The selection of books for this distillation process was deliberate. We didn't pick random bestsellers; we picked the texts that define how modern software is structured. The list includes heavy hitters like Clean Code and Clean Architecture by Robert C. Martin, A Philosophy of Software Design by John Ousterhout, and Designing Data-Intensive Applications by Martin Kleppmann.
Translating these texts into `AGENTS.md` rules requires stripping away the narrative and retaining the imperative. For example, Ousterhout's concept of "deep modules" (simple interfaces hiding substantial functionality) becomes a strict rule: "Prefer modules with minimal public APIs and maximal internal functionality." This prevents the AI from exposing unnecessary internal state, a common failure mode when LLMs generate overly verbose interfaces.
Similarly, Martin Kleppmann’s work on data-intensive applications provides rules for handling consistency models. Instead of the AI guessing whether to use strong or eventual consistency, the `AGENTS.md` file contains explicit directives: "For user-facing data, prioritize availability; for financial transactions, enforce strong consistency." This transforms abstract theoretical concepts into actionable constraints that the AI cannot ignore.
Resolving Tensions: Ousterhout vs. The Clean Code Dogma
One of the most valuable aspects of this synthesis is resolving contradictions between canonical texts. Software engineering is not a monolith; it is a collection of competing heuristics. A prime example is the tension between Robert C. Martin’s Clean Code and John Ousterhout’s A Philosophy of Software Design.
Martin advocates for short functions and minimal complexity per unit, which often yields code that is easy to read but fragmented across many tiny pieces. Ousterhout argues for deep modules that hide complexity behind simple interfaces, which can mean longer functions and more substantial implementations. If you feed both books to an LLM without mediation, it will oscillate between these styles and produce an inconsistent codebase.
The `AGENTS.md` resolution prioritizes Ousterhout’s "Deep Modules" for public APIs and Martin’s "Clean Code" for internal implementation details. This hybrid approach ensures that the interface presented to other developers (or other parts of the system) is simple and robust, while the internal logic remains clean and testable. This nuance is impossible to achieve with a generic "write clean code" prompt.
- Rule 1: Public APIs must be minimal and intuitive (Ousterhout).
- Rule 2: Internal functions must be short, focused, and well-tested (Martin).
- Rule 3: Complexity should be hidden behind abstractions, not exposed through verbose interfaces.
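A hypothetical sketch of the three rules applied together: one minimal public method (Rule 1) implemented by short, single-purpose internals (Rule 2), so the details of the calculation stay hidden (Rule 3). The `InvoiceCalculator` class and its flat tax rate are invented for illustration.

```python
class InvoiceCalculator:
    """Computes an invoice total from (description, price) line items."""

    TAX_RATE = 0.08  # invented flat rate for the example

    def __init__(self, line_items: list):
        self._items = line_items

    # Rule 1 (Ousterhout): the public API is a single intuitive method.
    def total(self) -> float:
        return round(self._subtotal() + self._tax(), 2)

    # Rule 2 (Martin): internals are short and focused on one thing.
    def _subtotal(self) -> float:
        return sum(price for _, price in self._items)

    def _tax(self) -> float:
        return self._subtotal() * self.TAX_RATE
```

Other code sees only `total()`; how tax is computed (Rule 3) can change without any caller noticing.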
Domain-Driven Design: The AI’s Blind Spot
Domain-Driven Design (DDD) is notoriously difficult for LLMs to grasp because it requires understanding business context, not just syntax. Eric Evans’ Domain-Driven Design and Vaughn Vernon’s Implementing Domain-Driven Design provide the framework, but the AI needs explicit rules to apply them.
Without guidance, an LLM will often create anemic domain models — entities that are essentially data bags with getters and setters. This violates the core tenet of DDD: behavior belongs in the domain. The `AGENTS.md` rules explicitly forbid anemic models. Instead, they instruct the AI to place business logic within the entity itself, ensuring that invariants are maintained at the object level.
For instance, when creating a `User` entity, the AI is instructed to include methods like `changeEmail()` rather than allowing direct assignment of the email field. This ensures that validation logic (e.g., checking for valid email format) is encapsulated within the domain object. This level of architectural discipline is what separates a prototype from a production-ready system.
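A minimal sketch of such an entity in Python. The validation regex is my assumption about what "valid email format" means here, and the article's `changeEmail()` is rendered in Python's snake_case:

```python
import re

class User:
    """Non-anemic domain entity: the email invariant lives inside the object."""

    # Deliberately simple pattern, assumed for illustration only.
    _EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

    def __init__(self, email: str):
        self._email = self._validated(email)

    @property
    def email(self) -> str:
        return self._email

    def change_email(self, new_email: str) -> None:
        # Behavior belongs in the domain: no caller can bypass validation
        # by assigning the field directly.
        self._email = self._validated(new_email)

    @classmethod
    def _validated(cls, email: str) -> str:
        if not cls._EMAIL_RE.match(email):
            raise ValueError(f"invalid email: {email!r}")
        return email.lower()
```

Every path that mutates the email runs through `_validated`, so the invariant holds for the entity's entire lifetime rather than depending on each call site remembering to check.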
Implementation: Cursor, Claude, and Codex
The utility of these rules depends on the toolchain. Currently, the `AGENTS.md` format is optimized for Cursor, Claude, and Codex. These tools allow for persistent context injection, meaning the rules are applied consistently across sessions.
In Cursor, for example, the `AGENTS.md` file is treated as a high-priority context source. When you ask Cursor to generate a new service, it first consults the rules derived from Clean Architecture to determine the layering of the service. It then applies the DDD rules to structure the domain entities. Finally, it uses the Code Complete rules to ensure naming conventions and error handling are consistent.
This creates a feedback loop where the AI’s output is constrained by the best practices of the industry. It’s not just about writing code faster; it’s about writing code that is easier to maintain, test, and scale. The result is a codebase that feels like it was written by a senior engineer who has read all the right books.
Practical Application: Beyond the Hype
While the concept of rewriting books into rules is compelling, the real value lies in execution. Many developers attempt to create their own `AGENTS.md` files, but they often miss the nuance of the original texts. They might include a rule like "write unit tests," but fail to specify how to write them according to Code Complete or Clean Code.
The synthesized rules go deeper. They specify that unit tests should exercise behavior, not implementation details. They mandate that tests be independent and repeatable. They enforce that mocks be used sparingly, and only when necessary for isolation. These specifics are what separate a superficial prompt from a robust engineering standard.
If you are looking to operationalize this approach without spending weeks distilling these texts yourself, consider the AI Operator Startup Kit. It provides pre-built workflows and system prompts that embody these principles, allowing you to launch a disciplined AI-driven development process immediately. This kit is designed for operators who want to turn AI agents into a profitable, scalable business by enforcing quality from day one.
Where to go from here
The future of software engineering is not just about AI writing code; it’s about AI understanding architecture. By encoding the wisdom of 13 foundational books into `AGENTS.md` rules, we create a bridge between human expertise and machine execution. This approach ensures that as we scale development with AI, we do not sacrifice the quality and maintainability that define professional software.
Start by auditing your current prompts. Are they generic? Do they lack architectural constraints? If so, you are leaving quality on the table. Integrate these distilled rules into your workflow. Use tools like Cursor and Claude to enforce them. And remember, the goal is not just speed; it’s sustainability. For those ready to take this further, the AI Operator Startup Kit offers a comprehensive framework to build and scale your AI operations with these best practices at the core.