How to Run AI Agents on Your Codebase Without Burning Your Budget
Running AI agents on your codebase feels like handing a toddler the keys to a Ferrari. You want speed, but you’re terrified of the crash. Most teams are watching their AI budgets evaporate because they treat agents like chatbots that happen to have file access. The reality is harsher: without strict architectural constraints, agents write clever, untestable, and deeply coupled code that breaks your production environment. I solved this by distilling thirteen foundational software engineering books into a single `AGENTS.md` configuration file, forcing the AI to respect decades of engineering wisdom before it writes a single line of code.
The Uber Trap: Why "Just Let It Code" Fails
Recent reports indicate that Uber exhausted its AI budget in just four months. The culprit wasn’t the cost of the API calls themselves, but the inefficiency of the workflow. When you allow an AI agent to operate without structural guardrails, it defaults to the path of least resistance. It generates verbose, repetitive, and often incorrect code that requires human review, debugging, and refactoring. This turns the agent from a multiplier into a liability.
The problem isn't the model's intelligence; it's the lack of context. An LLM doesn't inherently know what "Clean Code" means unless you define it in its system prompt. It doesn't understand "Domain-Driven Design" unless you provide the bounded contexts. Without these definitions, the agent optimizes for syntactic correctness, not architectural integrity. It will create a class that works, but it will likely be a god object that violates the Single Responsibility Principle.
To fix this, you must shift from prompting for code to prompting for structure. This requires a bridge between human engineering standards and machine instruction sets. That bridge is the `AGENTS.md` file. By codifying the principles from books like *Clean Architecture* and *A Philosophy of Software Design*, you transform the agent from a chaotic coder into a disciplined engineer.
The Anatomy of AGENTS.md
The `AGENTS.md` file is not just a README for humans; it is the constitution for your AI agent. It lives in the root of your repository and is ingested by tools like Claude Code, Codex, and Cursor before any task begins. It defines the rules of engagement, the architectural patterns to follow, and the anti-patterns to avoid.
I synthesized thirteen key texts into this single file. The goal was to strip away the academic fluff and leave only the actionable constraints. For example, John Ousterhout’s *A Philosophy of Software Design* emphasizes complexity management. In the agent's rules, this translates to: "Keep functions small. If a function has more than three nested conditionals, refactor it." Robert C. Martin’s *Clean Code* becomes: "Name variables by their intent, not their type. Never use magic numbers."
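To make those two rules concrete, here is a minimal sketch of code the agent is steered toward: named constants instead of magic numbers, intent-revealing names, and guard clauses instead of nested conditionals. The function and thresholds are hypothetical illustrations, not examples from the book texts.

```python
# Hypothetical illustration of the naming and nesting rules above.

MAX_LOGIN_ATTEMPTS = 3   # named constant instead of a magic number
LOCKOUT_SECONDS = 900    # attempts older than this window are forgiven

def should_lock_account(failed_attempts: int, seconds_since_first_failure: int) -> bool:
    """Guard clauses keep nesting flat; names state intent, not type."""
    if failed_attempts < MAX_LOGIN_ATTEMPTS:
        return False
    if seconds_since_first_failure > LOCKOUT_SECONDS:
        return False
    return True
```

The same logic written with nested `if` blocks and bare `3` and `900` literals is what the rule exists to reject.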
This file serves as a persistent memory for the agent. Since LLMs have limited context windows, you cannot paste the entire text of *Designing Data-Intensive Applications* into every prompt. Instead, you distill Martin Kleppmann’s insights into bullet points about consistency models and partitioning strategies. The agent references these rules dynamically, ensuring that every piece of generated code aligns with enterprise-grade standards.
- Complexity Management: Reduce cognitive load by preferring deep modules: simple interfaces that hide substantial implementation.
- Separation of Concerns: Enforce strict boundaries between UI, business logic, and data access.
- Testability: Code must be written with testing in mind, not as an afterthought.
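The "Separation of Concerns" bullet can be sketched in a few lines: a data-access class that knows nothing about business rules, and a service class that depends on it only through its interface. All class and method names here are illustrative, not from the distilled rules.

```python
class InMemoryUserRepository:
    """Data access layer: knows about storage, nothing about business rules."""
    def __init__(self):
        self._users = {}

    def save(self, user_id: str, email: str) -> None:
        self._users[user_id] = email

    def find(self, user_id: str):
        return self._users.get(user_id)


class UserService:
    """Business logic layer: depends on a repository interface, not on storage."""
    def __init__(self, repository):
        self._repository = repository

    def register(self, user_id: str, email: str) -> None:
        if "@" not in email:
            raise ValueError("invalid email")
        self._repository.save(user_id, email)
```

Because `UserService` only calls `save` and `find`, the in-memory repository can be swapped for a database-backed one without touching the business logic, which is also what makes the service trivially testable.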
Distilling the Classics into Machine Instructions
The process of rewriting these books was an exercise in translation. Software engineering books are written for humans, with nuance, anecdotes, and gradual explanations. AI agents need explicit, unambiguous instructions. Here is how the core texts map to agent behavior:
Domain-Driven Design (Evans, Vernon): The agent must identify bounded contexts before writing code. It should not mix user authentication logic with payment processing logic. The rule in `AGENTS.md` explicitly states: "Define the domain language. Use ubiquitous terminology in variable names. Do not introduce anemic domain models."
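A minimal sketch of what the anti-anemic rule steers toward: a domain model that carries its own behavior in the ubiquitous language of its context, rather than a bag of fields mutated from outside. The ordering context and its names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Order:
    """Rich domain model: behavior lives on the model itself."""
    subtotal_cents: int
    loyalty_discount_applied: bool = False

    def apply_loyalty_discount(self) -> None:
        # Invariant enforced inside the domain, not by every caller.
        if self.loyalty_discount_applied:
            raise ValueError("loyalty discount already applied")
        self.subtotal_cents = self.subtotal_cents * 90 // 100
        self.loyalty_discount_applied = True
```

The anemic version, where a service reaches in and rewrites `subtotal_cents` directly, is exactly what the rule tells the agent not to generate.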
Clean Architecture (Martin): The agent must respect dependency rules. Dependencies point inward. The agent is instructed to never let a database schema dictate the structure of the business logic. If the agent attempts to import a database driver into a core domain module, it must halt and refactor.
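The "halt and refactor" check is mechanical enough to sketch as code: scan a domain-layer file for imports of infrastructure packages. The forbidden list and function name are illustrative, not part of the actual `AGENTS.md` file.

```python
# Hypothetical dependency-rule check: domain modules must not import drivers.
FORBIDDEN_IN_DOMAIN = ("psycopg2", "sqlalchemy", "pymongo")

def forbidden_imports(source: str, forbidden=FORBIDDEN_IN_DOMAIN) -> list:
    """Return the forbidden module names a domain-layer file imports."""
    hits = []
    for line in source.splitlines():
        stripped = line.strip()
        for name in forbidden:
            if stripped.startswith(f"import {name}") or stripped.startswith(f"from {name}"):
                hits.append(name)
    return hits
```

An agent (or a pre-commit hook) can run this over every file under the domain package and refuse to proceed when the list is non-empty.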
Code Complete (McConnell): This provides the tactical rules. "Check all inputs." "Avoid side effects." The agent is programmed to validate arguments at the start of every function and to return early on failure conditions. This reduces nesting and makes the code easier to read and debug.
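In practice, the "check all inputs, return early" rule produces functions shaped like this sketch (the transfer example is illustrative):

```python
def transfer(amount_cents: int, balance_cents: int) -> int:
    """Validate arguments first and fail fast; the happy path stays unnested."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    if amount_cents > balance_cents:
        raise ValueError("insufficient funds")
    return balance_cents - amount_cents
```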
By combining these sources, you create a composite expert. The agent doesn't just know how to write Python or JavaScript; it knows how to write maintainable Python or JavaScript. It understands that a clever one-liner is often worse than a verbose, clear block of code.
Resolving Tensions in Engineering Philosophy
Not all engineering advice aligns perfectly. There is often tension between different schools of thought. For instance, *Clean Code* advocates for very small functions, while *Designing Data-Intensive Applications* might require complex orchestration logic for distributed systems. If you feed these contradictory instructions to an AI, it will hallucinate or produce inconsistent code.
The solution is hierarchy. In the `AGENTS.md` file, I established a precedence order. Architectural integrity (Clean Architecture) takes precedence over code aesthetics (Clean Code). If a function needs to be slightly larger to maintain a clear boundary between domains, it is allowed. However, if the complexity comes from poor data structure choices, it is rejected.
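One way to encode that precedence directly in the file, so the agent has a tiebreaker when rules conflict (the wording here is an illustrative sketch, not the author's exact text):

```markdown
## Rule Precedence
When rules conflict, resolve in this order:
1. Architectural integrity (Clean Architecture): dependency direction, domain boundaries.
2. Correctness and testability: validated inputs, paired unit tests.
3. Code aesthetics (Clean Code): function length, naming style.

A function may exceed the length guideline to preserve a domain boundary,
but never to compensate for a poor data structure choice.
```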
Another tension exists between speed and correctness. Agents are incentivized to be fast. They will skip tests to deliver code quicker. The `AGENTS.md` file must explicitly penalize this behavior. The rule is simple: "No code without tests. If a function is added, a corresponding unit test must be generated. If the test coverage drops, the commit is rejected."
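The "no code without tests" contract means every generated function arrives paired with at least an assertion-level test in the same change. A minimal sketch of the shape (the `slugify` example is hypothetical):

```python
def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())

def test_slugify():
    # The paired test ships in the same commit as the function.
    assert slugify("Clean Architecture") == "clean-architecture"
    assert slugify("  extra   spaces ") == "extra-spaces"

test_slugify()
```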
This resolution of tension is critical. It prevents the agent from becoming a "yes-man" that generates whatever you ask for, regardless of quality. It becomes a gatekeeper, enforcing standards even when you, as the developer, might be tempted to cut corners.
Implementation: Tools and Workflow
Getting this to work requires the right toolchain. I tested this setup with Claude Code, OpenAI Codex, and Cursor. Each tool handles the `AGENTS.md` file slightly differently, but the core principle remains the same: the file is the source of truth.
In Cursor, you place the `AGENTS.md` file in the root and reference it in the `.cursorrules` file. In Claude Code, you can pass it as a system prompt file. The key is consistency. The agent must read these rules before every interaction. If you are working on a large codebase, you may need to break the rules down into smaller, context-specific files for different modules.
For those looking to operationalize this without starting from scratch, the Freelancer AI Lead Generation Toolkit demonstrates how structured prompts and workflows can turn chaotic tasks into repeatable, high-value outputs. While focused on lead gen, the underlying principle of structuring AI behavior for consistent results is identical to managing code generation.
Here is a simplified example of a rule from the `AGENTS.md` file:
```markdown
## RULE: SOLID Principles
1. Single Responsibility: a class should have only one reason to change.
2. Open/Closed: entities should be open for extension, but closed for modification.

ACTION: Before creating a new class, analyze its responsibilities.
If it handles both data validation and persistence, split it into two classes.
```
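Applied to the exact case the rule names, the split looks like this sketch (class names are illustrative):

```python
class EmailValidator:
    """One responsibility: decide whether an email is acceptable."""
    def is_valid(self, email: str) -> bool:
        return "@" in email and "." in email.split("@")[-1]


class EmailStore:
    """One responsibility: persist emails (in memory for this sketch)."""
    def __init__(self):
        self.saved = []

    def save(self, email: str) -> None:
        self.saved.append(email)
```

Each class now has exactly one reason to change: validation rules evolve independently of the storage backend.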
This level of specificity is what separates a useful agent from a toy. It forces the AI to think before it types. It reduces the need for human review because the code is already aligned with best practices.
The Economic Reality of Agentic AI
Let’s be clear about the economics. AI agents are not free. The cost is not just the token usage; it’s the time spent debugging bad code. When you burn through your budget like Uber, it’s because you’re paying for mistakes. You’re paying for the agent to write a function, you’re paying for it to fail, and you’re paying for a human to fix it.
By implementing `AGENTS.md`, you shift the cost structure. You pay a small upfront cost to define the rules. You pay a moderate ongoing cost for the agent to follow them. But you save a massive amount of downstream cost in debugging, refactoring, and technical debt. The ROI comes from the reduction in error rates and the increase in code quality.
Leaders need to understand this distinction. Agentic AI is not about replacing engineers; it’s about amplifying them. But amplification requires a stable foundation. If the foundation is weak, the amplification just makes the noise louder. The `AGENTS.md` file provides that foundation. It ensures that the agent’s output is predictable, maintainable, and aligned with your architectural goals.
As the industry matures, the teams that win will be the ones that treat AI agents as junior engineers with infinite energy but zero common sense. They will provide the structure, the rules, and the guidance. They will not leave the agent to its own devices. This is the difference between burning your budget and building a competitive advantage.
Where to go from here
The transition to agentic workflows is inevitable. The question is whether you will manage it with discipline or chaos. If you are ready to implement these principles, start by auditing your current AI prompts. Are they vague? Do they lack architectural constraints? Begin drafting your own `AGENTS.md` file, pulling from the thirteen books mentioned here.
If you want to see how a single operator builds and scales these systems, check out Milo Antaeus for insights into running a one-person business at scale using autonomous agents. The same discipline applied to code can be applied to business operations. Structure your prompts, define your rules, and let the AI do the heavy lifting. The future belongs to those who can direct the machine, not just ask it questions.