← Blog

Design Decision Logs When AI Is Your Co-Author

Without explicit context, AI coding assistants start each session with no memory of your architectural decisions. Each suggestion is made without knowing what was tried before.

This means they routinely propose approaches you’ve already rejected. Worse, they propose approaches you rejected for good reasons — reasons that aren’t captured anywhere the model can find them.

The problem compounds with scale. Anthropic’s experimental agent teams feature [1] lets multiple Claude Code instances work in parallel — a lead coordinating teammates, each with its own context window. Teammates share a task list and can message each other, but they don’t inherit the lead’s conversation history. The only persistent context they all share is CLAUDE.md and what’s in the repository. If your architectural decisions aren’t written down in a place every agent reads on startup, five parallel agents will independently re-derive (or contradict) decisions you made last week.

The Ecosystem Is Working on This

This is a recognized problem, and people we respect are building real solutions.

Entire.io, created by Thomas Dohmke (formerly CEO of GitHub), is building what they call AI-native version control — a semantic reasoning layer alongside Git that captures the conversational context behind code changes [2]. Their checkpoints save not just code snapshots but the prompts, AI reasoning, and transcripts that produced them. They’re addressing what they call the “provenance gap” — the disconnect between how code was written and why. Where Git tracks what changed, Entire tracks the intent behind the change.

SageOx, founded by Ajit Banerjee (ex-Hugging Face), Milkana Brace (founder of Jargon, acquired by Remitly), and Ryan Snodgrass (15 years at Amazon), is building agentic context infrastructure — tools that capture team discussions and architectural decisions, structure them into durable history, and prime every new agent session with relevant context [3]. Their insight is that agentic sessions shouldn’t start from zero; they should start from shared memory. When two engineers debate REST vs. GraphQL, that decision and its rationale should be queryable by every future agent session.

Meanwhile, the AI coding tools themselves are converging on simpler forms of persistent context. Anthropic’s CLAUDE.md files [4] give Claude Code project-level memory. Cursor has .cursorrules. Windsurf has rules files. These are lightweight but real — they keep core decisions visible across sessions.
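A minimal sketch of what this looks like in practice — the file contents here are illustrative, not prescriptive, and the DESIGN.md convention is ours rather than anything the tools mandate:

```markdown
# CLAUDE.md

Before making changes, read DESIGN.md. It contains settled
architectural decisions (entries marked SETTLED). Do not propose
alternatives to a settled decision unless you have new evidence;
cite the entry ID (e.g. D-007) when your change touches one.
```

The same two sentences work nearly verbatim in a .cursorrules or Windsurf rules file; the point is that the pointer to the decision log lives somewhere every session reads on startup.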

What We Do in the Meantime

While these purpose-built solutions mature, we’ve found that a low-tech approach works surprisingly well: structured decision logs, adapted from Michael Nygard’s Architecture Decision Records [5] (ADRs).

Nygard proposed ADRs in 2011 as a way to capture architectural decisions for human teams. The format works just as well — maybe better — when the audience includes an AI co-author that joins the project with zero context.

A design decision log records:

  • What was decided — the chosen approach
  • Why — the reasoning and constraints that led to it
  • What was rejected — alternatives considered and why they lost
  • Status — SETTLED (final), OPEN (under discussion), or SUPERSEDED (replaced)

The enforcement rule is simple: do not revisit a settled decision without new evidence. This was always good practice — decisions exist for reasons, and reopening them without new information wastes time. What’s different now is that AI agents read these logs on startup and tend to respect settled decisions much more consistently than when context lived in Slack threads and meeting notes.
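The format is mechanical enough that a few lines of Python can lint a log — counting entries and flagging bad or missing statuses. This is a sketch, assuming entries with a `### D-NNN:` heading and a `Status:` line, as in our logs; adjust the patterns to your own layout:

```python
import re

VALID_STATUSES = {"SETTLED", "OPEN", "SUPERSEDED"}

# Matches entry headers like "### D-007: Push Notifications ..."
ENTRY_RE = re.compile(r"^### (D-\d+): (.+)$", re.MULTILINE)
STATUS_RE = re.compile(r"^Status: (\S+)$", re.MULTILINE)

def lint_decision_log(text):
    """Return (entries, problems) for a decision-log file's text.

    entries maps entry ID -> status string (or None if missing);
    problems lists entries whose status is absent or not recognized.
    """
    entries = {}
    problems = []
    headers = list(ENTRY_RE.finditer(text))
    for i, m in enumerate(headers):
        # Each entry's body runs to the next header (or end of file).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[m.end():end]
        status_m = STATUS_RE.search(body)
        status = status_m.group(1) if status_m else None
        entries[m.group(1)] = status
        if status not in VALID_STATUSES:
            problems.append(f"{m.group(1)}: bad or missing status {status!r}")
    return entries, problems
```

Run as a pre-commit hook, this keeps the log machine-readable, which matters once agents, not just humans, are its primary readers.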

What This Looks Like

In Biff, our team communication tool, we maintain two decision logs: DESIGN.md with 13 runtime decisions and DESIGN-INSTALLER.md with 9 installation decisions. Each entry follows the same format:

### D-007: Push Notifications via Tool Description Mutation
Status: SETTLED
Topic: How to notify the model of state changes

Design: Mutate MCP tool descriptions with current state and
fire tools/list_changed.

Why: MCP has no push notification mechanism.
notifications/message is silently dropped by Claude Code.
Tool descriptions are the only channel the model reads
proactively.

Rejected:
- notifications/message → silently dropped
- Polling via repeated tool calls → wastes context window
- File-based state → model has no trigger to check
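The mechanism D-007 records can be sketched from the server’s side as two pieces: a tool whose description has been rewritten to carry current state, and a tools/list_changed notification telling the client to re-fetch tool descriptions. The notification method name comes from the MCP spec; the tool name and state string below are hypothetical:

```python
def state_notification_messages(tool, state):
    """Build the server->client pieces that implement D-007: a mutated
    tool description carrying current state, plus the JSON-RPC
    notification that prompts the client to re-request tools/list."""
    # Same tool, description rewritten to embed the state the model
    # should see on its next turn.
    updated_tool = {
        "name": tool["name"],
        "description": f"{tool['description']} [STATE: {state}]",
        "inputSchema": tool["inputSchema"],
    }
    # JSON-RPC notification: no "id" field, so no response is expected.
    notification = {
        "jsonrpc": "2.0",
        "method": "notifications/tools/list_changed",
    }
    return updated_tool, notification
```

On receiving the notification, a client like Claude Code re-requests the tool list, and the model reads the updated description the next time it considers its tools — the only channel, per D-007, that it reads proactively.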

When Claude starts a new session on the Biff codebase, it reads DESIGN.md first. It knows D-007 is settled. It won’t suggest notifications/message because the log explicitly records that it was tried and failed.

The Compound Effect

Each decision log entry is small. But across a project, they compound into an institutional memory that survives context window resets, session boundaries, and contributor turnover — human or AI.

The patterns in punt-kit were all discovered through this process. Prior-context priming, stash-and-wrap, sibling PPID binding (see MCP Patterns) — each emerged from trial, failure, and documentation. The log captures not just what works but the graveyard of approaches that didn’t.

Where This Fits

This isn’t our invention — it’s Nygard’s ADRs [5] applied to a new audience. And it’s deliberately low-tech. A DESIGN.md file doesn’t capture conversational context the way Entire.io does [2], and it doesn’t automatically prime agent sessions the way SageOx’s Ox CLI does [3]. What it does is work today, with no dependencies, in any repository.

As purpose-built tools for agentic context mature, we expect the manual decision log to become less necessary — or to become an input that those tools structure and surface automatically. That would be a good outcome. In the meantime, when AI is your co-author, the decision log is the difference between an assistant that helps and one that repeatedly suggests the thing you already know doesn’t work.

References

  1. Anthropic. “Agent Teams.” 2025–present. docs.anthropic.com
  2. Entire.io. “AI-Native Version Control.” 2025–present. entire.io
  3. SageOx. “Agentic Context Infrastructure.” 2025–present. sageox.ai
  4. Anthropic. “Claude Code Memory (CLAUDE.md).” 2024–present. docs.anthropic.com
  5. Nygard, M. “Documenting Architecture Decisions.” Cognitect Blog, 2011. cognitect.com