What if rigour were fast and free?
AI is writing more of the code. How can teams verify it and be accountable for it? The practices that produce trustworthy software — formal specs, structured requirements, verified transformations — were always effective, just too time-intensive and inaccessible. We’re building open source tools to try to solve that.
What we are building
Decades of computer science and software engineering research — structured product discovery, test driven development, formal methods, and behavior-preserving transformations — have always worked. They were just challenging to apply consistently and time-cost effectively in most commercial settings. We believe that AI makes it feasible to learn and consistently apply these practices even under pressure and constraints. We build across four layers.
What we are releasing
All repos ↗chore: post-release v3.0.0 by @claude-puntlabs in https://github.com/punt-labs/ethos/pull/230
Playback timeout scales to audio duration (vox-ddf) — the fixed 30-second timeout was silently truncating any TTS speech over ~450 charac...
Background music generation (/music on|off, vox music on|off) — vibe-driven instrumental music that loops during coding sessions. When mu...
What we are reading
All readings →Prediction: AI Will Make Formal Verification Go Mainstream
Kleppmann argues AI removes the human bottleneck from formal verification — the same thesis driving our work, arrived at independently.
martinfowler.comLLMs Bring a New Nature of Abstraction
Fowler argues LLMs create a new kind of abstraction — probabilistic rather than deterministic — and explores what that means for how we build software.
arXiv preprint (2507.13290)Towards Formal Verification of LLM-Generated Code from Natural Language Prompts
Proposes using formal verification to check LLM-generated code against natural language intent — closing the gap between what you asked for and what you got.
What we are learning
All posts →The MCP Singleton Problem
When every Claude Code session spawns its own MCP server, you get N copies of your embedding model. Here's how we found out, what we tried, and where we landed.
Building L1 Tools for L4 Agents
When agents need a visual interface, the most important design decision is where to draw the boundary between deterministic rendering and agentic composition.
The Verification Gap
As code moves from deterministic to agentic, the gap between what we can specify and what we can verify widens at every level. Five levels of code, five verification paradigms, and the case for formal methods in an agentic world.