l3m: A Verified Coding Agent
Building a Claude-Code-shaped agent whose trusted computing base is the Lean kernel.
I spent three days building a coding agent in Lean 4. Not a toy — a real agent with file tools, git, web search, sub-agents, the works. The difference from every other agent framework: the Lean kernel proves the sandbox holds. 96 kernel-verified theorems about the agent’s runtime behavior. Every IO operation enumerated in a proof, not just documented.
The repo is l3m (Lean LLM). It builds on Lean Manifests, the machinery I wrote about in The Lean Project.
This is early work. Feels promising but not production. I’m writing it up because the ideas are interesting regardless of whether this particular codebase survives.
The skeleton-first workflow
The key move was not writing the agent. It was writing the types first.
Phase 0: vocabulary. Define Tool, SafePath, TameSpec, Transcript, WebBudget. No implementations. Just the shapes.
Phase 1: signatures. Write every manifest claim as a type signature with sorry bodies. Now you have a build that compiles, a complete list of what you intend to prove, and — critically — parallel agents can fill in different modules without stepping on each other.
I had one Kiro instance writing tier-1 tools (file read, file edit, grep — pure Lean, no subprocesses) while another wrote tier-3 tools (lake build, git — subprocess wrappers with heuristic parsing). They never collided because the types were already agreed. The manifest was the interface contract.
This is the same factoring trick from working with LLMs: make each task fit in one context window. But here the factoring is enforced by the type checker. If your tool doesn’t match the manifest signature, it doesn’t compile.
SafePath: confinement as a type
The single most important design decision:
structure SafePath where
private mk ::
path : System.FilePath
root : System.FilePath
private mk. No code anywhere can construct a SafePath except by going through fromConfineIO, which resolves symlinks via realpath, checks for .. traversal, and verifies the path is under the workspace root. If realpath fails — missing binary, broken symlink, whatever — the answer is none. Deny on failure, not degrade.
The workspace root itself is canonicalized at startup via realpath. This closes the attack where a symlinked workspace root causes all “confined” paths to resolve outside the intended directory. We found this during the adversarial pass — it’s the kind of thing that looks fine until you think about it from the attacker’s side.
Tools don’t “call confine before doing IO” — they hold a SafePath, and without one they can’t call readFile or writeFile at all. The alternative — “every tool should call confine” — is a code review convention. Conventions get violated. Types don’t.
The typed-axiom discipline
This is the most interesting realization from today’s work.
We had 27 axioms. They looked like this:
UnprovenConjecture single_writer :
∀ (path : System.FilePath), True
Every one of them was typed True. I thought I was being rigorous — naming my assumptions, putting them in the manifest, tracking them in the trust report. But True has a trivial proof. Any implementation satisfies it. I could format the hard drive and this “axiom” still holds. The name implies safety; the type guarantees nothing.
This is security theater inside a verification system. Schneier would recognize it immediately: something that looks like security, makes you feel secure, and does absolutely nothing.
The fix has two parts:
Part 1: Give axioms real types. Introduce abstract event types — FsEvent, WriteEvent, NetworkRequest — that represent the logical records of IO operations. Now axioms become properties of these types:
UnprovenConjecture single_writer :
∀ (event : FsEvent), ∃ (wsRoot : String),
event.modifier = .l3mInstance wsRoot
This is falsifiable. If you can construct an FsEvent with .external modifier, you’ve violated it. It’s composable — the ClosureTheorem uses it as a premise. It’s load-bearing.
Part 2: Delete what you can’t type. 17 of the 27 were things like “ext4 doesn’t corrupt data” or “git’s object store is append-only.” You can’t express these as Lean propositions in any meaningful way. So we deleted the axioms and put ASSUMPTION doc-comments near the relevant code. Honest about their status: verified by external means (code review, testing, OS documentation), not by the kernel.
The discipline is now: if a claim has real propositional content and feeds a downstream DerivedConjecture, it’s a typed UnprovenConjecture. If it’s an environmental observation outside Lean’s universe, it’s a documented comment. There is no middle ground where a True-typed axiom pretends to be verification.
Final count: 10 typed axioms, 0 True-typed, each used by at least one derivation chain. The trust report went from “27 assumptions (don’t worry, we named them)” to “10 assumptions (here’s exactly what each one claims, and here’s the theorem that breaks if it’s wrong).”
The closure theorem
Multi-instance safety — the guarantee that two l3m agents in separate workspaces can’t interfere — used to be a hand-wave: “each agent runs in its own worktree, so they don’t collide.” Today it’s a theorem.
The entire argument rests on one axiom:
UnprovenConjecture SingleSystemAxiom :
∀ (event : FsEvent), ∃ (wsRoot : String),
event.modifier = .l3mInstance wsRoot
Only l3m instances modify workspace files. This is a closed-system assumption — we define the boundary and exclude everything else. If a user opens the workspace in an editor while l3m runs, the axiom is violated and the guarantee doesn’t hold. The assumption is honest now.
From there, the proof builds on 5 ProvenTheorems about MemFS (our pure filesystem model) — writes to disjoint keys commute, non-overlapping roots produce non-overlapping paths — plus 7 ProvenTheorems about the workspace decision logic (descendant detection, ancestor detection, nonce uniqueness, lock verification).
The conclusion:
DerivedConjecture ClosureTheorem :
∀ (a b : Workspace) ...,
nonOverlappingWorkspaces a b → ...
∀ (k : String),
((fs.set p1 c1).set p2 c2).get k =
((fs.set p2 c2).set p1 c1).get k
Non-overlapping workspaces produce the same filesystem state regardless of write ordering. The old vague single_writer is gone; in its place, a clean architectural choice (SingleSystemAxiom) with a provable consequence (ClosureTheorem).
Workspace isolation
The workspace claim logic itself got serious attention. The agent refuses to work in the main repo — it walks to $HOME and uses find(1) for descendant scanning. Why find instead of a depth-bounded recursive walker? Because find -name .l3m.lock is one clean OS assumption (“find enumerates entries”) rather than a conjunction of “our walker is correct AND the depth bound is sufficient AND gitignored paths are handled correctly.”
Race safety uses claim-first semantics: write the lock, then verify no ancestor/descendant conflicts exist. If a conflict is detected after claiming, release and fail. The 128-bit nonce space has cardinality 2^128 (proven — yes, we proved that). Seven ProvenTheorems cover the pure decision logic, cleanly separated from the IO observation layer.
The adversarial pass
Today I applied five lenses to every load-bearing axiom:
- Generous vs. adversarial reading. Does the axiom still hold if you read it like a hostile auditor?
- Granularity. Is the axiom about one thing or secretly a conjunction?
- Composition. Does it survive sequential operations?
- TOCTOU. Is there a window between check and use?
- Pluralization. What happens with N instances?
Found 6 real holes. Fixed 4:
confineIOused to silently degrade whenrealpathwas missing. Now it denies.- The nonce generator fell back to
IO.monoNanosNowwhen/dev/urandomwas unavailable. Now it refuses to start. A predictable nonce is worse than no nonce. - Symlinked workspace root — canonicalized at startup.
- No git version check — now refuses git < 2.15 (worktree-aware gc).
Five holes need larger redesign and are documented: TOCTOU in confineIO (needs openat2(RESOLVE_BENEATH)), external process modification (needs OS-level MAC), lakefile TOCTOU, missing SafeWrite combinator, missing CI lint for IO boundary claims.
I’m not going to pretend those five are fine. They’re real gaps. But they’re named gaps, which is better than unnamed ones.
URL trust
A real exploit we closed today: the LLM could create fake fetch URL files via file_create. Previously, URLs lived in .l3m/web/fetch_*.url files on disk. The LLM has a file creation tool. You see the problem.
URLs now live in an IO.Ref — in-memory state that no tool writes to. The LLM sees fetch_id tokens, not raw URLs. The claim “URLs are never LLM-controlled” is structurally true: there’s no tool that writes to the IO.Ref, and file_create can’t affect memory.
The compression budget
Web queries are the hardest channel to secure. You can encode anything in a search query. The insight: secrets don’t compress. An API key is high-entropy. If you budget compressed bytes per session, high-entropy payloads burn through fast while normal queries compress cheaply against the seed dictionary.
The compressor is pure Lean (LZ77). We don’t trust it — we verify every compression by decompressing and comparing:
ProvenTheorem compress_decompress_roundtrip :
∀ (seed : SeedDict) (s : String),
decompress seed (compress seed s) = some s
Kernel-verified by structural induction on the token stream. If the compressor has a bug that makes secrets look small, the roundtrip catches it. If the roundtrip has a bug, the kernel says it can’t.
Call graph reachability
The alignment move I’m most excited about. A metaprogram walks the Lean environment starting from runLoop — the agent’s main loop — and collects every constant transitively reachable (2744 of them). For each one, it checks: does the type mention IO? Is it unsafe? Is it @[extern]?
Then it emits native_decide proofs:
theorem io_subset_allowed :
ioSet.all (allowed.contains ·) = true := by native_decide
If anyone adds a new IO call anywhere in the transitive closure, the proof fails. The LLM has to ask for permission to grow the IO surface. This is capability-based security enforced by the type checker.
Final tally
| Category | Count |
|---|---|
| ProvenTheorems (kernel-verified) | 96 |
| TestedConjectures (native_decide) | 17 |
| DerivedConjectures (proven modulo named axioms) | 8 |
| Typed UnprovenConjectures | 10 |
| True-typed axioms | 0 |
The trust report names 4 standard Lean axioms plus 10 typed environmental assumptions. A security auditor reads 10 propositions, decides whether they hold in the deployment environment, and is done.
What I learned
True is not an axiom. An axiom typed True is documentation in axiom clothing. It participates in the bookkeeping without constraining anything. The moment we deleted all 27 and replaced them with 10 real propositions, the trust report became actually useful.
Adversarial reading finds real bugs. The five-lens pass wasn’t academic — it found a silent degradation in confineIO, a predictable-nonce fallback, and a symlink escape. None of these would have surfaced from normal testing.
One narrow axiom beats a vague conjunction. SingleSystemAxiom says one thing precisely. The old single_writer said something vaguely. The precise version feeds a real proof; the vague version just sat there looking responsible.
Opaque types are underrated. private mk on SafePath does more security work than a hundred lines of runtime checks.
The kernel is the trust anchor. Everything else — the LLM, the tools, the runtime, my own judgment — is fallible. The kernel isn’t. Building on that foundation means the safety properties survive even if the agent is adversarial. That’s the whole point.
Code: github.com/dean-lean/l3m (placeholder — will be public when it’s less embarrassing)
Prior work: Lean Manifests
Built with: Lean 4, Kiro, three days + one afternoon of adversarial paranoia.
Postscript (May 2026). Capability bounding is not the same as secret-safety. A few weeks after writing the above, an early agent session in --wild mode read ~/.zshrc, transcribed an API key into a documentation file, and committed it. GitHub’s scanner caught it; the key was rotated. The full story is in When your safe agent helpfully leaks your API key. The kernel theorems were not violated — the agent had legitimate shell access. Each step looked fine while the composition exfiltrated. The threat model now names this pattern explicitly, and we’re adding env-var classification, output redaction, and a commit-time secret scan as layered defenses. None are airtight; rotation when an incident is detected remains the real prevention.