What Vibe Coding Became | Agentic Architecture

TL;DR. The AI-to-human code ratio just went from 80/20 to nearly 100/0 at every team that matters. Google ships 75% AI-authored code. CrewAI ships 42%. Claude Code is at $2.5B run-rate twelve months after launch. The founder's job is no longer writing code. It is authoring the constraints that the code gets written under. Done well, that is the highest-leverage work a founder will do this decade.

Where we are, in numbers

A few load-bearing data points from the last two weeks of conferences:

Google: over 75% of code AI-authored (Paige Bailey, AI Dev SF, April 28).
CrewAI: 42% of code AI-authored last week (Joe Moura, AI Agent Conf NYC, May 4).
Claude Code: $2.5B run-rate February 2026. Twelve months from launch.
FastRender (Wilson Lin, Jan 2026): ~1M lines of Rust, ~30,000 commits, ~2,000 concurrent agents at peak.

The cost of a marginal line of code dropped an order of magnitude in 2025 and is dropping again in 2026. The bottleneck has moved.

What "vibe coding" was, and what it became

Andrej Karpathy coined the term in February 2025. The original was specifically about a throwaway weekend mode: forget the code exists, copy-paste error messages back, accept-all-diffs. Not a methodology. A holiday from methodology.

By April 2026, Karpathy himself had moved on. At Sequoia AI Ascent 2026 he framed three paradigms: Software 1.0 (explicit code), Software 2.0 (trained networks), Software 3.0 (prompting LLM interpreters). His take on the original term: vibe coding was the warmup. Agentic engineering is the actual job. The replacement is more disciplined, more spec-driven, more reviewable.

The disillusionment piece in between is Alberto Fortin's May 2025 post: months of LLM-led coding produced what he described as ten junior-mid developers locked in a room without seeing the other nine. Inconsistent naming. Duplicated logic. Architectural drift. He went back to pen-and-paper planning. Steve Yegge made the underlying point earlier in his Sourcegraph piece Cheating is All You Need: the worry that you cannot trust LLM-generated code misses the point, because we have always built the discipline of software engineering around code we cannot trust. Reviewers, linters, types, tests, CI: those are the answer, not "do not use LLMs."

Vibe coding works perfectly for projects you do not have to maintain. It fails predictably for systems that have to live.

What founders need is a third thing: a discipline that takes the velocity and pays the small upfront tax that makes the output maintainable. Call it intent-led coding. The shift is small in words and large in consequence.

Stop writing code. Start writing the rules under which code gets written.

Authoring constraints, not authoring code

A real example from this repo. The file at the root is AGENTS.md (Cursor reads it, Claude Code reads it, Codex reads it). Full content:

# This is NOT the Next.js you know

This version has breaking changes. APIs, conventions, and file
structure may all differ from your training data. Read the relevant
guide in `node_modules/next/dist/docs/` before writing any code.
Heed deprecation notices.

Thirty-seven words. It does not describe how to write Next.js. It does not list APIs. It changes the agent's behavior: instead of relying on training-data memory of a 2024 Next.js, the agent reads the version-pinned docs in node_modules first. The diff before and after this file existed was the difference between "looks correct, doesn't compile" and "compiles on the first try."

That is what authoring constraints means. A small, well-placed sentence that describes the world correctly saves a hundred wrong diffs.

The six categories of constraint worth authoring

In rough order of leverage:

Version-pinned reality. "This is not the X you know." Every framework with a breaking release between the model's training cutoff and today belongs here. Next.js 15 vs 16. React 19. Tailwind 4. The single highest-leverage sentence I have written in my career.
Architectural invariants. "All API routes use Edge runtime unless they need nodemailer." "Components in components/site/ are server components unless explicitly marked." Statements of should, not how. The agent picks the right pattern because the wrong pattern violates a rule.
Deprecation map. "We migrated off react-query to TanStack Query in March; old patterns will not work." Agents pull old patterns out of training data. The deprecation map is how you tell them which patterns are dead.
Taste rules. "Default to no comments." "Don't add error handling for scenarios that can't happen." "Three similar lines beat a premature abstraction." Bias the agent toward your codebase's aesthetic.
Hooks. Memory and preferences don't enforce. Hooks do. A pre-commit hook that runs the type checker and rejects the diff is the difference between "the agent claimed it worked" and "the agent's diff actually compiles."
Skills. Reusable prompt-and-script bundles for repeat tasks. A /release skill that bumps the version, regenerates the changelog, tags the commit. Anthropic ships PDF/DOCX/XLSX creation in Claude entirely as skills. Authored once, called by name forever.

There is no seventh thing.

What changes for founders specifically

You spend more time writing English than code. AGENTS.md, architectural-invariant docs, user stories, taste rules. This is not less work. It is differently shaped work, closer to product specification than engineering.

You spend less time picking the right library. The model has read more open-source code than you have. Right library is mostly solved.

You spend more time reviewing and less time typing. The bottleneck on agentic AI is the defect rate (Marc Brooker, AWS, AI Dev SF). The way to drive defects down is reading the diff. Aggressively.

Your moat shifts from speed to taste. When any founder can have a working prototype by Friday, the prototypes that win are the ones that feel right. Strong opinions in the UI. One-tenth the surface area but the right tenth. Error messages that read like a person wrote them. Taste is the constraint that doesn't fit in .cursorrules. It comes from you.

Failures to plan for

Things I have hit in the last six months:

Test-suite rot. The agent disabled a failing test rather than fix the bug. Twice. Fix: hook that rejects diffs touching tests unless the commit message explains why.
Looks-correct-but-isn't. A migration that ran clean, type-checked, deployed, and silently truncated a bigint to int because the agent inferred the wrong target type. Caught in shadow mode.
The 40K-token wall. Past about 40K tokens, even frontier models drift (Jeff Huber, Chroma research). The cure is not a bigger context window. The cure is putting less noise in the context.
Cognitive debt. After three weeks of accept-all-diffs, I no longer fully understood code with my name on it. Fix: have the agent interview you before merging. If you can't answer, you don't ship.

The honest tradeoff

Vendor lock-in. $200/month per agent per IDE is real cost. Cursor Composer prompts, Claude Code skills, project memory files don't transfer cleanly across vendors.
Harness regression. Anthropic publicly acknowledged in April 2026 that two months of degraded Claude Code quality were a harness bug. External verification (tests, types, preview deploys) is not optional.
Estimation chaos. Same feature might take fifteen minutes or four hours. Sprint planning gets weird. Best-handled teams I've seen run small WIP queues and stop estimating.

The tradeoff is real. Whether you take it is a function of velocity vs. predictability. Most founders, most of the time, should optimize for velocity.

The takeaway

The agents will write the code either way. Your job is to write the constraints they read first. Done well, that is the highest-leverage work a founder will do this decade.