You've been there. The app works perfectly on localhost:3000. The demo is smooth. You push to production — and within 48 hours, users are hitting unhandled errors, authentication is broken on mobile, and you realize there's no way to roll back.

The code was fine. The product wasn't ready.

This is the gap between "it works" and "it ships." And it shows up predictably in the same places, every time, across every vibe-coded project I've seen.

Why AI misses the same ingredients

AI coding assistants are exceptional at one thing: writing functional code for the problem you described. If you say "build me a login form," you'll get a login form. Fast. Clean. It'll probably even look good.

What you won't get — unless you asked for it specifically — is rate limiting, CSRF protection, token refresh, error boundaries, tests, or a rollback strategy.

These aren't bugs. The AI didn't fail. It just optimized for what you asked for — which was a login form, not a production-grade auth system.

The fix isn't prompting harder. It's having a checklist that forces you to ask the right questions before you ship.

The 12 ingredients

These 12 categories come from the Golden Code methodology's completeness audit — the same audit that midas_completeness runs automatically on your project. Think of them as the 12 areas where production apps either hold together or fall apart.

  1. Frontend: UI/UX, responsive design, accessibility, loading states, client-side validation
  2. Backend: API routes, business logic, request/response lifecycle, rate limiting
  3. Database: schema design, migrations, indexes, connection pooling, seeding
  4. Auth: identity, sessions/tokens, RBAC, secure cookies, token refresh
  5. API Integrations (often missed): third-party services, webhooks, retry logic, credential management
  6. State Management (often missed): optimistic UI, cache invalidation, sync vs async, stale data handling
  7. Design System (often missed): token consistency, component library, typography scale, dark mode
  8. Testing (often missed): unit, integration, E2E, coverage gates, contract testing for APIs
  9. Security (often missed): input validation, CSRF, CORS, secrets management, dependency audits
  10. Error Handling (often missed): boundaries, fallbacks, logging, user-facing messages, alerting
  11. Version Control: Git hygiene, branching strategy, commit conventions, .gitignore
  12. Deployment (often missed): CI/CD pipeline, env config, health checks, rollback strategy

The first four — Frontend, Backend, Database, Auth — are what you asked for. The AI built them. They work.

The last eight are what separates a working demo from a production system. They're the ones that come back to bite you at 2 AM.

The four groups

Functional (01–04): What you asked for

These are the core pillars of any app. Most AI-generated code covers them well because they're explicit in the prompt. "Build me a user dashboard" generates Frontend. "Add Stripe payments" generates Backend + API Integrations (sort of).

The trap here isn't that they're missing — it's that they're incomplete. Auth is wired up, but token refresh isn't. The database schema exists, but there are no indexes on the columns you're querying in production.

Integrated (05–07): The connective tissue

API Integrations, State Management, and Design System are the parts that hold the app together under real-world conditions. They're easy to forget because they don't produce visible features — they produce reliability.

API Integrations without retry logic and webhook verification will silently drop data. State Management without cache invalidation will show users stale data with no hint why. Design System without token consistency produces a UI that looks fine in screenshots but breaks in production at 1440px on a dim display.
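The retry logic mentioned above can be sketched in a few lines. This is an illustrative helper, not anything midas_completeness generates — the function name and defaults are our own:

```typescript
// Minimal retry helper with exponential backoff and jitter.
// Illustrative sketch — names and defaults are ours, not from any tool.
async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseDelayMs = 200 }: { attempts?: number; baseDelayMs?: number } = {}
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Exponential backoff: 200ms, 400ms, 800ms… plus jitter to avoid thundering herds.
        const delay = baseDelayMs * 2 ** i + Math.random() * baseDelayMs;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Wrap every third-party call in something like this and a flaky webhook endpoint becomes a logged retry instead of silently dropped data.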

Protected (08–10): The safety net

Testing, Security, and Error Handling are the ingredients most likely to be entirely absent in vibe-coded projects — and the ones most likely to cause catastrophic failures.

The hard truth: A vibe-coded app with no tests isn't "untested." It has exactly one test — production. And your users are running it.

Security is the one that costs the most to miss. Hardcoded secrets in git history. No CSRF protection on state-changing endpoints. User-controlled inputs passed directly to database queries. These aren't hypothetical — they're patterns I see in AI-generated code constantly, because the AI wasn't asked to think adversarially.
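Catching hardcoded secrets before they hit git history doesn't require heavy tooling. A minimal sketch of a pattern scan — these three patterns are examples only, far from exhaustive:

```typescript
// Illustrative secret scan. The patterns below are examples, not a complete list.
const SECRET_PATTERNS: RegExp[] = [
  /sk_live_[A-Za-z0-9]+/,                  // Stripe live secret key prefix
  /AKIA[0-9A-Z]{16}/,                      // AWS access key ID format
  /-----BEGIN (RSA )?PRIVATE KEY-----/,    // inlined private key
];

function findSecrets(source: string): string[] {
  return SECRET_PATTERNS
    .filter((pattern) => pattern.test(source))
    .map((pattern) => pattern.source);
}
```

Run something like this in a pre-commit hook (or use a dedicated scanner) and the "secrets in git history" failure mode disappears.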

Error Handling is what determines whether a bug is a minor inconvenience or a cascading failure. An uncaught promise rejection in a Next.js API route takes down the entire route. A missing error boundary in React renders a blank screen. A missing alert means you find out about the outage from a user tweet.
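The "uncaught rejection takes down the route" failure has a cheap structural fix: wrap every handler so errors become logged 500s instead of crashes. A simplified sketch — the request/response shapes here are placeholders, not real Next.js types:

```typescript
// Generic async-handler wrapper: a thrown error becomes a clean 500
// plus a log entry, instead of an unhandled rejection.
// Sketch only — request/response shapes are simplified placeholders.
type HandlerResult = { status: number; body: unknown };

function withErrorHandling(
  handler: (req: unknown) => Promise<HandlerResult>,
  log: (err: unknown) => void = console.error
): (req: unknown) => Promise<HandlerResult> {
  return async (req) => {
    try {
      return await handler(req);
    } catch (err) {
      log(err); // hook your alerting here — the part vibe-coded apps skip
      return { status: 500, body: { error: "Something went wrong" } };
    }
  };
}
```

The user sees a useful message, you see the stack trace, and the outage page replaces the user tweet.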

Production (11–12): The launch gate

Version Control and Deployment are the ingredients that determine whether you can actually ship — and whether you can recover when something goes wrong.

Most AI-generated projects have a git repo (because you created one). Most don't have a CI/CD pipeline, health checks, or a rollback strategy. The deployment works once. Then you make a change and discover you have no way to verify it before it hits prod.
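A health check, at minimum, is one endpoint that aggregates dependency probes into a single status code. A hypothetical sketch — the probe names and response shape are ours:

```typescript
// Minimal health-check aggregator. Probe names and shape are illustrative.
// 200 tells the load balancer to keep routing; 503 should trigger alerting/rollback.
function healthCheck(
  probes: Record<string, () => boolean>
): { status: number; body: Record<string, string> } {
  const body: Record<string, string> = {};
  let healthy = true;
  for (const [name, probe] of Object.entries(probes)) {
    let up = false;
    try { up = probe(); } catch { up = false; }
    body[name] = up ? "up" : "down";
    healthy = healthy && up;
  }
  return { status: healthy ? 200 : 503, body };
}
```

Point your CI pipeline's post-deploy step at this endpoint and "the deployment works once" becomes "the deployment is verified every time."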

Where vibe-coded apps typically score

A typical vibe-coded app scores 4–6 out of 12 on this audit.

Most projects nail 01–04 (Functional) and 11 (Version Control). Everything else is partial or missing. A score of 10+ is the threshold for confident production deployment.

This isn't a knock on AI coding. It's a structural problem: you got great output for what you asked for. You just didn't know to ask for the other eight things.

The fix is a forcing function — something that checks all 12 before you ship, every time, automatically.

How to audit your project automatically

The midas_completeness MCP tool runs a completeness audit against all 12 ingredients. It doesn't just ask "does auth exist?" — it checks for token refresh, RBAC, session expiry handling. It doesn't just ask "do you have tests?" — it checks coverage, test type distribution, and whether you have E2E coverage for critical paths.

```shell
# Run in Cursor (via MCP) or CLI
npx merlyn-mcp

# Then ask Claude:
# "Run midas_completeness and tell me what's missing"
```

The output is a scored breakdown of all 12 ingredients, with specific gaps called out and suggested next steps. Run it before you ship. Run it every time you add a major feature. Use it as your definition of "done."

The practical checklist

Before any production deploy, walk these 12:

  1. Frontend: Does it work on mobile? Are loading/error states handled?
  2. Backend: Are all endpoints rate-limited? Is input validated server-side?
  3. Database: Are migrations versioned? Are production indexes in place?
  4. Auth: Does token refresh work? Are session cookies httpOnly + secure?
  5. API Integrations: Are webhooks verified? Is retry logic in place?
  6. State Management: Does stale data get invalidated on mutations?
  7. Design System: Are spacing, color, and typography consistent?
  8. Testing: Do critical paths have test coverage? Do tests run in CI?
  9. Security: Any hardcoded secrets? Is npm audit clean of criticals?
  10. Error Handling: Are all async errors caught? Do users see useful messages?
  11. Version Control: Is .env in .gitignore? Are commits clean?
  12. Deployment: Is there a CI pipeline? Is there a rollback path?
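Some of these items come down to a handful of flags. Item 4's secure session cookie, for instance, can be sketched as a header builder — the attribute choices (HttpOnly, Secure, SameSite=Lax) are common defaults, not a universal prescription:

```typescript
// Builds a hardened Set-Cookie header value for a session token.
// Flag choices are typical defaults — adjust SameSite per your app's flows.
function sessionCookie(name: string, value: string, maxAgeSeconds: number): string {
  return [
    `${name}=${encodeURIComponent(value)}`,
    `Max-Age=${maxAgeSeconds}`,
    "Path=/",
    "HttpOnly",     // not readable from client-side JS (blunts XSS token theft)
    "Secure",       // only sent over HTTPS
    "SameSite=Lax", // limits cross-site sends (CSRF mitigation)
  ].join("; ");
}
```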

If you can't confidently answer yes to every item, you're not ready to ship. That's not a judgment — it's a definition.


The goal isn't perfection. A score of 10/12 with known, documented gaps is a production-ready system. A score of 4/12 with unknown gaps is a liability.

Know your gaps. Ship with eyes open.

Audit your project now

Run midas_completeness in Cursor and get a scored breakdown of all 12 ingredients. No signup required.

npx merlyn-mcp

Then read: Golden Code: The Methodology That Turns Vibe-Coding into Production Software →