You've been there. The app works perfectly on localhost:3000. The demo is smooth. You push to production — and within 48 hours, users are hitting unhandled errors, authentication is broken on mobile, and you realize there's no way to roll back.
The code was fine. The product wasn't ready.
This is the gap between "it works" and "it ships." And it shows up predictably in the same places, every time, across every vibe-coded project I've seen.
Why AI misses the same ingredients
AI coding assistants are exceptional at one thing: writing functional code for the problem you described. If you say "build me a login form," you'll get a login form. Fast. Clean. It'll probably even look good.
What you won't get — unless you asked for it specifically — is:
- Rate limiting on the auth endpoint
- Token refresh handling when the session expires mid-flow
- Meaningful error messages that don't expose your stack trace
- A rollback path if the deployment breaks prod
These aren't bugs. The AI didn't fail. It just optimized for what you asked for — which was a login form, not a production-grade auth system.
The fix isn't prompting harder. It's having a checklist that forces you to ask the right questions before you ship.
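To make the gap concrete, here's roughly what one of those unasked-for ingredients looks like. This is a hypothetical sketch of a fixed-window, in-memory rate limiter — fine for a single process, though a real deployment would usually back it with a shared store like Redis:

```typescript
// Minimal fixed-window rate limiter (sketch, single-process only).
type Window = { count: number; resetAt: number };

const windows = new Map<string, Window>();

function allowRequest(
  key: string,       // e.g. client IP or user id
  limit = 5,         // max requests per window
  windowMs = 60_000, // window length in ms
  now = Date.now(),
): boolean {
  const w = windows.get(key);
  if (!w || now >= w.resetAt) {
    // First request in a fresh window: start counting
    windows.set(key, { count: 1, resetAt: now + windowMs });
    return true;
  }
  if (w.count >= limit) return false; // over budget: reject
  w.count += 1;
  return true;
}
```

A dozen lines, but it only exists if someone thought to ask for it — that's the point of the checklist.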
The 12 ingredients
These 12 categories come from the Golden Code methodology's completeness audit — the same audit that midas_completeness runs automatically on your project. Think of them as the 12 areas where production apps either hold together or fall apart.
- 01. Frontend: UI/UX, responsive design, accessibility, loading states, client-side validation
- 02. Backend: API routes, business logic, request/response lifecycle, rate limiting
- 03. Database: Schema design, migrations, indexes, connection pooling, seeding
- 04. Auth: Identity, sessions/tokens, RBAC, secure cookies, token refresh
- 05. API Integrations (often missed): Third-party services, webhooks, retry logic, credential management
- 06. State Management (often missed): Optimistic UI, cache invalidation, sync vs async, stale data handling
- 07. Design System (often missed): Token consistency, component library, typography scale, dark mode
- 08. Testing (often missed): Unit, integration, E2E, coverage gates, contract testing for APIs
- 09. Security (often missed): Input validation, CSRF, CORS, secrets management, dependency audits
- 10. Error Handling (often missed): Boundaries, fallbacks, logging, user-facing messages, alerting
- 11. Version Control: Git hygiene, branching strategy, commit conventions, .gitignore
- 12. Deployment (often missed): CI/CD pipeline, env config, health checks, rollback strategy
The first four — Frontend, Backend, Database, Auth — are what you asked for. The AI built them. They work.
The last eight are what separates a working demo from a production system. They're the ones that come back to bite you at 2 AM.
The four groups
Functional (01–04): What you asked for
These are the core pillars of any app. Most AI-generated code covers them well because they're explicit in the prompt. "Build me a user dashboard" generates Frontend. "Add Stripe payments" generates Backend + API Integrations (sort of).
The trap here isn't that they're missing — it's that they're incomplete. Auth is wired up, but token refresh isn't. The database schema exists, but there are no indexes on the columns you're querying in production.
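Token refresh is a good example of that incompleteness. A hedged sketch of the missing piece — proactively refreshing before expiry instead of waiting for a 401 mid-request — might look like this, where `refreshFn` stands in for whatever your auth provider actually exposes:

```typescript
// Sketch: refresh an access token before it expires.
type Tokens = { accessToken: string; expiresAt: number };

const REFRESH_BUFFER_MS = 30_000; // refresh 30s before expiry

async function getFreshToken(
  current: Tokens,
  refreshFn: () => Promise<Tokens>, // hypothetical provider call
  now = Date.now(),
): Promise<Tokens> {
  if (now < current.expiresAt - REFRESH_BUFFER_MS) {
    return current; // still comfortably valid: no network round-trip
  }
  return refreshFn(); // expired or about to expire: refresh first
}
```

Without something like this, the first user whose session crosses the expiry boundary mid-flow gets silently logged out.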
Integrated (05–07): The connective tissue
API Integrations, State Management, and Design System are the parts that hold the app together under real-world conditions. They're easy to forget because they don't produce visible features — they produce reliability.
API Integrations without retry logic and webhook verification will silently drop data. State Management without cache invalidation will show users stale data, and they won't know why. A Design System without token consistency produces a UI that looks fine in screenshots but breaks in production at 1440px on a dim display.
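The retry-logic gap, for instance, can be closed with a small wrapper. This is a sketch of exponential backoff around any flaky call; a real integration would also cap total elapsed time and respect `Retry-After` headers:

```typescript
// Sketch: retry a flaky async call with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 250,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Back off 250ms, 500ms, 1000ms, ... between attempts
      if (i < attempts - 1) await sleep(baseDelayMs * 2 ** i);
    }
  }
  throw lastError; // all attempts exhausted: surface the last failure
}
```

Usage is just `await withRetry(() => fetch(url).then(r => r.json()))` — the point is that every third-party call goes through it, not that the wrapper is clever.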
Protected (08–10): The safety net
Testing, Security, and Error Handling are the ingredients most likely to be entirely absent in vibe-coded projects — and the ones most likely to cause catastrophic failures.
The hard truth: A vibe-coded app with no tests isn't "untested." It has exactly one test — production. And your users are running it.
Security is the one that costs the most to miss. Hardcoded secrets in git history. No CSRF protection on state-changing endpoints. User-controlled inputs passed directly to database queries. These aren't hypothetical — they're patterns I see in AI-generated code constantly, because the AI wasn't asked to think adversarially.
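Validating at the boundary is the cheapest defense against that last pattern. A hand-rolled sketch is below — in practice a schema library like zod does this declaratively, but the principle is the same: nothing untrusted reaches business logic or a query unparsed. The `SignupInput` shape here is hypothetical:

```typescript
// Sketch: parse untrusted input at the API boundary.
type SignupInput = { email: string; age: number };

function parseSignup(raw: unknown): SignupInput {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("Invalid payload");
  }
  const { email, age } = raw as Record<string, unknown>;
  if (typeof email !== "string" || !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email)) {
    throw new Error("Invalid email");
  }
  if (typeof age !== "number" || !Number.isInteger(age) || age < 0) {
    throw new Error("Invalid age");
  }
  // Only validated, explicitly-listed fields pass through
  return { email, age };
}
```

Pair this with parameterized queries on the database side and the "user input straight into a query" class of bug disappears.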
Error Handling is what determines whether a bug is a minor inconvenience or a cascading failure. An uncaught promise rejection in a Next.js API route takes down the entire route. A missing error boundary in React renders a blank screen. A missing alert means you find out about the outage from a user tweet.
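The uncaught-rejection failure mode has a cheap structural fix: wrap every handler so rejections become a controlled response. This is a framework-agnostic sketch — the request/response shapes are simplified stand-ins, not a real framework's API:

```typescript
// Sketch: turn unhandled rejections into a clean 500.
type Handler = (req: unknown) => Promise<{ status: number; body: unknown }>;

function withErrorHandling(
  handler: Handler,
  log: (e: unknown) => void = console.error,
): Handler {
  return async (req) => {
    try {
      return await handler(req);
    } catch (err) {
      log(err); // logging/alerting hook: this is where you find out, not Twitter
      // Generic message: never leak the stack trace to users
      return { status: 500, body: { error: "Something went wrong" } };
    }
  };
}
```

The design choice is that the wrapper is applied everywhere by convention, so a forgotten try/catch in one route can't take the whole route down.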
Production (11–12): The launch gate
Version Control and Deployment are the ingredients that determine whether you can actually ship — and whether you can recover when something goes wrong.
Most AI-generated projects have a git repo (because you created one). Most don't have a CI/CD pipeline, health checks, or a rollback strategy. The deployment works once. Then you make a change and discover you have no way to verify it before it hits prod.
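A health check is the smallest of those missing pieces and worth sketching. Something like this, polled by your load balancer or CI before traffic shifts, tells you a deploy is bad before your users do — `checkDb` here is a hypothetical stand-in for a real dependency probe:

```typescript
// Sketch: aggregate dependency probes into one health verdict.
type Health = { status: "ok" | "degraded"; checks: Record<string, boolean> };

async function healthCheck(
  checks: Record<string, () => Promise<boolean>>,
): Promise<Health> {
  const results: Record<string, boolean> = {};
  for (const [name, probe] of Object.entries(checks)) {
    try {
      results[name] = await probe();
    } catch {
      results[name] = false; // a throwing probe counts as unhealthy
    }
  }
  const healthy = Object.values(results).every(Boolean);
  return { status: healthy ? "ok" : "degraded", checks: results };
}
```

Expose it at something like `/api/health` and wire your rollback to it: if the endpoint degrades after a deploy, revert automatically instead of waiting for the 2 AM page.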
Where vibe-coded apps typically score
This isn't a knock on AI coding. It's a structural problem: you got great output for what you asked for. You just didn't know to ask for the other eight things.
The fix is a forcing function — something that checks all 12 before you ship, every time, automatically.
How to audit your project automatically
The midas_completeness MCP tool runs a completeness audit against all 12 ingredients. It doesn't just ask "does auth exist?" — it checks for token refresh, RBAC, session expiry handling. It doesn't just ask "do you have tests?" — it checks coverage, test type distribution, and whether you have E2E coverage for critical paths.
```bash
# Run in Cursor (via MCP) or CLI
npx merlyn-mcp

# Then ask Claude:
# "Run midas_completeness and tell me what's missing"
```
The output is a scored breakdown of all 12 ingredients, with specific gaps called out and suggested next steps. Run it before you ship. Run it every time you add a major feature. Use it as your definition of "done."
The practical checklist
Before any production deploy, walk these 12:
- Frontend: Does it work on mobile? Are loading/error states handled?
- Backend: Are all endpoints rate-limited? Is input validated server-side?
- Database: Are migrations versioned? Are production indexes in place?
- Auth: Does token refresh work? Are session cookies httpOnly + secure?
- API Integrations: Are webhooks verified? Is retry logic in place?
- State Management: Does stale data get invalidated on mutations?
- Design System: Are spacing, color, and typography consistent?
- Testing: Do critical paths have test coverage? Do tests run in CI?
- Security: Any hardcoded secrets? Is `npm audit` clean of criticals?
- Error Handling: Are all async errors caught? Do users see useful messages?
- Version Control: Is .env in .gitignore? Are commits clean?
- Deployment: Is there a CI pipeline? Is there a rollback path?
If you can't confidently answer yes to every item, you're not ready to ship. That's not a judgment — it's a definition.
The goal isn't perfection. A score of 10/12 with known, documented gaps is a production-ready system. A score of 4/12 with unknown gaps is a liability.
Know your gaps. Ship with eyes open.
Audit your project now
Run midas_completeness in Cursor and get a scored breakdown of all 12 ingredients. No signup required.
Then read: Golden Code: The Methodology That Turns Vibe-Coding into Production Software →