When AI Makes Code Worse: 10 Failure Patterns + Fixes (with examples)
I’ve had to fix AI-generated code that looked fine until it hit production or a security review. The patterns are predictable: injection flaws, wrong API usage, phantom imports, and over-engineered solutions that introduce bugs instead of removing them. This guide names 10 failure patterns with concrete “wrong vs fix” examples and a short checklist so you can catch them before they ship.
The data backs it up. In Veracode’s 2025 GenAI Code Security Report, 45% of AI-generated code samples failed security tests and introduced OWASP Top 10 vulnerabilities; only 55% passed. Cross-site scripting (XSS) defenses failed in 86% of relevant samples, and failure rates varied by language (e.g. Java 72%, JavaScript 43%, Python 38%). So “when AI makes code worse” isn’t rare—it’s something you should plan for. Below are the 10 patterns I see most often, plus fixes you can apply immediately.
Failure Patterns 1–4: Security (Injection, XSS, Auth, Secrets)
1. SQL / NoSQL injection from string concatenation
AI often suggests queries built with string interpolation or concatenation. That’s fine for static parts, but as soon as user input is spliced in, you get injection.
Wrong (pattern):
const query = `SELECT * FROM users WHERE id = ${userId}`;
db.query(query);
or in NoSQL:
const filter = { username: req.body.username }; // unsanitized
db.collection("users").findOne(filter);
Fix: Use parameterized queries or prepared statements so the driver treats input as data, not code. In Node with pg: db.query('SELECT * FROM users WHERE id = $1', [userId]). In MongoDB, avoid passing raw req.body directly into find/update; validate and build the filter from whitelisted fields. Veracode’s report notes that AI-generated code often fails here; manual review of any query that includes user input is non-negotiable.
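For the NoSQL side, a whitelist filter builder is one way to keep raw req.body out of the driver. This is a minimal sketch; the field names are illustrative, not from any specific schema:

```javascript
// Sketch: build a MongoDB filter from whitelisted fields only, so raw
// req.body never reaches the driver. Field names are illustrative.
const ALLOWED_FIELDS = ["username", "email"];

function buildFilter(body) {
  const filter = {};
  for (const field of ALLOWED_FIELDS) {
    // Accept only plain strings; objects like { $gt: "" } are dropped,
    // which blocks NoSQL operator injection.
    if (typeof body[field] === "string") {
      filter[field] = body[field];
    }
  }
  return filter;
}
```

Then db.collection("users").findOne(buildFilter(req.body)) can never receive an attacker-supplied operator object.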
2. Cross-site scripting (XSS) from unescaped output
AI may generate markup or HTML that embeds user data without escaping. In Veracode’s 2025 report, 86% of relevant AI-generated samples failed to defend against XSS.
Wrong (pattern):
element.innerHTML = `<div>Hello, ${userInput}</div>`;
If userInput is <img src=x onerror="alert(1)">, you get script execution.
Fix: Never assign raw user input to innerHTML. Use textContent for plain text, or a sanitization library (e.g. DOMPurify) if you must render HTML. In React, default escaping in {userInput} is safe; avoid dangerouslySetInnerHTML unless you sanitize first.
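If you genuinely must build an HTML string, the minimum defense is escaping the five significant characters. This is a sketch of that idea, not a replacement for textContent or DOMPurify:

```javascript
// Minimal HTML escaper: & must be replaced first, or the later
// replacements would double-escape their own output.
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

With this, the payload from the example renders as inert text instead of executing.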
3. JWT / session handling that accepts unsigned or mis-verified tokens
AI sometimes suggests jwt.decode() to “read” the token. Decode only parses the payload; it does not verify the signature, so an attacker can forge a token and your app may accept it. Other issues include not specifying allowed algorithms, leading to algorithm-confusion attacks.
Wrong (pattern):
const payload = jwt.decode(token);
if (payload) req.user = payload;
Fix: Use jwt.verify(token, secretOrPublicKey, { algorithms: ['RS256'] }) (or whatever algorithm you use) so the signature is checked. Never trust a token without verification. Documented cases of Copilot-generated auth code with JWT misuse have led to critical vulnerabilities; always verify.
4. Hardcoded secrets and credentials
AI tends to put API keys, passwords, or connection strings directly in code when you ask for “a quick example.” Once committed, they’re in history and in any copy of the repo.
Wrong (pattern):
const apiKey = "sk_live_abc123...";
const dbUrl = "postgresql://user:password@host/db";
Fix: Use environment variables (e.g. process.env.API_KEY) and a .env file that is gitignored. Load secrets at runtime from a secret manager in production. Add “do not hardcode secrets; use env vars” to your prompt if you use AI to generate config or client code.
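A small fail-fast helper makes the env-var approach stick: the app refuses to start with a missing secret instead of silently falling back to a hardcoded value. A sketch:

```javascript
// Sketch: fail fast at startup if a required secret is missing,
// so a hardcoded fallback never creeps in.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage (variable names are illustrative):
// const apiKey = requireEnv("API_KEY");
// const dbUrl = requireEnv("DATABASE_URL");
```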
Failure Patterns 5–7: Correctness (Phantom Code, Wrong APIs, Logic Bugs)
5. Phantom imports and non-existent APIs
Models hallucinate packages and function names that look plausible but don’t exist or have different signatures. You see correct-looking imports and calls that fail at import or runtime.
Wrong (pattern):
import { createClient } from "redis-client"; // package doesn't exist or name wrong
const data = await fetchUserById(userId); // function never defined in codebase
Fix: After generating code, run the app and tests. Install dependencies and confirm package names and exports (e.g. check npm or official docs). For codebase-specific functions, ensure the AI is given enough context (file paths, existing module names) so it doesn’t invent APIs. If the assistant suggests a “convenience” helper, verify it exists or implement it.
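One cheap check for phantom dependencies is require.resolve, which throws if a package is not actually installed (or built into Node). A sketch:

```javascript
// Sketch: require.resolve throws if a package name doesn't resolve,
// which catches hallucinated dependencies before runtime.
function packageExists(name) {
  try {
    require.resolve(name);
    return true;
  } catch {
    return false;
  }
}
```

Running the build or test suite gives the same signal, but a check like this can gate generated code in a script before you even try to run it.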
6. Deprecated or wrong API usage
AI training data includes old examples, so it may suggest deprecated methods or signatures that no longer match the current library version (e.g. Next.js, React, Express).
Wrong (pattern):
// Next.js: getServerSideProps vs App Router; or old React lifecycle
componentWillMount() { ... }
Fix: Specify framework and version in your prompt (e.g. “Next.js 14 App Router,” “React 18 with hooks”). After generation, skim the official docs for the APIs used. If something looks outdated, search for the current API before merging.
7. React useEffect and dependency bugs
AI often generates useEffect with missing or incorrect dependency arrays, leading to stale closures, infinite re-renders, or effects that don’t run when they should.
Wrong (pattern):
useEffect(() => {
fetchData(userId);
}, []); // userId not in deps; stale or missing updates
// or
useEffect(() => {
setCount(c => c + 1);
}, [count]); // infinite loop
Fix: Include every value from the effect body that can change in the dependency array (or use a linter rule like exhaustive-deps). For “run once on mount” with a value, either include it in deps or document why it’s safe to omit. Avoid updating state in an effect in a way that retriggers the same effect unless that’s intentional (e.g. pagination). A quick test or manual run will often catch infinite loops.
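The stale-closure mechanism behind the missing-dependency bug can be reproduced in plain JavaScript, no React required; the function names here are illustrative:

```javascript
// A callback captures the variables it saw when it was created -- just like
// an effect with an empty dependency array captures the first render's props.
function makeEffect(userId) {
  return () => `fetching user ${userId}`;
}

let userId = 1;
const effect = makeEffect(userId); // like useEffect(..., []) on first render
userId = 2;                        // a later "render" updates the value
effect();                          // still uses userId = 1 -- stale closure
```

Adding userId to the dependency array is the React equivalent of calling makeEffect again with the new value.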
Failure Patterns 8–10: Structure and Maintainability
8. Over-engineering and unnecessary abstraction
AI can produce “enterprise-style” code: extra layers, generic interfaces, and patterns that don’t match the actual problem size. The result is harder to read and change and can introduce bugs in the glue code.
Wrong (pattern): A simple “get user by ID” endpoint wrapped in a factory, a strategy pattern, and three indirection layers when a single function would do.
Fix: Ask for minimal, readable code and “no extra abstraction unless we need it.” Review with “would a teammate understand this in 30 seconds?” If not, simplify. Prefer a small, clear function over a “flexible” framework when the requirement is one-off or small.
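For comparison, here is roughly what the un-over-engineered version of that endpoint's core looks like; the names (db, findUser) are illustrative:

```javascript
// The simple version: one clear function instead of a factory, a strategy,
// and three indirection layers. db.findUser is an assumed data-access call.
async function getUserById(db, id) {
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("Invalid user id");
  }
  return db.findUser(id);
}
```

A teammate can understand this in well under 30 seconds, which is the bar the review question sets.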
9. Copy-paste duplication and inconsistent patterns
When you ask for “the same thing in another file,” AI often duplicates logic instead of reusing a shared function or component. That leads to drift: one place gets fixed, the other doesn’t.
Wrong (pattern): Two components each with a 20-line “format date and status” block that should be one helper.
Fix: After generating, look for repeated blocks and extract a shared function or component. Add to your prompt: “reuse the existing helper in utils/format.ts” or “match the pattern used in ComponentA.” A quick grep for similar strings helps spot duplication.
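As a sketch of the extraction, the duplicated 20-line block might collapse into one shared helper like this; the field names (lastSeen, active) and locale default are hypothetical:

```javascript
// Hypothetical shared helper extracted from two components' duplicated
// "format date and status" blocks, so a fix lands in one place.
function formatUserStatus(user, locale = "en-US") {
  // timeZone pinned to UTC so output doesn't depend on the server's zone
  const date = new Date(user.lastSeen).toLocaleDateString(locale, {
    timeZone: "UTC",
  });
  return `${user.active ? "Active" : "Inactive"} since ${date}`;
}
```

Both components then import this from one module, so they can never drift apart.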
10. Wrong patterns for the framework or runtime
AI might mix patterns from different ecosystems (e.g. Express-style middleware in a serverless handler, or sync file APIs in Node where async is required) or suggest a pattern that doesn’t fit your runtime (e.g. heavy in-memory caching in a short-lived serverless function).
Wrong (pattern): Blocking readFileSync in a Lambda or an Express next()-style middleware chain in a single Vercel serverless function.
Fix: State runtime and environment in the prompt (e.g. “Node 20, serverless function, no persistent in-memory state”). Review for sync vs async, lifecycle (cold start, request scope), and that the pattern matches the framework’s intended use. When in doubt, check the official docs for the target platform.
How to Catch and Prevent These Failures
A few habits reduce how often AI-generated code makes things worse.
Pre-generation
- Scope the change: One clear task per request (e.g. “add parameterized query to this function”) instead of “add auth and logging and error handling.” Smaller scope means fewer failure modes. For a full workflow on scoping and reviewing AI-assisted changes, see our AI coding assistant workflow guide.
- Constrain the prompt: Specify stack, version, and “no hardcoded secrets / use parameterized queries / use verify not decode for JWT.” The more precise the constraints, the less the model fills in the wrong way.
Post-generation
- Run and test: npm install, npm run build, npm test. Many phantom imports and wrong APIs fail immediately. Add a smoke test or a quick manual run for the path you changed.
- Security pass: For any code that touches user input, auth, or data access, check: parameterized queries (no string concatenation with user input), no raw innerHTML with user data, JWT verified with an explicit algorithm, no secrets in code. Even 5 minutes of targeted review catches most of patterns 1–4.
- Simplicity pass: Would a teammate refactor this to half the code? If yes, simplify before merging. That addresses over-engineering and a lot of “looks correct but breaks later” cases.
Checklist (quick scan before merge)
- No string concatenation or unsanitized input in SQL/NoSQL queries.
- No raw user input in innerHTML (or sanitized if HTML is required).
- JWT/session: verify signature and algorithm; no decode-only.
- No API keys or passwords in source; env or secret manager only.
- Imports and called functions exist (run the code).
- APIs match current library/framework version.
- useEffect (or equivalent) deps correct; no obvious infinite loop.
- No unnecessary abstraction or duplicated logic.
- Pattern fits the runtime (sync/async, serverless vs long-lived).
The real-world impact is documented: there are cases of Copilot-generated auth code with multiple critical vulnerabilities being exploited within days of deployment. The 10 patterns above and the checklist won’t catch everything, but they catch the majority of what shows up in security reports and production incidents. Use them as a standard pass before you ship AI-generated code.
FAQ
Q: How often does AI-generated code introduce security vulnerabilities?
In Veracode’s 2025 GenAI Code Security Report, 45% of AI-generated code samples failed security tests and introduced OWASP Top 10 issues. XSS defenses failed in 86% of relevant samples. So a significant share of AI output needs a security-focused review before production.
Q: What are the most common security failures in AI-written code?
Frequent ones include SQL/NoSQL injection (string concatenation with user input), XSS (unescaped user output in HTML), JWT/session misuse (e.g. using decode instead of verify, or weak algorithm handling), and hardcoded secrets. Fix by using parameterized queries, safe output encoding, proper verify() and algorithm specification, and environment-based secrets.
Q: Why does AI suggest wrong or non-existent APIs?
Models hallucinate when they lack accurate context: they fill in with statistically plausible names or signatures that don’t exist in your stack or version. Stale training data also leads to deprecated APIs. Reduce this by specifying framework and version in prompts and by running the code and tests to catch phantom imports and wrong calls.
Q: How do I reduce over-engineering in AI-generated code?
Ask explicitly for minimal, readable code and “no extra abstraction.” After generation, review with “can we delete half of this and still solve the problem?” Extract shared logic to avoid duplication and remove unnecessary layers. Small, clear functions beat “flexible” frameworks for one-off or small features.
Q: Should I always review AI-generated code before merging?
Yes. Treat AI output as a draft: run it, test it, and do at least a short security and simplicity pass. A checklist (parameterized queries, no raw innerHTML, JWT verify, no secrets, deps correct, pattern fits runtime) plus running the app and tests catches most of the 10 failure patterns above.
Related keywords
- AI generated code security vulnerabilities
- when AI makes code worse
- GitHub Copilot code failures
- AI coding assistant mistakes
- SQL injection AI generated code
- JWT verify vs decode security
- how to review AI generated code
- AI code review checklist
- Cursor Copilot wrong code patterns
- reduce AI code hallucinations
I’ve had to fix production issues that came from AI-generated code that passed a quick look but failed under security review or real usage. The 10 patterns above—injection, XSS, JWT misuse, secrets, phantom APIs, deprecated usage, useEffect bugs, over-engineering, duplication, and wrong runtime patterns—are the ones I see over and over. Use the “wrong vs fix” examples and the checklist as a standard pass; combine that with small, well-scoped prompts and tests so AI makes code better, not worse.