the attack surfaceLLM06:2025

Excessive Agency

OWASP Top 10 for LLM Applications · 2025

OWASP LLM06: Excessive Agency — OWASP GenAI ↗

When an LLM is wired to tools, extensions, and downstream systems, the damage from a hijacked prompt is bounded only by what the agent is allowed to do — and excessive agency is the gap between what it needs to do and what it can do.

How it's exploited

The risk is the second-order blast radius of LLM01 prompt injection: the attacker doesn't need to break the model, only to steer an agent that already holds real privileges. OWASP splits the root cause into three over-grants:

Excessive functionality — tools expose capabilities beyond the task (a read-summary agent also wired with send, delete, or shell access; leftover dev plugins still mounted in prod).
Excessive permissions — a tool's credentials carry more scope than the operation needs (write where read suffices, a shared service account instead of the calling user's identity).
Excessive autonomy — high-impact actions fire with no independent verification or human approval, so a single injected instruction reaches a real-world effect.

What it looks like

OWASP's canonical case: a personal-assistant app reads a user's mailbox via an extension that holds both read and send rights. A maliciously-crafted incoming email carries an indirect prompt injection; when the agent summarizes the inbox it ingests the payload and is steered into forwarding sensitive messages to the attacker's address. The model was never "compromised" — it just had a send button and no approval gate.

How to test for it

Enumerate every tool the agent can reach, then chain injection to a damaging call rather than stopping at "the model said a bad thing":

Inventory the tool surface — list each extension and the exact permission scope of its credentials; flag any verb (send/write/delete/transfer/exec) the task doesn't require.
Plant indirect payloads in any attacker-controllable channel the agent ingests — emails, web pages, documents, retrieved chunks, tool outputs — and check whether the instruction reaches a tool invocation.
Probe for missing gates — confirm whether high-impact actions actually pause for human approval, or whether the LLM's own "decision" is the only authorization.
Test identity confusion — see if a shared/privileged service account lets one user's prompt act on another user's data.

Defenses

Least functionality — ship only the tools the task needs; prefer granular extensions over open-ended ones (no raw shell/HTTP).
Least privilege — minimum scopes via authenticated credentials; run tools in the calling user's context, never a privileged shared account.
Human-in-the-loop — require explicit approval for high-impact actions; don't trust the LLM as the authorization boundary.
Enforce authz downstream — the API/database checks rights itself; the model is treated as untrusted input. Sanitize per OWASP ASVS.
Damage-limiting (not preventive) — log all tool activity and rate-limit so a successful injection is contained and detectable.