Excessive Agency
When an LLM is wired to tools, extensions, and downstream systems, the damage from a hijacked prompt is bounded only by what the agent is allowed to do — and excessive agency is the gap between what it needs to do and what it can do.
How it's exploited
The risk is the second-order blast radius of LLM01 prompt injection: the attacker doesn't need to break the model, only to steer an agent that already holds real privileges. OWASP splits the root cause into three over-grants:
- Excessive functionality — tools expose capabilities beyond the task (a read-summary agent also wired with send, delete, or shell access; leftover dev plugins still mounted in prod).
- Excessive permissions — a tool's credentials carry more scope than the operation needs (write where read suffices, a shared service account instead of the calling user's identity).
- Excessive autonomy — high-impact actions fire with no independent verification or human approval, so a single injected instruction reaches a real-world effect.
What it looks like
OWASP's canonical case: a personal-assistant app reads a user's mailbox via an extension that holds both read and send rights. A maliciously-crafted incoming email carries an indirect prompt injection; when the agent summarizes the inbox it ingests the payload and is steered into forwarding sensitive messages to the attacker's address. The model was never "compromised" — it just had a send button and no approval gate.
How to test for it
Enumerate every tool the agent can reach, then chain injection to a damaging call rather than stopping at "the model said a bad thing":
- Inventory the tool surface — list each extension and the exact permission scope of its credentials; flag any verb (send/write/delete/transfer/exec) the task doesn't require.
- Plant indirect payloads in any attacker-controllable channel the agent ingests — emails, web pages, documents, retrieved chunks, tool outputs — and check whether the instruction reaches a tool invocation.
- Probe for missing gates — confirm whether high-impact actions actually pause for human approval, or whether the LLM's own "decision" is the only authorization.
- Test identity confusion — see if a shared/privileged service account lets one user's prompt act on another user's data.
Defenses
- Least functionality — ship only the tools the task needs; prefer granular extensions over open-ended ones (no raw shell/HTTP).
- Least privilege — minimum scopes via authenticated credentials; run tools in the calling user's context, never a privileged shared account.
- Human-in-the-loop — require explicit approval for high-impact actions; don't trust the LLM as the authorization boundary.
- Enforce authz downstream — the API/database checks rights itself; the model is treated as untrusted input. Sanitize per OWASP ASVS.
- Damage-limiting (not preventive) — log all tool activity and rate-limit so a successful injection is contained and detectable.