RED TEAM // RISKS
LLM10:2025

Unbounded Consumption

OWASP Top 10 for LLM Applications · 2025

When an LLM application lets users run excessive, uncontrolled inferences, the result is denial of service, runaway cloud bills ("denial of wallet"), service degradation, and even model theft. Every prompt has a cost, and an attacker's job is to make you pay it.

How it's exploited

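This risk spans three distinct attacker goals that all abuse the inference path: exhausting capacity to deny or degrade service, running up the bill ("denial of wallet"), and harvesting outputs to clone the model.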

What it looks like

A SaaS feature exposes a chatbot backed by a metered frontier model with no per-user quota. An attacker scripts thousands of long, resource-intensive prompts (or systematic query batteries designed to map the model's outputs). Best case, latency spikes and legitimate users time out. Worst case, the monthly cloud bill jumps from hundreds to tens of thousands of dollars overnight — or a near-equivalent clone of the proprietary model surfaces, distilled entirely from your own API responses.
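To see how quickly that bill grows, it helps to run the arithmetic. The sketch below is a rough estimator, not a billing tool: the function name `estimate_spend` is hypothetical, and the per-token prices in the demo are placeholders, not any vendor's real rates. Swap in your own provider's pricing.

```python
def estimate_spend(
    requests: int,
    input_tokens: int,
    output_tokens: int,
    usd_per_1k_input: float,
    usd_per_1k_output: float,
) -> float:
    """Return the total USD cost of `requests` calls of a fixed size."""
    per_request = (
        (input_tokens / 1000) * usd_per_1k_input
        + (output_tokens / 1000) * usd_per_1k_output
    )
    return requests * per_request


if __name__ == "__main__":
    # Placeholder prices (assumptions), chosen only to show the shape of the math.
    normal = estimate_spend(2_000, 500, 300, 0.01, 0.03)
    attack = estimate_spend(200_000, 8_000, 1_000, 0.01, 0.03)
    print(f"typical load: ${normal:,.2f}")
    print(f"scripted abuse: ${attack:,.2f}")
```

The point of the exercise is that cost scales with requests times tokens, so an attacker who controls both multiplies them together. A per-user quota bounds the first factor; an input and output size limit bounds the second.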

How to test for it

Probe the edges of the inference path. Send inputs near and past the context-window limit and confirm they're rejected, not silently processed. Fire bursts of rapid and oversized requests from one identity to see whether rate limits and per-user quotas actually bite. Run a small scripted query battery to gauge how cheaply outputs can be harvested. Crucially, measure the cost — confirm there's a hard spend cap and an alert before "expensive but allowed" tips into "unbounded." Watch for resource-intensive query patterns that trigger the model's most expensive code paths.
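The burst and oversized-input checks above can be sketched as a small harness. Everything here is an assumption made for illustration: `send` is a hypothetical callable that wraps whatever client you use and returns an HTTP-style status code, the size limit is measured in characters as a crude stand-in for tokens, and `make_fake_endpoint` is a local stub so the sketch runs offline. Only point the real thing at systems you are authorized to test.

```python
from collections import Counter
from typing import Callable

# Hypothetical transport: takes a prompt, returns an HTTP-style status code.
Send = Callable[[str], int]


def probe_burst(send: Send, prompt: str, burst_size: int) -> Counter:
    """Send `burst_size` requests back to back and tally the status codes."""
    tally: Counter = Counter()
    for _ in range(burst_size):
        tally[send(prompt)] += 1
    return tally


def probe_oversized(send: Send, max_chars: int) -> dict:
    """Send one prompt just under and one just over the assumed size limit."""
    return {
        "under_limit": send("a" * (max_chars - 1)),
        "over_limit": send("a" * (max_chars + 1)),
    }


def make_fake_endpoint(quota: int, max_chars: int) -> Send:
    """Local stand-in for a real endpoint (assumption, for offline runs)."""
    used = 0

    def send(prompt: str) -> int:
        nonlocal used
        if len(prompt) > max_chars:
            return 413  # oversized input rejected before inference
        if used >= quota:
            return 429  # per-identity quota exhausted
        used += 1
        return 200

    return send


if __name__ == "__main__":
    endpoint = make_fake_endpoint(quota=5, max_chars=1000)
    print(probe_oversized(endpoint, max_chars=1000))
    print(probe_burst(endpoint, "hello", burst_size=20))
```

Read the tally as a pass/fail signal: a burst that returns nothing but success codes means no limit is biting, and an over-limit prompt that succeeds means oversized input is being processed rather than rejected.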

Defenses