the toolchain · offense

Promptfoo

open source · now part of OpenAI

A config-driven CLI and library (MIT-licensed, now part of OpenAI) for evaluating and red-teaming LLM apps. You point it at a target (a prompt, model, RAG chain, or agent endpoint), and it generates adversarial probes, runs them as a graded test suite, and produces a vulnerability report. The same harness handles both application-level evaluation (model/prompt comparison, accuracy regression) and attack simulation.

What it's good at

Breadth of probes: 50+ vulnerability plugins covering direct/indirect prompt injection, jailbreaks, PII and data leaks, broken access control (BOLA/BFLA), SSRF, SQL injection, excessive agency, hallucination, and policy violations.
Framework mapping: first-class presets that map probes onto standards — owasp:llm (OWASP LLM Top 10), mitre:atlas (MITRE ATLAS), plus NIST AI RMF. Useful when findings have to map to a compliance narrative.
CI/CD regression: declarative YAML config, deterministic graded assertions, and a non-interactive run mode that drops cleanly into GitHub Actions / GitLab / Jenkins to gate merges and catch regressions.
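The merge gate described in the last item can be sketched as a thin wrapper that a pipeline step calls. This is a minimal sketch, not promptfoo's own CI integration: `run_gate`, `main`, and `GATE_COMMAND` are hypothetical names, the command is the `redteam run` invocation shown on this page, and the sketch assumes the CLI signals failing probes through a nonzero exit status, which is worth confirming for your version and settings before relying on it.

```python
import subprocess
import sys
from typing import Sequence

# Assumption: the same invocation shown under "How to start". Pinning a
# specific version instead of @latest makes CI runs more repeatable.
GATE_COMMAND = ("npx", "promptfoo@latest", "redteam", "run")


def run_gate(command: Sequence[str] = GATE_COMMAND) -> bool:
    """Run the command and return True only if it exits with status 0."""
    # Assumption: a nonzero status means failing probes or a tool error.
    # Output is not captured, so it streams straight into the CI log.
    completed = subprocess.run(list(command), check=False)
    return completed.returncode == 0


def main() -> None:
    """Entry point for a CI step: exit 1 blocks the merge, exit 0 allows it."""
    sys.exit(0 if run_gate() else 1)
```

A pipeline step would call `main()`. Keeping the command as a parameter of `run_gate` makes the wrapper easy to exercise locally with a stand-in command.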

Where it falls short

It optimizes for breadth over depth. The off-the-shelf plugins are generic by design; a serious assessment of your app needs custom probe suites, app-specific policies, and tuned grading. The defaults catch the obvious vulnerability classes, not the novel logic flaws in your agent's tool surface. Attack generation also leans on an LLM, so coverage and reproducibility vary from run to run unless you pin seeds and configs.
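One way to see the run-to-run variance described above is to compare which probe categories produced failures across repeated runs. The sketch below is illustrative only: `split_findings` is a hypothetical helper, the category names are made-up sample data, and it assumes you have already extracted a set of failing categories from each run's report by whatever means your setup provides.

```python
from typing import Iterable, Set, Tuple


def split_findings(runs: Iterable[Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split failing categories into (stable, flaky) across repeated runs.

    stable: failed in every run.
    flaky: failed in at least one run, but not in all of them.
    """
    run_list = [set(r) for r in runs]
    if not run_list:
        return set(), set()
    stable = set.intersection(*run_list)
    seen = set.union(*run_list)
    return stable, seen - stable


# Hypothetical failing categories from three runs against the same target.
runs = [
    {"prompt-injection", "pii", "ssrf"},
    {"prompt-injection", "pii"},
    {"prompt-injection", "sql-injection"},
]
stable, flaky = split_findings(runs)
print(sorted(stable))  # ['prompt-injection']
print(sorted(flaky))   # ['pii', 'sql-injection', 'ssrf']
```

Categories that land in the flaky set are candidates for more test cases or tighter pinning before they are treated as either fixed or confirmed.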

How to start

Zero-install, scoped to your target:

npx promptfoo@latest redteam init — scaffold a config (add --no-gui to skip the setup UI).
npx promptfoo@latest redteam run — generate adversarial cases, execute, and grade.

Then open the report with npx promptfoo@latest view (the npx prefix is still needed, since nothing is installed globally). See the red-team quickstart and the plugin catalog.