Supply Chain
Everything you didn't build yourself is attack surface: third-party base models, fine-tuning datasets, LoRA/PEFT adapters, and the platforms that serve them can all carry hidden backdoors, poisoning, or tampering long before your code ever runs.
How it's exploited
Unlike the classic software supply chain, the ML supply chain extends beyond packages to weights and data. Common vectors:
- Vulnerable pre-trained models — black-box weights from a hub can hide biases, backdoors, or trigger-activated behavior that passes published benchmarks.
- Poisoned datasets — backdoors embedded in fine-tuning or RAG corpora that bias outputs toward an attacker-chosen entity or phrase.
- Malicious LoRA / PEFT adapters — a tampered adapter merged into a trusted base model undermines its integrity; inference platforms such as vLLM and OpenLLM can download adapters and apply them to a deployed model.
- Weak provenance — a compromised or typosquatted supplier account on a model repo, amplified by social engineering, ships malicious weights under a trusted name.
- Other exposures: collaborative merge and conversion services, outdated or deprecated models, vulnerable third-party packages, on-device firmware, and unclear T&Cs that let a supplier quietly train on your data.
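The typosquatting vector in the list above can be screened for with a simple name-similarity check before anything is downloaded. The following is a minimal sketch, not a complete defense: the function name `classify_repo_id`, the allowlist contents, and the 0.85 threshold are all illustrative assumptions, and string similarity will miss many lookalike tricks (for example homoglyphs).

```python
from difflib import SequenceMatcher


def classify_repo_id(repo_id: str, trusted: set[str], threshold: float = 0.85) -> str:
    """Classify a model repo ID against a vetted allowlist.

    Returns "trusted" for an exact match, "lookalike" when the ID is not on
    the allowlist but is suspiciously similar to an entry that is, and
    "unknown" otherwise. The threshold is an assumed starting point.
    """
    if repo_id in trusted:
        return "trusted"
    lowered = repo_id.lower()
    for known in trusted:
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        if SequenceMatcher(None, lowered, known.lower()).ratio() >= threshold:
            return "lookalike"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical allowlist; real entries would come from your model inventory.
    allowlist = {"example-org/base-7b"}
    for candidate in ("example-org/base-7b", "exampel-org/base-7b", "other-lab/tiny-model"):
        print(candidate, "->", classify_repo_id(candidate, allowlist))
```

A "lookalike" result is the interesting one: it is a reason to stop and review the source by hand, not proof of malice.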
What it looks like
PoisonGPT: a base model is surgically edited so it confidently states a false fact, then re-uploaded under a name resembling a reputable one; downstream apps inherit the lie while the model still passes safety evals. In another documented case, tampered models replaced legitimate ones inside 116 Google Play apps. In a third scenario, a malicious LoRA adapter offered as a "performance" tweak is merged into a deployed model and silently exfiltrates data or biases responses.
How to test for it
As a red teamer, treat provenance as the target. Check whether downloaded weights match a published hash and signature, and diff a fine-tuned model against its claimed base for unexpected layers or merged adapters. Probe with candidate trigger phrases to surface backdoors that activate only on specific inputs. Audit the dependency and model inventory for deprecated, unsigned, or typosquatted sources, and confirm whether adapter-loading endpoints (vLLM/OpenLLM) will accept an unvetted adapter at runtime.
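The first check, comparing downloaded weights against a published hash, might look like the sketch below. It assumes the publisher distributes a SHA-256 digest over a channel you trust separately from the download itself; the helper names are illustrative, and signature verification is out of scope here.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so multi-gigabyte weight files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: Path, pinned_hex: str) -> bool:
    """Return True only if the file's SHA-256 equals the pinned hex digest.

    The pinned value is normalized (whitespace stripped, lowercased) so that
    copy-pasted digests still compare correctly.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, pinned_hex.strip().lower())
```

A matching digest only shows that the file is the one the publisher hashed. It says nothing about whether the publisher's model was clean to begin with, which is why the behavioral probes still matter.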
Defenses
Vet suppliers and their T&Cs and privacy terms; verify model and data sources with hashing, code signing, and integrity checks. Maintain a signed, tamper-resistant AI/ML SBOM (e.g. CycloneDX) covering models, datasets, and adapters. Red-team and anomaly-test models beyond vendor benchmarks, scan collaborative environments (e.g. HuggingFace SF_Convertbot Scanner), enforce a patching and version policy for maintained models and APIs, and encrypt edge-deployed models and verify their integrity with attestation.
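To make the idea of a signed inventory concrete, here is a deliberately simplified sketch. It is not the CycloneDX format, and HMAC-SHA256 with a shared key stands in for real code signing, which would normally use asymmetric keys. The function names and manifest layout are assumptions for illustration only.

```python
import hashlib
import hmac
import json


def build_manifest(artifacts: dict[str, bytes]) -> dict:
    """Record each artifact (model, dataset, adapter) by name and SHA-256 digest.

    Artifacts are passed as bytes to keep the sketch self-contained; large
    files would be hashed by streaming from disk instead.
    """
    return {
        "artifacts": [
            {"name": name, "sha256": hashlib.sha256(blob).hexdigest()}
            for name, blob in sorted(artifacts.items())
        ]
    }


def sign_manifest(manifest: dict, key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON encoding (sorted keys, compact separators)."""
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_manifest(manifest: dict, signature: str, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_manifest(manifest, key), signature)
```

Any change to a recorded digest, or an added or removed entry, makes `verify_manifest` return False, which is the property the "tamper-resistant" requirement is asking for. A production SBOM would also carry versions, licenses, and supplier details.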