AI poisoning: Anthropic study finds just 250 malicious files can compromise the largest LLMs

October 27, 2025
Posted by jeff

27 Oct

Contrary to long-held beliefs that attacking or contaminating large language models (LLMs) requires enormous volumes of malicious data, new research from AI startup Anthropic, conducted in collaboration with the UK AI Security Institute and the Alan Turing Institute, has found otherwise. The study demonstrates that as few as 250 poisoned documents can implant backdoor vulnerabilities into LLMs, regardless of their size or training dataset volume.