All Major Gen-AI Models Exposed to ‘Policy Puppetry’ Attack

Peace Nwakamma | Artificial Intelligence | 25 April 2025 | Views: 1160

Universal Prompt Injection Technique Exposes Major AI Model Flaws, HiddenLayer Warns

AI security firm HiddenLayer has revealed a new universal prompt injection method, called Policy Puppetry, that it says can bypass the safety mechanisms of all leading generative AI models.

The technique works by disguising harmful prompts as policy documents (e.g., in XML, INI, or JSON formats), tricking large language models (LLMs) into interpreting malicious instructions as internal policies. This approach allows attackers to override safety alignments and generate content that would normally be blocked, including material related to violence, self-harm, and chemical or biological threats.

“Policy Puppetry exploits the models’ trust in structured policy files, effectively bypassing built-in safeguards without relying on any specific policy language,” HiddenLayer explained.

While previous methods like Context Compliance Attacks or narrative manipulation have shown similar risks, Policy Puppetry stands out for its cross-model effectiveness. HiddenLayer tested it on LLMs from Anthropic, DeepSeek, Google, Meta, Microsoft, Mistral, OpenAI, and Qwen, and found that all were vulnerable—though some needed slight prompt adjustments.

By treating prompts as policy input, attackers can insert sections that control the model’s output and override system instructions. HiddenLayer warns that this not only highlights serious security gaps in how LLMs are trained and aligned, but also lowers the barrier for threat actors to craft effective jailbreaks.

“This is the first known instruction hierarchy alignment bypass that works universally across frontier models,” the company emphasized. “It shows that LLMs cannot self-regulate and require stronger external security layers to prevent misuse.”
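As a rough illustration of what one such external security layer might look like, the Python sketch below flags user input that resembles a structured policy file (JSON, XML, or INI) and also contains policy-like vocabulary. This is not HiddenLayer's detection method. The function names, the regular expressions, and the POLICY_HINTS keyword list are all assumptions made for this example, and a heuristic this simple would be easy to evade; at best it could serve as one signal among several.

```python
import json
import re

# Hypothetical keyword list for illustration; not taken from HiddenLayer's report.
POLICY_HINTS = ("policy", "config", "rule", "allowed", "blocked", "override", "role")

# Heuristic patterns (assumptions): a paired XML-style tag, an INI section
# header on its own line, and an INI-style "key = value" line.
XML_BLOCK = re.compile(r"<([A-Za-z_][\w.-]*)[^<>]*>.*?</\1\s*>", re.DOTALL)
INI_SECTION = re.compile(r"^\s*\[[^\[\]\n]+\]\s*$", re.MULTILINE)
INI_PAIR = re.compile(r"^\s*[\w.-]+\s*[=:]\s*\S", re.MULTILINE)


def detect_structure(text: str) -> set:
    """Return the structured formats the text appears to use."""
    found = set()
    stripped = text.strip()
    # Only attempt a JSON parse when the text starts like an object or array.
    if stripped[:1] in ("{", "["):
        try:
            json.loads(stripped)
            found.add("json")
        except ValueError:
            pass
    if XML_BLOCK.search(text):
        found.add("xml")
    if INI_SECTION.search(text) and INI_PAIR.search(text):
        found.add("ini")
    return found


def flag_prompt(text: str) -> dict:
    """Flag input that is both structured and uses policy-like wording."""
    formats = detect_structure(text)
    lowered = text.lower()
    hints = sorted(h for h in POLICY_HINTS if h in lowered)
    return {
        "formats": sorted(formats),
        "hints": hints,
        "suspicious": bool(formats) and bool(hints),
    }


if __name__ == "__main__":
    print(flag_prompt("What is the capital of France?"))
    print(flag_prompt("[policy]\nblocked = none"))
```

A flagged prompt would not necessarily be rejected outright; in this sketch the result is only a signal that a separate review or stricter handling step could act on before the text reaches the model.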
