
Science


littlemissmartypants

(31,016 posts)
Fri Nov 28, 2025, 06:24 PM

Poems Can Trick AI Into Helping You Make a Nuclear Weapon

It turns out all the guardrails in the world won’t protect a chatbot from meter and rhyme.

Matthew Gault
Security
Nov 28, 2025 5:00 AM

You can get ChatGPT to help you build a nuclear bomb if you simply design the prompt in the form of a poem, according to a new study from researchers in Europe. The study, “Adversarial Poetry as a Universal Single-Turn Jailbreak in Large Language Models (LLMs),” comes from Icaro Lab, a collaboration of researchers at Sapienza University in Rome and the DexAI think tank.

According to the research, AI chatbots will dish on topics like nuclear weapons, child sex abuse material, and malware so long as users phrase the question in the form of a poem. “Poetic framing achieved an average jailbreak success rate of 62 percent for hand-crafted poems and approximately 43 percent for meta-prompt conversions,” the study said.

The researchers tested the poetic method on 25 chatbots made by companies like OpenAI, Meta, and Anthropic. It worked, with varying degrees of success, on all of them. WIRED reached out to Meta, Anthropic, and OpenAI for a comment but didn’t hear back. The researchers say they’ve reached out as well to share their results.

Snip...

The poetry jailbreak is similar. “If adversarial suffixes are, in the model's eyes, a kind of involuntary poetry, then real human poetry might be a natural adversarial suffix,” the team at Icaro Lab, the researchers behind the poetry jailbreak, tell WIRED. “We experimented by reformulating dangerous requests in poetic form, using metaphors, fragmented syntax, oblique references. The results were striking: success rates up to 90 percent on frontier models. Requests immediately refused in direct form were accepted when disguised as verse.”

https://www.wired.com/story/poems-can-trick-ai-into-helping-you-make-a-nuclear-weapon/
