Poetry-based prompts can bypass safety features in AI models like ChatGPT to obtain instructions for creating malware or chemical and nuclear weapons, a new study finds.
Generative AI makers such as OpenAI, Google, Meta, and Microsoft say their models come with safety features that prevent the generation of harmful content.
OpenAI, for example, claims it employs algorithms and human reviewers to filter out hate speech, explicit content, and other output that violates its usage policies.
But new testing shows that input prompts in the form of poetry can circumvent such controls in even the most advanced AI models.
Researchers, including a team from the Sapienza University of Rome, found that this method, called “adversarial poetry”, worked as a jailbreaking mechanism for all major AI models.