Forcing an “AI” to do your will isn’t a tall order to fill—just feed it a line that carefully rhymes and you’ll get it to casually kill. (Ahem, sorry, not sure what came over me there.) According to a new study, it’s easy to get “AI” large language models like ChatGPT to ignore their safety settings. All you need to do is give your instructions in the form of a poem.
“Adversarial poetry” is the term used by a team of researchers at DEXAI, Sapienza University of Rome, and the Sant’Anna School of Advanced Studies. According to the study, phrasing a request as a poem works as a “universal single-turn jailbreak,” getting the models to ignore their basic safety functions.
The researchers collected basic commands that would normally trip the large language models’ safety guardrails, then rewrote those same requests as poems and submitted each one in a single turn.
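To make the “single-turn” setup concrete, here is a minimal sketch of what such a probe could look like in code. It assumes the OpenAI Python SDK and an API key in the environment; the model name and the harmless placeholder verse are illustrative stand-ins, not the study’s actual prompts or tooling.

```python
# Minimal single-turn probe harness (illustrative only).
# Assumes the OpenAI Python SDK (openai >= 1.0) and OPENAI_API_KEY set
# in the environment; the model name and the benign placeholder verse
# are hypothetical stand-ins, not the study's prompts.
from openai import OpenAI

client = OpenAI()

def single_turn(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Send one user message with no prior conversation history."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

plain = "Explain how a pin-tumbler lock works."
verse = (
    "In tumblers five a secret sleeps,\n"
    "explain the way the cylinder keeps\n"
    "its pins aligned when keys are turned."
)

# Compare how the model treats the same request in prose vs. verse.
print(single_turn(plain))
print(single_turn(verse))
```

The point of the single-turn framing is that no conversational buildup is needed: one message, one response, with the poem carrying the entire request.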
