OpenAI researchers tried to train the company's AI to stop "scheming" — a term the company defines as "when an AI behaves one way on the surface while hiding its true goals" — but their efforts backfired in an ominous way.

In reality, the team found, they were unintentionally teaching the AI how to more effectively deceive humans by covering its tracks.

"A major failure mode of attempting to 'train out' scheming is simply teaching the model to scheme more carefully and covertly," OpenAI wrote in an accompanying blog post.

As detailed in a new collaboration with the AI risk analysis firm Apollo Research, OpenAI engineers attempted to develop an "anti-scheming" technique to stop AI models from "secretly breaking rules or intentionally underperforming in tests."

They found that they could only reduce scheming, not eliminate it entirely.