OpenAI wants its next generation of AI models to be a lot more upfront about their mistakes. With ChatGPT wrong about 25% of the time, that kind of candor seems long overdue. But the company isn't trying to make models more self-aware in some deeper sense; it's training them to report their own errors directly.
This week, OpenAI published new research on a technique it calls “confessions”: a method that adds a second output channel to a model, one specifically trained to describe whether the model followed the rules, where it may have fallen short or hallucinated, and what uncertainties it faced during the task.
Here's the thing, though: it's not a feature that's available to ChatGPT users yet. Instead, it's a proof-of-concept safety tool designed to help researchers detect subtle failures that are otherwise hard to see.
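To picture what that second channel might look like in practice, here's a minimal, purely illustrative Python sketch. Everything in it is hypothetical: OpenAI hasn't published an API for confessions, and the class and field names below are invented placeholders, not the company's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical data structures illustrating the two-channel idea:
# the model's ordinary answer travels on one channel, while a separate
# "confession" channel carries a self-report about that same response.

@dataclass
class Confession:
    followed_instructions: bool  # did the model obey the task's rules?
    possible_shortfalls: list[str] = field(default_factory=list)  # e.g. guesses, hallucinations
    uncertainties: list[str] = field(default_factory=list)        # things the model was unsure about

@dataclass
class ModelOutput:
    answer: str             # the normal response shown to the user
    confession: Confession  # the second channel, meant for evaluators, not users

def run_task(prompt: str) -> ModelOutput:
    """Stand-in for a model call; a real system would generate both channels."""
    return ModelOutput(
        answer="Paris has been the capital of France since 987 AD.",
        confession=Confession(
            followed_instructions=True,
            possible_shortfalls=["The '987 AD' date is a guess and may be wrong."],
            uncertainties=["Unclear whether the question asked about the modern or historical capital."],
        ),
    )

if __name__ == "__main__":
    output = run_task("What is the capital of France, and since when?")
    print("Answer:", output.answer)
    print("Confession:", output.confession)
```

The point of the separation is that researchers can inspect the confession channel for admitted rule-breaking or uncertainty without it ever appearing in the answer a user sees.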
