One of the stranger and potentially more troubling aspects of AI models is their tendency to "hallucinate": they can behave erratically, become confused, or lose all confidence in their answers. In some cases, they can even adopt very specific personalities or become convinced of a bizarre narrative.
For a long time, this has been something of a mystery. There have been theories about what causes it, but Anthropic, the makers of Claude, have now published research that could explain the strange phenomenon.
In a recent blog post, the Anthropic team outlines what it calls ‘persona vectors’: directions inside a model’s neural network that correspond to character traits such as sycophancy or malice. The work tackles the character traits of AI models, an area Anthropic believes is poorly understood.
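Conceptually, a persona vector can be sketched as the difference between a model’s average internal activations when it is exhibiting a trait and when it is not. The snippet below is a minimal illustration of that idea rather than Anthropic’s actual method: the gpt2 stand-in model, the layer choice, and the toy prompts are all assumptions made for the sake of the example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch only: contrast hidden activations on prompts that
# elicit a trait with activations on prompts that suppress it, and treat
# the difference of the mean activations as the trait's direction.

MODEL = "gpt2"  # small stand-in model; not what Anthropic used
LAYER = 6       # arbitrary middle layer, chosen here for illustration

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def mean_activation(prompts, layer):
    """Average the hidden state at `layer` over the last token of each prompt."""
    acts = []
    with torch.no_grad():
        for p in prompts:
            ids = tok(p, return_tensors="pt")
            out = model(**ids)
            acts.append(out.hidden_states[layer][0, -1])
    return torch.stack(acts).mean(dim=0)

# Toy prompt pair; a real pipeline would use many generated responses.
trait_prompts = ["You are a flattering assistant. User: How was my essay?"]
baseline_prompts = ["You are a blunt, honest assistant. User: How was my essay?"]

persona_vector = mean_activation(trait_prompts, LAYER) - mean_activation(baseline_prompts, LAYER)
print(persona_vector.shape)  # a single direction in the model's activation space
```

Once extracted, a direction like this could in principle be monitored during generation or nudged up and down to strengthen or suppress the associated trait, which is the kind of control the research is after.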
“To gain more precise control over how our models behave, we need to understand what’s going on inside them - at the level of their underlying neural networks.”