A study of how three popular artificial intelligence chatbots respond to queries about suicide found that they generally avoid answering questions that pose the highest risk to the user, such as requests for specific how-to guidance. But they are inconsistent in their replies to less extreme prompts that could still harm people.
The study in the medical journal Psychiatric Services, published Tuesday by the American Psychiatric Association, found a need for “further refinement” in OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude.
The research — conducted by the RAND Corporation and funded by the National Institute of Mental Health — raises concerns about how a growing number of people, including children, rely on AI chatbots for mental health support, and seeks to set benchmarks for how companies answer these questions.