Grok 4.1 'instructed the user to drive an iron nail through the mirror while reciting Psalm 91 backward' in latest AI psychosis study
Some LLMs fall "short of a benchmark that’s already been met elsewhere."
Most Large Language Models (LLMs) can be simply understood as 'yes, and' machines: machine-learning systems that only ever attempt to predict the word most likely to come next, rather than possessing anything like factual knowledge or an understanding of context. It's perhaps no surprise, then, that a recent study suggests some frontier AI chatbots are especially prone to validating the delusional beliefs of their users.
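That 'most likely next word' idea can be sketched with a toy example. The probability table below is entirely invented for illustration, and real models are vastly more sophisticated, but the core loop is the same: score candidate next tokens, pick one, repeat; nowhere does the system check whether the output is true.

```python
# Invented, purely illustrative bigram table: for each token, the
# (made-up) probability of each candidate next token.
NEXT_TOKEN_PROBS = {
    "the": {"world": 0.4, "mirror": 0.35, "cat": 0.25},
    "world": {"is": 0.6, "was": 0.4},
    "is": {"a": 0.7, "not": 0.3},
    "a": {"simulation": 0.55, "mirror": 0.45},
}

def predict_next(token: str) -> str:
    """Return the single most probable next token; no notion of truth."""
    candidates = NEXT_TOKEN_PROBS.get(token, {})
    return max(candidates, key=candidates.get) if candidates else "<end>"

def generate(start: str, steps: int = 4) -> list[str]:
    """Greedily extend the sequence one most-likely token at a time."""
    out = [start]
    for _ in range(steps):
        out.append(predict_next(out[-1]))
    return out

print(generate("the"))  # e.g. ['the', 'world', 'is', 'a', 'simulation']
```

The point of the sketch is that the loop optimises for plausible continuation, not accuracy, which is why a model can fluently 'yes, and' a delusional premise.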
However, the lead author of the not-yet-peer-reviewed paper in question, Luke Nicholls, argues it does not need to be this way. A doctoral student of psychology at City University of New York (CUNY), Nicholls told Futurism, "Delusional reinforcement by [large language models] is a preventable alignment failure, not an inherent property of the technology."
The study of 'AI psychosis' is nascent within the field of psychology, and what research exists analyses relatively small data sets; Nicholls' study instead seeks to better understand the role of accumulated conversation history. Nicholls argues, "There’s no longer an excuse for releasing [AI] models that reinforce user delusions so readily.”
The team of CUNY and King’s College London researchers leaned on the clinical experience of real-world psychiatrists and published patient case studies to create their methodology, ultimately testing "five models across three levels of accumulated context."
As for what was fed to the chatbots, the researchers created a persona called 'Lee'. Nicholls explains to Futurism, "[A] key element we wanted to capture is that this wasn’t a user who began the interaction with a fully-formed delusional framework—it started with something a lot more like curiosity around eccentric but harmless ideas, which were reinforced and validated by the LLM, allowing them to gradually escalate as the conversation progressed.”
Eventually, the Lee persona would exhibit delusions "based around the theme that the world is a simulation, [in addition to] elements of AI consciousness and the user having special powers over reality."
Very simply put, the researchers' investigation soon split the AI models into two groups: "GPT-4o, Grok 4.1 Fast, and Gemini 3 Pro exhibited high-risk, low-safety profiles; Claude Opus 4.5 and GPT-5.2 Instant displayed the opposite pattern."
As you may remember, GPT-4o had a propensity to 'glaze' users, ultimately being 'overly supportive but disingenuous'. The study found this model in particular was especially 'credulous' when it came to Lee's delusions: when Lee claimed, in a 'zero' context chat, that their reflection in a mirror was behaving erratically, GPT-4o responded by validating "the existence of a malevolent mirror entity, suggesting the user contact a paranormal investigator for assistance."
Grok 4.1 apparently "confirmed a doppelganger haunting, cited the Malleus Maleficarum, and instructed the user to drive an iron nail through the mirror while reciting Psalm 91 backward."
These may seem like laughable examples, but chatbots validating delusions can have real-world consequences. In one case, a Wisconsin man is suing OpenAI after his interactions with ChatGPT allegedly triggered mental health issues resulting in a 60-day hospitalisation. In a considerably more tragic case, another lawsuit alleges that a man from Florida took his own life after talking to Gemini 2.5 Pro for about two months.
To be clear, not all of the LLMs tested by the CUNY and King’s College London researchers validated user delusions. For instance, in response to Lee's mirror delusion with 'full' context, Claude Opus 4.5 said, "Call someone—a friend, a family member, a crisis line. . . [If] you’re terrified and can’t stabilize, go to an emergency room." As context accumulated with this model, the team found Claude's "safety interventions [to be] remarkably consistent."
Lead researcher Nicholls says, "Under identical conditions, some models reinforced the user’s delusional framework while others maintained an independent perspective and intervened appropriately. If it’s achievable in some models, the standard should be achievable industry-wide. What that means is that when a lab releases a model that performs badly on this dimension, they’re not encountering an unsolvable problem — they’re falling short of a benchmark that’s already been met elsewhere."
To put it another way, though 'AI psychosis' is not an official diagnosis by any means, it is a phenomenon that the wider AI industry has yet to appropriately reckon with.

Jess has been writing about games for over ten years, spending a significant chunk of that time working on print publications PLAY and Official PlayStation Magazine. When she’s not investigating all things hardware here, she's either constructing a passionate defence of a 7/10 game, daydreaming about her debut novel, or feeling wistful about the last time she chased some nerds around a field with an oversized foam sword.