Many experts say design decisions intended to make chatting more pleasant result in chatbots being overly sycophantic.
In his research, social psychologist Luke Nicholls tested five AI models with simulated conversations developed by psychologists, and found Grok was the most likely to lead users into delusional thinking.
Grok was more unrestrained than the other models and often elaborated on the delusions without trying to protect the user.
"Grok is more prone to jumping into role play," says Nicholls, who worked on that research. "It will do it with zero context. It can say terrifying things in the first message."
In the tests, the latest version of ChatGPT, model 5.2, and Claude were more likely to steer the user away from delusional thinking.
The problem, IMHO, is how fast this tech was brought to market, the lack of regulations and mechanisms to impose firm legal and financial consequences on the makers of the AIs, and the cavalier attitudes of some of the companies, especially MUSK'S. It's fairly clear the technology to prevent (or strongly limit) these products from having these kinds of conversations exists but is not being deployed in some of them. If it were simply intrinsic ("AI chatbots just always do this"), then there wouldn't be outliers like Claude, and OpenAI couldn't have improved ChatGPT in a newer version.
There just aren't consequences forcing the makers to deploy the safeguards. Another regulation needs to be that they MUST avoid sycophancy and MUST give a very visible "confidence" rating with any answer they provide. If Grok were also saying it had "8% confidence" (or some similarly low percentage) when telling a user that someone is coming to get them, that would be a hugely important signal. Instead, these systems aren't programmed to convey that vital statistic, because they're not *legally required* to. They're programmed to 'always give an answer' (as mentioned in the article). THAT needs to be outlawed as a mechanism.
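To make the proposal concrete, here is a minimal sketch of what a "visible confidence plus refusal" rule could look like. This is purely hypothetical: no real chatbot API is referenced, the `present_answer` function, the 30% threshold, and the example confidence values are all invented for illustration, and real systems would need a far more principled uncertainty estimate.

```python
# Hypothetical sketch of the safeguard proposed above: every answer carries
# a visible confidence rating, and the system refuses rather than "always
# giving an answer" when confidence falls below a regulator-set threshold.

REFUSAL = "I'm not confident enough to answer that."
THRESHOLD = 0.30  # assumed cutoff for illustration only


def present_answer(answer: str, confidence: float) -> str:
    """Attach a visible confidence rating, or refuse when it is too low."""
    if confidence < THRESHOLD:
        return f"{REFUSAL} (confidence: {confidence:.0%})"
    return f"{answer} (confidence: {confidence:.0%})"


# The "8% confidence" case from the comment is refused, not asserted:
print(present_answer("Someone is coming to get you.", 0.08))
# A well-supported answer passes through with its rating visible:
print(present_answer("Water boils at 100 °C at sea level.", 0.97))
```

The design point is not the exact numbers but the interface contract: low-confidence output is either refused or loudly labeled, instead of being delivered in the same authoritative tone as everything else.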
Unleashing these products on an unsuspecting public without thorough study (ESPECIALLY of the possible impact on vulnerable people, like people with schizophrenia or bipolar disorder) and without regulations with TEETH (including regulations against sweeping up copyrighted works for training) should NEVER have been allowed to happen. Instead we've all been made part of a giant social experiment, and all users are functioning as "free beta testers" for these companies, in some cases with disastrous outcomes.