I’m often impressed by the findings of scientific studies. But other times it’s the methods that wow me. That’s the case with recent research out of Stanford, Carnegie Mellon, and the University of Oxford that asked: just how big a suck-up is ChatGPT, or any other popular AI model?
Complaints that chat-based AI tools built on large language models are all too willing to validate your opinions and cheer on your every half-baked idea circulate regularly online. OpenAI even apologised when one of its models clearly went too far in this direction. But how can you objectively measure AI sycophancy?
For this new study, which has not yet been peer reviewed, researchers came up with a clever idea. They raided the popular “Am I the [expletive]?” subreddit for around 4,000 stories of ethically dubious behavior. Then they compared how humans responded to these scenarios with responses from popular AI models from the likes of OpenAI, Anthropic, and Google.
If you’re unfamiliar with the AITA format, the stories are a diverse slice of human conflict and misunderstanding. They range from roommates arguing over food to campers wondering if it’s OK to leave trash behind to neighbors feuding over pet poop. (Scanning a few examples on the subreddit is highly entertaining.)
Here’s the researchers’ bottom-line finding after analysing the posts, according to the MIT Technology Review: “Overall, all eight models were found to be far more sycophantic than humans, offering emotional validation in 76 percent of cases (versus 22 percent for humans) and accepting the way a user had framed the query in 90 percent of responses (versus 60 percent among humans).”
In cases where humans on Reddit said a poster had behaved badly, the AI disagreed 42 percent of the time. Explicitly instructing the AI models to provide direct advice, even if it’s critical, made them only 3 percentage points more likely to deliver a negative assessment.
Psychologists are worried about sycophantic AI
These results should come as a warning to anyone turning to ChatGPT to referee their fights or offer life advice (and there are many people doing this, according to recent research). It should also be of interest to those working to build the next generation of AI. Sycophancy might keep people coming back to these tools, but psychologists warn it also creates real dangers.
There is the extreme case of “AI psychosis,” where already vulnerable people have their delusional thinking confirmed by AI, widening their break with reality. There is still little research on this phenomenon, but some reports suggest the problem is on the rise.
Even if your feet are firmly planted on the ground, having an AI constantly suck up to you is likely to be harmful. If you are never challenged or made uncomfortable, you will never grow.
“A relationship of any kind without friction or resistance is one-sided and ultimately unsatisfying, a narcissistic loop. We need to be challenged, tested, and left without clear answers or coherence. This allows for creativity and the seeds of individual thought,” writes psychotherapist Nicholas Balaisis on Psychology Today.
Social psychologist Alexander Danvers is concerned that AI’s tendency toward flattery could also drive further political polarisation. Interacting with neighbors who hold a different viewpoint might pop your information bubble, but ChatGPT almost never will. It will just reinforce your worldview, no matter how flawed or incomplete.
Beware “the yes-man in your pocket”
Finally, and most immediately relevant for entrepreneurs and other leaders, is Danvers’s warning that having a “yes-man in your pocket” may cause us to get worse information and therefore make worse decisions.
“The problem with yes-men, as leaders often find, is that they prioritise friendliness and good feelings over truth. Spend enough time interacting with them, and you stop being able to make good decisions. Flaws in your thinking aren’t addressed. Important counterarguments are ignored. As chatbot use increases, we may be heading for a collapse of humility – and of common sense,” he cautions.
That’s not a fate any of us want, but it’s a concern we all now face. As the AITA research illustrates, AI will flatter and defend you – but it’s highly unlikely to inform you that you are, indeed, the [expletive].
Although sometimes that’s just what we need to hear. – Inc./Tribune News Service
