Kevin decided on a career in content moderation after his YouTube recommendations took a bewildering swerve.
In 2021, videos appeared on his feed depicting violent attacks by Boko Haram, a Nigerian militant group. They were raw and gory, and reminded him of violence he witnessed as a child in Nigeria.
One day, the videos disappeared. Who did that? he wondered. He read online that removing disturbing stuff from the Internet was a job – one he could apply for.
“I didn’t want people to go through life seeing these graphic things,” said Kevin, who asked to be identified by a pseudonym because of the confidential nature of his job. “I wanted to make the world better.”
Now, Kevin is an employee of a content moderation firm working for TikTok, where he screens videos posted by users across sub-Saharan Africa. His judgement determines whether “everything bad and everything cruel” – animal abuse, human abuse, mutilations, deaths, accidents involving children – stays up or gets removed, he said.
It’s a tough and emotional job. Recently, he’s been asked to use an artificial intelligence (AI) program to help. But after more than a year working with the system, Kevin said the AI has created new problems. The technology can’t reliably spot abuse or evil, he said. Still, Kevin believes that won’t stop his employer from using the tools to replace him.
Humans vs AI
The advent of generative AI has led to such impressive advances that many companies are betting that the technology will one day learn to discern good from bad, if they feed it enough examples of each. But according to 13 professional moderators, the AI now relied upon to stop the spread of dangerous content, like child sex abuse material or political disinformation, is replacing workers faster than it can learn the job.
The workers who remain fear that the AI-monitored Internet will be a hazardous minefield where coded hate speech, propaganda, child grooming and other forms of online harm persist and spread unchecked.
“If you go down the road of implementing AI to reduce the amount of trust and safety staff, it will reduce safety overall,” said Lloyd Richardson, director of technology at the Canadian Centre for Child Protection. “You need human beings.”
Companies including Meta Platforms Inc, ByteDance Ltd’s TikTok, Roblox Corp and X are touting the benefits of greater reliance on AI content moderation. Moderation is traumatic, stressful work that can leave lasting emotional scars on the people who do it, and it’s impossible or prohibitively expensive to hire enough humans to tackle it all.
Consider that users post more than 20 million YouTube videos every day. Automated systems have lessened the load somewhat by blocking content already known to be bad – like videos of mass shootings – from being reuploaded. But now, companies are counting on AI to learn to perceive nuances in posted content and make decisions.
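Re-upload blocking of known-bad material generally works by fingerprinting each new upload and checking it against a database of fingerprints from content already removed. The sketch below illustrates only that matching step; the hash choice and the blocklist contents are assumptions, not any platform’s actual system (real systems use perceptual hashes that survive re-encoding and cropping).

import hashlib

# Hypothetical blocklist of fingerprints for content already judged violating.
# Real platforms use perceptual hashes robust to re-encoding; this sketch
# uses an exact SHA-256 digest only to show the matching step.
KNOWN_BAD_HASHES = {
    "placeholder-digest-of-a-previously-removed-video",
}

def fingerprint(video_bytes: bytes) -> str:
    # Compute the digest used as this upload's fingerprint.
    return hashlib.sha256(video_bytes).hexdigest()

def should_block_reupload(video_bytes: bytes) -> bool:
    # Block the upload if its fingerprint matches known-bad content.
    return fingerprint(video_bytes) in KNOWN_BAD_HASHES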
Not ready for prime time
Today’s AI-powered chatbots can support humans in emotional conversations. That doesn’t mean the same type of technology is ready to protect Internet users from trauma, the moderators said. Of the 13 moderators Bloomberg spoke with, all but one said the tools they’re supplied actively make their jobs harder.
“We can’t rely on AI’s suggestions,” said Zhanerke Kadenova, 36, who works for a content moderation firm in Kazakhstan that contracts with a big tech company. “It doesn’t match most of the time – like 80%. We don’t even look at them.”
Kevin estimates that, for him, it fails up to 70% of the time. The system still frequently jumps to incorrect conclusions – like pointing out the low fuel gauge on a car dashboard in a video when it should have pointed to the 200-kilometre-per-hour reading on its speedometer. Or it might circle a young child’s face on the platform and identify them as 17.
He takes the time to correct its errors, and enters hyper-specific labels for everything he sees. Correcting the AI increases the workload, taking time that could be spent addressing more urgent problems.
“It’s scary. It’s very scary,” said Kevin, who believes his meticulous corrections are training the system that will eventually replace him, though his bosses haven’t said that explicitly.
Roblox, YouTube, TikTok, Meta, and X have all faced scrutiny over their moderation practices. In 2024, a US Senate hearing on child safety questioned the latter three companies. Today, TikTok is battling dozens of lawsuits stemming from investigations into suicide-related content, predation and more. The US Federal Trade Commission has accused Meta’s Instagram of connecting minors with groomers, and countless reports have described Facebook’s challenges moderating violent and extremist content.
After Elon Musk’s takeover of Twitter, which he renamed X, regulators and journalists have raised a litany of concerns that nonconsensual deepfake pornography and child abuse content are proliferating on the platform. Roblox, too, has been criticised for failing to protect children against predators on the platform, including in a 2024 Businessweek investigation.
Over the last few months, seven legal complaints alleging child safety harms have been filed against the company, including from Louisiana Attorney General Liz Murrill. The suits claim that predators targeted children as young as eight years old on Roblox.
“The assertion that Roblox would intentionally put our users at risk of exploitation is categorically untrue,” the company said in response to the suit.
No one-size-fits-all solution
Yet over the past year, under pressure to cut costs, these companies have all announced their intention to rely more on AI moderation, accelerating efforts that began during the pandemic, when companies decided sensitive content work wasn’t possible from home. For some of the apps, the move was also fuelled in part by conservative US lawmakers’ claims that human moderation was biased.
Under Musk, X dramatically scaled back its content moderation workforce, halving European Union-based moderators since 2023 to 1,486, according to the company’s Digital Services Act report. Meta’s content moderation contractor Telus cut more than 2,000 Barcelona-based jobs in April. Former Roblox chief financial officer Michael Guthrie told investors last year that the company had freed up cash from operations from “more of using artificial intelligence, requiring less and less on manual moderation”, while keeping headcount flat.
Roblox did not respond to a request for comment on the size of its moderation workforce and whether it’s expanded alongside the platform’s 41% user growth in the second quarter, from a year earlier. Bloomberg reported in 2024 that Roblox had about 3,000 moderators for its 80 million daily active users at the time. The company said its number of moderators is not an indicator of quality.
Recently, TikTok shared plans to cut hundreds of moderators across the United Kingdom, as well as roles in South and South-East Asia, while it invests in moderation technologies including AI. A company spokesperson said it’s “concentrating our operations in fewer locations globally to ensure that we maximise effectiveness and speed as we evolve this critical function for the company with the benefit of technological advancements”.
Last October, a Reuters report described plans to cut hundreds more moderation jobs in Malaysia, and 50 of the nearly 150 TikTok moderators in Germany are threatening to strike over an upcoming layoff. One TikTok content moderator for Dutch videos, who handles topics like conspiracy theories and election misinformation, said AI is not fit to replace him because it can’t catch context for specific regions, like different dialects of Flemish.
X did not respond to requests for comment. A TikTok representative said AI “can help support the well-being of our moderation staff and improve human moderation”, citing the ability to personalise moderators’ workloads based on their cultural knowledge.
“Our use of AI to support content moderation is still in very early stages,” the spokesperson said, adding that the company sets “firm quality benchmarks” for new moderation technologies before putting them in use. Human moderators are more focused on complex tasks that are “context-heavy but have lower prevalence”, the representative said.
Roblox in August announced an AI system designed to detect early signs of child endangerment, which helped Roblox identify 1,200 reports of potential child exploitation for the US National Center for Missing and Exploited Children. Moderating its high volume of text and audio content is “a job that humans cannot manage alone”, the company said in a July blog post, adding that it would “require hundreds of thousands of human moderators working 24/7”.
Roblox shared how it trains systems to prevent false negatives and only deploys AI when it performs “significantly higher in both precision and recall than humans at scale”.
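For readers unfamiliar with those metrics: precision measures how much of what gets flagged is actually violating, while recall measures how much of the actually violating content gets flagged. A minimal sketch with invented counts, purely to illustrate the terms:

def precision(true_positives: int, false_positives: int) -> float:
    # Of everything the system flagged, what share was truly violating?
    return true_positives / (true_positives + false_positives)

def recall(true_positives: int, false_negatives: int) -> float:
    # Of everything truly violating, what share did the system catch?
    return true_positives / (true_positives + false_negatives)

# Invented numbers: the system flags 100 items, 90 of them correctly,
# and misses 30 violating items it never flagged.
print(precision(true_positives=90, false_positives=10))  # 0.9
print(recall(true_positives=90, false_negatives=30))     # 0.75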
Under chief executive officer David Baszucki’s post on X about the blog, dozens of commenters complained about moderation on the platform, with many saying that humans are better suited to moderate the kids’ platform. Some shared screenshots of inappropriate content they’d seen, including avatars wearing thong underwear.
There’s a lot AI can and does do to augment human work, like prioritising the most dangerous content for human moderators to check. AI can also make particularly upsetting videos and images black and white, which researchers say benefits moderators’ mental health. A TikTok representative said its moderators removed 60% fewer videos violating its policy against shocking or graphic content as “our moderation technologies took on more of these potentially distressing videos”. But even companies selling AI moderation tools warn of an over-reliance on the technology.
“Some AI systems have a lot of false positives,” said Ron Kerbs, CEO of child safety software firm Kidas. AI might not be able to tell whether someone streaming Call Of Duty and saying “I’m going to kill you” is part of play, or a real-world death threat. In a January blog post, Meta said the company’s automated systems that scanned for policy violations “resulted in too many mistakes” and over-censorship, so it would focus those systems on high-severity violations like drugs and fraud.
The AI’s false positives can lead to bad user experiences for people who are doing nothing wrong online, locking them out of their accounts erroneously. That’s led to a higher workload for the humans, who have received a deluge of decision appeals.
“Some of the things that the AI would take down just didn’t make sense,” said Michael Nkoko, 29, a former moderator at a contract firm for Meta. Because he still had to adhere to his daily ticket-response quota, the appeals “created more work”. A Meta representative said mistakes make up a small portion of removed content.
Computers are very good at yes-or-no questions. They’re less good at subjectivity, cultural context and taste. Developing the right judgement is a constant process: company policies change rapidly depending on regulations and political climates, social norms evolve and new slurs are constantly invented.
Humans evolve their language more rapidly than AI can detect and understand it, said Mike Pappas, CEO of AI moderation company Modulate. “You need humans to say, Hey, ‘purple’ is now a racial slur.” Multiply that by the thousands of living languages used on these platforms.
To give an AI some approximation of human judgement, the system has to be trained on highly specific yes-or-no questions until it can more reliably achieve what a human already knows how to do by gut.
Kevin said his coworkers are often asked to review the same piece of content – a sign, he said, that the company is trying to train the AI based on their work. In one video he remembers, two people were fighting, with one holding a weapon and hitting the other.
“Did the subject raise its hand? You click on yes. Did the hand come down at a particular speed? Yes. Was there an impact? Yes. Did the other subject react to the impact? Yes. Did the subject seem like it sustained a grievous injury? Yes. Was there a splash of blood or mutilation? Yes,” Kevin explained. “It would be gullible to think you’re doing this and not teaching the AI.”
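In machine-learning terms, each of those answers becomes a label on the clip. A hedged sketch of what one such structured record might look like; the field names are invented for illustration and are not TikTok’s actual schema:

# Hypothetical record of one moderator's yes/no answers on a single clip.
# Field names are invented for illustration, not any platform's real schema.
violence_labels = {
    "subject_raised_hand": True,
    "hand_came_down_fast": True,
    "impact_occurred": True,
    "other_subject_reacted": True,
    "grievous_injury_visible": True,
    "blood_or_mutilation": True,
}

# Each labelled clip becomes one training example. Collect enough of them
# (especially the same clip reviewed by several moderators) and the answers
# become data a model can learn from, which is why repeated reviews look
# like dataset-building to the moderators doing them.
training_example = {"video_id": "example-clip", "labels": violence_labels}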
Peter, who is also Nigerian and asked to use a pseudonym, said that, after witnessing violence as a child, he sees moderation as a community service. When he sits down at his workstation, he is sometimes prompted to label every little detail in a video, like body parts or household objects. Other times, he reports slurs the AI didn’t catch. Eventually, his company’s automated system began to pick up on specific words, like a derogatory term in Hausa for marginalised groups. (Peter, like Kevin, knows a half-dozen languages.)
“I knew it wouldn’t be long before my services were no longer needed,” Peter said of his company’s AI ambitions. “If I were a business owner, I’d want to cut costs.”
Moderators worry that, no matter how well they train AI, it won’t pick up on the subtleties of human speech and behaviour. One, who asked to remain anonymous, said he often reviews videos in which people are wearing skimpy underwear. AI can detect that reliably. But it won’t make exceptions if a user is standing near a body of water, he said. In other cases, AI might detect an exposed breast, but won’t take into account a policy that allows images of breastfeeding.
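The gap those moderators describe is between detection and policy: a model can flag what it sees, but the allow-or-remove decision depends on contextual exceptions a person applies instinctively. A rough sketch of that two-step logic; every rule and field name here is invented for illustration, not any platform’s actual policy:

# Hypothetical detections a vision model might emit for one image.
detections = {"underwear_or_swimwear": True, "near_water": False, "breastfeeding": False}

def violates_policy(d: dict) -> bool:
    # Toy policy check: the detection alone is not enough, context decides.
    # These rules are invented for illustration only.
    if d.get("breastfeeding"):
        return False  # policy allows breastfeeding images
    if d.get("underwear_or_swimwear") and d.get("near_water"):
        return False  # swimwear at a beach or pool is allowed
    return bool(d.get("underwear_or_swimwear"))  # same attire elsewhere gets flagged

print(violates_policy(detections))  # True: swimwear detected with no water context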
Bikinis are the least of these moderators’ worries. Many said their primary concern is children. Predators meticulously study platforms’ automation tactics so they can circumvent them – and they learn faster than AI.
On dark web forums, child predators trade tricks on what words get caught by platforms’ automation, or how far they can go speaking with a child before getting banned. And while AI can reliably identify children in images or videos, it has difficulty picking up on text conversations that a human moderator would classify as grooming, the moderators said.
Predators who attempt to move children to a platform with looser rules use innocuous phrases like “Let’s party” or “Meet me on the ghost app”. If a platform picks up on that, predators will put Xs between the letters, said one moderator.
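That trick works because a naive keyword filter matches exact strings. One common countermeasure, sketched below with an invented blocklist entry built from the phrase above, is to normalise the text by stripping inserted separator characters before matching; real systems are far more careful, since aggressive stripping also mangles legitimate words.

import re

# Invented, single-entry blocklist for illustration; real lists are larger
# and paired with behavioural signals rather than exact phrases alone.
FLAGGED_PHRASES = {"meet me on the ghost app"}

def normalise(text: str) -> str:
    # Drop x's, dots, dashes and underscores inserted to dodge exact matching,
    # then collapse whitespace and lowercase. A toy rule: it would also mangle
    # ordinary words containing "x", which production systems must avoid.
    stripped = re.sub(r"[xX*._-]+", "", text)
    return re.sub(r"\s+", " ", stripped).strip().lower()

def is_flagged(message: str) -> bool:
    return normalise(message) in FLAGGED_PHRASES

print(is_flagged("mxexext mxe oxn txhxe gxhxoxsxt axpxp"))  # True once the x's are stripped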
Experts said the safest route for big tech companies is to maintain their human workforces while developing and supplementing them with AI tools. The number of potentially harmful posts to look at is “only increasing – especially with generative AI and platforms getting flooded with that content”, said Jonathan Freger, senior vice president at moderation firm WebPurify.
Savannah Badalich, Discord’s head of product policy, said in an interview that the company has no plans to cut costs associated with moderation ahead of its initial public offering. While Discord uses machine learning and large language models to support human reviewers, she said, “It’s really important for us to have humans in the loop, especially for severe enforcement decisions. Our use of AI is not replacing any of our employees. It’s meant to support and accelerate their work.”
Outsourcing company Teleperformance SE employs thousands of contract moderators who scan content at companies like TikTok.
A representative said that “despite major advances in automation, human moderators are essential for ensuring safety, accuracy, and empathy in both social media and gaming environments”. Moderation is more than saying yes or no to an image; it’s “interpreting behaviour, understanding context, and making judgement calls that AI still struggles with”, the spokesperson said.
Kevin said this is his last year moderating for TikTok.
“I don’t intend to stay longer than four years,” he said. “We are depressed. Real depression.”
In his free time, he watches old comedies, plays guitar and prays. “As long as we are training the AI on what we are doing, it is actually getting better,” he said.
But he said it won’t be enough. “There will be a lot of leakages of very violative content,” Kevin said. – Bloomberg
