It’s Sunday afternoon, and I’m staring at a picture of a tanned woman in a swimsuit that doesn’t quite cover her nipples. This is an unexpected twist in my quest to track down and understand the daily challenges faced by the people who moderate spam on Twitter. But I’m not focusing on the picture. Instead my eye is drawn to the comment below. Here, another user has written, “Follow for follow.”
The comment has been sent to me by John, not his real name, who works on Twitter’s secretive “spam project.” According to him, if I were part of that team, my job would be to look at comments like this one and decide whether this account genuinely wants the woman in the swimsuit to follow them back, or whether they are trying to manipulate Twitter’s system and should be labeled as spam.
In the legal battle between Elon Musk and Twitter, the term “spam bots” has become key. Yet how Twitter finds spam and how it differentiates bots from humans remain vague. Musk has claimed that even Twitter CEO Parag Agrawal couldn’t explain the criteria used to define a bot. And the billionaire himself resorted to using a controversial free tool to detect and estimate the number of fake accounts on the platform. But legal documents allude to a mysterious group of moderators who are making calls on what is and isn’t spam on Twitter on a daily basis. To better understand the platform’s spam problem, I’ve made it my mission to track that team down.
If bots are apparently easy to spot on Twitter, the people the social network employs to shut them down are not. In all, finding Twitter’s secretive bot squad takes me two months and dozens of interviews with people living in six countries across three different continents. Along the way, I read hundreds of pages of legal documents, run into a series of dead ends, and get stonewalled by more people than I care to count.
I start my search for Twitter’s bot team in France. I know that four NGOs there have recently taken Twitter to court to try to force the company to reveal how it polices hate speech, its budget for moderation, and the number of moderators in the French team. In January, a court agreed with the NGOs and ordered Twitter to hand over the documents.
In August, I get in touch, optimistically thinking Twitter must have turned over the documents by now, cutting short my search. But when I reach the NGOs involved in the case, they claim the platform has ignored the court. “To this day, we have seen nothing from Twitter,” says David Malazoué, president of SOS Homophobia, one of the four groups involved in the trial. There is a feeling in France that Twitter has not been dedicating enough resources to effectively moderate hate speech posted by both human and spam accounts—but the NGOs can’t prove it. Most of Twitter’s European operations are run out of the Irish capital Dublin, which is beyond France’s reach, Malazoué explains. “So we’re kind of stuck.” Twitter declines to comment on the case.
I turn my attention to Dublin, where I find Fabian (not his real name), a former moderator who, until earlier this year, worked for Twitter via the outsourcing company CPL. Unlike another former Twitter moderator I’ve spoken to, he remembers seeing accounts the company had internally labeled as “suspicious.” He doesn’t know who, or what, labeled those accounts. “I guess the system used to do it,” he tells me. “Or maybe there is a team dedicated to it.” Talking to him, I’m sure that someone, somewhere, is moderating Twitter bots and that this effort is new, starting only in the past few years.
Another clue arrives in late August, while I’m on summer vacation. Peiter Zatko (or Mudge), Twitter’s former head of security, has decided to turn whistleblower and submit a report he commissioned from an independent company to the US Congress. The document names the internal Twitter teams in charge of spam and other attempts to “manipulate” the platform at the time: one is called Site Integrity, the other Health and Twitter Services (two teams that Twitter has since merged). Another line in the report jumps out. It reads: “Content moderation is outsourced to vendors, most of whom are located in Manila.”
Twitter has long used outsourcing firms to hire people in the Philippines to remove violence and sexual abuse material from the site. But could they be moderating spam too? In the industry, there are the big, recognizable names such as Accenture and Cognizant. But there are the lesser-known companies too, such as Texas-based TaskUs. Eventually I come across a company I haven’t heard of: a New Jersey-based business called Innodata. And for the first time, I start hearing the job description “spam moderator.”
I speak to one Innodata employee who confirms the company is moderating spam for Twitter, although he has been working on another team. Another says he has been involved in “categorizing” fake accounts, some of them masquerading as famous sports teams. Both ask that their names and locations not be published for fear of losing their jobs. According to a recent job posting, Innodata has around 4,000 employees in Canada, Germany, India, Israel, the Philippines, Sri Lanka, the United States and the United Kingdom.
By searching specifically for moderators at Innodata, I finally find John, the employee who shares the picture of the woman in the swimsuit. He explains there are 33 full-time staff moderating spam for Twitter and more than 50 freelancers. He believes Innodata didn’t start moderating spam until March 2021.
Every day, John says he looks at up to 600 Twitter posts and accounts in a third-party app called Appen, before flagging them either as “spam” or “safe.” (Appen is an Australian company that uses a global workforce to train artificial intelligence used by major technology firms.) The majority of John’s team are based in either India or the Philippines, he says. He believes the tweets he’s sent are selected by artificial intelligence trained to look for Twitter spam before they are sent on to a team of human moderators.
For each tweet he is sent, John is asked two questions by Appen: “Would you consider the above tweet to be content spam?” and “Would you consider the user account to be violating content spam policy?” He marks a post as spam if it falls into one of nine categories: Is it advertising counterfeit products, unauthorized pharmaceuticals, or trying to buy or sell user profiles for services such as Netflix? Is it trying to phish or scam others, sharing suspicious links or making unrelated replies to a conversation thread?
Tweets are also marked as spam if they are considered to be mention spam (where a tweet tags multiple people), hashtag spam (where a tweet features hashtags that don’t relate to its content) or follow spam (where accounts promise “follow for follow”).
An account is marked as spam if it is posting tweets that fall into any of those nine categories, but also if it posts an “excessive” number of tweets in a short period, retweets posts in multiple languages, or tweets or replies to others with duplicate content.
If a tweet doesn’t load properly, is in a foreign language that he is not able to translate, or is explicit but doesn’t count as content spam, an “exclusion” applies and John does not mark the post as spam.
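For readers who think in code, the rubric John describes can be sketched roughly as below. This is an illustration only, not Twitter's, Innodata's, or Appen's actual tooling: every name in it (`SpamCategory`, `TweetReview`, `AccountReview`, `label_tweet`, `label_account`) is hypothetical, and treating an "exclusion" as a "safe" label is an assumption, since John says only that he does not mark such posts as spam.

```python
from dataclasses import dataclass, field
from enum import Enum


class SpamCategory(Enum):
    """The nine content-spam categories John lists (names are hypothetical)."""
    COUNTERFEIT_PRODUCTS = "counterfeit products"
    UNAUTHORIZED_PHARMACEUTICALS = "unauthorized pharmaceuticals"
    PROFILE_TRADING = "buying or selling user profiles"
    PHISHING_OR_SCAM = "phishing or scam"
    SUSPICIOUS_LINK = "suspicious link"
    UNRELATED_REPLY = "unrelated reply"
    MENTION_SPAM = "mention spam"
    HASHTAG_SPAM = "hashtag spam"
    FOLLOW_SPAM = "follow spam"


@dataclass
class TweetReview:
    """What a human reviewer has noted about a single tweet."""
    categories: set = field(default_factory=set)
    failed_to_load: bool = False
    untranslatable: bool = False


@dataclass
class AccountReview:
    """What a human reviewer has noted about an account and its tweets."""
    tweets: list = field(default_factory=list)
    excessive_posting: bool = False
    multilingual_retweets: bool = False
    duplicate_content: bool = False


def label_tweet(review: TweetReview) -> str:
    # Exclusions: the tweet did not load or could not be translated.
    # Assumption: an excluded tweet is labeled "safe".
    if review.failed_to_load or review.untranslatable:
        return "safe"
    # Explicit content alone is not one of the nine categories, so an
    # explicit tweet with no category falls through to "safe" here.
    return "spam" if review.categories else "safe"


def label_account(account: AccountReview) -> str:
    # Account-level signals John mentions beyond the nine categories.
    behavioral = (
        account.excessive_posting
        or account.multilingual_retweets
        or account.duplicate_content
    )
    posts_spam = any(label_tweet(t) == "spam" for t in account.tweets)
    return "spam" if behavioral or posts_spam else "safe"
```

Under these assumptions, a reply reading "Follow for follow" would be reviewed as `TweetReview(categories={SpamCategory.FOLLOW_SPAM})` and labeled spam. The judgment about which category applies, if any, still rests with the human moderator; the sketch only records how those judgments combine.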
The work is hard, John says. Looking at pornographic and violent content for hours each day takes its toll on his team’s mental health. Innodata does not provide mental health or counseling support to its employees, he claims. The company does not respond to WIRED’s request for comment.
Despite working long shifts on the frontline of Twitter’s fight against spam, John does not believe the social network is doing enough to tackle its bot problem. He often checks back to see whether Twitter has taken down the accounts he marks as spam only to find that many are still operating. “Twitter suspends only a handful of spam accounts from the platform,” he says. One example he shares of a post he believes to be spam that was not removed is the account that posted “follow for follow” under the picture of the woman in the swimsuit. Twitter has previously said it suspends over half a million spam accounts every day.
Twitter does not dispute its connection to Innodata moderators, but the company implies it is part of a wider effort to crack down on spam and bots. “Our work to combat spam and other types of platform manipulation on Twitter is multifaceted and exists on a continuum,” says Rebecca Hahn, Twitter’s new vice president of global communications, adding that Twitter uses “a number of public and internal signals, systems, and reviews to address spam on the platform.”
After speaking with all these people, it’s still unclear whether I have found the Twitter bot team or one of many, and whether these people are actually moderating content or training an AI to do it on Twitter’s behalf. Before finding the bot team, I ask Twitter what the team that moderates bots is called and where its members are based. The company declines to answer. I also ask for an interview with Ella Irwin, who leads the new team responsible for spam in San Francisco, and the company declines.
But John helps me uncover how Twitter defines a spam bot, revealing how Twitter’s approach to moderating spam is similar to how it treats other types of content. Earlier in my search, content moderators in Dublin tell me Twitter is cautious when it comes to takedowns—preferring to leave content up if it can. One former Twitter moderator, who used to work on hate speech, recalls how “terrible things” were allowed to stay on the platform as long as they didn’t explicitly target a particular group of people. After all, this is the social media company that once called itself “the free speech wing of the free speech party.” John’s realization that not all the content he identifies as spam is taken down suggests similar caution is applied to the way the company moderates spam today. Although some might find the free speech argument for spam bots less convincing.
On October 17, Musk and Twitter will take the spam bot debate to court. In the meantime, the account that replied to the picture of the woman in the swimsuit continues retweeting posts in Czech, English, Spanish, and Dutch while commenting with the same three words over and over: “Follow for follow.”