July 19th, 2018


I can be baited with a tweet, and that's OK

By Leonid Bershidsky, Bloomberg View

Published August 1, 2016

"A man you can bait with a tweet is not a man we can trust with nuclear weapons," Hillary Clinton declared in her speech to the Democratic National Convention on Thursday. The jab played well with the anti-Trump audience, but in truth, anyone who runs a Twitter account knows they, too, can be baited with a tweet. "Oh hell, that rules me right out, too," the venture capitalist Marc Andreessen, an early investor in most of today's tech miracles, tweeted in response to Clinton's soundbite.

On the internet -- in the comment sections of websites, on social networks or via e-mail -- anyone who voices opinions, even uncontroversial ones, can be subjected to abuse. I'm a columnist, so I know. I have been attacked for being Jewish, for advocating Islamization, for being pro-Putin and pro-Poland (by Ukrainians), pro-Ukrainian (by Russians and Poles), liberal, Communist, neocon, and many other things. I've received vicious hate mail from the fans of Apple, Elon Musk and Donald Trump. People have posted suggestions that I kill myself. And I don't have it particularly bad: The Finnish journalist Jessikka Aro has been hounded by Russian trolls so relentlessly that she was driven to write a book about them; Julia Ioffe was subjected to an anti-Semitic abuse campaign after her interview with Melania Trump.

If you're a living, warm-blooded person, it's impossible to take this kind of thing in stride. Sometimes the finger reaches for that nuclear button (but ends up merely blocking particularly offensive trolls). Clinton, whose communications are painstakingly curated, probably has never had that temptation.

There is a potential cure, though I'm afraid it will be worse than the disease. A team of Yahoo Labs researchers has published a paper describing a machine learning-based system that could block online abuse. The system interprets the meanings of words as vectors, and even if particular words aren't abusive, the vector representation of a string of them might trigger alarms.

The Yahoo engineers trained their software on a set of user comments on Yahoo News and Yahoo Finance and compared its performance with that of trained raters Yahoo employs to police its comment sections. The software proved roughly 90 percent accurate.

The social networks and big news sites would love to have a system like that. They'd be able to cut costs, employing far fewer human moderators and "user operators," as they are called at Facebook. They would have to deal with far fewer requests from authorities to remove hate speech, variously defined by different countries' laws. At the same time, it would be possible to remove even unreported comments that could potentially be seen as abusive. The owners and managers of tech companies are likely to rush to put artificial intelligence-powered abuse detection systems into effect, despite the caveats about the software in academic papers.

The big caveat in the case of abuse detection, of course, is that artificial intelligence still does a very poor job processing natural language. In May, Google opened the source code for what it claims is the world's most accurate natural language parser -- a program that can understand sentences the way humans do. The English-language part of it, called Parsey McParseface, achieves 90 percent accuracy on certain tests, Google claimed. Yet anyone who uses voice assistants on mobile phones or translation engines -- based on the same technology as Parsey McParseface -- knows that the percentages can be misleading and that the 10 percent that is wrong could be critical.

The Yahoo model flagged these comments as abusive: "Gays in Indiana are pooping their pants over this law" and "Please stop the black on white crimes." The researchers then labeled these as "false positives," but admitted the cases were dubious. The model also couldn't figure out what to do about a comment such as "Soak their clothes in gasoline and set them on fire": It would take some new capabilities to parse a whole string of comments to get the context.

As much as I hate to see comments that are obviously homophobic or nationalist, they are part of a writer's feedback -- reminders that it takes all kinds to make a world, and that the way a story is written is not necessarily the way it's read. I suspect, however, that when automatic abuse detection systems are in place, the management of sites and social networks won't worry too much about false positives. They will err on the safe side because that's the nature of business.

Even Yahoo's trained raters reached an agreement rate of only 0.6 when asked to categorize in what way a comment was abusive -- was it profane, hateful, derogatory? Ordinary people, hired on Amazon's Mechanical Turk platform, showed an agreement rate of 0.4, meaning that they couldn't agree in the majority of cases. People have different "pain thresholds" when it comes to abuse, and the same person can be variably sensitive to different kinds of abuse. Training artificial intelligence on a big dataset of human judgments will teach it to make decisions resembling those that would be made by the majority of the humans in that set -- but it still would be highly imperfect. In fact, it could probably be deliberately corrupted, like Microsoft's cheerful Twitter chatbot that started out exuding love and peace but spewed racial hate after less than a day of interaction with malicious users.

Tech evangelists usually answer such concerns by saying that human judgment is no better, and often is worse. It's hard to argue with that. There is, however, a natural constraint when humans are involved: It's impossible or economically inefficient to hire enough of them to weed out all the possible abuse or to ban every troll, as Twitter recently banned Breitbart tech editor Milo Yiannopoulos for inciting a racist abuse campaign against an actress.

I don't want to be protected from abuse by a machine. I may not be trustworthy with a nuclear button in the presence of a troll, but I don't want Facebook or Twitter to nuke him -- and lots of relatively harmless commentators whose views I need for my work and, more generally, to keep an open mind, even if I don't agree with them. I'd rather wait for the singularity to be achieved than have social network managers mess with highly imperfect technology that can quietly redefine the freedom of speech, creating "safe zones" where politically incorrect, vigorous debate now reigns.


Leonid Bershidsky, a Bloomberg View contributor, is a Berlin-based writer.