Algorithm spots internet trolls after 5 posts, offers 80% accuracy

by Mark Tyson on 20 April 2015, 13:15

Forums are great places for internet users to discuss the topics of the day and to exchange information of all kinds. The quality of a site's forum, and of the knowledge exchanged within it, can help bring the site high esteem. However, forums can easily deteriorate from informative and pleasant places to visit into hosts of poisonous, personally insulting, conflict-ridden, completely off-topic rants, thanks to internet trolls.

Trolls can ruin an online community and even stop people sharing important and informative content: some high-profile trolling cases have concerned Twitter and YouTube comments. So what can be done, other than spending hundreds of man-hours weeding trolls out of one forum after another? Scientists at Stanford and Cornell have thought about this problem and developed an algorithm to detect and quash online trolls, reports Quartz magazine.

The scientists analysed 18 months' worth of Disqus discussion threads from top-ranking websites like CNN, Breitbart, and IGN. Overall, 40 million comments from around 1.7 million users, with 100 million up or down votes, were scrutinised. An important additional set of data gave the scientists the subgroup of these Disqus users who were banned due to trolling behaviour.

A key observation provided a litmus test for trolls: "We find that such users tend to concentrate their efforts in a small number of threads, are more likely to post irrelevantly, and are more successful at garnering responses from other users." Trolls are also tolerated less and less by the rest of the community over time (shown in Disqus down-votes), and they become even 'bigger' trolls when the rest of the community reacts to them harshly.

20 per cent false positive troll detection

The Google-supported research teams have come up with an algorithm that predicts whether a forum user will be banned for trolling, with an accuracy of 79 per cent. Only the first five posts of a user are required for this troll test. The accuracy increases slightly, to 80 per cent, if the algorithm also uses data showing whether any of the suspected troll's posts were moderated or deleted by a forum admin.

Is 80 per cent accuracy in troll detection good enough if you run a website or forum? The researchers say that the algorithm can be used to "identify antisocial users early on," which could be practical and useful for community maintainers.
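To get a feel for what 80 per cent accuracy could mean in practice, here is a rough Python sketch. It is not the researchers' method, and the numbers are illustrative assumptions rather than figures from the study: a hypothetical forum of 10,000 users, 5 per cent of whom are trolls, and a detector that is right 80 per cent of the time on trolls and non-trolls alike. The function name `flag_outcomes` and its parameters are invented for this example.

```python
def flag_outcomes(total_users, troll_share, sensitivity, specificity):
    """Count what a detector gets right and wrong on a hypothetical forum.

    sensitivity: share of real trolls the detector flags (assumed)
    specificity: share of ordinary users it correctly leaves alone (assumed)
    """
    trolls = round(total_users * troll_share)
    others = total_users - trolls

    true_pos = round(trolls * sensitivity)           # trolls correctly flagged
    false_pos = others - round(others * specificity)  # ordinary users wrongly flagged
    true_neg = others - false_pos                     # ordinary users left alone

    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "accuracy": (true_pos + true_neg) / total_users,
        # Share of flagged users who really are trolls.
        "precision": true_pos / (true_pos + false_pos),
    }


# Illustrative assumptions only: 10,000 users, 5% trolls, 80% right on both groups.
result = flag_outcomes(10_000, 0.05, 0.8, 0.8)
print(result)
```

Under these assumed numbers the detector is 80 per cent accurate overall, yet it flags 1,900 ordinary users alongside 400 real trolls, so fewer than one in five flagged accounts is actually a troll. With a different share of trolls the picture changes considerably, which is why a site would probably want a human to review any automatic flags.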

Not sure how much faith I have in their source material. Disqus requires no validation for accounts, so there are no ramifications for bad behavior. I don't see that translating over to a real set for message boards that do require valid e-mail addresses and are actually moderated.

And is an algorithm really needed? :)

The question isn't whether an algorithm is needed, it's whether *automation* (which requires an algorithm) is needed. And, of course, it's not, when the forum is small enough. But if the question “So what can be done except for using hundreds of man-hours to weed trolls out?” is valid for a forum then yes, automation could be useful. Even if it had a 20% false positive rate, it would be a boon to someone who moderates a forum but wishes they didn't have to.

That could be anything from a sole developer who runs a small but popular site to a large magazine site that essentially has a thread per article. One solution in the latter case is to close comments but that's a potential disappointment to readers with comments to offer who get there after the deadline.

A troll detector wouldn't necessarily replace the moderator but it could be a useful first line defence, like an email spam filter. Perhaps one day forums will have a familiar piece of advice for posters. “Don't see your post? Check the Trolls folder”.

Firstly, I'd be interested to hear what the Hexus lords think of this - if this was available to Hexus would they use this automated troll hammer?

Secondly, whatever happened to that (hurriedly looks in his bookmarks) plan of getting the crowd to police itself? I remember setups being proposed where you could report a troll and their account would be suspended automatically after a certain number of reports. A suspected troll would then have their posts etc inspected manually before a final ban. The upside of that scheme was that there were safeguards against malicious reporting, with “black marks” being assigned for bad reports, leading to the reporter hitting the suspend list after too many bad reports.
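A scheme like that could be sketched roughly as follows. This is only an illustration of the idea described above, not any real forum's system: the class name `ReportTracker`, its methods, and both thresholds are made up for the example.

```python
from collections import defaultdict

REPORTS_TO_SUSPEND = 3      # hypothetical: distinct reports before auto-suspension
BLACK_MARKS_TO_SUSPEND = 2  # hypothetical: bad reports before the reporter is suspended


class ReportTracker:
    def __init__(self):
        self.reporters_by_target = defaultdict(set)  # target -> users who reported them
        self.black_marks = defaultdict(int)          # reporter -> count of bad reports
        self.suspended = set()
        self.banned = set()

    def report(self, reporter, target):
        """Record a report; suspend the target once enough distinct users report them."""
        if reporter in self.suspended or reporter == target:
            return  # suspended users and self-reports are ignored
        self.reporters_by_target[target].add(reporter)
        if len(self.reporters_by_target[target]) >= REPORTS_TO_SUSPEND:
            self.suspended.add(target)

    def review(self, target, is_troll):
        """Manual moderator decision on a suspended account."""
        reporters = self.reporters_by_target.pop(target, set())
        self.suspended.discard(target)
        if is_troll:
            self.banned.add(target)
            return "banned"
        # The reports were bad: give each reporter a black mark.
        for reporter in reporters:
            self.black_marks[reporter] += 1
            if self.black_marks[reporter] >= BLACK_MARKS_TO_SUSPEND:
                self.suspended.add(reporter)
        return "cleared"


tracker = ReportTracker()
for name in ("ann", "bob", "cat"):
    tracker.report(name, "dave")
suspended_after_reports = "dave" in tracker.suspended
first_verdict = tracker.review("dave", is_troll=False)

for name in ("ann", "bob", "cat"):
    tracker.report(name, "erin")
second_verdict = tracker.review("erin", is_troll=False)
print(first_verdict, second_verdict, sorted(tracker.suspended))
```

In this run the three reporters make two bad reports each, so they end up on the suspend list themselves while the users they reported are cleared.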

Personally I'd like to see all sites insist on registration and if they use Disqus etc then those meta-sites really need to ask for repeated confirmation (monthly?).

Hexus doesn't have a troll problem, so it's unlikely this would be called for (it's more spam and advertising that they have to deal with I expect). Trolling isn't hidden, it's something that requires a community response to work, so you tend to notice - as it is there are perhaps a few trolls a year.

And that's what's clever about this - it's not detecting spam, it's specifically trolling.

This seems like those sarcastic remarks you made to your team mates on Steam are going to get you kicked off the internet for good.
