Multi-Class Detection of Abusive Language Using Automated Machine Learning

Jorgensen Mackenzie, Choi Minho, Niemann Marco, Brunk Jens, Becker Jörg

Research article in digital collection (conference) | Peer reviewed

Abstract

Detecting abusive language online is a daunting task for moderators. We propose Automated Machine Learning (Auto-ML) to semi-automate abusive language detection and to assist moderators. In this paper, we show that multi-class classification powered by Auto-ML detects abusive language in English and German as well as, or better than, state-of-the-art machine learning models. We also highlight how we combated the imbalanced data problem in our datasets through feature selection and undersampling methods. We propose Auto-ML as a promising approach to the field of abusive language detection, especially for small companies that may have little machine learning knowledge and limited computing resources.

Details about the publication

Name of the repository: library.gito.de
Status: Published
Release year: 2020
Language in which the publication is written: English
Conference: 15. Internationale Tagung Wirtschaftsinformatik (WI 2020), Potsdam, Germany
DOI: 10.30844/wi_2020_r7-jorgensen
Link to the full text: https://library.gito.de/open-access-pdf/R7_Jorgensen-Multi-Class_Detection_of_Abusive_Language_Using_Automated_Machine_Learning-248_c.pdf
Keywords: Abusive Language Detection, Automated-Machine Learning, Multi-Class Classification

Authors from the University of Münster

Becker, Jörg
Chair of Information Systems and Information Management (IS)
Brunk, Jens
Chair of Information Systems and Information Management (IS)
Niemann, Marco
Chair of Information Systems and Information Management (IS)