Detecting hate speech on Twitter using a convolution-GRU based deep neural network

Zhang, Z ORCID: https://orcid.org/0000-0002-8587-8618, Robinson, D ORCID: https://orcid.org/0000-0003-2760-7163 and Tepper, J ORCID: https://orcid.org/0000-0001-7339-0132, 2018. Detecting hate speech on Twitter using a convolution-GRU based deep neural network. In: Gangemi, A, Navigli, R, Vidal, M-E, Hitzler, P, Troncy, R, Hollink, L, Tordai, A and Alam, M, eds., The Semantic Web: Proceedings of the 15th European Semantic Web Conference (ESWC 2018), Heraklion, Crete, Greece, 3-7 June 2018. Lecture notes in computer science, 10843. Cham, Switzerland: Springer, pp. 745-760. ISBN 9783319934167

Text: 11440_Tepper.pdf - Pre-print (628kB)

Official URL: http://doi.org/10.1007/978-3-319-93417-4_48

Abstract

In recent years, the increasing propagation of hate speech on social media and the urgent need for effective counter-measures have drawn significant investment from governments and companies, as well as attention from empirical research. Despite a large number of emerging scientific studies addressing the problem, existing methods are limited in several ways, such as a lack of comparative evaluations, which makes it difficult to assess the contribution of individual works. This paper introduces a new method based on a deep neural network combining convolutional and long short-term memory networks, and conducts an extensive evaluation of the method against several baselines and the state of the art on the largest collection of publicly available datasets to date. We show that our proposed method outperforms the state of the art on 6 out of 7 datasets by between 0.2 and 13.8 points in F1. We also carry out further analysis using automatic feature selection to understand the impact of the conventional manual feature engineering process that distinguishes most methods in this field. Our findings challenge the existing perception of the importance of feature engineering: the automatic feature selection algorithm reduces the original feature space by over 90% and selects predominantly generic features from the datasets; nevertheless, machine learning algorithms perform better using the automatically selected features than the original features.
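As a rough illustration of how a gain "in points of F1" such as those reported in the abstract can be computed, the following is a minimal sketch. It assumes that a "point" means one unit of F1 on a 0-100 scale and that F1 is taken for a single positive class; the abstract does not state which averaging the paper uses. The function name, label encoding, and toy predictions are hypothetical and are not data from the paper.

```python
def f1_score(gold, predicted, positive=1):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = hate speech, 0 = not); not from the paper.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
baseline = [1, 0, 0, 1, 0, 1, 0, 0]
improved = [1, 0, 1, 1, 0, 1, 0, 0]

# Difference between the two systems, expressed in F1 points (0-100 scale).
gain_in_points = 100 * (f1_score(gold, improved) - f1_score(gold, baseline))
print(f"gain: {gain_in_points:.1f} F1 points")
```

On multi-class datasets, per-class scores like this one would typically be combined by micro or macro averaging; which of these applies to the reported figures is not stated here.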

Item Type:	Chapter in book
Creators:	Zhang, Z., Robinson, D. and Tepper, J.
Publisher:	Springer
Place of Publication:	Cham, Switzerland
Date:	2018
Volume:	10843
ISBN:	9783319934167
Identifiers:	DOI: 10.1007/978-3-319-93417-4_48
Divisions:	Schools > School of Science and Technology
Record created by:	Jonathan Gallacher
Date Added:	06 Jul 2018 09:30
Last Modified:	06 Jul 2018 09:41
URI:	https://irep.ntu.ac.uk/id/eprint/34022