Email Spam Classification Using Gated Recurrent Unit and Long Short-Term Memory

Iqbal Basyar; Adiwijaya; Danang Triantoro Murdiansyah

doi:10.3844/jcssp.2020.559.567

Research Article Open Access

Iqbal Basyar¹, Adiwijaya¹ and Danang Triantoro Murdiansyah¹

¹ Telkom University, Indonesia

Abstract

The high volume of spam email has led to an increase in email triage, causing losses amounting to USD 355 million per year. One way to reduce this loss is to classify spam email into categories, including fraud or promotions sent by unwanted parties. Early spam email classification relied on simple methods such as word filters. More complex methods have since emerged, such as sentence modeling using machine learning. Among the best-known methods for text classification are networks with Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). This study focuses on the classification of spam emails using both the LSTM and GRU methods, with XGBoost as the base model. In the scenario without dropout, LSTM and GRU obtained the same accuracy of 0.990183, superior to XGBoost. In the scenario with dropout, LSTM outperformed GRU and XGBoost, with accuracies of 98.60%, 98.58% and 98.52%, respectively. The GRU recall score was better than that of LSTM and XGBoost in the scenario with dropout, with values of 98.98%, 98.92% and 98.15%, respectively. In the scenario without dropout, LSTM was superior to GRU and XGBoost, with values of 98.39%, 98.39% and 98.15%, respectively.
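The abstract compares the models on accuracy and recall. As a reminder of what these two scores measure, here is a minimal Python sketch under stated assumptions: spam is treated as the positive class (label 1), which the abstract does not specify, and the function name `accuracy_and_recall` and the toy labels are invented for illustration. It is not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels.

    Assumption: 1 = spam (positive class), 0 = not spam.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-spam correctly passed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam that was missed

    accuracy = (tp + tn) / len(pairs)                 # share of all emails classified correctly
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual spam that was caught
    return accuracy, recall


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Two models can tie on one of these scores and differ on the other, which is why the abstract reports both.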

Journal of Computer Science

Volume 16 No. 4, 2020, 559-567

DOI: https://doi.org/10.3844/jcssp.2020.559.567

Submitted On: 9 August 2019 Published On: 3 April 2020

How to Cite: Basyar, I., Adiwijaya, & Murdiansyah, D. T. (2020). Email Spam Classification Using Gated Recurrent Unit and Long Short-Term Memory. Journal of Computer Science, 16(4), 559-567. https://doi.org/10.3844/jcssp.2020.559.567

Copyright: © 2020 Iqbal Basyar, Adiwijaya and Danang Triantoro Murdiansyah. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

GRU
LSTM
Spam Classification