OLF-ML: An Offensive Language Framework for Detection, Categorization, and Offense Target Identification Using Text Processing and Machine Learning Algorithms

Hasan, MD. Nahid; Sakib, Kazi Shadman; Preeti, Taghrid Tahani; Allohibi, Jeza; Alharbi, Abdulmajeed Atiah; Uddin, Jia

doi:10.3390/math12132123

by MD. Nahid Hasan ¹, Kazi Shadman Sakib ², Taghrid Tahani Preeti ¹, Jeza Allohibi ³, Abdulmajeed Atiah Alharbi ³ and Jia Uddin ⁴,*

¹ Department of Computer Science and Engineering, School of Data and Sciences, Brac University, Dhaka 1212, Bangladesh

² Department of Computer Science and Engineering, University of Dhaka, Dhaka 1000, Bangladesh

³ Department of Mathematics, Taibah University, Madinah 42353, Saudi Arabia

⁴ Artificial Intelligence and Big Data Department, Endicott College, Woosong University, Daejeon 34606, Republic of Korea

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(13), 2123; https://doi.org/10.3390/math12132123 (registering DOI)

Submission received: 2 June 2024 / Revised: 4 July 2024 / Accepted: 4 July 2024 / Published: 6 July 2024

Abstract

The pervasiveness of offensive language on social media underscores the need for automated systems that identify and categorize such content. Effective identification and categorization are essential for a safer online environment and better communication. However, existing research faces challenges such as limited datasets and biased model performance, which hinder progress in this domain. To address these challenges, this research presents a comprehensive framework that simplifies the use of support vector machines (SVM), random forest (RF), and artificial neural networks (ANN). Using the Offensive Language Identification Dataset (OLID), the proposed methodology yields notable gains in three tasks: offensive language detection, automatic categorization of offensiveness, and offense target identification. The simulation results indicate that SVM performs well, with accuracy scores of 77%, 88%, and 68%; precision scores of 76%, 87%, and 67%; F1 scores of 57%, 88%, and 68%; and recall rates of 45%, 88%, and 68%. These results suggest that the approach is practical for identifying and moderating offensive content on social media. With extensive preprocessing and careful hyperparameter tuning, our model outperforms some earlier work on offensive language detection and categorization.

Keywords: machine learning; offensive language detection; offensive language categorization; offensive target identification; SVM; random forest; ANN; OLID
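
The precision, recall, and F1 figures quoted in the abstract can be related through the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is illustrative only and is not taken from the paper: the function name `f1_score`, the task labels, and the pairing of each score triple with a task (assumed to follow the order in which the abstract lists the tasks) are assumptions. The harmonic-mean identity holds exactly only for single-class scores, so averaged scores need not reproduce it.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Rounded SVM scores quoted in the abstract, as (precision, recall) pairs.
# ASSUMPTION: the pairing with tasks follows the order of the task list
# in the abstract; the abstract does not state this explicitly.
reported = {
    "detection": (0.76, 0.45),
    "categorization": (0.87, 0.88),
    "target identification": (0.67, 0.68),
}

for task, (p, r) in reported.items():
    print(f"{task}: F1 from precision and recall = {f1_score(p, r):.3f}")
```

For the first pair, the harmonic mean comes to roughly 0.57, which agrees with the 57% F1 score in the abstract. The other two pairs land within about a percentage point of the reported 88% and 68%, a gap that rounding or class-averaged metrics could explain.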

