Instagram-Based Benchmark Dataset for Cyberbullying Detection in Arabic Text
Abstract
1. Introduction
2. Related Work
3. Methods
3.1. Dataset Collection and Preprocessing
- Arabic fashionistas who had been subjected to bullying,
- Arabic singers who had been subjected to bullying,
- Arabic YouTubers who had been subjected to bullying,
- Arabic bloggers who had been subjected to bullying.
- Instagram profiles,
- Arabic accounts,
- Posts with at least 200 comments.
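The three selection criteria above can be expressed as a simple filter. This is a minimal sketch, not the authors' collection code: the `Post` record, its field names, and the sample values are hypothetical and exist only for illustration; only the 200-comment threshold comes from the text.

```python
from dataclasses import dataclass

MIN_COMMENTS = 200  # threshold stated in the selection criteria


@dataclass
class Post:
    # Hypothetical record; the field names are illustrative assumptions.
    platform: str
    account_language: str
    comment_count: int


def is_eligible(post: Post) -> bool:
    """Return True when a post meets all three selection criteria."""
    return (
        post.platform == "instagram"
        and post.account_language == "ar"
        and post.comment_count >= MIN_COMMENTS
    )


# Made-up sample posts, not taken from the dataset.
posts = [
    Post("instagram", "ar", 350),
    Post("instagram", "ar", 120),
    Post("instagram", "en", 900),
]
eligible = [p for p in posts if is_eligible(p)]
```

Only the first sample post passes, because the second has too few comments and the third is not from an Arabic account.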
3.2. Dataset Labeling
Annotation Scheme
- Annotators have been encouraged to avoid interpreting text subjectivity based on their own feelings and other background information in general. This is because the types of sentiment expressed may differ depending on the annotator’s or reader’s background knowledge [30]. For example, “يجب وآد البنات” means “We have to bury girls alive.” If the readers do not like girls, they could find this comment neutral and not offensive.
- Be as consistent as possible in the whole annotation journey.
- Document any questions you may have or issues you may come across and report them back to the data team. These questions will be valuable feedback for further expansion and improvements of this document.
- Negative: if there is no offensive, aggressive, insulting, or profane content.
- Positive: if it contains any words of praise, thanks, appreciation, etc.
- Normal: anything else, such as announcements, benedictions, etc.
- Toxic: comments that express negative feelings but do not contain any offensive words.
- Bullying: comments that include extreme (abusive, offensive, insulting, or aggressive) language targeting characteristics such as race, color, ethnicity, gender, sexual orientation, nationality, religion, or others. This label was chosen in line with the hate speech definition in [3] [Nockleby, 2000]. Examples of the labeling are given in Table 2.
- If the annotators disagreed, the final label was decided by majority voting.
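The majority-voting rule can be sketched as follows. This is an illustrative helper, not the authors' tooling. The function name `majority_label`, the use of three annotators in the example, and the choice to return `None` when no label wins a strict majority are all assumptions, since the text does not state the number of annotators or how ties were handled.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Assumption: a label must win more than half of the votes to be accepted.
    return label if count > len(votes) / 2 else None


print(majority_label(["Toxic", "Bullying", "Toxic"]))    # Toxic
print(majority_label(["Toxic", "Bullying", "Neutral"]))  # None
```

A comment for which the function returns `None` would need a separate adjudication step, which is outside the scope of this sketch.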
3.3. Dataset Descriptive Analysis
3.4. Evaluation
3.4.1. Labeling Evaluation
3.4.2. Benchmark Evaluation
- Logistic regression (LR): a statistical learning technique used for classification. Although the name of the classifier contains the word ‘regression’, it produces discrete outputs; in its basic form these are binary.
- Multinomial naïve Bayes (MNB): this classifier uses Bayes' theorem to estimate the probability of each class label for a given text and returns the label with the highest probability. MNB assumes that the features are independent of one another given the class, so each feature contributes independently to the predicted label.
- Support vector machines (SVM): SVM is a widely used supervised classifier. It is non-probabilistic and uses hyperplanes to separate the labels. SVM supports both linear and nonlinear models. Each hyperplane is defined in terms of the input document vectors.
- Random forest (RF): RF is a supervised ensemble classifier. It builds a set of decision trees and computes the resulting label by aggregating their individual predictions.
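To make the MNB description concrete, the following is a minimal from-scratch sketch of the decision rule "pick the class with the highest probability score". It is not the benchmark implementation: the class name `TinyMultinomialNB`, whitespace tokenization, add-one smoothing, and the English placeholder texts are all assumptions chosen to keep the example short. In practice a library implementation would normally be used.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior of the class
            for word in text.split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)  # label with the highest score


# Toy placeholder data, not taken from the dataset.
texts = ["thank you lovely", "lovely photo thank",
         "ugly awful look", "awful ugly style"]
labels = ["Positive", "Positive", "Bullying", "Bullying"]

model = TinyMultinomialNB().fit(texts, labels)
print(model.predict("lovely style thank"))  # Positive
print(model.predict("ugly look"))           # Bullying
```

Because the per-word log probabilities are simply summed, each token contributes to the score independently of the others, which is the independence assumption described in the MNB bullet.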
4. Result and Analysis
5. Conclusions
Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Jay, T.; Janschewitz, K. The pragmatics of swearing. J. Politeness Res. 2008, 4, 267–288.
- Razavi, A.H.; Inkpen, D.; Uritsky, S.; Matwin, S. Offensive Language Detection Using Multi-level Classification. In Canadian Conference on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2010; pp. 16–27.
- Patchin, J.W.; Hinduja, S. Bullies Move Beyond the Schoolyard. Youth Violence Juv. Justice 2006, 4, 148–169.
- Matsuda, M.J. Public Response to Racist Speech: Considering the Victim’s Story. Mich. L. Rev. 1989, 87, 17–51.
- López-Meneses, E.; Vázquez-Cano, E.; González-Zamar, M.-D.; Abad-Segura, E. Socioeconomic Effects in Cyberbullying: Global Research Trends in the Educational Context. Int. J. Environ. Res. Public Health 2020, 17, 4369.
- Haidar, B.; Chamoun, M.; Serhrouchni, A. Arabic Cyberbullying Detection: Using Deep Learning. In Proceedings of the 2018 7th International Conference on Computer and Communication Engineering (ICCCE), Kuala Lumpur, Malaysia, 19–20 September 2018; pp. 284–289.
- Ozel, S.A.; Sarac, E.; Akdemir, S.; Aksu, H. Detection of cyberbullying on social media messages in Turkish. In Proceedings of the 2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Turkey, 5–8 October 2017; pp. 366–370.
- Malmasi, S.; Zampieri, M. Detecting hate speech in social media. Int. Conf. Recent Adv. Nat. Lang. Process. RANLP 2017, 2017, 467–472.
- Sanchez, H. Twitter Bullying Detection. Homo 2011, 12, 15.
- Stella, M. Cognitive Network Science for Understanding Online Social Cognitions: A Brief Review. Top. Cogn. Sci. 2022, 14, 143–162.
- Marzouki, Y.; Aldossari, F.S.; Veltri, G.A. Understanding the buffering effect of social media use on anxiety during the COVID-19 pandemic lockdown. Humanit. Soc. Sci. Commun. 2021, 8, 1–10.
- Del Vicario, M.; Vivaldo, G.; Bessi, A.; Zollo, F.; Scala, A.; Caldarelli, G.; Quattrociocchi, W. Echo Chambers: Emotional Contagion and Group Polarization on Facebook. Sci. Rep. 2016, 6, 37825.
- Purba, K.R.; Asirvatham, D.; Murugesan, R.K. Classification of instagram fake users using supervised machine learning algorithms. Int. J. Electr. Comput. Eng. (IJECE) 2020, 10, 2763–2772.
- Efthimion, P.G.; Payne, S.; Proferes, N. Supervised Machine Learning Bot Detection Techniques to Identify Social Twitter Bots. SMU Data Sci. Rev. 2018, 1, 5.
- Zhong, H.; Li, H.; Squicciarini, A.C.; Rajtmajer, S.M.; Griffin, C.; Miller, D.J.; Caragea, C. Content-driven detection of cyberbullying on the instagram social network. IJCAI Int. Jt. Conf. Artif. Intell. 2016, 2016, 3952–3958.
- Mubarak, H.; Darwish, K.; Magdy, W. Abusive Language Detection on Arabic Social Media. In Proceedings of the First Workshop on Abusive Language, Online, 30 July–4 August 2017; pp. 52–56.
- Albayari, R.; Abdullah, S.; Salloum, S.A. Cyberbullying Classification Methods for Arabic: A Systematic Review. In Proceedings of the International Conference on Artificial Intelligence Computer Vision, Settat, Morocco, 28–30 June 2021; pp. 375–385.
- Alakrot, A.; Murray, L.; Nikolov, N.S. Dataset Construction for the Detection of Anti-Social Behaviour in Online Communication in Arabic. Procedia Comput. Sci. 2018, 142, 174–181.
- Di Capua, M.; Di Nardo, E.; Petrosino, A. Unsupervised cyber bullying detection in social networks. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 432–437.
- Hani, J.; Nashaat, M.; Ahmed, M.; Emad, Z.; Amer, E.; Mohammed, A. Social Media Cyberbullying Detection using Machine Learning. Int. J. Adv. Comput. Sci. Appl. 2019, 10, 703–707.
- Bayari, R.; Bensefia, A. Text Mining Techniques for Cyberbullying Detection: State of the Art. Adv. Sci. Technol. Eng. Syst. J. 2021, 6, 783–790.
- Haidar, B.; Chamoun, M.; Serhrouchni, A. Arabic Cyberbullying Detection: Enhancing Performance by Using Ensemble Machine Learning. In Proceedings of the 2019 International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Atlanta, GA, USA, 14–17 July 2019; pp. 323–327.
- Otiefy, Y.; Abdelmalek, A.; El Hosary, I. WOLI at SemEval-2020 Task 12: Arabic Offensive Language Identification on Different Twitter Datasets. arXiv 2020, arXiv:2009.05456.
- Mulki, H.; Haddad, H.; Ali, C.B.; Alshabani, H. L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language. 2019, pp. 111–118. Available online: http://aclanthology.lst.uni-saarland.de/W19-3512.pdf (accessed on 20 May 2022).
- Al-Ajlan, M.A.; Ykhlef, M. Optimized Twitter Cyberbullying Detection based on Deep Learning. In Proceedings of the 2018 21st Saudi Computer Society National Computer Conference (NCC), Riyadh, Saudi Arabia, 25–26 April 2018; pp. 1–5.
- Hosam, O. Toxic comments identification in arabic social media. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 2019, 11, 219–226.
- Al-Harbi, S.; Almuhareb, A.; Al-Thubaity, A.; Khorsheed, M.S.; Al-Rajeh, A. Automatic Arabic Text Classification. In Proceedings of The 9th International Conference on the Statistical Analysis of Textual Data. Available online: https://www.researchgate.net/publication/313363859_Automatic_Arabic_Text_Classification (accessed on 20 May 2022).
- El Rifai, H.; Al Qadi, L.; Elnagar, A. Arabic text classification: The need for multi-labeling systems. Neural Comput. Appl. 2021, 34, 1135–1159.
- Hosseinmardi, H.; Mattson, S.A.; Rafiq, R.I.; Han, R.; Lv, Q.; Mishra, S. Analyzing labeled cyberbullying incidents on the instagram social network. Lect. Notes Comput. Sci. 2015, 9471, 49–66.
- Balahur, A.; Steinberger, R. Rethinking Sentiment Analysis in the News: From Theory to Practice and back. Proc. WOMSA 2009, 9, 1–12.
- Alharbi, B.; Alamro, H.; Alshehri, M.; Khayyat, Z.; Kalkatawi, M.; Jaber, I.I.; Zhang, X. ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset. arXiv 2020, arXiv:2011.00578.
- Alshamsi, A.; Bayari, R.; Salloum, S. Sentiment Analysis in English Texts. Adv. Sci. Technol. Eng. Syst. J. 2020, 5, 1683–1689.
- Batanović, V.; Cvetanović, M.; Nikolić, B. A versatile framework for resource-limited sentiment articulation, annotation, and analysis of short texts. PLoS ONE 2020, 15, e0242050.
- Cao, H.; Sen, P.K.; Peery, A.F.; Dellon, E.S. Assessing agreement with multiple raters on correlated kappa statistics. Biom. J. 2016, 58, 935–943.
- Al Shamsi, A.A.; Abdallah, S. Sentiment Analysis of Emirati Dialect. Big Data Cogn. Comput. 2022, 6, 57.
- Shehab, M.A.; Badarneh, O.; Al-Ayyoub, M.; Jararweh, Y. A supervised approach for multi-label classification of Arabic news articles. In Proceedings of the CSIT 2016 7th International Conference on Computer Science and Information Technology (CSIT), Amman, Jordan, 13–14 July 2016; pp. 1–6.
- Al Shamsi, A.A.; Abdallah, S. Text Mining Techniques for Sentiment Analysis of Arabic Dialects: Literature Review. Adv. Sci. Technol. Eng. Syst. J. 2021, 6, 1012–1023.
- Al Shamsi, A.A.; Abdallah, S. A Systematic Review for Sentiment Analysis of Arabic Dialect Texts Researches. In Proceedings of the International Conference on Emerging Technologies and Intelligent Systems (ICETIS 2021), Al Buraimi, Oman, 25–26 June 2021; Springer: Cham, Switzerland, 2022; pp. 291–309.
Comments in English | Comments in Arabic |
---|---|
Whore. | يا فاجره |
But why? This is cheap, madam. | ليه طيب؟ ده رخص يا مدام |
I love you, but why have you become so vulgar? | انا بحبك بس انتي بقيتي سفله ليه كده |
Ugly and very disgusting. Please drop this style; it does not suit you at all. | وحش ومقزز جدا يريت، بلاش الاستايل،دة،مش حلو فيكي خالص |
Comments in English | Comments in Arabic | Classification |
---|---|---|
Follow me too | تابعني كمان | Neutral |
Watch my story | شاهد لستوري | Neutral |
God forgive me, I did not expect her to do such a thing! | استغفر الله العظيم، ما توقعت تعمل كذا | Toxic |
Death is inevitably coming, I don’t know why … you insist on making me mad | الموت قادم لا محاله قد متى ايه معرفش ليه كدا … انتي مصره تزعليني | Toxic |
He doesn’t deserve her. Look at her, she looks like she is from Hollywood, and he ... no comment | ما يستاهلها شوفوا شكلها كأنها من هوليوود وهو ... نو كومنت | Toxic |
Uglier than ugliness | أبشع من البشاعة | Bullying |
Bitch | ساقطه | Bullying |
Resulting Tags | Precision | Recall | F1-Score |
---|---|---|---|
Bullying | 0.70 | 0.58 | 0.63 |
Neutral | 0.65 | 0.52 | 0.58 |
Positive | 0.77 | 0.90 | 0.83 |
Toxic | 0.30 | 0.37 | 0.33 |
Macro avg | 0.60 | 0.59 | 0.59 |
Weighted avg | 0.66 | 0.59 | 0.59 |
Accuracy | 0.66 |
Resulting Tags | Precision | Recall | F1-Score |
---|---|---|---|
Bullying | 0.62 | 0.72 | 0.67 |
Neutral | 0.63 | 0.56 | 0.60 |
Positive | 0.76 | 0.90 | 0.82 |
Toxic | 0.42 | 0.13 | 0.20 |
Macro avg | 0.61 | 0.58 | 0.57 |
Weighted avg | 0.65 | 0.67 | 0.65 |
Accuracy | 0.67 |
Resulting Tags | Precision | Recall | F1-Score |
---|---|---|---|
Bullying | 0.70 | 0.58 | 0.63 |
Neutral | 0.65 | 0.52 | 0.58 |
Positive | 0.77 | 0.90 | 0.83 |
Toxic | 0.30 | 0.37 | 0.33 |
Macro avg | 0.60 | 0.59 | 0.59 |
Weighted avg | 0.66 | 0.66 | 0.65 |
Accuracy | 0.66 |
Resulting Tags | Precision | Recall | F1-Score |
---|---|---|---|
Bullying | 0.62 | 0.77 | 0.69 |
Neutral | 0.63 | 0.60 | 0.62 |
Positive | 0.80 | 0.90 | 0.85 |
Toxic | 0.50 | 0.10 | 0.17 |
Macro avg | 0.64 | 0.59 | 0.58 |
Weighted avg | 0.67 | 0.69 | 0.66 |
Accuracy | 0.69 |
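The relationship between the columns of these tables can be checked with a short calculation. The sketch below is not part of the authors' evaluation code. It recomputes the F1-score of each class as the harmonic mean of its precision and recall, using the per-class rows of the first results table above, and then takes the unweighted mean to obtain the macro-average F1. The helper name `f1_score` is an illustrative assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the first results table above.
per_class = {
    "Bullying": (0.70, 0.58),
    "Neutral": (0.65, 0.52),
    "Positive": (0.77, 0.90),
    "Toxic": (0.30, 0.37),
}

f1_by_class = {label: f1_score(p, r) for label, (p, r) in per_class.items()}

# The macro average weights every class equally, regardless of its size.
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)

for label, value in f1_by_class.items():
    print(f"{label}: {value:.2f}")
print(f"Macro avg F1: {macro_f1:.2f}")
```

Rounded to two decimals, the recomputed values (0.63, 0.58, 0.83, 0.33, and a macro average of 0.59) agree with the F1-score column of that table. Because the macro average gives the low-scoring Toxic class the same weight as the others, it sits below the reported accuracy.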
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
ALBayari, R.; Abdallah, S. Instagram-Based Benchmark Dataset for Cyberbullying Detection in Arabic Text. Data 2022, 7, 83. https://doi.org/10.3390/data7070083