Article

A Study of Discriminatory Speech Classification Based on Improved Smote and SVM-RF

1 School of Software, South China Normal University, Guangzhou 510631, China
2 School of Computer Science, South China Normal University, Guangzhou 510631, China
3 Department of Industrial and Systems Engineering, Hong Kong Polytechnic University, Hong Kong 999077, China
4 Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK M4Y1M7, Canada
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Appl. Sci. 2024, 14(15), 6468; https://doi.org/10.3390/app14156468
Submission received: 9 June 2024 / Revised: 21 July 2024 / Accepted: 22 July 2024 / Published: 24 July 2024
(This article belongs to the Special Issue Advances in Security, Trust and Privacy in Internet of Things)

Abstract

The rapid development of the Internet has facilitated expression, sharing, and interaction on social networks, but some speech may contain harmful discriminatory content. Classifying such speech is therefore crucial. In this paper, we collect discriminatory speech data from Sina Weibo and propose an improved Synthetic Minority Over-sampling Technique (SMOTE) algorithm based on Latent Dirichlet Allocation (LDA) to improve data quality and balance. We also propose a new integration method that combines Support Vector Machine (SVM) and Random Forest (RF). The experimental results show that, compared with SVM alone, the integrated model improves precision, recall, and F1 score by 6.0%, 5.4%, and 5.7%, respectively. It also achieves the best performance among the machine learning methods compared. Ablation experiments further confirm the positive impact of the improved SMOTE and the integration method on classification.
Keywords: discrimination speech; latent Dirichlet allocation; support vector machine; random forest; integration method
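
For readers unfamiliar with SMOTE, the following is a minimal sketch of the classic interpolation step only. It is not the LDA-based improved variant proposed in this paper, whose details the abstract does not give. The function name `smote`, its parameters, and the toy data are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Classic SMOTE sketch: interpolate between a minority sample and
    one of its k nearest minority neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class: four 2-D feature vectors (illustrative data only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_synthetic=6)
```

Each synthetic point lies on a line segment between two existing minority samples, so oversampling this way adds no points outside the region the minority class already spans.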

Share and Cite

MDPI and ACS Style

Wu, C.; Hu, H.; Zhu, D.; Shan, X.; Yung, K.-L.; Ip, A.W.H. A Study of Discriminatory Speech Classification Based on Improved Smote and SVM-RF. Appl. Sci. 2024, 14, 6468. https://doi.org/10.3390/app14156468

