Comprehensive Study of Arabic Satirical Article Classification
Abstract
1. Introduction
- Compile a large dataset of Arabic satirical news articles;
- Perform a thorough analysis of the impact of several traditional and innovative feature extraction methods on satirical article classification;
- Build satire classification models using machine learning (ML), deep learning (DL), and transformers;
- Perform a detailed linguistic analysis of the formation of satirical articles;
- Create an open-source satire classification platform.
2. Background
2.1. Satirical News
2.2. Feature Extraction Techniques
2.2.1. Traditional Techniques
- N-grams: contiguous sequences of n items (characters or words) in a text. By collecting the most common n-grams rather than the whole corpus, this method can be used to obtain a more accurate categorization [18]. The n-grams approach was used with several standard algorithmic combinations to identify satire within the Arabic language [19]. Moreover, the n-grams method was combined with other methods, such as TF-IDF, to achieve higher prediction results in classification tasks [20]. N-grams can be used in language processing to analyze patterns and relationships between words or phrases in a text. This method can provide useful insights into satirical content expressed as exaggeration or humorous wording, i.e., the more humorous words found, the more likely the article is to be classified as satirical (a minimal extraction sketch is given after this list).
- TF-IDF: Term Frequency–Inverse Document Frequency. Here, a term's frequency in a document, weighed against its frequency across the corpus, indicates how significant a role it plays in the text [18,21]. It is simple to compute and expresses word similarity easily, but it ignores semantics, which limits the algorithm's overall performance [22]. Satire detection in Arabic was performed using the TF-IDF approach in several studies, such as those that utilized it to reflect the syntactic structure of tweets [23] or linguistic features [24]. Generally, the TF-IDF approach has proven effective in various natural language processing tasks, such as sentiment analysis and text classification [25]. Furthermore, researchers have also explored the use of TF-IDF in combination with machine learning algorithms to improve the accuracy of satire detection models [26,27]. Since TF-IDF is a numerical statistic used in information retrieval to determine the importance of a word in a document, it is a useful feature for detecting the occurrence of satirical cues in a document. As the frequency of satirical elements increases, the probability of the model correctly classifying the piece as satire also increases.
- Textual features: these address lexical, syntactic, and stylistic properties of the text. Such attributes were useful in detecting satirical contexts in Arabic [28]. For instance, the authors of [29] employed stylistic elements that included quotation marks, exclamation points, and questions to classify satirical text. Additionally, other features such as sentiment features, shifters, and contextual features were used for Arabic satire identification [28]. Shifters pick up a variety of linguistic phenomena, such as claims that are inconsistent with reality; they also detect instances of exaggeration, such as the use of strong descriptive words like “huge” and “gigantic”. These linguistic phenomena help identify instances of Arabic satire, as they indicate a deviation from the norm or an intentional exaggeration. By analyzing surface/stylistic features, sentiment features, and contextual features, the authors in [28] were able to accurately identify and classify instances of satire in Arabic texts. These findings highlight the importance of considering various linguistic cues and markers when studying irony in different languages. As stated earlier, satirical articles have certain language characteristics, such as exaggeration, humor, and the use of conflicting terminology, which effectively convey the satirist’s message. These linguistic cues, expressed as textual characteristics, may be employed to determine the presence of satire within a given text.
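To make the n-gram and TF-IDF techniques above concrete, the following is a minimal scikit-learn sketch of extracting word n-gram counts and TF-IDF weights. The sample sentences, the (1, 2) n-gram range, and the feature cap are illustrative assumptions, not the settings used in this study.

```python
# Minimal sketch: word n-gram counts and TF-IDF weights with scikit-learn.
# The documents and parameters below are illustrative, not the study's data.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "عاجل مسؤول يعد بإنجاز ضخم وهائل خلال يومين فقط",   # exaggerated, satirical tone
    "أعلنت الوزارة اليوم عن خطة تطوير تمتد لخمس سنوات",  # neutral news tone
]

# Unigrams + bigrams, keeping only the most frequent features.
ngram_counts = CountVectorizer(ngram_range=(1, 2), max_features=5000)
X_counts = ngram_counts.fit_transform(docs)

# TF-IDF weights over the same n-gram space.
tfidf = TfidfVectorizer(ngram_range=(1, 2), max_features=5000)
X_tfidf = tfidf.fit_transform(docs)

print(X_counts.shape, X_tfidf.shape)
```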
2.2.2. Innovative Techniques
- Word Embeddings: word embeddings are useful for capturing the semantic connections between words [30]. The raw data used to train the network are mapped into low-dimensional dense vectors. After enough training, the lexicon’s semantics are learned, and a map is created by grouping words with similar semantic connotations [18,22]. Static word embeddings capture standalone representations that are independent of context. Word and subword embeddings were applied using the word2vec tool, which provides two models for representation: a continuous bag of words and a continuous skip-gram [31]; this was applied for the purpose of detecting Arabic satire. Moreover, Arabic FastText was used to extract word embeddings [32], and the deep emoji technique was additionally utilized for the extraction of emotion features [23]. Even though static word embeddings are effective, especially when trained on large datasets, they do not account for the meaning of a word in various contexts. Because a word in one domain can have a completely different meaning in another, static embeddings did not perform well on domain-specific datasets [33]. Contextual word embedding is, therefore, employed to represent a term in accordance with the context in which it appears [22,34]. Technologies such as ELMo have successfully addressed this problem, although they require large training corpora, which domain-specific datasets typically cannot provide. It is important to note that for non-contextual tasks, such as studying vector spaces, static word embeddings are occasionally preferred; their computational cost is also significantly lower than that of contextual word embeddings [35]. Contextual word embeddings have been utilized in a variety of studies, including [22,34], to identify Arabic sarcasm.

Transformers take natural language text as input and generate a prediction for a classification problem. Their architecture uses an encoder–decoder structure, as shown in Figure 1. The encoder takes an input and converts it into a sequence of continuous representations that are fed to a decoder, which generates a prediction. According to Vaswani et al. [36], “The Transformer is the first transduction model relying purely on self-attention to calculate representations of its input and output without employing sequence-aligned RNNs or convolution”. Using stacked self-attention and pointwise, fully connected layers, the transformer follows this overall encoder–decoder design. The transformer’s attention function maps a query and a collection of key–value pairs to an output. The output is computed as a weighted sum of the values, with each value’s weight determined by a compatibility function between the query and the corresponding key:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V,$$

where $d_k$ is the dimension of the keys and $d_v$ is the dimension of the values. Moreover, the transformer performs the attention function in parallel over several heads, producing the output values using “multi-head attention”, as shown in Equation (1):

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^{O}, \quad \mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V}). \tag{1}$$
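The attention computation above can be illustrated with a short NumPy sketch of scaled dot-product attention and its multi-head combination. The sequence length, model width, and random projection matrices are arbitrary stand-ins for learned parameters; this is not part of any model used in this paper.

```python
# Minimal sketch of scaled dot-product attention and multi-head attention (Equation (1)).
# Shapes and random projections are illustrative only.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)   # (heads, seq, seq)
    return softmax(scores) @ V                          # (heads, seq, d_k)

seq_len, d_model, n_heads = 10, 64, 8
d_k = d_model // n_heads
x = np.random.randn(seq_len, d_model)

# Project the input into per-head queries, keys, and values (random stand-ins for learned weights).
W_q, W_k, W_v = (np.random.randn(n_heads, d_model, d_k) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v                     # each: (heads, seq, d_k)

heads = attention(Q, K, V)                              # (heads, seq, d_k)
W_o = np.random.randn(n_heads * d_k, d_model)
multi_head = heads.transpose(1, 0, 2).reshape(seq_len, -1) @ W_o  # (seq, d_model)
print(multi_head.shape)
```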
3. Related Work
4. Methodology
4.1. Dataset
4.2. Data Preprocessing
- Tokenization, which is a method of separating a piece of text into smaller chunks called tokens.
- Normalized elongated words by collapsing repetitions of three or more identical characters.
- Normalized the three Arabic letters: alef, alef maqsoura, and ta-marbouta.
- Removed diacritics and punctuation marks.
- Removed non-Arabic characters. A minimal sketch of these steps is given after this list.
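Below is a minimal sketch of these preprocessing steps using plain regular expressions. It assumes simple whitespace tokenization rather than the exact tooling used in this study, and the sample sentence is invented.

```python
# Minimal regex-based sketch of the preprocessing steps listed above.
import re

ARABIC_DIACRITICS = re.compile(r"[\u064B-\u0652]")

def preprocess(text: str) -> list[str]:
    text = re.sub(r"(.)\1{2,}", r"\1", text)        # collapse 3+ repeated characters
    text = re.sub(r"[إأآ]", "ا", text)               # normalize alef variants
    text = re.sub(r"ى", "ي", text)                   # normalize alef maqsoura
    text = re.sub(r"ة", "ه", text)                   # normalize ta-marbouta
    text = ARABIC_DIACRITICS.sub("", text)           # remove diacritics
    text = re.sub(r"[^\u0621-\u064A\s]", " ", text)  # drop punctuation and non-Arabic characters
    return text.split()                              # whitespace tokenization

print(preprocess("رااااائع جدًّا!!! Breaking: قصة مثيرة"))
```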
4.3. Feature Extractions
4.3.1. N-Grams
4.3.2. Textual Features
- Emotions: The most well-known list, frequently referred to as “The Big Six”, was utilized by Ekman et al. [49] in their investigation into the universal detection of emotion from facial expression. The list contained the most widely acknowledged candidates for fundamental emotions, including joy, sadness, fear, surprise, anger, and disgust. This emotion list was translated to Arabic in the study by Saad [50].
- Part of speech (POS): These refer to nouns, verbs, adverbs, adjectives, prepositions, determiners, pronouns, conjunctions, and proper nouns. We used the Farasa [51] tool to extract each word’s POS tag.
- Linguistics: linguistic features are certain syntactic categories that are too fine-grained to be captured by general POS tags. Each syntactic unit conforms to a certain linguistic purpose, which is used to build meaningful statements. In recent years, there has been an increasing amount of literature investigating authors’ writing styles to identify unique features associated with their writing and to identify certain characteristics [52,53,54,55,56]. The set of linguistic markers investigated in this study, as described in Table 2, comprises assurance, negations, justification, intensifiers, hedges, illustrations, temporal, spatial, superlative, exceptions, and oppositions. These linguistic features were extracted following the approach of Himdi et al. [57] (a count-based feature sketch is given after this list).
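To illustrate how such emotion and linguistic-marker lexicons can be turned into numeric features, the sketch below counts normalized lexicon hits per article. The tiny word lists are hypothetical placeholders for the real lexicons of Saad [50] and Himdi et al. [57], and POS tagging (done with Farasa in this study) is omitted.

```python
# Minimal sketch: turning emotion / linguistic-marker lexicons into count features.
# The tiny word lists here are hypothetical placeholders for the real lexicons.
EMOTION_LEXICON = {
    "joy": {"فرح", "سعادة"},
    "anger": {"غضب", "سخط"},
}
LINGUISTIC_MARKERS = {
    "intensifiers": {"جدا", "للغاية"},
    "negations": {"لا", "لم", "لن"},
}

def lexicon_features(tokens: list[str]) -> dict[str, float]:
    features = {}
    n = max(len(tokens), 1)
    for name, words in {**EMOTION_LEXICON, **LINGUISTIC_MARKERS}.items():
        # Normalized frequency of lexicon hits in the article.
        features[name] = sum(token in words for token in tokens) / n
    return features

print(lexicon_features(["فرح", "جدا", "لا", "خبر"]))
```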
4.3.3. Word Embeddings
- FastText: a library developed by Facebook that allows for efficient text classification and representation learning. It is designed to work with language models that are capable of learning from a large corpus of text data. It represents each word as a bag of character n-grams, so that subword information contributes to the word vector; this allows FastText to produce useful vectors even for rare or unseen words, no matter how long or short the sentence. FastText has seen great success in NLP tasks such as sentiment analysis, text summarization, and entity recognition [62]. Additionally, FastText has pretrained versions for several languages. Since only Arabic data were used in the experiments, FastText’s 300-dimension Arabic pretrained vectors were used in this study [63].
- Word2vec: a predictive model that trains by attempting to predict a target word given its context (CBOW method) or by predicting the context words given the target word (skip-gram method) [64]. It employs trainable embedding weights to map words to their respective embeddings, which are used to assist the model in making predictions. As the loss function for training the model is proportional to the accuracy of the model’s predictions, training the model to make more accurate predictions results in more accurate embeddings. A neural network model is used to generate the embedding for each word [65]. The AraVec embeddings, pretrained on a Twitter dataset using CBOW with 300 embedding dimensions, were used [65].
- GloVe: employs matrix factorization techniques applied to the word–context matrix. First, it creates a large matrix of (words × contexts) co-occurrence information: for each “word” (the rows), it counts how frequently (matrix values) this word appears in a given “context” (the columns) in a large corpus. The number of possible “contexts” is essentially combinatorial in size, so the matrix is very large. Therefore, this matrix is factorized to produce a lower-dimensional (word × features) matrix, where each row represents a word as a vector. This is typically achieved by minimizing a “reconstruction loss”, which seeks the lower-dimensional representations that explain most of the variance in the high-dimensional data [66]. For Arabic, there is just one pretrained GloVe embedding currently accessible online, a 256-dimensional pretrained GloVe word embedding [67]. A minimal sketch of loading and averaging such pretrained vectors is given after this list.
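The sketch below shows one common way to load pretrained vectors in word2vec format (e.g., AraVec) with gensim and average them into a single document vector. The file name is a placeholder, and the averaging strategy is an assumption rather than necessarily the exact procedure used in this work.

```python
# Minimal sketch: averaging pretrained word vectors into a document vector.
# "embeddings.vec" is a placeholder path for a word2vec-format file (e.g., AraVec).
import numpy as np
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("embeddings.vec", binary=False)

def document_vector(tokens: list[str]) -> np.ndarray:
    # Average the vectors of in-vocabulary tokens; zeros if none are found.
    hits = [vectors[t] for t in tokens if t in vectors]
    return np.mean(hits, axis=0) if hits else np.zeros(vectors.vector_size)

doc_vec = document_vector(["خبر", "ساخر"])
print(doc_vec.shape)  # e.g., (300,) for 300-dimensional embeddings
```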
4.4. Classification Models
4.4.1. Machine Learning
- Naive Bayes (NB): this classifier is composed of a family of algorithms that are based on the Bayes theorem and assume independence of the attributes. It is based on estimates: the model adjusts its probability table using the training data and predicts new observations by estimating the class probability from their feature values. The small amount of training data needed by NB results in storage space savings. It also yields quicker results and is not sensitive to missing data [73].
- Support Vector Machine (SVM): a supervised machine learning model that classifies input data by finding the maximum separating hyperplane between different classes [74]. SVM supports both classification and regression analysis. It chooses a boundary that maximizes the distance between the nearest members of the different classes [75]. SVM offers the advantage of resisting overfitting in high-dimensional spaces, and by choosing an appropriate kernel from a large selection, it can model non-linear decision boundaries. It is widely used in text classification projects [76,77].
- Logistic Regression (LR): a classifier that establishes a link between features and the likelihood of a particular outcome [78]. It is based on the logistic function, an S-shaped curve that maps real-valued numbers to values between 0 and 1 [79]. LR is reliable for classification problems [80], and it helps prevent overfitting.
- Random Forest (RF): this classification algorithm builds an ensemble of decision trees using bootstrap aggregation, called bagging [81], an ensemble method that combines the predictions from several ML models. Bagging also reduces the high variance that individual trees can exhibit due to their sensitivity to the training data. RF is scalable and robust to outliers. A minimal sketch comparing these four classifiers is given after this list.
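A minimal scikit-learn sketch comparing the four classifiers on TF-IDF features is given below. The toy data, the linear SVM variant, and the hyperparameters are illustrative assumptions rather than the settings reported in the experiments.

```python
# Minimal sketch: comparing the four classical classifiers on TF-IDF features.
# Texts, labels, and hyperparameters are illustrative only.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["مقال ساخر عن الوزارة", "تقرير إخباري عن الاقتصاد"] * 50
labels = [1, 0] * 50  # 1 = satire, 0 = non-satire

classifiers = {
    "NB": MultinomialNB(),
    "SVM": LinearSVC(),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=100),
}

for name, clf in classifiers.items():
    pipeline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
    scores = cross_val_score(pipeline, texts, labels, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```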
4.4.2. Deep Learning
- CNN: one of the most popular neural network architectures, consisting of an input layer, hidden layers, and an output layer. It uses a convolution layer to transform the input data into a form that is easier to process. Additionally, a pooling layer is used to reduce the input dimensions.
- Bi-LSTM: a bidirectional recurrent neural network (RNN) that combines two long short-term memory (LSTM) networks: one processes the input sequence in the forward direction, and the other processes it in the backward direction. This improves learning and produces better accuracy.
- CNN & Bi-LSTM: a hybrid model that combines CNN and Bi-LSTM. The CNN is used to reduce the input dimensions, and its output is fed into the Bi-LSTM layer. The convolution layer extracts local features, and the Bi-LSTM layer uses the ordering of those features to learn about the sequential structure of the input text.
4.4.3. Transformers
- BERT (Bidirectional Encoder Representations from Transformers): a powerful transformer language model that has been shown to achieve state-of-the-art performance on a variety of natural language processing tasks [85]. By examining relationships in sequential data, the transformer architecture learns to comprehend context and meaning; in natural language processing (NLP), that sequential data consists of the words in a sentence. An encoder–decoder architecture is used in this system: an input sequence’s characteristics are extracted by the encoder on the left side of the architecture and are then used by the decoder on the right side to create the output sequence (a fine-tuning sketch is given after this list).
- GPT (Generative Pretrained Transformer): a language model built on the decoder part of the transformer architecture. It has been applied to many NLP tasks: like humans, such models can generate messages, respond to inquiries, and produce images and videos. Antoun et al. [86] proposed the generative pretrained language model AraGPT2. The model utilizes a self-attention mechanism to identify long-term relationships between sequences over time. The model is trained using a collection of texts, the majority of which are written in Modern Standard Arabic (MSA). AraGPT2 has been extensively used in NLP projects such as text generation [87]. In contrast to regular transformers, it extends the self-attention block with a second normalizing layer, which makes it unique among transformers. Both BERT and GPT excel at text classification due to their abilities to capture contextual information. BERT’s bidirectional approach allows it to understand word meaning in relation to both preceding and succeeding contexts, while GPT’s generative nature enables it to generate coherent text based on the preceding context. These attributes contribute to their effectiveness in text classification tasks, as they provide models with a deeper understanding of language, context, and semantic relationships within texts [88].
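For concreteness, the following is a minimal sketch of fine-tuning an Arabic BERT-style checkpoint for binary satire classification with the Hugging Face transformers library. The checkpoint name, toy data, and hyperparameters are assumptions for illustration and do not reproduce the configuration reported in Section 5.

```python
# Minimal sketch: fine-tuning an Arabic BERT checkpoint for satire classification.
# The checkpoint name, data, and hyperparameters are illustrative assumptions.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "aubmindlab/bert-base-arabertv02"  # assumed AraBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

texts = ["مقال ساخر", "خبر حقيقي"]
labels = [1, 0]
encodings = tokenizer(texts, truncation=True, padding=True, max_length=128)

class SatireDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="satire-bert", num_train_epochs=2,
                           per_device_train_batch_size=8),
    train_dataset=SatireDataset(encodings, labels),
)
trainer.train()
```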
5. Experiments
5.1. Model Compilation
- The batch size is the number of examples to be taken into account prior to updating the model’s parameters. Batch sizes of 32, 64, and 128 were examined in this study.
- The number of epochs determines how many times the algorithm passes over the training dataset. Here, we experimented with epoch numbers ranging from 1 to 15. To avoid wasting time and storage, we used an accuracy-based early-stopping criterion, terminating training once the highest accuracy was reached, to account for the discrepancy between the loss function and the updating of model parameters. The Adam optimizer was used with a learning rate of 0.001.
- Dropout improves the model’s generalization and reduces the likelihood of overfitting. To constrain the weight of layers, the dropout rate was modified to 0.2.
- The classifier is the final layer stack that converts the input into predicted classes. This stack included a Conv1D layer, a MaxPooling1D layer, and a Dense layer with a sigmoid activation function for binary classification.
- CNN: the architecture consists of several layers. First, a Conv1D layer applies 64 filters with a kernel size of three to the input data, with ReLU as the activation function. Second, a MaxPooling1D layer performs max pooling with a pool size of two. Third, another Conv1D layer applies 64 filters with a kernel size of three to the output of the previous layer, followed by a MaxPooling1D layer with a pool size of two. Then, a Flatten layer flattens the output of the previous layer. Finally, a dense layer with two neurons and a sigmoid activation function outputs a probability distribution over the output classes (a minimal Keras sketch of this configuration is given after this list).
- Bi-LSTM: its architecture uses the Keras library. The model has a single bidirectional LSTM layer with 64 units, followed by a dense layer with a sigmoid activation function that outputs two values for binary classification. The loss function used is categorical cross-entropy, and the optimizer is Adam. The model was trained for two epochs; the input shape was (768, 1), which is the size of the BERT embeddings after reshaping.
- CNN & Bi-LSTM: this model’s architecture combines both the CNN and Bi-LSTM layers: A bidirectional LSTM layer with 64 units and a 1D convolutional layer with 64 filters and a kernel size of three extract local patterns from the sequence. Then, the MaxPooling1D layer with pool size two was applied to reduce the dimensionality of the feature maps. The flattening layer converts the 3D tensor output from the previous layer into a 1D tensor. Last, there is the dense layer with two units and the sigmoid activation function, which outputs the probability distribution over the classes.
- BERT (Bidirectional Encoder Representations from Transformers): BERT employs bidirectional self-attention transformers to capture both short- and long-term contextual dependencies in the input text. We used AraBERT [86], which has a vocabulary of 64,000 words, 12 attention heads, 12 hidden layers, a hidden size of 768, a total of 110 M parameters, and a maximum sequence length of 512. It was trained on a dataset of 3B Arabic words. Specifically, we used the available version “paraphrase-multilingual-mpnet-base-v2” (https://huggingface.co/ (accessed on 1 September 2023)), a pretrained model designed to encode multilingual texts into high-quality embeddings. Notably, the Adam optimizer with a learning rate of 1e−4, a batch size of 512, and a sequence length of 128 was utilized.
- GPT (Generative Pretrained Transformer): AraGPT2 was trained on the largest publicly accessible collection of filtered Arabic corpora [89]. The perplexity metric, which assesses how effectively a probability model predicts a sample, was used to assess the model. The model is trained on 77 GB of Arabic text. AraGPT2 is available in four size variants: base, medium, large, and mega; the smallest model, base, has the same dimensions as AraBERT-base, which makes it accessible to a greater number of researchers. Larger variants (medium, large, and mega) provide enhanced performance but are more difficult to fine-tune and computationally costly. The AraGPT2 detector is based on a pretrained AraELECTRA model that was refined using a synthetically generated dataset. In this study, we used the AraGPT2 base model, which has a batch size of 1792, a learning rate of 1.27e−3, the LAMB optimizer, 12 heads, 12 layers, and 135 M parameters.
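To make the CNN configuration described above concrete, a minimal Keras sketch following that layer description is given below. The (768, 1) input shape is borrowed from the Bi-LSTM description, and the loss choice is an assumption; this is a sketch, not the exact training script used in this study.

```python
# Minimal Keras sketch of the CNN described above: two Conv1D(64, 3) + MaxPooling1D(2)
# blocks, a Flatten layer, and a 2-unit sigmoid output.
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(768, 1)),                      # assumed: reshaped BERT embeddings
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Flatten(),
    layers.Dense(2, activation="sigmoid"),
])

# Adam with learning rate 0.001 as stated above; the loss function is an assumption.
model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```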
5.2. Evaluation Metrics
5.3. Results and Discussion
5.4. Error Analysis
5.5. Limitations
5.6. Satire Lexical Density
6. Model Development
7. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Rahma, A.; Azab, S.S.; Mohammed, A. A Comprehensive Review on Arabic Sarcasm Detection: Approaches, Challenges and Future Trends. IEEE Access 2023, 11, 18261–18280. [Google Scholar] [CrossRef]
- Baumgartner, J.C.; Morris, J.S. One “nation,” under Stephen? The effects of the Colbert Report on American youth. J. Broadcast. Electron. Media 2008, 52, 622–643. [Google Scholar] [CrossRef]
- Stones, S.; Glazzard, J.; Muzio, M.R. Selected Topics in Child and Adolescent Mental Health; BoD-Books on Demand: Norderstedt, Germany, 2020. [Google Scholar]
- Egelhofer, J.L.; Lecheler, S. Fake news as a two-dimensional phenomenon: A framework and research agenda. Ann. Int. Commun. Assoc. 2019, 43, 97–116. [Google Scholar] [CrossRef]
- Bowyer, B.T.; Kahne, J.E.; Middaugh, E. Youth comprehension of political messages in YouTube videos. New Media Soc. 2017, 19, 522–541. [Google Scholar] [CrossRef]
- Baym, G.; Jones, J.P. News parody in global perspective: Politics, power, and resistance. Pop. Commun. 2012, 10, 2–13. [Google Scholar] [CrossRef]
- Young, D.G.; Tisinger, R.M. Dispelling late-night myths: News consumption among late-night comedy viewers and the predictors of exposure to various late-night shows. Harv. Int. J. Press/Politics 2006, 11, 113–134. [Google Scholar] [CrossRef]
- O’Keefe, P.A.; Horberg, E.; Plante, I. The multifaceted role of interest in motivation and engagement. In The Science of Interest; Springer: Berlin/Heidelberg, Germany, 2017; pp. 49–67. [Google Scholar]
- Baum, M.A. Soft news and political knowledge: Evidence of absence or absence of evidence? Political Commun. 2003, 20, 173–190. [Google Scholar] [CrossRef]
- del Pilar Salas-Zárate, M.; Paredes-Valverde, M.A.; Rodriguez-García, M.Á.; Valencia-García, R.; Alor-Hernández, G. Automatic detection of satire in Twitter: A psycholinguistic-based approach. Knowl.-Based Syst. 2017, 128, 20–33. [Google Scholar] [CrossRef]
- Gupta, A.; Kumaraguru, P.; Castillo, C.; Meier, P. Tweetcred: A real-time web-based system for assessing credibility of content on twitter. arXiv 2014, arXiv:1405.5490. [Google Scholar]
- Lichtheim, M. Ancient Egyptian Literature; Univ of California Press: Berkeley, CA, USA, 2019. [Google Scholar]
- Peifer, J.; Lee, T. Satire and journalism. In Oxford Research Encyclopedia of Communication; Oxford University Press: Oxford, UK, 2019. [Google Scholar]
- Young, D.G. Can satire and irony constitute misinformation. In Misinformation and Mass Audiences; University of Texas Press: Austin, TX, USA, 2018; pp. 124–139. [Google Scholar]
- Cockerell, I. Fear, Panic and Fake News Spread after Ebola Outbreak in Uganda. 2022. Available online: https://www.codastory.com/newsletters/ebola-disinformation-uganda/ (accessed on 15 April 2023).
- Khalid, S.; Khalil, T.; Nasreen, S. A survey of feature selection and feature extraction techniques in machine learning. In Proceedings of the 2014 Science and Information Conference, London, UK, 27–29 August 2014; pp. 372–378. [Google Scholar]
- Velliangiri, S.; Alagumuthukrishnan, S. A review of dimensionality reduction techniques for efficient computation. Procedia Comput. Sci. 2019, 165, 104–111. [Google Scholar] [CrossRef]
- Mehta, A.; Parekh, Y.; Karamchandani, S. Performance evaluation of machine learning and deep learning techniques for sentiment analysis. In Information Systems Design and Intelligent Applications: Proceedings of Fourth International Conference INDIA 2017; Springer: Berlin/Heidelberg, Germany, 2018; pp. 463–471. [Google Scholar]
- Allaith, A.; Shahbaz, M.; Alkoli, M. Neural Network Approach for Irony Detection from Arabic Text on Social Media. In Proceedings of the FIRE (Working Notes), Kolkata, India, 12–15 December 2019; pp. 445–450. [Google Scholar]
- Nayel, H.; Amer, E.; Allam, A.; Abdallah, H. Machine learning-based model for sentiment and sarcasm detection. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 386–389. [Google Scholar]
- Abuteir, M.M.; Elsamani, E. Automatic Sarcasm Detection in Arabic Text: A Supervised Classification Approach. Int. J. New Technol. Res. 2021, 7, 1–11. [Google Scholar]
- Elgabry, H.; Attia, S.; Abdel-Rahman, A.; Abdel-Ate, A.; Girgis, S. A contextual word embedding for Arabic sarcasm detection with random forests. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 340–344. [Google Scholar]
- Kanwar, N.; Mundotiya, R.K.; Agarwal, M.; Singh, C. Emotion based voted classifier for Arabic irony tweet identification. In Proceedings of the FIRE (Working Notes), Kolkata, India, 12–15 December 2019; pp. 426–432. [Google Scholar]
- Abuzayed, A.; Al-Khalifa, H. Sarcasm and sentiment detection in Arabic tweets using BERT-based models and data augmentation. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 312–317. [Google Scholar]
- Wadhawan, A. Arabert and farasa segmentation based approach for sarcasm and sentiment detection in arabic tweets. arXiv 2021, arXiv:2103.01679. [Google Scholar]
- Hengle, A.; Kshirsagar, A.; Desai, S.; Marathe, M. Combining Context-Free and Contextualized Representations for Arabic Sarcasm Detection and Sentiment Identification. arXiv 2021, arXiv:2103.05683. [Google Scholar]
- Sarsam, S.M.; Al-Samarraie, H.; Alzahrani, A.I.; Wright, B. Sarcasm detection using machine learning algorithms in Twitter: A systematic review. Int. J. Mark. Res. 2020, 62, 578–598. [Google Scholar] [CrossRef]
- Karoui, J.; Zitoune, F.B.; Moriceau, V. Soukhria: Towards an irony detection system for arabic in social media. Procedia Comput. Sci. 2017, 117, 161–168. [Google Scholar] [CrossRef]
- Al-Ghadhban, D.; Alnkhilan, E.; Tatwany, L.; Alrazgan, M. Arabic sarcasm detection in Twitter. In Proceedings of the 2017 International Conference on Engineering & MIS (ICEMIS), IEEE, Monastir, Tunisia, 8–10 May 2017; pp. 1–7. [Google Scholar]
- Gupta, M.; Bakliwal, A.; Agarwal, S.; Mehndiratta, P. A comparative study of spam SMS detection using machine learning classifiers. In Proceedings of the 2018 Eleventh International Conference on Contemporary Computing (IC3), IEEE, Noida, India, 2–4 August 2018; pp. 1–7. [Google Scholar]
- Moudjari, L.; Akli-Astouati, K. An Embedding-based Approach for Irony Detection in Arabic tweets. In Proceedings of the FIRE (Working Notes), Kolkata, India, 12–15 December 2019; pp. 409–415. [Google Scholar]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Zhou, W.; Bloem, J. Comparing Contextual and Static Word Embeddings with Small Data. In Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021), Dusseldorf, Germany, 6–9 September 2021; pp. 253–259. [Google Scholar]
- Alharbi, A.I.; Lee, M. Multi-task learning using a combination of contextualised and static word embeddings for arabic sarcasm detection and sentiment analysis. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 318–322. [Google Scholar]
- Gupta, P.; Jaggi, M. Obtaining better static word embeddings using contextual embedding models. arXiv 2021, arXiv:2106.04302. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Saadany, H.; Mohamed, E.; Orasan, C. Fake or real? A study of Arabic satirical fake news. arXiv 2020, arXiv:2011.00452. [Google Scholar]
- Farha, I.A.; Magdy, W. Mazajak: An online Arabic sentiment analyser. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, Florence, Italy, 1 August 2019; pp. 192–198. [Google Scholar]
- Naski, M.; Messaoudi, A.; Haddad, H.; BenHajhmida, M.; Fourati, C.; Mabrouk, A.B.E. iCompass at shared task on sarcasm and sentiment detection in Arabic. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 381–385. [Google Scholar]
- Farha, I.A.; Zaghouani, W.; Magdy, W. Overview of the wanlp 2021 shared task on sarcasm and sentiment detection in arabic. In Proceedings of the Sxth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 296–305. [Google Scholar]
- Godara, J.; Batra, I.; Aron, R.; Shabaz, M. Ensemble classification approach for sarcasm detection. Behav. Neurol. 2021, 2021, 9731519. [Google Scholar] [CrossRef]
- Babanejad, N.; Davoudi, H.; An, A.; Papagelis, M. Affective and contextual embedding for sarcasm detection. In Proceedings of the 28th International Conference on Computational Linguistics, Online, 8–13 December 2020; pp. 225–243. [Google Scholar]
- Sharma, D.K.; Singh, B.; Agarwal, S.; Kim, H.; Sharma, R. Sarcasm detection over social media platforms using hybrid auto-encoder-based model. Electronics 2022, 11, 2844. [Google Scholar] [CrossRef]
- Israeli, A.; Nahum, Y.; Fine, S.; Bar, K. The idc system for sentiment classification and sarcasm detection in Arabic. In Proceedings of the Sixth Arabic Natural Language Processing Workshop, Kiev, Ukraine, 19 April 2021; pp. 370–375. [Google Scholar]
- Băroiu, A.C.; Trăușan-Matu, Ș. Automatic Sarcasm Detection: Systematic Literature Review. Information 2022, 13, 399. [Google Scholar] [CrossRef]
- AlMazrua, H.; AlHazzani, N.; AlDawod, A.; AlAwlaqi, L.; AlReshoudi, N.; Al-Khalifa, H.; AlDhubayi, L. Sa ‘7r: A Saudi Dialect Irony Dataset. In Proceedings of the 5th Workshop on Open-Source Arabic Corpora and Processing Tools with Shared Tasks on Qur’an QA and Fine-Grained Hate Speech Detection, Marseille, France, 20–25 June 2022; pp. 60–70. [Google Scholar]
- Yang, F.; Mukherjee, A.; Dragut, E. Satirical news detection and analysis using attention mechanism and linguistic features. arXiv 2017, arXiv:1709.01189. [Google Scholar]
- Rendalkar, S.; Chandankhede, C. Sarcasm detection of online comments using emotion detection. In Proceedings of the 2018 International Conference on Inventive Research in Computing Applications (Icirca), IEEE, Coimbatore, India, 11–12 July 2018; pp. 1244–1249. [Google Scholar]
- Ekman, P.; Sorenson, E.R.; Friesen, W.V. Pan-cultural elements in facial displays of emotion. Science 1969, 164, 86–88. [Google Scholar] [CrossRef] [PubMed]
- Saad, M. Mining Documents and Sentiments in Cross-lingual Context. Ph.D. Thesis, Université de Lorraine, Lorraine, France, 2015. [Google Scholar]
- Abdelali, A.; Darwish, K.; Durrani, N.; Mubarak, H. Farasa: A fast and furious segmenter for arabic. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, San Diego, CA, USA, 12–17 July 2016; pp. 11–16. [Google Scholar]
- Alsmearat, K.; Al-Ayyoub, M.; Al-Shalabi, R.; Kanaan, G. Author gender identification from Arabic text. J. Inf. Secur. Appl. 2017, 35, 85–95. [Google Scholar] [CrossRef]
- Alwajeeh, A.; Al-Ayyoub, M.; Hmeidi, I. On authorship authentication of arabic articles. In Proceedings of the 2014 5th International Conference on Information and Communication Systems (ICICS), IEEE, Irbid, Jordan, 1–3 April 2014; pp. 1–6. [Google Scholar]
- Burgoon, J.K.; Blair, J.P.; Qin, T.; Nunamaker, J.F. Detecting deception through linguistic analysis. In Proceedings of the International Conference on Intelligence and Security Informatics, San Antonio, TX, USA, 2–3 November 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 91–101. [Google Scholar]
- Gröndahl, T.; Asokan, N. Text analysis in adversarial settings: Does deception leave a stylistic trace? ACM Comput. Surv. (CSUR) 2019, 52, 1–36. [Google Scholar] [CrossRef]
- Hajja, M.; Yahya, A.; Yahya, A. Authorship attribution of arabic articles. In Proceedings of the International Conference on Arabic Language Processing, Nancy, France, 16–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 194–208. [Google Scholar]
- Himdi, H.; Weir, G.; Assiri, F.; Al-Barhamtoshy, H. Arabic fake news detection based on textual analysis. Arab. J. Sci. Eng. 2022, 47, 10453–10469. [Google Scholar] [CrossRef]
- Ghannay, S.; Esteve, Y.; Camelin, N.; Dutrey, C.; Santiago, F.; Adda-Decker, M. Combining continuous word representation and prosodic features for asr error prediction. In Proceedings of the Statistical Language and Speech Processing: Third International Conference, SLSP 2015, Proceedings 3, Budapest, Hungary, 24–26 November 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 84–95. [Google Scholar]
- Ghannay, S.; Favre, B.; Esteve, Y.; Camelin, N. Word embedding evaluation and combination. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portoroz, Slovenia, 23–28 May 2016; pp. 300–305. [Google Scholar]
- Naseem, U.; Razzak, I.; Eklund, P.; Musial, K. Towards improved deep contextual embedding for the identification of irony and sarcasm. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, Glasgow, UK, 19–24 July 2020; pp. 1–7. [Google Scholar]
- Ranasinghe, T.; Saadany, H.; Plum, A.; Mandhari, S.; Mohamed, E.; Orasan, C.; Mitkov, R. RGCL at IDAT: Deep Learning Models for Irony Detection in Arabic Language; University of Wolverhampton: Wolverhampton, UK, 2019. [Google Scholar]
- Joulin, A.; Grave, E.; Bojanowski, P.; Douze, M.; Jégou, H.; Mikolov, T. Fasttext. zip: Compressing text classification models. arXiv 2016, arXiv:1612.03651. [Google Scholar]
- Bojanowski, P.; Grave, E.; Joulin, A.; Mikolov, T. Enriching word vectors with subword information. Trans. Assoc. Comput. Linguist. 2017, 5, 135–146. [Google Scholar] [CrossRef]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
- Soliman, A.B.; Eissa, K.; El-Beltagy, S.R. Aravec: A set of arabic word embedding models for use in arabic nlp. Procedia Comput. Sci. 2017, 117, 256–265. [Google Scholar] [CrossRef]
- Hindocha, E.; Yazhiny, V.; Arunkumar, A.; Boobalan, P. Short-text Semantic Similarity using GloVe word embedding. Int. Res. J. Eng. Technol. 2019, 6, 553–558. [Google Scholar]
- Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
- Shah, K.; Patel, H.; Sanghvi, D.; Shah, M. A comparative analysis of logistic regression, random forest and KNN models for the text classification. Augment. Hum. Res. 2020, 5, 1–16. [Google Scholar] [CrossRef]
- Chen, H.; Wu, L.; Chen, J.; Lu, W.; Ding, J. A comparative study of automated legal text classification using random forests and deep learning. Inf. Process. Manag. 2022, 59, 102798. [Google Scholar] [CrossRef]
- Pranckevičius, T.; Marcinkevičius, V. Comparison of naive bayes, random forest, decision tree, support vector machines, and logistic regression classifiers for text reviews classification. Balt. J. Mod. Comput. 2017, 5, 221. [Google Scholar] [CrossRef]
- Omar, A.; Mahmoud, T.M.; Abd-El-Hafeez, T.; Mahfouz, A. Multi-label arabic text classification in online social networks. Inf. Syst. 2021, 100, 101785. [Google Scholar] [CrossRef]
- Al Qadi, L.; El Rifai, H.; Obaid, S.; Elnagar, A. Arabic text classification of news articles using classical supervised classifiers. In Proceedings of the 2019 2nd International Conference on New Trends In Computing Sciences (ICTCS), IEEE, Amman, Jordan, 9–11 October 2019; pp. 1–6. [Google Scholar]
- Osisanwo, F.; Akinsola, J.; Awodele, O.; Hinmikaiye, J.; Olakanmi, O.; Akinjobi, J. Supervised machine learning algorithms: Classification and comparison. Int. J. Comput. Trends Technol. (IJCTT) 2017, 48, 128–138. [Google Scholar]
- Vijayan, V.K.; Bindu, K.; Parameswaran, L. A comprehensive study of text classification algorithms. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), IEEE, Manipal, India, 13–16 September 2017; pp. 1109–1113. [Google Scholar]
- Xie, J.; Su, B.; Li, C.; Lin, K.; Li, H.; Hu, Y.; Kong, G. A review of modeling methods for predicting in-hospital mortality of patients in intensive care unit. J. Emerg. Crit. Care Med. 2017, 1, 1–10. [Google Scholar] [CrossRef]
- George, J.; Skariah, S.M.; Xavier, T.A. Role of contextual features in fake news detection: A review. In Proceedings of the 2020 international conference on innovative trends in information technology (ICITIIT), IEEE, Kottayam, India, 13–14 February 2020; pp. 1–6. [Google Scholar]
- Shaji, A.; Binu, S.; Nair, A.M.; George, J. Fraud Detection in Credit Card Transaction Using ANN and SVM. In Proceedings of the International Conference on Ubiquitous Communications and Network Computing, Bangalore, India, 8–10 February 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 187–197. [Google Scholar]
- Pramanik, P.K.D.; Pal, S.; Mukhopadhyay, M.; Singh, S.P. Big Data classification: Techniques and tools. In Applications of Big Data in Healthcare; Khanna, A., Gupta, D., Dey, N., Eds.; Academic Press: Cambridge, MA, USA, 2021; pp. 1–43. [Google Scholar]
- Learning, M. Machine Learning Plus. 2021. Available online: https://www.machinelearningplus.com/ (accessed on 1 September 2023).
- Grover, K. Advantages and Disadvantages of Logistic Regression. 2022. Available online: https://iq.opengenus.org/advantages-and-disadvantages-of-logistic-regression/ (accessed on 1 September 2023).
- Genuer, R.; Poggi, J.M.; Tuleau-Malot, C.; Villa-Vialaneix, N. Random forests for big data. Big Data Res. 2017, 9, 28–46. [Google Scholar] [CrossRef]
- Razali, M.S.; Halin, A.A.; Chow, Y.W.; Norowi, N.M.; Doraisamy, S. Context-Driven Satire Detection with Deep Learning. IEEE Access 2022, 10, 78780–78787. [Google Scholar] [CrossRef]
- Zhang, M.; Zhang, Y.; Fu, G. Tweet sarcasm detection using deep neural network. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2016, Osaka, Japan, 1–16 December 2016; pp. 2449–2460. [Google Scholar]
- Venkatesh, B.; Vishwas, H. Real time sarcasm detection on twitter using ensemble methods. In Proceedings of the 2021 Third International Conference on Inventive Research in Computing Applications (ICIRCA) IEEE, Coimbatore, India, 2–4 September 2021; pp. 1292–1297. [Google Scholar]
- Kenton, J.D.M.W.C.; Toutanova, L.K. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Proceedings of naacL-HLT, Minneapolis, MN, USA, 2–7 June 2019; Volume 1, p. 2. [Google Scholar]
- Antoun, W.; Baly, F.; Hajj, H. Arabert: Transformer-based model for arabic language understanding. arXiv 2020, arXiv:2003.00104. [Google Scholar]
- Alnabrisi, I.; Saad, M. Detect Arabic Fake News Through Deep Learning Models and Transformers; Available at SSRN 4341610; SSRN: Rochester, NY, USA, 2023. [Google Scholar]
- Rehana, H.; Çam, N.B.; Basmaci, M.; He, Y.; Özgür, A.; Hur, J. Evaluation of GPT and BERT-based models on identifying protein–protein interactions in biomedical text. arXiv 2023, arXiv:2303.17728. [Google Scholar]
- Antoun, W.; Baly, F.; Hajj, H. AraGPT2: Pre-trained transformer for Arabic language generation. arXiv 2020, arXiv:2012.15520. [Google Scholar]
- Cer, D.M.; De Marneffe, M.C.; Jurafsky, D.; Manning, C.D. Parsing to Stanford Dependencies: Trade-offs between Speed and Accuracy. In Proceedings of the LREC, Floriana, Malta, 19–21 May 2010. [Google Scholar]
- Abu Farha, I.; Magdy, W. From Arabic Sentiment Analysis to Sarcasm Detection: The ArSarcasm Dataset. In Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, Marseille, France, 12 May 2020; pp. 32–39. [Google Scholar]
- Braga, I.A. Evaluation of stopwords removal on the statistical approach for automatic term extraction. In Proceedings of the 2009 Seventh Brazilian Symposium in Information and Human Language Technology, IEEE, Sao Carlos, Brazil, 8–11 September 2009; pp. 142–149. [Google Scholar]
- Rubin, V.L.; Chen, Y.; Conroy, N.K. Deception detection for news: Three types of fakes. Proc. Assoc. Inf. Sci. Technol. 2015, 52, 1–4. [Google Scholar] [CrossRef]
- Ermida, I. News satire in the press: Linguistic construction of humour inspoof news articles. In Language and Humour in the Media; Cambridge Scholars Publishing: Newcastle upon Tyne, UK, 2012; p. 185. [Google Scholar]
Data | Satire | Non-Satire |
---|---|---|
No. of Articles | 768 | 768 |
Avg. no. of sentences | 3 | 3 |
Avg. no. of characters | 453 | 483 |
Avg. word length | 6.74 | 6.83 |
Avg. sentence length | 45.52 | 36.35 |
Category | Features
---|---
POS (content words) | Nouns, Verbs, Adjectives, Adverbs, Proper Nouns
POS (function words) | Conjunctions, Prepositions, Pronouns, Particles, Determiners
Emotion | Anger, Sadness, Fear, Joy, Disgust, Surprise
Linguistic | Assurance, Negations, Illustration, Intensifiers, Hedges, Temporal, Spatial, Exclusion, Opposition, Justification
Technique | Description |
---|---|
N-Grams | Contiguous sequences of n tokens that can be words, characters, or other units extracted from a given text. |
Textual Features | Refer to various characteristics or properties of text that can be extracted or analyzed to gain insights into the text |
Word Embeddings | Capture semantic relationships between words by representing each word as a point in the embedding space, where similar words are closer to each other. |
Model | Parameters |
---|---|
SVM | batchSize 100, kernel linear
NB | batchSize 100
LR | batchSize 100, maxBoostingIterations 500
RF | batchSize 100, bagging with numIterations 100, and number of trees 100
Parameter | Value |
---|---|
Batch size | 32, 64, and 128 |
Epochs | range 1 to 15 |
Dropout Rate | 0.2 |
ML | NB | SVM | LR | RF | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Class | P | R | F1 | P | R | F1 | P | R | F1 | P | R | F1 |
Uni | 68.0 | 68.1 | 68.0 | 62.2 | 64.9 | 62.2 | 70.0 | 70.2 | 69.9 | 68.9 | 69.2 | 68.9 |
Bi | 57.3 | 60.5 | 53.4 | 49.7 | 50.0 | 39.4 | 59.0 | 62.7 | 59.0 | 57.7 | 61.0 | 54.0 |
Tri | 49.9 | 49.8 | 44.9 | 49.3 | 24.6 | 49.3 | 49.9 | 49.2 | 41.0 | 49.8 | 49.6 | 47.7 |
DL | CNN | Bi-LSTM | CNN & Bi-LSTM | ||||||
---|---|---|---|---|---|---|---|---|---|
Class | P | R | F1 | P | R | F1 | P | R | F1 |
Uni | 94.4 | 95.2 | 94.1 | 94.2 | 91.4 | 93.7 | 89.1 | 97.3 | 93.5 |
Bi | 63.1 | 99.1 | 77.4 | 70.1 | 91.1 | 79.0 | 74.2 | 99.0 | 85.3 |
Tri | 70.1 | 64.2 | 67.6 | 53.1 | 92.7 | 67.3 | 99.0 | 30.2 | 46.3 |
ML | NB | SVM | LR | RF | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Class | P | R | F1 | P | R | F1 | P | R | F1 | P | R | F1 |
Emo | 41.3 | 45.2 | 46.5 | 55.6 | 55.3 | 55.3 | 50.8 | 45.6 | 48.8 | 59.2 | 59.3 | 59.2 |
POS | 59.9 | 59.6 | 59.9 | 67.6 | 67.6 | 67.6 | 62.9 | 63.1 | 63.0 | 60.4 | 60.4 | 60.3 |
Ling | 61.3 | 60.2 | 58.7 | 60.1 | 62.4 | 62.1 | 63.1 | 62.9 | 62.8 | 62.5 | 62.5 | 62.5 |
Comb | 67.5 | 67.5 | 68.9 | 67.0 | 67.0 | 67.0 | 63.7 | 63.8 | 64.1 | 72.7 | 73.5 | 72.6 |
DL | CNN | Bi-LSTM | CNN & Bi-LSTM | | | | | | |
---|---|---|---|---|---|---|---|---|---|
Class | P | R | F1 | P | R | F1 | P | R | F1 |
Emo | 50.2 | 43.8 | 42.1 | 41.5 | 37.1 | 34.9 | 53.2 | 45.7 | 43.6 |
POS | 62.3 | 55.2 | 54.8 | 58.7 | 46.7 | 42.0 | 59.4 | 54.3 | 54.4 |
Ling | 61.1 | 51.4 | 49.6 | 64.9 | 63.8 | 64.2 | 64.8 | 51.4 | 48.1 |
Comb | 60.9 | 57.1 | 47.5 | 39.7 | 37.1 | 27.8 | 63.1 | 60.0 | 60.5 |
ML | NB | SVM | LR | RF | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Class | P | R | F1 | P | R | F1 | P | R | F1 | P | R | F1 |
Fast | 76.0 | 78.0 | 79.0 | 81.0 | 80.0 | 82.0 | 88.0 | 87.0 | 89.0 | 85.0 | 85.6 | 87.0 |
W2V | 82.0 | 83.0 | 81.0 | 82.0 | 84.0 | 82.0 | 78.0 | 79.0 | 79.0 | 91.0 | 87.5 | 89.0 |
GloVe | 86.0 | 91.0 | 88.0 | 89.0 | 95.0 | 91.0 | 89.0 | 92.0 | 90.0 | 86.0 | 92.0 | 87.5 |
DL | CNN | Bi-LSTM | CNN & Bi-LSTM | ||||||
---|---|---|---|---|---|---|---|---|---|
Class | P | R | F1 | P | R | F1 | P | R | F1 |
Fast | 89.8 | 89.8 | 89.8 | 86.2 | 86.2 | 86.2 | 84.2 | 84.2 | 84.2
W2V | 86.1 | 96.0 | 91.0 | 83.1 | 89.0 | 86.0 | 92.0 | 96.0 | 94.0 |
GloVe | 86.0 | 91.0 | 89.0 | 73.0 | 95.0 | 83.0 | 96.0 | 90.0 | 91.0 |
BERT | GPT | ||||
---|---|---|---|---|---|
P | R | F1 | P | R | F1 |
94.0 | 97.0 | 95.0 | 82.0 | 89.1 | 85.0 |
Class | Satire | Non-Satire |
---|---|---|
Nouns | 28.6 | 29.4 |
Verbs | 5.57 | 6.65 |
Prepositions | 9.72 | 9.99 |
Determiners | 16.5 | 15.5 |
Interjections | 0.01 | 0.04 |
Adverbs | 0.22 | 0.29 |
Adjectives | 6.30 | 5.77 |
Conjunctions | 5.06 | 5.9 |
Proper nouns | 4.6 | 5.2 |
Pronouns | 6.10 | 2.33 |
Anger | 0.08 | 0.096 |
Sadness | 0.04 | 0.033 |
Fear | 0.06 | 0.05 |
Joy | 0.12 | 0.133 |
Disgust | 0.003 | 0.015 |
Surprise | 0.02 | 0.032 |
Assurances | 0.04 | 0.07 |
Negations | 0.14 | 0.277 |
Illustrations | 0.06 | 0.05 |
Intensifiers | 0.03 | 0.14 |
Hedges | 0.03 | 0.05 |
Justifications | 0.09 | 0.04 |
Temporal | 0.08 | 0.08 |
Spatial | 0.07 | 0.08 |
Exclusive | 0.018 | 0.02 |
Oppositions | 0.03 | 0.18 |