Search Results (57)

Search Parameters:
Keywords = Arabic NLP

23 pages, 1004 KB  
Review
Toward Transparent Modeling: A Scoping Review of Explainability for Arabic Sentiment Analysis
by Afnan Alsehaimi, Amal Babour and Dimah Alahmadi
Appl. Sci. 2025, 15(19), 10659; https://doi.org/10.3390/app151910659 - 2 Oct 2025
Abstract
The increasing prevalence of Arabic text in digital media offers significant potential for sentiment analysis. However, challenges such as linguistic complexity and limited resources make Arabic sentiment analysis (ASA) particularly difficult. In addition, explainable artificial intelligence (XAI) has become crucial for improving the transparency and trustworthiness of artificial intelligence (AI) models. This paper addresses the integration of XAI techniques in ASA through a scoping review of recent developments. This study critically identifies trends in model usage, examines explainability methods, and explores how these techniques enhance the explainability of model decisions. This review is crucial for consolidating fragmented efforts, identifying key methodological trends, and guiding future research in this emerging area. Online databases (IEEE Xplore, ACM Digital Library, Scopus, Web of Science, ScienceDirect, and Google Scholar) were searched to identify papers published between 1 January 2016 and 31 March 2025. The last search across all databases was conducted on 1 April 2025. From these, 19 peer-reviewed journal articles and conference papers focusing on ASA with explicit use of XAI techniques were selected for inclusion. This time frame was chosen to capture the most recent decade of research, reflecting advances in deep learning and in transformer-based and explainable AI methods. The findings indicate that transformer-based models and deep learning approaches dominate in ASA, achieving high accuracy, and that local interpretable model-agnostic explanations (LIME) is the most widely used explainability tool. However, challenges such as dialectal variation, small or imbalanced datasets, and the black-box nature of advanced models persist. To address these challenges, future research directions should include the creation of richer Arabic sentiment datasets, the development of hybrid explainability models, and the enhancement of adversarial robustness.
Full article

21 pages, 2253 KB  
Article
Legal Judgment Prediction in the Saudi Arabian Commercial Court
by Ashwaq Almalki, Safa Alsafari and Noura M. Alotaibi
Future Internet 2025, 17(10), 439; https://doi.org/10.3390/fi17100439 - 26 Sep 2025
Abstract
Legal judgment prediction is an emerging application of artificial intelligence in the legal domain, offering significant potential to enhance legal decision support systems. Such systems can improve judicial efficiency, reduce burdens on legal professionals, and assist in early-stage case assessment. This study focused on predicting whether a legal case would be Accepted or Rejected using only the Fact section of court rulings. A key challenge lay in processing long legal documents, which often exceeded the input length limitations of transformer-based models. To address this, we proposed a two-step methodology: first, each document was segmented into sentence-level inputs compatible with AraBERT—a pretrained Arabic transformer model—to generate sentence-level predictions; second, these predictions were aggregated to produce a document-level decision using several methods, including Mean, Max, Confidence-Weighted, and Positional aggregation. We evaluated the approach on a dataset of 19,822 real-world cases collected from the Saudi Arabian Commercial Court. Among all aggregation methods, the Confidence-Weighted method applied to the AraBERT-based classifier achieved the highest performance, with an overall accuracy of 85.62%. The results demonstrated that combining sentence-level modeling with effective aggregation methods provides a scalable and accurate solution for Arabic legal judgment prediction, enabling full-length document processing without truncation. Full article
(This article belongs to the Special Issue Deep Learning and Natural Language Processing—3rd Edition)
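The aggregation step in the abstract above lends itself to a compact sketch. The weighting below (confidence taken as distance from 0.5) is one plausible reading of "Confidence-Weighted" aggregation, not the paper's exact formula; the function names and the 0.5 threshold are illustrative.

```python
def mean_aggregate(probs):
    """Mean aggregation: average of per-sentence P(Accepted)."""
    return sum(probs) / len(probs)

def max_aggregate(probs):
    """Max aggregation: the single most positive sentence decides."""
    return max(probs)

def confidence_weighted(probs):
    """Weight each sentence by |p - 0.5|, so near-neutral sentences
    contribute little to the document-level probability."""
    weights = [abs(p - 0.5) for p in probs]
    total = sum(weights)
    if total == 0:  # every sentence exactly neutral: fall back to the mean
        return mean_aggregate(probs)
    return sum(w * p for w, p in zip(weights, probs)) / total

def decide(sentence_probs, threshold=0.5):
    """Document-level decision from sentence-level probabilities."""
    return "Accepted" if confidence_weighted(sentence_probs) >= threshold else "Rejected"
```

Because each sentence is scored independently, a document of any length can be processed this way without truncation, which is the point of the two-step design.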

31 pages, 1887 KB  
Article
ZaQQ: A New Arabic Dataset for Automatic Essay Scoring via a Novel Human–AI Collaborative Framework
by Yomna Elsayed, Emad Nabil, Marwan Torki, Safiullah Faizullah and Ayman Khalafallah
Data 2025, 10(9), 148; https://doi.org/10.3390/data10090148 - 19 Sep 2025
Viewed by 331
Abstract
Automated essay scoring (AES) has become an essential tool in educational assessment. However, applying AES to the Arabic language presents notable challenges, primarily due to the lack of labeled datasets. This data scarcity hampers the development of reliable machine learning models and slows progress in Arabic natural language processing for educational use. While manual annotation by human experts remains the most accurate method for essay evaluation, it is often too costly and time-consuming to create large-scale datasets, especially for low-resource languages like Arabic. In this work, we introduce a human–AI collaborative framework designed to overcome the shortage of scored Arabic essays. Leveraging QAES, a high-quality annotated dataset, our approach uses Large Language Models (LLMs) to generate multidimensional essay evaluations across seven key writing traits: Relevance, Organization, Vocabulary, Style, Development, Mechanics, and Structure. To ensure accuracy and consistency, we design prompting strategies and validation procedures tailored to each trait. This system is then applied to two unannotated Arabic essay datasets: ZAEBUC and QALB. As a result, we introduce ZaQQ, a newly annotated dataset that merges ZAEBUC, QAES, and QALB. Our findings demonstrate that human–AI collaboration can significantly enhance the availability of labeled resources without compromising assessment quality. The proposed framework serves as a scalable and replicable model for addressing data annotation challenges in low-resource languages and supports the broader goal of expanding access to automated educational assessment tools where expert evaluation is limited. Full article

31 pages, 799 KB  
Article
Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework
by Reham Bin Jabr and Aqil M. Azmi
Mathematics 2025, 13(18), 2975; https://doi.org/10.3390/math13182975 - 14 Sep 2025
Viewed by 389
Abstract
In this work, we propose a knowledge-aware approach for Arabic automatic question generation (QG) that leverages the multilingual T5 (mT5) transformer augmented with a pre-trained Arabic question-answering model to address challenges posed by Arabic’s morphological richness and limited QG resources. Our system generates both subjective questions and multiple-choice questions (MCQs) with contextually relevant distractors through a dual-model pipeline that tailors the decoding strategy to each subtask: the question generator employs beam search to maximize semantic fidelity and lexical precision, while the distractor generator uses top-k sampling to enhance diversity and contextual plausibility. The QG model is fine-tuned on Arabic SQuAD, and the distractor model is trained on a curated combination of ARCD and Qudrat. Experimental results show that beam search significantly outperforms top-k sampling for fact-based question generation, achieving a BLEU-4 score of 27.49 and a METEOR score of 25.18, surpassing fine-tuned AraT5 and translated English–Arabic baselines. In contrast, top-k sampling is more effective for distractor generation, yielding higher BLEU scores and producing distractors that are more diverse yet remain pedagogically valid, with a BLEU-1 score of 20.28 establishing a strong baseline in the absence of prior Arabic benchmarks. Human evaluation further confirms the quality of the generated questions. This work advances Arabic QG by providing a scalable, knowledge-aware solution with applications in educational technology, while demonstrating the critical role of task-specific decoding strategies and setting a foundation for future research in automated assessment. Full article
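The task-specific decoding strategies described above can be illustrated in miniature over a single-step token distribution. This is a generic sketch of top-k sampling versus greedy (beam width 1) selection, not the paper's mT5 pipeline; the token scores and names are invented.

```python
import math
import random

def softmax(scores):
    """Convert raw token scores to probabilities (numerically stable)."""
    m = max(scores.values())
    exps = {tok: math.exp(s - m) for tok, s in scores.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

def greedy_step(scores):
    """Beam search with beam width 1 degenerates to argmax at each step."""
    return max(scores, key=scores.get)

def top_k_sample(scores, k, rng=random):
    """Top-k sampling: keep the k best tokens, renormalize, then sample."""
    kept = dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k])
    probs = softmax(kept)
    r, acc = rng.random(), 0.0
    for tok, p in probs.items():
        acc += p
        if r <= acc:
            return tok
    return tok  # guard against floating-point round-off
```

The trade-off mirrors the paper's finding: argmax-style decoding favors fidelity (questions), while sampling from the top-k set injects the diversity that distractors need.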

24 pages, 1064 KB  
Article
Arabic Abstractive Text Summarization Using an Ant Colony System
by Amal M. Al-Numai and Aqil M. Azmi
Mathematics 2025, 13(16), 2613; https://doi.org/10.3390/math13162613 - 15 Aug 2025
Viewed by 659
Abstract
Arabic abstractive summarization presents a complex multi-objective optimization challenge, balancing readability, informativeness, and conciseness. While extractive approaches dominate NLP, abstractive methods—particularly for Arabic—remain underexplored due to linguistic complexity. This study introduces, for the first time, ant colony system (ACS) for Arabic abstractive summarization (named AASAC—Arabic Abstractive Summarization using Ant Colony), framing it as a combinatorial evolutionary optimization task. Our method integrates collocation and word-relation features into heuristic-guided fitness functions, simultaneously optimizing content coverage and linguistic coherence. Evaluations on a benchmark dataset using LemmaRouge, a lemma-based metric that evaluates semantic similarity rather than surface word forms, demonstrate consistent superiority. For 30% summaries, AASAC achieves 51.61% (LemmaRouge-1) and 46.82% (LemmaRouge-L), outperforming baselines by 13.23% and 20.49%, respectively. At 50% summary length, it reaches 64.56% (LemmaRouge-1) and 61.26% (LemmaRouge-L), surpassing baselines by 10.73% and 3.23%. These results highlight AASAC’s effectiveness in addressing multi-objective NLP challenges and establish its potential for evolutionary computation applications in language generation, particularly for complex morphological languages like Arabic. Full article

33 pages, 11250 KB  
Article
RADAR#: An Ensemble Approach for Radicalization Detection in Arabic Social Media Using Hybrid Deep Learning and Transformer Models
by Emad M. Al-Shawakfa, Anas M. R. Alsobeh, Sahar Omari and Amani Shatnawi
Information 2025, 16(7), 522; https://doi.org/10.3390/info16070522 - 22 Jun 2025
Cited by 4 | Viewed by 950
Abstract
The recent surge of extremist material on social media platforms poses serious challenges to international cybersecurity and national security efforts. RADAR#, a deep ensemble approach for the detection of radicalization in Arabic tweets, is introduced in this paper. Our model combines a hybrid CNN-Bi-LSTM framework with a leading Arabic transformer model (AraBERT) through a weighted ensemble strategy. We employ domain-specific Arabic tweet pre-processing techniques and a custom attention layer to better focus on radicalization indicators. Experiments on a dataset of 89,816 Arabic tweets indicate that RADAR# reaches 98% accuracy and a 97% F1-score, surpassing advanced approaches. The ensemble strategy is particularly beneficial in handling the dialectal variations and context-sensitive words common in Arabic social media posts. We provide a full performance analysis of the model, including ablation studies and attention visualization for better interpretability. Our contribution is useful to the cybersecurity community as an effective early detection mechanism for online radicalization in Arabic-language content, which can potentially be applied in counter-terrorism and online content moderation. Full article
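The weighted ensemble strategy mentioned above reduces to a weighted average of class probabilities. A minimal sketch, assuming the two branches each emit one probability per class; the weight w and the label names are illustrative, not RADAR#'s tuned values.

```python
def weighted_ensemble(p_hybrid, p_transformer, w=0.5):
    """Blend two models' class-probability vectors: weight w for the
    CNN-Bi-LSTM branch, (1 - w) for the transformer branch."""
    return [w * a + (1 - w) * b for a, b in zip(p_hybrid, p_transformer)]

def ensemble_label(p_hybrid, p_transformer, labels, w=0.5):
    """Final class = argmax of the blended probabilities."""
    probs = weighted_ensemble(p_hybrid, p_transformer, w)
    return labels[probs.index(max(probs))]
```

Blending at the probability level lets each branch compensate for the other's blind spots, which is why ensembles of this shape tend to help with dialectal variation.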

37 pages, 3049 KB  
Article
English-Arabic Hybrid Semantic Text Chunking Based on Fine-Tuning BERT
by Mai Alammar, Khalil El Hindi and Hend Al-Khalifa
Computation 2025, 13(6), 151; https://doi.org/10.3390/computation13060151 - 16 Jun 2025
Cited by 1 | Viewed by 1750
Abstract
Semantic text chunking refers to segmenting text into coherent semantic chunks, i.e., into sets of statements that are semantically related. Semantic chunking is an essential pre-processing step in various NLP tasks, e.g., document summarization, sentiment analysis, and question answering. In this paper, we propose a hybrid, two-step semantic text chunking method that combines the effectiveness of unsupervised semantic chunking, based on the similarities between sentence embeddings, with pre-trained language models (PLMs), especially BERT, by fine-tuning BERT on the semantic textual similarity (STS) task to provide flexible and effective semantic text chunking. We evaluated the proposed method in English and Arabic. To the best of our knowledge, no Arabic dataset exists for assessing semantic text chunking at this level. Therefore, we created AraWiki50k, inspired by an existing English dataset, to evaluate our proposed text chunking method. Our experiments showed that exploiting the fine-tuned pre-trained BERT on STS improves results over unsupervised semantic chunking by an average of 7.4 in the PK metric and 11.19 in the WindowDiff metric on four English evaluation datasets, and by 0.12 in PK and 2.29 in WindowDiff on the Arabic dataset. Full article
(This article belongs to the Section Computational Social Science)
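The unsupervised first step, chunking on similarities between consecutive sentence embeddings, can be sketched as follows. The threshold rule and the toy two-dimensional embeddings are illustrative assumptions; in the paper, the similarity signal comes from BERT fine-tuned on STS.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def chunk_by_similarity(sentences, embed, threshold=0.5):
    """Greedy chunking: start a new chunk whenever the similarity between
    consecutive sentence embeddings drops below the threshold."""
    chunks = [[sentences[0]]]  # assumes a non-empty sentence list
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(cur)) >= threshold:
            chunks[-1].append(cur)
        else:
            chunks.append([cur])
    return chunks
```

With toy embeddings, adjacent similar sentences are grouped and a topical shift opens a new chunk; swapping in better embeddings changes only the `embed` callable, not the segmentation logic.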

28 pages, 925 KB  
Article
Edge Convolutional Networks for Style Change Detection in Arabic Multi-Authored Text
by Abeer Saad Alsheddi and Mohamed El Bachir Menai
Appl. Sci. 2025, 15(12), 6633; https://doi.org/10.3390/app15126633 - 12 Jun 2025
Viewed by 614
Abstract
The style change detection (SCD) task aims to identify the positions of authors’ style changes within multi-authored texts. It has several application areas, such as forensics, cybercrime, and literary analysis. Since 2017, SCD solutions for English have been actively investigated. However, to the best of our knowledge, this task has not yet been investigated for Arabic text. Moreover, most existing SCD solutions represent the boundaries surrounding segments by concatenating them. This shallow concatenation may lose style patterns within each segment and also increases input lengths, while several embedding models restrict these lengths. This study seeks to bridge these gaps by introducing an Edge Convolutional Neural Network for the Arabic SCD task (ECNN-ASCD). It represents boundaries as standalone learnable parameters across layers based on graph neural networks. ECNN-ASCD was trained on an Arabic dataset containing three classes of instances according to difficulty level: easy, medium, and hard. The results show that ECNN-ASCD achieved high F1 scores of 0.9945, 0.9381, and 0.9120 on easy, medium, and hard instances, respectively. The ablation experiments demonstrated the effectiveness of the ECNN-ASCD components. As the first publicly available solution for Arabic SCD, ECNN-ASCD opens the door to more active research on this task and contributes to boosting research in Arabic NLP. Full article
(This article belongs to the Special Issue New Trends in Natural Language Processing)

21 pages, 959 KB  
Review
A Scoping Review of Arabic Natural Language Processing for Mental Health
by Ashwag Alasmari
Healthcare 2025, 13(9), 963; https://doi.org/10.3390/healthcare13090963 - 22 Apr 2025
Cited by 1 | Viewed by 1606
Abstract
Mental health disorders represent a substantial global health concern, impacting millions and placing a significant burden on public health systems. Natural Language Processing (NLP) has emerged as a promising tool for analyzing large textual datasets to identify and predict mental health challenges. The aim of this scoping review is to identify the Arabic NLP techniques employed in mental health research, the specific mental health conditions addressed, and the effectiveness of these techniques in detecting and predicting such conditions. This scoping review was conducted according to the PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews) framework. Studies were included if they focused on the application of NLP techniques, addressed mental health issues (e.g., depression, anxiety, suicidal ideation) within Arabic text data, were published in peer-reviewed journals or conference proceedings, and were written in English or Arabic. The relevant literature was identified through a systematic search of four databases: PubMed, ScienceDirect, IEEE Xplore, and Google Scholar. The results of the included studies revealed a variety of NLP techniques used to address specific mental health issues among Arabic-speaking populations. Commonly utilized techniques included Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), Recurrent Neural Network (RNN), and advanced transformer-based models such as AraBERT and MARBERT. The studies predominantly focused on detecting and predicting symptoms of depression and suicidality from Arabic social media data. The effectiveness of these techniques varied, with transformer-based models like AraBERT and MARBERT demonstrating superior performance, achieving accuracy rates of up to 99.3% and 98.3%, respectively. Traditional machine learning models and RNNs also showed promise but generally lagged in accuracy and depth of insight compared to transformer models.
This scoping review highlights the significant potential of NLP techniques, particularly advanced transformer-based models, in addressing mental health issues among Arabic-speaking populations. Ongoing research is essential to keep pace with the rapidly evolving field and to validate current findings. Full article
(This article belongs to the Special Issue Data Driven Insights in Healthcare)

24 pages, 3284 KB  
Article
Exploring GPT-4 Capabilities in Generating Paraphrased Sentences for the Arabic Language
by Haya Rabih Alsulami and Amal Abdullah Almansour
Appl. Sci. 2025, 15(8), 4139; https://doi.org/10.3390/app15084139 - 9 Apr 2025
Cited by 2 | Viewed by 2678
Abstract
Paraphrasing means expressing the semantic meaning of a text using different words. Paraphrasing has a significant impact on numerous Natural Language Processing (NLP) applications, such as Machine Translation (MT) and Question Answering (QA). Machine Learning (ML) methods are frequently employed to generate new paraphrased text, and the generative method is commonly used for text generation. Generative Pre-trained Transformer (GPT) models have demonstrated effectiveness in various text generation tasks, including summarization, proofreading, and rephrasing of English texts. However, GPT-4’s capabilities in Arabic paraphrase generation have not been extensively studied, despite Arabic being one of the most widely spoken languages. In this paper, the researchers evaluate the capabilities of GPT-4 in text paraphrasing for Arabic. Furthermore, the paper presents a comprehensive evaluation method for paraphrase quality and develops a detailed evaluation framework. The framework comprises Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), Lexical Diversity (LD), Jaccard similarity, and word embedding using the Arabic Bi-directional Encoder Representation from Transformers (AraBERT) model with cosine and Euclidean similarity. This paper illustrates that GPT-4 can effectively produce a new paraphrased sentence that is semantically equivalent to the original sentence, and that the quality framework efficiently ranks paraphrased pairs according to quality criteria. Full article
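Two of the lexical measures in the evaluation framework above, Jaccard similarity and Lexical Diversity, are simple enough to sketch directly. The whitespace tokenization is an illustrative assumption, and BLEU, ROUGE, and the AraBERT embedding similarities are omitted.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity: shared vocabulary over combined vocabulary.
    For paraphrases, lower overlap means more rewording."""
    a, b = set(tokens_a), set(tokens_b)
    return len(a & b) / len(a | b)

def lexical_diversity(tokens):
    """Type-token ratio: unique words over total words."""
    return len(set(tokens)) / len(tokens)
```

Used together with embedding-based similarity, these surface measures help separate genuine paraphrases (high semantic similarity, low lexical overlap) from near-copies.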

36 pages, 4245 KB  
Article
An Unsupervised Integrated Framework for Arabic Aspect-Based Sentiment Analysis and Abstractive Text Summarization of Traffic Services Using Transformer Models
by Alanoud Alotaibi and Farrukh Nadeem
Smart Cities 2025, 8(2), 62; https://doi.org/10.3390/smartcities8020062 - 8 Apr 2025
Cited by 1 | Viewed by 1597
Abstract
Social media is crucial for gathering public feedback on government services, particularly in the traffic sector. While Aspect-Based Sentiment Analysis (ABSA) offers a means to extract actionable insights from user posts, analyzing Arabic content poses unique challenges. Existing Arabic ABSA approaches heavily rely on supervised learning and manual annotation, limiting scalability. To tackle these challenges, we propose an integrated framework combining unsupervised BERTopic-based Aspect Category Detection with distant supervision using a fine-tuned CAMeLBERT model for sentiment classification. This is further complemented by transformer-based summarization through a fine-tuned AraBART model. Key contributions of this paper include: (1) the first comprehensive Arabic traffic services dataset, containing 461,844 tweets, enabling future research in this previously unexplored domain; (2) a novel unsupervised approach for Arabic ABSA that eliminates the need for large-scale manual annotation, using custom FastText embeddings and BERTopic to achieve superior topic clustering; (3) a pioneering integration of aspect detection, sentiment analysis, and abstractive summarization that provides a complete pipeline for analyzing Arabic traffic service feedback; (4) state-of-the-art performance across all tasks, achieving 92% accuracy in ABSA and a ROUGE-L score of 0.79 for summarization, establishing new benchmarks for Arabic NLP in the traffic domain. The framework significantly enhances smart city traffic management by enabling automated processing of citizen feedback, supporting data-driven decision-making, and allowing authorities to monitor public sentiment, identify emerging issues, and allocate resources based on citizen needs, ultimately improving urban mobility and service responsiveness. Full article

16 pages, 12177 KB  
Article
An Advanced Natural Language Processing Framework for Arabic Named Entity Recognition: A Novel Approach to Handling Morphological Richness and Nested Entities
by Saleh Albahli
Appl. Sci. 2025, 15(6), 3073; https://doi.org/10.3390/app15063073 - 12 Mar 2025
Cited by 4 | Viewed by 1544
Abstract
Named Entity Recognition (NER) is a fundamental task in Natural Language Processing (NLP) that supports applications such as information retrieval, sentiment analysis, and text summarization. While substantial progress has been made in NER for widely studied languages like English, Arabic presents unique challenges due to its morphological richness, orthographic ambiguity, and the frequent occurrence of nested and overlapping entities. This paper introduces a novel Arabic NER framework that addresses these complexities through architectural innovations. The proposed model incorporates a Hybrid Feature Fusion Layer, which integrates external lexical features using a cross-attention mechanism and a Gated Lexical Unit (GLU) to filter noise, while a Compound Span Representation Layer employs Rotary Positional Encoding (RoPE) and Bidirectional GRUs to enhance the detection of complex entity structures. Additionally, an Enhanced Multi-Label Classification Layer improves the disambiguation of overlapping spans and assigns multiple entity types where applicable. The model is evaluated on three benchmark datasets—ANERcorp, ACE 2005, and a custom biomedical dataset—achieving an F1-score of 93.0% on ANERcorp and 89.6% on ACE 2005, significantly outperforming state-of-the-art methods. A case study further highlights the model’s real-world applicability in handling compound and nested entities with high confidence. By establishing a new benchmark for Arabic NER, this work provides a robust foundation for advancing NLP research in morphologically rich languages. Full article
(This article belongs to the Special Issue Techniques and Applications of Natural Language Processing)

26 pages, 906 KB  
Article
Large Language Models as Kuwaiti Annotators
by Hana Alostad
Big Data Cogn. Comput. 2025, 9(2), 33; https://doi.org/10.3390/bdcc9020033 - 8 Feb 2025
Viewed by 1433
Abstract
Stance detection for low-resource languages, such as the Kuwaiti dialect, poses a significant challenge in natural language processing (NLP) due to the scarcity of annotated datasets and specialized tools. This study addresses these limitations by evaluating the effectiveness of open large language models (LLMs) in automating stance detection through zero-shot and few-shot prompt engineering, with a focus on the potential of open-source models to achieve performance levels comparable to those of closed-source alternatives. We also highlight the critical distinctions between zero- and few-shot learning, emphasizing their significance for addressing the challenges posed by low-resource languages. Our evaluation involved testing 11 LLMs on a manually labeled dataset of social media posts, including GPT-4o, Gemini Pro 1.5, Mistral-Large, Jais-30B, and AYA-23. As expected, closed-source models such as GPT-4o, Gemini Pro 1.5, and Mistral-Large demonstrated superior performance, achieving maximum F1 scores of 95.4%, 95.0%, and 93.2%, respectively, in few-shot scenarios with English as the prompt template language. However, open-source models such as Jais-30B and AYA-23 achieved competitive results, with maximum F1 scores of 93.0% and 93.1%, respectively, under the same conditions. Furthermore, statistical analysis using ANOVA and Tukey’s HSD post hoc tests revealed no significant differences in overall performance among GPT-4o, Gemini Pro 1.5, Mistral-Large, Jais-30B, and AYA-23. This finding underscores the potential of open-source LLMs as cost-effective and privacy-preserving alternatives for low-resource language annotation. This is the first study comparing LLMs for stance detection in the Kuwaiti dialect. Our findings highlight the importance of prompt design and model consistency in improving the quality of annotations and pave the way for NLP solutions for under-represented Arabic dialects. Full article
(This article belongs to the Special Issue Generative AI and Large Language Models)
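The zero- versus few-shot distinction stressed above comes down to whether labeled demonstrations are packed into the prompt. A minimal sketch of a few-shot prompt builder; the field labels ("Post:", "Stance:") and layout are hypothetical, not the study's actual templates.

```python
def build_few_shot_prompt(instruction, examples, target):
    """Assemble an instruction, labeled demonstrations, and the unlabeled
    target post into a single prompt string for an LLM annotator."""
    blocks = [instruction]
    for text, label in examples:
        blocks.append(f"Post: {text}\nStance: {label}")
    blocks.append(f"Post: {target}\nStance:")  # the model completes the label
    return "\n\n".join(blocks)
```

With an empty `examples` list, the same builder yields a zero-shot prompt, which is exactly the contrast the study evaluates across its 11 models.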

22 pages, 7770 KB  
Article
Advancing Arabic Word Embeddings: A Multi-Corpora Approach with Optimized Hyperparameters and Custom Evaluation
by Azzah Allahim and Asma Cherif
Appl. Sci. 2024, 14(23), 11104; https://doi.org/10.3390/app142311104 - 28 Nov 2024
Cited by 2 | Viewed by 1823
Abstract
The expanding Arabic user base presents a unique opportunity for researchers to tap into vast online Arabic resources. However, the lack of reliable Arabic word embedding models and the limited availability of Arabic corpora pose significant challenges. This paper addresses these gaps by developing and evaluating Arabic word embedding models trained on diverse Arabic corpora, investigating how varying hyperparameter values impact model performance across different NLP tasks. To train our models, we collected data from three distinct sources: Wikipedia, newspapers, and 32 Arabic books, each selected to capture specific linguistic and contextual features of Arabic. Using techniques such as Word2Vec and FastText, we experimented with different hyperparameter configurations, such as vector size, window size, and training algorithms (CBOW and skip-gram), to analyze their impact on model quality. Our models were evaluated using a range of NLP tasks, including sentiment analysis, similarity tests, and an adapted analogy test designed specifically for Arabic. The findings revealed that both corpus size and hyperparameter settings had notable effects on performance. For instance, in the analogy test, a larger vocabulary size significantly improved outcomes, with the FastText skip-gram models excelling at accurately solving analogy questions. For sentiment analysis, vocabulary size was critical, while in similarity scoring the FastText models achieved the highest scores, particularly with smaller window and vector sizes. Overall, our models demonstrated strong performance, achieving 99% and 90% accuracy in sentiment analysis and the analogy test, respectively, along with a similarity score of 8 out of 10. These results underscore the value of our models as a robust tool for Arabic NLP research, addressing a pressing need for high-quality Arabic word embeddings. Full article
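The CBOW versus skip-gram distinction and the window-size hyperparameter discussed above can be made concrete by showing the training examples each algorithm extracts from a token stream. This is a generic sketch of the pair-extraction step, not the authors' training code.

```python
def skipgram_pairs(tokens, window):
    """Skip-gram training pairs: (center, context) for every context token
    within `window` positions of the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_examples(tokens, window):
    """CBOW training examples: (context window, center) — the model predicts
    the center word from its surrounding context."""
    examples = []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + window + 1]
        examples.append((context, center))
    return examples
```

Widening `window` multiplies the number of (center, context) pairs per token, which is one reason window size interacts so strongly with task performance in the study's experiments.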
20 pages, 4970 KB  
Article
Revealing the Next Word and Character in Arabic: An Effective Blend of Long Short-Term Memory Networks and ARABERT
by Fawaz S. Al-Anzi and S. T. Bibin Shalini
Appl. Sci. 2024, 14(22), 10498; https://doi.org/10.3390/app142210498 - 14 Nov 2024
Cited by 2 | Viewed by 1622
Abstract
Arabic raw audio datasets were initially gathered to produce a corresponding signal spectrum, from which Mel-Frequency Cepstral Coefficients (MFCCs) were extracted. The pronunciation dictionary, language model, and acoustic model were then derived from the MFCC features. These outputs were fed into Baidu's Deep Speech model (an ASR system) to obtain the text corpus. Baidu's Deep Speech model was chosen because it converges rapidly toward the global optimum while maintaining low word and character error rates, achieving excellent performance in both isolated and end-to-end speech recognition. The goal of this work is to forecast the next word and character in sequential order, a core task in natural language processing (NLP). This work combines the pre-trained Arabic language model ARABERT with Long Short-Term Memory (LSTM) networks to predict the next word and character in an Arabic text. We used the pre-trained ARABERT embedding to improve the model's capacity to capture semantic relationships within the language, and we trained LSTM + CNN and Markov models on Arabic text data to assess the efficacy of this approach. Python libraries such as TensorFlow, Pickle, Keras, and NumPy were used to implement the model. We extensively assessed its performance on new Arabic text using evaluation metrics including accuracy, word error rate, character error rate, BLEU score, and perplexity. The results show that the combined LSTM + ARABERT and Markov models outperformed the baseline models in predicting the next word or character in Arabic text. Accuracy rates of 64.9% for LSTM, 74.6% for ARABERT + LSTM, and 78% for Markov chain models were achieved for next-word prediction, and accuracy rates of 72% for LSTM, 72.22% for LSTM + CNN, and 73% for ARABERT + LSTM models were achieved for next-character prediction.
This work presents a novel approach to Arabic natural language processing, pointing to future work on precise next-word and next-character forecasting, which can serve as an efficient utility for text generation and machine translation applications.
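The abstract compares LSTM, ARABERT + LSTM, and Markov-chain models for next-word prediction. As a hedged illustration of the simplest of these baselines, here is a first-order Markov next-word predictor in pure Python; the toy corpus is invented for the example and is not the paper's data:

```python
from collections import Counter, defaultdict

def build_markov(tokens):
    """First-order Markov model: map each word to a Counter of the
    words that follow it in the training corpus."""
    successors = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        successors[cur][nxt] += 1
    return successors

def predict_next(model, word):
    """Predict the most frequent successor of `word`; None if unseen."""
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ran".split()
model = build_markov(corpus)
print(predict_next(model, "the"))  # "cat" (follows "the" twice, "mat" once)
```

Unlike the LSTM and ARABERT variants, this baseline conditions only on the single preceding word, which makes its competitive accuracy in the reported results notable.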
