Article

Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework

Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(18), 2975; https://doi.org/10.3390/math13182975
Submission received: 30 July 2025 / Revised: 28 August 2025 / Accepted: 3 September 2025 / Published: 14 September 2025

Abstract

In this work, we propose a knowledge-aware approach for Arabic automatic question generation (QG) that leverages the multilingual T5 (mT5) transformer augmented with a pre-trained Arabic question-answering model to address challenges posed by Arabic’s morphological richness and limited QG resources. Our system generates both subjective questions and multiple-choice questions (MCQs) with contextually relevant distractors through a dual-model pipeline that tailors the decoding strategy to each subtask: the question generator employs beam search to maximize semantic fidelity and lexical precision, while the distractor generator uses top-k sampling to enhance diversity and contextual plausibility. The QG model is fine-tuned on Arabic SQuAD, and the distractor model is trained on a curated combination of ARCD and Qudrat. Experimental results show that beam search significantly outperforms top-k sampling for fact-based question generation, achieving a BLEU-4 score of 27.49 and a METEOR score of 25.18, surpassing fine-tuned AraT5 and translated English–Arabic baselines. In contrast, top-k sampling is more effective for distractor generation, yielding higher BLEU scores and producing distractors that are more diverse yet remain pedagogically valid, with a BLEU-1 score of 20.28 establishing a strong baseline in the absence of prior Arabic benchmarks. Human evaluation further confirms the quality of the generated questions. This work advances Arabic QG by providing a scalable, knowledge-aware solution with applications in educational technology, while demonstrating the critical role of task-specific decoding strategies and setting a foundation for future research in automated assessment.
Keywords: Arabic question generation; multiple-choice question generation; knowledge-aware NLP; low-resource language processing; beam search; top-k sampling
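The abstract's central finding is that the two subtasks favor different decoding strategies: beam search for faithful, fact-based questions, and top-k sampling for diverse distractors. The sketch below is not from the paper's code; it is a minimal standard-library illustration of that contrast, using beam search in its width-1 (greedy argmax) special case against top-k sampling over a toy next-token logit vector.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    # Beam search with beam width 1 degenerates to greedy argmax:
    # the same token is chosen every time, maximizing fidelity.
    return max(range(len(logits)), key=lambda i: logits[i])

def top_k_sample(logits, k, rng):
    # Keep only the k highest-scoring tokens, renormalize their
    # probabilities, and sample -- this injects the diversity that
    # distractor generation benefits from.
    idx = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in idx])
    return rng.choices(idx, weights=probs, k=1)[0]

rng = random.Random(0)
logits = [4.0, 3.5, 1.0, 0.5, -2.0]  # toy next-token scores

print(greedy(logits))  # → 0 (always the top-scoring token)
samples = {top_k_sample(logits, k=3, rng=rng) for _ in range(50)}
print(sorted(samples))  # varies across the top-3 token ids only
```

With a sequence-to-sequence library such as Hugging Face Transformers, this contrast would typically correspond to calling `model.generate` with `num_beams` set for the question generator versus `do_sample=True, top_k=...` for the distractor generator, though the paper's exact configuration is not reproduced here.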

Share and Cite

MDPI and ACS Style

Jabr, R.B.; Azmi, A.M. Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework. Mathematics 2025, 13, 2975. https://doi.org/10.3390/math13182975

AMA Style

Jabr RB, Azmi AM. Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework. Mathematics. 2025; 13(18):2975. https://doi.org/10.3390/math13182975

Chicago/Turabian Style

Jabr, Reham Bin, and Aqil M. Azmi. 2025. "Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework" Mathematics 13, no. 18: 2975. https://doi.org/10.3390/math13182975

APA Style

Jabr, R. B., & Azmi, A. M. (2025). Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework. Mathematics, 13(18), 2975. https://doi.org/10.3390/math13182975

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
