Search Results (71)

Search Parameters:
Keywords = handwritten documents

17 pages, 320 KB  
Article
Language Attitudes of Parents with Russian L1 in Tartu: Transition to Estonian-Medium Education
by Birute Klaas-Lang, Kristiina Praakli and Diana Vender
Languages 2025, 10(9), 218; https://doi.org/10.3390/languages10090218 - 29 Aug 2025
Viewed by 607
Abstract
In 2023, the authors conducted a qualitative study in five bilingual educational institutions (two general education schools and three kindergartens) in Tartu, Estonia, undergoing a transition to Estonian-medium education. The empirical material for this qualitative research was collected during ten discussion evenings with Russian L1 parents, with around 300 attendees. Given the emotional and political sensitivity of the topic, the discussions were documented through researchers’ handwritten field notes and subsequently reconstructed from these notes for thematic analysis following the principles of qualitative content analysis. This study aimed to map the concerns and fears of Russian L1 parents and to collaboratively explore possible solutions. The broader objective was to understand and interpret Russian-speaking parents’ attitudes toward the shift to Estonian-medium instruction. A further aim was to raise language awareness among parents and to help lay a more positive foundation for the transition process. The theoretical framework draws on the notion that parents’ language attitudes significantly influence their children’s perceptions of the value of the language being learned. Our results show that many Russian L1 parents in Tartu consider it important for both Estonian- and Russian-speaking children to study in a shared, Estonian-medium learning environment. At the same time, parents identified several key challenges, including concerns about a decline in education quality, increased academic pressure and stress for children learning in a non-native language, a lack of suitable learning materials, and parents’ limited ability to assist with homework due to their own insufficient proficiency in Estonian.
(This article belongs to the Special Issue Language Attitudes and Language Ideologies in Eastern Europe)
28 pages, 2518 KB  
Article
Enhancing Keyword Spotting via NLP-Based Re-Ranking: Leveraging Semantic Relevance Feedback in the Handwritten Domain
by Stergios Papazis, Angelos P. Giotis and Christophoros Nikou
Electronics 2025, 14(14), 2900; https://doi.org/10.3390/electronics14142900 - 20 Jul 2025
Viewed by 900
Abstract
Handwritten Keyword Spotting (KWS) remains a challenging task, particularly in segmentation-free scenarios where word images must be retrieved and ranked based on their similarity to a query without relying on prior page-level segmentation. Traditional KWS methods primarily focus on visual similarity, often overlooking the underlying semantic relationships between words. In this work, we propose a novel NLP-driven re-ranking approach that refines the initial ranked lists produced by state-of-the-art KWS models. By leveraging semantic embeddings from pre-trained BERT-like Large Language Models (LLMs, e.g., RoBERTa, MPNet, and MiniLM), we introduce a relevance feedback mechanism that improves both verbatim and semantic keyword spotting. Our framework operates in two stages: (1) projecting retrieved word image transcriptions into a semantic space via LLMs and (2) re-ranking the retrieval list using a weighted combination of semantic and exact relevance scores based on pairwise similarities with the query. We evaluate our approach on the widely used George Washington (GW) and IAM collections using two cutting-edge segmentation-free KWS models, which are further integrated into our proposed pipeline. Our results show consistent gains in Mean Average Precision (mAP), with improvements of up to 2.3% (from 94.3% to 96.6%) on GW and 3% (from 79.15% to 82.12%) on IAM. Even when mAP gains are smaller, qualitative improvements emerge: semantically relevant but inexact matches are retrieved more frequently without compromising exact match recall. We further examine the effect of fine-tuning transformer-based OCR (TrOCR) models on historical GW data to align textual and visual features more effectively. Overall, our findings suggest that semantic feedback can enhance retrieval effectiveness in KWS pipelines, paving the way for lightweight hybrid vision-language approaches in handwritten document analysis.
(This article belongs to the Special Issue AI Synergy: Vision, Language, and Modality)
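The weighted combination of semantic and exact relevance scores described in the abstract can be sketched as follows; the toy embeddings, the weight `alpha`, and the function names are illustrative assumptions, not the authors' implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def rerank(query_vec, candidates, alpha=0.5):
    """Re-rank KWS candidates by a weighted mix of the original
    (visual/exact) relevance score and semantic similarity to the query.

    candidates: list of (word, visual_score, embedding_vec)
    alpha: weight on the exact/visual score (illustrative value).
    """
    scored = []
    for word, visual_score, emb in candidates:
        semantic = cosine(query_vec, emb)
        final = alpha * visual_score + (1 - alpha) * semantic
        scored.append((word, final))
    return sorted(scored, key=lambda t: t[1], reverse=True)

# Toy 2-D "embeddings": semantically close words point the same way.
query = (1.0, 0.0)
cands = [
    ("orders", 0.9, (0.2, 0.98)),   # high visual score, weak semantic match
    ("command", 0.6, (0.95, 0.1)),  # lower visual score, semantically close
]
print(rerank(query, cands, alpha=0.4))
```

With a low `alpha`, the semantically related candidate overtakes the visually stronger one, which is the qualitative effect the authors report.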
15 pages, 2124 KB  
Article
Toward Building a Domain-Based Dataset for Arabic Handwritten Text Recognition
by Khawlah Alhefdhi, Abdulmalik Alsalman and Safi Faizullah
Electronics 2025, 14(12), 2461; https://doi.org/10.3390/electronics14122461 - 17 Jun 2025
Viewed by 1563
Abstract
The problem of automatic recognition of handwritten text has recently been widely discussed in the research community. Handwritten text recognition is considered a challenging task for cursive scripts, such as Arabic-language scripts, due to their complex properties. Although the demand for automatic text recognition is growing, especially to assist in digitizing archival documents, limited datasets are available for Arabic handwritten text compared to other languages. In this paper, we present novel work on building the Real Estate and Judicial Documents dataset (REJD dataset), which aims to facilitate the recognition of Arabic text in millions of archived documents. This paper also discusses the use of Optical Character Recognition and deep learning techniques, aiming to serve as the initial version in a series of experiments and enhancements designed to achieve optimal results.
18 pages, 6224 KB  
Data Descriptor
A Structured Dataset for Automated Grading: From Raw Data to Processed Dataset
by Ibidapo Dare Dada, Adio T. Akinwale and Ti-Jesu Tunde-Adeleke
Data 2025, 10(6), 87; https://doi.org/10.3390/data10060087 - 6 Jun 2025
Cited by 1 | Viewed by 2084
Abstract
The increasing volume of student assessments, particularly open-ended responses, presents a significant challenge for educators in ensuring grading accuracy, consistency, and efficiency. This paper presents a structured dataset designed for the development and evaluation of automated grading systems in higher education. The primary objective is to create a high-quality dataset that facilitates the development and evaluation of natural language processing (NLP) models for automated grading. The dataset comprises student responses to open-ended questions from the Management Information Systems (MIS221) and Project Management (MIS415) courses at Covenant University, collected during the 2022/2023 academic session. The responses were originally handwritten, scanned, and transcribed into Word documents. Each response is paired with corresponding scores assigned by human graders, following a detailed marking guide. To assess the dataset’s potential for automated grading applications, several machine learning and transformer-based models were tested, including TF-IDF with Linear Regression, TF-IDF with Cosine Similarity, BERT, SBERT, RoBERTa, and Longformer. The experimental results demonstrate that transformer-based models outperform traditional methods, with Longformer achieving the highest Spearman’s Correlation of 0.77 and the lowest Mean Squared Error (MSE) of 0.04, indicating a strong alignment between model predictions and human grading. The findings highlight the effectiveness of deep learning models in capturing the semantic and contextual meaning of both student responses and marking guides, making it possible to develop more scalable and reliable automated grading solutions. This dataset offers valuable insights into student performance and serves as a foundational resource for integrating educational technology into automated assessment systems. Future work will focus on enhancing grading consistency and expanding the dataset for broader academic applications.
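The TF-IDF with Cosine Similarity baseline mentioned in the abstract can be sketched in a few lines; the marking-guide text, whitespace tokenization, and score scaling below are illustrative assumptions, not the dataset authors' code:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build simple TF-IDF vectors for a small corpus of token lists."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # smoothed IDF
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def cosine(u, v):
    """Cosine similarity between two sparse (dict) vectors."""
    num = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return num / (nu * nv) if nu and nv else 0.0

def grade(response, guide, max_marks=5.0):
    """Score a response by TF-IDF cosine similarity to the marking guide."""
    guide_vec, resp_vec = tfidf_vectors([guide.lower().split(),
                                         response.lower().split()])
    return round(cosine(resp_vec, guide_vec) * max_marks, 2)

guide = "a management information system supports decision making"
print(grade("an information system that supports decision making", guide))
print(grade("football is a popular sport", guide))
```

An on-topic response scores well above an off-topic one; the transformer models in the paper improve on this baseline by capturing meaning beyond shared vocabulary.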
25 pages, 53077 KB  
Article
Close-Range Photogrammetry and RTI for 2.5D Documentation of Painted Surfaces: A Tiryns Mural Case Study
by Georgios Tsairis, Athina Georgia Alexopoulou, Nicolaos Zacharias and Ioanna Kakoulli
Coatings 2025, 15(4), 388; https://doi.org/10.3390/coatings15040388 - 26 Mar 2025
Viewed by 1153
Abstract
Painted surfaces, regardless of their substrate, possess unique elements crucial for their study and interpretation. These elements include geometric characteristics, surface texture, brushwork relief, color layer morphology, and preservation state indicators like overpainting, interventions, cracks, and mechanical deformations. Traditional recording methods such as handwritten or digital descriptions, 2D scale drawings, calipers, rulers, tape measures, sketches, tracings, and conventional or technical photography fall short in capturing the three-dimensional detail necessary for comprehensive analysis. To overcome these limitations, this paper proposes the integration of two digital tools, Close-Range Photogrammetry (SfM-MVS) and Reflectance Transformation Imaging (RTI), which have become accessible with the advancement of computing power. While other 3D imaging tools like laser scanners and structured light systems exist and may be preferred for very specialized applications, such as capturing the texture of the surface with sub-millimeter accuracy, SfM-MVS and RTI offer a cost-efficient and highly accurate alternative, with 3D modeling capabilities and advanced pixel color fidelity, essential for documenting the geometric and color details of painted artifacts. The application of these highly promising methods to the mural paintings from the Palace of Tiryns (Nafplion, Greece) demonstrates their potential, providing significant insights for art historians, researchers, conservators, and curators.
(This article belongs to the Special Issue Coatings for Cultural Heritage: Cleaning, Protection and Restoration)
18 pages, 5623 KB  
Article
Detection of Personality Traits Using Handwriting and Deep Learning
by Daniel Gagiu and Dorin Sendrescu
Appl. Sci. 2025, 15(4), 2154; https://doi.org/10.3390/app15042154 - 18 Feb 2025
Cited by 1 | Viewed by 5714
Abstract
A series of studies has shown the existence of a link between handwriting and a person’s personality traits. There are numerous fields that require a psychological assessment of individuals, where there is a need to determine personality traits in a faster and more efficient manner than that based on classic questionnaires or graphological analysis. The development of image processing and recognition algorithms based on machine learning and deep neural networks has led to a series of applications in the field of graphology. In the present study, a system for automatically extracting handwriting characteristics from written documents and correlating them with the Myers–Briggs Type Indicator is implemented. The system has an architecture composed of three levels, the main level being formed by four convolutional neural networks. To train the networks, a database with different types of handwriting was created. The experimental results show an accuracy ranging between 89% and 96% for handwriting feature recognition and results ranging between 83% and 91% in determining Myers–Briggs indicators.
(This article belongs to the Special Issue Deep Learning for Signal Processing Applications-2nd Edition)
18 pages, 18456 KB  
Article
iForal: Automated Handwritten Text Transcription for Historical Medieval Manuscripts
by Alexandre Matos, Pedro Almeida, Paulo L. Correia and Osvaldo Pacheco
J. Imaging 2025, 11(2), 36; https://doi.org/10.3390/jimaging11020036 - 25 Jan 2025
Cited by 1 | Viewed by 2717
Abstract
The transcription of historical manuscripts aims at making our cultural heritage more accessible to experts and also to the larger public, but it is a challenging and time-intensive task. This paper contributes an automated solution for text layout recognition, segmentation, and recognition to speed up the transcription process of historical manuscripts. The focus is on transcribing Portuguese municipal documents from the Middle Ages in the context of the iForal project, including the contribution of an annotated dataset containing Portuguese medieval documents, notably a corpus of 67 Portuguese royal charters. The proposed system can accurately identify document layouts, isolate the text, segment, and transcribe it. Results for the layout recognition model achieved 0.98 mAP@0.50 and 0.98 precision, while the text segmentation model achieved 0.91 mAP@0.50, detecting 95% of the lines. The text recognition model achieved 8.1% character error rate (CER) and 25.5% word error rate (WER) on the test set. These results can then be validated by palaeographers with less effort, contributing to achieving high-quality transcriptions faster. Moreover, the automatic models developed can be utilized as a basis for the creation of models that perform well for other historical handwriting styles, notably using transfer learning techniques. The contributed dataset has been made available on the HTR United catalogue, which includes training datasets to be used for automatic transcription or segmentation models. The models developed can be used, for instance, on the eScriptorium platform, which is used by a vast community of experts.
(This article belongs to the Section Document Analysis and Processing)
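The CER and WER figures reported above are both edit-distance rates; a minimal sketch of the standard definitions (not the iForal evaluation code) looks like this:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: character edits / reference length."""
    return edit_distance(ref, hyp) / len(ref)

def wer(ref, hyp):
    """Word error rate: the same distance over word sequences."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

print(cer("carta regia", "carta regla"))  # one substitution over 11 chars
print(wer("carta regia", "carta regla"))  # one wrong word out of two
```

A single character substitution inflates WER far more than CER, which is why the paper's 8.1% CER coexists with a 25.5% WER.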
16 pages, 436 KB  
Article
Improved Localization and Recognition of Handwritten Digits on MNIST Dataset with ConvGRU
by Yalin Wen, Wei Ke and Hao Sheng
Appl. Sci. 2025, 15(1), 238; https://doi.org/10.3390/app15010238 - 30 Dec 2024
Cited by 1 | Viewed by 1637
Abstract
Video location prediction for handwritten digits presents unique challenges in computer vision due to the complex spatiotemporal dependencies and the need to maintain digit legibility across predicted frames. While existing deep learning-based video prediction models have shown promise, they often struggle with preserving local details and typically achieve clear predictions for only a limited number of frames. In this paper, we present a novel video location prediction model based on Convolutional Gated Recurrent Units (ConvGRU) that specifically addresses these challenges in the context of handwritten digit sequences. Our approach introduces three key innovations. First, a specialized decoupling model using modified Generative Adversarial Networks (GANs) effectively separates background and foreground information, significantly improving prediction accuracy. Second, an enhanced ConvGRU architecture replaces traditional linear operations with convolutional operations in the gating mechanism, substantially reducing spatiotemporal information loss. Third, an optimized parameter-tuning strategy ensures continuous feature transmission while maintaining computational efficiency. Extensive experiments on both the MNIST dataset and custom mobile datasets demonstrate the effectiveness of our approach. Our model achieves a structural similarity index of 0.913 between predicted and actual sequences, surpassing current state-of-the-art methods by 1.2%. Furthermore, we demonstrate superior performance in long-term prediction stability, with consistent accuracy maintained across extended sequences. Notably, our model reduces training time by 9.5% compared to existing approaches while maintaining higher prediction accuracy. These results establish new benchmarks for handwritten digit video prediction and provide practical solutions for real-world applications in digital education, document processing, and real-time handwriting recognition systems.
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)
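The structural similarity index (SSIM) used above to compare predicted and actual frames can be sketched in its single-window (global) form; real evaluations compute it over local windows with Gaussian weighting, so this is only an illustration of the formula:

```python
def ssim(x, y, L=255.0):
    """Global structural similarity index between two equal-size
    grayscale images given as flat pixel lists (single-window SSIM)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2  # standard stabilizing constants
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

frame = [10, 200, 40, 90]              # toy 2x2 grayscale frame
print(ssim(frame, frame))              # identical frames score 1.0
print(ssim(frame, [12, 190, 60, 80]))  # a perturbed frame scores below 1.0
```

A score of 0.913, as reported in the abstract, therefore indicates predicted frames that are structurally very close to, but not identical with, the ground truth.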
13 pages, 1585 KB  
Article
Analyzing Arabic Handwriting Style through Hand Kinematics
by Vahan Babushkin, Haneen Alsuradi, Muhamed Osman Al-Khalil and Mohamad Eid
Sensors 2024, 24(19), 6357; https://doi.org/10.3390/s24196357 - 30 Sep 2024
Cited by 3 | Viewed by 2182
Abstract
Handwriting style is an important aspect affecting the quality of handwriting. Adhering to one style is crucial for languages that follow cursive orthography and possess multiple handwriting styles, such as Arabic. The majority of available studies analyze Arabic handwriting style from static documents, focusing only on pure styles. In this study, we analyze handwriting samples with mixed styles, pure styles (Ruq’ah and Naskh), and samples without a specific style from dynamic features of the stylus and hand kinematics. We propose a model for classifying handwritten samples into four classes based on adherence to style. The stylus and hand kinematics data were collected from 50 participants who were writing an Arabic text containing all 28 letters and covering most Arabic orthography. A parameter search was conducted to find the best hyperparameters for the model, the optimal sliding window length, and the overlap. The proposed model for style classification achieves an accuracy of 88%. The explainability analysis with Shapley values revealed that hand speed, pressure, and pen slant are among the top 12 important features, with other features contributing nearly equally to style classification. Finally, we explore which features are important for Arabic handwriting style detection.
(This article belongs to the Special Issue Sensor-Based Behavioral Biometrics)
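The sliding-window segmentation whose length and overlap the authors tuned can be sketched generically; the window length and overlap values below are illustrative, not the paper's optimal hyperparameters:

```python
def sliding_windows(samples, length, overlap):
    """Split a kinematics time series into fixed-length windows with a
    given overlap fraction; each window becomes one classifier input."""
    step = max(1, int(length * (1 - overlap)))
    return [samples[i:i + length]
            for i in range(0, len(samples) - length + 1, step)]

series = list(range(10))  # stand-in for pressure/speed/slant samples
print(sliding_windows(series, length=4, overlap=0.5))
```

A larger overlap yields more (highly correlated) training windows per recording, which is the trade-off the parameter search balances against window length.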
15 pages, 3491 KB  
Article
Enhancing Signature Verification Using Triplet Siamese Similarity Networks in Digital Documents
by Sara Tehsin, Ali Hassan, Farhan Riaz, Inzamam Mashood Nasir, Norma Latif Fitriyani and Muhammad Syafrudin
Mathematics 2024, 12(17), 2757; https://doi.org/10.3390/math12172757 - 5 Sep 2024
Cited by 13 | Viewed by 3661
Abstract
In contexts requiring user authentication, such as financial, legal, and administrative systems, signature verification emerges as a pivotal biometric method. Specifically, handwritten signature verification stands out prominently for document authentication. Despite the effectiveness of triplet loss similarity networks in extracting and comparing signatures with forged samples, conventional deep learning models often inadequately capture individual writing styles, resulting in suboptimal performance. Addressing this limitation, our study employs a triplet loss Siamese similarity network for offline signature verification, irrespective of the author. Through experimentation on the publicly available signature datasets 4NSigComp2012, SigComp2011, 4NSigComp2010, and BHsig260, various distance measure techniques alongside the triplet Siamese Similarity Network (tSSN) were evaluated. Our findings underscore the superiority of the tSSN approach, particularly when coupled with the Manhattan distance measure, in achieving enhanced verification accuracy, thereby demonstrating its efficacy in scenarios characterized by close signature similarity.
(This article belongs to the Section D2: Operations Research and Fuzzy Decision Making)
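The triplet objective with a Manhattan distance measure can be sketched as follows; the embeddings and margin are toy values, not outputs of the tSSN model:

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two embedding vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss with a Manhattan distance measure: push the genuine
    (positive) embedding closer to the anchor than the forged (negative)
    one by at least `margin`."""
    return max(0.0, manhattan(anchor, positive)
               - manhattan(anchor, negative) + margin)

anchor  = [0.2, 0.8, 0.1]    # reference signature embedding (toy values)
genuine = [0.25, 0.75, 0.1]  # same writer: small L1 distance
forgery = [0.9, 0.1, 0.6]    # forged sample: large L1 distance
print(triplet_loss(anchor, genuine, forgery))  # constraint satisfied: 0.0
```

A zero loss means the genuine sample already sits closer to the anchor than the forgery by the margin; swapping the roles produces a positive loss that drives training.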
19 pages, 3640 KB  
Article
Recognition of Chinese Electronic Medical Records for Rehabilitation Robots: Information Fusion Classification Strategy
by Jiawei Chu, Xiu Kan, Yan Che, Wanqing Song, Kudreyko Aleksey and Zhengyuan Dong
Sensors 2024, 24(17), 5624; https://doi.org/10.3390/s24175624 - 30 Aug 2024
Viewed by 2013
Abstract
Named entity recognition is a critical task in the electronic medical record management system for rehabilitation robots. Handwritten documents often contain spelling errors and illegible handwriting, and healthcare professionals frequently use different terminologies. These issues adversely affect the robot’s judgment and precise operations. Additionally, the same entity can have different meanings in various contexts, leading to category inconsistencies, which further increase the system’s complexity. To address these challenges, a novel medical entity recognition algorithm for Chinese electronic medical records is developed to enhance the processing and understanding capabilities of rehabilitation robots for patient data. This algorithm is based on a fusion classification strategy. Specifically, a preprocessing strategy is proposed according to clinical medical knowledge, which includes redefining entities, removing outliers, and eliminating invalid characters. Subsequently, a medical entity recognition model is developed to identify Chinese electronic medical records, thereby enhancing the data analysis capabilities of rehabilitation robots. To extract semantic information, the ALBERT network is utilized, and BILSTM and MHA networks are combined to capture the dependency relationships between words, overcoming the problem of different meanings for the same entity in different contexts. The CRF network is employed to determine the boundaries of different entities. The research results indicate that the proposed model significantly enhances the recognition accuracy of electronic medical texts by rehabilitation robots, particularly in accurately identifying entities and handling terminology diversity and contextual differences. This model effectively addresses the key challenges faced by rehabilitation robots in processing Chinese electronic medical texts, and holds important theoretical and practical value.
(This article belongs to the Special Issue Dynamics and Control System Design for Robot Manipulation)
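The CRF stage above outputs entity boundaries, conventionally as BIO tags; a minimal sketch of decoding such a tag sequence into entity spans follows (the tag set and toy text are illustrative, not the paper's schema):

```python
def bio_to_spans(tokens, tags):
    """Decode a BIO tag sequence (as a boundary model such as a CRF
    would emit) into (entity_type, text) spans."""
    spans, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):            # a new entity begins
            if current:
                spans.append((ctype, "".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == ctype:
            current.append(tok)             # entity continues
        else:                               # "O" or inconsistent tag closes it
            if current:
                spans.append((ctype, "".join(current)))
            current, ctype = [], None
    if current:
        spans.append((ctype, "".join(current)))
    return spans

tokens = list("患者左腿骨折")  # toy clinical snippet, one tag per character
tags = ["O", "O", "B-BODY", "I-BODY", "B-SYM", "I-SYM"]
print(bio_to_spans(tokens, tags))
```

Chinese medical NER is typically character-level, which is why the toy example tags individual characters rather than whitespace-separated words.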
16 pages, 5331 KB  
Article
A Gateway API-Based Data Fusion Architecture for Automated User Interaction with Historical Handwritten Manuscripts
by Christos Spandonidis, Fotis Giannopoulos and Kyriakoula Arvaniti
Heritage 2024, 7(9), 4631-4646; https://doi.org/10.3390/heritage7090218 - 27 Aug 2024
Viewed by 1406
Abstract
To preserve handwritten historical documents, libraries are choosing to digitize them, ensuring their longevity and accessibility. However, the true value of these digitized images lies in their transcription into a textual format. In recent years, various tools have been developed utilizing both traditional and AI-based models to address the challenges of deciphering handwritten texts. Despite their importance, there are still several obstacles to overcome, such as the need for scalable and modular solutions, as well as the ability to cater to a continuously growing user community autonomously. This study focuses on introducing a new information fusion architecture, specifically highlighting the Gateway API. Developed as part of the μDoc.tS research program, this architecture aims to convert digital images of manuscripts into electronic text, ensuring secure and efficient routing of requests from front-end applications to the back end of the information system. The validation of this architecture demonstrates its efficiency in handling a large volume of requests and effectively distributing the workload. One significant advantage of this proposed method is its compatibility with everyday devices, eliminating the need for extensive computational infrastructures. It is believed that the scalability and modularity of this architecture can pave the way for a unified multi-platform solution, connecting diverse user environments and databases.
26 pages, 12966 KB  
Article
Optical Medieval Music Recognition—A Complete Pipeline for Historic Chants
by Alexander Hartelt, Tim Eipert and Frank Puppe
Appl. Sci. 2024, 14(16), 7355; https://doi.org/10.3390/app14167355 - 20 Aug 2024
Cited by 3 | Viewed by 1721
Abstract
Manual transcription of music is tedious work, which can be greatly facilitated by optical music recognition (OMR) software. However, OMR software is error-prone, particularly for older handwritten documents. This paper introduces and evaluates a pipeline that automates the entire OMR workflow in the context of the Corpus Monodicum project, enabling the transcription of historical chants. In addition to typical OMR tasks such as staff line detection, layout detection, and symbol recognition, the rarely addressed tasks of text and syllable recognition and assignment of syllables to symbols are tackled. For quantitative and qualitative evaluation, we use documents written in square notation developed in the 11th–12th century, but the methods apply to many other notations as well. Quantitative evaluation measures the number of necessary interventions for correction, which are about 0.4% for layout recognition including the division of text in chants, 2.4% for symbol recognition including pitch and reading order, and 2.3% for syllable alignment with correct text and symbols. Qualitative evaluation showed an efficiency gain compared to manual transcription with an elaborate tool by a factor of about 9. In a second use case with printed chants in similar notation from the “Graduale Synopticum”, the evaluation results for symbols are much better except for syllable alignment, indicating the difficulty of this task.
28 pages, 26533 KB  
Article
End-to-End Deep Learning Framework for Arabic Handwritten Legal Amount Recognition and Digital Courtesy Conversion
by Hakim A. Abdo, Ahmed Abdu, Mugahed A. Al-Antari, Ramesh R. Manza, Muhammed Talo, Yeong Hyeon Gu and Shobha Bawiskar
Mathematics 2024, 12(14), 2256; https://doi.org/10.3390/math12142256 - 19 Jul 2024
Cited by 2 | Viewed by 2542
Abstract
Arabic handwriting recognition and conversion are crucial for financial operations, particularly for processing handwritten amounts on cheques and financial documents. Compared to other languages, research in this area is relatively limited, especially concerning Arabic. This study introduces an innovative AI-driven method for simultaneously [...] Read more.
Arabic handwriting recognition and conversion are crucial for financial operations, particularly for processing handwritten amounts on cheques and financial documents. Compared to other languages, research in this area is relatively limited, especially concerning Arabic. This study introduces an innovative AI-driven method for simultaneously recognizing and converting Arabic handwritten legal amounts into numerical courtesy forms. The framework consists of four key stages. First, a new dataset of Arabic legal amounts in handwritten form (“.png” image format) is collected and labeled by natives. Second, a YOLO-based AI detector extracts individual legal amount words from the entire input sentence images. Third, a robust hybrid classification model is developed, sequentially combining ensemble Convolutional Neural Networks (CNNs) with a Vision Transformer (ViT) to improve the prediction accuracy of single Arabic words. Finally, a novel conversion algorithm transforms the predicted Arabic legal amounts into digital courtesy forms. The framework’s performance is fine-tuned and assessed using 5-fold cross-validation tests on the proposed novel dataset, achieving a word level detection accuracy of 98.6% and a recognition accuracy of 99.02% at the classification stage. The conversion process yields an overall accuracy of 90%, with an inference time of 4.5 s per sentence image. These results demonstrate promising potential for practical implementation in diverse Arabic financial systems. Full article
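Converting recognized legal-amount words into a digital courtesy (numeric) form typically follows an accumulate-and-scale idea; this sketch uses an English lexicon purely for illustration, since the paper's lexicon and conversion algorithm are Arabic-specific and not reproduced here:

```python
# Illustrative lexicon; the actual system maps Arabic legal-amount words.
UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
SCALES = {"hundred": 100, "thousand": 1000}

def words_to_amount(words):
    """Convert a recognized legal-amount word sequence to digits by
    accumulating units/tens and multiplying on scale words."""
    total, current = 0, 0
    for w in words:
        if w in UNITS:
            current += UNITS[w]
        elif w in TENS:
            current += TENS[w]
        elif w in SCALES:
            scale = SCALES[w]
            if scale == 100:
                current *= scale   # "hundred" multiplies the running group
            else:
                total += current * scale  # "thousand" closes out the group
                current = 0
    return total + current

print(words_to_amount("three thousand five hundred forty two".split()))
```

The same structure applies to any language's amount grammar once the lexicon and scale rules are swapped in.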
25 pages, 730 KB  
Review
Handwritten Recognition Techniques: A Comprehensive Review
by Husam Ahmad Alhamad, Mohammad Shehab, Mohd Khaled Y. Shambour, Muhannad A. Abu-Hashem, Ala Abuthawabeh, Hussain Al-Aqrabi, Mohammad Sh. Daoud and Fatima B. Shannaq
Symmetry 2024, 16(6), 681; https://doi.org/10.3390/sym16060681 - 2 Jun 2024
Cited by 17 | Viewed by 14460
Abstract
Given the prevalence of handwritten documents in human interactions, optical character recognition (OCR) for documents holds immense practical value. OCR is a field that empowers the translation of various document types and images into data that can be analyzed, edited, and searched. In handwritten recognition techniques, symmetry can be crucial to improving accuracy. It can be used as a preprocessing step to normalize the input data, making it easier for the recognition algorithm to identify and classify characters accurately. This review paper aims to summarize the research conducted on character recognition for handwritten documents and offer insights into future research directions. Within this review, the research articles focused on handwritten OCR were gathered, synthesized, and examined, along with closely related topics, published between 2019 and the first quarter of 2024. Well-established electronic databases and a predefined review protocol were utilized for article selection. The articles were identified through keyword, forward, and backward reference searches to comprehensively cover all relevant literature. Following a rigorous selection process, 116 articles were included in this systematic literature review. This review article presents cutting-edge achievements and techniques in OCR and underscores areas where further research is needed.
(This article belongs to the Section Computer)