Article

A Gateway API-Based Data Fusion Architecture for Automated User Interaction with Historical Handwritten Manuscripts

by Christos Spandonidis *, Fotis Giannopoulos and Kyriakoula Arvaniti
Prisma Electronics SA, Department R&D, 17561 Paleo Faliro, Greece
* Author to whom correspondence should be addressed.
Heritage 2024, 7(9), 4631-4646; https://doi.org/10.3390/heritage7090218
Submission received: 10 July 2024 / Revised: 13 August 2024 / Accepted: 21 August 2024 / Published: 27 August 2024

Abstract

To preserve handwritten historical documents, libraries are choosing to digitize them, ensuring their longevity and accessibility. However, the true value of these digitized images lies in their transcription into a textual format. In recent years, various tools have been developed utilizing both traditional and AI-based models to address the challenges of deciphering handwritten texts. Despite their importance, there are still several obstacles to overcome, such as the need for scalable and modular solutions, as well as the ability to cater to a continuously growing user community autonomously. This study focuses on introducing a new information fusion architecture, specifically highlighting the Gateway API. Developed as part of the μDoc.tS research program, this architecture aims to convert digital images of manuscripts into electronic text, ensuring secure and efficient routing of requests from front-end applications to the back end of the information system. The validation of this architecture demonstrates its efficiency in handling a large volume of requests and effectively distributing the workload. One significant advantage of this proposed method is its compatibility with everyday devices, eliminating the need for extensive computational infrastructures. It is believed that the scalability and modularity of this architecture can pave the way for a unified multi-platform solution, connecting diverse user environments and databases.

1. Introduction

Handwritten text has been the medium through which a great portion of the knowledge of the ancient world was effectively transferred to the modern era. Written on parchment, papyrus, and/or paper, manuscripts of great historical importance survive in large quantities, scattered throughout the world in museums, private collections, and archives [1]. Even after the advent of printing technology, a considerable portion of texts authored by renowned historical figures remains exclusively in handwritten form. These texts encompass personal notes, letters, and diaries, serving as invaluable resources for historical research. However, they are typically inaccessible to the general public, residing instead within museums and private collections, just like their older counterparts. Conversely, publicly available texts often suffer from illegibility due to various factors, including natural deterioration, environmental damage, idiosyncrasies in the author’s handwriting, and the evolution of writing styles over time [2]. In addition, in many cases, the writing style, symbols, and idioms employed by the writer render a text difficult or even impossible to study.
Historical Document Processing (HDP) is essential for converting written material into digital format for historians, using computer science tools such as computer vision, document analysis, natural language processing, and machine learning, and its need has been identified by renowned scholars [3]. Digitizing manuscripts preserves cultural heritage and improves accessibility, but converting historical scripts into machine-readable text is challenging. Since the early 2000s, classical scholars and archaeologists, in collaboration with computer engineers, have undertaken digitization projects to bring manuscripts of renowned works to the general public [4]. With the progress in the fields of Artificial Intelligence (AI) and Machine Learning (ML), researchers are exploring AI-based methods to automate text generation, focusing on historical languages and scripts such as Latin [5], Greek [6], Hebrew [7], Japanese [8], and Tulu [9], or on multi-language approaches [10]. Collaboration with computing scientists has led to the successful development of customized Handwritten Text Recognition (HTR) solutions [11]. The Homer Multitext project was the result of many years of study and digitization of manuscripts scattered throughout Europe in museums, monasteries, and private collections, bringing Homer’s Iliad to the modern world [12]. The platform is still being enhanced with new features and has spawned Computer Science and Computer Engineering projects related to the manuscripts, such as the identification of the number of hands that wrote them using an automatic, computerized writer identification system [13]. The HisDoc project in 2012 was one of the pioneering efforts to digitize historical documents using image analysis and text recognition [14]. HisDoc 2.0 improved digital paleography techniques and introduced features like text localization and script differentiation [15]. DivaDesk, a virtual workspace, was a key component of this effort [16]. Challenges remain in digitizing high-quality images and integrating historical sources effectively. The EU-funded IMPACT initiative enhanced digitization capabilities in European libraries [17]. Transkribus offers automated recognition and transcription of historical texts, increasing accessibility. It originated from the EU-funded ‘Transcriptorium’ project and is widely used in the cultural heritage sector, with 1700 regular monthly users [18].
Despite their great importance, all these efforts to digitize historical manuscripts face challenges in accurately transcribing handwritten text. Handwritten Text Recognition (HTR) is complex due to diverse writing styles and unique characters in different languages. Scalable tools are needed to incorporate these algorithms, especially as digitization increases [19]. Robust infrastructure is essential for providing access to historical documents while effectively protecting them from cyberattacks, which have been on the rise in recent years and have targeted important online archives [20,21]. However, many manuscripts remain unexplored due to these obstacles. To meet the growing demand for historical analysis, tools must serve a large user base simultaneously while providing the security features essential for preventing direct access by end users to the documents themselves and/or the infrastructure that hosts them. Academic institutions with limited computational resources require efficient services that balance workload and infrastructure cost-effectively [22]. The research questions that necessitate a response are as follows:
Q1: What kind of information fusion framework could facilitate the provision of HTR services with minimal computational burden?
Q2: What is the optimal quantity of parallel end users that can be effectively accommodated by this framework?
Q3: Can an HDP and/or HTR system expand in order to support more manuscript collections and/or additional user applications as part of a growing demand for manuscript accessibility without compromising its performance?
The primary objective of the present study is to address these questions through a concentrated effort on an expandable and adaptable information fusion (IF) tool, with the ultimate aim of streamlining the transcription procedure under the μDoc.tS initiative, in which an IF architecture was developed to process and transcribe Greek handwritten texts from the Byzantine era. The project focuses on text recognition and aims to automatically transcribe the handwritten text of historically significant manuscripts into digitized text, thus rendering it accessible to the average student or researcher and enabling its further automated processing. The μDoc.tS information system features a machine learning-based system for the automated transcription of handwritten text into digitized text and a complete suite of image processing tools that can aid in the automated detection of words in an image. It also features an intuitive user interface that allows the μDoc.tS information system to be used on manuscript images already uploaded to the system, as well as on manuscript images uploaded by the user. Although μDoc.tS focuses on Greek handwritten text, the presented architecture is designed to be language- and period-agnostic, allowing for the support of various algorithmic models and Graphical User Interfaces in a scalable manner.
The central element of the IF architecture, the Gateway API, is presented in detail. The Gateway API facilitates the efficient delivery of request services between the information system’s end users and its services and tools, such as transcription tools, databases, etc., by acting as a request manager and workload balancer among different entities. It allows for the seamless integration and synchronization of the system’s tools and software packages, ensuring smooth operation and access only by authorized applications and services. This type of architecture is believed to offer support to researchers, institutions, and organizations such as archives, libraries, and museums. To ensure the proper functioning of the information system, its tools, and individual applications, various use-case scenarios are tested. The performance of the system is evaluated by analyzing critical parameters such as its response under different load conditions and user numbers, the execution time of various processes, and the corresponding memory usage. Further publications have been produced on the μDoc.tS information system, focusing on other aspects of it, such as its transcription tools and methods [23].
The paper is organized as follows: Section 2 outlines the information fusion system’s architecture. Although all of the system’s main functionalities are briefly discussed, the primary focus is on providing a detailed description of the API Gateway. Section 3 explains the methodology used for validating and verifying the data fusion architecture, while Section 4 showcases the relevant results. A brief discussion, including tacit knowledge acquired during the process, is presented in Section 5, with concluding remarks in Section 6.

2. Architecture

2.1. The μDoc.tS Information System

To address the ever-growing demands of the end users, the proposed data fusion system incorporates a three-tier architecture.
Figure 1 illustrates the architecture of the μDoc.tS information system in its most recent version, showing the arrangement of its building blocks and the integral role of the API Gateway in the information system. The architecture has already been successfully validated in other industrial sectors (e.g., shipping [24] and oil and gas [25]). The algorithmic processes operate in the background (first layer), utilizing the information and requests received from the front end through a Graphical User Interface (third layer). Acting as a request director and workload manager, the intermediate layer (second layer) operates between these two layers.
First layer. The μDoc.tS information system is composed of several tools and applications designed to enhance the digital representation of a manuscript and facilitate its transcription.
  • The Document Layout Analysis Tools aim to eliminate unwanted noise, convert the image into a binary format, and identify specific areas of the image that contain text;
  • The Handwritten Text Transcription Tool is designed to transcribe handwritten text from a digital image;
  • The Content-Based Word Tracking Tool facilitates the exploration of a digitized manuscript by efficiently scanning a collection of manuscript images and pinpointing the precise locations where a specific word has been detected;
  • The Language Models and Dictionaries from TLG and Body of Texts application facilitates a spelling check on the transcribed text of a manuscript.
Second layer. The API Gateway facilitates the seamless integration of the software tools and packages, ensuring the smooth flow of information between them. Finally, the SQL Database serves as a repository for document collections, document files, transcriptions, and partitioning models.
Third layer. The user of the platform can do the following:
  • Create and manage a collection of manuscripts;
  • Use preprocessing tools for manuscript collection images;
  • Produce the layout analysis of the manuscript;
  • Transcribe the images of the manuscripts;
  • Edit the extracted result of the transcription and correct it efficiently;
  • Search for words in the images of the collection (keyword spotting);
  • Train document layout analysis and transcription models to manage new collections with special features;
  • Save the transcription of images in a collection in a manageable format for further use.
The API Gateway, being the main focus of the present manuscript, is presented in detail in Section 2.2.

2.2. The API Gateway

The Gateway API serves as the conduit for communication between the different software packages within the µDoc.tS information system. These software packages encompass both the back end of the system and the user applications, whether they be web-based or desktop-based, that will be created using the µDoc.tS SDK. These user applications will form the front end of the overall µDoc.tS system.
To ensure secure communication, users’ applications can only interact with the back end if they possess the appropriate client credentials. The Gateway API is safeguarded by an authorization server known as the Identity Server. For a visual representation of the Gateway API’s architecture, please refer to Figure 2.
The Gateway API is implemented using Ocelot [26,27], an open-source API gateway developed in C# on the .NET Core framework. With its capabilities, Ocelot enables the seamless and efficient interconnection of services and micro-services within an information system, and it is extensively employed in the implementation of service-oriented architecture systems and applications. The suitability of Ocelot for applications like µDoc.tS is outlined in [28], citing its advantageous features. Ocelot was also selected due to its seamless integration with the Identity Server [29], which serves as the Authentication/Authorization Server. This server ensures the security and proper utilization of both the Gateway API and the back-end applications.
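To make the gateway’s routing role concrete, the ocelot.json-style sketch below shows how Ocelot maps an upstream (client-facing) path to a downstream back-end service while requiring Identity Server authentication. The service name, host, port, scope, and base URL are hypothetical illustrations rather than values from the µDoc.tS deployment, and exact key names can vary between Ocelot versions (older releases use "ReRoutes" instead of "Routes").

```json
{
  "Routes": [
    {
      "UpstreamPathTemplate": "/api/transcription/{everything}",
      "UpstreamHttpMethod": [ "GET", "POST" ],
      "DownstreamScheme": "http",
      "DownstreamPathTemplate": "/{everything}",
      "DownstreamHostAndPorts": [ { "Host": "transcription-tool", "Port": 5001 } ],
      "AuthenticationOptions": {
        "AuthenticationProviderKey": "IdentityServerKey",
        "AllowedScopes": [ "mdocts.api" ]
      }
    }
  ],
  "GlobalConfiguration": { "BaseUrl": "https://gateway.example.org" }
}
```

In an architecture of this kind, each back-end tool (binarization, noise reduction, transcription, word tracking) would receive its own route of this form; this per-route isolation is what allows individual services to be upgraded or taken offline without affecting the rest, as discussed below.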
The Identity Server is responsible for authenticating applications by confirming their identity through the presented credentials, namely the Client Id and Client Secret. Valid credentials result in the issuance of an access token with a lifespan of 50 min and a Refresh Token, in Bearer JWT format, with a lifespan of 8 h. The Refresh Token is used to renew the access token when it expires; once the Refresh Token itself expires, the application must be re-authorized using its Client Id and Client Secret. Authorization is also handled by the Identity Server, which defines the permissions that each application has over the services of µDoc.tS: each application is assigned a set of permissions defining its rights, and the Identity Server issues authorization tokens containing the rights of applications to access the services of the information system.
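As a minimal sketch of this exchange from a web-based client, the TypeScript below posts client credentials and parses the issued tokens. The /connect/token path follows IdentityServer’s convention, and the base URL and credentials are placeholders; none of these values are taken from the µDoc.tS implementation itself.

```typescript
// Sketch of the client-credentials exchange described above.
interface TokenResponse {
  access_token: string;    // Bearer JWT; ~50 min lifetime in the described setup
  refresh_token?: string;  // ~8 h lifetime in the described setup
  expires_in: number;
  token_type: string;
}

async function requestToken(
  gatewayBaseUrl: string, // placeholder, e.g. "https://gateway.example.org"
  clientId: string,
  clientSecret: string,
): Promise<TokenResponse> {
  const body = new URLSearchParams({
    grant_type: "client_credentials",
    client_id: clientId,
    client_secret: clientSecret,
  });
  const res = await fetch(`${gatewayBaseUrl}/connect/token`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  });
  if (!res.ok) throw new Error(`Token request failed: ${res.status}`);
  return res.json(); // JSON payload carrying the access token, as described
}
```

A front-end application built on the µDoc.tS SDK would presumably run this once per session and rely on the Refresh Token thereafter.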
Figure 3 depicts the process through which the Gateway API and the Identity Server handle certification requests from client applications. The client application initiates a token request to the Token endpoint of the Gateway API, including its Client Id and Client Secret. Subsequently, the Gateway API redirects the request to the Identity Server, as illustrated in Figure 3.
If the request is made with the correct credentials, the Identity Server responds to the application’s request in JSON format, containing the access token. Each token issued by the Identity Server is an encrypted JSON Web Token (JWT). Consequently, the resulting information system possesses an additional layer of security, as the internal components of the application and the tools it serves remain shielded from external attackers. When making data calls to the Gateway API, the user must include the JWT in each request. Failure to do so results in the Gateway API withholding content and responding with a 401 Unauthorized error message. The aforementioned architecture facilitates the management and scalability of the µDoc.tS information system. Each segment of the back end operates autonomously and can be configured and operated independently of the others. This division of labor reduces the workload by enabling shared and separate execution of processes, thereby preventing the Gateway API from being overwhelmed by parallel requests. In situations where certain back-end tools are temporarily unavailable, such as during software upgrades, the remaining tools can continue to function normally. This level of resilience would not be achievable in a single-API architecture. The certification architecture is further elaborated in Figure 4.
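Continuing the client-side sketch above, a data call through the gateway attaches the issued JWT as a Bearer header; per the behavior described here, a missing or expired token yields a 401 Unauthorized response, which the client treats as the cue to renew or reissue the token. The endpoint path is again a hypothetical stand-in rather than an actual µDoc.tS route.

```typescript
// Sketch of an authenticated data call through the gateway.
async function callGateway(
  gatewayBaseUrl: string,
  accessToken: string,
  path: string, // e.g. "/api/transcription/pages/42" (illustrative)
): Promise<unknown> {
  const res = await fetch(`${gatewayBaseUrl}${path}`, {
    headers: { Authorization: `Bearer ${accessToken}` }, // JWT on every request
  });
  if (res.status === 401) {
    // Per the text: without a valid JWT the gateway withholds content.
    // Renew via the Refresh Token, or re-authenticate with Client Id/Secret
    // once the Refresh Token itself has expired.
    throw new Error("401 Unauthorized: token must be renewed or reissued");
  }
  if (!res.ok) throw new Error(`Gateway error: ${res.status}`);
  return res.json();
}
```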

3. Methodology for Verification and Validation

This study used a sequential mixed analysis approach, combining quantitative data with qualitative findings to explore a phenomenon, validate theories, extrapolate results, and create new tools. Specific activities for each stage are outlined in Figure 5.
At first, a qualitative analysis was carried out to define user scenarios based on the system’s functionalities, considering both user requirements during the initial phase of development and system specifications in the later phase. Following this, the second phase concentrated on the quantitative evaluation of the Gateway API. Based on the main evaluation scenarios mentioned earlier, three categories of assessments were conducted, as detailed in Table 1.
The primary objective of the first and second tasks was to assess the Gateway API’s performance in accordance with its specifications. While these testing tasks encompass an extensive evaluation of the processing models and the Graphical User Interface, the current focus lay solely on the Gateway API itself and its ability to deliver uninterrupted services. Throughout these two tasks, all user scenarios developed in the previous phase were thoroughly tested. The successful completion of these two tasks initiates the third task, which involves comprehensive system tests for the service. In this task, the same user scenarios are tested, but this time with the involvement of a series of parallel end users, potentially reaching up to 10,000 users. The primary objective of this test is to exert pressure on the Gateway API and determine the optimal number of end users that can be efficiently served.
The final stage involved interpreting the results to provide recommendations on the suitable utilization of each tool for specific use cases. It is important to note that the testing process was not a one-time occurrence at the end of the project but rather an iterative process carried out with each new release, either due to requirements refinement or performance optimization. The evaluation outcomes were incorporated into the initial phase to refine the design of user scenarios or the system itself, thus triggering a new iteration. The subsequent sections outline the primary steps of the methodology employed.

3.1. User Scenarios Definition

The evaluation method included both summative and formative assessments, utilizing a criteria-based evaluation approach that highlights system performance criteria, such as response time, availability, and reliability in different user and service request situations. The recording process was founded on four core principles.
Principle 1 involved recording usage scenarios based on the user’s experience with the information system and how the user interacts with it. These scenarios are known as user-oriented use cases.
Principle 2 involved recording usage scenarios that concern the functionalities of the system and mainly its response to various inputs and parameters. These scenarios are called system-oriented use cases.
Principle 3 involved recording usage scenarios that correspond to the smooth operation of the system and its response when all the parameters given as inputs to each application are correct and/or the user of the information system performs the appropriate actions to carry out a process that can be executed through the information system. These scenarios are called main flow use cases or sunny day use cases.
Principle 4 involved recording usage scenarios that describe the response and behavior of the system in case the user does not follow the appropriate practice of performing the procedures of the system, executes the procedures by making one or more errors, and/or enters incorrect parameters in the system to perform the desired processes. These scenarios are called alternative flow use cases or rainy day use cases.
The usage scenarios were recorded with specific objectives in mind, including accurately representing the system’s functionalities and involving users and applications in the process. They were structured in a simple, numbered manner for clarity and ease of understanding, with a focus on sequencing to depict the flow of actions. The scenarios were constantly updated to ensure relevance and accuracy, playing a crucial role in identifying roles and responsibilities within the system. The approach used to consolidate use cases and control scenarios for effective verification of the entire set of the API Gateway’s functionalities is a modified version of the one presented initially in [25] and subsequently in [26], with its focus being on software development and verification.
In the given usage scenarios, the tools of the back end of the information system were referred to by descriptive names instead of the official names assigned to them during the development process. The following names were used:
  • Binarization Tool;
  • Noise Reduction Tool;
  • Transcription Tool.

3.2. System Functionality

The evaluation consisted of (a) analyzing the interaction between the Gateway and the Database through the Graphical Interface Design; (b) examining the installation process; and (c) evaluating the quality and significance of the data. The assessment did not cover the user interface experience or specific functional tests, as the main goal was to guarantee data integrity throughout the data fusion pipeline when accessed by one or more users at the same time. Figure 6 summarizes the user scenarios that were tested in this assessment, divided into collection, file, and model levels.

3.3. Development Package

The Gateway API operation was evaluated in this test based on its interface with the development packages. The evaluation procedure encompassed testing the system under normal and non-normal conditions to assess its performance and outputs. This allowed for testing the platform’s operation during regular operating conditions and in the event of a malfunction. When necessary, simulated signals were utilized as system inputs through a dedicated JSON document and manual data entry using the Postman tool. Figure 7 outlines the various user scenarios that were tested.
The efficiency and accuracy of the models themselves were not considered during this phase, nor were the delays and computational times of the development package. The main emphasis was on assessing the effectiveness of the Gateway API in carrying out its intended function while ensuring that it does not disrupt data fusion or introduce any uncertainties or errors into the system’s communication layer.

3.4. Service System

The use-case scenarios were utilized for the evaluation of the system, employing the k6 tool [30] and following practices documented in the standard international literature [31]. The developed methods encompass Smoke Testing, Load Testing, Stress Testing, and Soak Testing, as shown in Table 2.
k6 is an open-source load-testing tool that makes performance testing easy and productive. It allows for checking the reliability and performance of systems, as well as the timely detection of regressions and performance issues. The k6 analysis includes the fields described in Table 3.
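For illustration, a minimal k6 script sketch of the kind used for such tests is shown below. The endpoint, token variable, stage durations, and thresholds are illustrative assumptions rather than the actual test configuration; the 300 ms threshold merely echoes the service-time criterion applied later in Section 4.1.

```javascript
// Minimal k6 load-test sketch in the spirit of Table 2 and Table 3.
import http from "k6/http";
import { check, sleep } from "k6";

export const options = {
  stages: [
    { duration: "2m", target: 100 },  // ramp up to 100 virtual users
    { duration: "10m", target: 100 }, // hold the load (load/soak-style phase)
    { duration: "2m", target: 0 },    // ramp down
  ],
  thresholds: {
    http_req_failed: ["rate<0.01"],   // fewer than 1% failed requests
    http_req_duration: ["p(95)<300"], // 95% of requests under 300 ms
  },
};

export default function () {
  // Hypothetical gateway endpoint; the JWT is passed in via an env variable.
  const res = http.get("https://gateway.example.org/api/health", {
    headers: { Authorization: `Bearer ${__ENV.TOKEN}` },
  });
  check(res, { "status is 200": (r) => r.status === 200 });
  sleep(1);
}
```

Running such a script (k6 run script.js) emits exactly the metrics listed in Table 3, which is how the per-user-level figures in Section 4.3 are obtained.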

4. Results

4.1. System Functionality

The assessment included testing system functionalities across various end-user systems (such as web apps, desktops, etc.), different hardware for end users, and alternative hardware for the Gateway API. The evaluation process involved comparing the requests sent from the devices with the packages received by the server.
Three key parameters were considered: the accuracy of the collected information, the precision of reported errors, and a service execution time under 300 ms. Each test iteration was conducted after implementing optimization suggestions for the service. The final results confirmed the successful transmission of all requests to the intended destination without any loss or duplication, with accurate responses. Some minor deviations in service timing intervals were observed, likely attributable to noise in the link budget.
The evaluation of different hardware configurations for the server was based on user scenarios, which identified the most time-consuming set of services. Consequently, the minimum requirements for the Gateway API and Identity Server were determined to be a 1.6 GHz processor, 2 GB of RAM, and 5 GB of available hard disk space.

4.2. Development Package

The purpose of this test suite was two-fold: firstly, to verify that all tasks are carried out as per the user scenarios and, secondly, to assess whether the minimum hardware requirements established in prior tests remain applicable in this instance. The tests included 12 different end users requesting slightly different tasks. A concise summary of the test results for each scenario based on these hardware specifications can be found in Table 4.
The service effectively responds to the requested services and establishes seamless communication between various entities. Moreover, the auto-connect provision functions flawlessly, behaving as anticipated in case of connection loss, malfunctioning errors, timeserver faults, and unavailability of the Gateway API server.

4.3. Service System

After confirming the service’s accurate operation and setting the minimum hardware requirements, the next step was to push the service to its limits by testing it with a significant number of simultaneous end users. This test did not involve incrementally increasing the number of end users; the primary goal was to determine the maximum number of parallel users that could be supported without compromising the quality of service based on the hardware specifications.
In order to achieve this objective, a total of fourteen tests were conducted to assess the endurance and reliability of the system over a 24 h timeframe. User connections were established concurrently at three different levels: 100, 1000, and 10,000, with an average of 3200 calls per user per minute. The results of these tests are presented below. Figure 8, Figure 9 and Figure 10 display screenshots illustrating the detailed test outcomes for 100, 1000, and 10,000 simultaneous users.
Table 5 provides a thorough overview of the results, summarizing their most important points. The primary parameters are listed in the initial column, while the average value for each parameter is displayed in the subsequent columns for varying numbers of end users. The average values were derived from a set of 12 iterations for each test conducted in every end-user scenario.
Based on the test results, the system is expected to handle between 100 and 1000 simultaneous users per day. Testing with 10,000 users was conducted to determine the worst-case scenario, resulting in system failure when 7000 users logged in, with 72% of requests failing due to a “Server Busy” message. It is important to note that the “Load Balancing” function was not activated during these tests, impacting the accuracy of the data collected.
These test outcomes allow for the identification of system limitations and the prediction of potential malfunctions in the future. The findings indicate that under average usage, the system requires between 302 ms and 3.7 s to process requests for 100–1000 simultaneous users daily, with a maximum of 6600 simultaneous users as a benchmark for a system upgrade. Overall, the results are deemed satisfactory for the expected traffic and usage levels in the system.

5. Discussion

Throughout this project, numerous inquiries were raised. Some were related to the original research questions, while others steered the research in different directions. The outcome of this endeavor was a blend of explicit and tacit knowledge that provided valuable insights into implementation. In this section, a portion of this knowledge will be presented, with a primary focus on two key aspects. Firstly, the added value that this solution appears to offer in real-world use cases will be explored. It is important to note that this aspect is not substantiated by comparison with similar tools, mainly due to a lack of relevant information; rather, it is based on the experience gained by the authors throughout the process. Secondly, the challenges encountered during the architecture testing phase will be discussed.
a. Added value.
The tool has demonstrated the ability to efficiently handle a large number of simultaneous end users with minimal computational requirements. The need for access to important historical documents is ever-growing: as students and scholars gradually gain access to these documents and employ them in their work, even more end users come to require access. This makes concurrent-user capacity one of the most crucial performance aspects of a system like the one presented in this study. Although this is a crucial aspect, it is not the sole focus of the proposed architecture. The following is a set of supplementary features that could significantly impact the implementation and, more importantly, the scalability of similar tools.
Scalability. The proposed design is centered around an intermediary entity that acts as a bridge between the end user and the applications. This entity manages both the functionality and the service quality for the end user, offering pertinent responses as required without compromising its performance. Simultaneously, it imparts a distinctive feature to the system by enabling easy scalability either horizontally or vertically to cater to a larger user base. Through the utilization of additional computational power or by distributing the workload among multiple entities, the design offers flexibility for small research teams or companies to implement and enhance the architecture based on usage. Consequently, the projected use (5-year scenario) does not necessarily have to align with the anticipated use during the initial phase, allowing financial planning to synchronize with expected revenue. Although the perceived scientific impact of this architectural enhancement may seem minimal, it is believed to significantly influence scientific advancements in this field by facilitating seamless collaboration and resource sharing among diverse research teams.
Extendibility. The architecture’s capability to easily connect various Graphical User Interfaces with different algorithmic models and databases, without requiring significant structural changes or extensive communication API refinements, is closely related to the previous characteristic. This feature enables tools created by different teams worldwide to seamlessly integrate with other databases and utilize models developed by various research teams with minimal effort. By serving as an integrator, the architecture facilitates the efficient exchange of data between different entities, thereby simplifying collaboration and bringing the globalization of heritage preservation one step closer to fruition.
Safety and security. With the growing number of documents and the corresponding increase in modeling procedures, ensuring the safety and security of these services becomes even more crucial. This is primarily because manuscript databases have become a prime target for a new breed of criminals, who not only aim for financial gain but also seek to manipulate historical records [20,21]. Addressing this new form of criminal activity requires service providers to implement different mitigation and action measures. In this regard, the proposed architecture offers a distinct advantage through its secure data transfer, which not only includes encryption but also incorporates an additional layer of safety through service management by a dedicated entity. Client application authentication and authorization via the Identity Server is more secure than the traditional end-user authentication/authorization offered by most information systems, as access tokens are issued for the application, which must be authorized to access the back-end services, rather than for each end user. In addition, the very nature of the Gateway API offers a layer of security by providing a non-transparent layer between the end-user application and the back end hosting the documents, databases, trained models, etc. Furthermore, the future integration of additional technologies into the Gateway API, such as blockchain, could provide an additional safeguard against various cyberattacks.
b. Challenges.
A Gateway API can provide added value to any informational system, as outlined in the previous sub-section of the discussion. However, there are significant challenges associated with its use as part of an informational system architecture, the most notable of which are the following:
Scalability. Being the central point for handling, translating, and re-directing requests and responses between a system’s front end and back end, any scale-up process of a Gateway API can present a significant challenge, especially when multiple independent systems, document databases, data lakes, and end users are incorporated into the same system or significantly expanded. Therefore, any scale-up in a Gateway API needs to be performed with attention to critical parameters that may affect its compatibility with the different systems that are being integrated or expanded. Correct design, as well as configuration and management flexibility, are key points for avoiding scalability-related challenges when designing a Gateway API.
Performance. Performance issues in Gateway APIs are often associated with latency in handling multiple simultaneous requests, which is closely related to the challenges of system scalability. In the case of a significant system scale-up, any Gateway API may experience latency in completing requests, as well as service discovery issues, such as keeping track of multiple back-end services and redirecting requests to them effectively. In many cases, these can lead to the Gateway API becoming a bottleneck for the entire system. The Gateway API design and the network infrastructure that supports it play an important role in tackling performance-related challenges. Monitoring its performance and keeping logs can also contribute to the detection and mitigation of performance issues.
Security. Being at the center of the information system it belongs to, any Gateway API can become a single point of failure for the entire system and a prime target for attacks. If the Gateway API is successfully attacked, the entire system can collapse. Another security-related challenge is associated with the authentication and authorization of client applications and end users, especially in architectures in which the Identity Server is part of the Gateway API, like the one presented in this manuscript. In these cases, any scale-up of the system needs to be considered in association with a scale-up of the Identity Server. Most importantly, the Gateway API shall incorporate a robust authentication and authorization mechanism. The Gateway API design and its proper maintenance, including regular updates, are crucial in ensuring its immunity against security-associated issues.

6. Conclusions

To ensure the preservation of handwritten historical documents and to give as many scholars and students as possible access to them, libraries have been digitizing them. However, the true value lies in transcribing these digitized images into text. Although tools have been developed to decipher handwritten texts, there are still challenges to overcome, such as scalability and meeting the needs of a growing user community. Under the μDoc.tS initiative, an information fusion architecture has been created to process and transcribe Greek handwritten texts from the Byzantine era. In order to address the research questions, it is important to determine what kind of information fusion framework could facilitate the provision of Handwritten Text Recognition (HTR) services with minimal computational burden as the number of end users and/or requests for services increases. Additionally, it is crucial to identify the optimal quantity of parallel end users that can be effectively accommodated by this framework. Thus, in the present work, a Gateway API is presented as a central part of a handwritten document transcription information system, offering client request handling, authentication, and authorization.
During testing, the architecture was subjected to 10,000 concurrent users. It was found that when 7000 users connected, capacity issues arose, resulting in 72% of requests failing. As a benchmark for system enhancements, an upper limit of 6600 concurrent users was established. Considering the expected traffic and usage, this outcome can be deemed satisfactory, even though the authors expect substantial growth in the number of scholars and students who need access to important historical documents.
It is believed that the proposed architecture can serve as a significant step in the structural planning of relevant tools. This will enable them to progress to the next level, transitioning from an isolated world to a cohesive and fully collaborative research effort aimed at preserving our cultural heritage while effectively protecting the available collections and users from cyberattacks. Given the Gateway API’s good performance results, the authors intend to expand the use of Gateway APIs to more platforms where effective and secure request handling is crucial and to present more Gateway API-based system architectures in the future.

Author Contributions

Conceptualization, C.S.; Methodology, C.S. and F.G.; Software, F.G.; Validation, F.G.; Formal analysis, C.S., F.G. and K.A.; Writing—original draft, C.S.; Writing—review & editing, F.G. and K.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

This research was co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE (project code: T1EDK-01939).

Conflicts of Interest

The authors declare no conflict of interest.

Declaration of AI-Assisted Technologies in the Writing Process

During the preparation of this work, the authors used the Ahrefs AI tool to edit the English language of the text. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

References

  1. Toledo, J.I.; Carbonell, M.; Fornés, A.; Lladós, J. Information extraction from historical handwritten document images with a context-aware neural model. Pattern Recognit. 2019, 86, 27–36. [Google Scholar] [CrossRef]
  2. Sánchez, J.A.; Romero, V.; Toselli, A.H.; Villegas, M.; Vidal, E. A set of benchmarks for handwritten text recognition on historical documents. Pattern Recognit. 2019, 94, 122–134. [Google Scholar] [CrossRef]
  3. Tracy, S.V.; Papaodysseus, C. The study of hands on Greek inscriptions: The need for a digital approach. Am. J. Archaeol. 2009, 113, 99–102. [Google Scholar] [CrossRef]
  4. Nagy, G. Homer multitext project. In Online Humanities Scholarship: The Shape of Things to Come, Proceedings of the Mellon Foundation Online Humanities Conference at the University of Virginia, 26–28 March 2010; McGann, J., Stauffer, A.M., Wheeles, D., Eds.; Rice University Press: Houston, TX, USA, 2010. [Google Scholar]
  5. Firmani, D.; Merialdo, P.; Nieddu, E.; Scardapane, S. In codice ratio: OCR of handwritten Latin documents using deep convolutional networks. In Proceedings of the CEUR Workshop, Bozen-Bolzano, Italy, 21–23 September 2017; Volume 2034, pp. 9–16. [Google Scholar]
  6. Kesidis, A.L.; Gatos, B. Providing Access to Old Greek Documents Using Keyword Spotting Techniques. In Visual Computing for Cultural Heritage; Springer: Cham, Switzerland, 2020; pp. 85–102. [Google Scholar]
  7. Faigenbaum-Golovin, S.; Shaus, A.; Sober, B. Computational handwriting analysis of ancient Hebrew inscriptions—A survey. IEEE BITS Inf. Theory Mag. 2022, 2, 90–101. [Google Scholar] [CrossRef]
  8. Nguyen, H.T.; Nguyen, C.T.; Nakagawa, M.; Kitamoto, A. Text Segmentation for Japanese Historical Documents Using Fully Convolutional Neural Network; SIG Technical Report; Information Processing Society of Japan: Tokyo, Japan, 2019; Volume 2019-CH-119, pp. 1–5. [Google Scholar]
  9. Bangera, S.; Bhat, S. Digitization of Tulu Handwritten Scripts: A Literature Survey. J. Sci. Res. Technol. 2024, 2, 64–73. [Google Scholar]
  10. Capurro, C.; Provatorova, V.; Kanoulas, E. Experimenting with Training a Neural Network in Transkribus to Recognise Text in a Multilingual and Multi-Authored Manuscript Collection. Heritage 2023, 6, 7482–7494. [Google Scholar] [CrossRef]
  11. Nockels, J.; Gooding, P.; Ames, S.; Terras, M. Understanding the application of handwritten text recognition technology in heritage contexts: A systematic review of Transkribus in published research. Arch. Sci. 2022, 22, 367–392. [Google Scholar] [CrossRef] [PubMed]
  12. Smith, N.; Blackwell, C. Analytical developments for the Homer Multitext: Palaeography, orthography, morphology, prosody, semantics. Int. J. Digit. Libr. 2023, 24, 179–184. [Google Scholar] [CrossRef]
  13. Arabadjis, D.; Giannopoulos, F.; Panagopoulos, M.; Exarchos, M.; Blackwell, C.; Papaodysseus, C. A general methodology for identifying the writer of codices. Application to the celebrated “twins”. J. Cult. Herit. 2019, 39, 186–201. [Google Scholar] [CrossRef]
  14. Fischer, A.; Bunke, H.; Naji, N.; Savoy, J.; Baechler, M.; Ingold, R. The HisDoc project. Automatic analysis, recognition, and retrieval of handwritten historical documents for digital libraries. In Internationality and Interdisciplinarity in Edition Philology; De Gruyter: Berlin, Germany; München, Germany; Boston, MA, USA, 2014. [Google Scholar]
  15. Philips, J.; Tabrizi, N. Historical Document Processing: A Survey of Techniques, Tools, and Trends. In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020), Online, 2–4 November 2020; KDIR. pp. 341–349. [Google Scholar]
  16. Eichenberger, N.; Garz, A.; Chen, K.; Wei, H.; Ingold, R.; Liwicki, M. DivaDesk: A holistic digital workspace for analyzing historical document images. Manuscr. Cult. 2014, 7, 69–82. [Google Scholar]
  17. Vobl, T.; Gotscharek, A.; Reffle, U.; Ringlstetter, C.; Schulz, K.U. PoCoTo: An open source system for efficient interactive postcorrection of OCRed historical texts. In Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage, Madrid, Spain, 19–20 May 2014; pp. 57–61. [Google Scholar]
  18. Lincoln, M. Ways of forgetting: The librarian, the historian, and the machine. In Always Already Computational: Library Collections as Data; Padilla, T., Allen, L., Frost, H., Potvin, S., Russey, R.E., Varner, S., Eds.; Institute of Museum and Library Services, National Forum Positional Statements: Washington, DC, USA, 2017; pp. 20–30. Available online: https://collectionsasdata.github.io (accessed on 20 November 2020).
  19. Teslya, N.; Mohammed, S. Deep learning for handwriting text recognition: Existing approaches and challenges. In Proceedings of the 2022 31st Conference of Open Innovations Association (FRUCT), Helsinki, Finland, 27–29 April 2022; pp. 339–346. [Google Scholar]
  20. Museum World Hit by Cyberattack on Widely Used Software. Available online: https://www.nytimes.com/2024/01/03/arts/design/museum-cyberattack.html (accessed on 8 August 2024).
  21. Museums on Alert Following British Library Cyber Attack. Available online: https://www.museumsassociation.org/museums-journal/news/2023/12/museums-on-alert-following-british-library-cyber-attack/# (accessed on 8 August 2024).
  22. Sánchez-DelaCruz, E.; Loeza-Mejía, C.I. Importance and challenges of handwriting recognition with the implementation of machine learning techniques: A survey. Appl. Intell. 2024, 54, 6444–6465. [Google Scholar] [CrossRef]
  23. Tsochatzidis, L.; Symeonidis, S.; Papazoglou, A.; Pratikakis, I. HTR for Greek Historical Handwritten Documents. J. Imaging 2021, 7, 260. [Google Scholar] [CrossRef] [PubMed]
  24. Spandonidis, C.C.; Theodoropoulos, P.; Papadopoulos, P.; Demagkos, N.; Tzioridis, Z.; Giordamlis, C. Development of a novel Decision-Making tool for vessel efficiency optimization using IoT and DL. In Proceedings of the 2021 International Conference on Decision Aid Sciences and Application (DASA), Sakheer, Bahrain, 7–8 December 2021; pp. 479–483. [Google Scholar] [CrossRef]
  25. Spandonidis, C.; Theodoropoulos, P.; Giannopoulos, F. A Combined Semi-Supervised Deep Learning Method for Oil Leak Detection in Pipelines Using IIoT at the Edge. Sensors 2022, 22, 4105. [Google Scholar] [CrossRef] [PubMed]
  26. Loureiro, T. API Gateway with Ocelot. Available online: https://medium.com/@thiagoloureiro/api-gateway-with-ocelot-cadc3c7b60b6 (accessed on 8 August 2024).
  27. De la Torre, C. Designing and Implementing API Gateways with Ocelot in NET Core Containers and Microservices Architectures. Available online: https://devblogs.microsoft.com/cesardelatorre/designing-and-implementing-api-gateways-with-ocelot-in-a-microservices-and-container-based-architecture (accessed on 8 August 2024).
  28. Ocelot. Available online: https://ocelot.readthedocs.io/en/latest/introduction/bigpicture.html#big-picture (accessed on 8 August 2024).
  29. Identity-Server4. Available online: https://identityserver4.readthedocs.io/en/latest/ (accessed on 8 August 2024).
  30. k6. Available online: https://k6.io/ (accessed on 8 August 2024).
  31. Cockburn, A.; Cockburn, L. Writing Effective Use Cases; Pearson Education: Chennai, India, 2008. [Google Scholar]
Figure 1. The μDoc.tS information system architecture.
Figure 2. The Ocelot architecture with Identity Server authorization.
Figure 3. The API Gateway general architecture.
Figure 4. Application certification request management process.
Figure 5. Task sequence and method allocation for each task.
Figure 6. System functionality user scenarios.
Figure 7. Development package user scenarios.
Figure 8. Script execution results with 100 concurrent users.
Figure 9. Script execution results with 1000 concurrent users.
Figure 10. Script execution results with 10,000 concurrent users.
Table 1. Test categories.

Task | Test Category | Description
1 | System Functionality | Assess the ability of the architectural design to support all system functionalities.
2 | Development Package | Assess the ability of the architectural design to support the sequence of operations for the development package under both typical and unusual circumstances.
3 | Service System | Assess the ability of the architecture to support both system functionalities and the development package concurrently for a significant number of end users.
Table 2. k6 tool test categories.

Test Category | Description
Smoke Testing | A regular load test configured for minimal load.
Load Testing | Primarily focused on assessing the current performance of the system in terms of concurrent users or requests per second.
Stress Testing | The simulation of extreme scenarios to evaluate the system’s availability and stability under heavy load.
Soak Testing | Revealing performance and reliability issues that arise from prolonged system pressure.
Table 3. Fields included in the k6 tool analysis.

Parameter | Description
data_received | The size of data returned by the server.
data_sent | The size of data sent to the server.
http_req_blocked | The time a request spent blocked (waiting for a free connection slot) before being initiated.
http_req_connecting | The time it took to establish a connection with the server.
http_req_duration | The time taken for the request.
http_req_failed | The percentage of failed requests and the total count of both failed and successful requests.
expected_response | The value of http_req_duration filtered to requests that received an expected (successful) response.
http_req_receiving | The time it took to receive the response from the server.
http_req_sending | The time it took to send the request to the server.
http_req_waiting | The time it took for the server to accept the request (time to first byte).
http_reqs | The number of requests made per second.
iteration_duration | The time taken for a group of tests.
iterations | The total number of tests.
vus | Virtual users created.
vus_max | Maximum number of virtual users.
Table 4. Development package service status for minimum server specifications.

Test Category | Normal Operation | Execution Error | Fail to Connect
Download bearer token | 100% | - | -
Inability to receive bearer token | - | 100% | -
Contact of Binarization Tool | 100% | 100% | 100%
Communication of Noise Reduction Tool | 100% | 100% | 100%
Communication of Transcription Tool | 100% | 100% | 100%
Contact a word locator with a contextual query | 100% | 100% | 100%

In this table, 100% indicates that the service performed as expected each time it was tested.
Table 5. Results of scenario execution with 100, 1000, and 10,000 concurrent users.

Parameter | 100 Users | 1000 Users | 10,000 Users
data_received | 22 MB | 24 MB | 15 MB
data_sent | 2.9 MB | 3.2 MB | 5.8 MB
http_req_blocked | 2.98 ms | 29.69 ms | 2.97 s
http_req_connecting | 782.92 μs | 29.05 ms | 2.62 s
http_req_duration | 302.61 ms | 3.68 s | 6.24 s
http_req_failed | 0.00% | 0.00% | 71.99%
http_req_receiving | 296.79 μs | 412.78 μs | 59.4 ms
http_req_sending | 9.89 μs | 830.77 μs | 242.91 ms
http_req_waiting | 302.3 ms | 3.68 s | 5.94 s
http_reqs | 3389 | 3745 | 6798
vus | 100 | 10 | -
vus_max | 100 | 1000 | 10,000
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
