**5. Conclusions**

In this paper, we presented the first steps toward designing a novel tool for the automatic extraction/sensing of knowledge from documents written in multiple languages that use non-Latin scripts.

As shown in the preliminary tests, the performance of the techniques developed so far is encouraging. We conclude the paper by briefly discussing the expected advantages of the final tool we aim to create, from both technical and non-technical perspectives.

From a technical point of view, several advantages are expected in the scientific domain:


Finally, from a broader standpoint, benefits from this research and the use of the tool are expected on several innovative fronts:


We are also aware of some limitations of our current research. In particular, complete automation may be limited by the quality of the input images, which is not always satisfactory. Moreover, text sensing on Arabic-script documents is an area where available research results and tools are less developed than those for Latin scripts; this poses some obstacles, but it is also a further motivation for our work. Finally, machine-learning/intelligent features will only be possible once large amounts of data have been loaded into the system, which will require a significant initial amount of manual work. In any case, we are confident that these limitations will be addressed in our future research.

In short, we are confident that this research will ultimately help preserve and conserve culture, a crucial task, especially in the challenging scenario we consider.

**Author Contributions:** Conceptualization, R.M., F.R.; methodology, R.M., L.S., M.V.; software, L.S., M.V.; data curation and investigation, L.S., M.V., R.A.V.; writing—original draft preparation, all authors; writing—review and editing, R.M.; supervision, R.M.; project administration, S.D.N., F.R.; funding acquisition, S.B., S.D.N., R.M., F.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by: MIM.fscire startup; Protocol 2020–2024 between the Italian Ministry for University and Research (MUR) and Fondazione per le Scienze Religiose (FSCIRE); Programme agreement 2021–2025 between the Italian Ministry for University and Research (MUR) and Fondazione per le Scienze Religiose (FSCIRE).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** At present, the dataset of the La Pira Library is not publicly available. All items were donated to the La Pira Library by different contributors in PDF format and are the property of the Fondazione per le Scienze Religiose (FSCIRE) and the "Giorgio La Pira" Library. It is worth recalling that the main goal of the present research is to enable the creation of a digital library that will eventually make these large amounts of data available, thus providing access to a wide array of users, from students and researchers to the general public.

**Conflicts of Interest:** The authors declare no conflict of interest.
