LPG Semantic Ontologies: A Tool for Interoperable Schema Creation and Management

Bernasconi, Eleonora; Ceriani, Miguel; Ferilli, Stefano

doi:10.3390/info15090565

by Eleonora Bernasconi ¹,*, Miguel Ceriani ² and Stefano Ferilli ¹,*

¹ Department of Computer Science, University of Bari, 70125 Bari, Italy

² Institute of Cognitive Sciences and Technologies-National Research Council, 00196 Roma, Italy

* Authors to whom correspondence should be addressed.

Information 2024, 15(9), 565; https://doi.org/10.3390/info15090565

Submission received: 21 July 2024 / Revised: 28 August 2024 / Accepted: 9 September 2024 / Published: 13 September 2024

(This article belongs to the Special Issue Knowledge Graph Technology and its Applications II)

Abstract

Ontologies are essential for the management and integration of heterogeneous datasets. This paper presents OntoBuilder, an advanced tool that leverages the structural capabilities of semantic labeled property graphs (SLPGs) in strict alignment with semantic web standards to create a sophisticated framework for data management. We detail OntoBuilder’s architecture, core functionalities, and application scenarios, demonstrating its proficiency and adaptability in addressing complex ontological challenges. Our empirical assessment highlights OntoBuilder’s strengths in enabling seamless visualization, automated ontology generation, and robust semantic integration, thereby significantly enhancing user workflows and data management capabilities. The performance of the linked data tools across multiple metrics further underscores the effectiveness of OntoBuilder.

Keywords: ontologies; knowledge management; semantic web; large language models; Semantic Labeled Property Graphs

1. Introduction

Managing heterogeneous data across various domains presents substantial challenges in the rapidly evolving era of AI. The spectrum of data types, from text archives to multimedia content, requires advanced tools that ensure efficient storage, accessibility, analytical capabilities, and interoperability. Traditional data management systems often struggle to meet these multifaceted demands, highlighting the urgent need for innovative solutions capable of proficiently handling this complexity [1].

Ontologies are indispensable in many domains for enhancing data organization to improve searchability and semantic richness. By mapping networks of concepts and their interrelationships, ontologies can significantly improve data integration, retrieval, and analytical processes. For example, in digital libraries, an ontology can link a historical figure to relevant events, works, and locations, offering a multidimensional perspective conducive to educational and scholarly endeavors. Similarly, in the financial sector, ontologies can correlate financial instruments, market events, regulations, and economic indicators, thus providing a comprehensive structure that aids in risk assessment, regulatory compliance, and strategic investments. In healthcare, ontologies connect patient records, medical literature, treatment protocols, and clinical trials, facilitating more accurate diagnoses, personalized treatment plans, and advanced research capabilities [2]. These intricate semantic layers are crucial for uncovering significant connections and insights within vast datasets, thereby supporting rigorous academic research and enhancing user interactions with digital resources. The utility of ontologies extends beyond digital libraries, finance, and healthcare to various fields, thus emphasizing their critical role in fostering multidomain adaptability and interoperability [3].

Graphs inherently encapsulate data through nodes (entities) and edges (relationships), thus delineating the essential framework of multifaceted information sets. This conceptual model is particularly beneficial for graph-based techniques, such as semantic labeled property graphs (SLPGs), which excel in depicting the intricate interconnections inherent in diverse datasets [4]. Owing to their exceptional performance, SLPGs provide advanced query functionalities within extensive datasets, making them well suited for managing the dynamic and intricate nature of data across various domains, including cultural, academic, financial, and healthcare datasets [5,6].

This paper introduces OntoBuilder, an advanced tool that leverages the structural advantages of SLPGs in compliance with semantic web standards [7]. Its development was motivated by the lack of tools capable of constructing LPG-based semantic ontologies from scratch. It is designed to facilitate the direct derivation of SLPGs from data models, thus optimizing the ontology generation process. OntoBuilder addresses the urgent demand for more efficient methodologies for generating semantic structures, thereby enabling the seamless integration and interoperability of heterogeneous datasets. Prior research by our team corroborated the advantages of a hybrid paradigm [8] that synergizes semantic web technologies with graph-based data models, significantly enhancing data interoperability and querying functionalities. This amalgamation is pivotal for the development of adaptable and efficient data management systems capable of evolving to meet the dynamic requirements of various domains, such as digital libraries and humanities research. By combining the strengths of both semantic web and graph-based paradigms, OntoBuilder provides a comprehensive solution for the management and analysis of complex multifaceted datasets [9,10].

OntoBuilder offers a sophisticated and highly effective mechanism for constructing ontologies within the SLPG framework. By integrating the semantic richness of web standards within the flexible, high-performance framework of SLPGs, OntoBuilder facilitates the development of robust standard-compliant ontologies. Semantic web standards, including the Resource Description Framework (RDF), RDF Schema (RDFS), and Web Ontology Language (OWL), are employed to ensure seamless data interchange and interpretation across diverse systems and platforms, thereby enhancing data exchange and reuse across various domains [11,12]. OntoBuilder also provides a robust and interoperable framework that significantly enhances data management practices. By merging ontologies with semantic web standards and utilizing graph-based models such as SLPGs, OntoBuilder addresses critical data management challenges, such as scalability, data integration, and the execution of complex queries [13].

The remainder of this paper is structured as follows: In Section 2, we review the existing literature on the integration of semantic web technologies with graph-based data models, highlighting their applications across various domains. Section 3 introduces OntoBuilder, explaining its architecture, key features, and the methodologies employed for constructing and managing ontologies within the SLPG framework. In Section 4, we describe the operational flow of OntoBuilder, focusing on user interactions and system responses, and provide an illustrative example of its practical use. Section 6 presents a comprehensive evaluation of OntoBuilder’s performance, with an emphasis on its integration with various linked data tools, including user feedback and areas identified for improvement. Finally, Section 7 concludes the paper by summarizing the key contributions of OntoBuilder, discussing its impact on the field of semantic web technologies, and suggesting potential directions for future research and development.

2. Related Work

The integration of semantic web technologies with graph-based data models has garnered substantial attention in various domains due to their enhanced data handling and retrieval capabilities. This section reviews the pertinent literature, focusing on the synergistic use of RDF, ontologies, and graph databases to facilitate semantic enrichment and improved data interoperability.

2.1. Semantic Web Technologies

The semantic web and RDF are foundational technologies that underpin the vision of a web of data. Tim Berners-Lee et al. [7] introduced the concept of the semantic web to enhance the current web’s capability by enabling machines to understand and respond to complex human requests based on their semantic meaning. This vision is built upon standards and protocols that ensure data can be shared and reused across application, enterprise, and community boundaries.

RDF is a crucial component of the semantic web. It provides a standard method for modeling information in graph form, where data are structured as triples (subject-predicate-object). This structure allows for the representation of information about resources in a way that is both machine-readable and human-readable. RDF’s flexibility and extensibility have made it a preferred choice for representing metadata and for integrating data from diverse sources. Klyne and Carroll [11] provide an exhaustive explanation of RDF’s concepts and abstract syntax. They emphasize how RDF facilitates interoperability by providing a common framework for expressing information about resources. RDF is designed to be simple yet powerful, enabling the creation of rich data models that can be easily processed by computers.
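The triple structure described above can be illustrated with a minimal, dependency-free sketch. It is not taken from OntoBuilder: the `ex:` terms and the `match` helper are hypothetical names chosen for illustration, and a real application would normally rely on an RDF library rather than plain tuples.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical example data; the ex: terms are illustrative only.
TRIPLES = [
    Triple("ex:Dante", "rdf:type", "ex:Author"),
    Triple("ex:DivineComedy", "rdf:type", "ex:Book"),
    Triple("ex:DivineComedy", "ex:author", "ex:Dante"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield the triples that fit the pattern; None acts as a wildcard."""
    for t in triples:
        if s is not None and t.subject != s:
            continue
        if p is not None and t.predicate != p:
            continue
        if o is not None and t.obj != o:
            continue
        yield t


# Everything stated about the book, regardless of predicate.
book_facts = list(match(TRIPLES, s="ex:DivineComedy"))
```

Because every statement has the same three-part shape, data from different sources can be merged by simply concatenating the triple lists, which is the property that makes RDF convenient for integration.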

The adoption of RDF in various domains highlights its utility. For instance, in the healthcare sector, RDF has been used to integrate patient data from different systems, providing a unified view that supports better clinical decision-making. In the cultural heritage domain, RDF helps in linking and enriching data from museums, libraries, and archives, thereby enhancing accessibility and discoverability. Furthermore, RDF’s role in enabling linked data cannot be overstated. Linked data refers to a set of best practices for publishing structured data on the web [14]. These practices involve using RDF to interlink data from different sources, making it possible to navigate between related data across different domains seamlessly. This capability is pivotal for applications requiring comprehensive data integration and retrieval.

Recent advancements have focused on optimizing RDF storage and querying mechanisms to handle large-scale datasets efficiently. Techniques such as RDF indexing, partitioning, and the use of specialized RDF databases (triple stores) have been developed to address performance challenges. These enhancements ensure that RDF remains a scalable solution for managing big data in a semantic web context [15].

2.2. Role of Ontologies

Ontologies further enhance this framework by introducing a structured vocabulary for a particular domain, enabling more precise data integration and retrieval. Ontologies define the types of entities within a domain and the relationships between them, which allows for more sophisticated querying and data analysis. The seminal work by Berners-Lee, Hendler, and Lassila [7] on the semantic web outlines the critical role of ontologies in creating a more intelligent and interconnected web.

The use of ontologies spans a variety of domains, reflecting their versatility and efficacy in organizing complex information structures.

2.2.1. Digital Libraries

In the realm of digital libraries, ontologies play a crucial role in the structuring and enriching of metadata, thus enhancing the searchability and accessibility of digital resources. Ontologies enable the creation of sophisticated search mechanisms that understand the relationships between different concepts, facilitating more precise and meaningful search results [16]. For example, ontologies can link authors to their works, categorize documents by subject matter, and connect historical events with relevant archival materials, significantly improving the user experience and supporting advanced scholarly research [17,18].

2.2.2. Healthcare

Ontologies have become indispensable in healthcare for the management and integration of diverse biomedical data. They provide a structured framework for representing medical knowledge by linking patient records, clinical trials, treatment protocols, and medical literature [19]. This integration is essential for developing decision support systems, improving diagnostic accuracy, and personalizing treatment plans [20]. Ontologies such as SNOMED CT and the Gene Ontology (GO) have been pivotal in standardizing medical terminology, enabling interoperability between different health information systems, and facilitating biomedical research [21].

2.2.3. Finance

In the financial domain, ontologies help in structuring complex financial information, linking various financial instruments, market events, regulations, and economic indicators [22]. This structured representation supports risk assessment, regulatory compliance, and strategic investment decisions. Ontologies facilitate the integration of heterogeneous financial data sources, enabling sophisticated query capabilities and real-time data analysis [23]. They also play a critical role in the development of financial knowledge management systems, enhancing the ability to detect market trends and make informed decisions [24].

2.2.4. Other Domains

Ontologies also find applications in many other fields. In e-commerce, they help in product categorization, recommendation systems, and customer relationship management [25]. In environmental science, ontologies are used to integrate and analyze data related to climate change, biodiversity, and ecosystem management [26]. In education, they support the development of intelligent tutoring systems and personalized learning environments by structuring educational content and linking learning resources with educational standards [27,28].

2.3. Graph Databases and LPGs

Graph databases, such as those based on the Labeled Property Graph (LPG) model, have proven to be particularly effective in managing complex and interconnected data. Graph databases excel in scenarios requiring the modeling of relationships and have been widely adopted for applications requiring high-performance querying of large datasets. Robinson, Webber, and Eifrem [6] provide a comprehensive guide to graph databases, highlighting their advantages over traditional relational databases in the handling of connected data.
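As a rough illustration of the LPG model, the following sketch represents nodes with labels and key-value properties and typed, directed edges that carry their own properties. The class names, the `WRITTEN_BY` relationship, and the sample data are hypothetical; this is an in-memory toy, not how a graph database such as Neo4j stores data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Node:
    node_id: str
    labels: Set[str]
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    rel_type: str
    properties: Dict[str, object] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node_id: str, labels, **properties) -> None:
        self.nodes[node_id] = Node(node_id, set(labels), dict(properties))

    def add_edge(self, source: str, target: str, rel_type: str, **properties) -> None:
        # Edges may only connect nodes that already exist.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, rel_type, dict(properties)))

    def neighbors(self, node_id: str, rel_type: Optional[str] = None) -> List[str]:
        """Follow outgoing edges, optionally restricted to one relationship type."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (rel_type is None or e.rel_type == rel_type)
        ]


# Hypothetical sample data.
graph = PropertyGraph()
graph.add_node("b1", ["Book"], title="Divina Commedia")
graph.add_node("a1", ["Author"], name="Dante Alighieri")
graph.add_edge("b1", "a1", "WRITTEN_BY", role="author")
```

Unlike an RDF triple, an LPG edge can hold properties directly (here `role`), which is one of the structural differences that a mapping between the two models has to account for.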

Several tools have been engineered to aid in the development and administration of LPG models, underscoring the increasing significance of graph databases in varied data-intensive arenas. Neo4j [29] stands out as a leading tool in this realm, offering a robust platform for the construction, querying, and management of LPG-centric graph databases. The extensive Neo4j ecosystem encompasses a potent query language, Cypher, meticulously designed for graph data operations. Another distinguished tool is ArangoDB [30], a multi-model database that accommodates LPGs alongside alternative data models, thus providing versatility in handling diverse data types within a unified system. OrientDB [31] similarly supports LPGs and integrates graphs with document models, thereby enabling the management of intricate data relationships with the benefits of both paradigms. Moreover, TinkerPop [32] offers a framework for engaging with LPGs across different graph databases, enhancing the deployment of LPGs in assorted environments through its Gremlin query language. Collectively, these tools have propelled the widespread adoption and efficiency of LPGs in the stewardship of complex, highly connected datasets across various industries.

2.4. Integration of Semantic Web Technologies and Graph Databases

Several studies have demonstrated the benefits of integrating semantic web technologies with graph databases. Angles and Gutierrez [5] explored the use of graph databases in conjunction with RDF to improve data querying and management efficiency. Their research showed significant performance improvements in data retrieval tasks, particularly in complex query scenarios.

Sporny et al. [13] further investigated the integration of semantic web standards with graph databases, focusing on enhancing data interoperability. Their work emphasized the importance of adhering to established standards to ensure seamless data exchange across different systems and platforms. The application of these principles has been shown to facilitate more effective data integration and retrieval, particularly in domains with heterogeneous data sources.

2.5. Hybrid Approaches Combining Semantic Web and Graph Databases

More recently, Ferilli et al. [8] explored the use of hybrid models that combine the strengths of semantic web technologies and graph databases. Their studies showed that such hybrid approaches can significantly enhance data interoperability and querying capabilities, offering a robust solution for managing and analyzing complex datasets.

The review of existing literature highlights the significant advances made in integrating semantic web technologies with graph-based data models. These innovations have paved the way for improved data interoperability, advanced querying capabilities, and the efficient management of complex datasets.

For instance, Nguyen et al. [33] proposed the Singleton Property Graph as a means to add a semantic web abstraction layer to graph databases, facilitating more advanced data interactions. Similarly, Angles et al. [34] investigated methods for mapping RDF databases to property graph databases, further enhancing the flexibility and utility of these hybrid systems. Hristovski et al. [35] also contributed to this field by exploring the implementation of semantic literature-based discovery using a graph database, demonstrating practical applications of these technologies in knowledge discovery.

However, despite these advancements, there remains a noticeable gap in practical tools that can operationalize these integrations, especially those that adhere to semantic standards while leveraging the flexibility of LPG-based ontologies.

This gap underscores the necessity for a tool that not only incorporates the theoretical benefits of these technologies but also provides a practical, user-friendly solution for building and managing semantic ontologies within an LPG framework. The next section introduces OntoBuilder, a novel tool designed to address these challenges by seamlessly integrating semantic web standards with LPG models, thus bridging the gap between theoretical frameworks and practical applications.

3. OntoBuilder

The innovative tool presented in this study, OntoBuilder, builds upon foundational works by integrating semantic web standards with LPG models to create a sophisticated framework for data management. OntoBuilder leverages RDF, RDFS, and OWL within an LPG framework, offering a robust solution for constructing and managing ontologies. This integration enhances data interoperability and retrieval capabilities across various domains.

The diverse applications of semantic ontologies and LPGs aim to harness semantic technologies to improve data access, interoperability, and management. The reviewed literature highlights substantial advances in integrating semantic web standards with graph-based data models, establishing a solid foundation for enhancing data interoperability, advancing semantic querying capabilities, and optimizing data management systems. Despite various studies elucidating the theoretical and practical benefits of these integrations, a distinct gap remains in the availability of tools that operationalize these concepts into user-friendly applications, particularly for the creation of LPG-based ontologies adhering to semantic standards.

OntoBuilder addresses this gap by providing a user-friendly solution specifically designed for constructing and managing ontologies within an LPG framework. It is uniquely tailored to meet the complex and specific demands of multiple domains, offering an innovative approach that facilitates the practical application of semantic and graph technologies while enhancing the overall utility and efficiency of domain systems. OntoBuilder represents a significant step forward, bridging the gap between theoretical benefits and practical implementations, enabling users to fully exploit the advantages of semantic technology in a graph-based context.

Key features of OntoBuilder include:

LPG schema model creation: OntoBuilder facilitates the creation of ontology schema models based on existing knowledge graph classes and properties. This allows users to utilize and adapt well-established data structures, improving efficiency and consistency in the construction of new ontologies.
Automated ontology construction: OntoBuilder automates the extraction of concepts and relationships from unstructured data, connecting them to the OntoBuilder schema model. This significantly simplifies the knowledge creation process, reducing manual workload and increasing accuracy.
Semantic integration: By integrating semantics, OntoBuilder ensures data interoperability. This enables smooth data exchange and high consistency across different systems and platforms, enhancing the quality and usability of information.
Support for ontology evolution: OntoBuilder supports the continuous evolution of ontologies, allowing for constant updates and refinements as new data become available. This ensures that ontologies remain current and relevant, adapting to changes in data and domain requirements.
Scalability: Designed to handle large datasets, OntoBuilder is scalable and adaptable to various fields, including digital libraries, healthcare, and finance. This scalability ensures that the tool can grow with the users’ needs, maintaining high performance even with expanding data volumes.
Enhanced querying capabilities: Utilizing LPG models, OntoBuilder provides advanced querying capabilities, enabling more sophisticated data retrieval and analysis. Users can perform complex queries and gain detailed insights, improving the quality of data-driven analyses and decision-making.

By integrating semantic web standards within a flexible and high-performance LPG framework, OntoBuilder offers a comprehensive solution for data management. It not only enhances data interoperability and retrieval capabilities but also empowers users across different domains to build precise and detailed ontological frameworks, fostering more effective knowledge management and facilitating advanced research and decision-making processes.

The system architecture of OntoBuilder (see Figure 1) is designed to combine several technologies into a unified platform for semantic ontology management within the SLPG framework. This section describes the technical components of OntoBuilder’s architecture and their respective roles and contributions to the system’s overall efficacy and performance.

OntoBuilder integrates several key technologies: Streamlit for the user interface, SPARQLWrapper for querying RDF data, and Neo4j for managing graph databases. This combination provides a robust framework for the creation, management, and deployment of semantic ontologies, effectively addressing the complex requirements of multi-domain scenarios:

• Streamlit (https://streamlit.io, accessed on 16 July 2024) functions as the primary interface, facilitating user interactions with the system. It allows users to perform operations such as selecting classes from DBpedia (https://dbpedia.org, accessed on 16 July 2024), creating new classes within Neo4j, and dynamically managing ontology properties. This front-end component enhances system accessibility for users with varying levels of technical proficiency, providing a clear and intuitive interface that simplifies the execution of complex tasks.
• SPARQLWrapper (https://rdflib.github.io/sparqlwrapper, accessed on 16 July 2024) is employed for querying DBpedia, a structured version of Wikipedia content that is publicly accessible in RDF format. This utility sends queries to DBpedia’s SPARQL endpoint to retrieve RDF class information. Users can use these data to define or augment their ontologies with real-world datasets, thereby increasing the semantic richness and applicability of the developed ontologies.
• Neo4j (https://neo4j.com/, accessed on 16 July 2024) is the core graph database used to store and manage the ontological structures defined by users. Neo4j supports the Labeled Property Graph (LPG) model, which is particularly well suited for representing the intricate relationships and attributes inherent in each node (class or entity). Neo4j was selected for its robust graph database capabilities, which align with the demands of semantic data modeling and linked data production. Its schema-less nature allows for the dynamic representation of complex relationships inherent in ontologies, making it an ideal intermediary between ontology design and linked data generation. Moreover, Neo4j’s support for the Cypher query language and its ability to efficiently manage and traverse large-scale graphs significantly enhance the performance and scalability of the system. This makes Neo4j not only a storage solution but also a powerful engine for executing graph-specific queries, ensuring data integrity and responsiveness in ontology management.
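To make the DBpedia lookup more concrete, the sketch below builds a SPARQL query for the properties declared on a class and parses a response in the standard SPARQL JSON results format. It is a sketch under stated assumptions: the paper does not give the query OntoBuilder actually sends, so the `rdfs:domain` pattern, the function names, and the `format` URL parameter are illustrative choices. The code deliberately stops short of any network call; in the tool itself, sending the query is the job of SPARQLWrapper.

```python
import re
from typing import Dict, List
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"
DBO = "http://dbpedia.org/ontology/"


def build_property_query(class_name: str, limit: int = 50) -> str:
    """Build a query for properties whose rdfs:domain is the given DBpedia class.

    The query shape is an assumption, not the one used by OntoBuilder.
    """
    # Restrict the class name to a safe local name before embedding it in the query.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", class_name):
        raise ValueError(f"unexpected class name: {class_name!r}")
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?property WHERE {\n"
        f"  ?property rdfs:domain <{DBO}{class_name}> .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def request_url(query: str) -> str:
    """Encode the query as a GET URL (the 'format' parameter is an assumption)."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return DBPEDIA_ENDPOINT + "?" + urlencode(params)


def parse_properties(result: Dict) -> List[str]:
    """Extract property IRIs from a SPARQL JSON results document."""
    return [b["property"]["value"] for b in result["results"]["bindings"]]


# Hand-written sample response, used here instead of a live request.
sample_response = {
    "head": {"vars": ["property"]},
    "results": {
        "bindings": [
            {"property": {"type": "uri", "value": DBO + "author"}},
            {"property": {"type": "uri", "value": DBO + "isbn"}},
        ]
    },
}
properties = parse_properties(sample_response)
```

Keeping query construction and result parsing as pure functions makes both easy to check without depending on the availability of the public endpoint.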

The architecture of OntoBuilder ensures robust performance, scalability, and ease of use, making it an invaluable tool for a wide range of domain-specific applications.

4. Functional Flow

The activity diagram (see Figure 2) outlines the operational flow of the ontology creation tool, detailing user interactions and system responses. The following is a comprehensive description of the diagram, which elucidates the sequence of operations within the proposed system.

1. Start application: The process begins when the user accesses the web application. This action activates the system and prepares it for subsequent interactions.

2. Interface: Upon launch, the user is greeted by an interface developed using Streamlit. This interface serves as the main conduit for user interaction with the system.

3. Select operation: The system offers multiple operations that the user can perform, each catering to different needs (see Figure 2). These operations include:

View DBpedia classes. This feature allows users to create ontological classes based on existing ones in DBpedia, a key semantic web resource built by automatically extracting structured information from Wikipedia. Users can search for a specific class within DBpedia and then view and analyze all properties associated with that class. Subsequently, they can select terms and assign a specific type to each, which may be:
- Attribute: An attribute is a characteristic or a piece of data that belongs to a specific entity or class. For instance, rdfs:comment can be associated with the Book class as an attribute providing comments or descriptions. Attributes are typically used to store additional information about an entity that does not require a relationship with another class.
- Property: A property defines a relationship between two classes or entities. For example, dbp:firstPublicationDate can be designated as a property of the Book class, indicating the date of the first publication. Properties often define the connections between different entities within the ontology and are used to model relationships.
- Class: For instance, dbo:Author can be treated as a class linked to Book, implying that each instance of Book can have a relationship with an instance of the Author class, thereby strengthening the semantic link between book and author.
Users can also add new properties that are not present in DBpedia but are relevant to the ontology they are developing. These new properties can be defined as attributes, properties, or classes and can be linked to a custom namespace, for example, https://www.semanticweb.org/neo4j/ (the semanticweb.org domain is used purely as an example of an IRI for the ontology, and does not represent an actual domain). Examples include: NMProperty#:ArchiveLocation to denote the physical location of a book within an organization that maintains a physical copy, or NMAttribute#physicalStatus to describe the condition of the book, or even NMClass#Proofreaders to refer to the individuals responsible for proofreading (see Figure 3).
The NM prefix is a custom namespace used within the ontology to uniquely identify properties, attributes, or classes that are specifically defined by the user. This prefix helps prevent naming conflicts and ensures that these custom-defined terms remain distinct from those imported from external sources like DBpedia, thus maintaining the integrity and clarity of the ontology.
Create new class. This option enables users to create new ontological classes directly within the graph, independent of the guidance from DBpedia’s information. This approach is beneficial when the desired classes are not pre-existing or when the user wishes to develop a completely original structure that diverges from DBpedia models.
Modify existing class. This option allows users to access and modify classes and properties already defined in their ontological system as needed. Modifications may include updating definitions, adding or removing relationships, or adjusting properties to better reflect an evolving context or emerging needs. This functionality is crucial for keeping the ontology updated and relevant in response to changes in the application domain.

4. Select Export Format: After creating or modifying a class, users decide the format for exporting their graph (see Figure 2):

If the export option is selected, users choose between RDF format and Neo4j format, based on their specific needs for compatibility or integration.

5. Export graph: Depending on the selected format, the graph is exported:

Export to RDF format: Suitable for systems requiring RDF data models, enhancing interoperability and data sharing.
Export to Neo4j format: Keeps the ontology in Neo4j’s native format, optimizing for graph-specific features.

The mapping adopted in the export functions ensures that the transition from LPG to RDF adheres to semantic web standards, enhancing the interoperability, expressiveness, and usability of the data across various platforms.

6. Repeat or Stop: Users can choose to continue with additional operations or complete their session, allowing for dynamic interaction with the system (see Figure 2).

This diagram and its corresponding description illustrate the system’s flexibility and user-centric design, emphasizing the tool’s capability to manage complex data structures efficiently. They also highlight how the system supports improved data interoperability and user engagement in the management of digital libraries (DLs) (see Figure 4).
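The typing step in operation 3 (attribute, property, or class) and the storage of the result in the graph can be sketched as follows. This is a hypothetical illustration rather than OntoBuilder’s code: the node labels, the `name` key, and the function names are assumptions, while the NM-prefixed term names and the HAS_PROPERTY relationship follow the text. The function only builds a parameterized Cypher statement; running it would require a Neo4j driver session, which is omitted here.

```python
from enum import Enum
from typing import Dict, Tuple


class TermKind(Enum):
    """The three types a user can assign to a term."""
    ATTRIBUTE = "Attribute"
    PROPERTY = "Property"
    CLASS = "Class"


def custom_term(kind: TermKind, local_name: str) -> str:
    """Name a user-defined term with the NM prefix, e.g. 'NMAttribute#physicalStatus'."""
    return f"NM{kind.value}#{local_name}"


def cypher_for_term(class_name: str, term: str, kind: TermKind) -> Tuple[str, Dict[str, str]]:
    """Return a parameterized Cypher statement linking a class to a typed term.

    The node labels and the 'name' key are assumptions made for this sketch.
    The label comes from the TermKind enum, so no user text is interpolated
    into the statement; user-supplied values travel as query parameters.
    """
    query = (
        "MERGE (c:Class {name: $class_name}) "
        f"MERGE (t:{kind.value} {{name: $term}}) "
        "MERGE (c)-[:HAS_PROPERTY]->(t)"
    )
    return query, {"class_name": class_name, "term": term}


# A custom attribute describing the condition of a book.
status_term = custom_term(TermKind.ATTRIBUTE, "physicalStatus")
statement, params = cypher_for_term("Book", status_term, TermKind.ATTRIBUTE)
```

Using MERGE rather than CREATE means that repeating the same operation, as the Repeat step allows, does not duplicate nodes or relationships.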

4.1. Initialization of RDF Export

The RDF export function is a critical component of our system, enabling the conversion of data stored in a Neo4j database to RDF format. This section describes the detailed mapping process and motivations behind our choices, with the aim of improving interoperability, expressiveness, and usability across various semantic web platforms. The process begins with the initialization of the RDF graph, where namespaces for RDF, RDFS, OWL, and a custom namespace (NS1) are set up.

@prefix ns1: <https://www.semanticweb.org/neo4j/>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

This step is crucial, as it defines the semantic context and scope of the ontology elements being exported, ensuring correct categorization and linkage according to semantic web standards.
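As an illustration of what the namespace setup provides, the following dependency-free sketch stores the four prefixes in a dictionary and uses them to expand prefixed names into full IRIs and to compact IRIs back. The function names are hypothetical; an RDF library would normally offer this through its own namespace-binding mechanism.

```python
from typing import Dict

# The four namespaces listed above.
PREFIXES: Dict[str, str] = {
    "ns1": "https://www.semanticweb.org/neo4j/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def turtle_header(prefixes: Dict[str, str]) -> str:
    """Render the @prefix declarations, sorted by prefix name."""
    return "\n".join(f"@prefix {p}: <{iri}> ." for p, iri in sorted(prefixes.items()))


def expand(name: str, prefixes: Dict[str, str]) -> str:
    """Turn a prefixed name such as 'owl:Class' into a full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown prefix in {name!r}")
    return prefixes[prefix] + local


def compact(iri: str, prefixes: Dict[str, str]) -> str:
    """Shorten a full IRI to a prefixed name when a namespace matches."""
    for prefix, base in prefixes.items():
        if iri.startswith(base):
            return f"{prefix}:{iri[len(base):]}"
    return f"<{iri}>"  # no known namespace: keep the IRI in angle brackets


header = turtle_header(PREFIXES)
class_iri = expand("owl:Class", PREFIXES)
```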

4.2. Node Processing in Neo4j

Each node retrieved from the Neo4j database is processed according to its type, and an appropriate semantic representation in RDF is assigned:

DBpedia and NS1 Classes are mapped to OWL:Class to denote that the node represents a class definition in the ontology.
DBpedia and NS1 Properties are designated as RDF:Property, indicating that the node serves as an ontological property.
DBpedia and NS1 Attributes are designated as RDF:Property, indicating that the node serves as an ontological property.
Neo4j Instances are classified as NS1:ClassName, highlighting that the node is an instance of a specific class (defined by ClassName).
Other Nodes default to the generic RDFS:Resource, allowing for flexibility in representing additional data types.

The nodes are further enriched with labels using RDFS.label and any existing additional semantic properties, which enhances the utility of the RDF graph for detailed queries and analysis.
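The node mapping described above can be sketched as a lookup from node kind to RDF type. This is an illustration under stated assumptions, not OntoBuilder's actual code: the `Node` record, the kind strings, and the use of the ns1 prefix for instance classes are hypothetical, and node names are assumed to be valid Turtle local names.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """Hypothetical in-memory view of a Neo4j node (not OntoBuilder's model)."""
    name: str
    kind: str             # "class", "property", "attribute", "instance", or other
    class_name: str = ""  # only used when kind == "instance"


# Node kind -> RDF type, following the list above.
KIND_TO_TYPE = {
    "class": "owl:Class",
    "property": "rdf:Property",
    "attribute": "rdf:Property",
}


def node_triples(node: Node) -> list:
    """Return the rdf:type and rdfs:label triples for one node."""
    subject = f"ns1:{node.name}"
    if node.kind == "instance":
        # Instances are typed with their own class in the custom namespace.
        rdf_type = f"ns1:{node.class_name}"
    else:
        # Unknown kinds fall back to the generic rdfs:Resource.
        rdf_type = KIND_TO_TYPE.get(node.kind, "rdfs:Resource")
    return [
        (subject, "rdf:type", rdf_type),
        (subject, "rdfs:label", f'"{node.name}"'),
    ]
```

For example, `node_triples(Node("Book", "class"))` yields one triple typing `ns1:Book` as `owl:Class` and one triple carrying its label.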

4.3. Relationship Processing

The relationships between nodes are extracted and processed to determine their semantic implications.

INSTANCE_OF relationships are mapped to RDF.type, establishing essential class-instance relationships for RDF ontologies.
HAS_PROPERTY relationships are linked with RDFS.isDefinedBy, useful for describing properties associated with a class or instance.
Custom Relationships are handled via specific mappings that accommodate unique or complex interactions within the graph.
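The relationship rules listed above can be sketched in the same style. The two fixed mappings come from the text; the fallback that places custom relationship types in the ns1 namespace is an assumption made for this sketch, since the specific custom mappings are not spelled out here.

```python
# Relationship type -> RDF predicate, following the list above.
RELATIONSHIP_TO_PREDICATE = {
    "INSTANCE_OF": "rdf:type",
    "HAS_PROPERTY": "rdfs:isDefinedBy",
}


def relationship_triple(start: str, rel_type: str, end: str) -> tuple:
    """Map one Neo4j relationship to an RDF triple.

    Custom relationship types fall back to a predicate in the ns1
    namespace; this fallback is an assumption of the sketch.
    """
    predicate = RELATIONSHIP_TO_PREDICATE.get(rel_type, f"ns1:{rel_type}")
    return (f"ns1:{start}", predicate, f"ns1:{end}")
```

A call such as `relationship_triple("TheNameOfTheRose", "INSTANCE_OF", "Book")` produces the class-instance triple, while an unlisted type such as `WRITTEN_BY` (a hypothetical name) keeps its own name as the predicate.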

4.4. Serialization of RDF Graph

The complete RDF graph is serialized into Turtle format. This serialized graph can then be used for various applications, including data sharing, publication on SPARQL endpoints, or further integration into semantic web applications.
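To make the serialization step concrete, the following sketch writes a list of prefixed triples as Turtle text using only the Python standard library. It is a simplified stand-in for a full RDF library: the function name `to_turtle` is hypothetical, and terms are assumed to be already written in valid Turtle syntax (prefixed names or quoted literals).

```python
# Namespace prefixes from Section 4.1.
PREFIXES = {
    "ns1": "https://www.semanticweb.org/neo4j/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def to_turtle(triples: list) -> str:
    """Serialize (subject, predicate, object) triples as a Turtle document."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in PREFIXES.items()]
    lines.append("")  # blank line between the header and the statements
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines) + "\n"


example = to_turtle([
    ("ns1:Book", "rdf:type", "owl:Class"),
    ("ns1:Book", "rdfs:label", '"Book"'),
])
```

The resulting string can be written to a `.ttl` file and loaded by any tool that reads Turtle.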

This approach ensures that the data transition from an LPG format to RDF is not only technically sound but also enhances the usability and compatibility of the data within the broader semantic web ecosystem. Key advantages include:

Interoperability: Facilitates data integration with other semantic web technologies.
Expressiveness: Allows for the expression of complex relationships and constraints.
Scalability: Supports expansions in data volume and complexity without losing the benefits of semantic structuring.
Advanced Querying and Inference: Enables sophisticated data analysis and reasoning capabilities.

This explanation of the RDF export functionality underscores its crucial role in mapping LPG to RDF and highlights the Neo4j endpoint’s compatibility with SPARQL standards.

5. OntoBuilder Demo

In this section, we provide a detailed description of the process of creating the SLPG ontology, focusing on both manual and automatic modes of operation. This section builds on what was described in Section 4, with a more in-depth look at the automation aspects, particularly how unstructured text is handled, and the data import process. Additionally, a demonstration of this process can be accessed at the following link: https://youtu.be/hIBa_6MDQ_Y, accessed on 9 September 2024.

5.1. Domain: Digital Library

Consider the creation of an ontology in the domain of a digital library. The main entities to be described include books, authors, topics covered in books, publishers, locations, and involved persons.

To begin, we use the guided creation section of DBpedia. As a reference class, we select the class Book. We will then add the properties available in DBpedia, such as:

comment: A brief description or annotation of the book.
author: The name of the person who wrote the book.
subject: The main topics or themes addressed in the book.
publisher: The company or organization responsible for publishing the book.
publicationDate: The date when the book was first published.
isbn: The International Standard Book Number, a unique identifier for books.

Additionally, we will add custom properties such as:

bookStatus: Indicates the current availability status of the book (e.g., available, checked out).
editingContributors: Names of the individuals involved in the editing process of the book.

Finally, we name the class Book.

Once the class is created with its properties (see Figure 5), we will link them to both the DBpedia resources and our internal digital library resources. The DBpedia properties maintain links to their DBpedia URIs, while the custom properties will have specific URIs that describe the internal reference environment.

To illustrate the manual creation process, consider the book “The Name of the Rose” by Umberto Eco. Here is how the instance of the class Book would be created and populated.

Creation of the Class Book

Name: Book
DBpedia Properties:
- comment: A description of the book.
- author: Umberto Eco.
- subject: Historical novel, Mystery.
- publisher: Bompiani.
- publicationDate: 1980.
- isbn: 978-88-452-1523-5.
Custom Properties:
- bookStatus: Available.
- editingContributors: Maria Bonfantini.

Manual Insertion of Properties

comment: “The Name of the Rose” is a historical novel and mystery written by Umberto Eco, set in a Benedictine monastery in the 14th century.
author: Umberto Eco.
subject: Historical novel, Mystery.
publisher: Bompiani.
publicationDate: 1980.
isbn: 978-88-452-1523-5.
bookStatus: Available.
editingContributors: Maria Bonfantini.

During the ontology population phase, class instances may be added through either manual or automated processes. In the manual insertion mode, as exemplified earlier, the user manually inputs the values for class properties. Conversely, in the automatic insertion mode, an advanced language model conducts a rigorous examination of the provided text to identify and categorize different classes and their respective properties. For instance, the model will systematically extract relevant information about all books and their associated attributes from the text.

Suppose that we have the following text:

“The Name of the Rose, written by Umberto Eco, is a historical novel published by Bompiani in 1980. The book deals with themes such as faith, truth, and heresy and is available in our library”.

In automatic insertion mode, the language model will parse each sentence to identify references to books and their properties. It will identify The Name of the Rose as a book, extract the author Umberto Eco, the publisher Bompiani, the publication date 1980, and other details, which will be used to automatically create class instances in the ontology.
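As a hedged illustration of what happens after extraction, the sketch below takes a record of the kind the language model might return for the text above and builds a parameterized Cypher statement that stores it as an instance of Book. The language model call itself is omitted. The node labels `Class` and `Instance`, the function name, and the exact property values chosen are assumptions for illustration; only the INSTANCE_OF relationship name comes from Section 4.3.

```python
def instance_to_cypher(class_name: str, name: str, props: dict) -> tuple:
    """Build a parameterized Cypher statement that stores one extracted instance."""
    query = (
        "MERGE (c:Class {name: $class_name}) "
        "MERGE (i:Instance {name: $name}) "
        "SET i += $props "
        "MERGE (i)-[:INSTANCE_OF]->(c)"
    )
    params = {"class_name": class_name, "name": name, "props": props}
    return query, params


# Illustrative extraction result for the sample text (hypothetical output).
extracted = {
    "author": "Umberto Eco",
    "publisher": "Bompiani",
    "publicationDate": "1980",
    "subject": ["faith", "truth", "heresy"],
    "bookStatus": "Available",
}

query, params = instance_to_cypher("Book", "The Name of the Rose", extracted)
```

Passing the values as parameters rather than splicing them into the query string avoids quoting problems with titles and names. With the official Neo4j Python driver, the pair could then be handed to a session via `session.run(query, params)`.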

The SKATEBOARD interface [36] elucidates the interconnectedness among individuals, their scholarly articles, and the works cited therein by integrating dynamic data from DBpedia to enhance the visualization of the ontology (see Figure 6). This exemplifies the real-time interoperability and semantic augmentation of LPG ontologies, with instantaneous updates manifesting within the interface. The SKATEBOARD interface serves as a visualization tool that can display the ontologies created by OntoBuilder, whether manually or automatically. However, it is important to note that the SKATEBOARD interface itself does not perform the extraction of classes and relationships from unstructured text; this is carried out by OntoBuilder’s automatic mode. SKATEBOARD is used to explore and interact with the resulting ontologies, as shown in Figure 6.

5.2. Domain: History of Computer Science

The second use case for OntoBuilder focuses on the historical development of computer science. This domain encompasses key milestones, influential figures, seminal publications, and pivotal technologies that have shaped the field over time.

To illustrate the creation and management of ontologies in this domain, we demonstrate how OntoBuilder can be used to import data, explore class structures, and extract knowledge from unstructured text.

Using OntoBuilder, we begin by importing data from a CSV file. These data include various classes such as: Computer, Invention, Inventor, Founder, Event, Software, Person, Place, and Company. Each class is associated with specific properties relevant to the historical context, such as birthDate for persons, inventionDate for inventions, and foundationYear for companies. Figure 7 shows how OntoBuilder uses data imported from a CSV file to create and explore classes and properties within the Neo4j environment. The figure highlights the automated aspect of data integration, where large datasets are imported and structured into the ontology without manual input. The process involves assigning unique IRIs to ensure each entity’s unambiguous identification, which is crucial for maintaining data integrity and interoperability within the semantic web.
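The import step with IRI assignment can be sketched as follows. This is a minimal illustration, not OntoBuilder's importer: the column names `class` and `name`, the IRI pattern, and the sample rows (including the placeholder company and year) are assumptions made for the example.

```python
import csv
import io
from urllib.parse import quote

BASE_IRI = "https://www.semanticweb.org/neo4j/"  # ns1 namespace from Section 4.1


def mint_iri(class_name: str, label: str) -> str:
    """Derive a stable IRI from the class name and a percent-encoded label."""
    slug = quote(label.strip().replace(" ", "_"), safe="")
    return f"{BASE_IRI}{class_name}/{slug}"


def load_rows(csv_text: str) -> list:
    """Parse CSV text and attach an IRI to every row."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["iri"] = mint_iri(row["class"], row["name"])
        rows.append(row)
    return rows


# Placeholder data: "Example Corp" and its year are made up for illustration.
sample = (
    "class,name,foundationYear\n"
    "Company,Example Corp,1970\n"
    "Inventor,Chuck Peddle,\n"
)
rows = load_rows(sample)
```

Deriving the IRI deterministically from the class and name means that re-importing the same file yields the same identifiers, which is one simple way to keep entities unambiguous across imports.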

Once the data are imported, OntoBuilder utilizes Neo4j to visualize and explore the class structures. The Neo4j Class Explorer allows users to navigate through the ontology, examining the relationships and properties of each class. Following the rules of the semantic web, each entity is assigned a unique IRI to ensure unambiguous identification and interoperability. Figure 8 illustrates the manual creation of an ontology in the digital library domain using the Neo4j platform. In this example, the class “Book” and its associated properties, such as “author” and “publicationDate”, are created manually. This figure demonstrates how OntoBuilder integrates with Neo4j to visualize and manage ontological data. It does not depict the automatic extraction process but instead focuses on how manually created data are structured and linked in Neo4j.

OntoBuilder’s capability to extract classes and properties from unstructured text is demonstrated by processing historical documents and publications. For instance, given a text about Chuck Peddle, OntoBuilder can identify relevant classes such as “Inventor” and “Invention” and extract properties like “description”, “birthDate”, and “relatedInventions”. Figure 9 exemplifies the automatic extraction of classes and relationships from unstructured text using OntoBuilder’s NLP capabilities. The text is parsed to identify entities like “Chuck Peddle” (an inventor) and properties such as “birthDate” and “relatedInventions”. The extracted information is then automatically added to the ontology in Neo4j. This figure emphasizes the automatic mode’s ability to transform raw text into structured, queryable knowledge, which can be further explored using tools like the SKATEBOARD interface.

The SKATEBOARD interface enhances the visualization of individual contributions and relationships within the field of computer science. By exploring entities such as “Chuck Peddle”, users can gain insight into their biographies, major works, and the impact they had on various aspects of computer science. Figure 10 shows the visualization of the graph of people extracted from the text and their relationships with the classes and properties within OntoBuilder.

Overall, this use case demonstrates how OntoBuilder can effectively manage and analyze historical data in the domain of computer science, providing valuable insights and facilitating advanced research.

6. Evaluation

The primary objective of this evaluation is to elucidate the integration efficacy of ontologies crafted using OntoBuilder within the semantic web framework, emphasizing their compatibility with tools designed for the visualization and administration of linked data [37]. The tools selected for this evaluative process encompass SKATEBOARD [36], RDF Explorer [38], LD-VOWL [39], and LOD-Live [40]. The assessment of OntoBuilder’s effectiveness and integration capability with these tools was conducted employing a series of structured questions and Likert scales with ratings ranging from 1 to 5. Furthermore, the evaluation methodology adopted to scrutinize the critical features of OntoBuilder is delineated in detail. The evaluation was performed with the participation of 10 users, all of whom are researchers in the field of computer science.

The participant group consisted of 7 male and 3 female researchers with the following levels of experience:

Junior Researchers (3 participants): Less than 2 years of experience with ontologies.
Senior Researchers (4 participants): Between 2 and 5 years of experience with ontologies.
Expert Researchers (3 participants): More than 5 years of experience with ontologies.

Participants were also categorized based on their familiarity with ontology tools:

Beginner (2 participants): Limited experience with ontology tools.
Intermediate (5 participants): Moderate experience with ontology tools.
Advanced (3 participants): Extensive experience with ontology tools.

Before the evaluation, participants were assigned specific tasks designed to reflect real-world scenarios of using OntoBuilder with the selected tools. These tasks included:

Task 1: Integrate an ontology created with OntoBuilder into SKATEBOARD and explore the dataset.
Task 2: Use RDF Explorer to query and visualize a complex ontology generated by OntoBuilder.
Task 3: Visualize an ontology using LD-VOWL, focusing on the clarity and effectiveness of the visual representation.
Task 4: Explore a large linked dataset using LOD-Live, starting from a single resource and expanding connections.

These tasks were designed to evaluate both the functional capabilities (e.g., integration and visualization) and non-functional aspects (e.g., ontology representation accuracy, overall satisfaction) of the tools in conjunction with OntoBuilder.

6.1. Evaluation Questions

For each tool, users were asked to rate their agreement with the following statements:

Integration compatibility: The tool integrates seamlessly with OntoBuilder, effectively supporting the ontologies it produces.
Ontology representation accuracy: The tool accurately represents the ontologies generated by OntoBuilder, reflecting their structure and semantics as intended.
Performance: The tool performs efficiently when handling ontologies and datasets created by OntoBuilder, even when they are large or complex.
Support for OntoBuilder specific features: The tool supports and enhances any unique features of ontologies produced by OntoBuilder (e.g., specific relationships, custom classes).
Overall effectiveness in task execution: The tool, in conjunction with OntoBuilder, effectively supports the tasks designed for this evaluation.

6.2. Tools Description and Evaluation

This section proposes a comprehensive examination of each tool, supplemented by user testimonials and performance assessments, thereby underscoring their efficacy in practical, real-world scenarios when integrated with OntoBuilder.

1. SKATEBOARD

Description: SKATEBOARD is designed for clear and interactive visualization of linked data. It provides an intuitive interface that allows users to explore complex datasets easily. As a tool designed to facilitate knowledge exploration through the application of semantic technologies, SKATEBOARD addresses the growing demand for advanced solutions that streamline knowledge extraction, management, and visualization. In the current era characterized by abundant information, graph-based representations have emerged as a robust approach for uncovering intricate data relationships, complementing the capabilities offered by AI models. Acknowledging the transparency and user control challenges faced by AI-driven solutions, SKATEBOARD offers a comprehensive framework encompassing knowledge extraction, ontology development, management, and interactive exploration. By adhering to linked data principles and adopting graph-based exploration, SKATEBOARD provides users with a clear view of data relationships and dependencies. Furthermore, it integrates recommendation systems and reasoning capabilities to augment the knowledge discovery process, thus introducing a serendipity effect generated by the SKATEBOARD interface exploration.

Integration with OntoBuilder: The integration between OntoBuilder and SKATEBOARD is designed to be highly interactive. OntoBuilder supports real-time synchronization with SKATEBOARD, ensuring that any changes made in OntoBuilder are immediately reflected in SKATEBOARD. This integration is facilitated by direct API connections, which allow OntoBuilder to push updates instantly to SKATEBOARD. This tight integration not only simplifies the workflow but also provides a robust platform for ontology management, where users can visualize complex relationships and make real-time modifications to their ontologies.

User Feedback:

Integration compatibility: 4.7. Users found the integration process straightforward.
Ontology representation accuracy: 4.8. The ease of visualizing complex relationships and the real-time modification capabilities of OntoBuilder were highly praised.
Performance with OntoBuilder ontologies: 4.2. Users noted occasional performance issues when handling very large datasets.
Support for OntoBuilder-specific features: 4.6. SKATEBOARD supported OntoBuilder-specific features effectively.
Overall effectiveness in task execution: 4.5. Users expressed high satisfaction with the tool’s effectiveness in completing tasks.

Users rated SKATEBOARD 4.6 out of 5 on the Likert scale.

2. RDF Explorer

Description: RDF Explorer allows users to visualize and analyze RDF data intuitively. It is widely used in academic settings for its robust capabilities in handling complex RDF datasets. RDF Explorer offers an interface similar to QueryVOWL, but where the user starts from the data rather than the model. The node-link paradigm for exploration is paired with a mechanism to build queries by replacing nodes with variables. The results of a query may be used to refine the query itself or further explore the dataset.

Integration with OntoBuilder: The integration with RDF Explorer involves OntoBuilder’s automated ontology construction, which simplifies the data import process. OntoBuilder exports ontologies in RDF format, ready for direct import into RDF Explorer. The integration ensures that complex ontologies are accurately represented and easily navigable within RDF Explorer, allowing users to take full advantage of RDF Explorer’s querying and visualization capabilities.

User Feedback:

Integration compatibility: 3.8. RDF Explorer accurately reflects the structure and queries of OntoBuilder-created ontologies.
Ontology representation accuracy: 4.0. The visualizations were found effective.
Performance with OntoBuilder ontologies: 3.5. Users found the performance adequate but noted some challenges with large datasets.
Support for OntoBuilder-specific features: 3.7. Support for specific features was noted but could be enhanced.
Overall effectiveness in task execution: 3.5. Users were generally satisfied, with room for improvement.

Users rated RDF Explorer 3.7 out of 5 on the Likert scale.

3. LD-VOWL

Description: LD-VOWL provides user-friendly visual representations of ontologies using the Visual Notation for OWL Ontologies (VOWL) to present data in an accessible format. Such tools leverage SPARQL queries to process RDF triples and extract schema information, inferring the ontology schema from the linked data and presenting the most representative concepts. The extracted ontology schema is then progressively visualized as a graph, offering various interactive operations for exploration and analysis.

Integration with OntoBuilder: OntoBuilder’s integration with LD-VOWL leverages its enhanced querying capabilities to create comprehensive visual representations. The ontologies constructed in OntoBuilder are exported in RDF format, which LD-VOWL can interpret to generate visual representations. This integration is seamless, allowing users to visualize the ontology’s structure, including complex relationships, directly in LD-VOWL.

User Feedback:

Integration compatibility: 4.3. Users found the integration relatively easy.
Ontology representation accuracy: 4.5. The visual representations were highly praised.
Performance with OntoBuilder ontologies: 4.1. Performance was generally good.
Support for OntoBuilder-specific features: 4.2. LD-VOWL effectively supported OntoBuilder’s unique features.
Overall effectiveness in task execution: 4.3. Users were very satisfied with the tool’s effectiveness.

Users rated LD-VOWL 4.2 out of 5 on the Likert scale.

4. LOD-Live

Description: LOD-Live is designed for dynamic exploration of linked datasets, supporting real-time data navigation and providing a flexible interface for interacting with large sets of linked data. Users can start their exploration with a single resource and then expand their search by exploring the properties of connected elements. This tool supports multiple endpoints, enabling retrieval of information from various sources based on the URI provided by the user. The visualization presents resources as circles, with the value of the property rdfs:label displayed within. Smaller concentric circles represent objects connected to the central resource, and clicking on these circles allows further expansion.

Integration with OntoBuilder: The scalability of OntoBuilder ensures smooth integration with LOD-Live, enabling the efficient handling of large datasets. The integration involves using standardized data formats and APIs that allow LOD-Live to pull data directly from OntoBuilder. This setup ensures that users can explore large linked datasets in real time without performance bottlenecks, offering a robust platform for dynamic data exploration.

User Feedback:

Integration compatibility: 4.3. The integration process was found to be easy.
Ontology representation accuracy: 4.3. The flexibility and real-time navigation features were highly effective.
Performance with OntoBuilder ontologies: 4.0. Performance was generally good, though some users suggested enhanced data filtering.
Support for OntoBuilder-specific features: 4.1. The tool supported OntoBuilder-specific features effectively.
Overall satisfaction: 4.1. Users were satisfied overall.

Users rated LOD-Live 4.2 out of 5 on the Likert scale. (See Figure 11).
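For readers who wish to compare the per-criterion scores across tools, the sketch below collects the five scores listed for each tool and computes their unweighted mean. The text does not state how the overall ratings were derived, so the unweighted mean is only one plausible reading and need not coincide with the overall ratings reported above; the function name `mean_scores` is hypothetical.

```python
from statistics import mean

# Per-criterion scores as listed in the User Feedback paragraphs, in order:
# integration, representation accuracy, performance, feature support, overall.
SCORES = {
    "SKATEBOARD": [4.7, 4.8, 4.2, 4.6, 4.5],
    "RDF Explorer": [3.8, 4.0, 3.5, 3.7, 3.5],
    "LD-VOWL": [4.3, 4.5, 4.1, 4.2, 4.3],
    "LOD-Live": [4.3, 4.3, 4.0, 4.1, 4.1],
}


def mean_scores(scores: dict) -> dict:
    """Return the unweighted mean of the criterion scores for each tool."""
    return {tool: round(mean(values), 2) for tool, values in scores.items()}


def ranking(scores: dict) -> list:
    """Return tool names ordered from highest to lowest mean score."""
    means = mean_scores(scores)
    return sorted(means, key=means.get, reverse=True)
```

Under this reading, the ordering of the tools by mean score places SKATEBOARD first and RDF Explorer last.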

6.3. Discussion

The integration of OntoBuilder with key tools for linked data, including SKATEBOARD, RDF Explorer, LD-VOWL, and LOD-Live, has demonstrated the system’s versatility and effectiveness in supporting the visualization and management of semantic data. The positive feedback from users highlights the strengths of each tool in terms of user interface and functionality, while the identified areas for improvement indicate opportunities for further optimization.

SKATEBOARD emerged as the top performer, particularly excelling in Integration compatibility, Ontology representation accuracy, and User interface. Its high scores in these categories (4.7, 4.8, and 4.9, respectively) indicate robust integration capabilities and a user-friendly interface, making it highly suitable for visualizing and managing complex ontologies. The only minor drawback noted was occasional performance issues with very large datasets, though it still scored a commendable 4.2 in Performance. The overall satisfaction rating of 4.5 indicates that users were very pleased with SKATEBOARD’s functionality and ease of use.

RDF Explorer, while effective in visualization (4.0) and integration compatibility (3.8), faced challenges in performance (3.5), particularly with large datasets. Users appreciated its user-friendly interface (3.7), but the overall satisfaction (3.5) was the lowest among the evaluated tools. This suggests that while RDF Explorer is capable, it may not be the best choice for handling very large or complex ontologies.

LD-VOWL received high marks for Ontology representation accuracy (4.5) and Integration compatibility (4.3). Its performance (4.1) and user interface (4.1) were also well-regarded, leading to an overall satisfaction score of 4.3. This indicates that LD-VOWL is a reliable and effective tool for visualizing ontologies, benefiting from its integration with OntoBuilder’s querying capabilities.

LOD-Live was noted for its dynamic exploration capabilities, scoring well in Integration compatibility (4.3), Ontology representation accuracy (4.3), and User Interface (4.2). Its performance (4.0) was also satisfactory, though there were suggestions for enhanced data filtering. The overall satisfaction score of 4.1 reflects that users were generally pleased with its performance and usability, making it a strong choice for exploring large linked datasets.

In summary, SKATEBOARD stands out as the most robust and user-friendly tool among those evaluated, with LD-VOWL and LOD-Live also performing well across most metrics. RDF Explorer, while useful, may require improvements in handling larger datasets to enhance user satisfaction. These findings provide valuable insights for users looking to integrate OntoBuilder with visualization and management tools for linked data.

One of the most notable strengths observed was the seamless visualization of complex relationships facilitated by OntoBuilder. For instance, in the domain of digital libraries, OntoBuilder can manage and visualize intricate relationships between books, authors, publishers, and related historical events. An example of this is the relationship between a historical novel and its author, the various editions published by different publishers, and the critical reviews or scholarly articles associated with the book over time. OntoBuilder can effectively handle and display these multi-faceted connections, allowing users to explore how a single work is interconnected with various entities, such as the author’s other works, thematic connections to other literature, and even its influence on subsequent publications.

This capability was particularly appreciated in tools like SKATEBOARD and LD-VOWL, where users could intuitively navigate and modify these complex data visualizations in real-time. For example, a user might explore the network of relationships surrounding a specific book, examining how it is linked to different authors, editions, and critical reviews across various publications. This feature greatly enhanced the user experience, making it easier for users to understand and manipulate their data. The intuitive interfaces of these tools, combined with OntoBuilder’s robust functionality, provided a powerful combination for managing semantic data.

In addition to visualization capabilities, OntoBuilder’s automated ontology construction was another key feature that received positive feedback. Users of RDF Explorer and other tools highlighted the efficiency and accuracy with which OntoBuilder could extract concepts and relationships from unstructured data. This automation not only reduced manual workload but also increased the accuracy and consistency of the resulting ontologies. The semantic integration provided by OntoBuilder ensured high data interoperability, which was crucial for maintaining consistency between different systems and datasets.

However, the evaluation also identified several areas for improvement. Performance issues were noted with SKATEBOARD when handling very large datasets, suggesting a need for optimization to enhance scalability and smoothness of visualizations. Similarly, RDF Explorer users faced challenges with the initial setup, indicating that more comprehensive documentation and user guides could help ease the onboarding process. LD-VOWL users expressed the need for more customization options in visualization to accommodate specific user preferences and requirements. LOD-Live users suggested improved data filtering options to streamline the data exploration process and focus on relevant subsets of data more efficiently. This feedback underscores the importance of continuous improvement and optimization to address these challenges. By focusing on enhancing performance, documentation, customization, and data filtering capabilities, OntoBuilder can further solidify its position as a leading tool for ontology management and linked data integration.

7. Conclusions

Ontologies are essential for the management and integration of heterogeneous datasets. Motivated by the lack of tools capable of constructing LPG-based semantic ontologies from scratch, we developed OntoBuilder, an advanced tool that leverages the structural capabilities of SLPGs in strict alignment with semantic web standards to create a sophisticated framework for data management. OntoBuilder proved its versatility and power in integrating with various linked data tools, significantly enhancing the management of semantic data across diverse domains. This study underscores OntoBuilder’s ability to address critical challenges in data management through the utilization of SLPGs and adherence to semantic web standards. The positive reception of OntoBuilder’s key features, such as semantic integration, scalability, and enhanced querying capabilities, underscores its effectiveness in delivering a robust and user-friendly ontology management solution.

The evaluation highlighted OntoBuilder’s strengths in providing seamless visualization, automated ontology construction, and robust semantic integration, which collectively enhanced user workflows and data management capabilities. SKATEBOARD emerged as the top performer, excelling in Integration compatibility, Ontology representation accuracy, and User interface. Its high scores in these categories reflect its robust integration capabilities and user-friendly interface, making it highly suitable for visualizing and managing complex ontologies. However, occasional performance issues with very large datasets were noted, suggesting a need for optimization to enhance scalability. RDF Explorer, while effective in visualization and integration compatibility, faced challenges in performance, particularly with large datasets. This indicates that while RDF Explorer is capable, it may not be the best choice for handling very large or complex ontologies. LD-VOWL and LOD-Live also performed well across most metrics, with users particularly appreciating their dynamic exploration capabilities and effective visualizations.

Future development of OntoBuilder will address the identified areas for improvement, such as performance, documentation, customization, and data filtering capabilities, to ensure it remains at the forefront of semantic data management and linked data tool integration. We also aim to expand the scope of our testing to include more diverse and complex scenarios. We are particularly interested in incorporating machine learning techniques to predict and optimize query handling and data integration. Future work will focus on refining the adaptability and scalability of our ontologies to meet evolving data needs and emerging web standards, ensuring that our technologies remain at the forefront of semantic web development. Finally, we are preparing a second evaluation to measure, using both quantitative and qualitative metrics, the quality of information automatically extracted by OntoBuilder on a large scale. This will provide deeper insights into the system’s performance and its capability to handle extensive datasets effectively.

Author Contributions

Conceptualization, E.B., M.C. and S.F.; methodology, E.B.; software, E.B.; validation, E.B.; formal analysis, E.B. and M.C.; investigation, E.B.; resources, E.B.; data curation, E.B.; writing—original draft preparation, E.B. and S.F.; writing—review and editing, E.B. and S.F.; visualization, E.B.; supervision, S.F.; project administration, S.F.; funding acquisition, S.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by projects CHANGES “Cultural Heritage Active Innovation for Sustainable Society” (PE00000020), Spoke 3 “Digital Libraries, Archives and Philology” and FAIR “Future AI Research” (PE00000013), Spoke 6 “Symbiotic AI”, funded by the Italian Ministry of University and Research NRRP initiatives under the NextGenerationEU program.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. OntoBuilder architecture.

Figure 2. Activity diagram of ontology creation.

Figure 3. System interface to build SLPG ontology.

Figure 4. LPG to RDF mapping.

Figure 5. Neo4j explore graph (LPG).

Figure 6. SKATEBOARD Interface (SPARQL endpoint).

Figure 7. Class and property creation in OntoBuilder through CSV import model.

Figure 8. Neo4j Class explorer.

Figure 9. Extract classes and related properties from unstructured text.

Figure 10. Persons exploration in SKATEBOARD interface.

Figure 11. OntoBuilder performance metrics.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
