Article

An Ontology-Based Approach for Knowledge Acquisition: An Example of Sustainable Supplier Selection Domain Corpus

Faculty of Computer Science and Information Technology, West-Pomeranian University of Technology in Szczecin, Żołnierska 49 Street, 71-210 Szczecin, Poland
Electronics 2022, 11(23), 4012; https://doi.org/10.3390/electronics11234012
Submission received: 20 October 2022 / Revised: 24 November 2022 / Accepted: 2 December 2022 / Published: 3 December 2022
(This article belongs to the Special Issue Knowledge Engineering and Data Mining)

Abstract
Selecting the right supplier is a critical decision in sustainable supply chain management. Sustainable supplier selection plays an important role in achieving a balance between the three pillars of a sustainable supply chain: economic, environmental, and social. It is one of the most crucial aspects of running a business in this regard, and an accurate and reliable approach to it is required. Therefore, the main contribution of this paper is to propose and implement an ontology-based approach for knowledge acquisition from text for the sustainable supplier selection domain. This approach is dedicated to acquiring complex relationships from texts and coding them in the form of rules. The expected outcome is to enrich the existing domain ontology with these rules to obtain higher relational expressiveness, perform reasoning, and produce new knowledge.

1. Introduction

The concept of sustainable development is based on the intersection of three dimensions: economic, environmental, and social. Each of them deals with different aspects, but together they focus on promoting sustainable development. Globalization forces global manufacturers to attach much importance to partnerships with suppliers. In general, a supply chain is a concept that links upstream, midstream, and downstream. The manufacturers’ aim is to reduce costs in this process. Moreover, supply chain management (SCM) receives relevant information from downstream to improve the quality of the goods provided upstream and downstream [1]. Growing customer, non-governmental organization (NGO), and regulatory concerns about environmental, social, and corporate responsibility have drawn industry academics and practitioners to the concept of sustainable supply chain management [2].
The assessment of sustainable development is an issue of growing importance among scientists and decision-makers. Sustainability assessment offers many opportunities to measure and evaluate the level of its accomplishment. The search for effective methods of assessing and monitoring sustainable development is now becoming one of the key factors determining the development of a sustainable society. The problem of assessing sustainable development applies to almost all areas. International environmental policy, governments, and the public have stimulated enterprises to strictly adopt sustainability concepts in supply chain networking to obtain reactive, regulatory, proactive strategic, and competitive merit and to mitigate non-sustainable challenges and factors threatening the world’s environment [3]. Due to globalization, sustainable supply chains are becoming more and more important. Hence, it is worth paying attention to ensuring sustainable supplier selection in this process. Sustainable supplier selection is a combined multi-dimensional problem that includes considering both qualitative and quantitative factors. The sustainability paradigm has been considered a comprehensive term in supplier selection, which includes a vital presence of three aspects (economic, environmental, and social) [4].
Managing sustainable supply chain complexity is one of the most difficult problems in today’s global supply chains and is regarded as a key impediment to business performance. It has a significant influence on competitiveness, costs, customer satisfaction, product innovation, and market share. Therefore, decision-makers must know the criteria driving sustainable supply chain efficiency. Proper identification and prioritization of sustainable supplier criteria are required for effective monitoring and control of supply chain management [5]. Moreover, the timeliness of these criteria is also of great importance. The selection of a sustainable supplier depends on many factors. Thus, the crucial question is how to find a reasonable balance between comprehensiveness and a manageable multi-dimensional knowledge base as well as up-to-date information exchange.
This paper presents an ontology-based approach to knowledge acquisition from text. This approach is dedicated to acquiring complex relationships from texts and coding them in the form of rules. The approach begins by elaborating data using VOSviewer to plot knowledge domain maps. Next, existing domain knowledge is implemented as an OWL ontology, and NLP tools and text-matching techniques are applied to deduce different atoms, such as classes, properties, and literals, to capture deductive knowledge in the form of new rules. The expected outcome is to enrich the existing domain ontology with these rules to obtain higher relational expressiveness, perform reasoning, and produce new facts.
Several research gaps are identified through an in-depth review of the literature. Firstly, there is a lack of a comprehensive knowledge base about criteria; various literature studies identify sets of criteria but cannot effectively estimate sustainable supplier selection criteria [1,6,7]. Secondly, in most cases, the performance of sustainable supplier selection is evaluated subjectively [3,8,9].
Moreover, there is a lack of a systematic framework to handle knowledge about sustainable supplier selection criteria [1,6,7,9]. There is also a lack of a complex approach to both selecting and filtering linguistic information about criteria determining sustainable supplier selection and its categorization in the form of a knowledge base [3,5,8,10,11].
These research gaps are transformed into the author’s contribution as follows:
  • Plotting knowledge domain maps;
  • Development of a framework for selecting sustainable supplier criteria;
  • Ontology design and implementation;
  • Semi-automated ontology population by knowledge extraction from various resources;
  • Rule-based reasoning.
Based on this, it is possible to define the following highlights:
  • Development of an ontology-based framework to deal with distributed knowledge representation;
  • Development of a domain ontology that stores various information about sustainable suppliers to support various aspects of knowledge management by combining dynamic data provided from external sources with predefined information gathered in the ontology;
  • Providing examples of using ontology in various scenarios in the domain of sustainable supplier selection;
  • Creating a knowledge base with rules and queries using JAPE and reasoners;
  • Demonstrating the effectiveness of rule-based reasoning to increase the ability of logical reasoning in the context of selecting sustainable supplier criteria.
The presented approach begins with creating domain knowledge represented as an OWL ontology and applies NLP tools and text-matching techniques to deduce different atoms, such as classes, properties, and literals, to capture new knowledge. This research extends the body of knowledge on ontologies for the sustainable supplier domain by providing a systematic keywords map of the subject and grasping the main criteria in the research field. The results demonstrate that the proposed approach can (1) successfully handle the knowledge domain, (2) reduce the time for searching for relevant information, (3) improve the accuracy of search results to suit users’ specific needs, and (4) provide quick updates with new knowledge.
The remainder of this paper is organized as follows. Section 2 presents the related works, in particular, taking into account such topics as sustainable supplier selection, information extraction, NLP, and ontologies. In Section 3, Materials and Methods, a new ontology-based approach for extracting knowledge in the form of rules from texts is described in detail. Section 4 presents the working example of the elaborated approach. Section 5 provides the conclusions and directions for further research.

2. Background and Related Works

2.1. Sustainable Supplier Selection

The growing emphasis on supply chain management among manufacturing companies has made the suppliers’ role in value-addition processes strategically significant [8]. The problem of assessing sustainable development applies to almost all areas. Supplier selection is a combined multi-dimensional problem that includes considering both qualitative and quantitative factors [9]. Due to globalization, sustainable supply chains are becoming more and more important. The fast globalization of doing business affects business competition, changing the model from “company versus company” to “supply chain versus supply chain” [11]. Therefore, choosing a good combination of suppliers to work with is critical to the success of conducting business [1]. Over the years, the importance of selecting suppliers has been appreciated and emphasized. Adding sustainability aspects to the supplier selection process highlights existing trends in environmental, economic, and social issues related to management and business processes. Moreover, the advancement of sustainable development allows the integration of environmental, economic, and social thinking with conventional supplier selection [12].
From a systematic point of view, the study of the problem of sustainable supplier selection can be divided into two parts, including criteria and methods [13]. The analysis of the literature provides a set of various methods exploiting different aspects and using single or mixed approaches, as well as examples of selection criteria [11,12,14]. Most of the studies on sustainable supplier selection use MCDM or fuzzy MCDM techniques with complex calculations [1]. A wide range of methods was applied to solve the problem of sustainable supplier selection. The literature reviews [12] point out that the main single and combined approaches used to solve this problem are mathematics methods and artificial intelligence approaches, especially including analytic hierarchy process [10,15], linear programming [10], multi-objective programming [16,17], goal programming [6], data envelopment analysis [13], heuristics [18], statistical [19], cluster analysis [7], multiple regression [20], discriminant analysis [21], neural networks [22], software agent [20], case-based reasoning [23], expert system [21], and fuzzy set theory [14] as well as combinations of selected pairs.
As it is a multi-dimensional concept, the selection of sustainable suppliers is not based on a single criterion but on a set of criteria, which are mostly focused on economic, social, and environmental issues. In general, most companies need to focus on their supply chains to enhance sustainability, meet customer demands, and comply with environmental legislation. In order to achieve these goals, companies must focus on criteria that include carbon footprint and toxic emissions, energy use and efficiency, waste generation, and worker health and safety [24]. Therefore, to analyze interrelationships among sustainability criteria, it is necessary to identify the most important ones for a given decision problem and then evaluate suppliers according to these criteria. Since the knowledge about criteria is scattered, a hybrid set of information aggregation methods is required to provide practical evaluation and to link this information to the proposed knowledge base. The literature analysis provides many multi-criteria methods to support a balanced selection of suppliers and multiple cuttings of criteria sets, often suited to a given area (e.g., food, industry, and others). There are many comparable approaches; Table 1 shows a selection of them. However, little attention has been paid to building a complex solution that allows gathering the selection criteria for sustainable suppliers, and there is almost no systemic, structured, knowledge-based approach that could be used to evaluate the sustainability of suppliers.

2.2. Information Extraction

The information extraction (IE) process is based on the automatic extraction of certain types of information from natural language text. IE is the process of extracting information from unstructured text sources to enable entities to be searched, classified, and stored in a knowledge base [34]. The general aim is to parse natural language text and look for instances of a certain class of objects or events and the instances of relationships between them. Another definition describes information extraction as a form of natural language processing in which certain types of information must be recognized and extracted from a text. Extracting information uses various algorithms and methods for finding information [35]. IE deals with collections of texts in order to transform them into information that can be easily understood and analyzed [36]. Semantically enhanced information extraction (also known as semantic annotation) links these units to their semantic descriptions and connections from the knowledge graph. Much information is available on the Internet these days, and the amount of it is constantly growing, which results in information overload. However, the real problem is not the sheer amount of information but the inability to filter it properly [34,37]. IE helps in the automatic detection of new, previously unknown information by automatically extracting information from various unstructured resources [38]. Therefore, the key element is linking the extracted information together to formulate new facts or new knowledge. In other words, in IE, the goal is to discover previously unknown information. Figure 1 displays an illustrative example of how information extraction works in practice.

Natural Language Processing (NLP)

NLP aims to analyze, identify, and solve problems related to the automatic generation and understanding of human language. NLP aims to process, decode, and understand unstructured information [39]. NLP allows for the following:
  • Sorting the data to remove the rubbish from the interesting parts;
  • Extracting the relevant pieces of information;
  • Linking the extracted information to other sources of information;
  • Aggregating the information according to potential new categories;
  • Querying the (aggregated) information;
  • Visualizing the results of the query.
It is composed of several tasks:
  • Text pre-processing—the text is prepared for processing using computational linguistics tools such as tokenization, sentence splitting, morphological analysis, etc.;
  • Finding and classifying concepts—the various types of concepts are detected and classified;
  • Connecting concepts—this task aims to identify the relationship between the extracted concepts;
  • Unification—this task presents the extracted data in a standard form;
  • Removing information noise—this task eliminates duplicate data;
  • Enriching the knowledge base—the extracted knowledge is processed in the knowledge base for further use.
Overall, the combination of NLP and information extraction extracts new knowledge from the raw data. Finally, unknown information is obtained by automatically extracting information from various unstructured resources.
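The pre-processing tasks listed above can be illustrated with a minimal, standard-library-only sketch; the regex-based sentence splitter and tokenizer below are simplified assumptions for illustration, not the tools used in the paper:

```python
import re

def split_sentences(text):
    # Naive sentence splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', text) if s.strip()]

def tokenize(sentence):
    # Naive tokenizer: words and punctuation marks become separate tokens.
    return re.findall(r"\w+|[^\w\s]", sentence)

text = "Quality is a key criterion. Cost matters, too!"
sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]
```

Real pipelines replace each naive step with a dedicated processing resource, but the shape of the data flow (text → sentences → tokens → annotations) is the same.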

2.3. Ontology and Ontology Population

2.3.1. Ontology

Recently, the terms ontology and Semantic Web have become quite popular and represent top research areas in computer science. An ontology is a standard recommended by the World Wide Web Consortium for representing knowledge in the Semantic Web, and it has turned into a fundamental and critical component for developing applications in different real-world scenarios [40]. Ontologies have become an important tool in domain modeling over the years and have been used successfully in several fields. In the artificial intelligence field [41,42,43,44], ontologies can also be used to build knowledge bases that are used in various systems, which exploit the obtained information to perform different tasks [41]. As a result, they help in carrying out real-world representations, establishing axioms, and obtaining conclusions from them [41,45,46].
Ontologies are defined as a set of concepts and the relations between them [47]. Concepts can be divided into classes, subclasses, attributes, relationships, and instances. From a technical point of view, ontologies are a formal source of domain-specific knowledge, which has proven efficient for search result diversification [48]. In fact, they make it possible to express the semantics of a domain in a language that computers can understand, allowing automatic processing of the meaning of the information provided [49]. Ontologies provide a controlled vocabulary of concepts whose semantics are explicitly defined and machine understandable [47]. Ontologies also offer a common understanding of the topics of communication between systems and users and enable the processing of web-based knowledge as well as its sharing and reuse among applications [48]. The most popular definition of ontology was proposed by Gruber, who stated that an ontology can be defined as an explicit, formal specification of a shared conceptualization [47]. It contains the following components: concepts, individuals, relations, and attributes. It can be formulated as follows:
O = {I; C; R; A}
where I is the set of individuals, C refers to the set of concepts, R represents the set of relations and interactions between domain individuals, with R ⊆ C1 × C2 × … × Cn, and A is the set of axioms.
The concepts (classes) correspond to the relevant abstractions of a segment of reality (the domain of the problem). The relations (properties) link the individuals or concepts between them. The individual is defined as a resource that has been placed into the class, but individuals are not classes themselves. The axioms are statements that are asserted to be true in the domain being described [50].
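The four-tuple O = {I; C; R; A} can be mirrored directly in a small data structure; this is a hypothetical sketch for illustration (the class name `Ontology` and the triple encoding of relations are assumptions, not the OWL-API representation):

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    individuals: set = field(default_factory=set)  # I
    concepts: set = field(default_factory=set)     # C
    relations: set = field(default_factory=set)    # R, as (name, subject, object) triples
    axioms: set = field(default_factory=set)       # A

o = Ontology()
o.concepts |= {"Criterion", "Quality"}
o.individuals.add("ProductQuality")
o.relations.add(("instanceOf", "ProductQuality", "Quality"))
o.axioms.add(("subClassOf", "Quality", "Criterion"))
```

Each component maps one-to-one onto the formula: classes populate C, the placed resource populates I, the property assertion populates R, and the subclass statement is an axiom in A.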
The OWL 2 standard is currently used as a formal language for representing ontologies. The inference process takes place using various ontological reasoners. The main functions of reasoners are ontology consistency checking, class taxonomy building, and ontology querying. Ontology reasoning aims to ensure that the ontology is consistent with its logical semantics. Reasoning is also required to infer new knowledge from the ontology. The reasoners enable validation of the ontology, and, in the end, it is possible to obtain inferred knowledge in response to the user’s description logic (DL) queries.
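One of the simplest inferences a reasoner performs when building the class taxonomy is computing the transitive closure of the subclass hierarchy. The toy function below illustrates only that single mechanism (the transitivity of is-a), not a full DL reasoner:

```python
def infer_superclasses(cls, subclass_of):
    """Return all direct and inferred superclasses of cls.
    subclass_of maps each class to its set of direct superclasses."""
    result, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), set()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result

# Hypothetical three-level fragment of a criteria taxonomy.
taxonomy = {
    "ProductQuality": {"Quality"},
    "Quality": {"Criterion"},
    "Criterion": set(),
}
```

Given that ProductQuality is-a Quality and Quality is-a Criterion, the closure yields the inferred fact that ProductQuality is also a Criterion.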

2.3.2. Ontology Population

Ontology population is a process for inserting concept and relation instances into an existing ontology [51,52]. The process involves several tasks: extracting relation instances, identifying values from any information source, and assigning such values to instances [51,52]. There are many approaches in the literature related to ontology learning and ontology population. Ontology learning has benefited from the adoption of established techniques such as machine learning, data mining, natural language processing, information retrieval, and knowledge representation [53]. Based on the classification proposed by Alexander Maedche and Steffen Staab [54], ontology learning approaches are distinguished by the type of input data used for learning. Thus, the common classification contains ontology learning from text, dictionaries, knowledge bases, semi-structured schemata, and relational schemata [53]. Each of them requires multiple research efforts to achieve a common domain conceptualization [55,56].
Automated ontology population is intended to identify concept and relation instances using a computational tool [52,55,57,58]. Ontology learning techniques apply more complex NLP techniques to the text. Rather than simply extracting terms, they analyze the grammatical structure of sentences to determine how the terms are used. They then deduce possible IS-A relationships between terms, which are used to build classification hierarchies.
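A classic way to deduce IS-A candidates from text is lexical pattern matching in the style of Hearst patterns. The regex-only sketch below handles a single pattern ("X such as Y") on single-word terms; it is a deliberately simplified stand-in for the grammatical analysis described above:

```python
import re

# "X such as Y" suggests that Y IS-A X (single-word terms only, for simplicity).
PATTERN = re.compile(r"(\w+)\s+such as\s+(\w+)", re.IGNORECASE)

def extract_isa(text):
    """Return (hyponym, hypernym) IS-A candidate pairs found in text."""
    return [(hypo, hyper) for hyper, hypo in PATTERN.findall(text)]

sentence = "Suppliers are rated on criteria such as quality."
```

Running `extract_isa(sentence)` proposes that "quality" IS-A "criteria", a candidate edge for the classification hierarchy.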

3. Materials and Methods

This section describes a new ontology-based approach for extracting knowledge in the form of rules from texts. This approach is dedicated to acquiring complex relationships from texts and coding them in the form of rules. The proposed approach draws on different works in the areas of knowledge acquisition, rule-based reasoning, and ontology population. A semi-automated, supervised solution is proposed for extending ontology classes in terms of learning concept attributes, data types, and value ranges. This approach requires two inputs: existing knowledge and free texts. The existing knowledge is an OWL ontology. The free texts represent the domain knowledge in unstructured natural language, in this case, English. The selected domain covers sustainable supplier selection criteria.

3.1. Data Preparation and Search Strategy

In this study, we used the following tools: (1) the Scopus database for managing bibliographic references [59] and (2) VOSviewer for bibliometric analysis and developing a keywords map [60]. The search strategy encompasses using the Scopus database to retrieve documents related to sustainable supplier selection criteria. In order to support the document filtration process, a formal PRISMA approach [61] was used (Figure 2). However, not all steps from the PRISMA flow diagram were used because the main goal was to search for criteria, and filtering only on the abstract and keywords was insufficient. The list of papers contains 1652 elements. The publication years of the selected documents span 2003 to 2021. The analysis started in June 2021; hence, not all publications from 2021 are included. Conference reviews, errata, and reviews were excluded from the final set of documents. The query was as follows: TITLE-ABS-KEY(“sustainable supplier selection”). The extracted documents were exported to an Excel spreadsheet as a *.csv file. The results can be revised by the author’s name, affiliation, document type, source title, or subject area.
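The exclusion of document types and the 2003–2021 window can be sketched as a filter over the exported *.csv file; the column names `Title`, `Year`, and `Document Type` below follow the usual Scopus export layout and are assumptions that may differ between exports:

```python
import csv
import io

EXCLUDED_TYPES = {"Conference Review", "Erratum", "Review"}

def filter_records(csv_text):
    """Keep only rows whose type is allowed and whose year is in 2003-2021."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader
            if row["Document Type"] not in EXCLUDED_TYPES
            and 2003 <= int(row["Year"]) <= 2021]

sample = (
    "Title,Year,Document Type\n"
    "Paper A,2019,Article\n"
    "Paper B,2020,Erratum\n"
    "Paper C,2002,Article\n"
)
```

Here "Paper B" is dropped by document type and "Paper C" by publication year, mirroring the filtration described above.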
Then the set of papers was manually filtered. The process itself was highly time-consuming but allowed for the identification of an initial set of criteria and sub-criteria, which contained 8261 items. The data set prepared in this way was then subjected to further processing using a dedicated tool for plotting knowledge domain maps.

3.2. Plotting Knowledge Domain Maps

The developed data set was used to prepare and plot knowledge domain maps. Previously collected data were processed using the VOSviewer software [62]. VOSviewer enables the user to generate networks from given bibliometric data and allows the user to group criteria and sub-criteria and display the results. The size of a given item reflects the frequency of occurrence of a given criterion (Figure 3).
For the analysis, it was also necessary to clean up the data, so a VOSviewer thesaurus file was created to combine similar criteria names. Due to the relatively large number of criteria, it is not possible to present all changes in this study. Selected merging rules are shown below as examples:
  • Merging “Product Quality” and “Quality of Product”;
  • Merging “Deliver & Service” and “Delivery and Service”;
  • Merging “Technology Capabilities” and “Technology Capability”;
  • Merging “Inventory costs” and “Inventory cost”;
  • Merging “Service Quality” and “Quality of Service”;
  • Merging abbreviations “EMS” and “Environmental management System”;
  • Merging synonyms “Green packaging” and “Green packaging ability”.
The thesaurus file contains 68 extra items. Ultimately, the set included 8261 criteria as input from 1652 papers. The total number of main clusters is 126. Each cluster contains a set of sub-criteria. The keyword occurrence map was also created. The most common keywords are green, cost, and quality. Table 2 shows the 10 most popular keywords.
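A VOSviewer thesaurus file is a two-column, tab-separated mapping (a `label` column and a `replace by` column under a header row). Applying such a mapping to raw criterion names, as done for the merges above, can be sketched as follows (the specific entries are illustrative):

```python
def load_thesaurus(lines):
    """Parse 'label<TAB>replace by' lines into a dict (header row skipped)."""
    mapping = {}
    for line in lines[1:]:
        label, replacement = line.rstrip("\n").split("\t")
        mapping[label] = replacement
    return mapping

def normalize(criteria, mapping):
    # Replace each criterion by its canonical name, if one is defined.
    return [mapping.get(c, c) for c in criteria]

thesaurus_lines = [
    "label\treplace by",
    "Quality of Product\tProduct Quality",
    "Inventory costs\tInventory cost",
]
criteria = ["Quality of Product", "Inventory costs", "Green packaging"]
```

Names without a thesaurus entry (here "Green packaging") pass through unchanged, so the mapping only merges the variants it lists.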

3.3. Ontology Representation

The plotted knowledge domain maps provide a pre-elaborated set of criteria and sub-criteria ready to be implemented in an OWL ontology. The ontology contains all the identified elements, which form the backbone for taxonomy/class hierarchy building. This process requires the knowledge engineer’s participation. Therefore, the input domain ontology was developed from scratch based on the data set provided. The Protégé OWL-API [63] was selected to work with the ontology and to manipulate its different constituents (classes, object properties, datatype properties, and individuals). It aims to structure knowledge, organize it, and, above all, reason about it. The main stages of the development process are inspired by the ontology methodology provided by Noy and McGuinness, as shown in Figure 4.
The first step aims to define the domain and scope of the ontology—in this case, the domain of sustainable suppliers was considered. Since no similar solutions have been found, the second step will be to create an ontology from scratch. Steps 3 through 7 relate directly to ontology construction. In the third step, it is necessary to indicate the most important terms in the ontology. These terms are then detailed. This is the basis for building the class hierarchy in step 4. The class hierarchy represents an “is-a” relation: class X is a subclass of Y if every instance of X is also an instance of Y. It is worth noticing that the whole set contains 126 main classes and 8261 sub-classes. Thus, there are 8378 classes in total. The final set of clusters is attached in supplementary materials (the set of criteria: Sustainable_Supplier_Criteria.xls). Table 3 displays a piece of a class hierarchy.
Therefore, in the 5th step, the constitution of the relations is needed. In Protégé, the slots are also named object properties. Object properties describe the relations between classes or individuals. Another group is datatype property, which aims to describe the relations between individuals and values. Table 4 shows selected object properties and datatype properties with assigned domains and ranges.
In the 6th step, the definitions of facets of the slots take place. The value types, cardinality, range of slots, and other features are determined. The 7th step aims to create instances of the classes in the hierarchy. Defining an individual instance of a class requires (1) selecting the class, (2) creating an individual instance of that class, and (3) filling the slot values [64].
The resulting knowledge base contains 8261 ontological entities such as classes, relations, datatype properties, and individuals. This ontology contains a considerable amount of information representing the sustainable supplier criteria. Moreover, this ontology can be fed with new data from external sources. The ontology is available at: https://webprotege.stanford.edu/#projects/d819c911-a0dc-4208-86a5-3be0df042caa/edit/Classes (accessed on 1 April 2022).

3.4. Ontology Population—Information Extraction and Discovering Specific Concepts from the Text and Semantic Annotation

Ontologies can provide an alternative for storing knowledge at the concept and instance levels. The process of ontology enrichment by adding the names of concepts, their relationships, and instances to populate the ontology is usually performed by domain experts. However, this process is time-consuming and requires relevant domain expertise as well as manual skills. Therefore, ontology population is needed to obtain useful information from texts; it includes enrichment with class and relationship instances using an existing ontology as input [52].
The elaborated approach aims to provide an ontology-based knowledge extraction system for texts that helps automatically acquire and formalize this knowledge, limiting the need for expert intervention as much as possible. The proposed approach is based on natural language processing (NLP) and information extraction (IE) techniques. In this work, information extraction techniques are applied in the form of named-entity recognition and co-reference resolution. The process of discovering specific concepts from text requires a dedicated tool. The approach was developed using the GATE tool and a pipeline-shaped architecture, i.e., each process must finish before the next one starts. GATE is an architecture, framework, and development environment for language engineering (LE). GATE is a component-based model application that allows for easy coupling and decoupling of processing resources. GATE includes a core library and a set of reusable LE modules. The framework implements the architecture and provides amenities for processing and visualizing sources, including representation, import, and export of data. The provided reusable modules can perform basic language processing tasks such as POS and semantic tagging [65]. The process is shown in Figure 5.
The input data are provided by the user in the form of unstructured text or web resources. Therefore, a corpus of documents is created. The corpus consists of a set of various documents related to the sustainable supplier. Apart from scientific papers, it is also possible to use as input various reports and statistics written by specialists. The usage of GATE software enables pipeline construction using various processing resources. Therefore, various steps take place, especially containing:
  • Document Reset to remove all previous annotations from the document;
  • Tokenizer to split the English text into tokens;
  • Gazetteer to find list items in the text and annotate them as “lookup”;
  • Sentence splitter to split the text into sentences;
  • POS Tagger to split the text into parts of speech;
  • Transducer NE to identify individuals—e.g., person, location;
  • OrthoMatcher to add reference identity relationships between previously annotated entities;
  • OntoRoot Gazetteer to produce annotations over textual documents, where an ontology is given as input;
  • JAPE Transducer to use JAPE rules to transform annotations to property assertions [65].
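The pipeline-shaped architecture, in which each processing resource consumes the annotations produced by the previous one, can be mimicked abstractly. The following is a toy Python analogue of a three-stage GATE pipeline (reset, tokenizer, gazetteer), not GATE itself, and the gazetteer list is a made-up example:

```python
LOOKUP = {"quality", "cost"}  # toy gazetteer list

def document_reset(doc):
    # Remove all previous annotations from the document.
    doc["annotations"] = []
    return doc

def tokenizer(doc):
    # Split the text into tokens (whitespace only, for simplicity).
    doc["tokens"] = doc["text"].split()
    return doc

def gazetteer(doc):
    # Annotate tokens found in the gazetteer list as "Lookup".
    doc["annotations"] += [("Lookup", t.lower().strip(".,"))
                           for t in doc["tokens"]
                           if t.lower().strip(".,") in LOOKUP]
    return doc

PIPELINE = [document_reset, tokenizer, gazetteer]

def run(text):
    doc = {"text": text}
    for stage in PIPELINE:  # each stage finishes before the next starts
        doc = stage(doc)
    return doc
```

The fixed stage order encodes the pipeline constraint: the gazetteer can only run once the tokenizer has populated `doc["tokens"]`.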
This semantic annotation using the population of the ontology and the definition of classes would be impossible without ANNIE (A Nearly-New Information Extraction system). ANNIE is a component of GATE. It is a complete chain dedicated to information extraction. ANNIE is based on the Java Annotation Patterns Engine (JAPE) and includes various annotation modules that are useful for performing various extraction tasks. In selected cases, it is possible to use additional processing resources. Figure 6 shows a simplified procedure related to information extraction and feeding the ontology with new knowledge.

3.5. Rule-Based Reasoning

The GATE resource OntoRoot Gazetteer can create annotations over textual documents. It requires an ontology as input, in combination with other generic GATE resources. Another processing resource, the JAPE transducer, applies JAPE rules to transform annotations into property assertions. It allows for defining rules and recognizing regular expressions in document annotations. A single JAPE rule is composed of two parts: the LHS and the RHS. The LHS contains the patterns to match, whereas the RHS details the annotations to be created. JAPE rules combine to form a phase, and phases combine to create a grammar. The rules are designed to tag classes, instances, and attribute values. The priority of rules is based on pattern length, rule status, and rule order. JAPE rules are used to locate terms in the text that potentially relate to markers; these terms are later used to create new annotations using the JAPE formalism and to identify the body and the head of the produced rules.
Table 5 presents the implemented code of a sample JAPE rule titled “Quality1”. In this case, the “Token” annotation and its “string” feature are used to match the word “quality” in the text. The pattern is enclosed in parentheses, followed by a colon and a label. The sign “->” separates the LHS from the RHS and begins the RHS part. The RHS manipulates the annotation pattern matched on the LHS, and a label on the RHS must match a label on the LHS. When the LHS part matches, the RHS part is run [65]. When a rule matches a text sequence, the entire sequence is bound by the rule to the label. The transducer is instructed that the span bound to the temporary label (quality) will be annotated as “Quality”, and the rule that achieves this is “Quality1”. Naming a rule is important for debugging purposes: when the rule fires, its name becomes part of the annotation properties visible in the GATE GUI. In this example, a matched criterion will be annotated with {rule = Quality1}.
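The actual code appears in Table 5; based on the description above, a minimal sketch of such a rule might look as follows. The phase name, input list, and control option are assumptions, not the exact values used in the experiments:

```jape
// A minimal JAPE phase containing a rule of the kind described above.
// Phase name, Input list, and control option are assumptions;
// Table 5 contains the code actually used in the experiments.
Phase: Criteria
Input: Token
Options: control = appelt

Rule: Quality1
(
  // LHS: match a Token whose string feature equals "quality"
  {Token.string == "quality"}
):quality
-->
// RHS: annotate the matched span as "Quality", recording the
// firing rule so that it is visible in the GATE GUI for debugging
:quality.Quality = {rule = "Quality1"}
```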
The set of syntactic rules was created manually. The categories of the developed rules refer to the previously elaborated set of criteria implemented in the OWL ontology. The elaborated rules aim to extract attribute values from any corpus of documents and assign them to a given class. These rules have been implemented in the JAPE language. GATE offers OWLIM as an ontology editor that allows results to be added directly to the ontology. In addition, it is possible to save all extracted information in an XML file. Subsequently, an ontology can be automatically created with all information about classes, attributes, and instances. The XML file may also be used by the Protégé environment as an input file and may be processed and saved in OWL/XML format.

4. Case Study

4.1. Domain Knowledge Acquisition and Cluster Construction

Data were collected from the Scopus database [59]. The data pre-processing and selection process was described in Section 3.1. Manual screening of the selected works allows for dividing the data into criteria and sub-criteria. This process enables the initial classification of criteria. The main set of criteria represents keywords specific to a given class. For example, if the criterion “Quality” is analyzed, then the sub-criteria containing this word in their description will belong to that class. Moreover, in many cases, a sub-criterion may belong to several classes (e.g., the quality of delivery will belong to both the quality and delivery classes).
Subsequently, a bibliometric analysis of the selected articles takes place in order to obtain and condense a large amount of bibliographic information. The assumptions of this process are described in Section 3.2. The output is a plotted knowledge map containing the criteria of a sustainable supplier. Finally, this process allowed the grouping of a set of clusters with assigned criteria. The input file was modified on the basis of a pre-elaborated set of criteria and sub-criteria. As the main purpose is to extract and classify criteria and sub-criteria, other information, such as author, publication date, and title, is omitted. Moreover, the analysis of the keywords alone is insufficient, as it does not contain information about the criteria that are crucial for the construction of the knowledge map; its further elaboration helps in taxonomy construction. Therefore, VOSviewer is fed data about the items in the network and the links between them. This process allows for building a map and obtaining a classification of clusters of related items. The map was computed and normalized using the association strength method, which normalizes the strength of the links between items.
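In this method (following van Eck and Waltman [62]), the normalized strength of the link between two items is, up to a constant factor, the ratio of the observed number of co-occurrences to the number expected if the items occurred independently:

```latex
s_{ij} \;\propto\; \frac{c_{ij}}{w_i \, w_j}
```

where \(c_{ij}\) is the number of co-occurrences of items \(i\) and \(j\), and \(w_i\) and \(w_j\) are their total numbers of occurrences.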
Figure 7 depicts the items, each indicated by a label and, by default, also by a circle. The size of a label and its circle reflects the importance of the item. Overall, a set of 126 clusters was defined. The items grouped in a cluster represent the criteria that specify sustainable supplier selection. Items containing sub-items are arranged in the same cluster and are related to the main criterion. The colors represent groups of related items, and the distance between two items indicates how strongly they are related. The size of a circle reflects the weight of the item, i.e., the total number of co-occurrences of the item.
Figure 8 presents the density map, where each point has a color (ranging from blue through green to yellow) that depends on the density of keywords at that point. The color of a point is closer to yellow the more items there are in its neighborhood and the higher the weights of these items. Conversely, the color of a point is closer to blue when there are fewer items in its neighborhood and the weights of the neighboring items are smaller.
As a result, the taxonomy elaborated on the basis of the cluster construction can be implemented in the OWL language. The final set of criteria represents the identified items and covers 8261 elements.
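For illustration, the multiple-class membership mentioned above (a sub-criterion such as quality of delivery belonging to both the quality and delivery classes) can be expressed in RDF/XML roughly as follows. The class IRIs are hypothetical; the actual names are those defined in the elaborated criteria set:

```xml
<!-- Fragment only; namespace declarations omitted.
     Class IRIs are hypothetical illustrations. -->
<owl:Class rdf:about="#Quality_of_delivery">
  <rdfs:subClassOf rdf:resource="#Quality"/>
  <rdfs:subClassOf rdf:resource="#Delivery"/>
</owl:Class>
```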

4.2. Ontology Construction and Validation

The knowledge acquisition process is described in Section 3.3. The considered domain refers to sustainable supplier criteria. An in-depth analysis of the selected articles combined with bibliometric analysis supports the process of acquiring knowledge and plotting a map of the knowledge domain. This is followed by the specification and conceptualization of knowledge, formalization, integration, and implementation in the OWL language. In this way, the knowledge derived from unstructured data was represented in a structured form. The ontology construction process requires the specification of classes (concepts), individuals, and relations, as well as restrictions, rules, and axioms. Exemplary classes, object properties, and datatype properties were presented in Table 3 and Table 4. Figure 9 shows a small piece of the class hierarchy; each class contains sub-classes. The exemplary class Technology is shown in Figure 10 with its assigned sub-classes. The ontology also provides information about suppliers’ profiles (Figure 11 and Figure 12).
The implementation uses the Protégé-OWL API [63] to work with OWL ontologies and the DL query mechanism to manipulate the different constituents of the ontology. The formal description was performed using the description logic (DL) standard, which allows the developed knowledge representation to be machine-processed, shared, reused, and, finally, populated with new knowledge. The evaluation of the elaborated ontology was performed using competency questions implemented with the description logic query mechanism. This process aims to check the coherence and correctness of the constructed ontology using reasoning mechanisms; for a consistent ontology, the output is a result set.
The first example shows how to ask about sustainable supplier criteria in terms of flexibility, quality, responsiveness, and delivery. A rule-based query is created to find results that meet the defined set of criteria. Query 1 is executed by the code shown in Table 6.
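The code actually used is shown in Table 6; conceptually, such a DL query could be phrased in Manchester syntax along the following lines, where the class and property names are hypothetical placeholders for those defined in the ontology:

```
Criterion and (hasCategory some (Flexibility or Quality or Responsiveness or Delivery))
```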
The second exemplary query aims to demonstrate how to find sustainable supplier criteria in the context of quality, reputation, and delivery. The sub-criteria were predefined, including quality of product, quality ISO 9000, delivery and service, delivery on time, and reputation of the supplier. The query was executed using a reasoner. The code is shown in Table 7.
These queries represent only a fraction of the possibilities of using the knowledge base for extracting information. The examples are attached in the supplementary materials (see: JAPE examples: JAPE examples.zip). Given the huge number of criteria included in the knowledge base, many different combinations of queries can be built. As a result, the user is also able to indicate the profile of the preferred supplier and to identify the source of the criteria. Combining the knowledge base with additional modules or knowledge bases containing, for example, information on indicators offers the prospect of a comprehensive source of knowledge in the field of sustainable supplies and suppliers.

4.3. Semantic Annotation and Ontology Population

The corpus for the tests consists of a set of sustainable supplier reports, papers, and other data gathered from web resources. The use of ANNIE, together with selected processing resources (PRs) dedicated to information extraction (described in detail in Section 3.4), enabled various extraction tasks to be performed. Running these PRs starts the processing of the corpus of documents, which may contain various text documents, such as scientific articles, report sheets, plain text, etc., as well as links to websites. Finally, a set of basic annotations is provided. In order to extend the built-in set of annotations, custom annotations with specific constraints and rules were created. The created annotations depend on what a user wants to search for and how it should be classified. Figure 13 displays exemplary annotations that aim to find the criteria related to technology, transport, and strategic features. The criteria found in the document body are highlighted (depending on the color assigned to them). It is also possible to add additional features.
The implementation of the presented approach using semantic annotation and ontology population requires the use of tools included in this environment and, thus, the installation of new plugins for working with ontologies. The OWLIM Ontology plugin and the GATE Ontology Editor were used to work with the ontology (Figure 14). The ontology was created in the Protégé environment [63]; however, to work with GATE and enable semantic annotation and ontology population, the available GATE plugins were used in this part of the experiments.
Within the ontology population, it is possible to create specific rules designed to find and classify selected concepts. Hence, the next step is to use the JAPE Transducer, which defines the rules and recognizes regular expressions over the annotations of documents. Figure 15 displays the partial results of these phases. The working example of the rule named Quality1 demonstrates the applicability of JAPE rules; many such rules were created to carry out the tests. Of course, the possibilities for creating rules are extensive, and it is possible to expand the rules with additional elements. Figure 16 displays the partial results of the applied rule Quality1, and the execution of the JAPE rule for extracting attribute values is shown in Figure 17.
The presented approach offers a semi-automatic, supervised ontology population. Using semantic annotation, it is possible to annotate a relevant phrase, for example, “Quality of supply”, as a criterion related to sustainable suppliers and link it to an ontology instance. As a consequence, new knowledge is added to the ontology. The application of the reasoning mechanism allows for classifying the selected phrase as a criterion of quality. It can therefore be inferred from the ontology that “Quality of supply” is a criterion associated with a given supplier profile. For the implemented ontology, the class feature can be used on the LHS of a JAPE rule. When matching the class value, the ontology is checked for subsumption: the constraint {Lookup.class == Quality} matches a Lookup annotation whose class feature value is either Quality or any of its subclasses (Figure 18).
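A minimal sketch of such an ontology-aware rule is given below. The rule name and the output annotation type are hypothetical, and the Lookup annotations carrying a class feature are assumed to come from the OntoRoot Gazetteer:

```jape
Rule: QualityByClass
(
  // Matches a Lookup annotation whose class feature is Quality or,
  // via subsumption checking against the ontology, any subclass of it
  {Lookup.class == Quality}
):crit
-->
// Hypothetical output annotation type, recording the firing rule
:crit.QualityCriterion = {rule = "QualityByClass"}
```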
Ontologies are useful for encoding the information found. Applying the created rules to a given corpus of documents makes it possible to extract knowledge using the rules and to assign this knowledge to classes and instances in the ontology (Figure 16 and Figure 19). The richer NE tagging and the application of JAPE rules aim to disambiguate the instances. The modified ontology is then loaded using the Protégé software [63]. In this way, the user retains control over the development of the ontology, its population, and the updating of data. To further develop the ontology, rules can be created automatically from a single pattern, with one rule per object property to be populated.

4.4. Validation and Evaluation

In order to evaluate and validate the obtained ontology, reasoning mechanisms were applied: HermiT 1.4.3.456 and Pellet. Neither of them detected any inconsistency in the loaded ontology (Figure 20).
Other ontology assessments and validations require the use of a reference ontology, so these measures could not be applied here. For example, an ontology can be evaluated using metric-based evaluation, including relationship richness, attribute richness, and class richness; however, to evaluate quality using these metrics, a similar baseline ontology is needed. Apart from that, it is possible to evaluate an ontology using the dedicated balanced distance metric (BDM), but a reference ontology, a test set, and a training set are then also necessary.

5. Conclusions

This paper proposed an ontology-based approach for knowledge acquisition from text for the sustainable supplier selection domain. The presented solution showed the process of acquiring complex relationships from texts and encoding them in the form of rules. As a result, the enrichment of the existing domain ontology by adding new knowledge, reaching higher relational expressiveness, reasoning, and producing new facts has been successfully implemented and achieved.
This process required the use of various techniques and tools: VOSviewer for plotting knowledge domain maps; the Protégé environment for implementing and managing the OWL ontology; GATE software with its NLP tools, text-matching techniques, and ontology plugins for deducing different atoms; and JAPE rules for capturing deductive knowledge in the form of new rules. The evaluation process was performed using the reasoning mechanisms HermiT 1.4.3.456 and Pellet.
The essential contribution of the work covers the following:
  • Developing an ontology-based framework to deal with distributed knowledge representation;
  • Developing a domain ontology that stores various information about sustainable suppliers, which supports various knowledge management aspects, associating dynamic data delivered from external sources with predefined information gathered in the ontology;
  • Constructing a knowledge base with rules and queries using JAPE;
  • Checking the consistency and testing the use of the ontology in different scenarios in the domain of sustainable supplier selection and applying rule-based reasoning.
The presented ontology provides independent knowledge about the criteria for sustainable supplier selection, which is supported by an analysis of the scientific literature. The new knowledge can be incorporated into any database, knowledge base, or information system. This form of storing knowledge offers machine-readable access and semantic data handling. Additionally, the proposed approach made it possible to:
  • Increase the body of knowledge on the ontology for the sustainable supplier domain by providing a systematic keyword map of the subject and grasping the main criteria in the research field;
  • Handle the knowledge domain;
  • Reduce the time needed to search for relevant information;
  • Improve the accuracy of search results to suit the user’s specific needs;
  • Provide quick updates with new knowledge.
However, there are still some limitations that need to be addressed in future research. Further refinements to the presented approach include increasing the level of automation of phases that currently require manual work. In particular, a way to automate JAPE rule definitions and prepare patterns is currently under development. The use of the reasoning abilities provided by the ontology to generate new JAPE rules, starting with patterns of manually specified JAPE rules, is also a promising direction and an extension of this work.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/electronics11234012/s1; the set of criteria: Sustainable_Supplier_Criteria.xls; JAPE examples: JAPE examples.zip.

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Hoseini, S.A.; Fallahpour, A.; Wong, K.Y.; Mahdiyar, A.; Saberi, M.; Durdyev, S. Sustainable Supplier Selection in Construction Industry through Hybrid Fuzzy-Based Approaches. Sustainability 2021, 13, 1413. [Google Scholar] [CrossRef]
  2. Govindan, K.; Khodaverdi, R.; Jafarian, A. A Fuzzy Multi Criteria Approach for Measuring Sustainability Performance of a Supplier Based on Triple Bottom Line Approach. J. Clean. Prod. 2013, 47, 345–354. [Google Scholar] [CrossRef]
  3. Saeed, M.; Kersten, W. Drivers of Sustainable Supply Chain Management: Identification and Classification. Sustainability 2019, 11, 1137. [Google Scholar] [CrossRef] [Green Version]
  4. Amindoust, A. A Resilient-Sustainable Based Supplier Selection Model Using a Hybrid Intelligent Method. Comput. Ind. Eng. 2018, 126, 122–135. [Google Scholar] [CrossRef]
  5. Chand, P.; Thakkar, J.J.; Ghosh, K.K. Analysis of Supply Chain Complexity Drivers for Indian Mining Equipment Manufacturing Companies Combining SAP-LAP and AHP. Resour. Policy 2018, 59, 389–410. [Google Scholar] [CrossRef]
  6. Singh, R.K.; Murty, H.R.; Gupta, S.K.; Dikshit, A.K. An Overview of Sustainability Assessment Methodologies. Ecol. Indic. 2012, 15, 281–299. [Google Scholar] [CrossRef]
  7. Lee, J.; Jung, K.; Kim, B.H.; Peng, Y.; Cho, H. Semantic Web-Based Supplier Discovery System for Building a Long-Term Supply Chain. Int. J. Comput. Integr. Manuf. 2015, 28, 155–169. [Google Scholar] [CrossRef]
  8. Kumar, C.V.S.; Routroy, S. Developing the Preferred Supplier Relationships—A Case Study. Int. J. Intell. Enterp. 2018, 5, 50. [Google Scholar] [CrossRef]
  9. Ware, N.R.; Singh, S.P.; Banwet, D.K. Supplier Selection Problem: A State-of-the-Art Review. Manag. Sci. Lett. 2012, 2, 1465–1490. [Google Scholar] [CrossRef]
  10. Shaw, K.; Shankar, R.; Yadav, S.S.; Thakur, L.S. Global Supplier Selection Considering Sustainability and Carbon Footprint Issue: AHP Multi-Objective Fuzzy Linear Programming Approach. Int. J. Oper. Res. 2013, 17, 215. [Google Scholar] [CrossRef]
  11. Awasthi, A.; Govindan, K.; Gold, S. Multi-Tier Sustainable Global Supplier Selection Using a Fuzzy AHP-VIKOR Based Approach. Int. J. Prod. Econ. 2018, 195, 106–117. [Google Scholar] [CrossRef] [Green Version]
  12. Konys, A. Methods Supporting Supplier Selection Processes–Knowledge-Based Approach. Procedia Comput. Sci. 2019, 159, 1629–1641. [Google Scholar] [CrossRef]
  13. Ramanathan, R. Supplier Selection Problem: Integrating DEA with the Approaches of Total Cost of Ownership and AHP. Supply Chain Manag. Int. J. 2007, 12, 258–261. [Google Scholar] [CrossRef]
  14. Arabsheybani, A.; Paydar, M.M.; Safaei, A.S. An Integrated Fuzzy MOORA Method and FMEA Technique for Sustainable Supplier Selection Considering Quantity Discounts and Supplier’s Risk. J. Clean. Prod. 2018, 190, 577–591. [Google Scholar] [CrossRef]
  15. Kahraman, C.; Topcu, Y.I. Operations Research Applications in Health Care Management; Springer International Publishing: New York, NY, USA; Heidelberg, Germany; Dordrecht, The Netherlands; London, UK, 2018; ISBN 978-3-319-65455-3. [Google Scholar]
  16. Weber, C.A.; Current, J.R.; Benton, W.C. Vendor Selection Criteria and Methods. Eur. J. Oper. Res. 1991, 50, 2–18. [Google Scholar] [CrossRef]
  17. Büyüközkan, G.; Feyzioğlu, O.; Havle, C.A. Analysis of Success Factors in Aviation 4.0 Using Integrated Intuitionistic Fuzzy MCDM Methods. In Intelligent and Fuzzy Techniques in Big Data Analytics and Decision Making; Kahraman, C., Cebi, S., Cevik Onar, S., Oztaysi, B., Tolga, A.C., Sari, I.U., Eds.; Springer International Publishing: Cham, Switzerland, 2020; Volume 1029, pp. 598–606. ISBN 978-3-030-23755-4. [Google Scholar]
  18. Burki, U.; Ersoy, P.; Dahlstrom, R. Achieving Triple Bottom Line Performance in Manufacturer-Customer Supply Chains: Evidence from an Emerging Economy. J. Clean. Prod. 2018, 197, 1307–1316. [Google Scholar] [CrossRef]
  19. Sarkis, J.; Dhavale, D.G. Supplier Selection for Sustainable Operations: A Triple-Bottom-Line Approach Using a Bayesian Framework. Int. J. Prod. Econ. 2015, 166, 177–191. [Google Scholar] [CrossRef]
  20. Yu, C.; Wong, T.N. An Agent-Based Negotiation Model for Supplier Selection of Multiple Products with Synergy Effect. Expert Syst. Appl. 2015, 42, 223–237. [Google Scholar] [CrossRef]
  21. Kumar, A.; Jain, V.; Kumar, S. A Comprehensive Environment Friendly Approach for Supplier Selection. Omega 2014, 42, 109–123. [Google Scholar] [CrossRef]
  22. Kuo, R.J.; Wang, Y.C.; Tien, F.C. Integration of Artificial Neural Network and MADA Methods for Green Supplier Selection. J. Clean. Prod. 2010, 18, 1161–1170. [Google Scholar] [CrossRef]
  23. Zhao, H.; Guo, S. Selecting Green Supplier of Thermal Power Equipment by Using a Hybrid MCDM Method for Sustainability. Sustainability 2014, 6, 217–235. [Google Scholar] [CrossRef] [Green Version]
  24. Wang, C.-N.; Nguyen, V.T.; Thai, H.T.N.; Tran, N.N.; Tran, T.L.A. Sustainable Supplier Selection Process in Edible Oil Production by a Hybrid Fuzzy Analytical Hierarchy Process and Green Data Envelopment Analysis for the SMEs Food Processing Industry. Mathematics 2018, 6, 302. [Google Scholar] [CrossRef] [Green Version]
  25. Opher, T.; Friedler, E.; Shapira, A. Comparative Life Cycle Sustainability Assessment of Urban Water Reuse at Various Centralization Scales. Int. J. Life Cycle Assess. 2019, 24, 1319–1332. [Google Scholar] [CrossRef]
  26. Shaaban, M.; Scheffran, J.; Böhner, J.; Elsobki, M.S. A Dynamic Sustainability Analysis of Energy Landscapes in Egypt: A Spatial Agent-Based Model Combined with Multi-Criteria Decision Analysis. J. Artif. Soc. Soc. Simul. 2019, 22. [Google Scholar] [CrossRef]
  27. Roinioti, A.; Koroneos, C. Integrated Life Cycle Sustainability Assessment of the Greek Interconnected Electricity System. Sustain. Energy Technol. Assess. 2019, 32, 29–46. [Google Scholar] [CrossRef]
  28. Bottero, M.; Oppio, A.; Bonardo, M.; Quaglia, G. Hybrid Evaluation Approaches for Urban Regeneration Processes of Landfills and Industrial Sites: The Case of the Kwun Tong Area in Hong Kong. Land Use Policy 2019, 82, 585–594. [Google Scholar] [CrossRef]
  29. Talukder, B.; Hipel, K.W. The PROMETHEE Framework for Comparing the Sustainability of Agricultural Systems. Resources 2018, 7, 74. [Google Scholar] [CrossRef] [Green Version]
  30. Melkonyan, A.; Gruchmann, T.; Lohmar, F.; Kamath, V.; Spinler, S. Sustainability Assessment of Last-Mile Logistics and Distribution Strategies: The Case of Local Food Networks. Int. J. Prod. Econ. 2020, 228, 107746. [Google Scholar] [CrossRef]
  31. Cai, M.; Zhang, W.Y.; Zhang, K. ManuHub: A Semantic Web System for Ontology-Based Service Management in Distributed Manufacturing Environments. IEEE Trans. Syst. Man Cybern.-Part A Syst. Hum. 2011, 41, 574–582. [Google Scholar] [CrossRef]
  32. Balasbaneh, A.T.; Yeoh, D.; Zainal Abidin, A.R. Life Cycle Sustainability Assessment of Window Renovations in Schools against Noise Pollution in Tropical Climates. J. Build. Eng. 2020, 32, 101784. [Google Scholar] [CrossRef]
  33. Siksnelyte, I.; Zavadskas, E.K.; Bausys, R.; Streimikiene, D. Implementation of EU Energy Policy Priorities in the Baltic Sea Region Countries: Sustainability Assessment Based on Neutrosophic MULTIMOORA Method. Energy Policy 2019, 125, 90–102. [Google Scholar] [CrossRef] [Green Version]
  34. Maedche, A. The Text-to-Onto Ontology Extraction and Maintenance System. In Proceedings of the ICDM-Workshop on Integrating Data Mining and Knowledge Management, San Jose, CA, USA, 29 November–2 December 2001. [Google Scholar]
  35. Konys, A. Towards Knowledge Handling in Ontology-Based Information Extraction Systems. Procedia Comput. Sci. 2018, 126, 2208–2218. [Google Scholar] [CrossRef]
  36. Jain, V.; Singh, M. Ontology Based Information Retrieval in Semantic Web: A Survey. Int. J. Inf. Technol. Comput. Sci. 2013, 5, 62–69. [Google Scholar] [CrossRef] [Green Version]
  37. Konys, A. A Tool Supporting Mining Based Approach Selection to Automatic Ontology Construction. IADIS J. Comput. Sci. Inf. Syst. 2015, 3–10. [Google Scholar]
  38. Boufrida, A.; Boufaida, Z. Rule Extraction from Scientific Texts: Evaluation in the Specialty of Gynecology. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 1150–1160. [Google Scholar] [CrossRef]
  39. Zhang, Z.; Gentile, A.L.; Ciravegna, F. Recent Advances in Methods of Lexical Semantic Relatedness—A Survey. Nat. Lang. Eng. 2013, 19, 411–479. [Google Scholar] [CrossRef]
  40. Zhang, F.; Cheng, J.; Ma, Z. A Survey on Fuzzy Ontologies for the Semantic Web. Knowl. Eng. Rev. 2016, 31, 278–321. [Google Scholar] [CrossRef]
  41. Morente-Molinera, J.A.; Pérez, I.J.; Ureña, M.R.; Herrera-Viedma, E. Building and Managing Fuzzy Ontologies with Heterogeneous Linguistic Information. Knowl.-Based Syst. 2015, 88, 154–164. [Google Scholar] [CrossRef]
  42. Díaz Rodríguez, N.; Cuéllar, M.P.; Lilius, J.; Delgado Calvo-Flores, M. A Fuzzy Ontology for Semantic Modelling and Recognition of Human Behaviour. Knowl.-Based Syst. 2014, 66, 46–60. [Google Scholar] [CrossRef]
  43. Poslad, S.; Kesorn, K. A Multi-Modal Incompleteness Ontology Model (MMIO) to Enhance Information Fusion for Image Retrieval. Inf. Fusion 2014, 20, 225–241. [Google Scholar] [CrossRef]
  44. Pérez, I.J.; Wikström, R.; Mezei, J.; Carlsson, C.; Herrera-Viedma, E. A New Consensus Model for Group Decision Making Using Fuzzy Ontology. Soft Comput. 2013, 17, 1617–1627. [Google Scholar] [CrossRef]
  45. Fensel, D. Ontologies; Springer: Berlin/Heidelberg, Germany, 2001; pp. 11–18. [Google Scholar]
  46. Little, E.G.; Rogova, G.L. Designing Ontologies for Higher Level Fusion. Inf. Fusion 2009, 10, 70–82. [Google Scholar] [CrossRef]
  47. Gruber, T.R. A Translation Approach to Portable Ontology Specifications. Knowl. Acquis. 1993, 5, 199–220. [Google Scholar] [CrossRef]
  48. Besbes, G.; Baazaoui-Zghal, H. Fuzzy Ontologies for Search Results Diversification: Application to Medical Data. In Proceedings of the 33rd Annual ACM Symposium on Applied Computing, Pau, France, 9 April 2018; pp. 1968–1975. [Google Scholar]
  49. Shafna, S.; Rajendran, V.V. Fuzzy Ontology Based Recommender System with Diversification Mechanism. In Proceedings of the 2017 International Conference on Intelligent Computing and Control (I2C2), IEEE, Coimbatore, India, 23–24 June 2017; pp. 1–6. [Google Scholar]
  50. Motik, B.; Parsia, B.; Patel-Schneider, P.F. OWL 2 Web Ontology Language XML Serialization. World Wide Web Consort. 2009. Available online: https://www.w3.org/TR/2009/WD-owl2-xml-serialization-20090421/all.pdf (accessed on 1 April 2022).
  51. Corcoglioniti, F.; Rospocher, M.; Aprosio, A.P. Frame-Based Ontology Population with PIKES. IEEE Trans. Knowl. Data Eng. 2016, 28, 3261–3275. [Google Scholar] [CrossRef]
  52. Blandón Andrade, J.C.; Zapata Jaramillo, C.M. Gate-Based Rules for Extracting Attribute Values. Computación y Sistemas 2021, 25, 851–862. [Google Scholar] [CrossRef]
  53. Corcho, O.; Fernández-López, M.; Gómez-Pérez, A.; López-Cima, A. Building Legal Ontologies with METHONTOLOGY and WebODE. In Law and the Semantic Web; Benjamins, V.R., Casanovas, P., Breuker, J., Gangemi, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3369, pp. 142–157. ISBN 978-3-540-25063-0. [Google Scholar]
  54. Maedche, A.; Staab, S. Ontology Learning for the Semantic Web. IEEE Intell. Syst. 2001, 16, 72–79. [Google Scholar] [CrossRef] [Green Version]
  55. Konys, A. Knowledge Repository of Ontology Learning Tools from Text. Procedia Comput. Sci. 2019, 159, 1614–1628. [Google Scholar] [CrossRef]
  56. Konys, A. Knowledge Systematization for Ontology Learning Methods. Procedia Comput. Sci. 2018, 126, 2194–2207. [Google Scholar] [CrossRef]
  57. Cimiano, P. Ontology Learning from Text. In Ontology Learning and Population from Text; Springer Science & Business Media: Berlin, Germany, 2006; pp. 19–34. ISBN 978-0-387-30632-2. [Google Scholar]
  58. Ma, C.; Molnár, B. Use of Ontology Learning in Information System Integration: A Literature Survey. In Intelligent Information and Database Systems; Sitek, P., Pietranik, M., Krótkiewicz, M., Srinilta, C., Eds.; Springer: Singapore, 2020; Volume 1178, pp. 342–353. ISBN 9789811533792. [Google Scholar]
  59. Available online: www.scopus.com (accessed on 1 April 2022).
  60. van Eck, N.J.; Waltman, L. VOS: A New Method for Visualizing Similarities Between Objects. In Advances in Data Analysis; Decker, R., Lenz, H.-J., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 299–306. ISBN 978-3-540-70980-0. [Google Scholar]
  61. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G. The PRISMA Group Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med. 2009, 6, e1000097. [Google Scholar] [CrossRef] [Green Version]
  62. van Eck, N.J.; Waltman, L. Software Survey: VOSviewer, a Computer Program for Bibliometric Mapping. Scientometrics 2010, 84, 523–538. [Google Scholar] [CrossRef] [Green Version]
  63. Musen, M.A. The Protégé Project: A Look Back and a Look Forward. AI Matters 2015, 1, 4–12. [Google Scholar] [CrossRef] [PubMed]
  64. Noy, N.F.; McGuinness, D.L. Ontology Development 101: A Guide to Creating Your First Ontology. 2001. Available online: https://protege.stanford.edu/publications/ontology_development/ontology101.pdf (accessed on 1 April 2022).
  65. Ganino, G.; Lembo, D.; Mecella, M.; Scafoglieri, F. Ontology Population for Open-Source Intelligence: A GATE-Based Solution: Ontology Population for OSInt: A GATE-Based Solution. Softw. Pract. Exp. 2018, 48, 2302–2330. [Google Scholar] [CrossRef]
Figure 1. An example of information extraction.
Figure 2. Modified PRISMA procedure [61].
Figure 3. Collected and elaborated data using the VOSviewer software [62].
Figure 4. Ontology construction steps.
Figure 5. Semi-automatic information extraction and knowledge base model constructions.
Figure 6. Information extraction procedure and feeding the ontology with new knowledge. Source: Personal elaboration on the basis of the GATE documentation.
Figure 7. The network visualization of selected items. Source: Personal elaboration using VOSviewer software [62].
Figure 8. The density visualization of selected items. Source: Personal elaboration using VOSviewer software [62].
Figure 9. Selected criteria of the constructed ontology. Source: Personal elaboration using Protégé software [63].
Figure 10. Selected criterion technology with sub-criteria. Source: Personal elaboration using Protégé software [63].
Figure 10. Selected criterion technology with sub-criteria. Source: Personal elaboration using Protégé software [63].
Electronics 11 04012 g010
Figure 11. An example of a sustainable supplier profile. Source: Personal elaboration using Protégé software [63].
Figure 11. An example of a sustainable supplier profile. Source: Personal elaboration using Protégé software [63].
Electronics 11 04012 g011
Figure 12. Description of sustainable supplier profile. Source: Personal elaboration using Protégé software [63].
Figure 12. Description of sustainable supplier profile. Source: Personal elaboration using Protégé software [63].
Electronics 11 04012 g012
Figure 13. Displaying the exemplary annotations from the text (web resource). Source: Personal elaboration using GATE software [65].
Figure 13. Displaying the exemplary annotations from the text (web resource). Source: Personal elaboration using GATE software [65].
Electronics 11 04012 g013
Figure 14. Displaying the ontology using the OWLIM Ontology plugin and GATE Ontology Editor. Source: Personal elaboration using GATE software [65].
Figure 14. Displaying the ontology using the OWLIM Ontology plugin and GATE Ontology Editor. Source: Personal elaboration using GATE software [65].
Electronics 11 04012 g014
Figure 15. The exemplary JAPE rule “Quality1”. Source: Personal elaboration using GATE software [65].
Figure 15. The exemplary JAPE rule “Quality1”. Source: Personal elaboration using GATE software [65].
Electronics 11 04012 g015
Figure 16. Populated ontology after applying the created rules. Source: Personal elaboration using Protégé software [63].
Figure 16. Populated ontology after applying the created rules. Source: Personal elaboration using Protégé software [63].
Electronics 11 04012 g016
Figure 17. The execution of the JAPE rule for extracting attribute values for rule Quality1. Source: Personal elaboration using GATE software [65].
Figure 17. The execution of the JAPE rule for extracting attribute values for rule Quality1. Source: Personal elaboration using GATE software [65].
Electronics 11 04012 g017
Figure 18. The execution of the LHS JAPE rule for extracting attribute values for rule QualityLookup. Source: Personal elaboration using GATE software [65].
Figure 18. The execution of the LHS JAPE rule for extracting attribute values for rule QualityLookup. Source: Personal elaboration using GATE software [65].
Electronics 11 04012 g018
Figure 19. Graphical visualization of the part of populated ontology after applying the created rules. Source: Personal elaboration using Protégé software [63].
Figure 19. Graphical visualization of the part of populated ontology after applying the created rules. Source: Personal elaboration using Protégé software [63].
Electronics 11 04012 g019
Figure 20. The log results after using HermiT and Pellet reasoners. Source: Personal elaboration using Protégé software [63].
Figure 20. The log results after using HermiT and Pellet reasoners. Source: Personal elaboration using Protégé software [63].
Electronics 11 04012 g020
Table 1. Examples of multi-criteria methods to support the selection of sustainable suppliers.

| MCDA Method | Domain | Source |
| --- | --- | --- |
| AHP | urban water reuse, energy landscape | [25,26] |
| Multi-attribute Value Theory (MAVT) | electricity system, RES, urban regeneration | [27,28] |
| PROMETHEE | logistics and distribution, agriculture | [29,30] |
| TOPSIS | air pollution, transportation sector | [31,32] |
| MULTIMOORA | energy policy | [33] |
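To make the role of these MCDA methods concrete, the following is a minimal sketch of one of them, TOPSIS, ranking suppliers by relative closeness to an ideal solution. The supplier names, score matrix, and weights are invented for illustration, and all criteria are treated as benefit criteria.

```python
# Minimal TOPSIS sketch: rank alternatives by closeness to the ideal solution.
# Suppliers, scores, and weights below are illustrative, not from the paper.
import math

def topsis(matrix, weights):
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(len(weights))]
    v = [[w * row[j] / n for j, (w, n) in enumerate(zip(weights, norms))] for row in matrix]
    ideal = [max(col) for col in zip(*v)]   # best value per (benefit) criterion
    anti = [min(col) for col in zip(*v)]    # worst value per criterion
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)       # distance to the ideal solution
        d_neg = math.dist(row, anti)        # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

alternatives = ["SupplierA", "SupplierB", "SupplierC"]
matrix = [[7, 9, 6], [8, 7, 8], [6, 8, 7]]  # rows: suppliers; cols: e.g., quality, delivery, green
scores = topsis(matrix, [0.5, 0.3, 0.2])
ranking = sorted(zip(alternatives, scores), key=lambda p: -p[1])
```

A score closer to 1 means the supplier is nearer to the ideal profile; cost-type criteria would additionally require inverting the preference direction before normalization.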
Table 2. The top 10 keywords.

| Keyword | Occurrences |
| --- | --- |
| Green | 543 |
| Cost | 420 |
| Quality | 375 |
| Service | 287 |
| Delivery | 285 |
| Time | 214 |
| Risk | 200 |
| Price | 154 |
| Technology | 152 |
| Waste | 140 |
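Keyword frequencies of the kind shown in Table 2 can be reproduced on any text corpus with a simple token count. The two sample sentences below are invented; the keyword set follows the table.

```python
# Counting criterion-related keywords in a small invented corpus,
# analogous to how Table 2 aggregates keyword occurrences.
import re
from collections import Counter

corpus = [
    "Green logistics and green packaging reduce delivery cost.",
    "Quality assurance lowers cost and risk in delivery.",
]
keywords = {"green", "cost", "quality", "delivery", "risk"}

tokens = [t for doc in corpus for t in re.findall(r"[a-z]+", doc.lower())]
counts = Counter(t for t in tokens if t in keywords)
top = counts.most_common()  # keywords sorted by descending frequency
```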
Table 3. Examples of classes.

| Class | Description |
| --- | --- |
| Quality | A collection of criteria related to assessing quality (e.g., product quality, quality assurance, QMS) |
| Green | A collection of criteria related to assessing greenness (e.g., green competencies, green logistics, green packaging) |
| Service | A collection of criteria related to assessing the level of offered service (e.g., flexibility of the supplier, payment flexibility) |
| Delivery | A collection of criteria related to assessing the level of delivery (e.g., delivery lead time, delivery safety, delivery flexibility) |
| Cost | A collection of criteria related to assessing the level of cost (e.g., cost control, delivery costs, freight costs) |
| Risks | A collection of criteria related to assessing the level of risks (e.g., economic risk, environmental risk) |
| Knowledge | A collection of criteria related to assessing the level of knowledge (e.g., sustainable knowledge sharing, IT knowledge) |
| Supplier’s profile | A collection of criteria related to assessing the supplier’s profile (e.g., supplier’s reputation, references) |
| Logistics | A collection of criteria related to assessing the level of logistics (e.g., reverse logistics, logistics for environment, green logistics) |
| Pollution | A collection of criteria related to assessing the level of pollution (e.g., energy consumption, pollution control, use of harmful materials) |
Table 4. Examples of object properties and datatype properties.

| Type | Property | Domain | Range |
| --- | --- | --- | --- |
| Object Property | hasCriterion | Criteria | Sus_Supplier |
| Object Property | isCriterionOf | Sus_Supplier | Criteria |
| Object Property | hasFeature | Criteria | Sus_Supplier |
| Object Property | isFeatureOf | Sus_Supplier | Criteria |
| Datatype Property | hasValue | Criteria | xsd:double |
| Datatype Property | hasRating | Sus_Supplier | xsd:int |
| Datatype Property | hasOpinion | Sus_Supplier | xsd:string |
| Datatype Property | hasLevel_of_Sustainability | Criteria | xsd:string |
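The property declarations in Table 4 can be written out as Turtle axioms. The sketch below generates a few of them as strings; the ontology namespace IRI is hypothetical, while the property, class, and datatype names follow the table.

```python
# Emitting Turtle declarations for a subset of the Table 4 properties.
# The base namespace <http://example.org/sss#> is a placeholder.
PREFIXES = (
    "@prefix : <http://example.org/sss#> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> ."
)

def object_property(name, domain, range_):
    # Object properties relate two individuals (classes as domain/range).
    return f":{name} a owl:ObjectProperty ; rdfs:domain :{domain} ; rdfs:range :{range_} ."

def datatype_property(name, domain, range_):
    # Datatype properties relate an individual to a literal value.
    return f":{name} a owl:DatatypeProperty ; rdfs:domain :{domain} ; rdfs:range {range_} ."

axioms = [
    object_property("hasCriterion", "Criteria", "Sus_Supplier"),
    object_property("isCriterionOf", "Sus_Supplier", "Criteria"),
    datatype_property("hasValue", "Criteria", "xsd:double"),
    datatype_property("hasRating", "Sus_Supplier", "xsd:int"),
]
turtle = PREFIXES + "\n" + "\n".join(axioms)
```

The resulting text can be loaded into any RDF/OWL tool (e.g., Protégé) that accepts Turtle.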
Table 5. An implemented code of the sample JAPE rule.

Phase: Quality
Input: Token
Options: control = appelt

Rule: Quality1
Priority: 100
(
  {Token.string == "Quality"}
):quality
-->
:quality.Quality = {rule = "Quality1"}
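The "Quality1" rule in Table 5 simply annotates every token whose string equals "Quality". The following plain-Python analogue (not GATE itself, and using a naive whitespace tokenizer on an invented sentence) mirrors that matching logic.

```python
# Plain-Python analogue of the "Quality1" JAPE rule from Table 5:
# attach a Quality annotation wherever the token string equals "Quality".
def apply_quality1(tokens):
    annotations = []
    for i, tok in enumerate(tokens):
        if tok == "Quality":  # corresponds to {Token.string == "Quality"}
            annotations.append({"type": "Quality", "rule": "Quality1", "index": i})
    return annotations

text = "Quality assurance and delivery: Quality matters."
tokens = text.replace(":", " ").replace(".", " ").split()
anns = apply_quality1(tokens)
```

In GATE, the resulting Quality annotations would then drive ontology population, as shown in Figures 16 and 17.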
Table 6. The working example of the 1st query.

Query 1:
<EquivalentClasses>
  <Class IRI="#CQ_1"/>
  <ObjectUnionOf>
    <ObjectIntersectionOf>
      <Class IRI="#Sus_Supplier"/>
      <ObjectSomeValuesFrom>
        <ObjectProperty IRI="#has_Criterion"/>
        <Class IRI="#Flexibility_Technical_capacity"/>
      </ObjectSomeValuesFrom>
      <ObjectSomeValuesFrom>
        <ObjectProperty IRI="#has_Criterion"/>
        <Class IRI="#Quality_Return_rate"/>
      </ObjectSomeValuesFrom>
      <ObjectSomeValuesFrom>
        <ObjectProperty IRI="#has_Criterion"/>
        <Class IRI="#Responsiveness"/>
      </ObjectSomeValuesFrom>
    </ObjectIntersectionOf>
    <ObjectSomeValuesFrom>
      <ObjectProperty IRI="#has_Criterion"/>
      <Class IRI="#Delivery"/>
    </ObjectSomeValuesFrom>
    <ObjectSomeValuesFrom>
      <ObjectProperty IRI="#has_Criterion"/>
      <Class IRI="#Quality_Discount_rate"/>
    </ObjectSomeValuesFrom>
  </ObjectUnionOf>
</EquivalentClasses>
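Read as a class expression, Query 1 matches a supplier that either satisfies all three intersected criteria (together with being a Sus_Supplier) or has at least one of the two union-level criteria. The set-based sketch below evaluates that reading over invented supplier data; all individuals are assumed to be Sus_Supplier instances.

```python
# Set-based reading of Query 1: intersection branch requires all three
# criteria; the union branches each require a single criterion.
REQUIRED_ALL = {"Flexibility_Technical_capacity", "Quality_Return_rate", "Responsiveness"}
REQUIRED_ANY = {"Delivery", "Quality_Discount_rate"}

def matches_cq1(criteria):
    crit = set(criteria)
    return REQUIRED_ALL <= crit or bool(REQUIRED_ANY & crit)

# Invented supplier profiles; each individual is assumed a Sus_Supplier.
suppliers = {
    "S1": {"Flexibility_Technical_capacity", "Quality_Return_rate", "Responsiveness"},
    "S2": {"Delivery"},
    "S3": {"Quality_Return_rate"},
}
selected = [name for name, crit in suppliers.items() if matches_cq1(crit)]
```

A DL reasoner such as HermiT or Pellet performs the equivalent classification over the ontology, placing matching individuals under CQ_1.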
Table 7. The working example of the 2nd query.

Query 2:
<EquivalentClasses>
  <Class IRI="#CQ_2"/>
  <ObjectUnionOf>
    <ObjectIntersectionOf>
      <Class IRI="#Sus_Supplier"/>
      <ObjectSomeValuesFrom>
        <ObjectProperty IRI="#has_Criterion"/>
        <Class IRI="#Quality_ISO_9000"/>
      </ObjectSomeValuesFrom>
      <ObjectSomeValuesFrom>
        <ObjectProperty IRI="#has_Criterion"/>
        <Class IRI="#Quality_Quality_of_product"/>
      </ObjectSomeValuesFrom>
      <ObjectSomeValuesFrom>
        <ObjectProperty IRI="#has_Criterion"/>
        <Class IRI="#Delivery_Delivery_Service"/>
      </ObjectSomeValuesFrom>
    </ObjectIntersectionOf>
    <ObjectSomeValuesFrom>
      <ObjectProperty IRI="#has_Criterion"/>
      <Class IRI="#Delivery_On_time_delivery"/>
    </ObjectSomeValuesFrom>
    <ObjectSomeValuesFrom>
      <ObjectProperty IRI="#has_Criterion"/>
      <Class IRI="#Reputation_Reputation_of_supplier"/>
    </ObjectSomeValuesFrom>
  </ObjectUnionOf>
</EquivalentClasses>
Konys, A. An Ontology-Based Approach for Knowledge Acquisition: An Example of Sustainable Supplier Selection Domain Corpus. Electronics 2022, 11, 4012. https://doi.org/10.3390/electronics11234012