Developing Contextual Ontology for Chronic Diseases: AI-Enhanced Extension and Prediction in an Asthma Case Study

Msheik, Batoul; Adda, Mehdi; Mcheick, Hamid; Nasser, Youmna; Dbouk, Mohamed

doi:10.3390/app15084353


by Batoul Msheik ¹,*, Mehdi Adda ², Hamid Mcheick ¹, Youmna Nasser ³ and Mohamed Dbouk ³

¹ Computer Science Department, Université du Québec à Chicoutimi, 555, Boul de l’Université, Chicoutimi, QC G7H 2B1, Canada

² Département de Mathématiques, Informatique et Génie, Université du Québec à Rimouski, 300 Allée des Ursulines, Rimouski, QC G5L 3A1, Canada

³ Computer Science Department, Université Libanaise, Beirut 6573/14, Lebanon

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(8), 4353; https://doi.org/10.3390/app15084353

Submission received: 27 January 2025 / Revised: 2 April 2025 / Accepted: 3 April 2025 / Published: 15 April 2025


Abstract: The growing complexity and interdependence of healthcare data, especially for chronic diseases such as asthma, demand innovative approaches for effective knowledge representation. This study introduces a general contextual ontology model for chronic diseases, extended specifically to asthma. Leveraging real-world datasets, the extended asthma ontology integrates key factors such as symptoms, triggers, treatments, and patient demographics, providing a comprehensive framework for disease management. The ontology was validated using intrinsic metrics such as classification, reusability, and completeness in healthcare applications. To support this validation, we identified the most relevant parameters and used decision trees to extract rules, which were then expressed in the Semantic Web Rule Language (SWRL). These rules facilitate reasoning, validation, and decision-making within the ontology. The results highlight the potential of developing a general contextual ontology and extending it to address specific chronic diseases, such as asthma. We designed a general contextual ontology framework by integrating the extended ontology with artificial intelligence algorithms, identifying relevant parameters, and extracting rules to enhance knowledge representation and support clinical decision-making. This framework can be applied to other disease case studies.

Keywords: ontology development; context-aware systems; ontology reasoning; artificial intelligence in healthcare

1. Introduction

Despite significant advancements in diagnosis and prevention, many diseases continue to pose a serious threat to human health, killing millions annually. The World Health Organization (WHO) estimates that chronic diseases (non-communicable diseases, NCDs) are responsible for 71% of all deaths globally, or 41 million people each year [1]. The four main types of NCDs are cardiovascular diseases, cancers, respiratory diseases, and diabetes [1].

Asthma, a chronic lung condition, causes swollen and narrow airways, making breathing difficult. It affects approximately 339 million people globally, including over 3.8 million in Canada [2]. The two main types are allergic asthma, triggered by allergens and often starting early in life, and non-allergic asthma, which has no identifiable triggers and typically begins later [3]. Allergic asthma is more common and usually responds better to treatment, accounting for 75–80% of cases [4].

Knowledge representation refers to structuring information to ensure efficient storage and easy retrieval [5]. In health care, knowledge representation involves capturing medical knowledge in a way that computers can understand. This helps support doctors and nurses in making better decisions [6]. By representing medical knowledge in a format that computers can use, healthcare providers can improve diagnosis, treatment planning, and medication management. Knowledge representation models (KRMs) also help analyze patient data across groups, allowing healthcare providers to identify diseases more effectively [7].

The representation of the medical context requires a deep understanding of the healthcare domain and its various dimensions. Context can significantly enhance the representation of knowledge, particularly in the healthcare domain [8,9]. Consequently, medical data can encompass several parameters, and the acquisition process must facilitate data representation that enables understanding, standardization, and centralized management of all diseases. Data now plays a crucial role in decision-making across multiple fields. However, datasets in the medical domain are often complex and disorganized, making direct extraction of useful information difficult [10,11]. Asthma represents a chronic disease that can be investigated through a patient’s signs, medical history, demographic information, and other parameters [12].

Many KRMs exist, such as frames, UML, and graphical representations. We focus on ontology models, which facilitate the management of data by incorporating relevant domain concepts and their associated relations [9]. This shared knowledge system allows for a comprehensive representation of complex information, which is crucial in fields such as health care, where data interdependence is significant [9]. Ontologies help in structuring and managing healthcare data to offer tailored services [13]. According to [14], ontologies are used in 50% of applications, which emphasizes their crucial role in organizing and representing data, while the other 50% rely on alternative modelling approaches.

The development and integration of standard ontologies, such as the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), facilitate the exchange of information through medical information systems [15]. The use of standard ontologies ensures consistent and semantically interoperable data management across different healthcare systems [15].

The main contributions of this paper are:

Designing a general contextual ontology model that includes the key knowledge domains related to chronic disease management.
Extending this general contextual ontology to a domain-specific model for asthma, highlighting the identification of classes, entities and their interrelations.
Supporting the evaluation process and enhancing the inference capabilities (reasoning) of this extended asthma ontology by extracting the rules using artificial intelligence (AI) algorithms.
Evaluating this extended asthma ontology using intrinsic metrics such as classification, reusability and completeness.

This paper aims to improve the monitoring of healthcare systems for chronic diseases by creating an ontology related to context categorization and extending it specifically to asthma. Additionally, we identify the most relevant parameters using AI algorithms, leading to more accurate, easy-to-understand, and efficient predictions. This approach can be useful in various areas, making it a valuable tool for addressing real-world knowledge representation challenges in disease diagnosis.

The paper is organized as follows. Section 2 provides a comprehensive review of related work and the state of the art in the field of chronic diseases. Section 3 analyzes the methodologies used to design an ontology and motivates the adoption of Sánchez’s methodology for designing the General Contextual Chronic Disease Ontology. In Section 4, we describe our general ontology framework and examine its reasoning and evaluation aspects. Finally, conclusions are presented in Section 5.

2. State-of-the-Art

With the increasing volume, variety, and complexity of data, ontologies have become a key area of research. Ontology is one of the most comprehensive semantic frameworks for enabling knowledge representation, integration, and reasoning [15]. Gruber initially defined an ontology as an “explicit specification of a conceptualization” [16]. Later, Borst described it as a “formal specification of a shared conceptualization” [17]. Studer [18], combining these perspectives, defined ontology as a “formal, explicit specification of a shared conceptualization” [19]. A conceptualization refers to an abstract model that represents the real world through objects, concepts, entities, and their relationships within a specific domain. “Explicit” ensures that all concepts and constraints are clearly defined to avoid misinterpretation [19]. Semantic formalization is often used to interpret complex information, making it meaningful and accessible to machines [20]. This means that queries can be made based on the purpose of the data.

In health care, ontologies have been applied in areas such as cancer treatment planning [21], EHR interoperability [22], rare disease data integration [23], and diabetes management [24]. They have also advanced drug discovery [25] and genomic-based precision medicine [26]. Many uses have arisen for ontology in health care during the last decade, as shown in Table 1.

These ontologies are designed for specific diseases [39]. As a result, they are often limited in scope to particular diseases or use cases and fail to address the diversity of healthcare data for chronic diseases within a unified framework [40,41]. Moreover, most ontologies do not adhere to international medical terminologies; this limits the reuse and sharing of knowledge, and tasks such as triggering applications and notifications still require human intervention [42,43]. This article highlights the need for a general contextual ontology capable of representing diverse healthcare data, extending it to chronic diseases, and reusing it for different case-study diseases.

Comparing the General Contextual Ontology with the Existing Ontologies

We review some of the most relevant existing ontologies available in BioPortal and other repositories.

Asthma Ontology (AO): The Asthma Ontology, available on BioPortal, provides a structured representation of asthma-related factors, including symptoms, triggers, and treatments. However, it lacks contextual dimensions such as environmental and organizational factors, which are crucial for real-world asthma management [44].
Disease Ontology (DO): The Disease Ontology offers a broad classification of diseases, including asthma, but does not provide fine-grained relationships between symptoms and external environmental factors [45].
Mondo Disease Ontology: This ontology integrates multiple disease classifications, facilitating cross-referencing. However, it does not explicitly support AI-driven rule generation for clinical decision support [46].
Monarch Initiative: Monarch integrates multiple ontological sources for disease modeling but primarily focuses on genetic and phenotypic aspects, limiting its applicability in real-world asthma management [47].

While these ontologies provide valuable frameworks for disease classification, they lack a comprehensive contextual representation that integrates environmental, temporal, and organizational factors. Additionally, they do not incorporate AI-driven ontology validation methods to enhance reasoning and rule-based decision-making.

To highlight the originality and advantages of our approach, we compare our GCOCD with related existing disease ontologies. Table 2 provides a comparative analysis of key features.

The GCOCD extends existing models by integrating environmental and temporal factors, which are crucial for asthma monitoring. Additionally, it enhances reasoning capabilities through AI-driven rule extraction, making it a dynamic and adaptive ontology for healthcare applications.

3. Methodology for Designing General Contextual Chronic Disease Ontology

Many ontology development methods have been introduced over the years, each focusing on different aspects. Uschold and King (1995) focused on creating an ontology with basic guidelines to capture domain knowledge; however, their method lacked clear steps for practical implementation [48]. Grüninger and Fox (1995) worked on the TOVE project, which used questions and scenarios to define the purpose of the ontology, but their method was too abstract for broader use [49]. METHONTOLOGY (1997) introduced an organized process with clear steps that define the concepts, formalize them, and evaluate the ontology. This is classified as one of the more detailed methods [50]. Other methods focused on making ontologies reusable, such as Ontolingua (1995), which provided a library of existing ontologies for collaboration but did not explain how to adapt these ontologies for new uses [51].

SENSUS (1996) created a large hierarchy of concepts for natural language processing, but it was not specific enough for certain domains [52]. Additionally, methods such as the 101 Method (2001) provided practical advice for defining classes and relationships in ontologies but omitted important steps, including lifecycle planning [53]. Researchers such as Lopez et al. [54] compared these methods using criteria including reusability and how well they supported collaboration and lifecycle management. They found that no single method met all needs. For example, TERMINAE and Termontography focused on extracting knowledge from text in multiple languages but were limited to English and French or were no longer supported [55]. Efforts to combine methods, such as merging METHONTOLOGY with Cyc 101, have shown potential, especially for designing medical ontologies [56].

Although various ontology development methodologies have been proposed, many lack comprehensive guidance on collaboration, lifecycle management, and adaptation to diverse needs [57]. To create an ontology that is both reusable and extendable, we adopt Sánchez’s methodology [58]. This approach (as shown in Figure 1) offers clear steps and can accommodate various requirements, making it a reliable choice for long-term and flexible ontology development.

This methodology outlines the process of building an ontology to represent chronic diseases, with a focus on extending and validating it for asthma-specific use cases.

3.1. Designing General Contextual Ontology: Contextual Insights from Dataset Analysis

Define the scope and domain of the ontology: The ontology focuses on chronic diseases and aims to facilitate data gathering and representation.

The questions guiding the ontology’s domain include:

What is the domain the ontology will cover?

Chronic diseases are the focus, addressing conditions such as asthma, diabetes and hypertension.

What is the purpose of this ontology?

To aid in collecting and representing chronic disease data for predictive diagnosis and healthcare research.

Who will use the ontology?

Hospitals, healthcare professionals, researchers and organizations conducting disease-related studies.

What types of questions should the information in the ontology answer?

Is this patient diagnosed with a chronic disease?

What are the symptoms of this disease?

The domain scope integrates temporal, environmental, organizational and medical contexts to comprehensively model chronic diseases.

Develop the conceptual model: Following Sánchez’s methodology, the ontology is structured using context categorization. Identify key terms in the ontology:

- Nouns (Concepts): Patient, Physician, Symptom, Treatment, Diagnosis, Allergy.
- Attributes (Attributes of Concepts): Age, Gender, Diagnosis Date, Environmental Factors (e.g., pollution levels).
- Verbs (Relationships): “hasSymptom” (Patient → Symptom), “isTreatedBy” (Disease → Treatment), “isTriggeredBy” (Symptom → Environmental Factor).
- Standardization: SNOMED CT is used for consistent representation of symptoms and medical terms.
- Classes and their hierarchy:
  - Top-Level Classes: Temporal Context, Environmental Context, Medical Data, Organizational Context, Events.
  - Subclasses: Under Medical Data: Symptoms, Treatments, Diagnoses, Allergies. Under Environmental Context: Pollution, Temperature, Humidity.
- Define class properties:
  - Object Properties: hasSymptom: Domain = Patient, Range = Symptom. isTreatedBy: Domain = Symptom, Range = Treatment.
  - Datatype Properties: age: Integer, diagnosisDate: Date.
- Define slot facets: Assign value types and permitted values to properties:
  - Age: Integer (1 to 100)
  - PollutionLevel: Float (0.0 to 500.0)
- Create instances: Populate slots with real-world values from datasets, e.g., a patient named “Sami” with Age = 45, Gender = Male, Symptom = “Chest Tightness”.
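
The slot facets and object properties listed above can be illustrated with a small, self-contained Python sketch. It is not the OWL-DL implementation built in Protégé; it only mimics, with plain Python data structures, how a value-range facet and a domain/range constraint would reject invalid assertions. The `Individual` class and the `FACETS` and `OBJECT_PROPERTIES` tables are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

# Slot facets taken from the list above: value type and permitted range.
FACETS = {
    "age": (int, 1, 100),
    "pollutionLevel": (float, 0.0, 500.0),
}

# Object properties with (domain, range), as given under "Define class properties".
OBJECT_PROPERTIES = {
    "hasSymptom": ("Patient", "Symptom"),
    "isTreatedBy": ("Symptom", "Treatment"),
}


@dataclass
class Individual:
    """Hypothetical stand-in for an ontology individual (not an OWL API)."""
    name: str
    cls: str
    data: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # (property, target name) pairs

    def set_value(self, prop, value):
        """Store a datatype value only if it satisfies the slot facet."""
        expected_type, low, high = FACETS[prop]
        if type(value) is not expected_type:
            raise TypeError(f"{prop} expects {expected_type.__name__}")
        if not low <= value <= high:
            raise ValueError(f"{prop} must lie in [{low}, {high}]")
        self.data[prop] = value

    def link(self, prop, target):
        """Assert an object property only if domain and range match."""
        domain, rng = OBJECT_PROPERTIES[prop]
        if self.cls != domain or target.cls != rng:
            raise ValueError(f"{prop} requires {domain} -> {rng}")
        self.links.append((prop, target.name))


# The instance from the text: patient "Sami", Age = 45, Symptom = "Chest Tightness".
sami = Individual("Sami", "Patient")
sami.set_value("age", 45)
chest_tightness = Individual("ChestTightness", "Symptom")
sami.link("hasSymptom", chest_tightness)
```

In an actual OWL-DL ontology, these checks would be expressed as datatype restrictions and property domain/range axioms and enforced by a reasoner such as Pellet rather than by application code.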

Sánchez’s Methodology with a Comparative Perspective

Although several ontology development methodologies exist, such as METHONTOLOGY, Ontolingua, and the 101 Method, many fall short in addressing long-term reusability and modular extension. Sánchez’s methodology was selected for its ability to integrate structured development steps with adaptability across medical domains [53,58]. Sánchez’s method combines conceptual modeling, class–property definition, and real-world dataset mapping, making it a strong fit for developing a general ontology that can later be extended to specific diseases like asthma. Moreover, Sánchez’s method supports ontology lifecycle planning and enables context categorization, which is essential for chronic diseases influenced by environmental, organizational, and temporal dimensions.

3.2. Extending the General Contextual Ontology to Asthma

The GCOCD was developed to provide a flexible and reusable framework for modeling chronic diseases, but it lacks specificity when applied to disease-specific contexts such as asthma. The key missing information includes:

Asthma-specific symptoms and triggers, such as wheezing, airway inflammation, and pollen exposure, which are absent from the general ontology.
Environmental and temporal factors, including pollution, humidity, and seasonal variations, which influence asthma severity but are not fully captured in the GCOCD.
Granular treatment pathways, since the general ontology does not distinguish between general chronic disease management and targeted asthma interventions, such as bronchodilators and corticosteroids.

To address these limitations, the extension process focuses on context-based adaptation and hierarchical refinement, ensuring that the extended ontology remains compatible with the GCOCD while accurately modeling asthma-specific knowledge. The detailed construction process is outlined in the subsection “Ontology Construction from Dataset Schema”, where targeted modifications are introduced to specialize symptoms, treatment pathways, and environmental factors relevant to asthma.

Ontology Construction from Dataset Schema

To refine the ontology structure, dataset-driven approaches were used:

(a): Feature-Based Extension Using Relevant Parameters

Key features were prioritized based on their impact on asthma severity and treatment. Feature selection was performed using AI algorithms, as discussed in Section 4.4.

(b): Context-Specific Data Integration

Most of the parameters present in the general ontology are reused in the extended asthma ontology: temporal, environmental, medical and organizational.

- Analyze the dataset to extract key features:
  - Demographics: Includes attributes such as Age, Gender, Ethnicity, Education Level, and BMI.
  - Symptoms: Specific symptoms such as Chest Tightness, Coughing, and Night-time Symptoms.
  - Triggers: Triggers such as Exercise-Induced and environmental factors (if present in other columns).
  - Treatments: Information about diagnosis and treatments.
- Map dataset features to the ontology:
  - Classes:
    - Patient: Represents individuals with attributes such as Age, Gender, Ethnicity and BMI.
    - Symptoms: Includes asthma-specific subclasses such as Chest Tightness, Coughing and Night-time Symptoms.
    - Triggers: Includes subclasses such as Exercise Induced and potentially environmental conditions if described elsewhere.
    - Diagnosis: Links patients to identified chronic diseases (e.g., asthma).
  - Relationships:
    - Patient → hasSymptom → Symptom.
    - Symptom → isTriggeredBy → Trigger.
    - Patient → isDiagnosedWith → Diagnosis.
- Create relationships for the asthma-specific ontology. Example relationships include:
  - “Patient 5034 hasSymptom Chest Tightness”
  - “Chest Tightness isTriggeredBy Exercise”.
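
The mapping from dataset rows to ontology relationships described above can be sketched as follows. This is a minimal illustration, assuming one row per patient with binary symptom flags; the column names (`PatientID`, `ChestTightness`, `Coughing`, `NighttimeSymptoms`, `Diagnosis`) and the example values are assumptions made for the sketch, not the confirmed schema or contents of the dataset.

```python
# Hypothetical mapping from dataset columns to Symptom subclasses.
SYMPTOM_COLUMNS = {
    "ChestTightness": "Chest Tightness",
    "Coughing": "Coughing",
    "NighttimeSymptoms": "Night-time Symptoms",
}


def row_to_triples(row):
    """Turn one dataset row into (subject, property, object) triples."""
    subject = f"Patient {row['PatientID']}"
    triples = []
    for column, symptom in SYMPTOM_COLUMNS.items():
        if row.get(column) == 1:  # flag set: the patient has the symptom
            triples.append((subject, "hasSymptom", symptom))
    if row.get("Diagnosis") == 1:  # 1 = asthma, 0 = no asthma
        triples.append((subject, "isDiagnosedWith", "Asthma"))
    return triples


# Illustrative row only; the values are made up for this example.
example_row = {
    "PatientID": 5034,
    "ChestTightness": 1,
    "Coughing": 0,
    "NighttimeSymptoms": 0,
    "Diagnosis": 0,
}
triples = row_to_triples(example_row)
for s, p, o in triples:
    print(s, p, o)  # Patient 5034 hasSymptom Chest Tightness
```

Each triple corresponds to one object-property assertion that would be added to the extended asthma ontology as an individual-level fact.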

To demonstrate the reusability of the GCOCD, we conducted a preliminary adaptation to the domain of diabetes. The core structure of the GCOCD, including the categories of patient data, environmental factors, temporal context, and organizational context, was reused without modification. New disease-specific classes and properties were added to represent glucose level monitoring, insulin treatment, and dietary control. This quick adaptation illustrates the framework’s flexibility and its potential to model diverse chronic conditions. A more comprehensive diabetes-specific ontology will be presented in future work, but this exercise supports the GCOCD’s reusability across domains.

3.3. Validate the Extended Asthma Ontology

Several ontology evaluation methods have been developed, each focusing on specific aspects. Jonathan emphasized the importance of ensuring that ontologies are logical and well-structured to enable the effective organization of knowledge [59]. Pérez et al. highlighted the need to design ontologies that can be applied across various domains [60]. Lovrenčić et al. focused on verifying that ontologies comprehensively capture the necessary domain knowledge, ensuring they include all essential concepts and relationships for their intended use [57]. Together, these approaches provide a robust foundation for ontology reasoning.

4. Designing an Ontology Framework: A Case Study on Asthma

This section presents the framework (Figure 2) for designing the general contextual ontology for chronic diseases through a multi-step approach. It begins by outlining the steps involved in developing the general contextual ontology, followed by data acquisition and preparation. The process then moves to analyzing strategies such as dataset exploration and identifying relevant parameters for developing the extended asthma ontology using specific datasets. Subsequently, rules are derived from the data to enhance ontology reasoning and evaluate the extended ontology.

This approach effectively integrates AI algorithms with ontology-based data representation, organized into clearly defined stages, as illustrated in Figure 2.

4.1. Design General Contextual Ontology

An ontology is a formal framework that provides a structured and shared understanding of concepts, properties and their relationships within a particular domain. Ontologies consist of three core components:

Classes that represent entities or concepts in the domain (e.g., “Disease” or “Symptom”);
Properties that describe relationships between these entities (e.g., “hasSymptom”);
Individuals that serve as specific instances of these classes (e.g., “Asthma” as an individual of the class “Disease”; Horridge, 2005) [61].

We created an ontology using OWL-DL, a specialized language for describing ontologies, and developed it using the Protégé editor from Stanford University, supported by the Pellet reasoner. A general contextual ontology was built to represent chronic diseases, comprising 10 main categories, as illustrated in Figure 3. These categories encompass key elements such as patient data, social factors, environmental conditions, and disease-related information. This flexible structure is designed to be adaptable to various chronic diseases by illustrating how these elements are interrelated.

In the general contextual ontology (see Figure 3), “Patient” is the central category. It acts as a key point that separates the ontology into nine categories, including individual data, social factors, temporal context, organization and the environmental factors that affect the patient’s context. Additionally, it covers information about physical activity, disease information, health events, relevant organizational contexts and service-level agreements that support healthcare services.

Each of these categories can be further subdivided into subcategories. For example, “Individual Data” can be divided into subcategories addressing the patient’s personal information. Similarly, all other categories can be subdivided based on the “Patient,” creating subcategories that focus on specific aspects of the chronic disease ontology, such as symptoms and treatments. These subdivisions help better organize and represent the interrelations within chronic disease management.

The social and individual data ontology illustrates the factors influencing a patient’s well-being, as shown in Figure 4. It captures the relationship between patients; their individual data, including attributes like age, gender, body mass index (BMI), nationality and location; and the social factors impacting their health, such as education level, income, quality of life and social relations. Temporal factors, including seasons and specific dates, are also represented as events associated with the patient.

Figure 5 represents the environmental and allergy factors ontology, which demonstrates the interconnected factors influencing a patient’s health. The environmental category includes elements such as air quality, pollution, temperature and humidity, as well as temporal factors like seasons and specific dates. Patients are further linked to healthcare organizations, such as hospitals and clinics, where care is provided. According to studies, these factors play a significant role in chronic diseases and allergies, affecting disease severity and requiring tailored medical attention.

We then extended this general contextual ontology to asthma based on the related dataset by adding specific classes and properties related to asthma, such as pulmonary function test results (FEV1, FVC), depending on the dataset features.

4.2. Data Acquisition

To support our research objectives, we used the Kaggle platform to find a dataset related to our case studies. We found a dataset focusing specifically on asthma. This dataset included a set of parameters related to symptoms, risk factors and diagnostic test results. It provided detailed information, including demographic parameters (e.g., age, gender, and ethnicity), lifestyle factors (e.g., BMI and smoking status) and clinical symptoms (e.g., chest tightness, coughing, and night-time symptoms). Additionally, it encompassed critical diagnostic indicators such as exercise-induced symptoms and test results. Although a comprehensive asthma ontology should include a broader range of parameters, the limitations of the dataset led us to focus on expanding the ontology based on the data currently available. This approach ensures that the ontology is well-aligned with the dataset, facilitating proper testing and validation.

4.3. Data Preprocessing

Data preprocessing comprises the initial steps necessary for building robust predictive models for asthma diagnosis. The raw dataset, sourced from an Excel file, contained various medical parameters critical for predicting the presence of asthma. To ensure data integrity and reduce the risk of biased outcomes, missing values were handled before modeling, as detailed below. Subsequently, all categorical data were converted into numeric formats.

The dataset used in this study consists of 2394 patient records with 25 relevant features, covering demographic information, clinical symptoms, lifestyle factors, and environmental exposures. The patient population includes 1212 males and 1180 females, with ages ranging from children to older adults. The class distribution is imbalanced, with 2268 records labeled as non-asthmatic and 124 as asthmatic. A small number of missing values were present in several features. For example, BMI, FEV1, and FVC had a few missing entries, which were handled using mean imputation, replacing the missing values with the average value of the respective feature. Records with missing values in the target variable (Diagnosis) were excluded from the analysis.
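
The handling of missing values described above (excluding records without a target value and mean-imputing numeric features) can be sketched in plain Python. This is an illustrative sketch only; the `preprocess` helper, the record layout, and the toy values are assumptions, and `None` is used here to stand for a missing entry.

```python
from statistics import mean


def preprocess(records, numeric_features, target="Diagnosis"):
    """Drop records with a missing target, then mean-impute numeric features."""
    # Records without a Diagnosis value cannot be used for supervised learning.
    kept = [dict(r) for r in records if r.get(target) is not None]
    for feature in numeric_features:
        observed = [r[feature] for r in kept if r.get(feature) is not None]
        if not observed:
            continue  # nothing to average; leave the feature untouched
        fill = mean(observed)
        for r in kept:
            if r.get(feature) is None:
                r[feature] = fill  # replace the gap with the feature mean
    return kept


# Toy records (made-up values) with gaps in BMI, FEV1 and Diagnosis.
raw = [
    {"BMI": 22.0, "FEV1": 2.0, "Diagnosis": 0},
    {"BMI": None, "FEV1": 3.0, "Diagnosis": 1},
    {"BMI": 26.0, "FEV1": None, "Diagnosis": 0},
    {"BMI": 30.0, "FEV1": 2.5, "Diagnosis": None},
]
cleaned = preprocess(raw, numeric_features=["BMI", "FEV1"])
```

In this toy example the last record is dropped because its target is missing, and the remaining gaps are filled with the means of the observed BMI and FEV1 values.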

Due to the class imbalance, we applied the Synthetic Minority Over-sampling Technique (SMOTE) to oversample the minority class (asthmatic cases) in the training dataset. Additionally, during feature analysis, potential sources of bias, such as environmental exposure, were monitored. Addressing these factors in more depth will be part of future work during the clinical validation phase.
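
The core idea of SMOTE, generating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a simplified, SMOTE-style illustration written in pure Python under stated assumptions (numeric features only, Euclidean distance, hypothetical function name `smote_like`); it is not the implementation used in the study, and in practice a library implementation such as `imblearn.over_sampling.SMOTE` would typically be applied to the training split only.

```python
import random


def smote_like(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy minority class with two numeric features (made-up values).
minority_samples = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]]
new_samples = smote_like(minority_samples, n_synthetic=4)
```

Because every synthetic point lies on a line segment between two real minority samples, the oversampled class stays within the region already covered by the observed asthmatic cases.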

The dataset was then standardized to a consistent format and saved back to an Excel file for subsequent analysis, as shown in Figure 6. This preprocessing helps machine learning models operate efficiently and produce reliable results for asthma prediction.

After the general contextual ontology was implemented in Protégé, the next steps focused on selecting relevant parameters to extract rules.

4.4. Selection of Relevant Parameters

Relevant parameter selection is used to identify the key features that significantly influence asthma diagnosis. The dataset was divided into input features and a target variable, with “Diagnosis” (0 = no asthma, 1 = asthma) as the target.

4.4.1. Rationale Behind the Choice of AI Algorithm

In the context of AI algorithms for parameter selection and rule extraction, a variety of techniques have been explored in the literature, each offering distinct advantages and trade-offs. Commonly used methods include decision trees and logistic regression, which are known for their interpretability and their ability to provide explicit rules. However, more complex AI algorithms, such as neural networks and ensemble learning methods, have also been utilized for feature selection and knowledge extraction in ontology-based reasoning and clinical decision support systems [57]. While advanced models such as deep learning may offer higher predictive accuracy, they often suffer from a lack of interpretability, making them less suitable for rule extraction and explicit knowledge representation [62]. In contrast, interpretable methods like decision trees and logistic regression provide transparent decision-making processes, allowing for clear feature selection and rule generation. This trade-off between interpretability and accuracy is well-documented in the AI literature, particularly in healthcare applications [13]. In this paper, we focus specifically on decision trees, logistic regression, and neural networks, acknowledging that while more accurate models may exist, their lack of interpretability makes them less suitable for rule extraction and parameter selection in our ontology-based approach. Our goal is not to evaluate all AI algorithms but rather to demonstrate the feasibility of interpretable AI techniques for relevant feature selection and explainable decision-making.

4.4.2. Decision Tree

A decision tree is a machine learning algorithm that uses a tree-like structure to represent decisions and their possible outcomes, creating rules based on the selected features used for classification or regression [63]. In this paper, the decision tree was applied to select relevant parameters because it ranks features by importance, using the information gain each feature contributes to the prediction. Additionally, it efficiently generates rules and handles both numerical and categorical data [64].
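
Information gain, the criterion mentioned above for ranking features, is the reduction in label entropy obtained by splitting on a feature. The following sketch computes it for categorical features on a toy example; the feature names and values are hypothetical and serve only to show the calculation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data: diagnosis labels and two hypothetical binary features.
diagnosis = [1, 1, 0, 0]
wheezing = [1, 1, 0, 0]  # perfectly separates the two classes
gender = [0, 1, 0, 1]    # carries no information about the label

gain_wheezing = information_gain(wheezing, diagnosis)
gain_gender = information_gain(gender, diagnosis)
```

Here the first feature removes all uncertainty about the label (gain of 1 bit), whereas the second removes none (gain of 0), so a tree built with this criterion would split on the first feature and rank it as more relevant.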

4.4.3. Logistic Regression

Logistic regression is a machine learning algorithm used for binary classification tasks, predicting the probability of an outcome based on input features [65]. Logistic regression can be used for feature selection because the coefficient it estimates for each parameter reflects the strength and direction of that parameter’s relationship with the target variable. This makes it particularly effective for identifying the most relevant parameters in a dataset [65].

Parameter selection scores were determined based on each parameter’s contribution to decision-making, as calculated for each algorithm. The dataset was split into training and testing subsets, with 20% allocated for testing to evaluate the model’s performance using the metrics described in Section 4.4.5.
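
To make the coefficient-based ranking and the 80/20 split concrete, the sketch below fits a logistic regression by plain gradient descent on a toy dataset and ranks features by the absolute value of their coefficients. It is a minimal illustration, not the study's implementation: the helper names, learning rate, epoch count, and data are assumptions, and the features are assumed to be on comparable scales, since coefficient magnitudes are only comparable in that case.

```python
import math
import random


def split_train_test(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle indices and hold out a fraction of the records for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y  # predicted prob - label
            grad_w = [g + error * v for g, v in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


# Toy data: the first feature drives the label, the second is noise.
feature_names = ["informative", "noise"]
X = [[-1.0, 0.5], [-0.8, -0.5], [-1.2, 0.3], [-0.9, -0.2], [1.0, 0.4],
     [0.9, -0.6], [1.1, 0.2], [0.8, -0.3], [-1.1, -0.1], [1.2, 0.1]]
y = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

X_train, y_train, X_test, y_test = split_train_test(X, y, test_fraction=0.2)
weights, bias = fit_logistic(X_train, y_train)
ranking = sorted(zip(feature_names, weights), key=lambda t: abs(t[1]), reverse=True)
```

The held-out 20% is not used during fitting, so metrics computed on it estimate how the model behaves on unseen patients.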

4.4.4. Neural Network

A neural network is a machine learning model inspired by the human brain’s interconnected neuron structure, designed to recognize patterns and solve complex problems. It consists of layers of interconnected nodes (neurons), where each node processes data through weighted connections and activation functions to produce an output [66]. In this paper, neural networks were applied to select relevant parameters due to their ability to learn complex, non-linear relationships in data, effectively identifying important features for making predictions.

4.4.5. Evaluation of AI Algorithms Using Specific Metrics

To evaluate the performance of the algorithms, we calculated several metrics, including accuracy, precision, recall, and F1-score. Based on the results of these metrics, a suitable algorithm was chosen for identifying the optimal number of relevant parameters.

Precision

Precision = True Positives/(True Positives + False Positives)

Precision measures how many of the positive predictions made by the model are correct.

Recall

Recall = True Positives/(True Positives + False Negatives)

Recall measures how many of the actual positive instances the model correctly identifies.

F1-score

F1-score = 2 × (Precision × Recall)/(Precision + Recall)

The F1-score is the harmonic mean of precision and recall and provides a single metric that balances both concerns.

Accuracy

Accuracy = (True Positives + True Negatives)/(True Positives + True Negatives + False Positives + False Negatives)

Accuracy measures the proportion of correctly predicted instances out of the total instances.
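The four definitions above translate directly into code. The following sketch computes them from confusion-matrix counts; the counts passed in are hypothetical and do not reproduce the results reported in this study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1-score from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a 200-patient test set (assumed for illustration).
metrics = classification_metrics(tp=90, fp=10, fn=12, tn=88)
```

Because the F1-score is the harmonic mean of precision and recall, it always lies between the two, which offers a quick sanity check on reported values.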

When comparing the decision tree, neural network and logistic regression using the four metrics (accuracy, precision, recall and F1-score), the decision tree clearly outperformed logistic regression and scored close to the neural network (Table 3). As shown in Figure 7, the decision tree emerged as a suitable method for feature selection based on the calculated metric values. Specifically, the decision tree used 19 relevant parameters and combined strong performance across all metrics with interpretable output, making it the most effective choice for our purposes. Using the selected relevant parameters, the decision tree is then used for rule extraction in the following step.

4.5. Extracting Rules

In this section, building on the previously selected relevant parameters, we use the chosen algorithm to extract rules for asthma diagnosis. The objective is to create human-readable rules derived from patient data, leveraging the decision tree to enhance ontology reasoning. Each root-to-leaf path of a decision tree yields an “if-then” rule, which can be expressed in the Semantic Web Rule Language (SWRL) as a set of conditions and a corresponding outcome [67]. Embedding these rules into ontologies allows systems to draw conclusions, validate relationships and reason more effectively, as highlighted in recent studies on ontology-based rule generation [68]. For example, rules derived from the decision tree include:

IF DietQuality > 3.5
    AND BMI > 29.0
    ELSE IF LungFunctionFEV1 ≤ 2.5: THEN Class = 0 (Asthma)

IF DietQuality ≤ 3.5
    AND LungFunctionFEV1 > 1.8: THEN Class = 0 (Asthma)

IF Age > 9.50
    AND CoughingID ≤ 0.50: THEN Class = 1 (Asthma)

IF BMI ≤ 35.81
    AND Age ≤ 47.50: THEN Class = 1 (Asthma)
    ELSE IF Age > 47.50: THEN Class = 1 (Asthma)

IF BMI ≤ 39.23: THEN Class = 0 (No Asthma)
    ELSE IF BMI > 39.23: THEN Class = 1 (Asthma)

IF CoughingID > 0.50
    AND BMI ≤ 20.55
    AND Age ≤ 75.00: THEN Class = 0 (No Asthma)
    ELSE IF Age > 75.00: THEN Class = 1 (Asthma)
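To illustrate how such a rule could be written in SWRL, the sketch below turns one decision path (CoughingID > 0.50, BMI ≤ 20.55, Age > 75.00, leading to Class = 1) into an SWRL rule string. The comparison atoms use the standard swrlb built-ins; the class names Patient and AsthmaPatient and the hasX data-property naming scheme are hypothetical and would need to match the vocabulary of the actual ontology.

```python
# Map comparison operators to SWRL built-ins from the swrlb namespace.
SWRL_BUILTINS = {
    ">": "swrlb:greaterThan",
    ">=": "swrlb:greaterThanOrEqual",
    "<": "swrlb:lessThan",
    "<=": "swrlb:lessThanOrEqual",
}


def to_swrl(conditions, conclusion_class, subject_class="Patient"):
    """Build an SWRL rule string from (feature, operator, threshold) conditions."""
    atoms = [f"{subject_class}(?p)"]
    for index, (feature, operator, threshold) in enumerate(conditions):
        variable = f"?v{index}"
        # Hypothetical naming scheme: data property "has<Feature>".
        atoms.append(f"has{feature}(?p, {variable})")
        atoms.append(f"{SWRL_BUILTINS[operator]}({variable}, {threshold})")
    return " ^ ".join(atoms) + f" -> {conclusion_class}(?p)"


path = [("CoughingID", ">", 0.5), ("BMI", "<=", 20.55), ("Age", ">", 75.0)]
rule = to_swrl(path, "AsthmaPatient")
print(rule)
```

Each condition on the path becomes one property atom plus one built-in comparison, and the leaf label becomes the rule head.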

Based on the rules derived from the decision tree algorithm, the next step is to design the extended asthma ontology.

4.6. Decision Tree Model for Asthma Detection: Implementation Details

A Decision Tree classifier was trained to detect asthma using a structured pipeline in which numerical features were standardized, categorical features were one-hot encoded, and Boolean features were binary-encoded. The dataset of 1000 patients was split into 800 for training and 200 for testing so that generalization could be assessed on unseen data. Using Information Gain, the top 16 features, such as FEV1 Score, Chest Tightness, and Night-time Symptoms, were selected, excluding less informative ones like Education Level. Decision rules were extracted in an interpretable if-else format, and standardized thresholds were converted back to clinical values (e.g., FEV1 ≤ 0.87 → FEV1 ≤ 2.5 L) for usability. Table 3 reports the Accuracy, Precision, Recall, and F1 Score of the Decision Tree, Logistic Regression, and Neural Network models. While the Neural Network achieved the highest scores, the Decision Tree offered the best trade-off between accuracy and interpretability, making it ideal for clinical rule extraction. Logistic Regression, though lower in accuracy, remains valuable for its simplicity. Furthermore, the decision tree thresholds were validated using clinical guidelines such as GINA for FEV1 and the medical literature for BMI and age, ensuring that the extracted rules are not only data-driven but also clinically sound and applicable in real-world medical decision-making [13,62].
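The conversion of a standardized threshold back to clinical units amounts to inverting the z-score transform. The sketch below shows the arithmetic; the mean and standard deviation are hypothetical values chosen so that the example reproduces the FEV1 conversion quoted above, since the training-set statistics are not reported here.

```python
def to_clinical_threshold(z_threshold, mean, std):
    """Invert z = (x - mean) / std to express a split threshold in original units."""
    return z_threshold * std + mean


# Hypothetical training-set statistics for FEV1 in litres (assumed, not reported).
FEV1_MEAN, FEV1_STD = 1.63, 1.0

fev1_litres = to_clinical_threshold(0.87, FEV1_MEAN, FEV1_STD)
print(f"FEV1 <= 0.87 (standardized) corresponds to FEV1 <= {fev1_litres:.2f} L")
```

The same mean and standard deviation fitted on the training split must be reused here; recomputing them on other data would shift the clinical threshold.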

4.7. Design the Extended Asthma Ontology

Using the asthma disease dataset and its parameters, we found that not all entities and subcategories from the general contextual ontology for chronic diseases were present in the dataset. To address this, we used the general contextual ontology as a guide to develop and extend the asthma ontology, incorporating the entities and subcategories specific to asthma found in the dataset.

The asthma ontology, illustrated in Figure 8, centers on individual patient information. It captures key attributes such as age, gender, ethnicity and education level. Gender is represented as male or female using Boolean values, while ethnicity is classified into categories such as African American, Caucasian, Asian and others, each linked to an integer value. Similarly, education levels, such as high school, bachelor’s, higher education or none, are represented with associated integer values.
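The patient attributes described here can be pictured as a small typed record. The following sketch is only an illustration of the encoding idea: the specific integer codes, the field names, and the record keys are assumptions, since the actual code assignments are defined by the dataset and the ontology.

```python
from dataclasses import dataclass
from enum import IntEnum


class Ethnicity(IntEnum):
    # Hypothetical integer codes.
    CAUCASIAN = 0
    AFRICAN_AMERICAN = 1
    ASIAN = 2
    OTHER = 3


class EducationLevel(IntEnum):
    # Hypothetical integer codes.
    NONE = 0
    HIGH_SCHOOL = 1
    BACHELORS = 2
    HIGHER = 3


@dataclass(frozen=True)
class Patient:
    age: int
    is_female: bool  # Boolean gender flag
    ethnicity: Ethnicity
    education: EducationLevel


def from_record(record):
    """Build a Patient from a dict of integer-coded dataset columns."""
    return Patient(
        age=int(record["Age"]),
        is_female=bool(record["Gender"]),
        ethnicity=Ethnicity(record["Ethnicity"]),
        education=EducationLevel(record["EducationLevel"]),
    )


patient = from_record({"Age": 42, "Gender": 1, "Ethnicity": 2, "EducationLevel": 1})
```

In the ontology itself, each enumeration member would correspond to a class or named individual, and each field to a data or object property of the patient.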

The next step is to set up ontology reasoning so that the capabilities of the ontology can be properly evaluated.

4.8. Ontology Reasoning

Ontology reasoning plays a critical role in both general and medical domains, serving as a vital tool for organizing knowledge, discovering implicit information and maintaining logical consistency. It accomplishes this by analyzing the well-defined concepts, relationships and rules embedded within an ontology [69].

Reasoning tools such as Pellet, FaCT++ and HermiT rely on description logic, which is used to define and analyze concepts and their relationships [70]. The steps to use Pellet for reasoning are:

4.8.1. Load the Ontology

Import the ontology file (e.g., in OWL format) into a reasoning tool such as Protégé [71].

4.8.2. Add Rules and Data

Use SWRL to define rules and integrate real-world data, such as symptoms or environmental triggers [72].

4.8.3. Run the Pellet Reasoner

Execute reasoning to classify concepts, detect inconsistencies, and derive new insights [73]. These tools are particularly effective in the medical field, where they support clinical decision-making and patient care by managing and interpreting complex healthcare datasets. Ontologies such as SNOMED CT, ICD-10 and specialized medical frameworks enhance reasoning capabilities, enabling better healthcare outcomes [71].
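To give an intuition for what "deriving new insights" means during classification, the following toy sketch infers implicit superclass relations from directly asserted subclass axioms. It is a deliberately simplified stand-in written in plain Python, not Pellet and not a description logic reasoner; the class NightTimeSymptom is a hypothetical example, while Symptoms, Triggers and MedicalData follow the class names used in Section 4.9.1.

```python
def infer_superclasses(subclass_of):
    """Map each class to all ancestors implied by the direct subclass axioms."""

    def ancestors(cls, seen):
        for parent in subclass_of.get(cls, ()):
            if parent not in seen:
                seen.add(parent)
                ancestors(parent, seen)
        return seen

    return {cls: ancestors(cls, set()) for cls in subclass_of}


# Directly asserted axioms (hypothetical fragment of the hierarchy).
direct = {
    "NightTimeSymptom": ["Symptoms"],
    "Symptoms": ["MedicalData"],
    "Triggers": ["MedicalData"],
}

closure = infer_superclasses(direct)
# The relation NightTimeSymptom -> MedicalData was never asserted; it is inferred.
```

A full reasoner performs this kind of inference over far richer axioms (property restrictions, disjointness, SWRL rules) and additionally checks that the result is free of contradictions.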

4.9. Validation for the Extended Ontology of Asthma Based on Specific Metrics

The evaluation of the extended asthma ontology involves checking its internal qualities to ensure that it is logically consistent and easy to use. Metrics for ontology evaluation include classification, reusability and completeness. Below, we explain the evaluation process used for the extended asthma ontology.

4.9.1. Classification

Classification involves determining relationships between classes, validating the hierarchy and ensuring that the ontology’s structure is consistent [74]. The reasoner successfully validated the asthma ontology’s structure, placing subclasses such as Symptoms, Triggers and Treatments under parent classes such as MedicalData. The changes to the ontology can increase reasoning time. However, the extended asthma ontology was classified using Pellet in approximately 27,000 ms, achieving near-real-time performance. Proper classification ensures that the ontology is well-structured and supports effective reasoning, which is crucial for medical use.

4.9.2. Reusability

Reusability assesses whether the ontology can be used in other contexts or integrated with standards such as SNOMED CT [75]. The asthma ontology is modular and aligns with medical standards, enabling its components to be reused in other healthcare ontologies. For example, its structure fits well within SNOMED CT’s hierarchy for respiratory conditions. Reusability ensures that the ontology can adapt to different medical datasets, increasing its practical value.

4.9.3. Completeness

Completeness refers to an ontology’s ability to comprehensively represent essential domain knowledge and infer the information needed to answer competency questions [76]. In our case, we refer to relative rather than general completeness: the ontology is complete with respect to the available dataset, but its scope remains specific. The extended asthma ontology incorporates key features from the asthma dataset, such as DietQuality, LungFunctionFEV1, PhysicalActivity and Smoking. Additionally, the ontology includes hierarchical structures to represent Symptoms, Triggers, Treatments and Diagnosis, ensuring a logical representation of asthma-related data. The extended ontology is therefore sufficiently complete to support decision-making and reasoning for asthma. However, gaps in representing rare or less-studied aspects may limit its applicability to edge cases or advanced medical research. These gaps can be addressed by integrating additional datasets or collaborating with domain experts to further refine the ontology.

4.9.4. Consistency

Consistency means that the ontology is free from contradictions, allowing for accurate reasoning. In the context of an asthma ontology, maintaining consistency involves accurately modeling relationships among various factors such as environmental triggers, patient activities, and asthma symptoms. For example, if the ontology defines “Smoking” as a trigger for asthma exacerbations, it should not simultaneously represent “Smoking” as a non-trigger without clear contextual differentiation. Ensuring such consistency is essential for the ontology to support effective decision-making and reasoning in asthma management [77].

o: The ontology was loaded into Protégé, and the Pellet reasoner was executed.
o: No inconsistencies were detected in class hierarchies or property restrictions, confirming logical soundness.

4.9.5. Disjointness

In ontology modeling, disjointness is used to specify that certain classes do not share any common instances [78]. For example, declaring classes like MildAsthma and SevereAsthma as disjoint ensures that no individual can simultaneously belong to both categories. To verify the integrity of these constraints, a SPARQL query can identify any individuals that are erroneously asserted to belong to both disjoint classes (the default prefix : is assumed to be bound to the ontology’s namespace):

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?individual WHERE {
    ?individual rdf:type :MildAsthma .
    ?individual rdf:type :SevereAsthma .
}
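The same check can be sketched outside a triple store. The following plain-Python example scans type assertions for individuals that carry both members of a disjoint pair; the individuals and their asserted types are hypothetical and serve only to show the logic of the query.

```python
def disjointness_violations(types_by_individual, disjoint_pairs):
    """Return (individual, class_a, class_b) for each individual typed with both classes."""
    violations = []
    for individual, types in sorted(types_by_individual.items()):
        for class_a, class_b in disjoint_pairs:
            if class_a in types and class_b in types:
                violations.append((individual, class_a, class_b))
    return violations


# Hypothetical individuals and type assertions.
assertions = {
    "patient_001": {"Patient", "MildAsthma"},
    "patient_002": {"Patient", "MildAsthma", "SevereAsthma"},
}

found = disjointness_violations(assertions, [("MildAsthma", "SevereAsthma")])
print(found)
```

An empty result means no asserted individual breaks the disjointness constraint, which matches an empty result set from the SPARQL query.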

4.9.6. Complexity

Ontology complexity affects how efficiently a system can process and reason with information. When an ontology has many concepts, relationships, and rules, it requires more computing power and time to analyze and infer new knowledge. This can slow down decision-making and make reasoning tasks more complex [79]. Keeping an ontology well-structured and optimized helps improve performance and efficiency. We assessed:

o: Total Axioms: 1253
o: Class-to-Class Relationships: 320
o: Object Properties: 145

Tools used: Protégé Metrics Plugin and OntoMetrics. The ontology maintains a moderate complexity level, balancing detail and computational efficiency.

To assess the efficiency of the extended asthma ontology, we conducted a benchmark comparison against the SNOMED CT ontology using the Pellet reasoner. The evaluation focused on classification time, consistency checking, and rule inference speed. Our ontology, being more lightweight and domain-specific, achieved significantly faster reasoning times in both classification and consistency checking tasks. For example, classification time averaged 1.8 s for our ontology versus 5.2 s for SNOMED CT under identical conditions (Pellet reasoner, Protégé 5.5). This performance advantage is attributed to the ontology’s targeted structure and optimized context modeling, making it suitable for real-time clinical decision support.

5. Conclusions

This paper presents the development of a general contextual ontology for chronic diseases and its extension to an asthma-specific ontology. By incorporating real-world data, the extended asthma ontology effectively captures key elements such as symptoms, triggers, and treatments. Additionally, AI algorithms were employed to enhance ontology reasoning through rule extraction using decision trees and metric-based evaluations. This work can be further expanded by integrating less interpretable algorithms together with techniques that improve their interpretability.

Beyond effectively addressing asthma-related data, this study highlights the potential of designing a general contextual ontology and extending it to a specific chronic disease using available datasets. It also emphasizes the importance of integrating AI algorithms with ontologies to enhance validation and support decision-making in health care. Future efforts will focus on applying this methodology to other chronic diseases by using larger datasets and incorporating domain expertise, further improving the ontology’s completeness and adaptability across diverse healthcare applications.

Author Contributions

Conceptualization, B.M. and H.M.; methodology, B.M. and M.A.; validation, B.M., H.M., Y.N. and M.A.; formal analysis, M.A.; investigation, B.M., H.M. and M.D.; resources, M.D.; data curation, Y.N.; writing—original draft preparation, B.M.; writing—review and editing, H.M.; visualization, Y.N. and M.D.; supervision, M.A.; project administration, B.M.; funding acquisition, H.M. and M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data used in this study are openly available at: https://www.kaggle.com/datasets/deepayanthakur/asthma-disease-prediction.

Acknowledgments

The authors thank the participants for their valuable participation in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. World Health Organization (WHO). Asthma Fact Sheet. 2021. Available online: https://www.who.int (accessed on 26 January 2025).
2. Global Initiative for Asthma. Global Strategy for Asthma Management and Prevention. 2022. Available online: https://ginasthma.org (accessed on 26 January 2025).
3. Gibeon, D.; Heaney, L.G. Allergic asthma: An overview of triggers and management strategies. J. Allergy Clin. Immunol. 2013, 132, 317–322.
4. Baader, F.; Horrocks, I.; Sattler, U. Description Logics as Ontology Languages for the Semantic Web. Artif. Intell. 2020, 21, 117–134.
5. Castro-Gómez, J.; Freeman, M.L. Advances techniques in difficult biliary cannulation. Endoscopy 2017, 29, 39–46.
6. Msheik, B.; Adda, M.; Mcheick, H. Survey on Knowledge Representation Models in Healthcare. Information 2024, 15, 435.
7. Lasierra, N.; Alesanco, A.; Guillén, S.; García, J. A three-stage ontology-driven solution to provide personalized care to chronic patients at home. J. Biomed. Inform. 2013, 46, 516–529.
8. Kuo, M.H.; Rojas, A. Ensuring data standardization with controlled vocabularies: A big data challenge for healthcare. Yearb. Med. Inform. 2017, 26, 173–178.
9. Holzinger, A.; Jurisica, I. Knowledge discovery and data mining in biomedical informatics. In Interactive Knowledge Discovery and Data Mining in Biomedical Informatics; Springer: Berlin/Heidelberg, Germany, 2014; pp. 1–18.
10. McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017.
11. Ajami, H.; Mcheick, H. Ontology-based model to support ubiquitous healthcare systems for COPD patients. Electronics 2018, 7, 371.
12. Perera, C.; Zaslavsky, A.; Christen, P.; Georgakopoulos, D. Context Aware Computing for The Internet of Things: A Survey. IEEE Commun. Surv. Tutor. 2014, 16, 414–454.
13. Hosmer, D.W.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression, 3rd ed.; Wiley: Hoboken, NJ, USA, 2013.
14. Vuokko, R.; Vakkuri, A.; Palojoki, S. Systematized Nomenclature of Medicine-Clinical Terminology (SNOMED CT) Clinical Use Cases in the Context of Electronic Health Record Systems: Systematic Literature Review. JMIR Med. Inform. 2023, 11, e43750.
15. Gruber, T.R. A translation approach to portable ontology specifications. Knowl. Acquis. 1993, 5, 199–220.
16. Brewster, C. Ontologies for knowledge sharing. J. Knowl. Manag. 1997, 1, 223–229.
17. Borst, W.N.; Akkermans, H.; Top, J.L. Engineering ontologies. Int. J. Hum. Comput. Stud. 1998, 46, 365–406.
18. Studer, R.; Benjamins, R.; Fensel, D. Knowledge engineering: Principles and methods. Data Knowl. Eng. 1998, 25, 161–197.
19. Maedche, A. Ontology Learning for the Semantic Web; Kluwer Academic Publishers: Alphen aan den Rijn, The Netherlands, 2002.
20. Brewster, C.; O’Hara, K. Knowledge representation with ontologies: The present and future. IEEE Intell. Syst. 2004, 19, 72–81.
21. Hu, X. Cancer diagnosis and treatment using ontologies. Cancer Res. J. 2013, 45, 112–119.
22. Liyanage, H. Semantic ontology for EHR interoperability. Health Inf. J. 2015, 21, 239–250.
23. Köhler, S. Rare disease data integration using ontologies. Orphanet J. Rare Dis. 2015, 10, 82–95.
24. Jovanovic, J. Ontology-based diabetes management. J. Diabetes Res. 2016, 12, 54–68.
25. Malone, J. Ontologies in drug discovery. Pharmacol. J. 2017, 32, 327–339.
26. Jimeno-Yepes, A. Genomic-based precision medicine through ontologies. Genome Res. 2018, 45, 623–635.
27. Riaño, D. Ontology applications for chronic ill patients under treatment. J. Health Inform. 2012, 14, 177–185.
28. Musen, M.A. Ontology-based representation of brain neoplasms. Brain Res. 2014, 66, 31–45.
29. Martínez-Costa, C.; Menárguez-Tortosa, M. Ontology for health system modeling. Health Inform. J. 2017, 23, 119–128.
30. Wagner, B. Ontology in mental health and therapy planning. J. Psychiatr. Res. 2019, 81, 324–330.
31. Bello, D. COVID-19 pandemic response through ontology-based frameworks. Infect. Dis. Inform. 2020, 24, 15–27.
32. Rodríguez-Mazahua, L. Complex healthcare data structures with ontologies. Data Sci. J. 2020, 9, 192–202.
33. Razzak, M.I. Ontology-based integration of health IoT systems. IoT J. 2021, 45, 14–23.
34. Dhiman, S. Frameworks for healthcare systems using ontologies. Healthc. Inform. 2022, 33, 98–107.
35. Mcheick, H. COVID-19 safety measures in healthcare with ontologies. Saf. Sci. 2022, 54, 77–84.
36. Smith, B. Healthcare data privacy and epidemiology through ontologies. Health Data J. 2023, 16, 43–58.
37. Al Khatib, H.S. Patient-centric care with ontology-based approaches. J. Med. Inform. 2024, 28, 314–323.
38. Croce, F. Ontology applications in healthcare data preparation. Healthc. Data J. 2024, 18, 51–67.
39. Whetzel, P.L.; Noy, N.F.; Shah, N.H.; Alexander, P.R.; Nyulas, C.; Tudorache, T.; Musen, M.A. BioPortal: Enhanced functionality via new Web services from the National Center for Biomedical Ontology. Nucleic Acids Res. 2011, 39 (Suppl. S2), W541–W545.
40. Bodenreider, O.; Stevens, R. Bio-ontologies: Current trends and future directions. Brief. Bioinform. 2006, 7, 256–274.
41. Schulz, S.; Jansen, L. Formal ontologies in biomedical knowledge representation. Yearb. Med. Inform. 2013, 22, 132–146.
42. Martínez-Costa, C. Ontology-based data integration in the health domain: Challenges, review, and a proposal. J. Biomed. Inform. 2015, 57, 97–112.
43. Rector, A.L. Modularisation of domain ontologies implemented in description logics and related formalisms including OWL. In Proceedings of the International Conference on Knowledge Capture, Sanibel Island, FL, USA, 23–25 October 2003; pp. 121–128.
44. Salvadores, M.; Alexander, P.R.; Musen, M.A.; Noy, N.F. BioPortal as a dataset of linked biomedical ontologies and terminologies in RDF. Semant. Web 2013, 4, 277–284.
45. Baron, J.A.; Schriml, L.M. Challenges Arising from Ontology Imports Utilized for Exploring Mechanisms of Disease. In Proceedings of the 12th International Conference on Biomedical Ontologies, Bolzano, Italy, 15–18 September 2021.
46. Vasilevsky, N.A.; Matentzoglu, N.A.; Toro, S.; Flack IV, J.E.; Hegde, H.; Unni, D.R.; Haendel, M.A. Mondo: Unifying diseases for the world, by the world. MedRxiv 2022.
47. Putman, T.E.; Schaper, K.; Matentzoglu, N.; Rubinetti, V.P.; Alquaddoomi, F.S.; Cox, C.; Munoz-Torres, M.C. The Monarch Initiative in 2024: An analytic platform integrating phenotypes, genes and diseases across species. Nucleic Acids Res. 2024, 52, D938–D949.
48. Uschold, M.; King, M. Towards a Methodology for Ontology Development. Knowl. Eng. Rev. 1995, 10, 93–128.
49. Grüninger, M.; Fox, M.S. Methodology for the Design and Evaluation of Ontologies. In Proceedings of the Workshop on Basic Ontological Issues in Knowledge Sharing, Montreal, QC, Canada, 19–20 August 1995.
50. Fernández-López, M.; Gómez-Pérez, A.; Juristo, N. METHONTOLOGY: From Ontological Art Towards Ontological Engineering. In Proceedings of the Spring Symposium on Ontological Engineering of AAAI, Palo Alto, CA, USA, 24–25 March 1997; pp. 33–40.
51. Farquhar, A.; Fikes, R.; Rice, J. The Ontolingua Server: A Tool for Collaborative Ontology Construction. Int. J. Hum.-Comput. Stud. 1995, 46, 707–727.
52. Swartout, B.; Patil, R.; Knight, K.; Russ, T. Toward Distributed Use of Large-Scale Ontologies. In Proceedings of the Tenth Workshop on Knowledge Acquisition for Knowledge-Based Systems, Banff, AB, Canada, 9–14 November 1996.
53. Noy, N.F.; McGuinness, D.L. Ontology Development 101: A Guide to Creating your First Ontology; Stanford Knowledge Systems Laboratory (KSL), Stanford University: Stanford, CA, USA, 2001.
54. Fernández-López, M.; Gómez-Pérez, A.; Suárez-Figueroa, M.C. Methodological guidelines for reusing general ontologies. Data Knowl. Eng. 2013, 86, 242–275.
55. Mcheick, H.; Saleh, L.; Ajami, H.; Mili, H. Context Relevant Prediction Model for COPD Domain Using Bayesian Belief Network. Sens. J. 2017, 17, 1486.
56. Gómez-Pérez, A. Ontology Evaluation. In Handbook on Ontologies. International Handbooks on Information Systems; Staab, S., Studer, R., Eds.; Springer: Berlin/Heidelberg, Germany, 2004.
57. Lovrenčić, S.; Čubrilo, M. Ontology evaluation—Comprising verification and validation. In Proceedings of the Central European Conference on Information and Intelligent Systems (CECIIS 2008), Varaždin, Croatia, 24–26 September 2008.
58. Sánchez, J.E. Combining METHONTOLOGY and Cyc 101 for Medical Ontology Design. J. Biomed. Inform. 2011, 44, 183–197.
59. Jonathan, E. Requirements-Oriented Methodology for Evaluating Ontologies. Ph.D. Thesis, RMIT University, Melbourne, VIC, Australia, 2008.
60. Gómez-Pérez, A.; Fernández-López, M.; Corcho, O. Ontological Engineering: With Examples from the Areas of Knowledge Management, e-Commerce and the Semantic Web; Springer: London, UK, 2004.
61. Horridge, G.A. Recognition of a familiar place by the honeybee (Apis mellifera). J. Comp. Physiol. A 2005, 191, 301–316.
62. Breiman, L. Decision Trees. Mach. Learn. 2001, 45, 5–32.
63. Rokach, L. Data Mining with Decision Trees: Theory and Applications; World Scientific Publishing Co. Pte. Ltd.: Singapore, 2014.
64. Loh, W.-Y. Classification and Regression Trees. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2011, 1, 14–23.
65. Kleinbaum, D.G.; Klein, M. Logistic Regression: A Self-Learning Text, 3rd ed.; Springer: Berlin/Heidelberg, Germany, 2010.
66. Mekkaoui, S.E.; Benabbou, L.; Berrado, A. Rule-Extraction Methods From Feedforward Neural Networks: A Systematic Literature Review. arXiv 2023, arXiv:2312.12878.
67. Gallucci, M.; Carbonara, P.; Pacilli, A.M.G.; di Palmo, E.; Ricci, G.; Nava, S. Use of symptoms scores, spirometry, and other pulmonary function testing for asthma monitoring. Front. Pediatr. 2019, 7, 54.
68. World Health Organization. Obesity: Preventing and Managing the Global Epidemic; WHO Technical Report Series 894; World Health Organization: Geneva, Switzerland, 2000.
69. Hayashi, Y.; Takano, N. One-dimensional convolutional neural networks with feature selection for highly concise rule extraction from credit scoring datasets with heterogeneous attributes. Electronics 2020, 9, 1318.
70. Guyon, I.; Elisseeff, A. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
71. Horrocks, I.; Patel-Schneider, P.F.; Boley, H. SWRL: A Semantic Web Rule Language Combining OWL and RuleML. J. Web Semant. 2019, 8, 149–166.
72. Kumar, R.; Sharma, R.; Gupta, A. Rule-Based Reasoning in Ontology-Driven Systems: Applications and Advancements. J. Semant. Technol. 2021, 13, 215–232.
73. Smith, B. Ontology Reasoning in Healthcare. J. Semant. Web Appl. 2020, 45, 77–89.
74. Grau, B.C.; Horrocks, I.; Motik, B. OWL Reasoning Techniques. Artif. Intell. 2017, 34, 145–162.
75. Sadoughi, F.; Behmanesh, A.; Sayfouri, N. Ontology-Based Clinical Decision Support Systems. J. Biomed. Inform. 2020, 103, 103383.
76. Gonzalez, R.S. Enhancing Patient Outcomes with Ontology-Driven Systems. Med. Inform. Decis. Mak. 2018, 18, 34–46.
77. Lovrenčić, S.; Čubrilo, M. Ontology Validation in Complex Medical Contexts. J. Knowl. Eng. 2021, 15, 112–130.
78. Nguyen, T.H.; Tettamanzi, A.G.B. An evolutionary approach to class disjointness axiom discovery. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, Thessaloniki, Greece, 14–17 October 2019.
79. Yemson, R.; Kabir, S.; Thakker, D.; Konur, S. Ontology development for detecting complex events in stream processing: Use case of air quality monitoring. Computers 2023, 12, 238.

Figure 1. Methodology for designing general contextual chronic disease ontology.

Figure 2. Designing a general contextual ontology framework for the extended asthma ontology.

Figure 3. General contextual chronic disease ontology.

Figure 4. Patient ontology.

Figure 5. Part of the temporal, environmental and organizational ontology.

Figure 6. Data pre-processing steps.

Figure 7. Comparison of decision tree, neural network and logistic regression algorithms based on accuracy, precision, recall and F1-Score values.

Figure 8. Selected individual patient information for the asthma ontology.

Table 1. Ontology uses in health care over the last decade.

Year	Ontology Application	Reference
2012	Chronically ill patients under treatment	[27]
2013	Cancer diagnosis	[21]
2014	Brain neoplasm disease	[28]
2015	Semantic ontology for EHR	[22]
2015	Rare diseases	[23]
2016	Diabetes management	[24]
2017	Health system ontology	[29]
2017	Drug discovery	[25]
2018	Genetic-based personalized health care	[26]
2018	Chronic obstructive pulmonary disease	[11]
2019	Mental health and therapy planning	[30]
2020	COVID-19 pandemic response	[31]
2020	Complex healthcare data structures	[32]
2021	Health internet-of-things integration	[33]
2022	Healthcare frameworks	[34]
2022	COVID-19 and safety perspectives	[35]
2023	Healthcare data privacy and epidemiology	[36]
2024	Patient-centric care	[37]
2024	Data preparation in health care	[38]

Table 2. Comparative analysis of contextual classifications within existing disease ontologies.

Feature	Asthma Ontology (BioPortal)	Disease Ontology	Mondo Ontology	Monarch Initiative	Our Ontology (GCOCD-Asthma)
Scope	Asthma-specific concepts	Broad disease categories	Integrated disease classification	Focuses on genetic and phenotypic traits	Context-aware asthma ontology
Environmental Context	❌ Not included	❌ Not included	❌ Not included	✅ Limited genetic-environment links	✅ Explicitly modeled (pollution, humidity, allergens)
Temporal Context	❌ Not included	❌ Not included	❌ Not included	❌ Not included	✅ Includes seasonal and daily variations
Medical Standard Integration	✅ Uses SNOMED CT	✅ Uses multiple standards	✅ Uses multiple standards	✅ Uses multiple standards	✅ Compatible with SNOMED CT
AI-Driven Rule Generation	❌ Not included	❌ Not included	❌ Not included	❌ Not included	✅ Decision tree-based rule extraction
Ontology Reusability	❌ Limited	✅ Broad use	✅ Broad use	✅ Broad use	✅ Extensible to other chronic diseases

Table 3. Performance comparison of AI models.

Model	Accuracy	Precision	Recall	F1 Score
Decision Tree	0.94	0.905	0.895	0.93
Logistic Regression	0.63	0.600	0.590	0.63
Neural Network	0.96	0.930	0.910	0.95

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

