3.1. A Conceptual Framework for Model-Driven Explainable Artificial Intelligence
In this section, we briefly explain the concept of model-driven XAI (MD-XAI, or model-based XAI), in contrast to shallow XAI approaches such as LIME or SHAP. The concept of causality plays the principal role in MD-XAI, together with causal reasoning, causal models, causal graphs, etc. [23]. Causality is a core mechanism for explaining the behavior of systems. It can be used both forwards, by a kind of deductive reasoning or forward chaining, to predict the results of given inputs, and backwards, by a kind of abductive reasoning or backward chaining, to find potential causes of observed system outputs. Both forward and backward reasoning can be multistep procedures based on a graphical framework constituted by a multi-level graph. In [37], we presented a probabilistic causal model for biomedical data; it consists of a DAG that clearly presents the causal dependencies among variables. Here, we consider a similar framework, but a strictly deterministic one.
Here, we put forward the idea of the “3C principle”. In order to build an explainable model, we propose that the following three concepts be used jointly:
Causality—identification of causal dependencies among variables;
Components—identification of system components; and
Connections—identification of the connections of inputs, components, and outputs.
This picture originates from the ideas of systems science and automatic control. A good example of such knowledge representation is the multiplier–adder circuit introduced in Reiter’s famous paper on model-based diagnosis [25]. Other examples used in ML/AI are Bayesian Networks (BNs) [23,37].
In order to provide clear intuitions for further discussion, let us briefly refer to the example network discussed in Section 2.1. It is the famous multiplier–adder system presented in detail in the seminal paper on model-based diagnosis [25]. In order to focus on explanatory reasoning, we follow the approach of Ligęza [30]. This example clearly explains the intuitions behind the following concepts:
System inputs (A, B, C, D, and E);
System outputs (F and G);
Meaningful intermediate variables (X, Y, and Z);
System components (three multipliers and two adders);
Causality—the input signals and the defined operation of the components, which together cause the actual values of the intermediate and output variables.
For MD-XAI, we need an extended input; apart from a data table, we also need to incorporate some knowledge, e.g., the components, connections, intermediate values, etc.
Definition 1. Let A = {A1, ..., An} be a set of attributes denoting the input variables, and let D be the output or decision variable. A table (T) of data specifications for a machine learning problem takes the form presented in Table 5. The basic task of ML is to induce a decision tool or procedure capable of predicting the value of the decision column on the basis of the data columns.
The proposed approach consists of the following steps:
Selecting existing values or injecting new meaningful intermediate values;
Determining the functions defining their values;
Incorporating composition boxes;
Defining connections among composition boxes;
Evaluating the parameters (such as simplicity, error rate, etc.);
Identifying the exceptions not covered by the model;
Final simplification or modification/tuning of the structure and parameters.
The basic concept to be introduced is that of a Meaningful Intermediate Variable (MIV), which is defined as follows:
Definition 2. A meaningful intermediate variable (MIV) is a (new) variable satisfying the following conditions:
It has clear semantics; knowledge of its value is of some importance to the domain experts, and it can be understood by the data analysts;
Its values are defined with a relatively simple function (for example, from a set of predefined, admissible base functions);
The input for the function is defined by the data table or other previously calculated MIVs.
A typical MIV defines a new, meaningful parameter that can be easily interpreted by the users and that moves the discussion to an upper, abstract level. The function behind an MIV typically has 2–3 inputs. Therefore, the role of introducing MIVs is threefold:
Reduction of the problem dimension by combining several inputs (from the initial dataset) into a single, newly created meta-level parameter;
Catching the causal and functional dependencies existing in the dataset;
Moving the explanatory analysis to a higher, abstract level in order to simplify it.
Typical examples of MIVs include introducing or calculating the area of a specific surface, a volume, a density, the ratio of two coexisting values, etc. In the introductory example, it was the d variable, introduced in the 4th column of Table 3.
A perfect example from the medical domain is the Body Mass Index (BMI), which is defined as follows:
BMI = W / H²,
where W is the weight of a person (in kilograms) and H is their height (in meters). The BMI parameter is an MIV, while W and H are input data values. An experiment concerning the explanation of BMI with LIME and SHAP versus grammatical evolution was reported in [38].
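As a minimal illustration of injecting such an MIV into a data table, the BMI column can be derived directly from the two input columns. The sketch below is written in R; the data frame patients and its values are purely hypothetical:
# Hypothetical input data: Weight in kilograms, Height in meters
patients <- data.frame(Weight = c(70, 95, 54), Height = c(1.75, 1.80, 1.62))
# BMI injected as a meaningful intermediate variable (MIV)
patients$BMI <- patients$Weight / patients$Height^2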
Note that the definition of MIVs constitutes an extra knowledge component and is not included in the typical dataset from ML repositories. The invention and injection of MIVs may correspond to a tradeoff between random selection from a base of available MIVs and expert advice based on strong domain knowledge.
Also note that MIVs can be organized into several levels, and the outputs of some of them may serve as inputs for others.
3.2. Explanation Examples: IRIS
In this section, we consider several white-box, model-driven XAI models for the widely explored IRIS dataset. The IRIS dataset consists of 150 instances, where each instance represents an iris flower sample. Each sample has four features/attributes, namely sepal length, sepal width, petal length, and petal width. Based on these features, each sample in the dataset is labeled with one of three classes, representing different species of iris flowers, namely Setosa, Versicolor, or Virginica.
The dataset is balanced, with 50 samples for each of the three species. Moreover, it demonstrates good separability among the different species of iris flowers. The simplicity and small size of the dataset make it a popular choice for algorithm testing in the field of ML.
Initially, we applied the GE approach to four features from the dataset (minimizing the number of misclassified labels, allowing 1000 iterations, and with default parameters), utilizing the following grammar:
library(gramEvol)
# Grammar for evolving rule-based classifiers over the IRIS features
ruleDef <- list(
  expr = grule((expr) & (sub.expr),
               (expr) | (sub.expr),
               sub.expr),
  sub.expr = grule(comparison(var, func.var)),
  comparison = grule('>', '<', '==', '>=', '<='),
  func.var = grule(num, var, func(var)),
  func = grule(mean, max, min, sd),
  var = grule(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width),
  num = grule(1, 1.5, 2, 2.5, 3, 4, 5)
)
grammarDef <- CreateGrammar(ruleDef)
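Under the assumption that the GrammaticalEvolution routine of the gramEvol package was used with the settings stated above (1000 iterations, default parameters, and a fitness minimizing the number of misclassified labels), a run for the Setosa-versus-other task might look roughly as follows; the fitness function evalSetosa is an illustrative sketch, not the authors' exact code:
# Fitness: number of misclassified samples for the Setosa-vs-other task
evalSetosa <- function(expr) {
  pred <- ifelse(eval(expr, envir = iris), "setosa", "other")
  actual <- ifelse(iris$Species == "setosa", "setosa", "other")
  sum(pred != actual)
}
ge <- GrammaticalEvolution(grammarDef, evalSetosa, iterations = 1000)
ge$best$expressions   # e.g., Petal.Length <= Sepal.Width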
Below, we present illustrative examples that demonstrate the separability of the dataset, including the resultant formulas and plots for the Setosa (Figure 6) and Versicolor (Figure 7) classes:
ifelse(Petal.Length <= Sepal.Width, "setosa", "other")
ifelse((Petal.Width <= 1.5) & (Petal.Length >= 2.5), "versicolor", "other")
While the classification accuracy for the Setosa species was perfect, the accuracy for the other classes suggests potential for improvement (e.g., 94.66% for Versicolor).
To estimate confidence intervals (CI) for the accuracy of rule-based classification models, which traditionally lack direct uncertainty measures due to their deterministic nature, we applied a bootstrap resampling technique. Utilizing the boot package in R, we performed 1000 resamplings of the IRIS dataset. Each sample was evaluated using a predefined model that classifies instances as a given species or other based on fixed thresholds of the input features. This process of recalculating the model accuracy for each bootstrap sample generated a distribution of accuracies. From this distribution, percentile-based confidence intervals were derived, quantifying the variability and reliability of the model performance across different samples. The estimated CI (with a 95% confidence level) for Versicolor is (0.9067, 0.9800).
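A minimal sketch of this bootstrap procedure, assuming the standard boot/boot.ci interface of the boot package and using the fixed Versicolor rule given above as the predefined model (the helper acc_versicolor is illustrative):
library(boot)
# Accuracy of the fixed Versicolor rule on one bootstrap resample
acc_versicolor <- function(data, idx) {
  d <- data[idx, ]
  pred <- ifelse((d$Petal.Width <= 1.5) & (d$Petal.Length >= 2.5),
                 "versicolor", "other")
  actual <- ifelse(d$Species == "versicolor", "versicolor", "other")
  mean(pred == actual)
}
b <- boot(iris, acc_versicolor, R = 1000)
boot.ci(b, conf = 0.95, type = "perc")   # percentile-based 95% CI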
To improve the classification, we introduced MIVs.
In order to maintain simplicity, the models are presented in sequence, all according to the same simple scheme:
Causal model—data flow;
Input, output, and MIVs;
Functional model;
Accuracy;
Simplicity.
The first model is very simple; only one MIV is introduced, a value intended to be proportional to the petal area. The inputs are Petal.Length and Petal.Width.
We allowed for MIVs in the grammar definition. For each species, we enabled 1000 iterations of the evolutionary process. The complete functional model introduces a simple comparison of the MIV value defined by Equation (1) with the threshold values presented below.
ifelse(PLmPW <= 2, "setosa", "other")
ifelse((PLmPW <= 8) & (PLmPW >= 2), "versicolor", "other")
ifelse(PLmPW >= 8, "virginica", "other")
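Assuming that the MIV of Equation (1), PLmPW, is the product Petal.Length × Petal.Width (consistent with its name and with the “rough petal area” interpretation below), the three rules can be combined into a short R sketch of the functional model:
# PLmPW: assumed to be Petal.Length * Petal.Width (Equation (1))
PLmPW <- iris$Petal.Length * iris$Petal.Width
# Combined three-class rule (ties at the thresholds are resolved arbitrarily)
pred <- ifelse(PLmPW <= 2, "setosa",
        ifelse(PLmPW <= 8, "versicolor", "virginica"))
mean(pred == iris$Species)   # the text reports an accuracy of about 0.95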
For intuition, in human-understandable terms, PLmPW can be interpreted as the “rough area” of a petal. The rules can be put together to create the functional model shown in Figure 8.
The results of the classification process performed with the provided functional model are depicted in Figure 9.
The classification model, although not perfect, achieves a relatively high accuracy of approximately 95%. The model exhibits a high degree of simplicity, utilizing merely two input variables and a single MIV, with the number of outliers limited to only seven. The model is interpretable and easy to understand for humans. The 95% CI for the model accuracy is (0.9200, 0.9867).
Functional models can incorporate more MIVs.
Now, let us introduce another MIV, one that estimates the sepal area. The inputs are Sepal.Length and Sepal.Width.
MIVs in the model can be derived not only from input parameters but also from other MIVs. An example of such an MIV is the petal-to-sepal area ratio based on MIVs (1) and (2). Utilizing the GE approach, the following rules were generated based on the MIV value defined by Equation (3):
ifelse(PdS <= 0.2, "setosa", "other")
ifelse((PdS <= 0.43) & (PdS >= 0.22), "versicolor", "other")
ifelse(PdS >= 0.43, "virginica", "other")
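Assuming that PdS (Equation (3)) denotes the ratio of the two area MIVs, i.e., (Petal.Length × Petal.Width) / (Sepal.Length × Sepal.Width), a sketch of the combined functional model in R is as follows:
# PdS: assumed petal-to-sepal "rough area" ratio (Equations (1)-(3))
PdS <- (iris$Petal.Length * iris$Petal.Width) /
       (iris$Sepal.Length * iris$Sepal.Width)
# Combined three-class rule (the small gap between 0.20 and 0.22 is resolved arbitrarily)
pred <- ifelse(PdS <= 0.2, "setosa",
        ifelse(PdS <= 0.43, "versicolor", "virginica"))
mean(pred == iris$Species)   # the text reports an accuracy of about 0.96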
The rules come together to form the functional model shown in Figure 10.
The outcomes of the classification executed using the given functional model are shown in Figure 11.
The model shows slight improvement compared to the previous one. The number of outliers is reduced to six (accuracy improved to 96%). The 95% CI for the model accuracy is (0.9267, 0.9867). The model is more complex than its predecessor yet remains interpretable to humans.
Furthermore, in order to statistically compare the functional models shown in Figure 8 (Model 1) and Figure 10 (Model 2), we performed an analysis of variance (ANOVA) on 1000 bootstrap samples for both models, specifically examining their respective accuracies. The complete results of the ANOVA are presented in Table 6. The low p-value indicates that there is a statistically significant difference in accuracy between the two tested models, and the large F value (96.75) suggests a strong model effect on accuracy.
As ANOVA indicated significant differences between the models, we performed Tukey’s Honestly Significant Difference (HSD) test. The adjusted p-value (effectively zero) indicates that the difference in mean accuracy between Model 1 and Model 2 is statistically significant. Model 2 has a higher mean accuracy than Model 1 by approximately 0.745 percentage points, with a 95% CI of (0.0060, 0.0089).
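A minimal sketch of this comparison, assuming the bootstrap accuracies of the two models are stored in numeric vectors acc1 and acc2 (each of length 1000; these names are illustrative):
# One-way ANOVA and Tukey HSD on the bootstrap accuracies of Models 1 and 2
acc_df <- data.frame(
  accuracy = c(acc1, acc2),
  model    = factor(rep(c("Model1", "Model2"), each = 1000))
)
fit <- aov(accuracy ~ model, data = acc_df)
summary(fit)    # F statistic and p-value, cf. Table 6
TukeyHSD(fit)   # difference in mean accuracy with an adjusted p-value and 95% CI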