#### 3.3.1. Model Selection

To predict each of the three class attributes, a Multi-Layer Perceptron ANN was used, as provided by the open-source machine learning tool Weka (Frank et al. 2016). An ANN is a classifier consisting of a large number of interconnected neurons, where each connection is assigned a weight. These weights are adjusted through back-propagation, the training technique by which the Multi-Layer Perceptron learns to map the given predictive attributes to the class attributes. Data first enter the input layer and are then passed on to the hidden layer. There is no fixed rule for how many sub-layers the hidden layer should contain; the most common way to configure this is by trial and error (Méndez-Suárez et al. 2019). In the hidden layer, mathematical operations are performed to produce the outputs.
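As a minimal illustration of the operations performed inside such a layer (the weights and inputs below are purely hypothetical, not taken from the trained model), each node computes a weighted sum of its inputs plus a bias and passes the result through a sigmoid activation:

```python
import math

def sigmoid(x):
    # Logistic activation: squashes any real value into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    # Each node sums its weighted inputs plus a bias,
    # then applies the sigmoid activation
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

# Toy layer with 2 inputs and 2 nodes (illustrative values only)
hidden = layer_forward([0.5, -1.0],
                       weights=[[0.1, 0.4], [-0.3, 0.2]],
                       biases=[0.0, 0.1])
```

Stacking several such calls, with one weight matrix per layer, yields the full forward pass of the Multi-Layer Perceptron.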

An ANN model was selected for the experiment, since the data source is based on business and real-time data, whose attributes have complex relationships among them. ANNs are considered among the most suitable classification models when the observed data are highly complex, which is the case in this research effort.

#### 3.3.2. Experimental Parameters

The adopted ANN model is fine-tuned with specific parameters for each class-attribute application. More specifically, the input layer consists of 15 nodes, equal to the number of predictive attributes. Three hidden layers were used: (1) the first with 15 nodes; (2) the second with 30 nodes; and (3) the third with 15 nodes. The output layer for each class consists of two nodes, matching the number of binary values (i.e., {0,1}). All nodes in the ANN use the sigmoid activation function. The number of epochs used to train the ANN is set to 500, the learning rate for the weight updates to 0.3, and the momentum applied to the weight updates to 0.2. The number of nodes in the three hidden layers, the number of epochs, the learning rate, and the momentum all converged to these values by experimenting over the intervals of values provided by the Weka machine learning software and observing which settings produced optimal results. This is because machine learning is an empirical science: it is not known in advance which values should be used when training a neural network classifier, so several parameter values are tried in order to converge to the model that achieves optimal efficiency. The number of hidden layers and their nodes, as well as the rest of the model parameters, were determined experimentally and are presented in Table 2.
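The role of the learning rate and momentum in back-propagation can be sketched as follows. This is a generic momentum-based update rule, not a reproduction of Weka's internals; the weight and gradient values are purely illustrative:

```python
def momentum_update(weight, grad, prev_delta, lr=0.3, momentum=0.2):
    # Back-propagation step with momentum: the new weight change
    # combines the gradient scaled by the learning rate with a
    # fraction (the momentum) of the previous change
    delta = -lr * grad + momentum * prev_delta
    return weight + delta, delta

# Apply three consecutive updates to a single weight
w, d = 0.5, 0.0
for grad in [0.8, 0.6, 0.4]:  # illustrative gradient values
    w, d = momentum_update(w, grad, d)
```

The momentum term smooths successive updates, which helps the training escape small local oscillations; the learning rate of 0.3 and momentum of 0.2 used here match the values reported above.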

**Table 2.** Experimental Neural Network parameters.

| Parameter | Value |
| --- | --- |
| Input layer nodes | 15 |
| Hidden layers (nodes) | 3 (15, 30, 15) |
| Output layer nodes | 2 |
| Activation function | Sigmoid |
| Training epochs | 500 |
| Learning rate | 0.3 |
| Momentum | 0.2 |

The graphical representation of the adopted ANN is presented in Figure 2.

**Figure 2.** Graphical representation of the adopted Artificial Neural Network.
