**5. Data Collection**

The collected data comprised eight input parameters and one output variable.

Floor space was divided into four categories: less than 200 square meters, 200 to 400 square meters, 400 to 600 square meters, and more than 600 square meters. The number of floors was likewise divided into four categories: one or two floors, three to five floors, six to eight floors, and more than eight floors.

Interior finishes were categorized into four groups: no interior finishes, semi-finished, basic, and luxurious. The semi-finished type applies when the walls have normal plaster only and there is no paintwork. The basic type applies when the walls are painted and the floors are ceramic. The luxurious type applies when the walls are painted and the floors are porcelain or marble. External finishing was divided into two categories: basic and luxurious. It is considered basic if the facades of the building are only painted, without any marble, Hashemite, or Pharaonic stone work, and luxurious if the facades include any marble, Hashemite, or Pharaonic stone work.

The number of basements parameter had two groups: no basement and one basement. The overall project duration was divided into four categories: less than six months, six months to one year, one to two years, and more than two years. The risk management process application parameter was split into two categories: no risk management processes were performed on the project, or risk management processes were performed on the project. The electromechanical parameter was divided into two categories: basic and luxurious. It is considered basic if the scope of work includes the main works of water, electricity, and sewage outside the apartment, and luxurious if it includes the internal works of the apartment.
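
As an illustration of the numeric categories above, the following minimal sketch maps raw project values to category indices. The bin edges come from the text, but the function names, the 0-based indices, and the treatment of boundary values (a value exactly on an edge falls in the upper category) are assumptions made for the example, not the authors' encoding.

```python
from bisect import bisect_right

# Bin edges taken from the category descriptions in the text.
# Assumption: a value exactly on an edge is placed in the upper category.
FLOOR_AREA_EDGES = [200, 400, 600]   # square meters
FLOOR_COUNT_EDGES = [3, 6, 9]        # 1-2, 3-5, 6-8, more than 8 floors
DURATION_EDGES = [6, 12, 24]         # months


def bin_index(value, edges):
    """Return the 0-based category index of value for ascending bin edges."""
    return bisect_right(edges, value)


def encode_project(area_m2, floors, duration_months):
    """Encode the three numeric inputs of one project as category indices."""
    return (
        bin_index(area_m2, FLOOR_AREA_EDGES),
        bin_index(floors, FLOOR_COUNT_EDGES),
        bin_index(duration_months, DURATION_EDGES),
    )


# Hypothetical project: 500 m^2 per floor, 12 floors, built in 20 months.
encoded = encode_project(500, 12, 20)
print(encoded)
```

The two-level and four-level categorical inputs (finishes, basements, risk management, electromechanical type) could be encoded in the same way with fixed lookup tables.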

Using Equation (2) and the planned and actual cost data, the authors assessed the overall percentage of risk for each completed project. A project was excluded from the analysis if the information about its planned cost or its actual cost was insufficient. As shown in Table 5, the overall percentage of risk, which is the main outcome variable, was divided into three levels: low, medium, and high.

$$\%OR = \frac{|AC - PC|}{PC} \times 100\tag{2}$$

"*%OR*" represents the overall percentage of risk, "*PC*" represents the planned cost, and "*AC*" represents the actual cost.

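Equation (2) can be evaluated directly. The sketch below is illustrative only: the function name and the cost figures are hypothetical and are not taken from the study's data.

```python
def overall_risk_percent(planned_cost, actual_cost):
    """Overall percentage of risk, %OR = |AC - PC| / PC * 100 (Equation 2)."""
    if planned_cost <= 0:
        raise ValueError("planned cost must be positive")
    return abs(actual_cost - planned_cost) / planned_cost * 100


# Hypothetical project: planned cost 1,000,000, actual cost 1,120,000.
risk = overall_risk_percent(1_000_000, 1_120_000)
print(f"%OR = {risk:.1f}%")
```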
**Table 5.** The Classifications of outputs.


The authors examined 250 projects and found that, for some of them, data for the eight input variables or the output variable were missing. As a result, complete data were available for only 149 actual residential projects. For example, among the 149 projects analyzed, case no. 26 was a 12-story building with 500 square meters per floor, luxurious interior and exterior finishes, and luxurious electromechanical work. This building has one basement and was built in 20 months using risk management procedures, with an overall risk of roughly 12%. Because the population is very large, its size was assumed to be unlimited, so the sample size could be calculated using Equation (3). Table 6 shows demographic information about the respondents, whereas Table 7 shows demographic information about the inputs gathered from the 149 projects.

$$SS = \frac{Z^2 \times p \times (1 - p)}{C^2} \tag{3}$$

where *SS* stands for the sample size, *Z* is 1.96 for a 95% confidence level, *p* stands for the probability of selection, and *C* stands for the confidence interval. The sample size in this study was 149 projects and *p* was 0.5, hence the confidence interval was 0.08.
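
The reported confidence interval can be checked by solving Equation (3) for *C*. The following is a minimal sketch; the function names are chosen for illustration.

```python
import math


def sample_size(z, p, c):
    """Sample size for an unlimited population (Equation 3)."""
    return z**2 * p * (1 - p) / c**2


def confidence_interval(z, p, n):
    """Equation (3) rearranged for C, given a sample of n cases."""
    return z * math.sqrt(p * (1 - p) / n)


# Values stated in the text: Z = 1.96, p = 0.5, SS = 149.
c = confidence_interval(1.96, 0.5, 149)
print(f"C = {c:.4f}")
```

With these inputs *C* evaluates to roughly 0.080, which agrees with the 0.08 stated above.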

**Table 6.** The demographic data regarding the respondents.


**Table 7.** Demographic data of inputs.


Using Equation (4), the authors estimated Cronbach's alpha for the items. The accepted threshold for Cronbach's alpha is 0.7 [33]. Cronbach's alpha in this study was 0.757, which exceeds this threshold. This indicates that the scale is internally consistent, implying that it would produce the same findings if applied to the same sample again. Validity refers to how accurate a measurement is. The validity of this study was 0.87.

$$\alpha = \frac{n}{n-1} \times \left(1 - \frac{\sum_{i=1}^{n} V_{i}}{V_{t}}\right) \tag{4}$$

where "*n*" represents the number of items, *Vi* represents the variance of item *i*, and *Vt* represents the variance of the test score.
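
Equation (4) can be sketched as follows. The data layout (one list of scores per item, one score per respondent) and the use of the sample variance are assumptions of this example, and the scores are invented solely to exercise the function.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha (Equation 4).

    item_scores: list of items, each a list with one score per respondent.
    """
    n = len(item_scores)
    if n < 2:
        raise ValueError("at least two items are required")
    # V_i: variance of each item across respondents.
    item_vars = [variance(item) for item in item_scores]
    # V_t: variance of each respondent's total score over all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_var = variance(totals)
    return n / (n - 1) * (1 - sum(item_vars) / total_var)


# Invented scores: three items answered by five respondents.
items = [
    [3, 4, 5, 2, 4],
    [3, 5, 4, 2, 5],
    [2, 4, 5, 3, 4],
]
print(f"alpha = {cronbach_alpha(items):.3f}")
```

When all items rank the respondents identically, the item variances account for the smallest possible share of the total variance and alpha approaches 1.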

#### **6. Model Specification**

The model was built using artificial neural networks. IBM SPSS software was chosen to construct the model because of its ease of use: it has a simple user interface, and data can be quickly imported from and exported to Excel. The model has eight input parameters and one output. The inputs are floor space, number of floors, interior finishes, exterior finishes, number of basements, total project duration, risk management process application, and electromechanical type. The output is the overall risk factor.

To evaluate the model's performance, the collected data were randomly divided into five folds for cross-validation. The first fold has 29 cases, whereas each of the remaining folds has 30. For each model, four folds were used to train the network and the remaining fold was used to evaluate it.

A model may have one or two hidden layers, so there are two hidden layer groups. In models with one hidden layer, the number of neurons can be three, four, or five. In models with two hidden layers, the number of neurons can be four in the first layer and three in the second, five in the first layer and three in the second, or five in the first layer and four in the second. Thus, there are three groups in terms of the number of neurons in each hidden layer. Either the hyperbolic tangent function or the sigmoid function was used as the activation function for the hidden layers, and both were investigated. Equation (5) gives the number of models that can be tested. Accordingly, twelve multilayer perceptron models were identified and tested. The examined models and their mean absolute errors (MAE) in each fold are shown in Table 8. Equation (6) can be used to calculate the mean absolute error [34].

$$N_m = N_l \times N_a \times N_g \tag{5}$$

$$MAE = \frac{\sum_{i=1}^{N} \left| ER_i - RS_i \right|}{N} \tag{6}$$

where "*Nm*" stands for the number of models, "*Nl*" for the number of hidden layers, "*Na*" for the number of hidden layer activation functions, and "*Ng*" for the number of neuron groups. "*ER*" stands for the model's estimated risk, "*RS*" for the risk score, and "*N*" for the number of case studies.
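
The model count of Equation (5) and the error measure of Equation (6) can be illustrated with a short sketch. The dictionary layout and all names are hypothetical, the absolute value of each difference is taken as the term "mean absolute error" implies, and the sketch does not reproduce the SPSS configuration.

```python
from itertools import product

# Neuron groups described in the text, for each hidden layer group.
ONE_HIDDEN_LAYER = [(3,), (4,), (5,)]
TWO_HIDDEN_LAYERS = [(4, 3), (5, 3), (5, 4)]
ACTIVATIONS = ["tanh", "sigmoid"]


def candidate_models():
    """Enumerate every combination of layer group, activation, and neurons."""
    layer_groups = [ONE_HIDDEN_LAYER, TWO_HIDDEN_LAYERS]
    return [
        {"hidden_layers": neurons, "activation": activation}
        for group, activation in product(layer_groups, ACTIVATIONS)
        for neurons in group
    ]


def mean_absolute_error(estimated, actual):
    """Mean of |ER - RS| over all cases (Equation 6)."""
    if not estimated or len(estimated) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - a) for e, a in zip(estimated, actual)) / len(estimated)


models = candidate_models()
print(len(models))  # N_m = 2 * 2 * 3
```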

The mean absolute error of each model is the mean of its errors across the k folds, and the proposed model is the one with the minimum MAE. In this study, the minimum MAE was 11.7%, as shown in Table 8. The proposed model has two hidden layers, with five neurons in the first hidden layer and three neurons in the second. Its hidden layers use the hyperbolic tangent activation function. Figure 1 illustrates the structure of the proposed model. The real and estimated overall risks are presented in Table 9.
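
The fold sizes and the selection rule described in this section can be sketched as follows. This is an illustration under stated assumptions: the shuffling seed, the function names, the model labels, and the per-fold error values are hypothetical and are not the values reported in Table 8.

```python
import random


def split_into_folds(n_cases, k, seed=0):
    """Randomly split case indices into k folds, smaller folds first."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_cases, k)
    # k - extra folds of size base, then extra folds of size base + 1.
    sizes = [base] * (k - extra) + [base + 1] * extra
    folds, start = [], 0
    for size in sizes:
        folds.append(indices[start:start + size])
        start += size
    return folds


def best_model(fold_errors):
    """Pick the model whose mean MAE across folds is lowest.

    fold_errors maps a model label to its list of per-fold MAE values.
    """
    mean_errors = {
        label: sum(errors) / len(errors) for label, errors in fold_errors.items()
    }
    best = min(mean_errors, key=mean_errors.get)
    return best, mean_errors[best]


folds = split_into_folds(149, 5)
print([len(fold) for fold in folds])

# Hypothetical per-fold MAE values for two made-up model labels.
label, mae = best_model({"model_a": [14.0, 12.0], "model_b": [11.0, 13.0]})
print(label, mae)
```

Each fold in turn would serve as the evaluation set while the other four are used for training, so every model receives one MAE per fold before the averages are compared.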


**Table 8.** Mean absolute error of the models.

"H" stands for the Hidden Layers' Hyperbolic Tangent activation function and "S" stands for the Hidden Layers' Sigmoid activation function.

**Figure 1.** The Methodology of research.


**Table 9.** Classification of overall risk.
