1. Introduction
The number of road construction projects is increasing dramatically every year. Although project management is being implemented with growing expertise, cost overruns remain a problem in projects [1]. Cost overrun is one of the factors that increases the capital output ratio of a country’s economy. Estimating the cost of projects has always been a crucial, demanding and sophisticated challenge [2,3]. Cost estimation is a process in which the total cost of a project is predicted based on the existing information [4]. Generally, cost estimation is conducted in order to set the initial budget of a project, which will ideally produce symmetry between the initial estimation and the subsequent actual cost [1]. Cost estimation presents some difficulties, such as the initial information required, the small number of databases available for road construction project costs, the low efficiency of existing cost estimation methods and the existence of uncertainties [5].
Earned Value Management (EVM) is a tool for controlling the progress of a project. EVM is able to illustrate the current status of projects and to measure current variances [6]. To assess the progress of projects, EVM exploits three constraints: time, scope and cost. Moreover, EVM is able to predict the future parameters of projects, including the final cost, based on existing data [7,8,9]. This comprehensive management approach has been widely used in numerous studies and in different fields [10,11,12,13,14].
Artificial Neural Networks (ANNs) are an effective tool that imitates the human mind and can be applied to various problems [15]. The first application of ANNs in construction activities took place in the late 1980s [16]. Adeli (2001) published the first scientific article regarding the use of ANNs in the construction industry [17]. ANNs are widely used in various stages of a project, including design, construction, maintenance, renovation and demolition [18]. Some examples of the use of ANNs are presented below.
Albino and Garavelli (1998) applied a neural network in order to rank subcontractors in construction firms [19]. Leung et al. (2001) exploited ANNs to predict the hoisting times of tower cranes [20]. Cheung et al. (2006) forecasted the performance of projects using neural networks [21]. Vouk et al. (2011) analyzed the economy of wastewater systems using neural networks [22]. Mucenski et al. (2013) estimated the recycling capacity of multistorey buildings using ANNs [23]. Chaphalkar et al. (2015) used a multilayer perceptron neural network in order to forecast the outcome of construction dispute claims [24]. Golnaraghi et al. (2019) predicted formwork labor productivity using an ANN [25]. Tijanic et al. (2019) used an ANN in order to predict costs in road construction [26]. Readers are referred to References [27,28,29,30,31,32,33,34,35,36] for further uses of ANNs for various applications in the construction industry, as well as in other fields of science.
Cost, time and quality are the three components of success in a construction project. In other words, a project that is finished within the predicted cost, to the required quality and within the forecasted time can be called a successful project [37]. The cost of construction projects usually deviates from the initial estimation due to a variety of factors [38]; that is, costs do not usually remain as they were predicted before the construction phase. Cost increases are normal, as can be seen in most projects [39]. According to the available literature, not many projects are finished within the forecasted cost. Many construction projects face both delays and cost overruns [40]. Flyvbjerg et al. illustrated that cost underestimation happens dramatically more frequently than cost overestimation [41]. Iran is a developing country, and cost overruns are common in such countries. For instance, Heravi and Mohammadian (2019) investigated 72 construction projects in Iran based on both their documentation and their actual performance. They concluded that larger projects faced higher cost overruns and delays [42]. Although EVM is able to illustrate the degree to which delays and cost shortages exist in a project on the basis of the project’s previous data, it cannot provide an accurate prediction of the future status of the project [8,9].
EVM results are obtained during and after the implementation phase. Thus, having the ability to predict the future situation of the project during the implementation phase could be very useful for project managers. The novelty of this study is in using an ANN, a tool that can learn from existing data, to obtain more precise predictions of a project’s future status [25]. In this way, hazardous situations are less likely to happen, as they will have been forecasted before their occurrence. Few previous studies have attempted to address the deficiency of the earned value management system in accurately predicting a project’s future status. Moreover, as mentioned before, construction projects usually face time and cost overruns, making this a permanent issue for all project managers [37]. For instance, Moura et al. concluded that construction projects experienced cost overruns of 20.4% to 44.7% in comparison to the initial cost estimation [43]. Thus, the significance of this study is in enabling project managers to use ANNs instead of the traditional EVM method in order to predict a project’s future status more accurately and to fill the mentioned gaps in the body of knowledge. In the current study, we chose to investigate road construction projects in Fars Province, Iran, as a case study. The findings of this study will help road construction industry members to predict cost indices more precisely in their projects.
2. Methodology
The methodology of the current study was determined according to the research aim. The main purpose of this research was to improve the prediction of the traditional EVM system in Fars road construction projects using an artificial neural network and to compare it with a multiple regression model. This aim can be divided into three stages. Firstly, factors affecting the earned value of Fars road construction projects were determined using the existing literature, an artificial neural network was built in MATLAB, and the identified factors were introduced to the ANN model. In the next stage, the identified factors were prioritized in MATLAB using the ANN model. Finally, multiple regression was used as the analysis tool, and the obtained results were compared with those of the ANN model. These stages are summarized in Figure 1.
2.1. Predicting Earned Value Using Artificial Neural Network
Intelligent dynamic systems, such as ANNs, have recently been a focus of research [44,45,46,47,48,49,50,51]. ANNs are able to identify the relationships among data by analyzing them and to then exploit these relationships in further analyses [52]. In fact, these computational intelligence-based systems attempt to model the neurosynaptic structure of the brain and are able to contribute to estimation, prediction and categorization problems effectively [53]. Generally, ANNs consist of three types of layers, namely, the input, hidden and output layers, each of which possesses its own neurons. Depending on the problem, there may be more than one hidden layer. In the current study, a multilayer perceptron network was used.
2.1.1. Input Data
Variables affecting the status of the project must be identified in order to investigate its future status. These variables are the input data of the artificial neural network. In this study, 14 factors affecting a project’s success were identified by investigating the existing literature, including books, journal papers and documents from the Fars State Road Administration. Due to the high sensitivity of this paper’s topic, the authors were not able to reduce the abovementioned number of factors. Some of the variables possessed numerical values, such as the inflation rate, which was obtained from the Central Bank of Iran. However, other variables were not numerical, such as the qualification of the project management team. These variables were quantified by scoring them from 1 to 5, where 1 and 5 stand for the worst and best status of a variable, respectively. For clarity, the qualitative status of a variable and its corresponding quantitative value are given in Table 1. Ten questionnaires were filled out by experts for each project. Thus, for the 50 studied projects, 500 questionnaires were used for data gathering.
Using Microsoft Project files of the studied projects, the Cost Performance Index ($CPI$) of each project was extracted. Then, using Microsoft Excel, the Mean Squared Error ($MSE$) was calculated. This error was used to compare the results of the ANN, multiple regression and the traditional EVM method. The Box-Cox method was used in order to normalize the data using SPSS software. Then, the obtained data were exported to MATLAB software for further stages.
The $CPI$ and $MSE$ formulas are presented as follows [1,8,54,55]:

$$CPI = \frac{BCWP}{ACWP}$$

$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

where $ACWP$ and $BCWP$ stand for the actual cost of the work performed and the budgeted cost of the work performed, respectively, $n$ is the number of observations, and $y_i$ and $\hat{y}_i$ are the actual and predicted values of the $i$th observation.
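As an illustration of these two measures, the following minimal Python sketch computes the CPI as the ratio of BCWP to ACWP (the standard EVM definition) and the MSE between actual and predicted values. It is not the Microsoft Project or Excel workflow used in this study; the function names and the sample figures are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def cost_performance_index(bcwp: float, acwp: float) -> float:
    """CPI = BCWP / ACWP; a value below 1 indicates a cost overrun."""
    if acwp <= 0:
        raise ValueError("ACWP must be positive")
    return bcwp / acwp


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical figures for illustration only (not taken from the studied projects).
cpi = cost_performance_index(bcwp=90.0, acwp=100.0)
actual_cpi = [0.90, 1.00, 0.80]
predicted_cpi = [1.00, 1.00, 0.90]
mse = mean_squared_error(actual_cpi, predicted_cpi)
```

In this hypothetical case the work performed was budgeted at 90 units but cost 100 units, so the CPI falls below 1, which signals a cost overrun.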
2.1.2. Architecture of the Network
In this stage, the network’s architecture must be determined. In order to do so, the number of input, hidden and output layers should be specified [15]. In this study, an MLP (multilayer perceptron) network is used in which the output of each layer is the input vector for the next layer. Each layer’s neurons are connected to the previous layer’s neurons. Each neuron computes the weighted sum of its inputs and passes the result through a function called the transfer function. The tangent sigmoid is regarded as one of the most useful functions for this purpose and has been widely used by experts [56,57,58,59,60,61]. Thus, it was used as the transfer function. The final network in this research is a multilayer perceptron neural network with 14 input variables in an input layer, a hidden layer and an output layer. The schematic structure of the designed neural network is illustrated in Figure 2.
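A forward pass through a network of this shape can be sketched in Python as follows. The sketch assumes a hidden layer of 10 neurons and a linear output neuron, neither of which is specified in this section, and it uses random, untrained weights. The hyperbolic tangent `math.tanh` is mathematically identical to the tangent sigmoid. This is an illustration of the architecture only, not the MATLAB model built in this study.

```python
import math
import random

N_INPUTS = 14   # from the text: 14 input variables
N_HIDDEN = 10   # assumption: the hidden-layer size is not given in this section


def init_layer(n_out, n_in, rng):
    """Return (weights, biases) filled with small random values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(x, hidden, output):
    """One forward pass: tanh hidden layer followed by a linear output neuron."""
    w_h, b_h = hidden
    w_o, b_o = output
    # Each hidden neuron: weighted sum of the inputs plus bias, then tanh.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    # Output neuron: weighted sum of hidden activations plus bias (assumed linear).
    return sum(w * hi for w, hi in zip(w_o[0], h)) + b_o[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_layer = init_layer(N_HIDDEN, N_INPUTS, rng)
output_layer = init_layer(1, N_HIDDEN, rng)

# Hypothetical, already-normalized factor values for a single project.
sample = [0.5] * N_INPUTS
prediction = forward(sample, hidden_layer, output_layer)
```

Because the weights are untrained, `prediction` has no meaning here; training would adjust the weights so that the output approximates the observed index.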
2.2. Determination and Prioritization of Factors Using ANN
After training the network, output coefficients of introduced variables can be extracted from MATLAB software. As the artificial neural network considers all the introduced factors important, the prioritization of factors is conducted according to the coefficients.
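The prioritization step can be sketched in Python as follows. The sketch assumes that factors are ranked by the absolute value of their coefficients, which is consistent with the order reported in the Conclusions (where the negative coefficient of "Inflation rate" is placed third) but is not stated explicitly; how the coefficients are extracted from MATLAB is not shown. The function name `rank_factors` is hypothetical.

```python
# Coefficients of the five factors reported in the Conclusions of this paper,
# listed here in arbitrary order.
coefficients = {
    "Inflation rate": -0.58,
    "Qualification of project management team": 0.40,
    "Project plan": 0.81,
    "Fortuitous events": 0.42,
    "Payment status": 0.65,
}


def rank_factors(coeffs):
    """Return factor names sorted by absolute coefficient, largest first."""
    return sorted(coeffs, key=lambda name: abs(coeffs[name]), reverse=True)


ranking = rank_factors(coefficients)
for position, name in enumerate(ranking, start=1):
    print(position, name, coefficients[name])
```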
2.3. Earned Value Prediction Using Multiple Regression Method
The relationship between dependent and independent variables can be determined using the multiple regression method [62]. There are four methods for entering input data into the model: the entering method (direct method), backward method, forward method and step-wise method [63]. In this study, the direct entering method was selected. The linear relationship among the variables is illustrated below:

$$y_j = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \cdots + \beta_n x_{nj} + \varepsilon_j$$

where $n$ is the number of predictors, $\beta_i$ is the value of the $i$th coefficient, $x_{ij}$ is the $j$th value of the $i$th predictor, and $\varepsilon_j$ is the error of the $j$th value. Furthermore, the matrix form of the model is presented as follows:

$$Y = X\beta + \varepsilon$$

where $\beta$ is the vector of regression coefficients, $\varepsilon$ is the vector of fitting errors, $Y$ is the vector of the dependent variable, and $X$ is the matrix of independent variables.
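To make the linear model concrete, the following Python sketch fits the regression coefficients by ordinary least squares with all predictors entered at once, which corresponds to the direct entering method. It solves the normal equations with a small Gauss-Jordan routine and is not the SPSS procedure used in this study; the function names and the noise-free sample data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], so that row operations act on both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [beta_0, beta_1, ..., beta_n] minimizing the sum of squared errors."""
    x = [[1.0] + list(r) for r in rows]  # the leading 1 carries the intercept
    k = len(x[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data generated from y = 0.5 + 0.1*x1 - 0.2*x2 without noise.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 4)]
y = [0.5 + 0.1 * x1 - 0.2 * x2 for x1, x2 in rows]
beta = fit_ols(rows, y)
```

Since the sample data contain no noise, the fitted coefficients recover the generating values up to floating-point rounding; with real project data, the residuals would be non-zero.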
In order to determine and rank factors affecting the earned value of the studied projects, outputs of SPSS analyses were used. Variables with a significance of less than 0.05 were selected as effective factors. Furthermore, according to their significance value, variables were prioritized.
Finally, the ANN and the multiple regression model were compared according to the correlation coefficient and mean squared error of each model. The model possessing the higher correlation coefficient, as well as the lower MSE, was introduced as the preferable model [64].
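The selection rule can be sketched in Python as follows. The sketch assumes that the correlation coefficient is the Pearson coefficient between actual and predicted values; the names `model_a` and `model_b` and all numbers are hypothetical placeholders, not results of this study.

```python
import math


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mse(actual, predicted):
    """Mean of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def preferable(actual, predictions):
    """Return the model with both the highest R and the lowest MSE, else None."""
    scores = {name: (pearson_r(actual, p), mse(actual, p))
              for name, p in predictions.items()}
    best_r = max(scores, key=lambda name: scores[name][0])
    best_mse = min(scores, key=lambda name: scores[name][1])
    return best_r if best_r == best_mse else None


# Hypothetical actual values and predictions of two hypothetical models.
actual = [0.9, 1.0, 0.8, 1.1]
predictions = {
    "model_a": [0.92, 0.98, 0.83, 1.08],
    "model_b": [1.0, 0.9, 0.9, 1.0],
}
best = preferable(actual, predictions)
```

Returning `None` when the two criteria disagree is a design choice of this sketch; the text does not say how such a tie would be resolved.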
2.4. Data Collection
In order to collect data, information regarding 50 road construction projects in Fars Province was extracted from documents. Then, besides other literature sources, data were turned into matrices and analyzed. As all factors affecting the cost of the abovementioned projects had to be considered, 14 factors were finally selected.
4. Conclusions
Perceptron neural networks, especially multilayer perceptron networks, are considered to be some of the best neural networks. In this study, it was observed that these networks were able to perform a non-linear mapping with desirable accuracy when a suitable number of layers and neurons was selected. As these neural networks possess the two main features of experimental data-based learning and parallel generalization ability, they are highly suitable for sophisticated systems that are impossible or difficult to model. Artificial neural networks are more accurate in comparison to other methods due to their usage of proven mathematical formulas possessing the lowest possible errors. One of the aspects that limits the usage of artificial neural networks is the difficulty faced when training them. These networks produce better results when they receive a large amount of data. However, adjusting the parameters of network training is a difficult task that requires experience and a lot of trial and error. Furthermore, convergence to an incorrect answer, memorizing the training data instead of learning from it, and requiring a lot of time for training are other difficulties associated with using artificial neural networks.
In this research, two different models, i.e., an artificial neural network model and a multiple regression model, were designed and analyzed in order to improve the traditional earned value management system. The latter model was used as a validation test for the ANN model. Road construction projects in Fars Province, Iran, between 2010 and 2020 were investigated as a case study. Fourteen factors affecting the earned value of these projects were identified. According to the ANN results, “Project plan”, “Payment status”, “Inflation rate”, “Fortuitous events” and “Qualification of project management team”, with coefficients of 0.81, 0.65, −0.58, 0.42 and 0.4, respectively, were the top five influencing factors. On the other hand, according to the multiple regression model results, “Risk management”, “Plans”, “Project schedule”, “Relationship among project’s parties” and “Conflicts”, with standardized coefficients of 0.333, 0.321, 0.311, 0.297 and 0.254, respectively, were the most important factors. A comparison of the two models illustrated that both models yield better results than the traditional EVM method. Moreover, the ANN model, with an MSE of 0.00206 and an R value of 0.896, was selected as the best model.
The methods used in this study could also be used to tackle other problems in the construction industry. The results obtained in this study will help road construction industry members to predict the earned value of future projects more precisely. ANN models are highly recommended by the authors for use in other construction problems. Furthermore, it is suggested that prospective researchers focus on more complex construction projects in order to investigate the performance criteria more deeply [65].