Advancement of Artificial Intelligence in Cost Estimation for Project Management Success: A Systematic Review of Machine Learning, Deep Learning, Regression, and Hybrid Models
Abstract
1. Introduction
2. Literature Review
2.1. Historical Evolution of Cost Estimation Techniques
2.2. Techniques and Tools for Cost Estimation
2.3. AI-Enabled Decision Support Systems in Cost Prediction
2.4. AI Techniques Used in Cost Estimation
2.5. Enhancing Cost Estimation Processes with AI: Key Areas of Application
3. Methodology
3.1. Identification Phase
3.2. Screening Phase
3.3. Eligibility Assessment Phase
3.4. Data Extraction and Thematic Analysis
4. Findings
4.1. General Information of the Studies
4.2. Cost Estimation Trends
4.3. Relationship Between Study Count and Citation Impact in Cost Management
4.4. Use of AI Cost Estimation Models in Different Projects
4.5. Percentage of AI Models in Industries
4.6. Assessment of Confidence in the Findings
5. Discussion
5.1. Performance of the Models Industry-Wise
5.1.1. Manufacturing
5.1.2. Road Construction
5.1.3. Building Construction Projects
5.1.4. General Construction
5.1.5. Highway Construction
5.1.6. Canals
5.1.7. Field Construction
5.1.8. Mining
5.1.9. Healthcare
5.1.10. Software
5.1.11. Aviation
5.1.12. Bridge Construction
5.2. Performance of the Models in Different Industries
5.2.1. Deep Learning Models
5.2.2. Machine Learning Models
5.2.3. Regression Models
5.2.4. Hybrid Models
5.3. Limitations of This Study
5.4. Future Directions
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
ANN | Artificial neural network |
CBR | Case-based reasoning |
CNN | Convolutional neural network |
DNN | Deep neural network |
GBM | Gradient-Boosting Machine |
GPR | Gaussian Process Regression |
GBT | Gradient-Boosted Trees |
IBL | Instance-based learning |
KNN | K-Nearest Neighbors |
LGBoost | Light Gradient Boosting |
NGBoost | Natural Gradient Boosting |
LSTM | Long Short-Term Memory |
LR | Linear regression |
PCA | Principal component analysis |
RF | Random Forest |
RME | Relative mean error |
RMSE | Root-mean-square error |
RNN | Recurrent neural network |
SGB | Stochastic Gradient Boosting |
SVM | Support vector machine |
XGBoost | Extreme Gradient Boosting |
References
- Sargentis, G.-F.; Defteraios, P.; Lagaros, N.D.; Mamassis, N. Values and Costs in History: A Case Study on Estimating the Cost of Hadrianic Aqueduct’s Construction. World 2022, 3, 260–286.
- Walker, D.; Dart, C.J. Frontinus—A Project Manager from the Roman Empire Era. Proj. Manag. J. 2011, 42, 4–16.
- Weaver, P. The Origins of Modern Project Management. In Proceedings of the Fourth Annual PMI College of Scheduling Conference, Vancouver, BC, Canada, 15–18 April 2007; pp. 15–18.
- Seymour, T.; Hussein, S. The history of project management. Int. J. Manag. Inf. Syst. 2014, 18, 233–240.
- Archer, S.; Lesczynski, M. Project Estimation—Go Parametric to Reduce ‘Hectic’. Available online: https://www.pmi.org/learning/library/project-estimation-reduce-hectic-deloitte-6061 (accessed on 28 January 2025).
- Hueber, C.; Horejsi, K.; Schledjewski, R. Review of cost estimation: Methods and models for aerospace composite manufacturing. Adv. Manuf. Polym. Compos. Sci. 2016, 2, 1–13.
- Tayefeh Hashemi, S.; Ebadati, O.M.; Kaur, H. Cost estimation and prediction in construction projects: A systematic review on machine learning techniques. SN Appl. Sci. 2020, 2, 1703.
- Matel, E.; Vahdatikhaki, F.; Hosseinyalamdary, S.; Evers, T.; Voordijk, H. An artificial neural network approach for cost estimation of engineering services. Int. J. Constr. Manag. 2022, 22, 1274–1287.
- Hosny, S.; Elsaid, E.; Hosny, H. Prediction of construction material prices using ARIMA and multiple regression models. Asian J. Civ. Eng. 2023, 24, 1697–1710.
- Ou, T.-Y.; Cheng, C.-Y.; Chen, P.-J.; Perng, C. Dynamic cost forecasting model based on extreme learning machine—A case study in steel plant. Comput. Ind. Eng. 2016, 101, 544–553.
- Momade, M.H.; Durdyev, S.; Dixit, S.; Shahid, S.; Alkali, A.K. Modeling labor costs using artificial intelligence tools. Int. J. Build. Pathol. Adapt. 2024, 42, 1263–1281.
- Chou, J.-S. Cost simulation in an item-based project involving construction engineering and management. Int. J. Proj. Manag. 2011, 29, 706–717.
- Lowe, D.J.; Emsley, M.W.; Harding, A. Predicting Construction Cost Using Multiple Regression Techniques. J. Constr. Eng. Manag. 2006, 132, 750–758.
- Green, C. Estimating as an Art—What It Takes to Make Good Art. Presented at the PMI® Global Congress 2006—EMEA, Madrid, Spain: Project Management Institute, October 2006. Available online: https://www.pmi.org/learning/library/project-estimating-accurate-labor-costs-8207 (accessed on 22 January 2025).
- Niederman, F. Project management: Openings for disruption from AI and advanced analytics. Inf. Technol. People 2021, 34, 1570–1599.
- Shamim, M.I. Exploring the success factors of project management. Am. J. Econ. Bus. Manag. 2022, 5, 64–72.
- Sanchez, O.P.; Terlizzi, M.A.; De Moraes, H.R.D.O.C. Cost and time project management success factors for information systems development projects. Int. J. Proj. Manag. 2017, 35, 1608–1626.
- Yun, Y.; Ma, D.; Yang, M. Human–computer interaction-based Decision Support System with Applications in Data Mining. Future Gener. Comput. Syst. 2021, 114, 285–289.
- Tijanić, K.; Car-Pušić, D.; Šperac, M. Cost estimation in road construction using artificial neural network. Neural Comput. Appl. 2020, 32, 9343–9355.
- Guo, K.; Yang, Z.; Yu, C.-H.; Buehler, M.J. Artificial intelligence and machine learning in design of mechanical materials. Mater. Horiz. 2021, 8, 1153–1172.
- Bento, S.; Pereira, L.; Gonçalves, R.; Dias, Á.; Costa, R.L.D. Artificial intelligence in project management: Systematic literature review. Int. J. Technol. Intell. Plan. 2022, 13, 143–163.
- Aziz, S.; Dowling, M. Machine Learning and AI for Risk Management. In Disrupting Finance; Lynn, T., Mooney, J.G., Rosati, P., Cummins, M., Eds.; Palgrave Studies in Digital Business & Enabling Technologies; Springer International Publishing: Cham, Switzerland, 2019; pp. 33–50.
- Gupta, S.; Modgil, S.; Bhattacharyya, S.; Bose, I. Artificial intelligence for decision support systems in the field of operations research: Review and future scope of research. Ann. Oper. Res. 2022, 308, 215–274.
- Tayyab, M.R.; Usman, M.; Ahmad, W. A Machine Learning Based Model for Software Cost Estimation. In Proceedings of SAI Intelligent Systems Conference (IntelliSys) 2016; Bi, Y., Kapoor, S., Bhatia, R., Eds.; Lecture Notes in Networks and Systems; Springer International Publishing: Cham, Switzerland, 2018; Volume 16, pp. 402–414.
- Hammann, D. Big data and machine learning in cost estimation: An automotive case study. Int. J. Prod. Econ. 2024, 269, 109137.
- Krichen, M. Convolutional Neural Networks: A Survey. Computers 2023, 12, 151.
- Ning, F.; Shi, Y.; Cai, M.; Xu, W.; Zhang, X. Manufacturing cost estimation based on a deep-learning method. J. Manuf. Syst. 2020, 54, 186–195.
- Fischer, T.; Krauss, C. Deep learning with long short-term memory networks for financial market predictions. Eur. J. Oper. Res. 2018, 270, 654–669.
- Van Houdt, G.; Mosquera, C.; Nápoles, G. A review on the long short-term memory model. Artif. Intell. Rev. 2020, 53, 5929–5955.
- Xue, X.; Jia, Y.; Tang, Y. Expressway Project Cost Estimation with a Convolutional Neural Network Model. IEEE Access 2020, 8, 217848–217866.
- Upreti, K.; Singh, U.K.; Jain, R.; Kaur, K.; Sharma, A.K. Fuzzy Logic Based Support Vector Regression (SVR) Model for Software Cost Estimation Using Machine Learning. In ICT Systems and Sustainability; Tuba, M., Akashe, S., Joshi, A., Eds.; Lecture Notes in Networks and Systems; Springer Nature: Singapore, 2022; Volume 321, pp. 917–927.
- Peško, I.; Mučenski, V.; Šešlija, M.; Radović, N.; Vujkov, A.; Bibić, D.; Krklješ, M. Estimation of Costs and Durations of Construction of Urban Roads Using ANN and SVM. Complexity 2017, 2017, 2450370.
- Tajani, F.; Morano, P. A Systematic Analysis of Benefits and Costs of Projects for the Valorization of Cultural Heritage. In Cultural Territorial Systems; Rotondo, F., Selicato, F., Marin, V., Galdeano, J.L., Eds.; Springer Geography; Springer International Publishing: Cham, Switzerland, 2016; pp. 107–118.
- Dong, J.; Chen, Y.; Guan, G. Cost Index Predictions for Construction Engineering Based on LSTM Neural Networks. Adv. Civ. Eng. 2020, 2020, 6518147.
- Ning, F.; Shi, Y.; Cai, M.; Xu, W.; Zhang, X. Manufacturing cost estimation based on the machining process and deep-learning method. J. Manuf. Syst. 2020, 56, 11–22.
- Chakraborty, D.; Elhegazy, H.; Elzarka, H.; Gutierrez, L. A novel construction cost prediction model using hybrid natural and light gradient boosting. Adv. Eng. Inform. 2020, 46, 101201.
- Jaafari, A.; Pazhouhan, I.; Bettinger, P. Machine Learning Modeling of Forest Road Construction Costs. Forests 2021, 12, 1169.
- Loyer, J.-L.; Henriques, E.; Fontul, M.; Wiseall, S. Comparison of Machine Learning methods applied to the estimation of manufacturing cost of jet engine components. Int. J. Prod. Econ. 2016, 178, 109–119.
- Alshboul, O.; Shehadeh, A.; Almasabha, G.; Almuflih, A.S. Extreme Gradient Boosting-Based Machine Learning Approach for Green Building Cost Prediction. Sustainability 2022, 14, 6651.
- Elmousalami, H.H. Comparison of Artificial Intelligence Techniques for Project Conceptual Cost Prediction: A Case Study and Comparative Analysis. IEEE Trans. Eng. Manag. 2020, 68, 183–196.
- Leśniak, A.; Zima, K. Cost Calculation of Construction Projects Including Sustainability Factors Using the Case Based Reasoning (CBR) Method. Sustainability 2018, 10, 1608.
- Juszczyk, M.; Leśniak, A.; Zima, K. ANN Based Approach for Estimation of Construction Costs of Sports Fields. Complexity 2018, 2018, 7952434.
- Guo, H.; Nguyen, H.; Vu, D.-A.; Bui, X.-N. Forecasting mining capital cost for open-pit mining projects based on artificial neural network approach. Resour. Policy 2021, 74, 101474.
- Ali, Z.H.; Burhan, A.M.; Kassim, M.; Al-Khafaji, Z. Developing an Integrative Data Intelligence Model for Construction Cost Estimation. Complexity 2022, 2022, 4285328.
- Wang, R.; Asghari, V.; Cheung, C.M.; Hsu, S.-C.; Lee, C.-J. Assessing effects of economic factors on construction cost estimation using deep neural networks. Autom. Constr. 2022, 134, 104080.
- Rafiei, M.H.; Adeli, H. Novel Machine-Learning Model for Estimating Construction Costs Considering Economic Variables and Indexes. J. Constr. Eng. Manag. 2018, 144, 04018106.
- Relich, M.; Pawlewski, P. A case-based reasoning approach to cost estimation of new product development. Neurocomputing 2018, 272, 40–45.
- Yaseen, Z.M.; Ali, Z.H.; Salih, S.Q.; Al-Ansari, N. Prediction of Risk Delay in Construction Projects Using a Hybrid Artificial Intelligence Model. Sustainability 2020, 12, 1514.
- Vimont, A.; Leleu, H.; Durand-Zaleski, I. Machine learning versus regression modelling in predicting individual healthcare costs from a representative sample of the nationwide claims database in France. Eur. J. Health Econ. 2022, 23, 211–223.
- Mazumdar, M.; Lin, J.-Y.J.; Zhang, W.; Li, L.; Liu, M.; Dharmarajan, K.; Sanderson, M.; Isola, L.; Hu, L. Comparison of statistical and machine learning models for healthcare cost data: A simulation study motivated by Oncology Care Model (OCM) data. BMC Health Serv. Res. 2020, 20, 350.
- Ul Hassan, C.A.; Iqbal, J.; Hussain, S.; AlSalman, H.; Mosleh, M.A.A.; Sajid Ullah, S. A Computational Intelligence Approach for Predicting Medical Insurance Cost. Math. Probl. Eng. 2021, 2021, 1162553.
- Morid, M.A.; Sheng, O.R.L.; Kawamoto, K.; Abdelrahman, S. Learning hidden patterns from patient multivariate time series data using convolutional neural networks: A case study of healthcare cost prediction. J. Biomed. Inform. 2020, 111, 103565.
- Barros, L.B.; Marcy, M.; Carvalho, M.T.M. Construction Cost Estimation of Brazilian Highways Using Artificial Neural Networks. Int. J. Struct. Civ. Eng. Res. 2018, 7, 283–289.
- Ho, W.K.O.; Tang, B.-S.; Wong, S.W. Predicting property prices with machine learning algorithms. J. Prop. Res. 2021, 38, 48–70.
- Wang, H.; Cui, Z.; Chen, Y.; Avidan, M.; Abdallah, A.B.; Kronzer, A. Predicting Hospital Readmission via Cost-Sensitive Deep Learning. IEEE/ACM Trans. Comput. Biol. Bioinform. 2018, 15, 1968–1978.
- Yun, S. Performance Analysis of Construction Cost Prediction Using Neural Network for Multioutput Regression. Appl. Sci. 2022, 12, 9592.
- Tatiya, A.; Zhao, D.; Syal, M.; Berghorn, G.H.; LaMore, R. Cost prediction model for building deconstruction in urban areas. J. Clean. Prod. 2018, 195, 1572–1580.
- Chandanshive, V.; Kambekar, A. Estimation of Building Construction Cost Using Artificial Neural Networks. J. Soft Comput. Civ. Eng. 2019, 3, 91–107.
- Pham, T.Q.D.; Le-Hong, T.; Tran, X.V. Efficient estimation and optimization of building costs using machine learning. Int. J. Constr. Manag. 2023, 23, 909–921.
- Elhegazy, H.; Chakraborty, D.; Elzarka, H.; Ebid, A.M.; Mahdi, I.M.; Aboul Haggag, S.Y.; Abdel Rashid, I. Artificial Intelligence for Developing Accurate Preliminary Cost Estimates for Composite Flooring Systems of Multi-Storey Buildings. J. Asian Archit. Build. Eng. 2022, 21, 120–132.
- Abdulmajeed, A.A.; Al-Jawaherry, M.A.; Tawfeeq, T.M. Predict the required cost to develop Software Engineering projects by Using Machine Learning. J. Phys. Conf. Ser. 2021, 1897, 012029.
- Alshamrani, O.S. Construction cost prediction model for conventional and sustainable college buildings in North America. J. Taibah Univ. Sci. 2017, 11, 315–323.
- Chen, X.; Yi, M.; Huang, J. Application of a PCA-ANN Based Cost Prediction Model for General Aviation Aircraft. IEEE Access 2020, 8, 130124–130135.
- Elmousalami, H.H.; Elyamany, A.H.; Ibrahim, A.H. Predicting Conceptual Cost for Field Canal Improvement Projects. J. Constr. Eng. Manag. 2018, 144, 04018102.
- Kovačević, M.; Ivanišević, N.; Petronijević, P.; Despotović, V. Construction cost estimation of reinforced and prestressed concrete bridges using machine learning. J. Croat. Assoc. Civ. Eng. 2021, 73, 1–13.
- Morid, M.A.; Sheng, O.R.L.; Kawamoto, K.; Ault, T.; Dorius, J.; Abdelrahman, S. Healthcare cost prediction: Leveraging fine-grain temporal patterns. J. Biomed. Inform. 2019, 91, 103113.
- Park, B.R.; Choi, E.J.; Hong, J.; Lee, J.H.; Moon, J.W. Development of an energy cost prediction model for a VRF heating system. Appl. Therm. Eng. 2018, 140, 476–486.
- Patra, G.K.; Kuraku, C.; Konkimalla, S.; Boddapati, V.N.; Sarisa, M.; Reddy, M.S. An Analysis and Prediction of Health Insurance Costs Using Machine Learning-Based Regressor Techniques. J. Data Anal. Inf. Process. 2024, 12, 581–596.
- Taloba, A.I.; El-Aziz, R.M.A.; Alshanbari, H.M.; El-Bagoury, A.-A.H. Estimation and Prediction of Hospitalization and Medical Care Costs Using Regression in Machine Learning. J. Healthc. Eng. 2022, 2022, 7969220.
- Ujong, J.A.; Mbadike, E.M.; Alaneme, G.U. Prediction of cost and duration of building construction using artificial neural network. Asian J. Civ. Eng. 2022, 23, 1117–1139.
Study | First Author | Year of Publication | Citations | Study Design | Data Collection Location | Dataset Size |
---|---|---|---|---|---|---|
[35] | Fangwei Ning | 2020 | 56 | Experimental study | China | 121,980 cases |
[36] | Debaditya Chakraborty | 2020 | 184 | Experimental study | Data from RSMeans Assemblies Books | Medium- and high-rise buildings consisting of 4477 data points |
[37] | Abolfazl Jaafari | 2021 | 27 | Experimental study | Iran | 4811 data samples collected from 300 road segments |
[8] | Erik Matel | 2022 | 133 | Experimental study | Data collected from engineering consultancy firms | 132 engineering projects |
[38] | Jean-Loup Loyer | 2016 | 161 | Case study approach | - | 254 jet engines’ data |
[39] | Odey Alshboul | 2022 | 95 | Experimental study | Various locations across North America | 283 building data |
[40] | Haytham H. Elmousalami | 2020 | 126 | Experimental study | Egypt | 144 FCIPs |
[41] | Agnieszka Leśniak | 2018 | 123 | Case study approach | Poland | 143 construction projects’ data |
[42] | Michał Juszczyk | 2018 | 83 | Experimental study | Poland | 115 construction projects’ data |
[43] | Hongquan Guo | 2021 | 87 | Experimental study | N/A | 74 open-pit mining projects |
[44] | Zainab Hasan Ali | 2022 | 24 | Experimental study | Iraq | 90 building projects’ data |
[45] | Ran Wang | 2022 | 52 | Experimental study | Hong Kong | 98 public school construction projects |
[46] | Mohammad Hossein Rafiei | 2018 | 212 | Experimental study | Iran | 372 low- and mid-rise buildings |
[47] | Marcin Relich | 2018 | 113 | Case study approach | Poland | 61 new product development projects |
[48] | Zaher Mundher Yaseen | 2020 | 159 | Experimental study | Iraq | 40 completed construction projects |
[49] | Alexandre Vimont | 2022 | 38 | Experimental study | France | 510,182 subjects |
[50] | Madhu Mazumdar | 2020 | 43 | Simulation study | United States | 4205 subjects |
[51] | Ch. Anwar ul Hassan | 2021 | 43 | Experimental study | N/A | 1338 subjects |
[52] | Mohammad Amin Morid | 2020 | 47 | Simulation study | United States | 91,000 individuals |
[53] | Laís B. Barros | 2018 | 24 | Experimental study | Brazil | 14 highway projects |
[54] | Winky K.O. Ho | 2021 | 254 | Experimental study | Hong Kong | 40,000 housing transactions |
[55] | Haishuai Wang | 2018 | 245 | Predictive modeling | United States | 41,503 patient visits |
[56] | Seokheon Yun | 2022 | 23 | Case study approach | South Korea | 908 construction cases |
[57] | Amol Tatiya | 2018 | 101 | Case study approach | United States | 530 deconstruction projects |
[34] | Jiacheng Dong | 2020 | 23 | Experimental study | China | 143 months of construction cost index data |
[58] | Viren Chandanshive | 2019 | 58 | Experimental study | India | 78 building projects |
[59] | T.Q.D. Pham | 2023 | 43 | Experimental study | N/A | 10,000 parametric building configurations |
[60] | Hosam Elhegazy | 2022 | 57 | Decision-making study | United States | More than 900 data points |
[61] | Ashraf Abdulmunim Abdulmajeed | 2021 | 18 | Case study approach | United States | 60 completed software projects |
[62] | Othman Subhi Alshamrani | 2017 | 68 | Experimental study | North America | 320 construction projects |
[63] | Xiaonan Chen | 2020 | 19 | Experimental study | N/A | 22 samples of general aviation aircraft |
[64] | Haytham H. Elmousalami | 2018 | 126 | Experimental study | Egypt | 144 field canal improvement projects |
[65] | Miljan Kovačević | 2021 | 42 | Experimental study | Serbia | 181 bridge construction projects |
[66] | Mohammad Amin Morid | 2019 | 40 | Experimental study | United States | 3.8 million medical claims and 780,000 pharmacy claims from 24,000 patients |
[67] | Bo Rang Park | 2018 | 41 | Experimental study | South Korea | 11 monitoring areas from a test building in winter |
[68] | Gagan Kumar Patra | 2024 | 37 | Experimental study | N/A | 1388 health insurance records |
[32] | Igor Peško | 2017 | 80 | Experimental study | N/A | 166 road construction projects |
[69] | Ahmed I. Taloba | 2022 | 47 | Experimental study | Japan | 24,353 patient records |
[70] | Jesam Ujong | 2022 | 59 | Experimental study | Nigeria | 78 responses from different construction professionals |
Model Category | Preferred Technique | Rationale for Suitability | Real-Life Implications |
---|---|---|---|
Deep Learning Techniques | Convolutional neural network (CNN) [27,52,55] | CNNs learn complex relationships and features from large datasets and handle imbalanced class distributions when predicting costs | CNNs automate price quotations and reduce dependency on expert-driven cost estimation |
Deep Learning Techniques | Artificial neural network (ANN) [8,20,42,53,56,58,59,60,67,70] | ANNs were chosen for their ability to handle nonlinear relationships and model complex relationships between cost predictors and final costs, without requiring predefined equations | Enhances capital planning accuracy, reduces cost overruns, and improves investment decision-making by leveraging historical data for systematic cost estimation and reducing dependence on expert judgment |
Deep Learning Techniques | Deep neural network (DNN) [45] | DNNs are recommended for projects where economic factors and project characteristics interact in a complex manner | Used for predicting early-stage construction costs of public school projects, assisting stakeholders in financial decision-making |
Deep Learning Techniques | Long Short-Term Memory (LSTM) [34] | It effectively captures long-term dependencies in time-series data, overcoming issues | Improved accuracy in project cost estimation and planning |
Machine Learning Techniques | Support vector machine (SVM) [32,37,46] | SVM handles high-dimensional data, terrain-dependent costs, and complex relationships between construction costs and economic variables | Enables early cost prediction for road construction projects in forestry; supports bid selection processes and alternative road alignment choices |
Machine Learning Techniques | Random Forest (RF) [49,50] | RF showed superior performance by detecting nonlinearity and interactions without pre-specifying the model parameters | Enhances predictive accuracy in healthcare cost estimation, reducing bias in provider reimbursement under value-based care models |
Machine Learning Techniques | Gradient-Boosted Trees (GBT) [38] | GBT was chosen for its superior predictive accuracy and ability to handle complex data structures | Used for early-stage cost estimation to support manufacturing decisions |
Machine Learning Techniques | Extreme Gradient Boosting (XGBoost) [21,39,68] | XGBoost is best suited due to its scalability and ability to handle missing data, along with the best trade-off between accuracy and computational efficiency | XGBoost can predict conceptual costs at early project stages, aiding financial planning and decision-making |
Machine Learning Techniques | Hybrid LGBoost–NGBoost [36] | The hybrid LGBoost–NGBoost model can estimate uncertainty while automating cost forecasting at a large scale | Enables construction professionals to perform value engineering (VE) efficiently by evaluating different design alternatives based on accurate cost predictions |
Machine Learning Techniques | Stochastic Gradient Boosting (SGB) [51] | Enables construction professionals to perform value engineering efficiently by evaluating different design alternatives based on accurate cost predictions | Enhances pricing accuracy and improves cost management for selecting better plans |
Machine Learning Techniques | Case-based reasoning (CBR) [57] | CBR can learn from past cases, requiring less training, while offering better clarity and interpretability in cost estimation | Supports decision-makers in shifting from demolition to sustainable deconstruction by providing accurate cost estimates and potential material recovery values |
Machine Learning Techniques | K-Nearest Neighbors (KNN) [61] | KNN effectively learns cost patterns from historical project data, making it a reliable model for estimating costs without requiring complex training processes | Provides accurate cost estimation, reducing financial risks and optimizing project planning |
Machine Learning Techniques | Gradient-Boosting Machine (GBM) [54] | GBM can minimize prediction errors, making it the most effective model for property price estimation | Provides accurate property price predictions, aiding real estate investors and policymakers |
Regression-Based Models | Regression model [62,64] | This model can handle large, nonlinear projects with complex dependencies | Automates and optimizes cost estimation processes; improves projects with early-stage financial planning |
Regression-Based Models | Linear regression (LR) [69] | Linear regression is best for its simplicity and high interpretability | Policymakers benefit from this model through cost forecasting |
Regression-Based Models | Gradient Boosting Regression (GBR) [66] | Gradient boosting can capture complex temporal dependencies in cost data | Managing resources efficiently and optimizing value-based payment models |
Regression-Based Models | Gaussian Process Regression (GPR) [65] | GPR handles the complex, nonlinear relationships in cost estimation | Helps allocate care management resources efficiently and optimize value-based payment models |
Hybrid and Optimization-Based Models | XGBoost–Random Forest (RF) [44] | This model balances complex input dependencies while estimating project costs | Enhances cost estimation accuracy, reduces errors, and improves decision-making in early project planning |
Hybrid and Optimization-Based Models | CBR-ANN [47] | CBR-ANN effectively selects relevant past cases, improving cost prediction accuracy | Improves cost estimation accuracy, reduces design uncertainty, and aids decision-making in new product development projects |
Hybrid and Optimization-Based Models | Random Forest–Genetic Algorithm (RF-GA) [48] | RF-GA optimizes Random Forest hyperparameters using Genetic Algorithms, enhancing learning efficiency with complex, nonlinear delay dependencies | Improves delay prediction accuracy, enables better risk management, and enhances project planning |
Hybrid and Optimization-Based Models | Principal component analysis–ANN (PCA-ANN) [63] | PCA-ANN reduces dimensionality while retaining key information, improving ANNs’ performance in cost prediction | Provides more accurate and practical cost estimations, enabling better pricing and financial planning |
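The table describes KNN as estimating costs from historical project data without a complex training process. A minimal sketch of that idea follows, assuming Euclidean distance, features already scaled to comparable ranges, and an unweighted mean of the neighbors' costs. The function name `knn_estimate` and the toy project data are hypothetical illustrations and are not taken from any of the reviewed studies.

```python
import math


def knn_estimate(history, query, k=3):
    """Estimate a cost as the mean cost of the k most similar past projects.

    history: list of (feature_vector, cost) pairs from completed projects.
    query:   feature vector of the new project (same length as stored vectors).
    Features are assumed to be scaled to comparable ranges beforehand.
    """
    if not history:
        raise ValueError("history must not be empty")
    # Rank past projects by Euclidean distance to the new project.
    ranked = sorted(history, key=lambda case: math.dist(case[0], query))
    nearest = ranked[:k]
    # Average the known costs of the closest cases.
    return sum(cost for _, cost in nearest) / len(nearest)


# Hypothetical toy data: (floor area in 1000 m^2, storeys) -> cost in million USD.
past_projects = [
    ((1.0, 2), 1.2),
    ((1.1, 2), 1.3),
    ((3.0, 5), 4.0),
    ((3.2, 6), 4.4),
    ((8.0, 12), 11.5),
]

estimate = knn_estimate(past_projects, (1.05, 2), k=2)
print(f"Estimated cost: {estimate:.2f} million USD")
```

There is no fitting step: all work happens at query time, which is why such models need no training phase but depend heavily on the size and relevance of the stored case base.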
Study | Preferred Technique | Model | Evaluation Metrics | Reported Accuracy Performance | Proposed Future Directions/Recommendations |
---|---|---|---|---|---|
[35] | CNN | Deep learning techniques | the MAPE value was estimated at 10.02%; with 400,000 cases it reached 6.34% while predicting the cost | 89.98% | Optimize CNN architectures for better computational efficiency |
[8] | ANN | Engineering costs were accurately estimated through the ANN, with an MAPE of 14.5%. | 86.35% | Conduct real-world validation and pilot testing in active engineering consultancy projects | |
[42] | ANN | MAPE errors were expected to be smaller than 15% | 85% | Aim to expand dataset size, refine input variable selection, and explore deep learning approaches | |
[43] | ANN | The ANN model achieved RMSE = 138.103, R2 = 0.990, MAE = 114.589, and APE = 7.770% | 92.23%. | Integrate external economic factors, expand dataset size, and refine ANN architectures | |
[45] | DNN | DNN model achieved R2 = 0.95 and MAPE = 12.91% | 87.09% | Refine input variables, integrate real-time economic data, and test alternative deep learning models | |
[52] | CNN | The proposed CNN model achieved the best predictive performance, with an MAPE of 1.67% | 94.53% | Focus on optimizing CNNs for cost modeling and exploring hybrid models | |
[53] | ANN | Mean absolute percentage error (MAPE) of 5.84% | 99% | Expand datasets and refine ANN architectures for better generalization | |
[55] | CNN | The area under the curve (AUC) for the proposed model was 0.70 | 92%, | Improving interpretability of deep learning-based predictions and enhancing cost-sensitive learning for different clinical applications | |
[56] | ANN | R2 score for sum cost: 0.80; R2 score for total cost: 0.65 | 81% to 89% | Improving machine learning model stability and accuracy through better data preprocessing techniques | |
[58] | ANN | The best ANN model (11-3-1 architecture with Bayesian regularization) achieved R2 = 0.9922 and RMSE = 0.02469 | 99.22%. | Development of ANN-based cost estimation tools for real-world application | |
[59] | ANN | R2 is likely between 0.98 and 1.00, and MAE ≈ 0.1 to 1 (depending on the scale) | 99% | Enhancing the framework for broader applications | |
[60] | ANN | The mean squared error (MSE) for optimal ANN models ranged between 0.0026 and 0.09, confirming the model’s reliability | 99.9% | Using real project cost data instead of relying solely on RSMeans | |
[70] | ANN | This model showed mean absolute error (MAE): 0.2952 and root-mean-square error (RMSE): 0.5638 | 99.9995% | Further exploration of nonlinear and complex factors affecting cost estimation | |
[67] | ANN | ANN model structured with one hidden layer and 15 neurons achieved an R2 of 0.8417 | 84.17% | Implementing ANN models into real-world VRF heating control systems | |
[34] | LTSM | The LSTM model predicted engineering cost indices, achieving MAE: 0.96, MSE: 1.03, and MAPE: 0.71 | of 99.29%. | Further optimization of LSTM models for broader applications | |
[37] | SVM | Machine learning techniques | Support vector machine (SVM) cost prediction, with an accuracy of R = 0.993 and RMSE = 2.44%. | 97.56%. | Apply more machine learning methods and metaheuristic optimization to improve cost estimation accuracy |
[46] | SVM | Significant error reduction compared to BPNN and SVM | 88% to 93%. | Future research should explore larger datasets, integrate more economic indicators, and test alternative machine learning architectures | |
[32] | SVM | SVM achieved an MAPE of 7.06% for cost estimation | 92.94% | Further validation on diverse project types; integration of additional machine learning techniques | |
[49] | RF | The RF model achieved the best predictive performance, with an adjusted R2 of 47.5%, mean absolute error (MAE) of EUR 1338, and hit ratio (HiR) of 67% | 67% | Use real item data to increase the accuracy of the model | |
[50] | RF | In terms of MAPE, RF ranged from 12% to | 88.48% | Future research should explore improving risk adjustment models and assessing RF’s computational efficiency for real-world applications | |
[68] | XGBoost | XGBoost achieved R2 = 86.81 and RMSE = 4450.4 | 86.81% | Incorporating deep learning and metaheuristic approaches | |
[39] | XGBoost | XGBOOST achieved the highest accuracy, with R2 = 0.96 and MAPE = 19.9% | 80.1% | Future work should incorporate a broader dataset, more attributes, various building types, and an expanded lifecycle cost analysis | |
[40] | XGBoost | XGBOOST achieved the best accuracy: MAPE = 9.091%, | 90.909% | Recommendation to develop hybrid models incorporating fuzzy logic to handle uncertainties, improve rule generation for fuzzy systems, and explore deep learning for cost prediction | |
[41] | CBR | It generated a mean absolute estimate error of 14%. | 86% | Improve case similarity calculations and integrate sustainability metrics into other estimation models | |
[51] | SGB | The SGB model achieved the best performance, with an RMSE of 0.340 and cross-validation score = 0.858 | 86% | Use nature-inspired algorithms and deep learning models for cost prediction | |
[54] | GBM | GBM achieved the best R2 score, 0.90365 | 90.37% | Incorporating additional property transaction data from a larger geographical region | |
[57] | CBR | Paired t-tests showed a high correlation (0.999, p = 0.03), confirming the model’s statistical significance at a 95% confidence level | 98.8% | Investigating the deconstruction supply chain and market development | |
[61] | KNN | The KNN model had the lowest MMRE (0.101), RMSE (0.547), and BRE (0.205), confirming its superior cost prediction accuracy | 90.24% | Expanding the dataset with real-world software project data | |
[62] | Regression Model | Regression models | MAPE was evaluated at 5.7% | 94.3% | Incorporating real-time cost data beyond RSMeans |
[64] | Regression Model | Quadratic regression model: R2 = 0.86, MAPE = 9.12% (training), and MAPE = 7.82% | 86% | Exploring hybrid ANN–regression models | |
[65] | GPR | Best model (GPR with ARD–exponential): R = 0.89 and MAPE = 11.60% | 88.4% | Exploring cost estimation as a classification problem instead of regression | |
[66] | GBR | Improves healthcare cost prediction accuracy, with Gradient Boosting achieving a MAPE of 2.04% | 97.96% | Evaluating deep learning models for cost prediction | |
[69] | LR | The MAPE value was 2.11%, and R2 = 0.9789 | 97.89% | Integration of deep learning techniques for enhanced prediction | |
[63] | PCA-ANN | Hybrid and optimization-based models | The MAPE values were 0.009 (training) and 0.015 (testing) | 99% | Expanding the dataset for improved accuracy |
[47] | CBR-ANN | The CBR-ANN model (ANN-PA), with an MAPE value of 5.95%, achieved the best accuracy, demonstrating its effectiveness in improving cost estimation | 94.05% | Refine attribute selection, expand the case base, and test deep learning models | |
[48] | RF-GA | The RF-GA model showed a classification error of 8.33% | 91.67% | Refine delay factor classification, integrate real-time data, and enhance optimization techniques | |
[44] | XGBoost–Random Forest (RF) | The XGBoost-RF model achieved R2 = 0.87 and MAPE = 0.25 | 75% | Refine input selection, integrate deep learning, and explore economic impacts on cost estimation |
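For many rows in the table above, the accuracy figure appears to equal 100% minus the reported percentage error (for example, an MAPE of 7.06% is listed as 92.94%). This relationship is inferred from the tabulated values and is not a formula stated in this section. The sketch below checks it against a few rows; the function and variable names are illustrative only.

```python
# Illustrative check: does "accuracy" equal 100 minus the reported error (in %)?
# Assumption: this mapping is inferred from the table, not stated by the authors.

def accuracy_from_error(error_pct: float) -> float:
    """Return 100 minus a percentage error, rounded to three decimals."""
    if error_pct < 0:
        raise ValueError("error percentage must be non-negative")
    return round(100.0 - error_pct, 3)

# (study, reported error in %, accuracy in % as listed in the table)
ROWS = [
    ("[32]", 7.06, 92.94),    # SVM, MAPE
    ("[40]", 9.091, 90.909),  # XGBoost, MAPE
    ("[39]", 19.9, 80.1),     # XGBoost, MAPE
    ("[62]", 5.7, 94.3),      # regression, MAPE
    ("[65]", 11.60, 88.4),    # GPR, MAPE
    ("[66]", 2.04, 97.96),    # GBR, MAPE
    ("[69]", 2.11, 97.89),    # LR, MAPE
    ("[47]", 5.95, 94.05),    # CBR-ANN, MAPE
    ("[48]", 8.33, 91.67),    # RF-GA, classification error
]

def mismatches(rows, tol=1e-9):
    """List the studies whose tabulated accuracy differs from 100 - error."""
    return [study for study, err, acc in rows
            if abs(accuracy_from_error(err) - acc) > tol]

if __name__ == "__main__":
    print("rows that do not follow 100 - error:", mismatches(ROWS))
```

Not every row follows this pattern: some accuracy values instead match the reported R2 (for example [54] and [64]) or a hit ratio ([49]), so the accuracy column mixes metrics and is not directly comparable across studies.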
Model Domain | Study | Bias Level | Reason for Assessment |
---|---|---|---|
Deep Learning Techniques | [35] | Moderate | High computational requirements for 3D CNN training; lack of processing requirement data (e.g., surface roughness, machining precision) |
Deep Learning Techniques | [8] | Moderate | The model has not been externally validated on new projects beyond the training dataset |
Deep Learning Techniques | [42] | Moderate | High variance in project specifications, and sensitivity to input variable selection |
Deep Learning Techniques | [43] | High | Variability in mining conditions, and lack of supply chain influence |
Deep Learning Techniques | [45] | Moderate | Difficulty in incorporating real-time economic trends, and project complexity |
Deep Learning Techniques | [52] | Low | CNNs require higher computational time; however, they improve cost prediction accuracy |
Deep Learning Techniques | [53] | Moderate | Potential overfitting with increasing neuron counts |
Deep Learning Techniques | [55] | Low | Imbalanced class distribution of readmission cases, need for higher sensitivity in predicting rare but costly misclassifications, and complexity of heterogeneous data integration |
Deep Learning Techniques | [56] | Low | High data deviation led to unstable learning performance in some cases |
Deep Learning Techniques | [34] | Moderate | Noise, anomalies, missing information, and data frequency differences |
Deep Learning Techniques | [58] | Moderate | Neural networks require high computational resources |
Deep Learning Techniques | [59] | High | Optimization constraints and data limitations |
Deep Learning Techniques | [60] | Low | Sensitivity to variations in cost estimation due to local unit cost differences |
Deep Learning Techniques | [67] | Moderate | Potential variability in real-world heating conditions; requires extensive data for model training |
Deep Learning Techniques | [70] | Moderate | Exclusion of bidding, tender negotiation, supply chain, and safety constraints |
Machine Learning Models | [37] | Moderate | The model is limited to low-volume forest roads and may not be generalizable to highway or urban road construction |
Machine Learning Models | [39] | Moderate | Lack of external support considerations; exclusion of social and environmental factors |
Machine Learning Models | [40] | Moderate | Handling uncertainty, high computational complexity, and model interpretability issues |
Machine Learning Models | [41] | High | Subjectivity in similarity assessment; lack of consideration for external economic factors |
Machine Learning Models | [46] | Moderate | Data availability constraints; economic fluctuations affecting construction costs |
Machine Learning Models | [49] | High | Potential difficulty in generalizing results to larger populations |
Machine Learning Models | [50] | Low | RF is computationally expensive; all models underpredicted high-cost cases, leading to bias against high-cost hospitals |
Machine Learning Models | [51] | High | Model training time, and the impact of missing data |
Machine Learning Models | [54] | Moderate | GBM and RF models showed high bias for extreme property values |
Machine Learning Models | [57] | High | Lack of a well-established supply chain for salvaged materials, and the need for greater adoption of deconstruction methods |
Machine Learning Models | [61] | High | KNN requires optimal K-value selection for optimal performance |
Machine Learning Models | [68] | Moderate | Potential difficulty in generalizing results to larger populations |
Machine Learning Models | [32] | Moderate | Normalization process required to ensure uniformity in data; limited dataset may impact model generalization |
Regression Model | [38] | High | Data imbalance affected model performance across different component categories |
Regression Model | [62] | Moderate | RSMeans cost data may not fully reflect local variations |
Regression Model | [64] | Moderate | Multi-collinearity in cost data affecting regression models |
Regression Model | [65] | Moderate | Complexity of infrastructure cost estimation due to multiple influencing factors |
Regression Model | [66] | Moderate | Medical predictors did not significantly improve accuracy; fine-grained features increase dimensionality |
Regression Model | [69] | Moderate | Medical predictors had a minimal impact on forecast accuracy |
Hybrid and Optimization-Based Models | [44] | Moderate | Data variability; political and economic instability affecting construction projects |
Hybrid and Optimization-Based Models | [47] | High | Limited case base for training, sensitivity to attribute weighting, and need for expert validation |
Hybrid and Optimization-Based Models | [48] | Moderate | External factors affecting delays, and sensitivity to parameter selection |
Hybrid and Optimization-Based Models | [63] | High | Multi-collinearity in cost data impacts traditional regression models |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shamim, M.M.I.; Hamid, A.B.b.A.; Nyamasvisva, T.E.; Rafi, N.S.B. Advancement of Artificial Intelligence in Cost Estimation for Project Management Success: A Systematic Review of Machine Learning, Deep Learning, Regression, and Hybrid Models. Modelling 2025, 6, 35. https://doi.org/10.3390/modelling6020035