Predictive Analysis for Optimizing Port Operations
Abstract
1. Introduction
- Defining the predictive-analysis problem for port operations, which is lacking in the current literature.
- An exhaustive comparison of methods, across multiple metrics, for the prediction and classification of Total Time (Stay Time) and Delay Time at the port, to obtain a holistic view.
- Discovery of the key factors that affect port operations using feature importance.
- SHAP (Shapley Additive Explanations) analysis to understand the contribution of these key factors to the output.
- The combined analysis and inference are used to improve decision-making for effective port operations.
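The feature-importance step listed above can be sketched as follows. This is a minimal illustration on synthetic data with hypothetical feature names, not the paper's port dataset or pipeline; the SHAP step would additionally use the `shap` package (e.g., a tree explainer over the fitted model).

```python
# Sketch of a feature-importance analysis with a tree ensemble.
# The data and feature names below are hypothetical stand-ins.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=4, random_state=0)
feature_names = ["cargo_volume", "berth_id", "vessel_type", "arrival_hour"]  # hypothetical

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; rank features from most to least influential.
ranking = sorted(zip(feature_names, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, imp in ranking:
    print(f"{name}: {imp:.3f}")
```

A SHAP analysis would go one step further, attributing each individual prediction to the features rather than reporting a single global ranking.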
2. Literature Review
3. Case Study and Methodology
4. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
API | Application Programming Interface |
BAP | Berth Allocation Problem |
PMS | Predictive Model Scheduling |
KPI | Key Performance Indicator |
SHAP | Shapley Additive Explanations |
TEU | Twenty-foot Equivalent Unit |
XGBoost | Extreme Gradient Boosting |
LightGBM | Light Gradient Boosting Machine |
SVM | Support Vector Machine |
RF | Random Forest |
KNN | k-Nearest Neighbor |
ResNet | Residual Network |
MAE | Mean Absolute Error |
MSE | Mean Squared Error |
RMSE | Root Mean Squared Error |
RMSLE | Root Mean Squared Logarithmic Error |
R-squared | Coefficient of Determination |
MAPE | Mean Absolute Percentage Error |
AUC | Area Under the Receiver Operating Characteristic Curve |
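For reference, the regression metrics abbreviated above can be written out directly. A minimal pure-Python sketch on a small hypothetical example (the numbers are illustrative, not from the paper):

```python
import math

# Illustrative values only; not the paper's data.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 41.0]
n = len(y_true)

errors = [p - t for p, t in zip(y_pred, y_true)]
mae = sum(abs(e) for e in errors) / n            # Mean Absolute Error
mse = sum(e * e for e in errors) / n             # Mean Squared Error
rmse = math.sqrt(mse)                            # Root Mean Squared Error
rmsle = math.sqrt(sum((math.log1p(p) - math.log1p(t)) ** 2
                      for p, t in zip(y_pred, y_true)) / n)  # RMS Logarithmic Error
mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n  # Mean Abs. Pct. Error

# R-squared: 1 minus residual sum of squares over total sum of squares.
mean_y = sum(y_true) / n
ss_res = sum(e * e for e in errors)
ss_tot = sum((t - mean_y) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(f"MAE={mae}, MSE={mse}, RMSE={rmse:.4f}, R2={r2}")
```

RMSLE penalizes relative rather than absolute error, which is why it is often reported alongside MAPE for duration targets with a wide range.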
References
- Ogura, T.; Inoue, T.; Uchihira, N. Prediction of Arrival Time of Vessels Considering Future Weather Conditions. Appl. Sci. 2021, 11, 4410.
- Kolley, L.; Rückert, N.; Kastner, M.; Jahn, C.; Fischer, K. Robust berth scheduling using machine learning for vessel arrival time prediction. Flex. Serv. Manuf. J. 2022, 35, 29–69.
- Mekkaoui, S.E.; Benabbou, L.; Berrado, A. Machine Learning Models for Efficient Port Terminal Operations: Case of Vessels’ Arrival Times Prediction. IFAC-PapersOnLine 2022, 55, 3172–3177.
- Nguyen, T.; Zhang, J.; Zhou, L.; He, Y. A data-driven optimization of large-scale dry port location using the hybrid approach of data mining and complex network theory. Transp. Res. Part E Logist. Transp. Rev. 2019, 134, 101816.
- Abreu, L.; Maciel, I.; Alves, J.; Braga, L.; Pontes, H. A decision tree model for the prediction of the stay time of ships in Brazilian ports. Eng. Appl. Artif. Intell. 2022, 117, 105634.
- Dimitrios, D.; Nikitakos, N.; Papachristos, D.; Dalaklis, A. Opportunities and Challenges in Relation to Big Data Analytics for the Shipping and Port Industries; Palgrave Macmillan: Cham, Switzerland, 2023; pp. 267–290.
- Reggiannini, M.; Righi, M.; Tampucci, M.; Duca, A.L.; Bacciu, C.; Bedini, L.; D’Errico, A.; Paola, C.D.; Marchetti, A.; Martinelli, M.; et al. Remote Sensing for Maritime Prompt Monitoring. J. Mar. Sci. Eng. 2019, 7, 202.
- Bautista-Sánchez, R.; Barbosa-Santillán, L.I.; Sánchez-Escobar, J.J. Statistical Approach in Data Filtering for Prediction Vessel Movements Through Time and Estimation Route Using Historical AIS Data. In Proceedings of the Mexican International Conference on Artificial Intelligence, Xalapa, Mexico, 27 October–2 November 2019.
- Eriksen, T.; Hoye, G.; Narheim, B.T.; Meland, B.J. Maritime traffic monitoring using a space-based AIS receiver. Acta Astronaut. 2006, 58, 537–549.
- Iris, Ç.; Lam, J.S.L. A review of energy efficiency in ports: Operational strategies, technologies and energy management systems. Renew. Sustain. Energy Rev. 2019, 112, 170–182.
- Izaguirre, C.; Losada, I.J.; Camus, P.; Vigh, J.L.; Stenek, V. Climate change risk to global port operations. Nat. Clim. Chang. 2020, 11, 14–20.
- Dulebenets, M.A. A comprehensive multi-objective optimization model for the vessel scheduling problem in liner shipping. Int. J. Prod. Econ. 2018, 196, 293–318.
- van Boetzelaer, F.B.; van den Boom, T.; Negenborn, R.R. Model predictive scheduling for container terminals. IFAC Proc. Vol. 2014, 47, 5091–5096.
- Weerasinghe, B.A.; Perera, H.N.; Bai, X. Optimizing container terminal operations: A systematic review of operations research applications. Marit. Econ. Logist. 2023, 26, 307–341.
- Bierwirth, C.; Meisel, F. A follow-up survey of berth allocation and quay crane scheduling problems in container terminals. Eur. J. Oper. Res. 2015, 244, 675–689.
- Guo, L.; Zheng, J.F.; Liang, J.; Wang, S. Column generation for the multi-port berth allocation problem with port cooperation stability. Transp. Res. Part B Methodol. 2023, 171, 3–28.
- Rodrigues, F.; Agra, A. Berth allocation and quay crane assignment/scheduling problem under uncertainty: A survey. Eur. J. Oper. Res. 2022, 303, 501–524.
- Farag, Y.B.; Ölçer, A.I. The development of a ship performance model in varying operating conditions based on ANN and regression techniques. Ocean Eng. 2020, 198, 106972.
- López-Bermúdez, B.; Freire-Seoane, M.J.; González-Laxe, F. Efficiency and productivity of container terminals in Brazilian ports (2008–2017). Util. Policy 2019, 56, 82–91.
- Wang, H.; Liu, Z.; Wang, X.; Graham, T.L.; Wang, J. An analysis of factors affecting the severity of marine accidents. Reliab. Eng. Syst. Saf. 2021, 210, 107513.
- Cao, Y.; Wang, X.; Yang, Z.; Wang, J.; Wang, H.; Liu, Z. Research in marine accidents: A bibliometric analysis, systematic review and future directions. Ocean Eng. 2023, 284, 115048.
- Lim, S.; Pettit, S.; Abouarghoub, W.; Beresford, A. Port sustainability and performance: A systematic literature review. Transp. Res. Part D Transp. Environ. 2019, 72, 47–64.
- Chu, Z.; Yan, R.; Wang, S. Vessel turnaround time prediction: A machine learning approach. Ocean. Coast. Manag. 2024, 249, 107021.
- Štepec, D.; Martinčič, T.; Klein, F.; Vladušič, D.; Costa, J.P. Machine Learning based System for Vessel Turnaround Time Prediction. In Proceedings of the 2020 21st IEEE International Conference on Mobile Data Management (MDM), Versailles, France, 30 June–3 July 2020; pp. 258–263.
- Abreu, L.; Maciel, I.; Alves, J.; Braga, L.; Pontes, H. Dataset—Stay Time of Ships in Brazilian Ports in 2018. 2022. Available online: https://www.researchgate.net/publication/365589683_Dataset_-_Stay_Time_of_Ships_in_Brazilian_Ports_in_2018?channel=doi&linkId=6379043e54eb5f547ce6ee87&showFulltext=true (accessed on 4 March 2025).
- Cabral, A.M.R.; Ramos, F.S. Efficiency Container Ports In Brazil: A DEA And FDH Approach. Cent. Eur. Rev. Econ. Manag. (CEREM) 2018, 2, 43–64.
- da Veiga Lima, F.; de Souza, D. Climate change, seaports, and coastal management in Brazil: An overview of the policy framework. Reg. Stud. Mar. Sci. 2022, 52, 102365.
- Galvao, C.B.; Robles, L.T.; Guerise, L.C. 20 years of port reform in Brazil: Insights into the reform process. Res. Transp. Bus. Manag. 2017, 22, 153–160.
- Costa, W.; Soares-Filho, B.; Nobrega, R. Can the Brazilian National Logistics Plan Induce Port Competitiveness by Reshaping the Port Service Areas? Sustainability 2022, 14, 14567.
- Cacho, J.; Tokarski, A.; Thomas, E.; Chkoniya, V. Port Data Integration: Opportunities for Optimization and Value Creation; IGI Global: Hershey, PA, USA, 2021; pp. 1–22.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
- Freund, Y.; Schapire, R. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30.
- Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
- Louppe, G.; Wehenkel, L.; Sutera, A.; Geurts, P. Understanding variable importances in forests of randomized trees. In Proceedings of the Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–8 December 2013.
- James, G.M.; Witten, D.M.; Hastie, T.J.; Tibshirani, R. An Introduction to Statistical Learning. In Springer Texts in Statistics; Springer: New York, NY, USA, 2013.
- Cortes, C.; Vapnik, V.N. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
- Hand, D.J.; Yu, K. Idiot’s Bayes—Not So Stupid After All? Int. Stat. Rev. 2001, 69, 385–398.
- Altman, N.S. An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. Am. Stat. 1992, 46, 175–185.
- Zou, H.; Hastie, T.J. Addendum: Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2005, 67, 768.
- Abiodun, O.I.; Jantan, A.B.; Omolara, A.E.; Dada, K.V.; Mohamed, N.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938.
- Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2014, 61, 85–117.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Kohavi, R. A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, QC, Canada, 20–25 August 1995.
- Menze, B.H.; Kelm, B.M.; Masuch, R.; Himmelreich, U.; Bachert, P.; Petrich, W.; Hamprecht, F.A. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinform. 2009, 10, 213.
- Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
- Dahouda, M.K.; Joe, I. A Deep-Learned Embedding Technique for Categorical Features Encoding. IEEE Access 2021, 9, 114381–114391.
- Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? In Proceedings of the Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022.
Model | MAE | MSE | RMSE | R-squared | RMSLE | MAPE
---|---|---|---|---|---|---
Random Forest | 5.3336 | 108.9832 | 10.4377 | 0.8241 | 0.2992 | 0.1358 |
Extra Trees | 5.0287 | 121.0471 | 10.9998 | 0.8046 | 0.3112 | 0.1282 |
Extreme Gradient Boosting | 7.4763 | 125.7988 | 11.2152 | 0.7969 | 0.3343 | 0.1974 |
ResNet | 7.4643 | 126.6790 | 11.2551 | 0.7955 | 0.3418 | 0.2033 |
Neural Network | 7.551 | 127.9665 | 11.3122 | 0.7934 | 0.3547 | 0.2192 |
Light Gradient Boosting Machine | 8.3872 | 147.0024 | 12.1234 | 0.7627 | 0.3648 | 0.2256 |
Decision Tree | 5.4081 | 165.8789 | 12.8762 | 0.7322 | 0.3469 | 0.1378 |
K Nearest Neighbors | 8.1328 | 184.7665 | 13.5920 | 0.7017 | 0.3968 | 0.2074 |
Gradient Boosting | 10.4030 | 211.0671 | 14.5265 | 0.6593 | 0.4238 | 0.2728 |
Bayesian Ridge | 11.8099 | 273.8831 | 16.5479 | 0.5579 | 0.4772 | 0.3203 |
Ridge Regression | 11.8092 | 273.8808 | 16.5479 | 0.5579 | 0.4773 | 0.3202 |
Linear Regression | 11.8089 | 273.8825 | 16.5479 | 0.5579 | 0.4774 | 0.3201 |
Elastic Net | 12.8900 | 314.4793 | 17.7318 | 0.4924 | 0.5022 | 0.3847 |
Lasso Regression | 12.9111 | 315.2336 | 17.7531 | 0.4911 | 0.5023 | 0.3845 |
Huber Regressor | 12.5049 | 315.4952 | 17.7600 | 0.4907 | 0.5316 | 0.3412 |
AdaBoost | 18.0348 | 502.1311 | 22.3968 | 0.1889 | 0.6748 | 0.7616 |
Model | MAE | MSE | RMSE | R-squared | RMSLE | MAPE
---|---|---|---|---|---|---
Random Forest | 0.8102 | 17.8800 | 4.2179 | 0.6160 | 0.4997 | 0.5579 |
ResNet | 1.2444 | 17.8963 | 4.2304 | 0.6165 | 0.5610 | 0.6188 |
Extreme Gradient Boosting | 1.2285 | 18.1063 | 4.2470 | 0.6112 | 0.6103 | 0.7758 |
Extra Trees | 0.7304 | 19.8816 | 4.4488 | 0.5729 | 0.4981 | 0.5186 |
Light Gradient Boosting Machine | 1.3226 | 20.9450 | 4.5700 | 0.5502 | 0.6241 | 0.8042 |
Neural Network | 1.4564 | 24.6622 | 4.9661 | 0.4715 | 0.7127 | 0.8734 |
Decision Tree | 0.7571 | 26.6984 | 5.1538 | 0.4255 | 0.5464 | 0.5569 |
Gradient Boosting | 1.7824 | 29.9394 | 5.4656 | 0.3574 | 0.7811 | 0.9428 |
K Nearest Neighbors | 1.3783 | 35.7453 | 5.9735 | 0.2310 | 0.6818 | 0.7387 |
Bayesian Ridge | 2.3272 | 42.8552 | 6.5377 | 0.0815 | 0.9559 | 1.1375 |
Ridge Regression | 2.3273 | 42.8554 | 6.5378 | 0.0815 | 0.9564 | 1.1423 |
Linear Regression | 2.3271 | 42.8556 | 6.5378 | 0.0815 | 0.9564 | 1.1425 |
Elastic Net | 2.0899 | 44.3601 | 6.6513 | 0.0494 | 0.8636 | 0.9716 |
Lasso Regression | 2.1192 | 44.7360 | 6.6794 | 0.0414 | 0.8685 | 0.9946 |
Huber Regressor | 1.2250 | 48.0613 | 6.9232 | −0.0299 | 0.7032 | 0.9955 |
AdaBoost | 3.2974 | 54.0987 | 7.3534 | −0.1708 | 1.1535 | 1.6455 |
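The regression comparisons in the two tables above follow a standard benchmarking pattern: fit many models on the same split and score them with the same metrics. A minimal sketch of that pattern on synthetic data (not the paper's dataset, model settings, or full model list):

```python
# Sketch of a multi-model regression benchmark, assuming scikit-learn.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "Ridge Regression": Ridge(),
}

results = {}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    mse = mean_squared_error(y_te, pred)
    results[name] = {
        "MAE": mean_absolute_error(y_te, pred),
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "R2": r2_score(y_te, pred),
    }

# Rank models by RMSE, as in the tables above.
for name, m in sorted(results.items(), key=lambda kv: kv[1]["RMSE"]):
    print(f"{name}: RMSE={m['RMSE']:.2f}, R2={m['R2']:.3f}")
```

Sorting by RMSE and sorting by R-squared give the same ranking on a single test set, since both are monotone in the residual sum of squares.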
Model | Accuracy | AUC | Recall | Precision | F1 | Kappa |
---|---|---|---|---|---|---|
Extreme Gradient Boosting | 0.8424 | 0.9716 | 0.8424 | 0.8426 | 0.8424 | 0.7797 |
Extra Trees | 0.8386 | 0.9453 | 0.8386 | 0.8386 | 0.8386 | 0.7743 |
Random Forest | 0.8360 | 0.9634 | 0.8360 | 0.8361 | 0.8360 | 0.7707 |
Decision Tree | 0.8220 | 0.8775 | 0.8220 | 0.8219 | 0.8219 | 0.7511 |
ResNet | 0.8165 | 0.8503 | 0.8165 | 0.8168 | 0.8166 | 0.7785 |
Neural Network | 0.8108 | 0.8369 | 0.8108 | 0.8111 | 0.8109 | 0.7628 |
Light Gradient Boosting Machine | 0.7978 | 0.9518 | 0.7978 | 0.7981 | 0.7977 | 0.7165 |
Gradient Boosting | 0.7210 | 0.9051 | 0.7210 | 0.7235 | 0.7190 | 0.6046 |
K Nearest Neighbors | 0.7160 | 0.8962 | 0.7160 | 0.7171 | 0.7156 | 0.6009 |
Linear Discriminant Analysis | 0.6344 | 0.8353 | 0.6344 | 0.6355 | 0.6281 | 0.4788 |
Ridge Classifier | 0.6296 | 0.8194 | 0.6296 | 0.6355 | 0.6117 | 0.4661 |
Logistic Regression | 0.5997 | 0.8135 | 0.5997 | 0.6183 | 0.5626 | 0.4212 |
AdaBoost | 0.5511 | 0.7245 | 0.5511 | 0.5447 | 0.5457 | 0.3687 |
SVM | 0.5412 | 0.7749 | 0.5412 | 0.5808 | 0.5159 | 0.3382 |
Naive Bayes | 0.5342 | 0.7675 | 0.5342 | 0.5769 | 0.5136 | 0.3538 |
Quadratic Discriminant Analysis | 0.3429 | 0.7174 | 0.3429 | 0.4488 | 0.2962 | 0.1611 |
Model | Accuracy | AUC | Recall | Precision | F1 | Kappa |
---|---|---|---|---|---|---|
Extreme Gradient Boosting | 0.9511 | 0.9798 | 0.6923 | 0.7507 | 0.7202 | 0.6935 |
Extra Trees | 0.9498 | 0.9365 | 0.7014 | 0.7348 | 0.7176 | 0.6901 |
Random Forest | 0.9491 | 0.9701 | 0.6853 | 0.7364 | 0.7098 | 0.6820 |
ResNet | 0.9464 | 0.7975 | 0.6157 | 0.7484 | 0.6755 | 0.6489 |
Decision Tree | 0.9459 | 0.8394 | 0.6907 | 0.7079 | 0.6991 | 0.6694 |
Neural Network | 0.9460 | 0.7827 | 0.5832 | 0.7674 | 0.6627 | 0.6208 |
Light Gradient Boosting Machine | 0.9389 | 0.9688 | 0.4375 | 0.7996 | 0.5652 | 0.5354 |
K Nearest Neighbors | 0.9228 | 0.8741 | 0.4426 | 0.6027 | 0.5104 | 0.4695 |
Gradient Boosting | 0.9166 | 0.8990 | 0.1295 | 0.7371 | 0.2196 | 0.1979 |
Ridge Classifier | 0.9097 | 0.8418 | 0.0360 | 0.5570 | 0.0676 | 0.0573 |
Logistic Regression | 0.9079 | 0.8080 | 0.0436 | 0.4363 | 0.0792 | 0.0638 |
Linear Discriminant Analysis | 0.9059 | 0.8159 | 0.1250 | 0.4385 | 0.1945 | 0.1607 |
AdaBoost | 0.9056 | 0.8277 | 0.0684 | 0.3893 | 0.1162 | 0.0917 |
SVM | 0.8135 | 0.8375 | 0.2273 | 0.3695 | 0.1117 | 0.0688 |
Naive Bayes | 0.5680 | 0.7600 | 0.8872 | 0.1606 | 0.2720 | 0.1394 |
Quadratic Discriminant Analysis | 0.1167 | 0.5137 | 0.9978 | 0.0932 | 0.1705 | 0.0049 |
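The second classification table shows a class-imbalance effect worth noting: several linear models reach roughly 0.91 accuracy while recall and Kappa sit near zero, which is essentially what a majority-class predictor achieves when delayed vessels are rare. A small pure-Python illustration on synthetic labels (not the paper's data or class ratio):

```python
# Synthetic, imbalanced labels: 9% positive ("delayed"), for illustration only.
y_true = [1] * 9 + [0] * 91
y_pred = [0] * 100  # degenerate model that always predicts "not delayed"

n = len(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

# Recall on the positive (delayed) class: TP / (TP + FN).
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
recall = tp / (tp + fn)

# Cohen's kappa: observed agreement corrected for chance agreement p_e.
p_o = accuracy
p_true_pos, p_pred_pos = sum(y_true) / n, sum(y_pred) / n
p_e = p_true_pos * p_pred_pos + (1 - p_true_pos) * (1 - p_pred_pos)
kappa = (p_o - p_e) / (1 - p_e)

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}, kappa={kappa:.2f}")
```

Here accuracy is high while recall and kappa are zero, which is why the table reports recall, precision, F1, and Kappa alongside accuracy.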
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Rao, A.R.; Wang, H.; Gupta, C. Predictive Analysis for Optimizing Port Operations. Appl. Sci. 2025, 15, 2877. https://doi.org/10.3390/app15062877