On the K-Means Clustering Model for Performance Enhancement of Port State Control
Abstract
1. Introduction
2. Literature Review
3. Methodology
Model Evaluation Method
- Cluster quality
- Comparison with training results with labeled data
4. Prediction Model
4.1. Remove Useless Features and Extract New Features
4.2. Model Evaluation and Results
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Feature Name | Feature Meaning | Missing Value | Processing Method | Encoding Method |
---|---|---|---|---|
‘Dead Weight’ | Deadweight tonnage, a measure of how much weight a ship can carry | Yes | Filled with the mean | No encoding. |
‘flag performance’ | White, grey, black, not listed | No | “not listed” is filled with the mode of the feature in the training set | Label encoding: white→1; grey→2; black→3. |
‘RO performance’ | High, medium, low, very low, not listed | Yes | “not listed” is filled with the mode | Label encoding: high→1; medium→2; low→3; very low→4. |
‘Company performance’ | The performance of shipping companies, determined using the company performance matrix of the Tokyo MoU | No | “not listed” is filled with the mode | Label encoding: high→1; medium→2; low→3; very low→4. |
‘deficiency no’ | The number of deficiencies found in this inspection | No | | No encoding. |
‘detention’ | Whether the ship is detained in this inspection | No | | Label encoding. |
‘last deficiency no’ | The number of deficiencies found in the last initial inspection | Yes | Filled with the mode of the training set | No encoding. |
‘total detentions’ | Total number of detentions | No | | No encoding. |
‘casualty in 5 years’ | A binary variable indicating whether a ship has had a casualty accident in the past five years | No | | One-hot encoding: 1 if a casualty has occurred in the previous 5 years, 0 otherwise. |
‘flag changing times’ | Number of ship flag state changes | No | | No encoding. |
‘Length’ (meter) | The ship’s overall maximum length | Yes | Filled with the mean | No encoding. |
‘Beam’ (meter) | Hull width | Yes | Filled with the mean | No encoding. |
‘Depth’ (meter) | Vertical distance between the side upper deck and the underside of the keel | Yes | Filled with the mean | No encoding. |
‘Speed’ | The speed of the ship | Yes | Filled with the mean | No encoding. |
‘last_36_months_avg_def_no’ | Average number of deficiencies per initial inspection in the past 36 months | No | | No encoding. |
‘last_36_months_all_det_no’ | Total number of detentions at initial inspection in the past 36 months | No | | No encoding. |
‘last_inspection_state’ | A binary variable indicating whether the ship was held in the last initial inspection: 1 if it was held, 0 if it was not | No | | No encoding. |
‘Classification Society’ | A non-governmental organization that creates and upholds technical guidelines for the design, manufacture, and use of ships and offshore structures | No | | One-hot encoding. |
‘Ship Type_PSC’ | The ship category: bulk carriers, container ships, general/multipurpose ships, passenger ships, oil tankers, and other ship categories | No | | One-hot encoding: is bulk carrier: 1 for bulk carriers, 0 otherwise; is container ship: 1 for container ships, 0 otherwise; is general cargo/multipurpose: 1 for such ships, 0 otherwise; is passenger ship: 1 for such ships, 0 otherwise; is tanker: 1 for such vessels, 0 otherwise; is other: 1 for other ship categories, 0 otherwise. |
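
The processing and encoding rules in the feature table can be illustrated with a minimal sketch. It uses only the Python standard library and covers three representative features: mean fill for ‘Dead Weight’, mode fill plus label encoding for ‘flag performance’, and one-hot encoding for ‘Ship Type_PSC’. The toy records, the dictionary keys, and the function name `preprocess` are assumptions made for illustration and do not come from the paper's code or data.

```python
from statistics import mean, mode

# Hypothetical toy records; the values are made up for illustration only.
ships = [
    {"dead_weight": 52000.0, "flag_performance": "white", "ship_type": "bulk carrier"},
    {"dead_weight": None, "flag_performance": "not listed", "ship_type": "tanker"},
    {"dead_weight": 30000.0, "flag_performance": "grey", "ship_type": "bulk carrier"},
    {"dead_weight": 8000.0, "flag_performance": "white", "ship_type": "passenger ship"},
]

# Label codes as listed in the feature table.
FLAG_CODES = {"white": 1, "grey": 2, "black": 3}

# One-hot column per ship category (column names are assumed).
SHIP_TYPE_COLUMNS = {
    "bulk carrier": "is_bulk_carrier",
    "container ship": "is_container_ship",
    "general cargo/multipurpose": "is_general_cargo_multipurpose",
    "passenger ship": "is_passenger_ship",
    "tanker": "is_tanker",
}
ALL_TYPE_COLUMNS = list(SHIP_TYPE_COLUMNS.values()) + ["is_other"]


def preprocess(records):
    """Fill missing values and encode categories for a list of ship records."""
    # Fill statistics are computed from the records passed in,
    # which stand in for the training set here.
    dw_mean = mean(r["dead_weight"] for r in records if r["dead_weight"] is not None)
    flag_mode = mode(
        r["flag_performance"] for r in records if r["flag_performance"] != "not listed"
    )

    rows = []
    for r in records:
        dead_weight = r["dead_weight"] if r["dead_weight"] is not None else dw_mean
        flag = r["flag_performance"]
        if flag == "not listed":
            flag = flag_mode  # "not listed" is replaced by the most frequent value
        row = {"dead_weight": dead_weight, "flag_performance": FLAG_CODES[flag]}
        # One-hot encoding: exactly one ship-type column is set to 1.
        row.update({col: 0 for col in ALL_TYPE_COLUMNS})
        row[SHIP_TYPE_COLUMNS.get(r["ship_type"], "is_other")] = 1
        rows.append(row)
    return rows


processed = preprocess(ships)
for row in processed:
    print(row)
```

In practice the fill statistics would be computed once on the training set and reused for the test set, as the ‘flag performance’ row specifies.
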
Cluster/Features | 1 | 2 | 3 |
---|---|---|---|
No. of ships | 252 | 1680 | 1005 |
No. of ships in HRS (rate) | HRS:225 (0.89) | HRS:547 (0.33) | HRS:494 (0.12) |
No. of ships in SRS (rate) | SRS:26 (0.10) | SRS:809 (0.48) | SRS:389 (0.49) |
No. of ships in LRS (rate) | LRS:1 (0.0039) | LRS:324 (0.19) | LRS:122 (0.39) |
Clustering performance on training set (accuracy) | 0.89 | 0.33 | 0.39 |
Clustering performance on test set (accuracy) | 0.86 | 0.35 | 0.40 |
Average no. of deficiencies | 11.83 | 3.96 | 2.30 |
Prediction performance on training set (MSE) | 79.25 | 15.86 | 8.50 |
Prediction performance on test set (MSE) | 75.16 | 16.00 | 8.16 |
Total number of detentions | 71 | 42 | 3 |
Average detention rate | 0.28 | 0.03 | 0.0030 |
Prediction performance on training set (Brier score) | 0.20 | 0.024 | 0.0030 |
Prediction performance on test set (Brier score) | 0.23 | 0.022 | 0.0043 |
Distribution of deficiency code | {‘01’: 234, ‘02’: 54, ‘03’: 254, ‘04’: 184, ‘05’: 176, ‘06’: 12, ‘07’: 631, ‘08’: 20, ‘09’: 228, ‘10’: 513, ‘11’: 339, ‘12’: 3, ‘13’: 69, ‘14’: 144, ‘15’: 83, ‘18’: 24, ‘99’: 14} | {‘01’: 395, ‘02’: 62, ‘03’: 515, ‘04’: 452, ‘05’: 286, ‘06’: 23, ‘07’: 1450, ‘08’: 97, ‘09’: 662, ‘10’: 1026, ‘11’: 866, ‘12’: 1, ‘13’: 199, ‘14’: 356, ‘15’: 88, ‘18’: 96, ‘99’: 76} | {‘01’: 119, ‘02’: 18, ‘03’: 167, ‘04’: 163, ‘05’: 125, ‘06’: 5, ‘07’: 481, ‘08’: 37, ‘09’: 272, ‘10’: 330, ‘11’: 235, ‘12’: 4, ‘13’: 94, ‘14’: 125, ‘15’: 28, ‘18’: 59, ‘99’: 45} |
Prediction performance on training set (MSE) code_label | 1.25 | 0.34 | 0.19 |
Prediction performance on test set (MSE) code_label | 1.41 | 0.34 | 0.22 |
Distribution of detainable code | {‘01’: 36, ‘02’: 24, ‘03’: 41, ‘04’: 28, ‘05’: 23, ‘06’: 2, ‘07’: 72, ‘08’: 2, ‘09’: 2, ‘10’: 51, ‘11’: 33, ‘13’: 2, ‘14’: 24, ‘15’: 54, ‘18’: 1} | {‘01’: 8, ‘02’: 3, ‘03’: 8, ‘04’: 12, ‘05’: 9, ‘06’: 2, ‘07’: 31, ‘09’: 1, ‘10’: 13, ‘11’: 19, ‘13’: 1, ‘14’: 11, ‘15’: 22} | {‘07’: 2, ‘10’: 2, ‘14’: 2, ‘15’: 2} |
Prediction performance on training set (MSE) detainable_code_label | 0.19 | 0.0098 | 0.0020 |
Prediction performance on test set (MSE) detainable_code_label | 0.27 | 0.0091 | 0.0043 |
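
The detention rows of the cluster table can be checked with a short worked calculation. The sketch below assumes that every ship in a cluster is given the same forecast, namely the cluster's average detention rate. The table alone does not confirm that this is how the predictions were produced; it is an assumption made here. Under it, the Brier score of a cluster with detention rate p reduces to p(1 − p). Using the training-set cluster sizes and detention counts from the table, this gives roughly 0.20, 0.024, and 0.0030 for clusters 1 to 3, in line with the training-set Brier scores reported above. The function name is hypothetical.

```python
# Cluster sizes and detention counts copied from the cluster table (training set).
clusters = {1: (252, 71), 2: (1680, 42), 3: (1005, 3)}


def brier_of_constant_forecast(n_ships, n_detained):
    """Brier score when every ship is forecast at the cluster's mean detention rate.

    Assumption: one shared forecast per cluster (not confirmed by the source).
    """
    p = n_detained / n_ships
    # Detained ships have outcome 1 (squared error (1 - p)^2);
    # all other ships have outcome 0 (squared error p^2).
    squared_errors = n_detained * (1 - p) ** 2 + (n_ships - n_detained) * p ** 2
    return squared_errors / n_ships


scores = {k: brier_of_constant_forecast(n, d) for k, (n, d) in clusters.items()}
for k, score in scores.items():
    rate = clusters[k][1] / clusters[k][0]
    print(f"cluster {k}: detention rate {rate:.4f}, Brier score {score:.4f}")
```
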
Number of Clusters | Silhouette Score |
---|---|
2 | −0.0916 |
3 | −0.0883 |
4 | −0.0883 |
5 | −0.0883 |
6 | −0.0837 |
7 | −0.0837 |
8 | −0.0837 |
9 | −0.0836 |
10 | −0.0842 |
11 | −0.0842 |
12 | −0.0840 |
13 | −0.0842 |
14 | −0.0856 |
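
For reference, the silhouette score of a point is s = (b − a) / max(a, b), where a is its mean distance to the other points in its own cluster and b is the smallest mean distance to the points of any other cluster. The score lies between −1 and 1, and a negative value means the point is on average closer to another cluster than to its own. The following is a minimal standard-library sketch of the mean score with Euclidean distance. The function name `mean_silhouette` and the four toy points are assumptions for illustration and are unrelated to the ship data.

```python
from math import dist


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes s = 0 by convention
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself is included in the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


# Toy data: two tight pairs of points that are far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # matches the visible grouping
bad_labels = [0, 1, 0, 1]   # splits each pair across clusters

print(mean_silhouette(points, good_labels))
print(mean_silhouette(points, bad_labels))
```

The first labeling yields a score close to 1 and the second a negative score, which shows how the sign of the values in the table should be read.
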
Metric | K = 3 | K = 9 |
---|---|---|
accuracy rate | 0.41 | 0.47 |
deficiency_no_mse | 18.73 | 19.86 |
detention_mse | 0.54 | 0.61 |
detainable_code_label0_MSE | 0.27 | 0.06 |
detainable_code_label1_MSE | 0.01 | 0.10 |
detainable_code_label2_MSE | 0.0043 | 0.26 |
detainable_code_label3_MSE | N/A | 0.00 |
detainable_code_label4_MSE | N/A | 0.0053 |
detainable_code_label5_MSE | N/A | 0.0073 |
detainable_code_label6_MSE | N/A | 0.54 |
detainable_code_label7_MSE | N/A | 0.051 |
detainable_code_label8_MSE | N/A | 0.032 |
code_label0_MSE | 1.41 | 0.62 |
code_label1_MSE | 0.34 | 0.91 |
code_label2_MSE | 0.22 | 1.02 |
code_label3_MSE | N/A | 0.18 |
code_label4_MSE | N/A | 0.22 |
code_label5_MSE | N/A | 0.31 |
code_label6_MSE | N/A | 2.22 |
code_label7_MSE | N/A | 0.43 |
code_label8_MSE | N/A | 0.45 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hou, Z.; Yan, R.; Wang, S. On the K-Means Clustering Model for Performance Enhancement of Port State Control. J. Mar. Sci. Eng. 2022, 10, 1608. https://doi.org/10.3390/jmse10111608