Intelligent Method for Classifying the Level of Anthropogenic Disasters
Abstract
1. Introduction
2. Related Work
- Development of the intelligent classification method for assessing the level of anthropogenic disasters based on the boosting ensemble methods of machine learning.
- Development of the mobile application architecture for classifying the level of anthropogenic disasters in the region.
3. Materials and Methods
3.1. Boosting Methods
3.2. Essence and Features of the Intelligent Method
- Potential Accident Level I: Minimal danger level. Accidents of this level usually do not cause significant damage and can be resolved without special effort.
- Potential Accident Level II: Medium danger level. Accidents of this level can cause moderate damage and require more effort to resolve.
- Potential Accident Level III: High danger level. Accidents of this level can cause serious damage and require significant effort to resolve.
- Potential Accident Level IV: Extreme danger level. Accidents of this level can cause catastrophic damage and require extraordinary effort to resolve.
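Because the four levels above are ordered, a classifier's target can be encoded as an ordinal value. The following is a minimal sketch of such an encoding; the names `PotentialAccidentLevel`, `ROMAN_TO_LEVEL`, and `encode_level` are hypothetical, and the Roman-numeral label format is an assumption about how the level is stored.

```python
from enum import IntEnum


class PotentialAccidentLevel(IntEnum):
    """Ordinal encoding of the four danger levels (hypothetical helper)."""
    MINIMAL = 1   # Level I
    MEDIUM = 2    # Level II
    HIGH = 3      # Level III
    EXTREME = 4   # Level IV


# Assumed label format: Roman numerals as they would appear in a data column.
ROMAN_TO_LEVEL = {
    "I": PotentialAccidentLevel.MINIMAL,
    "II": PotentialAccidentLevel.MEDIUM,
    "III": PotentialAccidentLevel.HIGH,
    "IV": PotentialAccidentLevel.EXTREME,
}


def encode_level(label: str) -> PotentialAccidentLevel:
    """Map a Roman-numeral label to its ordinal level, ignoring case and spaces."""
    return ROMAN_TO_LEVEL[label.strip().upper()]


print(encode_level("III").name, int(encode_level("III")))
```

An integer-backed enumeration keeps the ordering of the levels, so comparisons such as `encode_level("IV") > encode_level("II")` behave as expected.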
4. Case Study
4.1. Data Description and Research Methodology
- Reading the data from the file and converting it to a format suitable for further processing.
- Cleaning and normalizing the data, including removing unnecessary information and standardizing the format of records.
- Creating a database to store the processed data and building an appropriate interface for accessing this data.
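The three steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' implementation: the sample rows are invented, the table name `accidents` and all function names are hypothetical, and only a subset of the dataset's columns is used.

```python
import csv
import io
import sqlite3

# Invented two-row sample standing in for the accident file; the column names
# follow the dataset description, the values are made up for illustration.
SAMPLE_CSV = """Data,Countries,Local,Industry Sector,Accident Level,Description
2016-01-01 00:00:00, Country_01 ,Local_01,Mining,I,  While removing the drill rod   the operator slipped.
2016-01-02 00:00:00,Country_02,Local_02,Metals,IV,Employee  hit by a falling pipe.
"""


def read_records(text):
    """Step 1: read the raw file content into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_record(record):
    """Step 2: trim whitespace, collapse repeated spaces, keep only the date."""
    cleaned = {key: " ".join(value.split()) for key, value in record.items()}
    cleaned["Data"] = cleaned["Data"].split(" ")[0]
    return cleaned


def store_records(records, connection):
    """Step 3: create a table and insert the cleaned records."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS accidents ("
        "date TEXT, country TEXT, local TEXT, sector TEXT, level TEXT, description TEXT)"
    )
    connection.executemany(
        "INSERT INTO accidents VALUES (?, ?, ?, ?, ?, ?)",
        [
            (r["Data"], r["Countries"], r["Local"], r["Industry Sector"],
             r["Accident Level"], r["Description"])
            for r in records
        ],
    )
    connection.commit()


def accidents_by_sector(connection, sector):
    """Minimal access interface: (date, level) pairs for one sector."""
    cursor = connection.execute(
        "SELECT date, level FROM accidents WHERE sector = ? ORDER BY date", (sector,)
    )
    return cursor.fetchall()


connection = sqlite3.connect(":memory:")
store_records([clean_record(r) for r in read_records(SAMPLE_CSV)], connection)
print(accidents_by_sector(connection, "Mining"))
```

An in-memory SQLite database keeps the sketch self-contained; a file path passed to `sqlite3.connect` would persist the processed data instead.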
4.2. Exploratory Data Analysis
4.3. Classification Based on Textual Data
“When performing cleaning with LHD in block F 9970 at level 420, the operator was surprised by a rock block displacement of the side of the gallery, reaching his right leg causing him superficial injury”.
4.4. Classification Based on Quantitative Indicators
5. Mobile Application Architecture
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sirola, M.; Hulsund, J.E. Machine-Learning Methods in Prognosis of Ageing Phenomena in Nuclear Power Plant Components. Int. J. Comput. 2021, 20, 11–21.
- Luna, S.; Pennock, M.J. Social media applications and emergency management: A literature review and research agenda. Int. J. Disaster Risk Reduct. 2018, 28, 565–577.
- Sun, W.; Bocchini, P.; Davison, B.D. Applications of artificial intelligence for disaster management. Nat. Hazards 2020, 103, 2631–2689.
- Costa, D.G.; Vasques, F.; Portugal, P.; Aguiar, A. A Distributed Multi-Tier Emergency Alerting System Exploiting Sensors-Based Event Detection to Support Smart City Applications. Sensors 2019, 20, 170.
- Bhoi, A.; Pujari, S.P.; Balabantaray, R.C. A deep learning-based social media text analysis framework for disaster resource management. Soc. Netw. Anal. Min. 2020, 10, 78.
- Cao, L. AI and data science for smart emergency, crisis and disaster resilience. Int. J. Data Sci. Anal. 2023, 15, 231–246.
- Gopnarayan, A.; Deshpande, S. Tweets Analysis for Disaster Management: Preparedness, Emergency Response, Impact, and Recovery. In Innovative Data Communication Technologies and Application. ICIDCA 2019; Raj, J., Bashar, A., Ramson, S., Eds.; Lecture Notes on Data Engineering and Communications Technologies; Springer: Cham, Switzerland, 2020; Volume 46, pp. 760–764.
- Munawar, H.S.; Qayyum, S.; Ullah, F.; Sepasgozar, S. Big Data and Its Applications in Smart Real Estate and the Disaster Management Life Cycle: A Systematic Analysis. Big Data Cogn. Comput. 2020, 4, 4.
- Madichetty, S.; Sridevi, M. A Neural-Based Approach for Detecting the Situational Information From Twitter During Disaster. IEEE Trans. Comput. Soc. Syst. 2021, 8, 870–880.
- Francis, N.; Suhaimi, H.; Abas, E. Classification of Sprain and Non-sprain Motion using Deep Learning Neural Networks for Ankle Sprain Prevention. Int. J. Comput. 2023, 22, 159–169.
- Linardos, V.; Drakaki, M.; Tzionas, P.; Karnavas, Y.L. Machine Learning in Disaster Management: Recent Developments in Methods and Applications. Mach. Learn. Knowl. Extr. 2022, 4, 446–473.
- Kanojia, D.; Kumar, V.; Ramamritham, K. Civique: Using Social Media to Detect Urban Emergencies. arXiv 2016, arXiv:1610.04377.
- Zheng, Z.; Zhong, Y.; Wang, J.; Ma, A.; Zhang, L. Building damage assessment for rapid disaster response with a deep object-based semantic change detection framework: From natural disasters to anthropogenic disasters. Remote Sens. Environ. 2021, 265, 112636.
- Bandyopadhyay, M.; Singh, V. Development of agent based model for predicting emergency response time. Perspect. Sci. 2016, 8, 138–141.
- Avvenuti, M.; Cimino, M.G.C.A.; Cresci, S.; Marchetti, A.; Tesconi, M. A framework for detecting unfolding emergencies using humans as sensors. SpringerPlus 2016, 5, 43.
- Zhang, X.; Ma, Y. An ALBERT-based TextCNN-Hatt hybrid model enhanced with topic knowledge for sentiment analysis of sudden-onset disasters. Eng. Appl. Artif. Intell. 2023, 123, 106136.
- Adel, H.; Dahou, A.; Mabrouk, A.; Elaziz, M.A.; Kayed, M.; El-Henawy, I.M.; Alshathri, S.; Ali, A.A. Improving Crisis Events Detection Using DistilBERT with Hunger Games Search Algorithm. Mathematics 2022, 10, 447.
- Ahmed, A.S.; Basheer, O.N.; Salah, H.A. Breast Tumors Diagnosis Using Fuzzy Inference System and Fuzzy C-Means Clustering. Int. J. Comput. 2021, 20, 551–559.
- Ferreira, A.J.; Figueiredo, M.A.T. Boosting Algorithms: A Review of Methods, Theory, and Applications. In Ensemble Machine Learning; Zhang, C., Ma, Y., Eds.; Springer: New York, NY, USA, 2012.
- Velthoen, J.; Dombry, C.; Cai, J.-J.; Engelke, S. Gradient boosting for extreme quantile regression. Extremes 2023, 26, 1–29.
- Abdullahi, A.; Raheem, L.; Muhammed, M.; Rabiat, O.; Ganiyu, A. Comparison of the CatBoost Classifier with other Machine Learning Methods. Int. J. Adv. Comput. Sci. Appl. 2020, 11.
- Chen, C.; Zhang, Q.; Ma, Q.; Yu, B. LightGBM-PPI: Predicting protein-protein interactions through LightGBM with multi-information fusion. Chemom. Intell. Lab. Syst. 2019, 191, 54–64.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 3149–3157.
- Lettieri, E.; Masella, C.; Radaelli, G. Disaster management: Findings from a systematic review. Disaster Prev. Manag. Int. J. 2009, 18, 117–136.
- Tukey, J.W. Exploratory Data Analysis; Addison-Wesley: Reading, MA, USA, 1977; Volume 2, pp. 131–160.
- Majumder, M.G.; Gupta, S.D.; Paul, J. Perceived usefulness of online customer reviews: A review mining approach using machine learning & exploratory data analysis. J. Bus. Res. 2022, 150, 147–164.
- Roman, G.; Lipyanina-Goncharenko, H.; Sachenko, A.; Lendyuk, T.; Zahorodnia, D. Intelligent Method of a Competitive Product Choosing based on the Emotional Feedbacks Coloring. In IntelITSIS; CEUR-WS: Khmelnytskyi, Ukraine, 2021; pp. 246–257.
- Wang, C.; Shakhovska, N.; Sachenko, A.; Komar, M. A New Approach for Missing Data Imputation in Big Data Interface. Inf. Technol. Control. 2020, 49, 541–555.
- Jin, S.; Chen, S.; Xie, X. Property-based Test for Part-of-Speech Tagging Tool. In Proceedings of the 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, 15–19 November 2021; pp. 1306–1311.
- Guo, S.; Liu, Y.; Chen, R.; Sun, X.; Wang, X. Improved SMOTE Algorithm to Deal with Imbalanced Activity Classes in Smart Homes. Neural Process. Lett. 2019, 50, 1503–1526.
- Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6.
- Industrial Safety and Health Analytics Database. Kaggle: Your Machine Learning and Data Science Community. Available online: https://www.kaggle.com/datasets/ihmstefanini/industrial-safety-and-health-analytics-database (accessed on 3 May 2023).
- Paffenroth, R.; Kong, X. Python in Data Science Research and Education. In Proceedings of the Python in Science Conference (SciPy 2015), Austin, TX, USA, 6–12 July 2015.
- Lipianina-Honcharenko, K.; Lukasevych-Krutnyk, I.; Butryn-Boka, N.; Sachenko, A.; Grodskyi, S. Intelligent Method for Identifying the Fraudulent Online Stores. In Proceedings of the 2021 IEEE 8th International Conference on Problems of Infocommunications, Science and Technology (PIC S&T), Kharkiv, Ukraine, 5–7 October 2021; pp. 218–222.
- Krysovatyy, A.; Lipianina-Honcharenko, H.; Sachenko, S.; Desyatnyuk, O.; Banasik, A.; Lukasevych-Krutnyk, I. Recognizing the fictitious business entity on logistic regression base. CEUR Workshop Proc. 2022, 3156, 218–227.
- Classification Report. Yellowbrick: Machine Learning Visualization, Yellowbrick v1.5 Documentation. Available online: https://www.scikit-yb.org/en/latest/api/classifier/classification_report.html (accessed on 10 May 2023).
- Sachenko, A.; Kochan, V.; Kochan, R.; Turchenko, V.; Tsahouridis, K.; Laopoulos, T. Error compensation in an intelligent sensing instrumentation system. In Proceedings of the 18th IEEE Instrumentation and Measurement Technology Conference. Rediscovering Measurement in the Age of Informatics (IMTC 2001), Budapest, Hungary, 21–23 May 2001; Volume 2, pp. 869–874.
Source | Method | Reported Performance |
---|---|---|
Bhoi et al. [5] | Hybrid model combining LSTM and CNN | F1-score of 84% on each of the two datasets |
Gopnarayan and Deshpande [7] | SVM, KNN, and logistic regression | Not specified |
Madichetty and Sridevi [9] | Combination of the RoBERTa model and a feature-based method | Accuracy of 90% |
Kanojia et al. [12] | Real-time message classification | F-measure exceeds 70% and 90%, respectively |
Avvenuti et al. [15] | Not specified | Precision = 75%, recall = 100%, F-measure = 86% |
Zhang & Ma [16] | ALBERT-based TextCNN-Hatt hybrid model enhanced with topic knowledge for sentiment analysis of sudden-onset disasters | 89% |
Adel et al. [17] | DistilBERT model with the Hunger Games Search algorithm | 98% (C6 dataset), 97% (C36 dataset) |
No. | Parameter | Description | Non-Null Count, Dtype |
---|---|---|---|
1 | Data | Timestamp or date/time information | 411 non-null, object |
2 | Countries | Country in which the accident occurred (anonymized) | 411 non-null, object |
3 | Local | City where the manufacturing plant is located (anonymized) | 411 non-null, object |
4 | Industry sector | Sector to which the plant belongs | 411 non-null, object |
5 | Accident level | Severity of the accident, from I (not severe) to VI (very severe) | 411 non-null, object |
6 | Potential Accident Level | How severe the accident could have been, given the Accident Level and other factors involved in the accident | 411 non-null, object |
7 | Genre | Whether the person is male or female | 411 non-null, object |
8 | Employee or Third Party | Whether the injured person is an employee or a third party | 411 non-null, object |
9 | Critical Risk | Short description of the risk involved in the accident | 411 non-null, object |
10 | Description | Detailed description of how the accident happened | 411 non-null, object |
Index | Words_Metals | Count_Metals | Words_Mining | Count_Mining | Index | Words_Metals | Count_Metals | Words_Mining | Count_Mining |
---|---|---|---|---|---|---|---|---|---|
0 | employee | 50 | causing | 103 | 15 | worker | 14 | accident | 47 |
1 | left | 46 | right | 100 | 16 | performed | 13 | collaborator | 44 |
2 | causing | 43 | operator | 96 | 17 | mr. | 13 | safety | 44 |
3 | right | 37 | time | 96 | 18 | center | 13 | mesh | 44 |
4 | hit | 27 | left | 92 | 19 | hose | 12 | work | 43 |
5 | hand | 25 | hand | 88 | 20 | area | 12 | hit | 42 |
6 | operator | 25 | moment | 62 | 21 | face | 12 | employee | 38 |
7 | activity | 25 | level | 60 | 22 | remove | 12 | one | 37 |
8 | medical | 24 | assistant | 59 | 23 | sheet | 12 | fall | 36 |
9 | report | 23 | worker | 53 | 24 | cut | 12 | circumstance | 35 |
10 | finger | 19 | support | 51 | 25 | reaching | 12 | height | 35 |
11 | moment | 18 | rock | 49 | 26 | pipe | 11 | floor | 34 |
12 | one | 16 | pipe | 48 | 27 | fall | 11 | injured | 32 |
13 | collaborator | 15 | equipment | 47 | 28 | acid | 11 | used | 32 |
14 | cleaning | 15 | finger | 47 | 29 | contact | 11 | mr. | 31 |
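Per-sector word counts like those in the table above can be produced with a simple tokenize, filter, and count pass. The sketch below is only an illustration: the stop-word list, the regular-expression tokenizer, the sector labels, and the example descriptions are all assumptions, and the authors' actual preprocessing (for example, lemmatization) is not reproduced here.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; a real list would be longer).
STOP_WORDS = frozenset({
    "a", "an", "and", "at", "by", "his", "in", "of", "on",
    "the", "to", "was", "while", "with",
})


def top_words(descriptions, n=5):
    """Return the n most frequent non-stop-word tokens across the descriptions."""
    counts = Counter()
    for text in descriptions:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(token for token in tokens if token not in STOP_WORDS)
    return counts.most_common(n)


# Invented example descriptions, grouped by a hypothetical sector label.
descriptions_by_sector = {
    "Mining": [
        "The operator was hit by a rock while cleaning the gallery",
        "A rock fell and hit the operator on the right hand",
    ],
    "Metals": [
        "The employee cut his left hand on a sheet",
        "The employee was hit on the left hand by a pipe",
    ],
}

for sector, texts in descriptions_by_sector.items():
    print(sector, top_words(texts, n=3))
```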
Index | Model | Accuracy |
---|---|---|
0 | AdaBoost | 0.595376 |
1 | Gradient Boost | 0.924855 |
2 | XGBoost | 0.901734 |
3 | CatBoost | 0.791908 |
4 | LGBoost | 0.988439 |
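To make the boosting idea behind the compared models concrete, here is a from-scratch AdaBoost with one-feature decision stumps for binary labels, together with the accuracy metric used in the table above. It is a teaching sketch on invented toy data, not the authors' code and not the library implementations they evaluated; all function names are hypothetical.

```python
import math


def stump_predict(stump, row):
    """Predict +1 or -1 with a one-feature threshold rule."""
    feature, threshold, polarity = stump
    return polarity if row[feature] <= threshold else -polarity


def train_stump(X, y, weights):
    """Return the stump with the lowest weighted error and that error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(w for row, label, w in zip(X, y, weights)
                          if stump_predict(stump, row) != label)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost_fit(X, y, rounds):
    """Fit a weighted ensemble of stumps; labels must be +1 or -1."""
    weights = [1.0 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * label * stump_predict(stump, row))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data that no single threshold separates (invented for illustration).
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

for rounds in (1, 3):
    model = adaboost_fit(X, y, rounds)
    predictions = [adaboost_predict(model, row) for row in X]
    print(rounds, "round(s): training accuracy =", accuracy(y, predictions))
```

The toy labels are chosen so that one stump cannot fit them, which shows how reweighting misclassified samples lets later stumps correct earlier ones. The accuracy here is measured on the training data only; a held-out test split would be needed for figures comparable to the table.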
Source | Method | Accuracy (Text Data), % | Accuracy (Quantitative Data), % | Real-Time Operation |
---|---|---|---|---|
Zhang & Ma [16] | ALBERT-based TextCNN-Hatt hybrid model enhanced with topic knowledge for sentiment analysis of sudden-onset disasters | 89 | N/A | N/A |
Adel et al. [17] | DistilBERT model with the Hunger Games Search algorithm | 98 (C6 dataset); 97 (C36 dataset) | N/A | N/A |
Proposed approach | Intelligent method for classifying the level of anthropogenic disasters based on textual and quantitative data | 76 | 81 | Yes |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lipianina-Honcharenko, K.; Wolff, C.; Sachenko, A.; Kit, I.; Zahorodnia, D. Intelligent Method for Classifying the Level of Anthropogenic Disasters. Big Data Cogn. Comput. 2023, 7, 157. https://doi.org/10.3390/bdcc7030157