1. Introduction
Industrial risk assessments play a crucial role in ensuring the safety of workers, the public, and the environment, as well as in complying with regulatory requirements. Since the 1960s, when U.S. chemical companies began to conduct more systematic safety risk assessments, the number of risk assessment methods has grown and the areas they cover have expanded. These assessments involve identifying, analyzing, and evaluating the potential hazards and risks associated with industrial processes or activities. The traditional methods typically used include Hazard and Operability (HAZOP), Fault Tree Analysis (FTA), Safety Integrity Level (SIL), Event Tree Analysis (ETA), and Quantitative Risk Analysis (QRA). HAZOP is a structured examination of a process or system that systematically identifies potential hazards and deviations from normal operating conditions. It is commonly used in the chemical, oil and gas, and nuclear industries [1]. FTA is a systematic approach used to identify and analyze the causes and consequences of system failures, and it is frequently employed in industries such as aerospace, national defense, and nuclear [2]. Various other risk assessment methods are also widely used in industries such as aviation, transportation, and energy [3]. However, with the spread of sensors, computers, and the Internet of Things in industry, the amount of data has increased dramatically, and traditional methods are not effective at processing large amounts of data or predicting future events. At the same time, the rapid development of technology makes industrial processes more complex, interrelated, and precise, which leads to new risks that are difficult to capture with traditional methods. In addition, the industrial environment is dynamic and constantly changing, which makes it difficult for static risk assessment to keep pace. In view of these challenges, industrial organizations must adopt new risk assessment methods that are better suited to the needs of modern industrial production, such as advanced machine learning technology.
Machine learning was first proposed as a tool for risk analysis and quality assessment in 1991 [4], and research on it has since been driven by advances in computer technology, artificial intelligence, and big data. This technology offers numerous advantages, including the ability to process vast amounts of data quickly and efficiently and to identify potential risks before they materialize, thereby preventing accidents and reducing downtime. Additionally, machine learning can be customized to meet the specific needs of an organization, identifying risks unique to a particular industry or facility. By drawing on probability theory, statistics, and computational complexity theory, machine learning can support real-time and accurate risk assessment, handling more variables and predicting more potential risk factors than human evaluators. As an emerging computer technology, machine learning is becoming an increasingly important part of risk assessment, and examples of its use for industrial risk assessment have increased in recent years. In the construction industry, scholars have compared the performance of support vector machines, artificial neural networks, and kernel logistic regression in predicting shallow landslides [5]. Others have used computational models to simulate soil behavior and fluid flow and to evaluate the risk of failure or damage in structures built on or near soil [6,7]. In the biomedical industry, scholars have used machine learning to study the spread, diagnosis, and prevention of COVID-19 in the fight against the pandemic [8]. Machine learning has also been used to predict drug toxicity [9,10]. In the mechanical manufacturing industry, machine learning is used to diagnose rotating machinery faults [11], reduce the bullwhip effect in supply chains [12], implement predictive maintenance for wind turbines [13], and monitor network intrusions in transportation vehicles [14]. In the energy and chemical industries, machine learning has been used for the risk assessment of oil pipelines and the investigation of self-ignition risks during coal transportation [15,16,17]. In the fields of new energy vehicles and autonomous driving, some scholars have proposed a fault diagnosis system for the electric drive systems of new energy vehicles based on improved machine learning and several typical fault detection and diagnosis methods [18].
The field of industrial risk assessment has seen significant advancements in recent years, particularly with the increasing use of machine learning techniques. However, this rapid growth has also made it difficult to keep up with the evolving landscape and to identify the current status and future directions of research. To address this challenge, it is necessary to explore (1) the integration of machine learning in industrial risk assessment, (2) the historical development of the field, (3) changes in the metaknowledge and research areas, and (4) the hot spots and trends in the field. One effective way to gain insight into these topics is bibliometric mapping analysis. This approach allows scholars to visualize the knowledge base, research hotspots, and development trends of a specific field, which helps researchers quickly form an overview of the research in that area. Bibliometric mapping analysis has become increasingly popular in various research fields. For example, in occupational accident analysis, bibliometric methods have been applied to study the application of machine learning techniques [19], while in engineering risk assessment, they have been used to investigate the adoption of machine learning algorithms [20]. Bibliometric mapping analysis has also been used to examine the knowledge base and research hotspots related to the emergency management of sudden public health events [21] and to determine the current research status and trends of emergency evacuation studies [22]. In this paper, we use bibliometric mapping analysis to investigate the literature on the application of machine learning in industrial risk assessment, employing software such as VOSviewer, Bibliometrix R, and CiteSpace. This paper addresses the following topics: (1) the historical development of the field, based on an analysis of the spatiotemporal distribution of related paper outputs; (2) the research hotspots and research frontiers of machine learning in industrial risk assessment; and (3) potential future research directions in this field.
4. Discussion
In this study, 3116 relevant publications on the application of machine learning in industrial risk assessment were extracted from the Web of Science core database and analyzed using VOSviewer, Bibliometrix R, and CiteSpace; the keywords were analyzed via co-occurrence analysis, cluster analysis, and dual-map overlays. We present an overview of the development history, topic evolution, relevant knowledge base, current research hotspots, and future research trends in the application of machine learning in industrial risk assessment. The main content is presented below:
The number of articles on machine learning in industrial risk assessment has increased year by year, indicating a growing interest in the field. The overall development of machine learning in industrial risk assessment can be divided into three stages: the initial exploration stage (1991–2006), the stable development stage (2006–2017), and the high-speed development stage (2017–present), as shown in Figure 2 in Section 3.1. During the initial exploration stage (1991–2006), fewer than 10 relevant publications were produced per year. Because machine learning was a relatively cutting-edge technology at the time and was limited by computer hardware, it did not receive widespread attention. During the stable development stage (2006–2017), however, the number of publications per year increased. With the improvement of computer processing capabilities and the arrival of the information age, machine learning was required to handle more complex learning tasks and was successfully applied to risk assessment in various industrial fields. In the high-speed development stage (2017–present), the number of publications rose rapidly, with the annual publication volume remaining above 150 papers per year. During this period, the rise and gradual maturation of information and communication technologies created more complex tasks: building big data prediction models, the Internet of Things, and autonomous driving, and promoting the transformation of the Industry 4.0 model toward intelligent monitoring and data empowerment. As a result, machine learning and artificial intelligence have once again received widespread attention and high regard.
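As an illustration of the staging described above, the following is a minimal Python sketch that assigns publication years to the three stages and counts publications per stage. It is not the procedure used by the bibliometric tools in this study. The stage boundaries overlap at 2006 and 2017 in the text, so assigning each boundary year to the later stage is an assumption made here, and the names `STAGES`, `stage_of`, `publications_per_stage`, and the sample years are hypothetical.

```python
from collections import Counter

# Stage boundaries as reported in the text. The text lists 2006 and 2017 in
# two stages each; assigning each boundary year to the later stage is an
# assumption of this sketch.
STAGES = [
    ("initial exploration", 1991, 2005),
    ("stable development", 2006, 2016),
    ("high-speed development", 2017, None),  # open-ended: "present"
]


def stage_of(year):
    """Return the stage name for a publication year, or None if before 1991."""
    for name, start, end in STAGES:
        if year >= start and (end is None or year <= end):
            return name
    return None


def publications_per_stage(years):
    """Count publications per stage from an iterable of publication years."""
    counts = Counter(stage_of(y) for y in years)
    counts.pop(None, None)  # ignore years that fall outside every stage
    return {name: counts.get(name, 0) for name, _, _ in STAGES}


# Hypothetical publication years, for illustration only (not the study's data).
sample_years = [1995, 2006, 2010, 2017, 2019, 2021]
stage_counts = publications_per_stage(sample_years)
```

With real data, `years` would be the publication-year field of the exported records; a different convention for the boundary years would shift the counts at 2006 and 2017 only.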
By examining the 20 most cited relevant articles, we gained a deeper understanding of the essence of research on using machine learning for industrial risk assessment. The analysis showed that research articles are more representative and informative in this field, accounting for 15 of the 20 most cited articles. The most cited article focuses on model-based quantitative methods in artificial intelligence for process fault detection and diagnosis. The second most cited article discusses the security and privacy issues of the Internet of Things from a healthcare perspective. The third most cited article explores the potential of machine learning to solve medical problems using large media datasets and various learning algorithms, thereby transforming healthcare. The analysis of highly cited literature shows that research on machine learning worldwide covers many aspects of industry. Of the 20 articles, 9 were produced through international collaborations and 17 were the result of multi-institutional collaborations, demonstrating the close collaboration between scholars across institutions and regions, as well as the strongly interdisciplinary nature of the research.
By using CiteSpace to analyze the keywords in a dual-map overlay, we found that mathematics and computer science are the most important fields for research into how machine learning is used in industrial risk assessment. These two disciplines provide the basis for developing and optimizing a wide range of algorithms, from theory to application. In addition, this research draws on knowledge from a number of other disciplines, including chemistry, physics, materials science, nutrition, environmental science, toxicology, finance, and sociology, for application in fields such as construction, energy, chemical engineering, and biomedicine. Based on the analysis in Section 3.4.2, we can derive four knowledge bases for machine learning in industrial risk assessment: machine learning algorithm design, applications in biomedicine, risk monitoring in construction and machinery, and environmental protection.
Using CiteSpace for keyword timeline analysis, we obtained six timelines in two main directions: algorithms for risk assessment and applications for risk assessment. The algorithm direction comprises three clusters, “support vector machines”, “Random Forests”, and “deep learning”, while the application direction comprises three clusters, “Industry 4.0”, “supply chain risk assessment”, and “Internet of Things”, as shown in Figure 14. High keyword frequencies were observed in 2017–2022 for four clusters, namely “Random Forests”, “Industry 4.0”, “supply chain risk assessment”, and “Internet of Things”, indicating that they are currently at the forefront of research.
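The keyword analyses in this section rest on counting how often keywords appear together in the same publication. The following minimal Python sketch shows that counting step only; it does not reproduce the clustering or timeline layout of VOSviewer or CiteSpace, whose internal procedures are not described here. The function name `keyword_cooccurrence` and the sample records are hypothetical and are not the study's data.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how often each unordered keyword pair appears in the same record."""
    pair_counts = Counter()
    for keywords in records:
        # Normalize case and drop duplicates within one record; sorting makes
        # each pair appear in a single canonical order.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        pair_counts.update(combinations(unique, 2))
    return pair_counts


# Hypothetical keyword lists, for illustration only (not the study's data).
sample_records = [
    ["Industry 4.0", "Internet of Things", "deep learning"],
    ["deep learning", "Industry 4.0"],
    ["Random Forests", "supply chain risk assessment"],
]
cooc = keyword_cooccurrence(sample_records)
```

Calling `cooc.most_common(n)` lists the `n` most frequent keyword pairs, which is the kind of edge weighting a co-occurrence network is typically built from.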
By analyzing the three stages of the theme using Bibliometrix R, the development of machine learning in the industrial field can be summarized as follows: it began as a production assistance tool in a single field of industry, progressed to real-time risk monitoring and assessment in multiple fields, and is now moving toward the informatization and digitization of Industry 4.0. Of these, the concept of Industry 4.0, which emphasizes the use of advanced technologies to improve industrial processes, is becoming increasingly important and is a prominent theme in this field. These developments underline the need for continued research on the application of machine learning in industrial environments, as well as its potential to support broader industrial transformation.
Based on the evolution of time and topics, we have identified three current hotspots of machine learning in industrial risk assessment research: first, research on machine learning and deep learning algorithms themselves; second, machine learning risk management in the Industry 4.0 system; and third, the application of machine learning in autonomous driving technology. In the era of the Internet of Things and big data, it is necessary to keep exploring machine learning algorithms that integrate information from various sensors at the microlevel and information from various fields at the macrolevel, in order to achieve the informatization and digitization of the Industry 4.0 model. Researchers also need to use machine learning and deep learning to address more complex security issues, of which autonomous driving technology is one application.
Future research on machine learning applied to industrial risk assessment is expected to focus on several key areas. First, further research on machine learning and deep learning algorithms themselves is anticipated, with the aim of developing more advanced and efficient algorithms for risk assessment in industrial settings. This could involve exploring new machine learning techniques, optimizing existing algorithms, and integrating different approaches to improve the accuracy and reliability of risk assessment models. Second, machine learning risk management in the context of Industry 4.0 is likely to be a significant research area, given the increasing emphasis on using advanced technologies to improve industrial processes. This could involve developing machine learning models that can effectively manage risks in complex and dynamic industrial environments, where data from various sources and sensors are integrated to enable real-time risk monitoring and assessment. Third, the application of machine learning in autonomous driving technology is expected to be another important research direction. As autonomous vehicles become more prevalent in industrial settings, machine learning algorithms can play a crucial role in enabling these vehicles to assess and manage risks in real time, ensuring safe and efficient operations. Overall, future research is expected to focus on advancing algorithms, integrating technologies in the context of Industry 4.0, and addressing complex security issues in emerging applications such as autonomous driving technology.
However, although machine learning has the potential to improve the accuracy and efficiency of industrial risk assessment, its use faces several challenges. The reliability of machine learning models can be affected by limited data availability and quality, algorithmic and data bias, and a lack of interpretability. To overcome these challenges, issues related to data quality and bias need to be addressed, and the interpretability of machine learning models needs to be ensured while human expertise is incorporated into the risk assessment process. Future research should develop more robust and transparent machine learning models that can be validated and interpreted by human experts [87]. Recent advancements in explainable AI techniques may help to address the interpretability challenge, and efforts to improve data collection and sharing can help to overcome the limitations of data availability and quality [88]. The future of machine learning in industrial risk assessment looks promising, but overcoming the challenges and limitations associated with its use requires continued research and development.
5. Conclusions
Using bibliometric mapping analysis, this paper reviewed the research on the application of machine learning in industrial risk assessment, with a focus on time distribution, highly cited literature, the research knowledge base, the evolutionary path, research hotspots, and frontier areas. Based on this analysis, we draw three main conclusions.
The research history of machine learning applied to industrial risk assessment can be broadly divided into three phases: the initial exploration phase (1991–2006), the stable development phase (2006–2017), and the high-speed development phase (2017–present). Research on the application of machine learning in industrial risk assessment is increasing year by year, and the number of publications is rising. European and North American countries began publishing in this area significantly earlier than Asian and African countries. China, the US, and the UK have the highest numbers of publications and are also the three countries with the highest intensity of collaboration. Tsinghua University has the highest number of publications, and Li, Heng has the highest number of author collaborations. IEEE Access, the journal with the most citations and the most publications in this area, is the primary carrier of the literature.
Based on the citation relationships in the literature, the application of machine learning to industrial risk assessment is a multidisciplinary research field that requires a foundation in mathematics and computer science. It also requires the integration of knowledge from various disciplines, such as chemistry, materials science, physics, environmental science, nutrition, and toxicology, depending on the application area. The key technology in this field is the monitoring and diagnosis of process failures [26]. The knowledge base for applying machine learning to industrial risk assessment consists of machine learning algorithm design, applications in biomedicine, risk monitoring in construction and machinery, and environmental protection.
Currently, machine learning research in the industrial field has formed three hotspots: the study of machine learning and deep learning algorithms themselves, machine learning risk management in the Industry 4.0 system, and the use of machine learning in autonomous driving technology. The four research frontiers are “Random Forests”, “Industry 4.0”, “supply chain risk assessment”, and the “Internet of Things”. The trend in research content is for the application of machine learning in industry to move from a single production aid, to risk assessment in several areas, and then to the informatization and digitization of Industry 4.0 systems.