Intelligent Identification of Hidden Dangers in Hydrogen Pipeline Transmission Station Using GWO-Optimized Apriori Algorithm

Wang, Chaoming; Fu, Anqing; Li, Weidong; Li, Mingxing; Chen, Tingshu

doi:10.3390/en17184539


by Chaoming Wang ¹, Anqing Fu ¹,*, Weidong Li ², Mingxing Li ³,⁴ and Tingshu Chen

¹ State Key Laboratory of Oil and Gas Equipment, CNPC Tubular Goods Research Institute, Xi’an 710077, China

² College of Chemical Engineering, Fuzhou University, Fuzhou 350108, China

³ National Engineering Laboratory of Low Permeability Oil-Gas Field Exploration and Development, Xi’an 710018, China

⁴ Oil & Gas Technology Research Institute of ChangQing Oilfield Company, Xi’an 710018, China

* Author to whom correspondence should be addressed.

Energies 2024, 17(18), 4539; https://doi.org/10.3390/en17184539

Submission received: 12 August 2024 / Revised: 8 September 2024 / Accepted: 9 September 2024 / Published: 10 September 2024

(This article belongs to the Section A5: Hydrogen Energy)


Abstract:

This work proposes an intelligent grey-wolf-optimizer-improved Apriori algorithm (GWO-Apriori) to mine the association rules of hidden dangers in hydrogen pipeline transmission stations. The optimal minimum support and minimum confidence are determined by GWO instead of the time-consuming trial approach. Experiments show that the average support and average confidence of association rules using GWO-Apriori increase by 29.8% and 21.3%, respectively, when compared with traditional Apriori. Overall, 59 ineffective association rules out of the total 105 rules are filtered by GWO, which dramatically improves data mining effectiveness. Moreover, 23 illogical association rules are excluded, and 12 new strong association rules ignored by the traditional Apriori are successfully mined. Compared with the inefficient and labor-intensive manual investigation, the intelligent GWO-Apriori algorithm dramatically improves pertinency and efficiency of hidden danger identification in hydrogen pipeline transmission stations.

Keywords:

hydrogen pipeline transmission station; hidden danger; intelligent identification; association rule mining; Apriori; grey wolf optimizer (GWO)

1. Introduction

Developing green energy is key to addressing the growing environmental, climate and energy concerns faced by the entire world. As a renewable green energy carrier, hydrogen is widely recognized as a critical path to the zero-carbon emission target [1,2]. It is predicted that hydrogen will provide up to 24% of global energy consumption by 2050 [3]. The most environmentally friendly method for hydrogen production is electrolysis of water [4,5]. For hydrogen transportation, pipelines are the most efficient and cost-effective means of delivering large volumes of gaseous hydrogen over long distances [6,7]. Approximately 4500 km of hydrogen pipelines are currently in operation worldwide [1]. Pipeline transportation is becoming an increasingly important contributor to the development of hydrogen energy and its industrial chain. As with natural gas pipelines, transmission stations are the heart of hydrogen pipeline systems. Because hydrogen is highly diffusive and flammable, pipeline transmission stations face some special safety problems, such as hydrogen embrittlement, self-ignition and a high probability of fire hazards, which require different and tougher safety management measures.

Transmission stations of hydrogen pipelines involve a large amount of equipment and many operations, which bring about high safety risks that threaten the life and property security of personnel and pipeline proprietors. Thus, screening for and finding hidden dangers before accidents occur is crucial for the safety management of hydrogen pipeline transmission stations. In contrast to inherent risks, which cannot be eradicated, hidden dangers can be eliminated by taking appropriate control measures [8]. Safety checklists with regular manual safety patrols are widely used to investigate hidden dangers, but they depend heavily on staff experience and have rather low labor efficiency. In recent years, intelligent identification and management of hidden dangers has become a new trend [9,10]. By analyzing hidden danger data with intelligent algorithms, subjective and objective causes can be uncovered and identified more accurately and efficiently. The Apriori algorithm is one of the most powerful machine learning algorithms for association rule mining and it has been applied in a variety of fields. Dehghani et al. [11] employed the Apriori algorithm to discover symptom patterns from COVID-19 patients and proposed some valuable insights for clinicians to manage and treat the disease. Cleland et al. [12] investigated the spatial characteristics of human-caused wildfires in Colorado with Apriori-based itemset mining and the mined results were effectively evaluated. Papi et al. [13] analyzed suicidal behaviors with the Apriori algorithm based on a dataset of 1250 instances and 27 relevant features. Some key rules to predict suicidal behavior were found. Li et al. [14] optimized the efficiency and effectiveness of a traditional Apriori algorithm to analyze the landslide deformation response. Performance of the optimized Apriori algorithm was verified with experiments.

The Apriori algorithm is also promising for the investigation and control of hidden dangers in safe production. Rafindadi et al. [15] used this technology to identify hidden dangers of fatal construction accidents. In total, 100 association rules were extracted from 253 generated rules to design effective inspection procedures and occupational safety initiatives. Qiu et al. [16] combined the Apriori algorithm and a complex network to explore the coal mine accident-causing mechanism. A hidden danger network for coal mine accidents was constructed based on the strong association rules. It was found that regulatory authority is the most influential cause of accidents. Shi et al. [17] proposed an H-Apriori association rule algorithm based on hazard degree weights to investigate the key risk chain for urban rail transit operations and came up with some useful hidden danger control strategies. Yi et al. [18] focused on hidden dangers of offshore drilling platforms. They developed a multientity HTApriori algorithm using hash technology and inverted itemsets to understand the risky causes between entities. Reliability and effectiveness of the proposed algorithm were verified by experiments. These works demonstrate the strong potential of the Apriori algorithm for hidden danger identification in different fields.

Although the Apriori algorithm is widely used, its application in the hydrogen industry is still in its infancy. With the development of long-distance hydrogen pipelines, safe production of transmission stations becomes more and more important and challenging, and the applicability of this technology to hidden danger identification needs further demonstration. Moreover, most works optimizing the Apriori algorithm focus on improving computing efficiency [19,20], while the logical rationality of the mined association rules receives little attention. In response, the primary focus of this work is to develop an Apriori-based intelligent algorithm to identify hidden dangers in hydrogen transmission stations. Initially, the traditional Apriori algorithm is optimized by the grey wolf optimizer (GWO) to pinpoint the optimal minimum support and minimum confidence. Then, hidden danger data are collected and preprocessed, and average support and average confidence are taken as evaluation indexes. Next, the performances of the traditional and GWO-optimized Apriori algorithms are compared and discussed. In short, this work features the application of an Apriori algorithm in the hydrogen industry and the improvement of hidden danger identification performance by using GWO. From a practical point of view, it helps operators identify hidden dangers more effectively and efficiently, which benefits the safe operation of hydrogen pipeline transportation.

2. Basic Thinking for Mining Association Rules of Hidden Dangers

As in natural gas pipelines, transmission stations provide the driving force for flowing hydrogen. The key functions of a typical hydrogen pipeline transmission station include purification, flow measurement and compression. In addition, truncation, emergency blow-down, and pig receiving and launching are also required. For gas purification, the cyclone separator and filter separator form a two-stage purification system to achieve complete separation of impurities.

In daily operation of a hydrogen pipeline transmission station, failure in finding and eliminating hidden dangers in time may cause severe accidents. The purpose of mining association rules is to find the frequent itemsets, from which the association rules of hidden dangers are extracted to support safety decision-making. Key processes of association rule mining for hidden danger identification are showcased in Figure 1.

The basic thinking for mining association rules of hidden dangers in hydrogen pipeline transmission stations proceeds as follows. Initially, the transaction dataset, which consists of massive and complex historical data of hidden dangers, is preprocessed before association rule mining. Then, a specific data mining algorithm is employed to mine the strong association rules within the transaction dataset. Next, the strong association rules are used to predict the occurrence of a consequent from its antecedent, so that useful information can be extracted to identify hidden dangers.

3. Theoretical Basis

3.1. The Apriori Algorithm

In Apriori, the association rule between two itemsets is given by $X \Rightarrow Y$, where X is the antecedent and Y is the consequent. The concepts of support and confidence are essential to extract association rules of hidden dangers. Support P(X∪Y) is the probability that X and Y occur together in the dataset. It indicates the importance of a certain association rule. Confidence P(Y|X) is the probability that Y occurs given that X occurs. It reflects the accuracy and credibility of a rule. If a rule satisfies the specified minimum support and minimum confidence thresholds, it is taken as a strong association rule and the corresponding itemset is frequent. A two-stage iterative layer searching strategy of joining and pruning is adopted in Apriori. The two major steps are producing frequent itemsets through continuous iteration and generating strong association rules from the frequent itemsets that meet the minimum support and minimum confidence. To be specific, the Apriori algorithm proceeds as follows:

(1): Scan the initial dataset and calculate the support of each item. Frequent 1-itemset $L_{1}$ is generated from the items that meet the minimum support.
(2): Join frequent (k − 1)-itemset $L_{k-1}$ (k ≥ 2) to itself to generate a candidate k-itemset $C_{k}$.
(3): Take $C_{k-1}$ as a (k − 1)-order subset of $C_{k}$. If $C_{k-1} \notin L_{k-1}$, then $C_{k} \notin L_{k}$; thus, the candidate k-itemset is not frequent and it is deleted from $C_{k}$.
(4): Iterate step 2 and step 3 until a higher order frequent itemset cannot be obtained. The strong association rules are next extracted from the frequent itemsets that meet the minimum support and minimum confidence.
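
The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the function names `frequent_itemsets` and `strong_rules` are hypothetical, transactions are assumed to be plain sets of hidden danger codes, and no attempt is made at efficiency.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every frequent itemset (steps 1 to 4)."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Step (1): frequent 1-itemsets L1.
    level = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Step (2): join L_{k-1} with itself to form candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Step (3): prune candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = keep_frequent(candidates)
        frequent.update(level)
        k += 1  # Step (4): repeat until no larger frequent itemset exists.
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                # Confidence P(Y|X) = support(X and Y) / support(X).
                conf = sup / frequent[ante]
                if conf >= min_confidence:
                    rules.append((ante, itemset - ante, sup, conf))
    return rules
```

Every subset of a frequent itemset is itself frequent, so the lookup `frequent[ante]` in `strong_rules` always succeeds.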

The Apriori algorithm is a classic association rule mining approach to acquire frequent itemsets and association rules in datasets. However, the wide range of possible support and confidence values enlarges the searching space, which makes the algorithm time-consuming to apply and its prediction performance uncertain [21]. For example, a large support threshold reduces the number of frequent itemsets, and potentially useful association rules might be eliminated. On the contrary, too low a minimum support leads to the generation of excessive frequent itemsets and association rules, which may contain a significant proportion of meaningless and unreliable rules. Therefore, when selecting the minimum support, it is necessary to weigh the quantity and quality of the rules, as well as the generalization ability of the model, to obtain meaningful association rules. However, the support and confidence thresholds are still chosen by rule of thumb in traditional Apriori. Therefore, this work employs a swarm intelligence optimization algorithm to obtain the best minimum support and minimum confidence.

3.2. Grey Wolf Optimizer (GWO)

A swarm intelligence optimization algorithm is inspired by the interaction behavior of social organisms in nature [22]. It mimics their social structures and survival behaviors in an abstract mathematical representation. Animal individuals in a population are simulated by setting massless particles within the searching space. Speed and position are the only two properties of individuals in the population. The positions of individuals are constantly updated to find the optimal solution to the problem.

Grey wolf optimizer (GWO) is a new bionic meta-heuristic algorithm that simulates the hunting mechanism and leadership hierarchy of a group of grey wolves. It has been successfully applied in a variety of fields, such as software defect prediction [23], path planning in laser machining [24], prestress design of structures [25] and energy system optimization [26]. Grey wolves are apex predators and live in a pack with a strict social dominance hierarchy. According to the roles of wolves in hunting, the grey wolf population is divided into four types in a pyramid principle from the top to the bottom, namely alpha, beta, delta and omega wolves. Alpha is the dominant wolf with the highest authority to manage the pack, but it is not necessarily the strongest one. The betas are subordinate in the hierarchy. They are responsible for assisting the alpha wolf in decision-making and commanding other lower-level wolves. They are advisors to the alpha and discipliners for the pack. Delta wolves obey the commands from alpha and beta wolves for scouting and hunting. Omega wolves rank the lowest under the leadership of superior wolves to besiege the prey, and they are the last wolves that are allowed to eat after hunting.

In GWO, the alpha is considered as the best solution. Correspondingly, the betas and deltas are the second and third best solutions, respectively, while the omegas are taken as the rest of the candidate solutions. The hunting (optimization) process is actually guided by the alpha, beta and delta wolves. GWO features a simple structure, fast iteration speed and good global searching ability, and it is suitable for Apriori optimization. As shown in Figure 2, the main phases of grey wolf hunting include tracking, chasing and approaching the prey; pursuing, encircling and harassing the prey until it stops moving; and attacking the prey [27].

Finding the final location of the prey is a prerequisite for obtaining the global optimal solution. With a clear division of labor, the omegas hunt with the alpha, betas and deltas. The location of grey wolves at different levels and the distances between them can be calculated. Iterations of wolf pack positions can be calculated using the following formulas

$$\begin{cases} D = | C \cdot X_{p}(t) - X(t) | \\ X(t + 1) = X_{p}(t) - A \cdot D \end{cases}$$

(1)

$$\begin{cases} D_{\alpha} = | C_{1} X_{\alpha}(t) - X(t) | \\ D_{\beta} = | C_{2} X_{\beta}(t) - X(t) | \\ D_{\delta} = | C_{3} X_{\delta}(t) - X(t) | \end{cases}$$

(2)

$$\begin{cases} X_{1} = X_{\alpha}(t) - A_{1} D_{\alpha} \\ X_{2} = X_{\beta}(t) - A_{2} D_{\beta} \\ X_{3} = X_{\delta}(t) - A_{3} D_{\delta} \end{cases}$$

(3)

$$X(t + 1) = \frac{X_{1} + X_{2} + X_{3}}{3}$$

(4)

where $X_{\alpha}(t)$, $X_{\beta}(t)$, $X_{\delta}(t)$ and $X_{p}(t)$ are the position vectors of the alpha, beta and delta wolves and the prey; $D_{\alpha}$, $D_{\beta}$ and $D_{\delta}$ are the distances between the candidate wolf and the optimal wolves after t iterations; X(t + 1) is the position of a certain wolf after (t + 1) iterations; A and C are coefficient vectors; and X is the position vector of a grey wolf.

The wolves gradually approach the prey by constantly adjusting their positions and finally capture it once the termination criterion is satisfied. More detailed information about the GWO can be found elsewhere [27].
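
The position update in Equations (2) to (4) can be illustrated with a short NumPy sketch. Several details here are assumptions that the text above does not specify: the coefficient vectors follow the commonly used GWO formulation (A = 2a·r₁ − a and C = 2·r₂ with uniform random r₁ and r₂, and a decreasing linearly from 2 to 0), positions are clipped to the search bounds, and the function name `gwo_minimize` and its defaults are hypothetical.

```python
import numpy as np


def gwo_minimize(objective, lower, upper, n_wolves=20, n_iter=100, seed=0):
    """Minimize `objective` over a box with a basic grey wolf optimizer."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Random initial positions, one row per wolf.
    X = rng.uniform(lower, upper, size=(n_wolves, lower.size))
    fitness = np.array([objective(x) for x in X])
    best = int(np.argmin(fitness))
    best_x, best_f = X[best].copy(), float(fitness[best])

    for t in range(n_iter):
        # Alpha, beta and delta are the three best wolves of this iteration.
        leaders = X[np.argsort(fitness)[:3]].copy()
        a = 2.0 * (1.0 - t / n_iter)  # assumed schedule: 2 -> 0
        new_X = np.zeros_like(X)
        for leader in leaders:
            A = 2.0 * a * rng.random(X.shape) - a  # assumed: A = 2a*r1 - a
            C = 2.0 * rng.random(X.shape)          # assumed: C = 2*r2
            D = np.abs(C * leader - X)             # Eq. (2)
            new_X += leader - A * D                # Eq. (3)
        X = np.clip(new_X / 3.0, lower, upper)     # Eq. (4), kept in bounds
        fitness = np.array([objective(x) for x in X])
        best = int(np.argmin(fitness))
        if fitness[best] < best_f:
            # Remember the best position seen so far.
            best_x, best_f = X[best].copy(), float(fitness[best])
    return best_x, best_f
```

For example, `gwo_minimize(lambda x: float(np.sum(x ** 2)), [-5, -5], [5, 5])` searches for the minimum of a simple quadratic bowl. Tracking the best position seen so far is a convenience added here so that the returned value never gets worse between iterations.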

3.3. The GWO-Apriori Algorithm

In the implementation of the GWO-Apriori algorithm, the minimum support and minimum confidence are converted into location parameters of the alpha wolf. Alpha, beta and delta wolves are generated and used to update the positions of omega wolves. Iterative optimization is continuously conducted until the convergence condition is met. In this work, the convergence condition is that the sum of average support and average confidence reaches the maximum. Finally, the location of the alpha wolf is output as the optimal solution of minimum support and minimum confidence. As showcased in Figure 3, the specific steps of GWO-Apriori proceed as follows:

(1): Initialize the number and locations of wolves and the maximum number of iterations. Then, set the optimization scopes of minimum support and minimum confidence. The position of an individual grey wolf corresponds to a feasible set of parameter combinations.
(2): Location of the prey is estimated by alpha, beta and delta wolves, whose position vectors are calculated. Based on the fitness parameter, three optimal individuals and their locations are determined and position of the prey is updated.
(3): Repeat the above step to update the positions of the other omega wolves. Position vectors of alpha, beta and delta wolves are also updated. Then, iterate until the termination criterion is met or the fitness threshold is reached. The position of the alpha wolf is taken as the minimum support and minimum confidence.
(4): Conduct traditional Apriori with the GWO-determined minimum support and minimum confidence.

Figure 3. Flowchart of the GWO-Apriori algorithm.
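
To make the link between a wolf position and the Apriori thresholds concrete, the sketch below shows one possible fitness function. It is an illustration under stated assumptions, not the implementation behind Figure 3: only single-antecedent rules i ⇒ j are mined (a deliberate simplification of full Apriori), the fitness is the negated sum of average support and average confidence so that a minimizer would maximize that sum, a position that yields no rules is scored 0 (worst), and the names `pair_rules` and `fitness` are hypothetical.

```python
import numpy as np


def pair_rules(data, min_sup, min_conf):
    """Mine single-antecedent rules i => j from a 0-1 matrix (rows = records)."""
    data = np.asarray(data, dtype=float)
    n_records, n_items = data.shape
    joint = data.T @ data / n_records  # joint[i, j] = support of {i, j}
    item_sup = np.diag(joint)          # support of each single item
    rules = []
    for i in range(n_items):
        for j in range(n_items):
            if i == j or item_sup[i] == 0:
                continue
            sup = joint[i, j]
            conf = joint[i, j] / item_sup[i]  # P(j | i)
            if sup >= min_sup and conf >= min_conf:
                rules.append((i, j, sup, conf))
    return rules


def fitness(position, data):
    """Score a wolf position (min_sup, min_conf); lower is better."""
    min_sup, min_conf = position
    rules = pair_rules(data, min_sup, min_conf)
    if not rules:
        return 0.0  # assumed penalty: an empty rule set is the worst case
    avg_sup = float(np.mean([r[2] for r in rules]))
    avg_conf = float(np.mean([r[3] for r in rules]))
    return -(avg_sup + avg_conf)  # negated so that minimizing maximizes the sum
```

A GWO loop would evaluate `fitness` at each wolf position within the chosen threshold ranges and report the alpha position as the pair of thresholds. Note that maximizing these averages alone tends to favor strict thresholds that leave few rules, so the search ranges in step (1) matter in practice.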

4. Application of GWO-Apriori

4.1. Data Acquisition

Varieties of hidden dangers in hydrogen pipeline transmission stations were collected for data analysis. These hidden dangers may cause accidents if not appropriately addressed. Based on property attributes, these data were classified into four types, i.e., mismanagement, operation violation, equipment defect and environmental issue. All the hidden dangers were numbered for convenient analysis. The detailed information is given in Table 1.

4.2. Data Preprocessing

As the original text information could not be used as direct input data for the association rule mining algorithm, it was necessary to convert the original literal data into the Boolean type. In each hidden danger screening program, if an item arose, it was recorded as 1, and was otherwise denoted as 0. After deleting the invalid and missing data, the remaining effective data formed a 0–1 matrix. Table 2 showcases part of the preprocessed hidden danger data of a hydrogen pipeline transmission station.
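
The Boolean conversion described above might look like the following sketch. The record layout is an assumption (each screening record is taken to be a list of hidden danger codes, with `None` or an empty list standing for invalid or missing data), and the function name `to_boolean_matrix` is hypothetical.

```python
def to_boolean_matrix(records, codes):
    """Convert screening records into a 0-1 matrix with one column per code."""
    matrix = []
    for record in records:
        if not record:
            # Drop invalid or missing records (None or empty).
            continue
        present = set(record)
        matrix.append([1 if code in present else 0 for code in codes])
    return matrix
```

For instance, with the column order `["M1", "O3", "E1", "S2"]`, a record containing M1 and S2 becomes the row `[1, 0, 0, 1]`.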

4.3. Evaluation Metrics

As mentioned above, support and confidence reflect the importance and credibility of an association rule, respectively. In Apriori, the support and confidence of the strong association rules are no smaller than their corresponding minimum values. Rules with large support values indicate frequent associations between different factors. The interaction between these factors affects the probability and severity of hidden dangers, which may eventually lead to accidents. For association rules with high confidence, the antecedent has a high probability of being correlated to the corresponding consequent. This strong relationship increases the risk of accidents in hydrogen pipeline transmission stations. In-depth analysis of the relationship between antecedent and consequent gives a better understanding of the characteristics and laws of accident hazards, which in turn helps operators formulate effective prevention measures to improve safety management. Therefore, in this work, we adopted average support and average confidence to evaluate the mining performance of the traditional and GWO-improved Apriori algorithms.

$$\overline{\sup} = \frac{\sum_{i \in n} \sup(I_{i})}{N}$$

(5)

$$\overline{\mathrm{confd}} = \frac{\sum_{i \in n} \mathrm{confd}(I_{i})}{N}$$

(6)

where $\overline{\sup}$ and $\overline{\mathrm{confd}}$ are average support and average confidence, respectively; $\sum_{i \in n} \sup(I_{i})$ and $\sum_{i \in n} \mathrm{confd}(I_{i})$ are the sums of the support and confidence of the selected rules; and N is the number of strong association rules.
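
Equations (5) and (6) are plain arithmetic means over the strong rules, as the following sketch shows. The input layout (a list of `(support, confidence)` pairs) and the function name `average_metrics` are assumptions made for illustration.

```python
def average_metrics(rules):
    """Return (average support, average confidence) per Eqs. (5) and (6).

    `rules` is a list of (support, confidence) pairs, one per strong rule.
    """
    n_rules = len(rules)
    if n_rules == 0:
        raise ValueError("at least one strong association rule is required")
    avg_support = sum(sup for sup, _ in rules) / n_rules
    avg_confidence = sum(conf for _, conf in rules) / n_rules
    return avg_support, avg_confidence
```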

5. Results and Discussion

5.1. Performance Analysis of GWO-Apriori

Hidden danger data in Table 2 were randomly selected to form five datasets with different sample sizes to check the mining performance of the traditional and GWO-improved Apriori algorithms. The minimum support and minimum confidence for traditional Apriori were empirically chosen to be 0.25 and 0.4, respectively, whereas for GWO-Apriori, they were determined by the GWO algorithm. The results are given in Figure 4. Significant increments of average support and average confidence are observed for GWO-Apriori when compared with the traditional approach. In some cases, the evaluation indexes increase by nearly 40%. For example, the average support for the 60-sets-of-data case in Figure 4A is 0.53 when using traditional Apriori, whereas the GWO-Apriori algorithm presents a sharp increase to 0.84. Similar improvement is also observed in Figure 4B. The average increments for support and confidence are 29.8% and 21.3%, respectively, which shows that by optimizing the minimum support and minimum confidence, the antecedents and consequents screened out by GWO-Apriori have a stronger correlation. This helps operators achieve better pertinency and efficiency in hidden danger identification.

5.2. Association Rule Mining Based on Traditional Apriori Algorithm

Data in Table 2 were employed to execute the Apriori algorithm to obtain the association rules of hidden dangers in hydrogen pipeline transmission stations. Minimum support of 0.25 and minimum confidence of 0.4 were also used to carry out the Apriori algorithm. In total, 105 association rules were mined. Table 3 gives the top ten high-confidence association rules of hidden dangers.

Association rules 1, 4, 5, 9 and 10 mainly focus on issues regarding safety management. Once antecedents such as handing over goods over rotating equipment (O3), slippery and wet ground (S10), failure to wear protective equipment (O14), illegal fire operation (O12) and failure to use and maintain fire-fighting equipment in accordance with regulations (O6) occur, we usually see management defects, including failure to provide safety education and training to employees (M9), incomplete elimination of discovered hidden dangers (M12), unqualified staff without qualification certificates (M7), as well as failure to check and maintain equipment as required (M2). Confidence values of these association rules range between 0.7 and 0.8, with the average confidence being 0.7574. Therefore, to control these highly risky hidden dangers, reinforcement of staff training, safety inspections and equipment maintenance becomes a necessity.

Apart from safety management problems, environment and equipment defects also deserve close attention. For example, inadequate hydrogen safety training (M1) brings about a 73.83% occurrence probability of exit passageway blockage (S2), while failure of a hydrogen detector (E1) is accompanied by a 79.87% probability of unacceptable equipment installation, use, testing and upgrading (E16). These association rules are logically easy to understand. However, some rules mined by the traditional Apriori algorithm appear to be illogical. For instance, association rule 4 shows that aerial work without safety ropes (O11) is highly correlated to ineffective communication in a confined space (S4), while association rule 7 indicates that damage to the fence around the station (S1) easily leads to filter separator clogging (E4). Support and confidence of these rules reach up to 0.4594 and 0.7824, respectively. This is somewhat strange, as the antecedents and consequents are totally different without any logical intersection. These far-fetched rules appear because association analyses are based on mathematical statistics, and the accidental concurrence of irrelevant items sometimes gives rise to unreasonable association rules of this kind, which are of no benefit in identifying and eliminating hidden dangers. Moreover, mining them is a great waste of computing resources. Therefore, it is necessary to optimize the Apriori algorithm to obtain more effective association rules more efficiently.

5.3. Association Rule Mining Based on GWO-Apriori Algorithm

The GWO-Apriori algorithm was used to mine association rules of hidden dangers from the same dataset in Table 2. The GWO is capable of directly determining the optimal minimum support and minimum confidence. The best values of these two parameters were calculated to be 0.46 and 0.48, respectively. In total, 46 strong association rules were mined. Table 4 gives the top ten with high confidence.

We see some differences between Table 3 and Table 4. First, a general improvement in the average support and average confidence of the top ten association rules is observed for GWO-Apriori when compared with the traditional Apriori algorithm, which corresponds well with Figure 4. It implies closer links between antecedents and consequents. Moreover, most reasonable association rules in Table 3 are retained in Table 4, while the aforementioned illogical association rules, including rules 4 and 7 in Table 3, are excluded by the GWO algorithm. Instead, new association rules arise. For rule 3, a support value of 0.4680 and confidence value of 0.9308 indicate that a surge of the hydrogen compressor (E2) is highly correlated to nonexecution of the pre-start-up check (O4). For rule 5, a coarse inner wall of the blow-down pipe (E8) indicates a large probability of self-ignition of hydrogen in an emergency blow-down (M13). This is ascribed to the low ignition energy of hydrogen (0.018 mJ). Under these circumstances, the blow-down pipes should be replaced with smoother ones. In fact, wherever high-pressure hydrogen flows at high velocity, a smooth surface is required to avoid self-ignition of hydrogen. Association rule $S11 \Rightarrow E12$ is also new in Table 4 when compared with Table 3. On the whole, the GWO-Apriori algorithm excluded 23 illogical association rules and mined another 12 effective ones.

As mentioned above, when a minimum support of 0.25 and a minimum confidence of 0.4 were taken to implement the traditional Apriori algorithm, 105 association rules were mined in total. As for the GWO-Apriori algorithm, under the same parameter and dataset conditions, 46 high-confidence association rules were found, which means that 59 ineffective association rules were filtered out by GWO, resulting in a significant improvement in data mining effectiveness by 56.2%. Moreover, beyond the logical accuracy of the mined association rules, computational efficiency also improved with GWO. For the traditional Apriori algorithm, a trial method with different minimum support and minimum confidence combinations is required to obtain the best values, which implies unacceptable computing time, especially when the dataset is huge. In field practice, the dataset easily runs to tens of thousands of hidden dangers. In GWO-Apriori, by contrast, the GWO directly obtains the best minimum support and minimum confidence, which avoids the time-consuming trial-and-error process of the traditional Apriori algorithm.
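
As a quick check on the reported figure, and assuming that the improvement in effectiveness is simply the share of the 105 original rules that were filtered out, the arithmetic is:

```latex
\frac{105 - 46}{105} = \frac{59}{105} \approx 0.562 = 56.2\%
```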

6. Conclusions

Pipeline transportation is a crucial piece of the industrial chain of hydrogen energy. Identification of hidden dangers is of great significance to guarantee safe production of hydrogen transmission stations. The traditional Apriori algorithm for data mining encounters some critical problems, such as poor applicability and low computational efficiency. In response, this work proposes an intelligent grey-wolf-optimizer-improved Apriori algorithm (GWO-Apriori) to mine the association rules of hidden dangers in hydrogen pipeline transmission stations. Compared with traditional Apriori, the GWO-Apriori algorithm acquires the optimal minimum support and minimum confidence by optimizing the location of the alpha wolf, which avoids the time-consuming trial-and-error process. Data mining experiments with four different types of hidden dangers in a hydrogen pipeline transmission station were conducted to verify the performances of the optimized and conventional Apriori algorithms. Under the same parameter and dataset conditions, the average support and average confidence of the GWO-Apriori algorithm increased by 29.8% and 21.3%, respectively. In total, 59 ineffective association rules out of the total 105 rules were filtered out by GWO, which indicates a 56.2% improvement in data mining effectiveness. Moreover, 23 illogical association rules were excluded by GWO, and 12 new strong association rules ignored by the traditional Apriori were successfully mined. The proposed intelligent GWO-Apriori algorithm is expected to have strong application potential in hidden danger identification for hydrogen pipeline transmission stations to prevent accidents and guarantee production safety.

Author Contributions

Conceptualization, C.W. and A.F.; methodology, C.W. and W.L.; validation, M.L. and T.C.; formal analysis, W.L.; investigation, W.L. and T.C.; writing—original draft, A.F. and W.L.; writing—review and editing, C.W. and M.L.; supervision, C.W.; funding acquisition, C.W. and A.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 52071338; the Provincial Science Foundation for Distinguished Young Scholars of Shaanxi, grant number 2022JC-34; the Science and Technology Development Project of CNPC, grant numbers 2022DQ0527 and 2023ZZ11-02; the Basic Research and Strategic Reserve Technology Research Fund Project of CNPC, grant number 2023DQ03-04.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Mingxing Li was employed by the Oil & Gas Technology Research Institute of ChangQing Oilfield Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Olabi, A.G.; Abdelkareem, M.A.; Mahmoud, M.S.; Elsaid, K.; Obaideen, K.; Rezk, H.; Wilberforce, T.; Eisa, T.; Chae, K.J.; Sayed, E.T. Green Hydrogen: Pathways, Roadmap, and Role in Achieving Sustainable Development Goals. Process Saf. Environ. Prot. 2023, 177, 664–687.
Sadeq, A.M.; Homod, R.Z.; Hussein, A.K.; Togun, H.; Mahmoodi, A.; Isleem, H.; Patil, A.R.; Moghaddam, A.H. Hydrogen Energy Systems: Technologies, Trends, and Future Prospects. Sci. Total Environ. 2024, 939, 173622.
Cheng, W.; Cheng, Y.F. A Techno-economic Study of the Strategy for Hydrogen Transport by Pipelines in Canada. J. Pipeline Sci. Eng. 2023, 3, 100112.
Achitaev, A.; Suvorov, A.; Ilyushin, P.; Volkova, I.; Kan, K.; Suslov, K. Life Extension of AC-DC Converters for Hydrogen Electrolysers Operating as Part of Offshore Wind Turbines. Int. J. Hydrogen Energy 2024, 51, 137–159.
Bulychev, N.A.; Kazaryan, M.A.; Averyushkin, A.S.; Chernov, A.A.; Gusev, A.L. Hydrogen Production by Low-Temperature Plasma Decomposition of Liquids. Int. J. Hydrogen Energy 2017, 4, 20934–20938. [Google Scholar] [CrossRef]
Tsiklios, C.; Hermesmann, M.; Müller, T.E. Hydrogen Transport in Large-Scale Transmission Pipeline Networks: Thermodynamic and Environmental Assessment of Repurposed and New Pipeline Configurations. Appl. Energy 2022, 327, 120097. [Google Scholar] [CrossRef]
Cui, J.; Kong, Y.; Liu, C.; Cai, B.; Khan, F.; Li, Y. Failure Probability Analysis of Hydrogen Doped Pipelines Based on the Bayesian Network. Eng. Fail. Anal. 2024, 156, 107806. [Google Scholar] [CrossRef]
Miao, D.; Lv, Y.; Yu, K.; Liu, L.; Jiang, J. Research on Coal Mine Hidden Danger Analysis and Risk Early Warning Technology Based on Data Mining in China. Process Saf. Environ. Prot. 2023, 171, 1–17. [Google Scholar] [CrossRef]
Xu, F.; Chen, Q.; Liu, Q.; Li, N. Intelligent Analysis Algorithm for Hidden Danger Identification of Intelligent Network Monitoring System from the Perspective of Big Data. Procedia Comput. Sci. 2023, 228, 57–63. [Google Scholar] [CrossRef]
Liu, W.; Luo, R.; Xiao, M.; Chen, Y. Intelligent detection of hidden distresses in asphalt pavement based on GPR and deep learning algorithm. Constr. Build. Mater. 2024, 416, 135089. [Google Scholar] [CrossRef]
Dehghani, M.; Yazdanparast, Z. Discovering the Symptom Patterns of COVID-19 From Recovered and Deceased Patients Using Apriori Association Rule Mining. Inform. Med. Unlocked 2023, 42, 101351. [Google Scholar] [CrossRef]
Cleland, Z.W.; Dao, K.A.; Dao, T.H.D. Detecting Changes in Spatial Characteristics of Colorado Human-Caused Wildfires Using Apriori-Based Frequent Itemset Mining. Comput. Environ. Urban Syst. 2023, 101, 101941. [Google Scholar] [CrossRef]
Papi, R.; Attarchi, S.; Boloorani, A.D.; Samany, N.N. Knowledge Discovery of Middle East Dust Sources Using Apriori Spatial Data Mining Algorithm. Ecol. Inform. 2022, 72, 101867. [Google Scholar] [CrossRef]
Li, L.; Wu, Y.; Huang, Y.; Li, B.; Miao, F.; Deng, Z. Optimized Apriori Algorithm for Deformation Response Analysis of Landslide Hazards. Comput. Geosci. 2023, 170, 105261. [Google Scholar] [CrossRef]
Rafindadi, A.D.; Shafiq, N.; Othman, I.; Ibrahim, A.; Aliyu, M.M.; Mikić, M.; Alarifi, H. Data Mining of The Essential Causes of Different Types of Fatal Construction Accidents. Heliyon 2023, 9, e13389. [Google Scholar] [CrossRef]
Qiu, Z.; Liu, Q.; Li, X.; Zhang, X.; Zhang, J. Construction and Analysis of a Coal Mine Accident Causation Network Based on Text Mining. Process Saf. Environ. Prot. 2021, 153, 320–328. [Google Scholar] [CrossRef]
Shi, G.; Ding, X.; Hong, C.; Liu, Z.; Zhao, L. Research on Key Risk Chain Mining Method for Urban Rail Transit Operations: A New Approach to Risk Management. Int. J. Transp. Sci. Technol. 2024, 13, 29–43. [Google Scholar] [CrossRef]
Yi, J.; Chen, K.; Liu, H.; Liang, K.; Mi, H.; Zhou, W. A Hybrid Association Analysis Framework of Accident Reports for Offshore Drilling Platforms. J. Loss Prev. Process Ind. 2023, 85, 105161. [Google Scholar] [CrossRef]
Djenouri, Y.; Comuzzi, M. Combining Apriori Heuristic and Bio-Inspired Algorithms for Solving the Frequent Itemsets Mining Problem. Inf. Sci. 2017, 420, 1–15. [Google Scholar] [CrossRef]
Yürüşen, N.Y.; Uzunoğlu, B.; Talayero, A.P.; Estopiñán, A.L. Apriori and K-Means Algorithms of Machine Learning for Spatio-Temporal Solar Generation Balancing. Renew. Energy 2021, 175, 702–717. [Google Scholar] [CrossRef]
Bhandari, A.; Gupta, A.; Das, D. Improvised Apriori Algorithm Using Frequent Pattern Tree for Real Time Applications in Data Mining. Procedia Comput. Sci. 2015, 46, 644–651. [Google Scholar] [CrossRef]
Lien, L.; Cheng, M. A Hybrid Swarm Intelligence Based Particle-Bee Algorithm for Construction Site Layout Optimization. Expert Syst. Appl. 2012, 39, 9642–9650. [Google Scholar] [CrossRef]
Wang, H.; Arasteh, B.; Arasteh, K.; Gharehchopogh, F.S.; Rouhi, A. A Software Defect Prediction Method Using Binary Gray Wolf Optimizer and Machine Learning Algorithms. Comput. Electr. Eng. 2024, 118, 209336. [Google Scholar] [CrossRef]
Zhang, T.; Hu, H.; Liang, Y.; Liu, X.; Rong, Y.; Wu, C.; Zhang, G.; Huang, Y. A Novel Path Planning Approach to Minimize Machining Time in Laser Machining of Irregular Micro-Holes Using Adaptive Discrete Grey Wolf Optimizer. Comput. Ind. Eng. 2024, 193, 110320. [Google Scholar] [CrossRef]
Zhu, M.; Xu, W.; Ma, W. A Novel Prestress Design Method for Cable-Strut Structures with Grey Wolf-Fruit Fly Hybrid Optimization Algorithm. Structures 2024, 67, 106932. [Google Scholar] [CrossRef]
Hu, J.; Song, Z.; Tan, Y.; Tan, M. Optimizing Integrated Energy Systems Using a Hybrid Approach Blending Grey Wolf Optimization with Local Search Heuristics. J. Energy Storage 2024, 87, 111384. [Google Scholar] [CrossRef]
Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey Wolf Optimizer. Adv. Eng. Softw. 2014, 69, 46–61. [Google Scholar] [CrossRef]

Figure 1. Processes for mining association rules of hidden dangers.

Figure 2. Hunting behaviors of grey wolves: (A) chasing, approaching and tracking prey; (B–D) pursuing, harassing and encircling; (E) stationary situation and attack [27].

Figure 4. Improvement of (A) average support and (B) average confidence of GWO-Apriori.

Table 1. Classification of hidden dangers in hydrogen pipeline transmission stations.

Types of Hidden Dangers	Detailed Items
Mismanagement	Inadequate hydrogen safety training (M1); failure to check and maintain equipment as required (M2); production responsibility system not established (M3); lack of evaluation of employee competence (M4); risk mitigation measures not taken prior to operation (M5); staff unfamiliar with risk mitigation measures (M6); unqualified staff without qualification certificates (M7); unauthorized activities (M8); failure to provide safety education and training to employees as required (M9); forcible execution of illegal operations against job regulations (M10); approval of illegal affairs (M11); incomplete elimination of discovered hidden dangers (M12); self-ignition of hydrogen in emergency blow-down (M13); lack of job inspection (M14).
Operation violation	Excessive stacking height of materials (O1); explosion-proof tools not used in flammable and explosive areas (O2); passing goods over rotating equipment (O3); nonexecution of pre-start-up check (O4); operations with body instead of tools (O5); failure to use and maintain fire-fighting equipment in accordance with regulations (O6); overloading of equipment (O7); touching switches with wet hands (O8); explosive operation without wearing antistatic clothing (O9); carrying kindling into operation areas (O10); aerial work without safety rope (O11); illegal fire operation (O12); unauthorized entry into restricted areas (O13); failure to wear protective equipment (O14).
Equipment defect	Failure of hydrogen detector (E1); surge of hydrogen compressor (E2); deterioration of hydrogen compressor sealing performance (E3); clogging of filter separator (E4); aging of fiber optic splice closure (E5); equipment fails to meet the explosion-proof standard (E6); weld defect of pipelines (E7); coarse inner wall of the blow-down pipe (E8); poor lighting (E9); damaged protection layer of electrical cables (E10); internal and external leakage of valves (E11); dampened electrical equipment (E12); malfunction of automatic interlock protection system (E13); water shortage of fire pool (E14); explosion-proof grade of electrical equipment inconsistent with stipulation (E15); unacceptable equipment installation, use, testing and upgrading (E16).
Environmental issue	Damage of fence around the station (S1); blockage of exit passageway (S2); disarray of materials (S3); ineffective communication in confined space (S4); lack of safety warning signs (S5); improper setting of explosion-protection facilities (S6); insufficient safety distance (S7); nearby landslide (S8); unqualified fire rating of building materials (S9); slippery and wet ground (S10); extreme temperature and humidity (S11); excessive noise (S12); mixed setting of administrative and storage areas (S13); foundation settlement (S14).
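
The coding scheme in Table 1 can be captured in a small lookup, which is convenient when reading the rule tables that follow. The sketch below is only an illustration: the names `CATEGORIES`, `category_of` and `all_codes` are hypothetical and are not taken from the article. The prefixes and item counts (M1–M14, O1–O14, E1–E16, S1–S14) come from Table 1.

```python
import re

# Prefix -> (category name, number of items), as listed in Table 1.
CATEGORIES = {
    "M": ("Mismanagement", 14),
    "O": ("Operation violation", 14),
    "E": ("Equipment defect", 16),
    "S": ("Environmental issue", 14),
}


def category_of(code):
    """Return the Table 1 category for an item code such as 'E16'."""
    match = re.fullmatch(r"([MOES])(\d+)", code)
    if match is None:
        raise ValueError(f"not a hidden-danger code: {code!r}")
    prefix, number = match.group(1), int(match.group(2))
    name, count = CATEGORIES[prefix]
    if not 1 <= number <= count:
        raise ValueError(f"{code!r} is outside the range {prefix}1-{prefix}{count}")
    return name


def all_codes():
    """List every item code defined in Table 1, in table order."""
    return [
        f"{prefix}{i}"
        for prefix, (_, count) in CATEGORIES.items()
        for i in range(1, count + 1)
    ]


if __name__ == "__main__":
    print(category_of("O3"), "->", category_of("M9"))
    print(len(all_codes()), "item codes in total")
```

A lookup of this kind makes it easy to see whether a rule links items within one category or across categories.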

Table 2. Data segment of hidden dangers in a hydrogen pipeline transmission station.

Investigation No.	M1	M2	M3	M4	…	S1	S2	S3	S4	…
1	1	0	1	0	…	0	1	0	0	…
2	1	1	0	0	…	1	0	1	0	…
3	0	0	0	1	…	1	0	0	1	…
4	1	0	1	0	…	0	1	0	0	…
5	0	1	0	1	…	0	0	0	0	…
6	1	0	0	1	…	0	1	0	0	…
…	…	…	…	…	…	…	…	…	…	…
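
To illustrate how support and confidence are read off a binary table of this form, the following minimal sketch uses only the six rows and eight columns visible in Table 2; the elided rows and columns are left out, so the numbers it prints describe this excerpt only and are not the values reported in the article. The function names are hypothetical, and the definitions used are the standard ones: support is the fraction of investigations containing all items of a rule, and confidence is the number of investigations containing both antecedent and consequent divided by the number containing the antecedent.

```python
# Visible excerpt of Table 2 (elided rows and columns are not reconstructed).
COLUMNS = ["M1", "M2", "M3", "M4", "S1", "S2", "S3", "S4"]
ROWS = [
    [1, 0, 1, 0, 0, 1, 0, 0],  # investigation 1
    [1, 1, 0, 0, 1, 0, 1, 0],  # investigation 2
    [0, 0, 0, 1, 1, 0, 0, 1],  # investigation 3
    [1, 0, 1, 0, 0, 1, 0, 0],  # investigation 4
    [0, 1, 0, 1, 0, 0, 0, 0],  # investigation 5
    [1, 0, 0, 1, 0, 1, 0, 0],  # investigation 6
]


def to_transactions(columns, rows):
    """Turn each 0/1 row into the set of hidden-danger codes it contains."""
    return [frozenset(c for c, v in zip(columns, row) if v == 1) for row in rows]


def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t)


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """count(antecedent and consequent) / count(antecedent); 0.0 if undefined."""
    n_antecedent = count(transactions, antecedent)
    if n_antecedent == 0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return count(transactions, both) / n_antecedent


if __name__ == "__main__":
    tx = to_transactions(COLUMNS, ROWS)
    print("support(M1, S2)   =", support(tx, {"M1", "S2"}))
    print("confidence(M1->S2) =", confidence(tx, {"M1"}, {"S2"}))
```

Apriori keeps an itemset only if its support reaches the minimum support threshold, and keeps a rule only if its confidence reaches the minimum confidence threshold; these are the two thresholds that GWO tunes in the proposed method.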

Table 3. High-confidence association rules of hidden dangers using traditional Apriori algorithm.

Rule No.	Antecedent	Consequent	Support	Confidence
1	O3	M9	0.6328	0.8023
2	E1	E16	0.4547	0.7987
3	S10	M12	0.5238	0.7940
4	S4	O11	0.4594	0.7824
5	O14	M9	0.4890	0.7612
6	O13	S5	0.6305	0.7487
7	S1	E4	0.4354	0.7388
8	M1	S2	0.6840	0.7383
9	O12	M7	0.5671	0.7257
10	O6	M2	0.4515	0.7040

Table 4. High-confidence association rules of hidden dangers using the GWO-Apriori algorithm.

Rule No.	Antecedent	Consequent	Support	Confidence
1	O3	M9	0.5512	0.9551
2	E1	E16	0.6504	0.9483
3	O4	E2	0.4680	0.9308
4	O14	M9	0.6271	0.9247
5	E8	M13	0.5113	0.9162
6	S10	M12	0.5110	0.9031
7	O13	S5	0.4951	0.9002
8	O12	M7	0.5723	0.8836
9	S11	E12	0.6068	0.8760
10	O6	M2	0.7160	0.8634
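
As a rough aid for comparing Tables 3 and 4, the sketch below loads the ten listed rules from each table, averages their support and confidence, and reports which rules appear in only one listing. It covers only these top-ten excerpts, so the averages are not the full-rule-set figures quoted in the abstract, and a rule missing from one listing is not necessarily missing from that algorithm's complete output. The variable and function names are hypothetical.

```python
from statistics import fmean

# (antecedent, consequent) -> (support, confidence), copied from Table 3.
TABLE_3 = {
    ("O3", "M9"): (0.6328, 0.8023), ("E1", "E16"): (0.4547, 0.7987),
    ("S10", "M12"): (0.5238, 0.7940), ("S4", "O11"): (0.4594, 0.7824),
    ("O14", "M9"): (0.4890, 0.7612), ("O13", "S5"): (0.6305, 0.7487),
    ("S1", "E4"): (0.4354, 0.7388), ("M1", "S2"): (0.6840, 0.7383),
    ("O12", "M7"): (0.5671, 0.7257), ("O6", "M2"): (0.4515, 0.7040),
}

# Same layout, copied from Table 4 (GWO-Apriori).
TABLE_4 = {
    ("O3", "M9"): (0.5512, 0.9551), ("E1", "E16"): (0.6504, 0.9483),
    ("O4", "E2"): (0.4680, 0.9308), ("O14", "M9"): (0.6271, 0.9247),
    ("E8", "M13"): (0.5113, 0.9162), ("S10", "M12"): (0.5110, 0.9031),
    ("O13", "S5"): (0.4951, 0.9002), ("O12", "M7"): (0.5723, 0.8836),
    ("S11", "E12"): (0.6068, 0.8760), ("O6", "M2"): (0.7160, 0.8634),
}


def mean_metrics(rules):
    """Return (mean support, mean confidence) over the listed rules."""
    supports = [s for s, _ in rules.values()]
    confidences = [c for _, c in rules.values()]
    return fmean(supports), fmean(confidences)


# Rules that appear in only one of the two top-ten listings, or in both.
only_in_gwo = sorted(set(TABLE_4) - set(TABLE_3))
only_in_traditional = sorted(set(TABLE_3) - set(TABLE_4))
shared = sorted(set(TABLE_3) & set(TABLE_4))

if __name__ == "__main__":
    for label, rules in (("Table 3", TABLE_3), ("Table 4", TABLE_4)):
        mean_s, mean_c = mean_metrics(rules)
        print(f"{label}: mean support {mean_s:.4f}, mean confidence {mean_c:.4f}")
    print("listed only in Table 4:", only_in_gwo)
    print("listed only in Table 3:", only_in_traditional)
```

Within these excerpts, O4 → E2, E8 → M13 and S11 → E12 are listed only in Table 4, while S4 → O11, S1 → E4 and M1 → S2 are listed only in Table 3.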

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wang, C.; Fu, A.; Li, W.; Li, M.; Chen, T. Intelligent Identification of Hidden Dangers in Hydrogen Pipeline Transmission Station Using GWO-Optimized Apriori Algorithm. Energies 2024, 17, 4539. https://doi.org/10.3390/en17184539

