**A Framework for Big Data Analytical Process and Mapping—BAProM: Description of an Application in an Industrial Environment**

#### **Giovanni Gravito de Carvalho Chrysostomo 1, Marco Vinicius Bhering de Aguiar Vallim 1, Leilton Santos da Silva 2, Leandro A. Silva 1,\* and Arnaldo Rabello de Aguiar Vallim Filho 3,\***


Received: 30 July 2020; Accepted: 9 November 2020; Published: 18 November 2020

**Abstract:** This paper presents an application of a framework for Big Data Analytical Process and Mapping—BAProM—consisting of four modules: Process Mapping, Data Management, Data Analysis, and Predictive Modeling. The framework was conceived as a decision support tool for industrial business, encompassing the whole big data analytical process. The first module incorporates into the big data analytical process a mapping of processes and variables, which is not common in such processes and proved to be adequate in the practical application that was developed. Next, an analytical "workbench" was implemented for data management and exploratory analysis (Modules 2 and 3) and, finally, Module 4 implements artificial intelligence algorithms to support predictive processes. The modules are adaptable to different types of industry and problems and can be applied independently. The paper presents a real-world application whose final objective was the implementation of a predictive maintenance decision support tool in a hydroelectric power plant. The process mapping in the plant identified four subsystems and 100 variables. With the support of the analytical workbench, all variables were properly analyzed. All underwent a cleaning process and many had to be transformed before being subjected to exploratory analysis. A predictive model, based on a decision tree (DT), was implemented for predictive maintenance of equipment, identifying critical variables that define the imminence of an equipment failure. This DT model was combined with a time series forecasting model, based on artificial neural networks, to project those critical variables into the future.
The real-world application demonstrated the practical feasibility of the framework, particularly the effectiveness of the analytical workbench for pre-processing and exploratory analysis, as well as of the combined predictive model, which proved able to provide information on future events leading to equipment failures.

**Keywords:** big data process; predictive maintenance; machine learning

#### **1. Introduction**

Interest in data-based knowledge applied to decision-making processes has been growing in different industrial segments [1]. The importance of this movement toward data-driven decisions is clear: organizations with better performance have used data analysis five times more than those with low performance [2].

This movement of implementing a so-called KDD—knowledge discovery in databases—environment is relatively new in industrial business, and it is due, on the one hand, to the huge volume of data generated (big data), which is largely the result of the Internet of Things (IoT), where sensors connected to a variety of objects, spread across the planet, have accelerated the big data phenomenon. On the other hand, data availability has sparked interest in using these historical data to support decisions, based on mathematical models and algorithms, mainly those of artificial intelligence (AI), which allow predictions of different types of events, such as the imminence of equipment failure, triggering a preventive maintenance schedule [3].

The combination of concepts, such as big data, IoT and AI, has had a considerable impact on industrial business, defining the main dimension of Industry 4.0, which can be defined as a concept that encompasses automation and information technology, transforming raw materials into value-added products from data-driven sources [3,4].

One of the main areas is AI-based predictive maintenance. In this type of maintenance, rather than scheduling operation suspensions at fixed time intervals, the best stopping moment is defined by AI inference, the result of an analytical model calibrated (trained) on historical data [4–6].

A continuous monitoring of equipment, by AI algorithms, can have an important impact by allowing the reduction of corrective maintenance, which occurs unexpectedly and is strongly undesirable, compromising budgets and industrial production. Advance information that an equipment failure is close allows for proactive and planned actions to mitigate these financial impacts. This is clearly a trade-off between investment in research and development and equipment productivity [7,8].

New companies are already starting operations considering the modern concepts of Industry 4.0, but traditional industries are also entering this new world, seeking to improve their processes by including Industry 4.0 elements.

#### *1.1. Motivation*

In this paper, we will deal with one of these cases. It is a real-world case of a hydroelectric power plant, operating since 1926, which despite being an operation within traditional standards, has, over time, been updated to receive monitoring systems based on data collection sensors. The objective now was to go one step further, developing an analytical "workbench" for data exploration and, furthermore, implementing applications of AI algorithms to support a predictive process.

So far, the plant updating process has been developed incrementally, but with little documentation. The mapping of processes and sensors, for example, was not fully updated. Therefore, if new improvements were desired, these mappings would have to be completed before any new action. Such mappings could provide a clear understanding of the power plant system and subsystems, as well as the types of sensors installed and the variable observations collected. With an understanding of this entire universe, the road would be open for new developments. As a result of these process mappings, as well as an exploratory data analysis, a favorable environment would be created for the application of AI algorithms to support the implementation of predictive models, thus achieving a consistent KDD environment.

Therefore, the main motivation behind this paper was to report in the literature the experience obtained in this research project, in which all phases of a big data process were covered, and which led to the construction of a framework (BAProM) that can be used in industrial systems of different types.

The description of this framework, accompanied by an implementation in a real-world case, may lead other researchers to develop similar works, and professionals in the field to make better-informed, and therefore more confident, decisions.

#### *1.2. Research Question*

This subsection presents the research question (RQ) that drove all the development of the study described in this paper.

#### **RQ:**

#### **What are the phases and their respective internal structures to constitute a consistent framework focused on the big data process, which could be applied in real-world cases of predictive maintenance?**

As the question states, its purpose is to define the phases, tasks and techniques that must be employed in each step of a big data process, covering everything from the identification of relevant processes and variables to the implementation of prediction models. Such a framework should be suitable for application in predictive maintenance use cases.

#### *1.3. Objectives*

Based on the RQ, the objective of this study, therefore, is to address these issues, and it must do so through a framework proposal that has been called BAProM—Big Data Analytical Process and Mapping.

As specific objectives of the study, we must:


#### *1.4. Implications and Contributions*

The importance of studying the big data process stems from the relevance the subject has acquired in Industry 4.0, as more and more stakeholders adopt data-driven decision-making practices.

The implications of data analysis and prediction models, the expected products of a big data process, go far beyond Industry 4.0. In fact, their benefits spread across all areas of activity.

In Industry 4.0, in particular, the implications of a framework that could be implemented as systematic procedures in the operation can be huge. Such models would lead to a minimization of corrective maintenance occurrences, in addition to optimizing the periodic maintenance schedule. Productivity can increase, as can profit. As the amounts involved in industry can be significantly high, so would be the benefits of costs savings.

This paper, therefore, can make a significant practical contribution to an important economic sector.

On the other hand, the conceptual and technical implications of the paper can also be significant, since novelties are proposed and validated by a complete implementation in a real-world case.

The mapping of processes and variables is often not present in the big data processes described in the literature, and this paper seeks to draw attention to this fact and show its relevance in the direction the project took.

The development of an analysis and data exploration tool, with the demonstration in the article of its use in different stages of the process, is another contribution of the study that should have implications in the way the projects are developed.

In addition, a combined prediction model, employing a decision tree complemented by an artificial neural network to forecast critical variables for a future period, as will be presented in this paper, is not often seen in the literature.

The article thus acquires some relevance with these contributions and may have positive implications both from a conceptual and practical point of view.

The description of the BAProM framework, as well as the real-world application case, is presented in the paper over five more sections. In Section 2, we give a literature overview of the works related to this research. Section 3 presents the methodology employed in the conception of the framework and shows how it could be implemented. In Section 4 we describe the Case Study developed in the hydroelectric power plant, and Section 5 shows and discusses the results of these practical applications. Finally, Section 6 presents the conclusions and gives directions for future works.

#### **2. Related Works**

Every industry, including power generation, wants its equipment to be as efficient as possible, which means operating at full load (or close to it), producing as much as possible and having the equipment for the maximum available time [9]. Therefore, maintenance aims to inspect any equipment to ensure its effectiveness, avoiding unexpected failures [10].

The most common type of maintenance is a periodic one, called preventive maintenance, which consists of stopping the equipment according to a predefined schedule, and performing scheduled services and inspections to check for additional repair needs. Most preventive maintenance stops can prove to be unnecessary, resulting in maintenance expenses and loss due to production stoppage. However, even so, this type of maintenance is sustained by the industry, as it is still the best resource to avoid corrective maintenance [11].

Corrective maintenance follows a failure of equipment during the industrial process, generating a high financial impact on budgets due, above all, to the immediate need for repair and spare parts, in addition to interrupting the production chain in an unplanned way [12,13].

The best scenario would be one in which the ideal time for maintenance is known in advance. However, this type of discovery is not trivial, as it involves a complex system of variables related to operation, maintenance, production and even the human factors of those handling the equipment [14].

These questions increase the interest in installing sensors in a variety of equipment, collecting data almost in real time (in the order of seconds) about their mechanical, electrical or operational conditions. Having the data and developing analyses makes it possible to arrive at models that support decisions about when maintenance should occur and what procedures should be adopted for eventual failures. Decisions, in this case, are supported by and based on information extracted from data (a data-driven approach) [7,8,15].

When a maintenance decision is based on information extracted from collected data, it generates a proactive action. This paradigm shift from reactive (corrective) to proactive maintenance actions is also described in the literature as the transformation of time-based maintenance (TBM) into condition-based maintenance (CBM) [7,8].

Proactive maintenance uses concepts of the Internet of Things (IoT), big data (BD) and artificial intelligence (AI). Simply put, for conceptualization purposes, the sensors used in monitoring are associated with the IoT component; the process of collecting data and its exploratory processing into the database is associated with BD; and the training of algorithms to generate models for decision-making is associated with AI.

Literature points towards a new industry revolution. After the mechanical, electrical and automation revolutions that brought mass production, assembly lines and information technology, raising workers' income and making technological competition the core of economic development, the fourth industrial revolution is characterized by a set of technologies, where the operation is modernized with sensors for monitoring, collecting, and storing data and using data-mining techniques, with intelligent algorithms to support decision-making [3,16,17].

Used together, the Industry 4.0 approaches are promising because they can monitor, diagnose and predict possible failures, in addition to indicating the best time for maintenance to occur. Papers focused on anticipating the best time for maintenance define this approach as predictive maintenance [18–20].

Related work emphasizes the choice of specific algorithms or composite algorithms, in order to seek the best performance in predicting the best time for a maintenance service. Composite algorithms imply, on the one hand, the use of techniques for dimensionality reduction, which may be needed due to the high number of sensors. These include Principal Components Analysis (PCA) [15–17], data clustering algorithms such as K-Means [21], or probabilistic models such as the Bayesian Belief Network (BBN) [3]. On the other hand, there is the use of AI algorithms, of which the most used in predictive maintenance are Support Vector Machines (SVM) [16,17,22], Artificial Neural Networks (ANN) [18,22], Bayesian Belief Networks [3], Random Forest (RF) [22], Partial Least Squares (PLS) [15], Markov Chains and deterministic methods [23,24]. These works are discussed in more detail below.
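The dimensionality-reduction step mentioned above can be illustrated with a minimal PCA sketch. This is not any of the cited implementations; it is an illustrative Python example on synthetic data, projecting highly correlated sensor channels onto their top principal components:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder by descending variance
    components = eigvecs[:, order[:n_components]]
    return Xc @ components

# Synthetic "sensor" matrix: four nearly identical channels, so one
# principal component captures almost all the variance.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 1))
X = np.hstack([base + 0.01 * rng.normal(size=(6, 1)) for _ in range(4)])
Z = pca_reduce(X, 2)
print(Z.shape)  # (6, 2)
```

Because the four channels are redundant, nearly all variance falls on the first component, which is exactly the situation that motivates reducing a high number of sensors to a few informative dimensions.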

Yin et al. present a survey of studies employing statistical methods for monitoring and detecting failures in large-scale industrial systems. As their main results, database problems stand out, and among them can be highlighted the high number of variables, wrong measurements and missing values. For variable treatment, especially dimensionality reduction, and monitoring to detect flaws, the authors conclude that the best approaches are PCA and regression by PLS. The combination also allows identification of the most significant variables in an equipment failure [15].

Another paper, developed by Jing and Hou, used the Tennessee-Eastman Process (TEP) to simulate an industrial chemical environment in order to assess process control, process monitoring and diagnostic methods. As far as diagnosis is concerned, the authors used PCA to reduce the dimensionality of the data and SVM for the diagnostic classification [16].

A survey of articles from 2007 to 2015 using SVM to detect failures in industrial environments is presented in a paper by Yin and Hou. The main conclusion of this research was that the best results were obtained when the SVM was combined with some other dimensionality reduction technique [17].

Lee et al. proposed an analytical framework with Prognostics and Health Management (PHM) algorithms, aiming to learn how equipment operates normally and to predict its lifespan. Self-analysis of the equipment is performed using unsupervised algorithms such as the ANN Self-Organizing Map (SOM), defining normal operating standards. Therefore, when the operation reaches a certain level of dispersion in relation to its standard behavior, as learned by the SOM, the algorithm infers that a degradation process has just started [3].
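The underlying idea—learning a normal operating profile and flagging dispersion from it—can be sketched far more simply than with a SOM. The following is a hypothetical z-score monitor, not Lee et al.'s implementation; the temperature readings are invented for illustration:

```python
from statistics import mean, stdev

def learn_baseline(normal_readings):
    """Learn the normal operating profile from healthy-period data."""
    return mean(normal_readings), stdev(normal_readings)

def degradation_alert(reading, baseline, k=3.0):
    """Flag a reading that deviates more than k standard deviations
    from the learned normal behavior."""
    mu, sigma = baseline
    return abs(reading - mu) > k * sigma

# Hypothetical bearing-temperature stream (degrees C) from a healthy period.
healthy = [61.8, 62.1, 61.9, 62.3, 62.0, 61.7, 62.2]
baseline = learn_baseline(healthy)
print(degradation_alert(62.1, baseline))  # False: within normal dispersion
print(degradation_alert(68.5, baseline))  # True: dispersion suggests degradation
```

A SOM generalizes this single-variable idea to a learned map of multivariate normal states, but the inference step is the same: distance from learned normality signals the start of degradation.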

The development of an ANN based on operation data of machining equipment is the content of a paper by Yan et al. The objective of the research was to estimate the remaining life of the most relevant component of that equipment. The work also proposes the need for a standardization of semi-structured and unstructured data from industrial processes, to improve the accuracy of the prediction algorithms. An improvement occurs because heterogeneous data, such as vibration signals from the machine and images of the machine's working environment, can provide important information for the prediction model after being structured and standardized [18].

Gatica et al. propose two approaches to predictive maintenance, named online and offline. The approaches have top-down and bottom-up strategies. In the "top-down" approach, the process begins with understanding the use case, as well as the machines employed. From this, a mental model of the process is made, in which a hypothesis of how the process impacts data collection is elaborated. Finally, the hypothesis is tested by analyzing the sensor data. In the "bottom-up" strategy, the process has the following flow: data collection, exploratory analysis, selection of variables, predictive modeling and validation of results based on the experience of the industrial process team [20].

A model to evaluate equipment failure time by collecting data with a vibration sensor was proposed by Sampaio et al. Their objective was to develop a relationship between the vibration levels and the equipment failure time, thus raising a characteristic curve that was learned by three AI algorithms: ANN, RF and DT. The lowest RMSE (Root Mean Square Error) was achieved by the ANN [22].
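The RMSE metric used to rank those models is straightforward to state; the following Python sketch uses purely hypothetical failure-time predictions, not data from the cited study:

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    assert len(actual) == len(predicted)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical failure-time predictions (hours) from two competing models.
observed = [100.0, 150.0, 200.0]
model_a  = [ 98.0, 153.0, 199.0]   # small errors
model_b  = [ 90.0, 165.0, 210.0]   # large errors
print(rmse(observed, model_a) < rmse(observed, model_b))  # True: model_a wins
```

RMSE penalizes large errors quadratically, which is why it is a common criterion when a single badly mistimed maintenance prediction is costlier than several small misses.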

Wang et al. presented a framework named Policy Semi-Markov Decision Process (PSMDP) to find the best time for predictive maintenance, based on the system's deteriorating state. The proposal aimed to understand the equipment operating status, so that maintenance could be planned considering both production efficiency and maintenance expenses. The work aims to discover when equipment is about to present a failure and, consequently, to establish an action plan for the best maintenance moment [23].

A paper developed by Gao et al. presented a bibliographic review of works dealing with approaches involving fault detection based on signals and methods of deterministic models. The result is a taxonomy of fault diagnosis approaches for deterministic systems, stochastic fault diagnosis methods, discrete and hybrid event diagnostic approaches, and diagnostic techniques for networked and distributed systems [24].

The works presented in this section focus on different aspects of predictive maintenance. Among all the works mentioned here, only Gatica et al., as explained above, thought of the problem in the form of a process [20]. The others focused on the techniques involved, and among these, data set problems stand out. Data collected from sensors have problems of outliers, missing values, standardization and dimensionality, which were fully addressed only by [18]. Others were concerned only with dimensionality reduction, which was resolved with the use of PCA. Regarding prediction, the SVM algorithm is widely used, but without further discussion of the parameterization and the kernel used. In part, the strong use of this algorithm is due to its performance in comparison with other methods. However, most of the applications are in contexts that are not necessarily industrial environments.

A systematic review of machine-learning methods applied to predictive maintenance, conducted on two scientific databases, IEEE Xplore and Science Direct [25], gave an overview of the maintenance types—corrective, preventive and predictive—and sought to show the machine-learning methods being explored and the performance of the techniques. An analysis of the papers between 2009 and 2018 showed that techniques of the most diverse types have been widely used, such as: Decision Tree, RF—Random Forest, k-NN—k-Nearest Neighbors, SVM—Support Vector Machine, Hierarchical clustering, k-means, Fuzzy C-means, ANN—Artificial Neural Network, LSTM—Long Short-Term Memory Network, ARIMA—Autoregressive Integrated Moving Average, ANOVA—Analysis of Variance, Linear Regression, GLM—Generalized Linear Model, and others.

In another paper, the authors presented a machine-learning approach for detecting drifting behavior—so-called concept drift—in continuous data streams, as a potential indication of defective system behavior, and depicted initial tests on synthetic data sets. The machine-learning techniques used in the study were LR, RF and Symbolic Regression (SR). They also presented a real-world case study with industrial radial fans and discussed promising results from applying their approach [26].

The literature also presents a predictive maintenance framework based on sensor measurements [27], in which a prognostic model, based on a Long Short-Term Memory network, is developed and oriented towards the requirements of operation planners. Its performance is compared with two benchmark maintenance policies: a classical periodic one and an ideal case (perfect prognostic information) called ideal predictive maintenance (IPM). The mean cost rate of the proposed framework was lower than that of the periodic maintenance policy and close to the ideal IPM case. Other works confirm that big data and IoT play a fundamental role in data-driven applications for Industry 4.0, as is the case of predictive maintenance [28]. The authors of this paper reviewed the strengths and weaknesses of open-source technologies for big data and stream processing and tried to establish their usage in some cases. As a result, they proposed combinations of such technologies for predictive maintenance in two cases: railway maintenance in the transportation industry, and wind turbine maintenance in the energy industry.

Another study proposed a Weibull proportional hazards model to jointly represent degradation and failure time data. The authors explained that degradation refers to the cumulative change of the performance characteristic of an object over time, such as the capacity of batteries of hybrid-electric vehicles, the leveling defects of railway tracks, and so on. The proposed strategy was applied to the predictive maintenance of lead-acid batteries and proved to be adequate [29].

This review sought to provide an overview of the main aspects associated with the theme of this research. Thus, works were presented showing the context of the Industry 4.0 environment, involving the maintenance of equipment, the acquisition of data for monitoring, based on sensors, the use of AI algorithms based on ANN for failure prediction, the use of statistical methods for monitoring and fault detection, and other proposed analytical structures. A rich field of opportunities has been presented.

From this picture of opportunities, verified in the literature review, emerged the proposal of the Big Data Analytical Process and Mapping framework, BAProM: an analytical framework covering the entire big data process and also including an initial phase of detailed mapping of processes and variables, which is not frequently seen in the literature. As stated before, the framework consists, synthetically, of four modules: Process Mapping, Data Management, Data Analysis and Predictive Modeling.

Such a framework, including the mapping of processes and variables to a predictive analysis and showing results of an implementation in a real-world case, is a novelty in the literature.

The details of each of these modules are presented in the next section, along with the reasons for each technique selected to become part of this first version of the framework, which was validated in a real-world case at the Henry Borden hydroelectric plant (Section 4).

In addition to its conceptual relevance, the research gains practical importance by being applied to a relevant industrial system in the real world, creating mechanisms for monitoring the operation and predicting equipment failures, which can be avoided once they are known in advance.

#### **3. Framework**

A classical development of a big data project starts with data collection regarding the important variables of the system under study [1]. However, in some cases it is not so clear what these variables are, since comprehensive documentation may not be available. In these cases, an earlier phase of mapping the processes and the relevant variables that characterize the state of the system is necessary.

The framework proposed in this paper introduces into the big data process a phase of mapping processes and variables as a fundamental initial step. This is followed by data management, which includes the collection of primary data; then a data analysis phase, of a more exploratory character; and, finally, predictive modeling.

The entire process was consolidated into four modules, whose details are shown as follows:

#### **Module 1: Process Mapping**


#### **Module 2: Data Management**


#### **Module 3: Data Analysis**


#### **Module 4: Predictive Modeling**


Figure 1 illustrates the complete framework, including the techniques and the computational tools applied in each step. Please note that the proposed framework is an extension of the big data process proposed by [1]. Here, the flow of activities incorporates mapping, which therefore becomes an integral part of a big data process. This, in a way, is recommended in the CRISP-DM (Cross-Industry Standard Process for Data Mining) model, which suggests the understanding of processes and data as initial phases [30]. However, in CRISP-DM this understanding phase is not directly related to a mapping of processes or variables, as it is in this proposal.

**Figure 1.** BAProM—Modules and techniques applied by module.

#### *3.1. Process Mapping*

The mapping of processes is a fundamental step, since it unveils the set of variables "keeping the knowledge" of the system under study, often obscured under the surface of a mass of data. The mapping of variables, which follows the mapping of processes, opens the access doors to this knowledge. A process mapping can be defined as a modeling technique used to understand, in a clear and simple way, how a business unit is operating, representing each step of its operation in terms of inputs, outputs and actions. As a result, a model of the system operation is built, with all its flows, relations, variables and complexities [31,32]. This is a fundamental step in research and development studies, as it provides a broader view of the object of study and makes it possible to improve the basis for decision-making, since at the modeling stage all processes are identified, mapped, understood and validated, which may lead to a process redesign. The characteristics of the processes (flows and/or activities) may be redesigned, aiming at optimization and/or adaptation to recurring needs.

All these concepts were initially applied to business processes, to improve and to automate a process. In fact, process automation by means of applications is one of the major uses of process modeling [32]. However, through the characterization and validation of a process, it is possible to identify critical points in the system and, therefore, to identify and/or define critical variables, which form the basis for the data collection phase of a big data process. The starting point for a consistent data collection is a set of representative variables of the system under study. Therefore, even though the aim here is the study of a big data process, the modeling to identify this set of representative variables is similar to classical business process modeling. Thus, this paper tries to demonstrate how a tool originally designed for modeling business processes, the well-known software engineering tool BPMN—Business Process Model and Notation—can also be applied to a big data process.

BPMN is the standard notation of the business process management methodology, widely used in software engineering for process modeling and for validating a process through the generation of an application prototype. BPMN was developed by the Business Process Management Initiative (BPMI) and is currently maintained by the Object Management Group; the current version is BPMN 2.0 [31,32].

A proposal [33] to use BPMN to align the business process with the analytical process corroborates the benefits pointed out in other articles [34]. The relevance of this type of application is also demonstrated in a work that proposes, in an embryonic way, improvements to BPMN for better use in an analytical context [35].

BPMN provides a standard notation, easily understood by all members of the business. Stakeholders include the business analysts who create and refine the processes, the technical developers responsible for implementing them, and the business managers who monitor and manage them. Consequently, BPMN is intended to serve as a common language, bridging the communication gap that often occurs between the design of business processes and the implementation of process automation. It is a process-modeling notation comprehensible to the process owner (definition), to the participant in the process (use), to system developers (automation) and to the business manager (monitoring), and it decreases the distance between the definition and the implementation of the defined solution [31,32]. Based on these characteristics, the proposed methodology considers BPMN an adequate tool for the mapping of processes and variables, the initial stage of the big data process conceived here.

#### *3.2. Data Management*

When we talk about data, we are in fact referring to observations of a set of variables, which are the fundamental pillar of an analytical description of a system. They represent a synthetic framework of the system's knowledge map, and through the variable observations it is possible to penetrate the often complex paths existing in masses of data, obscured by a variety of noises, such as random observations, missing values, outliers and so on. Data management means collecting and dealing with these observed values of the variables, and it assures data quality, since the data are the base of the entire analytical process of the system. Data quality is essential for a descriptive analysis of the system and an understanding of its behavior, as well as for predictive models.

As described at the beginning of this section, data management begins with the acquisition and recording steps, which are strongly dependent on the application domain. This collection step, based on the set of critical variables, is the basis for the subsequent analytical phases. In the case of Industry 4.0, the theme of this work, acquisition is usually performed by sensors. However, it may also be done through data sources other than sensors, such as photos and/or sounds collected in the operating environment, or even by very simple processes such as notes registering the operating situations of equipment.

The second step of data management, referred to as extraction, cleaning and annotation, also known as pre-processing, is dedicated to improving data quality. The pre-processing has two fundamental segments: data preparation and dimensionality reduction.

Data preparation means, basically, cleaning, integration and representation or transformation of the data, preparing the data for the analytical phases.

Cleaning involves the treatment of data noise, characterized mainly by outliers (points whose behavior differs markedly from the others) and missing values (lack of observations). Due to the diversity of data sources from different databases, noise, inconsistencies and missing values are very common. Even data from a single database is not exempt from such problems, and neither is data collected automatically by sensors, as these are liable to fail [36–38]. Cleaning consists of eliminating noise, correcting inconsistencies and handling missing values. The treatment of noisy data consists of identifying attribute values outside an expected standard (outliers) or other unexpected behaviors. The causes are diverse, such as measurement variations of equipment, human interference or extraordinary events, among others. The solution can be simply removing the value, if the observation is identified as an anomaly, or treating it using binning, clustering or other procedures. Eliminating an outlier is the simplest solution but, before doing so, one must consider that an occurrence with an unusual value may be the result of a measurement never seen before and should therefore be carefully studied rather than eliminated. An outlier may, in fact, represent an opportunity for discovery, leading research to paths not considered before.

Correcting data inconsistency is also part of cleaning. Inconsistency is the presence of conflicting values in the same attribute, which in many cases is caused by the integration of different databases. An example would be each database using a different scale to measure power: one could use kilowatts and the other megawatts. Upon integration, the values would be inconsistent. The correction can be done manually, automatically in some cases, or even through some kind of normalization (see data transformation, later in this section). Another cleaning task is to deal with the absence of data, which occurs when one or more attribute values do not exist. There can be several causes, such as failure to fill in values manually, lack of knowledge of the attribute, or low importance of the attribute, among others. The problem can be solved simply by removing the attribute, or by removing the entire sample if the gap would compromise other attributes of the same sample. There are also more elaborate techniques, such as assigning the mean, a moving average, or even the minimum or maximum value to the missing entries [37].

Data cleaning is an essential step for the analytic stage. After cleaning come integration and representation or transformation, as the final data preparation for the analytic stage. These are pre-processing procedures applied to the data to gain efficiency and effectiveness. Integration occurs when there are many data sources and a single database must be constructed; otherwise, it loses importance. Representation in many cases means a data transformation, converting types and/or values of attributes. In some cases it is necessary, for example, to transform a continuous numerical value into a discrete one, a discrete value into a categorical ordinal one, a categorical nominal value into a discrete binary one, or a categorical ordinal value into a discrete one. It may also be necessary to normalize attributes whose values span broad ranges, so that they have the same level of importance in the analytical process. For normalization, the literature presents different methods, such as the z-score, which transforms attribute values so that they have zero mean and a standard deviation equal to one, and the min-max method, considered standard by many authors [37]. The pre-processing so far covered its first segment, data preparation, involving cleaning, integration and representation or transformation.
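The two normalization methods cited above can be sketched briefly. The paper's workbench was built in R; purely as an illustration, the following Python fragment (with hypothetical values) shows z-score and min-max rescaling:

```python
from statistics import mean, stdev

def z_score(values):
    """Rescale values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

power_kw = [100.0, 250.0, 400.0, 550.0, 700.0]  # hypothetical readings
print(min_max(power_kw))   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z_score(power_kw))   # zero mean, unit standard deviation
```

Min-max is the usual choice when the target range matters (e.g., [0, 1] inputs to a neural network); z-score is preferable when outliers would otherwise compress the range.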

Dimensionality reduction is the second segment of pre-processing. It is associated with data redundancy, another problem that must be treated. Redundancy occurs when two attributes depend on each other; in this case, they may have the same values, or very similar ones. It may happen for different reasons, such as missing metadata indicating that one attribute is derived from another, or the existence of copies of a database. Typically, redundancy can be identified using correlation analysis, where the Pearson Correlation Coefficient is one of the most frequently used measures [37]. However, it may also be identified using techniques such as factor analysis or Principal Components Analysis (PCA). The result of applying these techniques is a selection of records and/or attributes, which will form the final database for the analysis phase. This selected data is a reduced database, without redundancy, but with equivalent analytical capacity.
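Redundancy detection via the Pearson Correlation Coefficient can be illustrated with a short sketch. This is not the paper's implementation (which relied on R and the analytical workbench); the attribute names, values and the 0.95 threshold are hypothetical:

```python
from statistics import mean

def pearson(x, y):
    """Pearson Correlation Coefficient between two attributes."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

armature_current = [10.0, 12.0, 15.0, 17.0, 20.0]   # hypothetical attribute
active_power = [2.1 * c for c in armature_current]  # derived, hence redundant
r = pearson(armature_current, active_power)
print(abs(r) > 0.95)  # True: flag one of the two attributes for removal
```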

It should be noted that each project has different needs, and it is not always necessary to develop all the pre-processing steps described here. If all steps are necessary, the natural sequence would be preparation (cleaning, integration and representation or transformation) followed by dimensionality reduction, applied iteratively, with interactions between the steps in a feedback process, until the final data quality is effectively guaranteed [38]. A final step may still be necessary at this phase: consolidating the data into a single database.

#### *3.3. Data Analysis*

This analytical module, as shown in Figure 1, is basically defined by exploratory analyses of the critical variables, which are essential for a descriptive analysis of the system, building a clear understanding of its behavior. Just as the stored variables hold the information about the system, once this data is properly explored and interpreted, the information obtained represents accumulated knowledge about the system.

The exploratory analysis is based on the observed values of the variables, and it usually relies on tools such as the Structured Query Language (SQL) to create consolidated databases from multiple queries on different data sources. SQL allows data modeling, relating tables created by extracting, transforming and loading data (the so-called ETL process), and constructing analytical repositories appropriate for discoveries [39].
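As a minimal illustration of the ETL-and-consolidation idea described above, the sketch below joins two hypothetical sensor tables into a single analytical table with SQL, via Python's sqlite3 module. The case study actually used SQL Server; the table and column names here are invented:

```python
import sqlite3

# Two hypothetical source tables, consolidated into one analytical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_temp (ts TEXT, temperature REAL)")
conn.execute("CREATE TABLE sensor_curr (ts TEXT, current_a REAL)")
conn.executemany("INSERT INTO sensor_temp VALUES (?, ?)",
                 [("2017-05-01", 42.0), ("2017-05-02", 43.5)])
conn.executemany("INSERT INTO sensor_curr VALUES (?, ?)",
                 [("2017-05-01", 11.2), ("2017-05-02", 11.9)])

# The "transform and load" step: join the sources on timestamp.
conn.execute("""
    CREATE TABLE consolidated AS
    SELECT t.ts, t.temperature, c.current_a
    FROM sensor_temp t JOIN sensor_curr c ON t.ts = c.ts
""")
rows = conn.execute("SELECT * FROM consolidated ORDER BY ts").fetchall()
print(rows)
```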

Typical examples of this analytical approach are multidimensional data models, supported by data warehouses (DW). A DW is a repository constructed with data extracted from transaction systems, the so-called OLTP (Online Transaction Processing) data, and is exclusively for analysis, and so, it is not constantly updated (non-volatile) [40].

Online analytical processing (OLAP) tools are useful instruments for exploring a DW. This kind of tool enables the exploration of a database from different perspectives. Moreover, SQL and other statistical tools provide aggregate functions to summarize data, generating descriptive statistics such as sum, mean, median, standard deviation, minimum and maximum values, counts, etc. As a result, descriptive statistics provide a clear view of the behavior of the variables under study and furnish indicators for consolidated reports.
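A hedged sketch of the aggregate functions just mentioned, computing a descriptive summary of one variable (the actual workbench did this in R; the variable name and its values are hypothetical):

```python
from statistics import mean, median, stdev

def summarize(values):
    """Aggregate functions of the kind OLAP/SQL tools provide for one variable."""
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": mean(values),
        "median": median(values),
        "std": stdev(values),
        "min": min(values),
        "max": max(values),
    }

armature_tension = [218.0, 220.0, 221.0, 219.0, 222.0]  # hypothetical values
s = summarize(armature_tension)
print(s["count"], s["mean"], s["median"], s["min"], s["max"])
# 5 220.0 220.0 218.0 222.0
```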

The interpretation of these indicators must be strongly supported by computational tools integrating statistical analysis with visualization resources, such as different types of graphics, dashboards and other instruments. The implementation of such analytical tools is part of this proposal. With an analytical computational tool, the development of analyses and new insights about interrelations, correlations and/or operational trends of the variables becomes a reality.

#### *3.4. Predictive Modeling*

In contrast to the approach described in the previous section, the predictive module is based on data-mining (DM) techniques. DM is a process of analytically exploring databases with the purpose of making non-obvious findings whose outcomes are effective in decision-making processes. DM is a core component of a KDD process [38] and usually involves prediction, clustering or data-association techniques.

The prediction process can be developed based on AI algorithms, which are strongly data-driven: the algorithm auto-adjusts its internal free parameters, calibrating the model (the so-called "training" of the model) from historical data. This parameter adjustment makes the algorithm applicable to datasets distinct from the one on which the training took place. A trained model can, for example, estimate future values of variables, such as the probability of an equipment failure. In such an example, the prediction could support the estimation of an optimal period for equipment maintenance [1].

There are different types of algorithms for prediction. One of the classical ones is the Decision Tree (DT), which is a type of AI algorithm whose model, generated after the training process, can be interpreted by humans and machines [41].

In a DT algorithm, the training process is simple and intuitive. Each variable (attribute) is analyzed for its capacity to estimate the class of an object in a dataset. The DT defines a sequence (in a hierarchical tree structure) of attributes to be used to estimate the category (the class) of the object under analysis, and depending on this sequence, different results may be generated. Therefore, a metric must be employed to establish the "best" attribute sequence. One of the most used indicators is entropy, a measure of the uncertainty associated with an attribute. The entropy is computed in terms of the separation between classes: the variables are combined, and a measure of the entropy is computed [37]. The final model is a hierarchical structure ordered by variable importance, leading to a process of classification of the objects.
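The entropy-based attribute selection described above can be made concrete with a small sketch. This is a generic illustration of entropy and information gain, not the paper's R implementation; the class labels and attribute values are hypothetical:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy: the uncertainty of a class distribution."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Entropy reduction obtained by splitting the objects on one attribute."""
    total = len(labels)
    groups = {}
    for lab, val in zip(labels, attribute_values):
        groups.setdefault(val, []).append(lab)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical: does a "temperature" attribute separate failure from normal?
labels = ["fail", "fail", "ok", "ok"]
temperature = ["high", "high", "low", "low"]
print(information_gain(labels, temperature))  # 1.0, a perfect split
```

A DT training algorithm places the attribute with the highest information gain at the root and recurses on each branch.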

In an industrial context, for example, a class may be an equipment failure or not. Based on the values of the attributes of a piece of equipment, the DT algorithm decides whether those values at a certain point in time indicate an imminent equipment failure. An important characteristic of a DT algorithm is that it allows interpretation by humans and not only by machines: it gives experts a reasonable understanding of how the model makes its decisions, which leads stakeholders to trust the model. In this BAProM framework, a DT algorithm was developed to be used in the optimization of equipment maintenance scheduling.

Another important AI algorithm type is the ANN. An ANN mimics, or simulates, the behavior of the human brain. In fact, it is a computational algorithm that implements a mathematical model inspired by the brain structure of intelligent organisms. As a result, it is possible to implement a simplified functioning of the human brain in computers [42]. The human brain receives information, processes it and returns a response, and does so through neurons connected in an immense network, communicating with each other through electrical signals (synapses). An ANN seeks to imitate this process, in a simplified way, to solve a problem. An ANN, therefore, is an artificial network of connected neurons (nodes). These artificial neurons are organized in layers: an input layer, intermediate layers (varying from 0 to "n") and an output layer.

An ANN is a powerful tool for solving complex problems and can be used, for example, in classification, clustering, association and time-series forecasting. In this BAProM framework, an ANN was employed to forecast the time series of the critical variables identified by the DT model. The two models, therefore, worked together to forecast an equipment failure, allowing the operation team to act before the failure takes place.
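As an illustration of the train-then-forecast idea only, the fragment below fits a single linear "neuron" by gradient descent to predict the next value of a toy series. The paper's actual ANN (multiple layers, nonlinear activations, implemented in R) is far richer than this deliberately minimal sketch, and the series here is invented:

```python
def train_one_step_predictor(series, lr=0.01, epochs=2000):
    """Fit next ~= w * current + b by stochastic gradient descent on squared error."""
    w, b = 0.0, 0.0
    pairs = list(zip(series[:-1], series[1:]))  # (current value, next value)
    for _ in range(epochs):
        for x, target in pairs:
            err = (w * x + b) - target  # prediction error on this pair
            w -= lr * err * x           # gradient step on the weight
            b -= lr * err               # gradient step on the bias
    return w, b

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy trend; real inputs would be sensor data
w, b = train_one_step_predictor(series)
forecast = w * series[-1] + b            # one-step-ahead forecast
print(round(forecast, 1))                # close to 7.0 on this toy series
```

In the predictive-maintenance setting, the forecast value would be compared against the failure thresholds identified by the DT model.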

#### **4. Case Study**

This section presents a more detailed description of the case study and the experimental methodology applied in real-world operation. The computational tools used in the experiment are also discussed.

The core purpose implemented in this study was therefore to map operations, facilities and processes in order to identify the variables relevant for maintenance decision-making. With those variables identified, it would then be possible to build an analytical repository and, further on, to train machine-learning algorithms for prediction.

#### *4.1. Case Description*

The case studied in this article is the Henry Borden Power Plant (UHB), located in Cubatão, about 60 km from São Paulo, capital of the state of São Paulo, in Brazil.

Its power generation complex is composed of two high-head (720 m) power plants, called External and Underground, with 14 generator groups powered by Pelton turbines and an installed capacity of 889 MW. Pelton turbines are characterized by blade-shaped fins, which are the main source of maintenance needs [43].

The External Power Plant is the oldest. It has eight external penstocks and a conventional powerhouse. The first unit started operations in 1926, and the others were installed up to 1950, for a total of eight generator sets with an installed capacity of 469 MW.

Each generator is powered by two Pelton turbines, which receive water flows from the Rio das Pedras reservoir. These flows arrive at the so-called "Valve House", where they pass through two butterfly valves in penstocks. Then, they descend a slope, reaching their respective turbines, covering a distance of approximately 1500 m.

The Underground Power Plant is composed of six generator sets, installed inside the rocky massif of Serra do Mar, in a 120 m long, 21 m wide and 39 m high cave with an installed capacity of 420 MW.

The first generator set went into operation in 1956. Each generator is driven by a Pelton turbine powered by four jets of water. The operation of the UHB was organized as an Integrated System of Electrical Energy Generation composed of four large, continuously interrelated, interdependent subsystems, which generate electric energy delivered to the Brazilian Interconnected System and distributed across the country.

The general framework at UHB today combines modern management instruments, such as computerized monitoring systems and dashboards for visualizing different types of indicators, with empirical practices based on the team's experience. The system in the operation center runs uninterruptedly, constantly providing information from the entire system, so appropriate decisions can be made at every moment. However, many of these operating parameters and metrics are established on the basis of empirical practice. A typical example is the timeframe between inspections and preventive maintenance of the turbines of the generating system. These parameters, which are determinant for the quantification of operational costs and the service level of the electric system, should be periodically re-evaluated and, if possible, optimized, in order to find the optimal point of the tradeoff between costs and service levels.

Hydropower plants and generation facilities represent a high level of investment, requiring management based on robust processes and standards to guarantee an adequate return on the investment. Their operation and maintenance must be developed to guarantee the preservation and maximization of the use of this patrimony, within the operational conditions in which it operates.

The primary objective of such facilities is to maximize the availability of energy generation and equipment use. This is only achieved with high-level operating standards and procedures that guarantee the facilities' productivity and the quality of the services offered.

The operational and maintenance standards of a power plant have their own costs, which may be considerable given the complexity and size of the operations, inducing managers to search for practices leading to cost minimization.

Therefore, the use of the BAProM framework (Section 3) to establish optimized parameters that minimize the operational and maintenance costs associated with equipment shutdowns could represent a relevant contribution to the operation of a hydroelectric plant. The BAProM framework was applied to this case to develop a predictive model establishing the probable occurrence of an incident in the Generating Units (GU) that could cause an interruption in the operation and, consequently, the need for corrective maintenance. Based on these predictions, it would be possible to establish optimal periods between maintenance interventions.

In the following section, we present the approach and results involved in the predictive modeling.

#### *4.2. Experimental Methodology*

The methodology applied to the case study strictly followed the four modules of the proposed BAProM framework. Throughout the project, however, the technical team had to face some concrete questions whose real dimension only becomes apparent when an experimental methodology is effectively put into practice. Given the impact of such practical issues on the time the team dedicated to resolving them, they deserve a record and a discussion, as they can occur in many real projects. On the other hand, some steps that might have seemed difficult demanded, in practice, much less time and dedication from the team than one could previously have imagined.

Thus, this section presents the four modules of the experimental methodology, highlighting and discussing the main practical aspects of the experience developed during the project. This type of record can be a useful contribution to the definition of the steps of a methodology and to emphasizing the attention a team must dedicate to each step of its application. In addition, it can be an important contribution when scheduling the time for each phase of a practical project.

#### **Module 1: Mapping Process**

The mapping of UHB operational processes involved the identification and characterization of the operation systems of the power plant and of the main physical variables (electrical, mechanical and electromagnetic) associated with the processes. Information gathering for the mapping relied primarily on the power plant staff, since the individuals working at the plant showed consistent knowledge and deep command of the business to be modeled. These professionals, with relevant experience in management and operations, knew not only the general process but also some important details, allowing a consistent and reliable description of the processes and their associated variables and an accurate identification of the points to be mapped and highlighted. This module was one of the major challenges of the project, involving extensive discussion with the power plant operation team to learn the operational process and the relevant variables, and this is a lesson to be learned: it is sometimes not a trivial task for the data science team to learn the technicalities of the business being modeled. In addition to the information extracted from the meetings with the operation personnel, other relevant information on equipment was obtained from documents provided by the company.

#### **Module 2: Data Management**

If the previous module represented one of the major challenges of the project, this data management module was the biggest one. First, it was decided that data collection would focus on two UHB Generating Units (UG), known as UG4 and UG6, both with the same mechanical, electrical and technical characteristics, and that the data would be obtained from a supervisory system database fed by sensors coupled to the plant equipment. The problems started to appear when analyzing the collected data. The data had not been properly stored over the years: most of it was used merely as input to dashboards and discarded after use. Although the hydroelectric plant had a good data collection infrastructure, the team did not have a culture of data analysis, using the information only for instantaneous monitoring of the operation. There was no qualification for data analysis in the team, so the data was used for monitoring and most of it was then discarded. In fact, there was little historical data to analyze, and effective data collection had to be started from the beginning of the study. This led to a considerable delay in the project's planned schedule. In addition, there was great heterogeneity among the collection periods of the various variables: the intervals between two collections could differ considerably from one variable to another, with no standardization of these periods. Moreover, we found many variables without data (missing data) and many noise problems, including inconsistencies and outliers.

Besides these problems, during the project one of the Generating Units, UG4, had a technical problem and had to be deactivated for a long period. Data collection therefore had to focus on only one of the Generating Units of the UHB, UG6, which thus became the object of this study.

In any case, all problems had to be solved, especially the interval between two consecutive collections, which was adjusted so that all variables were always collected at the same timestamp. New time series of observations of the variables started to be generated. Once these difficulties were overcome, the data was successfully collected and a succession of analyses could be conducted. The final collection period was from May 2017 to January 2018, and the data continued to be updated continuously thereafter.
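The timestamp-alignment problem described above can be illustrated with a sketch. The plant's actual adjustment procedure is not detailed here; the fragment merely shows one standard approach, resampling two irregularly sampled variables onto a common grid by carrying the last observed value forward (all names and values are hypothetical):

```python
def resample_last_value(observations, grid):
    """observations: sorted (timestamp, value) pairs; grid: target timestamps.
    Returns one value per grid point, carrying the last observation forward."""
    out, i, last = [], 0, None
    for t in grid:
        while i < len(observations) and observations[i][0] <= t:
            last = observations[i][1]
            i += 1
        out.append((t, last))
    return out

temperature = [(0, 40.0), (7, 41.5), (13, 42.0)]  # sampled every ~6-7 s
current_a   = [(0, 11.0), (5, 11.2), (10, 11.4)]  # sampled every ~5 s
grid = [0, 5, 10]                                  # common 5 s grid
print(resample_last_value(temperature, grid))
print(resample_last_value(current_a, grid))
```

After this step, every variable has an observation at every grid timestamp, so a single consolidated time series table can be built.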

#### **Module 3: Data Analysis**

This module did not present any significant practical problems. The challenge here was technical: developing a computational tool to support the analytical phase that would be effective but also user-friendly, so that it could be used by the operation team, i.e., non-technical users. Therefore, as soon as the mapping phase ended, an analysis and visualization tool, an analytical workbench, was developed to be used not only in this phase but also in the previous one, data management, to assist in the characterization of the variables and in the identification of data quality problems. Each critical variable was filtered through the dashboard tool, which provided visualizations of its domain through statistical diagrams and summaries, presenting time series graphs, boxplots and histograms to identify trends, seasonality, statistical distributions and the presence of outliers and missing values, as well as metrics such as mean, median, standard deviation and quantiles. Once the analyses were completed, knowledge about the system had increased significantly and the entire database was ready to be subjected to predictive modeling.

#### **Module 4: Predictive Modeling**

Like the previous one, this fourth module did not present significant practical problems. Once more, the challenge here was technical, since predictive modeling involves AI algorithms and data modeling: the data must be properly prepared as input to the models.

The modeling was subdivided into two predictive models. The first was a Decision Tree, which identified the relevant variables associated with equipment failure, as well as the thresholds that indicate the imminence of a failure when a variable reaches them. The second model was an ANN dedicated to forecasting the time series of the significant variables, making it possible to foresee when, in a future period, one of these variables would reach a threshold. This was therefore a module in which the tasks proceeded without unexpected occurrences, and the techniques and tools employed in the practical application proved to be well adapted to the tasks.

#### *4.3. Computational Tools*

This section discusses the computational instruments used in the case study, for the four modules of the proposed framework.

The mapping of operational processes and variables was the fundamental starting point of the methodology, the first module of the BAProM framework. In this case, a notation originally designed for modeling business processes, BPMN, could be applied even though this was a big data process. A graphical tool based on the latest version of BPMN (v. 2.0) proved to be well suited for this development. The software provided appropriate resources for modeling processes, allowing the validation of operation rules, the definition of flows and the identification of critical variables. Moreover, these features were essential for validating the mapping with the power plant team.

The next step, Module 2 in the proposed methodology, was data management, in which data collection played a relevant role. Data was provided from different sources, such as the supervisory system database fed by a set of sensors, an application named Impediment Registry (a by-product of this project, which records equipment occurrences), as well as some external data. The SQL Server database management system was the basis for data storage in a unified repository.

Another step of this Module 2, the pre-processing for data validation, such as data cleaning, had the support of the analysis and visualization tool mentioned earlier: an analytical workbench developed specifically for this framework in Shiny, a web application package for the R language and statistical environment. This computational tool was fundamental for characterizing the variables' patterns and identifying data quality problems, supporting, for example, the treatment of missing values and outliers.

In Module 3, the complete exploratory analysis was entirely based on this analytical workbench. The Shiny package of the R language made it possible to build application interfaces with a high level of usability, as well as to process basically all kinds of statistical operations.

For analysis and visualization purposes, the tool provided flexible filters, allowing the selection of a generating unit, a specific subsystem and a variable, for example, selecting "UG6", subsystem "Generator" and variable "Stator Armature". The tool also provided diverse types of dashboard outputs, graphical and metric, which represented the core of the data analysis in Module 3. Its features included statistical summaries and graphical visualization of time series, boxplots and histograms.

Module 4 was developed basically through algorithms coded in R. This language provides a variety of functionalities, such as statistical and machine-learning functions, allowing the development of most data science algorithms, from the simplest to the most sophisticated, such as artificial intelligence algorithms. Both algorithms employed in Module 4, DT and ANN, were developed in R, which proved to be well suited for the job.

#### **5. Practical Application of BAProM: Results and Discussion**

The practical application strictly followed the modules of the BAProM framework, which are presented in the following subsections.

#### *5.1. Process Mapping Results and Discussion*

The mapping of processes and variables, developed in the first module, was fundamental to identify and formalize all the relevant flows and processes of the UHB Integrated Energy Generation System, as well as all the relevant variables. An integrated macro model of the entire system was developed to offer a more comprehensive view of all the subsystems identified in the process. It was therefore possible to verify and analyze the major components of the four stages in their sequential order.

The modeling was developed mostly through information gathering with the power plant team. Four subsystems were identified, making up the entire UHB energy generation process: Adduction, Turbine, Generator and Transmission, illustrated in Figure 2.

**Figure 2.** Macro view of the system with 4 subsystems.

These four subsystems make up, at UHB, an Integrated Electricity Generation System, which delivers energy to the Brazilian Interconnected Central System, composed of several power plants spread over the country, which in turn, distributes the energy throughout the country.

A brief description of the interaction of the subsystems could be as follows: the Adduction System carries a water flow from a reservoir down a slope to the turbines, covering a distance of approximately 1500 m (almost 1 mile). The pressure is enough to promote a high-speed rotation of the turbine (Turbine System).

Each turbine, in turn, generates a rotation of the bearing axis on which it is supported, transmitting energy to the Generator to which it is connected.

The Generator, by means of this kinetic energy, creates a magnetic field that generates electrical current for the Transmission System. This system increases the voltage and "packs" the energy for the transmission lines of the Brazilian Interconnected System, for later distribution.

In this module, which represents the process mapping task of the BAProM approach, process modeling was applied to the entire system, a very extensive piece of work that produced vast documentation. Each of the four subsystems had its own process modeled, along with the identification of its relevant variables.

For the purpose of illustrating this process, the mapping of one of the subsystems, the Generator, is presented here, with a special emphasis on one of its components, the "Stator Armature". The mapping of the other subsystems and their components was very similar to what is presented here.

The mapping of the Generator subsystem is shown in Figure 3, where the "Stator Armature" appears as its third component.

**Figure 3.** Generator Subsystem.

An integrated model of the Stator Armature was developed, showing a detailed view of this subsystem component (see Figure 4) and already including some aspects of data management. This mapping provides a solid basis for applying the other framework modules in their sequential order.

The mapping provides a broad overview of all critical variables. For the specific case of the Stator Armature, five critical variables were identified: Active Power, Active Energy, Armature Tension, Armature Current and Rotor Groove Temperature. These five variables should now be continually monitored by sensors, and their observed values subjected to an ETL process for future analyses.

**Figure 4.** Mapping of the Component "Stator Armature".

The mapping of the complete operation of UHB was an effective practical contribution to the company, since it did not previously have this type of comprehensive and detailed documentation covering its entire operation.

Having completed the mapping, the next step was data management, which is the subject of the next section.

#### *5.2. Data Management Results*

Data management began with data acquisition and recording and was favored by the previous phase, which provided an effective road map for the ETL procedure: one had only to follow the flow throughout the mapping. In fact, as can be seen in Figure 4, whenever data is collected, a check is performed to verify that its value is within a specified range; if so, the values are extracted from the source, transformed into a compatible format and then loaded into the database. As the ETL procedure extracted and transformed the data for storage, an analysis of data quality, covering all kinds of anomalies, was carried out simultaneously.

Data quality checking is essentially the pre-processing step, developed in Module 2. In this step, the data was submitted to a rigorous quality analysis process, based on data preparation, which involved cleaning, integration, and transformation of the data into a final format for analysis.

The cleaning involved the treatment of data noise, characterized mainly by outliers and missing values. This phase had the support of the analytical workbench, developed specifically for this framework, which provided statistical analyses and visualizations and was a fundamental support for this cleaning task.

The number of variables resulting from the data quality checking for each subsystem (Figure 2) is summarized in Table 1. Please note that, in percentage terms, variables with outliers represented 57% of the total number of variables, while variables with missing values represented 43%.


**Table 1.** Distribution of variables among the operation subsystems and number of anomalies.

These are relatively high numbers and, therefore, they were brought up for discussion with the UHB technical team to understand the reasons for such values. Regarding outliers, while in some cases there was simply a value outside the expected standard, in other situations the observed values in fact corresponded to problems to be treated. As an example, some temperature sensors recorded negative values. Since this scenario is impossible in the region of the power plant, and the equipment temperature should track the operating environment, it was clear that the observed negative temperatures were errors in data collection, and consequently those values were discarded. It was found that the errors were due to sensor failures. Another cause of outliers was data collection at system startup, when peaks occur in certain variables before they stabilize and reach an equilibrium state. The missing values were also related to sensor problems: for a period, some sensors were disconnected from the system due to technical causes. The missing values were then treated; in most cases, they were filled with averages of nearby periods.
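The two treatments above can be sketched with pandas on hypothetical 5-min temperature readings. Note the assumptions: the data is synthetic, and time interpolation stands in for the "averages of nearby periods" actually used.

```python
import pandas as pd

# Hypothetical temperature readings: one impossible negative value (a
# sensor error) and one gap from a disconnected sensor.
temps = pd.Series(
    [22.1, 23.0, -5.0, None, 24.2, 23.8],
    index=pd.date_range("2017-06-01", periods=6, freq="5min"),
)

temps[temps < 0] = None                    # negative reading -> treat as missing
filled = temps.interpolate(method="time")  # fill from neighbouring periods
```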

Another aspect identified during this phase was that some relevant data, collected by other systems operating at UHB, were not being integrated into the database of the supervisory system. Since a combination of different sources can be useful for developing exploratory analyses, as well as robust predictive models, this integration was implemented. One important data set incorporated was a maintenance database, since the predictive models of this study focus on maintenance. Thus, for exploratory and predictive analysis, the repository built integrated data from equipment maintenance controls and records with the data of the critical variables of the system. Therefore, a single database came to store all the relevant variables for the analyses developed in Modules 3 and 4.

Once these problems had been solved, a last question arose, concerning the time interval between successive data collections. This problem could be better perceived when analyzing the groups to which the variables belonged. The collected variables belong to four different types: electrical, pressure, temperature, and speed regulation. This pattern was already part of the tacit knowledge of the UHB operating team and was formally defined in Module 1 (Process Mapping).

Regarding these four groups, the periods between successive collections were too long, and there was also considerable heterogeneity among the collection periods of the different types of variables. There was no standardization of these periods.

The different collection periods can be seen in Table 2. They turned out to be a serious problem, since analyzing variables with different timestamps creates basic problems under two aspects: analytical and systemic. Moreover, the length of each collection period was itself a problem, since it varied from 5 min to 15 min. These are long periods for this type of data collection.


**Table 2.** Distribution of Variables in Categories and Collection Time Interval per Category.

The question was analyzed with the UHB technical team, and from these discussions a standard period for all variables emerged, defined as a fixed time interval of 30 s.
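Bringing a variable collected every 5 min onto the agreed 30 s grid can be sketched with pandas resampling. Upsampling by forward fill is an assumption for illustration; in the plant, the collection interval itself was standardized.

```python
import pandas as pd

# Hypothetical variable collected every 5 min.
raw = pd.Series(
    [100.0, 104.0, 98.0],
    index=pd.date_range("2018-01-01", periods=3, freq="5min"),
)

# One observation every 30 s, carrying the last known value forward.
uniform = raw.resample("30s").ffill()
```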

Once all quality problems were resolved, the data started to be collected regularly. By the end of this module, the result was a consistent dataset, without outliers or missing values, and with all variables on the same time scale (timestamp).

As a final comment, it should be highlighted that, prior to this study, most of the data collected at UHB was used only as input for the computation and presentation of operation indicators on dashboards in the plant's supervisory system. After use, the data was discarded. There were no historical data and, consequently, no analytical treatment of the data. This was changed by this project.

Today, all variables have their historical data kept in the database of the supervisory system for a period of 6 months. At the same time, there is now a data warehouse, built on a separate data server, where new data is continuously incorporated into the historical series of the variables, which can now grow for an almost indefinite period. It is a very different scenario. In fact, it is reasonable to consider that the results of this Module 2, mainly the ETL procedure, the dedicated data server, and the data warehouse, were successful, and are thus relevant contributions of this work that could be followed in other applications.

#### *5.3. Data Analysis Results*

The data analysis was based on the analytical workbench, illustrated in Figure 5, developed specifically for this framework. This application was fundamental to this analytical module. Before describing the workbench, it must be remembered that it was designed for a Brazilian company and, therefore, for Brazilian users; thus, all labels and titles in the application are in Portuguese. The figures presented in this paper maintain the original screens of the tool; however, this should not affect the understanding of its functionalities, since a detailed description of each one is provided.

#### *Energies* **2020**, *13*, 6014

**Figure 5.** Variable "Armor Current"—Exploratory Analysis in the Analytical Workbench.

The analysis can be developed from many angles. The tool provides filtering by generating unit, by a specific subsystem, and by a variable of that subsystem. To select variables, the tool follows the hierarchy, starting at the system, going through its subsystems and their components, and finally reaching the variables. An analysis can be defined at any hierarchical level.

Once the analysis parameters are defined, four types of outputs can be viewed: a statistical summary, a boxplot diagram, a time series plot and a histogram. The user can select all these features to analyze a single variable or one of them for a comparative analysis among variables.

The analysis results are presented by subdividing the screen into quadrants, and in each quadrant one of these four types of output is presented. Therefore, the output interface is, in fact, a dashboard combining graphical visualizations with statistical measures.
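The quadrant layout described above can be sketched with matplotlib. The data below is synthetic, standing in for a plant variable; this only illustrates the dashboard idea, not the workbench implementation.

```python
import matplotlib
matplotlib.use("Agg")                     # headless backend for this sketch
import matplotlib.pyplot as plt
import numpy as np

# Synthetic readings standing in for a monitored variable.
rng = np.random.default_rng(0)
values = rng.normal(3500.0, 150.0, 500)

# One quadrant per output type: summary, boxplot, time series, histogram.
fig, ax = plt.subplots(2, 2, figsize=(10, 6))
summary = (f"mean = {values.mean():.1f}\n"
           f"min  = {values.min():.1f}\n"
           f"max  = {values.max():.1f}")
ax[0, 0].text(0.1, 0.4, summary)
ax[0, 0].set_title("Statistical summary")
ax[0, 1].boxplot(values)
ax[0, 1].set_title("Boxplot")
ax[1, 0].plot(values)
ax[1, 0].set_title("Time series")
ax[1, 1].hist(values, bins=30)
ax[1, 1].set_title("Histogram")
fig.tight_layout()
```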

As in Section 5.1, the component "Stator Armature" will be used here once more to demonstrate the tool. An analysis of one of its critical variables, the "Armor Current", will be shown. An exploratory analysis of this variable is represented in the four blocks of Figure 5, in which all possibilities of statistical metrics and graphical analyses can be visualized.

From Figure 5, one analysis that can be done for the "Armor Current" variable (in Portuguese, *VarCorrente\_Armadura*) is based on the boxplot (Plot 2, upper right in Figure 5), where a group of values between 3000 and 4000 can be seen, considering the scale of the vertical axis, distorting the visualization of a potential outlier. However, from the statistical summary (Plot 1, upper left in Figure 5), the maximum expected value (in Portuguese: Máximo Normal) is 4000, indicating that there were no outliers in these data. A confirmation can be obtained from the maximum observed value of the variable, which occurs in situations of high energy generation. The time series diagram and the histogram (Plots 3 and 4, respectively, from left to right at the bottom of Figure 5) also provide relevant information. In Plot 3, it is possible to identify the period in which the maximum value was reached, and the histogram (Plot 4) shows the distribution of observed values. In this case, the value zero has the largest frequency, corresponding to the period when the GU was off for maintenance.

Another type of analysis can be seen in Figure 6, which shows the "Rotor Groove Temperature", another variable of the Stator Armature component. A comparison of the behavior of the variable in GU4 and GU6 is presented through the boxplot and time series diagrams. The graphs show very similar behavior, as expected, even though the boxplot shows less dispersion for GU4, the green one. This comparison was complemented with frequency histograms (Figure 7), and the same behavior was detected; with histograms, the dispersion of the data can be seen more clearly.

This type of comparative analysis between the two generating units was developed for all variables, whenever data from both generating units were available.

**Figure 6.** Rotor Groove—Comparison GU4 vs. GU6—Boxplot and Time Series.

**Figure 7.** Rotor Groove—Comparison GU4 vs. GU6—Boxplot and Histogram.

A final analysis, developed for all variables, is illustrated in Figure 8: a time series decomposition, in this case for the Armature Current. This is an important analysis, since when we look at the original data we often do not see certain behaviors, as they are obscured by random effects. The time series decomposition shows three components of the series: trend, seasonality, and the random effect (remainder). Through this decomposition, trend and seasonality effects become much clearer, giving stakeholders important information for decision-making. These effects are clear in Figure 8. The figure shows the original data in its upper part, then presents the trend and seasonality curves in sequence and, in its lower part, the random component (remainder). It is perfectly possible to see in the trend curve that at two moments in time the variable showed a growth trend, which was later reversed. Regarding seasonality, there are reasonably well-defined cycles in which peaks occur, and these peaks are reflected in the original data curve.

**Figure 8.** Armature Current—Time Series Decomposition.
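The additive decomposition behind plots like the one in Figure 8 can be sketched as follows on a synthetic series. This is only an illustration of the idea (moving-average trend, cycle-average seasonality, remainder); the workbench used an equivalent standard routine.

```python
import numpy as np
import pandas as pd

# Synthetic series: linear trend + seasonal cycle + noise.
n, period = 120, 12
t = np.arange(n)
rng = np.random.default_rng(1)
series = pd.Series(0.05 * t + np.sin(2 * np.pi * t / period)
                   + rng.normal(0.0, 0.1, n))

trend = series.rolling(period, center=True).mean()          # moving-average trend
detrended = series - trend
seasonal = detrended.groupby(t % period).transform("mean")  # cycle averages
remainder = series - trend - seasonal                       # random component
```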

Some important results were obtained in this module, and the role of the analytical workbench in the analyses must be highlighted: not only in this Module 3 but also in Module 2, as already reported, it contributed significantly to the identification of anomalies associated with critical variables.

As stated earlier, this section showed some examples focused on the Stator Armature component, but the figures and discussions presented here are just an illustration of the analysis developed. In fact, the entire set of variables was submitted to exploratory analysis, which was a comprehensive and extensive work. Indeed, more than 750 diagrams were generated, including dashboards of the types presented above, graphs of time series decomposition, and other types of diagrams involving comparisons and correlations among variables.

This extensive analysis provided a reasonably deep knowledge about the behaviors of the variables, individually and as a system. At the end of this module, the technical team was confident that the accumulated knowledge about the system at UHB was robust and that they could move on to the next module to develop predictive modeling, discussed in the next section.

#### *5.4. Predictive Modeling Results*

Before describing this last module of the framework, it is important to restate the general objective of this application: to support predictive maintenance decisions by minimizing the occurrence of corrective maintenance, identifying when a piece of equipment is on the verge of failure.

The predictive modeling was subdivided into two parts:

The first part is a predictive model to identify the variables significantly associated with equipment failures and, from the values of these significant variables, whether or not the equipment is close to a failure point.

The second part is a time series forecasting model, developed to predict the values of the significant variables and, thus, the probability of an equipment failure in a future period.

The literature review presented in Section 2 shows intense use of algorithms such as Artificial Neural Networks, Support Vector Machines, and others considered black boxes [41]. The term black box is used because the algorithm's decision criterion is not visible to humans. In this research, it was preferred, for the first model, to use an AI algorithm closer to a white box [41], whose decision-making can be more easily interpreted by experts, and even by laypeople. In this sense, the algorithm employed in part 1 of BAProM's predictive modeling was the Decision Tree (DT), while an ANN model was employed for the time series forecasting model.

The choice in both cases was technical, but in the first model it was also intended to take better advantage of the experience of the UHB team, who could more easily visualize the results and who, with their solid knowledge of the causes of maintenance, could better assess the model's outputs. However, this model, like the whole framework, can be used for any industrial system.

Once again, the "Stator Armature", one of the components of the Generator subsystem (Figure 4), was used to illustrate the experiments carried out with the DT, which are presented in this subsection. As mentioned before, the Stator Armature has five critical variables: Active Energy, Armature Tension, Armature Current, Active Power, and Rotor Groove Temperature. More than one DT was built to contemplate different combinations of variables. The DT presented here is one of these, including the first three variables: Active Energy, Armature Tension, and Armature Current.

For this experiment, data from 2017 to 2018 were used, equivalent to 5000 samples. The data for training the model were classified with two types of labels: OP (normal operation) and CM (corrective maintenance), assigned to data collected 15 days before and after a maintenance event. The modeling of the DT algorithm followed the well-known 5-fold cross-validation process [37]. The resulting DT model for this experiment is illustrated in Figure 9.

**Figure 9.** Predictive Decision Tree Model.
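The experiment described above can be sketched with scikit-learn on synthetic stand-ins for the three variables. The OP/CM labelling rule below is hypothetical, chosen only so the tree has something learnable; it is not the plant's failure criterion.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-ins for Active Energy, Armature Tension, Armature Current.
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 3))

# Hypothetical labelling rule: OP (normal operation) vs. CM (corrective
# maintenance), for illustration only.
y = np.where(X[:, 0] + 0.5 * X[:, 2] > 1.0, "CM", "OP")

clf = DecisionTreeClassifier(max_depth=4, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validation
```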

Once the decision tree has been generated from applying the technique to the data, it is possible to analyze the model output, identify the leaves whose classifications presented the best results in terms of accuracy and quantity, and retrace the path from these leaves to the root node to identify the rules that generated them. With this knowledge of the rules, it is possible to establish the degree of importance of each variable, defining those that deserve closer observation. Please note that the model provides the ranges of values and probabilities, for each variable, that lead to the conditions OP or CM. This allows these variables to be monitored so that, when such thresholds are reached, an alarm may be triggered to evaluate the possibility of a maintenance stop before a failure occurs and generates corrective maintenance. The model also furnishes, for each node, the percentage of observations in the dataset. In terms of performance, the accuracy of the different DT models developed ranged from 70% to 96%. In addition, in this specific case, a qualitative analysis of the variables at the different levels of the decision tree was carried out by UHB specialists, who agreed with the results, which showed the degrees of importance of the variables in the maintenance decision. The DT model, therefore, proved to be a consistent predictive maintenance tool, supporting decisions on scheduling equipment stops.
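Retracing leaf-to-root rules can be sketched with scikit-learn's `export_text`, which prints each split threshold, the kind of information used to set alarm levels. The feature names below are illustrative stand-ins, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic classification data standing in for the plant measurements.
X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Text dump of the tree: each indented line is a split condition, so the
# rule for a leaf is read by following the path from the root down.
rules = export_text(clf, feature_names=["active_energy",
                                        "armature_tension",
                                        "armature_current"])
print(rules)
```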

Complementing the DT model, a second type of predictive model, based on an ANN, was developed to forecast the critical variables that need to be monitored. Thus, in addition to monitoring the actual value of a variable, one can also identify, for a future period, when one of these variables will reach a threshold that could lead to an equipment failure. A Multilayer Perceptron (MLP) neural network was employed in this model, and the forecasting results for the variable "Active Energy" are presented in Figure 10. In that figure, time is expressed in 30-s intervals, which were the intervals used in data collection. A trend curve is projected for the future, together with the lower and upper limit curves of the confidence intervals for the forecasts. The intervals are presented for two confidence levels: 80%, with a narrower range, and 95%.

**Figure 10.** Active Energy Forecasting.
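A minimal version of the MLP forecasting setup can be sketched as follows: lagged values of a synthetic series (standing in for "Active Energy" at 30 s resolution) serve as inputs, and approximate 80%/95% bands are derived from the in-sample residual spread. The interval method is an assumption for illustration; the paper does not detail how the production model's intervals were computed.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic series standing in for the monitored variable.
rng = np.random.default_rng(7)
t = np.arange(600)
series = 50.0 + 0.02 * t + np.sin(t / 10.0) + rng.normal(0.0, 0.2, 600)

# Build lagged features: predict each value from the previous 5.
lags = 5
X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
y = series[lags:]

mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                   random_state=0).fit(X, y)
pred = mlp.predict(X)

# Residual-based bands: 80% (narrower) and 95% (wider) confidence levels.
sigma = np.std(y - pred)
band80 = (pred - 1.28 * sigma, pred + 1.28 * sigma)
band95 = (pred - 1.96 * sigma, pred + 1.96 * sigma)
```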

It should be noted that, with the two predictive models working together, the predictive modeling achieves reasonable robustness: there are reference parameters for monitoring the variables in real time, triggering preventive actions every time a critical variable enters a level indicative of equipment failure; and, at the same time, there is an effective instrument to project this type of situation into the future, providing even more time for the operation teams to prepare for and/or prevent such occurrences.

As said before, the research was conducted at UHB in a real-world environment. Therefore, the predictive model described here was tested with real data from UHB's operation. As previously stated, the data used in this study ranged from May 2017 to January 2018. To validate the model, data from the first months of this period were used to predict occurrences of failure in the final months. Since the actual data from these forecast months were known, it was possible to compare the predictions made by the model with the actual occurrences. The model was able to identify most of the failures that could have been avoided and to identify maintenance that could have been reprogrammed. These predictions would result in cost savings and productivity increases.

#### **6. Conclusions and Future Work**

This paper presented an application of a methodological proposal, expressed by the framework BAProM (Big Data Analytical Process and Mapping), which contemplates all phases of a KDD process, from the mapping of processes and critical variables, through data management and exploratory analysis, to the implementation of predictive models. The complete framework was tested in a real-world application in an industrial environment, making it possible to validate it and demonstrate its practical feasibility. This real-world application started with the mapping of the entire operational process of the plant and an ETL procedure. Next, a data analysis tool, an "analytical workbench", was developed and implemented. This workbench proved suitable for different types of analysis, such as pre-processing and exploratory analysis. The tool offers multiple possibilities for graphical analysis and computation of statistical metrics, in addition to allowing monitoring of system variables and indicating anomalous behavior. It was used in the pre-processing phase and in exploratory analyses, with satisfactory results.

A predictive model was developed, based on decision trees, which allowed the identification of the most relevant variables and their thresholds, indicating the imminence of an equipment failure; this, in turn, allows the scheduling of predictive maintenance, avoiding unplanned stops for corrective maintenance. The predictive model made it possible to implement a management process for critical variables: operators can act before an interruption event occurs. The whole process proved to be effective and efficient, given the feasibility of its implementation in a real-world operation.

In addition, a time series forecasting model for these critical variables, based on ANN, was also designed and implemented, which made the process even more effective, since managers can have information on future times when these variables should reach their thresholds, leading to the need for corrective maintenance. The forecasts provide additional time for teams to act, avoiding unexpected equipment stops.

The main conclusions of the research can be expressed as follows:


Despite the positive points of this framework, there are some limitations that should be considered in future studies and projects. One of these concerns the ETL, which relies on operational personnel to transfer production data to a repository dedicated to analytics; this process could be automated. Another limitation refers to data pre-processing, in which part of the work is done by inspecting the variables with the support of the dashboard; some of these tasks could also be automated. Furthermore, the dashboard could be improved by automatically generating some standard graphics and metrics for all variables or for a group of variables.

Moreover, regarding future works, it would be important to implement on the dashboard the critical values identified in the predictive decision tree model, so that alarms would be automatically triggered without the need for human monitoring when one of those variables is close to those values.

Another opportunity for future work is the application of this methodology to other industrial systems, including other subsystems of the case studied. Finally, the results obtained through decision trees could also be validated against other types of predictive models, such as artificial neural networks and support vector machines.

**Author Contributions:** Conceptualization, L.A.S., A.R.d.A.V.F.; methodology, L.A.S., A.R.d.A.V.F. and G.G.d.C.C.; software, G.G.d.C.C. and M.V.B.d.A.V.; validation, L.A.S., A.R.d.A.V.F. and L.S.d.S.; formal analysis, M.V.B.d.A.V. and G.G.d.C.C.; investigation, M.V.B.d.A.V. and G.G.d.C.C.; resources, L.A.S. and L.S.d.S.; data curation, M.V.B.d.A.V. and G.G.d.C.C.; writing–original draft preparation, G.G.d.C.C., L.A.S. and A.R.d.A.V.F.; writing–review and editing, A.R.d.A.V.F., L.A.S., G.G.d.C.C. and M.V.B.d.A.V.; visualization, G.G.d.C.C. and M.V.B.d.A.V.; supervision, L.A.S.; project administration, A.R.d.A.V.F.; funding acquisition, L.S.d.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is part of the R&D project "EMAE-ANEEL-P&D 00393-0008/2017", funded by EMAE (Metropolitan Company of Water & Energy), of the state of São Paulo, Brazil.

**Acknowledgments:** We thank all the EMAE staff who participated in the R&D project "EMAE—ANEEL-P&D 00393-0008/2017", and all the faculty and student members of the BigMAAp research lab at Mackenzie Presbyterian University.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **An Innovative Technology for Monitoring the Distribution of Abutment Stress in Longwall Mining**

**Zhibiao Guo 1,2, Weitao Li 1,2,\*, Songyang Yin 1,2, Dongshan Yang 1,2 and Zhibo Ma 1,2**


**Abstract:** Fracturing roofs to maintain entry (FRME) is a novel longwall mining method that has been widely used in China, leading a new mining revolution. In order to research the change law of the side abutment pressure and the movement law of the overlying strata when using the FRME, a new abutment pressure monitoring device, namely, the flexible detection unit (FDU), was developed and applied in the field. The monitoring results show that, compared with the head entry (also called the non-splitting entry), the peak value of the lateral abutment pressure in the tail entry (also termed the splitting entry) is reduced by 17.2% on average, and the degree of fluctuation becomes smaller. Then, the finite difference software FLAC3D is used to simulate the stress change of the solid coal on both sides of the panel. The simulation results show that the side abutment pressure of the tail entry decreases obviously, which is consistent with the measured results. Comprehensive analysis points out that, after splitting and cutting the roof, the fissures can change the motion state of the overlying strata, reducing the weight of the overburden borne by the solid coal; therefore, the side abutment pressure is mitigated.

**Keywords:** fracturing roofs to maintain entry (FRME); field measurement; numerical simulation; side abutment pressure; strata movement

#### **1. Introduction**

The technology of gob-side entry retaining (GER) has been widely utilized worldwide since it was put forward in the 1950s [1,2]. Compared with traditional mining methods, this technology has many merits, such as reducing the amount of roadway drivage, saving coal resources, alleviating dynamic mining disturbances, and so on [3]. At present, the common GER approach is to use high-water materials, concrete blocks, gangue walls, and other filling bodies to support the roadway roof and isolate the roadway from the goaf, so that the roadway can be reused [4,5]. However, when thick coal seam mining or rapid mining is carried out, the demand for filling materials increases, and the formation speed of the filling body cannot match the speed of mining, so the roof cannot be supported in time. If the early deformation of the gateroad is serious, it will badly affect the use of the gateroad [6]. Based on this, an innovative GER approach is proposed, which can make full use of the load-bearing capacity of the gangue body to reduce the periodic weighting load and improve the surrounding rock stress environment [7,8].

**Citation:** Guo, Z.; Li, W.; Yin, S.; Yang, D.; Ma, Z. An Innovative Technology for Monitoring the Distribution of Abutment Stress in Longwall Mining. *Energies* **2021**, *14*, 475. https://doi.org/10.3390/en14020475

Received: 21 November 2020; Accepted: 12 January 2021; Published: 18 January 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Many experts and scholars have conducted research on FRME, obtaining a series of rich results. He et al. [9] invented an energy-absorbing bolt with large elongation and high constant resistance. They introduced its structure and action mechanism in detail, and constructed a constitutive equation to derive the frictional resistance of the bolt during operation. Tao et al. [10] carried out a series of static tension tests on constant resistance large deformation (CRLD) bolts. The results further indicated that the CRLD bolts had the characteristics of high support resistance, large elongation, energy absorption, and a negative Poisson's ratio effect. Gao et al. [11] used COMSOL software to simulate and investigate bilateral cumulative tensile explosion technology (BCTET); they found that the cracks created by explosions could be propagated toward the set direction, while no cracks appeared in the non-set direction. Finally, a complete cutting line was formed between the blasting holes. Hu et al. [12] established a mechanical model for the unilateral crack propagation of the BCTET and deduced the yield condition of crack formation. Guo et al. [13] performed a great many axial compression tests on the novel gangue prevention structure in the laboratory, and the results suggested that the torque value of the block cable was closely related to the axial force of the gangue prevention structure. Wang et al. [14] built a mechanical model of the retained entry roof according to the energy variational theory and explored the factors that make the retained entry roof deform. They believed that the rotation of the main roof and the width of the entry had the most obvious influence on the roof deformation, and proposed a method to control the roof deformation by designing a reasonable roof splitting height and roadway width. Fan et al. [15] set up a mechanical model of the FRME and studied the vertical stress and displacement of the coal wall under different heights and angles of the roof cutting through UDEC software. Guo et al. [16] simulated the dynamic response of the roof with different roof fracturing angles by FLAC3D for the first time. They considered that the dynamic response of the roof was moderate and the gateway remained stable when the roof splitting angle was 10–20°, and that when the angle was 20–30°, the dynamic response increased obviously and the gateway became unstable. Sun and others [17] expounded the principle of the FRME to control rockburst, believing that the FRME could reduce the vertical stress and stress fluctuation of the gateway roof. He et al. [6] investigated the load monitoring of hydraulic supports at different positions on the panel of thick coal seams. The results testified that the loads of hydraulic supports near the splitting line could be reduced by approximately 60% compared with those far away from the splitting line; at the same time, the periodic weighting intervals of the roof increased near the splitting line. The theses [18–20] demonstrated that the FRME can be applied under complex geological conditions.

The above papers discussed the surrounding rock deformation characteristics, engineering technical parameters, key technologies, roof control, and so on through the methods of numerical simulation, theoretical analysis, and field measurement. However, there are no analyses of the overlying strata movement after roof cutting and the changes of the abutment pressure caused by the strata movement. The changes of the abutment stress are the most crucial reason for the large deformation and instability of the entry. It is of great significance to research the changes of abutment stress for further understanding the deformation mechanism of surrounding rock of the FRME.

The purpose of this paper is to compare the lateral abutment pressure of the tail entry and the head entry by monitoring the abutment pressure of the solid coal on both sides of the working face with the self-developed, more reliable FDU, and then to explore the influence of the cutting seam on the lateral abutment pressure of the solid coal. Fully considering the reasons for the change of abutment pressure after cutting the roof, the change in overburden movement caused by the slit is then analyzed.

At present, there is still no study on the abutment stress of the FRME. Taking the geological conditions of Lvtang Mine as the engineering background, this paper analyzes the side abutment stress of the coal mass in the tail roadway and head roadway using the self-developed FDU combined with numerical simulation, and explores the changes in abutment stress so as to reveal the movement laws of the overhanging rock.

#### **2. Introduction of the FDU**

#### *2.1. Structure and Parameter of the FDU*

The FDU consists of a flexible body (composed of thin steel wire and polymer material), an injection interface, a plug, and an iron sheet, as shown in Figure 1a. Its main technical specifications are as follows: the measurement range is up to 60 MPa, the length is 500 mm, the diameter is 45 mm, the accuracy is from 0.5 to 1.0%FS, the repeatability is from 0.2 to 0.4%FS, and the resolution is 0.01%FS. The entity diagram of the FDU is shown in Figure 1b.
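To make the full-scale (FS) specifications concrete, a short illustrative calculation (not from the paper) of what the percentage figures imply for a 60 MPa range:

```python
# Illustrative calculation (a sketch, not from the original paper):
# what the FDU's %FS specifications mean in absolute terms for a
# 60 MPa full-scale range.

FULL_SCALE_MPA = 60.0

def fs_to_mpa(percent_fs: float) -> float:
    """Convert a %FS specification into an absolute value in MPa."""
    return FULL_SCALE_MPA * percent_fs / 100.0

accuracy_band = (fs_to_mpa(0.5), fs_to_mpa(1.0))   # 0.5–1.0 %FS
repeatability = (fs_to_mpa(0.2), fs_to_mpa(0.4))   # 0.2–0.4 %FS
resolution = fs_to_mpa(0.01)                        # 0.01 %FS

print(accuracy_band)   # accuracy of roughly ±0.3 to ±0.6 MPa
print(repeatability)   # repeatability of roughly 0.12 to 0.24 MPa
print(resolution)      # resolution of 0.006 MPa, i.e., 6 kPa
```

In other words, the unit resolves changes of a few kilopascals while its absolute accuracy is on the order of a few tenths of a megapascal, which is adequate for tracking abutment pressures of several to tens of MPa.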

**Figure 1.** The picture of the flexible detection unit (FDU): (**a**) Interior structural diagram of the FDU. (**b**) Entity graph of the FDU.

#### *2.2. Application Method of the FDU*

The installation steps for the FDU are as follows: (1) Use a twist drill rod with a diameter slightly larger than that of the FDU to drill holes of different depths into the solid coal. (2) Connect the hand pump, pressure gauge, and metal pipe together through the tee. (3) Fill the hand pump with emulsion and press it until emulsion flows out of one end of the metal tube, discharging the air inside the tube. (4) Fill emulsion through the injection interface of the FDU to drain its internal air. (5) Connect the metal tubes and the FDU. (6) Slowly advance the unit to the bottom of the borehole, as illustrated in Figure 2a. (7) Continue pressing the hand pump, ensuring that the pressure increases steadily until the pressure gauge reads 5.25 MPa, and maintain this pressure for 30 min. At this point, the unit expands to fit the hole wall, as shown in Figure 2b. (8) Relieve the pump pressure and observe the change of the pressure gauge reading. If the reading cannot be stabilized at the preset initial pressure value, repeat step 7 until the initial pressure reaches a stable level. (9) Debug the transmission substation and the master station to ensure that the pressure gauge data can be transmitted to the ground computer promptly and accurately.

**Figure 2.** Comparison picture of the FDU before and after applying pressure: (**a**) Before the expansion. (**b**) After the expansion.

#### *2.3. Working Principle of the FDU*

At present, borehole stress meters are commonly made of rigid materials and can only monitor the abutment pressure at a certain point or on a certain face. Because of the installation clearance of rigid materials, no measurement data are obtained during the elastic deformation phase of the coal, as illustrated in Figure 3a. The objective of the FDU is to monitor the whole change of abutment pressure in the coal. Its working principle is shown in Figure 3b. After liquid at a certain pressure is injected into the FDU, the unit expands and generates a pre-tightening force against the borehole surroundings. As the stoping face advances, the abutment pressure moves forward, causing the borehole surroundings to deform and break under the influence of dynamic pressure. Therefore, as the FDU is squeezed to different degrees, the internal liquid pressure changes accordingly. These pressure changes are transmitted through the metal pipe to the wireless pressure sensor, where the strain gauge on the sensor's elastomer changes its value under load, producing an electrical signal that is then amplified. The amplified signal is converted by the sensor into a voltage signal, and finally the CPU converts the voltage signal into a pressure value and displays it on the pressure gauge. Meanwhile, the pressure value is transferred in real time to the transmission substation through wireless communication, which uploads it to the ground computer via optical cable.

**Figure 3.** Installation diagram of borehole stress meters with different materials: (**a**) Installation diagram of borehole stress meters with rigid materials. (**b**) Installation diagram of borehole stress meters with flexible materials.

The abutment pressure monitoring system is made up of five parts: the FDU, pressure gauges, transmission substations, the main transmission station, and a computer. The working principle of the system is shown in Figure 4. The FDU monitors the pressure changes of the surrounding rock in the coal body in real time. The pressure gauge transmits the pressure value to the transmission substation by wireless signal every 5 min. Each substation can receive the signals of multiple pressure gauges at the same time and transfers the received signals to the main transmission station through the line. The main station can simultaneously process the signals of several substations and then transmits all the information to the ground transmission interface over optical fiber. Finally, through the independently developed software system on the computer, the observed data can be displayed in the form of curves and charts. The observed data can also be shown to users graphically through the Internet, which is convenient for analyzing and viewing them remotely and in real time.

**Figure 4.** Working principle of the abutment pressure monitoring system.

#### *2.4. Principle of the FRME*

The implementation procedure of the FRME can be divided into four parts, which are illustrated in Figure 5. Before the mining of the working face, CRLD anchor cables are installed in the roadway to strengthen the roof, so as to prevent disturbance of the roadway roof by the subsequent roof pre-split blasting and the roof collapse in the goaf, as shown in Figure 5a. Then, blasting holes are drilled on the working-face side of the tail entry according to the designed height, angle, and spacing. The cumulative energy explosion tubes [11,19], emulsified blasting powders, detonators, and stemming are placed sequentially into the blasting holes to form a continuous fracturing line along the axis of the roadway, as shown in Figure 5b. After the mining of the panel, the goaf side of the roadway roof caves automatically along the splitting face under the action of mine pressure, forming gangue of different sizes and shapes during the collapse. Meanwhile, in order to prevent the gangue from rushing into the entry during its fall and compaction, U-shaped steels, metal meshes, and single hydraulic pillars beside the entry are used to block it, as shown in Figure 5c. In this way, the compacted gangue can serve as one side of the entry and continue to serve the mining of the next panel, as shown in Figure 5d. The three-dimensional schematic diagram of the FRME is shown in Figure 6.

**Figure 5.** The implementation procedure of the fracturing roofs to maintain entry (FRME): (**a**) The procedure of strengthening the entry roof by constructing the constant resistance large deformation (CRLD) cables. (**b**) The procedure of implementing blasting holes and fracturing the retained entry roof. (**c**) The procedure of installing U-shaped steels and entry-in supports when the coal is mined. (**d**) The procedure of withdrawing the entry-in supports when the retained entry keeps stable.

**Figure 6.** Three-dimensional schematic diagram of the FRME.

#### **3. Methods**

#### *3.1. Method of Numerical Analysis*

3.1.1. Study Site

Lvtang Coal Mine is located in the Bijie area, Guizhou Province, China. This article analyzes the actual engineering geological conditions of the S204 working face of Lvtang Coal Mine. This panel is the first panel in the mining area, and the adjacent working faces are the S205 and S203 panels. The average strike length of the S204 panel is 310 m and the dip length is 115 m. The dip angle of the coal seam is 3–9°, with an average of 6°. The thickness of the coal seam is 0.9–7.55 m, with an average of 3.4 m. The layout of the panels is shown in Figure 7. The mine is classified as a coal and gas outburst mine. In order to reduce the gas concentration and achieve safe mining, it is necessary to excavate a gas drainage roadway above the working face to extract gas from the coal before mining. It is planned to use the FRME in the tail entry, while the roof of the head entry will be allowed to cave naturally.

**Figure 7.** The layout of panels.

The average burial depth of the S204 mining panel is 210 m. The geological drilling picture of the S204 panel is shown in Figure 8. The roof strata above the S204 face are mainly composed of silty mudstone and muddy siltstone, and the floor strata are made up of silty mudstone, coal, and muddy siltstone. It can be seen from the stratigraphic column that the lithology of the roof and the coal seam changes greatly.


**Figure 8.** The geological drilling picture of the S204 panel.

#### 3.1.2. Numerical Model

According to the symmetry of the model, and in order to shorten the calculation time, the strike length of the model is halved relative to the actual engineering geological conditions. The work in [21] also employs the symmetry principle for numerical simulation. At the same time, in order to eliminate the influence of boundary conditions, 60 m boundary coal pillars are added on the left and right sides of the model. The model is divided into 8 layers, and its size (length × width × height) is 244 m × 160 m × 50 m. The Mohr–Coulomb criterion is used in the model. Constraints are imposed on the surrounding and bottom surfaces to restrict their movement. A stress of 5.25 MPa is applied on the upper surface to simulate the weight of the overlying rock. According to the geological situation of the S204 coalface in Lvtang Coal Mine, a three-dimensional numerical model is established, as shown in Figure 9.
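The 5.25 MPa boundary load is consistent with the overburden weight at the 210 m average burial depth. A quick check, assuming a typical overburden unit weight of 25 kN/m³ (our assumption; the paper does not state this value explicitly):

```python
# Sanity check (assumed unit weight, not stated in the paper):
# vertical overburden stress, sigma_v = gamma * H.

gamma = 25.0e3   # assumed unit weight of overlying strata, N/m^3
H = 210.0        # average burial depth of the S204 panel, m

sigma_v_pa = gamma * H          # vertical stress in Pa
sigma_v_mpa = sigma_v_pa / 1e6  # convert to MPa

print(sigma_v_mpa)  # 5.25 MPa, matching the applied boundary stress
```

The same 5.25 MPa figure also appears as the pre-tightening pressure in the FDU installation procedure.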

**Figure 9.** Three-dimensional numerical model of S204 coalface.

The splitting surface is an abstract simulation of the blasting effect. Because the distance between the blast holes is extremely small and the holes can penetrate each other after blasting, the blast holes and the slit between them can be simplified and regarded as a fissure surface composed of uniform blocks in the numerical model. The length of the on-site blast holes is 8 m and their angle is 15°; therefore, the length of the fissure surface is also 8 m, its angle is 15°, and its width equals the blast-hole diameter of 48 mm. The blasting is simulated by excavating the fissure face.

The mechanical parameters of each rock mass are shown in Table 1. Among them, the parameters of the coal body and roof rock were obtained by the GSI rock mass classification method, on the basis of rock mechanics parameters from laboratory uniaxial compression tests and field borehole imaging used to estimate the GSI value. The parameters of the floor sandy mudstone were obtained by the GSI rock mass classification method based on the measured mechanical parameters from coal mine exploration.


**Table 1.** Mechanical parameters of rock mass.

#### *3.2. Method of Field Measurement*

In order to investigate the evolution laws of the side abutment stress in the tail entry and head entry, and the influence of the fracture and movement of the overlying strata on gateway stability after using the FRME, 12 sets of the FDU were installed in the solid coal of the head entry and tail entry, 100 m in front of the coalface, to independently monitor the changes in coal mass stress. The installation depths of the FDUs differ because the authors of [21–23] found that the load-bearing capacity of coal is related to the distance from the coal wall. The distances between the installation positions of the FDUs and the gateway surface are 3 m, 5 m, 7 m, 9 m, 11 m, and 13 m, as shown in Figure 10a. According to the actual situation of the mining face, twist drill pipes with a diameter of 48 mm were selected to drill the holes, which are perpendicular to the coal wall and 1 m from the gateway floor, as shown in Figure 10b.

**Figure 10.** The test site and position of the FDU: (**a**) Longwall panel layout and the test site of the FDU. (**b**) A-A cross section in Figure 10a.

#### **4. Data Analysis of the Abutment Stress**

#### *4.1. Data Analysis of the Field Measurement*

#### 4.1.1. Monitoring Result of the Strike Abutment Stress

Installation of stations 1# and 2# began 200 m in front of the working face on 1 August 2019. Owing to the needs of coal mining and construction constraints, the installation was officially completed and the network signal connected at about 100 m in front of the working face on 16 August. Therefore, the earlier pressure changes of the FDUs were not recorded; pressure variations were recorded only after the signal was switched on. The units' pressures were recorded until 15 September, when the stations were 100 m behind the working face. The strike abutment pressure monitoring curve of station 1# is shown in Figure 11.

**Figure 11.** Strike abutment pressure curve of the roof-noncutting entry (head entry) of station 1#.

The abutment pressure at the 3 m position of station 1# rises the slowest, and the increasing trend is not obvious, likely because the coal mass in this range had yielded or even been destroyed by the strong abutment pressure, so its bearing capacity is smaller. The variation tendencies of the strike abutment pressure at 5–13 m are the same, showing a state of increasing first, then decreasing, then increasing, and finally stabilizing. However, within the range of 100 to 20 m in front of the working face, the speed and amplitude of the strike abutment pressure growth increase in turn with the depth into the coal.

About 20 m in front of the panel, the strike abutment stress at 5–13 m decreased rapidly, and the degree of stress fluctuation is large. This is because, as the longwall face approaches the FDU, a concentration area of abutment pressure forms ahead of the face: the abutment pressure is transmitted through the roof to the coal rib, where a stress concentration zone forms and causes plastic failure of the coal mass. With the advancement of the longwall face, the failure continues to develop deeper. Finally, a plastic zone with a length of about 20 m forms in the coal rib, and as a result the carrying capacity of the solid coal is weakened [24]. After the longwall face pushes past the units, the strike abutment stress at 3–13 m grows slowly with increasing distance from the face, and the rising trend is almost the same in amplitude at all depths. This is because, after the coal seam is mined, the weight of the overburden above the longwall face shifts to the solid coal on both sides of the face and to the gangue falling in the goaf, so the stress in the solid coal begins to increase slowly. It should be mentioned that the irregular fluctuation of the abutment pressure behind the longwall face is stress fluctuation caused by the periodic collapse of the goaf roof.

The monitored distance of the abutment pressure at 2# station in the tail entry is the same as that of 1# station in the head entry. However, the abutment pressure has changed significantly. Strike abutment pressure curve of station 2# in the tail entry is illustrated in Figure 12.

**Figure 12.** Strike abutment pressure curve of the roof-cutting entry (tail entry) of station 2#.

The change trend of the strike abutment pressure at the 5–13 m measurement points of the solid coal in the roof-cutting roadway (tail roadway) is the same as in the roof-noncutting roadway (head roadway), showing a state of increasing first, then decreasing, then increasing, and finally stabilizing. When the FDU is located more than 60 m in front of the working face, the abutment pressure at the different measuring points of station 2# increases slowly, and the curve is relatively stable. This indicates that coal more than 60 m in front of the working face is only slightly affected by mining pressure. When the FDU is located in the range of 60 to 10 m in front of the working face, the abutment pressure at 5–13 m of station 2# begins to increase rapidly, reaching a maximum value about 10 m in front of the working face. The growth rate of the abutment pressure at 9–11 m is obviously greater than that at other depths, indicating that the coal body in this range is very susceptible to the impact of mining pressure. Moreover, the abutment pressure at 9 m is always greater than that at other depths.

As the distance between the FDU and the working face decreases from 10 m to zero, the abutment pressure at each measuring point drops rapidly, which is related to the mining of the working face. From near the working face to 60 m behind it, the abutment pressure at each measuring point increases slowly. When the FDU lags more than 60 m behind the working face, the abutment pressure remains stable.

However, we discover that there are several differences in the strike abutment pressure curve of each measuring point between the two stations:


(4) The abutment pressure at the 9 m measuring point of station 2# is the highest; at the same time, the abutment pressure at each measuring point generally increases at first and then decreases as the depth from the coal wall increases.

The variation trend of the strike abutment pressure at each measuring point of station 2# is almost the same as that of station 1#. Therefore, the strike abutment pressure of the solid coal can be divided into five zones: slow increasing zone I, sharp increasing zone II, rapid reducing zone III, fluctuation enlarging zone (enlarging zone) IV, and stable zone V.

Slow increasing zone I: in this area, from the beginning position of the abutment pressure rise to 55 m in front of the working face, the stress increases slowly, and the abutment pressure curve is relatively smooth. At this time, the deformations of the entry surroundings are also relatively small, and the influence of the advanced abutment pressure on solid coal is not obvious.

Sharp increasing zone II: from about 55 m to 15 m in front of the face, the abutment pressure goes up rapidly, and the abutment stress curve becomes steep. At this time, the deformations of the roadway surroundings become very pronounced, being obviously affected by the advanced abutment pressure.

Rapid reducing zone III: the stress decreases sharply from about 15 m in front of the panel to near the longwall face. At this time, the deformation of the roadway surrounding rock and its rate are relatively large. The coal body on the solid coal side of the entry is plastically damaged, and cracks and holes appear in the internal coal mass. The coal releases stress, and the abutment stress starts to decrease.

Fluctuation enlarging zone (enlarging zone) IV: this area extends from the vicinity of the longwall face to 60 m behind the working face. The abutment stress shows a state of fluctuating increase because the roof behind the working face breaks periodically under the action of periodic pressure.

Stable zone V: beyond 60 m behind the working face, the stress fluctuation is not obvious and the curve becomes a stable straight line. This suggests that when the lagging distance from the coal face is greater than 60 m, the overlying rock has stabilized.
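The five-zone division above can be summarized as a simple lookup on the distance to the working face (positive ahead of the face, negative behind it). The boundary values below are the ones reported for the station data; the function itself is only a schematic illustration:

```python
# Schematic classifier for the five strike abutment pressure zones
# described above. Distance convention: positive = meters ahead of
# the working face, negative = meters behind it.

def strike_stress_zone(distance_m: float) -> str:
    if distance_m > 55:
        return "I: slow increasing"        # smooth, slow stress rise
    elif distance_m > 15:
        return "II: sharp increasing"      # steep rise, strong deformation
    elif distance_m > 0:
        return "III: rapid reducing"       # plastic failure, stress release
    elif distance_m > -60:
        return "IV: fluctuation enlarging" # periodic roof breakage in gob
    else:
        return "V: stable"                 # overlying rock stabilized

print(strike_stress_zone(70))   # I: slow increasing
print(strike_stress_zone(-30))  # IV: fluctuation enlarging
```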

#### 4.1.2. Monitoring Result of the Side Abutment Stress

In order to analyze the abutment pressure distribution of the coal mass at different depths on the solid coal side of the tail entry and head entry, and the relationship between the side abutment pressure and the distance to the working face, the average pressure gauge reading of each unit at 60 m, 40 m, and 20 m in front of and behind the mining panel was taken to draw the pressure histograms. Figure 13 shows the resulting distributions and their relationship to the working face distance.
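The averaging step behind the Figure 13 histograms can be sketched as follows. The readings here are hypothetical placeholders, since the field values themselves are only shown graphically:

```python
# Sketch of the averaging used for the Figure 13 histograms: for each
# FDU depth, average the gauge readings taken while the station sits in
# a given distance band from the face. All readings below are
# hypothetical, for illustration only.

from statistics import mean

# {depth_from_coal_wall_m: [pressure readings in MPa at ~60 m ahead]}
readings_60m_ahead = {
    3: [4.1, 4.3, 4.2],
    5: [5.0, 5.2, 5.1],
    9: [7.8, 8.0, 8.2],
}

# One averaged bar per depth for the 60 m-ahead histogram.
averaged = {depth: round(mean(vals), 2)
            for depth, vals in readings_60m_ahead.items()}
print(averaged)  # {3: 4.2, 5: 5.1, 9: 8.0}
```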

From Figure 13a,b, we can see that when the stations are in front of the working face, the lateral abutment pressure on both sides of the panel increases first and then declines with depth. The peak of the lateral abutment pressure in the tail roadway and head roadway is 9 m and 11 m from the coal wall, respectively. The rising range of lateral abutment pressure in the head entry is therefore larger than that of the tail entry. At the same time, compared with the head entry, the peak value of the lateral abutment pressure is smaller in the tail entry, indicating that roof cutting has a good stress-relieving effect. The closer the FDU is to the working face, the greater the lateral abutment pressure becomes, which suggests that the lateral abutment pressure is in a process of dynamic change and has an obvious space-time effect [23].

Figure 13c,d shows that when the measuring stations are behind the coalface, the farther the same measuring point is from the coalface, the higher the recovery degree of the abutment pressure, which is related to the transfer of overburden weight to the solid coal on both sides of the face after coal seam mining [25]. The lateral abutment pressure of the tail roadway climbs first and then declines, while the lateral abutment pressure of the head roadway shows an increasing trend with no obvious regularity. Compared with the head roadway, the lateral abutment pressure of the tail roadway behind the coalface at the same distance is generally smaller.

**Figure 13.** Abutment pressure distribution of coal mass with different depths at the side of solid coal in the tail entry and head entry and the relationship between the side abutment pressure and the working face distance: (**a**) Advanced side abutment pressure in the non-splitting entry (head entry). (**b**) Advanced side abutment pressure in the roof-splitting entry (tail entry). (**c**) Lagged side abutment pressure in the roof-splitting entry (tail entry). (**d**) Lagged side abutment pressure in the non-splitting entry (head entry).

The lateral abutment stress of the tail roadway reaches its maximum at 9 m from the coal wall, and during the rise of the lateral abutment pressure, its increase over 7–9 m from the coal wall is obviously larger than that over 3–7 m. In contrast, in the head roadway, the lateral abutment pressure first decreases slightly from 3 m to 5 m from the coal wall and then keeps increasing from 5 m to 13 m; however, the rate of increase in this range is not uniform: it slows down over 5–9 m from the coal wall and begins to accelerate over 9–13 m.

The variation curve of the lateral abutment peak position is shown in Figure 14. We can find that the lateral abutment pressure of the two roadways constantly changes, and the peak position moves progressively deeper into the coal mass until, in the end, it no longer changes beyond a certain depth.

Nonetheless, we can also see that there is a significant difference in the lateral abutment pressure between the two roadways. The lateral abutment pressure of the head entry goes through three step-like fluctuations before reaching stability, while that of the tail roadway goes through two steps to reach a stable state; a smaller degree of fluctuation is more conducive to maintaining entry stability, which proves that the slit can change the structure of the roadway roof. As a result, when the technology of the FRME is employed, the solid coal is less affected by the breaking and collapsing gob roof.

#### *4.2. Data Analysis of the Numerical Model*

During the simulated excavation, in order to best reflect field conditions, we first simulated the excavation of the two entries; then the entry supports were installed and the model was run until the calculation reached equilibrium; after that, the splitting face was excavated to simulate blasting. Finally, the simulated extraction of the mining panel was carried out. The excavation length of the mining panel was 5 m per step, mined a total of 32 times. The cumulative lengths of 40 m and 80 m were selected to study the change laws of the side abutment pressure of the FRME.

The three-dimensional cloud maps of the abutment stress distribution in the surrounding rock at the coal face are shown in Figure 15.

From the three-dimensional cloud maps, we can see that the overall stress changes after the panel is mined. When the roof of the mined-out area has not completely caved in, a pressure relief area (blue area) is formed. Only after the gob roof collapses fully and the caved rock is compacted by the overlying strata does the stress in the goaf start to increase slowly. After the mining of the face, the weight of the overhanging rock is supported by the gangue in the goaf, the coal body in front of the panel, and the solid coal on both sides of the roadway, where the stress increases distinctly. When the FRME technology is not used in the tail entry, after 40 m or 80 m of excavation of the working face, symmetrical stress concentration areas appear in the solid coal on both sides of the working face (see Figure 15a,b). When the tail entry adopts the FRME technology, after the working face is excavated to 40 m or 80 m, asymmetrical stress concentration areas appear in the solid coal on both sides of the working face because of the blocking effect of the slit (see Figure 15c,d).

**Figure 15.** The three-dimensional cloud maps of the abutment stress distribution: (**a**) The stress distribution of 40 m exploitation without roof cutting. (**b**) The stress distribution of 80 m exploitation without roof cutting. (**c**) The stress distribution of 40 m exploitation with roof cutting. (**d**) The stress distribution of 80 m exploitation with roof cutting.

The side abutment pressure of the tail roadway with roof splitting is significantly lower than that of the head roadway without roof splitting. The simulation results show that the peak value of the lateral abutment pressure in the tail roadway without roof cutting is around 22.4 MPa. After using the technology of the FRME, the peak value of the lateral abutment pressure in the tail roadway with roof cutting is reduced to 18.3 MPa, a reduction of 18.3%.

Moreover, compared with the average lateral abutment stress in head roadway without roof cutting, the average lateral abutment stress of the tail roadway with roof cutting is also reduced by 18.3%. The technology of the FRME can change the stress state in the surrounding rock and have an excellent pressure relieving effect, which has positive significance to maintain the stability of tail roadway.

After increasing the length of the boundary coal pillar, we can more intuitively observe the influence range of lateral abutment pressure. When the roof is not cut, the lateral abutment pressure in the solid coal of the tail entry returns to the original rock stress at 47 m (see Figure 15a,b). After cutting the roof, the lateral abutment pressure in the solid coal of the tail entry is restored to the original rock stress at 35 m, and the influence range of the lateral abutment pressure is reduced by 25.5% (see Figure 15c,d).
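The relief percentages quoted above follow directly from the simulated values:

```python
# Reproduction of the relief percentages quoted from the simulation.

def reduction_percent(before: float, after: float) -> float:
    """Relative reduction of a quantity, in percent."""
    return (before - after) / before * 100.0

# Peak lateral abutment pressure: 22.4 MPa without roof cutting,
# 18.3 MPa with roof cutting.
peak_relief = reduction_percent(22.4, 18.3)

# Influence range of the lateral abutment pressure: 47 m without
# roof cutting, 35 m with roof cutting.
range_relief = reduction_percent(47.0, 35.0)

print(round(peak_relief, 1))   # 18.3 (%)
print(round(range_relief, 1))  # 25.5 (%)
```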

After pre-cracking the roof, the reduction of the lateral abutment pressure in the tail entry is closely related to the slit. The cutting fissure blocks the connection between the goaf roof and the entry roof; therefore, the stress of the goaf roof cannot be transmitted to the solid coal of the roadway. At the same time, the cutting seam shortens the side roof on the gob side of the entry, so when the main roof rotates and sinks after the mining of the working face, the degree of extrusion deformation imposed on the roadway roof is significantly reduced. These are the two main reasons for the decrease of lateral abutment pressure in the tail entry after pre-splitting the roof.

In order to further explain the technical advantages of the FRME, the models with the working face mined to 40 m and 80 m were selected for analysis and sliced along the bottom of the entry to explore the changes in lateral abutment pressure. We can clearly see that when the FRME technology is not used in the tail entry, the lateral abutment pressure remains symmetrically distributed as the working face advances, as shown in Figure 16a,b. When the FRME technology is used in the tail entry, the lateral abutment pressure on both sides of the working face changes significantly. When the working face advances to 40 m, the stress concentration area in the solid coal of the head entry is closer to the head entry, while the stress concentration on the tail entry side is not obvious and the abutment pressure is 18.4 MPa, as shown in Figure 16c. When the working face advances to 80 m, the stress concentration area in the solid coal of the head entry tends to move away from the head entry, and the abutment pressure of the tail entry is about 20 MPa, larger than at the 40 m mining distance, as shown in Figure 16d.

**Figure 16.** The plane development of the abutment stress distribution: (**a**) The stress distribution of 40 m exploitation without roof cutting. (**b**) The stress distribution of 80 m exploitation without roof cutting. (**c**) The stress distribution of 40 m exploitation with roof cutting. (**d**) The stress distribution of 80 m exploitation with roof cutting.

This phenomenon shows that the lateral abutment pressure increases with the distance behind the working face, which is closely related to the movement of the overburden: the farther a point lags behind the working face, the more complete the overburden movement, the greater the overburden weight carried by the solid coal, and hence the larger the lateral abutment pressure.

Figure 17 shows that with the face advanced to 80 m, a plastic failure zone 11 m wide and extending 17 m ahead of the working face is formed in the solid coal of the head roadway (see Figure 17a), while a plastic failure zone 9 m wide and extending 9.5 m ahead of the working face is formed in the solid coal of the tail roadway (see Figure 17b). The length and width of the plastic failure zone in the slotted entry are smaller than those in the unslotted entry.

**Figure 17.** Schematic diagram of plastic zone distribution in roadway: (**a**) Plastic zone of solid coal in the head entry. (**b**) Plastic zone of solid coal in the tail entry.

After pre-splitting and cutting the roof, the influence of coal mining on the solid coal in the tail entry is lessened; that is, the fissure changes the movement state of the overlying rock, so that the weight of the overburden supported by the solid coal is reduced. Therefore, the degree of stress concentration is lower, and the impact of the upper strata on the coal body becomes smaller. On the other hand, the movement state of the overlying rock on the side of the unslotted head entry has not changed. The overburden weight borne by the solid coal of the head entry is relatively larger, and the degree of stress concentration is higher, which has a greater impact on the coal body.

#### **5. Discussion of Overburden Movement**

#### *5.1. Overburden Movement Status in Traditional Coal Pillar Mining*

In view of the difference of the abutment pressure at each measuring point between the two stations, we believe that the cutting seam is the main reason for this phenomenon, because the formation of the splitting face can make the roof in the mined-out area collapse more fully, which affects the motion state of the entire coal face roof [26].

As shown in Figure 18, in traditional mining, after the coal seam is mined, the gob roof collapses insufficiently, so the goaf is incompletely filled by the falling gangue. A large space remains above the goaf, into which the main roof can continue to subside; when its tensile strength limit is reached, the main roof breaks, generating a dynamic pressure impact on the roadway and in turn affecting the side abutment pressure.

**Figure 18.** Overburden movement status in traditional coal pillar mining.

Meanwhile, the rotation and subsidence of the main roof squeeze the side roof of the gob sides, creating a stress concentration zone in the solid coal, and the abutment pressure is further increased. The movement of the rock strata continues to develop upward. If an upper stratum also fractures, it once again impacts the solid coal of the roadway, so that the weight of the overlying strata borne by the solid coal in both entries rises further, while the goaf gangue bears only a small part of the weight of the overlying rock. Therefore, the abutment pressure on the solid coal increases significantly, and this process continues until the overburden layers only bend and sink without breaking. As the abutment pressure on the solid coal increases, the degree of damage to the coal body grows, and the plastic zone extends a large distance from the mining panel. Therefore, the peak point of the abutment pressure lies far from the mining panel. Previous studies [20,27] hold that the abutment pressure reaches its maximum at the end of plastic failure (the elastic-plastic junction).

#### *5.2. Overburden Movement Status in the FRME*

Figure 19 shows that after splitting and cutting the roof, the immediate roof of the goaf within the roof-cutting height collapses smoothly under the action of mine pressure, and the broken cantilever beam further fills the gob, which increases the bulking coefficient of the gangue. The gangue can rapidly come into contact with the main roof and, once compacted by it, acquires a certain bearing capacity to support it. Therefore, the space for the main roof to continue sinking is very small, which prevents the overlying strata from breaking further. Simultaneously, the rotation and subsidence of the main roof are reduced or even eliminated. On the one hand, the weight of the overburden borne by the solid coal is reduced; on the other hand, the compacted gangue exerts an inclined force on the roof of the retained roadway. As a result, the side abutment pressure of the solid coal markedly decreases. In addition, the slit can cut off stress transmission between the entry roof and the goaf roof, preventing dynamic loads in the goaf from being transmitted to the solid coal, which is another reason for the reduction of the side abutment stress. The reduced abutment pressure lowers the degree of coal damage and shrinks the range of the plastic zone. Therefore, the peak point of the abutment pressure lies closer to the coal face.

**Figure 19.** Overburden movement status in the FRME.

The differences between the research on abutment pressure in this paper and in previous papers are as follows.

Yan et al. [25] discussed the influence range of the lateral abutment pressure in the coal pillar under the traditional mining mode through numerical simulation, but did not study the change of abutment pressure when using the FRME. The FRME is an innovative technology for the GER; studying its lateral abutment pressure not only provides a full understanding of this technology, but also better explains its advantages over traditional mining technology. Zhen et al. [28] compared and analyzed the difference in abutment pressure between traditional mining and the FRME through numerical simulation. However, they did not explore the reasons for this difference in depth. Based on field measurements, this paper argues that overburden movement is the root cause of the difference in abutment pressure under different mining conditions. The state of overburden movement when using the FRME is described in detail, and the factors reducing the abutment pressure are analyzed from several aspects. Yao et al. [29] discussed the lateral abutment pressure by means of numerical simulation and borehole stress measurement, but did not analyze the abutment pressure in the direction of the coalface strike. On the basis of FDU field measurements, this paper examines in detail the abutment pressure in the strike and dip directions of the solid coal beside the entry under the FRME mining mode; the strike abutment pressure of the solid coal can be divided into five zones: slow increasing zone, sharp increasing zone, rapid decreasing zone, fluctuation rising zone (rising zone), and stable zone. The differences in abutment pressure on the two sides of the working face are compared in detail. Zhang et al. [30] used vibrating wire stress meters to monitor the advance abutment pressure of the working face.
Because vibrating wire stress meters are made of rigid materials and cannot deform with the borehole surrounding rock, they lose a large amount of data and cannot truly capture the change of abutment pressure as the working face advances. Shen et al. [31] used vibrating wire stress meters to monitor the stress changes of the roadway, orienting the meters to monitor stress parallel to the roadway axis, perpendicular to it, and at 45°, respectively. Because the orientation of the stress meters is difficult to control when they are installed in the borehole, and traditional stress meters can only monitor the change of abutment pressure in a single direction, they cannot achieve omnidirectional abutment pressure monitoring in the borehole, resulting in large errors in the measurement data. To overcome these disadvantages of traditional borehole stress meters, the FDU was independently developed to monitor the changes of abutment pressure in coal. When the FDU is used, it actively applies prestress to the surrounding rock of the borehole after expansion, so that it is in full contact with the surrounding rock and moves cooperatively with it; this allows the change of abutment pressure to be monitored over the whole process, and the reliability of the measurement data is high.

#### **6. Conclusions**

The FRME has broad application prospects. When this technology is applied, it is necessary to research the variations of the abutment pressure in depth and to reveal the movement laws of the overlying strata. Because the movement of the overburden is the fundamental cause of deformation of the roadway surroundings, we should not focus only on surface indicators such as the convergence of roof and floor, the stress monitoring of anchor cables, hydraulic support pressure, and so on. A clear understanding of the movement laws of the overlying rock is of great significance for predicting roadway deformation, coalface pressure, coal mine disasters, and so on. The main conclusions of this paper are as follows.

Through field applications of the self-developed abutment pressure monitoring equipment, the measured data fully met the needs of the analysis, which demonstrates the reliability and accuracy of the new equipment. In the future, more site monitoring tests and in-depth research on the change laws of abutment pressure are needed.

On-site monitoring data indicate that the side abutment pressure in the entry with roof cutting differs from that in the entry without roof cutting. In the direction of the coalface strike, the peak value of abutment stress at each measuring point of the entry with roof cutting is reduced by an average of 17.2%, and the peak point is also closer to the coalface. However, the variation trend of the strike abutment pressure of the solid coal in the two roadways is almost the same, and it can be divided into five zones: slow increasing zone, sharp increasing zone, rapid decreasing zone, fluctuation rising zone (rising zone), and stable zone. In the dip direction of the coalface, the positions of the peak points of the side abutment pressure in the two gateways differ: the peak point is 9 m from the coal wall in the roadway with roof cutting and 11 m from the coal wall in the roadway without roof cutting.
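To make the per-point comparison explicit, the sketch below shows how an average percentage reduction in peak abutment pressure across measuring points can be computed. The pressure values are invented placeholders for illustration only, not the paper's measured data; only the method (per-point percentage reduction, averaged over all measuring points) follows the text.

```python
def mean_peak_reduction(uncut_peaks, cut_peaks):
    """Average percentage reduction in peak abutment pressure,
    comparing the entry without roof cutting (uncut) against the
    entry with roof cutting (cut), point by point."""
    reductions = [(u - c) / u * 100.0 for u, c in zip(uncut_peaks, cut_peaks)]
    return sum(reductions) / len(reductions)

# Placeholder peak pressures (MPa) at three hypothetical measuring points:
uncut = [20.0, 24.0, 18.0]
cut = [16.0, 20.0, 15.0]
print(f"Average peak reduction: {mean_peak_reduction(uncut, cut):.1f}%")
```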

The numerical simulation results show that the lateral abutment pressure of the roadway with roof cutting is significantly reduced. Meanwhile, as the stoping face advances, the width and length of the plastic zone formed in the solid coal of the roadway with roof cutting are obviously smaller than those of the roadway without roof cutting.

Careful analysis of the abutment pressure data of both roadways, combined with field practice, leads us to believe that the cutting fissure has an obvious effect on the movement of the overlying rock. After splitting and cutting the roof, the immediate roof of the goaf within the roof-cutting height collapses smoothly under the action of mine pressure, and the broken cantilever beam further fills the gob, which increases the bulking coefficient of the gangue. The gangue can rapidly come into contact with the main roof and, once compacted by it, acquires a certain bearing capacity to support it. Therefore, the space for the main roof to continue sinking is very small, which prevents the overlying strata from breaking further.

**Author Contributions:** All the authors contributed to this paper. Z.G. and W.L. discussed and conceived the research. W.L. and D.Y. performed the numerical simulation. W.L. conducted the field test. S.Y. and Z.M. revised the article. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


MDPI, St. Alban-Anlage 66, 4052 Basel, Switzerland. Tel. +41 61 683 77 34; Fax +41 61 302 89 18; www.mdpi.com

*Energies* Editorial Office E-mail: energies@mdpi.com www.mdpi.com/journal/energies
