1. Introduction
Causality analysis requires a specific approach and designated methods that allow one to determine the internal relations between different variables [1]. Various approaches have been developed in different domains. Currently, there are four leading approaches to detecting causality from historical process data: Granger Causality, Transfer Entropy, Cross-correlation, and Partial Directed Coherence [2]. Cross-correlation is simple, but it assumes linearity [3]. Some of them are based on a process model, like the coherence method [4], the Granger causality [5] and its extension, Partial Directed Coherence. Spectral Granger causality uses the Fourier Transform, which limits the method to linear and stationary cases. Others belong to the group of data-driven and model-free methods, like the Transfer Entropy approach [6]. The analysis of complex and time-varying control systems, for which the exact model is unknown or its estimation is highly uncertain, encourages the use of data-driven methods. This is a great simplification in the field of causality determination as well as fault propagation research. Faults can be observed, among others, as system oscillations.
The industrial demands for causality analysis were defined quite early [7], but research on causality root-cause analysis in the area of process control is relatively new, and only a few works have been published. The introductory paper [2] has been followed by some comparisons of Granger causality and Transfer Entropy in [8,9]. Transfer Entropy research has been continued in [10] with a hybrid combination of TE and a modified conditional mutual information (CMI) approach, which uses multi-valued alarm series instead of data time series. The authors of [11] take the opposite approach: assuming that periodic elements do not bring any value, they use a denoising and periodicity-removing time-delayed convergent cross mapping (TD-CCM), which is an extension of the original algorithm [12] and its time-delayed version [13].
The Transfer Entropy approach has been addressed by a few works. A systematic Transfer Entropy workflow for oscillation diagnosis is proposed in [14], while the impact of outliers is addressed in [15]. Finally, the TE approach is improved by information granulation in [16].
AI-driven methods, like neural or Bayesian networks, are uncommon in process industry applications. They are often much harder to train because of the size and complexity of the data sets involved [11]. Moreover, training would require known causality examples. For these reasons, the authors decided not to use AI and to develop the TE approach instead.
Our approach assumes that periodicity contains important information and propagates through the system. Therefore, the Transfer Entropy approach uses decomposed oscillation elements.
An example of oscillatory behavior propagating through a control system is discussed in [17]. Oscillation diagnosis may allow one to trace the root cause of such faults. It allows answering the question of which control loop of a given system introduces disturbances into the system, causing its improper performance. As a result, too aggressive or too sluggish operation [18] is observed. Thus, assuming that the oscillations in a multi-loop control system carry valuable information, it is imperative to conduct the research in such a direction. In the considered analysis, the innovative application of the Transfer Entropy approach is combined with oscillation decomposition, leading toward time-frequency causality analysis.
The algorithm should work not only in a simulation or laboratory environment but also in industrial reality, which introduces various disturbances and limitations. Non-stationarity and non-Gaussian properties are common; they often show up in the form of fat tails in the variable distributions and outliers. One has to be aware that outlying observations are very frequent in industrial practice [19]. They may deteriorate statistical estimation and regression analysis [20], and also oscillation decomposition [21]. This research takes this aspect into account as well, mitigating the problem through subset decomposition and median filtering. Our contribution consists of two parts. First, we show how the performance of decomposition methods can be extended to very long time series. Second, real-world examples have shown the inadequacy of existing oscillation decomposition methods for industrial data. This observation opens a new direction for further research.
The combination of the Transfer Entropy method with frequency analysis has not been used before with either simulation or real data; however, it may be a great simplification in the field of fault propagation diagnosis based on a data-driven, model-free approach.
Once the causality graph is identified, misbehavior can be traced in multi-loop configurations. Frequently, the poor tuning of a single loop causes low performance in the rest of the process. Thus, the identification of such loops may help control engineers in the process of multi-loop system tuning and fault diagnosis. Moreover, the causality graphs may help to properly organize an intelligent alarming system that protects against alarm flooding, making the installation operator’s job easier.
The paper is organized as follows: Section 2 contains an overview of the applied methods and algorithms, while Section 3 describes the simulation and industrial case studies. Section 4 presents the obtained results, including causality diagrams reflecting relationships between variables based on oscillation signals. Section 5 concludes the paper and identifies open issues for further research.
2. Proposed Methodology
It has been shown that the diagnosis of large-scale industrial control systems, especially the creation of a good representation by a mathematical model, is a tedious process, and its accuracy depends on many factors [22]. Depending on the adopted methodology, the analysis can be based on various types of data, ranging from data measured directly from the given control loops (raw datasets), through noise, to oscillations occurring in almost every system [15]. The last-mentioned type of data is strongly related to Time-Frequency (T-F) analysis, which is a powerful tool and can be successfully used to determine interconnections between variables and units (causality). These relations represent the information flow (caused by control loops) in the process and can be “measured” thanks to the Transfer Entropy (TE) approach. The combination of T-F analysis with TE has not been previously observed in the literature, in particular in industry-related applications.
The proposed median-ensembled version of EEMD (MEEMD), which belongs to the class of noise-assisted EMD methods, helps to decompose signals into the oscillations required for further analysis. This approach is used together with specific statistical data handling. The analysis is applied to two case studies: a known, simulated environment and an unknown industrial case.
2.1. Transfer Entropy Approach
Transfer Entropy (TE) is an information-theoretic interpretation of Wiener’s causality definition. In practice, it is a measure of the information transfer from one variable to another, obtained by measuring the reduction of uncertainty while assuming predictability. TE is given by Equation (1):

$$T_{y \to x} = \sum p\left(x_{i+h}, \mathbf{x}_i^{(k)}, \mathbf{y}_i^{(l)}\right) \log \frac{p\left(x_{i+h} \mid \mathbf{x}_i^{(k)}, \mathbf{y}_i^{(l)}\right)}{p\left(x_{i+h} \mid \mathbf{x}_i^{(k)}\right)}, \tag{1}$$

where $p$ means the complete or conditional Probability Density Function (PDF), the past values of the variables $x$ and $y$ are embedded respectively as $\mathbf{x}_i^{(k)} = \left(x_i, x_{i-\tau}, \ldots, x_{i-(k-1)\tau}\right)$ and $\mathbf{y}_i^{(l)} = \left(y_i, y_{i-\tau}, \ldots, y_{i-(l-1)\tau}\right)$, $\tau$ is a sampling interval, and $h$ is a prediction horizon. In other words, TE is the difference between the information about a future observation of $x$ obtained from the simultaneous observation of past values of both $x$ and $y$, and the information about the future of $x$ obtained from the past values of $x$ alone. Since some transfer of entropy in both directions is highly probable, the difference $T_{y \to x} - T_{x \to y}$ is the decisive measure, as it quantifies both the magnitude and the direction of the information transfer, i.e., the causality.
The practical implementation of the Transfer Entropy approach between a pair of variables according to Equation (1) requires its simplification to the form presented in Equation (2):

$$t_{y \to x} = \sum p\left(x_{i+h}, x_{i-\tau}, y_{i-t}\right) \log \frac{p\left(x_{i+h} \mid x_{i-\tau}, y_{i-t}\right)}{p\left(x_{i+h} \mid x_{i-\tau}\right)}, \tag{2}$$

where $p$ means the conditional PDF, and $\tau$ and $t$ are the time lags in $x$ and $y$, respectively. If the time series is short, $t$ is set to 1 under the assumption that the maximum auto-transfer of information occurs from the data point immediately before the target value in $y$.
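For illustration, a minimal NumPy sketch of how Equation (2) can be estimated in practice is given below. The histogram-based PDF estimation, the bin count, and all function names are our own assumptions for this sketch, not the implementation used in the paper.

```python
import numpy as np

def _discretize(v, bins):
    """Map a continuous series onto integer bin indices 0..bins-1."""
    edges = np.histogram_bin_edges(v, bins=bins)
    return np.clip(np.digitize(v, edges[1:-1]), 0, bins - 1)

def transfer_entropy(x, y, h=1, tau=1, t=1, bins=8):
    """Histogram-based estimate of the simplified TE of Equation (2),
    quantifying the information transfer from y to x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Valid indices i such that i - tau, i - t and i + h stay in range.
    i = np.arange(max(tau, t), len(x) - h)
    a = _discretize(x[i + h], bins)    # future of x
    b = _discretize(x[i - tau], bins)  # past of x
    c = _discretize(y[i - t], bins)    # past of y
    # Joint probability p(x_{i+h}, x_{i-tau}, y_{i-t}) from counts.
    p_abc = np.zeros((bins, bins, bins))
    np.add.at(p_abc, (a, b, c), 1.0)
    p_abc /= p_abc.sum()
    p_ab = p_abc.sum(axis=2)      # p(x_{i+h}, x_{i-tau})
    p_bc = p_abc.sum(axis=0)      # p(x_{i-tau}, y_{i-t})
    p_b = p_abc.sum(axis=(0, 2))  # p(x_{i-tau})
    # TE = sum over bins of p(a,b,c) * log[ p(a|b,c) / p(a|b) ].
    te = 0.0
    for ai in range(bins):
        for bi in range(bins):
            for ci in range(bins):
                p = p_abc[ai, bi, ci]
                if p > 0 and p_ab[ai, bi] > 0 and p_bc[bi, ci] > 0:
                    te += p * np.log2(p * p_b[bi] / (p_bc[bi, ci] * p_ab[ai, bi]))
    return te
```

The decisive net measure discussed above would then be the difference `transfer_entropy(x, y) - transfer_entropy(y, x)`: a positive value suggests the transfer $y \to x$ dominates.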
2.2. Median Ensemble Empirical Mode Decomposition
Median Ensemble Empirical Mode Decomposition (MEEMD) is a variation of the EEMD algorithm that uses the median operator instead of the mean operator to ensemble noisy Intrinsic Mode Function (IMF) trials [23]. The use of this algorithm is a practical extension of the classic EMD and a justified choice for real-world applications. The EMD method was developed so that data can be examined in an adaptive time-frequency-amplitude space for nonlinear and non-stationary signals [24]. It decomposes the input signal into a few Intrinsic Mode Functions and a residue:

$$x(t) = \sum_{m=1}^{M} c_m(t) + r(t),$$

where $x(t)$ is the multi-component signal, $c_m(t)$ is the $m$th intrinsic mode function, and $r(t)$ represents the residue corresponding to $M$ intrinsic modes. The proposed median EEMD (MEEMD) algorithm defines the median operator as:

$$\operatorname{med}\left\{z_n(t)\right\} = \begin{cases} z_{\left(\frac{N+1}{2}\right)}(t), & N \text{ odd}, \\[4pt] \frac{1}{2}\left[z_{\left(\frac{N}{2}\right)}(t) + z_{\left(\frac{N}{2}+1\right)}(t)\right], & N \text{ even}, \end{cases}$$

where $z_{(n)}(t)$ denotes the ordered list at time instant $t$, which is obtained from $N$ independent noise realizations. Consider a real-valued signal $x(t)$ and a predefined noise amplitude $\epsilon$; the used MEEMD is outlined in Algorithm 1.
Algorithm 1: MEEMD
1. Generate the ensemble $\left\{x_n(t) = x(t) + \epsilon w_n(t)\right\}$ for $n = 1, \ldots, N$, where $w_n(t)$ are independent white noise realizations;
2. Decompose every member of the ensemble into IMFs using the standard EMD, to yield the set $\left\{c_{m,n}(t)\right\}$;
3. Assemble same-index IMFs across the ensemble using the median operator to obtain the final IMFs within MEEMD; for instance, the $m$th IMF is computed as $c_m(t) = \operatorname{med}\left\{c_{m,n}(t)\right\}_{n=1}^{N}$.
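A compact sketch of Algorithm 1 is shown below. It assumes the third-party PyEMD package for the standard EMD; the ensemble size, the noise amplitude, and scaling the noise by the signal's standard deviation are illustrative choices of this sketch, not the paper's settings.

```python
import numpy as np
from PyEMD import EMD  # assumed dependency providing the standard EMD

def meemd(x, eps=0.2, n_trials=50, seed=0):
    """Median Ensemble EMD (Algorithm 1): decompose noise-perturbed copies
    of x with EMD and ensemble same-index IMFs with the median operator."""
    rng = np.random.default_rng(seed)
    emd = EMD()
    trials = []
    for _ in range(n_trials):
        # Step 1: x_n(t) = x(t) + eps * w_n(t), with white Gaussian noise.
        noisy = x + eps * np.std(x) * rng.standard_normal(len(x))
        # Step 2: standard EMD of each ensemble member.
        trials.append(emd(noisy))  # shape: (number of IMFs, len(x))
    # EMD may return different IMF counts per trial; keep the common modes.
    m = min(imfs.shape[0] for imfs in trials)
    stacked = np.stack([imfs[:m] for imfs in trials])  # (n_trials, m, len(x))
    # Step 3: c_m(t) = med{ c_{m,n}(t) } across the ensemble.
    return np.median(stacked, axis=0)
```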
Nevertheless, EMD, and thus its derivative MEEMD, has limitations, in particular the dependency between the length of the analyzed data and the number of IMFs resulting from the decomposition process.
2.3. Statistical Analysis
One should remember that industrial datasets can include very many observations; therefore, the constraint of the oscillation detection algorithm (datasets cannot be too long) might limit the analysis. Splitting long datasets into smaller subsets is only one of the solutions to overcome the MEEMD limitation described in
Section 2.2. Statistical analysis suggests that it is natural to use the arithmetic mean. However, analyzing dataset/signal values based on the arithmetic mean does not always produce accurate results. It is known that the arithmetic mean gives adequate (true for a given sample) information only when the sample comes from a population with a normal distribution, or one close to it. On the other hand, if the population distribution is not close to normal, the median is used to infer the central value of the sample. If industrial data is considered as a time series, the question arises whether the generated noise can be smoothed using multi-period smoothing (averaging filters). However, by their construction some information is lost, as a result of which the extremes of the smoothed signal may appear in inappropriate or illogical places, contrary to real conditions. As a result, the obtained information could be highly incorrect or even distorted. Previous papers show that raw datasets collected from real-scale industrial processes do not have normal distributions. Considering the above, the use of a median filter is justified.
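The robustness argument can be illustrated with arbitrary numbers:

```python
import numpy as np

window = np.array([4.9, 5.1, 5.0, 5.2, 98.7])  # one outlier among typical readings
print(np.mean(window))    # 23.78 -- the mean is dragged far from the typical level
print(np.median(window))  # 5.1   -- the median is unaffected by the single outlier
```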
In the classical median filter, several consecutive samples are taken and their median is determined; a modified sample is obtained as a result of this operation. Working with datasets of the given size is still an insufficient solution for further analysis and inference. Therefore, the $n$-element dataset is divided into windows with lengths corresponding to natural divisors of $n$ (in this case 10, 25, 50, and 100, respectively). In each window the $k$ samples are arranged in non-decreasing order, and for each ordered window the median is determined as follows [25]:

$$Me = \begin{cases} x_{\left(\frac{k+1}{2}\right)}, & k \text{ odd}, \\[4pt] \frac{1}{2}\left(x_{\left(\frac{k}{2}\right)} + x_{\left(\frac{k}{2}+1\right)}\right), & k \text{ even}, \end{cases}$$

where $x_{(i)}$ is the $i$th value of the ordered dataset. The position of the median (i.e., the middle unit) in each of the windows is thus set at half the size of the window under consideration:

$$\mathrm{pos}(Me) = \left\lceil \frac{k}{2} \right\rceil.$$

In this way, based on the medians from each of the windows, a shorter, filtered sample is obtained.
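A minimal sketch of this windowed median filtering, assuming the window length $k$ is a natural divisor of the dataset length (e.g., 10, 25, 50, or 100):

```python
import numpy as np

def window_median_filter(x, k):
    """Divide the n-element dataset into consecutive windows of length k
    and return the median of each window as one filtered sample."""
    x = np.asarray(x, dtype=float)
    if len(x) % k != 0:
        raise ValueError("k must be a natural divisor of the dataset length")
    return np.median(x.reshape(-1, k), axis=1)  # one median per window
```

An $n$-element series is thereby shortened to $n/k$ filtered samples.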
2.4. The Method
The methodology of the analysis depends on the type of data, especially on the length of the time series. The simulation data presented in
Figure 1a allow a simpler approach, as no additional data preprocessing is required. In our case, they are disturbed by randomly generated Gaussian measurement noise and, optionally, by a Cauchy disturbance. Oscillations are introduced using an added sinusoidal signal with a frequency of 60 Hz, not exceeding 10% of the mean value of the data. Real process data consist of longer time series. Therefore, they require preprocessing, which is proposed in two separate variants, as in
Figure 1b.
Both approaches can be summarized in a single procedure, described by Algorithm 2. It uses acquired process data representing the control errors of $M$ loops, each of length equal to $N$ observations. In the proposed analysis, the suggested value for the maximum time-series length that can be processed without splitting is set to $N_{\max}$.
Algorithm 2: TE algorithm with oscillation decomposition
1. Acquire data time series;
2. if $N \le N_{\max}$ then
   (a) Decompose the $M$ loops with MEEMD into IMF functions ▹ Use Algorithm 1
   (b) Calculate TE coefficients for each pair of IMFs ▹ Use Equation (2)
3. else
   (a) Detrend data using polynomial/spline interpolation;
   (b) Select the splitting variant;
   (c) Split the detrended data into $n$ datasets of length $N/n$;
   (d) Decompose each subset with MEEMD ▹ Use Algorithm 1
   (e) Calculate TE coefficients for corresponding pairs of IMFs ▹ Use Equation (2)
4. end if
5. Draw the causality diagram using the TE coefficients.
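Combining the earlier sketches, Algorithm 2 could be prototyped as below. The threshold value, the cubic polynomial detrending, processing only the first subset, and summing TE over same-index IMF pairs are all illustrative assumptions of this sketch; the paper does not fix them in this section.

```python
import numpy as np

N_MAX = 10_000       # illustrative placeholder for the paper's N_max threshold
SUBSET_LEN = 1_000   # illustrative subset length for the long-series branch

def causality_matrix(loops):
    """Sketch of Algorithm 2: MEEMD decomposition of each loop's control
    error followed by pairwise TE between same-index IMFs."""
    loops = [np.asarray(x, dtype=float) for x in loops]
    n = len(loops[0])
    if n > N_MAX:
        # Long series: detrend (polynomial here; spline is also possible) and
        # split; for brevity only the first subset is processed in this sketch.
        t = np.arange(n)
        loops = [(x - np.polyval(np.polyfit(t, x, 3), t))[:SUBSET_LEN]
                 for x in loops]
    imfs = [meemd(x) for x in loops]   # Algorithm 1
    m = len(loops)
    te = np.zeros((m, m))
    for i in range(m):
        for j in range(m):
            if i == j:
                continue
            k = min(imfs[i].shape[0], imfs[j].shape[0])
            # te[i, j]: information transferred from loop j to loop i,
            # aggregated over same-index IMF pairs via Equation (2).
            te[i, j] = sum(transfer_entropy(imfs[i][q], imfs[j][q])
                           for q in range(k))
    return te  # feeds the causality diagram drawing step
```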
5. Conclusions and Further Research
The presented approach extends the classical model-free TE approach by changing the domain of causality detection. The original approaches take the process data as they are. We investigate the hypothesis that the identification can be done by taking into account only the oscillatory element of the data. This modification can be easily introduced into the TE method using common oscillation detection methods, for instance, the MEEMD algorithm. It is difficult to compare the TE method with other approaches, as this method is model-free, while the others assume some process model. One has to be aware that such models are rarely known in the process industry, and any misfit in modeling raises the dilemma of what is wrong: the model or the causality detection. Therefore, the methods requiring model assumptions are not considered. In the case of the industrial data used in the analysis, the modeling effort would unnecessarily increase the complexity without any added value.
The paper focuses on simulated and industrial data processing used to discover the oscillations existing in control-loop variables. Time-series decomposition is performed using the MEEMD algorithm. The disadvantages and limitations of this approach forced the authors to specifically prepare the industrial data for analysis; two variants are checked. The key issue is the application of the Transfer Entropy algorithm to the oscillations obtained from the simulation and industrial data, followed by the generation of causality graphs.
It is shown that causality analysis is a complex task; there is no single solution that always works. The paper reveals two realities. The simulation example, despite being non-Gaussian, works perfectly; there is no doubt that the proposed methodology works in that case. Application of the successfully tested method to normal, not specifically treated industrial data opens a Pandora's box. The results become counterintuitive, variable, unrepeatable, and difficult to interpret. It appears that long data files pose a challenge to the methods used. The question of why is hard to answer. There might be several reasons, known and unknown: non-stationarity, non-linearity, outliers, data granulation, fractality, human interventions, uncoupled disturbances (weather?), and many others.
In the case of the simulated data, the data are coherent with the applied methods, and the limitations of the algorithms are not violated. However, in the case of industrial data, we are never sure about their actual properties. We do not know whether they are stationary, nor what type of non-stationarity they exhibit. We should investigate the outliers. We should identify the oscillations. We also need to analyze the residuum and its statistical properties. There are various issues that should be taken into consideration. Once we properly identify the process data, we may select or adapt the proper methodology. However, research on the internal properties of industrial data and their compliance with the limitations of the methods used is infrequent.
The above observation should not discourage anyone; it represents a research challenge. However, this challenge requires a different approach. Simulations are not sufficient, and an in-depth, step-by-step comparative analysis must be carried out on several industrial sites: real ones, not laboratory ones. Of course, this is difficult and demanding. But only then is it possible to delve deeper into the issue and perhaps, though not certainly, find a solution.