Modified Deng’s Grey Relational Analysis Model for Panel Data and Its Applications in Assessing the Water Environment of Poyang Lake

Jian, Fanghong; Li, Jiangfeng; Liu, Xiaomei; Wu, Qiong; Zhong, Dan

doi:10.3390/pr12091935


by Fanghong Jian ¹, Jiangfeng Li ¹,², Xiaomei Liu ¹,*, Qiong Wu ¹ and Dan Zhong ³

¹ College of Science, Jiujiang University, 551, Qianjin St., Lianxi District, Jiujiang 332005, China

² Jiangxi Key Laboratory of Industrial Ecological Simulation and Environmental Health in Yangtze River Basin, Jiujiang University, 551, Qianjin St., Lianxi District, Jiujiang 332005, China

³ Jiujiang Ecological Environment Monitoring Center of Jiangxi Province, Jiujiang 332005, China

* Author to whom correspondence should be addressed.

Processes 2024, 12(9), 1935; https://doi.org/10.3390/pr12091935

Submission received: 19 July 2024 / Revised: 27 August 2024 / Accepted: 5 September 2024 / Published: 9 September 2024

(This article belongs to the Special Issue Industrial IoT-Enabled Modeling and Optimization for the Process Industry)


Abstract: Deng’s grey relational analysis (GRA) model is widely used in clustering because of its simple mathematical mechanism. For sample data of different dimensions, researchers have proposed different versions of Deng’s GRA model, including models for time series data, panel data, and panel time series data. The purpose of this paper is to improve the clustering accuracy of the existing Deng’s GRA model for panel data and to overcome some of its shortcomings. Firstly, the existing Deng’s GRA model for panel data is tested on the dataset LP1 of Robot Execution Failures. Then, according to the test results, the model is modified by means of Taylor’s formula, and the modified model is successfully validated on the same dataset. Finally, as a practical application, the modified Deng’s GRA model for panel data is applied to assess the water environment of Poyang Lake over the past five years. A comparison with other clustering methods shows that the modified model is applicable, and the results also confirm the remarkable effectiveness of the Chinese government’s water quality regulation in Poyang Lake. Therefore, the modified Deng’s GRA model presented in this paper improves the clustering accuracy of the original model and can be applied well to the classification of data with a large dimension.

Keywords: Deng’s grey relational analysis model; panel data; Robot Execution Failures Dataset; Poyang Lake; water environment

1. Introduction

Grey relational theory, first proposed by Professor Deng in 1984 [1], is a mathematical method that assesses the similarity or closeness between samples by calculating the relational degree among them. Unlike the factor analysis methods of statistical inference, it does not need a large number of samples, so it provides an effective tool for analyzing the interactions between various factors within a system. The grey relational analysis (GRA) model, as an important concrete manifestation of grey relational theory, has been widely applied in various fields. For example, Zhang [2] applied GRA models to rank the importance of 22 factors in the process industry, pinpointing security inspection, risk identification, and security awareness as the most critical, in order to develop an intelligent monitoring system for key factors across subsystems that uses video surveillance and sensors for real-time safety management and accident prevention. Xu [3] utilized an enhanced GRA model to assess the influence of innovation strategies on marine industry clustering and transformation, revealing stronger ties between innovation investment and industry aggregation and contributing to marine economic sustainability. Chen [4] used an improved GRA model to develop a subcontractor selection model that integrates quality function deployment (QFD) and the analytic hierarchy process (AHP) to enhance the objectivity and rationality of the selection process.

Since the GRA model was proposed 40 years ago, new variants have been continuously proposed to adapt to wider applications. Overall, they can be divided into two types. One is univariate GRA models, which focus on the relationship between a single factor and the system’s behavior, such as Deng’s GRA model [5], the area GRA model [6], and the slope GRA model [7]. The other is multivariate GRA models, which consider the interactions between multiple factors within the system, such as the three-dimensional Deng’s GRA model [5], the norm GRA model [8], the convex GRA model [9], the grid GRA model [10], the matrix GRA model [11], and the curvature GRA model [12]. Among these GRA models, Deng’s GRA model is the most widely applied because of its simple mathematical expression and easy implementation.

Deng’s GRA model is the most basic model of grey relational theory. It constructs a mathematical model based on the difference sequence between the reference sequence and the comparison sequences and calculates the relational coefficient to measure the relational degree between them. In practical applications, Deng’s GRA model can effectively identify the interactions between various factors in the system, help researchers identify key factors, and optimize the system structure. For example, Zhou et al. [13] developed a backpropagation neural network model based on statistical analysis and Deng’s GRA theory to predict the sulfur content in the COREX process. Zhang et al. [14] introduced a novel multivariate grey relational model based on spatial pyramid pooling for the analysis of time series data on different scales. Javed et al. [15] provided a new perspective for supplier evaluation and classification in multi-sourcing through the dynamic grey relational method. Xu et al. [16] utilized Deng’s GRA model and PLS-SEM to analyze the integration level of China’s digital economy and real economy. Li et al. [17] proposed a grey-adversary perceptual network to improve the performance of anomaly detection in surveillance videos. Overall, Deng’s GRA model is an effective grey relational analysis tool that reveals the inherent connections between data sequences by quantifying their differences, providing a new perspective and method for solving practical problems.

However, as the sample data become more complex, the results of Deng’s GRA model sometimes deviate from the actual facts. Reference [18] has pointed out that Deng’s GRA model cannot distinguish between the normal state and the failure state in the Robot Execution Failures Dataset. The reason for this may be that Deng’s GRA model is constructed only from the raw difference sequence between the reference sequence and the comparison sequence. From the perspective of approximation theory, Deng’s GRA model may not mine the data in sufficient depth. In fact, researchers have continually improved Deng’s GRA model to adapt it to a wider range of applications. Liu et al. [19] proposed a new GRA model for measuring the relationships between inverse sequences. Huang et al. [20] proposed a new GRA model based on information differences for the performance evaluation of sanatoriums. However, the follow-up research on Deng’s GRA model has shifted towards other theoretical frameworks. That is to say, the mechanisms of the subsequently proposed models have significantly diverged from the foundational principles of Deng’s original GRA model. As a result, the models have become more complex, which can hinder their practical application.

In view of the above facts, this paper will use Taylor’s approximation theory to improve Deng’s GRA model based on the original theoretical framework. Firstly, we use small-scale multivariate sample data to test the reliability of Deng’s original GRA model for panel data, and then a possible method to modify Deng’s original GRA model for panel data is provided according to the test results. The main work of this paper lies in the following aspects:

(1): A series of testing experiments were conducted on the performances of Deng’s GRA model for panel data based on the dataset LP1 of Robot Execution Failures.
(2): A modified Deng’s GRA model for panel data is presented.
(3): The water environment of Poyang Lake over the past five years is assessed using the modified Deng’s GRA model for panel data.

The remainder of this paper is organized as follows. Section 2 reviews Deng’s original GRA model for panel data and tests the performance of the model on the dataset LP1 of Robot Execution Failures. Section 3 presents a modified Deng’s GRA model for panel data. Section 4 validates the modified Deng’s GRA model for panel data through three numerical experiments. Section 5 applies the modified Deng’s GRA model for panel data to assess the water environment of Poyang Lake. Section 6 concludes the paper.

2. Deng’s GRA Model for Panel Data

2.1. Review of Deng’s GRA Model for Panel Data

Deng’s GRA model is constructed based on the difference sequence between the reference sequence and the comparison sequence and measures the degree of relation between sequences by calculating the relational coefficient. For simplicity of description, we denote

$$(a_{ij}^{k})_{m \times n} = \begin{bmatrix} a_{11}^{k} & \dots & a_{1n}^{k} \\ \vdots & \ddots & \vdots \\ a_{m1}^{k} & \dots & a_{mn}^{k} \end{bmatrix}, \quad k = 0, 1, 2, \dots, L.$$

Assume $X_{0} = (a_{ij}^{0})_{m \times n}$ is the behavior matrix of system characteristics and $X_{k} = (a_{ij}^{k})_{m \times n}$ ($k \neq 0$) is the behavior matrix of system factors, where $1 \le i \le m$ represents the time dimension, $1 \le j \le n$ represents the indicator dimension, and $L$ is the sample size.

Definition 1 ([5]). Assume $X_{0} = (a_{ij}^{0})_{m \times n}$ is the behavior matrix of system characteristics, $X_{k} = (a_{ij}^{k})_{m \times n}$ is the behavior matrix of system factors, $k = 1, 2, \dots, L$, and $\xi \in [0, 1]$. Let

$$\varepsilon_{ij}^{0k} = \frac{\min_{1 \le i \le m} \min_{1 \le j \le n} \min_{1 \le k \le L} |a_{ij}^{0} - a_{ij}^{k}| + \xi \cdot \max_{1 \le i \le m} \max_{1 \le j \le n} \max_{1 \le k \le L} |a_{ij}^{0} - a_{ij}^{k}|}{|a_{ij}^{0} - a_{ij}^{k}| + \xi \cdot \max_{1 \le i \le m} \max_{1 \le j \le n} \max_{1 \le k \le L} |a_{ij}^{0} - a_{ij}^{k}|}.$$

Then,

$$\rho_{0k} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \varepsilon_{ij}^{0k}$$

is defined as Deng’s grey relational degree of $X_{k}$ and $X_{0}$, and is also called Deng’s original GRA model for panel data.

Note that $\xi$ is the distinguishing coefficient, whose main function is to enhance the resolution; in practical applications, $\xi$ is commonly set to 0.5.
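Definition 1 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `deng_grd`, the array layout (samples stacked along the first axis), and the convention of returning 1 when every sample equals the reference are assumptions made here. It also assumes $\xi > 0$ so that the denominator cannot vanish.

```python
import numpy as np

def deng_grd(x0, xs, xi=0.5):
    """Deng's grey relational degree rho_{0k} for each sample (hypothetical helper).

    x0: reference matrix X_0 of shape (m, n)
    xs: stacked comparison matrices X_1..X_L of shape (L, m, n)
    xi: distinguishing coefficient, assumed > 0
    """
    x0 = np.asarray(x0, dtype=float)
    xs = np.asarray(xs, dtype=float)
    diff = np.abs(xs - x0)                  # |a_ij^0 - a_ij^k| for all k, i, j
    d_min, d_max = diff.min(), diff.max()   # global min and max over i, j and k
    if d_max == 0.0:
        # Assumption: if every X_k equals X_0 the formula is 0/0; return 1 by convention.
        return np.ones(xs.shape[0])
    eps = (d_min + xi * d_max) / (diff + xi * d_max)   # relational coefficients
    return eps.mean(axis=(1, 2))            # average over the m * n entries


# Toy example with made-up numbers: the first sample equals X_0, the second differs by 1 everywhere.
x0 = np.zeros((2, 2))
xs = np.stack([np.zeros((2, 2)), np.ones((2, 2))])
print(deng_grd(x0, xs))   # approximately [1.0, 0.333]
```

Because the minimum and maximum are taken over all samples at once, the degree of one sample depends on which other samples are in the panel.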

Deng’s GRA model for panel data offers a global perspective for analyzing the incidence relationships in panel data. It also has several primary properties, including normality, closeness, symmetry, translation invariance, and indicator permutation. The indicator permutation property is particularly important: it implies that the arrangement of the indicators in the panel data does not need to be considered.

2.2. Performance Testing of Deng’s GRA Model for Panel Data

As mentioned in the Introduction, the literature [18] points out that Deng’s GRA model cannot distinguish between the normal state and the failure state in the Robot Execution Failures Dataset. In this section, based on the research presented in literature [18], we further discuss specific details of performance testing on the Robot Execution Failures Dataset, aiming to provide a comprehensive assessment for Deng’s GRA model.

2.2.1. Robot Execution Failures Dataset

The Robot Execution Failures Dataset is a set of multivariate sample data from the UCI machine learning database. Detailed raw data can be obtained from the website http://kdd.ics.uci.edu/databases/robotfailure/robotfailure.html (accessed on 23 April 1999). The dataset is generated by the fault monitoring of robots with six sensors and contains five sub-datasets. In this paper, the first sub-dataset, LP1, is adopted to perform experiments.

In dataset LP1, there are two states and a total of 88 samples, including 21 normal states and 67 failure states. Each sample is represented as a 15 × 6 matrix, covering 6 indicators and 15 consecutive acquisitions. Specifically, columns 1–3 of the matrix record the evolution of force, and columns 4–6 record the evolution of torque.

According to the features of dataset LP1, we suppose that $X_{0} = (a_{ij}^{0})_{15 \times 6}$ is the behavior matrix of system characteristics, where $1 \le i \le 15$ represents the time dimension and $1 \le j \le 6$ represents the indicator dimension, as shown in Table 1. At the same time, all of the 88 samples in dataset LP1 are considered as behavior matrices of system factors, that is, $X_{k} = (a_{ij}^{k})_{15 \times 6}$, $k = 1, 2, \dots, 88$.

2.2.2. Data Pre-Processing

Because Columns 1–3 and Columns 4–6 of the behavior matrix record the evolution of force and torque, respectively, the testing data must be converted to dimensionless form before modeling. There are many methods for dimensionless data processing, but in grey relational theory, the interval operator, the initial value operator, and the mean operator are the ones most often used. Furthermore, all features of the LP1 dataset are numeric and often continuous, and each sample was collected at regular time intervals within 315 ms. Therefore, based on the characteristics of the LP1 dataset, there is a subtle connection between these 88 samples in the time dimension. To account for this, the interval operator is selected to pre-process the testing data. The interval operator is defined as follows.

Definition 2 ([5]). Assume the behavior matrix of the system is represented as $X_{k} = (x_{ij})_{m \times n}$, $1 \le i \le m$, $1 \le j \le n$, $k = 1, 2, \dots, L$, where $L$ is the sample size. If $D$ satisfies $D X_{k} = Y_{k} = (y_{ij})_{m \times n}$ and

$$y_{ij} = \frac{x_{ij} - \min_{1 \le k \le L} \min_{1 \le i \le m} x_{ij}}{\max_{1 \le k \le L} \max_{1 \le i \le m} x_{ij} - \min_{1 \le k \le L} \min_{1 \le i \le m} x_{ij}},$$

then $D$ is called the interval operator.
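A minimal NumPy sketch of the interval operator follows. It reads Definition 2 as a per-indicator min-max scaling, with the minimum and maximum taken over all samples $k$ and all time points $i$ for each indicator $j$. The function name `interval_operator` and the handling of a constant indicator (mapped to 0 to avoid division by zero) are assumptions of this sketch. Whether the reference matrix is scaled together with the samples is not stated in the text and is left to the caller.

```python
import numpy as np

def interval_operator(xs):
    """Scale each indicator to [0, 1] across all samples and time points (hypothetical helper).

    xs: stacked behavior matrices of shape (L, m, n)
    """
    xs = np.asarray(xs, dtype=float)
    lo = xs.min(axis=(0, 1), keepdims=True)   # per-indicator minimum over k and i
    hi = xs.max(axis=(0, 1), keepdims=True)   # per-indicator maximum over k and i
    span = hi - lo
    span[span == 0.0] = 1.0                   # assumption: a constant indicator maps to 0
    return (xs - lo) / span


# Toy panel with made-up numbers: 2 samples, 2 time points, 2 indicators on different scales.
xs = np.array([[[0.0, 10.0], [2.0, 30.0]],
               [[4.0, 20.0], [1.0, 50.0]]])
ys = interval_operator(xs)
print(ys[1])   # [[1.0, 0.25], [0.25, 1.0]]
```

After scaling, force and torque columns share the range [0, 1], so their differences can be combined in one relational degree.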

2.2.3. Testing Results

As mentioned above, the relational degree order of Deng’s GRA model for panel data does not change when the indicators are rearranged. So there are only 88 relational degree values for the 88 samples of dataset LP1. These 88 values were computed in Python. For illustration, the maximum and the minimum of the relational degree of normal and failure states are listed in Table 2.

Table 2 shows that Deng’s GRA model for panel data cannot fully distinguish normal and failure states. Because the maximum relational degree of failure states is greater than the minimum relational degree of normal states, no threshold separates the relational degrees of normal and failure states. In order to investigate the detailed reasons, the 88 relational degree values of Deng’s GRA model for panel data are presented in a scatter plot, where green and red dots display normal and failure states, respectively, as shown in Figure 1.

Figure 1 shows that the majority of green dots are located at the top, and very few green dots overlap with red dots. This indicates that only a few normal states are not distinguished. In other words, Deng’s GRA model for panel data has a certain ability to distinguish the two states, but it still needs to be improved.
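The threshold criterion used to read Table 2 can be written as a small check. The sketch below is illustrative only: the function name `separating_threshold` and the midpoint rule are assumptions, and the numbers in the usage lines are made up rather than taken from Table 2.

```python
def separating_threshold(normal_degrees, failure_degrees):
    """Return a threshold that separates the two groups, or None if they overlap.

    The groups are separable when every failure-state degree lies strictly
    below every normal-state degree (hypothetical helper, midpoint rule assumed).
    """
    lowest_normal = min(normal_degrees)
    highest_failure = max(failure_degrees)
    if highest_failure < lowest_normal:
        return (highest_failure + lowest_normal) / 2.0
    return None


# Made-up degrees: the first pair of groups is separable, the second overlaps.
print(separating_threshold([0.9, 0.8], [0.5, 0.7]))   # a value between 0.7 and 0.8
print(separating_threshold([0.9, 0.6], [0.5, 0.7]))   # None
```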

3. Modified Deng’s GRA Model for Panel Data

From a mathematical perspective, Deng’s original GRA model is constructed only from the raw difference sequence between the reference sequence and the comparison sequence. Viewed through Taylor’s formula, this keeps only the lowest-order term of an expansion, so the approximation can incur significant errors. For example, let the reference sequence $X_{0}$ be a constant matrix with all elements equal to 0.5, and let the comparison sequence $X_{k}$ be a binary matrix whose elements can only be 0 or 1. That is,

$$X_{0} = \begin{bmatrix} 0.5 & 0.5 & \dots & 0.5 \\ \vdots & \vdots & \vdots & \vdots \\ 0.5 & 0.5 & \dots & 0.5 \end{bmatrix}_{m \times n}, \quad X_{k} = \begin{bmatrix} 0 \text{ or } 1 & 0 \text{ or } 1 & \dots & 0 \text{ or } 1 \\ \vdots & \vdots & \vdots & \vdots \\ 0 \text{ or } 1 & 0 \text{ or } 1 & \dots & 0 \text{ or } 1 \end{bmatrix}_{m \times n}, \quad \forall k \in N^{+}.$$

Every entry then satisfies $|a_{ij}^{0} - a_{ij}^{k}| = 0.5$, so according to Deng’s original GRA model, Deng’s grey relational degree $\rho_{0k}$ of $X_{k}$ and $X_{0}$ is 1 for all $k \in N^{+}$. Obviously, this deviates from the facts and makes it difficult to distinguish the samples.
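The counterexample is easy to confirm numerically. The following sketch draws random binary matrices and evaluates the formula of Definition 1 directly; the panel size, the random seed, and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed, for reproducibility only
m, n, L, xi = 4, 3, 5, 0.5                # arbitrary panel size; xi as in the text

x0 = np.full((m, n), 0.5)                 # constant reference matrix X_0
xs = rng.integers(0, 2, size=(L, m, n)).astype(float)   # binary comparison matrices X_k

diff = np.abs(xs - x0)                    # equals 0.5 everywhere, whatever the 0/1 pattern
d_min, d_max = diff.min(), diff.max()
eps = (d_min + xi * d_max) / (diff + xi * d_max)        # relational coefficients
rho = eps.mean(axis=(1, 2))               # Deng's relational degree of each X_k

print(rho)                                # every entry equals 1
```

However different the binary patterns are, the raw absolute difference carries no information here, which is the motivation for adding difference terms along the indicator index.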

Taylor approximation theory approximates a function by a polynomial built from its derivatives at a single point. It is widely used in mathematics, physics, engineering, and computer science for its ability to solve problems that are otherwise difficult to handle analytically. In this paper, we adopt the concept of Taylor’s formula and replace the derivatives with differences to optimize Deng’s GRA model. Accordingly, Deng’s GRA model is extended to include first-order and second-order differences.

Consequently, Deng’s GRA model for panel data is modified as follows.

Definition 3. Assume $X_{0} = (a_{ij}^{0})_{m \times n}$ is the behavior matrix of system characteristics, and $X_{k} = (a_{ij}^{k})_{m \times n}$ is a behavior matrix of system factors, where $1 \le i \le m$ represents the time dimension, $1 \le j \le n$ represents the indicator dimension, $1 \le k \le L$ indexes the samples, $L$ is the sample size, and $\xi \in [0, 1]$. Let

$$r_{0k}^{'} = \frac{\min_{1 \le i \le m} \min_{1 \le j \le n} \min_{1 \le k \le L} |a_{ij}^{0} - a_{ij}^{k}| + \xi \max_{1 \le i \le m} \max_{1 \le j \le n} \max_{1 \le k \le L} |a_{ij}^{0} - a_{ij}^{k}|}{|a_{ij}^{0} - a_{ij}^{k}| + \xi \max_{1 \le i \le m} \max_{1 \le j \le n} \max_{1 \le k \le L} |a_{ij}^{0} - a_{ij}^{k}|},$$

$$r_{0k}^{''} = \frac{\min_{1 \le i \le m} \min_{1 \le j \le n-1} \min_{1 \le k \le L} |\Delta| + \xi \max_{1 \le i \le m} \max_{1 \le j \le n-1} \max_{1 \le k \le L} |\Delta|}{|\Delta| + \xi \max_{1 \le i \le m} \max_{1 \le j \le n-1} \max_{1 \le k \le L} |\Delta|},$$

where $\Delta = (a_{i,j+1}^{0} - a_{ij}^{0}) - (a_{i,j+1}^{k} - a_{ij}^{k})$, and

$$r_{0k}^{'''} = \frac{\min_{1 \le i \le m} \min_{1 \le j \le n-2} \min_{1 \le k \le L} |\Delta^{2}| + \xi \max_{1 \le i \le m} \max_{1 \le j \le n-2} \max_{1 \le k \le L} |\Delta^{2}|}{|\Delta^{2}| + \xi \max_{1 \le i \le m} \max_{1 \le j \le n-2} \max_{1 \le k \le L} |\Delta^{2}|},$$

where $\Delta^{2} = (a_{i,j+2}^{0} - 2a_{i,j+1}^{0} + a_{ij}^{0}) - (a_{i,j+2}^{k} - 2a_{i,j+1}^{k} + a_{ij}^{k})$. Then,

$$r_{0k} = \frac{1}{2} \cdot \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} r_{0k}^{'} + \frac{1}{4} \cdot \frac{1}{m(n-1)} \sum_{i=1}^{m} \sum_{j=1}^{n-1} r_{0k}^{''} + \frac{1}{4} \cdot \frac{1}{m(n-2)} \sum_{i=1}^{m} \sum_{j=1}^{n-2} r_{0k}^{'''}$$

is called the modified Deng’s grey relational degree of $X_{k}$ and $X_{0}$, and is also called the modified Deng’s GRA model for panel data.

Similarly, $\xi$ is the distinguishing coefficient, whose function is to enhance resolution, and, as usual, $\xi = 0.5$. Furthermore, the grey relational degree defined by the modified Deng’s GRA model for panel data still satisfies the primary properties of the original model, including normality, closeness, symmetry, translation invariance, and indicator permutation.
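One way to sketch Definition 3 in NumPy is shown below. It is a sketch under stated assumptions rather than the authors' code: the names `_mean_coefficient` and `modified_deng_grd` are hypothetical, the first-order and second-order differences are taken along the second index $j$ exactly as $\Delta$ and $\Delta^{2}$ are written, each term uses its own global minimum and maximum, and a term whose differences are all zero is assigned the value 1. It also assumes $\xi > 0$ and $n \ge 3$.

```python
import numpy as np

def _mean_coefficient(d0, dk, xi):
    """Mean Deng-type coefficient per sample for arrays d0 (m, p) and dk (L, m, p)."""
    diff = np.abs(dk - d0)
    d_min, d_max = diff.min(), diff.max()   # global min and max over i, j and k
    if d_max == 0.0:
        return np.ones(diff.shape[0])       # assumption: identical arrays give degree 1
    return ((d_min + xi * d_max) / (diff + xi * d_max)).mean(axis=(1, 2))

def modified_deng_grd(x0, xs, xi=0.5):
    """Modified relational degree r_{0k}: weights 1/2, 1/4, 1/4 on the three terms."""
    x0 = np.asarray(x0, dtype=float)        # shape (m, n)
    xs = np.asarray(xs, dtype=float)        # shape (L, m, n)
    if x0.shape[-1] < 3:
        raise ValueError("second-order differences need at least 3 indicators")
    r1 = _mean_coefficient(x0, xs, xi)                                     # raw values
    r2 = _mean_coefficient(np.diff(x0, n=1, axis=-1), np.diff(xs, n=1, axis=-1), xi)  # Delta
    r3 = _mean_coefficient(np.diff(x0, n=2, axis=-1), np.diff(xs, n=2, axis=-1), xi)  # Delta^2
    return 0.5 * r1 + 0.25 * r2 + 0.25 * r3


# Made-up binary samples against the constant 0.5 reference of the counterexample.
x0 = np.full((2, 3), 0.5)
xs = np.array([[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
               [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]])
print(modified_deng_grd(x0, xs))   # approximately [1.0, 0.667]
```

In this toy case the original model would give both samples a degree of 1, while the difference terms separate the flat sample from the alternating one.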

Next, we will continue to use the Robot Execution Failures Dataset adopted in Section 2 to test the performance of the modified Deng’s GRA model for panel data.

4. Performance Testing of the Modified Deng’s GRA Model for Panel Data

In this section, we will conduct a comprehensive assessment for the modified Deng’s GRA model from three aspects of the Robot Execution Failures Dataset, including the total samples, the force samples, and the torque samples. As in Section 2, all the testing data are pre-processed in a dimensionless form before modeling using an interval operator.

4.1. Experiments Based on the Total Samples of the Robot Execution Failures Dataset

Based on the total samples of the Robot Execution Failures Dataset, by programming in Python, the 88 relational degree values of the modified Deng’s GRA model for panel data were obtained. Table 3 lists the maximum and the minimum relational degree of normal and failure states. Similarly, the 88 values are presented through a scatter plot, where normal and failure states are still displayed by green and red dots, respectively, as shown in Figure 2.

Table 3 shows that the maximum relational degree of failure states is smaller than the minimum relational degree of normal states, and there is a significant gap in the relational degree interval of normal and failure states. Similarly, Figure 2 shows that the green and red dots are completely separated. Therefore, based on the total samples of the Robot Execution Failures Dataset, the modified Deng’s GRA model for panel data can fully distinguish between normal and failure states.

4.2. Experiments Based on the Force Samples of the Robot Execution Failures Dataset

In the same way, based on the force samples of the Robot Execution Failures Dataset, the 88 relational degree values of the modified Deng’s GRA model for panel data were obtained by programming in Python. Table 4 lists the maximum and the minimum of the relational degree of normal and failure states. Meanwhile, Figure 3 displays the 88 relational degree values of normal and failure states by green and red dots, respectively.

In Table 4, as in Table 3, the maximum relational degree of failure states is smaller than the minimum relational degree of normal states, and there is also a significant gap between the relational degree intervals of normal and failure states. Figure 3 also shows that the green and red dots are completely separated. Therefore, based only on the force samples of the Robot Execution Failures Dataset, the modified Deng’s GRA model for panel data can still fully distinguish between normal states and failure states.

4.3. Experiments Based on the Torque Samples of the Robot Execution Failures Dataset

Similarly, based on the torque samples of the Robot Execution Failures Dataset, the 88 relational degree values of the modified Deng’s GRA model for panel data were obtained by programming in Python. The maximum and the minimum relational degrees of normal and failure states are listed in Table 5. Meanwhile, the 88 values are presented in Figure 4, and the green and red dots represent the normal and failure states, respectively.

Table 5 shows that the maximum relational degree of failure states is still smaller than the minimum relational degree of normal states, and there is a slight gap in the relational degree interval of normal and failure states. Nevertheless, Table 5 and Figure 4 still demonstrate that the modified Deng’s GRA model for panel data can completely distinguish between normal and failure states based solely on the torque samples.

Therefore, all three test results indicate that the modified Deng’s GRA model for panel data significantly improves the distinguishing ability of the original model and remains effective when only a subset of the sample data is used. In summary, the performance of the modified Deng’s GRA model for panel data is validated through three numerical experiments based on the Robot Execution Failures Dataset.

4.4. Comparison and Analysis

In this section, to demonstrate the effectiveness of the modified Deng’s GRA model for panel data, its classification results are compared with those of five other methods: Deng’s original GRA model, the Euclidean norm GRA model [8], the matrix GRA model [21], the C-type GRA model [22], and the k-nearest neighbor (KNN) algorithm [23]. As before, the testing data are pre-processed in dimensionless form using the interval operator. The overall performance metrics for all models, computed in Python, are listed in Table 6. Four classification evaluation metrics are reported: accuracy, precision, recall (sensitivity), and F1 score. These are quantitative measures used to assess the performance of a classification model and provide insights into the accuracy and reliability of the model’s predictions.
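For reference, the four metrics can be computed from the confusion counts as in the sketch below. The function name `classification_metrics`, the choice of which state counts as the positive class, and the labels in the usage lines are assumptions made for illustration; the text does not state how the positive class was chosen for Table 6.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 score for a binary labelling (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)   # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)   # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)   # false negatives
    tn = len(pairs) - tp - fp - fn                                     # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels (1 = positive class): 2 true positives, 1 false negative,
# 1 false positive and 4 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(classification_metrics(y_true, y_pred))   # accuracy 0.75; precision, recall, F1 all 2/3
```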

Table 6 shows that Deng’s original GRA model performs the worst, followed by the Euclidean norm GRA model and the KNN algorithm. By comparison, the modified Deng’s GRA model, the matrix GRA model, and the C-type GRA model all perform well. This also demonstrates the effectiveness of the modified Deng’s GRA model. In addition, the mathematical structure of the modified Deng’s GRA model is simpler and easier to implement than that of the matrix GRA model and the C-type GRA model. Therefore, the modified Deng’s GRA model can be deemed to have broader application prospects.

5. Case Study

In this section, as a practical application, the modified Deng’s GRA model for panel data is applied to assess the water quality of Poyang Lake over the past five years. Poyang Lake, the largest freshwater lake in China, has abundant ecological resources and plays a significant role in economic and social spheres. Therefore, the Chinese government has always placed a high priority on the protection of water quality in Poyang Lake.

Among the various indicators of water quality, total phosphorus (TP) is the most crucial indicator and has a decisive influence on the health assessment of aquatic ecosystems. TP is therefore acknowledged as one of the pollutants that require stringent monitoring and control within water quality assessment and management practices. This case study focuses on the monthly total phosphorus data measured at eight monitoring sites in Poyang Lake District from 2019 to 2023. For confidentiality reasons, the eight monitoring sites are denoted Site 1 to Site 8, and the detailed data can be found in Table A1.

According to the previous discussion in this paper, we suppose that $X_{0} = (0)_{12 \times 8}$ (the zero matrix) is the behavior matrix of system characteristics, where $i$ ($1 \le i \le 12$) represents the time dimension and $j$ ($1 \le j \le 8$) represents the indicator dimension. Each year is treated as one sample of the behavior matrix of system factors, that is, $X_{k} = (a_{ij}^{k})_{12 \times 8}$, $k = 1, 2, \dots, 5$.
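The panel layout described above can be sketched as follows. The array `table` is a placeholder filled with dummy numbers standing in for the 60 monthly rows and 8 site columns of Table A1; it does not contain the measured values, and all variable names are assumptions of this sketch.

```python
import numpy as np

n_years, n_months, n_sites = 5, 12, 8

# Placeholder only: dummy numbers in the shape of Table A1 (60 monthly rows x 8 sites),
# assumed to be sorted by sample time from January 2019 to December 2023.
table = np.arange(n_years * n_months * n_sites, dtype=float).reshape(60, 8)

xs = table.reshape(n_years, n_months, n_sites)   # xs[k - 1] plays the role of X_k
x0 = np.zeros((n_months, n_sites))               # zero reference matrix X_0

print(xs.shape, x0.shape)                        # (5, 12, 8) (12, 8)
```

With the zero matrix as reference, a year whose TP values are closer to zero is closer to $X_{0}$.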

Meanwhile, to verify the effectiveness of the modified Deng’s GRA model for panel data, its results are compared with those of Deng’s original GRA model and other GRA models for panel data, including the Euclidean norm GRA model [8], the matrix GRA model [21], and the C-type GRA model [22]. As in Section 4, the testing data are pre-processed in dimensionless form using the interval operator. The relational degree values, computed in Python, are listed in Table 7.

Table 7 shows that the relational degree values differ significantly between models, because different GRA models for panel data have different functional expressions and different ranges of relational degree. Because the main purpose of GRA models is cluster analysis based on the relational degree order, the numerical values of the relational degree are not comparable across GRA models. In other words, we only need to compare the relational degree orders of the five GRA models. Therefore, the relational degree orders yielded by the five GRA models are displayed in Table 8.
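Comparing orders rather than values can be illustrated with a short sketch. The helper name `relational_order`, the sample labels, and the degree values are made up; they are not the entries of Table 7 or Table 8.

```python
import numpy as np

def relational_order(labels, degrees):
    """Sort labels from the largest to the smallest relational degree (hypothetical helper)."""
    order = np.argsort(-np.asarray(degrees, dtype=float), kind="stable")
    return [labels[i] for i in order]


labels = ["A", "B", "C"]
model_1 = [0.20, 0.90, 0.50]        # made-up degrees on one scale
model_2 = [0.981, 0.997, 0.990]     # made-up degrees on a much narrower scale

print(relational_order(labels, model_1))   # ['B', 'C', 'A']
print(relational_order(labels, model_2))   # ['B', 'C', 'A']
```

The two hypothetical models report very different numbers but the same order, which is the only information the comparison relies on.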

Table 8 clearly shows that, among the orders provided by the five GRA models, only that of Deng’s original GRA model differs from the others. This indicates that the modified Deng’s GRA model for panel data is effective and significantly improves the clustering accuracy of the original model. Furthermore, the earlier a year appears in the order, the lower the TP of Poyang Lake in that year. Therefore, the data in Table 8 further confirm that the Chinese government has achieved significant results in reducing TP levels in Poyang Lake and in effectively protecting its water environment.

6. Conclusions

In this paper, a new method is provided to improve the clustering accuracy of the existing Deng’s GRA model for panel data. Based on the numerical test results for Deng’s original GRA model, it is found that the mechanism of Deng’s original GRA model is reasonable, but it lacks in-depth modeling. To solve this problem, a modified Deng’s GRA model is presented that can be applied to the cluster analysis of data with any number of dimensions. Three numerical experiments on the Robot Execution Failures Dataset have verified the significant improvement in clustering accuracy compared to Deng’s original GRA model for panel data. Furthermore, a case study on the monthly TP data of Poyang Lake over the past five years is given to demonstrate the superiority of the modified model. By comparing it with other cluster methods, the results of the case study show that the modified Deng’s GRA model for panel data is applicable and also confirm the remarkable effectiveness of the Chinese government’s water quality regulation in Poyang Lake for many years. Due to the simplicity of the model, the modified Deng’s GRA model for panel data presented in this paper can be more widely applied in the future research of grey relational theory and is expected to more effectively reveal the interactions between various factors in the system, helping researchers to identify key factors and optimize the system structure.

The comparative analysis shows that the modified Deng’s GRA model, the matrix GRA model, and the C-type GRA model all perform well. Given its simple mathematical structure and easy implementation, the modified Deng’s GRA model is deemed effective and has broader application prospects. However, it may still have potential shortcomings. Therefore, a more extensive simulation study with other clustering methods is planned as future work. In addition, the selection of data pre-processing methods is crucial for ensuring the accuracy of the model. Therefore, discussing the specific impact of different data pre-processing methods on the performance of various GRA models will also be meaningful future research.

Author Contributions

Conceptualization, F.J. and X.L.; methodology, X.L.; software, F.J.; validation, F.J., J.L. and X.L.; formal analysis, F.J. and X.L.; investigation, F.J. and X.L.; resources, J.L., Q.W. and D.Z.; data curation, J.L., Q.W. and D.Z.; writing—original draft preparation, X.L.; writing—review and editing, X.L.; visualization, F.J.; supervision, J.L., Q.W. and D.Z.; project administration, F.J., J.L., X.L., Q.W. and D.Z.; funding acquisition, F.J., J.L., X.L., Q.W. and D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (62341308), Natural Science Foundation of Jiangxi Province (20232BAB201020, 20224BAB201010), and Open Fund of Key Laboratory of Industrial Ecological Simulation and Environmental Health in Yangtze River Basin of Jiangxi Province (jj20212021).

Data Availability Statement

Details of dataset LP1 can be found on the Robot Execution Failures Dataset website (http://kdd.ics.uci.edu/databases/robotfailure/robotfailure.html), and the data of the case study can be found in Appendix A.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Total phosphorus data in Poyang Lake District (Unit: mg/L).

Sample Time	Site 1	Site 2	Site 3	Site 4	Site 5	Site 6	Site 7	Site 8
2019/1/1	0.52	0.12	0.08	0.13	0.11	0.02	0.06	0.1
2019/2/1	0.08	0.03	0.04	0.13	0.09	0.01	0.04	0.05
2019/3/5	0.08	0.07	0.07	0.07	0.07	0.08	0.1	0.06
2019/4/1	0.04	0.03	0.04	0.04	0.06	0.04	0.04	0.04
2019/5/6	0.08	0.04	0.04	0.06	0.05	0.05	0.04	0.07
2019/6/4	0.04	0.02	0.03	0.03	0.06	0.03	0.06	0.03
2019/7/2	0.02	0.01	0.03	0.03	0.06	0.03	0.04	0.03
2019/8/1	0.03	0.02	0.02	0.02	0.01	0.02	0.03	0.03
2019/9/3	0.11	0.05	0.06	0.06	0.05	0.08	0.05	0.08
2019/10/8	0.04	0.08	0.03	0.03	0.03	0.02	0.04	0.02
2019/11/1	0.34	0.11	0.19	0.08	0.11	0.02	0.07	0.08
2019/12/4	0.07	0.07	0.06	0.06	0.06	0.02	0.09	0.07
2020/1/6	0.07	0.05	0.12	0.047	0.055	0.08	0.05	0.05
2020/2/12	0.02	0.1	0.02	0.068	0.03	0.08	0.08	0.08
2020/3/5	0.04	0.06	0.03	0.04	0.027	0.08	0.05	0.05
2020/4/1	0.04	0.04	0.05	0.05	0.05	0.04	0.03	0.06
2020/5/1	0.05	0.04	0.06	0.043	0.06	0.04	0.07	0.06
2020/6/1	0.05	0.072	0.05	0.05	0.047	0.06	0.07	0.04
2020/7/1	0.02	0.033	0.03	0.03	0.05	0.03	0.05	0.03
2020/8/1	0.03	0.025	0.017	0.017	0.027	0.03	0.027	0.03
2020/9/1	0.04	0.04	0.026	0.053	0.04	0.03	0.038	0.05
2020/10/1	0.04	0.037	0.033	0.05	0.042	0.04	0.05	0.097
2020/11/1	0.075	0.07	0.045	0.093	0.08	0.03	0.05	0.06
2020/12/1	0.09	0.048	0.052	0.06	0.07	0.03	0.12	0.08
2021/1/1	0.08	0.045	0.045	0.05	0.05	0.03	0.06	0.07
2021/2/1	0.12	0.038	0.04	0.05	0.06	0.03	0.075	0.07
2021/3/1	0.09	0.047	0.055	0.077	0.05	0.06	0.11	0.105
2021/4/1	0.04	0.041	0.051	0.058	0.06	0.05	0.05	0.055
2021/5/1	0.03	0.029	0.039	0.04	0.04	0.04	0.05	0.04
2021/6/1	0.03	0.022	0.037	0.03	0.04	0.03	0.062	0.04
2021/7/1	0.04	0.037	0.039	0.12	0.053	0.05	0.06	0.06
2021/8/1	0.108	0.032	0.041	0.05	0.125	0.03	0.07	0.04
2021/9/1	0.065	0.035	0.037	0.052	0.048	0.05	0.055	0.052
2021/10/1	0.08	0.048	0.065	0.077	0.06	0.05	0.065	0.077
2021/11/1	0.05	0.063	0.066	0.047	0.1	0.07	0.06	0.05
2021/12/1	0.07	0.05	0.048	0.065	0.065	0.07	0.08	0.07
2022/1/1	0.05	0.055	0.04	0.028	0.065	0.07	0.05	0.05
2022/2/1	0.05	0.048	0.013	0.04	0.062	0.05	0.055	0.05
2022/3/1	0.07	0.048	0.031	0.048	0.04	0.05	0.05	0.05
2022/4/1	0.04	0.042	0.045	0.048	0.062	0.04	0.04	0.05
2022/5/1	0.05	0.033	0.046	0.06	0.043	0.08	0.072	0.07
2022/6/1	0.05	0.023	0.039	0.04	0.047	0.03	0.065	0.05
2022/7/4	0.03	0.033	0.035	0.05	0.048	0.04	0.055	0.06
2022/8/1	0.07	0.076	0.046	0.07	0.07	0.04	0.062	0.07
2022/9/1	0.07	0.063	0.051	0.08	0.1	0.04	0.075	0.075
2022/10/1	0.085	0.04	0.044	0.05	0.08	0.04	0.06	0.005
2022/11/1	0.04	0.044	0.036	0.045	0.05	0.04	0.04	0.05
2022/12/1	0.1	0.052	0.05	0.03	0.1	0.041	0.04	0.04
2023/1/1	0.05	0.039	0.042	0.02	0.075	0.04	0.04	0.035
2023/2/1	0.04	0.065	0.062	0.08	0.08	0.04	0.04	0.02
2023/3/1	0.04	0.054	0.046	0.05	0.052	0.04	0.04	0.06
2023/4/1	0.14	0.045	0.053	0.05	0.08	0.04	0.08	0.09
2023/5/1	0.05	0.068	0.053	0.085	0.09	0.06	0.05	0.055
2023/6/1	0.05	0.073	0.044	0.06	0.045	0.05	0.065	0.045
2023/7/1	0.04	0.062	0.045	0.04	0.042	0.05	0.065	0.04
2023/8/1	0.05	0.074	0.037	0.045	0.048	0.16	0.05	0.042
2023/9/1	0.03	0.067	0.055	0.05	0.06	0.16	0.04	0.04
2023/10/1	0.05	0.044	0.052	0.045	0.04	0.04	0.06	0.05
2023/11/1	0.05	0.067	0.065	0.03	0.06	0.04	0.11	0.05
2023/12/1	0.04	0.058	0.045	0.04	0.048	0.04	0.05	0.04
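
As a minimal sketch of how Table A1 could be read into a panel structure (one block of monthly rows per year, eight sites per row), the snippet below parses the first three rows of the table. The function name `parse_panel` and the dictionary layout are assumptions made for illustration and are not part of the original study.

```python
from collections import defaultdict

# First three data rows of Table A1, copied verbatim (tab-separated).
SAMPLE_ROWS = (
    "2019/1/1\t0.52\t0.12\t0.08\t0.13\t0.11\t0.02\t0.06\t0.1\n"
    "2019/2/1\t0.08\t0.03\t0.04\t0.13\t0.09\t0.01\t0.04\t0.05\n"
    "2019/3/5\t0.08\t0.07\t0.07\t0.07\t0.07\t0.08\t0.1\t0.06\n"
)


def parse_panel(text):
    """Group rows by year: {year: [[site 1 .. site 8], ...]} in row order.

    Hypothetical helper: the layout is chosen for illustration only.
    """
    panel = defaultdict(list)
    for line in text.strip().splitlines():
        date, *values = line.split("\t")
        year = int(date.split("/")[0])  # dates are written as YYYY/M/D
        panel[year].append([float(v) for v in values])  # mg/L per site
    return dict(panel)


panel = parse_panel(SAMPLE_ROWS)
print(panel[2019][0])  # total phosphorus at Sites 1-8 for the first sample
```

Applied to the full table, the same function would give five yearly blocks of twelve rows each, which is the shape a panel-data GRA model takes as input.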

Figure 1. Scatter plot of 88 relational degrees calculated by Deng’s GRA model for panel data.

Figure 2. Scatter plot of 88 relational degrees provided by the modified Deng’s GRA model based on total samples.

Figure 3. Scatter plot of 88 relational degrees provided by the modified Deng’s GRA model based on force samples.

Figure 4. Scatter plot of 88 relational degrees provided by the modified Deng’s GRA model based on torque samples.

Table 1. Behavior matrix of system characteristics $X_{0}$.

Time	$F_{x}$	$F_{y}$	$F_{z}$	$T_{x}$	$T_{y}$
1	−1	−1	63	−3	−1
2	−1	−1	63	−3	−1
3	−1	−1	63	−3	−1
4	−1	−1	63	−3	−1
5	−1	−1	63	−3	−1
6	−1	−1	63	−3	−1
7	−1	−1	63	−3	−1
8	−1	−1	63	−3	−1
9	−1	−1	63	−3	−1
10	−1	−1	63	−3	−1
11	−1	−1	63	−3	−1
12	−1	−1	63	−3	−1
13	−1	−1	63	−3	−1
14	−1	−1	63	−3	−1
15	−1	−1	63	−3	−1

Table 2. Relational degree interval of normal and failure states offered by Deng’s GRA model for panel data.

States	21 Normal States	67 Failure States
Interval of relational degree	[0.9908, 0.9996]	[0.6206, 0.9915]

Table 3. Relational degree interval calculated by the modified Deng’s GRA model based on total samples.

States	21 Normal States	67 Failure States
Interval of relational degree	[0.9933, 0.9994]	[0.8010, 0.9887]

Table 4. Relational degree interval calculated by the modified Deng’s GRA model for panel data based on force samples.

States	21 Normal States	67 Failure States
Interval of relational degree	[0.9927, 0.9995]	[0.7441, 0.9898]

Table 5. Relational degree interval calculated by the modified Deng’s GRA model for panel data based on torque samples.

States	21 Normal States	67 Failure States
Interval of relational degree	[0.9868, 0.9993]	[0.7019, 0.9862]
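
The intervals in Tables 2 to 5 can be compared directly: a single threshold separates the two states only if the lowest relational degree of the normal states exceeds the highest relational degree of the failure states. The following sketch checks this for each table. The interval values are copied from the tables, while the helper name `separation_gap` is an assumption introduced for illustration.

```python
# (normal interval, failure interval) as reported in Tables 2-5.
INTERVALS = {
    "Table 2": ((0.9908, 0.9996), (0.6206, 0.9915)),
    "Table 3": ((0.9933, 0.9994), (0.8010, 0.9887)),
    "Table 4": ((0.9927, 0.9995), (0.7441, 0.9898)),
    "Table 5": ((0.9868, 0.9993), (0.7019, 0.9862)),
}


def separation_gap(normal, failure):
    """Lower bound of the normal interval minus upper bound of the failure one.

    A positive gap means the intervals do not overlap, so one threshold
    can split the two states; a negative gap means they overlap.
    """
    return normal[0] - failure[1]


gaps = {name: separation_gap(n, f) for name, (n, f) in INTERVALS.items()}
for name, g in gaps.items():
    status = "separable" if g > 0 else "overlapping"
    print(f"{name}: gap = {g:+.4f} ({status})")
```

Only the intervals of Table 2 overlap, which agrees with the lower accuracy reported for the original Deng’s GRA model on the total samples in Table 6.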

Table 6. Classification results of the modified Deng’s GRA model and other compared models.

Samples	Metrics	Deng’s GRA	Euclidean Norm GRA	Matrix GRA	C-Type GRA	KNN	Modified Deng’s GRA
Total samples	Accuracy	0.9773	1	1	1	1	1
	Precision	0.9524	1	1	1	1	1
	Recall	0.9524	1	1	1	1	1
	F1	0.9524	1	1	1	1	1
Force samples	Accuracy	1	1	1	1	1	1
	Precision	1	1	1	1	1	1
	Recall	1	1	1	1	1	1
	F1	1	1	1	1	1	1
Torque samples	Accuracy	0.9545	0.9773	1	1	0.9773	1
	Precision	0.9048	0.9524	1	1	0.9524	1
	Recall	0.9048	0.9524	1	1	0.9524	1
	F1	0.9048	0.9524	1	1	0.9524	1
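
To make the metrics in Table 6 concrete, the sketch below recomputes them from confusion counts. The counts are not reported in the paper. They are inferred here under the assumption that the 21 normal states form the positive class and the 67 failure states the negative class, and they are the counts that reproduce the reported values for Deng’s GRA model.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F1) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Assumed counts (inferred, not given in the source): 21 normal, 67 failure.
# Total samples: one normal state missed and one failure state misread.
total = [round(m, 4) for m in binary_metrics(tp=20, fp=1, fn=1, tn=66)]
# Torque samples: two normal states missed and two failure states misread.
torque = [round(m, 4) for m in binary_metrics(tp=19, fp=2, fn=2, tn=65)]

print("total :", total)
print("torque:", torque)
```

Under these assumptions, an accuracy of 0.9773 corresponds to 2 misclassified samples out of 88, and 0.9545 to 4 out of 88.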

Table 7. Relational degree values provided by five GRA models for panel data.

Years	Deng’s GRA	Euclidean Norm GRA	Matrix GRA	C-Type GRA	Modified Deng’s GRA
2019	0.6611	0.1888	0.7908	0.7418	0.6819
2020	0.6745	0.2208	0.8289	0.8011	0.7360
2021	0.6364	0.2089	0.8180	0.7855	0.7170
2022	0.6631	0.2305	0.8412	0.8182	0.7518
2023	0.6454	0.2118	0.8237	0.7933	0.7273

Table 8. Relational degree order yielded by five GRA models for panel data.

Models	Relational Degree Order
Deng’s GRA	$r_{2020} ≻ r_{2022} ≻ r_{2019} ≻ r_{2023} ≻ r_{2021}$
Euclidean Norm GRA	$r_{2022} ≻ r_{2020} ≻ r_{2023} ≻ r_{2021} ≻ r_{2019}$
Matrix GRA	$r_{2022} ≻ r_{2020} ≻ r_{2023} ≻ r_{2021} ≻ r_{2019}$
C-type GRA	$r_{2022} ≻ r_{2020} ≻ r_{2023} ≻ r_{2021} ≻ r_{2019}$
Modified Deng’s GRA	$r_{2022} ≻ r_{2020} ≻ r_{2023} ≻ r_{2021} ≻ r_{2019}$
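
The orders in Table 8 follow from sorting the yearly relational degrees of Table 7 in descending order. The short sketch below reproduces them. The values are copied from Table 7, and the names `TABLE_7` and `relational_order` are assumptions introduced for illustration.

```python
# Relational degrees from Table 7, keyed by model and year.
TABLE_7 = {
    "Deng's GRA": {2019: 0.6611, 2020: 0.6745, 2021: 0.6364, 2022: 0.6631, 2023: 0.6454},
    "Euclidean Norm GRA": {2019: 0.1888, 2020: 0.2208, 2021: 0.2089, 2022: 0.2305, 2023: 0.2118},
    "Matrix GRA": {2019: 0.7908, 2020: 0.8289, 2021: 0.8180, 2022: 0.8412, 2023: 0.8237},
    "C-Type GRA": {2019: 0.7418, 2020: 0.8011, 2021: 0.7855, 2022: 0.8182, 2023: 0.7933},
    "Modified Deng's GRA": {2019: 0.6819, 2020: 0.7360, 2021: 0.7170, 2022: 0.7518, 2023: 0.7273},
}


def relational_order(degrees):
    """Years sorted from the highest to the lowest relational degree."""
    return sorted(degrees, key=degrees.get, reverse=True)


orders = {model: relational_order(d) for model, d in TABLE_7.items()}
for model, years in orders.items():
    print(f"{model}: " + " > ".join(f"r_{y}" for y in years))
```

Four of the five models give the same order, and only the original Deng’s GRA model ranks 2020 above 2022 and 2019 above 2023 and 2021.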

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
