1. Introduction
Grey relational theory, first proposed by Professor Deng in 1984 [1], is a mathematical method for assessing the similarity or closeness between samples by calculating the relational degree among them. Unlike factor analysis in statistical inference, it does not require a large number of samples, and so it provides an effective tool for analyzing the interactions between various factors within a system. The grey relational analysis (GRA) model, as an important concrete manifestation of grey relational theory, has been widely applied in various fields. For example, Zhang [2] applied GRA models to rank the importance of 22 factors in the process industry, pinpointing security inspection, risk identification, and security awareness as the most critical. These results supported the development of an intelligent monitoring system for key factors across subsystems that uses video surveillance and sensors for real-time safety management and accident prevention. Xu [3] utilized an enhanced GRA model to assess the influence of innovation strategies on marine industry clustering and transformation, revealing stronger ties between innovation investment and industry aggregation and contributing to marine economic sustainability. Chen [4] used an improved GRA model to develop a subcontractor selection model that integrates quality function deployment (QFD) and the analytic hierarchy process (AHP) to enhance the objectivity and rationality of the selection process.
Since the GRA model was proposed 40 years ago, new variants have been continuously proposed to suit wider applications. Overall, they can be divided into two types. The first is univariate GRA models, which focus on the relationship between a single factor and the system’s behavior, such as Deng’s GRA model [5], the area GRA model [6], and the slope GRA model [7]. The second is multivariate GRA models, which consider the interactions between multiple factors within the system, such as the three-dimensional Deng’s GRA model [5], the norm GRA model [8], the convex GRA model [9], the grid GRA model [10], the matrix GRA model [11], and the curvature GRA model [12]. Among these GRA models, Deng’s GRA model is the most widely applied due to its simple mathematical expression and easy implementation.
Deng’s GRA model is the most basic model of grey relational theory. It constructs a mathematical model based on the difference sequence between the reference sequence and the comparison sequences and calculates the relational coefficient to measure the relational degree between them. In practical applications, Deng’s GRA model can effectively identify the interactions between various factors in the system, help researchers identify key factors, and optimize the system structure. For example, Zhou et al. [13] developed a backpropagation neural network model based on statistical analysis and Deng’s GRA theory to predict the sulfur content in the COREX process. Zhang et al. [14] introduced a novel multivariate grey relational model based on spatial pyramid pooling for the analysis of time series data on different scales. Javed et al. [15] provided a new perspective for supplier evaluation and classification in multi-sourcing through the dynamic grey relational method. Xu et al. [16] utilized Deng’s GRA model and PLS-SEM to analyze the integration level of China’s digital economy and real economy. Li et al. [17] proposed a grey-adversary perceptual network to improve the performance of anomaly detection in surveillance videos. Overall, Deng’s GRA model is an effective grey relational analysis tool that reveals the inherent connections between data sequences by quantifying their differences, providing a new perspective and method for solving practical problems.
However, as the sample data become more complex, the results of Deng’s GRA model sometimes deviate from the actual facts. It has been pointed out in [18] that Deng’s GRA model cannot distinguish between the normal state and the failure state in the Robot Execution Failures Dataset. The reason for this may be that Deng’s GRA model is constructed solely from the raw difference sequence between the reference sequence and the comparison sequence. From the perspective of approximation theory, Deng’s GRA model may not mine the data in sufficient depth. In fact, researchers have continually improved Deng’s GRA model to adapt it to a wider range of applications. Liu et al. [19] proposed a new GRA model for measuring the relationships between inverse sequences. Huang et al. [20] proposed a new GRA model based on information differences for the performance evaluation of sanatoriums. However, the follow-up research on Deng’s GRA model has shifted towards other theoretical frameworks. That is to say, the mechanisms of the subsequently proposed models have significantly diverged from the foundational principles of Deng’s original GRA model. As a result, the models have become more complex, which can hinder their practical application.
In view of the above facts, this paper will use Taylor’s approximation theory to improve Deng’s GRA model based on the original theoretical framework. Firstly, we use small-scale multivariate sample data to test the reliability of Deng’s original GRA model for panel data, and then a possible method to modify Deng’s original GRA model for panel data is provided according to the test results. The main work of this paper lies in the following aspects:
- (1) A series of testing experiments is conducted on the performance of Deng’s GRA model for panel data based on the dataset LP1 of Robot Execution Failures.
- (2) A modified Deng’s GRA model for panel data is presented.
- (3) The water environment of Poyang Lake over past years is assessed using the modified Deng’s GRA model for panel data.
The outline of this paper is organized as follows.
Section 2 reviews Deng’s original GRA model for panel data and tests the performance of the model based on the dataset LP1 of Robot Execution Failures.
Section 3 presents a modified Deng’s GRA model for panel data.
Section 4 validates the modified Deng’s GRA model for panel data through three numerical experiments.
Section 5 applies the modified Deng’s GRA model for panel data to assess the water environment of Poyang Lake.
Section 6 concludes the paper.
3. Modified Deng’s GRA Model for Panel Data
From a mathematical perspective, Deng’s original GRA model is constructed based on the raw difference sequence between the reference sequence and the comparison sequence. In the sense of Taylor’s formula, this amounts to keeping only the lowest-order term of the approximation, which will inevitably result in significant errors. For example, let the reference sequence be a constant matrix with all elements equal to 0.5, and let the comparison sequence be a binary matrix whose elements can only be 0 or 1. Every absolute difference between corresponding elements then equals 0.5, so according to Deng’s original GRA model the grey relational degree of the reference sequence and any such comparison sequence will be 1, whatever the binary pattern. Obviously, this deviates from the facts and makes it difficult to distinguish the samples.
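The degenerate case described above can be checked numerically. The following is a minimal sketch, assuming the commonly used Deng-style coefficient (global minimum difference plus the distinguishing coefficient times the global maximum difference, divided by the local difference plus the same term), averaged over all matrix entries. The function name `deng_panel_degrees`, the matrix shape, and the three binary test matrices are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def deng_panel_degrees(x0, xs, rho=0.5):
    """Deng-style relational degree of each matrix in xs with the reference x0.

    Assumed form: coefficient = (d_min + rho * d_max) / (d + rho * d_max),
    where d_min and d_max are taken over all samples and all entries,
    and the degree is the mean coefficient over the entries of one sample.
    """
    ref = np.asarray(x0, dtype=float)
    # Absolute difference matrices, stacked to shape (samples, time, indicators).
    diffs = np.abs(np.stack([np.asarray(x, dtype=float) for x in xs]) - ref)
    d_min, d_max = diffs.min(), diffs.max()
    if d_max == 0.0:
        # All samples coincide with the reference.
        return np.ones(len(xs))
    coeff = (d_min + rho * d_max) / (diffs + rho * d_max)
    return coeff.mean(axis=(1, 2))

shape = (6, 4)                      # hypothetical time x indicator size
x0 = np.full(shape, 0.5)            # constant reference matrix
xs = [
    np.zeros(shape),                        # all zeros
    np.ones(shape),                         # all ones
    np.indices(shape).sum(axis=0) % 2,      # checkerboard of 0s and 1s
]
degrees = deng_panel_degrees(x0, xs)
print(degrees)
```

Under these assumptions, three clearly different binary matrices all receive a degree of 1, because every entry is at distance 0.5 from the reference.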
Taylor approximation theory approximates a function by a polynomial built from its derivatives at a single point. It is widely used in mathematics, physics, engineering, and computer science for its ability to solve problems that are otherwise difficult to handle analytically. In this paper, we adopt the concept of Taylor’s formula and replace the derivatives with their difference form to improve Deng’s GRA model. In this way, Deng’s GRA model is extended to include the first-order and second-order differences.
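As an illustration of the first-order and second-order differences mentioned above, the sketch below computes forward differences of a panel matrix along its time axis with NumPy. Treating axis 0 as time, using forward differences, and the helper name `time_differences` are assumptions made for this example. How the paper combines these differences into the relational degree is not reproduced here.

```python
import numpy as np

def time_differences(x):
    """Return the raw matrix and its first- and second-order forward
    differences along axis 0 (assumed here to be the time dimension)."""
    x = np.asarray(x, dtype=float)
    d1 = np.diff(x, n=1, axis=0)   # x(t+1) - x(t)
    d2 = np.diff(x, n=2, axis=0)   # x(t+2) - 2*x(t+1) + x(t)
    return x, d1, d2

# Hypothetical panel: 4 time points (rows) by 2 indicators (columns).
x = np.array([[1.0, 2.0],
              [4.0, 3.0],
              [9.0, 5.0],
              [16.0, 8.0]])
_, d1, d2 = time_differences(x)
print(d1)
print(d2)
```

Each differencing step shortens the time axis by one row, so any model that combines the raw, first-order, and second-order terms has to account for the different lengths.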
Consequently, Deng’s GRA model for panel data is modified as follows.
Definition 3. Assume is the behavior matrix of system characteristics, and is a behavior matrix of system factors, where represents the time dimension, represents the indicator dimension, is the sample size, and . Let
,
, here .
,
here . Then, is called the modified Deng’s grey relational degree of and and is also called the modified Deng’s GRA model for panel data.
As in the original model, the distinguishing coefficient serves to enhance resolution and is set as usual (0.5 being the customary choice in the GRA literature). Furthermore, the grey relational degree defined by the modified Deng’s GRA model for panel data still satisfies the primary properties, including normality, closeness, symmetry, translation invariance, and indicator permutation.
Next, we continue to use the Robot Execution Failures Dataset adopted in Section 2 to test the performance of the modified Deng’s GRA model for panel data.
5. Case Study
In this section, as a practical application, the modified Deng’s GRA model for panel data is applied to assess the water quality of Poyang Lake over the past five years. Poyang Lake, the largest freshwater lake in China, has abundant ecological resources and plays a significant role in economic and social spheres. Therefore, the Chinese government has always placed a high priority on the protection of water quality in Poyang Lake.
Among the various indicators of water quality, total phosphorus (TP) is the most crucial and has a decisive influence on the health assessment of aquatic ecosystems. TP is therefore acknowledged as one of the pollutants that require stringent monitoring and control within water quality assessment and management practices. This case study focuses on the monthly total phosphorus data measured at eight monitoring sites in the Poyang Lake District from 2019 to 2023. For confidentiality reasons, the eight monitoring sites are denoted Site 1–8, and the detailed data can be found in Table A1.
Following the earlier discussion in this paper, we take the zero matrix as the behavior matrix of system characteristics, with one dimension representing time and the other representing the indicators. Each year from 2019 to 2023 is treated as a sample, so that the yearly data form the behavior matrices of system factors.
Meanwhile, to verify the effectiveness of the modified Deng’s GRA model for panel data, its results are compared with those of Deng’s original GRA model and other GRA models for panel data, including the Euclidean norm GRA model [8], the matrix GRA model [21], and the C-type GRA model [22]. As in Section 4, the testing data are pre-processed into dimensionless form using the interval operator. The relational degree values, computed in Python, are listed in Table 7.
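The interval operator used for pre-processing can be sketched as a min-max rescaling to the interval [0, 1]. The version below is a minimal illustration. The function name `interval_operator`, the choice of axis, the handling of constant columns, and the sample values are assumptions for the example and are not taken from Table A1.

```python
import numpy as np

def interval_operator(x, axis=None):
    """Rescale values to [0, 1] via (x - min) / (max - min).

    axis=None rescales over the whole array; axis=0 rescales each column
    separately. Constant slices are mapped to 0 to avoid division by zero
    (a choice made for this sketch only).
    """
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    span = hi - lo
    safe_span = np.where(span == 0.0, 1.0, span)
    return (x - lo) / safe_span

# Hypothetical TP readings: 3 months (rows) by 2 sites (columns).
tp = np.array([[0.02, 0.10],
               [0.06, 0.10],
               [0.10, 0.10]])
per_site = interval_operator(tp, axis=0)   # each site rescaled on its own
overall = interval_operator(tp)            # one common scale for all values
print(per_site)
print(overall)
```

Whether the rescaling is applied per indicator or over the whole panel changes the resulting difference matrices, which is one reason the choice of pre-processing method matters for GRA models.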
Table 7 shows that the relational degree values differ significantly from one model to another. This is because different GRA models for panel data have different functional expressions, and the ranges of their relational degrees also differ. Because the main purpose of GRA models is cluster analysis based on the relational degree order, the numerical values of the relational degree are not directly comparable across GRA models. In other words, we merely need to compare the relational degree orders of the five GRA models. The relational degree orders yielded by the five GRA models are therefore shown in Table 8.
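Comparing models by order rather than by value can be sketched as follows. The helper `relational_order`, the sample labels, and the degree values are purely hypothetical and are not the values reported in Table 7. Samples are sorted by decreasing relational degree, so the sample closest to the reference comes first.

```python
import numpy as np

def relational_order(labels, degrees):
    """Return the labels sorted by decreasing relational degree."""
    idx = np.argsort(-np.asarray(degrees, dtype=float), kind="stable")
    return [labels[i] for i in idx]

labels = ["S1", "S2", "S3", "S4", "S5"]        # hypothetical samples
model_a = [0.61, 0.66, 0.70, 0.74, 0.79]       # hypothetical degrees in (0, 1]
model_b = [1.8, 2.1, 2.6, 3.0, 3.9]            # hypothetical degrees on another scale

order_a = relational_order(labels, model_a)
order_b = relational_order(labels, model_b)
print(order_a)
print(order_a == order_b)
```

Although the two hypothetical models produce values on different scales, they induce the same order, which is the quantity compared across models.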
Table 8 clearly shows that, among the orders provided by the five GRA models, only that of Deng’s original GRA model differs from the others. Since the modified Deng’s GRA model for panel data agrees with the three other GRA models for panel data, this indicates that the modified model is effective and significantly improves the clustering accuracy of the original model. Furthermore, the earlier a year appears in the order, the lower the TP of Poyang Lake in that year. Therefore, the data in Table 8 further confirm that the Chinese government has achieved significant results in reducing TP emissions into Poyang Lake and effectively protecting its water environment.
6. Conclusions
In this paper, a new method is provided to improve the clustering accuracy of the existing Deng’s GRA model for panel data. The numerical test results for Deng’s original GRA model show that its mechanism is reasonable but that it lacks in-depth modeling. To solve this problem, a modified Deng’s GRA model is presented that can be applied to the cluster analysis of data with any number of dimensions. Three numerical experiments on the Robot Execution Failures Dataset verify a significant improvement in clustering accuracy over Deng’s original GRA model for panel data. Furthermore, a case study on the monthly TP data of Poyang Lake over the past five years is given to demonstrate the superiority of the modified model. Compared with other clustering methods, the results of the case study show that the modified Deng’s GRA model for panel data is applicable and also confirm the remarkable effectiveness of the Chinese government’s water quality regulation in Poyang Lake over many years. Owing to its simplicity, the modified Deng’s GRA model for panel data presented in this paper can be more widely applied in future research on grey relational theory and is expected to reveal the interactions between various factors in a system more effectively, helping researchers to identify key factors and optimize the system structure.
The comparative analysis also shows that the modified Deng’s GRA model, the matrix GRA model, and the C-type GRA model all perform well. Given its simple mathematical structure and easy implementation, the modified Deng’s GRA model is deemed effective and has broader application prospects. Nevertheless, it may have potential shortcomings that the present experiments do not expose, so a more extensive simulation study against other clustering methods is left for future work. In addition, the choice of data pre-processing method is crucial for ensuring the accuracy of the model, so examining the specific impact of different pre-processing methods on the performance of various GRA models will also be meaningful future research.