Article

Secondary System Status Assessment of Smart Substation Based on Multi-Model Fusion Ensemble Learning in Power System

1. Dispatching and Controlling Center, Guangdong Power Grid Company Ltd., Guangzhou 510000, China
2. Power Dispatching & Control Center, China Southern Power Grid Company Ltd., Guangzhou 510000, China
3. School of Electrical and Electronic Engineering, Anhui Science and Technology University, Bengbu 233100, China
4. Jiangmen Power Supply Bureau, Guangdong Power Grid Company Ltd., Jiangmen 529030, China
5. XJ Electric Co., Ltd., Xuchang 461000, China
* Author to whom correspondence should be addressed.
Processes 2025, 13(7), 1986; https://doi.org/10.3390/pr13071986
Submission received: 11 May 2025 / Revised: 9 June 2025 / Accepted: 10 June 2025 / Published: 24 June 2025
(This article belongs to the Special Issue Smart Optimization Techniques for Microgrid Management)

Abstract

To accurately evaluate the operating status of secondary equipment in smart substations, this paper establishes a secondary equipment status evaluation index system and proposes a status evaluation method based on multi-model fusion ensemble learning, which exploits the diversity among multiple machine learning algorithms used as base learners. The method has a two-layer structure. First, the original data are partitioned, and the partitioned data are used to perform k-fold validation on several base learners in the first layer. Then, a fully connected cascade (FCC) neural network in the second layer fuses the base learners, and the Levenberg–Marquardt (LM) algorithm is used to train the FCC network so that the model converges quickly and stably. Simulation analysis shows that the proposed method assesses secondary equipment status with an accuracy of 98.71%, so it can effectively evaluate the operating status of secondary equipment and provide guidance for the maintenance of smart substation systems and secondary equipment.

1. Introduction

Smart substations are an important foundation and support for strong smart grids. The operating status of secondary equipment in smart substations is directly related to the safety and stability of smart substations [1,2,3,4].
At present, the maintenance of substation secondary equipment mainly consists of fault maintenance and planned maintenance. Secondary equipment technology is developing rapidly, but maintenance personnel are scarce and often lack experience, which poses a substantial risk to the stable operation of substations [5,6]. Secondary equipment status assessment lays the foundation for scientifically understanding the operating status of a device and for guiding targeted inspection and maintenance. Accurate status assessment can not only improve the reliability of secondary system operation but also improve overall economic benefits through scientific and well-planned maintenance schedules [7,8]. However, the amount of secondary equipment in operation has grown rapidly, so there is an urgent need to carry out equipment maintenance with condition assessment as its core in order to improve operation and maintenance efficiency.
Currently, condition research and evaluation mainly target the operating status of primary equipment; there is comparatively little research on the active operation, maintenance, and condition inspection of secondary equipment in smart substations.
Fuzzy-theory-based modeling is commonly used for secondary equipment status evaluation. It suits the status evaluation of systems characterized by fuzziness and uncertainty, since fuzzy theory can quantify the influence of each indicator on the equipment status. To address the problem that a single indicator cannot effectively reveal potential faults of secondary equipment in smart substations, reference [9] proposed a fuzzy comprehensive evaluation method for the overall performance of secondary equipment based on variable weight theory and a trapezoidal cloud model. The analytic hierarchy process (AHP) has also been widely used in secondary equipment status assessment. Reference [10] uses AHP to propose a status assessment standard based on expert consultation, which combines qualitative and quantitative analysis well. Reference [11] proposed an AHP-based condition-based maintenance strategy for electrical secondary equipment that draws on various types of equipment information and online monitoring data.
In recent years, with the rise of artificial intelligence, various intelligent algorithms (such as support vector machines [12,13,14] and artificial neural networks [15,16,17]) have been applied to secondary equipment status assessment and have achieved promising results in practice. Machine learning algorithms such as SVM and ANN are widely used for fault detection, classification, and status assessment. Reference [18] established a secondary equipment status assessment model based on grey clustering; to handle the uncertainty of the assessment indicators, it used a cloud model to construct the grey whitening weight function and combined AHP to calculate the combined weights of the status indicators. Reference [19] proposed a smart substation secondary equipment status evaluation method based on big data mining, introducing a health index to reflect the true health status of the equipment. The single models or approaches above establish correlations between multiple indicators and the status of secondary equipment; however, when dealing with such nonlinear, high-dimensional problems, an individual model is inherently unstable. Multi-model fusion ensemble learning, by integrating the evaluation results of multiple models, can enhance overall assessment performance while effectively mitigating the risk of overfitting [20].
In response to the above issues, this paper proposes a secondary system status assessment method for smart substations based on multi-model fusion ensemble learning. The main contributions of the paper are as follows:
(1)
A condition assessment indicator system for secondary equipment was established, and a multi-model fusion ensemble learning-based assessment method was proposed by leveraging the divergence among base learners of multiple machine learning algorithms;
(2)
A Fully Connected Cascaded (FCC) neural network was employed to integrate multiple base learners, thereby enhancing the accuracy of condition assessment;
(3)
The Levenberg–Marquardt (LM) algorithm was adopted to train the FCC neural network, enabling the model to achieve rapid and stable convergence.
The remainder of the paper is organized as follows: Section 2 establishes a secondary equipment evaluation model for smart substations, Section 3 proposes a secondary equipment status evaluation method based on ensemble learning, Section 4 conducts a simulation experimental analysis to verify the proposed scheme, and Section 5 provides conclusions.

2. Evaluation Model of Secondary Equipment in Smart Substation

2.1. Secondary System Structure

The secondary system architecture of a smart substation is a three-layer and two-network structure, as shown in Figure 1. The three layers refer to the station control layer, bay layer, and process layer, and the “two networks” refer to the process layer network and the station control layer network.

2.2. Secondary Equipment Evaluation Index System

A reasonable secondary equipment evaluation index system is the foundation of state evaluation research. The secondary system of a smart substation contains a variety of equipment, and the state characteristics of different equipment differ. It is therefore necessary to formulate a scientific and efficient evaluation system based on actual operating conditions, the technical level, relevant technical specifications, and operating experience. The secondary equipment evaluation indicators are shown in Table 1.

2.3. Establishment of Secondary Equipment Evaluation Samples

The evaluation samples of secondary equipment should include the evaluation indicators and the corresponding evaluation results. According to the analysis in the previous section, the evaluation indicators of secondary system equipment can be expressed as $X = [X_{1,1}, X_{1,2}, \ldots, X_{n,m}]$, where $X_{n,m}$ represents indicator $m$ of device $n$, for a total of 65 indicators.
This paper adopts the relative degradation method to characterize the degree to which equipment deviates from its normal state. The degradation degree is a quantitative indicator with a value range of 0 to 1. According to the characteristics of different indicators, they are divided into two types: larger-is-better indicators and smaller-is-better indicators.
For larger-is-better indicators, such as timing accuracy and SOE resolution, the degradation degree $x_i$ can be expressed as (1):

$$x_i = \frac{u_i - u_{i0}}{u_{i\max} - u_{i0}} \tag{1}$$

where $u_i$ is the current measured value of the equipment; $u_{i0}$ is the factory value of the status indicator, that is, the normal status value; and $u_{i\max}$ is the extremely degraded status value.
For smaller-is-better indicators, such as synchronization jitter and response message delay, the degradation degree $x_i$ can be expressed as (2):

$$x_i = \frac{u_{i0} - u_i}{u_{i0} - u_{i\min}} \tag{2}$$

where $u_{i\min}$ is the extremely degraded status value.
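Both degradation formulas can be folded into one helper by passing the extreme-state value as a single limit parameter. A minimal sketch (the function name and the clamping to [0, 1] are our additions, not from the paper):

```python
def degradation(u, u0, u_limit):
    """Relative degradation degree in [0, 1].

    u: current measured value; u0: factory (normal) value;
    u_limit: extremely degraded value (u_max for larger-is-better
    indicators, u_min for smaller-is-better ones, per Eqs. (1)-(2)).
    """
    d = (u - u0) / (u_limit - u0)
    return min(max(d, 0.0), 1.0)  # clamp to the defined range

# Example: an indicator drifting from a factory value of 1 ms
# toward an extreme limit of 5 ms.
print(degradation(3.0, 1.0, 5.0))  # 0.5
```

Because the formula normalizes by the signed distance from the factory value to the extreme value, the same expression covers both indicator types.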
In addition, in order to obtain evaluation samples for the operating status of secondary equipment in smart substations, several experts were invited to evaluate the status of each secondary equipment and the overall operating status based on the collected secondary equipment operating data and the actual operating conditions.
Based on expert opinions and operational experience, the operating status of evaluation indicators for secondary equipment is classified into four levels: “Normal, Attention, Abnormal, Critical,” with corresponding evaluation comment sets [v1, v2, v3, v4], as shown in Table 2.
The status evaluation results can be expressed as $Y = [y_1, y_2, y_3, y_4, y_5, y_6, y_\Sigma]$, where $y_1, \ldots, y_6 \in [0, 1]$ and $y_\Sigma = \max(y_1, y_2, y_3, y_4, y_5, y_6)$; $y_1, \ldots, y_6$ represent the operating status evaluation results of the six types of secondary equipment, and $y_\Sigma$ represents the status evaluation result of the secondary system.
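As a small illustration of how $y_\Sigma$ aggregates the device-level results, the sketch below takes the maximum of six hypothetical scores and maps it onto the four-level comment set. The equal-width binning rule is purely illustrative, since the paper does not give numeric thresholds for the levels:

```python
# Hypothetical device-level results y1..y6 in [0, 1]; the system-level
# result is their maximum, per Y = [y1, ..., y6, y_sigma].
levels = ["Normal", "Attention", "Abnormal", "Critical"]  # Table 2 comment set

def system_status(device_scores):
    y_sigma = max(device_scores)
    # Map the score onto the four-level comment set via equal-width bins
    # (an illustrative rule; the paper's expert thresholds are not given).
    idx = min(int(y_sigma * 4), 3)
    return y_sigma, levels[idx]

print(system_status([0.1, 0.2, 0.05, 0.6, 0.15, 0.3]))  # (0.6, 'Abnormal')
```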

3. Secondary Equipment Condition Assessment Based on Ensemble Learning

Manual analysis of the status of secondary equipment in smart substations takes a lot of time, and the accuracy of the evaluation cannot be guaranteed. This paper adopts a multi-model fusion ensemble learning algorithm to learn the results of manual evaluation so that the trained model can evaluate the status of secondary equipment like an expert, thereby realizing rapid online evaluation of the status of secondary equipment.

3.1. Multi-Model Fusion Ensemble Learning

At present, many studies have established correspondences between multiple indicators and secondary equipment status through various models or methods [21,22]. However, a single model is not stable when dealing with such nonlinear, high-dimensional problems. Multi-model fusion ensemble learning trains several models as base learners on the evaluated data and uses the outputs of these base learners as a new training set for a second-level learner. Because different base learner models typically learn different features of the data, fusing multiple base learners lets them complement one another and achieves better learning results [23].
The framework of multi-model fusion ensemble learning is shown in Figure 2. First, the original data {X, Y} is divided into training data set D1 and test data set D2, and the training data set is randomly divided into several sub-data sets, where the number of sub-data sets equals the number of base learners in the first layer. This paper selects five base learners: XGBoost (eXtreme Gradient Boosting), LightGBM (Light Gradient Boosting Machine), Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Long Short-Term Memory (LSTM); the selection criteria for the base learners are analyzed in detail later. The training data set is therefore divided into five parts, recorded as D11, D12, D13, D14, and D15.
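The first-layer procedure is standard stacking: each base learner produces out-of-fold predictions that become the second-layer inputs. A minimal sketch with scikit-learn on synthetic data (only two of the five base learners are shown; the data and labels are fabricated stand-ins for the real indicator samples):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

# Synthetic stand-in data: 200 samples x 65 status indicators.
rng = np.random.default_rng(0)
X = rng.random((200, 65))
y = X[:, :6].max(axis=1)  # fabricated stand-in for expert labels

base_learners = [
    RandomForestRegressor(n_estimators=50, random_state=0),
    GradientBoostingRegressor(random_state=0),
]

kf = KFold(n_splits=5, shuffle=True, random_state=0)
# Out-of-fold predictions from each base learner become the
# second-layer (FCC) training inputs.
meta_X = np.zeros((len(X), len(base_learners)))
for j, model in enumerate(base_learners):
    for train_idx, val_idx in kf.split(X):
        model.fit(X[train_idx], y[train_idx])
        meta_X[val_idx, j] = model.predict(X[val_idx])

print(meta_X.shape)  # one column of predictions per base learner
```

Because every prediction in `meta_X` comes from a model that never saw that sample during training, the second layer is trained on honest estimates of each base learner's behavior.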

3.2. Fully Connected Cascade Neural Network

Through the base learners of the first layer, we can finally obtain new training data sets and new test data sets, which will be used in the training of the second layer model. Therefore, the input in the second layer is the evaluation results of the output of each base model in the first layer, and the output is the adjusted final evaluation result. The second layer model uses a fully connected cascade neural network, and its network structure is shown in Figure 3. The five base learners are connected to five neurons through the input layer. The first four neurons use the tanh (⋅) activation function, and the last neuron is a linear summation function. With the same number of neurons, the FCC neural network can provide more connections than traditional neural networks and obtain more weight relationships. Through such structural advantages, the FCC neural network can achieve better results.
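The cascade structure described above can be sketched as a forward pass in which neuron k receives the five base-learner outputs plus the outputs of all earlier neurons. The weights below are random placeholders, not trained values:

```python
import numpy as np

def fcc_forward(x, weights, biases):
    """Forward pass of a 5-neuron fully connected cascade network.

    x: the five base-learner outputs. Neuron k sees the inputs plus
    the outputs of all preceding neurons; neurons 1-4 use tanh and
    neuron 5 is a linear summation, as described for Figure 3.
    """
    signal = list(x)
    for k in range(5):
        z = np.dot(weights[k], signal) + biases[k]
        out = np.tanh(z) if k < 4 else z  # last neuron: linear
        signal.append(out)
    return signal[-1]

# Placeholder weights sized to the growing cascade input (5, 6, ..., 9).
rng = np.random.default_rng(1)
weights = [rng.standard_normal(5 + k) * 0.1 for k in range(5)]
biases = rng.standard_normal(5) * 0.1
print(fcc_forward([0.2, 0.3, 0.25, 0.28, 0.31], weights, biases))
```

The cascade connections are visible in the growing weight-vector lengths: with the same five neurons, a conventional single-hidden-layer network would have no neuron-to-neuron links at all.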
The shallow structure FCC neural network used in this paper can provide sufficient learning ability while avoiding overfitting problems. The FCC neural network has a direct mapping relationship between each input and the potential variable of each neuron. The first layer model loss function can be expressed by the sum of squared errors, as shown in (3):
$$L(L_{1,i}) = \min_{\omega_i} \sum_{n \in D_{1i}} \left\| \hat{l}_n^{L_{1,i}} - l_n \right\|^2 + \alpha \left\| \omega_i \right\| \tag{3}$$
where $L(L_{1,i})$ represents the loss function of base learner $i$ in the first layer; $\hat{l}_n^{L_{1,i}}$ represents the evaluation result of base learner $i$ for the $n$th data sample; $l_n$ represents the actual evaluation result of the $n$th sample, that is, the label value; $\omega_i$ represents the weight corresponding to base learner $i$; and $\alpha$ is the weight coefficient.
According to the relationship between the second layer model and the base learner in the first layer, the evaluation result of the second layer model output can be expressed as (4):
$$\hat{l}_n^{L_2} = f\left(\hat{l}_n^{L_{1,1}}, \hat{l}_n^{L_{1,2}}, \ldots, \hat{l}_n^{L_{1,5}}, \omega_{\mathrm{fcc}}\right) \tag{4}$$
where f ( ) represents the evaluation result of the FCC neural network output; ω fcc represents the weight of the FCC neural network.
Therefore, the loss function of the second-layer model can be expressed as (5):
$$L(L_2) = \min_{\omega_{\mathrm{fcc}}} \sum_{n \in D_3} \left\| \hat{l}_n^{L_2} - l_n \right\|^2 + \beta \left\| \omega_{\mathrm{fcc}} \right\| \tag{5}$$
where $L(L_2)$ represents the loss function of the second layer; $\hat{l}_n^{L_2}$ represents the evaluation result of the FCC neural network for the $n$th data sample; $\omega_{\mathrm{fcc}}$ represents the weight of the FCC neural network; and $\beta$ is the weight coefficient.

3.3. Secondary Equipment Status Assessment Process Based on Ensemble Learning

The secondary equipment evaluation method based on multi-model fusion ensemble learning essentially lets the ensemble learning model learn how experts evaluate the equipment, thereby acquiring the ability to evaluate secondary equipment itself. The specific process, shown in Figure 4, includes the following steps:
(1)
Construct evaluation samples according to the established indicator system; 80% of the samples form training data set D1, and the remaining 20% form test data set D2.
(2)
Divide the sample data set and use the k-fold verification principle to train each base learner.
(3)
After the training is completed, use the base learner to generate a new training data set D3 and test data set D4.
(4)
Use the new training data set D3 and test data set D4 to train the FCC neural network.
(5)
Construct a loss function and update the weights of the FCC neural network until the requirements are met.
The multi-model fusion ensemble learning model after training can accurately evaluate the operating status of secondary equipment and provide strong support for the operation and maintenance of smart substations.

3.4. Extraction of Feature Vectors

According to Section 2.3, the evaluation indicators of secondary system equipment can be expressed as $X = [X_{1,1}, X_{1,2}, \ldots, X_{n,m}]$, where $X_{n,m}$ represents indicator $m$ of device $n$, for a total of 65 indicators.
The feature vector dataset is shown in Table 3. The article collected 12,000 samples, so the feature matrix has dimensions 12,000 × 65.

4. Case Analysis

To validate the effectiveness of the proposed algorithm for condition assessment of secondary equipment in smart substations, a typical line bay of a smart substation was selected as an example; its topology is illustrated in Figure 5. As specified in Section 2.3, the operational state information of the secondary equipment comprises 65 elements. Simulations were carried out in MATLAB/Simulink based on real-world operational data. During the experiment, the states of each secondary device were simulated, generating a total of 12,000 samples divided evenly among four categories (normal, attention, abnormal, and critical), i.e., 3000 samples per category.
Six experts were invited to evaluate the condition of the secondary equipment, with the final state rating for each data instance determined by averaging the scores of all experts. The experts rated the data using the quantitative indicators and state classification rules defined in Section 2.3, which categorize equipment conditions into four levels. By synthesizing real-time equipment data, historical maintenance records, and industry expertise, each expert independently assigned state ratings to each equipment sample. This study employed an averaging fusion strategy: the discrete ratings from all six experts were converted into numerical values, their arithmetic mean was calculated and rounded, and the result was mapped back to the closest state level as the ground-truth label. All 12,000 labeled samples were combined with their feature vectors to form the dataset, with 80% used for training the ensemble learning model and 20% for testing. The model optimization objective minimizes prediction errors against the expert-averaged labels, enabling the algorithm to emulate expert decision-making for automated condition assessment.
In summary, the experts serve dual roles as data annotators and executors of the evaluation criteria: their consensus labels provide learning targets for the model, while the multi-expert averaging strategy reduces individual subjective bias and ensures labeling reliability. Model accuracy is measured by the mean absolute percentage error (MAPE) $e_{\mathrm{MAPE}}$ and the root mean square error (RMSE) $e_{\mathrm{RMSE}}$, as shown in (6) and (7):
$$e_{\mathrm{MAPE}} = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{\hat{y}_n - y_n}{y_n} \right| \times 100\% \tag{6}$$
$$e_{\mathrm{RMSE}} = \sqrt{\frac{1}{N} \sum_{n=1}^{N} \left( \hat{y}_n - y_n \right)^2} \tag{7}$$
where $N$ is the total number of samples; $\hat{y}_n$ is the model's evaluation result for sample $n$; and $y_n$ is the actual evaluation result of sample $n$, that is, the expert score.
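The two metrics can be implemented directly from (6) and (7); the sample values below are made up for illustration:

```python
import numpy as np

def mape(y_hat, y):
    # Eq. (6): mean absolute percentage error, in percent.
    return np.mean(np.abs((y_hat - y) / y)) * 100.0

def rmse(y_hat, y):
    # Eq. (7): root mean square error.
    return np.sqrt(np.mean((y_hat - y) ** 2))

# Fabricated expert scores and model outputs for three samples.
y = np.array([0.50, 0.80, 0.40])
y_hat = np.array([0.55, 0.76, 0.42])
print(round(mape(y_hat, y), 2))
print(round(rmse(y_hat, y), 4))
```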

4.1. Base Learner Selection

The performance of each base learner in the first layer of the ensemble learning model will directly affect the overall performance of the ensemble learning model. Therefore, in the process of building an ensemble learning model, it is necessary to try to select models with better performance. This paper pre-selected five base learners and adjusted the parameters according to their respective characteristics. In order to more rigorously demonstrate the evaluation performance and learning ability of the base learners, the data division and base learner learning process were repeated, and the average relative error was recorded.
The model parameters in this paper are shown in Table 4:
The final state evaluation results are shown in Table 5, where e ¯ M A P E and e ¯ R M S E are the mean values of the MAPE and RMSE obtained in multiple repeated learning processes, respectively.
According to Table 5, under the current hyperparameter settings, the average relative errors of all base learners are low; LightGBM has the lowest average relative error, followed closely by XGBoost. Both are implementations of the gradient boosting decision tree (GBDT) algorithm. XGBoost's loss function uses a second-order Taylor expansion with both first- and second-order derivative information, whereas plain GBDT uses only a first-order expansion, so XGBoost can train the model more thoroughly during optimization. In addition, the average relative errors of LSTM and RF are also low, which can basically meet the requirements of secondary equipment evaluation.

4.2. Base Learner Correlation Analysis

In addition to the performance of the base learners, the correlation between the base learners will also directly affect the final evaluation ability of the ensemble learning model. Therefore, it is necessary to add base learners with large differences as much as possible. In order to select the best combination of base learners, this paper uses the Pearson correlation coefficient to analyze the differences between the base learners to measure the correlation of the base learners. The correlation analysis between the base learners is shown in Figure 6.
As shown in Figure 6, the five base learners used in this paper have high error correlation, for several reasons. First, according to Table 5, all of the base learners achieve high accuracy, so the remaining error may come from the inherent noise of the data itself, which easily inflates error correlation. Second, except for LSTM, the four other base learners are highly similar models: GBDT and RF are both composed of a large number of trees, differing in that boosting and bagging, respectively, are used to combine the trees, while XGBoost and LightGBM are both GBDT implementations. However, the ensemble model established in this paper does not integrate many base learners, and these learners all achieve good evaluation accuracy, so retaining the highly correlated base learners has far more advantages than disadvantages.
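The Pearson analysis amounts to correlating the residual vectors of the base learners. A sketch with synthetic residuals that share a common error component, mimicking the high correlation observed in Figure 6:

```python
import numpy as np

# Hypothetical residuals (prediction minus label) from five base
# learners on the same 100 test samples; np.corrcoef yields the
# Pearson correlation matrix used to measure learner diversity.
rng = np.random.default_rng(2)
common = rng.standard_normal(100)            # shared error component
residuals = np.vstack([
    common + 0.2 * rng.standard_normal(100)  # each learner's own noise
    for _ in range(5)
])
corr = np.corrcoef(residuals)
print(np.round(corr, 2))  # high off-diagonal values -> low diversity
```

When the shared component dominates (as when all learners inherit the same data noise), the off-diagonal entries approach 1, which is exactly the situation the paragraph above describes.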

4.3. Convergence Performance of the Model

The ensemble learning model established in this study employs an FCC network in the second layer and utilizes the Levenberg–Marquardt (LM) algorithm to accelerate convergence. To evaluate the impact of the LM algorithm on the overall model, comparative experiments were conducted under two conditions: with and without the LM algorithm. The learning curves are illustrated in Figure 7. As shown in Figure 7, the LM algorithm significantly enhances the convergence speed and stability of the ensemble learning model, enabling it to reach optimal performance more rapidly and consistently.
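The LM step itself is available off the shelf, e.g. in SciPy. The sketch below fits a single tanh unit with `least_squares(method='lm')` to show the damped Gauss–Newton behavior; it is an illustration of the algorithm on fabricated data, not the paper's FCC training code:

```python
import numpy as np
from scipy.optimize import least_squares

# Levenberg-Marquardt fit of a tiny tanh unit y = tanh(w*x + b).
# LM adaptively blends gradient descent (large damping) with
# Gauss-Newton (small damping), which is what gives the fast,
# stable convergence reported for the FCC training.
rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, 50)
y = np.tanh(1.5 * x - 0.4) + 0.01 * rng.standard_normal(50)

def residual(p):
    w, b = p
    return np.tanh(w * x + b) - y

fit = least_squares(residual, x0=[0.1, 0.0], method='lm')
print(np.round(fit.x, 2))  # should land near the true (1.5, -0.4)
```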

4.4. Network Optimization Under Different Hyperparameters

During the training of the FCC secondary equipment condition assessment model using the sample data set, the network was optimized by adjusting the initial learning rate, number of model layers, and iteration count. The classification accuracy of the training data set served as the optimization metric for the network parameters. In the training process, the threshold of the output layer was set to 0.05. If the neuron activation value in the output layer exceeded 0.05, the corresponding element was assigned a value of 1; otherwise, it was 0. The results are illustrated in Figure 8 and Figure 9. As shown in Figure 8, after 1000 iterations, the neural network with two layers achieved the best optimization performance across all initial learning rates. Increasing the model depth to three layers significantly raised training difficulty, while a one-layer architecture exhibited insufficient fitting capability.
From Figure 9, the network demonstrated optimal convergence when the initial learning rate was 0.1. Although a higher learning rate of 0.5 accelerated early-stage convergence, it caused accuracy fluctuations (highlighted in the boxed region of the figure) during training. Smaller learning rates resulted in slower accuracy improvements, indicating sluggish convergence of the FCC model.
The accuracy improved progressively with increasing iterations but stabilized once the iteration count reached a critical threshold. To balance training efficiency and computational resource constraints, the inflection point of accuracy growth (at 2800 iterations) was selected as the optimal stopping criterion. Here, the accuracy was calculated as 1 − eMAPE, yielding a final classification accuracy of 98.71% for the FCC model on the training data set.

4.5. Evaluation and Performance Analysis of Ensemble Learning Models

To verify the performance of the multi-model fusion ensemble learning model established in this paper, each individual base learner, the ensemble model with LSTM removed (denoted ensemble-1), the ensemble model with GBDT removed (denoted ensemble-2), and the complete ensemble model are compared. For a fair evaluation, the complete training set is used to train each model. After repeated experiments, the average relative accuracy distribution of each model is shown in Figure 10, where the average relative accuracy is calculated as $1 - e_{\mathrm{MAPE}}$. The accuracy and recall of the different models are shown in Table 6.
According to Table 6, the multi-model fusion ensemble learning method in this paper is superior to the individual algorithms and the partial ensembles above in both accuracy and recall.
As shown in Figure 10, from a theoretical perspective, the multi-model fusion ensemble can fully exploit the advantages of each algorithm, drawing on their strengths and avoiding their weaknesses to obtain better evaluation accuracy. Moreover, there are many secondary equipment evaluation indicators, which may be interrelated, so multiple indicator combinations may reach the same performance through different hypotheses. The ensemble model established in this paper effectively reduces the risk of poor generalization from a single model. In addition, a single model is prone to falling into a local optimum; by fusing multiple models, the ensemble algorithm alleviates this problem and improves overall evaluation performance. Further analysis of the ensemble variants shows that ensemble-1 and ensemble-2 each integrate four models, three of which are the same; the fourth model is GBDT in ensemble-1 (LSTM removed) and LSTM in ensemble-2 (GBDT removed). The accuracy of the ensemble that retains LSTM is significantly higher than that of the ensemble that retains GBDT, while the accuracy of the GBDT-retaining ensemble is almost the same as that of the two high-performance single models, XGBoost and LightGBM. The reason is that LSTM differs strongly from the other four models, whereas GBDT differs little from them. Although the standalone accuracy of LSTM is not high, its diversity brings a larger accuracy improvement to the ensemble. This conclusion is consistent with the mathematical derivation in [23]. In addition, the ensemble model integrating all five base learners has the highest accuracy.
Therefore, as many effective base learners as possible should be integrated when resources allow, and when resources are limited, base learners with low mutual correlation should be preferred for integration.

5. Conclusions

To solve the problem of secondary equipment evaluation in smart substations, this paper develops a comprehensive secondary equipment evaluation index system based on the characteristics of the secondary equipment and proposes a status evaluation method based on multi-model fusion ensemble learning. The method integrates the evaluation capabilities of multiple base learners, drawing on their strengths while avoiding their weaknesses, and thus obtains better evaluation results. Case analysis shows that ensemble learning achieves better evaluation capability than a single model and that base learners with low correlation should be preferred when selecting base learners. Simulation analysis shows that the proposed method assesses secondary equipment status with an accuracy of 98.71%. The ensemble model established in this paper integrates XGBoost, LightGBM, RF, GBDT, and LSTM and uses the LM algorithm to train the FCC neural network. It realizes the status evaluation of secondary equipment well and has high application value for secondary equipment evaluation in smart substations.
In this paper, the evaluation effect of secondary equipment in a noisy environment is not considered, and this will be refined in the next stage of research.
In the future, more algorithm models will be trained for different substation equipment in smart substations to improve the application range of the learning algorithm.

Author Contributions

S.L., conceptualization, methodology, writing—original draft. Y.P., investigation. W.L., data curation. Y.L., data curation. J.C., writing—review and editing. L.G., writing—review and editing. G.S., writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the technology project of Guangdong Power Grid Co., Ltd. on the research and application of lightweight prefabricated mobile commissioning and operation base (030700KC23070013-GDKJXM20230784).

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

Authors Shidan Liu, Yiquan Li and Jiafu Chen were employed by the Dispatching and Controlling Center, Guangdong Power Grid Company Ltd. Author Ye Peng was employed by the Power Dispatching & Control Center, China Southern Power Grid Company Ltd. Author Wei Liu was employed by the School of Electrical and Electronic Engineering, Anhui Science and Technology University. Author Liang Guo was employed by the Jiangmen Power Supply Bureau, Guangdong Power Grid Company Ltd. Author Guangshi Shao was employed by the XJ Electric Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figure 1. Structure of intelligent substation secondary system.
Figure 2. Framework of ensemble learning with multi-models.
Figure 3. Structure of FCC neural network.
Figure 4. Flow chart of secondary equipment condition.
Figure 5. Network topology graph of line interval.
Figure 6. Correlation of base learners.
Figure 7. Learning curves of the ensemble learning model with or without LM algorithms.
Figure 8. Network optimization after 1000 iterations.
Figure 9. Network optimization with two layers.
Figure 10. Average relative accuracy distribution of base learners and ensemble learning.
Table 1. Secondary equipment evaluation index.

Evaluation Index | Features
Measurement circuit evaluation index (12 items) | operation time, mean time between failures (MTBF), absolute delay, time synchronization pulse error, sampling amplitude error, sampling phase error, sampling packet loss rate, operating temperature, insulation resistance, leakage current, power frequency magnetic field immunity, and pulse magnetic field immunity
Protection device evaluation index (9 items) | operating time, MTBF, communication status, familial defect incidence, random defect incidence, average correct operation rate, bus differential current error, pilot differential current error, and main transformer differential current error
Intelligent terminal evaluation index (11 items) | operating time, action time, message transmission delay, time synchronization error, response message delay, familial defects, correct operation rate, Sequence of Events (SOE) resolution, communication interfaces, insulation resistance, and leakage current
Measurement and control device evaluation index (11 items) | operating time, MTBF, GOOSE delay, synchronization performance, four-telemetry performance (measurement, signaling, control, adjustment), anti-maloperation locking performance, harmonic interference, fiber-optic interface performance, SOE resolution, insulation resistance, and leakage current
Communication device evaluation index (13 items) | response time, delay performance, availability rate, utilization rate, time synchronization accuracy, impulse voltage withstand, insulation resistance, throughput rate, packet loss rate, broadcast rate, multicast rate, power frequency magnetic field immunity, and pulse magnetic field immunity
Synchronization system evaluation index (9 items) | MTBF, mean time to repair (MTTR), clock jitter, pulse width error, pulse leading-edge accuracy, time alignment accuracy, time reception accuracy, power frequency magnetic field immunity, and pulse magnetic field immunity
Table 2. State assessment standard for indicators.

Relative Degradation Degree | 0~0.2 | 0.2~0.5 | 0.5~0.8 | 0.8~1
Operating Status | Normal (v1) | Attention (v2) | Abnormal (v3) | Critical (v4)
Maintenance Strategy | Deferred Maintenance | Planned Maintenance | Expedited Maintenance | Immediate Maintenance
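The banding in Table 2 amounts to a simple threshold lookup. A minimal sketch in plain Python, assuming the relative degradation degree is already normalized to [0, 1] (the function name and the half-open treatment of the band boundaries are illustrative choices, not specified by the paper):

```python
def assess_state(degradation: float) -> tuple[str, str]:
    """Map a relative degradation degree in [0, 1] to an operating
    status and maintenance strategy, following the bands of Table 2."""
    if not 0.0 <= degradation <= 1.0:
        raise ValueError("relative degradation degree must lie in [0, 1]")
    if degradation < 0.2:
        return "Normal (v1)", "Deferred Maintenance"
    if degradation < 0.5:
        return "Attention (v2)", "Planned Maintenance"
    if degradation < 0.8:
        return "Abnormal (v3)", "Expedited Maintenance"
    return "Critical (v4)", "Immediate Maintenance"
```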
Table 3. Feature vector data.

Feature Vector | Secondary Equipment Evaluation Index
X | X = [(x_{1,1}, x_{1,2}, x_{1,3}, …, x_{1,63}, x_{1,64}, x_{1,65}); (x_{2,1}, x_{2,2}, x_{2,3}, …, x_{2,63}, x_{2,64}, x_{2,65}); …; (x_{12000,1}, x_{12000,2}, x_{12000,3}, …, x_{12000,63}, x_{12000,64}, x_{12000,65})]
Y | Y = [max(y_{1,1}, y_{1,2}, y_{1,3}, y_{1,4}, y_{1,5}, y_{1,6}); max(y_{2,1}, y_{2,2}, y_{2,3}, y_{2,4}, y_{2,5}, y_{2,6}); …; max(y_{12000,1}, y_{12000,2}, y_{12000,3}, y_{12000,4}, y_{12000,5}, y_{12000,6})]
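Per Table 3, each sample's label Y is the maximum of its six per-state evaluation values. A minimal sketch of that labeling step in plain Python (the sample values and variable names are illustrative):

```python
# Each row of y_raw holds the six per-state evaluation values
# (y_i1 ... y_i6) for one sample; the label is their maximum.
y_raw = [
    [0.12, 0.45, 0.08, 0.03, 0.21, 0.10],
    [0.05, 0.15, 0.62, 0.11, 0.02, 0.09],
]
Y = [max(row) for row in y_raw]
```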
Table 4. Model parameters.

Model | Parameters
FCC | number of network layers: 2; regularization coefficient: 3750
XGBoost | eta: 0.05; max depth: 6; subsample: 0.8; colsample_bytree: 0.8; min child weight: 1
LightGBM | number of trees: 790; maximum depth: 3; number of leaves: 8; learning rate: 0.008; bagging fraction: 0.12
RF | n_estimators: 100; criterion: "gini"
GBDT | learning rate: 0.09; number of tree estimators: 200; max depth: 4; subsample: 0.9
LSTM | number of hidden layers: 1; number of hidden nodes: 50
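The hyperparameters of Table 4 could be gathered in one configuration object before the base learners are constructed. A sketch using keyword names in the style of the xgboost, lightgbm, and scikit-learn estimator APIs (the dictionary layout itself is our convention, not the paper's):

```python
# Hyperparameters transcribed from Table 4, keyed by model name.
BASE_LEARNER_PARAMS = {
    "XGBoost": {"eta": 0.05, "max_depth": 6, "subsample": 0.8,
                "colsample_bytree": 0.8, "min_child_weight": 1},
    "LightGBM": {"n_estimators": 790, "max_depth": 3, "num_leaves": 8,
                 "learning_rate": 0.008, "bagging_fraction": 0.12},
    "RF": {"n_estimators": 100, "criterion": "gini"},
    "GBDT": {"learning_rate": 0.09, "n_estimators": 200,
             "max_depth": 4, "subsample": 0.9},
    "LSTM": {"hidden_layers": 1, "hidden_nodes": 50},
}
```

Each entry could then be unpacked into the matching constructor, e.g. `XGBRegressor(**BASE_LEARNER_PARAMS["XGBoost"])`, keeping all tuning choices in one auditable place.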
Table 5. Assessment error of different base learners.

Base Learner | XGBoost | LightGBM | RF | GBDT | LSTM
ē_MAPE (%) | 2.36 | 2.14 | 5.89 | 5.42 | 6.57
ē_RMSE | 0.16 | 0.12 | 0.23 | 0.19 | 0.33
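The two error measures in Table 5 are, presumably, the standard mean absolute percentage error and root-mean-square error; a self-contained sketch (the function names are ours, and the definitions assume nonzero true values for MAPE):

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```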
Table 6. The accuracy and recall of different models.

Model | SVM | RNN | XGBoost | LightGBM | RF | GBDT | LSTM | ensemble-1 | ensemble-2 | This Paper
Accuracy | 86.40% | 91.04% | 98.01% | 98.21% | 94.28% | 94.33% | 94.11% | 97.86% | 98.13% | 98.71%
Recall | 88.73% | 92.76% | 98.79% | 98.32% | 94.55% | 94.64% | 94.19% | 97.92% | 98.16% | 98.87%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

