A New Energy High-Impact Process Weather Classification Method Based on Sensitivity Factor Analysis and Progressive Layered Extraction

Liang, Zhifeng; Wang, Zhao; Wu, Nan; Jiang, Yue; Sun, Dayan

doi:10.3390/electronics14071336


by Zhifeng Liang ¹,⁴, Zhao Wang ³, Nan Wu ²,*, Yue Jiang ² and Dayan Sun ⁴

¹ Department of Electrical Engineering, Tsinghua University, Beijing 100084, China

² Key Laboratory of Modern Power System Simulation and Control & Renewable Energy Technology, Ministry of Education, Northeast Electric Power University, Jilin 132012, China

³ State Key Laboratory of Renewable Energy Grid-Integration, China Electric Power Research Institute, Haidian District, Beijing 100192, China

⁴ State Grid Corporation of China, Beijing 100031, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(7), 1336; https://doi.org/10.3390/electronics14071336

Submission received: 22 February 2025 / Revised: 25 March 2025 / Accepted: 25 March 2025 / Published: 27 March 2025

(This article belongs to the Special Issue Advances in Power System Dynamics, Stability, Control and Dispatch with Large-Scale Renewable Energy Penetrated, 2nd Edition)


Abstract

For the electricity system with a high proportion of new energy, the extreme weather events caused by climate change will make the new energy power supply present an extremely complicated situation, thus affecting the safe and stable operation of the power system. In order to solve the above problems, this study proposes a classification method of the extreme weather process based on the Progressive Layered Extraction (PLE) model considering the weather-sensitive factors with high impact on new energy. This method analyses the sensitive factors affecting the new energy output from the two perspectives of abnormal output and abnormal prediction error, defines the high-impact weather process, and divides the standard set. According to the standard set, a high-impact weather process identification model based on PLE is constructed to provide more accurate early warning information. The proposed method is applied to a new energy cluster in Jiangxi Province, China. Compared with the traditional classification task model, the accuracy of the proposed method is increased by 1.30%, which verifies the effectiveness of the proposed method.

Keywords: new electricity system; new energy power supply; extreme weather events; high-impact weather-sensitive factors; progressive layered extraction

1. Introduction

With the promotion of the “double carbon” goal, the status of new energy power generation has been increasing. China’s new energy power generation installed capacity, power generation, consumption, and other aspects have made remarkable progress; policy support continues to increase, large-scale projects and offshore projects continue to advance to promote the green, low-carbon transformation of the energy structure to achieve the sustainable development goals. In recent years, China’s new energy power generation development momentum has been flourishing. In terms of power generation, in the first three quarters of 2024, the national renewable energy generation reached 2.51 trillion kWh, an increase of 20.9%, accounting for about 35.5% of the total power generation, of which wind power and solar power combined reached about 1.35 trillion kWh, an increase of 26.3% [1]. Therefore, related studies on new energy power are increasing [2,3,4,5,6,7,8]. At the same time, in recent years, the global average temperature has continued to rise, and extreme weather events have occurred frequently, causing serious impacts on human society and the ecological environment [9]. Due to the weather vulnerability of the power system, that is, the potential threats and risks to the operation and stability of the power system in the face of various weather conditions, especially extreme weather events, large-scale power outages occur frequently around the world. For example, in November 2024, the United Kingdom was affected by storm “Dara”, which caused power outages for several consecutive days [10]. A snowstorm swept the southern United States in 2021. It overwhelmed the Texas power grid and caused widespread power outages [11]. 
Reference [12] systematically summarizes and evaluates the characteristics and impacts of extreme weather events occurring around the world in recent years, providing references for mechanism analysis, numerical simulation, and attribution of climate events in climate change research, and it puts forward relevant insights for improving meteorological disaster prevention and reduction strategies. Reference [13] addresses the effect of heat waves and cold waves on the variations in daily power generation in wind farms. Reference [14] considers extreme events in wind farms and proposes a new adaptive learning method for the online learning problem of RBF neural networks. However, the current classification of extreme weather does not have a completely unified standard and usually does not consider its impact on the new energy output, resulting in a lack of targeted analysis and classification. From July to August 2022, due to the insufficient prediction of extreme high temperatures and dry weather, Sichuan’s power supply guarantee ability was seriously reduced, with the largest daily power gap exceeding 17,000,000 kW and the power gap exceeding 370,000,000 kWh. In order to minimize or avoid the extreme weather causing a major impact on and even damage to the new power system, the definition, classification, and identification of high-impact extreme weather standards becomes particularly important.

At present, research on extreme weather is mainly focused on its impact on the whole new power system, and renewable energy, such as wind energy and solar energy, as an important link in the construction of a new power system, also has a non-negligible impact on the weather vulnerability of the power system [15]. Therefore, it has become crucial to explore the impact of extreme weather on renewable energy. Reference [16] used power system modelling outputs to identify weather-induced extreme events in highly renewable systems. Reference [17] proposed a probabilistic risk-assessment framework for assessing the impact of extreme events on renewable energy power plant components to assess the degradation of wind turbine transformers and photovoltaic (PV) panels in the face of extreme weather conditions. A Gaussian copula was used to model the joint probability of extreme events, effectively combining multiple phenomena. Reference [18] focused on solar PV systems under extreme weather and conducted a detailed case study. Reference [19] proposed an extreme weather event identification method based on meteorological factors and an adaptive short-term wind power prediction method based on transfer learning and autoencoder according to the meteorological conditions of wind power prediction under extreme weather conditions such as cold wave, typhoon, and ice cover. Reference [20] proposed a short-term combined prediction method for wind power in cold-wave weather based on extreme scenario division. According to the different mechanisms of cold waves affecting wind turbines, different scenarios were divided, and prediction models for each scenario were established to achieve accurate predictions of wind power output in cold-wave weather. 
However, most of the methods used in the current research model different extreme weather scenarios and then output classified prediction results, lacking a consideration of the internal relationships between various meteorological factors of different extreme weather. At the same time, the use of multiple independent classification models also has the problems of excessive training time cost and low computational efficiency.

Multi-task learning (MTL) can effectively reduce the above problems. At present, a large number of studies have been carried out on the prediction and classification of new energy power generation. For example, reference [21] proposes a short-term prediction method for offshore wind power based on the classification of significant weather processes and multi-task learning considering adjacent power. The power sequence of adjacent offshore wind farms is introduced as a new input feature to model wind power prediction for each wind farm in the adjacent region under each category of weather processes. Reference [22] combines the high computing power of deep neural networks (DNNs) with the improved generalization performance of MTL to design separate output layers for each task and provide them with shared representations to solve the related problems of wind ramp event prediction in wind farms. Different variants of MTL have also been developed, such as Multi-gate Mixture of Experts (MMoE). Reference [23] combines WaveNet, a special type of Convolutional Neural Network, with the architecture of MMoE, aiming to overcome inherent limitations by efficiently capturing and exploiting complex patterns and trends in time series data. However, the MTL models used in the current research have, more or less, the inherent negative transfer problem and the “see-saw” problem; that is, when the correlation between different tasks is weak, the joint training effect is not good, or the effect of one task is sacrificed to improve the effect of another task.

Data samples of extreme weather are scarce, which makes the modelling and training of machine learning models very difficult. Therefore, it is necessary to generate enough sample data through data generation methods to establish classification models effectively. Generative adversarial networks (GANs) are powerful generative models that generate data through the adversarial training of generators and discriminators. GANs can be used to generate images, text, and other data types, for example handwritten digit images or text data [24,25,26]. GANs are also widely used in extreme event analysis of new energy plants. Reference [27] applies a GAN to wind power ramp prediction. It takes historical ramp data and simulated feature quantities as input, generates a large amount of simulated ramp data with features similar to the historical ramp data through the adversarial training of the generator and discriminator, and expands the ramp dataset. Reference [28] proposes an extreme scenario generation framework based on a conditional generative adversarial network (CGAN) for power system scheduling and planning with uncertain wind power. However, traditional GANs require a well-designed training method when applied; otherwise, the output may not be ideal due to the freedom of the neural network model, and GANs may suffer from problems such as unstable convergence, mode collapse, and vanishing gradients during training due to their own characteristics.

In summary, the following problems remain in the classification and identification of high-impact extreme weather processes: (a) The current research on extreme weather is mainly oriented towards the impact on the new power system as a whole; the current mainstream classification of extreme weather does not consider in depth its impact on new energy output, and there is no targeted analysis, identification, and classification. (b) There is a lack of a specific methodology for defining extreme weather processes that takes into account the impact on new energy output, owing to the lack of fully harmonised standards for meteorological data thresholds for extreme weather. (c) Most of the current common classification models treat different extreme weather scenarios separately and output classification results separately, without taking into account the intrinsic connection of various meteorological factors between different types of extreme weather. (d) Traditional MTL models have some inherent flaws, such as the negative transfer problem and the “see-saw” problem, which can lead to unbalanced training results when applied to classification tasks. (e) Traditional GANs suffer from unstable convergence, mode collapse, and vanishing gradients. In order to solve the above problems, this study proposes a method for defining and classifying high-impact weather processes: the sensitive factors affecting new energy output are analysed from the two perspectives of abnormal output and abnormal prediction error to define high-impact weather processes and divide the standard set, and a classification and identification model for high-impact weather processes is then constructed based on a variant of the MTL model, Progressive Layered Extraction (PLE). 
The PLE is based on a standard set to uncover possible intrinsic links in the numerical weather prediction (NWP) data for different extremes of weather in order to provide more accurate warning information while using the Wasserstein generative adversarial network (WGAN) for data generation on data defined as extreme weather to expand the training set of the model. The main contributions of this study are as follows:

A classification method of extreme weather processes, considering the sensitive factors of new energy high-impact weather is proposed, which analyses the sensitive factors affecting new energy output through the two perspectives of abnormal output situation and abnormal prediction error, determines the threshold value of meteorological data at the time of occurrence of extreme weather, defines the high-impact weather processes, and divides the standard set;
A PLE-based classification model for the identification of high-impact weather processes is constructed that is able to consider the correlation between tasks better than common classification models and at the same time overcomes the inherent shortcomings of traditional MTL models. It also incorporates a data generation method using WGAN to expand the training set of the classification model for the identification of high-impact weather processes and explores the possible intrinsic connections in the NWP data of different extreme weather events to provide more accurate warning information.

The remaining parts of this paper include the following: Section 2 introduces the overall framework of the proposed methods in this study and the method principles of each part. In Section 3, the proposed method is applied to analyse and discuss the specific cases of a wind farm cluster and PV farm cluster. Section 4 summarizes the contents of this study.

2. Study Method

2.1. Overall Framework

This study proposes a classification method of extreme weather processes based on the PLE model, considering the weather-sensitive factors with a high impact on new energy. The overall flow diagram of the method is shown in Figure 1:

Firstly, starting with the historical power and NWP data of wind farm cluster and PV farm cluster, the sensitive factors that affect wind farm output and PV farm output are analysed, respectively, from the perspectives of abnormal output and abnormal prediction error, the high-impact weather processes of the two new energy power plants are defined, and the time point where high-impact weather occurs is labelled with corresponding weather.
Then, a high-impact weather process identification and classification model based on PLE is constructed. Historical NWP data, high-impact weather labels, and data generated by WGAN are used to train the model.
Finally, the future NWP data are used as the input of the model to identify the high-impact weather processes and obtain the predicted classification results.

2.2. Analysis of Sensitive Weather Factors

Since wind farms and PV farms account for a heavy proportion of new energy generation and are more sensitive to weather factors, and the impact of extreme weather on these two types of farms is more pronounced and more serious, this paper studies only these two clusters of farms.

For wind farms, the influencing factors include wind speed, wind direction, temperature, air pressure, and humidity; the weather factors that can seriously affect the output are mainly wind speed and temperature. Wind speed is the most important factor affecting the power output of wind farms. The power output of a wind turbine is proportional to the cube of the wind speed:

P = \frac{1}{2} ρ A v^{3} C_{p}

(1)

where P is the power, ρ is the air density, A is the swept area of the wind turbine, v is the wind speed, and C_p is the power coefficient of the wind turbine (usually between 0.35 and 0.45).

Temperature affects the power output of wind turbines mainly by affecting the air density ρ. Air density is inversely proportional to temperature; the higher the temperature, the lower the air density, and the smaller the power output.

ρ = \frac{P_{atm}}{R T}

(2)

where P_atm is the atmospheric pressure, R is the gas constant, and T is the absolute temperature.
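As a hedged numerical illustration of Equations (1) and (2), the following Python sketch computes the air density from temperature and then the turbine power. The specific gas constant of dry air (287.05 J/(kg·K)), the 50 m blade length, and the default C_p of 0.40 are illustrative assumptions, not values taken from this study.

```python
import math

R_DRY_AIR = 287.05  # J/(kg*K); assumed specific gas constant of dry air


def air_density(p_atm_pa, temp_k, r=R_DRY_AIR):
    """Equation (2): rho = P_atm / (R * T), in kg/m^3."""
    return p_atm_pa / (r * temp_k)


def wind_power(rho, swept_area_m2, wind_speed_ms, cp=0.40):
    """Equation (1): P = 0.5 * rho * A * v^3 * C_p, in watts."""
    return 0.5 * rho * swept_area_m2 * wind_speed_ms ** 3 * cp


# Hypothetical turbine with 50 m blades at standard sea-level pressure.
area = math.pi * 50.0 ** 2
rho_cold = air_density(101325.0, 263.15)  # -10 degrees C
rho_hot = air_density(101325.0, 308.15)   # +35 degrees C

# Same wind speed, different temperatures: colder air is denser.
p_cold = wind_power(rho_cold, area, 8.0)
p_hot = wind_power(rho_hot, area, 8.0)
print(f"cold: {p_cold / 1e6:.2f} MW, hot: {p_hot / 1e6:.2f} MW")
```

Under these assumptions, the same wind speed yields more power in cold air than in hot air, and doubling the wind speed multiplies the power by eight, which is why wind speed dominates the sensitivity analysis.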

For PV farms, the most critical factor is the intensity of solar radiation; that is, the higher the intensity of solar radiation per unit time, the greater the output power of PV panels.

P = I \times A \times η

(3)

where I is the solar radiation intensity, A is the area of the PV panel, and η is the efficiency of the PV panel.

Under clear weather conditions, the solar radiation intensity is higher, and the power generation of PV power plants will increase accordingly. Therefore, the output of PV power plants will also be affected by weather that can directly affect the intensity of solar radiation, such as rain, snow (collectively referred to as precipitation), and other weather.

In addition, drastic changes in temperature will also affect the output of PV power plants by affecting the efficiency of the PV power plants. In practical applications, the impact of dust and pollution and system losses may also be considered, and the actual PV power can be calculated by the revised formula:

P_{true} = P \times K_{T} \times K_{dirt} \times K_{loss}

(4)

where P_true is the actual PV power, K_T is the temperature correction factor, K_dirt is the dust pollution correction factor, and K_loss is the system loss factor.
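Equations (3) and (4) can be illustrated with a short Python sketch. All numerical values below (irradiance, panel area, efficiency, and the three correction factors) are hypothetical and chosen only to show the arithmetic.

```python
def pv_power(irradiance_w_m2, area_m2, efficiency):
    """Equation (3): P = I * A * eta, in watts."""
    return irradiance_w_m2 * area_m2 * efficiency


def pv_power_corrected(p_ideal, k_t=1.0, k_dirt=1.0, k_loss=1.0):
    """Equation (4): P_true = P * K_T * K_dirt * K_loss."""
    return p_ideal * k_t * k_dirt * k_loss


# Hypothetical clear-sky case: 800 W/m^2 on 100 m^2 of 20 %-efficient panels.
p_ideal = pv_power(800.0, 100.0, 0.20)

# Hypothetical correction factors for temperature, dust, and system losses.
p_actual = pv_power_corrected(p_ideal, k_t=0.95, k_dirt=0.97, k_loss=0.90)

# Hypothetical heavy-precipitation case: irradiance drops to 100 W/m^2.
p_rain = pv_power_corrected(pv_power(100.0, 100.0, 0.20), 0.95, 0.97, 0.90)
print(p_ideal, p_actual, p_rain)
```

Because the output scales linearly with irradiance, the precipitation case produces one eighth of the clear-sky power in this example, while the correction factors only rescale the result.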

2.3. High-Impact Weather Process Definition Method

The definition method of high-impact weather process on new energy can be considered from two aspects; one is the abnormal power prediction results of new energy power plants, and the other is the abnormal output situation of new energy power plants.

Setting aside defects of the power prediction model itself, abnormal prediction results are usually caused by external meteorological factors. For example, icing of wind turbine blades during winter cold waves often causes the predicted power to deviate far from the actual power, resulting in large prediction errors. Therefore, time points with large power prediction errors in the dataset can be extracted. Assuming that the sensitive weather factor data corresponding to these time points follow a normal distribution, the mean value and standard deviation are calculated, and the values that deviate from the mean by 1–2 standard deviations are defined as the thresholds of the meteorological factors for high-impact weather processes of new energy. The time points at which the meteorological data exceed the threshold are defined as the time points at which a new energy high-impact weather process occurs.

d_{avg} - (1 \sim 2) d_{sd} \leq d \leq d_{avg} + (1 \sim 2) d_{sd}

(5)

where d is the data value, d_avg is the mean value of the data, and d_sd is the standard deviation of the data.

However, the effects of certain high-impact weather processes are already reflected in the NWP data; in this case, both the actual and the predicted power drop sharply, while the power prediction error changes little. For example, for PV farms, heavy-precipitation weather events are usually accompanied by low irradiation intensity, which lowers the power predicted by models that use irradiation intensity as an input; together with the pronounced diurnal periodicity of PV output, this makes the change in the power prediction error very insignificant. Therefore, another definition method directly considers the time points at which new energy farms have abnormal output and treats all of them as possible high-impact weather processes. The subsequent procedure is the same as before: the sensitive weather factor data corresponding to these time points are assumed to follow a normal distribution, their mean value and standard deviation are calculated, the values that deviate from the mean by 1–2 standard deviations are defined as the meteorological factor thresholds for new energy high-impact weather processes, and the time points at which the meteorological data exceed these thresholds are defined as the time points where a new energy high-impact weather process occurs.
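The labelling procedure described in this section can be sketched as follows. This is one possible reading of the text, not the authors' implementation: the function names, the error limit of 0.2, the choice of the population standard deviation, and the `direction` switch (upper edge for factors such as wind speed, lower edge for factors such as temperature in cold waves) are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def anomalous_factor_values(errors, factor, error_limit):
    """Keep the weather-factor values at time points with large prediction error."""
    return [f for e, f in zip(errors, factor) if abs(e) > error_limit]


def sigma_band(values, k=1.0):
    """Equation (5): (d_avg - k * d_sd, d_avg + k * d_sd), with k between 1 and 2."""
    avg, sd = mean(values), pstdev(values)
    return avg - k * sd, avg + k * sd


def label_high_impact(factor, band, direction="high"):
    """Flag time points whose factor value lies beyond the chosen band edge."""
    lower, upper = band
    if direction == "high":
        return [d > upper for d in factor]
    return [d < lower for d in factor]


# Hypothetical normalised prediction errors and NWP wind speeds (m/s).
errors = [0.02, 0.30, 0.01, 0.35, 0.40, 0.03]
wind_speed = [5.0, 18.0, 6.0, 20.0, 22.0, 7.0]

subset = anomalous_factor_values(errors, wind_speed, error_limit=0.2)
band = sigma_band(subset, k=1.0)
labels = label_high_impact(wind_speed, band, direction="high")
print(subset, band, labels)
```

The same three functions apply to the abnormal-output perspective by replacing the prediction errors with a measure of output abnormality when selecting the time points.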

2.4. Progressive Layered Extraction Classification Model

Progressive Layered Extraction (PLE) is a type of multi-task learning (MTL) method of machine learning as opposed to a single-task learning method. MTL allows multiple tasks to be learnt in parallel at the same time, and the results influence each other. Its most basic structure is shown in Figure 2.

Assuming a shared feature extraction layer f and multiple task-specific upper task towers t_i, the model can be represented as follows:

y_{i} = t_{i} [f (x; θ_{f}); θ_{ti}]

(6)

where y_i is the output of the ith task, x is the input data, θ_f is the parameters of the shared feature extraction layer, and θ_ti is the parameters of the ith task tower. When the parameters are shared between the tasks through the weight matrix, there is the following equation:

θ_{ti} = W θ_{f}

(7)

However, MTL suffers from the negative transfer problem: when tasks are weakly correlated or even conflict with each other, joint training may lead to performance degradation. In order to reduce the effect of negative transfer, the Mixture of Experts (MoE) model and the MMoE model were proposed. Their core idea is to incorporate Gated Networks on top of parameter sharing. In this way, the Gated Network for each task can learn which Expert to select for prediction based on sample information. The structure of the MMoE is shown in Figure 3.

The Gated Network output of the kth subtask can be expressed as

g^{k} (x) = \mathrm{softmax} (W_{g}^{k} x),

(8)

where W_g^k is the weight matrix.

The output of task k can be expressed as

y^{k} (x) = t^{k} [f^{k} (x)],

(9)

where f^k(x) is the Expert output weighted by the Gated Network:

f^{k} (x) = \sum_{i = 1}^{n} g_{i}^{k} (x) f_{i} (x),

(10)

where f_i(x) denotes the output of the ith Expert Network.
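A minimal Python sketch of Equations (8)–(10) is given below. The two Experts are hypothetical hand-written functions and the tower simply sums its input; in a real model both would be trained neural networks.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mmoe_task_output(x, gate_weights, experts, tower):
    gate = softmax(matvec(gate_weights, x))      # Eq. (8): g^k(x), one weight per Expert
    outputs = [expert(x) for expert in experts]  # f_i(x) for every shared Expert
    dim = len(outputs[0])
    fused = [sum(g * out[d] for g, out in zip(gate, outputs)) for d in range(dim)]  # Eq. (10)
    return tower(fused)                          # Eq. (9): y^k(x) = t^k[f^k(x)]


# Hypothetical Experts and tower.
experts = [lambda v: [v[0] + v[1]], lambda v: [v[0] - v[1]]]
tower = sum

x = [2.0, 1.0]
y_even = mmoe_task_output(x, [[0.0, 0.0], [0.0, 0.0]], experts, tower)    # equal gate weights
y_first = mmoe_task_output(x, [[10.0, 0.0], [0.0, 0.0]], experts, tower)  # gate favours Expert 1
print(y_even, y_first)
```

With zero gate weights the two Expert outputs are averaged, while a large score for the first Expert makes the task output follow that Expert almost exclusively.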

However, in MMoE all Expert parameters are shared across all tasks, and no private parameters are defined for individual tasks. When the relationship between tasks is weak, this may lead to a “see-saw” phenomenon: compared with multiple single-task learning models, the performance of some tasks improves at the expense of others.

In order to solve the “see-saw” phenomenon of traditional MTL and MMoE, the Customised Gate Control (CGC) model and the PLE model were proposed.

CGC is the base network of PLE, and its structure is shown in Figure 4. Different tasks in MMoE share the same Experts, after which the Gated Network corresponding to each task is used to integrate the outputs of the Experts, and the final input to the tower of each task is drawn from the set of all Expert Networks. There is no distinction between task-specific Expert Networks and Experts shared by multiple tasks, which may prevent the model from capturing more complex relationships between tasks, thus introducing noise to some tasks. CGC divides Experts into two kinds: one is a Specific Expert related to a specific task, and the other is a Shared Expert shared by all tasks. A Specific Expert accepts only the tower gradient of its corresponding task to update its parameters, while the parameters of a Shared Expert are updated by the results of all tasks. This allows different types of Experts to focus on learning different knowledge more efficiently and to avoid unnecessary interactions, reducing the interference of parameter sharing between Experts and effectively mitigating the negative transfer and “see-saw” phenomena. In addition, owing to the dynamic fusion of inputs by the Gated Network, the CGC can more flexibly find a balance between different subtasks and better deal with inter-task conflicts and sample correlation problems.

At this point, the output g^k(x) of the Gated Network for the kth task becomes:

g^{k} (x) = w^{k} (x) S^{k} (x)

(11)

w^{k} (x) = \mathrm{softmax} (W_{g}^{k} x)

(12)

S^{k} (x) = {[E_{(i, 1)}^{T}, E_{(i, 2)}^{T}, \dots, E_{(i, m)}^{T}, E_{(s, 1)}^{T}, E_{(s, 2)}^{T}, \dots, E_{(s, m)}^{T}]}^{T}

(13)

where S^k(x) is a selection matrix that connects the Specific Experts and Shared Experts of the kth task.

The final output of the kth subtask is:

y^{k} (x) = t^{k} [g^{k} (x)]

(14)
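Equations (11)–(14) can be sketched in Python as follows. The task names `wind` and `pv`, the hand-written Experts, and the summing tower are hypothetical and serve only to show how each task gate fuses its own Specific Experts with the Shared Experts.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cgc_task_output(x, gate_weights, specific_experts, shared_experts, tower):
    # Eq. (13): rows of S^k(x) are the Specific Expert outputs of task k
    # followed by the Shared Expert outputs.
    selected = [e(x) for e in specific_experts] + [e(x) for e in shared_experts]
    w = softmax(matvec(gate_weights, x))  # Eq. (12)
    dim = len(selected[0])
    fused = [sum(wi * row[d] for wi, row in zip(w, selected)) for d in range(dim)]  # Eq. (11)
    return tower(fused)                   # Eq. (14)


# Hypothetical Experts: one per task plus one shared by both tasks.
wind_expert = lambda v: [v[0], 0.0]
pv_expert = lambda v: [0.0, v[1]]
shared_expert = lambda v: [v[0] + v[1], v[0] - v[1]]

x = [1.0, 2.0]
zero_gate = [[0.0, 0.0], [0.0, 0.0]]  # equal weight for the two selected Experts
y_wind = cgc_task_output(x, zero_gate, [wind_expert], [shared_expert], sum)
y_pv = cgc_task_output(x, zero_gate, [pv_expert], [shared_expert], sum)
print(y_wind, y_pv)
```

Note that the wind task never sees the output of `pv_expert` and vice versa; only the Shared Expert contributes to both tasks, which is the separation that distinguishes CGC from MMoE.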

The PLE can be thought of as consisting of multiple layers of CGCs connected together, and its structure is shown in Figure 5. PLE further classifies Gated Networks into a Task-Specific Gate and a Shared Gate as well. PLE consists of multiple Extraction Networks, and each Extraction Network is the CGC network layer. The inputs to the first Extraction Network layer are the inputs to the native model, and each Extraction Network layer, except the last one, contains only the Experts Network and the Gate Network, but there is a unique Shared Gate Network, which is used to aggregate the information from all the Experts in this layer and provide it to the next layer for use. The last layer of the Extraction Network contains the tower network corresponding to each task and the output layer. The structure of the multilayer CGC also provides for interaction between upper-level Experts and lower-level Experts, enabling the extraction of deeper information.

When PLE is used as a classification task model, the output of the Gated Network in the jth Extraction Network for the kth classification subtask is as follows:

g^{k, j} (x) = w^{k, j} [g^{k, j-1} (x)] S^{k} (x)

(15)

After computing all the Gated Networks and Experts, the final output of the kth classification subtask of PLE is:

y^{k} (x) = t^{k} [g^{k, N} (x)]

(16)

At this point, y^k(x) denotes the model’s prediction probability for the kth category, but since the outputs of classification tasks are usually integer labels, it is usually necessary to add a layer of softmax function to the output layer as well.
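The stacking described by Equations (15) and (16) can be sketched as a forward pass through two Extraction Network layers. Everything below is an illustrative assumption rather than the configuration used in this study: the weights are random and untrained, each layer has one Specific Expert per task and one Shared Expert, the Experts are single ReLU layers, and the tasks `wind` and `pv` each distinguish three hypothetical weather classes.

```python
import math
import random


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def make_expert(in_dim, out_dim, rng):
    """Single ReLU layer standing in for an Expert network."""
    weights = rand_matrix(out_dim, in_dim, rng)
    return lambda v: [max(0.0, s) for s in matvec(weights, v)]


def fuse(weights, rows):
    """Gate-weighted sum of Expert outputs."""
    return [sum(w * row[d] for w, row in zip(weights, rows)) for d in range(len(rows[0]))]


def make_layer(tasks, in_dim, hidden, rng, last):
    layer = {
        "specific": {k: [make_expert(in_dim, hidden, rng)] for k in tasks},
        "shared": [make_expert(in_dim, hidden, rng)],
        # Each task gate weighs its own Expert plus the Shared Expert.
        "gates": {k: rand_matrix(2, in_dim, rng) for k in tasks},
    }
    if not last:
        # The Shared Gate weighs every Expert in the layer.
        layer["shared_gate"] = rand_matrix(len(tasks) + 1, in_dim, rng)
    return layer


def extraction_layer(task_in, shared_in, layer):
    shared_out = [e(shared_in) for e in layer["shared"]]
    specific_out = {k: [e(task_in[k]) for e in layer["specific"][k]] for k in task_in}
    task_next = {}
    for k in task_in:
        # Eq. (15): the gate of level j is driven by the level j-1 output.
        w = softmax(matvec(layer["gates"][k], task_in[k]))
        task_next[k] = fuse(w, specific_out[k] + shared_out)
    shared_next = None
    if "shared_gate" in layer:
        every = [o for k in task_in for o in specific_out[k]] + shared_out
        shared_next = fuse(softmax(matvec(layer["shared_gate"], shared_in)), every)
    return task_next, shared_next


def ple_forward(x, layers, towers):
    task_in = {k: x for k in towers}
    shared_in = x
    for layer in layers:
        task_in, shared_in = extraction_layer(task_in, shared_in, layer)
    # Eq. (16) followed by softmax; the label is the most probable class.
    probs = {k: softmax(matvec(towers[k], task_in[k])) for k in towers}
    labels = {k: max(range(len(p)), key=p.__getitem__) for k, p in probs.items()}
    return probs, labels


rng = random.Random(0)
tasks = ["wind", "pv"]
layers = [make_layer(tasks, 3, 4, rng, last=False), make_layer(tasks, 4, 4, rng, last=True)]
towers = {k: rand_matrix(3, 4, rng) for k in tasks}
probs, labels = ple_forward([0.2, -0.5, 1.0], layers, towers)
print(probs, labels)
```

The integer label is obtained by taking the class with the highest softmax probability; the first layer has a Shared Gate that feeds the Shared Expert of the second layer, while the last layer only keeps the task gates and towers.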

In summary, compared with the traditional classification model, PLE is able to comprehensively consider the correlation between meteorological factors and deeply mine the correlation information in the NWP data. Compared with multiple independent classification models run in parallel, it reduces the model training time and improves computational efficiency. Compared with the traditional MTL model, the introduced Gated Network is able to change the weights between individual Experts, thus effectively mitigating the negative transfer problem. Compared with the MMoE model, the introduction of Shared Experts and the Shared Gated Network enables the task-specific Experts to focus on learning the information of their own tasks, avoiding the interference of the shared parameters; the structure of multiple feature extraction layers also facilitates the mining of deeper information. Therefore, PLE has unique advantages in extreme weather classification tasks.

The working process of the PLE model can be summarised as follows: the input data (e.g., NWP data) is first passed to multiple Expert modules for feature extraction, and each Expert module contains multiple sub-networks, which are responsible for learning the shared and task-specific modes, respectively; after that, the Gated Network of each task dynamically calculates the weights according to the input data and selectively fuses the information from the shared and task-specific Experts. Then, the data are passed into the next feature extraction layer, and in the multilayer structure, the output of each layer is used as the input of the next layer to gradually extract deeper information. Finally, the extracted information is passed into the respective task tower network of each task for the final prediction.

Before applying the PLE model for extreme weather classification, the model needs to be trained first. Firstly, the dataset is divided into a training set and a test set, and the historical NWP data of the training set and its corresponding weather labels are fed into the input layer of the model. After that, the model is forward propagated according to the above work process; i.e., the input data are passed through the various layers of the model’s Expert and Gated Networks, and the final output of the predicted values is made. The error between the model output values and the true labels is then calculated by the defined loss function; then, the gradients of the parameters of each layer are calculated and the model parameters are updated using the optimiser, i.e., backpropagation. The above steps are repeated until the model converges or reaches the preset number of training rounds. After this, the NWP data from the test set are then fed into the trained PLE model to obtain the predicted classification results.

2.5. WGAN Data Generation

Data generation is an important technique in machine learning, especially when real data are scarce or restricted by privacy. Common data generation methods include random data generation, synthetic data generation, variational autoencoders, diffusion models, and generative adversarial networks. Among them, the GAN is one of the most innovative techniques in machine learning in recent years; it learns complex data distributions and generates high-quality synthetic data by training a generator and a discriminator against each other, and its structure is shown in Figure 6.

However, traditional GAN models also have some limitations, such as unstable training and mode collapse. CGANs have therefore been proposed, which build on standard GANs by allowing the models to generate samples conditioned on additional information. For a given generator G, the optimisation objective of the discriminator D is to maximise the value function V(G,D), while the generator seeks to minimise it. CGAN concatenates additional information y to the inputs of the original generator and discriminator. The objective function is as follows:

\min_{G} \max_{D} V (G, D) = E_{x \sim p_{data} (x)} [\log D (x | y)] + E_{z \sim p_{z} (z)} [\log (1 - D (G (z | y)))]

(17)

where p_z(z) is the prior distribution of the random noise z fed to the generator, p_data(x) is the data distribution to be learnt, from which the real samples x are drawn, E denotes the expectation, and G(z|y) is the sample generated from z under the condition y.

However, CGAN still suffers from the vanishing gradient problem, for which a new generative adversarial network based on the Wasserstein distance, WGAN, was proposed. WGAN restricts the discriminator to be Lipschitz continuous. A Lipschitz continuous function has a bounded rate of change: the absolute value of its slope must not exceed a real number called the Lipschitz constant. This restricts the slope of the discriminator, slows the learning of the discriminator, and alleviates the vanishing gradient problem of the generator to a certain extent. At this point, the value function becomes:

V_{WGAN}(G, D) = E_{x \sim p_{data}(x)} [\log D(x)] - E_{z \sim p_{z}(z)} [\log D(G(z))]

(18)

The generator maps random noise vectors to samples that resemble the real data; its structure usually consists of several fully connected or transposed convolutional layers that gradually expand the low-dimensional noise into the high-dimensional data space. The discriminator distinguishes real samples from generated ones; its structure usually consists of several fully connected or convolutional layers that gradually compress the input into a scalar output. Before data generation, WGAN is initialised, i.e., the parameters of the generator and discriminator are randomly initialised. A training loop then alternates between updating the discriminator and updating the generator. First, with the generator fixed, the losses on the real and generated data are calculated and the discriminator parameters are updated according to the Wasserstein distance; the discriminator weights are then clipped to satisfy the Lipschitz constraint. Next, with the discriminator fixed, the generator is updated by minimising the loss on the generated data. This cycle is repeated until a preset number of training iterations is reached. To validate the generated data, the generated sequences are labelled according to the method defined in Section 2.3 to check whether they also meet the criteria for extreme weather processes.
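The alternating procedure described above can be illustrated with a deliberately small sketch. It is not the network used in this study: the critic is assumed to be a one-parameter linear function D(x) = w·x, the generator a pure shift G(z) = z + c, and all names and hyperparameters (`train_toy_wgan`, `n_critic`, `batch`, `lr`, `clip`) are hypothetical. The critic scores are used directly, as in the usual WGAN formulation. With a linear critic, only the mean of the generated data can be matched, so the sketch shows the loop structure and the weight clipping rather than a usable generator.

```python
import random


def train_toy_wgan(real, steps=1500, n_critic=5, batch=16,
                   lr=0.05, clip=0.1, seed=0):
    """Toy 1-D WGAN with weight clipping (illustrative assumptions only).

    Critic:    D(x) = w * x      (a single clipped weight)
    Generator: G(z) = z + c      (shifts standard normal noise)
    """
    rng = random.Random(seed)
    # Random initialisation of critic and generator parameters.
    w = rng.uniform(-clip, clip)
    c = rng.uniform(-1.0, 1.0)

    for _ in range(steps):
        # Critic phase: the generator (c) is held fixed.
        for _ in range(n_critic):
            x_real = [rng.choice(real) for _ in range(batch)]
            x_fake = [rng.gauss(0.0, 1.0) + c for _ in range(batch)]
            # The critic maximises mean D(real) - mean D(fake).
            # For D(x) = w * x the gradient with respect to w is:
            grad_w = sum(x_real) / batch - sum(x_fake) / batch
            w += lr * grad_w
            # Weight clipping keeps the critic Lipschitz-bounded.
            w = max(-clip, min(clip, w))

        # Generator phase: the critic (w) is held fixed.
        # The generator minimises -mean D(G(z)) = -w * (mean(z) + c),
        # whose gradient with respect to c is -w.
        c += lr * w

    return w, c
```

In this toy setting the shift `c` drifts towards the mean of the real samples and then hovers around it, while `w` never leaves the interval [-clip, clip].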

3. Case Study

The actual datasets used in this study are those of a wind farm cluster and a PV farm cluster in Jiangxi Province, China. The geographic range covers most of Jiangxi Province, and the latitude and longitude ranges of the NWP data are 24.0° N to 31.0° N and 113.0° E to 118.0° E. The time span of the wind farm dataset is from 8:00 a.m. on 1 January 2021 to 0:00 a.m. on 1 January 2024, with a time-sampling interval of 1 h and a total of 67 stations and 26,272 sets of data. The dataset contains the historical power data of the wind farms as well as the NWP data of the wind speed, temperature, and dewpoint temperature. The time span of the PV farm dataset is from 0:00 a.m. on 2 January 2021 to 0:00 a.m. on 1 January 2024, with a time-sampling interval of 1 h and a total of 57 stations and 26,256 sets of data. The dataset contains the historical power data of the PV farms as well as the NWP data of the surface short-wave (solar) radiation, surface long-wave (thermal) radiation, temperature, and dewpoint temperature. The statistical properties of the datasets, as well as their distributions, are shown in Appendix A.

To quantitatively assess the classification performance of the proposed model, four commonly used classification evaluation metrics are selected in this study. (a) Accuracy is the proportion of all samples that the model classifies correctly. It measures the overall correctness of the model and is best suited to datasets with balanced categories. (b) Precision is the proportion of samples predicted to be positive that are actually positive. It measures how reliable the positive predictions of the model are. (c) Recall is the proportion of actually positive samples that are correctly predicted to be positive. It measures the ability of the model to capture positive cases. (d) F1-Score is the harmonic mean of precision and recall. It balances the two and is applicable when both are important. The formulae for the four evaluation indicators are shown below:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

(19)

Precision = \frac{TP}{TP + FP}

(20)

Recall = \frac{TP}{TP + FN}

(21)

F1\text{-}Score = 2 \times \frac{Precision \times Recall}{Precision + Recall}

(22)

True Positive (TP) denotes the number of samples correctly predicted by the model to be in the positive category. True Negative (TN) indicates the number of samples that the model correctly predicts as negative classes. False Positive (FP) indicates the number of samples where the model incorrectly predicts negative samples as positive samples, also known as “Type I error”, which can lead to unnecessary resource investment if the model frequently predicts non-extreme weather as extreme. False Negative (FN) indicates the number of positive samples that the model incorrectly predicts as negative, also known as “Type II error”. If the model frequently predicts extreme weather as non-extreme, it may result in extreme weather events being overlooked, thus preventing timely response measures, which may lead to serious situations such as damage to power generation equipment. For new energy power plants, the consequences of “Type II errors” are often more serious than those of “Type I errors”, so the recall rate is slightly more important than the precision rate for the classification model of extreme weather recognition for new energy power plants.
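As a minimal sketch of Equations (19) to (22), the four metrics can be computed from binary label sequences as follows. The function names `confusion_counts` and `classification_metrics` are hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience, not a convention stated in this study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP and FN for binary labels (1 = extreme, 0 = ordinary)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score as in Equations (19)-(22)."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

On a heavily imbalanced test set, a classifier that never predicts extreme weather can still reach a high accuracy while its recall is zero, which is why recall is tracked separately here.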

3.1. Analysis of Sensitivity Factors

3.1.1. Analysis of Wind Farm Sensitivity Factors

Taking June and December 2023 as examples, the preliminary prediction results obtained from the power prediction model using 100 m wind speed as an input feature are shown in Figure 7 and Figure 8, where the trend of wind speed is basically the same as the trend of predicted power.

As can be seen from the figures, situations with a wide range of large errors are usually those with high and drastic changes in wind speeds, which may also be accompanied by sudden rises and falls in temperature or sustained high and low temperatures. In addition, cold weather, which has a large impact on output, is usually characterised by a sudden drop in temperature and a sudden increase in wind speed, so there are fewer instances of large errors at high temperatures than at low temperatures, and the impact on the prediction results is relatively small. The wind speed and temperature are then analysed separately.

1. Wind speed:

The normalised value of the Root-Mean-Square Error (RMSE) is used as an indicator for evaluating the error of the power prediction results, which is calculated as follows:

NRMSE = \frac{1}{cap} \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(23)

where cap is the total installed capacity, n is the number of samples, y_i is the true value of the ith observation, and \hat{y}_i is the predicted value of the ith observation.
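Equation (23) translates directly into a few lines of code. The sketch below is illustrative only; the function name `nrmse` and the input checks are assumptions, and the power values and capacity must share the same unit (for example MW).

```python
import math


def nrmse(actual, predicted, cap):
    """Normalised RMSE: RMSE of the power prediction divided by capacity."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if cap <= 0:
        raise ValueError("installed capacity must be positive")
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / cap
```

For instance, actual powers of 100, 200 and 300 MW predicted as 110, 190 and 320 MW on a 1000 MW cluster give an RMSE of about 14.14 MW, i.e., an NRMSE of about 1.41%.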

The monthly average wind speeds for each month from 2021 to 2023, the average wind speeds at the time points where the power prediction error is in the top 15%, and the average wind speeds at the time points where the power prediction error is in the bottom 15% are calculated and compared, and the comparison results are shown in Figure 9, Figure 10 and Figure 11.

As can be seen from the figures, the mean wind speeds at the time points where large errors occur in each month are larger than the monthly mean wind speeds. When this difference is large, the prediction error for that month is also larger (e.g., January and November 2021, February 2022, and December 2023), from which it can be roughly inferred that gale-force winds may have occurred in that month. Mean wind speeds at time points with lower errors are also relatively low.

2. Temperature:

In the same way as the previous analysis, the monthly mean temperature of each month from 2021 to 2023, the mean temperature at the time points where the power prediction error is in the top 15%, and the mean temperature at the time points where the power prediction error is in the bottom 15% are calculated and compared, and the results of the comparison are shown in Figure 12, Figure 13 and Figure 14.

The closer the mean temperature at the time points in the top 15% of the power prediction error is to the monthly mean temperature, the lower the prediction error for that month (e.g., August 2023), suggesting that there are fewer sudden temperature changes or anomalies in that month. The further it deviates from the monthly mean temperature, the higher the prediction error for that month, suggesting that there may be a noticeable sudden change or anomaly in temperature in that month (e.g., a sudden temperature rise followed by a sudden drop and then sustained cold in December 2023).
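The comparison applied above to wind speed and temperature can be sketched as a single helper that, for one month, averages a meteorological feature over all time points, over the points with the largest prediction errors, and over the points with the smallest. The name `mean_by_error_group` and the rounding rule used to turn 15% into a number of points are assumptions; the study does not state how ties or fractional counts are handled.

```python
def mean_by_error_group(abs_errors, feature, fraction=0.15):
    """Mean of `feature` overall and at the top/bottom `fraction` of errors.

    abs_errors: absolute power prediction error at each time point
    feature:    wind speed or temperature at the same time points
    """
    if len(abs_errors) != len(feature) or not abs_errors:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(abs_errors)
    k = max(1, round(fraction * n))  # number of points in each group (assumed rule)
    order = sorted(range(n), key=lambda i: abs_errors[i])  # ascending error
    bottom_idx, top_idx = order[:k], order[-k:]

    def mean_at(indices):
        return sum(feature[i] for i in indices) / len(indices)

    return {
        "all": sum(feature) / n,
        "top": mean_at(top_idx),        # largest errors
        "bottom": mean_at(bottom_idx),  # smallest errors
    }
```

A month in which `top` lies far from `all` would, following the reasoning above, be a candidate for gale-force winds or abrupt temperature changes.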

The prediction errors for winter months are usually relatively large, and combined with the analysis of the effects of wind speed and temperature, it can be roughly inferred that cold-wave weather is likely to be present in that month (e.g., January 2021, December 2023).

3.1.2. Analysis of PV Farm Sensitivity Factors

The monthly trends in the actual power output of the PV farms are analysed against the trends in temperature and total precipitation, respectively.

1. Temperature:

Taking June and December 2023 as examples, the actual power trend of the PV farms and the temperature trend are shown in Figure 15 and Figure 16.

Similarly to wind farms, sudden temperature rises and drops or sustained high and low temperatures can affect the output of PV farms. The effects of sudden temperature drops and persistent low temperatures are particularly pronounced.

2. Total precipitation:

Taking June and December 2023 as examples, the trends of the actual power of the PV farms and the total precipitation are shown in Figure 17 and Figure 18.

As can be seen from the figures, precipitation also affects the output of the PV farms: the greater the total precipitation, the more pronounced the effect. In addition, snowfall can occur during the winter months, and when it does, the PV farms exhibit low output over a longer time span.

3.2. New Energy High-Impact Weather Processes Defined

Taking into account the above analyses of high-impact weather processes in wind farms and PV farms, the corresponding conclusions can be drawn: Any large errors in predicted power or drastic changes in actual power that occur during the year may be accompanied by a variety of corresponding extreme weather events (e.g., high winds, cold waves, sudden changes in temperature, sustained low temperatures, sustained high temperatures, heavy precipitation, etc.). Therefore, this study proposes an inverse determination method of extreme weather that integrates the theoretical and actual values of output power, i.e., based on the data with the largest prediction error or power fluctuation within a year, the criteria of extreme weather that can affect the output of wind farms and PV farms are roughly deduced.

3.2.1. Definition of High-Impact Weather Processes at Wind Farms

1. Low and high temperatures:

Firstly, take the time points where the error in the annual power prediction from 2021 to 2023 is in the top 15%, and define these time points as the ones that are more significantly affected by the weather. Assuming that their temperature data satisfy the normal distribution, the normal distribution graph is shown in Figure 19, which has a mean value of 17.93 °C and a standard deviation of 9.49 °C.

To ensure a sufficient number of sampling points, for the low-temperature time points, temperature data more than 1.5 standard deviations below the mean are defined as extreme cold weather data, with a total of 435 data points and a temperature threshold of 3.70 °C. For the high-temperature time points, data more than 1.5 standard deviations above the mean are defined as extreme hot weather data, with a total of 192 data points and a temperature threshold of 32.16 °C.

2. Gale-force wind:

In the same way as the discrimination method for hot and cold weather, the wind speed data at the time points where the error in the power prediction is in the top 15% for each year from 2021 to 2023 are taken. Assuming that they satisfy the normal distribution, the normal distribution graph is shown in Figure 20, which has a mean value of 4.31 m/s and a standard deviation of 1.31 m/s.

Wind speed data more than 2 standard deviations above the mean are defined as data for gale-force winds, with a total of 129 data points and a wind speed threshold of 6.94 m/s.
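The threshold rule used for the wind farm cluster is of the form mean ± k·σ, which can be checked against the reported figures with a short sketch. The names `sigma_threshold` and `label_wind_farm_point` are hypothetical, and which wind speed variable enters the gale label is left to the caller because it is not restated here. Because the published means and standard deviations are rounded to two decimals, the recomputed thresholds agree with the reported 3.70 °C, 32.16 °C and 6.94 m/s only to within about 0.01.

```python
def sigma_threshold(mean, std, k, side):
    """Threshold located k standard deviations above or below the mean."""
    if side == "upper":
        return mean + k * std
    if side == "lower":
        return mean - k * std
    raise ValueError("side must be 'upper' or 'lower'")


# Wind farm cluster statistics reported in Section 3.2.1 (rounded values).
COLD_LIMIT = sigma_threshold(17.93, 9.49, 1.5, "lower")  # about 3.70 degC
HOT_LIMIT = sigma_threshold(17.93, 9.49, 1.5, "upper")   # about 32.16 degC
GALE_LIMIT = sigma_threshold(4.31, 1.31, 2.0, "upper")   # about 6.94 m/s


def label_wind_farm_point(temperature, wind_speed):
    """Return 0/1 labels for the three wind farm extreme weather types."""
    return {
        "low_temperature": int(temperature < COLD_LIMIT),
        "high_temperature": int(temperature > HOT_LIMIT),
        "gale": int(wind_speed > GALE_LIMIT),
    }
```

A time point at -1 °C with a wind speed of 8 m/s would, under these assumptions, carry both the low-temperature and the gale label, which is the multi-label situation the classification model has to handle.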

3.2.2. Definition of High-Impact Weather Processes at PV Farms

For photovoltaic farms, because of their obvious daily periodicity and seasonality, each month is analysed independently. A day on which extreme weather occurs is labelled as an extreme weather day, and only the time points with non-zero power within that day are marked as extreme weather points. This excludes night-time points with zero output, which would otherwise distort the labels and the predictions of the subsequent classification model.

1. Low and high temperatures:

Firstly, for each month from 2021 to 2023, the daily power peaks are taken and their average values are calculated; then, the dates where the daily power peaks are lower than this average value are extracted, and finally, the time points where the power is not zero are extracted from these dates, which are defined as the time points that are more significantly affected by the weather. Assuming that the temperature data at these time-sampling points satisfy a normal distribution, the normal distribution graph is shown in Figure 21, with a mean value of 21.31 °C and a standard deviation of 8.25 °C.

For low temperatures, temperature data more than 2 standard deviations below the mean are defined as extreme cold-weather data, with a total of 1375 data points and a temperature threshold of 4.80 °C. For high temperatures, since both the mean temperature and the standard deviation are high, temperature data more than 1.5 standard deviations above the mean are defined as extreme hot-weather data to ensure a sufficient number of sampling points, with a total of 1039 data points and a temperature threshold of 33.69 °C.

2. Heavy precipitation:

For precipitation, the thresholds are delineated in the same way as for temperature. Assuming that the total precipitation data obtained for the time-sampling points satisfy the normal distribution, the normal distribution graph is shown in Figure 22, with a mean value of 0.000615 m and a standard deviation of 0.000581 m.

The total precipitation data more than 2 standard deviations above the mean are defined as heavy-precipitation weather data, with a total of 409 data points and a total precipitation threshold of 0.001777 m.
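The selection of weather-affected time points for PV farms described in this subsection (monthly average of daily power peaks, days whose peak falls below that average, and the non-zero power points on those days) can be sketched as follows. The record layout `(year, month, day, hour, power)` and the function name `pv_weather_affected_points` are assumptions made for illustration, and power is assumed to be non-negative.

```python
from collections import defaultdict


def pv_weather_affected_points(records):
    """Select PV time points that are more strongly affected by the weather.

    records: list of (year, month, day, hour, power) tuples, power >= 0.
    Returns the (year, month, day, hour) keys of the selected points.
    """
    # Step 1: daily power peak for every calendar day.
    daily_peak = defaultdict(float)
    for year, month, day, _hour, power in records:
        key = (year, month, day)
        daily_peak[key] = max(daily_peak[key], power)

    # Step 2: average of the daily peaks, computed per month.
    peaks_by_month = defaultdict(list)
    for (year, month, _day), peak in daily_peak.items():
        peaks_by_month[(year, month)].append(peak)
    month_average = {m: sum(p) / len(p) for m, p in peaks_by_month.items()}

    # Step 3: days whose peak is below the average of their own month.
    low_days = {d for d, peak in daily_peak.items()
                if peak < month_average[d[:2]]}

    # Step 4: keep only points with non-zero power on those days,
    # which removes night-time samples.
    return [(y, m, d, h) for y, m, d, h, power in records
            if (y, m, d) in low_days and power > 0]
```

The temperature and precipitation values at the returned time points are then the samples from which the PV thresholds above are derived.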

It is worth mentioning that the analysis method is driven by the historical power data and NWP data of the region under study. When the method is applied to other regions, the thresholds of temperature, wind speed, and total precipitation change with the historical power data and NWP data of those regions, so the resulting thresholds reflect local meteorological conditions. The method therefore has a certain degree of generalisability.

3.3. PLE Model Classification Results

PLE models for classification require an appropriate loss function, which measures the difference between the model predictions and the true labels and is therefore crucial for training an effective classification model. In classification tasks, common loss functions include Cross-Entropy Loss, Focal Loss, Dice Loss, and Hinge Loss. Among them, Cross-Entropy Loss is suitable for scenarios in which each sample can belong to multiple categories and can handle the complex relationships in multilabel classification tasks. This study therefore chooses Cross-Entropy Loss as the loss function of the model, computed as the Negative Log Likelihood Loss with the following formula:

L = - \sum_{i = 1}^{n} T_{i} \log [y_{i}^{k} (x)]

(24)

where T_i denotes the ith element in the real label vector and y_i^k(x) denotes the ith element in the model output vector after processing by the softmax function.
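For a single task and a single sample, Equation (24) can be written out as below. This is a minimal sketch, assuming a one-hot label vector and raw (pre-softmax) model outputs; the names `softmax` and `cross_entropy` are illustrative and are not taken from the implementation used in this study.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw model outputs."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, logits):
    """Negative log-likelihood of the softmax output, as in Equation (24).

    target: label vector T (for example one-hot, [0, 1] for "extreme")
    logits: raw model outputs before the softmax
    """
    probs = softmax(logits)
    # Terms with T_i = 0 contribute nothing to the sum.
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)
```

With two equally scored classes the loss equals log 2 (about 0.693), and it shrinks towards zero as the score of the true class grows relative to the other.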

In addition, this study uses the Adam optimiser for PLE, which is suitable for most deep learning tasks and can further improve training when its parameters are tuned appropriately. The learning rate is set to 0.001. Finally, the hyperparameters are set by random search, except for the number of tasks and the corresponding number of task-specific Experts, which have to be set manually as appropriate.

3.3.1. Introduction to Extreme Weather Data

The raw data are first preprocessed. Missing values are filled in by linear interpolation, outliers are identified by the quartile method and corrected using a regression model, and finally, the data are normalised so that the model can be trained effectively. Then, the dataset is divided, as shown in Table 1 and Table 2.
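The preprocessing steps can be sketched with the standard library alone. Several details are assumptions rather than facts from this study: missing values are represented by `None`, the quartile method is taken to be the common 1.5 × IQR rule, and normalisation is taken to be min-max scaling. The regression-based correction of the flagged outliers is not shown, since the regression model is not specified here.

```python
import statistics


def interpolate_missing(series):
    """Fill None entries by linear interpolation between known neighbours.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series contains no known values")
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = series[left] + frac * (series[right] - series[left])
    return filled


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed quartile rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def min_max_normalise(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(v - low) / (high - low) for v in values]
```

In this sketch the three helpers would be applied in the order given in the text: interpolation first, then outlier flagging on the completed series, then normalisation.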

It can be seen that the proportion of extreme weather data in the original data is very small, so there is a very serious class imbalance problem, which is extremely unfavourable for the effective training of the classification and recognition model. In practice, extreme weather samples are usually far fewer than normal weather samples, and class imbalance is unavoidable in extreme weather recognition and classification tasks; nevertheless, the problem can be mitigated during model training. Data generation is used here to alleviate the class imbalance: extreme weather data are generated to expand the training set and increase the percentage of extreme weather samples in it. The results of data generation are shown in Table 3 and Table 4.

3.3.2. Wind Farm Cluster Classification Results

Based on the temperature thresholds for high- and low-temperature weather and the wind speed thresholds for gale-force weather for the wind farm clusters obtained above, the time-sampling points are labelled with the appropriate weather labels. A label of “1” indicates that the corresponding extreme weather event occurred at the time-sampling point, and a label of “0” indicates that it did not. The NWP data with high-impact weather labels for 2021 and 2022 are used as a training set to train the PLE classification model, and the selected NWP features include 100 m wind speed, 10 m wind speed, 2 m temperature, and dewpoint temperature. Then, the 2023 NWP data with high-impact weather labelling and WGAN-generated data are used as a test set. Here, the following data generation schemes are compared: no data generation (Method I); separate generation of the three types of extreme weather data (Method II); generation according to the feature division, i.e., separate generation of the extreme weather data by temperature (low- and high-temperature weather) and wind speed (gale-force weather) (Method III); and overall generation of the three extreme weather data types (Method IV). The classification results obtained are shown in Table 5, Table 6 and Table 7.

As can be seen from the tables, the data quality obtained by using Method IV, i.e., generation for the three types of extreme weather data as a whole, is better, resulting in a slight improvement in the classification accuracy of the model, with the accuracy of the three extreme weather classifications increasing by 1.35%, 1.44%, and 2.28%, respectively; it has a better optimisation effect on the recall rate, and the recall rate of the three extreme weather classifications is improved by 21.24%, 11.82%, and 16.85%, respectively. In addition, the better data quality also leads to a better convergence of the model. The resulting model classification results are shown in Figure 23.

As can be seen from the figure, for high- and low-temperature weather, the model error mainly comes from failing to identify all high-impact weather, whereas there are fewer cases of misclassifying ordinary weather as high-impact weather; i.e., there are more “Type II errors”, which is probably due to the high proportion of ordinary weather in the test set data. The opposite is true for high-impact gale-force winds, which are almost completely recognised, with errors mainly due to misjudgements, i.e., more “Type I errors”, probably due to the relatively low threshold for judging them as gale-force winds. The temporal distributions of the real labels as well as the labels obtained by classification via the PLE model (by hour, month, and week, respectively) are shown in Figure 24 and Figure 25.

The representative months in which each extreme weather occurs are extracted, as shown in Figure 26, Figure 27 and Figure 28, and further observation shows that the identification results for high- and low-temperature weather are also similar to those for gale-force weather, with a large correlation with the thresholds; i.e., the high-temperature thresholds identified by the classification and identification model are higher in relation to the actual value, whereas the low-temperature thresholds are lower in relation to the actual value.

3.3.3. Comparison of Wind Farm Cluster Classification Models

Afterwards, a variety of other classification models are used for comparison, and the classification models employed include Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and Support Vector Classifier (SVC). The selected features include 100 m wind speed, 10 m wind speed, 2 m temperature, and dewpoint temperature. The final classification results obtained for each model are shown in Figure 29, and the various evaluation indicators for each model are shown in Table 8, Table 9 and Table 10.

Combining the figure and the tables, it can be seen that for high-temperature weather classification, PLE is slightly higher in accuracy and slightly lower than the CNN model in precision, but it has a significant advantage in recall, with a maximum improvement of 12.58%, and it performs better overall when the F1-score is considered. For low-temperature weather classification, the metrics are relatively close across the models; PLE is slightly lower than the other models in accuracy and recall and slightly better in precision, with a maximum improvement of 2.22%. As for gale classification, PLE outperforms the other models in all metrics except accuracy, which is slightly lower than that of CNN and SVC. In addition, compared with the previously used MMoE model, there is a significant improvement in accuracy, recall, and F1-score for all three extreme weather types. It can be seen that PLE can capture the intrinsic link between wind speed and power in NWP data more easily than the other models.

3.3.4. PV Farm Cluster Classification Results

Based on the obtained temperature thresholds for high- and low-temperature weather and precipitation thresholds for heavy-precipitation weather for the PV farm cluster, the corresponding weather labels are applied to each time-sampling point in the same way as for the wind farm cluster. The NWP data for 2021 and 2022 with high-impact weather labels are used as a training set to train the PLE classification model, and the selected NWP features include surface short-wave (solar) radiation, surface long-wave (thermal) radiation, 2 m temperature, dewpoint temperature, and total precipitation. The NWP data for 2023 are then used as a test set with the high-impact weather labels and the data generated by WGAN. Here, the comparisons are still made in the same way as for the wind farm cluster, and the classification results obtained are shown in Table 11, Table 12 and Table 13.

As can be seen from the tables, the evaluation indexes of the model improve after data generation regardless of the method used. The data generated by Method IV slightly improve the classification accuracy of the model, with the accuracy of the three types of extreme weather classification improved by 1.18%, 0.45%, and 1.37%, respectively, and also have a better optimisation effect on the other evaluation indexes. Although data generation can optimise the classification results of the model, a common problem remains: the expanded training set somewhat reduces the model’s ability to converge and increases the training duration. Moreover, some time points may be accompanied by multiple extreme-weather processes at the same time, which may be overlooked when data are generated separately for different extreme-weather types, making the generated data insufficiently rich. The final model classification results are shown in Figure 30.

As can be seen from the figure, the model can almost completely identify the high-impact high- and low-temperature weather, and the error basically exists in the case of misjudging the ordinary weather, i.e., there are “Type I errors”, which may be due to the relatively larger proportion of ordinary weather in the test set; the opposite is true for high-impact heavy precipitation events, where there is almost no misjudgement, and the error mainly exists in failing to identify all heavy-precipitation events, which may be mainly due to the slightly higher precipitation thresholds set for heavy-precipitation events. The temporal distributions of the real labels as well as the labels obtained by classification via the PLE model (by hour, month, and week, respectively) are shown in Figure 31 and Figure 32.

The representative months in which each extreme weather type occurred are extracted for further analysis. As shown in Figure 33, Figure 34 and Figure 35, the identification of high- and low-temperature weather for PV farms is the opposite of that for wind farms, with the classification model identifying lower high-temperature thresholds and higher low-temperature thresholds compared with the actual value.

3.3.5. Comparison of PV Farm Cluster Classification Models

The selection of classification models used for the comparison is also the same as that for the wind farm cluster, and the classification results for each model are shown in Figure 36. The various evaluation metrics for each model are shown in Table 14, Table 15 and Table 16.

Combined with the figure and tables, it can be seen that for hot-weather classification, there is not much difference in the accuracy rate. In the precision rate, PLE is slightly lower than the other models, but there is a large improvement in the recall rate, with a maximum improvement of 16.33%, and after the comprehensive consideration by the F1-score, it can be seen that PLE is better, with a maximum improvement of 5.92%. For cold-weather classification, PLE has relatively poor metrics but higher recall than the other models, with a maximum improvement of 2.53%. As for the classification of heavy-precipitation weather, PLE outperforms other models in all indicators. Compared with the MMoE model, although the accuracy, precision, and F1 scores are slightly lower in hot weather, there is a significant improvement in recall, and it also shows more significant advantages in the other two extreme weather types.

Comprehensive analysis of the above wind farm cluster and photovoltaic farm cluster models for extreme weather classification shows that, compared with the traditional classification models, the advantage of PLE is that the prediction results for each task can be maintained at a good level: there is almost never a task indicator that is significantly worse than that of the other models, and the stronger the correlation between the tasks, the better the prediction results. However, its drawback is that when the correlation between tasks is weak or the tasks are almost unrelated, the PLE model will not match an ensemble of traditional classification models in which a separate classifier is trained for each output target. In addition, traditional classification models generally suffer from low recall, while for new energy generation farms, the more important concern is whether the classification model can identify abnormal high-impact weather, i.e., the ability of the model to capture positive examples, so the accuracy and recall are more important. PLE can effectively alleviate this problem and has stronger generality in new energy generation.

4. Conclusions

This study proposes a classification method for extreme weather processes that takes into account new energy high-impact weather sensitivity factors. Firstly, starting from the historical power and NWP data of a wind farm cluster and PV farm cluster, the sensitive factors affecting the power of wind farms and PV farms are analysed through the two perspectives of anomalous power situation and anomalous prediction error, respectively, to define the high-impact weather process of the two new energy farm clusters and to label the corresponding weather at the point of time when the high-impact weather occurs, defining it as an extreme weather event. WGAN is then used to generate data for data defined as extreme-weather events to expand the training set. Afterwards, a high-impact weather process identification and classification model based on PLE is constructed to identify and classify high-impact weather processes occurring at future time points. The method proposed in this study is applied to a wind farm cluster and a photovoltaic farm cluster in Jiangxi Province, China, to verify the accuracy and generalisation ability of the model by comparing it with classification models such as LSTM, GRU, CNN, and SVC, and the conclusions are as follows:

1. After adopting the data generation method of WGAN, there is a maximum increase of 2.28% in accuracy, 9.19% in precision, 21.24% in recall, and 16.54% in F1-score compared with no data generation.
2. Compared with the traditional classification model, the PLE classification model has a maximum improvement of 1.30% in accuracy and is able to achieve better results for multiple classification tasks in addition to having better recall and better generalisation ability.

Higher classification accuracy and recall are of great significance to the current new power systems. For new energy power-generation equipment, accurate identification and classification of extreme weather in advance can help to avoid equipment damage. For example, extreme cold wave or low-temperature weather will lead to the key parts of the wind turbine being covered with ice and freezing, and it may even cause blade breakage; for power grid operation, the grid scheduling department, if it can predict the occurrence and type of extreme weather in advance and make corresponding countermeasures, can effectively improve the reliability of the power supply.

In addition, the PLE model can also be used for multi-task short-term new energy power prediction under various extreme weather conditions, and its ability to consider the correlation between tasks and to mine the deeper features in the NWP data gives it stronger predictive and generalisation ability when used as a power prediction model.

However, the model has some drawbacks. For instance, when the correlation between tasks is low, the classification prediction results may not meet expectations. Future studies will focus on extracting information from NWP data that is both related to high-impact extremes and strongly intercorrelated as inputs to the model, to improve the accuracy of multi-task models in high-impact extreme weather classification and prediction. In addition, although WGAN has significant advantages in data generation, it has some limitations and bias issues, such as problems caused by weight clipping, long training time, hyperparameter sensitivity, and high model capacity requirements. Future research could alleviate these problems to some extent by improving weight clipping, optimising hyperparameter selection, increasing model capacity, or finding data generation networks that are validated to be more suitable for extreme weather sample expansion. At the same time, in the context of global climate change, the currently adopted method of dividing extreme weather thresholds may change or even cease to be applicable in the future, so future research will also focus on the effects of more intertwined climatic factors to cope with the complex forms of climate change.

Author Contributions

Conceptualization, Z.L. and Z.W.; methodology, N.W.; software, N.W.; validation, Z.L., Z.W. and Y.J.; formal analysis, N.W.; investigation, Z.L. and D.S.; resources, Z.W.; data curation, N.W.; writing—original draft preparation, N.W.; writing—review and editing, Z.W.; visualization, Y.J. and D.S.; supervision, Z.L.; project administration, Z.W.; funding acquisition, Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

The research was funded by State Grid Corporation Limited Science and Technology Project Grant (Project No. 4000-202355381A-2-3-XG).

Data Availability Statement

The datasets presented in this article are unavailable due to privacy restrictions.

Conflicts of Interest

Authors Zhifeng Liang and Dayan Sun were employed by the company State Grid Corporation of China. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

PLE	Progressive Layered Extraction
PV	Photovoltaic
MTL	Multi-task learning
DNN	Deep neural networks
MMoE	Multi-gate Mixture of Experts
GANs	Generative adversarial networks
CGAN	Conditional generative adversarial network
NWP	Numerical weather prediction
WGAN	Wasserstein generative adversarial network
MoE	Mixture of Experts
CGC	Customised Gate Control
TP	True Positive
TN	True Negative
FP	False Positive
FN	False Negative
RMSE	Root-Mean-Square Error
LSTM	Long Short-Term Memory
GRU	Gated Recurrent Unit
CNN	Convolutional Neural Network
SVC	Support Vector Classifier

Appendix A

Table A1. Statistical properties of wind farm data.

Statistical Attribute		Power (MW)	NWP 100 m Wind Speed (m/s)	NWP 2 m Temperature (°C)	NWP 10 m Wind Speed (m/s)	NWP 2 m Dewpoint Temperature (°C)
Mean		1272.86	3.58	19.47	2.13	14.63
Std		978.28	1.46	8.54	0.93	8.28
Min		0.00	0.67	−3.03	0.50	−13.11
Percentiles	25%	452.10	2.46	12.72	1.44	8.33
	50%	1024.66	3.39	20.17	1.92	15.72
	75%	1930.89	4.50	26.53	2.64	22.57
Max		4410.81	10.40	38.26	7.01	25.90

Figure A1. Histogram of wind farm data distribution.

Table A2. Statistical properties of PV farm data.

Statistical Attribute		Power (MW)	NWP Short-Wave Radiation (W/m²)	NWP Long-Wave Radiation (W/m²)	NWP 2 m Temperature (°C)	NWP 2 m Dewpoint Temperature (°C)	NWP Total Precipitation (mm)
Mean		441.00	5.01 × 10⁵	−162,134.96	19.70	14.55	0.19
Std		706.33	7.40 × 10⁵	105,339.27	9.03	8.69	0.42
Min		0.00	3.60 × 10⁻¹²	−548,029.46	−3.81	−13.01	0.00
Percentiles	25%	0.00078	3.60 × 10⁻¹²	−224,911.58	12.33	7.78	0.000029
	50%	13.03	1.87 × 10⁴	−136,630.79	20.23	15.37	0.012
	75%	641.22	8.30 × 10⁵	−81,381.98	27.32	22.91	0.16
Max		3184.09	3.10 × 10⁶	4086.53	39.38	27.27	4.58

Figure A2. Histogram of PV farm data distribution.

Figure 1. Overall framework.

Figure 2. MTL structure diagram.

Figure 3. MMoE structure diagram.

Figure 4. CGC structure diagram.

Figure 5. PLE structure diagram.

Figure 6. Generative adversarial network.

Figure 7. June 2023 power prediction results.

Figure 8. December 2023 power prediction results.

Figure 9. Comparison of average wind speeds in 2021.

Figure 10. Comparison of average wind speeds in 2022.

Figure 11. Comparison of average wind speeds in 2023.

Figure 12. Comparison of average temperatures in 2021.

Figure 13. Comparison of average temperatures in 2022.

Figure 14. Comparison of average temperatures in 2023.

Figure 15. Trends in real power versus temperature in June 2023.

Figure 16. Trends in real power versus temperature in December 2023.

Figure 17. Trends in actual power versus total precipitation in June 2023.

Figure 18. Trends in actual power versus total precipitation in December 2023.

Figure 19. Normal distribution of temperature data at high-error time points in wind farms.

Figure 20. Normal distribution of wind speed data at high-error time points in wind farms.

Figure 21. Normal distribution of temperature data at the time points of anomalous output of the PV farms.

Figure 22. Normal distribution of total precipitation data at the time points of anomalous output of the PV farms.

Figure 23. Wind farm PLE model classification results.

Figure 24. Temporal distribution of real labels for wind farms.

Figure 25. Temporal distribution of labels for the classification results of the PLE model for wind farms.

Figure 26. Effect of high-temperature weather identification at wind farms in early August 2023.

Figure 27. Effect of low-temperature weather identification at wind farms in mid-December 2023.

Figure 28. Effect of gale-force weather identification at wind farms in mid-December 2023.

Figure 29. Comparison of weather classification models for the wind farm cluster.

Figure 30. PV farm PLE model classification results.

Figure 31. Temporal distribution of real labels for PV farms.

Figure 32. Temporal distribution of labels for the classification results of the PLE model for PV farms.

Figure 33. Effect of high-temperature weather identification at PV farms in early August 2023.

Figure 34. Effect of low-temperature weather identification at PV farms in mid-December 2023.

Figure 35. Effect of heavy-precipitation weather identification at PV farms in late June 2023.

Figure 36. Comparison of weather classification models for PV farm cluster.

Table 1. Wind farm dataset division.

Dataset	High Temperature	Low Temperature	Gale	Ordinary Weather	Total	Extreme Weather Percentage
Training set	988	979	815	14,730	17,512	15.89%
Test set	278	441	436	7605	8760	13.18%
Total	1266	1390	1251	22,365	26,272	14.87%
Percentage	4.82%	5.29%	4.76%	85.13%	/	/

Table 2. PV farm dataset division.

Dataset	High Temperature	Low Temperature	Heavy Precipitation	Ordinary Weather	Total	Extreme Weather Percentage
Training set	789	931	276	15,500	17,496	11.41%
Test set	250	444	133	7933	8760	9.44%
Total	1039	1375	409	23,433	26,256	10.75%
Percentage	3.96%	5.24%	1.56%	89.24%	/	/

Table 3. Wind farm dataset division after data generation.

Dataset	High Temperature	Low Temperature	Gale	Ordinary Weather	Total	Extreme Weather Percentage
Training set	988 + 3103	979 + 3297	815 + 2218	14,730	17,512 + 8618	43.63%
Test set	278	441	436	7605	8760	13.18%
Total	4091	4276	3033	22,365	33,765	33.76%
Percentage	12.12%	12.66%	8.98%	66.24%	/	/

Table 4. PV farm dataset division after data generation.

Dataset	High Temperature	Low Temperature	Heavy Precipitation	Ordinary Weather	Total	Extreme Weather Percentage
Training set	789 + 2813	931 + 3436	276 + 940	15,500	17,496 + 7189	37.21%
Test set	250	444	133	7933	8760	9.44%
Total	3602	4367	1216	24,260	33,445	27.46%
Percentage	10.77%	13.06%	3.64%	72.54%	/	/
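
The "Extreme Weather Percentage" column of the training sets in Tables 1 to 4 can be checked with a short calculation. The sketch below copies the training-set counts from those tables as (real, generated) pairs; the function name `extreme_share` is introduced here for illustration only.

```python
def extreme_share(extreme_counts, ordinary_count):
    """Share of extreme-weather samples in a set, as a percentage."""
    extreme = sum(extreme_counts)
    return 100.0 * extreme / (extreme + ordinary_count)


# Training-set counts copied from Tables 1-4: (real, generated) per extreme class.
wind = {"high_temp": (988, 3103), "low_temp": (979, 3297), "gale": (815, 2218)}
pv = {"high_temp": (789, 2813), "low_temp": (931, 3436), "heavy_precip": (276, 940)}
ordinary = {"wind": 14730, "pv": 15500}

for name, classes in (("wind", wind), ("pv", pv)):
    before = extreme_share([real for real, _ in classes.values()], ordinary[name])
    after = extreme_share([real + gen for real, gen in classes.values()], ordinary[name])
    print(f"{name}: {before:.2f}% -> {after:.2f}%")
# prints:
# wind: 15.89% -> 43.63%
# pv: 11.41% -> 37.21%
```

Only the training sets are augmented; the test sets keep their original class proportions (13.18% and 9.44%).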

Table 5. Classification results of high-temperature weather in wind farms.

Method	Accuracy	Precision	Recall	F1-Score
Method I	0.9809	0.9627	0.7118	0.7879
Method II	0.9866	0.9408	0.8278	0.8754
Method III	0.9895	0.9946	0.8345	0.8982
Method IV	0.9946	0.9872	0.9242	0.9533
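
For readers less familiar with the four indicators reported in these tables, the sketch below computes them from binary confusion counts (TP, TN, FP, FN) using the textbook definitions. The counts are hypothetical and chosen only to mimic a rare class of about 3% positives; they are not taken from this study, and the averaging procedure behind the tabulated values is not restated here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from binary confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Hypothetical counts for a rare class (about 3% positives), not from the paper.
acc, prec, rec, f1 = classification_metrics(tp=200, tn=8470, fp=10, fn=80)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

With a rare positive class, accuracy stays close to 0.99 even when more than a quarter of the extreme events are missed, which is why recall and F1-score are the more informative columns for extreme weather identification.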

Table 6. Classification results of low-temperature weather in wind farms.

Method	Accuracy	Precision	Recall	F1-Score
Method I	0.9831	0.9563	0.8590	0.9012
Method II	0.9911	0.9794	0.9255	0.9507
Method III	0.9938	0.9862	0.9484	0.9665
Method IV	0.9975	0.9964	0.9772	0.9866

Table 7. Classification results of gale-force weather in wind farms.

Method	Accuracy	Precision	Recall	F1-Score
Method I	0.9758	0.8970	0.8308	0.8605
Method II	0.9838	0.9693	0.8535	0.9023
Method III	0.9963	0.9693	0.9937	0.9812
Method IV	0.9986	0.9866	0.9993	0.9929

Table 8. Evaluation indexes for high-temperature weather classification of the wind farm cluster.

Model	Accuracy	Precision	Recall	F1-Score
PLE	0.9946	0.9872	0.9242	0.9533
MMoE	0.9932	0.9965	0.8921	0.9378
LSTM	0.9870	0.9875	0.7984	0.8689
GRU	0.9894	0.9820	0.8414	0.8990
CNN	0.9945	0.9951	0.9154	0.9515
SVC	0.9946	0.9669	0.8571	0.9086

Table 9. Evaluation indexes for low-temperature weather classification of the wind farm cluster.

Model	Accuracy	Precision	Recall	F1-Score
PLE	0.9975	0.9964	0.9772	0.9866
MMoE	0.9971	0.9951	0.9749	0.9848
LSTM	0.9973	0.9742	0.9986	0.9860
GRU	0.9976	0.9782	0.9977	0.9877
CNN	0.9982	0.9844	0.9969	0.9906
SVC	0.9989	0.9865	0.9909	0.9886

Table 10. Evaluation indexes for gale-force weather classification of the wind farm cluster.

Model	Accuracy	Precision	Recall	F1-Score
PLE	0.9986	0.9866	0.9993	0.9929
MMoE	0.9944	0.9971	0.9438	0.9688
LSTM	0.9856	0.9319	0.9131	0.9223
GRU	0.9892	0.9398	0.9465	0.9431
CNN	0.9904	0.9937	0.9048	0.9443
SVC	0.9934	0.9976	0.9381	0.9669
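
One simple way to summarise Tables 8 to 10 is to average each model's F1-score over the three weather types. The unweighted mean used below is an assumed aggregation for illustration, not the evaluation protocol of this study; the F1 values are copied from the tables.

```python
# F1-scores copied from Tables 8-10 (high temperature, low temperature, gale).
f1_scores = {
    "PLE": (0.9533, 0.9866, 0.9929),
    "MMoE": (0.9378, 0.9848, 0.9688),
    "LSTM": (0.8689, 0.9860, 0.9223),
    "GRU": (0.8990, 0.9877, 0.9431),
    "CNN": (0.9515, 0.9906, 0.9443),
    "SVC": (0.9086, 0.9886, 0.9669),
}

# Unweighted mean over the three weather types (an assumed aggregation).
mean_f1 = {model: sum(scores) / len(scores) for model, scores in f1_scores.items()}
ranking = sorted(mean_f1, key=mean_f1.get, reverse=True)

for model in ranking:
    print(f"{model:5s} {mean_f1[model]:.4f}")
```

Under this aggregation PLE ranks first for the wind farm cluster, although individual tables show other models ahead on single indicators (for example, CNN on low-temperature F1-score).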

Table 11. Classification results of high-temperature weather in PV farms.

Method	Accuracy	Precision	Recall	F1-Score
Method I	0.9823	0.9537	0.7404	0.8123
Method II	0.9881	0.9687	0.8303	0.8868
Method III	0.9860	0.9024	0.8588	0.8792
Method IV	0.9941	0.9139	0.9969	0.9514

Table 12. Classification results of low-temperature weather in PV farms.

Method	Accuracy	Precision	Recall	F1-Score
Method I	0.9920	0.9780	0.9367	0.9562
Method II	0.9921	0.9459	0.9755	0.9601
Method III	0.9879	0.9937	0.8798	0.9285
Method IV	0.9965	0.9682	0.9971	0.9822

Table 13. Classification results of heavy-precipitation weather in PV farms.

Method	Accuracy	Precision	Recall	F1-Score
Method I	0.9853	0.9076	0.9455	0.9256
Method II	0.9879	0.9313	0.9426	0.9369
Method III	0.9953	0.9612	0.9921	0.9761
Method IV	0.9990	0.9995	0.9662	0.9822

Table 14. Evaluation indexes for high-temperature weather classification of PV farm cluster.

Model	Accuracy	Precision	Recall	F1-Score
PLE	0.9941	0.9139	0.9969	0.9514
MMoE	0.9954	0.9866	0.9297	0.9563
LSTM	0.9898	0.9778	0.8336	0.8922
GRU	0.9935	0.9967	0.8860	0.9340
CNN	0.9921	0.9960	0.8620	0.9179
SVC	0.9947	0.9973	0.9080	0.9480

Table 15. Evaluation indexes for low-temperature weather classification of PV farm cluster.

Model	Accuracy	Precision	Recall	F1-Score
PLE	0.9965	0.9682	0.9971	0.9822
MMoE	0.9939	0.9489	0.9936	0.9701
LSTM	0.9990	0.9995	0.9899	0.9946
GRU	0.9982	0.9855	0.9958	0.9906
CNN	0.9971	0.9985	0.9718	0.9848
SVC	0.9977	0.9988	0.9775	0.9879

Table 16. Evaluation indexes for heavy-precipitation weather classification of PV farm cluster.

Model	Accuracy	Precision	Recall	F1-Score
PLE	0.9990	0.9995	0.9662	0.9822
MMoE	0.9983	0.9921	0.9398	0.9652
LSTM	0.9984	0.9912	0.9548	0.9723
GRU	0.9989	0.9917	0.9698	0.9805
CNN	0.9987	0.9994	0.9586	0.9781
SVC	0.9985	0.9992	0.9511	0.9739

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
