1. Introduction
Distributed energy resources, dominated by wind and solar power, together with new energy storage systems and charging facilities, are advancing at an unprecedented pace. By 2025, distribution networks will need to support the integration of approximately 500 gigawatts of distributed energy resources and around 12 million charging stations. In this context, traditional single source–load scenarios are increasingly inadequate for the evolving planning requirements of distribution networks, as they cannot effectively represent the uncertainties that arise in operational simulations and planning schedules [1,2]. Therefore, it is crucial to address the randomness and fluctuation of source–load power in modern distribution networks, accurately simulate its trends, and ensure the safe operation of the power system [3].
Scenario generation is a powerful method for characterizing source–load uncertainties and can be used to construct random source–load scenarios that reflect spatiotemporal characteristics [4,5]. This enables the evaluation of distribution network performance under different scenarios, enhancing the adaptability and robustness of the distribution network. From a time perspective, random scenarios can be classified into ultra-short-term [6], short-term [7], and medium-long-term scenarios [8,9]. As a means of assessing source–load uncertainties across multiple time scales, source–load scenarios provide a critical foundation for generation planning, unit maintenance planning, and distribution network planning. However, generating source–load scenarios across multiple time scales is challenging, mainly because of increasing data dimensions, constraints imposed by meteorological factors, and frequent source–load interactions [10]. For example, in Ref. [11], complex factors such as high variable dimensions and random spatial and temporal correlations affect the integrated multi-energy complementary planning and long-term scheduling of water, wind, and solar energy. In Ref. [12], accurate modeling of multiple wind farms is affected by multiple spatiotemporal characteristics. In Ref. [13], predictions of renewable energy load and power generation levels are also affected by meteorological and spatial factors, which leads to a serious decline in prediction and modeling accuracy. These factors lead to low accuracy in source–load interaction modeling and poor quality of the generated source–load scenarios, which then cannot provide planners with detailed and informative reference data.
Scenario analysis mainly consists of scenario generation and scenario reduction [14,15]. For example, in Ref. [16], a low-dimensional scenario generation method based on typical days of photovoltaic and wind power was used for short-time scale modeling. In Ref. [17], joint scenario generation of photovoltaic and wind power was realized based on the Monte Carlo method. In Refs. [18,19], scenario generation methods were used to assess the complementarity of future wind and solar photovoltaic hybrid energy resources. However, existing multi-time scale scenario generation methods primarily rely on statistical approaches such as probabilistic models, mean models, and quantile regression to generate sets of typical future source–load temporal scenarios. Although these methods describe the uncertainty of multi-time scale source–load power effectively and intuitively, they require a large amount of prior knowledge, and the quality and richness of the generated scenarios cannot be guaranteed. If prior knowledge is lacking, generating complex multi-time scale source–load scenarios becomes even more challenging. Scenario reduction, on the other hand, is a common means of removing noisy information from a scenario set, minimizing the computational effort of multi-time scale analysis while retaining the critical information in the generated scenario set. Common methods include backward reduction [20], scenario tree methods, clustering [21,22], and optimization approaches [23,24]. Therefore, selecting an appropriate scenario reduction algorithm to construct a flexible scenario reduction model is of great significance for extracting high-value source–load power information.
Owing to their powerful data representation capabilities, deep learning algorithms not only perform well on high-dimensional nonlinear historical data but also reduce the reliance on prior knowledge. They provide an effective means of capturing source–load uncertainty and bring diversity, richness, and accuracy to multi-time scale scenario generation for distribution networks. Currently, data-driven approaches are the mainstream method for scenario generation, and generative adversarial networks (GANs) are a particularly popular data-driven method. GANs can generate new samples that are very close to the real data distribution through the adversarial training of two neural networks [25,26]. However, traditional GANs suffer from shortcomings such as unstable training and mode collapse. Therefore, the Wasserstein distance is commonly used, together with weight clipping or a gradient penalty, to increase the stability of GANs. For example, in Ref. [27], a scenario generation method based on WGAN-GP was used for wind and solar power scenario generation. Likewise, in Ref. [28], WGAN-GP was used to generate single and multiple wind power scenarios. To further improve the quality and accuracy of scenario generation, researchers have added condition information to GANs [29,30]. For instance, in Ref. [31], the authors clustered the source and load information and used it as a condition to guide the generation of energy storage planning scenarios. Ref. [32] applied conditional generative adversarial networks (CGANs) to scenario generation for wind and solar power output using monthly information as labels. Ref. [33] used wind and solar power forecasting data as conditions and employed a CGAN to generate predictive scenario sets, improving the accuracy of future wind and solar power scenario simulations. Refs. [34,35] addressed the multi-time scale issue of source–load scenarios by proposing a monthly source–load scenario generation method based on a progressively growing CGAN. Ref. [36] clustered historical meteorological data and used a denoising variational autoencoder to generate multi-source–load scenarios, but it did not consider multi-time scale modeling. Although these methods have made significant progress in generating source–load scenarios and effectively fit source–load data characteristics, most of them do not construct source–load scenario models at multiple time scales. This limits their ability to quantify source–load uncertainty and prevents them from fully capturing the multi-time correlation characteristics of dynamic source–load scenarios.
In response to these challenges, this paper comprehensively considers source–load uncertainty across multiple time scales and proposes a source–load scenario generation method based on the temporal generative adversarial network (TGAN), which introduces multi-head attention mechanisms (MHAMs) and temporal convolutional networks (TCNs). The main contributions of this paper are as follows:
- (1) Clustering techniques are used to extract representative daily source–load power states, and Markov chains are established to model source–load state transitions;
- (2) A spatiotemporal feature extraction unit for TGAN is constructed by incorporating MHAM and TCN, using clustering information as labels to guide the generation of source–load power time series;
- (3) A comprehensive evaluation index system is established based on the generated source–load scenario sets to assess the high fidelity of the scenarios.
The remaining sections of this article are organized as follows. Section 2 constructs a source–load time correlation model. Section 3 focuses on TGAN-based source–load scenario generation, including the MHAM, TCN, and CGAN components. Section 4 and Section 5 cover model training and scenario quality evaluation, respectively. Section 6 concludes the article. It should be noted that this paper uses the source–load data set of power distribution systems in a county and district in Shandong, and code support is provided in Appendix A.
3. Source–Load Scenario Generation Based on TGAN
To address the limitations of traditional GAN in extracting temporal features of source–load power, this paper proposes an improved CGAN model, which introduces an MHAM and TCN into the generator and discriminator to form the TGAN framework. This model generates intra-day power curves under the supervision of source–load daily state labels. MHAMs enhance the spatial feature extraction capability of TGAN for source–load data, while the TCN fully captures the temporal characteristics and correlations of source–load data, thereby improving the accuracy of source–load scenario generation.
3.1. Source–Load Spatial Feature Extraction Unit Based on MHAM
The MHAM can be used to evaluate the importance of different features of the historical source–load power matrix in scenario generation. Compared to single-head attention mechanisms, the primary advantage of the MHAM lies in its ability to process the relationships between source–load data and capture diverse features of the data, thereby offering a more robust capacity for extracting spatial features of source–load power. In this paper, the weight calculation formula for the $i$-th head of the MHAM is given as follows (see Figure 3):

$$A_i = \mathrm{softmax}\left(\frac{(Q W_i^{Q})(K W_i^{K})^{\top}}{\sqrt{d_k}}\right)$$

where $Q$ and $K$ can be seen as input representations from different sources, and $d_k$ is the dimension of the projected inputs. These input matrices are projected into different spaces using the linear projection matrices $W_i^{Q}$ and $W_i^{K}$. The softmax function is then applied to the scaled results to produce a set of weights $A_i$.

The weights $A_i$ are multiplied with the input matrix $V$ to obtain the output matrix of a single attention mechanism, which calculates the similarity between different source–load features:

$$\mathrm{head}_i = A_i V$$

Through multiple output matrices, the differences in source–load information can be identified, thereby extracting the spatial correlation features of source–load data.
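The projection, scaling, softmax, and weighting steps described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the random (untrained) projection matrices, the 96 × 6 input shape (96 quarter-hour samples of six hypothetical source–load features), and the use of three heads are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum so the exponentials stay numerically stable.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, n_heads, rng):
    """Self-attention over the rows of X (time steps x features)."""
    d_model = X.shape[1]
    d_head = d_model // n_heads
    heads, weights = [], []
    for _ in range(n_heads):
        # Hypothetical random projections; in a trained model these are learned.
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        A = softmax(Q @ K.T / np.sqrt(d_head))  # attention weights of this head
        heads.append(A @ V)                     # weighted combination of inputs
        weights.append(A)
    # Concatenate the per-head outputs along the feature axis.
    return np.concatenate(heads, axis=1), weights

rng = np.random.default_rng(0)
X = rng.standard_normal((96, 6))  # assumed: one day at 15 min resolution, 6 features
out, attn = multi_head_attention(X, n_heads=3, rng=rng)
```

Each head produces its own weight matrix, so different heads can emphasize different relationships among the source–load features; the concatenated output keeps the original feature width.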
3.2. Source–Load Temporal Feature Extraction Unit Based on TCN
A TCN is a neural network that processes sequential data through convolution operations. Its basic working principle is to capture dependencies between different time steps in a time series through causal convolution and dilated convolution. Compared with recurrent neural networks used to model source–load power time series, the TCN is better at handling high-dimensional source–load feature information and trains considerably faster. In this paper, the TCN is used as the main structure of the temporal feature extraction unit of the TGAN.
According to the source–load temporal data, the value $y_t$ at time $t$ is determined solely by the historical information at times $1, \dots, t$, that is

$$y_t = F(x_1, x_2, \dots, x_t), \quad t = 1, 2, \dots, T$$

where $x_t$ represents the input power at time $t$, $F(\cdot)$ refers to causal convolution, and $T$ is the total number of time slices.
When processing long-term source–load historical data sequences, the network depth or convolution size may increase sharply, leading to issues such as gradient vanishing and inefficiency during training. The dilated convolution operation enables the capture of a wider range of information, allowing the TCN to extract essential information without needing to delve deeply into the entire source–load history (see Figure 4). The basic principle of dilated convolution is

$$F(s) = (\mathbf{x} *_d f)(s) = \sum_{i=0}^{k-1} f(i)\, x_{s - d \cdot i}$$

where $F(s)$ represents the result of the filter performing a dilated convolution operation on the element $x_s$ in the historical power vector $\mathbf{x}$, $*_d$ refers to the dilated convolution, $f(i)$ represents the $i$-th filter weight, $d$ is the dilation rate, and $k$ is the filter size.
In the convolution operation above, a residual connection is further incorporated to form a residual module. The output of the residual module, which combines historical information and convolutional information, enhances the algorithm’s ability to express power characteristics in source–load scenario generation, thereby obtaining the temporal feature extraction unit.
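As a minimal sketch of the causal dilated convolution and residual module described above, the following NumPy code computes each output from the current and past inputs only, spaced by the dilation rate. The function names, the zero padding before the start of the series, and the ReLU placed inside the residual branch are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """y[s] = sum_i f[i] * x[s - d*i]; values before the series start count as zero."""
    k = len(f)
    pad = d * (k - 1)
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    y = np.zeros(len(x))
    for s in range(len(x)):
        for i in range(k):
            y[s] += f[i] * xp[pad + s - d * i]
    return y

def residual_block(x, f, d):
    # Assumed form: input plus a ReLU-activated convolution branch.
    x = np.asarray(x, dtype=float)
    return x + np.maximum(dilated_causal_conv(x, f, d), 0.0)

x = np.arange(1.0, 7.0)          # toy power series 1..6
f = np.array([1.0, 1.0])         # filter of size k = 2
y = dilated_causal_conv(x, f, d=2)
z = residual_block(x, f, d=2)

# Causality check: changing the last input must not change earlier outputs.
x_changed = x.copy()
x_changed[-1] = 100.0
y_changed = dilated_causal_conv(x_changed, f, d=2)
```

With a filter of size 2 and dilation 2, each output adds the current sample to the sample two steps earlier, so stacking layers with growing dilation widens the receptive field without deepening the network excessively.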
3.3. Spatiotemporal Feature Extraction Unit for Source–Load Power Based on TGAN
CGANs extend the traditional GAN framework by integrating supervised and unsupervised learning, which improves generalization for specific data types. Unlike conventional GANs, CGANs feed conditional inputs into both the generator and the discriminator, allowing for more controlled generation. The conditional input guides the generator to learn the mapping between data sample probability distributions and the specified conditions, using label information to make the generated data match the given conditions. In this paper, based on the CGAN framework, an MHAM and a TCN are introduced to construct the TGAN, thereby enhancing the spatiotemporal feature extraction capabilities for source–load data. The generator takes Gaussian white noise Z, whose length is the noise dimension, and clustered daily source–load state labels C as inputs, and outputs multi-spatiotemporal scale source–load scenarios S. The discriminator inputs are the source–load historical data X, integrated with the state labels, and the generated scenarios S. By analyzing the features of the input source–load data, the discriminator outputs a classification of whether the input data are real or generated.
Considering that the generator and discriminator are relatively independent structures, their loss functions can be expressed, respectively, as

$$L_G = \mathbb{E}_{Z}\left[\log\left(1 - D(G(Z \mid C) \mid C)\right)\right]$$

$$L_D = -\mathbb{E}_{X}\left[\log D(X \mid C)\right] - \mathbb{E}_{Z}\left[\log\left(1 - D(G(Z \mid C) \mid C)\right)\right]$$

where $\mathbb{E}[\cdot]$ represents the expected value of the corresponding random variable, $D(\cdot)$ is the discriminator, and $G(\cdot)$ is the generator. The generator network takes random noise and label conditions as inputs and outputs synthesized source–load data. The discriminator network receives historical source–load data sets, generated source–load data, and label conditions as inputs, and outputs assessments distinguishing between real and fake data. The generator aims to create realistic data, while the discriminator pushes the generator toward higher quality by differentiating between genuine and synthetic data. This mutual competition drives up the quality of the generated source–load data. Therefore, the training process of TGAN can be described as a min-max game, and its objective function is

$$\min_G \max_D V(D, G) = \mathbb{E}_{X}\left[\log D(X \mid C)\right] + \mathbb{E}_{Z}\left[\log\left(1 - D(G(Z \mid C) \mid C)\right)\right]$$
In the original GAN framework, the discriminator's loss function is dominated by the Jensen–Shannon (JS) divergence, which can introduce training challenges such as mode collapse. When the generated sample distribution barely overlaps with the real sample distribution, the JS divergence stays nearly constant and cannot provide meaningful feedback. For generating source–load scenarios across multiple spatiotemporal scales, it is essential for the generator to capture the distribution patterns of source–load scenarios accurately. However, using the JS divergence as the discriminator's loss function hinders effective measurement of distribution differences, often causing gradient vanishing during back propagation and complicating network training, thereby reducing the accuracy of the generated scenarios. By adopting the Wasserstein distance in place of the JS divergence as the discriminator's loss function, issues such as training instability and mode collapse can be effectively mitigated, resulting in a more precise fit to the target probability distributions. The Wasserstein distance can be defined as follows:

$$W(P_r, P_g) = \sup_{\lVert D \rVert_L \le 1} \; \mathbb{E}_{X \sim P_r}\left[D(X)\right] - \mathbb{E}_{S \sim P_g}\left[D(S)\right]$$

where $P_r$ and $P_g$ denote the real and generated sample distributions, and the discriminator adheres to 1-Lipschitz continuity, meaning that the norm of its gradient is bounded by 1. By introducing a gradient penalty on $D$ within its defined domain, we ensure that the discriminator approximately meets the 1-Lipschitz continuity requirement, so the Wasserstein distance can be represented by the discriminator's loss function. Consequently, the objective function for the CGAN framework is reformulated as follows:

$$\min_G \max_D \; \mathbb{E}_{X \sim P_r}\left[D(X \mid C)\right] - \mathbb{E}_{Z}\left[D(G(Z \mid C) \mid C)\right] - \lambda\, \mathbb{E}_{\hat{X}}\left[\left(\lVert \nabla_{\hat{X}} D(\hat{X} \mid C) \rVert_2 - 1\right)^2\right]$$

where $\lambda$ is the penalty coefficient and $\hat{X}$ is sampled between real and generated samples.

Unlike in a traditional CGAN, a discriminator trained with the Wasserstein distance not only distinguishes generated samples from real samples but also estimates the Wasserstein distance between the sample sets, which describes the difference between their distributions. It therefore provides an accurate training direction for the generator to fit the data distribution. This enables the generator to map conditional probability distributions rather than simply generate data similar to a real sample.
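To make the role of the gradient penalty concrete, here is a small NumPy sketch of a critic (discriminator) loss in the WGAN-GP style. It is only an illustration under strong assumptions: the critic is a hypothetical linear function of the sample and its label, so its gradient with respect to the sample is simply the weight vector, and the penalty coefficient `lam=10.0` is a commonly used default rather than a value reported in the paper. A real TGAN critic is a neural network whose gradient would be obtained by automatic differentiation.

```python
import numpy as np

def critic(x, c, w, v):
    """Hypothetical linear conditional critic: D(x | c) = x.w + c.v."""
    return x @ w + c @ v

def wgan_gp_critic_loss(real, fake, cond, w, v, lam=10.0, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    # Random points on the straight lines between real and generated samples.
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    # For this linear critic the gradient w.r.t. x equals w at every x_hat;
    # a nonlinear critic would need the gradient evaluated at x_hat itself.
    grad = np.broadcast_to(w, x_hat.shape)
    penalty = np.mean((np.linalg.norm(grad, axis=1) - 1.0) ** 2)
    # Critic's estimate of the Wasserstein distance between the two sample sets.
    w_est = critic(real, cond, w, v).mean() - critic(fake, cond, w, v).mean()
    # The critic minimizes the negative estimate plus the weighted penalty.
    return -w_est + lam * penalty, w_est

real = np.ones((4, 3))
fake = np.zeros((4, 3))
cond = np.zeros((4, 2))
loss_unit, w_unit = wgan_gp_critic_loss(real, fake, cond, np.array([1.0, 0.0, 0.0]), np.zeros(2))
loss_steep, w_steep = wgan_gp_critic_loss(real, fake, cond, np.array([2.0, 0.0, 0.0]), np.zeros(2))
```

In this toy case the critic with unit gradient norm incurs no penalty, while the steeper critic reports a larger distance estimate but is penalized for violating the 1-Lipschitz constraint.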
3.4. Stochastic Generation Method for Source–Load Scenarios at Multi-Time Scales
By combining the aforementioned source–load day-to-day transition process and intra-day temporal power simulation method, a stochastic generation method for source–load power scenarios at multi-time scales is established. This method can randomly generate a specified number of multi-time scale source–load scenarios with spatiotemporal correlation (see Figure 5). The specific generation steps are as follows:
- (1) The monthly historical source–load data set X is divided into K categories by the K-means clustering technique.
- (2) Based on the typical daily power states obtained from clustering, the state transition matrices are calculated. Then, the initial source–load scenario set is randomly generated using the MCMC algorithm.
- (3) Taking the daily power states during the monthly power generation state transition in this set as labels and driven by Gaussian white noise Z, the source–load daily power curves for the corresponding states are generated based on TGAN.
- (4) Based on the order determined by the source–load state transition process at each time scale, the daily power curves of the source and load are concatenated to form the long-term generation scenario set S.
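The steps above can be sketched in a few lines of NumPy. The sketch assumes the daily state labels have already been obtained by clustering, estimates a first-order Markov transition matrix from them, samples a day-to-day state sequence, and concatenates one daily curve per sampled state. All function names are hypothetical, and `daily_curve_fn` is only a stand-in for the trained TGAN generator, which is not reproduced here.

```python
import numpy as np

def transition_matrix(states, n_states):
    """Estimate first-order Markov transition probabilities from a label sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    # States with no observed outgoing transition fall back to a uniform row.
    return np.where(rows > 0, counts / np.maximum(rows, 1), 1.0 / n_states)

def sample_states(P, start, n_days, rng):
    """Sample a day-to-day state sequence of length n_days from the chain."""
    seq = [start]
    for _ in range(n_days - 1):
        seq.append(int(rng.choice(len(P), p=P[seq[-1]])))
    return seq

def build_scenario(state_seq, daily_curve_fn):
    """Concatenate one daily curve per state, in the sampled order."""
    return np.concatenate([daily_curve_fn(s) for s in state_seq])

rng = np.random.default_rng(0)
history = [0, 1, 0, 1, 2]                     # toy daily state labels
P = transition_matrix(history, n_states=3)
seq = sample_states(P, start=0, n_days=30, rng=rng)
# Placeholder generator: a flat 96-point curve (15 min resolution) per state.
scenario = build_scenario(seq, lambda s: np.full(96, float(s)))
```

In practice the placeholder would be replaced by a call that feeds the state label and Gaussian noise to the trained generator, so that each day in the sequence receives a distinct, label-consistent power curve.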
4. Model Training
The validation of the proposed algorithm is supported by the source–load historical data set from the distribution network in Shandong, China, where the data sampling frequency is every 15 min. The number of representative daily power generation states is determined as
using the commonly used
K-means method for evaluating clustering effectiveness. The experiments were conducted using Python 3.8.0 and the deep learning framework PyTorch 2.0.0. The computer hardware specifications include an Intel i9-10920X CPU (Intel Corporation, Santa Clara, CA, USA) with a frequency of 3.50 GHz and 24 cores, an NVIDIA GeForce RTX 3080 Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA), and 128 GB of RAM. During training, the learning rate is 0.0005, and the batch size is 32 (see
Figure 6). The generator ultimately produces 1800 typical daily source–load scenarios per month, which are concatenated to form a monthly-scale source–load scenario set of 60 months (see
Figure 7,
Figure 8 and
Figure 9).
As shown in Figure 6, during the generation of source–load scenarios with the TGAN, the discriminator loss dropped rapidly and flattened after about 500 iterations, which shows that the discriminator quickly learns to differentiate real data from generated data. The generator loss also decreased significantly in the early stages, but its downward trend was slower, and it did not stabilize until about 1500 iterations, reflecting a gradual improvement in the generator's ability to produce realistic data. In addition, the Wasserstein distance rises briefly at the beginning of training, then declines rapidly and reaches a stable state after about 500 iterations, which shows that the model effectively narrows the gap between the real and generated data distributions. In summary, these trends show that the model converges well: the losses of both the discriminator and the generator stabilize after an appropriate number of iterations, and after the initial learning stage the Wasserstein distance between the distributions remains at a stable level.
To emphasize the advantages of using the MHAM, we have added comparative experiments and used root mean square error (RMSE) and mean absolute error (MAE) as evaluation metrics. As shown in
Table 1, compared to “AM + TCN”, “MHAM + TCN” demonstrates a more significant advantage in improving the quality of generated data. Specifically, for solar data and wind data, the quality improvement is approximately
. For load data, the improvement is approximately
. Therefore, we can firmly conclude that method MHAM has a notable advantage in enhancing data quality and spatial feature extraction. Based on the results presented in
Table 1, we analyzed the errors of the “MHAM + TCN” and “MHAM + RNN” methods. The comparison revealed that “MHAM + TCN” has certain advantages, although they are not immediately apparent. To further illustrate the benefits of “MHAM + TCN”, we introduced the “AM + RNN” method as an additional baseline; relative to it, the source–load data generated using “MHAM + TCN” matched the historical data significantly more closely. These differences in the performance metrics demonstrate the advantages of “MHAM + TCN” in improving data quality and spatial feature extraction.
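For reference, the two error metrics used in this comparison can be computed as in the following minimal sketch. The function names and the toy arrays are illustrative assumptions and do not correspond to values in Table 1.

```python
import numpy as np

def rmse(real, generated):
    """Root mean square error between two equally shaped series."""
    real, generated = np.asarray(real, float), np.asarray(generated, float)
    return float(np.sqrt(np.mean((real - generated) ** 2)))

def mae(real, generated):
    """Mean absolute error between two equally shaped series."""
    real, generated = np.asarray(real, float), np.asarray(generated, float)
    return float(np.mean(np.abs(real - generated)))

real = [1.0, 2.0, 3.0, 4.0]        # toy historical values
generated = [1.0, 2.0, 3.0, 8.0]   # toy generated values
err_rmse = rmse(real, generated)
err_mae = mae(real, generated)
```

Because RMSE squares the deviations before averaging, a single large error raises it more than it raises MAE, which is why reporting both gives a fuller picture of generation quality.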
Figure 7, Figure 8 and Figure 9 illustrate the source–load scenarios generated on daily, weekly, and monthly time scales using the TGAN. Each set of scenarios not only clearly reflects the fluctuations and trends in source–load power under different cycles but also reveals subtle patterns within complex spatiotemporal features. These results suggest that the TGAN is well suited to handling complex, multi-time scale source–load characteristics, providing more precise data support for the planning and scheduling of power systems.
5. Evaluation of Generated Scenario Quality
The source–load scenarios generated by the TGAN should closely match the characteristics of real scenarios. To ensure high fidelity of the generated scenarios, this paper conducts a comprehensive evaluation from three aspects: time correlation, clustering effectiveness, and probability distribution characteristics.
5.1. Temporal Correlation Indicators
In this paper, the autocorrelation coefficient (ACF) is used as the time correlation index for the generated source–load scenario set, and its time correlation is compared with that of the historical data set to evaluate the performance of the generated scenarios. The ACF reflects the degree of correlation of source–load data over time, and the autocorrelation coefficient decreases gradually as the lag increases. Therefore, the performance of the source–load scenario set in capturing temporal characteristics can be measured by analyzing the magnitude and variation patterns of the ACF. The formula is as follows:

$$\rho_j = \frac{\sum_{t=1}^{T-j} (x_t - \bar{x})(x_{t+j} - \bar{x})}{\sum_{t=1}^{T} (x_t - \bar{x})^2}$$

where $x_t$ represents the source–load power at time $t$, $j$ represents the time interval, $\bar{x}$ represents the average source–load power, and $\rho_j$ represents the source–load autocorrelation coefficient with a time lag of $j$, which ranges from $-1$ to $1$. It should be noted that values of 1, −1, and 0 indicate a positive correlation, a negative correlation, and no correlation, respectively.
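A minimal sketch of how such an autocorrelation coefficient could be computed for a sampled power series is given below. The function name and the synthetic sinusoidal test series (96 samples per day, ten days) are assumptions for illustration and are not data from the paper.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                      # deviations from the average power
    denom = np.sum(xc ** 2)
    return np.array([
        np.sum(xc[: len(x) - j] * xc[j:]) / denom
        for j in range(max_lag + 1)
    ])

# Synthetic daily cycle: 10 days at 15 min resolution (96 samples per day).
t = np.arange(960)
series = np.sin(2 * np.pi * t / 96)
rho = acf(series, max_lag=96)
```

For this idealized daily cycle the coefficient starts at 1, becomes strongly negative at a lag of half a day (48 samples), and returns to a high positive value at a lag of one full day, the same qualitative shape one would expect from a solar output series.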
In Figure 10, the ACF of the historical source–load scenarios is compared with the ACF of the generated data, which provides a basis for evaluating the quality and fidelity of the generated scenarios. The ACF curve of the generated solar data closely matches that of the actual data and exhibits a similar trend. The correlation decreases significantly over time, reaching a negative value at around 12 h before returning to zero. This pattern suggests that the TGAN successfully captures the typical cyclical pattern of solar generation, most likely reflecting daily variations in sunlight. The ACF of the generated wind scenarios follows a similar pattern to that of the real scenarios. Although the correlation values deviate slightly over time, the overall correlation trend does not change significantly; the deviation may be caused by the highly variable and random nature of wind generation. Therefore, we can also conclude that the TGAN effectively learns the general time-dependent patterns of the wind scenarios. For the load scenarios, the ACF of the generated scenarios also aligns well with that of the real scenarios, showing a similar dip in correlation around midday. This pattern reflects typical daily load cycles, with demand peaking at certain times and decreasing during off-peak hours. The close alignment of the ACF curves between the generated and actual load scenarios suggests that the TGAN accurately models daily load fluctuation patterns. Overall, the data generated by the TGAN show high fidelity to the real scenarios in terms of temporal correlation, as evidenced by the tight alignment of the ACF curves. The minimal deviation between the real and generated scenarios indicates that the generated scenarios effectively capture the fundamental time characteristics of the historical source–load scenarios, and that the TGAN is well suited for generating realistic source–load scenarios.
In addition, the ACF value of the source–load scenarios dropped rapidly in the early stage, which indicates that there is a strong short-term dependency. The high consistency between the actual scenarios and the ACF value of the generated scenarios indicates that the model can effectively capture the time pattern of the source–load scenarios. However, the overall decline rate of wind power data is slower than that of solar scenarios and load scenarios, and the matching degree is not so perfect, which shows that the TGAN needs further improvements in the generation of long-term wind power scenarios.
5.2. Clustering Effectiveness Indicators
The source–load scenarios generated by the TGAN typically consist of large amounts of multi-spatiotemporal data. On one hand, clustering algorithms can categorize and summarize these scenarios, reducing redundant data and lowering the computational complexity of subsequent processing and analysis. On the other hand, by using clustering algorithms to extract representative typical scenarios, the system’s analysis and decision-making processes are enhanced.
Figure 11, Figure 12 and Figure 13 show the power scenarios obtained by K-means clustering of solar, wind, and load data after dimensionality reduction. Each group represents typical power patterns over different time periods, providing a representative view of daily, weekly, and monthly power fluctuations for each category. The solar power generation scenarios show a clear periodicity, with distinct peaks corresponding to daily solar irradiance characteristics, which is consistent with typical solar power generation patterns. In contrast, wind power scenarios are more irregular and exhibit greater randomness due to the natural variability in wind speed. Nevertheless, the TGAN captures wind power at different intensities. Although the fluctuation patterns are dispersed, this is consistent with the randomness of changes in wind speed. Load scenarios, on the other hand, show cyclical variations, forming peaks and troughs during periods of high and low demand, which is consistent with typical daily load cycles. Additionally, to provide a more intuitive description of the realism of the generated source–load scenarios, Figure 14 illustrates the source–load scenarios divided into multiple clusters through clustering. Each cluster represents a group of source–load data patterns with similar characteristics, such as variations in solar power generation or electricity load. Different colors are used to distinguish these clusters, clearly showing various typical patterns and trends, such as peak load periods or power output under specific weather conditions. This improves the understanding and simulation accuracy of real-world application scenarios. In summary, the TGAN generates high-fidelity representative power scenarios that preserve the core patterns of the original scenarios, making them suitable for further analysis or as inputs to models that require typical daily patterns, thereby increasing efficiency without significant information loss.
5.3. Probability Density Distribution Characteristics
To further evaluate the quality of the generated source–load scenarios, the cumulative distribution function (CDF) and probability density function (PDF) are introduced in this paper to visually represent the probability distribution of the source–load scenarios, which makes them easy to compare with the real scenarios.
As shown in Figure 15, the CDF curves of the generated scenarios are generally highly consistent with those of the actual data. In particular, in the ranges of 0–0.6 kW for solar power, 0.4–0.8 kW for wind power, and 8–16 kW for load demand, the generated and actual curves almost completely coincide. This shows that the generated scenarios effectively capture the cumulative probability distribution characteristics of the real scenarios.
Figure 16 provides further insights. The PDF curve of the generated photovoltaic scenarios aligns closely with the real scenarios in the low-power range (0–0.1 kW), showing a distinct peak. However, outside this low-power range, the generated scenarios exhibit increased fluctuation, indicating a slight deviation from the real distribution in less frequent power ranges, which may be due to the model's inaccuracy in handling low power output cases. For wind scenarios, the generated PDF curve is basically consistent with the actual scenarios in the main peak range (0–0.4 kW), with a slight difference above 0.4 kW. This indicates that the generated scenarios effectively capture the main distribution characteristics of wind power but perform less well at high power, suggesting that the model may not accurately capture the distribution of wind output, especially near peaks. In addition, the load scenarios present a bimodal pattern, and the generated scenarios closely match the actual scenarios at the 8 kW and 12 kW peaks, accurately reflecting typical daily load variations.
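The CDF and PDF comparison described above can be sketched as follows. The helper names, the use of a histogram as the density estimate, and the synthetic normally distributed samples (centered near 12 kW to mimic a load peak) are all assumptions for illustration; the paper does not state how its curves were estimated.

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Fraction of samples less than or equal to each grid value."""
    s = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(s, grid, side="right") / len(s)

def histogram_pdf(samples, bins, value_range):
    """Histogram-based density estimate; returns densities and bin edges."""
    return np.histogram(samples, bins=bins, range=value_range, density=True)

def max_cdf_gap(real, generated, grid):
    """Largest vertical distance between the two empirical CDFs on the grid."""
    return float(np.max(np.abs(empirical_cdf(real, grid) - empirical_cdf(generated, grid))))

cdf_small = empirical_cdf([1.0, 2.0, 3.0, 4.0], np.array([0.0, 2.0, 5.0]))

rng = np.random.default_rng(0)
real = rng.normal(12.0, 2.0, size=5000)        # toy "historical" load, kW
generated = rng.normal(12.0, 2.0, size=5000)   # toy "generated" load, kW
grid = np.linspace(4.0, 20.0, 161)
gap = max_cdf_gap(real, generated, grid)
dens, edges = histogram_pdf(real, bins=40, value_range=(real.min(), real.max()))
```

A small maximum gap between the two CDFs indicates that the generated samples reproduce the cumulative distribution of the real data, while comparing the two histograms bin by bin highlights where in the power range any mismatch occurs.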
It is important to note that a limitation of this study is that it does not consider source–load scenario generation under extreme conditions or the influence of meteorological factors. In future work, we will focus on the impact of extreme conditions and meteorological factors on source–load scenario models. Our aim is to improve the adaptability, robustness, and richness of these models, thereby providing more accurate technical support for planning personnel.
6. Conclusions
This paper proposes a multi-time source–load power scenario generation method based on a TGAN. To meet the long-term analysis needs of distribution networks, a comprehensive evaluation system for source–load scenarios across various temporal and spatial scales is established. Using historical source–load power data from a county in Shandong, China, the proposed method is validated for effectiveness and accuracy. The main conclusions are as follows:
- (1) Dividing typical daily source–load power states and simulating the state transition patterns with a Markov chain helps to uncover the probabilistic characteristics of source–load power across multiple temporal and spatial scales.
- (2) The introduction of an MHAM and TCN enhances the TGAN’s capability to capture the spatiotemporal features of source–load power, enabling the generated multi-scale source–load scenarios to better reflect the probabilistic and correlation characteristics of source–load power.
- (3) The evaluation system assesses the quality of the generated source–load scenarios, providing detailed and reliable analytical data to support long-term power and energy balancing as well as operational planning in distribution networks.
Additionally, the TGAN can effectively simulate actual energy generation and consumption patterns, which makes it a powerful tool for source–load prediction. It can further support distribution network management by helping to forecast future electricity consumption trends and to formulate more scientific and reasonable investment plans and infrastructure upgrade strategies. In future work, we will focus on improving the model algorithms in light of the limitations of existing models to improve accuracy in these critical areas. Other factors affecting source–load changes will also be considered, which will further improve the accuracy of the source–load scenarios. Finally, we will select an appropriate algorithm to automatically adjust and optimize the model parameters so that the TGAN model can adapt to different geographical and climatic conditions.