Article

A Two-Stage Generative Architecture for Renewable Scenario Generation Based on Temporal Scenario Representation and Diffusion Models

1 School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China
2 Electric Power Research Institute of Yunnan Power Grid Co., Ltd., Kunming 650217, China
3 Physics Department, Brandeis University, Waltham, MA 02453, USA
4 Department of Electrical and Computer Engineering, University of Denver, Denver, CO 80208, USA
* Author to whom correspondence should be addressed.
Energies 2025, 18(5), 1275; https://doi.org/10.3390/en18051275
Submission received: 3 February 2025 / Revised: 24 February 2025 / Accepted: 25 February 2025 / Published: 5 March 2025
(This article belongs to the Section A: Sustainable Energy)

Abstract

Scenario generation proves to be an effective approach for addressing uncertainties in stochastic programming for power systems with integrated renewable resources. In recent years, numerous studies have explored the application of deep generative models to scenario generation. Considering the challenge of characterizing renewable resource uncertainty, in this paper we propose a novel two-stage generative architecture for renewable scenario generation using diffusion models. Specifically, in the first stage, the temporal features of the renewable energy output are learned and encoded into the hidden space by a representation model with an encoder–decoder structure, which provides a scenario-specific inductive bias for generation. In the second stage, the real distribution of vectors in the hidden space is learned by a conditional diffusion model, and the generated scenarios are obtained through decoder mapping. The case study demonstrates the effectiveness of this architecture in generating high-quality renewable scenarios. In comparison to advanced deep generative models, the proposed method exhibits superior performance in a comprehensive evaluation.

1. Introduction

The increasing penetration of renewable resources brings about heightened intermittency and randomness, significantly impacting the stable operation of power systems. Within the analysis of power systems incorporating renewable resources, modeling the uncertainty inherent in them has emerged as a pivotal concern. Scenario analysis, by reflecting the probabilistic characteristics of uncertain variables, can describe the uncertainty of renewable resource output. Based on this approach, stochastic planning models can be established for objectives such as economic dispatch [1] and unit commitment [2]. The fundamental concept of scenario analysis involves discretizing a random vector with a continuous probability distribution into a set of samples, effectively transforming a stochastic optimization problem into a deterministic one. In the context of stochastic optimization tasks, the discretized samples used for solving deterministic optimization problems are referred to as scenarios. In this work, these scenarios are represented as the daily output curves of renewable energy. The precision of the scenarios constructed in scenario analysis directly influences the proximity of the solution to the corresponding optimal value in the actual stochastic optimization problem. Therefore, accurately generating renewable scenarios stands out as a crucial research direction.
In numerous prior studies, the generation of renewable scenarios has commonly relied on methods grounded in mathematical modeling. That is, a probabilistic model is constructed based on real data to reflect the distribution pattern of renewable resource output. Among these, the Monte Carlo (MC) and Latin Hypercube Sampling (LHS) methods generate new scenarios for analysis through sampling. In [3], MC simulation is used to generate scenarios to simulate the power case of an unbalanced system. The MC method has the advantages of simplicity and speed. However, when the MC method is combined with simple random sampling, it faces the challenges of prolonged computation and heavy storage requirements. The LHS method has been proposed to address this issue. In [4], scenarios for wind and solar energy are generated for day-ahead scheduling optimization by analyzing the renewable energy output errors and sampling their probability distributions using LHS. In addition, the copula method can model the dependence between random variables in a multivariate structure and imposes no limitation on the marginal distributions. Camal et al. [5] propose a multivariate copula modeling approach to predict correlated scenarios for renewable resources and apply it to unit commitment, effectively reducing system operating costs. In [6], the vine copula approach is utilized to establish correlations among onboard multi-energy loads and photovoltaic output, and to generate scenarios to address uncertainty in ship microgrids. However, the above methods usually necessitate assumptions about prior probabilities. Therefore, the challenge of these methods lies in enhancing the similarity between the scenario set and the problem to be addressed by establishing an appropriate probability model. In practice, the variability of climate, the non-linearity of energy conversion in generators, and the intricate temporal–spatial relationships between sites make it challenging to construct accurate probability models. Another commonly adopted method is the optimization-based scenario generation approach. This method involves scenario reduction to obtain suitable scenarios from the set of all possible scenarios, including clustering, forward selection, moment matching [7], and others. However, optimization-based methods face issues such as low computational efficiency in large-scale situations and the inability to capture extreme scenarios.
Advancements in artificial intelligence technology are driving the emergence of data-driven methods in power system scenario generation research. In [8], a gated recurrent unit (GRU) is combined with a convolutional neural network to predict weekly photovoltaic power scenarios based on weather errors and categories. In [9], a class-driven scenario generation method is proposed. This method first categorizes electricity price sequences at different time points and then integrates a softmax layer into a long short-term memory (LSTM) network for training, aiming to obtain parameter predictions for the target probability distribution of electricity prices. Considering the challenge of fitting the underlying distribution of real data, generative models are better suited for scenario generation tasks. In recent years, generative models, represented by Variational Auto-Encoders (VAEs) [10] and Generative Adversarial Networks (GANs) [11], have found numerous applications in the generation of renewable scenarios. In [12], a GAN-based method is first applied to generate renewable scenarios; it effectively learns the features of renewable energy output and provides a robust generative framework. In [13], a privacy-preserving method is proposed for renewable scenario generation by integrating federated learning and Least Square Generative Adversarial Networks (LSGANs), which outperforms other advanced models. In [14], GANs are employed to learn the distribution of real data, and stochastic constrained optimization is used to predict day-ahead scenarios. In [15], a style-based GAN incorporating a sequence encoder is proposed to utilize weather conditions to guide controlled day-ahead scenario generation. In [16], VAEs are utilized for scenario generation in the context of coordinated optimization within a multi-energy system. In [17], a conditional VAE-based approach is proposed for wind power forecasting. However, the performance of the VAE model in practical applications is limited by its posterior distribution. Ensuring the convergence and stability of GANs during the training process is challenging. This may result in insufficient diversity of the samples generated by GANs and challenges in capturing the complete distribution of real data.
Diffusion models constitute a new category of data-driven generative models, originating from [18]. They have since demonstrated superior capabilities in the image domain compared to GANs; Ref. [19] indicates that diffusion models outperform GANs in diverse tests, and the introduction of Stable Diffusion further proves their superior performance in terms of image generation quality [20]. Thus, diffusion models can be considered a candidate for the scenario generation task. In [21], a diffusion-based model is proposed for generating battery-level and station-level charging demand scenarios. Nevertheless, diffusion models suffer from slow sampling speed and a constrained ability to generalize to various data types. Currently, there is limited research in the electricity domain, and several problems must be addressed before they can be applied to renewable scenario generation. Moreover, the above generative models focus exclusively on learning the global data distribution; when applied to renewable scenarios, most of them encounter challenges in accurately capturing temporal characteristics.
In addition, renewable scenarios, as a type of time series, possess notable temporal features. Time series representation learning methods aim to adeptly extract meaningful features from such data, transforming the raw data into a more informative representation. Franceschi et al. propose a method combining causal dilated convolutions with time-based negative sampling to obtain general-purpose representations of multivariate time series [22]. In [23], a neighborhood-based unsupervised learning framework is proposed for general time series representation. This framework achieves effective representation using only a single-layer recurrent neural network (RNN) encoder. In [24], a method based on the Transformer encoder structure is first proposed for multivariate time series characterization and achieves satisfactory results in classification tasks. However, these methods provide a coarse-grained representation of the entire time series, which may not achieve adequate performance. Moreover, if the Transformer structure is applied for encoding, the computational complexity grows quadratically with the sequence length, diminishing its efficacy for renewable scenarios with strong uncertainty and long time series.
To address the above issues, we propose a two-stage generative architecture that combines scenario representation with diffusion models for renewable scenario generation. Because simultaneously optimizing temporal feature learning and noise prediction is unstable and makes it difficult to converge to optimal parameters, we separate the task into two stages. In the first stage, we introduce a time series representation model with an encoder–decoder structure to enhance the temporal features of renewable scenarios within the latent space. In the second stage, we employ a conditional implicit diffusion model to learn the joint distribution of latent space variables. The diffusion model maps random noise to real data and achieves the generation of renewable scenarios through the decoder of the first stage. Through comprehensive experiments, we substantiate the effectiveness and reliability of the proposed method. Compared with current mainstream deep generative models, including GANs, VAEs, and diffusion models, the results of qualitative and quantitative evaluations from various perspectives indicate that the proposed method demonstrates superior performance in scenario generation tasks. The main contributions of this paper are summarized as follows:
  • A novel two-stage generative architecture is designed for renewable scenario generation using diffusion models. The diffusion model is extended to the conditional implicit one, enabling deterministic generation of specific scenarios that effectively capture the complex patterns of renewable energy.
  • This paper proposes a time series representation module suitable for renewable energy scenarios, which reduces computational complexity through patching. The module also introduces prior temporal knowledge learning, effectively addressing the difficulty that diffusion models have in modeling temporal correlations.
The rest of this paper is organized as follows. Section 2 describes the basic principles of time series representation and diffusion models. Section 3 elaborates on the proposed two-stage generative architecture. Section 4 verifies the effectiveness of the proposed method. Section 5 concludes this paper.

2. Time Series Representation and Diffusion Models

2.1. Problem Formulation

Assume that the number of observed renewable resource sites is $M$. Let $x_s \in \mathbb{R}^L$ denote the output of renewable site $s$ in the historical observation data, where $s = 1, \ldots, M$ and $L$ is the sequence length. The time series representation aims to study a non-linear projection function which maps each $x_s$ to its optimal self-described representation $r_s$, which can be expressed by the joint probability distribution $p_r(r_s \mid x_s)$. The purpose of scenario generation is to train a generative model $G_\theta(\cdot)$ with parameters $\theta$. Through the learning of historical data, it achieves the mapping from the distribution $p_r(r_s \mid x_s)$ to synthetic data $\hat{x}_s \sim p_x(\hat{x}_s \mid r_s)$ whose distribution is as close as possible to the unknown distribution $p_x(x_s)$ of the historical data. These generated time series data constitute the scenarios and adhere to the following objective function:
$$\min \left\| P_x\!\left(\hat{x}_s\right) - P_x\!\left(x_s\right) \right\| \leq \varepsilon \quad (1)$$
where $P_x(\cdot)$ is the probability operator and $\varepsilon$ denotes an acceptable distribution error. The set of scenarios that satisfy objective (1) can depict a stochastic process with characteristics similar to the real renewable output data.

2.2. Time Series Representation

Consider a scenario sample $x_s \in \mathbb{R}^L$, to which the time-masked encoder module is applied to perform temporal feature extraction for renewable scenarios. Inspired by [25,26], we introduce patches to chunk the original time series.
(1) Patching the Series: Each input sample undergoes an initial segmentation into multiple non-overlapping patches. Denote the length of a patch as $P$ and the interval between two consecutive patches as $S$. Consequently, this patching procedure yields a sequence of patches $x_p \in \mathbb{R}^{P \times N}$, where $N = \frac{L+S}{P+S}$ is the number of patches.
(2) Masked Mechanism: After patching, a subset of patches is selected and masked to value 0 while recording its indices. The selection strategy employed here is a simple uniform distribution random selection to prevent potential center bias.
(3) The Transformer-Based Encoder: To achieve the representation of renewable scenarios, we use a vanilla Transformer encoder to map the patched samples to the latent space. Neural networks with a Transformer structure inherently possess the capability for temporal modeling because they carry an encoding of sequence position information. The input patches are projected to a latent representation with $D$ dimensions via a learnable embedding layer $W_p \in \mathbb{R}^{D \times P}$. Furthermore, an additional positional encoding $W_{pos} \in \mathbb{R}^{D \times N}$ is incorporated to delineate the sequential order of patches, enabling the model to capture temporal relationships. The embedded sample $x_d$ is articulated as follows:
$$x_d = W_p x_p + W_{pos} \quad (2)$$
Subsequently, $x_d$ is fed into the Transformer encoder to generate the representation of the input sample, denoted as $r \in \mathbb{R}^{D \times N}$, in the latent space.
(4) Loss Function: A linear head with $W_o \in \mathbb{R}^{P \times D}$ and $b_o \in \mathbb{R}^{P \times N}$ is applied over the ultimate representation vector $r$, forecasting $\hat{x}_p$ for the input patches $x_p$:
$$\hat{x}_p = W_o r + b_o \quad (3)$$
Typically, the mean square error (MSE) loss is used to measure the discrepancy between the estimated and true masked patches:
$$\mathcal{L}_{MSE} = \frac{1}{M} \sum_{i}^{M} \left\| \hat{x}_i - x_i \right\|^2 \quad (4)$$
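For concreteness, the following minimal PyTorch sketch strings together the patching, random masking, Transformer encoding, and linear reconstruction head of Equations (2)–(4), with the MSE computed only on the masked patches. It is our own illustration rather than the implementation used in this work, and names such as `MaskedPatchEncoder` are placeholders.

```python
import torch
import torch.nn as nn

class MaskedPatchEncoder(nn.Module):
    """Minimal sketch of the patch-mask-encode representation model (placeholder names)."""
    def __init__(self, seq_len=288, patch_len=12, d_model=128, n_layers=3, n_heads=8, mask_ratio=0.5):
        super().__init__()
        assert seq_len % patch_len == 0
        self.patch_len, self.mask_ratio = patch_len, mask_ratio
        self.n_patches = seq_len // patch_len                     # N = L / P with S = 0
        self.embed = nn.Linear(patch_len, d_model)                # plays the role of W_p
        self.pos = nn.Parameter(torch.randn(1, self.n_patches, d_model) * 0.02)  # W_pos
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, patch_len)                 # linear head (W_o, b_o)

    def forward(self, x):                                         # x: (batch, L)
        patches = x.view(x.size(0), self.n_patches, self.patch_len)           # (batch, N, P)
        mask = torch.rand(x.size(0), self.n_patches, device=x.device) < self.mask_ratio
        masked = patches.masked_fill(mask.unsqueeze(-1), 0.0)                  # zero out selected patches
        r = self.encoder(self.embed(masked) + self.pos)                        # latent representation (batch, N, D)
        recon = self.head(r)                                                   # predicted patches (batch, N, P)
        loss = ((recon - patches) ** 2)[mask].mean()                           # MSE on masked patches only
        return loss, r

# usage: one training step on random stand-in data
model = MaskedPatchEncoder()
x = torch.randn(32, 288)           # a batch of normalized daily output curves
loss, r = model(x)
loss.backward()
```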

2.3. Denoising Diffusion Probabilistic Models

The denoising diffusion probabilistic model (DDPM) is a generative model that employs variational inference to train a parameterized Markov chain capable of generating samples that align with the original data [18]. To distinguish the time steps of diffusion models from those of the time series, we denote $T$ as the number of diffusion steps and $L$ as the number of periods per time series. We consider a dataset $O = \{r^i\}_{i=1}^{N}$ comprising $N$ continuous variables $r^i \in \mathbb{R}^{L \times F}$ sampled from the real represented distribution $p_r(r)$, where $F$ is the feature dimension. Let $r_0$ denote the 0th diffusion step of $r^i$; $r_1, \ldots, r_t, \ldots, r_T$ are diffusion variables of the same dimensionality as the data $r_0$. The DDPM aims to learn an underlying distribution $p_\theta(r_0)$ that is similar to $p_r(r_0)$. At a high level, the DDPM samples noise from a Gaussian distribution and transforms it to conform to the distribution $p_r(r)$ by reversing a gradual noising process. In particular, sampling starts with noise $r_T$ and gradually produces less noisy samples $r_{T-1}, r_{T-2}, \ldots$, until reaching a final sample $r_0$. Formally, the DDPM is a latent variable model of the form:
$$p_\theta(r_0) := \int p_\theta(r_{0:T}) \, \mathrm{d}r_{1:T} \quad (5)$$
As illustrated in Figure 1, the training process of the DDPM primarily consists of two parts: the diffusion process and the denoising process. The diffusion process involves incrementally adding noise to the data through a Markov chain until the signal is degraded. The denoising process involves learning to reverse a diffusion process, gradually eliminating noise in the opposite direction of diffusion.
(1) The Diffusion Process: This process is not learned but fixed, distinguishing diffusion models from other latent variable models. Indeed, the approximate posterior $q(r_{1:T} \mid r_0)$ is derived from a Markov chain that systematically introduces Gaussian noise to the data $r_0$, which can be formulated as follows:
$$q(r_{1:T} \mid r_0) := \prod_{t=1}^{T} q(r_t \mid r_{t-1}) \quad (6)$$
Each diffusion step $q(r_t)$ depends only on the preceding step $q(r_{t-1})$ and is obtained by adding noise to the input $r_{t-1}$; it can be defined as follows:
$$q(r_t \mid r_{t-1}) := \mathcal{N}\!\left(r_t;\ \sqrt{1-\beta_t^2}\, r_{t-1},\ \beta_t^2 I\right) \quad (7)$$
where $\beta_t \in (0, 1)$ is a predetermined variance schedule with no learnable parameters. This procedure distorts the original signal, inducing it to adhere to a standard Gaussian distribution $\mathcal{N}(0, I)$.
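As a small numerical illustration of Equation (7), the sketch below noises a toy series step by step and checks it against the closed-form expression $r_t = \bar{\alpha}_t r_0 + \bar{\beta}_t \varepsilon$ used later in the derivation. The interpretation of the $(\beta_1, \beta_T)$ schedule as the per-step noise variance is our assumption for illustration only.

```python
import numpy as np

T = 100
beta_sq = np.linspace(1e-4, 0.1, T)          # assumed: per-step noise variance beta_t^2
beta = np.sqrt(beta_sq)
alpha = np.sqrt(1.0 - beta_sq)               # alpha_t = sqrt(1 - beta_t^2)
alpha_bar = np.cumprod(alpha)                # cumulative product, \bar{alpha}_t
beta_bar = np.sqrt(1.0 - alpha_bar ** 2)     # \bar{beta}_t

rng = np.random.default_rng(0)
r0 = np.sin(np.linspace(0, 4 * np.pi, 288))  # stand-in for an encoded daily scenario

# iterate q(r_t | r_{t-1}) = N(sqrt(1 - beta_t^2) r_{t-1}, beta_t^2 I)
r = r0.copy()
for t in range(T):
    r = alpha[t] * r + beta[t] * rng.standard_normal(r.shape)

# the closed form r_T = alpha_bar_T * r_0 + beta_bar_T * eps gives the same distribution
r_closed = alpha_bar[-1] * r0 + beta_bar[-1] * rng.standard_normal(r0.shape)
# alpha_bar[-1] indicates how much of the original signal remains after T steps
print(round(alpha_bar[-1], 3), round(r.std(), 2), round(r_closed.std(), 2))
```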
(2) The Denoising Process: This part seeks $p(r_0 \mid r_T)$ to eliminate the noise introduced by the diffusion process, facilitating the recovery of $r_0$. As $p(r_{t-1} \mid r_t)$ is unknown, the reverse process cannot directly obtain an explicit expression similar to Equation (6). Under the given condition on $r_0$, applying Bayes' theorem yields:
$$p(r_{t-1} \mid r_t, r_0) = \frac{p(r_t \mid r_{t-1})\, p(r_{t-1} \mid r_0)}{p(r_t \mid r_0)} \quad (8)$$
To make the final formula more concise and readable, we introduce the auxiliary variables $\alpha_t = \sqrt{1-\beta_t^2}$, $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$, and $\bar{\beta}_t = \sqrt{1-\bar{\alpha}_t^2}$. According to the superposition principle of the Gaussian distribution, we have $p(r_t \mid r_0) = \mathcal{N}\!\left(r_t;\ \bar{\alpha}_t r_0,\ \bar{\beta}_t^2 I\right)$; therefore, (8) can be rewritten as
$$p(r_{t-1} \mid r_t, r_0) = \mathcal{N}\!\left(r_{t-1};\ \frac{\alpha_t \bar{\beta}_{t-1}^2}{\bar{\beta}_t^2}\, r_t + \frac{\bar{\alpha}_{t-1} \beta_t^2}{\bar{\beta}_t^2}\, r_0,\ \frac{\bar{\beta}_{t-1}^2 \beta_t^2}{\bar{\beta}_t^2}\, I\right) \quad (9)$$
where $r_0$ is the result we ultimately want to generate, which makes the expression conflict with our purpose. Therefore, a neural network $\bar{u}(r_t)$ is introduced to estimate $r_0$, with loss function $\left\| r_0 - \bar{u}(r_t) \right\|^2$. From the diffusion process formula $p(r_t \mid r_0)$, it can be explicitly deduced that $r_t = \bar{\alpha}_t r_0 + \bar{\beta}_t \varepsilon$, where $\varepsilon$ is a known Gaussian noise. If $\bar{u}(r_t)$ is used to estimate $r_0$, then $\varepsilon$ is replaced by an estimator $\varepsilon_\theta$ with parameters $\theta$. Thus, $\bar{u}(r_t)$ can be parameterized as
$$\bar{u}(r_t) = \frac{1}{\bar{\alpha}_t}\left(r_t - \bar{\beta}_t\, \varepsilon_\theta(r_t, t)\right) \quad (10)$$
The loss function is derived as follows:
$$\mathcal{L}_{DM} = \mathbb{E}_{\varepsilon \sim \mathcal{N}(0, I)}\left[\frac{\bar{\beta}_t^2}{\bar{\alpha}_t^2} \left\| \varepsilon - \varepsilon_\theta\!\left(\bar{\alpha}_t r_0 + \bar{\beta}_t \varepsilon,\ t\right) \right\|^2\right] \quad (11)$$
The conclusive denoising process is also constrained to a Markov chain, parameterized by the neural network and defined as follows:
$$p_\theta(r_0 \mid r_T) = p(r_T) \prod_{t=1}^{T} p_\theta(r_{t-1} \mid r_t) \quad (12)$$

3. Methodology

When applied to renewable power output scenario generation, the DDPM fails to accurately model the temporal correlation of scenarios due to the interference of the noise signal, resulting in an insufficient description of uncertainty. In this section, we elaborate on the proposed method for renewable scenario generation. It is a novel two-stage generative architecture that combines time series representation and diffusion models, as depicted in Figure 2. Unlike existing single-stage deep generative models, we introduce a time-series representation model to capture the characteristics of renewable energy generation. This approach provides prior temporal knowledge to the diffusion model, thereby enhancing its expressive capabilities.

3.1. Scenario Representation of the First Stage

In the first stage, according to Section 2.2, we pre-train the time-masked representation model through unsupervised learning; the model comprises the vanilla Transformer structure [27].
As shown in Figure 3, the model can be viewed as an encoder–decoder architecture. Each training sample is divided into $N$ patches before entering the encoder. In this work, we choose $P = 12$ and $S = 0$, with a sample sequence length $L = 288$. If the original Transformer encoder with multi-head attention were applied directly, the computational complexity would reach $O(L^2)$. After introducing patches, the input sequence length is reduced to $N = L/P = 24$, resulting in a quadratic reduction in computational complexity to $O(L^2/P^2)$. Each sample is divided into non-overlapping patches to ensure that the observed patches do not contain masked information. Following this, we randomly and uniformly select a subset of patches constituting 50% of the total, mask it with zero values, and record the indices. The entirety of patches, including the masked subset, is subsequently fed into the encoder to derive their representation within the latent space. The model's output layer uses a simple linear layer instead of the vanilla Transformer's autoregressive decoding output. This enables multi-step direct prediction, avoiding error accumulation and slow decoding.
Finally, the model is trained using MSE loss to reconstruct the masked patches based on their indices. Table 1 summarizes the details of the representation model parameters.
In fact, our representation model must also reconstruct the original data from the representation vectors in the latent space. Therefore, we removed the original output layer, retained the encoder module, and appended a linear layer as the decoder module, whose output dimension is aligned with the input dimension of the model; the network parameters were then fine-tuned. Note that this step is essential not only to ensure the accuracy of the reconstructed samples but also to constrain the scaling of the latent space within a certain range. This maintains stability in the subsequent learning of the diffusion model in the latent space and avoids distribution disturbances caused by outlier samples. Hence, the loss function for the fine-tuning process should include Kullback–Leibler (KL) regularization on the basis of Equation (4), defined as follows:
$$\mathcal{L} = \mathcal{L}_{MSE} + \omega\, \mathrm{KL}\!\left(\mathcal{N}(\mu, \sigma) \,\|\, \mathcal{N}(0, I)\right) \quad (13)$$
where μ and σ are the mean vector and variance vector of a normal distribution, and ω denotes the regularization weight. It can be explicitly expressed as follows:
$$\mathcal{L} = \frac{1}{N} \sum_{i}^{N} \left\| \hat{x}_i - x_i \right\|^2 + \frac{1}{2} \sum_{j=1}^{D} \left( \mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1 \right) \quad (14)$$
where D represents the dimension of the latent variable r. Through KL regularization, the model aligns the distribution of the encoded latent space with a standard normal distribution during the fine-tuning process, thus avoiding arbitrary dimension scaling.
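A minimal sketch of the fine-tuning objective in Equation (14) follows, assuming a VAE-style parameterization in which the encoder outputs a mean and log-variance per latent dimension; the function name `kl_regularized_loss` and the tensor shapes are our own illustration, not the exact implementation.

```python
import torch

def kl_regularized_loss(x_hat, x, mu, log_var, omega=1.0):
    """Reconstruction MSE plus KL(N(mu, sigma) || N(0, I)), as in Equation (14) (sketch)."""
    mse = ((x_hat - x) ** 2).mean()
    # KL divergence summed over the D latent dimensions, then averaged over the batch
    kl = 0.5 * (mu ** 2 + log_var.exp() - log_var - 1.0).sum(dim=-1).mean()
    return mse + omega * kl

# usage with random stand-in tensors
x = torch.randn(32, 288)
x_hat = torch.randn(32, 288)
mu, log_var = torch.randn(32, 128), torch.randn(32, 128)
loss = kl_regularized_loss(x_hat, x, mu, log_var, omega=1e-3)
```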

3.2. The Diffusion Model of the Second Stage

In the second stage, the diffusion model comprehensively learns the underlying data distribution in the latent space with temporal inductive bias, where all renewable scenario data undergo encoding by the pre-trained representation model encoder from the first stage, as shown in Figure 2.
(1) Conditional Implicit Diffusion Model: To synthesize customized renewable scenarios for specific scenario analysis problems, we extend the traditional DDPM to a conditional one. Given that the denoising process starts from Gaussian noise and lacks information about the target output $r_0$, the generation procedure is inherently uncontrollable. We introduce the label $c$ as conditional information for the sample and embed it into the prediction neural network $\bar{u}(r_t)$ to guide conditional generation. Therefore, according to Equation (10), $\bar{u}(r_t)$ can be parameterized as follows:
$$\bar{u}(r_t) = \frac{1}{\bar{\alpha}_t}\left(r_t - \bar{\beta}_t\, \varepsilon_\theta(r_t, t, c)\right) \quad (15)$$
From a higher-level standpoint, although the denoising process described in Section 2.3 forms a Markov chain, $p(r_t \mid r_{t-1})$ does not directly contribute to the calculation of the final outcome, so the term $p(r_t \mid r_{t-1})$ can be omitted from the derivation. At this point, the solution space for Equation (8) becomes more expansive and only needs to satisfy the marginal distribution condition:
$$\int p(r_{t-1} \mid r_t, r_0)\, p(r_t \mid r_0)\, \mathrm{d}r_t = p(r_{t-1} \mid r_0) \quad (16)$$
Similar to Equation (9), we solve the above Equation (16) using the method of undetermined coefficients, assuming the following:
$$p(r_{t-1} \mid r_t, r_0) = \mathcal{N}\!\left(r_{t-1};\ \kappa_t r_t + \lambda_t r_0,\ \sigma_t^2 I\right) \quad (17)$$
where $\kappa_t$, $\lambda_t$, and $\sigma_t$ are the undetermined coefficients. Taking $\sigma_t$ as the variable parameter, the final result can be derived according to Equation (5):
$$p(r_{t-1} \mid r_t, r_0) = \mathcal{N}\!\left(r_{t-1};\ \frac{\gamma}{\bar{\beta}_t}\, r_t + \left(\bar{\alpha}_{t-1} - \frac{\bar{\alpha}_t \gamma}{\bar{\beta}_t}\right) r_0,\ \sigma_t^2 I\right) \quad (18)$$
where $\gamma = \sqrt{\bar{\beta}_{t-1}^2 - \sigma_t^2}$. As in Equation (15), $r_0$ is approximated by the neural network $\bar{u}(r_t)$. Let $\sigma_t = \eta\, \dfrac{\bar{\beta}_{t-1} \beta_t}{\bar{\beta}_t}$, where $\eta \in [0, 1]$. The retained term $\sigma_t$ in Equation (18) adjusts the noise intensity to control the randomness of the denoising process. When $\eta = 1$, Equation (18) reduces to Equation (9). When $\eta$ is set to 0, the denoising process becomes a deterministic transformation:
$$r_{t-1} = \frac{1}{\alpha_t}\left(r_t - \left(\bar{\beta}_t - \alpha_t \bar{\beta}_{t-1}\right) \varepsilon_\theta(r_t, t, c)\right) \quad (19)$$
(2) Training of the Diffusion Model: For each sample $r_0$, the diffusion process adds Gaussian noise to it according to Equation (6), resulting in $r_T \sim \mathcal{N}(0, I)$, where the $\beta$ schedule increases linearly from $\beta_1 = 10^{-4}$ to $\beta_T = 0.1$ and $T = 100$. In the denoising process, conditional information $c$, such as the weather, and the time step $t$ are introduced. According to Equations (15) and (19), $r_{t-1}$ can be obtained from the Bayesian transformation of $r_t$. After $T$ iterations, the noise added to $r_T$ is systematically eliminated to restore the original $r_0$. Subsequently, the diffusion model is optimized based on the loss function $\left\| \varepsilon - \varepsilon_\theta\!\left(\bar{\alpha}_t r_0 + \bar{\beta}_t \varepsilon,\ t,\ c\right) \right\|^2$.
It can be seen that we explicitly incorporate the parameter $t$ in the input. In principle, different time steps $t$ correspond to objects at different noise levels, necessitating diverse diffusion models; ideally, there should be $T$ distinct ones. Nevertheless, we opt to share parameters across all diffusion steps. The time step $t$, converted into the corresponding positional encoding, serves as a condition and is directly integrated into the embedding block. In this work, the neural network $\varepsilon_\theta$ of the diffusion model adopts the U-Net structure as its backbone; specific details can be found in [28]. In this structure, the ResNet module is replaced with the ConvNeXt module [29]. The parameters of the diffusion model are summarized in Table 2. Algorithm 1 illustrates the pseudo-code for training the diffusion model.
Algorithm 1 The Diffusion Model Training
Input: The pre-trained representation model with encoder and decoder, diffusion model parameters $\theta$, diffusion time steps $T$, noise schedule $\{\beta_t\}_{t=1}^{T}$, renewable dataset $D$, batch size $B$, epochs $E_t$, learning rate $lr$.
1:  procedure
2:    for $e = 1, \ldots, E_t$ do
3:      for each batch of training data do
4:        for $i = 1, \ldots, B$ do
5:          Sample original data $\{x_0, c\} \sim D$
6:          Encode the data $x_0 \rightarrow r_0$
7:          Sample diffusion step $t \sim \mathrm{Uniform}(1, T)$ and Gaussian noise $\varepsilon \sim \mathcal{N}(0, I)$
8:          Calculate $r_t \leftarrow \bar{\alpha}_t r_0 + \bar{\beta}_t \varepsilon$
9:          Obtain estimated noise $\varepsilon_\theta(r_t, t, c)$
10:         Compute loss $\mathcal{L}_{DM} \leftarrow \left(\varepsilon - \varepsilon_\theta(r_t, t, c)\right)^2$
11:       end for
12:       Update the model parameters $\theta \leftarrow \mathrm{Adam}\!\left(\nabla_\theta \frac{1}{B} \sum_{i=1}^{B} \mathcal{L}_{DM},\ lr\right)$
13:     end for
14:   end for
15: end procedure
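For readers who prefer code, the following PyTorch-style sketch condenses Algorithm 1. Here `eps_model`, `encoder`, and `loader` are assumed stand-ins for the U-Net noise predictor, the frozen first-stage encoder, and a dataset iterator, and the interpretation of the noise schedule is an assumption rather than the authors' exact implementation.

```python
import torch

def train_diffusion(eps_model, encoder, loader, T=100, epochs=200, lr=1e-3, device="cpu"):
    """Sketch of Algorithm 1: train eps_theta(r_t, t, c) on latent-space scenarios."""
    beta_sq = torch.linspace(1e-4, 0.1, T, device=device)        # assumed per-step noise variance schedule
    alpha_bar = torch.cumprod(torch.sqrt(1.0 - beta_sq), dim=0)  # \bar{alpha}_t
    beta_bar = torch.sqrt(1.0 - alpha_bar ** 2)                  # \bar{beta}_t
    opt = torch.optim.Adam(eps_model.parameters(), lr=lr)

    for _ in range(epochs):
        for x0, c in loader:                                     # daily scenario and its condition label
            x0, c = x0.to(device), c.to(device)
            with torch.no_grad():
                r0 = encoder(x0)                                 # first-stage encoding x0 -> r0
            t = torch.randint(0, T, (r0.size(0),), device=device)
            eps = torch.randn_like(r0)
            shape = (-1,) + (1,) * (r0.dim() - 1)
            rt = alpha_bar[t].view(shape) * r0 + beta_bar[t].view(shape) * eps   # closed-form noising
            loss = ((eps - eps_model(rt, t, c)) ** 2).mean()                     # noise-prediction MSE
            opt.zero_grad()
            loss.backward()
            opt.step()
    return eps_model
```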
Upon completion of the diffusion model training, the generation of new scenarios involves the combination of the two-stage models. Setting $\sigma_t = 0$, a random $r_T$ is sampled from a standard Gaussian distribution. The trained diffusion model from the second stage is then utilized to generate a latent space vector $r_g$, following the procedure outlined in Equation (18). Subsequently, this vector $r_g$ is mapped to the entire renewable scenario through the pre-trained representation decoder from the first stage.
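The generation path just described can be sketched as follows, using the deterministic update of Equation (19) (i.e., $\eta = 0$); `eps_model` and `decoder` stand in for the trained second-stage noise predictor and the first-stage decoder, and the schedule handling mirrors the training sketch above.

```python
import torch

@torch.no_grad()
def generate_scenarios(eps_model, decoder, c, n_samples, latent_shape, T=100, device="cpu"):
    """Deterministic latent-space sampling (eta = 0) followed by first-stage decoding (sketch)."""
    beta_sq = torch.linspace(1e-4, 0.1, T, device=device)
    alpha = torch.sqrt(1.0 - beta_sq)                            # alpha_t
    alpha_bar = torch.cumprod(alpha, dim=0)
    beta_bar = torch.sqrt(1.0 - alpha_bar ** 2)

    r = torch.randn((n_samples,) + latent_shape, device=device)  # r_T ~ N(0, I)
    for t in range(T - 1, -1, -1):
        t_batch = torch.full((n_samples,), t, device=device, dtype=torch.long)
        eps_hat = eps_model(r, t_batch, c)
        bb_prev = beta_bar[t - 1] if t > 0 else torch.tensor(0.0, device=device)
        # Equation (19): r_{t-1} = (r_t - (beta_bar_t - alpha_t * beta_bar_{t-1}) * eps_hat) / alpha_t
        r = (r - (beta_bar[t] - alpha[t] * bb_prev) * eps_hat) / alpha[t]
    return decoder(r)                                            # map latent vectors back to scenarios
```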

4. Case Studies

4.1. Experiment Settings

(1) Data Preparation: We employed the integrated wind and solar energy dataset sourced from NREL [30]. The experiment utilized data from 24 wind and 32 solar sites situated in Washington state as input for the model. The selected wind and solar stations possess consistent installed capacities, are geographically proximate, and display a certain level of correlation between the outputs of each site. The historical data have a time resolution of 5 min. We use 80% of the data as the training set, while the remaining 20% serves as the test set against which the model is assessed, thereby providing a genuine comparison baseline.
(2) Benchmarks: In many previous studies, it has been demonstrated that artificial intelligence methods are superior to traditional probabilistic models. Therefore, this paper primarily compares several advanced deep generative models, including LSGANs, VAEs, and DDPM. In this work, we introduce the original DDPM as a benchmark for scenario generation. It can be considered as an ablation experiment to highlight the superiority of our proposed method. All experiments were conducted on the CentOS 7 platform, equipped with 4 NVIDIA GeForce RTX 2080 Ti GPUs.

4.2. Performance in Scenario Representation

As this work focuses on scenario generation, this section only briefly illustrates the efficacy of the first-stage representation. As shown in Figure 4a, the loss value continuously decreases during the training process until convergence, and the reconstruction results at different checkpoints during training are presented alongside. Furthermore, three randomly selected samples from the test set are visually displayed in Figure 4b, and the MSE calculated on the test set is less than 0.001. It can be observed that the reconstructed curves almost overlap with the original curves, demonstrating the effectiveness of the first-stage autoencoder representation learning.

4.3. Performance in Scenario Generation

In this section, we validate the performance of our proposed method in scenario generation by comparing it with the three benchmarks mentioned above. The NREL power data comprise 288 points per day. All three benchmark models convert the 2-day data (576 points) into a 24 × 24 matrix as a training sample. This is because most current deep generative models are initially designed for the image domain with uniform dimensions. When analyzing the data, they treat the generated 2-day data as two separate samples, inevitably leading to some deviation in describing the daily output of renewable energy. In our approach, during the training phase, 288 points are input as a time series into the representation encoder to derive its mapped latent vector. Subsequently, the diffusion model comprehensively learns the potential distribution of vectors within the latent space. Upon completing the training of all models, a substantial number of scenarios are randomly generated for subsequent analysis of the results.
(1) Numerous Scenarios Analysis:
Due to the strong randomness of renewable energy output and the data generated by different models, it is challenging to compare each sample individually. To address this, for both the generated scenario set and the test one, we employ the k-means clustering algorithm to partition them into 12 clusters. However, it is important to note that these clusters may not entirely summarize the output characteristics of renewable energy, particularly for wind power scenarios with pronounced fluctuations. To enhance comparability, we select several categories of representative sets from these clusters. These scenarios are further matched with curves sharing similar random characteristics through Euclidean distance searching, facilitating the construction of a set of typical scenarios. Figure 5 depicts the comparison between the real scenario set and the generated sets from four methods. For clarity, the gray curves represent the scenario set generated by our proposed method, while the remaining curves represent centroids obtained by computing the typical scenario sets from real data and various benchmarks. From the top rows of Figure 5a,b, it can be observed that the scenario set generated by the proposed method follows the trend of the actual curves, perfectly covering the real fluctuation range. Furthermore, the centroid of this scenario set is closest to the real one, which can be interpreted as the proposed method having the most similar maximum likelihood estimation to real data. Figure 5 indicates that the proposed method has effectively learned the distinctive features of renewable energy generation, such as peak-valley variations, rapid power fluctuations, and ramp events in wind power, as well as day–night variations in solar power. It can accurately capture the dynamic characteristics of the practical renewable energy generation process.
For a more detailed display, the bottom rows of Figure 5a,b normalize all errors of the curves for comparison. The baseline is the centroid of the real samples, and the upper and lower bounds of the error represent the maximum and minimum normalized error values of the generated scenario set at different time points. It is evident that, for wind power scenarios, the centroid generated by the proposed method exhibits an absolute error within 0.05 relative to the real clustering center. Furthermore, the error bounds of its scenario set do not exceed 0.3. In contrast, the normalized error limits of other benchmarks generally surpass twice that of the proposed method. Notably, the original DDPM displays a substantial deviation, highlighting the significant improvement of our method over the basic DDPM. In the case of solar power, its distribution is more regular compared to wind power. All benchmarks perform well, with the proposed method's relative error remarkably below 0.02.
As shown in Figure 6, we selected individual samples with representative characteristics for comparison with real samples. It can be observed that they exhibit similar fluctuating and intermittent features. Then, we calculated the autocorrelation coefficients R ( τ ) to verify their temporal correlation. The formula is shown below:
$$R(\tau) = \frac{\mathbb{E}\left[(S_t - \mu)(S_{t+\tau} - \mu)\right]}{\mathbb{E}\left[(S_t - \mu)^2\right]} \quad (20)$$
where $S_t$ is the sample output at time $t$, $\mu$ represents the average of the output series, and $\tau$ is the time interval. It can be observed that the autocorrelation coefficient curves of the samples generated by the proposed method maintain a high consistency with those of the real samples. This demonstrates that the method can reflect the operational characteristics of real power generation scenarios in terms of temporal features.
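Equation (20) can be evaluated in a few lines of NumPy; the sketch below assumes a single normalized daily output series and a maximum lag chosen for illustration.

```python
import numpy as np

def autocorrelation(s, max_lag=48):
    """Autocorrelation R(tau) of a single output series, per Equation (20) (sketch)."""
    s = np.asarray(s, dtype=float)
    d = s - s.mean()
    denom = np.mean(d ** 2)
    return np.array([np.mean(d[: len(d) - tau] * d[tau:]) / denom for tau in range(max_lag + 1)])

# usage: autocorrelation of a stand-in daily curve with 288 points
real = np.sin(np.linspace(0, 4 * np.pi, 288)) + 0.1 * np.random.randn(288)
print(autocorrelation(real, max_lag=5).round(3))
```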
(2) Statistical analysis:
Apart from visual assessments of sample similarity, we conduct statistical validations to ascertain the reliability of the generated samples. This is achieved by calculating the cumulative distribution function (CDF) and probability density function (PDF) of a substantial number of samples, where the PDF is obtained by Gaussian kernel density estimation. As illustrated in Figure 7, we have zoomed in on local regions of the CDF and PDF curves for a clearer comparison of each method. It can be seen that the CDF and PDF of the proposed method are more fitting to the real curves compared to LSGANs, VAEs, and DDPM. This indicates that our method not only learns the shape details of the scenario curves but also fully captures the distribution patterns of real data, validating that the generated scenarios and real scenarios share similar statistical properties.

4.4. Quality Results

Evaluating and comparing generative models poses a challenging task. Currently, there exists no consensus or standard guidelines on which metrics are ideal for assessing the capabilities and limitations of models. Remarkable results may be achieved for a specific metric, yet the same model might not perform equally well on other criteria. Hence, the incorporation of multiple complementary metrics to evaluate a model is considered a good practice. This study uses six complementary quality metrics: the root mean square error (RMSE), mean absolute error (MAE), maximum mean discrepancy (MMD), energy score (ES), variogram score (VS), and the coverage rate (CR). All of them are negatively oriented [31] except for the coverage rate.
(1) MMD: MMD quantifies the difference between two distributions, expressed as follows:
$$\mathrm{MMD} = \frac{1}{M_g^2} \sum_{i=1}^{M_g} \sum_{j=1}^{M_g} \left[ k\!\left(x_i, x_j\right) + k\!\left(\hat{x}_i, \hat{x}_j\right) - 2\, k\!\left(x_i, \hat{x}_j\right) \right] \quad (21)$$
where $k(\cdot)$ denotes the Gaussian kernel function and $M_g$ is the total amount of data.
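A sketch of Equation (21) with a Gaussian kernel follows; the kernel bandwidth and the stand-in sample sizes are illustrative assumptions, since the paper does not report them.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 sigma^2)) for all pairs of rows."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd(x, x_hat, sigma=1.0):
    """Biased MMD estimate between real samples x and generated samples x_hat, per Equation (21)."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(x_hat, x_hat, sigma).mean()
            - 2.0 * gaussian_kernel(x, x_hat, sigma).mean())

# usage with stand-in daily curves of length 288
rng = np.random.default_rng(0)
x, x_hat = rng.standard_normal((200, 288)), rng.standard_normal((200, 288))
print(round(mmd(x, x_hat), 4))
```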
(2) ES: The ES is a commonly employed metric for assessing a finite number of scenarios that model a distribution. For a given day d of the observed set, the ES is computed as
$$\mathrm{ES} = \frac{1}{|D_o|} \sum_{d \in D_o} \left( \frac{1}{M} \sum_{i=1}^{M} \left\| \hat{x}_d^{\,i} - x_d \right\| - \frac{1}{2 M^2} \sum_{i,j=1}^{M} \left\| \hat{x}_d^{\,i} - \hat{x}_d^{\,j} \right\| \right) \quad (22)$$
(3) VS: In contrast to the ES, the VS is sensitive to the mean, variance, and correlation of incorrect scenarios, making it capable of distinguishing the correlation structure significantly. It is defined as follows:
$$\mathrm{VS}_d = \sum_{k, k'}^{L} \omega_{k k'} \left( \left| x_{d,k} - x_{d,k'} \right|^{\gamma} - \frac{1}{M} \sum_{i=1}^{M} \left| \hat{x}_{d,k}^{\,i} - \hat{x}_{d,k'}^{\,i} \right|^{\gamma} \right)^2 \quad (23)$$
where $\omega_{k k'}$ is a non-negative weight. We employ $\omega_{k k'} = 1$ and $\gamma = 0.5$ in this study. The scenario time resolution used for the VS calculation is 1 h.
(4) CR: The CR indicates whether the scenarios synthesized by generative models possess sufficient diversity to cover the real scenarios. A higher value of the CR signifies greater diversity. At the $l$-th time point, the upper and lower bounds of the real and generated scenario sets are denoted as $P_{r,l}^{up}$, $P_{r,l}^{down}$, $P_{s,l}^{up}$, and $P_{s,l}^{down}$, respectively. The formula for calculating the CR is as follows:
$$\mathrm{CR} = \frac{1}{L} \sum_{l=1}^{L} \frac{M_l}{M} \times 100\% \quad (24)$$
where $M_l$ denotes the number of instances at the $l$-th time point where the real power output falls between the lower and upper bounds $P_{s,l}^{down}$ and $P_{s,l}^{up}$ of the generated scenario set's power output.
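The ES and CR of Equations (22) and (24) can be sketched as follows, assuming the generated scenarios are arranged per observed day for the ES and as a pooled set for the CR envelope; the array shapes are our own convention rather than the paper's.

```python
import numpy as np

def energy_score(x_hat, x):
    """ES per Equation (22): x_hat has shape (days, M, L) generated scenarios, x has shape (days, L)."""
    term1 = np.linalg.norm(x_hat - x[:, None, :], axis=-1).mean(axis=1)            # (1/M) sum_i ||x_hat_i - x_d||
    diffs = np.linalg.norm(x_hat[:, :, None, :] - x_hat[:, None, :, :], axis=-1)   # pairwise ||x_hat_i - x_hat_j||
    term2 = diffs.reshape(x_hat.shape[0], -1).mean(axis=1) / 2.0                   # (1/(2 M^2)) sum_{i,j}
    return (term1 - term2).mean()                                                  # average over observed days

def coverage_rate(x_hat, x_real):
    """CR per Equation (24): share of real outputs inside the generated envelope at each time point."""
    upper, lower = x_hat.max(axis=0), x_hat.min(axis=0)          # bounds of the generated set, shape (L,)
    inside = (x_real >= lower) & (x_real <= upper)               # x_real: (M, L) real scenarios
    return inside.mean() * 100.0

rng = np.random.default_rng(0)
print(round(energy_score(rng.random((5, 20, 24)), rng.random((5, 24))), 3))
print(round(coverage_rate(rng.random((100, 24)), rng.random((50, 24))), 1))
```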
Table 3 indicates that, compared to the benchmarks, our proposed method achieves the best performance across all metrics. Conversely, the outcomes of the three baselines are comparatively inadequate, with at least a 30% and 32% increase in the two error metrics (RMSE and MAE), respectively. Concerning the other distribution-level metrics (MMD, ES, VS), our proposed method registers the minimum values, underscoring its proximity to the distribution of real scenarios. In terms of diversity, the proposed method achieves the highest coverage rate, outperforming LSGAN and VAE by at least 13.5%. This also provides evidence, to some extent, of the diversity limitations inherent in these two types of methods. Moreover, compared to the original DDPM, the improvements made by our method are undoubtedly significant. This demonstrates that the diffusion model with a U-Net architecture and convolutional neural networks has limitations in capturing temporal features. However, by providing prior temporal knowledge through a pre-trained time-series representation model, its performance in renewable scenario generation tasks can be greatly enhanced. All qualitative comparative results substantiate the conclusions drawn in the preceding sections. Consequently, our proposed method has better accuracy and quality in scenario generation.

4.5. Conditional Scenario Generation

Since the generation process is identical for wind power and solar scenarios, and wind power exhibits stronger fluctuations, it serves as the more representative case. Consequently, this section predominantly employs the generation of wind power scenarios as an exemplar for analysis.
(1) Specific Scenario Generation:
In the course of scenario analysis, it is often necessary to generate scenario data under certain special conditions, for instance, wind power data under particular weather conditions. To accomplish this, these conditions can be assigned as labels to the training set, and the conditional diffusion model, as elucidated earlier, can be employed to derive the generative model. The resultant scenario data will exhibit statistical characteristics akin to those encapsulated in the labeled training set. In this work, we categorize wind power scenarios into five classes based on the mean value $v(x)$: $v(x) < 0.5$, $v(x) < 2$, $v(x) < 3.5$, $v(x) < 6.5$, and $v(x) > 6.5$, denoting different daily wind strengths. The class information is encoded into one-hot vectors and embedded into the conditional implicit diffusion model for generating scenarios corresponding to the specified label, as outlined in Equation (15).
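For illustration, the class labels used as conditions can be assigned by thresholding the daily mean output; the sketch below uses the thresholds listed above and indexes the classes from 0, and the helper names are hypothetical.

```python
import numpy as np

def wind_class(daily_curve, thresholds=(0.5, 2.0, 3.5, 6.5)):
    """Assign one of five wind-strength classes from the daily mean output v(x) (sketch)."""
    return int(np.searchsorted(thresholds, np.mean(daily_curve)))

def one_hot(label, n_classes=5):
    vec = np.zeros(n_classes)
    vec[label] = 1.0
    return vec

# usage: a calm day (mean output well below 0.5 MW) falls into the first class
calm_day = np.full(288, 0.1)
print(wind_class(calm_day), one_hot(wind_class(calm_day)))
```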
To intuitively illustrate the performance of the proposed method, we present a visualization of the generated multi-class scenarios alongside real scenarios in a two-dimensional space, as shown in Figure 8. Specifically, we treat the 288 outputs for each day as features, then use t-SNE to reduce the feature dimensions for better visualization, and color-code the scenarios based on different categories. It can be seen that the arrangement of generated scenarios resembles that of real scenarios. In particular, class 1 exhibits a large number of samples clustered around the center. Through numerical analysis, this pattern is attributed to the prevalence of windless days characterized by sustained output close to 0. This observation demonstrates the capability of the proposed method to generate scenarios whose distribution pattern in the sample space closely resembles that of real scenarios. We evaluate the samples generated under these conditions by examining the marginal distributions for each category within the range of 0–16 MW, divided into 10 intervals, as shown in Figure 9. The data produced by the proposed method exhibit a commendable fit to the probability distribution of each category in the test set. For instance, on days with light wind, the probability of the site's output being less than 2 MW surpasses 50%, and as the wind speed increases, the center of the probability distribution gradually shifts to the right. This alignment with real data patterns attests that the samples generated by the proposed method adhere to the same marginal distribution as the corresponding validation samples.
(2) Multi-Site Scenario Generation:
In large wind farms, there are multiple power generation sites that are geographically close. When generating scenarios, in addition to considering the probability distribution of data similar to the previous context, it is also necessary to take into account the correlation between these generation sites. In this study, 24 power generation sites were selected, and the sampling time interval for each site was increased from 5 min to 1 h. Thus, each sample is a matrix of size 24 × 24 . To analyze the spatial relationships between multiple sites, we calculate the Pearson correlation coefficient between each pair of sites for both real and generated data. The Pearson correlation coefficient ρ is
$$\rho = \frac{\sum_{i=1}^{M} \left(x_i - \bar{x}_i\right)\left(x_j - \bar{x}_j\right)}{\sqrt{\sum_{i=1}^{M} \left(x_i - \bar{x}_i\right)^2}\ \sqrt{\sum_{i=1}^{M} \left(x_j - \bar{x}_j\right)^2}} \quad (25)$$
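The site-to-site correlation matrices compared in Figure 10b can be obtained with NumPy's `corrcoef`, which evaluates Equation (25) for every pair of sites; the sketch below uses random stand-in data of shape (sites, hours).

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.random((24, 24))            # 24 sites x 24 hourly outputs (stand-in data)
generated = rng.random((24, 24))

# Pearson correlation coefficient between every pair of sites (rows), per Equation (25)
rho_real = np.corrcoef(real)
rho_generated = np.corrcoef(generated)

# a scalar summary of how closely the generated correlation structure matches the real one
print(round(np.abs(rho_real - rho_generated).mean(), 3))
```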
Figure 10a illustrates a collection of real and generated scenarios for a group of 24 power generation stations. The visual representation of correlation coefficients among the power stations is presented in Figure 10b. The results indicate that the generated multi-site scenarios are consistent with real scenarios in terms of correlation patterns. The proposed method is capable of simultaneously capturing the complex temporal and spatial distribution characteristics of renewable energy generation, contributing to the construction of a reliable and effective scenario set.

5. Conclusions

This paper proposes a two-stage generative architecture based on diffusion models for renewable scenario generation. In the first stage, representation learning of the time series through an encoder–decoder structure encodes the renewable scenarios in the latent space. This design imparts a renewable-scenario-specific inductive bias for the subsequent diffusion model learning, thereby facilitating the assimilation of temporal information. In the second stage, we introduce a conditional implicit diffusion model to learn the features of renewable scenarios in the latent space, and the final generated scenarios are obtained through the decoder of the first stage.
Case studies indicate that, compared to other advanced deep generative models, the proposed method stands out as the most competitive. By learning the intricate time-dependent relationships of renewable resource output, our method can generate high-quality scenarios with full diversity. These scenarios effectively capture the underlying distribution of the real data rather than simply imitating it. We validated the effectiveness of each method in generating scenarios through a series of visual and statistical approaches. The diverse quality evaluation metrics employed demonstrate that the proposed method has superior performance.
In future research, we propose to incorporate this study into the planning and optimal operation of electrical systems with renewables. Furthermore, this work focuses on accurate renewable scenario generation based on univariate power time-series characteristics, without incorporating numerical weather information. Therefore, future research should consider multivariate numerical weather conditions to enhance the synthesis of reliable renewable scenarios.

Author Contributions

Writing—original draft preparation, C.X.; Conceptualization, P.X.; methodology, Y.D.; funding acquisition, S.S., Q.X.; data curation, L.Z.; editing, J.Z.; investigation, Y.B.; formal analysis, T.G.; review, L.S.; supervision, W.G.; project administration and software, S.S.; validation, Q.X.; Visualization, W.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Science and Technology Project of Yunnan Power Grid Co., Ltd. (No. YNKJXM20222105).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The authors can provide the raw data in this work upon reasonable request.

Conflicts of Interest

Authors Shi Su and Qingyang Xie were employed by the company Yunnan Power Grid. The remaining authors declare that this research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Zhang, Q.; Shukla, A.; Xie, L. Efficient Scenario Generation for Chance-Constrained Economic Dispatch Considering Ambient Wind Conditions. IEEE Trans. Power Syst. 2024, 39, 5969–5980. [Google Scholar] [CrossRef]
  2. Zhang, J.; Wang, B.; Watada, J. Stochastic distributionally robust unit commitment with deep scenario clustering. Electr. Power Syst. Res. 2023, 224, 109710. [Google Scholar] [CrossRef]
  3. Li, Z.; Xie, X.; Cheng, Z.; Zhi, C.; Si, J. A novel two-stage energy management of hybrid AC/DC microgrid considering frequency security constraints. Int. J. Electr. Power Energy Syst. 2023, 146, 108768. [Google Scholar] [CrossRef]
  4. Xu, M.; Li, W.; Feng, Z.; Bai, W.; Jia, L.; Wei, Z. Economic Dispatch Model of High Proportional New Energy Grid-Connected Consumption Considering Source Load Uncertainty. Energies 2023, 16, 1696. [Google Scholar] [CrossRef]
  5. Camal, S.; Teng, F.; Michiorri, A.; Kariniotakis, G.; Badesa, L. Scenario generation of aggregated Wind, Photovoltaics and small Hydro production for power systems applications. Appl. Energy 2019, 242, 1396–1406. [Google Scholar] [CrossRef]
  6. Fei, Z.; Yang, H.; Du, L.; Guerrero, J.M.; Meng, K.; Li, Z. Two-stage coordinated operation of a green multi-energy ship microgrid with underwater radiated noise by distributed stochastic approach. IEEE Trans. Smart Grid 2024, 16, 1062–1074. [Google Scholar] [CrossRef]
  7. Liao, W.; Yang, Z.; Chen, X.; Li, Y. WindGMMN: Scenario forecasting for wind power using generative moment matching networks. IEEE Trans. Artif. Intell. 2021, 3, 843–850. [Google Scholar] [CrossRef]
  8. Li, H.; Ren, Z.; Xu, Y.; Li, W.; Hu, B. A Multi-Data Driven Hybrid Learning Method for Weekly Photovoltaic Power Scenario Forecast. IEEE Trans. Sustain. Energy 2022, 13, 91–100. [Google Scholar] [CrossRef]
  9. Stappers, B.; Paterakis, N.G.; Kok, K.; Gibescu, M. A class-driven approach based on long short-term memory networks for electricity price scenario generation and reduction. IEEE Trans. Power Syst. 2020, 35, 3040–3050. [Google Scholar] [CrossRef]
  10. Pu, Y.; Gan, Z.; Henao, R.; Yuan, X.; Li, C.; Stevens, A.; Carin, L. Variational autoencoder for deep learning of images, labels and captions. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Ser. NIPS’16, Barcelona, Spain, 5–10 December 2016; Curran Associates Inc.: Red Hook, NY, USA, 2016; pp. 2360–2368. [Google Scholar]
  11. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  12. Chen, Y.; Wang, Y.; Kirschen, D.; Zhang, B. Model-Free Renewable Scenario Generation Using Generative Adversarial Networks. In Proceedings of the 2019 IEEE Power & Energy Society General Meeting (PESGM), Atlanta, GA, USA, 4–8 August 2019. [Google Scholar]
  13. Li, Y.; Li, J.; Wang, Y. Privacy-Preserving Spatiotemporal Scenario Generation of Renewable Energies: A Federated Deep Generative Learning Approach. IEEE Trans. Ind. Inform. 2022, 18, 2310–2320. [Google Scholar] [CrossRef]
  14. Jiang, C.; Mao, Y.; Chai, Y.; Yu, M. Day-ahead renewable scenario forecasts based on generative adversarial networks. Int. J. Energy Res. 2021, 45, 7572–7587. [Google Scholar] [CrossRef]
  15. Yuan, R.; Wang, B.; Sun, Y.; Song, X.; Watada, J. Conditional Style-Based Generative Adversarial Networks for Renewable Scenario Generation. IEEE Trans. Power Syst. 2023, 38, 1281–1296. [Google Scholar] [CrossRef]
  16. Zhang, H.; Hu, W.; Yu, R.; Tang, M.; Ding, L. Optimized operation of cascade reservoirs considering complementary characteristics between wind and photovoltaic based on variational auto-encoder. In MATEC Web of Conferences; EDP Sciences: Les Ulis, France, 2018; Volume 246, p. 01077. [Google Scholar]
  17. Zheng, Z.; Yang, L.; Zhang, Z. Conditional Variational Autoencoder Informed Probabilistic Wind Power Curve Modeling. IEEE Trans. Sustain. Energy 2023, 14, 2445–2460. [Google Scholar] [CrossRef]
  18. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Ser. NIPS ’20, Vancouver, BC, Canada, 6–12 December 2020; Curran Associates Inc.: Red Hook, NY, USA, 2020; pp. 6840–6851. [Google Scholar]
  19. Dhariwal, P.; Nichol, A. Diffusion models beat GANs on image synthesis. Adv. Neural Inf. Process. Syst. 2021, 34, 8780–8794. [Google Scholar]
  20. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 10674–10685. [Google Scholar]
  21. Li, S.; Xiong, H.; Chen, Y. DiffCharge: Generating EV Charging Scenarios via a Denoising Diffusion Model. IEEE Trans. Smart Grid 2024, 15, 3936–3949. [Google Scholar] [CrossRef]
  22. Franceschi, J.-Y.; Dieuleveut, A.; Jaggi, M. Unsupervised scalable representation learning for multivariate time series. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019; Volume 418, pp. 4650–4661. [Google Scholar]
  23. Tonekaboni, S.; Eytan, D.; Goldenberg, A. Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding. In Proceedings of the International Conference on Learning Representations, Virtual Event, 3–7 May 2021. [Google Scholar]
  24. Zerveas, G.; Jayaraman, S.; Patel, D.; Bhamidipaty, A.; Eickhoff, C. A Transformer-based Framework for Multivariate Time Series Representation Learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, ser. KDD ’21, Virtual Event, 14–18 August 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 2114–2124. [Google Scholar]
  25. He, K.; Chen, X.; Xie, S.; Li, Y.; Dollar, P.; Girshick, R. Masked Autoencoders Are Scalable Vision Learners. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 15979–15988. [Google Scholar]
  26. Nie, Y.; Nguyen, N.H.; Sinthong, P.; Kalagnanam, J. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In Proceedings of the The Eleventh International Conference on Learning Representations (ICLR), Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
  27. Al-Rfou, R.; Choe, D.; Constant, N.; Guo, M.; Jones, L. Character-Level Language Modeling with Deeper Self-Attention. Proc. AAAI Conf. Artif. Intell. 2019, 33, 3159–3166. [Google Scholar] [CrossRef]
  28. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015. [Google Scholar]
  29. Liu, Z.; Mao, H.; Wu, C.-Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 11966–11976. [Google Scholar]
  30. Draxl, C.; Clifton, A.; Hodge, B.-M.; McCaa, J. The Wind Integration National Dataset (WIND) Toolkit. Appl. Energy 2015, 151, 355–366. [Google Scholar] [CrossRef]
  31. Dumas, J.; Wehenkel, A.; Lanaspeze, D.; Cornélusse, B.; Sutera, A. A deep generative model for probabilistic energy forecasting in power systems: Normalizing flows. Appl. Energy 2022, 305, 117871. [Google Scholar] [CrossRef]
Figure 1. Illustration of diffusion models.
Figure 2. Training and generation process of the two-stage architecture using diffusion model.
Figure 3. Structure of the first stage scenario representation model.
Figure 4. Illustration of representation results: (a) shows the loss curve and (b) shows reconstruction examples.
Figure 5. The comparison of typical scenarios generated by different models: (a) is wind power and (b) is solar power. At the top of (a,b), the gray curves represent a large number of scenarios generated by the proposed method, the red curve represents the centroid of the real samples, and the remaining curves denote the centroids of the scenario sets generated by various benchmarks. At the bottom of (a,b), the errors between each curve and the real red centroid are normalized and displayed. The light blue shaded area indicates the range of differences for scenarios generated by the proposed method, while the remaining curves represent the errors between the benchmarks and the real centroid.
Figure 6. The comparison of individual samples generated by different methods. (a) is wind power and (b) is solar power.
Figure 7. The comparison of CDF and PDF obtained from different methods: (a) is wind power and (b) is solar power.
Figure 8. Two-dimensional visualization of synthesis performance comparison.
Figure 9. Comparison of the marginal distributions of real and synthesized scenarios under different classes.
Figure 10. (a) Comparison between the real and generated wind power output sample curves, while (b) displays the colormap of their respective spatial correlation coefficient matrices.
Table 1. Implementation details of the representation model.

Parameter | Definition | Value
P | The patch length | 12
S | The patch interval | 0
E_n | The number of encoder layers | 3
d_model | The embedding vector dimension | 128
E_t | The number of training epochs | 100
E_f | The number of fine-tuning epochs | 16
B | The batch size of training | 32
lr | The learning rate | 0.001
Table 2. Implementation details of the diffusion model.

Parameter | Definition | Value
R_down | The number of downsampling residual layers | 3
R_up | The number of upsampling residual layers | 3
C_b | The number of ConvNeXt blocks in the residual layer | 2
d_1, d_2, d_3 | The channel dimensions of the convolution | 24, 48, 96
K_c | The size of the convolution kernel | 3 × 3
(β_1, β_T) | The noise schedule | (0.0001, 0.1)
T | The diffusion steps | 100
B | The batch size of training | 64
E_t | The number of training epochs | 200
lr | The learning rate | 0.001
Table 3. Evaluation metrics results. The best value in each row is achieved by the proposed method.

Energy | Metric | Proposed | DDPM | VAE | LSGAN
Wind | RMSE | 0.5184 | 1.1210 | 0.8240 | 0.9100
Wind | MAE | 0.3857 | 0.8236 | 0.6318 | 0.6774
Wind | MMD | 0.2374 | 0.5342 | 0.3089 | 0.3937
Wind | ES | 2.1907 | 4.3107 | 3.7771 | 4.2895
Wind | VS | 23.6096 | 92.2627 | 80.9852 | 85.7794
Wind | CR | 99.8863 | 80.3226 | 85.1072 | 81.5971
Solar | RMSE | 0.1395 | 0.9531 | 0.1824 | 0.2465
Solar | MAE | 0.0838 | 0.5667 | 0.1101 | 0.1536
Solar | MMD | 0.2819 | 3.2079 | 0.3660 | 0.3345
Solar | ES | 0.2352 | 1.9264 | 0.3550 | 0.3509
Solar | VS | 1.0492 | 20.5207 | 2.4495 | 3.6483
Solar | CR | 99.4531 | 81.4068 | 87.6427 | 82.1662
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xu, C.; Xu, P.; Dai, Y.; Su, S.; Zhang, L.; Zhang, J.; Bai, Y.; Gao, T.; Xie, Q.; Shang, L.; et al. A Two-Stage Generative Architecture for Renewable Scenario Generation Based on Temporal Scenario Representation and Diffusion Models. Energies 2025, 18, 1275. https://doi.org/10.3390/en18051275
