1. Introduction
The global pursuit of sustainable development, driven by the urgent need to mitigate climate change and transition away from fossil fuels, has positioned renewable energy at the center of the world’s energy strategy. Among these sources, photovoltaic (PV) systems are pivotal, experiencing exponential growth [1,2]. Highlighting this trend, the International Energy Agency reports that global solar PV capacity additions reached a record 425 GW in 2023 alone, an 80% increase from the previous year, underscoring the accelerating pace of deployment [3]. This rapid expansion is fundamental to achieving sustainability goals, but it also introduces significant challenges to the stability and reliability of existing power grids. Consequently, the ability to accurately forecast PV power generation has become a critical enabling technology, essential for ensuring the seamless and sustainable integration of solar energy into the broader energy system [4,5].
The importance of accurately forecasting PV generation extends beyond simple output estimates. Precise predictions underpin numerous grid operations, including economic dispatch, ancillary service scheduling, real-time balancing, and peak demand management [6]. In liberalized electricity markets, forecast accuracy directly impacts financial outcomes by minimizing imbalance costs and enabling optimized trading strategies [7]. Furthermore, as grid penetration of solar energy increases, the need for sophisticated forecasting becomes more acute to mitigate grid instability and ensure a reliable power supply [8].
However, PV power forecasting presents significant technical challenges that distinguish it from conventional load forecasting. PV power output exhibits high variability and intermittency, primarily due to fluctuating meteorological conditions such as solar irradiance, cloud cover, temperature, and atmospheric turbidity [9,10]. The non-linear relationship between weather variables and power output, combined with the stochastic nature of weather patterns, creates a complex forecasting environment [11]. Additionally, PV power data exhibits multi-scale temporal dependencies, including diurnal cycles, seasonal variations, and weather-induced short-term fluctuations, which traditional forecasting methods struggle to capture adequately [12].
The development of PV forecasting methods has progressed through several distinct stages. Early approaches relied primarily on physical models that utilize numerical weather prediction (NWP) and satellite imagery to estimate solar irradiance and subsequent power output [13]. While these methods provide valuable meteorological insights, they are computationally intensive, require extensive domain expertise, and often exhibit limited accuracy for short-term forecasting horizons [14]. Simultaneously, statistical approaches such as autoregressive integrated moving average (ARIMA), seasonal decomposition methods, and exponential smoothing gained popularity for their mathematical rigor and interpretability [15]. However, these linear models fundamentally struggle with the non-linear dynamics inherent in PV power generation, limiting their effectiveness in capturing complex weather–power relationships [16].
The emergence of machine learning marked a paradigmatic shift in forecasting methodology. Support vector machines (SVMs) and random forests demonstrated improved capability in handling non-linear relationships and multivariate inputs [17]. Neural networks, particularly multilayer perceptrons (MLPs), showed promise in capturing complex patterns but were often constrained by their shallow architectures and a susceptibility to overfitting on complex time series data [18]. The advent of ensemble methods, combining multiple algorithms to leverage their complementary strengths, further advanced the field by improving robustness and reducing prediction variance.
The deep learning revolution has fundamentally changed time series forecasting across many fields. Recurrent neural networks (RNNs) and their sophisticated variants, long short-term memory (LSTM) networks and gated recurrent units (GRUs), became the de facto standard for sequential data modeling due to their ability to capture long-term dependencies and temporal patterns [19]. In PV forecasting specifically, LSTM networks have demonstrated remarkable success in modeling complex temporal dynamics, handling multivariate inputs, and adapting to varying weather conditions [20]. Bidirectional LSTMs and stacked architectures have further enhanced performance by processing sequences in both directions and learning hierarchical representations [21]. More recently, Wang et al. proposed a GA-AMODE-BiLSTM model that integrates a genetic algorithm and adaptive multi-objective differential evolution for BiLSTM hyperparameter optimization, achieving superior stability and generalization in short-term PV power forecasting [22].
Concurrent with recurrent architectures, temporal convolutional networks (TCNs) emerged as a compelling alternative, leveraging dilated convolutions to extract local and global features from sequential data efficiently [23]. TCNs offer advantages in parallel processing and gradient flow stability, making them attractive for real-time forecasting applications. Building on convolutional principles, SCINet introduced a novel architecture that uses sample convolution and interaction to explicitly model downsampled sub-sequences, enhancing its ability to capture features at multiple temporal resolutions [24]. Hybrid architectures combining CNNs and RNNs have also shown promise in capturing both spatial and temporal patterns in multivariate time series.
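To make the dilated-convolution idea concrete, the following is a minimal, illustrative NumPy sketch of a single causal dilated convolution, the basic building block of a TCN. The function name, weights, and left-zero-padding scheme are our assumptions for illustration, not taken from any cited implementation:

```python
import numpy as np

def causal_dilated_conv(x: np.ndarray, w: np.ndarray, dilation: int) -> np.ndarray:
    """One causal dilated convolution: output[t] depends only on
    x[t], x[t - d], x[t - 2d], ... (the past), never on future values."""
    k = len(w)
    pad = (k - 1) * dilation
    # Left-pad with zeros so the output keeps the input length and stays causal.
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[t + pad - j * dilation] for j in range(k))
                     for t in range(len(x))])

x = np.arange(8, dtype=float)
# With kernel [1, 1] and dilation 2, each output sums the current value
# and the value two steps back.
y = causal_dilated_conv(x, w=np.array([1.0, 1.0]), dilation=2)
```

Stacking such layers with exponentially growing dilations (1, 2, 4, ...) is what gives a TCN its large receptive field with few parameters.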
The introduction of the transformer architecture marked another watershed moment in sequence modeling [25]. The self-attention mechanism enables models to directly capture long-range dependencies without the sequential processing limitations of RNNs. Early applications to time series forecasting, such as the temporal fusion transformer, demonstrated competitive performance [26]. Subsequent innovations like Informer addressed the computational complexity of standard attention for long sequences by introducing sparse attention mechanisms, though its focus on long-range global dependencies comes at the cost of structured temporal modeling, making it less effective for signals with strong periodicities [27]. Autoformer further advanced the field by embedding a decomposition block within its transformer layers, using a moving average to perform a residual separation of trend and seasonal components [28]. However, this reliance on a static, predefined function such as a moving average limits its adaptability to complex, non-stationary data. Moreover, the repetitive decomposition at each layer risks compressing and losing important signal details. These pioneering models laid the groundwork for decomposition-based forecasting but highlighted the need for more flexible and adaptive mechanisms.
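The moving-average residual decomposition described above can be illustrated with a short NumPy sketch. The kernel size and edge-replication padding here are illustrative assumptions, not the exact settings of any cited model:

```python
import numpy as np

def moving_average_decompose(x: np.ndarray, kernel: int = 25):
    """Residual decomposition via moving average: the smoothed series is
    taken as the trend, and the remainder as the seasonal component."""
    pad = kernel // 2
    # Replicate the edge values so the smoothed trend keeps the input length.
    padded = np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    seasonal = x - trend  # residual carries the seasonal/high-frequency content
    return trend, seasonal

# Toy example: a 24-step periodic signal riding on a linear trend
t = np.arange(200)
x = 0.05 * t + np.sin(2 * np.pi * t / 24)
trend, seasonal = moving_average_decompose(x, kernel=25)
```

Because the kernel is fixed, such a filter cannot adapt to regime changes in the data, which is exactly the rigidity criticized above.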
Recent developments have seen the emergence of specialized transformer variants optimized for forecasting. The iTransformer represents a significant innovation by inverting the traditional transformer approach, treating individual time series variables as tokens rather than time steps [29]. This inversion enables better capture of multivariate correlations and has shown remarkable performance across diverse forecasting benchmarks. Hybrid approaches combining iTransformer with other architectures, such as the iTransformer_LSTM_CA_KAN model, have demonstrated enhanced capabilities by integrating cross-attention mechanisms and Kolmogorov–Arnold networks (KANs) for improved temporal and covariate interaction modeling [30].
The paradigm of patch-based methods represents the latest frontier in time series forecasting. Inspired by computer vision’s success with patch-based processing, these approaches segment time series into patches and treat them as tokens for transformer processing. PatchTST pioneered this concept, demonstrating that patch-based tokenization can significantly improve forecasting performance while reducing computational complexity [31]. The approach excels at capturing local temporal patterns efficiently and has shown particular promise for long-term forecasting scenarios. Building upon this foundation, the xPatch framework introduced a sophisticated dual-stream architecture that processes seasonal and trend components separately after statistical decomposition [32]. This decomposition-based approach aligns with the fundamental principle that time series often comprise multiple underlying components with distinct characteristics.
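Patch-based tokenization itself is simple to illustrate. The sketch below splits a univariate series into overlapping patches in the spirit of PatchTST; the `patch_len` and `stride` values are illustrative, not the settings used in this paper:

```python
import numpy as np

def patchify(series: np.ndarray, patch_len: int = 16, stride: int = 8) -> np.ndarray:
    """Split a univariate series into (possibly overlapping) patches that a
    transformer can treat as tokens instead of individual time steps."""
    n_patches = (len(series) - patch_len) // stride + 1
    return np.stack([series[i * stride : i * stride + patch_len]
                     for i in range(n_patches)])

x = np.arange(96, dtype=float)            # e.g. a 96-step look-back window
tokens = patchify(x, patch_len=16, stride=8)
# 11 patch tokens of length 16 instead of 96 step tokens,
# shrinking the attention map the transformer must compute
```

Each patch token summarizes a local temporal neighborhood, which is why the method captures local patterns well while cutting attention cost.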
Despite these advances, effectively separating the complex, non-stationary components inherent in PV power time series remains a significant challenge for existing methods. Many decomposition-based models still rely on traditional statistical filters (e.g., moving averages) or shallow network structures, which often struggle to adapt to the diverse and dynamic data-generating processes underlying PV power [33]. As recent reviews on deep learning for time series forecasting have highlighted, while decomposition is a powerful paradigm, its effectiveness is contingent on the quality and adaptability of the separation method itself [34]. An inflexible decomposition can lead to information leakage between components, where the trend retains high-frequency noise or the seasonal part contains residual trend patterns, ultimately limiting the performance of specialized downstream predictors [35]. Furthermore, the common practice of using a fixed patch size in patch-based transformers limits the model’s ability to adapt to the multi-scale temporal dynamics of PV power, where patterns of interest can manifest across various time scales simultaneously. This rigidity prevents the model from dynamically focusing on short-term fluctuations during volatile periods or long-term trends during stable conditions.
To address these fundamental limitations, the main contributions of this paper are summarized as follows:
An enhanced xPatch framework is proposed, featuring two key innovations. First, a neural network-based decomposition module (NNDecomp) replaces traditional statistical methods, allowing for a more adaptive and data-driven separation of trend and seasonal components. Second, an adaptive patching mechanism dynamically processes the time series at multiple temporal scales and fuses them with an attention mechanism, overcoming the limitations of fixed patch sizes.
The proposed NNDecomp-AdaptivePatch-xPatch model, integrating these data-driven and adaptive components, demonstrates state-of-the-art performance in short-term PV power forecasting. By effectively capturing the complex, non-stationary characteristics of PV power, the model achieves high accuracy, with an R2 value exceeding 0.98 for 1-h-ahead forecasts on the test data.
The superiority of the proposed model is validated through extensive experiments on five real-world datasets from two different countries (Australia and China), demonstrating its generalizability across diverse geographical and climatic conditions. Performance is benchmarked against a wide range of models, from classic LSTMs to modern transformers, using MAE, RMSE, R2, and MBE as evaluation metrics. Furthermore, an ablation study is conducted to systematically verify the individual contributions of the NNDecomp and adaptive patching modules.
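For reference, the four evaluation metrics named above can be computed as in the following NumPy sketch (function name and toy data are ours, purely for illustration):

```python
import numpy as np

def forecast_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """MAE, RMSE, coefficient of determination (R2), and mean bias error (MBE)."""
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    mbe = np.mean(err)  # signed bias: positive means systematic over-forecasting
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "MBE": mbe}

y_true = np.array([0.0, 1.0, 2.0, 3.0])
y_pred = np.array([0.1, 0.9, 2.1, 3.1])
m = forecast_metrics(y_true, y_pred)
```

MAE and RMSE quantify error magnitude (RMSE penalizing large errors more), R2 measures explained variance, and MBE exposes systematic over- or under-forecasting that the absolute metrics hide.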
The remainder of this paper is organized as follows. Section 2 describes our comprehensive data preprocessing methodology. Section 3 presents the detailed architecture of our proposed framework. Section 4 outlines the case study, experimental setup, and comprehensive results analysis. Finally, Section 5 concludes the paper.