Article

A Novel Multi-Task Learning Framework for Interval-Valued Carbon Price Forecasting Using Online News and Search Engine Data

1 College of Forestry, Fujian Agriculture and Forestry University, Fuzhou 350002, China
2 College of Economics and Management, Fujian Agriculture and Forestry University, Fuzhou 350002, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2025, 13(3), 455; https://doi.org/10.3390/math13030455
Submission received: 29 December 2024 / Revised: 21 January 2025 / Accepted: 27 January 2025 / Published: 29 January 2025
(This article belongs to the Special Issue AI in Game Theory: Theory and Applications)

Abstract:
The European Union Emissions Trading System (EU ETS) serves as the cornerstone of European climate policy, providing a critical mechanism for mitigating greenhouse gas emissions. Accurate forecasting of the carbon allowance prices within the market is essential for policymakers, enterprises, and investors. To address the need for interval-valued time series modeling and forecasting in the carbon market, this paper proposes a Transformer-based multi-task learning framework that integrates online news and search engine data information to forecast interval-valued EU carbon allowance futures prices. Empirical evaluations demonstrate that the proposed framework achieves superior predictive accuracy for short-term forecasting and remains robust under high market volatility and economic policy uncertainty compared to single-task learning benchmarks. Furthermore, ablation experiments indicate that incorporating news sentiment intensity and search index effectively enhances the framework’s predictive performance. Interpretability analysis highlights the critical role of specific temporal factors, while the time-varying variable importance analysis further underscores the influence of carbon allowance close prices and key energy market variables and also recognizes the contributions of news sentiment. In summary, this study provides valuable insights for policy management, risk hedging, and portfolio decision-making related to interval-valued EU carbon prices and offers a robust forecasting tool for carbon market prediction.

1. Introduction

The European Union Emissions Trading System (EU ETS) has served as a cornerstone of Europe’s climate change mitigation strategy within the Fit for 55 package and represents the largest and most established carbon market globally [1,2,3]. The price of carbon allowances (EU allowances, EUA) directly influences operational and strategic decisions across industries, affecting production costs, investment in low-carbon technologies, and overall competitiveness [4,5]. Therefore, accurately forecasting the EUA prices is critical for policymakers aiming to design effective climate policies, investors managing risks and returns, and market participants navigating the complexities of compliance and competitiveness in a carbon-constrained economy [6,7].
Forecasting carbon prices in the EU ETS, however, poses a unique set of challenges [8]. Carbon markets are influenced by a complex interplay of economic indicators, policy developments, energy market dynamics, and sentiment-related factors, leading to high volatility and uncertainty in price movements [9,10,11]. Traditional econometric forecasting methods often fall short in capturing these dynamic interactions and the nonstationary nature of the carbon price time series [12,13,14]. Recent approaches have employed deep learning techniques to improve forecasting accuracy [15,16,17]. One prevalent paradigm involves ensemble models combined with optimization algorithms, integrating multiple forecasting models to capture different aspects of the data [18,19,20,21,22,23]. Another is based on the “divide and conquer” principle [24,25], utilizing decomposition-ensemble methods where the original time series is decomposed into components such as trend, seasonality, and noise, each modeled separately before integration [26,27,28,29,30,31,32,33]. While these methods have considerably advanced point forecasting, the increasingly complex climate and economic policy environment necessitates more comprehensive predictive information to capture the uncertainties of volatile markets like the EU ETS [34,35]. Point forecasts provide valuable insights into expected future prices; however, they may not fully encompass the range of possible variations around the expected values [36,37,38,39,40,41]. Therefore, additional forecasting approaches such as interval-valued forecasting are needed to provide deeper insights and enhance decision-making in managing exposure to risk [42].
Interval-valued forecasting offers a more comprehensive approach by predicting a range of potential prices, represented by upper and lower bounds [37,43]. The methodology provides valuable measures of market uncertainty and risk, offering deeper insights into the temporal dynamics affecting carbon prices [44,45]. For example, accurately assessing the trends of high and low price bounds enables market participants to better estimate market volatility and develop robust hedging strategies [46]. Policymakers can also benefit from interval forecasts by understanding the potential impacts of policy interventions and regulatory changes on the carbon trading market [47]. While interval-valued forecasting has gained traction, most existing research has focused on China’s carbon pilot markets, mainly employing interval decomposition-ensemble paradigms [44,48,49,50,51,52]. Research on interval carbon price forecasting for the EU ETS remains scarce. Only a few studies, such as those by Zhu et al. [40], Tian and Hao [53], Niu et al. [54], Wang et al. [55], Zhao et al. [56], Yang et al. [57], and a series of studies focused on open-high-low-close prices developed by Huang et al. [58,59,60], have explored this area, indicating a large gap in developing effective forecasting frameworks specifically tailored for the EU ETS. Given its unique characteristics and global significance, advancing interval forecasting methodologies for the carbon market has become both necessary and timely. Moreover, in recent years, the integration of big data sources such as online news and search trends has opened new dimensions in predictive analytics [61,62,63,64]. In the energy and financial markets, incorporating news sentiment and search indices into forecasting models has been shown to effectively enhance the predictive performance of deep learning models [38,39]. Nevertheless, the utilization of such big data sources in interval-valued carbon price forecasting is still in its emerging stage [65,66,67].
Recent advancements in machine learning and big data analytics have provided new opportunities to address the interval-valued forecasting challenges [68]. For instance, interval decomposition-ensemble methods typically employ rolling decomposition techniques to separate the original time series into components such as trend, seasonality, and noise [36,44,55,57]. The rolling procedure aims to reduce potential data leakage and adapt to structural changes in the series, with each component subsequently modeled by specialized algorithms before recombining their forecasts. By isolating distinct temporal patterns, these methods provide fine-grained insights into the driving factors behind fluctuations in interval-valued carbon prices, which can be useful in contexts where policy adjustments and market conditions evolve over time. In addition, ensemble optimization algorithms integrate multiple predictive modules—each capturing specific aspects of the data—to form a collective forecast [40,45,48,53,54]. Through iterative optimization procedures, the ensembles seek to balance and synthesize diverse viewpoints, allowing for robust forecasting under various market scenarios. In contrast to the above methods that partition time series signals or integrate multiple standalone forecasts, the multi-task learning (MTL) perspective approaches interval prediction by simultaneously modeling the upper and lower bounds within one unified learning framework. MTL frameworks and deep learning models like Transformers have demonstrated promise in handling large datasets and complex temporal dependencies common in nonlinear and nonstationary data [69], and offer the advantage of learning shared representations across related tasks, potentially improving performance compared to learning each task independently [70]. MTL can further accommodate diverse-source data, offering an alternative route to capturing the uncertainties and complexities of the interval-valued carbon price dynamics. Successful applications of MTL in energy and financial markets have been reported, such as in electricity load forecasting [71,72,73] and stock price prediction [74,75], where MTL models outperformed single-task approaches. Despite these advancements, applying MTL to interval-valued carbon price forecasting remains underexplored.
In this study, we propose a novel multi-task learning framework that integrates multi-source heterogeneous data (online news, Google trends, and related market futures prices) using a Transformer-based model (Temporal Fusion Transformer, TFT) [76] for the interval-valued EUA futures prices forecasting. By incorporating online news sentiment intensity and the Internet search index into the framework, we aim to enhance the predictive capability for the interval-valued carbon prices. The framework’s effectiveness is evaluated through comprehensive experiments, including model comparisons and ablation studies. We also conduct systematic robustness tests under different market uncertainty conditions to assess the model’s stability and reliability. Furthermore, the time-varying interpretability provided by our framework offers practical insights into variable importance and temporal patterns affecting carbon price forecasting.
This paper contributes to the literature and practice in several aspects. On the one hand, while previous studies on interval-valued carbon price forecasting in the EU ETS have mainly relied on decomposition-ensemble and hybrid modeling methodologies, this study introduces a novel approach by leveraging multi-task learning combined with diverse-source data, offering a fresh perspective on interval-valued forecasting in carbon markets. On the other hand, through the Transformer-based model’s time-varying interpretability mechanism, we illustrate how previous trading days influence the future interval carbon prices and the importance of various variables in predictions. Through this innovation, we are able to provide an explanation of the predictive capacity of market factors and big data for interval carbon prices. That is, we find strong empirical evidence identifying trading periods or characteristics that predict future carbon market interval prices with meaningful implications. The proposed framework and empirical findings not only build trust in complex deep learning-based forecasting systems but also offer a robust tool for practical applications in carbon market risk management and environmental policy-making.
The remainder of the paper is organized as follows: Section 2 describes the methodology, including the data sources, preprocessing steps, models, and evaluation criteria. Section 3 analyzes the model performance, variable importance, and temporal patterns of the carbon price forecasting. Finally, Section 4 concludes the paper with a summary of key findings and directions for future research.

2. Methodology

This section introduces the concepts of interval data and multi-task learning, describes data collection and preprocessing, and explains the model setup and evaluation metrics. Figure 1 illustrates the proposed framework of this study.

2.1. Interval-Valued Data

Interval-valued data represent a specific instance within symbolic data analysis (SDA). The method is better suited for accurately depicting the complexities and fluctuations of real-world scenarios compared to single-valued variables, as it encapsulates greater uncertainty and variability. In SDA, an interval-valued variable, denoted as $X$, is defined as a mapping from $\Omega$ into $\mathbb{I}$. For each $k \in \Omega$, $X(k)$ corresponds to an interval $[a, b] \in \mathbb{I}$, where $\mathbb{I} = \{[a, b] : a, b \in \mathbb{R},\ a \le b\}$ represents the set of closed intervals in $\mathbb{R}$.
At each discrete time point $t = 1, 2, \ldots, n$, an interval is characterized as a two-dimensional vector $[X_t^L, X_t^U]$, where $X_t^L$ indicates the lower bound and $X_t^U$ signifies the upper bound of the interval, adhering to the condition $X_t^L \le X_t^U$. The sequence of intervals is then represented as follows:
$$[X_1^L, X_1^U],\ [X_2^L, X_2^U],\ \ldots,\ [X_n^L, X_n^U],$$
with $n$ expressing the total number of intervals. Specifically, the observed interval at time $t$ is denoted as $I_t$, expressed mathematically as follows:
$$I_t = [X_t^L, X_t^U].$$

2.2. Multi-Task Learning

Multi-task learning is a machine learning paradigm designed to harness the useful information present across multiple related tasks, thereby enhancing the generalization performance of all tasks involved. The definition of MTL is as follows: Given $m$ learning tasks $\{T_i\}_{i=1}^{m}$ where all the tasks or a subset are interrelated, MTL aims to simultaneously learn these $m$ tasks to improve the learning outcomes for each individual task $T_i$ by leveraging the knowledge derived from other tasks.
MTL enhances the learning efficiency of individual sub-tasks by leveraging shared information, ultimately leading to increased accuracy and operational efficiency. Figure 2 shows the differences between the training methodologies of various learning approaches. The success of MTL in improving forecasting performance is attributable to both the model’s inherent capabilities and the strategic optimization of the employed loss functions.
In the context of deep learning, MTL is executed by deriving shared representations from multiple supervisory signals. Historically, deep multi-task architectures have been categorized into hard and soft parameter sharing techniques. In hard parameter sharing, the parameter set is divided into shared parameters and those specific to each task (see Figure 2c). Models utilizing hard parameter sharing generally consist of a shared encoder that branches into task-specific heads. Conversely, soft parameter sharing assigns each task its own distinct set of parameters, facilitated by a mechanism that promotes feature sharing across tasks (see Figure 2d). Moreover, there are also two model types: encoder-focused and decoder-focused architectures. Encoder-focused architectures (see Figure 2e) restrict information sharing to the encoder phase, employing either hard or soft parameter sharing before each task is decoded using an independent, task-specific head. In contrast, decoder-focused architectures (see Figure 2f) allow for information exchange during the decoding process as well. The proposed multi-task learning framework in this study is implemented using the TFT model, which aligns with the decoder-focused architecture of deep multi-task learning.
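As a concrete illustration of hard parameter sharing, the following is a minimal PyTorch sketch (not the TFT architecture used later) in which a shared encoder feeds two task-specific heads that forecast the upper and lower interval bounds; the GRU encoder, layer sizes, and toy tensors are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    """Hard parameter sharing: one shared encoder, two task-specific heads."""
    def __init__(self, n_features: int, hidden_size: int = 32):
        super().__init__()
        # Shared parameters: a single recurrent encoder over the input window
        self.encoder = nn.GRU(n_features, hidden_size, batch_first=True)
        # Task-specific parameters: separate heads for the upper and lower bounds
        self.head_upper = nn.Linear(hidden_size, 1)
        self.head_lower = nn.Linear(hidden_size, 1)

    def forward(self, x):          # x: (batch, window, n_features)
        _, h = self.encoder(x)     # h: (1, batch, hidden_size)
        h = h.squeeze(0)
        return self.head_upper(h), self.head_lower(h)

# Joint training minimizes the sum of the two task losses
model = SharedEncoderMTL(n_features=12)
criterion = nn.MSELoss()
x = torch.randn(8, 5, 12)                      # toy batch: window of 5 days
y_u, y_l = torch.randn(8, 1), torch.randn(8, 1)
pred_u, pred_l = model(x)
loss = criterion(pred_u, y_u) + criterion(pred_l, y_l)
loss.backward()
```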

2.3. Data Collection and Preprocessing

In this section, we discuss the selection, collection, and preprocessing of the diverse-source data.

2.3.1. Market Variables

The main predictive variable selected is the interval-valued EUA futures prices for the contract expiring in December 2024 (Dec-24), traded on the ICE CEX platform. The interval-valued EUA futures price data were sourced from the Investing database (https://www.investing.com/), encompassing the period from 4 January 2021 to 23 February 2024 (covering EU ETS Phase 4). As illustrated in Figure 3, the price data exhibit notable non-stationarity, non-linearity, and abrupt volatility, characteristics that necessitate careful modeling approaches.
In this study, to assess the predictive performance of our proposed forecasting framework, we calculated the daily returns of interval-valued prices, i.e., by simultaneously predicting the daily returns of the high and low prices. Table 1 provides a detailed description of the statistical characteristics of the return rates.
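For illustration, a minimal pandas sketch of this target construction, using toy prices and assumed column names:

```python
import pandas as pd

# Toy daily EUA futures high/low prices (EUR), indexed by trading day
prices = pd.DataFrame(
    {"high": [33.1, 33.8, 34.5], "low": [32.0, 32.9, 33.6]},
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)

# Interval-valued daily returns: the two forecasting targets of the framework
targets = pd.DataFrame({
    "ret_high": prices["high"].pct_change(),   # return of the daily high price
    "ret_low": prices["low"].pct_change(),     # return of the daily low price
}).dropna()
```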

2.3.2. Related Market Variables

Previous studies have shown that fluctuations in energy prices, such as natural gas, electricity, coal, and crude oil, are closely related to changes in EU carbon prices when predicting the EU ETS prices [7,8,18]. Therefore, our research incorporates four key daily-frequency energy prices from the energy sector: NBP natural gas futures, Brent crude oil futures, ICE Rotterdam coal futures, and German power base load futures. The STOXX Europe 600 index futures are also selected as a representative indicator of the current and expected economic conditions, reflecting the exogenous impact on carbon pricing. To ensure a comprehensive analysis, all daily frequency variables were collected from 4 January 2021, to 23 February 2024 (www.investing.com). This study constructs the input matrix for model prediction based on the trading days of carbon prices. In cases where trading day prices for other relevant market variables are missing, we apply forward filling to impute missing values using the previous day’s prices for the respective variables [6,18,77].
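The forward-filling step can be sketched as follows, using a toy Brent series and assumed dates purely for illustration:

```python
import pandas as pd

# Hypothetical daily quotes for a related market variable (e.g., Brent futures)
brent = pd.Series(
    [80.1, 81.3, 79.8],
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-07"]),
)
# EUA trading days, including a day on which Brent has no quote
carbon_days = pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07"])

# Align to the carbon trading days and impute gaps with the previous day's price
brent_aligned = brent.reindex(carbon_days).ffill()
```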
We conducted a Pearson correlation analysis to explore the relationships between the EU ETS allowance prices and related variables. The results as visualized in Figure 4 provide further insights into the interdependencies among the selected features. The heatmap reveals a strong correlation between the high, low, close, and open prices of the allowances, with coefficients close to 1, reflecting their inherent interdependence. Among the energy market variables, Brent crude oil futures exhibit the highest correlation with the allowance prices (e.g., 0.58 with high price), followed by Rotterdam coal futures and German power baseload futures, while NBP natural gas futures show slightly weaker correlations. These findings align with prior studies suggesting that energy prices significantly influence EU ETS price dynamics due to their direct connection to emission-intensive production processes. The trading volume also shows moderate correlations with the allowance prices, indicating its role as a potential indicator of market activity and liquidity. On the other hand, the STOXX Europe 600 index futures exhibit relatively weak correlations with the allowance prices, suggesting that financial market conditions may play a more indirect role in shaping the EU ETS dynamics.

2.3.3. Unstructured Data and Search Index

The EU ETS, as a policy-driven artificial market, has experienced sharp price fluctuations due to policy changes, unexpected events, and public sentiment. Research has confirmed a direct relationship between media sentiments and fluctuations in the EUA prices, underscoring the influence of news articles on market dynamics [62,63,65]. To ensure representative online news media data, 2750 articles were gathered from the EU ETS section of the Carbon Pulse website (https://carbon-pulse.com/) between 4 January 2021, and 23 February 2024. The founders of Carbon Pulse possess nearly three decades of experience in carbon market reporting and climate policy analysis. Their commitment to delivering in-depth news and intelligence on global carbon pricing initiatives has established a strong track record in informing market development and global policy-making through a wide array of resources.
Prior to conducting sentiment analysis, the text data were preprocessed with the Natural Language Toolkit (NLTK) to eliminate redundant information: punctuation, numbers, excessive whitespace, and English stopwords were removed, and the text was converted to lowercase. The sentiment scores of the news texts were computed using VADER (Valence Aware Dictionary and Sentiment Reasoner), a sentiment analysis tool that evaluates text sentiment by referencing a lexicon of words assigned sentiment scores and employing straightforward rules. After calculating the positive and negative sentiment scores for each article, we resampled the data on a daily basis to ensure quality for predictive purposes. Figure 5 illustrates the daily variations in positive and negative sentiment intensity over time for Carbon Pulse news.
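A minimal sketch of this preprocessing and scoring pipeline, assuming the article texts are available as Python strings and using NLTK's bundled VADER implementation for illustration:

```python
import re
import nltk
from nltk.corpus import stopwords
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("stopwords")
nltk.download("vader_lexicon")

stop_words = set(stopwords.words("english"))
analyzer = SentimentIntensityAnalyzer()

def clean_text(text: str) -> str:
    """Lowercase, strip punctuation/numbers/extra whitespace, drop stopwords."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    tokens = [w for w in text.split() if w not in stop_words]
    return " ".join(tokens)

# Hypothetical article snippet; real inputs are the Carbon Pulse texts
article = "EU carbon prices surged after the unexpectedly ambitious policy package."
scores = analyzer.polarity_scores(clean_text(article))
pos_intensity, neg_intensity = scores["pos"], scores["neg"]  # daily resampling follows
```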
In addition to news articles, investors often leverage various search engines to explore topics of interest, and search indices reflect public attention levels toward specific subjects [64]. Google Trends provides real-time insights into search behaviors, enabling researchers to measure interest in specific topics across various locations and timeframes. The trends data are categorized by topics, offering a comprehensive perspective on search patterns, and also accommodate individual search terms, which are generally considered more reliable due to their inclusion of precise phrases, spelling variations, and acronyms across multiple languages. For this study, the topic “European Union Emissions Trading System” was selected as representative data for the search index, covering the period from 4 January 2021, to 23 February 2024, with data collected daily. The values of the search index are normalized on a scale from 1 to 100, rather than representing absolute search volumes. Figure 5 also depicts daily changes in attention data regarding the EU ETS over time.
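Such attention data can also be retrieved programmatically; the sketch below uses the unofficial pytrends client, with the keyword and timeframe as assumptions rather than the exact query used in this study:

```python
from pytrends.request import TrendReq

# Query Google Trends interest for the EU ETS over the sample period
pytrends = TrendReq(hl="en-US", tz=0)
pytrends.build_payload(
    kw_list=["European Union Emissions Trading System"],
    timeframe="2021-01-04 2024-02-23",
)
# Index values are normalized by Google (relative, not absolute search volumes)
search_index = pytrends.interest_over_time()
```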

2.4. Forecasting Models and Parameters Setting

In this study, we chose the Temporal Fusion Transformer to implement multi-task learning for the interval-valued EUA futures prices forecasting because its architectural design inherently supports this approach. The TFT model [76] employs hard parameter sharing within the encoder phase to extract shared representations from diverse data sources, while task-specific heads are utilized for predicting the upper and lower bounds of the interval-valued carbon prices. Moreover, the model’s interpretability enables the demonstration of the intricate details of the prediction process, fostering greater user trust in complex deep learning-based predictive systems. We also compared a TFT variant configured for single-task learning to ensure a consistent baseline for evaluating the relative benefits of the proposed multi-task approach. To provide a comprehensive benchmarking landscape, we also employ Transformer [78] and TCN [79] models under the multi-task learning paradigm, ensuring that all three multi-task frameworks are evaluated on an equal footing. For comparison, single-task variants of these architectures (Transformer and TCN) serve as consistent baselines to isolate the benefits introduced by multi-task learning. Furthermore, four well-established deep learning models—LSTM [80], DeepAR [81], DecoderMLP [82], and GRU [83]—were utilized as single-task learning benchmark models to evaluate the predictive performance of the proposed framework. The models were selected based on their demonstrated efficacy in time series forecasting tasks and their extensive application in the literature. Furthermore, when selecting benchmark models to compare forecasting ability, extensive literature has confirmed the nonlinearity and non-stationarity of the allowance prices [58,60], limiting the applicability of traditional econometric models. Deep learning models generally exhibit superior predictive performance. Therefore, we only selected deep learning models for comparison. The decision not to include specific formulas for the deep learning models utilized in this study stems from several considerations. Firstly, these models are well-established in the literature, and their underlying architectures and mathematical formulations are widely available in numerous academic sources, making detailed re-exposition unnecessary. Secondly, the focus of this manuscript is on the comparative analysis of the models’ predictive performance within the proposed framework rather than on the derivation of their mathematical foundations. By omitting the formulas, we aim to streamline the discussion and concentrate on the practical application and results of the models in the context of our study.
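To illustrate how a single TFT can serve both interval bounds, the following is a minimal sketch based on the pytorch-forecasting library, where a list-valued target trains the upper- and lower-bound returns jointly; the dataframe, column names, window length, and hyperparameters are placeholders rather than the study's exact configuration:

```python
from pytorch_forecasting import TimeSeriesDataSet, TemporalFusionTransformer
from pytorch_forecasting.metrics import MultiLoss, QuantileLoss

# df (assumed): one row per trading day with an integer 'time_idx', a constant
# 'series' id, the two targets ('ret_high', 'ret_low'), and exogenous predictors
training = TimeSeriesDataSet(
    df,
    time_idx="time_idx",
    target=["ret_high", "ret_low"],          # multi-task: both interval bounds
    group_ids=["series"],
    max_encoder_length=5,                    # sliding window of 5 trading days
    max_prediction_length=1,                 # 1 day-ahead forecast
    time_varying_unknown_reals=[
        "ret_high", "ret_low", "close", "brent", "gas", "coal",
        "power", "stoxx", "pos_sent", "neg_sent", "search_index",
    ],
)

tft = TemporalFusionTransformer.from_dataset(
    training,
    hidden_size=32,
    attention_head_size=2,
    dropout=0.1,
    loss=MultiLoss([QuantileLoss(), QuantileLoss()]),  # one loss per target
)
```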

Experiment Setup

A sliding time window of size 5, determined as the optimal lag based on the Akaike Information Criterion (AIC), was selected for all forecasting models. Next, we provide a detailed description of the fundamental parameters employed during model training. The data sequence designated for prediction is divided into training, validation, and test sets, comprising 80%, 10%, and 10% of the total sequence length, respectively. The training set is utilized for initial model learning, while the validation set is used to fine-tune and optimize hyperparameters to enhance training performance. The test set is employed directly to evaluate the performance of the final model. The computing hardware used in this study included an NVIDIA (Santa Clara, CA, USA) GeForce RTX 3090 GPU and an Intel (Santa Clara, CA, USA) Xeon(R) Platinum 8362 CPU. The deep learning framework utilized was PyTorch version 2.0.1. Hyperparameters were automatically optimized using Optuna (https://optuna.org/). A summary of the parameters used for training the forecasting models is presented in Table 2.
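A minimal sketch of the chronological data split and the Optuna-based hyperparameter search described above; the search space and the training routine are illustrative assumptions:

```python
import optuna

def chronological_split(series, train=0.8, val=0.1):
    """Split a sequence into train/validation/test blocks without shuffling."""
    n = len(series)
    i, j = int(n * train), int(n * (train + val))
    return series[:i], series[i:j], series[j:]

def objective(trial: optuna.Trial) -> float:
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True),
        "hidden_size": trial.suggest_int("hidden_size", 16, 128),
        "dropout": trial.suggest_float("dropout", 0.1, 0.5),
    }
    # train_and_validate is a placeholder for fitting the model on the training
    # set with `params` and returning the validation loss
    return train_and_validate(params)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```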

2.5. Evaluation Criteria

To rigorously assess the forecasting performance of the proposed multi-task learning framework and benchmark models, we employ six widely recognized metrics tailored specifically for interval-valued predictions [43,84].
First, we utilize the interval U of Theil statistics ($U^I$) and the interval average relative variance ($ARV^I$), defined as follows:
$$U^I = \frac{\sqrt{\sum_{j=1}^{m}\left(X_{j+1}^{U}-\hat{X}_{j+1}^{U}\right)^{2}+\sum_{j=1}^{m}\left(X_{j+1}^{L}-\hat{X}_{j+1}^{L}\right)^{2}}}{\sqrt{\sum_{j=1}^{m}\left(X_{j+1}^{U}-X_{j}^{U}\right)^{2}+\sum_{j=1}^{m}\left(X_{j+1}^{L}-X_{j}^{L}\right)^{2}}},$$
$$ARV^I = \frac{\sum_{j=1}^{m}\left(X_{j+1}^{U}-\hat{X}_{j+1}^{U}\right)^{2}+\sum_{j=1}^{m}\left(X_{j+1}^{L}-\hat{X}_{j+1}^{L}\right)^{2}}{\sum_{j=1}^{m}\left(X_{j+1}^{U}-\bar{X}^{U}\right)^{2}+\sum_{j=1}^{m}\left(X_{j+1}^{L}-\bar{X}^{L}\right)^{2}}.$$
Here, $m$ represents the number of intervals in the test set, $\hat{I}_t = [\hat{X}_t^{U}, \hat{X}_t^{L}]^{T}$ indicates the forecasted interval at time $t$, and $\bar{I} = [\bar{X}^{U}, \bar{X}^{L}]^{T}$ denotes the sample mean of the interval, where $\bar{X}^{U}$ is the mean of the upper bounds and $\bar{X}^{L}$ is the mean of the lower bounds.
Both $U^I$ and $ARV^I$ are commonly used to compare forecasting errors between the reference model and a naïve model [85]. Specifically, $U^I$ is utilized to evaluate the forecasting errors of the random walk model against the reference model. A value of $U^I > 1$ indicates that the reference model performs worse than the random walk model, $U^I = 1$ suggests equivalent performance, and $U^I < 1$ indicates that the reference model outperforms the random walk model. As $U^I$ approaches zero, the reference model’s performance is deemed perfect. Similarly, $ARV^I$ compares the reference model’s errors to the average of the series, with lower values indicating better forecasts. Notably, $ARV^I = 0$ denotes a perfect reference model, while $ARV^I = 1$ implies that the model performs similarly to the series average. Importantly, these metrics account for both upper and lower bound forecasting errors simultaneously and are scale-invariant with respect to the time series.
Second, we employ the mean squared error of interval ($MSE^I$) [86] alongside two distance measures: the mean distance error based on the Ichino–Yaguchi distance ($MDE_1$) and the Hausdorff distance ($MDE_2$) [87], defined as follows:
$$MSE^I = \frac{1}{m}\sum_{j=1}^{m}\left[\left(c_j-\hat{c}_j\right)^{2}+\left(r_j-\hat{r}_j\right)^{2}\right],$$
$$MDE_1 = \frac{1}{m}\sum_{j=1}^{m}\frac{1}{2}\left(\left|X_j^{U}-\hat{X}_j^{U}\right|+\left|X_j^{L}-\hat{X}_j^{L}\right|\right),$$
$$MDE_2 = \frac{1}{m}\sum_{j=1}^{m}\left(\left|c_j-\hat{c}_j\right|+\left|r_j-\hat{r}_j\right|\right),$$
where $c_j=(X_j^{L}+X_j^{U})/2$ and $r_j=(X_j^{U}-X_j^{L})/2$ represent the center and radius of the $j$-th interval, while $\hat{c}_j=(\hat{X}_j^{L}+\hat{X}_j^{U})/2$ and $\hat{r}_j=(\hat{X}_j^{U}-\hat{X}_j^{L})/2$ denote the center and radius of the forecasted interval [88]. Consequently, $|c_j-\hat{c}_j|$ and $|r_j-\hat{r}_j|$ represent the positional and length errors between the actual and forecasted intervals, respectively. Thus, $MSE^I$ captures both positional and length errors, with lower values indicating superior forecasts. $MDE_1$ quantifies the deviation of the interval’s minimum and maximum, while $MDE_2$ assesses the deviation of the center and radius.
Third, to ensure practical applicability in the financial domain, a robust prediction model must exhibit not only strong fitting performance but also superior predictive capability. Therefore, we incorporate interval directional statistics ($D_{stat}^I$) to evaluate the model’s predictability, defined as follows:
$$D_{stat}^I = \frac{1}{m}\sum_{j=1}^{m} a_j, \qquad a_j = \begin{cases} 1, & \left(X_j^{U}\cdot\hat{X}_j^{U} > 0\right)\ \text{and}\ \left(X_j^{L}\cdot\hat{X}_j^{L} > 0\right),\\ 0, & \text{otherwise}. \end{cases}$$
A larger value of $D_{stat}^I$ signifies a higher predictive ability of the model.
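For concreteness, the metrics as reconstructed above can be computed with a short NumPy routine such as the following sketch, where xu, xl hold the actual upper and lower bound returns of the test set and fu, fl the corresponding forecasts (array names are assumptions):

```python
import numpy as np

def interval_metrics(xu, xl, fu, fl):
    """Interval forecast metrics; inputs are aligned 1-D arrays over the test set."""
    c, r = (xl + xu) / 2, (xu - xl) / 2            # actual centers and radii
    ch, rh = (fl + fu) / 2, (fu - fl) / 2          # forecasted centers and radii
    num = np.sum((xu[1:] - fu[1:]) ** 2 + (xl[1:] - fl[1:]) ** 2)
    ui = np.sqrt(num) / np.sqrt(np.sum(np.diff(xu) ** 2 + np.diff(xl) ** 2))
    arvi = num / np.sum((xu[1:] - xu.mean()) ** 2 + (xl[1:] - xl.mean()) ** 2)
    msei = np.mean((c - ch) ** 2 + (r - rh) ** 2)
    mde1 = np.mean(0.5 * (np.abs(xu - fu) + np.abs(xl - fl)))
    mde2 = np.mean(np.abs(c - ch) + np.abs(r - rh))
    dstat = np.mean((xu * fu > 0) & (xl * fl > 0))  # directional hit rate
    return ui, arvi, msei, mde1, mde2, dstat
```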

3. Results and Discussion

In this section, we begin by comparing the forecasting results of various models, followed by an ablation study to assess the contribution of integrated big data. We then proceed to analyze the model’s interpretability, and conclude with a robustness analysis of the prediction results.

3.1. Performance Evaluation

The performance of various forecasting models for the interval-valued EUA prices across three forecasting horizons—1 day-ahead, 3 days-ahead, and 5 days-ahead—is summarized in Table 3. The results highlight the effectiveness of the proposed multi-task learning framework (TFT*) in comparison to multi-task and single-task benchmark models.
Overall, the TFT* model demonstrates consistent superiority across all forecasting horizons, achieving lower error metrics and maintaining interval directional accuracy. Notably, the A R V I values for TFT* remained below 1 across all horizons, indicating its ability to outperform the random walk model. In contrast, the multi-task (Transformer* and TCN*) and single-task benchmarks (Transformer, TCN, LSTM, GRU, DeepAR, and DecoderMLP) show declining performance as the forecasting horizon extends, with A R V I values exceeding 1 in longer horizons, signaling their limited ability.
Focusing on the 1 day-ahead forecasts, the TFT* model outperforms all other models across most evaluation metrics, indicating its effectiveness in capturing immediate price dynamics. Transformer* and TCN* also exhibit solid performance, although their error values are slightly higher and directional accuracy somewhat lower. Among the single-task variants, TFT and LSTM achieve competitive $D_{stat}^I$ values, but their higher error measures indicate challenges in maintaining precise interval estimates. As the forecasting horizon extended to 3 days-ahead, the TFT* model maintained robust performance, continuing to outperform the random walk model and demonstrating its adaptability over slightly longer periods. However, other benchmark models show increased error rates, reflecting challenges in maintaining prediction precision over multiple days. At the 5 days-ahead horizon, the performance gap widens further. The TFT* model remains the top performer, maintaining balanced interval-valued accuracy and directional consistency, while none of the benchmarks is able to surpass the random walk model.
A more intuitive assessment of model performance is presented in Figure 6, which offers a 3D visual comparison of the predictive models across three forecasting horizons. As depicted, prediction accuracy generally declines as the forecasting horizon lengthens, consistent with typical patterns observed in time series forecasting.
We further conducted a relative percentage improvement analysis to compare the TFT* model with the benchmark models within 1 day-ahead forecasting since it is the most crucial horizon for real-time decision-making in the carbon market. As shown in Figure 7, the proposed multi-task learning model demonstrates a consistent advantage in the accuracy of interval-valued predictions, such as showing a relative improvement of approximately 40% compared to DeepAR and DecoderMLP, particularly in $MSE^I$ and $ARV^I$ metrics. While the enhancement is somewhat reduced when compared to the multi-task learning Transformer* and TCN* models, it remains evident. In terms of $D_{stat}^I$, the TFT* model shows a relative improvement of 3.59% over multi-task and single-task Transformer, 5.89% over multi-task and single-task TCN, 9.30% over GRU, 11.63% over DeepAR, and 53.48% over DecoderMLP. Although its performance is comparable to that of single-task learning TFT and LSTM, TFT* achieves greater enhancements across other quantitative prediction standards.

3.2. Ablation Analysis

To evaluate the efficacy of incorporating the diverse data streams from multiple sources into the interval-valued EUA prices prediction, we conducted a series of ablation experiments. As detailed in Table 4, the experiments systematically assess the impact of integrating various data categories into both single-task and multi-task learning frameworks to enhance predictive performance.
In the single-task learning scenario, TFT represents the configuration that integrates all the diverse-source variables, including carbon market internal variables, related market futures prices, news sentiment intensity, and search index data. TFT (Category 1) serves as the baseline single-task model, incorporating only the carbon market internal variables for forecasting. In the multi-task learning framework, TFT* models are categorized based on the data sources they utilize. TFT* (Category 1) is the baseline multi-task model that incorporates only the carbon market internal variables. Building on this baseline, TFT* (Category 2) adds the related market futures price variables, expanding the input scope beyond the carbon market data. TFT* (Category 3) further incorporates the search index data into the configuration of TFT* (Category 2), reflecting the influence of public interest on the market dynamics. TFT* (Category 4) builds on TFT* (Category 2) by incorporating the news sentiment intensity, providing a complementary layer of information on the market sentiment. Finally, TFT* integrates all the diverse-source variables, representing the complete configuration of the proposed multi-task learning framework.
The results in Table 4 demonstrate the incremental benefits of integrating additional data sources for both single-task and multi-task learning frameworks. In the single-task scenario, TFT (Category 1), which incorporates only the carbon market internal variables, exhibits the lowest predictive accuracy among all configurations. The inclusion of additional data sources in TFT improves performance, but it consistently lags behind its multi-task counterparts across all evaluation metrics. For the multi-task learning framework, the baseline model, TFT* (Category 1), demonstrates the lowest predictive accuracy among the multi-task configurations. Expanding the input scope by incorporating additional data sources greatly improves performance. For instance, TFT* (Category 2), which includes the related market futures prices, achieves a reduction in $MSE^I$ by 7.93% and $ARV^I$ by 8.10%, alongside an 8.34% improvement in interval directional accuracy ($D_{stat}^I$) compared to the baseline. Further enhancement is observed in TFT* (Category 3), where the addition of the search index data to TFT* (Category 2) results in a further reduction in $MSE^I$ by 8.16% and $ARV^I$ by 10.41%, while maintaining the same 8.34% improvement in $D_{stat}^I$. Similarly, TFT* (Category 4), which integrates the news sentiment intensity instead of the search index data into TFT* (Category 2), demonstrates a marked improvement, with $D_{stat}^I$ increasing by 21.43% compared to the baseline, highlighting the substantial influence of sentiment data on prediction performance. This phenomenon has also been highlighted in other studies [62,63,66], which have demonstrated that incorporating indices derived from online news can largely enhance the accuracy of models in forecasting the allowance prices within the EU ETS. These results underscore the importance of combining the carbon market internal variables with the related market futures prices, search index data, and news sentiment intensity for robust interval-valued carbon price forecasting in the EU ETS.

3.3. Interpretability Use Cases

After confirming the performance advantages of the model, we used the TFT model under multi-task learning to demonstrate two interpretability use cases: one is the visualization of temporal patterns for the time index used in the model encoder, and the other is assessing the importance of each input variable in the prediction.

3.3.1. Visualizing Temporal Patterns

The interpretability of the TFT model provided insights into the importance of different time indices within the sliding window for the interval-valued allowance prices forecasting when leveraging multi-task learning. Figure 8 illustrates the attention weight patterns assigned to time indices during one-step predictions on the test dataset, highlighting the temporal focus for forecasting high and low prices of the carbon emission allowances.
For high price predictions, the multi-task learning framework consistently allocated substantial attention to a specific trading day within certain weekly forecasting periods. The pattern indicates that the model identifies this day as a highly influential index for predicting subsequent price peaks. Notably, this attention is persistently directed toward the same time index across multiple weeks. In contrast, for low price predictions, the model assigned higher attention weights to the three most recent trading days within the sliding window, suggesting that these time indices were critical for capturing the factors influencing minimum price predictions, with relatively lower attention allocated to earlier time steps.

3.3.2. Analyzing Variable Importance

The importance of each input variable in the multi-task learning framework of the TFT model was quantified by analyzing the selection weights obtained during predictions. These weights were aggregated across the entire test set to create an importance distribution for each variable, revealing the key inputs driving the forecasting process. Figure 9 provides a heatmap depicting the relative importance of all input variables over different time steps, along with a summary bar chart on the right showing the overall contribution of each variable. Each heatmap block along the horizontal axis represents the features that the model emphasizes when predicting the interval carbon prices at a specific time point, quantitatively showing which feature segments contribute more to the prediction. For instance, as the test set progresses, if the attention weights of multiple features in the matrix simultaneously increase, it may indicate strong interactive effects among them during that period.
The analysis highlighted that the close price of carbon allowances was the most influential variable, accounting for over half of the total importance. The finding underscores the central role of the close price in the forecasting process, as it serves as a comprehensive indicator of market sentiment and daily performance, critical for generating accurate predictions. The second most important set of variables includes energy market factors, such as natural gas, coal, and crude oil prices. The energy inputs greatly impact the model’s predictions, reflecting the intrinsic link between energy prices and carbon costs, as fossil fuel combustion is a major source of emissions. The importance of energy prices in forecasting EUA futures prices has similarly been highlighted in interpretability analyses of the close prices prediction [7,89]. While the financial market variable (STOXX Europe 600 index) shows a secondary influence, it still provides contextual information that indirectly affects the allowance price forecasts. The financial market conditions affect investment flows and liquidity within the carbon market, making it crucial for risk managers to remain aware of these influences during periods of economic volatility. Salvagnin et al. [77] pointed out that during the transition of the EU ETS to Phase IV, the influence of financial market volatility appeared to take a central role. Regarding the EUA futures open-high-low-close price data and trading volume, the analysis reveals that the open price and trading volume are more influential in the forecasting process than the high and low prices. This suggests that early trading signals and overall market activity provide more reliable information for predicting price movements. Lastly, online news sentiment intensity and the search index also contribute to the prediction, albeit with slightly less importance than market-related data. Positive news sentiment exerts a stronger influence on predictions than negative sentiment, reflecting the market’s tendency to respond more strongly to optimistic developments.

3.4. Robustness Analysis

3.4.1. Forecasting Robustness Under Different Market Conditions

To evaluate the reliability of our proposed forecasting framework, we conducted an extensive analysis across diverse market conditions during the test period, focusing on interval directional accuracy ( D s t a t I ). Specifically, we used average-based thresholds to separate (i) high and low volatility, and (ii) high and low global (GEPU) and European (EEPU) policy uncertainty, reflecting the crucial influence of policy fluctuations, macroeconomic disruptions, and political events on the EU ETS prices behavior. In addition, we examined two further market characteristics: (iii) market liquidity, defined by average trading volumes, and (iv) energy price levels, proxied by the mean NBP natural gas futures prices (natural gas plays a significant role in Europe’s energy consumption and power generation structure and is closely linked to the carbon market [3,6,18]). Such an extension allowed a more comprehensive view of how varying liquidity and energy costs might affect carbon price predictability.
As summarized in Table 5, the proposed multi-task learning framework (TFT*) consistently outperformed other models in high uncertainty, high volatility, high liquidity, and both high and low energy price scenarios, demonstrating its ability to effectively capture market dynamics where price movements were more volatile and harder to predict. Even under low-volatility and low-uncertainty scenarios, where market dynamics were more stable, TFT* maintained competitive accuracy. While single-task benchmarks such as LSTM and GRU achieved comparable or slightly higher scores in some low-volatility or low-uncertainty environments, their performance degraded considerably in high-volatility or high-uncertainty scenarios. The contrast demonstrates the limitations of single-task learning approaches in handling complex market conditions and highlights the advantage of the multi-task learning framework in ensuring robust and reliable predictions across diverse market environments.

3.4.2. Superior Predictive Ability Test

The Superior Predictive Ability (SPA) test [90] conducted in this study served as a rigorous evaluation of the proposed multi-task learning framework compared to other forecasting algorithms. The statistical values shown in Table 6 are the p-values resulting from the SPA test, which reveal whether the forecasting accuracy of the proposed model is meaningfully greater than that of its comparators. When the p-value falls below 0.05, it denotes statistical significance, suggesting that the test model delivers superior predictive performance relative to the benchmark. Furthermore, our analysis confined itself exclusively to 1 day-ahead forecasting outcomes, a framework widely adopted and practically essential within carbon price prediction. As shown in Panel A, the multi-task TFT framework (TFT*) consistently outperforms benchmark models for high price predictions, with statistical significance in most comparisons. Similarly, Panel B for low-price predictions further validates its robustness.
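As an illustration of how such a test can be run in practice, the sketch below uses the SPA implementation in the arch package with placeholder loss series; it mirrors, rather than reproduces, the study's procedure:

```python
import numpy as np
from arch.bootstrap import SPA

rng = np.random.default_rng(0)
# Placeholder daily forecast-error losses for a benchmark and two competing models
benchmark_losses = rng.normal(1.0, 0.1, 250)
model_losses = np.column_stack([
    rng.normal(0.95, 0.1, 250),   # e.g., losses of the proposed TFT* framework
    rng.normal(1.02, 0.1, 250),   # e.g., losses of a single-task baseline
])

spa = SPA(benchmark_losses, model_losses, reps=1000)
spa.compute()
print(spa.pvalues)   # lower, consistent, and upper SPA p-values
```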

4. Conclusions

In this paper, we have proposed a novel multi-task learning-based framework for the prediction of interval-valued EU ETS carbon allowance futures prices. We utilized a Transformer-based model (Temporal Fusion Transformer) to implement multi-task learning and forecast interval-valued carbon prices by integrating diverse-source data, including online news, Google search trends, and market-related futures prices. Our findings demonstrated that the proposed multi-task learning framework consistently outperformed all benchmark models in predictive accuracy and exhibited robustness under conditions of high market volatility or economic policy uncertainty. Ablation experiments revealed that incorporating either online news sentiment or search trend data individually improved the model’s predictive performance. When both news sentiment intensity and search index data were integrated, the model achieved the highest level of predictive accuracy, indicating that the combined use of diverse big data sources effectively captured the complex dynamics of the carbon market. The interpretability analysis offered deeper insights into the factors influencing carbon prices. In this study, the model’s attention mechanisms for the test period indicated that a specific trading day within certain weekly periods considerably influenced high price predictions, highlighting its importance for identifying price peaks. For low price predictions, the model allocated substantial attention to the three trading days preceding the prediction date, underscoring their critical role in determining minimum price forecasts. Moreover, variable importance analysis confirmed that the carbon allowance close price was the most influential factor, followed by energy market variables such as natural gas, coal, and crude oil prices. Online news sentiment and search index data also contributed meaningfully to the forecasting process, with positive news sentiment exerting a stronger influence than negative sentiment.
The findings carry important policy implications and practical significance for stakeholders in the carbon market and for advancing environmental management. Policymakers and regulators could leverage the model’s insights to better understand how market sentiment and energy prices affect the EU ETS, enabling more informed policy interventions. Investors and companies could monitor the key factors identified during the forecasting process to optimize compliance strategies, low-carbon technology investments, and risk management against price volatility. The interpretability of the proposed framework facilitates user trust in complex deep learning-based predictive models, providing greater transparency in decision-making processes.
Future research could build on these findings by exploring the applicability of the proposed framework to other carbon markets or broader financial markets where interval-valued data plays a key role. Furthermore, the integration of multimodal data related to the carbon market (e.g., images, videos, audio) for the allowance prices prediction is expected to become a critical application direction driven by advancements in deep learning technologies.

Author Contributions

Conceptualization, D.L.; methodology, D.L., L.W. and S.L.; software, D.L., L.W. and S.L.; writing—original draft preparation, D.L., L.W. and S.L.; visualization, D.L., L.W. and S.L.; writing—review and editing, supervision, funding acquisition, Z.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was supported by the National Natural Science Foundation of China under Grant No. 72341030.

Data Availability Statement

The original data presented in this study are openly available in the public repository at https://github.com/dinggaoliu/Interval-valued-carbon-prices-forecasting (accessed on 26 January 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
$ARV^I$: Interval average relative variance
DeepAR: Deep auto-regressive recurrent neural network
$D_{stat}^I$: Interval directional statistics
EEPU: European economic policy uncertainty
EPU: Economic policy uncertainty
EUA: European Union allowance
EU ETS: European Union Emissions Trading System
GEPU: Global economic policy uncertainty
GLU: Gated linear unit
GRN: Gated residual network
GRU: Gated recurrent unit
LSTM: Long short-term memory
MAE: Mean absolute error
$MDE_1$: Mean distance error based on Ichino–Yaguchi distance
$MDE_2$: Mean distance error based on Hausdorff distance
MLP: Multilayer perceptron
MSE: Mean squared error
$MSE^I$: Mean squared error of interval
MTL: Multi-task learning
NLTK: Natural Language Toolkit
RNN: Recurrent neural network
SDA: Symbolic data analysis
SPA: Superior predictive ability
TCN: Temporal convolutional network
TFT: Temporal Fusion Transformer
$U^I$: Interval U of Theil statistics
VADER: Valence Aware Dictionary and Sentiment Reasoner

References

  1. Bayer, P.; Aklin, M. The European Union Emissions Trading System reduced CO2 emissions despite low prices. Proc. Natl. Acad. Sci. USA 2020, 117, 8804–8812. [Google Scholar] [CrossRef] [PubMed]
  2. Dechezleprêtre, A.; Nachtigall, D.; Venmans, F. The joint impact of the European Union emissions trading system on carbon emissions and economic performance. J. Environ. Econ. Manag. 2023, 118, 102758. [Google Scholar] [CrossRef]
  3. Lovcha, Y.; Perez-Laborda, A.; Sikora, I. The determinants of CO2 prices in the EU emission trading system. Appl. Energy 2022, 305, 117903. [Google Scholar] [CrossRef]
  4. Colmer, J.; Martin, R.; Muûls, M.; Wagner, U.J. Does Pricing Carbon Mitigate Climate Change? Firm-level evidence from the European Union Emissions Trading System. Rev. Econ. Stud. 2024, rdae055. [Google Scholar] [CrossRef]
  5. Martin, R.; Muûls, M.; De Preux, L.B.; Wagner, U.J. Industry compensation under relocation risk: A firm-level analysis of the EU emissions trading scheme. Am. Econ. Rev. 2014, 104, 2482–2508. [Google Scholar] [CrossRef]
  6. Pietzcker, R.C.; Osorio, S.; Rodrigues, R. Tightening EU ETS targets in line with the European Green Deal: Impacts on the decarbonization of the EU power sector. Appl. Energy 2021, 293, 116914. [Google Scholar] [CrossRef]
  7. Liu, D.; Chen, K.; Cai, Y.; Tang, Z. Interpretable EU ETS Phase 4 prices forecasting based on deep generative data augmentation approach. Financ. Res. Lett. 2024, 61, 105038. [Google Scholar] [CrossRef]
  8. Eslahi, M.; Mazza, P. Can weather variables and electricity demand predict carbon emissions allowances prices? Evidence from the first three phases of the EU ETS. Ecol. Econ. 2023, 214, 107985. [Google Scholar] [CrossRef]
  9. Liu, J.; Zhang, Z.; Yan, L.; Wen, F. Forecasting the volatility of EUA futures with economic policy uncertainty using the GARCH-MIDAS model. Financ. Innov. 2021, 7, 76. [Google Scholar] [CrossRef]
  10. Nguyen, D.K.; Huynh, T.L.D.; Nasir, M.A. Carbon emissions determinants and forecasting: Evidence from G6 countries. J. Environ. Manag. 2021, 285, 111988. [Google Scholar] [CrossRef]
  11. Wei, Y.; Gong, P.; Zhang, J.; Wang, L. Exploring public opinions on climate change policy in “Big Data Era”—A case study of the European Union Emission Trading System (EU-ETS) based on Twitter. Energy Policy 2021, 158, 112559. [Google Scholar] [CrossRef]
  12. Zhu, B.; Ye, S.; Wang, P.; He, K.; Zhang, T.; Wei, Y.M. A novel multiscale nonlinear ensemble leaning paradigm for carbon price forecasting. Energy Econ. 2018, 70, 143–157. [Google Scholar] [CrossRef]
  13. Qin, Q.; Huang, Z.; Zhou, Z.; Chen, Y.; Zhao, W. Hodrick–Prescott filter-based hybrid ARIMA–SLFNs model with residual decomposition scheme for carbon price forecasting. Appl. Soft Comput. 2022, 119, 108560. [Google Scholar] [CrossRef]
  14. Zhao, S.; Wang, Y.; Deng, G.; Yang, P.; Chen, Z.; Li, Y. An intelligently adjusted carbon price forecasting approach based on breakpoints segmentation, feature selection and adaptive machine learning. Appl. Soft Comput. 2023, 149, 110948. [Google Scholar] [CrossRef]
  15. Yang, S.; Chen, D.; Li, S.; Wang, W. Carbon price forecasting based on modified ensemble empirical mode decomposition and long short-term memory optimized by improved whale optimization algorithm. Sci. Total Environ. 2020, 716, 137117. [Google Scholar] [CrossRef] [PubMed]
  16. Huang, Z.; Zhang, W. Forecasting carbon prices in China’s pilot carbon market: A multi-source information approach with conditional generative adversarial networks. J. Environ. Manag. 2024, 359, 120967. [Google Scholar] [CrossRef] [PubMed]
  17. Sayed, G.I.; Abd El-Latif, E.I.; Darwish, A.; Snasel, V.; Hassanien, A.E. An optimized and interpretable carbon price prediction: Explainable deep learning model. Chaos Solitons Fractals 2024, 188, 115533. [Google Scholar] [CrossRef]
  18. Zhao, X.; Han, M.; Ding, L.; Kang, W. Usefulness of economic and energy data at different frequencies for carbon price forecasting in the EU ETS. Appl. Energy 2018, 216, 132–141. [Google Scholar] [CrossRef]
  19. Han, M.; Ding, L.; Zhao, X.; Kang, W. Forecasting carbon prices in the Shenzhen market, China: The role of mixed-frequency factors. Energy 2019, 171, 69–76. [Google Scholar] [CrossRef]
  20. Sun, S.; Jin, F.; Li, H.; Li, Y. A new hybrid optimization ensemble learning approach for carbon price forecasting. Appl. Math. Model. 2021, 97, 182–205. [Google Scholar] [CrossRef]
  21. Mao, S.; Zeng, X.J. SimVGNets: Similarity-based visibility graph networks for carbon price forecasting. Expert Syst. Appl. 2023, 230, 120647. [Google Scholar] [CrossRef]
  22. Cao, Y.; Zha, D.; Wang, Q.; Wen, L. Probabilistic carbon price prediction with quantile temporal convolutional network considering uncertain factors. J. Environ. Manag. 2023, 342, 118137. [Google Scholar] [CrossRef]
  23. Shi, H.; Wei, A.; Xu, X.; Zhu, Y.; Hu, H.; Tang, S. A CNN-LSTM based deep learning model with high accuracy and robustness for carbon price forecasting: A case of Shenzhen’s carbon market in China. J. Environ. Manag. 2024, 352, 120131. [Google Scholar] [CrossRef] [PubMed]
  24. Yu, L.; Wang, S.; Lai, K.K. Forecasting crude oil price with an EMD-based neural network ensemble learning paradigm. Energy Econ. 2008, 30, 2623–2635. [Google Scholar] [CrossRef]
  25. Niu, X.; Wang, J.; Zhang, L. Carbon price forecasting system based on error correction and divide-conquer strategies. Appl. Soft Comput. 2022, 118, 107935. [Google Scholar] [CrossRef]
  26. Zhu, B.; Han, D.; Wang, P.; Wu, Z.; Zhang, T.; Wei, Y.M. Forecasting carbon price using empirical mode decomposition and evolutionary least squares support vector regression. Appl. Energy 2017, 191, 521–530. [Google Scholar] [CrossRef]
  27. Zhu, J.; Wu, P.; Chen, H.; Liu, J.; Zhou, L. Carbon price forecasting with variational mode decomposition and optimal combined model. Phys. A Stat. Mech. Its Appl. 2019, 519, 140–158. [Google Scholar] [CrossRef]
  28. Lu, H.; Ma, X.; Huang, K.; Azimi, M. Carbon trading volume and price forecasting in China using multiple machine learning models. J. Clean. Prod. 2020, 249, 119386. [Google Scholar] [CrossRef]
  29. Huang, Y.; Dai, X.; Wang, Q.; Zhou, D. A hybrid model for carbon price forecasting using GARCH and long short-term memory network. Appl. Energy 2021, 285, 116485. [Google Scholar] [CrossRef]
  30. Wang, J.; Sun, X.; Cheng, Q.; Cui, Q. An innovative random forest-based nonlinear ensemble paradigm of improved feature extraction and deep learning for carbon price forecasting. Sci. Total Environ. 2021, 762, 143099. [Google Scholar] [CrossRef]
  31. Zhou, F.; Huang, Z.; Zhang, C. Carbon price forecasting based on CEEMDAN and LSTM. Appl. Energy 2022, 311, 118601. [Google Scholar] [CrossRef]
  32. Nadirgil, O. Carbon price prediction using multiple hybrid machine learning models optimized by genetic algorithm. J. Environ. Manag. 2023, 342, 118061. [Google Scholar] [CrossRef]
  33. Mao, Y.; Yu, X. A hybrid forecasting approach for China’s national carbon emission allowance prices with balanced accuracy and interpretability. J. Environ. Manag. 2024, 351, 119873. [Google Scholar] [CrossRef] [PubMed]
  34. Zhang, X.; Yang, K.; Lu, Q.; Wu, J.; Yu, L.; Lin, Y. Predicting carbon futures prices based on a new hybrid machine learning: Comparative study of carbon prices in different periods. J. Environ. Manag. 2023, 346, 118962. [Google Scholar] [CrossRef] [PubMed]
  35. Lan, Y.; Huangfu, Y.; Huang, Z.; Zhang, C. Breaking through the limitation of carbon price forecasting: A novel hybrid model based on secondary decomposition and nonlinear integration. J. Environ. Manag. 2024, 362, 121253. [Google Scholar] [CrossRef] [PubMed]
  36. Liu, S.; Xie, G.; Wang, Z.; Wang, S. A secondary decomposition-ensemble framework for interval carbon price forecasting. Appl. Energy 2024, 359, 122613. [Google Scholar] [CrossRef]
  37. Zheng, L.; Sun, Y.; Wang, S. A novel interval-based hybrid framework for crude oil price forecasting and trading. Energy Econ. 2024, 130, 107266. [Google Scholar] [CrossRef]
  38. Yang, K.; Cheng, Z.; Li, M.; Wang, S.; Wei, Y. Fortify the investment performance of crude oil market by integrating sentiment analysis and an interval-based trading strategy. Appl. Energy 2024, 353, 122102. [Google Scholar] [CrossRef]
  39. Li, M.; Yang, K.; Lin, W.; Wei, Y.; Wang, S. An interval constraint-based trading strategy with social sentiment for the stock market. Financ. Innov. 2024, 10, 56. [Google Scholar] [CrossRef]
  40. Zhu, M.; Xu, H.; Wang, M.; Tian, L. Carbon price interval prediction method based on probability density recurrence network and interval multi-layer perceptron. Phys. A Stat. Mech. Its Appl. 2024, 636, 129543. [Google Scholar] [CrossRef]
  41. Xu, H.; Wang, M.; Jiang, S.; Yang, W. Carbon price forecasting with complex network and extreme learning machine. Phys. A Stat. Mech. Its Appl. 2020, 545, 122830. [Google Scholar] [CrossRef]
  42. Zeng, L.; Hu, H.; Tang, H.; Zhang, X.; Zhang, D. Carbon emission price point-interval forecasting based on multivariate variational mode decomposition and attention-LSTM model. Appl. Soft Comput. 2024, 157, 111543. [Google Scholar] [CrossRef]
  43. Sun, S.; Sun, Y.; Wang, S.; Wei, Y. Interval decomposition ensemble approach for crude oil price forecasting. Energy Econ. 2018, 76, 274–287. [Google Scholar] [CrossRef]
  44. Gao, F.; Shao, X. A novel interval decomposition ensemble model for interval carbon price forecasting. Energy 2022, 243, 123006. [Google Scholar] [CrossRef]
  45. Hao, Y.; Wang, X.; Wang, J.; Yang, W. A novel interval-valued carbon price analysis and forecasting system based on multi-objective ensemble strategy for carbon trading market. Expert Syst. Appl. 2024, 244, 122912. [Google Scholar] [CrossRef]
  46. Ji, Z.; Niu, D.; Li, M.; Li, W.; Sun, L.; Zhu, Y. A three-stage framework for vertical carbon price interval forecast based on decomposition–integration method. Appl. Soft Comput. 2022, 116, 108204. [Google Scholar] [CrossRef]
  47. Tang, X.; Wang, J.; Zhang, X. Optimal combination weight interval-valued carbon price forecasting model based on adaptive decomposition method. J. Clean. Prod. 2023, 427, 139232. [Google Scholar] [CrossRef]
  48. Zhu, B.; Wan, C.; Wang, P. Interval forecasting of carbon price: A novel multiscale ensemble forecasting approach. Energy Econ. 2022, 115, 106361. [Google Scholar] [CrossRef]
  49. Liu, J.; Wang, P.; Chen, H.; Zhu, J. A combination forecasting model based on hybrid interval multi-scale decomposition: Application to interval-valued carbon price forecasting. Expert Syst. Appl. 2022, 191, 116267. [Google Scholar] [CrossRef]
  50. Wang, P.; Tao, Z.; Liu, J.; Chen, H. Improving the forecasting accuracy of interval-valued carbon price from a novel multi-scale framework with outliers detection: An improved interval-valued time series analysis mode. Energy Econ. 2023, 118, 106502. [Google Scholar] [CrossRef]
  51. Yu, X.; Jiang, N.; Zhang, W. A combined model based on decomposition and reorganization, weight optimization algorithms for carbon price point and interval prediction. J. Clean. Prod. 2024, 472, 143445. [Google Scholar] [CrossRef]
  52. Zheng, G.; Li, K.; Yue, X.; Zhang, Y. A multifactor hybrid model for carbon price interval prediction based on decomposition-integration framework. J. Environ. Manag. 2024, 363, 121273. [Google Scholar] [CrossRef]
  53. Tian, C.; Hao, Y. Point and interval forecasting for carbon price based on an improved analysis-forecast system. Appl. Math. Model. 2020, 79, 126–144. [Google Scholar] [CrossRef]
  54. Niu, X.; Wang, J.; Wei, D.; Zhang, L. A combined forecasting framework including point prediction and interval prediction for carbon emission trading prices. Renew. Energy 2022, 201, 46–59. [Google Scholar] [CrossRef]
  55. Wang, J.; Wang, Y.; Li, H.; Yang, H.; Li, Z. Ensemble forecasting system based on decomposition-selection-optimization for point and interval carbon price prediction. Appl. Math. Model. 2023, 113, 262–286. [Google Scholar] [CrossRef]
  56. Zhao, Y.; Zhang, W.; Gong, X.; Liu, X. Carbon futures return forecasting: A novel method based on decomposition-ensemble strategy and Markov process. Appl. Soft Comput. 2024, 163, 111869. [Google Scholar] [CrossRef]
  57. Yang, K.; Sun, Y.; Hong, Y.; Wang, S. Forecasting interval carbon price through a multi-scale interval-valued decomposition ensemble approach. Energy Econ. 2024, 139, 107952. [Google Scholar] [CrossRef]
  58. Huang, W.; Wang, H.; Qin, H.; Wei, Y.; Chevallier, J. Convolutional neural network forecasting of European Union allowances futures using a novel unconstrained transformation method. Energy Econ. 2022, 110, 106049. [Google Scholar] [CrossRef]
  59. Huang, W.; Wang, H.; Wei, Y. Identifying the determinants of European carbon allowances prices: A novel robust partial least squares method for open-high-low-close data. Int. Rev. Financ. Anal. 2023, 90, 102938. [Google Scholar] [CrossRef]
  60. Huang, W.; Zhao, J.; Wang, X. Model-driven multimodal LSTM-CNN for unbiased structural forecasting of European Union allowances open-high-low-close price. Energy Econ. 2024, 132, 107459. [Google Scholar] [CrossRef]
  61. Huang, Y.; He, Z. Carbon price forecasting with optimization prediction method based on unstructured combination. Sci. Total Environ. 2020, 725, 138350. [Google Scholar] [CrossRef] [PubMed]
  62. Ye, J.; Xue, M. Influences of sentiment from news articles on EU carbon prices. Energy Econ. 2021, 101, 105393. [Google Scholar] [CrossRef]
  63. Hartvig, Á.D.; Pap, Á.; Pálos, P. EU Climate Change News Index: Forecasting EU ETS prices with online news. Financ. Res. Lett. 2023, 54, 103720. [Google Scholar] [CrossRef]
  64. Isah, K.O.; Adelakun, J.O.; Udeaja, E.A. Experimenting with the Forecasting Power of Speculation in the Predictability of Carbon Prices. Emerg. Mark. Financ. Trade 2024, 60, 2691–2702. [Google Scholar] [CrossRef]
  65. Zhang, F.; Xia, Y. Carbon price prediction models based on online news information analytics. Financ. Res. Lett. 2022, 46, 102809. [Google Scholar] [CrossRef]
  66. Gong, X.; Li, M.; Guan, K.; Sun, C. Climate change attention and carbon futures return prediction. J. Futur. Mark. 2023, 43, 1261–1288. [Google Scholar] [CrossRef]
  67. Zhang, X.; Zong, Y.; Du, P.; Wang, S.; Wang, J. Framework for multivariate carbon price forecasting: A novel hybrid model. J. Environ. Manag. 2024, 369, 122275. [Google Scholar] [CrossRef] [PubMed]
  68. Zhang, Y.; Yang, Q. An overview of multi-task learning. Natl. Sci. Rev. 2018, 5, 30–43. [Google Scholar] [CrossRef]
  69. Zhang, Y.; Yang, Q. A survey on multi-task learning. IEEE Trans. Knowl. Data Eng. 2021, 34, 5586–5609. [Google Scholar] [CrossRef]
  70. Vandenhende, S.; Georgoulis, S.; Van Gansbeke, W.; Proesmans, M.; Dai, D.; Van Gool, L. Multi-task learning for dense prediction tasks: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3614–3633. [Google Scholar] [CrossRef] [PubMed]
  71. Zhang, S.; Chen, R.; Cao, J.; Tan, J. A CNN and LSTM-based multi-task learning architecture for short and medium-term electricity load forecasting. Electr. Power Syst. Res. 2023, 222, 109507. [Google Scholar] [CrossRef]
  72. Tan, M.; Liao, C.; Chen, J.; Cao, Y.; Wang, R.; Su, Y. A multi-task learning method for multi-energy load forecasting based on synthesis correlation analysis and load participation factor. Appl. Energy 2023, 343, 121177. [Google Scholar] [CrossRef]
  73. Li, K.; Mu, Y.; Yang, F.; Wang, H.; Yan, Y.; Zhang, C. Joint forecasting of source-load-price for integrated energy system based on multi-task learning and hybrid attention mechanism. Appl. Energy 2024, 360, 122821. [Google Scholar] [CrossRef]
  74. Yuan, C.; Ma, X.; Wang, H.; Zhang, C.; Li, X. COVID19-MLSF: A multi-task learning-based stock market forecasting framework during the COVID-19 pandemic. Expert Syst. Appl. 2023, 217, 119549. [Google Scholar] [CrossRef]
  75. Ma, Y.; Mao, R.; Lin, Q.; Wu, P.; Cambria, E. Quantitative stock portfolio optimization by multi-task learning risk and return. Inf. Fusion 2024, 104, 102165. [Google Scholar] [CrossRef]
  76. Lim, B.; Arık, S.Ö.; Loeff, N.; Pfister, T. Temporal fusion transformers for interpretable multi-horizon time series forecasting. Int. J. Forecast. 2021, 37, 1748–1764. [Google Scholar] [CrossRef]
  77. Salvagnin, C.; Glielmo, A.; De Giuli, M.E.; Mira, A. Investigating the price determinants of the European Emission Trading System: A non-parametric approach. Quant. Financ. 2024, 24, 1529–1544. [Google Scholar] [CrossRef]
  78. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30. [Google Scholar]
  79. Lea, C.; Flynn, M.D.; Vidal, R.; Reiter, A.; Hager, G.D. Temporal Convolutional Networks for Action Segmentation and Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1003–1012. [Google Scholar] [CrossRef]
  80. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar] [CrossRef]
  81. Salinas, D.; Flunkert, V.; Gasthaus, J.; Januschowski, T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 2020, 36, 1181–1191. [Google Scholar] [CrossRef]
  82. Taud, H.; Mas, J.F. Multilayer perceptron (MLP). In Geomatic Approaches for Modeling Land Change Scenarios; Springer: Berlin/Heidelberg, Germany, 2018; pp. 451–455. [Google Scholar] [CrossRef]
  83. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar] [CrossRef]
  84. Yang, D.; Guo, J.E.; Sun, S.; Han, J.; Wang, S. An interval decomposition-ensemble approach with data-characteristic-driven reconstruction for short-term load forecasting. Appl. Energy 2022, 306, 117992. [Google Scholar] [CrossRef]
  85. Maia, A.L.S.; de Carvalho, F.d.A. Holt’s exponential smoothing and neural network models for forecasting interval-valued time series. Int. J. Forecast. 2011, 27, 740–759. [Google Scholar] [CrossRef]
  86. Zhang, D.; Li, Q.; Mugera, A.W.; Ling, L. A hybrid model considering cointegration for interval-valued pork price forecasting in China. J. Forecast. 2020, 39, 1324–1341. [Google Scholar] [CrossRef]
  87. Froelich, W.; Salmeron, J.L. Evolutionary learning of fuzzy grey cognitive maps for the forecasting of multivariate, interval-valued time series. Int. J. Approx. Reason. 2014, 55, 1319–1335. [Google Scholar] [CrossRef]
  88. Hsu, H.L.; Wu, B. Evaluating forecasting performance for interval data. Comput. Math. Appl. 2008, 56, 2155–2163. [Google Scholar] [CrossRef]
  89. Yang, C.; Zhang, H.; Weng, F. Effects of COVID-19 vaccination programs on EU carbon price forecasts: Evidence from explainable machine learning. Int. Rev. Financ. Anal. 2024, 91, 102953. [Google Scholar] [CrossRef]
  90. Hansen, P.R. A test for superior predictive ability. J. Bus. Econ. Stat. 2005, 23, 365–380. [Google Scholar] [CrossRef]
Figure 1. The overall framework of the proposed multi-task learning model for forecasting interval-valued EUA futures prices.
Figure 2. The training process of single-task learning and multi-task learning.
Figure 3. The basic trend of the interval-valued EUA futures prices.
Figure 4. The Pearson correlation heatmap of the selected market variables.
Figure 5. The daily fluctuations in positive or negative sentiment intensity and search index over time for Carbon Pulse online news and Google Trends data.
Figure 6. The 3D visual comparison of the forecasting models across the three forecasting horizons. * denotes models trained with the multi-task learning framework; the others are single-task models.
Figure 7. The relative percentage improvement of the proposed framework over the benchmark models for 1 day-ahead forecasting. * denotes models trained with the multi-task learning framework; the others are single-task models.
Figure 8. The attention weight patterns for the time index in the one-step predictions within the test dataset when leveraging the multi-task learning framework.
Figure 9. The importance heatmap for all input variables used within the multi-task learning framework.
Table 1. The statistical description of the return rates of interval-valued EUA futures prices.
Return | Mean | Standard Deviation | Minimum | Maximum | Kurtosis | Skewness | Jarque-Bera | Q(5) | ADF
High price | 0.0534 | 2.1742 | −17.3806 | 8.9171 | 5.7379 | −0.5276 | 1144.4939 | 54.7823 | −11.0709
Low price | 0.0537 | 2.5617 | −19.3871 | 17.3730 | 11.5170 | −0.8204 | 4550.5868 | 26.2153 | −15.1196
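The descriptive statistics in Table 1 can be reproduced from the high and low price series with standard statistical libraries. The sketch below assumes the return rates are daily percentage log returns and uses scipy and statsmodels routines for the Jarque-Bera, Ljung-Box Q(5), and ADF statistics; the exact return definition and test settings used in the paper may differ.

```python
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.stattools import adfuller

def describe_returns(prices: pd.Series, label: str) -> pd.Series:
    """Descriptive statistics for one bound (high or low) of the interval-valued series."""
    r = 100 * np.log(prices).diff().dropna()               # assumed: daily percentage log returns
    jb_stat, _ = stats.jarque_bera(r)                      # Jarque-Bera normality statistic
    q5 = acorr_ljungbox(r, lags=[5])["lb_stat"].iloc[0]    # Ljung-Box Q(5) statistic
    adf_stat = adfuller(r)[0]                              # augmented Dickey-Fuller statistic
    return pd.Series({
        "Mean": r.mean(),
        "Standard Deviation": r.std(),
        "Minimum": r.min(),
        "Maximum": r.max(),
        "Kurtosis": stats.kurtosis(r),                     # excess kurtosis
        "Skewness": stats.skew(r),
        "Jarque-Bera": jb_stat,
        "Q(5)": q5,
        "ADF": adf_stat,
    }, name=label)

# eua is assumed to be a DataFrame of daily EUA futures prices with "high" and "low" columns.
# table1 = pd.DataFrame([describe_returns(eua["high"], "High price"),
#                        describe_returns(eua["low"], "Low price")])
```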
Table 2. The parameters used for training the forecasting models.
Model | Parameters and values
TFT | Encoder length: 5; Decoder length: [1, 3, 5]; Dropout rate: 0.1; Gradient clip: 0.1; Hidden size: 115; Attention head size: 2; Hidden continuous size: 44; Learning rate: 0.07; Batch size: 64
DeepAR | RNN layers: 2; Hidden size: 128; Dropout rate: 0.1; Learning rate: 0.05
DecoderMLP | Hidden size: 256; Number of hidden layers: 2; Activation function: ReLU; Dropout rate: 0.1; Learning rate: 0.03
LSTM | Hidden size: 32; Dropout rate: 0.23; Learning rate: 0.002; Batch size: 64
GRU | Hidden size: 64; Dropout rate: 0.14; Learning rate: 0.004; Batch size: 64
Transformer | Hidden size: 128; Attention heads: 8; Feedforward size: 512; Dropout rate: 0.1; Learning rate: 0.001; Batch size: 64
TCN | Hidden size: 128; Kernel size: 3; Dilations: 2; Dropout rate: 0.05; Learning rate: 0.001; Batch size: 64
Note: To ensure a fair comparison between multi-task and single-task learning, the single-task models are trained with the same hyper-parameters as those used for multi-task co-training.
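A minimal sketch of how this parity could be enforced in practice: the hyper-parameters from Table 2 are stored once and passed unchanged to both the multi-task and the single-task training runs. The TFT_PARAMS dictionary simply mirrors the values in the table; build_model, fit, and the three dataset objects are hypothetical placeholders, not the paper's actual code.

```python
# Hyper-parameters of the TFT models, copied from Table 2.
TFT_PARAMS = dict(
    encoder_length=5, decoder_length=[1, 3, 5], dropout_rate=0.1, gradient_clip=0.1,
    hidden_size=115, attention_head_size=2, hidden_continuous_size=44,
    learning_rate=0.07, batch_size=64,
)

def train_pair(build_model, params, joint_data, low_data, high_data):
    """Train one multi-task model (low and high bounds jointly) and two single-task
    models (one per bound) with identical hyper-parameters."""
    multi_task = build_model(**params)
    multi_task.fit(joint_data)                 # joint forecasting of both interval bounds
    single_low, single_high = build_model(**params), build_model(**params)
    single_low.fit(low_data)                   # low bound only
    single_high.fit(high_data)                 # high bound only
    return multi_task, single_low, single_high
```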
Table 3. Performance evaluation of the proposed model.
Models | U_I | ARV_I | MSE_I | MDE1 | MDE2 | Dstat_I
1 day-ahead
TFT* | 0.6489 | 0.7615 | 3.1159 | 1.3254 | 1.8862 | 0.5658
Transformer* | 0.6639 | 0.8029 | 3.2775 | 1.3693 | 1.8890 | 0.5455
TCN* | 0.6641 | 0.8034 | 3.2827 | 1.3618 | 1.8923 | 0.5325
TFT | 0.6602 | 0.7881 | 3.2401 | 1.3851 | 1.9244 | 0.5658
Transformer | 0.6711 | 0.8204 | 3.3909 | 1.3903 | 1.9063 | 0.5455
TCN | 0.6765 | 0.8336 | 3.4108 | 1.4197 | 1.9841 | 0.5325
LSTM | 0.6749 | 0.8235 | 3.4155 | 1.4541 | 2.0353 | 0.5658
GRU | 0.6694 | 0.8102 | 3.3658 | 1.4245 | 2.0003 | 0.5132
DeepAR | 0.7588 | 1.0412 | 4.3124 | 1.6379 | 2.1534 | 0.5000
DecoderMLP | 0.7728 | 1.0800 | 4.4946 | 1.6368 | 2.2629 | 0.2632
3 days-ahead
TFT* | 0.7020 | 0.8912 | 3.6412 | 1.4224 | 2.0315 | 0.4868
Transformer* | 0.7338 | 0.9800 | 4.0528 | 1.5658 | 2.1632 | 0.4533
TCN* | 0.7518 | 1.0286 | 4.2800 | 1.5996 | 2.1437 | 0.4844
TFT | 0.7359 | 0.9794 | 4.0704 | 1.5642 | 2.1710 | 0.4054
Transformer | 0.7558 | 1.0395 | 4.3114 | 1.6442 | 2.2399 | 0.4222
TCN | 0.7679 | 1.0729 | 4.2587 | 1.6387 | 2.2170 | 0.4133
LSTM | 0.7566 | 1.0321 | 4.3219 | 1.6105 | 2.2375 | 0.4054
GRU | 0.7664 | 1.0589 | 4.4356 | 1.6320 | 2.2601 | 0.3649
DeepAR | 0.7848 | 1.1106 | 4.6599 | 1.7098 | 2.3858 | 0.3919
DecoderMLP | 0.7690 | 1.0663 | 4.4658 | 1.6151 | 2.2530 | 0.1351
5 days-ahead
TFT* | 0.7262 | 0.9508 | 4.0024 | 1.5147 | 2.0625 | 0.4342
Transformer* | 0.7393 | 1.0098 | 4.1025 | 1.6114 | 2.1144 | 0.4342
TCN* | 0.7492 | 1.0295 | 4.3039 | 1.5998 | 2.1921 | 0.4208
TFT | 0.7464 | 1.0233 | 4.3195 | 1.6029 | 2.1222 | 0.4028
Transformer | 0.7408 | 1.0339 | 4.3168 | 1.6037 | 2.2273 | 0.4164
TCN | 0.7431 | 1.0402 | 4.3398 | 1.6049 | 2.2327 | 0.4137
LSTM | 0.7485 | 1.0289 | 4.3613 | 1.6123 | 2.2372 | 0.4028
GRU | 0.7453 | 1.0203 | 4.3371 | 1.6100 | 2.2144 | 0.3750
DeepAR | 0.7833 | 1.1268 | 4.7904 | 1.7127 | 2.4282 | 0.4028
DecoderMLP | 0.7670 | 1.0804 | 4.5745 | 1.6516 | 2.2933 | 0.2639
Note: * denotes models trained with the multi-task learning framework; the others are single-task models. Bold font indicates the evaluation results of the proposed multi-task learning framework.
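For reference, two of the simpler interval accuracy measures can be stated compactly. The sketch below uses one plausible formulation, an interval MSE that averages the squared errors of the lower and upper bounds and a directional statistic based on the interval midpoint; the paper's own definitions of U_I, ARV_I, MDE1, MDE2, and Dstat_I are given in its methodology section and may differ in detail.

```python
import numpy as np

def interval_mse(y_low, y_high, f_low, f_high):
    """Interval MSE: average of the squared errors of the lower and upper bounds."""
    y_low, y_high = np.asarray(y_low), np.asarray(y_high)
    f_low, f_high = np.asarray(f_low), np.asarray(f_high)
    return 0.5 * np.mean((y_low - f_low) ** 2 + (y_high - f_high) ** 2)

def interval_dstat(y_low, y_high, f_low, f_high):
    """Directional statistic: share of steps on which the forecast midpoint moves in the
    same direction as the realised midpoint (one plausible definition)."""
    y_mid = (np.asarray(y_low) + np.asarray(y_high)) / 2
    f_mid = (np.asarray(f_low) + np.asarray(f_high)) / 2
    realised_change = np.diff(y_mid)              # actual change from t-1 to t
    predicted_change = f_mid[1:] - y_mid[:-1]     # forecast change from the last observation
    return float(np.mean(np.sign(realised_change) == np.sign(predicted_change)))
```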
Table 4. Ablation experiment verifying the impact of diverse data streams from multiple sources on the model's prediction accuracy.
Models | U_I | ARV_I | MSE_I | MDE1 | MDE2 | Dstat_I
TFT | 0.6602 (−11.47%) | 0.7881 (−24.27%) | 3.2401 (−25.63%) | 1.3851 (−12.93%) | 1.9244 (−12.81%) | 0.5658 (+28.81%)
TFT (Category 1) | 0.7359 | 0.9794 | 4.0704 | 1.5642 | 2.1710 | 0.4028
TFT* | 0.6489 (−8.18%) | 0.7615 (−17.03%) | 3.1159 (−16.86%) | 1.3254 (−7.32%) | 1.8862 (−7.70%) | 0.5658 (+23.26%)
TFT* (Category 4) | 0.6620 (−6.04%) | 0.7924 (−12.47%) | 3.3166 (−9.79%) | 1.4042 (−1.30%) | 1.9705 (−3.10%) | 0.5526 (+21.43%)
TFT* (Category 3) | 0.6681 (−5.07%) | 0.8072 (−10.41%) | 3.3666 (−8.16%) | 1.3909 (−2.26%) | 1.9461 (−4.39%) | 0.4737 (+8.34%)
TFT* (Category 2) | 0.6752 (−3.97%) | 0.8244 (−8.10%) | 3.3736 (−7.93%) | 1.3864 (−2.60%) | 1.9606 (−3.62%) | 0.4737 (+8.34%)
TFT* (Category 1) | 0.7020 | 0.8912 | 3.6412 | 1.4224 | 2.0315 | 0.4342
Note: * denotes models trained with the multi-task learning framework; the others are single-task models.
Table 5. The performance evaluation of Dstat_I for the models under different market conditions.
Model | High EEPU | High GEPU | High Volatility | High Liquidity | High Energy Price | Low EEPU | Low GEPU | Low Volatility | Low Liquidity | Low Energy Price
TFT* | 0.6098 | 0.5676 | 0.5000 | 0.5625 | 0.5625 | 0.5429 | 0.5641 | 0.6250 | 0.4348 | 0.6136
Transformer* | 0.4390 | 0.4865 | 0.3889 | 0.5472 | 0.4688 | 0.6000 | 0.5385 | 0.6250 | 0.4348 | 0.5455
TCN* | 0.4634 | 0.4865 | 0.3889 | 0.6038 | 0.4688 | 0.6286 | 0.5897 | 0.6750 | 0.3913 | 0.5909
TFT | 0.5854 | 0.5135 | 0.5000 | 0.4688 | 0.5313 | 0.4857 | 0.5897 | 0.6000 | 0.6087 | 0.5682
Transformer | 0.4634 | 0.4865 | 0.4444 | 0.5472 | 0.5000 | 0.6000 | 0.5641 | 0.6000 | 0.4783 | 0.5455
TCN | 0.4634 | 0.5135 | 0.3889 | 0.5472 | 0.5000 | 0.5429 | 0.4872 | 0.6000 | 0.3913 | 0.5000
LSTM | 0.5854 | 0.5135 | 0.3611 | 0.5313 | 0.5313 | 0.5429 | 0.6154 | 0.7500 | 0.4348 | 0.5909
GRU | 0.4878 | 0.4865 | 0.3333 | 0.5094 | 0.4375 | 0.5429 | 0.5385 | 0.6750 | 0.5217 | 0.5682
DeepAR | 0.4390 | 0.4865 | 0.4444 | 0.4906 | 0.4063 | 0.5714 | 0.5128 | 0.5500 | 0.5217 | 0.5682
MLP | 0.2439 | 0.3784 | 0.2222 | 0.2453 | 0.3438 | 0.2857 | 0.1538 | 0.3000 | 0.3043 | 0.2045
Note: * denotes models trained with the multi-task learning framework; the others are single-task models.
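The regime-conditioned evaluation in Table 5 can be illustrated with a median split: each conditioning series (EEPU, GEPU, realised volatility, liquidity, or an energy price index) divides the test sample into "high" and "low" sub-periods, over which the directional statistic is averaged separately. The sketch below follows that assumed split rule; the paper's exact regime definitions may differ.

```python
import numpy as np
import pandas as pd

def direction_hits(y_mid: pd.Series, f_mid: pd.Series) -> pd.Series:
    """Per-day indicator that the forecast change and the realised change share a sign."""
    realised_change = y_mid.diff()
    predicted_change = f_mid - y_mid.shift(1)
    return (np.sign(realised_change) == np.sign(predicted_change)).iloc[1:]

def regime_dstat(y_mid: pd.Series, f_mid: pd.Series, condition: pd.Series, name: str) -> pd.Series:
    """Directional accuracy on 'high' vs. 'low' regimes, where regimes come from a median
    split of the conditioning series over the test sample."""
    hits = direction_hits(y_mid, f_mid)
    high = condition.reindex(hits.index) > condition.median()
    return pd.Series({f"High {name}": hits[high].mean(),
                      f"Low {name}": hits[~high].mean()})
```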
Table 6. SPA test results of different models for high and low price predictions.
Panel A. High price
Models | TFT* | Transformer* | TCN* | TFT | Transformer | TCN | LSTM | GRU | DeepAR
Transformer* | 0.481
TCN* | 0.357 | 0.233
TFT | 0.110 | 0.148 | 0.351
Transformer | 0.049 | 0.028 | 0.084 | 0.168
TCN | 0.036 | 0.217 | 0.517 | 0.447 | 0.525
LSTM | 0.032 | 0.005 | 0.133 | 0.048 | 0.526 | 0.126
GRU | 0.048 | 0.558 | 0.537 | 0.548 | 0.549 | 0.539 | 0.541
DeepAR | 0.009 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.001 | 0.003
DecoderMLP | 0.001 | 0.000 | 0.000 | 0.000 | 0.004 | 0.000 | 0.000 | 0.000 | 0.517
Panel B. Low price
Models | TFT* | Transformer* | TCN* | TFT | Transformer | TCN | LSTM | GRU | DeepAR
Transformer* | 0.117
TCN* | 0.161 | 0.470
TFT | 0.192 | 0.524 | 0.514
Transformer | 0.039 | 0.529 | 0.443 | 0.325
TCN | 0.038 | 0.432 | 0.197 | 0.282 | 0.338
LSTM | 0.049 | 0.543 | 0.510 | 0.414 | 0.491 | 0.536
GRU | 0.038 | 0.444 | 0.251 | 0.032 | 0.287 | 0.490 | 0.100
DeepAR | 0.021 | 0.106 | 0.043 | 0.034 | 0.050 | 0.068 | 0.018 | 0.050
DecoderMLP | 0.000 | 0.288 | 0.142 | 0.043 | 0.118 | 0.252 | 0.001 | 0.174 | 0.520
Note: * denotes models trained with the multi-task learning framework; the others are single-task models.
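The SPA test of Hansen (2005) [90] compares a benchmark model against a set of competitors using bootstrapped loss differentials. A minimal sketch using the SPA class from the arch package is given below; the loss series are assumed to be daily squared interval forecast errors, and this is not necessarily the implementation used in the paper.

```python
import pandas as pd
from arch.bootstrap import SPA

def spa_pvalue(benchmark_losses: pd.Series, competitor_losses: pd.DataFrame,
               reps: int = 1000) -> float:
    """Consistent SPA p-value for the null that no competitor outperforms the benchmark."""
    spa = SPA(benchmark_losses, competitor_losses, reps=reps)
    spa.compute()
    return float(spa.pvalues["consistent"])

# Example usage, assuming `losses` is a DataFrame with one column of daily losses per model:
# p = spa_pvalue(losses["TFT*"], losses.drop(columns=["TFT*"]))
```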
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
