1. Introduction
In the dynamic arena of financial markets, futures arbitrage continues to be a pivotal subject for scrutiny among investors and scholars. This strategy exploits profit opportunities discerned through meticulous analysis and the tactical exchange of futures contracts, capitalizing on the pricing variances that arise across diverse markets over time [1]. Successful arbitrage typically demands advanced forecasts of market trajectories and asset price volatility. Price, a paramount factor in the ebbs and flows of trading activity, carries significant implications for the expansion of the market and the welfare of its participants. As a result, predicting price trends is an essential component of research in financial investment and a foundational element in devising strong arbitrage tactics.
Transitioning from the fundamental theory of futures arbitrage, early attempts to predict commodity futures prices were grounded in standard econometric methods. Historically, these methods were defined by the groundbreaking work of academics, such as the ARCH model introduced by Robert Engle in 1982, which explained time-varying volatility for economic time series data [
2]. Following this innovation, a multitude of statistical models emerged, refining the approach to predict financial market behaviors. Notably, in 1986, Bollerslev presented the GARCH model, enhancing the ARCH concept and tailoring it more explicitly to financial datasets [
3]. These econometric models, the GARCH model in particular, were not only used alongside traditional regression techniques but also explicitly accounted for fluctuating error variance, facilitating a more nuanced understanding of market volatility, a key component in investor decision-making. A significant milestone was achieved with Morana’s study in 2001, which successfully applied GARCH-based methods for short-term crude oil price forecasts [
4]. Despite their empirical successes, challenges have persisted with these traditional models, primarily due to the intricacy and dynamic nature of financial data, which present nonlinear characteristics and time-sensitive elements that traditional mathematical models struggle to encapsulate fully.
The advent of machine learning techniques in the financial sector has ushered in a new era for algorithmic trading. Currently, researchers in related fields are primarily concentrating on two aspects: the extraction of Alpha factors and the optimization of synthetic models.
Factor mining based on artificial intelligence algorithms has evolved beyond the traditional approach of establishing clear investment logic relationships to mine and screen factors. Instead, it has become more adaptable, extracting and learning valuable information from relational events and sentiment data to enhance the efficiency of investment decisions. For instance, deep learning models can analyze unstructured data such as news articles and social media posts using natural language processing (NLP) techniques, allowing them to capture market sentiment and potential influencing factors and to enhance the precision of spread predictions. Recent literature corroborates that deep learning models excel at distilling crucial features from complex data and adeptly applying these insights to new contexts [
5].
An accumulation of research underscores the efficacy of machine learning techniques in anticipating futures prices. Early initiatives, such as Grudnitski’s 1993 study [
6], employed neural network (NN) algorithms for predicting gold futures prices, showing a marked improvement over traditional time series approaches. The competitive edge of NNs was further highlighted by Moshiri and Foroutan’s 2006 comparison [
7], which found NNs to outperform ARMA and GARCH models in crude oil price forecasting. Extending this success, Diana Osorio et al. (2017) [
8] applied neural networks to project S&P and gold futures prices, yielding promising outcomes. Hailei Zhao’s research in 2021 [
9] implemented machine learning techniques grounded on the fundamental factors of agricultural product futures, enhancing forecast precision and providing key contributions to the field. Tsantekidis et al. (2017) [
10] chose convolutional neural networks for anticipating stock prices and discovered they surpassed traditional multilayer perceptrons and support vector machines in their predictive prowess. Dixon et al. (2016) [
11] implemented a deep neural network (DNN) for various commodity futures prices and reported notably accurate results. In the vein of leveraging sophisticated AI, Long et al. (2018) [
12] incorporated LSTM, BPNN, and CNN in creating arbitrage models, with experiments on coking coal, iron ore, and rebar futures demonstrating LSTM’s superior performance. Sheng Y and Ma D’s comparison in 2022 [
13] among LASSO, Xgboost, BPNN, and LSTM for arbitrage effects underlined the outstanding forecasting abilities of deep learning models. Zhou et al. (2021) [
14] proposed the Informer model based on the Transformer model and demonstrated the effectiveness of the model in enhancing the prediction capacity of long sequence time-series forecasting (LSTF) through experiments. Liu et al. (2022) [
15] designed a SCINet model based on TCN, which achieved higher accuracy on public datasets across multiple fields. Moreover, neural network-based generative techniques have also begun to be applied in the field of financial time series. Zhang et al. [
16] proposed a generative adversarial network (GAN) structure based on LSTM and achieved promising performance in closing price prediction on real data. It is becoming increasingly clear that AI, especially through deep learning and machine learning, offers tremendous promise for futures market prediction and navigation. These advanced methodologies adeptly tame the nonlinearity of financial markets, elevate forecasting accuracy, and furnish investors with sophisticated tools for crafting strategic decisions.
Nonetheless, current predictive endeavors within financial studies commonly focus on the immediate future, aiming to determine the upcoming value or direction for a single metric, essentially for univariate time series projection. Researchers often design models that are calibrated solely to forecast a single point or a unidimensional trend in subsequent time steps. This traditional approach, however, does not suffice for arbitrage strategy construction, where a broader temporal and dimensional perspective is essential for evaluating multiple indicators over an upcoming time span. To navigate these complexities, this study introduces a convolutional long short-term memory (ConvLSTM) model geared toward predicting futures price spreads. This advanced model boasts capabilities for multistep and multidimensional forecasting, which aligns more closely with the nuanced demands of practical trading scenarios.
However, network architectures like ConvLSTM pose challenges due to their complex decision-making process and limited transparency in how predictions are influenced [
17], which can affect confidence and trust in their results. Additionally, choosing the right hyperparameters often depends on previous research or the subjective judgment of practitioners. Selecting the appropriate hyperparameters is essential for enhancing the network’s structure, which improves its ability to generalize and fit data accurately. A significant scholarly effort is directed toward developing systematic methods to reduce subjective bias and identify the best set of hyperparameters.
This study presents an innovative forecasting approach tailored for the futures arbitrage domain, utilizing a PSO Deep-ConvLSTM model—a sophisticated, multidimensional, multistep predictor that seamlessly integrates PSO with the ConvLSTM network. The primary aim is to enhance the accuracy of futures price spread forecasts, thereby amplifying the potential for profitable arbitrage strategies. By harnessing real historical futures price spread data, the study endeavors to ascertain the efficacy of the PSO Deep-ConvLSTM through a comprehensive comparative analysis against alternative forecasting models.
In addressing the inherent challenges associated with the opaque decision-making of traditional ConvLSTM models, this research pioneers the utilization of the PSO algorithm for optimizing hyperparameters, with the overarching goal of enhancing the model’s transparency and augmenting its performance. This systematic tuning process is designed to mitigate reliance on subjective expertise and historical precedents, thus transitioning toward an objective and replicable methodology capable of reliably guiding the network’s learning process.
Coupling the ConvLSTM’s aptitude for capturing complex data patterns with PSO’s strengths in parameter optimization, the proposed PSO Deep-ConvLSTM model promises to be a powerful tool in futures arbitrage. By advancing this synergy of methodologies, the framework aims to provide actionable insights, decision support, and investment strategies for market participants. Findings from this research are expected to offer a strong endorsement for the practical application of the PSO Deep-ConvLSTM model in arbitrage decision-making and risk management, thereby illuminating a pathway toward more sophisticated and precise financial market analyses.
In short, we summarize the key contributions of this work as follows:
We employ the ConvLSTM model to forecast multiple pertinent indicators of the futures price spread over future time steps.
By introducing the PSO algorithm, we optimize the ConvLSTM network. This approach rectifies the shortfall related to the inaccurate acquisition of initial connection weights and hyperparameters intrinsic to the ConvLSTM model. Consequently, it fortifies the objectivity of hyperparameter selection and enables a more accurate prediction of futures price spread data.
To evaluate the predictive performance of our PSO Deep-ConvLSTM model, we conducted comparative experiments with existing models, such as the FEDformer. Our research findings demonstrate that our model exhibits marginally superior accuracy compared with state-of-the-art approaches.
The remainder of this paper is structured to facilitate clear comprehension and logical flow.
Section 2 delineates the problem and engages in an in-depth analysis of the dataset. It aims to articulate the objective function configured for our particular study needs and to confirm the empirical dataset’s reliability and accessibility. In
Section 3, we expound on both the ConvLSTM and the innovative PSO Deep-ConvLSTM frameworks.
Section 4 is devoted to a detailed experimental evaluation and analytical comparison between the PSO Deep-ConvLSTM and established benchmark models, focusing specifically on their applicability to forecasting inter-commodity spread.
Section 5 concludes the paper and discusses future perspectives.
2. Problem Statement and Data Analysis
In this section, we initially offer a concise overview of the problem that our research seeks to address. Furthermore, we provide a clear and succinct description of the data through correlation analysis and Engle–Granger (EG) cointegration tests, thereby verifying its validity and applicability.
2.1. Problem Statement
For predicting futures markets, engaging in multistep forecasting across diverse dimensions presents a more significant practical impact than single-step prediction or forecasting within isolated dimensions. This multifaceted approach garners the interest of both financial practitioners and researchers. The primary objective of multistep forecasting is to analyze historical data and project values for forthcoming time periods. In contrast to single-step forecasting, multistep forecasting grapples with heightened uncertainty, which may precipitate a decline in the predictive model’s effectiveness due to cumulative errors during the modeling process. In response to this challenge, we propose the PSO Deep-ConvLSTM model as a viable solution. To demonstrate and validate our approach, we selected futures contracts for rebar (RB) and hot-rolled coil (HC) listed on the Shanghai Futures Exchange to formulate a pair trading strategy. We then trained and backtested the predictive model using the fitted spread data derived from this arbitrage investment portfolio.
More specifically, our approach entails the use of actual price spread data via the ConvLSTM model to forecast and generate an array of price spread variations, such as the closing, opening, lowest, and highest price spreads, over an ensuing period. The end goal of these forecasts is to inform and guide us in formulating an appropriate futures arbitrage strategy. Therefore, with the intention of elevating the returns of our pair trading strategy, our foremost objective is to enhance the prediction accuracy of the ConvLSTM model. This objective is primarily achieved through the optimization of the model’s hyperparameters. Guided by our enhanced heuristic algorithm, the hyperparameter search within the ConvLSTM model is treated as a black-box optimization task. The corresponding objective is defined as follows:
$$\theta^{*} = \arg\min_{\theta} \sum_{t \in \mathcal{T}} \left\| f_{\theta}\!\left(X_{t}\right) - Y_{t} \right\|_{F}$$

where $\theta$ denotes the hyperparameter set of our model, $\mathcal{T}$ is the set of time stamps used for testing, $\left\| \cdot \right\|_{F}$ is the Frobenius norm, and $f_{\theta}$ is the predictive model, with $X_{t}$ the historical input window and $Y_{t}$ the matrix of actual multistep spread values at time stamp $t$.
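For concreteness, the following minimal sketch shows how this objective could be evaluated in Python; the `model.predict` interface and the structure of `test_windows` are illustrative assumptions rather than the study’s actual implementation.

```python
import numpy as np

def spread_forecast_objective(model, test_windows):
    """Sum of Frobenius norms of the prediction error over all test timestamps.

    `test_windows` is assumed to be an iterable of (x_hist, y_true) pairs, where
    y_true has shape (forecast_steps, n_features); `model.predict` is a
    hypothetical method returning an array of the same shape.
    """
    total = 0.0
    for x_hist, y_true in test_windows:
        y_pred = model.predict(x_hist)                      # multistep, multidimensional forecast
        total += np.linalg.norm(y_pred - y_true, ord="fro")
    return total
```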
2.2. Data Structure
The Shanghai Futures Exchange provides a snapshot-based order feed implemented via the CTP (Comprehensive Transaction Platform). The feed accumulates changes occurring within the preceding 500 milliseconds, encompassing multiple fields that encapsulate trade and order-book information. Specifically for our analysis, we capitalized on the price field to compute the spread between rebar and hot-rolled coil. The dataset utilized in this study was sourced from Choice Financial Software. Given the inherently noisy character of financial data, we transformed the 500-millisecond spread data, ranging from 21:01 on 15 July 2020, to 10:50 on 18 May 2022, into 1 min K-line data. This transformation effectively mitigated the impact of noise, thereby enhancing our model’s capacity to capture the temporal dependencies inherent in the data. Additionally, among the extensive array of contracts listed annually, we narrowed our focus to the historical data derived from the January, May, and October contracts.
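As an illustration of this aggregation step, the sketch below converts a 500 ms spread series into 1 min OHLC bars with pandas; the synthetic series and the handling of empty minutes are assumptions, not the exact pipeline used in the study.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 500 ms spread snapshots (RB price minus HC price);
# in the actual study these come from the CTP feed via Choice Financial Software.
idx = pd.date_range("2020-07-15 21:01:00", periods=10_000, freq="500ms")
spread = pd.Series(100 + np.random.randn(len(idx)).cumsum(), index=idx)

bars = spread.resample("1min").ohlc().dropna()   # aggregate to 1 min K-lines
bars.columns = ["OPEN", "HIGH", "LOW", "CLOSE"]
print(bars.head())
```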
Subsequently, all adjusted data points were integrated, taking into account the trading volume. This process culminated in a comprehensive dataset consisting of 153,440 historical trading data points, spanning a period of 447 days. Each data point comprises the following features (an illustrative computation of the derived indicators is sketched after the list):
OPEN/HIGH/LOW/CLOSE: the first/highest/lowest/last value in 1-min spread data.
Exponential moving average (EMA): $\mathrm{EMA}_{t}(N) = \frac{2}{N+1}P_{t} + \frac{N-1}{N+1}\mathrm{EMA}_{t-1}(N)$, where $P_{t}$ is the closing price spread and $N$ is the averaging period.
Difference (DIF): $\mathrm{DIF}_{t} = \mathrm{EMA}_{t}(N_{\mathrm{short}}) - \mathrm{EMA}_{t}(N_{\mathrm{long}})$, the gap between a short-period and a long-period EMA.
Differential exponential average (DEA): $\mathrm{DEA}_{t}$, an exponential moving average of the DIF series.
Moving average convergence and divergence (MACD): $\mathrm{MACD}_{t} = 2\left(\mathrm{DIF}_{t} - \mathrm{DEA}_{t}\right)$, following the convention commonly used for Chinese futures data.
The price spread fluctuation.
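The derived indicators above can be computed from the 1 min closing spread as in the following sketch; the smoothing spans of 12, 26, and 9 are the common MACD defaults and are assumed here, since the study does not restate its exact settings, and the column names follow the list above.

```python
import pandas as pd

def add_spread_indicators(bars: pd.DataFrame,
                          fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """Append EMA-based indicators and the spread fluctuation to the 1 min bars."""
    close = bars["CLOSE"]
    ema_fast = close.ewm(span=fast, adjust=False).mean()      # short-period EMA
    ema_slow = close.ewm(span=slow, adjust=False).mean()      # long-period EMA
    bars["DIF"] = ema_fast - ema_slow
    bars["DEA"] = bars["DIF"].ewm(span=signal, adjust=False).mean()
    bars["MACD"] = 2.0 * (bars["DIF"] - bars["DEA"])          # Chinese-market convention
    bars["FLUCTUATION"] = close.diff()                        # price spread fluctuation
    return bars
```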
2.3. Data Analysis
Commodity contracts selected for pair trading generally need to exhibit a long-term, stable cointegration relationship, as is the case for the combination of rebar and hot-rolled coil. This paper conducts a prediction study based on the real price spread data of this combination, and this section demonstrates the validity of the data used.
To ascertain the presence of a long-term stable cointegration relationship among the selected futures contracts, we utilized EViews10 software to conduct a cointegration analysis on the original price data.
A close examination of the contract time series plot shown in
Figure 1 reveals that the closing price data for both RB and HC display comparable fluctuation patterns. This initial observation indicates a potential correlation between the price data of these two commodity futures. A comprehensive quantitative analysis of this correlation is furnished in
Table 1. The correlation coefficients computed for the opening price, closing price, highest price, and lowest price collectively suggest a significant correlation between the two commodities.
While most variables within financial data are nonstationary, research into their correlations reveals intrinsic linkages and a stable equilibrium relationship over the long term. In this study, we performed stationarity tests on the selected RB and HC price series, employing the commonly used augmented Dickey–Fuller (ADF) method to test the stationarity of the time series. Unit root tests were conducted on the level (0th-order) and first-differenced (1st-order) series, with results presented in
Table 2. As the data in the table elucidate, for the price series of RB and HC, we could not reject the null hypothesis at the 5% significance level in the 0th-order unit root test; therefore, all variables were nonstationary in levels. However, after first-order differencing, the absolute values of the t-statistics of each variable were greater than any critical value, with accompanying probabilities of zero, indicating that each series was stationary after first differencing. Consequently, it can be inferred that the price series of the principal futures contracts of both RB and HC are integrated of order one.
Subsequently, we can initiate the Engle–Granger cointegration test, beginning with the formulation of the cointegration equation as follows:

$$P_{t}^{\mathrm{RB}} = c\,P_{t}^{\mathrm{HC}} + e_{t}$$

where $P_{t}^{\mathrm{RB}}$ and $P_{t}^{\mathrm{HC}}$ are the prices of the RB and HC contracts at time $t$, $e_{t}$ denotes the cointegration residual, or simply the residual, and the parameter $c$ represents the cointegration coefficient.
Table 3 presents the results of the Engle–Granger cointegration test. At the 1% significance level, the ADF test statistic of the residual series is smaller than the critical value. As a result, we reject the null hypothesis of a unit root and consider the residual series to be stationary. In accordance with Engle–Granger’s cointegration theory, we deduce that the price data for the main contracts of RB and HC exhibit a cointegration relationship. Consequently, these data are suitable for pair trading.
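The stationarity and cointegration checks reported in Tables 2 and 3 were run in EViews; an equivalent verification can be sketched in Python with statsmodels, as below, where the two closing-price series are synthetic stand-ins for the real RB and HC data described above.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller, coint

# Synthetic stand-ins for the RB and HC closing-price series; HC is constructed
# to be cointegrated with RB purely for illustration.
rng = np.random.default_rng(0)
rb_close = pd.Series(3500 + rng.normal(0, 5, 5000).cumsum())
hc_close = 0.95 * rb_close + 120 + pd.Series(rng.normal(0, 10, 5000))

for name, series in {"RB": rb_close, "HC": hc_close}.items():
    _, p_lvl, *_ = adfuller(series)                     # ADF test on the level series
    _, p_d1, *_ = adfuller(series.diff().dropna())      # ADF test on first differences
    print(f"{name}: level p-value={p_lvl:.4f}, first-difference p-value={p_d1:.4f}")

eg_stat, eg_pvalue, _ = coint(rb_close, hc_close)       # Engle-Granger cointegration test
print(f"Engle-Granger statistic={eg_stat:.4f}, p-value={eg_pvalue:.4f}")
```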
To ensure the effectiveness of our fitting process, we conducted a stationarity test on the fitted spread data.
Table 4 demonstrates that the fitted spread data are a stationary time series at the 5% significance level, indicating that our fitting process is robust and reliable. The time series plot of the fitted data for the closing price spreads is displayed in
Figure 2.
4. Experiment
4.1. Experimental Environment
The hardware and software configurations used for this experiment are shown in
Table 5. The network was built under the PyTorch deep learning framework, and training and testing of the network were conducted based on this framework.
4.2. The Processing of Data
To verify the performance of the proposed prediction model, this paper adopts the one-minute K-line fitted price spread data as the experimental data. Each term of the price spread data is composed of eight features, including the opening price spread, highest price spread, lowest price spread, closing price spread, MACD, DEA, DIF, and price spread fluctuation. In the experiment, we allocate 70% of the dataset as the training set for the comparative models, with the remaining 30% serving as the test set. For the PSO Deep-ConvLSTM model, the first 70% of the dataset is utilized for the optimization algorithm to search for optimal parameters and train the model, with the final 30% deployed as the test set to assess the model’s generalization error. The data are processed as follows: prior to feeding the feature data into the artificial neural network, the data undergo normalization, being effectively converted into the [0, 1] range. This approach not only minimizes the impact of noise, enhancing the predictive accuracy of the model, but also expedites model convergence, ensuring efficient parameter updates within the neural network. Equation (7) provides the formula for normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (7)$$

where $x_{\min}$ and $x_{\max}$, respectively, represent the minimum and maximum values of the entire training set.
As the data are subject to normalization during the model training phase, the output of the test set can be reverse normalized using Equation (8).
$$\hat{x} = x'\left(x_{\max} - x_{\min}\right) + x_{\min} \qquad (8)$$

where $x'$ is the output value of the forecasting model and $\hat{x}$ is the corresponding value on the original scale.
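A minimal sketch of Equations (7) and (8), assuming column-wise scaling with statistics taken from the training set only:

```python
import numpy as np

def fit_minmax(train: np.ndarray):
    """Column-wise minimum and maximum computed on the training set only."""
    return train.min(axis=0), train.max(axis=0)

def normalize(x: np.ndarray, x_min: np.ndarray, x_max: np.ndarray) -> np.ndarray:
    """Equation (7): map each feature into the [0, 1] range."""
    return (x - x_min) / (x_max - x_min)

def denormalize(y_scaled: np.ndarray, x_min: np.ndarray, x_max: np.ndarray) -> np.ndarray:
    """Equation (8): map model outputs back to the original scale."""
    return y_scaled * (x_max - x_min) + x_min
```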
4.3. Judgement Criteria
To assess the performance of the prediction model, this study employs the following four metrics to measure the model’s predictive accuracy: mean absolute percentage error (MAPE), which takes into account the error between predicted and actual values as well as the proportionality of the error to the actual values; root-mean-square error (RMSE), a measure of the discrepancy between predicted and actual values; mean absolute error (MAE), an indicator reflecting the real error in predicted values; and the coefficient of determination ($R^{2}$), a gauge of the predictive power of a statistical model. The respective formulas are as follows:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_{i}-\hat{y}_{i}}{y_{i}}\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}},$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_{i}-\hat{y}_{i}\right|, \qquad R^{2} = 1-\frac{\sum_{i=1}^{n}\left(y_{i}-\hat{y}_{i}\right)^{2}}{\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}.$$

In these equations, $y_{i}$ and $\hat{y}_{i}$ denote the actual and predicted values of the price spread data at time $i$, respectively; $n$ stands for the sample size of the test dataset; and $\bar{y}$ represents the average value of the dataset. Typically, the smaller the values of MAPE, RMSE, and MAE, the less the deviation between the predicted and actual values. The value of $R^{2}$, which falls within the range [0, 1], is utilized to measure the accuracy of predictions by a statistical model. Ordinarily, a higher $R^{2}$ implies superior modeling performance.
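The four criteria can be computed directly from the flattened prediction and ground-truth arrays, as in the sketch below; note that MAPE assumes the actual values are not close to zero.

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Return MAPE (%), RMSE, MAE, and R^2 for a set of predictions."""
    err = y_true - y_pred
    mape = 100.0 * np.mean(np.abs(err / y_true))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MAPE": float(mape), "RMSE": rmse, "MAE": mae, "R2": float(r2)}
```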
4.4. Optimizing Network Parameters by the PSO
In the pursuit of effectively updating the weight of the neural network, this study has employed the Adam optimizer for the optimization of model parameters, with a batch size established at 128. The ConvLSTM model’s compatibility with our dataset was assured via the implementation of the particle swarm optimization (PSO) algorithm for ConvLSTM hyperparameter optimization.
The PSO algorithm, by mimicking the communal behaviors exhibited by birds when locating food, leverages a combination of individual and collective experiences to progressively approach the target of interest. Each particle continually updates its position according to its own best position so far and the flock’s overall best position, so that the swarm gradually converges toward an optimal configuration [25]. PSO operates in iterations, allowing for swarm updates through alterations in each particle’s velocity and position in every cycle. These updates fundamentally depend on personal best values (pbest) and global best values (gbest). Therefore, the accurate tuning of the PSO parameters is of utmost importance.
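In the standard formulation found in the PSO literature (a sketch of the common update rules, which the variant used here is assumed to follow), the velocity and position of particle $i$ at iteration $k$ are updated as

$$v_{i}^{k+1} = w\,v_{i}^{k} + c_{1}r_{1}\left(\mathrm{pbest}_{i} - x_{i}^{k}\right) + c_{2}r_{2}\left(\mathrm{gbest} - x_{i}^{k}\right), \qquad x_{i}^{k+1} = x_{i}^{k} + v_{i}^{k+1},$$

where $w$ is the inertia weight, $c_{1}$ and $c_{2}$ are the cognitive and social acceleration coefficients, and $r_{1}$, $r_{2}$ are random numbers drawn uniformly from [0, 1].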
Eberhart and Shi’s research [
22] posits that a decent solution success rate is achievable when a PSO algorithm deploys a swarm of 20 to 50 particles. However, our experiments have illustrated that employing a swarm of only 6 particles allows for quicker convergence of the PSO algorithm with fewer computational resource requirements. A comprehensive review of PSO-centric literature [
26,
27,
28,
29] guided us in setting the swarm size at 6 and the iteration count at 10, with the upper and lower bounds of the inertia weight w being set to 0.9 and 0.4, respectively.
Furthermore, an examination of academic literature surrounding the application of the ConvLSTM network in prediction problems [
30,
31], as well as relevant works, highlighted the pivotal role of the number of ConvLSTM layers and the size of the convolution kernel, a finding corroborated by the results of our experiments. Consequently, we selected the learning rate, the size of the convolution kernel, the number of ConvLSTM layers, and the number of epochs as targets for optimization. A logical search range is essential to preventing issues such as excessive resource consumption associated with expansive search ranges during the search process. Following the analysis of related research, the search ranges were defined as [0.00001, 0.0005] for the learning rate, [1, 9] for the convolution kernel size, and [2, 7] for the number of ConvLSTM layers. In addition, a few random epoch values were tested while keeping other parameters constant, revealing suboptimal model performance when the epoch count is less than 100 and improved performance when it exceeds 300. However, to account for potential randomness, the epoch search range was set to [1, 400].
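The overall search procedure can be sketched as follows; the swarm size, iteration count, inertia-weight bounds, and search ranges are those stated above, while the acceleration coefficients and the stand-in fitness function (which in the actual workflow would train a ConvLSTM with the candidate hyperparameters and return its test-set error) are illustrative assumptions.

```python
import random

# Search ranges taken from the text; integer-valued hyperparameters are rounded
# when a candidate is actually evaluated.
BOUNDS = {
    "learning_rate": (1e-5, 5e-4),
    "kernel_size":   (1, 9),
    "num_layers":    (2, 7),
    "epochs":        (1, 400),
}
KEYS = list(BOUNDS)
N_PARTICLES, N_ITERS = 6, 10
W_MAX, W_MIN = 0.9, 0.4          # inertia-weight bounds from the text
C1, C2 = 2.0, 2.0                # acceleration coefficients (assumed values)

def fitness(position):
    """Stand-in objective; in practice this would train a ConvLSTM with the candidate
    hyperparameters and return its test-set error (lower is better)."""
    return sum(((v - (lo + hi) / 2) / (hi - lo)) ** 2
               for v, (lo, hi) in zip(position, BOUNDS.values()))

def clip(value, lo, hi):
    return max(lo, min(hi, value))

# Initialize particles uniformly inside the search box with zero velocity.
positions = [[random.uniform(*BOUNDS[k]) for k in KEYS] for _ in range(N_PARTICLES)]
velocities = [[0.0] * len(KEYS) for _ in range(N_PARTICLES)]
pbest = [p[:] for p in positions]
pbest_val = [fitness(p) for p in positions]
best_idx = min(range(N_PARTICLES), key=lambda i: pbest_val[i])
gbest, gbest_val = pbest[best_idx][:], pbest_val[best_idx]

for it in range(N_ITERS):
    w = W_MAX - (W_MAX - W_MIN) * it / max(N_ITERS - 1, 1)   # linearly decaying inertia
    for i in range(N_PARTICLES):
        for d, key in enumerate(KEYS):
            r1, r2 = random.random(), random.random()
            velocities[i][d] = (w * velocities[i][d]
                                + C1 * r1 * (pbest[i][d] - positions[i][d])
                                + C2 * r2 * (gbest[d] - positions[i][d]))
            positions[i][d] = clip(positions[i][d] + velocities[i][d], *BOUNDS[key])
        value = fitness(positions[i])
        if value < pbest_val[i]:
            pbest[i], pbest_val[i] = positions[i][:], value
            if value < gbest_val:
                gbest, gbest_val = positions[i][:], value

print("Best hyperparameters found:", dict(zip(KEYS, gbest)))
```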
With the search ranges of the target optimization parameters established, the optimal parameters are obtained by the PSO. As delineated in
Table 6, the optimal parameters for the ConvLSTM model, as yielded by application of the PSO, are a learning rate of 5.7399496072129 × 10⁻⁵, an epoch count of 365, 6 ConvLSTM layers, and a kernel size of 6.
Figure 6 shows the evolution of the loss function of the PSO Deep-ConvLSTM model during training and testing. Utilizing this optimal parameter configuration is expected to bring out the full performance potential of the ConvLSTM model, particularly in the domain of arbitrage spread prediction.
4.5. Experimental Results and Analysis
In this section, as we employ our proposed model to conduct multistep, multidimensional predictions on inter-commodity futures price spread data, it is imperative to validate the forecasting effectiveness of different models. According to the study by Kline et al. [
32], multistep predictions can be realized through either iterative or independent methods.
For comparative experimentation purposes, we have selected the following models: iterative LSTM, iterative GRU, Transformer, and FEDformer. The first two models belong to the iterative category, while our employed ConvLSTM, Transformer, and FEDformer adopt the independent prediction approach.
Specifically, recurrent neural networks (RNNs) have been widely adopted in financial forecasting due to their distinct advantages in handling time series problems. However, they do present certain drawbacks, particularly the issues of vanishing and exploding gradients. LSTM networks, capable of remedying these setbacks of RNNs, offer enhanced feature extraction and generalization abilities. The GRU model is another variant of the LSTM model. We have made minor modifications to these models, enabling them to perform iterative predictions and thereby accomplish multistep forecasting, as sketched below.
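As an illustration of the iterative scheme (distinct from the independent, direct multistep scheme used by the ConvLSTM and Transformer-type models), a one-step predictor can be rolled forward as follows; the `model` callable is a hypothetical stand-in for the trained LSTM or GRU.

```python
import numpy as np

def iterative_forecast(model, history: np.ndarray, steps: int) -> np.ndarray:
    """Roll a one-step-ahead predictor forward over a multistep horizon.

    `history` has shape (window, n_features); `model` is assumed to map such a
    window to a single (n_features,) forecast. Each forecast is appended to the
    window and fed back in, which is why errors can accumulate over the horizon.
    """
    window = history.copy()
    forecasts = []
    for _ in range(steps):
        next_step = np.asarray(model(window))
        forecasts.append(next_step)
        window = np.vstack([window[1:], next_step])   # slide the window forward
    return np.stack(forecasts)
```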
Additionally, the Transformer [
33] is an encoder-decoder structure primarily encompassing position encoding, position embedding, and a self-attention module. The multihead attention feature in the Transformer allows for parallel computation, thereby reducing the training duration. Since its introduction in 2017, the Transformer has been widely applied to various data types, including text and image data. The FEDformer is a variant of the Transformer model [
34] that combines the Transformer with the seasonal-trend decomposition method and uses randomly selected Fourier components to maintain a compact representation of time series. This not only overcomes the Transformer’s high computation cost and inability to capture a global view of time series, but it also further enhances prediction accuracy. Thus, we have also selected these two models for comparison. By evaluating our proposed model against these models, we can ascertain its performance and effectiveness in predicting cross-commodity futures price spreads, supplying valuable references for further research and application.
In the experiments, the batch size for all models was set to 128. For the LSTM and GRU models, a two-layer structure was adopted with a learning rate of 0.0001 over 100 training epochs. The number of neurons was set to 100 and 20 for the first and second layers, respectively, while the Transformer and FEDformer models utilized the official default parameters. To comprehensively evaluate the performance of our proposed model and validate its superiority under different prediction horizons, we selected prediction settings of one, four, and eight steps. Specifically, the one-step prediction was primarily employed for benchmarking against LSTM-based models, highlighting the competitive edge of our proposed model in single-step forecasting. On the other hand, the four-step and eight-step predictions were utilized to assess the model’s capabilities in handling multistep forecasting tasks. For ease of reference, the models were assigned numerical identifiers: the GRU model was Model-1, the LSTM was Model-2, the Transformer was Model-3, FEDformer-W was Model-4, FEDformer-F was Model-5, and our proposed PSO Deep-ConvLSTM was Model-6.
Table 7 provides a detailed characterization of each model’s performance indicators for one-step, four-step, and eight-step predictions across the “open”, “high”, “low”, and “close” dimensions. To clearly observe that the PSO Deep-ConvLSTM model can approximately predict the trend, we have plotted a comparison between the model’s eight-step predictions for the “close” dimension and the actual values (as shown in
Figure 6). Additionally,
Figure 7 illustrates the prediction errors of the PSO Deep-ConvLSTM model compared with the actual values, providing further insight into its predictive accuracy.
From these data and figures, we can obtain an approximate understanding of the relative merits of each model. To enable a more intuitive and precise comparison of predictive effectiveness among the models and to underscore the superior performance of the PSO Deep-ConvLSTM model, we compiled the average of these indicators across all dimensions.
Figure 6 clearly illustrates that, for one-step-ahead predictions, the LSTM model has a significant advantage over other models, such as the Transformer. The PSO Deep-ConvLSTM model yields RMSE, MAE, and MAPE values of 1.508, 1.184, and 0.884, respectively. Compared with the traditional GRU and LSTM models, our proposed model has achieved a reduction of 43.8% and 39.1% in RMSE, 47.9% and 41.6% in MAE, and 56.8% and 46.6% in MAPE, demonstrating its superior prediction performance in single-step forecasting. From
Figure 7, we observe that in four-step-ahead predictions, the LSTM and GRU models start to falter, while the Transformer-type models begin to show their strengths. The performance of LSTM and GRU deteriorates significantly, whereas the proposed model attains RMSE, MAE, and MAPE values of 2.579, 1.800, and 1.286, still delivering satisfactory results. When compared with the advanced Transformer and FEDformer models, the PSO Deep-ConvLSTM model exhibits a decrease in RMSE of 39.1%, 43.9%, and 33.1%; in MAE of 43.0%, 47.8%, and 32.3%; and in MAPE of 45.8%, 47.8%, and 31.0%. This implies that the discrepancy between predicted and actual values for the four-step-ahead forecast is smaller, with higher prediction accuracy. Furthermore, the R2 value is closer to 1, indicating a strong fitting capability. In
Figure 8, in the case of eight-step-ahead predictions, the performance of the LSTM and GRU models is quite poor, which further confirms that their advantage lies in short-term forecasting. As the number of prediction steps increases, the error metrics exhibit substantial deterioration. However, the proposed model, with RMSE, MAE, and MAPE values of 4.520, 3.293, and 2.369, can alleviate this issue to some extent. In comparison with the long-term forecasting-focused Transformer model, our model reduces the respective metrics by 0.4%, 0.1%, and 3.1%. Even when compared with the FEDformer model, an expert in long-term forecasting, our model surpasses its wavelet-based variant (FEDformer-W) and remains fairly competitive with its Fourier-based variant (FEDformer-F).
Figure 9,
Figure 10 and
Figure 11 display the average evaluation indicators for single-step and multiple-step predictions of the six models.
In summary, the PSO Deep-ConvLSTM model achieves satisfactory prediction results in both single-step and multistep forecasting.
5. Conclusions and Future Perspectives
In this study, we propose the PSO Deep-ConvLSTM futures price spread prediction model, aiming to offer a superior predictive tool to aid investors in devising efficient arbitrage strategies within futures markets. The model’s efficacy was tested using real historical futures data and assessed through comparison against alternative time series models. The empirical observations revealed the PSO Deep-ConvLSTM model’s distinct precision and superiority in both single-step and multistep predictions for futures price spread forecasting, particularly its augmented capacity to apprehend the nonlinear attributes embedded in the data. Additionally, we employed multiple datasets to further strengthen confidence in the model’s stability and generalization ability. The model exhibited comprehensive analytical competencies, considering multiple market indicators, thereby affirming its reliability in handling the complexities and volatility inherent in futures markets.
The model’s supremacy is reflected in two primary facets. First, the application of the PSO methodology enhances the weight initialization and the precision of parameters within the ConvLSTM network model, thus augmenting the objectivity of parameter selection. Second, its multidimensional and multistep predictive capacities pave the way for constructing arbitrage strategies based on forecast accuracy and comprehensiveness, elements of paramount importance to financial market participants.
To conclude, while the PSO Deep-ConvLSTM model has demonstrated significant performance potential, opportunities for improvement remain. These include the integration of strategic statistical combinations and the addition of more futures market characteristics. Additionally, extending the model’s application to other financial markets might further validate its generalizability and adaptability. Anticipated future research may build upon this groundwork, proffering an expanded range of applications within the financial sector and thus presenting an array of profound possibilities.