1. Introduction
There has been a significant increase in the use of deep-learning (DL) models to conduct multivariate time-series (TS) data analysis in many key application areas, such as banking, healthcare, environment, and other areas that benefit society, as well as crucial infrastructures that are powered by the internet of things or the climate [
1].
Quantitative system-dynamics data from historical performance and qualitative fundamental data from various news sources affect the performance of TS data obtained from economic markets [
2]. As both types of data can be combined to identify patterns in economic data effectively, they are currently of significant interest to professionals and academics alike. Most extant studies have used data derived from news sources to estimate the direction of stock movements, which is a classification problem. Forecasting the exact future price of an asset, by contrast, is a regression problem. Therefore, this present study hypothesised that a multivariate analysis using multiple price series obtained from the Saudi Stock Exchange (Tadawul), specifically opening, lowest, highest, and closing prices, increases the accuracy of forecasting the future price of a stock [
3].
Stock prices are driven by a combination of psychological and physical factors and of rational and irrational behaviours, which makes them extremely difficult to anticipate precisely. As such, stock-market prediction is one of the most significant topics in computer science at present.
Stock-market prices for the next 10 days can be predicted by analysing a financial timeline over an extended period of time. Extant studies indicate that reasonably accurate stock-price forecasts can be made with careful modelling. Nevertheless, some studies suggest that market hypotheses and predictions are merely hypothetical and inaccurate.
Multiple models have been proposed to forecast stock prices. The variables chosen, the analysis method, and the modelling process significantly affect the forecasting accuracy of a model. Deep learning (DL) and machine learning (ML) have been frequently used to analyse and predict stock prices. Artificial intelligence (AI), a field of study in computer science and technology, has made considerable strides in recent years and significantly simplified multiple real-world problems [
4,
5].
As there are multiple variables that affect a prediction, it is simpler to project the weather or a certain trait than the stock market, as it is in a constant state of flux. More specifically, stock prices fluctuate independently at times since financial markets have regular ups and downs. Therefore, compiling accurate historical data is no small feat as the stock market is volatile. Moreover, a variety of factors, such as economic, business, personal, political, and social issues, could affect the market [
6,
14]. As such, it is virtually impossible to predict changes in stock prices precisely due to the stock market’s extreme unpredictability. Globalisation, as well as the development of information and communication technology (ICT), has increased the number of individuals seeking to make high returns from the stock market. As such, stock-market predictions have become crucial for investors.
The fundamental method and the technical method are the two different methods of forecasting the stock market [
8]. Whereas the technical method analyses price charts to forecast future stock prices, the fundamental method examines every facet of a firm’s intrinsic worth. However, the fundamental method is not without drawbacks as it generates fewer time predictions and lacks compliance. As such, many stock-market prediction studies utilise the technical method. Moving averages, genetic algorithms, easy measurement, regression, and support-vector machines (SVMs) are some of the methods that have been employed to address this issue. However, none of them produces highly precise predictions, which has given rise to the use of mixed-method approaches. Although these problem-solving methods have been around for a while, predicting future prices and spotting trends in stock-market data remain some of the most difficult problems in data science because many significant factors are involved.
This present study examined a method that uses the long short-term memory (LSTM) algorithm. The recurrent neural network (RNN) is a potent technique for handling TS data, and in many ways LSTMs are superior to conventional RNNs and feed-forward networks [
9]. This is because they can selectively retain patterns over long durations, which gives them a memory capability. Few RNN models have been utilised to predict movements in the stock market, and the LSTM is one of the most effective RNN structures. As this present study primarily examines multivariate TS analysis, it discusses the distinctions between univariate and multivariate analysis. A TS that only contains one variable is referred to as a univariate TS. Its value at time step t is essentially derived from its values at the preceding time steps, t − 1, t − 2, t − 3, and so on. Univariate models are easier to construct than multivariate models. The closing price of a capital asset is typically the variable of interest when predicting stock prices [
7].
Unlike a univariate TS, a multivariate TS contains not a single dependent variable but several interconnected variables [
10]. It also takes into account numerous influencing aspects as well as noninfluencing aspects. Predicting stock-market prices using multivariate analysis depends on many different variables, such as opening, highest, lowest, and closing prices. Therefore, multivariate models are typically more challenging [
11] to develop than univariate models. This poses difficulties, including the management of several parameters in a high dimensional space [
12].
This present study concentrates on the short-term price behaviour of the communication industry, which is made up of three major firms: Etihad Etisalat (Mobily), Saudi Telecom Company (STC), and Zain KSA. These three businesses were chosen because they are heavily traded on the Saudi Stock Exchange (Tadawul) and therefore best reflect the economy of the Kingdom of Saudi Arabia (KSA). The KSA’s communication industry is among the largest and fastest-emerging markets in North Africa and the Middle East, with Mobily, STC, and Zain KSA as the top players in the mobile market. In 2022, the combined market valuation of these three firms surpassed 58 billion Saudi riyals.
The main contributions of this paper are as follows:
It outlines what is required to conduct a time-series (TS) analysis and demonstrates how each of the univariate and multivariate models examined meets these requirements and is applied to the proposed LSTM DL;
It provides an in-depth comparison between univariate and multivariate TS to explain why a multivariate TS is better than a univariate TS;
It uses real datasets to present the results of the experimental analysis, which indicate that including the other prices affects the predictive accuracy of the proposed LSTM DL.
Section 1 provides an introduction to the researched topic, while
Section 2 is the Literary Survey, which discusses relevant extant studies and how the present study differs from them.
Section 3 discusses the dataset that this present study used.
Section 4 outlines the methodology that was used to develop the proposed LSTM DL, while
Section 5 presents the experiment results for the prediction algorithm of the proposed LSTM DL. Finally,
Section 6 contains the conclusion.
2. Literary Survey
It is essential to review extant studies prior to conducting research. The experimental findings of Liu and Long (2020) demonstrated that the proposed hybrid framework, which could be used for fiscal data analysis and research or stock-market monitoring, had the best projection accuracy. As the primary component of the mixed framework, a dropout strategy and particle swarm optimisation (PSO) were used to augment a DL network predictor that was based on an LSTM network. The proposed framework for projecting stock closing prices outperformed conventional models in terms of prediction. Theoretically, the LSTM network is particularly well suited for financial TS projection due to its cyclic nature, which gives it the function of long-term memory. The deep hybrid framework’s components included data processing, a DL predictor, and the predictor optimisation technique [
13].
Nabipour et al. (2020) used bagging, decision tree, random forest, gradient boosting, adaptive boosting (AdaBoost), eXtreme gradient boosting (XGBoost), RNN, artificial neural networks (ANN), and LSTM. Different ML techniques were used to forecast future stock-market group values. From the Tehran Stock Exchange, four groups, i.e., petroleum, diversified financials, nonmetallic minerals, and basic metals, were selected for experimental assessments. Due to its intrinsic subtleties, nonlinearity, and complexity, the estimation of stock group prices has long been appealing to and challenging for investors [
14].
Manujakshi et al. (2022) used a hybrid prediction rule ensembles (PRE) and deep neural networks (DNN) stock-prediction model to address nonlinearity in the examined data. The mean absolute error (MAE) and root-mean-square-error (RMSE) metrics were used to calculate the results of the hybrid PRE–DNN stock estimation model. The stock-prediction rules with the lowest RMSE score were selected from all the stock-prediction rules generated using the PRE approach. As it improved the RMSE by 5% to 7%, the hybrid PRE–DNN stock-estimation model outperformed the DNN and ANN single prediction models. Due to the many aspects which affect the stock market, including corporate earnings, geopolitical tension, and commodity prices, stock prices are prone to volatility [
15].
The single-layer RNN model suggested by Zaheer et al. (2023) improved prediction accuracy by 2.2%, 0.4%, 0.3%, 0.2%, and 0.1%. The experimental findings supported the efficacy of the proposed prediction methodology, which helps investors increase their earnings by selecting wisely. The model uses the input stock data to anticipate the closing price and highest price of a stock for the following day. The obtained results demonstrated that CNN performed the worst while the LSTM outperformed the CNN–LSTM, the CNN–RNN outperformed the CNN–LSTM and the LSTM, and the proposed single-layer RNN model outperformed all of them. Due to the nonlinearity, substantial noise, and volatility of TS data on stock prices, stock-price estimation is highly difficult [
16].
Kim et al. (2022) proposed methods of predicting the oil prices of Brent Crude and West Texas Intermediate (WTI) using multivariate TS of key S&P 500 stock prices, Gaussian process modelling, vine copula regression, DL, and a smaller number of significant covariates. The 74 large-cap key S&P 500 stock prices and the monthly log returns of the oil prices of both benchmarks for the timespan of February 2001 to October 2019 were used to test the proposed methods on actual data. To reduce the dimensions of the data, a Bayesian variable selection and nonlinear principal component analysis (NLPCA) were used. With regard to the extent of prediction errors, the vine copula regression with NLPCA was generally superior to the other proposed approaches [
17].
Munkhdalai et al. (2022) used a unique locally adaptable and interpretable DL architecture supplemented with RNNs to render model explainability and superior predictive accuracy for time-series data. The experimental findings using publicly available benchmark datasets demonstrated that the model not only outperformed state-of-the-art baselines in terms of prediction accuracy but also identified the dynamic link between input and output variables. Next, to make the regression coefficients adjustable for each time step, RNNs were used to reparameterise the basic model. A simple linear regression and statistical test were used to establish the base model. In time-dependent fields like finance and economics that depend on variables, explaining dynamic relationships between input and output variables is one of the most crucial challenges [
18].
According to Charan et al. (2022), the Prophet model based on logistic regression had the lowest RMSE value of all the other algorithms; therefore, it was the most effective. A firm’s growth and advancement can be easily predicted by examining its stock market performance, which paves the way for the advancement of stock-price projection technology to determine the effects of any event occurring in the modern world. Four supervised ML models, Facebook Prophet, multivariate linear regression, autoregressive integrated moving average (ARIMA), and LSTM, were used to predict the closing stock price of Tata Motors, a major Indian automaker, while RMSE was used to determine efficacy [
19].
The introduction of an automated system which can predict potential stock prices with significant precision is important as it is not humanly possible for stock market traders and investors to comprehend the nature of fluctuations in prices. Forecasting future stock-market trends is unfortunately quite difficult due to the unpredictable, nonlinear, and volatile character of stock-market values. Uddin et al. (2022) demonstrated the promising potential of their proposed architecture by examining data from the Dhaka Stock Exchange. In order to reduce investment risks, it is crucial for the interested parties and stakeholders to possess suitable and insightful trends of stock prices [
12].
Using an RNN and discrete wavelet transform (DWT), Jarrah and Salim (2019) sought to forecast Saudi stock price trends based on historic prices. A comparison indicated that the proposed DWT–RNN method provided for a more precise estimation of the day’s closing price using MAE, mean squared error (MSE), and RMSE than the ARIMA method. The developed RNN was trained using the backpropagation through time (BPTT) approach to assist in forecasting the closing price of the Saudi market’s stocks for the selected sample of firms for the upcoming seven days. The noise around the data collected from the Saudi stock market was reduced with the help of the DWT technique. The results were then analysed and compared to that of more conventional prediction algorithms, such as ARIMA [
20].
Appendix A summarises the details of the literary survey.
4. Methodology and Implementation
Different strategies have been proposed to address problems in the real world. However, the algorithms created to address such challenges are among the most difficult to develop. Examples of real-world challenges include forecasting the weather, understanding patterns in massive amounts of data and using them to make future estimations, converting speech into text, translating languages, and predicting the next word as a sentence is typed into a word processor.
4.1. Prediction Model
Multiple issues are categorised according to the available data and the desired outcome. Many extant TS data-based studies have concluded that LSTM significantly improves performance, especially when the outputs of the previous state need to be remembered. It has also been found to outperform both a traditional feed forward network (FFN) and RNN in sequence prediction tasks due to its capacity for long-term memory.
There are input, output, and forget gates in every LSTM cell. The LSTM network stores all the data that enters and uses the forget gate to delete information that is not necessary (
Figure 1).
As seen in
Figure 1, the preceding hidden state ($h_{t-1}$), prior cell state ($C_{t-1}$), and present input ($X_t$) are the inputs used to compute the current cell state ($C_t$).
The forget gate uses a sigmoid function to decide which information needs to be scrubbed from the block. It examines the previous hidden state ($h_{t-1}$) and the current input ($X_t$) and, for each number in the cell state $C_{t-1}$, produces a number between 0 (delete) and 1 (store).
An input gate identifies which input values should be utilised to alter the memory. The sigmoid function decides which values to let through by scaling each of them between 0 and 1, while the tanh function assigns weight to the candidate values and ranks their significance on a scale from −1 to 1.
An output gate determines the output of a block using its input and memory. The sigmoid function decides which values to let through by scaling each of them between 0 and 1, while the tanh function assigns weight to the cell-state values on a scale from −1 to 1. The tanh result is then multiplied by the sigmoid output to give the output of the block.
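To make the gate description concrete, the following is a minimal NumPy sketch of one forward step of a standard LSTM cell. It is an illustration under assumptions rather than the implementation used in this study: the stacked weight layout `W`, the bias `b`, the helper names, the sizes, and the random inputs are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, squashing values into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_cell_step(x_t, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    x_t:    current input, shape (n_features,)
    h_prev: previous hidden state h_{t-1}, shape (n_units,)
    c_prev: previous cell state C_{t-1}, shape (n_units,)
    W:      stacked gate weights, shape (4 * n_units, n_units + n_features)
    b:      stacked gate biases, shape (4 * n_units,)
    """
    n_units = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f_t = sigmoid(z[0 * n_units:1 * n_units])  # forget gate: 0 = delete, 1 = store
    i_t = sigmoid(z[1 * n_units:2 * n_units])  # input gate: how much to write
    g_t = np.tanh(z[2 * n_units:3 * n_units])  # candidate values in (-1, 1)
    o_t = sigmoid(z[3 * n_units:4 * n_units])  # output gate: how much to expose
    c_t = f_t * c_prev + i_t * g_t             # updated cell state C_t
    h_t = o_t * np.tanh(c_t)                   # updated hidden state h_t
    return h_t, c_t


# Hypothetical sizes: four input prices per day, three LSTM units.
rng = np.random.default_rng(0)
n_features, n_units = 4, 3
W = rng.normal(scale=0.1, size=(4 * n_units, n_units + n_features))
b = np.zeros(4 * n_units)

h, c = np.zeros(n_units), np.zeros(n_units)
for x_t in rng.normal(size=(5, n_features)):  # five illustrative time steps
    h, c = lstm_cell_step(x_t, h, c, W, b)
```

Because the hidden state is the product of a sigmoid output and a tanh output, every element of `h` stays strictly between −1 and 1, whatever the inputs.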
4.2. LSTM Prediction Algorithm
Long short-term memory (LSTM) is an RNN architecture that is frequently used to predict TSs and process natural language. It can also be used for both univariate and multivariate TS prediction problems.
Univariate LSTMs are used when a TS only contains a single input feature, such as a stock’s closing price on a daily basis or a city’s temperature on a daily basis. In such cases, the LSTM is trained to forecast future values solely using the single input feature’s past values.
Multivariate LSTMs, on the other hand, are used when a TS contains multiple input features, such as a stock’s closing price on a daily basis, a city’s temperature on a daily basis, and the volume of the stock on a daily basis. In such cases, the LSTM is trained to forecast future values using all these input features’ past values.
Regardless of whether an LSTM is univariate or multivariate, the principal idea of its architecture remains the same. An LSTM uses gates to control the flow of information through a network and to preserve long-term dependencies in the data. Therefore, LSTMs are well suited to TS prediction problems where the correlation between past and future values is complex and may not be easily modelled using traditional statistical methods.
Univariate LSTM Algorithm | Multivariate LSTM Algorithm |
1. Libraries Importing and Data Loading: Import the necessary libraries, such as TensorFlow and NumPy, and load the data into the system. | 1. Libraries Importing and Data Loading: Import the necessary libraries, such as TensorFlow and NumPy, and load the data into the system. |
2. Data Preprocessing: Prepare the data for training by removing noise using exponential smoothing (ES), then scaling, transforming, and splitting the data into training and validation sets. | 2. Data Preprocessing: Prepare the data for training by removing noise using ES, then scaling, transforming, and splitting the data into training and validation sets. |
3. Model Definition: Define the architecture of the LSTM model using the number of units, layers, activation functions, and other hyperparameters. | 3. Data Reshaping: Reshape the data into a three-dimensional (3D) format with the dimensions (samples, time steps, features). |
4. Model Compilation: Compile the model by defining the optimiser, loss function, and metrics to be used. | 4. Model Definition: Define the architecture of the LSTM model using the number of units, layers, activation functions, and other hyperparameters. |
5. Model Training: Train the model by using the fit() function to fit it to the training data. | 5. Model Compilation: Compile the model by defining the optimiser, loss function, and metrics to be used. |
6. Model Evaluation: Evaluate the model using the validation dataset and the evaluate() function to calculate performance metrics, such as accuracy and loss. | 6. Model Training: Train the model by using the fit() function to fit it to the training data. |
7. Prediction Generation: Use the model to create predictions using the new data and the predict() function. | 7. Model Evaluation: Evaluate the model using the validation dataset and the evaluate() function to calculate performance metrics, such as accuracy and loss. |
8. Results Plotting: Plot the results to visualise the model’s performance and compare it to the actual values. | 8. Prediction Generation: Use the model to create predictions using the new data and the predict() function. |
9. Save the Model: Save the trained model to disk for later use. | 9. Results Plotting: Plot the results to visualise the model’s performance and compare it to the actual values. |
 | 10. Save the Model: Save the trained model to disk for later use. |
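The scaling and splitting part of step 2 could be sketched as follows. This is only an illustration under assumptions: the helper names, the 80/20 split, the choice of min-max scaling, and the synthetic prices are hypothetical and are not taken from the study. The split is chronological (no shuffling), and the scaling statistics are computed from the training portion only so that no information from the validation period leaks into training.

```python
import numpy as np


def chronological_split(values, train_fraction=0.8):
    """Split a series into training and validation parts without shuffling."""
    split_at = int(len(values) * train_fraction)
    return values[:split_at], values[split_at:]


def fit_min_max(train):
    """Return the per-column minimum and range, computed on training data only."""
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # guard against constant columns
    return lo, span


def apply_min_max(values, lo, span):
    """Scale values with previously fitted statistics."""
    return (values - lo) / span


# Hypothetical opening, highest, lowest, and closing prices for ten days.
days = np.arange(10, dtype=float)
close = 10.0 + 0.1 * days
prices = np.column_stack([close - 0.05, close + 0.2, close - 0.2, close])

train, val = chronological_split(prices, train_fraction=0.8)
lo, span = fit_min_max(train)
train_scaled = apply_min_max(train, lo, span)
val_scaled = apply_min_max(val, lo, span)
```

In this rising synthetic series the scaled validation values exceed 1, which is expected when the scaler is fitted on earlier data only.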
4.3. Forecast Methods
The data that were used as inputs in this present study were collected from a finance website, namely,
www.investing.com, and used to examine both the univariate and multivariate methods.
4.3.1. Univariate Method Using Closing Prices
As seen in
Figure 2, the performance and price behaviours of the forecasting model were validated by directly inputting the closing prices of a stock.
4.3.2. Multivariate Method Using Closing, Opening, Highest, and Lowest Prices
As seen in
Figure 3, the performance and price behaviours of the forecasting model were validated by directly inputting four different prices of a stock.
Table 4 provides a summary of the statistics of the input data.
4.4. Noise Removal from the Dataset
Exponential smoothing (ES) is a general method for smoothing TS data using the exponential window function. In contrast to the ordinary moving average, which weights previous observations equally, ES uses weights that decrease exponentially with time. It is a simple, easily understood procedure that can be applied on the basis of the user’s existing presumptions, such as seasonality. ES is frequently employed in TS data analysis. It is also one of the various window functions frequently used in signal processing to smooth data, where it serves as a low-pass filter to eliminate high-frequency noise.
Poisson’s use of recursive exponential window functions in convolutions from the 19th century [
21] and Kolmogorov and Zurbenko’s use of recursive moving averages in their turbulence studies in the 1940s are methods that precede ES [
22].
The result of the ES algorithm, typically expressed as $s_t$, could be viewed as the best prediction of what the next value of the raw data $x_t$ will be. The raw data sequence is frequently denoted as starting at time t = 0. The formulas below provide the simplest form of ES when the observation sequence starts at time t = 0:
$$s_0 = x_0$$
$$s_t = \alpha x_t + (1 - \alpha)\, s_{t-1}, \quad t > 0$$
where α is the smoothing factor and 0 < α < 1.
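As a minimal sketch of this simplest form of ES, the function below sets the first smoothed value to the first observation and computes each later value as a weighted average of the current observation and the previous smoothed value. The function name, the sample values, and the choice of α = 0.5 are assumptions for illustration; the smoothing factor used in this study is not stated here.

```python
import numpy as np


def exponential_smoothing(x, alpha):
    """Simple exponential smoothing of a non-empty 1D sequence.

    The first smoothed value equals the first observation; afterwards
    s[t] = alpha * x[t] + (1 - alpha) * s[t - 1].
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    x = np.asarray(x, dtype=float)
    s = np.empty_like(x)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1.0 - alpha) * s[t - 1]
    return s


# Hypothetical noisy closing prices.
noisy = np.array([10.0, 12.0, 9.0, 11.0, 10.0])
smoothed = exponential_smoothing(noisy, alpha=0.5)
print(smoothed)  # [10.   11.   10.   10.5  10.25]
```

A smaller α gives more weight to the history and therefore a smoother series, while an α close to 1 follows the raw data more closely.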
Figure 4,
Figure 5 and
Figure 6 show a sample of the noise in the dataset by plotting the original data alongside the data after noise removal (denoised).
5. Experiment Results
The present study was conducted using the historical price data of three high-volume stocks, namely, STC, Mobily, and Zain KSA, over a period of 1462 days, specifically, from 1 January 2017 to 8 November 2022. In the univariate forecast, only the closing prices of the aforementioned stocks were used to forecast their closing prices for the subsequent seven days. In the multivariate forecast, four different prices, namely closing, opening, highest, and lowest, were used to forecast their closing prices for the subsequent seven days.
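The difference between the two set-ups can be illustrated by how the training samples are built. The sketch below is a hedged example rather than the study’s code: `make_windows`, the look-back length of 10 time steps, and the synthetic prices are assumptions, while the seven-day horizon and the choice of input prices follow the description above. Each sample pairs a window of past inputs with the next seven closing prices.

```python
import numpy as np


def make_windows(features, target, time_steps, horizon):
    """Build (samples, time steps, features) inputs and (samples, horizon) targets."""
    features = np.asarray(features, dtype=float)
    if features.ndim == 1:
        features = features[:, None]  # a single series becomes one feature column
    target = np.asarray(target, dtype=float)
    n_samples = len(features) - time_steps - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window and horizon")
    X = np.stack([features[i:i + time_steps] for i in range(n_samples)])
    y = np.stack([target[i + time_steps:i + time_steps + horizon]
                  for i in range(n_samples)])
    return X, y


# Hypothetical data: 40 days of opening, highest, lowest, and closing prices.
n_days, time_steps, horizon = 40, 10, 7
close = 20.0 + 0.1 * np.arange(n_days)
ohlc = np.column_stack([close - 0.1, close + 0.3, close - 0.3, close])

X_uni, y_uni = make_windows(close, close, time_steps, horizon)     # closing price only
X_multi, y_multi = make_windows(ohlc, close, time_steps, horizon)  # four prices

print(X_uni.shape, X_multi.shape, y_multi.shape)  # (24, 10, 1) (24, 10, 4) (24, 7)
```

Both set-ups share the same targets; only the last dimension of the input changes, from one feature to four.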
5.1. Forecasting Accuracy
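Error measures were calculated to assess the feasibility of both the approaches stated in the Methodology and Implementation section. The mean absolute percentage error (MAPE), MSE, MAE, and RMSE were employed to substantiate the performance of the suggested models. Equations (8)–(11) depict their formulas. These four indexes are stated as:
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \tag{8}$$
$$\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2 \tag{9}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right| \tag{10}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2} \tag{11}$$
where $y_t$ is the actual value, $\hat{y}_t$ is the predicted value, t = 1 … n, and n is the number of observations.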
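For illustration, the four error measures can be computed with a few lines of NumPy using their standard definitions. The function name and the sample values below are hypothetical and are not results from this study.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAPE (in percent), MSE, MAE, and RMSE for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = actual - predicted
    mse = float(np.mean(error ** 2))
    return {
        "MAPE": float(100.0 * np.mean(np.abs(error / actual))),  # undefined if an actual value is 0
        "MSE": mse,
        "MAE": float(np.mean(np.abs(error))),
        "RMSE": float(np.sqrt(mse)),
    }


# Hypothetical actual and predicted closing prices.
actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 190.0, 400.0])
errors = forecast_errors(actual, predicted)
```

In this example the absolute errors are 10, 10, and 0, so the MAE is 20/3, the MSE is 200/3, and the MAPE is 5%.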
The outcomes of the tests conducted with the two approaches described in the Methodology and Implementation section are outlined below.
5.2. Fitting the Models with LSTM
Similar fitting processes were used for both the univariate and multivariate LSTMs. This involved the following parameters:
Neurons: the quantity of neurons in the network’s dense output layer. The number of projected values that a network will generate is determined by the output layer. For instance, in a regression problem, the output layer possesses a single neuron, whereas, in a classification problem, the output layer contains the same number of neurons as classes.
Epochs: the number of times that a model will iterate over an entire training dataset. An epoch is defined as one complete iteration over the training data. Although more epochs may increase performance, it also increases the risk of overfitting.
Time steps: the number of time steps or sequence length of the input.
Batch size: the number of samples in a single batch.
Units: the quantity of LSTM units in the layer(s) of the network.
Dropout rate: the rate of dropout regularisation applied to the output of each LSTM unit, which helps prevent overfitting.
Input size: the number of input variables (features) at each time step.
Several other parameters, such as the optimiser, loss function, and learning rate, also control the training process and the convergence of a network. Therefore, these parameters must be carefully chosen and tuned according to the specific problem and data at hand.
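The way these parameters interact can be sketched with a small configuration object. Everything below is a hypothetical illustration: the class name and all the values are assumptions and are not the settings in Table 5. The weight count uses the standard formula for a single LSTM layer, 4 × (units × (units + input size) + units), and assumes that this layer feeds the dense output layer directly.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LstmConfig:
    neurons: int         # size of the dense output layer
    epochs: int          # full passes over the training data
    time_steps: int      # sequence length of each input sample
    batch_size: int      # samples per gradient update
    units: int           # LSTM units in the layer
    dropout_rate: float  # dropout applied to the LSTM output
    input_size: int      # input variables per time step

    def input_shape(self):
        """Shape of one input batch: (batch size, time steps, input size)."""
        return (self.batch_size, self.time_steps, self.input_size)

    def lstm_weight_count(self):
        """Trainable weights of one standard LSTM layer (four gate blocks)."""
        return 4 * (self.units * (self.units + self.input_size) + self.units)

    def dense_weight_count(self):
        """Trainable weights of the dense output layer fed by the LSTM layer."""
        return self.units * self.neurons + self.neurons

    def updates_per_epoch(self, n_samples):
        """Number of batches needed to cover the training set once."""
        return math.ceil(n_samples / self.batch_size)


# Hypothetical settings; only input_size differs between the two models.
univariate = LstmConfig(neurons=1, epochs=50, time_steps=10, batch_size=32,
                        units=64, dropout_rate=0.2, input_size=1)
multivariate = LstmConfig(neurons=1, epochs=50, time_steps=10, batch_size=32,
                          units=64, dropout_rate=0.2, input_size=4)

print(univariate.lstm_weight_count())    # 16896
print(multivariate.lstm_weight_count())  # 17664
```

With these assumed values, moving from one input price to four adds only 768 weights to the LSTM layer, since only the input-to-gate weights grow with the input size.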
Table 5 provides a summary of the parameters that were used in the model.
5.2.1. The First Method (Univariate)
The closing prices of the Mobily, STC, and Zain KSA stocks were fed directly into the LSTM model, which attempts to identify and comprehend patterns in the historic data. As the closing price was the only attribute used from the dataset, the model was univariate. Years of historical data were used to train the model.
Table 6 and
Figure 7,
Figure 8 and
Figure 9 provide an explanation of the model’s outcomes.
5.2.2. The Second Method (Multivariate)
The opening, highest, lowest, and closing prices of the Mobily, STC, and Zain KSA stocks were fed in parallel into the LSTM model. As multiple prices from the dataset were used, the model was multivariate. The model was then trained and its predictions assessed.
Table 7 and
Figure 10,
Figure 11 and
Figure 12 provide a more thorough explanation of the model’s outcomes.
The multivariate method, which used multiple stock prices, outperformed the univariate method and successfully predicted the stock prices for the next seven days. The multivariate LSTM models are better at predicting the stock market than their univariate counterparts as they use information from multiple inputs, such as the opening, highest, and lowest prices of the stock, which improves their predictive accuracy. Meanwhile, univariate models only use a single input variable, typically the closing price of the stock being predicted, which does not contain all the information necessary to make accurate predictions. Furthermore, as multivariate models are also better at capturing the complex correlations and interdependencies of the different input variables, they make more accurate predictions.
7. Conclusions
The purpose of this present study was to forecast stock prices. The stock-price performance for the coming week was forecasted using historical data from Mobily, STC, and Zain KSA. The first, univariate analysis applied only the closing-price data as a single input to the LSTM DL projection model, while the second, multivariate analysis applied the closing, opening, highest, and lowest price data in parallel to the LSTM DL model to forecast the closing prices for the following seven days. The multivariate method predicted the closing prices for the next seven days more accurately than the univariate method, which may support better-informed investment decisions.
Other factors that influence financial markets, such as governmental regulations, currency exchange rates, interest rates, inflation, and Twitter sentiment, may be used to build a knowledge base or as data for forecasting models, and hybrid methods may also be explored. Short-term expectations are another variable that could potentially affect trend predictions or pricing. This present study only forecasted prices for the following seven days. Therefore, future studies may examine longer periods, such as 10 or 15 days. Long-term predictions may also be examined for further in-depth analysis or guidance. This may include regulatory stability, reviews of quarterly stock performance, sales returns, and dividend returns, among other elements. Although longer time frames can be investigated and validated if a longer-term outlook is necessary, shorter time frames, such as the 10-day or 50-day simple moving average (SMA) or exponential moving average (EMA), can also be used.
It is well known that a wide range of aspects, such as governmental regulations, corporate practices, and interest rates, to name a few, impact stock prices. For that matter, any announcements regarding these factors also impact stock prices. Even the best ML or DL models will be significantly affected by natural catastrophes and other unforeseen events. Therefore, a hybrid method that considers further elements, such as sentiment, news, and technical indicators, can be developed to yield a more robust and precise forecasting mechanism.
The main limitation of this study is that it did not use further elements that affect stock-price fluctuations, such as news of the resignation of a board chairman, which could be transformed into time series and used alongside the current inputs. Addressing this challenge is the subject of our upcoming research.