A Comparative Analysis of Machine Learning and Deep Learning Techniques for Accurate Market Price Forecasting
Abstract
1. Introduction
1.1. Research Contributions
- Comparative Analysis of SVR, RNN, and LSTM: This study provides a detailed comparison of three widely used models for time-series forecasting in financial markets, highlighting their strengths and limitations.
- Demonstration of LSTM Superiority: This research demonstrates that LSTM, with its ability to capture long-term dependencies, outperforms both SVR and RNN over longer time windows, confirming its suitability for financial forecasting.
- Inclusion of Exogenous Variables: The incorporation of On-Balance Volume (OBV) as an exogenous variable in this study shows how the inclusion of technical indicators can significantly enhance predictive accuracy in stock price forecasting.
- Application of Advanced Hyperparameter Tuning: The use of Optuna for hyperparameter tuning illustrates the importance of optimisation in improving the model’s performance. This framework enables a more systematic and efficient search for the best hyperparameters compared to manual tuning.
- Evaluation with Time-Series Cross-Validation (TimeSeriesSplit): This study emphasises the importance of time-series cross-validation in ensuring robust model evaluation, as it respects the temporal structure of financial data and prevents information leakage, leading to more realistic performance estimates.
- Comprehensive Error Metrics: A broad range of evaluation metrics, including MAE, MSE, RMSE, MAPE, and R-squared, is used to provide a thorough assessment of each model’s performance. The inclusion of R-squared highlights the models’ ability to explain variance in the data, which is particularly relevant for financial prediction tasks.
- Diagnostic Evaluation: Additional diagnostic tools, such as scatter plots, loss-over-epoch plots, and residual plots, are used to analyse model behaviour and to assist in diagnosing issues, fine-tuning the models, and ensuring generalisation.
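The evaluation protocol in the contributions above can be sketched with scikit-learn’s `TimeSeriesSplit`, which keeps every training fold strictly earlier in time than its validation fold and so prevents the information leakage the study guards against. The series length and fold count below are illustrative, not the study’s actual configuration.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Illustrative series: 20 sequential observations standing in for daily closes.
prices = np.arange(20)

# Each training fold ends before its validation fold begins, so no future
# information leaks into training.
tscv = TimeSeriesSplit(n_splits=4)
for fold, (train_idx, val_idx) in enumerate(tscv.split(prices)):
    assert train_idx.max() < val_idx.min()  # temporal order preserved
    print(f"fold {fold}: train up to t={train_idx.max()}, "
          f"validate t={val_idx.min()}..{val_idx.max()}")
```

By contrast, an ordinary shuffled k-fold split would scatter future observations into the training set, producing optimistic and unrealistic performance estimates on financial data.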
1.2. Significance of the Study
- Improved Financial Forecasting: This study’s identification of LSTM as the most effective model for the time-series forecasting of stock prices enables traders and investors to make more informed decisions about buying or selling stocks and other financial assets.
- Enhanced Algorithmic Trading Strategies: The findings support the use of LSTM models in algorithmic trading, where predictive models are used to automate trades based on real-time data. This can lead to more profitable trading strategies, allowing algorithms to respond to market movements faster and more accurately.
- Risk Management and Financial Planning: This study contributes to better risk management as financial institutions can use the findings to develop more accurate models that anticipate market downturns or periods of high volatility.
- Application of Technical Indicators: The incorporation of technical indicators such as On-Balance Volume (OBV) in machine learning models provides traders and analysts with a competitive edge in identifying market opportunities.
- Supporting Quantitative Finance and Machine Learning in Finance: The findings of this study provide a valuable benchmark for model selection in financial forecasting and have practical significance for quantitative finance professionals who develop machine learning models to predict market trends.
- Educational Value for Financial Data Science: This research provides a clear and practical example of how machine learning models can be applied to real-world financial data and offers educational value for students, researchers, and professionals learning about financial data science and its practical applications. It also contributes to the growing body of knowledge on applying machine learning to financial markets, thereby providing a case study that can be expanded upon by future researchers or practitioners.
2. Materials and Methods
2.1. Feature Re-Engineering
- Gain = Positive price change between the closing prices of two consecutive periods (only positive changes are considered).
- Loss = The absolute value of the negative price change (only negative changes are considered).
- n = Number of periods (commonly 14).
- Average Gain: The sum of all gains (positive price changes) divided by the number of periods being considered (often 14 periods).
- Average Loss: The sum of all losses (negative price changes) divided by the number of periods.
- Increase in price (current close > previous close): add the current period’s volume to the previous OBV.
- Decrease in price (current close < previous close): subtract the current period’s volume from the previous OBV.
- Unchanged price (current close = previous close): carry the previous OBV forward unchanged.
- Cumulative Calculation: OBV starts from an initial value (set to zero here); volume is then added or subtracted at each time step according to the price change, yielding the cumulative OBV over time.
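The RSI and OBV definitions above can be sketched in pandas as follows. The function names are illustrative; the sketch assumes closing prices and volumes are supplied as aligned series and uses the common 14-period default for RSI.

```python
import numpy as np
import pandas as pd

def rsi(close: pd.Series, n: int = 14) -> pd.Series:
    """Relative Strength Index from the average gain/loss definitions above."""
    change = close.diff()
    gain = change.clip(lower=0)        # only positive changes are considered
    loss = (-change).clip(lower=0)     # absolute value of negative changes
    avg_gain = gain.rolling(n).mean()  # sum of gains / n
    avg_loss = loss.rolling(n).mean()  # sum of losses / n
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

def obv(close: pd.Series, volume: pd.Series) -> pd.Series:
    """On-Balance Volume: cumulative volume signed by the price-change direction."""
    # +1 for an up move, -1 for a down move, 0 when the price is unchanged;
    # the first period has no prior close, so its contribution is zero.
    direction = np.sign(close.diff()).fillna(0)
    return (direction * volume).cumsum()
```

The `cumsum` over signed volumes implements the cumulative calculation directly: an unchanged price contributes zero, so the previous OBV simply carries forward.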
2.2. Algorithm Selection for NGX Market Data
2.2.1. Model Architecture, Development and Training
2.2.2. Hardware and Computational Resources
3. Results and Discussion
3.1. Support Vector Machines for Regression (SVR)
3.2. Recurrent Neural Network
3.3. Long Short-Term Memory Network
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Han, H.; Liu, Z.; Barrios Barrios, M.; Li, J.; Zeng, Z.; Sarhan, N.; Awwad, E.M. Time series forecasting model for non-stationary series pattern extraction using deep learning and GARCH modeling. J. Cloud Comput. 2024, 13, 2. [Google Scholar] [CrossRef]
- Abuein, Q.Q.; Shatnawi, M.Q.; Aljawarneh, E.Y.; Manasrah, A. Time Series Forecasting Model for the Stock Market using LSTM and SVR. Int. J. Adv. Soft Comput. Appl. 2024, 16, 169–185. [Google Scholar] [CrossRef]
- Fama, E.F. The behavior of stock-market prices. J. Bus. 1965, 38, 34–105. [Google Scholar] [CrossRef]
- Lo, A.W. The adaptive markets hypothesis: Market efficiency from an evolutionary perspective. J. Portf. Manag. Forthcom. 2004. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=602222 (accessed on 6 May 2024).
- Zulfiker, M.S.; Basak, S. Stock Market Prediction: A Time Series Analysis. ResearchGate. 2021. Available online: https://www.researchgate.net/publication/354362969 (accessed on 6 May 2024).
- Lakshminarayanan, S.K.; McCrae, J.P. A Comparative Study of SVM and LSTM Deep Learning Algorithms for Stock Market Prediction. CEUR Workshop Proceedings. 2019. Available online: https://ceur-ws.org/Vol-2563/aics_41.pdf (accessed on 5 June 2024).
- Pashankar, S.S.; Shendage, J.D.; Pawar, J. Machine Learning Techniques for Stock Price Prediction—A Comparative Analysis of Linear Regression, Random Forest, And SVR. J. Adv. Zool. 2024, 45, 118–127. [Google Scholar] [CrossRef]
- Chhajer, P.; Shah, M.; Kshirsagar, A. The applications of artificial neural networks, support vector machines, and long-short term memory for stock market prediction. Decis. Anal. J. 2022, 2, 100015. [Google Scholar] [CrossRef]
- Shangshang, J. A Comparative Analysis of Traditional and Machine Learning Methods in Forecasting the Stock Markets of China and the US. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 1–8. Available online: https://thesai.org/Downloads/Volume15No4/Paper_1-A_Comparative_Analysis_of_Traditional_and_Machine_Learning.pdf (accessed on 5 August 2024).
- Li, X.; Liang, C.; Ma, F. Forecasting Stock Market Volatility with a Large Number of Predictors: New evidence from the MS-MIDAS-LASSO model. Ann. Oper. Res. 2022, 1–40. [Google Scholar] [CrossRef]
- Dong, X.; Li, Y.; Rapach, D.E.; Zhou, G. Anomalies and the expected market return. J. Financ. 2022, 77, 639–681. [Google Scholar] [CrossRef]
- Sarainmaa, O. Swing Trading the S&P500 Index with Technical Analysis and Machine Learning Methods with Responsible Way. Master’s Thesis, Abo Akademi University, Turku, Finland, 2024. Available online: https://www.doria.fi/handle/10024/189661 (accessed on 6 August 2024).
- Badhe, T.; Borde, J.; Thakur, V.; Waghmare, B.; Chaudhari, A. Comparison of different machine learning methods to detect fake news. In Innovations in Bio-Inspired Computing and Applications; Abraham, A., Ed.; IBICA 2021. Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2022; p. 419. [Google Scholar] [CrossRef]
- Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms; MIT Press: Cambridge, MA, USA, 2022. [Google Scholar]
- Edward, S. Tesla Stock Close Price Prediction Using KNNR, DTR, SVR, and RFR. J. Hum. Earth Future 2022, 3, 403–422. [Google Scholar] [CrossRef]
- Roy, D.K.; Sarkar, T.K.; Kamar, S.S.A.; Goswami, T.; Muktadir, M.A.; Al-Ghobari, H.M.; Alataway, A.; Dewidar, A.Z.; El-Shafei, A.A.; Mattar, M.A. Daily Prediction and Multi-Step Forward Forecasting of Reference Evapotranspiration Using LSTM and Bi-LSTM Models. Agronomy 2022, 12, 594. [Google Scholar] [CrossRef]
- Vikas, K.; Kotgire, K.; Reddy, B.; Teja, A.; Reddy, H.; Salvadi, S. An Integrated Approach Towards Stock Price Prediction using LSTM Algorithm. In Proceedings of the 2022 International Conference on Edge Computing and Applications (ICECAA), Tamilnadu, India, 13–15 October 2022; pp. 1696–1699. [Google Scholar] [CrossRef]
- Akinjole, A.; Shobayo, O.; Popoola, J.; Okoyeigbo, O.; Ogunleye, B. Ensemble-Based Machine Learning Algorithm for Loan Default Risk Prediction. Mathematics 2024, 12, 3423. [Google Scholar] [CrossRef]
- Benidis, K.; Rangapuram, S.S.; Flunkert, V.; Wang, Y.; Maddix, D.; Turkmen, C.; Januschowski, T. Deep learning for time series forecasting: Tutorial and literature survey. ACM Comput. Surv. 2022, 55, 1–36. [Google Scholar] [CrossRef]
- Mehmood, F.; Ahmad, S.; Whangbo, T.K. An Efficient Optimization Technique for Training Deep Neural Networks. Mathematics 2023, 11, 1360. [Google Scholar] [CrossRef]
- Ghadimi, N.; Akbarimajd, A.; Shayeghi, H.; Abedinia, O. Improving time series forecasting using LSTM and attention models. J. Ambient. Intell. Humaniz. Comput. 2022, 13, 673–691. [Google Scholar] [CrossRef]
- Naidu, G.; Zuva, T.; Sibanda, E.M. A review of evaluation metrics in machine learning algorithms. In Artificial Intelligence Application in Networks and Systems; Silhavy, R., Silhavy, P., Eds.; CSOC 2023. Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2023; Volume 724, pp. 19–32. [Google Scholar] [CrossRef]
- Shobayo, O.; Adeyemi-Longe, S.; Popoola, O.; Ogunleye, B. Innovative Sentiment Analysis and Prediction of Stock Price Using FinBERT, GPT-4 and Logistic Regression: A Data-Driven Approach. Big Data Cogn. Comput. 2024, 8, 143. [Google Scholar] [CrossRef]
- Espíritu Pera, J.A.; Ibañez Diaz, A.O.; García López, Y.J.; Taquía Gutiérrez, J.A. Prediction of Peruvian companies’ stock prices using machine learning. In Proceedings of the First Australian International Conference on Industrial Engineering and Operations Management, Sydney, Australia, 20–22 December 2022; IEOM Society International: Southfield, Michigan, 2022; pp. 20–21. [Google Scholar]
- Tawakuli, A.; Havers, B.; Gulisano, V.; Kaiser, D. Survey: Time-series data preprocessing: A survey and an empirical analysis. J. Eng. Res. 2024. [Google Scholar] [CrossRef]
- Ibrahim, K.S.M.H.; Huang, Y.F.; Ahmed, A.N.; Koo, C.H.; El-Shafie, A. A review of the hybrid artificial intelligence and optimization modelling of hydrological streamflow forecasting. Alex. Eng. J. 2022, 61, 279–303. [Google Scholar] [CrossRef]
- López-González, J.; Peña, D.; Zamar, R. Detecting and handling outliers in financial time series: Methods and applications. J. Financ. Econom. 2023, 21, 345–369. [Google Scholar]
- Zhu, X.; Zhang, Y.; Li, X. The impact of non-stationarity on machine learning models in financial time series. Quant. Financ. 2024, 24, 95–110. [Google Scholar]
- Lewis, C.D. Industrial and Business Forecasting Methods: A Practical Guide to Exponential Smoothing and Curve Fitting; Butterworth Scientific: London, UK; Boston, MA, USA, 1982. [Google Scholar]
- Song, R.; Shu, M.; Zhu, W. The 2020 global stock market crash: Endogenous or exogenous? Phys. A Stat. Mech. Appl. 2022, 585, 126425. [Google Scholar] [CrossRef]
- Salem, F.M.; Salem, F.M. Recurrent neural networks (RNN): From simple to gated architectures. In Recurrent Neural Networks; Springer: Cham, Switzerland, 2022; pp. 43–67. [Google Scholar]
- Moodi, F.; Jahangard-Rafsanjani, A.; Zarifzadeh, S. Feature selection and regression methods for stock price prediction using technical indicators. arXiv 2023, arXiv:2310.09903. [Google Scholar]
- Gupta, A.; Kapil, D.; Jain, A.; Negi, H.S. Using neural network for financial forecasting. In Proceedings of the 2024 5th International Conference for Emerging Technology (INCET), Belgaum, India, 24–26 May 2024; pp. 1–6. [Google Scholar]
- Swamy, S.R.; Rajgoli, S.R.; Hegde, T. Stock market prediction with machine learning: A comprehensive review. Indiana J. Multidiscip. Res. 2024, 4, 265–271. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhong, W.; Li, Y.; Wen, L. A deep learning prediction model of DenseNet-LSTM for concrete gravity dam deformation based on feature selection. Eng. Struct. 2023, 295, 116827. [Google Scholar] [CrossRef]
- Mba, J.C. Assessing portfolio vulnerability to systemic risk: A vine copula and APARCH-DCC approach. Financ. Innov. 2024, 10, 20. [Google Scholar] [CrossRef]
- Janardhan, N.; Kumaresh, N. Enhancing the early prediction of depression among adolescent students using dynamic ensemble selection of classifiers approach based on speech recordings. Int. J. Early Child. Spec. Educ. 2022, 14, 1–21. [Google Scholar] [CrossRef]
- Lønning, K.; Caan, M.W.; Nowee, M.E.; Sonke, J.J. Dynamic recurrent inference machines for accelerated MRI-guided radiotherapy of the liver. Comput. Med. Imaging Graph. 2024, 113, 102348. [Google Scholar] [CrossRef]
- Muralidhar, K.S.V. Demystifying R-Squared and Adjusted R-Squared. 2023. Available online: https://builtin.com/data-science/adjusted-r-squared (accessed on 7 June 2024).
- Liu, Y. Stock prediction using LSTM and GRU. In Proceedings of the 2022 6th Annual International Conference on Data Science and Business Analytics (ICDSBA), Changsha, China, 14–18 October 2022; pp. 1–6. [Google Scholar] [CrossRef]
- Gheisari, M.; Ebrahimzadeh, F.; Rahimi, M.; Moazzamigodarzi, M.; Liu, Y.; Dutta Pramanik, P.K.; Kosari, S. Deep learning: Applications, architectures, models, tools, and frameworks: A comprehensive survey. CAAI Trans. Intell. Technol. 2023, 8, 581–606. [Google Scholar] [CrossRef]
- Mohr, F.; van Rijn, J.N. Learning curves for decision making in supervised machine learning: A survey. arXiv 2022, arXiv:2201.12150. [Google Scholar] [CrossRef]
- Li, C.; Song, Y. Applying LSTM model to predict the Japanese stock market with multivariate data. J. Comput. 2024, 35, 27–38. [Google Scholar]
- Dhafer, A.H.; Mat Nor, F.; Alkawsi, G.; Al-Othmani, A.Z.; Ridzwan Shah, N.; Alshanbari, H.M.; Baashar, Y. Empirical analysis for stock price prediction using NARX model with exogenous technical indicators. Comput. Intell. Neurosci. 2022, 2022, 9208640. [Google Scholar] [CrossRef]
- Poernamawatie, F.; Susipta, I.N.; Winarno, D. Sharia Bank of Indonesia stock price prediction using long short-term memory. J. Econ. Financ. Manag. Sci. (JEFMS) 2024, 7, 4777. [Google Scholar] [CrossRef]
Author | Algorithms Used | Study Procedure | Outcome | Study Limitation |
---|---|---|---|---|
Abuein, Q.Q. et al. [2] | LSTM and Support Vector Regression (SVR) | LSTM showed lower MSE, RMSE, and MAE values compared to SVR. | LSTM outperformed and captured complex trends, while SVR struggled with the underlying complex trend. | Limited to two comparative models (LSTM and SVR) without comparison to RNN. |
Zulfiker, M.S. et al. [5] | LSTM, SVR, and Vector Autoregression (VAR) | LSTM had higher prediction accuracy and a better R-squared score than SVR. | Our study includes technical indicators such as moving averages and volatility, as well as capturing non-linear relationships better. | Limited to the addition of a linear statistical model without technical indicators or comparison to RNN. |
Lakshminarayanan, S.K. et al. [6] | SVM and LSTM | LSTM achieved better MAPE and R-squared scores than SVM. | LSTM outperformed due to its handling of temporal dependencies. | Limited to SVM, which classifies and cannot directly predict continuous values. |
Pashankar, S.S. et al. [7] | Linear regression, Random Forest, SVR | SVR demonstrated good short-term predictions but struggled with long-term trends. | LSTM had better results in long-term prediction accuracy, especially with large datasets. | Limited to short-term predictions with traditional models; lack of focus on long-term dependencies and non-linear data. |
Chhajer, P. et al. [8] | LSTM, SVM, and ANN | LSTM outperformed SVM and ANN for non-linear time-series data. | LSTM provided superior long-term performance compared to SVM and ANN when trained on multiple input features. | Models lacked diverse input features as variables. |
Shangshang, J. [9] | LSTM, ARIMA, SVR, GRU | LSTM significantly outperformed ARIMA, GRU, and SVR in long-term predictions. | LSTM captured both long-term trends and non-linear relationships better than RNN. | Limited focus on long-term trends, with ARIMA struggling in non-linear tasks. |
Dataset Split | Split% Per Feature | Dataset/Observations Per Feature |
---|---|---|
Training and Validation Set | 85% of 3573 | 3037 |
Testing Set | 15% of 3573 | 536 |
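The split in the table above is chronological, so the test set is strictly later in time than the training/validation set. A minimal sketch of the arithmetic:

```python
# Chronological 85/15 split of the 3573 observations (no shuffling, so the
# test set lies strictly after the training/validation period).
n_obs = 3573
split_at = int(n_obs * 0.85)           # 3037 rows for training/validation
indices = list(range(n_obs))
train_val = indices[:split_at]
test = indices[split_at:]              # remaining 536 rows for testing
print(len(train_val), len(test))
```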
Hyperparameter | Value Range |
---|---|
C | 0.001–1000
Epsilon | 0.001–10 |
Kernel | Linear, Poly, RBF, Sigmoid |
Degree | 2–5 (for Kernel = Poly; default 3)
Hyperparameter | Value Range |
---|---|
Units | 32–128 |
Dropout rates | 0.1–0.5 |
Batch size | 32, 64, 128 |
Learning rate | 0.00001–0.01 |
Optimiser | Adam |
Loss function | Mean Squared Error |
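A minimal sketch of how the hyperparameters in the table above might wire up the recurrent models in Keras. The function name, single-recurrent-layer arrangement, and default values are assumptions for illustration rather than the authors’ exact architecture.

```python
import tensorflow as tf

def build_model(cell: str = "LSTM", units: int = 64, dropout: float = 0.2,
                learning_rate: float = 1e-3, time_steps: int = 30,
                n_features: int = 1) -> tf.keras.Model:
    """Recurrent regressor using the hyperparameter ranges in the table above."""
    layer = tf.keras.layers.LSTM if cell == "LSTM" else tf.keras.layers.SimpleRNN
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(time_steps, n_features)),
        layer(units),                       # 32-128 recurrent units
        tf.keras.layers.Dropout(dropout),   # 0.1-0.5 dropout rate
        tf.keras.layers.Dense(1),           # next-period closing price
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mse")                         # Mean Squared Error, per the table
    return model
```

The same builder covers both the RNN and LSTM variants, so a tuner can treat the cell type, units, dropout rate, and learning rate as a single search space.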
Metric | SVR (Without OBV) | SVR (With OBV) |
---|---|---|
Best hyperparameters | {'C': 0.2520357450941616, 'Epsilon': 0.027026985756138284, 'Kernel': 'Linear'} | {'C': 1.1625895249237326, 'Epsilon': 0.005801399787450704, 'Kernel': 'RBF'} |
Cross-validation training loss | 0.0011 | 0.00133 |
Average final validation loss | 0.00197 | 0.0136 |
Test MAE | 0.0143 (1231.61) | 0.027 (2318.66) |
MSE | 0.000459 (3,396,945.36) | 0.00242 (17,878,018.81) |
RMSE | 0.0214 (1843.08) | 0.0491 (4228.24) |
MAPE (%) | 2.03 | 3.4 |
R-squared | 0.99 | 0.95 |
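The parenthesised figures in this table and the two that follow restate each metric on the original price scale. A minimal sketch of that inverse transformation, assuming a MinMaxScaler was used for normalisation (the function name is illustrative):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.preprocessing import MinMaxScaler

def report(y_true_scaled, y_pred_scaled, scaler: MinMaxScaler) -> dict:
    """Metrics on the [0, 1] scale and, alongside, on the original price scale."""
    y_true = scaler.inverse_transform(y_true_scaled.reshape(-1, 1)).ravel()
    y_pred = scaler.inverse_transform(y_pred_scaled.reshape(-1, 1)).ravel()
    return {
        # (scaled value, original-scale value), as in the tables
        "MAE": (mean_absolute_error(y_true_scaled, y_pred_scaled),
                mean_absolute_error(y_true, y_pred)),
        "RMSE": (np.sqrt(mean_squared_error(y_true_scaled, y_pred_scaled)),
                 np.sqrt(mean_squared_error(y_true, y_pred))),
        # MAPE and R-squared are unaffected by the linear rescaling direction
        "MAPE (%)": float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100),
        "R-squared": r2_score(y_true, y_pred),
    }
```

Because min-max scaling is an affine map, MAE and RMSE simply rescale by the price range, while MAPE and R-squared are (up to the shifted denominator in MAPE) essentially scale-free, which is why only the absolute-error metrics carry two values in the tables.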
RNN Model Variations | ||||||
---|---|---|---|---|---|---|
Evaluation Metrics | 30-Day Time Step (Unnormalised Value) Unoptimised RNN Model | Transformed Original Value (Unoptimised RNN Model) | Optimised RNN (30-Day Time Step Without OBV) * | Optimised RNN (60-Day Time Step Without OBV) ** | Optimised RNN (30-Day Time Step with OBV) *** | Optimised RNN (60-Day Time Step with OBV) ****
MAE | 0.07483 | 6434.68 | 0.011321 (973.52) | 0.012490 (1074.05) | 0.019642 (1689.04) | 0.017465 (1501.84) |
MSE | 0.01043 | 77,130,110.92 | 0.000388 (2,870,386.08) | 0.000613 (4,533,240.36) | 0.000872 (6,447,080.45) | 0.000982 (7,264,937.75) |
RMSE | 0.10213 | 8782.37 | 0.019702 (1694.22) | 0.024760 (2129.14) | 0.029528 (2539.11) | 0.031345 (2695.35) |
MAPE (%) | 11.27 | 11.27 | 1.57 | 1.65 | 2.72 | 2.17 |
R-Squared | 0.78613 | 0.79 | 0.992 | 0.987 | 0.982 | 0.980 |
LSTM Model Variations | ||||||
---|---|---|---|---|---|---|
Evaluation Metrics | 60-Day Time Step (Unnormalised Value) Unoptimised LSTM Model | Transformed Original Value (Unoptimised LSTM Model) | Optimised LSTM (30-Day Time Step Without OBV) * | Optimised LSTM (60-Day Time Step Without OBV) ** | Optimised LSTM (30-Day Time Step with OBV) *** | Optimised LSTM (60-Day Time Step with OBV) **** |
MAE | 0.015733 | 1352.93 | 0.015346 (1319.58) | 0.012397 (1066.05) | 0.014333 (1232.53) | 0.009662 (830.84) |
MSE | 0.000734 | 5,425,600.48 | 0.000745 (5,512,396.17) | 0.000485 (3,586,624.63) | 0.000756 (5,593,662.96) | 0.000343 (2,536,732.36) |
RMSE | 0.027088 | 2329.29 | 0.027303 (2347.85) | 0.022024 (1893.84) | 0.027504 (2365.09) | 0.018522 (1592.71) |
MAPE (%) | 2.02 | 2.02 | 1.96 | 1.61 | 1.96 | 1.33 |
R-squared | 0.985 | 0.980 | 0.984 | 0.990 | 0.984 | 0.993 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shobayo, O.; Adeyemi-Longe, S.; Popoola, O.; Okoyeigbo, O. A Comparative Analysis of Machine Learning and Deep Learning Techniques for Accurate Market Price Forecasting. Analytics 2025, 4, 5. https://doi.org/10.3390/analytics4010005