Artificial Intelligence vs. Efficient Markets: A Critical Reassessment of Predictive Models in the Big Data Era
Abstract
1. Introduction
- How do ensemble methods compare to single classifiers in stock market prediction across different market conditions and time horizons?
- To what extent do hybrid approaches integrating multiple data sources improve predictive performance compared to single-source models?
- How can the apparent contradiction between the Efficient Market Hypothesis and empirical evidence of AI-driven predictability be reconciled through an adaptive market framework?
- What evaluation framework best captures both statistical significance and economic relevance in assessing prediction models?
2. Theoretical Foundations and Historical Evolution of Predictive Models
2.1. Limitations of Linear Econometric Models
- Nonlinear Dynamics: ARMA fails to capture asymmetric volatility clustering observed in S&P 500 returns, where negative shocks induce 43% greater volatility persistence than positive shocks [1]. This violates the assumption of linear shock response.
- High-Dimensional Interactions: The 2010 Flash Crash demonstrated cross-asset correlation jumps exceeding 0.8 within minutes, a phenomenon unmodelable through pairwise linear coefficients [2].
2.2. Neural Network Paradigm Shift
Recurrent Architectures for Temporal Dependencies
2.3. Modern Regularization Techniques
- Temporal Dropout: The random masking of sequence elements during training improved the NASDAQ-100 prediction Sharpe ratio by 0.47 [7].
- Curriculum Learning: Phased training from daily to tick-level data enhanced S&P 500 volatility forecasting, with a 33% RMSE reduction [8].
- Bayesian Hyperparameter Optimization: Tree-structured Parzen Estimators (TPEs) optimized LSTM layers on crude oil futures, achieving 19% lower MAE than grid search [9].
Attention Mechanisms
2.4. Critical Assessment of Prior Research
3. Taxonomy of Stock Market Prediction Techniques
3.1. Statistical Approaches
- Autoregressive Integrated Moving Average (ARIMA): These models combine autoregressive (AR) components, which capture the momentum and mean reversion effects in traded markets, with moving average (MA) components, which model shock effects in time series. The integrated (I) part differences the series to remove trends before the AR and MA terms are fitted.
- Exponential Smoothing Model (ESM): This technique applies an exponential window function to time series data, giving greater weight to recent observations and progressively less weight to older data points.
- Generalized Autoregressive Conditional Heteroskedasticity (GARCH): This model specifically addresses the volatility clustering observed in financial time series, where periods of high volatility tend to be followed by further high volatility.
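As an illustration of the exponential smoothing idea described above, the following is a minimal sketch of simple (single) exponential smoothing. The function name, the choice to initialise with the first observation, and the sample prices are assumptions made for this example, not details taken from the reviewed studies.

```python
def simple_exponential_smoothing(series, alpha):
    """Return s_t = alpha * x_t + (1 - alpha) * s_{t-1} for each observation."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    # Assumption: the smoothed series starts at the first observation.
    smoothed = [float(series[0])]
    for x in series[1:]:
        # Recent observations get weight alpha; older ones decay geometrically.
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


prices = [100.0, 102.0, 101.0, 105.0]  # hypothetical closing prices
smoothed = simple_exponential_smoothing(prices, alpha=0.5)
```

A larger alpha tracks recent prices more closely, while a smaller alpha produces a smoother series that reacts more slowly.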
3.2. Pattern Recognition Methods
- Perceptually Important Points (PIP): This technique reduces time series dimensions by preserving salient points, allowing for more efficient pattern identification.
- Template Matching: This approach matches patterns in current stock data with historical patterns that preceded specific market movements.
- Chart Pattern Recognition: These methods identify familiar chart patterns like gaps, spikes, flags, pennants, wedges, saucers, triangles, and head-and-shoulder formations that technical analysts believe have predictive value.
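The PIP idea above can be sketched as follows. This is one common variant, assumed here for illustration: it starts from the two endpoints and repeatedly adds the point with the largest vertical distance to the line joining its neighbouring selected points. The function name and sample data are hypothetical.

```python
def perceptually_important_points(series, n_points):
    """Return the sorted indices of up to n_points PIPs (vertical-distance variant)."""
    n = len(series)
    if n < 2 or n_points < 2:
        raise ValueError("need at least two observations and two points")
    selected = [0, n - 1]  # the endpoints are always kept
    while len(selected) < min(n_points, n):
        best_idx, best_dist = None, -1.0
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                # Height at position i of the line joining the neighbouring PIPs.
                chord = series[left] + (series[right] - series[left]) * (i - left) / (right - left)
                dist = abs(series[i] - chord)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx is None:  # no interior points left to add
            break
        selected.append(best_idx)
        selected.sort()
    return selected


prices = [1.0, 2.0, 8.0, 3.0, 2.0, 1.5, 7.0, 1.0]  # hypothetical series
pips = perceptually_important_points(prices, n_points=4)
```

For this series the sketch keeps the endpoints and the two peaks, which is the kind of shape-preserving compression that template matching can then operate on.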
3.3. Machine Learning Models
3.3.1. Supervised Learning Methods
- Support Vector Machines (SVMs): These algorithms define optimal hyperplanes for separating data into different classes.
- Decision Trees and Random Forests: Tree-based algorithms that create hierarchical decision structures based on feature values.
- Artificial Neural Networks (ANNs): Computational models inspired by the structure of biological neural networks.
- Deep Learning Models: More complex neural network architectures including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Long Short-Term Memory (LSTM) networks.
- Gradient-Boosting Methods: Techniques like XGBoost and AdaBoost that create strong predictive models by combining multiple weak learners.
3.3.2. Unsupervised Learning Methods
- K-means Clustering: Groups data points into clusters based on similarity.
- Hierarchical Clustering: Creates a hierarchy of clusters using either agglomerative or divisive approaches.
- Principal Component Analysis (PCA): Reduces dimensionality while preserving data variance.
3.4. Sentiment Analysis
- Lexicon-Based Methods: Use predefined dictionaries to assess the sentiment of words and phrases.
- Machine Learning-Based Sentiment Analysis: Employs supervised learning to classify text sentiment.
- Deep Learning for Sentiment Analysis: Utilizes neural network architectures like BERT (Bidirectional Encoder Representations from Transformers) for more nuanced sentiment analysis.
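To make the lexicon-based approach concrete, here is a minimal sketch. The six-word lexicon, the scoring rule (average polarity of matched words), and the sample headlines are all assumptions for illustration. Published studies rely on curated financial dictionaries rather than a toy list like this one.

```python
import re

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
LEXICON = {"gain": 1, "beat": 1, "upgrade": 1, "loss": -1, "miss": -1, "downgrade": -1}


def lexicon_sentiment(text, lexicon=LEXICON):
    """Average polarity of the lexicon words found in text; 0.0 if none match."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [lexicon[token] for token in tokens if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


positive = lexicon_sentiment("Analysts upgrade the stock after earnings beat")
negative = lexicon_sentiment("Shares fall on quarterly loss and downgrade")
neutral = lexicon_sentiment("The company held its annual meeting")
```

The sketch ignores negation, word inflections, and context, which is one reason the machine learning and deep learning methods listed above are often preferred.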
4. Statistical Approaches in Stock Market Prediction
4.1. ARIMA Models
4.2. Exponential Smoothing Methods
4.3. Regression Models
4.4. Limitations of Statistical Approaches
- Linearity Assumptions: Many statistical methods assume linear relationships between variables, whereas stock markets frequently exhibit nonlinear dynamics.
- Stationarity Requirements: Models like ARIMA require data to be stationary, but financial time series often display changing statistical properties over time.
- Difficulty with Exogenous Variables: Traditional time series models may struggle to incorporate external factors such as news events or broader economic indicators.
- Limited Capability for Pattern Recognition: Statistical models typically cannot capture complex visual patterns that technical analysts identify in price charts.
4.5. Data Considerations Across Prediction Studies
4.5.1. Common Data Sources
- Alternative Data: Increasingly, studies incorporate non-traditional data sources. Zhang et al. [24] utilized weather data to predict energy commodity prices.
4.5.2. Preprocessing Approaches
- Cleaning and Normalization: Financial time series often contain missing values and outliers and require normalization. Common approaches include z-score standardization, min–max scaling, and missing value imputation using methods like forward filling or MICE (Multiple Imputation by Chained Equations).
- Feature Engineering: Raw financial data are transformed into predictive features. Technical indicators (e.g., RSI, MACD, Bollinger Bands) are commonly derived from price data. Hu et al. [25] created Google Trends indicators by calculating search intensity changes, while Chen and Chen [26] identified perceptually important points in price series to reduce dimensionality.
- Dimensionality Reduction: Given the high-dimensional nature of financial data, techniques like Principal Component Analysis (PCA) or autoencoders are often employed. Bao et al. [27] utilized stacked autoencoders to compress high-dimensional features before feeding them into LSTM networks.
- Temporal Alignment: Aligning data from diverse sources with different frequencies (e.g., daily price data with quarterly fundamentals) presents significant challenges. Chen and Hao [21] addressed this through temporal aggregation and forward-filling techniques.
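The cleaning and normalization steps listed above can be sketched in a few lines. The helper names and the sample series are assumptions for this example. Missing values are represented as None, and any leading missing values are left unfilled because there is no earlier observation to carry forward.

```python
from statistics import mean, pstdev


def forward_fill(values):
    """Replace each None with the most recent non-missing value."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def z_score(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


raw = [10.0, None, 12.0, None, 14.0]  # hypothetical series with gaps
filled = forward_fill(raw)
scaled = min_max(filled)
standardized = z_score(filled)
```

In a forecasting setting, the scaling parameters (mean, standard deviation, minimum, maximum) should be estimated on the training window only and then applied to later data. Otherwise future information leaks into the features.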
4.5.3. Alternative Data Processing
- Image Processing: Satellite imagery typically undergoes segmentation, feature extraction, and object detection before integration with financial data.
- Sensor Data: IoT sensor data often require noise filtering, aggregation, and anomaly detection. Ma et al. [29] processed industrial sensor data through Fourier transformations before using them to predict commodity price movements.
5. Pattern Recognition in Stock Market Analysis
5.1. Perceptually Important Points
5.2. Template Matching
5.3. Advanced Pattern Recognition
5.4. Effectiveness and Limitations
- Subjectivity: Pattern definitions may vary between analysts, leading to inconsistent results.
- Overfitting Risk: Systems may be optimized to recognize patterns in historical data that lack predictive value for future movements.
- Changing Market Dynamics: Patterns that were predictive in the past may lose effectiveness as market structures and participant behaviors evolve.
- Limited Theoretical Foundation: Unlike statistical models, pattern recognition often lacks strong theoretical justification in financial economics.
6. Machine Learning Approaches
6.1. Supervised Learning Methods
6.1.1. Support Vector Machines
6.1.2. Decision Trees and Random Forests
- Random Forest: An ensemble of decision trees that has demonstrated strong performance across multiple stock prediction studies. Lohrmann and Luukka [37] applied Random Forest to classify intraday S&P 500 returns with high accuracy.
- Gradient Boosting: Methods like XGBoost and AdaBoost have shown excellent performance in stock prediction. Dey et al. [38] applied XGBoost to predict stock direction, achieving accuracies of 87–99% for the long-term prediction of Apple and Yahoo stocks.
- Bagging Methods: Ampomah et al. [39] evaluated tree-based ensemble machine learning models in predicting stock price direction, finding that ensemble methods consistently outperformed individual classifiers.
6.1.3. Artificial Neural Networks
6.1.4. Deep Learning Models
- Recurrent Neural Networks (RNNs): Bernal et al. [32] implemented Echo State Networks (a subclass of RNNs) to predict S&P 500 stock prices, outperforming traditional techniques with very low test error.
- Long Short-Term Memory (LSTM): Di Persio and Honchar [46] compared basic RNNs, LSTM, and Gated Recurrent Units (GRUs) for Google stock price prediction, finding that LSTM outperformed other variants with 72% accuracy on a five-day horizon.
- Convolutional Neural Networks (CNNs): Sezer and Ozbayoglu [47] developed a CNN-based approach for financial trading, converting time series data to image representations to leverage the CNN’s pattern recognition capabilities.
6.2. Unsupervised Learning Methods
- Clustering Methods: Powell et al. [49] compared K-means clustering with SVM for stock prediction, finding similar performance between the two approaches. The study highlighted the importance of distance metric selection for clustering effectiveness.
- Association Rule Learning: Wu et al. [50] proposed a model combining K-means clustering with the AprioriAll algorithm to extract frequent patterns and predict stock trends, outperforming other approaches in terms of average returns.
- Hybrid Unsupervised Approaches: Babu et al. [51] proposed a clustering method called HAK that combines Hierarchical Agglomerative Clustering and reverse K-means clustering to predict the impact of financial reports on stocks, outperforming SVMs in terms of accuracy.
6.3. Comparative Analysis of Machine Learning Models
7. Sentiment Analysis for Stock Prediction
7.1. News-Based Sentiment Analysis
7.2. Social Media-Based Sentiment Analysis
7.3. Search Volume Analysis
7.4. Combined Sentiment Approaches
8. Hybrid and Advanced Approaches
8.1. Hybrid Technical Models
8.2. Multimodal Data Integration
8.3. Combined Technical and Fundamental Analysis
8.4. Advanced Deep Learning Architectures
8.5. Graph Neural Networks for Stock Prediction
8.6. Reinforcement Learning for Trading
9. Evaluation Methodologies
9.1. Classification Performance Metrics
- Accuracy: The percentage of correct predictions, typically calculated as (TP + TN)/(TP + TN + FP + FN).
- Precision: The proportion of true positive predictions out of all positive predictions, calculated as TP/(TP + FP).
- Recall: The proportion of true positive predictions out of all actual positives, calculated as TP/(TP + FN).
- F1 Score: The harmonic mean of precision and recall, providing a balance between the two metrics.
- Area Under the ROC Curve (AUC): Measures the model’s ability to distinguish between classes across different threshold settings.
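The formulas above translate directly into code. The sketch below computes accuracy, precision, recall, and F1 from confusion-matrix counts. The function name and the example counts are hypothetical, and returning 0.0 for undefined ratios is a convention assumed here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for an up/down direction classifier.
metrics = classification_metrics(tp=40, tn=30, fp=10, fn=20)
```

With these counts, accuracy is 0.70 and precision is 0.80, but recall is only about 0.67, which shows why accuracy alone can hide how many upward moves a model misses.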
9.2. Financial Performance Metrics
- Returns: Measures investment performance, including cumulative return, annual return, and risk-adjusted return.
- Sharpe Ratio: Evaluates risk-adjusted performance by comparing excess returns to volatility.
- Maximum Drawdown: Shows the largest peak-to-trough decline in portfolio value, indicating downside risk.
- Win Rate: Calculates the percentage of profitable trades, indicating the consistency of returns.
- Profit Factor: The ratio of gross profits to gross losses; values above 1 indicate that a strategy is profitable overall.
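A minimal sketch of the financial metrics above follows. The function names, the annualization by 252 trading periods, the zero default risk-free rate, and the sample data are assumptions for illustration.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def win_rate(trade_pnls):
    """Fraction of trades with a strictly positive profit."""
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)


def profit_factor(trade_pnls):
    """Gross profits divided by gross losses (infinite if there are no losses)."""
    gains = sum(pnl for pnl in trade_pnls if pnl > 0)
    losses = -sum(pnl for pnl in trade_pnls if pnl < 0)
    return gains / losses if losses else math.inf


period_returns = [0.10, -0.50, 0.20]  # hypothetical per-period returns
trade_pnls = [100, -50, 30, -20]      # hypothetical per-trade profits
drawdown = max_drawdown(period_returns)
wins = win_rate(trade_pnls)
factor = profit_factor(trade_pnls)
sharpe = sharpe_ratio([0.01, -0.02, 0.015, 0.005])
```

In this example half of the trades are profitable and the profit factor is above 1, yet the equity curve still suffers a 50% drawdown, so the metrics are best read together rather than in isolation.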
9.3. Statistical vs. Economic Significance in Model Evaluation
9.3.1. Distinguishing Statistical from Economic Significance
- Statistical Significance: Measures whether a result differs from what would be expected under the null hypothesis (typically assessed using p-values or confidence intervals).
- Economic Significance: Measures whether a result matters in practical terms, considering implementation costs, risk adjustment, and real-world constraints.
9.3.2. Appropriate Statistical Tests for Financial Time Series
- Non-normality: Returns distributions typically exhibit fat tails and skewness. Shapiro–Wilk or Jarque–Bera tests should be used to check normality before applying parametric tests.
- Serial Dependence: Financial returns often show autocorrelation, heteroskedasticity, and other forms of serial dependence. Ljung–Box or ARCH tests should be applied to verify independence assumptions.
- Non-stationarity: The statistical properties of financial time series change over time. Augmented Dickey–Fuller or KPSS tests can assess stationarity.
- Diebold–Mariano Test [73]: Tests whether two forecasting methods differ significantly in accuracy while accounting for serial correlation in forecast errors.
- Model Confidence Set [74]: Identifies the set of models that are statistically indistinguishable from the best model, providing a robust way to compare multiple forecasting methods.
- Giacomini–White Test [77]: Evaluates conditional predictive ability, which is more relevant for time-varying models.
- Bootstrap Methods [78]: Provide distribution-free inference when the underlying distributions are unknown or complex.
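As an illustration of the forecast-comparison tests listed above, the following is a simplified sketch of the Diebold–Mariano statistic. It assumes one-step-ahead forecasts and squared-error loss, and it uses only the lag-0 variance of the loss differential. For multi-step forecasts a long-run (HAC) variance estimate would be needed instead. The function name and the sample errors are hypothetical.

```python
import math
from statistics import NormalDist, mean


def diebold_mariano(errors_a, errors_b):
    """Simplified DM test: returns (statistic, two-sided asymptotic p-value)."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two error series of equal length (at least 2)")
    # Loss differential under squared-error loss: negative values favour model A.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = mean(d)
    # Assumption: one-step-ahead forecasts, so only the lag-0 variance is used.
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    if var_d == 0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value


errors_a = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6]  # hypothetical forecast errors, model A
errors_b = [1.0, -1.2, 0.9, -1.1, 1.3, -0.8]  # hypothetical forecast errors, model B
dm_stat, dm_p = diebold_mariano(errors_a, errors_b)
```

A negative statistic indicates that model A has the lower average squared error. With only six observations the normal approximation is rough, so this example illustrates the mechanics only.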
9.3.3. Multiple Hypothesis Testing in Financial Prediction
- With a standard significance level of 0.05, testing 100 independent strategies that have no real predictive power would be expected to yield 5 “significant” results by pure chance.
- Financial research often implicitly tests thousands of combinations, leading to a massive multiple testing problem that standard p-values fail to address.
- Bonferroni Correction: Adjusts the significance threshold by dividing it by the number of tests (e.g., for 100 tests, significance threshold becomes 0.05/100 = 0.0005).
- Benjamini–Hochberg Procedure [80]: Controls the false discovery rate (FDR) rather than the family-wise error rate, offering more power while limiting false positives.
- Holm’s Step-Down Procedure [81]: Provides stronger error control than Benjamini–Hochberg but is less conservative than Bonferroni.
- False Discovery Proportion Control [82]: Limits the proportion of false discoveries while maximizing true discoveries.
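The three most common corrections above can be compared on a single set of p-values. The sketch below is a minimal implementation. The function names and the example p-values are assumptions for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Step-down: compare the k-th smallest p-value with alpha / (m - k + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):  # rank is 0-based here
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up: reject the k smallest p-values, k = max rank with p <= alpha * rank / m."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff = rank
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject


p_values = [0.001, 0.012, 0.02, 0.035, 0.30]  # hypothetical strategy p-values
rejected_bonferroni = bonferroni(p_values)
rejected_holm = holm(p_values)
rejected_bh = benjamini_hochberg(p_values)
```

On these p-values Bonferroni keeps one strategy, Holm keeps two, and Benjamini–Hochberg keeps four, which mirrors the ordering from most to least conservative described above.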
9.3.4. Structural Breaks and Time-Varying Parameters
- Parameters estimated from past data may no longer apply in the current market environment.
- Significant relationships may reverse or disappear following structural breaks.
- Traditional statistical tests assume parameter stability, leading to false conclusions when this assumption is violated.
- Andrews–Quandt Test [84]: Detects unknown structural breakpoints in time series regression.
- Bai–Perron Test [85]: Identifies multiple structural breaks in time series data.
- Time-Varying Parameter Models [86]: Allow coefficients to evolve gradually over time, capturing changing relationships.
- Regime-Switching Models [87]: Model discrete shifts between different market regimes with distinct parameters.
9.3.5. Comprehensive Framework for Model Evaluation
9.3.6. Critical Assessment of Key Studies
- Brock et al. (1992) [89]: While finding statistical significance for technical trading rules on the Dow Jones Index, they did not account for multiple testing across the many rule parameterizations. Sullivan et al. [90] later showed that after proper multiple testing adjustments, few rules remained significant.
- Lo et al. (2000) [91]: They identified statistically significant technical patterns in US stocks but reported economic gains of only 0.7–2.2%, likely insufficient to overcome transaction costs. No adjustment was made for multiple testing across the many patterns examined.
- Fischer and Krauss (2018) [6]: They reported impressive performance of LSTM networks for S&P 500 stock prediction but did not systematically evaluate robustness across different market regimes. Their strongest results came during the unusual post-2008 bull market, raising questions about generalizability.
- Gu et al. (2020) [71]: While employing proper out-of-sample validation and demonstrating superior performance of neural networks, their analysis did not fully account for survivorship bias in the dataset or evaluate performance under different market regimes.
9.3.7. Implications for AI-Based Stock Prediction Research
- The flexibility of deep learning models, with numerous hyperparameters and architectural choices, exacerbates multiple testing concerns. Researchers should report all model variations attempted and apply appropriate corrections.
- The black-box nature of complex AI models makes it difficult to distinguish genuine pattern discovery from overfitting. Techniques like feature importance analysis, partial dependence plots, and SHAP values can help assess whether models are capturing economically plausible relationships.
- AI models trained on specific market regimes may fail to generalize to new conditions. Time-stratified validation, where models are tested on distinct market regimes (e.g., high/low volatility, bull/bear markets), provides more realistic performance estimates.
- Ensemble approaches that combine multiple models with different assumptions may provide more robust predictions than single models, mitigating the risk of statistical flukes.
9.3.8. Practical Implementation Guide for Comprehensive Evaluation
1. Initial Statistical Assessment:
   - Establish baseline performance using traditional metrics (accuracy, precision, F1-score)
   - Conduct appropriate statistical tests based on data characteristics:
     - For normally distributed prediction errors: t-tests or ANOVA
     - For non-normal distributions: Wilcoxon signed-rank or Mann–Whitney U tests
     - For time series predictions: Diebold–Mariano test to compare forecast accuracy
   - Apply multiple testing corrections based on the number of model configurations tested:
     - Bonferroni correction for <10 model configurations (conservative)
     - Benjamini–Hochberg procedure for larger numbers (controls false discovery rate)
     - White’s Reality Check or Hansen’s SPA for comparing multiple models to a benchmark
2. Economic Significance Testing:
   - Implement trading simulations incorporating realistic assumptions:
     - Transaction costs (variable by market capitalization and volume)
     - Market impact models (especially for large positions)
     - Execution delays and slippage
   - Calculate economic performance metrics:
     - Risk-adjusted returns (Sharpe ratio, Sortino ratio)
     - Maximum drawdown and recovery periods
     - Win/loss ratios and profit factors
   - Compare to appropriate economic benchmarks:
     - Risk-matched buy-and-hold portfolios
     - Industry or factor-based portfolios
     - Analyst consensus forecasts
3. Robustness Testing:
   - Test performance across distinct market regimes:
     - Bull vs. bear markets
     - High vs. low volatility periods
     - Rising vs. falling interest rate environments
   - Conduct sensitivity analysis:
     - Parameter perturbation tests
     - Feature importance analysis
     - Random seed variation for stochastic models
   - Perform walk-forward validation:
     - Expanding window approach for growing datasets
     - Rolling window approach for maintaining consistent training size
     - Purged cross-validation to prevent information leakage
4. Implementation Feasibility Assessment:
   - Evaluate computational requirements:
     - Training time and hardware requirements
     - Inference latency for time-sensitive applications
     - Memory and storage requirements
   - Assess scalability constraints:
     - Position sizing and liquidity limitations
     - Strategy capacity estimates
     - Market impact with increased capital deployment
   - Consider operational requirements:
     - Data collection and processing pipeline
     - Model retraining frequency
     - Monitoring and failover systems
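The trading-simulation step of the guide above (economic significance testing) can be sketched with a simple cost model. This sketch assumes a proportional cost charged on every change in position and ignores market impact, execution delay, and slippage, which a realistic simulation would also need. The function name, the cost level, and the sample data are hypothetical.

```python
def net_strategy_returns(positions, asset_returns, cost_per_unit_turnover=0.001):
    """Per-period strategy returns after proportional transaction costs.

    positions[t] is the position held over period t (for example -1, 0 or +1),
    decided before asset_returns[t] is observed.
    """
    net, previous = [], 0.0
    for position, asset_return in zip(positions, asset_returns):
        turnover = abs(position - previous)  # size of the trade needed this period
        net.append(position * asset_return - cost_per_unit_turnover * turnover)
        previous = position
    return net


positions = [1, 1, -1, 0]                  # hypothetical model signals
asset_returns = [0.01, -0.02, 0.03, 0.01]  # hypothetical asset returns
net = net_strategy_returns(positions, asset_returns)
gross_total = sum(p * r for p, r in zip(positions, asset_returns))
net_total = sum(net)
```

Comparing `gross_total` with `net_total` shows how much of a model's apparent edge is consumed by trading, which matters most for high-turnover strategies.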
9.4. Comprehensive Benchmarking Approaches
9.4.1. Traditional Financial Models
- Factor Models: The Fama–French three-factor and five-factor models account for size, value, profitability, and investment patterns in stock returns. These models provide more rigorous benchmarks than market indices alone. As demonstrated by Gu et al. [71], comparing AI predictions against factor model forecasts helps isolate the incremental value of machine learning approaches.
- ARIMA and GARCH Variants: These traditional time series models capture autoregressive patterns and volatility clustering. Wu and Chen [48] showed that ARIMA models remain competitive with LSTM networks for longer-horizon forecasts, making them valuable benchmark comparisons.
- Econometric Models: Vector Autoregression (VAR) models, Error Correction Models (ECMs), and other econometric approaches provide theoretically grounded benchmarks. Rapach et al. [92] employed VAR models incorporating multiple economic variables as benchmarks for forecasting aggregate market returns.
9.4.2. Expert and Consensus Forecasts
- Analyst Consensus Estimates: Aggregated forecasts from financial analysts provide benchmarks that incorporate fundamental analysis and domain expertise. Bradshaw et al. [93] demonstrated that consensus analyst forecasts contain information not captured by quantitative models alone.
- Survey-Based Forecasts: Surveys of professional forecasters, such as the Survey of Professional Forecasters (SPF) or the Wall Street Journal Economic Forecasting Survey, offer alternative benchmarks for macroeconomic variables that influence markets.
- Market-Implied Forecasts: Options-implied volatility, forward rates, and other market-derived forecasts represent the collective wisdom of market participants. Christoffersen et al. [94] showed that option-implied volatility forecasts often outperform statistical models, making them valuable benchmarks.
9.4.3. Industry-Specific Models
- Commodity Markets: Models incorporating storage theory, convenience yield, and seasonality patterns provide appropriate benchmarks for commodity-related stocks. Cheng and Xiong [95] developed commodity-specific benchmarks that outperform general financial models for resource sector stocks.
- Financial Institutions: Models incorporating factors like yield curve dynamics, credit spreads, and regulatory capital constraints provide suitable benchmarks for bank stocks. English et al. [96] demonstrated that specialized models accounting for interest rate sensitivity offer superior benchmarks for financial institution stocks.
- Technology Sector: Growth models incorporating network effects, R&D productivity, and technology adoption cycles provide appropriate benchmarks for technology stocks. Pastor and Veronesi [97] developed technology-sector-specific benchmarks that capture the unique valuation dynamics of high-growth technology firms.
9.5. Simulation-Based Evaluation
- Backtesting: Testing a model on historical data to simulate trading decisions and evaluate financial outcomes.
- Out-of-Sample Testing: Evaluating models on data not used for training to assess generalization performance.
- Walk-Forward Analysis: A sequential testing approach where models are retrained as new data become available.
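Walk-forward analysis can be sketched as a generator of train/test index ranges that always keeps the test block after the training block. The function name and parameters below are assumptions for illustration. The `expanding` flag switches between a growing training window and a fixed-size rolling window.

```python
def walk_forward_splits(n_samples, initial_train, test_size, expanding=True):
    """Return a list of (train_range, test_range) pairs in chronological order."""
    if initial_train < 1 or test_size < 1:
        raise ValueError("initial_train and test_size must be positive")
    splits = []
    start = initial_train
    while start + test_size <= n_samples:
        # Expanding: train on everything so far. Rolling: keep the window size fixed.
        train_start = 0 if expanding else start - initial_train
        splits.append((range(train_start, start), range(start, start + test_size)))
        start += test_size
    return splits


expanding_splits = walk_forward_splits(10, initial_train=4, test_size=2)
rolling_splits = walk_forward_splits(10, initial_train=4, test_size=2, expanding=False)
```

With 10 observations this yields three folds in each mode. A model would be refit on each training range and scored only on the test range that follows it. The sketch does not purge or embargo observations near the boundary, which may be needed when labels span several periods.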
9.6. Statistical Validation Techniques
- Cross-Validation: Dividing data into multiple subsets for training and validation to ensure consistent performance.
- Bootstrap Resampling: Generating multiple datasets by sampling with replacement to assess model stability.
- Statistical Hypothesis Testing: Comparing model performance against random predictions or simple benchmarks to establish statistical significance.
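Bootstrap resampling can be illustrated with a percentile confidence interval for the mean return. The sketch below resamples observations independently, which is an assumption that ignores serial dependence. For autocorrelated returns a block bootstrap would be more appropriate. The function name, the seed, and the sample returns are hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_ci(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lower_idx = int((1.0 - confidence) / 2.0 * n_resamples)
    upper_idx = int((1.0 + confidence) / 2.0 * n_resamples) - 1
    return means[lower_idx], means[upper_idx]


daily_returns = [0.012, -0.008, 0.004, 0.015, -0.011, 0.007, 0.002, -0.003]  # hypothetical
ci_low, ci_high = bootstrap_mean_ci(daily_returns)
```

If the resulting interval contains zero, the sample gives little evidence that the average return differs from zero, however attractive the point estimate looks.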
9.7. Benchmark Comparisons
- Buy-and-Hold Strategy: A passive investment approach that serves as a common benchmark.
- Simple Technical Indicators: Basic trading rules based on moving averages or other common indicators.
- Market Indices: Comparison against relevant market indices to assess relative performance.
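The first two benchmarks above can be sketched side by side. The moving-average rule below (long when the previous close is above its simple moving average, otherwise flat) is one illustrative rule assumed for this example, and it ignores transaction costs. The function names and prices are hypothetical.

```python
def cumulative_return(returns):
    """Compound a sequence of simple per-period returns."""
    equity = 1.0
    for r in returns:
        equity *= 1.0 + r
    return equity - 1.0


def buy_and_hold_returns(prices, start):
    """Simple returns from holding the asset from period start onwards."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(start, len(prices))]


def sma_rule_returns(prices, window):
    """Long for period t if the close at t-1 is above the SMA of the prior window."""
    out = []
    for t in range(window, len(prices)):
        sma = sum(prices[t - window:t]) / window  # uses prices up to t-1 only
        position = 1.0 if prices[t - 1] > sma else 0.0
        out.append(position * (prices[t] / prices[t - 1] - 1.0))
    return out


prices = [100.0, 102.0, 104.0, 103.0, 101.0, 105.0]  # hypothetical closes
window = 2
rule_total = cumulative_return(sma_rule_returns(prices, window))
hold_total = cumulative_return(buy_and_hold_returns(prices, start=window))
```

Both series are evaluated over the same periods so the comparison is like for like. In this toy example the passive benchmark ends ahead of the trading rule, which is the kind of result a prediction model has to beat before its accuracy figures mean much.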
9.8. Multi-Criteria Evaluation
- Risk–Return Analysis: Evaluating both returns and associated risks to provide a more complete performance picture.
- Performance Across Market Regimes: Assessing how models perform in different market conditions (bull markets, bear markets, sideways markets).
- Consistency of Performance: Evaluating models based on the consistency of their predictions across different time periods and market conditions.
9.9. Comparative Analysis of Performance Across Studies
10. Challenges and Limitations
10.1. Theoretical Challenges
- Efficient Market Hypothesis: The EMH suggests that predictable patterns should quickly disappear as they become known, creating a fundamental challenge for prediction models.
- Non-Stationarity: Financial markets are non-stationary environments, meaning that statistical properties change over time, potentially invalidating models trained on historical data.
- Complex Causality: Stock prices are influenced by a complex web of factors including macro-economic conditions, company fundamentals, market sentiment, and global events, making causal modeling extremely difficult.
10.2. The Efficient Market Hypothesis Paradox and AI-Based Prediction
10.2.1. Reconciling Prediction Models with Market Efficiency
- Degrees of Market Efficiency: Markets may not be uniformly efficient across all assets, timeframes, and conditions. Lo’s Adaptive Market Hypothesis [98] proposes that market efficiency is not an all-or-nothing property but evolves dynamically as market participants adapt. This evolutionary perspective suggests that temporary inefficiencies can exist and be exploited before being arbitraged away.
- Implementation Constraints: Even when inefficiencies are identified, practical limitations often prevent their complete elimination:
  - Transaction costs create a “no-arbitrage band” within which inefficiencies can persist
  - Capital constraints limit arbitrage capacity
  - Risk aversion may deter traders from fully exploiting identified patterns
  - Institutional constraints such as investment mandates may prevent certain market participants from engaging in arbitrage
- Market Microstructure: High-frequency patterns may persist due to structural elements of markets:
  - Order flow dynamics create predictable short-term price pressures
  - Market maker inventory management generates mean-reverting patterns
  - Regulatory circuit breakers and trading halts create predictable recovery patterns
10.2.2. Empirical Evidence of Persistent Anomalies
- Calendar Effects: Seasonal patterns like the January effect and day-of-week effects have weakened but not disappeared entirely despite their widespread publication [103].
10.2.3. AI’s Role in an Adaptive Market Framework
- Pattern Complexity: Machine learning algorithms can identify complex, non-linear patterns that may be invisible to human traders or simple statistical tests, creating a temporary information advantage.
- Adaptation Speed: Deep learning models can be retrained as market conditions change, potentially adapting faster than the market’s overall adjustment process.
- Multi-dimensional Analysis: AI systems can simultaneously process diverse data sources (price data, fundamentals, sentiment, alternative data) at scales beyond human capacity, identifying inefficiencies at the intersection of multiple factors.
- Temporal Advantage: Even if patterns eventually disappear, early identification through superior computational methods may provide a temporary edge before markets fully incorporate information.
10.2.4. Epistemological Limitations
- Publication Effect: Does publishing AI methods for stock prediction accelerate their obsolescence?
- Performance Decay: How quickly do the advantages of AI prediction methods decay over time?
- Distinguishing Skill from Luck: Given the low signal-to-noise ratio in financial markets, what threshold of evidence is needed to establish that AI predictions reflect genuine inefficiencies rather than statistical artifacts?
10.3. Market Microstructure and Time-Scale Dependent Predictability
10.3.1. Time-Scale Hierarchy of Predictability
- Ultra-high-frequency domain (milliseconds to seconds): At this scale, predictability is primarily driven by order flow imbalances, market maker inventory management, and latency arbitrage opportunities. Research has found that order book imbalances can predict short-term price movements with high accuracy, while market maker positioning creates micro-patterns that persist despite their theoretical inefficiency according to the EMH.
- Intraday time scales (minutes to hours): At this intermediate frequency, market impact effects from institutional order execution create temporary price pressures and mean-reversion patterns. Large orders split into smaller tranches create predictable price trajectories that AI models can potentially exploit. As noted in our review, Khan et al. [42] found that 15-minute intervals provide an optimal window for machine learning models, with Random Forest achieving 91.27% accuracy at this scale.
- Daily and weekly horizons: As the time scale extends, information diffusion rates and behavioral factors become more significant. Market underreaction and overreaction patterns create multi-day predictability. Our analysis shows that ensemble methods like Extra Trees Classifiers maintain effectiveness at this scale, with Pagliaro [43] reporting accuracy rates of 86.1% for 10-day windows.
- Monthly and longer horizons: At this scale, fundamental factors and macroeconomic conditions dominate. Traditional statistical methods become more competitive with advanced AI approaches, as demonstrated by Campbell and Thompson [79], who found that simple models with very low R² values can still generate economic value at longer horizons.
10.3.2. Liquidity-Based Predictability
- Market depth variations across different securities create varying levels of price impact, with less liquid securities typically exhibiting higher predictability.
- Bid–ask spread dynamics can generate predictable patterns, particularly during periods of liquidity stress.
- Order book shape provides predictive signals that are stronger in markets with lower trading volume, with several studies suggesting that deep learning models trained on limit order book data show significantly higher accuracy for small-cap versus large-cap stocks.
10.3.3. Market Design Effects
- Trading halts, circuit breakers, and other market rules create discontinuities that AI models can learn to anticipate, including predictable recovery patterns following trading halts that persist despite widespread knowledge of their existence.
- Different exchange mechanisms (continuous auction vs. periodic call auctions) generate distinct predictability patterns, with high-frequency trading strategies performing differently under continuous versus discrete-time trading mechanisms.
- Fragmentation across multiple trading venues creates cross-venue arbitrage opportunities, with research suggesting that predictability increases with market fragmentation.
10.3.4. Cross-Asset and Cross-Market Information Flow
- Information typically flows from more liquid to less liquid assets, creating a predictability gradient. It is well documented that price discovery in futures markets often leads the corresponding cash indices by several minutes, creating predictable patterns.
- The price discovery process occurs at different rates across related instruments, allowing for the measurement of information share across markets and the identification of price leadership.
- Models leveraging these information transmission delays have shown stronger performance at specific time scales. Several studies in our review, including Shen et al. [36] and Hu et al. [25], demonstrated improved accuracy by incorporating cross-market signals, with prediction accuracy improvements of 2–5% when integrating related market data.
- Short time horizons (seconds to minutes): Deep learning approaches like LSTM networks tend to excel, as demonstrated by Fischer and Krauss [6], who found that LSTM models outperform traditional approaches in short-term forecasting tasks.
- Medium time horizons (minutes to hours): Random Forest models show particular strength, with Khan et al. [42] reporting 91.27% accuracy using 15 min intervals.
- Daily prediction windows: Ensemble methods remain effective, with tree-based methods like XGBoost showing strong performance, as reported by Dey et al. [38].
- Multi-day to weekly horizons: Extra Trees Classifier models have demonstrated superior accuracy (86.1%) for 10-day prediction windows according to Pagliaro [43].
- Longer horizons: The advantage of complex AI methods diminishes, with traditional statistical approaches becoming more competitive, as Campbell and Thompson [79] demonstrated.
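The lead–lag relationship described in this subsection, where a more liquid instrument moves before a less liquid one, can be illustrated with a lagged correlation scan. The following is a minimal sketch on synthetic data, not a method taken from the cited studies; the function names, the three-step delay, and the noise levels are all illustrative assumptions.

```python
import random
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def lead_lag_correlations(leader, follower, max_lag):
    """Correlation of leader[t] with follower[t + lag] for lag = 0..max_lag."""
    out = {}
    for lag in range(max_lag + 1):
        n = len(leader) - lag
        out[lag] = pearson(leader[:n], follower[lag:lag + n])
    return out


# Synthetic returns (assumption): the follower echoes the leader
# three steps later, with independent noise added on top.
rng = random.Random(0)
LAG, N = 3, 2000
leader = [rng.gauss(0, 1) for _ in range(N)]
follower = [0.0] * LAG + [0.8 * leader[t - LAG] for t in range(LAG, N)]
follower = [f + rng.gauss(0, 0.5) for f in follower]

corrs = lead_lag_correlations(leader, follower, max_lag=6)
best_lag = max(corrs, key=corrs.get)  # lag with the strongest correlation
print(best_lag, round(corrs[best_lag], 3))
```

On real data the peak is typically far weaker and less stable than in this constructed example, so any detected lag would still need out-of-sample confirmation.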
10.3.5. Market Capitalization Effects on Predictability
10.4. Data Challenges
- Data Quality: Financial data may contain errors, missing values, or inconsistencies that can impact model performance.
- Limited History: Many newer financial instruments have limited historical data, making it difficult to train robust models.
- Survivorship Bias: Datasets that include only currently existing companies can create survivorship bias, potentially leading to overly optimistic predictions.
- Feature Selection: Identifying the most relevant features among numerous potential predictors remains challenging, with different features potentially having varying importance across different market regimes.
10.5. Methodological Challenges
- Overfitting: The complexity of modern ML models creates significant risk of overfitting to historical patterns that lack predictive value for future movements.
- Parameter Sensitivity: Many models are highly sensitive to hyperparameter settings, requiring extensive tuning and validation.
- Black Box Models: Advanced deep learning models often lack interpretability, making it difficult to understand the basis for their predictions.
- Transfer Learning: Models trained on one market or time period may not transfer effectively to other contexts, limiting their practical utility.
10.6. Implementation Challenges
- Transaction Costs: Trading costs can significantly reduce or eliminate theoretical profits from prediction models.
- Execution Slippage: Delays between prediction and execution can lead to different prices than anticipated.
- Market Impact: Large trades can themselves move the market, potentially reducing or eliminating predicted profit opportunities.
- Regulatory Constraints: Trading strategies may be subject to regulatory restrictions that limit their implementation.
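The effect of transaction costs on theoretical profits can be made concrete with a small calculation. The sketch below assumes a proportional cost charged on each change in position; the function name `net_returns`, the cost levels, and the toy return series are hypothetical and chosen only for illustration.

```python
def net_returns(positions, asset_returns, cost_per_unit_turnover):
    """Per-period strategy returns after proportional trading costs.

    positions[t] is the position held during period t (e.g. +1 long, -1 short);
    a cost is charged on the absolute change in position entering that period.
    """
    net = []
    prev = 0.0  # start flat
    for pos, r in zip(positions, asset_returns):
        turnover = abs(pos - prev)
        net.append(pos * r - cost_per_unit_turnover * turnover)
        prev = pos
    return net


# Toy example (assumption): a strategy that is right in every period.
positions = [1, -1, 1, 1]
asset_returns = [0.01, -0.01, 0.005, 0.0]

gross = sum(net_returns(positions, asset_returns, 0.0))      # no costs
net_10bps = sum(net_returns(positions, asset_returns, 0.001))  # 10 bps per unit traded
net_50bps = sum(net_returns(positions, asset_returns, 0.005))  # 50 bps per unit traded
print(gross, net_10bps, net_50bps)
```

In this toy case the strategy trades five units of turnover in total, so a 50 bps cost absorbs the entire 2.5% gross return even though every directional call was correct.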
10.7. Reproducibility Challenges
10.7.1. The Replication Crisis in Financial Prediction
- Harvey et al. [83] conducted a comprehensive review of 316 published financial anomalies and found that 60–80% failed to replicate when subjected to more stringent statistical tests, with most published results likely representing false positives.
- Hou et al. [104] re-examined 452 cross-sectional anomalies and discovered that 65% failed to replicate with updated data and proper controls for microcap stocks.
- Chen and Zimmermann [105] documented that the average return predictability of published strategies declined by about 32% after publication, suggesting either data mining or market adaptation.
10.7.2. Case Studies in Failed Replication
10.7.3. Root Causes of Replication Failures
- Publication Bias: Journals tend to publish studies with positive and significant results, creating a biased literature that overrepresents successful predictions.
- Backtest Overfitting: Bailey et al. [113] demonstrated that the repeated backtesting of strategies against the same historical data inevitably leads to false discoveries through the optimization of strategy parameters.
- P-hacking: Some researchers test multiple hypotheses, models, or specifications until statistically significant results are achieved, without applying appropriate corrections for multiple testing.
- Data Snooping: López de Prado [14] identified that standard cross-validation fails in sequential data, leading to information leakage and inflated performance estimates.
- Non-stationarity: Financial markets evolve over time, and patterns discovered in one period may not persist in future periods due to changing market conditions or adaptation by market participants.
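The backtest overfitting mechanism listed above can be illustrated with a small simulation: if many strategies with no real edge are tested on the same history, the best one will look attractive in-sample by chance alone. This is a sketch under stated assumptions (pure Gaussian noise returns, 200 candidate strategies, one year of daily data) and is not the procedure used by Bailey et al. [113].

```python
import random
from statistics import mean, stdev


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    return mean(returns) / stdev(returns) * periods_per_year ** 0.5


rng = random.Random(42)
N_STRATEGIES, N_DAYS = 200, 252

# Every "strategy" is pure noise: its true Sharpe ratio is zero.
in_sample = [[rng.gauss(0, 0.01) for _ in range(N_DAYS)]
             for _ in range(N_STRATEGIES)]
out_sample = [[rng.gauss(0, 0.01) for _ in range(N_DAYS)]
              for _ in range(N_STRATEGIES)]

is_sharpes = [sharpe(r) for r in in_sample]
best = max(range(N_STRATEGIES), key=lambda i: is_sharpes[i])

best_is = is_sharpes[best]            # selected after looking at the data
best_oos = sharpe(out_sample[best])   # same strategy on unseen data
print(f"best in-sample Sharpe: {best_is:.2f}, out-of-sample: {best_oos:.2f}")
```

Because the winner is chosen after seeing the results, its in-sample Sharpe ratio is biased upward, while its out-of-sample value is just another draw from a distribution centered on zero.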
10.7.4. Methodological Standards for Robust Financial Prediction Research
- Proper Out-of-Sample Testing: Researchers should maintain a truly untouched validation dataset for final model evaluation. Walk-forward analysis, where models are retrained as new data become available, provides a more realistic assessment than standard cross-validation.
- Multiple Testing Corrections: Studies should apply family-wise error rate controls (e.g., Bonferroni correction) or false discovery rate methods (e.g., Benjamini–Hochberg procedure) when testing multiple hypotheses or model specifications.
- Combinatorial Purged Cross-Validation: As proposed by López de Prado [14], this technique prevents information leakage in financial time series by purging overlapping observations and applying embargo periods to account for serial correlation.
- Statistical Power Analysis: Researchers should conduct a priori power analysis to ensure that sample sizes are adequate for detecting the expected effect sizes, reducing the risk of both false positives and false negatives.
- Registered Reports: Following practices from medical research, pre-registering hypotheses, data collection procedures, and analysis plans before conducting research can mitigate p-hacking and publication bias.
- Code and Data Sharing: Making code and data publicly available enables independent verification and improves reproducibility.
- Ensemble Methods: Combining multiple models with different assumptions and starting points can provide more robust predictions and mitigate the impact of individual model overfitting.
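The multiple testing corrections named in the list above can be sketched in a few lines. The example compares uncorrected testing, the Bonferroni correction, and the Benjamini–Hochberg step-up procedure on a set of made-up p-values; the p-values and function names are illustrative assumptions, not results from any cited study.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 only where p <= alpha / m (controls family-wise error rate)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls false discovery rate).

    Finds the largest rank k such that p_(k) <= (k / m) * alpha and rejects
    the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten candidate predictors.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

naive = sum(p <= 0.05 for p in p_vals)
bonf = sum(bonferroni(p_vals))
bh = sum(benjamini_hochberg(p_vals))
print(naive, bonf, bh)  # 5 1 2
```

With these inputs, uncorrected testing flags five predictors, Benjamini–Hochberg keeps two, and Bonferroni keeps one, which shows how much of an apparently significant result set can depend on whether a correction is applied.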
10.7.5. Implications for AI-Based Stock Prediction
- Model performance reported in the academic literature should be treated with greater skepticism, particularly when out-of-sample testing is limited or absent.
- The complexity of deep learning models, with their numerous hyperparameters and architectural choices, makes them particularly susceptible to overfitting and difficult to replicate.
- AI models trained on historical market data may inadvertently capture noise rather than signals, especially when the noise-to-signal ratio is high.
- Methods that explicitly account for estimation uncertainty, such as Bayesian approaches, may provide more reliable insights than point estimates of predicted returns or probabilities.
10.8. Evaluation Challenges
- Performance Metrics: Different evaluation metrics can lead to different conclusions about model performance.
- Backtest Overfitting: The excessive optimization of models to historical data can create misleading performance metrics.
- Out-of-Sample Validation: Proper out-of-sample validation is essential but often implemented inconsistently across studies.
- Publication Bias: There may be publication bias toward models that show positive results, potentially creating an overly optimistic view of the field’s progress.
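Out-of-sample validation on sequential data, as in the walk-forward analysis discussed in Section 10.7.4, can be sketched as an index generator. This is a deliberately simplified expanding-window scheme with a purge gap between training and test data; it is not the combinatorial purged cross-validation of López de Prado [14], and the function name and parameters are assumptions for illustration.

```python
def walk_forward_splits(n_samples, n_folds, min_train, purge=0):
    """Yield (train_indices, test_indices) pairs in chronological order.

    The training window expands over time. The last `purge` observations
    before each test block are dropped from training so that labels
    overlapping the test period cannot leak into the fit.
    """
    if purge >= min_train:
        raise ValueError("purge must be smaller than min_train")
    test_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        test_start = min_train + k * test_size
        test_end = test_start + test_size
        train_end = test_start - purge
        yield list(range(train_end)), list(range(test_start, test_end))


splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=40, purge=5))
for train, test in splits:
    print(f"train 0..{train[-1]}  |  test {test[0]}..{test[-1]}")
```

Each test block lies strictly after its training data, so the model is never evaluated on observations that precede or overlap what it was fitted on.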
10.9. Hardware Implications and Computational Efficiency
10.9.1. Computational Requirements Across Model Classes
10.9.2. Hardware Acceleration Requirements
- GPU Acceleration: Fischer and Krauss [6] reported that LSTM models required GPU acceleration to achieve practical training times, with a 15× speedup compared to CPU-only training. Their implementation on an NVIDIA Tesla V100 required 36 h for training, compared to estimated weeks on CPU architectures.
- Memory Bandwidth: Wang et al. [60] noted that their knowledge graph-based GCN implementation was primarily memory-bandwidth limited rather than compute-bound, with loading the entire market graph requiring 24 GB of GPU memory.
- Inference Latency: For high-frequency applications, Sezer et al. [11] found that CNN models achieved 2 ms inference times on GPUs but 45 ms on CPUs, making hardware acceleration essential for real-time applications requiring sub-10 ms response times.
10.9.3. Computational Efficiency vs. Prediction Accuracy
- Ensemble Method Efficiency: Ensemble methods like Random Forest and Extra Trees not only show superior prediction accuracy (as shown in Table 3) but also offer significantly better computational efficiency than deep learning approaches. Pagliaro [43] demonstrated that Extra Trees models could be trained in under 10 min on a standard workstation while achieving 86.1% directional accuracy.
- Model Pruning and Quantization: Studies by Wu et al. [50] demonstrated that the quantization and pruning of LSTM models could reduce memory requirements by 75% and inference time by 60% with only a 1.2% reduction in accuracy, suggesting significant opportunities for optimization.
- Batch Processing Efficiency: Khan et al. [42] showed that batch prediction approaches could amortize computational costs, with batch sizes of 64–128 providing optimal throughput on GPU hardware for daily prediction tasks.
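The memory side of quantization follows from simple arithmetic: storing each weight as an 8-bit integer instead of a 32-bit float uses one quarter of the space. The sketch below shows symmetric linear int8 quantization on random weights using only the standard library; it is an illustrative assumption about how such a reduction can arise, not the specific procedure or result reported in [50].

```python
import random
from array import array


def quantize_int8(weights):
    """Symmetric linear quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


rng = random.Random(0)
# Stand-in for a layer's weights, stored as 32-bit floats (assumption).
weights = array("f", (rng.uniform(-1, 1) for _ in range(10_000)))

q, scale = quantize_int8(weights)
bytes_fp32 = len(weights) * weights.itemsize   # 4 bytes per weight
bytes_int8 = len(q) * q.itemsize               # 1 byte per weight
reduction = 1 - bytes_int8 / bytes_fp32
max_err = max(abs(w - d) for w, d in zip(weights, dequantize(q, scale)))
print(f"memory reduction: {reduction:.0%}, max rounding error: {max_err:.5f}")
```

The rounding error per weight is bounded by half the quantization step, which is why accuracy often degrades only slightly; the actual impact on prediction quality depends on the model and has to be measured.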
10.9.4. Deployment Considerations
- Cloud vs. On-Premises: Complex models like transformers and GNNs typically require cloud-based GPU clusters for training, incurring significant operational expenses. Based on current cloud provider pricing, training a state-of-the-art transformer model for market prediction can cost between USD 2000 and USD 10,000 in compute resources alone.
- Energy Efficiency: The energy consumption of different models varies dramatically: traditional methods and tree-based ensembles can be deployed on energy-efficient CPU servers, while deep learning approaches may require 10–100× more energy during both training and inference phases.
- Specialized Hardware: Field-Programmable Gate Arrays (FPGAs) have shown promise for the low-latency deployment of certain model types, with Jang and Seong [66] reporting 5× lower latency for reinforcement learning inference compared to GPU implementations.
10.9.5. Implications for Research and Application
10.10. Cross-Market Generalizability
10.10.1. Market Structure Effects
- Trading Mechanism Differences: Studies comparing prediction model performance between auction markets (like NYSE) and dealer markets (like NASDAQ) reveal systematic differences. Boehmer et al. [114] found that ensemble methods achieve 5–8% higher accuracy in auction markets compared to dealer markets, likely due to differences in price formation processes and transparency.
- Order Book Depth: Markets with deeper order books show different predictability patterns compared to shallow markets. Deep learning approaches exploiting limit order book data show significantly higher effectiveness in markets with rich microstructure data availability, as demonstrated by [115].
- Trading Hours: Continuous versus call auction markets and markets with different trading hour structures exhibit distinct predictability patterns. Sezer and Ozbayoglu [47] found that CNN-based models trained on Asian markets required significant adaptation to maintain performance when applied to European markets with different trading session structures.
10.10.2. Geographic and Economic Variations
- Developed vs. Emerging Markets: While ensemble methods show consistency across US and European markets [40], their performance in emerging markets like Brazil and India demonstrates greater variation. Models optimized for the S&P 500 saw performance degradation when applied to the Indian Nifty index without recalibration, suggesting that market maturity impacts predictability.
- Market Efficiency Variations: Markets with different levels of informational efficiency require different modeling approaches. Urquhart [116] demonstrated that emerging markets show higher degrees of predictability using technical approaches, while developed markets require more sophisticated alternative data integration to achieve comparable results.
- Regulatory Environment: Differing regulatory structures, particularly regarding short selling, margin requirements, and circuit breakers, impact model transferability. Models trained in markets with unrestricted short selling required significant modification to maintain performance in markets with short-selling restrictions.
10.10.3. Research Design for Cross-Market Validation
- Multi-Market Training: Training models on diverse markets simultaneously can improve generalizability. Cao et al. [117] demonstrated that models trained on a combination of US, European, and Asian market data showed improved robustness when tested on out-of-sample markets compared to single-market training.
- Transfer Learning: Adopting transfer learning approaches where models pre-trained on data-rich markets are fine-tuned for specific target markets. Zhang and Jacobsen [103] employed this approach to adapt models from US markets to smaller European exchanges, achieving 85% of the original performance with only 20% of the target market training data.
- Meta-Features: Developing market-invariant features that capture fundamental economic relationships rather than market-specific patterns.
- Systematic Comparison Studies: Conducting more research explicitly comparing identical methodologies across different markets. Following protocols similar to those established in [118], which systematically applied identical models across 21 equity markets, would provide valuable insights into generalizability constraints.
11. Future Research Directions
11.1. Alternative Data Sources
- Satellite Imagery: Using satellite data to monitor economic activity, such as parking lot occupancy or construction progress.
- Internet of Things (IoT) Data: Leveraging IoT sensors to track physical economic indicators in real-time.
- Alternative Text Sources: Analyzing specialized publications, expert forums, and other text sources beyond mainstream media and social networks.
- Private Company Data: Incorporating data from private companies and supply chains that may provide early signals of public company performance.
11.2. Explainable AI for Finance
- Rule Extraction: Techniques for extracting interpretable rules from complex models like neural networks.
- Feature Importance Analysis: Methods for identifying which features contribute most significantly to predictions.
- Local Explanation Methods: Approaches like LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) that explain individual predictions.
- Attention Mechanisms: Neural network architectures with attention components that highlight which parts of the input contribute most to the prediction.
11.3. Transfer Learning and Domain Adaptation
- Cross-Market Transfer: Transferring knowledge from well-established markets to emerging markets with limited historical data.
- Temporal Transfer: Adapting models across different market regimes (e.g., bull markets, bear markets, high-volatility periods).
- Cross-Asset Transfer: Leveraging patterns learned from one asset class to improve predictions for related asset classes.
- Meta-Learning: Developing models that can quickly adapt to new market conditions with minimal retraining.
11.4. Multimodal Learning
- Text–Price Integration: Combining price data with textual data from news, social media, and company reports.
- Financial–Alternative Data: Integrating traditional financial data with alternative data sources like satellite imagery or consumer spending patterns.
- Cross-Market Integration: Modeling relationships between different markets and asset classes to better capture global economic dynamics.
- Temporal–Spatial Integration: Combining time series analysis with spatial analysis to capture geographic and temporal patterns in market behavior.
11.5. Causality and Counterfactual Analysis
- Causal Discovery: Identifying causal relationships between economic factors and stock price movements.
- Counterfactual Analysis: Developing models that can reason about what would happen under different scenarios.
- Intervention Models: Creating models that account for the impact of policy changes or market interventions.
- Robust Predictors: Developing prediction models that rely on stable causal mechanisms rather than ephemeral statistical correlations.
11.6. Reinforcement Learning for Portfolio Management
- Multi-Asset RL: Developing reinforcement learning approaches that simultaneously manage multiple assets.
- Risk-Aware RL: Incorporating risk constraints and preferences into reinforcement learning frameworks.
- Multi-Period Optimization: Addressing the challenges of long-term portfolio optimization under uncertainty.
- Hierarchical RL: Using hierarchical reinforcement learning to manage different investment time horizons and objectives.
11.7. Federated and Privacy-Preserving Learning
- Federated Learning: Training models across multiple institutions without sharing raw data.
- Differential Privacy: Implementing privacy-preserving techniques that protect individual data points while allowing population-level analysis.
- Secure Multi-Party Computation: Enabling collaborative model development without exposing proprietary data or strategies.
- Homomorphic Encryption: Performing computations on encrypted data to preserve confidentiality.
11.8. Ethical Considerations and Market Impact
- Market Fairness and Access: Advanced AI systems require substantial computational resources and data access, potentially creating or exacerbating inequalities between market participants with different resource levels. This raises questions about fair market access and whether regulations should ensure a level playing field.
- Systemic Risk: The widespread adoption of similar AI models could lead to correlated trading behaviors, potentially amplifying market movements and increasing systemic risk. The 2010 Flash Crash demonstrated how algorithmic trading can contribute to market instability, and more sophisticated AI systems may introduce new forms of systemic vulnerability.
- Transparency and Explainability: As models become more complex, their decision-making processes become less transparent. This “black box” nature raises concerns about accountability, particularly when these systems manage significant capital or influence market movements.
- Market Manipulation: AI systems might identify and exploit patterns that effectively constitute market manipulation, even if not explicitly programmed to do so. This raises questions about the responsibility of the developers and deployers of such systems.
- Social Impact: The broader societal impacts of AI-driven markets—including effects on wealth distribution, capital allocation efficiency, and economic stability—warrant careful consideration. Markets serve important social functions beyond profit generation, and AI systems optimized solely for returns may not adequately serve these broader purposes.
11.9. Practical Implementation and Financial Implications
11.9.1. From Predictions to Trading Decisions
- Decision Thresholds: Determining appropriate thresholds for converting probabilistic predictions into discrete trading decisions significantly impacts performance. Fischer and Krauss [6] demonstrated that LSTM-based predictions, while statistically significant, generated economically meaningful returns only when implemented with optimized decision thresholds that varied by market volatility regime.
- Position Sizing: The allocation of capital based on prediction confidence fundamentally affects risk–return profiles. Pagliaro [43] showed that implementing confidence-weighted position sizing with Extra Trees Classifier predictions increased Sharpe ratios by 31% compared to uniform position sizing.
- Holding Periods: Optimizing holding periods based on prediction horizons and market conditions can significantly enhance performance. Khan et al. [42] found that dynamic holding periods adjusted for volatility outperformed fixed holding periods even when using identical prediction models.
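The first two items above, decision thresholds and confidence-based position sizing, can be sketched as a mapping from predicted probabilities to positions. The threshold of 0.55 and the linear sizing rule are hypothetical choices for illustration; they are not the rules used by Fischer and Krauss [6] or Pagliaro [43].

```python
def positions_from_probabilities(p_up, threshold=0.55, confidence_weighted=True):
    """Map predicted probabilities of an up-move to positions in [-1, 1].

    Go long if p >= threshold, short if p <= 1 - threshold, otherwise stay flat.
    With confidence weighting, position size scales linearly with the distance
    of p from 0.5 (so p = 1.0 gives full size and p = 0.5 gives zero).
    """
    positions = []
    for p in p_up:
        if p >= threshold:
            side = 1.0
        elif p <= 1.0 - threshold:
            side = -1.0
        else:
            positions.append(0.0)  # signal too weak to trade
            continue
        size = abs(p - 0.5) * 2 if confidence_weighted else 1.0
        positions.append(side * size)
    return positions


p_up = [0.50, 0.58, 0.90, 0.42, 0.10]
weighted = positions_from_probabilities(p_up)
uniform = positions_from_probabilities(p_up, confidence_weighted=False)
print(weighted)  # approximately [0.0, 0.16, 0.8, -0.16, -0.8]
print(uniform)   # [0.0, 1.0, 1.0, -1.0, -1.0]
```

Raising the threshold trades fewer, higher-conviction signals, which also reduces turnover; the appropriate level would have to be chosen on validation data rather than on the final test period.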
11.9.2. Portfolio Construction Considerations
- Diversification Effects: The proper diversification of model-driven positions can reduce risk without proportionately reducing returns. Jang and Seong [66] demonstrated that reinforcement learning approaches that explicitly account for correlations between AI-predicted positions achieved 27% lower maximum drawdowns while maintaining similar returns.
- Risk Constraints: Implementing risk limits and constraints ensures portfolio stability across market conditions. Wu et al. [67] showed that incorporating downside risk measures like Conditional Value at Risk (CVaR) into GAN-based trading models improved worst-case scenario outcomes while sacrificing only marginal returns.
- Multi-Model Integration: Combining predictions from diverse models can enhance robustness. Wang et al. [64] found that ensembling predictions from graph-based models with traditional tree-based approaches reduced prediction variance and improved consistency across market regimes.
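The Conditional Value at Risk (CVaR) measure mentioned under risk constraints can be estimated directly from a return history: it is the average loss in the worst tail of the distribution, beyond the Value at Risk cutoff. The following is a minimal historical-simulation sketch; the function name and the sample returns are illustrative assumptions and do not reproduce the GAN-based setup of [67].

```python
import math


def historical_var_cvar(returns, alpha=0.90):
    """Historical VaR and CVaR at confidence level alpha, reported as positive losses.

    The tail consists of the worst ceil(n * (1 - alpha)) observations.
    VaR is the smallest loss inside that tail; CVaR is the mean tail loss.
    """
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # round() guards against floating-point noise before taking the ceiling
    n_tail = max(1, math.ceil(round(len(losses) * (1 - alpha), 9)))
    tail = losses[:n_tail]
    var = tail[-1]
    cvar = sum(tail) / n_tail
    return var, cvar


# Twenty hypothetical daily strategy returns.
daily_returns = [0.010, -0.020, 0.005, -0.050, 0.012, 0.003, -0.010, 0.020,
                 -0.030, 0.007, 0.000, 0.004, -0.004, 0.009, -0.015, 0.011,
                 0.006, -0.008, 0.002, 0.013]

var_90, cvar_90 = historical_var_cvar(daily_returns, alpha=0.90)
print(f"VaR(90%) = {var_90:.3f}, CVaR(90%) = {cvar_90:.3f}")
```

Here the two worst days lose 5% and 3%, so the 90% VaR is 3% and the CVaR is 4%. CVaR is never smaller than VaR at the same level, which is why it is the more conservative constraint.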
11.9.3. Transaction Cost Optimization
- Trading Frequency Optimization: The optimal trading frequency depends on the relationship between signal decay and transaction costs. Lv et al. [62] demonstrated that daily rebalancing was optimal for deep learning models applied to liquid large-cap stocks, while weekly rebalancing proved more effective for less liquid small-cap stocks due to higher transaction costs.
- Smart Order Routing: Execution algorithms that minimize market impact can preserve strategy returns. Studies have shown that implementation shortfall due to suboptimal execution can reduce theoretical strategy returns by 15–40% in practice [119].
- Tax Efficiency: For investment applications, the tax consequences of trading activity significantly impact after-tax returns. Incorporating tax-aware execution rules into prediction-based strategies can improve after-tax returns while maintaining pre-tax performance.
11.9.4. Institutional Implementation Challenges
- Governance Frameworks: Establishing appropriate oversight and governance for AI trading systems remains challenging.
- Alignment with Investment Policy: Ensuring AI prediction models operate within institutional investment policy constraints requires careful design. Bartram et al. [120] demonstrated approaches for incorporating ESG constraints, concentration limits, and other policy requirements into prediction-based portfolio construction.
- Performance Attribution: Accurately attributing performance in AI-augmented investment processes presents analytical challenges. Daul et al. [121] developed a framework for decomposing returns into components attributable to the AI prediction model versus traditional factors, providing greater transparency for stakeholders.
12. Conclusions
12.1. Key Findings and Theoretical Implications
12.2. Methodological Contributions and Practical Implications
12.3. Unanswered Questions and Future Directions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Andersen, T.G.; Bollerslev, T.; Diebold, F.X.; Labys, P. Modeling and Forecasting Realized Volatility. Econometrica 2003, 71, 579–625.
- Cont, R. Volatility Clustering in Financial Markets: Empirical Facts and Agent-Based Models. Quant. Financ. 2014, 14, 1547–1561.
- Hornik, K. Multilayer Feedforward Networks are Universal Approximators. Neural Netw. 1989, 2, 359–366.
- White, H. Artificial Neural Networks: Approximation and Learning Theory; Blackwell: Hoboken, NJ, USA, 1992.
- Pascanu, R.; Mikolov, T.; Bengio, Y. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA, 17–19 June 2013; pp. 1310–1318.
- Fischer, T.; Krauss, C. Deep learning with long short-term memory networks for financial market predictions. Eur. J. Oper. Res. 2018, 270, 654–669.
- Gal, Y.; Ghahramani, Z. A Theoretically Grounded Application of Dropout in RNNs. In Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, Barcelona, Spain, 5–10 December 2016.
- Bengio, Y.; Louradour, J.; Collobert, R.; Weston, J. Curriculum Learning. In Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada, 14–18 June 2009; pp. 41–48.
- Bergstra, J.S.; Bardenet, R.; Bengio, Y.; Kégl, B. Algorithms for Hyper-Parameter Optimization. In Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, Granada, Spain, 12–14 December 2011.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need in Financial Time Series. J. Comput. Financ. 2023, 26, 45–67.
- Sezer, O.B.; Gudelek, M.U.; Ozbayoglu, A.M. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Appl. Soft Comput. 2020, 90, 106181.
- Henrique, B.M.; Sobreiro, V.A.; Kimura, H. Literature review: Machine learning techniques applied to financial market prediction. Expert Syst. Appl. 2019, 124, 226–251.
- Fernández-Rodríguez, F.; Sosvilla-Rivero, S.; Andrada-Félix, J. Technical analysis in the Madrid stock exchange. FEDEA Working Paper 1999. No. 99-05.
- López de Prado, M. A data science solution to the multiple-testing crisis in financial research. J. Financ. Data Sci. 2019, 1, 99–110.
- Ariyo, A.A.; Adewumi, A.O.; Ayo, C.K. Comparison of ARIMA and artificial neural networks models for stock price prediction. J. Appl. Math. 2014, 2014, 614342.
- Devi, B.U.; Sundar, D.; Alli, P. An effective time series analysis for stock trend prediction using ARIMA model for Nifty Midcap-50. Int. J. Data Min. Knowl. Manag. Process 2013, 3, 65–78.
- De Faria, E.L.; Albuquerque, M.P.; Gonzalez, J.L.; Cavalcante, J. Predicting the Brazilian stock market through neural networks and adaptive exponential smoothing methods. Expert Syst. Appl. 2009, 36, 12506–12509.
- Bhuriya, D.; Kausha, G.; Sharma, A.; Singh, U. Stock market prediction using a linear regression. In Proceedings of the 2017 International conference of Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 20–22 April 2017; pp. 510–513.
- Dutta, A.; Bandopadhyay, G.; Sengupta, S. Prediction of stock performance in Indian stock market using logistic regression. Int. J. Bus. Inf. 2012, 7, 105–136.
- Kim, S.; Lee, H.S.; Ko, H.; Jeong, S.H.; Byun, H.W.; Oh, K.J. Pattern matching trading system based on the dynamic time warping algorithm. Sustainability 2018, 10, 4641.
- Chen, Y.; Hao, Y. A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices prediction. Expert Syst. Appl. 2017, 80, 340–355.
- Bollen, J.; Mao, H.; Zeng, X. Twitter mood predicts the stock market. J. Comput. Sci. 2011, 2, 1–8.
- Ding, X.; Zhang, Y.; Liu, T.; Duan, J. Deep learning for event-driven stock prediction. In Proceedings of the 24th International Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, 25–31 July 2015; pp. 2327–2333.
- Zhang, D.; Dai, X.; Wang, Q.; Lau, C.K.M. Impacts of weather conditions on the US commodity markets systemic interdependence across multi-timescales. Energy Econ. 2023, 123, 106732.
- Hu, H.; Tang, L.; Zhang, S.; Wang, H. Predicting the direction of stock markets using optimized neural networks with Google Trends. Neurocomputing 2018, 285, 188–195.
- Chen, T.L.; Chen, F.Y. An intelligent pattern recognition model for supporting investment decisions in stock market. Inf. Sci. 2016, 346, 261–274.
- Bao, W.; Yue, J.; Rao, Y. A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS ONE 2017, 12, e0180944.
- Mittal, A.; Goel, A. Stock Prediction Using Twitter Sentiment Analysis; CS229; Stanford University: Stanford, CA, USA, 2012; Volume 15.
- Ma, L.; Wang, X.; Wang, X.; Wang, L.; Shi, Y.; Huang, M. TCDA: Truthful Combinatorial Double Auctions for Mobile Edge Computing in Industrial Internet of Things. IEEE Trans. Mob. Comput. 2022, 21, 4125–4138.
- Fu, T.C.; Chung, F.L.; Luk, R.; Ng, C.M. Preventing meaningless stock time series pattern discovery by changing perceptually important point detection. In Fuzzy Systems and Knowledge Discovery: Second International Conference, FSKD 2005, Changsha, China, 27–29 August 2005, Proceedings, Part I 2; Springer: Berlin/Heidelberg, Germany, 2005; pp. 1171–1174.
- Markowska-Kaczmar, U.; Dziedzic, M. Discovery of technical analysis patterns. In Proceedings of the 2008 International Multiconference on Computer Science and Information Technology, Wisla, Poland, 20–22 October 2008; pp. 137–142.
- Leigh, W.; Frohlich, C.J.; Hornik, S.; Purvis, R.L.; Roberts, T.L. Trading with a stock chart heuristic. IEEE Trans. Syst. Man Cybern. Part Syst. Humans 2008, 38, 93–104.
- Cervelló-Royo, R.; Guijarro, F.; Michniuk, K. Stock market trading rule based on pattern recognition and technical analysis: Forecasting the DJIA index with intraday data. Expert Syst. Appl. 2015, 42, 5963–5975.
- Arévalo, R.; García, J.; Guijarro, F.; Peris, A. A dynamic trading rule based on filtered flag pattern recognition for stock market price forecasting. Expert Syst. Appl. 2017, 81, 177–192.
- Huang, W.; Nakamori, Y.; Wang, S.Y. Forecasting stock market movement direction with support vector machine. Comput. Oper. Res. 2005, 32, 2513–2522.
- Shen, S.; Jiang, H.; Zhang, T. Stock Market Forecasting Using Machine Learning Algorithms. Master’s Thesis, Department of Electrical Engineering, Stanford University, Stanford, CA, USA, 2012.
- Lohrmann, C.; Luukka, P. Classification of intraday S&P500 returns with a random forest. Int. J. Forecast. 2019, 35, 390–407.
- Dey, S.; Kumar, Y.; Saha, S.; Basak, S. Forecasting to classification: Predicting the direction of stock market price using Xtreme Gradient Boosting. Working Paper, 2016.
- Ampomah, E.K.; Qin, Z.; Nyame, G. Evaluation of tree-based ensemble machine learning models in predicting stock price direction of movement. Information 2020, 11, 332.
- Ballings, M.; Van den Poel, D.; Hespeels, N.; Gryp, R. Evaluating multiple classifiers for stock price direction prediction. Expert Syst. Appl. 2015, 42, 7046–7056.
- Basak, S.; Kar, S.; Saha, S.; Khaidem, L.; Dey, S.R. Predicting the direction of stock market prices using tree-based classifiers. North Am. J. Econ. Financ. 2018, 47, 552–567.
- Khan, A.H.; Shah, A.; Ali, A.; Shahid, R.; Zahid, Z.U.; Sharif, M.U.; Jan, T.; Zafar, M.H. A performance comparison of machine learning models for stock market prediction with novel investment strategy. PLoS ONE 2023, 18, e0286362.
- Pagliaro, A. Forecasting Significant Stock Market Price Changes Using Machine Learning: Extra Trees Classifier Leads. Electronics 2023, 12, 4551.
- Qiu, M.; Song, Y. Predicting the direction of stock market index movement using an optimized artificial neural network model. PLoS ONE 2016, 11, e0155133.
- Moghaddam, A.H.; Moghaddam, M.H.; Esfandyari, M. Stock market index prediction using artificial neural network. J. Econ. Financ. Adm. Sci. 2016, 21, 89–93.
- Di Persio, L.; Honchar, O. Recurrent neural networks approach to the financial forecast of Google assets. Int. J. Math. Comput. Simul. 2017, 11, 7–13.
- Sezer, O.B.; Ozbayoglu, A.M. Algorithmic financial trading with deep convolutional neural networks: Time series to image conversion approach. Appl. Soft Comput. 2018, 70, 525–538.
- Wu, H.; Chen, S.; Ding, Y. Comparison of ARIMA and LSTM for stock price prediction. Financ. Eng. Risk Manag. 2023, 6, 01026.
- Powell, N.; Foo, S.Y.; Weatherspoon, M. Supervised and unsupervised methods for stock trend forecasting. In Proceedings of the 2008 40th Southeastern Symposium on System Theory (SSST), New Orleans, LA, USA, 16–18 March 2008; pp. 203–205.
- Wu, K.P.; Wu, Y.P.; Lee, H.M. Stock trend prediction by using K-means and AprioriAll algorithm for sequential chart pattern mining. J. Inf. Sci. Eng. 2014, 30, 669–686.
- Babu, M.S.; Geethanjali, N.; Satyanarayana, B. Clustering approach to stock market prediction. Int. J. Adv. Netw. Appl. 2012, 3, 1281.
- Patel, J.; Shah, S.; Thakkar, P.; Kotecha, K. Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Syst. Appl. 2015, 42, 259–268.
- Chong, E.; Han, C.; Park, F.C. Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies. Expert Syst. Appl. 2017, 83, 187–205.
- Schumaker, R.P.; Chen, H. Textual analysis of stock market prediction using breaking financial news: The AZFin text system. ACM Trans. Inf. Syst. (Tois) 2009, 27, 12. [Google Scholar] [CrossRef]
- Kalyanaraman, V.; Kazi, S.; Tondulkar, R.; Oswal, S. Sentiment analysis on news articles for stocks. In Proceedings of the 2014 8th Asia Modelling Symposium, Taipei, Taiwan, 23–25 September 2014; pp. 57–62. [Google Scholar]
- Lee, H.; Surdeanu, M.; MacCartney, B.; Jurafsky, D. On the importance of text analysis for stock price prediction. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC), Reykjavik, Iceland, 26–31 May 2014; pp. 1170–1175. [Google Scholar]
- Pagolu, V.S.; Reddy, K.N.; Panda, G.; Majhi, B. Sentiment analysis of Twitter data for predicting stock market movements. In Proceedings of the 2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES), Paralakhemundi, India, 3–5 October 2016; pp. 1345–1350. [Google Scholar]
- Preis, T.; Moat, H.S.; Stanley, H.E. Quantifying trading behavior in financial markets using Google Trends. Sci. Rep. 2013, 3, 1684. [Google Scholar] [CrossRef]
- Ren, R.; Wu, D.D.; Liu, T. Forecasting stock market movement direction using sentiment analysis and support vector machine. IEEE Syst. J. 2018, 13, 760–770. [Google Scholar] [CrossRef]
- Wang, J.J.; Wang, J.Z.; Zhang, Z.G.; Guo, S.P. Stock index forecasting based on a hybrid model. Omega 2012, 40, 758–766. [Google Scholar] [CrossRef]
- Rather, A.M.; Agarwal, A.; Sastry, V.N. Recurrent neural network and a hybrid model for prediction of stock returns. Expert Syst. Appl. 2015, 42, 3234–3241. [Google Scholar] [CrossRef]
- Lv, D.; Yuan, S.; Li, M.; Xiang, Y. An empirical study of machine learning algorithms for stock daily trading strategy. Math. Probl. Eng. 2019. [Google Scholar] [CrossRef]
- Yoshihara, A.; Fujikawa, K.; Seki, K.; Uehara, K. Predicting stock market trends by recurrent deep neural networks. In Proceedings of the Pacific Rim International Conference on Artificial Intelligence, Gold Coast, Australia, 1–5 December 2014; pp. 759–769. [Google Scholar]
- Wang, T.; Guo, J.; Shan, Y.; Zhu, Y. A knowledge graph–GCN–community detection integrated model for large-scale stock price prediction. Appl. Soft Comput. 2023, 145, 110595. [Google Scholar] [CrossRef]
- Zhang, F. Conceptual-temporal graph convolutional neural network model for stock price movement prediction and application. Soft Comput. 2023, 27, 6329–6344. [Google Scholar]
- Jang, J.; Seong, N. Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory. Expert Syst. Appl. 2023, 218, 119556. [Google Scholar] [CrossRef]
- Wu, J.L.; Tang, X.R.; Hsu, C.H. A prediction model of stock market trading actions using generative adversarial network and piecewise linear representation approaches. Soft Comput. 2023, 27, 8209–8222. [Google Scholar] [CrossRef]
- Harvey, C.R. Presidential address: The scientific outlook in financial economics. J. Financ. 2017, 72, 1399–1440. [Google Scholar] [CrossRef]
- Lo, A.W.; MacKinlay, A.C. When are contrarian profits due to stock market overreaction? Rev. Financ. Stud. 1990, 3, 175–205. [Google Scholar] [CrossRef]
- Baker, M.; Wurgler, J. Investor sentiment and the cross-section of stock returns. J. Financ. 2006, 61, 1645–1680. [Google Scholar] [CrossRef]
- Gu, S.; Kelly, B.; Xiu, D. Empirical asset pricing via machine learning. Rev. Financ. Stud. 2020, 33, 2223–2273. [Google Scholar] [CrossRef]
- Feng, F.; Chen, H.; He, X.; Ding, J.; Sun, M.; Chua, T.S. Enhancing stock movement prediction with adversarial training. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019; pp. 5843–5849. [Google Scholar]
- Diebold, F.X.; Mariano, R.S. Comparing predictive accuracy. J. Bus. Econ. Stat. 1995, 13, 253–263. [Google Scholar] [CrossRef]
- Hansen, P.R.; Lunde, A.; Nason, J.M. The model confidence set. Econometrica 2011, 79, 453–497. [Google Scholar] [CrossRef]
- White, H. A reality check for data snooping. Econometrica 2000, 68, 1097–1126. [Google Scholar] [CrossRef]
- Hansen, P.R. A test for superior predictive ability. J. Bus. Econ. Stat. 2005, 23, 365–380. [Google Scholar] [CrossRef]
- Giacomini, R.; White, H. Tests of conditional predictive ability. Econometrica 2006, 74, 1545–1578. [Google Scholar] [CrossRef]
- Politis, D.N.; Romano, J.P. Multivariate density estimation with general flat-top kernels of infinite order. J. Multivar. Anal. 1999, 68, 1–25. [Google Scholar] [CrossRef]
- Campbell, J.Y.; Thompson, S.B. Predicting excess stock returns out of sample: Can anything beat the historical average? Rev. Financ. Stud. 2008, 21, 1509–1531. [Google Scholar] [CrossRef]
- Benjamini, Y.; Hochberg, Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. (Methodological) 1995, 57, 289–300. [Google Scholar] [CrossRef]
- Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 1979, 6, 65–70. [Google Scholar]
- Romano, J.P.; Shaikh, A.M.; Wolf, M. Control of the false discovery rate under dependence using the bootstrap and subsampling. Test 2007, 17, 417. [Google Scholar] [CrossRef]
- Harvey, C.R.; Liu, Y.; Zhu, H. … and the cross-section of expected returns. Rev. Financ. Stud. 2016, 29, 5–68. [Google Scholar]
- Andrews, D.W. Tests for parameter instability and structural change with unknown change point. Econometrica 1993, 61, 821–856. [Google Scholar] [CrossRef]
- Bai, J.; Perron, P. Estimating and testing linear models with multiple structural changes. Econometrica 1998, 66, 47–78. [Google Scholar] [CrossRef]
- Primiceri, G.E. Time varying structural vector autoregressions and monetary policy. Rev. Econ. Stud. 2005, 72, 821–852. [Google Scholar] [CrossRef]
- Hamilton, J.D. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 1989, 57, 357–384. [Google Scholar] [CrossRef]
- Pesaran, M.H.; Timmermann, A. Selection of estimation window in the presence of breaks. J. Econom. 2007, 137, 134–161. [Google Scholar] [CrossRef]
- Brock, W.; Lakonishok, J.; LeBaron, B. Simple technical trading rules and the stochastic properties of stock returns. J. Financ. 1992, 47, 1731–1764. [Google Scholar] [CrossRef]
- Sullivan, R.; Timmermann, A.; White, H. Data-snooping, technical trading rule performance, and the bootstrap. J. Financ. 1999, 54, 1647–1691. [Google Scholar] [CrossRef]
- Lo, A.W.; Mamaysky, H.; Wang, J. Foundations of technical analysis: Computational algorithms, statistical inference, and empirical implementation. J. Financ. 2000, 55, 1705–1765. [Google Scholar] [CrossRef]
- Rapach, D.E.; Strauss, J.K.; Zhou, G. Out-of-sample equity premium prediction: Combination forecasts and links to the real economy. Rev. Financ. Stud. 2010, 23, 821–862. [Google Scholar] [CrossRef]
- Bradshaw, M.T.; Drake, M.S.; Myers, J.N.; Myers, L.A. A re-examination of analysts’ superiority over time-series forecasts of annual earnings. Rev. Account. Stud. 2012, 17, 944–968. [Google Scholar] [CrossRef]
- Christoffersen, P.; Jacobs, K.; Chang, B.Y. Forecasting with option-implied information. Handb. Econ. Forecast. 2013, 2, 581–656. [Google Scholar]
- Cheng, I.H.; Xiong, W. The financialization of commodity markets. Annu. Rev. Financ. Econ. 2013, 5, 419–441. [Google Scholar]
- English, W.B.; Van den Heuvel, S.J.; Zakrajšek, E. Interest rate risk and bank equity valuations. J. Monet. Econ. 2018, 98, 80–97. [Google Scholar] [CrossRef]
- Pastor, L.; Veronesi, P. Technological revolutions and stock prices. Am. Econ. Rev. 2009, 99, 1451–1483. [Google Scholar] [CrossRef]
- Lo, A.W. The adaptive markets hypothesis: Market efficiency from an evolutionary perspective. J. Portf. Manag. 2004, 30, 15–29. [Google Scholar] [CrossRef]
- Jegadeesh, N.; Titman, S. Returns to buying winners and selling losers: Implications for stock market efficiency. J. Financ. 1993, 48, 65–91. [Google Scholar] [CrossRef]
- Bhattacharya, D.; Li, W.H.; Sonaer, G. Has momentum lost its momentum? Rev. Quant. Financ. Account. 2020, 55, 1145–1179. [Google Scholar] [CrossRef]
- Bernard, V.L.; Thomas, J.K. Post-earnings-announcement drift: Delayed price response or risk premium? J. Account. Res. 1989, 27, 1–36. [Google Scholar] [CrossRef]
- Ke, B.; Ramalingegowda, S. Do institutional investors exploit the post-earnings announcement drift? J. Account. Econ. 2005, 39, 25–53. [Google Scholar] [CrossRef]
- Zhang, C.Y.; Jacobsen, B. The Halloween indicator, “Sell in May and go away”: Everywhere and all the time. J. Int. Money Financ. 2021, 110, 102268. [Google Scholar] [CrossRef]
- Hou, K.; Xue, C.; Zhang, L. Replicating anomalies. Rev. Financ. Stud. 2020, 33, 2019–2133. [Google Scholar] [CrossRef]
- Chen, A.Y.; Zimmermann, T. Publication bias and the cross-section of stock returns. Rev. Asset Pricing Stud. 2022, 12, 454–488. [Google Scholar]
- Lakonishok, J.; Smidt, S. Are seasonal anomalies real? A ninety-year perspective. Rev. Financ. Stud. 1988, 1, 403–425. [Google Scholar] [CrossRef]
- Schwert, G.W. Anomalies and market efficiency. Handb. Econ. Financ. 2003, 1, 939–974. [Google Scholar]
- White, H. Economic prediction using neural networks: The case of IBM daily stock returns. In Proceedings of the IEEE 1988 International Conference on Neural Networks, San Diego, CA, USA, 24-27 July 1988; pp. 451–458. [Google Scholar]
- Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
- Tetlock, P.C. Giving content to investor sentiment: The role of media in the stock market. J. Financ. 2007, 62, 1139–1168. [Google Scholar] [CrossRef]
- Loughran, T.; McDonald, B. When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J. Financ. 2011, 66, 35–65. [Google Scholar] [CrossRef]
- López de Prado, M. Advances in Financial Machine Learning; John Wiley & Sons: Hoboken, NJ, USA, 2018. [Google Scholar]
- Bailey, D.H.; Borwein, J.M.; López de Prado, M.; Zhu, Q.J. Pseudo-mathematics and financial charlatanism: The effects of backtest overfitting on out-of-sample performance. Not. Am. Math. Soc. 2014, 61, 458–471. [Google Scholar] [CrossRef]
- Boehmer, E.; Jones, C.M.; Zhang, X. Potential pilot problems: Treatment spillovers in financial regulatory experiments. J. Financ. Econ. 2019, 134, 355–373. [Google Scholar]
- Zhang, Z.; Zohren, S.; Roberts, S. DeepLOB: Deep convolutional neural networks for limit order books. IEEE Trans. Signal Process. 2019, 67, 3001–3012. [Google Scholar] [CrossRef]
- Urquhart, A. The inefficiency of Bitcoin. Econ. Lett. 2016, 148, 80–82. [Google Scholar] [CrossRef]
- Cao, J.; Chen, J.; Hull, J.C. A neural network approach to understanding implied volatility movements. Quant. Financ. 2020, 20, 781–797. [Google Scholar] [CrossRef]
- Huck, N. Large data sets and machine learning: Applications to statistical arbitrage. Eur. J. Oper. Res. 2019, 278, 330–342. [Google Scholar] [CrossRef]
- Frazzini, A.; Israel, R.; Moskowitz, T.J. Trading costs. J. Financ. Econ. 2018, 138, 293–316. [Google Scholar] [CrossRef]
- Bartram, S.M.; Branke, J.; Motahari, M. Artificial Intelligence in Asset Management; CFA Institute Research Foundation: Charlottesville, VA, 2021. [Google Scholar]
- Daul, S.; Jaisson, T.; Nagy, A. Performance attribution of machine learning methods for stock returns prediction. J. Financ. Data Sci. 2022, 8, 86–104. [Google Scholar] [CrossRef]
Study | Statistical Significance | Economic Significance | Evaluation After Costs |
---|---|---|---|
Lo and MacKinlay (1990) [69] | Strong rejection of random walk (p < 0.001) | 12% annual excess returns | Reduced to 3–4% after transaction costs |
Baker and Wurgler (2006) [70] | Sentiment index significant at p < 0.01 | Predicted 1.3% monthly spread | Not evaluated after implementation costs |
Gu et al. (2020) [71] | Neural networks outperform at p < 0.01 | Sharpe ratio of 0.9 for neural nets | Sharpe ratio dropped to 0.4 with transaction costs |
Feng et al. (2019) [72] | Deep learning LSTM significant (p < 0.001) | 30% improvement in directional accuracy | 14% profit after costs, lower than buy-and-hold in bull market |
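The gap between gross and net performance reported in the rows above can be illustrated with a small calculation. The following Python sketch is an illustration only and does not reproduce the procedure of any cited study. It assumes daily excess returns, a proportional cost model (turnover multiplied by a fixed cost per unit traded), and purely hypothetical values for the cost (3 basis points) and turnover (full daily rebalancing). The return series is simulated, and the function names `annualized_sharpe` and `net_returns` are introduced here for illustration.

```python
import math
import random


def annualized_sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period excess returns (sample std, ddof=1)."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def net_returns(gross, turnover, cost_per_unit):
    """Subtract a proportional trading cost (turnover * cost per unit) from each period."""
    return [g - t * cost_per_unit for g, t in zip(gross, turnover)]


# Simulated data, NOT taken from any study: ten years of daily excess returns.
random.seed(0)
gross = [random.gauss(0.0006, 0.01) for _ in range(2520)]

# Hypothetical assumptions: the whole portfolio is traded every day at 3 bps.
turnover = [1.0] * len(gross)
net = net_returns(gross, turnover, cost_per_unit=0.0003)

sharpe_gross = annualized_sharpe(gross)
sharpe_net = annualized_sharpe(net)
print(f"Sharpe before costs: {sharpe_gross:.2f}")
print(f"Sharpe after costs:  {sharpe_net:.2f}")
```

Because a constant cost lowers the mean return without changing its volatility, the net Sharpe ratio is always below the gross one under this cost model; high-turnover strategies are penalized in proportion to how often they trade.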
Dimension | Metrics/Methods | Key Considerations |
---|---|---|
Statistical Validity | | |
Effect Magnitude | | |
Out-of-Sample Validation | | |
Robustness Across Regimes | | |
Parameter Sensitivity | | |
Implementation Feasibility | | |
Study | Model Type | Dataset/Market | Directional Accuracy (%) | Sharpe Ratio | Returns (%) | Transaction Costs Included | Key Findings |
---|---|---|---|---|---|---|---|
Pagliaro (2023) [43] | Extra Trees Classifier | S&P 500 | 86.1 | 1.93 | 14.35 | Yes | Extra Trees outperformed Random Forest (73%) for 10-day windows |
Fischer and Krauss (2018) [6] | LSTM | S&P 500 constituents | 53.2 | 0.77 | 45.93 | Yes | LSTM outperformed DNN, Random Forest, and logistic regression |
Khan et al. (2023) [42] | Random Forest | NASDAQ 100 | 91.27 | 1.62 | 20.38 | Yes | 15 min intervals provided optimal prediction window |
Hu et al. (2018) [25] | BPNN with Google Trends | S&P 500 and DJIA | 86.81 | 1.36 | 19.63 | Yes | Google Trends data significantly improved prediction accuracy |
Dey et al. (2016) [38] | XGBoost | Apple and Yahoo stocks | 87–99 | N/A | 32.46 | No | XGBoost showed superior accuracy for long-term prediction |
Bollen et al. (2011) [22] | Self-Organizing Fuzzy Neural Network | DJIA | 87.6 | 1.28 | 15.27 | Yes | Twitter sentiment analysis improved prediction accuracy |
Ballings et al. (2015) [40] | Ensemble methods | European and US stocks | 68.2 | 0.82 | 9.68 | Yes | Random Forest consistently outperformed single classifiers |
Ding et al. (2015) [23] | Neural Tensor Network + Deep CNN | S&P 500 | 64.21 | 0.63 | 6.89 | No | Event embeddings improved index prediction by 6% |
Wang et al. (2023) [64] | Knowledge Graph + GCN | Chinese A-shares | 73.8 | 1.43 | 17.62 | Yes | Graph-based approaches captured inter-stock relationships |
Wu et al. (2023) [48] | LSTM vs. ARIMA | S&P 500 constituents | 62.3 (LSTM); 58.1 (ARIMA) | 0.72 (LSTM); 0.51 (ARIMA) | 8.3 (LSTM); 5.9 (ARIMA) | Yes | LSTM showed an advantage for short-term prediction; ARIMA was comparable for long-term forecasts |
Jang and Seong (2023) [66] | Deep Reinforcement Learning (DDPG) | S&P 500 | N/A | 1.76 | 21.35 | Yes | RL-based portfolio optimization outperformed benchmark indices |
Sezer et al. (2018) [47] | CNN (image-based) | BIST 100 Index (Turkey) | 72.5 | 0.91 | 10.75 | Yes | Image representation of financial time series improved pattern recognition |
Method/Study | Original Claim | Replication Outcome |
---|---|---|
Calendar Effects [106] | January effect provides excess returns of 3% | Schwert [107] found that the effect disappeared post-publication |
Technical Analysis [89] | Moving average strategies generate significant abnormal returns | Sullivan et al. [90] found no significance after multiple testing correction |
Neural Networks [108] | ANNs predict IBM daily stock returns | Refuted by subsequent studies with out-of-sample testing [109] |
Sentiment Analysis [110] | Media pessimism predicts market downturns | Loughran and McDonald [111] showed sensitivity to lexicon choice |
Deep Learning [6] | LSTM outperforms classic models | López de Prado [112] showed results sensitive to data preparation |
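Several of the replication outcomes above turn on multiple testing: a rule that looks significant in isolation may not survive once the number of rules tried is taken into account. The Python sketch below illustrates the idea with the Holm and Benjamini–Hochberg procedures listed in the references. It is a simplified stand-in, not the bootstrap-based tests used in the cited replication studies, and the p-values are hypothetical numbers chosen for illustration.

```python
def holm(p_values, alpha=0.05):
    """Holm step-down procedure: controls the family-wise error rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        # Compare the rank-th smallest p-value with alpha / (m - rank + 1).
        if p_values[idx] <= alpha / (m - rank + 1):
            rejected[idx] = True
        else:
            break  # stop at the first non-rejection
    return rejected


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: controls the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value is at most (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for ten candidate trading rules (illustrative only).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

naive = [p < 0.05 for p in p_values]
print("Uncorrected rejections:       ", sum(naive))
print("Holm rejections:              ", sum(holm(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(p_values)))
```

With these illustrative inputs, five rules pass an uncorrected 5% threshold, two survive the Benjamini–Hochberg correction, and one survives the more conservative Holm correction.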
Model Type | Training Time | Inference Latency | Memory Requirements | Hardware Acceleration |
---|---|---|---|---|
Statistical (ARIMA, ESM) | Very Low | Very Low | Minimal | Not Required |
Decision Trees | Low | Very Low | Low | Not Required |
Random Forest/Extra Trees | Medium | Low | Medium | Beneficial |
Gradient Boosting (XGBoost) | Medium | Low | Medium | Beneficial |
Support Vector Machines | Medium-High | Medium | Medium | Beneficial |
Shallow Neural Networks | Medium | Low | Medium | Beneficial |
Convolutional Neural Networks | High | Medium | High | Required |
LSTM/RNN | Very High | High | High | Required |
Transformer-based Models | Extremely High | High | Very High | Required |
Graph Neural Networks | Very High | High | Very High | Required |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Pagliaro, A. Artificial Intelligence vs. Efficient Markets: A Critical Reassessment of Predictive Models in the Big Data Era. Electronics 2025, 14, 1721. https://doi.org/10.3390/electronics14091721