Enhancing Exchange-Traded Fund Price Predictions: Insights from Information-Theoretic Networks and Node Embeddings
Abstract
1. Introduction
2. Data and Methodology
2.1. Prior Research
2.2. Data
3. Methodology
3.1. Mutual Information
3.1.1. Mutual Information
3.1.2. Normalized Mutual Information (NMI)
3.2. Transfer Entropy (TE)
3.3. Test for Obtaining p-Values of MI and TE
3.4. Network Analysis
3.4.1. Network Theory
3.4.2. Centrality Measures
3.4.3. Node Embedding Algorithm and Dimensionality Reduction
Role2vec [69]
FEATHER [70]
UMAP [71]
- Topological Data Analysis: Used to understand the high-dimensional structure of the data.
- Fuzzy Simplicial Sets: Used to approximate the manifold the data resides on, providing both local and global preservation.
- Riemannian Geometry: Used to accurately measure distances and maintain data relationships (a minimal usage sketch follows this list).
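As an illustration, the following is a minimal sketch of applying UMAP to reduce node-embedding vectors to two dimensions. The `umap-learn` call is standard, but the synthetic embedding matrix and the parameter values shown are illustrative assumptions, not the configuration used in this study.

```python
# Minimal sketch: reducing high-dimensional node embeddings with UMAP.
# Assumptions: the umap-learn package is installed; the embedding matrix and the
# n_neighbors / min_dist / n_components values are illustrative only.
import numpy as np
import umap

rng = np.random.default_rng(42)
node_embeddings = rng.normal(size=(9, 128))   # e.g., 9 sector ETFs x 128-dim Role2vec/FEATHER vectors

reducer = umap.UMAP(n_neighbors=5, min_dist=0.1, n_components=2, random_state=42)
low_dim = reducer.fit_transform(node_embeddings)  # shape: (9, 2)
print(low_dim)
```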
3.5. Machine Learning Algorithms and xAI (eXplainable Artificial Intelligence) Techniques
- Interpretability: Tree-based models, at their core, make decisions based on certain conditions, making them more interpretable than many deep learning models. This interpretability is vital in financial sectors, where understanding the reasons behind predictions can be as critical as the predictions themselves.
- Handling of Mixed Data Types: Financial datasets often consist of numerical and categorical data. Tree-based models like CatBoost are particularly effective at handling categorical variables without extensive preprocessing.
- Automatic Feature Selection: These models inherently perform feature selection. As a result, they can identify and prioritize the most essential features, which is particularly useful in financial datasets with potentially redundant or less impactful variables.
- Resistance to Overfitting: With techniques such as gradient boosting and regularization in models like XGBoost and LightGBM, tree models exhibit resistance to overfitting, especially when appropriately tuned.
- Flexibility: These models readily capture nonlinear relationships in the data, which are common in financial time series. Traditional linear models often fail to capture such nonlinearity.
- Efficiency and Scalability: Models like LightGBM and CatBoost have been designed with efficiency in mind. They can handle large datasets, making them suitable for comprehensive financial data.
- Consistency in Results: While deep learning models like RNNs can be powerful, they require more meticulous fine-tuning and can produce inconsistent results due to their complex architectures. In contrast, well-tuned tree models provide more consistent predictions.
- End-to-End Modeling: These models do not necessarily require extensive data preprocessing or normalization, making the modeling process more straightforward and sometimes more accurate since no information is lost in preprocessing (a minimal training sketch follows this list).
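As an illustration of this workflow, the sketch below fits the three gradient-boosting classifiers used in this study on a synthetic binary direction-prediction task. The data, split, and hyperparameters are placeholder assumptions, not the configuration reported in the paper.

```python
# Minimal sketch: fitting XGBoost, LightGBM, and CatBoost classifiers on a
# synthetic binary (up/down) prediction task. Data and hyperparameters are
# illustrative assumptions only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))            # placeholder features (e.g., network-driven features)
y = (rng.random(1000) > 0.5).astype(int)   # placeholder next-day up/down labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)

models = {
    "XGBoost": XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.05),
    "LightGBM": LGBMClassifier(n_estimators=200, max_depth=4, learning_rate=0.05),
    "CatBoost": CatBoostClassifier(iterations=200, depth=4, learning_rate=0.05, verbose=False),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print(f"{name}: accuracy = {accuracy_score(y_te, pred):.4f}")
```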
3.5.1. XGBoost
3.5.2. Light Gradient Boosting Machine (LightGBM)
3.5.3. CatBoost
3.5.4. SHAP (Shapley Additive Explanation)
3.5.5. Performance Metrics for the Classification Problem
4. Results
4.1. Exploratory Data Analysis (EDA)
4.2. Prediction Results
4.2.1. Prediction Performance
4.2.2. Feature Importance of Causal Network-Related Features
5. Discussion
6. Conclusions
- The utilization of information entropy-based measures to discern and showcase the underlying relationships in U.S. stock market sector indices.
- The illustration of nonlinear dependencies and causal relationships in the U.S. market sector index networks, an area that has so far received little in-depth attention.
- The revelation that nonlinear dependencies and causal relationships can significantly contribute to predictive models, shedding light on new avenues in market forecasting.
- Empirical evidence that return-based data bolster prediction results when the relationships embedded in the return and trading volume networks are exploited, offering a promising direction for enhancing data efficiency by leveraging inter-sectoral relationships without additional external features.
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
1. MSCI. Sector Indexes—Expanding Investor’s Toolkit. 2022. Available online: https://www.msci.com/our-solutions/indexes/sector-indexes (accessed on 30 October 2023).
2. Leung, T.; Zhao, T. Multiscale Decomposition and Spectral Analysis of Sector ETF Price Dynamics. J. Risk Financ. Manag. 2021, 14, 464.
3. Krause, T.; Tse, Y. Volatility and return spillovers in Canadian and US industry ETFs. Int. Rev. Econ. Financ. 2013, 25, 244–259.
4. Hernandez, J.A.; Al Janabi, M.A.; Hammoudeh, S.; Khuong Nguyen, D. Time lag dependence, cross-correlation and risk analysis of US energy and non-energy stock portfolios. J. Asset Manag. 2015, 16, 467–483.
5. Dutta, A. Oil and energy sector stock markets: An analysis of implied volatility indexes. J. Multinatl. Financ. Manag. 2018, 44, 61–68.
6. Shahzad, S.J.H.; Bouri, E.; Kristoufek, L.; Saeed, T. Impact of the COVID-19 outbreak on the US equity sectors: Evidence from quantile return spillovers. Financ. Innov. 2021, 7, 1–23.
7. Khan, M.A.; Hernandez, J.A.; Shahzad, S.J.H. Time and frequency relationship between household investors’ sentiment index and US industry stock returns. Financ. Res. Lett. 2020, 36, 101318.
8. Matos, P.; Costa, A.; da Silva, C. COVID-19, stock market and sectoral contagion in US: A time-frequency analysis. Res. Int. Bus. Financ. 2021, 57, 101400.
9. Wan, X.; Yang, J.; Marinov, S.; Calliess, J.P.; Zohren, S.; Dong, X. Sentiment correlation in financial news networks and associated market movements. Sci. Rep. 2021, 11, 1–12.
10. Shahzad, S.J.H.; Mensi, W.; Hammoudeh, S.; Balcilar, M.; Shahbaz, M. Distribution specific dependence and causality between industry-level US credit and stock markets. J. Int. Financ. Mark. Inst. Money 2018, 52, 114–133.
11. Choi, I.; Kim, W.C. Detecting and Analyzing Politically-Themed Stocks Using Text Mining Techniques and Transfer Entropy—Focus on the Republic of Korea’s Case. Entropy 2021, 23, 734.
12. Jin, Z.; Guo, K. The dynamic relationship between stock market and macroeconomy at sectoral level: Evidence from Chinese and US stock market. Complexity 2021, 2021, 1–16.
13. Mensi, W.; Al Rababa’a, A.R.; Alomari, M.; Vo, X.V.; Kang, S.H. Dynamic frequency volatility spillovers and connectedness between strategic commodity and stock markets: US-based sectoral analysis. Resour. Policy 2022, 79, 102976.
14. Beer, K.; Bondarenko, D.; Farrelly, T.; Osborne, T.J.; Salzmann, R.; Scheiermann, D.; Wolf, R. Training deep quantum neural networks. Nat. Commun. 2020, 11, 808.
15. Chen, Q.; Zhang, W.; Lou, Y. Forecasting stock prices using a hybrid deep learning model integrating attention mechanism multilayer perceptron and bidirectional long-short term memory neural network. IEEE Access 2020, 8, 117365–117376.
16. Chen, S.; Zhou, C. Stock prediction based on genetic algorithm feature selection and long short-term memory neural network. IEEE Access 2020, 9, 9066–9072.
17. Althelaya, K.A.; Mohammed, S.A.; El-Alfy, E.S.M. Combining deep learning and multiresolution analysis for stock market forecasting. IEEE Access 2021, 9, 13099–13111.
18. Aldhyani, T.H.; Alzahrani, A. Framework for predicting and modeling stock market prices based on deep learning algorithms. Electronics 2022, 11, 3149.
19. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
20. Granger, C.W. Investigating causal relations by econometric models and cross-spectral methods. Econom. J. Econom. Soc. 1969, 37, 424–438.
21. Geweke, J. Measurement of linear dependence and feedback between multiple time series. J. Am. Stat. Assoc. 1982, 77, 304–313.
22. Schreiber, T. Measuring information transfer. Phys. Rev. Lett. 2000, 85, 461.
23. Boba, P.; Bollmann, D.; Schoepe, D.; Wester, N.; Wiesel, J.; Hamacher, K. Efficient computation and statistical assessment of transfer entropy. Front. Phys. 2015, 3, 10.
24. Fiedor, P. Networks in Financial Markets Based on the Mutual Information Rate. Phys. Rev. E 2014, 89, 052801.
25. You, T.; Fiedor, P.; Holda, A. Network Analysis of the Shanghai Stock Exchange Based on Partial Mutual Information. J. Risk Financ. Manag. 2015, 8, 266–284.
26. Barbi, A.Q.; Prataviera, G.A. Nonlinear Dependencies on Brazilian Equity Network From Mutual Information Minimum Spanning Trees. Phys. A Stat. Mech. Its Appl. 2019, 523, 876–885.
27. Han, D. Network Analysis of the Chinese Stock Market during the Turbulence of 2015–2016 Using Log–Returns, Volumes and Mutual Information. Phys. A Stat. Mech. Its Appl. 2019, 523, 1091–1109.
28. Jiang, J.; Shang, P.; Li, X. An Effective Stock Classification Method via MDS Based on Modified Mutual Information Distance. Fluct. Noise Lett. 2020, 19, 2050018.
29. Yan, Y.; Wu, B.; Tian, T.; Zhang, H. Development of Stock Networks Using Part Mutual Information and Australian Stock Market Data. Entropy 2020, 22, 773.
30. Lahmiri, S.; Bekiros, S. Rényi Entropy and Mutual Information Measurement of Market Expectations and Investor Fear during the COVID-19 Pandemic. Chaos Solitons Fractals 2020, 139, 110084.
31. Kvålseth, T.O. Entropy and Correlation: Some Comments. IEEE Trans. Syst. Man Cybern. 1987, 17, 517–519.
32. Banerjee, A.; Dhillon, I.S.; Ghosh, J.; Sra, S. Clustering on the Unit Hypersphere Using Von Mises-Fisher Distributions. J. Mach. Learn. Res. 2005, 6, 1345–1382.
33. Kraskov, A.; Stögbauer, H.; Andrzejak, R.G.; Grassberger, P. Hierarchical Clustering Using Mutual Information. Europhys. Lett. 2005, 70, 278.
34. Vinh, N.X.; Epps, J.; Bailey, J. Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance. J. Mach. Learn. Res. 2010, 11, 2837–2854.
35. Kvålseth, T.O. On Normalized Mutual Information: Measure Derivations and Properties. Entropy 2017, 19, 631.
36. Dimpfl, T.; Peter, F.J. Using Transfer Entropy to Measure Information Flows between Financial Markets. Stud. Nonlinear Dyn. Econom. 2013, 17, 85–102.
37. Sandoval, L. Structure of a Global Network of Financial Companies Based on Transfer Entropy. Entropy 2014, 16, 4443–4482.
38. Sensoy, A.; Sobaci, C.; Sensoy, S.; Alali, F. Effective Transfer Entropy Approach to Information Flow between Exchange Rates and Stock Markets. Chaos Solitons Fractals 2014, 68, 180–185.
39. Bekiros, S.; Nguyen, D.K.; Junior, L.S.; Uddin, G.S. Information Diffusion, Cluster Formation and Entropy-Based Network Dynamics in Equity and Commodity Markets. Eur. J. Oper. Res. 2017, 256, 945–961.
40. Lim, K.; Kim, S.; Kim, S.Y. Information Transfer Across Intra/Inter-Structure of CDS and Stock Markets. Phys. A Stat. Mech. Appl. 2017, 486, 118–126.
41. Bron, C.; Kerbosch, J. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM 1973, 16, 575–577.
42. Freeman, L.C. A set of measures of centrality based on betweenness. Sociometry 1977, 40, 35–41.
43. Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. 1978, 1, 215–239.
44. Bonacich, P. Power and centrality: A family of measures. Am. J. Sociol. 1987, 92, 1170–1182.
45. Stephenson, K.; Zelen, M. Rethinking centrality: Methods and examples. Soc. Netw. 1989, 11, 1–37.
46. Wasserman, S.; Faust, K. Social Network Analysis: Methods and Applications; Cambridge University Press: Cambridge, UK, 1994.
47. Kleinberg, J.M. Authoritative sources in a hyperlinked environment. J. ACM 1999, 46, 604–632.
48. Page, L.; Brin, S.; Motwani, R.; Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web. 1999. Available online: https://www.cis.upenn.edu/~mkearns/teaching/NetworkedLife/pagerank.pdf (accessed on 30 October 2023).
49. Brandes, U. A faster algorithm for betweenness centrality. J. Math. Sociol. 2001, 25, 163–177.
50. Barrat, A.; Barthelemy, M.; Pastor-Satorras, R.; Vespignani, A. The architecture of complex weighted networks. Proc. Natl. Acad. Sci. USA 2004, 101, 3747–3752.
51. Tomita, E.; Tanaka, A.; Takahashi, H. The worst-case time complexity for generating all maximal cliques. In Proceedings of the 10th Annual International Conference, COCOON 2004, Jeju Island, Republic of Korea, 17–20 August 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 161–170.
52. Borgatti, S.P. Centrality and network flow. Soc. Netw. 2005, 27, 55–71.
53. Brandes, U.; Fleischer, D. Centrality measures based on current flow. In Annual Symposium on Theoretical Aspects of Computer Science; Springer: Berlin/Heidelberg, Germany, 2005; pp. 533–544.
54. Langville, A.N.; Meyer, C.D. A survey of eigenvector methods for web information retrieval. SIAM Rev. 2005, 47, 135–161.
55. Newman, M.E. A measure of betweenness centrality based on random walks. Soc. Netw. 2005, 27, 39–54.
56. Borgatti, S.P.; Everett, M.G. A graph-theoretic perspective on centrality. Soc. Netw. 2006, 28, 466–484.
57. Brandes, U.; Pich, C. Centrality estimation in large networks. Int. J. Bifurc. Chaos 2007, 17, 2303–2318.
58. Brandes, U. On variants of shortest-path betweenness centrality and their generic computation. Soc. Netw. 2008, 30, 136–145.
59. Cazals, F.; Karande, C. A note on the problem of reporting maximal cliques. Theor. Comput. Sci. 2008, 407, 564–568.
60. Hagberg, A.; Swart, P.; S Chult, D. Exploring Network Structure, Dynamics, and Function Using NetworkX (No. LA-UR-08-05495; LA-UR-08-5495); Los Alamos National Lab (LANL): Los Alamos, NM, USA, 2008.
61. Estrada, E.; Higham, D.J.; Hatano, N. Communicability betweenness in complex networks. Phys. A Stat. Mech. Its Appl. 2009, 388, 764–774.
62. Kermarrec, A.M.; Le Merrer, E.; Sericola, B.; Trédan, G. Second order centrality: Distributed assessment of nodes criticity in complex networks. Comput. Commun. 2011, 34, 619–628.
63. Benzi, M.; Klymko, C. A matrix analysis of different centrality measures. arXiv 2014, arXiv:1312.6722.
64. Boldi, P.; Vigna, S. Axioms for centrality. Internet Math. 2014, 10, 222–262.
65. Brandes, U.; Borgatti, S.P.; Freeman, L.C. Maintaining the duality of closeness and betweenness centrality. Soc. Netw. 2016, 44, 153–159.
66. Zhang, J.X.; Chen, D.B.; Dong, Q.; Zhao, Z.D. Identifying a set of influential spreaders in complex networks. Sci. Rep. 2016, 6, 1–10.
67. Negre, C.F.; Morzan, U.N.; Hendrickson, H.P.; Pal, R.; Lisi, G.P.; Loria, J.P.; Rivalta, I.; Ho, J.; Batista, V.S. Eigenvector centrality for characterization of protein allosteric pathways. Proc. Natl. Acad. Sci. USA 2018, 115, E12201–E12208.
68. Newman, M. Networks; Oxford University Press: Oxford, UK, 2018.
69. Ahmed, N.K.; Rossi, R.; Lee, J.B.; Willke, T.L.; Zhou, R.; Kong, X.; Eldardiry, H. Learning role-based graph embeddings. arXiv 2018, arXiv:1802.02896.
70. Rozemberczki, B.; Sarkar, R. Characteristic functions on graphs: Birds of a feather, from statistical descriptors to parametric models. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, 19–23 October 2020; pp. 1325–1334.
71. McInnes, L.; Healy, J.; Melville, J. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv 2018, arXiv:1802.03426.
72. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
73. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3146–3154.
74. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, QC, Canada, 3–8 December 2018; p. 31.
75. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
76. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; p. 30.
77. Shapley, L. Quota Solutions of n-Person Games. 1953, p. 343. Available online: https://www.rand.org/content/dam/rand/pubs/papers/2021/P297.pdf (accessed on 30 October 2023).
Full Name | Abbreviation | Tracking Index | Specialized Description |
---|---|---|---|
Materials | XLB | S&P Materials Index | Tracks companies in the materials sector, including chemicals, mining, and forestry. |
Energy | XLE | S&P Energy Index | Comprises companies in the energy sector, including oil, gas, and renewable energy. |
Financials | XLF | S&P Financials Index | Represents companies in the financial sector, including banks, insurance, and real estate services. |
Industrials | XLI | S&P Industrials Index | Covers companies in the industrial sector, such as machinery, aerospace, and defense. |
Technology | XLK | S&P Technology Index | Represents the technology sector, including software, hardware, and electronics. |
Consumer Staples | XLP | S&P Consumer Staples Index | Consists of companies in the consumer staples sector, like food, beverages, and household goods. |
Utilities | XLU | S&P Utilities Index | Represents utilities sector companies, including electric, gas, and water utilities. |
Health Care | XLV | S&P Health Care Index | Covers companies in the health care sector, including pharmaceuticals, biotech, and health care services. |
Consumer Discretionary | XLY | S&P Consumer Discretionary Index | Represents companies in the consumer discretionary sector, like entertainment, retail, and autos. |
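To make the data handling concrete, the following is a minimal sketch of computing daily log returns and next-day direction labels for the nine sector ETFs listed above. The CSV file name and column layout are illustrative assumptions; the paper's exact data source and preprocessing are not reproduced here.

```python
# Minimal sketch: daily log returns and next-day direction labels for the nine
# SPDR sector ETFs. The CSV layout (Date index, one adjusted-close column per
# ticker) is an assumption; the actual data source is not prescribed here.
import numpy as np
import pandas as pd

TICKERS = ["XLB", "XLE", "XLF", "XLI", "XLK", "XLP", "XLU", "XLV", "XLY"]

prices = pd.read_csv("sector_etf_prices.csv", index_col="Date", parse_dates=True)[TICKERS]

log_returns = np.log(prices / prices.shift(1)).dropna()   # daily log returns
direction = (log_returns.shift(-1) > 0).astype(int)       # 1 if the next day's return is positive

print(log_returns.describe().T[["mean", "std", "min", "max"]])
```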
Centrality Measures | Citation |
---|---|
Average Neighbor Degree | [50,60] |
Degree Centrality | [60] |
Eigenvector Centrality | [44,60,68] |
Closeness Centrality | [43,46,60] |
Information Centrality | [45,60,65] |
Betweenness Centrality | [42,49,57,58,60] |
Current Flow Betweenness Centrality | [53,55,60] |
Communicability Betweenness Centrality | [60,61] |
Harmonic Centrality | [60,64] |
Second-Order Centrality | [60,62] |
Voterank Importance | [60,66] |
Number of Maximal Cliques | [41,51,59] |
PageRank | [48,60] |
HITS | [47,54,60] |
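For illustration, the sketch below builds an undirected, weighted network from a placeholder pairwise mutual-information matrix and computes several of the centrality measures listed above with NetworkX [60]. The MI values and the edge-inclusion rule are illustrative assumptions rather than the paper's exact network construction.

```python
# Minimal sketch: build an undirected, weighted network from a pairwise
# mutual-information (MI) matrix and compute several of the listed centrality
# measures with NetworkX. MI values and edge-inclusion rule are illustrative.
import networkx as nx
import numpy as np

TICKERS = ["XLB", "XLE", "XLF", "XLI", "XLK", "XLP", "XLU", "XLV", "XLY"]

rng = np.random.default_rng(1)
mi = rng.uniform(0.05, 0.6, size=(9, 9))   # placeholder pairwise MI estimates
mi = (mi + mi.T) / 2                       # MI is symmetric
np.fill_diagonal(mi, 0.0)

G = nx.Graph()
for i, u in enumerate(TICKERS):
    for j in range(i + 1, len(TICKERS)):
        G.add_edge(u, TICKERS[j], weight=mi[i, j])   # keep all pairs (illustrative rule)

centralities = {
    "average_neighbor_degree": nx.average_neighbor_degree(G, weight="weight"),
    "degree": nx.degree_centrality(G),
    "eigenvector": nx.eigenvector_centrality_numpy(G, weight="weight"),
    "closeness": nx.closeness_centrality(G),
    "information": nx.information_centrality(G),
    "betweenness": nx.betweenness_centrality(G),        # unweighted shortest paths
    "current_flow_betweenness": nx.current_flow_betweenness_centrality(G),
    "communicability_betweenness": nx.communicability_betweenness_centrality(G),
    "harmonic": nx.harmonic_centrality(G),
    "second_order": nx.second_order_centrality(G),
    "pagerank": nx.pagerank(G, weight="weight"),
    "hits_hub": nx.hits(G)[0],                          # HITS hub scores
}
cliques = list(nx.find_cliques(G))
centralities["n_maximal_cliques"] = {n: sum(n in c for c in cliques) for n in G}
voterank_order = nx.voterank(G)                         # ranked list of influential nodes

for name, values in centralities.items():
    print(name, {node: round(score, 3) for node, score in values.items()})
print("VoteRank order:", voterank_order)
```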
Metric | Definition |
---|---|
Accuracy | $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$, where $TP$, $TN$, $FP$, and $FN$ denote true positives, true negatives, false positives, and false negatives. |
Balanced Accuracy | $\mathrm{Balanced\ Accuracy} = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right)$ |
Cohen’s Kappa Coefficient | $\kappa = \frac{p_o - p_e}{1 - p_e}$, where $p_o$ is the observed agreement and $p_e$ is the agreement expected by chance. |
Precision | $\mathrm{Precision} = \frac{TP}{TP + FP}$ |
Recall | $\mathrm{Recall} = \frac{TP}{TP + FN}$ |
F1 Score | $F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ |
F-Beta Score | $F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$ |
Hamming Loss | $\mathrm{HL} = \frac{1}{n_{\mathrm{samples}}\, n_{\mathrm{labels}}} \sum_{i=1}^{n_{\mathrm{samples}}} \sum_{j=1}^{n_{\mathrm{labels}}} 1(\hat{y}_{i,j} \neq y_{i,j})$, where $\hat{y}_{i,j}$ is the predicted value for the $j$-th label of a given sample $i$, $y_{i,j}$ is the corresponding true value, $n_{\mathrm{samples}}$ is the number of samples, $n_{\mathrm{labels}}$ is the number of labels, and $1(x)$ is the indicator function. |
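As a minimal illustration, the metrics above can be computed with scikit-learn as follows. The example labels are synthetic, and the binary (rather than weighted) averaging shown for precision, recall, and the F-scores is an assumption.

```python
# Minimal sketch: evaluating a binary direction forecast with the metrics
# listed above, using scikit-learn. The example labels are illustrative only.
import numpy as np
from sklearn.metrics import (accuracy_score, balanced_accuracy_score, cohen_kappa_score,
                             precision_score, recall_score, f1_score, fbeta_score, hamming_loss)

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

print("Accuracy:          ", accuracy_score(y_true, y_pred))
print("Balanced accuracy: ", balanced_accuracy_score(y_true, y_pred))
print("Cohen's kappa:     ", cohen_kappa_score(y_true, y_pred))
print("Precision:         ", precision_score(y_true, y_pred))
print("Recall:            ", recall_score(y_true, y_pred))
print("F1 score:          ", f1_score(y_true, y_pred))
print("F-beta (0.5):      ", fbeta_score(y_true, y_pred, beta=0.5))
print("F-beta (2):        ", fbeta_score(y_true, y_pred, beta=2))
print("Hamming loss:      ", hamming_loss(y_true, y_pred))
```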
Statistics | XLB | XLE | XLF | XLI | XLK | XLP | XLU | XLV | XLY |
---|---|---|---|---|---|---|---|---|---|
Mean | 0.0004 | 0.0002 | 0.0004 | 0.0004 | 0.0006 | 0.0004 | 0.0004 | 0.0005 | 0.0006 |
Standard Deviation | 0.0143 | 0.0183 | 0.0177 | 0.0134 | 0.0135 | 0.0090 | 0.0112 | 0.0107 | 0.0132 |
Min. | −0.1166 | −0.2249 | −0.1807 | −0.1204 | −0.1487 | −0.0987 | −0.1206 | −0.1038 | −0.1355 |
Max. | 0.1112 | 0.1487 | 0.1524 | 0.1191 | 0.1109 | 0.0817 | 0.1204 | 0.0742 | 0.0897 |
Q1 | −0.0063 | −0.0078 | −0.0065 | −0.0054 | −0.0049 | −0.0036 | −0.0051 | −0.0044 | −0.0050 |
Median | 0.0009 | 0.0004 | 0.0007 | 0.0009 | 0.0011 | 0.0007 | 0.0009 | 0.0009 | 0.0013 |
Q3 | 0.0080 | 0.0090 | 0.0079 | 0.0069 | 0.0072 | 0.0051 | 0.0062 | 0.0062 | 0.0072 |
Skewness | −0.4846 | −0.7722 | −0.0455 | −0.4728 | −0.5109 | −0.5013 | −0.3185 | −0.4312 | −0.6604 |
Kurtosis | 6.0502 | 13.9521 | 16.4934 | 8.9646 | 9.7817 | 13.0281 | 16.1477 | 7.4502 | 7.9877 |
Shapiro–Wilk Test (W) | 0.9421 *** | 0.9073 *** | 0.837 *** | 0.9144 *** | 0.9175 *** | 0.9033 *** | 0.8903 *** | 0.9321 *** | 0.9227 *** |
Jarque–Bera Test (JB) | 5395.7801 *** | 28,327.5218 *** | 39,109.1785 *** | 11,680.1006 *** | 13,903.6934 *** | 24,544.2796 *** | 37,543.7296 *** | 8084.7764 *** | 9421.6981 *** |
Augmented Dickey–Fuller Test (ADF) | −14.9626 *** | −11.5752 *** | −10.9766 *** | −11.6973 *** | −12.9465 *** | −18.7009 *** | −14.6622 *** | −16.2154 *** | −11.9238 *** |
Ljung–Box Test (Q(5)) | 27.8087 *** | 21.2132 *** | 100.1074 *** | 22.9614 *** | 70.5875 *** | 59.3114 *** | 61.5096 *** | 42.4347 *** | 23.582 *** |
Ljung–Box Test (Q(10)) | 65.3377 *** | 60.4795 *** | 164.8877 *** | 83.7618 *** | 190.4879 *** | 142.5232 *** | 154.016 *** | 121.0344 *** | 78.3964 *** |
Ljung–Box Test (Q(20)) | 87.2127 *** | 86.7658 *** | 207.5788 *** | 107.8259 *** | 237.6269 *** | 162.8771 *** | 209.2207 *** | 161.026 *** | 100.1244 *** |
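The statistics and tests reported in this table can be reproduced for a single return series along the following lines; the simulated series is a placeholder for an ETF's daily log returns.

```python
# Minimal sketch: descriptive statistics and the tests reported above for one
# return series. The simulated series is a placeholder for daily log returns.
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(7)
returns = pd.Series(rng.standard_t(df=4, size=2500) * 0.01)   # placeholder log returns

print(returns.describe())
print("Skewness:", stats.skew(returns), "Excess kurtosis:", stats.kurtosis(returns))
print("Shapiro-Wilk:", stats.shapiro(returns))
print("Jarque-Bera:", stats.jarque_bera(returns))
print("ADF (statistic, p-value):", adfuller(returns)[:2])
print(acorr_ljungbox(returns, lags=[5, 10, 20]))              # Ljung-Box Q(5), Q(10), Q(20)
```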
Statistics | XLB | XLE | XLF | XLI | XLK | XLP | XLU | XLV | XLY |
---|---|---|---|---|---|---|---|---|---|
Mean | 0.0002 | 0.0000 | −0.0003 | 0.0001 | 0.0002 | 0.0005 | 0.0005 | 0.0005 | 0.0001 | |
Standard Deviation | 0.3912 | 0.3271 | 0.3676 | 0.3730 | 0.4187 | 0.4396 | 0.3918 | 0.3958 | 0.4108 | |
Min. | −1.5326 | −1.5394 | −1.3862 | −1.8871 | −1.8236 | −1.7922 | −1.5858 | −2.1560 | −1.6051 | |
Max. | 1.7235 | 1.3046 | 1.5283 | 1.5598 | 1.9678 | 2.4972 | 1.6630 | 1.9397 | 2.1229 | |
Q1 | −0.2523 | −0.2091 | −0.2497 | −0.2495 | −0.2570 | −0.2753 | −0.2428 | −0.2586 | −0.2596 | |
Median | −0.0080 | −0.0093 | −0.0059 | 0.0026 | −0.0064 | −0.0130 | −0.0112 | −0.0114 | −0.0068 | |
Q3 | 0.2471 | 0.2055 | 0.2293 | 0.2357 | 0.2504 | 0.2534 | 0.2452 | 0.2376 | 0.2450 | |
Skewness | 0.1301 | 0.0791 | 0.1892 | 0.0980 | 0.0656 | 0.2973 | 0.1710 | 0.0994 | 0.1642 | |
Kurtosis | 0.7003 | 0.6145 | 0.4871 | 0.7279 | 1.2428 | 1.6248 | 0.8267 | 0.9842 | 1.1379 | |
Shapiro–Wilk Test (W) | 0.9956 *** | 0.996 *** | 0.9963 *** | 0.9956 *** | 0.9906 *** | 0.9857 *** | 0.9935 *** | 0.9935 *** | 0.9918 *** | |
Jarque–Bera Test (JB) | 79.9351 *** | 57.6037 *** | 54.5113 *** | 81.3655 *** | 223.9659 *** | 429.7603 *** | 114.7139 *** | 144.4836 *** | 201.1662 *** | |
Augmented Dickey–Fuller Test (ADF) | −20.5462 *** | −15.9105 *** | −15.1273 *** | −15.3356 *** | −16.3168 *** | −16.3461 *** | −16.415 *** | −19.7349 *** | −20.9397 *** |
Ljung–Box Test (Q(5)) | 439.3993 *** | 438.0747 *** | 427.4039 *** | 442.3769 *** | 545.3189 *** | 598.9662 *** | 632.7569 *** | 483.8212 *** | 594.0558 *** | |
Ljung–Box Test (Q(10)) | 442.1597 *** | 456.3206 *** | 431.9172 *** | 445.1095 *** | 547.6234 *** | 618.9684 *** | 639.1071 *** | 489.3571 *** | 599.5767 *** | |
Ljung–Box Test (Q(20)) | 473.9014 *** | 482.8198 *** | 458.9153 *** | 464.1181 *** | 558.6023 *** | 647.9025 *** | 654.0895 *** | 494.3574 *** | 610.6616 *** |
(Unit: %) | Shapiro–Wilk Test (W) | D’Agostino K-Squared Test (K-Squared) | Lilliefors Test (T) | Jarque–Bera Test (JB) | Kolmogorov–Smirnov Test (KS) | Anderson–Darling Test (A-Squared) | Cramér–von Mises Test (U) |
---|---|---|---|---|---|---|---|
XLB | |||||||
α = 0.1 | 91.69 | 91.54 | 89.33 | 91.54 | 100.00 | 92.39 | 100.00 |
α = 0.05 | 90.28 | 90.05 | 87.01 | 90.53 | 100.00 | 90.60 | 100.00 |
α = 0.01 | 87.30 | 87.23 | 82.46 | 88.43 | 100.00 | 87.32 | 100.00 |
XLE | |||||||
α = 0.1 | 93.41 | 92.72 | 89.89 | 92.81 | 100.00 | 92.98 | 100.00 |
α = 0.05 | 92.27 | 91.64 | 87.66 | 92.10 | 100.00 | 91.22 | 100.00 |
α = 0.01 | 90.46 | 89.92 | 83.74 | 90.94 | 100.00 | 87.88 | 100.00 |
XLF | |||||||
α = 0.1 | 94.11 | 93.99 | 91.96 | 93.99 | 100.00 | 93.94 | 100.00 |
α = 0.05 | 93.04 | 92.75 | 90.33 | 93.16 | 100.00 | 92.37 | 100.00 |
α = 0.01 | 90.49 | 89.69 | 87.36 | 91.39 | 100.00 | 89.93 | 100.00 |
XLI | |||||||
α = 0.1 | 93.34 | 93.30 | 92.48 | 93.43 | 100.00 | 94.14 | 100.00 |
α = 0.05 | 92.01 | 91.97 | 90.58 | 92.58 | 100.00 | 92.86 | 100.00 |
α = 0.01 | 89.69 | 89.56 | 86.69 | 90.93 | 100.00 | 89.91 | 100.00 |
XLK | |||||||
α = 0.1 | 95.88 | 95.65 | 93.83 | 95.51 | 100.00 | 95.94 | 100.00 |
α = 0.05 | 94.89 | 94.69 | 91.89 | 94.83 | 100.00 | 94.80 | 100.00 |
α = 0.01 | 92.62 | 92.44 | 88.36 | 93.51 | 100.00 | 92.41 | 100.00 |
XLP | |||||||
α = 0.1 | 94.47 | 94.86 | 91.74 | 94.80 | 100.00 | 94.47 | 100.00 |
α = 0.05 | 93.13 | 93.81 | 89.40 | 94.15 | 100.00 | 92.89 | 100.00 |
α = 0.01 | 90.86 | 91.86 | 84.82 | 92.85 | 100.00 | 89.68 | 100.00 |
XLU | |||||||
α = 0.1 | 90.83 | 91.65 | 88.12 | 91.65 | 100.00 | 89.81 | 100.00 |
α = 0.05 | 89.13 | 89.98 | 85.62 | 90.46 | 100.00 | 88.06 | 100.00 |
α = 0.01 | 86.40 | 87.49 | 81.01 | 88.65 | 100.00 | 84.48 | 100.00 |
XLV | |||||||
α = 0.1 | 93.61 | 93.91 | 90.51 | 93.85 | 100.00 | 93.76 | 100.00 |
α = 0.05 | 92.55 | 93.06 | 87.93 | 93.3 | 100.00 | 92.30 | 100.00 |
α = 0.01 | 89.75 | 90.59 | 83.98 | 91.91 | 100.00 | 88.77 | 100.00 |
XLY | |||||||
α = 0.1 | 95.22 | 95.48 | 92.54 | 95.19 | 100.00 | 94.53 | 100.00 |
α = 0.05 | 94.00 | 94.37 | 90.48 | 94.39 | 100.00 | 93.41 | 100.00 |
α = 0.01 | 91.83 | 91.83 | 86.04 | 92.69 | 100.00 | 91.05 | 100.00 |
(Unit: %) | Shapiro–Wilk Test (W) | D’Agostino K-Squared Test (K-Squared) | Lilliefors Test (T) | Jarque–Bera Test (JB) | Kolmogorov–Smirnov Test (KS) | Anderson–Darling Test (A-Squared) | Cramér–von Mises Test (U) |
---|---|---|---|---|---|---|---|
XLB | |||||||
α = 0.1 | 75.04 | 77.25 | 40.81 | 77.12 | 99.78 | 68.89 | 99.90 |
α = 0.05 | 70.16 | 73.57 | 25.55 | 74.01 | 99.54 | 61.10 | 99.75 |
α = 0.01 | 60.57 | 65.45 | 1.87 | 68.97 | 98.74 | 46.41 | 99.12 |
XLE | |||||||
α = 0.1 | 76.65 | 79.49 | 37.70 | 79.47 | 99.95 | 64.88 | 99.99 |
α = 0.05 | 69.92 | 73.76 | 27.75 | 75.62 | 99.84 | 54.95 | 99.94 |
α = 0.01 | 50.64 | 50.83 | 11.27 | 62.36 | 99.34 | 35.56 | 99.56 |
XLF | |||||||
α = 0.1 | 78.19 | 79.61 | 51.59 | 80.01 | 99.89 | 70.32 | 99.96 |
α = 0.05 | 73.37 | 74.10 | 42.80 | 75.08 | 99.71 | 63.23 | 99.86 |
α = 0.01 | 61.00 | 63.86 | 25.21 | 66.37 | 99.06 | 50.93 | 99.31 |
XLI | |||||||
α = 0.1 | 73.71 | 76.71 | 28.21 | 77.52 | 99.88 | 49.85 | 99.95 |
α = 0.05 | 70.07 | 73.61 | 16.67 | 75.21 | 99.66 | 38.19 | 99.84 |
α = 0.01 | 56.83 | 67.41 | 2.09 | 71.26 | 99.00 | 15.81 | 99.28 |
XLK | |||||||
α = 0.1 | 85.99 | 86.03 | 77.41 | 86.33 | 99.73 | 85.93 | 99.87 |
α = 0.05 | 83.65 | 83.98 | 69.76 | 85.21 | 99.46 | 81.83 | 99.68 |
α = 0.01 | 78.64 | 80.26 | 54.79 | 82.91 | 98.62 | 74.94 | 99.00 |
XLP | |||||||
α = 0.1 | 87.34 | 89.86 | 79.01 | 89.63 | 99.57 | 85.64 | 99.75 |
α = 0.05 | 84.44 | 86.88 | 74.78 | 87.34 | 99.24 | 82.04 | 99.52 |
α = 0.01 | 78.91 | 80.89 | 66.92 | 82.93 | 98.25 | 73.63 | 98.74 |
XLU | |||||||
α = 0.1 | 79.74 | 83.58 | 68.10 | 84.63 | 99.78 | 77.19 | 99.89 |
α = 0.05 | 74.97 | 78.16 | 62.55 | 80.92 | 99.56 | 72.42 | 99.74 |
α = 0.01 | 64.45 | 64.73 | 47.43 | 71.40 | 98.80 | 62.06 | 99.15 |
XLV | |||||||
α = 0.1 | 83.72 | 85.21 | 64.76 | 85.86 | 99.80 | 82.53 | 99.91 |
α = 0.05 | 79.20 | 80.76 | 55.61 | 82.49 | 99.58 | 77.04 | 99.75 |
α = 0.01 | 68.36 | 69.73 | 43.06 | 75.57 | 98.84 | 62.80 | 99.13 |
XLY | |||||||
α = 0.1 | 73.54 | 77.90 | 57.74 | 78.33 | 99.74 | 70.27 | 99.86 |
α = 0.05 | 68.80 | 73.29 | 50.16 | 74.77 | 99.48 | 63.39 | 99.67 |
α = 0.01 | 62.41 | 66.20 | 31.03 | 69.55 | 98.64 | 53.53 | 98.98 |
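If the percentages in the two tables above represent the share of rolling windows in which each normality test is rejected at the given significance level (as the per-α breakdown suggests), they can be computed along the following lines. The 250-day window length, the tests included, and the simulated series are illustrative assumptions.

```python
# Minimal sketch: percentage of rolling windows in which a normality test is
# rejected at a given significance level. Window length and input series are
# illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
returns = rng.standard_t(df=4, size=2500) * 0.01   # placeholder daily returns
window, alphas = 250, (0.10, 0.05, 0.01)

pvals = []
for start in range(len(returns) - window + 1):
    segment = returns[start:start + window]
    pvals.append({
        "Shapiro-Wilk": stats.shapiro(segment).pvalue,
        "Jarque-Bera": stats.jarque_bera(segment).pvalue,
        "D'Agostino K2": stats.normaltest(segment).pvalue,
    })

for alpha in alphas:
    rates = {test: 100 * np.mean([p[test] < alpha for p in pvals]) for test in pvals[0]}
    print(f"alpha={alpha}: " + ", ".join(f"{t}: {r:.2f}%" for t, r in rates.items()))
```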
(Proposed features: MI- and TE-based network-driven features; the rightmost three columns report independent t-test statistics comparing the two feature sets for each model.)

Metric | Original: XGBoost | Original: LightGBM | Original: CatBoost | Proposed: XGBoost | Proposed: LightGBM | Proposed: CatBoost | t-Statistic: XGBoost | t-Statistic: LightGBM | t-Statistic: CatBoost |
---|---|---|---|---|---|---|---|---|---|
XLB (Materials) | |||||||||
Accuracy | 0.5376 | 0.5418 | 0.5312 | 0.5460 | 0.5460 | 0.5619 | 14.17 *** | 7.03 *** | 63.38 *** |
Balanced Accuracy | 0.5389 | 0.5431 | 0.5324 | 0.5454 | 0.5426 | 0.5594 | 15.41 *** | −2.11 | 64.49 *** |
Cohen’s Kappa Coefficient | 0.0773 | 0.0857 | 0.0644 | 0.0906 | 0.0855 | 0.1190 | 297.86 *** | −2.31 | 711.19 *** |
Precision | 0.5408 | 0.5451 | 0.5343 | 0.5471 | 0.5446 | 0.5612 | 11.61 *** | −1.03 | 109.54 *** |
Recall | 0.5376 | 0.5418 | 0.5312 | 0.5460 | 0.5460 | 0.5619 | 14.49 *** | 6.35 *** | 59.31 *** |
F1 Score | 0.5379 | 0.5421 | 0.5315 | 0.5464 | 0.5450 | 0.5615 | 18.34 *** | 8.83 *** | 67.71 *** |
F-Beta Score (0.5) | 0.5393 | 0.5435 | 0.5329 | 0.5468 | 0.5447 | 0.5613 | 14.90 *** | 1.78 ** | 80.04 *** |
F-Beta Score (2) | 0.5374 | 0.5416 | 0.5311 | 0.5464 | 0.5450 | 0.5615 | 22.20 *** | 6.79 *** | 65.44 *** |
Hamming Loss | 0.4624 | 0.4582 | 0.4688 | 0.4540 | 0.4540 | 0.4381 | 19.40 *** | 8.59 *** | 72.07 *** |
XLE (Energy) | |||||||||
Accuracy | 0.5238 | 0.5270 | 0.5439 | 0.5651 | 0.5503 | 0.5365 | 103.61 *** | 38.95 *** | −14.99 |
Balanced Accuracy | 0.5263 | 0.5307 | 0.5485 | 0.5578 | 0.5422 | 0.5286 | 52.75 *** | 36.89 *** | −34.53 |
Cohen’s Kappa Coefficient | 0.0523 | 0.0609 | 0.0960 | 0.1173 | 0.0857 | 0.0580 | 1282.24 *** | 389.73 *** | −717.6 |
Precision | 0.5275 | 0.5328 | 0.5527 | 0.5681 | 0.5520 | 0.5352 | 67.81 *** | 41.47 *** | −52.66 |
Recall | 0.5238 | 0.5270 | 0.5439 | 0.5651 | 0.5503 | 0.5365 | 88.46 *** | 38.01 *** | −9.78 |
F1 Score | 0.5219 | 0.5225 | 0.5372 | 0.5475 | 0.5277 | 0.5139 | 57.68 *** | 10.00 *** | −70.39 |
F-Beta Score (0.5) | 0.5243 | 0.5268 | 0.5436 | 0.5545 | 0.5354 | 0.5203 | 161.22 *** | 11.99 *** | −38.32 |
F-Beta Score (2) | 0.5221 | 0.5235 | 0.5387 | 0.5539 | 0.5363 | 0.5228 | 66.91 *** | 20.64 *** | −37.53 |
Hamming Loss | 0.4762 | 0.4730 | 0.4561 | 0.4349 | 0.4497 | 0.4635 | 120.90 *** | 42.91 *** | −19.04 |
XLF (Financials) | |||||||||
Accuracy | 0.5259 | 0.5238 | 0.5407 | 0.5407 | 0.5249 | 0.5503 | 43.08 *** | −0.42 | 14.55 *** |
Balanced Accuracy | 0.5279 | 0.5246 | 0.5413 | 0.5362 | 0.5263 | 0.5477 | 18.78 *** | 5.92 *** | 9.85 *** |
Cohen’s Kappa Coefficient | 0.0555 | 0.0491 | 0.0823 | 0.0729 | 0.0523 | 0.0957 | 232.33 *** | 62.69 *** | 124.04 *** |
Precision | 0.5294 | 0.5259 | 0.5425 | 0.5383 | 0.5276 | 0.5491 | 18.96 *** | 0.89 | 13.99 *** |
Recall | 0.5259 | 0.5238 | 0.5407 | 0.5407 | 0.5249 | 0.5503 | 32.48 *** | 0.86 | 57.54 *** |
F1 Score | 0.5255 | 0.5240 | 0.5410 | 0.5370 | 0.5248 | 0.5493 | 18.74 *** | 3.61 *** | 17.84 *** |
F-Beta Score (0.5) | 0.5273 | 0.5249 | 0.5417 | 0.5372 | 0.5261 | 0.5491 | 16.38 *** | 10.47 *** | 6.55 *** |
F-Beta Score (2) | 0.5253 | 0.5237 | 0.5407 | 0.5387 | 0.5245 | 0.5498 | 26.20 *** | 2.29 ** | 24.94 *** |
Hamming Loss | 0.4741 | 0.4762 | 0.4593 | 0.4593 | 0.4751 | 0.4497 | 36.54 *** | 4.50 *** | 34.39 *** |
XLI (Industrials) | |||||||||
Accuracy | 0.5354 | 0.5164 | 0.5376 | 0.5302 | 0.4910 | 0.5450 | −13.62 | −90.49 | 33.05 *** |
Balanced Accuracy | 0.5345 | 0.5131 | 0.5334 | 0.5328 | 0.4995 | 0.5480 | −4.78 | −76.45 | 38.29 *** |
Cohen’s Kappa Coefficient | 0.0688 | 0.0262 | 0.0670 | 0.0649 | −0.0009 | 0.0949 | −83.71 | −2684.93 | 296.78 *** |
Precision | 0.5371 | 0.5159 | 0.5362 | 0.5357 | 0.5023 | 0.5511 | −4.06 | −30.59 | 35.38 *** |
Recall | 0.5354 | 0.5164 | 0.5376 | 0.5302 | 0.4910 | 0.5450 | −10.48 | −109.98 | 14.31 *** |
F1 Score | 0.5360 | 0.5161 | 0.5367 | 0.5303 | 0.4851 | 0.5450 | −11.23 | −81.94 | 27.85 *** |
F-Beta Score (0.5) | 0.5366 | 0.5160 | 0.5363 | 0.5329 | 0.4924 | 0.5479 | −7.1 | −106.52 | 28.69 *** |
F-Beta Score (2) | 0.5356 | 0.5163 | 0.5372 | 0.5296 | 0.4860 | 0.5443 | −8.79 | −50.11 | 25.06 *** |
Hamming Loss | 0.4646 | 0.4836 | 0.4624 | 0.4698 | 0.5090 | 0.4550 | −14.29 | −176.44 | 18.14 *** |
XLK (Technology) | |||||||||
Accuracy | 0.4899 | 0.5365 | 0.5090 | 0.5513 | 0.5333 | 0.5397 | 164.25 *** | −3.89 | 158.33 *** |
Balanced Accuracy | 0.4866 | 0.5304 | 0.5017 | 0.5342 | 0.5267 | 0.5211 | 112.09 *** | −5.91 | 36.56 *** |
Cohen’s Kappa Coefficient | −0.0268 | 0.0611 | 0.0035 | 0.0703 | 0.0538 | 0.0434 | 1838.24 *** | −136.5 | 926.00 *** |
Precision | 0.4907 | 0.5344 | 0.5058 | 0.5426 | 0.5308 | 0.5284 | 179.15 *** | −15.42 | 78.61 *** |
Recall | 0.4899 | 0.5365 | 0.5090 | 0.5513 | 0.5333 | 0.5397 | 162.94 *** | −9.32 | 69.18 *** |
F1 Score | 0.4903 | 0.5351 | 0.5067 | 0.5345 | 0.5316 | 0.5192 | 97.61 *** | −5.93 | 40.65 *** |
F-Beta Score (0.5) | 0.4905 | 0.5346 | 0.5060 | 0.5360 | 0.5310 | 0.5206 | 108.76 *** | −6.11 | 22.56 *** |
F-Beta Score (2) | 0.4901 | 0.5359 | 0.5079 | 0.5419 | 0.5325 | 0.5283 | 414.04 *** | −9.23 | 45.26 *** |
Hamming Loss | 0.5101 | 0.4635 | 0.4910 | 0.4487 | 0.4667 | 0.4603 | 171.61 *** | −9.13 | 79.72 *** |
XLP (Consumer Staples) | |||||||||
Accuracy | 0.4984 | 0.5185 | 0.5196 | 0.5354 | 0.5185 | 0.5429 | 179.05 *** | −0.8 | 41.66 *** |
Balanced Accuracy | 0.4941 | 0.5124 | 0.5117 | 0.5336 | 0.5130 | 0.5307 | 178.37 *** | 2.05 ** | 41.78 *** |
Cohen’s Kappa Coefficient | −0.0117 | 0.0250 | 0.0236 | 0.0670 | 0.0261 | 0.0624 | 5736.80 *** | 105.42 *** | 1548.05 *** |
Precision | 0.4980 | 0.5163 | 0.5157 | 0.5372 | 0.5168 | 0.5361 | 219.84 *** | 0.41 | 39.55 *** |
Recall | 0.4984 | 0.5185 | 0.5196 | 0.5354 | 0.5185 | 0.5429 | 67.65 *** | −1.18 | 58.84 *** |
F1 Score | 0.4982 | 0.5170 | 0.5165 | 0.5361 | 0.5175 | 0.5343 | 83.44 *** | 3.18 *** | 40.08 *** |
F-Beta Score (0.5) | 0.4981 | 0.5165 | 0.5157 | 0.5367 | 0.5170 | 0.5341 | 88.12 *** | 4.62 *** | 51.29 *** |
F-Beta Score (2) | 0.4983 | 0.5178 | 0.5181 | 0.5357 | 0.5180 | 0.5383 | 138.82 *** | 1.80 ** | 36.87 *** |
Hamming Loss | 0.5016 | 0.4815 | 0.4804 | 0.4646 | 0.4815 | 0.4571 | 129.93 *** | 1.04 | 94.83 *** |
XLU (Utilities) | |||||||||
Accuracy | 0.5090 | 0.5090 | 0.5164 | 0.5291 | 0.5429 | 0.5386 | 44.90 *** | 85.79 *** | 52.85 *** |
Balanced Accuracy | 0.5075 | 0.5060 | 0.5101 | 0.5240 | 0.5254 | 0.5141 | 33.85 *** | 37.36 *** | 7.19 *** |
Cohen’s Kappa Coefficient | 0.0149 | 0.0120 | 0.0203 | 0.0480 | 0.0522 | 0.0295 | 683.80 *** | 1142.35 *** | 985.14 *** |
Precision | 0.5120 | 0.5106 | 0.5147 | 0.5284 | 0.5327 | 0.5227 | 25.40 *** | 47.61 *** | 16.81 *** |
Recall | 0.5090 | 0.5090 | 0.5164 | 0.5291 | 0.5429 | 0.5386 | 51.71 *** | 52.67 *** | 68.31 *** |
F1 Score | 0.5100 | 0.5097 | 0.5154 | 0.5287 | 0.5277 | 0.5066 | 39.26 *** | 40.01 *** | −41.61 |
F-Beta Score (0.5) | 0.5111 | 0.5102 | 0.5149 | 0.5285 | 0.5280 | 0.5089 | 46.33 *** | 34.48 *** | −11.1 |
F-Beta Score (2) | 0.5093 | 0.5092 | 0.5160 | 0.5289 | 0.5345 | 0.5209 | 87.16 *** | 69.83 *** | 13.87 *** |
Hamming Loss | 0.4910 | 0.4910 | 0.4836 | 0.4709 | 0.4571 | 0.4614 | 89.99 *** | 94.39 *** | 41.61 *** |
XLV (Health Care) | |||||||||
Accuracy | 0.4899 | 0.5037 | 0.5069 | 0.5259 | 0.5259 | 0.5450 | 127.26 *** | 34.39 *** | 106.88 *** |
Balanced Accuracy | 0.4889 | 0.4990 | 0.5044 | 0.5192 | 0.5211 | 0.5347 | 87.79 *** | 50.89 *** | 68.11 *** |
Cohen’s Kappa Coefficient | −0.0222 | −0.0020 | 0.0088 | 0.0387 | 0.0424 | 0.0705 | 2969.95 *** | 1817.86 *** | 6413.97 *** |
Precision | 0.4915 | 0.5015 | 0.5069 | 0.5220 | 0.5237 | 0.5395 | 71.63 *** | 37.44 *** | 116.81 *** |
Recall | 0.4899 | 0.5037 | 0.5069 | 0.5259 | 0.5259 | 0.5450 | 88.41 *** | 170.69 *** | 62.38 *** |
F1 Score | 0.4905 | 0.5021 | 0.5069 | 0.5220 | 0.5242 | 0.5353 | 92.07 *** | 133.52 *** | 82.12 *** |
F-Beta Score (0.5) | 0.4910 | 0.5017 | 0.5069 | 0.5215 | 0.5238 | 0.5360 | 75.33 *** | 119.90 *** | 56.07 *** |
F-Beta Score (2) | 0.4901 | 0.5030 | 0.5069 | 0.5220 | 0.5242 | 0.5353 | 54.62 *** | 107.51 *** | 53.91 *** |
Hamming Loss | 0.5101 | 0.4963 | 0.4931 | 0.4741 | 0.4741 | 0.4550 | 83.94 *** | 39.56 *** | 162.06 *** |
XLY (Consumer Discretionary) | |||||||||
Accuracy | 0.5185 | 0.5376 | 0.5323 | 0.5365 | 0.5524 | 0.5545 | 44.96 *** | 45.45 *** | 108.49 *** |
Balanced Accuracy | 0.5141 | 0.5319 | 0.5272 | 0.5421 | 0.5534 | 0.5546 | 50.47 *** | 27.77 *** | 55.50 *** |
Cohen’s Kappa Coefficient | 0.0281 | 0.0638 | 0.0543 | 0.0823 | 0.1053 | 0.1078 | 786.75 *** | 462.79 *** | 1066.78 *** |
Precision | 0.5206 | 0.5381 | 0.5334 | 0.5490 | 0.5594 | 0.5604 | 123.30 *** | 36.93 *** | 60.39 *** |
Recall | 0.5185 | 0.5376 | 0.5323 | 0.5365 | 0.5524 | 0.5545 | 43.30 *** | 44.46 *** | 65.24 *** |
F1 Score | 0.5194 | 0.5378 | 0.5328 | 0.5372 | 0.5539 | 0.5560 | 79.35 *** | 34.30 *** | 63.25 *** |
F-Beta Score (0.5) | 0.5200 | 0.5380 | 0.5332 | 0.5429 | 0.5567 | 0.5583 | 45.20 *** | 112.82 *** | 53.28 *** |
F-Beta Score (2) | 0.5188 | 0.5377 | 0.5325 | 0.5372 | 0.5539 | 0.5560 | 51.08 *** | 46.74 *** | 51.19 *** |
Hamming Loss | 0.4815 | 0.4624 | 0.4677 | 0.4635 | 0.4476 | 0.4455 | 57.21 *** | 45.01 *** | 50.27 *** |
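The independent t-test statistics in the rightmost columns compare the distribution of each metric with and without the proposed features. A minimal sketch of such a comparison is shown below; the two score samples are placeholders (e.g., accuracies across repeated runs), and the sample size and dispersion are assumptions.

```python
# Minimal sketch: independent t-test comparing a performance metric obtained
# with the original feature set against the set augmented with the proposed
# network-driven features. The score samples are placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
scores_original = rng.normal(loc=0.5376, scale=0.002, size=30)   # placeholder accuracies
scores_proposed = rng.normal(loc=0.5460, scale=0.002, size=30)

t_stat, p_value = stats.ttest_ind(scores_proposed, scores_original)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```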
Category | Number of Features: Price | Number of Features: Volume |
---|---|---|
Original Features | 168 (31.11%) | 89 (16.48%) |
Proposed Features | 131 (24.26%) | 152 (28.15%) |
Category | Number of Features |
---|---|
Mutual Information | 198 (69.96%) |
Transfer Entropy | 85 (30.04%) |
Category | Number of Features: Price | Number of Features: Volume |
---|---|---|
20-day window | 42 (14.84%) | 52 (18.37%) |
60-day window | 33 (11.66%) | 62 (21.91%) |
120-day window | 21 (7.42%) | 33 (11.66%) |
240-day window | 35 (12.37%) | 5 (1.77%) |
Category | Number of Features |
---|---|
Centrality Measures | 153 (54.06%) |
Node embeddings | 130 (45.94%) |
Feature | Number of Features |
---|---|
Average Neighbor Degree | 9 (5.88%) |
Degree Centrality | 1 (0.65%) |
Eigenvector Centrality | 13 (8.50%) |
Closeness Centrality | 0 (0.00%) |
Information Centrality | 2 (1.31%) |
Betweenness Centrality | 9 (5.88%) |
Current Flow Betweenness Centrality | 7 (4.58%) |
Communicability Betweenness Centrality | 5 (3.27%) |
Harmonic Centrality | 3 (1.96%) |
Second-Order Centrality | 34 (22.22%) |
Voterank Importance | 0 (0.00%) |
Number of Maximal Cliques | 5 (3.27%) |
PageRank | 26 (16.99%) |
HITS | 39 (25.49%) |
Category | Number of Features |
---|---|
Role2Vec | 74 (56.92%) |
FEATHER | 56 (43.08%) |
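The category counts above summarize which feature groups dominate the SHAP-based importance rankings. A minimal sketch of computing SHAP values for a fitted tree model and tallying the top-ranked features by category follows; the model, data, and category labels are placeholder assumptions.

```python
# Minimal sketch: SHAP-based feature importance for a fitted tree model,
# followed by counting how many top-ranked features fall into each category.
# The model, data, and category naming scheme are placeholder assumptions.
import numpy as np
import pandas as pd
import shap
from lightgbm import LGBMClassifier

rng = np.random.default_rng(5)
feature_names = [f"orig_{i}" for i in range(10)] + [f"net_{i}" for i in range(10)]
X = pd.DataFrame(rng.normal(size=(500, 20)), columns=feature_names)
y = (rng.random(500) > 0.5).astype(int)

model = LGBMClassifier(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
shap_values = shap_values[1] if isinstance(shap_values, list) else shap_values

importance = np.abs(shap_values).mean(axis=0)              # mean |SHAP| per feature
top = pd.Series(importance, index=feature_names).nlargest(10)
counts = top.index.str.split("_").str[0].value_counts()    # top features per category
print(counts)
```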
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Choi, I.; Kim, W.C. Enhancing Exchange-Traded Fund Price Predictions: Insights from Information-Theoretic Networks and Node Embeddings. Entropy 2024, 26, 70. https://doi.org/10.3390/e26010070