AI-Driven Financial Analysis: Exploring ChatGPT’s Capabilities and Challenges
Abstract
1. Introduction
2. Artificial Intelligence Techniques and Related Studies in Financial Analysis
2.1. Historical Evolution and Technological Advancements
2.2. AI Applications in Financial Analysis
2.2.1. Enhancing Market Efficiency and Risk Management
2.2.2. Predictive Analytics and Financial Stability
2.2.3. Modeling Behavioural Biases and Sentiment Analysis
2.3. Emergence of ChatGPT in Financial Analysis
Ethical and Regulatory Considerations
3. Empirical Design
3.1. Financial Analysis and Reasoning in a Nutshell
3.2. Rationale of Human Analysts and ChatGPT in Financial Analysis and Reasoning
3.3. Tasks/Prompt
3.3.1. Multi-Step Reasoning Tasks
3.3.2. Complex Reasoning Tasks
3.3.3. Evaluation Metrics
4. Empirical Results and Findings
4.1. Data Collection/Retrieval
4.2. Multi-Step Reasoning Tasks Results and Findings
4.3. Complex Reasoning Tasks Results and Findings
4.4. Discussion and Practical Application and Implementation
5. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Multi-Step Reasoning Tasks/Prompt
Appendix B. Complex Reasoning Tasks/Prompt
Appendix C. Data Collection/Retrieval
Appendix D. Multi-Step Reasoning Task 9 Demonstration
The First ChatGPT-4o Trial | The Second Trial with Instruction | The Third Fresh Trial |
Appendix E. Multi-Step Reasoning Task 12 Demonstration
The First Trial | The Second Trial with Instruction (Correct Result) | The Third Fresh Trial (Wrong Result) |
Appendix F. 10-Stock Selection by ChatGPT-4
Appendix G. Complex Reasoning Task 1 Demonstration: Technical Analyses and Indicators for Chalice Mining Limited (CHN)
Human analyst result—LSEG Workspace |
Chart legend: red is the MACD line and blue is the signal line. Source: LSEG Refinitiv Workspace. Recommendation: Bollinger Bands: price is close to the upper band and a bullish reversal has recently occurred, suggesting a “hold” at this stage. RSI: it is approaching the upper limit of 70, which may indicate an “overbought” or “sell” situation. MACD: the MACD line is above the signal line, indicating a “bullish” signal. |
ChatGPT-4o result |
Stock | Recommendation | Explanation |
---|---|---|
CHN | Hold/Sell | Overbought. MACD bullish. Price above upper Bollinger Band. |
Appendix H. Complex Reasoning Task 2 Demonstration: Portfolio Construction
[Figures: three sets of results. Sets 1 and 2: human analyst output from Stata and Excel, followed by the ChatGPT-4o output. Set 3: human analyst output from Stata, followed by the ChatGPT-4o output.]
Appendix I. Complex Reasoning Task 3 Demonstration: Capital Budgeting
Human analyst result |
ChatGPT-4o result |
Appendix J. Complex Reasoning Task 4 Demonstration: Financial Statement Analysis
Human analyst result |
ChatGPT-4o result |
Appendix K. Complex Reasoning Task 5 Demonstration: Financial Statement Analysis
Human analyst result |
ChatGPT-4o result |
ChatGPT-4o’s Parts a—i calculations: ChatGPT-4o’s Parts a—ii calculations: ChatGPT-4o’s Part b explanations: Excel template created by GPT-4o: |
Appendix L. Complex Reasoning Task 6 Demonstration: Option Pricing—Binomial Tree
Human analyst result |
ChatGPT-4o result |
ChatGPT-4o voice model: The black boxes contain the asset prices at each node. The dashed boxes contain the option prices at each node. Blue dashed lines indicate the option prices when holding the option is optimal. Red dashed lines indicate the option prices when early exercise is optimal. |
Task Number | Tasks | Task Understanding | Task Deconstruction | Calculation Ideas and Formulas | Accuracy | Critical Thinking/Application of Knowledge |
---|---|---|---|---|---|---|
1–8 | Time value of money | advanced | advanced | advanced | Yes | functional |
9 | Investment yield | advanced | advanced | advanced | No | applicable |
10–11 | Effective rate, WACC | advanced | advanced | advanced | Yes | functional |
12 | Internal rate of return | advanced | advanced | advanced | No | applicable |
13–18 | Cost, valuation, option models | advanced | advanced | advanced | Yes | functional |
19 | Simple business valuation in M&A | advanced | advanced | advanced | No | applicable |
20–30 | Beta, bond, forecasting, pricing, arbitrage, risk, etc. | advanced | advanced | advanced | Yes | functional |
31 | Dividend payout suggestions | advanced | advanced | N/A | N/A | functional |
32 | Capital structure | advanced | advanced | N/A | N/A | functional |
Task Number | Tasks | Task Understanding | Task Deconstruction | Calculation Ideas and Formulas | Accuracy | Critical Thinking/Level of Critical Thinking |
---|---|---|---|---|---|---|
1.1 | Technical Analysis and Stock Recommendation (Bollinger Bands) | advanced | advanced | advanced | Yes | advanced |
1.2 | Technical Analysis and Stock Recommendation (MACD) | advanced | advanced | advanced | Yes | advanced |
1.3 | Technical Analysis and Stock Recommendation (RSI) | advanced | advanced | advanced | Partially accurate | advanced |
Portfolio Construction | | | | | | |
2.1 | Stock summary statistics | advanced | advanced | advanced | Yes | advanced |
2.2 | Correlation matrix | advanced | advanced | advanced | Yes | advanced |
2.3 | Portfolio Construction–Global Minimum Variance | advanced | advanced | advanced | Partially accurate | advanced |
2.4 | Portfolio Construction–Optimal Risky Portfolio | advanced | advanced | advanced | Yes | advanced |
2.5 | Efficient Frontier | advanced | advanced | advanced | Yes | advanced |
3 | Capital Budgeting | intermediate | basic | basic | No | naive |
4 | Financial Statement Analysis—Appendix B Q4 | advanced | advanced | intermediate | Partially accurate | moderate |
5 | Financial Statement Analysis—Appendix B Q5 | intermediate | intermediate | intermediate | No | superficial |
6 | Option Pricing–Binomial Tree | advanced | advanced | moderate | No | moderate |
Human analyst result |
Recommendations: Hold/Sell. Bollinger Bands: price is close to the upper band and a bullish reversal has recently occurred, suggesting a “hold” at this stage. RSI: it is approaching the upper limit of 70, which may indicate an “overbought” or “sell” situation. MACD: the MACD line is above the signal line, indicating a “bullish” signal. |
ChatGPT-4o result |
“Recommendations: Hold/Sell Overbought. MACD bullish. Price above upper Bollinger Band.” |
Result comparison |
Same recommendations, despite discrepancies in the RSI charts between ChatGPT-4o and LSEG Workspace. |
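The three indicators compared above can be sketched in a few lines. This is a minimal illustration, not the procedure used by LSEG Workspace or ChatGPT-4o. The window lengths (20-period bands at two standard deviations, 14-period RSI, 12/26/9 MACD) are conventional defaults assumed here, the RSI uses simple averages rather than Wilder's smoothing, and the price series is synthetic. Differences in such conventions are one possible reason why two tools could draw different RSI charts from the same prices.

```python
from statistics import mean, pstdev


def ema(values, span):
    """Exponential moving average seeded with the first observation."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def bollinger(prices, window=20, k=2.0):
    """Return (lower, middle, upper) bands from the last `window` prices."""
    recent = prices[-window:]
    mid = mean(recent)
    sd = pstdev(recent)
    return mid - k * sd, mid, mid + k * sd


def rsi(prices, period=14):
    """Relative Strength Index using simple averages of gains and losses."""
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losing periods in the window
    return 100 - 100 / (1 + avg_gain / avg_loss)


def macd(prices, fast=12, slow=26, signal=9):
    """Return the latest (MACD line, signal line) values."""
    line = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    return line[-1], ema(line, signal)[-1]


# Synthetic, steadily rising prices (hypothetical data, not CHN quotes).
prices = [100.0 + i for i in range(60)]
lower, mid, upper = bollinger(prices)
rsi_value = rsi(prices)
macd_line, signal_line = macd(prices)

print(f"Bollinger bands: {lower:.2f} / {mid:.2f} / {upper:.2f}")
print(f"RSI: {rsi_value:.1f} (readings near 70 or above are commonly read as overbought)")
print(f"MACD above signal line: {macd_line > signal_line}")
```

On a steadily rising series like this one, the RSI sits at its ceiling and the MACD line stays above its signal line, which mirrors the mixed "overbought but bullish" reading in the table.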
Human analyst result |
Global minimum variance portfolio weights based on Excel and Stata: CHN: 0; VUL: 1.65%; FCL: 5.86%; SXG: 2.19%; LTR: 1%; NEU: 8.69%; WTC: 5.15%; ALL: 56.61%; NXT: 13.46%; PME: 5.49% Optimal risky portfolio weights based on Excel and Stata: CHN: 0; VUL: 6.34%; FCL: 0; SXG: 18.40%; LTR: 0; NEU: 0; WTC: 25.39%; ALL: 0; NXT: 34.86%; PME: 15.01% |
ChatGPT-4o result |
Global minimum variance portfolio weights: “CHN: 0; VUL: 4.83%; FCL: 10.94%; SXG: 4.94%; LTR: 4.69%; NEU: 14.50%; WTC: 14.97%; ALL: 16.20%; NXT: 13.83%; PME: 15.11%” Optimal risky portfolio weights: “CHN: 0; VUL: 6.37%; FCL: 0; SXG: 18.40%; LTR: 0; NEU: 0; WTC: 25.44%; ALL: 0; NXT: 34.85%; PME: 14.94%” |
Result comparison |
ChatGPT-4o and Excel/Stata generated similar weights for optimal risky portfolios; however, the weights generated by ChatGPT-4o for the global minimum variance portfolio are different from the results produced by Excel and Stata. |
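To make the global minimum variance comparison concrete, the sketch below computes the unconstrained closed-form weights, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), for a small covariance matrix. It is an illustration under stated assumptions: the covariance values are hypothetical, and the zero weights reported above suggest that the Excel and Stata solutions imposed a no-short-sale constraint, which this closed form does not handle.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gmv_weights(cov):
    """Unconstrained global minimum variance weights (short sales allowed)."""
    z = solve(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]


def portfolio_variance(weights, cov):
    """Quadratic form w' cov w."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j] for i in range(n) for j in range(n))


# Hypothetical covariance matrix of annual returns for three assets.
cov = [
    [0.040, 0.006, 0.004],
    [0.006, 0.010, 0.002],
    [0.004, 0.002, 0.020],
]
weights = gmv_weights(cov)
print("GMV weights:", [round(w, 4) for w in weights])
print("GMV variance:", round(portfolio_variance(weights, cov), 6))
```

Because the optimal risky portfolio also depends on expected returns and the risk-free rate while the global minimum variance portfolio depends only on the covariance matrix, checking the two solutions separately, as the comparison above does, helps localise where a discrepancy arises.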
Human analyst result |
At 12% discount rate: NPV = $255,234.67 At 14% discount rate: NPV = $115,539.10 |
ChatGPT-4o result |
At 12% discount rate: “The NPV of the project is $1,071,417.83, indicating that the project is profitable.” At 14% discount rate: “The NPV of the project is $966,137.31, still indicating profitability, but with a reduced margin due to the higher risk reflected in the higher discount rate.” |
Result comparison |
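The NPVs computed by ChatGPT-4o differ substantially from the correct values at both discount rates ($1,071,417.83 versus $255,234.67 at 12%, and $966,137.31 versus $115,539.10 at 14%). |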
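A net present value check of this kind is short enough to run independently of any chatbot output. The sketch below is a minimal example under stated assumptions: the project's actual cash flows are not reproduced in this table, so the cash-flow list is hypothetical, and the first entry is assumed to occur today (t = 0) and is therefore not discounted.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs at t = 0, cash_flows[t] at the end of year t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical project: an initial outlay followed by four annual inflows.
cash_flows = [-1_000_000.0, 300_000.0, 400_000.0, 500_000.0, 300_000.0]

npv_12 = npv(0.12, cash_flows)
npv_14 = npv(0.14, cash_flows)
print(f"NPV at 12%: {npv_12:,.2f}")
print(f"NPV at 14%: {npv_14:,.2f}")
```

For a conventional project (one outlay followed by inflows), the NPV falls as the discount rate rises, so a quick comparison of the two rates is a useful sanity check on any reported pair of values.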
Human analyst result |
Part a: components of the DuPont formula Tax burden: 0.6335 Interest burden: 0.9699 Profit margin: 0.1615 Asset turnover: 1.6992 Leverage: 1.4070 Part b: ROE: ROE = 23.72% Part c: growth rate g = 16.46% |
ChatGPT-4o result |
Part a: components of the DuPont formula “Net profit margin: 0.099 Asset turnover: 1.658 Equity multiplier: 1.409” Part b: ROE: “ROE = 23.2%” Part c: growth rate “g = 16.3%” |
Result comparison |
The ROE and growth rates calculated by ChatGPT-4o are correct. However, ChatGPT-4o performed only the 3-step DuPont calculation; it needs further instruction to conduct the 5-step DuPont analysis. |
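The relationship between the 5-step and 3-step decompositions can be checked directly from the human analyst's components above. The sketch below assumes the common textbook convention in which the 5-step "profit margin" is the operating (EBIT) margin, so that tax burden × interest burden × profit margin gives the net profit margin used in the 3-step version. The function names are illustrative.

```python
def dupont_five_step(tax_burden, interest_burden, profit_margin, asset_turnover, leverage):
    """ROE as the product of the five DuPont components."""
    return tax_burden * interest_burden * profit_margin * asset_turnover * leverage


def dupont_three_step(net_margin, asset_turnover, equity_multiplier):
    """ROE as net profit margin x asset turnover x equity multiplier."""
    return net_margin * asset_turnover * equity_multiplier


# Components reported by the human analyst for Task 4.
tax_burden, interest_burden, profit_margin = 0.6335, 0.9699, 0.1615
asset_turnover, leverage = 1.6992, 1.4070

roe_five = dupont_five_step(tax_burden, interest_burden, profit_margin, asset_turnover, leverage)

# Collapsing the first three components yields the 3-step net profit margin
# (assumption: profit_margin is measured before interest and taxes).
net_margin = tax_burden * interest_burden * profit_margin
roe_three = dupont_three_step(net_margin, asset_turnover, leverage)

print(f"5-step ROE: {roe_five:.2%}")
print(f"Implied net profit margin: {net_margin:.3f}")
print(f"3-step ROE: {roe_three:.2%}")
```

The implied net profit margin rounds to 0.099, the figure ChatGPT-4o reported, which is consistent with the two approaches agreeing on ROE while differing in how much detail they expose.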
Human analyst result |
Part a—i: components of the DuPont formula in 2022, WhiteBrush (ProtectBrush) Tax burden: 0.6744 (0.6599) Interest burden: 1.00 (0.9545) Profit margin: 0.0851 (0.0646) Asset turnover: 1.8259 (3.6286) Leverage: 1.5071 (1.5386) ROE: 15.8% (22.7%) Part a—ii: ROE and plowback in 2022, WhiteBrush (ProtectBrush) ROE = 15.8% (22.7%) Plowback = 1.00 (0.344) Sustainable growth rate = 15.8% (7.8%) Part b: explanation of WhiteBrush’s EPS growth rate WhiteBrush’s recent EPS growth has been achieved by increasing book value per share, not by achieving greater profits per dollar of equity. Since EPS equals book value per share × ROE, a firm can increase EPS even if its ROE is declining; this is the case with WhiteBrush. WhiteBrush’s book value per share has more than doubled in the last two years. Book value per share can increase either by retaining earnings or by issuing new stock at a market price greater than book value. WhiteBrush has been retaining all earnings, but the increase in the number of outstanding shares indicates that it has also issued a substantial amount of stock. |
ChatGPT-4o result |
Part a—i: components of the DuPont formula in 2022, WhiteBrush (ProtectBrush) “Net profit margin: 0.0573 (0.0373) Asset turnover: 1.42 (3.61) Equity multiplier: 3.12 (3.30) ROE: 25.38% (44.51%)” Part a—ii: ROE and plowback in 2022, WhiteBrush (ProtectBrush) “ROE = 25.38% (44.51%) Plowback = 0.75 (−0.25) * Sustainable growth rate = 19.04% (−11.13%)” * “Dividend payout ratio is 125%. ProtectBrush is paying out more than its earnings in dividends, resulting in a negative plowback ratio.” Part b: explanation of WhiteBrush’s EPS growth rate “WhiteBrush has produced an average annual EPS growth rate of 40% over the last two years despite a declining ROE. Possible reasons include:
|
Result comparison |
In part a—i, although it was instructed to perform a 5-step DuPont analysis, ChatGPT-4o produced only the 3-step DuPont calculation. The components were computed incorrectly in both parts a—i and a—ii. In part b, ChatGPT-4o provided only a superficial discussion: it suggested share buybacks as a possible driver, even though WhiteBrush’s shares outstanding increased from 2021 to 2022. |
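The identities behind the sustainable growth and EPS parts of this task can be written out explicitly. The symbols below (b for the plowback ratio, BVPS for book value per share) are notation chosen for this illustration; the numbers are the human analyst's 2022 figures from the table above.

```latex
\begin{align*}
g &= \text{ROE} \times b \\
g_{\text{WhiteBrush}} &= 0.158 \times 1.00 = 0.158 \\
g_{\text{ProtectBrush}} &= 0.227 \times 0.344 \approx 0.078 \\
\text{EPS} &= \frac{\text{Net income}}{\text{Shares}}
  = \frac{\text{Equity}}{\text{Shares}} \times \frac{\text{Net income}}{\text{Equity}}
  = \text{BVPS} \times \text{ROE}
\end{align*}
```

Because EPS is the product of BVPS and ROE, EPS can grow while ROE declines whenever BVPS grows fast enough, which is the mechanism the human analyst identifies for WhiteBrush.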
Human analyst result |
DerivaGem output shows that early exercise is optimal at node D with a value of 0.0538. The value of the option at the initial node A is 0.0188. |
ChatGPT-4o result |
“There are no nodes where early exercise would be optimal since the option values are non-negative and less than the intrinsic values at every point. Therefore, early exercise is not optimal at any step.” The value of the option at the initial node A computed by ChatGPT-4o is 0.0777. The value of the option at the initial node A computed based on ChatGPT-4o voice interaction is 0.0255. |
Result comparison |
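Both option values computed by ChatGPT-4o (0.0777, and 0.0255 via voice interaction) differ from the DerivaGem value of 0.0188. In addition, its conclusion that early exercise is never optimal contradicts the DerivaGem result for node D. |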
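The early-exercise test at each node of a binomial tree is a comparison between the value of holding the option (the discounted risk-neutral expectation of the next step) and the value of exercising immediately; early exercise is optimal when the exercise value is the larger of the two. The sketch below illustrates this for an American put on a two-step tree. The task's actual inputs are not reproduced in this table, so the option type and all parameters here are hypothetical and are not the DerivaGem inputs.

```python
import math


def american_put_binomial(s0, strike, rate, dt, up, down, steps):
    """Price an American put on a recombining binomial tree.

    Returns the value at the root and a list of (step, up_moves) nodes
    where immediate exercise is worth more than holding.
    """
    growth = math.exp(rate * dt)
    p = (growth - down) / (up - down)  # risk-neutral probability of an up move
    disc = 1 / growth

    # Option values at expiry, indexed by the number of up moves.
    values = [max(strike - s0 * up**j * down ** (steps - j), 0.0) for j in range(steps + 1)]
    early_nodes = []

    # Roll back through the tree, comparing hold and exercise values at each node.
    for step in range(steps - 1, -1, -1):
        rolled = []
        for j in range(step + 1):
            hold = disc * (p * values[j + 1] + (1 - p) * values[j])
            exercise = max(strike - s0 * up**j * down ** (step - j), 0.0)
            if exercise > hold:
                early_nodes.append((step, j))
            rolled.append(max(hold, exercise))
        values = rolled
    return values[0], early_nodes


# Hypothetical inputs: spot 50, strike 52, 5% rate, two one-year steps, u = 1.2, d = 0.8.
value, early_nodes = american_put_binomial(50.0, 52.0, 0.05, 1.0, 1.2, 0.8, 2)
print(f"Option value at the root: {value:.4f}")
print(f"Nodes where early exercise is optimal (step, up moves): {early_nodes}")
```

Recording the nodes where exercise beats holding makes the early-exercise decision auditable, so a result can be compared node by node against a reference tool.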
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, L.X.; Sun, Z.; Xu, K.; Chen, C. AI-Driven Financial Analysis: Exploring ChatGPT’s Capabilities and Challenges. Int. J. Financial Stud. 2024, 12, 60. https://doi.org/10.3390/ijfs12030060