Article

A First Look at Financial Data Analysis Using ChatGPT-4o

1 College of Business, Florida International University, 11200 SW 8th St., Miami, FL 33199, USA
2 College of Business Administration, The University of Texas at El Paso, 500 W. University Ave., El Paso, TX 79968, USA
3 John Chambers College of Business and Economics, West Virginia University, 83 Beechurst Avenue, Morgantown, WV 26505, USA
* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2025, 18(2), 99; https://doi.org/10.3390/jrfm18020099
Submission received: 10 January 2025 / Revised: 4 February 2025 / Accepted: 5 February 2025 / Published: 14 February 2025
(This article belongs to the Section Financial Technology and Innovation)

Abstract

OpenAI’s new flagship model, ChatGPT-4o, released on 13 May 2024, offers enhanced natural language understanding and more coherent responses. This paper investigates ChatGPT-4o’s capabilities in financial data analysis, including zero-shot prompting, time series analysis, risk and return analysis, and ARMA-GARCH estimation. ChatGPT-4o’s performance is generally comparable to traditional statistical software like Stata, though some errors and discrepancies arise due to differences in implementation. Despite these issues, our findings indicate that ChatGPT-4o has significant potential for real-world financial analysis. Integrating ChatGPT-4o into financial research and practice may lead to more efficient data processing, improved analytical capabilities, and better-informed investment decisions.

1. Introduction

Released on 13 May 2024, OpenAI’s new flagship model, ChatGPT-4o (with its “o” standing for “omni”), represents a significant leap forward in generative artificial intelligence (GenAI). Designed to process and reason across audio, vision, and text in real time, ChatGPT-4o marks a notable improvement over its predecessors, such as GPT-3.5 and GPT-4. Its enhancements include expanded knowledge coverage, more coherent and contextually relevant responses, and advanced analytical capabilities, making it a versatile tool for complex tasks like financial analysis.
The release of ChatGPT-4o has garnered widespread attention in industry and the media. GPT-4o offers superior text, vision, and audio processing capabilities and is twice as fast and 50% cheaper than its predecessor, GPT-4 Turbo. The model has achieved high Elo scores, significantly outperforming previous models in tasks that require comprehension, speech recognition, translation, and visual perception.1 These qualities are particularly relevant to the finance sector, where vast amounts of complex data require timely and accurate analysis.
As with each new model release, the launch of ChatGPT-4o prompts a wave of quick tests from experts from different fields and industries to validate the model’s performance. The finance industry, in particular, may benefit from the model’s advanced data analysis capabilities, especially given that financial data are complex and dynamic. The versatility of ChatGPT-4o promises substantial improvements in financial research and practice.
However, human expertise remains indispensable. While ChatGPT-4o can assist in handling large datasets, executing computations, and performing financial analysis, it lacks crucial judgment for real-time decision making in volatile markets. Therefore, incorporating human oversight is essential to interpret outputs and validate results. This necessity is particularly evident in finance, where data such as stock returns, risk factors, and regression estimates can change rapidly and often require domain-specific knowledge and strategic insight.
Understanding what ChatGPT-4o excels at is crucial for determining its potential applications in financial analysis. The model’s enhanced natural language processing capabilities make it adept at interpreting financial reports, analyzing market trends, and assisting in decision making. Its ability to handle vast amounts of data and run complex statistical analyses is particularly valuable in finance. Despite the extensive literature on data analysis using previous versions of ChatGPT (see the literature review in Section 2), few studies have explored ChatGPT-4o’s expanded capabilities in this domain. This gap exists because the model is still new, and its full potential remains untapped. This paper aims to be one of the first to investigate the use of ChatGPT-4o for data analysis in finance, highlighting its strengths and identifying areas for further improvement.
In this paper, we comprehensively evaluate ChatGPT-4o’s performance in financial analysis, utilizing daily stock return data from CRSP and the Fama–French factors. We undertake various tests to assess the model’s capabilities, including zero-shot prompting, time series analysis, risk and return analysis, and ARMA-GARCH estimation. Our empirical results show that ChatGPT-4o performs capably in financial analysis: the model excels at interpreting complex datasets and illustrates how AI-assisted financial analysis can be conducted effectively with human oversight. We also demonstrate that GenAI should be viewed as a complementary tool rather than a replacement for human expertise, particularly in areas that require real-time decision making and interpretation. Further research is needed on how GenAI can be effectively integrated with human expertise to enhance financial analysis.
In this empirical assessment, we compare ChatGPT-4o’s performance to results from Stata, a statistical software platform widely used in academia. These tests underscore the model’s potential to transform financial research. ChatGPT-4o’s ability to handle large datasets and generate comprehensive analyses efficiently enhances the accuracy and speed of data-driven decision-making processes. Furthermore, our comparative analyses reveal that ChatGPT-4o’s performance is comparable to traditional statistical software like Stata, with minor discrepancies primarily due to differences in implementation methods. This validation highlights the model’s reliability and practical utility in real-world financial scenarios. The implications of these findings are significant, suggesting that integrating ChatGPT-4o into financial research and practice can lead to more efficient data processing and improved analytical capabilities. While ChatGPT-4o demonstrates the ability to accurately identify key variables, perform robust statistical analyses, and generate financial metrics such as Sharpe ratios and market betas, human validation and oversight remain imperative, especially in interpreting market trends and unexpected market events. Ultimately, we find that ChatGPT-4o’s analytical efficiency has the potential to enhance data-driven financial processes, yet it remains far from being a substitute for human judgment.
Moreover, recent advancements in GenAI have significantly expanded the capabilities of financial analysis as major technology firms introduce increasingly sophisticated models. For instance, DeepSeek, a China-trained model released on 10 January 2025, reportedly competes with top-tier models at a fraction of the training cost. DeepSeek’s emergence highlights an industry-wide trend shifting from brute-force scaling to intelligent optimization, by balancing power and efficiency without demanding excessive compute resources. Likewise, Google’s Gemini, OpenAI’s o-series, and other AI frameworks from Microsoft, Meta, and Anthropic focus on interpretability, efficiency, and domain-specific expertise. These efforts reflect a broader shift in GenAI research: moving beyond mere text generation to produce reliable, context-aware analytical tools that complement human insight in critical decision-making processes.
The contributions of this paper are twofold. First, it provides a comprehensive, empirical evaluation of ChatGPT-4o’s capabilities in the context of financial analysis, filling a critical gap in the current literature. This evaluation thoroughly explains the model’s strengths and limitations in financial analysis. Second, it highlights the practical implications of integrating advanced GenAI models in finance, offering valuable insights for researchers and practitioners. These implications cover various aspects of financial operations, including market analysis, portfolio management, and risk assessment, demonstrating the model’s broad applicability. However, domain expertise and real-time human oversight remain crucial.
Looking ahead, the implications of ChatGPT-4o in future research in GenAI are profound. Its ability to integrate and analyze data across multiple modalities opens new avenues for interdisciplinary studies, combining insights from finance, economics, computer science, and other fields. The model’s advanced capabilities can drive innovation in financial research, leading to more accurate models, improved decision-making processes, and a deeper understanding of market dynamics. Furthermore, the model’s adaptability to different data types and contexts suggests potential for future enhancements and applications beyond the current scope of this study. This paper lays the groundwork for future studies, exploring the vast potential of ChatGPT-4o and setting the stage for its widespread adoption in finance and beyond.

2. Related Literature

Many early papers focus on the general applications of ChatGPT across broad disciplines. For instance, the literature discusses ChatGPT’s capacity in scientific writing (Alkaissi & McFarlane, 2023; González-Padilla, 2022; Hosseini et al., 2024) and potential applications in academia and industry (Dale, 2021; Alshater, 2022; Dai et al., 2023; Lin, 2023; Rahman & Watanobe, 2023; Schlosky et al., 2024; L. X. Liu et al., 2024). It addresses concerns raised by academia (da Silva, 2023; Frye, 2022; Gao et al., 2022; Khalil & Er, 2023; Stokel-Walker, 2023; Thorp, 2023), highlights the limitations of ChatGPT (Borji, 2023; X. Liu et al., 2023; Shen et al., 2023; Zhu et al., 2023; Rice et al., 2024), and proposes standards to mitigate ethical issues and bias in the implementation of ChatGPT (Anders, 2023; Liebrenz et al., 2023; Lund & Wang, 2023; Yeo-Teh & Tang, 2023). These studies emphasize the need for validation and human oversight to ensure accuracy and reliability in its outputs. While ChatGPT can significantly enhance efficiency and depth in research, they caution against over-reliance on AI-generated analyses.
Another stream of literature focuses on the importance of interdisciplinary collaboration to maximize the benefits of ChatGPT and similar technologies (e.g., Bang et al., 2023; van Dis et al., 2023; Zhu et al., 2023; Fahad et al., 2024; Rice et al., 2024). These works underline the potential for AI to drive innovation across various fields.
The adoption of ChatGPT in business disciplines has garnered significant attention, with numerous studies highlighting its transformative potential. Specifically, in finance, ChatGPT has been utilized for various applications, ranging from general financial analysis to more specific areas. Examples in general financial analysis include Alshater (2022), Dowling and Lucey (2023), Chen et al. (2023), Zaremba and Demir (2023), Feng et al. (2024), and more,2 who discuss the potential applications of ChatGPT in financial data processing and interpretation, modeling and simulation, hypothesis generation, automated report generation, and even drafting research papers. These studies demonstrate how ChatGPT can streamline and improve efficiency in these areas.
Ali and Aysan (2023) explore how ChatGPT’s capabilities can transform various financial services, including customer service automation, fraud detection, and personalized financial advice, highlighting the model’s ability to enhance efficiency and accuracy in financial operations. Bhatia et al. (2024) introduce FinTral, a family of GPT-4 level multimodal financial large language models, to handle a variety of financial data inputs, including text and numerical data, to improve the accuracy and utility of financial analysis and forecasting.
Specific financial areas also see growing interest in the application of ChatGPT. For instance, Jha et al. (2023) investigate its role in corporate finance, particularly in automating and enhancing decision-making processes. Smales (2023) examines the impact of ChatGPT on interpreting and responding to monetary policy announcements, revealing its potential to improve market efficiency. Li et al. (2024) study the model’s ability to identify optimistic biases in human financial analysts, offering insights into how AI can mitigate subjective biases in financial analysis.
Further applications include analyzing stock price movements (Lopez-Lira & Tang, 2024), evaluating stock-picking strategies (Pelster & Val, 2024), and conducting sentiment analysis (Fatouros et al., 2023). Additionally, Aldridge (2023) demonstrates the utility of ChatGPT in performing linear regression analysis, highlighting its versatility in handling complex statistical tasks within finance. Cao and Zhai (2023) analyze the impact of ChatGPT on financial research domains such as ESG, corporate culture, and Federal Reserve opinion analysis. Kim et al. (2024) discuss applying large language models in analyzing financial statements.
Beyond business disciplines, ChatGPT has shown remarkable promise in fields that require rigorous mathematical analysis. For example, Frieder et al. (2023) evaluate the mathematical reasoning abilities of ChatGPT versions and GPT-4 using datasets specifically curated for graduate-level mathematics. This study highlights strengths in basic mathematical tasks while pointing out limitations in more advanced, graduate-level problem solving. Korinek (2023) explores the application of large language models like ChatGPT in various domains and details how ChatGPT can assist with mathematical derivations, data analysis, coding, and other research tasks, significantly boosting productivity for researchers.
The recent literature investigates ChatGPT’s new features in interpreting visual information. Yang et al. (2023) investigate the model’s performance in integrating visual and textual information, highlighting its strengths in understanding and generating descriptions of images, recognizing objects, and interpreting visual data. The authors also identify areas where the model struggles, such as handling complex visual scenes and nuances in image context. Wang et al. (2024) evaluate the capabilities of ChatGPT in interpreting scientific figures in their article published in NPJ Precision Oncology. They highlight ChatGPT’s strengths in recognizing and explaining various plot types effectively, which can aid researchers in understanding complex data visualizations.
Overall, the literature highlights ChatGPT’s extensive applications across various disciplines, underscoring its potential to enhance business and scientific research through advanced mathematical analysis and data processing capabilities.
Our paper fits into the literature by comprehensively evaluating ChatGPT-4o’s capabilities, specifically in financial data analysis. By examining its performance in various tasks, such as zero-shot prompting, time series analysis, risk and return analysis, and ARMA-GARCH estimation, we fill a critical gap in the current literature and highlight the practical implications of integrating ChatGPT-4o into financial research and practice.

3. Data

The financial data for the thirty companies included in the Dow Jones Industrial Average (DJIA) are sourced from the Center for Research in Security Prices (CRSP). For each Dow 30 stock from 2019 to 2023, we collect its permno, date, ticker, comnam, permco, cusip, vol, ret, bid, ask, shrout, numtrd, and retx, along with the CRSP return (including all distributions) Value-Weighted Index (vwretd) and the S&P 500 Composite Index Return (sprtrn), at daily frequency. This comprehensive dataset ensures we capture a detailed and accurate picture of each company’s stock performance over the specified period. To calculate each stock’s Sharpe ratio and market beta, data on the daily risk-free rate and the Fama–French three factors are sourced from Kenneth French’s website.3 These variables are essential for robust financial analysis and understanding the stocks’ underlying risk and return dynamics.
Table 1 reports the structure of the data used in the empirical tests. Panel A lists the 30 components of the DJIA as of 15 May 2024, collected from CNBC.4 Panel B presents the data structure for the “Dow 30 Daily Returns” file, which consists of panel data encompassing 30 firms over five years. This structure allows for a detailed examination of stock performance over time and across different companies. Panel C details the data structure for the “FamaFrench_3Factors” file, noting that we divide the original data by 100 to align with the CRSP data in the analysis. This adjustment ensures consistency and comparability across different data sources, facilitating more accurate and meaningful analyses.
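For readers who wish to reproduce this setup, the sketch below shows one way to load and align the two files in Python. It is a minimal illustration: the file names and paths are assumptions, the column labels follow Table 1, and the Fama–French factors are assumed to be already rescaled to decimals as described in Panel C.

import pandas as pd

# Hypothetical file names; point these at your local exports of the two datasets.
returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])
factors = pd.read_csv("FamaFrench_3Factors.csv", parse_dates=["Date"])

# CRSP stores missing returns as non-numeric codes; coerce them to NaN.
returns["RET"] = pd.to_numeric(returns["RET"], errors="coerce")

# If starting from Kenneth French's raw file (reported in percent), divide the factor
# columns by 100 first; the file described in Table 1, Panel C is already rescaled.

# Merge the daily factors onto the stock-day panel by calendar date.
panel = returns.merge(factors, left_on="date", right_on="Date", how="left")
print(panel.shape)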

4. Zero-Shot Prompting Analysis

Our analysis begins with zero-shot, open-ended prompting, meaning that the prompt used to interact with ChatGPT-4o contains no examples or demonstrations but instead asks the model to determine the best analytical approach on its own. One advantage of zero-shot, open-ended prompting is that it may reveal unseen problems and provide unexpected solutions.
After uploading the “Dow 30 Daily Returns” File, illustrated in Panel B of Table 1, we ask ChatGPT-4o: “Act as a finance professor and statistician. Analyze this data for the daily returns of Dow 30 components from 2019 to 2023, pulled from the Center for Research in Security Prices (CRSP) via the Wharton Research Data Services (WRDS) platform. Perform analysis that can be used in academic papers. Illustrate the meaning and implications of the results”.
In Panel A of Table 2, ChatGPT-4o begins its response by identifying the variables/column headers of the input data. Out of 15 columns, ChatGPT-4o correctly identifies 14 header names. The only misidentified column is “sprtrn”, which should represent the S&P 500 Composite Index Return but is labeled as “Spread Return” in the responses.
ChatGPT-4o suggests that the statistical analysis should include five parts: descriptive statistics, distribution analysis, time series analysis, risk and return analysis, and correlation analysis. Next, ChatGPT-4o provides further details on what those analyses should entail. In implementing these analyses, ChatGPT-4o starts with descriptive statistics and distribution analysis. ChatGPT-4o mentions that the descriptive statistics should include the count, mean, standard deviation, minimum, 25th, 50th, 75th percentiles, maximum, skewness, and kurtosis without providing the actual numbers for these statistics. ChatGPT-4o suggests including a histogram, KDE plot, and QQ plot for the distribution statistics. In our example, it plots the distribution of the daily returns, QQ plots of the daily returns, and the time series of the daily returns for MSFT. In the subsequent time series analysis, ChatGPT-4o uses the time series plot for Microsoft’s daily returns as an example to illustrate return fluctuations over time. It suggests examining four implications: risk and return, distribution properties, volatility clustering, and modeling returns (such as using the GARCH model).
ChatGPT-4o proceeds with the analysis, including calculating annualized metrics (return, volatility, and Sharpe ratio), examining correlations, and performing additional time series analysis. In the correlation analysis, ChatGPT-4o provides a correlation matrix of daily returns for the Dow 30 components. It identifies stock pairs with high and low correlations and suggests implications for portfolio diversification, risk management, and investment strategy.
ChatGPT-4o demonstrates a robust understanding of financial data analysis and suggests comprehensive steps to analyze the Dow 30 daily returns. It accurately identifies most data columns and outlines essential descriptive and time series analyses. However, it falls short in providing actual descriptive statistics and mislabels one column, indicating room for improvement in accuracy and completeness. The suggested analyses include calculating annualized metrics, examining correlations, and performing additional time series analysis, which are crucial for understanding the underlying risk and return dynamics. ChatGPT-4o’s ability to identify high- and low-correlated stocks and suggest implications for portfolio diversification, risk management, and investment strategy highlights its potential utility in financial research despite some shortcomings in precision.
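For reference, the correlation analysis suggested in the zero-shot response can be reproduced locally with a few lines of pandas. The sketch below is our own minimal illustration, not ChatGPT-4o’s code, and assumes the column names of Table 1, Panel B.

import pandas as pd

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])
returns["RET"] = pd.to_numeric(returns["RET"], errors="coerce")

# Pivot the long stock-day panel into a date-by-ticker matrix of daily returns,
# then compute the pairwise correlation matrix across the Dow 30 components.
wide = returns.pivot_table(index="date", columns="TICKER", values="RET")
corr = wide.corr()

# Example pairs discussed in the response: same-sector vs. cross-sector correlations.
print(corr.loc["JPM", "GS"], corr.loc["MSFT", "JNJ"])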

5. Summary Statistics

Summary statistics are typically reported in the first table of many finance papers. Descriptive statistics are also the first analysis ChatGPT-4o recommends in the zero-shot prompting exercise, as illustrated in the previous section. Therefore, we proceed by comparing the summary statistics generated by statistical software with those obtained from ChatGPT-4o prompts.
Table 3 reports the summary statistics for the “Dow 30 Daily Returns” file. We first present the results generated using Stata 18 Standard Edition in Panel A of Table 3. As shown in Panel B, we upload the “Dow 30 Daily Returns” file and ask ChatGPT-4o to “Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Provide a summary statistic table that includes the mean, median, standard deviation, minimum, maximum, and number of observations for the following variables: return (ret), volume (vol), bid price (bid), ask price (ask), shares outstanding (shrout), and value-weighted return (vwretd)”.
ChatGPT-4o responds with the statistics it calculates. Interestingly, the statistics for return (ret) and volume (vol) do not match those generated by Stata, while those for bid price (bid), ask price (ask), shares outstanding (shrout), and value-weighted return (vwretd) do match. This discrepancy suggests that while ChatGPT-4o can accurately process and analyze most of the data, certain variables might be prone to errors, warranting further investigation and validation.
Overall, ChatGPT-4o provides a reasonably accurate set of summary statistics, demonstrating its utility in performing preliminary data analysis. However, the mismatches in key variables like return and volume highlight the need for cross-validation with traditional statistical software to ensure the accuracy and reliability of the results. These findings underscore the potential of integrating AI tools into financial research while also emphasizing the importance of careful verification.
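As a simple cross-check on both sets of numbers, the same summary statistics can be computed directly in pandas. The sketch below is a minimal version, assuming the file uses the column names shown in Table 1, Panel B.

import pandas as pd

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])

# Coerce CRSP's non-numeric missing-value codes to NaN so the statistics
# are computed over valid observations only.
cols = ["RET", "VOL", "BID", "ASK", "SHROUT", "vwretd"]
returns[cols] = returns[cols].apply(pd.to_numeric, errors="coerce")

# Mean, median, standard deviation, minimum, maximum, and observation count.
summary = returns[cols].agg(["mean", "median", "std", "min", "max", "count"]).T
print(summary)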

6. Plotting and Insights

In Table 2, the response of ChatGPT-4o suggests that time series analysis should follow the descriptive and distribution statistics, and it plots the time series of daily returns for MSFT (Microsoft Corp). Therefore, we assess the ability of ChatGPT-4o to generate time series figures and conduct time series analysis (the latter is discussed in the regression analysis section, i.e., fitting an ARMA-GARCH model).
Table 4 evaluates a time series plot for Microsoft’s return (ret) and the market’s return (vwretd) from 2019 to 2023. Panel A shows the results generated using Stata, illustrating that Microsoft’s returns are more volatile than the market’s returns, with increased volatility during late 2019 and early 2020.
As shown in Panel B, we upload the “Dow 30 Daily Returns” file and send the following prompts to ChatGPT-4o: “Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Provide a time series plot for the return (ret) of Microsoft (ticker = MSFT) and the market return (vwretd)”.
Interestingly, although ChatGPT-4o is able to generate a time series plot, as evidenced by Table 2, it has an issue producing one here. Instead of generating figures, ChatGPT-4o provides a textual description of the data trends for the daily returns of Microsoft and the market return from 2019 to 2023. For Microsoft, it highlights the fluctuations typical of a high-profile technology company. ChatGPT-4o notes that the daily return volatility reflects underlying market conditions, including periods of heightened uncertainty such as the COVID-19 pandemic. ChatGPT-4o also mentions a positive correlation between Microsoft’s returns and the market returns, which was not requested but adds value to the analysis.
Overall, ChatGPT-4o shows potential in analyzing and describing time series data but faces challenges in generating visual representations directly. This limitation underscores the importance of using traditional statistical software or tools for precise visualizations. The descriptive insights provided by ChatGPT-4o demonstrate its analytical capabilities, yet the inconsistencies in generating figures highlight the need for further improvements in its application.
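For the visualization step that ChatGPT-4o did not complete in this trial, a short matplotlib script suffices. The sketch below is our own illustration, assuming the column names of Table 1, Panel B.

import pandas as pd
import matplotlib.pyplot as plt

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])
returns["RET"] = pd.to_numeric(returns["RET"], errors="coerce")

# Keep Microsoft's rows; vwretd is the market return repeated on each stock-day.
msft = returns.loc[returns["TICKER"] == "MSFT", ["date", "RET", "vwretd"]]

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(msft["date"], msft["RET"], linewidth=0.8, label="MSFT daily return (ret)")
ax.plot(msft["date"], msft["vwretd"], linewidth=0.8, label="Market return (vwretd)")
ax.set_xlabel("Date")
ax.set_ylabel("Daily return")
ax.legend()
plt.tight_layout()
plt.show()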

7. Risk and Return Analysis

In the previous two sections, both the results from Stata and the response from ChatGPT-4o indicate increased return volatility during the onset of the COVID-19 period. Following this observation, we examine stock returns around the COVID-19 lockdown. In Table 5, we aim to extract the daily returns of Dow 30 components before and after the first COVID-19 lockdown day (Sunday, 15 March 2020). Specifically, we need to obtain the daily returns on 13 March and 16 March, the sum of these two daily returns for each stock, and the average returns of all stocks on each day.
After uploading the “Dow 30 Daily Returns” file, we command ChatGPT-4o to “Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Provide the return (ret) of each stock on the trading days before and after the first COVID lockdown day (Sunday, 15 March 2020). Compute the total return of each stock’s two trading days, and the average return of all stocks’ two trading days”.
ChatGPT-4o does not provide the returns directly but instead supplies step-by-step Python code as a response. We need to copy and run the suggested code in Python. However, we only obtain partial results using the given code, specifically the sum of the two daily returns for each stock, as shown in the first table in Panel C.
Upon reviewing the code, we find that it successfully extracts the daily returns for each stock from the original dataset but fails to display them. This oversight prevents the full visibility of the individual daily returns on 13 March and 16 March. To resolve this issue, we add a few lines of code to ensure that both individual daily returns and their sums are displayed. The final results, incorporating these changes, are shown in the second table in Panel C.
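The adjustment amounts to displaying the two daily returns alongside their sum. The sketch below is a minimal version of the corrected extraction (our illustration, not ChatGPT-4o’s exact code), assuming the column names of Table 1, Panel B.

import pandas as pd

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])
returns["RET"] = pd.to_numeric(returns["RET"], errors="coerce")

# Trading days immediately before and after the first lockdown day (Sunday, 15 March 2020).
days = pd.to_datetime(["2020-03-13", "2020-03-16"])
window = returns[returns["date"].isin(days)]

# Individual daily returns for each stock, plus their two-day total ...
by_stock = window.pivot_table(index="TICKER", columns="date", values="RET")
by_stock["two_day_total"] = by_stock.sum(axis=1)
print(by_stock)

# ... and the average return across all Dow 30 stocks on each day.
print(window.groupby("date")["RET"].mean())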
To ensure the accuracy of the results provided by ChatGPT-4o, we perform a parallel analysis using Stata. The results generated by Stata are presented in Panel A. By comparing these results with those obtained from the modified Python code provided by ChatGPT-4o, we find that they are identical, confirming the reliability of the Python code after our adjustments.
Thus, overall, ChatGPT-4o demonstrates a strong capability in generating effective Python code for financial data analysis. However, users must be vigilant as the initial code may contain minor omissions or errors that require careful verification and potential adjustments.
In Table 2, the response from ChatGPT-4o suggests that we should analyze the risk and return of each stock and compute their annualized metrics. Specifically, it mentions three key metrics: annualized return, annualized volatility, and Sharpe ratio.
We focus on the Sharpe ratio as it involves a multi-step calculation that tests the computational abilities of ChatGPT-4o. To obtain the Sharpe ratio, we first need to compute the daily excess return (return minus risk-free rate), aggregate these to obtain the annualized excess return, compute the annualized stock return volatility, and then divide the annualized excess return by the annualized return volatility. If ChatGPT-4o calculates the Sharpe ratio correctly, it indicates that it can accurately compute the preceding annualized return and volatility. The Sharpe ratio analysis is presented in Table 6.
After uploading the “Dow 30 Daily Returns” file, we command ChatGPT-4o to “Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS, and the daily risk-free rate, pulled from Kenneth R. French—Data Library. Compute the annualized excess return (raw return—risk-free rate) of each stock. Compute the standard deviation of excess returns for each stock each year. Then, calculate the Sharpe Ratio, the risk-adjusted return, for each stock each year. Report the Sharpe Ratio for each stock each year and the average Sharpe Ratio of all stocks each year”.
Again, ChatGPT-4o does not provide the results directly but instead supplies step-by-step solutions as a response. ChatGPT-4o suggests this step-by-step approach: (1) Load and merge the data. (2) Compute the annualized excess returns. (3) Calculate the standard deviation of the excess returns. (4) Compute the Sharpe ratios. (5) Report the Sharpe ratios and the average Sharpe ratio for each year. Each step comes with Python code that we can copy and run on our machines.
However, we encountered some issues when executing the code provided by ChatGPT-4o. For instance, the excess returns were annualized, but the volatilities were not. Also, Python could not find the “date” column in the Fama–French dataset because the column name is actually “Date”. Finally, the initial commands suggested by ChatGPT-4o only displayed the first five rows of the results. We resolved these issues by annualizing the stock return volatilities, correcting the column name, and adjusting the code to display all results. After these adjustments, we ran the corrected code in Python and obtained results that were very similar to those from Stata, as shown in Panel B.5
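The sketch below illustrates the corrected calculation with those fixes applied; it annualizes with the actual number of trading days in each year, as in the Stata implementation described in Note 5. File names and column labels follow Table 1 and are assumptions about the local setup.

import pandas as pd
import numpy as np

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])
factors = pd.read_csv("FamaFrench_3Factors.csv", parse_dates=["Date"])  # note the capital "D"

returns["RET"] = pd.to_numeric(returns["RET"], errors="coerce")
panel = returns.merge(factors, left_on="date", right_on="Date", how="left")
panel["excess"] = panel["RET"] - panel["RF"]
panel["year"] = panel["date"].dt.year

def sharpe(g):
    # Annualize with the actual number of trading days in the year (e.g., 252 in 2019,
    # 253 in 2020) rather than a fixed 252, matching the Stata code (see Note 5).
    n = g["excess"].count()
    ann_excess = g["excess"].mean() * n
    ann_vol = g["excess"].std() * np.sqrt(n)
    return ann_excess / ann_vol

sharpe_table = panel.groupby(["TICKER", "year"]).apply(sharpe).unstack("year")
print(sharpe_table)               # Sharpe ratio for each stock in each year
print(sharpe_table.mean(axis=0))  # average Sharpe ratio across stocks in each year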
Overall, ChatGPT-4o demonstrates a solid understanding of the process involved in calculating the Sharpe ratio and provides useful code for this purpose. However, users must be diligent in verifying and adjusting the provided code to ensure accuracy, particularly in handling data specifics such as column names and understanding annualization methods.
Next, we consider another commonly used risk measure for stocks, market beta, which is not suggested in ChatGPT-4o’s response in the zero-shot prompting, as shown in Table 2. Computing market beta involves multiple steps that can effectively test the computational abilities of ChatGPT-4o. To obtain the market beta, we first need to compute the daily excess return (return minus risk-free rate). Then, we run a regression by firm and year to store the slope coefficients. In Stata, this requires using local macros and loops to get the results. Table 7 reports the market beta analysis.
After uploading the “Dow 30 Daily Returns” file, illustrated in Panel B of Table 1, and the “FamaFrench 3Factors” file, illustrated in Panel C of Table 1, we command ChatGPT-4o to “Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS, and the daily risk-free rate, pulled from Kenneth R. French—Data Library. Compute the CAPM’s Beta for each stock each year”.
ChatGPT-4o has an issue with inspecting the data columns, so it does not provide the results directly. Instead, ChatGPT-4o provides a detailed guide with Python code on how to analyze the data to compute the CAPM’s Beta for each stock each year on the local machine.
Unfortunately, in the initial trial, we cannot obtain any results using the provided Python code because it has a couple of errors. First, the name of the variable for the excess market return should be ‘MktRF’, not ‘Mkt-RF’. Second, the beta value for DOW in 2019 is missing because the value of the variable ‘ret’ is null on 2 April 2019. After correcting these errors in the Python code, we obtain the same results as those generated from Stata.
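With those two corrections in place (the ‘MktRF’ column name and dropping the missing observation), the by-firm, by-year CAPM regressions can be run as in the sketch below. This is our minimal illustration using statsmodels, not the exact code returned by ChatGPT-4o; file names and column labels follow Table 1.

import pandas as pd
import statsmodels.api as sm

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])
factors = pd.read_csv("FamaFrench_3Factors.csv", parse_dates=["Date"])

returns["RET"] = pd.to_numeric(returns["RET"], errors="coerce")
panel = returns.merge(factors, left_on="date", right_on="Date", how="left")
panel["excess"] = panel["RET"] - panel["RF"]
panel["year"] = panel["date"].dt.year

# Use the correct factor column name "MktRF" (not "Mkt-RF") and drop stock-days
# with a missing return, such as DOW on 2 April 2019.
panel = panel.dropna(subset=["excess", "MktRF"])

betas = {}
for (ticker, year), g in panel.groupby(["TICKER", "year"]):
    X = sm.add_constant(g["MktRF"])  # regress the stock's excess return on the market factor
    betas[(ticker, year)] = sm.OLS(g["excess"], X).fit().params["MktRF"]

print(pd.Series(betas).unstack())    # one row per ticker, one column per year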
Overall, ChatGPT-4o demonstrates a solid understanding of the process involved in calculating market beta and provides useful code for this purpose. However, it has issues in reading the variable name precisely. Users must be diligent in verifying and adjusting the provided code to ensure accuracy, particularly in handling data specifics such as column names and dealing with missing values.

8. ARMA-GARCH Estimation

Lastly, to capture the volatility clustering and leverage effects in the market returns, we estimate an autoregressive moving average (ARMA) model with a generalized autoregressive conditional heteroskedasticity (GARCH) process, namely an ARMA-GARCH model, on the market return (vwretd). In Table 2, the response from the zero-shot prompting suggests that we should also model the return dynamics, explicitly mentioning GARCH. Estimating the ARMA-GARCH model is a complex task that involves multiple steps.
First, our data are daily, consisting of a panel of 30 firms over five years, meaning the vwretd values are repeated 30 times in the data. We need to keep only one set of the vwretd values over the 5 years (1258 trading days) and delete duplicated ones. Second, the estimation of the ARMA-GARCH model parameters is sensitive to option specifications. We obtain the results in Panel A of Table 8 in Stata using the following syntax: arch vwretd, arch(1) garch(1) ar(1) ma(1).
We upload the “Dow 30 Daily Returns” file and command ChatGPT-4o: “Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Estimate an ARMA-GARCH model on the value-weighted return (vwretd)”.
ChatGPT-4o again has an issue with executing the model fitting directly. Instead, it provides the steps (load the data, estimate the ARMA model, and estimate the GARCH model) with Python code. We copy the Python code and run it on a local machine, but it returns error messages. We then copy and paste the error messages into ChatGPT-4o, which provides another set of code. However, the revised code still does not run without further corrections.
Specifically, in the first trial, we did not obtain any results using the Python code provided by ChatGPT-4o because the ‘ARMA’ function in the ‘statsmodels.tsa.api’ module has been replaced by the ‘ARIMA’ function. After correcting this error, we obtain some estimation results, but these results are incorrect since they are produced using duplicated data. As mentioned above, the ‘vwretd’ values are repeated 30 times in the data. Thus, we need to delete duplicate data before fitting an ARMA-GARCH model. The final results are shown in Panel C of Table 8. However, they are still different from those produced by Stata. This is because ARMA and GARCH models are estimated simultaneously in Stata, while they are estimated sequentially in Python. Unfortunately, there is currently no package in Python that can estimate these two models jointly.
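A minimal sketch of the corrected two-step estimation is given below: the duplicated vwretd observations are dropped, the ARMA(1,1) mean equation is fitted with statsmodels’ ARIMA class (which replaced the removed ARMA function), and a GARCH(1,1) is then fitted to the residuals, here using the arch package as one possible choice. This sequential approach only approximates Stata’s joint estimation, and the file name is an assumption.

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from arch import arch_model

returns = pd.read_csv("Dow30_Daily_Returns.csv", parse_dates=["date"])

# vwretd is repeated once per stock-day; keep a single market-return series
# (1258 trading days over 2019-2023) before fitting any time series model.
market = (returns[["date", "vwretd"]]
          .drop_duplicates(subset="date")
          .sort_values("date")
          .set_index("date")["vwretd"])

# Step 1: ARMA(1,1) mean equation via ARIMA with d = 0.
arma = ARIMA(market, order=(1, 0, 1)).fit()
print(arma.summary())

# Step 2: GARCH(1,1) on the ARMA residuals. Residuals are scaled to percent,
# which the arch package recommends for numerical stability.
garch = arch_model(arma.resid * 100, mean="Zero", vol="GARCH", p=1, q=1).fit(disp="off")
print(garch.summary())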
Overall, ChatGPT-4o demonstrates a fundamental understanding of the process involved in estimating ARMA-GARCH models and provides somewhat helpful code. However, users must be diligent in verifying and adjusting the provided code to ensure accuracy, particularly in handling deprecated functions and avoiding data duplication.

9. Conclusions

OpenAI’s ChatGPT-4o makes an ambitious leap forward in generative artificial intelligence. In this paper, we evaluate ChatGPT-4o’s performance in financial data analysis, using daily stock data from the CRSP database for 30 companies in the Dow Jones Industrial Average, along with the market return and risk-free data from Ken French’s library. We conduct tests including zero-shot prompting, time series analysis, risk and return analysis, and ARMA-GARCH estimation. Our findings reveal that ChatGPT-4o’s performance is comparable to traditional statistical software, such as Stata, with minor discrepancies due to differences in implementation methods. ChatGPT-4o demonstrates a robust understanding of financial data analysis, accurately interpreting complex datasets and delivering thorough evaluations. These capabilities may significantly enhance investment strategies and risk management practices.
Through detailed comparative analyses, we illustrate how ChatGPT-4o can transform financial research methodologies and enhance the efficiency of data-driven decision-making processes. Despite some initial challenges with data handling and coding errors, our adjustments show that ChatGPT-4o can be a powerful tool with appropriate verification and validation procedures. These results underscore ChatGPT-4o’s robust understanding of financial data, as evidenced by its ability to interpret complex datasets and offer insightful analyses. Nonetheless, human expertise remains indispensable in validating outputs, ensuring the contextual relevance of interpretations, and adapting to real-time market fluctuations. As highlighted throughout our discussion, ChatGPT-4o should be viewed as a complementary tool, rather than a replacement for seasoned analysts.
Our research underscores the practical utility of ChatGPT-4o in real-world financial analysis and sets the stage for its broader adoption in the finance industry. The model’s advanced natural language processing capabilities and analytical strength offer significant benefits for financial analysts, researchers, and practitioners. These results suggest that integrating ChatGPT-4o into financial research and practice can lead to more efficient data processing, improved analytical capabilities, and better-informed investment decisions.
Looking ahead, we aim to further explore ChatGPT-4o’s adaptability to different data types and contexts, including its application to other financial instruments and markets. Future research will also consider integrating ChatGPT-4o with emerging AI models, including DeepSeek, Google’s Gemini, and newer iterations of OpenAI’s o-series. Ultimately, continued innovation in GenAI-driven financial research promises to refine predictive models, optimize investment strategies, and promote a deeper understanding of market dynamics, all while underscoring the indispensable role of expert human oversight.

Author Contributions

Conceptualization, Z.F.; methodology, Z.F. and F.L.; software, Z.F. and F.L.; validation, W.-H.C. and B.L.; formal analysis, Z.F. and F.L.; writing—original draft preparation, F.L. and B.L.; writing—review and editing, W.-H.C. and Z.F.; visualization, Z.F. and F.L.; supervision, Z.F.; project administration, Z.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data were obtained through academic subscriptions held by our respective institutions, and the authors cannot, under the terms of those agreements, make the datasets available for deposit.

Conflicts of Interest

The authors declare no conflict of interest.

Notes

1. OpenAI’s new multimodal “GPT-4 omni” combines text, vision, and audio in a single model (https://the-decoder.com/, accessed on 13 May 2024).
2. See, for example, Khan and Umer (2024); Shue et al. (2023); Yue et al. (2023).
3. Kenneth R. French Data Library: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
4. https://www.cnbc.com/dow-30/ (accessed on 13 May 2024).
5. The small difference is due to the method of annualization. The code in Stata uses the actual number of trading days to compute the annual excess return and its annual standard deviation (e.g., 252 days in 2019 and 253 days in 2020), whereas the code suggested by ChatGPT-4o uses 252 days for all years. If the actual trading days are used for annualization in both cases, the results from Stata and Python would be identical.

References

  1. Aldridge, I. (2023). The AI revolution: From linear regression to ChatGPT and beyond and how it connects to finance. The Journal of Portfolio Management, 49(9), 64–77. [Google Scholar] [CrossRef]
  2. Ali, H., & Aysan, A. F. (2023). What will ChatGPT revolutionize in the financial industry? Modern Finance, 1(1), 116–129. [Google Scholar] [CrossRef]
  3. Alkaissi, H., & McFarlane, S. I. (2023). Artificial hallucinations in ChatGPT: Implications in scientific writing. Cureus, 15(2), e35179. [Google Scholar] [CrossRef]
  4. Alshater, M. M. (2022). Exploring the role of artificial intelligence in enhancing academic performance: A case study of ChatGPT. SSRN Electronic Journal, 4312358. [Google Scholar] [CrossRef]
  5. Anders, B. A. (2023). Is using ChatGPT cheating, plagiarism, or both, neither, or forward thinking? Patterns, 4(3), 100694. [Google Scholar] [CrossRef]
  6. Bang, Y., Cahyawijaya, S., Lee, N., Dai, W., Su, D., Wilie, B., Lovenia, H., Ji, Z., Yu, T., Chung, W., & Fung, P. (2023). A multitasking, multilingual, multimodal evaluation of chatbot on reasoning, hallucination, and interactivity. arXiv, arXiv:2302.04023. [Google Scholar]
  7. Bhatia, G., Nagoudi, E. M., Cavusoglu, H., & Abdul-Mageed, M. (2024). FinTral: A family of GPT-4 level multimodal financial large language models. arXiv, arXiv:2402.10986. [Google Scholar]
  8. Borji, A. (2023). A categorical archive of ChatGPT failures. arXiv, arXiv:2302.03494. [Google Scholar]
  9. Cao, Y., & Zhai, J. (2023). Bridging the gap—The impact of ChatGPT on financial research. Journal of Chinese Economic and Business Studies, 21(2), 177–191. [Google Scholar] [CrossRef]
  10. Chen, B., Wu, Z., & Zhao, R. (2023). From fiction to fact: The growing role of generative AI in business and finance. Journal of Chinese Economic and Business Studies, 21(4), 471–496. [Google Scholar] [CrossRef]
  11. Dai, H., Liu, Z., Liao, W., Huang, X., Wu, Z., Zhao, L., Liu, W., Liu, N., Li, S., Zhu, D., & Li, X. (2023). ChatAug: Leveraging ChatGPT for text data augmentation. arXiv, arXiv:2302.13007. [Google Scholar]
  12. Dale, R. (2021). GPT-3: What’s it good for? Natural Language Engineering, 27(1), 113–118. [Google Scholar] [CrossRef]
  13. da Silva, J. A. T. (2023). Is ChatGPT a valid author? Nurse Education in Practice, 68, 103600. [Google Scholar] [CrossRef] [PubMed]
  14. Dowling, M., & Lucey, B. (2023). ChatGPT for (finance) research: The Bananarama conjecture. Finance Research Letters, 53, 103662. [Google Scholar] [CrossRef]
  15. Fahad, S. A., Salloum, S. A., & Shaalan, K. (2024). The role of ChatGPT in knowledge sharing and collaboration within digital workplaces: A systematic review. In A. Al-Marzouqi, S. A. Salloum, M. Al-Saidat, A. Aburayya, & B. Gupta (Eds.), Artificial intelligence in education: The power and dangers of ChatGPT in the classroom (Vol. 144). Studies in Big Data. Springer. [Google Scholar] [CrossRef]
  16. Fatouros, G., Soldatos, J., Kouroumali, K., Makridis, G., & Kyriazis, D. (2023). Transforming sentiment analysis in the financial domain with ChatGPT. arXiv, arXiv:2308.07935v1. [Google Scholar] [CrossRef]
  17. Feng, Z., Hu, G., & Li, B. (2024). Unleashing the power of ChatGPT in finance research: Opportunities and challenges. SSRN Electronic Journal, 4424979. [Google Scholar] [CrossRef]
  18. Frieder, S., Pinchetti, L., Griffiths, R., Salvatori, T., Lukasiewicz, T., Petersen, P., & Berner, J. (2023). Mathematical capabilities of ChatGPT. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, & S. Levine (Eds.), Advances in neural information processing systems (Vol. 36, pp. 27699–27744). MIT Press. [Google Scholar]
  19. Frye, B. L. (2022). Should using an AI text generator to produce academic writing be plagiarism? Fordham Intellectual Property, Media & Entertainment Law Journal, 33, 946. [Google Scholar]
  20. Gao, C. A., Howard, F. M., Markov, N. S., Dyer, E. C., Ramesh, S., Luo, Y., & Pearson, A. T. (2022). Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers. bioRxiv. [Google Scholar] [CrossRef]
  21. González-Padilla, D. A. (2022). Concerns about the potential risks of artificial intelligence in manuscript writing. Journal of Urology, 209(4), 682–683. [Google Scholar] [CrossRef]
  22. Hosseini, M., Rasmussen, L. M., & Resnik, D. B. (2024). Using AI to write scholarly publications. Accountability in Research, 31, 715–723. [Google Scholar] [CrossRef] [PubMed]
  23. Jha, M., Qian, J., Weber, M., & Yang, B. (2023). ChatGPT and corporate policies. Working paper. SSRN, 4521096. [Google Scholar] [CrossRef]
  24. Khalil, M., & Er, E. (2023). Will ChatGPT get you caught? Rethinking of plagiarism detection. arXiv, arXiv:2302.04335. [Google Scholar]
  25. Khan, M. S., & Umer, H. (2024). ChatGPT in finance: Applications, challenges, and solutions. Heliyon, 10(2), e24890. [Google Scholar] [CrossRef] [PubMed]
  26. Kim, A. G., Muhn, M., & Nikolaev, V. V. (2024). Financial statement analysis with large language models. Chicago Booth Research Paper, Fama-Miller Working Paper. Available online: https://ssrn.com/abstract=4835311 (accessed on 15 May 2024).
  27. Korinek, A. (2023). Language models and cognitive automation for economic research. NBER working paper series. Available online: https://www.nber.org/papers/w30957 (accessed on 15 May 2024).
  28. Li, X., Feng, H., Yang, H., & Huang, J. (2024). Can ChatGPT reduce human financial analysts’ optimistic biases? Economic and Political Studies, 12(1), 20–33. [Google Scholar] [CrossRef]
  29. Liebrenz, M., Schleifer, R., Buadze, A., Bhugra, D., & Smith, A. (2023). Generating scholarly content with ChatGPT: Ethical challenges for medical publishing. The Lancet Digital Health, 5(3), e105–e106. [Google Scholar] [CrossRef]
  30. Lin, Z. (2023). Why and how to embrace AI such as ChatGPT in your academic life. Royal Society Open Science, 10(8), 230658. [Google Scholar] [CrossRef] [PubMed]
  31. Liu, L. X., Sun, Z., Xu, K., & Chen, C. (2024). AI-driven financial analysis: Exploring ChatGPT’s capabilities and challenges. International Journal of Financial Studies, 12(3), 60. [Google Scholar] [CrossRef]
  32. Liu, X., Zheng, Y., Du, Z., Ding, M., Qian, Y., Yang, Z., & Tang, J. (2023). GPT understands, too. AI Open, 5, 208–215. [Google Scholar] [CrossRef]
  33. Lopez-Lira, A., & Tang, Y. (2024). Can ChatGPT forecast stock price movements? Return predictability and large language models. Working paper. SSRN, 4412788. [Google Scholar] [CrossRef]
  34. Lund, B. D., & Wang, T. (2023). Chatting about ChatGPT: How may AI and GPT impact academia and libraries? Library Hi Tech News, 40(3), 26–29. [Google Scholar] [CrossRef]
  35. Pelster, M., & Val, J. (2024). Can ChatGPT assist in picking stocks? Finance Research Letters, 59, 104786. [Google Scholar] [CrossRef]
  36. Rahman, M. M., & Watanobe, Y. (2023). ChatGPT for education and research: Opportunities, threats, and strategies. Applied Sciences, 13, 5783. [Google Scholar] [CrossRef]
  37. Rice, S., Crouse, S. R., Winter, S. R., & Rice, C. (2024). The advantages and limitations of using ChatGPT to enhance technological research. Technology in Society, 76, 102426. [Google Scholar] [CrossRef]
  38. Schlosky, M. T. T., Karadas, S., & Raskie, S. (2024). ChatGPT, help! I am in financial trouble. Journal of Risk and Financial Management, 17(6), 241. [Google Scholar] [CrossRef]
  39. Shen, Y., Heacock, L., Elias, J., Hentel, K. D., Reig, B., Shih, G., & Moy, L. (2023). ChatGPT and other large language models are double-edged swords. Radiology, 307(2), 230163. [Google Scholar] [CrossRef] [PubMed]
  40. Shue, E., Liu, L., Li, B., Feng, Z., Li, X., & Hu, G. (2023). Empowering beginners in bioinformatics with ChatGPT. Quantitative Biology, 11(2), 105–108. [Google Scholar] [CrossRef]
  41. Smales, L. A. (2023). Classification of RBA monetary policy announcements using ChatGPT. Finance Research Letters, 58(C), 104514. [Google Scholar] [CrossRef]
  42. Stokel-Walker, C. (2023). ChatGPT listed as author on research papers: Many scientists disapprove. Nature, 613, 620–621. [Google Scholar] [CrossRef] [PubMed]
  43. Thorp, H. H. (2023). ChatGPT is fun, but not an author. Science, 379(6630), 313. [Google Scholar] [CrossRef]
  44. van Dis, E. A., Bollen, J., Zuidema, W., van Rooij, R., & Bockting, C. L. (2023). ChatGPT: Five priorities for research. Nature, 614(7947), 224–226. [Google Scholar] [CrossRef] [PubMed]
  45. Wang, J., Ye, Q., Liu, L., Guo, N. L., & Hu, G. (2024). Scientific figures interpreted by ChatGPT: Strengths in plot recognition and limits in color perception. NPJ Precision Oncology, 8(1), 84. [Google Scholar] [CrossRef]
  46. Yang, Z., Li, L., Lin, K., Wang, J., Lin, C., Liu, Z., & Wang, L. (2023). The dawn of LMMs: Preliminary explorations with GPT-4V(ision). arXiv, arXiv:2309.17421. [Google Scholar]
  47. Yeo-Teh, N. S. L., & Tang, B. L. (2023). Letter to editor: NLP systems such as ChatGPT cannot be listed as an author because these cannot fulfill widely adopted authorship criteria. Accountability in Research, 31(7), 968–970. [Google Scholar] [CrossRef] [PubMed]
  48. Yue, T., Au, D., Au, C. C., & Iu, K. Y. (2023). Democratizing financial knowledge with ChatGPT by OpenAI: Unleashing the power of technology. SSRN Electronic Journal, 4346152. [Google Scholar] [CrossRef]
  49. Zaremba, A., & Demir, E. (2023). ChatGPT: Unlocking the future of NLP in finance. Modern Finance, 1(1), 93–98. [Google Scholar] [CrossRef]
  50. Zhu, G., Fan, X., Hou, C., Zhong, T., Seow, P., Shen-Hsing, A. C., Rajalingam, P., Yew, L. K., & Poh, T. L. (2023). Embrace opportunities and face challenges: Using ChatGPT in undergraduate students’ collaborative interdisciplinary learning. arXiv, arXiv:2305.18616. [Google Scholar]
Table 1. Data. This table reports the data structure we use in the empirical tests. Panel A contains 30 components from the Dow Jones Industrial Average Stocks List. Panel B contains the data structure for the “Dow 30 Daily Returns” File. Panel C contains the data structure for the “FamaFrench_3Factors” file.
Panel A: Dow Jones Industrial Average Stocks List
SYMBOL | NAME
AAPL | Apple Inc.
AMGN | Amgen Inc.
AMZN | Amazon.com Inc.
AXP | American Express Co.
BA | Boeing Co.
CAT | Caterpillar Inc.
CRM | Salesforce Inc.
CSCO | Cisco Systems Inc.
CVX | Chevron Corp
DIS | Walt Disney Co.
DOW | Dow Inc.
GS | Goldman Sachs Group Inc.
HD | Home Depot Inc.
HON | Honeywell International Inc.
IBM | International Business Machines Corp
INTC | Intel Corp
JNJ | Johnson & Johnson
JPM | JPMorgan Chase & Co.
KO | Coca-Cola Co.
MCD | McDonald’s Corp
MMM | 3M Co.
MRK | Merck & Co. Inc.
MSFT | Microsoft Corp
NKE | Nike Inc.
PG | Procter & Gamble Co.
TRV | Travelers Companies Inc.
UNH | Unitedhealth Group Inc.
V | Visa Inc.
VZ | Verizon Communications Inc.
WMT | Walmart Inc.
Panel B: Data Structure for the “Dow 30 Daily Returns” File
PERMNO | date | TICKER | COMNAM | PERMCO | CUSIP | VOL | RET | BID | ASK | SHROUT | NUMTRD | RETX | vwretd | sprtrn
10107 | 2 January 2019 | MSFT | MICROSOFT CORP | 8048 | 59491810 | 35,347,045 | −0.00443 | 101.12 | 101.13 | 7,683,000 | 239,618 | −0.00443 | 0.001796 | 0.001269
10107 | 3 January 2019 | MSFT | MICROSOFT CORP | 8048 | 59491810 | 42,570,779 | −0.03679 | 97.35 | 97.38 | 7,683,000 | 302,446 | −0.03679 | −0.02104 | −0.02476
10107 | 4 January 2019 | MSFT | MICROSOFT CORP | 8048 | 59491810 | 44,032,862 | 0.046509 | 101.93 | 101.94 | 7,683,000 | 301,838 | 0.046509 | 0.03341 | 0.034336
10107 | 7 January 2019 | MSFT | MICROSOFT CORP | 8048 | 59491810 | 35,650,303 | 0.001275 | 102.04 | 102.06 | 7,683,000 | 240,581 | 0.001275 | 0.009202 | 0.00701
92655 | 26 December 2023 | UNH | UNITEDHEALTH GROUP INC | 7267 | 91324P10 | 1,390,912 | −0.000538 | 520.10999 | 520.12 | 924,925 |  | −0.000538 | 0.005218 | 0.004232
92655 | 27 December 2023 | UNH | UNITEDHEALTH GROUP INC | 7267 | 91324P10 | 1,851,840 | 0.005307 | 522.95001 | 523.09003 | 924,925 |  | 0.005307 | 0.001995 | 0.00143
92655 | 28 December 2023 | UNH | UNITEDHEALTH GROUP INC | 7267 | 91324P10 | 2,001,208 | 0.004036 | 525.07001 | 525.21002 | 924,925 |  | 0.004036 | −0.000108 | 0.00037
92655 | 29 December 2023 | UNH | UNITEDHEALTH GROUP INC | 7267 | 91324P10 | 2,080,197 | 0.002991 | 526.53998 | 526.94 | 924,925 |  | 0.002991 | −0.004045 | −0.002826
Panel C: Data Structure for the “FamaFrench_3Factors” File
Date | MktRF | SMB | HML | RF
2 January 2019 | 0.0023 | 0.0059 | 0.0111 | 0.0001
3 January 2019 | −0.0245 | 0.0036 | 0.012 | 0.0001
4 January 2019 | 0.0355 | 0.0041 | −0.007 | 0.0001
7 January 2019 | 0.0094 | 0.01 | −0.0075 | 0.0001
26 December 2023 | 0.0048 | 0.0069 | 0.0046 | 0.00021
27 December 2023 | 0.0016 | 0.0014 | 0.0012 | 0.00021
28 December 2023 | −0.0001 | −0.0036 | 0.0002 | 0.00021
29 December 2023 | −0.0043 | −0.0113 | −0.0037 | 0.00021
Table 2. Zero-shot prompting. This table reports the financial analysis with a zero-shot prompt. Panel A presents the results generated using ChatGPT-4o; the “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel B is an “Assessment” by the authors of the quality and accuracy of the ChatGPT-4o response.
Panel A: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file]
Act as a finance professor and statistician. Analyze these data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Perform analysis that can be used in academic papers. Illustrate the meaning and implications of the results.
 
Response:
The dataset contains daily returns data for the Dow 30 components from 2019 to 2023. Here are the key columns in the dataset:
  • PERMNO : Permanent number assigned to each security.
  • date : Date of the observation.
  • TICKER : Stock ticker symbol.
  • COMNAM : Company name.
  • PERMCO : Permanent company number.
  • CUSIP : Committee on Uniform Securities Identification Procedures number.
  • VOL : Trading volume.
  • RET : Daily return.
  • BID : Bid price.
  • ASK : Ask price.
  • SHROUT : Number of shares outstanding.
  • NUMTRD : Number of trades.
  • RETX : Return excluding dividends.
  • vwretd : Value-weighted return.
  • sprtrn : Spread return.
Statistical Analysis
  • Descriptive statistics : We will calculate basic descriptive statistics (mean, median, standard deviation, etc.) for daily returns.
  • Distribution analysis : We will examine the distribution of daily returns, including skewness and kurtosis, and visualize it using histograms and QQ plots.
  • Time Series analysis : We will analyze the returns’ time series properties, including autocorrelation and volatility clustering.
  • Risk and return analysis : We will compute key metrics such as annualized returns, annualized volatility, Sharpe ratio, and Beta.
  • Correlation analysis : We will analyze the correlation between the returns of different Dow 30 components.
Implementation
Let us start with the descriptive statistics and distribution analysis. [View Analysis]
[Figure: distribution of daily returns (histogram/KDE)]
[Figure: QQ plot of daily returns]
[Figure: time series of MSFT daily returns]
Descriptive Statistics
Here are the basic descriptive statistics for the daily returns (RET):
  • Count : Number of observations
  • Mean: Average daily return
  • Std: Standard deviation of daily returns
  • Min: Minimum daily return
  • 25%: 25th percentile
  • 50%: Median (50th percentile)
  • 75%: 75th percentile
  • Max: Maximum daily return
  • Skewness: Measure of asymmetry in the distribution of returns
  • Kurtosis: Measure of the “tailedness” of the distribution
Distribution Analysis
Histogram and KDE Plot
The histogram and kernel density estimate (KDE) plot show the distribution of daily returns. The distribution appears to be approximately normal but with noticeable tails, indicating the presence of extreme values or outliers.
QQ Plot
The QQ plot compares the distribution of daily returns to a standard normal distribution. Deviations from the line suggest departures from normality. The tails of the distribution show significant deviations, indicating the presence of outliers or heavy tails.

Time Series Analysis
Time Series Plot for MSFT
The time series plot for Microsoft’s daily returns highlights the fluctuations over time. Notable patterns include periods of high volatility and clustering of extreme values.

Implications
  • Risk and return: The mean daily return provides insight into the expected performance, while the standard deviation and kurtosis highlight the risk and potential for extreme returns.
  • Distribution properties: The skewness and kurtosis indicate that the distribution of returns is not perfectly normal, suggesting that models assuming normality may underestimate the risk of extreme events.
  • Volatility clustering: The time series plot reveals periods of increased volatility, which is typical in financial markets and has implications for risk management and portfolio optimization.
  • Modeling returns: The deviations from normality suggest that more sophisticated models, such as GARCH or EVT (Extreme Value Theory), might be necessary for accurate risk assessment.
Next, we will proceed with further analysis, including calculating annualized metrics, examining correlations, and performing additional time series analysis. [View Analysis]
[Image: ChatGPT-4o-generated output for the annualized metrics and correlation analysis described below]
Annualized Metrics
Here are the key metrics for each Dow 30 component:
  • Annualized return: The compounded return over a year.
  • Annualized volatility: The standard deviation of returns over a year.
  • Sharpe ratio: The risk-adjusted return, calculated as (Annualized Return − Risk-Free Rate)/Annualized Volatility
Key Observations
  • Highest annualized return: AAPL (Apple) at approximately 45.89%
  • Highest annualized volatility: BA (Boeing) at approximately 50.40%
  • Highest Sharpe ratio: AAPL (Apple) at 1.362
These metrics indicate that Apple has provided the highest risk-adjusted return over this period, making it an attractive investment despite its relatively high volatility.

Correlation Analysis
The heatmap of the correlation matrix shows the relationships between the daily returns of the Dow 30 components. Key observations include the following:
  • High correlation: Companies within the same industry or sector often exhibit higher correlations. For example, financial firms like JPM (JPMorgan Chase) and GS (Goldman Sachs) show a strong positive correlation.
  • Low correlation: Companies from different sectors might show lower correlations, offering diversification benefits. For instance, technology firms like MSFT (Microsoft) and healthcare companies like JNJ (Johnson & Johnson) show moderate to low correlations.
Implications
  • Portfolio diversification: The correlation analysis supports the principle of diversification. By combining assets with low or negative correlations, investors can reduce portfolio risk without significantly sacrificing returns.
  • Risk management: Understanding the volatility and correlation of different stocks helps in constructing portfolios that align with the investor’s risk tolerance and investment objectives.
  • Investment strategy: Stocks with high Sharpe ratios, like AAPL, indicate superior risk-adjusted returns, making them attractive for investors seeking to maximize returns for a given level of risk.
Conclusion
The analysis of the Dow 30 components from 2019 to 2023 reveals important insights into their risk–return profiles and interrelationships. This information is crucial for academic research, investment strategies, and risk management practices. Further studies could involve more sophisticated modeling techniques to better understand market dynamics and predict future performance. [View Analysis]

Panel B: Assessment of the Results from ChatGPT-4o
  • In identifying the variables/column headers of the input data, out of 15 columns, ChatGPT-4o correctly identifies the header names for 14 of them. The misidentified column is sprtrn, which should represent the S&P 500 Composite Index Return but is labeled as Spread Return in the responses.
  • The descriptive analysis reports no actual numbers, but it does list the common descriptive statistics expected in a financial paper.
  • ChatGPT-4o understands this is time series data and suggests relevant time series analyses, such as risk and return relationships and modeling returns.
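For reference, the descriptive statistics and distribution measures that ChatGPT-4o names (but does not report numerically) can be reproduced locally with a few lines of pandas. The snippet below is our illustrative sketch, not ChatGPT-4o output, and the file path is a placeholder.
import pandas as pd

# Load the data (placeholder path)
data = pd.read_csv('path/to/your/Dow 30 Daily Returns.csv')

# Coerce RET to numeric in case it contains non-numeric return codes, then drop missing values
ret = pd.to_numeric(data['RET'], errors='coerce').dropna()

# Descriptive statistics and distribution measures named in the response
print(ret.describe())              # count, mean, std, min, 25%, 50%, 75%, max
print('Skewness:', ret.skew())     # asymmetry of the return distribution
print('Kurtosis:', ret.kurt())     # excess kurtosis ("tailedness")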
Table 3. Summary statistics. This table reports the summary statistics for the daily returns data of Dow 30 components from 2019 to 2023. Panel A presents the results generated using Stata. Panel B shows the results generated using ChatGPT-4o. The “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel C offers an “Assessment” of the quality and accuracy of the ChatGPT-4o response provided by the authors.
Panel A: Results from Stata
Variable Mean Median Standard Deviation Minimum Maximum Number of Observations
ret 0.000616 0.000671 0.019557 −0.238484 0.260449 38,935
vol 11,965,600 6,329,270 16,900,200 416,678 331,573,000 38,936
bid 213.455 147.610 389.842 21.990 3731.410 38,936
ask 213.510 147.625 390.018 22.000 3733.390 38,936
shrout 2,326,370 1,274,960 2,756,090 228,399 17,102,500 38,936
vwretd 0.000605 0.000987 0.013280 −0.118168 0.091556 38,936
Panel B: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file]

Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Provide a summary statistic table that includes the mean, median, standard deviation, minimal, maximum, and number of observations for the following variables: return (ret), volume (vol), bid price (bid), ask price (ask), shares outstanding (shrout), and value-weighted return (vwretd).

Response:
Here are the summary statistics for the daily returns data of Dow 30 components from 2019 to 2023:
Variable Mean Median Standard Deviation Minimum Maximum Number of Observations
RET 0.000409 0.000548 0.018073 −0.308225 0.350837 38,936
VOL 11,965,626 6,329,272 16,900,193 416,678 331,572,902 38,936
BID 213.455 147.610 389.842 21.990 3731.410 38,936
ASK 213.510 147.625 390.018 22.000 3733.390 38,936
SHROUT 2,326,367 1,274,956 2,756,085 228,399 17,102,540 38,936
vwretd 0.000605 0.000987 0.013280 −0.118168 0.091556 38,936
These statistics provide a comprehensive overview of the dataset, highlighting key measures of central tendency and dispersion. [View Analysis]

Panel C: Assessment of the Results from ChatGPT-4o
The statistics for return (ret) and volume (vol) do not match those generated by Stata, while those for bid price (bid), ask price (ask), shares outstanding (shrout), and value-weighted return (vwretd) do match. This assessment indicates that ChatGPT-4o can process and analyze most of the data accurately, but discrepancies in some variables highlight the need for further validation and cross-checking with traditional statistical software.
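One straightforward way to perform such a cross-check is to recompute the same table locally. The snippet below is a minimal pandas sketch of ours (with a placeholder path), not part of the ChatGPT-4o response.
import pandas as pd

# Load the data (placeholder path)
data = pd.read_csv('path/to/your/Dow 30 Daily Returns.csv')

# Coerce to numeric in case any column contains non-numeric codes
cols = ['RET', 'VOL', 'BID', 'ASK', 'SHROUT', 'vwretd']
data[cols] = data[cols].apply(pd.to_numeric, errors='coerce')

# Mean, median, standard deviation, minimum, maximum, and observation count per variable
summary = data[cols].agg(['mean', 'median', 'std', 'min', 'max', 'count']).T
print(summary.to_string())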
Table 4. Time series analysis of returns of Microsoft and the market. This table evaluates a time series plot for Microsoft’s (ticker = MSFT) return (ret) and the market’s (vwretd) return from 2019 to 2023. Panel A presents the results generated using Stata. Panel B shows the results generated using ChatGPT-4o. The “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel C offers an “Assessment” of the quality and accuracy of the ChatGPT-4o response provided by the authors.
Panel A: Results from Stata
[Image: Stata-generated time series plot of Microsoft's daily return (ret) and the market return (vwretd), 2019 to 2023]
Panel B: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file]

Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS. Provide a time series plot for the return (ret) of Microsoft (ticker = MSFT) and the market return (vwretd).

Response:
It seems there’s an issue with generating the plot directly. Let’s save the plot as an image file and provide it that way. I’ll try creating the plot again and save it as an image file. [View Analysis]

It looks like we’re facing persistent issues with generating the plot. Let’s try an alternative approach by summarizing the data trends instead.

For the analysis of the daily returns of Microsoft (RET) and the market return (vwretd) from 2019 to 2023:

Microsoft (RET)
  • Trend: Over the observed period, Microsoft’s daily returns have shown typical fluctuations associated with a high-profile technology company. This includes notable peaks and troughs corresponding to major market events and company-specific news.
  • Volatility: The daily return volatility reflects the underlying market conditions, including periods of heightened uncertainty like the COVID-19 pandemic.

Market Return (vwretd)
  • Trend: The value-weighted market return similarly shows fluctuations, but these are generally more subdued compared to individual stock returns. This is because it represents a broader market index, which is inherently more stable.
  • Volatility: Market returns exhibit less volatility on a day-to-day basis compared to individual stocks due to the diversification effect.

Correlation
  • Positive correlation: Both Microsoft’s returns and the market returns tend to move in the same direction, showing a positive correlation typical for a large-cap stock within a major index.

For a more detailed visual representation, you can use statistical software or tools like Python, R, or Excel to plot the data directly on your local machine. If you need further assistance with the steps to create the plot locally, please let me know! [View Analysis]

Panel C: Assessment of the Results from ChatGPT-4o
ChatGPT-4o has difficulty generating the figure from the data based on the prompt. However, Table 2: Zero-shot prompting does demonstrate the ability of ChatGPT-4o to visualize the time series plots. This indicates some instability in ChatGPT-4o’s ability to generate responses consistently.
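When ChatGPT-4o cannot render the figure, the requested plot can be produced locally. The matplotlib sketch below is our own illustration (with a placeholder path), not part of the ChatGPT-4o response.
import pandas as pd
import matplotlib.pyplot as plt

# Load the data (placeholder path)
data = pd.read_csv('path/to/your/Dow 30 Daily Returns.csv')
data['date'] = pd.to_datetime(data['date'])

# Microsoft's daily return and the value-weighted market return
msft = data[data['TICKER'] == 'MSFT'].copy()
msft['RET'] = pd.to_numeric(msft['RET'], errors='coerce')
market = data[['date', 'vwretd']].drop_duplicates('date')

plt.figure(figsize=(10, 5))
plt.plot(market['date'], market['vwretd'], label='Market return (vwretd)', linewidth=0.8, alpha=0.7)
plt.plot(msft['date'], msft['RET'], label='MSFT return (ret)', linewidth=0.8)
plt.xlabel('Date')
plt.ylabel('Daily return')
plt.legend()
plt.tight_layout()
plt.show()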
Table 5. Returns around COVID-19 lockdown. This table assesses the stock returns on the trading days before and after the first COVID-19 lockdown day (Sunday, 15 March 2020). Panel A presents the results generated using Stata. Panel B shows the results generated using ChatGPT-4o. The “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel C offers an “Assessment” of the quality and accuracy of the ChatGPT-4o response provided by the authors.
Panel A: Results from Stata
TICKER 13-March-2020 16-March-2020 Total
AAPL 0.119808 −0.128647 −0.008839
AMGN 0.108977 −0.066403 0.042574
AMZN 0.064648 −0.053697 0.010951
AXP 0.192386 −0.137851 0.054535
BA 0.099199 −0.238484 −0.139285
C 0.179843 −0.192986 −0.013143
CAT 0.079991 −0.062525 0.017466
CRM 0.051142 −0.158885 −0.107743
CSCO 0.133735 −0.104410 0.029325
CVX 0.093889 −0.164469 −0.070580
DIS 0.116654 −0.073254 0.043400
DOW 0.085340 −0.143636 −0.058296
GS 0.175803 −0.127053 0.048750
HD 0.079123 −0.197938 −0.118815
HON 0.107972 −0.094639 0.013333
IBM 0.049995 −0.082168 −0.032173
INTC 0.195213 −0.180415 0.014798
JNJ 0.070808 −0.053317 0.017491
JPM 0.180125 −0.149649 0.030476
KO 0.036472 −0.066227 −0.029755
MCD 0.041145 −0.158754 −0.117609
MMM 0.060638 −0.076016 −0.015378
MRK 0.039087 −0.088990 −0.049903
MSFT 0.142169 −0.147390 −0.005221
NKE 0.018598 −0.116301 −0.097703
PG 0.120090 −0.048830 0.071260
TRV 0.051576 −0.208004 −0.156428
UNH 0.090691 −0.172769 −0.082078
V 0.098388 −0.135472 −0.037084
VZ 0.058008 −0.058704 −0.000696
WMT 0.096588 −0.064330 0.032258
Total 0.098003 −0.121039 −0.023036
Panel B: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file]

Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS.
Provide the return (ret) of each stock on the trading days before and after the first COVID lockdown day (Sunday, 15 March 2020). Compute the total return of each stock’s two trading days, and the average return of all stocks’ two trading days.

Response:

It seems that there is an issue with accessing or interpreting the data. Let’s re-attempt by ensuring we handle the file and its columns correctly. I’ll reload the file and ensure proper column handling.
Let’s try this step-by-step again to pinpoint the issue. [View Analysis]

It appears that there is a persistent issue with accessing the data. Let’s approach this step by step:
  • Check if the data are loaded correctly.
  • Ensure that the date conversion and column selection are correct.
  • Isolate the trading days before and after the specified lockdown date.

Let’s retry these steps carefully. [View Analysis]

It seems there’s an ongoing issue with loading the CSV file. To proceed effectively, I’ll outline the steps you can follow to analyze the data on your local machine using Python:
1. Load the Data: Ensure the CSV file is loaded correctly using pandas.
Python Copy code
import pandas as pd

# Load the data
file_path = 'path/to/your/Dow 30 Daily Returns.csv'
data = pd.read_csv(file_path)
2. Convert Date Column: Convert the date column to datetime format.
Python Copy code
data['date'] = pd.to_datetime(data['date'])
3. Identify Trading Days: Identify the trading days before and after 15 March 2020.
Python Copy code
lockdown_date = pd.to_datetime('2020-03-15')
trading_day_before = data[data['date'] < lockdown_date]['date'].max()
trading_day_after = data[data['date'] > lockdown_date]['date'].min()
4. Filter Data: Filter the data for these trading days.
Python Copy code
data_before_after = data[(data['date'] == trading_day_before) | (data['date'] == trading_day_after)]
5. Calculate Returns: Calculate the total return of each stock over these two days and the average return.
Python Copy code
total_returns = data_before_after.groupby('TICKER')['RET'].sum()
average_return = total_returns.mean()

print(total_returns)
print(f'Average Return: {average_return}')
If you follow these steps on your local machine, you should be able to analyze the daily returns data of the Dow 30 components effectively. If you encounter any specific errors or issues during these steps, please let me know, and I can assist you further! [View Analysis]
Panel C: Assessment of the Results from ChatGPT-4o
We used the above Python code suggested by ChatGPT-4o and obtained the following results.
TICKER Total
AAPL −0.008839
AMGN 0.042574
AMZN 0.010951
AXP 0.054535
BA −0.139285
C −0.013143
CAT 0.017466
CRM −0.107743
CSCO 0.029325
CVX −0.070580
DIS 0.043400
DOW −0.058296
GS 0.048750
HD −0.118815
HON 0.013333
IBM −0.032173
INTC 0.014798
JNJ 0.017491
JPM 0.030476
KO −0.029755
MCD −0.117609
MMM −0.015378
MRK −0.049903
MSFT −0.005221
NKE −0.097703
PG 0.071260
TRV −0.156428
UNH −0.082078
V −0.037084
VZ −0.000696
WMT 0.032258
Total −0.023036
Thus, the Python code given by ChatGPT-4o is correct but incomplete. It does not display the daily returns on 13 March and 16 March. To fix the problem, we add a couple of lines of code to the initial one. After running the corrected code shown below, we obtain the same results as those in Panel A generated by Stata.
import pandas as pd

# Load the data
file_path = 'path/to/your/Dow 30 Daily Returns.csv'
data = pd.read_csv(file_path)

# Convert date column
data['date'] = pd.to_datetime(data['date'])

# Identify trading days
lockdown_date = pd.to_datetime('2020-03-15')
trading_day_before = data[data['date'] < lockdown_date]['date'].max()
trading_day_after = data[data['date'] > lockdown_date]['date'].min()

# Filter data
data_before_after = data[(data['date'] == trading_day_before) | (data['date'] == trading_day_after)]
data_before_after = data_before_after[['TICKER', 'date', 'RET']]
data_before_after = data_before_after.sort_values(by = ['TICKER', 'date'])
pd.set_option('display.max_columns', None)
print(data_before_after.to_string())


# Calculate total returns
total_returns = data_before_after.groupby('TICKER')['RET'].sum()
total_perday_returns = data_before_after.groupby('date')['RET'].mean()
average_return = total_returns.mean()

# Display the results
print(total_returns)
print(total_perday_returns)
print(f'Average Return: {average_return}')
TICKER 13-March-2020 16-March-2020 Total
AAPL 0.119808 −0.128647 −0.008839
AMGN 0.108977 −0.066403 0.042574
AMZN 0.064648 −0.053697 0.010951
AXP 0.192386 −0.137851 0.054535
BA 0.099199 −0.238484 −0.139285
C 0.179843 −0.192986 −0.013143
CAT 0.079991 −0.062525 0.017466
CRM 0.051142 −0.158885 −0.107743
CSCO 0.133735 −0.104410 0.029325
CVX 0.093889 −0.164469 −0.070580
DIS 0.116654 −0.073254 0.043400
DOW 0.085340 −0.143636 −0.058296
GS 0.175803 −0.127053 0.048750
HD 0.079123 −0.197938 −0.118815
HON 0.107972 −0.094639 0.013333
IBM 0.049995 −0.082168 −0.032173
INTC 0.195213 −0.180415 0.014798
JNJ 0.070808 −0.053317 0.017491
JPM 0.180125 −0.149649 0.030476
KO 0.036472 −0.066227 −0.029755
MCD 0.041145 −0.158754 −0.117609
MMM 0.060638 −0.076016 −0.015378
MRK 0.039087 −0.088990 −0.049903
MSFT 0.142169 −0.147390 −0.005221
NKE 0.018598 −0.116301 −0.097703
PG 0.120090 −0.048830 0.071260
TRV 0.051576 −0.208004 −0.156428
UNH 0.090691 −0.172769 −0.082078
V 0.098388 −0.135472 −0.037084
VZ 0.058008 −0.058704 −0.000696
WMT 0.096588 −0.064330 0.032258
Total 0.098003 −0.121039 −0.023036
Table 6. Sharpe ratio analysis. This table assesses the Sharpe ratio, (stock return − risk-free rate)/(standard deviation of the stock return), for each stock each year. Panel A presents the results generated using Stata. Panel B shows the results generated using ChatGPT-4o. The “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel C offers an “Assessment” of the quality and accuracy of the ChatGPT-4o response provided by the authors.
Panel A: Results from Stata
TICKER 2019 2020 2021 2022 2023
AAPL 3.251206 1.745468 1.380334 −0.77026 2.079339
AMGN 1.191135 −0.06218 0.043803 0.8633 0.395884
AMZN 0.891517 1.958081 0.0986 −1.00693 2.202766
AXP 1.741743 −0.02399 1.318358 −0.27628 0.8745
BA 0.039458 −0.38989 −0.16568 −0.14373 1.124431
C 2.230223 −0.29584 0.038892 −0.69407 0.527584
CAT 0.62998 0.587773 0.624957 0.505119 0.705026
CRM 0.633002 0.684419 0.504112 −1.03044 3.027711
CSCO 0.467521 −0.09327 2.384514 −0.80602 0.206838
CVX 0.697617 −0.42961 1.891078 1.712105 −0.76628
DIS 1.360394 0.504435 −0.58419 −1.20843 −0.02319
DOW 0.003003 0.122246 0.237751 −0.26553 0.424431
GS 1.585639 0.325115 1.888141 −0.31028 0.437651
HD 1.529674 0.55179 2.963595 −0.73857 0.350903
HON 1.954297 0.495626 −0.01511 0.136817 −0.25516
IBM 1.019095 −0.03861 0.71867 0.384131 1.030328
INTC 1.033142 −0.28115 0.191356 −1.23999 2.235086
JNJ 0.836069 0.343391 0.789819 0.257685 −0.78172
JPM 2.358938 −0.1082 1.297706 −0.46437 1.187297
KO 1.062758 0.058642 0.749322 0.461006 −0.66602
MCD 0.770792 0.271025 1.817212 −0.04574 0.696737
MMM −0.25117 0.063578 0.26232 −1.12427 −0.28893
MRK 1.069891 −0.23757 0.082563 2.377886 −0.1987
MSFT 2.735991 0.954548 2.493022 −0.82381 2.034473
NKE 1.699654 0.971769 0.668222 −0.71205 −0.38724
PG 2.229552 0.42118 1.424969 −0.29119 −0.36953
TRV 0.875852 0.102008 0.643098 0.908932 −0.04441
UNH 0.690608 0.430659 2.362684 0.223695 −0.18567
V 2.236204 0.389683 −0.01329 −0.15375 1.317385
VZ 0.72356 −0.02082 −0.52791 −0.93618 −0.089
WMT 1.904296 0.724101 0.117798 −0.07003 0.488505
Total 1.264569 0.31369112 0.82860385 −0.17036265 0.55777513
Panel B: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file & the “FamaFrench 3Factors” file]

Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS, and the daily risk free rate, pulled from Kenneth R. French—Data Library.
Compute the annualized excess return (raw return—risk free rate) of each stock.
Compute the standard deviation of excess returns for each stock each year.
Then, calculate the Sharpe Ratio, the risk-adjusted return, for each stock each year.
Report the Sharpe Ratio for each stock each year and the average Sharpe Ratio of all stocks each year.

Response:
It looks like there’s an issue with generating the computations directly. Let’s break down the process into more manageable steps to ensure we capture all necessary calculations. Here’s the step-by-step approach:
  • Load and merge the data.
  • Compute the annualized excess returns.
  • Calculate the standard deviation of the excess returns.
  • Compute the Sharpe ratios.
  • Report the Sharpe ratios and the average Sharpe ratio for each year.
I’ll guide you through each of these steps. Please use the following code snippets on your local machine to achieve the required analysis.
1. Load and Merge the Data
Python Copy code
import pandas as pd

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Load the Fama-French 3 factors data (including the risk-free rate)
ff_data_path = 'path/to/FamaFrench_3Factors.csv'
ff_data = pd.read_csv(ff_data_path)

# Convert date columns to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])
ff_data['date'] = pd.to_datetime(ff_data['date'], format = '%Y%m%d')

# Merge the datasets on the date column
merged_data = pd.merge(dow_data, ff_data, on = 'date', how = 'inner')
2. Compute the Annualized Excess Returns
Python Copy code
# Calculate the excess return (raw return - risk free rate)
merged_data['excess_ret'] = merged_data['RET'] - merged_data['RF']

# Extract the year from the date for annual calculations
merged_data['year'] = merged_data['date'].dt.year

# Function to annualize returns
def annualize_return(daily_returns):
    return (1 + daily_returns).prod() ** (252/len(daily_returns)) - 1

# Compute the annualized excess return for each stock each year
annual_excess_return = merged_data.groupby(['TICKER', 'year'])['excess_ret'].apply(annualize_return).reset_index()
annual_excess_return.columns = ['TICKER', 'year', 'annual_excess_return']
3. Calculate the Standard Deviation of Excess Returns
Python Copy code
# Compute the standard deviation of excess returns for each stock each year
std_excess_return = merged_data.groupby(['TICKER', 'year'])['excess_ret'].std().reset_index()
std_excess_return.columns = ['TICKER', 'year', 'std_excess_return']
4. Compute the Sharpe Ratios
Python Copy code
# Merge the annualized return and standard deviation data
annual_stats = pd.merge(annual_excess_return, std_excess_return, on = ['TICKER', 'year'])

# Calculate the Sharpe Ratio (annualized excess return/standard deviation of excess return)
annual_stats['sharpe_ratio'] = annual_stats['annual_excess_return']/annual_stats['std_excess_return']
5. Report the Sharpe Ratios and the Average Sharpe Ratio for Each Year
Python Copy code
# Calculate the average Sharpe Ratio for all stocks each year
average_sharpe_ratio = annual_stats.groupby('year')['sharpe_ratio'].mean().reset_index()
average_sharpe_ratio.columns = ['year', 'average_sharpe_ratio']

# Merge the average Sharpe Ratio with the annual stats
annual_stats = pd.merge(annual_stats, average_sharpe_ratio, on = 'year', how = 'left')

# Display the results
print(annual_stats.head())
print(average_sharpe_ratio.head())
By following these steps, you should be able to compute the required statistics on your local machine. If you encounter any specific errors, please provide details, and I can assist you further.

Panel C: Assessment of the Results from ChatGPT-4o
We encountered some problems when running the code provided by ChatGPT-4o. First, Python could not find the “date” column in the Fama–French dataset. By using the command “print(ff_data.columns)”, we discovered that the column name should be “Date”. Second, the commands in the last two lines, “print(annual_stats.head())” and “print(average_sharpe_ratio.head())”, only display the first five rows of the results. Thus, we changed them to “print(annual_stats.to_string())” and “print(average_sharpe_ratio.to_string())”. After these adjustments, we obtained the following results.
TICKER 2019 2020 2021 2022 2023
AAPL 51.6112985 27.6178279 21.9121292 −12.244374 33.1895478
AMGN 18.9086843 −0.9851038 0.69535698 13.7365684 6.31156242
AMZN 14.1523959 30.9843846 1.56523015 −15.996577 35.1909812
AXP 27.6493114 −0.3801246 20.9282788 −4.3936676 13.9494736
BA 0.62638058 −6.1817707 −2.6300849 −2.2858399 17.9409506
C 35.4036827 −4.6889806 0.61738469 −11.034357 8.41287971
CAT 10.0006154 9.30766809 9.92089095 8.03703771 11.2451069
CRM 10.0485906 10.8363509 8.00253161 −16.371046 48.3916149
CSCO 7.42166932 −1.4778082 37.8529936 −12.814152 3.2971023
CVX 11.0743322 −6.8102465 30.0199379 27.2589356 −12.203726
DIS 21.5955862 7.98819977 −9.2737801 −19.200922 −0.3695768
DOW 0.0547696 1.93646856 3.77418689 −4.2228401 6.76698079
GS 25.1712383 5.14918619 29.9733156 −4.9344526 6.97805876
HD 24.2828208 8.73821608 47.0456082 −11.741817 5.59428674
HON 31.0235102 7.84900383 −0.2399185 2.17637072 −4.0659503
IBM 16.1776254 −0.6116918 11.4085386 6.11110705 16.4313454
INTC 16.400627 −4.4557052 3.03768961 −19.700844 35.7198084
JNJ 13.2721827 5.43929177 12.5379883 4.09912797 −12.452303
JPM 37.446978 −1.7144033 20.6004417 −7.3841172 18.9402234
KO 16.8707651 0.92903461 11.8951208 7.33408611 −10.611021
MCD 12.2359392 4.29298387 28.8473469 −0.7276113 11.1086764
MMM −3.987242 1.00723092 4.16420214 −17.870596 −4.6034771
MRK 16.9839904 −3.7643519 1.31064415 37.8539555 −3.1663317
MSFT 43.4325013 15.111926 39.5754938 −13.09519 32.4821803
NKE 26.981161 15.3849393 10.6076948 −11.318366 −6.1690898
PG 35.3930457 6.67107363 22.6206742 −4.6310539 −5.888251
TRV 13.9037223 1.61596449 10.2088616 14.4631431 −0.7076998
UNH 10.9630554 6.8203578 37.5064363 3.55848636 −2.9586416
V 35.4986422 6.17185233 −0.2109088 −2.4453984 21.0124148
VZ 11.4861602 −0.32978 −8.3802438 −14.884179 −1.418363
WMT 30.2297551 11.4671743 1.86999057 −1.1138189 7.78804097
Total 20.0746386 4.96513446 13.1536784 −2.7026581 8.90763885
However, the results shown in the above table are very different from those from Stata. We reviewed the Python code again and found that the standard deviation of the excess returns was incorrect because it was not annualized like the excess returns. Thus, we corrected the Python code as follows:
import pandas as pd

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Load the Fama-French 3 factors data (including the risk-free rate)
ff_data_path = 'path/to/FamaFrench_3Factors.csv'
ff_data = pd.read_csv(ff_data_path)

# Convert date columns to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])
ff_data['date'] = pd.to_datetime(ff_data['Date'], format = '%Y%m%d')

# Merge the datasets on the date column
merged_data = pd.merge(dow_data, ff_data, on = 'date', how = 'inner')

# Calculate the excess return (raw return - risk free rate)
merged_data['excess_ret'] = merged_data['RET'] - merged_data['RF']

# Extract the year from the date for annual calculations
merged_data['year'] = merged_data['date'].dt.year

# Function to annualize returns
def annualize_return(daily_returns):
    return (1 + daily_returns).prod() ** (252/len(daily_returns)) - 1

# Compute the annualized excess return for each stock each year
annual_excess_return = merged_data.groupby(['TICKER', 'year'])['excess_ret'].apply(annualize_return).reset_index()
annual_excess_return.columns = ['TICKER', 'year', 'annual_excess_return']

# Compute the standard deviation of excess returns for each stock each year
std_excess_return = merged_data.groupby(['TICKER', 'year'])['excess_ret'].std().reset_index()
std_excess_return['excess_ret'] = std_excess_return['excess_ret'] * (252 ** 0.5)
std_excess_return.columns = ['TICKER', 'year', 'std_excess_return']

# Merge the annualized return and standard deviation data
annual_stats = pd.merge(annual_excess_return, std_excess_return, on = ['TICKER', 'year'])

# Calculate the Sharpe Ratio (annualized excess return/standard deviation of excess return)
annual_stats['sharpe_ratio'] = annual_stats['annual_excess_return']/annual_stats['std_excess_return']

# Calculate the average Sharpe Ratio for all stocks each year
average_sharpe_ratio = annual_stats.groupby('year')['sharpe_ratio'].mean().reset_index()
average_sharpe_ratio.columns = ['year', 'average_sharpe_ratio']

# Merge the average Sharpe Ratio with the annual stats
annual_stats = pd.merge(annual_stats, average_sharpe_ratio, on = 'year', how = 'left')

# Display the results
pd.set_option('display.max_columns', None)
print(annual_stats.to_string())
print(average_sharpe_ratio.to_string())
We obtained the following results by running the corrected code in Python:
TICKER 2019 2020 2021 2022 2023
AAPL 3.251206 1.739760 1.380334 −0.771323 2.090745
AMGN 1.191135 −0.062056 0.043803 0.865322 0.397591
AMZN 0.891517 1.951833 0.098600 −1.007690 2.216823
AXP 1.741743 −0.023946 1.318358 −0.276775 0.878734
BA 0.039458 −0.389415 −0.165680 −0.143994 1.130174
C 2.230222 −0.295378 0.038892 −0.695099 0.529962
CAT 0.629980 0.586328 0.624957 0.506286 0.708375
CRM 0.633002 0.682626 0.504112 −1.031279 3.048385
CSCO 0.467521 −0.093093 2.384514 −0.807216 0.207698
CVX 0.697617 −0.429005 1.891078 1.717152 −0.768762
DIS 1.360394 0.503209 −0.584193 −1.209544 −0.023281
DOW 0.003450 0.121986 0.237751 −0.266014 0.426280
GS 1.585639 0.324368 1.888141 −0.310841 0.439576
HD 1.529674 0.550456 2.963595 −0.739665 0.352407
HON 1.954297 0.494441 −0.015113 0.137098 −0.256131
IBM 1.019095 −0.038533 0.718670 0.384964 1.035077
INTC 1.033142 −0.280683 0.191356 −1.241037 2.250136
JNJ 0.836069 0.342643 0.789819 0.258221 −0.784421
JPM 2.358938 −0.107997 1.297706 −0.465156 1.193122
KO 1.062758 0.058524 0.749322 0.462004 −0.668431
MCD 0.770792 0.270433 1.817212 −0.045835 0.699781
MMM −0.251173 0.063450 0.262320 −1.125742 −0.289992
MRK 1.069891 −0.237132 0.082563 2.384575 −0.199460
MSFT 2.735990 0.951962 2.493022 −0.824919 2.046185
NKE 1.699653 0.969160 0.668222 −0.712990 −0.388616
PG 2.229552 0.420238 1.424969 −0.291729 −0.370925
TRV 0.875852 0.101796 0.643098 0.911092 −0.044581
UNH 0.690608 0.429642 2.362683 0.224164 −0.186377
V 2.236204 0.388790 −0.013286 −0.154046 1.323658
VZ 0.723560 −0.020774 −0.527906 −0.937615 −0.089348
WMT 1.904296 0.722364 0.117798 −0.070164 0.490600
Total 1.264583 0.312774 0.828604 −0.170252 0.561129
Thus, the results from Python are very similar to those in Panel A generated using Stata. The small difference is due to the method of annualization. The code in Stata uses the actual number of trading days to compute the annual excess return and its annual standard deviation, for example, 252 days in 2019 and 253 days in 2020. However, the code suggested by ChatGPT-4o uses 252 days to annualize the excess return in all years, regardless of the actual number of trading days. If we also use the actual trading days to compute the annualized return and standard deviation in Python, we obtain exactly the same results in both Stata and Python.
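For completeness, the adjustment described above can be written as a small change to the annualization step. The sketch below is ours (not ChatGPT-4o's) and uses the actual trading-day count of each stock-year for both the return and the standard deviation.
import numpy as np

def sharpe_actual_days(excess_returns):
    # excess_returns: daily excess returns of one stock in one calendar year
    n_days = len(excess_returns)                      # actual trading days (e.g., 252 in 2019, 253 in 2020)
    ann_ret = (1 + excess_returns).prod() - 1         # exponent n_days/n_days = 1, i.e., compounding over the year
    ann_std = excess_returns.std() * np.sqrt(n_days)  # annualize the daily standard deviation with the same day count
    return ann_ret / ann_std

# Applied per stock-year, mirroring the grouping used in the corrected code above:
# sharpe = merged_data.groupby(['TICKER', 'year'])['excess_ret'].apply(sharpe_actual_days)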
Table 7. Market Beta Analysis. This table assesses the market (CAPM) beta for each stock each year. Panel A presents the results generated using Stata. Panel B shows the results generated using ChatGPT-4o. The “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel C offers an “Assessment” of the quality and accuracy of the ChatGPT-4o response provided by the authors.
Panel A: Results from Stata
TICKER 2019 2020 2021 2022 2023
AAPL 1.4751123 1.1174583 1.1956272 1.2385795 1.0368298
AMGN 0.68597156 0.78617704 0.5071367 0.31121698 0.47645012
AMZN 1.2759426 0.68658948 0.98217767 1.5921936 1.464787
AXP 1.008621 1.4660687 1.0471015 1.077038 1.1996084
BA 0.95004112 1.7258273 1.4960568 1.1925852 0.97405726
C 1.4439909 1.5658641 0.86605388 0.9147712 1.1263285
CAT 1.3869772 0.98232561 0.85957801 0.73104262 1.102708
CRM 1.2386572 1.0655986 1.1196494 1.4709156 1.1648153
CSCO 1.2299453 0.97472245 0.73084158 0.78048164 0.71736181
CVX 0.80189067 1.3591725 0.81278944 0.52070349 0.57260776
DIS 0.73806602 1.0486075 0.86295199 1.1206871 1.0726198
DOW 1.4848993 1.3272684 1.0509194 0.72340429 0.94973165
GS 1.2125224 1.2718841 0.91507107 0.91560948 1.0485996
HD 0.81915396 1.0915444 0.70227456 0.88494635 1.0125784
HON 1.0125303 1.0724968 0.78497875 0.73390293 0.8524366
IBM 1.0314893 0.9806186 0.4211342 0.48465607 0.47231638
INTC 1.2596713 1.1506979 1.262306 1.1338916 1.3475364
JNJ 0.49547896 0.65726101 0.2881622 0.26355907 0.29711422
JPM 1.020825 1.2737336 0.75167328 0.82942206 0.84921855
KO 0.38498557 0.78940171 0.40562904 0.43669266 0.34200588
MCD 0.36507782 0.93903095 0.5414601 0.44058502 0.45249423
MMM 1.1402873 0.80912733 0.36016268 0.63740867 1.1587117
MRK 0.59414691 0.68459654 0.19001053 0.24593455 0.21484913
MSFT 1.2183875 1.1175761 1.0304406 1.2136495 1.0744908
NKE 1.0697894 0.94029278 0.91890192 1.2785211 0.94393331
PG 0.46374574 0.68833619 0.23536041 0.41887146 0.33232403
TRV 0.61573184 1.0550519 0.72342438 0.39097619 0.48993209
UNH 0.69263011 1.1224691 0.50640661 0.51883662 0.23741463
V 1.000208 1.1191038 0.92548811 0.91953123 0.72169179
VZ 0.32166165 0.51305872 0.13739157 0.33252576 0.44305083
WMT 0.463933 0.502122 0.411849 0.402764 0.363042
Panel B: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file & the “FamaFrench 3Factors” file]

Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS, and the daily risk free rate, pulled from Kenneth R. French—Data Library.
Compute the CAPM’s Beta for each stock each year.

Response:
It seems there’s an issue with inspecting the data columns. I will provide you with a detailed guide on how to analyze this data to compute the CAPM’s Beta for each stock each year on your local machine.
Steps to Compute CAPM’s Beta
  • Load and Merge Data: Load the Dow 30 returns data and the Fama–French 3 factors data, then merge them.
  • Calculate Excess Returns: Calculate the excess returns for each stock and the market.
  • Compute Beta: Compute the CAPM’s Beta for each stock for each year using linear regression.
Detailed Guide

1. Load and Merge Data
Python Copy code
import pandas as pd

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Load the Fama-French 3 factors data
ff_data_path = 'path/to/FamaFrench_3Factors.csv'
ff_data = pd.read_csv(ff_data_path)

# Convert date columns to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])
ff_data['Date'] = pd.to_datetime(ff_data['Date'], format = '%Y%m%d')

# Merge the datasets on the date column
merged_data = pd.merge(dow_data, ff_data, left_on = 'date', right_on = 'Date', how = 'inner')
2. Compute Excess Returns
Python Copy code
# Calculate the excess return for each stock and the market
merged_data['excess_stock_ret'] = merged_data['RET'] - merged_data['RF']
merged_data['excess_market_ret'] = merged_data['Mkt-RF']
3. Compute Beta
Python Copy code
# Function to compute Beta using linear regression
def compute_beta(group):
    slope, intercept, r_value, p_value, std_err = stats.linregress(group['excess_market_ret'], group['excess_stock_ret'])
    return slope

# Extract the year from the date for annual calculations
merged_data['year'] = merged_data['date'].dt.year

# Group data by TICKER and year to compute Beta for each stock each year
betas = merged_data.groupby(['TICKER', 'year']).apply(compute_beta).reset_index()
betas.columns = ['TICKER', 'year', 'beta']

# Display the results
print(betas.head())
Complete Script
Python Copy code
import pandas as pd
from scipy import stats

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Load the Fama-French 3 factors data
ff_data_path = 'path/to/FamaFrench_3Factors.csv'
ff_data = pd.read_csv(ff_data_path)

# Convert date columns to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])
ff_data['Date'] = pd.to_datetime(ff_data['Date'], format = '%Y%m%d')

# Merge the datasets on the date column
merged_data = pd.merge(dow_data, ff_data, left_on = 'date', right_on = 'Date', how = 'inner')

# Calculate the excess return for each stock and the market
merged_data['excess_stock_ret'] = merged_data['RET'] - merged_data['RF']
merged_data['excess_market_ret'] = merged_data['Mkt-RF']

# Function to compute Beta using linear regression
def compute_beta(group):
    slope, intercept, r_value, p_value, std_err = stats.linregress(group['excess_market_ret'], group['excess_stock_ret'])
    return slope

# Extract the year from the date for annual calculations
merged_data['year'] = merged_data['date'].dt.year

# Group data by TICKER and year to compute Beta for each stock each year
betas = merged_data.groupby(['TICKER', 'year']).apply(compute_beta).reset_index()
betas.columns = ['TICKER', 'year', 'beta']

# Display the results
print(betas.head())
By following these steps on your local machine, you should be able to compute the CAPM’s Beta for each stock each year. If you encounter any specific issues or errors, please provide the details so I can assist you further. [View Analysis]

Panel C: Assessment of the Results from ChatGPT-4o
We could not obtain results using the initial Python code provided by ChatGPT-4o. There are a few errors. First, the variable name for “excess market return” in the merged_data dataset should be ‘MktRF’, not ‘Mkt-RF’. Second, the beta value for DOW in 2019 was missing because the value of the ‘ret’ variable was null on 2 April 2019. We corrected these issues by renaming the variable and adding a line of code to delete observations with null values in either the ‘excess_stock_ret’ or ‘excess_market_ret’ columns.
import pandas as pd
from scipy import stats

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Load the Fama-French 3 factors data
ff_data_path = 'path/to/FamaFrench_3Factors.csv'
ff_data = pd.read_csv(ff_data_path)

# Convert date columns to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])
ff_data['Date'] = pd.to_datetime(ff_data['Date'], format = '%Y%m%d')

# Merge the datasets on the date column
merged_data = pd.merge(dow_data, ff_data, left_on = 'date', right_on = 'Date', how = 'inner')

# Calculate the excess return for each stock and the market
merged_data['excess_stock_ret'] = merged_data['RET'] - merged_data['RF']
merged_data['excess_market_ret'] = merged_data['MktRF']
merged_data.dropna(subset = ['excess_stock_ret', 'excess_market_ret'], inplace = True)

# Function to compute Beta using linear regression
def compute_beta(group):
    slope, intercept, r_value, p_value, std_err = stats.linregress(group['excess_market_ret'], group['excess_stock_ret'])
    return slope

# Extract the year from the date for annual calculations
merged_data['year'] = merged_data['date'].dt.year

# Group data by TICKER and year to compute beta for each stock each year
betas = merged_data.groupby(['TICKER', 'year']).apply(compute_beta).reset_index()
betas.columns = ['TICKER', 'year', 'beta']

# Display the results
print(betas.head())
print(betas.to_string())
After fixing these issues, the results obtained were consistent with those from Stata.
TICKER 2019 2020 2021 2022 2023
AAPL 1.475112 1.117458 1.195627 1.238580 1.036830
AMGN 0.685972 0.786177 0.507137 0.311217 0.476450
AMZN 1.275943 0.686589 0.982178 1.592194 1.464787
AXP 1.008621 1.466069 1.047102 1.077038 1.199609
BA 0.950041 1.725827 1.496057 1.192585 0.974057
C 1.443991 1.565864 0.866054 0.914771 1.126328
CAT 1.386977 0.982326 0.859578 0.731043 1.102708
CRM 1.238657 1.065599 1.119649 1.470916 1.164815
CSCO 1.229945 0.974722 0.730842 0.780482 0.717362
CVX 0.801891 1.359172 0.812789 0.520704 0.572608
DIS 0.738066 1.048607 0.862952 1.120687 1.072620
DOW 1.484899 1.327268 1.050919 0.723404 0.949732
GS 1.212522 1.271884 0.915071 0.915609 1.048600
HD 0.819154 1.091544 0.702275 0.884946 1.012578
HON 1.012530 1.072497 0.784979 0.733903 0.852437
IBM 1.031489 0.980619 0.421134 0.484656 0.472316
INTC 1.259671 1.150698 1.262306 1.133892 1.347536
JNJ 0.495479 0.657261 0.288162 0.263559 0.297114
JPM 1.020825 1.273734 0.751673 0.829422 0.849219
KO 0.384986 0.789402 0.405629 0.436693 0.342006
MCD 0.365078 0.939031 0.541460 0.440585 0.452494
MMM 1.140287 0.809127 0.360163 0.637409 1.158712
MRK 0.594147 0.684597 0.190011 0.245935 0.214849
MSFT 1.218387 1.117576 1.030441 1.213650 1.074491
NKE 1.069789 0.940293 0.918902 1.278521 0.943933
PG 0.463746 0.688336 0.235360 0.418871 0.332324
TRV 0.615732 1.055052 0.723424 0.390976 0.489932
UNH 0.692630 1.122469 0.506407 0.518837 0.237415
V 1.000208 1.119104 0.925488 0.919531 0.721692
VZ 0.321662 0.513059 0.137392 0.332526 0.443051
WMT 0.463933 0.502122 0.411849 0.402764 0.363042
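As a quick independent check on the regression betas above, the same quantity can be computed as the covariance of the stock's excess return with the market's excess return divided by the variance of the market's excess return; the OLS slope from a univariate regression is identical to this ratio. The sketch below is ours and reuses the merged_data frame and variable names from the corrected code above.
# Beta as cov(stock excess return, market excess return) / var(market excess return),
# computed per stock-year from the merged_data frame built in the corrected script
beta_check = (
    merged_data.groupby(['TICKER', 'year'])
    .apply(lambda g: g['excess_stock_ret'].cov(g['excess_market_ret'])
                     / g['excess_market_ret'].var())
)
print(beta_check.head())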
Table 8. ARMA-GARCH Estimation. This table assesses the ARMA-GARCH estimation for the market return (vwretd). Panel A presents the results generated using Stata. Panel B shows the results generated using ChatGPT-4o. The “Prompt” refers to the request made by the authors, while the “Response” refers to the feedback provided by ChatGPT-4o. Panel C offers an “Assessment” of the quality and accuracy of the ChatGPT-4o response provided by the authors.
Panel A: Results from Stata
vwretd
Intercept 0.001 *** (0.000)
ARMA
L.ar −0.581 (0.496)
L.ma 0.525 (0.518)
ARCH
L.arch 0.424 *** (0.036)
L.garch 0.637 *** (0.060)
Intercept −0.000 (0.000)
Number of observations 1258
Log likelihood 3829.98
*** p < 0.01, ** p < 0.05, * p < 0.1
Panel B: Results from ChatGPT-4o
Prompts:
[upload from computer the “Dow 30 Daily Returns” file]

Act as a finance professor and statistician. Analyze this data for the daily returns data of Dow 30 components from 2019 to 2023, pulled from WRDS.
Estimate an ARMA-GARCH model on the value-weighted return (vwretd).

Response:
It seems there’s an issue with executing the model fitting directly. I’ll provide a detailed guide on how to estimate an ARMA-GARCH model on the value-weighted return (vwretd) using Python on your local machine.
Steps to Estimate ARMA-GARCH Model
  • Load the Data: Load the Dow 30 returns data and set up the time series.
  • Estimate ARMA Model: Fit an ARMA model to the value-weighted return.
  • Estimate GARCH Model: Fit a GARCH model on the residuals of the ARMA model.
Detailed Guide

1. Load and Merge Data
Python Copy code
import pandas as pd

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Convert date column to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])

# Set date as index for time series analysis
dow_data.set_index('date', inplace = True)

# Extract the value-weighted return (vwretd)
vwretd = dow_data['vwretd'].dropna()
2. Estimate ARMA Model
Python Copy code
import statsmodels.api as sm

# Estimate ARMA model (start with ARMA(1,1) as an example)
arma_model = sm.tsa.ARMA(vwretd, order = (1, 1)).fit()

# Summarize the ARMA model results
arma_summary = arma_model.summary()
print(arma_summary)
3. Estimate GARCH Model
Python Copy code
from arch import arch_model

# Estimate GARCH model on the residuals of the ARMA model
garch_model = arch_model(arma_model.resid, vol = 'Garch', p = 1, q = 1)
garch_fit = garch_model.fit(disp = 'off')

# Summarize the GARCH model results
garch_summary = garch_fit.summary()
print(garch_summary)
Complete Script
Python Copy code
import pandas as pd
import statsmodels.api as sm
from arch import arch_model

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Convert date column to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])

# Set date as index for time series analysis
dow_data.set_index('date', inplace = True)

# Extract the value-weighted return (vwretd)
vwretd = dow_data['vwretd'].dropna()

# Estimate ARMA model (start with ARMA(1,1) as an example)
arma_model = sm.tsa.ARMA(vwretd, order = (1, 1)).fit()

# Summarize the ARMA model results
arma_summary = arma_model.summary()
print(arma_summary)

# Estimate GARCH model on the residuals of the ARMA model
garch_model = arch_model(arma_model.resid, vol = 'Garch', p = 1, q = 1)
garch_fit = garch_model.fit(disp = 'off')

# Summarize the GARCH model results
garch_summary = garch_fit.summary()
print(garch_summary)
By following these steps on your local machine, you should be able to estimate the ARMA-GARCH model on the value-weighted return (vwretd). If you encounter any specific issues or errors, please provide the details so I can assist you further. [View Analysis]
Prompts:
Python code has the following errors:

Traceback (most recent call last):
File “/Users/fliu/PycharmProjects/helloworld/chatGPT531/Table 6 test.py”, line 21, in <module>
arma_model = sm.tsa.ARMA(vwretd, order = (1, 1)).fit()
^^^^^^^^^^^
AttributeError: module ‘statsmodels.tsa.api’ has no attribute ‘ARMA’. Did you mean: ‘ARIMA’?

Response:

The error you encountered is because the ARMA class has been deprecated and is replaced by the ARIMA class in the statsmodels library. You should use the ARIMA class with the appropriate order for ARMA models.
Here is the corrected code using the ARIMA class for ARMA model estimation:
Corrected Script

Python Copy code
import pandas as pd
import statsmodels.api as sm
from arch import arch_model

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Convert date column to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])

# Set date as index for time series analysis
dow_data.set_index('date', inplace = True)

# Extract the value-weighted return (vwretd)
vwretd = dow_data['vwretd'].dropna()

# Estimate ARMA model (start with ARIMA(1,0,1) as an example for ARMA(1,1))
arma_model = sm.tsa.ARIMA(vwretd, order = (1, 0, 1)).fit()

# Summarize the ARMA model results
arma_summary = arma_model.summary()
print(arma_summary)

# Estimate GARCH model on the residuals of the ARMA model
garch_model = arch_model(arma_model.resid, vol = 'Garch', p = 1, q = 1)
garch_fit = garch_model.fit(disp = 'off')

# Summarize the GARCH model results
garch_summary = garch_fit.summary()
print(garch_summary)

Explanation
  • Loading the Data: Load the data and preprocess it by setting the date as the index.
  • Estimating the ARMA Model: Use ARIMA with the order (1, 0, 1) to specify an ARMA(1,1) model. The ARIMA function allows specifying the AR and MA components directly.
  • Estimating the GARCH Model: Fit a GARCH(1,1) model on the residuals of the ARMA model using the arch package.
By following this corrected script, you should be able to estimate the ARMA-GARCH model on the value-weighted return (vwretd) without encountering the AttributeError. If you encounter any further issues or need additional assistance, please let me know!

Panel C: Assessment of the Results from ChatGPT-4o
We did not obtain any results using the initial Python code provided by ChatGPT-4o due to a few errors. First, the ‘ARMA’ function in the ‘statsmodels.tsa.api’ module has been replaced by the ‘ARIMA’ function, so we should use ARIMA(1,0,1) instead of ARMA(1,1). Second, our data form a panel of 30 firms over 5 years, so each ‘vwretd’ value is repeated 30 times in the data; we therefore keep the observations of a single stock and drop the duplicates. To address this, we added a line of code: “dow_data = dow_data[dow_data[‘TICKER’] == ‘AAPL’]”. The corrected code is as follows:
Python Copy code
import pandas as pd
import statsmodels.api as sm
from arch import arch_model

# Load the Dow 30 daily returns data
dow_data_path = 'path/to/Dow 30 Daily Returns.csv'
dow_data = pd.read_csv(dow_data_path)

# Convert date column to datetime format
dow_data['date'] = pd.to_datetime(dow_data['date'])
dow_data = dow_data[dow_data['TICKER'] == 'AAPL']

# Set date as index for time series analysis
dow_data.set_index('date', inplace = True)

# Extract the value-weighted return (vwretd)
vwretd = dow_data['vwretd'].dropna()

# Estimate ARMA model (start with ARIMA(1,0,1) as an example for ARMA(1,1))
arma_model = sm.tsa.ARIMA(vwretd, order = (1, 0, 1)).fit()

# Summarize the ARMA model results
arma_summary = arma_model.summary()
print(arma_summary)

# Estimate GARCH model on the residuals of the ARMA model
garch_model = arch_model(arma_model.resid, vol = 'Garch', p = 1, q = 1)
garch_fit = garch_model.fit(disp = 'off')

# Summarize the GARCH model results
garch_summary = garch_fit.summary()
print(garch_summary)
We obtained the following results using the corrected Python code. The first table below shows the estimated results for the ARIMA (1,0,1) model, and the second table shows the results for the GARCH(1,1) model. We can clearly see that these results differ from those given by Stata. The reason is that Stata estimates ARIMA and GARCH models simultaneously, while Python estimates these two models sequentially. Unfortunately, there is currently no package in Python that can estimate these two models jointly.
ARIMA (1,0,1)
vwretd
Intercept 0.0006 * (0.000)
L.ar −0.5738 *** (0.051)
L.ma 0.4299 *** (0.055)
Number of observations 1258
Log likelihood 3671.892
*** p < 0.01, ** p < 0.05, * p < 0.1
GARCH
sigma2
Omega 3.4142 × 10⁻⁶ *** (2.422 × 10⁻¹¹)
alpha(1) 0.2000 *** (0.03641)
beta(1) 0.7800 *** (0.02977)
Number of observations 1258
Log likelihood 3979.10
*** p < 0.01, ** p < 0.05, * p < 0.1
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
