1. Introduction
The phenomenon of trading volume concentration has been observed across various domains, and particularly in financial markets (
Balakrishnan et al. 2008;
Shankar et al. 2020) and cryptocurrencies (
Doğan and Yalçıntaş 2023;
Metelski and Sobieraj 2022). Similarly, in the technology sector, market dominance by a few major players has been a subject of ongoing debate and scrutiny (
Armoogum et al. 2022;
Mayer-Foulkes and Hafner 2023). This trend is characterized by a convergence of interest and activity around a limited set of assets or entities, often driven by herd behavior and the pursuit of short-term gains (
Terranova and Turco 2022). The research conducted by
Balakrishnan et al. (
2008) revealed that the distribution of daily stock trading volumes exhibits power law characteristics, indicating that daily trading had become more concentrated in a small set of stocks over time. Furthermore,
Shankar et al. (
2015,
2020) modeled the distribution of daily trading volumes as a power law function and used the power law exponent, also referred to as the Trading Concentration Index (TCI), as a measure of trading volume concentration. Their research revealed that while trading concentration for non-S&P 500 stocks has steadily increased since the 1960s, the concentration for S&P 500 stocks has actually decreased since the mid-1970s, coinciding with the introduction of index funds. This divergence in trading patterns highlights the significant impact of financial innovations like index funds on stock market dynamics.
The underlying drivers of concentration can be multifaceted. One such driver, as described by
Palley (
1995), is the pursuit of perceived safety in numbers.
Palley (
1995) provides a formalization of managerial herd behavior driven by this pursuit, supported by individual risk aversion and relative performance-based remuneration. Other drivers include the influence of social media and online communities (
Hagar and Shaw 2022), and the desire for quick gains in volatile markets (
Batrinca et al. 2018). Some studies provide insights into the underlying drivers of trading volume concentration in financial markets, which could be related to trading strategies in volatile markets (
Ananzeh 2015;
Davidsson 2014). They highlight that factors such as the significant relationship between trading volume and market volatility (
Ananzeh 2015), and the strong correlation between volume, volatility, and return momentum in global financial markets (
Davidsson 2014) play crucial roles in understanding these dynamics. Additionally, the asymmetry in price-volume relations, volatility features, and the impact of day-of-the-week patterns on trading volume (
Batrinca et al. 2018;
Maki 2024) further emphasize the influence of quick gains in volatile markets on trading volume concentration. These findings suggest that the desire for rapid profits in fluctuating market conditions significantly contributes to the concentration observed in trading volume on financial markets.
There is also evidence that trading volume concentration can elevate systemic risks and contribute to market distortions or bubbles. The presence of crowded trades among similarly trading peers can influence asset price dynamics, potentially creating systemic risk (
van Kralingen et al. 2021). It is important to note that the impact of trading volume on stock return volatility has been explored, revealing a direct relationship between trading volume and stock return volatility across various quantiles (
Tsagkanos et al. 2021). Furthermore, incorporating trading volume into volatility forecasting models has been shown to enhance the performance of Value at Risk (VaR) models, especially during crisis periods, indicating the significance of trading volume in assessing market risk and volatility (
Slim 2016). These findings suggest that trading volume concentration plays a crucial role in exacerbating systemic risks and market distortions, highlighting the importance of monitoring and managing trading activities to maintain market stability. To mitigate the risks associated with trading concentration, diversification strategies and a broader understanding of market dynamics are crucial.
The question of whether trading volume concentration across the 500 S&P 500 constituents has increased or decreased over time has also been a subject of debate in the literature (
Balakrishnan et al. 2008;
Shankar et al. 2020;
Amundi 2023). Different studies have presented contrasting findings, and it is essential to examine the evidence from various sources to gain a comprehensive understanding of this issue. One perspective suggests that trading concentration across the S&P 500 constituents has decreased due to the popularity of index funds and exchange-traded funds (ETFs) that facilitate investment in the entire basket of stocks comprising the index. This view is supported in the
Shankar et al. (
2020) study, which found that the TCI for the S&P 500 stocks exhibited an inverted V-shaped pattern. The TCI increased from 1960 to 1975, indicating increasing trading concentration. However, after the introduction of S&P 500 index funds in 1975, the TCI for the S&P 500 stocks steadily decreased, suggesting a more even distribution of trading activity across all 500 stocks.
Hegde and McDermott (
2003) found that the trading volume and institutional investor interest in stocks added to the S&P 500 index increased immediately after their inclusion, primarily due to the index inclusion itself. This finding suggests that index funds and ETFs play a significant role in driving trading volume and liquidity changes for S&P 500 constituents. On the other hand, some studies have argued that trading concentration across the S&P 500 stocks has increased over time, contradicting the findings mentioned above (
Amundi 2023). This perspective is supported by
Balakrishnan et al. (
2008) who modeled the distribution of daily stock trading volumes as a power law function and found that the power law exponents steadily increased from 1962 to 2005, indicating that daily trading had become more concentrated in a small set of stocks over time. Also,
Gabaix (
2009) found that the distribution of firm sizes and trading volumes often follows a power law distribution, with a small number of firms or stocks accounting for a disproportionately large share of the total. Specifically,
Gabaix et al. (
2006) focused on institutional investors and their impact on stock market volatility, contributing to our understanding of power law dynamics in financial markets. It is important to note that these contrasting findings may be attributed to differences in the time periods analyzed, the specific measures of trading concentration used, and the methodologies employed. Additionally, some studies have focused on the entire stock market, while others have specifically examined the stocks from the S&P 500 index. To reconcile these conflicting perspectives, it is necessary to consider the potential impact of various factors, such as the increasing popularity of index funds and ETFs, changes in market structure and regulations, and the evolving investment strategies of institutional and individual investors. In particular, the increasing popularity of index funds and ETFs influences stock selection by institutional investors, which in turn affects trading volume in specific stocks (
Duffy et al. 2019;
Alina-Cristina 2018). Research by
Kang (
2023) demonstrates that higher ETF ownership weakens return reversals, indicating a shift in trading patterns (
Kang 2023). Additionally,
Huang et al. (
2021) find that industry ETFs reduce post-earnings-announcement drift, enhancing market efficiency and influencing trading behavior (
Huang et al. 2021). Moreover,
Hu et al. (
2020) highlight that large institutions, particularly those with high index fund ownership, opt to lend shares rather than cast votes, affecting trading volumes in stocks with significant index ownership (
Hu et al. 2020). Also,
Shu (
2013) highlights that institutional trading volume plays a crucial role in stock market anomalies, showing a decrease in anomalies like price momentum and the value premium as institutional trading volume increases (
Shu 2013). These studies suggest that the rise of index funds and ETFs alters institutional investors’ portfolio choices and trading activities, ultimately impacting trading volumes in specific stocks. By contrast,
Del Rio and Santamaria’s (
2010) study contrasts with the US evidence, indicating that differences in investor type do not significantly impact trading volume dynamics, but mutual funds ownership levels can influence autocorrelation in trading volume (
Del Rio and Santamaria 2010). It is important to note the role of both institutional (
Griffin et al. 2011) and individual investors (
Welch 2022;
Ülkü et al. 2023) in driving trading volume concentration. As highlighted by
Welch (
2022) and
Ülkü et al. (
2023), retail investors also contribute significantly to trading volume concentration, challenging the notion that this phenomenon is driven solely by institutional players. These studies have shown that this is particularly applicable to the period under investigation in this study, namely the post-COVID-19 era.
It is also important to mention the study by
Hoekstra and Güler (
2022), who examined the mediating role of trading volume and found that it mediates the relationship between investor sentiment and stock returns, particularly in the case of Tesla and the MSCI World Index (
Hoekstra and Güler 2022). Additionally,
Wei’s (
2009) work on Taiwan’s institutional trading volume demonstrates its impact on the stock market index through volatility effects and causality, emphasizing the significant relationship between trading volume and stock index returns (
Wei 2009).
Based on what was presented above, the following hypotheses are formulated: (1) trading volume concentration across the S&P 500 index constituents exhibits significant stochastic changes in response to major economic events and market conditions; (2) chasing trading volume concentration can be a profitable strategy for investors, potentially outperforming the S&P 500 index; (3) a concentration-driven portfolio offers a competitive risk–return profile compared to the S&P 500 index; (4) the level of trading volume concentration, as measured by various metrics (e.g., percentage of volume accounted for by top companies, disparity indices), provides valuable insights into market dynamics and investor behavior. These hypotheses form the foundation for this study’s investigation into trading volume concentration trends, profitability in chasing trading volume concentration, and implications for portfolio management and risk mitigation strategies. The overview of the literature presented above highlights a research gap in understanding the dynamics of trading volume concentration across the S&P 500 constituents, particularly in light of recent market events and the evolving landscape of institutional (
Griffin et al. 2011) and retail investing (
Welch 2022;
Ülkü et al. 2023). This study aims to address this gap by examining the variation in trading volume concentration across S&P 500 stocks from January 2020 to December 2022, a period marked by significant market volatility and major global events. The study employs various concentration measurement methods, including the power law exponent, the Herfindahl–Hirschman index, and the Gini-based TCI, and introduces a novel experimental design that compares a concentration-driven portfolio, rebalanced daily based on the top 30 stocks by trading volume, against the S&P 500 benchmark. The research is motivated by the hypothesis that following trading volume concentration may yield higher returns than the S&P 500 index itself. This hypothesis is grounded in the observation that increased trading volume among top-ranked stocks is associated with heightened investor interest (
Welch 2022), which should be reflected in their valuations.
The structure of this article is as follows.
Section 2 presents various methods for measuring trading volume concentration, including mathematical notations to facilitate better understanding and comparison of different approaches. Additionally,
Section 2.2 outlines the ‘experimental design’ and ‘data analysis methodology’.
Section 3 provides a detailed discussion of the results, featuring a figure (chart) for the Gini-based TCI and a comparison of ‘power law exponent’, HHI, and Gini-based TCI methods. This section also analyzes concentration levels at different points along the spectrum of trading volume concentration distribution, examining combinations such as the top 10, 30, 75, and 250 stocks by trading volume and their corresponding percentiles. Furthermore, it presents a comparative analysis of the concentration-driven (daily rebalanced) portfolio results (PnL) versus the S&P 500 (PnL), along with a risk–return analysis. It also includes a qualitative assessment of portfolio changes in relation to significant market events, such as the COVID-19 pandemic, the Russia–Ukraine war, the energy crisis, inflation spikes, Federal Reserve policy shifts, and banking sector turbulence, providing context for the observed changes in trading volume concentration and portfolio composition (constituents).
Section 4 is dedicated to discussing the results in context, while
Section 5 presents the conclusions drawn from the entire study.
2. Materials and Methods
The data utilized in this study were sourced directly in R using the quantmod and tidyquant packages. The dataset comprises historical stock data for the 500 companies constituting the S&P 500 index, retrieved for each company from 2 January 2020 to 30 December 2022, a total of 756 trading sessions over three years. With each company’s data captured for every trading session, the dataset supports an in-depth analysis of trading volume concentration dynamics, a comparison of the concentration-driven (daily rebalanced) portfolio results (PnL) against the S&P 500 (PnL), a risk–return analysis, and a qualitative assessment of portfolio changes in relation to significant market events over the study period.
2.1. The Power Law Exponent, HHI and Gini-Based TCI as Measures of Trading Concentration
In the realm of financial markets, understanding the concentration of trading activity is crucial for assessing market dynamics, liquidity, and potential systemic risks. Three prominent methods have emerged to quantify this concentration: the power law exponent, the Herfindahl–Hirschman Index (HHI), and the Gini-based Trading Concentration Index (TCI). Each of these measures offers unique insights into the distribution of trading volumes across stocks, particularly within the context of the S&P 500.
Naldi (
2003) and
Balakrishnan et al. (
2008) were the first to demonstrate that the power law exponent could be used as a general measure of market concentration, with larger exponent values indicating greater concentration. Traditionally, the HHI has been used to measure market concentration. However,
Naldi (
2003) and
Balakrishnan et al. (
2008) found that the power law exponent is more useful for studying changes in concentration than the HHI measure when dealing with a large number of observations, as the extreme elasticity of the HHI obscures trends over time.
Naldi (
2003),
Balakrishnan et al. (
2013), and
Shankar et al. (
2020) argue that the power law distribution can be used to model any phenomenon whose distribution can be ranked by a metric of size. More specifically, the power law exponent, as employed by
Balakrishnan et al. (
2013) and
Shankar et al. (
2020), is derived from the observation that trading volumes often follow a power law distribution. This approach models the relationship between trading volume and its rank as:
$$\ln V_{i,t} = \alpha_t - \beta_t \ln R_{i,t} \qquad (1)$$
where
- $R_{i,t}$ is the volume rank of stock $i$ on day $t$;
- $V_{i,t}$ is the trading volume of stock $i$ on day $t$;
- $\alpha_t$ is the intercept term for day $t$;
- $\beta_t$ is the slope coefficient for day $t$ (in absolute value), representing the power law exponent (Trading Concentration Index);
- $i$ is the index for individual stocks (1 to 500 for S&P 500);
- $t$ is the index for trading days.
This formulation expresses the log-linear relationship between a stock’s trading volume rank and its actual trading volume on a given day. The power law exponent $\beta_t$ quantifies the degree of concentration in trading volumes across stocks, with higher values indicating greater concentration. Evidence from various research papers supports the extensive use of power law distributions in modeling economic and financial phenomena (
Gabaix et al. 2003;
Gabaix 2009). Studies have shown that power law distributions explain processes in economics and finance, reflecting extreme events like debt distress in emerging countries (
Akhundjanov and Chamberlain 2019;
Dufrénot and Paret-Onorato 2016). Furthermore, investigations into wealth distribution among billionaires have revealed that the upper tail of wealth data follows a power law distribution, indicating a statistical understanding of wealth disparity (
Asif et al. 2021). Additionally, analyses on the wealth of the richest individuals have verified that power law behavior is present in only 35% of the datasets, often coexisting with rival distributions like log-normal or stretched exponential distributions (
Brzezinski 2014).
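As an illustration of how a daily power law exponent could be obtained, the following minimal sketch fits a straight line to log volume against log rank by ordinary least squares. It assumes the rank-size form ln V = alpha - beta ln R, with rank 1 assigned to the most heavily traded stock; the estimation method, the function name `power_law_exponent`, and the use of Python (the study itself reports using R) are assumptions made for illustration only.

```python
import math


def power_law_exponent(volumes):
    """Estimate (beta, alpha) in ln(V) = alpha - beta * ln(R) for one trading day.

    Assumption: ordinary least squares on the log-log rank-volume pairs;
    the original study does not state its estimation procedure.
    """
    # Rank 1 is the highest volume; zero volumes are dropped (log undefined).
    ordered = sorted((v for v in volumes if v > 0), reverse=True)
    if len(ordered) < 2:
        raise ValueError("need at least two positive volumes")
    x = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    y = [math.log(v) for v in ordered]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # The fitted slope is negative; the exponent is reported as its magnitude.
    return -slope, intercept


# Synthetic example: volumes that follow V = 1000 * R^(-1.5) exactly.
synthetic = [1000.0 / rank ** 1.5 for rank in range(1, 51)]
beta, alpha = power_law_exponent(synthetic)
```

Applied once per trading day, such a function would yield a time series of exponents that can be compared with the other concentration measures.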
Axtell (
2001) found that the distribution of firm sizes in the US follows Zipf’s law, a special case of the power law, with a power law exponent of 1.0. However, the study by
Kondo et al. (
2023) demonstrates that a lognormal distribution or a convolution of lognormal and non-Zipf Pareto distributions better describes the US economy, deviating from Zipf’s law (
Kondo et al. 2023). Additionally,
Asif et al. (
2021) found that the distribution of wealth among billionaires follows a power law distribution with exponents ranging from 1.306 to 1.571, indicating a more evenly distributed wealth than suggested by Zipf’s law (
Asif et al. 2021). Furthermore,
Medrano-Adan and Salas-Fumás (
2018) argue that the conditions for the firm size distribution to follow a power law are more restrictive than commonly believed, challenging the assumption of a Zipf distribution in the US (
Medrano-Adan and Salas-Fumás 2018).
Gopikrishnan et al. (
1999) and
Nirei et al. (
2020) showed that the distribution of trading volume for individual stocks obeys a power law function. Various research papers discussed the application of power law analysis in different fields, showcasing its relevance in understanding economic structures and relationships (
Vogel 2022). Additionally, some studies focused on power law distributions in financial instruments and market capitalization across stock exchanges, highlighting fluctuations in power law exponents and their relation to fundamental values of stocks (
Tuncay 2020;
Mizuno et al. 2016). Furthermore, power law behavior was observed in traffic distribution on road segments, emphasizing the robustness of such patterns in urban traffic simulations (
Umemoto and Ito 2018).
The Herfindahl–Hirschman Index (HHI), originally developed to measure market concentration (
Hirschman 1964), can be adapted to assess trading volume concentration:
$$\text{HHI} = \sum_{i=1}^{N} s_i^2 = \sum_{i=1}^{N} \left( \frac{V_i}{V_{\text{total}}} \right)^2 \qquad (2)$$
where
- $i = 1, \dots, N$, and $N$ is the total number of stocks (e.g., 500 for S&P 500),
- $s_i = V_i / V_{\text{total}}$ is the market share of stock $i$’s trading volume,
- $V_i$ is the trading volume of stock $i$, and
- $V_{\text{total}} = \sum_{i=1}^{N} V_i$ is the total trading volume across all stocks.
This formulation expresses the HHI as the sum of squared market shares of trading volumes, providing a measure of concentration that ranges from $1/N$ (perfect equality across $N$ stocks) to 1 (complete concentration). The HHI is particularly sensitive to the distribution of trading volume among the most actively traded stocks, providing a measure of concentration that emphasizes the impact of dominant players in the market.
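The sum of squared volume shares can be sketched in a few lines. The function name `hhi` and the use of Python are illustrative assumptions; the example values are synthetic and only demonstrate the two bounds of the index.

```python
def hhi(volumes):
    """Herfindahl-Hirschman Index of one day's trading volumes.

    Each stock's share is its volume divided by total volume; the index is
    the sum of the squared shares.
    """
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total trading volume must be positive")
    return sum((v / total) ** 2 for v in volumes)


# Synthetic bounds check: equal volumes give 1/N, a single active stock gives 1.
equal_case = hhi([1.0] * 500)
single_case = hhi([10.0, 0.0, 0.0, 0.0])
```

With 500 constituents the lower bound is 0.002, which illustrates why even large relative changes in the HHI can be small in absolute terms.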
In turn, the Gini-based Trading Concentration Index (TCI), derived from the Gini coefficient commonly used in income inequality studies (
Gini [1912] 1955), offers a comprehensive view of trading volume distribution:
$$G = 1 - 2B \qquad (3)$$
where
- $G$ is the Gini coefficient (TCI in our case);
- $B$ is the area under the Lorenz curve, representing the cumulative proportion of trading volume against the cumulative proportion of stocks. The area is calculated using:
$$B = \frac{1}{n \sum_{j=1}^{n} V_j} \sum_{i=1}^{n} \left( n - i + \tfrac{1}{2} \right) V_i \qquad (4)$$
- $n$ is the number of stocks (500 for S&P 500),
- $V_i$ is the trading volume of stock $i$, and
- stocks are ordered from lowest to highest trading volume.
It is important to note that the Gini index is traditionally used as a measure of statistical dispersion intended to represent the income or wealth distribution of a nation’s residents (
Gini [1912] 1955). When adapting it to the actual distribution of trading volume across different companies it can be visualized as in
Figure 1 below.
In the context of adapting the Lorenz curve and Gini index concepts to measure trading concentration using the TCI, the surface between the Lorenz curve and the line of equal distribution of trading volume (in
Figure 1 above) represents the degree of concentration in the distribution of trading volume across stocks. The line of equal distribution is the 45-degree diagonal, representing the case where all stocks have an equal share of the total trading volume, while the Lorenz curve represents the actual cumulative distribution of trading volume across the ranked stocks. The area between the two quantifies the deviation of the actual distribution from perfect equality. A larger area indicates a higher degree of trading volume concentration, where a small subset of stocks accounts for a disproportionately large share of the total trading volume; a smaller area suggests a more even distribution across all stocks. The TCI is derived from this area, with higher TCI values corresponding to greater concentration and lower values indicating a more equal distribution of trading activity. This surface can therefore be labeled the “area of trading volume concentration” or the “area of inequality in trading volume distribution”. The Gini-based concentration index (TCI) ranges from 0 (perfect equality, where every stock accounts for the same share of trading volume) to 1 (maximum inequality, where a single stock or a very small group of stocks accounts for all trading volume).
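The Lorenz-curve construction described above can be sketched as follows. The sketch assumes that the area under the Lorenz curve is approximated with the trapezoidal rule over stocks sorted from lowest to highest volume, and that the index is one minus twice that area; the function names `lorenz_area` and `gini_tci` are hypothetical, and Python is used for illustration although the study reports using R.

```python
def lorenz_area(volumes):
    """Area under the Lorenz curve of trading volumes (trapezoidal rule)."""
    ordered = sorted(volumes)  # lowest to highest trading volume
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("need at least one stock with positive total volume")
    cumulative = 0.0
    area = 0.0
    for v in ordered:
        previous = cumulative
        cumulative += v / total
        # Each stock spans a width of 1/n on the horizontal axis.
        area += (previous + cumulative) / (2 * n)
    return area


def gini_tci(volumes):
    """Gini-based concentration: twice the area between diagonal and Lorenz curve."""
    return 1.0 - 2.0 * lorenz_area(volumes)


# Synthetic examples: equal volumes and a single active stock among four.
equal_case = gini_tci([1.0, 1.0, 1.0, 1.0])
single_case = gini_tci([0.0, 0.0, 0.0, 4.0])
```

With a finite number of stocks n, this discrete construction attains at most (n - 1)/n, so the upper bound of 1 is approached but not reached; for 500 stocks the maximum is 0.998.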
While each method has its merits, the Gini-based TCI emerges as a particularly robust and insightful measure for investigating trading volume concentration across all 500 S&P 500 stocks. This preference is rooted in several key advantages: its strong theoretical foundation in economic theory, sensitivity to trading volume distribution, resilience against outliers, and normalized comparability. The measure’s range between 0 and 1 allows for intuitive interpretation, further enhancing its appeal. Unlike the power law exponent, which primarily focuses on the tail of the distribution, or the HHI, which is more sensitive to the largest players, the Gini-based TCI considers the entire distribution of trading volumes. This comprehensive approach ensures that concentration changes across all levels of trading activity are captured, providing a more nuanced understanding of market dynamics. The Gini-based (TCI) coefficient’s output range of [0,1] allows for straightforward interpretation and comparison across different time periods or market conditions. This normalization facilitates the identification of trends and anomalies in trading concentration over time. While extreme trading volumes can significantly impact both the power law exponent and HHI, the Gini-based TCI is less susceptible to such outliers (which is reflected in
Figure 2).
This robustness of Gini-based TCI ensures that as a measure of trading volume concentration it remains stable and informative even in the presence of occasional trading volume spikes or anomalies (see
Figure 2). The Gini coefficient’s strong theoretical foundation in inequality studies translates well to the analysis of trading volume concentration. Its mathematical properties, including the principle of transfers (i.e., the Pigou-Dalton transfer principle) (
Cowell 2000), make it particularly suitable for assessing the distribution of a finite resource (trading volume) among a fixed set of entities (S&P 500 stocks). The Gini-based TCI provides an intuitive measure of concentration that can be easily communicated to both technical and non-technical audiences. Its visual representation through the Lorenz curve offers additional insights into the nature of trading volume distribution.
From a mathematical perspective, the Gini-based TCI’s calculation involves evaluating the area between the Lorenz curve and the line of perfect equality. This approach inherently captures the cumulative effects of trading volume distribution, providing a more comprehensive measure of concentration than methods that rely on specific distributional assumptions or are overly sensitive to particular segments of the market. And while the power law exponent and HHI offer valuable insights into specific aspects of trading volume concentration, the Gini-based TCI provides a more holistic and robust measure for analyzing the S&P 500 market. Its ability to capture concentration effects across the entire distribution of stocks, coupled with its normalized output and strong theoretical foundation, makes it an ideal tool for researchers and practitioners seeking to understand and monitor trading volume concentration in this critical market index.
2.2. Experimental Design and Data Analysis Methodology
This study employed a quantitative approach to test the hypothesis that a portfolio based on trading volume concentration outperforms the S&P 500 index. The experiment focused on all 500 companies in the S&P 500 index over a three-year period from 2 January 2020 to 30 December 2022, encompassing 756 trading sessions. After analyzing various combinations of concentration levels, we determined that the most representative measure of trading volume concentration was the combination of the top 30 companies with the highest trading volume and the 50th percentile. Consequently, our experimental portfolio consisted of the 30 stocks with the highest daily share of total trading volume among all 500 S&P 500 companies. The portfolio was rebalanced daily, with stocks falling out of the top 30 being sold and replaced by new entrants. This approach ensured that the portfolio remained ‘concentration-variant’, reflecting changes in overall trading volume concentration. The concentration level for the top 30 stocks fluctuated between approximately 38% and 64% of the total trading volume (for all 500 S&P 500 companies) during the study period. To maintain consistent risk management and allow for a more direct comparison with trading volume concentration changes, we opted for a fixed number of stocks (30) rather than a fixed concentration level. This decision was based on the premise that a constant number of stocks with varying concentration would better represent the dynamics of trading volume concentration than a fixed concentration level with a variable number of stocks.
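The daily rotation described above can be sketched as follows. Several details are assumptions made only for this illustration, since the text does not specify them: the held stocks are equally weighted, the stocks selected from day t volumes earn the return of day t + 1, transaction costs are ignored, and ties in volume are broken alphabetically by ticker. All function names are hypothetical.

```python
def top_k(volumes_by_ticker, k=30):
    """Tickers of the k stocks with the highest trading volume on one day."""
    # Assumption: ties are broken alphabetically so the result is deterministic.
    ranked = sorted(volumes_by_ticker, key=lambda t: (-volumes_by_ticker[t], t))
    return set(ranked[:k])


def rebalance(previous, current):
    """Stocks to sell (left the top k) and to buy (entered the top k)."""
    return {"sell": previous - current, "buy": current - previous}


def portfolio_returns(daily_volumes, daily_returns, k=30):
    """Daily returns of the rotating portfolio.

    daily_volumes[t] and daily_returns[t] map ticker -> value for day t.
    Assumption: equal weights, holdings chosen on day t earn day t + 1 returns.
    """
    results = []
    for t in range(len(daily_volumes) - 1):
        held = top_k(daily_volumes[t], k)
        next_day = daily_returns[t + 1]
        results.append(sum(next_day[s] for s in held) / len(held))
    return results


# Tiny synthetic example with three tickers and k = 2.
volumes = [{"A": 10, "B": 5, "C": 1}, {"A": 1, "B": 5, "C": 10}]
returns = [{"A": 0.0, "B": 0.0, "C": 0.0}, {"A": 0.02, "B": 0.04, "C": -0.01}]
trades = rebalance(top_k(volumes[0], 2), top_k(volumes[1], 2))
series = portfolio_returns(volumes, returns, k=2)
```

Fixing the number of holdings while letting their combined volume share vary mirrors the design choice stated above; a different weighting scheme or execution timing would change the resulting return series.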
The study also included a qualitative assessment of portfolio changes in relation to significant market events, such as the COVID-19 pandemic, the Russia–Ukraine war, the energy crisis, inflation spikes, Federal Reserve policy shifts, and banking sector turbulence. This analysis aimed to provide context for the observed changes in trading volume concentration and portfolio composition.
To evaluate the performance of the concentration-based portfolio against the S&P 500 index, we conducted a comprehensive risk–return analysis. This included calculating and comparing key metrics such as Annualized Return, Annualized Standard Deviation, Annualized Sharpe Ratio (with a risk-free rate of 1%), Maximum Drawdown, Value at Risk (VaR) at 95% confidence level, and Conditional Value at Risk (CVaR) at 95% confidence level.
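The listed risk and return metrics could be computed from a series of daily returns along the following lines. This is a sketch under stated assumptions rather than the study's implementation: 252 trading days per year are used for annualization, VaR and CVaR are estimated by historical simulation and expressed as (negative) daily returns, and the Sharpe ratio divides annualized excess return by annualized standard deviation. Only the 1% risk-free rate and the 95% confidence level come from the text; the function names are hypothetical.

```python
import math
import statistics

TRADING_DAYS = 252  # assumption: annualization factor


def annualized_return(returns):
    """Geometric annualized return from daily simple returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (TRADING_DAYS / len(returns)) - 1.0


def annualized_std(returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(returns) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free=0.01):
    """Annualized excess return over annualized standard deviation."""
    return (annualized_return(returns) - risk_free) / annualized_std(returns)


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth (a negative number)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1.0)
    return worst


def historical_var(returns, level=0.95):
    """Historical-simulation VaR: empirical (1 - level) quantile of returns."""
    ordered = sorted(returns)
    index = int(math.floor((1.0 - level) * len(ordered)))
    return ordered[index]


def historical_cvar(returns, level=0.95):
    """Mean of the returns at or below the historical VaR."""
    threshold = historical_var(returns, level)
    tail = [r for r in returns if r <= threshold]
    return sum(tail) / len(tail)


# Synthetic sample: two large losses among twenty daily returns.
sample = [-0.10, -0.05] + [0.01] * 18
```

Applying the same functions to the portfolio and to the benchmark return series keeps the comparison consistent; other estimators (for example, parametric VaR) would give different figures.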
The experiment’s design allowed for a robust comparison between the daily rebalanced, concentration-based rotational portfolio and the S&P 500 index. This approach enabled us to test our hypothesis and draw scientific inferences about the potential benefits of following trading volume concentration as an investment strategy.
3. Results
Utilizing the R programming language and applying the equations presented in
Section 2.1, specifically Equations (3) and (4), we visualized the trading volume concentration index. This visualization was based on the theoretical considerations of the Gini-based TCI discussed in
Section 2.1 and the aforementioned mathematical notations. The R environment provided a robust platform for implementing these complex calculations and generating graphical representations (visualizations) of the trading volume concentration measures. By leveraging the computational power of R and the theoretical framework established by the Gini index, we were able to create a comprehensive visual representation of trading volume concentration across the S&P 500 constituents. This approach allowed for a nuanced analysis of trading volume concentration patterns, enabling us to identify trends and anomalies in trading volume distribution over the study period. The resulting visualizations (shown in
Figure 3) offer valuable insights into the dynamics of trading volume concentration and serve as a foundation for our subsequent analyses.
The results of our analysis reveal that the Gini-based trading volume concentration (TCI) fluctuated within a range of 55.98% to 77.35% throughout the study period. This metric provides valuable insights into market dynamics, allowing us to identify periods when the TCI was above or below its median for the entire timeframe. These findings also enable the identification of key market reversal days (i.e., trading sessions). Notably, TCI readings were low prior to early March 2020, after which the onset of the pandemic and subsequent Federal Reserve actions to increase system liquidity fueled a technology-driven bull market, resulting in rising trading volume concentration levels. This trend culminated in concentration peaks during the latter half of August 2020, with the aforementioned highest reading of 77.35% for the entire study period. Subsequently, we observed a downward trend until May 2021, followed by another increase until November 2021, then a declining trend until June 2022, a rapid one-month increase from June to July 2022, and finally another decline until December 2022. While numerous short-term swings could be identified, the primary objective of this study is to demonstrate the effectiveness of tools like the Gini-based TCI in assessing trading volume concentration, rather than providing an exhaustive analysis of market fluctuations.
To better understand the dynamics of trading volume concentration, measured as the share of individual stocks ranked from highest to lowest trading volume, we analyzed various combinations and points along this distribution. Specifically, we examined four key combinations: (1) the top 10 stocks by trading volume and the 25th percentile, (2) the top 30 stocks and the 50th percentile, (3) the top 75 stocks and the 75th percentile, and (4) the top 250 stocks (representing half of the S&P 500 index) and the 90th percentile. These combinations were chosen to provide a comprehensive view of the concentration distribution across different levels of market depth. By examining these diverse points along the spectrum, we aimed to identify patterns and thresholds that could inform our understanding of trading volume concentration and its potential impact on portfolio performance.
Figure 4,
Figure 5,
Figure 6 and
Figure 7 below illustrate these combinations for different points of the distribution, showing how trading volume concentration varies across different subsets of the S&P 500 index. Analyzing different percentiles, such as the 25th, 50th, 75th, and 90th, allows us to examine the distribution of trading volume concentration more comprehensively. This approach can reveal potential skewness, kurtosis, or other distributional properties that might be overlooked by focusing solely on central tendencies or specific quantiles. Additionally, examining different combinations allows us to identify companies or groups of companies that exhibit unusual or extreme trading volume patterns, providing qualitative insights into market dynamics, investor behavior, or other factors influencing trading activity. Having multiple combinations also facilitates comparisons and benchmarking of concentration levels across different segments of the S&P 500 index. For example, it allows us to compare the concentration levels of the top 10 companies with those of the top 30 or top 75 companies, providing a more nuanced understanding of the distribution.
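The two quantities paired in each combination can be sketched as follows. This is a minimal illustration, assuming that "the number of companies at the n-th percentile" means the smallest number of top-ranked stocks whose cumulative volume reaches n% of the total. The function names `top_k_share` and `stocks_to_reach` are hypothetical.

```python
def top_k_share(volumes, k):
    """Share of total volume generated by the k most heavily traded stocks."""
    ranked = sorted(volumes, reverse=True)
    return sum(ranked[:k]) / sum(ranked)


def stocks_to_reach(volumes, fraction):
    """Smallest number of top-ranked stocks whose cumulative volume
    reaches the given fraction (e.g. 0.50) of total volume."""
    ranked = sorted(volumes, reverse=True)
    target = fraction * sum(ranked)
    running = 0.0
    for count, volume in enumerate(ranked, start=1):
        running += volume
        if running >= target:
            return count
    return len(ranked)


# Toy session with six stocks (made-up volumes, for illustration only).
session = [50, 20, 10, 10, 5, 5]
print(top_k_share(session, 2))         # share of the two most traded stocks
print(stocks_to_reach(session, 0.50))  # stocks needed to reach half of volume
```

A falling value of `stocks_to_reach` at a fixed fraction and a rising value of `top_k_share` at a fixed k both point to the same thing: higher concentration.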
The combination of the top 10 companies (stocks) with the highest trading volume and the 25th percentile provides insights into the concentration levels at the lower end of the distribution, capturing the trading volume dynamics of the companies with relatively higher trading activity. It allows us to identify potential outliers or anomalies among the smaller companies and understand their impact on the overall trading volume concentration (see
Figure 4).
Analyzing the top 30 companies and the 50th percentile offers a balanced view of the concentration levels, encompassing a significant portion of the trading volume while also considering the distribution’s central tendency (see
Figure 5). This combination is particularly useful for benchmarking and comparing concentration levels across different time periods or market conditions.
In turn, by examining the top 75 companies and the 75th percentile, we gain a comprehensive understanding of the concentration levels at the upper end of the distribution (see
Figure 6). This combination is valuable for identifying potential pockets of high concentration among the larger and more actively traded companies, which can have significant implications for market dynamics and investor behavior.
Finally, analyzing the top 250 companies, which represent half of the S&P 500 index, and the 90th percentile provides a broad perspective on the concentration levels across a substantial portion of the index (see
Figure 7). This combination is particularly useful for assessing the overall distribution of trading volume and identifying potential skewness or kurtosis in the data.
This comprehensive approach captures different aspects of the distribution, identifies potential anomalies, facilitates comparisons, and supports the reliability and generalizability of the findings, ultimately contributing to a deeper understanding of market dynamics and investor behavior. The results show that trading volume concentration in the S&P 500 index stocks is a multifaceted phenomenon. In early 2020, specifically from the beginning of March, there was a notable increase in the concentration of trading activity (measured as the share of total dollar volume) among the stocks comprising the S&P 500 index. More precisely, the number of stocks accounting for 50% of the total trading volume across all 500 S&P 500 companies decreased from 52 at the start of March to just 6 by the end of August 2020 (a span of approximately 6 months, from 10 March to 21 August) [as shown in
Figure 5]. In other words, by late August 2020, a mere 6 out of the 500 S&P 500 stocks were responsible for generating half of the total trading volume across the entire index. This concentration pattern repeated itself in 2021. Starting from 27 May 2021, when 54 stocks accounted for 50% of the trading volume, the concentration level steadily increased until 9 November 2021, when only 9 stocks were responsible for generating half of the total trading volume. From November 2021 until the end of 2022, the trading concentration began to decline again. By December 2022, more than 50 stocks were needed to account for 50% of the total trading volume across the S&P 500 index, indicating a more even distribution of trading activity. Alternatively, the share of trading volume attributed to the top 30 stocks decreased from over 60% to approximately 40%, as illustrated in
Figure 5.
All the results illustrated in
Figure 4,
Figure 5,
Figure 6 and
Figure 7, which depict trading volume concentration measured as the share of individual stocks in total trading volume (ranked from top to bottom with the highest to lowest share in total trading volume), are summarized in
Table 1 for key reversal days when trading volume concentration peaked or bottomed out. These figures present various combinations and distribution levels, including the top 10 stocks by trading volume and the 25th percentile, the top 30 stocks and the 50th percentile, the top 75 stocks and the 75th percentile, and the top 250 stocks (representing half of the S&P 500 index) and the 90th percentile.
Table 1 provides a comprehensive overview of these concentration metrics for the critical market turning points, allowing for a detailed comparison of how different segments of the market contribute to overall trading volume concentration during pivotal moments in the study period. This consolidated presentation facilitates a more detailed understanding of the relationship between market depth and trading volume concentration at key junctures in the market cycle.
The data strongly suggest a positively skewed (right-skewed) distribution of trading volume across S&P 500 companies. This is evident from the fact that a small number of companies (top 10 or top 30) consistently account for a disproportionately large share of the trading volume. For instance, on 21 August 2020, the top 10 companies accounted for 53.63% of the volume, indicating a long right tail in the distribution. The large difference between the share of trading volume for the top 10 companies and the median (50th percentile) suggests a heavy-tailed distribution, in which extreme values are more probable than in a normal distribution. The concentration of volume in the top companies, particularly during high trading volume concentration periods, indicates a distribution with high kurtosis, meaning fatter tails and a higher, sharper peak than a normal distribution. The data also indicate that the median trading volume is considerably lower than the mean. This can be inferred from the fact that the 50th percentile of cumulative trading volume is consistently reached with far fewer companies than would be expected if the distribution were symmetric, indicating that the bulk of the distribution is shifted towards lower trading volumes. The wide swings in trading volume concentration levels between dates (as shown in
Table 1 above) suggest a distribution with high variability and a large standard deviation. The standard deviation appears to change over time, being larger during high concentration periods. Therefore, the distribution shape appears to change over time. During high concentration periods (e.g., 21 August 2020), the distribution is more extremely skewed and peaked. In contrast, during lower concentration periods (e.g., 10 March 2020), the distribution becomes relatively less skewed and more spread out. The relationship between the cumulative share of trading volume and the number of companies often follows a pattern reminiscent of a power law distribution, which is common in many natural and economic phenomena (
Naldi 2003;
Balakrishnan et al. 2013;
Shankar et al. 2020).
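The distributional properties inferred above from the concentration shares could also be checked directly on a session's volumes. The following minimal sketch computes moment-based sample skewness and excess kurtosis together with the mean and median. It is an illustration under assumed inputs rather than part of the analysis reported here, and the function name `shape_summary` is hypothetical.

```python
import statistics


def shape_summary(values):
    """Mean, median, skewness and excess kurtosis (population moment estimators)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "skewness": m3 / m2 ** 1.5,            # > 0 indicates a long right tail
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # > 0 indicates fatter tails than normal
    }


# Toy volumes with one dominant stock (made-up numbers, for illustration only).
print(shape_summary([1, 1, 1, 1, 100]))
```

For a right-skewed set of volumes such as the toy example, the mean exceeds the median and the skewness is positive, which is the pattern described in the text.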
Moreover,
Table 1 shows a cyclical pattern of concentration, with alternating periods of high and low concentration. This is evident across all metrics, but particularly pronounced for the Top 10 Companies’ Share. The highest levels of concentration occurred on 21 August 2020 and 9 November 2021. On these dates, the Top 10 Companies’ Share exceeded 50% of trading volume, indicating significant market dominance by a small number of companies. By contrast, 10 March 2020, 27 May 2021, and 17 June 2022 show relatively lower concentration levels; on these dates, trading volume was more evenly distributed across all 500 S&P 500 stocks. There are also instances of rapid shifts in concentration, particularly between 17 June 2022 and 22 July 2022, which suggest periods of market volatility or significant events affecting trading patterns.
The different metrics (each combining the share of the top k companies with the number of companies at the n-th percentile) are also consistent with one another. The patterns observed in the Top 10 companies’ share are generally mirrored in the other metrics, indicating that concentration trends affect different segments of the market similarly. As the share of trading volume for top companies increases, the number of companies at various percentiles decreases. This inverse relationship is particularly evident at the 25th and 50th percentiles. The Top 250 companies’ share remains relatively stable compared to other metrics, suggesting that larger companies maintain a consistent share of trading volume even during periods of high concentration. The fluctuations in the number of companies at different percentiles provide insights into market breadth. Lower numbers indicate periods of narrower market participation, while higher numbers suggest broader market engagement. These observations suggest that the S&P 500 experienced significant shifts in trading volume concentration during the studied period, with alternating phases of high concentration and more distributed trading. These patterns may reflect broader market trends, economic events, or changes in investor behavior, and could have implications for market efficiency, liquidity, and overall index performance.
To further illustrate and comprehend the trading volume concentration of S&P 500 stocks,
Table 2 presents the trading volume concentration results for the top 30 stocks with the highest trading volume on key dates (trading sessions) where significant extremes in trading volume concentration were observed. This provides an alternative perspective to the information depicted in
Figure 3,
Figure 4,
Figure 5,
Figure 6 and
Figure 7.
Table 2 allows for a qualitative assessment of why specific companies appeared in this top 30 list, which, as previously emphasized, is the most representative metric (concentration distribution level) among the four combinations illustrated in
Figure 4,
Figure 5,
Figure 6 and
Figure 7. This detailed breakdown enables a deeper understanding of the factors driving concentration during critical market periods and offers insights into the dynamics of stock selection within the highest volume segment of the 500 S&P 500 stocks. By examining these key dates, we can better contextualize the shifts in trading volume concentration and their potential implications for market behavior and investment strategies.
The results of the experimental design described in
Section 2.2 are presented in
Figure 8 and
Figure 9, as well as in
Table 3. To recap, this study employed a quantitative approach to test the hypothesis that a portfolio based on trading volume concentration outperforms the S&P 500 index. The experiment focused on all 500 companies in the S&P 500 index over a three-year period from 2 January 2020 to 30 December 2022, encompassing 756 trading sessions. Our experimental portfolio consisted of the 30 stocks with the highest daily share of total trading volume among all S&P 500 companies, rebalanced daily to maintain a ‘concentration-variant’ approach.
Figure 8 specifically illustrates the profit and loss (PnL) curves for both the concentration-driven (daily rebalanced) rotation portfolio and the S&P 500 index over the entire study period. This visual representation allows for a direct comparison of the experimental portfolio against the benchmark index. The PnL curves show the cumulative returns of both approaches, highlighting periods of outperformance or underperformance under the various market conditions of the three-year study period.
Table 3 presents a comprehensive comparative risk–return analysis between the experimentally designed portfolio, which daily reallocates funds from stocks falling out of the top 30 highest trading volume to new entrants, and the S&P 500 index as a benchmark. This analysis provides quantitative metrics to evaluate the performance and risk characteristics of the concentration-driven strategy against the broader market index.
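The daily rotation described here can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the study: it assumes equal weighting of the selected stocks and assumes that the ranking from one session determines the holdings for the next session, so that no same-day information is used. The function names and the dictionary-based input layout are hypothetical.

```python
def top_k_tickers(volume_by_ticker, k):
    """Tickers of the k stocks with the highest volume in one session."""
    ranked = sorted(volume_by_ticker, key=volume_by_ticker.get, reverse=True)
    return ranked[:k]


def rotation_returns(daily_volumes, daily_returns, k=30):
    """Daily returns of an equal-weighted top-k rotation portfolio.

    daily_volumes, daily_returns: lists of dicts (ticker -> value), one dict
    per session, aligned by position. Holdings for session t are chosen from
    the volume ranking of session t - 1 (an assumption, see lead-in).
    """
    portfolio = []
    for t in range(1, len(daily_volumes)):
        held = top_k_tickers(daily_volumes[t - 1], k)
        # Equal weighting is an assumption of this sketch.
        portfolio.append(sum(daily_returns[t][ticker] for ticker in held) / len(held))
    return portfolio
```

Stocks that drop out of the ranking between two sessions are simply absent from the next `held` list, which corresponds to reallocating funds from stocks leaving the top 30 to new entrants.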
Figure 9 visualizes the same results as
Table 3 but in the form of a radar chart, offering a more intuitive understanding of the portfolio strategy’s advantages. This graphical representation allows for a quick and clear comparison across multiple performance and risk dimensions, effectively illustrating the potential benefits of pursuing concentration in S&P 500 stocks as an investment strategy.
The comparative analysis of the concentration-driven portfolio (with its constituents rebalanced daily based on top 30 companies with the highest share of trading volume) and the S&P 500 index yields several noteworthy observations (as shown in
Table 3 and
Figure 8 and
Figure 9). Firstly, the portfolio demonstrates a superior annualized return of 10.66%, compared to the S&P 500’s 5.89%. This represents an outperformance of 477 basis points, indicating a substantially higher rate of capital appreciation for the concentration strategy. Secondly, the portfolio exhibits a higher annualized standard deviation (29.43%) relative to the S&P 500 (25.48%). This suggests that the concentration strategy experiences greater volatility, which is consistent with the expectation that a more concentrated portfolio may be subject to larger price fluctuations. Thirdly, the portfolio’s Sharpe ratio (0.325) surpasses that of the S&P 500 (0.19) by a considerable margin. This indicates that the concentration strategy provides a more favorable risk-adjusted return, delivering a higher excess return per unit of volatility. Fourthly, the portfolio’s maximum drawdown (37.70%) is moderately higher than that of the S&P 500 (33.92%). This metric reveals that the concentration strategy experienced a slightly larger peak-to-trough decline during the observation period, suggesting potentially higher downside risk. Fifthly, the portfolio’s VaR (−3.05%) is marginally higher in absolute terms compared to the S&P 500 (−2.48%). This implies that, with 95% confidence, the worst daily loss for the portfolio is expected not to exceed 3.05%, whereas for the S&P 500 it is 2.48%. Finally, the portfolio’s CVaR (−5.24%) is slightly higher in absolute terms than the S&P 500’s (−4.84%). This indicates that, in the worst 5% of cases, the average loss for the portfolio is expected to be 5.24%, compared to 4.84% for the S&P 500.
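The risk and return metrics compared above can be computed from a series of daily returns as sketched below. The sketch makes several assumptions that are not confirmed by this section: 252 trading days per year for annualization, geometric compounding for the annualized return, and the historical (empirical quantile) method for VaR and CVaR. All function names are hypothetical.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor


def annualized_return(returns):
    """Geometric annualized return from daily simple returns."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (TRADING_DAYS / len(returns)) - 1.0


def annualized_volatility(returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(returns) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free_annual=0.0):
    """Excess annualized return per unit of annualized volatility."""
    excess = annualized_return(returns) - risk_free_annual
    return excess / annualized_volatility(returns)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the cumulative wealth curve."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, 1.0 - wealth / peak)
    return worst


def historical_var_cvar(returns, level=0.95):
    """Historical VaR and CVaR as (negative) daily returns.

    VaR is the least severe return within the worst (1 - level) share of
    sessions; CVaR is the average return over that same tail.
    """
    ordered = sorted(returns)
    tail_size = max(1, round(len(ordered) * (1.0 - level)))
    tail = ordered[:tail_size]
    return tail[-1], sum(tail) / len(tail)
```

As a consistency check on the reported figures, the difference in annualized returns is 10.66% − 5.89% = 4.77 percentage points, i.e., 477 basis points. If the Sharpe ratio is defined as in `sharpe_ratio`, the reported values of 0.325 and 0.19 would both be consistent with an annual risk-free rate of roughly 1.1%, although the rate actually used is not stated in this part of the text.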
In conclusion, the empirical evidence suggests that the concentration-driven (daily rebalanced) rotating portfolio significantly outperforms the S&P 500 in terms of absolute returns and risk-adjusted performance (as measured by the Sharpe ratio). However, this outperformance is accompanied by moderately higher volatility and slightly elevated downside risk metrics. The strategy appears to offer a compelling risk–return tradeoff for investors who can tolerate the additional volatility in pursuit of higher returns. It is important to note that these findings are based on historical data and may not be indicative of future performance.
4. Discussion
This concentration aligns with the “winner-takes-all” dynamics observed in digital markets (
Borsenberger 2014;
Boussard and Lee 2020;
Hagar and Shaw 2022). For example,
Hagar and Shaw’s (
2022) study on attention markets demonstrates attention concentration patterns consistent with cumulative advantage, a key aspect of “winner-takes-all” dynamics. The rise of platform economies and network effects has led to the emergence of tech giants that capture a disproportionate share of market value and user attention (
Parker et al. 2021;
Liu 2023). The drivers behind this concentration trend are multifaceted. In many cases, it stems from a collective pursuit of safety in numbers (
Palley 1995), increased impact of social media (
Hagar and Shaw 2022), and a tendency to gravitate towards perceived winners (
Boussard and Lee 2020). In financial markets, this often translates to investors pursuing volatility (
Batrinca et al. 2018) and flocking to investments that are in the spotlight, reinforcing the concentration of trading volumes in certain stocks.
This study focuses on measuring and analyzing trading volume concentration across all 500 S&P 500 stocks. While various methods exist for quantifying concentration, including the power law exponent and the Herfindahl–Hirschman Index (HHI), we opted for the Gini-based Trading Concentration Index (TCI). As demonstrated in
Figure 2 and
Figure 3, the Gini-based TCI provides a more nuanced and responsive measure of trading volume concentration than the other methods. This approach allowed us to capture the dynamic nature of concentration in financial markets more effectively. The results, as shown in
Figure 3, reveal that the Gini-based TCI for the S&P 500 fluctuated between 55.98% and 77.35% over the study period, with the highest concentration observed in August 2020. This peak coincided with heightened interest in technology stocks, particularly those benefiting from the pandemic-induced shift to remote work and digital services (
Sugiana et al. 2022;
Kamdjoug et al. 2023).
Table 5 provides insights into the specific companies driving this concentration, highlighting the dominance of tech giants like Apple, Amazon, and Tesla during key periods. A qualitative analysis of the data presented in
Table 5 offers valuable insights into the changing dynamics of trading volume concentration. It allows us to identify which sectors and companies attract the most trading activity during different market conditions and events.
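To make the alternative measure mentioned earlier in this section concrete, the following minimal sketch computes the Herfindahl–Hirschman Index from a session's trading volumes, assuming the usual definition as the sum of squared shares. The function names `hhi` and `effective_number` are hypothetical, and the sketch is not the computation behind the figures in this study.

```python
def hhi(volumes):
    """Herfindahl-Hirschman Index: sum of squared volume shares.

    Ranges from 1/n (all n stocks trade equally) to 1 (one stock
    accounts for all trading).
    """
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum((v / total) ** 2 for v in volumes)


def effective_number(volumes):
    """Number of equally traded stocks that would produce the same HHI."""
    return 1.0 / hhi(volumes)
```

Because each share is squared, the HHI is driven mainly by the few largest names, whereas the Gini coefficient depends on the ranking of all constituents.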
The core of this study involves an experimental design comparing the performance of the concentration-driven (daily rebalanced) portfolio against the S&P 500 index. This portfolio, rebalanced daily to include the 30 stocks with the highest trading volumes, outperformed the S&P 500 index benchmark, achieving an annualized return of 10.66% compared to the S&P 500’s 5.89%. This outperformance is particularly noteworthy given the general difficulty fund managers face in beating the S&P 500 index. As illustrated in
Figure 8 and
Figure 9, and
Table 3 (in
Section 3), the concentration-driven portfolio not only delivered superior returns but also demonstrated favorable risk-adjusted performance metrics. The portfolio exhibited a higher Sharpe ratio and only slightly higher maximum drawdown compared to the S&P 500, indicating a better risk–return profile. These findings suggest that from a fund manager’s perspective, following trading volume concentration can yield above-average results.
The study contributes to the literature on trading volume concentration and financial markets in several ways. Firstly, it provides insights into effective methods for measuring concentration. Secondly, it demonstrates the stochastic nature of concentration in financial markets. It builds upon and extends previous studies in this field. For instance,
Balakrishnan et al. (
2008,
2013) showed that daily trading had become more concentrated in a small set of stocks over time, while
Shankar et al. (
2020) found that the Trading Concentration Index (TCI) for S&P 500 stocks decreased after the introduction of index funds in 1975, suggesting more evenly distributed trading activity. Our findings complement these studies by providing a more comprehensive analysis of trading volume concentration and its implications for portfolio management. The study also aligns with recent industry observations, such as those reported by
Amundi (
2023), which highlight the increasing importance of understanding market concentration for effective investment strategies. Our results suggest that monitoring trading volume concentration can serve as a valuable indicator for market entry and exit points, risk hedging, and overall portfolio management.
Notably, even during periods of significant market stress, such as the consecutive days of 5% declines in the S&P 500 index in March 2020, the concentration-driven portfolio demonstrated greater resilience. As shown in
Figure 8, the profit and loss (PnL) curve of the concentration-based portfolio outperformed the S&P 500 index during these turbulent times.
Finally, the study has several limitations that should be considered when interpreting its results. Firstly, the research focuses solely on the S&P 500 stocks, which may not be representative of broader market trends or smaller cap stocks. The study does not capture trading volume concentration in stocks outside the S&P 500, even though significant trading volume can concentrate in smaller or mid-cap companies that are not part of the index, as was the case for the “meme stocks” GameStop (GME), AMC Entertainment (AMC), and Bed Bath & Beyond (BBBY) during the post-COVID-19 bull market (notably, during the period under study, individual investors, as highlighted by
Welch (
2022) and
Ülkü et al. (
2023), often generated high trading volumes on such stocks, as many retail investors were confined at home with ample time to implement their own strategies, frequently outperforming institutional investors). For example, the highest dollar trading volume for GME shares occurred on 27 January 2021. On that day, the stock price reached a maximum of USD 347.51, and the trading volume was approximately 93.7 million shares, which translates to a dollar trading volume of approximately USD 32.5 billion. For comparison, the most liquid stocks, such as Apple, Microsoft, Amazon, and Tesla, can achieve daily trading volumes of USD 5–15 billion under normal market conditions. Furthermore, the three-year period from January 2020 to December 2022 covered in the study was marked by exceptional market conditions, including the COVID-19 pandemic, which may limit the generalizability of findings to more typical market environments. Additionally, the concentration-driven portfolio strategy, while outperforming the S&P 500 benchmark, does not account for several real-world trading costs. Although many brokerage firms have reduced or eliminated commissions on stock trades, other costs such as bid-ask spreads, market impact costs, opportunity costs, and taxes still apply. Bid-ask spreads can incur costs, especially for stocks with lower liquidity. Large trades can move the market price, leading to less favorable prices (market impact costs). The time it takes to execute trades at desired prices can result in missed opportunities (opportunity costs or slippage costs). Lastly, capital gains taxes can affect the net returns from trading, particularly for short-term trades. The study’s reliance on daily trading volume data may not capture intraday trading patterns or the impact of after-hours trading. 
Furthermore, the research does not fully explore the potential causes of changes in trading volume concentration, such as the influence of algorithmic trading or the growing popularity of passive investment strategies. The experimental design used in the study is theoretical, assuming that trades can be executed exactly at closing prices. In practice, the execution of large trades by a hedge fund attempting to implement the strategy may not occur at these prices due to the size of the transactions, leading to potential discrepancies between theoretical and actual performances. Lastly, while the study employs multiple concentration measures, including the power law exponent, the Herfindahl–Hirschman index, and Gini-based TCI, it does not exhaustively compare these methods or validate their effectiveness across different market conditions or time horizons. This last issue could be addressed as a subject for future research by other scholars, providing a more comprehensive comparison and validation of these concentration measures under various market conditions and over different time periods.
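The effect of the trading costs listed among these limitations can be approximated with a simple turnover-based adjustment. The sketch below is a rough illustration only: it assumes an equal-weighted portfolio, a single proportional cost per traded side expressed in basis points, and it ignores the reweighting of stocks that stay in the portfolio. The function names and the cost parameter are hypothetical and were not used in this study.

```python
def one_way_turnover(prev_holdings, new_holdings):
    """Fraction of an equal-weighted portfolio replaced at a rebalance."""
    new_set = set(new_holdings)
    if not new_set:
        raise ValueError("new_holdings must not be empty")
    entrants = new_set - set(prev_holdings)
    return len(entrants) / len(new_set)


def net_return(gross_return, turnover, cost_bps_per_side):
    """Daily return after proportional costs.

    Each replaced position is sold and its replacement bought, so the cost
    is charged on two sides of the turnover (an assumption of this sketch).
    """
    return gross_return - 2.0 * turnover * cost_bps_per_side / 10_000.0


# Example: one of four holdings is replaced, at an assumed 5 bps per side.
turnover = one_way_turnover(["A", "B", "C", "D"], ["A", "B", "C", "E"])
print(net_return(0.01, turnover, cost_bps_per_side=5))
```

Applied to every rebalancing day, such an adjustment would show how much of the gross outperformance survives a given cost level.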
All in all, this study underscores the significance of trading volume concentration in financial markets and its potential as a basis for effective investment strategies. By providing a comprehensive analysis of concentration dynamics in the S&P 500, the study contributes to a deeper understanding of market behavior and offers insights that can inform more sophisticated approaches to portfolio management and risk assessment in an increasingly concentrated financial landscape.
5. Conclusions
This study was inspired by the pervasive phenomenon of concentration observed across various domains, from financial markets to technology and urban development. The background of this research is rooted in the recognition that concentration has become increasingly prevalent in many aspects of modern life, particularly in financial markets. The period from January 2020 to December 2022 was marked by significant events such as the COVID-19 pandemic, the Russia–Ukraine war, the energy crisis, surging inflation, Federal Reserve policy shifts, and banking turmoil, which collectively fueled heightened market volatility and trading volume concentration fluctuations.
The research employed a comprehensive methodological approach to analyze trading volume concentration across the 500 S&P 500 stocks. Various methods for measuring and assessing concentration were presented, including indicators based on the power law exponent, the Herfindahl–Hirschman Index (HHI), and the Gini-based Trading Concentration Index (TCI).
Figure 2 in the study illustrates the differences between these concentration measurement methods. Notably, the Gini-based TCI revealed that concentration fluctuated between 55.98% and 77.35% during the study period. The research also conducted a purely statistical evaluation of concentration levels at different points along the spectrum of trading volume concentration distribution, analyzing combinations such as the top 10 stocks by trading volume and the 25th percentile, top 30 stocks and 50th percentile, top 75 stocks and 75th percentile, and top 250 stocks (half of the S&P 500) and 90th percentile. This statistical assessment led to the conclusion that the combination of the top 30 stocks by trading volume and the 50th percentile was most representative of the trading volume concentration phenomenon.
Based on these findings, the experimental design for testing the research hypotheses involved creating a portfolio that mimicked concentration by consisting of the 30 stocks with the highest daily share of total trading volume among all 500 S&P 500 companies. This approach was chosen to ensure the portfolio remained “concentration-variant”, reflecting changes in overall market trading volume concentration. The decision to use a fixed number of stocks (30) rather than a fixed concentration level was made to better manage risk and provide a more direct comparison with concentration changes.
The comparison between the concentration-driven (daily rebalanced) portfolio and the S&P 500 index benchmark yielded significant results. The concentration-driven portfolio demonstrated superior performance across several risk–return metrics. Specifically, the portfolio achieved an annualized return of 10.66% compared to the S&P 500’s 5.89%. The annualized standard deviation was moderately higher for the portfolio (29.43%) than for the S&P 500 (25.48%), indicating somewhat higher volatility. However, the portfolio’s annualized Sharpe ratio (0.325) surpassed that of the S&P 500 (0.19), suggesting a more favorable risk-adjusted return. The maximum drawdown for the portfolio (37.70%) was moderately higher than the S&P 500 (33.92%), while the Value at Risk (VaR) at 95% confidence level was −3.05% for the portfolio compared to −2.48% for the S&P 500. The Conditional Value at Risk (CVaR) at 95% was −5.24% for the portfolio versus −4.84% for the S&P 500.
Implementing the proposed strategy in the experimental design required substantial programming work and time investment. The results obtained extend existing knowledge about concentration in financial markets. This article includes numerous original figures and tables that visualize complex data patterns. Additionally, a qualitative assessment of the concentration-driven portfolio’s composition was conducted to better understand the temporal synchronization of concentration fluctuations and the underlying factors driving these changes. This analysis helps elucidate why certain stocks entered or exited the portfolio during different market phases, contributing to a deeper understanding of the relationship between market events, trading volume concentration, and portfolio performance. Overall, this study provides insights into the dynamics of trading volume concentration and its potential implications for investment strategies in the S&P 500 market.