2. Materials and Methods
2.1. Dataset Assembly
Specific geographic location data across the horse population in the United States is difficult to obtain, and in many breeds is not collected for all horses (or potentially any registered horses). It is more common for breed associations to retain location data (i.e., zip code) for an owner than for the animal itself. The purpose of the current study was to evaluate geographic clustering within the breeding sector of the horse racing industry. Within the population of breeding animals in the U.S., comprehensive and accurate location data is most available for stallions (as compared to mares or foals). Additionally, stallions represent a critical asset in this industry due to the potential genetic impact a stallion has because of the large number of offspring it sires in any given year.
Therefore, for this study of the U.S. racehorse breeding population, registered Thoroughbred and Standardbred stallions were selected for evaluation. Thoroughbred data were sourced from the national registry database maintained by The Jockey Club (Lexington, KY, USA). The dataset included all registered Thoroughbred stallions in the United States for the periods 1995–1999 and 2012–2017. For each year that a stallion appeared in the dataset, the zip code of the physical location of that stallion was provided. Standardbred data were obtained from the United States Trotting Association (USTA; Columbus, OH, USA), and the dataset included all registered Standardbred stallions in the United States across the fifteen-year period from 2002–2017. Unlike The Jockey Club, the USTA only retains location data at a state-level resolution. Therefore, zip code data had to be obtained on a state-by-state basis from organizations or regulatory bodies tasked with such record-keeping responsibilities at the state level. The initial year in the dataset for each breed (Thoroughbreds, 1995; Standardbreds, 2002) represents the earliest year in which complete digitized records were available from The Jockey Club and USTA, respectively. The geographic location provided by all of the national and state breed registries is where the stallions lived in the reference year and where mating/sperm collection was done.
For Standardbreds, organizations working to document and retain stallion location records are not active in all states. Even in states with higher Standardbred populations, where more than one established organization, agency, or other body may exist, records of location data were sometimes incomplete. Therefore, comprehensive zip code data for registered Standardbred stallions could not be obtained for all 50 states. The Northeast/Mid-Atlantic region was selected for a more in-depth case study at the county level, as complete zip code data were available in each of the following states: Delaware (Delaware Standardbred Breeders’ Fund), Maryland (Maryland Standardbred Race Fund), New Jersey (Standardbred Breeders and Owners Association of New Jersey), New York (New York Racing Commission), Pennsylvania (Pennsylvania State Horse State Gaming Commission), and Virginia (Virginia Harness Horse Association). Another reason to focus on the Mid-Atlantic region is that its states are relatively small and competition for the gambling dollar is high: racing customers and stallions can both move easily across state borders within the region.
The datasets also included lifetime performance data for each stallion and its offspring. Multiple performance metrics are collected within an individual breed, and the individual breed association was consulted in order to determine the metric that would be most informative when evaluating performance quality. For Thoroughbreds, The Jockey Club provided records for races won in specific race classes (Grade 1 (G1), Grade 2 (G2), Grade 3 (G3), Black-Type (BT)). The highest race class in which a stallion won a race was used as the metric to designate the stallion’s own performance quality. Progeny performance metrics included the number of graded and/or black-type winners sired by each stallion. For Standardbreds, USTA supplied lifetime stallion and progeny earnings records.
To separate stallions by performance level, a performance index was created using the above-mentioned metrics. Because the performance parameters provided by each breed registry differed (race level for Thoroughbreds vs. money earned for Standardbreds), the performance indices for each were constructed separately (Table 1, Thoroughbreds; Table 2, Standardbreds). For the Thoroughbred data, the stallion’s own performance record was scored as follows, with the race level representing the highest race class won by an individual stallion: 1 = G1; 2 = G2; 3 = G3; 4 = BT; and 5 = no recorded graded or BT wins. A second score was assigned for lifetime progeny performance. Progeny performance data provided by The Jockey Club indicated if a stallion had sired graded stakes or BT winners. However, the grade level (1–3) won by the offspring was not specified in the dataset. Therefore, progeny performance quality scores were defined as: 1 = sired graded stakes winners; 2 = sired BT winners, but no graded winners; 5 = did not sire any graded or BT winners. These stallion and offspring scores were then summed to determine the final combined performance quality score (CQS). This CQS was then used to rank stallions based on level of performance quality. In the final performance index used for further data analysis in this study, Level One stallions had a CQS of 2–3. Level Two stallions had a CQS of 4–6. A CQS of 7–9 placed stallions in Level Three, while the lowest level, Level Four, had a score of 10, which represented stallions that neither won a stakes or BT race themselves nor sired any offspring winning at either of these levels. The total number of stallions assigned to each of the final performance index levels for both the initial year (1995 or 2002) and the final year (2017) for Thoroughbreds and Standardbreds can be found in Table A1.
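The Thoroughbred scoring rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS code used in the study; the function names and the string labels for race classes ("G1", "graded", and so on) are hypothetical choices made for this example.

```python
# Illustrative sketch of the Thoroughbred CQS rules described in the text.
# All names and category labels here are hypothetical, not from the study's code.

# Stallion's own score: highest race class won (None = no graded or BT wins).
STALLION_SCORE = {"G1": 1, "G2": 2, "G3": 3, "BT": 4, None: 5}

# Progeny score: best class of winner sired (None = no graded or BT winners).
PROGENY_SCORE = {"graded": 1, "BT": 2, None: 5}


def thoroughbred_cqs(best_own_win, best_progeny_win):
    """Sum the stallion score and the progeny score (possible range 2 to 10)."""
    return STALLION_SCORE[best_own_win] + PROGENY_SCORE[best_progeny_win]


def thoroughbred_level(cqs):
    """Map a CQS to performance Level One (1) through Level Four (4)."""
    if 2 <= cqs <= 3:
        return 1
    if 4 <= cqs <= 6:
        return 2
    if 7 <= cqs <= 9:
        return 3
    if cqs == 10:
        return 4
    raise ValueError(f"CQS out of range: {cqs}")


# A G1 winner that sired graded stakes winners lands in Level One.
top = thoroughbred_level(thoroughbred_cqs("G1", "graded"))
# A stallion with no graded or BT wins and no such progeny lands in Level Four.
bottom = thoroughbred_level(thoroughbred_cqs(None, None))
```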
The Standardbred performance index was created similarly, by first scoring the stallion and offspring performance records separately. Stallion race performance based on lifetime race earnings was scored as follows: 1 = ≥USD 2 million; 2 = <USD 2 million and ≥USD 1 million; 3 = <USD 1 million and ≥USD 500,000; 4 = <USD 500,000 and ≥USD 100,000; 5 = <USD 100,000 and ≥USD 10,000; 6 = <USD 10,000 and >USD 0; 7 = USD 0. Progeny performance scores were assigned using total lifetime offspring earnings for each stallion: 1 = ≥USD 100 million; 2 = <USD 100 million and ≥USD 10 million; 3 = <USD 10 million and ≥USD 1 million; 4 = <USD 1 million and ≥USD 100,000; 5 = <USD 100,000 and ≥USD 10,000; 6 = <USD 10,000 and >USD 0; and 7 = USD 0. The CQS was then generated, and stallions were ranked by score: Level One, CQS = 2–5; Level Two, CQS = 6–9; Level Three, CQS = 10–13; Level Four, CQS = 14. Similar to the Thoroughbred index, stallions in Level Four neither won any money themselves as a racehorse, nor sired any offspring with race earnings.
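The earnings thresholds for the Standardbred index lend themselves to a simple lookup. The following Python sketch is offered as an illustration under the thresholds stated above; the function and constant names are hypothetical and do not come from the study's own code.

```python
# Illustrative sketch of the Standardbred CQS rules described in the text.
# Names are hypothetical; thresholds (in USD) are taken from the paragraph above.

# Lower bounds for scores 1 through 5, in descending order.
STALLION_CUTOFFS = (2_000_000, 1_000_000, 500_000, 100_000, 10_000)
PROGENY_CUTOFFS = (100_000_000, 10_000_000, 1_000_000, 100_000, 10_000)


def earnings_score(amount, cutoffs):
    """Return 1-5 by threshold, 6 for positive earnings below USD 10,000, 7 for zero."""
    for score, lower_bound in enumerate(cutoffs, start=1):
        if amount >= lower_bound:
            return score
    return 6 if amount > 0 else 7


def standardbred_cqs(own_earnings, progeny_earnings):
    """Sum the stallion and progeny earnings scores (possible range 2 to 14)."""
    return (earnings_score(own_earnings, STALLION_CUTOFFS)
            + earnings_score(progeny_earnings, PROGENY_CUTOFFS))


def standardbred_level(cqs):
    """Map a CQS to performance Level One (1) through Level Four (4)."""
    if 2 <= cqs <= 5:
        return 1
    if 6 <= cqs <= 9:
        return 2
    if 10 <= cqs <= 13:
        return 3
    if cqs == 14:
        return 4
    raise ValueError(f"CQS out of range: {cqs}")
```

For example, a stallion with USD 1.5 million in own earnings and USD 50 million in progeny earnings would score 2 + 2 = 4 and fall in Level One.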
It should be noted that any performance score for stallions whose location is specified for the earlier year of our trend analysis (1995 or 2002 depending on breed) incorporates information on subsequent performance of the stallion or its progeny that we possess, but which the breeder in that year did not. This is not a serious problem for making inferences about geographic behavior using the performance data that we have available. First, the final racing record for the stallion itself was known to the breeder in the majority of cases. Older animals dominate the equine age distribution in any given year, and very few members of these two breeds race after being put out to stud. Also known by the breeders in real time was the quality of the bloodline based on the performance of the stallion’s ancestors. Finally, given the detail that our analysis requires on date, horse quality, and geographic location, no alternative to the lifetime performance data provided by the two breeding organizations exists.
After the CQS was calculated for each registered stallion, the number of stallions was summed over the geographic units of interest. For each breed, this aggregation task was done for the earliest and latest years for which complete data were available. For Thoroughbreds, geographic subtotals were calculated for the years 1995 and 2017. For Standardbreds, geographic subtotals were calculated for the years 2002 and 2017. All necessary match-merge operations and subtotal calculations were done using the SAS Statistical Software program (SAS Institute, Cary, NC, USA). The total number of registered stallions in each year of the evaluated periods (1995–2017, Thoroughbreds; 2002–2017, Standardbreds) was examined to confirm that the above specified years selected for analysis did not deviate from global trends for the stallion population within each breed. A representation of these total stallion trends for Thoroughbreds and Standardbreds can be found in Figure A1 and Figure A2, respectively.
The Jockey Club’s data on Thoroughbred stallions contained zip codes for each stallion-year combination. In this dataset, each stallion is located in only one zip code in a given year. Each zip code in the 1995 Thoroughbred file was matched to a unique county and state using the Geographic Correspondence Engine of the Missouri Census Data Center (MCDC) [46]. This online software tool provides zip code-county crosswalk files for the census years 1990 and 2000. The crosswalk files for both years were used in order to maximize the probability of finding a match (zip codes are created and retired over time). If a stallion’s zip code of residence was split between two counties, the county that had the largest share of the zip code’s census population was selected and the second county was ignored. After a county Federal Information Processing Standards (FIPS) code was assigned to each zip code in the 1995 file of registered stallions, a state code was assigned using state-county crosswalk files from the US Census Bureau. This second geographic crosswalk file is stable compared to the zip code files.
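The rule for zip codes that straddle county lines can be illustrated with a short Python sketch. It assumes a crosswalk supplied as (zip code, county FIPS, population share) rows; that row layout, the function name, and the placeholder codes in the example are all invented for illustration and do not reflect the actual MCDC file format.

```python
# Illustrative sketch of the "largest population share wins" rule for split zip codes.
# The row layout and all codes below are hypothetical placeholders.

def dominant_county(crosswalk_rows):
    """Return {zip_code: county_fips}, keeping the county with the largest share.

    If two counties tie, the one listed first in the crosswalk is kept.
    """
    best = {}
    for zip_code, county_fips, pop_share in crosswalk_rows:
        if zip_code not in best or pop_share > best[zip_code][1]:
            best[zip_code] = (county_fips, pop_share)
    return {zip_code: fips for zip_code, (fips, _) in best.items()}


# Invented example: zip "00001" is split 70/30 between two counties.
rows = [
    ("00001", "99001", 0.70),
    ("00001", "99003", 0.30),
    ("00002", "99005", 1.00),
]
zip_to_county = dominant_county(rows)
```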
The same procedure was used to assign standing Thoroughbreds to states in the year 2017, with one difference. Instead of using zip code-county crosswalk files from the Missouri Census Data Center, we used the equivalent file for the first quarter of 2017 from the US Department of Housing and Urban Development (HUD) [47]. The Department maintains a continuous series of geographic crosswalk files for recent years, while the MCDC is the best source for older geographic files.
The data on national Standardbred stallions included a code for state of residence for each year. Therefore, it was not necessary to match zip codes to higher-level geography before creating subtotals for U.S. states. For the case study on six Northeast/Mid-Atlantic states, however, Standardbred data were collected at the zip code level (see above). These data were matched to county FIPS codes using a procedure similar to that described above for Thoroughbreds, except that the year 2000 MCDC crosswalk file was used for the year 2002 data, and the 2017: Q1 HUD crosswalk file was used for the year 2017 data [46,47].
For the six Northeast/Mid-Atlantic states, it was therefore possible to analyze the geographic distribution of both breeds across counties, not just across states. The Northeast/Mid-Atlantic region is the only part of the U.S. for which county geography is the geographic unit of analysis in this study.
2.2. Descriptive Data Analysis
The primary method of summarizing the data will consist of maps of the numbers or percentages of standing stallions in each areal unit in the earliest available year, which varies by breed, and in the most recent year, which is 2017 for both breeds. Subtotaled data were match-merged into the ESRI Arc-Map GIS program for this purpose. In our maps of the 48 contiguous states and Puerto Rico, data bars for the early year and for 2017 are shown side by side for each state. These bars represent each state’s percentage of the total number of stallions standing in each year. Because we use percentages and not raw counts, these national scale maps depict the change in the geographic distribution of stallions: The eye is not misled by the significant decline over time in the absolute number of registered stallions across all states.
In contrast, the county-level maps for the six Northeast/Mid-Atlantic states show actual head counts of stallions in each year, not percentages of the six-state total. Each year appears on its own map. The data were mapped this way in order to use graduated circles instead of bars. Graduated circles are easier to read when a high number of geographic units are clustered together in a small space. Notwithstanding the display of raw counts for the years side by side, these Northeast/Mid-Atlantic maps were designed to focus attention on cross-area distributions and not on the decline in absolute numbers of stallions over time.
The maps described above can tell interesting stories on their own. It is sometimes useful, however, to quantify geographic phenomena. In particular, we wish to quantify the extent to which stallions are concentrated in fewer states (counties) as opposed to many states (counties). We would like a single national metric in order to test the hypothesis that key business assets—registered stallions in our case—concentrate into a smaller number of geographic locations in response to a decline in total numbers.
The Herfindahl–Hirschman index (HHI) is the simplest measure of the concentration of countable objects across cells, categories, or jurisdictions. It is calculated as the sum of the squares of the shares that each jurisdiction has of the total national count. Because we use decimals, not percentages to calculate the index, our HHIs are all less than or equal to 1.0, which is the index’s maximum value if all stallions were to locate in a single jurisdiction. The data required as inputs to the HHI are precisely what we are mapping with our vertical bars shown on national maps. The HHI provides a single measure of spatial concentration for the entire U.S., or, for the entire six-state Northeast/Mid-Atlantic Region.
An index such as the HHI must be used with caution when one wishes to compare the degree of concentration across cases where the overall count is very different. Imagine two scenarios in which there is never more than one stallion per county. In the 1995 scenario, assume that there are exactly 310 stallions distributed across the 310 counties in the six Northeast/Mid-Atlantic States. In the 2017 scenario, however, imagine that there are only 100 stallions, which are distributed across these same 310 counties at a rate of one stallion per county. In 1995, the smallest possible HHI that could exist would be 310 × (1/310)² = 1/310 ≈ 0.0032. In 2017, however, the smallest possible HHI that could exist would be 100 × (1/100)² = 1/100 = 0.01. The second of these numbers is larger than the first, in spite of the fact that both scenarios represent the maximum possible dispersion of a set of indivisible units, horses, across a set of 310 counties.
This total size problem is not serious in our dataset, because the number of stallions for which we calculate HHI never falls below the number of cells over which they might hypothetically be distributed. The closest we get to this threshold is 69 stallions distributed over 50 states.
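The two hypothetical scenarios above can be checked numerically. The short Python sketch below computes the unadjusted HHI from a list of head counts per jurisdiction, using decimal shares as described in the text; the function name is a hypothetical choice for this illustration.

```python
# Illustrative sketch: unadjusted HHI as the sum of squared decimal shares.
# The function name is hypothetical; the two scenarios mirror the text above.

def hhi(counts):
    """Sum of squared shares of the total, where counts holds one entry per jurisdiction."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("HHI is undefined when there are no stallions")
    return sum((c / total) ** 2 for c in counts)


# 310 stallions, one in each of 310 counties.
scenario_1995 = [1] * 310
# 100 stallions, one per county, leaving 210 of the 310 counties empty.
scenario_2017 = [1] * 100 + [0] * 210

hhi_1995 = hhi(scenario_1995)  # equals 1/310
hhi_2017 = hhi(scenario_2017)  # equals 1/100
```

Both scenarios are maximally dispersed, yet the second yields the larger index, which is the total size problem the adjustment is meant to address.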
In order to be conservative with any conclusions we draw based on the HHI, we will adjust each calculated index to control for total head count at the national or regional level. Specifically, we rescale the HHI so that it can take any value between 0 and 1, but 0 is set equivalent to the HHI minimum that is specific to each national/regional total. The formula for this adjusted HHI is as follows:

$$\mathrm{HHI}_{adj} = \frac{\mathrm{HHI}_{raw} - \mathrm{HHI}_{min}}{1.0 - \mathrm{HHI}_{min}}$$

where $\mathrm{HHI}_{raw}$ is calculated without any adjustment, 1.0 is the universal HHI maximum, and $\mathrm{HHI}_{min}$ is the smallest HHI attainable for the given total:

$$\mathrm{HHI}_{min} = \frac{N F (W + 1)^2 + N (1 - F) W^2}{T^2}$$

where N = number of jurisdictions over which the stallions could be distributed; T = total number of stallions; W = T/N rounded down to the nearest whole number; F = T/N − W.
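A minimal Python sketch of the size adjustment follows. It assumes that the HHI minimum is reached by spreading the stallions as evenly as whole animals allow (each jurisdiction holds W head, and the leftover animals are placed one per jurisdiction), and that the adjusted index rescales the raw HHI linearly between that minimum and 1.0. The function names are hypothetical, and this should be read as one plausible reading of the definitions above rather than as the study's own code.

```python
# Illustrative sketch of a size-adjusted HHI.
# Assumption: the minimum HHI corresponds to the most even split of whole animals.
# Function names are hypothetical.

def hhi_min(total, n_units):
    """Smallest possible HHI for `total` indivisible head across `n_units` jurisdictions."""
    whole, leftover = divmod(total, n_units)  # whole = W; leftover = N * F
    # `leftover` jurisdictions hold W + 1 head; the rest hold W head.
    return (leftover * (whole + 1) ** 2 + (n_units - leftover) * whole ** 2) / total ** 2


def hhi_adjusted(counts):
    """Rescale the raw HHI so that 0 is the size-specific minimum and 1 the maximum."""
    total = sum(counts)
    raw = sum((c / total) ** 2 for c in counts)
    floor = hhi_min(total, len(counts))
    if floor == 1.0:
        raise ValueError("adjusted HHI is undefined when minimum and maximum coincide")
    return (raw - floor) / (1.0 - floor)


# Both maximally dispersed scenarios from the text now score (approximately) zero.
adj_1995 = hhi_adjusted([1] * 310)
adj_2017 = hhi_adjusted([1] * 100 + [0] * 210)
```

Under this reading, the case of 69 stallions over 50 states mentioned above has a minimum of (19 × 4 + 31 × 1)/69² = 107/4761, or roughly 0.022.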
2.3. Causal Data Analysis
We conducted two sets of analyses to test hypotheses that relate our geographic measures to: (1) the economic rents earned by different classes of horses, or across different states; and (2) state-level policies that incentivize local breeding and help to retain stallions over time. Our hypotheses about the relationship between geographic clustering and economic rents are addressed largely by comparing HHI results for high performing stallions to the results for lower performing stallions. We also use Census equine sales data for the largest Thoroughbred states to explore the relationship between economic rents and success at retaining Thoroughbred stallions over our study period. This analysis is preliminary, and the method is described in the Results section. We describe here the method that we used to test our hypothesis linking supportive state policies to stallion retention over a very challenging couple of decades.
A statistical test on the effect of state policies requires data on changes in stallion numbers (see above), as well as a judgement call on the extent to which each state’s portfolio of policies is supportive of local breeders. To make the policy characterization portion of this research project manageable, we focused only on the top ten states for each breed in the earliest available year, 1995 or 2002. Some states ranked in the top ten for both breeds, bringing the combined number of states examined to fifteen.
We collected information on policies related to breeder incentive programs, presence of casinos and racinos, gambling revenues, gaming activities at the track, and state subsidies. We classified as “supportive” all states that (1) funded breeding incentive programs or purse supplements using monies other than the pari-mutuel handle, and (2) implemented these incentive programs early enough in our study period to affect the trends we measure in this article. States that did not meet both of these conditions were classified as “not supportive.” Two states, Kentucky and Illinois, met our supportiveness criteria for Thoroughbreds, but not for Standardbreds (Kentucky supports Standardbreds, but at a much lower percentage of the relevant revenue stream—15%). In the analysis below where geographic data for both breeds are combined, these two states will be classified as “mixed.”
With these policy categorizations in hand, we then created two numeric measures of each state’s degree of success or failure at retaining stallions over the study period. We calculated the change in each state’s national rank between the earliest year in the data and the latest year, using number of head as the ranking criterion. For each breed we also calculated the percentage point change in each state’s national share between the earliest available year and the latest available year. (These are the numbers shown in the right-most columns of Table A2 and Table A5.) These state-specific success variables were compared across policy categories using box plots and t-tests on the differences between means, although the feasibility of the t-test varied by breed. We were unable to conduct hypothesis tests on the ten-state Thoroughbred sample, because only two states fall into the “not supportive” (NS) category. In contrast, the top ten Standardbred states are divided evenly between the “supportive” (S) and NS policy categories.
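The two retention measures can be illustrated with a small Python sketch. The sign convention for rank change (early rank minus late rank, so that a positive value means the state moved up) and the alphabetical tie-breaking rule are assumptions made for this example, as are the function names and the invented head counts for three placeholder states.

```python
# Illustrative sketch of the two state-level retention measures.
# Assumptions: rank change = early rank - late rank (positive = moved up);
# ties in head count are broken alphabetically. All names and data are hypothetical.

def national_shares(counts):
    """Percentage share of the national total held by each state."""
    total = sum(counts.values())
    return {state: 100.0 * n / total for state, n in counts.items()}


def national_ranks(counts):
    """Rank states by number of head, with rank 1 for the largest count."""
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    return {state: rank for rank, state in enumerate(ordered, start=1)}


def retention_measures(early, late):
    """Return {state: (rank_change, share_change_in_percentage_points)}."""
    early_rank, late_rank = national_ranks(early), national_ranks(late)
    early_share, late_share = national_shares(early), national_shares(late)
    return {
        state: (early_rank[state] - late_rank[state],
                late_share[state] - early_share[state])
        for state in early
    }


# Invented head counts for three placeholder states.
early_counts = {"A": 50, "B": 30, "C": 20}
late_counts = {"A": 10, "B": 25, "C": 15}
measures = retention_measures(early_counts, late_counts)
```

In this invented example, state A falls two places and loses 30 percentage points of national share, even though every state lost head in absolute terms.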
Table 3 below shows the full sample of states used for our analysis of the effects of state breeder incentive policies. When all fifteen states are used, for example in a t-test, both breeds’ changes in national share are pooled in the dataset, so any distinction between breeds is lost.
Table 3 shows our judgement of the degree of policy supportiveness by state and breed. It lists the top ten states (by number of stallions) in the early year for each breed and also shows the change in each state’s rank for that breed on the list of all 49 states.
3. Results and Discussion
The figures and tables discussed in this section will chronicle changes in the national and regional distribution of registered stallions.
Depicted in Figure 1 is the map comparing the distribution (as a percentage of the total) of Thoroughbred stallions in the United States in 1995 and 2017. In 1995 there were 5203 Thoroughbred stallions in the U.S., while in 2017 that number was reduced to 1710, a decline of 67%. A numeric version of the map in Figure 1 may be found in Table A2.
There were increases from 1995 to 2017 in the percentage of Thoroughbred stallions standing at stud in Florida, Indiana, Kentucky, Louisiana, Ohio, and West Virginia. While Kentucky was not the sole dominant state in 1995, it was the big relative gainer over the 22-year period, in the face of national decline. Increases in the percentage of standing Thoroughbred stallions are also evident in the neighboring states of Indiana, Ohio, and West Virginia, where the purse and breeder incentive structure is supported by revenue from alternative gaming such as slot machines and table games [
48,
49,
50]. Our analysis found that a significant percentage of Indiana and Ohio Thoroughbreds are standing near the Kentucky border, suggesting that there is a three-state cluster centered on the cultural heartland of the breed in Fayette County, Kentucky. In contrast, more than two-thirds of West Virginia’s Thoroughbreds are standing far to the east, in a county that is almost completely surrounded by Maryland and Virginia. They are therefore part of a cluster of Thoroughbreds in those states.
Shown in
Figure 2a,b are maps comparing the distribution (as a percentage of the total) of top-tier versus lower-tier Thoroughbred stallions in the U.S. in 1995 and 2017. For these maps, top-tier is defined as Level 1 or 2 based on our CQS, while lower-tier is defined as Level 3 or 4. Detailed state-by-state data for each tier can be found in
Table A3 (top-tier) and
Table A4 (lower-tier). Although lower-tier Thoroughbred stallions are located in most states throughout the U.S., it is evident that Kentucky is dominant when it comes to standing top-tier Thoroughbred stallions in the U.S. for both years.
Kentucky not only dominates the nation for top-tier Thoroughbreds, it is also the only state that significantly increased its share of these stallions in the face of national decline. Looking at
Figure 2b for lower-tier Thoroughbred stallions, the neighboring states of Kentucky, Ohio, Indiana, and West Virginia have done well over time, but so have the states of Florida, Louisiana, and Arkansas. While Kentucky historically has been a leader in the breeding of Thoroughbred racehorses in the U.S. and has been able to survive the lack of alternative gaming revenue support for horseracing, this has not been the case for the Standardbred breeding business in the Commonwealth. The relative decline of the Kentucky Standardbred industry is detailed in Garkovich [
24] and is also shown on the map that follows.
Figure 3 compares the cross-state distribution (as a percentage of the total) of Standardbred stallions in the United States in 2002 and 2017. In 2002 there were 1082 registered Standardbred stallions in the U.S., while in 2017 that number was reduced to 681, a reduction of 37%. It is evident from
Figure 3 that Standardbred stallions are more clustered nationally than Thoroughbred stallions: They are primarily located east of the Mississippi. There were increases in the percentage of Standardbred stallions in Delaware, Maryland, Ohio, Pennsylvania, and Iowa, where purse and breeder incentive structure is supported by revenue from alternative gaming such as slot machines and table games [
51,
52,
53,
54,
55,
56]. The same is not true in the states of Kentucky, New Jersey, and Michigan, which experienced marked declines in their share of Standardbred stallions over the 2002–2017 period [
16,
40,
57,
58]. A complete numeric comparison between distribution by state is displayed in
Table A5.
Depicted in
Figure 4a,b are maps of top-tier and lower-tier Standardbred stallions (as a percentage of the total) in the U.S. in 2002 and 2017. In contrast to the Thoroughbred maps, top-tier Standardbreds are defined here as Level 1 only; lower-tier Standardbreds are Levels 3, 4, or 5. Selecting a more elite group at the top end shows more clearly the difference in geographic change over time by quality level. Ohio was the biggest gainer overall for this breed, but its relative success over the study period is even more dramatic when the analysis is restricted to Level 1 stallions (
Figure 4a). A full numeric summary of top- and lower-tier Standardbred stallions can be found in
Table A6 and
Table A7, respectively.
We have argued that the geographic tendency in racehorse breeding, especially under conditions of secular decline, is toward a winner-take-all outcome. If Kentucky was the big national winner in Thoroughbreds from 1995 to 2017, Ohio was the big national winner in Standardbreds. Like Kentucky, it had a special edge in retaining the top-tier stallions, suggesting that agglomeration is even more important when the economic returns are high. This could not have happened if Ohio had not implemented purse and breeding incentives funded from alternative gaming.
We also examined states in the Northeast/Mid-Atlantic region to assess county-level changes in the stallion population for both Thoroughbreds and Standardbreds. Changes are depicted in
Figure 5 for Thoroughbreds, and
Figure 6 for Standardbreds. In both cases there is not only a decrease in the total number of stallions standing at stud, but also in the number of breeding farms still in existence in the first few years of the current decade [
57].
Both breeds show a similar trend in the geographic distribution of stallions over time. First, small numbers of stallions standing isolated in western New York and Pennsylvania disappeared from the national registries. Instead, both breeds appear to have contracted to a high-density core in the Chesapeake-Susquehanna region. A few counties in New York’s Hudson Valley region also held their own over the study period.
Specific state-level policy choices can help to explain these patterns. New York, Pennsylvania, and Delaware now have a lucrative purse and breeder incentive structure supported by alternative gaming [
51,
54,
55,
59,
60]. The big loser over the study period was New Jersey, a state that lacks these incentives [
16,
57]. An important reason for this is that New Jersey pioneered legal gambling in stand-alone casinos [
16]. Located in Atlantic City, New Jersey’s casino industry has opposed alternative gaming at the state’s racetracks, the key source of breeder incentives in other states. For this reason, Malinowski and Avenatti [
57] predicted that New Jersey could lose much of its equine agribusiness, which generated USD 780 million of economic impact annually, USD 110 million in federal, state, and local taxes, and 57,000 acres of working agricultural landscape and open space. Judging by the latest figures on standing stallions, it appears as if the Malinowski-Avenatti prediction was beginning to come true [16]. It should be noted that in 2019, subsequent to the time period examined in the current study, the New Jersey state legislature passed legislation authorizing an appropriation from the state budget of USD 20 million annually for a period of five years in support of Thoroughbred and Standardbred racing in the state [61]. Future study of the impact of this influx of capital on horse racing, and of its downstream effects on the breeding segment of the industry, is warranted.
Table 4 summarizes the change in the geographic distribution of all kinds of registered stallions over time, using the size-adjusted Herfindahl–Hirschman index described in the section on materials and methods. As previously discussed, these changes in spatial concentration must be considered in light of the decline in the number of stallions standing nationwide, including the six states selected for county-level analysis. It is not inevitable that a measure of spatial concentration would increase under these conditions. An equal percentage decline in number of head across all geographic units, for example, would leave any concentration index unchanged.
The HHI results in Table 4 show that Standardbred stallions are consistently more concentrated than Thoroughbred stallions, as already stated above. This is also true across the 310 eastern counties, although the difference between breeds is less dramatic at this scale, especially in 2017. Ideally, one would like to compare pairs of HHIs using a test of statistical significance. Djolov [62], however, argues that the study of such a test is in its infancy, that the calculations are complicated (requiring a conversion to the Gini coefficient), and that results in applied contexts are difficult to interpret.
A second key finding from
Table 4 is that top tier stallions are always more concentrated across US states than lower tier stallions, controlling for any bias that may be imparted by the total count. (This fact can be seen in the national maps as well.) This finding confirms the idea that observed agglomeration in this agricultural sector is correlated with value-added production generating significant economic rents.
We define economic rents as above-average sales prices and stud fees that owners can earn due to the genetic advantages certain horses have as racers, combined with expert training and management practices that allow these horses to perform at their maximum potential. Our definition of top-tier stallions is based on win-loss records, lifetime earnings, and the racing performance of progeny, all of which are likely to be correlated with stud fees. The top-tier categorization is therefore a proxy for economic rents. It follows that
Table 4’s rows for top-tier stallions support a hypothesis linking economic rents in the racehorse breeding industry to both static geographic concentration and to increases in concentration over time.
Additional information on economic rents may be found in the U.S. Census of Agriculture. The Census reports equine sales by state, and this figure can be averaged over the number of head sold. Agricultural Census data cover all breeds. We assume that, compared to Standardbreds, Thoroughbreds are numerous enough that their sale prices might affect the aggregate dollar sales reported in the Census of Agriculture. The Thoroughbred stallions and breeders in our sample might also benefit, via shared resources and knowledge-sharing, from economic rents earned by the much larger number of Census-enumerated horses that are sold near them, including all breeds, uses, and genders.
Figure 7 reports data on the top seven U.S. Thoroughbred states by number of standing stallions in 1995. The horizontal axis measures equine dollar sales per head across all breeds, as reported in the 2002 U.S. Census of Agriculture [
17]. The vertical axis measures the change in each state’s percentage share of the national count of Thoroughbred stallions between 1995 and 2017 (see
Table A1). The relationship between the two variables is positive and resembles a logarithmic function. Although the sample is quite small and Kentucky (far right) is a high-leverage observation,
Figure 7 supports a hypothesis that links change in geographic concentration to state-level economic rents earned throughout the equine industry. Data on sales prices and stud fees for Thoroughbreds alone would be the best way to measure rents in this figure, but those data are not currently available for states.
The final column of Table 4 depicts a finding that is especially interesting in light of modern cluster theory. It shows that in response to overall decline, and when adjusted for total head count, concentration across spatial units increased uniformly across both breeds, and at both of the geographic scales examined. Instead of a uniform percentage decline in all pre-existing jurisdictions, smaller states and counties disappeared from the map (literally, in the case of Figure 5 and Figure 6), while a small handful of jurisdictions with significant critical mass in the early year increased their share of stallions. If we restrict our attention to top-tier stallions, there is really only one winning jurisdiction for each breed.
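The mechanism described above, in which overall decline coincides with rising concentration because small jurisdictions drop out, can be illustrated with a simple share-based index. The sketch below uses the Herfindahl index (the sum of squared shares) purely as an assumed stand-in; it is not necessarily the concentration measure reported in Table 4, and the jurisdiction counts are hypothetical.

```python
def herfindahl(counts):
    """Sum of squared shares: 1/n for an even spread over n jurisdictions, 1 for a single one."""
    total = sum(counts.values())
    return sum((n / total) ** 2 for n in counts.values() if n > 0)


# Hypothetical stallion counts by jurisdiction (illustration only, not study data).
early = {"A": 50, "B": 30, "C": 10, "D": 10}   # 100 head spread over four jurisdictions
late = {"A": 40, "B": 10}                       # 50 head; C and D have left the map

hhi_early = herfindahl(early)
hhi_late = herfindahl(late)
print(f"early HHI = {hhi_early:.2f}, late HHI = {hhi_late:.2f}")
```

Because the index depends only on shares, it is unaffected by the fall in total head count. In this example it rises even though every jurisdiction loses animals, since the largest jurisdiction loses proportionally fewer than the rest.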
Having said that, state policies and subtle differences in relative attractiveness must have played a role in determining which specific jurisdictions would gain share. In the case of all Thoroughbreds, for example, Kentucky emerged in 2017 with most of the share (if not the actual animals) that had been ceded by all of the small states, while California did not. This is true even though California had a larger share of all Thoroughbreds than did Kentucky in 1995.
For Standardbreds, we see that Ohio emerged as more of a winner in 2017 than Indiana, in spite of their similar starting points in 2002. This may be attributable to the timing of the influx of revenue to horse racing from alternative gaming. Indiana opened its first racino in 2008, whereas Ohio opened its first racino more recently, in 2012, and it is possible that the peak in reinvestment is currently occurring in Ohio but has waned in Indiana [63,64]. Similarly, there were more Standardbred stallions standing in York County, Pennsylvania in 2017 than in Monmouth County, New Jersey, a past leader in Standardbred breeding. This is most likely because New Jersey lacks the lucrative purse structure and breeder incentive program of neighboring Pennsylvania [16,57].
An analysis of the relationship between state policies and geographic outcomes for breeding stallions can, of course, be more systematic and less anecdotal than the stories told immediately above. Figure 8 below shows the change in national rank for the number of Standardbred stallions, by the Standardbred policy category reported in Table 3, for the top ten Standardbred states in 2002. (The vertical axis is an inverted scale for change in rank, ensuring that Figure 8, Figure 9 and Figure 10 have the same interpretation for the success variables, lower on the graph being worse.) The change in national rank for each policy category, each containing exactly five states, is shown using box and whisker plots. In spite of the small sample size for each category, the means of the success variables for each policy category are different from each other according to a simple t-test: the p-value of this test is 0.04.
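For readers who want to see the mechanics of such a comparison, the sketch below computes a two-sample t statistic for two groups of five states. It assumes the equal-variance (Student's) form of the test, which the text does not specify, and the values are hypothetical changes in rank rather than the study's data. Converting the statistic to a p-value requires the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) could be used for that step.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group's variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_err, df


# Hypothetical changes in national rank for five states per policy category
# (illustration only, NOT the values plotted in Figure 8).
supportive = [2, 1, 3, 0, 4]
non_supportive = [-1, -3, 0, -2, -4]

t_stat, df = pooled_t_statistic(supportive, non_supportive)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

With only five observations per group, the result is sensitive to individual states, which is why the small sample size is flagged in the text.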
Figure 9 presents an analysis identical to Figure 8, except that the success variable is the change in national share of Standardbred stallions, rather than change in national rank. The means of the two policy categories are also significantly different from each other according to a t-test, with a p-value of 0.03. As stated in Section 2.3, we did not repeat Figure 8 and Figure 9 and the related t-tests for Thoroughbreds.
Figure 10 below shows the boxplot results for both breeds together. Total sample size is fifteen states that ranked high on head count for either breed (see Table 3). The changes in national share over the study periods, 1995–2017 for Thoroughbreds and 2002–2017 for Standardbreds, are pooled together in this figure. This means that an analysis of the difference between means in the success variable could be based on as many as 15 × 2 = 30 observations. More specifically, the S category in Figure 10 includes 18 observations, the MIX category includes four observations, and the NS category includes eight observations.
It makes most sense to conduct a t-test on the difference between the means of the S and NS categories in Figure 10, giving a total sample size of 26. This t-test of change in national share for the two breeds combined is significant with a p-value of 0.01. (As stated in note 4 for Table 3, removing the Standardbred observations for Washington and Oklahoma does not change this fundamental result, increasing the p-value by only 0.004.)
Figure 8, Figure 9 and Figure 10 and their associated t-tests allow us to conclude that the relationship between supportive state policies and success at retaining breeding stock is more than anecdotal. A more complete empirical analysis of this issue would require a more detailed dataset on state policies, including supplemental appropriations measured in dollars. For example, Neibergs and Thalheimer compared the effectiveness of supplements to non-restricted purses with that of local breeder subsidies, both of which could be classified as supportive [65]. One could also conduct the study using more states, although we would be concerned that the smaller states are idiosyncratic. Most U.S. states are not truly engaged in the racing industry at the “national level”: they may not engage in the interstate competition for breeding stock that is implied by the program impact analysis presented here.
4. Conclusions
In this study, we cited several references arguing that a thriving equine industry can generate side benefits related to landscape conservation, ecosystem services, and rural amenities. If a region wants to achieve this environmental definition of sustainability, however, its equine industry must be sustained.
Racehorses sit at the high-value end of the industry, so their presence requires a dense agglomeration of high-quality suppliers. For a number of reasons, racehorse breeding is likely to exhibit agglomerative behavior that benefits from increasing returns to scale. Virtuous cycles of growth or vicious cycles of decline may set in, and they can be independent of fixed place characteristics, like the quality of the local bluegrass. Registered stallions standing at stud are key assets in this industry: they can be used as a bellwether of cluster success or failure.
We have shown in this study that under conditions of secular decline, measures of overall concentration tend to increase. A previously large region can become a lower-risk refuge, and can significantly increase its share of stallion assets. If there is more than one jurisdiction that operated at scale before decline, however, the ultimate geographic winner will be highly contingent on state-level policies. Revenue streams for purses and breeder incentive programs appear to be significant determinants of which states became high-share regions for racing stallions under conditions of secular decline. Ohio and Pennsylvania are examples of a positive effect of state policy on the economic viability of Standardbred breeding operations, while Kentucky and New Jersey illustrate the downturn that can occur in states that lack sufficient revenue streams for purses and breeder incentive programs. For the most part, racehorse clusters in the U.S. are fragile and can only be sustained using programs that reward breeders financially on the basis of state of residence. The horse racing sector is worth saving nationwide, not only because of its long and prominent history in U.S. sport, but also because it is an economic engine of the entire U.S. horse industry and contributes to quality of life by maintaining agricultural working landscapes.
This study confirmed several hypotheses arising from industry cluster theory; most notably, a correlation between the level of geographic concentration of stallions and the magnitude of economic rents and returns to tacit knowledge, holding other things equal. The study has begun an interesting inquiry into a subject not often studied by cluster theorists: change in geographic concentration when an industry with more than one cluster at the national level begins to decline. It would be useful to contrast this industry with another declining industry, like automotive. There, too, state policy plays an important role in the observed distribution, as there have been fewer plant openings recently in union states than in states that have passed right-to-work laws. State policy makers may have more control over the three elements of the local TBL than they realize.