1. Introduction
Over the past decades, the increasing concerns of society and consumers regarding the wellbeing of farm animals have led to the development of a variety of animal welfare programmes and certification systems. In addition to governmental initiatives, these programmes and certification systems include private quality assurance systems, such as animal welfare labels and organic schemes initiated by producer organisations, retailers, and non-governmental organisations [1,2,3]. Consequently, the need to reliably assess the animal welfare status at farm level (e.g., by means of external audits) is becoming increasingly important.
However, animal welfare is a complex multidimensional concept for which no universal definition exists. The current understanding of animal welfare goes beyond the biological functioning and also includes behaviour and the emotional status of the animals [4]. For example, in the concept of the “five freedoms” [5], good welfare requires freedom from hunger and thirst, freedom from discomfort, freedom from pain, injury or disease, freedom to express normal and natural behaviours, and freedom from fear and stress.
Due to its complexity, animal welfare itself cannot be measured directly but must be reflected through a variety of measurements that represent the multidimensionality [6]. The measurements and indicators used to assess on-farm animal welfare are typically categorised as resource- or management-based parameters and animal-based parameters [7,8]. Resource-based parameters describe the environment that affects the animals, e.g., the type of housing or the supply of resources [8], and are thus related only indirectly to animal welfare [3]. In contrast, animal-based measurements (ABMs), e.g., body condition score or disease state, directly reflect the health and welfare status of the animals [8] as a result of husbandry and management practices [9]. ABMs are therefore widely accepted in welfare science [10,11,12] and have been integrated into many animal welfare assessment methods [13,14,15,16], including some for dairy cattle [14,17]. A disadvantage of on-farm ABMs is that recording them is time-consuming [8,9,18], so alternative indicators are required if larger numbers of farms are to be assessed. A promising approach is to complement or replace on-farm ABMs with data-based variables (DBVs). In this context, DBVs that are based on data or records collected directly from the animals are also referred to as indirect animal-based measurements [11]. Data routinely collected on the animals, e.g., records from identification and registration, data on milk yield and quality, and animal health parameters, are suitable for constructing DBVs to monitor animal welfare [9,19]. For dairy cows, an abundance of such data is available in European countries, as harmonised EU legislation provides for records on birth, movements, and death for each individual bovine animal [20]. However, to be suitable for animal welfare assessment, the DBVs must be easy to record, and a clear relationship between the DBV and the welfare status of the animals must be demonstrated [21].
With regard to dairy herd welfare, the ongoing interest in DBVs has led to numerous scientific publications investigating the suitability of DBVs as animal welfare indicators. In this review, we compile this work to provide an overview of the current state of research, pursuing two objectives. First, we describe the definitions of animal welfare used in 13 papers identified through a literature search and categorise the publications according to these definitions. Second, we review the selected papers and extract the associations between DBVs and the indicators of animal welfare at farm level to outline the potential of DBVs for monitoring animal welfare.
2. Design and Results of the Literature Search
The basis for this review was a systematic literature search for a project aimed at identifying and establishing DBVs for dairy welfare in Switzerland. For this purpose, five scientific literature databases (PubMed, Web of Science, Scopus, CAB Direct, and ScienceDirect) were screened in August 2019 and December 2020. For the searches, synonyms for ‘dairy cows’ were combined with different animal welfare and data-associated terms (Table 1).
The queries were mainly carried out as title-and-abstract searches; in the case of large numbers of hits (>100), only titles and keywords were searched instead. Furthermore, filters were used to exclude papers related to human health. Identified literature published since 1995 in German, English, or French was screened by the first author to check whether it met the two criteria for inclusion in the present review: first, the use of variables based on routine herd data and, second, the investigation of the relationships between these variables and dairy welfare at herd or farm level. In total, 13 papers met the criteria and were selected as a basis for the present review (Table 2).
To provide an overview of the papers, both the data sources used to construct DBVs and the definitions and methods used to determine the animal welfare status at farm level were described (Section 3). Subsequently, the papers were categorised according to the definition of welfare used, and the current state of research was reviewed for each welfare definition category (see Section 4, Section 5 and Section 6). If the methodology of the papers was comparable, the associations between DBVs and indicators of animal welfare were extracted from the papers and grouped by type of DBV. Consideration was given to whether the associations were based on univariable or multivariable analyses. The strengths of the associations were not included in this review because the studies differed in terms of their conditions and analyses. In this review, the names of the specific DBVs and their variants are written in italics to distinguish them from data sources and umbrella terms that may include several related DBVs.
3. Overview of the Origin of Data-Based Variables and the Animal Welfare Definitions Used
The 13 selected publications used different data sources to extract and calculate suitable potential DBVs (Table 3). Overall, the databases consisting of identification and registration records were most frequently used as a source of DBVs, whereas disease and treatment data as well as records on herd disease status and reasons for culling were rarely included in the analyses. In total, more than 150 variables were found from such databases and examined as possible indicators of animal welfare.
Although all 13 selected papers examined links between DBVs and animal welfare and thus pursued the same objective, they differed in the definition and assessment of the animal welfare status at farm level. For background information on the assessment of animal welfare at farm level, see Box 1.
Three of the papers defined poor welfare as violations of legal animal welfare requirements (see Section 4). Five studies examined the relationships between DBVs and ABMs (see Section 5). Finally, seven publications, including two also included in Section 5, investigated the relationship between DBVs and animal welfare scores composed of several ABMs, e.g., Welfare Quality (WQ) criteria and principles, the area scores of the Centro di Referenza Nazionale per il Benessere Animale (CReNBA) welfare protocol, and overall scores (see Section 6).
Box 1. How can animal welfare be assessed at farm level?
The complexity of animal welfare and the lack of a universal definition make it difficult to assess the animal welfare status, especially at farm level. Over the past decades, various assessment methods have been developed, e.g., the ‘animal needs index’ in Austria [35], the ‘Centro di Referenza Nazionale per il Benessere Animale’ (CReNBA) protocol in Italy [36], the ‘Kuratorium für Technik und Bauwesen in der Landwirtschaft e. V.’ (KTBL) animal welfare protocol in Germany [37], and the Welfare Quality (WQ) protocol as a European approach [17]. At the present time, none of these assessment protocols can serve as a ‘gold standard’ for surveying animal welfare, but the WQ protocol is the most extensive approach [38].
The welfare assessments differ mainly in the number and type of measurements used. For example, the ‘animal needs index’ assesses available resources, whereas the WQ protocol focuses on animal-based measurements (ABMs) and is only complemented by resource- and management-based measurements when ABMs are not available. Further differences occur in whether and how the results of specific measurements and indicators are combined (and often weighted) to quantify the animal welfare status at farm level. In the KTBL animal welfare protocol, for example, the different indicators are evaluated separately. In contrast, other assessment protocols such as the CReNBA protocol and the WQ protocol aim to combine the results of the measurements into an overall score.
For the CReNBA animal welfare assessment, measurements from four areas are carried out: farm management and personnel, facilities and equipment, animal-related measures, and microclimatic environmental conditions and alarm systems. To determine the numeric overall score, the results of the different measurements are weighted according to their relevance and summed up. Consequently, the farm-level welfare status identified by the CReNBA protocol can be presented using area scores as well as an overall score.
The WQ protocol provides a hierarchical integration process in which, in a first step, the specific measurements (mostly ABMs) are grouped into 12 criteria according to their interrelationships. Within the criteria, the measurements are weighted according to relevance and can partially compensate each other. The 12 criteria themselves are designed to cover all relevant aspects of the WQ animal welfare definition and are combined, again weighted, into four principle scores: ‘good feeding’, ‘good housing’, ‘good health’, and ‘appropriate behaviour’. From these principle scores, an overall score can be calculated that classifies the welfare status at farm level as ‘excellent’, ‘improved’, ‘acceptable’, or ‘not classified’. Thus, with the WQ protocol, the welfare at farm level can be presented using data of the raw measurements, increasingly aggregated levels of criteria and principles, and an overall score.
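The step from criteria to principles can be illustrated with a small Python sketch. It is a simplification under stated assumptions: plain weighted averages stand in for the aggregation rules of the actual protocol, only four of the 12 criteria are shown, and all criterion names, scores, and weights are hypothetical.

```python
def weighted_mean(scores, weights):
    """Weighted average of the scores whose names appear in `weights`."""
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight

# Hypothetical criterion scores (0-100) for one farm.
criterion_scores = {
    "absence_of_injuries": 60.0,
    "absence_of_disease": 40.0,
    "absence_of_prolonged_hunger": 80.0,
    "absence_of_prolonged_thirst": 100.0,
}

# Hypothetical weights that group criteria into principles.
principle_weights = {
    "good_health": {"absence_of_injuries": 1.0, "absence_of_disease": 3.0},
    "good_feeding": {
        "absence_of_prolonged_hunger": 1.0,
        "absence_of_prolonged_thirst": 1.0,
    },
}

# Aggregate criteria into one score per principle.
principle_scores = {
    principle: weighted_mean(criterion_scores, weights)
    for principle, weights in principle_weights.items()
}
print(principle_scores)
```

Because a weighted average lets a high score offset a low one, the sketch also shows in a simple way how measurements within an aggregated level can partially compensate each other.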
4. Data-Based Variables as Predictors of Farms Violating Animal Welfare
In this section, the focus is on the three papers that stand out by defining poor animal welfare as the presence of violations of animal welfare legislation. Among these papers, two Irish studies investigated whether there are similarities in DBVs in herds with officially confirmed animal welfare violations [22,23], whereas a Danish study analysed correlations between DBVs and two common animal welfare violations [24].
Kelly et al. [22,23] focused on the identification and validation of DBVs as indicators of animal welfare violations. In both studies, 18 cattle herds, including five dairy herds with officially confirmed welfare incidents, were used as case herds. In this context, animal welfare incidents were defined as situations in which the responsible person inflicted avoidable pain or suffering on the animals or did not act appropriately to prevent it, as well as situations where there was no rapid response to animals in pain or suffering. For the first study [22], six DBVs that had been selected during previous work and by expert opinion were calculated on an annual basis over a 4- to 9-year period that included the animal welfare violation. The analyses investigated whether the case herds showed any parallels in terms of performance in the DBVs. The four variables late registration of calves, on-farm burial of carcasses, increasing number of movements to knackeries, and movements to unknown herds were prominent in the case farms. In contrast, no patterns emerged for the DBVs changes in herd size and number of calves registered per cow and year. The follow-up study [23] aimed to validate the four variables identified in the previous study and the additional variable movements to factories or abattoirs by comparing the distribution of the DBVs between the case herds with confirmed welfare violations and the remaining Irish herds. For all five variables, this study revealed a significant difference in distribution between these two groups. Furthermore, the DBVs were tested alone and in combination at different cut-offs for their ability to distinguish between herds with and without violations of animal welfare. Because of the low sensitivities and specificities, the authors did not consider any of the DBVs or sets to be applicable to identify farms with violations of animal welfare.
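Testing a DBV at different cut-offs amounts to flagging every herd at or above the cut-off and comparing the flags with the known case status. The following Python sketch illustrates that calculation; the herd values, the case labels, and the cut-offs are invented for illustration and are not taken from the studies.

```python
def sensitivity_specificity(values, is_case, cutoff):
    """Flag herds with a DBV value >= cutoff and compare with case status."""
    tp = fp = tn = fn = 0
    for value, case in zip(values, is_case):
        flagged = value >= cutoff
        if flagged and case:
            tp += 1
        elif flagged and not case:
            fp += 1
        elif not flagged and case:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical share of late-registered calves per herd and case status.
late_registration = [0.02, 0.10, 0.35, 0.05, 0.40, 0.01]
case_herd = [False, False, True, False, True, True]

for cutoff in (0.05, 0.20, 0.50):
    sens, spec = sensitivity_specificity(late_registration, case_herd, cutoff)
    print(f"cut-off {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Raising the cut-off typically trades sensitivity for specificity, which is why a DBV is usually evaluated over a range of cut-offs rather than at a single value.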
A further approach by Otten et al. [24] pursued the same objective, but the authors limited animal welfare violations to the two most commonly found in Danish animal welfare inspections: the presence of sick or injured animals not kept in sick pens and the presence of animals in a condition requiring euthanasia. Out of 73 farms visited, 23 were classified as case herds because they met at least one of the two violations. For this study, 25 variables were examined for their association with animal welfare violations. In a univariable analysis, associations were found for five variables at a statistical significance level of α = 0.2: yield for first lactation cows, standard deviation (SD) of milk yield for first lactation cows, SD of milk yield for second lactation cows, number of veterinary treatments per 100 cow years, and the number of abattoir remarks. By backward elimination, a final multivariable model for predicting herds with violations of animal welfare was obtained consisting of three variables: increasing SD of milk yield for first lactation cows, high bulk milk somatic cell count (≥250,000 cells per millilitre), and a suspiciously small number of veterinary treatments (≤25 treatments per 100 cow years).
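The two thresholds reported for the final model can be written as a simple herd check. The Python sketch below is not the fitted multivariable model, whose coefficients are not given here; it only lists which of the three reported risk conditions a herd meets. The function name and the way the increasing SD is passed in (as a yes/no flag) are assumptions made for illustration.

```python
def risk_conditions(sd_first_lactation_increasing,
                    bmscc_cells_per_ml,
                    treatments_per_100_cow_years):
    """Return the reported risk conditions that apply to one herd."""
    conditions = []
    if sd_first_lactation_increasing:
        conditions.append("increasing SD of milk yield (first lactation)")
    if bmscc_cells_per_ml >= 250_000:  # threshold reported in the text
        conditions.append("high BMSCC")
    if treatments_per_100_cow_years <= 25:  # threshold reported in the text
        conditions.append("few veterinary treatments")
    return conditions

# Hypothetical herd: stable SD, elevated cell count, very few treatments.
print(risk_conditions(False, 300_000, 10))
```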
In summary, the Irish and Danish approaches differ fundamentally in the DBVs used, the methodology, and the definition of animal welfare violations. Nevertheless, all three publications emphasise that DBVs have the potential to contribute to animal welfare monitoring in the future. The three studies presented provide a first evaluation of DBVs as predictors of animal welfare violations, which needs to be complemented by future research. As no evaluation of specific DBVs is possible at this stage, we suggest for future research the inclusion of a broad set of clearly defined DBVs. Additionally, to correctly interpret the results, the specific animal welfare violations investigated should be outlined. Whereas the application of DBVs as predictors of welfare violations requires a validation in the specific reference population, a broad set of studies in different countries could provide important information on underlying correlations between DBVs and animal welfare of dairy farms.
6. Relationships between Data-Based Variables and Welfare Quality Criteria and Principles, CReNBA Areas, and Overall Scores
In this section, all identified papers are reviewed that examine the relationship of DBVs with WQ criteria and principles, CReNBA areas, and overall scores (see Box 1) or that aim to predict farm-level welfare expressed in these scores. Three of the 13 identified publications examined one or a few related DBVs for their association with dairy herd welfare [32,33,34].
Coignard et al. [34] investigated the relationship between milk yield and dairy welfare at farm level. For 125 French dairy farms whose animal welfare status was surveyed with the WQ protocol, individual-cow milk yields were determined for 30 days before and after the on-farm assessment. The associations of the milk yield with the WQ criteria, the WQ principles, and the WQ overall score were analysed using linear mixed models. At the criteria level, the occurrence of agonistic behaviour and a poor emotional state, i.e., lower values in the Qualitative Behaviour Assessment of the WQ, were associated with lower milk yield. Herds that scored worse in the criteria ‘absence of injury’ and ‘absence of disease’ had a higher milk yield. The principle ‘good health’, which is made up of these two criteria, was thus negatively linked to milk yield. However, no significant association between milk yield and the overall score could be shown.
The relationship between dairy herd welfare and the reproductive parameters calving rate and calving to first service interval (CFSI) was studied by Grimard et al. [33]. The welfare of 124 dairy herds was assessed using the WQ protocol, and the relationship with the two parameters was checked at the level of WQ criteria, WQ principles, and WQ overall score. The overall score showed a significant association with CFSI: the higher the overall score, the shorter the CFSI. The calving rate, in contrast, was not linked to overall welfare but showed a positive relationship to the principle ‘good housing’.
Ginestreti et al. [32] studied whether herd welfare status could be predicted using the bulk milk parameters bulk milk somatic cell count (BMSCC), total bacterial count, and fat, protein, and urea contents. For 287 farms, the results of routine on-farm welfare assessments with the CReNBA protocol were obtained from the Italian animal welfare database. This welfare assessment provided an overall score consisting of three areas: management and stock training, housing, and ABMs. The examination of the relationships between the bulk milk parameters and overall welfare or the area scores did not reveal a significant relationship between animal welfare and milk fat content. All other DBVs showed only weak relationships. Consequently, the authors assigned a very limited predictive value to data gained by bulk milk analysis.
Finally, the objective of four studies [25,26,30,31], including two already mentioned in Section 5 [25,31], was to form sets of DBVs that could predict the farm-level welfare status and thus be suitable as predictive models. In a study by Sandgren et al. [25], 13 of 55 visited herds were considered to have poor animal welfare, because they scored among the worst 10% in at least two of nine assessed ABMs. Eighteen DBVs were found to be associated with these ABMs in a multivariable analysis. By means of a systematic selection process, the authors recognised three DBVs that, in combination, were suitable for identifying herds with poor animal welfare status: cows with late ongoing artificial insemination, late-bred heifers, and calf mortality. This predictive model correctly classified 77% of the herds with poor welfare, with a sensitivity of 62%. Three further models, including cow mortality and young stock mortality, classified 76% of these herds correctly, with a sensitivity of 77%.
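The case definition used here (a herd counts as having poor welfare if it is among the worst 10% in at least two ABMs) can be sketched in Python as follows. The sketch assumes that higher ABM values are worse, takes the worst 10% as the top tenth of herds after sorting (ties are not treated specially), and uses invented ABM names and values; none of these details come from the study.

```python
def worst_decile_herds(values):
    """Indices of the herds in the worst 10% for one ABM (higher = worse)."""
    n_worst = max(1, len(values) // 10)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:n_worst])

def classify_poor_welfare(abm_table, min_abms=2):
    """abm_table maps ABM name -> herd-level values in the same herd order."""
    n_herds = len(next(iter(abm_table.values())))
    counts = [0] * n_herds
    for values in abm_table.values():
        for herd in worst_decile_herds(values):
            counts[herd] += 1
    # A herd is a case if it is in the worst decile for enough ABMs.
    return [count >= min_abms for count in counts]

# Hypothetical data: ten herds, three ABMs (percentages of affected cows).
abm_table = {
    "lameness_pct": [5, 8, 30, 4, 6, 7, 3, 9, 2, 10],
    "dirty_cows_pct": [10, 12, 50, 11, 9, 8, 14, 13, 7, 6],
    "thin_cows_pct": [2, 3, 4, 20, 1, 5, 6, 2, 3, 4],
}
poor = classify_poor_welfare(abm_table)
print([herd for herd, is_poor in enumerate(poor) if is_poor])
```

A herd that stands out in only one ABM, such as the herd with the highest share of thin cows in this example, is not classified as a case under this rule.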
Based on the aforementioned approach, Nyman et al. [26] investigated whether a set of DBVs could reliably identify dairy herds with good welfare. Among the 55 visited herds, 28 were classified as having good welfare, because they were not among the worst 10% of farms in any of the nine selected ABMs. Statistical analysis yielded six DBVs that together correctly classified 96% of the herds with good welfare: cows with late ongoing artificial insemination, late-bred heifers, cow mortality, stillbirth rate, mastitis incidence, and incidence of feed-related diseases. Nyman et al. [26] additionally suggested combining the developed predictive model with the model of Sandgren et al. [25] to allow a refined classification into farms with presumed good welfare, farms with presumed poor welfare, and farms that could not be classified.
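The suggested three-way classification can be expressed as a small decision rule over the outputs of the two models. The Python sketch below assumes that each model returns a yes/no prediction per herd and that herds flagged by both models or by neither are left unclassified; the handling of such conflicting results is an assumption made for illustration, not a rule taken from the studies.

```python
def combined_class(predicted_good, predicted_poor):
    """Combine a good-welfare model and a poor-welfare model for one herd."""
    if predicted_good and not predicted_poor:
        return "presumed good welfare"
    if predicted_poor and not predicted_good:
        return "presumed poor welfare"
    # Both or neither model flags the herd: leave it unclassified.
    return "not classified"

for good, poor in [(True, False), (False, True), (True, True), (False, False)]:
    print(good, poor, "->", combined_class(good, poor))
```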
Krug et al. [30] used DBVs from the Portuguese database for bovine identification and registration to build a model to detect dairy herds with poor welfare. The welfare status of 24 herds was assessed using the WQ protocol, resulting in five herds having poor welfare according to the WQ overall score. Of the 15 DBVs examined, the proportion of on-farm deaths and the ratio of female to male births differed significantly between herds with good and poor welfare status in the univariable analysis. In addition, data mining was used to detect the best performing group of DBVs for the detection of herds with poor animal welfare status. This resulted in a classification tree model with a sensitivity of 70.0% and specificity of 78.9%, based on the variables on-farm deaths and proportion of calving intervals >430 days.
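A DBV such as the proportion of calving intervals >430 days can be derived directly from the calving dates held in an identification and registration database. The Python sketch below shows one possible calculation; the exact definition used in the study is not given here, so pooling all intervals of all cows in a herd, as well as the cow identifiers and dates, are assumptions made for illustration.

```python
from datetime import date

def proportion_long_calving_intervals(calvings_by_cow, threshold_days=430):
    """Share of calving intervals in a herd that exceed `threshold_days`."""
    intervals = []
    for dates in calvings_by_cow.values():
        ordered = sorted(dates)
        # Interval between each pair of consecutive calvings of the same cow.
        intervals.extend(
            (later - earlier).days for earlier, later in zip(ordered, ordered[1:])
        )
    if not intervals:
        return None  # no cow with at least two recorded calvings
    return sum(days > threshold_days for days in intervals) / len(intervals)

# Hypothetical calving records for one herd.
calvings = {
    "cow_A": [date(2018, 1, 10), date(2019, 1, 20), date(2020, 6, 1)],
    "cow_B": [date(2018, 3, 1), date(2019, 3, 15)],
    "cow_C": [date(2019, 5, 5)],
}
print(proportion_long_calving_intervals(calvings))
```

Cows with a single recorded calving contribute no interval, so the denominator is the number of intervals rather than the number of cows.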
Otten et al. [31] aimed to use DBVs to predict the animal welfare status at farm level by means of an animal-based index. The index consisted of 12 aggregated, weighted ABMs and was applied on 73 farms. Similarly, 21 DBVs were aggregated linearly into a weighted data-based index, considering three time periods (covering 90, 180, and 365 days before the farm visit). Of these three calculation periods, only the 180-day data-based index was significantly related to the animal-based index, providing only poor model fit and predictive value. It was concluded that DBVs can give a first indication of the welfare status of a farm. However, to assess the real welfare status, the authors still considered an on-farm welfare assessment to be necessary.
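The two building blocks of such a data-based index, counting events in a period before the farm visit and combining DBVs linearly with weights, can be sketched in Python. The event type, dates, DBV names, and weights below are hypothetical, and the half-open window (visit day excluded) is an assumption; the sketch does not reproduce the 21 DBVs or the weights of the study.

```python
from datetime import date, timedelta

def events_in_window(event_dates, visit_date, window_days):
    """Count events in the `window_days` days before the visit (visit day excluded)."""
    start = visit_date - timedelta(days=window_days)
    return sum(start <= d < visit_date for d in event_dates)

def data_based_index(dbv_values, weights):
    """Linear, weighted aggregation of DBVs into a single index."""
    return sum(weights[name] * dbv_values[name] for name in weights)

visit = date(2020, 6, 30)
# Hypothetical dates of on-farm deaths in one herd.
death_dates = [date(2020, 6, 1), date(2020, 2, 15), date(2019, 9, 1), date(2019, 5, 1)]

# The same raw records yield a different DBV value for each calculation period.
counts = {days: events_in_window(death_dates, visit, days) for days in (90, 180, 365)}
print(counts)

# Hypothetical weights for two DBVs computed over the 180-day period.
weights = {"deaths": 2.0, "treatments": 0.5}
index_180 = data_based_index({"deaths": counts[180], "treatments": 10}, weights)
print(index_180)
```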
In summary, some DBVs, including milk yield and the fertility parameter CFSI, showed associations with WQ criteria, WQ principles, or the WQ overall score. For the bulk milk parameters, on the other hand, only weak associations with the animal welfare areas of the CReNBA protocol were found. One should consider that criteria, principles, and overall scores are based on measurements that are aggregated and weighted according to the animal welfare definitions of the protocol used and partially compensate each other. Thus, the identified associations are dependent on the given welfare definition and might not be confirmed if a different definition of welfare were applied. Moreover, because these assessments can only provide an approximation of the actual animal welfare status, the associations must be interpreted with caution. With regard to the prediction of case herds, i.e., herds with poor or good animal welfare, three of the studies [25,26,30] provided predictive models formed by DBVs that achieved sufficient sensitivities and specificities to give a first impression of farm-level welfare status. In contrast, one publication [31] showed only limited correlations between a data-based index and an animal-based welfare assessment. It should be noted that the predictive models were built using statistical methods, applying selection processes that included those DBVs in the model that best identified the case herds of the study populations. The predictive sets in the presented studies were not applied to additional data sets and consequently have not been validated. This makes the models highly dependent on the conditions under which they were designed, i.e., the influencing factors and dairy farms used as the study population. Nevertheless, the authors of the presented studies considered DBVs as useful tools to give first evidence on the animal welfare status of farms. However, to obtain the actual welfare status, the results of the predictive models would need to be confirmed by comprehensive on-farm surveys.
7. Conclusions and Implications for Future Animal Welfare Monitoring
In summary, we could identify relatively few studies focusing on the relationships between DBVs and dairy welfare at farm level. The 13 studies reviewed here investigated DBVs that were similar with regard to the raw data collected, but the studies differed in how animal welfare was defined and surveyed. To describe the farm-level welfare status, either the presence of welfare violations was considered or the welfare of dairy cows was surveyed at the level of ABMs, welfare overall scores, or scores for areas of multidimensional welfare definitions. Three studies that investigated the suitability of DBVs to detect cattle farms violating animal welfare suggested that DBVs may have the potential to contribute to animal welfare monitoring. The studies examining sets of DBVs to predict dairy farms with good or poor welfare status mostly showed the suitability of DBVs for this purpose. Nevertheless, comprehensive on-farm surveys are necessary to determine the actual animal welfare status. In addition, several DBVs were related to scores of welfare assessments such as WQ criteria, CReNBA areas, and overall scores. The evaluation of relationships between DBVs and specific ABMs, such as lameness or body condition, yielded a large number of associations. In this context, DBVs based on mortality were particularly frequently associated with different ABMs. Owing to varying calculations and the consideration of different age or performance groups, a large number of DBV variants were examined. Together with the sometimes missing definitions of the variables used, this limited the comparability of the studies. This may be a reason why repeated associations of specific DBVs with ABMs were rare.
Overall, the literature included in this review indicates a wide range of potential applications for DBVs. However, the limited number of studies and lack of validation of DBVs necessitate further research to fully assess the value of DBVs for animal welfare monitoring. To account for the multidimensional nature of animal welfare, comprehensive sets of ABMs should be used. In addition, the use of validated and widely accepted ABMs could increase comparability between studies. Future work may be based on the variables examined so far, using comparable DBVs if possible, and include information on their definitions and calculation. The investigation of, as yet, rarely used data sources, such as reports from slaughter examinations, could provide additional DBVs with the potential for monitoring the welfare of dairy cows.