A common feature of the studies mentioned [6,7,22] is the high proportion of false-positive alarms (also referred to as the error rate). This value was 0.99 for mastitis and 0.89 for lameness treatments in the results from Miekley et al. [22]. The study of Steeneveld et al. [7] included 52 true-positive and 3636 false-positive alarms, which led to an error rate of approximately 0.99. The PPV of 0.07 in our own study, which corresponds to an error rate of 0.93, is due to the low frequency of days with treatment in the test data, which was 3.5% and 5.4% for mastitis and lameness treatments, respectively. In other studies, the ratio of treated to non-treated cows was artificially increased, i.e., the data were sampled, e.g., by pairing cows [5], by excluding unclear cases of lameness [24,25], or by considering shorter periods before treatment [2,4]. From the perspective of the developers of these models, these measures are justified because sensitivity and specificity are hardly influenced by the frequency of occurrence of the target trait. However, these results do not reflect the situation of the application on a practical farm. Furthermore, Post et al. [17] showed that different up- and downsampling methods for balancing the training data had no influence when the models were applied to unknown, realistic data. Thus, it becomes clear that, even with sufficient model quality, the frequency of occurrence of the event to be predicted substantially influences the magnitude of the predictive values and thus the share of animals reported as positive. As known from medicine and other fields [8,9], applying test procedures and, accordingly, algorithms to groups in which the risk of the event to be predicted is higher improves the ratio between correct and false-positive reports, i.e., the PPV becomes higher. For the risk groups analyzed, the question arises whether this improves the ratio to such an extent that an implementation of this approach can be recommended.
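The dependence of the PPV on the frequency of occurrence can be illustrated with a minimal Python sketch based on Bayes' theorem. The sensitivity and specificity of 0.8 are hypothetical values chosen only for illustration and are not results of this study; the function names are likewise illustrative. The frequencies of 3.5% (all cows) and 13.2% (RGtreat-SC/SL, see Section 4.2.1) and the alarm counts of Steeneveld et al. [7] are taken from the text.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def error_rate(true_pos: int, false_pos: int) -> float:
    """Share of false-positive alarms among all alarms, i.e., 1 - PPV."""
    return false_pos / (true_pos + false_pos)


# Hypothetical model quality, NOT taken from the study.
SENSITIVITY = 0.8
SPECIFICITY = 0.8

# Frequencies of days with mastitis treatment reported in the text:
# 3.5% for all cows, 13.2% for cows with a previous treatment in the same lactation.
for prevalence in (0.035, 0.132):
    value = ppv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV {value:.2f}")

# Alarm counts reported for Steeneveld et al. [7].
print(f"error rate: {error_rate(52, 3636):.2f}")
```

With sensitivity and specificity held fixed, the PPV in this sketch rises only because the prevalence rises, which is the effect the risk groups are intended to exploit.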
4.2.1. Classification of Cows with a Previous Treatment (RGtreat)
Our own results have shown that cows with a mastitis or lameness treatment have a higher chance of needing to be treated again in the next lactation (RGtreat-SC/PL) or at a later stage of the same lactation (RGtreat-SC/SL). In other studies, an increased risk of further mastitis was found in cows that had already been infected with the cow-associated pathogen Staphylococcus aureus [26], as well as in cows with a past infection with the environmentally associated pathogens Streptococcus uberis [26] and Escherichia coli [27]; in the latter study, approx. 13% of all E. coli infections were preceded by an infection in the same udder quarter. In [11], the odds ratio of mastitis was found to be up to 5.9 if at least one previous treatment was given in the current lactation. By narrowing the data to cows with a previous treatment in the same lactation, the frequency of occurrence in our own study was increased from 3.5% to 13.2% of days with treatment. This means a 3.7-fold higher risk of mastitis for this group. Another study found an odds ratio of 4.15 for mastitis incidence in the first 120 days of lactation for cows with previous clinical mastitis [12]. These results suggest that previously affected cows or udder quarters are more likely to develop mastitis again [26,27]. However, in another study by Hammer et al. [28], no statistical correlation between the risk of mastitis and previous treatments that dated back more than 30 days was found in 245 cases of mastitis.
An increased risk of subsequent treatments in the following lactation was also found by other authors. In a study of 402 cows, cows treated in the previous lactation were found to be 1.7 times more likely to develop subclinical mastitis in the first 60 days of the next lactation [29]. When restricted to animals treated in the last 60 days of the previous lactation, the risk there increased 4.9-fold. Another study with data from 350 Norwegian dairy herds and a total of 6046 cows in their second lactation [14] showed an increased risk (1.5-fold) when mastitis treatment was given in the first lactation. Limiting the data to animals with mastitis treatment in the previous lactation (RGtreat-SC/PL) achieved a 2-fold increase in risk to 7.1% in our own study. This shows that cows with mastitis treatment also carry a higher risk into the next lactation due to individual susceptibility to pathogens or the persistence of a subclinical infection over the dry period [11]. However, compared to follow-up treatments within one lactation, this risk is reduced to some extent by the possibility of udder healing in the dry period through appropriate therapies [30].
The risk that a cow will need to be treated again was also elevated for lameness treatments for both RGtreat-SC/PL and RGtreat-SC/SL. In another study with 600 cows over 44 months, a wide range of odds ratios between 2.5 and 23 was found across all types of lameness diagnoses for the probability of a cow needing re-treatment [13]. A different study of over 7600 cows from 23 dairy farms found significant positive effects of prior lameness treatment on the risk of claw horn disruption lesions both at dry-off (2.5 times higher risk) and in the next lactation (twice the risk) [31]. In other studies, this association has also been established for treatments for sole ulcers, white line defects, and digital dermatitis [13,32]. This is the case when treatment of the clinical symptoms does not address the underlying cause sufficiently, e.g., a thinned digital cushion [13].
The AUCs of RGtreat-SC/PL and RGtreat-OC/SL did not differ significantly from those of the models applied to all cows. Only the AUCs after application in RGtreat-SC/SL showed significantly lower values in both treatment categories. This was due to the combination of low numbers of cow-days in the corresponding test data (see Appendix A Table A3) and the restriction of the test data to a subgroup with a different distribution of features for days with and without treatment than in the whole test data. This introduces a sampling bias into the classification, which has a negative effect on AUC, especially in small data sets [33,34]. At the same time, the risk of repeated treatment for mastitis or lameness was highest in RGtreat-SC/SL. Accordingly, the PPVs in RGtreat-SC/SL were significantly higher than in the other groups. This means that this group has the greatest potential for reducing false positives compared to the other RGs, yet the PPVs were not in a range satisfactory for practical use, with 0.20 for mastitis and 0.15 for lameness treatments.
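To make the practical meaning of these PPVs more tangible, they can be converted into the expected number of false alarms per correct alarm, (1 − PPV)/PPV. The following is a minimal Python sketch; the function name is illustrative, and the PPVs are the values reported in the text (0.07 for all cows, 0.20 and 0.15 for RGtreat-SC/SL).

```python
def false_alarms_per_true_alarm(ppv: float) -> float:
    """Expected number of false-positive alarms for each true-positive alarm."""
    if not 0.0 < ppv <= 1.0:
        raise ValueError("ppv must be in the interval (0, 1]")
    return (1.0 - ppv) / ppv


# PPVs reported in the text.
reported_ppvs = {
    "all cows": 0.07,
    "RGtreat-SC/SL, mastitis": 0.20,
    "RGtreat-SC/SL, lameness": 0.15,
}

for group, value in reported_ppvs.items():
    ratio = false_alarms_per_true_alarm(value)
    print(f"{group}: {ratio:.1f} false alarms per correct alarm")
```

Even at a PPV of 0.20, a farmer would still have to check about four falsely reported cows for every correctly reported one.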
RGtreat-OC/SL narrowed the data down to cows that had already undergone a different treatment in the same lactation. A study on genetic correlations found a comparatively low correlation of 0.32 ± 0.07 between the occurrence of mastitis from lactation days −10 to 50 and other treatments (fertility disorders, metabolic diseases, and lameness) in the period up to 100 DIM [35]. Another study by Hossein-Zadeh and Ardalan [12] found odds ratios for clinical mastitis in the first 120 DIM of 57,300 Holstein cows of 9.45 with previous retained placenta and 12.36 with previous milk fever. The association between retained placenta and later clinical mastitis had previously been quantified in [36], with a 1.5-fold higher risk for mild and a 5.4-fold higher risk for severe mastitis. Acidosis can act as a trigger for laminitis, which then develops into lameness [37]. A study by Berge and Vertenten [38] with 131 Dutch farms found odds ratios for cows with previous ketosis of 1.9 for mastitis treatments and 1.7 for lameness treatments in the rest of the lactation. The authors of [39] found a significant doubling of the frequency of interdigital dermatitis in cows with previous endometritis in the same lactation based on data from 2109 lactations, but the data showed no correlation between other previous diseases and mastitis. In our own results, limiting the animals to those treated for diseases from other disease categories (RGtreat-OC/SL) caused only a small increase in the frequency of occurrence of mastitis and lameness treatments and consequently no higher PPVs (with otherwise comparable AUC values). Since the cows remained in this risk group for the remainder of the lactation, the effects of these pre-treatments are too small in relation to the total data at the daily level.
4.2.2. Classification of Cows with Increased SCC After Milk Recording (RG-SCC)
Several studies have investigated the association between increased SCC in MR and the subsequent occurrence of mastitis. In Whist and Østerås [14], a 1.9-fold higher risk of clinical mastitis was found for SCC > 200,000 cells/mL in the first MR after calving. The authors also found a 1.7-fold higher risk of developing mastitis in the second lactation when the geometric mean of the last three MR cell counts before the second calving was between 400,000 and 800,000 [14]. In a study by Steeneveld et al. [15], the relationship of the previous month’s SCC and of the geometric mean of all MR test days of the previous lactation with mastitis treatments was examined using data from almost 40,000 cows and 8500 mastitis cases. The significant odds ratios here were 1.33 and 1.15 for elevated SCC (> 200,000) in the preceding MR and on average in the previous lactation, respectively, which signaled a slightly increased risk of a subsequent mastitis treatment.
RG-SCC showed a comparable AUC as an indicator of model quality but a significantly lower PPV than RGtreat-SC/SL. The reason for this is that, whereas clinical symptoms were present at the time of a previous treatment, an SCC of > 100,000, and thus a risk after MR, is not necessarily associated with clinical symptoms, and therefore no treatment is performed. Thus, limiting the data to this risk group and applying the classification algorithms would not add any value beyond the lists of animals that are already conspicuous in MR with regard to udder health.
4.2.3. Classification of Cows in Early Lactation (RGtime-100)
Only cow-days within the first 100 DIM were classified in this last risk group. It is known that treatments for mastitis are more common in early lactation [12,40]. The study by Hammer et al. [28] found in 245 cows that the odds ratio in cows over 100 DIM dropped to only 0.3 compared to the reference group between 10 and 20 DIM. However, this odds ratio was also only 0.4 between 30 and 100 DIM. The odds ratio for clinical mastitis decreased after the first month of lactation [15], but after the first three months (after about 100 DIM) the odds ratio was still 1.9 for primiparous and 3.6 for multiparous cows, with lactation month 8 and higher as the reference. In terms of lameness treatments, a study of 2100 cows over three years found that these were most common between 61 and 150 DIM and least common between 16 and 60 DIM [41]. However, these data are from only one farm, so a farm effect cannot be excluded.
In our own study, restricting the data to animals in the first 100 DIM did not lead to an increase in the frequency of occurrence and thus had no effect on the PPV. The cited studies often reported a higher risk in shorter time windows after calving. This was also investigated in our own study (60 DIM) but did not lead to any change in the frequency of treatments. In line with the findings from the other risk groups, this narrowing of the data set also did not lead to any improvement in predictive values or false alarms.