Proceeding Paper

Cattle Disease Prediction Using Machine Learning Algorithms †

Muneeb Ahmed *, Sabeen Javaid and Sudin Saepudin
1 Department of Software Engineering, University of Sialkot, Sialkot 51040, Pakistan
2 Information Systems Study Program, Faculty of Engineering and Design, Nusa Putra University, Sukabumi 43152, West Java, Indonesia
* Author to whom correspondence should be addressed.
† Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.
Eng. Proc. 2025, 107(1), 85; https://doi.org/10.3390/engproc2025107085
Published: 1 September 2025

Abstract

This paper assesses the prevalence, diagnosis, and management of common cattle diseases across several countries and regions, including Jordan, Pakistan, Uganda, Korea, Bangladesh, and Europe. The dataset comprises 163 cases and 123 detailed symptoms, from which symptom patterns are identified with high accuracy; the accuracy achieved on the dataset is 98%. The main diseases in cattle include digestive disorders, osteodystrophy, tick-borne diseases, and lumpy skin disease. Two types of diagnostic tools are considered: innovative tools such as fuzzy logic models, and a diagnostic decision support tool (DDST) that supports disease detection and management. The findings demonstrate the importance of accurate diagnosis for vaccination programs and biosecurity measures, both to adequately measure economic losses and to improve livestock health.

1. Introduction

Cattle diseases pose a major challenge to livestock health and economic stability around the world. From metabolic disorders in Jordan to tick-borne diseases in Pakistan and Uganda, their diversity and prevalence necessitate effective management strategies. These diseases cause production losses, and some also threaten food security and the livelihoods of people who depend on livestock farming as their main source of income. For some countries, food production is the main source of income. The dataset used herein includes 163 cases and 123 symptoms; it provides a good opportunity to obtain strong results in disease and symptom detection, and it helps to identify specific symptoms and diseases. Demand for dairy and meat products is increasing worldwide, so sustainable livestock health management is needed now more than ever. Disease management will not be effective without innovative diagnostic tools, vaccination programs tailored to specific needs, and robust, proactive biosecurity measures. Addressing these challenges will require inputs such as those provided by the dataset, along with those from researchers and farmers. This paper reviews the available literature on cattle diseases and highlights innovations in diagnostics and control measures, including vaccination and better detection practices. The study aims to describe existing practices and the demand for pragmatic solutions that will improve cattle disease management in both resource-limited and economically developed regions.

2. Literature Review

An analysis conducted in Jordan between 2015 and 2021 on 513 Holstein cows found that digestive system disorders were common and were more prevalent in older cows [1]. A study using a fuzzy logic model reported that the most common disease detected in cattle was osteodystrophy (abnormal bone growth), which is mainly caused by chronic kidney disease and an imbalance in calcium and phosphorus levels in the body [2]. Tick-borne diseases are the most common diseases of cattle and buffaloes in Pakistan. Their spread is driven mainly by factors such as climate change and overcrowding, and the principal pathogens are Babesia and Anaplasma. Babesiosis is a protozoan disease caused by parasites of the genus Babesia, whereas anaplasmosis is a bacterial disease. The authors of [3] suggest using vaccines to control these diseases. Another study examined 500 clinical cases in cattle in Bangladesh between February and April 2019. The most common problems were digestive disorders, with a 70% increase in infections within a few months compared with the preceding period [4]. The clinical cases were further categorized into specific disease groups, including bacterial, viral, ectoparasitic, endoparasitic, protozoan, and fungal infections; nutritional deficiencies; metabolic disorders; digestive disorders; surgical affections; and other syndromes. At the end of 2020, cattle diseases were analyzed across 33 European countries, as required under the Animal Health Law [5]. The most common diseases found included enzootic bovine leucosis (EBL), infectious bovine rhinotracheitis (IBR), and bluetongue. The main purpose was to ensure the safety of animal products and to help farmers and authorities manage cattle import risks. In [6], a study from the Netherlands, the six most common diseases in cattle between 2009 and 2019 were bovine viral diarrhea (BVD), IBR, salmonellosis, paratuberculosis, leptospirosis, and neosporosis.
For this study, the data were collected from milk samples, which is a convenient way to build such a dataset. The main purpose of detecting diseases in cattle is to reduce their prevalence [5]. One academic study from Uganda focused on trypanosomiasis, theileriosis, anaplasmosis, and parasitic gastroenteritis [2]. That paper introduced a new diagnostic decision support tool (DDST). The aforementioned diseases were the most common among 713 clinical cases.
After the DDST was introduced, fasciolosis diagnoses increased significantly [5,6]. Ref. [6] highlights the importance of accurate diagnosis in achieving better disease management. In Korean cattle, the most commonly detected disease was lumpy skin disease. In October 2023, a vaccination program for the disease was launched in the country. Korean researchers analyzed 3910 cattle and found a total seropositivity rate of 30.69%. The rate was higher in dairy cattle (42.97%) than in Korean native beef cattle [5].

3. Methodology

The proposed methodology entails using machine learning (ML) classifiers [7,8] inside an integrated framework to detect diseases in cattle and to support management strategies. The classifiers and the framework are described in the following subsections.

3.1. K-Nearest Neighbors

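The K-Nearest Neighbors (KNN) algorithm often achieves an accuracy of around 85%, depending on the noise level and size of the dataset. It provides balanced precision and recall [9], but it performs poorly when the features of different classes overlap. KNN assigns a sample to the majority class among its k closest training samples, with closeness measured by the Euclidean distance:
$$D(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
where x and y are feature vectors with n components.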
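As an illustration of the distance above, the following minimal Python sketch classifies a query sample by majority vote among its nearest neighbors. The function names, the binary symptom vectors, and the labels are hypothetical and are not taken from the paper's dataset or from its RapidMiner workflow.

```python
import math
from collections import Counter

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Sort training samples by distance to the query and keep the k closest.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: euclidean_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical binary symptom vectors (1 = symptom present); illustration only.
train_X = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1]]
train_y = ["tick-borne", "tick-borne", "digestive", "digestive"]

prediction = knn_predict(train_X, train_y, [1, 1, 1], k=3)
print(prediction)
```

With k = 3, the two closest neighbors carry the same label, so the vote is not affected by the third neighbor. In practice, k is a tunable parameter.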

3.2. Naïve Bayes

Naïve Bayes (NB), which assumes that features are independent, offers an accuracy of about 80%. It performs well when this assumption holds, but its predictions degrade when features are correlated. Bayes' theorem is phrased as follows:
$$P(x \mid c) = \frac{P(c \mid x)\, P(x)}{P(c)}$$
where P(x|c) represents the probability of observing feature x in class c; P(c|x) is the probability of class c given feature x, which the method uses to categorize the data point; P(x) represents the total probability of observing feature x; and P(c) represents the prior probability of class c. Rearranged for classification, the same relation gives P(c|x) = P(x|c) P(c) / P(x).
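The following short Python sketch works through the posterior P(c|x) for two classes under the independence assumption. All probabilities, symptom names, and class labels are hypothetical values chosen for illustration; they are not estimates from the paper's dataset.

```python
def naive_bayes_posteriors(priors, likelihoods, observed):
    """Return P(c | observed features) for each class c.

    priors:      {class: P(c)}
    likelihoods: {class: {feature: P(feature | c)}}
    observed:    list of features present in the sample
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Independence assumption: multiply the per-feature likelihoods.
        for feature in observed:
            score *= likelihoods[cls][feature]
        scores[cls] = score
    evidence = sum(scores.values())  # P(x), the normalizing constant
    return {cls: score / evidence for cls, score in scores.items()}

# Hypothetical probabilities; illustration only.
priors = {"tick-borne": 0.25, "digestive": 0.75}
likelihoods = {
    "tick-borne": {"fever": 0.8, "anemia": 0.6},
    "digestive": {"fever": 0.2, "anemia": 0.1},
}

posteriors = naive_bayes_posteriors(priors, likelihoods, ["fever", "anemia"])
predicted = max(posteriors, key=posteriors.get)
print(predicted, posteriors)
```

Although the prior favors the "digestive" class here, the observed symptoms are much more likely under "tick-borne", so the posterior favors that class.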

3.3. Decision Tree

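Decision trees (DTs) can usually reach about 90% accuracy and are highly interpretable. However, they tend to overfit on smaller datasets. The impurity of a node is measured by its entropy:
$$H_i = -\sum_{k=1}^{n} p_{i,k} \log_2 p_{i,k}$$
where p_{i,k} is the proportion of samples at node i that belong to class k, and n is the number of classes.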
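To make the entropy measure concrete, the minimal Python sketch below computes the entropy of a set of labels and the information gain of a candidate split, which is the usual quantity a decision tree maximizes when entropy is the criterion. The labels and the split are hypothetical, and the split criterion used in the paper's RapidMiner setup is not stated, so this is an assumption.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, children):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical labels at a node and a candidate split; illustration only.
parent = ["lumpy skin", "lumpy skin", "digestive", "digestive"]
children = [["lumpy skin", "lumpy skin"], ["digestive", "digestive"]]

print(entropy(parent))
print(information_gain(parent, children))
```

A split that separates the classes perfectly, as in this toy case, removes all of the parent node's entropy.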

3.4. Random Forest

Random Forest (RF), known for its robustness, produces results with an accuracy of 98%; its high precision and recall stem from aggregating the predictions of many decision trees.

3.5. Gradient Boosting

Gradient Boosting (GB) achieves an accuracy of up to 99%, but its parameter tuning is time-consuming and lengthens training.

3.6. Framework

The framework uses RapidMiner to detect cattle disease. It is designed to streamline data processing and model training on the dataset. The dataset is imported and cleaned using operators that replace missing values, handle null entries, and normalize the data. Important features are identified using the Feature Selection operator, ensuring that only the most relevant attributes are used in the analysis. Machine learning models such as K-Nearest Neighbors (KNN), Naïve Bayes (NB), decision tree (DT), Random Forest (RF), and Gradient Boosting (GB) are then applied to the processed data. These models are trained and tested using the “Split Data” operator, while the “Performance” operator calculates evaluation metrics, including accuracy, precision, recall, and F1 score [9].
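The cleaning and splitting steps described above can be sketched in plain Python as follows. This is not the RapidMiner workflow itself; the function names, the mean-imputation and min-max choices, the 70/30 ratio, and the sample values are assumptions made for illustration.

```python
import random

def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def min_max_normalize(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]

def split_data(rows, train_ratio=0.7, seed=42):
    """Shuffle the rows reproducibly and split them into train and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical body-temperature column with one missing value.
temperatures = [38.5, None, 40.5, 39.5]
cleaned = impute_mean(temperatures)
scaled = min_max_normalize(cleaned)

train_rows, test_rows = split_data(range(10))
print(cleaned, scaled, len(train_rows), len(test_rows))
```

Fixing the random seed keeps the split reproducible, which makes accuracy figures comparable across models.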

3.7. Ensemble Learning, Optimization, and Deployment

To enhance predictive accuracy, ensemble learning techniques such as voting and stacking are implemented, combining the outputs of multiple models for more robust results. Optimization is achieved through hyperparameter tuning using the “Parameter Optimization” operator, and validation is performed with “Cross-Validation” to ensure reliability. Visualization tools such as the “Plot” operator create confusion matrices and ROC (receiver operating characteristic) curves, providing valuable insights into model performance. Once trained, the models can be exported using the “Export Model” operator for deployment, and comprehensive reports can be generated to communicate findings. This framework enables efficient and accurate cattle disease analysis, supporting effective decision making and disease management strategies.
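The voting idea can be illustrated with a minimal hard-voting sketch in Python: each model predicts a label per sample, and the most frequent label wins. The model outputs below are hypothetical, and the exact voting scheme used in the RapidMiner framework is not specified in the text, so hard (majority) voting is an assumption.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists by hard voting, one result per sample.

    predictions_per_model: list of equal-length label lists, one per model.
    """
    combined = []
    # zip(*...) regroups the predictions so each tuple holds one sample's votes.
    for sample_votes in zip(*predictions_per_model):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions from three models on three samples.
knn_preds = ["lumpy skin", "digestive", "tick-borne"]
nb_preds = ["lumpy skin", "tick-borne", "tick-borne"]
rf_preds = ["digestive", "digestive", "tick-borne"]

ensemble = majority_vote([knn_preds, nb_preds, rf_preds])
print(ensemble)
```

Using an odd number of models on a two-way disagreement avoids ties; with more classes, a tie-breaking rule would need to be chosen explicitly.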

3.8. Performance Evaluation

The final step evaluates the model’s performance. Metrics such as accuracy, precision, recall, and F1 score are calculated to assess how well the model predicts disease classes. This step provides a quantitative measure of the model’s effectiveness and identifies areas for improvement.
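For reference, the four metrics named above can be computed from true and predicted labels as in the following Python sketch. It treats one class as the positive class; the labels and predictions are hypothetical and do not reproduce the paper's results.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical ground truth and predictions; illustration only.
y_true = ["sick", "sick", "sick", "sick",
          "healthy", "healthy", "healthy", "healthy"]
y_pred = ["sick", "sick", "sick", "healthy",
          "healthy", "healthy", "healthy", "sick"]

metrics = binary_metrics(y_true, y_pred, positive="sick")
print(metrics)
```

For a multi-class problem such as disease prediction, these per-class values are typically averaged across classes; which averaging the “Performance” operator applies is not stated in the text.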

4. Results

Data processing frameworks were employed in the experiments to preprocess the dataset, execute the classification algorithms, and derive evaluation metrics. K-Nearest Neighbors (KNN), Naïve Bayes (NB), decision tree (DT), Random Forest (RF), Gradient Boosting (GB), and Voting Ensemble (VE) models were run on the experimental platform to predict the disease classes in the dataset. The results include accuracy, precision, recall, and F1 score for each model, indicating their strength in disease detection. Gradient Boosting exhibits the best accuracy, 99%, but it requires a significant amount of parameter tuning, which in turn increases training time. By contrast, Random Forest provides a promising accuracy of 98% together with greater robustness, which makes it an attractive alternative for those who want to avoid the longer training time that Gradient Boosting requires. In short, Gradient Boosting is likely the best model when maximum accuracy is the goal and time is available for parameter tuning, whereas Random Forest is an excellent option when high accuracy is needed with less tuning effort. Fuzzy logic models help diagnose diseases associated with chronic kidney disease and altered calcium and phosphorus levels. This approach offers a relatively inexpensive and accurate diagnostic method, which makes it convenient in parts of the world that lack advanced diagnostic capabilities. The DDST has increased the specificity of fasciolosis diagnosis while also improving the diagnosis of other diseases, paving the way for enhanced precision agriculture.

5. Conclusions and Future Work

Cattle diseases remain a worldwide problem across regions and significantly affect livestock and economies. As this research indicates, the focus should be placed on putting innovative diagnostic tools, such as fuzzy logic models and the DDST, into use for disease detection and management. Disease control would not be complete without tailored vaccination programs, strict biosecurity, and ongoing surveillance. Future research needs to address the incorporation of advanced technologies, such as AI-aided diagnostics, genetic engineering for developing disease-resistant breeds, and region-based approaches to solve unique challenges. It will also be necessary to strengthen links between governments, research institutions, and farmers in order to achieve sustainable livestock production and improve animal health globally. These measures will lead to cattle populations that are healthier and more productive, and they will contribute to global food security and stability.

Author Contributions

M.A. was responsible for conceptualization, methodology design, data curation, formal analysis, and drafting the original version of the manuscript, as well as overseeing project administration. S.J. contributed to the literature review, validation of results, preparation of visualizations, and participated in reviewing and editing the manuscript. S.S. provided supervision, resources, critical revisions to improve the quality of the manuscript, and support in funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data supporting the findings of this study are available from the corresponding author (M.A.) upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rasmussen, P.; Barkema, H.W.; Osei, P.P.; Taylor, J.; Shaw, A.P.; Conrady, B.; Chaters, G.; Muñoz, V.; Hall, D.C.; Apenteng, O.O.; et al. Global losses due to dairy cattle diseases: A comorbidity-adjusted economic analysis. J. Dairy Sci. 2024, 107, 6945–6970.
  2. Hodnik, J.J.; Acinger-Rogić, Ž.; Alishani, M.; Autio, T.; Balseiro, A.; Berezowski, J.; Carmo, L.P.; Chaligiannis, I.; Conrady, B.; Costa, L.; et al. Overview of Cattle Diseases Listed Under Category C, D or E in the Animal Health Law for Which Control Programmes Are in Place Within Europe. Front. Vet. Sci. 2021, 8, 688078.
  3. Jabbar, A.; Abbas, T.; Sandhu, Z.-U.; Saddiqi, H.A.; Qamar, M.F.; Gasser, R.B. Tick-Borne Diseases of Bovines in Pakistan: Major Scope for Future Research and Improved Control; BioMed Central Ltd.: London, UK, 2015.
  4. Alekish, M.; Ismail, Z. Common diseases of cattle in Jordan: A retrospective study (2015–2021). Vet. World 2022, 15, 2910–2916.
  5. Carslake, D.; Grant, W.; Green, L.E.; Cave, J.; Greaves, J.; Keeling, M.; McEldowney, J.; Weldegebriel, H.; Medley, G.F. Endemic cattle diseases: Comparative epidemiology and governance. Philos. Trans. R. Soc. B Biol. Sci. 2011, 366, 1975–1986.
  6. Faisal, A.; Jhanjhi, N.Z.; Ashraf, H.; Ray, S.K.; Ashfaq, F. A Comprehensive Review of Machine Learning Models: Principles, Applications, and Optimal Model Selection. TechRxiv 2025.
  7. Airehrour, D.; Gutierrez, J.; Kumar Ray, S. GradeTrust: A secure trust based routing protocol for MANETs. In Proceedings of the 2015 International Telecommunication Networks and Applications Conference (ITNAC), Sydney, Australia, 18–20 November 2015; pp. 65–70.
  8. Lim, M.; Abdullah, A.; Jhanjhi, N.; Khurram Khan, M.; Supramaniam, M. Link prediction in time-evolving criminal network with deep reinforcement learning technique. IEEE Access 2019, 7, 184797–184807.
  9. Diwaker, C.; Tomar, P.; Solanki, A.; Nayyar, A.; Jhanjhi, N.Z.; Abdullah, A.; Supramaniam, M. A New Model for Predicting Component Based Software Reliability Using Soft Computing. IEEE Access 2019, 7, 147191–147203.

Share and Cite

MDPI and ACS Style

Ahmed, M.; Javaid, S.; Saepudin, S. Cattle Disease Prediction Using Machine Learning Algorithms. Eng. Proc. 2025, 107, 85. https://doi.org/10.3390/engproc2025107085
