**4. Discussion**

In this study, a machine-learning technique was implemented on a data-mining platform using a dataset of 281 patients with diabetes. The data were collected exclusively in Nigeria to assess diabetes mellitus prevalence with two rule classifiers (PART and Decision Table) applied to 10 non-invasive, easily accessible medical attributes: age, gender (male or female), glucose level, body mass index, hypertension, history of cardiovascular disease, family history of diabetes, physical exercise, work stress, and diet (healthy or unhealthy). The aim was rapid and precise screening of patients for diabetes mellitus status, along with symptoms of other chronic diseases.

Initially, during the assessment on the data-mining platform (Weka), the dataset was divided into training and testing parts in a 20:80 ratio: 20 percent of the data was used to train the classifiers and assess the outcome, and the remaining 80 percent was used for testing. Furthermore, the complete dataset of 281 patients was analyzed in the experimental mode of Weka for the final assessment of both classifiers together. The rule classification achieved a mean accuracy of 98.75% with an error rate of 0.02. In addition, the mean kappa statistic was 0.97, the true positive rate 0.97, the false positive rate 0.01, precision 0.98, recall 0.98, F-measure 0.98, MCC 0.97, ROC area 0.99, and PRC area 0.99.
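
To make the reported measures concrete, the following minimal Python sketch shows how accuracy, true and false positive rates, precision, F-measure, Cohen's kappa, and MCC can be derived from the four cells of a binary confusion matrix. It is an illustration only, not the Weka implementation used in the study: the function names `safe_div` and `binary_metrics` are hypothetical, the example counts are invented for demonstration and are not the study's results, and the sketch scores only the positive class, whereas Weka also reports per-class and weighted-average figures.

```python
from math import sqrt


def safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common evaluation measures from binary confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative and
    true-negative counts for the positive ("diabetic") class.
    """
    n = tp + fp + fn + tn
    accuracy = safe_div(tp + tn, n)
    tpr = safe_div(tp, tp + fn)          # true positive rate (recall)
    fpr = safe_div(fp, fp + tn)          # false positive rate
    precision = safe_div(tp, tp + fp)
    f_measure = safe_div(2 * precision * tpr, precision + tpr)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = safe_div((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    kappa = safe_div(accuracy - p_chance, 1 - p_chance)

    # Matthews correlation coefficient.
    mcc = safe_div(
        tp * tn - fp * fn,
        sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )

    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "tpr": tpr,
        "fpr": fpr,
        "precision": precision,
        "recall": tpr,
        "f_measure": f_measure,
        "kappa": kappa,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Illustrative counts only; NOT taken from the study's data.
    for name, value in binary_metrics(tp=40, fp=10, fn=5, tn=45).items():
        print(f"{name}: {value:.3f}")
```

All of these measures except accuracy expressed as a percentage lie on a 0 to 1 scale (kappa and MCC can also be negative), which is why they are reported above as proportions.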

The results obtained with the non-invasive medical features used in this study indicate that this assessment can help identify patients with diabetes and pre-diabetes without the need for preliminary laboratory tests. In addition, the 23 rules generated during the assessment clearly show the main characteristics of individuals with diabetes. The study therefore suggests that age is the most influential variable, followed by family history of diabetes, body mass index, gender, work stress, physical exercise, diet, hypertension, and history of cardiovascular disease. These findings are useful for regions facing substantial epidemiological threats and low socioeconomic status, such as Africa and other developing countries.
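
As an illustration of how a generated rule set could be applied at screening time, the sketch below evaluates an ordered list of rules and returns the label of the first rule that matches, falling back to a default class, in the style of a PART decision list. The rules, thresholds, field names, and the `classify` function are hypothetical placeholders chosen for demonstration; they are not the 23 rules produced in this study.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, object]
Rule = Tuple[Callable[[Patient], bool], str]

# HYPOTHETICAL rules for illustration only; NOT the rules generated in the study.
# Each rule pairs a condition on the patient record with a predicted label.
EXAMPLE_RULES: List[Rule] = [
    (lambda p: p["age"] >= 45 and p["family_history"], "diabetic"),
    (lambda p: p["bmi"] >= 30 and not p["physical_exercise"], "diabetic"),
]

DEFAULT_LABEL = "non-diabetic"


def classify(patient: Patient,
             rules: List[Rule] = EXAMPLE_RULES,
             default: str = DEFAULT_LABEL) -> str:
    """Return the label of the first matching rule, or the default label."""
    for condition, label in rules:
        if condition(patient):
            return label
    return default


if __name__ == "__main__":
    # Invented patient record, used only to exercise the sketch.
    patient = {
        "age": 52,
        "family_history": True,
        "bmi": 27.5,
        "physical_exercise": True,
    }
    print(classify(patient))
```

Because the rules are checked in order, their sequence matters: an earlier, more specific rule takes precedence over later ones, and the default label covers every patient that no rule matches.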

The key strength of this study is its distinctive approach of combining both classifiers with logistic regression assessment to identify and forecast diabetes mellitus prevalence. A further strength is the use of real health records collected from four principal hospitals in Nigeria, a developing country where the prevalence of diabetes in men and women is high, as explicitly noted in the literature. Hence, patients can be screened for diabetes mellitus using the 23 generated rules. The spread of diabetes mellitus in developing countries can be curbed by organizing appropriate educational programs, and awareness-raising activities can help people reduce their burden of health problems. The classification assessment proposed in this paper was also compared with other well-known machine learning algorithms on the same data to evaluate classification accuracy. Table 3 and Figure 6 show that the PART and Decision Table rule classifiers were successful in this clinically meaningful application.


**Table 3.** Average precision of the rule classification compared with other machine learning classifiers on the same dataset.

¹ Comprehensive comparison of the proposed classification results with the other machine learning classifiers on the same dataset.

**Figure 6.** Comparisons of the rule classifier with other machine learning classifiers.
