## *2.7. Rules Classification*

The machine learning algorithms PART and Decision Table were used to classify the dataset under a 10-fold cross-validation assessment. PART expresses its classifications as discrete rules, generating a rule set that makes the resulting decision list easier to interpret. PART combines elements of C4.5 and RIPPER [36]: in each iteration, it builds a partial C4.5 decision tree and converts its best leaf into a rule. Each instance is then compared against the rules of the list in order and assigned to the first rule it matches.

The Decision Table classifier summarizes the training dataset and compares test instances against it. It classifies unknown samples using the wrapper method for feature selection, which reduces the influence of unknown values and produces better results, with higher accuracy and lower error rates [37]. The first attribute in the rule tree is the most informative node, which is measured by Equations (5) and (6):

$$I\_A = E(D) - \sum\_{i=1}^k \frac{|D\_i|}{|D|} E(D\_i),\tag{5}$$

$$E(X) = -\sum\_{i=1}^{m} \frac{count(c\_i, \mathbf{x})}{|\mathbf{x}|} \cdot \log \frac{count(c\_i, \mathbf{x})}{|\mathbf{x}|}. \tag{6}$$
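As an illustration, Equations (5) and (6) can be computed directly. This is a minimal sketch; it assumes base-2 logarithms (entropy in bits), which Equation (6) leaves unspecified, and uses toy data rather than the study's dataset:

```python
import numpy as np

def entropy(labels):
    """Entropy E(X) of a label vector, per Equation (6), in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(labels, attribute):
    """Information gain I_A of splitting on `attribute`, per Equation (5)."""
    total = len(labels)
    gain = entropy(labels)
    for value in np.unique(attribute):
        subset = labels[attribute == value]  # partition D_i induced by the attribute value
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Toy example: a binary attribute that perfectly separates the two classes,
# so the gain equals the full class entropy of 1 bit.
y = np.array([0, 0, 1, 1])
a = np.array(["x", "x", "y", "y"])
print(information_gain(y, a))  # 1.0
```

The attribute with the largest gain becomes the root of the rule tree, consistent with the "most informative node" criterion above.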

The parameters selected for the PART classifier were a batch size of 100, binary splits set to false, and a confidence factor of 0.25. The minimum number of objects was set to 2, the number of decimal places to 2, the number of folds to 3, reduced-error pruning to false, and the seed value to 1. The parameters for the Decision Table were a batch size of 100, a cross-validation value of 1, and 2 decimal places, with best-first search.
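The 10-fold cross-validation protocol can be sketched as follows. Weka's PART and Decision Table have no direct scikit-learn counterparts, so this hedged example substitutes a `DecisionTreeClassifier` as the rule learner, reusing only the minimum-objects (2) and seed (1) settings above; the bundled breast-cancer dataset is a stand-in, not the dataset used in this study:

```python
# Sketch of 10-fold cross-validation with a tree-based stand-in for PART.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# min_samples_leaf=2 mirrors PART's minimum number of objects; random_state=1
# mirrors the seed value. Other Weka options have no direct equivalent here.
clf = DecisionTreeClassifier(min_samples_leaf=2, random_state=1)

scores = cross_val_score(clf, X, y, cv=10)  # one accuracy per fold
print(f"10-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Each of the 10 folds serves once as the test set while the remaining nine train the model, matching the assessment scheme described above.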

## *2.8. Kappa Statistics*

Kappa statistics measure the consistency of repeated testing, providing additional evidence that the data collected in the study are suitable for the variables measured. The statistic compares the model's results with a randomly generated classification. We adopted kappa measures with values between 0 and 1, as in Equations (7)–(9), where 0 indicates no agreement beyond chance and 1 indicates perfect agreement. The kappa statistic thus indicates the consistency of the assessment.

$$K = \left[P(A) - P(E)\right] / \left[1 - P(E)\right] \tag{7}$$

$$P(A) = (TP + TN)/N \tag{8}$$

$$P(E) = \left[ (TP + FN)(TP + FP) + (TN + FP)(TN + FN) \right] / N^2 \tag{9}$$
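A minimal sketch of Equations (7)–(9), using the standard chance-agreement formula for a 2 × 2 confusion matrix; the counts below are illustrative only, not results from this study:

```python
def kappa_statistic(tp, fp, fn, tn):
    """Cohen's kappa from a 2x2 confusion matrix, per Equations (7)-(9)."""
    n = tp + fp + fn + tn
    p_a = (tp + tn) / n                       # observed agreement, Eq. (8)
    # Chance agreement: standard sum of row/column marginal products, Eq. (9)
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return (p_a - p_e) / (1 - p_e)            # Eq. (7)

# Illustrative confusion matrix: 85% raw accuracy gives kappa = 0.70,
# i.e. substantial agreement beyond chance.
print(kappa_statistic(tp=40, fp=10, fn=5, tn=45))  # 0.7
```

Note that kappa discounts the agreement expected by chance, which is why it is a stricter measure than raw accuracy.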

## *2.9. Logistic Regression Forecasting*

Logistic regression was applied to the classification outcomes with the primary objective of defining an initial screening for disease diagnosis and prediction [38]. In most cases, logistic regression is used to solve two-way (binary) classification problems. It maps continuous predictor values to a probability between 0 and 1: the label 1 is assigned only if the predicted value exceeds the threshold (value > threshold); otherwise, the label is 0. Hence, the output of logistic regression lies between 0 and 1, obtained by applying the sigmoid function, as measured by Equations (10)–(13):

$$P = \alpha + \beta\_1 X\_1 + \beta\_2 X\_2 + \dots + \beta\_m X\_m \tag{10}$$

$$\sigma(x) = \frac{1}{1 + e^{-x}} \in (0, \ 1), \tag{11}$$

$$\Pr(Y = +1|X) = \sigma(\beta X),\tag{12}$$

$$\Pr(Y=-1|X) = 1 - \Pr(Y=+1|X). \tag{13}$$

The model yields a positive and a negative group of values. Each variable in *X* is assigned a β coefficient, which represents its weight, and *Y* indicates whether the patient has diabetes. The variation between *X* and *Y* is governed by these weights.
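The prediction rule of Equations (10)–(13) can be sketched as follows; the weights, inputs, and default 0.5 threshold are hypothetical, chosen only to illustrate the thresholding step:

```python
import numpy as np

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + e^(-x)), Equation (11)."""
    return 1.0 / (1.0 + np.exp(-x))

def predict(X, beta, alpha=0.0, threshold=0.5):
    """Pr(Y=+1|X) = sigma(alpha + beta . X), Equations (10) and (12);
    assign label 1 iff the probability exceeds the threshold."""
    p = sigmoid(alpha + X @ beta)
    return p, (p > threshold).astype(int)

# Hypothetical weights for two predictors and two patients
beta = np.array([1.5, -0.8])
X = np.array([[2.0, 1.0],    # linear score  2.2 -> probability ~0.90 -> label 1
              [-1.0, 3.0]])  # linear score -3.9 -> probability ~0.02 -> label 0
probs, labels = predict(X, beta)
print(probs, labels)
```

By Equation (13), the probability of the negative class is simply one minus the printed probability.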

The parameter selected for the logistic regression forecast was 1 for the number of time units. The confidence level was set at 0.95, the M5 method was chosen for attribute selection, the batch size was 100, and the ridge parameter was set to 1.0 × 10⁻⁸. Once set up correctly, it is straightforward to predict a positive or negative outcome. The behavior of the sigmoid function σ(x) is described by the following proposition:

**Proposition 1.** A function *f*: (0,1) → *R* is absolutely monotone on (0,1) if and only if it possesses a power series expansion with non-negative coefficients, converging for 0 < *x* < 1.

**Proof.** If the function *f* is completely monotone on (0,1), then its power series expansion on (0,1) must be alternating, because $(-1)^k f^{(k)}(x) \ge 0$. Conversely, consider an alternating power series of a function *f*(*x*) converging for all 0 < *x* < 1, and its derivatives, given by Equations (14)–(16):

$$f(x) = a\_0 - a\_1 x + a\_2 x^2 - a\_3 x^3 + \dots, \quad a\_i \ge 0 \ (0 < x < 1),\tag{14}$$

$$(-1)f^{(1)}(x) = a\_1 - 2a\_2 x + 3a\_3 x^2 - \dots,\tag{15}$$

$$f^{(2)}(x) = 2a\_2 - 6a\_3 x + \dots \tag{16}$$
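As a numerical illustration of the proof, consider f(x) = 1/(1 + x) = 1 − x + x² − x³ + …, an alternating series of the form in Equation (14), whose derivatives have the closed form f⁽ᵏ⁾(x) = (−1)ᵏ k!/(1 + x)ᵏ⁺¹. The alternating-sign condition (−1)ᵏ f⁽ᵏ⁾(x) ≥ 0 can then be checked directly; this is an illustrative example function, not one from the study:

```python
import math

def f_derivative(k, x):
    """k-th derivative of f(x) = 1/(1+x) = 1 - x + x^2 - x^3 + ...,
    in closed form: f^(k)(x) = (-1)^k * k! / (1+x)^(k+1)."""
    return (-1) ** k * math.factorial(k) / (1 + x) ** (k + 1)

# Verify the condition (-1)^k f^(k)(x) >= 0 from the proof, for several
# orders k and several points x in (0,1).
for k in range(6):
    for x in (0.1, 0.5, 0.9):
        assert (-1) ** k * f_derivative(k, x) >= 0
print("complete monotonicity of 1/(1+x) verified for k = 0..5")
```

The signs alternate exactly as in Equations (14)–(16): the function is positive, its first derivative negative, its second derivative positive, and so on.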

