*3.2. Feature Extraction and Selection*

The extracted features covered the shape of the breast lesions, which can be arranged in the following mathematical form:

$$\mathbf{F} = \{\mathbf{F}\_1, \mathbf{F}\_2, \dots, \mathbf{F}\_7\}. \tag{1}$$

The geometric moment of order (*p* + *q*) can be defined as follows [13]:

$$m\_{pq} = \sum\_{y=1}^{N} \sum\_{x=1}^{M} x^p y^q f(x, y), \quad p, q = 1, 2, 3, \dots \tag{2}$$

where *f*(*x*, *y*) is the extracted ground-truth region. The central moments are defined as

$$\mu\_{pq} = \sum\_{y=1}^{N} \sum\_{x=1}^{M} (x - \overline{x})^p (y - \overline{y})^q f(x, y), \quad p, q = 1, 2, 3, \dots, \tag{3}$$

where $\overline{x} = m\_{10}/m\_{00}$ and $\overline{y} = m\_{01}/m\_{00}$ represent the center-of-gravity coordinates of the image. The normalized central moment is

$$
\eta\_{pq} = \frac{\mu\_{pq}}{\mu\_{00}^{\rho}}, \; \rho = \frac{p+q}{2} + 1. \tag{4}
$$

The seven Hu moments were computed using the second- and third-order normalized central moments.
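As an illustration, Equations (2)–(4) and the first Hu invariant, M1 = η<sub>20</sub> + η<sub>02</sub>, can be sketched for a small binary mask. This is a minimal sketch, not the authors' implementation; the helper names and the toy mask are ours.

```python
def raw_moment(f, p, q):
    """Geometric moment m_pq of Eq. (2) for a 2D mask f[y][x]."""
    return sum(x**p * y**q * f[y][x]
               for y in range(len(f)) for x in range(len(f[0])))

def central_moment(f, p, q):
    """Central moment mu_pq of Eq. (3), with the centroid m_10/m_00, m_01/m_00."""
    m00 = raw_moment(f, 0, 0)
    xc, yc = raw_moment(f, 1, 0) / m00, raw_moment(f, 0, 1) / m00
    return sum((x - xc)**p * (y - yc)**q * f[y][x]
               for y in range(len(f)) for x in range(len(f[0])))

def normalized_moment(f, p, q):
    """Normalized central moment eta_pq of Eq. (4), rho = (p + q)/2 + 1."""
    rho = (p + q) / 2 + 1
    return central_moment(f, p, q) / central_moment(f, 0, 0)**rho

def hu_m1(f):
    """First Hu invariant: M1 = eta_20 + eta_02."""
    return normalized_moment(f, 2, 0) + normalized_moment(f, 0, 2)
```

Because M1 is built from central moments normalized by μ<sub>00</sub>, it is invariant to the position and scale of the lesion within the image, which is what makes it usable as a shape feature.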

The k-nearest neighbor (k-NN) algorithm is a classifier that provides an efficient prediction model based on the closest training examples [15,33]. An object is classified by a majority vote of its neighbors, and the choice of k is the most sensitive factor of k-NN. To overcome this drawback, the algorithm was run repeatedly with various k-values until the optimal one was found. The best performance on the validation set indicated k = 3 as the optimal value.
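The k-value search described above can be sketched as follows. The Euclidean distance, the candidate k list, and the function names are illustrative assumptions, not the authors' code.

```python
from collections import Counter
import math

def knn_predict(train, labels, query, k):
    """Classify `query` by majority vote among its k nearest training samples."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def best_k(train, labels, val, val_labels, candidates=(1, 3, 5, 7)):
    """Pick the k with the highest accuracy on a held-out validation set."""
    def accuracy(k):
        hits = sum(knn_predict(train, labels, v, k) == y
                   for v, y in zip(val, val_labels))
        return hits / len(val)
    return max(candidates, key=accuracy)
```

In the study, this kind of validation sweep selected k = 3; on a different feature set the same procedure may of course return a different value.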

Not all features are useful for improving classification accuracy. In the first step, the *t*-test indicated which features were significant in distinguishing benign masses from malignant ones. Then, the remaining features were independently evaluated by k-NN using a fivefold cross-validation algorithm and by an RBFNN. The dataset was split into training data and test data. In fivefold cross-validation, the training data were randomly split into five equal parts. For each fold, the accuracy was computed, and the mean over the five folds denoted the final score of each feature. The RBFNN is a neural network with a three-layer feedforward architecture (Figure 2). The input layer provides features to the hidden layer, in which the nodes use Gaussian functions f1, f2, ..., fn as radially symmetric functions. The output layer summarizes the number of possible output classes [30].
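The per-feature fivefold evaluation can be sketched as follows; for brevity the split here is deterministic rather than random, and `classify` is a hypothetical stand-in for the k-NN or RBFNN classifier.

```python
def fivefold_score(samples, labels, classify):
    """Mean accuracy over five folds, each fold serving once as the test set.

    `classify(train_samples, train_labels, query)` must return a predicted label.
    """
    n = len(samples)
    folds = [list(range(i, n, 5)) for i in range(5)]  # deterministic stand-in split
    accuracies = []
    for test_idx in folds:
        train_idx = [i for i in range(n) if i not in test_idx]
        hits = sum(classify([samples[i] for i in train_idx],
                            [labels[i] for i in train_idx],
                            samples[j]) == labels[j]
                   for j in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

The returned mean accuracy is the single number used to rank each moment feature.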

**Figure 2.** The structure of an RBFNN classifier.

The RBFNN is a nonlinear, three-layer feedforward neural network. There is one unsupervised layer between the input nodes and the hidden neurons and one supervised layer between the hidden neurons and the output nodes. The RBFNN was trained using four folds, while one fold was used to test the classifier. Training first determines the centers of the hidden layer, followed by the computation of the weights connecting the hidden layer to the output layer. The proper weights and biases were determined by minimizing the error function, i.e., the root-mean-square error. The training stage was terminated once the calculated error reached the goal value of 0.01 or once the maximum of 1000 training iterations was completed. The number of hidden neurons was varied from eight to 40 in steps of eight, i.e., eight, 16, 24, 32, and 40 neurons. For each number of hidden neurons, two values of the spread of the radial basis functions (SRBF), namely, 0.01 and 0.05, were tested, yielding ten neural networks. Each network was trained until the mean squared error fell below the goal of 0.01; the SRBF of 0.01 reached the imposed goal. The goal value was iteratively compared to the mean squared error; if it was not reached, another eight neurons were added to the structure. A total of 20 trials were run to decide the most suitable number of hidden-layer neurons for effective prediction. The best results were obtained with 32 neurons in the hidden layer.
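The two-stage training described above can be sketched for one-dimensional inputs. In this sketch the hidden centers are simply taken from training samples, the output weights are fitted by gradient descent on the squared error with the same 0.01 error goal and 1000-iteration cap, and the learning rate and spread value are illustrative assumptions, not the study's settings.

```python
import math

def rbf_activations(x, centers, spread):
    """Gaussian (radially symmetric) hidden-layer outputs."""
    return [math.exp(-((x - c) ** 2) / (2 * spread ** 2)) for c in centers]

def train_rbfnn(xs, ys, centers, spread, lr=0.1, goal=0.01, max_iter=1000):
    """Stage 2: fit hidden-to-output weights/bias by minimizing the RMSE.

    Stops when the RMSE falls below `goal` or after `max_iter` passes.
    """
    w = [0.0] * len(centers)
    b = 0.0
    for _ in range(max_iter):
        err_sq = 0.0
        for x, y in zip(xs, ys):
            h = rbf_activations(x, centers, spread)
            out = sum(wi * hi for wi, hi in zip(w, h)) + b
            e = out - y
            err_sq += e * e
            for i in range(len(w)):   # gradient step on each output weight
                w[i] -= lr * e * h[i]
            b -= lr * e
        if math.sqrt(err_sq / len(xs)) < goal:
            break
    return w, b

def rbfnn_predict(x, w, b, centers, spread):
    h = rbf_activations(x, centers, spread)
    return sum(wi * hi for wi, hi in zip(w, h)) + b
```

In practice the output layer of an RBFNN is often solved in closed form by least squares once the centers are fixed; the iterative error-goal loop here mirrors the stopping rule the paper describes.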

#### **4. Results and Discussion**

The dataset we used included ground-truth cases for the normal, benign, and malignant categories. Among these regions, 133 were normal tissue, 210 were malignant masses, and 437 were benign masses.

Before using the k-NN classifier, the *t*-test was used to optimize the feature vector and improve the classification performance (Table 1).

**Table 1.** Results for *t*-test (*p* < 0.05).


The selected features fed a k-NN classifier and an RBFNN. We used fivefold cross-validation to select the best features. In fivefold cross-validation, the feature vector (437 benign images, 210 malignant images, and a total of 3882 moments) was randomly divided into five sets. The classifiers were trained using four folds, and the remaining fold was used for testing. The average accuracy over the folds was used to evaluate the k-NN and RBFNN classifiers.

The values of accuracy, sensitivity, precision, and F1-score reflect the diagnostic accuracy. Higher values for these metrics indicate better performance of the system. Figure 3 displays the performance of the k-NN algorithm in the classification of BUS images. The selected moment features with the highest classification accuracy rate were M1 and M5. The M1 moment showed the best accuracy of 0.85, representing the proportion of correctly classified images. It also had the best precision of 0.87, indicating the proportion of correct positive identifications. The M5 moment provided the best sensitivity of 0.83, indicating a good performance of the classifier. Furthermore, M5 had the second-best accuracy and precision values and the highest F1-score, which denotes the harmonic mean of precision and sensitivity.
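For reference, these four metrics follow directly from the confusion-matrix counts (true/false positives and negatives); the counts in the test below are illustrative, not results from the paper.

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, precision, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # correctly classified cases
    sensitivity = tp / (tp + fn)                 # malignant cases actually found
    precision = tp / (tp + fp)                   # correct positive identifications
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, precision, f1
```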

The diagnostic performance of the RBFNN is presented in Figure 4. A relatively high classification performance (i.e., an accuracy of 0.76) was obtained for M1. Moreover, M1 had a high precision (0.81), indicating the proportion of correct positive identifications. The other moments showed lower accuracy but a good proportion of correct positive identifications, with precision values around 0.78. The RBFNN model built in this study showed low sensitivity and F1-scores for the M2 to M6 moments.

**Figure 3.** Classification metrics for k-NN.

**Figure 4.** Classification performance of RBFNN from BUSI database.

The diagnostic performance of the k-NN and RBFNN classifiers in the differentiation of benign and malignant breast lesions indicated significant differences between the two classifiers. The data shown in Figures 3 and 4 indicate the M1 moment as the best feature, with relatively high classification performance. In terms of precision (i.e., the proportion of correct positive identifications), the best results were provided by the RBFNN. In the case of the M1 moment, there were some differences in diagnostic performance between the two models, with k-NN providing the higher prediction accuracy.

The differentiation ability of our approach is in line with other existing conventional models, as presented in Table 2.


**Table 2.** Comparison of existing handcrafted approaches and the present handcrafted approach.

The k-NN algorithm is a simple, nonparametric machine learning algorithm built to identify group membership by exploiting similarity and to predict the class of new data. Its performance is related to the complexity of the decision boundary. When the number of neighbors is low, the algorithm chooses only the values closest to the data sample, and a very complex decision boundary is formed; in this case, the model fails to generalize adequately and shows poor results. As the number of neighbors increases, the model initially generalizes well; however, increasing the value too much results in a performance drop. The RBFNN requires several trials to establish the number of hidden neurons and/or to choose the activation function, but it is advantageous in that it needs less effort and preprocessing. This trial-and-error tuning is one of the limitations of the present study. Another limitation is the small size of the dataset, as a neural architecture reaches good performance when handling large amounts of data. Nevertheless, the recognition accuracy of the proposed method is comparable to some state-of-the-art methods. A future research direction will be devoted to improving the classification performance by combining geometric moments with other feature descriptors.

#### **5. Conclusions**

In this paper, we employed both a k-NN algorithm and an RBFNN model, and we investigated their performance in differentiating between benign and malignant breast lesions on BUS images. Both methods highlighted moment M1 (i.e., correlated with the area of the lesion) as the feature that best differentiated benign from malignant breast lesions. The k-NN classifier achieved an accuracy of 0.85, while the RBFNN achieved 0.76. Despite the difference in classification performance, we believe that the RBFNN is a proper tool for the classification task, as the proportion of correct positive identifications reached higher values.

**Author Contributions:** Conceptualization, S.M. and L.M.; methodology, S.M. and L.M.; software, S.M., I.-N.A.N. and L.M.; validation, S.M. and I.-N.A.N.; formal analysis, I.-N.A.N.; investigation, S.M. and I.-N.A.N.; writing—original draft preparation, S.M., I.-N.A.N. and L.M.; writing—review and editing, S.M. and L.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
