2.1.2. K-Nearest Neighbor (KNN)

The k-nearest neighbor (KNN) algorithm is a type of supervised machine learning algorithm that classifies objects based on the classes of their nearest neighbors [42]. It is typically used for classification but can also be applied to regression problems. The algorithm predicts the class or value of a new data point based on the k-closest data points in the training dataset. To identify the nearest neighbors, the algorithm calculates the distance between the new data point and all other data points in the dataset. For example, in Figure 1B, the green unknown data point belongs to the red dataset. For classification, the algorithm assigns the new data point to the most common class among its k-nearest neighbors, while for regression analysis, it calculates the average value of the k-nearest neighbors and assigns it to the new data point [42]. The value of k is usually determined through cross-validation or other optimization techniques, and it impacts the bias-variance trade-off of the model. Despite its simplicity, KNN is a highly effective algorithm and is widely used in many fields, including image recognition, natural language processing, and healthcare problems [43–45].
