Article
Peer-Review Record

Interpretable Support Vector Machine and Its Application to Rehabilitation Assessment

Electronics 2024, 13(18), 3584; https://doi.org/10.3390/electronics13183584
by Woojin Kim *, Hyunwoo Joe, Hyun-Suk Kim and Daesub Yoon
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 7 August 2024 / Revised: 23 August 2024 / Accepted: 2 September 2024 / Published: 10 September 2024
(This article belongs to the Section Bioelectronics)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript introduces an innovative approach to rehabilitation assessment through the development of an interpretable support vector machine (SVM). The research presents a novel concept of the nearest boundary point, which standardizes the one-class SVM decision function and identifies the shortest path for transforming data from abnormal to normal cases. The proposed method is computationally simple and yields a unique solution, making it particularly valuable for medical assessment applications. The manuscript can be published in Electronics after suitable revisions.

1. Please standardize the formatting of the authors' addresses.

2. The background section of the paper references relatively few works, particularly when discussing the design rationale and innovation. It would be beneficial to include more references and discuss how this study compares with existing work in the field, highlighting its contributions.

3. The captions of Figures 5-8 could be enhanced.

Comments on the Quality of English Language

Minor editing of English language required.

Author Response

1. Please standardize the formatting of the authors' addresses.

(Answer) The address has been revised to the standard format as follows:
Mobility UX Research Section, Electronics and Telecommunications Research Institute, Daejeon 34129, Republic of Korea.

2. The background section of the paper references relatively few works, particularly when discussing the design rationale and innovation. It would be beneficial to include more references and discuss how this study compares with existing work in the field, highlighting its contributions.

(Answer) We appreciate the beneficial comments. In the section of the introduction that presents the solution to the research question, we have emphasized the contribution of this paper by including more references, highlighting the features of different approaches, and comparing them with the method proposed in our study. The revised content is as follows:

The first research question focuses on addressing the issue of data imbalance, where the dataset contains two classes—normal and abnormal—with the abnormal class significantly underrepresented. To tackle this imbalance, several approaches have been considered. One approach is the synthetic minority over-sampling technique (SMOTE) [4], which resamples the minority (rare) class. Another approach is an ensemble method, where the majority (abundant) class is split into smaller subsets, each of which is then trained with the entire rare class to create multiple models, which are subsequently combined [5]. A third method involves clustering the abundant class and training only on representative values (e.g., medians) of these clusters [6]. However, these methods have limitations when applied to our context, where the imbalance is severe. The SMOTE technique is less effective because it requires identifying the distribution pattern of the rare class, which is challenging when the class is extremely underrepresented. Similarly, ensemble and clustering methods are not suitable for our case, as they effectively reduce the amount of data used for modeling, which is problematic given our already limited dataset. Given these challenges, we chose to address the anomaly detection problem by adopting a one-class classification approach. This method involves training the model exclusively on the abundant class and classifying the remaining data as belonging to the rare class. This approach is particularly well-suited to our scenario, as it allows for robust detection of anomalies despite the significant data imbalance. Several methods have been developed to perform the one-class classification [7,8], from which we used a one-class support vector machine (SVM) [9]. This type of SVM is well-suited for anomaly detection, which identifies datapoints that differ substantially from most of the data. It captures the data distribution and identifies outliers. 
Another advantage of the one-class SVM is its ability to handle imbalanced datasets, where one class is represented by drastically fewer samples than the other. The one-class SVM can handle situations in which only positive samples are available and negative samples are unknown or difficult to obtain. Hence, it is adequate for medical assessments, especially when collecting data from patients with rare conditions is challenging. To address the second research question, we devised the nearest boundary point (NBP) to solve for the shortest Mahalanobis distance. Specifically, based on the decision function of the one-class SVM formulated as a mixture of normal distributions, we standardized and analytically derived a series of steps to obtain and reconstruct the NBP. The analytically derived procedure is computationally simple and yields a unique solution, making it particularly valuable for our rehabilitative assessment application.
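As a rough illustration of the NBP idea described above, the following minimal sketch assumes the simplest possible case: the normal class is modeled by a single 2D Gaussian rather than by the one-class SVM decision function formulated as a mixture of normal distributions, so it is a stand-in and not the derivation used in the manuscript. Under that assumption, standardizing with the Cholesky factor of the covariance turns the Mahalanobis distance into a Euclidean distance, the boundary becomes a circle, and the nearest boundary point is the radial projection onto that circle, mapped back to the original space. The sample values, the radius of 2.0, and all function names are hypothetical.

```python
import math

# Hypothetical 2D measurements from the normal (abundant) class only.
NORMAL_SAMPLES = [(1.0, 2.0), (2.0, 3.5), (3.0, 3.0),
                  (2.5, 1.5), (1.5, 2.5), (2.0, 2.0)]


def fit_gaussian(points):
    """Return the sample mean and covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def cholesky_2x2(cov):
    """Lower-triangular factor (l11, l21, l22) with L L^T equal to cov."""
    sxx, sxy, syy = cov
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(syy - l21 ** 2)
    return l11, l21, l22


def standardize(x, mean, chol):
    """Map x to z = L^{-1} (x - mean); the Mahalanobis distance of x is ||z||."""
    l11, l21, l22 = chol
    z1 = (x[0] - mean[0]) / l11
    z2 = ((x[1] - mean[1]) - l21 * z1) / l22
    return z1, z2


def reconstruct(z, mean, chol):
    """Inverse of standardize: map z back to x = mean + L z."""
    l11, l21, l22 = chol
    return mean[0] + l11 * z[0], mean[1] + l21 * z[0] + l22 * z[1]


def nearest_boundary_point(x, mean, chol, radius):
    """Project an outside point radially onto the circle ||z|| = radius."""
    z = standardize(x, mean, chol)
    norm = math.hypot(z[0], z[1])
    if norm <= radius:
        return x  # already inside or on the boundary, nothing to move
    scale = radius / norm
    return reconstruct((z[0] * scale, z[1] * scale), mean, chol)


if __name__ == "__main__":
    mean, cov = fit_gaussian(NORMAL_SAMPLES)
    chol = cholesky_2x2(cov)
    abnormal = (6.0, 0.0)  # hypothetical abnormal test point
    nbp = nearest_boundary_point(abnormal, mean, chol, radius=2.0)
    print("NBP:", nbp)
    print("required change:", (nbp[0] - abnormal[0], nbp[1] - abnormal[1]))
```

The difference between the NBP and the test point gives, feature by feature, the smallest change (in the Mahalanobis sense) that would bring the abnormal case onto the boundary of the normal region, which is the kind of interpretation the response describes.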

3. The captions of Figures 5-8 could be enhanced.

(Answer) We have enhanced the captions of Figures 5-8 and, in addition, the caption of Figure 9. We have added key observations for each figure to emphasize the critical aspects that should be noted. The revised captions are as follows:

Figure 5. Randomly sampled (1571) datapoints (blue circles) and their one-class SVM boundary (black contour). The boundary effectively encloses the majority of the datapoints, excluding a few outliers, demonstrating a smooth and robust classification performance that is less sensitive to noise.

Figure 6. Standardized datapoints (red circles) and their boundary (black contour) in the z space. The boundary effectively encloses the majority of the datapoints while excluding outliers, similar to the original space. In the 2D standardized space, the boundary appears geometrically circular.

Figure 7. Decision boundary (black contour) that separates positive samples (blue circles) from negative samples (red circles). The boundary effectively distinguishes between the two classes, even for datapoints near the decision boundary.

Figure 8. Standardized negative test points (x marks) and their NBPs (triangular marks). The NBPs lie on the decision boundary (black contour) and are accurately identified in the direction toward the center of the circular boundary.

Figure 9. Negative test samples (x marks) and their NBPs (triangular marks). The NBPs lie on the decision boundary (black contour). The results show that the nearest boundary points identified in the x plane are consistently mapped back to the x plane, effectively locating the nearest points on the decision boundary.

Reviewer 2 Report

Comments and Suggestions for Authors

The paper on ML (machine learning) and its employment in healthcare environments provides technical insights into the mathematical calculations necessary to enhance its successful adaptation in rehabilitation environments. There are a few editorial elements such as too large blanks, as highlighted in the attached document.

Comments for author File: Comments.pdf

Author Response

1. There are a few editorial elements such as too large blanks, as highlighted in the attached document.

(Answer) Thank you for your careful review. The highlighted large blanks in the attached document are not intentional but appear to be a result of the LaTeX typesetting process. We expect that this issue will be resolved during the final editing phase by adjusting parameters such as spacing.

Reviewer 3 Report

Comments and Suggestions for Authors

This paper introduces an SVM and its application to rehabilitation assessment. It also introduces the concept of the nearest boundary point and standardizes the one-class SVM decision function. Simulation and application results demonstrate the effectiveness of the proposed method.

 

1. The concept of one-class classification and its density function is introduced. The mathematical functions are comprehensive to explain the method. 

2. The effectiveness of standardizing non-normally distributed data is demonstrated in a figure. A figure is also provided to show the process of determining the desired point.

3. Figures 5 and 6 clearly show the effectiveness of the standardization process. It would be better to provide some numerical metrics to quantify the effectiveness of the method in this section.

4. The application of the method in rehabilitation assessment is shown. It is able to classify the status of patients' muscle function and analyze the underlying causes of anomalies.

Author Response

1. The concept of one-class classification and its density function is introduced. The mathematical functions are comprehensive to explain the method.

(Answer) Yes, that is correct. The approach presented in this paper is analytically derived, making the equations comprehensive.

2. The effectiveness of standardizing non-normally distributed data is demonstrated in a figure. A figure is also provided to show the process of determining the desired point.

(Answer) An explanation of the effectiveness of the standardization and of the process for determining the desired point has been added to the captions of the mentioned figures as follows:

Figure 5. Randomly sampled (1571) datapoints (blue circles) and their one-class SVM boundary (black contour). The boundary effectively encloses the majority of the datapoints, excluding a few outliers, demonstrating a smooth and robust classification performance that is less sensitive to noise.

Figure 6. Standardized datapoints (red circles) and their boundary (black contour) in the z space. The boundary effectively encloses the majority of the datapoints while excluding outliers, similar to the original space. In the 2D standardized space, the boundary appears geometrically circular.

Figure 7. Decision boundary (black contour) that separates positive samples (blue circles) from negative samples (red circles). The boundary effectively distinguishes between the two classes, even for datapoints near the decision boundary.

Figure 8. Standardized negative test points (x marks) and their NBPs (triangular marks). The NBPs lie on the decision boundary (black contour) and are accurately identified in the direction toward the center of the circular boundary.

Figure 9. Negative test samples (x marks) and their NBPs (triangular marks). The NBPs lie on the decision boundary (black contour). The results show that the nearest boundary points identified in the x plane are consistently mapped back to the x plane, effectively locating the nearest points on the decision boundary.

3. Figures 5 and 6 clearly show the effectiveness of the standardization process. It would be better to provide some numerical metrics to quantify the effectiveness of the method in this section.

(Answer) A numerical metric was employed by performing the Kolmogorov-Smirnov test. The normality of the standardized datapoints was confirmed at a significance level of p<0.00, and this information has been added to the manuscript as follows:

To quantitatively evaluate the effectiveness of the standardization in Figure 6, we performed a Kolmogorov-Smirnov test [11] on the 2D data. The results confirmed the normality of the standardized datapoints, with a significance level of p < 0.00.
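For readers unfamiliar with the test, a one-sample Kolmogorov-Smirnov check against the standard normal distribution can be sketched as below. This is only an illustration under stated assumptions: the response does not say how the test was applied to the 2D data, so the sketch handles a single coordinate at a time, uses the large-sample critical value sqrt(-ln(alpha/2)/2)/sqrt(n) instead of an exact p-value, and runs on synthetic samples rather than the manuscript's data. In this formulation, normality is not rejected when the statistic D stays below the critical value.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def ks_statistic(sample):
    """Largest gap between the empirical CDF and the standard normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = STANDARD_NORMAL.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_critical_value(n, alpha=0.05):
    """Large-sample (asymptotic) critical value for the one-sample KS test."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)


def looks_standard_normal(sample, alpha=0.05):
    """True when the KS statistic does not exceed the critical value."""
    return ks_statistic(sample) <= ks_critical_value(len(sample), alpha)


if __name__ == "__main__":
    n = 100
    # Synthetic, evenly spaced standard-normal quantiles (hypothetical data).
    normal_like = [STANDARD_NORMAL.inv_cdf((i + 0.5) / n) for i in range(n)]
    # Synthetic values spread evenly over [0, 1): clearly not standard normal.
    uniform_like = [i / n for i in range(n)]
    for name, data in (("normal-like", normal_like), ("uniform-like", uniform_like)):
        print(name, ks_statistic(data), looks_standard_normal(data))
```

Applying such a check to each standardized coordinate separately is one simple option; a joint test on the 2D data would need a multivariate extension, which this sketch does not attempt.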

4. The application of the method in rehabilitation assessment is shown. It is able to classify the status of patients' muscle function and analyze the underlying causes of anomalies.

(Answer) Yes, while assessing the muscle function of the patient is important, it is equally crucial to analyze the reasons behind a low muscle function score. This analysis allows for the recommendation of appropriate treatment methods.
