Article

Capped Linex Metric Twin Support Vector Machine for Robust Classification

School of Mathematics and Information Science, North Minzu University, Yinchuan 750021, China
* Author to whom correspondence should be addressed.
Sensors 2022, 22(17), 6583; https://doi.org/10.3390/s22176583
Submission received: 21 June 2022 / Revised: 11 August 2022 / Accepted: 29 August 2022 / Published: 31 August 2022

Abstract
In this paper, a novel robust loss function, the capped linex loss function $L_a^\varepsilon$, is designed. We establish several desirable properties of $L_a^\varepsilon$, such as boundedness, nonconvexity and robustness. Furthermore, by introducing $L_a^\varepsilon$, a new binary classification method, the capped linex twin support vector machine (Linex-TSVM), is proposed. Linex-TSVM not only reduces the influence of outliers on Linex-SVM, but also improves its classification performance and robustness. Moreover, two regularization terms are introduced to implement the structural risk minimization principle, which further weakens the effect of outliers on the model. Finally, a simple and efficient iterative algorithm is designed to solve the non-convex Linex-TSVM optimization problem; its time complexity is analyzed, and the model is shown to satisfy the Bayes rule. Experimental results on multiple datasets demonstrate that the proposed Linex-TSVM is competitive with existing methods in terms of robustness and feasibility.

1. Introduction

Data collection and proper processing are becoming increasingly crucial as modern computer technology advances. As an excellent machine learning tool, the support vector machine (SVM) [1,2,3,4] has been widely used in financial forecasting, bioinformatics, computer vision, image annotation, data mining and other fields in recent years. Rooted in statistical learning theory and optimization theory, the main idea of SVM classification is to construct a pair of parallel hyperplanes that maximize the minimum distance between the two classes of samples. Generally speaking, the optimal hyperplane is obtained by solving an optimization problem with inequality constraints. In order to avoid overfitting, SVM was extended to the soft-margin support vector machine (C-SVM) [5], in which slack variables are introduced to relax the constraints and a penalty term on the slack variables is added to the objective function. However, C-SVM generally adopts the hinge loss, which makes it very sensitive to noise. In subsequent research, C-SVM was extended to function estimation, and a support vector interpretation of ridge regression [6] was proposed; unlike C-SVM, it uses equality rather than inequality constraints. Similarly, Suykens [7] considered equality constraints in the least-squares sense and proposed the least squares support vector machine (LSSVM). Unlike C-SVM, which builds the classifier only from the support vectors, LSSVM makes full use of the information of all data points and penalizes them symmetrically with the $L_2$ loss. To further improve classification performance, heavier penalties need to be imposed on misclassified samples; for this reason, Ma et al. [8] adopted the asymmetric linear exponential (linex) loss and proposed Linex-SVM for the binary classification problem. Although SVM, C-SVM, LSSVM and Linex-SVM each have their own advantages, they all require solving a large-scale quadratic programming problem (QPP), which takes considerable training time and limits their suitability for practical problems.
Because all the above models need to solve one large QPP, to further improve computing speed, Jayadeva et al. [9] proposed the twin support vector machine (TSVM) for pattern classification, building on the generalized eigenvalue proximal support vector machine (GEPSVM). Since TSVM solves two smaller QPPs instead of a single large one, it can theoretically learn about four times faster than a standard SVM. The main goal of TSVM is to find two non-parallel hyperplanes, each of which is as close as possible to the samples of its own class while being as far as possible from the samples of the other class. Therefore, TSVM is more suitable for the classification of large-scale data.
It is well known that distance metrics play a crucial role in many machine learning algorithms. Although the above algorithms achieve good performance in pattern classification, most of them adopt the $L_2$-norm distance metric, whose squaring operation exaggerates the impact of outliers on model performance. To alleviate the effect of the $L_2$-norm distance metric on robustness, the $L_1$-norm distance metric, which has a bounded derivative, has received extensive attention in many fields of machine learning in recent years [10,11,12]. More recently, increasing attention has been paid to the capped $L_1$-norm, which remedies the unboundedness of the $L_1$-norm. In particular, Wang et al. [13] proposed a new robust TSVM (CTSVM) by applying the capped $L_1$-norm.
Inspired by the successful applications of the capped $L_1$-norm and the linex loss function [14,15,16,17], and noting that, to the best of our knowledge, the linex loss has not yet been extended to twin support vector machines, we establish a new robust twin support vector machine in this paper. The details and main contributions of this work are as follows:
(1) A novel robust loss function is designed, namely, the capped linex loss function $L_a^\varepsilon$.
(2) A novel robust twin support vector machine, namely the capped linex twin support vector machine (Linex-TSVM), is proposed.
(3) An efficient iterative algorithm is designed to solve Linex-TSVM, which is not only easy to implement, but also theoretically guarantees the existence of a reasonable optimal solution. We analyze the computational complexity of the algorithm and prove that the model satisfies the Bayes rule.
(4) Extensive experiments on multiple datasets demonstrate that the proposed Linex-TSVM is competitive with state-of-the-art methods in terms of robustness and feasibility, which makes it suitable for practical applications.
The rest of this article is organized as follows. In Section 2, we briefly review Linex-SVM and TSVM. In Section 3, we describe in detail the proposed capped linex loss function and Linex-TSVM, and give the relevant theoretical analysis. After the experimental results on multiple data sets are presented in Section 4, we conclude this paper in Section 5.

2. Related Work

In this section, we briefly review Linex-SVM and TSVM.

2.1. Linex-SVM

The linex loss function is a typical asymmetric loss function, defined as:
$$L_{linex}(x) = e^{ax} - ax - 1,$$
where $a \neq 0$ is a parameter. If $a < 0$, the left side of the linex loss is steeper than the right side, and the opposite holds when $a > 0$; the magnitude $|a|$ determines the degree of asymmetry, so the sign and magnitude of $a$ together determine the shape of the function. When $|a|$ is close to zero, the linex loss behaves approximately like a scaled squared loss. The linex loss function is not only asymmetric, but also convex and differentiable; thus, it is widely used in statistics.
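As a small numerical illustration (a Python sketch added for this presentation; the function name linex_loss and the sample values are ours, not part of the original derivation), the asymmetry can be checked directly:

import numpy as np

def linex_loss(x, a):
    # Linex loss: exp(a*x) - a*x - 1, with a != 0 controlling the asymmetry.
    return np.exp(a * x) - a * x - 1.0

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(linex_loss(x, a=1.0))   # a > 0: the right side grows much faster
print(linex_loss(x, a=-1.0))  # a < 0: the left side grows much faster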
For a binary classification problem in the n-dimensional Euclidean space, the training set can be expressed as
$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\},$$
where $x_i \in \mathbb{R}^n$ is the feature vector of the i-th sample and $y_i \in \{-1, +1\}$ is its label.
For the training set in Equation (2), by introducing the linex loss function, the Linex-SVM model can be written as the following convex optimization problem with equality constraints:
$$\min_{\omega, b, \xi}\; \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^m \left(e^{a\xi_i} - a\xi_i - 1\right), \quad \text{s.t.} \quad y_i(\omega^T x_i + b) - 1 = \xi_i, \quad i = 1, 2, \ldots, m,$$
where $\xi = (\xi_1, \xi_2, \ldots, \xi_m)^T$ is the slack vector, $C$ is a penalty parameter and $a$ is the parameter of the linex loss. The Nesterov accelerated gradient (NAG) method can be used to obtain the optimal solution $(\omega_1, b_1)$ and construct the decision function $f(x) = \mathrm{sgn}(\omega_1^T x + b_1)$.
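The NAG training loop is not spelled out in the text; the following Python sketch shows one plausible implementation, assuming the slack is eliminated through the equality constraint $\xi_i = y_i(\omega^T x_i + b) - 1$ (the step size, momentum and iteration count are arbitrary choices of ours, not values from the paper):

import numpy as np

def linex_svm_nag(X, y, C=1.0, a=1.0, lr=0.01, momentum=0.9, n_iter=500):
    # Minimize 0.5*||w||^2 + C * sum(exp(a*xi) - a*xi - 1) with
    # xi_i = y_i*(w^T x_i + b) - 1, using Nesterov accelerated gradient.
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    vw, vb = np.zeros(n), 0.0
    for _ in range(n_iter):
        w_la, b_la = w + momentum * vw, b + momentum * vb   # look-ahead point
        xi = y * (X @ w_la + b_la) - 1.0
        g = a * (np.exp(a * xi) - 1.0)                      # derivative of the linex loss w.r.t. xi
        grad_w = w_la + C * X.T @ (g * y)
        grad_b = C * np.sum(g * y)
        vw, vb = momentum * vw - lr * grad_w, momentum * vb - lr * grad_b
        w, b = w + vw, b + vb
    return w, b   # decision function: sign(w @ x + b)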

2.2. TSVM

The support vector machine is not well suited to large-scale data. To improve its practical applicability and further shorten the learning time, Jayadeva et al. proposed the twin support vector machine (TSVM) for pattern classification, building on the generalized eigenvalue proximal support vector machine (GEPSVM). The details are as follows:
Consider the binary classification problem in the n-dimensional Euclidean space $\mathbb{R}^n$ with training set $T = \{(x_i, y_i) \mid i = 1, 2, \ldots, m\}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$. Let $A \in \mathbb{R}^{m_1 \times n}$ collect all positive samples and $B \in \mathbb{R}^{m_2 \times n}$ collect all negative samples. TSVM seeks two non-parallel hyperplanes in the feature space:
$$f_1(x) = \omega_1^T x + b_1 = 0, \qquad f_2(x) = \omega_2^T x + b_2 = 0,$$
where $\omega_1, \omega_2 \in \mathbb{R}^n$ and $b_1, b_2 \in \mathbb{R}$.
The TSVM classifier is obtained by solving the following pair of QPPs:
$$\min_{\omega_1, b_1}\; \frac{1}{2}\|A\omega_1 + e_1 b_1\|_2^2 + C_1 e_2^T \xi_1, \quad \text{s.t.} \quad -(B\omega_1 + e_2 b_1) + \xi_1 \geq e_2, \; \xi_1 \geq 0,$$
$$\min_{\omega_2, b_2}\; \frac{1}{2}\|B\omega_2 + e_2 b_2\|_2^2 + C_2 e_1^T \xi_2, \quad \text{s.t.} \quad (A\omega_2 + e_1 b_2) + \xi_2 \geq e_1, \; \xi_2 \geq 0,$$
where $C_1, C_2 \geq 0$ are regularization parameters, $e_1$ and $e_2$ are vectors of ones of appropriate dimensions, and $\xi_1, \xi_2$ are slack vectors.
Then, the dual problems of TSVM are obtained by duality theory:
$$\min_\alpha\; \frac{1}{2}\alpha^T G(H^T H + \lambda I)^{-1} G^T \alpha - e_2^T \alpha, \quad \text{s.t.} \quad 0 \leq \alpha \leq C_1 e_2,$$
$$\min_\beta\; \frac{1}{2}\beta^T H(G^T G + \lambda I)^{-1} H^T \beta - e_1^T \beta, \quad \text{s.t.} \quad 0 \leq \beta \leq C_2 e_1,$$
where $\alpha \in \mathbb{R}^{m_2}$ and $\beta \in \mathbb{R}^{m_1}$ are Lagrange multiplier vectors, and the matrices $G$ and $H$ are defined as $G = [B \;\; e_2]$ and $H = [A \;\; e_1]$.
Furthermore, by introducing the kernel trick, TSVM can be extended to the nonlinear case. A sample $x$ is assigned to the positive or the negative class according to which of the two non-parallel hyperplanes it is closer to. The decision function is
$$f(x) = \arg\min_{k=1,2} \frac{|\omega_k^T x + b_k|}{\|\omega_k\|}.$$
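To make the pipeline of Equations (6)-(10) concrete, the following Python sketch solves the two dual box-constrained QPs with a simple projected gradient routine and applies the minimum-distance decision rule. The helper box_qp, its step size, and the small ridge term lam are our own simplifications for illustration, not the QP solver used in the paper.

import numpy as np

def box_qp(P, q, upper, lr=1e-3, n_iter=2000):
    # Projected gradient descent for: min 0.5*x^T P x - q^T x,  0 <= x <= upper.
    x = np.zeros_like(q)
    for _ in range(n_iter):
        x = np.clip(x - lr * (P @ x - q), 0.0, upper)
    return x

def tsvm_fit(A, B, C1=1.0, C2=1.0, lam=1e-4):
    # Linear TSVM sketch: A holds positive samples, B negative samples.
    e1, e2 = np.ones((A.shape[0], 1)), np.ones((B.shape[0], 1))
    H, G = np.hstack([A, e1]), np.hstack([B, e2])
    N1 = np.linalg.inv(H.T @ H + lam * np.eye(H.shape[1]))
    N2 = np.linalg.inv(G.T @ G + lam * np.eye(G.shape[1]))
    alpha = box_qp(G @ N1 @ G.T, np.ones(G.shape[0]), C1)   # box QP of Equation (8)
    beta = box_qp(H @ N2 @ H.T, np.ones(H.shape[0]), C2)    # box QP of Equation (9)
    u = -N1 @ G.T @ alpha    # augmented vector [w1; b1]
    v = N2 @ H.T @ beta      # augmented vector [w2; b2]
    return (u[:-1], u[-1]), (v[:-1], v[-1])

def tsvm_predict(x, planes):
    # Equation (10): assign x to the class of the nearer hyperplane.
    (w1, b1), (w2, b2) = planes
    d1 = abs(x @ w1 + b1) / np.linalg.norm(w1)
    d2 = abs(x @ w2 + b2) / np.linalg.norm(w2)
    return 1 if d1 <= d2 else -1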

3. Main Contribution

3.1. Capped Linex Loss Function

In this section, in order to minimize the influence of outliers on the classification results of the model, we propose a novel robust loss function, the capped linex loss function. The details are as follows:
Definition 1.
The capped linex loss function is defined as
$$L_a^\varepsilon(x) = \min\left(\sum_i \bigl(e^{a x_i} - a x_i - 1\bigr),\; \varepsilon\right),$$
where $a \neq 0$ is a parameter: when $a < 0$, the left side of the loss function is steeper than the right side, and when $a > 0$, the right side is steeper than the left side (see Figure 1). $\varepsilon > 0$ is a thresholding parameter, and $x_i$ is the i-th component of $x$.
Figure 1 compares the capped linex loss function with the linex loss function. Clearly, the capped linex loss is bounded above: as the error grows, even in the presence of outliers, the loss cannot exceed the threshold $\varepsilon$, which improves the robustness of the model.
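A minimal numerical check of this boundedness (the function name and the sample values below are ours):

import numpy as np

def capped_linex_loss(x, a, eps):
    # Capped linex loss of Definition 1: min(exp(a*x) - a*x - 1, eps).
    return np.minimum(np.exp(a * x) - a * x - 1.0, eps)

# A large (outlier-like) error contributes at most eps to the objective,
# whereas the uncapped linex loss keeps growing.
print(capped_linex_loss(np.array([0.5, 2.0, 10.0]), a=1.0, eps=3.0))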

3.2. Capped Linex Twin Support Vector Machine

The Linex-SVM model still has room for improvement: the linex loss is unbounded, growing without limit as the error increases. In practical applications, datasets are often contaminated by noise, and the unboundedness of the linex loss then degrades the overall performance of the model. In other words, Linex-SVM is relatively weak when the training set contains outliers. In addition, almost all instances in Linex-SVM contribute to the final optimal hyperplane, which greatly reduces the training speed.
In order to improve the classification performance of Linex-SVM, we first replace the linex loss with the capped linex loss and introduce regularization terms to enhance robustness. Second, we generalize Linex-SVM to the twin support vector machine setting, transforming one large QPP into two smaller QPPs to improve the training speed. Based on these two points, a new twin support vector machine model, named the capped linex twin support vector machine (Linex-TSVM), is obtained:
$$\min_{\omega_1, b_1}\; \sum_{i=1}^{m_1} \min\bigl(|\omega_1^T x_i + b_1|,\, \varepsilon_1\bigr) + C_1 \sum_{i=1}^{m_2} \min\bigl(e^{a\xi_i} - a\xi_i - 1,\, \varepsilon_2\bigr) + \frac{C_3}{2}\bigl(\|\omega_1\|_2^2 + b_1^2\bigr), \quad \text{s.t.} \quad -(B\omega_1 + e_2 b_1) + \xi \geq e_2,$$
$$\min_{\omega_2, b_2}\; \sum_{i=1}^{m_2} \min\bigl(|\omega_2^T x_i + b_2|,\, \varepsilon_3\bigr) + C_2 \sum_{i=1}^{m_1} \min\bigl(e^{a\eta_i} - a\eta_i - 1,\, \varepsilon_4\bigr) + \frac{C_4}{2}\bigl(\|\omega_2\|_2^2 + b_2^2\bigr), \quad \text{s.t.} \quad (A\omega_2 + e_1 b_2) + \eta \geq e_1,$$
where $C_1, C_2, C_3, C_4 \geq 0$ are penalty parameters, $e_1 \in \mathbb{R}^{m_1}$ and $e_2 \in \mathbb{R}^{m_2}$ are vectors of ones, and $\xi$ and $\eta$ are slack vectors.
In addition, we note that it is difficult to solve the problems in Equations (12) and (13) simply and quickly with traditional convex optimization methods. To simplify the original problems into approximate problems that are easier to solve, we use the re-weighting trick [12,18,19,20], whose key identity is $\|x\|_1 = \frac{x^T x}{|x|}$. Taking Equation (12) as an example, for the distance metric terms, letting $F = \frac{1}{|x|}$ gives $\|x\|_1 = x^T F x$. For the loss function terms, when $e^{a\xi_i} - a\xi_i - 1 \leq \varepsilon_2$, we have $\sum_{i=1}^{m_2}\min\bigl(e^{a\xi_i} - a\xi_i - 1, \varepsilon_2\bigr) = \sum_{i=1}^{m_2}\bigl(e^{a\xi_i} - a\xi_i - 1\bigr)$. Further, in order to recast $e^{a\xi} - a\xi - 1$ as an easier-to-handle quadratic form $\xi^T Q \xi$, we define $Q$ as a diagonal matrix whose i-th diagonal element is
$$q_i = \begin{cases} \dfrac{e^{a\xi_i} - a\xi_i - 1}{\xi_i^2}, & e^{a\xi_i} - a\xi_i - 1 \leq \varepsilon_2, \\[4pt] 0, & \text{otherwise}, \end{cases}$$
and, similarly,
$$u_i = \begin{cases} \dfrac{e^{a\eta_i} - a\eta_i - 1}{\eta_i^2}, & e^{a\eta_i} - a\eta_i - 1 \leq \varepsilon_4, \\[4pt] 0, & \text{otherwise}. \end{cases}$$
Based on the above discussion and calculation, we obtain the optimization problems in Equations (16) and (17):
$$\min_{\omega_1, b_1}\; (A\omega_1 + e_1 b_1)^T F (A\omega_1 + e_1 b_1) + \frac{1}{2} C_1 \xi^T Q \xi + \frac{C_3}{2}\bigl(\|\omega_1\|_2^2 + b_1^2\bigr), \quad \text{s.t.} \quad -(B\omega_1 + e_2 b_1) + \xi \geq e_2,$$
$$\min_{\omega_2, b_2}\; (B\omega_2 + e_2 b_2)^T K (B\omega_2 + e_2 b_2) + \frac{1}{2} C_2 \eta^T U \eta + \frac{C_4}{2}\bigl(\|\omega_2\|_2^2 + b_2^2\bigr), \quad \text{s.t.} \quad (A\omega_2 + e_1 b_2) + \eta \geq e_1,$$
where $e_1 \in \mathbb{R}^{m_1}$ and $e_2 \in \mathbb{R}^{m_2}$ are vectors of ones, and $F$ and $K$ are two further diagonal matrices with entries
$$f_i = \begin{cases} \dfrac{1}{|\omega_1^T x_i + b_1|}, & |\omega_1^T x_i + b_1| \leq \varepsilon_1, \\[4pt] 0, & \text{otherwise}, \end{cases} \qquad k_i = \begin{cases} \dfrac{1}{|\omega_2^T x_i + b_2|}, & |\omega_2^T x_i + b_2| \leq \varepsilon_3, \\[4pt] 0, & \text{otherwise}. \end{cases}$$
Remark 1.
More specifically, in the objective functions of Equations (16) and (17), the diagonal matrices $F$, $Q$ and $K$, $U$ are used to reduce the influence of outliers and abnormal noise on the model. If a point of a given class lies far away from its own hyperplane, it can be treated as noise and effectively removed. The elements of the diagonal matrices are set according to the distance from the data point $x_i$ to the hyperplane: for $F$, if $|\omega_1^T x_i + b_1|$ exceeds $\varepsilon_1$, the corresponding $f_i$ is set to a small value (Smallval) that is almost equivalent to 0. Here, 'Smallval' is a small constant, which is set to $10^{-8}$ in the experiments.
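The following Python sketch shows one way the diagonal weights of Equations (14), (15), (18) and (19) can be computed, including the 'Smallval' safeguard of Remark 1; the guard against division by zero via np.maximum is our own addition, not a detail from the paper.

import numpy as np

SMALLVAL = 1e-8  # small constant used in place of exact zeros (Remark 1)

def loss_weights(xi, a, eps):
    # Diagonal entries q_i (or u_i) of Equations (14)/(15):
    # (exp(a*xi_i) - a*xi_i - 1) / xi_i^2 while the loss is below the cap.
    loss = np.exp(a * xi) - a * xi - 1.0
    w = np.where(loss <= eps, loss / np.maximum(xi ** 2, SMALLVAL), SMALLVAL)
    return np.diag(w)

def distance_weights(w, b, X, eps):
    # Diagonal entries f_i (or k_i) of Equations (18)/(19):
    # 1 / |w^T x_i + b| while the distance is below the cap.
    d = np.abs(X @ w + b)
    f = np.where(d <= eps, 1.0 / np.maximum(d, SMALLVAL), SMALLVAL)
    return np.diag(f)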
The Lagrangian of the optimization problem in Equation (16) can be written as
$$L(\omega_1, b_1, \xi, \alpha) = \frac{1}{2}(A\omega_1 + e_1 b_1)^T F (A\omega_1 + e_1 b_1) + \frac{1}{2} C_1 \xi^T Q \xi + \frac{C_3}{2}\bigl(\|\omega_1\|_2^2 + b_1^2\bigr) - \alpha^T\bigl(-(B\omega_1 + e_2 b_1) + \xi - e_2\bigr),$$
where $\alpha$ is the vector of Lagrange multipliers. Differentiating the Lagrangian with respect to $\omega_1$, $b_1$ and $\xi$ yields the following Karush–Kuhn–Tucker (KKT) conditions:
$$\begin{aligned} \frac{\partial L}{\partial \omega_1} &= A^T F (A\omega_1 + e_1 b_1) + B^T \alpha + C_3 \omega_1 = 0, &\quad (i)\\ \frac{\partial L}{\partial b_1} &= e_1^T F (A\omega_1 + e_1 b_1) + e_2^T \alpha + C_3 b_1 = 0, &\quad (ii)\\ \frac{\partial L}{\partial \xi} &= C_1 Q \xi - \alpha = 0, &\quad (iii)\\ & \quad \alpha^T\bigl(-(B\omega_1 + e_2 b_1) + \xi - e_2\bigr) = 0, &\quad (iv)\\ & \quad \alpha \geq 0. &\quad (v) \end{aligned}$$
Through the KKT conditions, the dual problem of Equation (12) is obtained as follows:
$$\min_\alpha\; \frac{1}{2}\alpha^T \left(E(H^T F H + C_3 I)^{-1} E^T + \frac{1}{C_1} Q^{-1}\right)\alpha - e_2^T \alpha, \quad \text{s.t.} \quad 0 \leq \alpha \leq C_1 e_2.$$
Similarly, the dual problem of Equation (13) is:
$$\min_\beta\; \frac{1}{2}\beta^T \left(H(E^T K E + C_4 I)^{-1} H^T + \frac{1}{C_2} U^{-1}\right)\beta - e_1^T \beta, \quad \text{s.t.} \quad 0 \leq \beta \leq C_2 e_1,$$
where $\beta$ is the vector of Lagrange multipliers, and
$$H = [A \;\; e_1], \qquad E = [B \;\; e_2].$$
Thus, we obtain the augmented solution vectors $Z_1 = [\omega_1, b_1]^T$ and $Z_2 = [\omega_2, b_2]^T$, and a new data point $x \in \mathbb{R}^n$ is assigned to the positive or the negative class according to which of the two hyperplanes it is closer to.
Based on the above discussion, the resulting procedure is summarized in Algorithm 1.
Algorithm 1 Iterative algorithm for solving Linex-TSVM
Input: Training data $A \in \mathbb{R}^{m_1 \times n}$ and $B \in \mathbb{R}^{m_2 \times n}$; parameters $C_i$ ($i = 1, 2, 3, 4$) and $\varepsilon_i$ ($i = 1, 2, 3, 4$). Construct the matrices $H = [A \;\; e_1]$ and $E = [B \;\; e_2]$.
Step 1: Initialize $F \in \mathbb{R}^{m_1 \times m_1}$ and $K \in \mathbb{R}^{m_2 \times m_2}$. Let $k = 0$.
Step 2: With $\alpha$ and $\beta$ obtained from the corresponding dual problems, compute
$([\omega_1, b_1]^T)^{k+1} = -(H^T F H + C_3 I)^{-1} E^T \alpha$,
$([\omega_2, b_2]^T)^{k+1} = (E^T K E + C_4 I)^{-1} H^T \beta$.
Step 3: Update the matrices $Q$, $U$, $F$ and $K$ by Equations (14), (15), (18) and (19).
Step 4: Let $k = k + 1$ and go to Step 2, until convergence.
Output: Optimal solutions $[\omega_1, b_1]^T$ and $[\omega_2, b_2]^T$.
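For concreteness, a Python sketch of Algorithm 1 is given below. It reuses box_qp, loss_weights and distance_weights from the earlier sketches; the identity initialisation of the weight matrices and the recovery of the slacks from the constraint violations are our assumptions, not details specified in the paper.

import numpy as np

def linex_tsvm_fit(A, B, C=(1.0, 1.0, 1.0, 1.0), a=1.0,
                   eps=(1e-5, 1e-5, 1e-5, 1e-5), n_iter=50, tol=1e-3):
    C1, C2, C3, C4 = C
    e1, e2 = np.ones((A.shape[0], 1)), np.ones((B.shape[0], 1))
    H, E = np.hstack([A, e1]), np.hstack([B, e2])
    m1, m2, d = A.shape[0], B.shape[0], H.shape[1]
    F, K = np.eye(m1), np.eye(m2)          # Step 1: initialise the weight matrices
    Q, U = np.eye(m2), np.eye(m1)
    z1, z2 = np.zeros(d), np.zeros(d)
    for _ in range(n_iter):
        # Step 2: solve the two dual box-constrained QPs and update the planes.
        N1 = np.linalg.inv(H.T @ F @ H + C3 * np.eye(d))
        N2 = np.linalg.inv(E.T @ K @ E + C4 * np.eye(d))
        alpha = box_qp(E @ N1 @ E.T + np.linalg.inv(Q) / C1, np.ones(m2), C1)
        beta = box_qp(H @ N2 @ H.T + np.linalg.inv(U) / C2, np.ones(m1), C2)
        z1_new, z2_new = -N1 @ E.T @ alpha, N2 @ H.T @ beta
        # Step 3: recover slacks from the constraint violations, then update Q, U, F, K.
        xi = np.maximum(0.0, 1.0 + B @ z1_new[:-1] + z1_new[-1])
        eta = np.maximum(0.0, 1.0 - (A @ z2_new[:-1] + z2_new[-1]))
        Q, U = loss_weights(xi, a, eps[1]), loss_weights(eta, a, eps[3])
        F = distance_weights(z1_new[:-1], z1_new[-1], A, eps[0])
        K = distance_weights(z2_new[:-1], z2_new[-1], B, eps[2])
        done = np.linalg.norm(z1_new - z1) + np.linalg.norm(z2_new - z2) < tol
        z1, z2 = z1_new, z2_new
        if done:
            break
    return (z1[:-1], z1[-1]), (z2[:-1], z2[-1])   # (w1, b1), (w2, b2)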

3.3. Bayes Rule

We now prove that the model proposed in this paper satisfies the Bayes rule. Assume that the samples $(x_i, y_i)$ are drawn independently from the same probability distribution $\phi$ defined on $X \times Y$, where $X \subseteq \mathbb{R}^n$ and $Y = \{-1, 1\}$. Further, assume that the conditional distribution $\phi(y|x)$ is a binomial distribution determined by $\phi(1|x)$ and $\phi(-1|x)$. As is well known, the ultimate goal of a classification problem is to obtain a classifier $C: X \to Y$ with small error. The Bayes classifier [8] is defined as the classifier with the lowest probability of classification error among all classifiers:
$$f_C(x) = \begin{cases} 1, & \text{if } \phi(y = 1 \mid x) \geq \phi(y = -1 \mid x), \\ -1, & \text{if } \phi(y = 1 \mid x) < \phi(y = -1 \mid x). \end{cases}$$
For any loss function $L$, the expected risk of a classifier $f: X \to \mathbb{R}$ can be defined as
$$R_{L,\phi} = \int_{X \times Y} L\bigl(1 - y f(x)\bigr)\, d\phi.$$
Next, by minimizing the expected risk over all measurable classification functions, we obtain
$$f_{L,\phi}(x) = \arg\min_{\tau \in \mathbb{R}} \int_Y L\bigl(1 - y\tau\bigr)\, d\phi(y \mid x), \quad x \in X.$$
Based on the above definitions related to the Bayes rule, we obtain Theorem 1, which shows that the Bayes rule holds for the capped linex loss function. The details of the proof are as follows.
Theorem 1.
The function $f_{L_a^\varepsilon,\phi}$ that minimizes the expected risk over all measurable functions $f: X \to Y$ coincides with the Bayes classifier, that is, $f_{L_a^\varepsilon,\phi}(x) = f_C(x)$ for all $x \in X$.
Proof. 
By the properties of the capped linex loss function, when $e^{ax} - ax - 1 < \varepsilon$, we have
$$L_a^\varepsilon(x) = e^{ax} - ax - 1.$$
Then,
$$\int_Y L_a^\varepsilon(1 - y\tau)\, d\phi(y \mid x) = L_a^\varepsilon(1-\tau)\,\phi(y=1 \mid x) + L_a^\varepsilon(1+\tau)\,\phi(y=-1 \mid x) = \bigl(e^{a(1-\tau)} - a(1-\tau) - 1\bigr)\phi(y=1 \mid x) + \bigl(e^{a(1+\tau)} - a(1+\tau) - 1\bigr)\phi(y=-1 \mid x).$$
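For readability, we spell out the comparison that underlies the next step by evaluating this expression at the two candidate values $\tau = \pm 1$ (this short derivation is our addition):
$$\left.\int_Y L_a^\varepsilon(1-y\tau)\,d\phi(y\mid x)\right|_{\tau=1} = \bigl(e^{2a}-2a-1\bigr)\,\phi(y=-1\mid x), \qquad \left.\int_Y L_a^\varepsilon(1-y\tau)\,d\phi(y\mid x)\right|_{\tau=-1} = \bigl(e^{2a}-2a-1\bigr)\,\phi(y=1\mid x),$$
since $L_a^\varepsilon(0) = 0$ and, as long as $e^{2a} - 2a - 1 < \varepsilon$, $L_a^\varepsilon(2) = e^{2a} - 2a - 1 > 0$ for $a \neq 0$. Hence the value at $\tau = 1$ is the smaller of the two exactly when $\phi(y=1\mid x) \geq \phi(y=-1\mid x)$.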
By Equation (29), when $\phi(y=1 \mid x) \geq \phi(y=-1 \mid x)$ the minimum over $\tau \in \{-1, 1\}$ is attained at $\tau = 1$, when $\phi(y=1 \mid x) < \phi(y=-1 \mid x)$ it is attained at $\tau = -1$, and when $\phi(y=1 \mid x) = \phi(y=-1 \mid x)$ it is attained at either value. Therefore, when $e^{ax} - ax - 1 < \varepsilon$, minimizing the expected risk under the capped linex loss yields
$$f_{L_a^\varepsilon,\phi}(x) = \begin{cases} 1, & \text{if } \phi(y = 1 \mid x) \geq \phi(y = -1 \mid x), \\ -1, & \text{if } \phi(y = 1 \mid x) < \phi(y = -1 \mid x), \end{cases}$$
i.e., $f_{L_a^\varepsilon,\phi}(x) = f_C(x)$. □

3.4. Computational Complexity Analysis

This part analyzes the computational complexity of Algorithm 1, which is determined by the number of iterations and the cost per iteration. The cost of one iteration of Algorithm 1 consists of two parts: solving the dual QPPs, whose time complexity is at most about $\frac{m^3}{4}$, and the matrix inversions, whose cost is at most about $(n+1)^3$. Therefore, the total time complexity of solving Linex-TSVM is roughly $O\bigl(t \cdot (\frac{m^3}{4} + (n+1)^3)\bigr)$, where $t$ is the number of iterations; the experimental results in this paper show that $t = 50$ is sufficient. In general, the number of iterations is much smaller than the number of samples, so, like TSVM, Linex-TSVM has cubic time complexity in the number of samples.

4. Experimental Results and Discussions

In this section, we first describe the experimental setup in Section 4.1; then, in Sections 4.2 and 4.3, we report in detail the experimental results of Linex-TSVM without and with noise, respectively. Finally, we present results on several datasets in Section 4.4 to illustrate the convergence of the objective function, and a statistical analysis is given in Section 4.5.

4.1. Experimental Setup

4.1.1. Evaluation Criteria

In order to evaluate the classification performance of our proposed capped linex loss twin support vector machine more accurately, we compare it with other mature methods, including SVM, LSSVM, C-SVM, NPSVM, Linex-SVM and TBSVM. For these algorithms and Linex-TSVM, the iterative process is stopped when the difference between the objective values of two consecutive iterations is less than 0.001 or the number of iterations reaches 50. At the same time, to measure the performance of all algorithms, the standard classification accuracy (ACC) is used, which is defined as follows:
ACC = TP + TN TP + FN + TN + FP ,
Here, TP and TN denote the numbers of correctly classified positive and negative samples, respectively, while FN and FP denote the numbers of misclassified positive and negative samples, respectively. For a fair comparison, we use the quadratic programming (QP) toolbox of MATLAB to solve the QP problems in the related algorithms. The experimental environment is a Windows 10 machine with an Intel i7 processor (3.70 GHz) and 8 GB of RAM.

4.1.2. Parameters Selection

For a learning algorithm, performance is usually quite sensitive to the parameters involved, so the parameters of each algorithm are recorded in detail and listed as follows.
  • SVM and LSSVM: the kernel parameter $\sigma$.
  • C-SVM: the regularization parameter $c$, the kernel parameter $\sigma$.
  • NPSVM and TBSVM: the regularization parameters $c_1$, $c_2$, $c_3$ and $c_4$, the kernel parameter $\sigma$.
  • Linex-SVM: the regularization parameter $c$, the linex loss parameter $a$, the kernel parameter $\sigma$.
  • Linex-TSVM: the regularization parameters $c_1$, $c_2$, $c_3$, $c_4$, the linex loss parameter $a$, the thresholding parameters $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$, $\varepsilon_4$ and the kernel parameter $\sigma$.
    Here, $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = \varepsilon_4 = 10^{-5}$; $c_1, c_2, c_3, c_4 \in \{10^i \mid i = -5, -4, \ldots, 4, 5\}$; $\sigma, \varepsilon \in \{10^i \mid i = -4, -3, \ldots, 3, 4\}$. The parameters are selected by ten-fold cross-validation, and the reported test accuracy is the average over the 10 folds on each dataset.

4.1.3. Description of the Datasets

To verify the effectiveness of Linex-TSVM, we conduct numerical simulations on different datasets, including eight benchmark datasets from the UCI machine learning repository and two artificial datasets. The datasets are described as follows:
Artificial datasets: In artificial datasets (a) and (b), there are 50 positive samples and 50 negative samples, represented by ‘+’, ‘☐’ and ‘◯’, respectively, as shown in Figure 2. Since outliers affect classification performance, robustness to them is a standard way to measure the stability of an algorithm; we therefore introduce four outliers into the artificial datasets to evaluate robustness, two belonging to class +1 and two to class −1.
UCI datasets: Australian, Spect, Pima, German, Vote, CMC, Sonar and the large dataset codrna. Details of the eight UCI datasets are given in Table 1. These UCI datasets are used to test the performance of our algorithm and the related algorithms.
Each dataset is divided into ten subsets, nine for training and one for testing, i.e., 10-fold cross-validation; the process is repeated ten times, and the average of the results is taken as the measure of model performance. At the same time, we normalize all participating datasets, which avoids errors caused by different orders of magnitude and units and keeps the values within [0, 1].
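A Python sketch of this protocol is given below; the min-max normalisation formula, the random fold assignment and the fit/predict interfaces are our assumptions about the standard procedure, not a description of the exact scripts used in the paper.

import numpy as np

def minmax_normalise(X):
    # Scale every feature to [0, 1].
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.maximum(hi - lo, 1e-12)

def ten_fold_accuracy(X, y, fit, predict, seed=0):
    # Plain 10-fold cross-validation; returns the average test accuracy.
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), 10)
    accs = []
    for k in range(10):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(10) if j != k])
        model = fit(X[train], y[train])
        preds = np.array([predict(x, model) for x in X[test]])
        accs.append(np.mean(preds == y[test]))
    return float(np.mean(accs))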

4.2. Experimental Results on the Employed Datasets without Outliers

Eight UCI datasets are selected, and the results are compared with those of the other six algorithms to verify the classification performance of the proposed method. All experimental results presented in Table 2 are based on the optimal parameters. Here, "Time(s)" denotes the average runtime in seconds of each algorithm with its optimal parameters, and "ACC ± S" denotes the average classification accuracy plus or minus the standard deviation.
Intuitively, it can be observed from Table 2 that the classification performance of the proposed capped linex loss twin support vector machine is better than that of the other six models: except on the CMC dataset, Linex-TSVM obtains the best results. At the same time, we also observe that the computing time of this model is not the shortest, because the model is more complex. LSSVM, which only solves a system of linear equations, is faster and, compared with SVM, shortens the time while retaining accuracy, which is in line with the relevant theory. It is worth mentioning that Linex-SVM still performs well, which shows that the introduction of the linex loss function is meaningful.
From the detailed analysis of the above experimental results, we can draw an objective and reasonable conclusion: using the capped linex loss function on the basis of TBSVM improves the classification performance, and introducing the $L_1$-norm distance metric further enhances the robustness of the model; thus, our model is an effective supervised algorithm even without the influence of outliers.

4.3. Experimental Results on the Employed Datasets with Outliers

4.3.1. Experimental Results on Artificial Dataset with Outliers

It is well known that outliers tend to affect classification performance, and robustness to them is a measure of the stability of an algorithm. Therefore, we introduce outliers into artificial datasets (a) and (b), respectively, as visualized in Figure 2. To further verify the robustness of the capped linex loss function, Figure 3 shows the classification accuracy of our algorithm on artificial datasets (a) and (b), compared with the other five algorithms.
From Figure 3, we observe that the proposed Linex-TSVM achieves higher accuracy when outliers are present: on artificial datasets (a) and (b), the classification accuracy of Linex-TSVM is 68.06% and 91.97%, respectively, better than the other five algorithms, indicating that it handles outliers well and has stronger robustness and better classification ability.
In summary, the capped $L_1$-norm is robust to different types of outliers [21,22,23,24,25]; it limits the residual contributed by outliers in the experiments and helps the model eliminate their influence. In particular, the capped linex loss function in this model penalizes misclassified samples more heavily while bounding the loss contributed by outliers. In a word, Linex-TSVM effectively improves the robustness of TBSVM.

4.3.2. Experimental Results on UCI Dataset with Outliers

In order to verify that the model is also suitable for larger datasets with outliers, we add 10% and 25% noise to the eight UCI datasets, respectively. Noise is introduced because, in practical applications, data come in many forms and inevitably contain different degrees of noise; to verify that the model suits datasets of different fields and sizes, it is necessary to compare the models under different noise levels. We find that, after adding noise, the accuracy fluctuates to a certain extent with a slow overall decline, which shows that heavy noise does affect the model, but the model proposed in this paper remains relatively stable. The results in Table 3 and Table 4 show that, after the introduction of outliers, all seven algorithms exhibit accuracy fluctuations of varying degrees and an overall downward trend, while the classification accuracy of Linex-TSVM is almost always better than that of the other algorithms. This shows that the proposed model has stronger robustness.
Specifically, in Table 3 and Table 4, Linex-TSVM achieves the best accuracy on seven of the eight datasets, while the least squares support vector machine has the shortest computing time under both noise levels. It is worth noting that, compared with SVM, LSSVM, NPSVM, C-SVM and Linex-SVM, the capped linex loss function imposes heavier penalties on misclassified samples while bounding the influence of outliers, which leads to better classification accuracy; Linex-TSVM also outperforms Linex-SVM and TBSVM.
Furthermore, in order to analyze the robustness of the algorithms under different noise levels more comprehensively, we carry out additional experiments on Australian, Spect, Pima, German, Vote, CMC, Sonar and codrna, testing the performance of the seven algorithms under different noise factors. For an original dataset $X$, we replace it with $X + \lambda \bar{X}$, where $\lambda = q\,\|X\|_F / \|\bar{X}\|_F$ and $q$ is a noise factor. Here, $\bar{X}$ is a noise matrix whose elements are i.i.d. standard Gaussian variables, and $q \in \{0.1, 0.2, 0.3, 0.4\}$. From Figure 4, we observe that, under different noise factors, Linex-TSVM shows better classification accuracy and stability, while the other six models fluctuate more.
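A sketch of this perturbation in Python (the seed handling is our own choice):

import numpy as np

def add_gaussian_noise(X, q, seed=0):
    # X + lambda * Xbar with lambda = q * ||X||_F / ||Xbar||_F,
    # where Xbar has i.i.d. standard Gaussian entries and q is the noise factor.
    rng = np.random.default_rng(seed)
    Xbar = rng.standard_normal(X.shape)
    lam = q * np.linalg.norm(X, 'fro') / np.linalg.norm(Xbar, 'fro')
    return X + lam * Xbar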
Next, we use box plots to confirm from another perspective that the model performs better. In Figure 5, six datasets are selected for analysis. The height of a box reflects the spread of the data, i.e., the fluctuation of the classification accuracy; the upper and lower whiskers represent the maximum and minimum values of the group, and points outside the box can be understood as "outliers" of the accuracy distribution. We can directly observe that the classification accuracy of Linex-TSVM is higher than that of the other models.
To sum up, the capped linex loss twin support vector machine proposed in this paper is superior to the other six algorithms in terms of classification accuracy and robustness, indicating that Linex-TSVM is a robust learning algorithm for large-scale data classification in the presence of noise.

4.4. Analysis for the Convergence

In this section, we show the convergence curves of the proposed algorithm on four datasets to verify directly that it converges at the desired speed. The results are shown in Figure 6, where the horizontal axis represents the number of iterations and the vertical axis the value of the objective function. The iterative process is stopped when the difference between the objective values of two consecutive iterations is less than 0.001 or the number of iterations reaches 50.
Figure 6 shows that the objective value of Linex-TSVM decreases monotonically as the number of iterations increases, and the algorithm converges quickly, in about 5 iterations, i.e., within a limited number of iterations we obtain satisfactory results, which is consistent with the previous theoretical analysis.

4.5. Statistical Analysis

In this section, the Friedman test is used to statistically compare the differences among the algorithms involved. The Friedman test is a statistical test of the homogeneity of multiple (related) samples; it makes full use of all the information in the original data and has many advantages. The null hypothesis is that all algorithms perform equally well; when it is rejected, we can perform the Nemenyi post hoc test [26]. The average rankings and accuracies of the algorithms on the UCI datasets are shown in Table 5.
Next, we take the eight UCI datasets with 10% Gaussian noise as an example to compare the seven algorithms. The Friedman statistic is computed as
$$\chi_F^2 = \frac{12N}{k(k+1)}\left[\sum_i R_i^2 - \frac{k(k+1)^2}{4}\right] = 30.18,$$
where $k$ is the number of algorithms and $N$ is the number of UCI datasets; in our case, $k = 7$ and $N = 8$, and $R_i$ denotes the average rank of the $i$-th algorithm over the eight UCI datasets. In addition, based on the $\chi_F^2$ distribution with $k - 1$ degrees of freedom, we can obtain
$$F_F = \frac{(N-1)\chi_F^2}{N(k-1) - \chi_F^2} = 11.86,$$
where $F_F$ follows the F-distribution with $(k-1)$ and $(k-1)(N-1)$ degrees of freedom. Choosing $\alpha = 0.05$, we have $F_\alpha(6, 42) = 2.34$; obviously $F_F > F_\alpha$, so the null hypothesis is rejected.
Intuitively, from Table 5, we observe that Linex-TSVM has better classification performance, which means that our algorithm is more effective.
Next, through the Nemenyi post hoc test, we further compare the algorithms: if the difference between the average ranks of two algorithms is greater than the critical difference, their performance is considered significantly different. Dividing the Studentized range statistic by $\sqrt{2}$ gives $q_\alpha = 2.95$. Therefore, we calculate the critical difference (CD) as
$$CD = q_{\alpha=0.05}\sqrt{\frac{k(k+1)}{6N}} = 3.18.$$
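The statistics above can be reproduced (up to rounding) from the average ranks in Table 5; the following Python sketch, with the 10% noise ranks hard-coded from that table, is our illustration rather than the original analysis script.

import numpy as np

def friedman_and_cd(avg_ranks, N, q_alpha=2.95):
    # avg_ranks: average rank of each of the k algorithms over N datasets.
    R = np.asarray(avg_ranks, dtype=float)
    k = len(R)
    chi2 = 12.0 * N / (k * (k + 1)) * (np.sum(R ** 2) - k * (k + 1) ** 2 / 4.0)
    ff = (N - 1) * chi2 / (N * (k - 1) - chi2)
    cd = q_alpha * np.sqrt(k * (k + 1) / (6.0 * N))
    return chi2, ff, cd

# Average ranks of the seven algorithms under 10% noise (Table 5), k = 7, N = 8;
# this prints values close to (30.18, 11.86, 3.18).
print(friedman_and_cd([6.38, 6.25, 4.13, 3.88, 3.50, 1.94, 1.31], N=8))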
Based on Figure 7, the performance of Linex-TSVM is significantly better than that of SVM, LSSVM, C-SVM, Linex-SVM and TBSVM, but the difference between Linex-SVM and TBSVM is not obvious, because it is smaller than the calculated CD value. Through the above analysis, the Linex-TSVM proposed in this paper shows better performance.

5. Conclusions

Twin support vector machine classification has become a research hotspot, and twin support vector machine models based on different loss functions, such as TPMSVM, TWSVM and SG-TSVM, have been proposed. Designing loss functions with better performance within the support vector machine framework remains an important task. The work of this paper is summarized as follows:
First, this paper proposes the capped linex loss function, applies it to the twin support vector machine, and thereby obtains a new robust classification model, the capped linex twin support vector machine. Compared with the linex loss support vector machine proposed by Ma et al. [8], it has better classification performance. Second, we give an efficient iterative algorithm to solve Linex-TSVM; unlike SVM, which needs to solve one large QP problem, this algorithm solves a pair of smaller QP problems. Finally, we rigorously analyze the computational complexity of the algorithm and verify that Linex-TSVM satisfies the Bayes rule. Experimental results on multiple datasets demonstrate that Linex-TSVM is more feasible and robust than the other models in dealing with large-scale datasets containing outliers, and they intuitively show the convergence of the algorithm. In particular, in the absence of noise, the average accuracy of Linex-TSVM is higher than that of SVM, LSSVM, C-SVM, NPSVM, Linex-SVM and TBSVM by 4.36%, 4.29%, 2.53%, 2.33%, 1.91% and 0.77%, respectively; Linex-TSVM is more robust and stable with respect to outliers.
The focus of future work is to find better models that improve classification results on different data, shorten the computing time while maintaining accuracy, and extend the model of this paper to other settings such as multi-class classification. We can further consider applying the models to practical problems such as face recognition, fingerprint recognition and UAV scheduling. Of course, developing improved algorithms on the basis of Linex-TSVM is also important.

Author Contributions

Y.W., methodology, software, validation, formal analysis, investigation, data curation, writing—original draft. G.Y., conceptualization, methodology, validation, investigation, project administration, writing—original draft. J.M., project administration, writing—original draft. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (No. 11861002, 61907012) and Natural Science Foundation of Ningxia Provincial of China (No. 2022A0950). This research was also funded by the Young Talent Cultivation Project of North Minzu University (No. 2021KYQD23) and the Fundamental Research Funds for the Central Universities (No. 2022XYZSX03).


Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Vapnik, V.N. Statistical Learning Theory; Wiley: New York, NY, USA, 1998. [Google Scholar]
  2. Brown, M.P.; Grundy, W.N.; Lin, D.; Cristianini, N.; Sugnet, C.W.; Furey, T.S.; Haussler, D. Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc. Natl. Acad. Sci. USA 2000, 97, 262–267. [Google Scholar] [CrossRef] [PubMed]
  3. Ma, S.; Cheng, B.; Shang, Z.; Liu, G. Scattering transform and LSPTSVM based fault diagnosis of rotating machinery. Mech. Syst. Signal Process. 2018, 104, 55–170. [Google Scholar] [CrossRef]
  4. Goh, K.S.; Chang, E.Y.; Li, B. Using one-class and two-class SVMs for multiclass image annotation. IEEE Trans. Knowl. Data Eng. 2005, 17, 1333–1346. [Google Scholar] [CrossRef]
  5. Bi, J.; Zhang, T. Support vector classification with input data uncertainty. In Proceedings of the Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 13–18 December 2004. [Google Scholar]
  6. Saunders, C.; Gammerman, A.; Vovk, V. Ridge regression learning algorithm in dual variables. In Proceedings of the 15th International Conference on Machine Learning, ICML’98, Madison, WI, USA, 24–27 July 1998. [Google Scholar]
  7. Suykens, J.A.K.; Vewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 1999, 9, 293–300. [Google Scholar] [CrossRef]
  8. Ma, Y.; Zhang, Q.; Li, D.; Tian, Y. Linex support vector machine for large-scale classification. IEEE Access 2019, 7, 70319–70331. [Google Scholar] [CrossRef]
  9. Jayadeva; Khemchandani, R.; Chandra, S. Twin support vector machines for pattern classification. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 905–910. [Google Scholar]
  10. Gao, S.; Ye, Q.; Ye, N. 1-Norm least squares twin support vector machines. Neurocomputing 2011, 74, 3590–3597. [Google Scholar] [CrossRef]
  11. Ye, Q.; Zhao, H.; Li, Z.; Yang, X.; Gao, S.; Yin, T.; Ye, N. L1-Norm distance minimization-based fast robust twin support vector k-plane clustering. IEEE Trans. Neural Netw. Learn. Syst. 2017, 29, 4494–4503. [Google Scholar] [CrossRef] [PubMed]
  12. Yan, H.; Ye, Q.; Zhang, T.A.; Yu, D.J.; Yuan, X.; Xu, Y.; Fu, L. Least squares twin bounded support vector machines based on L1-norm distance metric for classification. Pattern Recognit. 2018, 74, 434–447. [Google Scholar] [CrossRef]
  13. Wu, M.J.; Liu, J.X.; Gao, Y.L.; Kong, X.Z.; Feng, C.M. Feature selection and clustering via robust graph-laplacian PCA based on capped L1-norm. In Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, MO, USA, 13–16 November 2017; pp. 1741–1745. [Google Scholar]
  14. Zhang, C.; Wang, Z. Linex-RSVM:ramp linex support vector machine. Procedia Comput. Sci. 2022, 199, 524–531. [Google Scholar] [CrossRef]
  15. Kinyanjui, J.K.; Korir, B.C. Bayesian Estimation of Parameters of Weibull Distribution Using Linex Error Loss Function. Int. J. Stat. Probab. 2020, 9, 1–38. [Google Scholar] [CrossRef] [Green Version]
  16. Zou, G. Admissible estimation for finite population under the LINEX loss function. J. Stat. Plan. Inference 1997, 61, 373–384. [Google Scholar] [CrossRef]
  17. Hwang, L.C. Second order optimal approximation in a particular exponential family under asymmetric linex loss. Stat. Probab. Lett. 2018, 137, 283–291. [Google Scholar] [CrossRef]
  18. Wang, C.; Ye, Q.; Luo, P.; Ye, N.; Fu, L. Robust capped L1-norm twin support vector machine. Neural Netw. 2019, 114, 47–59. [Google Scholar] [CrossRef] [PubMed]
  19. Zhang, L.; Luo, M.; Li, Z.; Nie, F.; Zhang, H.; Liu, J.; Zheng, Q. Large-scale robust semisupervised classification. IEEE Trans. Cybern. 2018, 49, 907–917. [Google Scholar] [CrossRef] [PubMed]
  20. Nie, F.; Huang, Y.; Wang, X.; Huang, H. New primal SVM solver with linear computational cost for big data classifications. In Proceedings of the 31st International Conference on International Conference on Machine Learning, Bejing, China, 22–24 June 2014; Volume 32, pp. 505–513. [Google Scholar]
  21. Nie, F.; Huo, Z.; Huang, H. Joint capped norms minimization for robust matrix recovery. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017. [Google Scholar]
  22. Zhao, M.; Chow, T.W.; Zhang, H.; Li, Y. Rolling fault diagnosis via robust semi-supervised model with capped L2,1-norm regularization. In Proceedings of the IEEE International Conference on Industrial Technology, Toronto, ON, Canada, 22–25 March 2017; pp. 1064–1069. [Google Scholar]
  23. Nie, F.; Wang, X.; Huang, H. Multiclass capped Lp-norm SVM for robust classifications. In Proceedings of the 32th AAAI Conference on Artificial Intelligence, Palo Alto, CA, USA, 4–9 February 2017. [Google Scholar]
  24. Ahmadi, J.; Doostparast, M.; Parsian, A. Estimation and prediction in a two-parameter exponential distribution based on k-record values under LINEX loss function. Commun.-Stat. Theory Methods 2005, 34, 795–805. [Google Scholar] [CrossRef]
  25. Pandey, B.N.; Dwividi, N.; Pulastya, B. Comparison between Bayesian and maximum likelihood estimation of the scale parameter in Weibull distribution with known shape under linex loss function. J. Sci. Res. 2011, 55, 163–172. [Google Scholar]
  26. Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 2006, 7, 1–30. [Google Scholar]
Figure 1. Linex loss and capped linex loss.
Figure 2. Distribution of artificial datasets with outliers.
Figure 3. Accuracy of the two artificial datasets with outliers.
Figure 4. Accuracies of the seven algorithms under different noise factors.
Figure 5. Box diagram of UCI datasets with outliers.
Figure 6. Convergence rate of Linex-TSVM.
Figure 7. Visualization of post hoc tests for UCI datasets.
Table 1. Characteristics of UCI datasets.
Datasets | Samples | Attributes | Datasets | Samples | Attributes
Australian | 690 | 14 | Spect | 267 | 45
Pima | 768 | 8 | German | 1000 | 24
Sonar | 198 | 60 | Vote | 432 | 16
CMC | 1140 | 9 | codrna | 59,535 | 9
Table 2. Experimental results on UCI datasets without noise (for each dataset, the first row gives ACC ± S (%) and the second row the average time in seconds).
Datasets | SVM | LSSVM | C-SVM | NPSVM | Linex-SVM | TBSVM | Linex-TSVM
Australian | 82.06 ± 0.63 | 82.75 ± 1.52 | 83.24 ± 3.39 | 83.96 ± 2.53 | 83.68 ± 1.63 | 84.05 ± 1.14 | 84.12 ± 0.42
Time (s) | 1.905 | 1.108 | 1.241 | 2.037 | 2.644 | 2.358 | 4.535
Vote | 89.52 ± 0.73 | 90.12 ± 1.17 | 93.62 ± 1.53 | 94.03 ± 1.87 | 94.29 ± 0.97 | 95.02 ± 1.34 | 95.24 ± 1.05
Time (s) | 0.943 | 0.793 | 1.832 | 1.834 | 1.846 | 1.115 | 2.062
German | 70.60 ± 1.74 | 71.35 ± 0.21 | 72.11 ± 3.22 | 73.26 ± 1.26 | 73.20 ± 2.86 | 75.61 ± 0.87 | 75.85 ± 0.21
Time (s) | 2.286 | 1.719 | 2.303 | 3.421 | 4.672 | 3.719 | 9.695
Spect | 78.08 ± 1.62 | 77.94 ± 1.53 | 79.42 ± 5.25 | 80.11 ± 0.41 | 80.52 ± 2.34 | 80.77 ± 1.09 | 82.12 ± 1.01
Time (s) | 0.639 | 0.945 | 1.347 | 2.014 | 1.739 | 1.235 | 1.438
CMC | 55.13 ± 3.62 | 56.42 ± 2.79 | 57.52 ± 1.52 | 56.33 ± 3.15 | 57.17 ± 0.76 | 61.12 ± 5.71 | 60.97 ± 0.62
Time (s) | 1.441 | 1.367 | 2.980 | 3.286 | 5.862 | 3.744 | 8.328
Pima | 73.42 ± 0.72 | 72.79 ± 1.24 | 74.21 ± 1.03 | 73.48 ± 1.64 | 74.63 ± 1.69 | 75.58 ± 0.66 | 75.79 ± 1.31
Time (s) | 1.117 | 1.036 | 1.125 | 2.185 | 2.092 | 1.626 | 5.032
Sonar | 63.54 ± 2.54 | 63.62 ± 2.34 | 65.39 ± 0.69 | 64.13 ± 0.68 | 64.37 ± 1.42 | 65.63 ± 2.16 | 66.50 ± 0.68
Time (s) | 0.350 | 0.137 | 0.659 | 0.517 | 0.813 | 0.534 | 0.913
codrna | 82.14 ± 1.73 | 80.01 ± 2.61 | 83.62 ± 0.11 | 85.43 ± 2.24 | 86.10 ± 0.62 | 85.43 ± 3.11 | 88.73 ± 3.40
Time (s) | 50.947 | 43.400 | 49.203 | 51.229 | 70.562 | 66.548 | 59.914
Table 3. Experimental results on UCI datasets with 10% noise (for each dataset, the first row gives ACC ± S (%) and the second row the average time in seconds).
Datasets | SVM | LSSVM | C-SVM | NPSVM | Linex-SVM | TBSVM | Linex-TSVM
Australian | 81.61 ± 0.82 | 81.51 ± 0.83 | 82.44 ± 2.60 | 81.27 ± 1.39 | 83.11 ± 1.45 | 83.61 ± 1.07 | 83.83 ± 0.31
Time (s) | 1.866 | 1.154 | 1.346 | 1.302 | 2.842 | 2.775 | 4.279
Vote | 88.77 ± 0.63 | 89.25 ± 1.38 | 93.07 ± 1.63 | 93.60 ± 2.95 | 92.35 ± 1.68 | 93.27 ± 2.17 | 94.29 ± 1.01
Time (s) | 1.193 | 0.810 | 1.893 | 1.027 | 1.088 | 1.607 | 2.034
German | 68.22 ± 0.89 | 69.15 ± 2.41 | 70.84 ± 2.41 | 74.30 ± 2.26 | 71.66 ± 2.50 | 73.14 ± 1.21 | 73.91 ± 0.82
Time (s) | 2.876 | 2.274 | 2.507 | 2.460 | 4.339 | 3.105 | 9.148
Spect | 77.29 ± 1.87 | 77.91 ± 0.88 | 78.48 ± 3.57 | 78.51 ± 3.11 | 78.75 ± 1.35 | 79.06 ± 4.62 | 81.73 ± 0.94
Time (s) | 1.519 | 0.988 | 1.830 | 1.616 | 1.613 | 1.012 | 1.910
CMC | 54.74 ± 3.30 | 53.20 ± 2.79 | 54.59 ± 2.24 | 58.03 ± 0.87 | 57.74 ± 1.17 | 59.91 ± 0.13 | 59.91 ± 0.75
Time (s) | 1.441 | 1.367 | 2.980 | 4.339 | 5.862 | 3.744 | 8.215
Pima | 70.42 ± 0.72 | 71.79 ± 1.24 | 72.21 ± 1.03 | 71.82 ± 0.14 | 72.70 ± 1.69 | 73.92 ± 0.66 | 73.53 ± 1.31
Time (s) | 1.535 | 1.239 | 1.599 | 3.700 | 2.272 | 1.803 | 5.118
Sonar | 62.93 ± 2.54 | 62.71 ± 2.34 | 63.06 ± 1.19 | 63.14 ± 3.01 | 63.27 ± 1.42 | 63.85 ± 1.13 | 64.50 ± 4.95
Time (s) | 1.184 | 0.917 | 0.726 | 1.715 | 1.476 | 1.244 | 1.619
codrna | 80.67 ± 2.67 | 81.44 ± 3.61 | 81.61 ± 1.02 | 82.35 ± 1.11 | 84.29 ± 0.90 | 86.77 ± 1.30 | 87.54 ± 0.34
Time (s) | 54.302 | 46.169 | 50.495 | 62.376 | 55.482 | 61.517 | 58.455
Table 4. Experimental results on UCI datasets with 25% Gaussian noise (for each dataset, the first row gives ACC ± S (%) and the second row the average time in seconds).
Datasets | SVM | LSSVM | C-SVM | NPSVM | Linex-SVM | TBSVM | Linex-TSVM
Australian | 78.83 ± 1.27 | 78.75 ± 1.52 | 79.24 ± 3.39 | 79.14 ± 1.67 | 80.68 ± 1.63 | 82.05 ± 1.14 | 81.18 ± 3.74
Time (s) | 1.365 | 1.233 | 1.490 | 1.910 | 2.076 | 2.851 | 4.648
Vote | 87.69 ± 1.07 | 88.05 ± 0.98 | 89.13 ± 1.39 | 88.28 ± 2.38 | 90.71 ± 0.88 | 91.01 ± 2.59 | 92.62 ± 3.37
Time (s) | 1.053 | 0.928 | 1.634 | 1.667 | 2.069 | 1.942 | 2.143
German | 68.83 ± 1.21 | 68.04 ± 0.80 | 69.71 ± 2.36 | 70.35 ± 1.64 | 70.14 ± 2.84 | 69.79 ± 3.54 | 72.30 ± 0.99
Time (s) | 2.922 | 2.421 | 2.665 | 5.196 | 4.904 | 3.454 | 8.615
Spect | 75.81 ± 1.17 | 75.93 ± 0.81 | 77.64 ± 1.53 | 76.04 ± 1.30 | 78.60 ± 2.09 | 80.19 ± 3.77 | 81.15 ± 3.06
Time (s) | 0.703 | 0.914 | 0.998 | 1.632 | 1.143 | 1.447 | 1.519
CMC | 52.13 ± 3.62 | 53.42 ± 2.79 | 54.52 ± 1.52 | 54.86 ± 0.88 | 55.17 ± 0.76 | 56.91 ± 0.13 | 57.79 ± 3.50
Time (s) | 2.951 | 2.566 | 3.291 | 5.157 | 4.017 | 4.521 | 8.693
Pima | 70.42 ± 0.72 | 71.79 ± 1.24 | 72.16 ± 1.03 | 73.25 ± 3.67 | 73.70 ± 1.69 | 72.43 ± 2.41 | 73.92 ± 0.93
Time (s) | 1.785 | 1.355 | 1.936 | 1.902 | 2.084 | 2.741 | 5.375
Sonar | 60.24 ± 4.95 | 60.50 ± 0.36 | 61.73 ± 1.06 | 62.88 ± 2.78 | 62.99 ± 1.12 | 62.87 ± 0.94 | 63.25 ± 0.35
Time (s) | 0.861 | 0.352 | 0.886 | 1.749 | 1.347 | 1.365 | 1.698
codrna | 79.14 ± 1.91 | 78.01 ± 2.76 | 79.65 ± 3.08 | 82.20 ± 5.29 | 83.52 ± 1.63 | 86.14 ± 2.30 | 86.87 ± 1.49
Time (s) | 58.147 | 54.990 | 53.892 | 68.956 | 70.108 | 69.560 | 71.928
Table 5. Average accuracy and ranks of the seven algorithms on UCI datasets with 0%, 10% and 25% Gaussian noise.
Metric | SVM | LSSVM | C-SVM | NPSVM | Linex-SVM | TBSVM | Linex-TSVM
Avg. ACC 0% | 74.31 | 74.38 | 76.14 | 76.34 | 76.76 | 77.90 | 78.67
Avg. rank 0% | 6.63 | 6.25 | 4.38 | 4.25 | 3.38 | 2.00 | 1.13
Avg. ACC 10% | 73.08 | 73.37 | 74.54 | 75.38 | 75.48 | 76.69 | 77.41
Avg. rank 10% | 6.38 | 6.25 | 4.13 | 3.88 | 3.50 | 1.94 | 1.31
Avg. ACC 25% | 71.64 | 71.81 | 72.97 | 73.38 | 74.44 | 75.17 | 76.14
Avg. rank 25% | 6.63 | 6.38 | 4.63 | 4.00 | 2.88 | 2.38 | 1.13
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
