Article

Credit Scoring by Fuzzy Support Vector Machines with a Novel Membership Function

School of Electrical & Automatic Engineering, Changshu Institute of Technology, Changshu 215500, China
* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2016, 9(4), 13; https://doi.org/10.3390/jrfm9040013
Submission received: 28 June 2016 / Revised: 15 September 2016 / Accepted: 26 October 2016 / Published: 7 November 2016
(This article belongs to the Special Issue Credit Risk)

Abstract

Due to the recent financial crisis and the European debt crisis, credit risk evaluation has become an increasingly important issue for financial institutions. Reliable credit scoring models are crucial for commercial banks to evaluate the financial performance of clients and have been widely studied in the fields of statistics and machine learning. In this paper, a novel fuzzy support vector machine (SVM) credit scoring model is proposed for credit risk analysis, in which fuzzy memberships indicate the different contributions of individual input points to the learning of the SVM classification hyperplane. For methodological consistency, support vector data description (SVDD) is introduced to construct the fuzzy membership function and to reduce the effect of outliers and noise. The SVDD-based fuzzy SVM model is tested against the traditional fuzzy SVM on two real-world datasets, and the results confirm the effectiveness of the presented method.

1. Introduction

During the recent financial crisis, many financial institutions endured great losses from customers’ defaults on loans, as in the subprime mortgage crisis in the USA. However, credit-granting institutions cannot simply refuse all applicants to avoid credit risk, as competition in the growing credit market has become fierce. Effective credit scoring has become one of the primary techniques for gaining competitive advantage in the credit market, as it can help financial institutions increase credit volume without excessively increasing their exposure to default.
Credit scoring models are developed to discriminate applicants as either accepted or rejected with respect to the customers’ application forms and credit reports, and are built from the characteristics of past applicants [1,2,3]. Since even a fractional improvement in the accuracy of credit scoring can translate into noteworthy future savings, numerous data mining and statistical techniques have been proposed over the past decades to derive satisfactory credit scoring models. Generally, these methods can be classified into statistical approaches (e.g., discriminant analysis and logistic regression) and machine learning approaches (e.g., artificial neural networks and support vector machines). Though traditional statistical methods are relatively simple and explainable, their discriminating ability remains contentious owing to the nonlinear relationship between default probability and credit patterns. Additionally, statistical methods must assume posterior probability models, whereas machine learning approaches commonly operate without this limitation.
Regarding recent credit scoring techniques, artificial neural networks [4,5] have been criticized for poor performance when incorporating irrelevant attributes or small data sets, whereas the support vector machine, motivated by statistical learning theory [6,7], is particularly well suited to coping with a large number of explanatory attributes or sparse data sets [8,9,10,11]. Baesens et al. studied the performance of various state-of-the-art classification algorithms on eight real-life credit scoring data sets [12]. Thomas et al. tested 17 consumer credit modeling methods and reported that SVM ranked highest in terms of classification accuracy [13]. Huang et al. constructed hybrid SVM-based credit scoring models with three strategies for feature selection and model parameter setting [14]. Martens et al. extracted rules from trained SVMs to obtain credit scoring models that are both accurate and comprehensible [15]. Niklis et al. developed linear and nonlinear SVM risk assessment and classification models, as well as an additive SVM model well suited to the requirements of credit rating systems, which provided very competitive results [16]. Harris introduced the clustered support vector machine for credit scorecard development, addressing some of the limitations of traditional nonlinear SVM-based methods [17]. Chen and Li proposed a rating model based on a support vector machine with monotonicity constraints derived from the prior knowledge of financial experts, which helps correct the loss of monotonicity that occurs in data during the collection process [18].
However, in many real-world credit scoring problems, some input training points are corrupted by noise, and some applicants may be assigned to the wrong class by accident. These points are all outliers that do not fully belong to one class, but rather have partial membership in both. In this case, the standard SVM training algorithm makes the classification boundary deviate severely from the optimal hyperplane, as the SVM is very sensitive to outliers. Lin and Wang introduced the concept of the fuzzy SVM, which assigns a fuzzy membership value to each input point according to its relative importance in the classification and reformulates the SVM so that different input points can make different contributions to the learning of the decision surface [19]. The fuzzy SVM can be used as a remedy for the unwanted overfitting caused by treating every data sample equally [20,21,22]; however, the effect of the membership values when training a fuzzy SVM is an interesting issue in the context of credit scoring modeling [23]. In this study, considering methodological consistency, a novel support vector data description (SVDD)-based fuzzy membership function is proposed to reduce the effect of outliers and improve classification accuracy and generalization, and the effect of the membership values in the fuzzy SVM is investigated and compared with that of the standard SVM.
This paper is organized as follows. Section 2 recalls the background on SVM and fuzzy SVM and reviews two typical linear and nonlinear fuzzy membership functions. Section 3 highlights the main novelty of this work, detailing the SVDD-based membership function. Section 4 collects the experimental results on two real-world credit datasets. Finally, Section 5 draws the concluding remarks.

2. SVM and Fuzzy SVM

2.1. Standard Support Vector Machines

In this section, the basic concept of SVM for classification problems is presented. Consider a two-class problem with a set of l sample points (x1, y1), ..., (xl, yl), where each xi has a class label yi ∈ {−1, 1} denoting one of the two classes. When the sample points are linearly separable, the SVM classifier searches for the hyperplane with the largest margin separating them by solving the following quadratic program:
$$\min_{w,b}\ \frac{1}{2}\|w\|^2 \quad \text{subject to} \quad y_i(w^T x_i + b) \ge 1, \quad i = 1, 2, \ldots, l \tag{1}$$
where w is the weight vector and b is the bias term. For the linearly non-separable case, non-negative slack variables ξi are introduced to measure the amount of violation of the constraints in Equation (1). The QP problem becomes:
$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}\xi_i \quad \text{subject to} \quad y_i(w^T x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0, \quad i = 1, 2, \ldots, l \tag{2}$$
where C is a regularization constant determining the tradeoff between margin maximization and classification violations. In many cases it is difficult to obtain a suitable hyperplane in the original input space with such a linear classifier, so a nonlinear mapping φ(x) satisfying Mercer’s condition can be introduced to map the input variable xi into a higher-dimensional feature space. To solve the QP problem, the kernel function K(xi, xj) = φ(xi)·φ(xj) is introduced to compute the dot products of the data points in the feature space, so the explicit form of the mapping φ need not be known. The optimization problem then takes the following dual form, obtained by constructing a Lagrangian:
$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{l}\alpha_i \quad \text{subject to} \quad \sum_{i=1}^{l} y_i\alpha_i = 0,\ \ 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, l \tag{3}$$
where the αi are the non-negative Lagrange multipliers associated with the constraints in Equation (2).
Commonly used kernel functions include the polynomial, sigmoid and radial basis function (RBF) kernels. The RBF kernel is employed in this study because of its superior performance, experimentally demonstrated by Van Gestel et al. [24].
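To make the setup concrete, the following minimal sketch trains such a soft-margin RBF classifier in Python with scikit-learn on synthetic stand-in data; the toolchain and data are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed tooling: NumPy + scikit-learn, synthetic data):
# a soft-margin SVM with an RBF kernel, as in Equations (2)-(3).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 14))                  # 200 hypothetical applicants, 14 features
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)      # synthetic labels in {-1, +1}

clf = SVC(C=1.0, kernel="rbf", gamma="scale")   # C trades margin width against violations
clf.fit(X, y)
print("training accuracy:", clf.score(X, y))
```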

2.2. Fuzzy Support Vector Machines

In credit scoring modeling, applicants may not be exactly assigned to the creditworthy or default class. In other words, a fuzzy membership is associated with each applicant, which can be regarded as the attitude of the corresponding applicant toward one class in the classification. Lin and Wang proposed the fuzzy support vector machine (FSVM) based on the standard SVM [19]. Suppose a set of labeled sample points with associated fuzzy memberships (x1, y1, s1), ..., (xl, yl, sl), where each xi has a class label yi ∈ {−1, 1} and a fuzzy membership satisfying 0 < si ≤ 1.
Since the fuzzy membership si reflects the attitude of the corresponding point toward one class and the parameter ξi measures the constraint violation, the term siξi can be treated as a measure of constraint violation with different weights. The quadratic problem can then be described as:
$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{l}s_i\xi_i \quad \text{subject to} \quad y_i(w^T\varphi(x_i) + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0, \quad i = 1, 2, \ldots, l \tag{4}$$
By constructing a Lagrangian, the quadratic program can be solved in its dual form, just as for the standard SVM:
$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{l}\alpha_i \quad \text{subject to} \quad \sum_{i=1}^{l} y_i\alpha_i = 0,\ \ 0 \le \alpha_i \le s_i C, \quad i = 1, 2, \ldots, l \tag{5}$$
With different values of si, the tradeoff between margin maximization and the amount of constraint violation can be controlled. Note that a smaller si makes the corresponding point xi less important in training, so choosing appropriate fuzzy memberships for a given problem is of pivotal importance for the FSVM. Lin and Wang proposed a model that sets a linear fuzzy membership as a function of the distance between each data point and its corresponding class center [19]. For the training sample sequence (x1, y1, s1), ..., (xl, yl, sl), denote by x+ the mean of the class with label +1 and by x− the mean of the class with label −1. The radius of class +1 is:
$$r_+ = \max_{\{i\,:\,y_i = 1\}} |x_+ - x_i| \tag{6}$$
and the radius of class −1 is:
$$r_- = \max_{\{i\,:\,y_i = -1\}} |x_- - x_i| \tag{7}$$
The fuzzy membership of each sample is
$$s_i = \begin{cases} 1 - |x_+ - x_i|/(r_+ + \delta), & y_i = 1 \\ 1 - |x_- - x_i|/(r_- + \delta), & y_i = -1 \end{cases} \tag{8}$$
with δ > 0 to avoid the case si = 0.
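For illustration, Equations (6)–(8) reduce to a few lines of code; the sketch below is a hypothetical NumPy helper, not the authors' implementation.

```python
import numpy as np

def linear_memberships(X, y, delta=1e-3):
    """Linear fuzzy memberships of Eq. (8): distance to the class mean in input space."""
    s = np.empty(len(y))
    for label in (+1, -1):
        mask = (y == label)
        center = X[mask].mean(axis=0)              # class mean x_+ or x_-
        dist = np.linalg.norm(X[mask] - center, axis=1)
        r = dist.max()                             # class radius, Eqs. (6)-(7)
        s[mask] = 1.0 - dist / (r + delta)         # Eq. (8)
    return s
```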
The method performs well: since the fuzzy membership is a function of the mean and radius of each class, the effect of an outlier is reduced because it contributes little to the final decision plane. However, the algorithm operates in the original input space rather than the feature space. Tang therefore proposed, on the basis of the above method, a nonlinear membership function defined in the feature space through the mapping φ(x) [25]. Define φ+ and φ− as the centers of the two classes, obtained by averaging the points mapped into the feature space:
$$\varphi_+ = \frac{1}{n_+}\sum_{y_i = 1}\varphi(x_i), \qquad \varphi_- = \frac{1}{n_-}\sum_{y_i = -1}\varphi(x_i) \tag{9}$$
where n+ and n− are the numbers of samples in the two classes.
The radii are defined similarly to those of Lin and Wang:
$$r_+ = \max_{\{i\,:\,y_i = 1\}} |\varphi_+ - \varphi(x_i)|, \qquad r_- = \max_{\{i\,:\,y_i = -1\}} |\varphi_- - \varphi(x_i)| \tag{10}$$
The squared distances are then calculated in the feature space:
$$\begin{aligned} d_{i+}^2 &= \|\varphi_+ - \varphi(x_i)\|^2 = K(x_i, x_i) - \frac{2}{n_+}\sum_{y_j = 1} K(x_j, x_i) + \frac{1}{n_+^2}\sum_{y_j = 1}\sum_{y_k = 1} K(x_j, x_k) \\ d_{i-}^2 &= \|\varphi_- - \varphi(x_i)\|^2 = K(x_i, x_i) - \frac{2}{n_-}\sum_{y_j = -1} K(x_j, x_i) + \frac{1}{n_-^2}\sum_{y_j = -1}\sum_{y_k = -1} K(x_j, x_k) \end{aligned} \tag{11}$$
Finally, the fuzzy membership of each sample is calculated as:
$$s_i = \begin{cases} 1 - d_{i+}^2/(r_+^2 + \delta), & y_i = 1 \\ 1 - d_{i-}^2/(r_-^2 + \delta), & y_i = -1 \end{cases} \tag{12}$$
with δ > 0 to avoid the case si = 0.
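In code, the kernel expansions of Equation (11) need only the Gram matrix; the following sketch assumes a precomputed K (e.g., from sklearn.metrics.pairwise.rbf_kernel) and is an illustrative rendering rather than Tang's implementation.

```python
import numpy as np

def kernel_memberships(K, y, delta=1e-3):
    """Feature-space memberships of Eqs. (10)-(12), given the full Gram matrix K."""
    s = np.empty(len(y))
    for label in (+1, -1):
        idx = np.where(y == label)[0]
        Kc = K[np.ix_(idx, idx)]                   # within-class kernel block
        # ||phi(x_i) - phi_c||^2 expanded by the kernel trick, Eq. (11):
        # diag(Kc) = K(x_i,x_i); row mean = (1/n) sum_j K(x_i,x_j); Kc.mean() = (1/n^2) sum_{j,k}
        d2 = np.diag(Kc) - 2.0 * Kc.mean(axis=1) + Kc.mean()
        r2 = d2.max()                              # squared class radius, Eq. (10)
        s[idx] = 1.0 - d2 / (r2 + delta)           # Eq. (12)
    return s
```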
The nonlinear version of the FSVM outperforms the linear one, as it represents more accurately the contribution of each sample to the decision surface in the feature space. Both algorithms define the class center and radius by taking the mean and maximum over the sample points; however, these quantities can be defined in a more interpretable way.

3. Fuzzy SVM with SVDD Membership Function

Methodological consistency has been a major design principle and is expected to improve the comprehensibility of the modeling paradigm, which, in turn, may facilitate its adoption in practical applications [26]. Support vector data description (SVDD) is inspired by the support vector machine classifier; it searches for a spherically shaped boundary around a dataset to detect novel data or outliers [27,28]. In this section, an SVDD membership function, likewise defined in the feature space, is proposed for the FSVM.
Analogous to the SVM, which seeks the hyperplane with the largest margin between the two classes, SVDD estimates the hypersphere of minimum volume enclosing almost all target objects. Assume a hypersphere with center a and radius R; the cost function is defined as follows:
$$\min_{R,a,\xi}\ R^2 + C\sum_{i=1}^{l}\xi_i \quad \text{subject to} \quad \|\varphi(x_i) - a\|^2 \le R^2 + \xi_i,\ \ \xi_i \ge 0, \quad i = 1, 2, \ldots, l \tag{13}$$
where the ξi are slack variables and the parameter C controls the trade-off between the sphere volume and the constraint violations.
The above optimal hypersphere problem can be solved by constructing the Lagrangian, which results in:
$$\max_{\alpha}\ \sum_{i=1}^{l}\alpha_i K(x_i, x_i) - \sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_i\alpha_j K(x_i, x_j) \quad \text{subject to} \quad \sum_{i=1}^{l}\alpha_i = 1,\ \ 0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, l \tag{14}$$
The training samples with nonzero αi are support vectors and are used to describe the hypersphere boundary.
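For illustration, the dual of Equation (14) can be solved per class with a general-purpose optimizer; the sketch below uses SciPy's SLSQP routine as an assumed tool (a dedicated QP solver would normally be preferred) and assumes C ≥ 1/l so the constraint set is feasible.

```python
import numpy as np
from scipy.optimize import minimize

def svdd_alphas(K, C=1.0):
    """Solve the SVDD dual of Eq. (14) for one class, given its Gram matrix K."""
    n = K.shape[0]
    diagK = np.diag(K)

    def neg_dual(a):                                 # negate: maximize by minimizing
        return -(a @ diagK - a @ K @ a)

    cons = ({"type": "eq", "fun": lambda a: a.sum() - 1.0},)  # sum(alpha) = 1
    bounds = [(0.0, C)] * n                                   # 0 <= alpha_i <= C
    a0 = np.full(n, 1.0 / n)                                  # feasible uniform start
    res = minimize(neg_dual, a0, method="SLSQP", bounds=bounds, constraints=cons)
    return res.x
```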
Denote by a+ and a− the centers of class +1 and class −1 in the feature space, respectively. According to the Karush–Kuhn–Tucker (KKT) conditions, the center of each class can be calculated as:
$$a_+ = \sum_{y_i = 1}\alpha_i^+\varphi(x_i), \qquad a_- = \sum_{y_i = -1}\alpha_i^-\varphi(x_i) \tag{15}$$
The radius of class +1 is then defined by
$$r_+ = \max_{\{i\,:\,y_i = 1\}} |a_+ - \varphi(x_i)| \tag{16}$$
and the radius of class −1 by
$$r_- = \max_{\{i\,:\,y_i = -1\}} |a_- - \varphi(x_i)| \tag{17}$$
The squared distance between any sample xi with class label yi = 1 and the class center a+ in the feature space can be calculated as:
$$d_{i+}^2 = K(x_i, x_i) - 2\sum_{y_j = 1}\alpha_j^+ K(x_j, x_i) + \sum_{y_j = 1}\sum_{y_k = 1}\alpha_j^+\alpha_k^+ K(x_j, x_k) \tag{18}$$
The squared distance between a sample of class −1 and its corresponding class center a− is derived similarly:
$$d_{i-}^2 = K(x_i, x_i) - 2\sum_{y_j = -1}\alpha_j^- K(x_j, x_i) + \sum_{y_j = -1}\sum_{y_k = -1}\alpha_j^-\alpha_k^- K(x_j, x_k) \tag{19}$$
By definition, the radius r+ is the distance from the center a+ of the class +1 hypersphere to any of its support vectors on the boundary, and r− is the corresponding radius of class −1:
$$r_+ = d_{sv^+}, \qquad r_- = d_{sv^-} \tag{20}$$
Then the fuzzy membership si of each input sample xi can be defined as follows:
$$s_i = \begin{cases} 1 - d_{i+}^2/(r_+^2 + \delta), & y_i = 1 \\ 1 - d_{i-}^2/(r_-^2 + \delta), & y_i = -1 \end{cases} \tag{21}$$
where δ > 0 is a small constant used to avoid the case si = 0; in this paper, δ is set to 1 × 10−3 for all three models. Once the coefficients si are derived, the fuzzy SVM classifier can be trained according to Equation (4) in Section 2.
The fuzzy memberships are thus calculated in the feature space with a more interpretable center and radius. As the class center and radius are defined by the minimum SVDD hypersphere enclosing the class targets, the fuzzy memberships are more explicit and interpretable than those of the mean-and-maximum algorithms. In addition, the above formulas are similar in form to Tang’s [25] and are likewise expected to reduce the effect of outliers.
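Assembling the pieces, one plausible rendering of the SVDD-FSVM pipeline follows: memberships from Equations (18)–(21), reusing the svdd_alphas sketch above, then a fuzzy SVM trained with scikit-learn's per-sample weighting, which rescales the penalty of sample i to siC exactly as in the box constraint of Equation (5). The averaged support-vector radius and the clipping of si are practical safeguards added here, not part of the paper's formulation.

```python
import numpy as np
from sklearn.svm import SVC

def svdd_memberships(K, y, C=1.0, delta=1e-3):
    """SVDD-based memberships, Eqs. (18)-(21): one hypersphere per class."""
    s = np.empty(len(y))
    for label in (+1, -1):
        idx = np.where(y == label)[0]
        Kc = K[np.ix_(idx, idx)]
        a = svdd_alphas(Kc, C)                           # dual weights, Eq. (14)
        d2 = np.diag(Kc) - 2.0 * (Kc @ a) + a @ Kc @ a   # squared distances, Eqs. (18)-(19)
        on_boundary = (a > 1e-8) & (a < C - 1e-8)        # unbounded support vectors
        # Eq. (20): radius = distance to a boundary SV; averaged here for stability
        r2 = d2[on_boundary].mean() if on_boundary.any() else d2.max()
        s[idx] = np.clip(1.0 - d2 / (r2 + delta), delta, 1.0)  # Eq. (21), kept in (0, 1]
    return s

def fit_fsvm(X, y, s, C=1.0, gamma="scale"):
    """Fuzzy SVM of Eq. (4): sample_weight rescales the penalty of point i to s_i * C."""
    clf = SVC(C=C, kernel="rbf", gamma=gamma)
    clf.fit(X, y, sample_weight=s)
    return clf
```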

4. Experimental Results and Discussion

In this section, two real-world credit datasets, the Australian and German credit data sets, are adopted to evaluate the performance of the proposed SVDD-FSVM method, and the experimental results are compared with those of the linear and nonlinear fuzzy SVMs. The Australian credit data set, with 307 instances of accepted customers and 383 instances of rejected customers, contains 14 attributes, of which six are continuous and eight are categorical. For confidentiality reasons, all attribute names and values have been changed to meaningless symbols. The German credit data set is more unbalanced, consisting of 700 examples of creditworthy applicants and 300 examples of default applicants, with 24 numeric features describing 19 attributes. The attributes of each applicant are described in Table 1, with four attributes converted to dummy variables. Both data sets are publicly available from the UCI Repository of Machine Learning Databases and are widely adopted as benchmarks for comparing the performance of classification models.
The input variables are normalized with respect to their maximum values, and the fuzzy membership of each input instance is then derived by the SVDD operation. To visualize the SVDD hypersphere, the variables are projected onto a two-dimensional plane by principal component analysis (PCA), a popular multivariate statistical tool for handling high-dimensional, noisy, and correlated data by defining a reduced set of latent variables (the principal components) [29]. The boundaries of class +1 and class −1 for the first fold partition of the Australian data set are shown in Figure 1, and those of the German data set in Figure 2. Both hyperspheres are calculated with a confidence limit of 98%. Although the two principal components capture over 90% of the information (measured by variance) of the original data in both sets, the figures show that the rejected and accepted applicants can hardly be distinguished by a simple boundary in the two-dimensional plane.
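A sketch of this normalization and projection step behind Figures 1 and 2, again assuming scikit-learn rather than the authors' tooling (and columns with nonzero maxima):

```python
import numpy as np
from sklearn.decomposition import PCA

def project_2d(X):
    Xn = X / np.abs(X).max(axis=0)                 # normalize by per-column maximum values
    pca = PCA(n_components=2).fit(Xn)
    print("variance captured:", pca.explained_variance_ratio_.sum())
    return pca.transform(Xn)                       # scores on the first two components
```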
For comparison purposes, the detailed performance of SVDD-FSVM is tested against the linear FSVM and the nonlinear FSVM. This study assesses the three credit scoring methods in terms of accuracy and other major assessment metrics, namely sensitivity and specificity. Denote by DA the number of default clients classified as accepted and by DR the number classified as rejected; likewise, let CA be the number of creditworthy clients classified as accepted and CR the number classified as rejected. The evaluation criteria measuring the efficiency of the classification are then defined as:
$$\text{Sensitivity} = \frac{CA}{CA + CR}, \qquad \text{Specificity} = \frac{DR}{DR + DA}, \qquad \text{Accuracy} = \frac{CA + DR}{CA + CR + DA + DR} \tag{22}$$
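These criteria translate directly into code; a small NumPy sketch, assuming the labeling convention +1 = creditworthy and −1 = default:

```python
import numpy as np

def credit_metrics(y_true, y_pred):
    """Eq. (22), with creditworthy = +1 (accept) and default = -1 (reject)."""
    CA = np.sum((y_true == 1) & (y_pred == 1))     # creditworthy, accepted
    CR = np.sum((y_true == 1) & (y_pred == -1))    # creditworthy, rejected
    DR = np.sum((y_true == -1) & (y_pred == -1))   # default, rejected
    DA = np.sum((y_true == -1) & (y_pred == 1))    # default, accepted
    sensitivity = CA / (CA + CR)
    specificity = DR / (DR + DA)
    accuracy = (CA + DR) / (CA + CR + DA + DR)
    return sensitivity, specificity, accuracy
```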
In this study, each credit dataset is randomly partitioned into training and test sets using 5-fold cross-validation, and a grid search is employed to find the optimal parameters [30]. The average comparison results on the Australian and German data sets are shown in Table 2 and Table 3, respectively.
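The evaluation protocol can be sketched as follows, reusing the fit_fsvm and credit_metrics helpers from the earlier sketches; the grid search over C and the kernel width is elided, and scikit-learn's StratifiedKFold stands in for the paper's random 5-fold partitioning:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import StratifiedKFold

def cross_validate(X, y, C=1.0):
    """5-fold CV; memberships are recomputed on each training fold only."""
    scores = []
    for tr, te in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
        K_tr = rbf_kernel(X[tr])                   # Gram matrix of the training fold
        s = svdd_memberships(K_tr, y[tr], C=C)     # SVDD memberships, Eq. (21)
        clf = fit_fsvm(X[tr], y[tr], s, C=C)
        scores.append(credit_metrics(y[te], clf.predict(X[te])))
    return np.mean(scores, axis=0)                 # mean (sensitivity, specificity, accuracy)
```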
The results listed in Table 2 indicate that SVDD-FSVM outperformed the other two FSVMs on the Australian data set, with an overall accuracy of 87.25%, compared with 87.10% and 86.67% obtained from the corresponding nonlinear and linear FSVM models, respectively. Although the sensitivity, specificity and accuracy on the German data set in Table 3 are lower than those on the Australian data set, as the German data are more unbalanced, SVDD-FSVM still yields the best results among the three approaches. The specificity is especially improved, which is considered more important than the sensitivity for the credit risk control of financial institutions.

5. Conclusions

This paper has presented methods for building credit scoring models with fuzzy support vector machines. Compared with the standard SVM, the fuzzy SVM assigns a fuzzy membership to each input point so that different input points can make different contributions to the learning of the decision surface, which reduces the effect of outliers and noise in data points with unmodeled characteristics. As choosing a proper fuzzy membership function is crucial to solving classification problems with the FSVM, an SVDD version of the fuzzy membership, a function of the distance between each input point and its corresponding SVDD hypersphere center, is proposed for methodological consistency. The SVDD-FSVM credit scoring model yields the best overall performance among the three models when appropriately trained on two real-world credit data sets. The results indicate that the proposed method provides classification accuracy and reliability, and shows promising potential for practical use.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61273312, No. 61673075), the Natural Science Fundamental Research Program of Higher Education Colleges in Jiangsu Province (No. 14KJD510001), the Suzhou Municipal Science and Technology Plan Project (No. SYG201548), and the Project of Talent Peak of Six Industries (No. DZXX-013).

Author Contributions

Benlian Xu conceived and designed the experiments; Jian Shi performed the experiments, analyzed the data and wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. L.C. Thomas, D.B. Edelman, and J.N. Crook. Credit Scoring and Its Applications. Philadelphia, PA, USA: Siam, 2002. [Google Scholar]
  2. A. Blöchlinger, and M. Leippold. “Economic benefit of powerful credit scoring.” J. Bank. Financ. 30 (2006): 851–873. [Google Scholar] [CrossRef]
  3. L. Einav, M. Jenkins, and J. Levin. “The impact of credit scoring on consumer lending.” RAND J. Econ. 44 (2013): 249–274. [Google Scholar] [CrossRef]
  4. D. West. “Neural network credit scoring models.” Comput. Oper. Res. 27 (2000): 1131–1152. [Google Scholar] [CrossRef]
  5. C.L. Chuang, and S.T. Huang. “A hybrid neural network approach for credit scoring.” Expert Syst. 28 (2011): 185–196. [Google Scholar] [CrossRef]
  6. V.N. Vapnik. The Nature of Statistical Learning Theory. New York, NY, USA: Springer, 1995. [Google Scholar]
  7. V.N. Vapnik. Statistical Learning Theory. New York, NY, USA: John Wiley & Sons, 1998. [Google Scholar]
  8. A. Christmann, and R. Hable. “Consistency of support vector machines using additive kernels for additive models.” Comput. Stat. Data Anal. 56 (2012): 854–873. [Google Scholar] [CrossRef]
  9. H. Jiang, Z. Yan, and X. Liu. “Melt index prediction using optimized least squares support vector machines based on hybrid particle swarm optimization algorithm.” Neurocomputing 119 (2013): 469–477. [Google Scholar] [CrossRef]
  10. A.E. Ruano, G. Madureira, O. Barros, H.R. Khosravani, M.G. Ruano, and P.M. Ferreira. “Seismic detection using support vector machines.” Neurocomputing 135 (2014): 273–283. [Google Scholar] [CrossRef]
  11. S. Maldonado, and J. López. “Imbalanced data classification using second-order cone programming support vector machines.” Pattern Recognit. 47 (2014): 2070–2079. [Google Scholar] [CrossRef]
  12. B. Baesens, T. van Gestel, S. Viaene, M. Stepanova, J. Suykens, and J. Vanthienen. “Benchmarking state-of-the-art classification algorithms for credit scoring.” J. Oper. Res. Soc. 54 (2003): 627–635. [Google Scholar] [CrossRef]
  13. L.C. Thomas, R.W. Oliver, and D.J. Hand. “A survey of the issues in consumer credit modelling research.” J. Oper. Res. Soc. 56 (2005): 1006–1015. [Google Scholar] [CrossRef]
  14. C.-L. Huang, M.-C. Chen, and C.-J. Wang. “Credit scoring with a data mining approach based on support vector machines.” Expert Syst. Appl. 33 (2007): 847–856. [Google Scholar] [CrossRef]
  15. D. Martens, B. Baesens, T. van Gestel, and J. Vanthienen. “Comprehensible credit scoring models using rule extraction from support vector machines.” Eur. J. Oper. Res. 183 (2007): 1466–1476. [Google Scholar] [CrossRef]
  16. D. Niklis, M. Doumpos, and C. Zopounidis. “Combining market and accounting-based models for credit scoring using a classification scheme based on support vector machines.” Appl. Math. Comput. 234 (2014): 69–81. [Google Scholar] [CrossRef]
  17. T. Harris. “Credit scoring using the clustered support vector machine.” Expert Syst. Appl. 42 (2015): 741–750. [Google Scholar] [CrossRef] [Green Version]
  18. C.-C. Chen, and S.-T. Li. “Credit rating with a monotonicity-constrained support vector machine model.” Expert Syst. Appl. 41 (2014): 7235–7247. [Google Scholar] [CrossRef]
  19. C.-F. Lin, and S.-D. Wang. “Fuzzy support vector machines.” IEEE. Trans. Neural Netw. 13 (2002): 464–471. [Google Scholar]
  20. W. An, and M. Liang. “Fuzzy support vector machine based on within-class scatter for classification problems with outliers or noises.” Neurocomputing 110 (2013): 101–110. [Google Scholar] [CrossRef]
  21. A. Chaudhuri. “Modified fuzzy support vector machine for credit approval classification.” AI Commun. 27 (2014): 189–211. [Google Scholar]
  22. Z. Wu, H. Zhang, and J. Liu. “A fuzzy support vector machine algorithm for classification based on a novel PIM fuzzy clustering method.” Neurocomputing 125 (2014): 119–124. [Google Scholar] [CrossRef]
  23. M.-D. Shieh, and C.-C. Yang. “Classification model for product form design using fuzzy support vector machines.” Comput. Ind. Eng. 55 (2008): 150–164. [Google Scholar] [CrossRef]
  24. T. Van Gestel, J.A. Suykens, B. Baesens, S. Viaene, J. Vanthienen, G. Dedene, B. De Moor, and J. Vandewalle. “Benchmarking least squares support vector machine classifiers.” Mach. Learn. 54 (2004): 5–32. [Google Scholar] [CrossRef]
  25. W.M. Tang. “Fuzzy SVM with a new fuzzy membership function to solve the two-class problems.” Neural Process. Lett. 34 (2011): 209–219. [Google Scholar] [CrossRef]
  26. S. Lessmann, and S. Voß. “A reference model for customer-centric data mining with support vector machines.” Eur. J. Oper. Res. 199 (2009): 520–530. [Google Scholar] [CrossRef]
  27. D.M. Tax, and R.P. Duin. “Support vector data description.” Mach. Learn. 54 (2004): 45–66. [Google Scholar] [CrossRef]
  28. R. Strack, V. Kecman, B. Strack, and Q. Li. “Sphere Support Vector Machines for large classification tasks.” Neurocomputing 101 (2013): 59–67. [Google Scholar] [CrossRef]
  29. I. Jolliffe. Principal Component Analysis. New York, NY, USA: John Wiley & Sons, 2005. [Google Scholar]
  30. L. Zhou, K.-K. Lai, and L. Yu. “Least squares support vector machines ensemble models for credit scoring.” Expert Syst. Appl. 37 (2010): 127–133. [Google Scholar] [CrossRef]
Figure 1. Principal components plot of SVDD for the Australian data set.
Figure 2. Principal components plot of SVDD for the German data set.
Table 1. Input variables of the German data set.

Original Attribute | Input Variables | Variable Type | Attribute Description
A1 | V1 | qualitative | Status of existing checking account
A2 | V2 | numerical | Duration in months
A3 | V3 | qualitative | Credit history
A4 | V4, V5 | dummy | Purpose (V4: new car, V5: used car)
A5 | V6 | numerical | Credit amount
A6 | V7 | qualitative | Savings account/bonds
A7 | V8 | qualitative | Present employment since
A8 | V9 | qualitative | Personal status and sex
A9 | V10, V11 | dummy | Other debtors/guarantors (V10: none, V11: co-applicant)
A10 | V12 | numerical | Present residence since
A11 | V13 | qualitative | Property
A12 | V14 | numerical | Age in years
A13 | V15 | qualitative | Other installment plans
A14 | V16, V17 | dummy | Housing (V16: rent, V17: own)
A15 | V18 | numerical | Number of existing credits at this bank
A16 | V19, V20, V21 | dummy | Job (V19: unemployed/unskilled (non-resident), V20: unskilled (resident), V21: skilled employee/official)
A17 | V22 | numerical | Number of people liable to provide maintenance for
A18 | V23 | qualitative | Telephone
A19 | V24 | qualitative | Foreign worker
Table 2. Performance on the Australian test data set.

Method | Sensitivity (%) | Specificity (%) | Accuracy (%)
SVDD-FSVM | 87.53 | 86.84 | 87.25
Nonlinear FSVM | 89.87 | 85.13 | 87.10
Linear FSVM | 86.95 | 86.48 | 86.67
Table 3. Performance on the German test data set.

Method | Sensitivity (%) | Specificity (%) | Accuracy (%)
SVDD-FSVM | 89.59 | 48.60 | 77.30
Nonlinear FSVM | 92.15 | 41.75 | 77.00
Linear FSVM | 95.18 | 23.42 | 73.60
