A Deep Learning Approach for Credit Scoring Using Feature Embedded Transformer
Abstract
1. Introduction
- This paper introduces the transformer to credit scoring based on users' online behavioral data. In the experiments, the transformer outperforms LSTM and traditional machine learning models.
- We combine credit feature data with user behavioral data in a novel end-to-end deep learning framework for credit scoring. The framework consists of two parts, a wide part and a deep part, and learns automatically from both the behavioral data and the feature data.
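The two-part design described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the deep part encodes the behavioral event sequence and the wide part passes the credit features through unchanged, and it replaces the transformer encoder with a simple mean-pooled embedding so that the example stays self-contained. The class `WideDeepScorer`, its parameters, and the toy inputs are hypothetical names introduced for illustration only.

```python
import math
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class WideDeepScorer:
    """Toy two-branch scorer: behavior sequence (deep) + credit features (wide)."""

    def __init__(self, n_events, embed_dim, n_features, seed=0):
        rng = random.Random(seed)
        # Hypothetical embedding table: one vector per behavior event type.
        self.embeddings = [
            [rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
            for _ in range(n_events)
        ]
        # One output weight per concatenated input dimension, plus a bias.
        self.weights = [
            rng.uniform(-0.1, 0.1) for _ in range(embed_dim + n_features)
        ]
        self.bias = 0.0

    def deep_part(self, event_ids):
        # Stand-in for the sequence encoder (assumption): embed, then mean-pool.
        return mean_pool([self.embeddings[i] for i in event_ids])

    def forward(self, event_ids, features):
        # Concatenate the deep representation with the wide feature vector.
        joint = self.deep_part(event_ids) + list(features)
        logit = sum(w * x for w, x in zip(self.weights, joint)) + self.bias
        return sigmoid(logit)  # score in (0, 1), read here as default probability


scorer = WideDeepScorer(n_events=5, embed_dim=4, n_features=3)
prob = scorer.forward(event_ids=[0, 2, 2, 4], features=[0.3, 1.0, 0.0])
```

The weights here are random and untrained; the sketch only shows how the two inputs meet in a single concatenated vector before the output layer.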
2. Related Work
3. Theory and Method
3.1. LSTM
3.2. Transformer
3.3. Feature Embedded Transformer
3.3.1. Input Data and Data Coding
3.3.2. Transformer Encoding Layer
3.3.3. Concatenate Layer and Output Layer
3.4. Evaluation Metrics
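The result tables report two metrics, KS and AUC. As a minimal sketch of how they are commonly computed, the code below builds the ROC curve from scores and binary labels, takes KS as the largest gap between the true positive rate and the false positive rate, and takes AUC as the trapezoidal area under the curve. The function names, the convention that label 1 marks the positive class, and the toy data are assumptions made for illustration, not taken from the paper.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) pairs swept over descending score thresholds.

    Tied scores are grouped into a single step. Both classes must be present.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Consume every sample that shares the current score.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def ks_statistic(labels, scores):
    """Kolmogorov-Smirnov statistic: the maximum of |TPR - FPR|."""
    return max(abs(tpr - fpr) for fpr, tpr in roc_points(labels, scores))


def auc(labels, scores):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(labels, scores)
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )


# Toy data (assumption): label 1 is the positive class.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
ks_value = ks_statistic(labels, scores)   # 2/3
auc_value = auc(labels, scores)           # 8/9
```

On the toy data, 8 of the 9 positive-negative pairs are ranked correctly, which gives an AUC of 8/9, and the largest TPR-FPR gap is 2/3.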
4. Experimental Results
4.1. Experimental Set
4.2. Performance Analysis
4.3. Parameter Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

Models | Training Set KS | Training Set AUC | Test Set KS | Test Set AUC
---|---|---|---|---
LR | 0.23 | 0.622 | 0.23 | 0.621
XGBoost | 0.248 | 0.634 | 0.241 | 0.63

Models | Training Set KS | Training Set AUC | Test Set KS | Test Set AUC
---|---|---|---|---
LR | 0.092 | 0.57 | 0.09 | 0.54
XGBoost | 0.1 | 0.58 | 0.095 | 0.553
LSTM | 0.203 | 0.631 | 0.198 | 0.62
AM-LSTM | 0.243 | 0.66 | 0.238 | 0.661
Transformer | 0.26 | 0.679 | 0.25 | 0.672

Models | Training Set KS | Training Set AUC | Test Set KS | Test Set AUC
---|---|---|---|---
LR | 0.25 | 0.670 | 0.251 | 0.658
XGBoost | 0.262 | 0.679 | 0.26 | 0.665
LSTM | 0.273 | 0.7 | 0.26 | 0.682
AM-LSTM | 0.31 | 0.707 | 0.313 | 0.71
FE-Transformer | 0.33 | 0.731 | 0.32 | 0.72

Models | With Normalization KS | With Normalization AUC | Without Normalization KS | Without Normalization AUC
---|---|---|---|---
LSTM | 0.26 | 0.682 | 0.175 | 0.61
AM-LSTM | 0.313 | 0.71 | 0.21 | 0.6
FE-Transformer | 0.32 | 0.72 | 0.24 | 0.63
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, C.; Xiao, Z. A Deep Learning Approach for Credit Scoring Using Feature Embedded Transformer. Appl. Sci. 2022, 12, 10995. https://doi.org/10.3390/app122110995