1. Introduction
Since 2011, with the rapid development of internet finance in China, peer-to-peer (P2P) network lending has grown quickly, giving rise to Alibaba, Suning Micro-finance, and a large number of P2P lending platforms. P2P is the main form of network lending. According to the “Guiding Opinions on Promoting the Healthy Development of Internet Finance” released by the People’s Bank of China and ten other ministries and commissions on 18 July 2015, network lending includes individual network lending (i.e., P2P network lending) and network microfinance. Individual network lending refers to direct lending between individuals through an internet platform. Network microfinance refers to small loans that internet enterprises provide to customers online through the microfinance companies under their control. As intermediaries, network lending service agencies provide information release, risk assessment, credit consulting, transaction management, and customer services for lending activities, and they serve both the borrower and the lender in return for service charges. P2P exploits the advantages of the internet and connects the borrower and the lender directly, without commercial banks as intermediaries, which greatly reduces transaction costs and meets the needs of China’s current economic development to a large extent.
P2P lending has kept a sound momentum of development since its emergence. In 2005, Zopa, the first network lending website, was founded in Britain. In 2007, Prosper, America’s first P2P lending website, was founded. Since then, P2P platforms have sprung up around the world. Because its financial system is imperfect, China faces a relatively serious credit rationing problem: there is strong demand for financing and lending in society, and traditional commercial banks find it hard to meet this demand. Against this background, China’s P2P platforms experienced explosive growth. According to the “Annual Report of China’s Network Lending Industry in 2017” released by WDZJ, by the end of December 2017 the number of normally operating platforms in the network lending industry was 1931, a decrease of 517 compared with the end of 2016, and the number declined steadily throughout the year. Since the platform rectification process has not yet been completed, the number of operating platforms is expected to decline further in 2018, at a rate that depends on filing and compliance. Based on current information, the number of operating platforms may fall to about 800 by the end of 2018. Meanwhile, the rectification of the network lending industry reached its final stage in 2017, and the number of platforms exiting the industry dropped substantially compared with 2016: 645 platforms suspended business or ran into problems in 2017, against 1713 in 2016. The proportion of problematic platforms also continued to decrease: in 2017, problematic platforms accounted for only 33.49% of exiting platforms, while 66.51% of the platforms chose a benign exit.
All the above data indicate that the regulation of China’s network lending industry is highly effective and fruitful, and the industry development environment will be increasingly healthy in the future. Thus, it can be seen that China’s online lending industry has moved from the stage of “savage development” to the new stage of “standardized development”.
Because regulation is lacking, domestic P2P lending faces considerable legal and platform risks, and, as a credit business, it is also exposed to credit risk from borrower default. These risks do not, however, negate the positive significance of P2P lending for society. In view of this situation, domestic and foreign scholars have studied P2P lending from various angles so as to help platforms control risk and promote the healthy development of the industry. Moreover, as China’s credit system and credit rating system are not yet sound, theoretical research on P2P network lending can also offer guidance for setting interest rates and conducting credit rating in China’s network lending market.
This paper differs from existing research in two respects. First, it analyzes the factors influencing P2P network credit from the perspective of loss given default (LGD). Second, it uses the transaction data of Lending Club to describe the probability distribution characteristics and influencing factors of LGDs in P2P network lending. The LGDs of P2P loans are found to follow a clearly unimodal distribution with a relatively high peak value, and the distribution becomes more concentrated as the borrower’s credit rating declines. The total assets of the borrower have no significant impact on LGD, the credit rating and the debt-to-income ratio exert a significant negative impact, while the term and amount of the loan produce a relatively strong positive impact.
The remainder of this paper is organized as follows: the second part is the literature review, the third part presents the theoretical analysis and research hypotheses, the fourth part presents the empirical analysis, and the fifth part gives the conclusions and suggestions.
2. Literature Review
A large number of scholars have investigated P2P lending as a new lending mode. The related studies mainly concentrate on the factors influencing P2P default and on P2P credit risks.
In respect of empirical research, the data used in studies of network lending mainly come from the transaction data of Prosper and Lending Club. The initial research mainly focuses on the factors influencing the interest rate, success rate and default rate of network lending. Ravina (2007) [1] conducted a comprehensive empirical analysis of network lending from the perspectives of lending success rate, interest rate and default rate, using the transaction data of Prosper. The study finds that not only do the borrower’s hard financial information and the loan contract itself directly affect the success rate of the loan, but other factors (such as the loan amount, term and interest rate) also have a certain impact on it; in addition, good hard financial information, the credit limit, race and other factors can significantly affect interest rates. Based on empirical research on Prosper, Klafft (2008) [2] found that “hard information” related to the default rate, such as the status of the borrower’s verified bank account and the borrower’s credit rating, has a significant positive impact on the transaction rate of P2P loans. Marques, Garcia and Sanchez (2013) [3] established a regression model based on expert scoring to measure the borrower’s risk, with financial indexes as the explanatory variables. Serrano-Cinca (2015) [4] applied a single-factor mean test and survival analysis to 24,449 sample loans from the Lending Club platform from 2008 to 2014 and found that the factors explaining default were loan purpose, annual income, current housing status, credit records, and liabilities. Malekipirbazari (2015) [5] compared different machine learning methods for identifying high-quality P2P borrowers; the results show that random forest (RF) is significantly superior to FICO scoring and LC grades in identifying the best borrowers.
In terms of credit risks, in order to make the most of the batch-processing capacity of the internet and big data technology, many transactions on P2P platforms are unsecured credit loans, and the biggest risk is the borrower’s credit risk. Under this constraint, in order to reduce the information asymmetry between the borrower and the lender, P2P platforms have also introduced a number of advanced information control methods. For instance, when rating loan credit, Prosper drew on the group mode of Grameen Bank, in which groups develop strict admission standards to help members obtain low borrowing rates in the future, so the influence of social information is also one of the hot spots of research. Through research on the Prosper platform, Herzenstein (2008) [6] found that one of the essential conditions for the borrower’s financing success is a good credit score. Rating is a further deepening of scoring. Herrero (2009) [7] found that social group and other information in network lending can not only enhance the availability of loans but also offset disadvantages such as a high personal borrowing rate and a low credit rating. Lin (2009) [8] also found that social network capital can raise the probability of obtaining a network loan, reduce the interest rate and lower the default rate at the same time. Leow et al. (2014) [9], taking retail loan data in the UK as the research object, found that macroeconomic factors have a certain impact on LGD: for mortgage loans, macroeconomic factors improve the estimation of LGD, but for personal loans they do not improve prediction accuracy. Robert et al. (2015) [10], without considering the correlation between the probability of default (PD) and LGD, modeled the LGDs of small and medium-sized loans with complex collateral and obtained an expression for LGD in terms of collateral and risk exposure. Without quantified data similar to those available in bank lending, investors cannot convert available information into appropriate market behavior, which threatens the sustainable development of P2P lending (Mild et al., 2015) [11]. Emekter et al. (2015) [12] analyzed Lending Club platform data from May 2007 to June 2012, constructed a logistic regression (LR) model to predict the borrower’s default probability, and used the debt-to-income ratio, FICO score and revolving credit amount to rate borrowers; the empirical evidence suggests that credit rating plays a crucial role in reducing default.
In China, with the rapid development of P2P network lending, related studies have also emerged in large numbers. Li Yuelei et al. (2013) [13] found that the basic properties of loan orders, the basic information of borrowers and the social capital of borrowers exert significant effects on lending success rates in the Chinese market. They also found that investors in China’s P2P microfinance market exhibit obvious herding behavior, and that this herding behavior has an important influence on the success rates of borrowing. Liao Li et al. (2014) [14] used the transaction data of Renrendai to study empirically the risk identification effect of interest rates in P2P lending. The results show that, under asymmetric information, the not fully market-oriented interest rates generated by Renrendai reflect the borrower’s default risk, but a high proportion of default risk is still not reflected in the interest rate, and some basic public information about the borrower helps to predict this part of the risk to some extent. The empirical results of Liao Li et al. (2015) [15] demonstrate that borrowers with high academic qualifications are more likely to repay as agreed, and that the length of higher education enhances the borrower’s self-discipline. However, investors do not prefer borrowers with high academic qualifications, and there are biases in how they identify credit risk through educational level. The research results of He Qizhi et al. (2016) [16] suggest that the fluctuations of network lending interest rates show clustering and risk accumulation effects rather than a leverage effect and respond in a generally consistent way to bull and bear information, which means that the risk of the network lending market is high while the risk awareness of market participants is weak. The empirical results of Xuchen Lin et al. (2017) [17] reveal that gender, age, marital status, educational level, working years, company size, monthly payment, loan amount, debt-to-income ratio and delinquency history play a significant role in loan defaults.
To sum up, though P2P lending differs from commercial bank loans in lending style, both are essentially loan relations based on credit, and the biggest risk is still the borrower’s credit risk. According to the internal rating theory introduced by the New Basel Capital Accord, credit risk can be analyzed from two aspects: the default rate, which reflects the possibility of default, and LGD, which reflects the severity of the loss after default. Scholars both at home and abroad have produced fruitful findings on the default rate. These findings show that more information and better hard financial conditions of the borrower lower the default rate. However, research into LGD is still insufficient. The default rate and LGD are equally important for describing the losses caused by loan defaults, and both are indispensable to a full description of default losses. Therefore, the author considers that, in addition to the probability of default, the LGD of P2P lending should also be an important issue. To this end, this paper analyzes the factors influencing the LGD of P2P network credit and, on this basis, uses relevant Lending Club data to examine these factors empirically. Theoretically, this research enriches the study of loan default risk and provides a theoretical basis for a comprehensive description of default losses. Meanwhile, the findings of this paper about LGD can guide P2P platforms to evaluate credit risks more accurately and to monitor platform operation risks. In this way, the P2P lending industry can develop more soundly.
4. Empirical Analysis
4.1. Data and Variables
This paper uses the transaction data of Lending Club as the research object. The transaction data from June 2007 to December 2017 are selected to form a dataset. The original dataset contains 1,341,582 borrowing records. Loans that have been charged off are treated as defaulted. Records whose income has not been verified by Lending Club and records with incomplete information are screened out. The remaining 41,717 records constitute the sample for the empirical analysis. Using these data, this paper selects appropriate loan and borrower attributes as explanatory variables and makes a detailed empirical analysis of the LGD of loans on Lending Club.
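The screening rules described above can be illustrated with a minimal sketch. It assumes that the final sample consists of charged-off loans with verified income and complete information; the records and the field names (`status`, `income_verified`, `annual_inc`, `dti`) are hypothetical and are not the actual Lending Club column names.

```python
# Hypothetical records; the field names are assumptions for illustration only.
loans = [
    {"id": 1, "status": "Charged Off", "income_verified": True,  "annual_inc": 52000, "dti": 18.4},
    {"id": 2, "status": "Fully Paid",  "income_verified": True,  "annual_inc": 81000, "dti": 9.7},
    {"id": 3, "status": "Charged Off", "income_verified": False, "annual_inc": 40000, "dti": 22.1},
    {"id": 4, "status": "Charged Off", "income_verified": True,  "annual_inc": None,  "dti": 30.2},
    {"id": 5, "status": "Charged Off", "income_verified": True,  "annual_inc": 67000, "dti": 12.5},
]

REQUIRED_FIELDS = ("annual_inc", "dti")


def is_default(loan):
    """A loan counts as defaulted once it has been charged off."""
    return loan["status"] == "Charged Off"


def is_complete(loan):
    """Keep only records whose required fields are all present."""
    return all(loan.get(field) is not None for field in REQUIRED_FIELDS)


def build_sample(records):
    """Apply the three screens: defaulted, income verified, complete information."""
    return [r for r in records if is_default(r) and r["income_verified"] and is_complete(r)]


sample = build_sample(loans)
sample_ids = [r["id"] for r in sample]
print(sample_ids)  # [1, 5]
```

Record 2 is dropped because it did not default, record 3 because its income is unverified, and record 4 because a required field is missing.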
As for the loan factors, this paper selects the amount of each loan, the loan term, the interest rate, the credit rating and other variables for study. It should be noted that the credit rating of loans in Lending Club is derived from the borrower’s FICO credit score, the loan amount and the loan term. Lending Club first makes a preliminary classification of the credit rating of a loan based on the borrower’s FICO score, then takes the amount and term of the loan into account by adding a risk modification, and finally obtains the specific credit rating of the loan. That is to say, the credit rating of the loan is a composite of the borrower’s credit rating, the loan amount and the loan term. Lending Club’s interest rate is derived from the credit rating of the loan plus a risk premium rather than negotiated between the borrower and the investor, which results in a strong correlation between the interest rate and the credit rating of the loan. Therefore, this paper cannot examine the influence of the interest rate on its own and can only treat it as a proxy variable for the credit rating of the loan.
The borrower factors mainly include the borrower’s financial position, total assets, and recent credit record. Since direct data on these attributes cannot be obtained, this paper uses some variables published by Lending Club as proxies. The debt-to-income ratio reflects the borrower’s repayment capability well, so it is used to measure the borrower’s financial position. Income and housing reflect the borrower’s current capital condition well, while working years reflect the borrower’s historical accumulation, so this paper measures the borrower’s total assets with working years, housing ownership and income. Lending Club directly reports the borrower’s number of credit defaults within two years, which reflects recent credit conditions well. The detailed description of all variables is shown in Table 1.
All the selected explanatory variables are summarized statistically, and the results are shown in Table 2. The variables are then summarized by credit rating, and Table 3 lists the statistical information of the variables under different credit ratings. Among the default samples, the highest interest rate is 30.9% while the lowest is 5.23%, a large difference. The average loan amount is 18,166 US dollars, so the maximum loan amount of 40,000 US dollars is not particularly high, which indicates that the amounts of defaulted loans in Lending Club are relatively concentrated. The highest annual income among the samples reaches 7,500,000 US dollars, against an average annual income of 73,673 US dollars and a standard deviation of 58,024 US dollars, which indicates that the annual incomes of defaulters are relatively dispersed.
Table 3 shows the performance of loans of all credit ratings. Loans with credit ratings B–E make up the largest proportion of defaulted loans, 84.55%, while the A level accounts for only 2.96%, and the F and G levels take up 9.34% and 3.15%, respectively. Among total loans, by contrast, the A level accounts for 16.73%, while the F and G levels take up only 2.13% and 0.68%, respectively. The default rate thus increases significantly as the rating declines.
Figure 2 shows the proportions of loans of each credit rating in total loans and in defaulted loans. Interestingly, credit ratings among the default samples are negatively related to working years: average working years rise from 6.08 at the A level to 6.29 at the G level, so that longer working years go together with a higher grade, that is, a poorer rating.
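The shares quoted above can be checked with a short calculation. The sketch below uses only the percentages stated in the text; the ratio of a grade’s share among defaulted loans to its share among all loans is an illustrative measure introduced here, not a statistic reported in the paper.

```python
# Shares taken from the text (percent of defaulted loans and of all loans).
default_share = {"A": 2.96, "F": 9.34, "G": 3.15}
total_share = {"A": 16.73, "F": 2.13, "G": 0.68}

# Share of defaulted loans left for grades B to E.
b_to_e_default_share = round(100 - sum(default_share.values()), 2)

# A ratio above 1 means the grade is over-represented among defaulted loans.
representation = {g: round(default_share[g] / total_share[g], 2) for g in default_share}

print(b_to_e_default_share)  # 84.55
print(representation)        # {'A': 0.18, 'F': 4.38, 'G': 4.63}
```

Grade A loans appear among defaults at less than one fifth of their weight in the loan book, whereas F and G loans appear at more than four times their weight, which is consistent with a default rate that rises as the rating declines.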
Table 4 displays the correlation coefficient matrix of the 9 explanatory variables, which shows that the loan interest rate is strongly correlated with the credit rating in Lending Club. This is because Lending Club sets loan interest rates according to the credit rating: with 5.05% as the benchmark rate, each one-step decline in credit rating raises the risk premium by 0.5–1%, so that the rate ranges from 5.32% at the best level, A1, to 30.99% at the worst level, G5. The lending amount has a significantly positive correlation with both the lending term and annual income. The lending term is also significantly positively correlated with both the interest rate and the credit rating. However, the positive correlations among working years, poor credit records within two years and credit rating are very weak.
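As a rough consistency check on the rate schedule described above, the average increment per rating step can be computed from the two endpoint rates. The sketch assumes five sub-grades per letter grade (A1 to G5, that is, 35 levels), which the text implies but does not state explicitly.

```python
# Assumption: each letter grade A..G has five sub-grades, giving A1 ... G5.
grades = "ABCDEFG"
sub_grades = [f"{g}{n}" for g in grades for n in range(1, 6)]

best_rate = 5.32    # percent, A1 (from the text)
worst_rate = 30.99  # percent, G5 (from the text)

steps = len(sub_grades) - 1
avg_step = (worst_rate - best_rate) / steps

print(len(sub_grades), round(avg_step, 3))  # 35 0.755
```

An average step of roughly 0.76 percentage points lies inside the stated 0.5–1% range.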
4.2. Distribution Characteristics of LGD
According to the definition given in Formula (1), this paper analyzes the default data released by Lending Club, treats loans that have been charged off and loans overdue for more than 120 days as defaults, and calculates the LGD of each defaulted loan with Formula (1). The statistical results are shown in Table 5.
As the statistical results show, the average LGD is 0.6236 and the standard deviation is 0.2195, indicating that the default loss is relatively large overall and relatively concentrated. The minimum value is negative. This is because the calculation in this paper considers only the book value of the debt: for some defaulted loans, the amount eventually collected, including late fees and penalty interest, may exceed the total principal and interest, so the loss rate can be negative.
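To see how a negative LGD can arise, consider a minimal sketch under an assumed definition, LGD = 1 − recovered amount / exposure at default. This definition, the function name `loss_given_default` and the dollar figures are illustrative assumptions; they are not taken from Formula (1) or from the Lending Club data.

```python
def loss_given_default(exposure, recovered):
    """LGD = 1 - recovered / exposure (an assumed definition, for illustration)."""
    if exposure <= 0:
        raise ValueError("exposure at default must be positive")
    return 1.0 - recovered / exposure


# Hypothetical loan with 10,000 outstanding at default and 3,764 recovered.
typical = loss_given_default(10_000.0, 3_764.0)

# Hypothetical loan where collections plus late fees and penalty interest
# exceed the book value of the debt.
over_recovered = loss_given_default(10_000.0, 10_400.0)

print(round(typical, 4), round(over_recovered, 4))  # 0.6236 -0.04
```

Whenever the amount collected exceeds the book value used as the exposure, the computed loss rate falls below zero.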
In order to display the probability distribution characteristics of LGD more intuitively, this paper uses the kernel method to estimate the probability distribution of the LGD of defaulted Lending Club loans and obtains the distribution shown in Figure 3. The LGDs of Lending Club loans display a distinct unimodal distribution with the peak at around 75%. This shows that LGDs are generally high: although only a small share of loans default, each default causes a large loss. The results for Lending Club’s defaulted loans are close to the statistical results of Til Schuermann for subordinated bonds and limited uncovered bonds. Both show a unimodal distribution, and the peak values are close to each other. This suggests that the priority of the credit loans generated by Lending Club is relatively low and their LGD is comparatively large.
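The kernel estimate can be sketched as follows. The paper does not state which kernel or bandwidth it uses, so the Gaussian kernel and the normal-reference bandwidth rule below are assumptions, and the LGD values are synthetic draws from a Beta(7, 3) distribution (whose mode is 0.75), not Lending Club data.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Synthetic LGD values (NOT Lending Club data): Beta(7, 3) has its mode at 0.75.
random.seed(42)
lgd_values = [random.betavariate(7, 3) for _ in range(2000)]

# Normal-reference rule of thumb for the bandwidth (an assumed choice).
bandwidth = 1.06 * statistics.stdev(lgd_values) * len(lgd_values) ** (-0.2)

kde = gaussian_kde(lgd_values, bandwidth)
grid = [i / 100 for i in range(101)]       # evaluation points 0.00, 0.01, ..., 1.00
density = [kde(x) for x in grid]
peak = grid[density.index(max(density))]   # location of the single mode

print(f"estimated peak of the LGD density: {peak:.2f}")
```

Running the same estimator separately on the loans of each credit rating would give one density curve per rating.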
The same probability distribution analysis is then carried out separately for the LGDs of loans of different credit ratings, and the results are shown in Figure 4. In Figure 4, the LGDs of A-level loans are distributed relatively uniformly, with a relatively wide and flat distribution between 0.1 and 0.9. As the credit rating of the loan declines, the LGD curve gradually steepens and moves to the right, and it finally converges to a single peak at 0.85. This is most obvious for G-level loans, whose LGD probability density has a high peak near 0.85. The pattern shown in Figure 4 is similar to that reported by Til Schuermann: as the credit rating of the loan declines, the mean LGD increases gradually and the distribution tends to become concentrated.
4.3. Result Analysis
The previous section describes the probability distribution characteristics of the LGDs of Lending Club loans, but what factors affect LGD? This section conducts an empirical study of the impact of loan and borrower characteristics on LGD. The empirical model is a multivariate linear regression in which LGD is regressed on a constant term, the feature information vector of the loan, X, and the feature information vector of the borrower, W. X includes the credit rating, principal, term and interest rate of the loan, while W includes the borrower’s working years, annual income, debt-to-income ratio and number of defaults within two years (see Table 1 for specific information).
Table 6 lists the results of multiple linear regression analysis of this model.
Because the loan interest rate and the borrower’s credit rating are strongly correlated in the sample data, the model is estimated in two steps so as to avoid the correlation between these independent variables. In the first step, the credit rating variable is removed from the regression model, which gives the first column of results in Table 6. In the second step, the loan interest rate variable is removed from the regression model, which gives the second column of regression results.
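The two-step estimation can be sketched on synthetic data. The sketch assumes a linear model of the form LGD = a + b·X + c·W + e, keeps only the loan term as an additional regressor for brevity, and solves ordinary least squares through the normal equations; the data-generating coefficients are arbitrary assumptions, so the output does not reproduce Table 6.

```python
import random


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Synthetic loans (NOT Lending Club data): the rate is almost a function of the grade.
random.seed(0)
data = []
for _ in range(500):
    grade = random.randint(1, 7)                      # 1 = A ... 7 = G
    rate = 5.0 + 3.5 * grade + random.gauss(0, 0.3)   # strongly tied to the grade
    term = random.choice([36, 60])                    # months
    lgd = 0.40 + 0.04 * grade + 0.002 * term + random.gauss(0, 0.05)
    data.append((grade, rate, term, lgd))

y = [lgd for *_, lgd in data]

# Step 1: drop the credit rating and keep the interest rate.
b_step1 = ols([[1.0, rate, term] for _, rate, term, _ in data], y)
# Step 2: drop the interest rate and keep the credit rating.
b_step2 = ols([[1.0, grade, term] for grade, _, term, _ in data], y)

print("step 1 (const, rate, term): ", [round(b, 4) for b in b_step1])
print("step 2 (const, grade, term):", [round(b, 4) for b in b_step2])
```

Estimating the two variables in separate regressions avoids placing two nearly collinear regressors in the same model, where their individual coefficients would be unstable.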
Analysis of the data presented in Table 6 shows that LGD and credit rating are significantly negatively correlated; note that the higher the grade is, the poorer the credit rating is. By contrast, LGD is significantly positively correlated with the loan interest rate. This is because Lending Club sets a strict correspondence between the grade and the interest rate, and the credit rating decides the interest rate of a loan. Therefore, H1 is substantiated, but H4 cannot be verified. Besides, the relationship between LGD and the lending amount is not significant, which suggests that H2 cannot be verified. LGD and the lending term are significantly positively correlated, which provides solid evidence for H3. Among the loan factors, apart from the hypotheses about the interest rate and the lending amount, the other two hypotheses can both be substantiated.
With respect to the borrower factors, the regression results first indicate that LGD is significantly negatively correlated with the borrower’s working years and housing ownership. Both factors reflect the borrower’s financial status. This means that the more total assets the borrower has, the lower the risk of default losses will be. In contrast, LGD and the borrower’s income are significantly positively correlated. This suggests that an increase in income cannot enhance the safety of debts. Rather, as income increases, the borrower’s willingness to borrow strengthens, and as the loan principal increases, LGD tends to rise. Hypothesis 5 and Hypothesis 6 are therefore established. Next, LGD has a significantly positive correlation with the borrower’s debt-to-income ratio and number of defaults within two years, indicating that the borrower’s financial position exerts a significant negative impact on LGD, while a recent poor credit standing has a significant positive impact on LGD, so Hypothesis 7 and Hypothesis 8 are established. The third and fourth columns give the regressions with the borrower’s total assets removed from the model, and the results are the same as those of the previous analysis.
5. Conclusions and Implications
Through an empirical analysis of defaulted Lending Club loans, this paper describes the probability distribution characteristics and influencing factors of LGDs in P2P network lending. The probability density of the LGD of P2P lending is found to be unimodal, with the peak at around 0.75. This characteristic is similar to the statistical results obtained by previous research on subordinated bonds and limited uncovered bonds. It means that P2P lending has a lower degree of priority, so its LGD is higher, which further demonstrates the importance of LGD in deciding the final default losses. Besides, after estimating the probability density distribution of LGD for the different credit ratings in the sample, this paper observes that, as the loan credit rating declines, LGD keeps rising and its probability density distribution tends to concentrate. On this basis, the negative correlation between the credit rating and LGD can be verified. To sum up, P2P lending has a lower degree of priority, and once default happens, the losses are often serious. P2P lending platforms therefore need to control LGD so as to limit the negative influence of defaults.
Taken as a whole, the poorer the loan credit rating and the longer the lending term, the higher the LGD. Among the borrower’s factors, the borrower’s total assets are significantly negatively correlated with LGD, while the borrower’s income has a significantly positive correlation with LGD. The borrower’s financial status is significantly negatively correlated with LGD, while the borrower’s recent poor credit records are significantly positively correlated with LGD. On the whole, from the perspective of the borrower, the higher the borrower’s total assets and the better the borrower’s financial status, the lower the LGD. When the borrower’s income is studied, the extra willingness to borrow brought by the borrower’s consumption needs should also be taken into account, as it can lead to an increase in LGD.
On this basis, this paper draws the following implications:
- (1) LGDs of network lending are generally high, so when P2P platforms carry out risk control, it is necessary to give priority to controlling the default rate and to take preventive measures in order to ensure the safe operation of the platform.
- (2) The credit rating of the borrower has a strong negative impact on the LGD of network lending, so P2P platforms need to be more cautious when reviewing loans with a low credit rating. The credit rating of loans is mainly based on the borrower’s credit score, which shows the significance of the credit reporting system for the risk control of network lending. In view of the imperfect credit reporting system in China at present, and in order to seek better development of the P2P industry in the future, the state should strongly support the construction of the credit reporting system, and all platforms should take the initiative to shoulder their own responsibilities, strengthen cooperation with the credit reference system of the central bank, and cooperate with the government to establish and improve the credit reporting system so as to achieve a win-win situation.
- (3) When a network lending platform reviews a loan, it should bear in mind that the borrower’s willingness to repay is always more important than the borrower’s ability to repay, and it should pay particular attention to the borrower’s recent credit performance. Only in this way can loan losses and loan risks be reduced and the interests of investors be protected.
- (4) When a network lending platform reviews a borrower, it should not be misled by the size of the borrower’s total assets but should attach more importance to the borrower’s current financial situation, investigate the borrower’s other debt burdens clearly, and examine the debt-to-income ratio so as to reduce adverse selection.