Article

CSRLoan: Cold Start Loan Recommendation with Semantic-Enhanced Neural Matrix Factorization

School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(24), 13001; https://doi.org/10.3390/app122413001
Submission received: 9 October 2022 / Revised: 4 December 2022 / Accepted: 8 December 2022 / Published: 18 December 2022
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Recommending loan products to applicants would benefit many financial businesses and individuals. Nevertheless, many loan products suffer from the cold start problem; i.e., there are no historical data available for training the recommendation model. Considering the delayed feedback and the complex semantic properties of loans, methods for general cold start recommendation cannot be used directly. Moreover, existing loan recommendation methods ignore the default risk, which should be evaluated along with the approval rate. To address these challenges, we propose CSRLoan for cold start loan recommendation. CSRLoan employs pretraining techniques to learn the embeddings of statements, which capture the intrinsic semantic information of different loans. For recommendation, we design a dual neural matrix factorization (NMF) model, which can not only capture the semantic information of both loan products and applicants but also generate the recommendation results and default risk at the same time. Extensive experiments have been conducted on real-world datasets to evaluate the effectiveness and efficiency of the proposed CSRLoan.

1. Introduction

With the rapid increase in online lending requests, it is difficult to choose the right loan products through human decisions alone. Loan recommendation, which recommends loan products to applicants, has been considered a critical task for many microfinance applications, such as Kiva, Lendio, etc. Empowered by machine learning techniques, a line of recommendation methods [1,2,3,4,5] has been proposed to match appropriate users with loans, but these methods require sufficient historical data for training. Nevertheless, both the applicants and the products are updated frequently in loan recommendation, leaving too little time to collect enough data to train a user-specific recommendation model, which leads to a serious cold start problem. Thus, cold start loan recommendation, which recommends the right loan projects to applicants without substantial historical data, is designed to solve this problem and can be widely used in many circumstances.
To remedy the cold start problem, online loan platforms ask applicants to provide detailed or short statements about the purpose of the money. We argue that the semantic information contained in these statements is useful for loan recommendation. Even so, building a cold start loan recommendation system remains challenging for the following reasons. (1) Complex semantic information: the statements are non-structured text and can be written in different languages, which makes it difficult to understand their semantics. (2) Delayed feedback: according to the report of Kiva.org (https://www.kivaushub.org/kivaprocess, accessed on 8 October 2022), the feedback duration of a loan usually ranges from 6 months to 3 years, and the payback period also varies. It is therefore infeasible to obtain feedback immediately after we recommend and approve a loan. (3) Mixed evaluation metrics: the recommendation of a loan project should consider not only the conversion rate but also the default risk.
Although many studies have been conducted on cold start recommendation, none of them can handle all three challenges. For the first challenge, many context-based recommendation methods [6,7] have been proposed. These methods convert the statements into several categories to capture their semantics, whereas we argue that semantic information is better modeled by learned semantic representations, which these methods fail to provide. For the second challenge, although many cold start recommendation models, such as JIM [8] and CTLM [9], have been proposed for general recommendation, their problem definitions are quite different from ours: most of these works are designed for instant-feedback applications and cannot be used for loan recommendation. Apart from these methods, previous credit scoring methods [10,11,12,13] are also related to our problem, but they cannot handle the cold start setting. Moreover, for the third challenge, no existing work considers mixed evaluation metrics.
With the emergence of natural language processing techniques, the semantics of both applicant statements and loan project characteristics can be represented in a unified hyperspace, which enables us to solve these challenges systematically. Unfortunately, most existing loan recommendation techniques [3,14,15] follow collaborative filtering frameworks, which cannot be easily adapted to textual data. Even worse, recently proposed methods [16,17] assume that the interactions of applicants are available and employ graph neural networks to model them. These methods are not suitable for our task because relying on applicant interactions violates the cold start setting.
In this paper, we propose a novel semantic-enhanced neural matrix factorization model for cold start loan recommendation, abbreviated as CSRLoan. Instead of modeling user-specific preferences, CSRLoan directly uses the provided semantic information to build the recommender system. Loans whose statements are semantically similar are recommended to the same applicants. In this way, both the applicants and the loans can be viewed as several latent groups. When a new loan arrives, the model first decides which group it belongs to and then recommends it to the right applicants.
CSRLoan contains three key modules, i.e., statement encoding, dual NMF, and a mixture learning target. To capture the semantic information of statements, CSRLoan employs a pre-training algorithm to learn the initial representations of word embeddings and constructs a transformer network to encode each statement into a representation vector. Taking the statement vector as input, CSRLoan designs an unbalanced NMF model to map the applicants and loan projects into a semantic space and recommends the loan projects near the applicant in that space. To optimize the model parameters, we propose a mixture loss that balances the influence of the conversion rate and the default risk.
The main contributions of this paper can be summarized as follows:
  • To the best of our knowledge, this is one of the pioneering works that models the semantics of statements with pre-training techniques and utilizes them for loan recommendation. The intrinsic characteristics of statements make them suitable for solving the cold start recommendation problem.
  • We propose CSRLoan, a dual neural matrix factorization model for cold start loan recommendation. It first learns the representations of statements. Then, the loan projects and applicants are embedded in a semantic space for better recommendation.
  • We conduct extensive experiments on a real-world dataset. The results show the superiority of CSRLoan compared with all baselines.

2. Related Work

In this section, we summarize the related works of CSRLoan. These works can be classified into three categories: loan recommendation, cold start recommendation, and pre-training for semantic modeling.

2.1. Loan Recommendation

Despite the emergence of social loans, few research works [3,14,15,17,18] have been developed to solve the loan recommendation problem. These methods can be roughly divided into two categories, i.e., recommending loan projects to applicants [3,16,18] and recommending applicants to loan projects [17].
For the first category, Zhao et al. [18] proposed a collaborative filtering approach to recommend optimal loan projects to applicants. Lee et al. [3] focused on the fairness of loan recommendation and utilized the matrix factorization framework to select the right loan projects. Recently, Liu et al. [16] proposed a graph convolution network to recommend loan projects while considering the return rate of applicants. However, all these methods utilize either loan–applicant interactions or applicant–applicant interactions to build the recommendation model. In our problem, such historical data are not available, which limits their usage.
For the second category, only one work [17] aimed to find potential lenders for loan projects. Zhang et al. employed random walk techniques on a historical loan–applicant graph to obtain the embeddings of applicants. Then, a similarity measure was adopted to rank potential lenders. Nevertheless, this work is not directly relevant to our task: we focus on recommending loan projects to applicants without using much historical data.
Moreover, credit-scoring methods [10,11,12,13], which assign credit levels to applicants, are also related to our problem. Traditional machine learning methods such as decision trees [13] and softmax regression [11] have been employed to build classifiers that generate the credit levels. However, these methods ignore the semantic information of statements and thus cannot be used for our problem. Recently, a transformer-based method [10] was proposed for credit scoring. Benefitting from the representation power of the transformer, it achieves state-of-the-art performance on that task. Nevertheless, without considering the properties of loans, it is not suitable for loan recommendation.

2.2. Cold Start Recommendation

The cold start problem has been studied in the recommender system community for many years. The main issue of this problem is that there is insufficient information for making recommendations. According to the literature [19], existing research studies can be categorized into two groups, i.e., the explicit solutions and implicit solutions. Next, we present the related works.
For explicit solutions [20,21,22], the recommender system directly interacts with users or experts to collect information. The user is asked either to fill out a questionnaire or to rate some products. To select appropriate questions, active-learning-based methods [20,22,23] have been proposed to collect adequate information without overwhelming users. Apart from active learning approaches, interviewing is another method for obtaining user preferences: an item set is employed to test the preferences of users, and in each round one item is selected for display to collect the response, which can be of three types: like, dislike, and unknown [21]. Although explicit solutions are efficient, users are often reluctant to participate in the query process, which limits their usage.
Another group of cold start recommendation methods comprises implicit methods [6,24,25,26,27,28]. These methods try to learn new users' preferences via minimal (or one-shot) interactions. Existing user information such as demographics or social relationships is employed to build the cold start recommender system. Based on the external information used, existing solutions in this group can be divided into user-demographic-based approaches [24] and social-relationship-based approaches [25]. Classification methods [26,27] are usually adopted to identify the characteristics of users based on their demographics; then, users with similar characteristics receive similar recommendations. To integrate social relationships, many graph-based techniques [29,30] have been proposed to capture user similarity. However, finding useful information is challenging for these methods.
In loan recommendations, explicit solutions are not appropriate due to the high volume and diversity of applicants. The proposed method CSRLoan falls in the implicit group. To better model the semantic information, CSRLoan employs pre-training techniques and captures the influence of both default risk and conversion rate.

2.3. Pre-Training for Semantic Modeling

Pre-training is one of the new frontiers in the field of natural language processing and can be used as a powerful tool for semantic modeling. The core philosophy of pre-training techniques is transfer learning or domain adaptation. Before a downstream task is carried out, a self-supervised learning procedure is adopted to initialize the representations of some shared information, e.g., word embeddings or model parameters. One of the pioneering works in this field is the GloVe model [31]. It can be thought of as a mix of count-based matrix factorization [32] and the context-based skip-gram model [33]. This method learns the semantics of words by reconstructing the context surrounding them. GPT [34] is another milestone of pre-training techniques. It consists of two distinct stages, unsupervised pre-training and supervised finetuning. An unsupervised loss borrowed from language modeling is designed to learn the initial parameters, which can be generalized to many other tasks via supervised finetuning. More recently, BERT [35] became the state-of-the-art pre-training method, sharing the same idea as GPT; the main difference is its bidirectional encoder built from the transformer [36]. Although many pre-training techniques have been proposed, none of them are specifically designed for loan statements. In this paper, we propose an activity loss along with the masked language model to generate representations for loan recommendation.

3. Preliminaries

In this section, we first summarize the notations and define the loan recommendation problem. Based on the notations, we give an overview of the proposed method CSRLoan. The notations used in this paper are listed in Table 1.

3.1. Problem Formulation

A loan project on a lending platform can be a company project, such as a Webank small loan, or a personal lender. In most cases, the available information about a loan project is limited. Generally, we define a loan project as a tuple of an id and k attributes, i.e., v = (id, a), where a = {a_1, …, a_k} contains the display name, loan purchase number, etc. Specifically, a historical loan record can be organized into a triplet r = (u, v, s), which indicates applicant u applying to loan project v with proposal statement s. For a specific user, the historical loans form a sequence T_u = [r_1, r_2, …, r_K]. Due to the nature of loans, most applicants are fresh users and have no historical loans.
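To make the data model concrete, here is a minimal Python sketch of these structures; the class and field names, and all concrete values, are ours for illustration, not from the paper:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class LoanProject:
    """A loan project v = (id, a): an id plus k attribute values."""
    id: str
    attributes: Dict[str, object] = field(default_factory=dict)  # display name, purchase number, ...

@dataclass
class LoanRecord:
    """A historical record r = (u, v, s): applicant u applied to project v with statement s."""
    applicant_id: str
    project: LoanProject
    statement: str

# A user's history T_u is simply an ordered list of records;
# for most (cold start) applicants this list is empty.
history: List[LoanRecord] = [
    LoanRecord("u_1", LoanProject("v_7", {"display_name": "Farm supplies"}),
               "Buy seed and fertilizer for the planting season"),
]
```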
Given a historical dataset of N loan records D = {r_p | p ∈ 1, 2, …, N} and a loan application of a new user u_n with statement s_n, the cold start loan recommendation problem aims to recommend loan projects that satisfy the applicant's needs. Formally, it is defined as follows.
Cold Start Loan Recommendation: Utilizing all information in D, the goal of cold start loan recommendation is to predict the loan project v̂ that is most likely to satisfy the request of new user u_n while minimizing the default risk:
v̂ = arg max_{v ∈ V} P(v; u_n, s_n, D, Θ)
where P is a measurement of the recommendation performance, V is the set of loan projects, and Θ denotes all parameters in the recommendation model.

3.2. Overview of CSRLoan

CSRLoan is a semantic-enhanced model. It utilizes historical loan records and models the semantic information of applicant statements to achieve cold start recommendation. The overall architecture is illustrated in Figure 1. To model the semantic information of statements, CSRLoan employs a transformer-based neural network to pre-train the statement embedding model. Taking the statement embeddings as input, CSRLoan employs a dual neural matrix factorization model to capture the conversion rate and default risk jointly. Moreover, a mixture loss is defined to optimize the entire model.

4. Methodology

CSRLoan consists of three key modules, i.e., statement encoding, dual NMF, and mixture learning target. Next, we specify the details of these modules.

4.1. Statement Encoding

The statements of user applications play a key role in solving the cold start recommendation problem. The semantics of these statements reflect the users' demographics and can be used to find similar users. In this section, we present a statement-encoding module that converts these statement texts into semantic representations.
A statement can be represented as a sequence of words. Given a statement s of user u, i.e., s = [w_1, w_2, …, w_K], CSRLoan first embeds each word into a vector with an embedding matrix M:
[e_1, e_2, …, e_K] = Embedding(M, s)
where M ∈ R^{M×d} is the word embedding matrix. To model the sequential information of s, we employ the sinusoidal values proposed by Vaswani et al. [36] to encode position information. This is an absolute position encoding known as sinusoidal position embeddings. We construct the sequential position embeddings as follows:
λ_{k,j} = { sin(k / 10000^{j/d}),  if j is even;  cos(k / 10000^{(j−1)/d}),  if j is odd }
where k = 1, …, K and j = 1, …, d. We denote the sequential position encoding of e_k as λ_k ∈ R^d. The input vector is represented as follows.
i_k = e_k + λ_k
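The position encoding above follows the sinusoidal scheme of Vaswani et al.; a minimal NumPy sketch (using 0-based indices, an implementation convention we chose):

```python
import numpy as np

def sinusoidal_position_embeddings(K: int, d: int) -> np.ndarray:
    """Absolute sinusoidal position encodings:
    even dimensions use sin, odd dimensions use cos."""
    lam = np.zeros((K, d))
    pos = np.arange(K)[:, None]            # positions k = 0..K-1
    j = np.arange(d)[None, :]              # dimensions j = 0..d-1
    angle = pos / np.power(10000.0, (2 * (j // 2)) / d)
    lam[:, 0::2] = np.sin(angle[:, 0::2])  # even dimensions
    lam[:, 1::2] = np.cos(angle[:, 1::2])  # odd dimensions
    return lam

# Input vectors: word embedding plus position encoding, i_k = e_k + lambda_k.
K, d = 6, 8
e = np.random.randn(K, d)                  # toy word embeddings
i = e + sinusoidal_position_embeddings(K, d)
```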
After the generation of input vectors, we feed them into a transformer network to obtain semantic embeddings:
h_i^{l+1} = O_h · concat_{k=1}^{H} ( Σ_{j=1}^{K} w_{i,j}^k V^k h_j^l ),  where  w_{i,j}^k = softmax_j( (Q^k h_i^l · K^k h_j^l) / √(d_k) )
where Q^k, K^k, and V^k are the query, key, and value projection matrices of the k-th attention head, O_h ∈ R^{d×d} is the output projection, and H is the number of attention heads. For numerical stability, the terms inside the softmax are clipped to the range (−5, 5) before taking exponents. The attention output h_i^{l+1} is then fed into a feed-forward network (FFN), which contains residual connections and batch normalization modules. For conciseness, we omit the equations of the operations identical to the transformer.
After P layers of the transformer, we obtain the hidden representations of all words, i.e., [h_1^P, …, h_K^P]. To generate the final embedding of s, we employ a mean readout function over the word representations. The embedding e_s of statement s is computed as e_s = (1/K) Σ_{i=1}^{K} h_i^P. For conciseness, we represent the above operations as follows:
e_s = Encoder_trans(s)
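A toy single-head version of this encoder pipeline, ending in the mean readout e_s; this is a simplification of the multi-head block in the text (residual connections and the FFN are omitted), with all weights random for illustration:

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_layer(H, Wq, Wk, Wv, Wo):
    """One single-head self-attention layer: scaled dot-product
    attention weights w_{i,j}, then the output projection."""
    Q, Km, V = H @ Wq, H @ Wk, H @ Wv
    d_k = Q.shape[-1]
    w = softmax(Q @ Km.T / np.sqrt(d_k))   # attention weights over positions
    return (w @ V) @ Wo

def encode_statement(E, layers):
    """Run P attention layers, then mean readout: e_s = (1/K) sum_i h_i^P."""
    H = E
    for (Wq, Wk, Wv, Wo) in layers:
        H = self_attention_layer(H, Wq, Wk, Wv, Wo)
    return H.mean(axis=0)

rng = np.random.default_rng(0)
K, d, P = 5, 8, 2
E = rng.standard_normal((K, d))            # input vectors i_k for one statement
layers = [tuple(rng.standard_normal((d, d)) * 0.1 for _ in range(4)) for _ in range(P)]
e_s = encode_statement(E, layers)
```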
Instead of initializing the model parameters randomly, we design a pre-training strategy to obtain the initial model parameters. Traditional pre-training utilizes an encoder–decoder framework: it masks 20% of the tokens and tries to predict the masked tokens in the decoded sequence. Similarly to the encoder, we can also build a decoder to reconstruct s:
[ŵ_1, ŵ_2, …, ŵ_K] = Decoder_trans([h_1^P, …, h_K^P])
L_mask = Σ_{i ∈ MP} CrossEntropy(w_i, ŵ_i)
where MP is the set of masked positions. For loan applications, there is an “activity” label y accompanying the statement that indicates the status of the loan; it takes one of three states, i.e., default, active, and completed. Thus, we propose a new pre-training task, which predicts the activity label based on the representation:
f_s = MLP(e_s)
p_s = Softmax(f_s)
L_activity = CrossEntropy(p_s, y)
The final loss of pre-training can be represented as follows.
L_pretrain = Σ_{i=1}^{N} (L_mask^i + L_activity^i)
After the pre-training, we employ the encoder to obtain the representation of the statement.
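The combined pre-training objective can be sketched as follows; the probability vectors here are toy inputs standing in for the decoder and MLP outputs:

```python
import numpy as np

def cross_entropy(p, target_idx):
    """Cross-entropy for one categorical prediction p against a target index."""
    return -np.log(p[target_idx] + 1e-12)

def pretrain_loss(masked_probs, masked_targets, activity_probs, activity_label):
    """L_pretrain for one statement: reconstruction loss over the masked
    positions plus the activity-label loss (default/active/completed)."""
    l_mask = sum(cross_entropy(p, t) for p, t in zip(masked_probs, masked_targets))
    l_activity = cross_entropy(activity_probs, activity_label)
    return l_mask + l_activity

# Toy example: two masked tokens over a 4-word vocabulary,
# and a 3-way activity prediction p_s = Softmax(f_s).
masked_probs = [np.array([0.7, 0.1, 0.1, 0.1]), np.array([0.25, 0.25, 0.25, 0.25])]
masked_targets = [0, 2]
activity_probs = np.array([0.2, 0.5, 0.3])
loss = pretrain_loss(masked_probs, masked_targets, activity_probs, 1)
```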

4.2. Dual Neural Matrix Factorization

In this section, we introduce the dual neural matrix factorization module, which not only fits the historical loan records but also estimates the default risk of a loan. It consists of three parts, i.e., feature encoding, loan recommendation, and default risk estimation.
Feature encoding: In this part, we convert the features of applicants and loan projects into vectors. Note that a loan project is v = (id, a) and an applicant is u = (uid, a), where a represents the features, e.g., for applicants, the gender, age, and income grade. These features can be classified into two groups, i.e., value features a_f and category features a_c. For value features, we first normalize them into (0, 1) and feed them into an MLP layer to obtain the input. For category features, we employ embedding techniques to convert them into vectors:
f_f = MLP(Normalize([a_1, a_2, …, a_{k_f}]))
f_c = MLP(concat(Embedding([a_1, a_2, …, a_{k_c}])))
where k_f and k_c are the numbers of value and category features, respectively. We assume that the dimensions of f_f and f_c are both equal to d. For u and v, we use the same method to generate their value and category features. Then, we concatenate them and feed them into an MLP to obtain their representations.
f_u = MLP(concat(f_f^u, f_c^u))
f_v = MLP(concat(f_f^v, f_c^v))
For the applicant, we also incorporate the statement embedding into the features to obtain the final representation of the applicant.
f_u = MLP(concat(e_s, f_u))
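The feature-encoding pipeline above (normalize value features, embed category features, then fuse) can be sketched as follows; single weight matrices stand in for the MLPs, and all table sizes and values are illustrative:

```python
import numpy as np

def encode_features(value_feats, category_feats, embed_tables, W_f, W_c, W_out):
    """Normalize value features into (0, 1) and project them; embed category
    features and project their concatenation; then fuse the two branches."""
    v = np.asarray(value_feats, dtype=float)
    v = (v - v.min()) / (v.max() - v.min() + 1e-12)      # normalize into (0, 1)
    f_f = np.tanh(v @ W_f)                               # value-feature branch
    c = np.concatenate([embed_tables[i][idx] for i, idx in enumerate(category_feats)])
    f_c = np.tanh(c @ W_c)                               # category-feature branch
    return np.tanh(np.concatenate([f_f, f_c]) @ W_out)   # fused representation

rng = np.random.default_rng(1)
d = 8
# Two toy category vocabularies (e.g. country, gender), each with 4-dim embeddings.
embed_tables = [rng.standard_normal((5, 4)), rng.standard_normal((3, 4))]
W_f = rng.standard_normal((3, d))
W_c = rng.standard_normal((8, d))
W_out = rng.standard_normal((2 * d, d))
# Toy applicant: [age, income, loan amount] plus two category indices.
f_u = encode_features([34.0, 1200.0, 300.0], [2, 0], embed_tables, W_f, W_c, W_out)
```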
Loan recommendation. Taking the representations of u and v as inputs, CSRLoan employs a neural matrix factorization method to fit the historical loan records. We first construct an applicant–loan matrix R, where each element R_{i,j} indicates the number of times user u_i successfully completed loan project v_j. Then, we fit these numbers with the representation vectors.
o_{i,j}^r = MLP_r(concat(f_u, f_v))
Default Risk Estimation. Similarly to loan recommendation, CSRLoan employs another neural matrix factorization model to fit the historical default loans. We construct another applicant–loan matrix D, where each element D_{i,j} indicates the number of times user u_i defaulted on loan project v_j.
o_{i,j}^d = MLP_d(concat(f_u, f_v))
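A minimal sketch of the dual heads sharing the same input concat(f_u, f_v); the tiny two-layer heads below stand in for MLP_r and MLP_d, with random weights for illustration:

```python
import numpy as np

def mlp_head(x, W1, w2):
    """A tiny two-layer head (linear, ReLU, linear) standing in for MLP_r / MLP_d."""
    return float(np.maximum(x @ W1, 0.0) @ w2)

def dual_nmf_outputs(f_u, f_v, rec_head, risk_head):
    """Both heads read concat(f_u, f_v): one fits the completion counts
    R_{i,j}, the other fits the default counts D_{i,j}."""
    x = np.concatenate([f_u, f_v])
    return mlp_head(x, *rec_head), mlp_head(x, *risk_head)

rng = np.random.default_rng(2)
d = 8
f_u, f_v = rng.standard_normal(d), rng.standard_normal(d)
rec_head = (rng.standard_normal((2 * d, d)), rng.standard_normal(d))
risk_head = (rng.standard_normal((2 * d, d)), rng.standard_normal(d))
o_r, o_d = dual_nmf_outputs(f_u, f_v, rec_head, risk_head)
```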

4.3. Mixture Learning Target

As the recommended loan projects should not only be attractive to applicants but also have low default risk, we design a mixture loss L to optimize the parameters in CSRLoan. Formally, L consists of two parts. The first is the recommendation loss L_rec, which guides the model to recommend attractive loan projects.
L_rec = Σ_{(i,j) ∈ S_s} MSE(o_{i,j}^r, R_{i,j})
The second is the default risk loss L_risk, which guides the model to predict the default risk.
L_risk = Σ_{(i,j) ∈ S_d} MSE(o_{i,j}^d, D_{i,j})
Finally, the total loss can be formulated as follows.
L = L_rec + L_risk
All the parameters in CSRLoan can be updated in an end-to-end manner. We update the parameters via backpropagation with the Adam optimizer. Once the dual NMF model is well trained, we use a combination of the recommendation results and default risks to generate the recommended loan projects. When a new application <u, s> arrives, CSRLoan calculates the outputs o_u^r and o_u^d for all loan projects. Then, we combine them and select the loan projects with the highest scores as the recommendations:
score_u = o_u^r + α · o_u^d
where α is a hyperparameter balancing the contributions of the loan recommendation and default prediction outputs.
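The final ranking step can be sketched as follows, directly applying the scoring rule score_u = o_u^r + α · o_u^d from the text to a set of candidate projects (the head outputs here are toy values):

```python
import numpy as np

def rank_projects(o_rec, o_risk, alpha=0.5, top_k=3):
    """Combine the two head outputs for each candidate project and
    return the indices of the top-k scoring projects."""
    scores = np.asarray(o_rec) + alpha * np.asarray(o_risk)
    return [int(i) for i in np.argsort(-scores)[:top_k]]

# Toy outputs for five candidate projects.
o_rec = [0.9, 0.2, 0.8, 0.1, 0.5]
o_risk = [0.1, 0.0, 0.9, 0.2, 0.1]
top = rank_projects(o_rec, o_risk, alpha=0.5, top_k=2)
```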

5. Experiment

In this section, we conduct extensive experiments to demonstrate the effectiveness of CSRLoan. Our experimental evaluation is designed to answer several research questions (RQs).
  • RQ1: Does CSRLoan outperform other general recommendation methods and loan recommendation methods?
  • RQ2: What is the capability of the proposed pre-training techniques and dual NMF?
  • RQ3: What are the influences of different hyper-parameter settings?
Next, we introduce the experimental settings, experimental results, ablation studies, and hyper-parameter studies.

5.1. Experimental Settings

We first introduce the datasets, compared baselines, evaluation metrics, and parameter settings of our experiments. Then, we evaluate CSRLoan against other state-of-the-art algorithms.

5.1.1. Datasets

In our experiments, we use an open-source crowdfunding dataset, Kiva, to evaluate the performance of CSRLoan. Kiva is an online crowdfunding platform that extends financial services to poor and financially excluded people around the world. Kiva lenders have provided over USD 1 billion in loans to over two million people. To set investment priorities, help inform lenders, and understand their target communities, knowing the level of poverty of each applicant is critical. This dataset includes the loan data for the year 2014, containing 1,419,607 loans and 2,349,174 registered lenders across more than 100 countries. We overview the statistics of the top 10 countries in Figure 2. As shown, the Philippines has the most loan records, so we chose the Philippines to evaluate the performance of CSRLoan.
In our experiments, we keep the users with at least one approved loan to evaluate the performance of CSRLoan. After this procedure, we obtain about 850,000 users with over one million loans. For the cold start setting, we split the dataset by users. The statistics of the resulting dataset are provided in Table 2. Pre-training and finetuning are conducted on the training set; during finetuning, we choose the best model based on its performance on the validation set. Finally, the performances on the test set are reported in this paper.
To better understand CSRLoan, we provide some examples of user applications to showcase the used features. As illustrated in Table 3, the numerical features include age, income, and loan amount, and the category features include country, gender, and repayment. We can also observe that the statement of a loan is a short description of its purpose. Moreover, examples of loan projects are provided in Table 4, whose features also contain the two types of features.

5.1.2. Evaluation Metrics

We use two different metrics for performance evaluation, Hit Ratio (HR@K) and Normalized Discounted Cumulative Gain (NDCG@K). HR@K measures whether the target loan appears within the top K of the ranked list, while NDCG@K takes the position of the loan into account and penalizes the score if it is ranked lower in the list.
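A minimal sketch of how these two metrics can be computed for a single test loan; with one relevant item the ideal DCG is 1, so NDCG@K reduces to the reciprocal log-discount of the hit position:

```python
import numpy as np

def hr_at_k(ranked, target, k):
    """HR@K: 1 if the ground-truth loan appears in the top-K list, else 0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k):
    """NDCG@K for a single relevant item: discount the hit by its rank (log2)."""
    if target in ranked[:k]:
        rank = ranked.index(target)           # 0-based position in the list
        return 1.0 / np.log2(rank + 2)
    return 0.0

ranked = ["v3", "v1", "v5", "v2", "v4"]       # model's ranked candidate list
hr = hr_at_k(ranked, "v5", 5)                 # hit: "v5" is within the top 5
ndcg = ndcg_at_k(ranked, "v5", 5)             # discounted by its position (index 2)
```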

5.1.3. Compared Baselines

In this study, we compare CSRLoan with seven representative baselines. Because of the particular data, i.e., statement text, used in our problem, we only chose recommendation methods that can be easily extended to text information as baselines. Note that although many graph-neural-network-based methods [16,17] have been proposed recently, we do not compare CSRLoan with them due to the cold start setting. We roughly divide the compared baselines into three groups: general recommendation methods, loan recommendation methods, and ablations of CSRLoan.
General recommendation methods.
  • MF-BPR [37]: A Bayesian personalized ranking optimized MF model with a pairwise ranking loss. It is tailored toward recommendations with implicit feedback data.
  • CML [2]: A recently proposed algorithm that minimizes the distance between each user–loan interaction in Euclidean space.
Loan recommendation methods.
  • MBR [14]: A motivation-based recommendation method to utilize the unstructured data. This is the state-of-the-art method for loan recommendations.
  • FAR [3]: A fairness-aware recommendation method based on one-class collaborative-filtering techniques.
Ablations of CSRLoan.
  • NMF: Traditional NMF that removes the pre-training and default risk estimation module.
  • CSRLoan p r e : CSRLoan trained directly with a randomly initialized statement encoder, i.e., without pre-training.
  • CSRLoan r i s k : A variant of CSRLoan which removes the default risk estimation module.
Note that some of the compared methods are not designed for loan recommendation; we train all models on the same historical data and use the same testing data to evaluate their performance.

5.1.4. Parameter Settings

For CSRLoan, we set the dimensions of all embedding vectors to 32. In the recommendation process, the trade-off hyper-parameter α is set to 0.5. In CSRLoan, all MLPs are three-layer fully connected neural networks. For the transformer module, we use a two-layer transformer to moderate the number of parameters. The number of attention heads H is empirically set to 7. To train the parameters, we randomly mask 20% of the positions of the input statement and predict the masked words in the decoder. Moreover, all models are trained with Adam. The training batch size is set to 128. The maximum number of epochs is set to 50 for both pre-training and finetuning.

5.2. Experimental Results

The comparison results of CSRLoan are shown in Table 5. For better understanding, we also illustrate in Figure 3 and Figure 4 how the performance of CSRLoan and selected baselines changes as the number of recommended loan projects increases. Next, we analyze the results to answer RQ1.
As shown in Table 5, CSRLoan beats all compared baselines on all evaluation metrics consistently. For example, compared with the most competitive method, i.e., NMF, CSRLoan achieves about a 30% relative improvement on HR@5 (from 0.0841 to 0.1150). This result shows the superiority of the proposed modules and answers RQ1. Among the loan recommendation methods, MBR is the strongest baseline; although it models the semantic information of unstructured data, it is still inferior to CSRLoan. The reason is that MBR only captures the category-level motivation information of statements and cannot borrow knowledge from the statements' semantics. Across the groups of baselines, the loan recommendation methods outperform the general ones, which reflects the special characteristics of loan data.
As illustrated in Figure 3 and Figure 4, CSRLoan consistently outperforms the compared baselines. When the return number is N = 5, the performance gaps between different methods are relatively small. As N increases, the performances of all methods first increase and then become stable. From Figure 4, we can observe that CSRLoan significantly outperforms the most competitive baseline, i.e., MBR.

5.3. Ablation Studies

To answer RQ2, we compare the performance of CSRLoan with its ablations.
Contribution of the pre-training strategy. Comparing CSRLoan with CSRLoan p r e , the average relative improvement is above 10%. This is because our model captures the semantic information of statements through the pre-training procedure.
Contribution of default risk estimation. Comparing CSRLoan with CSRLoan r i s k , CSRLoan achieves better performance. The results indicate that bad loan projects with high default risk are filtered out by the prediction module, which results in higher performance than the other methods.

5.4. Hyper-Parameter Studies

To answer RQ3, in this subsection, we evaluate the influence of the hyper-parameter settings. Specifically, we analyze the impact of two key parameters of CSRLoan, i.e., the embedding dimension d and the trade-off factor α between the two losses in the joint objective.
Influence of embedding dimension d. We vary d from 2 to 128 in CSRLoan. The performance of loan recommendation is shown in Figure 5. As d increases, the performance first increases and then decreases. One potential reason is that the embeddings provide too little information when d is too small, while they can include more irrelevant information when d is too large.
Influence of the trade-off hyperparameter α. For α, we search values from 0.1 to 0.9. The results are shown in Figure 6. The changing pattern is quite similar to that of d, which indicates that both the pre-training strategy and the default risk prediction module are critical for the task.
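For illustration, one common way to combine two losses with a single trade-off factor is a convex combination; whether CSRLoan's joint objective uses exactly this form is an assumption on our part, though it is consistent with searching α in (0, 1).

```python
def mixture_loss(rec_loss, risk_loss, alpha):
    """Joint objective trading off the recommendation loss against the
    default-risk loss. The convex-combination form used here is an
    illustrative assumption, with alpha in (0, 1)."""
    return (1.0 - alpha) * rec_loss + alpha * risk_loss

# alpha = 0.5 weights both objectives equally.
loss = mixture_loss(0.8, 0.4, 0.5)  # → 0.6
```

Under this form, small α emphasizes recommendation quality and large α emphasizes risk prediction, which matches the U-shaped sensitivity pattern reported in Figure 6.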

6. Conclusions

In this paper, we presented a novel semantic-enhanced dual NMF-based model, namely CSRLoan, for the cold start loan recommendation problem. Different from existing loan recommendation methods, CSRLoan leverages the knowledge in applicants' statements and default records to enhance recommendation performance for new applicants. Specifically, we designed two key procedures in CSRLoan to achieve cold start recommendation. Firstly, a transformer module is used to model the semantic information of statements, together with a pre-training technique that initializes the parameters of the statement encoder. Secondly, we proposed a dual NMF model to capture information on both successful loans and default risk, and designed a mixture loss to optimize the parameters of CSRLoan. In this manner, we can recommend loans to new applicants based on their provided statements and personal information.
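To make the dual-head design concrete, the following sketch scores a user–loan pair with two small NMF-style MLP heads, one for matching and one for default risk; the shapes, random weights, and function names are purely illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension

# Shared user and loan embeddings; in CSRLoan these would come from the
# pre-trained statement encoder, here they are random for illustration.
user_emb = rng.normal(size=d)
loan_emb = rng.normal(size=d)

def mlp_head(x, w1, w2):
    """One-hidden-layer head with ReLU and a sigmoid output, in the
    style of neural matrix factorization variants."""
    return 1.0 / (1.0 + np.exp(-(np.maximum(w1 @ x, 0.0) @ w2)))

x = np.concatenate([user_emb, loan_emb])        # joint representation
w1_rec, w2_rec = rng.normal(size=(d, 2 * d)), rng.normal(size=d)
w1_risk, w2_risk = rng.normal(size=(d, 2 * d)), rng.normal(size=d)

match_prob = mlp_head(x, w1_rec, w2_rec)        # recommendation head
default_prob = mlp_head(x, w1_risk, w2_risk)    # default-risk head
```

The two heads share the same input representation but are trained on different signals, which is the essence of the dual objective described above.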
Moreover, we conducted extensive experiments to show the superiority of CSRLoan and to verify the effectiveness of the newly proposed techniques. Compared with the baselines, CSRLoan significantly outperforms existing loan recommendation methods under different evaluation metrics. According to the ablation study, the proposed statement encoder and dual NMF model substantially enhance the recommendation performance. In addition, we tested the hyperparameters of CSRLoan; the results indicate that CSRLoan is not sensitive to them and achieves good performance over a broad range of parameter settings.

Author Contributions

Conceptualization, K.Z. and S.W.; methodology, K.Z. and S.L.; software, K.Z.; validation, S.W. and S.L.; writing—original draft preparation, K.Z.; writing—review and editing, S.W.; visualization, K.Z.; supervision, S.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are available at https://www.kiva.org/build/data-snapshots, accessed on 8 October 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Rendle, S.; Freudenthaler, C.; Gantner, Z.; Schmidt-Thieme, L. BPR: Bayesian personalized ranking from implicit feedback. arXiv 2012, arXiv:1205.2618. [Google Scholar]
  2. Hsieh, C.; Yang, L.; Cui, Y.; Lin, T.; Belongie, S.J.; Estrin, D. Collaborative Metric Learning. In Proceedings of the WWW ’17: 26th International World Wide Web Conference, Perth, Australia, 3–7 April 2017; pp. 193–201. [Google Scholar]
  3. Lee, E.L.; Lou, J.K.; Chen, W.M.; Chen, Y.C.; Lin, S.D.; Chiang, Y.S.; Chen, K.T. Fairness-aware loan recommendation for microfinance services. In Proceedings of the 2014 International Conference on Social Computing, Beijing, China, 4–7 August 2014; pp. 1–4. [Google Scholar]
  4. Zhao, K.; Zhang, Y.; Yin, H.; Wang, J.; Zheng, K.; Zhou, X.; Xing, C. Discovering Subsequence Patterns for Next POI Recommendation. In Proceedings of the IJCAI, Yokohama, Japan, 7–15 January 2020; pp. 3216–3222. [Google Scholar]
  5. Liu, Y.; Ao, X.; Dong, L.; Zhang, C.; Wang, J.; He, Q. Spatiotemporal Activity Modeling via Hierarchical Cross-Modal Embedding. IEEE Trans. Knowl. Data Eng. 2022, 34, 462–474. [Google Scholar] [CrossRef]
  6. Tan, H.; Yao, D.; Bi, J. Deep Transfer Learning for Successive POI Recommendation. In Proceedings of the International Conference on Spatial Data and Intelligence, Hangzhou, China, 22–24 April 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 134–140. [Google Scholar]
  7. Yao, D.; Zhang, C.; Huang, J.; Bi, J. SERM: A Recurrent Model for Next Location Prediction in Semantic Trajectories. In Proceedings of the CIKM ’17: ACM Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 2411–2414. [Google Scholar]
  8. Yin, H.; Cui, B.; Zhou, X.; Wang, W.; Huang, Z.; Sadiq, S.W. Joint Modeling of User Check-in Behaviors for Real-time Point-of-Interest Recommendation. ACM Trans. Inf. Syst. 2016, 35, 11:1–11:44. [Google Scholar] [CrossRef]
  9. Li, D.; Gong, Z.; Zhang, D. A Common Topic Transfer Learning Model for Crossing City POI Recommendations. IEEE Trans. Cybern. 2019, 49, 4282–4295. [Google Scholar] [CrossRef]
  10. Wang, C.; Xiao, Z. A Deep Learning Approach for Credit Scoring Using Feature Embedded Transformer. Appl. Sci. 2022, 12, 10995. [Google Scholar] [CrossRef]
  11. Munkhdalai, L.; Ryu, K.H.; Namsrai, O.E.; Theera-Umpon, N. A partially interpretable adaptive Softmax regression for credit scoring. Appl. Sci. 2021, 11, 3227. [Google Scholar] [CrossRef]
  12. Petrides, G.; Moldovan, D.; Coenen, L.; Guns, T.; Verbeke, W. Cost-sensitive learning for profit-driven credit scoring. J. Oper. Res. Soc. 2022, 73, 338–350. [Google Scholar] [CrossRef]
  13. Liu, W.; Fan, H.; Xia, M. Credit scoring based on tree-enhanced gradient boosting decision trees. Expert Syst. Appl. 2022, 189, 116034. [Google Scholar] [CrossRef]
  14. Yan, J.; Wang, K.; Liu, Y.; Xu, K.; Kang, L.; Chen, X.; Zhu, H. Mining social lending motivations for loan project recommendations. Expert Syst. Appl. 2018, 111, 100–106. [Google Scholar] [CrossRef]
  15. Wang, X.; Zhang, D.; Zeng, X.; Wu, X. A Bayesian investment model for online P2P lending. In Frontiers in Internet Technologies; Springer: Berlin/Heidelberg, Germany, 2013; pp. 21–30. [Google Scholar]
  16. Liu, Y.; Ma, H.; Jiang, Y.; Li, Z. Modelling risk and return awareness for p2p lending recommendation with graph convolutional networks. Appl. Intell. 2022, 52, 4999–5014. [Google Scholar] [CrossRef]
  17. Zhang, H.; Zhao, H.; Liu, Q.; Xu, T.; Chen, E.; Huang, X. Finding potential lenders in P2P lending: A hybrid random walk approach. Inf. Sci. 2018, 432, 376–391. [Google Scholar] [CrossRef]
  18. Zhao, H.; Liu, Q.; Wang, G.; Ge, Y.; Chen, E. Portfolio selections in P2P lending: A multi-objective perspective. In Proceedings of the 22nd ACM SIGKDD International Conference On Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 2075–2084. [Google Scholar]
  19. Gope, J.; Jain, S.K. A survey on solving cold start problem in recommender systems. In Proceedings of the 2017 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 5–6 May 2017; pp. 133–138. [Google Scholar]
  20. Braunhofer, M.; Elahi, M.; Ge, M.; Ricci, F. Context dependent preference acquisition with personality-based active learning in mobile recommender systems. In Proceedings of the International Conference on Learning and Collaboration Technologies, Heraklion, Greece, 22–27 June 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 105–116. [Google Scholar]
  21. Sun, M.; Li, F.; Lee, J.; Zhou, K.; Lebanon, G.; Zha, H. Learning multiple-question decision trees for cold-start recommendation. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, Rome, Italy, 4–8 February 2013; pp. 445–454. [Google Scholar]
  22. Rubens, N.; Elahi, M.; Sugiyama, M.; Kaplan, D. Active learning in recommender systems. In Recommender Systems Handbook; Springer: Berlin/Heidelberg, Germany, 2015; pp. 809–846. [Google Scholar]
  23. Elahi, M.; Ricci, F.; Rubens, N. Active learning in collaborative filtering recommender systems. In Proceedings of the International Conference on Electronic Commerce and Web Technologies, Munich, Germany, 1–4 September 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 113–124. [Google Scholar]
  24. Bobadilla, J.; Ortega, F.; Hernando, A.; Bernal, J. A collaborative filtering approach to mitigate the new user cold start problem. Knowl. Based Syst. 2012, 26, 225–238. [Google Scholar] [CrossRef] [Green Version]
  25. Shapira, B.; Rokach, L.; Freilikhman, S. Facebook single and cross domain data for recommendation systems. User Model. User Adapt. Interact. 2013, 23, 211–247. [Google Scholar] [CrossRef]
  26. Lika, B.; Kolomvatsos, K.; Hadjiefthymiades, S. Facing the cold start problem in recommender systems. Expert Syst. Appl. 2014, 41, 2065–2073. [Google Scholar] [CrossRef]
  27. Zhang, M.; Tang, J.; Zhang, X.; Xue, X. Addressing cold start in recommender systems: A semi-supervised co-training algorithm. In Proceedings of the 37th International ACM SIGIR Conference on Research & Development in Information Retrieval, Gold Coast, QLD, Australia, 6–11 July 2014; pp. 73–82. [Google Scholar]
  28. Tan, H.; Yao, D.; Huang, T.; Wang, B.; Jing, Q.; Bi, J. Meta-Learning Enhanced Neural ODE for Citywide Next POI Recommendation. In Proceedings of the 2021 22nd IEEE International Conference on Mobile Data Management (MDM), Toronto, ON, Canada, 15–18 June 2021; pp. 89–98. [Google Scholar] [CrossRef]
  29. Liu, S.; Ounis, I.; Macdonald, C.; Meng, Z. A heterogeneous graph neural model for cold-start recommendation. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual, 25–30 July 2020; pp. 2029–2032. [Google Scholar]
  30. Han, P.; Wang, J.; Yao, D.; Shang, S.; Zhang, X. A Graph-based Approach for Trajectory Similarity Computation in Spatial Networks. In Proceedings of the KDD, Virtual, 14–18 August 2021; pp. 556–564. [Google Scholar]
  31. Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543. [Google Scholar]
  32. He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 173–182. [Google Scholar]
  33. Guthrie, D.; Allison, B.; Liu, W.; Guthrie, L.; Wilks, Y. A closer look at skip-gram modelling. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy, 22–28 May 2006; Volume 6, pp. 1222–1225. [Google Scholar]
  34. Floridi, L.; Chiriatti, M. GPT-3: Its nature, scope, limits, and consequences. Minds Mach. 2020, 30, 681–694. [Google Scholar] [CrossRef]
  35. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  36. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  37. Rendle, S.; Freudenthaler, C.; Gantner, Z.; Schmidt-Thieme, L. BPR: Bayesian Personalized Ranking from Implicit Feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009), Montreal, QC, Canada, 18–21 June 2009; AUAI Press: Vancouver, BC, Canada, 2009; pp. 452–461. [Google Scholar]
Figure 1. The architecture of CSRLoan.
Figure 2. Statistics of loans across different countries.
Figure 3. HR@N of loan recommendation where N ranges from {1, 5, 10, 15, …, 45, 50}.
Figure 4. NDCG@N of loan recommendation where N ranges from {1, 5, 10, 15, …, 45, 50}.
Figure 5. HR@5 with respect to embedding dimension d.
Figure 6. HR@5 with respect to the value of α .
Table 1. Notations used in this paper.
Notations | Descriptions
u, v, s | Applicant user, loan project, and statement
U, V | User set and loan project set
r = (u, v, s) | A historical record indicating that user u applied for loan v with statement s
T_u | Historical loan records of user u
D | Historical data of all loan records
M | Word embedding matrix
d | Embedding dimension
Θ | All parameters in CSRLoan
Table 2. Statistical information of the dataset.
Dataset | # of Samples | # of Users | # of Default Samples | Average Length of s
Train | 634,144 | 501,813 | 50,900 | 13.7
Validation | 53,703 | 48,083 | 3833 | 11.6
Test | 388,405 | 302,401 | 6210 | 12.8
Table 3. Examples of the user applications.
ID | Amount | Categories | Statement | Country | Region | Ages | Income | Gender | Repayment | Activity Label
1 | 575 | Transportation | to repair and maintain the auto rickshaw used in their business | Pakistan | Lahore | 30 | 14,000 | female | irregular | default
2 | 150 | Transportation | to repair their old cycle-van and buy another one to rent out | India | Maynaguri | 22 | 6000 | female | bullet | completed
3 | 200 | Arts | to purchase an embroidery machine and new materials | Pakistan | Lahore | 25 | 8000 | female | irregular | completed
4 | 250 | Services | purchase leather for my business using ksh 20000 | Kenya | | 23 | 6000 | female | irregular | completed
5 | 200 | Agriculture | to purchase a dairy cow and start a milk products business | India | Maynaguri | 25 | 8000 | male | bullet | overdue
6 | 400 | Services | to buy more hair and skin care products | Pakistan | Ellahabad | 30 | 8000 | female | monthly | completed
7 | 475 | Manufacturing | to purchase leather, plastic soles and heels in different sizes | Pakistan | Lahore | 46 | 19,000 | female | monthly | completed
8 | 625 | Food | to buy a stall, gram flour, ketchup, and coal for selling ladoo | Pakistan | Lahore | 35 | 24,000 | male | irregular | default
Table 4. Examples of the loan projects.
ID | Loan Theme | Require Partner | Duration | Amount
638631 | General | Yes | 2 years | 8000
640322 | General | Yes | 0.5 years | 12,000
641006 | Higher Education | Yes | 3 years | 40,000
641019 | Higher Education | No | 2 years | 2000
641594 | Subsistence Agriculture | Yes | 2 years | 10,000
642256 | Extreme Poverty | Yes | 1 year | 20,000
Table 5. Performance comparison of different methods in terms of HR@5, NDCG@5, HR@10, and NDCG@10. The bold values indicate the best performances.
Type | Method | HR@5 | NDCG@5 | HR@10 | NDCG@10
General | MF-BPR | 0.02134 | 0.01841 | 0.05263 | 0.03991
General | CML | 0.04630 | 0.03877 | 0.1518 | 0.1247
Loan Rec | FAR | 0.04261 | 0.02974 | 0.0993 | 0.04971
Loan Rec | MBR | 0.05216 | 0.03234 | 0.1377 | 0.0860
Ablations | NMF | 0.0841 | 0.0710 | 0.2432 | 0.1221
Ablations | CSRLoan_pre | 0.0922 | 0.07082 | 0.2310 | 0.1778
Ablations | CSRLoan_risk | 0.1021 | 0.08113 | 0.2530 | 0.1809
Our | CSRLoan | 0.1150 | 0.0970 | 0.2233 | 0.1928
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
