Do Fintech Lenders Align Pricing with Risk? Evidence from a Model-Based Assessment of Conforming Mortgages

Liu, Zilong; Liang, Hongyan

doi:10.3390/fintech4020023

by Zilong Liu and Hongyan Liang *

Gies College of Business, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA

* Author to whom correspondence should be addressed.

FinTech 2025, 4(2), 23; https://doi.org/10.3390/fintech4020023

Submission received: 23 April 2025 / Revised: 2 June 2025 / Accepted: 4 June 2025 / Published: 9 June 2025

(This article belongs to the Special Issue Trends and New Developments in FinTech)

Abstract

This paper assesses whether fintech mortgage lenders align pricing with borrower risk using conforming 30-year mortgages (2012–2020). We estimate default probabilities using machine learning (logit, random forest, gradient boosting, LightGBM, XGBoost), finding that non-fintech lenders achieve the highest predictive accuracy (AUC = 0.860), followed closely by banks (0.857), with fintech lenders trailing (0.852). In pricing analysis, banks adjust the origination rates most sharply with borrower risk (7.20 basis points per percentage-point increase in default probability) compared to fintech (4.18 bp) and non-fintech lenders (5.43 bp). Fintechs underprice 32% of high-risk loans, highlighting limited incentive alignment under GSE securitization structures. Expanding the allowable alternative data and modest risk-retention policies could enhance fintechs’ analytical effectiveness in mortgage markets.

Keywords:

fintech mortgage lending; risk-based pricing; default prediction; machine learning; credit risk modeling

JEL Classification:

G21; G23; G51; G55

1. Introduction

Digital lending platforms promise to replace the frictions of mortgage origination with instant approvals, data-rich underwriting and finely tuned prices. In segments such as unsecured consumer credit, those promises appear to hold: by ingesting alternative variables—from cash-flow traces to digital footprints—fintech algorithms have outperformed legacy scorecards and expanded credit access without raising loss rates. Yet, the largest slice of U.S. housing finance—the conforming mortgage market dominated by Fannie Mae and Freddie Mac—operates inside a markedly different institutional shell. Every loan must pass a government-sponsored enterprise (GSE) scorecard that fixes both the information set and the modeling approach, and most credit risk is transferred to investors within weeks via agency mortgage-backed securities. Whether a lender is a community bank, a nationwide originator or an app-based fintech, the same standardized inputs feed the same automated underwriting engines, and the originating institution rarely bears long-run losses.

In the U.S. conforming mortgage market, banks refer to traditional financial institutions operating under strict regulatory oversight, holding a bank charter and typically maintaining physical branch networks. In contrast, fintech lenders are predominantly online platforms leveraging digital technology to streamline mortgage origination, enhance user experience and accelerate loan approval processes [1,2]. Recent studies have highlighted significant differences between fintech lenders and traditional banks regarding risk assessment capabilities, pricing strategies, and borrower targeting [3,4]. This study specifically investigates how these two lender types differ in aligning pricing with borrower risk within the regulated framework of U.S. government-sponsored enterprises (GSEs). It addresses two questions. First, how accurately can lenders discriminate between borrowers who will default and those who will not when all parties share the same mandatory data? Second, once risk is measured, how tightly do lenders map it into interest rates, and does that mapping differ between fintechs and incumbent banks? Answering these questions requires separating screening (the act of predicting default) from pricing (the act of converting that prediction into an interest rate). Prior work often conflates the two, using the origination APR itself as both the lender’s risk signal and its price. Here, we deploy a two-stage empirical framework that disentangles them: machine-learning models trained within each lender class generate out-of-sample default probabilities, and a pooled benchmark translates those probabilities into a fair pricing curve against which actual rates can be judged.

Leveraging 30-year fixed-rate mortgages originated from 2012 through 2020, we find that non-fintech lenders post the highest screening accuracy (average AUC ≈ 0.860), with banks following closely (≈0.857) and fintechs lagging (≈0.852) despite using the same gradient-boosting algorithms. More strikingly, banks display the steepest rate-risk slope, about 7.2 basis points for every one percentage-point increase in predicted default probability, and the narrowest distribution of mispricing residuals. Fintech lenders, in contrast, exhibit a slope that is roughly 40 percent flatter (about 4.2 basis points) and underprice nearly one-third of the riskiest loans relative to the benchmark. The evidence suggests that technological sophistication alone cannot overcome two structural frictions: the information ceiling imposed by GSE scorecards and the weak incentives that arise when credit risk is swiftly securitized.

These findings matter for both market design and consumer welfare. If alternative-data fields were cautiously integrated into agency underwriting engines, the ceiling on predictive accuracy might rise for all lenders, allowing genuine analytics advantages to surface. Conversely, modest risk-retention requirements could sharpen pricing discipline by forcing originators—fintech and bank alike—to internalize a sliver of future losses. Until such reforms take hold, fintechs’ celebrated algorithms are likely to remain blunted in the very market that shapes the majority of U.S. household leverage.

2. Literature Review

Fintech’s rapid rise in unsecured consumer lending is frequently attributed to its proficiency in combining alternative data with machine-learning (ML) models, thus enhancing default prediction accuracy and expanding credit access. Specifically, incorporating digital footprint variables significantly lowers both rejection rates and subsequent delinquency rates compared to traditional FICO-based screening methods [5]. Similarly, incorporating even basic online behavioral signals yields considerable improvements in predictive accuracy, as measured by area under the curve (AUC) [6]. Complementing this evidence, fintech lenders have been shown to offer more finely granulated pricing of unsecured personal loans compared to traditional banks, suggesting advanced risk differentiation capabilities [4].

However, despite these technological advancements, mortgage markets impose institutional frictions that restrict fintech’s full potential. Within the conforming mortgage segment, proprietary data usage by fintech lenders is constrained by standardized underwriting systems like Fannie Mae’s Desktop Underwriter and Freddie Mac’s Loan Product Advisor [7,8]. Additionally, the rapid securitization of mortgages tends to dilute incentives for lenders to invest in granular screening processes, as documented by Keys et al. and Acharya et al. [9,10].

Empirical evidence regarding fintech performance in mortgage lending is somewhat mixed. Some studies indicate efficient mortgage processing without increased default rates [8]. Conversely, others highlight fintech lenders’ coarse pricing strategies and note cross-subsidization within loan portfolios, suggesting limitations in fintech’s risk assessment capabilities [1,3]. Furthermore, there is evidence of inclusivity shortcomings, particularly affecting women, due to gaps in product suitability, transparency and trust, thereby underscoring the importance of fairness and transparency in algorithmic design [11].

Algorithmic bias represents a critical challenge across all algorithm-driven lending systems, including both fintech platforms and traditional banks. Such biases are attributed primarily to skewed or incomplete training data rather than deliberate programmer intent [12]. This concern aligns with findings from studies of big tech lending in China, where algorithmic models outperform traditional credit assessments during economic downturns, partly due to structural features like high interest rates and short maturities that mitigate risk in unsecured lending [13,14].

Research exploring fintech’s technological edge in risk pricing further elaborates on its potential and constraints. Some observe that fintech institutions are more responsive than traditional banks to expansions in credit markets [15], while others discuss reintermediation trends driven by fintech, impacting lending dynamics and borrower experiences [16]. There are also insights into the complementary roles fintech platforms and traditional banks play in lending markets, highlighting the distinct strengths and limitations of each [17]. Additionally, deep-learning methods have demonstrated significant potential in improving mortgage risk prediction, reinforcing fintech’s position at the methodological forefront [18]. Complementing these methodological advancements, recent studies propose frameworks to assess fairness in algorithmic credit scoring, addressing crucial ethical and regulatory concerns [19].

Further complexities arise from regulatory environments. Some authors illustrate how fintech lenders exploit regulatory arbitrage opportunities, reshaping mortgage market dynamics and introducing competitive pressures alongside systemic risks [20]. Complementing these observations, others detail how securitization processes negatively impact distressed loan renegotiations, indicating structural barriers that fintech firms must navigate [21].

Recent literature also focuses on distinguishing predictive accuracy from pricing effectiveness. Some validate the predictive effectiveness of machine-learning models for consumer credit risk [22]. Others examine information asymmetries prevalent in peer-to-peer lending markets [23]. Additional studies investigate the role of machine learning in exacerbating or alleviating disparities within credit markets [2], while others explicitly discuss discriminatory practices and pricing inefficiencies observed among fintech lenders [1]. Further research emphasizes the importance of clearly separating default prediction accuracy from pricing strategies to enhance analytical clarity [6].

Against this backdrop, the current study adopts a rigorous benchmarking approach, evaluating lender-specific screening accuracy using multiple ML classifiers. Additionally, it introduces a pooled risk benchmark to independently assess potential mispricing in mortgage lending. This dual-pronged methodological strategy aims to clarify fintech’s specific limitations—whether rooted in predictive accuracy, pricing inefficiencies or a combination of both—in the standardized and regulated conforming mortgage market.

3. Data and Sample

The dataset comprises conforming 30-year fixed-rate mortgages originated from Q1 2012 to Q1 2020, the most recent fully available public data provided by Fannie Mae and Freddie Mac at the time of our analysis (the Fannie public data can be accessed here: https://capitalmarkets.fanniemae.com/credit-risk-transfer/single-family-credit-risk-transfer/fannie-mae-single-family-loan-performance-data, and the Freddie public data can be accessed here: http://www.freddiemac.com/research/datasets/sf_loanlevel_dataset.page accessed on 16 June 2022). We restrict the sample to originations before the widespread implementation of COVID-19 forbearance programs, which began in mid-2020 and significantly disrupted default observability by suspending delinquency reporting. This cutoff ensures consistency in measuring loan performance and maintains the validity of our default risk modeling. We exclude loans with LTVs outside the agencies’ lending grid, as those high LTV loans are non-standard loans (the loan eligibility matrix can be accessed at https://singlefamily.fanniemae.com/media/20786/display (Fannie); http://www.freddiemac.com/singlefamily/factsheets/sell/ltv_tltv.htm (Freddie) accessed on 16 June 2022). After excluding loans with missing covariates and non-standard LTVs, the analysis sample comprises 6.3 million observations. The summary statistics are presented in Table 1, and detailed variable definitions are presented in Appendix A.

Lenders were classified based on their regulatory identifiers and operational characteristics. Specifically, banks have a bank charter and a Research, Statistics, Supervision and Discount (RSSD) identifier issued by the Federal Reserve, while fintech lenders are identified as non-bank entities whose mortgage origination processes are predominantly completed online, following previous studies [8,20,24,25]. The analysis employs borrower FICO scores, which represent widely recognized creditworthiness measures in the U.S. A FICO score is a three-digit number, typically ranging from 300 to 850, calculated based on an individual’s credit history—including payment history, amounts owed, length of credit history, new credit and credit mix—and is widely used by lenders to assess the likelihood of timely loan repayment.

Panel A of Table 1 presents key origination characteristics of conforming 30-year fixed-rate GSE mortgages. Fintech lenders tend to originate loans with slightly lower FICO scores and loan-to-value (LTV) ratios compared to banks and non-fintechs. Fintech loans have a higher proportion of refinances (70%) and lower first-time homebuyer rates. Banks lend in areas with higher unemployment rates and lower real income, while fintechs originate in higher-income, lower-unemployment metros.

Panel B of Table 1 summarizes loan performance across groups. Fintech loans exhibit lower delinquency rates at 12 and 24 months compared to non-fintechs but slightly higher than banks. Notably, fintechs demonstrate higher prepayment rates across all horizons, consistent with more tech-savvy borrowers or aggressive rate-shopping behavior.

These patterns suggest that fintech lenders target a different risk-return profile, possibly emphasizing refinance opportunities and faster prepayment cycles, while maintaining comparable or lower default risk relative to traditional lenders.

4. Methodology

We employ a two-stage empirical framework that separates default prediction (screening) from interest rate setting (pricing). This separation allows clear identification of how lenders translate borrower risk into pricing decisions, an advantage over approaches that conflate these tasks [1]. To ensure model robustness and mitigate overfitting, the hyperparameters for each model were tuned through 3-fold cross-validation within the training data. The use of out-of-sample area under the ROC curve (AUC) scores further validates model generalizability. While the primary evaluation metric is AUC, we acknowledge precision–recall curves as another valuable evaluation tool, particularly useful in future extensions of this research.

4.1. Stage One: Lender-Specific Default Prediction and AUC Evaluation

Let the binary variable $Y_{i}$ indicate whether loan i becomes 90 or more days delinquent within 36 months of origination. We denote by $x_{i}$ the vector of borrower- and loan-level features available at origination (e.g., FICO score, loan-to-value ratio, debt-to-income ratio).

For each lender group $g \in \{\text{Bank}, \text{Non-Fintech}, \text{Fintech}\}$, we split the data into training (70%) and test (30%) subsets using stratified random sampling. We then estimate the probability of default using five different machine-learning models: logistic regression (logit), random forest (RF), LightGBM (LGBM), XGBoost (XGB) and gradient-boosting classifier (GBC).
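The stratified 70/30 split described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' code: the function name `stratified_split`, the seed, and the toy default rate of 5% are all assumptions made for the example. In practice the same split would be repeated within each lender group.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_share=0.7, seed=0):
    """Return (train_idx, test_idx) so that each class keeps its share in both subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)                        # random assignment within each class
        cut = round(train_share * len(idxs))     # 70% of this class goes to training
        train_idx.extend(idxs[:cut])
        test_idx.extend(idxs[cut:])
    return sorted(train_idx), sorted(test_idx)

# Toy data (hypothetical): 1000 loans, 50 of which default
labels = [1] * 50 + [0] * 950
train_idx, test_idx = stratified_split(labels)
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

Stratifying matters here because defaults are rare: a plain random split could leave the test set with too few defaulted loans for a stable AUC estimate.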

We select these models to balance interpretability, robustness and predictive power. Logistic regression serves as a benchmark due to its simplicity and transparency. Tree-based models are included due to their ability to capture non-linear relationships and interactions among features, which are common in mortgage risk prediction. Gradient-boosting variants (GBC, XGBoost, LightGBM) are especially suited for handling imbalanced classification problems and high-dimensional tabular data. The models employed in this study are widely used in credit and mortgage risk modeling. Their effectiveness in accurately predicting mortgage defaults has been demonstrated in prior research [13,18,24], supporting their application in our comparative analysis.

Each model is trained separately for each lender group to allow for group-specific patterns in default behavior. To ensure fair comparison and optimal performance, we tune the hyperparameters for each model using 3-fold cross-validation within the training data. The hyperparameters are selected based on the highest cross-validated AUC score. For example, the regularization strength C is tuned for logistic regression, while the number of estimators, learning rate and maximum tree depth are tuned for tree-based models. Technical descriptions of all models are provided in Appendix B, while the full list of candidate hyperparameter values and the selected configurations by lender group are reported in Appendix C.

For observation i in group g, the predicted probability of default is denoted by

$\hat{p}_{i,g} = \hat{f}_{g}(x_{i}),$

where $\hat{p}_{i,g}$ is the predicted probability that loan i defaults, while $\hat{f}_{g}$ represents the machine-learning model trained and tuned specifically for lender group g.

To measure predictive accuracy, we rely on the area under the ROC curve (AUC). Although the AUC lacks a single closed-form expression for binary classifiers, it can be interpreted as the probability that a randomly chosen “positive” (defaulted) loan receives a higher predicted default probability $\hat{p}_{i,g}$ than a randomly chosen “negative” (non-defaulted) loan. We evaluate each of the five models on the test set and select the best-performing model (the one with the highest AUC) for subsequent analysis within each lender group. Finally, we compute the chosen model’s out-of-sample AUC. A higher $AUC_{g}$ indicates stronger discriminative power in identifying high-risk borrowers for lender group g.
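The ranking interpretation of the AUC can be checked directly by comparing every defaulted loan with every non-defaulted loan. The sketch below is illustrative only: the function name `pairwise_auc` and the six toy loans are assumptions, and ties are counted as one half, which is the usual convention.

```python
def pairwise_auc(y_true, scores):
    """Share of (default, non-default) pairs in which the default has the higher score."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one default and one non-default")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical predicted default probabilities)
y = [0, 0, 1, 0, 1, 1]
p_hat = [0.02, 0.10, 0.08, 0.05, 0.30, 0.12]
auc = pairwise_auc(y, p_hat)
print(round(auc, 4))  # 8 of the 9 pairs are ranked correctly
```

The double loop is quadratic in the number of loans, so for samples of millions of loans a rank-based implementation from a standard library would be used instead; the value it returns is the same quantity.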

4.2. Pooled Model and Predicted Risk Scores

Although we obtain separate predictions $\hat{p}_{i,g}$ by training on each lender group separately, we also want a uniform risk benchmark that does not depend on the lender’s own pricing or underwriting. To do this, we train a single best-performing “pooled” model on all loans from all lenders, denoted by $\hat{f}^{pooled}$. This model likewise uses only borrower- and loan-level features available at origination (excluding interest rates to preserve exogeneity). Formally,

$\hat{p}_{i}^{pooled} = \hat{f}^{pooled}(x_{i}).$

This pooled measure $\hat{p}_{i}^{pooled}$ is our baseline estimate of each borrower’s default risk, unaffected by which lender originated the loan. The pooled model is trained on 70% of the combined dataset and then used to predict default probabilities for the full sample. We implement $\hat{f}^{pooled}$ using LightGBM, the best-performing machine-learning algorithm identified in the group-specific training stage.

4.3. Pricing Alignment Analysis

To test whether lenders set interest rates in proportion to exogenous risk, we compare the actual loan interest rate $r_{i}$ to the pooled default probability $\hat{p}_{i}^{pooled}$. We employ two complementary approaches:

(i) We sort loans into deciles by $\hat{p}_{i}^{pooled}$. Within each decile $d \in \{1, \dots, 10\}$, we calculate the average predicted default probability $\bar{p}_{d}$ and the average interest rate $\bar{r}_{d}$. Plotting $\bar{r}_{d}$ against $\bar{p}_{d}$ for banks, non-fintech non-banks and fintechs reveals how steeply rates rise as risk increases; a steeper, more linear curve indicates tighter risk-based pricing, whereas a flatter curve signals weaker sensitivity.

(ii) We estimate a lender-group-specific linear regression of the form

$r_{i} = \alpha + \beta \hat{p}_{i}^{pooled} + \varepsilon_{i}.$

The slope $\beta$ captures the marginal change in the interest rate associated with a one-unit increase in predicted default probability. A larger $\beta$ reflects stronger risk-based pricing. Comparing $\beta$ across groups therefore shows which lenders adjust rates most sharply in response to borrower risk.
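Both approaches can be sketched for a single lender group with the standard library alone. Everything in the example is hypothetical: the helper names, the 100 synthetic loans, and the choice to express rates in basis points and default probabilities in percentage points so that the slope reads as basis points per percentage point. The synthetic rates are built with a true slope of 7 plus a small alternating disturbance.

```python
def ols_intercept_slope(x, y):
    """Closed-form simple OLS: returns (alpha, beta) for y = alpha + beta * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def decile_means(p_hat, rates, n_bins=10):
    """Sort loans by predicted risk and return (mean risk, mean rate) per bin."""
    order = sorted(range(len(p_hat)), key=lambda i: p_hat[i])
    curve = []
    for d in range(n_bins):
        idx = order[d * len(order) // n_bins:(d + 1) * len(order) // n_bins]
        curve.append((sum(p_hat[i] for i in idx) / len(idx),
                      sum(rates[i] for i in idx) / len(idx)))
    return curve

# Synthetic loans for one lender group (illustrative numbers only)
p_hat = [0.1 * k for k in range(1, 101)]                   # predicted PD, percentage points
rates = [400.0 + 7.0 * p + (1.0 if k % 2 == 0 else -1.0)   # rate in basis points
         for k, p in enumerate(p_hat, start=1)]

curve = decile_means(p_hat, rates)          # points for the decile plot
alpha, beta = ols_intercept_slope(p_hat, rates)
print(round(beta, 3))                       # close to the true slope of 7
```

Running the same two functions separately on each lender group's loans gives one curve and one slope per group, which is the comparison the text describes.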

4.4. Mispricing Residual Analysis

Even if average rates rise with default probability, individual loans may be over- or underpriced relative to a benchmark pricing curve. To quantify this, we take the pooled default probability

$\hat{p}_{i}^{pooled} = \hat{f}^{pooled}(x_{i})$

and regress actual interest rates on it using all loans in a pooled regression (across all lenders). We define the fitted rate for loan i as

$\hat{r}_{i} = \gamma_{0} + \gamma_{1} \hat{p}_{i}^{pooled},$

and then compute the residual (mispricing) for each loan

$\epsilon_{i} = r_{i} - \hat{r}_{i}.$

A negative $\epsilon_{i}$ means that the loan is underpriced (the lender charged a lower rate than the risk-based benchmark), while a positive $\epsilon_{i}$ implies that the loan is overpriced. Aggregating these residuals by lender group reveals whether fintechs, banks or non-fintech lenders systematically deviate from risk-based prices.
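The residual calculation can be illustrated on a handful of loans. This is a sketch under stated assumptions rather than the procedure behind Table 4: the six loans are invented, rates are in basis points and default probabilities in percentage points, and the tolerance band `BAND_BP` used to label a loan as under- or overpriced is hypothetical, since the text does not state the cutoff.

```python
from collections import defaultdict

def fit_line(x, y):
    """Closed-form simple OLS: returns (gamma0, gamma1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    gamma1 = sxy / sxx
    return my - gamma1 * mx, gamma1

# (lender group, pooled PD in percentage points, actual rate in bp) -- toy data
loans = [
    ("bank", 1.0, 410.0), ("bank", 2.0, 418.0), ("bank", 3.0, 424.0),
    ("fintech", 1.0, 402.0), ("fintech", 2.0, 404.0), ("fintech", 3.0, 408.0),
]
BAND_BP = 5.0  # hypothetical tolerance; residuals inside the band count as fairly priced

# Benchmark pricing curve estimated on all loans pooled across lenders
gamma0, gamma1 = fit_line([p for _, p, _ in loans], [r for _, _, r in loans])

# Mispricing residual: actual rate minus benchmark rate, collected by lender group
residuals = defaultdict(list)
for group, p, r in loans:
    residuals[group].append(r - (gamma0 + gamma1 * p))

summary = {}
for group, eps in residuals.items():
    summary[group] = {
        "mean_bp": sum(eps) / len(eps),
        "share_under": sum(e < -BAND_BP for e in eps) / len(eps),
        "share_over": sum(e > BAND_BP for e in eps) / len(eps),
    }
print(gamma0, gamma1, summary)
```

Because the benchmark is fitted on the pooled sample, residuals average to zero across all loans, so a positive group mean for one lender type is necessarily offset by negative means elsewhere.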

5. Results

5.1. Screening Accuracy

Figure 1 and Table 2 report the out-of-sample AUCs of five machine-learning classifiers—logistic regression (logit), random forest (RF), LightGBM (LGBM), XGBoost (XGB) and gradient-boosting classifier (GBC)—estimated separately for banks, non-fintech non-banks and fintech lenders.

Across all lender types, LightGBM consistently posts the highest—or statistically indistinguishable second-highest—AUC, confirming its superior ability to separate defaulters from non-defaulters. Averaging over the five models, non-fintech lenders attain the best overall predictive accuracy (mean AUC ≈ 0.860), banks follow closely (≈0.857), and fintechs lag (≈0.852). Performance within non-fintech and fintech groups is remarkably stable across the three gradient-boosting methods (LGBM, XGB, GBC), whereas logistic regression and random forest score a few hundredths lower. Banks show the greatest spread, with a distinct dip for random forest and a recovery under boosting algorithms.

Taken together, the evidence highlights tree-based boosting—especially LightGBM—as the most reliable modeling choice for high-dimensional mortgage credit data across all lender categories.

5.2. Pricing Alignment

Next, we examine whether interest rates align with the pooled model’s default probability $\hat{p}_{i}^{pooled}$, using the decile plot and the linear regression of interest rate on $\hat{p}_{i}^{pooled}$ described in Section 4.3.

Figure 2 plots the average origination rate against the average predicted default probability $\hat{p}_{i}^{pooled}$ across deciles of estimated risk, separately for banks, non-fintech non-banks and fintech lenders. Each point represents the mean interest rate and mean predicted risk within a decile. The upward-sloping curves reflect positive pricing alignment: higher-risk borrowers are charged higher rates. However, the steepness and level of each curve differ across lender types. Banks charge the highest interest rates in every decile, followed by fintech and then non-fintech lenders, and they also exhibit the steepest pricing gradient, suggesting stronger risk-based pricing. In contrast, non-fintech and fintech lenders show flatter slopes, indicating weaker sensitivity of pricing to estimated borrower risk.

Table 3 presents the results from lender-type-specific OLS regressions of interest rates on the predicted default probability ($\hat{PD}$). All coefficients on $\hat{PD}$ are statistically significant at the 1% level, confirming that lenders positively adjust pricing in response to borrower risk. Among the three groups, banks exhibit the steepest pricing sensitivity, with a coefficient of 7.19, indicating a strong alignment between risk and rate. Non-fintech lenders also show meaningful alignment (5.43), while fintechs apply the shallowest pricing slope (4.18). These results suggest that banks price risk more aggressively, whereas fintech lenders exhibit relatively weaker sensitivity to borrower default risk, despite operating with modern algorithms. Notably, the R-squared values remain low across all regressions (4–5%), consistent with the fact that much of the interest rate variation is driven by factors beyond the modeled credit risk, including competition, borrower characteristics not captured in $\hat{PD}$ and loan features.
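To make the units concrete, the Table 3 slopes can be read through a short worked example. It assumes, as the abstract states, that each coefficient is in basis points per percentage point of predicted default probability; the 5-percentage-point risk gap between two borrowers is purely illustrative.

```latex
\begin{align*}
\Delta r_{\text{bank}}        &= 7.19 \times 5 = 35.95 \text{ bp}, \\
\Delta r_{\text{non-fintech}} &= 5.43 \times 5 = 27.15 \text{ bp}, \\
\Delta r_{\text{fintech}}     &= 4.18 \times 5 = 20.90 \text{ bp}, \\
1 - \frac{4.18}{7.19}         &\approx 0.42 .
\end{align*}
```

On these numbers, the riskier borrower pays about 15 bp less of a risk premium at a fintech lender than at a bank, and the fintech slope is roughly 42 percent flatter than the bank slope.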

5.3. Mispricing Summary

Using the fitted pricing curve as a benchmark, we compute the mean residuals and the share of loans classified as under- or overpriced.

Table 4 reports summary statistics on interest rate mispricing, calculated as the difference between the actual interest rate and the fair rate predicted by the pooled PD-based pricing model. Banks exhibit a small positive average mispricing of +4.6 basis points, indicating a slight tendency to overcharge relative to risk. In contrast, both fintech and non-fintech lenders show negative average mispricing (–8.2 and –15.1 basis points, respectively), suggesting systematic underpricing of riskier borrowers. The share of underpriced loans is highest among fintech lenders (32.02%), followed closely by non-fintechs (29.99%), while their shares of overpriced loans remain relatively low (13.5% and 15.37%, respectively). These results imply that banks adhere more closely to risk-based pricing, while alternative lenders—especially non-fintechs—tend to offer below-benchmark rates to higher-risk borrowers, potentially reflecting either competitive strategies or less precise risk-pricing mechanisms.

6. Discussion

Three forces jointly explain fintech lenders’ muted alignment of price and risk in conforming mortgages. First, regulatory constraints limit informational flexibility. Specifically, every conforming mortgage loan must clear the standardized underwriting systems—Fannie Mae’s Desktop Underwriter (DU) or Freddie Mac’s Loan Product Advisor (LPA). These automated platforms employ strict and uniform “scorecards” that evaluate borrower risk based on predetermined criteria, such as credit scores, loan-to-value ratios and debt-to-income ratios. Critically, these scorecards do not accept proprietary fintech data or alternative risk indicators—such as real-time cash flow, rental payments or utility bill histories—that have significantly enhanced predictive accuracy in unsecured lending markets. Consequently, even the most advanced fintech algorithms can achieve only incremental improvements within this rigid framework, restricting the ability to differentiate borrower risks more effectively.

Second, incentive misalignment compounds this regulatory rigidity. Fintech and other non-bank lenders primarily use warehouse funding, originating loans intended for rapid sale into securitization pools backed by government-sponsored enterprise (GSE) guarantees. This originate-to-distribute model shifts the long-term credit risk to MBS investors, significantly weakening incentives for precise risk-based pricing. The immediate rewards from slightly lower rates and higher origination volume thus outweigh the long-term, dispersed default costs, making systematic underpricing rational from a business growth perspective.

Third, competitive positioning further shapes these pricing dynamics. Fintech lenders differentiate themselves through speed, streamlined user experiences and digital convenience, whereas traditional banks leverage brand recognition, customer trust, cross-selling opportunities and rigorous regulatory oversight. Banks also frequently retain loan servicing rights and face capital requirements that strongly incentivize accurate upfront risk pricing. This strategic and regulatory alignment explains why banks consistently display steeper rate-risk slopes compared to fintech lenders.

These dynamics underscore the structural limitations in algorithmic credit scoring when underwriting decisions are decoupled from long-term financial accountability. However, targeted policy reforms could significantly enhance fintech lenders’ alignment between pricing and borrower risk. First, modernizing GSE underwriting scorecards to allow carefully verified alternative data—such as rental payments, utility histories or verified real-time financial transaction data—could meaningfully expand the informational scope, enabling more nuanced borrower risk differentiation. Second, introducing modest risk-retention requirements, where lenders must retain a small percentage (e.g., 5%) of each originated loan’s risk, would align lenders’ incentives with long-term loan performance without compromising the liquidity benefits provided by agency mortgage-backed securities. Such requirements echo existing regulatory frameworks like the Dodd–Frank Act’s risk-retention rules and could substantially strengthen pricing discipline.

The findings also suggest several promising directions for future research. One avenue is examining whether fintech lenders demonstrate superior performance in private-label or non-conforming mortgage segments, where underwriting standards are more flexible, and lenders retain greater exposure to loan outcomes. Another research direction involves evaluating the broader welfare implications: specifically, does fintech-driven underpricing sustainably enhance homeownership access, or does it merely shift default risks onto government-supported entities and, ultimately, taxpayers? Finally, analyzing operational efficiencies in post-origination loan servicing may reveal whether fintech lenders provide measurable value that offsets weaker initial pricing accuracy. By explicitly separating default prediction from risk-based pricing, our analytical framework provides a useful template for exploring these critical questions in other regulated credit markets.

7. Conclusions

This study provides evidence that fintech mortgage lenders lag behind traditional banks in both screening accuracy and risk-based pricing, even when all parties operate under the same GSE-mandated information regime. Using five machine-learning models that are tuned via cross-validation and evaluated out-of-sample, we document a systematic performance gap (mean AUC across models: 0.852 for fintechs vs. 0.857 for banks). A two-stage framework that separates default prediction from pricing further reveals that banks adjust rates by 7.2 basis points for every percentage-point increase in predicted default probability, whereas fintechs adjust them by just 4.2 bp. These patterns hold across more than six million conforming loans originated between 2012 and 2020.

The findings carry broad international relevance. Many mortgage markets—from Canada to the U.K. and Australia—share two key features of the U.S. conforming segment: (i) highly standardized, regulator-approved underwriting algorithms and (ii) rapid securitization that shifts future credit losses off lenders’ balance sheets. In such environments, data ceilings and weak ex-post incentives can blunt the very technologies that drive fintech’s success in unsecured credit. Policymakers worldwide can therefore draw two actionable lessons. First, cautiously expanding the set of verifiable alternative data (e.g., rental payment histories, transaction-level cash-flow data) that government or quasi-government underwriting systems accept would raise the ceiling on predictive accuracy for all lenders. Second, modest risk-retention rules—mirroring the 5% “skin-in-the-game” standard in other securitized asset classes—would strengthen price discipline without unduly inhibiting secondary-market liquidity. Together, these reforms could unlock fintech’s analytic potential while safeguarding systemic stability, rendering our results pertinent well beyond the U.S. context.

Author Contributions

Conceptualization, H.L.; methodology, Z.L.; software, Z.L.; validation, H.L.; formal analysis, Z.L.; investigation, Z.L.; resources, Z.L. and H.L.; data curation, Z.L.; writing—original draft preparation, Z.L.; writing—review and editing, H.L.; visualization, Z.L.; supervision, Z.L.; project administration, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study are publicly available from the Fannie Mae and Freddie Mac Single-Family Loan Performance datasets for the years 2012 to 2020. These datasets can be accessed at: https://capitalmarkets.fanniemae.com/credit-risk-transfer/single-family-credit-risk-transfer/fannie-mae-single-family-loan-performance-data and http://www.freddiemac.com/research/datasets/sf_loanlevel_dataset.page, respectively (accessed on 16 June 2022).

Conflicts of Interest

The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.

Appendix A. Variable Definition

Variable	Definition
Borrower and Loan Characteristics
Origination Rate	The original interest rate on a mortgage loan as identified in the original mortgage note
Origination Balance ($ Thousand)	The dollar amount of the loan as stated on the note at the time the loan was originated
Original Loan-to-Value Ratio (OLTV)	Loan amount divided by the value of property at origination
Original Combined LTV (OCLTV)	The amount of all known outstanding loans (including home equity) at origination divided by the value of property
Debt-to-Income Ratio	Borrower’s total monthly debt obligations divided by total monthly income at origination
FICO Score	Borrower’s FICO score at origination
Refinance	Indicator variable for whether the loan is a home refinance
Cash-Out Refinance	Indicator variable for whether the loan is a cash-out refinance
Non-Cash-Out Refinance	Indicator variable for whether the loan is a non-cash-out (rate) refinance
Purchase	Indicator variable for whether the loan is a home purchase
First-Time Home Buyer	An indicator that denotes whether the borrower or co-borrower qualifies as a first-time homebuyer
Number of Borrowers	The number of individuals obligated to repay the mortgage loan
Has Mortgage Insurance	Indicator variable for whether the loan has mortgage insurance
Mortgage Insurance Unknown	Indicator variable for whether the loan’s mortgage insurance status is unknown
Primary Residence	Indicator variable for whether the property is occupied as a primary residence
Investment or Second Property	Indicator variable for whether the property is a second home or an investment property
Correspondent Channel	Indicator variable for whether the loan is originated through the correspondent channel
Retail Channel	Indicator variable for whether the loan is originated through the retail channel
Broker Channel	Indicator variable for whether the loan is originated through the broker channel
MSA Macroeconomic Indicators
MSA Unemployment Rate	Unemployment rate by metropolitan statistical area, seasonally adjusted; obtained from the U.S. Bureau of Labor Statistics
MSA Real Personal Income	Real per-capita personal income (chained 2012 dollars) by metropolitan statistical area; obtained from the Bureau of Economic Analysis
Loan Performance
30 (60/90) DPD in 12 (24/36) Months	Indicator variables for whether the loan is 30 (60/90) days past due within 12 (24/36) months after origination
Prepaid in 12 (24/36) Months	Indicator variables for whether the loan is prepaid within 12 (24/36) months after origination

Appendix B. Detailed Machine-Learning Models for Credit Risk Modeling

1. Logistic Regression (LR)

Objective Function: The goal of logistic regression is to find the best fitting model to describe the relationship between the dichotomous characteristic of interest (dependent variable) and a set of independent (predictor or explanatory) variables. Logistic regression does this by estimating probabilities using a logistic function, which is a cumulative logistic distribution.

$$L(\beta) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_{i} \log(p_{i}) + (1 - y_{i}) \log(1 - p_{i}) \right]$$

Here, $L(\beta)$ is the logistic loss function; $N$ is the number of observations; $y_{i}$ is the actual outcome; and $p_{i}$ is the predicted probability of the outcome being 1 for the $i$th observation. The $\beta$ coefficients are estimated during the training process.
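As a worked illustration of the loss above, the following minimal sketch evaluates it on a handful of made-up observations. The function name `log_loss`, the clipping constant `eps`, and the toy labels and probabilities are assumptions introduced here for illustration; they are not taken from the study.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Average logistic loss for binary labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so that log(0) is never evaluated.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

# Hypothetical toy data: 1 = default, 0 = no default.
y = [0, 0, 1, 1]
p = [0.1, 0.2, 0.7, 0.9]
loss = log_loss(y, p)
print(round(loss, 4))
```

Confident, correct predictions contribute little to the loss, while a confident wrong prediction (for example, p = 0.9 for a loan that does not default) is penalized heavily.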

Hyperparameters:

Regularization type (L1, L2, ElasticNet): Determines the type of regularization applied to the model to prevent overfitting by penalizing large coefficients.
Regularization strength: Controls the magnitude of the regularization term. A larger value specifies stronger regularization.
Solver: The algorithm used for optimization (e.g., liblinear, sag, saga, newton-cg). Different solvers are suitable for different types of data and different regularization methods.

2. Random Forest (RF)

Objective Function: Random forest aims to reduce overfitting in decision trees by averaging multiple decision trees’ predictions, trained on different parts of the same training set, with the goal of improving the overall accuracy. A random forest algorithm does not have a single formula due to its ensemble nature, but it operates by building multiple decision trees and merging their predictions. The decision of the majority of trees is chosen by the random forest as the final prediction.

Hyperparameters:

Number of trees: The number of trees in the forest. More trees increase prediction stability but also computational complexity.
Max depth: The maximum depth of the trees.
Min samples split: The minimum number of samples required to split an internal node.
Min samples leaf: The minimum number of samples required to be at a leaf node.
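The aggregation step described above can be sketched in a few lines. This is only an illustration of majority voting across trees, with hypothetical per-tree outputs; the helper names `majority_vote` and `mean_probability` are assumptions. Averaging per-tree probabilities is shown alongside because a ranking metric such as AUC needs a continuous score rather than a hard class label.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

def mean_probability(tree_probabilities):
    """Average the default probabilities produced by the individual trees."""
    return sum(tree_probabilities) / len(tree_probabilities)

# Hypothetical outputs of five trees for a single loan.
votes = [0, 1, 0, 0, 1]
probs = [0.10, 0.60, 0.20, 0.30, 0.55]

final_class = majority_vote(votes)
final_score = mean_probability(probs)
print(final_class, round(final_score, 2))
```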

3. Gradient-Boosting Machine (GBM)

Objective Function: GBM aims to minimize the loss function by sequentially adding weak learners using a gradient descent algorithm. Each new model incrementally decreases the loss function (e.g., mean squared error for regression tasks) of the entire system.

$$F_{m}(x) = F_{m-1}(x) + \rho_{m} h_{m}(x)$$

Here, $F_{m}(x)$ is the boosted model’s prediction at iteration $m$; $F_{m-1}(x)$ is the prediction from the previous iteration; $h_{m}(x)$ is the weak learner added at iteration $m$; and $\rho_{m}$ is the learning rate.
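The additive update can be made concrete with a small sketch. It assumes squared-error loss (the example loss mentioned above, not the classification loss a default model would use), a single feature, depth-one trees as weak learners, and a constant learning rate `rho`; the data and all function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a depth-one tree: pick the threshold that minimizes squared error."""
    best = None
    for t in sorted(set(x))[:-1]:  # exclude the maximum so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def mse(y, preds):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

def boost(x, y, n_rounds=20, rho=0.1):
    """Apply F_m = F_{m-1} + rho * h_m, fitting each h_m to the current residuals."""
    f0 = sum(y) / len(y)  # F_0: constant prediction
    preds = [f0] * len(y)
    learners = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - F_{m-1}(x).
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        h = fit_stump(x, residuals)
        learners.append(h)
        preds = [pi + rho * h(xi) for pi, xi in zip(preds, x)]
    return f0, learners, preds

# Hypothetical toy data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 0.9, 1.1, 3.0, 3.2, 2.9]

f0, learners, preds = boost(x, y)
base_mse = mse(y, [f0] * len(y))
final_mse = mse(y, preds)
print(round(base_mse, 3), round(final_mse, 3))
```

Each round removes only a fraction `rho` of the remaining residual, which is why a smaller learning rate generally needs more trees to reach the same fit.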

Hyperparameters:

Learning rate: Scales the contribution of each added tree; smaller values make each correction more gradual and usually require more trees.
Number of learners: The total number of trees to be built.
Max depth of trees: The depth limit for each tree, controlling overfitting.

4. LightGBM

Objective Function: Similar to GBM, LightGBM also focuses on minimizing the loss function but does so more efficiently for large datasets by using gradient-based one-side sampling and exclusive feature bundling.

Hyperparameters:

Number of leaves: The maximum number of leaves in one tree.
Learning rate: The shrinkage applied to each tree’s contribution; smaller values slow learning and usually require more boosting rounds.
Min data in leaf: The minimum number of records a leaf may have.
Feature fraction: The fraction of features to be used for each tree, preventing overfitting.
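Gradient-based one-side sampling, mentioned above, can be sketched as follows. The sketch follows the commonly described procedure: keep the observations with the largest absolute gradients, draw a random subset of the rest, and up-weight that subset so the sampled gradient statistics remain approximately unbiased. The function name `goss_sample`, the rates `top_rate` and `other_rate`, and the synthetic gradients are assumptions for illustration, not settings reported in the study.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) selected by gradient-based one-side sampling."""
    n = len(gradients)
    # Rank observations by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top = order[:n_top]    # always kept
    rest = order[n_top:]   # candidates for random sampling
    rng = random.Random(seed)
    sampled = rng.sample(rest, n_other)
    # Up-weight the sampled small-gradient observations to offset the subsampling.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * n_top + [amplify] * n_other
    return indices, weights

# Hypothetical gradients for 100 observations.
grads = [(-1) ** i * (i % 10) / 10 for i in range(100)]
idx, w = goss_sample(grads)
print(len(idx), w[0], w[-1])
```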

5. XGBoost

Objective Function: XGBoost aims to minimize the regularized loss function that includes both a loss term and a regularization term, which helps in controlling overfitting more effectively than traditional GBM.

$$F_{m}(x) = F_{m-1}(x) + \rho_{m} h_{m}(x) + \lambda \Omega(h_{m})$$

In this formula, $\Omega(h_{m})$ represents the regularization term applied to the model $h_{m}$, adding a penalty for complexity to improve model generalization.
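For comparison, the XGBoost documentation usually writes the penalty as part of the objective that each new tree minimizes, rather than as a term in the prediction update. The symbols $\ell$ (a per-observation loss), $T$ (the number of leaves), $w$ (the vector of leaf weights), and $\gamma$ follow that documentation’s notation and do not appear elsewhere in this appendix; the block is a reference sketch, not a restatement of the authors’ formula.

```latex
\mathcal{L}^{(m)} = \sum_{i=1}^{N} \ell\bigl(y_{i},\, F_{m-1}(x_{i}) + h_{m}(x_{i})\bigr) + \Omega(h_{m}),
\qquad
\Omega(h) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^{2}
```

Under this form, $\lambda$ shrinks the leaf weights and $\gamma$ penalizes each additional leaf, so the regularization shapes which tree $h_{m}$ is chosen while the prediction itself is still updated additively.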

Hyperparameters:

Learning rate (eta): Determines the step size at each iteration to prevent overfitting.
Max depth: Maximum depth of a tree; increasing this value will make the model more complex and more likely to overfit.
Subsample: The fraction of samples to be used for fitting the individual base learners.
Colsample_bytree: The fraction of features to be used for each tree.

Appendix C. Tuned Hyperparameters by Model and Lender Group

Notes: All models use n_estimators = 100. The learning rate (lr) and maximum tree depth (depth) were tuned from grids: lr ∈ {0.05, 0.1}, depth ∈ {3, 6}. Logit models were tuned over C ∈ {0.01, 0.1, 1, 10}. Random forests were tuned over max_depth ∈ {3, 6, 10}.

Lender Group	Logit (C)	RF (max_depth)	LGBM (lr/depth)	XGB (lr/depth)	GBC (lr/depth)
Bank	0.1	10	0.1/3	0.1/3	0.1/3
Non-Fintech	0.1	10	0.05/3	0.1/3	0.05/3
Fintech	0.1	10	0.05/3	0.1/3	0.1/3
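The search space described in the notes is small enough to enumerate directly. The sketch below expands the stated grids into candidate configurations; the dictionary keys `learning_rate` and `max_depth` are assumed parameter names (the notes write lr and depth), and the helper `expand` is hypothetical.

```python
from itertools import product

# Grids as stated in the notes to this appendix.
BOOSTED_GRID = {"learning_rate": [0.05, 0.1], "max_depth": [3, 6]}  # LGBM, XGB, GBC
LOGIT_GRID = {"C": [0.01, 0.1, 1, 10]}
RF_GRID = {"max_depth": [3, 6, 10]}
N_ESTIMATORS = 100  # fixed for all tree-based models

def expand(grid):
    """List every combination of hyperparameter values in a grid."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

boosted = expand(BOOSTED_GRID)
for config in boosted:
    print(config)
```

Each boosted model therefore has four candidate configurations per lender group, the logit model four, and the random forest three.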

Figure 1. Out-of-Sample AUC by Model and Lender Group. This figure compares the predictive performance of five machine-learning classifiers—logistic regression (logit), random forest (RF), LightGBM (LGBM), XGBoost (XGB) and gradient-boosting classifier (GBC)—across three lender groups: banks, non-fintech non-banks and fintech lenders. The AUC scores reflect each model’s out-of-sample ability to distinguish between default and non-default mortgage loans originated from 2012 to 2020.

Figure 2. Interest Rate vs. Predicted Default Risk by Lender Group. This figure plots the average mortgage interest rate against the average predicted default risk across deciles of risk scores for three lender groups: banks (green), non-fintech non-banks (orange) and fintech lenders (blue). The curve indicates the extent to which each lender group aligns pricing with borrower risk. Banks show a steeper increase in rates as risk rises, while fintech and non-fintech lenders exhibit flatter curves, suggesting less responsive pricing to predicted risk levels.

Table 1. Summary Statistics by Lender Group. This table reports summary statistics for borrower and loan characteristics (Panel A) and subsequent loan performance outcomes (Panel B) across three lender types: Banks, Non-Fintech Non-Banks and Fintechs. All figures are computed at loan origination unless otherwise noted.

(A) Origination Characteristics
Variable	Bank Mean	Bank Std	Non-Fintech Mean	Non-Fintech Std	Fintech Mean	Fintech Std
Origination Rate (%)	4.34	0.53	4.30	0.53	4.34	0.51
Origination Balance ($)	241,827	129,780	267,520	130,120	250,602	134,174
Original LTV (%)	73.38	16.30	75.92	15.76	72.99	16.01
Original CLTV (%)	74.29	16.02	76.40	15.58	73.34	15.89
Debt-to-Income Ratio (%)	33.81	9.08	35.29	9.09	35.65	8.97
FICO Score	756.24	43.69	750.60	44.74	742.37	47.96
Refinance (%)	53.63	49.87	47.05	49.91	69.87	45.88
First-Time Buyer (%)	17.63	38.11	21.26	40.92	12.70	33.29
Number of Borrowers	1.54	0.51	1.50	0.52	1.47	0.51
Mortgage Insurance (%)	22.32	41.64	30.21	45.92	24.65	43.10
Investment/Second Property (%)	12.35	32.90	12.88	33.50	9.67	29.56
MSA Unemployment Rate (%)	6.81	2.72	5.17	2.18	4.96	1.94
MSA Real Personal Income ($)	46,999.64	7158	48,890.73	7396	49,446.05	7753
∆MSA Unemployment Rate (%)	−0.50	0.77	−0.60	0.55	−0.58	0.52
∆MSA HPI	0.0162	0.0543	0.0485	0.0420	0.0490	0.0359
(B) Loan Performance Outcomes
30 DPD in 12 Months (%)	2.23	14.76	3.62	18.67	2.68	16.16
30 DPD in 24 Months (%)	3.97	19.53	6.39	24.45	4.63	21.02
30 DPD in 36 Months (%)	5.19	22.17	8.02	27.17	5.57	22.93
90 DPD in 12 Months (%)	0.14	3.76	0.24	4.85	0.30	5.48
90 DPD in 24 Months (%)	0.44	6.60	0.64	7.97	0.80	8.89
90 DPD in 36 Months (%)	0.72	8.47	0.93	9.60	1.11	10.49
Prepaid in 12 Months (%)	6.97	25.46	9.08	28.73	11.10	31.42
Prepaid in 24 Months (%)	17.87	38.31	18.16	38.55	22.87	42.00
Prepaid in 36 Months (%)	28.85	45.31	28.33	45.06	34.27	47.46

Table 2. Out-of-Sample AUC Scores by Model and Lender Group. This table reports the out-of-sample area under the curve (AUC) scores for five machine-learning models—logistic regression (logit), random forest (RF), LightGBM (LGBM), XGBoost (XGB) and gradient-boosting classifier (GBC)—trained to predict 36-month mortgage default probabilities. Models were estimated separately for bank, non-fintech and fintech lenders using conforming loan data originated between 2012 and 2020.

Model	Bank AUC	Non-Fintech AUC	Fintech AUC
Logit	0.8548	0.8565	0.8467
RF	0.8530	0.8568	0.8462
LGBM	0.8571	0.8600	0.8522
XGB	0.8567	0.8599	0.8532
GBC	0.8561	0.8600	0.8517
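To make the metric in this table concrete, the sketch below computes AUC from its pairwise definition: the probability that a randomly chosen defaulted loan receives a higher predicted risk than a randomly chosen non-defaulted loan, with ties counted as one half. The labels and scores are made up, and the quadratic loop is meant for illustration only; it would be far too slow for millions of loans.

```python
def auc(y_true, scores):
    """Share of (default, non-default) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = default) and predicted default probabilities.
y = [0, 0, 1, 0, 1, 1]
s = [0.05, 0.10, 0.08, 0.30, 0.40, 0.60]
print(round(auc(y, s), 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the differences reported above are differences in ranking quality, not in calibrated probability levels.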

Table 3. OLS Regression of Interest Rate on Predicted Default Probability. This table presents results from ordinary least squares (OLS) regressions of mortgage interest rates on predicted default probabilities ($\hat{PD}$), estimated separately for each lender group. The intercept represents the baseline interest rate when the predicted default is zero, and the coefficient captures the marginal effect of default probability on pricing (i.e., the risk-based pricing gradient). R-squared indicates the model’s explanatory power. The Akaike information criterion (AIC) reflects model fit, with lower values indicating better fit.

Lender Type	Intercept (const)	Coefficient on $\hat{PD}$	R-Squared	AIC	Observations
Bank	4.2735	7.1977	0.041	7,190,000	4,649,821
Non-Fintech	4.0885	5.4337	0.053	1,488,000	1,150,726
Fintech	4.1711	4.1804	0.046	558,100	485,519
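The coefficients in this table can be converted into the basis-point gradients quoted in the text with simple arithmetic. The sketch below assumes that the predicted default probability enters the regression as a fraction between 0 and 1 and that the interest rate is measured in percent; that scaling is inferred from the reported magnitudes rather than stated in the table. The function names are hypothetical.

```python
# (intercept, coefficient on predicted default probability) from Table 3.
TABLE3 = {
    "Bank": (4.2735, 7.1977),
    "Non-Fintech": (4.0885, 5.4337),
    "Fintech": (4.1711, 4.1804),
}

def implied_rate(group, pd_hat):
    """Fitted interest rate (percent) at a predicted default probability (fraction)."""
    intercept, slope = TABLE3[group]
    return intercept + slope * pd_hat

def bp_per_pp(group):
    """Rate change in basis points for a one-percentage-point rise in predicted default."""
    slope = TABLE3[group][1]
    # One percentage point of PD is 0.01; one percentage point of rate is 100 bp.
    return slope * 0.01 * 100

for group in TABLE3:
    print(group, round(bp_per_pp(group), 2), round(implied_rate(group, 0.05), 4))
```

At a predicted default probability of 5%, the fitted rate is about 4.63% for banks and about 4.38% for fintech lenders, which illustrates the flatter fintech gradient.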

Table 4. Mispricing Statistics by Lender Group. This table reports summary statistics of interest rate mispricing relative to a pooled benchmark model across three lender groups: banks, fintech lenders and non-fintech non-bank lenders. Mean mispricing reflects the average difference between the observed rate and the benchmark-implied rate (positive values indicate overpricing; negative values indicate underpricing). The percentage of underpriced and overpriced loans is calculated based on the tails of the mispricing distribution (e.g., below the 5th and above the 95th percentile).

Lender Group	Mean Mispricing	Std. Dev.	% Underpriced	% Overpriced
Bank	0.0466	0.5245	21.60%	25.87%
Fintech	−0.0851	0.4319	32.02%	13.50%
Non-Fintech	−0.1524	0.4620	29.99%	15.37%
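The statistics in this table are summaries of a mispricing residual, defined as the observed rate minus the benchmark-implied rate. The sketch below shows one way such a summary could be computed. The cutoff `band` used to flag a loan as underpriced or overpriced is a hypothetical fixed threshold chosen for illustration; the caption describes the study’s cutoffs only in terms of the tails of the distribution. The rates are made up.

```python
from statistics import mean, pstdev

def mispricing_summary(observed, benchmark, band=0.25):
    """Summarize residuals (observed minus benchmark rate, in percentage points)."""
    resid = [o - b for o, b in zip(observed, benchmark)]
    n = len(resid)
    return {
        "mean": mean(resid),
        "std": pstdev(resid),
        # Hypothetical rule: flag loans priced more than `band` away from the benchmark.
        "pct_under": 100.0 * sum(r < -band for r in resid) / n,
        "pct_over": 100.0 * sum(r > band for r in resid) / n,
    }

# Hypothetical observed and benchmark-implied rates for five loans.
observed = [4.10, 4.50, 3.90, 4.80, 4.25]
benchmark = [4.30, 4.40, 4.30, 4.40, 4.25]

summary = mispricing_summary(observed, benchmark)
print(summary)
```

A negative mean indicates that a lender group prices below the benchmark on average, which is the pattern reported above for fintech and non-fintech non-bank lenders.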

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
