Towards Fair AI: Mitigating Bias in Credit Decisions—A Systematic Literature Review
Abstract
1. Introduction
2. Method
- First step: Conduct a comprehensive search for published articles on the topic in relevant databases.
- Second step: Develop a classification model, coded using a logical structure.
- Third step: Apply the classification model and develop a framework for the current discussion on the topic.
- Fourth step: Present the characteristics of the scientific literature and the main results, considering the coding system.
- Fifth step: Analyze the gaps and suggest opportunities for further studies.
Classification and Coding
3. Results
3.1. Study Selection
3.2. General Characteristics of the Studies
3.3. Citation of Articles
3.4. Classification of Articles
4. Discussion
4.1. State of the Art in Fairness in Credit Decisions
Contemporary Debate and Divergences in Algorithmic Fairness
4.2. Research Gaps
4.3. Research Agenda
4.4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
AI | Artificial intelligence
ML | Machine learning
References
- Alves, G., Bernier, F., Couceiro, M., Makhlouf, K., Palamidessi, C., & Zhioua, S. (2023). Survey on fairness notions and related tensions. EURO Journal on Decision Processes, 11, 100033. [Google Scholar] [CrossRef]
- Asimit, A. V., Kyriakou, I., Santoni, S., Scognamiglio, S., & Zhu, R. (2022). Robust classification via support vector machines. Risks, 10(8), 154. [Google Scholar] [CrossRef]
- Babaei, G., & Giudici, P. (2024). How fair is machine learning in credit lending? Quality and Reliability Engineering International, 40(6), 3452–3464. [Google Scholar] [CrossRef]
- Badar, M., & Fisichella, M. (2024). Fair-CMNB: Advancing fairness-aware stream learning with naïve Bayes and multi-objective optimization. Big Data and Cognitive Computing, 8(2), 16. [Google Scholar] [CrossRef]
- Bantilan, N. (2018). Themis-ml: A fairness-aware machine learning interface for end-to-end discrimination discovery and mitigation. Journal of Technology in Human Services, 36(1), 15–30. [Google Scholar] [CrossRef]
- Barbierato, E., Vedova, M. L. D., Tessera, D., Toti, D., & Vanoli, N. (2022). A methodology for controlling bias and fairness in synthetic data generation. Applied Sciences, 12(9), 4619. [Google Scholar] [CrossRef]
- Bhatore, S., Mohan, L., & Reddy, Y. R. (2020). Machine learning techniques for credit risk evaluation: A systematic literature review. Journal of Banking and Financial Technology, 4, 111–138. [Google Scholar] [CrossRef]
- Bircan, T., & Özbilgin, M. F. (2025). Unmasking inequalities of the code: Disentangling the nexus of AI and inequality. Technological Forecasting and Social Change, 211, 123925. [Google Scholar] [CrossRef]
- Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. Advances in Neural Information Processing Systems, 29, 4356–4364. [Google Scholar]
- Bono, T., Croxson, K., & Giles, A. (2021). Algorithmic fairness in credit scoring. Oxford Review of Economic Policy, 37(3), 585–617. [Google Scholar] [CrossRef]
- Breeden, J. L., & Leonova, E. (2021). Creating unbiased machine learning models by design. Journal of Risk and Financial Management, 14(11), 565. [Google Scholar] [CrossRef]
- Brotcke, L. (2022). Time to assess bias in machine learning models for credit decisions. Journal of Risk and Financial Management, 15(4), 165. [Google Scholar] [CrossRef]
- Calders, T., & Verwer, S. (2010). Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21, 277–292. [Google Scholar] [CrossRef]
- Chen, Y., Giudici, P., Liu, K., & Raffinetti, E. (2024). Measuring fairness in credit ratings. Expert Systems with Applications, 258, 125184. [Google Scholar] [CrossRef]
- Colakovic, I., & Karakatič, S. (2023). FairBoost: Boosting supervised learning for learning on multiple sensitive features. Knowledge-Based Systems, 280, 110999. [Google Scholar] [CrossRef]
- Corbett-Davies, S., Gaebler, J. D., Nilforoshan, H., Shroff, R., & Goel, S. (2023). The measure and mismeasure of fairness. Journal of Machine Learning Research, 24(312), 14730–14846. [Google Scholar]
- de Castro Vieira, J. R. (2024). Replication data for: IA justa: Promovendo a justiça no crédito [Fair AI: Promoting fairness in credit]. Harvard Dataverse. [Google Scholar] [CrossRef]
- de Castro Vieira, J. R., Barboza, F., Sobreiro, V. A., & Kimura, H. (2019). Machine learning models for credit analysis improvements: Predicting low-income families’ default. Applied Soft Computing, 83, 105640. [Google Scholar] [CrossRef]
- Dwork, C., Hardt, M., Pitassi, T., Reingold, O., & Zemel, R. (2012, January 8–10). Fairness through awareness. 3rd Innovations in Theoretical Computer Science Conference (pp. 214–226), Cambridge, MA, USA. [Google Scholar]
- Fabris, A., Messina, S., Silvello, G., & Susto, G. A. (2022). Algorithmic fairness datasets: The story so far. Data Mining and Knowledge Discovery, 36(6), 2074–2152. [Google Scholar] [CrossRef]
- Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., & Venkatasubramanian, S. (2015, August 10–13). Certifying and removing disparate impact. 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 259–268), Sydney, Australia. [Google Scholar]
- Garcia, A. C. B., Garcia, M. G. P., & Rigobon, R. (2023). Algorithmic discrimination in the credit domain: What do we know about it? AI & Society, 39, 2059–2098. [Google Scholar] [CrossRef]
- Genovesi, S., Mönig, J. M., Schmitz, A., Poretschkin, M., Akila, M., Kahdan, M., Kleiner, R., Krieger, L., & Zimmermann, A. (2023). Standardizing fairness-evaluation procedures: Interdisciplinary insights on machine learning algorithms in creditworthiness assessments for small personal loans. AI and Ethics, 4, 537–553. [Google Scholar] [CrossRef]
- Giudici, P., Centurelli, M., & Turchetta, S. (2024). Artificial intelligence risk measurement. Expert Systems with Applications, 235, 121220. [Google Scholar] [CrossRef]
- González-Zelaya, V., Salas, J., Megias, D., & Missier, P. (2024). Fair and private data preprocessing through microaggregation. ACM Transactions on Knowledge Discovery from Data, 18(3), 1–24. [Google Scholar] [CrossRef]
- Haque, A. B., Islam, A. N., & Mikalef, P. (2023). Explainable Artificial Intelligence (XAI) from a user perspective: A synthesis of prior literature and problematizing avenues for future research. Technological Forecasting and Social Change, 186, 122120. [Google Scholar] [CrossRef]
- Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. Advances in Neural Information Processing Systems, 29, 3323–3331. [Google Scholar]
- Hassani, B. K. (2021). Societal bias reinforcement through machine learning: A credit scoring perspective. AI and Ethics, 1(3), 239–247. [Google Scholar] [CrossRef]
- Huang, X., Li, Z., Jin, Y., & Zhang, W. (2022). Fair-AdaBoost: Extending AdaBoost method to achieve fair classification. Expert Systems with Applications, 202, 117240. [Google Scholar] [CrossRef]
- Huang, Y., Liu, W., Gao, W., Lu, X., Liang, X., Yang, Z., Li, H., Ma, L., & Tang, S. (2023). Algorithmic fairness in social context. BenchCouncil Transactions on Benchmarks, Standards and Evaluations, 3(3), 100137. [Google Scholar] [CrossRef]
- Huisingh, D. (2012). Call for comprehensive/integrative review articles. Journal of Cleaner Production, 29–30, 1–2. [Google Scholar]
- Jabbour, C. J. C. (2013). Environmental training in organisations: From a literature review to a framework for future research. Resources, Conservation and Recycling, 74, 144–155. [Google Scholar] [CrossRef]
- Jammalamadaka, K. R., & Itapu, S. (2023). Responsible AI in automated credit scoring systems. AI and Ethics, 3(2), 485–495. [Google Scholar] [CrossRef]
- Junior, M. L., & Godinho Filho, M. (2010). Variations of the kanban system: Literature review and classification. International Journal of Production Economics, 125(1), 13–21. [Google Scholar] [CrossRef]
- Kamishima, T., Akaho, S., Asoh, H., & Sakuma, J. (2012). Fairness-aware classifier with prejudice remover regularizer. In Machine learning and knowledge discovery in databases, European conference, ECML PKDD 2012, Bristol, UK, September 24–28 (pp. 35–50). Part II, LNCS 7524. Springer. [Google Scholar]
- Kim, J.-Y., & Cho, S.-B. (2022). An information theoretic approach to reducing algorithmic bias for machine learning. Neurocomputing, 500, 26–38. [Google Scholar] [CrossRef]
- Kleinberg, J. (2018, June 18–22). Inherent trade-offs in algorithmic fairness. 2018 ACM International Conference on Measurement and Modeling of Computer Systems (pp. 40–40), Irvine, CA, USA. [Google Scholar]
- Kozodoi, N., Jacob, J., & Lessmann, S. (2022). Fairness in credit scoring: Assessment, implementation and profit implications. European Journal of Operational Research, 297(3), 1083–1094. [Google Scholar] [CrossRef]
- Kusner, M. J., Loftus, J., Russell, C., & Silva, R. (2017). Counterfactual fairness. Advances in Neural Information Processing Systems, 30, 4069–4079. [Google Scholar]
- Lee, M. S. A., & Floridi, L. (2021). Algorithmic fairness in mortgage lending: From absolute conditions to relational trade-offs. Minds and Machines, 31(1), 165–191. [Google Scholar] [CrossRef]
- Leite, D. F., Padilha, M. A. S., & Cecatti, J. G. (2019). Approaching literature review for academic purposes: The literature review checklist. Clinics, 74, e1403. [Google Scholar] [CrossRef]
- Liu, S., & Vicente, L. N. (2022). Accuracy and fairness trade-offs in machine learning: A stochastic multi-objective approach. Computational Management Science, 19(3), 513–537. [Google Scholar] [CrossRef]
- Liu, Z., Zhang, X., & Jiang, B. (2023). Active learning with fairness-aware clustering for fair classification considering multiple sensitive attributes. Information Sciences, 647, 119521. [Google Scholar] [CrossRef]
- Lobel, O. (2023). The law of AI for good. Florida Law Review, 75, 1073. [Google Scholar] [CrossRef]
- Lu, X., & Calabrese, R. (2023). The Cohort Shapley value to measure fairness in financing small and medium enterprises in the UK. Finance Research Letters, 58, 104542. [Google Scholar] [CrossRef]
- Makhlouf, K., Zhioua, S., & Palamidessi, C. (2024). When causality meets fairness: A survey. Journal of Logical and Algebraic Methods in Programming, 141, 101000. [Google Scholar] [CrossRef]
- Mazilu, L., Paton, N. W., Konstantinou, N., & Fernandes, A. A. A. (2022). Fairness-aware Data Integration. Journal of Data and Information Quality, 14(4), 1–26. [Google Scholar] [CrossRef]
- Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6), 1–35. [Google Scholar] [CrossRef]
- Mishra, G., & Kumar, R. (2023). An individual fairness based outlier detection ensemble. Pattern Recognition Letters, 171, 76–83. [Google Scholar] [CrossRef]
- Napoles, G., & Koutsoviti Koumeri, L. (2022). A fuzzy-rough uncertainty measure to discover bias encoded explicitly or implicitly in features of structured pattern classification datasets. Pattern Recognition Letters, 154, 29–36. [Google Scholar] [CrossRef]
- Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., & Brennan, S. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71. [Google Scholar] [CrossRef]
- Pedreshi, D., Ruggieri, S., & Turini, F. (2008, August 6–10). Discrimination-aware data mining. 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 560–568), Long Beach, CA, USA. [Google Scholar]
- Pessach, D., & Shmueli, E. (2022). A review on fairness in machine learning. ACM Computing Surveys (CSUR), 55(3), 1–44. [Google Scholar]
- Peters, M. D., Marnie, C., Tricco, A. C., Pollock, D., Munn, Z., Alexander, L., McInerney, P., Godfrey, C. M., & Khalil, H. (2020). Updated methodological guidance for the conduct of scoping reviews. JBI Evidence Synthesis, 18(10), 2119–2126. [Google Scholar] [CrossRef]
- Purificato, E., Lorenzo, F., Fallucchi, F., & De Luca, E. W. (2023). The use of responsible artificial intelligence techniques in the context of loan approval processes. International Journal of Human–Computer Interaction, 39(7), 1543–1562. [Google Scholar] [CrossRef]
- Qiu, H., Feng, R., Hu, R., Yang, X., Lin, S., Tao, Q., & Yang, Y. (2023). Learning fair representations via an adversarial framework. AI Open, 4, 91–97. [Google Scholar] [CrossRef]
- Rizinski, M., Peshov, H., Mishev, K., Chitkushev, L. T., Vodenska, I., & Trajanov, D. (2022). Ethically responsible machine learning in fintech. IEEE Access, 10, 97531–97554. [Google Scholar] [CrossRef]
- Seuring, S. (2013). A review of modeling approaches for sustainable supply chain management. Decision Support Systems, 54(4), 1513–1520. [Google Scholar] [CrossRef]
- Singh, A., Singh, J., Khan, A., & Gupta, A. (2022). Developing a novel fair-loan classifier through a multi-sensitive debiasing pipeline: DualFair. Machine Learning and Knowledge Extraction, 4(1), 240–253. [Google Scholar] [CrossRef]
- Sonoda, R. (2023). Fair oversampling technique using heterogeneous clusters. Information Sciences, 640, 119059. [Google Scholar] [CrossRef]
- Stoll, C. R., Izadi, S., Fowler, S., Green, P., Suls, J., & Colditz, G. A. (2019). The value of a second reviewer for study selection in systematic reviews. Research Synthesis Methods, 10(4), 539–545. [Google Scholar] [CrossRef] [PubMed]
- Szepannek, G., & Lübke, K. (2021). Facing the challenges of developing fair risk scoring models. Frontiers in Artificial Intelligence, 4, 681915. [Google Scholar] [CrossRef] [PubMed]
- Tang, S., & Yuan, J. (2023). Beyond submodularity: A unified framework of randomized set selection with group fairness constraints. Journal of Combinatorial Optimization, 45(4), 102. [Google Scholar] [CrossRef]
- Thuraisingham, B. (2022). Trustworthy machine learning. IEEE Intelligent Systems, 37(1), 21–24. [Google Scholar] [CrossRef]
- Tigges, M., Mestwerdt, S., Tschirner, S., & Mauer, R. (2024). Who gets the money? A qualitative analysis of fintech lending and credit scoring through the adoption of AI and alternative data. Technological Forecasting and Social Change, 205, 123491. [Google Scholar] [CrossRef]
- Varley, M., & Belle, V. (2021). Fairness in machine learning with tractable models. Knowledge-Based Systems, 215, 106715. [Google Scholar] [CrossRef]
- Wang, X., Zhang, Y., & Zhu, R. (2022). A brief review on algorithmic fairness. Management System Engineering, 1(1), 7. [Google Scholar] [CrossRef]
- Zafar, M. B., Valera, I., Rodriguez, M. G., & Gummadi, K. P. (2017). Fairness constraints: Mechanisms for fair classification. In Artificial intelligence and statistics (pp. 962–970). PMLR. [Google Scholar]
- Zehlike, M., Hacker, P., & Wiedemann, E. (2020). Matching code and law: Achieving algorithmic fairness with optimal transport. Data Mining and Knowledge Discovery, 34(1), 163–200. [Google Scholar] [CrossRef]
- Zemel, R., Wu, Y., Swersky, K., Pitassi, T., & Dwork, C. (2013). Learning fair representations. In International conference on machine learning (pp. 325–333). PMLR. [Google Scholar]
- Zhu, H., Dai, E., Liu, H., & Wang, S. (2023). Learning fair models without sensitive attributes: A generative approach. Neurocomputing, 561, 126841. [Google Scholar] [CrossRef]
- Zou, L., & Khern-am-nuai, W. (2023). AI and housing discrimination: The case of mortgage applications. AI and Ethics, 3(4), 1271–1281. [Google Scholar] [CrossRef]
No. | Category | Codes
---|---|---
1 | Type of Study | A—Empirical
 | | B—Theoretical
2 | Type of Approach | A—Quantitative
 | | B—Qualitative
3 | Research Topic | A—Dataset Bias
 | | B—Race/Ethnicity Bias
 | | C—Outcome Fairness
 | | D—Trade-offs Fairness × Performance
4 | Theoretical Background | A—Profit maximization
 | | B—Taste-based
 | | C—Implicit discrimination
 | | D—Statistical model of sample adjustments
 | | E—Information asymmetry
5 | Bias-Mitigation Method | A—Relabeling
 | | B—Generation
 | | C—Fair representation
 | | D—Constraint optimization
 | | E—Regularization
 | | F—Calibration
 | | G—Thresholding
6 | Approach | A—Preprocessing
 | | B—In-Processing
 | | C—Post-Processing
7 | Data Access | A—Public
 | | B—Private
8 | Data Source | A—Academic
 | | B—Synthetic
 | | C—Corporate
 | | D—Governmental
9 | Type of Credit | A—Mortgage
 | | B—Credit Card
 | | C—Credit Mix
10 | Biased Attribute | A—Gender
 | | B—Race
 | | C—Ethnicity
 | | D—Age
11 | Fairness-Evaluation Metrics | A—Statistical Parity
 | | B—Equalized Odds
 | | C—Equal Opportunity
 | | D—Treatment Equality
 | | E—Test Equality
 | | F—Fairness Through Unawareness
 | | G—Fairness Through Awareness
 | | H—Counterfactual Fairness
12 | Performance-Evaluation Metrics | A—Accuracy
 | | B—F1-Score
 | | C—G-mean
 | | D—AUC
 | | E—ROC
 | | F—Kolmogorov–Smirnov
 | | G—Gini
 | | H—Mean Squared Error (MSE)
 | | I—Precision at Position k (P@k)
 | | J—Recall
 | | K—Balanced Accuracy Score (BAcc)
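For illustration, the coding scheme lends itself to a simple machine-readable representation. The sketch below (ours, not part of the review protocol) encodes one study, Bono et al. (2021), using the codes assigned to it in the article-classification table that follows.

```python
# Illustrative only: the review's coding scheme applied to a single study.
# The codes reproduce the row for Bono et al. (2021) in the classification table.
bono_et_al_2021 = {
    "type_of_study":          ["1A"],          # Empirical
    "type_of_approach":       ["2A"],          # Quantitative
    "research_topic":         ["3C"],          # Outcome Fairness
    "theoretical_background": ["4C"],          # Implicit discrimination
    "bias_mitigation_method": ["5B"],          # Generation
    "approach":               ["6A"],          # Preprocessing
    "data_access":            ["7B"],          # Private
    "data_source":            ["8C"],          # Corporate
    "type_of_credit":         ["9C"],          # Credit Mix
    "biased_attribute":       ["10A", "10B"],  # Gender, Race
    "fairness_metrics":       ["11A", "11C"],  # Statistical Parity, Equal Opportunity
    "performance_metrics":    ["12D"],         # AUC
}
```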
Author | Type of Study (1) | Type of Approach (2) | Research Topic (3) | Theoretical Background (4) | Bias-Mitigation Method (5) | Approach (6) | Data Access (7) | Data Source (8) | Type of Credit (9) | Biased Attribute (10) | Fairness Metrics (11) | Performance Metrics (12)
---|---|---|---|---|---|---|---|---|---|---|---|---
Napoles and Koutsoviti Koumeri (2022) | 1A | 2A | 3A | 4C | | 6C | 7A | 8A | 9C | 10A | |
Barbierato et al. (2022) | 1A | 2A | 3A | 4C | 5B | 6A | 7A | 8B | | 10A, 10B | 11G |
Z. Liu et al. (2023) | 1A | 2A | 3C | 4D | 5C | 6A | 7A | 8A | | | 11A, 11C, 11G | 12A, 12B, 12C
Fabris et al. (2022) | 1B | 2B | 3A | 4C | | 6A | 7A | 8A | | | |
Bono et al. (2021) | 1A | 2A | 3C | 4C | 5B | 6A | 7B | 8C | 9C | 10A, 10B | 11A, 11C | 12D
Lee and Floridi (2021) | 1A | 2A | 3B | 4C | 5G | 6C | 7A | 8D | 9A | 10B | 11F | 12E, 12D
Mishra and Kumar (2023) | 1A | 2A | 3A | 4D | 5A | 6A | 7A | 8A | | | 11G | 12E, 12D, 12A
Giudici et al. (2024) | 1B | 2B | 3C | 4C | 5B | 6A | | | | | 11A | 12F
Breeden and Leonova (2021) | 1A | 2A | 3A | 4C | 5A | 6A | 7B | 8C | 9B | 10B | | 12E, 12D, 12G
Singh et al. (2022) | 1A | 2A | 3C | 4C | 5B, 5C | 6A | 7A | 8D | 9A | 10A, 10B | 11B | 12A, 12B, 12J
Rizinski et al. (2022) | 1A, 1B | 2A, 2B | 3C | 4A | 5F | 6C | 7A | 8A | 9B | | |
Szepannek and Lübke (2021) | 1A | 2A | 3D | 4C | 5A | 6A | 7A | 8A | 9C | 10A | 11A, 11G | 12D
Sonoda (2023) | 1A | 2A | 3D | 4D | 5B | 6A | 7A | 8A | 9C | 10B | 11A, 11B, 11C | 12K
X. Huang et al. (2022) | 1A | 2A | 3C | 4C | 5D | 6B | 7A | 8A | 9C | 10D | 11B, 11C | 12A
Colakovic and Karakatič (2023) | 1A | 2A | 3D | 4C | 5D | 6B | 7A | 8A | 9C | 10A, 10C | 11B, 11G | 12A, 12B
Kozodoi et al. (2022) | 1A, 1B | 2A, 2B | 3D | 4A | 5A, 5D, 5F | 6A, 6B, 6C | 7A, 7B | 8A, 8C | 9C | | 11A | 12D
Varley and Belle (2021) | 1A | 2A | 3C | 4C | 5C | 6A | 7A | 8A | 9C | 10A | 11F, 11G | 12A
Mazilu et al. (2022) | 1A | 2A | 3A | 4E | 5C | 6A | 7A | 8A | 9C | 10A | 11A, 11B, 11C, 11G |
Zhu et al. (2023) | 1A | 2A | 3A | 4C | 5B | 6A | 7A | 8A | 9B | 10D | 11C, 11G | 12D, 12A
Qiu et al. (2023) | 1A | 2A | 3A | 4C | 5C | 6A | 7A | 8A | 9C | 10A, 10B | 11A | 12H, 12B
Zehlike et al. (2020) | 1A | 2A | 3C | 4C | 5G | 6C | 7B | 8A | 9C | 10A, 10C | | 12I
Jammalamadaka and Itapu (2023) | 1A | 2A | 3C | 4A | 5A, 5D, 5F | 6A, 6B, 6C | 7A | 8A | 9C | 10A, 10D | 11A, 11B, 11C | 12A, 12B
Asimit et al. (2022) | 1A | 2A | 3A | 4D | 5D | 6B | 7A | 8A | 9B | 10A | 11A, 11G | 12F
Genovesi et al. (2023) | 1B | 2B | 3C | 4C | | | | | | | |
Purificato et al. (2023) | 1B | 2A, 2B | 3C | 4A | 5C | 6A | | | | | 11A | 12A, 12J, 12B
Bantilan (2018) | 1A | 2A | 3C | 4B | 5F | 6C | 7A | 8A | 9C | 10A, 10D, 10C | 11A | 12D, 12A, 12B
Brotcke (2022) | 1B | 2B | 3C | 4C | | | | | | | |
Chen et al. (2024) | 1A | 2A | 3C | 4D | 5F | 6C | 7B | 8C | 9C | 10C | | 12F, 12G
Alves et al. (2023) | 1A | 2A | 3D | 4C | 5B, 5C, 5E, 5G | 6A, 6B, 6C | 7A | 8A | 9C | 10A, 10D | 11A, 11B | 12C
Makhlouf et al. (2024) | 1A | 2A | 3C | 4C | | 6B | 7A | 8A | 9C | 10A | 11A |
González-Zelaya et al. (2024) | 1A | 2A | 3D | 4C | 5C | 6A | 7A | 8A | 9C | 10A | 11A, 11B | 12A, 12J
Babaei and Giudici (2024) | 1A | 2A | 3C | 4C | 5F | 6C | 7A | 8C | 9C | 10E | | 12A
Badar and Fisichella (2024) | 1A | 2A | 3C | 4C | 5D | 6B | 7A | 8A | 9C | 10A | 11A | 12C, 12J, 12K
Lu and Calabrese (2023) | 1A | 2A | 3C | 4C | 5G | 6C | 7B | 8C | 9C | 10A, 10C | |
Index | Author | Title | Summary |
---|---|---|---|
1 | Napoles and Koutsoviti Koumeri (2022) | A fuzzy-rough uncertainty measure to discover bias encoded explicitly or implicitly in features of structured pattern classification datasets | The paper introduces a fuzzy-rough uncertainty measure to quantify explicit and implicit bias in pattern classification datasets, demonstrating its effectiveness through simulations on the German credit dataset and highlighting its independence from machine-learning models. |
2 | Barbierato et al. (2022) | A Methodology for Controlling Bias and Fairness in Synthetic Data Generation | The paper presents a methodology for generating synthetic datasets with controlled bias and fairness using a probabilistic network and structural equation modeling, validated through examples and requiring fewer parameters than traditional methods. |
3 | Z. Liu et al. (2023) | Active learning with fairness-aware clustering for fair classification considering multiple sensitive attributes | The paper proposes FALCONS, a novel active learning approach with fairness-aware clustering to ensure fairness across multiple sensitive attributes while maintaining classification accuracy, outperforming state-of-the-art methods on real-world datasets. |
4 | Fabris et al. (2022) | Algorithmic fairness datasets: the story so far | The paper addresses the “data documentation debt” in algorithmic fairness, caused by unclear and scattered information. The authors reviewed over two hundred datasets, creating standardized and detailed documentation for the three most popular ones—Adult, COMPAS, and German Credit—and cataloging alternatives with their relevant characteristics. They analyze the data concerning anonymization, consent, inclusion, sensitive attribute labeling, and transparency, proposing best practices for curating new resources. |
5 | Bono et al. (2021) | Algorithmic fairness in credit scoring | The paper evaluates the accuracy and fairness of traditional models versus machine-learning models for credit scoring using data from 800,000 UK borrowers, finding that while machine-learning models are more accurate, they do not fully address fairness issues related to protected subgroups. |
6 | Lee and Floridi (2021) | Algorithmic Fairness in Mortgage Lending: from Absolute Conditions to Relational Trade-offs | The paper proposes a new methodology for algorithmic fairness in mortgage lending, viewing fairness as a relational trade-off rather than an absolute condition, to better address racial discrimination. |
7 | Mishra and Kumar (2023) | An individual fairness-based outlier detection ensemble | The paper proposes an individual fairness-based member selection approach called iFselect for outlier detection in ensembles, which uses an individual fairness score (iFscore) to balance performance and fairness without relying on sensitive attributes, demonstrating superiority in fairness and comparable performance on benchmark datasets. |
8 | Giudici et al. (2024) | Artificial Intelligence risk measurement | The paper proposes an integrated AI risk management framework using Key AI Risk Indicators (KAIRI) based on the principles of Sustainability, Accuracy, Fairness, and Explainability (SAFE) to assess and manage AI risks in financial services. |
9 | Breeden and Leonova (2021) | Creating Unbiased Machine-Learning Models by Design | The paper introduces a two-step modeling procedure to create fair machine-learning models for credit granting, ensuring independence from protected class status, tested on subprime credit card data with demographic proxies from the U.S. Census. |
10 | Singh et al. (2022) | Developing a Novel Fair-Loan Classifier through a Multi-Sensitive Debiasing Pipeline: DualFair | The paper introduces DualFair, a novel bias-mitigation technique and fairness metric (AWI), which effectively addresses intersectional fairness in machine learning models, particularly in the mortgage lending domain, creating a fair-loan classifier with high fairness and accuracy. |
11 | Rizinski et al. (2022) | Ethically Responsible Machine Learning in Fintech | The paper explores the integration of machine learning in finance, highlighting ethical challenges and proposing a framework to address issues such as bias and fairness, utilizing tools like SHAP for model transparency. |
12 | Szepannek and Lübke (2021) | Facing the Challenges of Developing Fair Risk Scoring Models | The paper presents a framework for developing fair risk scoring models using causal inference and explainable machine learning, defines counterfactual fairness, and demonstrates through simulations and real-world data that fairness can be improved without significant loss of predictive accuracy. |
13 | González-Zelaya et al. (2024) | Fair and Private Data Preprocessing through Microaggregation | The paper introduces Fair-MDAV, a modular preprocessing method that simultaneously addresses privacy and fairness in machine learning, achieving competitive results on benchmark datasets with minimal loss in predictive performance. |
14 | Sonoda (2023) | Fair oversampling technique using heterogeneous clusters | The paper proposes a fair oversampling technique using heterogeneous clusters to improve the balance between fairness and utility in machine learning classifiers by generating diverse synthetic data and demonstrates its effectiveness through experiments on various datasets and classifiers. |
15 | X. Huang et al. (2022) | Fair-AdaBoost: Extending AdaBoost method to achieve fair classification | The paper proposes Fair-AdaBoost, an extension of AdaBoost for fair classification, optimized with an extended NSGA-II algorithm, demonstrating superior fairness performance on benchmark datasets while maintaining accuracy. |
16 | Badar and Fisichella (2024) | Fair-CMNB: Advancing Fairness-Aware Stream Learning with Naïve Bayes and Multi-Objective Optimization | The paper presents Fair-CMNB, an adaptation of Naïve Bayes using multi-objective optimization to mitigate discrimination and handle class imbalance in streaming data, outperforming existing methods in terms of discrimination score and balanced accuracy. |
17 | Colakovic and Karakatič (2023) | FairBoost: Boosting supervised learning for learning on multiple sensitive features | The paper introduces FairBoost, an ensemble boosting method that improves fairness in classification tasks with multiple sensitive attributes without significantly compromising accuracy, using a fairness weight to balance the trade-off between fairness and accuracy, as demonstrated by experiments on various datasets. |
18 | Kozodoi et al. (2022) | Fairness in credit scoring: Assessment, implementation, and profit implications | The paper revisits statistical fairness criteria for credit scoring, catalogs algorithmic options for incorporating fairness into machine-learning models, and empirically compares fairness processors to balance profit and fairness, recommending separation as an appropriate criterion. |
19 | Varley and Belle (2021) | Fairness in machine learning with tractable models | The paper explores the application of tractable probabilistic models, specifically sum and product networks, to ensure fairness in machine learning by developing preprocessing techniques and introducing the concept of “fairness through percentage equivalence” to address biases related to protected attributes. |
20 | Mazilu et al. (2022) | Fairness-aware Data Integration | The paper proposes a fairness-aware data integration approach using tabu search to reduce bias in datasets and classifiers, evaluated on benchmark datasets to improve fairness metrics. |
21 | Babaei and Giudici (2024) | How fair is machine learning in credit lending? | The paper investigates bias in machine-learning credit-loan models in U.S. regions, proposes fairness measurement methods using Shapley values, and suggests a propensity score matching approach to improve fairness. |
22 | Zhu et al. (2023) | Learning fair models without sensitive attributes: A generative approach | The paper proposes FairWS, a framework that estimates sensitive attributes from relevant features to train fair classifiers without the need for direct access to sensitive attributes, demonstrating better accuracy and fairness compared to existing methods. |
23 | Qiu et al. (2023) | Learning fair representations via an adversarial framework | The paper proposes a minimax adversarial framework to learn fair representations that ensure statistical parity and individual fairness, making latent representations indistinguishable between protected groups, validated through experiments on four real-world datasets. |
24 | Chen et al. (2024) | Measuring fairness in credit ratings | The paper proposes a methodology using Shapley-Lorenz values to assess fairness and explainability in credit scoring models, concluding that the model is fair across countries and sectors when using a complete dataset. |
25 | Zehlike et al. (2020) | Matching code and law: achieving algorithmic fairness with optimal transport | The paper develops a continuous fairness algorithm based on optimal transport that interpolates continuously between individual and group fairness notions, aiming to align algorithmic fairness interventions with the requirements of anti-discrimination law.
26 | Jammalamadaka and Itapu (2023) | Responsible AI in automated credit scoring systems | The paper presents a study on the use of a German credit card dataset to develop fair and responsible AI models for credit evaluation, employing techniques such as XGBoost with hyperparameter tuning and the Disparate-Impact Remover to address gender and age-related bias, ensuring an ethical and transparent analysis. |
27 | Asimit et al. (2022) | Robust Classification via Support Vector Machines | The paper develops two robust support vector machine classifiers, Single Perturbation, and Extreme Empirical Loss, to handle uncertainty in variable data, demonstrating their effectiveness through extensive numerical experiments on synthetic and real-world datasets. |
28 | Genovesi et al. (2023) | Standardizing fairness-evaluation procedures: interdisciplinary insights on machine-learning algorithms in creditworthiness assessments for small personal loans | The paper proposes minimum ethical requirements to standardize the fairness assessment in AI systems used for small personal loan credit evaluations, addressing issues of distributive and procedural justice through specific metrics and transparency measures. |
29 | Alves et al. (2023) | Survey on fairness notions and related tensions | The paper reviews notions of fairness in machine learning, discusses their tensions with privacy and accuracy, analyzes mitigation methods, and includes experimental analyses to illustrate these concepts. |
30 | Lu and Calabrese (2023) | The Cohort Shapley value to measure fairness in financing small and medium enterprises in the UK. | The paper proposes a method using Cohort Shapley values to measure fairness in small and medium enterprise financing in the UK, revealing discrimination against startups, microbusinesses, women-led businesses, and ethnic minority groups based on data from the SME Finance Monitor survey. |
31 | Purificato et al. (2023) | The Use of Responsible Artificial Intelligence Techniques in the Context of Loan Approval Processes | The paper presents a system for loan approval processes that increases trust and reliance on AI, focusing on explainability and fairness, managing the entire lifecycle of the machine-learning model within a proprietary framework. |
32 | Bantilan (2018) | Themis-ml: A Fairness-Aware Machine-Learning Interface for End-To-End Discrimination Discovery and Mitigation | The paper specifies, implements, and evaluates a fairness-aware machine-learning interface called themis-ml, designed to measure and mitigate discrimination in machine-learning models, particularly in binary classification tasks. |
33 | Brotcke (2022) | Time to Assess Bias in Machine-Learning Models for Credit Decisions | The paper discusses the implications of using machine-learning models in credit decisions for fair lending practices, reviews the evolution of fair lending assessments, highlights the challenges posed by machine-learning models, and proposes solutions to address potential discriminatory risks. |
34 | Makhlouf et al. (2024) | When causality meets fairness: A survey | The paper reviews fairness notions based on causality in machine learning, providing guidelines for their application in real-world scenarios and classifying them according to their implementation difficulty, as per Pearl’s causality ladder. |
Country | Number of Articles |
---|---|
Germany | 6 |
Italy | 6 |
United States | 5 |
United Kingdom | 4 |
China | 3 |
France | 3 |
India | 2 |
Slovenia | 1 |
Belgium | 1 |
Mexico | 1 |
Japan | 1 |
North Macedonia | 1 |
Total | 34 |
Credit Decision Dataset | Papers | Most Frequent Sensitive Attributes | Most Used Fairness Metrics | Mitigation Stage |
---|---|---|---|---|
Mortgage data | 2 | Race | Equal Opportunity | Post-processing |
Credit card data | 4 | Gender; Race | Statistical Parity; Equalized Odds | Preprocessing |
“Credit mix”/synthetic or multi-source datasets | 20 | Gender; Race; Ethnicity | Statistical Parity (14/20); Equal Opportunity (6/20) | Preprocessing (15/20) |
Not specified/other | 8 | - | - | - |
Total | 34 |
Sensitive Attribute | SP (11A) | EOdds (11B) | EOpp (11C) | Other |
---|---|---|---|---|
Gender | 10 | 3 | 2 | 1 |
Race | 11 | 5 | 4 | 2 |
Ethnicity | 5 | 1 | 3 | - |
Age | 2 | - | - | 1 |
Mitigation Method | Pre Proc. | In Proc. | Post Proc. |
---|---|---|---|
Relabeling | 4 | - | - |
Synthetic Generation | 5 | 1 | - |
Fair Representation | 6 | 2 | - |
Constraint Optim. | - | 6 | 1 |
Regularization | - | 2 | - |
Calibration | - | - | 4 |
Thresholding | - | - | 3 |
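To make the dominant preprocessing family concrete, the sketch below implements a minimal relabeling step in the spirit of the methods coded 5A above: borderline labels are flipped, guided by an auxiliary ranker score, until the positive rates of the two groups match. This is an illustrative sketch under our own naming (`relabel_for_parity` and its arguments are assumptions), not the implementation of any reviewed paper.

```python
import numpy as np

def relabel_for_parity(y, group, score):
    """Minimal relabeling ("massaging") sketch. Flips the borderline labels,
    ranked by an auxiliary model score, until the positive (approval) rate of
    the disadvantaged group matches that of the advantaged group.
    y: binary labels; group: binary sensitive attribute (0/1);
    score: output of any ranker (higher = more creditworthy)."""
    y = y.copy()
    rates = [y[group == g].mean() for g in (0, 1)]
    disadv = int(np.argmin(rates))  # group with the lower approval rate
    adv = 1 - disadv
    n_d, n_a = (group == disadv).sum(), (group == adv).sum()
    # Flipping f labels up in one group and f down in the other equalizes the
    # rates when rate_d + f/n_d == rate_a - f/n_a; solve for f.
    f = int(round((rates[adv] - rates[disadv]) / (1.0 / n_d + 1.0 / n_a)))
    # Promote the highest-scored rejected applicants of the disadvantaged group...
    neg = np.where((group == disadv) & (y == 0))[0]
    y[neg[np.argsort(score[neg])[::-1][:f]]] = 1
    # ...and demote the lowest-scored approved applicants of the advantaged group.
    pos = np.where((group == adv) & (y == 1))[0]
    y[pos[np.argsort(score[pos])[:f]]] = 0
    return y
```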
Metric | Formal Criterion | Typical Strength/Use Case | Key Limitation | Papers
---|---|---|---|---
Statistical Parity (11A) | P(Ŷ = 1 ∣ A = a) equal across groups | Simplest check; aligns with disparate-impact regulation. | Ignores error distribution; may hurt utility. | 16 (47%)
Equalized Odds (11B) | TPR and FPR equal across groups | Controls both error types; popular in credit scoring. | Unattainable with very different base rates. | 8 (24%)
Equal Opportunity (11C) | TPR equal across groups | Minimizes false-negative harm (loan denial). | Leaves FPR unconstrained. | 7 (21%)
Fairness Through Awareness (11G) | Similar individuals receive similar predictions: D(f(x), f(x′)) ≤ d(x, x′) | Individual-level guarantee when a task-specific distance is defined. | Requires an explicit similarity metric. | 8 (24%)
Fairness Through Unawareness (11F) | Sensitive features excluded from the model | Easy to apply; useful as a baseline. | Fails when proxies leak sensitive info. | 2 (6%)
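To ground these criteria, the following snippet computes the gaps for the three group metrics most used in the corpus from binary predictions and a binary sensitive attribute. It is a minimal sketch following the standard definitions above; the function name is ours, and no reviewed paper's code is reproduced.

```python
import numpy as np

def group_fairness_gaps(y_true, y_pred, group):
    """Gaps between two groups (group = 0/1) for the three most-used criteria."""
    def positive_rate(mask):
        return y_pred[mask].mean()

    g0, g1 = group == 0, group == 1
    # 11A Statistical parity: difference in overall approval rates.
    sp_gap = abs(positive_rate(g0) - positive_rate(g1))
    # 11C Equal opportunity: difference in true-positive rates.
    tpr_gap = abs(positive_rate(g0 & (y_true == 1)) - positive_rate(g1 & (y_true == 1)))
    # 11B Equalized odds: both the TPR gap and the FPR gap must be small.
    fpr_gap = abs(positive_rate(g0 & (y_true == 0)) - positive_rate(g1 & (y_true == 0)))
    return {"statistical_parity": sp_gap,
            "equal_opportunity": tpr_gap,
            "equalized_odds": max(tpr_gap, fpr_gap)}
```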
Paper | Perf. Metric | Base | Mitig. | Δ | Fairness Impact
---|---|---|---|---|---
X. Huang et al. (2022) | AUC | 0.76 | 0.74 | −0.02 | EO gap ↓ 67% |
Colakovic and Karakatič (2023) | F1 | 0.846 | 0.842 | −0.004 | SP ratio ↑ 31% |
Lee and Floridi (2021) | FNR | 0.12 | 0.09 | ↓ 25% | EOpp gap ↓ 78% |
Mishra and Kumar (2023) | Accuracy | 0.89 | 0.88 | −0.01 | IND fairness ↑ 40% |
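A before/after comparison in the spirit of this table could be assembled as sketched below. The inputs are placeholders, `group_fairness_gaps` is the illustrative helper from the previous sketch, and the statistical-parity gap stands in for whichever fairness metric a given study reports.

```python
from sklearn.metrics import roc_auc_score

def tradeoff_report(y_true, group, base_scores, mitigated_scores, threshold=0.5):
    """Compare a baseline and a mitigated model on AUC and on the
    statistical-parity gap, mirroring the Base / Mitig. / Δ columns above."""
    report = {}
    for name, scores in (("base", base_scores), ("mitigated", mitigated_scores)):
        y_pred = (scores >= threshold).astype(int)
        report[name] = {
            "auc": roc_auc_score(y_true, scores),
            "sp_gap": group_fairness_gaps(y_true, y_pred, group)["statistical_parity"],
        }
    report["delta_auc"] = report["mitigated"]["auc"] - report["base"]["auc"]
    report["sp_gap_reduction"] = 1.0 - report["mitigated"]["sp_gap"] / report["base"]["sp_gap"]
    return report
```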
Research Gaps | Authors |
---|---|
Limited Literature on Fair ML in Credit Scoring | (Kozodoi et al., 2022). |
Lack of Comprehensive Empirical Analysis | (Bono et al., 2021; Kozodoi et al., 2022). |
Challenge of Model Explainability and Transparency | (Giudici et al., 2024; Rizinski et al., 2022; Szepannek & Lübke, 2021). |
Lack of Standardization in Fairness Metrics | (Bono et al., 2021; Genovesi et al., 2023; Lee & Floridi, 2021). |
Difficulty in Capturing Implicit Bias and Estimating Sensitive Attributes | (Napoles & Koutsoviti Koumeri, 2022; Zhu et al., 2023). |
Absence of Multiple Sensitive Attributes in Fair Classification | (Colakovic & Karakatič, 2023; Z. Liu et al., 2023). |
Limited Methods for Data Generation and Integration | (Barbierato et al., 2022; Mazilu et al., 2022). |
Increase in Computational Complexity | (Napoles & Koutsoviti Koumeri, 2022; Sonoda, 2023). |
Fairness in Small Groups | (Mishra & Kumar, 2023; Zehlike et al., 2020).
Limited Interdisciplinary Insights on Fairness in AI Systems | (Genovesi et al., 2023; Zehlike et al., 2020). |
Legal, Ethical, and Policy Implications | (Lee & Floridi, 2021; Rizinski et al., 2022; Zehlike et al., 2020). |