Ethics and Trustworthiness of AI for Predicting the Risk of Recidivism: A Systematic Literature Review
Abstract
1. Introduction
2. Background Study
2.1. Trust
2.2. Trustworthy AI
2.3. An Overview of the Seven Requirements for Trustworthy AI Proposed by the European Commission
2.3.1. Human Agency and Oversight
2.3.2. Technical Robustness and Safety
2.3.3. Diversity, Non-Discrimination, Fairness
2.3.4. Accountability
2.3.5. Transparency
2.3.6. Privacy and Data Governance
2.3.7. Societal and Environmental Wellbeing
3. Related Works
4. Methodology
4.1. Review Technique
4.2. Research Questions
- Q1: Are there works proposing the use of AI for predicting recidivism? How many?
- Q2: How many of these works considered the ethics and trustworthiness of the AI system?
- Q3: What are the essential requirements of trustworthy AI for predicting the risks of recidivism?
- Q4: What challenges hinder the development of trustworthy AI for predicting the risk of recidivism?
4.3. Search Strategy
4.4. Data Extraction
4.5. Quantitative Analysis
5. Proposed Extended Requirements for Trustworthy AI in the Criminal Judiciary System
5.1. Consistency
5.2. Reliability
5.3. Explainability
5.4. Interpretability
6. Extended Analysis
6.1. Fairness
Ref | Purpose of Paper | Dataset | Algorithm Model Used or Studied | Evaluation Metrics | Location |
---|---|---|---|---|---|
[4] | Analyzed the performance of interpretable models regarding prediction ability, sparsity, and fairness | Broward County, Florida dataset & Kentucky dataset | LR, SVM, RF, DT, CART, Explainable Boost Machine | FPR, FNR, Accuracy | USA |
[73] | Investigated recidivism risk prediction with general-purpose ML algorithms that improve performance at the expense of not satisfying relevant group fairness metrics | Juvenile Justice System of Catalonia | LR, Multi-Layer Perceptron, SVM with a Linear kernel, KNN, RF, DT, NB | Balanced accuracy, TPR, TNR, AUC-ROC | Spain |
[81] | Studied and designed Singular Race Models for recidivism and their effects on accuracy and bias | Florida Department of Corrections (FDOC) & Florida Department of Law Enforcement | KNN, RF, AdaBoost, DT, SVM, ANNs | Accuracy, FPR, FNR | USA |
[51] | Introduced a new fairness measure and an enhanced feature-rich representation that permits the selection of the lowest bias models | Recidivism of Prisoners Released in 1994 | ANN | TP, TN, FP, FN, PPV | USA |
[74] | Investigated the use of Mugshots to address racial disparity | Miami-Dade County Clerk of the Court | MTCNN | Accuracy | USA |
[83] | Proposed a method called CAPE to solve fair classification problems in the presence of prior probability shifts. | Broward County, Florida (COMPAS) and MEPS | CAPE | Prevalence Difference (PD), Proportional Equality (PE) | USA |
[82] | Compared three strategies for debiasing algorithms and how they affect the fairness trade-off when predicting recidivism. | Federal Probation System—Post Conviction Risk Assessment (2009–2019) | Post Conviction Risk Assessment (PCRA) algorithms | AUC, PPV, FPR, FNR | USA |
[54] | Illustrated the construction of officer risk assessment modelling using the demographic, network, and Hawkes point process features. | Use of Force Complaint data from the Chicago Police Department | Boosted Decision Tree, Feed-Forward NN, Auto Machine Learning | AUC, Logloss, RMSE, MAE | USA |
[84] | Introduced three fairness definitions that satisfy intersectional fairness, desiderata, differential fairness and its bias amplification | Broward County, FL (COMPAS), UCI Adult data repository (1994) | Neural Network (ADAM) | Accuracy, F1-Score, AUC-ROC | USA |
[53] | Introduced a novel probabilistic formulation of data preprocessing for reducing discrimination | Broward County, Florida (COMPAS) | LR, RF | AUC | USA |
[85] | Developed an ML model predicting the criminal offence type committed in a large transdiagnostic sample of psychiatry patients | Ontario Review Board (Forensic mental health system) | RF, SVM (Radial Kernel), XGBoost | Accuracy, AUC-ROC, Sensitivity, Specificity, Confidence Interval | Canada |
[23] | Applied a fairness criterion originating in educational and psychological testing to assess the fairness of recidivism prediction instruments | Broward County, Florida | N/A | FPR, FNR, PPV | USA |
[19] | Analyzed ProPublica on the risk assessment tool (COMPAS) | Broward County, Florida (COMPAS) | LR | AUC-ROC, FN, FP, Sensitivity, Specificity, PPV, NPV | USA |
[68] | Presented a technique for fairness auditing of black-box models using a variety of publicly available datasets and models | National Archive of Criminal Justice Data | SVM, Feedforward NN | Gradient Feature Auditing (GFA) | USA |
[86] | Compared the fairness predictions of risk assessment tools and humans | Broward County, Florida (COMPAS) | LR, Nonlinear SVM, COMPAS software | Accuracy, FP, FN | USA |
[87] | Proposed an approach to increase recidivism prediction accuracy while reducing race-based bias | Recidivism of Prisoners Released in 1994 | XGBoost, LR, SVM | Accuracy, FPR parity, FPR, FNR, TPR, TNR, Monte Carlo cross-validation | USA |
[88] | Research on human interactions with risk assessments through a controlled experimental study on Amazon Mechanical Turk | U.S. Department of Justice | Gradient Boosted Trees | AUC-ROC, Accuracy, FPR | USA |
[24] | Studied and compared the accuracy and fairness of risk assessment tools and humans in predicting recidivism risk | Broward County, FL (COMPAS) | LDA, LR, Non-Linear SVM, COMPAS | AUC-ROC, Accuracy, FPR, FNR | USA |
[16] | Addressed definitions and metrics for fairness that exist in the literature to optimize public policy problems | City Attorney’s Case Management System | Regularised LR, Decision Trees, RF, Extra Tree Classifiers | FPR, FDR, FOR, FNR, FP, FN, Recall | USA |
[89] | Designed a system that algorithmically redacts race-related information to reduce potential bias | American District Attorney’s Office | Gradient-Boosted Decision Tree | Accuracy, AUC-ROC, FPR, FNR | USA |
[90] | Addressed the shortcomings of the bias-error trade-off in AI algorithms | 1994 Census Income dataset, German Credit Dataset | Adaptive Boosting, SVM, LR | FPR, FNR | USA |
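Many of the fairness analyses catalogued above reduce to comparing error rates, particularly FPR and FNR, across demographic groups. A minimal sketch of that comparison (the data, group labels, and predictions here are synthetic and purely illustrative, not drawn from any surveyed dataset):

```python
import numpy as np

def group_rates(y_true, y_pred, groups):
    """Compute per-group false positive and false negative rates."""
    rates = {}
    for g in np.unique(groups):
        mask = groups == g
        yt, yp = y_true[mask], y_pred[mask]
        tp = np.sum((yp == 1) & (yt == 1))
        tn = np.sum((yp == 0) & (yt == 0))
        fp = np.sum((yp == 1) & (yt == 0))
        fn = np.sum((yp == 0) & (yt == 1))
        rates[g] = {"FPR": fp / (fp + tn), "FNR": fn / (fn + tp)}
    return rates

# Toy example: two groups with identical base rates but unequal error rates,
# the pattern at the centre of the COMPAS fairness debate.
y_true = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y_pred = np.array([1, 0, 1, 1, 0, 0, 0, 1])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(group_rates(y_true, y_pred, groups))
```

In this toy data, group A has FPR 0.5 and FNR 0.0 while group B has FPR 0.0 and FNR 0.5; equal overall accuracy can therefore coexist with unequal group-level error rates, which is why several surveyed works report FPR/FNR per group rather than accuracy alone.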
6.2. Interpretability
6.3. Transparency
6.4. Other Requirements for Trustworthy AI
7. Issues and Challenges
7.1. Datasets Used
7.2. Standardization
7.3. Metrics
7.4. Propensity to Trust: Private Sector vs. Public Sector
8. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
TP | True Positive |
TN | True Negative |
FP | False Positive |
FN | False Negative |
FPR | False Positive Rate |
FNR | False Negative Rate |
TNR | True Negative Rate |
TPR | True Positive Rate |
PPV | Positive Predictive Value |
NPV | Negative Predictive Value |
AUC-ROC | Area Under the ROC Curve |
MAE | Mean Absolute Error |
RMSE | Root Mean Square Error |
ANN | Artificial Neural Networks |
NB | Naive Bayes |
LR | Logistic Regression |
SVM | Support Vector Machines |
CART | Classification and Regression Trees |
MTCNN | Multi-task Cascaded Convolutional Network |
RF | Random Forests |
CAPE | Combinatorial Algorithm for Proportional Equality |
XGBoost | eXtreme Gradient Boosting |
DT | Decision Trees |
LDA | Linear Discriminant Analysis |
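Most of the rate and predictive-value abbreviations above are simple functions of the four confusion-matrix counts (TP, TN, FP, FN). A quick reference sketch with arbitrary example counts:

```python
def confusion_metrics(tp, tn, fp, fn):
    """Derive the evaluation metrics listed above from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),  # sensitivity / recall
        "TNR": tn / (tn + fp),  # specificity
        "FPR": fp / (fp + tn),  # equals 1 - TNR
        "FNR": fn / (fn + tp),  # equals 1 - TPR
        "PPV": tp / (tp + fp),  # precision
        "NPV": tn / (tn + fn),
    }

# Example counts chosen only for illustration.
print(confusion_metrics(tp=40, tn=30, fp=10, fn=20))
```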
References
- Sushina, T.; Sobenin, A. Artificial Intelligence in the Criminal Justice System: Leading Trends and Possibilities. In Proceedings of the 6th International Conference on Social, Economic, and Academic Leadership (ICSEAL-6-2019), Prague, Czech Republic, 13–14 December 2019; pp. 432–437.
- Kovalchuk, O.; Karpinski, M.; Banakh, S.; Kasianchuk, M.; Shevchuk, R.; Zagorodna, N. Prediction Machine Learning Models on Propensity Convicts to Criminal Recidivism. Information 2023, 14, 161.
- Berk, R.; Bleich, J. Forecasts of violence to inform sentencing decisions. J. Quant. Criminol. 2014, 30, 79–96.
- Wang, C.; Han, B.; Patel, B.; Rudin, C. In pursuit of interpretable, fair and accurate machine learning for criminal recidivism prediction. J. Quant. Criminol. 2023, 39, 519–581.
- Mohler, G.; Porter, M.D. A note on the multiplicative fairness score in the NIJ recidivism forecasting challenge. Crime Sci. 2021, 10, 17.
- Cadigan, T.P.; Lowenkamp, C.T. Implementing risk assessment in the federal pretrial services system. Fed. Probat. 2011, 75, 30.
- Green, B. The false promise of risk assessments: Epistemic reform and the limits of fairness. In Proceedings of the FAT* ’20: 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; pp. 594–606.
- Lo Piano, S. Ethical principles in machine learning and artificial intelligence: Cases from the field and possible ways forward. Humanit. Soc. Sci. Commun. 2020, 7, 9.
- Desmarais, S.L.; Johnson, K.L.; Singh, J.P. Performance of recidivism risk assessment instruments in US correctional settings. Psychol. Serv. 2016, 13, 206.
- Green, B. “Fair” risk assessments: A precarious approach for criminal justice reform. In Proceedings of the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning, New York, NY, USA, 23–24 February 2018; pp. 1–5.
- O’Loughlin, T.; Bukowitz, R. A new approach toward social licensing of data analytics in the public sector. Aust. J. Soc. Issues 2021, 56, 198–212.
- Bickley, S.J.; Torgler, B. Cognitive architectures for artificial intelligence ethics. AI Soc. 2023, 38, 501–519.
- Chugh, N. Risk assessment tools on trial: Lessons learned for “Ethical AI” in the criminal justice system. In Proceedings of the 2021 IEEE International Symposium on Technology and Society (ISTAS), Waterloo, ON, Canada, 28–31 October 2021; pp. 1–5.
- Hartmann, K.; Wenzelburger, G. Uncertainty, risk and the use of algorithms in policy decisions: A case study on criminal justice in the USA. Policy Sci. 2021, 54, 269–287.
- Alikhademi, K.; Drobina, E.; Prioleau, D.; Richardson, B.; Purves, D.; Gilbert, J.E. A review of predictive policing from the perspective of fairness. Artif. Intell. Law 2021, 7, 1–17.
- Rodolfa, K.T.; Salomon, E.; Haynes, L.; Mendieta, I.H.; Larson, J.; Ghani, R. Case study: Predictive fairness to reduce misdemeanor recidivism through social service interventions. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; pp. 142–153.
- Hamilton, M. The sexist algorithm. Behav. Sci. Law 2019, 37, 145–157.
- Dieterich, W.; Mendoza, C.; Brennan, T. COMPAS Risk Scales: Demonstrating Accuracy Equity and Predictive Parity; Northpointe Inc.: Traverse City, MI, USA, 2016.
- Flores, A.W.; Bechtel, K.; Lowenkamp, C.T. False positives, false negatives, and false analyses: A rejoinder to “Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks.” Fed. Probat. 2016, 80, 38.
- Hurlburt, G. How much to trust artificial intelligence? IT Prof. 2017, 19, 7–11.
- Li, L.; Zhao, L.; Nai, P.; Tao, X. Charge prediction modeling with interpretation enhancement driven by double-layer criminal system. World Wide Web 2022, 25, 381–400.
- Zhang, Y.; Zhou, F.; Li, Z.; Wang, Y.; Chen, F. Fair Representation Learning with Unreliable Labels. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Valencia, Spain, 25–27 April 2023; pp. 4655–4667.
- Chouldechova, A. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data 2017, 5, 153–163.
- Dressel, J.J. Accuracy and Racial Biases of Recidivism Prediction Instruments. Bachelor’s Thesis, Dartmouth College, Hanover, NH, USA, 2017.
- Kaur, D.; Uslu, S.; Rittichier, K.J.; Durresi, A. Trustworthy artificial intelligence: A review. ACM Comput. Surv. (CSUR) 2022, 55, 1–38.
- Emaminejad, N.; Akhavian, R. Trustworthy AI and robotics: Implications for the AEC industry. Autom. Constr. 2022, 139, 104298.
- Ma, J.; Schneider, L.; Lapuschkin, S.; Achtibat, R.; Duchrau, M.; Krois, J.; Schwendicke, F.; Samek, W. Towards Trustworthy AI in Dentistry. J. Dent. Res. 2022, 101, 1263–1268.
- Markus, A.F.; Kors, J.A.; Rijnbeek, P.R. The role of explainability in creating trustworthy artificial intelligence for health care: A comprehensive survey of the terminology, design choices, and evaluation strategies. J. Biomed. Inform. 2021, 113, 103655.
- Mora-Cantallops, M.; Sánchez-Alonso, S.; García-Barriocanal, E.; Sicilia, M.A. Traceability for trustworthy AI: A review of models and tools. Big Data Cogn. Comput. 2021, 5, 20.
- Kaur, D.; Uslu, S.; Durresi, A. Requirements for trustworthy artificial intelligence—A review. In Advances in Networked-Based Information Systems; Barolli, L., Li, K., Enokido, T., Takizawa, M., Eds.; NBiS 2020; Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2020; Volume 1264.
- Vining, R.; McDonald, N.; McKenna, L.; Ward, M.E.; Doyle, B.; Liang, J.; Hernandez, J.; Guilfoyle, J.; Shuhaiber, A.; Geary, U.; et al. Developing a framework for trustworthy AI-supported knowledge management in the governance of risk and change. Lect. Notes Comput. Sci. 2022, 13516, 318–333.
- Toreini, E.; Aitken, M.; Coopamootoo, K.; Elliott, K.; Zelaya, C.G.; Van Moorsel, A. The relationship between trust in AI and trustworthy machine learning technologies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; pp. 272–283.
- Vincent-Lancrin, S.; van der Vlies, R. Trustworthy artificial intelligence (AI) in education: Promises and challenges. In OECD Education Working Papers; OECD Publishing: Paris, France, 2020.
- Ryan, M. In AI we trust: Ethics, artificial intelligence, and reliability. Sci. Eng. Ethics 2020, 26, 2749–2767.
- Connolly, R. Trust in commercial and personal transactions in the digital age. In The Oxford Handbook of Internet Studies; Oxford University Press: Oxford, UK, 2013; pp. 262–282.
- Beshi, T.D.; Kaur, R. Public trust in local government: Explaining the role of good governance practices. Public Organ. Rev. 2020, 20, 337–350.
- Smit, D.; Eybers, S.; Smith, J. A Data Analytics Organisation’s Perspective on Trust and AI Adoption. In Proceedings of the Southern African Conference for Artificial Intelligence Research, Virtual, 6–10 December 2021; Springer: Berlin/Heidelberg, Germany, 2021; Volume 1551, pp. 47–60.
- Rendtorff, J.D. The significance of trust for organizational accountability: The legacy of Karl Polanyi. In Proceedings of the 3rd Emes-Polanyi Selected Conference Papers, Roskilde, Denmark, 16–17 April 2018; Roskilde University: Roskilde, Denmark, 2018.
- Thiebes, S.; Lins, S.; Sunyaev, A. Trustworthy artificial intelligence. Electron. Mark. 2021, 31, 447–464.
- Liu, K.; Tao, D. The roles of trust, personalization, loss of privacy, and anthropomorphism in public acceptance of smart healthcare services. Comput. Hum. Behav. 2022, 127, 107026.
- Sutrop, M. Should we trust artificial intelligence? Trames A J. Humanit. Soc. Sci. 2019, 23, 499–522.
- High-Level Expert Group on Artificial Intelligence. Ethics Guidelines for Trustworthy AI; European Commission: Brussels, Belgium, 2019. Available online: https://digital-strategy.ec.europa.eu/en/policies/expert-group-ai (accessed on 3 July 2023).
- OECD. Tools for Trustworthy AI: A Framework to Compare Implementation Tools for Trustworthy AI Systems; OECD Digital Economy Papers, No. 312; OECD Publishing: Paris, France, 2021.
- Floridi, L. Establishing the rules for building trustworthy AI. Nat. Mach. Intell. 2019, 1, 261–262.
- Janssen, M.; Brous, P.; Estevez, E.; Barbosa, L.S.; Janowski, T. Data governance: Organizing data for trustworthy Artificial Intelligence. Gov. Inf. Q. 2020, 37, 101493.
- Giovanola, B.; Tiribelli, S. Beyond bias and discrimination: Redefining the AI ethics principle of fairness in healthcare machine-learning algorithms. AI Soc. 2023, 38, 549–563.
- Eckhouse, L.; Lum, K.; Conti-Cook, C.; Ciccolini, J. Layers of bias: A unified approach for understanding problems with risk assessment. Crim. Justice Behav. 2019, 46, 185–209.
- ISO/IEC TR 24027:2021(E); Information Technology–Artificial Intelligence (AI)—Bias in AI Systems and AI Aided Decision Making. International Organization for Standardization: Geneva, Switzerland, 2021.
- Ireland, L. Who errs? Algorithm aversion, the source of judicial error, and public support for self-help behaviors. J. Crime Justice 2020, 43, 174–192.
- Berk, R. Accuracy and fairness for juvenile justice risk assessments. J. Empir. Leg. Stud. 2019, 16, 175–194.
- Jain, B.; Huber, M.; Elmasri, R.; Fegaras, L. Using bias parity score to find feature-rich models with least relative bias. Technologies 2020, 8, 68.
- Oatley, G.C. Themes in data mining, big data, and crime analytics. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2022, 12, e1432.
- du Pin Calmon, F.; Wei, D.; Vinzamuri, B.; Ramamurthy, K.N.; Varshney, K.R. Data pre-processing for discrimination prevention: Information-theoretic optimization and analysis. IEEE J. Sel. Top. Signal Process. 2018, 12, 1106–1119.
- Khorshidi, S.; Carter, J.G.; Mohler, G. Repurposing recidivism models for forecasting police officer use of force. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Busan, Republic of Korea, 19–22 February 2020; pp. 3199–3203.
- Petersen, E.; Ganz, M.; Holm, S.H.; Feragen, A. On (assessing) the fairness of risk score models. arXiv 2023, arXiv:2302.08851.
- Berk, R.; Heidari, H.; Jabbari, S.; Kearns, M.; Roth, A. Fairness in criminal justice risk assessments: The state of the art. Sociol. Methods Res. 2021, 50, 3–44.
- Grgic-Hlaca, N.; Redmiles, E.M.; Gummadi, K.P.; Weller, A. Human perceptions of fairness in algorithmic decision making: A case study of criminal risk prediction. In Proceedings of the 2018 World Wide Web Conference, Lyon, France, 23–27 April 2018; pp. 903–912.
- McKay, C. Predicting risk in criminal procedure: Actuarial tools, algorithms, AI and judicial decision-making. Curr. Issues Crim. Justice 2020, 32, 22–39.
- Zodi, Z. Algorithmic explainability and legal reasoning. Theory Pract. Legis. 2022, 10, 67–92.
- Mökander, J.; Juneja, P.; Watson, D.S.; Floridi, L. The US Algorithmic Accountability Act of 2022 vs. The EU Artificial Intelligence Act: What can they learn from each other? Minds Mach. 2022, 32, 751–758.
- Figueroa-Armijos, M.; Clark, B.B.; da Motta Veiga, S.P. Ethical perceptions of AI in hiring and organizational trust: The role of performance expectancy and social influence. J. Bus. Ethics 2022, 186, 179–197.
- Anshari, M.; Hamdan, M.; Ahmad, N.; Ali, E.; Haidi, H. COVID-19, artificial intelligence, ethical challenges and policy implications. AI Soc. 2023, 38, 707–720.
- Falco, G. Participatory AI: Reducing AI Bias and Developing Socially Responsible AI in Smart Cities. In Proceedings of the 2019 IEEE International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC), New York, NY, USA, 1–3 August 2019; pp. 154–158.
- Chiang, C.W.; Lu, Z.; Li, Z.; Yin, M. Are Two Heads Better Than One in AI-Assisted Decision Making? Comparing the Behavior and Performance of Groups and Individuals in Human-AI Collaborative Recidivism Risk Assessment. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany, 23–28 April 2023; pp. 1–18.
- Pai, M.; McCulloch, M.; Colford, J. Systematic Review: A Road Map, version 2.2; Systematic Reviews Group, UC Berkeley: Berkeley, CA, USA, 2004.
- Kitchenham, B. Procedures for Performing Systematic Reviews; Keele University: Keele, UK, 2004; Volume 33, pp. 1–26.
- Ritter, N. Predicting recidivism risk: New tool in Philadelphia shows great promise. Natl. Inst. Justice J. 2013, 271, 4–13.
- Adler, P.; Falk, C.; Friedler, S.A.; Nix, T.; Rybeck, G.; Scheidegger, C.; Smith, B.; Venkatasubramanian, S. Auditing black-box models for indirect influence. Knowl. Inf. Syst. 2018, 54, 95–122.
- Harada, T.; Nomura, K.; Shimada, H.; Kawakami, N. Development of a risk assessment tool for Japanese sex offenders: The Japanese Static-99. In Neuropsychopharmacology Reports; John Wiley & Sons: Victoria, Australia, 2023.
- Miller, C.S.; Kimonis, E.R.; Otto, R.K.; Kline, S.M.; Wasserman, A.L. Reliability of risk assessment measures used in sexually violent predator proceedings. Psychol. Assess. 2012, 24, 944.
- McPhee, J.; Heilbrun, K.; Cubbon, D.N.; Soler, M.; Goldstein, N.E. What’s risk got to do with it: Judges’ and probation officers’ understanding and use of juvenile risk assessments in making residential placement decisions. Law Hum. Behav. 2023, 47, 320.
- Berk, R. An impact assessment of machine learning risk forecasts on parole board decisions and recidivism. J. Exp. Criminol. 2017, 13, 193–216.
- Miron, M.; Tolan, S.; Gómez, E.; Castillo, C. Evaluating causes of algorithmic bias in juvenile criminal recidivism. Artif. Intell. Law 2021, 29, 111–147.
- Dass, R.K.; Petersen, N.; Omori, M.; Lave, T.R.; Visser, U. Detecting racial inequalities in criminal justice: Towards an equitable deep learning approach for generating and interpreting racial categories using mugshots. AI Soc. 2022, 38, 897–918.
- Liu, Y.Y.; Yang, M.; Ramsay, M.; Li, X.S.; Coid, J.W. A comparison of logistic regression, classification and regression tree, and neural networks models in predicting violent re-offending. J. Quant. Criminol. 2011, 27, 547–573.
- Smith, B. Auditing Deep Neural Networks to Understand Recidivism Predictions. Ph.D. Thesis, Haverford College, Haverford, PA, USA, 2016.
- Waggoner, P.D.; Macmillen, A. Pursuing open-source development of predictive algorithms: The case of criminal sentencing algorithms. J. Comput. Soc. Sci. 2022, 5, 89–109.
- Wijenayake, S.; Graham, T.; Christen, P. A decision tree approach to predicting recidivism in domestic violence. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Melbourne, VIC, Australia, 3–6 June 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–15.
- Yuan, D. Case Study of Criminal Law Based on Multi-task Learning. In Proceedings of the 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE), Beijing, China, 23–25 October 2020; pp. 98–103.
- Zeng, J.; Ustun, B.; Rudin, C. Interpretable classification models for recidivism prediction. J. R. Stat. Soc. Ser. A Stat. Soc. 2017, 180, 689–722.
- Jain, B.; Huber, M.; Fegaras, L.; Elmasri, R.A. Singular race models: Addressing bias and accuracy in predicting prisoner recidivism. In Proceedings of the 12th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Rhodes, Greece, 5–7 June 2019; pp. 599–607.
- Skeem, J.; Lowenkamp, C. Using algorithms to address trade-offs inherent in predicting recidivism. Behav. Sci. Law 2020, 38, 259–278.
- Biswas, A.; Mukherjee, S. Ensuring fairness under prior probability shifts. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, Virtual, 19–21 May 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 414–424.
- Foulds, J.R.; Islam, R.; Keya, K.N.; Pan, S. An intersectional definition of fairness. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1918–1921.
- Watts, D.; Moulden, H.; Mamak, M.; Upfold, C.; Chaimowitz, G.; Kapczinski, F. Predicting offences among individuals with psychiatric disorders: A machine learning approach. J. Psychiatr. Res. 2021, 138, 146–154.
- Dressel, J.; Farid, H. The accuracy, fairness, and limits of predicting recidivism. Sci. Adv. 2018, 4, eaao5580.
- Jain, B.; Huber, M.; Elmasri, R.A.; Fegaras, L. Reducing race-based bias and increasing recidivism prediction accuracy by using past criminal history details. In Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, Corfu, Greece, 30 June–3 July 2020; pp. 1–8.
- Green, B.; Chen, Y. Disparate interactions: An algorithm-in-the-loop analysis of fairness in risk assessments. In Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, GA, USA, 29–31 January 2019; pp. 90–99.
- Chohlas-Wood, A.; Nudell, J.; Yao, K.; Lin, Z.; Nyarko, J.; Goel, S. Blind justice: Algorithmically masking race in charging decisions. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 7–8 February 2020; Association for Computing Machinery: New York, NY, USA, 2021; pp. 35–45.
- Fish, B.; Kun, J.; Lelkes, Á.D. A confidence-based approach for balancing fairness and accuracy. In Proceedings of the 2016 SIAM International Conference on Data Mining, Miami, FL, USA, 5–7 May 2016; pp. 144–152.
- Zhang, S.; Yan, G.; Li, Y.; Liu, J. Evaluation of judicial imprisonment term prediction model based on text mutation. In Proceedings of the 2019 IEEE 19th International Conference on Software Quality, Reliability and Security Companion (QRS-C), Sofia, Bulgaria, 22–26 July 2019; pp. 62–65.
- Michael, M.; Farayola, I.; Tal, S.T.; Connolly, R.; Bendechache, M. Fairness of AI in Predicting the Risk of Recidivism: Review and Phase Mapping of AI Fairness Techniques. In Proceedings of the 18th International Conference on Availability, Reliability and Security (ARES 2023), Benevento, Italy, 29 August–1 September 2023.
Inclusion Criteria | Exclusion Criteria |
---|---|
Full Text | Duplicated Studies |
Articles written in English relating to the Trustworthiness of AI when predicting the risk of recidivism | Non-English |
Published between 2010 and 2023 | Published before 2010 |
Published in Conferences, Journals, or Books | Incomplete studies |
Ref | Purpose of Paper | Dataset | Algorithm Model Used or Studied | Evaluation Metrics | Location |
---|---|---|---|---|---|
[4] | Analyzed the performance of interpretable models regarding prediction ability, sparsity and fairness | Broward County, Florida dataset & Kentucky dataset | LR, SVM, RF, DT, CART, Explainable Boost Machine | FPR, FNR, Accuracy | USA |
[80] | Presented interpretable binary classification models to predict general recidivism as well as crime-specific recidivism | Recidivism of Prisoners Released in 1994 | CART, LR, SVM, Stochastic Gradient Boosting (Adaboost) | TPR, FPR, AUC-ROC | USA |
[21] | Achieved multi-granularity inference of legal charges by obtaining subjective and objective elements from the fact descriptions of legal cases | CAIL 2018 | SVM, Deep Pyramid CNN, ELECTRA, QAjudge | Macro-Precision, Macro-Recall, Macro-F1 | China |
[74] | Investigated interpretable models through the use of mugshots to address racial bias | Miami-Dade County Clerk of the Court | MTCNN | Accuracy | USA |
[79] | Used multi-task learning to conduct joint training with the task of crime prediction | CAIL 2018 | LibSVM, LSTM, Multi-Label-KNN, BiLSTM | Precision, Recall, F1-measure, F-macro, F-Micro | China |
[54] | Illustrated the construction of interpretable risk assessment modelling using demographic features | Use of Force Complaint data from the Chicago Police Department | Boosted DT, Feed-Forward NN | AUC, Logloss, RMSE, MAE | USA |
[78] | Employ Decision Tree induction to obtain both interpretable trees as well as high prediction accuracy | NSW Bureau of Crime Statistics and Research (BOCSAR) Re-offending Database | DT, LR | AUC-ROC, TPR, FPR | Australia |
[77] | Advocated open-source algorithms as the standard in highly consequential contexts that affect people’s lives, for reasons of transparency and collaboration | Broward County, Florida | Ridge Regression, LASSO Regression, Elastic Net Regression | AUC-ROC | USA |
[76] | Presented a method (Gradient Feature Auditing) to evaluate the effect of features in a data set on the predictions of models | National Archive of Criminal Justice Data | Deep NN, SVM, DT, Superspace Linear Integer Models (SLIM) | Balanced Classification Rate (BCR) | USA |
[68] | Presented a technique for auditing black-box models using a variety of publicly available datasets and models | National Archive of Criminal Justice Data | SVM, Feedforward NN | Gradient Feature Auditing (GFA) | USA |
[75] | A comparison of logistic regression, classification and regression tree, and neural networks models in predicting violent re-offending | Prison Service Inmate Information System and Central System Database | LR, CART, Multi-Layer Perceptron NN | AUC-ROC, Accuracy | UK |
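Several of the works in the table above favour models whose decision rules a human can read end to end, such as decision trees and sparse scoring systems (e.g., [80]). A hedged sketch of a points-style interpretable risk score; the features, weights, and threshold here are invented for illustration only and do not come from any surveyed model:

```python
def risk_score(prior_arrests, age_under_25, current_charge_violent):
    """Sum a handful of human-readable point rules into a risk score."""
    score = 0
    score += 2 if prior_arrests >= 3 else 0   # invented weight
    score += 1 if age_under_25 else 0         # invented weight
    score += 1 if current_charge_violent else 0
    return score

def predict_reoffend(score, threshold=2):
    # The decision rule is fully auditable: every term can be inspected
    # and challenged, unlike a black-box model's internal weights.
    return score >= threshold

s = risk_score(prior_arrests=4, age_under_25=True, current_charge_violent=False)
print(s, predict_reoffend(s))  # prints: 3 True
```

The appeal of such models in the surveyed literature is exactly this auditability: a defendant, judge, or auditor can trace a prediction back to individual point contributions, which directly supports the transparency and accountability requirements discussed in Sections 5 and 6.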
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Farayola, M.M.; Tal, I.; Connolly, R.; Saber, T.; Bendechache, M. Ethics and Trustworthiness of AI for Predicting the Risk of Recidivism: A Systematic Literature Review. Information 2023, 14, 426. https://doi.org/10.3390/info14080426