The Sociodemographic Biases in Machine Learning Algorithms: A Biomedical Informatics Perspective
Abstract
1. Introduction
Type of Bias | Consideration(s) |
---|---|
Model-Building Strategies | |
Gender Bias | Integrate an evaluation of demographic performance disparities into model development [21]; a minimal per-group evaluation sketch follows this table. We include biases related to LGBTQ sexual orientation as well as administrative and genetic gender biases. |
Race-based Bias | Recognize race as a social, not genetic, construct, and avoid its use in clinical settings/situations [22,23]. This consideration is challenging and requires significant preliminary work to follow the Food and Drug Administration’s (FDA) “new” guidelines to use a two-question format about ethnicity prior to asking about race, in addition to recruiting and including more diverse populations for clinical trials [24]. |
Ethnicity-based Bias | Use the term “geographic origin” [22]. |
Age-related Bias | Incorporate adversarial learning to de-bias models by protecting sensitive attributes [25,26]. |
Historical Data Bias | Strive for fair representation by diversifying the model input to include groups that are not generally represented in research datasets, to minimize the effect of demographic, cultural, gender, and socioeconomic biases [8,27]. Research should be interdisciplinary; avoid having the same team members provide all input to training models; expand inter- and intraorganizational collaboration [28]. |
Algorithmic Bias | The sociotechnical systems, as well as the health care, data science, and computer science industries, should employ diverse populations to create fair algorithms, because “under-representation” of some groups and institutional/structural inequities from prior care patterns otherwise will be perpetuated in the data used in MLAGs [28]. |
Evaluation Bias | Avoid over/underfitting models to circumvent diminished predictive power in populations who are not well represented in the training dataset [14]. |
Implicit Bias | Consider the nature of the bias and use counterfactual explanations as a predictive tool to detect attributes that are assigned to, and thus disadvantage, certain populations [15,29]; see the counterfactual flip-check sketch after this table. |
Selection/Sampling/Population Bias | Review and keep in mind basic clinical epidemiological principles (e.g., selection bias, which affects “real-world” study interpretation and generalizability to a representative population) [13,27]. |
Clinical Practice Strategies | |
Socioeconomic status (SES) Bias | Work to limit the misunderstanding between patients and providers due to cultural and linguistic differences; keep the positive aspects of the social determinants of health (SDoH) in mind [12,30,31,32]. |
Learned/Training/Clinical Data Bias | Engage in continuous monitoring and updating, focusing on detecting trends in the MLAGs’ decisions rather than on the learned bias; use training data that accurately represent populations under-represented in health care systems [8]. |
Cultural Bias | Different cultures have different societal norms, including the beliefs, arts, laws, customs, capabilities, and habits of the individuals in these groups. These norms may affect what information is shared with health providers. Understanding where data availability may differ between cultures can help in designing fairer models. |
Public Policy Strategies | |
Insurance Status Bias | Whether intentional or not, EHRs contain erroneous coding data. The practices of upcoding, misspecification, and unbundling should therefore cease, so that illnesses/diseases are coded, and decisions made, according to the most accurate information rather than to conform with billing and insurance rules [33]. To avoid these practices, health care organizations need to improve the quality, accuracy, and usability of EHRs [34]. |
Analytic Strategies | |
Confirmation Bias | Include all relevant data in the dataset and allow the ML model to choose the features. |
Information Bias | This relates to the differential accuracy or availability of certain variables when compared with others in the dataset. One can eliminate information with too high a rate of missingness or inaccurate recording. |
Anchoring Bias | There is a tendency to put efficiency ahead of accuracy; parameters should therefore be chosen so that the minimum accuracy requirement is at least at the level of current clinical reasoning. |
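To make the per-group evaluation called for in the table concrete, the following is a minimal sketch in Python (scikit-learn and pandas), assuming a tabular cohort with hypothetical column names; it trains a baseline classifier and reports AUC and sensitivity per demographic subgroup so that performance disparities can be flagged before deployment.

```python
# Minimal per-group performance audit (illustrative sketch; column names,
# model choice, and threshold are assumptions, not from the cited studies).
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, recall_score
from sklearn.model_selection import train_test_split

def audit_by_group(df, features, label, group_col):
    """Train a baseline model and report AUC and sensitivity per subgroup."""
    X_train, X_test, y_train, y_test, g_train, g_test = train_test_split(
        df[features], df[label], df[group_col],
        test_size=0.3, random_state=0, stratify=df[label])
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]
    rows = []
    for group in g_test.unique():
        mask = (g_test == group).to_numpy()
        y_true = y_test[mask]
        if y_true.nunique() < 2:
            continue  # AUC is undefined for a single-class subgroup
        rows.append({
            "group": group,
            "n": int(mask.sum()),
            "auc": roc_auc_score(y_true, scores[mask]),
            "sensitivity": recall_score(y_true, scores[mask] >= 0.5),
        })
    return pd.DataFrame(rows)

# Hypothetical usage: large gaps in AUC or sensitivity across rows of the
# report flag the evaluation bias described in the table above.
# report = audit_by_group(cohort_df, ["age", "bmi", "creatinine"], "event", "sex")
```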
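In the same spirit, the counterfactual-explanation idea cited for implicit bias can be approximated with a simple attribute-flip check: swap the protected attribute, hold everything else constant, and measure how often the predicted class changes. This is a hedged sketch, not the PreCoF method itself [29]; the model, threshold, and attribute values are assumptions, and any categorical encoding the model expects must already be handled upstream.

```python
# Counterfactual flip-rate check (illustrative approximation of
# counterfactual-explanation auditing; not the PreCoF algorithm itself).
import numpy as np

def counterfactual_flip_rate(model, X, protected_col, value_a, value_b, threshold=0.5):
    """Fraction of records whose predicted class changes when the protected
    attribute is swapped between value_a and value_b, all else held constant."""
    X_flip = X.copy()
    X_flip[protected_col] = np.where(
        X[protected_col] == value_a, value_b, value_a)
    pred_orig = model.predict_proba(X)[:, 1] >= threshold
    pred_flip = model.predict_proba(X_flip)[:, 1] >= threshold
    return float(np.mean(pred_orig != pred_flip))

# Hypothetical usage: a high flip rate suggests the model relies on the
# protected attribute itself rather than on clinically relevant features.
# rate = counterfactual_flip_rate(model, X_test, "sex", "F", "M")
```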
2. The Current State of Machine Learning (ML) Models in Health Care
3. Limitations of Current Machine Learning Models—Health Care
Some Lesser-Known and Complex Sources of Bias in Artificial Intelligence Models (AIMs) Used in Health Care: Why Training and Input Data Are Often Not Representative of Relevant Populations, and Why They Perpetuate Errors and Bias
4. Controlling Large Language Model (LLM) Bias in Health Care
5. Unresolved Issues Persist: An Opportunity to Change the Status Quo of the Long-Lived Effects of Biased Data
6. Discussion/Conclusions
- Ensure that your training population is consistent with the population of intended users of your model.
- Where you want to be fair to community members who make up a small part of the total population, you may need specific models for these communities (running the right model on the right population).
- Implementing cultural sensitivity in model generation, and in the output and communications to your end users, is essential.
- Test hyperparameters to ensure that the chosen settings minimize the risk of algorithmic bias; this implies monitoring the performance of all algorithms across populations (a minimal sketch follows this list).
- A constant focus on bias minimization should be part of all model development.
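As a concrete illustration of the hyperparameter point above, the sketch below scores each candidate setting on overall AUC and on the largest per-group AUC gap, keeping only settings whose gap stays within a tolerance. The grid, model, tolerance, and data columns are assumptions for illustration, not a prescribed procedure.

```python
# Fairness-aware hyperparameter selection (illustrative sketch; the grid,
# model, tolerance, and data columns are assumptions).
from itertools import product
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

def select_fair_hyperparameters(X_tr, y_tr, X_val, y_val, groups_val, max_gap=0.05):
    """Return the best-performing settings whose per-group AUC gap <= max_gap."""
    best = None
    for n_estimators, max_depth in product([100, 300], [3, 5, None]):
        model = RandomForestClassifier(
            n_estimators=n_estimators, max_depth=max_depth, random_state=0
        ).fit(X_tr, y_tr)
        scores = model.predict_proba(X_val)[:, 1]
        per_group = [
            roc_auc_score(y_val[groups_val == g], scores[(groups_val == g).to_numpy()])
            for g in groups_val.unique()
            if y_val[groups_val == g].nunique() > 1
        ]
        if not per_group:
            continue  # no subgroup with both outcome classes in the validation set
        gap = max(per_group) - min(per_group)
        overall = roc_auc_score(y_val, scores)
        if gap <= max_gap and (best is None or overall > best["auc"]):
            best = {"n_estimators": n_estimators, "max_depth": max_depth,
                    "auc": overall, "group_gap": gap}
    return best  # None means no setting met the fairness tolerance

# Hypothetical usage:
# best = select_fair_hyperparameters(X_train, y_train, X_val, y_val, val_df["ethnicity"])
```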
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Correction Statement
References
- Matthay, E.C.; Glymour, M.M. A Graphical Catalog of Threats to Validity: Linking Social Science with Epidemiology. Epidemiology 2020, 31, 376–384. [Google Scholar] [CrossRef] [PubMed]
- Ntoutsi, E.; Fafalios, P.; Gadiraju, U.; Iosifidis, V.; Nejdl, W.; Vidal, M.-E.; Ruggieri, S.; Turini, F.; Papadopoulos, S.; Krasanakis, E.; et al. Bias in Data-Driven Artificial Intelligence Systems—An Introductory Survey. WIREs Data Min. Knowl. Discov. 2020, 10, e1356. [Google Scholar] [CrossRef]
- Elkin, P.L.; Mullin, S.; Mardekian, J.; Crowner, C.; Sakilay, S.; Sinha, S.; Brady, G.; Wright, M.; Nolen, K.; Trainer, J.; et al. Using Artificial Intelligence with Natural Language Processing to Combine Electronic Health Record’s Structured and Free Text Data to Identify Nonvalvular Atrial Fibrillation to Decrease Strokes and Death: Evaluation and Case-Control Study. J. Med. Internet Res. 2021, 23, e28946. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Resnick, M.P.; LeHouillier, F.; Brown, S.H.; Campbell, K.E.; Montella, D.; Elkin, P.L. Automated Modeling of Clinical Narrative with High Definition Natural Language Processing Using Solor and Analysis Normal Form. Stud. Health Technol. Inform. 2021, 287, 89–93. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Li, A.; Mullin, S.; Elkin, P.L. Improving Prediction of Survival for Extremely Premature Infants Born at 23 to 29 Weeks Gestational Age in the Neonatal Intensive Care Unit: Development and Evaluation of Machine Learning Models. JMIR Med. Inform. 2024, 12, e42271. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Resnick, M.P.; Montella, D.; Brown, S.H.; Elkin, P. ACORN SDOH survey: Terminological representation for use with NLP and CDS. J. Clin. Transl. Sci. 2024, 8, e39. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Vorisek, C.N.; Stellmach, C.; Mayer, P.J.; Klopfenstein, S.A.I.; Bures, D.M.; Diehl, A.; Henningsen, M.; Ritter, K.; Thun, S. Artificial Intelligence Bias in Health Care: Web-Based Survey. J. Med. Internet Res. 2023, 25, e41089. [Google Scholar] [CrossRef] [PubMed]
- Fuchs, D. The Dangers of Human-Like Bias in Machine-Learning Algorithms. Mo. ST’s Peer Peer 2018, 2, 1. [Google Scholar]
- Pierce, R.L.; Van Biesen, W.; Van Cauwenberge, D.; Decruyenaere, J.; Sterckx, S. Explainability in medicine in an era of AI-based clinical decision support systems. Front. Genet. 2022, 13, 903600. [Google Scholar] [CrossRef] [PubMed]
- Sharun, K.; Banu, S.A.; Pawde, A.M.; Kumar, R.; Akash, S.; Dhama, K.; Pal, A. ChatGPT and Artificial Hallucinations in Stem Cell Research: Assessing the Accuracy of Generated References—A Preliminary Study. Ann. Med. Surg. 2023, 85, 5275–5278. [Google Scholar] [CrossRef] [PubMed]
- Chin-Yee, B.; Upshur, R. Three problems with big data and artificial intelligence in medicine. Perspect. Biol. Med. 2019, 62, 237–256. [Google Scholar] [CrossRef] [PubMed]
- Obermeyer, Z.; Powers, B.; Vogeli, C.; Mullainathan, S. Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations. Science 2019, 366, 447–453. [Google Scholar] [CrossRef] [PubMed]
- Hellström, T.; Dignum, V.; Bensch, S. Bias in Machine Learning—What Is It Good For? arXiv 2020, arXiv:2004.00686. [Google Scholar]
- Chen, Y.; Clayton, E.W.; Novak, L.L.; Anders, S.; Malin, B. Human-Centered Design to Address Biases in Artificial Intelligence. J. Med. Internet Res. 2023, 25, e43251. [Google Scholar] [CrossRef] [PubMed]
- Gervasi, S.S.; Chen, I.Y.; Smith-McLallen, A.; Sontag, D.; Obermeyer, Z.; Vennera, M.; Chawla, R. The Potential for Bias in Machine Learning and Opportunities for Health Insurers to Address It. Health Aff. 2022, 41, 212–218. [Google Scholar] [CrossRef] [PubMed]
- Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; Galstyan, A. A Survey on Bias and Fairness in Machine Learning. ACM Comput. Surv. 2021, 54, 115. [Google Scholar] [CrossRef]
- FitzGerald, C.; Hurst, S. Implicit Bias in Healthcare Professionals: A Systematic Review. BMC Med. Ethics 2017, 18, 19. [Google Scholar] [CrossRef]
- Lippi, D.; Bianucci, R.; Donell, S. Gender Medicine: Its Historical Roots. Postgrad. Med. J. 2020, 96, 480–486. [Google Scholar] [CrossRef] [PubMed]
- Park, J.; Saha, S.; Chee, B.; Taylor, J.; Beach, M.C. Physician Use of Stigmatizing Language in Patient Medical Records. JAMA Netw. Open 2021, 4, e2117052. [Google Scholar] [CrossRef] [PubMed]
- Srinivasan, R.; Chander, A. Biases in AI Systems. Commun. ACM 2021, 64, 44–49. [Google Scholar] [CrossRef]
- Straw, I.; Wu, H. Investigating for Bias in Healthcare Algorithms: A Sex-Stratified Analysis of Supervised Machine Learning Models in Liver Disease Prediction. BMJ Health Care Inform. 2022, 29, 100457. [Google Scholar] [CrossRef] [PubMed]
- Powe, N.R. Black Kidney Function Matters: Use or Misuse of Race? JAMA 2020, 324, 737–738. [Google Scholar] [CrossRef] [PubMed]
- Skinner-Dorkenoo, A.L.; Rogbeer, K.G.; Sarmal, A.; Ware, C.; Zhu, J. Challenging Race-Based Medicine through Historical Education about the Social Construction of Race. Health Equity 2023, 7, 764–772. [Google Scholar] [CrossRef] [PubMed]
- Schneider, M.E. Clinical Trials: FDA Proposes New Standards for Collecting Race, Ethnicity Data. 2024. Available online: https://www.raps.org/news-and-articles/news-articles/2024/1/fda-proposes-standards-for-collecting-and-reportin# (accessed on 25 April 2024).
- Garcia de Alford, A.S.; Hayden, S.; Wittlin, N.; Atwood, A. Reducing Age Bias in Machine Learning: An Algorithmic Approach. SMU Data Sci. Rev. 2020, 3, 11. [Google Scholar]
- Xu, J. Algorithmic Solutions to Algorithmic Bias: A Technical Guide. Available online: https://towardsdatascience.com/algorithmic-solutions-to-algorithmic-bias-aef59eaf6565 (accessed on 29 December 2023).
- Yu, A.C.; Eng, J. One Algorithm May Not Fit All: How Selection Bias Affects Machine Learning Performance. RadioGraphics 2020, 40, 1932–1937. [Google Scholar] [CrossRef] [PubMed]
- Kuhlman, C.; Jackson, L.; Chunara, R. No Computation without Representation: Avoiding Data and Algorithm Biases through Diversity. arXiv 2020, arXiv:2002.11836. [Google Scholar]
- Goethals, S.; Martens, D.; Calders, T. PreCoF: Counterfactual Explanations for Fairness. In Machine Learning; Springer: Berlin/Heidelberg, Germany, 2023. [Google Scholar] [CrossRef]
- Gottlieb, L.M.; Francis, D.E.; Beck, A.F. Uses and Misuses of Patient- and Neighborhood-Level Social Determinants of Health Data. Perm. J. 2018, 22, 18–78. [Google Scholar] [CrossRef] [PubMed]
- Geskey, J.M.; Kodish-Wachs, J.; Blonsky, H.; Hohman, S.F.; Meurer, S. National Documentation and Coding Practices of Noncompliance: The Importance of Social Determinants of Health and the Stigma of African-American Bias. Am. J. Med. Qual. 2023, 38, 87–92. [Google Scholar] [CrossRef] [PubMed]
- Lee, D.N.; Hutchens, M.J.; George, T.J.; Wilson-Howard, D.; Cooks, E.J.; Krieger, J.L. Do They Speak like Me? Exploring How Perceptions of Linguistic Difference May Influence Patient Perceptions of Healthcare Providers. Med. Educ. Online 2022, 27, 2107470. [Google Scholar] [CrossRef] [PubMed]
- O’Malley, K.J.; Cook, K.F.; Price, M.D.; Wildes, K.R.; Hurdle, J.F.; Ashton, C.M. Measuring Diagnoses: ICD Code Accuracy. Health Serv. Res. 2005, 40, 1620–1639. [Google Scholar] [CrossRef] [PubMed]
- Holmes, J.H.; Beinlich, J.; Boland, M.R.; Bowles, K.H.; Chen, Y.; Cook, T.S.; Demiris, G.; Draugelis, M.; Fluharty, L.; Gabriel, P.E.; et al. Why is the electronic health record so challenging for research and clinical care? Methods Inf. Med. 2021, 60, 032–048. [Google Scholar] [CrossRef] [PubMed]
- Kino, S.; Hsu, Y.-T.; Shiba, K.; Chien, Y.-S.; Mita, C.; Kawachi, I.; Daoud, A. A Scoping Review on the Use of Machine Learning in Research on Social Determinants of Health: Trends and Research Prospects. SSM Popul. Health 2021, 15, 100836. [Google Scholar] [CrossRef] [PubMed]
- Schuch, H.S.; Furtado, M.; Silva, G.F.d.S.; Kawachi, I.; Filho, A.D.P.C.; Elani, H.W. Fairness of Machine Learning Algorithms for Predicting Foregone Preventive Dental Care for Adults. JAMA Netw. Open 2023, 6, e2341625. [Google Scholar] [CrossRef] [PubMed]
- Ferrara, C.; Sellitto, G.; Ferrucci, F.; Palomba, F.; De Lucia, A. Fairness-Aware Machine Learning Engineering: How Far Are We? Empir. Softw. Eng. 2023, 29, 9. [Google Scholar] [CrossRef] [PubMed]
- Martinez-Martin, N.; Greely, H.T.; Cho, M.K. Ethical Development of Digital Phenotyping Tools for Mental Health Applications: Delphi Study. JMIR Mhealth Uhealth 2021, 9, e27343. [Google Scholar] [CrossRef] [PubMed]
- Ding, S.; Tang, R.; Zha, D.; Zou, N.; Zhang, K.; Jiang, X.; Hu, X. Fairly Predicting Graft Failure in Liver Transplant for Organ Assigning. AMIA Annu. Symp. Proc. 2023, 2022, 415–424. [Google Scholar] [PubMed]
- Vyas, D.A.; Eisenstein, L.G.; Jones, D.S. Hidden in Plain Sight—Reconsidering the Use of Race Correction in Clinical Algorithms. N. Engl. J. Med. 2020, 383, 874–882. [Google Scholar] [CrossRef] [PubMed]
- Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artif. Intell. 2019, 267, 1–38. [Google Scholar] [CrossRef]
- Väänänen, A.; Haataja, K.; Vehviläinen-Julkunen, K.; Toivanen, P. AI in Healthcare: A Narrative Review. F1000Research 2021, 10, 6. [Google Scholar] [CrossRef]
- Shaheen, M.Y. Applications of Artificial Intelligence (AI) in Healthcare: A Review. Sci. Prepr. 2021. [Google Scholar] [CrossRef]
- MacIntyre, C.R.; Chen, X.; Kunasekaran, M.; Quigley, A.; Lim, S.; Stone, H.; Paik, H.; Yao, L.; Heslop, D.; Wei, W.; et al. Artificial Intelligence in Public Health: The Potential of Epidemic Early Warning Systems. J. Int. Med. Res. 2023, 51, 03000605231159335. [Google Scholar] [CrossRef] [PubMed]
- Giovanola, B.; Tiribelli, S. Beyond bias and discrimination: Redefining the AI ethics principle of fairness in healthcare machine-learning algorithms. AI Soc. 2023, 38, 549–563. [Google Scholar] [CrossRef] [PubMed]
- Obaid, H.S.; Dheyab, S.A.; Sabry, S.S. The impact of data pre-processing techniques and dimensionality reduction on the accuracy of machine learning. In Proceedings of the 2019 9th Annual Information Technology, Electromechanical Engineering and Microelectronics Conference (IEMECON), Jaipur, India, 13–15 March 2019; pp. 279–283. [Google Scholar]
- American Psychological Association. Implicit Bias. 2023. Available online: https://www.apa.org/topics/implicit-bias (accessed on 19 January 2024).
- Juhn, Y.J.; Ryu, E.; Wi, C.-I.; King, K.S.; Malik, M.; Romero-Brufau, S.; Weng, C.; Sohn, S.; Sharp, R.R.; Halamka, J.D. Assessing Socioeconomic Bias in Machine Learning Algorithms in Health Care: A Case Study of the HOUSES Index. J. Am. Med. Inf. Assoc. 2022, 29, 1142–1151. [Google Scholar] [CrossRef] [PubMed]
- Hoffman, S.; Podgurski, A. The Use and Misuse of Biomedical Data: Is Bigger Really Better? Am. J. Law Med. 2013, 39, 497–538. [Google Scholar] [CrossRef] [PubMed]
- Cirillo, D.; Catuara-Solarz, S.; Morey, C.; Guney, E.; Subirats, L.; Mellino, S.; Gigante, A.; Valencia, A.; Rementeria, M.J.; Chadha, A.S.; et al. Sex and Gender Differences and Biases in Artificial Intelligence for Biomedicine and Healthcare. Npj Digit. Med. 2020, 3, 1–11. [Google Scholar] [CrossRef] [PubMed]
- Celi, L.A.; Cellini, J.; Charpignon, M.-L.; Dee, E.C.; Dernoncourt, F.; Eber, R.; Mitchell, W.G.; Moukheiber, L.; Schirmer, J.; Situ, J.; et al. Sources of Bias in Artificial Intelligence That Perpetuate Healthcare Disparities—A Global Review. PLOS Digit. Health 2022, 1, e0000022. [Google Scholar] [CrossRef] [PubMed]
- McDermott, M.B.A.; Nestor, B.; Szolovits, P. Clinical Artificial Intelligence: Design Principles and Fallacies. Clin. Lab. Med. 2023, 43, 29–46. [Google Scholar] [CrossRef] [PubMed]
- Polubriaginof, F.C.G.; Ryan, P.; Salmasian, H.; Shapiro, A.W.; Perotte, A.; Safford, M.M.; Hripcsak, G.; Smith, S.; Tatonetti, N.P.; Vawdrey, D.K. Challenges with Quality of Race and Ethnicity Data in Observational Databases. J. Am. Med. Inf. Assoc. 2019, 26, 730–736. [Google Scholar] [CrossRef] [PubMed]
- Kamulegeya, L.H.; Okello, M.; Bwanika, J.M.; Musinguzi, D.; Lubega, W.; Rusoke, D.; Nassiwa, F.; Börve, A. Using Artificial Intelligence on Dermatology Conditions in Uganda: A Case for Diversity in Training Data Sets for Machine Learning. Afr. Health Sci. 2019, 23, 753–763. [Google Scholar] [CrossRef] [PubMed]
- Chan, S.; Reddy, V.; Myers, B.; Thibodeaux, Q.; Brownstone, N.; Liao, W. Machine Learning in Dermatology: Current Applications, Opportunities, and Limitations. Dermatol. Ther. 2020, 10, 365–386. [Google Scholar] [CrossRef] [PubMed]
- Haenssle, H.A.; Fink, C.; Toberer, F.; Winkler, J.; Stolz, W.; Deinlein, T.; Hofmann-Wellenhof, R.; Lallas, A.; Emmert, S.; Buhl, T.; et al. Man against Machine Reloaded: Performance of a Market-Approved Convolutional Neural Network in Classifying a Broad Spectrum of Skin Lesions in Comparison with 96 Dermatologists Working under Less Artificial Conditions. Ann. Oncol. 2020, 31, 137–143. [Google Scholar] [CrossRef] [PubMed]
- Fujisawa, Y.; Otomo, Y.; Ogata, Y.; Nakamura, Y.; Fujita, R.; Ishitsuka, Y.; Watanabe, R.; Okiyama, N.; Ohara, K.; Fujimoto, M. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br. J. Dermatol. 2019, 180, 373–381. [Google Scholar] [CrossRef] [PubMed]
- Brinker, T.J.; Hekler, A.; Enk, A.H.; Klode, J.; Hauschild, A.; Berking, C.; Schilling, B.; Haferkamp, S.; Schadendorf, D.; Holland-Letz, T.; et al. Deep Learning Outperformed 136 of 157 Dermatologists in a Head-to-Head Dermoscopic Melanoma Image Classification Task. Eur. J. Cancer 2019, 113, 47–54. [Google Scholar] [CrossRef] [PubMed]
- Brinker, T.J.; Hekler, A.; Enk, A.H.; Berking, C.; Haferkamp, S.; Hauschild, A.; Weichenthal, M.; Klode, J.; Schadendorf, D.; Holland-Letz, T.; et al. Deep Neural Networks Are Superior to Dermatologists in Melanoma Image Classification. Eur. J. Cancer 2019, 119, 11–17. [Google Scholar] [CrossRef] [PubMed]
- Pham, T.-C.; Luong, C.-M.; Hoang, V.-D.; Doucet, A. AI Outperformed Every Dermatologist in Dermoscopic Melanoma Diagnosis, Using an Optimized Deep-CNN Architecture with Custom Mini-Batch Logic and Loss Function. Sci. Rep. 2021, 11, 17485. [Google Scholar] [CrossRef] [PubMed]
- Guo, L.N.; Lee, M.S.; Kassamali, B.; Mita, C.; Nambudiri, V.E. Bias in, Bias out: Underreporting and Underrepresentation of Diverse Skin Types in Machine Learning Research for Skin Cancer Detection—A Scoping Review. J. Am. Acad. Dermatol. 2022, 87, 157–159. [Google Scholar] [CrossRef]
- Tschandl, P. Risk of Bias and Error from Data Sets Used for Dermatologic Artificial Intelligence. JAMA Dermatol. 2021, 157, 1271–1273. [Google Scholar] [CrossRef] [PubMed]
- Daneshjou, R.; Smith, M.P.; Sun, M.D.; Rotemberg, V.; Zou, J. Lack of Transparency and Potential Bias in Artificial Intelligence Data Sets and Algorithms: A Scoping Review. JAMA Dermatol. 2021, 157, 1362–1369. [Google Scholar] [CrossRef] [PubMed]
- Kleinberg, G.; Diaz, M.J.; Batchu, S.; Lucke-Wold, B. Racial Underrepresentation in Dermatological Datasets Leads to Biased Machine Learning Models and Inequitable Healthcare. J. Biomed. Res. 2022, 3, 42–47. [Google Scholar]
- Daneshjou, R.; Vodrahalli, K.; Novoa, R.A.; Jenkins, M.; Liang, W.; Rotemberg, V.; Ko, J.; Swetter, S.M.; Bailey, E.E.; Gevaert, O.; et al. Disparities in Dermatology AI Performance on a Diverse, Curated Clinical Image Set. Sci. Adv. 2022, 8, eabq6147. [Google Scholar] [CrossRef]
- Manuel, J.I. Racial/Ethnic and Gender Disparities in Health Care Use and Access. Health Serv. Res. 2018, 53, 1407–1429. [Google Scholar] [CrossRef] [PubMed]
- Mirin, A.A. Gender Disparity in the Funding of Diseases by the U.S. National Institutes of Health. J. Womens Health 2021, 30, 956–963. [Google Scholar] [CrossRef] [PubMed]
- Bosomworth, J.; Khan, Z. Analysis of Gender-Based Inequality in Cardiovascular Health: An Umbrella Review. Cureus 2023, 15, e43482. [Google Scholar] [CrossRef] [PubMed]
- Oikonomou, E.K.; Williams, M.C.; Kotanidis, C.P.; Desai, M.Y.; Marwan, M.; Antonopoulos, A.S.; Thomas, K.E.; Thomas, S.; Akoumianakis, I.; Fan, L.M.; et al. A Novel Machine Learning-Derived Radiotranscriptomic Signature of Perivascular Fat Improves Cardiac Risk Prediction Using Coronary CT Angiography. Eur. Heart J. 2019, 40, 3529–3543. [Google Scholar] [CrossRef] [PubMed]
- Kaur, G.; Oliveira-Gomes, D.D.; Rivera, F.B.; Gulati, M. Chest Pain in Women: Considerations from the 2021 AHA/ACC Chest Pain Guideline. Curr. Probl. Cardiol. 2023, 48, 101697. [Google Scholar] [CrossRef] [PubMed]
- Wada, H.; Miyauchi, K.; Daida, H. Gender Differences in the Clinical Features and Outcomes of Patients with Coronary Artery Disease. Expert Rev. Cardiovasc. Ther. 2019, 17, 127–133. [Google Scholar] [CrossRef] [PubMed]
- Bullock-Palmer, R.P.; Shaw, L.J.; Gulati, M. Emerging Misunderstood Presentations of Cardiovascular Disease in Young Women. Clin. Cardiol. 2019, 42, 476–483. [Google Scholar] [CrossRef] [PubMed]
- Worrall-Carter, L.; Ski, C.; Scruth, E.; Campbell, M.; Page, K. Systematic Review of Cardiovascular Disease in Women: Assessing the Risk. Nurs. Health Sci. 2011, 13, 529–535. [Google Scholar] [CrossRef] [PubMed]
- Larrazabal, A.J.; Nieto, N.; Peterson, V.; Milone, D.H.; Ferrante, E. Gender Imbalance in Medical Imaging Datasets Produces Biased Classifiers for Computer-Aided Diagnosis. Proc. Natl. Acad. Sci. USA 2020, 117, 12592–12594. [Google Scholar] [CrossRef]
- Pessach, D.; Shmueli, E. A review on fairness in machine learning. ACM Comput. Surv. 2022, 55, 1–44. [Google Scholar] [CrossRef]
- Shah, N.H.; Halamka, J.D.; Saria, S.; Pencina, M.; Tazbaz, T.; Tripathi, M.; Callahan, A.; Hildahl, H.; Anderson, B. A Nationwide Network of Health AI Assurance Laboratories. JAMA 2024, 331, 245–249. [Google Scholar] [CrossRef] [PubMed]
- Murphy, M.; Kroeper, K.; Ozier, E. Prejudiced Places: How Contexts Shape Inequality and How Policy Can Change Them. Policy Insights Behav. Brain Sci. 2018, 5, 237273221774867. [Google Scholar] [CrossRef]
- Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, 3–10 March 2021; pp. 610–623. [Google Scholar]
- Liyanage, U.P.; Ranaweera, N.D. Ethical considerations and potential risks in the deployment of large Language Models in diverse societal contexts. J. Comput. Soc. Dyn. 2023, 8, 15–25. [Google Scholar]
- Liang, P.P.; Wu, C.; Morency, L.P.; Salakhutdinov, R. Towards understanding and mitigating social biases in language models. In Proceedings of the International Conference on Machine Learning, Online, 18–24 July 2021; pp. 6565–6576. [Google Scholar]
- Solaiman, I.; Dennison, C. Process for adapting language models to society (palms) with values-targeted datasets. Adv. Neural Inf. Process. Syst. 2021, 34, 5861–5873. [Google Scholar]
- Gupta, U.; Dhamala, J.; Kumar, V.; Verma, A.; Pruksachatkun, Y.; Krishna, S.; Gupta, R.; Chang, K.W.; Steeg, G.V.; Galstyan, A. Mitigating gender bias in distilled language models via counterfactual role reversal. arXiv 2022, arXiv:2203.12574. [Google Scholar]
- Sheng, E.; Chang, K.W.; Natarajan, P.; Peng, N. Societal biases in language generation: Progress and challenges. arXiv 2021, arXiv:2105.04054. [Google Scholar]
- Krause, B.; Gotmare, A.D.; McCann, B.; Keskar, N.S.; Joty, S.; Socher, R.; Rajani, N.F. Gedi: Generative discriminator guided sequence generation. arXiv 2020, arXiv:2009.06367. [Google Scholar]
- Liu, A.; Sap, M.; Lu, X.; Swayamdipta, S.; Bhagavatula, C.; Smith, N.A.; Choi, Y. DExperts: Decoding-time controlled text generation with experts and anti-experts. arXiv 2021, arXiv:2105.03023. [Google Scholar]
- Blei, D.; Ng, A.; Jordan, M. Latent Dirichlet Allocation. Adv. Neural Inf. Process. Syst. 2001, 3, 601–608. [Google Scholar]
- SNOMED CT. Available online: https://www.nlm.nih.gov/healthit/snomedct/index.html (accessed on 19 January 2024).
- Schlegel, D.R.; Crowner, C.; Lehoullier, F.; Elkin, P.L. HTP-NLP: A New NLP System for High Throughput Phenotyping. Stud. Health Technol. Inform. 2017, 235, 276–280. [Google Scholar] [PubMed]
- Orphanou, K.; Otterbacher, J.; Kleanthous, S.; Batsuren, K.; Giunchiglia, F.; Bogina, V.; Tal, A.S.; Hartman, A.; Kuflik, T. Mitigating bias in algorithmic systems—A fish-eye view. ACM Comput. Surv. 2022, 55, 1–37. [Google Scholar] [CrossRef]
- Balayn, A.; Lofi, C.; Houben, G.J. Managing bias and unfairness in data for decision support: A survey of machine learning and data engineering approaches to identify and mitigate bias and unfairness within data management and analytics systems. VLDB J. 2021, 30, 739–768. [Google Scholar] [CrossRef]
- Corbett-Davies, S.; Pierson, E.; Feller, A.; Goel, S.; Huq, A. Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 797–806. [Google Scholar]
- Kamishima, T.; Akaho, S.; Asoh, H.; Sakuma, J. Model-based and actual independence for fairness-aware classification. Data Min. Knowl. Discov. 2018, 32, 258–286. [Google Scholar] [CrossRef]
- Geyik, S.C.; Ambler, S.; Kenthapadi, K. Fairness-aware ranking in search & recommendation systems with application to linkedin talent search. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2221–2231. [Google Scholar]
- Kobren, A.; Saha, B.; McCallum, A. Paper matching with local fairness constraints. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1247–1257. [Google Scholar]
- Sühr, T.; Biega, A.J.; Zehlike, M.; Gummadi, K.P.; Chakraborty, A. Two-sided fairness for repeated matchings in two-sided markets: A case study of a ride-hailing platform. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 3082–3092. [Google Scholar]
- Beutel, A.; Chen, J.; Doshi, T.; Qian, H.; Wei, L.; Wu, Y.; Heldt, L.; Zhao, Z.; Hong, L.; Chi, E.H.; et al. Fairness in recommendation ranking through pairwise comparisons. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2212–2220. [Google Scholar]
- Rokach, L. Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography. Comput. Stat. Data Anal. 2009, 53, 4046–4072. [Google Scholar] [CrossRef]
- CMS. The Path Forward: Improving Data to Advance Health Equity Solutions. 2022. Available online: https://www.cms.gov/blog/path-forward-improving-data-advance-health-equity-solutions (accessed on 19 January 2024).