BioMedInformatics, Volume 4, Issue 2 (June 2024) – 38 articles

Cover Story: Deep-learning-based diagnostic tests have become increasingly popular in recent years. However, deep learning models have been shown to be sensitive to noisy input data, which has raised concerns about the robustness of these models. Robustness is the stability of a model’s predictions when data are noisy, and it is therefore imperative for reliable artificial-intelligence-based medical diagnostics. Strategies such as adversarial learning and data augmentation have improved classifier robustness to certain sources of noise by diversifying the training data. By perturbing different amounts of the training and testing set images, it is possible to both evaluate and improve the robustness of these models to certain sources of noise without sacrificing performance on images that have not been perturbed.
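The perturbation idea in the cover story can be illustrated with a minimal sketch. It assumes images are nested lists of pixel intensities in [0, 1] and uses additive Gaussian noise; the function name `perturb_fraction`, the noise model, and the parameter values are illustrative assumptions, not details taken from the paper.

```python
import random

def perturb_fraction(images, fraction, sigma, seed=0):
    """Return a copy of `images` with Gaussian noise added to a random
    `fraction` of them, plus the sorted indices of the perturbed images.

    Hypothetical helper: each image is a list of rows of intensities in [0, 1].
    """
    rng = random.Random(seed)
    n_noisy = round(fraction * len(images))
    noisy_idx = set(rng.sample(range(len(images)), n_noisy))
    out = []
    for i, img in enumerate(images):
        if i in noisy_idx:
            # Add zero-mean noise and clip back into the valid intensity range.
            new_img = [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
                       for row in img]
        else:
            # Unperturbed images are copied unchanged.
            new_img = [row[:] for row in img]
        out.append(new_img)
    return out, sorted(noisy_idx)

# Toy data: ten 2x2 grey images (made-up values).
images = [[[0.5, 0.5], [0.5, 0.5]] for _ in range(10)]
perturbed, idx = perturb_fraction(images, fraction=0.3, sigma=0.1)
```

Varying `fraction` separately for the training and testing sets would give one way to probe how much noise exposure during training affects performance on both noisy and clean test images.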
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
17 pages, 9137 KiB  
Article
Utilizing Immunoinformatics for mRNA Vaccine Design against Influenza D Virus
by Elijah Kolawole Oladipo, Stephen Feranmi Adeyemo, Modinat Wuraola Akinboade, Temitope Michael Akinleye, Kehinde Favour Siyanbola, Precious Ayomide Adeogun, Victor Michael Ogunfidodo, Christiana Adewumi Adekunle, Olubunmi Ayobami Elutade, Esther Eghogho Omoathebu, Blessing Oluwatunmise Taiwo, Elizabeth Olawumi Akindiya, Lucy Ochola and Helen Onyeaka
BioMedInformatics 2024, 4(2), 1572-1588; https://doi.org/10.3390/biomedinformatics4020086 - 12 Jun 2024
Viewed by 1389
Abstract
Background: Influenza D Virus (IDV) presents a possible threat to animal and human health, necessitating the development of effective vaccines. Although no human illness linked to IDV has been reported, the possibility of human susceptibility to infection remains uncertain. Hence, there is a need for an animal vaccine to be designed. Such a vaccine will contribute to preventing and controlling IDV outbreaks and developing effective countermeasures against this emerging pathogen. This study, therefore, aimed to design an mRNA vaccine construct against IDV using immunoinformatic methods and evaluate its potential efficacy. Methods: A comprehensive methodology involving epitope prediction, vaccine construction, and structural analysis was employed. Viral sequences from six continents were collected and analyzed. A total of 88 Hemagglutinin Esterase Fusion (HEF) sequences from IDV isolates were obtained, of which 76 were identified as antigenic. Different bioinformatics tools were used to identify preferred CTL, HTL, and B-cell epitopes. The epitopes underwent thorough analysis, and those that can induce a lasting immunological response were selected for the construction. Results: The vaccine prototype comprised nine epitopes, an adjuvant, an MHC I-targeting domain (MITD), a Kozak sequence, 3′ UTR, 5′ UTR, and specific linkers. The mRNA vaccine construct exhibited antigenicity, non-toxicity, and non-allergenicity, with favourable physicochemical properties. The secondary and tertiary structure analyses revealed a stable and accurate vaccine construct. Molecular docking simulations also demonstrated strong binding affinity with toll-like receptors. Conclusions: The study provides a promising framework for developing an effective mRNA vaccine against IDV, highlighting its potential for mitigating the global impact of this viral infection. Further experimental studies are needed to confirm the vaccine’s efficacy and safety.
(This article belongs to the Special Issue Computational Biology and Artificial Intelligence in Medicine)

16 pages, 4106 KiB  
Article
Advancing DNA Language Models through Motif-Oriented Pre-Training with MoDNA
by Weizhi An, Yuzhi Guo, Yatao Bian, Hehuan Ma, Jinyu Yang, Chunyuan Li and Junzhou Huang
BioMedInformatics 2024, 4(2), 1556-1571; https://doi.org/10.3390/biomedinformatics4020085 - 12 Jun 2024
Viewed by 977
Abstract
Acquiring meaningful representations of gene expression is essential for the accurate prediction of downstream regulatory tasks, such as identifying promoters and transcription factor binding sites. However, the current dependency on supervised learning, constrained by the limited availability of labeled genomic data, impedes the ability to develop robust predictive models with broad generalization capabilities. In response, recent advancements have pivoted towards the application of self-supervised training for DNA sequence modeling, enabling the adaptation of pre-trained genomic representations to a variety of downstream tasks. Departing from the straightforward application of masked language learning techniques to DNA sequences, approaches such as MoDNA enrich genome language modeling with prior biological knowledge. In this study, we advance DNA language models by utilizing the Motif-oriented DNA (MoDNA) pre-training framework, which is established for self-supervised learning at the pre-training stage and is flexible enough for application across different downstream tasks. MoDNA distinguishes itself by efficiently learning semantic-level genomic representations from an extensive corpus of unlabeled genome data, offering a significant improvement in computational efficiency over previous approaches. The framework is pre-trained on a comprehensive human genome dataset and fine-tuned for targeted downstream tasks. Our enhanced analysis and evaluation in promoter prediction and transcription factor binding site prediction have further validated MoDNA’s exceptional capabilities, emphasizing its contribution to advancements in genomic predictive modeling. Full article
(This article belongs to the Special Issue Computational Biology and Artificial Intelligence in Medicine)
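The abstract contrasts MoDNA with the straightforward application of masked language modelling to DNA sequences. The sketch below illustrates only that generic baseline step, assuming overlapping k-mer tokens and a fixed masking rate; the choices `k=6` and `mask_rate=0.15`, and both function names, are illustrative assumptions and do not reproduce MoDNA's motif-oriented objective.

```python
import random

def kmer_tokens(seq, k=6):
    """Split a DNA string into overlapping k-mers (stride 1)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def mask_tokens(tokens, mask_rate=0.15, seed=0, mask_token="[MASK]"):
    """Replace a random subset of tokens with `mask_token`.

    Returns the masked token list and a dict mapping each masked
    position to the original token the model would have to predict.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(mask_rate * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = masked[pos]
        masked[pos] = mask_token
    return masked, labels

# Made-up 20-base sequence, giving 15 overlapping 6-mers.
tokens = kmer_tokens("ACGTACGTAGCTAGCTAGGA", k=6)
masked, labels = mask_tokens(tokens)
```

Because overlapping k-mers share bases with their neighbours, practical DNA language models often mask contiguous spans rather than isolated tokens; the independent sampling here is kept only for brevity.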

25 pages, 2372 KiB  
Review
Understanding the Molecular Actions of Spike Glycoprotein in SARS-CoV-2 and Issues of a Novel Therapeutic Strategy for the COVID-19 Vaccine
by Yasunari Matsuzaka and Ryu Yashiro
BioMedInformatics 2024, 4(2), 1531-1555; https://doi.org/10.3390/biomedinformatics4020084 - 9 Jun 2024
Viewed by 1371
Abstract
In vaccine development, many approaches use the spike protein (S protein), which has multiple “spike-like” structures protruding from the spherical structure of the coronavirus, as an antigen. However, there are concerns about its effectiveness and toxicity. When S protein is used in a vaccine, its ability to attack viruses may be weak, and its effectiveness in eliciting immunity will only last for a short period of time. Moreover, it may cause “antibody-dependent immune enhancement”, which can enhance infections. In addition, the three-dimensional (3D) structure of epitopes is essential for functional analysis and structure-based vaccine design. Furthermore, during viral infection, large amounts of extracellular vesicles (EVs) are secreted from infected cells, which function as a communication network between cells and coordinate the response to infection. Under conditions where SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) molecular vaccination produces overwhelming SARS-CoV-2 spike glycoprotein, a significant proportion of the overproduced intracellular spike glycoprotein is transported via EVs. Therefore, it will be important to understand the infection mechanisms of SARS-CoV-2 via EV-dependent and EV-independent uptake into cells and to model the infection processes based on 3D structural features at interaction sites.

12 pages, 514 KiB  
Article
Calibrating Glucose Sensors at the Edge: A Stress Generation Model for Tiny ML Drift Compensation
by Anna Sabatini, Costanza Cenerini, Luca Vollero and Danilo Pau
BioMedInformatics 2024, 4(2), 1519-1530; https://doi.org/10.3390/biomedinformatics4020083 - 9 Jun 2024
Cited by 1 | Viewed by 514
Abstract
Background: Continuous glucose monitoring (CGM) systems offer the advantage of noninvasive monitoring and continuous data on glucose fluctuations. This study introduces a dataset generation model that integrates physiological variables and sensor attributes to produce synthetic but realistic databases, which in turn enables the design of improved CGM systems. Methods: The presented approach uses a combination of physiological data and sensor characteristics to construct a model that considers the impact of these variables on the accuracy of CGM measures. A dataset of 500 sensor responses over a 15-day period is generated and analyzed using machine learning algorithms (random forest regressor and support vector regressor). Results: The random forest and support vector regression models achieved Mean Absolute Errors (MAEs) of 16.13 mg/dL and 16.22 mg/dL, respectively. In contrast, models trained solely on single sensor outputs recorded an average MAE of 11.01 ± 5.12 mg/dL. These findings demonstrate the variable impact of integrating multiple data sources on the predictive accuracy of CGM systems, as well as the complexity of the dataset. Conclusions: This approach provides a foundation for developing more precise algorithms and introduces an initial application to Tiny Machine Control Units (MCUs). More research is recommended to refine these models and validate their effectiveness in clinical settings.
(This article belongs to the Special Issue Editor's Choices Series for Methods in Biomedical Informatics Section)
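The Mean Absolute Error reported in this abstract is the average absolute difference between reference and predicted glucose values, in the same unit (mg/dL). The following minimal sketch shows the calculation; the glucose readings are made-up numbers for illustration and are not data from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |reference - prediction| over paired samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical reference and predicted glucose values in mg/dL.
reference = [90.0, 110.0, 150.0, 200.0]
predicted = [100.0, 100.0, 165.0, 190.0]

mae = mean_absolute_error(reference, predicted)
print(f"MAE = {mae:.2f} mg/dL")  # prints "MAE = 11.25 mg/dL"
```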

13 pages, 2994 KiB  
Article
Abdominal MRI Unconditional Synthesis with Medical Assessment
by Bernardo Gonçalves, Mariana Silva, Luísa Vieira and Pedro Vieira
BioMedInformatics 2024, 4(2), 1506-1518; https://doi.org/10.3390/biomedinformatics4020082 - 7 Jun 2024
Viewed by 702
Abstract
Current computer vision models require a significant amount of annotated data to improve their performance in a particular task. However, obtaining the required annotated data is challenging, especially in medicine. Hence, data augmentation techniques play a crucial role. In recent years, generative models have been used to create artificial medical images, which have shown promising results. This study aimed to use a state-of-the-art generative model, StyleGAN3, to generate realistic synthetic abdominal magnetic resonance images. These images will be evaluated using quantitative metrics and qualitative assessments by medical professionals. For this purpose, an abdominal MRI dataset acquired at Garcia da Horta Hospital in Almada, Portugal, was used. A subset containing only axial gadolinium-enhanced slices was used to train the model. The obtained Fréchet inception distance value (12.89) aligned with the state of the art, and a medical expert confirmed the significant realism and quality of the images. However, specific issues were identified in the generated images, such as texture variations, visual artefacts and anatomical inconsistencies. Despite these, this work demonstrated that StyleGAN3 is a viable solution to synthesise realistic medical imaging data, particularly in abdominal imaging. Full article
(This article belongs to the Special Issue Advances in Quantitative Imaging Analysis: From Theory to Practice)
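For readers unfamiliar with the metric quoted in this abstract, the Fréchet inception distance is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images. The symbols below are introduced here for illustration and are not defined in the abstract: $\mu_r, \Sigma_r$ denote the feature mean and covariance of the real images, and $\mu_g, \Sigma_g$ those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the two feature distributions are closer, and the distance is zero when the two Gaussians coincide.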

26 pages, 13349 KiB  
Article
Anomaly Detection and Artificial Intelligence Identified the Pathogenic Role of Apoptosis and RELB Proto-Oncogene, NF-kB Subunit in Diffuse Large B-Cell Lymphoma
by Joaquim Carreras and Rifat Hamoudi
BioMedInformatics 2024, 4(2), 1480-1505; https://doi.org/10.3390/biomedinformatics4020081 - 7 Jun 2024
Cited by 2 | Viewed by 1037
Abstract
Background: Diffuse large B-cell lymphoma (DLBCL) is one of the most frequent lymphomas. DLBCL is phenotypically, genetically, and clinically heterogeneous. Aim: We aim to identify new prognostic markers. Methods: We performed anomaly detection analysis, other artificial intelligence techniques, and conventional statistics using gene expression data of 414 patients from the Lymphoma/Leukemia Molecular Profiling Project (GSE10846), and immunohistochemistry in 10 reactive tonsils and 30 DLBCL cases. Results: First, an unsupervised anomaly detection analysis pinpointed outliers (anomalies) in the series, and 12 genes were identified: DPM2, TRAPPC1, HYAL2, TRIM35, NUDT18, TMEM219, CHCHD10, IGFBP7, LAMTOR2, ZNF688, UBL7, and RELB, which belonged to the apoptosis, MAPK, MTOR, and NF-kB pathways. Second, these 12 genes were used to predict overall survival using machine learning, artificial neural networks, and conventional statistics. In a multivariate Cox regression analysis, high expressions of HYAL2 and UBL7 were correlated with poor overall survival, whereas TRAPPC1, IGFBP7, and RELB were correlated with good overall survival (p < 0.01). As a single marker and only in RCHOP-like treated cases, the prognostic value of RELB was confirmed using GSEA analysis and Kaplan–Meier with log-rank test and validated in the TCGA and GSE57611 datasets. Anomaly detection analysis was successfully tested in the GSE31312 and GSE117556 datasets. Using immunohistochemistry, RELB was positive in B-lymphocytes and macrophage/dendritic-like cells, and correlation with HLA DP-DR, SIRPA, CD85A (LILRB3), PD-L1, MARCO, and TOX was explored. Conclusions: Anomaly detection and other bioinformatic techniques successfully predicted the prognosis of DLBCL, and high RELB was associated with a favorable prognosis. Full article
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

23 pages, 631 KiB  
Article
Physiological Data Augmentation for Eye Movement Gaze in Deep Learning
by Alae Eddine El Hmimdi and Zoï Kapoula
BioMedInformatics 2024, 4(2), 1457-1479; https://doi.org/10.3390/biomedinformatics4020080 - 6 Jun 2024
Viewed by 775
Abstract
In this study, the challenges posed by limited annotated medical data in the field of eye movement AI analysis are addressed through the introduction of a novel physiologically based gaze data augmentation library. Unlike traditional augmentation methods, which may introduce artifacts and alter pathological features in medical datasets, the proposed library emulates natural head movements during gaze data collection. This approach enhances sample diversity without compromising authenticity. The library evaluation was conducted on both CNN and hybrid architectures using distinct datasets, demonstrating its effectiveness in regularizing the training process and improving generalization. What is particularly noteworthy is the achievement of a macro F1 score of up to 79% when trained using the proposed augmentation (EMULATE) with the three HTCE variants. This pioneering approach leverages domain-specific knowledge to contribute to the robustness and authenticity of deep learning models in the medical domain. Full article
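The macro F1 score cited in this abstract averages the per-class F1 scores with equal weight, so rare classes count as much as common ones. A minimal sketch of the calculation follows; the class labels and predictions are hypothetical and do not come from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical three-class example: per-class F1 is 0.8, 0.8 and 1.0.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "b", "c"]
score = macro_f1(y_true, y_pred)
```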

16 pages, 1545 KiB  
Review
Unlocking the Future of Drug Development: Generative AI, Digital Twins, and Beyond
by Zamara Mariam, Sarfaraz K. Niazi and Matthias Magoola
BioMedInformatics 2024, 4(2), 1441-1456; https://doi.org/10.3390/biomedinformatics4020079 - 6 Jun 2024
Cited by 1 | Viewed by 962
Abstract
This article delves into the intersection of generative AI and digital twins within drug discovery, exploring their synergistic potential to revolutionize pharmaceutical research and development. Through various instances and examples, we illuminate how generative AI algorithms, capable of simulating vast chemical spaces and predicting molecular properties, are increasingly integrated with digital twins of biological systems to expedite drug discovery. By harnessing the power of computational models and machine learning, researchers can design novel compounds tailored to specific targets, optimize drug candidates, and simulate their behavior within virtual biological environments. This paradigm shift offers unprecedented opportunities for accelerating drug development, reducing costs, and, ultimately, improving patient outcomes. As we navigate this rapidly evolving landscape, collaboration between interdisciplinary teams and continued innovation will be paramount in realizing the promise of generative AI and digital twins in advancing drug discovery. Full article

16 pages, 5283 KiB  
Article
A Study on the Effects of Cementless Total Knee Arthroplasty Implants’ Surface Morphology via Finite Element Analysis
by Peter J. Hunt, Mohammad Noori, Scott J. Hazelwood, Naudereh B. Noori and Wael A. Altabey
BioMedInformatics 2024, 4(2), 1425-1440; https://doi.org/10.3390/biomedinformatics4020078 - 3 Jun 2024
Viewed by 589
Abstract
Total knee arthroplasty (TKA) is one of the most commonly performed orthopedic surgeries, with nearly one million performed in 2020 in the United States alone. Changing patient demographics, predominately indicated by increases in younger, more active, and more obese patients undergoing TKA, pose a challenge to orthopedic surgeons as these factors present a greater risk of long-term complications. Historically, cemented TKA has been the gold standard for fixation, but long-term aseptic loosening continues to be a risk for cemented implants. Cementless TKA, which relies on the surface morphology of a porous coating for biologic fixation of implant to bone, may provide improved long-term survivorship compared with cement. The quality of this bond is dependent on an interference fit and the roughness, or coefficient of friction, between the implant and the bone. Stress shielding is a measure of the difference in the stress experienced by implanted bone versus surrounding native bone. A finite element model (FEM) can be used to quantify and better understand stress shielding in order to better evaluate and optimize implant design. In this study, a FEM was constructed to investigate how the surface coating of cementless implants (coefficient of friction) and the location of the coating application affected the stress-shielding response in the tibia. It was determined that the stress distribution in the native tibia surrounding a cementless TKA implant was dependent on the coefficient of friction applied at the tip of the implant’s stem. Materials with lower friction coefficients applied to the stem tip resulted in higher compressive stress experienced by implanted bone, and more favorable overall stress-shielding responses.

29 pages, 7312 KiB  
Article
Evaluating Ovarian Cancer Chemotherapy Response Using Gene Expression Data and Machine Learning
by Soukaina Amniouel, Keertana Yalamanchili, Sreenidhi Sankararaman and Mohsin Saleet Jafri
BioMedInformatics 2024, 4(2), 1396-1424; https://doi.org/10.3390/biomedinformatics4020077 - 22 May 2024
Viewed by 1206
Abstract
Background: Ovarian cancer (OC) is the most lethal gynecological cancer in the United States. Among the different types of OC, serous ovarian cancer (SOC) stands out as the most prevalent. Transcriptomics techniques generate extensive gene expression data, yet only a few of these genes are relevant to clinical diagnosis. Methods: Methods for feature selection (FS) address the challenges of high dimensionality in extensive datasets. This study proposes a computational framework that applies FS techniques to identify genes highly associated with platinum-based chemotherapy response in SOC patients. Using SOC datasets from the Gene Expression Omnibus (GEO) database, LASSO and varSelRF FS methods were employed. Machine learning classification algorithms such as random forest (RF) and support vector machine (SVM) were also used to evaluate the performance of the models. Results: The proposed framework identified biomarker panels with 9 and 10 genes that are highly correlated with platinum–paclitaxel and platinum-only response in SOC patients, respectively. The predictive models were trained using the identified gene signatures, and an accuracy above 90% was achieved. Conclusions: In this study, we propose that applying multiple feature selection methods not only effectively reduces the number of identified biomarkers, enhancing their biological relevance, but also corroborates the efficacy of drug response prediction models in cancer treatment.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

12 pages, 2693 KiB  
Article
Bioinformatics-Based Identification of Human B-Cell Receptor (BCR) Stimulation-Associated Genes and Putative Promoters
by Ethan Deitcher, Kirk Trisler, Branden S. Moriarity, Caleb J. Bostwick, Fleur A. D. Leenen and Steven R. Deitcher
BioMedInformatics 2024, 4(2), 1384-1395; https://doi.org/10.3390/biomedinformatics4020076 - 20 May 2024
Viewed by 1012
Abstract
Genome engineered B-cells are being developed for chronic, systemic in vivo protein replacement therapies and for localized, tumor cell-actuated anticancer therapeutics. For continuous systemic engineered protein production, expression may be driven by constitutively active promoters. For actuated payload delivery, B-cell conditional expression could be based on transgene alternative splicing or heterologous promoters activated after engineered B-cell receptor (BCR) stimulation. This study used a bioinformatics-based approach to identify putative BCR-stimulated gene promoters. Gene expression data at four timepoints (60, 90, 210, and 390 min) following in vitro BCR stimulation using an anti-IgM antibody in B-cells from six healthy donors were analyzed using R (4.2.2). Differentially upregulated genes were stringently defined as those with adjusted p-value < 0.01 and a log2FoldChange > 1.5. The most upregulated and statistically significant genes were further analyzed to find those with the lowest unstimulated B-cell expression. Of the 46 significantly upregulated genes at 390 min post-BCR stimulation, 6 had average unstimulated expression below the median unstimulated expression at 390 min for all 54,675 gene probes. This bioinformatics-based identification of 6 relatively quiescent genes at baseline that are upregulated by BCR-stimulation (“on-switch”) provides a set of promising promoters for inclusion in future transgene designs and engineered B-cell therapeutics development.
(This article belongs to the Section Applied Biomedical Data Science)
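The two-step selection described in this abstract (a significance and fold-change filter, followed by a baseline-expression filter against the median of all probes) can be sketched as follows. The study reports using R; Python is used here purely for illustration, and the gene names, record layout, and numeric values are hypothetical.

```python
from statistics import median

def upregulated(results, padj_max=0.01, lfc_min=1.5):
    """Genes with adjusted p-value < padj_max and log2FoldChange > lfc_min."""
    return [r["gene"] for r in results
            if r["padj"] is not None
            and r["padj"] < padj_max
            and r["log2FoldChange"] > lfc_min]

def quiescent_at_baseline(results, genes):
    """Keep genes whose unstimulated expression is below the median of all probes."""
    cutoff = median(r["baseline"] for r in results)
    wanted = set(genes)
    return [r["gene"] for r in results
            if r["gene"] in wanted and r["baseline"] < cutoff]

# Hypothetical differential-expression records (not data from the study).
results = [
    {"gene": "GENE_A", "padj": 0.001,  "log2FoldChange": 2.3, "baseline": 2.5},
    {"gene": "GENE_B", "padj": 0.2,    "log2FoldChange": 3.0, "baseline": 3.0},
    {"gene": "GENE_C", "padj": 0.005,  "log2FoldChange": 1.2, "baseline": 2.0},
    {"gene": "GENE_D", "padj": 0.0001, "log2FoldChange": 1.9, "baseline": 9.5},
]

hits = upregulated(results)                        # passes both thresholds
quiescent = quiescent_at_baseline(results, hits)   # low expression when unstimulated
```

In this toy input, GENE_B fails the significance threshold and GENE_C fails the fold-change threshold, while GENE_D is upregulated but already highly expressed at baseline, so only GENE_A would qualify as an "on-switch" candidate.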

21 pages, 1695 KiB  
Communication
The Crucial Role of Interdisciplinary Conferences in Advancing Explainable AI in Healthcare
by Ankush U. Patel, Qiangqiang Gu, Ronda Esper, Danielle Maeser and Nicole Maeser
BioMedInformatics 2024, 4(2), 1363-1383; https://doi.org/10.3390/biomedinformatics4020075 - 17 May 2024
Viewed by 1366
Abstract
As artificial intelligence (AI) integrates within the intersecting domains of healthcare and computational biology, developing interpretable models tailored to medical contexts is met with significant challenges. Explainable AI (XAI) is vital for fostering trust and enabling effective use of AI in healthcare, particularly in image-based specialties such as pathology and radiology where adjunctive AI solutions for diagnostic image analysis are increasingly utilized. Overcoming these challenges necessitates interdisciplinary collaboration, essential for advancing XAI to enhance patient care. This commentary underscores the critical role of interdisciplinary conferences in promoting the necessary cross-disciplinary exchange for XAI innovation. A literature review was conducted to identify key challenges, best practices, and case studies related to interdisciplinary collaboration for XAI in healthcare. The distinctive contributions of specialized conferences in fostering dialogue, driving innovation, and influencing research directions were scrutinized. Best practices and recommendations for fostering collaboration, organizing conferences, and achieving targeted XAI solutions were adapted from the literature. By enabling crucial collaborative junctures that drive XAI progress, interdisciplinary conferences integrate diverse insights to produce new ideas, identify knowledge gaps, crystallize solutions, and spur long-term partnerships that generate high-impact research. Thoughtful structuring of these events, such as including sessions focused on theoretical foundations, real-world applications, and standardized evaluation, along with ample networking opportunities, is key to directing varied expertise toward overcoming core challenges. Successful collaborations depend on building mutual understanding and respect, clear communication, defined roles, and a shared commitment to the ethical development of robust, interpretable models. 
Specialized conferences are essential to shape the future of explainable AI and computational biology, contributing to improved patient outcomes and healthcare innovations. Recognizing the catalytic power of this collaborative model is key to accelerating the innovation and implementation of interpretable AI in medicine. Full article
(This article belongs to the Topic Computational Intelligence and Bioinformatics (CIB))

15 pages, 2325 KiB  
Article
Machine Learning in Allergic Contact Dermatitis: Identifying (Dis)similarities between Polysensitized and Monosensitized Patients
by Aikaterini Kyritsi, Anna Tagka, Alexander Stratigos and Vangelis D. Karalis
BioMedInformatics 2024, 4(2), 1348-1362; https://doi.org/10.3390/biomedinformatics4020074 - 17 May 2024
Viewed by 772
Abstract
Background: Allergic contact dermatitis (ACD) is a delayed hypersensitivity reaction occurring in sensitized individuals due to exposure to allergens. Polysensitization, defined as positive reactions to multiple unrelated haptens, increases the risk of ACD development and affects patients’ quality of life. The aim of this study is to apply machine learning in order to analyze the association between ACD, polysensitization, individual susceptibility, and patients’ characteristics. Methods: Patch test results and demographics from 400 ACD patients (Study protocol Nr. 3765/2022), categorized as polysensitized or monosensitized, were analyzed. Classic statistical analysis and multiple correspondence analysis (MCA) were utilized to explore relationships among variables. Results: The findings revealed significant associations between patient characteristics and ACD patterns, with hand dermatitis showing the strongest correlation. MCA provided insights into the complex interplay of demographic and clinical factors influencing ACD prevalence. Conclusion: Overall, this study highlights the potential of machine learning in unveiling hidden patterns within dermatological data, paving the way for future advancements in the field. Full article
(This article belongs to the Special Issue Editor's Choices Series for Methods in Biomedical Informatics Section)

19 pages, 784 KiB  
Review
A Comprehensive Review of the Impact of Machine Learning and Omics on Rare Neurological Diseases
by Nofe Alganmi
BioMedInformatics 2024, 4(2), 1329-1347; https://doi.org/10.3390/biomedinformatics4020073 - 16 May 2024
Viewed by 1074
Abstract
Background: Rare diseases, predominantly caused by genetic factors and often presenting neurological manifestations, are significantly underrepresented in research. This review addresses the urgent need for advanced research in rare neurological diseases (RNDs), which suffer from a data scarcity and diagnostic challenges. Bridging the gap in RND research is the integration of machine learning (ML) and omics technologies, offering potential insights into the genetic and molecular complexities of these conditions. Methods: We employed a structured search strategy, using a combination of machine learning and omics-related keywords, alongside the names and synonyms of 1840 RNDs as identified by Orphanet. Our inclusion criteria were limited to English language articles that utilized specific ML algorithms in the analysis of omics data related to RNDs. We excluded reviews and animal studies, focusing solely on studies with the clear application of ML in omics data to ensure the relevance and specificity of our research corpus. Results: The structured search revealed the growing use of machine learning algorithms for the discovery of biomarkers and diagnosis of rare neurological diseases (RNDs), with a primary focus on genomics and radiomics because genetic factors and imaging techniques play a crucial role in determining the severity of these diseases. With AI, we can improve diagnosis and mutation detection and develop personalized treatment plans. There are, however, several challenges, including small sample sizes, data heterogeneity, model interpretability, and the need for external validation studies. Conclusions: The sparse knowledge of valid biomarkers, disease pathogenesis, and treatments for rare diseases presents a significant challenge for RND research. 
The integration of omics and machine learning technologies, coupled with collaboration among stakeholders, is essential to develop personalized treatment plans and improve patient outcomes in this critical medical domain. Full article
(This article belongs to the Special Issue Editor's Choices Series for Clinical Informatics Section)

21 pages, 1557 KiB  
Review
Perspectives on Resolving Diagnostic Challenges between Myocardial Infarction and Takotsubo Cardiomyopathy Leveraging Artificial Intelligence
by Serin Moideen Sheriff, Aaftab Sethi, Divyanshi Sood, Sourav Bansal, Aastha Goudel, Manish Murlidhar, Devanshi N. Damani, Kanchan Kulkarni and Shivaram P. Arunachalam
BioMedInformatics 2024, 4(2), 1308-1328; https://doi.org/10.3390/biomedinformatics4020072 - 13 May 2024
Viewed by 953
Abstract
Background: cardiovascular diseases, including acute myocardial infarction (AMI) and takotsubo cardiomyopathy (TTC), are significant causes of morbidity and mortality worldwide. Timely differentiation of these conditions is essential for effective patient management and improved outcomes. Methods: We conducted a review focusing on studies that applied artificial intelligence (AI) techniques to differentiate between acute myocardial infarction (AMI) and takotsubo cardiomyopathy (TTC). Inclusion criteria comprised studies utilizing various AI modalities, such as deep learning, ensemble methods, or other machine learning techniques, for discrimination between AMI and TTC. Additionally, studies employing imaging techniques, including echocardiography, cardiac magnetic resonance imaging, and coronary angiography, for cardiac disease diagnosis were considered. Publications included were limited to those available in peer-reviewed journals. Exclusion criteria were applied to studies not relevant to the discrimination between AMI and TTC, lacking detailed methodology or results pertinent to the AI application in cardiac disease diagnosis, not utilizing AI modalities or relying solely on invasive techniques for differentiation between AMI and TTC, and non-English publications. Results: The strengths and limitations of AI-based approaches are critically evaluated, including factors affecting performance, such as reliability and generalizability. The review delves into challenges associated with model interpretability, ethical implications, patient perspectives, and inconsistent image quality due to manual dependency, highlighting the need for further research. Conclusions: This review article highlights the promising advantages of AI technologies in distinguishing AMI from TTC, enabling early diagnosis and personalized treatments. However, extensive validation and real-world implementation are necessary before integrating AI tools into routine clinical practice. 
It is vital to emphasize that while AI can efficiently assist, it cannot entirely replace physicians. Collaborative efforts among clinicians, researchers, and AI experts are essential to unlock the potential of these transformative technologies fully. Full article
(This article belongs to the Special Issue Computational Biology and Artificial Intelligence in Medicine)

19 pages, 2188 KiB  
Article
IMPI: An Interface for Low-Frequency Point Mutation Identification Exemplified on Resistance Mutations in Chronic Myeloid Leukemia
by Julia Vetter, Jonathan Burghofer, Theodora Malli, Anna M. Lin, Gerald Webersinke, Markus Wiederstein, Stephan M. Winkler and Susanne Schaller
BioMedInformatics 2024, 4(2), 1289-1307; https://doi.org/10.3390/biomedinformatics4020071 - 13 May 2024
Viewed by 707
Abstract
Background: In genomics, highly sensitive point mutation detection is particularly relevant for cancer diagnosis and early relapse detection. Next-generation sequencing combined with unique molecular identifiers (UMIs) is known to improve the mutation detection sensitivity. Methods: We present an open-source bioinformatics framework named Interface for Point Mutation Identification (IMPI) with a graphical user interface (GUI) for processing especially small-scale NGS data to identify variants. IMPI ensures detailed UMI analysis and clustering, as well as initial raw read processing, and consensus sequence building. Furthermore, the effects of custom algorithm and parameter settings for NGS data pre-processing and UMI collapsing (e.g., UMI clustered versus unclustered (raw) reads) can be investigated. Additionally, IMPI implements optimization and quality control methods; an evolution strategy is used for parameter optimization. Results: IMPI was designed, implemented, and tested using BCR::ABL1 fusion gene kinase domain sequencing data. In summary, IMPI enables a detailed analysis of the impact of UMI clustering and parameter setting changes on the measured allele frequencies. Conclusions: Regarding the BCR::ABL1 data, IMPI’s results underlined the need for caution while designing specialized single amplicon NGS approaches due to methodical limitations (e.g., high PCR-mediated recombination rate). This cannot be corrected using UMIs. Full article
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)
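
The UMI collapsing and consensus building mentioned in the abstract above can be illustrated with a minimal sketch. This is not IMPI's implementation: the function names, the `min_family_size` parameter, and the toy reads are hypothetical, and the reads are assumed to be equal-length and already aligned. It only shows why allele frequencies measured on raw reads can differ from those measured on UMI consensus sequences.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2):
    """Collapse reads that share a UMI into one consensus sequence per UMI.

    reads: iterable of (umi, sequence) pairs. Sequences are assumed to be
    equal-length and already aligned (a simplifying assumption).
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to call a consensus
        # Majority vote per position; ties resolve to the base seen first.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def allele_frequency(sequences, position, alt_base):
    """Fraction of sequences carrying alt_base at a 0-based position."""
    sequences = list(sequences)
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if seq[position] == alt_base)
    return hits / len(sequences)


# Hypothetical toy data: (UMI, read sequence).
reads = [
    ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ATGT"),  # one erroneous read
    ("CCG", "ATGT"), ("CCG", "ATGT"),                   # true variant family
    ("TTC", "ACGT"), ("TTC", "ACGT"),
    ("GGA", "ACGT"),                                    # singleton, dropped
]
collapsed = umi_consensus(reads)
raw_af = allele_frequency([seq for _, seq in reads], 1, "T")
umi_af = allele_frequency(collapsed.values(), 1, "T")
```

In this toy case the erroneous read inflates the raw allele frequency, while the consensus of its UMI family votes it out. Errors that affect a whole family, such as the PCR-mediated recombination the abstract warns about, would not be removed by this kind of voting.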

14 pages, 1176 KiB  
Article
Cancer Classification from Gene Expression Using Ensemble Learning with an Influential Feature Selection Technique
by Nusrath Tabassum, Md Abdus Samad Kamal, M. A. H. Akhand and Kou Yamada
BioMedInformatics 2024, 4(2), 1275-1288; https://doi.org/10.3390/biomedinformatics4020070 - 13 May 2024
Viewed by 926
Abstract
Uncontrolled abnormal cell growth, known as cancer, may lead to tumors, immune system deterioration, and other fatal disability. Early cancer identification makes cancer treatment easier and increases the recovery rate, resulting in less mortality. Gene expression data play a crucial role in cancer classification at an early stage. Accurate cancer classification is a complex and challenging task due to the high-dimensional nature of the gene expression data relative to the small sample size. This research proposes using a dimensionality-reduction technique to address this limitation. Specifically, the mutual information (MI) technique is first utilized to select influential biomarker genes. Next, an ensemble learning model is applied to the reduced dataset using only the most influential features (genes) to develop an effective cancer classification model. The bagging method, where the base classifiers are Multilayer Perceptrons (MLPs), is chosen as an ensemble technique. The proposed cancer classification model, the MI-Bagging method, is applied to several benchmark gene expression datasets containing distinctive cancer classes. The cancer classification accuracy of the proposed model is compared with the relevant existing methods. The experimental results indicate that the proposed model outperforms the existing methods, and it is effective and competent for cancer classification despite the limited size of gene expression data with high dimensionality. The highest accuracy achieved by the proposed method demonstrates that the proposed emerging gene-expression-based cancer classifier has the potential to help in cancer treatment and lead to a higher cancer survival rate in the future. Full article
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)
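
The mutual information (MI) selection step described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: expression values are binarized at the (upper) median so that a discrete MI estimate applies, and the gene names, helper names, and toy values are hypothetical. The bagged MLP ensemble that the paper applies to the selected genes is not shown.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


def binarize_at_median(values):
    """Map each value to 1 if it is >= the upper median, else 0 (assumption)."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    return [int(v >= median) for v in values]


def top_genes_by_mi(expression, labels, k):
    """Rank genes by MI with the class labels and keep the k best.

    expression: dict mapping gene name -> list of per-sample values.
    """
    scores = {
        gene: mutual_information(binarize_at_median(values), labels)
        for gene, values in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: six samples, three genes, two classes.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.9, 5.1, 4.8, 5.3],  # separates the classes
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # uninformative
    "gene_c": [1.0, 5.0, 1.1, 5.2, 4.9, 1.2],  # partly informative
}
selected = top_genes_by_mi(expression, labels, k=2)
```

Reducing thousands of genes to a small informative subset before training is what makes the small-sample, high-dimensional setting tractable for the downstream ensemble.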

13 pages, 2093 KiB  
Article
A Smartphone-Based Algorithm for L Test Subtask Segmentation
by Alexis L. McCreath Frangakis, Edward D. Lemaire and Natalie Baddour
BioMedInformatics 2024, 4(2), 1262-1274; https://doi.org/10.3390/biomedinformatics4020069 - 10 May 2024
Cited by 1 | Viewed by 808
Abstract
Background: Subtask segmentation can provide useful information from clinical tests, allowing clinicians to better assess a patient’s mobility status. A new smartphone-based algorithm was developed to segment the L Test of functional mobility into stand-up, sit-down, and turn subtasks. Methods: Twenty-one able-bodied participants each completed five L Test trials, with a smartphone attached to their posterior pelvis. The smartphone used a custom-designed application that collected linear acceleration, gyroscope, and magnetometer data, which were then put into a threshold-based algorithm for subtask segmentation. Results: The algorithm produced good results (>97% accuracy, >98% specificity, >74% sensitivity) for all subtasks. Conclusions: These results were a substantial improvement compared with previously published results for the L Test, as well as similar functional mobility tests. This smartphone-based approach is an accessible method for providing useful metrics from the L Test that can lead to better clinical decision-making. Full article
(This article belongs to the Special Issue Editor's Choices Series for Methods in Biomedical Informatics Section)
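
As a rough illustration of what a threshold-based segmentation step can look like, the sketch below marks candidate turn intervals wherever the magnitude of an angular-velocity signal stays above a threshold for a minimum number of samples. The signal, the threshold value, and the `min_samples` parameter are hypothetical; the published algorithm's actual signals and thresholds are not given in the abstract.

```python
def segment_above_threshold(signal, threshold, min_samples=1):
    """Return (start, end) index pairs, end exclusive, where |signal| > threshold.

    Runs shorter than min_samples are discarded as noise.
    """
    segments = []
    start = None
    for i, value in enumerate(signal):
        if abs(value) > threshold:
            if start is None:
                start = i  # a new run begins
        elif start is not None:
            if i - start >= min_samples:
                segments.append((start, i))
            start = None
    # Close a run that is still open at the end of the signal.
    if start is not None and len(signal) - start >= min_samples:
        segments.append((start, len(signal)))
    return segments


# Hypothetical yaw-rate samples (rad/s) from a pelvis-mounted phone.
yaw_rate = [0.0, 0.1, 0.9, 1.2, 1.1, 0.2, 0.0, 0.7, 0.1, -1.0, -1.3, -0.9]
turns = segment_above_threshold(yaw_rate, threshold=0.5, min_samples=2)
```

Here the single-sample spike at index 7 is rejected, and the two sustained rotations are kept as separate segments regardless of turn direction.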

13 pages, 1558 KiB  
Article
ConsensusPrime—A Bioinformatic Pipeline for Efficient Consensus Primer Design—Detection of Various Resistance and Virulence Factors in MRSA—A Case Study
by Maximilian Collatz, Martin Reinicke, Celia Diezel, Sascha D. Braun, Stefan Monecke, Annett Reissig and Ralf Ehricht
BioMedInformatics 2024, 4(2), 1249-1261; https://doi.org/10.3390/biomedinformatics4020068 - 10 May 2024
Viewed by 1035
Abstract
Background: The effectiveness and reliability of diagnostic tests that detect DNA sequences largely hinge on the quality of the primers and probes used. This importance is especially evident when considering the specific sample being analyzed, as it affects the molecular background and potential for cross-reactivity, ultimately determining the test’s performance. Methods: Predicting primers based on the consensus sequence of the target has multiple advantages, including high specificity, diagnostic reliability, broad applicability, and long-term validity. Automated curation of the input sequences ensures high-quality primers and probes. Results: Here, we present a use case for developing a set of consensus primers and probes to identify antibiotic resistance and virulence genes in Staphylococcus (S.) aureus using the ConsensusPrime pipeline. Extensive qPCR experiments with several S. aureus strains confirm the exceptional quality of the primers designed using the pipeline. Conclusions: By improving the quality of the input sequences and using the consensus sequence as a basis, the ConsensusPrime pipeline ensures high-quality primers and probes, which should be the basis of molecular assays. Full article

24 pages, 1113 KiB  
Review
Current Applications of Artificial Intelligence in the Neonatal Intensive Care Unit
by Dimitrios Rallis, Maria Baltogianni, Konstantina Kapetaniou and Vasileios Giapros
BioMedInformatics 2024, 4(2), 1225-1248; https://doi.org/10.3390/biomedinformatics4020067 - 9 May 2024
Viewed by 1590
Abstract
Artificial intelligence (AI) refers to computer algorithms that replicate the cognitive function of humans. Machine learning is widely applicable using structured and unstructured data, while deep learning is derived from the neural networks of the human brain that process and interpret information. During the last decades, AI has been introduced in several aspects of healthcare. In this review, we aim to present the current application of AI in the neonatal intensive care unit. AI-based models have been applied to neurocritical care, including automated seizure detection algorithms and electroencephalogram-based hypoxic-ischemic encephalopathy severity grading systems. Moreover, AI models evaluating magnetic resonance imaging contributed to the progress of the evaluation of the neonatal developing brain and the understanding of how prenatal events affect both structural and functional network topologies. Furthermore, AI algorithms have been applied to predict the development of bronchopulmonary dysplasia and assess the extubation readiness of preterm neonates. Automated models have been also used for the detection of retinopathy of prematurity and the need for treatment. Among others, AI algorithms have been utilized for the detection of sepsis, the need for patent ductus arteriosus treatment, the evaluation of jaundice, and the detection of gastrointestinal morbidities. Finally, AI prediction models have been constructed for the evaluation of the neurodevelopmental outcome and the overall mortality of neonates. Although the application of AI in neonatology is encouraging, further research in AI models is warranted in the future including retraining clinical trials, validating the outcomes, and addressing serious ethics issues. Full article
(This article belongs to the Special Issue Editor-in-Chief's Choices in Biomedical Informatics)

23 pages, 6506 KiB  
Article
Selection of the Discriming Feature Using the BEMD’s BIMF for Classification of Breast Cancer Mammography Image
by Fatima Ghazi, Aziza Benkuider, Fouad Ayoub and Khalil Ibrahimi
BioMedInformatics 2024, 4(2), 1202-1224; https://doi.org/10.3390/biomedinformatics4020066 - 9 May 2024
Viewed by 894
Abstract
Mammogram images are useful in identifying diseases such as breast cancer, which is one of the deadliest cancers affecting adult women around the world. Computational image analysis and machine learning techniques can help experts identify abnormalities in these images. In this work, we present a new system to help diagnose and analyze breast mammogram images. The system uses a method for the Selection of the Most Discriminant Attributes of images preprocessed by BEMD (“SMDA-BEMD”), which entails picking the most pertinent traits from the collection of variables that characterize the state under study. Attribute reduction is based on a transformation of the data, also called feature extraction: Haralick attributes are extracted from the co-occurrence matrices (“GLCM”), and the initial set of data is replaced by a new, reduced set constructed from the initial set of features extracted from images decomposed using Bidimensional Empirical Multimodal Decomposition (“BEMD”), in order to discriminate between healthy and pathological breast mammogram images. This decomposition splits an image into several Bidimensional Intrinsic Mode Functions (“BIMFs”) and a residue. The results show that mammographic images can be represented in a relatively small feature space by selecting the most discriminating features with a supervised method, so that healthy and pathological mammographic images can be differentiated with high reliability. Certain aspects and findings also demonstrate how successful the suggested strategy is at detecting the tumor. The BEMD technique is used as preprocessing on the mammographic images. The proposed methodology yields consistent results and establishes the discrimination threshold for mammography images (healthy and pathological); the classification rate is improved (98.6%) compared to existing cutting-edge techniques in the field. 
This approach is tested and validated on mammographic medical images from the Kenitra-Morocco reproductive health reference center (CRSRKM) which contains breast mammographic images of normal and pathological cases. Full article
(This article belongs to the Special Issue Feature Papers on Methods in Biomedical Informatics)

28 pages, 4958 KiB  
Article
Diagnostic Tool for Early Detection of Rheumatic Disorders Using Machine Learning Algorithm and Predictive Models
by Godfrey A. Mills, Dzifa Dey, Mohammed Kassim, Aminu Yiwere and Kenneth Broni
BioMedInformatics 2024, 4(2), 1174-1201; https://doi.org/10.3390/biomedinformatics4020065 - 8 May 2024
Cited by 1 | Viewed by 1019
Abstract
Background: Rheumatic diseases are chronic diseases that affect joints, tendons, ligaments, bones, muscles, and other vital organs. Detection of rheumatic diseases is a complex process that requires careful analysis of heterogeneous content from clinical examinations, patient history, and laboratory investigations. Machine learning techniques have made it possible to integrate such techniques into the complex diagnostic process to identify inherent features that lead to disease formation, development, and progression for remedial measures. Methods: An automated diagnostic tool using a multilayer neural network computational engine is presented to detect rheumatic disorders and the type of underlying disorder for therapeutic strategies. The rheumatic disorders considered are rheumatoid arthritis, osteoarthritis, and systemic lupus erythematosus. The detection system was trained and tested using 70% and 30%, respectively, of a labelled synthetic dataset of 100,000 records containing both single and multiple disorders. Results: The detection system was able to detect and predict underlying disorders with an accuracy of 97.48%, a sensitivity of 96.80%, and a specificity of 97.50%. Conclusion: The good performance suggests that this solution is robust enough and can be implemented for screening patients for intervention measures. This is a much-needed solution in environments with limited specialists, as the solution promotes task-shifting from the specialist level to primary healthcare physicians. Full article
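
For readers unfamiliar with the three figures reported in the abstract above, the sketch below shows how accuracy, sensitivity, and specificity are derived from binary predictions. The labels and predictions are made-up toy values, not the study's data, and the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives), and specificity
    (recall on negatives) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical toy labels: 1 = disorder present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity alongside accuracy matters for screening tools, because accuracy alone can look high on imbalanced data even when many true cases are missed.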

19 pages, 921 KiB  
Review
An Overview of Approaches and Methods for the Cognitive Workload Estimation in Human–Machine Interaction Scenarios through Wearables Sensors
by Sabrina Iarlori, David Perpetuini, Michele Tritto, Daniela Cardone, Alessandro Tiberio, Manish Chinthakindi, Chiara Filippini, Luca Cavanini, Alessandro Freddi, Francesco Ferracuti, Arcangelo Merla and Andrea Monteriù
BioMedInformatics 2024, 4(2), 1155-1173; https://doi.org/10.3390/biomedinformatics4020064 - 7 May 2024
Cited by 1 | Viewed by 900
Abstract
Background: Human-Machine Interaction (HMI) has been an important field of research in recent years, since machines will continue to be embedded in many human activities in several contexts, such as industry and healthcare. Monitoring, in an ecological manner, the cognitive workload (CW) of users who interact with machines is crucial to assess their level of engagement in activities and the required effort, with the goal of preventing stressful circumstances. This study provides a comprehensive analysis of the assessment of CW using wearable sensors in HMI. Methods: This narrative review explores several techniques and procedures for collecting physiological data through wearable sensors, with the possibility of integrating these multiple physiological signals to provide multimodal monitoring of the individuals’ CW. Finally, it focuses on the impact of artificial intelligence methods in the analysis of physiological signal data to provide models of the CW to be exploited in HMI. Results: The review provided a comprehensive evaluation of the wearables, physiological signals, and methods of data analysis for CW evaluation in HMI. Conclusion: The literature highlighted the feasibility of employing wearable sensors to collect physiological signals for ecological CW monitoring in HMI scenarios. However, challenges remain in standardizing these measures across different populations and contexts. Full article
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

11 pages, 3190 KiB  
Article
Assaying and Classifying T Cell Function by Cell Morphology
by Xin Wang, Stacey M. Fernandes, Jennifer R. Brown and Lance C. Kam
BioMedInformatics 2024, 4(2), 1144-1154; https://doi.org/10.3390/biomedinformatics4020063 - 26 Apr 2024
Viewed by 1043
Abstract
Immune cell function varies tremendously between individuals, posing a major challenge to emerging cellular immunotherapies. This report pursues the use of cell morphology as an indicator of high-level T cell function. Short-term spreading of T cells on planar, elastic surfaces was quantified by 11 morphological parameters and analyzed to identify effects of both intrinsic and extrinsic factors. Our findings identified morphological features that varied between T cells isolated from healthy donors and those from patients being treated for Chronic Lymphocytic Leukemia (CLL). This approach also identified differences between cell responses to substrates of different elastic modulus. Combining multiple features through a machine learning approach such as Decision Tree or Random Forest provided an effective means for identifying whether T cells came from healthy or CLL donors. Further development of this approach could lead to a rapid assay of T cell function to guide cellular immunotherapy. Full article
(This article belongs to the Special Issue Editor's Choices Series for Methods in Biomedical Informatics Section)

47 pages, 1335 KiB  
Review
Recent Advances in Large Language Models for Healthcare
by Khalid Nassiri and Moulay A. Akhloufi
BioMedInformatics 2024, 4(2), 1097-1143; https://doi.org/10.3390/biomedinformatics4020062 - 16 Apr 2024
Cited by 3 | Viewed by 4431
Abstract
Recent advances in the field of large language models (LLMs) underline their high potential for applications in a variety of sectors. Their use in healthcare, in particular, holds out promising prospects for improving medical practices. As we highlight in this paper, LLMs have demonstrated remarkable capabilities in language understanding and generation that could indeed be put to good use in the medical field. We also present the main architectures of these models, such as GPT, Bloom, or LLaMA, composed of billions of parameters. We then examine recent trends in the medical datasets used to train these models. We classify them according to different criteria, such as size, source, or subject (patient records, scientific articles, etc.). We mention that LLMs could help improve patient care, accelerate medical research, and optimize the efficiency of healthcare systems such as assisted diagnosis. We also highlight several technical and ethical issues that need to be resolved before LLMs can be used extensively in the medical field. Consequently, we propose a discussion of the capabilities offered by new generations of linguistic models and their limitations when deployed in a domain such as healthcare. Full article
(This article belongs to the Special Issue Feature Papers in Clinical Informatics Section)

12 pages, 6504 KiB  
Project Report
Investigating the Effectiveness of an IMU Portable Gait Analysis Device: An Application for Parkinson’s Disease Management
by Nikos Tsotsolas, Eleni Koutsouraki, Aspasia Antonakaki, Stefanos Pizanias, Marios Kounelis, Dimitrios D. Piromalis, Dimitrios P. Kolovos, Christos Kokkotis, Themistoklis Tsatalas, George Bellis, Dimitrios Tsaopoulos, Paris Papaggelos, George Sidiropoulos and Giannis Giakas
BioMedInformatics 2024, 4(2), 1085-1096; https://doi.org/10.3390/biomedinformatics4020061 - 10 Apr 2024
Viewed by 699
Abstract
As part of two research projects, a small gait analysis device was developed for use inside and outside the home by patients themselves. The project PARMODE aims to record accurate gait measurements in patients with Parkinson’s disease (PD) and proceed with an in-depth analysis of the gait characteristics, while the project CPWATCHER aims to assess the quality of hand movement in cerebral palsy patients. The device was mainly developed to serve the first project with additional offline processing, including machine learning algorithms that could potentially be used for the second aim. A key feature of the device is its small size (36 mm × 46 mm × 16 mm, weight: 14 g), which was designed to meet specific requirements in terms of device consumption restrictions due to the small size of the battery and the need for autonomous operation for more than ten hours. This research work describes, on the one hand, the new device with an emphasis on its functions, and on the other hand, its connection with a web platform for reading and processing data from the devices placed on patients’ feet to record the gait characteristics of patients on a continuous basis. Full article

14 pages, 3102 KiB  
Article
Analyzing Patterns of Service Utilization Using Graph Topology to Understand the Dynamic of the Engagement of Patients with Complex Problems with Health Services
by Jonas Bambi, Yudi Santoso, Ken Moselle, Stan Robertson, Abraham Rudnick, Ernie Chang and Alex Kuo
BioMedInformatics 2024, 4(2), 1071-1084; https://doi.org/10.3390/biomedinformatics4020060 - 9 Apr 2024
Cited by 3 | Viewed by 822
Abstract
Background: Providing care to persons with complex problems is inherently difficult due to several factors, including the impacts of proximal determinants of health, treatment response, the natural emergence of comorbidities, and service system capacity to provide timely required services. Providing visibility into the dynamics of patients’ engagement can help to optimize care for patients with complex problems. Method: In a previous work, graph machine learning and NLP methods were used to model the products of service system dynamics as atemporal entities, using a data model that collapsed patient encounter events across time. In this paper, the order of events is put back into the data model to provide topological depictions of the dynamics that are embodied in patients’ movement across a complex healthcare system. Result: The results show that directed graphs are well suited to the task of depicting the way that the diverse components of the system are functionally coupled—or remain disconnected—by patient journeys. Conclusion: By setting the resolution on the graph topology visualization, important characteristics can be highlighted, including highly prevalent repeating sequences of service events readily interpretable by clinical subject matter experts. Moreover, this methodology provides a first step in addressing the challenge of locating potential operational problems for patients with complex issues engaging with a complex healthcare service system. Full article
(This article belongs to the Special Issue Feature Papers in Clinical Informatics Section)
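
As a rough illustration of the directed-graph idea described in the abstract above, the sketch below counts ordered transitions between service events and then filters them by frequency. It is not the authors' method: the function names `transition_counts` and `prune`, the `min_count` threshold (standing in for the "resolution" setting), and the service labels are all hypothetical.

```python
from collections import Counter


def transition_counts(journeys):
    """Count directed edges (service A -> service B) across ordered journeys."""
    edges = Counter()
    for events in journeys:
        # Pair each event with its successor so temporal order is preserved.
        for src, dst in zip(events, events[1:]):
            edges[(src, dst)] += 1
    return edges


def prune(edges, min_count):
    """Keep only edges seen at least min_count times (a crude resolution setting)."""
    return {edge: n for edge, n in edges.items() if n >= min_count}


# Hypothetical journeys; the service names are illustrative, not from the paper.
journeys = [
    ["ED", "Inpatient", "Community"],
    ["ED", "Inpatient", "ED"],
    ["Community", "ED", "Inpatient"],
]
edges = transition_counts(journeys)
frequent = prune(edges, min_count=2)
```

Because edges are keyed by ordered pairs, `("ED", "Inpatient")` and `("Inpatient", "ED")` are counted separately, which is what distinguishes a directed depiction from the earlier atemporal one.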

12 pages, 7183 KiB  
Article
RETRACTED: Utilizing Generative Adversarial Networks for Acne Dataset Generation in Dermatology
by Aravinthan Sankar, Kunal Chaturvedi, Al-Akhir Nayan, Mohammad Hesam Hesamian, Ali Braytee and Mukesh Prasad
BioMedInformatics 2024, 4(2), 1059-1070; https://doi.org/10.3390/biomedinformatics4020059 - 9 Apr 2024
Cited by 2 | Viewed by 1491 | Retraction
Abstract
Background: In recent years, computer-aided diagnosis for skin conditions has made significant strides, primarily driven by artificial intelligence (AI) solutions. Despite this progress, the efficiency of AI-enabled systems remains hindered by the scarcity of high-quality, large-scale datasets, primarily due to privacy concerns. Methods: This research circumvents privacy issues associated with real-world acne datasets by creating a synthetic dataset of human faces with varying acne severity levels (mild, moderate, and severe) using Generative Adversarial Networks (GANs). Three object detection models (YOLOv5, YOLOv8, and Detectron2) are then used to evaluate the efficacy of the augmented dataset for detecting acne. Results: With StyleGAN-generated images integrated into training, the models achieve mean average precision (mAP) scores of 73.5% for YOLOv5, 73.6% for YOLOv8, and 37.7% for Detectron2. These scores surpass the mAP achieved without GANs. Conclusions: This study underscores the effectiveness of GANs in generating synthetic facial acne images and emphasizes the importance of utilizing GANs and convolutional neural network (CNN) models for accurate acne detection.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

12 pages, 4488 KiB  
Article
A Comprehensive Analysis of Trapezius Muscle EMG Activity in Relation to Stress and Meditation
by Mohammad Ahmed, Michael Grillo, Amirtaha Taebi, Mehmet Kaya and Peshala Thibbotuwawa Gamage
BioMedInformatics 2024, 4(2), 1047-1058; https://doi.org/10.3390/biomedinformatics4020058 - 9 Apr 2024
Viewed by 1094
Abstract
Introduction: This study analyzes the efficacy of trapezius muscle electromyography (EMG) in discerning mental states, namely stress and meditation. Methods: Fifteen healthy participants were monitored to assess their physiological responses to mental stressors and meditation. Sensors were affixed to both the right and left trapezius muscles to capture EMG signals, while simultaneous electroencephalography (EEG) was conducted to validate cognitive states. Results: Our analysis of various EMG features, considering frequency ranges and sensor positioning, revealed significant changes in trapezius muscle activity during stress and meditation. Notably, low-frequency EMG features facilitated enhanced stress detection. For accurate stress identification, sensor configurations can be limited to the right trapezius muscle. Furthermore, a novel method for determining asymmetry in EMG features suggests that placing sensors on both trapezius muscles can improve the detection of mental states. Conclusion: This research presents a promising avenue for efficient cognitive state monitoring through compact and convenient sensing.
(This article belongs to the Special Issue Editor's Choices Series for Clinical Informatics Section)

28 pages, 2543 KiB  
Article
Quantifying Inhaled Concentrations of Particulate Matter, Carbon Dioxide, Nitrogen Dioxide, and Nitric Oxide Using Observed Biometric Responses with Machine Learning
by Shisir Ruwali, Shawhin Talebi, Ashen Fernando, Lakitha O. H. Wijeratne, John Waczak, Prabuddha M. H. Dewage, David J. Lary, John Sadler, Tatiana Lary, Matthew Lary and Adam Aker
BioMedInformatics 2024, 4(2), 1019-1046; https://doi.org/10.3390/biomedinformatics4020057 - 3 Apr 2024
Viewed by 1498
Abstract
Introduction: Air pollution has numerous impacts on human health on a variety of time scales. Pollutants such as particulate matter (PM1 and PM2.5), carbon dioxide (CO2), nitrogen dioxide (NO2), and nitric oxide (NO) are exemplars of the wider human exposome. In this study, we adopted a unique approach by utilizing the responses of human autonomic systems to gauge the abundance of pollutants in inhaled air. Objective: To investigate how the human body autonomically responds to inhaled pollutants in microenvironments, including PM1, PM2.5, CO2, NO2, and NO, on small temporal and spatial scales by making use of biometric observations of the human autonomic response, and to test the accuracy of predicting the concentrations of these pollutants from biological measurements of the participants. Methodology: Two experimental approaches with a similar methodology were compared. Each employed a biometric suite to capture the physiological responses of cyclists, and multiple sensors were used to measure the pollutants in the air surrounding them. Machine learning algorithms were used to estimate the levels of these pollutants and decipher the body’s automatic reactions to them. Results: We observed high precision in predicting PM1, PM2.5, and CO2 using a limited set of biometrics measured from the participants, with coefficients of determination (R²) between the estimated and true values of 0.99, 0.96, and 0.98, respectively. Although the predictions for NO2 and NO were qualitatively reliable at lower concentrations, the precision varied throughout the data range. Skin temperature, heart rate, and respiration rate were the common physiological responses that were the most influential in predicting the concentration of these pollutants.
Conclusion: Biometric measurements can be used to estimate air quality components such as PM1, PM2.5, and CO2 with high degrees of accuracy and can also be used to decipher the effect of these pollutants on the human body using machine learning techniques. The results for NO2 and NO suggest that more comprehensive data collection or more advanced machine learning techniques are needed to improve the models for these two pollutants.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)
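
For readers unfamiliar with the coefficient of determination quoted in the abstract above, the following minimal sketch shows how R² is computed from true and estimated values. The function name `r_squared` and the sample numbers are hypothetical and are not taken from the paper's data or code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the true values around their mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical concentrations (arbitrary units), for illustration only.
true_values = [1.0, 2.0, 3.0]
perfect = r_squared(true_values, [1.0, 2.0, 3.0])
mean_only = r_squared(true_values, [2.0, 2.0, 2.0])
```

A model that reproduces every true value scores 1, while one that always predicts the mean scores 0, so values such as 0.99 indicate that nearly all of the variation in the measured concentration is captured by the estimates.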