BioMedInformatics, Volume 3, Issue 4 (December 2023) – 20 articles

Cover Story: Human immunoglobulin allotypes are allelic antigenic determinants (or ‘markers’) that are determined serologically on human immunoglobulin (IG) or antibody heavy and light chains. These allotypes have been identified on gamma1, gamma2, gamma3 and alpha2 heavy chains (G1m, G2m, G3m and A2m allotypes, respectively) and on kappa light chain (Km allotypes). They represent a major system for understanding the immunogenicity of polymorphic IG chains in relation to amino acid and conformational changes. WHO/IMGT allotype nomenclature and the IMGT unique numbering for constant (C) domain, with the IMGT Collier de Perles graphical representation, bridge Gm-Am and Km alleles to IGHC and IGKC gene alleles and structures and, by definition, to IG chain immunogenicity, enabling the immunoinformatics of personalized therapeutic antibodies and engineered variants.
19 pages, 2823 KiB  
Article
Optimized FIR Filter Using Genetic Algorithms: A Case Study of ECG Signals Filter Optimization
by Houssam Hamici, Awos Kanan and Khalid Al-hammuri
BioMedInformatics 2023, 3(4), 1197-1215; https://doi.org/10.3390/biomedinformatics3040071 - 8 Dec 2023
Viewed by 828
Abstract
Advances in technology and the availability of specialized digital signal processing chips have made digital filter design and implementation more feasible in a variety of fields, including biomedical engineering. This paper makes two key contributions. First, it uses a genetic algorithm to optimize the coefficients of finite impulse response (FIR) filters. Second, it presents a case study on using genetic algorithms to optimize FIR filters for noise removal from electrocardiogram (ECG) signals. The goal of the proposed design approach is to achieve the desired signal bandwidth while minimizing the side lobe level and eliminating unwanted signals. The results of a comprehensive analysis show that the genetic algorithm-based filter is more effective than conventional filter designs in terms of noise removal efficiency.
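The genetic-algorithm idea described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the ideal low-pass target, the squared-error fitness, the elitist selection scheme, and every name and parameter (`evolve_fir`, `num_taps`, `cutoff`, `mutation_scale`, and so on) are assumptions made for the example.

```python
import cmath
import math
import random


def magnitude_response(coeffs, omega):
    """Return |H(e^{j*omega})| for an FIR filter with the given taps."""
    return abs(sum(h * cmath.exp(-1j * omega * n) for n, h in enumerate(coeffs)))


def fitness(coeffs, cutoff, grid):
    """Negative squared error against an ideal low-pass response (assumed target)."""
    error = 0.0
    for omega in grid:
        desired = 1.0 if omega <= cutoff else 0.0
        error += (magnitude_response(coeffs, omega) - desired) ** 2
    return -error  # higher is better


def evolve_fir(num_taps=15, cutoff=0.3 * math.pi, pop_size=40,
               generations=30, mutation_scale=0.05, seed=0):
    """Evolve FIR coefficients; returns (best_coeffs, best_fitness_per_generation)."""
    rng = random.Random(seed)
    grid = [math.pi * k / 31 for k in range(32)]  # frequencies in [0, pi]
    population = [[rng.uniform(-0.5, 0.5) for _ in range(num_taps)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank candidates by fitness and keep the top quarter unchanged (elitism).
        population.sort(key=lambda c: fitness(c, cutoff, grid), reverse=True)
        history.append(fitness(population[0], cutoff, grid))
        elite = population[: pop_size // 4]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            point = rng.randrange(1, num_taps)  # single-point crossover
            child = a[:point] + b[point:]
            # Gaussian mutation on every tap.
            children.append([h + rng.gauss(0.0, mutation_scale) for h in child])
        population = elite + children
    best = max(population, key=lambda c: fitness(c, cutoff, grid))
    return best, history


if __name__ == "__main__":
    taps, scores = evolve_fir()
    print("first and last best fitness:", scores[0], scores[-1])
```

Because the elite candidates are carried over unchanged, the best fitness recorded per generation never decreases. A side-lobe penalty, as mentioned in the abstract, could be added as an extra term in `fitness`.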

19 pages, 2539 KiB  
Review
Transforming Drug Design: Innovations in Computer-Aided Discovery for Biosimilar Agents
by Shadi Askari, Alireza Ghofrani and Hamed Taherdoost
BioMedInformatics 2023, 3(4), 1178-1196; https://doi.org/10.3390/biomedinformatics3040070 - 8 Dec 2023
Cited by 1 | Viewed by 1226
Abstract
In pharmaceutical research and development, the pursuit of novel therapeutics and the optimization of existing drugs have been revolutionized by the fusion of cutting-edge technologies and computational methodologies. Over the past few decades, the field of drug design has undergone a remarkable transformation, catalyzed by the rapid advancement of computer-aided discovery techniques and the emergence of biosimilar agents. This interplay between scientific innovation and technology has expedited the drug discovery process and paved the way for more targeted, effective, and personalized treatment approaches. This review investigates how computer-aided discovery techniques for biosimilar agents are reshaping drug design. It examines how computational methods expedite drug candidate identification and explores the rise of cost-effective biosimilars as alternatives to biologics. Through this analysis, this study highlights the potential of these innovations to enhance the efficiency and accessibility of pharmaceutical development. It represents a pioneering effort to examine how computer-aided discovery is revolutionizing biosimilar agent development, exploring its applications, challenges, and prospects.

33 pages, 1697 KiB  
Review
Genomics for Emerging Pathogen Identification and Monitoring: Prospects and Obstacles
by Vishakha Vashisht, Ashutosh Vashisht, Ashis K. Mondal, Jaspreet Farmaha, Ahmet Alptekin, Harmanpreet Singh, Pankaj Ahluwalia, Anaka Srinivas and Ravindra Kolhe
BioMedInformatics 2023, 3(4), 1145-1177; https://doi.org/10.3390/biomedinformatics3040069 - 7 Dec 2023
Cited by 2 | Viewed by 2858
Abstract
Emerging infectious diseases (EIDs) pose an increasingly significant global burden, driven by urbanization, population explosion, global travel, changes in human behavior, and inadequate public health systems. The recent SARS-CoV-2 pandemic highlights the urgent need for innovative and robust technologies to effectively monitor newly emerging pathogens. Rapid identification, epidemiological surveillance, and transmission mitigation are crucial challenges for ensuring public health safety. Genomics has emerged as a pivotal tool in public health during pandemics, enabling the diagnosis, management, and prediction of infections, as well as the analysis and identification of cross-species interactions and the categorization of infectious agents. Recent advancements in high-throughput DNA sequencing tools have facilitated rapid and precise identification and characterization of emerging pathogens. This review article provides insights into the latest advances in various genomic techniques for pathogen detection and tracking and their applications in global outbreak surveillance. We assess methods that leverage pathogen sequences and explore the role of genomic analysis in understanding the epidemiology of newly emerged infectious diseases. Additionally, we address technical challenges and limitations as well as ethical and legal considerations, and highlight opportunities for integrating genomics with other surveillance approaches. By delving into the prospects and obstacles of genomics, we can gain valuable insights into its role in mitigating the threats posed by emerging pathogens and improving global preparedness in the face of future outbreaks.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

21 pages, 6951 KiB  
Article
Enhancing Brain Tumor Classification with Transfer Learning across Multiple Classes: An In-Depth Analysis
by Syed Ahmmed, Prajoy Podder, M. Rubaiyat Hossain Mondal, S M Atikur Rahman, Somasundar Kannan, Md Junayed Hasan, Ali Rohan and Alexander E. Prosvirin
BioMedInformatics 2023, 3(4), 1124-1144; https://doi.org/10.3390/biomedinformatics3040068 - 6 Dec 2023
Cited by 4 | Viewed by 1771
Abstract
This study focuses on leveraging data-driven techniques to diagnose brain tumors from magnetic resonance imaging (MRI) images. Drawing on deep learning (DL), we introduce and fine-tune two robust frameworks, ResNet 50 and Inception V3, for the classification of brain MRI images. Building upon the previous success of ResNet 50 and Inception V3 in classifying other medical imaging datasets, our investigation encompasses datasets with distinct characteristics, including one with four classes and another with two. The primary contribution of our research lies in the meticulous curation of these paired datasets. We have also integrated essential techniques, including Early Stopping and ReduceLROnPlateau, to refine the model through hyperparameter optimization. This involved adding extra layers, experimenting with various loss functions and learning rates, and incorporating dropout layers and regularization to ensure model convergence in predictions. Furthermore, strategic enhancements, such as customized pooling and regularization layers, have significantly elevated the accuracy of our models. Notably, the pairing of ResNet 50 with the Nadam optimizer yields accuracy rates of 99.34% for gliomas, 93.52% for meningiomas, 98.68% for non-tumorous images, and 97.70% for pituitary tumors, with an aggregate testing accuracy of 97.68% for these four distinct classes. On the two-class dataset, ResNet 50 with the Adam optimizer excels, demonstrating better precision, recall, and F1 score, and an overall accuracy of 99.84%. Moreover, it attains per-class accuracies of 99.62% for ‘Tumor Positive’ and 100% for ‘Tumor Negative’, underscoring a remarkable advancement in brain tumor categorization. This research underscores the innovative possibilities of DL models and our specialized optimization methods in the domain of diagnosing brain cancer from MRI images.
(This article belongs to the Special Issue Advances in Quantitative Imaging Analysis: From Theory to Practice)
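The two training controls named in the abstract above, Early Stopping and ReduceLROnPlateau, can be illustrated with a small simulation. The sketch below is a simplified re-implementation of the general idea in plain Python, not the Keras callbacks themselves and not the authors' configuration; the function name `simulate_training_schedule` and all parameter values are assumptions for illustration.

```python
def simulate_training_schedule(val_losses, lr=0.1, lr_factor=0.5,
                               lr_patience=2, stop_patience=4, min_lr=1e-6):
    """Replay a list of validation losses and apply two plateau rules.

    Returns (last_epoch_run, learning_rate_used_at_each_epoch).
    """
    best = float("inf")
    since_best = 0     # epochs since the best loss improved (early stopping)
    since_lr_cut = 0   # non-improving epochs since the last learning-rate cut
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)
        if loss < best:
            best = loss
            since_best = 0
            since_lr_cut = 0
            continue
        since_best += 1
        since_lr_cut += 1
        if since_lr_cut >= lr_patience:
            # Plateau detected: shrink the learning rate, but not below min_lr.
            lr = max(lr * lr_factor, min_lr)
            since_lr_cut = 0
        if since_best >= stop_patience:
            # No improvement for stop_patience epochs: stop training here.
            return epoch, lrs
    return len(val_losses) - 1, lrs


if __name__ == "__main__":
    losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75]
    print(simulate_training_schedule(losses))
```

In this replay the loss stops improving after the third epoch, so the learning rate is halved after two stagnant epochs and training halts after four.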

12 pages, 921 KiB  
Case Report
Avatar Intervention for Cannabis Use Disorder in a Patient with Schizoaffective Disorder: A Case Report
by Sabrina Giguère, Laura Dellazizzo, Mélissa Beaudoin, Marie-Andrée Lapierre, Marie Villeneuve, Kingsada Phraxayavong, Stéphane Potvin and Alexandre Dumais
BioMedInformatics 2023, 3(4), 1112-1123; https://doi.org/10.3390/biomedinformatics3040067 - 6 Dec 2023
Viewed by 784
Abstract
Considering the harmful effects of cannabis on individuals with a severe mental disorder and the limited effectiveness of current interventions, this case report showcases the beneficial results of a 10-session Avatar intervention for cannabis use disorder (CUD) in a polysubstance user with comorbid schizoaffective disorder. Virtual reality allowed the creation of an Avatar representing a person significantly related to the patient’s drug use. Avatar intervention for CUD aims to combine exposure, relational, and cognitive behavioral therapies while practicing real-life situations and learning how to manage negative emotions and cravings. Throughout therapy and afterwards, Mr. C managed to maintain abstinence from all substances. An improvement in the severity of CUD, as well as a greater motivation to change consumption, was also observed after therapy. As observed by his mother, his psychiatrist, and himself, the benefits of Avatar intervention for CUD extended to other spheres of his life. The drastic results observed in this patient suggest a promising alternative to the current treatments available for people with a dual diagnosis of cannabis use disorder and psychotic disorder, which generally lack effectiveness. A single-blind randomized controlled trial comparing the treatment with a classical intervention is currently underway to evaluate whether the results are reproducible in a larger sample.

11 pages, 636 KiB  
Review
Deciphering the Mosaic of Therapeutic Potential: A Scoping Review of Neural Network Applications in Psychotherapy Enhancements
by Alexandre Hudon, Maxine Aird and Noémie La Haye-Caty
BioMedInformatics 2023, 3(4), 1101-1111; https://doi.org/10.3390/biomedinformatics3040066 - 1 Dec 2023
Cited by 1 | Viewed by 1419
Abstract
Background: Psychotherapy is a component of the therapeutic options accessible in mental health. Along with psychotherapy techniques and indications, there is a body of studies on what are known as psychotherapy’s common factors. However, up to 40% of patients do not respond to therapy. Artificial intelligence approaches may help improve this response rate, yet despite the growing body of evidence on the use of neural networks (NNs) in other areas of medicine, such evidence is lacking in the field of psychotherapy. This study aims to identify the different uses of NNs in the field of psychotherapy. Methods: A scoping review was conducted in the electronic databases EMBASE, MEDLINE, APA, and CINAHL. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement influenced this study’s design. Studies were included if they applied a neural network algorithm in the context of a psychotherapeutic approach. Results: A total of 157 studies were screened for eligibility, of which 32 were fully assessed. Finally, eight articles were analyzed, and three uses were identified: predicting therapeutic outcomes, content analysis, and automated categorization of psychotherapeutic interactions. Conclusions: Uses of NNs were identified with limited evidence of their effects. These uses could help therapists provide a more personalized therapeutic approach to their patients. Given the paucity of literature, this study provides a path for future research to better understand the efficacy of such uses.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

18 pages, 537 KiB  
Article
Towards Effective Emotion Detection: A Comprehensive Machine Learning Approach on EEG Signals
by Ietezaz Ul Hassan, Raja Hashim Ali, Zain ul Abideen, Ali Zeeshan Ijaz and Talha Ali Khan
BioMedInformatics 2023, 3(4), 1083-1100; https://doi.org/10.3390/biomedinformatics3040065 - 23 Nov 2023
Cited by 1 | Viewed by 1398
Abstract
Emotion detection plays a pivotal role in the evaluation of adverse psychological attributes, such as stress, anxiety, and depression. This study explores the capacity of machine learning to predict individual emotional states, using electroencephalogram (EEG) signals as the source of information. By conducting a comprehensive comparative analysis of an array of machine learning methodologies on the Kaggle Emotion Detection dataset, the research fine-tunes classifier parameters across various models, including, but not limited to, random forest, decision trees, logistic regression, support vector machines, nearest centroid, and naive Bayes classifiers. After hyperparameter optimization, the logistic regression algorithm attains a peak accuracy rate of 97%, closely matched by the random forest model. Through extensive EEG-based experimentation, the study underscores the potential of machine learning to significantly improve the precision of emotion detection, thereby catalyzing advancements within the discipline. A further implication is the capability for early detection, which makes this investigation pertinent to mental health assessments.
(This article belongs to the Section Applied Biomedical Data Science)

12 pages, 2619 KiB  
Article
Facilitating “Omics” for Phenotype Classification Using a User-Friendly AI-Driven Platform: Application in Cancer Prognostics
by Uraquitan Lima Filho, Tiago Alexandre Pais and Ricardo Jorge Pais
BioMedInformatics 2023, 3(4), 1071-1082; https://doi.org/10.3390/biomedinformatics3040064 - 8 Nov 2023
Cited by 1 | Viewed by 863
Abstract
Precision medicine approaches often rely on complex and integrative analyses of multiple biomarkers from “omics” data to generate insights that can help with diagnostic, prognostic, or therapeutic decisions. Such insights are often made using machine learning (ML) models that perform sample classification for a particular phenotype (yes/no). Building such models is challenging and time-consuming, requiring advanced coding skills and mathematical modelling expertise. Artificial intelligence (AI) is a methodological solution that has the potential to facilitate, optimize, and scale model development. In this work, we developed an AI-based, user-friendly, and code-free platform that fully automates the development of predictive models from quantitative “omics” data. Here, we show the application of this tool with the development of cancer survival prognostics models using real-life data from breast, lung, and renal cancer transcriptomes. In comparison to other models, our generated models achieved competitive sensitivities (72–85%), specificities (76–85%), and accuracies (75–85%), and Receiver Operating Characteristic curves with superior Areas Under the Curve (ROC-AUC of 77–86%). Further, we reported the associated sets of genes (biomarkers) and their expression patterns that were predictive of cancer survival. Moreover, we made our models available as online tools to generate prognostic predictions based on the gene expressions of the biomarkers. In conclusion, we demonstrated that our tool is a robust, user-friendly solution for developing bespoke predictive tools from “omics” data, which facilitates precision medicine applications at the point of care.
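The performance measures quoted in this abstract (sensitivity, specificity, accuracy, ROC-AUC) follow standard definitions for binary classification. A minimal sketch of how they are computed is given below; the function names and the toy inputs are hypothetical and are not taken from the platform described in the paper.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a positive outscores a negative.

    Ties count as half a win. This pairwise form is O(n^2) and meant
    only for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    predictions = [1, 1, 1, 0, 0, 0, 1, 0]
    print(classification_metrics(labels, predictions))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

`classification_metrics` assumes both classes are present in `y_true`; otherwise the sensitivity or specificity denominator would be zero.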

11 pages, 688 KiB  
Article
Predictions of Programmed Cell Death Ligand 1 Blockade Therapy Success in Patients with Non-Small-Cell Lung Cancer
by Taksh Gupta, Tamara Qawasmeh and Serena McCalla
BioMedInformatics 2023, 3(4), 1060-1070; https://doi.org/10.3390/biomedinformatics3040063 - 7 Nov 2023
Viewed by 782
Abstract
Lung cancer is responsible for the most cancer deaths worldwide, with non-small-cell lung cancer (NSCLC) making up 80% of cases. Factors involved in NSCLC development include genetic mutations and Programmed Cell Death Ligand 1 (PD-L1) expression. PD-L1 proteins are targeted in an NSCLC treatment called PD-L1 blockade therapy (immune therapy). However, this treatment is effective in only a low percentage of patients. This study aimed to create machine learning models that use features such as the number of mutations and the number of PD-L1 proteins in cancer cells, along with others, to predict whether a patient will receive clinical benefits from immune therapy. Datasets from cbioportal.org were downloaded and merged to create the sample for the models. Features that were highly correlated with clinical benefits were identified. Three machine learning models (Gaussian naïve Bayes, decision tree, and logistic regression) were created using these features to predict clinical benefits in patients, and each model’s accuracy was evaluated. All three models had accuracy rates between 55 and 85%, with two of the models averaging an accuracy rate of around 75%. Doctors can use these models to more accurately predict whether immune therapy is likely to work in a patient before prescribing it.
(This article belongs to the Section Clinical Informatics)

20 pages, 5450 KiB  
Article
Explainable AI-Based Identification of Contributing Factors to the Mood State Change in Children and Adolescents with Pre-Existing Psychiatric Disorders in the Context of COVID-19-Related Lockdowns in Greece
by Charis Ntakolia, Dimitrios Priftis, Konstantinos Kotsis, Konstantina Magklara, Mariana Charakopoulou-Travlou, Ioanna Rannou, Konstantina Ladopoulou, Iouliani Koullourou, Emmanouil Tsalamanios, Eleni Lazaratou, Aspasia Serdari, Aliki Grigoriadou, Neda Sadeghi, Kenny Chiu and Ioanna Giannopoulou
BioMedInformatics 2023, 3(4), 1040-1059; https://doi.org/10.3390/biomedinformatics3040062 - 7 Nov 2023
Viewed by 757
Abstract
The COVID-19 pandemic and its accompanying restrictions have significantly impacted people’s lives globally. There is increasing interest in examining the influence of this unprecedented situation on mental well-being. However, most studies focus on general groups such as students, adults, and youths, with little attention given to the impact of prolonged COVID-19-related measures on special groups such as children and adolescents with diagnosed developmental and psychiatric disorders. In addition, most of these studies adopt statistical methodologies to identify pair-wise relationships among factors, an approach that limits the ability to understand and interpret the impact of various factors. In response, this study adopts an explainable machine learning approach to identify factors that explain the deterioration or amelioration of mood state in a youth clinical sample. Its purpose is to identify and interpret the impact of the greatest contributing features of mood state changes on the prediction output via an explainable machine learning pipeline. Among all the machine learning classifiers, the Random Forest model was the most effective, with a best AUC-ROC score of 76% using 13 features. The explainability analysis showed that stress and positive changes derived from the imposed restrictions and the COVID-19 pandemic are the top two factors that could affect mood state.

25 pages, 8083 KiB  
Article
Integrative Meta-Analysis during Induced Pluripotent Stem Cell Reprogramming Reveals Conserved Networks and Chromatin Accessibility Signatures in Human and Mouse
by Chloe S. Thangavelu and Trina M. Norden-Krichmar
BioMedInformatics 2023, 3(4), 1015-1039; https://doi.org/10.3390/biomedinformatics3040061 - 6 Nov 2023
Cited by 1 | Viewed by 1010
Abstract
iPSC reprogramming involves dynamic changes in chromatin accessibility necessary for the conversion of somatic cells into induced pluripotent stem cells (iPSCs). iPSCs can be used to generate a wide range of cells to potentially replace damaged cells in a patient without the threat of immune rejection; however, efficiently reprogramming cells for medical applications remains a challenge, particularly in human cells. Here, we conducted a cross-species meta-analysis to identify conserved and species-specific differences in regulatory patterns during reprogramming. Chromatin accessibility and transcriptional data as fibroblasts transitioned to iPSCs were obtained from the publicly available Gene Expression Omnibus (GEO) database and integrated to generate time-resolved regulatory networks during cellular reprogramming. We observed consistent and conserved trends between the species in the chromatin accessibility signatures as cells transitioned from fibroblasts into iPSCs, indicating distal control of genes associated with pluripotency by master reprogramming regulators. Multi-omic integration showed key network changes across reprogramming states, revealing regulatory relationships between chromatin regulators, enhancers, transcription factors, and target genes that result in the silencing of the somatic transcription program and activation of the pluripotency gene regulatory network. This integrative analysis revealed distinct network changes between timepoints and leveraged multi-omics to gain novel insights into the regulatory mechanisms underlying reprogramming.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

30 pages, 6485 KiB  
Commentary
Using Information from Public Databases to Critically Evaluate Studies Linking the Antioxidant Enzyme Selenium-Dependent Glutathione Peroxidase 2 (GPX2) to Cancer
by R. Steven Esworthy and Fong-Fong Chu
BioMedInformatics 2023, 3(4), 985-1014; https://doi.org/10.3390/biomedinformatics3040060 - 3 Nov 2023
Viewed by 690
Abstract
Recent research on selenium-dependent glutathione peroxidase 2 (GPX2) tends to focus on possible roles in tumorigenesis. This is based on the idea that normally generated hydroperoxide species can damage DNA to produce mutations and react with protein sulfhydryl groups to perturb normal regulation of cancer-related pathways. GPX2 is one of many peroxidases available to control hydroperoxide levels. Altered GPX2 expression levels from normal to cancer or across cancer stages seem to be the main feature bringing it to the attention of investigators. In this commentary, we examine this premise as a basis for cancer studies, largely by trying to place GPX2 within the larger context of antioxidant enzyme gene expression. We make use of public databases and illustrate their possible role in approaching this issue. Since use of such databases is new to us, we looked to sources in the literature to evaluate expression level data, finding general agreement with some discrepancies over the range of expression and relative expression levels among some samples. Using the database information, we critically evaluate methods used to study GPX2 in the current literature for a variety of cancers. In addition, groups are now trying to compare enzymatic properties of GPX1 and GPX2 using proteins from bacterial cultures. We weigh in on these recent findings and discuss the impact on the relative GPX2 and GPX1 functions.
(This article belongs to the Section Applied Biomedical Data Science)

23 pages, 1636 KiB  
Article
Enhancing Semantic Web Technologies Using Lexical Auditing Techniques for Quality Assurance of Biomedical Ontologies
by Rashmi Burse, Michela Bertolotto and Gavin McArdle
BioMedInformatics 2023, 3(4), 962-984; https://doi.org/10.3390/biomedinformatics3040059 - 1 Nov 2023
Viewed by 706
Abstract
Semantic web technologies (SWT) represent data in a format that is easier for machines to understand. Validating the knowledge in data graphs created using SWT is critical to ensure that the axioms accurately represent the so-called “real” world. However, data graph validation is a significant challenge in the semantic web domain. The Shapes Constraint Language (SHACL) is the latest W3C standard developed with the goal of validating data graphs. SHACL (pronounced “shackle”) is a relatively new standard and has so far been employed predominantly to validate generic data graphs like WikiData and DBPedia. In generic data graphs, the name of a class does not affect the shape of a class, but this is not the case with biomedical ontology data graphs. The shapes of classes in biomedical ontology data graphs are highly influenced by the names of the classes, and the SHACL shape creation methods developed for generic data graphs fail to consider this characteristic difference. Thus, the existing SHACL shape creation methods do not perform well for domain-specific biomedical ontology data graphs. Maintaining the quality of biomedical ontology data graphs is crucial to ensure accurate analysis in safety-critical applications like Electronic Health Record (EHR) systems referencing such data graphs. Thus, in this work, we present a novel method to create enhanced SHACL shapes that consider the aforementioned characteristic difference to better validate biomedical ontology data graphs. We leverage the knowledge available from lexical auditing techniques for biomedical ontologies and incorporate this knowledge to create smart SHACL shapes. We also create SHACL shapes (baseline SHACL graph) without incorporating the lexical knowledge of the class names, as is done by existing methods, and compare the performance of our enhanced SHACL shapes with the baseline SHACL shapes. The results demonstrate that the enhanced SHACL shapes augmented with lexical knowledge of the class names identified 176 violations that the baseline SHACL shapes, devoid of this lexical knowledge, failed to detect. Thus, the enhanced SHACL shapes presented in this work significantly improve the validation performance of biomedical ontology data graphs, thereby reducing the errors present in such data graphs and ensuring safe use in the life-critical applications referencing them.
(This article belongs to the Special Issue Feature Papers in Applied Biomedical Data Science)

14 pages, 953 KiB  
Article
Federated Learning for Diabetic Retinopathy Detection Using Vision Transformers
by Mohamed Chetoui and Moulay A. Akhloufi
BioMedInformatics 2023, 3(4), 948-961; https://doi.org/10.3390/biomedinformatics3040058 - 1 Nov 2023
Cited by 1 | Viewed by 1768
Abstract
Diabetic retinopathy (DR), a common complication of diabetes mellitus, causes lesions on the retina that impair vision. It can cause blindness if not detected in time. Unfortunately, DR cannot be reversed, and treatment only preserves existing eyesight. The risk of vision loss can be considerably decreased with early detection and treatment of DR. Ophthalmologists must manually diagnose DR from retinal fundus images, which is time-consuming, laborious, and costly. Manual diagnosis is also more prone to error than computer-aided diagnosis methods. Deep learning has recently become one of the methods used most frequently to improve performance in a variety of fields, including medical image analysis and classification. In this paper, we develop a federated learning approach to detect diabetic retinopathy using four distributed institutions in order to build a robust model. Our federated learning approach is based on the Vision Transformer architecture to classify DR and Normal cases. Several performance measures were used, such as accuracy, area under the curve (AUC), sensitivity, and specificity. The results show an improvement of up to 3% in terms of accuracy with the proposed federated learning technique. The technique also resolves crucial issues such as data security, data access rights, and data protection. Full article
(This article belongs to the Special Issue Advances in Quantitative Imaging Analysis: From Theory to Practice)
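As an illustration of the aggregation step behind federated learning across four institutions, the following minimal sketch averages client parameter vectors weighted by local sample counts. The abstract does not say which aggregation rule the authors used; federated averaging (FedAvg) is assumed here, and the function name, toy weights, and sample counts are hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Assumption: FedAvg-style aggregation; not taken from the paper.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Four hypothetical institutions, each holding a two-parameter local model.
clients = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
sizes = [100, 100, 100, 100]
global_weights = federated_average(clients, sizes)
```

Only the parameter vectors leave each institution in this scheme; the images stay local, which is the property the abstract associates with data security and access rights.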

22 pages, 4583 KiB  
Article
Machine Learning Models for Predicting Personalized Tacrolimus Stable Dosages in Pediatric Renal Transplant Patients
by Sergio Sánchez-Herrero, Laura Calvet and Angel A. Juan
BioMedInformatics 2023, 3(4), 926-947; https://doi.org/10.3390/biomedinformatics3040057 - 14 Oct 2023
Viewed by 1065
Abstract
Tacrolimus, characterized by a narrow therapeutic index, significant toxicity, adverse effects, and interindividual variability, necessitates frequent therapeutic drug monitoring and dose adjustments in renal transplant recipients. This study aimed to compare machine learning (ML) models utilizing pharmacokinetic data to predict tacrolimus blood concentration. This prediction underpins crucial dose adjustments, emphasizing patient safety. The investigation focuses on a pediatric cohort. A subset served as the derivation cohort, creating the dose-prediction algorithm, while the remaining data formed the validation cohort. The study employed various ML models, including artificial neural network, RandomForestRegressor, LGBMRegressor, XGBRegressor, AdaBoostRegressor, BaggingRegressor, ExtraTreesRegressor, KNeighborsRegressor, and support vector regression, and their performances were compared. Although all models yielded favorable fit outcomes, the ExtraTreesRegressor (ETR) exhibited superior performance. It achieved measures of 0.161 for MPE, 0.995 for AFE, 1.063 for AAFE, and 0.8 for R², indicating accurate predictions and meeting regulatory standards. The findings underscore ML’s predictive potential, despite the limited number of samples available. To address this limitation, resampling was used, a viable option for small medical datasets, in this pioneering study predicting tacrolimus trough concentration in pediatric transplant recipients. Full article
(This article belongs to the Special Issue Feature Papers on Methods in Biomedical Informatics)
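The abstract reports AFE and AAFE without defining them. The sketch below uses the definitions commonly applied in pharmacokinetic model evaluation (the geometric mean of predicted/observed ratios, and the same with absolute log ratios); whether the authors used exactly these forms is an assumption. MPE is left out because its definition varies between studies. Function names and the toy values are hypothetical.

```python
import math
from typing import Sequence


def average_fold_error(predicted: Sequence[float],
                       observed: Sequence[float]) -> float:
    """AFE: 10 ** mean(log10(pred / obs)); 1.0 means no systematic bias."""
    logs = [math.log10(p / o) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def absolute_average_fold_error(predicted: Sequence[float],
                                observed: Sequence[float]) -> float:
    """AAFE: 10 ** mean(|log10(pred / obs)|); 1.0 means perfect precision."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


# Toy concentrations: one two-fold over-prediction, one two-fold under-prediction.
pred = [2.0, 0.5]
obs = [1.0, 1.0]
afe = average_fold_error(pred, obs)            # errors cancel, so no net bias
aafe = absolute_average_fold_error(pred, obs)  # average miss is two-fold
```

Under these definitions, the reported AFE of 0.995 would indicate almost no systematic over- or under-prediction, and the AAFE of 1.063 an average deviation of about 6% in fold terms.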

18 pages, 9093 KiB  
Article
Identifying the Role of Disulfidptosis in Endometrial Cancer via Machine Learning Methods
by Fei Fu, Xuesong Lu, Zhushanying Zhang, Zhi Li and Qinlan Xie
BioMedInformatics 2023, 3(4), 908-925; https://doi.org/10.3390/biomedinformatics3040056 - 13 Oct 2023
Cited by 1 | Viewed by 785
Abstract
Uterine corpus endometrial carcinoma (UCEC) is the second most common gynecological cancer in the world. With the increased occurrence of UCEC and the stagnation of research in the field, there is a pressing need to identify novel UCEC biomarkers. Disulfidptosis is a novel form of cell death, but its role in UCEC is unclear. We integrate differential analysis and the XGBoost algorithm to determine a disulfidptosis-related characteristic gene (DRCG), namely LRPPRC. By prediction and verification based on online databases, we construct a regulatory network of ceRNA in line with the scientific hypothesis, including a ceRNA regulatory axis and two mRNA-miRNA regulatory axes, i.e., mRNA LRPPRC/miRNA hsa-miR-616-5p/lncRNA TSPEAR-AS2, mRNA LRPPRC/miRNA hsa-miR-4658, and mRNA LRPPRC/miRNA hsa-miR-6783-5p. We use machine learning methods such as GBM to screen out seven disulfidptosis-related characteristic lncRNAs (DRCLs) as predictors, and build a risk prediction model with good prediction ability. SCORE = (1.136*LINC02449) + (−2.173*KIF9-AS1) + (0.235*ACBD3-AS1) + (1.830*AL354892.3) + (−1.314*AC093677.2) + (0.636*AC113361.1) + (−0.589*CDC37L1-DT). The ROC curve shows that in the training set samples, the AUCs for predicting 1-, 3-, 6-, and 10-year OS are 0.804, 0.724, 0.719, and 0.846, respectively. In the test set samples, the AUCs for predicting 1-, 3-, 6-, and 10-year OS are 0.615, 0.657, 0.687, and 0.702, respectively. In all samples, the AUCs for predicting 1-, 3-, 6-, and 10-year OS are 0.752, 0.706, 0.705, and 0.834, respectively. CP724714 has been screened as a potential therapy option for individuals who have a high risk of developing UCEC. Two subtypes of disulfidptosis-related genes (DRGs) and two subtypes of DRCLs are obtained by NMF method. We find that subtype N1 of DRGs is mainly enriched in various metabolic pathways, and subtype N1 may play a significant role in the process of disulfidptosis. 
Our study confirms for the first time that disulfidptosis plays a role in UCEC. Our findings help improve the prognosis and treatment of UCEC. Full article
(This article belongs to the Topic Computational Intelligence and Bioinformatics (CIB))
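The SCORE formula in the abstract is a linear combination of seven lncRNA expression values. A minimal sketch of evaluating it follows; the coefficients are copied from the abstract, while the function name, the input dictionary layout, and the example expression values are hypothetical. The abstract does not state the expression units or any risk cutoff, so none is assumed.

```python
from typing import Mapping

# Coefficients as printed in the abstract's SCORE formula.
COEFFICIENTS = {
    "LINC02449": 1.136,
    "KIF9-AS1": -2.173,
    "ACBD3-AS1": 0.235,
    "AL354892.3": 1.830,
    "AC093677.2": -1.314,
    "AC113361.1": 0.636,
    "CDC37L1-DT": -0.589,
}


def risk_score(expression: Mapping[str, float]) -> float:
    """Return the weighted sum of the seven lncRNA expression values."""
    missing = sorted(set(COEFFICIENTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


# Hypothetical sample in which every lncRNA has expression 1.0.
sample = {name: 1.0 for name in COEFFICIENTS}
score = risk_score(sample)
```

Positive coefficients raise the score with higher expression and negative ones lower it, so the sign of each term shows which direction of expression the model associates with higher risk.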

23 pages, 9945 KiB  
Article
Identification of a New Drug Binding Site in the RNA-Dependent-RNA-Polymerase (RdRp) Domain
by Aparna S. Gana and James N. Baraniuk
BioMedInformatics 2023, 3(4), 885-907; https://doi.org/10.3390/biomedinformatics3040055 - 10 Oct 2023
Cited by 2 | Viewed by 1144
Abstract
We hypothesize that in silico structural biology approaches can discover novel drug binding sites for RNA-dependent-RNA-polymerases (RdRp) of positive sense single-strand RNA (ss(+)RNA) virus species. RdRps have a structurally conserved active site with seven motifs (A to G), despite low sequence similarity. We refined this architecture further to describe a conserved structural domain consisting of motifs A, B, C and F. These motifs were used to realign 24 RdRp structures in an innovative manner to search for novel drug binding sites. The aligned motifs from the enzymes were then docked with 833 FDA-approved drugs (Set 1) and 85 FDA-approved antivirals (Set 2) using the Molecular Operating Environment (MOE) docking 2020.09 software. Sirolimus (rapamycin), an immunosuppressant that targets the mammalian mTOR pathway, was one of the top ten drugs for all 24 RdRp proteins. The sirolimus docking site was in the nucleotide triphosphate entry tunnel between motifs A and F but distinct from the active site in motif C. This original finding supports our hypothesis that structural biology approaches based on RdRp motifs that are conserved across evolution can define new drug binding locations and infer potential broad-spectrum inhibitors for SARS-CoV-2 and other ss(+)RNA viruses. Full article
(This article belongs to the Topic Computational Intelligence and Bioinformatics (CIB))

16 pages, 19773 KiB  
Article
Pitfalls of Using Multinomial Regression Analysis to Identify Class-Structure-Relevant Variables in Biomedical Data Sets: Why a Mixture of Experts (MOE) Approach Is Better
by Jörn Lötsch and Alfred Ultsch
BioMedInformatics 2023, 3(4), 869-884; https://doi.org/10.3390/biomedinformatics3040054 - 8 Oct 2023
Cited by 3 | Viewed by 1123
Abstract
Recent advances in mathematical modeling and artificial intelligence have challenged the use of traditional regression analysis in biomedical research. This study examined artificial data sets and biomedical data sets from cancer research using binomial and multinomial logistic regression. The results were compared with those obtained with machine learning models such as random forest, support vector machine, Bayesian classifiers, k-nearest neighbors, and repeated incremental clipping (RIPPER). The alternative models often outperformed regression in accurately classifying new cases. Logistic regression had a structural problem similar to early single-layer neural networks, which limited its ability to identify variables with high statistical significance for reliable class assignments. Therefore, regression is not per se the best model for class prediction in biomedical data sets. The study emphasizes the importance of validating selected models and suggests that a “mixture of experts” approach may be a more advanced and effective strategy for analyzing biomedical data sets. Full article
(This article belongs to the Section Applied Biomedical Data Science)

16 pages, 1820 KiB  
Article
OutSplice: A Novel Tool for the Identification of Tumor-Specific Alternative Splicing Events
by Joseph Bendik, Sandhya Kalavacherla, Nicholas Webster, Joseph Califano, Elana J. Fertig, Michael F. Ochs, Hannah Carter and Theresa Guo
BioMedInformatics 2023, 3(4), 853-868; https://doi.org/10.3390/biomedinformatics3040053 - 8 Oct 2023
Viewed by 1132
Abstract
Protein variation that occurs during alternative splicing has been shown to play a major role in disease onset and oncogenesis. Due to this, we have developed OutSplice, a user-friendly algorithm to classify splicing outliers in tumor samples compared to a distribution of normal samples. Several tools have previously been developed to help uncover splicing events, each coming with varying methodologies, complexities, and features that can make it difficult for a new researcher to use or to determine which tool they should be using. Therefore, we benchmarked several algorithms to determine which may be best for a particular user’s needs and demonstrate how OutSplice differs from these methodologies. We find that despite detecting a lower number of genes with significant aberrant events, OutSplice is able to identify those that are biologically impactful. Additionally, we identify 17 genes that contain significant splicing alterations in tumor tissue that were discovered across at least 5 of the tested algorithms, making them good candidates for future studies. Overall, researchers should consider a combined use of OutSplice with other splicing software to help provide additional validation for aberrant splicing events and to narrow down biologically relevant events. Full article
(This article belongs to the Special Issue Feature Papers in Computational Biology and Medicine)

24 pages, 2602 KiB  
Article
Weighted Trajectory Analysis and Application to Clinical Outcome Assessment
by Utkarsh Chauhan, Kaiqiong Zhao, John Walker and John R. Mackey
BioMedInformatics 2023, 3(4), 829-852; https://doi.org/10.3390/biomedinformatics3040052 - 7 Oct 2023
Viewed by 1566
Abstract
The Kaplan–Meier (KM) estimator is widely used in medical research to estimate the survival function from lifetime data. KM estimation is a powerful tool to evaluate clinical trials due to simple computational requirements, its use of a logrank hypothesis test, and the ability to censor patients. However, KM estimation has several constraints and fails to generalize to ordinal variables of clinical interest, such as toxicity and ECOG performance. We devised weighted trajectory analysis (WTA) to combine the advantages of KM estimation with the ability to visualize and compare treatment groups for ordinal variables and fluctuating outcomes. To assess statistical significance, we developed a new hypothesis test analogous to the logrank test. We demonstrated the functionality of WTA through 1000-fold clinical trial simulations of unique stochastic models of chemotherapy toxicity and schizophrenia disease course. With increments in sample size and hazard ratio, we compared the performance of WTA to KM estimation and the generalized estimating equation (GEE). WTA generally required half the sample size to achieve comparable power to KM estimation; advantages over the GEE included its robust nonparametric approach and summary plot. We also applied WTA to real clinical data: the toxicity outcomes of melanoma patients receiving immunotherapy and the disease progression of patients with metastatic breast cancer receiving ramucirumab. The application of WTA demonstrated that using traditional methods such as KM estimation can lead to both type I and II errors by failing to model illness trajectory. This article outlines a novel method for clinical outcome assessment that extends the advantages of Kaplan–Meier estimates to ordinal outcome variables. Full article
(This article belongs to the Special Issue Feature Papers in Medical Statistics and Data Science Section)
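For readers unfamiliar with the baseline method named in the abstract, here is a minimal sketch of the standard Kaplan–Meier product-limit estimator with right censoring. It is not the authors' weighted trajectory analysis, which the abstract does not specify in enough detail to reproduce; the function name and toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= removed
    return curve


# Four hypothetical patients; the one at time 2 is censored.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Each subject contributes only a binary event or a censoring time, which is the restriction the abstract points to when it says KM estimation does not generalize to ordinal or fluctuating outcomes.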