Article

Advanced Digital System for International Collaboration on Biosample-Oriented Research: A Multicriteria Query Tool for Real-Time Biosample and Patient Cohort Searches

by Alexandros Fridas, Anna Bourouliti, Loukia Touramanidou, Desislava Ivanova, Kostantinos Votis * and Panagiotis Katsaounis *
Metabio, 542 50 Thessaloniki, Greece
* Authors to whom correspondence should be addressed.
Computers 2025, 14(5), 157; https://doi.org/10.3390/computers14050157
Submission received: 21 February 2025 / Revised: 8 April 2025 / Accepted: 17 April 2025 / Published: 23 April 2025
(This article belongs to the Special Issue Future Systems Based on Healthcare 5.0 for Pandemic Preparedness 2024)

Abstract

The advancement of biomedical research depends on efficient data sharing, integration, and annotation to ensure reproducibility, accessibility, and cross-disciplinary collaboration. International collaborative research is crucial for advancing biomedical science and innovation but often faces significant barriers, such as data sharing limitations, inefficient sample management, and scalability challenges. Existing infrastructures for biosample and data repositories face challenges that limit large-scale research efforts. This study presents a novel platform designed to address these issues, enabling researchers to conduct high-quality research more efficiently and at reduced cost. The platform employs a modular, distributed architecture that ensures high availability, redundancy, and interoperability among diverse stakeholders, and it integrates advanced features, including secure access management, comprehensive query functionalities, real-time availability reporting, and robust data mining capabilities. In addition, the platform supports dynamic, multi-criteria searches tailored to disease-specific patient profiles and biosample-related data across pre-analytical, post-analytical, and cryo-storage processes. By evaluating the platform’s modular architecture and pilot testing outcomes, this study demonstrates its potential to enhance interdisciplinary collaboration, streamline research workflows, and foster transformative advancements in biomedical research. A key innovation is the real-time dynamic e-consent (DRT e-consent) system, which allows donors to update their consent status in real time, supporting compliance with ethical and regulatory frameworks such as GDPR and HIPAA. The system also supports multi-modal data integration, including genomic sequences, electronic health records (EHRs), and imaging data, enabling researchers to perform complex queries and generate comprehensive insights.

1. Introduction

International collaborative partnerships in medical research involving diverse global participants are essential for accelerating scientific discovery, validating clinical findings, and fostering innovation in disease prevention, diagnosis, and treatment [1]. Pooling resources, expertise, and diverse perspectives from different disciplines and institutions leads to more comprehensive and robust scientific discoveries, ultimately improving the quality and impact of health research on downstream challenges faced by the global healthcare system, including the efficacy of personalized treatment plans and fast decision making [2]. The HapMap project is a notable example of successful international collaboration and global partnership: by cataloguing common patterns of genetic variation across the human genome, it advanced the discovery of common disease pathways and laid the basis for disease-specific targets for therapeutic interventions [3].
However, collaborative research faces significant pitfalls, including technical challenges, communication barriers, variations in research methodologies, differing regulatory frameworks, and data access limitations. These result in the fragmentation of data sources, which complicates further research and slows down progress [4]. In addition, misaligned goals among stakeholders and the complexity of managing interdisciplinary teams create an additional burden on research initiatives [5]. Delays in stakeholder actions, combined with the absence of shared practices, frequently hinder funding decisions and ethical approvals for new research projects [6]. These inefficiencies become particularly critical during health crises, such as pandemics, when the inability to rapidly access, share, and analyze comprehensive biomedical data can lead to conflicts, delays, and missed opportunities for swift and coordinated action [7].
Furthermore, a significant challenge in collaborative research is the limited availability and accessibility of high-quality biosamples, which are crucial for basic research, and of the associated data collected at different stages of the biosample use cycle. This shortfall hinders the progress of research and the development of new diagnostic approaches and treatments according to the needs of each individual [8]. The sharing of medical data, as well as biosample procurement, must adhere to strict regulatory governance based on the consent of each individual while guaranteeing a fair distribution of benefits and transparency, which further complicates the entire chain from a scientific hypothesis to the final stage of real-world application in effective healthcare provision [9].
Currently, in an effort to push the boundaries of medical research, there is a high demand for advanced strategies to drive innovation, establish standardized policies, and enable exponential growth in disease characterization and management [10]. However, the continuous exchange of information among stakeholders demands well-organized project management infrastructures and databases in order to minimize bias and inconsistencies in data analysis and interpretation [11]. Fostering knowledge relationships between disciplines and diverse networks requires robust knowledge-sharing environments and systems that support the collection, processing, transformation, mining, and, ultimately, the evaluation of information to create valuable datasets for future research and lead to effective scientific output [12].
Looking to the future, cross-sectional synergies and multisource collaboration will be vital for advancing healthcare research by establishing research hubs that will create the proper environment to foster collaboration and innovation, as well as providing the effective infrastructures and resources needed to support large-scale research projects and facilitate the exchange of ideas and expertise [13]. While prior works around biosample-oriented research and data sharing have laid a strong foundation for global collaboration, the effectiveness and scalability of the current systems are still critical pitfalls [14,15,16].
Hence, to address this need in biomedical research, and within the scope of Metabio’s activities and services [17], we have designed and built a platform that facilitates international collaboration based on biosample and/or data exchange: an advanced multicriteria query tool for real-time search of, and access to, fully annotated biosamples, disease-specific patient profiles, and meta-analytics. The project aimed to create a system incorporating several functionalities not fully addressed by prior works, such as real-time querying, consistent data integration, built-in tools to facilitate seamless collaboration, and an all-in-one system for multi-source data collection. Our goal is to support international research initiatives and efficient cross-sector data exchange, thus empowering research-oriented networks to accelerate innovative discoveries and improve patient outcomes.

2. Materials and Methods

2.1. Project Design and Development

The initial phase of this project focused on defining the core framework of activities and creating a structured plan to ensure systematic implementation. First, a systematic analysis of existing biosample and patient-related data exchange for research initiatives, as well as of data sharing platforms, was conducted to identify gaps and crucial aspects. Specifically, when considering the needs of researchers in acquiring data and samples for impactful research, we found that data variability or inconsistency can have a significant impact on research outcomes. Researchers require a platform that offers reliable, current information on biospecimens, including details about their quality and suitability for specific research objectives. A summary of researchers’ needs and constraints is provided in Table 1.
Task creation followed a modular approach, with dedicated workstreams for database architecture design, query engine development, user interface creation, security and compliance integration, and performance optimization. Agile development methodologies were employed to enable iterative testing and refinement, ensuring that each component is aligned with the overall project objectives. This structured approach allowed for efficient workflow management, continuous assessment, and adaptive enhancements throughout the development process. Specific efforts were made to outline the data categories that would be integrated, including patient profiles, biosample characteristics, and associated meta-analytics spanning pre-analytical, post-analytical, and cryo-storage stages.
Emphasis was placed on designing an intuitive interface and tools to enable real-time hypothesis-driven data exploration, minimizing the need for advanced IT skills among researchers. The project plan also integrated innovative encryption techniques and assessed control management initiatives to address data privacy and security concerns.

2.2. User Assessment

A comprehensive user assessment was conducted to ensure that the platform effectively meets the diverse needs of its intended users. The assessment process followed a structured approach, involving stakeholder identification, user profiling, and needs analysis. The evaluation of user requirements was performed through a systematic collection and prioritization of functional and non-functional needs to ensure optimal platform design and usability. This process involved detailed requirement-gathering sessions with key stakeholders to determine essential functionalities and performance expectations. Functional needs were identified based on essential user tasks, such as data retrieval, analysis, visualization, and security management, ensuring that each requirement was mapped to specific user roles to provide tailored access and usability. Non-functional needs, including system reliability, performance efficiency, compliance with regulatory frameworks, and user experience design, were also carefully considered to enhance the overall usability and effectiveness of the platform.

2.3. Data Mining

The data mining process for clinical data collection was designed to ensure seamless integration, standardization, and accessibility of multi-source datasets. To achieve this, commonly used healthcare data standards, including HL7, openEHR, LOINC, SNOMED, and ICD-11 [24,25], were selected and combined to facilitate interoperability. The platform architecture was developed to link and integrate structured clinical and biomedical data with various electronic health record (EHR) systems [26], laboratory analytics, clinical trials, health information technologies (HITs) [27], as well as biosample-related data (i.e., storage conditions, pre-analytical factors, handling procedures, analysis, etc.), ensuring compatibility and compliance with international data exchange protocols. The management system was implemented based on Fast Healthcare Interoperability Resources (FHIR) specifications [28], enabling efficient data sharing through secure RESTful APIs hosted in the cloud. Additionally, real-time data acquisition was incorporated, allowing diverse sites to annotate the biosample and patient cohorts, establishing a dynamic and relational timeline per biosample and per donor. This methodology ensures that all collected data remain harmonized, traceable, and readily available for researchers, supporting large-scale biomedical studies and facilitating the discovery of actionable insights.
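As a rough illustration of FHIR-based exchange, the sketch below builds a minimal FHIR R4 `Specimen` resource and a RESTful search URL for it. The base URL, identifiers, and helper names (`build_specimen`, `specimen_search_url`) are assumptions made for this example; the article does not publish the platform's actual endpoints or resource profiles.

```python
import json
from urllib.parse import urlencode

# Hypothetical base URL; the platform's real endpoint is not given in the article.
FHIR_BASE = "https://fhir.example.org/R4"


def build_specimen(specimen_id, patient_id, snomed_code, display, collected):
    """Return a minimal FHIR R4 Specimen resource as a plain dict."""
    return {
        "resourceType": "Specimen",
        "id": specimen_id,
        # Specimen type coded with SNOMED CT, one of the standards named above.
        "type": {
            "coding": [{
                "system": "http://snomed.info/sct",
                "code": snomed_code,
                "display": display,
            }]
        },
        # Link back to the donor so a per-donor timeline can be assembled.
        "subject": {"reference": f"Patient/{patient_id}"},
        "collection": {"collectedDateTime": collected},
    }


def specimen_search_url(**params):
    """Build a FHIR search URL for Specimen resources from search parameters."""
    return f"{FHIR_BASE}/Specimen?{urlencode(params)}"


specimen = build_specimen(
    "bs-001", "donor-42", "119297000", "Blood specimen", "2024-03-01T09:30:00Z"
)
print(json.dumps(specimen, indent=2))
print(specimen_search_url(patient="donor-42"))
```

Expressing each biosample as a standard resource that references its donor is one way to obtain the relational, per-biosample and per-donor timeline described above; pre-analytical and storage details would need additional fields or extensions that are not shown here.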

2.4. Security and Compliance

The platform was designed with a robust security and compliance framework to ensure the protection, integrity, and ethical management of sensitive biomedical data. Adhering to international regulations and ISO 27001 standards [29], the system incorporates the following controls:
  • Risk Assessment and Management: Identifies threats, vulnerabilities, and potential impacts on information assets;
  • Comprehensive Security Controls: Includes access control, network security, incident management, and monitoring;
  • Incident Management: Detects, reports, assesses, and responds to information security incidents;
  • Continuous Improvement: Involves processes for reviewing, updating, and enhancing the information security management system;
  • Compliance and Legal Requirements: Ensures adherence to relevant laws, regulations, and contractual requirements related to information security;
  • Encryption: A security control to protect sensitive data, outlining requirements for key management, algorithms, and cryptographic controls to maintain confidentiality and integrity;
  • Data Accuracy and Integrity Policies: Ensures reliability and accuracy of personal data;
  • Privacy by Design and Default Principles: Facilitates the integration of privacy considerations into system design and development. This includes privacy-enhancing technologies, risk-based approaches, and privacy impact assessments;
  • Data Protection Impact Assessment Methodologies: Used for assessing privacy risks associated with personal data processing.
The system operates in compliance with the GDPR [30] and the Health Insurance Portability and Accountability Act (HIPAA) [31], integrating a layered IT architecture supported by Fast Fully Homomorphic Encryption (FFHE) techniques [32]. The Security and Privacy Layer ensures security and privacy in all architectural layers (i.e., Data Layer, Service Layer, and Application Layer), promoting the secure information exchange network between the research platform and existing/legacy systems used by associated stakeholders.
The query tool was enabled via the following:
  • Authentication/Encryption Module: Handles authentication and encryption of stored and transmitted data;
  • Permission Handling Module: Manages user permissions for accessing and modifying data stored in the Data Layer;
  • Data Sharing Agreement Module: Facilitates data sharing agreements between various actors in the research ecosystem.

2.5. Data Transfer and Export

Data transfer and export were carefully structured through secure protocols and data management systems, adhering to digital rights management (DRM) protocols and consent management requirements [33], ensuring that all procedures and data exchanges are compliant with local, national, and international legislation regarding cybersecurity, data privacy, integrity, quality, authentication/encryption, and disclosure. The platform incorporated a dynamic real-time tiered (DRT) e-consent mechanism, allowing for the real-time dissemination of, access to, and utilization of biospecimens and related data, according to the donor/patient’s electronic consent status [34]. For registered institutional entities, primary and secondary user permissions were managed, ensuring only authorized users can access sensitive data. To enforce strict access control and data governance, a hybrid security model incorporating role-based access control (RBAC) and attribute-based access control (ABAC) was deployed [35]. RBAC ensures that users can only access data based on predefined roles and responsibilities, while ABAC enhances security by dynamically evaluating contextual attributes such as user credentials, location, device, and purpose for access before granting permissions.
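The hybrid model can be pictured as two checks that must both pass. The following is a minimal sketch under assumed role names, permissions, and attributes (all hypothetical); it is not the platform's actual policy engine.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table (the RBAC layer).
ROLE_PERMISSIONS = {
    "primary_user": {"read_cohort", "export_data", "manage_users"},
    "secondary_user": {"read_cohort"},
}


@dataclass(frozen=True)
class AccessRequest:
    role: str
    action: str
    country: str          # location the request originates from
    purpose: str          # declared purpose of access
    device_trusted: bool  # whether the device passed a trust check


def rbac_allows(req):
    """RBAC: the user's role must carry the requested permission."""
    return req.action in ROLE_PERMISSIONS.get(req.role, set())


def abac_allows(req, allowed_countries, allowed_purposes):
    """ABAC: contextual attributes must satisfy the dataset's policy."""
    return (
        req.device_trusted
        and req.country in allowed_countries
        and req.purpose in allowed_purposes
    )


def is_access_allowed(req, allowed_countries, allowed_purposes):
    """Grant access only when both the role check and the attribute check pass."""
    return rbac_allows(req) and abac_allows(req, allowed_countries, allowed_purposes)


request = AccessRequest("secondary_user", "read_cohort", "GR", "academic_research", True)
print(is_access_allowed(request, {"GR", "DE"}, {"academic_research"}))
```

Keeping the two layers as separate functions makes it easy to audit which layer denied a request.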

2.6. Technical Implementation

The platform is built upon a robust, distributed data infrastructure that integrates multiple technologies to facilitate real-time biosample and patient data annotation while ensuring high availability, security, and interoperability. Using a cloud-based architecture, the system is hosted in a scalable, containerized environment based on Docker and Kubernetes, allowing for automated deployment, resource optimization, and fault-tolerant operation. The database management system follows a MySQL-compatible relational model and is implemented with MariaDB. A workbench tool enables the creation and management of connections to database servers, the configuration of connection parameters, and the execution of SQL queries over those connections using the built-in SQL editor. Specifically, MariaDB ingests unstructured metadata and turns them into structured information. MariaDB is used because it is fast, scalable, and robust, with a rich ecosystem of storage engines, plugins, and other tools, which supports the integration of diverse biomedical datasets. To enable real-time querying and dynamic data retrieval, the platform incorporates an advanced multi-criteria query engine built on Elasticsearch, which optimizes search performance through full-text indexing, relevance ranking, and distributed search capabilities. Researchers can execute complex queries on biosamples, disease-specific patient profiles, and meta-analytics with minimal latency, while the system dynamically refines search results using advanced algorithms.
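To make the multi-criteria idea concrete, the sketch below assembles an Elasticsearch Query DSL body in which each selected criterion becomes one clause of a `bool` filter. The field names (`sample_type`, `diagnosis_code`, `storage_temp_c`, `donor_age`) and the helper `build_biosample_query` are hypothetical; the article does not describe the platform's index mapping.

```python
import json


def build_biosample_query(sample_type=None, diagnosis_code=None,
                          max_storage_temp_c=None, min_age=None,
                          max_age=None, size=50):
    """Translate optional search criteria into an Elasticsearch bool/filter query."""
    filters = []
    if sample_type is not None:
        filters.append({"term": {"sample_type": sample_type}})
    if diagnosis_code is not None:
        filters.append({"term": {"diagnosis_code": diagnosis_code}})
    if max_storage_temp_c is not None:
        filters.append({"range": {"storage_temp_c": {"lte": max_storage_temp_c}}})

    # Combine the age bounds into a single range clause when either is given.
    age_range = {}
    if min_age is not None:
        age_range["gte"] = min_age
    if max_age is not None:
        age_range["lte"] = max_age
    if age_range:
        filters.append({"range": {"donor_age": age_range}})

    return {"size": size, "query": {"bool": {"filter": filters}}}


query = build_biosample_query(
    sample_type="plasma",
    diagnosis_code="DX-001",   # placeholder code, not a real ICD entry
    max_storage_temp_c=-80,
    min_age=40,
)
print(json.dumps(query, indent=2))
```

Filter clauses are used here rather than scored `must` clauses because criteria chosen from drop-down menus are exact constraints; a free-text search box would more naturally map to a scored `match` clause.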

3. System Architecture, Functionalities, and User Experience

3.1. Overview of Platform and Architecture

This innovative platform for researchers is built on a modular architecture that supports real-time data integration, multi-criteria querying, and secure data exchange (Figure 1).
The platform consists of several key components: data storage management, real-time data analysis, a multi-dimensional platform for sample and data recruitment from multiple demographic areas, and dynamic e-consent options. Data accessibility is provided by a distributed data model that ensures high availability and redundancy of both biosample and patient data. These data are collected from several sources, including biobanks and clinical sites, and are then sorted accordingly. In addition, a multi-criteria query tool allows researchers to perform complex queries based on various criteria, such as patient health profiles, biosample types, demographic areas, research protocols, and analytical results. This tool further supports customizable queries using drop-down menus to ensure accuracy and consistency.
Researchers can securely access the platform through an intuitive, web-based interface, using assigned credentials generated and managed through a secure, role-based authentication system. Both individual researchers and institutions (e.g., universities, research institutes, or pharmaceutical companies) can register as users, with designated primary and secondary accounts. These accounts are managed through a secure hierarchy that ensures appropriate permissions are granted based on licensing agreements, ethical compliance, and data use limitations.
Data searches and analyses are logged for transparency, ensuring ethical and secure collaboration. Researchers can utilize tools such as real-time analytics, protocol annotation, and search histories to optimize their projects. To further facilitate cross-border research, the platform provides a universal dashboard, offering aggregated regional statistics on biosample and patient cohort availability.
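A toy version of the search log and the dashboard aggregation might look as follows. The record fields (`region`, `sample_type`, `available`) and the function names are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime, timezone

# In-memory stand-in for a persistent audit log.
search_log = []


def log_search(user_id, criteria):
    """Record who searched for what, and when (UTC, ISO 8601)."""
    entry = {
        "user_id": user_id,
        "criteria": dict(criteria),
        "at": datetime.now(timezone.utc).isoformat(),
    }
    search_log.append(entry)
    return entry


def regional_availability(samples):
    """Count available biosamples per (region, sample type) for a dashboard view."""
    return Counter(
        (s["region"], s["sample_type"]) for s in samples if s["available"]
    )


samples = [
    {"region": "EU-South", "sample_type": "plasma", "available": True},
    {"region": "EU-South", "sample_type": "plasma", "available": True},
    {"region": "EU-South", "sample_type": "serum", "available": False},
    {"region": "EU-North", "sample_type": "serum", "available": True},
]
log_search("user-7", {"sample_type": "plasma"})
print(regional_availability(samples))
```

Only aggregated counts leave the function, which matches the idea of a dashboard that reports availability without exposing individual records.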
The system includes annotations for over 3000 research protocols across a 64-assay spectrum, ranging from commercial to homemade analysis procedures. For instance, researchers can query biosamples based on pre-analytical data, cryo-storage logistics, or specific patient health conditions. Each search reads the qualitative characteristics specified in the request, runs through all the datasets of the network in real time, and returns specific quantitative results. Searches are stored for all users. The system processes queries in real time, generating specific results based on data availability and clustering success rates at different thresholds (e.g., 50%, 70%, and 100%) (Figure 2).
Query flow functionalities include the following:
  • User Login: Users authenticate and submit their credentials for data access, based on their biobank and network permissions;
  • Query Submission: Users select query criteria from a user-friendly interface and submit the query;
  • Data Processing: The system processes the query, using data mining and real-time analytics to search across the network;
  • Results Output: The system generates a list of available biosamples, patient profiles, and associated data, clustered by adoption rates.
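The last two steps of the query flow can be sketched as a scoring pass followed by bucketing. This example assumes one possible reading of the thresholds mentioned above, namely that a record's match rate is the fraction of query criteria it satisfies; the article does not define the rate precisely, and the field names are hypothetical.

```python
# Thresholds from the text (100%, 70%, 50%), checked from strictest to loosest.
THRESHOLDS = (1.0, 0.7, 0.5)


def match_rate(record, criteria):
    """Fraction of query criteria that the record satisfies exactly."""
    if not criteria:
        return 1.0  # an empty query excludes nothing
    hits = sum(1 for key, wanted in criteria.items() if record.get(key) == wanted)
    return hits / len(criteria)


def cluster_by_threshold(records, criteria):
    """Place each record ID in the strictest threshold bucket it reaches."""
    clusters = {t: [] for t in THRESHOLDS}
    for record in records:
        rate = match_rate(record, criteria)
        for threshold in THRESHOLDS:
            if rate >= threshold:
                clusters[threshold].append(record["id"])
                break  # records below 50% are left out entirely
    return clusters


criteria = {"sample_type": "plasma", "diagnosis": "DX-001",
            "storage": "-80C", "region": "EU"}
records = [
    {"id": "A", "sample_type": "plasma", "diagnosis": "DX-001", "storage": "-80C", "region": "EU"},
    {"id": "B", "sample_type": "plasma", "diagnosis": "DX-001", "storage": "-80C", "region": "US"},
    {"id": "C", "sample_type": "plasma", "diagnosis": "DX-001", "storage": "-20C", "region": "US"},
    {"id": "D", "sample_type": "plasma", "diagnosis": "DX-999", "storage": "-20C", "region": "US"},
]
print(cluster_by_threshold(records, criteria))
```

Reporting partial matches alongside exact ones lets a researcher see how many more samples become available if one or two criteria are relaxed.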

3.2. Real-Time Dynamic E-Consent (DRT E-Consent) System

The real-time dynamic e-consent (DRT e-consent) system empowers researchers with a flexible and donor-centric approach to accessing biosamples and biomedical data, providing a dynamic and transparent consent management framework. The system provides real-time updates on consent status, allowing researchers to work with the most current permissions without delays or ethical ambiguities. Through an automated verification process, the platform streamlines the process of requesting and obtaining biosamples, ensuring that all stakeholders operate within a structured and legally compliant framework.
In addition to secure access management, the DRT e-consent system promotes ethical research practices by fostering donor engagement. Researchers can track consent updates, receive notifications about changes, and, in some cases, communicate with donors regarding the intended use of their biosamples. This interactive approach ensures that research remains aligned with donor expectations, promoting transparency and trust in biomedical studies. Furthermore, the system provides a structured environment for research institutions, ensuring that data governance and compliance standards are upheld throughout the research lifecycle.
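One way to picture dynamic consent is as an append-only history per donor, where the most recent entry decides what a query may return. The sketch below assumes hypothetical consent tiers such as `academic_research` and `commercial_research`; the article does not specify how the DRT e-consent system stores or names them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsentRecord:
    donor_id: str
    # Append-only history of (timestamp, permitted uses); the last entry is current.
    history: list = field(default_factory=list)

    def update(self, permitted_uses, when=None):
        """Record a new consent state; earlier states are kept for auditing."""
        when = when or datetime.now(timezone.utc)
        self.history.append((when, frozenset(permitted_uses)))

    def permits(self, use):
        """Check the requested use against the donor's current consent."""
        if not self.history:
            return False  # no recorded consent means no access
        return use in self.history[-1][1]


def filter_by_consent(samples, consents, use):
    """Keep only samples whose donor currently permits the requested use."""
    return [
        s for s in samples
        if s["donor_id"] in consents and consents[s["donor_id"]].permits(use)
    ]


consent = ConsentRecord("donor-42")
consent.update({"academic_research", "commercial_research"})
consent.update({"academic_research"})  # donor later withdraws commercial use

samples = [{"sample_id": "bs-001", "donor_id": "donor-42"}]
consents = {"donor-42": consent}
print(filter_by_consent(samples, consents, "academic_research"))
print(filter_by_consent(samples, consents, "commercial_research"))
```

Because the check runs at query time against the latest entry, a withdrawal takes effect on the next search without any change to the sample records themselves.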

3.3. User Interface/User Experience (UX)

The platform successfully delivered a modular and scalable system with comprehensive functionalities tailored to the needs of international biomedical research. The modular architecture enabled the seamless integration of diverse data sources while maintaining flexibility for future expansions and enhancements. The web-based interface is clean and easy to navigate, offering a responsive design that adapts to different devices and screen sizes. The multi-criteria query tool features interactive drop-down menus and customizable filters, allowing researchers to quickly refine their searches based on specific variables, ensuring precise and consistent results with minimal effort. Additionally, the platform includes a universal dashboard that aggregates key data and regional statistics, giving users a clear overview of biosample and patient cohort availability across various locations. Real-time analytics and search history tools further enhance usability, allowing researchers to track and optimize their queries while maintaining a transparent, logged history of all actions taken. The interface also supports collaborative features, ensuring seamless communication and data sharing between researchers, with easy access to annotations, results, and project tools (Figure 3).
The user functionalities are summarized as follows:
  • Personnel permission handling;
  • Managing research-related data entries;
  • Audits of history per user ID;
  • Real-time dynamic multicriteria queries.
Data recordings refer to the implementation of research protocols using any available commercial kits and protocols or homemade ones. The research results are stored and presented collectively or per assay.

4. Discussion

Effective data sharing, integration, and annotation contribute to the reproducibility, accessibility, and advancement of biological research by enabling seamless data retrieval, analysis, and collaboration across scientific disciplines [36]. Biomedical data repositories are available today based on biomedical knowledge-based systems with strict management principles that are designed to extract, accumulate, process, store, and disseminate specific datasets, fostering international collaboration and accelerating scientific discovery and innovation in future healthcare systems [37]. However, existing infrastructures for the search and discovery of such datasets often face limitations in terms of accessibility, interoperability, and ethical compliance [38]. Numerous studies have identified key barriers, including inconsistencies in data formats, the lack of standardized protocols, and fragmented data governance systems, all of which hinder large-scale collaborative research efforts [39,40]. These challenges are particularly evident in data-driven fields such as precision medicine and genomics, in which access to harmonized, high-quality datasets is essential for deriving meaningful insights [41]. Many traditional organizations hosting both biosamples and related pre-analytical data and metadata, such as biobanks and other data repositories, operate within localized or institution-specific frameworks with ad hoc systems, restricting the seamless exchange of information across borders and with other stakeholders [42]. Additionally, the reliance on static consent models can create ethical and legal complexities, limiting the usability of biosamples and related data over time [43].
Addressing these limitations, our platform provides a dynamic, scalable, and ethically responsible solution that ensures real-time control of biosamples and data. It fosters international research collaboration by enabling interoperable, real-time annotation of both biosamples and patient profiles available for research initiatives, while complying with national and international standards. These sets of biological material and data are available to accredited researchers in real time, with access rights granted according to the consent of the original donor, thus following a patient-centric approach throughout research and development in modern medicine. By allowing donors to update their preferences as needed, our real-time dynamic e-consent module directly addresses issues of ethical compliance, particularly in multi-jurisdictional research, in which data privacy laws such as GDPR and HIPAA impose varying constraints on data sharing and processing.
The design and implementation of large-scale biomedical databases require a multidisciplinary approach, integrating secure data management, compliance with ethical and legal frameworks, and user-friendly interfaces that facilitate data retrieval and analysis [44]. A critical aspect of our platform’s design is its modular architecture, which ensures high availability, redundancy, and the real-time integration of data from multiple sources, including biobanks, hospitals, clinical research facilities, and patients themselves. Unlike conventional data repositories that rely on siloed storage and static databases [45], our system enables multi-criteria querying through an advanced data model, enhancing accessibility and research efficiency. The distributed network ensures that biosamples and associated data remain traceable, up-to-date, and available for approved researchers in compliance with donor consent and regulatory policies. Each searchable pool of biosamples and/or datasets is annotated in real time, linked with longitudinal pre-analytical data and metadata, creating fit-for-purpose cohorts for research and hypothesis-driven initiatives.
Applications of such research platforms span various biomedical domains, including genomics, personalized medicine, epidemiological studies, and pharmaceutical development. These domains often face significant challenges in integrating different data modalities: researchers encounter difficulties in combining data from sources such as genetic sequences, clinical records, medical imaging, and environmental factors [46]. The absence of standardized data formats exacerbates these challenges by creating inefficiencies in clinical data integration and analysis, as shown by initiatives like Project Data Sphere, which demonstrated that standardized cancer research data were more interpretable and valuable than proprietary datasets [47].
Our platform provides a seamless data-sharing and integration framework that allows researchers to overcome these issues by offering a centralized system in which multiple data types can be harmonized and accessed in real time. This integration of multi-modal data (i.e., from genomic sequences and clinical trials to electronic health records (EHR) and medical imaging) enables a comprehensive approach to research, ensuring that researchers can generate more accurate and meaningful insights that would not be possible from isolated data sources.
One of the key advantages of our platform is its ability to support cross-institutional research collaborations, enabling data to be securely shared across different research institutions and geographical locations. This is particularly valuable for large-scale studies, for which pooling resources and expertise across institutions can lead to more robust and diverse datasets, accelerating the discovery of critical health insights.
In addition, the platform introduces a multi-criteria query tool that allows researchers to perform complex, real-time searches across multiple datasets, significantly improving data accessibility and research efficiency. The integration of distributed data storage ensures high availability and redundancy, mitigating risks associated with data loss or system failures. Moreover, the platform’s role-based authentication system enhances security by granting user-specific permissions, ensuring that biosamples and data are accessed strictly within authorized frameworks. To illustrate its potential future use and added value, two use cases for advanced research initiatives are presented:
Use Case 1: Cross-Border Oncology Research Using Multi-Criteria Query Tool
Scenario: A team of oncology researchers is conducting a study on the effectiveness of targeted therapies for lung cancer patients with specific genetic mutations. They need access to biosamples from diverse patient populations across multiple regions, with detailed data on patient demographics, treatment history, and genomic profiles. However, sourcing the right biosamples while ensuring ethical compliance is challenging due to fragmented data sources and varying consent regulations.
Current Challenges:
  • Biosamples and associated patient data are stored in multiple, siloed biobanks and clinical registries, making data retrieval slow and inefficient;
  • Researchers face difficulties tracking and updating consent, leading to potential legal and ethical issues;
  • Existing systems lack advanced search functionalities, making it difficult to refine sample selection based on multiple criteria.
How the Solution Adds Value:
The proposed platform integrates a multi-criteria query tool and a real-time dynamic e-consent (DRT e-consent) system, streamlining biosample access and ensuring regulatory compliance.
Workflow and Benefits:
  • Researchers log in to the platform via a secure authentication system;
  • Using the multi-criteria query tool, they filter samples based on genetic markers, treatment response, demographic factors, and pre-analytical conditions;
  • The system processes the query in real time, retrieving data from multiple sources while maintaining data integrity;
  • The DRT e-consent system provides up-to-date information on donor consent status;
  • Automated alerts notify researchers if additional consent is needed, ensuring compliance with ethical guidelines;
  • A universal dashboard provides aggregated insights into available biosamples and patient cohorts across regions within the associated network;
  • Query history is logged for reproducibility, enhancing research credibility and compliance tracking.
Outcome: Researchers can quickly access high-quality biosamples, accelerate oncological discoveries, and improve personalized treatment approaches, with fewer delays and lower ethical risk.
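The workflow of Use Case 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform’s implementation: the `Biosample` and `QueryLog` classes, the `run_query` function, the attribute names, and the sample records are all hypothetical, and consent is reduced to a single boolean snapshot per sample.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Biosample:
    sample_id: str
    biobank: str
    attributes: dict      # hypothetical keys, e.g. {"mutation": "EGFR"}
    consent_active: bool  # assumed snapshot of the donor's current consent


@dataclass
class QueryLog:
    entries: list = field(default_factory=list)

    def record(self, user: str, criteria: dict, hits: int) -> None:
        # Store the criteria with a timestamp so the search can be repeated.
        self.entries.append({
            "user": user,
            "criteria": dict(criteria),
            "hits": hits,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def run_query(samples, criteria, user, log):
    """Return samples that meet every criterion and have active consent."""
    matches = [
        s for s in samples
        if s.consent_active
        and all(s.attributes.get(k) == v for k, v in criteria.items())
    ]
    log.record(user, criteria, len(matches))
    return matches


# Invented records from two hypothetical biobanks.
samples = [
    Biosample("S1", "Biobank A", {"mutation": "EGFR", "therapy_response": "partial"}, True),
    Biosample("S2", "Biobank B", {"mutation": "EGFR", "therapy_response": "partial"}, False),
    Biosample("S3", "Biobank B", {"mutation": "KRAS", "therapy_response": "partial"}, True),
]
log = QueryLog()
result = run_query(
    samples,
    {"mutation": "EGFR", "therapy_response": "partial"},
    "researcher_1",
    log,
)
print([s.sample_id for s in result])
```

In this sketch, sample S2 matches the search criteria but is excluded because its consent flag is inactive, and the logged entry keeps the exact criteria so the query can be reproduced later.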
Use Case 2: Infectious Disease Surveillance and Sample Recruitment
Scenario: A global health research institution is tracking an emerging viral outbreak. To study viral evolution and immune response variations, they need biosamples from infected patients across different demographics and geographic locations. However, real-time sample access and patient recruitment remain bottlenecks due to slow data retrieval, inconsistent consent tracking, and fragmented research networks.
Current Challenges:
  • Public health organizations struggle to rapidly locate biosamples relevant to ongoing outbreaks due to disconnected databases;
  • Patient consent preferences change frequently, but existing systems fail to update researchers in real time;
  • Researchers cannot easily refine searches by multiple parameters, making it difficult to identify the most relevant biosamples for surveillance studies.
How the Solution Adds Value:
The platform enhances infectious disease research by integrating real-time data sources, automated compliance tracking, and an intuitive user interface, which together accelerate biosample discovery and help keep research aligned with evolving regulatory and ethical standards.
Workflow and Benefits:
  • Researchers log in and submit a multi-criteria query to filter samples based on viral strain, symptom severity, patient demographics, and hospitalization history;
  • The system searches multiple biobanks in real time, providing ranked results based on clustering thresholds (50%, 70%, and 100%);
  • The DRT e-consent system updates patient consent status dynamically;
  • Researchers receive instant notifications if a patient withdraws their consent, preventing unauthorized sample usage;
  • A universal dashboard provides regional outbreak statistics, helping policymakers allocate resources efficiently.
Outcome: The platform’s real-time analytics, multi-criteria querying, and consent tracking enable a faster outbreak response, more effective biosample recruitment, and improved global health research collaboration.
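The ranking step in Use Case 2 can be illustrated as follows. The text names three thresholds (50%, 70%, and 100%) but does not define how they are computed, so this sketch assumes the simplest reading: a sample’s score is the share of query criteria it satisfies. The function names, attribute keys, and sample records are hypothetical.

```python
THRESHOLDS = (1.0, 0.7, 0.5)  # tiers named in the text: 100%, 70%, 50%


def match_fraction(attributes: dict, criteria: dict) -> float:
    """Share of query criteria a sample satisfies (assumed scoring rule)."""
    if not criteria:
        return 0.0
    met = sum(1 for k, v in criteria.items() if attributes.get(k) == v)
    return met / len(criteria)


def tier_for(fraction: float):
    """Highest threshold reached, or None if below the lowest tier."""
    for threshold in THRESHOLDS:
        if fraction >= threshold:
            return threshold
    return None


def rank_samples(samples: dict, criteria: dict, withdrawn: set) -> list:
    """Return (tier, sample_id) pairs for consented samples, best tier first."""
    ranked = []
    for sample_id, attributes in samples.items():
        if sample_id in withdrawn:  # consent withdrawn: never returned
            continue
        tier = tier_for(match_fraction(attributes, criteria))
        if tier is not None:
            ranked.append((tier, sample_id))
    return sorted(ranked, key=lambda pair: (-pair[0], pair[1]))


# Invented query and records for illustration only.
criteria = {"strain": "variant-A", "severity": "severe",
            "age_group": "60-69", "hospitalized": True}
samples = {
    "V1": {"strain": "variant-A", "severity": "severe", "age_group": "60-69", "hospitalized": True},
    "V2": {"strain": "variant-A", "severity": "severe", "age_group": "60-69", "hospitalized": False},
    "V3": {"strain": "variant-A", "severity": "severe", "age_group": "30-39", "hospitalized": False},
    "V4": {"strain": "variant-A", "severity": "mild", "age_group": "30-39", "hospitalized": False},
    "V5": {"strain": "variant-A", "severity": "severe", "age_group": "60-69", "hospitalized": True},
}
ranked = rank_samples(samples, criteria, withdrawn={"V5"})
print(ranked)
```

Here V5 satisfies every criterion but is dropped because its donor has withdrawn consent, and V4 is dropped because it meets only one of the four criteria and so falls below the lowest tier.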
As the demand for multi-institutional research grows, driven by the rise of precision medicine and collaborative public health efforts [48,49], the platform offers significant advantages for the future of biomedical research. It provides an infrastructure that supports international collaboration, enabling researchers to access and share biosamples and data beyond geographical boundaries. This broadens the scope and impact of research projects, leading to more comprehensive studies and more robust scientific outcomes. Real-time analytics and search histories further optimize research workflows, allowing researchers to track their queries and refine their approaches based on previous results. The universal dashboard offers aggregated regional statistics, providing insights into biosample availability and patient cohort distributions across different demographics. Table 2 highlights the broad impact of our research platform on society, researchers, and healthcare. Researchers gain access to advanced analytics, seamless data integration, and secure collaboration tools, accelerating discoveries and improving the efficiency of clinical trials. Society benefits from improved disease prevention, equitable healthcare advancements, and enhanced public health responses through large-scale epidemiological studies. The healthcare system gains real-time insights for precision treatments, better disease management, and faster drug development, ultimately leading to improved patient outcomes and a more efficient healthcare infrastructure.
Despite its many advantages, the platform faces certain challenges that must be addressed for long-term sustainability and scalability. One of the primary challenges is ensuring its widespread adoption among research institutions, biobanks, and clinical centers. While the system provides significant advantages over traditional platforms, integrating it within existing infrastructures may require policy adaptations and training. Another challenge is maintaining data integrity and interoperability across different sources, which requires the continuous refinement of data processing and standardization techniques, especially for paper-based records that have not yet been digitized. Addressing these challenges will be crucial for maximizing the platform’s impact and ensuring its long-term viability in the research ecosystem.
Looking ahead, further development is planned to maximize the platform’s impact. Upcoming initiatives include the following:
  • Expanding Modular Capabilities: Incorporating AI-driven analytics and genomic data repositories will enhance the platform’s ability to support cutting-edge research;
  • Strengthening Interoperability: Additional compatibility with emerging international standards will facilitate broader adoption, particularly in regions with distinct regulatory frameworks;
  • Training and Education Programs: Developing comprehensive training modules will help stakeholders optimize platform usage and encourage widespread participation in the research community;
  • Strengthening International Collaborations: Aligning with regulatory frameworks will be key to expanding the platform’s reach and ensuring compliance across diverse legal landscapes;
  • Incorporating Blockchain Technology: Applying blockchain to consent management would add an extra layer of security and transparency, ensuring that all consent modifications are immutable and verifiable;
  • Pilot Testing: Ensuring the seamless handling of multi-modal data (genomic, clinical, imaging, etc.), engaging a spectrum of stakeholders, and gathering feedback from end-users for validation and improvement.
These enhancements will further position the solution as a transformative tool for global efforts in medical research, enabling the platform to remain a cornerstone in fostering effective scientific collaboration. The successful implementation of the platform has significant implications beyond the immediate scope of biosample-oriented research. Its modular, scalable design can be adapted to support various types of data-driven research initiatives, from public health surveillance to translational medicine. Furthermore, the platform exemplifies the importance of international collaboration in addressing global challenges, emphasizing the need for shared resources, common standards, and equitable access to data.
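The idea behind the planned blockchain-based consent record, namely that consent modifications should be immutable and verifiable, can be illustrated with a much simpler structure. The sketch below is a plain hash chain rather than a distributed blockchain, and it does not describe the platform’s design: the `ConsentChain` class, its field names, and the donor identifiers are hypothetical. Each record stores the hash of the previous one, so any later alteration is detectable on verification.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(payload: dict) -> str:
    # Canonical JSON so identical content always yields the same hash.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class ConsentChain:
    """Append-only list of consent changes, each linked to the previous hash."""

    def __init__(self):
        self.records = []

    def append(self, donor_id: str, status: str) -> None:
        prev_hash = self.records[-1]["hash"] if self.records else GENESIS
        body = {"donor_id": donor_id, "status": status, "prev_hash": prev_hash}
        self.records.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        """Recompute every hash and link; False if any record was changed."""
        prev_hash = GENESIS
        for rec in self.records:
            body = {k: rec[k] for k in ("donor_id", "status", "prev_hash")}
            if rec["prev_hash"] != prev_hash or rec["hash"] != _digest(body):
                return False
            prev_hash = rec["hash"]
        return True


chain = ConsentChain()
chain.append("donor-001", "granted")
chain.append("donor-001", "withdrawn")
intact = chain.verify()

chain.records[0]["status"] = "withdrawn"  # simulate tampering with history
tampered_ok = chain.verify()
print(intact, tampered_ok)
```

A production system would additionally need distributed replication, signatures, and access control, none of which this sketch attempts.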

5. Conclusions

In an era of globalized and digitalized healthcare and research, high-throughput technologies and international collaboration are crucial for future research and development. This underscores the need to co-create cross-sector, interoperable data hubs that connect all members of the value chain, enriching and upgrading current research networks with high-quality, uniquely annotated data, based on the principles of democratization and transparency. The platform for biosample-oriented research initiatives described here represents a substantial advancement in biomedical research, offering a secure, scalable, and ethically responsible approach to managing biosamples and medical data and addressing the limitations of existing research infrastructures. It facilitates global collaboration, enhances data accessibility, and supports ethical compliance through real-time dynamic consent. While challenges remain in data standardization and institutional adoption, the platform has the potential to set new standards for data-driven research, ultimately contributing to innovative discoveries and improved healthcare outcomes worldwide.

Author Contributions

Conceptualization, P.K., D.I. and K.V.; methodology, P.K. and D.I.; software, A.F.; validation, D.I., L.T. and A.B.; formal analysis, D.I. and L.T.; investigation, D.I., A.F., L.T. and A.B.; resources, P.K.; data curation, D.I.; writing—original draft preparation, D.I. and L.T.; writing—review and editing, D.I., L.T., A.B. and P.K.; visualization, A.F., D.I., L.T. and A.B.; supervision, P.K.; project administration, P.K. and D.I.; funding acquisition, P.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is unavailable due to privacy or ethical restrictions.

Conflicts of Interest

All authors are employed by Metabio, and the described platform was created and implemented for commercial purposes.

References

  1. Ward, C.L.; Shaw, D.; Sprumont, D.; Sankoh, O.; Tanner, M.; Elger, B. Good collaborative practice: Reforming capacity building governance of international health research partnerships. Glob. Health 2018, 14, 1.
  2. Saenz, C.; Krahn, T.M.; Smith, M.J.; Haby, M.M.; Carracedo, S.; Reveiz, L. Advancing collaborative research for health: Why does collaboration matter? BMJ Glob. Health 2024, 9, e014971.
  3. Belmont, J.W.; Hardenbol, P.; Willis, T.D.; Yu, F.; Yang, H.; Chang, L.Y.; Huang, W.; Liu, B.; Shen, Y.; Tam, P.K.H.; et al. The International HapMap Project. Nature 2003, 426, 789–796.
  4. Yao, B. International Research Collaboration: Challenges and Opportunities. J. Diagn. Med. Sonogr. 2021, 37, 107–108.
  5. Nott, M.; Schmidt, D.; Thomas, M.; Reilly, K.; Saksena, T.; Kennedy, J.; Hawke, C.; Christian, B. Collaborations between health services and educational institutions to develop research capacity in health services and health service staff: A systematic scoping review. BMC Health Serv. Res. 2024, 24, 1363.
  6. Dixon-Woods, M.; Foy, C.; Hayden, C.; Al-Shahi Salman, R.; Tebbutt, S.; Schroter, S. Can an ethics officer role reduce delays in research ethics approval? A mixed-method evaluation of an improvement project. BMJ Open 2016, 6, e011973.
  7. Canario Guzmán, J.A.; Espinal, R.; Báez, J.; Melgen, R.E.; Rosario, P.A.P.; Mendoza, E.R. Ethical challenges for international collaborative research partnerships in the context of the Zika outbreak in the Dominican Republic: A qualitative case study. Health Res. Policy Syst. 2017, 15, 82.
  8. Huynh, T. Collaborative research in healthcare: Uncovering the impact of industry collaboration on the service innovativeness of university hospitals. J. Technol. Transf. 2024, 50, 1–28.
  9. Li, X.; Cong, Y. Exploring barriers and ethical challenges to medical data sharing: Perspectives from Chinese researchers. BMC Med. Ethics 2024, 25, 132.
  10. Kosiol, J.; Silvester, T.; Cooper, H.; Alford, S.; Fraser, L. Revolutionising health and social care: Innovative solutions for a brighter tomorrow–A systematic review of the literature. BMC Health Serv. Res. 2024, 24, 809.
  11. Dusdal, J.; Powell, J.J.W. Benefits, Motivations, and Challenges of International Collaborative Research: A Sociology of Science Case Study. Sci. Public Policy 2021, 48, 235–245.
  12. Figueiredo, M.S.N.; Pereira, A.M. Managing Knowledge–The Importance of Databases in the Scientific Production. Procedia Manuf. 2017, 12, 166–173.
  13. Mikhael, E.M.; Al-Jumaili, A.A.; Jamal, M.Y.; Abdulazeez, Z.D. Current status and perceived challenges of collaborative research in a leading pharmacy college in Iraq: A qualitative study. BMC Med. Educ. 2025, 25, 61.
  14. Muenzen, K.D.; Amendola, L.M.; Kauffman, T.L.; Mittendorf, K.F.; Bensen, J.T.; Chen, F.; Green, R.; Powell, B.C.; Kvale, M.; Angelo, F.; et al. Lessons learned and recommendations for data coordination in collaborative research: The CSER consortium experience. HGG Adv. 2022, 3, 100120.
  15. Navale, V.; von Kaeppler, D.; McAuliffe, M. An overview of biomedical platforms for managing research data. J. Data Inf. Manag. 2021, 3, 21–27.
  16. Winickoff, D.E.; Kreiling, L.; Borowiecki, M.; Garden, H.; Philp, J. Collaborative Platforms for Emerging Technology: Creating Convergence Spaces. Available online: http://www.oecd.org/termsandconditions (accessed on 5 February 2025).
  17. Metabio–Metabio. Available online: https://metab.io/ (accessed on 22 April 2025).
  18. Dowling, N.M.; Bolt, D.M.; Deng, S.; Li, C. Measurement and control of bias in patient reported outcomes using multidimensional item response theory. BMC Med. Res. Methodol. 2016, 16, 63.
  19. Dankar, F.K.; Gergely, M.; Dankar, S.K. Informed Consent in Biomedical Research. Comput. Struct. Biotechnol. J. 2019, 17, 463–474.
  20. Inan, O.T.; Tenaerts, P.; Prindiville, S.A.; Reynolds, H.R.; Dizon, D.S.; Cooper-Arnold, K.; Turakhia, M.; Pletcher, M.J.; Preston, K.L.; Krumholz, H.M.; et al. Digitizing clinical trials. NPJ Digit. Med. 2020, 3, 1–7.
  21. Kazmierska, J.; Hope, A.; Spezi, E.; Beddar, S.; Nailon, W.H.; Osong, B.; Ankolekar, A.; Choudhury, A.; Dekker, A.; Redalen, K.R.; et al. From multisource data to clinical decision aids in radiation oncology: The need for a clinical data science community. Radiother. Oncol. 2020, 153, 43–54.
  22. Matthews, K.R.W.; Yang, E.; Lewis, S.W.; Vaidyanathan, B.R.; Gorman, M. International scientific collaborative activities and barriers to them in eight societies. Account. Res. 2020, 27, 477–495.
  23. Swift, B.; Jain, L.; White, C.; Chandrasekaran, V.; Bhandari, A.; Hughes, D.A.; Jadhav, P.R. Innovation at the Intersection of Clinical Trials and Real-World Data Science to Advance Patient Care. Clin. Transl. Sci. 2018, 11, 450–460.
  24. Torab-Miandoab, A.; Samad-Soltani, T.; Jodati, A.; Rezaei-Hachesu, P. Interoperability of heterogeneous health information systems: A systematic literature review. BMC Med. Inform. Decis. Mak. 2023, 23, 18.
  25. Blumenthal, S. Improving Interoperability between Registries and EHRs. AMIA Summits Transl. Sci. Proc. 2018, 2018, 20.
  26. Kim, E.; Rubinstein, S.M.; Nead, K.T.; Wojcieszynski, A.P.; Gabriel, P.E.; Warner, J.L. The Evolving Use of Electronic Health Records (EHR) for Research. Semin. Radiat. Oncol. 2019, 29, 354–361.
  27. Yen, P.Y.; McAlearney, A.S.; Sieck, C.J.; Hefner, J.L.; Huerta, T.R. Health Information Technology (HIT) Adaptation: Refocusing on the Journey to Successful HIT Implementation. JMIR Med. Inform. 2017, 5, e7476.
  28. Vorisek, C.N.; Lehne, M.; Klopfenstein, S.A.I.; Mayer, P.J.; Bartschke, A.; Haese, T.; Thun, S. Fast Healthcare Interoperability Resources (FHIR) for Interoperability in Health Research: Systematic Review. JMIR Med. Inform. 2022, 10, e35724.
  29. ISO/IEC 27001:2022—Information Security Management Systems. Available online: https://www.iso.org/standard/27001 (accessed on 22 April 2025).
  30. General Data Protection Regulation (GDPR) Compliance Guidelines. 2021. Available online: https://gdpr.eu/ (accessed on 22 April 2025).
  31. HIPAA Home|HHS.gov. Available online: https://www.hhs.gov/programs/hipaa/index.html (accessed on 22 April 2025).
  32. Chillotti, I.; Gama, N.; Georgieva, M.; Izabachène, M. TFHE: Fast Fully Homomorphic Encryption Over the Torus. J. Cryptol. 2020, 33, 34–91.
  33. Dingledy, F.W.; Matamoros, A.B. What Is Digital Rights Management? 2016. Available online: https://scholarship.law.wm.edu/libpubs/122 (accessed on 25 May 2021).
  34. Ivanova, D.; Katsaounis, P. Real-Time Dynamic Tiered e-Consent: A Novel Tool for Patients’ Engagement and Common Ontology System for the Management of Medical Data. Innov. Digit. Health Diagn. Biomark. 2021, 1, 45–49.
  35. Hu, V.C.; Ferraiolo, D.; Kuhn, R.; Schnitzer, A.; Sandlin, K.; Miller, R.; Scarfone, K. NIST Special Publication 800-162 Guide to Attribute Based Access Control (ABAC) Definition and Considerations. NIST Spec. Publ. 2014, 800, 1–54.
  36. Lapatas, V.; Stefanidakis, M.; Jimenez, R.C.; Via, A.; Schneider, M.V. Data integration in biological research: An overview. J. Biol. Res. 2015, 22, 9.
  37. Lin, D.; McAuliffe, M.; Pruitt, K.D.; Gururaj, A.; Melchior, C.; Schmitt, C.; Wright, S.N. Biomedical Data Repository Concepts and Management Principles. Sci. Data 2024, 11, 622.
  38. Shanahan, H.; Bezuidenhout, L. Rethinking the A in FAIR Data: Issues of Data Access and Accessibility in Research. Front. Res. Metr. Anal. 2022, 7, 912456.
  39. Jacobson, L.P.; Parker, C.B.; Cella, D.; Mroczek, D.K.; Lester, B.M.; Smith, P.B.; Newby, K.L.; Gershon, R.; Cella, D. Approaches to protocol standardization and data harmonization in the ECHO-wide cohort study. Pediatr. Res. 2024, 95, 1726.
  40. Facile, R.; Elizabeth Muhlbradt, E.; Gong, M.; Li, Q.-N.; Popat, V.B.; Pétavy, F.; Cornet, R.; Ruan, Y.; Koide, D.; Saito, I.; et al. The Use of CDISC Standards for Real-World Data (RWD): Expert Perspectives from a Qualitative Delphi Survey. JMIR Med. Inform. 2022, 10, e30363.
  41. Kush, R.D.; Warzel, D.; Kush, M.A.; Sherman, A.; Navarro, E.A.; Fitzmartin, R.; Pétavy, F.; Galvez, J.; Becnel, L.B.; Zhou, F.L.; et al. FAIR data sharing: The roles of common data elements and harmonization. J. Biomed. Inform. 2020, 107, 103421.
  42. Hoffman, N.; Alkhatib, R.; Gaede, K.I. Data Management in Biobanking: Strategies, Challenges, and Future Directions. BioTech 2024, 13, 34.
  43. Jacquier, E.; Laurent-Puig, P.; Badoual, C.; Burgun, A.; Mamzer, M.F. Facing new challenges to informed consent processes in the context of translational research: The case in CARPEM consortium. BMC Med. Ethics 2021, 22, 21.
  44. Dankar, F.K.; Ptitsyn, A.; Dankar, S.K. The development of large-scale de-identified biomedical databases in the age of genomics-principles and challenges. Hum. Genom. 2018, 12, 19.
  45. Asiimwe, R.; Lam, S.; Leung, S.; Wang, S.; Wan, R.; Tinker, A.; McAlpine, J.N.; Woo, M.M.M.; Huntsman, D.G.; Talhouk, A. From biobank and data silos into a data commons: Convergence to support translational medicine. J. Transl. Med. 2021, 19, 493.
  46. Rajendran, S.; Pan, W.; Sabuncu, M.R.; Chen, Y.; Zhou, J.; Wang, F. Learning across diverse biomedical data modalities and cohorts: Challenges and opportunities for innovation. Patterns 2024, 5, 100913.
  47. Green, A.K.; Reeder-Hayes, K.E.; Corty, R.W.; Basch, E.; Milowsky, M.I.; Dusetzina, S.B.; Bennett, A.V.; Wood, W.A. The project data sphere initiative: Accelerating cancer research by sharing data. Oncologist 2015, 20, 464-e20.
  48. Ashley, E.A. The precision medicine initiative: A new national effort. JAMA 2015, 313, 2119–2120.
  49. Shin, S.H.; Bode, A.M.; Dong, Z. Precision medicine: The foundation of future cancer therapeutics. NPJ Precis. Oncol. 2017, 1, 12.
Figure 1. Researchers’ platform for real-time data management, multi-criteria querying, and secure data exchange.
Figure 2. Multi-criteria query system for advanced research and international collaborations based on cluster adoption tools and real-time analytics.
Figure 3. Image of the platform interface and available features.
Table 1. Researchers’ needs and constraints [11,18,19,20,21,22,23].
Rapid Access to High-Quality Samples and Data
Needs:
  • Crucial for meaningful studies, biomarker discovery, and treatment development
  • Requires diversity for generalizable results
  • Supports swift progress or quick failure of unviable research
Constraints:
  • Regulatory and ethical hurdles delay access
  • Variability in sample quality and data annotation
  • Limited availability of rare or high-demand samples
  • Data privacy and security concerns
Collaboration and Interdisciplinary Research
Needs:
  • Cross-disciplinary teamwork fosters innovation
  • Global partnerships enhance reach and resource sharing
  • Clear communication essential for synergy
Constraints:
  • Language and cultural barriers
  • Complex IP and data-sharing negotiations
  • Uneven funding and conflicting priorities
  • Institutional and operational differences
Transparent Reporting and Accountability
Needs:
  • Builds public trust and ethical integrity
  • Supports informed consent and fair research practices
  • Enhances participant engagement and credibility
Constraints:
  • Complicated, inconsistent regulations across regions
  • Informed consent challenges with diverse populations
  • Ethical dilemmas vs. research progress
  • High cost and complexity of data protection measures
Technological Advancements and Data Management
Needs:
  • Cutting-edge tools and systems drive impactful research
  • Big data handling and integration are crucial for insights
Constraints:
  • High cost of acquisition and maintenance
  • Skill and training gaps
  • Integration and standardization challenges
  • Limited computational infrastructure
Research Reproducibility and Transparency
Needs:
  • Key to reliable, verifiable science
  • Open data sharing and methodological clarity promote collaboration
Constraints:
  • Privacy and proprietary concerns
  • Systemic disincentives for reproducibility
  • Time and resource demands
  • Pressure to publish may compromise quality
Table 2. Impact of Integrated Biomedical Data Infrastructure Across Key Sectors.
Society
  • Support: Advances biomedical research. Impact: Improves disease detection and prevention.
  • Support: Contributes to personalized medicine. Impact: Reduces adverse drug reactions and enhances patient outcomes.
  • Support: Ensures research includes diverse populations. Impact: Supports equitable healthcare.
  • Support: Aids in large-scale epidemiological studies. Impact: Improves public health policies and responses to global health crises.
  • Support: Enhances the ability to track and manage disease outbreaks, including pandemics. Impact: Improves health outcomes and societal cohesion.
Researchers
  • Support: Breaks down data silos. Impact: Enables seamless collaboration across institutions and disciplines.
  • Support: Provides a scalable and secure infrastructure. Impact: Efficient management of complex biomedical datasets.
  • Support: Integrates multi-modal data (genomics, clinical records, imaging, etc.). Impact: Enhances research accuracy and efficacy.
  • Support: Provides advanced analytics. Impact: Accelerates drug discovery and biomarker identification.
  • Support: Enriches biosample-related data. Impact: Reduces the time and costs associated with clinical trials, expediting the translation of research into real-world treatments.
Healthcare
  • Support: Integrates genomic and clinical data for tailored treatments. Impact: Enables precision medicine.
  • Support: Improves disease surveillance and outbreak prediction. Impact: Enhances public health preparedness.
  • Support: Provides real-time, evidence-based insights. Impact: Empowers clinicians to make better decisions.
  • Support: Enhances pharmaceutical innovation. Impact: Faster drug discovery and development.
  • Support: Optimizes resource allocation and patient care strategies. Impact: Supports data-driven healthcare policies.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
