1. Introduction
Healthcare practitioners oversee various data, including the patient’s medical history (diagnoses and prescriptions), medical and clinical data (imaging and laboratory procedures), and other private or confidential medical information. Previously, it was a common practice to keep patients’ medical records in handwritten notes or typed reports [1]. However, with the development of computer systems, clinical tests and records are increasingly digitized into electronic records known as Electronic Health Records (EHRs). EHRs are a digital version of medical records that contain information about a patient’s past, current, or future physical or mental health or condition [2]. Meanwhile, the concept of “big data in healthcare” is growing as a result of the integration of healthcare payer–provider data such as EHRs, pharmacy prescription, imaging, and insurance records along with genomics-driven experiments such as genotyping and gene expression data. In addition, vast amounts of data are now being collected in real time from wearables, smartphones, smart devices, chips, and other sources connected through the Internet of Things (IoT). The main goal of integrating big data in healthcare is to improve healthcare quality and service efficiency, lower costs, and reduce medical errors [3].
Precision medicine is a concept that can potentially transform medical interventions by providing effective, tailored therapeutic and treatment strategies based on an individual’s genomics and omics profiles. The 21st century vision of precision medicine is to provide ‘the right drug, with the right dose at the right time to the right patient’ [4]. It is currently a novel topic in the healthcare industry. Precision medicine involves tailoring a treatment specific to an individual with a disease. Furthermore, it helps in the prevention of disease. Increased utilization of molecular and genomic stratification of patients, for instance, assessing for mutations that give rise to resistance to certain treatments, will provide medical professionals with clear evidence upon which to base treatment strategies for individual patients. With this development, prescribing and drug delivery will no longer depend on trial and error and its adverse outcomes.
Currently, when the prescribed medication is ineffective, the patient may be switched to a different medication. This trial-and-error approach leads to poorer outcomes for patients in terms of adverse side effects, drug interactions, and potential disease progression, meaning effective treatment is delayed and patients are dissatisfied [5]. Moreover, it also affects the healthcare industry in terms of wasted time, wasted inventory, and the overall quality of hospital performance. On the other hand, precision medicine is expensive due to the high cost of the needed technology.
In precision medicine, there are high expectations for genomics data to provide clues on what causes a disease’s initiation and progression and to enable the development of new strategies for disease prediction, prevention, and treatment. The idea is to translate omics profiles into subject-specific care based on their disease networks [6]. However, despite growing access to omics profiles, the ability to decipher molecules and their mechanisms remains limited due to the complexity of the biological processes, limitations in statistical analysis, and cellular heterogeneity, which together create a bottleneck. The bottleneck has since shifted from data generation to interpreting, analyzing, integrating, and managing the data. Therefore, informatics–computational technology is an essential component of precision medicine. Ultimately, integrating big health data into precision medicine could be costly at the beginning due to the investment in new technology. Still, it will be provided to later generations at a lower cost and with greater efficiency.
Integrating big data to develop precision medicine in healthcare seems to be a promising direction for improving patient treatment and care and the allocation of resources by the healthcare service, reducing wasteful expenditure and improving the overall quality of healthcare. However, although precision medicine is becoming the new trend, the healthcare sector faces many challenges in integrating big data and precision medicine into its practice, which need to be discussed and considered. On the other hand, big data and precision medicine offer many benefits that are often underestimated or unnoticed. Therefore, it is important to identify the benefits that can be gained from such integration.
This work discusses the limitations of precision medicine to answer the following questions: Is the hindrance to utilizing precision medicine in healthcare due to a lack of knowledge in integrating new fields in physician practice? On the other hand, what benefits does precision medicine bring forward that aid in recognizing its value? What are potential solutions that could fix the limitations and promote precision medicine? Furthermore, this work proposes solutions to integrate big data and precision medicine in the healthcare industry.
The remainder of the paper is organized as follows: The proposed methodology is described in Section 2. The results and the literature analysis are presented in Section 3. We discuss the implications of the study and present conclusions and directions for future research in Section 4.
2. Materials and Methods
Two research questions directed this literature review. The first question explores the challenges of implementing big data in precision medicine. The second question explores the benefits of big data being implemented with precision medicine. A systematic literature review was performed. The review structure follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
2.1. Literature Search and Inclusion Criteria
The research papers used in this literature review were extracted from online databases and included peer-reviewed papers, conference records, and book chapters. In addition, some previous systematic review papers were also included to benefit from the assimilation of past knowledge. Finally, specific search filters were used as a practical scheme for retrieving related articles from various databases. The database sources that were used are IEEE Xplore Digital Library, PubMed, ScienceDirect, ProQuest, Elsevier, ResearchGate, and Google Scholar.
The bibliography table was used to generate a list of the keywords that could be used in the literature search process to obtain the relevant articles. Boolean operators such as “AND” and “OR” were used to refine the search. The search variations were mainly in the form (ab(big data) OR ab(personalized)) AND (ab(challenges) OR ab(limitations)) AND (ab(benefits) OR ab(opportunities)). The retrieved papers were qualitative and theoretical. The most common report styles included comprehensive case studies of a sector and reviews of different limitations and challenges that prevent big data and precision medicine implementation in healthcare. These reports also suggested solutions and described the substantial advantages that could result from adopting big data and precision medicine as a standard in the healthcare industry. Some research formats included randomized controlled trials, surveys, case studies, and examples of applications that apply big data in healthcare.
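As an illustration only, the search-string form quoted above can be assembled programmatically from keyword groups, with OR joining terms inside a group and AND joining the groups. The sketch below is a minimal example under stated assumptions: the function name `build_query` and the variable `KEYWORD_GROUPS` are hypothetical, the `ab(...)` abstract-field syntax is copied from the form given in the text, and other databases may require a different field syntax.

```python
def build_query(groups, field="ab"):
    """Join keyword groups: OR within each group, AND across groups.

    `field` is the field tag wrapped around every term; "ab" mirrors the
    abstract-field form quoted in the text (an assumption for other databases).
    """
    clauses = []
    for terms in groups:
        ored = " OR ".join(f"{field}({term})" for term in terms)
        clauses.append(f"({ored})")
    return " AND ".join(clauses)


# Keyword groups taken from the search form reported in the text.
KEYWORD_GROUPS = [
    ["big data", "personalized"],
    ["challenges", "limitations"],
    ["benefits", "opportunities"],
]

query = build_query(KEYWORD_GROUPS)
print(query)
```

Keeping the keyword groups in a list makes it straightforward to generate the search variations mentioned above by adding or swapping terms in one group at a time.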
2.2. Screening and Coding Procedures
A total of 556 records were initially identified via database searching. Next, the articles with related titles or relevant abstracts were inspected and held for further analysis. To be included, a paper had to be a scholarly journal article or study whose content related to the review questions. After excluding the duplicates, the articles were further screened to exclude those that did not focus on the topic of big data in precision medicine. As a result, 150 studies were identified from the used databases.
Additionally, the reference lists of all the selected papers were searched. Finally, 46 articles were used for this literature review highlighting the trend of precision medicine using big data.
Figure 1 shows an analysis of the percentage of references out of a total of 46 that convey each topic in this paper, while Figure 2 shows the PRISMA Flow Diagram. Finally, supplementary to the screening procedure, simple bibliographic data were tabulated to trace the genesis of ideas, and the research designs were analyzed.
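The screening counts reported above (556 records identified, 150 retained after duplicate removal and topic screening, 46 included) can be tabulated as a simple retention summary. The sketch below is illustrative only: the stage labels are paraphrased from the text, the names `STAGES` and `summarize` are hypothetical, and the reasons for exclusion at each stage are not broken down because the text does not report them.

```python
# Counts as reported in Section 2.2; labels are paraphrased.
STAGES = [
    ("identified via database searching", 556),
    ("retained after duplicate removal and topic screening", 150),
    ("included in the review", 46),
]


def summarize(stages):
    """Return (label, count, dropped_since_previous, percent_of_initial) rows."""
    initial = stages[0][1]
    rows = []
    previous = initial
    for label, count in stages:
        dropped = previous - count
        percent = round(100 * count / initial, 1)
        rows.append((label, count, dropped, percent))
        previous = count
    return rows


for label, count, dropped, percent in summarize(STAGES):
    print(f"{label}: n={count}, dropped={dropped}, {percent}% of initial records")
```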
3. Challenges Associated with Big Data and Precision Medicine
Clinical practice has been slow to incorporate precision medicine. Behind this lag in clinical adoption, there are several challenges that many healthcare delivery systems are facing as they attempt to adapt to the new requirements, practices, and standards.
In this review, challenges from different perspectives that were discussed in several studies were collected and are listed in detail in Table 1. These challenges have been classified into five main categories: awareness and education, patient privacy/data collection, value recognition, data management and infrastructure, and other issues.
3.1. Awareness and Education
One of the main challenges facing healthcare in incorporating precision medicine is the lack of awareness, which can arise on the healthcare sector side or on the patient side.
3.1.1. From the Healthcare Sector Side
A survey reported that only four out of ten consumers are aware of precision medicine. At the same time, only 11% of the patients reported that their doctor had discussed or recommended precision medicine treatment options [24]. Precision medicine is not a common topic discussed at the point of care. This occurs primarily because clinicians are not fully aware of this new area or are hesitant about implementing it in their practice, believing it is time-consuming and too burdensome [25]. This lack of discussion between clinicians and their patients has several adverse effects, including poor knowledge and awareness about precision medicine among patients, which consequently leads to low demand in that area (see the limitation discussed in Section 3.1.2).
Moreover, integrating precision medicine into medical practice requires using new technology and understanding the technique. In addition, due to the heterogeneous nature of omics data and EHRs, hybrid knowledge about human genomics, diseases, and various analysis algorithms for integrating and interpreting these data is required. Unfortunately, this skill is not yet common among professionals, and there is a noticeable lack of training [21].
3.1.2. From the Community/Patient Side
On the other hand, apart from the low demand caused by the lack of community education, the term “precision medicine” has been used interchangeably with other terms such as “personalized healthcare”, “stratified medicine”, “personalized medicine”, “individualized medicine”, and more, which has led to a misunderstanding about its benefits. However, several studies have tackled this issue by explaining how these terms differ [26]. Precision medicine focuses on tailoring treatments based on individual characteristics like genetics, environment, and lifestyle, aiming for the right treatment for the right patient at the right time. Personalized medicine shares this ethos but encompasses broader patient needs and preferences, emphasizing patient-centered care. Stratified medicine categorizes patients into subgroups based on shared characteristics to optimize treatment efficacy. Individualized medicine tailors interventions to the unique needs of each patient, considering genetics, environment, lifestyle, and personal preferences. In addition, education about personalized medicine is gradually being incorporated into study curricula, targeting not only medical faculties/universities and practitioners but also the wider public, potential patients, and current patients.
3.2. Patient Privacy/Data Collection
Data collection and patient privacy are interconnected in healthcare. The ownership and control of patient data, the protection of sensitive information, and related matters are all important considerations to ensure that data collection respects patient privacy rights. The collection of information to develop big data is the first step in adopting precision medicine. However, healthcare providers tend to face many challenges in data collection. First, there is a common misconception about data ownership, where patient data seem to belong to the institution; however, they are the property of the patient, and accessing and using them outside of the professional sphere necessitates patient consent [27]. Furthermore, due to the ownership misconception, health providers usually do not involve patients’ desires in their healthcare decision-making [25]. Meanwhile, to use the patient’s information, permission must be obtained in the form of signed consent.
Obtaining patient consent requires direct interaction of the patient with on-site staff [13,27], which is time-consuming and costly. Apart from that, patient protocols for the use of molecular data are frequently unclear or inappropriate, leading to misinformation about how the data are going to be used [25,28]. In addition, medical records, which include patients’ personal information, are usually tightly secured and not made public, which makes data collection challenging. In gathering non-medical data, hypothesis-driven research is usually utilized, in which data are destroyed after being used [27]. However, in healthcare, such practice cannot be adopted, as the precision medicine concept revolves around the use of big data (i.e., a large amount of accumulated data); destroying the data would therefore undermine the concept of precision medicine. Furthermore, apart from the information being confidential, data security is very challenging, as there are concerns regarding the insecurity of a patient’s molecular data and its vulnerability to attacks [25,29]. In comparison, non-medical big data are characterized by low cost and low information density and are mainly gathered by chance. Clinical big data, on the other hand, are known to be of high information density. They need to be acquired intentionally and under informed consent, which makes their collection difficult, costly, and time-consuming [27].
3.3. Value Recognition
Before implementing any plan or practice in an organization, ensuring the workforce is knowledgeable about it is important. They must also understand and recognize its value and the benefit it will bring. This will enhance their commitment to adopting it in their practice. However, that is not the case with precision medicine, as it is a new practice being implemented. There seems to be a pressing challenge in identifying the value of precision medicine, and the benefits of incorporating it into healthcare are not yet fully recognized [30]. This leads to another challenge, i.e., it is not yet clear how to convince physicians of the value of precision medicine so that they adopt this new technology into their clinical practice [7]. Many clinicians believe that collecting, storing, and analyzing patients’ molecular information for precision medicine requires more time than it is worth. Some clinicians are also hesitant to implement precision medicine methods for several reasons, including their belief that it demands time for which they are not well compensated and that it is too burdensome to involve genetics experts/counselors in patient care [25].
3.4. Data Management and Infrastructure
In this area of data management and infrastructure, the challenges faced by different organizations can be subdivided into three categories, i.e., lack of standardization; the storage, transfer, and management of data; and data integration issues.
3.4.1. Lack of Standardization
First, as stated before, precision medicine is a new practice that is being implemented in the medical field; because the healthcare (HC) setting has no standard protocols, this has led to ambiguity and challenges in adopting such practice. There is no consistent technique to follow for carrying out research and collecting clinical phenotypes, which has clearly hampered research from moving further. Furthermore, this inconsistency in carrying out the research has also led to the inability to achieve replicable clinical testing results [8]. Replication is one of the most important techniques for scientists to gain confidence in the scientific validity of their findings. When the findings of one study are consistent with those of another, the claim about the new practice is more likely to be trustworthy [9]. Apart from that, Canada, for instance, lacks standardized quality assurance and regulated laboratory oversight. This means that there is no consistent technique for evaluating the clinical method adopted or for approving the use of genetic testing in the first place. All of these factors result in the vagueness of the required standard of care [10]. Adding to that, even when the data are collected, there seems to be no standardized Electronic Health Record (EHR) that accommodates genetic data; where one does exist, it is not generalized or consistent among different healthcare services, research centers, and biobanks (within the same country or between different countries). Overall, due to the lack of consistency in the whole procedure, from the technique of collecting data to evaluating and storing it, there are no clear decision-making procedures in the existing precision medicine programs [25].
3.4.2. Storage, Transfer, and Management of Data
Healthcare is now gathering new and diverse types of data, including healthcare provider data (such as EHRs, pharmacy prescriptions, and insurance records) and genomics data, to adopt the precision medicine concept. The amount of data collected continuously from patients in HC is large and can be used to build the big data for precision medicine. However, there seems to be a limitation in compiling these data to form the big data required. The main cause is that hospital medical data repositories were designed and built in the pre-big-data era to be standalone and siloed. Moreover, the current technological infrastructure of these standalone data repositories does not allow for the transfer, modification, and management of medical data. As a result, the data needed at the volume and velocity that big data approaches require are typically segregated in clinic or hospital charts, with no central sharing [27]. Moreover, current information technology systems are not yet capable of properly managing large volumes of patient molecular data. The current research laboratories, systems, and EHRs also lack the appropriate storage and computational resources required to integrate, manage, process, and analyze genetic data [11,25].
3.4.3. Data Integration
Furthermore, the limitations of adopting precision medicine do not end at managing the data but also extend to how to integrate and make use of the large amount of data that has been collected. Compared to big data from other fields (such as social media or advertisement), medical data are more complicated due to the combination of heterogeneous information. This means that, unless the data are processed and converted into an understandable format, they are of no use in the HC setting [27]. Unfortunately, over the last decade, a growing gap has become evident between researchers’ ability to extract and generate omics data and the ability to integrate and interpret these data [11]. Due to the previously discussed limitations (i.e., data collection and management), the major focus was on generating and storing the maximum possible amount of data, resulting in this gap. On the other hand, it was discovered that the process of extracting correlations to provide actual and meaningful biological interactions is not straightforward due to the computational difficulty of analyzing hundreds of variables (volume), which are also heterogeneous in nature (variety) [11]. Overall, this leads to the inefficient use of individual molecular data in the provision of care [25].
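To make the integration problem concrete, the sketch below links two differently shaped data sources, flat clinical records and per-variant omics rows, on a shared patient identifier. It is a toy illustration under stated assumptions and not a method from the reviewed literature: every field name, identifier, and value (`patient_id`, `GENE_A`, `v1`, and so on) is hypothetical, and real EHR and omics formats are far more complex.

```python
from collections import defaultdict

# Hypothetical EHR-style rows: one row per patient.
ehr_records = [
    {"patient_id": "P1", "diagnosis": "type 2 diabetes"},
    {"patient_id": "P2", "diagnosis": "hypertension"},
]

# Hypothetical omics-style rows: zero or more rows per patient.
omics_records = [
    {"patient_id": "P1", "gene": "GENE_A", "variant": "v1"},
    {"patient_id": "P1", "gene": "GENE_B", "variant": "v2"},
    {"patient_id": "P3", "gene": "GENE_A", "variant": "v1"},
]


def integrate(ehr, omics):
    """Attach each patient's variants to their clinical record.

    Patients without omics rows get an empty variant list; omics rows with
    no matching clinical record are left out of the merged view.
    """
    variants = defaultdict(list)
    for row in omics:
        variants[row["patient_id"]].append((row["gene"], row["variant"]))

    merged = {}
    for row in ehr:
        pid = row["patient_id"]
        merged[pid] = {
            "diagnosis": row["diagnosis"],
            "variants": variants.get(pid, []),
        }
    return merged


merged = integrate(ehr_records, omics_records)
for pid, record in merged.items():
    print(pid, record)
```

Even in this toy case the two sources disagree on which patients exist, which hints at why integration across siloed repositories requires explicit rules for unmatched records.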
3.5. Other Issues
Lastly, a controversial topic has been raised recently regarding the association of precision medicine with economic inequality and the generalized availability of treatment. It was claimed that adopting precision medicine in healthcare would further widen the economic inequality in the health systems between high- and low-income countries. Low- and middle-income nations may not be fully able to integrate such practices due to precision medicine’s financial and training requirements, which are not easily achieved in these nations. Additionally, the expenses of transitioning to precision medicine are still under study and not completely known [11]. Furthermore, even within high-income countries, precision medicine services are not always provided or available, especially in rural areas. Meanwhile, geneticists, genetic counselors, and molecular pathologists are also not always readily available in these areas. Consequently, many patients are either hesitant or unable to travel to other healthcare facilities to receive such services [7,25].
The challenges are categorized in Table 1.
5. Discussion and Recommendations
Healthcare organizations face many challenges every day, as previously discussed in this paper. Moreover, other new challenges are being discovered that are hindering the efforts of these organizations and services in implementing precision medicine in their practice. However, precision medicine provides many opportunities and benefits that have not yet been wholly discovered and deserve to be explored.
For that reason, this section is mainly oriented towards discussing and suggesting a few solutions that might help target the challenges previously discussed and, therefore, help to improve the implementation of precision medicine. The challenges were categorized into five main categories: awareness and education, patient privacy/data collection, value recognition, data management and infrastructure, and other issues.
5.1. Solution for Awareness and Education
Regarding the issue of awareness and education, there are two areas to be targeted: healthcare/medical workforce awareness and patient awareness. For medical practitioners who are already in practice, the best way to raise awareness is through training and educational programs. The organization should encourage and provide this training to all of its workforce. Meanwhile, for medical students, the education curriculum must be updated in a way that not only familiarizes the students with the precision medicine concept but also integrates the concept of hybrid learning. Where able, medical and pharmaceutical colleges should provide major courses (apart from biological/medical courses) about how to integrate and interpret medical data (mainly omics data, which are heterogeneous in nature) using different data analysis algorithms and computational platforms.
Furthermore, healthcare professionals should also be taught some programming skills to be able to investigate better ways of analyzing omics data. Meanwhile, when healthcare organizations’ and services’ knowledge about precision medicine improves, this will help spread awareness to the patients and population. This can be achieved by carrying out campaigns (that will spread awareness about the topic), conducting clinician-to-patient talks during checkups, or by developing websites or pages within the hospital’s website that are made available to the public, which provide an overview of precision medicine and its benefits.
5.2. Solution for Patient Privacy
On the other hand, regarding the issue of patient privacy/data collection, it was stated that healthcare providers are facing many challenges in the process of data collection as well as problems in obtaining patients’ informed consent due to unclear protocols and misinforming patients about the purpose of informed consent. However, if, in an appropriate and clear way, the patients were to be fully informed about how their information would be used and the efforts that would be made to protect their genomics information, this should motivate the patients to provide their consent. Moreover, this will also provide the patients with knowledge regarding the options available for treatment. Thus, healthcare providers should be more likely to consider patients’ preferences in terms of treatment and prevention methods.
5.3. Solution for Value Recognition
Although stakeholders understand the benefit of personalized healthcare, they are still reluctant to use it due to a lack of evidence and supporting data on its overall success in healthcare institutions. Moreover, they are still reluctant to change policies and practices without evidential data that demonstrate economic and clinical value. The development of evidence to provide to stakeholders should be an important issue that scientists and companies that develop sophisticated technology for precision medicine pay attention to. Thus, awareness of this need should be raised before evidence of the gains from precision medicine is presented [10,20,29,38]. Although data on patient survival, disease progression, and risk reduction have begun to emerge, they are not reported well enough to be shared with other stakeholders. Thus, there appears to be a lack of evidence on the benefits of personalized healthcare. One way to start the process of sharing evidence is to develop policies that would protect patients’ confidential information while also developing a learning health system that provides a universally accepted and user-friendly way to systematically collect and share treatment or outcome data. Implementing a learning healthcare system would thus produce effective information management and systems to easily share clinical data for all patients and analyze the patterns and gains produced by precision medicine [20].
5.4. Solutions for Data Management and Infrastructure
Due to the rise of big data in precision medicine, there is an increased need to store the information and data generated by institutions that are conducting big projects. Thus, computational solutions, for instance, cloud-based computing, have emerged. Cloud computing is regarded as a storage model well suited to providing the elastic scale needed for DNA sequencing, whose rate of technological advancement is also increasing rapidly [35,43]. The security and privacy of personal medical and scientific data remain a challenge, regardless of the cloud solution used.
Solutions to deal with big data, especially when analyzing complex genomics information, include the use of graphics processing units (GPUs), which have the potential to improve computational power compared to conventional processors, even in cloud solutions. Compared with the currently used central processing units (CPUs), GPUs are highly parallel hardware providing massive computation resources. GPUs have recently been used for proteomic analysis and metagenomic sequence classification.