#### *1.5. Current Survey*

Our study provides a comprehensive and in-depth examination of the use of big data analytics in medical institutions. Unlike other surveys on the subject, we not only summarize the available tools and applications but also examine the key actions and research efforts being undertaken in this field. Additionally, we address the technical and organizational challenges that arise when implementing big data analytics in digital health services. Finally, we offer a simple strategy, based on best practices in the field of healthcare, that can be adopted by organizations that want to integrate big data analytics. The goal of this study is to provide healthcare organizations and institutions with a clear understanding of the potential use, effective targeting, and expected impact of big data analytics technology, thus helping them make informed decisions about its implementation.

#### **2. Big Data Concepts in the Health Field**

Big data are generally viewed as a set of data that are too large, or too heterogeneous and complex in structure, to be handled by traditional data processing software. Big data challenges include collecting, storing, analyzing, transferring, sharing, and visualizing the information they contain. Scientists, entrepreneurs, and medical professionals are often required to use data from a range of sources, including big data from the international literature, the Internet, medical records, patient registries, and even 'smart' devices.

#### *2.1. Features of Big Data in Healthcare*

• Volume

In digital health, the increase in the amount of data is a result of both the digitization of already available data and the creation of new data formats. The available data include personal medical records, radiology and fluoroscopy images, clinical trials, surveys, demographic data, human genomes, genetic sequences, etc. The exponential rise in data in the healthcare industry is due to the integration of new types of big data, including three-dimensional images, biological data, and data from sensor technologies. As an example of handling large volumes of healthcare data, the authors of [19] used natural language processing (NLP) techniques to extract meaningful information about complementary and integrative health (CIH) from clinical notes in electronic health records (EHRs). By automating the extraction of CIH information, this research addresses the challenge of dealing with the volume of unstructured data in EHRs.
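To make the idea of automated extraction from free-text notes concrete, the following is a minimal sketch of lexicon-based term matching. It is not the method used in [19]; the term list `CIH_TERMS`, the function names, and the sample notes are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical lexicon: illustrative CIH terms, not taken from [19].
CIH_TERMS = ["acupuncture", "yoga", "massage", "chiropractic", "meditation"]

# One case-insensitive pattern with word boundaries, so that a term
# is matched only as a whole word.
CIH_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, CIH_TERMS)) + r")\b",
    re.IGNORECASE,
)


def extract_cih_mentions(note: str) -> list[str]:
    """Return the CIH terms found in one note, lowercased, in order of appearance."""
    return [match.group(1).lower() for match in CIH_PATTERN.finditer(note)]


def count_cih_mentions(notes: list[str]) -> Counter:
    """Aggregate term frequencies over a collection of notes."""
    counts: Counter = Counter()
    for note in notes:
        counts.update(extract_cih_mentions(note))
    return counts


# Made-up sample notes for demonstration only.
notes = [
    "Patient reports weekly Yoga and acupuncture for chronic back pain.",
    "No complementary therapies noted. Continue current medication.",
    "Recommended massage therapy; patient already practices yoga.",
]
cih_counts = count_cih_mentions(notes)
```

A simple lexicon of this kind misses negation, misspellings, and synonyms, which is one reason why studies in this area typically rely on fuller NLP pipelines.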

• Variety

Traditionally, the vast majority of data available in healthcare have been unstructured, such as medical records and handwritten notes from medical and nursing staff describing symptoms, indications, behavior, medical images, etc. In recent years, however, there has been an upsurge in structured data, such as electronic drug prescribing information, quantitative data from instrument and test measurements, and general data that are recorded in a single structure so that they can serve as a basis for data analysis. In addition to these routinely recorded data, new sources have emerged, such as wellness devices that record patients' pulse or sleep time, social networks, and genomic research. Using different data sources makes it possible to obtain faster and more reliable results. In the study [20], the authors demonstrate how monitoring social media conversations related to vaccines can address the variety challenge of big data by providing a way to organize and make sense of a large amount of unstructured social media data.

• Velocity

In healthcare, most data traditionally come from static sources, such as X-rays, hospital documents, patient records, health logs, etc. In some applications, however, it is necessary to process and use the data in real-time, for example, to monitor blood pressure and heart function during surgery [21]. There are also cases where data processing is necessary at a relatively slower pace, such as the daily determination of glucose levels in diabetics [22]. Another example is information about a known disease, which spreads at a much slower rate than a newly emerging epidemic. In the latter case, the data arrive at a high rate and carry "new" information, which must be processed quickly so that the situation can be addressed in a timely manner. To analyze healthcare data in real-time or near real-time, researchers have proposed the use of big data analytics to develop predictive models that can detect and respond to health emergencies. For example, machine learning algorithms can be used to predict the outbreak of an infectious disease and monitor its spread in real-time [23].
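As an illustration of processing high-velocity data, the following minimal sketch flags a stream of readings whose rolling mean leaves a permitted range. The class `VitalSignMonitor`, the window size, and the thresholds are hypothetical; the values are illustrative only and are not clinical guidance or a method from the cited studies.

```python
from collections import deque
from statistics import mean


class VitalSignMonitor:
    """Raise an alert when the rolling mean of recent readings leaves [low, high]."""

    def __init__(self, window_size: int, low: float, high: float):
        # A bounded deque discards the oldest reading automatically.
        self.window = deque(maxlen=window_size)
        self.low = low
        self.high = high

    def update(self, reading: float) -> bool:
        """Add one reading; return True if an alert should be raised."""
        self.window.append(reading)
        # Wait until the window is full to avoid alerts on too little data.
        if len(self.window) < self.window.maxlen:
            return False
        rolling_mean = mean(self.window)
        return rolling_mean < self.low or rolling_mean > self.high


# Made-up stream of systolic blood pressure readings (mmHg).
monitor = VitalSignMonitor(window_size=3, low=90.0, high=140.0)
stream = [120, 125, 150, 160, 170, 130]
alerts = [monitor.update(reading) for reading in stream]
```

Each reading is processed in constant memory as it arrives, which is the property that distinguishes streaming analysis from batch analysis of static records. Averaging over a window trades a short detection delay for robustness against single noisy readings.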

• Veracity

Data reliability in healthcare shares several concerns with data reliability in financial transactions: the accuracy of patient data, the correct completion of hospital or clinic fields, patient insurance, linkage to bank accounts, the recording of payment amounts, etc. [3]. In addition, the health sector holds data that are not found in other sectors, such as information about a diagnosis, treatment, administration of medication, care, and any other information deemed necessary to be recorded. The validity of these data is just as important as that of the data mentioned above. Ensuring the accuracy of big data is critical in healthcare to prevent medical errors, incorrect diagnoses, and inappropriate treatment decisions. To address this issue, various techniques such as data cleaning, data validation, data integration, and normalization are used to ensure that the data are reliable and consistent.
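The cleaning, validation, and normalization steps mentioned above can be sketched as follows. This is a minimal example under stated assumptions: the record fields (`patient_id`, `age`, `weight`, `weight_unit`), the plausibility range for age, and the function names are hypothetical and do not come from any particular system or from the cited work.

```python
LB_TO_KG = 0.45359237  # exact definition of the international pound in kilograms


def clean_record(raw: dict) -> dict:
    """Cleaning: strip surrounding whitespace from every text field."""
    return {key: value.strip() if isinstance(value, str) else value
            for key, value in raw.items()}


def normalize_weight_kg(value: float, unit: str) -> float:
    """Normalization: convert a weight to kilograms, rounded to two decimals."""
    unit = unit.lower()
    if unit == "kg":
        return round(float(value), 2)
    if unit == "lb":
        return round(float(value) * LB_TO_KG, 2)
    raise ValueError(f"unsupported weight unit: {unit!r}")


def validate_record(record: dict) -> list[str]:
    """Validation: return a list of problems; an empty list means the record passed."""
    errors = []
    if not record.get("patient_id"):
        errors.append("missing patient_id")
    age = record.get("age")
    # Assumed plausibility range for age, for illustration only.
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age out of range")
    return errors


def process(raw: dict) -> tuple[dict, list[str]]:
    """Run cleaning, normalization, and validation on one raw record."""
    record = clean_record(raw)
    record["weight_kg"] = normalize_weight_kg(record["weight"], record["weight_unit"])
    return record, validate_record(record)


# Made-up records for demonstration only.
raw_records = [
    {"patient_id": " P001 ", "age": 54, "weight": 176.0, "weight_unit": "lb"},
    {"patient_id": "", "age": 430, "weight": 70.0, "weight_unit": "kg"},
]
results = [process(raw) for raw in raw_records]
```

Returning the list of problems rather than raising on the first one allows invalid records to be routed to manual review instead of being silently dropped.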

• Value

The cost of healthcare is constantly rising and has become unsustainable. The use and exploitation of big data in healthcare, however, offers multiple benefits that can help contain these costs. For example, in the study [24], the authors developed machine learning algorithms to predict hospital readmissions and reduce healthcare costs. The algorithms were able to accurately predict readmissions, which allowed healthcare providers to intervene early with targeted measures to reduce the risk of readmission.

Figure 3 illustrates the 5 Vs of big data in the healthcare sector.

**Figure 3.** Big data characteristics in the healthcare sector.
