#### *2.2. Data Sources*

For the healthcare sector, relevant data are needed to build systems that have a positive impact on the health and well-being of individuals. In this section, we introduce three data sources and analyze how each can be leveraged through concrete examples.

• Electronic Health Records

Electronic health records are a source of an enormous amount of data covering the social, demographic, medical, and health aspects of a patient. However, without reliable decision support, the human brain can only process a limited amount of information. Developing real-time knowledge and support systems that are preventive, predictive, and diagnostic requires an infrastructure that is constantly updated. Computational models are required to assist medical professionals in data organization, pattern recognition, and result interpretation [25]. Table 2 shows some of the data that an electronic medical record may contain, along with their data types.


**Table 2.** Possible content of an electronic medical record.

• Social Networks

The rise of communication via social networks is one of the most important factors in the dramatic evolution of healthcare. According to a recent estimate, approximately one billion tweets have been exchanged, illustrating the depth of communication between organizations, patients, and providers. Social networks now offer researchers new ways to reach out to patients and include them in their research. One such project is TuAnalyze, a collaboration between TuDiabetes and Boston Children's Hospital that allows people with diabetes to track, assess, and share their findings while actively participating in diabetes research [26]. One of the most intriguing applications of data analytics is its ability to predict and monitor significant epidemics for the benefit of public health. Predictions of major health outcomes, such as an exacerbation of asthma attacks, can be improved by combining social network analysis with environmental data. Specifically, Google searches, Twitter activity, and air quality data can be used to estimate the number of daily emergency room admissions for asthma events [27]. According to [28], tweets discussing the situation in Nigeria increased at least three days before the Ebola outbreak was brought to public attention and seven days before the Centers for Disease Control issued an official alert. As a result, many researchers are now harnessing social media's potential to advance global awareness and improve health.

• Internet of Things

Millions of people use devices to monitor various aspects of their health behavior. These devices can monitor indicators such as heart rate, mobility, sleep quality, and blood sugar levels. Depending on the service offered by the device, the recorded data can be used to detect a danger and alert a physician in real time [29]. Advances in technology, particularly sensor technology, have led to a growing interest in wearable and implantable sensors and have made continuous and multimodal sensing possible. At the same time, advances in sensor miniaturization, noise reduction, and microelectronics have increased the flexibility and reliability of implantable sensors [30].
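
The real-time alerting described above can be illustrated with a minimal sketch. It assumes a stream of heart-rate readings and raises an alert only after several consecutive out-of-range values, so that a single noisy sample does not trigger one. The `Reading` type, the function name, and the thresholds are hypothetical choices made for illustration; they are not taken from [29] and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Reading:
    minute: int      # minutes since monitoring started
    heart_rate: int  # beats per minute


def find_alerts(readings: Iterable[Reading], low: int = 40, high: int = 120,
                sustained: int = 3) -> List[int]:
    """Return the minute at which each sustained out-of-range episode is confirmed.

    The default thresholds are illustrative assumptions only.
    """
    alerts = []
    streak = 0  # number of consecutive out-of-range readings seen so far
    for r in readings:
        if r.heart_rate < low or r.heart_rate > high:
            streak += 1
            if streak == sustained:  # alert once per episode
                alerts.append(r.minute)
        else:
            streak = 0
    return alerts


# Synthetic readings, one per minute.
rates = [72, 75, 130, 74, 125, 128, 131, 133, 80]
readings = [Reading(minute, hr) for minute, hr in enumerate(rates)]
alerts = find_alerts(readings)
print(alerts)  # [6]
```

The isolated spike at minute 2 is ignored, while the run starting at minute 4 is confirmed at minute 6. A deployed device would apply whatever detection rule its service defines.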

#### *2.3. Healthcare Big Data Analytics Classification*

Several types of big data analytics are used in the healthcare industry (Figure 4), including descriptive analytics, diagnostic analytics, and predictive and prescriptive analytics [18]. In this section, we discuss the specifics of each type of analysis and how it manifests itself in the healthcare field.

**Figure 4.** Classification of big data analytics in healthcare.

## (a) Descriptive Analytics

Descriptive analytics describes the existing situation and outlines past performance on the basis of historical data, using business intelligence and data mining. Various techniques are used to perform this level of analysis [31]. Descriptive analytics, which is often associated with unsupervised learning, summarizes, among other things, what happens in the management of health services and what effect a parameter has on the system. It is the simplest level of analytics to understand and use: a description of the data with no further analysis or exploration. Descriptive analytics defines, characterizes, aggregates, and classifies data in order to provide health practitioners with useful information for understanding and analyzing decisions, performance, and consequences. Examples include discharge rates, the average length of stay, and other relevant hospital metrics.
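
As a minimal sketch of this kind of aggregation, the following example computes the average length of stay and a discharge rate per ward. The records are synthetic, the ward names are hypothetical, and "discharge rate" is assumed here to mean the share of admissions that ended in a discharge during the reporting period; hospitals may define it differently.

```python
from collections import defaultdict
from statistics import mean

# Synthetic admissions: (ward, length_of_stay_days, discharged_in_period)
admissions = [
    ("cardiology", 4, True),
    ("cardiology", 6, True),
    ("cardiology", 8, False),
    ("oncology", 10, True),
    ("oncology", 14, False),
]


def summarize(rows):
    """Aggregate admissions into per-ward descriptive metrics."""
    by_ward = defaultdict(list)
    for ward, los, discharged in rows:
        by_ward[ward].append((los, discharged))

    summary = {}
    for ward, items in by_ward.items():
        summary[ward] = {
            "admissions": len(items),
            "avg_length_of_stay": mean([los for los, _ in items]),
            "discharge_rate": sum(1 for _, d in items if d) / len(items),
        }
    return summary


summary = summarize(admissions)
for ward, metrics in summary.items():
    print(ward, metrics)
```

The output only summarizes what happened; it does not explain why one ward has longer stays, which is the role of diagnostic analytics.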

## (b) Diagnostic Analytics

Diagnostic analytics seeks to explain why certain events occurred and what factors contributed to them. For example, diagnostic analytics attempts to understand the reasons behind the frequent readmissions of some patients [32] using various methods such as clustering and decision trees. To find the source of an issue and help people understand its nature and impact, an extensive examination and guided analysis of the existing data utilizing tools such as imaging techniques are required [33]. This may include the ability to understand the effects of system inputs and processes on performance. For instance, there are a number of significant factors, such as patient, provider, or organization-related issues, that may contribute to longer wait times for the provision of some healthcare services [34,35].
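
To make the decision-tree idea concrete, the sketch below computes only the first split of a tree: it ranks candidate factors by the weighted Gini impurity of the readmission label after splitting on each factor. The patient records and the factor names (`lives_alone`, `followup_scheduled`) are hypothetical and synthetic; the method in [32] may differ.

```python
from collections import Counter

# Synthetic patient records; feature names are illustrative assumptions.
patients = [
    {"lives_alone": True,  "followup_scheduled": False, "readmitted": True},
    {"lives_alone": True,  "followup_scheduled": False, "readmitted": True},
    {"lives_alone": False, "followup_scheduled": False, "readmitted": True},
    {"lives_alone": True,  "followup_scheduled": True,  "readmitted": False},
    {"lives_alone": False, "followup_scheduled": True,  "readmitted": False},
    {"lives_alone": False, "followup_scheduled": True,  "readmitted": False},
    {"lives_alone": True,  "followup_scheduled": True,  "readmitted": False},
    {"lives_alone": False, "followup_scheduled": False, "readmitted": False},
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def weighted_gini(rows, feature, target="readmitted"):
    """Impurity of the target after splitting rows on a boolean feature."""
    n = len(rows)
    total = 0.0
    for value in (True, False):
        subset = [r[target] for r in rows if r[feature] == value]
        total += len(subset) / n * gini(subset)
    return total


def best_split(rows, features, target="readmitted"):
    """Return the feature whose split leaves the purest groups."""
    return min(features, key=lambda f: weighted_gini(rows, f, target))


features = ["lives_alone", "followup_scheduled"]
for f in features:
    print(f, weighted_gini(patients, f))
print("most informative factor:", best_split(patients, features))
```

On this synthetic data, splitting on `followup_scheduled` leaves purer groups than splitting on `lives_alone`, which would direct an analyst toward follow-up scheduling as a candidate explanation. An association of this kind is a starting point for investigation, not proof of cause.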

## (c) Predictive Analytics

Predictive analytics is the ability to predict future events, identify trends, and flag uncertain outcomes; for example, it may be used to predict whether a patient will develop complications. Predictive models are often constructed using machine learning techniques. Predictive analytics uses massive data sets to improve customer experience and yields better results than conventional business strategies [7]. It is applied to large volumes of data, including unstructured data, to produce forecasts of future developments. From an information science perspective, predicting future developments based on current data sets is a difficult issue. Business intelligence programs of this kind help to process data streams on a larger scale, including social media content, shopping experiences, users' daily activities, and surveys [36]. For example, a pharmacist may need to know how much of a medicinal preparation to keep in stock in anticipation of an epidemic outbreak. A doctor may also need to predict certain clinical events, such as the length of a patient's stay, the possibility that a patient will choose to undergo surgery, or the possibility that a patient will have complications or even die [4].
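
As an illustration of predicting whether a patient will develop complications, the following sketch fits a logistic regression by full-batch gradient descent using only the standard library. Logistic regression is one possible choice among many machine learning techniques; the text does not prescribe it. The two features (age and number of comorbidities), the scaling constants, the training data, and the hyperparameters are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def scale(age, comorbidities):
    """Bring both features to a comparable range (assumed constants)."""
    return (age / 100.0, comorbidities / 5.0)


def train(rows, labels, lr=1.0, epochs=20000):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias) - y
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        weights[0] -= lr * grad_w[0] / n
        weights[1] -= lr * grad_w[1] / n
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(model, age, comorbidities):
    """Estimated probability of complications for one patient."""
    weights, bias = model
    x = scale(age, comorbidities)
    return sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)


# Synthetic history: (age, comorbidities) and whether complications occurred.
history = [(45, 0), (50, 1), (38, 0), (60, 1), (72, 3), (68, 2), (80, 4), (75, 3)]
outcomes = [0, 0, 0, 0, 1, 1, 1, 1]

model = train([scale(a, c) for a, c in history], outcomes)
low_risk = predict_proba(model, 40, 0)
high_risk = predict_proba(model, 74, 3)
print(f"age 40, no comorbidities: {low_risk:.2f}")
print(f"age 74, three comorbidities: {high_risk:.2f}")
```

A real model would be trained on far more records, validated on held-out patients, and checked for calibration before informing any clinical decision.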
