4.1.1. Apache Hadoop

Hadoop [79] is an open-source framework for the distributed storage and processing of very large datasets across clusters of commodity hardware. Its core architecture rests on three primary elements, as shown in Figure 7.

**Figure 7.** The structure of Apache Hadoop.

The core of Hadoop consists of a storage component, the Hadoop Distributed File System (HDFS); a processing component based on the MapReduce programming model; and a resource manager called YARN, "Yet Another Resource Negotiator" [80]. Hadoop splits data into large blocks and distributes them among the compute nodes that make up the cluster. It then ships the code to be executed to those nodes, so that parallel, i.e., simultaneous, processing of the data can take place [81]. In essence, data locality is exploited: each node processes the portion of the data to which it has local access.

Finally, it should be noted that while the basic structure of Hadoop consists of the elements already mentioned, Apache extensions are often used to enrich Hadoop's capabilities depending on the use case, the most important being Apache Spark, Apache Storm, Apache Flink, Apache Hive, Apache HBase, Apache Flume, Apache Sqoop, and Apache Pig. Hadoop could potentially be employed to develop medical analytics solutions. However, as previously stated, it is a batch-oriented big data platform and therefore cannot fully meet the real-time processing requirements of medical emergencies [11].
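
To make the MapReduce model concrete, the sketch below shows the canonical word-count job written against Hadoop's Java MapReduce API. The class name `WordCount` and the command-line input/output paths are illustrative assumptions, not taken from the cited works; the job structure itself (mapper, combiner, reducer, driver) follows the standard Hadoop programming pattern.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: each mapper receives one HDFS block of the input (data
  // locality) and emits a (word, 1) pair for every token it encounters.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: the framework groups all values by key, so each reducer
  // call receives one word together with all of its partial counts.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: configures the job and submits it to YARN for scheduling.
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

In a typical deployment the job would be packaged as a JAR and submitted with, e.g., `hadoop jar wordcount.jar WordCount /input /output` (paths hypothetical); YARN then allocates containers for the map and reduce tasks, and HDFS feeds each mapper a locally stored block of the input, which is how the data-locality property discussed above is realized in practice.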
