1. Introduction
Cardiovascular disease (CVD) is one of the leading causes of morbidity and mortality worldwide. According to the World Health Organization, cardiovascular diseases are the major cause of death, and as many as 56% of all deaths worldwide are caused by CVD [1]. CVD is also categorized as a noncommunicable disease (NCD), and in 2019 it accounted for 33.3 million deaths worldwide [2]. CVD is the second major contributor to the worldwide disease burden, and, in the United States, sudden cardiac death (SCD) is projected to rise to 8.5 million by 2030 [3]. CVD encompasses a range of conditions that affect the heart and blood vessels, including coronary artery disease, arrhythmias, and heart failure. The increasing risk of CVD is driven by factors such as a sedentary lifestyle, poor diet, alcohol consumption, smoking, and an aging population [4]. The worst consequence of CVD is sudden cardiac death caused by arrhythmia, a condition that impairs the ability of the heart to pump blood [5]. Therefore, in addition to improved lifestyle habits, a monitoring system for CVD is needed to reduce its mortality rate [6]. The detection and continuous monitoring of the heart rhythm are crucial for preventing severe outcomes and improving patient care. However, traditional monitoring systems often lack mobility, scalability, and real-time capabilities, highlighting the need for innovative solutions powered by Internet of Things (IoT) technologies and cloud computing.
In this IoT-powered era, combined with advances in machine learning (ML) and cloud technology, there is an opportunity to help CVD patients monitor their heart rhythm. The value of a health monitoring system lies in a reliable system design, one that can be depended on in terms of data-processing speed and service availability. A good system design makes it possible to collect real-time data from sensors, process the data in the cloud, and deliver the results to users.
Previous research on heart rate monitoring systems has proposed a system design that leverages protocol buffers (Protobuf) to serialize data [7] for interservice communication in a Kubernetes cluster. This approach not only ensures high-speed data transfer but also reduces communication costs compared to text-based serialization. The combination of Kubernetes and Protobuf provides improved scalability and stable CPU usage.
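As a rough illustration of why binary serialization reduces payload size, the sketch below packs a hypothetical heart-rate reading with Python's struct module as a stand-in for a compiled Protobuf message. The field names and layout are assumptions for illustration; real Protobuf uses a schema-compiled format with field tags and varint encoding, so exact sizes differ.

```python
import json
import struct

# A hypothetical heart-rate reading: (patient_id, timestamp, bpm).
reading = {"patient_id": 42, "timestamp": 1700000000, "bpm": 72.5}

# Text-based serialization (JSON): human-readable but verbose.
json_bytes = json.dumps(reading).encode("utf-8")

# Compact binary packing as a stand-in for a Protobuf wire message:
# 4-byte unsigned int, 8-byte unsigned long long, 4-byte float = 16 bytes.
binary_bytes = struct.pack(
    "<IQf", reading["patient_id"], reading["timestamp"], reading["bpm"]
)

print(len(json_bytes), len(binary_bytes))  # the binary payload is several times smaller
```

The size gap is what drives the reduced communication cost the cited work reports; the binary form also avoids repeated key strings on every message.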
In addition, many systems emphasize the role of advanced sensor gateways in preprocessing data at the edge before forwarding them to the centralized cloud infrastructure. Message Queuing Telemetry Transport (MQTT) and the Advanced Message Queuing Protocol (AMQP) [8] are the protocols most commonly used at sensor gateways because of their reliability and speed in transferring data from the sensor to the cloud environment. Even though there is no major difference between AMQP and MQTT [9], each protocol has its own advantages: AMQP has the lowest jitter, indicating that it is more stable when handling large numbers of packets, while MQTT has the lowest CPU usage.
Previous research on heart rate monitoring systems has also made heavy use of edge computing to process data closer to the source. By applying edge computing, these systems can take advantage of techniques such as caching at the edge to reduce latency [10] and minimize the volume of data sent to the cloud. This approach enables near-real-time data processing while saving bandwidth and computing resources.
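To illustrate the edge-caching technique cited above (it is not part of this study's architecture), a minimal time-to-live cache could look like the following sketch. The key names and TTL values are hypothetical.

```python
import time

class EdgeCache:
    """Minimal TTL cache, sketching how an edge node could answer
    repeated queries locally instead of forwarding them to the cloud."""

    def __init__(self, ttl_seconds=5.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires = entry
        if time.monotonic() > expires:
            del self._store[key]  # stale entry: evict and fall through to the cloud
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

cache = EdgeCache(ttl_seconds=60)
cache.put("patient-42/latest-bpm", 72)
print(cache.get("patient-42/latest-bpm"))  # served locally, no cloud round trip
```

Every cache hit is a request that never consumes uplink bandwidth or cloud compute, which is the saving the cited work exploits.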
Despite the advantages of edge computing, this study adopts a different approach because of the limitations of small-scale monitoring devices. Implementing edge computing requires additional hardware and resources, which may not be feasible for compact, low-power heart rate monitoring devices. To address this, this study shifts all processing to the cloud, utilizing its scalability and computing power to handle real-time data streams. By bypassing the edge and centralizing processing in the cloud, the system achieves greater simplicity and adaptability, making it more practical for small-scale devices. This research focuses on finding the most stable architecture, one that can receive many requests at a time while still providing responses to users in real time. To determine the performance of an architecture, load tests can be run against it; however, during testing, latency can vary because local network traffic may be shared with other users. This research does not further discuss methods to secure patient data, as it focuses on a stable, real-time architecture. Moreover, the proposed method favors a cloud-based architecture over existing edge computing solutions because it considers the mobility of the patients who will use this health monitoring system: edge computing solutions require patients to carry additional devices that need a continuous supply of energy to work. As such, this research focuses on making the best use of the more flexible cloud by providing a reliable and stable architecture for sending and receiving data. However, the cloud approach requires connection stability on the client side, and unstable internet conditions can also affect the speed of data processing.
In addition, some research has explored improving system performance by using a separate database for each service in a microservice architecture [11]. This approach improved the execution time and modularity of the system. Other research implements an event-driven architecture using messaging intermediaries, such as Apache Kafka, to facilitate efficient data flow and asynchronous communication between services. Although these methodologies offer significant advantages, they are often constrained by the limited computing power and scalability of edge devices.
Another study used more than one sensor gateway to improve latency and throughput by spreading the load across the gateways rather than relying on a single one [12]. Each sensor gateway forwards the data to the message broker; that work could be extended by changing the sensor gateway protocol from HTTP to a lightweight protocol such as MQTT or AMQP. A comparison of latency and throughput has shown that HTTP is not suitable for real-time data processing [13].
Kubernetes is an open-source container orchestration platform. Kubernetes was used in this study because its performance is better than that of other orchestrators, such as K3s and KubeEdge; this is supported by research showing that Kubernetes has a lower memory footprint than other orchestrators [14]. These advantages make Kubernetes suitable for this study and for scenarios that require high scalability.
Additionally, Kubernetes offers flexibility in container management through its built-in features. With a modular architecture and a wide ecosystem, Kubernetes allows integration with various tools [15].
However, Kubernetes poses a new challenge in orchestration complexity, since it requires knowledge of load-balancing configuration and service discovery. Resource allocation also needs to be considered to prevent waste due to underutilization or excessive overhead. In addition, performance overhead can occur in large-scale deployments, where networking, the control plane, and the use of persistent storage can affect response times.
From comparisons of the MQTT, AMQP, and WebSocket protocols, WebSocket performs poorly relative to the other two; studies have shown that WebSocket suffers from high latency for large payload sizes [16]. In the same study, MQTT and AMQP exhibited decreased performance when using a high Quality of Service (QoS) level. Building on these studies, this research aims to find the best system design using Kubernetes in a local environment. Kubernetes was used to provide scalability and ease of management for containerized applications, unlike traditional infrastructure, which requires manual configuration and scaling. This study compares several microservice architectures deployed in a local Kubernetes cluster. The first architecture uses MQTT as the sensor gateway, with the goal of ingesting incoming sensor data and maximizing data flow; data from MQTT are forwarded to a message broker, Apache Kafka, and then routed to the related services. The second architecture leverages the lightweight nature and transfer speed of MQTT, with data forwarded directly to the related services.
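The difference between the two data paths can be sketched in-process. In the sketch below, queue.Queue and a worker thread stand in for the Kafka topic and its consumer; this is only an analogy for the real broker-based deployment, and the payloads and service function are hypothetical.

```python
import queue
import threading

def prediction_service(payload):
    # Placeholder for the ECG prediction (ML inference) service.
    return {"ecg": payload, "label": "normal"}

# Architecture 1: MQTT gateway -> buffered topic (Kafka stand-in) -> service.
topic = queue.Queue()
results = []

def consumer_loop():
    while True:
        payload = topic.get()
        if payload is None:   # sentinel: shut the consumer down
            break
        results.append(prediction_service(payload))

consumer = threading.Thread(target=consumer_loop)
consumer.start()
for i in range(100):          # a burst of sensor messages
    topic.put({"sample": i})  # the gateway never waits on the service
topic.put(None)
consumer.join()

# Architecture 2: MQTT gateway -> service directly, no buffer in between.
direct_results = [prediction_service({"sample": i}) for i in range(100)]

print(len(results), len(direct_results))
```

The buffered path decouples ingestion rate from processing rate, which is the property the throughput tests below probe; the direct path has one less hop but couples the gateway to the service's speed.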
The metrics compared were latency, throughput, error rate, and RAM usage. From the architectures compared, this research identifies the advantages and disadvantages of each in the context of deploying machine learning in a Kubernetes environment for IoT systems.
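These metrics can all be derived from per-request logs. The sketch below shows one way to compute them; the records and run duration are hypothetical values for illustration, not the study's data.

```python
import statistics

# Hypothetical per-request records: (latency in ms, succeeded?) over a 5 s run.
records = [(0.1, True), (0.2, True), (60.0, True), (0.1, False), (0.3, True)]
duration_s = 5.0

latencies = [lat for lat, _ in records]
avg_latency = statistics.mean(latencies)
stdev_latency = statistics.pstdev(latencies)       # stability indicator
throughput = len(records) / duration_s             # packets/s
error_rate = sum(1 for _, ok in records if not ok) / len(records)

print(f"avg latency {avg_latency:.2f} ms, stdev {stdev_latency:.2f} ms, "
      f"throughput {throughput:.1f} pkt/s, error rate {error_rate:.0%}")
```

Note how a single 60 ms spike dominates the standard deviation while barely moving the minimum; this is why the evaluation below reports averages, maxima, and standard deviations side by side.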
3. Results
This section explains the process of obtaining the best prediction model and compares the traditional architecture (MQTT) and the proposed architecture (MQTT + Kafka) based on the evaluation metrics discussed above.
3.2. Throughput Evaluation
The throughput value was obtained from the number of requests the server can handle per second. Each result in this test is the average of 30 trials. In the test with 500 users, each user sent 20 ECG predictions; the results show that with the traditional method, the highest throughput achieved was 486 packets/s, meaning that, under its best conditions, the traditional method can receive 486 packets per second. The estimated time to complete the 10,000 requests sent to the server was about 21 s.
With the proposed method, the highest throughput was 2210 packets/s, much higher than the traditional method, as seen in Figure 11. This test shows that the proposed method can handle a large number of requests better than the traditional method. The higher throughput also means the proposed method completes the requests faster, in about 5 s. This shows that the proposed method outperforms the traditional method.
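The completion-time estimates above follow directly from the measured peak throughputs:

```python
# First scenario: 500 users x 20 predictions = 10,000 requests in total.
total_requests = 500 * 20

traditional_s = total_requests / 486   # peak throughput, traditional (MQTT only)
proposed_s = total_requests / 2210     # peak throughput, proposed (MQTT + Kafka)

print(round(traditional_s), round(proposed_s))  # ~21 s vs. ~5 s
```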
In tests simulating 500 users, each sending 35 ECG predictions, the MQTT architecture again peaked at a throughput of 486 packets/s; at this rate, the traditional method took approximately 36 s to complete the 17,500 packets. These results demonstrate the performance limitations of the traditional method under a higher number of requests.
As for the proposed method, its highest throughput was 2085 packets/s, roughly four times that of the traditional method, as seen in Figure 12. The proposed method took approximately 8 s to complete the total requests sent. Thus, the proposed method is much more efficient and can handle a larger volume of requests in a shorter time.
In a test simulating 500 users, each sending 50 ECG predictions, the highest throughput achieved with the traditional method was 484 packets/s, as seen in Figure 13. This shows no significant improvement in throughput over the previous tests, indicating that the traditional method could not handle the spike in demand efficiently and has limited scalability when handling large numbers of requests.
For the proposed method, the average throughput achieved was 1587 packets/s. This is lower than in the first test but still outperforms the traditional method, so the proposed method could still handle spikes in requests with better efficiency despite the slight decrease in performance.
Based on the results in Figure 11, Figure 12 and Figure 13, there is a clear trend of the proposed method outperforming the traditional method in the amount of data handled simultaneously. The proposed method reached a peak throughput of 2210 packets/s in the first scenario before decreasing in the second and third scenarios to its lowest point of 1587 packets/s. At that point, the proposed method showed decreased performance due to increased resource requirements.
In the traditional method, throughput stagnated at 484–486 packets/s across the three scenarios, indicating a bottleneck in adapting to the number of incoming requests. In contrast, the proposed method could keep up with the number of requests because Apache Kafka can partition the stream and distribute many requests across consumers.
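The partitioning idea can be sketched as follows. Kafka's default partitioner hashes the message key (with murmur2) modulo the partition count; here zlib.crc32 is used as a stdlib stand-in, and the key names and partition count are hypothetical.

```python
import zlib

NUM_PARTITIONS = 3

def assign_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministic key -> partition mapping, sketching how Kafka spreads
    messages across partitions so several consumers can work in parallel."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Messages with the same key always land on the same partition, preserving
# per-patient ordering while balancing load across partitions overall.
p1 = assign_partition("patient-42")
p2 = assign_partition("patient-42")
assert p1 == p2

buckets = {p: 0 for p in range(NUM_PARTITIONS)}
for i in range(1000):
    buckets[assign_partition(f"patient-{i}")] += 1
print(buckets)  # the 1000 keys spread across the 3 partitions
```

Each partition can be consumed by a separate service replica, which is what lets the Kafka-based path scale past the single-stream bottleneck seen in the traditional method.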
Figure 13 provides insight into the stability of the two methods. The proposed method still maintains its advantage over the traditional method, but there is a performance degradation that may be caused by network congestion or additional processing time. The absence of an error plot in Figure 13 limits the ability to analyze the variation and fluctuation of throughput. As an additional note, the white dots in the box plots indicate outliers that fall outside the distribution.
3.3. Response Evaluation
Response time measures the time it takes for data to reach the destination and return; that is, the time to send the data plus the time for the server to process the data and send the response back to the user. In this test, 500 users each sent 20 prediction requests to the server. For the proposed method, the best response time was 0 ms and the highest was 60 ms, with an average of 0.12 ms, while for the traditional method, the best response time was 0 ms and the highest was 4 ms, with an average of 0.09 ms. From these data, the traditional method performs better than the proposed method on average. However, the difference in standard deviation between the two methods is significant: the proposed method has the smaller value, showing that it has a more stable response time.
From the test results, it can be observed that the proposed method has an advantage in response time stability over the traditional method, as seen from the difference in standard deviation between the two. This difference can be attributed to optimized processing of incoming data on the server and an architecture that can handle many data streams efficiently.
When testing with 500 users each sending 35 prediction requests, the traditional method achieved a best response time of 0 ms and a worst of 9 ms. This shows that even under this heavy load, the traditional method can still handle the incoming requests very well, with a worst-case time of only 9 ms and an average response time of 0.05 ms. Despite the increase over the previous test, the traditional method could still deliver fairly good performance for real-time applications.
For the proposed method, the best response time was 0 ms and the worst was 234 ms; this worst case was much longer than that of the traditional method because of a momentary spike in the proposed method's response time. However, this does not mean the proposed method consistently responded slowly: its average response time was 0.28 ms, still well within a reasonable value for real-time applications.
In the response time test with 500 users each sending 50 prediction requests, the traditional method's fastest response time was 0 ms and its highest was 11 ms, with an average of 0.07 ms, indicating that the traditional method could still return responses to users quickly. With only a slight difference between the lowest and highest times, the traditional method can serve applications that depend on data transmission speed. These results for the traditional method show that the MQTT broker offers fast and efficient communication.
On the other hand, for the proposed method, the shortest response time was 0 ms, while the highest was 261 ms, with an average of 0.34 ms, indicating that the proposed method performs excellently under ideal conditions but can respond slowly at certain moments. Factors such as the number of messages being processed and the network load can trigger these spikes. From both results, the architecture still performs very well judging from the shortest response times, but the proposed method can be slow at certain times, with a peak response time of 261 ms. In the last two tests, the proposed method shows much higher maximum response times than the traditional method; these spikes can occur because moving data from MQTT to Kafka takes time. There is also a notable difference in standard deviation between the two methods, discussed below.
From the tests carried out, a trade-off between the methods was found. The traditional method provided faster results than the proposed method in terms of maximum response time, while the proposed method had a higher average value and spiked at some moments. However, the standard deviation data reveal a deeper difference between these results: the traditional method had the larger standard deviation, such as in the third scenario, where the standard deviation of the traditional method was 5.71 ms against 0.28 ms for the proposed method. This indicates that although the traditional method had a faster maximum value, its response times were not very stable, whereas the proposed method, although its average value was somewhat higher, provided very stable response times, as can be seen in detail in Table 2. The standard deviation results also show that the spikes observed with the proposed method are rare. Therefore, the proposed method is more suitable for environments requiring stability, while the traditional method is more suitable where raw speed is needed.
3.7. Comparison with Existing Work
To explore the results more deeply, the authors compared them with several existing studies on real-time monitoring using Kafka, matching our evaluation metrics against those available in the existing work. The detailed results can be seen in Table 7.
The first study uses Apache Kafka to implement a wireless, real-time image streaming system for traffic monitoring. It attempts to improve image streaming performance through two main elements: a Software-Defined Wireless Mesh Network and Apache Kafka for data distribution. The evaluation data show that using Apache Kafka alone yields the lowest latency, at 1909 ms.
The next study built an event-driven cloud architecture using MQTT and Apache Kafka to determine an efficient and cost-effective design. It uses AWS as the cloud service and spreads deployment across three availability zones to increase throughput. In the results reported in the paper, the lowest latency was 250 ms with a throughput of 8000 packets/s. These values are relatively high; however, the paper does not explain the data sent or the evaluation scenario.
The third paper discusses monitoring the condition of building structures. It aims to build an architecture for real-time, low-latency monitoring based on streaming data, using Apache Kafka, Apache Flink, and WebSocket. The architecture consists of three subsystems: sensors that collect the data, processing and analysis with Apache Flink, and storage and display via WebSocket, with Apache Kafka serving as the bridge between subsystems. This study achieved a lowest latency of 36.69 ms.
The last study discusses cloud manufacturing, which emphasizes integrating distributed resources using the cloud and IoT in microservices, addressing challenges in scalability, performance, resource orchestration, and security. It uses Apache Kafka for fast and stable data ingestion, Apache Spark Streaming for fast data processing, and RabbitMQ to maintain communication between services. This study achieved a lowest latency of 5 ms and a highest throughput of 700 packets/s.
Compared with these studies, the proposed method using MQTT and Kafka outperforms the existing methods, especially in latency: the proposed method achieves 0.9 ms, a minimal value compared to the existing work. Although its throughput is exceeded by the study reporting 8000 packets/s, that work does not make clear how factors such as the number of concurrent user requests were evaluated, and the use of multiple cloud availability zones can also affect the results obtained.
5. Conclusions
In this research, the comparison between the traditional inter-cluster communication method and the proposed CVD monitoring method yields several important points, starting from the test results of the various scenarios.
In the response time results, both methods achieved a lowest value of 0 ms, which indicates that both can perform well in an adequate environment. As the number of requests increases, the proposed method experiences an increase in response time, with a maximum of 261 ms in the third scenario; this raises the proposed method's average response time above that of the traditional method. However, the standard deviation data show that the traditional method has a much less stable response time: the proposed method's standard deviation stays around 0.28 ms, while the traditional method's is 4.14 ms. The same holds for the latency results, where the proposed method has a higher average latency but, comparing standard deviations, is 93% more stable than the traditional method (a standard deviation of 0.51 ms for the proposed method versus 7.6 ms for the traditional method). This can occur because the proposed method has an additional layer for data processing, so there may be time spikes when data processing occurs.
Regarding the success of both methods in processing the data provided by users, the proposed method performs better than the traditional method: in the first two scenarios, the proposed method has a smaller error value than the traditional method, as seen in Table 6. This makes the proposed method not only stable in terms of data processing time but also more accurate in data processing than the traditional method. The satisfactory result of the proposed method lies in Apache Kafka's ability to handle larger data volumes efficiently, so the combination of MQTT and Kafka provides a more stable architecture for real-time monitoring systems. However, this ability requires an environment with considerable resources: RAM usage increases with the number of users making requests, and at its highest point, the proposed method's RAM usage is 25% greater than the traditional method's, making this method less suitable for resource-limited environments. Therefore, future work can focus on optimizing resource usage, for example through data compression techniques. Some important points from this research are:
The proposed method achieves higher throughput compared to the traditional method, making it more efficient for handling large volumes of data;
The response time of the proposed method increases with more requests, reaching a maximum of 261 ms. However, the traditional method has a higher standard deviation than the proposed method;
In terms of latency, the proposed method has higher average latency but is 93% more stable than the traditional method;
The success rate of data processing is higher in the proposed method, with a lower error rate in the first two scenarios;
Apache Kafka’s role in the proposed method improves data handling and system availability;
The proposed method requires more resources, with RAM usage increasing by 25% compared to the traditional method.
For this research, which aims to process data in real time in the context of health monitoring, both the traditional and the proposed method involve a significant trade-off: not only resource requirements, but also a cloud dependency that requires internet stability. The cloud offers scalability and ease of management but requires a stable network between the user and the architecture for data processing. Therefore, this research opens up opportunities to reduce the latency caused by internet instability in the ways proposed in Section 5.
In addition to CVD monitoring, the proposed method can be applied to other fields requiring real-time communication and fast modeling, such as general patient monitoring, architectures for wearable devices, telemedicine, and Industrial IoT. However, further testing should be conducted to verify the capabilities of the proposed method in real situations, such as under unstable network conditions. In addition, research focusing on security enhancements could also be performed, since this system stores sensitive patient data; security will be an important aspect to develop.
Overall, this research has contributed to improving the efficiency of microservice communication for health monitoring systems and other fields, and it opens up opportunities for further development in data source efficiency and data security.