1. Introduction
Internet of Things (IoT) devices are becoming ubiquitous as they grow in number and proliferate [1,2,3]. This means that a considerable amount of computational power is distributed across these devices. Our goal has been to explore this potential, particularly by creating tools for code distribution and execution that put the spare computational capacity of IoT devices to use [4,5].
The single-board computer (SBC), a rather powerful type of machine that can be used as a generic IoT device, enjoys enormous popularity due to its high performance for its price range and the vast number of settings wherein it can be used. These devices are becoming a standard for IoT prototyping and implementation [6,7,8], bringing more and more computing power to this domain, and with it more underutilized devices.
Our goal is to create tools that enable the use of the spare power of such devices to accomplish tasks in collaborative scenarios where parallelization is key. One scenario where we believe that a coalition of IoT devices can make a difference is multimedia-based applications. Due to their nature, multimedia applications, such as the ones that process video or audio, benefit from hardware parallelism, as many of their tasks are highly parallelizable [9,10,11]. At the same time, they are also associated with QoS constraints that bring new challenges to the implementation of distributed service execution. These constraints must be fulfilled, and the coalition of IoT devices must guarantee that it is possible to deliver such a service.
In our scenario, IoT devices have different applications that are already being executed. The idea is to use their otherwise unused spare capacity. This means that we need to be able to measure the amount of spare capacity each device has in order to use it without interfering with its normal function.
It is well known that resource allocation in large distributed systems is an NP-hard problem. Due to the complexity and dynamism of applications, it is difficult to foresee the amount of load that will be imposed. Therefore, static allocation tends to create underutilized platforms, because resources are reserved for the worst case. To achieve our goals, we adopt a dynamic approach that relies on lightweight heuristics applied at runtime. A comprehensive survey of resource allocation heuristics that cover different levels of dynamicity while coping with the scale and complexity of high-density many-core platforms can be found in [12]. Here, we apply such concepts in order to know the amounts of resources (such as processor cycles, communication bandwidth, disk bandwidth, and storage) each device can contribute to a coalition of IoT devices and create a platform for distributed multimedia computations. Similar approaches in other domains can be found, for example, in [13,14].
Streaming consists of the continuous transmission of video and audio instead of downloading all the data first. Streaming protocols have been studied for a long time, and there is a reasonable amount of work about applying streaming protocols to IoT devices [15,16,17]. Our purpose here is not to define how the streaming of data occurs at its core, but to describe the creation and coordination of IoT clusters to deliver services that can use streaming protocols, guaranteeing some expected quality of service.
Clusters of IoT devices have been the topic of several previous works. In [18], the authors proposed a distributed architecture to perform collaborative work in IoT environments, sharing the application workload among the available devices. Offloading of tasks has also been studied. In [19], the authors proposed an algorithm to provide efficient task offloading in IoT and fog computing nodes. In [20], the authors developed a distributed task algorithm in the context of fog nodes, where the spare resources are also a relevant feature for the problem solution and are provided by what the authors call helper nodes. A two-layer architecture that includes one layer with clusters of IoT devices performing several tasks is presented in [21]. Sharing of unused computational resources of IoT devices in a clustered approach in the context of smart cities was the subject of [22]. The efficiency of IoT devices in the context of healthcare is studied in [23,24], where the authors apply task-level parallelism (TLP) as a technique to optimize resources. This technique is also used at the device level in our approach as a further optimization. Offloading solutions that improve the efficiency of computation in IoT devices for other application domains can be found, for example, in [25,26,27,28]. Although there are several previous works on this subject, to the best of our knowledge, the work we present here is the first one using IoT devices in collaboration to process multimedia applications.
The remainder of this paper is organized as follows. In the next section, we present definitions and details needed to describe our framework; then we present the core algorithms, along with examples of their application. We then validate our approach by showing its application to real and simulated scenarios and by analyzing the results. Finally, we conclude and present future work.
2. System Model
Our approach consists of a cluster of IoT devices that deliver some service, guaranteeing predefined QoS constraints. In this section, we describe, by means of definitions and examples, the several components of the system and how they interact.
We start by presenting, in Figure 1, a high-level view of the system as a whole. The user defines the quality needed for a service, which is then translated into application properties. The service is finally executed by the cluster of devices, each of which uses a reserved portion of its resources to deliver the service with adequate QoS to fulfill the user’s request.
Code distribution is the central feature of this system. IoT devices with spare capacity offer their availability to participate in a coalition in order to allow the system to achieve a global outcome in terms of service delivery with a given quality. Due to the ad hoc nature of this coalition, several details must be thoroughly studied, namely, how to communicate, distribute data and guarantee the global QoS, and how to deal with dynamic changes.
Resource reservation plays an important role because it establishes the exact spare capacity that a device is willing to concede to a coalition; this guarantee is central for the coalition to work properly. However, reservation alone does not guarantee that the provided QoS is stable, since devices can be added and removed (for example, due to failures) at any time. Therefore, during runtime, the system must ensure that a given QoS level is still achievable; otherwise, it switches to a degraded version of the service, when one is acceptable. Whenever that option is not possible, the coalition will not be able to continue to provide the service.
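The runtime decision described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each QoS level has a known capacity requirement expressed in abstract units and that levels are listed from the preferred one to the most degraded acceptable one; the module name, function names and units are hypothetical and not part of the framework.

```elixir
defmodule QosFallback do
  # Hypothetical sketch: each level is {name, required_capacity}; levels are
  # listed from the preferred one down to the most degraded acceptable one.
  # Capacity units are abstract (an assumption, not taken from the paper).

  # Returns {:ok, level_name} for the first level the coalition can still
  # sustain, or :unavailable when even the most degraded level is out of reach.
  def select_level(levels, available_capacity) do
    case Enum.find(levels, fn {_name, required} -> required <= available_capacity end) do
      {name, _required} -> {:ok, name}
      nil -> :unavailable
    end
  end

  # Total spare capacity reserved by the devices currently in the coalition.
  def coalition_capacity(reservations) do
    reservations |> Map.values() |> Enum.sum()
  end
end
```

When a device leaves, the same two functions can be called again with the reduced set of reservations to decide whether the service continues, degrades, or stops.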
We now present several definitions.
Definition 1 (IoT Cluster). An IoT cluster is a set of IoT devices that are currently providing a collaborative solution for a service.
Definition 2 (IoT Sub-cluster). Given an IoT cluster $C$, a sub-cluster $C_s$ is a set of IoT devices where each $d \in C_s$ also belongs to $C$, that is, $C_s \subseteq C$.
Definition 3 (Processing Unit). A processing unit $p$ is defined as $p = (t, u)$, where $t$ is a task of code that should execute with input data $u$.
Definition 4 (Service). A service $S$ is a set of processing units $P$ along with QoS constraints defined in $Q$.
A service can be provided with different QoS levels due to the nature of the service and the user’s quality preferences.
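As an illustration of Definitions 3 and 4, a processing unit and a service could be represented as follows. This is only a sketch under stated assumptions: the module, struct and field names are hypothetical, a task is modeled as a one-argument function, and the QoS constraints are modeled as a plain map.

```elixir
defmodule ServiceModel do
  defmodule ProcessingUnit do
    # task: a one-argument function standing for the code t; input: the data u.
    defstruct [:task, :input]
  end

  defmodule Service do
    # units: the processing units of the service; qos: its QoS constraints Q.
    defstruct units: [], qos: %{}
  end

  # Executes every processing unit of a service locally, in order.
  def run(%Service{units: units}) do
    Enum.map(units, fn %ProcessingUnit{task: task, input: input} -> task.(input) end)
  end
end
```

In a cluster, the units would be distributed among devices rather than executed locally; the local `run/1` only shows how a task and its input data are paired.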
Definition 5 (QoS Constraints). Let $Q$ be the set of the user’s QoS constraints associated with service $S$. Each $Q_k \in Q$ is a finite set of quality choices for the attribute of dimension $k$. This can be either a discrete or a continuous set. The QoS characterization is built from $Dim$, $Attr$ and $Val$, where $Dim$ is the set of QoS dimensions, $Attr$ is the set of attribute identifiers and $Val$ is the set of attribute value identifiers. Each value is represented by a tuple $(v, Type, Domain)$, where $Type$ is the data type of $v$ (for example, integer) and $Domain \in \{discrete, continuous\}$.
The set of relationships $DA$ assigns to each dimension in $Dim$ a set of attributes in $Attr$ and is defined as $DA : Dim \rightarrow \mathcal{P}(Attr)$.
The set of relationships $AV$ assigns to each attribute in $Attr$ a specific value in $Val$ and is represented as $AV : Attr \rightarrow Val$.
Finally, the characterization includes the set of existing dependencies among the values of the existing attributes, where each dependence relates the values of a pair of attributes.
For a given service, one can choose from a set of values for QoS dimensions; for example, audio-related parameters such as the sampling rate (8, 16, 24, 44, 48 and 88 kHz), sampling bits (8, 16 and 32) and end-to-end latency (100, 75, 50 and 25 ms), and video-related parameters such as picture resolution (SQCIF, QCIF, CIF, 4CIF, 16CIF), color depth (1, 3, 8, 16, …) and frame rate (1, …, 60).
Users specify the minimum acceptable QoS for a service S by giving the minimum acceptable value in each dimension. The service can then be delivered with any values that meet or exceed those defined by the user.
Example 1. Using a video streaming application as an example, the following is a list of quality dimensions that might be associated with any particular application. The list is given to illustrate the proposed model and is not intended to be exhaustive.
Dim = {Video Quality, Audio Quality}
Attr = {compression index, color depth, frame size, frame rate, sampling rate, sample bits}
Val = {{1, integer, discrete},{3, integer, discrete}, …, {[1, 30], integer, continuous}, …}
DA Video Quality = {compression index, color depth, frame size, frame rate}
DA Audio Quality = {sampling rate, sample bits}
AV compression index = {[0, 100]}
AV frame size = {SQCIF, QCIF, CIF, 4CIF, 16CIF}
AV color depth (bits) = {1, 3, 8, 16, 24, …}
AV frame rate (per second) = {[1, 30]}
AV sampling rate (kHz) = {8, 11, 32, 44, 88}
AV sample bits (bits) = {4, 8, 16, 24}
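To make the characterization of Example 1 concrete, the following sketch encodes some of the attribute domains and checks whether a requested value is admissible. The encoding is an assumption made for illustration: discrete domains are written as lists and continuous integer intervals as ranges, and color depth is left out because its domain is open-ended in the example. The module and function names are hypothetical.

```elixir
defmodule QosDomain do
  # Hypothetical encoding of part of Example 1: discrete domains are lists,
  # continuous integer domains are ranges.
  def attribute_values do
    %{
      "compression index" => 0..100,
      "frame size" => ["SQCIF", "QCIF", "CIF", "4CIF", "16CIF"],
      "frame rate" => 1..30,
      "sampling rate" => [8, 11, 32, 44, 88],
      "sample bits" => [4, 8, 16, 24]
    }
  end

  # True when `value` belongs to the domain declared for `attribute`.
  # Unknown attributes are rejected.
  def valid?(attribute, value) do
    case Map.fetch(attribute_values(), attribute) do
      {:ok, domain} -> Enum.member?(domain, value)
      :error -> false
    end
  end
end
```

A service request could be validated attribute by attribute with `valid?/2` before any negotiation with the devices takes place.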
Having such a QoS characterization of a particular application domain, users and service providers are now able to define service requirements and proposals in order to reach an agreement on service delivery. Since the QoS has a multi-dimensional nature, tradeoffs can be made due to the scarcity of resources.
Further details on QoS characterization for distributed systems can be found, for example, in [13].
From a pragmatic perspective, one can hide the details of QoS characterization in high-level descriptions, as presented in the following example.
Example 2. From a user’s perspective, and for practicality, QoS dimensions can be simplified. For example, video can be simply described as SD, HD and FHD, as presented in Table 1; and audio as low, medium and high, as presented in Table 2. Please note that this is just an example and other configurations can be easily implemented.
Table 1.
High-level video definitions.
Description | Common Name | Resolution |
---|---|---|
SD (Standard Definition) | 480p | 640 × 480 |
HD (High Definition) | 720p | 1280 × 720 |
FHD (Full HD) | 1080p | 1920 × 1080 |
Table 2.
High-level audio definitions.
Quality | Bit Rate (kbps) | Sample Rate (kHz) |
---|---|---|
Low | 128 | 32 |
Medium | 192 | 44.1 |
High | 320 | 48 |
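The high-level labels of Example 2 can be expanded into concrete parameters with a simple lookup, as in the sketch below. The values are those of Tables 1 and 2; the module name, key names and map layout are assumptions made for illustration.

```elixir
defmodule Profiles do
  # Values copied from Tables 1 and 2; the layout and key names are assumptions.
  @video %{
    sd: %{width: 640, height: 480},
    hd: %{width: 1280, height: 720},
    fhd: %{width: 1920, height: 1080}
  }

  @audio %{
    low: %{bit_rate_kbps: 128, sample_rate_khz: 32},
    medium: %{bit_rate_kbps: 192, sample_rate_khz: 44.1},
    high: %{bit_rate_kbps: 320, sample_rate_khz: 48}
  }

  # Expands a video label and an audio label into concrete parameters.
  def expand(video_label, audio_label) do
    with {:ok, video} <- Map.fetch(@video, video_label),
         {:ok, audio} <- Map.fetch(@audio, audio_label) do
      {:ok, Map.merge(video, audio)}
    else
      :error -> {:error, :unknown_label}
    end
  end
end
```

Other configurations would only require different entries in the two maps.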
Moreover, note that QoS constraints are defined as the minimum acceptable set of properties to deliver a given service. This means that the coalition can produce a higher quality outcome, but not a lower one. Thus, devices leaving the coalition (due to some type of failure) must be handled, and the IoT cluster’s minimum QoS delivery must be recalculated.
2.1. Algorithms
The framework is divided into two main modules: cluster setup and dynamic adaptation to new services. IoT clusters collaborate in order to fulfill the requested service by processing data within the QoS defined by the user. Due to the heterogeneity of the services to be executed, users’ quality preferences, underlying operating systems, networks and devices, the QoS specification must be understood by all the devices, or they must be able to map their individual specifications onto a common one.
It is important to note that although we do not explore streaming algorithms, we need to understand the capabilities of the different devices in our cluster. Thus, we estimated their processing power in order to determine whether they are able to cope with the imposed demand.
2.1.1. Cluster Setup
The first step toward a collaborative network of IoT devices that delivers a multimedia service is to know how many devices are available and what they are capable of, and therefore to estimate the global power at hand. Cluster formation consists of registering available IoT devices with the main device (the one requesting the service) and announcing the spare capacity each can deliver. Since the goal is to process audio and video, each device announces the capacity it can guarantee in a worst-case scenario with the spare resources it has. This is then matched against the user’s preferences, and if possible, the service is delivered. Each IoT device registers with the device that is requesting to run the service in a client-server model. Algorithm 1 describes how this process works.
Algorithm 1 Cluster setup.
Let $n_r$ be the node requesting the service $S$.
Let $N$ be a global variable that stores the set of available nodes in the cluster.
Take a service $S$ with processing unit $P$ and related QoS $Q$ such that each $Q_k$ is a finite set of $n$ quality choices for the $k$th attribute, expressed in decreasing order of preference, for all $k$ QoS dimensions.
Let $C$ be the set of nodes in the cluster capable of providing a given service.
Let $n_r$ broadcast to the local network the request for nodes to register, adding them to $N$.
1: for each $n_i \in N$ do
2:   Let $Q_{n_i}$ be the QoS delivered by node $n_i$
3:   if $Q_{n_i}$ is higher (in all its dimensions) than $Q$ then
4:     $C \leftarrow C \cup \{n_i\}$
5:   end if
6: end for
7: return $C$
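The selection loop of Algorithm 1 can be sketched in a few lines. The sketch assumes that each registered node has already announced its QoS as a map of numeric values where larger means better quality, and it treats "higher" as "at least as high", following the minimum-QoS reading used earlier in this section. The module and function names are hypothetical.

```elixir
defmodule ClusterSetup do
  # Hypothetical sketch of Algorithm 1. `announced` is a list of
  # {node, offered_qos} pairs; QoS values are maps where larger numbers mean
  # better quality (an assumption made for this example).

  # Keeps, in registration order, the nodes whose announced QoS meets
  # `required` in every dimension.
  def capable_nodes(announced, required) do
    for {node, offered} <- announced, meets_all?(offered, required), do: node
  end

  # A dimension missing from the announcement counts as 0, so it fails any
  # positive requirement.
  defp meets_all?(offered, required) do
    Enum.all?(required, fn {dimension, minimum} ->
      Map.get(offered, dimension, 0) >= minimum
    end)
  end
end
```

An empty result means that no registered node can provide the service at the requested quality.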
2.1.2. Dynamic Adaptation to New Services
In Algorithm 2, we present the coordination for accepting new services.
Algorithm 2 Dynamic adaptation to new services.
Let $n_r$ be the node requesting the service $S'$.
Given a new service $S'$ with processing unit $P'$ and related QoS $Q'$ such that each $Q'_k$ is a finite set of $n$ quality choices for the $k$th attribute, expressed in decreasing order of preference, for all $k$ QoS dimensions.
Let $N$ be a global variable that stores the set of available nodes in the cluster.
Let $C'$ be the sub-cluster that will provide the new service.
Let $n_r$ broadcast to the local network the request for nodes to register, adding them to $N$.
while do
  for each $n_i \in N$ do
    Let $Q_{n_i}$ be the QoS delivered by node $n_i$
    if $Q_{n_i}$ is higher (in all its dimensions) than $Q'$ then
      if $n_i$ can accommodate such service along existing ones then
        $C' \leftarrow C' \cup \{n_i\}$
      end if
    end if
  end for
end while
return $C'$
Note that services can be added only as long as the cluster is still able to deliver them. In the case of failure of one node, the service is reset and a new sub-cluster is defined.
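The accommodation test in Algorithm 2 can be illustrated with the following sketch. It assumes, for illustration only, that each node tracks its reserved spare capacity and the load already committed to other services, both in abstract units, and that the QoS comparison has already been done. All names are hypothetical.

```elixir
defmodule SubCluster do
  # Hypothetical sketch of the admission test in Algorithm 2. Each node is a
  # map with its reserved spare capacity and the load already committed to
  # other services, both in abstract units (assumed for illustration).

  # Returns {sub_cluster, updated_nodes}: the names of the nodes that accept
  # the new service, and the node list with their committed load increased.
  def admit(nodes, demand) do
    {accepted, updated} =
      Enum.reduce(nodes, {[], []}, fn node, {accepted, updated} ->
        if node.spare - node.committed >= demand do
          {[node.name | accepted], [%{node | committed: node.committed + demand} | updated]}
        else
          {accepted, [node | updated]}
        end
      end)

    {Enum.reverse(accepted), Enum.reverse(updated)}
  end
end
```

Calling `admit/2` repeatedly with the updated node list shows how successive services consume the reserved capacity until no node can accept a further one.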
Example 3. In Figure 2, we can see two sub-clusters that were created to provide two different services with different QoS. The requesting device first asked for one service, which was provided by one sub-cluster of devices, and after this, for a second service, which was provided by another sub-cluster. Both services are based on the available devices that are part of the main cluster. This cluster is dynamic, since devices can be added and removed at any time.
All the algorithms presented here are high-level descriptions of the implementation; many minor details are not described. In the next section, we look deeper at how the actual framework works by describing relevant parts of the implementation.
3. Results and Analysis
For the implementation of the framework, we used the Elixir programming language [29] because of the ease with which it distributes and executes data and code across devices in a network. For the hardware, we used Raspberry Pis, a type of SBC (single-board computer) that can be used as a generic IoT device and enjoys enormous popularity due to its high performance for its price range and the vast number of scenarios where it can be used [6,7,8]. The use of this type of SBC also allows the use of Linux as the operating system to implement a resource reservation policy, and the use of Elixir along with the Erlang virtual machine [30] in each device, creating a distributed scenario that is simpler to manage.
3.1. Cluster Formation
All the IoT devices ran a client-server program that allowed them to interact with others and enabled the creation of the cluster that would provide the service. Any IoT device, when turned on, announces that it is available. This is done by using built-in discovery features of Elixir, and every device maintains a list of all the other known devices in the network, as shown in Figure 3.
We say that these other devices are registered with the device that has them on its list. The framework also provides a feature in which a link between devices is maintained, as seen in Figure 4. Any failure (device disconnecting) is then detected, allowing the list to be updated by removing the disconnected device, as seen in Figure 5.
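The bookkeeping of known devices can be sketched as follows. This is one possible mechanism, not necessarily the one used in the framework: on a distributed Erlang/Elixir node, `:net_kernel.monitor_nodes(true)` subscribes the calling process to `{:nodeup, node}` and `{:nodedown, node}` messages, and the functions below show how such messages could update the list. The module and function names are hypothetical.

```elixir
defmodule KnownDevices do
  # Hypothetical sketch. On a distributed Erlang/Elixir node, calling
  # :net_kernel.monitor_nodes(true) makes the calling process receive
  # {:nodeup, node} and {:nodedown, node} messages; the functions below only
  # show how such messages could update the list of known devices.

  def subscribe do
    :net_kernel.monitor_nodes(true)
  end

  # Adds a device that announced itself, keeping the list free of duplicates.
  def handle_event(devices, {:nodeup, device}) do
    if device in devices, do: devices, else: devices ++ [device]
  end

  # Removes a device whose link was lost.
  def handle_event(devices, {:nodedown, device}) do
    List.delete(devices, device)
  end
end
```

A process that called `subscribe/0` would apply `handle_event/2` to every such message it receives, so that its list always reflects the devices that are currently reachable.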
When one of these devices needs to run a service, it queries each of the devices that are registered with it, obtaining the QoS capability that each can provide. The devices that provide a QoS above the minimum are selected to collaborate. Listing 1 shows, in Elixir, the main function of the cluster formation.
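Listing 1: Cluster formation main function.
defmodule ClusterFormation do
  # Returns the registered nodes able to meet the requested QoS.
  def create_cluster(registered_nodes, qos_parameters_list) do
    query_nodes(registered_nodes, qos_parameters_list)
  end

  # No nodes left to query.
  def query_nodes([], _qos_parameters_list) do
    []
  end

  # Asks one node to evaluate the QoS request and waits for its answer.
  def query_nodes([node | remaining_nodes], qos_parameters_list) do
    send(node, {:evaluate, qos_parameters_list})

    receive do
      # ^node restricts the match to the reply of the node just queried.
      {^node, :ok_capable} ->
        [node | query_nodes(remaining_nodes, qos_parameters_list)]

      {^node, :not_capable} ->
        query_nodes(remaining_nodes, qos_parameters_list)
    end
  end
end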
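The query protocol of Listing 1 can be exercised locally with processes standing in for devices. The sketch below is a hypothetical variant, not the framework's code: it adds a timeout so that a device that never answers is skipped rather than blocking the requester, and it includes a `fake_device/2` helper that exists only for the demonstration. The timeout value and all names are assumptions.

```elixir
defmodule QueryWithTimeout do
  # Hypothetical variant of Listing 1: a node that does not answer within
  # `timeout_ms` is skipped instead of blocking the requester forever.
  def query_nodes([], _qos, _timeout_ms), do: []

  def query_nodes([node | rest], qos, timeout_ms) do
    send(node, {:evaluate, qos})

    receive do
      {^node, :ok_capable} -> [node | query_nodes(rest, qos, timeout_ms)]
      {^node, :not_capable} -> query_nodes(rest, qos, timeout_ms)
    after
      timeout_ms -> query_nodes(rest, qos, timeout_ms)
    end
  end

  # Stand-in for a device: a local process that answers one request with
  # `reply`, tagged with its own pid as in Listing 1.
  def fake_device(requester, reply) do
    spawn(fn ->
      receive do
        {:evaluate, _qos} -> send(requester, {self(), reply})
      end
    end)
  end
end
```

Querying the devices one at a time keeps the logic close to Listing 1; with a timeout, the worst-case setup time grows with the number of unresponsive devices, which is a trade-off of this variant.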
3.2. Local QoS Calculation
A local QoS calculation determines whether a node can collaborate in a service given specific QoS constraints. This needs to be fast in order to cope with the dynamism of the framework. Our proposed solution for this specific problem is to benchmark the devices and know their capabilities in advance. Since we were using SBCs, we could profile those devices in order to know how much CPU and memory is needed to deliver some service under specific circumstances. We benchmarked all the SBCs that are used in the framework and obtained their resource needs for processing video and audio under some typical scenarios that we defined. The framework can be configured with different SBCs and different configurations of video and audio streams; the setup described here is the example used in our solution, and it can be easily adapted to other situations.
We used the ffmpeg tool (https://ffmpeg.org/, accessed on 4 November 2022) and the resource usage monitoring utility RPi-Monitor (https://github.com/XavierBerger/RPi-Monitor, accessed on 4 November 2022) to test and obtain statistical data on typical scenarios for the SBCs considered, namely, the Raspberry Pi 3 A+, 3 B+ and Zero W.
We measured CPU and memory load for decoding video and audio in the following scenarios:
Low (SD video and low audio quality): Video resolution of 640 × 480 pixels with 24 frames per second and audio with 128 kbps of bit rate and 32,000 Hz sample rate.
Medium (HD video and medium audio quality): Video resolution of 1280 × 720 pixels with 24 frames per second and audio with 192 kbps of bit rate and 44,100 Hz sample rate.
High (FHD video and high audio quality): Video resolution of 1920 × 1080 pixels with 30 frames per second and audio with 320 kbps of bit rate and 48,000 Hz sample rate.
The CPU load is presented in Figure 6, where one can see that there was almost no difference in CPU load between the Raspberry Pi 3 B+ and 3 A+. This was expected, as they rely on the same hardware at the CPU level. The Raspberry Pi Zero W has a less powerful processor, and this is clearly noticeable. While the CPU load was stable across all our tests, this was not the case for RAM, where different configurations of the decoder led to rather different RAM requirements. The decoder needed an average of 27 MB for the lowest quality stream, 45 MB for the medium quality stream and 56 MB for the highest quality one, and this applies to all the devices we tested.
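With the benchmark results stored in advance, the local QoS calculation reduces to a table lookup and two comparisons, as in the sketch below. The RAM figures are the averages reported above; the CPU load is passed in by the caller because the measured values are only available in Figure 6. The module name, function names and the shape of the `spare` map are assumptions.

```elixir
defmodule LocalQos do
  # Average decoder RAM needs reported in the text (MB), per scenario.
  @ram_mb %{low: 27, medium: 45, high: 56}

  # Hypothetical check: `cpu_load` is the benchmarked CPU share (0.0 to 1.0)
  # of the scenario on this device model, supplied by the caller. `spare`
  # describes what the device has reserved for the coalition.
  def capable?(scenario, cpu_load, %{cpu: spare_cpu, ram_mb: spare_ram}) do
    cpu_load <= spare_cpu and Map.fetch!(@ram_mb, scenario) <= spare_ram
  end

  # Reply atoms match the ones used in Listing 1.
  def evaluate(scenario, cpu_load, spare) do
    if capable?(scenario, cpu_load, spare), do: :ok_capable, else: :not_capable
  end
end
```

Because no measurement is taken at request time, the answer is immediate, which is what the dynamism of the framework requires.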
Please note that in these tests we did not use H.264 hardware acceleration, as not all the devices provide this feature. Therefore, we directed our approach to the typical IoT device that relies mainly on the CPU’s capability to process all the work. Nevertheless, with H.264 hardware acceleration enabled, the Raspberry Pi 3 had a constant CPU load of 5% for all the different combinations of video/audio tested and a constant memory footprint of 22 MB, making it a strong candidate for enabling fast and efficient processing of audio and video streams, which we hope to explore in future work.
Finally, we stress that this is a configuration step of the framework, and any other values and dimensions can be considered when the setup phase occurs. Additionally, we tested stream encoding and transcoding, but we verified that these are much more intensive tasks and that the hardware at hand could not cope with the demand, even for rather low-quality streams.
3.3. Dynamic Coordination
When a new service arrives, a QoS request is sent to the registered devices, and the computation is done as explained in Section 3.2. If it is possible to provide the service, then a new sub-cluster is created for this service following the procedure described in Algorithm 2. All the coordination relies on the Elixir message system.
When one device fails, the framework detects that event immediately, as illustrated in Figure 5. After studying several approaches, we concluded that the most adequate one is to reset the cluster whenever there is a node failure. This means that the whole process of querying registered nodes and forming the service-providing coalition is done again. Since this process relies on local network communication with very low latency, and the data used to set up the configuration are available statically, it is relatively fast. This means that, when it is still possible to proceed with the service, the user will be deprived of it only briefly.