*Proceeding Paper* **The System Architecture and Methods for Efficient Resource-Saving Scheduling in the Fog †**

**Anna Klimenko**

Institute of IT and Security Technologies, Kirovogradskaya St. 25-2, 117534 Moscow, Russia; anna\_klimenko@mail.ru

† Presented at the 15th International Conference "Intelligent Systems" (INTELS'22), Moscow, Russia, 14–16 December 2022.

**Abstract:** This paper considers the problem of resource-saving scheduling in a fog environment. The objective function of the problem maximizes the reliability function of the fog nodes. Therefore, to create a schedule, the following is required: the history of the fog devices' state changes and the search space, which consists of preselected nodes in the neighbourhood of the cloud-fog broker. The obvious way to provide the scheduler with this information is to poll the fog nodes, yet polling can take an unacceptable amount of time given the QoS requirements. In this paper, the system architecture and general methods for efficient resource-saving scheduling are presented. The system is based on the usage of distributed ledger elements, which make the nodes aware of their surroundings. The usage of the distributed ledger allows not only for the creation of the resource-saving schedule but also for the reduction of the scheduling problem-solving time, which frees additional time that can be used for solving user tasks. The latter also contributes to overall resource saving via reliability. The novelty of this paper consists in the development of the hybrid ledger-based system, which integrates and arranges the elements of various ledger types to solve the newly formulated problem.

**Keywords:** scheduling problem; optimization; fog computing; distributed ledger

### **1. Introduction**

The scheduling problem is of high importance in distributed computing, including scheduling in fog and edge computing environments. Resource-saving scheduling is a problem with a particular objective function: to maximize the reliability function values of the nodes at the end of the user operation. Moreover, solving the problem must not consume much time because of the quality of service (QoS) and quality of experience (QoE) requirements for fog and edge systems.

Fog computing presupposes that a cloud-fog broker (CFB) functions in the fog layer of the network. The general tasks of the CFB are to receive the computational problem and data from the edge device; to schedule this problem, assigning the subtasks to nearby fog devices or, if no fog devices have an appropriate amount of resources, sending the problem and data to the cloud; and to return the result to the user's device [1].

There are some issues in this scheme that are created by the peculiarities of the fog layer and of the resource-saving scheduling problem:


**Citation:** Klimenko, A. The System Architecture and Methods for Efficient Resource-Saving Scheduling in the Fog. *Eng. Proc.* **2023**, *33*, 9. https://doi.org/10.3390/engproc2023033009

Academic Editors: Askhat Diveev, Ivan Zelinka, Arutun Avetisyan and Alexander Ilin

Published: 17 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


The issues listed above give rise to the following tasks, which must be performed to generate the schedule:


The most obvious way to implement the scheduling process under the listed conditions is to request all the needed data from the fog nodes in the CFB neighbourhood and, if sufficient resources are available, to assign the tasks to the nodes. Yet distances between nodes can produce unacceptable delays in forming the search space of the scheduling problem (as well as in retrieving workload information), and any increase in the schedule formation time decreases the time left for solving the computational task. Therefore, the general goal of this paper is to design the architecture and basic functioning methods of the system for efficient resource-saving scheduling in the fog.

The following tasks are considered to achieve the general goal:


The main contribution of this paper is the development of the architecture and basic functional methods of the system for efficient resource-saving scheduling in the fog. The novelty of this paper consists in the development of the hybrid ledger-based system, which integrates and arranges the elements of various ledger types to solve the newly formulated problem.

#### **2. The Resource-Saving Scheduling Problem and Its Analysis**

The scheduling problem in the fog environment takes place when the CFB distributes the computational tasks within the set of nearby fog nodes [1].

The process of task scheduling in general involves solving an NP-hard problem, which can be done via various methods, including up-to-date heuristics and metaheuristics (genetic algorithms, ant colony algorithms, simulated annealing, etc.). These methods are quite efficient; however, no discrete optimization can be performed without the formation of a search space [4], and the schedule-forming procedure is time-restricted: as the scheduling time increases, the time left for performing the functional tasks decreases. Moreover, the resource-saving scheduling problem requires data on the workload history of the nodes because this is the key parameter for the workload distribution.

Consider the CFB, which receives the user task and the data to solve it. The problem is to schedule the task within the formed fog node community so that *P*0(*τ*) → *max*, where *P*0(*τ*) is the overall reliability function value of the fog node community, including the CFB, and *τ* is the moment of the user operation completion.

Consider the network graph *G* = < *V*, *U* >, where *V* is a set of computational nodes and *U* is a set of edges. *V* = {*vi*}, *vi* = < *i*, *pi*, *Ri*(*t*0), *Li* >, where *i* is the node identifier, *pi* is the node performance, *Ri*(*t*0) is the computational resource of the node at the moment *t*0 at which the scheduling problem is solved, and *Li* is the workload of the node at the moment *t*0. *U* = {*ui*}, where *ui* is the data transmission rate of network edge *i*. The user operation is described as an acyclic graph whose vertices are assigned to tasks and whose edges are assigned to the information connections between them: *O* = < *T*, *C* >, where *T* is the set of subtasks and *C* is the set of information connections. *T* = {*tj*}, *tj* = < *j*, *wj*, *dj* >, where *j* is the subtask identifier, *wj* is the computational complexity of the subtask, and *dj* is the data volume transferred to the network. The problem solution is the following assignment of tasks:

$$A = \begin{bmatrix} t\_{11} & \dots & t\_{1m} \\ \vdots & t\_{ij} & \vdots \\ t\_{n1} & \dots & t\_{nm} \end{bmatrix} \text{ such that } P\_0(\tau) \to \max, \text{ where } P\_0(\tau) = \prod\_{i} P\_i(\tau), \quad P\_i(\tau) = \exp\left(-\lambda\_0 \tau 2^{(kD\_i/10)}\right),$$

where *Pi*(*τ*) is the reliability function value of node *i*, *Di* is the workload of node *i*, *k* is the coefficient of node temperature increase depending on the current workload, and *tij* is the moment of assignment of task *j* to node *i*.

The constraint for this problem is as follows: *τ* < *tconst*; that is, the user operation completion time must be less than the declared time for this operation. The formal statement of the resource-saving scheduling problem is thus typical of scheduling problems, and the problem can be solved by one of the existing, well-described methods (e.g., simulated annealing or greedy approaches).
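The objective and the constraint above can be illustrated with a minimal Python sketch. It assumes the exponential reliability model that the paper gives later as Equation (2), a fixed completion time *τ* for both candidate schedules, and purely hypothetical parameter values and workload vectors; the function names are illustrative and are not part of the described system.

```python
import math

def node_reliability(workload, tau, lambda0, k):
    """Reliability of one node at time tau: exp(-lambda0 * tau * 2^(k*D/10))."""
    return math.exp(-lambda0 * tau * 2 ** (k * workload / 10))

def community_reliability(workloads, tau, lambda0, k):
    """Objective P_0(tau): product of the reliabilities of all nodes."""
    return math.prod(node_reliability(d, tau, lambda0, k) for d in workloads)

def is_feasible(tau, t_const):
    """Constraint: the operation must complete before the declared time."""
    return tau < t_const

# Hypothetical values, chosen only for illustration.
LAMBDA0 = 1e-6   # failure rate of an unloaded node
K = 0.5          # temperature-per-workload coefficient
TAU = 100.0      # completion time of the user operation
T_CONST = 120.0  # declared time limit for the operation

# Two candidate workload distributions with the same total workload.
balanced = [50.0, 50.0, 50.0]
skewed = [100.0, 25.0, 25.0]

p_balanced = community_reliability(balanced, TAU, LAMBDA0, K)
p_skewed = community_reliability(skewed, TAU, LAMBDA0, K)

if is_feasible(TAU, T_CONST):
    best = "balanced" if p_balanced > p_skewed else "skewed"
    print(f"P0 balanced={p_balanced:.6f}, skewed={p_skewed:.6f}, best={best}")
```

In this example the evenly loaded community ends with the higher *P*0(*τ*), which shows how the choice of assignment influences the objective even when the total workload is the same.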

However, to solve this optimization problem in the fog, the following data must be provided:


Therefore, to solve the resource-saving scheduling problem in the fog, some means must be developed to provide the CFBs with the information required.

#### **3. State-of-the-Art Analysis**

The basic research area of this paper covers the following problem fields:


We consider the field of the computational resources, failure rate, and reliability. In the current paper the term "computational resource" refers to the reliability function of the computational node. The reliability function value depends on the failure rate, while the failure rate is related to the device temperature and workload:

$$
\lambda = \lambda\_0 \cdot 2^{\left(\Delta T / 10\right)},
\tag{1}
$$

where *λ* is the resulting failure rate, *λ*<sub>0</sub> is the failure rate of the unloaded device, and Δ*T* is the difference between the temperature of the loaded device and that of the unloaded one.

The study [5] determines the coefficient *k* that connects the node temperature with the workload. Consequently, the reliability function is determined as follows:

$$P\_i(\tau) = \exp\left(-\lambda\_0 \tau 2^{(kD/10)}\right).\tag{2}$$
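One hedged reading of how Equation (2) follows from Equation (1) is sketched below. It assumes a constant workload *D* over the interval [0, *τ*], a temperature rise proportional to the workload (Δ*T* = *kD*, with *k* the coefficient from [5]), and the standard exponential reliability law for a constant failure rate:

$$P(\tau) = \exp(-\lambda \tau), \quad \lambda = \lambda\_0 \cdot 2^{(\Delta T/10)}, \quad \Delta T = kD \;\Rightarrow\; P(\tau) = \exp\left(-\lambda\_0 \tau 2^{(kD/10)}\right).$$

Under this model, every 10-degree rise in temperature doubles the failure rate.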

Therefore, by varying the node workload, the reliability function values can be improved at the particular time moment. This approach to system reliability improvement has been used in other studies [6,7], with a variation of objective function forms. However, no attention has been paid to the workload distribution under the uncertainty of the search space, which is highly related to the fog environment.

The scheduling problem is considered carefully in a wide range of publications because the problem itself is not new [8]. Additionally, some studies have examined resource planning and scheduling in the fog but have not considered the questions of reliability, resource-saving, or scheduling under the search space uncertainty.

Notably, [9] considers a method for the online formation of the search space in a dynamic fog environment, in which ontologies are used to speed up the search space formation. However, this approach presupposes that the search space is formed online by means of node polling, which may be unacceptable under time and QoS restrictions.

Distributed ledger technologies, which first emerged as cryptocurrency systems, have been applied to the particular areas of fog and edge computing, with some examples listed below:


Regarding the current paper, distributed ledger technologies can provide synchronized local data storage as the source of available resources, node states, and workload history data. It must be mentioned that there are no reports of ledger usage aimed at improving node reliability, except for [13]. In that study, the distributed ledger is used to store information about the devices' workload, yet no methods are given for ensuring data integrity or for the functioning of the system in general.

Therefore, despite these rare efforts, there are still no basic approaches to the design and development of systems that efficiently solve the resource-saving scheduling problem in the fog.

#### **4. A Comparison of the Approaches to CFB Data Provisioning and the Approach Choice for System Design**

As mentioned in the previous section, to solve the resource-saving scheduling problem the following is required:


All these requirements can be met by means of particular data storing on the fog nodes.

Let *t*<sub>0</sub> be the time by which all the needed data are gathered from the nearby fog nodes. The minimum possible value of *t*<sub>0</sub> is *t*min, the minimum of *ti*, which is the time of information gathering from the nearest fog node. Thus, the lower estimation is as follows: *t*<sub>0</sub> ≥ *t*min. The estimations of the time needed for the information gathering are as follows, with *Dk* as the distance between the CFB and fog device *k*:

$$t\_{\min} \le t\_0 \le \sum\_{k=1}^n t\_{\min} D\_k \tag{3}$$

Consider a case in which all the needed information is stored on the fog node locally. There is no polling, and the time of information gathering for the search space forming and for the node workload history includes the time of local storage analysis.

Therefore, the estimation of the time for the local data processing and analysis can be as follows:

$$t\_0 = \xi(V\_{ledger}),\tag{4}$$

where *Vledger* is the volume of the local data storage, and *ξ*(*Vledger*) is the time of ledger analysis. Thus, the polling strategy depends on the network diameter and on the state of the nodes and data transmission channels, while local storage of the appropriate information makes it possible to form the search space and analyse the workload history of the nodes much more efficiently. The comparison between the approaches to information gathering is shown in Figure 1.

**Figure 1.** The time estimations for the polling strategy and the local data storage.
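The two estimations can be compared numerically with a small Python sketch. The linear model for *ξ*(*Vledger*), the per-record analysis cost, the distances (expressed here in hops, with the nearest node at distance 1), and all numeric values are assumptions made only for illustration; the paper does not specify them.

```python
def polling_time_bounds(t_min, distances):
    """Lower and upper estimates of the polling time, following Eq. (3)."""
    lower = t_min
    upper = sum(t_min * d for d in distances)
    return lower, upper

def local_analysis_time(ledger_volume, seconds_per_record):
    """Assumed linear model for xi(V_ledger): time grows with the ledger size."""
    return ledger_volume * seconds_per_record

# Hypothetical inputs.
T_MIN = 0.01                 # seconds to gather data from the nearest fog node
DISTANCES = [1, 2, 3, 4]     # distances from the CFB to each fog node, in hops
LEDGER_RECORDS = 5000        # number of records in the local ledger copy
SECONDS_PER_RECORD = 2e-6    # assumed cost of analysing one record

lower, upper = polling_time_bounds(T_MIN, DISTANCES)
local = local_analysis_time(LEDGER_RECORDS, SECONDS_PER_RECORD)

print(f"polling: between {lower:.3f} s and {upper:.3f} s")
print(f"local ledger analysis: {local:.3f} s")
```

With these assumed numbers the local analysis time does not depend on the number of nodes or their distances, whereas the upper polling estimate grows with both.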

The following must be highlighted: reducing the time spent on solving the resource-saving scheduling problem leaves more time for the user tasks, which makes it possible to decrease the workload and thereby to save the resources of the fog nodes.

It is well known that there is a wide range of distributed ledger types, each of which uses different data storage and consensus methods. For example, there are many consensus methods for blockchain-based ledgers, including proof of work, proof of stake, and proof of authority (competitive consensus) as well as PBFT (cooperative consensus). Some consensus methods have been developed for ledgers with other storage types, e.g., Nano (block lattice), the Swirlds Hashgraph consensus algorithm (Hedera Hashgraph), and others [14].

To provide the CFBs with the appropriate data, the data storage method and the method of consensus on the data must be chosen on the basis of the general functional requirements for the architecture of the system that solves the resource-saving scheduling problem:


The block lattice/Nano technology of data storage and synchronization is the most appropriate for storing the changing resource and workload states of the fog nodes. Yet it is insufficient to synchronize the data only: there must be a mechanism to detect a CFB failure and to recover the system of fog brokers. The viewstamped replication concept has potential in this regard. Because the data are fully replicated, no data exchange is needed during the leader election procedure; thus, the time needed for leader election is acceptable.

#### **5. Development of the System Architecture and Basic Functional Methods**

Consider the atomic transaction as follows: *Q* = < *timestamp*, *Load*, *nodeid*, *t*, *brokerid* >, where *transactionid* is the transaction identifier, *roundid* is the epoch number, *Load* is the workload estimation of the node, *nodeid* is the unique fog node identifier, *t* is the moment the device is put into operation, and *brokerid* is the fog broker identifier. Transactions of the described structure are stored in the block-lattice-type ledger. Each fog node is assigned its own blockchain within the block lattice, and each blockchain stores the states of its node. Every time a node changes its state (e.g., a workload change), this fact is placed into its assigned blockchain and disseminated through the fog broker nodes, which are the owners of the block lattice storage.
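A minimal Python sketch of such a storage is given below. It takes the fields of the tuple *Q* as written above and keeps one hash-linked chain per fog node. The class names, the hashing scheme, and the helper methods are assumptions introduced for illustration only and are not the author's implementation; dissemination between brokers and consensus are left out.

```python
import hashlib
import json
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Transaction:
    """One state record of a fog node (fields follow the tuple Q)."""
    timestamp: float
    load: float
    node_id: str
    t: float          # moment the device was put into operation
    broker_id: str

class BlockLattice:
    """Hypothetical block lattice: a separate hash-linked chain per fog node."""

    def __init__(self):
        # node_id -> list of (block_hash, transaction)
        self._chains = defaultdict(list)

    def append(self, tx):
        """Add a state change to the chain of the node that produced it."""
        chain = self._chains[tx.node_id]
        prev_hash = chain[-1][0] if chain else "0" * 64
        payload = json.dumps(
            [prev_hash, tx.timestamp, tx.load, tx.node_id, tx.t, tx.broker_id]
        )
        block_hash = hashlib.sha256(payload.encode()).hexdigest()
        chain.append((block_hash, tx))
        return block_hash

    def latest_state(self, node_id):
        """Most recent record of a node, or None if the node is unknown."""
        chain = self._chains.get(node_id)
        return chain[-1][1] if chain else None

    def workload_history(self, node_id):
        """Workload values of a node in the order they were recorded."""
        return [tx.load for _, tx in self._chains.get(node_id, [])]

ledger = BlockLattice()
ledger.append(Transaction(1.0, 0.2, "fog-1", 0.0, "broker-A"))
ledger.append(Transaction(2.0, 0.7, "fog-1", 0.0, "broker-A"))
ledger.append(Transaction(2.5, 0.4, "fog-2", 0.0, "broker-A"))
print(ledger.workload_history("fog-1"))
```

Keeping a separate chain per node means that a broker can read the workload history of one node without scanning the records of the others, which is the property the scheduler relies on when it forms the search space locally.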

The basic architecture of the system is presented in Figure 2.

**Figure 2.** The basic architecture of the system.

The general system architecture implements the following functionality. The fog device software performs the following:


The CFB software performs the following:


Figure 3 shows the software architecture of the fog node and the broker node.

**Figure 3.** The structure of the CFB component and its interaction with the fog device.

The basic functions of the system are implemented by means of the following methods.

	- Request of the list of fog brokers from the neighbour fog node.
	- Sending of a request of the ledger copy to the nearest fog broker node.
	- Addition of the new own blockchain to the block lattice, which is assigned to the new device in the fog layer.

### *5.2. Broker Failure*

The failure of a fog broker does not make it necessary to remove its own blockchain from the ledger. However, a broker failure can cause errors in solving the scheduling problem when the neighbouring fog nodes send their state descriptions to the failed broker. In the case of broker failure, the fog nodes have no command to send their data somewhere else. This problem can be solved in various ways:


In the case of broker failure, some state information will be lost, yet this situation can be detected within the fog broker network, and the interaction between the fog nodes and brokers can be recovered. To detect broker failures, the broker community is formed. The brokers' awareness of their neighbourhood is implemented by means of message exchange. When a distributed leader is used (the viewstamped replication concept), broker nodes send "heartbeat" messages to the leader and receive the same messages from it. In the distributed leader approach, the information exchange within the broker network therefore remains acceptable, because each broker exchanges heartbeats only with the leader.
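The heartbeat-based detection can be sketched in Python as follows. The timeout value, the class name, and the explicit passing of the current time are assumptions made for the example; the sketch covers only the detection on the leader side and is not a full viewstamped replication implementation.

```python
class HeartbeatMonitor:
    """Hypothetical leader-side tracker of heartbeats received from brokers."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence after which a broker is suspected
        self._last_seen = {}     # broker_id -> time of the last heartbeat

    def record(self, broker_id, now):
        """Register a heartbeat received from a broker at time `now`."""
        self._last_seen[broker_id] = now

    def failed(self, now):
        """Brokers whose last heartbeat is older than the timeout."""
        return sorted(
            broker_id
            for broker_id, seen in self._last_seen.items()
            if now - seen > self.timeout
        )

monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("broker-A", now=0.0)
monitor.record("broker-B", now=0.0)
monitor.record("broker-A", now=4.0)   # broker-B stays silent
print(monitor.failed(now=5.0))
```

The followers can apply the same check to the heartbeats they receive from the leader, so that a silent leader is detected in the same way.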

The main stages of the system functioning based on the distributed leader approach are outlined below.

#### *5.3. Functioning Stage*


### *5.4. Follower Failure*


### *5.5. Leader Failure*


### *5.6. New Fog Device Addition*


#### *5.7. Fog Node Failure*

In the case of fog node failure, there can be a situation in which the assigned tasks cannot be performed. In this case, the fog broker can reschedule the problem, taking into account the fact that the time for user computations has decreased. The failed fog node state is put into the ledger as "no resource available".
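The inputs for such a rescheduling step can be sketched in Python. The dictionary of node states stands in for the latest ledger records, and the function names and numeric values are hypothetical; the scheduling algorithm itself is not shown.

```python
NO_RESOURCE = "no resource available"

def mark_failed(states, node_id):
    """Return a copy of the node states with the failed node marked in it."""
    updated = dict(states)
    updated[node_id] = NO_RESOURCE
    return updated

def search_space(states):
    """Nodes that can still receive tasks."""
    return [node for node, state in states.items() if state != NO_RESOURCE]

def remaining_time(t_const, elapsed):
    """Time left for the user computations after the failure is detected."""
    return max(0.0, t_const - elapsed)

# Hypothetical latest workloads taken from the ledger.
states = {"fog-1": 0.3, "fog-2": 0.6, "fog-3": 0.1}
states_after_failure = mark_failed(states, "fog-2")

candidates = search_space(states_after_failure)
time_left = remaining_time(t_const=120.0, elapsed=45.0)
print(candidates, time_left)
```

The broker would then solve the scheduling problem again over `candidates`, with `time_left` as the new bound on the completion time.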

#### **6. Conclusions**

In the current paper, the architecture and some functional methods are proposed for decreasing the time needed to solve the resource-saving scheduling problem.

Elements of distributed ledger technology, namely the block lattice and Nano concepts, were chosen for the data provisioning implementation, while viewstamped replication protocol elements were chosen for the control and recovery of the CFB network. The architecture and methods proposed provide the CFB nodes with the data needed for the following:


Furthermore, the possibility of processing the local data without node polling shortens the scheduling problem-solving time, thus increasing the time available for solving the functional user tasks. The additional time makes it possible to decrease the fog nodes' workload, and in this way resource saving is achieved.

**Funding:** This study received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.
