2. Literature Review
In this section, the most popular models for cloud and fog computing are reviewed, their resource monitoring and provisioning mechanisms are explained, and the CSB jobs are described. With this information, we can better understand the strengths and weaknesses of each model and choose the one best suited to our needs.
Al-Ayyoub et al. [38] presented the DRPM system, a multi-agent system designed to manage the cloud provider’s resources while considering the customers’ quality of service requirements, which are governed by the SLA. In the event of physical machine overload, the DRPM system’s host fault detection (HFD) algorithm selects which virtual machines should be moved to mitigate the issue, taking into account the source of the overload and making an informed decision accordingly. The DRPM system was evaluated and tested with the CloudSim toolkit with respect to resource utilization, power consumption, and SLA violations.
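To make the idea concrete, the following Python sketch illustrates one way such an overload-detection and VM-selection step could work; the thresholds, the Host/VM structures, and the selection rule are illustrative assumptions, not the published HFD algorithm.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VM:
    vm_id: int
    cpu: float   # CPU demand, as a share of the host's capacity
    ram: float   # RAM demand, as a share of the host's capacity

@dataclass
class Host:
    host_id: int
    vms: List[VM] = field(default_factory=list)

    def load(self, resource: str) -> float:
        # Aggregate demand of all hosted VMs for one resource type.
        return sum(getattr(vm, resource) for vm in self.vms)

def detect_overload(host: Host, threshold: float = 0.9) -> Optional[str]:
    # Return the resource responsible for the overload, if any.
    for resource in ("cpu", "ram"):
        if host.load(resource) > threshold:
            return resource
    return None

def select_vm_to_move(host: Host) -> Optional[VM]:
    # Pick the VM contributing most to the overloaded resource,
    # so that relocating a single VM relieves the pressure.
    cause = detect_overload(host)
    if cause is None:
        return None
    return max(host.vms, key=lambda vm: getattr(vm, cause))

host = Host(1, [VM(1, 0.50, 0.20), VM(2, 0.30, 0.30), VM(3, 0.25, 0.10)])
print(detect_overload(host))      # 'cpu' (total CPU demand 1.05 > 0.9)
print(select_vm_to_move(host))    # VM 1, the largest CPU consumer
```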
Vecchiola et al. [39] designed the Aneka system, which provides a PaaS for cloud environments as a platform for .NET-based applications. This system makes it easy to develop, deploy, and manage applications in the cloud. It offers APIs and a runtime environment that can be used on public and private cloud platforms, such as GoGrid and Amazon EC2. Furthermore, the Aneka model also proposes resource monitoring techniques guided by the SLA. This method helps ensure that new jobs from cloud users are completed on time and within the parameters set by SLAs, by estimating the time required to complete these tasks and matching them with available resources. If the calculated completion time for the new workloads fits the SLA, the system proceeds as is; otherwise, it engages further cloud resources to stay within the SLAs.
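As a rough illustration of this SLA-driven placement logic (not Aneka’s actual API), the check can be sketched as follows; the task length, resource speed, and deadline fields are assumed units, and the fallback resource stands in for newly provisioned capacity.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Task:
    length: float          # work size, e.g., millions of instructions
    sla_deadline: float    # seconds allowed by the SLA

@dataclass
class Resource:
    speed: float           # throughput, e.g., MIPS
    queued_work: float     # work already assigned to this resource

def estimated_completion(task: Task, res: Resource) -> float:
    # Time to finish the existing backlog plus the new task on this resource.
    return (res.queued_work + task.length) / res.speed

def place_task(task: Task, resources: List[Resource]) -> Resource:
    # Prefer a resource that can meet the SLA deadline; if none can,
    # this is the point where extra cloud capacity would be engaged.
    feasible = [r for r in resources
                if estimated_completion(task, r) <= task.sla_deadline]
    if feasible:
        return min(feasible, key=lambda r: estimated_completion(task, r))
    extra = Resource(speed=1000.0, queued_work=0.0)  # assumed newly provisioned VM
    resources.append(extra)
    return extra

pool = [Resource(500.0, 4000.0), Resource(800.0, 6000.0)]
job = Task(length=2000.0, sla_deadline=10.0)
chosen = place_task(job, pool)
print(estimated_completion(job, chosen) <= job.sla_deadline)  # True
```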
Peter [33] described fog computing along with its real-time applications and showed that fog computing can run and handle the Big Data produced by IoT devices. It was also demonstrated that fog computing may address latency and congestion problems. This approach works by estimating the time required to complete new tasks from cloud users and then matching them with the available resources and the SLA deadline.
Chiang and Zhang [36] showed that the IoT and fog computing are two important emerging technologies that are beginning to see greater integration. They gave an overview of some of the challenges and concerns that come with evolving IoT systems and of how fog computing can help to address them. They also discussed how new business opportunities can be developed by using modern storage, computing, and networking architectures. In addition to reviewing the characteristics and advantages of this fog architecture, the authors recommended solutions for several IoT problems.
Other related papers have reviewed fog computing architectures, such as that of Dastjerdi et al. [40]. Instead of using the cloud, their model operates the IoT in the local fog. In the suggested architecture, fog services are positioned within a software-defined resource management layer [41]. Fog cells are analyzed, arranged, and provisioned by cloud-based middleware.
Kong et al. [42] analyzed how fog computing accompanies and extends cloud computing and examined how the former integrates with the IoT. Their architecture was evaluated using traffic lights and wind farms as use cases.
Agarwal et al. [43] described the CSB architecture, shown in Figure 5, which is split into four parts:
A: The User Interface serves as the link between the user and the CSB. The Application Interpreter describes how tasks should be performed, what QoS should be provided, and what is to be implemented by the user. The Service Interpreter identifies the service requisites needed for execution; these requirements include a few other essential details, such as the type of service and its location. The Credential Interpreter examines the user’s credentials to grant access to the required services.
B: The Core Services layer is used by the CSB to match user requests with the most appropriate cloud services. The Service Negotiator gathers requirements from the User Interface and passes them on to the Scheduler, which uses these requirements to identify the cloud services that best suit the user’s needs. The Service Monitor is a constantly running program that checks the availability of cloud services and discovers newly available ones, allowing the state of the cloud to be tracked and ensuring that services are always up and running.
C: The Execution Interface is where the Job Dispatcher executes user applications by combining data files with the user application code and sending the resulting packages to cloud resources; this layer provides the infrastructure needed to make this happen. The Job Monitor is an essential tool for tracking the progress of a job and ensuring that the results are delivered to the customer upon completion.
D: The Persistence Layer is key to maintaining the state of the User Interface, Execution Interface, and Core Services in the event of broker failure.
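The role of such a persistence layer can be pictured with a small, hypothetical checkpoint/restore routine; the use of a JSON file and the particular state fields are assumptions made purely for illustration.

```python
import json
from pathlib import Path

STATE_FILE = Path("broker_state.json")  # assumed checkpoint location

def checkpoint(state: dict) -> None:
    # Persist the current state of the User Interface, Core Services,
    # and Execution Interface so the broker can survive a crash.
    STATE_FILE.write_text(json.dumps(state))

def recover() -> dict:
    # Restore the last checkpoint after a broker failure, or start fresh.
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"pending_requests": [], "scheduled_jobs": [], "running_jobs": []}

state = recover()
state["pending_requests"].append({"user": "u1", "service": "vm.small"})
checkpoint(state)
```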
Alwada’n et al. [44] presented a new system called Dynamic Congestion Management (DCM) that uses multiple agents to take various elements into account when making decisions: the cloud providers’ limitations, the number and types of resources available, the customers’ QoS requirements, and the customers’ conditions (service demands).
Figure 6 shows the Dynamic Congestion Management (DCM) system architecture proposed in [
44]. The system is partitioned into three parts.
A: In the Cloud Service Users section, each client is given a local agent that handles its requests in order of priority. The agent is also responsible for sending the marked requests to the Classifier inside the CSB.
B: In the case of job congestion, the Cloud Service Broker’s Classifier identifies the jobs’ priorities, determining which should be run first. When cloud users keep sending large numbers of job requests to the CSB, the CSB might randomly reject some of them. The Classifier in the DCM system uses the a priori task categorizations made by the local agent to manage the requests flowing into the CSB and thus avoid random rejection of requests. If the CSB is not overloaded, the Classifier processes the incoming requests in first-in, first-out (FIFO) order. Conversely, the Classifier applies the weighted fair queuing (WFQ) mechanism to resolve congestion.
C: The Service Provider’s physical and virtual machines (bandwidth, CPU, storage, RAM, etc.) receive the job requirements from the CSB via an agent for each resource; these agents report to the Job Monitor with periodic update messages on the job status.
Due to restricted resources (memory, energy, etc.), processing the data they share can be laborious and typically cannot be performed by IoT devices themselves [45]. Thanks to the adoption of the cloud paradigm, companies have been able to overcome this capacity restriction by outsourcing intensive computing processes to the cloud. However, this offloading comes at a cost to the quality of service (QoS) provided, including increased latency caused by the distance between the cloud and the end devices, network overhead, and increased security and privacy risk. Over the past few years, the edge computing paradigm has been advocated as a way to lessen this penalty. Edge computing enables the transfer of computing tasks to nodes located closer to end devices (one hop away from them). As a result, these tasks run closer to the data source and the data consumer, improving the quality of service provided [46].
However, according to some researchers, edge computing addresses the offloading of computing work from the cloud to the last hop before smart devices, while others claim it merely comprises the collection of devices at the end of the computing chain, including IoT devices [47]. It becomes essential to adopt technology that allows real-time data collection and analysis while also ensuring process and product quality [48]. Edge computing (EC) must thus be introduced in order to guarantee the quality of the production process and of the final goods (zero-defect manufacturing) [48].
Additionally, some studies advise the use of edge computing in various domains, for applications with stringent response times, and for those who wish to lower their infrastructure costs or control their privacy better. The precise coverage, target areas, and advantages of edge computing are therefore not yet agreed upon [49].
3. System Model
The proposed multi-agent DCMB system is designed to take several factors into account when making decisions, utilizing different agents to make informed choices. The system was created to reduce congestion and improve the flow of request traffic through the broker. The factors considered include urgent requests (jobs) coming from the fog layer, the large number of cloud requesters, customers’ technical and QoS requirements, and the scope and limitations of cloud provider resources. The system is essentially adapted from Cisco’s queuing algorithms [50] and the DCM system proposed by Alwada’n et al. [44].
Figure 7 shows the DCMB system’s architecture, which comprises four main sections: the Cloud Users, Cloud Service Broker, Fog Service Broker, and Cloud Service Provider sections.
3.1. Part 1: The Cloud Users
This part assigns a local agent to each customer, which marks the customer’s requests according to their priority and forwards the marked requests to the CSB (specifically, to its Classifier entity). The marking carried out by the local agent is considered an additional paid service, whereby clients can purchase priority cloud use. If this service is not chosen, the local agent assigns the request a low priority level. User requests contain job conditions (specifications), including hardware specifications, required software, and virtual machine (VM) type. The local agent marks the users’ requests as low, medium, or high priority, which enables the user to expedite some demanding jobs when required.
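The marking step can be pictured with a short sketch; the rule that unmarked requests default to low priority follows the description above, while the function and field names are illustrative assumptions.

```python
PRIORITIES = ("low", "medium", "high")

def mark_request(request: dict, purchased_priority: str = None) -> dict:
    # The local agent tags each request before forwarding it to the CSB
    # Classifier. Priority marking is a paid service; without it the
    # request defaults to low priority.
    if purchased_priority in PRIORITIES:
        request["priority"] = purchased_priority
    else:
        request["priority"] = "low"
    return request

job = {"hardware": "4 vCPU / 8 GB RAM", "software": "Python 3.11", "vm_type": "general"}
print(mark_request(dict(job), "high")["priority"])  # 'high'
print(mark_request(dict(job))["priority"])          # 'low'
```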
3.2. Part 2: The Cloud Service Broker
The cloud center is considered in this part. Cloud users (including IoT devices and fog brokers) submit jobs to the CSB, which is the core of the DCMB system. During congestion, the Classifier determines which job should be run first. Due to the large number of requests that the CSB receives, it may be forced to delay or reject some of them. Instead of delaying or rejecting requests in FIFO order, the Classifier can utilize the request marking previously applied by the agent on the user side and by the fog broker in the fog layer to classify the requests in a way that respects the customers’ QoS requirements. If there is no congestion in the CSB, the Classifier employs the FIFO algorithm to process the incoming requests. In the event of congestion, the Classifier relies on WFQ to resolve it. WFQ automatically sorts streams into queues and assigns a weight to each queue according to its priority. Initially, the process is a round robin in which time slices are allotted to each queue in equal shares, but once the round is completed, each traffic flow is served by the processor in proportion to its weight. The WFQ mechanism solves the starvation issue that may arise when round robin is used alone.
Figure 8 illustrates the work of the Classifier when receiving cloud users’ requests, IoT devices’ requests, and fog broker requests using the WFQ mechanism. Furthermore, requests that need additional classification are kept on hold in the Broker Buffer used by the Classifier.
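A minimal sketch of how the Classifier might combine the FIFO and WFQ behaviour is given below; the per-class weights and the congestion threshold are assumptions, and real WFQ implementations typically use virtual finish times rather than this simplified weighted round robin.

```python
from collections import deque
from itertools import count

WEIGHTS = {"high": 3, "medium": 2, "low": 1}   # assumed per-class weights
CONGESTION_THRESHOLD = 100                      # assumed queue-length trigger

class Classifier:
    def __init__(self):
        self._arrivals = count()
        # One queue per priority class, filled in arrival order.
        self.queues = {p: deque() for p in WEIGHTS}

    def enqueue(self, request):
        # Requests arrive already marked by the local agent or the fog broker.
        request["_seq"] = next(self._arrivals)
        self.queues[request.get("priority", "low")].append(request)

    def congested(self):
        return sum(len(q) for q in self.queues.values()) > CONGESTION_THRESHOLD

    def next_round(self):
        if not self.congested():
            # No congestion: drain everything in plain arrival (FIFO) order.
            batch = [r for q in self.queues.values() for r in q]
            for q in self.queues.values():
                q.clear()
            return sorted(batch, key=lambda r: r["_seq"])
        # Congestion: weighted round robin, serving each class in proportion
        # to its weight so lower priorities still progress (no starvation).
        batch = []
        for priority, weight in sorted(WEIGHTS.items(), key=lambda kv: -kv[1]):
            q = self.queues[priority]
            for _ in range(min(weight, len(q))):
                batch.append(q.popleft())
        return batch
```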
In the next step, the Scheduler runs the requests that the Classifier selected to run first. In response to a user request, the Scheduler selects the most appropriate cloud service based on the request’s application and requirements. The Service Monitor tracks the status of cloud services and periodically checks for new ones. Subsequently, the Job Dispatcher obtains the users’ jobs, bundles the user application with its data files, and sends the package to a particular cloud resource for execution. The Job Monitor tracks the running status of jobs so that the results can be sent back to the client once the jobs are completed. The job status can be viewed and managed through an agent attached to the cloud resource. In cases where the broker fails for any reason (e.g., the broker’s system is down or has crashed), the state of every CSB entity is saved in a database so it can be recovered once the failure is fixed. Finally, once a task is finished, the agent attached to the resource sends the results back to the Job Dispatcher, which reports this to the Job Monitor while simultaneously sending the outputs back to the local agent on the cloud user side.
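The flow from dispatching to monitoring to returning results could be sketched as follows; the class names and the in-memory dictionary standing in for the persistence database are assumptions, not the components’ actual interfaces.

```python
class JobMonitor:
    def __init__(self, database):
        self.database = database  # stands in for the persistence layer

    def update(self, job_id, status, result=None):
        # Record every status change so state survives a broker failure.
        self.database[job_id] = {"status": status, "result": result}

class ResourceAgent:
    """Agent attached to a cloud resource; executes a job and reports back."""
    def execute(self, job_id, package, dispatcher):
        result = f"output of {package['code']} on {package['data']}"
        dispatcher.job_finished(job_id, result)

class JobDispatcher:
    def __init__(self, monitor, user_agent):
        self.monitor = monitor
        self.user_agent = user_agent   # local agent on the cloud-user side

    def dispatch(self, job_id, code, data, agent):
        package = {"code": code, "data": data}   # bundle application + data files
        self.monitor.update(job_id, "running")
        agent.execute(job_id, package, self)

    def job_finished(self, job_id, result):
        # Report completion to the Job Monitor and return results to the user.
        self.monitor.update(job_id, "done", result)
        self.user_agent(result)

db = {}
dispatcher = JobDispatcher(JobMonitor(db), user_agent=print)
dispatcher.dispatch("job-1", "app.py", "input.csv", ResourceAgent())
print(db["job-1"]["status"])   # 'done'
```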
3.3. Part 3: The Fog Service Broker
This part describes the high-level integration of the IoT with the edge-fog and their relationship with the cloud. Some of the components in this part are adapted from the model developed by Tuli et al. [51], which is described particularly clearly. IoT devices observe the external environment and translate any given directive into physical actions. IoT devices are connected to nearby gateways by wired or wireless communication protocols. The Fog Gateway offers user interfaces for applications to facilitate users’ credentials, report service expectations, access the backend program, receive service results, request resources from the computing infrastructure according to availability, and manage IoT devices. It employs either the Constrained Application Protocol (CoAP) or the Simple Network Management Protocol (SNMP) for communication. General Computing Nodes are responsible for data storage, application execution, and other operational management functions. Repository Nodes store the infrastructure and user credentials as well as data from IoT devices, back up the application catalog, and extend cloud storage to assist in data management operations.
The Fog Service Broker (FSB) is responsible for operating and managing the fog layer operations, including:
A: Finding suitable computing nodes to run the most time-sensitive data requiring a quick response from the fog layer, which cannot wait for feedback from the cloud.
B: Finding a suitable computing node (or nodes) or a suitable repository node to run or store data whose response or action can be delayed by seconds or minutes.
C: Sending less time-sensitive data to the cloud on a delayed basis, for archiving and historical analysis, long-term storage, and Big Data analytics.
D: If very urgent operations must be processed and the FSB cannot find suitable resources for them, it sends the request to the CSB for urgent processing. Before sending this request, it informs the CSB of its urgency by marking the request as a high-priority job. The CSB in turn applies the WFQ mechanism for classification, which helps to accelerate the processing of prioritized FSB requests. A sketch of the routing logic across cases A–D is given after this list.
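Under the assumption of three urgency classes, the FSB’s routing across cases A–D might look like the following; the time thresholds, the simplified capacity model, and the helper names are illustrative assumptions, not part of the cited design.

```python
def route(task, fog_compute_nodes, repository_nodes):
    """Decide where a fog task should go; returns (destination, request).

    fog_compute_nodes / repository_nodes are lists of dicts with a
    'free_capacity' field -- an assumed, simplified resource model.
    """
    def first_free(nodes):
        return next((n for n in nodes if n["free_capacity"] >= task["demand"]), None)

    if task["max_delay_s"] < 1:                       # A: cannot wait for the cloud
        if first_free(fog_compute_nodes):
            return "fog-compute", task
        # D: no suitable fog resource -> escalate to the CSB as high priority
        task["priority"] = "high"
        return "cloud-broker", task
    if task["max_delay_s"] < 60:                      # B: tolerates seconds/minutes
        if first_free(fog_compute_nodes) or first_free(repository_nodes):
            return "fog", task
    # C: archival / historical analysis goes to the cloud on a delayed basis
    return "cloud-archive", task

compute = [{"free_capacity": 2}]
repos = [{"free_capacity": 10}]
print(route({"demand": 1, "max_delay_s": 0.2}, compute, repos)[0])   # 'fog-compute'
print(route({"demand": 5, "max_delay_s": 0.2}, compute, repos)[0])   # 'cloud-broker'
print(route({"demand": 1, "max_delay_s": 3600}, compute, repos)[0])  # 'cloud-archive'
```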
Another point worth noting is that, in some cases, the Fog Gateway may be able to communicate directly with the CSB without going through the FSB first. This can occur for jobs that are less sensitive to delay.
3.4. Part 4: The Cloud Service Providers
This part includes the virtual and physical machines (CPU, storage, RAM, bandwidth, etc.). Every physical machine can host more than one VM, and every VM has an agent that is accountable for the job attributes received from the cloud broker and for informing the Job Monitor of the job’s condition through frequent messages. In other words, the agent is responsible for monitoring the execution of jobs at the cloud providers.
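As a final illustration, the periodic status messages from a VM agent to the Job Monitor could look like the loop below; the reporting interval, progress model, and message format are assumptions for illustration only.

```python
import time

def vm_agent(job, report, interval_s=5.0, max_reports=3):
    # Each VM's agent periodically reports the job status to the Job Monitor.
    for _ in range(max_reports):
        report({"vm_id": job["vm_id"], "job_id": job["job_id"],
                "status": "running", "progress": job.get("progress", 0.0)})
        if job.get("progress", 0.0) >= 1.0:
            break
        job["progress"] = min(1.0, job.get("progress", 0.0) + 0.5)
        time.sleep(interval_s)
    report({"vm_id": job["vm_id"], "job_id": job["job_id"], "status": "finished"})

vm_agent({"vm_id": "vm-7", "job_id": "job-1"}, report=print, interval_s=0.0)
```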