1. Introduction
The Internet of Vehicles (IoV) is an emerging concept in intelligent transportation systems that aims to improve traffic safety and passenger comfort through integration with the Internet of Things (IoT), and it is an important implementation of the IoT [
1]. Based on Vehicle to Everything (V2X) technology, the IoV connects vehicles, roadside units (RSUs), and service providers into a single organic network, enabling all-round communication among them [
2]. Smart vehicles in the IoV can communicate via V2X. Specifically, a smart vehicle can share information with other vehicles through Vehicle to Vehicle (V2V) communication; it can thus obtain a wider view of the road conditions reported by surrounding vehicles, greatly reducing traffic accidents caused by blind spots [
3].
As the number of vehicles on the road continues to increase and the IoV continues to evolve, smart vehicles account for an increasing share of Internet-connected devices. In the IoV paradigm, smart vehicles are equipped with computing units and communication technologies that provide services such as intelligent control, traffic management and interactive applications. The edge-side computing architecture for autonomous vehicles (AVs) relies on the communication infrastructure and services provided by edge-cloud collaboration and Long-Term Evolution/5th-Generation (LTE/5G) mobile communication technology. The edge side mainly comprises on-board edge computing units, RSUs, Mobile Edge Computing (MEC) servers, etc. AVs are equipped with a large number of sensors that collect data for different types of traffic systems, navigation applications, etc. [
4]. Currently, each AV is equipped with 60 to 100 electronic control units to support various functions such as communication, engine control, dashboard, seat control, entertainment, etc. AVs generate a large amount of data in real time while driving; for example, an AV will generate and consume about 40 TB of data for every eight hours of driving (for comparison, a city’s High-Definition (HD) map is about 1.5 TB) [
5]. Although AVs carry small-scale computational and storage resources, the computational capacity of vehicle terminals is relatively limited given the huge computational tasks and the very high demand for real-time system response; e.g., in hazardous situations, the response time of the vehicle braking system directly determines the safety of the vehicle, its passengers, and the road, so AVs must rely on external computational resources [
The MEC servers attached to the roadside units can play an important role in improving the performance of mobile vehicle terminals [
7].
Since MEC servers operate at the edge of the radio access network, connecting to RSUs and performing transmission tasks with their help, their service area may be limited by the radio coverage of the RSUs. Due to the high mobility of the nodes, a moving vehicle may pass through multiple RSUs and MEC servers during task offloading and can offload its computational tasks to any accessible MEC server, which then provides computational resources for the offloaded tasks [
8], as shown in
Figure 1. The load on the MEC servers varies greatly from moment to moment; sometimes, there are a large number of users using the same server at the same time, causing a high load situation, while at other times, only a few users are connected to the server and the server load is low [
9]. Therefore, the key problem is how the mobile node makes the offloading decision, i.e., how it selects the appropriate edge servers to offload its computational tasks so that the total computational latency is minimized [
10].
A centralized offloading strategy has been used in some studies, where vehicles can request information about MEC servers available for offloading along the road from a centralized server located at a higher level [
11]. The centralized server is connected to the MEC servers through a wired network. However, this centralized architecture is not feasible at scale: as the number of vehicles on the road continues to increase, centralizing road information places an ever higher load on the network [
9]. Offloading strategies based on Vehicle-to-Vehicle communication (V2V) architectures are discussed in other studies, where vehicles can send tasks to the MEC server through other vehicles, each of which can act as an intermediate node for data delivery [
4]. This solution requires the presence of other vehicles, which cannot be guaranteed. This raises the question: if a mobile node only knows the MEC server it is currently observing in sequence, with no global information (or only local information) about the candidate MEC servers, how should it choose the optimal MEC server for offloading?
Such problems are known as Optimal Stopping Theory (OST) problems. The basic question in OST is when a node should autonomously decide to act, based on sequentially observed random variables, with the goal of maximizing expected benefit or minimizing expected cost [
12]. The Secretary Problem (SP), the House Sale (HS) problem or the Fair Coin Problem are some well-known OST problems [
12]. Single-node OST offloading problems have been studied, and in this paper, we attempt to study OST-based offloading problems for structured tasks with multiple subtasks and provide lightweight local algorithms that can be implemented in mobile nodes (vehicles or smartphones) so that mobile nodes can make offloading decisions independently.
The remainder of this paper is organized as follows: we provide a summary of the related work and outline our contributions in
Section 2, while details of the system model and the problem formulation are described in
Section 3. The OST-based task offloading models are described in
Section 4.1 and
Section 4.2. Performance evaluation is provided in
Section 5. Finally,
Section 6 concludes the paper and outlines future research directions.
3. System Model and Problem Description
Consider a vehicle networking application scenario studied in [
4,
10]: mobile nodes travel on the road with RSUs and base stations deployed on the roadside, where a limited set of MEC servers with storage units and computation units are deployed [
21]. Vehicles traveling on the road generate computational tasks and try to offload them to the MEC servers along the road. As user requirements become more complex, individual tasks gradually fail to meet user needs, and mobile nodes may generate a large number of structured tasks. For example, a structured task generated to meet business travel needs can be divided into three subtasks: weather forecasting (t1), flight reservation (t2), and hotel reservation (t3). t2 and t3 are independent of each other and thus can be executed in parallel, but they both depend on the results provided by t1 [
22]. So, the three subtasks can be executed in the order t1, t2, t3 for offloading, and thus, the structured task can be offloaded in chunks by dividing it into multiple subtasks [
20].
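As an illustration, such a dependency-respecting offload order can be computed with a topological sort. The following is a minimal sketch; the labels for the three subtasks and all other names are ours:

```python
# Minimal sketch (all names are ours) of ordering the subtasks of a
# structured task so that each chunk is offloaded only after the
# subtasks it depends on.
from graphlib import TopologicalSorter  # Python 3.9+

# Business-travel example: the flight and hotel reservations both depend
# on the weather forecast; the two reservations are mutually independent.
dependencies = {
    "t1_weather": set(),
    "t2_flight": {"t1_weather"},
    "t3_hotel": {"t1_weather"},
}

offload_order = list(TopologicalSorter(dependencies).static_order())
print(offload_order)  # the weather subtask comes first; the other two may run in parallel
```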
The MEC server can operate at the edge of the network with the help of the RSU, so its communication range will be limited by the RSU communication range [
5]. Since the input data of the vast majority of tasks are much larger than the output data of the computation results, to ensure continuity of task completion it can be assumed that the servers host mobility management entities implementing mobility management algorithms, such as path selection and power control algorithms [
23,
24] or as the prediction model in [
4]. If the node needs to obtain results from a server other than the one selected for offloading the task, and the mobile node is outside the range of the selected MEC server, the selected MEC server can use a high-bandwidth wired connection to transmit the task results to the node through the next MEC server along the route [
25], thus solving the problem of continuity in the transmission of computational results after the vehicle is out of communication range of the MEC server.
The mobile node observes the MEC servers along the route in order; let X_i be the random variable denoting the processing time of the ith server observed. When a node generates tasks locally and needs to offload them, it can observe n MEC servers; the mobile node uses its network and server analyzer to check each MEC server it passes and obtain the X values. Assuming that K blocks of tasks need to be offloaded within the n observation opportunities, K offloading decisions need to be made.
The current goal is to optimize the decision of offloading multiple tasks to the available servers. Overall, the offloading objectives are twofold: (1) maximize the probability of offloading to the K optimal servers; (2) minimize the expected value of the total X over the K tasks. We first propose an offloading model for the first objective that is generalizable to the distribution of X, and then we propose a better offloading model for the second objective (minimizing the expected value of the total X over the K tasks) for certain road scenarios where X may be uniformly distributed.
Table 1 provides the key notations used in this paper.
5. Experimental Evaluation
We use two settings to evaluate the two proposed OST-based multi-task offloading models: a simulation-based evaluation and an evaluation based on real-world data sets. In both settings, we compare our models, namely K-Best (
Section 4.2.1) and KBU (
Section 4.2.2) with the Optimal model [
19], BCP model [
17], Random selection model (Random) and the
p-stochastic model (
p-model). We will discuss these models next.
The four offloading models used for comparison in a multitask offloading scenario are described as follows:
The Optimal model takes the number of observations n, the probability distribution function of X, the quality-aware function and the threshold as inputs, and outputs the index s at which the node starts checking the MEC servers. If a subsequently observed X does not exceed the threshold, the node offloads; otherwise, it continues to observe. When the number of remaining unselected servers m does not exceed the number k of remaining unoffloaded tasks, i.e., m ≤ k, the remaining tasks are offloaded to the next servers sequentially without comparison, so that all tasks are guaranteed to be offloaded.
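A minimal sketch of this threshold rule follows; the function and variable names are ours, and the quality-aware function used to derive the starting index and threshold is taken as given rather than computed:

```python
def optimal_offload(xs, s, threshold, K):
    """Sketch (names are ours) of the threshold rule: skip servers before
    index s, then offload a task to any server whose observed processing
    time X does not exceed `threshold`; once the number of remaining
    servers m is no larger than the number of remaining tasks k, offload
    to every remaining server so all K tasks are placed."""
    n = len(xs)
    chosen = []
    for i, x in enumerate(xs):
        m = n - i              # servers still observable (including this one)
        k = K - len(chosen)    # tasks still to offload
        if k == 0:
            break
        if m <= k:             # forced offloading: no room left to be picky
            chosen.append(i)
        elif i + 1 >= s and x <= threshold:
            chosen.append(i)
    return chosen

# n = 6 servers, K = 2 tasks, start checking at server 2, threshold 0.3.
print(optimal_offload([0.9, 0.2, 0.5, 0.25, 0.8, 0.4], s=2, threshold=0.3, K=2))
# → [1, 3]
```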
The BCP model takes the number of observations n and the probability distribution function of X as inputs and calculates the stopping point; it rejects the servers observed before that point and then offloads a task to the first server with the best ranking relative to all previously observed servers (if any), repeating this until all K tasks are placed. As with the Optimal model, if the number of remaining unselected servers m does not exceed the number k of remaining unoffloaded tasks, i.e., m ≤ k, the remaining tasks are offloaded to the next servers in turn to ensure that all tasks can be offloaded.
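A sketch of this rank-based rule with the same forced-offloading fallback; the names are ours, and we use the classic n/e stopping point as an assumed choice for the rejection phase:

```python
import math

def bcp_offload(xs, K):
    """Sketch (names are ours): reject the first r = floor(n/e) servers,
    then offload to each server whose observed X beats every server seen
    so far; force-offload once remaining servers m <= remaining tasks k,
    so that all K tasks are placed."""
    n = len(xs)
    r = math.floor(n / math.e)          # classic 1/e stopping point
    best_so_far = min(xs[:r], default=float("inf"))
    chosen = []
    for i in range(r, n):
        m, k = n - i, K - len(chosen)
        if k == 0:
            break
        if m <= k or xs[i] < best_so_far:
            chosen.append(i)
        best_so_far = min(best_so_far, xs[i])
    return chosen

# n = 6 servers, K = 2 tasks: servers 0-1 are observed but rejected.
print(bcp_offload([0.9, 0.5, 0.3, 0.8, 0.1, 0.6], K=2))  # → [2, 4]
```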
In the p-model, we use p = 0.8 as the offloading probability for each server in the following experiments, i.e., while moving, the node connects to the MEC server it is currently observing and offloads to it with a probability of 80%. In the Random model, the mobile node randomly selects K servers for task offloading. In both models, when the number of remaining unselected servers m does not exceed the number k of remaining unoffloaded tasks, i.e., m ≤ k, the remaining tasks are offloaded to the next servers sequentially without further observation, so that all tasks are guaranteed to be offloaded.
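The two non-OST baselines can be sketched in the same style (all names are ours):

```python
import random

def p_model_offload(xs, K, p=0.8):
    """Sketch (names are ours): offload to the currently observed server
    with probability p, with the same forced-offloading fallback so that
    all K tasks are placed."""
    n, chosen = len(xs), []
    for i in range(n):
        m, k = n - i, K - len(chosen)
        if k == 0:
            break
        if m <= k or random.random() < p:
            chosen.append(i)
    return chosen

def random_offload(xs, K):
    """Sketch: pick K distinct server indices uniformly at random."""
    return sorted(random.sample(range(len(xs)), K))
```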
To simulate the MEC environment, we use SimPy [
35] in Python. SimPy is a process-based discrete-event simulation framework. Each MEC server is modeled as a resource; during the simulation, the server publishes its processing time
X, and the connected mobile node can receive this information. The mobile node is modeled as a process that checks the processing time of each server under a one-directional mobility model and chooses whether or not to offload. The parameter values of the simulation experiment are shown in
Table 2.
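The experiments use SimPy; as a dependency-free illustration, the following sketch (all names are ours) captures the same observe-and-decide loop, with the published X values and the decision rule as stand-ins:

```python
def simulate_drive(published_x, decide):
    """Dependency-free sketch of the simulation loop (the actual
    experiments use SimPy; all names here are ours). The node moves in
    one direction past the servers; server i publishes its processing
    time X, and decide(i, x) returns True when the node offloads."""
    offloads = []
    for i, x in enumerate(published_x):
        if decide(i, x):
            offloads.append((i, x))
    return offloads

# Illustration: offload whenever the published X is at most 0.3.
print(simulate_drive([0.9, 0.2, 0.5, 0.25, 0.8], lambda i, x: x <= 0.3))
# → [(1, 0.2), (3, 0.25)]
```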
5.1. Evaluation Based on Simulated Data
In the evaluation based on simulated data, the node knows in advance the distribution function of the random variable X, i.e., the server utilization or CPU utilization. We run experiments with the numbers of servers n and tasks K fixed as listed in Table 2. A series of random numbers obeying a specific distribution, such as the normal or uniform distribution, is generated by Python functions. Consider first the case where X follows a normal distribution.
As shown in
Figure 4, the K-Best model is compared with the four comparison models when the distribution of
X is normal. The K-Best model clearly performs best in terms of the total offloading delay. We can observe a clear overlap between the K-Best model and the Optimal model in
Figure 4a, and this overlap is significantly reduced in the comparison plots with the other three models. It can be seen that the K-Best model works significantly better than the latter three models and approaches the Optimal model. Overall, the OST-based models (K-Best, Optimal and BCP) are significantly more effective than the other models that are not based on OST. It can be seen in
Figure 5 that the OST-based models achieve a lower expected total processing delay, and the K-Best model has similar results to the Optimal and BCP models, while it differs more from the
p-model and random model.
In the previous experiments, the random variable X observed by the mobile nodes followed a normal distribution. Now, let X be uniformly distributed on [0,1], where X is the server utilization or CPU utilization; for example, X = 0.5 means that the CPU utilization of the server is 50%. For all models, the following experiments follow the same steps as the previous ones.
As shown in
Figure 6, the KBU model clearly performs the best compared to the remaining four models, and it can be seen in
Figure 6a that the KBU model overlaps significantly with the K-Best and Optimal models. Overall, the four OST-based models perform better than the other models. The KBU model achieves the smallest expected processing delay with
X following a uniform distribution, and the K-Best model also achieves a more desirable expected processing delay, which is significantly better than the
p-model and the Random model, as can be seen in
Figure 7.
Sensitivity Analysis
As shown in
Figure 8, the total processing delay of the K-Best model tends to decrease as the number of servers increases while the number of tasks K is held constant.
Figure 8a shows the case when the random variable
X obeys a normal distribution, and the case when the random variable
X obeys a uniform distribution is shown in
Figure 8b. We can see that when the number of observable servers increases, the K-Best model allows mobile nodes to select the best possible server for task offloading and has general applicability to the distribution of
X.
As shown in
Figure 9, as the number of servers increases, the total processing latency of the KBU model tends to decrease while the number of tasks K is held constant. This means that the KBU model enables mobile nodes to select the best possible server for task offloading as the number of observable servers increases.
5.2. Evaluation Based on Real Data
We also consider real data sets to evaluate our models. The purpose of this evaluation is to see how our models perform when dealing with real data sets. We use the CABS data set provided by the Shenzhen Smart City project [
36] to simulate the movements of the mobile nodes. The data set contains four attributes of the vehicles: vehicle ID, GPS location, movement time and movement speed. The mobility trace here is not used to study user mobility; in our experiment, for each movement, the car picks a server from the server data set, checks that server’s utilization, and decides whether to offload at that time or to continue observing based on the decision suggested by the models, as explained earlier in the simulation evaluation section. An example of one connection observation is shown in
Table 3; we can see that Cab 178 made connections to three servers at different times, where two connections were made to server m_1938 at almost the same location but at different times, while obtaining server information, i.e., CPU utilization.
The processing time of the servers is provided by the actual CPU utilization obtained in the Alibaba Cluster Tracking Program [
37]. More than one billion rows of CPU utilization data from about 150 servers are recorded in the data set. One million of these records are used in the experiment;
Figure 10 shows the probability distribution of the CPU utilization of all servers in the data set. We can see in
Figure 10 that the CPU utilization follows a normal distribution; the fitted mean and standard deviation, together with the values of the other key parameters in the experiment, are given in
Table 4.
At the beginning of the experiment, the mean and the standard deviation were computed once over the whole server utilization data set to feed the models (K-Best, KBU, Optimal and BCP). In a realistic scenario, a mobile node does not know the mean and standard deviation of a particular MEC server, but it can obtain historical data about the MEC servers in an area over a specific period with the help of the MEC server operators. Therefore, we take this information once when we start the experiment.
Similar to the experiments based on simulated data, the results of each model are aggregated and compared in the experiments based on real data sets. Each model selects K servers for the mobile node to offload to, aiming to minimize the total offloading delay. The total delay is expressed in terms of the total server utilization.
Figure 11 shows the average server utilization for the offloading decisions suggested by each model. We can see that in the results based on real data sets, the OST-based model still performs better at minimizing the total offloading delay, and the KBU model and the K-Best model perform the best. Since in real data sets, the distribution of CPU utilization is closer to a normal distribution than a uniform distribution, the KBU model fails to show a more significant advantage over the K-Best model.
Figure 12 shows the average waiting time for the offloading decision suggested by each model. We can see that the
p-model with p = 0.8 suggests the smallest average waiting time. However, its resulting total offloading delay is too long, so choosing a server instantly is not a wise offloading strategy. Moreover, the optimal server is unknown in advance, and the mobile node cannot know exactly which server is the best choice, so using an OST-based model achieves near-optimal server utilization.
Sensitivity Analysis
We use the number of successful offloads under different threshold requirements as a performance metric for the models. The number of successful offloads is the number of offloading decisions suggested by each model that satisfy a specific requirement. Suppose we have three different MEC applications x, y, and z, each with a specific requirement. For example, application x requires a total CPU utilization ≤0.4, application y requires a total CPU utilization ≤0.6, and application z requires a total CPU utilization ≤0.8.
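This metric can be sketched directly; the per-decision utilization totals below are hypothetical, and all names are ours:

```python
def successful_offloads(decision_totals, requirement):
    """Sketch (names are ours): count the offloading decisions whose
    total CPU utilization satisfies an application's requirement."""
    return sum(1 for u in decision_totals if u <= requirement)

totals = [0.35, 0.55, 0.62, 0.78, 0.41]  # hypothetical totals per decision
for req in (0.4, 0.6, 0.8):              # requirements of applications x, y, z
    print(req, successful_offloads(totals, req))
```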
Figure 13 shows the number of successful offloads for all models under the different requirements. For the first case, requiring a total CPU utilization ≤0.4, the KBU model achieves 89 successful offloads and the K-Best model achieves 90; for the second case, requiring a total CPU utilization ≤0.6, the KBU model achieves 250 successful offloads and the K-Best model achieves 230; for the third case, requiring a total CPU utilization ≤0.8, the KBU model achieves 385 successful offloads and the K-Best model achieves 408. Overall, we can see that the OST-based offloading models (KBU, K-Best and Optimal) are significantly more effective than the other models.