1. Introduction
Vehicular Ad-hoc Networks (VANETs) have helped make transportation systems intelligent by wirelessly connecting moving vehicles to address problems such as traffic congestion, information dissemination, and accidents. In a VANET, short-range wireless transceivers are mounted in vehicles and roadside units (RSUs), such as roadside base stations or access points [1]. The vehicles serve as routing nodes; they are not always directly linked to one another and must then communicate over several hops. Consequently, a multi-hop routing approach is required to find a valid route between the sender and receiver that includes a list of intermediate vehicles [2]. In VANETs, several kinds of wireless connections can be used for data routing. Vehicles can connect directly to each other, called vehicle-to-vehicle (V2V), whereas connections between vehicles and infrastructure are called V2I/I2V. For better connectivity and a backbone network, infrastructure units are also connected to one another, called infrastructure-to-infrastructure (I2I). These connections are shown in Figure 1. V2V allows vehicles to share data cooperatively. The wireless connection between the infrastructure and neighboring vehicles can be used in both directions (V2I and I2V) and provides internet access and current data to vehicles [3].
Routing schemes for VANETs have been studied extensively in the literature [4,5,6]. Routing protocols are divided into five groups: proactive, reactive, hybrid, adaptive, and context-aware. In a proactive routing protocol, route discovery requests are sent to all nodes in the network, which increases control overhead, energy usage, and end-to-end (E2E) delay. In a reactive routing protocol, by contrast, the source node initiates the discovery process, and the request reaches only the intended destination. This method reduces control overhead but still requires a path discovery process to find a route for each new node [4]. The hybrid routing protocol combines the proactive and reactive approaches. The nodes in a hybrid network are grouped into areas called clusters. The clustering architecture improves network scalability by using proactive intra-cluster routing and reactive inter-cluster routing. As a result, scalability in the VANET environment is improved, and control message overhead is reduced. Although clustering reduces routing control overhead, frequent cluster head (CH) re-elections add control overhead of their own [5]. The adaptive routing protocol can cope with the varying network topologies, node mobility, and complex wireless conditions caused by interference and mobility. To address the problem of heavy congestion, context-aware routing integrates external information resources such as maps, location services, or even public transportation programs [6].
When developing a routing protocol, it is critical to consider the problems and characteristics of the infrastructure on which it will be used. The challenges include high node mobility, dynamically changing topology, scalability, reliability, fault tolerance, energy consumption, uneven traffic density, neighborhood discovery, delay constraints, and real-time transmission [7]. In highly complex networks such as VANETs, reliability is the most difficult problem to solve. A valid route can become invalid after a brief period because vehicle communication breaks down frequently due to the high speed at which vehicles travel. Using the shortest route for data communication between network nodes without considering route reliability may be costly, because such a path may soon become invalid and interrupt data transmission frequently [8]. In VANET, there are two types of reliability, which are described below:
Link Reliability: Link reliability is the probability that a connection remains uninterrupted for a definite period of time. Given a prediction time $T_p$ for the continuous availability of a dedicated link $l$ between two nodes at time $t$, the link reliability $r_t(l)$ is

$$r_t(l) = P\{\, l \text{ remains available until } t + T_p \mid l \text{ is available at } t \,\} \qquad (1)$$

Route Reliability: In VANETs, several possible paths may exist between the sender node $s_r$ and the receiver node $d_e$, where each path is a sequence of links along the dedicated route. For any given path $P(s_r, d_e)$, let $\Omega$ denote the number of its established links $l_1, l_2, \ldots, l_\Omega$. The route reliability $R(P(s_r, d_e))$ for path $P$ is described as follows:

$$R(P(s_r, d_e)) = \prod_{\omega=1}^{\Omega} r_t(l_\omega) \qquad (2)$$

where $r_t(l_\omega)$ is the link reliability of link $l_\omega$ as calculated in Equation (1).
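The route reliability definition can be illustrated with a minimal Python sketch. It assumes that route reliability is the product of the per-link reliabilities along the path; the function names and the two candidate routes are hypothetical and serve only as an illustration.

```python
from math import prod


def route_reliability(link_reliabilities):
    """Multiply the per-link reliabilities of one path (assumed product form)."""
    if not link_reliabilities:
        raise ValueError("a route needs at least one link")
    for r in link_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("link reliability must lie in [0, 1]")
    return prod(link_reliabilities)


def most_reliable_route(routes):
    """Return the name of the candidate route with the highest reliability."""
    return max(routes, key=lambda name: route_reliability(routes[name]))


# Hypothetical candidate routes between one sender and one receiver.
candidates = {
    "short": [0.9, 0.5],       # two hops, one weak link
    "long": [0.9, 0.9, 0.9],   # three hops, all strong links
}
best = most_reliable_route(candidates)
```

In this toy case the three-hop route is preferred over the two-hop one, which mirrors the point made above that the shortest route is not necessarily the most reliable.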
In recent years, with the rapid development of bio-inspired and machine learning techniques, researchers have extensively adopted routing protocols based on particle swarm optimization (PSO) [9], artificial bee colony (ABC) [10], ant colony optimization (ACO) [11], genetic algorithm (GA) [12], harmony search (HS) [13], support vector machine (SVM) [14], reinforcement learning (RL) [15], and k-means [16] to recognize and route packets among nodes more effectively [17,18]. Machine learning is a collection of mathematical models that can make predictions and decisions based on a large amount of data. This ability may be critical in VANETs [19,20]. In route selection, context details such as communication type, E2E link dependency, and packet load size can also boost the performance of the VANET system. These observations encourage the adoption of machine learning techniques to mitigate the challenges of routing between vehicles. Thus, a context-aware reliable routing protocol is proposed that incorporates k-means and SVM approaches to provide a better quality of service (QoS) in VANET.
1.1. Research Contributions
The major contributions of the proposed protocol are as follows:
Introduces a context-aware method that distinguishes traffic flows by their context information to minimize communication overhead.
Designs a machine-learning-based routing scheme that uses k-means and SVM for optimal route selection, delivering reliability and robustness against network malfunction, dynamic topology, and variable mobility in VANET.
Adopts packet delivery ratio (PDR) and E2E delay as routing metrics so that the most reliable route is selected during transmission.
1.2. Organization
The rest of the paper is structured as follows. Section 2 surveys related research in this area. Section 3 discusses the k-means clustering and SVM techniques. In Section 4, the proposed context-aware routing protocol is illustrated. The performance parameters to be measured are presented in Section 5. Lastly, the conclusion is discussed in Section 6.
2. Background Survey
A review of the current literature shows that routing protocols for VANET have been researched widely.
A cluster-based lifetime routing protocol called CBLTR [21] aims to maximize route stability and average throughput in a bidirectional segment scenario. Within each cluster, the CH is elected by considering the vehicle's lifetime as one of the parameters. The CHs select the optimal route according to their current location, the destination location, and the average throughput. The protocol also minimizes the control overhead exchanged between the cluster members and the CH. The simulation results show improvements in E2E delay and throughput. QoSBeeVanet, a QoS multi-path routing protocol, is proposed in [22]. It is based on a biological model of bee communication during the search for food sources. It uses scouts and foragers to explore the network and transfer data to the destination. Every scout records its data in the routing table and assesses route quality using a weighting factor. The hybrid bee swarm routing (HyBR) approach for VANET was introduced in [23]. HyBR is a unicast and multicast routing protocol that supports road safety by delivering packets with minimal latency and a high delivery ratio. At high network density, it uses scouts and foragers, inspired by bee communication, to explore the network. At low density, it uses a geography-based approach in which a GA determines the shortest route between source and destination.
A hybrid clustering mechanism is proposed in [24], which merges context-based and geographic-based clustering methods. During clustering, every node calculates a weight based on specific parameters: velocity, distance, residual lifetime, point of interest, and direction. The node with the maximum weight is chosen as CH. The approach reduces network overhead, and its destination-aware inter-cluster routing improves the overall PDR and reduces the E2E delay. A hybrid, multipath ACO-based routing approach (MAZCORNET) is proposed in [25] to determine multiple paths among vehicular nodes. In MAZCORNET, the network is split into numerous zones. A proactive mechanism determines paths within each zone, and a reactive mechanism determines routes between zones using the local data accumulated in each zone. This technique is scalable and fault-tolerant. CBQoS [26] is a QoS-based unicast routing protocol for VANET. It combines two procedures: a clustering approach that organizes and optimizes the exchange of routing information to meet QoS requirements, and an ABC algorithm that determines the optimal paths between source and destination using QoS parameters such as available bandwidth, E2E delay, and link expiration time.
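The weight-based CH election described for the hybrid clustering scheme [24] can be illustrated with a minimal Python sketch. The equal weights, the `Vehicle` fields, and the assumption that every parameter has already been normalized to a score in [0, 1] (higher is better) are all hypothetical; the weighting formula actually used in [24] is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    vid: str
    # Hypothetical per-parameter scores, assumed pre-normalized to [0, 1].
    velocity: float
    distance: float
    residual_lifetime: float
    point_of_interest: float
    direction: float


# Hypothetical equal weights over the five parameters named in the text.
WEIGHTS = {
    "velocity": 0.2,
    "distance": 0.2,
    "residual_lifetime": 0.2,
    "point_of_interest": 0.2,
    "direction": 0.2,
}


def ch_weight(vehicle, weights=WEIGHTS):
    """Weighted sum of the vehicle's parameter scores."""
    return sum(w * getattr(vehicle, name) for name, w in weights.items())


def elect_cluster_head(vehicles):
    """The node with the maximum weight becomes the cluster head."""
    return max(vehicles, key=ch_weight)


cluster = [
    Vehicle("v1", 0.5, 0.5, 0.5, 0.5, 0.5),
    Vehicle("v2", 0.9, 0.8, 0.7, 0.6, 1.0),
    Vehicle("v3", 0.1, 0.9, 0.2, 0.3, 0.0),
]
head = elect_cluster_head(cluster)
```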
An enhanced HS optimization (EHSO) algorithm [27] tunes the parameters of optimized link state routing (OLSR) using two common selection techniques in the harmony memory: roulette wheel and tournament selection. According to simulation results for a highway scenario, EHSO outperforms OLSR in terms of PDR and routing overhead. A location-based geocast routing protocol [28] uses PSO with a next-vehicle approach and a fitness function designed to locate local and global maxima quickly. The fitness function maximizes the delivery ratio and minimizes delay, routing load, and packet drops when choosing an appropriate next-hop vehicle, so that information reaches the geocast region on time. Because the fitness function used in PSO minimizes delay and routing load, the protocol achieves better performance.
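The two selection techniques named for EHSO, roulette wheel and tournament selection, can be sketched generically in Python. This is not the EHSO implementation from [27]; the candidate names and fitness values are hypothetical and only illustrate how each scheme picks a candidate.

```python
import random


def roulette_wheel_select(population, fitness, rng):
    """Pick one item with probability proportional to its non-negative fitness."""
    total = sum(fitness)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    pick = rng.random() * total          # uniform in [0, total)
    running = 0.0
    for item, f in zip(population, fitness):
        running += f
        if pick < running:
            return item
    return population[-1]                # guard against rounding at the upper end


def tournament_select(population, fitness, rng, size=2):
    """Sample `size` distinct contenders and return the fittest of them."""
    contenders = rng.sample(range(len(population)), size)
    winner = max(contenders, key=lambda i: fitness[i])
    return population[winner]


rng = random.Random(1)
candidates = ["a", "b", "c"]             # hypothetical parameter settings
picked = roulette_wheel_select(candidates, [0.0, 0.0, 1.0], rng)
champion = tournament_select(candidates, [0.2, 0.9, 0.5], rng, size=3)
```

Roulette wheel selection keeps some chance for weaker candidates, whereas a tournament over the whole population always returns the fittest one; smaller tournaments sit between the two extremes.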
The literature also includes several studies [29,30,31,32] that apply machine learning methods to the routing problem in VANET. A greedy forwarding routing algorithm [29] for VANET is based on the SVM technique. In this approach, the SVM processes the data and generates routing metrics to improve routing performance. The model is obtained by training the SVM on a large labeled dataset whose features include the distance between the forwarding node and the next-hop node, the moving direction, the acceleration, and the moving direction of the next forwarding node. The simulation results show improved reliability and communication efficiency. To estimate the information required by routing protocols, a routing information scheme known as machine learning-assisted route selection (MARS) is proposed in [30]. MARS uses machine learning in the RSUs to keep track of road details. It can forecast vehicle movement and choose appropriate routing paths with higher communication capacity for packet transmission. MARS can also assist in determining the forwarding path between two RSUs according to the expected destination position and the approximate communication delays in both directions.
For the VANET architecture, HQVR, a heuristic Q-learning-based routing algorithm [31], chooses an intermediate hop based on the reliability of the connection. HQVR is a distributed algorithm whose learning process relies on data collected by transmitting beacon packets. According to the authors, the convergence of the Q-learning algorithm depends on the rate of beacon messages, which slows convergence. The relation length ratio determines the learning rate in HQVR, and the learning rate in turn affects how quickly the Q-learning procedure converges. As a result, the need for exploration decreases with a higher-quality link, and the source can select the optimal path from among the several options. RLRC [32], a reinforcement learning (RL) based routing protocol for VANET, groups vehicles into clusters using an enhanced form of K-Harmonic Means (KHM). Since RLRC creates clusters to reduce the size of the state space, the CH is required to exchange a large number of packets with the cluster members (CMs) of its cluster.
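The role of the learning rate in Q-learning-based routing can be made concrete with a minimal Python sketch of the standard Q-learning update. Modeling the state as the current vehicle and the action as the chosen next hop is a hypothetical simplification for illustration, not the HQVR design; the reward and parameter values are likewise made up.

```python
def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one standard Q-learning update and return the new Q-value.

    q maps state -> {action: value}; alpha is the learning rate and
    gamma the discount factor.
    """
    next_best = max(q.get(next_state, {}).values(), default=0.0)
    old = q.setdefault(state, {}).get(action, 0.0)
    q[state][action] = old + alpha * (reward + gamma * next_best - old)
    return q[state][action]


# Hypothetical table: neighbor B already values forwarding to destination D at 1.0.
q_table = {"B": {"D": 1.0}}
new_value = q_update(q_table, state="A", action="B", reward=0.5,
                     next_state="B", alpha=0.5, gamma=0.9)
```

With `alpha=0.5` the estimate moves halfway from its old value (0.0) toward the target 0.5 + 0.9 × 1.0 = 1.4, giving 0.7; a larger learning rate would move it further in a single step.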
Graph-based deep learning models for communication networks are discussed from various aspects in [33], which also lists the open problems and the corresponding Graph Neural Network (GNN) based solutions. The methods for constructing wireless communication graphs for different wireless networks and the progress of various classical GNN paradigms are discussed in [34]. A GNN-based deep reinforcement learning (DRL) architecture [35] can generalize to network topologies not seen during training. The deep graph reinforcement learning (DGRL) method [36] makes effective use of network resources, improving the data delivery rate and reducing the delay.
When choosing a CH, RLRC [32] takes the vehicle's energy into account as the first parameter. Bandwidth is chosen as the second parameter for selecting the CH to ensure smooth connectivity. In terms of relative distance, the least distant node is chosen as the CH. The SARSA model is used to optimize the routing mechanism of RLRC, which reduces learning time. By creating clusters, RLRC reduces the size of the state space and speeds up convergence.
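For reference, the standard SARSA update can be written as follows. The symbols are the usual textbook ones and are not taken from the RLRC paper: $\alpha$ is the learning rate, $\gamma$ the discount factor, and $r_{t+1}$ the reward received after taking action $a_t$ in state $s_t$.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]
```

Unlike Q-learning, which bootstraps from the maximum Q-value in the next state, SARSA uses the value of the action $a_{t+1}$ that the current policy actually selects.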
Table 1 presents a comparative analysis of the surveyed protocols in VANETs.
4. Mathematical Analysis for Proposed Protocol
The operation of the proposed protocol is defined below:
Link reliability is the likelihood that a direct connection between two vehicles remains continuously available over a definite time duration. Given a prediction time $T_p$ for the continuous availability of a particular link $l$ between two vehicles at time $t$, the link reliability $r_t(l)$ is specified as below:

$$r_t(l) = P\{\, l \text{ remains available until } t + T_p \mid l \text{ is available at } t \,\}$$
In the proposed work, the Euclidean distances between the data points and the centroids are calculated to allocate each point to its closest centroid. The dataset is generated from the following parameters: the location, direction, and velocity of the vehicle, and the Point of Interest (POI).
A process for clustering $N$ data inputs $x_1, x_2, \ldots, x_N$ into $k$ clusters $C_1, C_2, \ldots, C_k$, each comprising $n_j$ data points, minimizes the following mean-square-error (MSE) value:

$$\mathrm{MSE} = \frac{1}{N} \sum_{j=1}^{k} \sum_{x_t \in C_j} \lVert x_t - c_j \rVert^2$$

where $x_t$ is a vector representing the $t$th input and $c_j$ is the geometric centroid of the cluster $C_j$. The objective is a squared error function that measures the distance between data point $x_t$ and the cluster center $c_j$.
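A small worked example may help here. It assumes the MSE is the sum of squared distances to the assigned centroids divided by $N$; the four one-dimensional inputs are made up for illustration. Take $x = 1, 3, 8, 10$ and $k = 2$.

```latex
% Grouping A: C_1 = \{1, 3\},\ c_1 = 2; \quad C_2 = \{8, 10\},\ c_2 = 9
\mathrm{MSE}_A = \tfrac{1}{4}\left[(1-2)^2 + (3-2)^2 + (8-9)^2 + (10-9)^2\right] = \tfrac{4}{4} = 1

% Grouping B: C_1 = \{1\},\ c_1 = 1; \quad C_2 = \{3, 8, 10\},\ c_2 = 7
\mathrm{MSE}_B = \tfrac{1}{4}\left[(1-1)^2 + (3-7)^2 + (8-7)^2 + (10-7)^2\right] = \tfrac{26}{4} = 6.5
```

Grouping A has the lower MSE, so it is the partition that k-means would prefer for these inputs.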
Here, $c_1, c_2, \ldots, c_k$ are known as the cluster centers, which are obtained by the following steps:
4. Set the $k$ cluster centers $c_1, c_2, \ldots, c_k$. For each input $x_t$ and each of the $k$ clusters, repeat the following two steps until all clusters converge.
5. Evaluate the cluster membership value using Equation (4), assigning each input to the one of the $k$ clusters whose center is nearest to it.
6. For each of the $k$ clusters, set $c_j$ to the centroid of all data inputs in cluster $C_j$.
7. Consequently, k-means clustering divides the routes into two clusters named GOOD and BAD. The cluster with the high mean square error (MSE) is labeled BAD, and the cluster with the low MSE is labeled GOOD. The data accumulated from the simulations are then used to train the SVM, which is retrained in every iteration with random inputs until the best results are achieved. The pseudo-code is given in Algorithm 1.
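8. In this step, the Radial Basis Function (RBF) is used to transform the given input vector into n-dimensional data. The Gaussian RBF is expressed as follows:

$$K(c_1, c_2) = \exp\left(-\frac{\lVert c_1 - c_2 \rVert^2}{2\sigma^2}\right)$$

where $K(c_1, c_2)$ represents the kernel function for two classes $c_1$ and $c_2$, $\lVert c_1 - c_2 \rVert$ is the Euclidean distance between them, and $\sigma$ is the kernel width.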
For this, we gather data from the simulations and train the SVM with arbitrary inputs in every iteration until the optimal result is attained. The input data are first normalized before being used for training. The SVM recognizes malicious vehicle activity in the network and transmits the results to the response unit, which applies its own set of rules to produce an outcome.
9. After training on the routing data with SVM, we evaluate the following parameters of each route from source to target during execution:
PDR: The ratio of all packets successfully received at the receiver to all the data packets transmitted by the source vehicle.
Average E2E delay: The average time that the packets take to reach the destination.
Throughput: The total number of packets transferred from the sender to the receiver node in a given amount of time.
Determine the routes with low PDR, high E2E delay, and low throughput, and determine the nodes that occur frequently in these routes.
10. After determining the nodes that occur frequently in the non-optimal routes, the proposed approach eliminates the routes that contain nodes from the BAD cluster and shifts the load of the malicious nodes to their nearby nodes to maintain reliability.
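The last three steps above (the Gaussian RBF kernel, the per-route metrics, and the elimination of routes through BAD nodes) can be sketched with a few Python helpers. Everything here is an illustrative assumption rather than the authors' implementation: the kernel uses the width parameter `sigma`, throughput is taken as received packets per second, the sample routes are made up, and the frequently occurring intermediate nodes are simply treated as the BAD-cluster nodes. Load shifting to nearby nodes is not modeled.

```python
import math
from collections import Counter


def rbf_kernel(c1, c2, sigma=1.0):
    """Gaussian RBF: exp(-||c1 - c2||^2 / (2 * sigma^2)); sigma is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def route_metrics(sent, received, delays_s, duration_s):
    """Return (PDR, average E2E delay in s, throughput in packets/s) for a route."""
    pdr = received / sent if sent else 0.0
    avg_delay = sum(delays_s) / len(delays_s) if delays_s else float("inf")
    throughput = received / duration_s
    return pdr, avg_delay, throughput


def frequent_nodes(poor_routes, min_count=2):
    """Intermediate nodes that appear in at least `min_count` poor routes."""
    counts = Counter(node for hops in poor_routes for node in set(hops[1:-1]))
    return {node for node, count in counts.items() if count >= min_count}


def eliminate_bad_routes(routes, bad_nodes):
    """Keep only the routes that contain no node from `bad_nodes`."""
    return {name: hops for name, hops in routes.items()
            if not bad_nodes.intersection(hops)}


# Hypothetical routes from source "s" to destination "d".
routes = {
    "r1": ["s", "a", "d"],
    "r2": ["s", "b", "d"],
    "r3": ["s", "b", "c", "d"],
}
# Suppose r2 and r3 showed low PDR, high delay, and low throughput.
suspects = frequent_nodes([routes["r2"], routes["r3"]])
kept = eliminate_bad_routes(routes, suspects)
```

Only the intermediate hops are counted in `frequent_nodes`, since the source and destination appear in every route by construction.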
Algorithm 1: K-means Clustering-based VANET Routing
Input Parameters:
(a) Set of routes
(b) Initial number of clusters k
(c) Direction, velocity, and location of each vehicle node
Output Parameters:
(a) Optimal clusters: GOOD and BAD
1. Randomly initialize the k centroids in space
2. For each route do
3.   Calculate the cluster membership function
4.   Assign the route to the cluster of its nearest centroid according to the membership value
5. End for
6. If all routes are assigned to a cluster, then
7.   Stop clustering
8. Else
9.   Update the centroids and return to step 2
10. End if
11. If MSE of Cluster = Low, then
12.   Cluster = GOOD
13. Else
14.   Cluster = BAD
15. End if
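Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: each route is represented by a hypothetical two-dimensional feature vector, the loop stops when assignments no longer change, and GOOD/BAD labels are assigned by comparing the mean squared distance of each cluster's members to its own centroid.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def kmeans(points, k, rng, max_iter=100):
    """Lloyd-style k-means; returns (assignment per point, centroids)."""
    centroids = rng.sample(points, k)            # random initial centroids
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:         # converged: nothing moved
            break
        assignment = new_assignment
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = centroid(members)
    return assignment, centroids


def label_clusters(points, assignment, centroids):
    """Label the cluster with the lowest per-cluster MSE GOOD, the rest BAD."""
    mse = {}
    for j, c in enumerate(centroids):
        members = [p for p, a in zip(points, assignment) if a == j]
        mse[j] = (sum(sq_dist(p, c) for p in members) / len(members)
                  if members else float("inf"))
    good = min(mse, key=mse.get)
    return {j: ("GOOD" if j == good else "BAD") for j in mse}


# Hypothetical route features: four tightly grouped routes, four scattered ones.
route_features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0),
]
assignment, centroids = kmeans(route_features, k=2, rng=random.Random(42))
labels = label_clusters(route_features, assignment, centroids)
```

In this toy data the tightly grouped routes end up in the GOOD cluster because their mean squared distance to the centroid is far smaller than that of the scattered group.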
5. Simulation Analysis
To test the efficiency of the proposed protocol, a 1000 × 1000 area is considered for the simulation. The performance of the proposed protocol is compared with CBLTR [21] and Aravindhan et al. [24] in terms of throughput, PDR, and E2E delay. Table 2 lists the simulation parameters.
To evaluate the proposed protocol, 100 nodes are initially distributed over the network area, and each vehicle is given a constant velocity from one of the following ranges: 50–70 km/h and 0–100 km/h. Varying speed and density parameters were considered in the simulation mainly to account for transmission and link failures caused by instability in speed and density among vehicles. Moreover, these two parameters play a crucial role in the lifetime of the transmission connection and the quality of the routes established among the vehicles.
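The initial node setup can be sketched in Python as follows. Uniform random placement, the dictionary layout, and the range names are assumptions made for illustration; the actual simulator configuration is given by the simulation parameters in Table 2.

```python
import random

AREA_SIDE = 1000.0                      # side of the square area (units as in the text)
NUM_NODES = 100
SPEED_RANGES_KMH = {                    # the two velocity ranges named in the text
    "narrow": (50.0, 70.0),
    "wide": (0.0, 100.0),
}


def place_vehicles(num_nodes, area_side, speed_range, rng):
    """Place vehicles uniformly at random and give each a fixed speed."""
    low, high = speed_range
    return [
        {
            "id": i,
            "x": rng.uniform(0.0, area_side),
            "y": rng.uniform(0.0, area_side),
            "speed_kmh": rng.uniform(low, high),
        }
        for i in range(num_nodes)
    ]


rng = random.Random(0)                  # fixed seed for repeatable runs
vehicles = place_vehicles(NUM_NODES, AREA_SIDE, SPEED_RANGES_KMH["narrow"], rng)
```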
PDR specifies the percentage of data packets arriving at the destination relative to the total number of packets transmitted to the destination. Table 3 shows how PDR varies with vehicle speed. The simulation results show that PDR decreases as vehicle velocity increases. This is because, at high speed, the positions of nodes change more frequently, and hence more packets are dropped. CBLTR [21] uses a weighting mechanism to select the next forwarding node, choosing the node nearest to the destination. However, such nodes are generally near the boundary of the transmission region and leave it very quickly. The proposed approach increases route reliability by considering various parameters and hence reduces the failure probability at higher speeds.
Figure 5 shows the percentage improvement in PDR of the proposed protocol over the Aravindhan et al. [24] and CBLTR [21] protocols with varying mobility. In both Aravindhan et al. [24] and CBLTR [21], the packet delivery ratio starts to decrease as vehicle velocity increases. In the proposed approach, however, the most reliable connection is chosen using k-means, and the minimum-cost node is elected (suitable velocity, shorter distance, and same direction as the current node), which reduces the probability of route failure.
Table 4, Table 5 and Table 6 compare PDR for varying numbers of vehicular nodes for the proposed, Aravindhan et al. [24], and CBLTR [21] protocols in network areas of 1000 × 1000, 1200 × 1200, and 1500 × 1500, respectively. They show that the delivery ratio initially increases with vehicular node density. This is because, with few vehicles on the roads, it is difficult to find nearby vehicles; hence, packets are dropped once the waiting time is over. CBLTR [21] has the lowest PDR of all the protocols. Moreover, CBLTR [21] must discover the route before transmitting the basic information, and because its clusters change frequently, the created route must be maintained. Unlike CBLTR [21], the approach of Aravindhan et al. [24] only identifies the next available forwarding node, and hence it adapts much better than CBLTR [21] to the varying network topology of VANET.
Figure 6, Figure 7 and Figure 8 show the improvement in PDR with varying vehicular node density in the network areas of 1000 × 1000, 1200 × 1200, and 1500 × 1500, respectively. The proposed protocol improves PDR compared to CBLTR [21] and Aravindhan et al. [24] because of effective route selection using the k-means and SVM approach. With 50 vehicular nodes in the simulation, the average PDR of the proposed protocol improves by 2.5% over the Aravindhan et al. [24] protocol and by 8.2% over the CBLTR [21] protocol, as shown in Figure 6.
Average E2E delay is the average time taken by a data packet to travel from the source to the destination. Table 7 and Table 8 compare the delay for varying numbers of vehicle nodes for the proposed, Aravindhan et al. [24], and CBLTR [21] protocols in network areas of 1000 × 1000 and 1500 × 1500, respectively. In VANETs, the probability of connection failure and the data packet delay both increase as the distance between nodes increases. However, the proposed protocol chooses stable paths, and very few connections break during data transmission; this reduces the E2E delay.
Figure 9 and Figure 10 show the improvement in delay with varying vehicular node density in the network areas of 1000 × 1000 and 1500 × 1500, respectively. When vehicular node traffic increases, E2E delay also rises. The E2E delay is highest in CBLTR [21]. The primary cause is that CBLTR [21] considers a single parameter for neighboring nodes, so the selected node is always its nearest neighbor. The proposed protocol addresses this issue by using k-means and considering various parameters for route selection. As shown in Figure 9, the average E2E delay is reduced at a steady rate across the different vehicle densities.