5.1. Monitoring Information Aggregation Utilizing Bayesian Network
Based on the information collected by the detection scheme described above, it is both possible and necessary to calculate a “faith” value for a suspicious node’s trustworthiness. Since a suspicious node might take part in a carousel attack, a stretch attack, or both (exhibiting more than one suspicious behavior), we introduce a Bayesian network to aggregate and further analyze the gathered information. A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies (expressed as conditional probabilities) by means of a directed acyclic graph (DAG).
Our Bayesian network models a set of nodes in terms of their status (compromised or not) and their behaviors. It can be used to predict the most likely status of a node from past observation records of its behaviors.
One way to compute this prediction is the maximum likelihood approach, in which the Bayesian network is learned from the collected data. These data are used to estimate the network’s parameters, which in turn describe the status of the nodes. Note that the datasets do not have to be complete; datasets obtained from real networks are usually incomplete. The approach is based on the likelihood principle, which favors the predictions (or estimates) with maximal likelihood; in other words, it prefers the estimates that maximize the probability of observing the collected datasets [47].
Naturally, alternatives to this learning process are available, such as the Bayesian approach or the constraint-based approach. They are capable but either require more input or impose additional constraints [47].
The Bayesian network employed in this paper is illustrated in Figure 4. It aims to examine a node’s “health” status (compromised or not), denoted by the variable H. Two symptoms are considered: “node is part of a route loop” (denoted by the variable L) and “node is part of a stretched route” (denoted by the variable S). The variables H, L and S are binary and take the value T (true) or F (false).
Figure 4 shows only the structure of the network; the details of learning for information aggregation are given as follows.
Table 2 shows an example of an incomplete dataset containing three recorded data cases. A data case is a record of the set of symptoms exhibited by a node, in other words, a record with a certain combination of instantiations (h, l and s). For example, one combination denotes that the node has not been compromised and has previously participated in forming a route loop and a stretched route; another denotes that the node has been compromised and has not previously participated in forming any route loop or stretched route. The symbol “?” denotes an undetermined value of a variable.
The goal is to calculate the expected empirical distribution of the node status H based on the incomplete dataset.
Table 3 lists the assumed initial estimates, which are based on common sense; for instance, a compromised node is more likely to have participated in forming a route loop or a stretched route in a previous route discovery process.
The expected empirical distribution of an incomplete dataset is defined in terms of three quantities: an event, which consists of a certain combination of instantiations; the size of the dataset; and, for each data case, the variables whose values are undetermined in that case.
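For orientation, a common textbook form of the expected empirical distribution in the EM literature is sketched below. The notation is assumed here for illustration: $\mathcal{D}$ is the incomplete dataset, $\theta^{k}$ the current parameter estimates, $\alpha$ an event, $N$ the number of data cases, $d_i$ the $i$-th case and $\mathbf{C}_i$ the variables whose values are undetermined in $d_i$. It may differ in detail from the definition intended in this section.

```latex
\Pr\nolimits_{\mathcal{D},\theta^{k}}(\alpha)
  \;=\; \frac{1}{N}
  \sum_{d_i,\,\mathbf{c}_i \,\models\, \alpha}
  \Pr\nolimits_{\theta^{k}}(\mathbf{c}_i \mid d_i)
```

The sum ranges over every data case $d_i$ and every completion $\mathbf{c}_i$ of its undetermined variables such that the completed case satisfies $\alpha$.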
For instance, consider the instantiation in which the node is not compromised, has not been part of a route loop and has participated in forming a stretched route in a previous route discovery; its probability follows directly from this definition. The same calculation can be repeated for every other instantiation until the probabilities of all of them are obtained.
Then, the expectation–maximization (EM) estimate of the probability that a node has not been compromised is obtained by summing over all possible values l and s of the variables L and S, respectively. The other parameters are likewise determined from the expected empirical distribution.
All the outcomes derived from Equations (13), (16) and (17) on the incomplete dataset constitute the updated estimates, which replace the initial estimates listed in Table 3.
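The learning step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: it assumes the structure H → L, H → S, takes H = True to mean “compromised”, and uses made-up data cases and initial estimates in place of Tables 2 and 3. The names `joint`, `expected_empirical` and `em_step` are hypothetical.

```python
from itertools import product


def joint(theta, h, l, s):
    """P(H=h, L=l, S=s) under the assumed structure H -> L, H -> S."""
    p_h = theta["h"] if h else 1.0 - theta["h"]
    p_l = theta["l|h"][h] if l else 1.0 - theta["l|h"][h]
    p_s = theta["s|h"][h] if s else 1.0 - theta["s|h"][h]
    return p_h * p_l * p_s


def expected_empirical(data, theta):
    """E-step: expected empirical distribution over full (h, l, s) instantiations."""
    dist = {inst: 0.0 for inst in product([True, False], repeat=3)}
    for case in data:
        # Completions keep the observed values and fill in every None ("?").
        completions = [inst for inst in dist
                       if all(obs is None or obs == val
                              for obs, val in zip(case, inst))]
        norm = sum(joint(theta, *inst) for inst in completions)
        for inst in completions:
            dist[inst] += joint(theta, *inst) / norm / len(data)
    return dist


def em_step(data, theta):
    """M-step: re-estimate all parameters from the expected empirical distribution."""
    dist = expected_empirical(data, theta)

    def pr(pred):
        return sum(p for inst, p in dist.items() if pred(*inst))

    new = {"h": pr(lambda h, l, s: h), "l|h": {}, "s|h": {}}
    for hv in (True, False):
        p_hv = pr(lambda h, l, s: h == hv)
        new["l|h"][hv] = pr(lambda h, l, s: h == hv and l) / p_hv
        new["s|h"][hv] = pr(lambda h, l, s: h == hv and s) / p_hv
    return new


# Hypothetical data cases (h, l, s); None marks an undetermined value.
data = [(None, True, True), (False, False, None), (True, None, False)]
# Hypothetical initial estimates: P(H), P(L | H) and P(S | H).
theta0 = {"h": 0.5,
          "l|h": {True: 0.8, False: 0.2},
          "s|h": {True: 0.7, False: 0.3}}
theta1 = em_step(data, theta0)  # estimates that would replace the initial ones
```

Calling `em_step` repeatedly, on the same or on newly collected cases, gives the iterative refinement; the sketch assumes every initial estimate lies strictly between 0 and 1 so that no division by zero occurs.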
As mentioned in Section 4.2, the vector introduced there is “live” and updates itself continuously. Hence, we can keep observing the nodes’ symptoms from this vector and acquire new incomplete datasets periodically, up to the m-th dataset (m is a positive integer). If we keep accessing new data from this vector, we are always able to obtain estimates with higher likelihood [47].
5.3. Route Discovery Based on AHP
Once the security information has been distributed around the network, the next step is to exploit it to discover the optimal routes with the help of the analytic hierarchy process (AHP) [49]. AHP is one of many multi-criteria decision analysis (MCDA) methods, which were originally developed to help make optimal decisions (in this paper, picking the best route) while taking multiple concerns (for example, energy efficiency and security) into consideration. There are many candidates other than AHP, but none of them, AHP included, is perfect or applicable to every problem.
The “utility function” (see [50] for more details) of each route, as defined in this paper, is hard to construct because vampire attackers still deliver the packets eventually. Hence, in a sense, the energy consumed by attackers cannot be treated as “entirely” wasted. Furthermore, as mentioned in Section 3.1, the extra energy consumed by vampire attacks depends on a series of random parameters; its volume varies considerably and its exact value is difficult to determine, which makes the utility function even harder to construct. The authors of [49] suggest that AHP is particularly helpful when a decision maker has trouble constructing a utility function.
As shown in Figure 5, we can then set the goal of finding the optimal route based on multiple criteria. The top level of the figure is the goal of the decision, the second level of the hierarchy contains the criteria under consideration, and the lowest level shows the available choices (in this paper, all the possible routes). Afterward, the scores (the so-called priorities) of the possible routes can be determined from pairwise comparisons between the criteria preset by a decision maker.
5.4. Details of AHP
In AHP, pairwise comparisons are made between different criteria, so ratio scales must be set up. Each judgement is a relative value, that is, a quotient of two quantities (in this paper, the security concern and the energy efficiency concern). In other words, these relative values (or ratio scales) represent the priority (importance) of each criterion.
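As a small illustration of how pairwise judgements turn into priorities, the sketch below derives criterion weights from a reciprocal comparison matrix using the row geometric mean method, a common approximation of AHP's eigenvector method (the two coincide for a consistent matrix, which any 2 × 2 reciprocal matrix is). The function name `ahp_weights` and the judgement value `k = 2` are assumptions made for the example.

```python
import math


def ahp_weights(matrix):
    """Priorities from a pairwise comparison matrix via row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgement: security is k times as important as energy efficiency.
k = 2.0
# Row/column order: [security, energy efficiency]; entry [i][j] is "i over j".
weights = ahp_weights([[1.0, k],
                       [1.0 / k, 1.0]])
security_weight, energy_weight = weights
```

With more than two criteria the same function applies unchanged, although a consistency check on the judgements would then be advisable.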
The most straightforward linear priority setup, proposed by the authors of [51], is shown in Table 4. In general, a human being cannot simultaneously compare more than 7 (±2) objects [52]. For example, an ordinary person cannot properly assign importance to more than 7 (±2) items; this is the limit of human ability when processing information. To avoid confusion, we chose 7 + 2 = 9 degrees in this paper.
Table 5 provides other choices of priority setup. All these alternatives are constructed based on psychophysics theory, and the validity of each in the decision-making process is commonly evaluated in practical experiments. The question of which scale performs best therefore remains open to debate. Nevertheless, previous experimental results reveal that all of them outperform the basic linear one [53,54,55].
Let us take a simple example. Consider two routes evaluated on two criteria, namely energy efficiency and safety level, where the security concern is twice as important as energy efficiency. Assume Route 2 is 2.5 times as safe as Route 1 but has double the transmission cost of Route 1. The two routes can then be compared on a ratio scale, from which it can be concluded that Route 2 is 5.5 times as good as Route 1.
Alternatively, the comparison can be made on an interval scale, which leads to the same conclusion that Route 2 is the better one.
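Returning to the ratio-scale comparison, one reading that reproduces the figure of 5.5 is a weighted sum of the per-criterion ratios of Route 2 to Route 1, with Route 1 taken as the unit reference on each criterion. This form is an assumption made for illustration; the energy-efficiency ratio is taken as 1/2 because the transmission cost of Route 2 is double that of Route 1.

```latex
\underbrace{2}_{\text{security weight}} \times
\underbrace{2.5}_{\text{safety ratio}}
\;+\;
\underbrace{1}_{\text{energy weight}} \times
\underbrace{\tfrac{1}{2}}_{\text{efficiency ratio}}
\;=\; 5 + 0.5 \;=\; 5.5
```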
Verbal comparisons must be converted to numerical ones when deriving the priorities of each route; for more details, see Section 5.5.
5.5. Priority Calculation in Optimal Route Determination
Assume that a specific i-th route involves a certain total number of nodes. The expected total priority of this route is determined from the priorities of its hops, where the priority of a hop is the priority from the m-th node on this route to its next hop.
A “standard next hop” is defined in advance to allow a fair comparison between different routes: the next-hop node is certainly not compromised (probability 100%); the distance to the next-hop node equals the maximum radio range of the node; and the energy cost of successfully delivering a packet to this next-hop node serves as the reference cost.
As in the example given in Section 5.4, the “relative” priority of each hop, compared with that of the pre-defined “standard next hop”, is calculated from three quantities. The first is the priority (importance) of energy efficiency, which is inversely proportional to the expected transmission cost normalised with respect to the cost of the standard next hop. For details on how to estimate transmission costs, see Section 3.2.1.
The second is the priority (importance) of security concerns, which is proportional to the probability that the next-hop node is not compromised (for more details, see Equation (15) in Section 5.1). The third is the scale of the security concern priority, which states how many times as important the security concern is as the energy efficiency concern. Its exact value can be selected from the options available in Table 5.
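The per-hop calculation described above can be sketched as follows. The way the terms are combined (security scale times security priority, plus energy priority) is an assumption modelled on the example in Section 5.4, and the function and parameter names are hypothetical.

```python
def hop_priority(p_not_compromised, expected_cost, standard_cost, k):
    """Relative priority of one hop (assumed form: k * security + energy).

    p_not_compromised: probability that the next-hop node is not compromised
    expected_cost:     expected transmission cost to the next hop
    standard_cost:     cost of reaching the pre-defined "standard next hop"
    k:                 how many times security matters more than energy
    """
    if expected_cost <= 0 or standard_cost <= 0:
        raise ValueError("transmission costs must be positive")
    # Inversely proportional to the cost normalised by the standard cost.
    energy_priority = standard_cost / expected_cost
    # Proportional to the probability that the next hop is trustworthy.
    security_priority = p_not_compromised
    return k * security_priority + energy_priority


# The standard next hop itself: certainly not compromised, reference cost.
standard = hop_priority(1.0, expected_cost=1.0, standard_cost=1.0, k=2.0)
# A hypothetical hop that is half as trustworthy and twice as costly.
risky = hop_priority(0.5, expected_cost=2.0, standard_cost=1.0, k=2.0)
```

Under this assumed form, a hop scores lower than the standard next hop whenever it is less trustworthy or more expensive to reach.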
5.6. Optimal Route Determination
The optimal route is the one with the maximum total priority, interpreted as the safest route that limits energy consumption as much as possible.
Actual route discovery can be performed by means of existing routing protocols such as AODV, with minimal possible changes in control messages such as RREQs and RREPs.
To be more specific, the “hop count” field is replaced with a corresponding “priority volume count.” In RREQs, the “priority volume count” refers to the total priority volume of the route from the originating node to the node that is handling the route request. In RREPs, the “priority volume count” is the priority volume of the route from the originating node to the destination node. Note that there is one further minor modification: AODV picks the route with the minimum number of hops, whereas the optimal route here is the one with the maximum priority volume.
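The modified selection rule can be illustrated with a short sketch. It does not reproduce real AODV message formats: `RouteReply`, `accumulate` and `select_route` are hypothetical names, and the additive accumulation of per-hop priorities is an assumption suggested by the phrase “total priority volume”.

```python
from dataclasses import dataclass


@dataclass
class RouteReply:
    """Simplified stand-in for an RREP; not the real AODV packet layout."""
    route_id: str
    priority_volume: float  # takes the place of the hop count field


def accumulate(priority_volume, hop_priority):
    """Update the count when a request crosses one more hop (assumed additive)."""
    return priority_volume + hop_priority


def select_route(replies):
    """Pick the reply with the maximum priority volume (AODV picks minimum hops)."""
    return max(replies, key=lambda reply: reply.priority_volume)


# Hypothetical per-hop priorities along two candidate routes.
volume_a = 0.0
for hop in (2.0, 2.2):
    volume_a = accumulate(volume_a, hop)
volume_b = 0.0
for hop in (2.9, 3.0):
    volume_b = accumulate(volume_b, hop)

best = select_route([RouteReply("A", volume_a), RouteReply("B", volume_b)])
```

Only the comparison direction and the meaning of the counter change; the request and reply exchange itself would stay as in the underlying protocol.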