## *4.3. Action Space*

When a task occurs at an IoT device, the device decides, based on the current state information, whether to offload the task to its partner and, if so, which portion to offload. The action set is constructed under the assumption that an IoT device can offload either half or all of the task; it can easily be extended with additional actions for other portions. The action set can therefore be described by

$$\mathbf{A} = \mathbf{A}\_{\mathbf{i}} \times \mathbf{A}\_{\mathbf{j}},\tag{6}$$

where **Ai** and **Aj** are the action spaces for IoT devices *i* and *j*, respectively, which can be defined as

$$\mathbf{A}\_{\mathbf{i}} = \{0, 1, 2\} \tag{7}$$

and

$$\mathbf{A}\_{\mathbf{j}} = \{0, 1, 2\}, \tag{8}$$

where $A\_i = 0$ and $A\_j = 0$ mean that IoT devices *i* and *j*, respectively, do not offload their tasks. $A\_i = 1$ and $A\_j = 1$ denote that IoT devices *i* and *j*, respectively, offload half of the task to their partner. In addition, $A\_i = 2$ and $A\_j = 2$ are the actions where IoT devices *i* and *j*, respectively, offload all of the task to their partner.
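The joint action space of Eq. (6) can be enumerated directly. The following is a minimal sketch; the constant names (`NO_OFFLOAD`, `OFFLOAD_HALF`, `OFFLOAD_ALL`) are illustrative labels introduced here and are not part of the original formulation.

```python
from itertools import product

# Per-device action labels, following Eqs. (7) and (8).
NO_OFFLOAD, OFFLOAD_HALF, OFFLOAD_ALL = 0, 1, 2
A_i = (NO_OFFLOAD, OFFLOAD_HALF, OFFLOAD_ALL)
A_j = (NO_OFFLOAD, OFFLOAD_HALF, OFFLOAD_ALL)

# Joint action space A = A_i x A_j from Eq. (6): one pair per decision epoch.
joint_actions = list(product(A_i, A_j))

print(len(joint_actions))  # 9 joint actions
```

Extending the model to other offloading portions would only require adding further labels to `A_i` and `A_j`.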

## *4.4. Transition Probability*

The state transition probability of IoT device *i* is affected by the state of IoT device *j*. Specifically, the processing speed of the task occurring in IoT device *i* (i.e., the transition probability of $T\_i^M$) depends on whether the task occurring at IoT device *j* is processed in IoT device *i* (i.e., on $T\_j^O$). In addition, the transition probability of $T\_i^O$ is affected by whether IoT device *j* processes its own task (i.e., by $T\_j^M$). Similarly, the state transition probability of IoT device *j* is influenced by the state of IoT device *i* (especially $T\_i^M$ and $T\_i^O$). Therefore, the transition probability from the current state $S$ to the next state $S'$ under the chosen action $A$ can be described by

$$P[S'|S,A] = P[S'\_i|S\_i,T^{M}\_j,T^{O}\_j,A] \times P[S'\_j|S\_j,T^{M}\_i,T^{O}\_i,A],\tag{9}$$

where $S'\_i$ and $S'\_j$ denote the next states of IoT devices *i* and *j*, respectively, and $S\_i$ and $S\_j$ represent their current states.

Meanwhile, $T\_i^M$ and $T\_i^O$ are influenced by the chosen action $A$, and these two states change dependently on each other. In addition, $T\_i^M$ is affected by $T\_j^O$. For example, when the task of IoT device *j* is processed in IoT device *i*, the processing speed of IoT device *i*'s own task can decrease. Similarly, $T\_i^O$ is influenced by $T\_j^M$. For example, when IoT device *j* does not process its own task, it can focus on the task offloaded from IoT device *i*, and the offloaded task can therefore be completed within a short duration. Meanwhile, when the task is processed in IoT device *i*, its energy level can decrease; that is, the transition of $E\_i$ is influenced by $T\_i^M$. The timer for the deadline of the task operates only when the task occurs, and therefore the transition of $D\_i$ is affected by $T\_i^M$ and $T\_i^O$. The other states change independently of each other. Therefore, for the chosen action $A$, the transition probability from the current state of IoT device *i*, $S\_i = [T\_i^M, T\_i^O, E\_i, D\_i]$, to its next state, $S'\_i = [T'^M\_i, T'^O\_i, E\_i', D\_i']$, can be described by

$$P[S'\_i | S\_i, T\_j^M, T\_j^O, A] = P[T'^M\_i | T\_i^M, T\_i^O, T\_j^O, A] \times P[T'^O\_i | T\_i^O, T\_i^M, T\_j^M, A] \times P[E\_i' | E\_i, T\_i^M] \times P[D\_i' | D\_i, T\_i^M, T\_i^O]. \tag{10}$$
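The factorization in Eq. (10) can be sketched as a product of four factor functions. In the sketch below, the `DeviceState` container, the function name, and the argument order of the factor callables are assumptions made for illustration; the constant-valued lambdas are placeholders and not values from the model.

```python
from typing import Callable, NamedTuple


class DeviceState(NamedTuple):
    t_m: int  # T^M: occurrence/processing status of the device's own task
    t_o: int  # T^O: processing status of the task it offloaded
    e: int    # E: energy level
    d: int    # D: deadline timer status


def device_transition_prob(
    nxt: DeviceState,
    cur: DeviceState,
    partner: DeviceState,
    action,
    p_tm: Callable[..., float],
    p_to: Callable[..., float],
    p_e: Callable[..., float],
    p_d: Callable[..., float],
) -> float:
    """Multiply the four factors on the right-hand side of Eq. (10)."""
    return (
        p_tm(nxt.t_m, cur.t_m, cur.t_o, partner.t_o, action)
        * p_to(nxt.t_o, cur.t_o, cur.t_m, partner.t_m, action)
        * p_e(nxt.e, cur.e, cur.t_m)
        * p_d(nxt.d, cur.d, cur.t_m, cur.t_o)
    )


# Placeholder factors with made-up constant values, only to exercise the product.
cur = DeviceState(t_m=0, t_o=0, e=2, d=0)
nxt = DeviceState(t_m=1, t_o=0, e=3, d=0)
partner = DeviceState(t_m=0, t_o=0, e=1, d=0)
prob = device_transition_prob(
    nxt, cur, partner, (0, 0),
    p_tm=lambda *args: 0.5,
    p_to=lambda *args: 1.0,
    p_e=lambda *args: 0.8,
    p_d=lambda *args: 1.0,
)
print(prob)
```

Passing the factors as callables keeps each conditional distribution separately testable; the joint probability of Eq. (9) would then be the product of two such calls, one per device.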

*Energies* **2019**, *12*, 4050

We assume that the inter-task occurrence time of IoT device *i* follows an exponential distribution with mean $1/\lambda\_i$. Then, the probability that a task occurs in IoT device *i* during a decision epoch can be calculated as $\lambda\_i \tau$ [27,31]. Therefore, $P[T'^M\_i | T^M\_i = 0, T^O\_i = 0, T^O\_j, A]$ can be represented by

$$P[T'^M\_i | T^M\_i = 0, T^O\_i = 0, T^O\_j, A] = \begin{cases} 1 - \lambda\_i \tau, & \text{if } T'^M\_i = 0, \\ \lambda\_i \tau, & \text{if } T'^M\_i = 1, \\ 0, & \text{otherwise.} \end{cases} \tag{11}$$
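For an exponential inter-occurrence time, the probability of at least one occurrence within a window of length $\tau$ is $1 - e^{-\lambda\_i \tau}$, and $\lambda\_i \tau$ is its first-order approximation for small $\lambda\_i \tau$. The sketch below compares the two; the function names and the numerical values are illustrative assumptions, not parameters from the paper.

```python
import math


def arrival_prob_linear(lam: float, tau: float) -> float:
    """Per-epoch task occurrence probability lam * tau, as used in Eq. (11)."""
    return lam * tau


def arrival_prob_exact(lam: float, tau: float) -> float:
    """P(exponential inter-occurrence time <= tau) = 1 - exp(-lam * tau)."""
    return 1.0 - math.exp(-lam * tau)


lam, tau = 0.5, 0.1  # illustrative values only
p_arrive = arrival_prob_linear(lam, tau)

# Row of Eq. (11): next T^M_i given T^M_i = 0 and T^O_i = 0.
row = {0: 1.0 - p_arrive, 1: p_arrive}

gap = p_arrive - arrival_prob_exact(lam, tau)
print(row, gap)
```

The gap shrinks as $\lambda\_i \tau$ decreases, so the linear form is most accurate when the decision epoch is short relative to the mean inter-occurrence time.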

Before receiving the result of the offloaded task, IoT device *i* does not generate a new task. Therefore, $P[T'^M\_i | T^M\_i = 0, T^O\_i \neq 0, T^O\_j, A]$ can be defined as

$$P[T'^M\_i | T^M\_i = 0, T^O\_i \neq 0, T^O\_j, A] = \begin{cases} 1, \text{ if } T'^M\_i = 0, \\\ 0, \text{ otherwise.} \end{cases} \tag{12}$$

Meanwhile, when a task occurs (i.e., $T\_i^M = 1$), IoT device *i* decides whether to offload it to IoT device *j* and, if so, which portion (half or all of the task). If IoT device *i* decides not to offload (i.e., $A = 0$), the task state changes to 2, which represents the situation where IoT device *i* processes the whole task by itself (i.e., $T'^M\_i = 2$). On the other hand, when IoT device *i* decides to offload half of the task or all of the task (i.e., $A = 1$ or $A = 2$), the next state of the occurrence and processing status of its task (i.e., $T'^M\_i$) becomes 3 or 4, respectively. Therefore, the corresponding transition probabilities can be represented as

$$P[T'^M\_i | T^M\_i = 1, T^O\_i, T^O\_j, A = 0] = \begin{cases} 1, & \text{if } T'^M\_i = 2, \\ 0, & \text{otherwise,} \end{cases} \tag{13}$$

$$P[T'^M\_i | T^M\_i = 1, T^O\_i, T^O\_j, A = 1] = \begin{cases} 1, & \text{if } T'^M\_i = 3, \\ 0, & \text{otherwise,} \end{cases} \tag{14}$$

and

$$P[T'^M\_i | T^M\_i = 1, T^O\_i, T^O\_j, A = 2] = \begin{cases} 1, & \text{if } T'^M\_i = 4, \\ 0, & \text{otherwise,} \end{cases} \tag{15}$$

respectively.
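Because Eqs. (13), (14), and (15) are deterministic, they reduce to a lookup from the offloading decision of IoT device *i* to the next task state. A minimal sketch follows; the names `NEXT_TM_AFTER_DECISION` and `p_tm_on_decision` are hypothetical.

```python
# Next value of T^M_i when a task has just occurred (T^M_i = 1),
# indexed by the offloading decision of device i; see Eqs. (13)-(15).
NEXT_TM_AFTER_DECISION = {
    0: 2,  # no offloading: device i processes the whole task
    1: 3,  # half offloaded: device i processes the remaining half
    2: 4,  # whole task offloaded
}


def p_tm_on_decision(next_t_m: int, action_i: int) -> float:
    """Probability of next_t_m given T^M_i = 1 and device i's decision."""
    return 1.0 if NEXT_TM_AFTER_DECISION[action_i] == next_t_m else 0.0


# Each decision puts all probability mass on exactly one next state.
for a in (0, 1, 2):
    print(a, [p_tm_on_decision(s, a) for s in range(5)])
```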

We assume that the processing time of IoT device *i* for its own task follows an exponential distribution with mean $1/\mu\_i^{F,S}$ when it processes all of its own task and no task of IoT device *j* is offloaded to it (i.e., $T\_i^M = 2$ and $T\_j^O = 0$). In this case, the probability that the task is completed during a decision epoch is $\mu\_i^{F,S} \tau$ [27,31], and the probability that it is not completed is $1 - \mu\_i^{F,S} \tau$. On the other hand, if some portion of the task of IoT device *j* is offloaded to IoT device *i* (i.e., $T\_j^O \neq 0$), the processing speed of IoT device *i* for its own task decreases, and thus it is assumed that the processing time of IoT device *i* for its own task follows an exponential distribution with mean $1/\mu\_i^{F,D}$ ($> 1/\mu\_i^{F,S}$). In this case, the probability that the task is completed (or not completed) during a decision epoch is $\mu\_i^{F,D} \tau$ (or $1 - \mu\_i^{F,D} \tau$) [27,31]. Therefore, $P[T'^M\_i | T^M\_i = 2, T^O\_i, T^O\_j = 0, A]$ and $P[T'^M\_i | T^M\_i = 2, T^O\_i, T^O\_j \neq 0, A]$ can be derived as

$$P[T'^M\_i | T^M\_i = 2, T^O\_i, T^O\_j = 0, A] = \begin{cases} 1 - \mu\_i^{F,S} \tau, \text{ if } T'^M\_i = 2, \\\mu\_i^{F,S} \tau, & \text{ if } T'^M\_i = 0, \\\ 0, & \text{otherwise,} \end{cases} \tag{16}$$

and

$$P[T'^M\_i | T^M\_i = 2, T^O\_i, T^O\_j \neq 0, A] = \begin{cases} 1 - \mu\_i^{F,D} \tau, & \text{if } T'^M\_i = 2, \\ \mu\_i^{F,D} \tau, & \text{if } T'^M\_i = 0, \\ 0, & \text{otherwise.} \end{cases} \tag{17}$$

Meanwhile, when half of the task is offloaded to IoT device *j* (i.e., $T\_i^M = 3$), the remaining task can be completed in a shorter time. It is assumed that the processing time of the remaining task follows an exponential distribution with mean $1/\mu\_i^{H,S}$ when no task of IoT device *j* is offloaded to IoT device *i* (i.e., $T\_j^O = 0$), and then the probability that the remaining task is completed during a decision epoch is $\mu\_i^{H,S} \tau$ [27,31]. On the other hand, when IoT device *i* offloads half of its task to IoT device *j* and also processes a task offloaded from IoT device *j* (i.e., $T\_i^M = 3$ and $T\_j^O \neq 0$), the processing time of the remaining task of IoT device *i* follows an exponential distribution with mean $1/\mu\_i^{H,D}$. In this case, the probability that the remaining task is completed during a decision epoch is $\mu\_i^{H,D} \tau$ [27,31]. Thus, the corresponding transition probabilities can be represented as

$$P[T'^M\_i | T^M\_i = 3, T^O\_i, T^O\_j = 0, A] = \begin{cases} 1 - \mu\_i^{H,S} \tau, & \text{if } T'^M\_i = 3, \\ \mu\_i^{H,S} \tau, & \text{if } T'^M\_i = 0, \\ 0, & \text{otherwise,} \end{cases} \tag{18}$$

and

$$P[T'^M\_i | T^M\_i = 3, T^O\_i, T^O\_j \neq 0, A] = \begin{cases} 1 - \mu\_i^{H,D} \tau, & \text{if } T'^M\_i = 3, \\ \mu\_i^{H,D} \tau, & \text{if } T'^M\_i = 0, \\ 0, & \text{otherwise.} \end{cases} \tag{19}$$
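Eqs. (16) to (19) share one pattern: the task either stays in its current processing state or completes, with the service rate selected by the processed portion and by whether a partner task is also being processed. A minimal sketch under that reading is given below; the function name, the rate-table keys, and the numerical rates are illustrative assumptions.

```python
def own_task_row(t_m: int, t_o_j: int, mu: dict, tau: float) -> dict:
    """Distribution of the next T^M_i for T^M_i in {2, 3}, Eqs. (16)-(19).

    mu maps (portion, mode) to a service rate, where portion is "F" (full
    task) or "H" (half) and mode is "S" (device i works only on its own
    task) or "D" (device i also processes a task offloaded by device j).
    """
    if t_m not in (2, 3):
        raise ValueError("only T^M_i = 2 or 3 is covered here")
    portion = "F" if t_m == 2 else "H"
    mode = "S" if t_o_j == 0 else "D"
    p_done = mu[(portion, mode)] * tau
    if not 0.0 <= p_done <= 1.0:
        raise ValueError("mu * tau must be a valid probability")
    # Stay in the same processing state, or return to the idle state 0.
    return {t_m: 1.0 - p_done, 0: p_done}


# Illustrative rates only (not taken from the paper).
mu_i = {("F", "S"): 1.0, ("F", "D"): 0.5, ("H", "S"): 2.0, ("H", "D"): 1.0}
row = own_task_row(2, 0, mu_i, tau=0.1)
print(row)
```

The explicit range check reflects that $\mu \tau$ is only meaningful as a probability when the decision epoch is short enough for the product to stay at or below one.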

When no task occurs, the processing status of the offloaded task of IoT device *i* does not change. Therefore, $P[T'^O\_i | T^O\_i = 0, T^M\_i \neq 1, T^M\_j, A]$ can be denoted as

$$P[T\_i^{\prime O}|T\_i^O = 0, T\_i^M \neq 1, T\_j^M, A] = \begin{cases} 1, \text{ if } T\_i^{\prime O} = 0, \\ 0, \text{ otherwise.} \end{cases} \tag{20}$$

Meanwhile, when a task occurs (i.e., $T\_i^M = 1$), the processing status of the offloaded task of IoT device *i* changes according to the chosen action $A$. Therefore, the corresponding transition probabilities can be represented as

$$P[T'^O\_i | T^O\_i = 0, T^M\_i = 1, T^M\_j, A = 0] = \begin{cases} 1, & \text{if } T'^O\_i = 0, \\ 0, & \text{otherwise,} \end{cases} \tag{21}$$

$$P[T'^O\_i | T^O\_i = 0, T^M\_i = 1, T^M\_j, A = 1] = \begin{cases} 1, & \text{if } T'^O\_i = 1, \\ 0, & \text{otherwise,} \end{cases} \tag{22}$$

and

$$P[T\_i^{\prime O} | T\_i^O = 0, T\_i^M = 1, T\_j^M, A = 2] = \begin{cases} 1, \text{if } T\_i^{\prime O} = 2, \\ 0, \text{ otherwise,} \end{cases} \tag{23}$$

respectively.
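Since the next value of the offloaded-task status in Eqs. (21) to (23) simply equals the offloading decision of IoT device *i*, the three cases can be summarized in a single expression. The indicator notation $\mathbb{1}[\cdot]$ and the symbol $a$ are introduced here for illustration and do not appear in the original formulation:

$$P[T'^O\_i \mid T^O\_i = 0, T^M\_i = 1, T^M\_j, A = a] = \mathbb{1}[T'^O\_i = a], \quad a \in \{0, 1, 2\},$$

where $\mathbb{1}[\cdot]$ equals 1 if its argument holds and 0 otherwise.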

When some portion of the task is offloaded (i.e., $T\_i^O = 1$ or $T\_i^O = 2$), it is processed by IoT device *j*. The processing time of the offloaded task at IoT device *j* depends on the offloaded portion (i.e., $T\_i^O$) and on whether IoT device *j* processes its own task (i.e., $T\_j^M$). Specifically, when half of the task (or all of the task) of IoT device *i* is offloaded and IoT device *j* does not process its own task, we assume that the processing time of the offloaded task follows an exponential distribution with mean $1/\mu\_j^{H,S}$ ($1/\mu\_j^{F,S}$). On the other hand, if half of the task (or all of the task) of IoT device *i* is offloaded and IoT device *j* processes its own task, the processing time of the offloaded task follows an exponential distribution with mean $1/\mu\_j^{H,D}$ ($1/\mu\_j^{F,D}$). Then, the probabilities that the offloaded task is completed during a decision epoch in these four cases are $\mu\_j^{H,S} \tau$, $\mu\_j^{F,S} \tau$, $\mu\_j^{H,D} \tau$, and $\mu\_j^{F,D} \tau$, respectively [27,31]. Therefore, the corresponding transition probabilities can be denoted as

$$P[T'^O\_i | T^O\_i = 1, T^M\_i, T^M\_j = 0, A] = \begin{cases} 1 - \mu\_j^{H,S} \tau, & \text{if } T'^O\_i = 1, \\ \mu\_j^{H,S} \tau, & \text{if } T'^O\_i = 0, \\ 0, & \text{otherwise,} \end{cases} \tag{24}$$

$$P[T'^O\_i | T^O\_i = 2, T^M\_i, T^M\_j = 0, A] = \begin{cases} 1 - \mu\_j^{F,S} \tau, & \text{if } T'^O\_i = 2, \\ \mu\_j^{F,S} \tau, & \text{if } T'^O\_i = 0, \\ 0, & \text{otherwise,} \end{cases} \tag{25}$$

$$P[T'^O\_i | T^O\_i = 1, T^M\_i, T^M\_j \neq 0, A] = \begin{cases} 1 - \mu\_j^{H,D} \tau, & \text{if } T'^O\_i = 1, \\ \mu\_j^{H,D} \tau, & \text{if } T'^O\_i = 0, \\ 0, & \text{otherwise,} \end{cases} \tag{26}$$

and

$$P[T'^O\_i | T^O\_i = 2, T^M\_i, T^M\_j \neq 0, A] = \begin{cases} 1 - \mu\_j^{F,D} \tau, & \text{if } T'^O\_i = 2, \\ \mu\_j^{F,D} \tau, & \text{if } T'^O\_i = 0, \\ 0, & \text{otherwise.} \end{cases} \tag{27}$$
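If the per-epoch completion probability $\mu \tau$ stays constant across epochs (an assumption made here, which holds only while the partner's processing mode does not change), the number of decision epochs until the offloaded task completes follows a geometric distribution. The sketch below computes its mean and the probability of completing within a given number of epochs; the function names and rate values are illustrative.

```python
def expected_epochs(mu: float, tau: float) -> float:
    """Mean number of decision epochs until completion (geometric law)."""
    p = mu * tau
    if not 0.0 < p <= 1.0:
        raise ValueError("mu * tau must lie in (0, 1]")
    return 1.0 / p


def prob_done_within(mu: float, tau: float, n: int) -> float:
    """Probability that the task completes within n decision epochs."""
    p = mu * tau
    if not 0.0 <= p <= 1.0:
        raise ValueError("mu * tau must be a valid probability")
    return 1.0 - (1.0 - p) ** n


tau = 0.1
mu_alone, mu_shared = 2.0, 1.0  # illustrative rates: partner idle vs. busy
print(expected_epochs(mu_alone, tau), expected_epochs(mu_shared, tau))
print(prob_done_within(mu_shared, tau, 5))
```

Comparing the two rates shows how a busy partner lengthens the expected completion time of the offloaded task, which is the effect that the deadline timer later penalizes.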

An IoT device can harvest energy only when its environment provides energy (e.g., when the wind blows above a certain speed). Therefore, the harvesting of one unit of energy by IoT device *i* at an arbitrary decision epoch is modeled as a Bernoulli random process with probability $p\_i^H$ [32]. Then, when IoT device *i* does not process any task (i.e., $T\_i^M = 0$ or $T\_i^M = 1$) and its battery is not fully charged (i.e., $E\_i \neq E\_{MAX}$), $E\_i$ increases by one unit with probability $p\_i^H$. If the battery of IoT device *i* is full (i.e., $E\_i = E\_{MAX}$), it cannot harvest any more energy. Therefore, the corresponding transition probabilities can be represented as

$$P[E\_i'|E\_i \neq E\_{MAX}, T\_i^M = 0] = \begin{cases} 1 - p\_i^H, & \text{if } E\_i' = E\_i, \\ p\_i^H, & \text{if } E\_i' = E\_i + 1, \\ 0, & \text{otherwise,} \end{cases} \tag{28}$$

$$P[E\_i'|E\_i = E\_{MAX}, T\_i^M = 0] = \begin{cases} 1, & \text{if } E\_i' = E\_i, \\ 0, & \text{otherwise,} \end{cases} \tag{29}$$

$$P[E\_i'|E\_i \neq E\_{MAX}, T\_i^M = 1] = \begin{cases} 1 - p\_i^H, & \text{if } E\_i' = E\_i, \\ p\_i^H, & \text{if } E\_i' = E\_i + 1, \\ 0, & \text{otherwise,} \end{cases} \tag{30}$$

and

$$P[E\_i'|E\_i = E\_{MAX}, T\_i^M = 1] = \begin{cases} 1, & \text{if } E\_i' = E\_i, \\ 0, & \text{otherwise.} \end{cases} \tag{31}$$

When IoT device *i* processes a task (i.e., $T\_i^M = 2$ or $T\_i^M = 3$) and has energy (i.e., $E\_i \neq 0$), it consumes one unit of energy. At the same time, its energy $E\_i$ increases by one unit with probability $p\_i^H$, so the energy level either decreases by one unit or stays the same. On the other hand, if IoT device *i* does not have any energy, it cannot process any task, and thus no energy is consumed. Therefore, the corresponding transition probabilities can be expressed by

$$P[E\_i'|E\_i \neq 0, T\_i^M = 2] = \begin{cases} 1 - p\_i^H, & \text{if } E\_i' = E\_i - 1, \\ p\_i^H, & \text{if } E\_i' = E\_i, \\ 0, & \text{otherwise,} \end{cases} \tag{32}$$

$$P[E\_i'|E\_i = 0, T\_i^M = 2] = \begin{cases} 1, & \text{if } E\_i' = E\_i, \\ 0, & \text{otherwise,} \end{cases} \tag{33}$$

$$P[E\_i'|E\_i \neq 0, T\_i^M = 3] = \begin{cases} 1 - p\_i^H, & \text{if } E\_i' = E\_i - 1, \\\ p\_i^H, & \text{if } E\_i' = E\_i, \\\ 0, & \text{otherwise}, \end{cases} \tag{34}$$

and

$$P[E\_i'|E\_i=0, T\_i^M=3] = \begin{cases} 1, \text{ if } E\_i' = E\_i, \\ 0, \text{ otherwise.} \end{cases} \tag{35}$$

Meanwhile, when the whole task is offloaded to IoT device *j* (i.e., $T\_i^M = 4$), IoT device *i* does not consume its own energy. Therefore, the corresponding transition probability can be denoted as

$$P[E\_i'|E\_i, T\_i^M = 4] = \begin{cases} 1, & \text{if } E\_i' = E\_i, \\ 0, & \text{otherwise.} \end{cases} \tag{36}$$
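The energy cases of Eqs. (28) to (36) can be collected into one function that returns the distribution of the next energy level. This is a sketch of the equations as stated; the function name `energy_row` and the example values are assumptions.

```python
def energy_row(e: int, t_m: int, e_max: int, p_h: float) -> dict:
    """Distribution of the next energy level E'_i, following Eqs. (28)-(36)."""
    if not 0 <= e <= e_max:
        raise ValueError("energy level out of range")
    if t_m in (0, 1):
        # Not processing: harvest one unit with probability p_h unless full.
        if e == e_max:
            return {e: 1.0}
        return {e: 1.0 - p_h, e + 1: p_h}
    if t_m in (2, 3):
        # Processing: one unit is consumed; a harvested unit offsets it.
        if e == 0:
            return {e: 1.0}
        return {e - 1: 1.0 - p_h, e: p_h}
    if t_m == 4:
        # Whole task offloaded: the level is unchanged, per Eq. (36).
        return {e: 1.0}
    raise ValueError("unknown task state")


print(energy_row(3, 0, e_max=5, p_h=0.3))
print(energy_row(3, 2, e_max=5, p_h=0.3))
```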

When no task is present (i.e., $T\_i^M = 0$ and $T\_i^O = 0$), the timer for the deadline of the task does not start and therefore does not expire. Therefore, $P[D\_i' | D\_i, T\_i^M = 0, T\_i^O = 0]$ can be represented as

$$P[D\_i'|D\_i, T\_i^M = 0, T\_i^O = 0] = \begin{cases} 1, \text{ if } D\_i' = 0, \\ 0, \text{ otherwise.} \end{cases} \tag{37}$$

We assume that the timer for the deadline of the task of IoT device *i* follows an exponential distribution with mean $1/\kappa\_i$ [33,34]. Then, when the task is not completed (i.e., $T\_i^M \neq 0$ or $T\_i^O \neq 0$), the probability that the timer expires during a decision epoch is $\kappa\_i \tau$. Thus, the corresponding transition probabilities can be represented as

$$P[D\_i'|D\_i = 0, T\_i^M \neq 0, T\_i^O] = \begin{cases} 1 - \kappa\_i \tau, & \text{if } D\_i' = 0, \\ \kappa\_i \tau, & \text{if } D\_i' = 1, \\ 0, & \text{otherwise,} \end{cases} \tag{38}$$

and

$$P[D\_i'|D\_i = 0, T\_i^M, T\_i^O \neq 0] = \begin{cases} 1 - \kappa\_i \tau, & \text{if } D\_i' = 0, \\ \kappa\_i \tau, & \text{if } D\_i' = 1, \\ 0, & \text{otherwise.} \end{cases} \tag{39}$$

Meanwhile, when the task is completed (i.e., $T\_i^M = 0$ and $T\_i^O = 0$), the timer is reset and stops, which means that there is no expiration. Therefore, $P[D\_i' | D\_i = 1, T\_i^M = 0, T\_i^O = 0]$ can be denoted as

$$P[D\_i'|D\_i = 1, T\_i^M = 0, T\_i^O = 0] = \begin{cases} 1, \text{ if } D\_i' = 0, \\\ 0, \text{ otherwise.} \end{cases} \tag{40}$$

If the timer has expired (i.e., $D\_i = 1$) and the task is not completed (i.e., $T\_i^M \neq 0$ or $T\_i^O \neq 0$), the timer remains in the expired state. Therefore, the corresponding transition probabilities can be represented as

$$P[D\_i'|D\_i = 1, T\_i^M \neq 0, T\_i^O] = \begin{cases} 1, \text{ if } D\_i' = 1, \\ 0, \text{ otherwise,} \end{cases} \tag{41}$$

and

$$P[D\_i'|D\_i = 1, T\_i^M, T\_i^O \neq 0] = \begin{cases} 1, \text{if } D\_i' = 1, \\ 0, \text{otherwise.} \end{cases} \tag{42}$$
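The deadline-timer cases of Eqs. (37) to (42) can likewise be expressed as a single function. The sketch below follows the equations as stated; the function name `deadline_row` and the example values of the expiration rate and epoch length are assumptions.

```python
def deadline_row(d: int, t_m: int, t_o: int, kappa: float, tau: float) -> dict:
    """Distribution of the next timer state D'_i, following Eqs. (37)-(42)."""
    task_pending = (t_m != 0) or (t_o != 0)
    if not task_pending:
        # No pending task: the timer is idle or reset, Eqs. (37) and (40).
        return {0: 1.0}
    if d == 1:
        # Already expired with a pending task: stays expired, Eqs. (41), (42).
        return {1: 1.0}
    p_expire = kappa * tau
    if not 0.0 <= p_expire <= 1.0:
        raise ValueError("kappa * tau must be a valid probability")
    # Running timer: expires with probability kappa * tau, Eqs. (38), (39).
    return {0: 1.0 - p_expire, 1: p_expire}


print(deadline_row(0, 2, 0, kappa=0.5, tau=0.1))
```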

The transition probabilities for the states of IoT device *j* can be defined similarly to those of IoT device *i*. They are omitted here due to the page limitation and for simplicity of description; they can be found in [30].
