*4.2. MEC Scheduling*

Section 3.2 formulated the mathematical models and the corresponding optimization problem. As shown in Equations (35) and (36), the proposed objective function, together with the constraints considered in this paper, forms an NP-hard problem. Because the objective is defined over time, the Lyapunov drift optimization technique is well suited to this problem: it exposes the tradeoff between performance and battery stability. Let $\Theta^t$ denote the vector of uncharged battery queues at time $t$. The quadratic Lyapunov function is defined as

$$L^t = \frac{1}{2} (\Theta^t)^T \Theta^t = \frac{1}{2} (Z^t)^2,\tag{40}$$

where $(\Theta^t)^T$ denotes the transpose of $\Theta^t$. Because $\Theta^t$ contains only a single queue, $Z^t$, the second equality in Equation (40) holds. Let $\Delta^t$ denote the conditional Lyapunov drift, $\mathbb{E}[L^{t+1} - L^t \mid \Theta^t]$, i.e., the drift at $t$ [21]. The dynamic policy solves the proposed optimization formulation by observing only the current uncharged battery queue $Z^t$ and maximizing

$$Q^t - V\Delta^t,\tag{41}$$

where $V$ is a positive constant that controls the drift policy and thereby the reward–battery tradeoff [21]. At each time slot $t$, a frequency is selected, and the selection yields a reward. This selection can be represented as follows:

$$\underset{f^t \in \mathcal{F}_k}{\text{argmax}} \; Q^t[f^t] - V \cdot Z^t \cdot (E_{\text{offload}}^t[f^t] - e^t), \tag{42}$$

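where $e^t$ is the energy harvested at $t$, which has a constant value. Because $e^t$ does not depend on $f^t$, the term $V \cdot Z^t \cdot e^t$ is the same for every candidate frequency and does not change the maximizer. Equation (42) can therefore be simplified as follows: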

$$\underset{f^t \in \mathcal{F}_k}{\text{argmax}} \; Q^t[f^t] - V \cdot Z^t \cdot E_{\text{offload}}^t[f^t]. \tag{43}$$
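The step from Equation (41) to Equations (42) and (43) can be sketched as follows. The sketch assumes a queue update of the form $Z^{t+1} = Z^t + E_{\text{offload}}^t[f^t] - e^t$, ignoring any projection onto non-negative values. This update rule is an assumption made here for illustration and is not restated in this section. Under it, Equation (40) gives

$$L^{t+1} - L^t = \frac{1}{2}\left[(Z^{t+1})^2 - (Z^t)^2\right] = Z^t \left(E_{\text{offload}}^t[f^t] - e^t\right) + \frac{1}{2}\left(E_{\text{offload}}^t[f^t] - e^t\right)^2.$$

If the squared term is bounded by some constant $B$ (also an assumption), taking the conditional expectation yields

$$\Delta^t \le B + Z^t \, \mathbb{E}\left[E_{\text{offload}}^t[f^t] - e^t \mid \Theta^t\right].$$

Replacing $\Delta^t$ in Equation (41) with the part of this bound that depends on $f^t$ leads to the per-slot objective of Equation (42), and removing the constant $V \cdot Z^t \cdot e^t$ leads to Equation (43).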

Since Equation (43) is in closed form, the proposed algorithm can dynamically control $\vec{f}^t$ and find the optimal $\vec{f}^t$ in polynomial time, for example by evaluating the objective once for each candidate frequency.
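The per-slot selection in Equation (43) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `select_frequency`, `update_queue`, `toy_reward`, and `toy_energy` are hypothetical, the reward and energy models are placeholders chosen only to make the example run, and the queue update is an assumed form that this section does not state.

```python
import math
from typing import Callable, Sequence


def select_frequency(
    frequencies: Sequence[float],
    reward: Callable[[float], float],
    offload_energy: Callable[[float], float],
    z: float,
    v: float,
) -> float:
    """Return the candidate f maximizing Q[f] - V * Z * E_offload[f] (Equation (43))."""
    if not frequencies:
        raise ValueError("the candidate frequency set must not be empty")
    # One objective evaluation per candidate: linear in the size of the set.
    return max(frequencies, key=lambda f: reward(f) - v * z * offload_energy(f))


def update_queue(z: float, consumed: float, harvested: float) -> float:
    """Assumed queue dynamics (not given in this section): Z grows with the
    energy consumed, shrinks with the energy harvested, and stays non-negative."""
    return max(z + consumed - harvested, 0.0)


def toy_reward(f: float) -> float:
    """Hypothetical reward model: concave and increasing in frequency."""
    return math.log(1.0 + f)


def toy_energy(f: float) -> float:
    """Hypothetical energy model: quadratic in frequency."""
    return f ** 2


if __name__ == "__main__":
    candidates = [1.0, 2.0, 3.0]  # hypothetical candidate frequency set
    z, v, harvested = 0.0, 1.0, 2.0  # illustrative values only
    for t in range(5):
        f = select_frequency(candidates, toy_reward, toy_energy, z, v)
        z = update_queue(z, toy_energy(f), harvested)
        print(f"t={t} f={f} Z={z:.2f}")
```

With these placeholder models, an empty queue favors the highest frequency, and a growing queue pushes the choice toward lower-energy frequencies, which is the reward–battery tradeoff that $V$ controls.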
