#### *3.3. Dynamic Programming*

The RHC developed in this study cannot be solved mathematically by merely using a discrete variable matrix. In the window of each RHC step, *J*<sup>∗</sup><sub>*k*</sub> can be calculated by finding *u*<sup>∗</sup><sub>*k*</sub>(*t*) using dynamic programming (DP), which is a global optimization method. This control method based on the RHC prediction is similar to 'look-ahead DP' [26,27].

Applying the time step of the UPS chemical simulation, Δ*t*<sub>*simul*</sub>, to Equation (14) gives the discrete equation:

$$J\_{s,s+N\_{simul}} = \sum\_{m=1}^{N\_{simul}} g\left(\mathbf{x}(t\_m|t\_k), \mathbf{u}(t\_m|t\_k), t\_m|t\_k\right) \cdot \Delta t\_{simul} \tag{17}$$
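As a minimal sketch of Equation (17), the windowed cost can be read as a rectangle-rule sum of the stage cost over the simulation steps. The stage cost `g_example`, the trajectory values, and the function name `window_cost` below are illustrative assumptions, not the cell model used in this study.

```python
from typing import Callable, Sequence


def window_cost(
    g: Callable[[float, float, float], float],
    xs: Sequence[float],
    us: Sequence[float],
    ts: Sequence[float],
    dt_simul: float,
) -> float:
    """Rectangle-rule form of Eq. (17): sum of g(x, u, t) times the time step."""
    if not (len(xs) == len(us) == len(ts)):
        raise ValueError("xs, us and ts must have the same length")
    return sum(g(x, u, t) for x, u, t in zip(xs, us, ts)) * dt_simul


def g_example(x: float, u: float, t: float) -> float:
    """Hypothetical stage cost: a fan-power term plus a temperature term."""
    return 0.5 * u + 0.01 * x


# Illustrative three-step trajectory (temperature in deg C, fan flow as a fraction).
xs = [40.0, 41.0, 42.0]
us = [0.2, 0.4, 0.6]
ts = [1.0, 2.0, 3.0]
cost = window_cost(g_example, xs, us, ts, dt_simul=1.0)
```

Because the time step multiplies the whole sum, doubling `dt_simul` for the same samples doubles the cost, which is why the step size must be stated alongside any reported cost value.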

Although the simulation of the battery differs for each model, it cannot be solved on the grid at each time step. In DP, therefore, the state of the battery has to be calculated for each temperature difference (Δ*T*), which is less than 0.001 °C/s. Although the range of Δ*T* varies according to the temperature of the cell, if the total difference is divided over the entire range, the computing cost increases exponentially (Δ*T*<sub>*total*</sub>/Δ*T* ≈ 55,000). Therefore, the range of each state is calculated for each time step in the prediction window, and the state *x*<sub>*k*</sub> is assigned accordingly. Thus, the DP step, Δ*t*<sub>*DP*</sub>, is set separately from the time step of the UPS chemical simulation, Δ*t*<sub>*simul*</sub>, as:

$$N\_{DP} = \frac{\Delta \mathcal{W}\_{prediction}}{\Delta t\_{DP}}, \quad N\_{RHC} = \frac{\Delta \mathcal{W}\_{solar}}{\Delta t\_{DP}} \tag{18}$$
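The effect of Equation (18) and of the per-step state range can be illustrated with a short sketch. All numerical values here are assumptions chosen for illustration, except the ratio of about 55,000 quoted above. The helper names `dp_step_counts` and `local_grid` are hypothetical.

```python
def dp_step_counts(window_prediction_s: float, window_rhc_s: float, dt_dp_s: float):
    """Eq. (18): number of DP steps in the prediction window and in the RHC window.

    window_rhc_s stands for the numerator of N_RHC in Eq. (18).
    """
    n_dp = round(window_prediction_s / dt_dp_s)
    n_rhc = round(window_rhc_s / dt_dp_s)
    return n_dp, n_rhc


def local_grid(lower: float, upper: float, n_points: int):
    """Evenly spaced grid over the state range reachable in one DP step."""
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    spacing = (upper - lower) / (n_points - 1)
    return [lower + i * spacing for i in range(n_points)]


# Assumed window lengths and DP step (seconds); not values from this study.
n_dp, n_rhc = dp_step_counts(600.0, 300.0, 60.0)

# A fixed grid over the full temperature range at 0.001 deg C resolution.
# 55 deg C is implied by the ratio of about 55,000 given in the text.
fixed_points = round(55.0 / 0.001)

# A grid restricted to an assumed reachable range for the current step.
grid = local_grid(40.0, 42.0, 21)
```

With these assumed numbers, each DP step evaluates 21 grid points instead of 55,000, at the price of recomputing the reachable range at every step.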

Then, Equation (17) can be represented as a recursive relation using Δ*t*<sub>*DP*</sub>:

$$J\_{s,s+N\_{simul}}^\*(\mathbf{x}(t\_k)) = \min \left\{ g(\mathbf{x}(t\_k|t\_k), \mathbf{u}(t\_k|t\_k), t\_s|t\_k) \cdot \Delta t\_{DP} + J\_{s+1,s+N\_{simul}}^\*(\mathbf{x}(t\_{k+1})) \right\} \tag{19}$$

where *J*<sup>∗</sup><sub>*s*,*y*</sub> is the optimal cost calculated from *t*<sub>*s*</sub> to *t*<sub>*y*</sub>, and *s* = 1, 2, ..., *N*<sub>*DP*</sub>. This recursive relation yields the minimum cost by backward calculation from the final time *t*<sub>*end*</sub>, starting from *J*<sup>∗</sup><sub>*s*+*N*<sub>*simul*</sub>−1, *s*+*N*<sub>*simul*</sub></sub>(*x*(*t*<sub>*end*</sub>)). Finally, using Equations (18) and (19), Equation (16) becomes Equation (20), which represents the discrete form of the cost:

$$J^{RHC} = \sum\_{k=1}^{N\_{total}} J\_k^\* = \sum\_{k=1}^{N\_{total}} \sum\_{p=1}^{N\_{RHC}} g(\mathbf{x}(t\_p), \mathbf{u}\_k^\*(t\_p), t) \cdot \Delta t\_{DP} \tag{20}$$
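The backward calculation behind Equation (19) can be sketched for a single state on a fixed grid. The dynamics `toy_step`, the stage cost `toy_cost`, the zero terminal cost, and the nearest-grid-point mapping are all simplifying assumptions for illustration. They do not reproduce the metal-air cell model or the per-step grids used in this study.

```python
def nearest_index(grid, value):
    """Index of the grid point closest to value (assumed state-to-grid mapping)."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))


def backward_dp(grid, controls, step, stage_cost, n_steps, dt):
    """Backward recursion: J*_s(x) = min_u [ g(x, u) * dt + J*_{s+1}(x_next) ]."""
    n = len(grid)
    # Terminal cost is assumed to be zero at the end of the window.
    cost_to_go = [[0.0] * n for _ in range(n_steps + 1)]
    policy = [[controls[0]] * n for _ in range(n_steps)]
    for s in range(n_steps - 1, -1, -1):
        for i, x in enumerate(grid):
            best = float("inf")
            for u in controls:
                j = nearest_index(grid, step(x, u, dt))
                total = stage_cost(x, u) * dt + cost_to_go[s + 1][j]
                if total < best:
                    best = total
                    policy[s][i] = u
            cost_to_go[s][i] = best
    return cost_to_go, policy


def toy_step(temp, fan, dt):
    """Hypothetical thermal model: the cell warms at low fan flow, cools at high."""
    return temp + dt * (0.05 - 0.1 * fan)


def toy_cost(temp, fan):
    """Hypothetical stage cost: fan power plus deviation from 45 deg C."""
    return 0.2 * fan + 0.01 * abs(temp - 45.0)


grid = [40.0 + 0.5 * i for i in range(21)]  # 40 to 50 deg C
controls = [0.01, 0.5, 1.0]                 # fan flow as a fraction of maximum
J, policy = backward_dp(grid, controls, toy_step, toy_cost, n_steps=5, dt=10.0)
```

Summing the first-step costs of successive windows, as in Equation (20), would then be done by the outer RHC loop, which is omitted here.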

#### *3.4. Solver Based on 1.5-Dimensional DP*

In general, an *n*-dimensional DP has *n* states and *n* control variables. However, the model developed in this study has only one control variable, the fan flow. As Equations (21) and (22) show, it has at least three states, whose data maps are shown in Figure 4:

$$V(t) = h(I\_J(t), \rho(t), T(t), \text{SOC}(t))\tag{21}$$

$$Zn\_{\text{consump},t\_1} = \int\_{t\_0}^{t\_1} q\Big(V(t), I\_J(t), \rho(t), T(t), \text{SOC}(t)\Big) dt\tag{22}$$

**Figure 4.** Cell output voltage dependence on cell parameters: (**a**) current density and state of charge (SOC); (**b**) oxygen density and SOC; (**c**) temperature and SOC.

Here, *V*(*t*) is the output voltage, *I*<sub>*J*</sub>(*t*) is the current density, ρ(*t*) is the oxygen density, *T*(*t*) is the temperature, *SOC*(*t*) is the state of charge, *Zn*<sub>*consump*,*t*1</sub> is the Zn consumption until time *t*<sub>1</sub>, *h* is the modeled system function that calculates the output voltage from cell experimental data, and *q* is the modeled function that calculates the metal consumption from cell experimental data.

Because the UPS is a battery-based chemical model, the variables *I*<sub>*J*</sub>(*t*), ρ(*t*), and *T*(*t*) are interrelated states at each moment. Their values depend on the previous state and the control variable.

The sensitivity of the states, i.e., their change in a single step, determines which of the states is the reference for DP.

When the state variables that directly affect the output voltage are simulated for 3600 s at constant load power with the minimum and maximum fan flow, the current density *I*<sub>*J*</sub>(*t*) changes only slightly, as seen in Figure 5. Therefore, the cell temperature *T*(*t*) and oxygen concentration ρ(*t*) were chosen as the state variables for DP. As discussed, the cell temperature was calculated by distinguishing the time steps Δ*t*<sub>*simul*</sub> and Δ*t*<sub>*DP*</sub>, considering that the state of the previous step has a major influence on the next one. The oxygen concentration, however, is very sensitive to the state at each step, which depends on the fan flow. Thus, as DP proceeds, the target grid of the oxygen concentration is only sparsely populated, because the oxygen concentration is fixed once the control input that places the cell temperature on its grid has been applied. Therefore, the optimal *x*<sub>2</sub> was selected from within a target region instead of at a target point to solve the 1.5-dimensional DP structure:

$$\mathbf{x}\_{1,t\_{m+1}} = f\_1(\mathbf{x}\_{t\_m}, \mathbf{u}\_{t\_m}), \ \mathbf{u}\_m = f'\_1(\mathbf{x}\_{1,t\_{m+1}}, \mathbf{x}\_{t\_m}), \ \mathbf{x}\_{2,t\_{m+1}} = f\_2(\mathbf{x}\_{t\_m}, \mathbf{u}\_{t\_m}) \tag{23}$$

$$\mathbf{x}\_{2,t\_{m+1}} = f\_2(\mathbf{x}\_{t\_m}, f'\_1(\mathbf{x}\_{1,t\_{m+1}}, \mathbf{x}\_{t\_m})), \ \mathbf{x}\_{t\_m} = (\mathbf{x}\_{1,t\_m}, \mathbf{x}\_{2,t\_m}). \tag{24}$$

**Figure 5.** State variables on cell parameter by 1% and 100% of max fan flow control during operation: (**a**) current density; (**b**) oxygen concentration; (**c**) cell temperature.

Here, *f*′<sub>1</sub> is the inverse function that recovers the control variable *u*<sub>*m*</sub> from the first state variable at time *t*<sub>*m*+1</sub>, *x*<sub>1,*tm*+1</sub>, and the state variables at time *t*<sub>*m*</sub>, *x*<sub>*tm*</sub>. The second state variable at time *t*<sub>*m*+1</sub>, *x*<sub>2,*tm*+1</sub>, can then be obtained from *x*<sub>1,*tm*+1</sub> (that of the next step) and *x*<sub>*tm*</sub> (the state variables of the current step). This recursive process allows the optimal control to be calculated by selecting, within each region of the grid for the next DP step, the point with the minimum *J*<sup>∗</sup><sub>*s*,*s*+*N*<sub>*simul*</sub></sub> as the path, as shown in Equation (19). This can be expressed as in Figure 6.
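One transition of the 1.5-dimensional structure in Equations (23) and (24) can be sketched as follows. The linear models `f1`, `f1_inv`, and `f2`, their coefficients, and the region edges are hypothetical stand-ins chosen so that the inverse is exact. The sweep is written as a forward relaxation for brevity, which is a simplification of the backward calculation described above. The fan-flow bounds follow the 1% and 100% settings of Figure 5.

```python
import bisect

DT = 1.0                    # assumed DP step (s)
U_MIN, U_MAX = 0.01, 1.0    # fan flow as a fraction of maximum


def f1(x, u):
    """Hypothetical first-state (temperature) update, Eq. (23)."""
    x1, _ = x
    return x1 + DT * (0.5 - 1.0 * u)


def f1_inv(x1_next, x):
    """Inverse of f1: the control that reaches the target x1_next, Eq. (23)."""
    x1, _ = x
    return 0.5 - (x1_next - x1) / DT


def f2(x, u):
    """Hypothetical second-state (oxygen concentration) update, Eq. (23)."""
    _, x2 = x
    return x2 + DT * (0.8 * u - 0.2 * x2)


def region_index(edges, value):
    """Index of the region [edges[i], edges[i+1]) holding value, else None."""
    i = bisect.bisect_right(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None


def expand(x, x1_targets, x2_edges):
    """Eq. (24): for each target x1, invert for u, then propagate x2."""
    out = []
    for j, x1_next in enumerate(x1_targets):
        u = f1_inv(x1_next, x)
        if not (U_MIN <= u <= U_MAX):
            continue  # target not reachable with an admissible fan flow
        x2_next = f2(x, u)
        r = region_index(x2_edges, x2_next)
        if r is not None:
            out.append((j, r, u, x2_next))
    return out


def relax(cells, x, cost_so_far, x1_targets, x2_edges, stage_cost):
    """Keep only the cheapest candidate per (x1 target, x2 region) cell."""
    for j, r, u, x2_next in expand(x, x1_targets, x2_edges):
        cost = cost_so_far + stage_cost(x, u) * DT
        if (j, r) not in cells or cost < cells[(j, r)][0]:
            cells[(j, r)] = (cost, u, x2_next)


x1_targets = [39.6, 39.8, 40.0, 40.2, 40.4]
x2_edges = [0.0, 0.5, 1.0, 1.5]
candidates = expand((40.0, 0.5), x1_targets, x2_edges)
```

Storing one representative *x*<sub>2</sub> per region, rather than requiring it to land on a grid point, is what lets the second state ride along with the first without a full two-dimensional grid.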

The experimental UPS data show that, below 60 °C and within the operational boundary conditions, the zinc consumption is higher for lower temperatures. This means that, when the difference between the lowest and highest temperature in the cells is higher than 10 °C, or the highest temperature is lower than 60 °C, the largest zinc consumption is that of the cell located at the end of the module.

When inverting the control input for the target state *T*, as in Equation (24), the metal-air battery model cannot calculate backward accurately, because all variables change. Therefore, the target *T* was calculated after obtaining the minimum and maximum cell temperature in the last step of the solve window, (*k* + Δ*W*<sub>*prediction*</sub>).

Because of these approximations, the algorithm is not guaranteed to be globally optimal, although the values it uses are reasonably close to optimal.

**Figure 6.** Schematic of structure solving by approximate DP.

#### *3.5. Electrical Load Cycle*

A basic electrical load cycle with the performance required of existing UPSs is shown as the first graph in Table 2. The required power was revised by extending the simulation time from 7200 to 9000 s to evaluate the performance of the UPS control (cycle #1). Additionally, because the load power of this cycle is constant and simple, four more scenarios (cycles #2–#5) with operation times from 2 to 4 h were included in the simulation. These cycles are shown in Table 2.

**Table 2.** Required electrical load cycles representing different scenarios.
