#### *6.2. System Architecture*

The architecture proposed for the application case is shown in Figure 10. As can be observed, there are two main layers:

- **Lower latency**: since most of the processing is carried out locally, the mist computing device can respond faster.


Despite the benefits of using mist AI-enabled nodes, it is important to note that such IIoT nodes, since they integrate cameras/sensors together with the control hardware, are more expensive and complex (i.e., there are more hardware parts that can fail).

- **Cloud**: it behaves as in the edge-computing-based architecture; consequently, it handles the requests that the mist devices cannot process locally.

**Figure 10.** Mist-computing-based communications architecture.

#### *6.3. Energy Consumption of the Mist AI-Enabled Model*

In this application case, latency is a critical factor, and a low fault-tolerance policy needs to be implemented. To achieve the "Increase Safety" goal, the use of object detection models with low inference latency is mandatory. In this case, human movement dynamics are typically slow, since running on the factory floor is usually not allowed. Moreover, with respect to the "Operations Tracking" goal, the inference latency is not critical, since, due to its deterministic nature, it does not affect the obtained results.

To estimate the energy cost of the overall system, the data presented in Table 3 for an STM32 Nucleo-L4R5ZI processor running TensorFlow Lite were considered, with a MobileNet-V1 model (Task #1, Visual Wake Words) simulating the "Increase Safety" task and a ResNet-V1 model (Task #2, Image Classification) simulating the "Operations Tracking" task. The former is a binary image classification task that detects the presence of a person, with an inference latency of 603.14 ms and an energy consumption of 24,320.84 μJ per inference (1 J = 2.77777778 × 10<sup>−7</sup> kWh). The latter is an image classification benchmark with 10 classes for smart video recognition applications, with an inference latency of 704.23 ms and an energy consumption of 29,207.84 μJ per inference. At this stage, it is important to note that only inference is being considered, since no information is available regarding the training stage, namely its energy consumption.

First, the number of inferences can be estimated for a year and one camera, and then the overall power consumption can be extrapolated to all cameras, based on the previous assumptions:

$$N_{VWW} = \frac{365 \times 24 \times 3600 \,\mathrm{s}}{603.14 \,\mathrm{ms}} = 52{,}286{,}368 \,\text{inferences/year} \tag{5}$$

$$E_{VWW} = N_{VWW} \times 24{,}320.84 \,\mathrm{\mu J} = 1{,}271{,}648.4 \,\mathrm{J} = 0.353 \,\mathrm{kWh/device} \tag{6}$$

$$N_{IC} = \frac{365 \times 24 \times 3600 \,\mathrm{s}}{704.23 \,\mathrm{ms}} = 44{,}780{,}824 \,\text{inferences/year} \tag{7}$$

$$E_{IC} = N_{IC} \times 29{,}207.84 \,\mathrm{\mu J} = 1{,}307{,}951.2 \,\mathrm{J} = 0.363 \,\mathrm{kWh/device} \tag{8}$$

where *N<sub>x</sub>* represents the number of inferences per year for model *x* (VWW or IC) and *E<sub>x</sub>* represents the total equivalent energy consumed per device in one year. In this particular case, the energy refers only to that consumed by the inference task. Since this study focuses exclusively on the additional power consumption of the inference stage, the power consumed by the remaining functional hardware blocks has not been included.
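As a sanity check, the yearly figures in Equations (5)–(8) can be reproduced with a few lines of Python (a minimal sketch; the latency and per-inference energy values are those reported in Table 3, and back-to-back inference is assumed):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
J_PER_KWH = 3.6e6  # 1 kWh = 3.6 MJ

def yearly_inference_energy(latency_ms, energy_uj):
    """Return (inferences per year, total energy in kWh) for one device
    running inferences back-to-back for a full year."""
    n = SECONDS_PER_YEAR / (latency_ms / 1000)   # inferences/year
    e_kwh = n * energy_uj * 1e-6 / J_PER_KWH     # uJ -> J -> kWh
    return n, e_kwh

# Table 3 values: VWW (MobileNet-V1) and IC (ResNet-V1)
n_vww, e_vww = yearly_inference_energy(603.14, 24_320.84)
n_ic, e_ic = yearly_inference_energy(704.23, 29_207.84)
print(f"VWW: {n_vww:,.0f} inferences/year, {e_vww:.3f} kWh/device")
print(f"IC:  {n_ic:,.0f} inferences/year, {e_ic:.3f} kWh/device")
```

Running the sketch reproduces the approximately 0.353 kWh and 0.363 kWh per device obtained above.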

Equation (6) indicates that each camera, when running the VWW model, consumes approximately 0.353 kWh per year. When running the IC model for the same period (Equation (8)), each camera consumes approximately 0.363 kWh. Therefore, extrapolating to the 18 cameras, the total yearly consumption is 6.354 kWh and 6.534 kWh for the VWW and IC models, respectively. This power consumption is on the Green-AI magnitude scale, and the yearly inference cost of all 18 cameras can easily be covered by a conventional renewable energy source, such as a photovoltaic panel.
