*3.3. Data Measurement/Preparation for Occupancy Prediction*

Occupancy information, as stated above, is one of the most important inputs to context-driven approaches for the efficient control of building equipment. Several studies show the effectiveness of integrating occupancy into MPC for HVAC, ventilation, and lighting control [60,61]. To obtain accurate occupancy information, different techniques can be used, including PIR sensors, cameras, RFID, Wi-Fi, Bluetooth Low Energy (BLE), and environmental data (e.g., CO2, temperature, and humidity) [62–64].

In this work, a data set containing almost 28,000 instances, recorded over one day from 8:30 until 19:00, has been used. The data were collected in the EEBLab using the occupancy number node and stored in a MongoDB database. The occupancy profile varied between one and seven occupants during the day (see Figure 7).

**Figure 7.** Occupants' numbers over the day in EEBLab.
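As a rough check on the temporal resolution of this data set, the mean sampling interval can be estimated from the recording span and the number of instances. The sketch below assumes uniformly spaced samples and rounds the instance count to 28,000; the function name `mean_sampling_interval` is hypothetical and not part of the deployed platform.

```python
from datetime import datetime


def mean_sampling_interval(start: str, end: str, n_samples: int) -> float:
    """Return the mean time between samples, in seconds.

    Assumes both times fall on the same day and that samples are
    evenly spaced, which the source does not state explicitly.
    """
    fmt = "%H:%M"
    span = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return span.total_seconds() / n_samples


# 08:30 to 19:00 is 10.5 h = 37,800 s; with about 28,000 instances
# this gives roughly one sample every 1.35 s.
interval_s = mean_sampling_interval("08:30", "19:00", 28_000)
print(f"mean sampling interval: {interval_s:.2f} s")
```

Under these assumptions, one occupancy reading is available roughly every 1.35 seconds.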

Deep learning-based occupancy forecasting techniques have been investigated [65]. Two recurrent neural network (RNN) based methods, long short-term memory (LSTM) and gated recurrent unit (GRU), were evaluated and compared in terms of accuracy and root mean square error (RMSE). Both are extensions of the traditional RNN that integrate internal gates, which help decide whether to keep or discard past information. The idea is to evaluate the generated models first and then decide which one to deploy in the EEBLab. In this study, the LSTM model performed better on both metrics, accuracy (LSTM 98.7%, GRU 97.5%) and RMSE (LSTM 3.34, GRU 3.73), and was therefore selected for the experiment. In this case study, Apache Kafka is used to consume the actual occupancy count coming from the HOLSYS platform; the model then forecasts the next 10 steps ahead, and this forecast serves as a real-time input to the MPC controller model.
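To illustrate the data preparation and evaluation steps described above, the following minimal sketch shows how an occupancy series could be sliced into training pairs for a 10-step-ahead forecaster and how the RMSE of a forecast could be computed. It is not the authors' implementation: the helper names (`make_windows`, `rmse`, `persistence_forecast`), the input window length, and the toy series are all assumptions, and a naive persistence baseline stands in for the trained LSTM or GRU model.

```python
import math
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], n_in: int, n_out: int = 10
) -> Tuple[List[List[float]], List[List[float]]]:
    """Slice a series into (input window, next n_out values) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - n_in - n_out + 1):
        inputs.append(list(series[start:start + n_in]))
        targets.append(list(series[start + n_in:start + n_in + n_out]))
    return inputs, targets


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def persistence_forecast(window: Sequence[float], n_out: int = 10) -> List[float]:
    """Naive stand-in for a trained model: repeat the last observed count."""
    return [window[-1]] * n_out


# Toy occupancy counts between one and seven (illustrative values only).
counts = [1, 1, 2, 2, 3, 4, 4, 5, 6, 7, 7, 6, 5, 5, 4, 3, 3, 2, 2, 1]
X, Y = make_windows(counts, n_in=5, n_out=10)

# Score the baseline on every window by its RMSE over the 10-step horizon.
errors = [rmse(y, persistence_forecast(x)) for x, y in zip(X, Y)]
print(f"{len(X)} windows, mean baseline RMSE = {sum(errors) / len(errors):.2f}")
```

In a deployment along the lines described here, the input window would be filled from the occupancy messages consumed from the stream, and the 10 predicted values would be passed to the MPC controller at each step.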
