**3. Results**

The results are discussed following the steps described in Sections 2.2–2.4.

#### *3.1. Step 1. Establish Prediction Models*

#### (a) AHU and HVAC control unit: Parametric model

As mentioned in Section 2.2, from the scatter plot obtained for the energy demands of the AHU and HVAC control unit load group (*EAHU&controls*) against the outdoor temperature (*Toutdoor*), a parametric approach was considered for demand prediction. Figure 12 shows the results of the curve fitting, where (a) shows the measurements and fitted curve, and (b) the residuals. The fitted parameters to Equation (1) are provided in Table 2. The resulting parametric model is given by Equation (11).

$$E\_{\text{AHU\&controls},t=i} = \exp\left(\left(0.0075 \cdot T\_{\text{outdoor}}^{2} - 0.3576 \cdot T\_{\text{outdoor}} + 123.8934\right) \cdot \left[0.01555 + \frac{-0.0016}{1 + \exp\left(-(-1.0342) \cdot \left(T\_{\text{outdoor}} - 21.2692\right)\right)}\right]\right) \tag{11}$$

A major cause of the deviations above the fitted curve is found to be the opening of the variable air volume (VAV) valve in response to the indoor CO<sub>2</sub> concentration. The error metrics for the fitted curve are a WAPE of 2.81% and a CVRMSE of 4.36%. Since these values are in line with the requirements, the model is considered sufficiently accurate for its predictions.
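As a minimal sketch, the parametric model of Equation (11) and the two error metrics quoted above can be expressed as follows. The function and argument names are illustrative, not taken from the paper's code; the coefficients are transcribed directly from Equation (11).

```python
import numpy as np

def predict_e_ahu_controls(t_outdoor):
    """Parametric demand model of Equation (11) for the AHU and HVAC
    control unit load group; t_outdoor is in degrees Celsius."""
    t = np.asarray(t_outdoor, dtype=float)
    quadratic = 0.0075 * t**2 - 0.3576 * t + 123.8934
    logistic = 0.01555 + (-0.0016) / (1.0 + np.exp(-(-1.0342) * (t - 21.2692)))
    return np.exp(quadratic * logistic)

def wape(measured, predicted):
    """Weighted absolute percentage error, in percent."""
    measured, predicted = np.asarray(measured, float), np.asarray(predicted, float)
    return 100.0 * np.abs(predicted - measured).sum() / np.abs(measured).sum()

def cvrmse(measured, predicted):
    """Coefficient of variation of the RMSE, in percent."""
    measured, predicted = np.asarray(measured, float), np.asarray(predicted, float)
    rmse = np.sqrt(np.mean((predicted - measured) ** 2))
    return 100.0 * rmse / measured.mean()
```

The same `wape` and `cvrmse` helpers apply to every load-group model evaluated in this section.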

**Figure 12.** (**a**) Final function fit power demand for the AHU and HVAC control unit during office hours. (**b**) Residuals plot.


**Table 2.** Numerical values for parameters.

#### (b) Chiller: Multi-variable linear regression

For chiller demand prediction, multi-variable linear regression is used. The results of the k-fold cross-validation for different numbers of variables involved in the regression formula (see Equation (2)) are shown in Figure 13. Overall, the performance is very similar regardless of the number of variables involved. Performance improves slightly when moving from one variable to two. This is in line with the literature, which suggests that temperatures at previous time steps (t−1) may have the largest impact on the building cooling load at time t [42]. Including more variables in the regression results in unexpected behavior: the performance first decreases and then appears to increase again. Since this behavior cannot be physically explained, the sudden increase in performance when using ≥ 6 variables is attributed to a mathematical or coincidental origin rather than a physical one.

**Figure 13.** K-fold cross-validation results for multi-variable linear regression.

The results of the k-fold cross-validation are investigated for up to 10 variables; see Figure 14. As discussed in the previous section, k-fold cross-validation with k = 10 is used: the process of taking the 9 training folds, determining the regression coefficients, and then evaluating on the validation fold is repeated 10 times, until every fold has been used once as a validation set. Each repetition yields a set of regression coefficients, which are plotted in Figure 14. The value of a1 is nearly identical across all folds, indicating low uncertainty in this parameter. Most values of a2 lie slightly below zero, indicating a weak negative correlation. For a3 and beyond, the parameter values become highly uncertain; from the distribution of values around zero, it is not even clear whether there is a small positive correlation, a small negative correlation, or no correlation at all. Therefore, these and subsequent terms were not considered.
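The fold-by-fold coefficient analysis above can be sketched as follows, assuming hourly outdoor temperatures and chiller demands as inputs; the function name and the use of NumPy's least-squares solver are assumptions, not the paper's implementation.

```python
import numpy as np

def kfold_regression_coefficients(temps, demand, n_lags=2, k=10):
    """For each of k folds, fit E(t) = a0 + a1*T(t) + a2*T(t-1) + ...
    on the training folds by least squares and return the coefficient
    set of every fold, so their spread can be inspected (cf. Figure 14)."""
    temps = np.asarray(temps, dtype=float)
    demand = np.asarray(demand, dtype=float)
    rows = len(temps) - (n_lags - 1)
    # Design matrix: constant column plus lagged temperature columns.
    X = np.column_stack(
        [np.ones(rows)]
        + [temps[n_lags - 1 - lag : len(temps) - lag] for lag in range(n_lags)]
    )
    y = demand[n_lags - 1 :]
    folds = np.array_split(np.arange(rows), k)
    coeffs = []
    for fold in folds:
        train = np.setdiff1d(np.arange(rows), fold)
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        coeffs.append(beta)
    return np.array(coeffs)  # shape (k, n_lags + 1)
```

A tight spread of a coefficient across the k rows of the returned array corresponds to the low-uncertainty behavior observed for a1.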

**Figure 14.** Regression coefficients for k-fold cross-validation using 10 variables.

The regression with two variables reaches a local performance maximum, with an *R*<sup>2</sup> value of 0.89, a WAPE of 19.7%, and a CVRMSE of 27.6%. Since the best results are obtained using two variables, a model of the form shown in Equation (12) is fitted to the dataset. The corresponding coefficients are provided in Table 3.

$$E\_{\text{chiller},t=i} = a\_0 + a\_1 \cdot T\_{\text{outdoor},t=i} + a\_2 \cdot T\_{\text{outdoor},t=i-1} \left[\text{kWh}\,\text{h}^{-1}\right] \tag{12}$$



#### (c) Plug loads and lighting: Recent day model

The results of the performance assessment for the lighting and plug load predictions according to the model described by Equation (3) are shown in Figure 15. The performance for all hours (= working + non-working) of the dataset is shown in blue, and the prediction performance for work hours only is shown in orange.

**Figure 15.** (**a**) Performance assessment of predictions for lighting loads. (**b**) Performance assessment of predictions for plug loads.

The predictions show a sharp increase in performance between *N* = 1 and *N* = 2 for both lighting and plug loads. Using multiple historic data points for the prediction of a future data point is thought to have a stabilizing effect because the outliers which may be present in the historic data are combined with more representative historic values. The predictions, which are made by taking the average of these historic data points, are therefore less affected by outliers. Both the lighting and plug load predictions reach optimum performance when using five historic days (*N* = 5) in the forecast. With *N* = 5 for work hours, the lighting load predictions yield a WAPE of 8.6% and a CVRMSE of 12.0%, and the plug load predictions yield a WAPE of 10.6% and a CVRMSE of 13.8%.
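The averaging described above can be sketched in a few lines; the function name is illustrative, and it is assumed that non-working days have already been filtered out of the history, as the recent-day model of Equation (3) operates on comparable days.

```python
import numpy as np

def recent_day_forecast(history, n_days=5):
    """Predict the next day's hourly profile as the element-wise mean of
    the last n_days daily profiles (a sketch of the recent-day model).
    `history` is a sequence of daily profiles, most recent last."""
    history = np.asarray(history, dtype=float)
    return history[-n_days:].mean(axis=0)
```

Averaging over several days dilutes the influence of any single outlier day, which is the stabilizing effect observed between *N* = 1 and *N* = 2.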

#### (d) PV power: Solargis®

During the night there is no sunlight, so the PV power predictions for these hours are trivially zero and always correct; including them would artificially inflate the apparent accuracy. Therefore, predictions for the hours between 17:00 and 07:00 are excluded from the Solargis® PV prediction evaluation. Prediction accuracy is determined by comparing the Solargis® predictions with the AC-side power measurements of the PV system.
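The night-hour exclusion can be sketched as a simple filter applied before any error metric is computed; the record layout `(timestamp, predicted, measured)` and the function name are assumptions for illustration.

```python
from datetime import datetime

def daytime_only(records, start_hour=7, end_hour=17):
    """Keep only (timestamp, predicted, measured) records whose hour falls
    in [start_hour, end_hour), dropping the trivially correct night-time
    zero-PV predictions before error metrics are computed."""
    return [r for r in records if start_hour <= r[0].hour < end_hour]
```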

Figure 16 gives an overview of the Solargis® prediction performance as a function of the lead time for the case study building. The overall performance of the predictions shows a rapid decrease in prediction accuracy in the first few lead hours. A peak at 5 lead hours is followed by a sharp decrease in the MAE and the RMSE, after which a long approximately stable period follows. The peak and subsequent decrease mark the transition between satellite-based models and numerical weather prediction models used by Solargis® [46].

The subplots for the different seasons all show similar behavior in terms of the MAE and the RMSE. However, the MBE clearly shows different fluctuations depending on the season and the lead time. As stated earlier, the MBE indicates whether there is systematic overestimation or underestimation in the predictions. In principle, the MBE can be used to easily correct prediction inaccuracy, e.g., by subtracting the value of the MBE in case of overestimation. Due to the various fluctuations, there will be no attempt to improve prediction accuracy through MBE compensation. Such an analysis is beyond the scope of this research.

#### (e) Outdoor temperature: Solargis®

The accuracy of the temperature predictions is assessed by comparing the Solargis® temperature predictions with the measurements recorded by the weather station on the roof of the building; see Figure 17.

**Figure 16.** PV yield prediction performance assessment.

**Figure 17.** Weather station at the case study building.

The results of the analysis of the temperature forecasts for the case study building are shown in Figure 18, both for all data points combined and per season. From the magnitude of the MAE and the RMSE (without MBE correction), it can be seen that the temperature predictions are quite accurate overall. Autumn and winter temperatures are predicted best, and the errors increase only gradually for longer lead times. For an unknown reason, a small spike is observed at a lead time of 25 h. The MBE in the overall assessment shows a steady underestimation of –0.5 °C. Although this value varies slightly across the seasons, a systematic underestimation (indicated by the "–" sign) occurs throughout. MBE correction can therefore be used to improve the predictions, by adding 0.5 °C to all the Solargis® temperature predictions in the dataset. The dashed lines in Figure 18 show the modified results: the MBE has shifted upwards to the desired value of ~0 °C, and the lower MAE and RMSE values in all plots show that this correction is an appropriate measure to better align the predictions with the measurements.
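The bias correction described above amounts to subtracting the MBE from every prediction; a minimal sketch (function names are illustrative, with the sign convention from the text: negative MBE means underestimation):

```python
import numpy as np

def mbe(predicted, measured):
    """Mean bias error; negative values indicate systematic underestimation."""
    return float(np.mean(np.asarray(predicted, float) - np.asarray(measured, float)))

def mbe_correct(predicted, measured):
    """Shift the predictions by minus the MBE so the corrected series has
    ~zero bias (the +0.5 degC shift applied to the temperature forecasts)."""
    return np.asarray(predicted, float) - mbe(predicted, measured)
```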

**Figure 18.** Outdoor temperature prediction performance assessment.

#### 3.1.1. Summary of Subcomponent Prediction Models

The subprediction models and Solargis® services described in Section 2.2 form the building blocks of the complete building energy demand prediction. A summary of the developed models and the corresponding performance metrics for each load group is provided in Table 4. For the purpose of this research, the achieved model accuracies are considered sufficient.

**Table 4.** Summary of the prediction performance of the best performing model for each load group.


#### 3.1.2. Total Demand Prediction of the Building

The established prediction models for all load groups and the Solargis® temperature and PV prediction services are integrated into a combined model that predicts the building's total energy demand. The dataset used in the integrated model consists of the Solargis® data and historic building energy demands from 25 May 2018 to 4 April 2019. The error metrics are computed for the predictions calculated at 00:00, and only predictions for workdays between 07:00 and 17:00 are included. Figure 19 shows the average prediction accuracy of all the models for each month. Predictions for lighting and plug loads are combined and assessed together for convenience.
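The per-month error aggregation behind Figure 19 can be sketched as follows; the function name and dictionary layout are assumptions, and the inputs are equal-length per-sample arrays of month labels, predictions, and measurements.

```python
import numpy as np

def monthly_errors(months, predicted, measured):
    """Aggregate MAE and RMSE per calendar month (cf. Figure 19).
    `months` holds one month label per sample."""
    months = np.asarray(months)
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    out = {}
    for m in np.unique(months):
        err = predicted[months == m] - measured[months == m]
        out[m] = {"MAE": float(np.mean(np.abs(err))),
                  "RMSE": float(np.sqrt(np.mean(err ** 2)))}
    return out
```

A large gap between RMSE and MAE within a month signals a few large individual errors, which is how the January/February behavior in Figure 19c is diagnosed.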

From Figure 19b it can be seen that prediction errors are largest during the summer months. Since the chiller operates most during these months, and with the highest energy demands, the magnitude of the error is also larger. During the colder months, the chiller is mostly in standby mode and is thus predicted nearly perfectly because the standby power is constant. In Figure 19c, January and February show an above-average error magnitude. From the large relative difference between the MAE and the RMSE, it follows that there were a few moments with a relatively large prediction error. These are caused by the opening of the variable air volume (VAV) valve in response to the CO<sub>2</sub> concentration, a factor that is not considered in the predictions.

**Figure 19.** Prediction errors for work hours (07:00 to 17:00).

Lighting and plug load predictions show above-average errors in January and March. As can be seen for the first day in Figure 20a, and in Figure 20b, the building shows completely different behavior compared to expectations. Due to abnormal building operation, probably caused by anomalously low occupancy, the predictions are far off, resulting in a large prediction error. Moreover, because the history-based model used for lighting and plug load predictions incorporates these abnormal days into the forecasts for the following days, one abnormal day can trigger a cascade of degraded prediction accuracy lasting several days. Nonetheless, the overall prediction errors for this load group are comparatively small, and the prediction results are satisfactory.

**Figure 20.** Bad prediction performance in (**a**) January and (**b**) March.

#### *3.2. Step 2. BESS Simulations*

In the previous step, multiple prediction models were developed and ultimately combined to predict the total demand of the building. The prediction models are integrated with the proposed operational strategy and simulated in MATLAB. The results of the simulations are evaluated in this section. Table 5 provides an overview of the assessed Key Performance Indicators (KPIs). The overview shows that total energy consumption (KPI 1) has increased by 2.2%, which is caused by conversion losses in the BESS (KPI 4). From a decrease of 60.9% in exported electricity (KPI 2), it follows that the operational strategy has significantly increased self-consumption (KPI 5), from 82.3% to 93.1%. Due to the storage of excess PV power that would otherwise be exported, the amount of imported electricity (KPI 3) is reduced by 0.4%, as the BESS was capable of providing (some of) the required energy. Overall, self-sufficiency (KPI 6) has increased by 2%, meaning that the ratio of self-consumed PV electricity to total energy consumption (KPI 1) has improved. KPI 7, which quantifies the ability of the system to maintain the baseline, shows that the baseline was successfully maintained for 97.2% of the time on weekdays between 07:00 and 16:33 (see also Figure 10).
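As a minimal sketch under standard definitions (self-consumption as the share of PV generation used on site, self-sufficiency as the share of consumption covered by on-site PV), KPIs 2, 3, 5, and 6 can be computed from hourly energy series as below. The function name and input layout are assumptions; the paper's exact KPI formulas may differ in detail.

```python
import numpy as np

def energy_kpis(demand, pv, imported, exported):
    """Self-consumption and self-sufficiency KPIs from hourly energy
    series in kWh. PV used on site = PV generated - PV exported."""
    pv_total = float(np.sum(pv))
    consumed = float(np.sum(demand))
    self_consumed = pv_total - float(np.sum(exported))
    return {
        "self_consumption_%": 100.0 * self_consumed / pv_total,
        "self_sufficiency_%": 100.0 * self_consumed / consumed,
        "imported_kWh": float(np.sum(imported)),
        "exported_kWh": float(np.sum(exported)),
    }
```

Under these definitions, storing otherwise-exported PV energy in the BESS raises both ratios, which matches the direction of the changes in Table 5.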


In Section 2.3.3, the Baseline Deviation Duration Curve (BDDC) was defined. This curve provides a visual impression of the ability of the demand curve to maintain the baseline throughout the day. Figure 21 illustrates the baseline deviation.

**Figure 21.** (**a**) Baseline Deviation Duration Curve (BDDC) with 15-minute resolution data. (**b&c**) Close-up of the corners of (**a**).

It is important to realize that the BESS cannot always store/deliver power due to the applied constraints: whenever the difference between *Pbuilding,net* and the baseline is too small, the battery will not deliver or store power. A slight deviation from the baseline (BL) value therefore cannot be prevented, and this is not a problem. This is why the tolerance band, within which the baseline deviation is considered acceptable, is defined. The green area marks the bandwidth around the zero line from –3.51 kW to 2.85 kW. These values follow from the minimum charging/discharging power constraint of 3 kW that must be exceeded before the BESS starts to operate, combined with the charging and discharging efficiencies (85.5% and 90%, respectively).


Therefore, whenever the baseline deviation ≤ –3.51 kW, the BESS should have charged to fill the valley, and whenever the baseline deviation ≥ 2.85 kW, the BESS should have discharged to shave the peak. Baseline deviation values outside of the tolerance band thus mean that the BESS was incapable of maintaining the BL for reasons other than the minimum power constraint.
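Constructing the BDDC and classifying the deviation samples against the tolerance band can be sketched as follows, assuming 15-minute deviation samples in kW; the function name and return layout are illustrative.

```python
import numpy as np

def baseline_deviation_duration(deviation_kw, dt_hours=0.25,
                                lower=-3.51, upper=2.85):
    """Sort baseline deviation samples in descending order to obtain the
    BDDC, and report how long the deviation stayed inside the tolerance
    band [lower, upper] (15-minute samples by default)."""
    dev = np.sort(np.asarray(deviation_kw, dtype=float))[::-1]  # descending
    within = np.sum((dev >= lower) & (dev <= upper))
    return {
        "bddc": dev,
        "hours_within": float(within * dt_hours),
        "hours_peak_unshaved": float(np.sum(dev > upper) * dt_hours),
        "hours_valley_unfilled": float(np.sum(dev < lower) * dt_hours),
        "share_within_%": 100.0 * within / dev.size,
    }
```

The red and yellow areas in Figure 21 correspond to the `hours_peak_unshaved` and `hours_valley_unfilled` totals, and `share_within_%` corresponds to the 97.2% figure.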

From the parts of the load duration curve outside of the green area, it can be seen that it was not always possible to discharge/charge to deliver/store the power necessary to maintain the baseline. The peaks which could not be shaved by the BESS are marked by the red area and have a total duration of ~32 hours during the evaluated period. The valleys which could not be filled by the BESS are marked by the yellow area and have a total duration of ~19 hours. Nevertheless, it can be concluded that the system is well capable of actively maintaining the baseline 97.2% of the time (within the green tolerance band). From Figure 22 it can be seen that, overall, there is a decrease in load duration of high positive power and an increase in the duration of low positive power. This is the direct consequence of the load balancing strategies wherein peaks are shaved and valleys are filled.

**Figure 22.** Load duration curves for (**a**) work hours only and (**b**) all hours of the dataset. (**c&d**) Close-ups of (**a**). (**e&f**) Close-ups of (**b**).

#### *3.3. Step 3. BMS Implementation*

The KPIs of the building when operating the BESS can readily be calculated from the measurements that are extracted from the Building Management System (BMS) during the experimental period from 7 August 2019 to 19 August 2019. An overview of the resulting KPIs for the experimental period is shown in Table 6.


**Table 6.** Key performance indicator (KPI) assessment for experimental results.

After introducing the BL strategy, total energy consumption increased due to BESS losses. Furthermore, exported electricity to the national grid is reduced from 115 to 70 kWh, while imported electricity increased from 1500 to 1582 kWh. Self-sufficiency has increased from 31.5% to 31.7%, and self-consumption from 85.8% to 91.3%. Finally, during 96.2% of the time, the BESS was able to successfully maintain the BL within the tolerance, thereby demonstrating that the load shape objectives are met most of the time. In the future, the value of this flexibility can be established once the relevant guidelines and regulations are provided by the energy markets. The load duration curves for the experimental period on the real building are shown in Figure 23.

**Figure 23.** Load duration curves for the experimental results (**a**) Work hours only. (**b**) All hours of the dataset. (**c&d**) Close-ups of (**a**). (**e&f**) Close-ups of (**b**).

The ability of the demand curve to maintain the BL is visualized in Figure 24 using a Baseline Deviation Duration Curve (BDDC). During 96.2% of the time, the BL was maintained within the constraints. There was a total duration of 3 hours wherein peaks were not shaved. However, all valleys were effectively filled during the experimental period.

**Figure 24.** (**a**) Baseline Deviation Duration Curve (BDDC) for the experimental results of strategy 2 with 15-minute resolution data. (**b&c**) Close-up of (**a**).
