**Nomenclature**


#### **Appendix A. Variable Importance Measures ("%IncMSE")**

In the random forest algorithm, the variable importance is represented by "%IncMSE".

$$\%IncMSE\left(v\_{j}\right) = 100\% \times \left[MSE\left(-v\_{j}\right) - MSE\right] / MSE \tag{A1}$$

where *MSE*(−*vj*) is the MSE obtained when the variable *vj* is not used in the prediction, and *MSE* is the MSE obtained with all variables. A higher %*IncMSE* indicates that the variable *vj* is more important.
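Equation (A1) can be sketched in a few lines of Python. This is an illustration of the formula only, not the implementation used in the study; the variable names `T_wb` and `load` and all MSE values are hypothetical.

```python
def pct_inc_mse(mse_full: float, mse_without: float) -> float:
    """%IncMSE of one variable, following Equation (A1).

    mse_full:    MSE of the model using all variables.
    mse_without: MSE when the variable of interest is not used.
    """
    if mse_full <= 0:
        raise ValueError("mse_full must be positive")
    return 100.0 * (mse_without - mse_full) / mse_full


# Hypothetical MSE values, for illustration only.
mse_all = 0.040
mse_without_var = {"T_wb": 0.052, "load": 0.090}

# Rank the variables from most to least important.
ranking = sorted(
    mse_without_var,
    key=lambda v: pct_inc_mse(mse_all, mse_without_var[v]),
    reverse=True,
)
```

With these made-up numbers, `load` ranks above `T_wb` because leaving it out raises the MSE more.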

#### **Appendix B. SCOP Measurement Uncertainty**

This appendix quantifies the measurement uncertainty of SCOP. SCOP is defined by Equations (A2)–(A4), where *Qsys* is the system cooling capacity, *cw* is the specific heat capacity of water and *mw* is the water flow rate (see the nomenclature for the other symbols). Assuming the sensors and meters are well calibrated, their reported accuracies from a chiller design guide [38] can be used to quantify the measurement uncertainty of SCOP.

$$\text{SCOP} = Q\_{\text{sys}} / P\_{\text{sys}} \tag{A2}$$

$$Q\_{\text{sys}} = c\_{w} m\_{w} \left( T\_{\text{CHWR}} - T\_{\text{CHWS}} \right) \tag{A3}$$

$$P\_{\text{sys}} = P\_{\text{ch}} + P\_{\text{ct}} + P\_{\text{pump}} + P\_{\text{AHU}} \tag{A4}$$

Considering the measurement uncertainties in the cooling capacity and power, Equation (A2) can be rewritten as Equation (A5), where the subscript *mea* denotes a measured value, and Δ*Psys*,*mea*, Δ*Qsys*,*mea* and Δ*SCOPmea* are the measurement uncertainties of the power, cooling capacity and SCOP, respectively.

$$\text{SCOP}\_{\text{mea}} = Q\_{\text{sys},\text{mea}} / P\_{\text{sys},\text{mea}} \tag{A5}$$

where *Psys*,*mea* = *Psys* + Δ*Psys*,*mea*; *Qsys*,*mea* = *Qsys* + Δ*Qsys*,*mea*; *SCOPmea* = *SCOP* + Δ*SCOPmea*.

Since multiple uncertainties are involved in the SCOP calculation, uncertainty shift can be used to integrate the individual variable uncertainties and simplify the analysis [39]. The measurement uncertainty of the power consumption is given directly by the power meter uncertainty. The measured cooling capacity is given by Equation (A6), and its measurement uncertainty can be calculated by Equation (A7), as demonstrated in [39].

$$Q\_{\text{sys},\text{mea}} = c\_{w} m\_{w,\text{mea}} \left( T\_{\text{CHWR},\text{mea}} - T\_{\text{CHWS},\text{mea}} \right) \tag{A6}$$

$$\begin{split} \Delta Q\_{\text{sys},\text{mea}} &= Q\_{\text{sys},\text{mea}} - Q\_{\text{sys}} \\ &= c\_{w} \left[ \Delta m\_{w,\text{mea}} \left( T\_{\text{CHWR},\text{mea}} - T\_{\text{CHWS},\text{mea}} \right) + m\_{w,\text{mea}} \left( \Delta T\_{\text{CHWR},\text{mea}} - \Delta T\_{\text{CHWS},\text{mea}} \right) \right] \end{split} \tag{A7}$$

Here *mw*,*mea* = *mw* + Δ*mw*,*mea*; *TCHWR*,*mea* = *TCHWR* + Δ*TCHWR*,*mea*; *TCHWS*,*mea* = *TCHWS* + Δ*TCHWS*,*mea*, where Δ*mw*,*mea*, Δ*TCHWR*,*mea* and Δ*TCHWS*,*mea* are the measurement uncertainties of the corresponding variables.
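The propagation in Equations (A5)–(A7) can be illustrated with the minimal Python sketch below. It is not the authors' calculation: all numerical values (flow rate, temperatures, sensor uncertainties, power) are hypothetical and are not taken from Table A1. The factor *cw* is carried through for consistency with Equation (A6). The sketch also computes the exact difference *Qsys*,*mea* − *Qsys*, which differs from the first-order expression of Equation (A7) by the second-order term *cw* Δ*mw*,*mea* (Δ*TCHWR*,*mea* − Δ*TCHWS*,*mea*).

```python
def first_order_delta_q(c_w, m_mea, t_chwr_mea, t_chws_mea, d_m, d_t_chwr, d_t_chws):
    """First-order cooling-capacity uncertainty in the form of Equation (A7)."""
    return c_w * (d_m * (t_chwr_mea - t_chws_mea) + m_mea * (d_t_chwr - d_t_chws))


def exact_delta_q(c_w, m_mea, t_chwr_mea, t_chws_mea, d_m, d_t_chwr, d_t_chws):
    """Q_sys,mea - Q_sys, with each measured value shifted back by its uncertainty."""
    q_mea = c_w * m_mea * (t_chwr_mea - t_chws_mea)                    # Equation (A6)
    q_true = c_w * (m_mea - d_m) * ((t_chwr_mea - d_t_chwr) - (t_chws_mea - d_t_chws))
    return q_mea - q_true


def delta_scop(q_mea, p_mea, d_q, d_p):
    """SCOP_mea - SCOP, using SCOP_mea = Q_mea / P_mea as in Equation (A5)."""
    scop_mea = q_mea / p_mea
    scop = (q_mea - d_q) / (p_mea - d_p)
    return scop_mea - scop


# Hypothetical operating point and sensor uncertainties (assumptions, not Table A1).
C_W = 4.186                         # kJ/(kg K)
M_MEA = 50.0                        # kg/s
T_CHWR_MEA, T_CHWS_MEA = 12.0, 7.0  # degC
D_M = 0.5                           # kg/s
D_T_CHWR, D_T_CHWS = 0.05, -0.05    # K, signs chosen to maximise Delta Q
P_MEA, D_P = 250.0, -2.5            # kW

q_mea = C_W * M_MEA * (T_CHWR_MEA - T_CHWS_MEA)
d_q = first_order_delta_q(C_W, M_MEA, T_CHWR_MEA, T_CHWS_MEA, D_M, D_T_CHWR, D_T_CHWS)
d_scop = delta_scop(q_mea, P_MEA, d_q, D_P)
relative_uncertainty = d_scop / (q_mea / P_MEA)
```

Evaluating `delta_scop` for each sign combination of the individual uncertainties gives the largest positive and negative SCOP uncertainties for an operating point.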

Based on the measurement accuracies in Table A1, the measurement uncertainties of the cooling capacity and power can be calculated. The simulated operation data are treated as the measured values. As presented in Table A2, the maximum positive and negative relative uncertainties of Δ*SCOPmea* are +3.73% and −3.95%, with absolute uncertainties of +0.0708 and −0.0694, respectively.





#### **Appendix C. Energy Saving Estimation**

The energy saving percentage at a time instant τ can be estimated using Equation (A8) by substituting the values of *SCOP*(τ) and *SCOPidl*(τ).

$$\frac{\Delta P(\tau)}{P(\tau)} = \frac{P(\tau) - P\_{\text{idl}}(\tau)}{P(\tau)} = \frac{\text{SCOP}\_{\text{idl}}(\tau) - \text{SCOP}(\tau)}{\text{SCOP}\_{\text{idl}}(\tau)} \tag{A8}$$

where *P*(τ) = *load*(τ)/*SCOP*(τ), *Pidl*(τ) = *load*(τ)/*SCOPidl*(τ), *P* is the power, the subscript *idl* denotes the ideal value, and τ is the time instant.
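As a quick check of Equation (A8), the short Python sketch below computes the saving both from the two SCOP values and from the corresponding powers. The load and SCOP values are hypothetical and chosen only for illustration.

```python
def energy_saving_fraction(scop: float, scop_idl: float) -> float:
    """Energy saving fraction from Equation (A8): (SCOP_idl - SCOP) / SCOP_idl."""
    return (scop_idl - scop) / scop_idl


# Hypothetical values at one time instant.
load = 1000.0       # cooling load, kW
scop = 4.0          # actual SCOP
scop_idl = 5.0      # ideal SCOP

p = load / scop          # actual power, 250 kW
p_idl = load / scop_idl  # ideal power, 200 kW

saving_from_power = (p - p_idl) / p
saving_from_scop = energy_saving_fraction(scop, scop_idl)
```

Both routes give a saving of 20% for these numbers, and the load cancels out, which is why Equation (A8) needs only the two SCOP values.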

#### **Appendix D. Load Clustering**

A K-means clustering method is used to generate typical load profiles (TLPs) in this study. Figure A1a shows the general procedure of K-means clustering with a Piecewise Aggregate Approximation (PAA) transformation. To cluster load profiles (time-series data), the distance between two profiles must be computed to measure their similarity. However, computing the distance on raw time-series data is difficult and slow, so an approximation is normally applied to reduce the computational difficulty [40]. Many representation techniques can be used to transform the raw time-series data, such as the Discrete Fourier Transformation [41], Discrete Wavelet Transformation [42], Singular Value Decomposition [43] and PAA [44]. PAA was used in this paper because of its simplicity and fast calculation speed [44].

In the PAA method, the original load profile is first divided into equal-length segments (Figure A1b). Then, the mean of each segment is used to approximate the original segment (Figure A1c). This approximation greatly reduces the data dimension while still capturing the fundamental characteristics of the original time-series data. After the PAA transformation, the Euclidean distance between two load profiles is calculated, and the standard K-means clustering algorithm is applied. The initialization of K-means is important because it affects the final result [45]. Different K values are tested to find a suitable one. A realistic testing range for K is between 2 and √n [46], where n is the number of data samples. The "furthest first initialization" is then used, which takes a random point as the first cluster center and adds further cluster centers that are furthest from the existing ones [47]. Based on the clustering result, one representative profile from each cluster is selected to form the TLPs.
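The procedure described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the profile length is assumed to be divisible by the number of segments, the four load profiles are made up, and the function names (`paa`, `furthest_first`, `kmeans`) are hypothetical. How K is chosen within the range 2 to √n is not specified in the text, so K is simply passed in.

```python
import math
import random


def paa(profile, n_segments):
    """Piecewise Aggregate Approximation: mean of each equal-length segment."""
    seg_len = len(profile) // n_segments
    if seg_len * n_segments != len(profile):
        raise ValueError("profile length must be divisible by n_segments")
    return [sum(profile[i * seg_len:(i + 1) * seg_len]) / seg_len
            for i in range(n_segments)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def furthest_first(points, k, rng):
    """Random first center, then repeatedly add the point furthest from all centers."""
    centers = [points[rng.randrange(len(points))]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(euclidean(p, c) for c in centers))
        centers.append(nxt)
    return centers


def kmeans(points, k, n_iter=100, seed=0):
    """Standard K-means (Lloyd iterations) with furthest-first initialization."""
    rng = random.Random(seed)
    centers = furthest_first(points, k, rng)
    labels = None
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda j: euclidean(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Four hypothetical 24-hour load profiles (kW): two low-load and two high-load days.
low = [100.0] * 8 + [300.0] * 8 + [150.0] * 8
high = [200.0] * 8 + [900.0] * 8 + [400.0] * 8
profiles = [low, [v + 10.0 for v in low], high, [v - 20.0 for v in high]]

reduced = [paa(p, 6) for p in profiles]   # 24 points reduced to 6 segment means
labels, centers = kmeans(reduced, k=2)
```

In this toy example the two low-load days end up in one cluster and the two high-load days in the other, whichever profile is drawn as the first center.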

The measured cooling load data of the case building in 2013 were used. In total, 214 daily load profiles from spring to autumn were used, which cover the typical cooling seasons of sub-tropical regions such as Hong Kong. The clustered loads are shown in Figure A2a, where similar load profiles are grouped together as a load cluster. In each cluster, the load profile closest to the cluster centroid was selected to constitute the TLPs (see Figure A2b). These TLPs and their associated weather data (from the Hong Kong Observatory) were used as the simulation inputs.
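Selecting the profile closest to each cluster centroid can be sketched as below. The function name `select_typical_profiles` and the toy data are hypothetical; the inputs are assumed to be the profiles (for example after the PAA transformation), their cluster labels and the cluster centroids.

```python
import math


def select_typical_profiles(profiles, labels, centroids):
    """Return {cluster label: index of the profile closest to that cluster's centroid}."""
    best = {}  # cluster label -> (distance, profile index)
    for idx, (profile, lab) in enumerate(zip(profiles, labels)):
        d = math.dist(profile, centroids[lab])  # Euclidean distance (Python 3.8+)
        if lab not in best or d < best[lab][0]:
            best[lab] = (d, idx)
    return {lab: idx for lab, (_, idx) in best.items()}


# Toy two-dimensional "profiles" in two clusters, for illustration only.
toy_profiles = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0], [12.0, 12.0]]
toy_labels = [0, 0, 1, 1]
toy_centroids = [[0.4, 0.4], [11.5, 11.5]]

typical = select_typical_profiles(toy_profiles, toy_labels, toy_centroids)
```

Because an actual measured day is returned rather than the centroid itself, each TLP can be paired with the weather data recorded on that day.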

**Figure A1.** (**a**) Process of generating typical load profiles (TLPs); (**b**) two original load profiles; (**c**) two load profiles after PAA transformation.

**Figure A2.** (**a**) Clustered load profiles; (**b**) Selected typical load profiles.


**Table A3.** Load and weather data of ten TLPs.

(#: 'C1' denotes load cluster 1.)
