**4. Results and Discussion**

Stochastic simulations were performed using EnergyPlus. Figure 9 shows the annual energy simulation results for residential buildings. Note that these energy consumption values come from the pure stochastic simulations defined in Section 3, which assume a uniform distribution over the 117 parametric design scenarios of each building type, without any additional weighting factor. In reality, there may be fewer people in the very low (left) and very high (right) energy consumption ranges. To get a more realistic energy consumption distribution, we collected housing price (for residential buildings) and rent (for office buildings) information for Wuhan and adjusted the energy distribution accordingly. Figure 10 shows the housing price distribution of 18,864 residential buildings in Wuhan.

**Figure 9.** Annual residential building electricity consumption distribution of the stochastic simulations.

**Figure 10.** Housing price distribution of residential buildings in Wuhan.

Similarly, the annual electricity consumption distributions for small and large office buildings are shown in Figure 11.

**Figure 11.** Annual office building electricity consumption distribution of the stochastic simulations (top: small office; bottom: large office).
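The adjustment of the uniformly distributed simulation results toward an observed price or rent distribution can be pictured as a reweighting step. The sketch below is only an illustration under stated assumptions, not the procedure used in this study: it assumes each simulated scenario has already been assigned to a price bin (the `price_bin` field is hypothetical) and that the observed share of buildings in each bin is known. All numbers are made up.

```python
from collections import defaultdict


def reweight_scenarios(scenarios, bin_shares):
    """Return one weight per scenario so that the total weight of each
    price bin equals its observed share (equal weights within a bin)."""
    counts = defaultdict(int)
    for s in scenarios:
        counts[s["price_bin"]] += 1
    weights = [bin_shares.get(s["price_bin"], 0.0) / counts[s["price_bin"]]
               for s in scenarios]
    total = sum(weights)
    if total == 0:
        raise ValueError("no scenario falls into a bin with a positive share")
    return [w / total for w in weights]


def weighted_mean(values, weights):
    """Mean of values under normalized weights."""
    return sum(v * w for v, w in zip(values, weights))


# Made-up scenarios: annual electricity use (kWh) and an assumed price bin.
scenarios = [
    {"price_bin": "low", "kwh": 2000.0},
    {"price_bin": "mid", "kwh": 3000.0},
    {"price_bin": "mid", "kwh": 3500.0},
    {"price_bin": "high", "kwh": 5000.0},
]
# Made-up observed shares of buildings per price bin.
bin_shares = {"low": 0.2, "mid": 0.6, "high": 0.2}

weights = reweight_scenarios(scenarios, bin_shares)
uniform_mean = sum(s["kwh"] for s in scenarios) / len(scenarios)
adjusted_mean = weighted_mean([s["kwh"] for s in scenarios], weights)
```

With these illustrative inputs, the mid-price scenarios carry more weight than the extremes, so the adjusted mean moves away from the uniform mean.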

Furthermore, a data-driven regression model was developed to predict building energy consumption. As suggested by local urban planners and energy policy makers, a building's price or rent and its vintage were chosen as the two key independent variables for the regression model.

Various machine learning algorithms were applied to the dataset to compare prediction accuracy. However, because the number of inputs was very limited, more complex algorithms showed little advantage. Linear regression models were therefore selected for their robustness and high prediction accuracy. Tables 2–4 show the regression functions for residential, small office, and large office buildings, respectively.
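As a rough illustration of a linear regression on price and vintage, the sketch below fits ordinary least squares in plain Python. It is a minimal sketch under assumptions, not the fitting code used in this study: vintage is encoded with indicator variables for 2000 and 1990 (with 2010 as the baseline), and the data are synthetic values generated from made-up coefficients.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations.
    X is a list of feature rows (no intercept column);
    returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for k in range(n):
            if k != col:
                f = A[k][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][n] for i in range(n)]


def encode(price, year):
    """Assumed encoding: price plus indicators for the 2000 and 1990 vintages."""
    return [price, 1.0 if year == 2000 else 0.0, 1.0 if year == 1990 else 0.0]


# Synthetic samples (price in arbitrary units, vintage year).
samples = [(1.0, 2010), (2.0, 2010), (1.5, 2000),
           (3.0, 2000), (2.5, 1990), (4.0, 1990)]
true_coef = [1.0, 0.5, 0.1, 0.2]  # made-up values used to generate y

X = [encode(p, yr) for p, yr in samples]
y = [true_coef[0] + sum(c * v for c, v in zip(true_coef[1:], row)) for row in X]

coef = fit_linear(X, y)  # recovers true_coef up to rounding error
```

Because the synthetic targets are exactly linear in the features, the fit reproduces the generating coefficients; real data would of course leave residual error.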


**Table 2.** Regression functions for residential buildings.



$$= \begin{cases} 0, & \text{if year} = 2010 \\ 0.0119, & \text{if year} = 2000 \\ 0.0228, & \text{if year} = 1990 \end{cases}$$


**Table 4.** Regression functions for large office buildings.

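$$F_{ele} = \begin{cases} 0, & \text{if year} = 2010 \\ 0.0204, & \text{if year} = 2000 \\ 0.0245, & \text{if year} = 1990 \end{cases}, \qquad F_{gas} = \begin{cases} 0, & \text{if year} = 2010 \\ 0.0544, & \text{if year} = 2000 \\ 0.0847, & \text{if year} = 1990 \end{cases}$$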
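The vintage terms in Table 4 are piecewise constants, so they can be represented as simple lookup tables. The following is a minimal sketch: the numerical values are taken from Table 4, while the function name `vintage_term` and the choice to reject other years are assumptions made for illustration. How these terms enter the full regression function is not reproduced here.

```python
# Vintage terms for large office buildings, values as listed in Table 4.
F_ELE = {2010: 0.0, 2000: 0.0204, 1990: 0.0245}
F_GAS = {2010: 0.0, 2000: 0.0544, 1990: 0.0847}


def vintage_term(table, year):
    """Return the vintage term for one of the three tabulated years.
    Other years are rejected (an assumption of this sketch) because
    Table 4 only defines values for 1990, 2000, and 2010."""
    if year not in table:
        raise ValueError(f"no vintage term tabulated for year {year}")
    return table[year]


f_ele_2000 = vintage_term(F_ELE, 2000)
f_gas_1990 = vintage_term(F_GAS, 1990)
```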

To better illustrate our methods and make them easy to use, we developed a building simulation platform based on JavaEE technologies [8,9,21,22]. Figure 12 shows the system architecture. The platform consists of two parts. The first part is the service consumer (application layer), which refers to end users or any third-party application. End users can access the service results directly by opening a given service endpoint URL in a browser, and the service can also be incorporated easily into external systems.

The second part is the service provider, which includes three main layers: the data layer, the core algorithm implementation layer, and the RESTful WebService layer. The data layer provides the data needed for the platform to work securely, such as building information, system data, and stochastic simulation data. Figure 13 shows how the simulation data are stored in the database. The E-R diagram shows that hourly building energy consumption can be simulated for the main building types (large office, small office, and residential) in different scenarios. The core algorithm layer implements the algorithms that simulate building energy consumption, mainly regression analysis and interpolation.

To support third-party applications, the platform was designed as a Service Oriented Architecture (SOA) based program [23]. Specifically, we chose the widely used RESTful WebService to wrap the core simulation APIs, so that anyone can use the platform by calling these standard WebServices [24]. For instance, users can call the API directly from a browser by entering the service endpoint, as shown in Figure 14. In addition, third-party applications written in any programming language can incorporate the APIs easily, because the APIs are built on the standard WebService.

**Figure 12.** System architecture.

**Figure 13.** E-R Diagram.

**Figure 14.** Output of a RESTful WebService.
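A third-party client for a RESTful endpoint of this kind could look like the sketch below. Everything specific in it is hypothetical and used only for illustration: the endpoint URL, the query parameter names (`buildingType`, `year`, `price`), and the JSON field `annualEui` are assumptions, not the platform's actual API. The sample response body is made up so that the example runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; the real service URL is not given here.
BASE_URL = "http://example.org/api/energy"


def build_request_url(building_type, year, price, base_url=BASE_URL):
    """Compose a GET URL with (assumed) query parameters."""
    query = urlencode({"buildingType": building_type,
                       "year": year,
                       "price": price})
    return f"{base_url}?{query}"


def parse_response(body):
    """Extract the (assumed) annual EUI field from a JSON response body."""
    data = json.loads(body)
    return float(data["annualEui"])


url = build_request_url("residential", 2000, 15000)

# Made-up response body standing in for what a service might return.
sample_body = '{"buildingType": "residential", "year": 2000, "annualEui": 42.5}'
annual_eui = parse_response(sample_body)
```

In a real client, the body would be fetched with a standard HTTP call such as `urllib.request.urlopen(url)` and then passed to `parse_response`.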

The building energy prediction models and the APIs created in this paper can be used to support further third-party urban energy application development. For example, Figure 15 shows an urban building energy prediction platform developed by one of our research partners. Monthly energy consumption values (as EUI) for different building types are color-coded and mapped to individual buildings in a GIS database. Dark red represents a high EUI, while light red represents a lower EUI. To support HVAC system design and equipment selection, high-fidelity hourly EUIs are also provided for typical design days. When the user clicks an individual building in the web-based platform, the annual EUI of the selected building is shown together with other building characteristics, including building height, floor area, year of construction, and housing price/rent. If the user moves the year bar at the bottom, the platform can also visualize past energy information and predict future scenarios. This urban-scale 3D platform is currently used by the local government. It provides spatial and temporal building energy assessment and visualization to support design decision making by city managers and urban planners.

**Figure 15.** An urban building energy prediction platform developed based on our API [25].
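The color coding described above (light red for lower EUI, dark red for higher EUI) can be approximated with a linear color ramp. This is a minimal sketch under assumptions, not the partner platform's implementation: the two endpoint colors, the linear scale, and the kWh/m² unit are illustrative choices.

```python
def eui(annual_kwh, floor_area_m2):
    """Energy use intensity: annual energy divided by floor area
    (kWh/m2 is an assumed unit for this sketch)."""
    if floor_area_m2 <= 0:
        raise ValueError("floor area must be positive")
    return annual_kwh / floor_area_m2


def eui_to_red(value, vmin, vmax):
    """Map an EUI value linearly onto a hex color between light red
    (low EUI) and dark red (high EUI); out-of-range values are clamped."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    light = (255, 204, 204)  # assumed light red
    dark = (139, 0, 0)       # assumed dark red
    rgb = [round(lo + t * (hi - lo)) for lo, hi in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


low_color = eui_to_red(0.0, 0.0, 100.0)
high_color = eui_to_red(100.0, 0.0, 100.0)
building_eui = eui(5000.0, 100.0)
```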

#### **5. Limitations and Future Work**

This paper demonstrates the development of residential and office building archetypes using a case study in Wuhan, China. Data-driven regression models were developed based on stochastic simulations. A web-based urban energy platform and an interface were created to support further third-party application development. Future work will address the following limitations.

A uniform distribution was assumed to generate different design variations for the stochastic simulations. The actual distribution was adjusted through post-processing to match the distribution of the survey data. In future work, we will apply Bayesian calibration to account for the probability distribution of key uncertain variables. Because the number of inputs for regression model development was limited, the advantages of more complex non-linear machine learning algorithms, such as support vector machines or gradient boosting, could not be demonstrated. In the next step, we will collaborate with our colleagues and partners to collect more input data and improve our models. In addition, the platform will be fully verified using real-world data from our partners. Furthermore, although it is usually straightforward to model the energy consumption of a single building using traditional physics-based energy simulation methods, these methods do not work well for modelling multiple buildings at the community or city level [26–28]. We are therefore exploring deep learning to discover the hidden and complex dynamics between multiple buildings, with the aim of making our model more accurate when simulating city-scale energy consumption.
