*Article* **Utilising Open Geospatial Data to Refine Weather Variables for Building Energy Performance Evaluation—Incident Solar Radiation and Wind-Driven Infiltration Modelling**

**Kristian Skeie \* and Arild Gustavsen**

Department of Architecture and Technology, Norwegian University of Science and Technology, Alfred Getz vei 3, 7491 Trondheim, Norway; arild.gustavsen@ntnu.no

**\*** Correspondence: kristian.skeie@ntnu.no

**Abstract:** In building thermal energy characterisation, the relevance of proper modelling of the effects caused by solar radiation, temperature and wind is seen as a critical factor. Open geospatial datasets are growing in diversity, easing access to meteorological data and other relevant information that can be used for building energy modelling. However, the application of geospatial techniques combining multiple open datasets is not yet common in the often scripted workflows of data-driven building thermal performance characterisation. We present a method for processing time-series from climate reanalysis and satellite-derived solar irradiance services, by implementing land-use, and elevation raster maps served in an elevation profile web-service. The article describes a methodology to: (1) adapt gridded weather data to four case-building sites in Europe; (2) calculate the incident solar radiation on the building facades; (3) estimate wind and temperature-dependent infiltration using a single-zone infiltration model and (4) including separating and evaluating the sheltering effect of buildings and trees in the vicinity, based on building footprints. Calculations of solar radiation, surface wind and air infiltration potential are done using validated models published in the scientific literature. We found that using scripting tools to automate geoprocessing tasks is widespread, and implementing such techniques in conjunction with an elevation profile web service made it possible to utilise information from open geospatial data surrounding a building site effectively. We expect that the modelling approach could be further improved, including diffuseshading methods and evaluating other wind shelter methods for urban settings.

**Keywords:** thermal building performance; satellite-based solar radiation data; meteorological reanalysis data; ISO 52016-1; single-zone infiltration

#### **1. Introduction**

Meteorological data like temperature, wind speed, and solar radiation are essential input for characterising buildings' thermal performance. Ideally, these elements are measured locally using a well-maintained weather station near the building site or on the building itself. The increasing availability of high-resolution geospatial data, gridded weather data and adequate modelling techniques (including web-services) can provide an alternative approach to estimating local climatic building boundary conditions in the built environment [1,2]. Using assimilated data-sources has several advantages, e.g., making it possible to supplement low-cost air temperature observations, that are relatively common to measure on-site, with other weather variables that are more difficult to capture or predict in a simple way. Such as solar irradiance data from services built on remote sensing of sky conditions, or wind speed estimations from numerical weather prediction (NWP)-models in forecast or reanalysis-mode. Reanalysis is a method to reconstruct the past weather by combining modelling of the atmospheric dynamics and physics of the earth climate systems with historical observations. Daily updated information about the past weather and historical climate is available via Copernicus Climate Change Service (C3S) Climate Data

**Citation:** Skeie, K.; Gustavsen, A. Utilising Open Geospatial Data to Refine Weather Variables for Building Energy Performance Evaluation—Incident Solar Radiation and Wind-Driven Infiltration Modelling. *Energies* **2021**, *14*, 802. https://doi.org/10.3390/en14040802

Academic Editor: Benedetto Nastasi Received: 16 December 2020 Accepted: 22 January 2021 Published: 3 February 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Store (CDS) such as the ECMWF (European Centre for Medium-Range Weather Forecasts) fifth-generation global reanalysis (ERA5) and the soon to be released Copernicus Regional Reanalysis for Europe (CERRA). Advancements in temporal and spatial resolutions and dedicated land surface analysis, like ERA5-Land [3], may extend the popularity and application to many fields [4]. Some meteorology institutions are also developing hourly surface analysis products on a regional level, combining their operational mesoscale models for weather forecasting with observations and making the assimilated products available as gridded datasets free-of-charge [5]. However, with the use of any gridded weather data product for building energy performance evaluation, comes a need to adjust the data to fit local building boundary conditions [1,2].

Plenty of methods exist to downscale or bias-correct gridded weather data and to include local effects from the terrain, vegetation and buildings in solar and wind assessments. Downscaling techniques reaching the meso and micro-scale span from simple analytical or statistical methods to running high-resolution NWP-models informed by global reanalysis [6], or even computational fluid dynamics (CFD) codes uncoupled or coupled with atmospheric models [7]. When it comes to including local sheltering effects, tools vary significantly in the overall approach and temporal and spatial resolution. With increasing interest in local renewable energy generation and urban scale modelling, numerous research efforts have been initiated to develop and refine tools and methods to support wind energy analysis, façade and rooftop solar potential assessments, urban building energy modelling and urban micro-climate studies [8–10]. The diversity of methods and tools logically reflect the wide variety of use cases, but also evident are the multiple ways to represent surfaces and other features in geospatial datasets (e.g., terrain, tree canopies, roofs and facades). Workflows integrated with graphical information system (GIS)-tools operating on two-dimensional raster maps that supply the surface elevation are prevalent [10]. Modern toolkits offer to automate geoprocessing tasks through Python scripts and web-mapping services [11,12]. To encompass height information, e.g., point-clouds, into full 3D-processing, often requires time-consuming manual work and expertise [13,14]. Although more and more mapping agencies and local authorities are releasing point cloud data or 3D-building models according to Open Geospatial Consortium (OGC) standards, these are mostly limited to city-scale, province, or municipality levels. 3D-building models and point clouds also risk being outdated if not updated at frequent intervals.

In the following, we use high-resolution height data from airborne laser scanning which is becoming widely available in the form of pre-processed digital surface and terrain models covering large land surface areas on a regional or national scale [15]. Despite that these datasets may suffer from the same problems as above, they may easily be served in a web-service and supplemented by building footprints with user-specified heights in the nearby area of interest. Footprints are broadly available as open data from authorities or volunteered geoinformation and usually produced at more frequent intervals [13].

The application of geospatial techniques combining multiple open datasets is not common in the often-scripted workflows of data-driven building thermal performance characterisation.

This work aims to investigate ways to adapt site-specific climate data for building thermal energy analysis, by identifying suitable open geospatial datasets that can be served in a web-service and demonstrate a scripted workflow that can be implemented to calculate solar and wind effects on buildings facades. We present a method for processing time-series from climate reanalysis and satellite-derived solar irradiance services, by implementing land-use, and elevation raster maps served in an elevation profile web-service. Building footprints from OpenStreetMap Overpass API complement the analysis by separating buildings and trees in the vicinity.

#### *1.1. Outline*

This paper first addresses how much we can adapt gridded weather data to local building boundary conditions using only building location (latitude, longitude) and a

selection of free/open geospatial datasets covering Europe. A comparison between observations and the ERA5 reanalysis and CAMS-Rad satellite service is shown for a low-rise building localised in a relatively open landscape in the south of Germany. Including a surface wind downscaling method using weighted surface roughness derived from land-use maps. Next, we investigate what detailed geospatial data was found for our three other residential buildings in Norway, UK and Belgium. These are located in more urban settings, where shading and wind sheltering of nearby obstructions have a more significant impact. The four case studies are used to evaluate the proposed methods to assess solar radiation distribution on the facades and wind and temperature-dependent infiltration; Two factors influenced by nearby topography and obstructions that are not always considered physically in data-driven building energy performance evaluation. We focus on the influence of sheltering of nearby obstacles, buildings, and trees in either case, using the methods:


#### *1.2. Identifying Suitable Open Geospatial Datasets and Previous Works*

Open geospatial datasets are growing in diversity, from crowdsourcing efforts to data produced by authorities and scientific collaborations [18]. Developments around open spatial infrastructures ease access to meteorological data and other relevant spatial information. Open data and modelling are valuable, as building monitoring data is often limited. Buildings are complex systems because their energy use and indoor conditions vary dynamically under the influence of weather, occupancy and component performance [19]. The drive towards a more sustainable built environment and low carbon transition of the energy system give rise to challenges that can only be met by multi-disciplinary knowledge [20,21]. Interaction of open data and models may become fundamental for monitoring, verifying and tracking performance at multiple levels [21].

Many geospatial datasets are available via the EU Copernicus Earth observation programme which has operated a policy of open data since its inception. INSPIRE (Infrastructure for Spatial Information in Europe) is a related EU initiative that aims to ease access to public data through standardisation of spatial data among member states. Two of Copernicus regional products, the Digital Elevation Model (DEM) over Europe (EU-DEM), and the CORINE land cover (CLC) maps are relevant examples of pan-European cooperation [22,23]. CLC maps have been used to derive surface roughness classes in numerous wind resource studies [24,25]. The EU-DEM is a hybrid product based on the larger SRTM and ASTER GDEM datasets produced by NASA Earthdata and Japan Space Systems [26], two of many global and freely available DEM's. These are satellite sensor-based models representing the first-return earth surface (including trees, buildings) at a relatively coarse resolution (from ca. 30 m) and accuracy (ca. 5 m to several hundred) [27]. Still, they have been used in solar resource map creation and operating a service that returns a site's horizon profile, available from JRC's PVGIS website [28].

Lately, more and more light detection and ranging (LiDAR) data obtained from state-funded airborne laser scanning (ALS) are published around the world under free licenses. Pre-built digital elevation models are distributed in high resolution as digital surface models (DSMs) and digital terrain models (DTMs) covering entire regions and countries. In Europe, the INSPIRE Geoportal keeps track of downloadable elevation data [15]. For example, in Norway, both post-processed LiDAR point cloud data and digital surface model data are made openly available by the Norwegian Mapping Authority under CC-BY licensing [29]. Pre-built DSM and DTM-tiles in 1-m resolution covering the whole country can be downloaded amounting to a combined download of ca. 2.5 TB in GeoTIFF format. This national model is updated sector-wise when new surveys are produced. For the other countries in the study, DSM's and DTM's in 1 m resolution published under

open government licenses were available for the UK, and the region of Flanders, Belgium. In parts of Germany, datasets are still proprietary and come at a cost [30].

Many researchers have shown that high-resolution LiDAR data in its point-cloud form is enabling determination of building geometry and shading from the surrounding environment. Nonetheless, composed DSMs (of 1 m resolution) have proven up to the task to capture the slope and aspect of basic roof shapes (without variation in the architecture of the roof) required to estimate solar potential on rooftops [31,32]. At least two different 2.5D solar models are previously published detecting vertical façades from 1 m DSM pixels and either estimating wall irradiances under clear-sky [33,34] or based on observations of global horizontal radiation [35]. Another feature of the second model is the inclusion of vegetation which is found to be crucial when modelling irradiance on walls in an urban setting, especially where building heights are relatively low [35].

A model resolution of 1 m is interesting, as it allows handling large areas, but still maintain a file-size that is easy to store and work with by splitting data in raster tiles. Another reason to consider digital surface and terrain models over higher-resolution Li-DAR point cloud data is that considerable work and expertise [36], has gone into creating the DSM's and DTM's to meet the requirements of the commissioner. Airborne imagery, stereography or orthophotos are also often overlaid in the creation. Likewise, imagery integration is recommended for creating 3D building geometry from LiDAR point clouds to capture objects more accurately [18–20], underlining that expertise in data fusion, processing, and acquisition is needed. Other recent developments show that micro-drones [37] or mobile ground-based laser surveying can be used for detailed building shape and façade mapping [38], indicating new workflows and applications to the building industry.

Other works make use of building ground plans or roof perimeters obtained through user-contributed data such as OpenStreetMap or derived from administrative databases to extrude lower detail building models from the ground and up, or DSM's with or without terrain and surface model data available [10]. Some cities and local authorities are openly distributing 3D city-scale models at higher detail level or building cadastral data at greater accuracy then what can be expected by crowdsourced content [39,40].

However, all of the datasets above suffers from the same challenges: acquisition and that the most recent dataset available may be outdated. Evaluating design or asbuilt building energy performance as part of commissioning requires recent height-data. There are also other applications than energy evaluation where better assessments of local climatic conditions can positively influence the building design process or operation-phase. Building information and geospatial assessment techniques can be used to reduce climateinduced damages on buildings, improve user quality, and improve the balance between climatic adaptation demands and other demands [41]. When it comes to design-studies, the buildings under consideration may not even be built yet. Therefore, the proposed approach will need to be easily updated when new DSM/DTM's are available for an area. It will also need to have multiple ways to input building footprint and height information and separate between ground, buildings and other tall obstacles like trees to overcome the identified gaps.

#### **2. Methods**

This section gives a brief overview of the methods, the code design and implementation. Calculation details and input data are provided in the Appendices A and B. A public Github repository will be published with the full code when the paper is published. Existing packages and scripts used in the workflow include:


• The code implemented in the workflow relies on many additional popular Python and R packages such as rcpp, ncdf4, gdal, pyProj, shapely and netCDF4.

Table 1 provides an overview of the different datasets, spatialisation techniques and analytical models used in this study to process weather data sourced from the climate reanalysis and satellite irradiance service in order to evaluate local solar and wind effects on buildings facades.



The workflow needed to obtain the local data has largely been automated (Figure 1). First, the area of the local study is defined by the geographic position. LiDAR DSM and DTM raster maps in local projections were downloaded from national mapping services and stored on the server. New rasters containing only DSM data within building footprints were created from Open Street Map (OSM) building layers by calling the Overpass Application Programming Interfaces (API) and an open map layer from the UK Ordnance survey (OSM building outlines were not available for the area of interest in the UK at the time of study (mid-2020)). The local area LiDAR raster datasets were stored as .tif and the building footprints as .shp files. The rasters were served together with the EU-DEM and CLC land cover maps via the elevation API configured to return JSON strings of height (or land type classes) along paths resolved to sets of latitude and longitude points. The service "Open Topo Data REST API" relies on Python's gdal and pyProj packages for conversion between latitude, longitude, and local map projections in meter [44]. Scripting was adjusted to handle specifying latitude and longitude in decimal degrees with six decimal places precision (translating to 0.11 m at the equator) and to include datum shifts, ensuring more accurate conversions between latitude, longitude and local map projections (in meter).

**Figure 1.** Proposed workflow to assess local sheltering using the wind shadow method and the solar shading model in Section 2.4.

Before running the path profiling scrips, we define the footprint of the case building from the centre point in decimal degrees (lat, lon) by the length of the building envelope in the x-direction, length in the y-direction (in meter) and building rotation (in degrees). We specify the intermediate distance of mapping points for each façade and calculate their respective projection lines in all directions. Figure 2 shows the building and façade centre viewpoints used for the path profiling calculations (in green). In this way, it is irrelevant if the building is represented on the surface raster and building footprint shape layers or not.

**Figure 2.** The building façade geometry for solar shading and wind shelter calculations was created from three inputs: length in the x-direction, length in the y-direction (in meter) and building rotation. The latitude and longitude were selected to be the centre-point that the envelope is rotated around.

#### *2.1. Environmental Variables*

2.1.1. Environmental Variables Derived from Reanalysis Data

Meteorological data (Table 1) were derived from the ERA5 atmospheric reanalysis of the global climate released by the European Centre for Medium-Range Weather Forecasts (ECMWF) as part of the Copernicus Climate Change Service (CDS). A comprehensive description of ERA5 is provided by Hersbach et al. [49]. ERA5 provides validated estimates for each hour of the day, worldwide, with a two to three months delay and leading back to 1979, or as recent as up to a couple of days ago through the preliminary dataset ERA5T.

Table 2 also shows selected variables from "ERA5-Land" another current dataset produced by ECMWF with around four times finer spatial resolution (~9 km grid spacing compared to the ~31 km grid of ERA5). It is a simulation of the land surface components of ERA5 forced by ERA5 ′ s lower atmospheric fields, currently without coupling or additional data assimilation, meaning that observations only influence the simulation indirectly through the forcing [50].

**Table 2.** Climate variables acquired from ERA5 and ERA5-land reanalysis and the model transformation. The name of each variable used in the comparison study is shown on the same line as the sourced variables according to their short names in the Copernicus CDS.


\* fal is a diagnostic broadband albedo, whereas the true ground value is calculated by: αgr = 1—ssr/ssrd [51]. To account for increased reflectance when the ground is covered by snow, the ERA5-Land surface model parameters: snow cover and snow albedo were included: αgr;sn = snowc·asn + (1 − snowc)· αgr .

We used nearest-neighbour and bilinear interpolation to create hourly time-series for each site-location. Figure 3 shows the four case studies' location relative to the ERA5 grid points and each grid tile's relative weighting. The ERA5-land data were also interpolated to the site (not shown).

**Figure 3.** Locations of cases (**a**) Holzkirchen DE, (**b**) Trondheim NO, (**c**) Gainsborough UK, and (**d**) Brussels BE, and nearest ERA5 grid cells (red lines) overlaid over local area maps. The percentages illustrate the resulting bilinear weighting of the adjacent cells. Map data from OpenStreetMap.

#### 2.1.2. Environmental Variables Derived from Remote Sensing

For solar radiation, it is possible to use surface downwelling radiation from the reanalysis, but several studies show that better products exist to account for clouds' variability. Services that combine solar models with remote sensing techniques provide greater temporal and spatial resolution. An overestimation of solar radiation is often observed in reanalysis, and an underestimation is observed in satellite methods [52].

Solar irradiance data were acquired from Copernicus Atmosphere Monitoring Service: CAMS Radiation Service (CAMS-Rad) version 3.2 [53]. CAMS-Rad's satellite-based solar irradiance data are available at a spatial resolution of ~5 km over central Europe in 15-min time steps from 2004 until the present time (up to two days ago) and covers the field of view of the Meteosat satellite (Europe, Africa and the Middle East). An account of the radiative transfer scheme in CAMS-Rad is provided by Qu et al. [54]. We used the camsRad R-package [43] to obtain 1-min time-series of direct normal, global horizontal and diffuse horizontal irradiance for clear-sky and cloudy conditions. The dataset was then resampled to 10-min intervals for the shading calculations.

#### *2.2. Downscaling*

#### 2.2.1. Local Wind Speed Estimation

The ERA5 reanalysis is already being applied in wind energy assessment, showing improvements over previously released global and regional reanalysis datasets [55–57]. In several recent studies, dynamical downscaling of ERA5 data using the high-resolution Weather Research and Forecasting (WRF) model has demonstrated an added value of introducing a higher spatial resolution [6,57,58]. The near-surface wind speeds in the reanalysis are advised not to be used directly to indicate surface wind conditions at a site [56], as the relatively low spatial resolution lacks the local representativity of the site surroundings. This recommendation is reflected by a word of caution in the APIdocumentation:

*"Care should be taken when comparing this variable with observations because wind observations vary on small space and time scales and are affected by the local terrain, vegetation and buildings that are represented only on average in the ECMWF Integrated Forecasting System."*

In fact, the 10-m near-surface winds in ERA5 are parametrised as the potential wind in open terrain [59], which can differ substantially from the model representation over the whole grid cell [60]. Overland, exposure correction in ERA5 is done taking the lowest model level winds (at a height above the surface that is less influenced by underlining terrain) and applying vertical interpolation to 10 m through a logarithmic wind profile

including stability indices (Monin-Obhukov theory). An open-terrain surface roughness (0.03 m) is used in the transformation [60].

In an attempt to better represent surface conditions, a simple analytical downscaling method is used. First presented by Wieringa [59], the "2L-method" consists of a two-layer model of the atmospheric boundary level to account for the difference in surface roughness from one location to another. It is to be used in combination with roughness lengths obtained through anemometric analysis or a surface roughness map. We used parts of the code as printed in [47] and a surface roughness conversion table (see Appendix C) for our implementation. Other more sophisticated models reported in the literature are included in the widely used commercial software WAsP, WindSim, and windPRO. There are similarities between the procedures implemented in WAsP, the 2L-method, and another simple model by De Rooy and Kok [61–63]. The latter was recently used to create a 1 km gridded dataset for Germany of hourly surface variables using station records and a regional climate reanalysis model [4]. Other authors have successfully combined WAsP and statistical approaches [6,64]. These simple downscaling methods are not claimed to represent the full complexity of the boundary layer. The 2L-method has mainly been used to create wind resource maps and determining extreme open-water winds [47,48,62,65]. It was first developed as an interpolation method for surface wind measurements [59] and has later been developed by Verkaik [66,67] and by Wever and Growen [68]. Evaluations over land have revealed mixed results, outlining that the roughness lengths significantly impact model performance and that the use of uniform (non-directional) roughness values leads to large errors [47,65–67].

We follow the approach of Verkaik [66,67] using a high-resolution land-use map with derived surface roughness values together with a simplified footprint model to downscale model wind. In the lower level, the surface wind is transformed into the so-called blending height, where local disturbances have been blended out (Figure 4). Next, the wind speed in the upper layer is determined by using the NWP-model grid cell roughness value and geostrophic resistance laws [59]. At the height of the boundary layer, the wind speed is interpolated between model grid points to site. The wind speed at the blending height is calculated using regional surface roughness length and transformed back to surface height (10-m) using local roughness length in the given wind direction.

**Figure 4.** The 2-layer downscaling model concept applied to the reanalysis model surface wind.

If the surface roughness used in the upwards and downwards transformation in the upper layer are identical (the model grid cell roughness and the regional roughness is set to the same value), the 2-layer model reduces to a neutral logarithmic wind profile conversion via blending height [59]. In the result section, we refer to this common conversion as the 1L-method:

$$\mathrm{U}\_{\mathrm{loc}} = \mathrm{U}\_{\mathrm{10m}} \frac{\ln\left(\frac{\mathrm{Z\_{\mathrm{bh}}}}{\mathrm{Z\_{\mathrm{0};\mathrm{WMO}}}}\right) \ln\left(\frac{\mathrm{Z\_{\mathrm{loc}}}}{\mathrm{Z\_{\mathrm{0};\mathrm{loc}}}}\right)}{\ln\left(\frac{10}{\mathrm{Z\_{\mathrm{0};\mathrm{WMO}}}}\right) \ln\left(\frac{\mathrm{Z\_{\mathrm{bb}}}}{\mathrm{Z\_{\mathrm{0};\mathrm{loc}}}}\right)}, \ \mathrm{z\_{\mathrm{0};\mathrm{WMO}}}=0.03 \text{ m} \tag{1}$$

where zloc is the local height to use in the conversion (10 m or roof height), zbh is the blending height, *U*10m is the wind speed from reanalysis, z0;loc is the local roughness length at the target location and *z*0;WMO is the open-terrain surface roughness (Equation (1)).

For the land cover classification, we use the CORINE land use (CLC) map covering the entire Europe. The map was stored in the elevation path profile web-service. The returned land cover classes along paths (with an intermediate spacing of 100 m) were used to assign their relative roughness using tables from literature (see Appendix C). The details of the regional and local roughness length calculation and the footprint model is given in the Appendix C.

#### 2.2.2. Transforming Solar Irradiance Data Using a Satellite DEM

In CAMS Radiation Service, the irradiance calculations are done under the assumption of a flat terrain within the satellite pixel, without considering the diffuse parts masked or reflected by surrounding slopes [53]. In this study, no attempts are made to calculate the diffuse part reflected by the surrounding slopes, but it could be possible to use the satellite-derived DEM to calculate reflections. For the direct irradiance, to account for shading from local hills or mountains, we combined the matrices with the terrain shading angle from the JRC's PVGIS API, selecting the two's maximum shading angle in each sector. The PVGIS horizon angles were interpolated to match sectors of 2.5◦ , from the native 7.5◦ sectors corresponding to half-hour intervals. A distance of 10 km was assumed. Except for the PVGIS horizon profile API, no existing services were found to consider local hill-shading effects. Still, many web-based height map services provide tools to create elevation profiles manually along user-defined paths. We also want to test to what extent services that calculate terrain shading angle from satellite-derived DEM's can supplement or substitute higher-resolution DSM's/DTM's covering the nearby building vicinity.

#### *2.3. Transformations to Local Building Boundary Conditions Using Detailed Surface Models* 2.3.1. Wind Shadow Sheltering on Facades by Nearby Upwind Obstacles

The wind shadow model to calculate wind sheltering effects on building air-infiltration is implemented into a Python script in the following work. The main concept of this simple empirical model first presented in Walker, Wilson and Forests 1996 paper [17], is a wind shadow projected downstream by upwind obstacles to determine the effect of wake velocity on the building surfaces. It applies a Gaussian-shaped weighting reduction, projected and weighted on the facades, that extends beyond the width of the obstacle in the far wake region. In the following, we apply the calculated directional sheltering factor to scale the wind speed in the infiltration model as intended in the original paper, not trying to solve the full wind profile with urban canyon effects, localised flow accelerations, vertical spread and other effects, which would be possible with CFD-simulation or with the three-dimensional diagnostic urban wind models described in the Appendix B that share the empirical parameterisation for far wakes [69,70]. In complexity level, the analytical model implemented here is more similar to the extensive work focusing on deriving wind conditions in urban environments using morphometric approaches [71].

The method calculates an effective mean wind speed *U*λ based on the unobstructed wind speed *U*, multiplied by the shelter factor *λ<sup>w</sup>* which takes a value between 1 (no shelter) to 0 (complete shelter), or 1 to 0.3 in a physical setting where large buildings are immediately adjacent [16]:

$$\mathcal{U}\_{\lambda} = \mathcal{U} \cdot \lambda\_w(\theta) \tag{2}$$

where the shelter factor *λw*(*θ*) is expressed as a function of wind direction angle *θ*.

The authors of the wind shadowing method make it clear that the coefficients to find *λ<sup>w</sup>* and *U<sup>λ</sup>* (Equation (2)), were not based on measured wake velocities, but on measured sheltered and unsheltered façade surface pressures. See Appendix B for further discussion.

The other unique feature of the model is a flapped notch wake used to simulate the effect of wind direction fluctuations on near wake spread and growth. The notch wake indicated by notch centreline velocity in Figure 5, is flapped over a range of wind angles assuming a Gaussian distribution of wind direction about the mean angle *θ*:

$$\lambda\_{\boldsymbol{w}}(\boldsymbol{\theta}) = \sum\_{j=1}^{61} f\left(\phi\_{j'}\theta\_{\prime}\sigma\_{\theta}\right) \cdot \lambda\_{\boldsymbol{w}}\left(\phi\_{j}\right) \tag{3}$$

where the standard deviation of the wind distribution *σ<sup>θ</sup>* is estimated based on a function that changes with averaging time, e.g., 10 ◦ for a time-step of 1 h [17], which translates into a range *φ<sup>j</sup>* of +/− 30 ◦ for each side of the mean wind angle *θ* (Equation (3)).

**Figure 5.** The concept of a Gaussian-shaped wake extending beyond a (**a**) wide obstacle of 8 × 8 × 10 m (**b**) narrow obstacle of 2 × 2 × 10 m.

Figure 5 illustrates how narrow and wide objects differ in notch wake velocity deficit and how fast the wake assumes a Gaussian profile behind the obstacle. A standard deviation of *σ<sup>θ</sup>* = 10 ◦ was used for the calculations (Equation (3)).

As described in Appendix B, to determine the scaling length, the obstacles' aspect ratio is considered (calculating a characteristic dimension in the wind direction), but not including the roof pitch or edges of the obstacle relative to the wind direction, or the geometrical relationship between the two objects in consideration. These and other simplifications of the three-dimensional flows are discussed in the paper [17] and wind-tunnel experiments that evaluate the model [72]. Other studies show that although the assumption of wake symmetry may be a reasonable approximation for simple cubes oriented normal to the wind, for more complex geometries the functional form of the velocity deficit in the far wake is neither symmetric nor Gaussian [73,74]. For situations where the obstacle has protruding edges in the wind direction, standing vortices are formed that may lead to velocity deficits that differ significantly from what is predicted by a simple wake model [74].

In our implementation, the effective distance between the obstacle and the facade is calculated differently for the wind shadow coming into the facade, fully immersed, and out of the facade. When the projection line intersects with the façade, the distance is calculated from the intersection point to the obstacle's edge. For the particular case, when the wind direction is perpendicular to the façade, the mid façade point is used in the distance calculation, which is the situation described in the paper.

For each façade, *j*, the effective shelter *λw*;*<sup>j</sup>* is found by a weighting of sheltered and unsheltered portions of the wall:

$$
\lambda\_w = 1 - (1 - \lambda\_{w\infty}) \left(\frac{\mathcal{L}\_\mathbf{s}}{\mathcal{L}\_\mathbf{w}}\right) \tag{4}
$$

where *λw*;c;*<sup>j</sup>* is the shelter factor on the wake centerline, Lw;*<sup>j</sup>* is the length of the façade, and Ls;*<sup>j</sup>* is the sheltered façade length (Equation (4)). The sheltered façade length and distance between the obstacle and the façade (the wake distance) will differ for each wall and wind angle. Every façade and obstacle are considered independently, and upwind walls do not shelter downwind building walls on the same building, as these effects are accounted for in the pressure coefficients [17].

The footprint of the building under consideration was defined as described in Figure 2. The nearby building footprints (including calculated mean building height information)

were imported from building shape layers and analysed with Python's shapely library. Trees and vegetation can be included as points (narrow obstacles), but this was not tested as there were no tall trees located within 2–3 building heights in the prevailing wind direction of either case building. Finally, we evaluate the maximum sheltering factor on each façade for every wind direction resulting in a combined directional sheltering factor for the building.

#### 2.3.2. Approach to Calculate Sheltering from Surface Digital Elevation Models (DEM)

For the shading study, we created path profiling scripts using the gdal library for Python and later found Python modules of [45] and shifted to use these modules together with the elevation API. Calling the API repeatedly, one can use a façade sub-division, yet we restricted the evaluation to a single point per façade. For each case building, the digital elevation maps hosted in the elevation API covered a radius of at least 100 m around the building. Focusing on obstacles in the close surroundings is in agreement with Lingfors [31] who found that for roof surfaces a radius of 50 m is satisfactory with little impact on annual direct irradiance beyond 75 m. Further procedure:


We start evaluating the horizon angle a few meters away from the façade to avoid heightmap slope artefacts or exclude trees or other nearby obstructions that are not shielding the entire façade. For the lower resolution DEMs, starting the evaluation, e.g., 5 m away also help to reduce how precisely the façade lines need to be defined relative to the underlying surface DEMs. Following the methodology laid out in the EN ISO 52016-1:2017 standard, it is suggested that building self-shading or "side-fins" are assessed separately.

#### 2.3.3. Solar Shading by Nearby Objects

We follow the procedure of EN ISO 52016-1:2017 for calculation of solar shading reduction factors of nearby (and distant) objects. The standard differentiates between shading reduction factor by objects and on or close by the building itself like overhangs, sides and rebates. In the present work, we did not consider building self-shading. The standard gives two methods to assess shading of diffuse radiation; either disregard shading of the diffuse or use a sector-based evaluation applied to the Perez transposition model. In this anisotropic model, the diffuse part is separated into sky diffuse, circumsolar, horizon band and ground reflected irradiance (EN ISO 52010:2017 calculation method). The technical report accompanying the standard also outlines how some sky patch dome models are

compatible with the Perez model. An example of implementing a sky patch model with Perez surface transposition and LiDAR data for shading analysis is found in [35]. As objects may not only block solar irradiance on a surface, but may also reflect solar radiation (e.g., hills, trees, other buildings, or other parts of the same building), the three-dimensional problem quickly calls for more advanced methods, e.g., ray-tracing supported by GPU acceleration. For simplicity, we follow the first approach in the standard, which is equivalent to a situation where the radiation reflected or transmitted by objects in the environment is equal to the diffuse radiation blocked by these object.

The direct radiation (including circumsolar) is fully or partially blocked by a factor *F*sh if the object is between the sun and surface (Figure 6).

**Figure 6.** Shading of direct and circumsolar beam irradiance and the vertical shading factor.

#### *2.4. Including Environmental Variables and Local Sheltering Effects*

In the following section, we discuss models that can be used pre-process weather variables in order to capture the thermal tie between indoor temperature, and boundary conditions, in this case, incident solar insolation and temperature- and wind-dependent air-leakages across the building envelope which result in infiltration losses.

#### 2.4.1. Infiltration Losses

Infiltration is the uncontrolled air leakage through cracks and other unintentional openings in the building envelope introducing outdoor air into a building. Many infiltration models for residential buildings have been developed based on statistical fits of infiltration data. By considering that weather is the dominant driving force, infiltration flow can be assumed to be linearly dependent on the outside-inside temperature difference and wind-speed [75]. However, the simplicity of regression is not without limitations. The fitted coefficients carry little physical meaning, and the collinearity between heat transmission across the building envelope and infiltration losses driven by the indoor-outdoor temperature difference may lead to identifiability issues.

An empirical single-zone infiltration model accounts for infiltration, relying on physical parameters and building information, the AIM-2 model. We use the model form presented by Lundström in [58], where the calculated potential specific infiltration flow rate *Q*∗inf [Pa n ] multiples with the infiltration coefficient Cinf [l/(s Pa <sup>n</sup>m<sup>2</sup> )] which can be estimated by inverse approaches or obtained from fan pressurisations tests. A literature review of studies using the model and the differences in our implementation to [58] are presented in Appendix B. The single-zone infiltration models' performance is mainly sensitive to the highly uncertain distribution of air leakages across the envelope and the parameters used for converting wind data measured at a weather station to the building site and local wind shelter effects from typography and nearby buildings. This uncertainty can be reduced by wind measurements on-site. However, shelter coefficient may still apply as a simplified approach to account for direct wind shielding caused by trees and neighbouring buildings located within 2–3 building heights of the building facades [16,17]. To estimate local wind velocity Uloc , AIM-2 uses unobstructed wind speed transformed to building eave height at the building site. A power law wind profile conversion was

used in the original AIM-2 model, whereas a logarithmic wind profile conversion can be found in the implementation in the popular building energy simulation software ESP-r [76]. Furthermore, the wind shelter coefficient *λ*w of 0–1 is multiplied by the wind speed at roof height. Both the authors of AIM-2 and the LBL infiltration method recommends making this shelter effect directional based on wind direction [16,75].

An engineering approach to make the wind sheltering directional is proposed by Walker and Wilson [16] and can be found reprinted in the ASHRAE Handbook of design guidelines [77]:

$$\lambda\_{\rm w}(\theta) = 0.5 \cdot \left( (\lambda\_{\rm w;1} + \lambda\_{\rm w;3}) \cdot \cos^2 \theta + (\lambda\_{\rm w;1} - \lambda\_{\rm w;3}) \cdot \cos \theta + (\lambda\_{\rm w;2} + \lambda\_{\rm w;4}) \cdot \sin^2 \theta + (\lambda\_{\rm w;2} - \lambda\_{\rm w;4}) \cdot \sin \theta \right), \tag{5}$$

where *λw*(*θ*) is the shelter factor for the particular wind direction *θ*, and *λw*;*<sup>j</sup>* is the shelter factor when the wind direction is normal to a wall *j* (estimated perpendicular to each side building side) which can be estimated from sheltering class tables in literature [78].

We compare this interpolation approach (Equation (5)) to the wind sheltering model in the result section.

#### 2.4.2. Solar Heat Irradiance on Facades

The solar irradiance is calculated as a weighted mean vertical input according to the proportions of total solar gains expected for each façade orientation. This input can be used in simplified thermal models that are suitable to determine building heat transmission losses (HTC) [79]. When measuring and accounting for solar gains in steady-state whole building heat loss experiments, one approach is weighting each façade by their respective glazing proportions. Stamp et al. found that for north-south oriented dwellings, vertical south-facing or weighted means provide the most accurate results to determine heat transmission losses (HTC), whilst estimating solar gains from global horizontal measurements overestimated HTC [80]. For east-west facing dwellings, mean or weighted means may provide more accurate results than a single vertical measurement in the dominant direction, particularly where there are local shading effects [80]. The presence and operation of solar shading devices represent a considerable uncertainty.

#### **3. Results and Discussion**

The proposed method is applied to four buildings:


Figure 7 illustrates that nearby buildings do not shade the east, south, and west façades of the TWIN house, at the 15. of February ca. 9:30 in the morning, 12:30 mid-day and 15.30-afternoon local time (or any other time in winter). The ZEBLL house at the NTNU campus, on the other hand, is shaded in the afternoon on this day from a nearby building located west of the house. We evaluate the shading model by using data from a pyranometer on the south façade in Section 3.2.

**Figure 7.** Shading of east, south and west facades on the 15. February with solar position 135 ◦ (SE), 180 ◦ (S) and 225 ◦ (SW). (**a**) TWIN, Holzkirchen, DE; (**b**) ZEBLL, Trondheim, NO.

#### *3.1. Comparison of Sourced Weather Data to Observations at the Holzkirchen Site*

The key weather variables sourced from the reanalysis and the satellite irradiance service are presented, by comparing a two and a half month-long winter period to observations at the Fraunhofer IBP test site in Holzkirchen. The weather data was collected at the IBP's weather station at 1-min intervals, provided as 10-min averages for the period 7 December 2018 to 28 February 2019 as part of IEA EBC Annex 71. The 10-min data were processed to hourly observations for the following analysis.

#### 3.1.1. Air Temperature and Sky Longwave Irradiance

The air temperature and sky longwave irradiance from reanalysis are seen to represent the diurnal cycle (Figure 8). The ERA5-Land temperature at 2-m scores somewhat better on central performance metrics (0.33 ◦C, 2.04 ◦C and 1.36 ◦C) compared to ERA5 ′ s (1.32 ◦C, 2.27 ◦C and 1.61 ◦C) for mean bias difference, root mean square error and mean absolute difference respectively. The errors are largest under cold spells. Both products underpredict the temperature under cold conditions, which is a feature of NWP models, they struggle to represent cold temperatures in stable conditions well [81].

**Figure 8.** Observed outdoor temperature and longwave sky radiation at the Holzkirchen site compared to the ERA5 reanalysis for the period 7 December 2018 to 28 February 2019.

The longwave sky irradiance and the sky temperature, calculated from the ERA5 longwave sky irradiance and air temperature (using the conversion in Table 2) correlate to local measurements on most of the days. In Figure 9, the distribution and sorted values in ascending order (black) reveal a bias across the distribution. These quantile-quantile plots are helpful to determine if the distributions are similar. The actual hourly values aligned in time are shown as coloured point samples. The reanalysis does not match observed hourly temperature in the lower end, but the distributions are similar, so more confidence can be placed for longer periods (e.g., monthly averages).

#### 3.1.2. Wind Speed and Direction

As described in the method section, the near-surface 10-m wind in the reanalysis is not itself a direct output of the model: instead, the lowest predicted model level wind is post-processed using an exposure correction to better represent observed 10-m wind in open terrain [60]. In this case, the wind mast's local surroundings match the condition of open unobstructed terrain in the prevailing wind direction, making a direct comparison possible. When comparing the reanalysis data to site-observations, there is a clear bias in the reanalysis (Figure 10). Both products generally capture the hourly fluctuations reasonably but a consistent underprediction effect is observed.

**Figure 9.** Observed outdoor temperature and sky temperature (computed from longwave radiation) at the Holzkirchen site compared to the ERA5 reanalysis for the period 7 December 2018 to 28 February 2019.

**Figure 10.** Observed wind speed and direction 10 m above ground at the Holzkirchen site compared to the ERA5 reanalysis for the period 7 December 2018 to 28 February 2019.

By applying the 2L-method, the 10 m near-surface wind from ERA5 is bias-adjusted, leading to a better match with the observed wind (Figure 11). More details on the correction factors are presented in the downscaling chapter. As with the temperature comparison, the actual hourly values aligned in time are shown as coloured point samples and sorted in ascending order (black) to see model bias across the theoretical distribution (Q-Q plot). A Weibull distribution is overlaid on top of each scatter plot, displaying the observations series in grey and the reanalysis in colours.

After the bias correction, the Weibull scale parameter is increased from 3.1 m/s to 3.8 m/s, compared to the observed 4.0 m/s. The mean bias difference is reduced from −0.85 m/s to −0.2 m/s, and the mean absolute difference is improved by 0.4 m/s from 1.2 to 0.8 m/s.

#### 3.1.3. Global Horizontal and Diffuse Irradiance

The global horizontal and diffuse irradiance observed at the site compares quite well to the satellite-derived irradiance from the CAMS-Rad service (Figure 12) on most days. There is a series of days in February, towards the end of the period, where diffuse irradiance is overpredicted, and the global horizontal is underpredicted. These are clear-sky days not interpreted as such by the satellite-model product. Earlier in the period, the cloud cover appears to be predominantly overcast (global horizontal and diffuse irradiance are

equal), but some day-to-day variability can be spotted in both observations and satellite irradiances.

**Figure 11.** Observed wind speed and direction 10 m above ground at the Holzkirchen site compared to the ERA5 reanalysis for the period 7 December 2018 to 28 February 2019.

**Figure 12.** Observed global horizontal (GHI) and diffuse irradiance (Gdif ) at the Holzkirchen site compared to the variables from CAMS radiation service for the period 7 December 2018 to 28 February 2019.

#### 3.1.4. Snow Depth and Ground Surface Albedo

The snow model depth in the ERA5-Land grid cell matches the observed depth quite well (Figure 13). The calculated snow broadband albedo adjusted for snow cover (equation in Table 2) is compared to the measurements from two pyranometers on-site, filtered by solar azimuth height (only displaying values when the sun is 5 ◦ above the horizon) to account for uncertainty at sunrise and sunset. The uncertainty in the measurements is likely to be high at low radiation intensity, so applying more precise criteria could improve the correlation.

#### 3.1.5. Vertical Solar Irradiance on Facades

The TWIN buildings and weather station at the Holzkirchen site lie unobstructed in a flat open terrain making it impossible to evaluate the shading model with measurements from this site. It is still included to show the calculated horizontal and vertical solar irradiance compared to observations, as TWINS is the only case with measurements in all four façade orientations. On the particular day selected for analysis, clear sky conditions can be observed from ca. 10:00 in the morning (Figure 14). Even if the global horizontal irradiance from CAMS-Rad matches observations well, the calculated irradiance on each façade orientation is underestimated. We can observe that the diffuse fraction on the vertical appears to be underestimated in every facade direction even when using clear-sky irradiance as input to the solar transposition model. Possible explanations are due to how diffuse radiation is parametrised in the Perez anisotropic sky model (including model attenuation coefficients), the assumption that diffuse sky irradiance is isotropic in the shading calculations (each surface sees 50% of the sky) and physical effects caused by terrestrial reflectance or measurement errors. Due to fresh snow, the modelled ground reflectance was set to a high value of 0.70 at this day (calculated from the ERA5-Land snow cover).

**Figure 13.** Observed snow depth and ground reflectance (calculated from ground reflected short wave radation) at the Holzkirchen site compared to the ERA5 reanalysis forecasted albedo for the period 7 December 2018 to 28 February 2019. The dashed blue line shows the calculated snow albedo of Table 2.

**Figure 14.** Observed global horizontal (GHI) and total vertical irradiance measured in each direction for a sunny day in winter at the Holzkirchen site compared to the calculated surface irradiances using the (**a**) CAMS radiation service as input. The (**b**) clear sky data are based on the CAMS-Rad McClear service.

#### *3.2. Comparison of Sourced Weather Data to Observations at the NTNU Campus* Vertical Solar Irradiance on Facades

The ZEBLL campus building also features pyranometers measuring global horizontal irradiance on the roof and total vertical irradiance on the south façade. This building has its largest windows towards south, and the view from the wall-mounted sensor is shown in Figure 15. In winter, the afternoon sun is obstructed by large buildings towards west, well-captured by the shading model (Figure 16).

**Figure 15.** Panoramic view from a vertical sensor positioned south, ZEB Living Lab, NTNU Campus.

**Figure 16.** Observed global horizontal (GHI) and total vertical irradiance measured from south sensor position, ZEB Living Lab, NTNU campus compared to the calculated surface irradiances using the CAMS radiation service as input. The CAMS-Rad irradiance over-evaluates cloud cover (**a**) and clear sky (**b**).

The CAMS-Rad irradiance over-evaluates cloud cover on this particular day (Figure 16a, but a better match can be seen in the clear sky irradiance from the CAMS-Rad McClear model (Figure 16b). The discrepancy on how vertical solar irradiance is calculated is likely due to surfaces partially shaded by trees before noon (Figure 15), because only buildings and terrain are considered in this particular diagram. The difference between observed hourly global horizontal irradiance and what is interpreted by the satellite product as mostly diffuse sky irradiance on this particular morning, also exemplifies how surface shading is left unaffected when cloud cover is predominant over the hour in the satellite product. Improving the shading calculations to include diffuse shading and not only effects on direct surface radiation.

#### *3.3. Horizon Angle and Solar Radiation*

#### 3.3.1. Horizon Profile Using a Satellite-Derived Pan-European Surface Height Model

The proposed workflow for solar and wind assessment has in common that the same datasets are used on a local level, but in different ways. Moreover, that available dataset for downscaling to medium scale covers all of Europe (or global datasets), whereas the local effects are only assessable using local area height maps or building footprint vector map-layers with user contributed height-specification. Although there are initiatives of standardisation and contribution within the INSPIRE framework, including OSM contribution, and a community-led project to derive a pan-European terrain model online [82], high-resolution models involve more complex issues that are less pronounced in lower-resolution models due to the already high uncertainty. The EU-DEM v.1 product is evaluated to a vertical accuracy of 2.9 m RMSE, with higher values for the Nordic

countries (e.g., 5.75 m RMSE for Norway) [83]. EU-DEM v1.1 used in this study is an improvement over the first version, but it has not been validated yet [23]. In order to use different datasets together, we implemented datum conversion and height adjustment to the different national height systems [82]. For Norway, the difference between the national height and the one used in EU-DEM was less than 1 cm, but for Belgium, the offset is as much as 2.31 m.

We tested combinations of height information from the EU-DEM model and the local high-resolution DSM's. However, when mapping from the EU-DEM model surface height, which may very well be above the building height in steep terrain or in places with low buildings and heigh vegetation, we did not find a significant benefit to use the 30-m resolution of EU-DEM compared to the existing horizon profile API of PVGIS, which is a service that relies on a pre-processed global SRTM of 3 arc-seconds (around 90 m). However, we found that if we have actual information about the building elevation height, and create a buffer around the viewpoint (e.g., 50 to 100 m), we can create a horizon profile that in some situations is roughly similar to the detailed DSM's.

3.3.2. Horizon Profile Using Local High-Resolution Surface Height Models

Figure 17, shows the calculated horizon angle at mid-façade height using the maximum local terrain profile of the PVGIS API (thin black line) and the high-resolution digital surface models filtered to buildings only (thick black line) or without filter (green), for the various case buildings. The TWINS building (a) is located in an open flat area. ZEBLL (b) is located in Norway where solar height at the summer solstice (red line) is lower than the others, and in this case, vegetation towards north-east and larger campus buildings westward block morning and afternoon sun. GBORO (c) is shaded by a neighbouring townhouse distanced 7 m from the south façade, and UKULE (d) is also located in an urban environment facing a street and a backyard with trees in the backyard.

**Figure 17.** Calculated horizon angle at mid-façade height using the maximum local terrain profile of the PVGIS API (thin black line) and the digital elevation model DEM filtered to buildings only (thick black line) or without filter (green) for the (**a**) TWINS, (**b**) ZEBLL (**c**) GBORO and (**d**) UKULE buildings.

Table 3 summarises the average calculated horizon height per façade, separated by considering only buildings obstacles (abbreviated "Build.") or all obstacles including vegetation (abbreviated "Surf.").

**Table 3.** Calculated average shading angle at mid-façade height filtered by terrain and building outlines, or without surface filtering (including all trees or other obstacles from the surface DEM).


Figure 18 shows the clear-sky shade index per winter month by [31] defined as the ratio of shaded and unshaded clear-sky irradiance (only direct and circumsolar beam is included in the following). The impact of vegetation is presented as a range from fully opaque (by an x mark) to fully transparent (by a cross mark) and a mix of the two (dot).

**Figure 18.** Monthly clear-sky shading index per façade for the winter months presented per façade from left to right: (**a**) ZEBLL (**b**) GBORO (**c**) UKULE.


By weighting the calculated surface radiation by the glazing ratios, we can estimate the solar heat gains on the window facades in the winter months and estimate a total solar aperture for the building.

#### *3.4. Surface Roughness and Wind Sheltering*

3.4.1. Unobstructed Height Adjusted Wind Speed

ERA5 has a model resolution of approximately 31 km and lack the local representativity of the site surroundings. To better represent surface conditions upwind influence, a downscaling method is used in an attempt to increase the local accuracy. The method

consists of a high-resolution land-use map with derived surface roughness values from tables (Appendix A) and a simple two-layer model of the atmospheric boundary layer.

The analysis in Section 1 illustrated that 10-m surface wind speed at the Holzkirchen site (TWINS) is underestimated in the reanalysis. An explanation can be found in the high forecasted model surface roughness, indicating that the model grid box surface-average does not represent the local site conditions. The forecasted z0;M = 1.5 m (Table 4) is a value typically representative for forested areas. The site is located in an open agricultural area bordering the nearby town of Holzkirchen to the north-west and a golf course towards the south (Figure 9). The prevailing wind direction is west-south-west, and the first km up-wind are open landscape. Further out forest surrounds the farmland, influencing the regional scale weighted mean surface roughness plotted in Figure 17.

**Table 4.** Surface roughness from ERA5 grid cells and local value derived from the land cover maps.


First, the forecasted surface roughness for each grid cell was extracted from the ERA5 reanalysis. Table 4 show the values for each site, together with the open terrain roughness used to compute 10-m wind in ERA and the derived surface roughness from the closest grid cell of the CORINE Land Cover (CLC) 2018 map paired with a table of roughness values (Appendix C). The roughness length for the land use type "Sport and leisure facilities" was adjusted to a lower value from 0.5 m to 0.03 m to represent the snow-covered golf course.

Next, the directional surface roughness from the footprint model applied to local scale (up to 1.8 km), and regional scale (up to 9 km) scale are plotted (Figures 19–22). For TWINS we can see that on the local level, the surface roughness is well approximated to 0.03 m in the prevailing wind direction (SW), but the model level of 1.5 is many classes higher than the regional level z0, except in the north direction (due to the proximity to forest). The difference between the model average forecasted in the nearest grid cell (z0;M), and the regional footprint model is less pronounced for cases (b) to (d) (Figures 20–22). However, all display a considerably higher surface roughness on local-level than the open terrain roughness (z0;WMO).

**Figure 19.** Derived surface roughness for TWINS site (**a**) in 10 km radius adapted from CLC map (© European Union, Copernicus Land Monitoring Service 2018, European Environment Agency (EEA)); (**b**) Directional local and regional surface roughness weighted by distance to building-site. Darker green colour indicates higher surface roughness values.

**Figure 20.** Derived surface roughness for ZEBLL site (**a**) in 10 km radius adapted from CLC map 2 ; (**b**) Directional local and regional surface roughness weighted by distance to building-site. Darker green colour indicates higher surface roughness values.

**Figure 21.** Derived surface roughness for GBORO site (**a**) in 10 km radius adapted from CLC map 2 ; (**b**) Directional local and regional surface roughness weighted by distance to building-site. Darker green colour indicates higher surface roughness values.

**Figure 22.** Derived surface roughness for UKULE site (**a**) in 10 km radius adapted from CLC map 2 ; (**b**) Directional local and regional surface roughness weighted by distance to building-site. Darker green colour indicates higher surface roughness values.

The resulting unobstructed surface winds transformed to roof height are used together with the wind sheltering method in the estimation of infiltration loss (see Section 3.4.3).

#### 3.4.2. Wind Sheltering of Nearby Obstructions

The wind sheltering model results are shown below for the three cases (b) to (d) where nearby buildings wake significantly impact surface conditions in the dominating wind direction. For the campus building (Figure 23), obstacles downwind coincide with the dominating south-west wind direction. Likely, these buildings will also influence the wind direction and lead to more complex airflow patterns than captured by the model.

**Figure 23.** Air infiltration sheltering for the NTNU Living Lab (ZEBLL) case (**a**) Upwind obstacle geometries in red (OSM building footprints) and building exterior facades in green overlaid on the local DSM height map by Kartverket/CC-BY 4.0; (**b**) The combined directional shelter factor for the four facades using the shelter model (red) and the table-values obtained for each building side (blue).

For the two other cases, GBORO and UKULE, located in urban/semi-urban setting, the calculated maximum sheltering coefficient from building obstacles are more irregular (Figure 24). Therefore, it fits less to the interpolated values (using Equation (5)) estimated qualitatively for each building side from sheltering classes in literature [78]. Instead of a selecting the sheltering factor manually from a table, one could use the model to approximate the sheltering factor from obstacles located perpendicular to each façade orientations (when the wind direction is normal to a wall).

**Figure 24.** Air infiltration sheltering (**a**) for the GBORO case-building; (**b**) and for the UKULE case building. The combined directional shelter factor for the three exterior facades of either plotted over the wind direction. Only the obstacles that yielded an impact to the shelter factor are enumerated.

In the presence of more tightly-spaced building obstacles, the limitations of the sheltermodel implementation become evident. First, overlapping sheltering is not considered, which may be detrimental to performance. We have no way to assess skimming flow that is usually accounted for in urban wind morphometric approaches by increasing the zeroplane displacement height [71]. Secondly, with more complex and overlapping geometries, the functional form of the velocity deficit in the far wake is not Gaussian [72–74]. In the urban/semi-urban situation below, limiting the assessment to approximate the sheltering factor in the four façade directions and interpolating using Equation (5) may be a more physically sound and robust simplification.

#### 3.4.3. Infiltration Loss

In this section, the infiltration model results are plotted against the potential wind speed of ERA5 at 10-m = height to illustrate the effect of the wind exposure and shelter correction. The AIM-2 models physical model output, "infiltration potential" (a scalable time-dependent variable) is multiplied by the results of air-pressurisation tests n50 value. It could be replaced by either a design-value or a parameter estimation in an inverse modelling approach. All the four cases (which are well-insulated and air-tight residential buildings) have measured air-pressurisation n50 values of 1.0 air change rate per hour at 50 Pa pressure difference, except the last case d) with a lower value of 0.5 h −1 .

The slope of the infiltration air change rate at 1 atm. pressure (Figure 25) is mainly influenced by the n50 value, the shelter coefficient used in AIM-2 and what kind of exposure correction is made from ERA5 10 m potential wind speed (in open terrain) to building roof height, *Urf* .

**Figure 25.** Calculated infiltration rate (air change per hour) over six months in winter versus 10-m wind speed in open terrain from reanalysis and estimated cumulative distribution (ECFD) (**a**) TWINS; (**b**) ZEBLL; (**c**) GBORO; (**d**) UKULE.

• The black horizontal line is a common conversion of the n50 infiltration rate to 1 atm. pressure, simply multiplying the n50 values by a constant of 0.07, a rule-of-thumb conversion factor representative of a moderately sheltered building with more than one exposed façade.

The first two AIM-2 model variants uses a 1-layer (1L) logarithmic transformation of the wind speed (up to a blending height of 60 m) which is a common conversion method in many Building Energy Simulation (BES) tools.


In the last two AIM-2 model variants, the full 2-layer (2L) downscaling method was used with and without the directional surface sheltering method. In the 2L-method, the directional surface roughness upwind to the site is obtained for the required regional and local footprint using derived surface roughness from the CLC-classification.


For the first two model runs using fixed surface roughness, the variability of air change comes from the outdoor indoor temperature difference, which is more pronounced at calm wind conditions. This effect is explained by the AIM2-models sub-addition of stack and wind-driven infiltration, where the stack effect loses significance at higher wind speeds. At higher wind speeds, the wind direction plays a role when the two-layer downscaling method is applied and through the wind shelter method in cases (b) to (d) where there are buildings located in the prevailing wind direction (Figure 25).

The resulting infiltration loss over the six-month winter period is presented in the table below in W/m<sup>2</sup> K indoor-outdoor temperature difference and W/m<sup>2</sup> assuming a constant indoor temperature of 21 ◦C in the period (Table 5). Consistent with the hourly infiltration air change rate (Figure 25), including shading from nearby obstacles significantly impacts the average infiltration heat loss over the winter period for cases (b) to (d). Comparing the average infiltration loss with nearby sheltering effect (final line) to the "rule-of-thumb" approach of scaling n50 by a constant of 0.07 (first line), the mean differences between the two are less than 30 % except for case b) ZEBLL.

**Table 5.** Infiltration loss calculated mean over the six month winter period per floor area in W/Km<sup>2</sup> and assuming an indoor temperature of 21 ◦C in W/m<sup>2</sup> heated floor area.


#### **4. Conclusions**

In this paper we examine how open geospatial data can be used to refine weather variables for building energy performance evaluation with focus on incident solar radiation and wind-driven infiltration modelling. By using only building location (latitude, longitude) and a selection of free/open geospatial datasets covering Europe, we were able to acquire and adapt gridded weather data variables to local building boundary conditions. The nearly three-month-long winter comparison of a building test-site in South-Germany indicates that hourly surface variables from climate reanalysis and satellite-based solar radiation can become a feasible supplement to local observations for heating season building performance modelling and evaluation. However, the air temperature, vital to heating analysis, did not fully capture the extreme lows. In Europe, local observations of air-temperature and to some extent other weather variables are commonly available closer to site than the resolution of the current global reanalysis datasets (as was the case for the four case buildings). New regional surface products are expected to take advantage of observation stations' density and reduce the need for site-correction techniques by featuring higher spatial resolution.

To include local effects from the terrain, vegetation and buildings in solar and wind assessments, 1-m resolution DSM's and DTM's from airborne laser scanning were acquired for each case building. The separation of nearby vegetation and buildings is essential to

model incident solar insolation on low-rise building facades. The impact of distinguishing between the two was clear on the building-sites with trees. The separation was obtained by pre-processing the DSM into a new raster based on building outlines, but this operation could be made more efficient by using the Overpass API directly within the elevation profile web service to obtain building footprints and filter between building obstacles and vegetation. The building footprints (with building heights obtained from the DSM) were also used to include wind sheltering effects in the infiltration calculations. Including wind sheltering from nearby building obstacles in the AIM-2 model significantly impacts the average infiltration heat loss over the winter period for three of the four cases. Based on the four case studies, it seems like the approach work better for more open situations, other approaches may be better suited for buildings situated in a more dense urban setting.

Overall, we found that using scripting tools to automate geoprocessing tasks in conjunction with an elevation profile web service made it possible to utilise information from open geospatial data surrounding a building site effectively. However, there are needs for improvements to the methodology and risk of oversimplification. A next step could be to include diffuse-shading models and evaluate other wind conversions to site and shelter methods for urban settings. It appears that the often-scripted data-driven building thermal evaluation workflows can benefit from using climatological and spatial tools and datasets, especially to include local effects, but more practical evaluation studies are needed.

**Author Contributions:** Data curation, K.S.; Investigation, K.S.; Methodology, K.S.; Writing—original draft, K.S.; Writing—review and editing, K.S. and A.G.; Visualization, K.S.; Supervision, A.G. Both authors have read and agreed to the published version of the manuscript.

**Funding:** This paper has been written within the Research Centre on Zero Emission Neighbourhoods in Smart Cities (FME ZEN) funded by the Research Council of Norway (RCN), the research partners NTNU and SINTEF, and the user partners from the private and public sector.

**Data Availability Statement:** Produced using Copernicus data and information funded by the European Union-EU-DEM layers and Copernicus Land Monitoring Service, Copernicus Atmosphere Monitoring Service (CAMS) radiation service information and Copernicus Climate Change Service information–ERA5 and ERA5-Land global reanalysis. Contains DSM and DTM data produced by The Norwegian Mapping Authority/CC-BY 4.0, DSM and DTM data produced by the Environment Agency UK/Open Government Licence, DSM and DTM data produced by Information Flanders/Open Data Commons Attribution license 1.0, OpenStreetMap data © OpenStreetMap contributors/Open Data Commons Open Database License and Ordnance Survey data © Crown copyright and database right 2020.

**Acknowledgments:** The case studies were made available to participants in the IEA–EBC Annex 71 project Building Energy Performance Assessment Based on In-Situ Measurements.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A**

**Table A1.** Building information and assumptions used to estimate solar irradiance on the facades. Façade glazing distribution and azimuth orientation are given in a list form where the four cardinal directions (N E S W) are (180 90 0 -90) degrees.



**Table A2.** Building information and assumptions used in the building infiltration model.

#### **Appendix B**

#### *B.1. The AIM-2 Infiltration Model*

A single zone infiltration model is used to account for building air-infiltration. Empirical single-zone infiltration models were developed in the 1980s and have seen some renewed interest in later years [84–88]. In most simplified infiltration models, the stack and wind-induced infiltration rates are assessed and derived separately and then superpositioned for a total infiltration rate. Two of the most established models are the LBL and AIM-2 models, adapted into simplified and advanced form in the ASHRAE Fundamentals Handbook. AIM-2 was developed by Walker and Wilson (1990) for houses [16], and a model implementation can be found in the BES software ESP-r [76].

The accuracy of the AIM-2 model to predict infiltration rates in dwellings can be excellent (±10%) when the model parameters are well known according to validations by the authors [78]. A separate validation study found a mean error of 16–27% assessing ten single-family homes [89]. Another study found an average error of about 19% predicting air infiltration rates for 16 detached houses under a wide range of weather conditions [84]. More recently, the AIM-2 model was utilised to predict infiltration rates in three stonechurches in Sweden [85]. The median absolute prediction error was 25%. Considering the model was not developed for large structures, a correction factor of 0.8 to account for overprediction was shown to reduce the error from 25 to 11%. In another recent study, the infiltration model was validated on a single building, obtaining a mean absolute value error of 17–35% by using different parameters for envelope leakage distribution [86]. A methodology is presented in successive work to determine the air change rate in nearreal-time by combining the AIM-2 model with a tracer gas decay test method, reducing the error to 10% [90].

Lundström implemented the AIM-2 model in a building energy model [87], recently transformed to stochastic state-space form followed by a Bayesian calibration procedure [91]. The stochastic approach includes a logistic function to model occupant induced manual venting during heating and cooling season. Lundström presents the following version to calculate infiltration loss *φin f* in [87], where the calculated potential specific infiltration flow rate *Q*∗ *inf* [Pa<sup>n</sup> ] multiples with the infiltration coefficient *Cinf* [l/(s Panm<sup>2</sup> )] which can be estimated or obtained from fan pressurisations tests.

$$\boldsymbol{\phi}\_{\inf} = \mathbb{C}\_{\inf} \cdot \mathbb{Q}\_{\inf}^\* \cdot \kappa \cdot \rho\_a \cdot (\theta\_\varepsilon - \theta\_i), \ \mathbb{Q}\_{\inf}^\* = \mathbb{Q} / \mathbb{C}\_{\inf} \tag{A1}$$

$$Q\_{\rm inf}^{\*} = \left( (Q\_s^{\*})^{\frac{1}{n}} + (Q\_w^{\*})^{\frac{1}{n}} - 0.33 \cdot (Q\_s^{\*} \cdot Q\_w^{\*})^{\frac{1}{2n}} \right)^{n} \tag{A2}$$

where, *ρ<sup>a</sup>* is the density of outdoor air, *κ* is the heat capacity, and exponent *n* is the building leakage flow coefficient of the orifice power law. Like many simple infiltration models, AIM-2 uses a superposition technique where the infiltration flow rates due to wind and stack effects, *Q*∗ *<sup>s</sup>* and *Q*<sup>∗</sup> *<sup>w</sup>*, are added non-linearly in addition to an interaction term (Equations (A1) and (A2)).

In [87], building height *H* is adjusted for buildings taller than two floors to *H\** (m), resembling the correction factor used by [85] to account for over-prediction. Potential infiltration rates due to stack and wind effects can be pre-calculated by either using the

calculated indoor temperature from the previous time-step in [91] or by assuming a constant pre-defined set-point temperature in [87].

$$Q\_s^\* = f\_s \left(\Delta P\_s\right)^n = f\_s \left(\frac{9.806 \cdot H^\* \cdot p\_a \cdot |\theta\_e - \theta\_{\dot{i}}|}{\theta\_{\dot{i}} + 273.15}\right)^n \tag{A3}$$

$$Q\_w^\* = f\_w \left(\Delta P\_w\right)^n = f\_w \left(0.5 \cdot \mathcal{U}\_{loc}^2 \cdot \lambda\_w^2 \cdot \rho\_a\right)^n \tag{A4}$$

Assuming evenly distributed envelope leakage, no flue (chimney), and basement or slab-on-grade foundation, the wind and stack factors reduces to *f<sup>s</sup>* = 0.25 [(Pa/K)<sup>n</sup> ] and *f<sup>w</sup>* = 0.22 [(Pa s2/m<sup>2</sup> )]. However, in this simplified form AIM-2 loses some of its flexibility to utilise building and site-specific data. The leakage distribution input values is a major source of uncertainty influencing AIM-2 and determining the values through measurements is difficult [78]. Noting a lack of reliable data, one of the validation studies shows that minor improvement is achievable (over the default uniform values) if certain building characteristics are taken into consideration [84]. Based on the optimisation of the leakage distribution (between ceilings, floors, and walls) on different groups of houses, their study recommends using values provided in a guideline for estimating leakage distribution parameters according to house types, number of storeys, and foundation type by Lew [92]. The leakage distribution tables by Lew were implemented as cited in the ESP-r source-code [76]. The full equations for *f<sup>s</sup>* and *f<sup>w</sup>* can be found in [78] including model forms for building flues or crawl spaces.

#### *B.2. The Wind Shadow Method*

The concept of a Gaussian-shaped wake in Walker, Wilson and Forest's wind shadow method [17] is similar to a more commonly used shelter model "WEMOD" for far wake effects by Taylor and Salmon [65] that can be found implemented in the QUICK-URB and SkyHelios urban wind models [69,70]. Both QUICK-URB and SkyHelios are light-weight diagnostic urban wind models that do not solve the full Navier-Strokes equations but are based on empirical parameterisations, and mass conservation principles first compiled by Röckle [93] and later improved with updated parameterisations like the WEMOD wake model [69,70]. A difference between the wake model by Walker et al. and the WEMOD model is that the latter was developed to adjust wind speed measurements in a single point in space as opposed to being used on whole building facades. However, being of the same family of models many of the assumptions and limitations apply to both.

The sheltering factors were derived from measured sheltered and unsheltered surface pressures [17]. Therefore, the authors suggest that the surface pressure coefficients *C<sup>p</sup>* of a building shielded by adjacent obstacles could be predicted by correcting the *C<sup>p</sup>* obtained for an isolated building. This was later investigated in wind tunnel experiments using scale building models with different shapes and surrounding conditions by Sawachi et al. [72]. Their results indicate that the influence of the upwind building on the *C<sup>p</sup>* distribution of the downwind building is clearly different inside and outside of the wind shadow. The best correlations are shown when the distance of an adjacent building (obstacle) is more than twice the obstacle's height and width. It is suggested that when the adjacent building obstacle is closer, the width of the wind shadow should be given a wider area, depending on the depth of the shielding obstacle. Still, it is unclear from the paper whether they applied the flapping technique or simply projected a shadow of constant width. Another proposition is that the distance beyond where the shielding effect is negligible could be defined more clearly by the full three- dimensional size and geometrical relationship between the two objects in consideration [72].

For the scaling length in the model, a characteristic dimension of the obstacle is considered which is an empirical relationship between the smallest and the largest dimensions in projected width or height (cast in the direction of the wind). The definition is supported by former experimental studies. We refer to the original paper for the theoretical explanation of model assumptions and the functional form of the wake decay [17].

#### **Appendix C**

The roughness length table for each CLC type used in this work is published on the Finish wind atlas website with distinctive values for summer and winter. The values for winter are used.

To calculate the regional and local roughness length, a surface drag coefficient *Cd*;*<sup>i</sup>* is averaged at the blending height which is a method to give weighting to the larger roughness values [59,66]:

$$\mathbf{C}\_{d:i} = \left(\frac{\kappa}{\ln(z\_{b\hbar}/z\_0)}\right)^2, \kappa = 0.4 \tag{A5}$$

where *zbh* is the blending height and *z*<sup>0</sup> is the roughness computed for each map point *i* (100 m). A simple footprint model is used to scale the significance of roughness further away from the site:

$$\mathbb{C}\_d = \frac{\mathcal{W}\_n \mathbb{C}\_{d;i}}{\sum \mathcal{W}\_n}, \mathcal{W}\_n = \exp\left(-\frac{\mathbb{X}\_n}{D}\right) \tag{A6}$$

where the scaling factor *D* is 600 m for the local footprint and 3 km for regional footprint [66]. A source area of up to 3 times *D* is considered, resulting in an effective evaluation length of 1.8 km for the local scale and 9 km for the regional, which accounts for a total of 80% of the integral of *W<sup>n</sup>* (Equation (A6)). Other studies using this footprint model have reported other distance weightings [48,63].

#### **References**


## *Article* **Cyber-Physical Systems Improving Building Energy Management: Digital Twin and Artificial Intelligence**

**Sofia Agostinelli 1,\*, Fabrizio Cumo <sup>1</sup> , Giambattista Guidi <sup>2</sup> and Claudio Tomazzoli <sup>3</sup>**


**Abstract:** The research explores the potential of digital-twin-based methods and approaches aimed at achieving an intelligent optimization and automation system for energy management of a residential district through the use of three-dimensional data model integrated with Internet of Things, artificial intelligence and machine learning. The case study is focused on Rinascimento III in Rome, an area consisting of 16 eight-floor buildings with 216 apartment units powered by 70% of self-renewable energy. The combined use of integrated dynamic analysis algorithms has allowed the evaluation of different scenarios of energy efficiency intervention aimed at achieving a virtuous energy management of the complex, keeping the actual internal comfort and climate conditions. Meanwhile, the objective is also to plan and deploy a cost-effective IT (information technology) infrastructure able to provide reliable data using edge-computing paradigm. Therefore, the developed methodology led to the evaluation of the effectiveness and efficiency of integrative systems for renewable energy production from solar energy necessary to raise the threshold of self-produced energy, meeting the nZEB (near zero energy buildings) requirements.

**Keywords:** digital construction; artificial intelligence; digital twin; nZEB; energy management; energy efficiency; edge computing

#### **1. Introduction**

The energy management of building systems and urban areas such as residential districts is assuming an increasingly relevant role in the control and assessment of urban development and refurbishment processes.

Digital predictive technologies and sensor-based control systems are becoming fundamental tools [1] supporting policies to reach near-zero requirements and targets for buildings and urban districts. Nowadays, the integration of information communication technologies (ICT) has an important role in the configuration of smart cities and in defining digital strategies addressing social, public health, economic, environmental, and safety issues [2].

The success of such digital transformations requires the ability to meet and manage new emerging challenges [3]. Deep interactions between humans, infrastructures, and technologies are increasingly created over time by the global consequences of urbanization and the growth of human activities. Dealing with complexities related to sustainability matters, cities are implementing technological improvements achieving smarter performances through the definition of smart cities that adhere to a smart growth agenda [4].

According to the above mentioned, it can be introduced the urban intelligence [5] concept, providing insights into a number of issues currently faced by modern cities (i.e., air pollution, communication network demand, congested traffic, water floods, etc.) through the introduction of data from Internet of Things (IoT) sensors processed by intelligent and

**Citation:** Agostinelli, S.; Cumo, F.; Guidi, G.; Tomazzoli, C. Cyber-Physical Systems Improving Building Energy Management: Digital Twin and Artificial Intelligence. *Energies* **2021**, *14*, 2338. https://doi.org/10.3390/en14082338

Academic Editor: Ioan Sarbu

Received: 9 March 2021 Accepted: 14 April 2021 Published: 20 April 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

real-time advanced analytics. According to the United Nations prediction, 60% of cities will have at least half a million inhabitants by 2030, leading to issues in cities such as the increasing of network demand and crowd congestions [6].

In the future, progressively current problems in cities will be necessarily managed through intelligent urban reasoning algorithms and suitable deployment data-model based on urban intelligence systems, pervasive computing, communication, big data management technologies, and artificial intelligence (AI), leading to a strong evolution in the management of urban environments as well as in the quality of life in smart cities [7,8].

The configuration of city digital twins represents a giant leap forward for urban sustainability from design to construction and maintenance basing on the implementation of Industry 4.0 principles [9,10]. It is defined as a digital replica of a physical asset, collecting information from sensors, drones, or other sensive IoT devices, applying advanced analytics, machine learning (ML), and AI obtaining real-time processed data about the lifecycle process of physical assets.

In particular, digital twin (DT) ecosystems are related to three main entities: a physical object, its virtual replica, and the connection between them in terms of collecting and connecting real-time information. Such a digital ecosystem can effectively contribute to the lifecycle management of both vertical and horizontal systems, in order to store, manage and process big data about the urban environment in a three-dimensional data model as a structured information system connected to the physical.

In this paper, the applications of such ICT-based digital approaches are related to energy management systems, in order to predict real time situations, enriching and leading to more effective decisions, obtaining the automation of repetitive tasks, and providing added value with the optimization of decision-making processes.

In particular, the objective concerns the configuration of a solid methodology for an increasingly intelligent system where the potential of ICT, IoT, big data and AI are combined interacting with BIM (building information modeling) models (Figure 1), defining threedimensional information and predictive systems for energy management.

**Figure 1.** Rione Rinascimento III three-dimensional BIM model overview, consisting of encoded functional blocks (FB) and building models (from C\_0n to H\_0n).

In fact, the connection between IoT devices, digital information models (BIM), and AI defines an advanced smart-city ecosystem as an intelligent, ubiquitous, and sustainable digital urban context [2] where real-time monitoring systems allow data connections and processing anytime and anyplace [3,4].

More specifically, the project developed by CITERA Interdepartmental Centre of Sapienza University of Rome explores the potential of digital-twin models integrated with AI systems finding a specific application as an opportunity to apply the developed methodology. The case study is related to the configuration of an effective DT model of a residential district in Rome, increasing energy efficiency and identifying a cost-optional solution for which both consumption and costs are expected to be reduced.

Therefore, the 3D information model was developed gradually from the territorial, infrastructural (using Autodesk InfraWorks for geographic information systems) up to the building scale (using Autodesk Revit for building information modeling). The model resulted both as a microscopic and macroscopic digital database, containing static, dynamic, geometric, and semantic data about buildings and their functional interactions.

As mentioned, a BIM approach was carried out focusing on energy management model-uses and leveraging interoperability using IFC (industry foundation classes) models for energy diagnosis purposes. Basing on such analysis, a smart-energy-grid management system was developed combining BIM as-built models with IoT and AI obtaining a substantial as-performed and up-to-date city digital twin.

#### **2. Background**

The objective of bringing the virtual and physical worlds together is focused to better support decision-making, reducing risks and configuring a citizen engagement tool, improving urban sustainability [9]. The introduction of DT in construction processes addresses the improvement of decision-making focusing on well-informed and advanced real-time "what-if" scenario assessments, reducing wastes of time and resources that are typical in construction.

In this regard, the Newcastle University created a DT of the city dedicated to incidents and disasters responding and prevention, running simulations of incidents such as burst pipes, heavy rainfall or floods to evaluate the potential impact on communities over a 24 h period [10].

Another effective example of smart-city DT currently ongoing is virtual Singapore, which provides capabilities from virtual experimentations, test-bedding, and decisionmaking up to research and development [11].

Moreover, a relevant experience is carried out by the Centre for Digital Built Britain (CDBB) delivering a "smart digital economy for infrastructure and construction", as a transformation of the UK AEC (architecture engineering and construction) industry's approach about planning, building, maintenance and utilization of social and economic infrastructures [12].

In addition, the ongoing project for the city digital twin of Atlanta creates a virtual reality (VR)-based platform (built basing on the unity interactive and data-driven crossplatform game engine) which contains a three-dimensional fully modeled city of Atlanta, reproducing the entire city into a virtual space, facilitating spatial-temporal feedbacks and interactions between the human/infrastructure systems and their virtual representations [13].

Focusing on the energy implementations, three significant experiences related to DT developments integrated with AI systems can be mentioned, in order to define a systemic approach for the present study, aiming at integrating the objectives of the single experiences reported below.

The first concerns a microclimatic study on urban scale carried out in the Kalasatama district by the Municipality of Helsinki, in which it is important to highlight the "Energy and Climate Atlas", defined as a city information model for studying and developing strategies for the mitigation of climate changes and improving energy efficiency. The atlas includes a number of specific information about the buildings, such as heating systems, energy certification, electricity consumption, district heating, and water distribution. As configured, the model helps to analyze a series of technological scenarios, allowing users to define the solar energy potential of buildings, evaluating the possibility for reducing carbon dioxide emissions or outlining cost-impact scenarios for different interventions [14].

In addition, it is important to investigate the behavior of energy-smart-grid systems serving differentiated users managed by ML. As known, the main issue to be resolved concerns the need to implement storage systems due to the characteristics of discontinuity of renewable energy production.

The ESS (energy storage system) management realized through a DT integrated with ML systems can bring significant improvements leading to consequent bill savings, if compared with the current systems based on predefined control systems of the electrical power supply from the batteries.

In addition, the development of an energy management system (EMS) is fundamental. As reported by Park, Byeon et al. [15] "an EMS reinforces operational functions such as adjusting the amount and schedule of charging and discharging through the efficient control of the ESS and power conditioning system (PCS) and manages the overall power flow". Moreover, it is connected with sensors and measurement equipment able to analyze and monitor consumption patterns, managing information about power activities and optimizing the overall efficiency.

Another extremely significant energy application of DT is the simulation and testing of scenarios for energy-efficiency interventions aiming to achieve nZEB (near zero energy buildings) requirements on buildings. Since most buildings today are already built, it is necessary to underline the essential application of nZEB parameters on existing built environments through the use of BIM-oriented 5D and 6D digital approaches [16].

The fifth and sixth dimensions of BIM are used and developed to promote stakeholder's collaboration, visualizing and evaluating different options with the configuration of nZEBs, in terms of sustainability and energy efficiency parameters (6D), estimating associated costs (5D) and technical issues [16].

From there, the advances in building data interoperability both at a technical and organizational level enable relevant innovation in end–user energy delivery and optimization [17] beside to open data availability, leveraging on technologies [18] such as the IoT and cyber–physical systems.

#### **3. Material and Methods**

The case study of the present research analyzes digital ICT-based energy management techniques applied to a 16 eight-floor buildings residential district called Rione Rinascimento III, located in Rome, which represents the most significant Italian residential implementation of a geothermal source heat pump (GSHP) system, that is currently the largest in Europe.

#### *3.1. The Urban Context*

Rinascimento III (Figure 2) is configured as a building intervention characterizing an energetically self-sufficient new portion of the city, integrated as much as possible with the surrounding areas in terms of urban planning and services, and it is considered of relevant significance since it is powered by a still not-commonly-deployed kind of renewable energy system.

In the urban planning agreement between the Municipality of Rome and the private owner, primary and secondary public works were planned, as well as the completion of the Talenti Park area in front of the district. According to the Italian regulations, the new district is included in the category of bioenergetic improvement interventions, which aim at improving the bioclimatic performance of the settlement.

Moreover, the introduced Italian energy policies (such as Decree Law no. 63 of 4 June 2013) aim at a partial refunding up to 65% of the amount for energy requalification expenses, consistently improving the use of renewable sources such as the geothermal one.

The geological characteristics of the Italian territory are particularly favorable for the development of geothermal energy systems and could allow one to exploit low-enthalpy resources at different depths and in numerous areas of the country.

According to the above mentioned, a research activity was developed by the CNR (National Centre for Research) with a pilot project promoted in four Italian regions (Calabria, Campania, Apulia, and Sicily), contributing to the increase of knowledge about the use of geothermal resources, with the aim of providing useful information to start activities of exploration for the improvement of geothermal energy uses in the south of Italy [19].

**Figure 2.** Master plan of the considered area (Rinascimento III) in Rome.

#### *3.2. Linking Virtual to Physical*

The concept of Construction 4.0 defines a framework where data-driven systems are able to manage physical processes by configuring a virtual replica of the physical world and achieving decentralized decision-making processes based on self-learning mechanisms [20].

Therefore, BIM models containing data and information useful for processing assessments become able to communicate with the real systems using data from sensors, developing learning capabilities, and being able to process the received information.

The collaboration between 3D information models and IoT devices is highly necessary for a successful implementation of real-time DT purposes, as well as for energy management optimizations. However, the implementation of IoT in real-world environments configuring smart, ubiquitous, and live-interconnected systems (Figure 3) is currently still restricted by technical barriers such as device battery life, network capacity, and maintenance costs.

**Figure 3.** Block diagram of the IoT system.

The core functionality of IoT devices is to reliably collect and share data (such as flow rates, temperatures, pressures, physical movements, distance, mass, etc.) from its designated environment to the virtual world.

The hardware elements consist of a battery-powered sensor, an actuator, and a network communication system in which the collected data are processed and consequently sent to remote servers.

In the present application, the connection between the physical and virtual model is made through sensors [21] able to monitor and communicate electrical power data such as power energy voltmeter ammeter for lighting and heating, ventilation and air conditioning (HVAC) systems and smart plugs for electromotive equipment such as computers, televisions, washing machines, and so forth (Table 1) [22].


**Table 1.** Review of the implemented IoT devices.

In this case, AI systems allow the DT to develop predictive capabilities, learning from the events and improving outputs, ultimately taking and implementing autonomous decisions based on the analysis carried out without human interventions.

Moreover, the AI system achieves a balanced condition between energy consumption and energy production system's performance parameters [23], adapting itself to the environment in order to achieve the predefined objectives.

In other words, the system takes data from sensing devices, and it generates appropriate and specific actions through reasoning systems, modifying the behavior of the equipment in order to optimize energy consumptions. Specifically, it takes information from IFC-BIM and CityGML-GIS (geographic information systems) models, constantly updating them with real-time data as described in Figure 4.

**Figure 4.** Data flow and processing for digital-twin-based energy optimization.

#### *3.3. Data Interoperability*

Principles of Industry 4.0 and data interoperability in the AEC sector are extensively applicable on linking GIS and BIM models, providing data for real-time multiscale objectoriented simulations of the built environment. As configured, GIS-BIM 3D city information models and applications require common communication standards introducing problems related to information integration and data interoperability at different domains and scales [24].

In the specific case of information management in construction processes based on BIM methodologies, interoperability consists in exchanging data from models to different software and application platforms, implemented for different purposes and functionalities throughout to the whole lifecycle.

The main objective of interoperability is to facilitate the interaction between different and nonhomogeneous information systems, minimizing errors and aiming at reliability, effectiveness, and optimization of resources.

For the above mentioned, different levels and approaches on interoperability, are defined by the Information Technology Vocabulary (ISO/ISO/IEC 2382) [25] as the "capability to communicate, execute programs, or transfer data among various functional units in a manner that requires the user to have little or no knowledge of the unique characteristics of those units" [26,27].

Industry foundation classes (IFC) were defined as a reference standard format for the building industry to develop different advanced processes based on spatial data relations between building components of a BIM model.

In the present application, specific processes can be scheduled for different activities, objectives and domains (Table 2) since objects are connected to data entities and properties such as name, geometry, identifications, material parameters, etc.


**Table 2.** Data domains and collection of the interoperability process.

In the GIS field, CityGML was developed as a model standard representing geometric and information relationships between geographic entities, being defined as the most appropriate territorial modeling standard in different levels of detail. In addition, IFC and CityGML standard were used, as they are currently the two semantic models dedicated to the configuration of object-oriented information management systems, even though research is still focused on information exchanging, linking IFC and CityGML toward an advanced 3D city information model [28].

#### *3.4. 6D BIM for Sustainability and Energy Efficiency*

The study focuses on the Rinascimento III district (about 85,000 m<sup>2</sup> ) which is a part of Rione Rinascimento, consisting of 16 eight-floor buildings hosting about 900 apartment units with 2500 inhabitants.

A significant part of the energy supplied to the building complex is self-produced using renewable geothermal sources. For this reason, the following case study is considered to be extremely relevant for approaching digital methodologies integrating DT and AI systems for an efficient energy-smart-grid management.

According to the BIM Use Classification System developed by Penn State University [29] which basically categorizes BIM Uses (Figure 5) as the main purpose to be achieved

when implementing BIM in construction processes, specific purposes and objectives for BIM models were identified.

**Figure 5.** The components of a BIM use, adapted from ref. [29].

The definition of the main BIM purposes led to the identification of specific requirements for data implementation and model configuring.

Since the current application is based on the use of BIM and GIS models for energy management purposes, priority was given to the implementation of specific data such as well-defined technical parameters of the building envelope, thermal zones, rooms, HVAC systems, and equipment, as well as specific data about localization, climate [30], boundary conditions, etc., as information coming directly from the BIM system in the interoperability process.

Moreover, BIM models can have different level of depth both geometrically and informatively, depending on the BIM Uses and related objectives. According to the ISO 19650 [31] standard, LODs were defined, gradually moving toward a LOIN (level of information need) perspective shifting from a prescriptive to a performance approach, based on information granularity depending to predetermined specific BIM uses.

As mentioned, the production of the BIM models followed a number of phases coming from a low degree of definition (LOD 100 [32]), useful in preliminary and outdoor concept stages, up to a LOD 400 (Figure 4, right), according to the BIMForum, "2013 Level of Development Specification" (AIA/AGC, 2013), [32] for indoor energy analysis and simulations purposes as described in Figure 6.

**Figure 6.** The evolution phases of the BIM model LOD, according to the objective definitions and energy uses.

As configured, the so-called sixth BIM dimension (6D) was achieved since the identified BIM use was connected to energy efficiency and sustainability analyses and simulations [33]. Developing a BIM-oriented methodology allowed to assess the energy performance of the building system, providing relevant support to decision-making processes.

In this section, it is necessary to detail the data and boundary conditions necessary to run the energy analysis through the 6D BIM model [33]. The thermal characteristics of the building envelope technical systems as well as the related data contained in the BIM model are reported in Table 3.


**Table 3.** Characteristics of the buildings' thermal envelope in the BIM model.

In this case, DT reproduces the energy characteristics of the building envelope and technical plants, which combine a component of renewable energy as described in Section 3.5. In Table 4 the technical components of the main HVAC plants, as well as the controlled mechanical ventilation system are reported.

**Table 4.** Building's thermal system configuration detailed in the BIM model.


#### *3.5. Building Energy Model (BEM)*

The main objective of the DT-based developed methodology is using data models across different simulation and monitoring processes [34], combining data from different sources (BIM, GIS, IoT, etc.) in a three-dimensional model, which is aligned almost in real-time with the reproduced system [35,36].

In order to create a building energy model (BEM) [37], each component of the information model was associated with the corresponding products in a BEM software connected to BIM data (MC4 Suite for Revit), defining different thermal zones and boundary conditions.

Once the energy model was generated using a specific and authorized software, [38] it followed the validation phase.

In particular, according to Italian regulation DLgs. 30 May 2008 on "calculation methodologies and requirements for the execution of energy diagnoses and energy certification of buildings" if the deviation between the values estimated by the model and the real consumption does not exceed 5% on average, then the model is validated.

In the pilot project described in the present study, the building complex is supplied by the largest European residential geothermal plant with GSHP (COP of 3.8 in winter configuration and 5.5 in summer configuration), equipped with 200 vertical geoprobes, 150 m deep.

The components of the total energy consumption of Rinascimento district are reported in the following schemes (Figure 7) and divided into four main categories: (1) winter air conditioning; (2) summer air conditioning; (3) hot water; and (4) electric power supply.

**Figure 7.** Total primary energy consumption and renewable energy sources (RES) energy production.

In fact, the energy production coming from renewable energy sources (RES) and particularly from the geothermal plant could be estimated in about 6305 MWh/y on 9.979 MWh/y consumed, subdivided as shown in Figure 7. Consequently, 63% of the total energy requirement of primary energy is produced by the geothermal system.

In this case study, the energy diagnosis was conducted on one single building (Figure 8) of about 3648 m<sup>2</sup> , using the Revit Suite of Mc4 Software through BIM data, for a dynamic simulation of the building behavior, supplied by a modular portion of the geothermal plant. Since the highlighted building is currently the only one being fully occupied by residents (who permitted the implementation of sensing devices for DT configuration), it was selected for energy modeling and real-time monitoring.

**Figure 8.** Selected building for the performed energy analysis.

Moreover, since the geometry and spaces subdivision are almost identical for all the buildings, the modeled building is expected to share similar boundary conditions about solar radiation (Figure 9) and ventilation with the other five highlighted in Figure 8, positioned on the outer perimeter of the district without any shading.

**Figure 9.** Solar radiation analysis.

The performed simulations led to the evaluation (according to the Italian classification of Legislative Decree 48, 10 July 2020) [39] of an A2 class with a specific consumption of 26.8 kWh/m2y; the comparison with the real value building consumptions coming from an average evaluation of 3 year bills (26.6 kWh/m2y) validated the simulation model.

The aim of the DT model was also to simulate the increasing of the RES production percentage, in order to reach the goal for Rinascimento to become a near zero energy district (nZED). The energy simulation in the model were performed considering new installation of photovoltaic panels for the production of electricity and solar collectors for the production of domestic hot water.

In particular, the model was implemented with the integration of 312 kWp of monocrystalline photovoltaic modules in the building façade able to produce 276,000 kWh/y of electricity; and the realization of an area hosting 405 high-efficiency flat-plane solar collectors able to produce 410,000 kWh/y.

The simulations outputs lead to a final result of 6991 MWh/y of energy coming from renewable energy sources (RES) (geothermal+solar), which means about 70% of the district energy consumption directly produced in place by the RES microgrid of the complex.

However, the obtained results so far were focused on the building as a whole, specifying some different thermal zones created according to differences in use, occupation hours, types of HVAC installed, or types of external envelope and sun exposure.

Considering the analysis on a smaller scale, focusing on indoor environmental quality [40] such as thermal-hygrometric conditions, the BIM model was detailed with HVAC systems to develop computational fluid dynamics (CFD) analysis [41].

The standard k-ε model was deployed according to the limited need of calculation power and time for iterations (less than 300) as well as for the absence of high-pressure gradients in the rooms.

The following input conditions have been set:

Average outdoor air temperature equal to 5 ◦C; radiant floor water temperature equal to 40 ◦C; underfloor heating surface temperature is between 24 and 29 ◦C; radiative model discrete ordinates; and 1 s timestep.

Four control probes were temporary fixed and positioned in the center of each room in a typical apartment at 1.50 m from the ground, which is the same height of the DT temperature and humidity monitors fixed in all the apartment rooms (Figure 10).

**Figure 10.** Control probes positioning.

Fluid-dynamics analyses were developed from the BIM model to study the temperature gradient and convective air flows in rooms, triggered by the operation of radiant floors in winter heating mode in order to evaluate comfort parameters in each room, experimenting data interoperability from BIM model to CFD analysis (Figure 11).

**Figure 11.** Temperature gradient and air velocity vectors in different rooms.

#### *3.6. Artificial Intelligence*

Machine learning is a form of AI providing systems the capability to learn from data without the use of explicit programming. ML produces models where there are some kind of regularity in data [42]. Like human children's learning processes, it is driven by "experience" [43].

As a general rule, training a model requires computer resources which are orders of magnitude bigger than those required to execute the model [44,45].

In this specific case, data are collected and analyzed in order to devise one or more model for energy-efficiency purposes using AI while allowing normal comfort and living habits. The general architecture of the system is shown in Figure 12.

**Figure 12.** System architecture.

The goal was achieved through two phases: (1) design and implementation of the infrastructure and (2) obtaining data, training, and model testing.

#### 3.6.1. Design and Implementation of the Infrastructure

Energy data are simple time series of power consumption or production, coming from real sensors in a given time lapse, each one transmitting data with its own application programming interface (API); moreover, they are obviously located close to energy loads or near power sources.

This means that data are not all in the same place at the same time, which is a necessary condition to perform the analysis that led to the desired algorithms.

The first problem is therefore to plan and deploy a cost-effective IT (information technology) infrastructure able to provide reliable data to be processed.

Each apartment was implemented with monitoring sensors, so that every device energy consumption could be considered to define the control solution of the overall energy requirement in each apartment.

All the implemented metering sensors produce a huge amount of data requiring significative computational resources to obtain acceptable analysis performances; therefore, the best solution for reducing installation expenses would be to control the system acquiring all the information in a data center or a service in a data center.

This architecture leads to the necessity of setting a local system for interconnecting IoT sensors and actuators over a geographical network (such as the Internet), executing sort of local computation and buffering data in case of connection blackout, using the known "ubiquitous and pervasive computing" [46] techniques to deal with the computational problems of centralized intelligence.

Following this approach, two distinct problems had to be solved designing the infrastructure:


The first element in the infrastructure is a subsystem able to cope with several transmission protocols and time frames, whose output is the synchronized power consumption (or production) of the smart metered devices. This subsystem accepts instruction from the second element to switch on and off some of the controlled devices.

This element needs to be connected with all sensor networks; therefore, it has to be physically placed next to them, minimizing transmission problems and monitoring local environment even in absence of communication with the central control system. This kind of elements is called "elettra" in the following section.

The second architecture element is another subsystem, composed of a different "proxy", and each proxy receives the outputs of the first subsystem as an input. The proxies deliver the data to the central unit and receive back data from the same device, taking care of bandwidth problems and unreliability of the network.

These proxies have to be physically close to the first subsystem while the central unit can be remote; the central control system is a centralized unit able to store and process data, operating building digital simulation models and delivering commands back to the proxies.

The logic model of the designed infrastructure is based on three elements, as shown in Figure 13.

**Figure 13.** Logic model of the infrastructure.

Following this logic infrastructure, a series of "cheap" small computer or SoC (system on chip) had to be equipped, containing both the "elettra" and "proxy" subsystems; all those computers are connected to a high-performing server in a data centre able to run the software of the central control system. The operative concept of this infrastructure is exemplified in Figure 14, where only a few energy consumer devices are reported as an example.

**Figure 14.** Operative concept.

Elements e1, e2, e3, e4, and e<sup>5</sup> are the cheap computing containing the elettra subsystem and the proxy, while elements c<sup>1</sup> to c<sup>10</sup> are energy load examples, and P1, P<sup>2</sup> are photovoltaic panels for electric power production and geothermal plant.

#### 3.6.2. Obtain Data, Train, and Test Models

Once the data are stored in the central control system, they can be analyzed to build digital numerical models able to simulate and optimize all the main parameters of the smart energy grid. All data have a similar form, so that they can be viewed as a series of {location, date-time, object, value}.

Considering a single location, using ML techniques and rule-based methods such as association rule learning, it is possible to deduce which device is active at a certain time for each selected location [46].

In the present application, it was not possible to consider all the locations as equivalent one to the other, as detailed in Section 4.

A possible general solution is the adoption of best practices, which are hard to define due to the different final uses (home, office, and mixed use) and layouts; if grouped by location and similarities parameters, AI becomes able to automatize processes attributing each location to the most appropriate group or cluster. Therefore, it is necessary to run a ML technique known as "clustering" to automatically create groups of similar apartments used for mathematical representation of each unit: to create the feature vector of each unit, each and every energy consumer and producer was counted and grouped together by type [47].

Given the vector representation of each apartment, we used the well-known unsupervised technique known as K-means, to automatically extract groups of energy-similar apartments.

After a period of observation, a sample for each homogeneous group in a single location was chosen. These local samples were used to extract behavioral rules to be applied to the others belonging to the sample group.

Analyzing the configuration of each location at a given time, it is possible to compare any apartment "Ai" with the sample one "As". As an example, a general association rule can be expressed as follows: "at time tk, make a comparison of device type dj of flat i (dAij) with the correspondent device type of the reference one (d As j). If they are in a similar status, then do nothing; otherwise, switch it on or off, so that it is in the same state of the reference one's.".

An association rule is something in the form X → Y that in a smart grid should assume the simplified form TheSolarPanel IsOn → TheWashingMachineIsOn. We achieved this using the "Apriori Algorithm" which is an influential algorithm for mining frequent item for Boolean association rules. It identifies the frequency of individual items in the dataset, extending them to larger item sets, according to their appearance in the dataset [48].

Nevertheless, every automated system can easily fail if the digital representation of the built environment does not match reality. Assuming that, inevitably during the lifetime of an apartment, some smart plug will be connected to different devices, affecting the digital model reliability and accuracy.

In order to keep the digital model continuously up-to-date, AI techniques transform a power absorption curve of a single device in a sequence of characters named "energy words of the device" [48], using analytical processes similar to those of text analyses; then, a supervised learning method named "Naïve Bayes classifier" automatically identifies the type of each energy load, so that the system can detect a mismatch between the digital representation and what is actually connected to the network.

The dictionary of different energy words exceeded the size of 60,000, with the major number appearing less than three time in the energy footprint; therefore, we set this threshold to avoid dimensionality problems. The resulting predictive model elaborated using the Naïve Bayes classifier was validated using both a 66% train 33% test split and a 10-fold cross validation technique, taking advantage of the tool named "Weka", an open source ML software (using the class weka.classifier.bayes.NaiveBayes).

#### **4. Results**

As a consequence of the energy efficiency improvement based on the implementation of renewable energy systems, in winter conditions, the geothermal power plant supplies every building both with heating and domestic hot water; solar collectors integrate the system, while the photovoltaic system powers the external lighting system around the perimeter of the buildings. In summer conditions, domestic hot water is produced through solar collectors covering 100% of the actual needs, while the geothermal power plant only works for the production of chilled water for cooling (through the absorber), while the photovoltaic system powers the entire lighting system of the complex.

The energy diagnosis conducted on a single building using the BIM model through the Revit Suite of Mc4 Software led to the transition from an A2 class (with a specific consumption of 26.8 kWh/m2y) to an A4 class (with a specific consumption of 16.1 kWh/m2y). Moreover, in order to further validate the results and the obtained energy diagnosis, the calculation was also repeated with two other numerical simulation tools: (1) Termus BIM, basing on the BIM model and (2) ArchiEnergy, a semidynamic software developed by Sapienza University of Rome (Table 5).

**Table 5.** Energy diagnosis results (kWh/m2y): software comparison.


\* Average of 3 year consumptions of the district; \*\* 3 month summer bills of the analyzed building.

Once the results and deviation values were obtained, they were evaluated and compared to the following chart in Figure 15, which reports results from other energy diagnosis conducted on similar building systems.

**Figure 15.** Software comparison through energy diagnosis results on similar buildings.

From the analysis, it is shown that the diagnoses made with the energy software led to similar results with a maximum deviation of 12%, and the difference between the two BIM-based, Mc4 Suite for Revit and Termus BIM, is 5% (Table 5).

Moreover, the fluids-dynamic analysis performed in specific rooms of a single apartment was confirmed by the data coming from sensors, showing that there is no discomfort in any area due to the configuration of the radiant floor equipment.

In fact, large masses of moving air can be observed as previously shown in Figure 11. This is mainly due to the temperature difference between the floor and the environment. Convective motions affecting all the areas are generated; however, the temperature gradient is fully compliant with the regulation requirements, and the air velocities are very low, falling within the range of comfort conditions.

It was also monitored the temperature in each area, where the internal temperature was initially 5 ◦C (equal to the external temperature), until the achievement of the internal comfort temperature of 20 ◦C. The temperature transient is shown below (Figure 16).

**Figure 16.** Temperature transient.

It can be noticed that the air heating trend is almost the same for all the rooms, and the comfort temperature is reached in about 1900 s (just over 30 min).

Moreover, another obtained result was the implementation of an intelligent energy management model, i.e., an automatic ML system capable of modulating loads (mainly electrical) according to the expected self-production of energy; for this purpose, information from the European Copernicus [49] earth observation system are acquired in order to have accurate predictive meteorological data.

In this regard, the energy-smart-grid system realized with solar collectors and photovoltaic panels needs a set of rules to establish priorities regarding energy production and consumption loads:

Production: electricity from solar sources, being totally free, must be the first to be fed into the distribution network, followed by the energy coming from the geothermal power plant (which needs electricity to power the circulation pumps); as a last option, it is possible to use energy coming from the public electrical net or use gas.

Consumption: the priority of power supply must be given to the lighting system, followed by the electromotive force circuit, while the air conditioning systems can be regulated and modulated in the event of a lack of energy, by lowering or raising the optimal temperature up to 2 ◦C.

Therefore, the AI system contributed to reach the goal of increasing the efficiency of the entire energy system by more than 10%, limiting the dependence of the building complex from the electricity and gas distribution networks to a maximum of 20% of the total energy consumed. The system for energy loads forecasting and managing was created in a single apartment (Figure 17) according to the following two logical steps: (a) the creation of a synthetic method to group the plants based on the similarity of results in terms of energy efficiency and (b) metering, evaluation, and analysis of consumption data of the selected plant.

**Figure 17.** Energy loads forecasting and DT managing through AI.

The inevitable use mutability of the apartments was also considered, as well as the variations in energy loads over time; consequently, an algorithm able to automatically deduce which devices are used in each power outlet was adopted, analyzing the hourly trend of current absorption.

Some energy sensors (as detailed in Material and Methods) were applied, and data were collected in a central system. The different typology of energy loads was considered, and then submetering was performed, as shown in Table 6 and Figure 18.


**Table 6.** Working day energy metering in a typical apartment (Wh).

**Figure 18.** Submetering in a typical apartment.

Energy consumption of each device varies according to its power absorption, as shown in Table 6 and Figures 18 and 19, which report some controlled measures on a typical working day, detailing both the apartment and the single rooms.

**Figure 19.** Consumption submetering (Wh) in an office room.

The use of these energy sensors led to another result: the so-called "submetering" It was possible to detect the biggest single load both in the apartment and in a single room. In this way, the analysis and decision of how to save energy becomes simpler, devising strategies affecting the most consuming items, effectively contributing to the overall energy saving.

#### **5. Discussion**

The concept of DT is extremely transversal and widely suitable to both microscales such as apartments and macroscales at the district levels. As the new and future buildings will be directed to near-zero-energy building standards (nZEB), or even zero-energy buildings (ZEB), they therefore need tools suitable for the new design requirements, i.e., digital systems able to predict and simulate both global energy consumption and internal behavior [50].

It is quite impossible to define a validation process able to ensure the reliability of the calculation method by 100%.

For a full comprehension of the model and interoperability process accuracy, it was necessary to proceed with a comparison methodology based on the overall final outputs, (kWh/m2y) between three different software (1) Termus BIM by Acca Software, (2) ArchiEnergy from Sapienza University of Rome, and (3) Mc4 Suite for Revit.

The developed analysis was focused on the comparison of results coming from different processes basing on both traditional and BIM approaches. On the one hand, Termus BIM used IFC BIM standards, while in Mc4 Software a plug-in approach was developed directly connecting the Revit BIM model with Mc4 analysis tools. On the other hand, the ArchiEnergy software is a traditional system calculating energy consumption based on inputs by the user about the plant and the building envelope.

Following the validation phase, the DT led to the evaluation of the smart-grid implementation effects. In particular, in Figure 20, the reduced energy consumption and the relative reduced CO<sup>2</sup> consumption coming from the Mc4 Suite for Revit analysis are shown.

**Figure 20.** Results of the energy-efficiency interventions.

At the same time, the work carried out highlights how in highly urbanized contexts characterized, it is very difficult to achieve high performances as required by the nZEB Italian Decree [51], even if significant energy requalification interventions are developed, improving both the building envelope and air conditioning systems.

As a consequence, it became necessary to consider building complexes not only as consumers, but also as energy producers in a local, block, district, or neighborhood smart grid: the concept of "prosumer".

By such a logic, the role of AI in smart-grids management and optimization of both energy production and consumption becomes decisive, being able to make reliable forecasts on possible scenarios.

Analyzing similar energy efficiency interventions on buildings and residential complexes, it is shown how efficient technologies are now available, well defined, and widely known. Therefore, the parameters of selection between different interventions are essentially (a) climatic parameters, (b) regulatory restrictions and constraints on interventions, and (c) the availability of government grants for the use of RES, compensating the payback time, which is still too long for certain technologies.

As previously shown, the use of BIM-based systems [16] for building energy efficiency drives no substantial improvements in terms of accuracy of results compared to traditional methodologies [18].

However, the real innovation contribution of DT-enabled systems concerns the definition of digital technologies able to reduce the gap between the expected performance of buildings and their real behavior. These goals are mentioned in the strategies of National and International R&D Programs such as Next Generation EU (Recovery and Resilience Facilities) [52], Strategic Energy Technology (SET) Plan [53], and Italian National Integrated Energy and Climate Plan (Dimension 5 Research, Innovation and Competitiveness) [54].

In this case, DT becomes a key element for research and development on secondgeneration smart buildings entirely based on electricity consumption and characterized by energy autonomy, high flexibility, block chain, and smart contract dialogue systems with the grid, assisted by digital monitoring methods.

#### *Artificial Intelligence*

Although optimizations on energy consumption have been studied in depth [55], when dealing with residential compounds or SOHO (small office home office) buildings, we cannot directly borrow general solutions from research experiences [56,57]. In fact, the overall consumption in these environments is the sum of small contributions by a considerable amount and variety of devices [58], while, mostly in industrial environments, there are generally few big powers draining that can be controlled one by one.

Moreover, these small consumers are operated by people which do not follow any procedure, since they have their personal habits: dealing with both technical and human factors through data analysis techniques becomes a fundamental strategy [59].

DT was coupled with AI to investigate building behaviors as a whole, and supervised learning techniques are used to produce an efficient and intelligent storage system management in the whole complex.

The problem of energy savings in buildings is strictly connected to the need of measuring and controlling energy loads in an efficient way, which can evolve complex scenarios. For instance, if nobody is at home and it is already late morning, both the coffee machine in the kitchen and the air conditioning are wasting energy if they are still switched on, while if someone is still there then both appliances should be still operational. Consequently, several sensors and actuators can be involved and their data should be interconnected so that an ad hoc algorithm derives the correct energy saving policy (e.g., a motion sensor shares data with electrical relays able to switch on/off the correct devices).

Real-time building management system incorporate model-based control through ML [60] to extend the use of mathematical models even to the management of humanrelated factors. In fact, thermal, humidity, acoustical comfort, and occupants model are combined and connected to ML.

While the first model depend on facts, the latter depend on humans: the behavioral model is a probabilistic one [60]: the probability that an occupant takes specific behavioral decisions or actions is defined as a function of the occupant's characteristic and the current environmental conditions, and "predicting the residents" actions toward a specific situation is not easy".

Considering the apartment microscale, instead of the whole building, our approach was to envision the automatic definition of best practices [61]; if grouped by location and similarities parameters, thanks to unsupervised learning techniques, it was possible to automatize the processes of attributing each location to the most appropriate group or cluster. In our approach, the most efficient and performing apartment for each group or cluster was found considering the energy bill over a few months, confirmed by the energy data collected over a given period [62].

Given these "sample" location, personal actions in apartments can be modeled with behavioral rules [63]: the definition of rules was given using a formal logic that allows exceptions [48] through AI, using Apriori algorithm to automatically learn the rules.

The automatic update of the BIM model to ensure the validity of the DT, based on an up-to-date information model, was dealt with by using web services [64]. Specifically, it was necessary to ensure that information about energy loads coming from smart plugs were up-to-date in the model. A supervised learning technique (named "Naïve Bayes classifier") combined with a novel energy load information coding [65] was used to achieve the goal.

#### **6. Conclusions and Further Developments**

The configured DT methodology gives buildings the capability of improving and enriching their knowledge and available data, receiving input and signals from sensors that constantly monitor them, developing self-learning capabilities and predictivity through the integration with AI systems.

Moreover, the paper focuses on how the concept of DT is extremely transversal and applicable both to macroscopic and microscopic scales (from district to apartment), as demonstrated for the use of energy management systems. It can be related, for example, to specific components of technological systems, to the digitalization of infrastructures and real estate assets, to technological systems, or networks of technological systems, etc.

The objective of the research was to exploit ML systems to manage and to simultaneously integrate self-production and supply system in an energy smart grid, in terms of both thermal and electrical loads.

The results of the DT-based real-time monitoring are able to reduce the gap between the energy performance of the buildings (simulated through energy diagnosis) and the real building performance. This is possible thanks to data analysis, which allows one to get more refined energy management strategies, even highlighting inadequate users' behaviors and policies.

As far as load forecasting is concerned, the configured DT is able to calculate thermal loads on a daily basis [60], integrating them with algorithms capable to calculate in advance building consumption based on historical data transmitted by sensors; in this way the system, on the one hand, acquires real-time data from smart metering [61] and environmental quality sensors; on the other hand, it integrates historical data (bills, consumption, etc.) and IoT with a real-time simulation approach [62]. The purpose is aimed at updating and refining the database, tailoring the energy profile of consumption on real users

These intelligent systems implemented also provide an active control on the energy balance; in fact, once the system becomes sufficiently confident, it takes control itself of the energy production systems, as well as of the loads modulation and regulation in order to optimize the energy balance system, limiting nonessential loads in case of production deficit.

Even the optimization of thermo-hygrometric wellbeing parameters in the indoor environment is considered as fundamental. In fact, through the analysis of data from environmental quality sensors and after an appropriate self-learning period, the DT becomes able even to set operations times and levels of the systems to optimize the thermo-hygrometric wellbeing of users.

Moreover, spreading the proposed research to an urban approach, developments in the BIM-GIS synergy, as both large- and small-scale digital information system configuration, would allow for the integration of each urban energy cell with the national power distribution grid, with particular focus on electric mobility and storage systems of smart grids, urban metabolism, etc. Predictions about the impacts on neighboring areas and profiling functional integrations would be performed, providing essential digital tools for the implementation and real-time monitoring of municipal and district energy plans.

In addition, in this regard, further developments of the present research would reach the optimization of the operations using a data model as a process core, replicating reality in real time, limiting or even eliminating system malfunctioning, grid unbalance, or even power breakdowns. With the aim of reducing malfunctions and breakdowns on energy services, the proposed methodology would be applied even to the facility management of HVAC and electrical plants toward configuring predictive maintenance systems.

**Author Contributions:** Conceptualization, S.A., F.C., G.G., and C.T.; methodology, S.A. and F.C.; software, S.A. and C.T.; validation, S.A., F.C., G.G., and C.T.; formal analysis, S.A.; investigation, G.G.; resources, S.A.; data curation, S.A. and C.T.; writing—original draft preparation, S.A., F.C., and C.T.; writing—review and editing, S.A. and G.G.; visualization, S.A.; supervision, F.C. and G.G.; project administration, G.G. and F.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Energies* Editorial Office E-mail: energies@mdpi.com www.mdpi.com/journal/energies

MDPI St. Alban-Anlage 66 4052 Basel Switzerland

Tel: +41 61 683 77 34 Fax: +41 61 302 89 18

www.mdpi.com ISBN 978-3-0365-1755-1