A Review of Research on Building Energy Consumption Prediction Models Based on Artificial Neural Networks

School of Architecture, Harbin Institute of Technology, Harbin 150001, China
Key Laboratory of Cold Region Urban and Rural Human Settlement Environment Science and Technology, Ministry of Industry and Information Technology, 66 Xi Dazhi Street, Harbin 150006, China
Sustainability 2024, 16(17), 7805
Submission received: 15 August 2024 / Revised: 3 September 2024 / Accepted: 5 September 2024 / Published: 7 September 2024


Building energy consumption prediction models are powerful tools for optimizing energy management. Among various methods, artificial neural networks (ANNs) have become increasingly popular. This paper reviews studies since 2015 on using ANNs to predict building energy use and demand, focusing on the characteristics of different ANN structures and their applications across building phases—design, operation, and retrofitting. It also provides guidance on selecting the most appropriate ANN structures for each phase. Finally, this paper explores future developments in ANN-based predictions, including improving data processing techniques for greater accuracy, refining parameterization to better capture building features, optimizing algorithms for faster computation, and integrating ANNs with other machine learning methods, such as ensemble learning and hybrid models, to enhance predictive performance.

1. Introduction

The International Energy Agency’s (IEA) Global Buildings Status Report highlights the continuous increase in energy consumption, with building operations accounting for 30% of global final energy consumption in 2021. To address this, building energy consumption predictions can inform energy-saving measures and strategies, ultimately leading to reduced energy use and carbon emissions.
Building energy consumption prediction is a crucial tool for assessing energy-saving potential throughout a building’s design, operation, and retrofitting phases. Common prediction methods include model-driven (white box), data-driven (black box), and hybrid (grey box) approaches. Data-driven methods, particularly machine learning, have garnered significant attention due to their time efficiency, ease of operation, and relatively accurate prediction performance. Artificial neural networks (ANNs), as a versatile machine learning technique, are widely considered one of the most effective methods for building energy consumption prediction [1]. Since 2016, researchers have delved deeper into this field, exploring various algorithm optimizations, model integrations, and hybrid approaches to enhance ANNs’ predictive power.
Several recent review articles focus on building energy consumption prediction using ANNs. Nadia D. Roman et al. [2] explored the creation of ANN-based meta-models for building performance simulation (BPS); Chujie Lu et al. [3] analyzed open issues and challenges in applying twelve ANN architectures; Dimitri Guyot et al. [4] examined neural network applications, technical features, and limitations in architecture; Siti Solehah Md Ramli et al. [5] compared ANNs to other data-driven models using evaluation metrics; Saeed Reza Mohandes et al. [6] observed rapid growth in ANNs for building energy analysis (BEA), particularly with GDBP and LMBP algorithms, and a shift towards newer neural network types (GRNN and RNN); Jason Runge et al. [7] noted the prevalence of black-box feedforward neural network models with manually determined parameters.
Through the reviewed literature, it is evident that building energy consumption prediction using artificial neural networks (ANNs) spans multiple disciplines, including architectural engineering, computer science and artificial intelligence, energy engineering, and data science. However, there is a notable gap in systematic reviews from an architectural perspective on the application of ANNs across different building phases. Consequently, this literature review aims to provide a thorough analysis of studies from 2015 to 2023, elucidating the diverse applications of various input types, building types, energy types, and temporal characteristics in energy consumption prediction models. It also proposes guidelines for selecting appropriate ANN structures for different building phases. Finally, this review discusses future potential developments in ANN-based predictions, focusing on enhancing data processing, refining model parameters, and integrating learning approaches. It emphasizes that improved data cleaning and processing can enhance data quality, optimized algorithms can refine model parameters, and the application of ensemble and hybrid models can improve model interpretability and predictive performance.
Figure 1 shows that publications on “Building Energy Consumption Prediction Using Artificial Neural Networks” between 2015 and 2023 have increased continuously since 2017, indicating a significant research hotspot. This paper addresses this trend by reviewing the past decade of the literature on using ANNs to predict building energy use and demand. It summarizes how different ANN structures are applied in building design, operation, and retrofitting. Additionally, it discusses input types, building types, energy consumption types, and temporal characteristics of prediction models, ultimately offering guidance on selecting appropriate ANN structures for different building phases. Finally, this paper explores potential future developments in ANN-based building energy consumption prediction, considering data processing, model parameterization, algorithm optimization, integrated learning, and hybrid models.
This study employed bibliometrics, using the keywords “Neural networks”, “energy”, “prediction”, and “build” in search strings within the established and reputable Web of Science database. We selected English publications from 2015 to 2023. Initially, 2155 publications were found; after screening titles and abstracts, 549 were selected; and upon further review, 292 publications were finally chosen. The categorization of the literature is shown in Figure 2, with 38 review articles on building energy consumption prediction, of which only 5 are reviews on predictions using artificial neural networks, as depicted in the flowchart in Figure 3.

2. Analysis of Current Applications and Characteristics of ANNs

2.1. ANN

Artificial neural networks (ANNs) are a category of machine learning algorithms inspired by biological neural networks. These models are noted for their robust ability to represent and model nonlinear relationships between inputs and outputs.
Initially developed in the late 1950s, artificial neural networks aimed to mimic neuronal behaviors in the human brain [8]. In 1995, Syed M. Islam [9] and colleagues leveraged feedforward neural networks to model and predict building thermal loads, laying a crucial groundwork for future research. By 2000, Kalogirou [10] highlighted the potential of ANNs in designing diverse energy systems. Subsequently, in 2005, González and Zamarreño [11] utilized feedback neural networks to forecast hourly electric energy use in office buildings. Following this, in 2006, Karatasou and Santamouris [12] promoted the use of feedforward neural networks (FFNNs) for building energy consumption forecasts. Subsequent research consistently validates that ANNs provide precise predictions of building energy use, with FFNNs noted for their high accuracy and widespread application in a multitude of studies.
With advancements in deep learning, recurrent neural networks (RNNs) have emerged as potent tools for processing sequential data in building energy consumption prediction. In 2016, Marino, Amarasinghe, and Manic [13] noted the challenges that long short-term memory (LSTM) faced with minute-by-minute predictions but acknowledged its adequacy for hourly forecasts. The following year, Heng Shi et al. [14] introduced a pooling-based deep RNN, enhancing household load predictions. By 2018, Rahman [15] proposed using an RNN sequence-to-sequence model for mid- and long-term forecasting, which showed promising accuracy. Additionally, in the same year, Kumar, Hussain, Banarjee, and Reza [16] demonstrated that LSTM models, capable of addressing nonlinearities and retaining historical data, outperform traditional BP neural networks in predicting electrical grid loads.
In 2019, Kim and Cho [17] combined convolutional neural networks (CNNs) with LSTM to predict residential power consumption, marking a significant application in energy modeling. The following year, Yuan Gao [18] and his team utilized a sequence-to-sequence model with a 2D CNN featuring an attention layer, enhancing prediction accuracy for specific buildings. In 2021, Guannan Li [19] employed a CNN-LSTM hybrid network, noting a significant reduction in computation time when integrated with attention mechanisms, though without accuracy improvements. By 2022, Ibrahim Aliyu [20] introduced a 1D CNN, celebrated for its computational efficiency, high performance, and cost-effective hardware demands. CNNs are relatively new to the field of building energy consumption prediction, yet they have garnered increasing attention with the continuous advancement in deep learning technologies.
Prior to 2017, publications on artificial neural networks were somewhat sparse. Since 2017, however, there has been a substantial surge in the literature related to building energy consumption prediction using artificial neural networks, with as many as 474 studies published in 2022. In summary, ANNs are extensively applied in building energy consumption prediction, becoming increasingly valued by researchers and representing one of the most commonly used prediction algorithms today. ANNs are categorized into FFNNs, RNNs, and CNNs. As depicted in Figure 4, publications from 2015 to 2023 on ANNs, FFNNs, RNNs, and CNNs show a continuous increase. FFNNs are the most frequently utilized neural network structure. Publications on RNNs have gradually increased since 2018, while research on CNNs has only recently begun to attract attention, indicating significant potential for further exploration.

2.2. Characteristics of Neural Networks by Structure

2.2.1. Feedforward Neural Networks (FFNNs)

Feedforward neural networks (FFNNs) represent a fundamental and widely used type of neural network architecture. Their structure consists of an input layer, one or more hidden layers, and an output layer. Key FFNN types include multilayer perceptron (MLP), radial basis function (RBF), extreme learning machine (ELM), wavelet neural network (WNN), and nonlinear autoregressive with exogenous input (NARX). Table 1 summarizes their characteristics, advantages, disadvantages, and structures. Among them, MLPs are the most widely used and versatile, known for their ability to model and learn complex nonlinear relationships. MLPs perform well across various tasks, such as regression, classification, and pattern recognition. RBF networks are suited for regression and classification problems, while WNNs excel in signal processing and time series analysis. ELMs are designed for large-scale data and high-dimensional features, offering fast training times. NARX models, which take into account past observations and external inputs, are well suited for time series forecasting, particularly for data with periodic or trending characteristics. Each type of FFNN has its own structure, characteristics, and strengths, and selecting the appropriate model depends on the specific problem, data characteristics, and desired outcomes.
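The layered structure described above can be made concrete with a minimal sketch of an MLP forward pass. This is an illustration only, not a model from any reviewed study: the three inputs, the layer sizes, the tanh activation, and the random initial weights are all assumptions, and no training is performed.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases, activation=None):
    """Compute activation(W x + b) for a single input vector x."""
    out = []
    for w_row, b in zip(weights, biases):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b
        out.append(activation(z) if activation else z)
    return out


def mlp_forward(x, layers):
    """Pass x through tanh hidden layers, then a linear output layer."""
    for weights, biases in layers[:-1]:
        x = dense(x, weights, biases, activation=math.tanh)
    weights, biases = layers[-1]
    return dense(x, weights, biases)


rng = random.Random(0)
# Hypothetical inputs, scaled to 0..1: window-to-wall ratio, wall U-value, floor area
layers = [init_layer(3, 4, rng), init_layer(4, 1, rng)]  # 3 inputs -> 4 hidden -> 1 output
prediction = mlp_forward([0.4, 0.3, 0.7], layers)
```

Information flows strictly from input to output, with no loops, which is the property that distinguishes FFNNs from the recurrent structures discussed next.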

2.2.2. Recurrent Neural Networks (RNNs)

A recurrent neural network (RNN) features a unique architecture with recurrent connections tailored for processing sequential or temporally dependent data. The foundational RNN structure comprises an input layer, a hidden layer, and an output layer, where outputs from the hidden layer are recurrently fed back as inputs, creating a loop that facilitates data flow across time steps. Key variants of RNNs, such as LSTM, gated recurrent unit (GRU), Elman neural network (Elman), restricted Boltzmann machine (RBM), and evolutionary data analysis (EDA), are detailed in Table 2, highlighting their features, pros, cons, and architectural nuances. LSTM, a flexible and robust RNN variant, excels in managing tasks that involve long sequence data and require the capture of extended temporal dependencies. The GRU model, a streamlined version of LSTM, is optimized for rapid training and high computational efficiency, making it ideal for time-sensitive tasks. The Elman network, one of the simplest RNN forms, is well suited for basic sequence modeling tasks. Recurrent neural networks, in general, are adept at handling both sequential and recursive data structures. Bidirectional RNNs, in particular, are effective for tasks that necessitate an understanding of contextual relationships within data sequences, offering a deeper comprehension of bidirectional dependencies. Selecting an appropriate model involves a thorough evaluation of the specific task demands, data traits, and algorithmic efficacy.
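The feedback loop of the foundational RNN can be sketched as a simple Elman-style update, in which the hidden state from the previous time step enters the computation of the current one. The weights, the two input features, and the hidden size below are hypothetical values chosen for illustration; a practical model would learn the weights from data.

```python
import math


def elman_step(x_t, h_prev, w_in, w_rec, bias):
    """One Elman update: h_t = tanh(W_in x_t + W_rec h_prev + b)."""
    h_t = []
    for i in range(len(h_prev)):
        z = bias[i]
        z += sum(w * x for w, x in zip(w_in[i], x_t))
        z += sum(w * h for w, h in zip(w_rec[i], h_prev))
        h_t.append(math.tanh(z))
    return h_t


def run_sequence(sequence, w_in, w_rec, bias):
    """Feed a sequence step by step, carrying the hidden state forward."""
    h = [0.0] * len(bias)
    states = []
    for x_t in sequence:
        h = elman_step(x_t, h, w_in, w_rec, bias)
        states.append(h)
    return states


# Hypothetical hourly inputs: [scaled outdoor temperature, scaled previous load]
sequence = [[0.2, 0.5], [0.3, 0.6], [0.4, 0.8]]
w_in = [[0.5, -0.3], [0.1, 0.7]]   # 2 hidden units x 2 inputs
w_rec = [[0.2, 0.0], [-0.4, 0.3]]  # 2 hidden units x 2 hidden units
bias = [0.0, 0.1]
states = run_sequence(sequence, w_in, w_rec, bias)
```

Because the hidden state is carried across steps, the same input presented at two different times generally produces different hidden states, which is how the network encodes temporal context.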

2.2.3. Convolutional Neural Networks (CNNs)

Convolutional neural networks (CNNs) are advanced deep learning models, excelling in processing data with grid-like structures, including images and audio. These networks utilize a combination of convolutional layers, pooling layers, and fully connected layers to progressively distill and amalgamate features from the input data, culminating in performing classification or regression tasks via the fully connected layers. Given that building energy consumption data typically manifest as time series, CNNs adeptly discern long-term dependencies and periodic variations within such data. Through their convolution and pooling operations, CNNs are capable of extracting features across various time scales, thereby facilitating a more nuanced understanding and prediction of building energy consumption trends. Moreover, the prediction of building energy consumption often requires integrating multiple types of data, such as weather conditions and building attributes. CNNs address this by processing diverse data modalities concurrently through a multi-channel input layer, which enhances the fusion and learning of multimodal data features, thereby boosting the accuracy of predictions. Additionally, the spatial structure and layout of buildings play a critical role in energy consumption prediction. CNNs effectively map local features and spatial correlations within architectural structures using convolution operations, which enhances understanding of the spatial dynamics influencing building energy consumption and improves the accuracy of predictions.
Convolutional neural networks (CNNs) hold broad prospects for application in building energy consumption prediction. They can accurately predict and optimally manage building energy by processing time series data, integrating multimodal data, and handling spatial information.
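To illustrate how convolution and pooling extract local features from an energy time series, the following sketch applies a single hand-written 1D filter followed by ReLU and max pooling. The load values and the kernel are hypothetical; in a trained CNN the kernel weights are learned and many filters are stacked.

```python
def conv1d(series, kernel):
    """Valid 1D cross-correlation: slide the kernel along the series."""
    k = len(kernel)
    return [
        sum(kernel[j] * series[i + j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


def relu(values):
    """Keep positive responses and set negative ones to zero."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


# Hypothetical hourly load (kWh) with a morning ramp-up
load = [10.0, 10.0, 11.0, 15.0, 22.0, 24.0, 24.0, 23.0]
ramp_kernel = [-1.0, 0.0, 1.0]  # responds to load rising over a 3-hour window
features = max_pool1d(relu(conv1d(load, ramp_kernel)), size=2)
# features == [5.0, 11.0, 2.0]: the strongest response marks the steepest ramp
```

The pooled output is shorter than the input and summarizes where the pattern encoded by the kernel occurs, which is the sense in which CNNs capture features at coarser time scales.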

3. Applications of Building Energy Consumption Prediction Models at Different Stages

“The entire lifecycle of a building” encompasses the complete process from design and construction to operation, maintenance, renovation, and even demolition. Within this lifecycle, energy consumption is a critical consideration, as the design of and operational approaches to a building directly impact energy efficiency and consumption. Energy consumption prediction throughout the building’s lifecycle requires an understanding of the characteristics and influences at different stages—for example, architectural structure and material choices during the design phase, and user behavior and equipment efficiency during the operational phase. Although there is existing research on building energy consumption prediction, comprehensive studies across different phases of a building’s lifecycle are relatively scarce.
Neural network models for predicting building energy consumption are predominantly applied across the design, operation, and renovation phases of a building’s lifecycle. Analysis of the initially reviewed literature shows that 11% of studies focus on the early design phase, 8% on energy-saving optimization design, and a significant 81% on the operational phase (Figure 5).

3.1. Early Design Stages

3.1.1. Energy Forecasting in Early Design Phases

The ANN training datasets primarily consist of input and output parameters, with inputs including building form, envelope structure, environmental conditions, human activity, historical energy usage, equipment efficiency, and date/time; outputs comprise cooling loads, heating loads, lighting energy, total energy, and electricity usage. These datasets predominantly derive from real data, simulated data, and benchmark data. The literature review indicates that during the building design phase, research focusing on ANN-based energy consumption prediction models is primarily concentrated on residential buildings. Buildings are categorized by their functionality related to user activities, with residential buildings representing 50% of the research focus, followed by 14% on office buildings, 4% on rural buildings, 4% on campus buildings, 7% on commercial structures, and 21% on other types such as industrial and sports facilities. The dataset primarily comprises simulated data from parametric modeling post-energy simulation, accounting for 93%, with a small fraction derived from real historical data. Key building evaluation metrics include mean absolute percentage error (MAPE), root mean square error (RMSE), coefficient of variation of the root mean square error (CV-RMSE), mean absolute error (MAE), mean bias error (MBE), mean squared error (MSE), coefficient of determination (R2), and mean relative error (MRE), with MSE being the most extensively used, accounting for 30%, and MBE and MRE the least utilized, each at only 1%, as depicted in Figure 6. According to Table 3, input parameters are largely focused on enclosure structure and architectural form, while output parameters are mainly on cooling and heating loads.
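The evaluation metrics listed above can be computed directly from paired measured and predicted values. The sketch below uses commonly cited definitions; individual studies may differ in details such as the sign convention for MBE or the normalization used in CV-RMSE. MRE is omitted because its definition varies between sources. The sample values are hypothetical.

```python
import math


def evaluation_metrics(actual, predicted):
    """Return common error metrics for paired actual/predicted values.

    Assumes equal-length, non-empty lists and non-zero actual values
    (MAPE is undefined when an actual value is zero).
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mbe = sum(errors) / n  # positive means over-prediction under this convention
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    cv_rmse = 100.0 * rmse / mean_actual
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MBE": mbe,
            "MAPE": mape, "CV-RMSE": cv_rmse, "R2": r2}


actual = [100.0, 200.0, 300.0, 400.0]     # hypothetical measured loads (kWh)
predicted = [110.0, 190.0, 310.0, 390.0]  # hypothetical model outputs (kWh)
metrics = evaluation_metrics(actual, predicted)
```

In this example the errors alternate in sign, so MBE is zero even though MAE and RMSE are both 10 kWh, which shows why bias and magnitude metrics are usually reported together.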
The primary purpose of energy consumption prediction in the early stages of architectural design is to design the building’s form, identify the main factors affecting energy consumption, choose appropriate enclosure structures, and budget for energy costs, necessitating the collection of extensive datasets. Real data from completed buildings are difficult to collect and lack continuity, making the datasets incomplete and inadequate for training artificial neural networks. Therefore, in the early stages of architectural design for energy consumption prediction, it is common to first select input parameters with strong relevance for parametric modeling, then acquire corresponding output parameters through energy consumption simulations, and after processing the data, input the dataset into an ANN model for training and validation to develop an ANN-based building energy consumption prediction model. Subsequently, optimizations are considered along with costs to obtain the optimal solution, further determining parameters such as architectural form and enclosure structures to achieve the building’s optimal energy-efficient form and enclosure combination.
Utilizing artificial neural networks (ANNs) to predict building energy consumption in the initial phases of architectural design can lead to the development of an energy-efficient building prototype. This approach not only optimizes energy-saving measures during the conceptual design phase but also enhances energy efficiency, manages overall building costs, and ensures economic viability for long-term operations and maintenance.

3.1.2. Selection of ANNs in Architectural Design Phases

According to Figure 7, feedforward neural networks (FFNNs) dominate the literature on predicting building energy consumption during the design phase, with convolutional neural networks (CNNs) gaining traction since 2019. In 2020, X.J. Luo [47] and colleagues proposed three multi-objective prediction frameworks utilizing machine learning technologies (ANNs, support vector regression, and long short-term memory networks) to predict multiple types of energy consumption concurrently. Comparative analyses indicate that ANN-based models excel in both accuracy and processing efficiency.
For energy consumption prediction in the architectural design phase, the FFNN structure is commonly employed due to its suitability for non-temporal data, such as enclosure structure parameters and architectural form parameters, which lack time dependencies, making RNNs inappropriate for this phase. FFNNs are typically chosen, with CNNs selected on rare occasions. CNN models, with their capability for feature extraction and multi-scale prediction, are effective not only in extracting features of a building’s appearance, structure, and layout, but also in analyzing these features at different levels—including structural components, material selection, and spatial layout—by combining them for comprehensive analysis. The feature images of the buildings are then fed into the CNN model to predict energy consumption. Future research might explore using CNNs for building energy consumption prediction.

3.2. Operational Phase of Construction

3.2.1. Application of Energy Forecasting in the Operational Phase

The bulk of current research on “Building Energy Consumption Prediction Based on Artificial Neural Networks” is concentrated on the operational phase, representing 81% of the studies. In this phase, 26% of research targets residential buildings, followed by office buildings at 23%, public buildings at 5%, campus buildings at 16%, commercial buildings at 12%, and other building types such as industrial and sports facilities at 18%. Most studies focus on residential and office buildings, with significant attention given to variables like historical energy consumption, weather conditions, and timing. Given that most subjects of these studies are intelligent buildings, collecting real-time operational data—including temperature, humidity, air quality, lighting intensity, and occupancy rates—is straightforward. This data collection primarily comprises actual data, accounting for 85% of the datasets utilized. However, challenges remain with data gaps, discontinuities, and abrupt changes, directing the majority of research towards short-term predictions, which represent about 70% of all studies. Predictions typically center on electricity usage and overall energy consumption. Commonly used evaluation metrics include MAPE, RMSE, CV-RMSE, MAE, MBE, MSE, R2, and MRE, with MAPE being the most prevalent at 26% usage and MRE the least at only 1%, as depicted in Figure 8.
“Building Energy Consumption Prediction Based on Artificial Neural Networks” research is most extensive during the operational phase as buildings are active and generating substantial data. This phase provides a robust dataset, including historical energy usage, weather conditions, and equipment operations, forming a solid foundation for model development. Researchers often employ real-time data in neural networks, enhancing the credibility and accuracy of their predictive models. Predicting energy consumption during this phase helps facility managers monitor energy use efficiently, swiftly detect deviations, and implement optimization strategies. This not only curtails energy usage but also boosts efficiency, reduces operational costs, and elevates the economic performance of buildings.
An analysis of how the literature on the operational phase of building energy consumption prediction from 2015 to 2023 is distributed reveals a predominance of feedforward neural network-based studies at 43%, with recurrent neural networks following at 29%, and convolutional neural networks at 4%. Feedforward neural networks such as MLP, WNN, ELM, NARX, and RBF are prevalent, with MLP being the most common model used for energy prediction. Recurrent neural network studies are often centered around LSTM, GRU, and Elman networks, with LSTM featured most extensively. Convolutional neural networks are less frequently used compared to other types, but there is a growing trend towards hybrid models and ensemble learning, reflecting a broader diversification in research approaches (Table 4).

3.2.2. Selection of ANNs in the Operational Phase

In recent studies, Lei Gao et al. [134] compared MLP, LSTM, and CNN neural networks, finding that LSTM exhibited the best predictive performance, while MLP was the least effective. Raghavendra Chalapathy et al. [135] contrasted shallow learning models with deep learning models, discovering superior short- and long-term predictive capabilities in LSTM-based RNN-MIMO models. Lei Xu et al. [144] introduced three Bayesian deep neural network models (recurrent neural network, long short-term memory, and gated recurrent unit), among which the Bayesian LSTM (BLSTM) model achieved optimal performance. Yun Duan et al. [186] presented the LSTM-PQR model, which outperformed BPNN in predicting energy consumption in office buildings. Continuous research validates that LSTM surpasses feedforward neural networks in predictive accuracy due to its ability to effectively manage sequential data and capture long-term dependencies, typical of building energy data. Moreover, LSTM’s memory cells can retain and retrieve past information as needed, mitigating gradient vanishing and explosion issues, thereby enhancing its capability to model long-term dependencies. In contrast, feedforward neural networks struggle with long-term dependencies. Additionally, building energy consumption prediction involves multiple influencing factors such as weather, building characteristics, and equipment operational status, constituting multivariate time series data. LSTM proficiently handles such data, better delineating the interrelationships among various factors, thereby improving prediction accuracy.
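The way an LSTM memory cell retains past information can be illustrated with a single-unit cell written out step by step. This is a sketch of the standard gate equations with scalar input and state, not the configuration of any cited model; the parameter values are hypothetical and chosen so that the forget gate stays near 1 and the input gate near 0.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One step of a single-unit LSTM cell with scalar input and state.

    p maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x_t + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # share of the old cell state to keep
    i = sigmoid(pre("input"))         # share of the new candidate to write
    g = math.tanh(pre("candidate"))   # candidate value to write
    o = sigmoid(pre("output"))        # share of the cell state to expose
    c_t = f * c_prev + i * g          # additive update of the memory cell
    h_t = o * math.tanh(c_t)
    return h_t, c_t


# Hypothetical parameters: forget gate near 1, input gate near 0,
# so the cell holds its stored value regardless of the inputs.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
h, c = 0.0, 0.8
for x_t in [0.9, -0.5, 0.3]:
    h, c = lstm_step(x_t, h, c, params)
```

After three steps the cell state remains close to its initial value of 0.8. The additive form of the cell update is the mechanism usually credited with easing the vanishing-gradient problem over long sequences.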
According to Figure 9, recent years have seen a steady increase in research articles on RNN-based building energy consumption predictions, surpassing those based on FFNN. Concurrently, the literature on CNN and hybrid models for predicting building energy consumption continues to grow. Jingyi Zhou et al. [168] believe CNNs can effectively predict energy consumption throughout a building’s entire lifecycle with greater accuracy. Guannan Li et al. [19] conducted comparative assessments of LSTM and its hybrids, such as LSTM-CNN and CNN-LSTM, finding that CNN-LSTM models halve computation times and enhance prediction accuracy. Pingping Chen et al. [200] introduced an attention-based CNN-LSTM model that effectively retains historical information, thereby improving predictive performance for short-term load forecasting in smart buildings. Increasingly, researchers are validating the superior predictive capabilities of CNNs and related hybrid models. Consequently, during the operational phase of buildings, it is advantageous to prioritize RNN-based models for energy consumption predictions when datasets exhibit strong time series characteristics, with potential future exploration towards CNNs and hybrid models.

3.3. Renovation Phase

3.3.1. Application of Energy Consumption Prediction in Renovation Phases

In the building retrofitting phase, research focuses primarily on residential buildings (55%) and office buildings (25%). Smaller shares of research concentrate on rural buildings (5%), campus buildings (10%), and commercial buildings (5%). As Figure 10 illustrates, studies on energy consumption prediction models using artificial neural networks heavily target residential and office buildings. Design variables analyzed during the retrofit phase often center on enclosure structure and environmental parameters.
Data collection poses a challenge in this field. Most research on neural network models for building retrofit prediction relies on simulated data (89%), as collecting historical energy consumption, temperature, humidity, and wind speed data from existing buildings (without pre-installed sensors or flow meters) is difficult. Only 11% of studies utilize real-world datasets. Predicted types of building energy consumption typically include total energy consumption, and heating and cooling loads.
Common evaluation metrics include MAPE, RMSE, CV-RMSE, MAE, MBE, MSE, R2, and MRE. Of these, MSE is most widely used (31% of studies). During the building retrofit phase, energy consumption prediction is often paired with multi-objective optimization to achieve multiple goals following the retrofit. Optimization targets primarily focus on energy consumption, thermal comfort, cost, and lighting. As architectural form parameters are difficult to adjust during retrofits, variables typically concentrate on enclosure structures and equipment layout.

3.3.2. Choice of ANNs for Building Renovations

During the building retrofit phase, the application of artificial neural networks (ANNs) for predicting energy consumption facilitates the assessment of energy efficiency improvements and cost savings pre- and post-retrofit. This assessment provides a solid foundation for evaluating retrofit proposals, informing design choices, and identifying optimal retrofit strategies. Further, analysis of the prediction outcomes enables the fine-tuning of retrofit strategies, thereby maximizing both energy efficiency and economic returns.
In the retrofit phase, ANNs function as meta-models. The process begins with parametric modeling using initial datasets, followed by the acquisition of target parameter data via building performance simulation (BPS) technology. Preprocessing steps such as data cleaning and normalization are then undertaken to ready the data for model training. Objective functions and constraints are subsequently established for specific multi-objective optimization tasks that tailor the model’s structure and parameters for optimized training. Once trained, the model forecasts future energy consumption, offering decision support and directing the implementation of energy management and retrofit measures. This workflow demands an ANN architecture that not only delivers precise predictions but also accommodates swift feedback and real-time adjustments.
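Two steps of this workflow lend themselves to a short sketch: normalization of data before training, and the comparison of retrofit options against several objectives. The example below uses min-max scaling and a simple Pareto (non-dominated) filter as one possible way to handle competing objectives; the reviewed studies use a range of optimization algorithms, and the retrofit options and their predicted values here are entirely hypothetical.

```python
def min_max_normalize(column):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, objs in candidates.items():
        if not any(dominates(other, objs)
                   for other_name, other in candidates.items()
                   if other_name != name):
            front.append(name)
    return front


# Hypothetical retrofit options: (predicted energy use, retrofit cost),
# as might be returned by a trained meta-model and a cost estimate.
candidates = {
    "baseline": (180.0, 0.0),
    "insulation": (140.0, 40.0),
    "windows": (150.0, 55.0),
    "insulation+windows": (120.0, 90.0),
}
energy_scaled = min_max_normalize([objs[0] for objs in candidates.values()])
front = pareto_front(candidates)
# "windows" is excluded: "insulation" achieves lower energy use at lower cost
```

The remaining options represent different trade-offs between energy use and cost, from which a designer would choose according to project priorities.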
As indicated in Table 5, FFNNs are the prevalent choice for predicting energy consumption in building retrofit scenarios, attributed to their straightforward and intuitive nature, which simplifies understanding and implementation. During the retrofit phase, the absence of complex time series data or significant long-term dependencies makes FFNNs ideal due to their rapid training capabilities compared to RNNs and other models. This allows for expedited training and predictions, crucial in retrofit contexts that demand quick feedback and immediate adjustments. Moreover, FFNNs’ adaptability makes them particularly effective in retrofit phases characterized by costly and lower-quality data collection that encompasses diverse data types and predictive tasks, streamlining the model selection and implementation process.
In summary, the selection of FFNNs for predicting energy consumption during the building retrofit phase is justified by their simplicity, swift training speed, wide applicability, and minimal data requirements. Nonetheless, the decision to employ a specific model type should be thoroughly assessed based on the actual conditions and specific demands of the prediction task at hand.

4. Future Research Directions

Predicting building energy consumption proceeds through a series of steps. First, data are gathered on the building’s architecture, weather conditions, historical energy usage, and equipment specifics. Once the data are processed, features are selected and the input and output parameters are established. The data are then divided into training, validation, and testing subsets. Subsequently, a neural network model is constructed, trained, and then evaluated and validated, with adjustments made based on the validation outcomes. Finally, the trained model is deployed in real-world applications to forecast building energy consumption. This entire procedure, depicted in Figure 11, is primarily influenced by data quality, feature selection, choice of model, and optimization strategies. Proper data handling, feature engineering, model selection, and optimization enhance the accuracy and dependability of the predictions.
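The data-splitting step can be illustrated with a short Python sketch. The function name `split_dataset`, the 70/15/15 proportions, and the random shuffling are assumptions made for illustration; the reviewed studies use a variety of split ratios and strategies.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into training, validation, and testing subsets."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test subset")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical dataset: 100 sample indices standing in for (features, energy use) records.
samples = list(range(100))
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

A random split such as this suits simulated design-phase or retrofit-phase data. For operational-phase time series, a chronological split is usually preferable so that the model is not evaluated on periods that precede its training data.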
Reflecting on the prevailing research trends, the future trajectory of this field can be segmented into three principal areas: firstly, enhancing the data preprocessing techniques by choosing methods apt for various stages; secondly, selecting optimal neural network architectures and fine-tuning model parameters during the model building phase; and thirdly, choosing effective algorithms for optimization. These measures are targeted to refine the overall predictive process and elevate the outcomes for building energy consumption predictions.

4.1. Data Processing

Data processing plays a pivotal role in predicting building energy consumption, critically affecting the models’ accuracy and validity. It forms an essential phase in the development of predictive models. Future research will benefit from advancing data cleaning, feature engineering, and integration techniques to boost data quality and model performance. Advanced strategies, including outlier detection and management of missing data, alongside feature selection and dimension reduction, are vital for enhancing both data integrity and predictive efficacy.
In the initial data cleaning phase, measures are implemented to identify and correct anomalies to ensure data accuracy and consistency. Established methods for outlier management include Chauvenet’s criterion [235], Grubbs’ outlier detection [236], and isolation forest (ISF) [190] for handling missing data and excluding outliers. To address incomplete preliminary data, methods such as transfer learning [60,128,129,143,153,171,175,237], exponential moving average (EMA) [142], XGBoost [119], and decision tree classification (DTC) [119] are proposed. During the data cleaning phase, feature selection and extraction are necessary to reduce redundancy and lower dimensionality, thereby enhancing modeling efficiency and accuracy. Proposed methods include PCA [88,115,117], random forest (RF) [170,183], rough set theory [48,238], statistical moments [50], Spearman correlation coefficient [51,175,176], sensitivity analysis [68,88,169,172], regression analysis [58], stacked auto-encoder (SAE) [122], recursive feature elimination (RFE) [119,235], Kalman filter (KF) as a stochastic filtering method [208], differencing for effective information removal [237], Pearson correlation analysis [56,236,239], and mean impact value (MIV) for information filtering [103]. Additionally, clustering analysis techniques are used to group or classify data into sets with similar features, identifying patterns within the data and clarifying its structure and distribution. Techniques include clustering residuals based on Chebyshev distance [51], affinity propagation (AP) clustering [126], K-means, and K-nearest neighbors (KNN) methods [235].
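To make one of the outlier rules above concrete, the following Python sketch applies Chauvenet’s criterion in its textbook form: a reading is flagged when the expected number of equally extreme readings in a sample of size N, under a normal distribution, falls below 0.5. The function name and the meter readings are hypothetical, and this is not necessarily the exact procedure used in [235].

```python
import math
import statistics


def chauvenet_outliers(values):
    """Flag values that Chauvenet's criterion would reject, assuming normally distributed data."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation
    if std == 0:
        return [False] * n
    flags = []
    for v in values:
        z = abs(v - mean) / std
        # Two-sided probability of a deviation at least this large under a normal model.
        prob = math.erfc(z / math.sqrt(2))
        flags.append(n * prob < 0.5)
    return flags


# Hypothetical hourly meter readings (kWh) with one implausible spike.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7, 10.1, 25.0]
flags = chauvenet_outliers(readings)
cleaned = [r for r, is_outlier in zip(readings, flags) if not is_outlier]
print(cleaned)
```

Because the mean and standard deviation are themselves distorted by the outlier, the criterion is conservative on small samples; it is normally applied once rather than repeatedly.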
The aforementioned studies have effectively addressed issues related to data incompleteness, selection, clustering, and outlier management, significantly improving the accuracy of building energy consumption predictions. In the operational phase of buildings, training data primarily consist of real-world data, with most studies relying on data collected from smart meters, while only a few use specialized metering equipment, which greatly impacts the accuracy of operational phase data. In contrast, during the early design and renovation phases of buildings, training data are almost entirely simulated. Therefore, using methods like feature selection and principal component analysis to choose highly correlated input parameters is crucial for obtaining accurate energy consumption simulations, which in turn enhances the accuracy of prediction models. Future research should focus on continuously refining and processing data to improve their quality and reliability, thereby providing a stronger foundation for subsequent modeling and analysis.
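As a minimal sketch of correlation-based feature selection, the following Python fragment computes the Pearson correlation coefficient between each candidate input and the target and keeps only those whose absolute correlation meets a threshold. The feature names, the data values, and the 0.5 threshold are illustrative assumptions rather than values from the reviewed studies.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (norm_x * norm_y)


def select_features(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target meets the threshold."""
    scores = {name: pearson(column, target) for name, column in features.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return selected, scores


# Hypothetical inputs: heating load falls as outdoor temperature rises; "noise" is unrelated.
features = {
    "outdoor_temp": [0.0, 5.0, 10.0, 15.0, 20.0],
    "noise": [1.0, -1.0, 0.0, -1.0, 1.0],
}
heating_load = [100.0, 80.0, 60.0, 40.0, 20.0]

selected, scores = select_features(features, heating_load)
print(selected, scores)
```

Pearson correlation only captures linear relationships, which is one reason the literature also uses rank-based measures such as the Spearman coefficient and model-based measures such as sensitivity analysis.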

4.2. Optimization of Model Parameters and Algorithms

Model parameter optimization is pivotal in predicting building energy consumption. Investigating automated optimization techniques, such as hyperparameter tuning, can greatly improve prediction model accuracy by refining neural network structures and parameters. Numerous studies have already boosted prediction accuracy by employing optimization methods like singular spectrum analysis [177], random search [237], exhaustive grid search [88], Taguchi methods [240], Bayesian optimization [68,87,92,114,116,141,144,241], and genetic algorithms [64,66,108,115,120,136,139,190,242]. These techniques enable more effective exploration of the parameter space, optimizing essential parameters and designing robust experiments to find the best parameter combinations, thereby significantly enhancing model performance and predictive accuracy. Future developments should focus on optimizing neural network structures and incorporating online learning and dynamic adjustments to further improve predictive outcomes and functionality.
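The difference between exhaustive grid search and random search can be sketched in a few lines of Python. The function `validation_error` is a hypothetical stand-in for training an ANN with the given hyperparameters and measuring its validation error; the search space and the location of its optimum are likewise invented for illustration.

```python
import itertools
import random


def validation_error(params):
    """Hypothetical stand-in for training an ANN and returning its validation error."""
    return ((params["hidden_units"] - 32) / 32) ** 2 + (
        (params["learning_rate"] - 0.001) * 100
    ) ** 2


def grid_search(objective, space):
    """Evaluate every combination in the search space and return the best one."""
    names = list(space)
    candidates = (
        dict(zip(names, combo))
        for combo in itertools.product(*(space[name] for name in names))
    )
    return min(candidates, key=objective)


def random_search(objective, space, n_trials, seed=0):
    """Evaluate n_trials randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    trials = [
        {name: rng.choice(options) for name, options in space.items()}
        for _ in range(n_trials)
    ]
    return min(trials, key=objective)


space = {
    "hidden_units": [8, 16, 32, 64, 128],
    "learning_rate": [0.1, 0.01, 0.001, 0.0001],
}
best_grid = grid_search(validation_error, space)
best_random = random_search(validation_error, space, n_trials=10)
print(best_grid, best_random)
```

Grid search is guaranteed to find the best combination on the grid but its cost grows multiplicatively with each added hyperparameter, whereas random search trades that guarantee for a fixed evaluation budget.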
Traditional algorithms face challenges such as limited data volume, complex models, and poor generalization, so the building energy consumption prediction domain needs algorithmic advances to boost model accuracy and robustness. Contemporary research has combined artificial neural networks with various optimization algorithms, including genetic algorithms [64,66,108,115,120,136,139,190,242], population-based incremental learning [220], evolutionary strategies [220], particle swarm optimization (PSO) [45,105,164,219,220,228,243,244], ant colony optimization [176,220], electromagnetic firefly algorithm [46], biogeography-based optimization [220], teaching-learning-based optimization (TLBO) [53,83,235], and grey wolf optimization (GWO) [141,180,228]. These algorithms are used to optimize network parameters and structure, accelerate training, and enhance neural network performance, helping networks better accommodate diverse data and tasks and thereby improving predictive power and generalization. Additionally, regularization methods such as dropout [142,144] and improved optimization algorithms such as iPSO [115,154] and iTLBO [52] have substantially improved the efficiency and effectiveness of neural networks.
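To illustrate how a population-based optimizer of this kind operates, the following Python sketch implements a basic global-best particle swarm optimization loop. The inertia and acceleration coefficients are commonly used textbook values, and the two-dimensional quadratic objective is a hypothetical stand-in for an ANN validation error expressed as a function of two tunable parameters; neither is drawn from the cited studies.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize objective over box bounds with a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_score.__getitem__)
    global_best, global_score = personal_best[best_idx][:], personal_score[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward own best, and pull toward swarm best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                # Clamp the new position to the search bounds.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            score = objective(positions[i])
            if score < personal_score[i]:
                personal_best[i], personal_score[i] = positions[i][:], score
                if score < global_score:
                    global_best, global_score = positions[i][:], score
    return global_best, global_score


def surrogate_error(x):
    """Hypothetical error surface with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_position, best_score = pso_minimize(surrogate_error, [(-5.0, 5.0), (-5.0, 5.0)])
print(best_position, best_score)
```

In an actual ANN application, each objective evaluation would involve training and validating a network, so the number of particles and iterations directly determines the computational cost.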
An increasing number of studies are now shifting from trial-and-error methods to algorithm optimization for setting model parameters. However, traditional algorithms often lack sufficient generalizability. Future research should employ improved algorithms, selecting the most suitable ones based on the specific requirements of the problem and the characteristics of the data. This approach will enhance generalizability and efficiency, improve model interpretability and explainability, and ultimately increase the accuracy of building energy consumption predictions.

4.3. Applications of Integrated and Hybrid Models

In the process of constructing neural network models, the choice of neural network architecture is crucial. The previous sections proposed trends in selecting different neural network structures at different building phases, noting an increasing application of ensemble learning and hybrid models. A substantial body of literature now exists [204,205,206,207,208,209,210,211,212,213,214,215] on predicting building energy consumption using ensemble learning methods, as well as numerous studies [19,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203] using hybrid models for precise energy consumption predictions. Results demonstrate that ensemble machine learning models possess good predictive accuracy in forecasting building energy consumption.
Ensemble learning has proven effective in the domain of building energy consumption prediction. It combines multiple base models to enhance predictive performance and generalization, and includes methods such as random forests, gradient boosting trees, adaptive boosting, and stacked ensembling. Hybrid models, which combine different types of models to enhance predictive performance and robustness, are also employed in building energy consumption prediction. Hybrid models can integrate traditional statistical models (such as linear regression and logistic regression) with neural networks, leveraging the strengths of both to improve prediction accuracy and generalization. They can also combine physics-based models with ensemble and deep learning approaches, utilizing both the physical properties of buildings and extensive empirical data for precise predictions of building energy consumption.
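The idea of combining base models can be sketched with one simple ensembling scheme, a weighted average in which each model’s weight is inversely proportional to its mean squared error. This is only one of many possible combination rules and is not claimed to be the scheme used in the cited studies; the two base models and their biases are hypothetical.

```python
def mse(predictions, truth):
    """Mean squared error between predictions and observed values."""
    return sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth)


def inverse_error_weights(model_predictions, truth):
    """Weight each base model by the inverse of its error (assumes no model has zero error)."""
    inverse = [1.0 / mse(p, truth) for p in model_predictions]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_predict(model_predictions, weights):
    """Weighted average of the base-model predictions, sample by sample."""
    n_samples = len(model_predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, model_predictions))
        for i in range(n_samples)
    ]


# Hypothetical data: model A over-predicts by 2 units, model B under-predicts by 4 units.
truth = [100.0, 120.0, 90.0, 110.0]
model_a = [t + 2.0 for t in truth]
model_b = [t - 4.0 for t in truth]

weights = inverse_error_weights([model_a, model_b], truth)
combined = ensemble_predict([model_a, model_b], weights)
print(weights, mse(combined, truth))
```

Because the two base models err in opposite directions, the weighted average partly cancels their biases and achieves a lower error than either model alone. In practice, the weights would be estimated on a validation subset separate from the data used for the final evaluation.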
Research in the operational phase of buildings often requires extensive real-world data, making the application of ensemble learning and hybrid models particularly impactful during this stage. These methods can significantly enhance prediction performance. In the future, as building data and models continue to accumulate, ensemble learning and hybrid models will play an increasingly vital role. The development of these approaches may focus on enhancing the diversity in model combinations, dynamic weight adjustment and adaptive learning, integration of heterogeneous models, combining ensemble learning with deep learning, applying online and incremental learning, and improving model interpretability and transparency. These advancements will further improve the accuracy, stability, and interpretability of building energy consumption predictions, driving the development and application of energy consumption prediction technologies.

5. Conclusions

This paper reviews 292 papers published from 2015 to 2023 and summarizes the characteristics of the different ANN structures employed in building energy consumption prediction. By analyzing these structures and their practical applications in the design, operation, and renovation phases of buildings, it highlights how input types, building types, energy types, and temporal characteristics are used in energy consumption prediction models, and it proposes how to select appropriate ANN structures for each building phase.
In the early design phase, fully connected feedforward neural networks (FFNNs) are popular due to their simplicity and efficiency, while convolutional neural networks (CNNs) are favored for their efficiency in processing spatial data. During the operational phase, recurrent neural networks (RNNs), particularly long short-term memory networks (LSTMs), are predominant for their ability to handle time series data, leading to improved prediction accuracy; CNNs and hybrid models are also increasingly used for their strong capabilities in managing complex data structures. For the building renovation phase, FFNNs are preferred due to their usability, quick training, and broad applicability, allowing for swift evaluation and adjustments to building performance.
Based on the literature review and analysis, future research directions may include the following:
  • Exploring improved optimization algorithms to enhance the performance of artificial neural networks;
  • Exploring optimization of neural network structures and parameter settings to enable online learning and dynamic adjustments to enhance prediction outcomes;
  • Investigating the application of ensemble learning and hybrid models for dynamic weight adjustment and adaptive learning, integration of heterogeneous models, online and incremental learning, and enhancing model interpretability and transparency;
  • Exploring more effective data cleaning, feature engineering, and data integration methods to improve data quality and model predictive performance.

