Open Access Article

Peer-Review Record

Optimizing the Operation of Grid-Interactive Efficient Buildings (GEBs) Using Machine Learning

Sustainability 2024, 16(20), 8752; https://doi.org/10.3390/su16208752

by Czarina Copiaco and Mutasim Nour *

Reviewer 1: Byung Ki Jeon

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 31 July 2024 / Revised: 5 October 2024 / Accepted: 8 October 2024 / Published: 10 October 2024

(This article belongs to the Section Energy Sustainability)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Review Comment:

This paper aims to predict and optimize the energy consumption of grid-interactive efficient buildings (GEBs) using machine learning, particularly focusing on achieving high prediction accuracy through the use of Gaussian Process Regression (GPR). Additionally, the study is notable for validating the model with real data and enhancing practicality with a user-friendly GUI. This research makes a significant contribution by considering the practical application of energy management systems.

Improvements and Questions:

Lack of Clear Rationale for Choosing GPR and Theoretical Explanation:

The paper does not clearly explain why GPR was chosen. It is necessary to provide a specific comparison and explanation of why GPR outperforms other machine learning techniques (e.g., linear regression, decision trees, neural networks). Additionally, the theoretical background of GPR, including how it handles data variability and the fact that it does not require assumptions about the overall distribution of the data due to its non-parametric nature, should be clearly explained.
GPR operates by not assuming a specific functional form for the prediction model, instead learning directly from the data to find the optimal function form using a non-parametric approach. The model calculates a probability distribution over all possible functions from the training data, and when making predictions, it uses these distributions to perform predictions while accounting for uncertainty.
Therefore, to clearly justify why GPR offers superior performance compared to other models, it is important to include a detailed explanation of the theoretical background and the working principles of GPR. Adding this content to the paper would be crucial to helping readers understand why GPR was an appropriate choice for this study.

Equation Supplementation (I consider this the most critical area for improvement):

The paper lacks sufficient equations throughout. Specifically, equations (4-5) are critical to this paper. Simply inserting data into algorithms provided by simulation tools does not constitute a proper paper. Users must fully understand the algorithm, and it is essential for the authors to explain the working principles. I recommend significantly supplementing the equations related to the algorithms used. Additionally, the concepts and equations regarding what a Gaussian Process is, the role of kernel functions and their various options, and the calculation of predictive distributions using covariance matrices should also be included.

Generalization of the Model:

The study was validated using a specific dataset (from the Texas university campus), raising concerns about the generalizability of the model. There is a lack of discussion on whether this model would perform equally well in different environments or building types. I suggest additional validation using other datasets or environments to address this issue and enhance the generalization of the model.

Clarification of Data Requirements:

The paper should provide more detailed explanations about the types and quality of data needed for the model to function correctly. For instance, how the model would perform in existing buildings where historical data might be lacking, and the challenges and solutions related to data collection should be further discussed.

Model Maintenance and Recalibration:

The paper mentions the need for periodic recalibration to maintain model performance, but there is a lack of discussion about the potential costs and technical challenges associated with this process. I request a more detailed explanation of how these maintenance processes would impact actual operations.

Challenges in GUI Integration and Practical Implementation:

The paper needs to provide a more detailed explanation of the challenges expected during the integration of the GUI and the practical implementation of the system. Specifically, the difficulties related to data collection, model maintenance, and user training during the actual implementation phase and the strategies to overcome these challenges should be clearly presented. I would appreciate the authors’ further insights on these issues.

Author Response

Improvements and Questions:

Comment 1: Lack of Clear Rationale for Choosing GPR and Theoretical Explanation:

The paper does not clearly explain why GPR was chosen. It is necessary to provide a specific comparison and explanation of why GPR outperforms other machine learning techniques (e.g., linear regression, decision trees, neural networks). Additionally, the theoretical background of GPR, including how it handles data variability and the fact that it does not require assumptions about the overall distribution of the data due to its non-parametric nature, should be clearly explained.

GPR operates by not assuming a specific functional form for the prediction model, instead learning directly from the data to find the optimal function form using a non-parametric approach. The model calculates a probability distribution over all possible functions from the training data, and when making predictions, it uses these distributions to perform predictions while accounting for uncertainty.

Therefore, to clearly justify why GPR offers superior performance compared to other models, it is important to include a detailed explanation of the theoretical background and the working principles of GPR. Adding this content to the paper would be crucial to helping readers understand why GPR was an appropriate choice for this study.

Response to comment 1:

This comment was addressed in two parts as detailed below:

Section 1.2 – Detailed working principles of GPR were added to provide further theoretical background for readers’ reference. This section introduces GPR models as non-parametric models that employ a Gaussian process, which defines a probability distribution over functions and accounts for uncertainty estimates in data predictions.

The concept of the kernel was likewise elaborated, with a focus on the exponential kernel, which is particularly effective in handling large datasets and achieving smooth functions with minimal errors. Refer to the response to comment 2 for details on algorithm equations and training specifics.

Section 3.2.3 – Following Figures 6 and 7 which highlight GPR as the top performing algorithm in both building and grid-level electric demand forecasting, reference to Section 4.2 explaining specific algorithm characteristics that led to the results acquired was added.

Section 4.2 provides further insight on the characteristics of exponential GPR which led to its consistent outperformance over other algorithms. These are described as follows:

Non-Parametric: It makes no assumptions about the entire data population based on the sample training dataset. Algorithms in this category often demonstrate higher robustness with datasets featuring large distribution measures.
Bayesian Approach: It applies a probability distribution over all possible values, enabling the provision of predictions with uncertainty measurements.
Exponential GPR Kernel: This feature facilitates effective handling of large datasets. When combined with the described pre-processing methods, smooth functions can be achieved with minimal errors.

Comment 2: Equation Supplementation (I consider this the most critical area for improvement):

The paper lacks sufficient equations throughout. Specifically, equations (4-5) are critical to this paper. Simply inserting data into algorithms provided by simulation tools does not constitute a proper paper. Users must fully understand the algorithm, and it is essential for the authors to explain the working principles. I recommend significantly supplementing the equations related to the algorithms used. Additionally, the concepts and equations regarding what a Gaussian Process is, the role of kernel functions and their various options, and the calculation of predictive distributions using covariance matrices should also be included.

Response to comment 2:

Further details on the working principles of the linear regression and exponential GPR algorithms have been added in Section 1.2. In particular, this includes the training processes related to equations 4-5 as shown below. Please refer to the revised manuscript attached for references.

Linear regression (LR) is a learning algorithm that fits a linear function for predicting continuous outputs. Its limitation lies in this linearity, which may not suit scenarios with more complex relationships between the input variables and the predicted output. In LR, weight parameters are assigned to the training features and iteratively adjusted to minimize errors. Observed data are used to estimate the coefficients that minimize the sum of squared residuals during training. The coefficients represent the magnitude and direction of the relationship between each independent variable and the dependent variable. The trained model can then predict new data points by inputting the independent variables into the linear equation.

$$h_\theta = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots$$

Where $h_\theta$ = predicted output, $\theta_n$ = weight parameter, $x_n$ = training features.
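The iterative weight adjustment described above can be illustrated with a minimal sketch. It is not the authors' implementation (the responses state that MATLAB Regression Learner was used); the function names, learning rate, epoch count, and toy data are all assumptions chosen for illustration, and batch gradient descent is only one of several ways to minimize the squared residuals.

```python
def lr_predict(theta, x):
    """h_theta = theta_0 + theta_1*x_1 + theta_2*x_2 + ... for one sample x."""
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


def fit_linear_regression(X, y, lr=0.05, epochs=5000):
    """Batch gradient descent on the mean squared residual (illustrative settings)."""
    n_samples, n_features = len(X), len(X[0])
    theta = [0.0] * (n_features + 1)
    for _ in range(epochs):
        # Residual of every training sample under the current weights
        residuals = [lr_predict(theta, x) - target for x, target in zip(X, y)]
        # Gradient with respect to the intercept, then each feature weight
        grad = [sum(residuals) / n_samples]
        for j in range(n_features):
            grad.append(sum(r * x[j] for r, x in zip(residuals, X)) / n_samples)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical toy data generated from load = 2 + 3 * x_1
X = [[0.0], [1.0], [2.0], [3.0]]
y = [2.0, 5.0, 8.0, 11.0]
theta = fit_linear_regression(X, y)
forecast = lr_predict(theta, [4.0])
```

On this noise-free toy set the learned weights approach the generating values, and `forecast` is the prediction for an unseen input.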

Gaussian Process Regression (GPR) models are frequently used in statistical modelling and pattern recognition. These are non-parametric models that employ a Gaussian process, which defines a probability distribution over functions and accounts for uncertainty estimates in data predictions.

The covariance function, or kernel, determines the characteristics of the functions by defining the relationship between points in the input space. One example is the exponential kernel (6), which is particularly effective in handling large datasets and achieving smooth functions with minimal errors.

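$$f(x) \sim \mathcal{GP}\big(m(x),\, k(x_i, x_j \mid \theta)\big)$$

$$k(x_i, x_j \mid \theta) = \sigma_f^2 \exp\left(-\frac{r}{\sigma_l}\right)$$

Where $m(x)$ = mean function, $r = \sqrt{(x_i - x_j)^T (x_i - x_j)}$, $\theta$ = maximum a posteriori estimate, $\sigma_f$ = standard deviation, $\sigma_l$ = length scale.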

Training involves learning the kernel hyperparameters by maximizing the marginal likelihood of the observed data. Predictions for new data points are then made by computing the mean and variance from the covariance matrix built with the learned kernel.
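The training and prediction steps just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB workflow: it assumes a zero prior mean, a small fixed noise term `sigma_n`, a fixed `sigma_f`, and a coarse grid search over the length scale in place of the gradient-based marginal-likelihood maximization a toolbox would typically run. All function names and the toy data are hypothetical.

```python
import math


def exp_kernel(xi, xj, sigma_f, sigma_l):
    """Exponential kernel: sigma_f^2 * exp(-r / sigma_l), r = Euclidean distance."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))
    return sigma_f ** 2 * math.exp(-r / sigma_l)


def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L


def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_fit(X, y, sigma_f, sigma_l, sigma_n=1e-3):
    """Factorise K + sigma_n^2 I; return (L, alpha, log marginal likelihood)."""
    n = len(X)
    K = [[exp_kernel(X[i], X[j], sigma_f, sigma_l) + (sigma_n ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = chol_solve(L, y)  # alpha = (K + sigma_n^2 I)^-1 y
    lml = (-0.5 * sum(a * t for a, t in zip(alpha, y))
           - sum(math.log(L[i][i]) for i in range(n))
           - 0.5 * n * math.log(2 * math.pi))
    return L, alpha, lml


def gp_predict(X, L, alpha, x_new, sigma_f, sigma_l):
    """Predictive mean and variance at x_new (zero prior mean assumed)."""
    k_star = [exp_kernel(x, x_new, sigma_f, sigma_l) for x in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = chol_solve(L, k_star)
    var = exp_kernel(x_new, x_new, sigma_f, sigma_l) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)  # clip tiny negative round-off


# Hypothetical toy data: hour of day -> load in kW
X = [[0.0], [6.0], [12.0], [18.0]]
y = [1.0, 2.0, 4.0, 3.0]
sigma_f = 2.0
grid = [1.0, 3.0, 10.0, 30.0]

# Keep the length scale with the highest log marginal likelihood
best_l = max(grid, key=lambda l: gp_fit(X, y, sigma_f, l)[2])
L, alpha, lml = gp_fit(X, y, sigma_f, best_l)
mean_at_6, var_at_6 = gp_predict(X, L, alpha, [6.0], sigma_f, best_l)
mean_at_9, var_at_9 = gp_predict(X, L, alpha, [9.0], sigma_f, best_l)
```

The predictive variance is close to zero at a training input and larger between observations, which is the uncertainty estimate the response refers to.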

Comment 3: Generalization of the Model:

The study was validated using a specific dataset (from the Texas university campus), raising concerns about the generalizability of the model. There is a lack of discussion on whether this model would perform equally well in different environments or building types. I suggest additional validation using other datasets or environments to address this issue and enhance the generalization of the model.

Response to comment 3:

Noted. Model training has been done using a residential building from another database, with results presented in Section 3.2.3. The calculated nRMSE was recorded as 0.59%, which is in line with the 0.52% figure noted using the Texas university campus dataset, confirming the applicability of the demand forecasting model to buildings with varying usage and location.

Comment 4: Clarification of Data Requirements:

The paper should provide more detailed explanations about the types and quality of data needed for the model to function correctly. For instance, how the model would perform in existing buildings where historical data might be lacking, and the challenges and solutions related to data collection should be further discussed.

Response to comment 4:

A paragraph has been added in Section 4.4 Recommendations to discuss data requirements for existing buildings alongside potential limitations, challenges, and solutions as summarized below:

In the absence of actual historical data, previous data recordings from a building of similar scale, usage, and location may likewise be considered for existing buildings.
Where the recorded data types vary between different buildings, it is recommended to perform feature selection by parameter weighing for each new dataset to accurately identify features that would enhance forecasting performance.
A regular frequency of data recordings is essential in pattern recognition. In cases where this was not implemented, data shall be pre-processed such that recording frequencies are maintained by limiting model inputs as per the largest (worst-case) timestamp interval.
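One possible reading of the last recommendation is sketched below: find the largest gap between consecutive readings and average the data into bins of that width, so every bin contains at least one reading. The function name, the choice of averaging, and the sample readings are assumptions for illustration; the response does not specify how the pre-processing was implemented.

```python
from datetime import datetime, timedelta


def resample_to_worst_interval(readings):
    """Average (timestamp, kW) readings into bins as wide as the largest gap.

    Assumes at least two readings with distinct timestamps.
    """
    readings = sorted(readings)
    gaps = [later[0] - earlier[0] for earlier, later in zip(readings, readings[1:])]
    step = max(gaps)  # worst-case recording interval
    start = readings[0][0]
    bins = {}
    for timestamp, kw in readings:
        index = (timestamp - start) // step  # timedelta // timedelta gives an int
        bins.setdefault(index, []).append(kw)
    resampled = [(start + i * step, sum(v) / len(v)) for i, v in sorted(bins.items())]
    return step, resampled


# Hypothetical 15-minute readings with one hour-long gap
day = datetime(2024, 1, 1)
readings = [
    (day + timedelta(minutes=0), 10.0),
    (day + timedelta(minutes=15), 12.0),
    (day + timedelta(minutes=30), 14.0),
    (day + timedelta(minutes=45), 16.0),
    (day + timedelta(minutes=105), 20.0),
]
step, resampled = resample_to_worst_interval(readings)
```

Because consecutive readings are never further apart than `step`, no bin between the first and last reading is left empty, which keeps the model inputs on a regular grid.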

Comment 5: Model Maintenance and Recalibration:

The paper mentions the need for periodic recalibration to maintain model performance, but there is a lack of discussion about the potential costs and technical challenges associated with this process. I request a more detailed explanation of how these maintenance processes would impact actual operations.

Response to comment 5:

Further sequencing information on the recommendation for annual model calibration process has been added to Section 4.4 as per the following:

Data scientist/engineer to retrain the model if required through a thorough evaluation of previous forecasting performance for error reduction.
The existing forecasting model is able to operate as normal during this process.
Replacement of existing model with recalibrated model to be performed between reading intervals to avoid service interruptions.
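The sequencing above, in which the existing model keeps serving while a recalibrated one is prepared and the swap happens between reading intervals, can be sketched as follows. The class, its method names, and the two stand-in models are hypothetical and serve only to illustrate the ordering of the steps.

```python
class ForecastService:
    """Serves forecasts and swaps in a recalibrated model only between readings."""

    def __init__(self, model):
        self._active = model
        self._pending = None

    def stage(self, recalibrated_model):
        """Queue a retrained model; the active one keeps serving meanwhile."""
        self._pending = recalibrated_model

    def on_reading(self, features):
        """Handle one reading interval, applying any staged swap first."""
        if self._pending is not None:
            self._active, self._pending = self._pending, None
        return self._active(features)


# Stand-in models (hypothetical): any callable mapping features to kW
def old_model(features):
    return 100.0 + features["temperature"]


def new_model(features):
    return 95.0 + 1.2 * features["temperature"]


service = ForecastService(old_model)
first = service.on_reading({"temperature": 30.0})   # served by the existing model
service.stage(new_model)                            # recalibration finishes mid-interval
second = service.on_reading({"temperature": 30.0})  # swap applied at the next reading
```

Staging rather than replacing immediately means no reading is ever handled by a half-updated model.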

Comment 6: Challenges in GUI Integration and Practical Implementation:

The paper needs to provide a more detailed explanation of the challenges expected during the integration of the GUI and the practical implementation of the system. Specifically, the difficulties related to data collection, model maintenance, and user training during the actual implementation phase and the strategies to overcome these challenges should be clearly presented. I would appreciate the authors’ further insights on these issues.

Response to comment 6:

Section 3.4 has been added which summarises user input requirements and information exchange in the proposed GUI implementation. This includes a screenshot of the developed GUI for building classification where pre-recorded readings aren’t available, alongside a narrative on required inputs:

Users are prompted to enter building characteristics and forecasted weather data, which are then used to predict the final electric load in kW. Additional functionalities related to the case study are presented in separate tabs within the GUI window.
Predicted weather data can be automatically imported from source to eradicate risks of human error. Likewise, predicted building electric demand load figures can be shared directly to the power grid system and vice versa for effective demand management.

Please refer to Recommendations Section 4.4 & comment response 4 for details on challenges and solutions related to data collection and model maintenance.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors do not provide details on the specific contributions and novelty of this paper compared to existing literature, methods, or algorithms. They do not provide mathematical details on the optimization algorithms and do not compare machine learning algorithms. The authors write about ANN but do not provide specifics, layers, etc. There is no model validation.

Author Response

Comment 1:

Response to Comment 1:

Please refer to comment responses below:

In Section 2.1, the paper summarizes the calculated nRMSE of various forecasting algorithms from existing literature. Apart from the proposed methodology to standardize algorithm comparisons and pre-processing/feature selection processes, Section 3.2.2 highlights that the optimized exponential GPR method led to a 0.3% nRMSE reduction compared to the ‘leading’ linear regression methodology from existing literature. Other contributions such as case study development, energy savings estimation, and enhanced forecasting flexibility were then discussed in Section 4 leading to recommendations for future work.
Further details on the workings of the linear regression and exponential GPR algorithms with comparisons have been added in Section 1.2. These include information on how parameters are handled in each iteration of the algorithm, ultimately forming a ‘trained’ model that is adept at load forecasting from previous demand information. Mathematical details of optimization algorithms were likewise added in Section 1.3. Following the full methodology description, further discussion in Section 4.2 explains characteristics of the exponential GPR algorithm which led to its consistent outperformance over others which are summarised as follows.
Non-Parametric: It makes no assumptions about the entire data population based on the sample training dataset. Algorithms in this category often demonstrate higher robustness with datasets featuring large distribution measures.
Bayesian Approach: It applies a probability distribution over all possible values, enabling the provision of predictions with uncertainty measurements.
Exponential GPR Kernel: This feature facilitates effective handling of large datasets. When combined with the described pre-processing methods, smooth functions can be achieved with minimal errors.
Brief description of ANN was provided in Section 1.2 as background information referred from existing literature which evaluated its forecasting performance. However, as it was outperformed by algorithms such as linear regression, it did not form part of this study’s test iterations. Hence, it is proposed to focus theoretical details on methods such as linear regression and exponential GPR which form the in-depth analysis of this research.
Model training has been extended using a residential building from another database for validation, with results presented in Section 3.2.3. The calculated nRMSE was recorded as 0.59%, which is in line with the 0.52% figure noted using the Texas University campus dataset, confirming the applicability of the demand forecasting model to buildings with varying usage and location.

Reviewer 3 Report

Comments and Suggestions for Authors

This paper explores how machine learning techniques can be used to optimise the operation of Grid-Interactive Efficient Buildings (GEBs), focusing primarily on the impact of weather and building usage parameters on the performance and interactivity of electrical load forecasting systems through research. However, the paper has several points that require further explanatory notes from the authors:

[1] Although the paper mentions the application of a variety of machine learning models, it does not provide sufficient justification as to why these particular models were chosen. It is recommended that this section be added to explain the criteria for model selection.

[2] For the experimental results, specific figures are provided, but an in-depth discussion of the results is lacking. It is recommended that some analysis be added on why certain models perform better than others, or the differences in performance under different conditions.

[3] In the conclusion section, please further suggest directions for future research and what are the implications of the research?

[4] The paper mentions that the energy savings were verified through ETAP and Trimble ProDesign software, however it does not go into detail on how these software were used for the verification. The authors are requested to present on this section.

[5] In this paper, the author focuses on using artificial intelligence methods in grid demand management, and different artificial intelligence methods can be compared and analyzed and further improved, which can refer to:

[a] IEEE Transactions on Industrial Informatics, vol. 18, no. 2, pp. 835-846, 2022

[b] IEEE Transactions on Industrial Informatics, vol. 19, no. 11, pp. 10751-10762, 2023

[c] IEEE Transactions on Power Systems, vol. 34, no. 2, pp. 1653-1656, March 2019

[d] IEEE Transactions on Power Electronics, vol. 36, no. 1, pp. 73-77, Jan. 2021

Author Response

Comment 1: Although the paper mentions the application of a variety of machine learning models, it does not provide sufficient justification as to why these particular models were chosen. It is recommended that this section be added to explain the criteria for model selection.

Response to Comment 1:

As noted in Section 1.1.2, performance criteria such as the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) are common metrics for measuring the performance of predictive models by quantifying the differences between actual and forecasted values. All available algorithms from MATLAB Regression Learner have been trained and tested with a common database. This includes the top-performing ‘linear regression’ algorithm, which was identified from the normalized RMSE list in the literature review. The results show the lowest error figures for Exponential GPR in both building and grid electric demand forecasting applications, and hence it was chosen for the final case study simulation as discussed in Section 3.2.3.
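The error measures named above can be computed as in the following minimal sketch. The sample values are hypothetical, and normalizing the RMSE by the range of the actual values is an assumption, since the response does not state which normalization the manuscript uses (dividing by the mean is another common convention).

```python
import math


def rmse(actual, forecast):
    """Root mean square error between actual and forecasted values."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error between actual and forecasted values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def nrmse_percent(actual, forecast):
    """RMSE divided by the range of the actual values, in percent (assumed convention)."""
    return 100.0 * rmse(actual, forecast) / (max(actual) - min(actual))


# Hypothetical electric load readings in kW
actual = [100.0, 120.0, 140.0, 160.0]
forecast = [102.0, 118.0, 141.0, 159.0]

error_rmse = rmse(actual, forecast)
error_mae = mae(actual, forecast)
error_nrmse = nrmse_percent(actual, forecast)
```

Normalizing makes error figures comparable across buildings with different load magnitudes, which is why a normalized RMSE is convenient for ranking algorithms trained on different datasets.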

Comment 2: For the experimental results, specific figures are provided, but an in-depth discussion of the results is lacking. It is recommended that some analysis be added on why certain models perform better than others, or the differences in performance under different conditions.

Response to Comment 2:

Following Figures 6 and 7 which highlight exponential GPR as the top performing algorithm in both building and grid-level electric demand forecasting, reference to Section 4.2 explaining specific algorithm characteristics that led to the results acquired was added in Section 3.2.3. This discussion section can be summarised as follows:

Non-Parametric: It makes no assumptions about the entire data population based on the sample training dataset. Algorithms in this category often demonstrate higher robustness with datasets featuring large distribution measures.
Bayesian Approach: It applies a probability distribution over all possible values, enabling the provision of predictions with uncertainty measurements.
Exponential GPR Kernel: This feature facilitates effective handling of large datasets. When combined with the described pre-processing methods, smooth functions can be achieved with minimal errors.

Comment 3: In the conclusion section, please further suggest directions for future research and what are the implications of the research?

Response to Comment 3:

Section 4.5 highlights the research implications and lists several potential areas for further research and investigation. These recommendations include:

System Implementation in Operational Buildings: Involving regular performance assessments to gauge its effectiveness in real-world scenarios.
Expanded Database Sources: Using additional database sources to further validate and enhance the obtained results.
Integration Effects of External Systems: Investigating the integration effects of other external systems on building electric demand to understand how various factors influence forecasting accuracy.
Study on Different Generation Technologies: Conducting a detailed study on the effects of different types of generation panels and capacities on overall building and grid demand to optimize energy generation.
Extended Applications of Load Forecasting: Exploring the implementation of load forecasting on extended areas of research such as information exchange security and larger-scale renewable energy generation.

Comment 4: The paper mentions that the energy savings were verified through ETAP and Trimble ProDesign software, however it does not go into detail on how these software were used for the verification. The authors are requested to present on this section.

Response to Comment 4:

Item 4 in project objectives has been reworded to allude to ‘network stability verification’ done using ETAP and ProDesign instead of energy savings verification.

As discussed in Section 3.3, the full grid-to-building network has been set up in ETAP considering the listed assumptions. LV network stability, in terms of current flow to each campus building in the case study, was then verified in ProDesign through a comprehensive simulation to ensure that parameters such as voltage drops are well within industry limits. This confirms the applicability of the case study to real-world scenarios, which served as the basis for accurate energy savings estimation.

Comment 5: In this paper, the author focuses on using artificial intelligence methods in grid demand management, and different artificial intelligence methods can be compared and analyzed and further improved, which can refer to:

[a] IEEE Transactions on Industrial Informatics, vol. 18, no. 2, pp. 835-846, 2022

[b] IEEE Transactions on Industrial Informatics, vol. 19, no. 11, pp. 10751-10762, 2023

[c] IEEE Transactions on Power Systems, vol. 34, no. 2, pp. 1653-1656, March 2019

[d] IEEE Transactions on Power Electronics, vol. 36, no. 1, pp. 73-77, Jan. 2021

Response to Comment 5:

The machine learning aspect in this paper heavily focuses on identification of optimal regression algorithm for electric load forecasting in line with other measures such as feature selection by parameter weighting and effects of varying training data duration. The papers mentioned above are well noted and can be used for further analysis on supplemental research related to information exchange security and larger-scale renewable energy integration.

Section 4.5 recommendations for future work have been updated to include further study on incorporation with related developments mentioned above. Please refer to comment 3 response for details.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revisions have been well incorporated. Please review the paper again to check for any typographical or related errors. For example, in line 100, it says "equation [14]."

Author Response

Comment 1:

Thank you for your comment. The revisions have been well incorporated. Please review the paper again to check for any typographical or related errors. For example, in line 100, it says “equation [14].”

Response to comment 1:

The manuscript has been checked thoroughly for typographical errors.

“equation [14]” has been changed to “equation (4) [14]”, since (4) refers to Equation 4 and [14] is the citation of reference 14.

Reviewer 2 Report

Comments and Suggestions for Authors

The paper was improved. No optional issues are found

Author Response

Comment 1:

The paper was improved. No optional issues are found

Response to Comment 1:

Thank you for your comment confirming that no issues were found.

Reviewer 3 Report

Comments and Suggestions for Authors

Revisions made, manuscript now acceptable

Author Response

Comment 1:

Revisions made, manuscript now acceptable

Response to Comment 1:

Thank you for your comments confirming that the required revisions have been made and that there are no further comments to attend to.
