3. Results and Discussions
This study evaluated energy consumption projections for the Valladolid campus using six artificial intelligence models: XGBoost, random forest, autoencoder, radial basis function, artificial neural network, and decision tree. The campus-level evaluation revealed differences among the models used to forecast energy use. Some models were more consistent when applied to large aggregate consumption values, while others showed more variability when applied to smaller consumption groups. Variability was also found in the evaluation of the 27 individual buildings, with model performance varying with the size, purpose, and data availability of each building.
A comparison of the AI models across individual buildings indicated that model performance depends on several structural and operational aspects. Buildings with stable and predictable energy demands exhibited one prediction profile, while buildings with stochastic energy consumption exhibited a different one. Some AI models responded well to this complexity and delivered accurate predictions, while the performance of others was sensitive to the variability of the data. Given this discrepancy, we suggest that the relative energy prediction performance of an AI model is very likely related to both the model structure and the energy consumption profile of the building.
The integration of heat exchangers into campus energy systems can significantly affect total energy consumption and CO2 emissions. Heat exchangers recover waste heat and increase thermal efficiency, which can reduce the quantity of primary energy needed and the carbon emissions associated with its use. The energy source affects how much CO2 is avoided: systems powered by renewable energy will reduce emissions more than those powered by fossil fuels. Also, while heat exchangers improve the effective use of energy, system design, maintenance, and operating conditions will affect the efficiency they achieve. Exploring these impacts, possibly in conjunction with AI energy models, could provide deeper insights into how to minimize emissions while maintaining energy efficiency in various building types.
3.1. Validation of the Whole Campus
The Valladolid campus conducts focused research on energy conservation and sustainability, with special attention to heat exchanger systems. These are fundamental systems that minimize energy waste by transferring heat between fluids. The facilities have incorporated technology that lowers system energy consumption by 30% in most instances relative to conventional systems. This commitment to conservation improves efficiency and sets an example of environmentally sustainable practice for the rest of the academic world.
Figure 4 shows a box plot of the energy consumption (in kWh) predicted by the artificial neural network (ANN), random forest (RF), XGBoost (XG), radial basis function (RBF), and autoencoder (AUTO) models, alongside the actual values. The median energy consumption amounts are Actual (1.5 × kWh), ANN (1.1 × kWh), RF (1.3 × kWh), XG (1.4 × kWh), RBF (1.2 × kWh), and AUTO (1.3 × kWh). With the lowest median among the models, the ANN model is judged to perform better than the others, demonstrating its ability to anticipate energy usage with accuracy. This improved performance implies that applying the ANN model to heat exchanger operations may result in more accurate energy management, which would ultimately raise productivity and lower operating expenses throughout the Valladolid campus.
Figure 5 compares the measured energy consumption values with the predictions made by the different AI models: random forest (RF), XGBoost (XG), autoencoder, radial basis function (RBF), artificial neural network (ANN), and decision tree. Perfect estimation is represented in each plot by a diagonal dotted line, where predicted and true values are equal. The closer the points are to the dotted line, the more accurate the model’s predictions are. This visualization allows us to see each model’s ability to predict energy consumption in detail.
Each model displays a different degree of accuracy in its predictions. The ANN, RBF, and autoencoder models show a strong correlation between predicted and actual values, with most data points closely aligned with the diagonal line. The random forest model also performs well but exhibits slightly more dispersion in its predictions. The XGBoost model, while relatively accurate, shows a broader spread of points, indicating some variability in its predictions. The RBF and autoencoder models reveal a more significant divergence from the ideal line, suggesting that they may require further tuning to enhance their predictive capabilities. The tree model, while offering some insights, shows the most deviation from actual values, indicating that it is the least effective among the models presented.
The information provided by this model comparison is essential for optimizing energy management practice on the Valladolid campus, specifically regarding heat exchangers. Because the ANN, RBF, and autoencoder models are the most accurate, they can be regarded as the most effective tools for predicting energy consumption, allowing for improved operational efficiency and lower spending. By leveraging the strengths of the best models, the campus will have more effective options for creating strategies that optimize energy use while reducing its overall carbon footprint. As the campus continues to refine its AI models, it will be important to keep improving prediction accuracy to ensure energy efficiency when ANN is used in future applications.
Q-Q plots are useful for evaluating whether the energy consumption values predicted by the AI models follow a theoretical normal distribution. Characterizing the distribution of the energy consumption predictions for heat exchangers is essential when drawing conclusions about patterns, anomalies, or model performance. In each Q-Q plot, the quantiles of the values predicted by a model (ANN, RF, XG, RBF, autoencoder, or decision tree) are compared against the theoretical quantiles of the normal distribution. If the points fall nearly along the diagonal line, the predicted values from the model are approximately normally distributed, which is an assumption of many statistical analyses.
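The construction of a normal Q-Q plot described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: the plotting positions `(i - 0.5) / n`, the function names, and the synthetic data are all assumptions introduced here. The correlation between theoretical and sample quantiles is used as a simple numeric proxy for how closely the points follow the diagonal.

```python
import numpy as np
from statistics import NormalDist


def qq_points(values):
    """Return (theoretical, sample) quantile pairs for a normal Q-Q plot."""
    sample = np.sort(np.asarray(values, dtype=float))
    n = sample.size
    # Plotting positions (i - 0.5) / n keep probabilities strictly inside (0, 1).
    probs = (np.arange(1, n + 1) - 0.5) / n
    std_normal = NormalDist()
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    return theoretical, sample


def qq_correlation(values):
    """Correlation of theoretical vs. sample quantiles; near 1 suggests normality."""
    theoretical, sample = qq_points(values)
    return float(np.corrcoef(theoretical, sample)[0, 1])


# Synthetic stand-ins for model predictions (hypothetical values, not study data).
rng = np.random.default_rng(0)
normal_like = rng.normal(loc=1.2e5, scale=1.0e4, size=500)
skewed = rng.exponential(scale=1.0e4, size=500)

r_normal = qq_correlation(normal_like)
r_skewed = qq_correlation(skewed)
print(f"normal-like: {r_normal:.3f}, skewed: {r_skewed:.3f}")
```

A set of predictions whose points hug the diagonal gives a correlation close to 1, while heavy tails or skew pull it down.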
When looking at the Q-Q plots shown in Figure 6, we can observe distinctive performance characteristics for the models. The ANN points follow the diagonal more closely than those of the other models. This is a strong indication that its predictions are approximately normally distributed, and a good indicator that the model is reliable in estimating energy consumption. The RF and XGBoost plots deviate somewhat from the diagonal, mainly in the tails, suggesting that these models struggled with extreme values. The RBF and autoencoder plots deviate more strongly from the diagonal, indicating that those models must be improved before their predictions approach a normal distribution. The decision tree plot lies furthest from the 45° diagonal, suggesting that it is a poor means of predicting energy consumption in this situation. The insights gained from the Q-Q plots have important implications for energy management strategies involving heat exchangers. Models that produce predictions closely aligned with a normal distribution, such as ANN, are likely to be more reliable for forecasting energy consumption patterns.
The histograms show the residuals from the six AI models: artificial neural network (ANN), random forest (RF), XGBoost (XG), radial basis function (RBF), autoencoder, and tree. Residuals are defined as the differences between the predicted values and the actual observed values of energy consumption. Examining the residuals is necessary to assess how well each model performs and whether there are any biases in its predictions. Residuals should ideally be normally distributed around zero, indicating unbiased model predictions and a lack of systematic error.
Figure 7 shows recognizable patterns in the residuals of each model. The ANN model shows a relatively tight distribution of residuals around a mean of 0, which suggests that it is a reasonable predictor with near-zero bias. The RF and XGBoost models show somewhat wider distributions: they also provide reasonable predictions, but with more variability. They also have more pronounced tails than the ANN model, suggesting that they may have difficulty with predictions in some ranges of the data, which could result in over- or under-prediction in extreme cases. The RBF and autoencoder models have wider distributions, suggesting that they need further optimization and could potentially be built differently for improved accuracy. Finally, the tree histogram has a much wider and more skewed distribution about the mean, indicating a tendency toward larger prediction errors and suggesting that it is likely the least reliable model in this use case.
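The residual checks discussed above (bias near zero, spread, and symmetry) can be summarized numerically. The sketch below is illustrative only: the function name and the small arrays are hypothetical, and residuals follow the definition given in the text (predicted minus actual).

```python
import numpy as np


def residual_summary(actual, predicted):
    """Bias (mean), spread (std), and skewness of residuals = predicted - actual."""
    residuals = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    mean = residuals.mean()
    std = residuals.std()
    # Skewness as the third standardized moment; 0 for a symmetric distribution.
    skew = float(np.mean(((residuals - mean) / std) ** 3)) if std > 0 else 0.0
    return {"bias": float(mean), "std": float(std), "skewness": skew}


# Hypothetical consumption values (not study data).
actual = np.array([100.0, 120.0, 90.0, 110.0, 130.0])
tight = actual + np.array([1.0, -1.0, 0.5, -0.5, 0.0])            # small, centered errors
biased = actual + 10.0 + np.array([5.0, -5.0, 2.0, -2.0, 0.0])    # offset and wider errors

summary_tight = residual_summary(actual, tight)
summary_biased = residual_summary(actual, biased)
print(summary_tight)
print(summary_biased)
```

A model with a tight histogram centered on zero yields a small bias and a small standard deviation; a systematic offset appears as a non-zero bias even when the histogram shape looks similar.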
3.2. Evaluation of Performance Matrix
Figure 8 contains two bar charts comparing the six AI models (ANN, RF, XGBoost, RBF, autoencoder, and tree) applied to the energy consumption of the heat exchangers. The bar chart on the left shows the Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), both common measures of the accuracy of model predictions. The lower the RMSE and MAE, the better the model performance, as this signifies a smaller difference between the predicted and actual values. The bar chart on the right shows the Kling–Gupta Efficiency (KGE), the Nash–Sutcliffe Efficiency (NSE), and the coefficient of determination (R2). KGE and NSE are predominantly used in hydrological modeling, but, like R2, they can be used to assess the predictive power and goodness-of-fit of models in other engineering problems. For KGE, NSE, and R2, higher is better (close to 1 is ideal), meaning that the predicted values closely match the observed values.
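For reference, the five metrics can be computed as in the sketch below. It assumes the standard definitions: NSE as one minus the ratio of squared error to the variance of the observations, KGE in its original three-component form (correlation, variability ratio, bias ratio), and R2 taken as the squared Pearson correlation, which is only one of several conventions. The sample arrays are hypothetical.

```python
import numpy as np


def _arrays(obs, sim):
    return np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)


def rmse(obs, sim):
    obs, sim = _arrays(obs, sim)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def mae(obs, sim):
    obs, sim = _arrays(obs, sim)
    return float(np.mean(np.abs(sim - obs)))


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches predicting the mean."""
    obs, sim = _arrays(obs, sim)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio."""
    obs, sim = _arrays(obs, sim)
    r = np.corrcoef(obs, sim)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return float(1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2))


def r2(obs, sim):
    """Coefficient of determination, here as squared Pearson correlation."""
    obs, sim = _arrays(obs, sim)
    return float(np.corrcoef(obs, sim)[0, 1] ** 2)


# Hypothetical observed and predicted consumption values.
obs = np.array([10.0, 12.0, 15.0, 11.0, 14.0])
sim = np.array([10.5, 11.5, 14.0, 11.5, 14.5])
for name, fn in [("RMSE", rmse), ("MAE", mae), ("NSE", nse), ("KGE", kge), ("R2", r2)]:
    print(f"{name}: {fn(obs, sim):.4f}")
```

Reporting error metrics (RMSE, MAE) together with efficiency metrics (NSE, KGE, R2) is useful because a model can have a high correlation with the observations while still being biased, which KGE and NSE penalize.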
By analyzing both graphs, we can identify the AI models that best predict energy consumption in the heat exchanger. In terms of error, the RBF and autoencoder models have the lowest RMSE and MAE, which indicates the most accurate predictions. The right graph supports this: RBF and autoencoder yield KGE, NSE, and R2 values very close to 1, suggesting a good fit to the data. While ANN also yields relatively low error values, its KGE, NSE, and R2 are lower than those of RBF and autoencoder, which suggests that its predictions are somewhat less accurate. The XGBoost, RF, and tree models, although their errors are acceptable, have higher error values and lower efficiency and R2 values than the RBF and autoencoder models, so their prediction accuracy is lower. Based on these metrics, the RBF and autoencoder models appear to provide the best predictions of energy consumption in the heat exchanger, thereby contributing to more sustainable energy management and efficiency within the Valladolid campus.
Figure 9 is a correlation heatmap that shows the correlation coefficients among the variables, namely the “Actual” values and the “Predicted” values from the six AI models: ANN, RF, XGBoost, RBF, autoencoder, and tree. The correlation coefficient ranges from −1 to 1, where +1 is a perfect positive correlation, −1 is a perfect negative correlation, and 0 is no linear correlation. The color scale shows the strength of the correlation, with dark red representing the strongest positive correlation (closer to 1) and dark blue the weakest. Each cell at the intersection of a row and a column shows the correlation between the corresponding variables.
A higher correlation coefficient between the actual and predicted values indicates that the model’s predictions closely follow the actual trend, signifying better model performance. From the heatmap, “Predicted ANN,” “Predicted RBF,” and “Predicted auto” all show very high correlation coefficients with the “Actual” values (0.9939, 0.9915, and 0.9948, respectively). This suggests that these three models are highly accurate in their predictions. “Predicted RF” and “Predicted tree” also show strong positive correlations (0.8269 and 0.7685, respectively), but these are not as high as for ANN, RBF, and autoencoder. “Predicted xgboost” exhibits the lowest correlation with the “Actual” values (0.7016) among all the models, indicating that its predictions are less aligned with the actual data than those of the other models. Therefore, based on this correlation analysis, ANN, RBF, and autoencoder appear to be the most reliable models in terms of their ability to predict outcomes that closely mirror the actual observations.
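A correlation matrix of this kind can be computed directly from the actual and predicted series. The following sketch uses synthetic data and placeholder labels ("Predicted A", "Predicted B"), which are assumptions for illustration and do not correspond to any model in the study.

```python
import numpy as np

rng = np.random.default_rng(42)
actual = rng.uniform(50.0, 150.0, size=200)  # hypothetical consumption values

# Each entry is one row of the heatmap; the noise level sets the prediction quality.
series = {
    "Actual": actual,
    "Predicted A": actual + rng.normal(0.0, 3.0, size=200),   # small error
    "Predicted B": actual + rng.normal(0.0, 30.0, size=200),  # larger error
}

labels = list(series)
# np.corrcoef treats each row as one variable and returns the Pearson matrix.
corr = np.corrcoef(np.vstack([series[name] for name in labels]))

for name, row in zip(labels, corr):
    print(f"{name:12s}", " ".join(f"{value:6.3f}" for value in row))
```

The first row of the matrix gives the correlation of each prediction with the actual values, which is the quantity compared across models in the text. Note that a high Pearson correlation captures trend alignment only and does not rule out a constant offset or scaling error.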
3.3. Validation of Each Building
Figure 10 presents a visual analysis of the energy consumption predictions for 27 distinct buildings at the University of Valladolid. Each image is a grid of nine plots titled with the building IDs: the first two images use IDs ranging from D01 to D013, while the third image uses the “E” naming sequence (E01, E02, etc.). Each subplot compares the “Actual” energy consumption (black line) with the predictions of the three selected AI models: (a) artificial neural network (ANN, red line), (b) radial basis function (RBF, green line), and (c) autoencoder (magenta line). Each plot therefore visualizes how well a model estimates energy consumption for a specific building across the “Sample Indices,” which likely represent different periods of time or data points.
An in-depth analysis of the 27 plots indicated various levels of predictive accuracy for the three AI models across the buildings. For many of the buildings, mostly those in the “D” series (e.g., D01, D04, D07, D08, D010, D011), both the ANN and RBF models estimated the actual energy consumption exceptionally well. The predicted lines overlapped or lay in close proximity to the historical energy consumption data, which strongly suggests predictive accuracy and robustness. This performance highlights the reliability of the ANN and RBF models for energy forecasting in these buildings.
The autoencoder model, on the other hand, behaves more variably. It fits the actual consumption well in some cases (e.g., D01, D04) but is very inaccurate in others. For some buildings, especially D02, D05, D06, D09, and D013 from the “D” series and E01, E02, E04_1, and E05 from the “E” series, the autoencoder prediction line deviates more from the actual data, with greater over- or under-estimation of consumption than ANN or RBF. These differences suggest that the autoencoder is more sensitive to the consumption profile of the specific building, and that it may need better tuning or that its architecture may not be uniformly suitable for all building types on this campus.
The overarching objective of this extensive analysis is to identify the most effective AI models for predicting energy consumption across the entire University of Valladolid campus. Accurate energy consumption forecasting is paramount for implementing efficient energy management strategies, optimizing resource allocation, reducing operational costs, and ultimately contributing to the university’s sustainability goals. By analyzing the performance of ANN, RBF, and autoencoder across 27 diverse buildings, the study aims to determine which models offer the best balance of accuracy and adaptability for a complex, multi-building environment.
This detailed visual comparison is an essential diagnostic tool, providing clarity into the strengths and weaknesses in each AI model applied against real energy consumption data. The visualization of these differences by building allows researchers and energy managers to rationally decide which models to use and whether combining models is feasible. For example, in cases where ANN or RBF performed particularly well, these models could act as forecasting tools for energy managers; in buildings that presented specific difficulties with the autoencoder, model hybridizations or other models or further parameter optimization of the autoencoder would be necessary to successfully build a forecasting model.
The analysis performed over all 27 buildings of the University of Valladolid using the ANN, RBF, and autoencoder models is an essential first step towards the establishment of an intelligent energy intervention and management system. The evidence from these visual comparisons suggests that ANN and RBF generally provide more accurate and consistent predictions across a varied set of building types, including both the “D” series and “E” series buildings. While autoencoders show promise, their variable performance means that further investigation or customized application is needed. The additional insights from these plots will be essential to a data-driven approach to energy efficiency, helping energy managers use scarce resources responsibly and maintain a sustainable campus environment.
These correlation heatmaps depict strong relationships between the actual energy consumption and the values predicted by the various AI models for several buildings on the university campus. Similarly high correlation patterns were found in all 27 buildings assessed, which indicates the strength of the AI models’ predictive capacity. The overall high correlation across all buildings indicates that the models, including artificial neural networks (ANN), radial basis function (RBF), and autoencoders (AE), effectively learned the complexity inherent in the energy consumption of different building types and purposes. The consistent performance across the entire dataset of 27 buildings supports the reliability and generalizability of the proposed AI framework for university-wide energy management and optimization.
When the correlation heatmaps for buildings D01, D08, E03, and E014 are carefully examined, the visual evidence further reinforces the high performance of the models. The heatmaps show the correlations between the actual values and the predicted values, where the diagonal values (i.e., the correlation of a variable with itself) are all 1.0, as expected. More importantly, the off-diagonal values, particularly those relating “Actual” energy consumption to “Predicted ANN”, “Predicted RBF”, and “Predicted Auto”, all exhibit very high positive correlations (in some cases greater than 0.9). For example, in Figure 11a–d, each correlation coefficient between “Actual” and the respective predicted values approaches 1, suggesting that the observed energy usage in these buildings was closely aligned in trend with the models’ forecasts. The sample selected for visual representation showed high levels of agreement, confirming that the individual AI models suitably captured complex energy behavior at the building level, which is particularly important for characterizing high-consumption areas and refining energy conservation efforts based on evidence.
3.4. CO2 Emissions on Campus
Figure 12 shows the CO2 emissions (kg) from energy use across the entire university campus, with an emphasis on biomass energy, as predicted by several different AI models. Within the scatter plot, the “Actual” points (shown in yellow) provide a baseline of reported CO2 emissions. The CO2 emissions across the different biomass consumption records span a broad range, from near zero to slightly over 4000 kg, with a few extreme outliers exceeding 5000 kg for some models. The spread of the data points of the models (ANN, RF, XGBoost, RBF, and autoencoder) reflects their different predictions of the CO2 emissions associated with biomass consumption across the university’s 27 buildings. The distribution of total CO2 emissions across the campus, the different footprints attributed to each building, and the energy consumption patterns provide critical information for targeting strategies to reduce emissions.
When examining the performance of the individual AI models in predicting CO2 emissions from biomass, the goal is to identify which model’s predictions most closely align with the “Actual” observed values. Visually inspecting the plot, we see that the artificial neural network (ANN), radial basis function (RBF), and autoencoder (Auto) models appear to have distributions that closely mirror the “Actual” yellow data points in terms of both range and density, particularly in the lower to mid-range emission values. For instance, the clusters of blue (ANN), cyan (RBF), and magenta (autoencoder) points show a strong resemblance to the spread of the actual data. While all models capture the general trend, these three models demonstrate a higher fidelity in reflecting the real-world CO2 emissions attributed to biomass energy consumption across the campus. This proximity to actual values makes ANN, RBF, and autoencoder strong candidates for reliable CO2 emission monitoring and forecasting in the context of biomass energy use.
3.5. Sensitivity Analysis and Feature Importance
The bar charts in Figure 13 present the results of the sensitivity analysis and feature importance for each of the six AI models employed: (a) ANN, (b) RF, (c) XGBoost, (d) RBF, (e) autoencoder, and (f) decision tree. Understanding feature importance is critical for interpreting how each model arrived at its predictions, because it reveals which input variables had the most significant impact on the model’s output. While the specific labels of the “Features” are not provided, it is evident that the models attribute varying degrees of importance to different input variables. For instance, in panel (a) for ANN, one feature stands out with overwhelmingly high importance, suggesting that the ANN relies heavily on this specific input for its predictions. Conversely, in panel (d) for RBF, several features appear to contribute almost equally and significantly to the model’s performance, indicating a more distributed reliance on its inputs. This variance in feature importance across models highlights their distinct internal mechanisms and sensitivities to the input data.
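The text does not state which importance measure was used. One model-agnostic option is permutation importance, sketched below under that assumption: each feature column is shuffled in turn and the resulting increase in mean squared error is recorded. The function name, the synthetic features, and the least-squares stand-in model are all hypothetical and are not the models or inputs of the study.

```python
import numpy as np


def permutation_scores(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        scores[j] = np.mean(increases)
    return scores


# Synthetic data: feature 0 dominates, feature 1 is weak, feature 2 is irrelevant.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

# Stand-in model: ordinary least squares with an intercept column.
design = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)


def predict(M):
    return np.column_stack([M, np.ones(len(M))]) @ coef


importances = permutation_scores(predict, X, y)
print(importances)
```

Because the procedure only needs a `predict` function, the same routine could in principle be applied to any of the six model types, which makes the resulting bar charts comparable across models.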
3.6. Short-Term Predictions
Figure 14a illustrates an energy consumption forecast for the whole campus over the next five years, obtained with a linear regression model. The blue line represents the historical energy data, which fluctuate significantly. Following the historical data, a red line extrapolates the energy consumption over the subsequent period using linear regression. This linear trend suggests a steady and continuous increase in energy consumption over the five-year forecast period. The model projects that the energy demand, which was around 0.5 × 10^5 kWh at the start of the forecast, will rise to approximately 5.5 × 10^5 kWh by the end of the five-year period. This indicates that, based on the linear regression, the campus should anticipate a substantial and consistent rise in its overall energy requirements in the coming five years.
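The extrapolation step can be illustrated with a minimal sketch: fit a straight line to the historical series by least squares and evaluate it at future time indices. The function name, the time index in months, and the sample values are assumptions for illustration and are not the campus data.

```python
import numpy as np


def linear_forecast(history, horizon):
    """Fit a least-squares line to the history and extend it `horizon` steps."""
    history = np.asarray(history, dtype=float)
    t = np.arange(len(history))
    slope, intercept = np.polyfit(t, history, deg=1)
    future_t = np.arange(len(history), len(history) + horizon)
    return slope * future_t + intercept


# Hypothetical monthly consumption values following an exact linear trend.
history = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
forecast = linear_forecast(history, horizon=3)
print(forecast)  # approximately [20. 22. 24.]
```

With monthly data, a five-year forecast corresponds to `horizon=60`. A linear fit captures only the average trend, so seasonal fluctuations in the historical data are not reproduced in the forecast.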
Figure 14b illustrates a forecast of CO2 emissions for the entire campus over the next five years, also obtained with a linear regression model. As in the energy usage graph, the blue line shows the historical CO2 emissions data, which exhibit irregular variability over about 350 months. The historical CO2 data are followed by a projected (red) line that extends the series using the linear regression model. The resulting linear trend indicates a steady and significant increase in campus CO2 emissions during the five-year forecast period. According to the model, CO2 emissions start at about 1500 kg at the beginning of the forecast period and increase rapidly to almost 14,000 kg near its end (a little less than 1600 months after the first historical data point used). Therefore, based on the linear regression, the campus can anticipate a large and steady increase in CO2 emissions over the next five years.
4. Conclusions
This research proposes and validates a comprehensive artificial intelligence-based methodology for forecasting and optimizing energy consumption and associated CO2 emissions across a complex university campus composed of 27 buildings with varying sizes, uses, and thermal behaviors. The framework combines thermodynamic modeling of heat exchangers with advanced data-driven techniques to improve operational efficiency and sustainability in higher education infrastructures. Six distinct AI algorithms were implemented: artificial neural networks (ANN), radial basis function networks (RBF), autoencoders, random forest (RF), XGBoost, and decision trees, each trained on high-resolution data including monthly building-level energy use, detailed heat exchanger parameters, occupancy information, and weather-dependent load variations. The performance of these models was rigorously assessed using multiple statistical and hydrological metrics, including MAPE, RMSE, R2, NSE, and KGE, to ensure robustness, interpretability, and generalizability of the predictions.
The results obtained confirm that ANN, RBF, and autoencoder models exhibit superior performance in terms of prediction accuracy and reliability. These models consistently delivered R2 values above 0.99 and mean absolute percentage errors below 5%, even under varying building conditions. ANN demonstrated the most stable behavior across the full dataset, showing limited residual dispersion and strong alignment with the theoretical normal distribution of prediction errors. The approach combines data science with engineering thermodynamics to deliver accurate and actionable insights, making it a valuable tool for energy management. The results demonstrate that the integration of real-world performance data from thermal systems with state-of-the-art machine learning models offers significant potential for improving energy efficiency and reducing environmental impact in complex building clusters.
The ability of these models to capture nonlinear relationships between heat exchanger performance and energy demand is essential in environments characterized by dynamic load profiles and heterogeneous equipment operation. The autoencoder model proved especially useful in identifying latent features and detecting anomalies or inefficiencies in the operation of thermal systems, which is highly valuable for preventive maintenance and fault diagnosis.
The study revealed substantial differences in model performance depending on building typology, thermal load variability, and availability of operational data. Administrative and academic buildings with stable usage patterns allowed all models to achieve high predictive accuracy. Conversely, laboratory spaces, multi-purpose halls, and buildings with intermittent occupancy introduced significant variability that required tailored modeling strategies. These findings suggest that energy forecasting for multi-building infrastructures cannot rely on uniform models, but instead should consider hybrid or adaptive approaches, possibly integrating real-time feedback or reinforcement learning techniques in future work. The capacity to disaggregate predictions at the building level is particularly relevant for energy management aiming to identify high-consumption profiles, benchmark performance, and prioritize retrofitting actions under constrained budgets.
Heat exchangers emerged as a critical component in the energy-emission nexus. The study demonstrated that optimized HX operation, through control of flow rates, temperature differentials, and seasonal adjustments, can result in energy savings of up to 30% in certain buildings when compared to standard operating conditions. These reductions are directly reflected in lower CO2 emissions, particularly in buildings where fossil fuels such as natural gas are still predominant. By applying standardized emission factors to the energy consumption predictions, the framework enabled precise estimation of building-level carbon footprints. The study confirmed that buildings equipped with well-performing HX systems and partially integrated renewable energy technologies show significantly lower emission intensities, reinforcing the strategic value of upgrading HVAC subsystems as part of a decarbonization pathway.
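The conversion from predicted energy to emissions described above amounts to multiplying consumption by an emission factor for the energy carrier. The sketch below illustrates this with placeholder values: the factors in `EMISSION_FACTORS`, the predicted consumption, and the function names are hypothetical and are not the standardized factors used in the study. The 30% figure is the upper bound on savings reported in the text.

```python
# Placeholder emission factors in kg CO2 per kWh (hypothetical, for illustration only).
EMISSION_FACTORS = {"natural_gas": 0.2, "biomass": 0.02}


def co2_emissions_kg(energy_kwh, emission_factor_kg_per_kwh):
    """Emissions = energy consumption x emission factor."""
    if energy_kwh < 0 or emission_factor_kg_per_kwh < 0:
        raise ValueError("energy and emission factor must be non-negative")
    return energy_kwh * emission_factor_kg_per_kwh


def emissions_after_savings(energy_kwh, emission_factor_kg_per_kwh, savings_fraction):
    """Emissions when consumption is reduced by a fraction between 0 and 1."""
    if not 0.0 <= savings_fraction <= 1.0:
        raise ValueError("savings_fraction must be between 0 and 1")
    reduced_energy = energy_kwh * (1.0 - savings_fraction)
    return co2_emissions_kg(reduced_energy, emission_factor_kg_per_kwh)


predicted_kwh = 10_000.0  # hypothetical model output for one building
factor = EMISSION_FACTORS["natural_gas"]

baseline_kg = co2_emissions_kg(predicted_kwh, factor)
optimized_kg = emissions_after_savings(predicted_kwh, factor, savings_fraction=0.30)
print(f"baseline: {baseline_kg:.0f} kg, with 30% savings: {optimized_kg:.0f} kg")
```

Because the relation is linear, a given percentage reduction in energy use translates into the same percentage reduction in emissions for a fixed energy carrier; the absolute benefit is larger where the emission factor is higher.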
The inclusion of a five-year forecasting horizon, based on linear regression applied to the AI-derived consumption and emissions data, further strengthens the contribution of this work. The forecast indicates a significant and sustained increase in both energy consumption and CO2 emissions across the campus if no corrective measures are implemented. Energy demand is expected to grow from 0.5 × 10^5 kWh to 5.5 × 10^5 kWh, while emissions may increase from approximately 1500 kg to 14,000 kg over the same period. These projections emphasize the urgency of deploying predictive control systems and data-driven decision-making tools for energy and sustainability planning. AI-based forecasting enables early detection of inefficiencies and supports the definition of targeted intervention strategies with measurable environmental and economic impact.
The study also incorporated sensitivity and feature importance analyses to identify the most influential parameters affecting energy use. Heat exchanger-specific variables, particularly volumetric flow rate, supply and return temperatures, and thermal efficiency, were consistently identified as the dominant predictors across all models. These findings highlight the importance of accurately monitoring and controlling the thermal subsystems of the district heating network in line with thermodynamic principles. Moreover, contextual parameters such as building use, occupancy density, and floor area were shown to enhance model performance, supporting the development of intelligent digital twins for operational optimization. The methodology is scalable to other buildings, campuses, hospitals, and cities, contributing directly to the broader transition towards data-informed, low-carbon built environments in alignment with the European Union’s climate and energy directives.