Machine Learning Models for Solar Power Generation Forecasting in Microgrid Application Implications for Smart Cities

Suanpang, Pannee; Jamjuntr, Pitchaya

doi:10.3390/su16146087


by Pannee Suanpang ¹,* and Pitchaya Jamjuntr ²

¹ Department of Information Technology, Faculty of Science & Technology, Suan Dusit University, Bangkok 10300, Thailand

² Department of Electrical Engineering, Faculty of Engineering, King Mongkut’s University of Technology Thonburi, Bangkok 10140, Thailand

* Author to whom correspondence should be addressed.

Sustainability 2024, 16(14), 6087; https://doi.org/10.3390/su16146087

Submission received: 18 June 2024 / Revised: 7 July 2024 / Accepted: 8 July 2024 / Published: 17 July 2024

(This article belongs to the Topic Energy Management and Sustainable Development from Economic, Social and Environmental Aspects)


Abstract

In the context of escalating concerns about environmental sustainability in smart cities, solar power and other renewable energy sources have emerged as pivotal players in the global effort to curtail greenhouse gas emissions and combat climate change. The precise prediction of solar power generation holds a critical role in the seamless integration and effective management of renewable energy systems within microgrids. This research delves into a comparative analysis of two machine learning models, specifically the Light Gradient Boosting Machine (LGBM) and K Nearest Neighbors (KNN), with the objective of forecasting solar power generation in microgrid applications. The study meticulously evaluates these models’ accuracy, reliability, training times, and memory usage, providing detailed experimental insights into optimizing solar energy utilization and driving environmental sustainability forward. The comparison between the LGBM and KNN models reveals significant performance differences. The LGBM model demonstrates superior accuracy with an R-squared of 0.84 compared to KNN’s 0.77, along with lower Root Mean Squared Error (RMSE: 5.77 vs. 6.93) and Mean Absolute Error (MAE: 3.93 vs. 4.34). However, the LGBM model requires longer training times (120 s vs. 90 s) and higher memory usage (500 MB vs. 300 MB). Despite these computational differences, the LGBM model exhibits stability across diverse time frames and seasons, showing robustness in handling outliers. These findings underscore its suitability for microgrid applications, offering enhanced energy management strategies crucial for advancing environmental sustainability. This research provides essential insights into sustainable practices and lays the foundation for a cleaner energy future, emphasizing the importance of accurate solar power forecasting in microgrid planning and operation.

Keywords: forecasting; K Nearest Neighbors (KNN); Light Gradient Boosting Machine (LGBM); smart cities; solar power generation

1. Introduction

The accelerating pace of urbanization coupled with the ever-increasing energy demands of cities has brought forth an urgent need for innovative and sustainable solutions. Smart cities, characterized by their integration of advanced technologies and intelligent urban planning, aim to enhance the quality of life of their residents while minimizing their environmental footprint. In this pursuit, renewable energy sources have emerged as pivotal components, with solar power standing out as a beacon of clean and accessible energy [1,2]. Solar power technologies have become central to the vision of smart energy-efficient cities. By harnessing the sun’s abundant energy, cities can significantly reduce their reliance on fossil fuels, mitigate carbon emissions, and ensure a reliable energy supply for their growing populations [3].

Solar power generation in smart cities encompasses a wide array of applications, ranging from rooftop solar panels on residential buildings to expansive solar farms integrated into urban landscapes. The integration of solar energy into the fabric of cities not only provides a source of renewable electricity but also fosters energy independence and resilience. The decentralized nature of solar power allows for local energy production, reducing transmission losses and increasing the overall efficiency of the energy grid [3,4]. Moreover, the advent of advanced technologies, such as artificial intelligence and smart grid systems, has further enhanced the integration of solar power into urban environments. These technologies enable real-time monitoring, demand-side management, and the optimization of energy distribution, making solar power a viable and intelligent choice for smart cities [2,4].

Figure 1 depicts the microgrid components, showcasing renewable energy generated from solar power. This energy is sent to the microgrid and distributed to various destinations, including energy storage systems, electric vehicles (EVs), power generators, homes, and utility grids. Solar power generation has emerged as a promising solution for addressing the challenges posed by traditional energy sources and achieving environmental sustainability. In this quest for a sustainable energy paradigm, microgrids have emerged as an essential enabler, providing a dynamic platform for the seamless integration and optimal utilization of solar power generation [5,6,7]. Whether operating independently or interconnected with the main power grid, microgrids offer a versatile infrastructure that empowers communities and industries to embrace solar energy as a central component of their energy ecosystem. Leveraging microgrids can unlock the full potential of solar power, maximizing its impact and paving the way toward a more resilient, self-sufficient, and sustainable energy landscape [7,8].

In recent years, the deployment of solar power systems has gained significant traction, especially in the context of microgrids. Microgrids, which are localized power distribution networks that can operate autonomously or in connection with the main grid, offer numerous benefits including enhanced energy reliability, reduced transmission losses, and increased resilience. Efficiently forecasting solar power generation in microgrids is crucial for optimal operation and planning as it enables the effective integration of renewable energy sources into the grid [7,9,10].

Solar power is a clean and renewable energy source that has the potential to play a significant role in meeting the world’s energy needs. However, the intermittent nature of solar power generation can make it difficult to integrate into the grid. One way to address this challenge is to use solar power generation forecasting to help ensure that the grid has the necessary capacity to meet demand [11].

1.1. Research Gap

Despite significant progress in the deployment of solar power systems, particularly in the context of microgrids, efficient forecasting of solar power generation remains crucial for optimal operation and planning. Forecasting models, such as Light Gradient Boosting Machine (LGBM) and K Nearest Neighbors (KNN), are used to predict solar power generation [3,4,7,8]. LGBM, a machine learning algorithm, has been shown to be effective for various forecasting tasks by incorporating meteorological variables and historical solar power generation data [8]. KNN, a non-parametric algorithm, captures the spatial correlation between neighboring solar power plants to enhance forecast accuracy [8].

A number of different forecasting models can be used to predict solar power generation, and two of the most popular are LGBM and KNN. LGBM is a machine learning algorithm that has been shown to be effective for a variety of forecasting tasks; its forecast accuracy improves when meteorological variables and historical solar power generation data are incorporated [1,2,5,12]. KNN is a non-parametric algorithm that is simple to implement and interpret; KNN models capture the spatial correlation between neighboring solar power plants and thereby enhance forecast accuracy [8,13].

However, there is a notable research gap in the comparative studies of LGBM and KNN models, specifically for environmental sustainability [8]. While both models have demonstrated effectiveness in different aspects, a direct comparison focusing on their impact on environmental sustainability has not been thoroughly explored [11,12]. Addressing this gap could provide valuable insights into the most effective forecasting methods for integrating solar power into smart city frameworks, ultimately contributing to more sustainable urban development.

1.2. Objective

Therefore, the primary objective of this study is to compare the performance of the LGBM and KNN models in forecasting solar power generation within a microgrid context. The specific objectives of this study include the following:

Developing accurate and reliable solar power generation forecasting models using the LGBM and KNN algorithms;
Evaluating the performance of the LGBM and KNN models in terms of forecasting accuracy, precision, and efficiency;
Investigating the potential of these models to aid in effective decision making for optimal utilization of solar energy resources within microgrids;
Assessing the environmental benefits and implications of accurate solar power generation forecasting for reducing carbon emissions and advancing sustainability goals.

This study holds significant implications for environmental sustainability and renewable energy management. Comparing the LGBM and KNN models for solar power generation forecasting provides valuable insights into the suitability and performance of machine learning algorithms in microgrid applications. The findings of this study can greatly assist policymakers, energy managers, and system operators in making informed decisions regarding the integration and management of solar power generation within microgrids. This, in turn, promotes the development of a cleaner and more sustainable energy infrastructure.

Furthermore, this study contributes to the existing body of knowledge by exploring the potential of advanced forecasting techniques to enhance the efficiency and reliability of renewable energy systems. By evaluating the performance of the LGBM and KNN models, this research expands our understanding of the effectiveness of different algorithms in predicting solar power generation. The outcomes of this study can serve as a foundation for future research and development in the field of renewable energy forecasting and microgrid management, fostering the advancement of sustainable energy practices.

2. Literature Review

2.1. Solar Power Generation and Microgrids

The integration of solar power generation and microgrids within the context of smart cities has garnered significant attention in recent scholarly works. Researchers have explored innovative strategies to harness solar energy efficiently while optimizing its use within urban microgrids, aiming to create sustainable and resilient energy infrastructures for smart cities [14]. One prevalent area of research focuses on the technological advancements in solar panels and energy storage systems. Studies by Smith et al. [15] have highlighted the importance of high-efficiency solar panels coupled with advanced energy storage solutions, enabling smart cities to store surplus solar energy for later use, thereby ensuring a stable power supply even during periods of low solar generation. Additionally, scholars have delved into the design and optimization of microgrid architectures. Research conducted by Li [16] emphasizes the significance of intelligent microgrid management systems in balancing the supply–demand dynamics within smart cities. Advanced control algorithms and real-time monitoring mechanisms are explored to efficiently integrate solar power into microgrids, enhancing grid stability and reducing dependency on centralized power sources. Moreover, policy and regulatory frameworks governing solar power integration in smart city microgrids have been a subject of scholarly inquiry. Anderson and Patel [17] have analyzed the impact of supportive policies and incentives on encouraging investments in solar infrastructure. Their findings underscore the crucial role of policy formulation in fostering a conducive environment for the widespread adoption of solar power generation technologies in smart cities. Furthermore, researchers have explored case studies to evaluate the real-world implementation of solar-powered microgrids in smart cities. The work of Kim et al. [18] provides insights into the successful deployment of solar microgrids in urban areas, emphasizing the role of community engagement and collaboration between stakeholders. Such case studies offer valuable lessons for other smart cities aiming to replicate similar sustainable energy initiatives.

Smart grids play a pivotal role in the effective utilization of solar energy in smart cities. The real-time monitoring and control enabled by smart grid technologies facilitate seamless integration, allowing cities to manage energy distribution efficiently [19]. Advancements in energy storage solutions, such as high-capacity batteries and innovative thermal storage systems, have addressed the challenge of solar energy intermittency, ensuring a stable power supply even during periods of low sunlight [20].

Despite these advancements, challenges persist in the integration of solar power in smart cities. Issues such as land availability for solar farms, grid integration complexities, and public awareness remain significant hurdles [21]. Future research should therefore focus on enhancing energy storage technologies, optimizing grid infrastructure, and developing effective public and business sector engagement strategies to promote solar energy adoption in urban areas [22,23].

2.2. Solar Power Generation Forecasting Techniques

Solar power generation forecasting techniques have experienced significant advancements in recent years, enabling the efficient utilization of solar energy resources within microgrid systems. Researchers have explored various methods to forecast solar power generation, encompassing both statistical and machine learning-based approaches. Statistical methods, such as Autoregressive Integrated Moving Average (ARIMA), exponential smoothing, and linear regression, have been commonly employed [24]. These techniques rely on historical data and mathematical models to make predictions; however, they may encounter challenges in capturing the nonlinear and complex relationships inherent in solar power generation.

One common approach is the use of meteorological data and statistical methods for forecasting solar power generation. Studies by Kalogirou [25] have shown that combining historical solar irradiance data with statistical models, such as ARIMA, can yield accurate short-term solar power forecasts. These methods consider historical weather patterns and solar radiation levels to predict future solar power output.

The emergence of machine learning-based approaches has garnered attention in solar power generation forecasting due to their capability to capture intricate patterns and handle vast datasets. Among these approaches, artificial neural networks [26], support vector regression (SVR) [27], random forest [18], and gradient boosting algorithms have demonstrated promising results. By leveraging machine learning algorithms, these models enhance the accuracy and reliability of solar power generation forecasts. Moreover, a promising technique involves the use of machine learning algorithms, specifically artificial neural networks (ANNs). Researchers like Li et al. [28] have demonstrated the effectiveness of ANNs in capturing complex nonlinear relationships within solar power generation data. By training neural networks on historical solar output data along with corresponding meteorological variables, ANNs can provide accurate and real-time solar power forecasts. Furthermore, hybrid models that combine different forecasting methods have gained attention in recent years. For instance, the integration of Numerical Weather Prediction (NWP) models with machine learning algorithms has been explored by several researchers [29]. This hybrid approach leverages the strengths of both NWP models, which provide detailed weather information, and machine learning algorithms, which can effectively learn patterns from historical data. By combining these techniques, hybrid models offer enhanced accuracy in solar power generation forecasts, especially for medium to long-term predictions.

Artificial neural networks have proven effective in capturing nonlinear relationships and complex patterns within solar power generation data [30]. Support vector regression, a supervised learning algorithm, employs support vector machines to regress and forecast solar power generation. Random forest, an ensemble learning technique, combines multiple decision trees to improve prediction accuracy and handle high-dimensional data [31]. Additionally, gradient boosting algorithms, such as LGBM, leverage boosting techniques to iteratively enhance model performance [32]. These machine learning algorithms offer robust solutions for accurate and reliable solar power generation forecasting. By incorporating machine learning-based approaches into the realm of solar power generation forecasting, researchers have unlocked the potential to harness solar energy resources more effectively. These techniques enable the capture of intricate relationships and patterns, leading to improved accuracy in predicting solar power generation. The continued exploration and refinement of machine learning algorithms holds promise for further advancements in this field, facilitating the efficient planning and operation of microgrid systems [31,32].

2.3. Light Gradient Boosting Machine (LGBM)

In recent years, accurate solar power generation forecasting has garnered significant attention as it plays a crucial role in ensuring the efficient operation and management of microgrids. Numerous machine learning techniques have been employed to develop forecasting models; one such technique is the LGBM.

LGBM, a powerful gradient boosting framework, has gained popularity due to its efficiency and accuracy in solving a wide range of machine learning problems. By combining multiple weak predictive models known as decision trees, LGBM creates a robust predictive model. It employs a gradient-based learning algorithm and exhibits exceptional performance with large datasets and high-dimensional feature spaces. Several studies have investigated the application of LGBM for solar power generation forecasting. For example, one study devised an LGBM-based model to forecast solar power generation in a remote area microgrid [32]. This model demonstrated superior accuracy compared to traditional forecasting methods. Similarly, another study utilized LGBM to forecast solar power generation in a large-scale photovoltaic power plant and showcased its effectiveness in capturing intricate temporal patterns [33]. Figure 2 shows the LGBM model.
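The idea of combining many weak decision trees through gradient-based learning can be illustrated with a minimal sketch. The code below is a generic gradient boosting loop for squared-error loss that uses one-split trees (stumps) on a single feature; it is not LightGBM's actual implementation, which relies on histogram-based splitting and leaf-wise tree growth. All function names, the learning rate, the number of rounds, and the toy irradiance and power values are assumptions made for illustration only.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error on the residuals."""
    best = None
    # Exclude the largest value so the right-hand side of a split is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def fit_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit an additive model: a constant plus a sequence of shrunken stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the constant and the contribution of every stump for one input."""
    base, learning_rate, stumps = model
    total = base
    for threshold, left_mean, right_mean in stumps:
        total += learning_rate * (left_mean if x <= threshold else right_mean)
    return total


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data (invented): normalized irradiance versus generated power.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
ys = [0.0, 2.0, 5.0, 9.0, 15.0, 22.0, 30.0, 37.0, 43.0, 46.0]

model = fit_boosting(xs, ys)
mse_baseline = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_boosted = mse(ys, [predict(model, x) for x in xs])
```

Each round fits a weak learner to what the current ensemble still gets wrong, so the training error of the boosted model falls below that of the constant baseline.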

These findings highlight LGBM as a promising algorithm for solar power generation forecasting. It exhibits high accuracy, even when confronted with noisy data, and enables real-time forecasting of solar power generation. The utilization of LGBM for solar power generation forecasting holds the potential to contribute to the environmental sustainability of microgrids. By accurately predicting solar power generation, microgrids can operate more efficiently and reduce their reliance on fossil fuels. This, in turn, aids in minimizing greenhouse gas emissions and enhancing air quality. In addition to the aforementioned studies, further research has explored the application of LGBM in solar power generation forecasting. For instance, Wu et al. [34] investigated the impact of various feature engineering techniques on the performance of LGBM models in solar power forecasting. Their results demonstrated the importance of feature selection and preprocessing methods in enhancing forecasting accuracy.

Furthermore, another study proposed a hybrid model that combines LGBM with other machine learning algorithms to improve the accuracy and robustness of solar power generation forecasting [35]. Its findings indicated that the hybrid model outperforms individual models in capturing complex patterns and achieving accurate predictions. Moreover, recent research [24] focused on the interpretability of LGBM models in solar power generation forecasting. The authors introduced a novel feature importance measure for LGBM, shedding light on the relative significance of different features in the forecasting process [36].

Other notable studies [18,27] explored the potential of LGBM for solar power generation forecasting in urban areas. This research emphasized the adaptability of LGBM models to different geographical locations and highlighted the importance of incorporating spatial data for accurate solar power forecasts [37]. Furthermore, Li et al. [28] investigated the impact of incorporating weather data in LGBM models for solar power generation forecasting. Their results indicated that the inclusion of weather variables significantly improves the accuracy and reliability of forecasts, particularly in regions with high weather variability [38].

In summary, the application of LGBM in solar power generation forecasting has garnered considerable attention due to its efficiency, accuracy, and ability to handle complex patterns. With ongoing research focusing on various aspects of LGBM modeling, including feature engineering, interpretability, hybrid models, and the incorporation of weather and spatial data, the potential of LGBM in forecasting solar power generation is further solidified. The adoption of LGBM models in microgrids contributes to environmental sustainability by enabling efficient operations, reduced reliance on fossil fuels, and a subsequent decrease in greenhouse gas emissions, thus improving air quality.

2.4. K Nearest Neighbors (KNN)

KNN is a widely adopted non-parametric machine learning algorithm employed in both regression and classification tasks, including solar power generation forecasting. The essence of KNN lies in its utilization of historical data from similar instances, referred to as nearest neighbors, to predict the target variable [39]. The fundamental assumption underlying KNN is that instances sharing similar features are likely to exhibit similar target values. This principle forms the basis of KNN’s simplicity and interpretability, making it an attractive choice for forecasting applications. Unlike other algorithms, KNN does not impose any specific assumptions regarding the data distribution, allowing it to effectively handle nonlinear relationships [39,40]. Figure 3 shows the KNN model.

Nevertheless, the performance of KNN heavily hinges on two critical factors: the appropriate selection of the number of neighbors (K) and the choice of a suitable distance metric for similarity calculation. Determining the optimal value of K is crucial, as a small value may lead to excessive sensitivity to noise and result in overfitting, while a large value can smooth out important patterns and lead to underfitting. Additionally, selecting an appropriate distance metric, such as Euclidean or Manhattan distance, is vital to accurately measure the similarity between instances. The choice of distance metric depends on the characteristics of the dataset and the problem at hand [41].
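The effect of K and of the distance metric can be seen in a small sketch of KNN regression. The code below is a minimal illustration written with the Python standard library only; the function names, the two features, and the toy values are assumptions for demonstration and are not taken from the study's dataset or implementation.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance=euclidean):
    """Predict the target for `query` as the mean target of its k nearest neighbors."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training samples")
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / k


# Toy data (invented): (normalized irradiance, normalized temperature) -> power.
train_X = [(0.1, 0.2), (0.2, 0.3), (0.5, 0.5), (0.8, 0.7), (0.9, 0.9)]
train_y = [5.0, 9.0, 24.0, 41.0, 47.0]
query = (0.85, 0.75)

pred_k1 = knn_predict(train_X, train_y, query, k=1)                      # follows one neighbor
pred_k3 = knn_predict(train_X, train_y, query, k=3, distance=manhattan)  # local average
pred_k5 = knn_predict(train_X, train_y, query, k=5)                      # global mean
```

With K = 1 the prediction copies a single neighbor and so inherits any noise in it, while K equal to the training set size returns the global mean regardless of the query. Intermediate values trade off these two extremes, which is why K is usually tuned on held-out data.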

Several research papers have explored the application of KNN in solar power generation forecasting and have proposed various enhancements to its performance. For instance, Shi et al. [42] combined KNN with fuzzy C-means clustering to improve short-term solar power generation forecasting. Zhao et al. [43] introduced an optimized feature selection algorithm to enhance the accuracy and efficiency of KNN-based forecasting. Raza et al. [44] conducted a comparative analysis of machine learning techniques, including KNN, for solar power generation forecasting, providing insights into the strengths and weaknesses of KNN compared to other algorithms. Dong et al. [45] proposed an optimized feature selection method to enhance the performance of KNN for solar power prediction. Ding et al. [46] presented an improved KNN model with a novel method to determine the optimal number of neighbors, thus enhancing its predictive accuracy. These studies contribute to our understanding of the application of KNN in solar power generation forecasting. By considering various techniques, optimizations, and feature selection methods, researchers aim to improve the performance and accuracy of KNN-based models for more reliable and precise solar power generation forecasts.

2.5. Comparative Studies on Solar Power Generation Forecasting

Numerous comparative studies have been conducted in recent years to evaluate and compare the performance of different forecasting techniques for solar power generation. These studies primarily focus on incorporating the latest advancements in the field. The main objective of these studies is to identify the most accurate and reliable models for forecasting solar power generation, thereby improving the planning and operation of microgrids.

One such study was conducted by Zhang et al. [7], who compared multiple machine learning algorithms for solar power generation forecasting, including LGBM, KNN, artificial neural networks (ANN), and support vector regression (SVR). The analysis involved evaluating the accuracy, precision, and reliability of these algorithms. Similarly, Wai and Lai [47] performed a comparative analysis of machine learning algorithms for solar power generation forecasting in microgrid systems. They investigated the performance of various techniques, including LGBM, KNN, artificial neural networks, and decision trees, with the aim of determining the most effective model in terms of accuracy and computational efficiency [48].

Wang et al. [41] contributed to the literature with their comparative analysis of different machine-learning algorithms for solar power generation forecasting. They compared techniques such as LGBM, KNN, artificial neural networks, and support vector regression, focusing on evaluating the predictive performance of these algorithms and identifying the most suitable model for accurate solar power generation forecasting.

In the context of distributed energy systems, Gbémou et al. [35] conducted a comparative study to assess and compare the performance of various machine learning algorithms, including LGBM, KNN, artificial neural networks, and support vector regression, for forecasting solar power generation. Their research shed light on the strengths and weaknesses of each algorithm, providing insights for selecting appropriate models in distributed energy systems.

These comparative studies, along with others, offer valuable insights into the performance and effectiveness of different machine learning algorithms for solar power generation forecasting. By evaluating and comparing these techniques, researchers and practitioners can make informed decisions when selecting forecasting models for microgrids, ultimately improving the planning, operation, and utilization of solar energy resources.

2.6. Rayong Smart Cities: Thailand

The study area is located in Rayong, a prominent industrial center in Thailand that has undertaken numerous smart city initiatives aimed at improving urban life. These initiatives, spanning transportation, energy management, and healthcare, have employed innovative approaches. Simultaneously, renewable energy projects have played a pivotal role in significantly reducing the city’s carbon footprint [49]. However, the journey toward developing smart cities in Rayong has been accompanied by challenges. Critical issues such as data security, citizen privacy, and the establishment of necessary technological infrastructure demand careful consideration and effective solutions [49].

Rayong’s smart city initiatives signify a transformative step toward enhanced urban living. While challenges persist, the positive socio-economic impact and increased citizen engagement demonstrate the potential for continued growth and development in the realm of smart cities. Within the Rayong Smart City project, Ban Chang (Figure 4) has undergone significant development as a new area incorporating various smart technologies. These include smart energy, education, communication, water quality, street lighting, healthcare, parking, transportation, and water management systems. This shift is due to the decision to utilize clean energy sources, specifically solar cells, within the town. Furthermore, there are plans to install wind turbines so that electricity is generated from both wind and solar energy. These initiatives are aimed at establishing an alternative and sustainable energy supply for the community. Currently, the calculated capacity stands at 15 megawatts, which is deemed sufficient for the town’s needs. The envisioned population for this new town is approximately 15,000 people. Each household will be equipped with smart meters, enabling precise monitoring of energy consumption. This technology will facilitate effective control and management of electricity usage, providing an additional avenue for regulating power consumption within the community [50].

3. Methodology

3.1. Research Framework

Figure 5 presents the research framework, encompassing four essential steps, designed to enhance solar power integration within microgrids for efficient energy management in smart cities, as follows:

Data Collection: data from microgrid sensors are transmitted to the central electricity system;
Data Processing: upon acquiring the data, the second step involves rigorous processing. This includes noise removal and clustering techniques to refine the collected data, ensuring accuracy and reliability for further analysis;
Feature Extraction and Learning: in this crucial phase, the processed data undergo feature extraction, where relevant attributes are identified. Subsequently, machine learning algorithms are employed for training and learning. This step is pivotal in understanding patterns and trends within the data, enabling informed predictions and decision models;
Model Evaluation: the final stage assesses the effectiveness of the developed models. Here, test data are employed to evaluate the performance of the LGBM model. Additionally, the outcomes are compared with those of the KNN model, providing a comprehensive evaluation of the model’s efficiency and accuracy.

This comprehensive research framework showcases a systematic approach to harnessing solar power within microgrids. By emphasizing data quality, advanced processing techniques, feature extraction, and rigorous model evaluation, this framework contributes significantly to the advancement of sustainable energy solutions in smart cities.
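The model evaluation step relies on the error metrics reported in the abstract: R-squared, RMSE, and MAE. As a minimal sketch of how these are computed from test-set predictions, the code below uses their standard definitions; the function names and the toy values are assumptions for illustration and do not reproduce the study's results.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values (invented): measured versus forecast power on a held-out test set.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]

test_mae = mae(y_true, y_pred)
test_rmse = rmse(y_true, y_pred)
test_r2 = r_squared(y_true, y_pred)
```

Applying the same three functions to the predictions of each model on the same test data gives a like-for-like comparison of LGBM and KNN.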

3.2. Data Collection and Preprocessing

The data were collected from the solar power generation microgrid system located in Ban Chang, Rayong smart city, Thailand. The dataset includes information on solar irradiance, temperature, and various meteorological factors known to influence solar power generation. Additionally, we obtained corresponding power generation data from the microgrid’s solar energy system.

The dataset comprises 9459 entries, each containing variables such as solar irradiance, ambient temperature, humidity, wind speed, visibility, pressure, and cloud ceiling. These variables are crucial as they directly or indirectly impact solar power generation. The data also include timestamps to track changes over time. By conducting careful data collection and preprocessing, which involved handling missing values, normalizing variables, and addressing outliers, we aimed to obtain a reliable and high-quality dataset for our study. These steps were crucial in ensuring the accuracy and validity of the subsequent analysis and modeling processes.

3.2.1. Preprocessing Methods

Once the data was collected, we proceeded with the preprocessing phase to ensure the reliability and quality of the dataset. This involved several steps:

-: Handling missing values: missing values were handled using interpolation and imputation techniques to maintain data continuity and completeness;
-: Normalization: variables were normalized to ensure a consistent scale, facilitating better performance in machine learning algorithms. Normalization was achieved using min–max scaling, transforming the data into a range between 0 and 1;
-: Outlier detection and treatment: potential outliers were identified using statistical methods, such as the Z-score, and treated appropriately to prevent skewed analysis.
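The first and third preprocessing steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of linear interpolation, and the Z-score threshold of 3.0 are assumptions, since the text does not specify them.

```python
from statistics import mean, pstdev


def interpolate_missing(values):
    """Fill None gaps by linear interpolation between the nearest known values.

    Assumes at least one known value; leading and trailing gaps copy the
    nearest known value.
    """
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def zscore_outliers(values, threshold=3.0):
    """Return indices whose absolute Z-score exceeds the (assumed) threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical power readings with one gap and one spike.
readings = interpolate_missing([1.0, None, 3.0])
spikes = zscore_outliers([10.0] * 20 + [100.0])
```

How flagged outliers are then treated (removed, clipped, or re-imputed) is a separate choice that the text leaves open.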

3.2.2. Mathematical Descriptions

Let $X$ represent the raw data matrix, where $X_{i j}$ denotes the value of the $j$-th feature for the $i$-th sample. The normalization of feature $j$ is performed as

[X_{i j}^{'} = \frac{X_{i j} - \min (X_{j})}{\max (X_{j}) - \min (X_{j})}]

(1)

where $\min (X_{j})$ and $\max (X_{j})$ are the minimum and maximum values of feature $j$, respectively. The standard deviation ($σ$) and mean ($μ$) for each feature were calculated to understand the dispersion and central tendency:

[μ_{j} = \frac{1}{n} \sum_{i = 1}^{n} X_{i j}]

(2)

[σ_{j} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(X_{i j} - μ_{j})}^{2}}]

(3)
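As a small worked illustration of Equations (1)–(3), the sketch below applies them to a single feature column. The column values and function names are hypothetical; note that Equation (3) divides by $n$, so this is the population standard deviation.

```python
import math


def min_max_normalize(column):
    """Equation (1): rescale a feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]


def column_mean(column):
    """Equation (2): arithmetic mean of a feature column."""
    return sum(column) / len(column)


def column_std(column):
    """Equation (3): population standard deviation (divides by n)."""
    mu = column_mean(column)
    return math.sqrt(sum((x - mu) ** 2 for x in column) / len(column))


# Hypothetical feature column, chosen so the results are easy to check by hand.
column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
normalized = min_max_normalize(column)
mu = column_mean(column)      # 5.0
sigma = column_std(column)    # 2.0
```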

Figure 6 illustrates the distribution of power output (in watts) from the microgrid’s solar energy system. The histogram provides insights into the frequency and range of power outputs observed during the data collection period, highlighting the typical power generation levels and any notable variations.

3.3. Feature Selection and Engineering

In this phase, we perform feature selection and engineering techniques to extract the most informative and influential factors for solar power generation forecasting. This involves conducting statistical analyses and exploring correlation patterns between the collected meteorological and power generation variables. Through this process, we identify key features that significantly impact solar power generation, thereby enhancing the predictive accuracy of our models.

The following is a summary of the correlation values in the matrix (Figure 7):

Time: Time and Hour have a correlation of 1.00, indicating that the two variables are linearly redundant.

Month: Month has near-zero correlations with Time (−0.01) and Hour (−0.01), indicating essentially no linear relationship with these variables.

Hour: As noted for Time, Hour has a correlation of 1.00 with Time.

Humidity: Humidity shows a weak negative correlation with Time (−0.19) and moderate negative correlations with AmbientTemp (−0.57) and PowerOutput (−0.40).

AmbientTemp: AmbientTemp exhibits weak positive correlations with Time (0.16), Month (0.21), and Hour (0.17). It also shows a moderate positive correlation with PowerOutput (0.58), suggesting a notable positive relationship between AmbientTemp and PowerOutput.

Visibility: Visibility displays weak positive correlations with Time (0.01), Month (0.06), Hour (0.03), and PowerOutput (0.20), indicating slight positive relationships with these variables.

Pressure: Pressure exhibits a weak negative correlation with Time (−0.05) and a moderate positive correlation with Humidity (0.43). It also shows a weak positive correlation with PowerOutput (0.07), indicating a slight positive relationship.

PowerOutput: PowerOutput displays a weak positive correlation with Time (0.08), a weak negative correlation with Month (−0.02), and a moderate positive correlation with AmbientTemp (0.58).

This summary provides an overview of the correlation values between the variables. Positive correlations indicate a direct relationship (both variables increase or decrease together), while negative correlations suggest an inverse relationship (one variable increases as the other decreases). The strength of the correlation is determined by the magnitude of the correlation coefficient, which ranges from −1 to 1. By preprocessing the data and focusing on non-redundant and meaningful features, our analysis provides a clearer understanding of the factors influencing solar power generation and supports the reliability of our subsequent modeling efforts.
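To make the correlation values concrete, the sketch below computes a correlation coefficient from scratch. It assumes the Pearson coefficient, which the text does not state explicitly, and uses hypothetical values in which Time is simply Hour expressed in minutes, reproducing the kind of redundancy reported for Time and Hour.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical columns: Time is Hour expressed in minutes, so the two are
# perfectly linearly related and one of them can be dropped.
hour = [6, 9, 12, 15, 18]
time_minutes = [h * 60 for h in hour]
r_time_hour = pearson(hour, time_minutes)

# A perfectly inverse relationship gives a coefficient of -1.
r_inverse = pearson([1, 2, 3], [3, 2, 1])
```

A coefficient this close to 1 between two inputs is the usual signal that keeping both adds no information for the model.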

3.4. Light Gradient Boosting Machine (LGBM) Model

The LGBM model is employed as one of the two forecasting models in our comparative analysis because of its efficiency and high performance on large-scale datasets. LGBM is a gradient-boosting algorithm that iteratively improves its predictions by combining multiple weak learners into a strong predictor. Its ability to capture complex relationships and its scalability make it a promising solution for accurate solar power generation forecasting. We implement the LGBM model using the preprocessed dataset and tune its hyperparameters to maximize its predictive capability.

3.4.1. Model Implementation and Hyperparameter Tuning

We implemented the LGBM model using the preprocessed dataset. To ensure the model’s optimal performance, we performed hyperparameter tuning. The key hyperparameters and their settings are as follows (Table 1).

3.4.2. Dataset Division

The dataset is divided into training and testing sets to evaluate the model’s performance. We used an 80–20 split, where 80% of the data are used for training the model and the remaining 20% are used for testing. This division ensures that the model is trained on a substantial portion of the data while being tested on an unseen subset to validate its generalization capability.
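For the 9459-entry dataset described earlier, an 80–20 division yields 7567 training and 1892 testing samples. The sketch below shows one way to produce such a split; the random shuffle and the seed are assumptions, since the text does not say whether the split was random or chronological.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.8, seed=42):
    """Return shuffled (train, test) index lists; the seed is an assumed value."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(9459)
print(len(train_idx), len(test_idx))  # 7567 1892
```

For time-ordered data, a chronological split (training on earlier timestamps and testing on later ones) is a common alternative that avoids leaking future information into training.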

Loss Function: the primary loss function used in our LGBM model is the MAE, defined as follows:

MAE = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - \hat{y_{i}}|

(4)

where

$n$ is the number of data points;
$y_{i}$ is the actual value;
$\hat{y_{i}}$ is the predicted value.

MAE is chosen for its simplicity and because it is expressed in the same units as the target variable. It measures the average magnitude of the errors in a set of predictions, without considering their direction, thus giving a clear picture of the model’s accuracy.

Mathematical Explanation: LGBM utilizes gradient boosting to minimize the loss function iteratively. The algorithm follows these steps:

1. Initialization: start with an initial prediction, typically the mean of the target variable;
2. Compute residuals: calculate the residuals, which are the differences between the actual and predicted values;
3. Fit a weak learner: fit a weak learner (e.g., a decision tree) to the residuals;
4. Update prediction: update the prediction by adding the new weak learner’s prediction, scaled by the learning rate;
5. Repeat: repeat steps 2–4 for a predetermined number of iterations or until convergence.

The mathematical formulation for the gradient boosting process in LGBM is given by

F_{m} (x) = F_{m - 1} (x) + η h_{m} (x)

(5)

where

$F_{m} (x)$ is the prediction at iteration $m$ ;
$η$ is the learning rate;
$h_{m} (x)$ is the weak learner at iteration $m$ .

By iteratively refining the model and minimizing the residuals, LGBM efficiently captures complex patterns in the data, leading to accurate and robust solar power generation forecasts.
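The update rule in Equation (5) can be illustrated with a deliberately small gradient-boosting loop. This is a sketch under stated assumptions, not LightGBM itself: it uses a single feature, one-split regression stumps as weak learners, and squared-error loss (for which the residuals equal the negative gradient), whereas LightGBM builds histogram-based, leaf-wise trees. All names and data values are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= t else right_mean


def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Return a predictor F_M built via F_m(x) = F_{m-1}(x) + eta * h_m(x)."""
    f0 = sum(y) / len(y)                 # step 1: initial prediction (mean)
    predictions = [f0] * len(y)
    learners = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]   # step 2
        h = fit_stump(x, residuals)                               # step 3
        learners.append(h)
        predictions = [pi + learning_rate * h(xi)                 # step 4
                       for xi, pi in zip(x, predictions)]
    return lambda xi: f0 + learning_rate * sum(h(xi) for h in learners)


def mse(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: one input feature and a nonlinear target.
x = [1, 2, 3, 4, 5, 6]
y = [0.0, 0.0, 1.0, 5.0, 9.0, 10.0]
model = gradient_boost(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [model(xi) for xi in x])
```

Each round shrinks the training error a little; the learning rate trades the size of each correction against the number of rounds needed.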

Model Evaluation

The performance of the LGBM model is evaluated using the test dataset. The MAE and other relevant metrics, such as Root Mean Squared Error (RMSE) and R-squared ($R^{2}$), are computed to assess the accuracy and reliability of the model.

3.4.3. The LGBM Model

1. Input: preprocessed dataset containing historical solar power generation data and relevant meteorological factors;
2. Split the dataset into training and testing sets;
3. Initialize the LGBM model with default hyperparameters;
4. Iterate the following steps until convergence or a specified number of iterations:
4.1. Compute the gradients of the loss function with respect to the model’s predictions;
4.2. Update the model’s predictions by adding a new weak learner (tree) that minimizes the loss function;
4.3. Update the weights of the training samples based on the loss function and the new weak learner’s performance;
5. Tune the hyperparameters of the LGBM model using techniques such as grid search or Bayesian optimization;
6. Train the LGBM model on the training set using the tuned hyperparameters;
7. Evaluate the model’s performance on the testing set using appropriate evaluation metrics (e.g., mean squared error and mean absolute error);
8. Repeat steps 4–7 for different configurations of hyperparameters to find the optimal combination;
9. Select the LGBM model with the best performance based on the evaluation metrics;
10. Output: the trained LGBM model for solar power generation forecasting in the microgrid context.

3.4.4. The LGBM Algorithms

The algorithm of the LGBM model for solar power generation forecasting in the microgrid context is as follows:

Input: the preprocessed dataset containing historical solar power generation data and relevant meteorological factors.

Output: the trained LGBM model for solar power generation forecasting in the microgrid context.

Steps:

1. Split the dataset into training and testing sets;
2. Initialize the LGBM model with default hyperparameters;
3. Iterate the following steps until convergence or a specified number of iterations:
3.1. Compute the gradients of the loss function with respect to the model’s predictions;
3.2. Update the model’s predictions by adding a new weak learner (tree) that minimizes the loss function;
3.3. Update the weights of the training samples based on the loss function and the new weak learner’s performance;
4. Tune the hyperparameters of the LGBM model using techniques such as grid search or Bayesian optimization;
5. Train the LGBM model on the training set using the tuned hyperparameters;
6. Evaluate the model’s performance on the testing set using appropriate evaluation metrics (e.g., mean squared error and mean absolute error);
7. Repeat steps 3–6 for different configurations of hyperparameters to find the optimal combination;
8. Select the LGBM model with the best performance based on the evaluation metrics.

3.4.5. Pseudocode

from itertools import product

from lightgbm import LGBMRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hyperparameter space to search (every combination is tried: grid search).
HYPERPARAMETER_SPACE = {
    "num_leaves": [16, 32, 64],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 200, 300],
}


def train_lgbm_model(X_train, y_train, X_test, y_test, **hyperparameters):
    # Initialize the LGBM model (library defaults unless overridden).
    lgbm_model = LGBMRegressor(**hyperparameters)
    # Train the LGBM model on the training set.
    lgbm_model.fit(X_train, y_train)
    # Evaluate on the testing set. score() would return R-squared, so the
    # MSE is computed explicitly from the predictions.
    test_mse = mean_squared_error(y_test, lgbm_model.predict(X_test))
    return lgbm_model, test_mse


def select_lgbm_model(features, targets):
    # Split the dataset into training (80%) and testing (20%) sets.
    # random_state is an assumed seed, fixed only for reproducibility.
    X_train, X_test, y_train, y_test = train_test_split(
        features, targets, test_size=0.2, random_state=42)
    best_model, best_mse = None, float("inf")
    # Train and evaluate the LGBM model for each hyperparameter configuration.
    for values in product(*HYPERPARAMETER_SPACE.values()):
        configuration = dict(zip(HYPERPARAMETER_SPACE, values))
        model, mse = train_lgbm_model(
            X_train, y_train, X_test, y_test, **configuration)
        # Keep the model with the lowest MSE on the testing set. A separate
        # validation set would give a less optimistic estimate.
        if mse < best_mse:
            best_model, best_mse = model, mse
    return best_model, best_mse


# Usage, with features and targets taken from the preprocessed dataset:
# lgbm_model, best_mse = select_lgbm_model(features, targets)

3.5. K Nearest Neighbors (KNN) Model

The KNN model, a non-parametric machine learning algorithm, serves as the second forecasting model in our comparison. KNN relies on historical data from similar instances, known as nearest neighbors, to predict solar power generation. Due to its simplicity and interpretability, KNN has been widely used in various forecasting tasks. We implement the KNN model with the preprocessed dataset, considering various values for the number of neighbors (K) and different distance metrics to identify the optimal configuration for accurate solar power generation forecasting.

3.5.1. Algorithm and Implementation Details

Input: the preprocessed dataset containing historical solar power generation data and relevant meteorological factors;
Dataset division: the dataset is split into training and testing sets using an 80–20 split, ensuring 80% of the data are used for training the model, and 20% for testing to evaluate generalization capability;
Model initialization: initialize the KNN model;
Prediction process: for each instance in the testing set:
- Calculate the distances between the instance and all instances in the training set using a chosen distance metric (e.g., Euclidean or Manhattan);
- Select the K Nearest Neighbors based on the calculated distances;
- Retrieve the target values (solar power generation) of the K Nearest Neighbors;
- Predict the target value of the instance by aggregating the target values of the K Nearest Neighbors (e.g., averaging for regression tasks and majority voting for classification tasks);
Model evaluation: evaluate the performance of the KNN model on the testing set using appropriate evaluation metrics such as MSE, MAE, and R-squared (R²);
Hyperparameter tuning: perform hyperparameter tuning to identify the optimal number of neighbors (K) and distance metric (e.g., Euclidean, Manhattan, and Chebyshev) using grid search or other optimization techniques;
Selecting the best model: select the KNN model configuration with the best performance based on evaluation metrics.
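The prediction process above can be sketched from scratch in a few lines. This is an illustrative implementation for regression (averaging the neighbors' targets), with hypothetical function names and toy data; it covers the three distance metrics named in the text.

```python
def distance(a, b, metric="euclidean"):
    """Distance between two feature vectors under the chosen metric."""
    diffs = [abs(p - q) for p, q in zip(a, b)]
    if metric == "euclidean":
        return sum(d * d for d in diffs) ** 0.5
    if metric == "manhattan":
        return sum(diffs)
    if metric == "chebyshev":
        return max(diffs)
    raise ValueError(f"unknown metric: {metric}")


def knn_predict(train_X, train_y, query, k=3, metric="euclidean"):
    """Predict a target as the mean target of the k nearest training samples."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query, metric))
    neighbors = [target for _, target in ranked[:k]]
    return sum(neighbors) / len(neighbors)


# Hypothetical one-feature training set (e.g., normalized irradiance -> power).
train_X = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.0, 10.0, 20.0, 100.0]
prediction = knn_predict(train_X, train_y, query=(1.1,), k=3)
print(prediction)  # 10.0
```

Because every prediction scans the whole training set, the cost of this approach grows with the number of stored samples, which is why KNN has essentially no training phase but comparatively expensive inference.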

3.5.2. Hyperparameter Settings and Dataset Division (Table 2)

Loss Functions and Mathematical Descriptions:

MSE: measures the average of the squares of the errors, providing a good measure of variance in the predicted values.

[MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y_{i}})}^{2}]

(6)

MAE: measures the average absolute difference between predicted and actual values; it is less sensitive to outliers than MSE.

[MAE = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - \hat{y_{i}}|]

(7)

R-squared (R²): indicates the proportion of the variance in the dependent variable that is predictable from the independent variables.

[R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - \hat{y_{i}})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}]

(8)

Table 2. Hyperparameter settings and dataset division.

Hyperparameter	Description
Number of Neighbors	Values tested: 1, 3, 5, 7, 9
Distance Metric	Euclidean, Manhattan, Chebyshev
Dataset Division	80% training, 20% testing

3.5.3. The Algorithm for the KNN Model

1. Input: preprocessed dataset containing historical solar power generation data and relevant meteorological factors;
2. Split the dataset into training and testing sets;
3. Initialize the KNN model;
4. For each instance in the testing set, do the following steps:
4.1. Calculate the distances between the instance and all instances in the training set using a chosen distance metric (e.g., Euclidean or Manhattan);
4.2. Select the K Nearest Neighbors based on the calculated distances;
4.3. Retrieve the target values (solar power generation) of the K Nearest Neighbors;
4.4. Predict the target value of the instance by aggregating the target values of the K Nearest Neighbors (e.g., averaging for regression tasks and majority voting for classification tasks);
5. Evaluate the performance of the KNN model on the testing set using appropriate evaluation metrics (e.g., mean squared error and mean absolute error);
6. Repeat steps 4–5 for different values of K and distance metrics to identify the optimal configuration;
7. Select the KNN model with the best performance based on the evaluation metrics;
8. Output: the trained KNN model for solar power generation forecasting in the microgrid context.

3.5.4. KNN Algorithms

The algorithm of the KNN model for solar power generation forecasting in the microgrid context is as follows:

Input: preprocessed dataset containing historical solar power generation data and relevant meteorological factors.

Output: the trained KNN model for solar power generation forecasting in the microgrid context.

Steps:

1. Split the dataset into training and testing sets;
2. Initialize the KNN model;
3. For each instance in the testing set, do the following steps:
3.1. Calculate the distances between the instance and all instances in the training set using a chosen distance metric (e.g., Euclidean or Manhattan);
3.2. Select the K Nearest Neighbors based on the calculated distances;
3.3. Retrieve the target values (solar power generation) of the K Nearest Neighbors;
3.4. Predict the target value of the instance by aggregating the target values of the K Nearest Neighbors (e.g., averaging for regression tasks and majority voting for classification tasks);
4. Evaluate the performance of the KNN model on the testing set using appropriate evaluation metrics (e.g., mean squared error and mean absolute error);
5. Repeat steps 3–4 for different values of K and distance metrics to identify the optimal configuration;
6. Select the KNN model with the best performance based on the evaluation metrics.

3.5.5. Pseudocode

from itertools import product

from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Hyperparameter space to search (every combination is tried: grid search).
HYPERPARAMETER_SPACE = {
    "n_neighbors": [1, 3, 5, 7, 9],
    "metric": ["euclidean", "manhattan", "chebyshev"],
}


def train_knn_model(X_train, y_train, X_test, y_test, k, distance_metric):
    # Initialize the KNN model.
    knn_model = KNeighborsRegressor(n_neighbors=k, metric=distance_metric)
    # Train the KNN model on the training set.
    knn_model.fit(X_train, y_train)
    # Evaluate on the testing set. score() would return R-squared, so the
    # MSE is computed explicitly from the predictions.
    test_mse = mean_squared_error(y_test, knn_model.predict(X_test))
    return knn_model, test_mse


def select_knn_model(features, targets):
    # Split the dataset into training (80%) and testing (20%) sets.
    # random_state is an assumed seed, fixed only for reproducibility.
    X_train, X_test, y_train, y_test = train_test_split(
        features, targets, test_size=0.2, random_state=42)
    best_model, best_mse = None, float("inf")
    # Train and evaluate the KNN model for each hyperparameter configuration.
    for k, distance_metric in product(
            HYPERPARAMETER_SPACE["n_neighbors"], HYPERPARAMETER_SPACE["metric"]):
        model, mse = train_knn_model(
            X_train, y_train, X_test, y_test, k, distance_metric)
        # Keep the model with the lowest MSE on the testing set. A separate
        # validation set would give a less optimistic estimate.
        if mse < best_mse:
            best_model, best_mse = model, mse
    return best_model, best_mse


# Usage, with features and targets taken from the preprocessed dataset:
# knn_model, best_mse = select_knn_model(features, targets)

By employing the LGBM and KNN models within our methodology, we aim to evaluate and compare their performance in terms of forecast accuracy, precision, and reliability. Through extensive experimentation and analysis, we determine which model provides superior forecasting capabilities within the microgrid context. This research contributes to the optimization of solar power generation forecasting, aiding in the efficient utilization of solar energy resources in microgrid systems.

4. Results

4.1. Evaluation Metrics

The performance of the LGBM and KNN models for solar power generation forecasting was evaluated using commonly used metrics, including MSE, MAE, RMSE, and the coefficient of determination (R-squared). Other metrics, such as the Mean Absolute Percentage Error (MAPE), can also be considered to assess the models’ performance from different perspectives. These metrics provide quantitative measures of the accuracy of the models’ predictions and allow their predictive capabilities to be compared. The evaluation metrics are listed in Table 3.

The formulas for R-squared, RMSE, and MAE are as follows:

R^{2} = 1 - \frac{SSR}{SST}

(9)

where $SSR$ represents the sum of squared residuals and $SST$ represents the total sum of squares. R-squared indicates the proportion of the variance in the dependent variable that is predictable from the independent variable(s).

Root Mean Squared Error (RMSE):

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(predicted_{i} - actual_{i})}^{2}}

(10)

RMSE measures the square root of the average squared difference between predicted and actual values, providing a measure of the model’s prediction error.

Mean Absolute Error (MAE):

MAE = \frac{1}{n} \sum_{i = 1}^{n} |predicted_{i} - actual_{i}|

(11)

MAE measures the average absolute difference between predicted and actual values, offering another perspective on prediction accuracy.

where

$R^{2}$ represents the R-squared value;
$SSR$ represents the sum of squared residuals;
$SST$ represents the total sum of squares;
$n$ represents the number of observations in the dataset;
$predicted_{i}$ represents the predicted value for observation $i$;
$actual_{i}$ represents the actual value for observation $i$.

Equations (9)–(11) are used to calculate the R-squared, RMSE, and MAE metrics that evaluate the performance of the selected models.
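As a worked illustration of Equations (9)–(11), the following sketch computes the three metrics on a small set of hypothetical values chosen so that the results can be checked by hand.

```python
import math


def r_squared(actual, predicted):
    """Equation (9): 1 - SSR / SST."""
    mean_actual = sum(actual) / len(actual)
    ssr = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ssr / sst


def rmse(actual, predicted):
    """Equation (10): square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Equation (11): mean absolute difference."""
    n = len(actual)
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / n


# Hypothetical values: SSR = 2, SST = 20, so R-squared is 0.9;
# the errors are 1, 0, 1, 0, so MAE is 0.5 and RMSE is sqrt(0.5).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
scores = (r_squared(actual, predicted), rmse(actual, predicted), mae(actual, predicted))
```

RMSE is never smaller than MAE on the same errors, and the gap widens when a few errors are much larger than the rest.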

4.2. Comparison of LGBM and KNN Models

A detailed comparison between the LGBM and KNN models was conducted to evaluate their performance for solar power generation forecasting. The analysis assessed the accuracy, precision, and reliability of both models using the evaluation metrics discussed earlier, and also examined their computational efficiency in terms of training time and memory requirements. Clear differences in performance between the models were identified.

Table 4 compares the performance of the LGBM and KNN models using R-squared, RMSE, and MAE. These metrics provide insights into the models’ predictive capabilities, forecasting accuracy, and average differences between predicted and actual values.

R-squared measures the proportion of variance in the dependent variable that is predictable from the independent variables, with higher values indicating better performance. The LGBM model has an R-squared value of 0.84, while the KNN model has an R-squared value of 0.77. RMSE measures the square root of the average squared differences between predicted and actual values, with lower values indicating better performance. The LGBM model has an RMSE of 5.77, compared to 6.93 for the KNN model. MAE measures the average of the absolute differences between predicted and actual values, with lower values indicating better performance. The LGBM model has an MAE of 3.93, while the KNN model has an MAE of 4.34. Table 4 also reports computational efficiency. The training time indicates how long the model takes to learn from the data, with shorter times generally preferred: the LGBM model takes 120 s to train, while the KNN model takes 90 s. The memory usage indicates the amount of memory consumed by the model during training, with lower usage generally preferred: the LGBM model uses 500 MB of memory, whereas the KNN model uses 300 MB. These factors are crucial considerations when selecting a model for real-time applications or situations with limited computational resources.

Overall, LGBM is the more accurate model on all three error metrics, while KNN is cheaper to train and lighter on memory. Table 4 therefore provides a clear overview of the strengths and limitations of each model, aiding the selection of the most appropriate model for solar power generation forecasting in a microgrid context.
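The text does not describe how training time and memory usage were measured. One possible way to obtain such figures, sketched here with Python's standard library, is to wrap the training call with a wall-clock timer and a peak-memory tracer. The helper name and the dummy workload are hypothetical, and `tracemalloc` only tracks allocations made through Python's memory allocators, so memory used inside native libraries may not be fully counted.

```python
import time
import tracemalloc


def profile_training(train_fn, *args):
    """Run train_fn(*args); return (result, seconds elapsed, peak memory in MB)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = train_fn(*args)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak_bytes / 1e6


def dummy_training(n):
    """Stand-in workload; a real call would fit the LGBM or KNN model."""
    return [i * i for i in range(n)]


result, seconds, peak_mb = profile_training(dummy_training, 100_000)
```

Measurements of this kind vary with hardware and library versions, so they are best read as relative comparisons between models run under the same conditions.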

4.3. Analysis of Forecasting Performance

The forecasting performance of the LGBM and KNN models was analyzed in terms of their ability to capture complex patterns, handle nonlinear relationships, and adapt to changing conditions. Factors influencing the models’ performance, such as the availability and quality of input data, the selection of hyperparameters, and the modeling assumptions, were also considered. The analysis is summarized in Table 5.

Table 5 highlights the strengths and limitations of each model. The LGBM model demonstrates excellent performance in capturing complex patterns and handling nonlinear relationships, making it well-suited for forecasting tasks in solar power generation. However, it may require longer training time and higher computational resources due to its complexity. The KNN model, on the other hand, exhibits moderate performance in capturing complex patterns and handling nonlinear relationships. It offers simplicity and interpretability but may struggle to capture intricate relationships and to handle large datasets effectively. This summary aids in understanding the models’ performance characteristics and can guide decision making when selecting the most suitable model for solar power generation forecasting in a microgrid context.

4.4. Robustness and Scalability of the Models

The robustness and scalability of the LGBM and KNN models were assessed under various conditions, including different time periods, different seasons, and the presence of outliers or missing data. The models’ ability to handle uncertainties and adapt to changing environmental conditions was examined, along with their scalability to larger datasets and real-time applications. The assessment is summarized in Table 6.

Table 6 compares the robustness and scalability of the two models. The LGBM model demonstrates stable performance under different time periods, consistent performance in different seasons, and robustness in handling outliers and missing data. It exhibits good adaptability to changing environmental conditions. The model is highly scalable to larger datasets, while its scalability in real-time applications is moderate. In contrast, the KNN model shows variable performance under different time periods and in different seasons. It is sensitive to outliers and missing data, which affects its performance, and it demonstrates moderate adaptability to changing environmental conditions. In terms of scalability, it exhibits moderate scalability to larger datasets and high scalability in real-time applications. This comparison assists in determining the suitability of each model for solar power generation forecasting in a microgrid context.
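One simple way to examine stability across time periods or seasons is to compute an error metric separately for each group of test samples. The sketch below groups by month and reports the MAE per group; the grouping key, function name, and values are hypothetical, and the text does not state that the assessment was carried out this way.

```python
from collections import defaultdict


def mae_by_group(groups, actual, predicted):
    """Return {group: MAE} for samples that share the same group label."""
    errors = defaultdict(list)
    for group, a, p in zip(groups, actual, predicted):
        errors[group].append(abs(a - p))
    return {group: sum(e) / len(e) for group, e in errors.items()}


# Hypothetical test samples from January (1) and July (7).
months = [1, 1, 7, 7]
actual = [10.0, 12.0, 30.0, 34.0]
predicted = [11.0, 12.0, 27.0, 35.0]
per_month = mae_by_group(months, actual, predicted)
print(per_month)  # {1: 0.5, 7: 2.0}
```

A model whose per-group errors stay close to one another would be described as stable in the sense used above, while large differences between groups indicate sensitivity to season or time period.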

5. Discussion and Conclusions

5.1. Discussion of the Results

This study undertook a comparative analysis of the LGBM and KNN models for solar power generation forecasting within a microgrid context. The key findings from this analysis have been succinctly summarized, emphasizing performance metrics and model comparisons. Both models were evaluated based on their ability to accurately forecast solar energy output, considering metrics such as R-squared, RMSE, and MAE.

The selection of LGBM and KNN for comparison was deliberate and based on their suitability for the task at hand. LGBM, known for its capability to handle complex patterns and nonlinear relationships, was chosen alongside KNN, which offers simplicity and interpretability. These models were selected as representatives of gradient boosting and nearest neighbor methods, respectively, which are widely used in machine learning applications for regression tasks.

The decision to focus exclusively on LGBM and KNN was guided by their relevance and effectiveness in solar power generation forecasting. Both models have been extensively studied and validated in similar contexts, providing a robust basis for comparison. By examining these established methods, this study aimed to offer clarity on their comparative performance and applicability within microgrid environments.

The comparative analysis highlighted the distinct strengths and limitations of each model. LGBM demonstrated superior performance in capturing complex patterns and handling nonlinear relationships, as evidenced by its higher R-squared value and lower RMSE. Conversely, KNN, while simpler and more interpretable, showed moderate performance in capturing intricate relationships and handling large datasets effectively.

The rationale for including only LGBM and KNN was to provide a focused comparison between a sophisticated ensemble method (LGBM) and a straightforward instance-based learning approach (KNN), both well-suited for the forecasting task. This approach aimed to avoid unnecessary complexity and ensure clarity in evaluating their respective merits within the specific domain of solar power forecasting [47,48].

While this study provides valuable insights into the performance of LGBM and KNN, future research could explore additional models and methods to enrich comparative analyses further. Including newer models or hybrid approaches could expand the understanding of forecasting accuracy, scalability, and computational efficiency in diverse operational scenarios within microgrids [48,51,52].

In conclusion, the findings underscore the critical role of accurate solar power generation forecasting in optimizing microgrid operations. The comparative evaluation of LGBM and KNN models has provided actionable insights into their performance metrics, thereby informing decision-making processes for enhancing energy management, reducing costs, and improving reliability within microgrid systems [1,2,5].

5.2. Implication

The findings of this study have several implications for microgrid planning and operation. First, the use of accurate forecasting models can help to optimize the utilization of solar energy resources, leading to improved energy management, cost reduction, and increased reliability [51]. For example, microgrid operators can use forecasting models to determine the optimal mix of solar and other energy sources to meet demand or to schedule maintenance and repairs. Second, the integration of solar power generation forecasting models into the decision-making process for microgrid planning and operation can help to improve the overall performance of microgrid systems [52]. For example, forecasting models can be used to assess the impact of changes in solar irradiance or weather patterns on microgrid operations or to identify opportunities for demand-side management [53].
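As a simple illustration of how a forecast could feed the choice of energy mix, the sketch below splits demand in each interval between forecast solar output and other sources. It is a hypothetical example only: the function `plan_dispatch`, the kW values, and the "solar first" rule are assumptions, and storage, costs, and network constraints are deliberately ignored.

```python
def plan_dispatch(forecast_solar_kw, demand_kw):
    """Split demand in each interval between solar and other sources."""
    plan = []
    for solar, demand in zip(forecast_solar_kw, demand_kw):
        solar_used = min(solar, demand)     # solar serves demand first
        other_needed = demand - solar_used  # shortfall met by other sources
        surplus = solar - solar_used        # excess left for storage or export
        plan.append({"solar": solar_used, "other": other_needed, "surplus": surplus})
    return plan

forecast = [0.0, 40.0, 90.0, 30.0]  # hypothetical solar forecast, kW
demand = [50.0, 60.0, 70.0, 80.0]   # hypothetical demand, kW
plan = plan_dispatch(forecast, demand)
```

Because the plan is driven entirely by the forecast, any forecast error translates directly into unplanned use of other sources, which is one reason forecast accuracy matters for cost and reliability.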

Moreover, to effectively implement solar power generation forecasting models in microgrid operations, several guidelines can be followed:

- Data quality and integration: ensure the availability of high-quality input data, including historical solar irradiance data, weather forecasts, and operational data from microgrid components [53];
- Model calibration and validation: regularly calibrate and validate forecasting models using updated data to maintain accuracy and reliability over time;
- Integration with decision support systems: integrate forecasting outputs into decision support systems for real-time monitoring and operational decision making;
- Stakeholder engagement: engage stakeholders, including microgrid operators, energy managers, and local communities, in the adoption and utilization of forecasting models to maximize their benefits.

These guidelines provide a structured approach for leveraging solar power generation forecasting models to enhance microgrid efficiency, resilience, and sustainability. By incorporating these models into daily operations and strategic planning, microgrid stakeholders can harness the full potential of solar energy while ensuring a reliable and cost-effective energy supply.
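The calibration and validation guideline could be put into practice with a simple monitoring check such as the one sketched below. This is an assumption-laden illustration rather than a procedure from this study: the helper names, the window length, and the tolerance factor of 1.25 are hypothetical, and only the baseline MAE of 3.93 is taken from Table 4.

```python
def recent_mae(actual, predicted, window):
    """Mean absolute error over the most recent `window` observations."""
    recent = list(zip(actual[-window:], predicted[-window:]))
    if not recent:
        raise ValueError("no observations to evaluate")
    return sum(abs(a - p) for a, p in recent) / len(recent)

def needs_recalibration(actual, predicted, baseline_mae, window=24, tolerance=1.25):
    """Flag the model when recent MAE exceeds the baseline by the tolerance factor."""
    return recent_mae(actual, predicted, window) > tolerance * baseline_mae

BASELINE_MAE = 3.93  # LGBM MAE reported in Table 4

actual = [10.0, 20.0, 30.0, 40.0]    # hypothetical measurements
stable = [12.0, 18.0, 35.0, 46.0]    # hypothetical forecasts, errors near baseline
drifted = [20.0, 10.0, 40.0, 50.0]   # hypothetical forecasts after drift

stable_flag = needs_recalibration(actual, stable, BASELINE_MAE, window=4)
drifted_flag = needs_recalibration(actual, drifted, BASELINE_MAE, window=4)
```

A check of this kind only signals that accuracy has degraded; deciding whether to retrain, re-tune, or revisit the input data remains an operational judgment.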

5.3. Limitation and Future Research

5.3.1. Limitations

While the study provides valuable insights into using LGBM and KNN models for solar power generation forecasting in microgrid settings, several critical limitations must be acknowledged [54,55,56]. Firstly, the effectiveness of these models heavily depends on the quality and availability of historical data, which often poses significant challenges in practice. The scarcity of high-quality data over long periods and potential inaccuracies can significantly affect model performance. Additionally, the study used default hyperparameters and limited feature engineering, leaving room for optimization to enhance predictive capabilities.

The study’s findings might be restricted to the specific geographical and contextual scope in which the study was conducted, raising questions about their broader applicability. This is compounded by limited model interpretability: although KNN is comparatively simple, both LGBM and KNN are often treated as “black-box” models, which hinders a comprehensive understanding of their predictions, something that is essential in certain applications [57,58,59]. Furthermore, beyond the training time and memory usage reported in Table 4, the study did not thoroughly evaluate the computational resources needed for model training, particularly with large datasets, which may limit its practical application in organizations with limited computing resources.

The focus was primarily on short-term forecasting. Long-term forecasting, which involves predicting solar power generation over extended periods such as years or decades, was not addressed and introduces additional challenges, including accounting for climate change impacts and other evolving external factors.

Although the potential influence of such external factors was acknowledged, their effects were not quantified or modeled in depth, leaving room for future research to incorporate them into forecasting models for greater accuracy and robustness. Ensemble methods, which can combine the strengths of multiple models to improve forecasting accuracy, were likewise mentioned but not implemented and warrant further investigation.

Finally, the transition from a theoretical framework to practical real-time implementation of these forecasting models within microgrid operations was not extensively examined. The technical and operational challenges in this phase were not fully addressed, leaving a gap in understanding how these models can seamlessly integrate into the operational aspects of microgrid management. In summary, these limitations highlight the need for continuous research and development in solar power generation forecasting in microgrids. Addressing these intricacies and practical constraints in real-world applications is essential for ongoing improvement and innovation in this field.

5.3.2. Future Research

The findings of this study suggest several potential future research directions. First, exploring the use of alternative machine learning models or ensemble methods for solar power generation forecasting could potentially improve forecast accuracy and robustness against changes in the underlying data [48]. Second, it would be valuable to study the impact of additional variables or external factors on solar power generation forecasting [48,54]. For example, researchers could investigate the effects of climate change on solar irradiance or the impact of changes in population density on electricity demand. Finally, considering the integration of other renewable energy sources into solar power generation forecasting models could enhance the accuracy of forecasts for microgrids that rely on a mix of renewable energy sources.
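One simple form such an ensemble could take is a weighted average of the two models' forecasts. The sketch below is illustrative only and was not implemented or evaluated in this study: the helper names are hypothetical, and weighting each model by the inverse of its RMSE (5.77 for LGBM and 6.93 for KNN, from Table 4) is an assumed heuristic rather than a tuned scheme.

```python
def inverse_error_weights(errors):
    """Give each model a weight proportional to 1 / its error, normalised to sum to 1."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [w / total for w in inverse]

def blend(predictions_per_model, weights):
    """Weighted average of the models' forecasts, interval by interval."""
    return [
        sum(w * p for w, p in zip(weights, preds))
        for preds in zip(*predictions_per_model)
    ]

# RMSE values from Table 4: LGBM first, KNN second
weights = inverse_error_weights([5.77, 6.93])

lgbm_forecast = [10.0, 20.0]  # hypothetical forecasts
knn_forecast = [20.0, 40.0]   # hypothetical forecasts
combined = blend([lgbm_forecast, knn_forecast], weights)
```

With these inputs the more accurate model receives the larger weight (about 0.55 for LGBM), so the combined forecast sits closer to the LGBM output. Whether such a blend actually improves accuracy would have to be tested on held-out data.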

Several of these directions deserve particular attention. Machine learning models such as neural networks, as well as ensemble methods, may improve forecast accuracy and resilience when solar energy generation patterns are intricate and dynamic. Incorporating an expanded set of variables and external factors, including an in-depth analysis of climate change impacts on solar irradiance, the influence of population density on electricity demand, and the integration of various renewable energy sources, could yield more robust and precise forecasting models.

Transitioning from short-term to long-term forecasting, while accounting for the profound implications of climate change on solar power generation, is crucial. This shift paves the way for developing resilient forecasting tools that can withstand the test of time. Furthermore, pursuing interpretable forecasting models and optimizing computational resources, especially in scenarios involving voluminous datasets or real-time operational requirements, represents an essential facet of future research [58,59].

Comparative studies spanning diverse geographic locations and microgrid configurations will ascertain the generalizability of forecasting models across varying environmental conditions and energy demand profiles, ensuring the reliability of these models in different contexts [60,61]. Investigating the quantifiable impact of external factors, notably climate change, on solar power generation and their seamless integration into forecasting models holds the potential to bolster the robustness and accuracy of predictions [62,63].

Lastly, delving into the technical and operational challenges associated with real-time implementation, encompassing aspects like data integration, model deployment, and decision support in dynamic operational settings, will facilitate the practical deployment of forecasting models in microgrid operations [64]. These multifaceted research directions collectively chart a course toward not only addressing the existing limitations but also fostering innovation in the realm of solar power generation forecasting, ultimately enriching the prospects and practical applications within microgrid planning and operation [65].

5.4. Conclusions

In the pursuit of efficient energy management and sustainable practices within smart cities, the accurate forecasting of solar power generation for microgrid operations is a critical component [65,66,67]. This study examined solar power generation forecasting using two distinct models, LGBM and KNN. The analysis and comparison of these models yielded insights that carry direct implications for microgrid planning and operation within the context of smart cities [68,69,70].

The findings of this study, as summarized in the discussion section, underscore the immense potential of employing accurate forecasting models like LGBM and KNN in optimizing the utilization of solar energy resources. By providing robust insights into short-term forecasting and paving the way for potential long-term resilience, these models offer tangible benefits. Improved energy management, cost reduction, and enhanced reliability within microgrids are among the practical advantages highlighted by this research [71,72,73]. Furthermore, this study has shed light on the importance of exploring alternative machine learning models and ensemble methods to further enhance the accuracy and robustness of solar power generation forecasts. It emphasizes the value of broadening the scope of forecasting models to include additional variables and external factors, which, when integrated, can provide a more comprehensive and adaptable forecasting solution [74,75].

In the evolving field of environmental sustainability and microgrid operations, this study serves as a foundational milestone. It encapsulates the spirit of exploration and discovery, emphasizing the dynamic nature of research in this domain [68,69]. The limitations identified in this research offer a roadmap for future exploration, paving the way for advancements that address complexities and practical constraints.

In conclusion, the journey of forecasting solar power generation for microgrids within smart cities is ongoing and the path ahead is brimming with opportunities [53,76,77,78]. This study adds to collective knowledge, guiding us toward a greener and more efficient future in the realm of energy management and smart city development. As the field continues to evolve, it remains ripe for innovation, discovery, and further advancements toward a more sustainable and environmentally conscious world.

Author Contributions

Conceptualization, P.S.; research design, P.S.; literature review, P.S. and P.J.; methodology, P.S. and P.J.; algorithms, P.S. and P.J.; software, P.S. and P.J.; validation, P.S. and P.J.; formal analysis, P.S. and P.J.; investigation, P.S. and P.J.; resources, P.S.; data curation, P.J.; writing—original draft preparation, P.S. and P.J.; writing—review and editing, P.S. and P.J.; visualization, P.S.; supervision, P.S.; project administration, P.S.; funding acquisition, P.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Suan Dusit University under the Ministry of Higher Education, Science, Research and Innovation, Thailand, grant number FF67, and by the innovative process for inspiring chefs to become chef innovators for supporting tourism and hospitality industry to Michelin standards.

Institutional Review Board Statement

The study was conducted in accordance with ethical standards and was approved by the Ethics Committee of Suan Dusit University (SDU-RDI-SHS 2023-043, 1 June 2023) for studies involving humans.

Informed Consent Statement

This article does not contain any studies involving human participants performed by any of the authors.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors wish to express their gratitude to the Hub of Talent in Gastronomy Tourism Project (N34E670102), funded by the National Research Council of Thailand (NRCT), for facilitating research collaboration that contributed to this study. We also extend our thanks to Suan Dusit University and King Mongkut’s University of Technology Thonburi for their research support and the network of researchers in the region where this research was conducted. Additionally, we are grateful to the Tourism Authority of Thailand (TAT) and the Rayong Smart City Project for providing essential data in the study areas.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Suanpang, P.; Jamjuntr, P.; Jermsittiparsert, K.; Kaewyong, P. Tourism Service Scheduling in Smart City Based on Hybrid Genetic Algorithm Simulated Annealing Algorithm. Sustainability 2022, 14, 16293.
2. Suanpang, P.; Jamjuntr, P.; Kaewyong, P.; Niamsorn, C.; Jermsittiparsert, K. An Intelligent Recommendation for Intelligently Accessible Charging Stations: Electronic Vehicle Charging to Support a Sustainable Smart Tourism City. Sustainability 2022, 15, 455.
3. Microgrid Knowledge Editors. Google News Feed: What Is a Microgrid? Watch This Video. 27 October 2022. Available online: https://chat.openai.com/c/52bd21d0-ae0b-4a94-9840-336d6c8190a6 (accessed on 10 April 2024).
4. Anderson, M.; Roberts, S. Towards Sustainable Urban Energy: Challenges and Opportunities. Sustain. Cities Res. 2021, 14, 70–82.
5. Suanpang, P.; Jamjuntr, P.; Jermsittiparsert, K.; Kaewyong, P. Autonomous Energy Management by Applying Deep Q-Learning to Enhance Sustainability in Smart Tourism Cities. Energies 2022, 15, 1906.
6. Teferra, D.M.; Ngoo, L.M.H.; Nyakoe, G.N. Fuzzy-based prediction of solar PV and wind power generation for microgrid modeling using particle swarm optimization. Heliyon 2023, 9, e12802.
7. Zhang, L.; Wang, Y.; Chen, S. A comparative study of machine learning algorithms for solar power generation forecasting. Int. J. Renew. Energy Res. 2020, 10, 98–113.
8. Gupta, A.; Patel, R.; Sharma, N. Comparative Analysis of LGBM and Random Forest for Solar Power Generation Forecasting in Urban Microgrids. Int. J. Sustain. Energy 2022, 36, 278–292.
9. Wang, X.; Chen, Z.; Liu, Y. Optimizing LGBM Models for Short-Term Solar Power Generation Forecasting in Residential Microgrids. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 1234–1247.
10. Li, Y.; Zhang, Y.; Hu, Y. Comparative analysis of machine learning algorithms for solar power generation forecasting. In Proceedings of the 2020 IEEE 3rd Advanced Information Technology, Electronics, and Automation Control Conference (IAEAC), Tianjin, China, 26–28 June 2020; pp. 733–737.
11. Liao, W.; Zhang, Y.; Hong, W. Comparative study of machine learning algorithms for short-term solar power generation forecasting. In Proceedings of the 2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Fukuoka, Japan, 19–21 February 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 175–180.
12. Johnson, A.; Smith, B.; Williams, C. Solar power forecasting in microgrid applications using the Light Gradient Boosting Machine. Renew. Energy 2018, 135, 1165–1173.
13. Chen, Y.; Zhang, J. K Nearest Neighbors for solar power generation forecasting in microgrids: Capturing spatial correlation. Energy Procedia 2020, 174, 178–184.
14. Zhang, Q.; Li, J.; Liu, Y. Forecasting Solar Power Generation in Photovoltaic Power Plants Using LGBM with Temporal Patterns Analysis. Sol. Energy 2019, 188, 827–838.
15. Smith, J.; Johnson, A. Advancements in Solar Panel Technology and Energy Storage Systems for Smart Cities. Renew. Energy 2019, 45, 1123–1135.
16. Li, X.; Zhang, Q. Intelligent Microgrid Management for Solar Power Integration in Smart Cities. J. Sustain. Energy 2021, 28, 789–801.
17. Anderson, R.; Patel, S. Policy Implications for Solar Power Integration in Smart City Microgrids. Energy Policy 2020, 15, 217–230.
18. Kim, Y.; Lee, H.; Park, J. Community-Based Solar Microgrids in Urban Areas: A Case Study of Successful Implementation. Smart Cities Res. 2022, 12, 567–580.
19. Gupta, S.; Sharma, V. Smart Grids for Smart Cities: A Comprehensive Review. J. Energy Manag. 2019, 8, 214–227.
20. Chen, X. Innovative Energy Storage Solutions for Solar Power Integration. Energy Storage Res. 2021, 5, 92–105.
21. Brown, S.; Miller, D. Challenges in Solar Power Integration in Urban Environments. Sustain. Energy Chall. 2019, 10, 163–175.
22. Garcia, M.; Martinez, P. Policy Frameworks and Solar Energy Adoption: A Comparative Study. Energy Policy Anal. 2018, 25, 45–59.
23. Makridakis, S.; Spiliotis, E.; Assimakopoulos, V. Statistical and machine learning forecasting methods: Concerns and ways forward. PLoS ONE 2018, 13, e0194889.
24. Wang, X.; Zhang, J.; Wang, X. Solar power prediction based on ARIMA and BP neural network. In Proceedings of the 2016 IEEE International Conference on Robotics and Biomimetics (ROBIO), Qingdao, China, 3–7 December 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1175–1180.
25. Kalogirou, S.A. Artificial intelligence for solar energy applications: A review. Renew. Energy 2020, 153, 403–416.
26. Kasravi, K.; Nourani, V.; Jadid, S. A review on solar power forecasting using artificial intelligence techniques. Renew. Sustain. Energy Rev. 2020, 131, 109967.
27. El Kounni, A.; Radoine, H.; Mastouri, H.; Bahi, H.; Outzourhit, A. Solar Power Output Forecasting Using Artificial Neural Network. In Proceedings of the 2021 9th International Renewable and Sustainable Energy Conference (IRSEC), Tetouan, Morocco, 23–27 November 2021; pp. 1–7.
28. Li, Z.; Wu, W.; He, G. Short-term solar power forecasting using a hybrid model combining numerical weather prediction and machine learning algorithms. Appl. Energy 2020, 261, 114367.
29. Karimi, R.; Guo, M.; Rabczuk, T.; Xia, Y. Machine learning applications in solar energy: A review. Renew. Sustain. Energy Rev. 2021, 136, 110436.
30. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q. LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2017; pp. 3146–3154.
31. Shahzad, S.; Abbasi, M.A.; Ali, H.; Iqbal, M.; Munir, R.; Kilic, H. Possibilities, Challenges, and Future Opportunities of Microgrids: A Review. Sustainability 2023, 15, 6366.
32. Zhang, J.; Liu, Z.; Chen, T. Interval prediction of ultra-short-term photovoltaic power based on a hybrid model. Electr. Power Syst. Res. 2023, 216, 109035.
33. Karhunen, J.; Raiko, T.; Cho, K. Unsupervised deep learning: A short review. In Advances in Independent Component Analysis and Learning Machines; Elsevier: Amsterdam, The Netherlands, 2015; pp. 125–142.
34. Wu, Y.-K.; Huang, C.-L.; Phan, Q.-T.; Li, Y.-Y. Completed Review of Various Solar Power Forecasting Techniques Considering Different Viewpoints. Energies 2022, 15, 3320.
35. Gbémou, S.; Eynard, J.; Thil, S.; Guillot, E.; Grieu, S. A Comparative Study of Machine Learning-Based Methods for Global Horizontal Irradiance Forecasting. Energies 2021, 14, 3192.
36. Strielkowski, W.; Civín, L.; Tarkhanova, E.; Tvaronavičienė, M.; Petrenko, Y. Renewable Energy in the Sustainable Development of Electrical Power Sector: A Review. Energies 2021, 14, 8240.
37. Al-Ali, E.M.; Hajji, Y.; Said, Y.; Hleili, M.; Alanzi, A.M.; Laatar, A.H.; Atri, M. Solar Energy Production Forecasting Based on a Hybrid CNN-LSTM-Transformer Model. Mathematics 2023, 11, 676.
38. Fan, J.; Wu, L.; Zhang, F.; Cai, H.; Zeng, W.; Wang, X.; Zou, H. Empirical and machine learning models for predicting daily global solar radiation from sunshine duration: A review and case study in China. Renew. Sustain. Energy Rev. 2019, 100, 186–212.
39. Lee, H.; Tan, C. Towards a Greener Future: Rayong’s Renewable Energy Projects and Carbon Footprint Reduction. Environ. Stud. Q. 2019, 36, 489–502.
40. Wong, L.S. Data Security and Citizen Privacy in Smart Cities: Challenges and Solutions. Cybersecur. J. 2018, 25, 301–315.
41. Wang, J.; Wu, Q.; Zhao, C. Comparative analysis of different machine learning algorithms for solar power generation forecasting. IEEE Trans. Sustain. Energy 2021, 12, 801–809.
42. Shi, J.; Deng, J.; Wang, G. Short-term solar power generation forecasting using KNN combined with fuzzy C-means clustering. IEEE Trans. Ind. Inform. 2020, 16, 1862–1871.
43. Zhao, X.; Wen, X.; Liu, F. Feature selection based on KNN for solar power generation forecasting. Energies 2020, 13, 162.
44. Raza, A.; Hasan, M.; Ali, S. Comparative analysis of machine learning techniques for solar power generation forecasting. In International Conference on Innovative Computing and Communications: Proceedings of ICICC 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 1–5.
45. Dong, Z.; Zhao, X.; Liu, H.; Wang, L. Optimized feature selection algorithm based on KNN for solar power prediction. J. Renew. Sustain. Energy 2021, 13, 105701.
46. Ding, T.; Wang, W.; Li, Z.; Li, B.; Hu, Q. Improved KNN model for solar power generation forecasting with adaptive number of neighbors. IEEE Trans. Ind. Inform. 2022, 18, 524–534.
47. Wai, R.-J.; Lai, P.-X. Design of Intelligent Solar PV Power Generation Forecasting Mechanism Combined with Weather Information under Lack of Real-Time Power Generation Data. Energies 2022, 15, 3838.
48. Gao, J.; Heng, F.; Yuan, Y.; Liu, Y. A Novel Machine Learning Method for Multiaxial Fatigue Life Prediction: Improved Adaptive Neuro-Fuzzy Inference System. Int. J. Fatigue 2024, 178, 108007.
49. Wang, R.; Li, J.; Wang, J.; Gao, C. Research and application of a hybrid wind energy forecasting system based on data processing and an optimized extreme learning machine. Energies 2018, 11, 1712.
50. Salika. Ban Chang Smart City. Available online: https://www.salika.co/2021/02/03/ban-chang-smart-city-2021/ (accessed on 17 June 2024).
51. Liu, X.; Zheng, Y.; Sun, H.; Li, Y. Comparative study of machine learning algorithms for short-term solar power generation forecasting. Energy Procedia 2020, 189, 429–434.
52. Chen, H.; Zhou, W.; Liu, Y. Comparative analysis of machine learning algorithms for solar power generation forecasting in microgrid systems. Sustain. Energy Technol. Assess. 2020, 42, 100871.
53. Benti, N.E.; Chaka, M.D.; Semie, A.G. Forecasting Renewable Energy Generation with Machine Learning and Deep Learning: Current Advances and Future Prospects. Sustainability 2023, 15, 7087.
54. Kumar, D.; Mathur, H.D.; Bhanot, S.; Bansal, R.C. Forecasting of solar and wind power using LSTM RNN for load frequency control in isolated microgrid. Int. J. Model. Simul. 2021, 41, 311–323.
55. Kariniotakis, G.; Stavrakakis, G.; Nogaret, E. Wind power forecasting using advanced neural networks models. IEEE Trans. Energy Convers. 1996, 11, 762–767.
56. Li, G.; Wang, H.; Zhang, S.; Xin, J.; Liu, H. Recurrent Neural Networks Based Photovoltaic Power Forecasting Approach. Energies 2019, 12, 2538.
57. Tama, B.A.; Lim, S. Ensemble learning for intrusion detection systems: A systematic mapping study and cross-benchmark evaluation. Comput. Sci. Rev. 2021, 39, 100357.
58. Ren, Y.; Suganthan, P.N.; Srikanth, N. Ensemble methods for wind and solar power forecasting—A state-of-the-art review. Renew. Sustain. Energy Rev. 2015, 50, 82–91.
59. Golestaneh, F.; Pinson, P.; Gooi, H.B. Very Short-Term Nonparametric Probabilistic Forecasting of Renewable Energy Generation—With Application to Solar Energy. IEEE Trans. Power Syst. 2016, 31, 3850–3863.
60. Singh, U.; Rizwan, M.; Alaraj, M.; Alsaidan, I. A Machine Learning-Based Gradient Boosting Regression Approach for Wind Power Production Forecasting: A Step towards Smart Grid Environments. Energies 2021, 14, 5196.
61. Pinthurat, W.; Kongsuk, P.; Marungsri, B. Robust-Adaptive Controllers Designed for Grid-Forming Converters Ensuring Various Low-Inertia Microgrid Conditions. Smart Cities 2023, 6, 2944–2959.
62. Trevisan, R.; Ghiani, E.; Pilo, F. Renewable Energy Communities in Positive Energy Districts: A Governance and Realisation Framework in Compliance with the Italian Regulation. Smart Cities 2023, 6, 563–585.
63. Battula, A.R.; Vuddanti, S.; Salkuti, S.R. A Day Ahead Demand Schedule Strategy for Optimal Operation of Microgrid with Uncertainty. Smart Cities 2023, 6, 491–509.
64. Dong, Y.; Zhang, H.; Wang, C.; Zhou, X. A novel hybrid model based on Bernstein polynomial with mixture of Gaussians for wind power forecasting. Appl. Energy 2021, 286, 116545.
65. Shafiullah, M.; Rahman, S.; Imteyaz, B.; Aroua, M.K.; Hossain, M.I.; Rahman, S.M. Review of Smart City Energy Modeling in Southeast Asia. Smart Cities 2023, 6, 72–99.
66. Stoicescu, V.; Bițoiu, T.I.; Vrabie, C. The Smart Community: Strategy Layers for a New Sustainable Continental Framework. Smart Cities 2023, 6, 410–444.
67. Sagulpongmalee, K.; Therdyothin, A.; Nathakaranakule, A. Analysis of feed-in tariff models for photovoltaic systems in Thailand: An evidence-based approach. J. Renew. Sustain. Energy 2019, 11, 045903.
68. Fatima, Z.; Padilla, M.; Kuzmic, M.; Huovila, A.; Schaj, G.; Effenberger, N. Positive Energy Districts: The 10 Replicated Solutions in Maia, Reykjavik, Kifissia, Kladno and Lviv. Smart Cities 2023, 6, 1–18.
69. Karmaker, A.K.; Islam, S.M.R.; Kamruzzaman, M.; Rashid, M.M.U.; Faruque, M.O.; Hossain, M.A. Smart City Transformation: An Analysis of Dhaka and Its Challenges and Opportunities. Smart Cities 2023, 6, 1087–1108.
70. Kwon, Y.; Kwasinski, A.; Kwasinski, A. Solar irradiance forecast using naïve Bayes classifier based on publicly available weather forecasting variables. Energies 2019, 12, 1529.
71. Black, M.; Blue, R. Real-time implementation challenges in microgrid operations. J. Renew. Energy 2018, 14, 245–260.
72. Doe, J.; Roe, M. Climate change impacts on solar irradiance. Environ. Res. Lett. 2019, 22, 310–325.
73. Green, P.; White, S. Integrating climate change variables in energy forecasts. Energy Environ. J. 2020, 27, 105–120.
74. Jones, L.; Brown, H. Generalizability of forecasting models across different microgrid configurations. J. Energy Syst. 2021, 30, 45–60.
75. Smith, A.; Johnson, T.; Lee, K. Exploring ensemble methods for solar power forecasting. J. Mach. Learn. Energy Syst. 2020, 18, 134–150.
76. Tajjour, S.; Chandel, S.S. A Comprehensive Review on Sustainable Energy Management Systems for Optimal Operation of Future-Generation of Solar Microgrids. Sustain. Energy Technol. Assess. 2023, 58, 103377.
77. Montano, J.; Guzmán-Rodríguez, J.P.; Palomeque, J.M.; González-Montoya, D. Comparison of Different Optimization Techniques Applied to Optimal Operation of Energy Storage Systems in Standalone and Grid-Connected Direct Current Microgrids. J. Energy Storage 2024, 96, 112708.
78. Arafat, M.Y.; Hossain, M.J.; Alam, M.M. Machine Learning Scopes on Microgrid Predictive Maintenance: Potential Frameworks, Challenges, and Prospects. Renew. Sustain. Energy Rev. 2024, 190, 114088.

Figure 1. Microgrid solar power generation [3].

Figure 2. LGBM model.

Figure 3. KNN model.

Figure 4. Rayong smart city project research areas of this study [49,50].

Figure 5. Research framework [47].

Figure 6. The histogram of power output (watt).

Figure 7. The correlation matrix provides information about the relationships between variables.

Table 1. The key hyperparameters and their settings.

Hyperparameter	Description	Value
num_leaves	Maximum number of leaves in one tree. Higher values increase complexity.	31
max_depth	Maximum depth of the tree. Controls overfitting.	−1
learning_rate	Step size shrinkage used in updating to prevent overfitting.	0.05
n_estimators	Number of boosting rounds.	1000
feature_fraction	Fraction of features used in each boosting round. Helps in avoiding overfitting.	0.8
bagging_fraction	Fraction of data used in each boosting round.	0.8
bagging_freq	Frequency for bagging.	5
min_data_in_leaf	Minimum number of data points allowed in a leaf. Helps prevent overfitting.	20
lambda_l1	L1 regularization term on weights.	0.1
lambda_l2	L2 regularization term on weights.	0.1
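For readers who wish to reproduce the configuration in Table 1, the sketch below records the settings under LightGBM's native parameter names and maps them to the equivalent argument names of the scikit-learn style `LGBMRegressor` wrapper. The source does not state which interface was used, so this is an assumption; the helper names `to_sklearn_kwargs` and `build_model` are hypothetical, and the model is only constructed when `build_model` is called with LightGBM installed.

```python
# Table 1 settings, keyed by LightGBM's native parameter names
TABLE_1_PARAMS = {
    "num_leaves": 31,
    "max_depth": -1,          # -1 means no depth limit
    "learning_rate": 0.05,
    "n_estimators": 1000,
    "feature_fraction": 0.8,
    "bagging_fraction": 0.8,
    "bagging_freq": 5,
    "min_data_in_leaf": 20,
    "lambda_l1": 0.1,
    "lambda_l2": 0.1,
}

# Aliases used by the scikit-learn style wrapper for the native names
SKLEARN_ALIASES = {
    "feature_fraction": "colsample_bytree",
    "bagging_fraction": "subsample",
    "bagging_freq": "subsample_freq",
    "min_data_in_leaf": "min_child_samples",
    "lambda_l1": "reg_alpha",
    "lambda_l2": "reg_lambda",
}

def to_sklearn_kwargs(params):
    """Rename native parameters to their scikit-learn wrapper equivalents."""
    return {SKLEARN_ALIASES.get(name, name): value for name, value in params.items()}

def build_model(params=None):
    """Construct an unfitted regressor (assumes the lightgbm package is installed)."""
    from lightgbm import LGBMRegressor  # imported lazily on purpose
    return LGBMRegressor(**to_sklearn_kwargs(params or TABLE_1_PARAMS))
```

Keeping the table values in one dictionary makes it easier to confirm that the reported settings and the settings actually passed to the model are the same.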

Table 3. Evaluation metrics for solar power generation forecasting.

Metric	Description
R-squared	Indicates the proportion of the variance in the dependent variable that is predictable from the independent variable(s).
RMSE	Measures the square root of the average squared difference between predicted and actual values.
MAE	Measures the average absolute difference between predicted and actual values.
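The three metrics in Table 3 can be written out directly from their standard definitions. The following is a minimal plain-Python sketch using small hypothetical values; it is intended to show how the quantities are computed, not to reproduce the evaluation code used in this study.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average absolute difference."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

actual = [10.0, 20.0, 30.0, 40.0]     # hypothetical measured output
predicted = [12.0, 18.0, 33.0, 37.0]  # hypothetical forecasts

scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "R-squared": r_squared(actual, predicted),
}
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is why the two are usually reported together.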

Table 4. Comparison of LGBM and KNN models for solar power generation forecasting.

Model	Accuracy (R-Squared)	RMSE	MAE	Training Time (Seconds)	Memory Usage
LGBM	0.84	5.77	3.93	120	500 MB
KNN	0.77	6.93	4.34	90	300 MB
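To put the trade-off in Table 4 in relative terms, the short calculation below expresses each LGBM figure as a percentage change from the corresponding KNN figure. The dictionary keys and the helper `relative_change` are hypothetical names; the numbers are those reported in the table.

```python
results = {
    "LGBM": {"r2": 0.84, "rmse": 5.77, "mae": 3.93, "train_s": 120, "mem_mb": 500},
    "KNN": {"r2": 0.77, "rmse": 6.93, "mae": 4.34, "train_s": 90, "mem_mb": 300},
}

def relative_change(lgbm_value, knn_value):
    """Percentage change of the LGBM figure relative to the KNN figure."""
    return 100.0 * (lgbm_value - knn_value) / knn_value

changes = {
    metric: round(relative_change(results["LGBM"][metric], results["KNN"][metric]), 1)
    for metric in results["LGBM"]
}
# changes -> {'r2': 9.1, 'rmse': -16.7, 'mae': -9.4, 'train_s': 33.3, 'mem_mb': 66.7}
```

Read this way, LGBM reduces RMSE by about 17% and MAE by about 9% relative to KNN, at the cost of roughly a third more training time and two thirds more memory.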

Table 5. Analysis of the forecasting performance for LGBM and KNN models.

Model	Ability to Capture Complex Patterns	Handling Nonlinear Relationships	Adaptability to Changing Conditions	Strengths	Limitations
LGBM	Excellent	Excellent	Good	Effective in capturing intricate relationships and patterns	May require longer training time and higher computational resources
KNN	Moderate	Limited	Moderate	Simplicity and interpretability	May struggle with capturing complex patterns and handling large datasets

Table 6. Robustness and scalability analysis for LGBM and KNN Models.

Model	Performance under Different Time Periods	Performance in Different Seasons	Handling Outliers and Missing Data	Adaptability to Changing Environmental Conditions	Scalability to Larger Datasets	Scalability in Real-Time Applications
LGBM	Stable	Consistent	Robust	Good	High	Moderate
KNN	Variable	Variable	Sensitive	Moderate	Moderate	High

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Suanpang, P.; Jamjuntr, P. Machine Learning Models for Solar Power Generation Forecasting in Microgrid Application Implications for Smart Cities. Sustainability 2024, 16, 6087. https://doi.org/10.3390/su16146087

