1. Introduction
Wildfires have become a critical global environmental issue, with far-reaching consequences for ecosystems, human lives, and economies. Although wildfires occur in many regions worldwide, most research addresses these challenges in specific areas. Several studies have investigated different aspects of wildfires and their management. The studies in [1,2] focused on wildfire risk forecasting using the Weights of Evidence and Statistical Index models and explored forest fire risk forecasting through case-based reasoning, while [3,4] developed a forest fire prediction model based on long short-term time-series networks and applied deep learning to unmanned aerial vehicle (UAV) images for forest fire smoke detection. The work in [5] employed machine learning techniques for flexible wildfire prediction. These studies provide valuable insights into wildfire dynamics and contribute to the development of effective strategies for fire management and prevention globally.
Wildfire is a major issue in tropical countries, particularly in Southeast Asia, where rising temperatures increase the risk. The behavior of communities living near wild and forested areas, including deforestation and land clearing, is also one of the main contributing factors. Indonesia, located in Southeast Asia, has a tropical climate with two seasons: rainy and dry. Its landscape is predominantly covered by forest, much of it on peatland, especially on the islands of Kalimantan and Sumatra, which leads to a high risk of wildfires during the dry season. These disasters typically occur when dry soil catches fire easily, particularly close to existing hotspots. The behavior of residents in suburban or rural areas with dry land and small hotspots exacerbates the issue, and factors such as a lack of education, uncontrolled land clearing, and deforestation further contribute to the extent of the disaster area. The impact of wildfires is particularly severe for communities near the fires, as the carbon emitted creates haze and air pollution that pose risks to human health, including respiratory illnesses. Children and infants living in or near fire-prone areas may suffer from serious respiratory illnesses when the wind carries pollutants into the community. The effects of wildfires are not limited to humans; they also harm animals and plants, whose habitats are destroyed by forest fires. Moreover, wildfires have long-term consequences such as global warming and climate change, as they strip the land of vegetation and reduce the oxygen generated by forests and wild trees.
Figure 1 shows wild and forest fires in Riau Province, Indonesia, and their impact on the surrounding habitat, as well as the air pollution they cause in neighboring provinces and countries. The research case is located in Riau Province, Indonesia, at latitude 0.507319 and longitude 101.444924.
Numerous studies have proposed techniques to alleviate or resolve the wildfire crisis. In [6,7,8], fire data are analyzed and hotspots are forecast using machine learning techniques. The analysis of climate change effects depends on the characteristics of the environmental source data. Forecasting wildfires and the locations of hotspots relies on data from the meteorological agency as well as on other factors of environmental change. Another study on forest and wildfires compared monitored environmental indicators with data from ground stations, which are normally built by a dedicated disaster agency. The results of these analyses use color codes to differentiate each wildfire indicator. All these methods for analysis and prediction used simple applications, as elaborated in [9,10]. Work on wild and forest fires has also covered dedicated equipment and the use of machine learning to predict and locate the sources of hotspots, in addition to modeling and mapping the spread of fire. Weather data were observed and compared in simulations to estimate, with high accuracy, the spread and number of fire hotspots in locations with a high likelihood of wild and forest fires. Several groups have developed models of the probability of forest fire occurrence that consider both natural and anthropogenic environmental properties. Recently, a new model has been proposed for investigating forest fire zones, taking into account specific localization functions and evaluating critical fire parameters. Using this model and observations made by instrumentation on a remote sensing platform, a decision-making system has been proposed for the detection of areas prone to fires [11,12].
Methods to analyze and identify the spread of haze caused by wildfire are discussed in [13,14,15], with the aim of determining how large an area is affected by poor air quality. One deep learning approach uses LSTM to model the pattern of fire hotspot data, but the forecasting in that work covers only a small area or a designated zone. Other work applies computerized analysis of fire datasets to predict fire spread. A recurrent neural network (RNN) can analyze and integrate datasets to predict the propagation of fire hotspots, treating the data as a time series. At ground level, wild and forest fires have been detected with ground sensing systems in which several sensors detect anomalies in environmental indicators, as discussed in [16,17,18,19]. Wireless sensor network (WSN) technology detects and monitors ground-level environmental changes, such as temperature and humidity, caused by global warming and dry seasons, and thereby provides new datasets for analysis. Moreover, WSN technology collects accurate data that can fill in missing fire hotspot data, and, as a direct detection system, it has an advantage over remote sensing for collecting fire data at the source. Other results are reported in [20,21,22], which explore methods to predict the location of forest fire hotspots using machine learning algorithms. Environmental monitoring is also needed as an indicator of air quality and of how free the environment is from pollution. Several studies [23,24,25,26] have examined how the environment can be monitored using multiple sensors to obtain detailed data from forest and wilderness areas that are at high risk of fire. In a WSN deployment, each sensor in the forest covers a given area, reports environmental data to the system, and raises an alert when a potential fire is detected. Long-range wide area network (LoRaWAN) technology is applied to monitor forests and wild locations remotely over long distances. Because locations differ in forest type, contour, terrain, and soil type, various sensors, models, and techniques are used to detect potential fire events. The collected data are transmitted to a backend system for further analysis and forecasting.
The boreal and tropical forests undergo the most anthropogenic impacts. In particular, it has been noted that wildfire caused by lightning strikes or by human actions is crucial for the forests’ sustainability. New model simulation experiments showed that total burning of all coniferous forests up to 42° N resulted in an increase in atmospheric carbon of 21.7%, with a subsequent global temperature increase of 4 °C. Wildfires are also important for CH4 emissions. In the modeling algorithms, the CH4 emissions from wildfires are taken as 1–5 Tg CH4/year [27]. The issue of the atmospheric greenhouse effect has garnered significant interest, not only in scientific publications but also in mass media. On the one hand, there has been an undeniable overemphasis on the contribution of the greenhouse effect to global climate change. On the other hand, this heightened attention highlights the need to analyze the role of the greenhouse effect as a factor in climate change. Recent years have seen remarkable progress in analyzing observational data and the successful development of numerical climate modeling, providing a foundation for a fresh examination of the atmospheric greenhouse effect within the context of global climate change [28]. It has been observed that, following forest fires, solar ultraviolet radiation reaching the ground decreases for several days. This decrease is attributed to the significant increase in concentrations of various air pollutants such as carbon monoxide, nitrogen oxides, ozone, and aerosol density during the forest fire event [29]. The application of remote sensing in environmental monitoring for sustainable development is discussed in [30].
This research applied the Long Short-Term Memory (LSTM) deep learning algorithm to several tasks on wildfires in Indonesia:
Mapping the wildfire hotspots in Indonesian territory based on the MODIS dataset.
Plotting the distribution of wildfires and hotspots in the Indonesian region.
Forecasting wild and forest fires to determine potential future hotspots.
Analyzing the MODIS data for areas with high fire potential.
The Moderate Resolution Imaging Spectroradiometer (MODIS) dataset collected by NASA from 2010 to 2022 was used for this analysis and to verify the model accuracy. This dataset has limitations: MODIS data have a moderate resolution of 250 m and cannot detect small fires. For the analysis, the dataset is split into a training set and a testing set. Data screening and filtering were applied before the analysis so that only usable records were retained. The scope of the analysis is limited to Indonesian territory, where forest fires occur most frequently on two large islands, Sumatra and Kalimantan. This research proposes a method for forecasting and mapping wildfire hotspots that aims to locate hotspots more accurately than the previous works discussed, using the LSTM deep learning model. Forecasts of the number of future hotspots in a specific zone, especially within Indonesian territory, give the responsible agencies data on which to base further action. The Python programming language was used for this analysis and simulation; as a high-level language, it supports efficient calculation and the identification of wildfire hotspots from the given dataset.
2. Material and Methodology
Natural disasters vary with region and geography, and forest and wildfires commonly occur in tropical countries with dense forests. Indonesia is at high risk of wildfire because its typical peatland and dry weather make the land prone to catching fire. Indonesian government and community agencies, supported by industry, collaborate to counter and prevent wildfires where possible, but in some areas these efforts fail because of natural phenomena. Many studies have been conducted to identify the main issues and the root causes of fires and hotspots. This investigation uses a deep learning LSTM model for prediction and then plots the prediction results to determine the spread and distribution of fire areas, especially in Indonesian territory. Earth data collected by NASA from 2010 to 2022 are used in this analysis to plot and forecast the number of hotspots that occur in Indonesia.
Figure 2 shows the burned area in Indonesia due to wild and forest fires from 2015 to 2021; 2015 has the largest burned area, followed by 2019, and the remaining years average below 500,000 hectares.
The data reveal that the cases of wild and forest fires in Indonesia primarily occur on Kalimantan and Sumatra islands due to the similar type of land, which is peatland, and the extensive forest coverage on both islands, posing a high risk of fires during the summer or dry season. The most significant case was observed in 2019, with numerous hotspots recorded in Indonesia, particularly in Kalimantan and Sumatra. However, in 2020, in response to the outbreak of the coronavirus disease (COVID-19), the government implemented mobility restrictions across the country, resulting in reduced activities, including transportation. Consequently, there was a notable improvement in air quality and a decrease in the number of wild and forest fire hotspots. Recent data from years after the COVID-19 pandemic demonstrate a significant decline in fire incidents and hotspots due to the restrictions on human mobility and limitations on outdoor activities. In this work, we will discuss the data from 2023 to compare the period before and after the COVID-19 pandemic and examine the impact on air quality resulting from government-enforced control measures and reduced pollution from transportation.
2.1. LSTM Algorithm
Many techniques and algorithms now deliver good results with fast processing times. One popular deep learning algorithm is LSTM, a type of Recurrent Neural Network (RNN) introduced by Hochreiter and Schmidhuber [31]. The LSTM algorithm can analyze time series datasets. It is also capable of learning long-term dependencies in a dataset and, by design, retains information for prolonged periods.
Figure 3 shows the basic architecture of an RNN-LSTM model, which consists of memory cells regulated by three gates: the input gate, the output gate, and the forget gate. In place of the single hidden state of a standard RNN, the LSTM keeps two states: the long-term memory $c_t$ and the working memory $h_t$. Both are responsible for retaining the features of the sequence. The forget gate $f_t$ controls how much of the previous memory $c_{t-1}$ is kept, the output gate $o_t$ controls how much of the memory is exposed as the output $h_t$, and $\tilde{c}_t$ is the current candidate memory to be written. The input gate $i_t$ controls how much of the state information $h_{t-1}$ and the current input $x_t$ is written into the memory cell. None of the three gates is static: each is computed from the previous state $h_{t-1}$ and the current input $x_t$ as a nonlinear activation applied to a linear combination. LSTM has been successfully utilized in various applications, including brain sciences and the environment [3,32].
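The gate description above can be made concrete with a single LSTM time step. The following is a minimal sketch of the standard LSTM update, not the authors' implementation: it assumes a scalar input and a single hidden unit, and the function name `lstm_step` and the weight names in `params` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function; squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One standard LSTM step for a scalar input and one hidden unit.

    params holds hypothetical scalar weights: w_* act on the input x_t,
    u_* act on the previous working memory h_prev, and b_* are biases.
    """
    # Each gate is a nonlinear activation of a linear combination
    # of the previous state and the current input.
    f_t = sigmoid(params["w_f"] * x_t + params["u_f"] * h_prev + params["b_f"])
    i_t = sigmoid(params["w_i"] * x_t + params["u_i"] * h_prev + params["b_i"])
    o_t = sigmoid(params["w_o"] * x_t + params["u_o"] * h_prev + params["b_o"])
    # Candidate memory proposed for writing at this step.
    c_tilde = math.tanh(params["w_c"] * x_t + params["u_c"] * h_prev + params["b_c"])
    # Forget part of the old memory, then write part of the candidate.
    c_t = f_t * c_prev + i_t * c_tilde
    # Expose part of the memory as the working memory / output.
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# Illustrative weights only; a trained model would learn these values.
GATE_KEYS = ("w_f", "u_f", "b_f", "w_i", "u_i", "b_i",
             "w_o", "u_o", "b_o", "w_c", "u_c", "b_c")
params = {key: 0.5 for key in GATE_KEYS}

h, c = 0.0, 0.0
for x in [0.1, 0.4, 0.9]:  # a short, made-up normalized hotspot series
    h, c = lstm_step(x, h, c, params)
```

With the forget gate saturated near 1 and the input gate near 0, the cell simply carries `c_prev` forward unchanged, which is the mechanism behind long-term retention.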
The LSTM algorithm is a well-known deep learning technique that addresses long-term dependencies in RNNs by taking historical data into account. Other methods may have limitations in achieving accurate results when dealing with a long history or a large variety of data. Consequently, when the volume and variety of data are high and the history is long, these conventional algorithms may not perform well. In such cases, the LSTM algorithm, which stores and uses information in long-term memory, provides more precise forecasts. Because it retains information over the long term, LSTM is a commonly used technique for processing, analyzing, forecasting, and classifying time-series data [33].
The hotspot counts forecast by the LSTM algorithm require further analysis and error evaluation to check the accuracy of the results. Three blocks are commonly described in the LSTM structure; inside the block with the dotted border, $f_t$, $i_t$, and $o_t$ denote the forget, input, and output gates. Many techniques can be used to check errors in forecasting analysis, for example, the mean absolute error (MAE), which measures the average error; the mean square error (MSE), which measures the squared error; and $R^2$, which gives the proportion of the variance in the data that the forecast explains. These metrics are expressed as Equations (1)–(3) for MAE, MSE, and $R^2$, respectively:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{1}$$

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{3}$$

where $y_i$ is the actual number of fire hotspots at time $i$, $\hat{y}_i$ is the predicted number of hotspots at time $i$, $\bar{y}$ is the mean of the actual values, and $n$ is the number of samples in the dataset over which the error is evaluated; these regression metrics are used in this wildfire hotspot forecasting. Different models may yield different errors; hence, the model with the minimum forecasting error is taken as the best performer. Error calculation is very important in forecasting analysis to determine the performance and prediction accuracy of an algorithm. If the error exceeds the threshold, the algorithm needs to be checked and calibrated until the error percentage is within the acceptance level.
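As an illustration of the three metrics named above, the following sketch computes MAE, MSE, and R-squared in plain Python using their standard definitions. The function names and the sample hotspot counts are made up for the example and do not come from the study's data.

```python
def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up daily hotspot counts, for illustration only.
actual = [12, 30, 45, 20]
predicted = [10, 33, 40, 22]

print(mae(actual, predicted))        # 3.0
print(mse(actual, predicted))        # 10.5
print(r_squared(actual, predicted))  # about 0.93
```

Note that `r_squared` is undefined when all actual values are equal, because the denominator is then zero; a production version would need to handle that case.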
Table 1 shows part of the dataset used in this forecasting, including, for example, the coordinates of each hotspot (latitude and longitude), the date of the incident, and the total number of hotspots that occurred.
2.2. Fire Dataset
The wildfire hotspot dataset in this analysis was obtained from NASA’s MODIS hotspot data for Indonesia. The dataset contains 15 variables, all of which are valuable for the analysis. The NASA Earth dataset is derived from satellite images that detect active hotspots. Because of the distance, image quality is sometimes low, in which case a hotspot may not be counted, and this in turn affects the total number of fire hotspots to be analyzed and forecast. Some records are also missing because of the complex parameters of earth data; the strategy for completing them is to predict the missing values with an algorithm trained on the previous dataset. In this analysis and forecasting, only complete data are used to achieve accurate prediction and determine hotspot locations. The hotspot data for Indonesia, more than 11,000 records in total as shown in Table 1, support a highly accurate decision because of the large number of records. The data were collected over 13 years, from 2010 to 2022, and forecast for the year 2023 [22].
Table 2 presents a set of NASA earth data after filtering, selecting only four parameters from 2010 to 2022 for analysis.
Refer to
Table 1 and
Table 2 for the distribution of the fire hotspot dataset, which represents the daily occurrences of hotspots detected by satellites. On specific days, a higher number of hotspots are detected, requiring the data to be grouped and scaled into a single day to determine the total number of hotspots for each day. After the grouping process, the total number of datasets decreases significantly to only 4725 sets, compared to the previous count of up to 11,000 sets. These datasets represent a substantial amount of data collected over a span of 13 years, from 2010 to 2022. The purpose of this grouping and scaling is to optimize the forecasting and analysis of hotspot data. The data are presented in a line graph for informative analysis, showcasing the minimum and maximum occurrences of hotspots, as well as the average number over specific time intervals, allowing for conclusions to be drawn based on the plotted data. Moreover, to enable detailed analysis and forecasting, the data are further analyzed to provide monthly result values.
Table 3 displays the optimized fire hotspot data, organized into three sets: the number of hotspot occurrences, the data itself, and the total number of hotspots in a single day. This optimization process enables a comprehensive assessment of the daily event data, allowing for meaningful comparisons on a monthly basis.
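The grouping step described above, in which individual detections are collapsed into one total per day and then summarized per month, can be sketched as follows. This is an illustrative example only: the record layout, the function names, and the sample detections are assumptions, not the study's actual data or code.

```python
from collections import Counter
from datetime import date


def daily_hotspot_counts(detections):
    """Count detections per acquisition date."""
    return Counter(acq_date for acq_date, _lat, _lon, _conf in detections)


def monthly_hotspot_totals(daily_counts):
    """Sum the daily counts into (year, month) totals."""
    totals = Counter()
    for day, count in daily_counts.items():
        totals[(day.year, day.month)] += count
    return totals


# Made-up records: (acquisition date, latitude, longitude, confidence).
detections = [
    (date(2022, 9, 1), 0.51, 101.44, 85),
    (date(2022, 9, 1), -1.20, 113.90, 62),
    (date(2022, 9, 2), 0.48, 101.50, 40),
    (date(2022, 10, 5), -2.10, 114.20, 91),
]

daily = daily_hotspot_counts(detections)
monthly = monthly_hotspot_totals(daily)
print(daily[date(2022, 9, 1)])  # 2 detections on the same day
print(monthly[(2022, 9)])       # 3 detections in September 2022
```

Grouping by day reduces the number of rows, since several detections on one date become a single daily total, which is consistent with the drop in record count reported after grouping.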
3. Fire Data Forecasting
The LSTM algorithm, a type of deep learning technique in computer programming, is highly regarded for its ability to analyze large volumes of data, particularly in the context of data analysis and time series forecasting. It falls under the RNN network type and is known for its efficient handling of long-term dependencies and memory retention. The LSTM model offers several advantages for data analysis, including its capacity to predict future events based on extensive historical data. With its four distinct layers and effective communication processing techniques, the LSTM algorithm organizes the model in a chain structure, enabling efficient data analysis.
Figure 4 illustrates the block diagram of the data analysis scenarios used for forecasting wildfire hotspots in the future. The training and testing data are separated into individual blocks and undergo a rigorous process. Once the training data have demonstrated satisfactory performance, the testing data are utilized to forecast the number of hotspot occurrences in the future.
The first step in predicting and forecasting fire hotspot data is to build the LSTM model and identify the input information. Data that are not needed can be discarded from the cell at the current step. This is decided by a sigmoid function that takes the output of the LSTM unit at the previous time step, $h_{t-1}$, and the current input at time $t$, $x_t$, and determines which parts of the previous output should be removed. The forget gate is denoted by $f_t$, a vector whose values range from 0 to 1, one for each number in the cell state $c_{t-1}$, as written in Equations (2)–(4).
Starting from the collected dataset of more than 11,000 fire hotspots from 2010 to 2022, the first step is data filtering and normalization: incomplete records are deleted or normalized to obtain a complete dataset. After grouping, the data comprise 4213 records, as presented in Table 2, for the subsequent machine learning steps of data mapping and forecasting. The training data account for 80% of the total and the testing data for the remaining 20%, as shown in Figure 2. The training set is larger than the testing set so that the machine learning process can reach high accuracy. To evaluate the results and improve the forecasting accuracy by minimizing the error, an error calculation is used to ensure that the error percentage stays below 10%, which is taken as the normal acceptance threshold for data analysis error.
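The 80/20 split and the normalization step can be sketched as below. The source does not state how the split or the scaling was performed, so this example assumes a chronological split (earliest 80% for training) and min-max scaling fitted on the training portion only; both choices and all names are assumptions.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into an earlier and a later part."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def min_max_scale(values, low, high):
    """Scale values so that low maps to 0 and high maps to 1."""
    span = high - low
    if span == 0:
        return [0.0 for _ in values]
    return [(v - low) / span for v in values]


# Made-up daily hotspot totals in time order, for illustration only.
series = [3, 5, 8, 13, 21, 34, 55, 40, 25, 10]

train, test = chronological_split(series)
low, high = min(train), max(train)

# The scaling range comes from the training data only, so no
# information from the test period leaks into model fitting.
train_scaled = min_max_scale(train, low, high)
test_scaled = min_max_scale(test, low, high)
```

Because the range is fitted on the training data, scaled test values can fall outside [0, 1] if the test period contains more extreme counts than the training period.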
The LSTM algorithm is a type of recurrent neural network (RNN) architecture that is specifically designed to address the challenge of capturing and utilizing long-term dependencies in sequential data. Unlike traditional RNNs, which struggle with preserving and propagating information over long sequences, LSTM introduces specialized memory cells and gating mechanisms to selectively retain or discard information as needed. The key components of an LSTM network are the input gate, forget gate, memory cell, and output gate. These gates regulate the flow of information within the LSTM units. The input gate determines which portions of the input are important and should be stored in the memory cell. The forget gate decides which information from the previous time step should be discarded from the memory cell. The memory cell retains and updates information over time, allowing the network to remember long-term dependencies. Finally, the output gate determines the relevant information to be outputted from the memory cell.
The LSTM algorithm’s ability to capture long-term dependencies makes it particularly well-suited for tasks involving sequential data, such as natural language processing, speech recognition, and time series analysis. It has demonstrated superior performance in various applications where preserving and utilizing information over long sequences is critical. By effectively handling the vanishing or exploding gradient problem, LSTM has become one of the most widely used techniques for modeling sequential data. Its architectural design allows for the efficient processing and analysis of long-term dependencies, making it a powerful tool for sequential data modeling and prediction.
Figure 5 illustrates the neuron process of the LSTM model [34].
The final step calculates the error between the available dataset and the forecasting results to check the percentage of error. Many techniques can be used to calculate the error. For example, the Root Mean Square Error (RMSE) is a statistical measure commonly used to compare forecast and real values. RMSE estimates how closely the forecasting results match the historical reference values, on the scale of the dataset:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(y_t - \hat{y}_t\right)^2} \tag{4}$$

In Equation (4), $y_t$ and $\hat{y}_t$ are the actual and forecast fire hotspot values at time $t$, $\bar{y}$ is the mean of the real values of the fire hotspot dataset, and $N$ is the total amount of data. The closer the RMSE is to zero, the more reliable the results produced by the LSTM algorithm.
Another method used to evaluate the forecasting results is the mean absolute error (MAE), where $M$ is the number of observations, $y_t$ is the real value, and $\hat{y}_t$ is the forecast value. MAE is the mean of the absolute errors, as shown in Equation (5), and indicates the accuracy of the prediction:

$$\mathrm{MAE} = \frac{1}{M}\sum_{t=1}^{M}\left|y_t - \hat{y}_t\right| \tag{5}$$
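Another method applied to calculate the error of forecasting data is the mean absolute percentage error (MAPE). MAPE represents the ratio of the error to the real value, in percent. Equation (6) gives the formula for the MAPE of the fire hotspot forecasts:

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{t=1}^{N}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \tag{6}$$

where $y_t$ and $\hat{y}_t$ are the real and forecast values at time $t$ and $N$ is the total amount of data.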
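R-squared ($R^2$) measures how well a forecasting model reflects the variation of the real dataset and normally lies in the range [0, 1]. If $R^2$ is equal to 0, then the model fits poorly. If $R^2$ is equal to 1, then the model has no errors, as shown in Equation (7):

$$R^2 = 1 - \frac{\sum_{t=1}^{N}\left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{N}\left(y_t - \bar{y}\right)^2} \tag{7}$$

where $y_t$ and $\hat{y}_t$ are the real and forecast values at time $t$, $\bar{y}$ is the mean of the real values, and $N$ is the total amount of data.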
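The RMSE and MAPE measures can be computed as in the sketch below, which follows their standard definitions. Days with an actual count of zero make the MAPE term undefined; skipping those days, as done here, is one possible convention and is an assumption of this example, not something the source specifies. The function names and sample values are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error over paired actual and forecast values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Pairs whose actual value is zero are skipped (assumed convention),
    because the percentage error is undefined for them.
    """
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(terms) / len(terms)


# Made-up daily hotspot counts; the third day has no hotspots.
actual = [10, 20, 0, 40]
predicted = [12, 18, 3, 40]

print(rmse(actual, predicted))  # about 2.06
print(mape(actual, predicted))  # about 10.0 (percent)
```

The zero-count day still contributes to RMSE but not to MAPE, which illustrates why the two measures can rank forecasts differently on sparse hotspot series.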
Among these methods for evaluating prediction and forecasting errors, RMSE and MAE are suited to short-term evaluation, i.e., hourly forecasts. The MAPE method can run into calculation problems when the denominator is small, and it evaluates forecasts on a daily basis. $R^2$ provides a detailed evaluation based on squared deviations over the whole dataset. The use of the LSTM algorithm for forecasting and mapping is shown as pseudo-code in Algorithm 1, where $X$ is the input dataset of hotspots from 2010 to 2022 and $O$ is the output forecast for the following year, which in this scenario is 2023 [35]. The mapping and prediction results for wild and forest fires in the Indonesian region were obtained by computer simulation based on Algorithm 1, as follows.
Algorithm 1: FORE-LSTM
Input: X, epoch I, number of iterations K, error parameters, cycle index N, number of decompositions m, white noise data W.
Output: Forecasting result O.
Input data
Output data
for each … do
    …
    for … do
        …
    end for
end for
for … do
    …
end for
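Algorithm 1 maps the input series X to a forecast O. One common way to prepare a daily series for an LSTM forecaster is to cut it into sliding windows, where each window of past values is paired with the value that follows it. The source does not state that this preparation was used, so the sketch below, including the name `make_windows` and the `lookback` parameter, is an assumption for illustration.

```python
def make_windows(series, lookback):
    """Pair each run of `lookback` past values with the next value.

    Returns (inputs, targets), where inputs[k] is series[k:k + lookback]
    and targets[k] is series[k + lookback].
    """
    inputs, targets = [], []
    for start in range(len(series) - lookback):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback])
    return inputs, targets


# Made-up daily hotspot totals, for illustration only.
series = [1, 2, 3, 4, 5]
inputs, targets = make_windows(series, lookback=2)
print(inputs)   # [[1, 2], [2, 3], [3, 4]]
print(targets)  # [3, 4, 5]
```

A series of length n yields n minus `lookback` training pairs, so a longer lookback gives the model more context per example but fewer examples overall.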
4. Results and Discussion
The mapping and forecasting of fire hotspots are based on NASA Earth data starting in 2010, which include the coordinates (latitude and longitude), date and time, the confidence level of each hotspot, and other detected indicators, as previously shown in Table 1. In this research, only four indicators are used for further processing, namely, the acquisition date (acq date), the coordinates (latitude and longitude), and the confidence level. Hotspots are classified into five confidence levels. This classification is used to identify which fire hotspots carry a high risk of forest fire and which are low risk without much impact. The five levels, from the top, are as follows: 81–100, indicated as red dots; 61–80, indicated as orange dots; 41–60, indicated as yellow dots; 21–40, indicated as green dots; and 0–20, indicated as blue dots.
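The five-level classification listed above can be expressed as a small lookup. This sketch uses the ranges and colors from the list (81–100 red down to 0–20 blue); the function name `confidence_band` is hypothetical, and the treatment of fractional values between bands is an assumption.

```python
def confidence_band(confidence):
    """Map a hotspot confidence value (0 to 100) to its plot color."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= 81:
        return "red"
    if confidence >= 61:
        return "orange"
    if confidence >= 41:
        return "yellow"
    if confidence >= 21:
        return "green"
    # Remaining values, 0 up to (but not including) 21, are the lowest band.
    # Assumption: a fractional value such as 80.5 falls in the lower band.
    return "blue"


for value in (95, 70, 50, 30, 10):
    print(value, confidence_band(value))
```

Checking the thresholds from the top down keeps each branch to a single comparison and makes the band edges easy to audit against the list.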
Figure 6 shows the mapped distribution of fire hotspots in Indonesia for 2022; the maps for 2010 to 2021 are given in Appendix A. Overall, the hotspots are concentrated in Sumatra and Kalimantan, so these islands present a higher risk of forest fires because of their geography and land type. The best strategy is therefore to forecast from the historical data in order to alert the government, the responsible institutions, and communities. The dots on the map come in five colors, from green for the lowest fire risk through blue, yellow, and pink to red, which indicates the highest potential forest fire risk, with a confidence of 80 to 100.
The detected hotspots are categorized into five levels for evaluation; the earth data for Indonesia contain thousands of hotspots accumulated over more than 10 years. In this work, all hotspot data are plotted annually by level, from January to December of each year.
Figure 7 shows the monthly distribution of hotspots at every level for 2022 (the years 2010 to 2021 are given in Appendix B), where the black line represents the total number of hotspots. In general, the pattern of hotspot incidents is similar from year to year: the number of hotspots is moderate early in the year and increases from September to December because of the dry season. The maximum number of hotspots detected is between 600 and 700 in the peak season at the end of the year. The study area is Indonesian territory, bounded on the west by the points (5.980133, 94.964324) and (−10.537824, 94.964324) and on the east by (5.980133, 141.105341) and (−10.537824, 141.105341), given as (latitude, longitude).
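The four corner points above define a latitude and longitude bounding box, which gives a simple way to keep only detections inside the study area. The sketch below uses the corner coordinates from the text; the function name `in_study_area` is hypothetical, and a rectangular filter is only an approximation of national territory.

```python
# Bounding box taken from the corner coordinates stated in the text.
LAT_MIN, LAT_MAX = -10.537824, 5.980133
LON_MIN, LON_MAX = 94.964324, 141.105341


def in_study_area(latitude, longitude):
    """Return True if the point lies inside the rectangular study area."""
    return LAT_MIN <= latitude <= LAT_MAX and LON_MIN <= longitude <= LON_MAX


# The Riau Province reference point given earlier in the paper.
print(in_study_area(0.507319, 101.444924))  # True
# A point well north of the box, for comparison.
print(in_study_area(13.75, 100.5))          # False
```

Because the box is rectangular, it also covers parts of neighboring countries and open sea, so a stricter analysis would need an actual boundary polygon.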
Machine learning is used here for data analysis and training, with the historical fire hotspot data divided into (1) testing and (2) training datasets. The training set is normally larger than the testing set to achieve high accuracy. Figure 8 shows the training data, covering more than 4000 days, alongside the testing data used for forecasting, distributed from 2010 to 2022. According to the data, 2019 has the highest values in the training data, as shown in Figure 9.
The LSTM forecasts of fire hotspots in Indonesia were tested against the real data available for the years 2021 and 2022. Early forecasting results based on data from 2018 to 2022 were analyzed for the years 2021 and 2022. Figure 9 shows the actual fire hotspot data compared with the forecasts for both years. Figure 10a shows the comparison between forecast and actual data for the year 2021, and Figure 10b shows the comparison for the year 2022. The actual and forecast series show a similar trend and pattern, with an RMSE of 4.56% for 2021 and 9.31% for 2022, giving an average of 6.94% over the two years. This comparison was used to verify that the proposed algorithm works as intended, with an error of less than 10%. The LSTM algorithm differs from other methods in its ability to analyze large time-series datasets of fire hotspots in depth, as shown previously.
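The error measure quoted above can be illustrated with a short sketch. The RMSE itself is standard; how it was converted to a percentage is not stated in the text, so the normalization by the range of the actual values in `rmse_percent` is an assumption, and the two short series are placeholders rather than real hotspot counts.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equal-length series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_percent(actual, forecast):
    """RMSE as a percentage of the range of the actual values.
    ASSUMPTION: the normalization used in the text is not specified."""
    span = max(actual) - min(actual)
    if span == 0:
        raise ValueError("actual values have zero range")
    return 100.0 * rmse(actual, forecast) / span


# Placeholder values, for illustration only.
actual = [10, 20, 30, 40]
forecast = [12, 18, 33, 39]
print(rmse(actual, forecast), rmse_percent(actual, forecast))

# Two-year average of the yearly errors reported in the text.
average_error = (4.56 + 9.31) / 2  # 6.935, reported as 6.94%
print(average_error)
```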
The final forecast of fire hotspots for the coming year, 2023, is consistent with the patterns in the real data. The forecasting process uses more than 4000 data points for training, about 80% of the total data, while the remaining 20% or so is used for testing. The results in Figure 10, which show the real data for the years 2020 to 2022 followed by the forecast for 2023, follow a similar pattern, trend, and distribution, with a rise at the end of the year. In 2022, the trend decreased and fewer fire hotspots were detected, because cool weather and the rainy season reduced their number. The overall trend is otherwise similar, with the number of fire events increasing toward the end of the year, starting in August to September.
The LSTM model, a type of RNN, has been applied in this analysis, and the results achieve a low error on the time-series data. The proposed algorithm works well in forecasting fire hotspot data for the future, i.e., for 2023. The actual data plotted in Figure 9 show that the behavior and pattern of the data are similar. The error percentage is low, which indicates that the proposed algorithm is well suited to forecasting forest fire hotspots. The forecast shows a similar pattern to the previous fire hotspot numbers and trends, as well as to the monthly distribution. The model yields high accuracy, with an error rate of less than 10%, compared with traditional methods. Furthermore, forest fire analysis and forecasting for the Indonesian region has engaged only a few researchers, and it is a specific case because of the tropical setting and the typical land and contour. Future research will seek to verify the achieved results and compare them against uncontrolled external factors. Other parameters that may influence the decision of the system may also be considered, to achieve high accuracy and fast processing of the results.
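To be used by a recurrent model such as an LSTM, a time series is usually rearranged into fixed-length input windows, each paired with the value that follows it. The sketch below shows that preparation step only. The window length, the function name `make_windows`, and the sample series are hypothetical; the text does not describe the input configuration actually used.

```python
def make_windows(series, window):
    """Turn a series into (inputs, targets): each input holds `window`
    consecutive values and its target is the value immediately after."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    n_samples = len(series) - window
    inputs = [series[i:i + window] for i in range(n_samples)]
    targets = [series[i + window] for i in range(n_samples)]
    return inputs, targets


# Placeholder series, for illustration only.
series = [5, 7, 9, 11, 13]
inputs, targets = make_windows(series, window=3)
print(inputs)   # [[5, 7, 9], [7, 9, 11]]
print(targets)  # [11, 13]
```

Forecasting a future year can then proceed step by step, with each predicted value appended to the window used for the next prediction, although whether this work used that scheme is not stated.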
Wildfires pose a significant threat to the ecosystems and communities of Southeast Asia. The region experiences a high occurrence of wild and forest fires, particularly in countries like Indonesia, due to various factors, including climate conditions and natural parameters. Parameters such as climate, wind patterns, and temperature play a crucial role in triggering and exacerbating wildfires in the region. The combination of dry seasons, high temperatures, and strong winds can create favorable conditions for the rapid spread of fires, especially in areas with abundant vegetation and flammable materials. The dry climate increases vegetation dryness, making it highly susceptible to ignition. Additionally, wind can carry fire embers over vast distances, leading to the rapid expansion of fire fronts and the ignition of new areas. The interplay of these parameters increases the severity and extent of wildfires, posing significant challenges to firefighting efforts and causing ecological damage, air pollution, and threats to human health. Understanding the impact of these parameters and their interaction is crucial for developing effective strategies for wildfire prevention, early detection, and mitigation in Southeast Asia. The results of this study show the spatial distribution and volume of fire hotspots in Indonesia, along with forecasts for future years that can serve as an alert for the potential occurrence of fires. The work demonstrates an improved understanding of the location and frequency of fire hotspots, particularly in the Indonesian region. The graphical representation and future forecasts provided by this study offer valuable insights for addressing wildfire risks.
5. Conclusions
This research discusses the mapping, forecasting, and analysis of fire hotspots in Indonesia. The data have been mapped, revealing that hotspots are concentrated on two major islands, namely Sumatra and Kalimantan. Forecasting has been conducted using NASA's MODIS data for the Indonesian region, encompassing over 700,000 data records from 2010 to 2022. The analysis employs the LSTM algorithm, with the data divided into training and testing sets accounting for 80% and 20%, respectively. The results present a mapping of fire hotspots into five levels, differentiating the potential of hotspots to ignite fires. The graph of the number of hotspots shows an increase toward the end of the year, particularly from September to December, coinciding with the dry season or summer. The LSTM algorithm is used to forecast the year 2023 and exhibits high accuracy, with a success rate of more than 90%, corresponding to an error of 6.94%. In the future, it is crucial to focus on reducing the error percentage by improving the algorithm's performance on the training data and by narrowing the forecast area to specific states or districts.