Article
Peer-Review Record

Development of a Forest Fire Diagnostic Model Based on Machine Learning Techniques

Forests 2024, 15(7), 1103; https://doi.org/10.3390/f15071103
by Minwoo Roh 1, Sujong Lee 1, Hyun-Woo Jo 2 and Woo-Kyun Lee 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 26 April 2024 / Revised: 29 May 2024 / Accepted: 22 June 2024 / Published: 26 June 2024
(This article belongs to the Topic Application of Remote Sensing in Forest Fire)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study primarily employs deep-learning-based models to predict VTCI and then uses machine-learning-based classification models to diagnose forest fire risk. The work is valuable because it offers insight into the importance of various variables for forest fire risk. However, major revisions are required before the manuscript can be considered for publication. I suggest that the authors revise their manuscript according to the following comments:

L15: The abstract should be clear to the reader. It would be better to name the land surface data components and the anthropogenic factors used in this study.

L34: Please re-phrase. What is the problem in the 2020s?

L77: This statement needs justification; could you add references?

L81: The authors need to describe more about Fig. 1, both the left and right panels.

L97: Please clarify the choice of the 2:1 ratio of non-occurrence to occurrence.

L100: Why?

L105: What do (a) and (b) refer to?

L97-105: The data selection processes are not well described. Please explain more.

In Equations (2) and (3), what are the parameters a, b, a', and b'? What does the index i mean?

L120: Please provide a figure showing a forest fire activity map; it would help the reader understand the concept.

L128: Table 2 lists 7 meteorological variables, but only 3 of them are described or mentioned in this section. Moreover, Table 2 is not mentioned in the main text.

L159: Please add references here.

L177: Please make sure all abbreviations are explained at first appearance.

L182: Your model is trained to minimize MSE, but it is evaluated with MAE. Can you clarify this? Later, the results report MAE, MSE, and R2.

L185: How well does the model converge within 10 epochs?

L187: Perhaps you mean Table 2?

L240-244: I do not think your argument holds. Higher predicted values are found not only in June and July but also in other months; see March and April. Many low VTCI values are predicted too high. I also think Figure 4 is unnecessary, as Table 3 already reports R2 for your model.

Table 4 is not mentioned in the main text. Please make sure all tables and figures are cited in the main text.

L269: Please interpret the plots (Figs. 5-9) in more detail so that the reader can understand what they show.

L290: Figure 6 is missing.

L293: Where do the values 0 and 0.2 come from?

L347: This should refer to a figure.

L298-304: What about the negative values? Do they indicate an opposite association?

Please check the numbering of all figures.

Figures 9 and 10, for Seoul and Yangpyeong, could be combined because the left panel is the same.

In section 3.4 the verification of the forecast should be discussed in more detail.

The area of Gangwon province should be marked, as the reader might not know where it is on the map.

While the conclusion provides a brief overview of the study and the results, it does not adequately summarize the main findings of the paper. We suggest that you revise the conclusion to provide a clearer summary of your work, results and findings.
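The L182 comment above turns on the difference between the loss a model is trained on (MSE) and the metrics it is reported with (MAE, MSE, R2). As a minimal sketch, assuming plain NumPy and made-up VTCI-like values that are not taken from the manuscript, the three metrics can be computed side by side as follows. The function names `mse`, `mae`, and `r2` are illustrative, not the authors' code.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors quadratically."""
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error: penalizes all errors linearly."""
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up values in [0, 1] for illustration only (assumption, not study data).
y_true = np.array([0.2, 0.4, 0.6, 0.8])
y_pred = np.array([0.3, 0.4, 0.5, 0.8])

print(f"MSE={mse(y_true, y_pred):.3f}")
print(f"MAE={mae(y_true, y_pred):.3f}")
print(f"R2={r2(y_true, y_pred):.3f}")
```

Because MSE weights large errors more heavily than MAE, a model optimized for one need not be optimal for the other, which is why the reviewer asks the authors to state which metric is the training objective and which are reported for evaluation.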

Comments on the Quality of English Language

The English of the manuscript needs to be improved. Editing by a native speaker or professional editor is required.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

SUMMARY

Title: Development of Forest Fire Diagnostic Model based on Machine Learning Techniques

This paper proposes a forest fire risk forecast model with high spatial and temporal resolutions that makes use of machine learning techniques based on 1) satellite imagery, 2) forest fire activity maps, 3) meteorological data, 4) topographical data, and 5) environmental data. Different from other fire risk forecast models, the one proposed here makes use of anthropogenic data. In fact, it is found that higher risks occur near anthropogenic activity sites. A DNN is used to predict the vegetation temperature condition index from satellite imagery. Different machine learning algorithms are compared for the risk model using PyCaret, with CatBoost being selected as the best performer based on seven metrics. Predictions up to three days ahead using short-term weather forecast data in South Korea are compared against 362 actual fires, 264 of which fall within predicted high and very high risks (73%).
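The 73% figure quoted in the summary above follows directly from the two counts it reports. A minimal arithmetic check, using only the numbers stated in the summary (the variable names are illustrative):

```python
# Counts quoted in the reviewer's summary: 362 actual fires,
# 264 of which fell within predicted high or very high risk.
actual_fires = 362
fires_in_high_risk = 264

hit_rate = fires_in_high_risk / actual_fires
print(f"Share of fires in high/very high risk: {hit_rate:.1%}")
```

The ratio rounds to 73%. It says how many observed fires fell in high-risk areas, not how much of the high-risk area actually burned.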

COMMENTS

1. The title refers to a “forest fire diagnostic model”, whereas the proposed method outputs a “forest fire risk forecast”. These are not the same thing. Terminology is used loosely throughout the paper. Please define the terminology and use it consistently.

2. The last sentence in the abstract is not very clear as a wrap-up of the findings. The preceding sentence states that 73% of the fires in the case study fall within the forecasted high or very high risks, which demonstrates the proposed method. I believe the last sentence should simply point out that these results also seem to suggest that higher risks are located near anthropogenic activity sites, which is an interesting finding.

3. Will the raw data and the processed data be provided? This would add value to the paper.

4. Lines 97-99: “To ensure a robust dataset, the ratio of non-occurrence to occurrence data points was set to 2:1, with non-occurrence locations comprising 66% of the instances in the dependent variable set.” Why is this a desirable ratio? No discussion or justification is provided.

5. The 8,115 data points collected for fire occurrences and the ratio discussed in the previous comment lead to the choice of 16,000 non-occurrences. However, the explanation (and rationale) of how the non-occurrences per month are calculated is very unclear. It took me a while to figure it out. The explanation in Lines 104-105 is incorrect.

6. In Table 1, (b) is not explicitly shown. There is also inconsistent use of capital letters.

7. In Eqs. (1), (2) and (3), what are “i”, a, b, a’ and b’?

8. Please explain clearly what a Fire Activity Map is, and provide an example.

9. How and why was the DNN architecture of the VTCI prediction model chosen or designed?

10. Table 2 is presented before being referenced. In fact, it is not referenced anywhere in the manuscript.

11. Fig. 3 is rather small, and its font is unreadable.

12. In Fig. 3, shouldn’t “Labeling Data” be “Labelled Data”?

13. In Fig. 3, the labelled data seems to be used for supervised training of the Forest Fire Diagnostic Model. How is the DNN VTCI Prediction model trained, and why is this not explicitly shown in this figure?

14. In Fig. 4, the trend line (regression line) does not seem to capture the average slope (from visual inspection). Please explain.

15. The label and caption of Fig. 6 are shown, but the figure itself is missing.

16. I do not understand Lines 281 to 286.

17. Figures of SHAP dependence plots for EH, FFMC and DMC seem to be discussed but not shown.

18. The authors should explain how the variable with the most significant influence is identified from Fig. 5. Please also provide the average impact on model output magnitude (mean of the absolute values).

19. Figs. 7, 8 and 9 are too small to be interpreted or read. I suggest removing the grey background. These figures should mention the meaning of the colours (like in Fig. 5).

20. Please explain why only positive SHAP values are understood to increase risk. That is, the highest SHAP values are interpreted to mean higher risk.

21. The order of the Risk Level in Table 5 seems to be inverted.

22. The conclusions should be rewritten. There is some repetition with the discussion section.

23. Overall, the paper has merit. However, the different components of the system must be more clearly explained, rigorous definitions must be provided, and consistency kept. Justifications for DNN and CatBoost architectures must be discussed.
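Two of the comments above involve simple computations that can be sketched in code: the 2:1 sampling ratio (comments 4 and 5) and the mean absolute SHAP value requested in comment 18. The sketch below uses the counts quoted in comment 5 for the first part. For the second part it uses a small SHAP matrix with made-up values, which is an assumption for illustration only and not output from the authors' model. The three feature names are variables named in this review record.

```python
import numpy as np

# Part 1: class balance, using the counts quoted in comment 5.
occurrences = 8115
non_occurrences = 16000
non_occurrence_share = non_occurrences / (occurrences + non_occurrences)
ratio = non_occurrences / occurrences
print(f"non-occurrence share: {non_occurrence_share:.1%}, ratio: {ratio:.2f}:1")

# Part 2: global importance as the mean absolute SHAP value per feature.
# Rows are samples, columns are features; the numbers are hypothetical.
feature_names = ["VTCI", "FFMC", "DMC"]
shap_values = np.array([
    [0.30, -0.10, 0.05],
    [-0.20, 0.15, -0.05],
    [0.40, -0.05, 0.10],
])
mean_abs_shap = np.abs(shap_values).mean(axis=0)

# Rank features from most to least influential.
ranking = sorted(zip(feature_names, mean_abs_shap), key=lambda p: p[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking the absolute value before averaging matters: positive and negative contributions would otherwise cancel, so a feature that strongly pushes predictions in both directions could appear unimportant.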

 

Comments on the Quality of English Language

English writing is fine.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you very much for your work. It looks complete and well edited. The introduction gives an interesting overview of the state of the art. Materials and methods are well presented, and all information and metadata are included in this section. The conclusion addresses every aspect of the work, the scope, and the results; you could perhaps place more emphasis on the importance of technologies and data access (open data? commercial data?) for the next steps of the research. Is it possible to include some more recent work in the references? Forest fires are a big problem in Europe and North America; machine learning and deep learning are often used to study the phenomenon.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

Title: Development of Forest Fire Diagnostic Model based on Machine Learning Techniques

 

Overview: The paper does not explain any of the data used or how they were analyzed.

 

Other comments:

Introduction- first paragraph: Are these forests that are naturally adapted to fire but are burning more often than usual? Or are they forests that don’t survive fire? Are fire return intervals becoming shorter?

Line 44: It seems that “fire prediction” depends on both fuel mapping and weather forecasting, rather than just mapping fire risk (as determined by fuel load). And it seems that predicting when and where fires will occur depends most on weather and ignition sources.

The introduction seems quite short and probably needs a more detailed background and literature review.

Figure 1: The caption needs more information. Where did the forest fire data come from? What period is covered by this map? What units are shown in the DEM? Meters?

Line 85: What is meant by “forest fire label data”?

Lines 95-96: What kind of data were collected for the locations and dates listed? What was the source?

Line 97: What data are these? The nature and source of the “data” mentioned here has not been explained. 

Line 120: So the locations of past fires were obtained from the Korea Forest Service? How did they generate them? From field data? Remote sensing? What time period was studied here?

Line 130: What does “were incorporated” mean?

Line 57: What is the source of the DEM? What is its spatial resolution?

Table 2: How did you get a fire activity map from “road, building, cropland”?

Line 177: Spell out DNN at first use. What is a DNN?

Line 173 to 226: This section is heavy with jargon, and I could not follow the methods used, especially given the lack of clarity in the data section above it.

How was accuracy assessed? Were the data split for training and testing? 

I stopped reading at line 230 because I already knew I would reject the paper.

Author Response

Please see attached.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The overall quality of the manuscript has been greatly improved after revision, so the revised version is considered acceptable for publication.

Comments on the Quality of English Language

Proofreading by a professional English editor or native speaker is required before publication.
