Early Crop Mapping Based on Sentinel-2 Time-Series Data and the Random Forest Algorithm
Round 1
Reviewer 1 Report
This is an interesting study. Early remote sensing identification and extraction of crop planting information can enable timely monitoring of crop growth and help farmers implement corresponding field management measures. However, the following amendments are recommended:
1. How to overcome the problem that the early season mapping mentioned in the abstract can only use remote sensing image data of partial crop growth period?
2. How to deal with the collinearity problem between spectral indexes (Fig.4)?
3. The introduction of the census purpose of the sample sites in the Three Rivers Plain is not very clear, and it is suggested to sort out this part of the content.
4. Reference citations are required for the introduction of the random forest method.
5. The introduction to the linear interpolation and Savitzky-Golay (SG) methods is insufficient. It is suggested to provide additional information on these methods or cite relevant references.
6. Line 69: “…which leads to…” is suggested to be changed to “…which lead to…”.
7. Line 137: “the Three Rivers Plain” should be consistent with the context.
8. Line 280-281: “Scheme F4” is not clearly expressed.
9. Line 398: Further explanation can be given for the inconsistent distribution between early and post-season surveys.
Author Response
Response to Reviewer 1 Comments
Point 1: This is an interesting study. Early remote sensing identification and extraction of crop planting information can enable timely monitoring of crop growth and help farmers implement corresponding field management measures.
Response 1: We appreciate the reviewer’s positive comments and helpful suggestions. We have revised the manuscript according to the reviewer’s comments.
Point 2: How to overcome the problem that the early season mapping mentioned in the abstract can only use remote sensing image data of partial crop growth period?
Response 2: Thank you for your careful review. We offer the following explanation. Time series of multiple vegetation indices derived from remote sensing images contain information on both the phenological and the growth characteristics of crops. In this study, linear interpolation was used to fill gaps in the time-series images, and the SG filter was used to smooth them, in order to reduce the effect of the limited remote sensing data available for early mapping.
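As a hedged illustration of the gap filling and smoothing described in Response 2, the sketch below fills missing values in a single-pixel vegetation index series by linear interpolation and then applies a Savitzky-Golay filter. It is not the authors' implementation: the 5-point window, the quadratic polynomial order, the equally spaced time steps, the unchanged edge values, and the sample NDVI values are all assumptions made for illustration.

```python
# Assumption: observations are equally spaced in time (for example, one
# composite per fixed interval) and missing observations are stored as None.

def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between the nearest valid
    neighbours; leading and trailing gaps take the nearest valid value."""
    valid = [i for i, v in enumerate(values) if v is not None]
    if not valid:
        raise ValueError("time series has no valid observations")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Standard Savitzky-Golay smoothing coefficients for a 5-point window and a
# quadratic (or cubic) polynomial fit.
SG5_QUADRATIC = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def smooth_sg5(values):
    """Smooth interior points with the 5-point filter; the two points at each
    end are left unchanged (a simplification chosen for this sketch)."""
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * x for c, x in zip(SG5_QUADRATIC, window))
    return smoothed


# Hypothetical NDVI series with two cloud-masked (missing) observations.
ndvi = [0.20, None, 0.30, 0.45, None, 0.70, 0.75]
print(smooth_sg5(fill_gaps_linear(ndvi)))
```

A wider window or a different polynomial order would change the degree of smoothing; the settings actually used in the manuscript are not stated in this record.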
Point 3: How to deal with the collinearity problem between spectral indexes (Fig.4)?
Response 3: Thank you for your careful review. We offer the following explanation. In this study, scheme F4 is proposed, in which the optimal features are selected using random forest importance analysis. This balances the effects of collinearity and importance among the time-series remote sensing classification features.
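Response 3 does not spell out how importance and collinearity are balanced. One common way to combine the two, shown here only as a sketch and not as the authors' procedure, is to rank features by importance and keep a feature only if it is not strongly correlated with any feature already kept. The importance scores are assumed to come from a fitted random forest; the feature names, sample values, scores, and the 0.9 correlation threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def select_features(samples, importance, max_abs_corr=0.9):
    """Greedy selection: visit features from most to least important and keep
    one only if |r| with every already kept feature is below max_abs_corr."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(samples[name], samples[k])) < max_abs_corr
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-sample feature values and importance scores.
samples = {
    "NDVI": [0.21, 0.35, 0.52, 0.70],
    "EVI": [0.15, 0.26, 0.40, 0.55],   # nearly collinear with NDVI
    "LSWI": [0.30, 0.10, 0.25, 0.05],
}
importance = {"NDVI": 0.50, "EVI": 0.30, "LSWI": 0.20}
print(select_features(samples, importance))
```

With this rule, a highly ranked feature suppresses lower-ranked features that carry nearly the same information, which is one way to trade importance against redundancy.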
Point 4: The introduction of the census purpose of the sample sites in the Three Rivers Plain is not very clear, and it is suggested to sort out this part of the content.
Response 4: Thanks for your careful review. We have made the revision.
Point 5: Reference citations are required for the introduction of the random forest method.
Response 5: We agree. We have added the relevant reference citations.
Point 6: The introduction to the linear interpolation and Savitzky-Golay (SG) methods is insufficient. It is suggested to provide additional information on these methods or cite relevant references.
Response 6: We agree. We have added the relevant reference citations.
Point 7: Line 69: “…which leads to…” is suggested to be changed to “…which lead to…”.
Response 7: We agree. We have made the revision.
Point 8: Line 137: “the Three Rivers Plain” should be consistent with the context.
Response 8: We agree. We have made the revision.
Point 9: Line 280-281: “Scheme F4” is not clearly expressed.
Response 9: Thanks for your careful review. We have made the revision.
Point 10: Line 398: Further explanation can be given for the inconsistent distribution between early and post-season surveys.
Response 10: Thanks for your careful review. We have provided a further explanation of the inconsistent distribution between early and post-season surveys.
Author Response File: Author Response.docx
Reviewer 2 Report
Early crop mapping based on Sentinel-2 time-series data and the random forest algorithm
A summary
The research team trained a random forest classifier on Sentinel-2 satellite images and ground sample data to detect the earliest identifiable time of rice, maize, and soybean crops in the Sanjiang Plain. The detected remote sensing classification features showed strong performance for early crop mapping.
General concept comments
The proposed methodology represents a useful approach for early crop detection and monitoring, with important implications for the rapid identification of severe food crises and the development of early warning systems, as highlighted by the authors.
Review
A major revision is required. The manuscript needs important improvements. The introduction and the materials and methods sections are the most critical and require the largest improvements, so that readers can better understand the state of the art and how the survey was carried out. I also suggest that the authors pay more attention to formatting and to how things are written; some sentences are not clear and can be misunderstood. Some suggestions to improve the work are reported below.
Specific comments
Abstract
▪ Lines 20-22: It is not clear why the authors describe training the classifiers on the remote sensing image data as challenging.
Introduction
The introduction needs to be improved because of several errors (some are listed below). Most of it is unclear, particularly the state-of-the-art part, where the authors quote results from other works in a confusing way. The paragraphs are not well connected, and the final part concerning the objective is unclear.
▪ Line 43: why is early-season crop mapping important, for example in rapid response to agricultural disasters? Please provide an explanation.
▪ Lines 47-50: it is hard to understand the meaning of this sentence; please improve it.
▪ Line 78: what measurement unit is “hm2”?
▪ Line 95: why is there a question mark at the end of the sentence?
Materials and methods
▪ Line 99: please provide the reference system.
▪ Line 105: even if not mandatory, I suggest that the authors avoid the first-person plural (“we”) when explaining what the research team has done. Example: “In this study, spring rice, spring maize and spring soybeans were selected as the surveyed cultivar”.
▪ Line 108: what does this yield value relate to? Is it the total surface area? Which cultivar does this value relate to?
▪ Line 111: please provide the reference system.
▪ Lines 117-118: what does the visual error consist of and, precisely, how did the authors perform the visual interpretation on Google Earth?
▪ Lines 123-126: a reference to Figure 1 would help readers better understand how the two strips were designed.
▪ Lines 134-136: only training and validation are mentioned. Is the dataset described in lines 137-141 the test one? If yes, please define it as “test”.
▪ Lines 143-144: when you define the resolution, please remember that meters or centimetres refer to one pixel. Add this to the resolution value.
▪ Lines 150-151: what do the authors mean by “but also significantly reduces the problem of mixed image elements”?
▪ Lines 162-165: it is not clear why and how you performed these interpolation and filtering techniques. Please provide references where these techniques were used, or describe the processes in more detail.
▪ Figure 2: the figure is too big. What do the sample data consist of? It is still not clear.
▪ Line 241: please provide the default values.
Results
▪ Figure 5: what do the vertical dashed blue, black, and red lines refer to? Please define them in the caption.
▪ Figure 6: I suggest that the authors use higher-contrast colours to better distinguish between the target crops.
Discussion
▪ Lines 345-348: do the authors think that the cloud masking methods and the Savitzky-Golay filter helped to solve the problems stated below? There are no considerations about this.
▪ Line 365: I think “although” is not the correct word; “since” is probably better.
▪ Lines 351-371: this section would benefit from appropriate references to tables, to help the reader better understand the results.
▪ Line 372: check the format.
▪ Line 385: how do the authors think this problem could be overcome?
▪ Lines 395-397: it is not clear what the authors mean by “the least remote sensing images”; can you provide more explanation?
▪ Line 435: this sentence can be misunderstood. It seems that you analysed the survey zone for ten days straight, rather than every 10 days.
I suggest that the authors pay more attention to formatting and spend time on how things are written. Some sentences are not clear and can be misunderstood. If possible, submit the manuscript for English revision.
Author Response
Response to Reviewer 2 Comments
Point 1: The introduction and the materials and methods sections are the most critical and require the largest improvements, so that readers can better understand the state of the art and how the survey was carried out. I also suggest that the authors pay more attention to formatting and to how things are written; some sentences are not clear and can be misunderstood. Some suggestions to improve the work are reported below.
Response 1: We appreciate the reviewer’s positive comments and helpful suggestions. We have revised the manuscript according to the reviewer’s comments.
Point 2: Lines 20-22: It is not clear why the authors define the classifiers training related to the remote sensing image data as challenging.
Response 2: Thank you for your careful review. We offer the following explanation. Early-season mapping uses imagery from only the early and middle stages of the crop growing season. Therefore, potential uncertainties are more likely to affect the accuracy of early crop mapping.
Point 3: Line 43: why is early-season crop mapping important, for example in rapid response to agricultural disasters?
Response 3: Thank you for your careful review. We offer the following explanation. First, early-season mapping can contribute to the early detection of famines and rapid response to potential risks (e.g., agricultural disasters such as flood, drought, and windstorm damage). Moreover, timely or earlier cropping information also helps agricultural insurance companies assess disaster losses and compensation for farmers, instead of relying on traditional time- and labor-intensive field visits. Last but not least, crop distribution information at the early stage helps to guide agricultural water and fertilization management (e.g., magnitude and timing of irrigation and fertilization) as well as the coordination of harvest transportation.
Point 4: Lines 47-50: it is hard to understand the meaning of this sentence, please improve it.
Response 4: We agree. We revised the sentence.
Point 5: Line 78: what measurement unit is “hm2”?
Response 5: Thanks for your careful review. We have revised this.
Point 6: Line 95: why is there a question mark at the end of the sentence?
Response 6: Thanks for your careful review. We have revised this.
Point 7: Line 99: please provide the reference system.
Response 7: Thanks for your careful review. We have added the description of the reference system.
Point 8: Line 105: even if not mandatory, I suggest that the authors avoid the first-person plural (“we”) when explaining what the research team has done. Example: “In this study, spring rice, spring maize and spring soybeans were selected as the surveyed cultivar.”
Response 8: Thanks for your careful review. We revised the sentence.
Point 9: Line 108: what does this yield value relate to? Is it the total surface area? Which cultivar does this value relate to?
Response 9: Thanks for your careful review. We revised the sentence to read: “The total grain yield of the Sanjiang Plain can reach 15 million tons per year, and the per capita arable area and grain yield are above the national average.”
Point 10: Line 111: please provide the reference system.
Response 10: Thanks for your careful review. We have added the description of the reference system.
Point 11: Lines 117-118: what does the visual error consist of and, precisely, how did the authors perform the visual interpretation on Google Earth?
Response 11: Thank you for your careful review. We offer the following explanation. In this study, visual interpretation was mainly used to eliminate crop samples with obvious errors. Google Earth high-resolution images make it easy to identify whether a site is a cultivated field. Therefore, we first excluded the crop samples that were not on cultivated land. After that, we reviewed the data and consulted professionals to build a sample library of high-definition satellite images of rice, corn and soybean crops. Finally, crop samples with obvious errors were excluded according to the sample library.
Point 12: Lines 123-126: a reference to Figure 1 would help readers better understand how the two strips were designed.
Response 12: We agree. We revised the sentence.
Point 13: Lines 134-136: only training and validation are mentioned. Is the dataset described in lines 137-141 the test one? If yes, please define it as “test”.
Response 13: Thanks for your careful review. We have made the revision.
Point 14: Lines 143-144: when you define the resolution, please remember that meters or centimetres refer to one pixel. Add this to the resolution value.
Response 14: We agree. We revised the sentence.
Point 15: Lines 150-151: what do the authors mean by “but also significantly reduces the problem of mixed image elements”?
Response 15: Thank you for your careful review. We offer the following explanation. Because Sentinel-2 images have high spatial resolution, high temporal resolution, and more multispectral bands, they can provide additional information and advantages in the image fusion process.
Point 16: Lines 162-165: it is not clear why and how you performed these interpolation and filtering techniques. Please provide references where these techniques were used, or describe the processes in more detail.
Response 16: We agree. We have added supplementary references.
Point 17: Figure 2: the figure is too big. What do the sample data consist of? It is still not clear.
Response 17: We agree. We revised Figure 2. The description of the sample data has been revised in detail, as noted in the response to Point 13.
Point 18: Line 241: please provide the default values.
Response 18: Thanks for your careful review. We have added the description of the default values.
Point 19: Figure 5: what do the vertical dashed blue, black, and red lines refer to? Please define them in the caption.
Response 19: Thanks for your careful review. We have added an explanation.
Point 20: Figure 6: I suggest that the authors use higher-contrast colours to better distinguish between the target crops.
Response 20: We agree. Following the suggestion, we have redrawn Figure 2.
Point 21: Lines 345-348: do the authors think that the cloud masking methods and the Savitzky-Golay filter helped to solve the problems stated below? There are no considerations about this.
Response 21: Thank you for your careful review. We offer the following explanation. Cloud masking before linear interpolation can minimize the influence of clouds on remote sensing images, improve image quality, and reduce the influence of missing image values on linear interpolation. Linear interpolation followed by Savitzky-Golay filtering can remove outliers and reduce noise, improving the quality of the time-series images.
Point 22: Line 365: I think “although” is not the correct word, probably “Since” is better.
Response 22: We agree. We revised the sentence.
Point 23: Lines 351-371: this section would benefit from appropriate references to tables, to help the reader better understand the results.
Response 23: We agree. We have added Figure 7.
Point 24: Line 372: check the format.
Response 24: We agree. We revised the sentence.
Point 25: Line 385: how do the authors think this problem could be overcome?
Response 25: Thank you for your careful review. We offer the following explanation. The sample sites in this study were collected through manual fieldwork, so their distribution was inevitably affected by human factors. Some sample points can be deleted or added by visual interpretation of Google Earth high-resolution images, and a random function is then used to obtain randomly distributed sample points.
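The random draw mentioned in Response 25 could look like the following minimal sketch. The response does not say how the draw is organised, so drawing a fixed number of points per crop class, the fixed seed, and the point identifiers are all assumptions made for illustration.

```python
import random


def stratified_random_sample(points, per_class, seed=0):
    """Draw up to `per_class` point IDs at random from each crop class.

    points: iterable of (point_id, crop_label) pairs, assumed to have
    already passed visual screening.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_class = {}
    for point_id, label in points:
        by_class.setdefault(label, []).append(point_id)
    sample = {}
    for label, ids in by_class.items():
        k = min(per_class, len(ids))
        sample[label] = sorted(rng.sample(ids, k))
    return sample


# Hypothetical screened field points.
points = [(1, "rice"), (2, "rice"), (3, "rice"), (4, "rice"),
          (5, "maize"), (6, "maize"), (7, "soybean")]
print(stratified_random_sample(points, per_class=2))
```

Drawing per class keeps the less frequent crops represented; a single draw over all points would be an equally valid reading of the response.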
Point 26: Lines 395-397: it is not clear what the authors mean by “the least remote sensing images”; can you provide more explanation?
Response 26: Thank you for your careful review. We offer the following explanation. Rice can be identified at DOY 130, so only images from DOY 90-130 were used to identify rice. The same reasoning applies to maize and soybean; rice therefore uses the fewest remote sensing images.
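The reasoning in Response 26 can be sketched as follows: images are added cumulatively from the start of the season, and the earliest identifiable date is the first day of year (DOY) at which classification accuracy reaches a chosen target. Only the DOY 90-130 window for rice comes from the response; the accuracy values, the 0.90 target, and the 10-day acquisition spacing are hypothetical.

```python
def earliest_identifiable_doy(accuracy_by_doy, threshold):
    """Return the first DOY whose cumulative-series accuracy meets the
    threshold, or None if the threshold is never reached."""
    for doy in sorted(accuracy_by_doy):
        if accuracy_by_doy[doy] >= threshold:
            return doy
    return None


def images_used(acquisition_doys, start_doy, end_doy):
    """List the acquisitions that fall inside [start_doy, end_doy]."""
    return [d for d in acquisition_doys if start_doy <= d <= end_doy]


# Hypothetical accuracy of a classifier trained on images from DOY 90 up to
# each listed DOY.
rice_accuracy = {90: 0.55, 100: 0.68, 110: 0.79, 120: 0.87, 130: 0.92}
doy = earliest_identifiable_doy(rice_accuracy, threshold=0.90)

# Hypothetical acquisitions every 10 days across the season.
acquisitions = list(range(90, 280, 10))
print(doy, images_used(acquisitions, 90, doy))
```

A crop that reaches the target later in the season draws on a longer window and therefore on more images, which is the sense in which the earliest-identified crop uses the fewest.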
Point 27: Line 435: this sentence can be misunderstood. It seems that you analysed the survey zone for ten days straight, rather than every 10 days.
Response 27: We agree. We revised the sentence.
Author Response File: Author Response.docx
Reviewer 3 Report
The text is well structured and raises an interesting issue. In any case, the validity of the results is limited mainly because it only considers the data of one year (year 2022). It is interesting that it considers time series, although it does not specify in detail how it does the interpolation.
Some issues to be clarified in the text:
1. How has the process of "visual interpretation of Google Earth high-resolution images" been carried out? In which cases has it been necessary to perform such visual interpretation?
2. The authors would need to indicate the field measurements carried out at the sampling points, and whether they have information on the meteorological conditions at these points. This could be useful to compare them with results from other years, or to apply the working methodology to other areas.
3. Indicate how the original bands and spectral indices are weighted in the F3 scheme.
4. What are the optimal characteristics considered in scheme F4? The text of lines 210-221 does not clarify this issue, since it does not specify how this scheme is carried out. How do you choose the optimal features?
5. In Fig. 3 you have normalized all values from 0 to 1, but in Fig. 4 the Y-axis values are not normalized. Why does Figure 4 not use values normalized from 0 to 1, or from -1 to 1 as the NDVI does?
6. How are the values of the "strip study area selected" treated compared with the rest of the sampling points? Do the results obtained depend on this strip?
7. It is important to interpret the results of the analysis in different production campaigns. The results obtained in a single year must be compared with two previous years to have interpretable results. If this is not possible, the results obtained should be compared with other existing studies in the literature, analyzing the different results and indicating the advantages of the proposed method.
Author Response
Response to Reviewer 3 Comments
Point 1: The text is well structured and raises an interesting issue. In any case, the validity of the results is limited mainly because it only considers the data of one year (year 2022). It is interesting that it considers time series, although it does not specify in detail how it does the interpolation.
Response 1: We appreciate the reviewer’s positive comments and helpful suggestions. We have revised the manuscript according to the reviewer’s comments.
Point 2: How has the process of "visual interpretation of Google Earth high-resolution images" been carried out? In which cases has it been necessary to perform such visual interpretation?
Response 2: Thanks for your careful review. In this study, visual interpretation was mainly used to eliminate crop samples with obvious errors. Google Earth high-resolution images make it easy to identify whether a site is a cultivated field. Therefore, we first excluded the crop samples that were not on cultivated land. After that, we reviewed the data and consulted professionals to build a sample library of high-definition satellite images of rice, corn and soybean crops. Finally, crop samples with obvious errors were excluded according to the sample library.
Point 3: The authors would need to indicate the field measurements carried out at the sampling points, and whether they have information on the meteorological conditions at these points. This could be useful to compare them with results from other years, or to apply the working methodology to other areas.
Response 3: We agree. We have added information on weather conditions.
Point 4: Indicate how the original bands and spectral indices are weighted in the F3 scheme.
Response 4: Thanks for your careful review. We have added an explanation of the weights.
Point 5: What are the optimal characteristics considered in scheme F4? The text of lines 210-221 does not clarify this issue, since it does not specify how this scheme is carried out. How do you choose the optimal features?
Response 5: We agree. We have added an explanation of the optimal features.
Point 6: In Fig. 3 you have normalized all values from 0 to 1, but in Fig. 4 the Y-axis values are not normalized. Why does Figure 4 not use values normalized from 0 to 1, or from -1 to 1 as the NDVI does?
Response 6: Thank you for your careful review. We offer the following explanation. Figure 3 shows the normalized random forest importance scores of the remote sensing features, whereas Figure 4 shows the actual remote sensing feature values.
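The distinction drawn in Response 6 can be illustrated with a short sketch. The response says only that the importance scores are normalized; min-max scaling to the range 0 to 1 is assumed here, and the raw scores are hypothetical. Feature values such as NDVI keep their physical range (-1 to 1) and are not rescaled.

```python
def min_max_normalize(scores):
    """Rescale a dict of scores linearly so the smallest maps to 0 and the
    largest to 1; if all scores are equal, every score maps to 0."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {name: 0.0 for name in scores}
    return {name: (value - low) / (high - low)
            for name, value in scores.items()}


# Hypothetical raw random forest importance scores for three features.
raw_importance = {"NDVI": 0.042, "EVI": 0.031, "LSWI": 0.018}
print(min_max_normalize(raw_importance))
```

Normalized scores are comparable across features regardless of each feature's units, which is why a ranking plot and a plot of raw feature values can legitimately use different axes.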
Point 7: How are the values of the "strip study area selected" treated compared with the rest of the sampling points? Do the results obtained depend on this strip?
Response 7: Thanks for your careful review. We have rewritten this section.
Point 8: It is important to interpret the results of the analysis in different production campaigns. The results obtained in a single year must be compared with two previous years to have interpretable results. If this is not possible, the results obtained should be compared with other existing studies in the literature, analyzing the different results and indicating the advantages of the proposed method.
Response 8: Thanks for your careful review. According to your comments we have added relevant content.
Author Response File: Author Response.docx
Round 2
Reviewer 2 Report
The manuscript in its present form incorporates all the suggested changes, so I accept it.