Article
Peer-Review Record

Mapping Agricultural Land in Afghanistan’s Opium Provinces Using a Generalised Deep Learning Model and Medium Resolution Satellite Imagery

Remote Sens. 2023, 15(19), 4714; https://doi.org/10.3390/rs15194714
by Daniel M. Simms 1,*, Alex M. Hamer 1, Irmgard Zeiler 2, Lorenzo Vita 2 and Toby W. Waine 1
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 2 August 2023 / Revised: 13 September 2023 / Accepted: 21 September 2023 / Published: 26 September 2023

Round 1

Reviewer 1 Report

It is an interesting subject to map agricultural land in Afghanistan's opium provinces using a generalised deep-learning model and medium resolution satellite imagery. The manuscript investigates the characteristics of agricultural land in Afghanistan that span different image datasets and uses this new knowledge to train a generalised deep-learning model for fully automating image classification. The results indicate that training generalised deep-learning models using data from both new and long-term EO programmes is an exciting opportunity for automating image classification across datasets and through time that can improve our understanding of the environment. However, this article has some important flaws as follows:

1.      An overall flowchart introducing the logic and relationships among the steps is strongly suggested. I found it very difficult to understand the methods and steps. Many steps seem to be unrelated. You did not explain the role of each step clearly.

2.      The innovation and significance of the paper are not well expressed. Results in terms of tables and figures should be discussed more clearly.

3.      Why did you extract and test textural information for crop land mapping? Is it helpful? 

4.      The standardization of input data seems unnecessary to introduce. It is a basic step in image processing.

5.      The significance and usage of the generalised model are unclear to me. 

6.      What do you want to tell us when considering crop phenology or image timing? When comparing crop land between years, it is clear that we should use images from the growing seasons. This knowledge is easily understood.

7.      It is essential to emphasize the potential influences of urban expansion and climate change on crop land change. By situating the study within this broader context, its significance could be better elevated and the broader implications of the findings could be addressed. The following literature is suggested.

Evaluating trends, profits, and risks of global cities in recent urban expansion for advancing sustainable development. Habitat International……

Urbanisation and environmental degradation in Dhaka Metropolitan Area of Bangladesh. Environ. Sustain…. 

8.      In section 2.8 “Land use change in Helmand Province”, I see the introduction of images used. This should be in section 2.2.

9.      In table 3, the tests on shape and specific bands are not meaningful or significant. Using all bands could certainly achieve better results than just using one band.

10.   Table 3. OA, %  FwIoU %, should be modified as “Table 3. OA (%)  FwIoU (%)”. 

11.   In Figure 4, detailed true color samples are suggested for readers to know more about crop distribution clearly.

12.   In Figure 5, the great differences between (a) and (b) in terms of intersection over union (%) should be discussed. 

13.   In section 3.3, “The generalised model, trained up to 2017, had similar OA, UA and PA on Landsat-8 and Sentinel-2 (resampled to 30m) without any fine-tuning using Sentinel-2 data.”  I did not see Sentinel-2 results in table 5.

14.   In table 5, what do the “Training dataset DMC 2015 2016 2017” and the “×” mean?

Author Response

Thank you for your constructive review of our manuscript; we really appreciate the time and effort you have committed to the review process. Please see our point-by-point responses to your comments below; we have marked up all changes in the attached manuscript (text deletions in red, new text in blue). We look forward to your response.

1 An overall flowchart introducing the logic and relationships among the steps is strongly suggested. I found it very difficult to understand the methods and steps. Many steps seem to be unrelated. You did not explain the role of each step clearly. Text has been reworded in section 2 to better describe the separation of the preprocessing optimization (image features and input standardization in section 2.4) from the generalised model training (in section 2.5), making the steps clearer and consistent.
2 The innovation and significance of the paper is not well expressed. Results in terms of tables and figures should be discussed more clearly. We have improved the expression of the innovation and significance throughout the manuscript to highlight the development of a fully automatic classifier for agricultural land use that can be used with any medium resolution satellite image.
3 Why did you extract and test textural information for crop land mapping? Is it helpful?  Texture was found to account for ~73% of the total accuracy of the model, which is discussed in the first paragraph in section 4. The text has been modified to make the link with the results section clearer.
4 The standardization of input data seems unnecessary to introduce. It is a basic step in image processing. Standardisation is an important step in both image processing and training CNNs, but there are different approaches, so in the manuscript we specifically test several common methods. The rationale for introducing the concept is the importance of standardised input when using images from different years and different sensors, results for which are presented (table 4) and discussed in section 4. We have added some text in section 2.5 to make this point clearer.
5 The significance and usage of the generalised model are unclear to me. The generalised model works across years and sensors (medium resolution) and can be used to classify new images (or hindcast) without further training. It is trained on all the available data, with the approach determined from the testing of image features and standardisation. Clarification has been added in section 1 and section 2 to ensure this is fully described.
6 What do you want to tell us when considering crop phenology or image timing? When comparing crop land between years, it is clear that we should use images from the growing seasons. This knowledge is easily understood. The comparison of total area and crop timing is important as it demonstrates how early in the season the generalised model can be used for accurate mapping (see section 2.7). Figure 7 shows the relationship between area and crop vigour (expressed through NDVI); we have added text in section 3.4 to make this clearer to the reader.
7 It is essential to emphasize the potential influences of urban expansion and climate change on crop land change. By situating the study within this broader context, its significance could be better elevated and the broader implications of the findings could be addressed. The following literature is suggested. We are conscious of the broader significance of urban expansion and climate change, but our paper is focused on automation of mapping in the context of the UNODC's survey. As we have not measured changes relating to urbanisation or a changing climate, we feel that this would distract the reader from the specifics of the research.
8 In section 2.8 “Land use change in Helmand Province”, I see the introduction of images used. This should be in section 2.2. Table 1 describes the properties of the satellite data used, including the central wavelengths of the bands used, the revisit time and product resolution, along with the dates of all images. This table has been moved into section 2.2 (as close as possible) to be consistent with the text explaining the image properties.
9 In table 3, the tests on shape and specific bands are not meaningful or significant. Using all bands could certainly achieve better results than just using one band. Shape can be an important feature for image classification using CNNs, see Long et al. (2015) in the references, even though we have shown that in our specific case it is not significant (see section 4, paragraph 2). Data within image bands are correlated and contain a significant amount of redundant information; the contribution of each band to the model accuracy was investigated in order to optimize any potential pre-processing of the imagery (last paragraph of section 2.4) and as a possible way to reduce model complexity (discussed in section 4).
10 Table 3. OA, %  FwIoU %, should be modified as “Table 3. OA (%)  FwIoU (%)”.  All table captions changed to the suggested format.
11 In Figure 4, detailed true color samples are suggested for readers to know more about crop distribution clearly. Unfortunately the data do not have a blue band (DMC imagery from 2009 is limited to NIR, Red, Green, see new table 1), so it is not possible to show true colour images for the samples.
12 In Figure 5, the great differences between (a) and (b) in terms of intersection over union (%) should be discussed. Text added in section 3.2 to explain differences in fwIoU between the two models.
13 In section 3.3, “The generalised model, trained up to 2017, had similar OA, UA and PA on Landsat-8 and Sentinel-2 (resampled to 30m) without any fine-tuning using Sentinel-2 data.” I did not see Sentinel-2 results in table 5. Table 5 contains Sentinel-2 results for 2017 images.
14 In table 5, what does the “Training dataset DMC 2015 2016 2017” and the “×” mean? Table 5 caption edited to fully explain the cumulative training process by way of example for the first two rows.
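The standardisation discussion in point 4 above can be made concrete with a short sketch. The per-band z-score and min-max methods below are generic examples of common approaches, not necessarily the specific methods tested in the manuscript:

```python
import numpy as np

def standardize_image(img, method="zscore"):
    """Standardise a (bands, rows, cols) image array per band.

    Illustrative sketch of two common input-standardisation choices for
    CNN training; the method names are assumptions for illustration,
    not the manuscript's own implementation.
    """
    img = np.asarray(img, dtype=np.float64)
    out = np.empty_like(img)
    for b in range(img.shape[0]):
        band = img[b]
        if method == "zscore":
            # zero mean, unit variance per band
            out[b] = (band - band.mean()) / (band.std() + 1e-12)
        elif method == "minmax":
            # rescale each band to [0, 1]
            out[b] = (band - band.min()) / (band.max() - band.min() + 1e-12)
        else:
            raise ValueError(f"unknown method: {method}")
    return out
```

A consistent standardisation choice matters most when images from different sensors and years, with different radiometric ranges, are fed to the same model.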

Author Response File: Author Response.pdf

Reviewer 2 Report

The study uses a deep-learning model and medium-resolution satellite imagery to map agricultural land in Afghanistan's opium provinces.

 

The Research paper provides respectable findings and is well-written, but before it is accepted, it needs to be strengthened in the following ways:

 

In the abstract section, the author should include the quantitative results.

The introduction could be expanded, and more related research sources should be cited.

The author should discuss the study area's topography and vegetation conditions.

The author ought to discuss the drawbacks and challenges of the deep-learning model.

The author should describe important parameters of the accuracy.

The properties of the satellite data should be described by the author.

The figure must be amended to include the map scale and North arrow where they are absent.

The author should define all abbreviations before using them even if they are well known.

There are typo errors in many places; the author should correct them.

The author should mention the name of the software/tools used for data analysis.

The author must include logical arguments for the findings, limitations, and directions for further research in the conclusion section.

 Minor editing of the English language required

Author Response

Thank you for your constructive review of our manuscript; we really appreciate the time and effort you have committed to the review process. Please see our point-by-point responses to your comments below; we have marked up all changes in the attached manuscript (text deletions in red, new text in blue). We look forward to your response.

  • In the abstract section, the author should include the quantitative results.
    Abstract edited to include quantitative results and effect of features.
  • The introduction could be expanded, and more related research sources should be cited.
    We have kept the introduction focused on the specific objectives related to operational opium monitoring, with relevant referencing, to ensure a clear narrative.
  • The author should discuss the study area's topography and vegetation conditions.
    More information added to section 2.1
  • The properties of the satellite data should be described by the author.
    Table 1 describes the properties of the satellite data used, including the central wavelengths of the bands used, the revisit time and product resolution, along with the dates of all images. This table has been moved into section 2.2 (as close as possible) to be consistent with the text explaining the image properties.
  • The figure must be amended to include the map scale and North arrow where they are absent.
    Scale bars and north arrows added to all figures. Size of image chip added to figure 4.
  • The author should define all abbreviations before using them even if they are well known.
    All abbreviations appear in brackets on first use. OLI and FCN-8 added to the text.
  • There is a typo error in many places, the author should correct them.
    Typographical errors corrected, see marked up copy of the manuscript.
  • The author should mention the name of the software/tools used for data analysis.
    Further information added to the last paragraph of section 2.3.
  • The author must include logical arguments for the findings, limitations, and directions for further research in the conclusion section.
    These are discussed in section 4.

Author Response File: Author Response.pdf

Reviewer 3 Report

In general, this is a good work of high technical quality, with a practical application of machine learning, well written and with a good theoretical foundation.

The summary can be improved by incorporating more information about the results obtained, such as mentioning the value of FwIoU or that the effect of the shape on the performance of the model is minimal. 
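As background on the FwIoU metric mentioned above: it is the per-class intersection over union averaged with weights equal to each class's pixel frequency. An illustrative computation from a confusion matrix (not the authors' implementation):

```python
import numpy as np

def fwiou(conf):
    """Frequency-weighted IoU from a square confusion matrix.

    conf[i, j] = number of pixels of true class i predicted as class j.
    FwIoU = sum_c (n_c / N) * IoU_c, with IoU_c = TP_c / (TP_c + FP_c + FN_c).
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)
    freq = conf.sum(axis=1) / conf.sum()              # class frequency (true labels)
    union = conf.sum(axis=1) + conf.sum(axis=0) - tp  # TP + FP + FN per class
    iou = np.where(union == 0, 0.0, tp / np.where(union == 0, 1.0, union))
    return float((freq * iou).sum())
```

Unlike plain mean IoU, the frequency weighting means rare classes contribute less, which suits per-pixel land cover maps dominated by a few classes.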

Line 113-128. The description of the composition of the training and evaluation set should include the total number of individual images (chips) used in each set, not just the percentage.

It is not clear why the Kappa index was excluded from the metrics used, being one of the most frequently used for classification problems. It is suggested to justify its omission (or include said metric in your report).
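For reference, the Kappa index raised here is observed agreement corrected for the agreement expected by chance; a minimal sketch of its computation from a confusion matrix (illustrative only):

```python
import numpy as np

def cohens_kappa(conf):
    """Cohen's kappa from a confusion matrix: (p_o - p_e) / (1 - p_e)."""
    conf = np.asarray(conf, dtype=np.float64)
    n = conf.sum()
    p_o = np.trace(conf) / n                                  # observed agreement (= OA)
    p_e = (conf.sum(axis=1) * conf.sum(axis=0)).sum() / n**2  # chance agreement from marginals
    return float((p_o - p_e) / (1.0 - p_e))
```

The chance-correction term depends on the class marginals, which is one commonly cited criticism of kappa for maps with very uneven class prevalence.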

In the conclusions it is indicated that "...providing key information on the impact of opium cultivation on Afghanistan’s agricultural system". However, it is difficult to attribute this conclusion to the work carried out, since no distinction is made at the crop level; only surface changes are detected. There is a reference to (UNODC, 2015) where a correlation is made between surface changes and opium cultivation, but this is not corroborated in the present work.

 

Author Response

Thank you for your constructive review of our manuscript; we really appreciate the time and effort you have committed to the review process. Please see our point-by-point responses to your comments below; we have marked up all changes in the attached manuscript (text deletions in red, new text in blue). We look forward to your response.

The summary can be improved by incorporating more information about the results obtained, such as mentioning the value of FwIoU or that the effect of the shape on the performance of the model is minimal.  Abstract edited to include quantitative results and effect of features.
Line 113-128. The description of the composition of the training and evaluation set should include the total number of individual images (chips) used in each set, not just the percentage. Number of samples per year added to text.
It is not clear why the Kappa index was excluded from the metrics used, being one of the most frequently used for classification problems. It is suggested to justify its omission (or include said metric in your report). Text added to justify omission of kappa in section 2.10
In the conclusions it is indicated that "...providing key information on the impact of opium cultivation on Afghanistan’s agricultural system". However, it is difficult to attribute this conclusion to the work carried out, since no distinction is made at the crop level; only surface changes are detected. There is a reference to (UNODC, 2015) where a correlation is made between surface changes and opium cultivation, but this is not corroborated in the present work. Text rewritten to emphasise the use of the outputs by UNODC rather than corroboration of the link between agricultural area and opium.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

While some modifications have been made, there are still significant flaws that render the manuscript unacceptable.

It appears that there is repetition in your manuscript, with content in L94-103 being the same as that found in L82-89. Please eliminate this redundancy and ensure that each section of your manuscript provides unique and relevant information.

In Figure 1, it is important to include not only agriculture land but also other relevant information such as DEM (Digital Elevation Model), satellite images, and land use/land cover data to provide a comprehensive view of the study area. The original images can contribute to objectivity, reliability, and informativeness. The absence of such data raises concerns about the accuracy of the agriculture land classification. Addressing this issue is crucial to improve the quality of the study.

In L133, consider whether it is necessary to conduct this study if you already have an agriculture mask for multiple years. Discuss the quality of the mask and whether your results improve upon it. 

In L136-140, consider introducing basic information about opium, such as its peak biomass, which can help assess the suitability of your images for the study. Explain the variation in image capture times from January to June and address why there are no images available for the period from 2009 to 2015. Additionally, confirm the reliability of the images by verifying if they are indeed fully cloud-free.

In L153-158, it's important to provide a rationale for your choice of model, especially considering the availability of many similar and advanced models such as UNet++ and UNet3+. Acknowledge that there is no one-size-fits-all model for every task and clarify why your chosen model was deemed suitable for your specific study. To strengthen your argument, consider conducting comparative tests with other models to provide convincing evidence for your choice. This will demonstrate a thoughtful selection process and the model's effectiveness for your research.

In L185-192, you should address the concern that the number of samples for training and validating a DCNN model may be insufficient. To enhance the robustness of your model, consider increasing the number of samples. A cross-validation strategy, such as collecting samples from other years for training and using samples from 2009 for validation, is a valuable suggestion that can improve the reliability of your results.

Table 2 should be introduced within the context of Section 2.2 to provide readers with relevant information and improve the organization of your manuscript.

In Section 3.1, consider including figures that visually represent your experiments to provide a clearer and more vivid presentation of your findings. Additionally, include more accuracy indicators to bolster the persuasiveness of your results. Providing samples related to shape, texture, and spectral bands can further enhance the comprehensibility of your study and enable readers to assess your results more effectively.

Section 3.2 appears to have similar issues as Section 3.1. For instance, I did not see an NDVI image sample.
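For reference, the NDVI sample requested here is the standard normalised difference of near-infrared and red reflectance; a minimal sketch of its computation:

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red): a standard proxy for crop vigour."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    denom = nir + red
    # guard against division by zero (e.g. water or shadow pixels)
    return np.where(denom == 0, 0.0, (nir - red) / np.where(denom == 0, 1.0, denom))
```

Dense, actively growing vegetation pushes NDVI towards 1, which is why crop-area estimates track NDVI through the season.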

In Figure 5, it's crucial to include a legend and a detailed explanation to clarify the meaning of elements such as "10-100%" and the overall message you intend to convey with the figure.

The presence of UA (User's Accuracy) and PA (Producer's Accuracy) in Table 5 raises questions about why these indicators were not included in Table 3 and Table 4. UA and PA were not introduced or explained before their appearance in Table 5. Additionally, the use of different accuracy indicators in these tables requires clarification. 
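As background, OA, UA and PA all derive from the same confusion matrix; a minimal sketch (the row-as-reference convention below is an assumption for illustration, not taken from the manuscript):

```python
import numpy as np

def accuracy_metrics(conf):
    """OA plus per-class user's and producer's accuracy from a confusion matrix.

    conf[i, j] = pixels of reference (true) class i mapped to class j.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)
    oa = tp.sum() / conf.sum()   # overall accuracy: fraction of correctly mapped pixels
    pa = tp / conf.sum(axis=1)   # producer's accuracy per class: 1 - omission error
    ua = tp / conf.sum(axis=0)   # user's accuracy per class: 1 - commission error
    return float(oa), ua, pa
```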

In Section 3.4, you should address the necessity of the experiment you conducted. If this experiment has been extensively studied and well-established in previous research, it is essential to justify its inclusion in your study.

If you intend to demonstrate your testing outside the Helmand study area, it is advisable to include original images and result maps related to these areas. Providing visual evidence in the form of images and maps can enhance the credibility and persuasiveness of your results. Simply presenting numerical data may not be sufficient to convey the completeness and validity of your work.

In the discussion section, which includes L400-L431, it's important to avoid simple repetition of results. Instead, consider incorporating more substantial content, such as comparative tests with other models, an analysis of the factors influencing agriculture land change, the implications of these changes, or exploring the relationship between land use change and opium cultivation.

In Figure 10, in addition to depicting agriculture expansion, it's advisable to include and discuss the situation of diminishing and re-developed agriculture land.

It's important to address the natural and social factors, such as wars and conflicts, that may be related to the agriculture use changes observed in your study. The absence of a discussion or quantification of the impact of war, despite its significance during the study years, is indeed a notable gap. 

In Figure 11, it's advisable to present classification errors more clearly by using labels such as FN (False Negative), TN (True Negative), FP (False Positive), and TP (True Positive). Providing this clarity will improve the interpretation of your findings and make the figure more clear.

In Figures 10 and 11, it is advisable to use true-color images that employ the RGB (Red, Green, Blue) channels from Landsat or similar data sources. True-color images can help readers directly compare the results with actual visual data, making it easier to assess the credibility and accuracy of your findings. 

Last but not least, I noticed that some of my previous comments have not been fully addressed or revised in the manuscript. Please thoroughly review and revise your paper to incorporate all the provided feedback and suggestions.

Author Response

Thank you once again for your time in reviewing our manuscript and your useful suggestions. We have revised the manuscript (marked up copy attached) and responded to each of your comments below.

It appears that there is repetition in your manuscript, with content in L94-103 being the same as that found in L82-89. Please eliminate this redundancy and ensure that each section of your manuscript provides unique and relevant information. Thanks for spotting this, we have removed the repeated sentences from the introduction.
In Figure 1, it is important to include not only agriculture land but also other relevant information such as DEM (Digital Elevation Model), satellite images, and land use/land cover data to provide a comprehensive view of the study area. The original images can contribute to objectivity, reliability, and informativeness. The absence of such data raises concerns about the accuracy of the agriculture land classification. Addressing this issue is crucial to improve the quality of the study. Figure 1 shows the study area extent, location and the outlines of the image collections used within the study. It is intended only to show the information relevant to the study. Adding extra topographic information would make the figure cluttered and distract from the objectivity, reliability and informativeness. We have changed the figure caption to highlight this.
In L133, consider whether it is necessary to conduct this study if you already have an agriculture mask for multiple years. Discuss the quality of the mask and whether your results improve upon it. The purpose of the study is to train a generalised model from historical data; this is clearly articulated in the introduction and discussed in the manuscript. The quality of the existing data used to train the model is discussed and compared quantitatively to our classification results. Improvements relate to automatic classification, without the need for manual interpretation, at the same high accuracy (see first paragraph of the conclusion).
In L136-140, consider introducing basic information about opium, such as its peak biomass, which can help assess the suitability of your images for the study. Explain the variation in image capture times from January to June and address why there are no images available for the period from 2009 to 2015. Additionally, confirm the reliability of the images by verifying if they are indeed fully cloud-free. Images are timed for the peak in biomass; a reference has been added in this section. Further discussion of timing can be found in section 4, relating to the results for timing of imagery (presented in figure 10).
In L153-158, it's important to provide a rationale for your choice of model, especially considering the availability of many similar and advanced models such as UNet++ and UNet3+. Acknowledge that there is no one-size-fits-all model for every task and clarify why your chosen model was deemed suitable for your specific study. To strengthen your argument, consider conducting comparative tests with other models to provide convincing evidence for your choice. This will demonstrate a thoughtful selection process and the model's effectiveness for your research. This is a valid point; we have added extra justification for U-Net type models and a reference.
In L185-192, you should address the concern that the number of samples for training and validating a DCNN model may be insufficient. To enhance the robustness of your model, consider increasing the number of samples. A cross-validation strategy, such as collecting samples from other years for training and using samples from 2009 for validation, is a valuable suggestion that can improve the reliability of your results. We used all the data containing agriculture and natural vegetation (full coverage) for training the model 2007 to 2009, including for the experiments in L185-192. The generalised model was evaluated using data for other years (see x's in table 5 and text in section 2.5).
Table 2 should be introduced within the context of Section 2.2 to provide readers with relevant information and improve the organization of your manuscript. We prefer to keep the list of images used to create yearly maps from the generalised model separate from the model training and evaluation data, linked to their respective methodology sections. Table 1 contains the images used for model training and evaluation (heading amended for clarity), while table 2 is the list of images used to create the final maps; again, the caption has been edited for clarity and the table moved into section 2.7.
In Section 3.1, consider including figures that visually represent your experiments to provide a clearer and more vivid presentation of your findings. Additionally, include more accuracy indicators to bolster the persuasiveness of your results. Providing samples related to shape, texture, and spectral bands can further enhance the comprehensibility of your study and enable readers to assess your results more effectively. Thanks for your comment; we agree that these would add to effective communication of the experiments. Extra examples have been added for texture and spectral bands to improve reader understanding, see new figures 5 and 6.
Section 3.2 appears to have similar issues as Section 3.1. For instance, I did not see a NDVI image sample.  Same as above, new figure 7 added
In Figure 5 (now 8), it's crucial to include a legend and a detailed explanation to clarify the meaning of elements such as "10-100%" and the overall message you intend to convey with the figure. Caption text edited further for clarification of the overall message of the figure and information on IoU scales added to the legend (this is now figure 8).
The presence of UA (User's Accuracy) and PA (Producer's Accuracy) in Table 5 raises questions about why these indicators were not included in Table 3 and Table 4. UA and PA were not introduced or explained before their appearance in Table 5. Additionally, the use of different accuracy indicators in these tables requires clarification. Overall accuracy was used to assess the pre-processing steps as a single value was required (see validation section of the methodology). The final model was evaluated against the user and producer metrics, definitions for which were missing. Section 2.9 has now been updated.
In Section 3.4, you should address the necessity of the experiment you conducted. If this experiment has been extensively studied and well-established in previous research, it is essential to justify its inclusion in your study. Timing is an important factor as it relates not only to crop phenology but also to the acquisition window for imagery, which is a critical operational consideration (see section 2.6; this is discussed in section 4, L470-478). A reference has been added on the critical role of phenology in Afghanistan specifically, and text added to clarify the importance of early information.
If you intend to demonstrate your testing outside the Helmand study area, it is advisable to include original images and result maps related to these areas. Providing visual evidence in the form of images and maps can enhance the credibility and persuasiveness of your results. Simply presenting numerical data may not be sufficient to convey the completeness and validity of your work. Figure 14 shows examples of the model output for Farah and Nangarhar with the images used for classification.
In the discussion section, which includes L400-L431, it's important to avoid simple repetition of results. Instead, consider incorporating more substantial content, such as comparative tests with other models, an analysis of the factors influencing agriculture land change, the implications of these changes, or exploring the relationship between land use change and opium cultivation. L400 to 431 is a discussion of the importance of texture and shape to model detection and how our findings are consistent with, and differ from, other work. Drivers of change and socio-economic factors, although interesting, are beyond the scope of this study. We have checked the results section for repetition.
In Figure 10 (now 13), in addition to depicting agriculture expansion, it's advisable to include and discuss the situation of diminishing and re-developed agriculture land. Expansion dominates (figure 12), so we have chosen to show the maps in figure 13 as cumulative. We did not specifically investigate re-development, this would be a good use of a generalised model for future work, but beyond the scope of this study.
It's important to address the natural and social factors, such as wars and conflicts, that may be related to the agriculture use changes observed in your study. The absence of a discussion or quantification of the impact of war, despite its significance during the study years, is indeed a notable gap.  The context is presented in the introduction but the paper is about methodological advancements for mapping agriculture, that could be applied to other countries/regions.
In Figure 11 (now 14), it's advisable to present classification errors more clearly by using labels such as FN (False Negative), TN (True Negative), FP (False Positive), and TP (True Positive). Providing this clarity will improve the interpretation of your findings and make the figure more clear. Descriptive labels added to the figure to make the errors clear (shadow in a and b, pine forest in b).
In Figures 10 (now 13) and 11 (now 14), it is advisable to use true-color images that employ the RGB (Red, Green, Blue) channels from Landsat or similar data sources. True-color images can help readers directly compare the results with actual visual data, making it easier to assess the credibility and accuracy of your findings. The norm for vegetation studies is false colour (NIR); active vegetation is clearer in false colour, and we did not use true colour imagery in our study (see table 1). The DMC images used for classification in figure 14 (new figure number) have no blue band. Figure 10 (now figure 13) is a thematic map.
Last but not least, I noticed that some of my previous comments have not been fully addressed or revised in the manuscript. Please thoroughly review and revise your paper to incorporate all the provided feedback and suggestions. To the best of our knowledge, we responded to all the points from your first review, with thorough consideration, and made it clear where we had changed the paper. Thank you for your feedback and suggestions.

Author Response File: Author Response.pdf
