Article
Peer-Review Record

Cropland Mapping Using Fusion of Multi-Sensor Data in a Complex Urban/Peri-Urban Area

Remote Sens. 2019, 11(2), 207; https://doi.org/10.3390/rs11020207
by Eunice Nduati 1,*, Yuki Sofue 1, Akbar Matniyaz 1, Jong Geol Park 2, Wei Yang 1,3 and Akihiko Kondoh 1,3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 19 November 2018 / Revised: 13 January 2019 / Accepted: 13 January 2019 / Published: 21 January 2019
(This article belongs to the Special Issue High Resolution Image Time Series for Novel Agricultural Applications)

Round 1

Reviewer 1 Report

The paper provides a deterministic remote sensing and machine learning based cropland modeling and mapping approach. The overall framework is valid and in line with expectation. The paper is interesting, detail oriented and relevant to the field of the journal. While the spatial analysis is detailed and correct, there is some scope for improvement.

SPECIFIC COMMENTS
2. Data and Methods
 

- In Line 99, Change km2 to km² and add a comma while reporting the population number

- Line 102 remove ‘according to’

- Consider revising Line 102 (vastly heterogeneous)

- Revise Figure 1. The left blowup figure is unclear and hard to understand. Also in the right figure the ocean area can be made light blue instead of dark color. Also consider using an administrative map of the area instead of geographic if possible.

- Ln 110 Use same significant digit for reporting temperature and make sure to be consistent with spacing

- Ln 111 Add spacing after 1496 and mm

- Ln 112 Add a reference after agricultural production

- Ln  118 Add a reference after ‘Earth explorer site’

- Ln 126 Please consider replacing the word ‘desired’. There are several occurrences of the word ‘desire’ throughout the manuscript (for example: Line 358).

- Ln 177 Is 30 m resolution land cover a good reference? Have the authors considered using Lidar imagery with higher resolution?

- Line 179 were instead of was

- Figure 4: Please consider changing the color using colorbrewer website and increase font size

3. Results

- Figure 6. It was not clear to me why the correlation scatter points include a change in color (blue to red). A clarification might be helpful

- Tables 1 and 2 cite the kappa coefficient, but there is no mention in the text of how this information can be used to describe the two confusion matrices

 4. Validation

-Ln 331 Consider revising ‘Our results does overestimate’

5. Discussion

- Can the authors provide some insight into the possible sources of uncertainty and how they may impact the modeling results and validation?

- Please move line 381-383 (starting from ‘In addition’) to ‘data and methods’ section

 Overall the manuscript can be improved greatly by grammatical and narrative improvement. We advise major revision. 

 


Author Response

Review 1

The paper provides a deterministic remote sensing and machine learning based cropland modeling and mapping approach. The overall framework is valid and in line with expectation. The paper is interesting, detail oriented and relevant to the field of the journal. While the spatial analysis is detailed and correct, there is some scope for improvement.


SPECIFIC COMMENTS

2. Data and Methods 

1.        In Line 99, Change km2 to km² and add a comma while reporting the population number

Response 1: We have revised as suggested. Please see Line 98.

2.        Line 102 remove ‘according to’

Response 2: We have revised accordingly. Please see Line 100.

3.        Consider revising Line 102 (vastly heterogeneous)

Response 3: We have revised the sentence to read, ‘It has a varied landscape comprised of…’. Please see Line 101.

4.        Revise Figure 1. The left blowup figure is unclear and hard to understand. Also in the right figure, the ocean area can be made light blue instead of dark color. Also, consider using an administrative map of the area instead of geographic if possible.

Response 4: We have changed to an administrative map with the ocean area in light blue. We also added one more inset in order to provide more geographical context. The top right inset shows the location of the study area relative to the whole of Japan, while the bottom right inset shows the locations of Chiba prefecture relative to Tokyo and the study area within Chiba Prefecture.

5.        Ln 110 Use same significant digit for reporting temperature and make sure to be consistent with spacing

Response 5: We have addressed the spacing issue and rounded the temperatures to the nearest whole number. Please see Lines 107-108.

6.        Ln 111 Add spacing after 1496 and mm

Response 6: We have revised as suggested and added a space between 1496 and mm. Please see Line 109.

7.        Ln 112 Add a reference after agricultural production

Response 7: We have added two references, [18] and [19]. Please see Line 110.

(i)                 [18] Ministry of Agriculture, Forestry and Fisheries (MAFF), Japan. Annual Report on Food, Agriculture and Rural Areas in Japan FY 2016 (Summary), 2017. http://www.maff.go.jp/j/wpaper/w_maff/h28/attach/pdf/index-29.pdf. -This summary report states that Chiba prefecture was ranked 6th nationally in terms of agricultural production, with vegetables being the main product.

(ii)               [19] Tivy, J. (2014). Agricultural ecology. Routledge. – The book describes ideal ecological conditions for various crops, including some grown in the study area such as rice (Pg. 13), peanuts (Pg. 16) and Taro (Pg. 19).

8.        Ln 118 Add a reference after ‘Earth explorer site’

Response 8: We have added the reference as suggested. Please see Reference [20] and Line 117.

9.        Ln 126 Please consider replacing the word ‘desired’. There are several occurrences of the word ‘desire’ throughout the manuscript (for example: Line 358).

Response 9: We apologize for the redundancy. The sentence has been deleted since a valid issue was raised by Reviewer 3 about the occurrence of this sentence prior to a discussion on the fusion process.

10.    Ln 177 Is 30 m resolution land cover a good reference? Have the authors considered using Lidar imagery with higher resolution?

Response 10: The concern about the fitness and validity of the reference data is well noted and was a key consideration in this study. Matton et al. (2015) and Waldner et al. (2015) found that the timeliness and density of reference data influence the accuracy of cropland mapping. The cost of acquiring a higher-resolution dataset prevented us from obtaining such data. We therefore evaluated two freely available cropland-related datasets as possible sources of baseline reference data: the GFSAD30 global cropland map and the JAXA HRLULC map (ver. 18.03). For the reasons given in Section 2.4, we found that we could not use these datasets. The reference data were therefore derived from masks generated from the 30 m Maximum Value Composite NDVI (MVC-NDVI), computed from the 2015 Landsat images used in this study.
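For readers unfamiliar with maximum value compositing, the per-pixel operation behind the MVC-NDVI mentioned in this response can be sketched as below. This is a minimal illustration in plain Python, not the authors' processing chain; the function names and the toy reflectance values are assumptions made for the example.

```python
from typing import List, Sequence


def ndvi(red: float, nir: float) -> float:
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    # Guard against division by zero for pixels with no signal.
    return (nir - red) / total if total != 0 else 0.0


def max_value_composite(images: Sequence[Sequence[float]]) -> List[float]:
    """Per-pixel maximum across several flattened NDVI images of equal size."""
    return [max(pixel_values) for pixel_values in zip(*images)]


# Toy example: three acquisition dates, four pixels each (hypothetical values).
date_1 = [ndvi(0.10, 0.30), ndvi(0.08, 0.40), ndvi(0.20, 0.22), ndvi(0.05, 0.06)]
date_2 = [ndvi(0.09, 0.45), ndvi(0.10, 0.20), ndvi(0.21, 0.23), ndvi(0.05, 0.05)]
date_3 = [ndvi(0.12, 0.20), ndvi(0.07, 0.35), ndvi(0.19, 0.25), ndvi(0.06, 0.05)]

mvc = max_value_composite([date_1, date_2, date_3])
```

Taking the maximum per pixel tends to suppress cloud-contaminated observations, because clouds usually lower NDVI; a real workflow would additionally mask invalid pixels before compositing.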

It is our view that the process used to derive the training and validation reference data samples provided satisfactory representative samples for the major land cover types that also match the resolution of the images to be classified. We have taken the liberty of providing some dynamic interactive maps showing the land cover masks generated in this study. Please use the ‘layers’ radio button on the upper left to activate/deactivate the various layers. This link, http://rpubs.com/Mwana/453538, shows an overlay of the Winter-Spring-Summer MVC-NDVI RGB composite (wss_mvc_1.2.3 layer) with the land cover masks generated via raster math overlaid on Esri World Imagery. This link, http://rpubs.com/Mwana/453539, shows the Spring-Summer-Fall MVC-NDVI RGB composite (ssf_mvc_1.2.3 layer) and the land cover masks. We are sorry that the order of appearance of the layers is not the same as the layers radio button order. Since the RGB image stack was added last, it appears last.

After getting the masks, we vectorized them and using the QGIS GEarthView plugin, we cleaned up the vectors. Thereafter, stratified random point generation was done to obtain random points for each class. Please see lines 227-229.
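The stratified random point generation mentioned above can be illustrated with a short sketch. It assumes the cleaned masks have been combined into a single labelled grid, which is an assumption made for the example rather than a description of the authors' QGIS workflow; the function name, class labels, and sample size are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Pixel = Tuple[int, int]  # (row, column)


def stratified_random_points(
    class_map: Sequence[Sequence[Optional[str]]],
    n_per_class: int,
    seed: int = 0,
) -> Dict[str, List[Pixel]]:
    """Draw up to n_per_class random pixel locations from each labelled class."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    pixels_by_class: Dict[str, List[Pixel]] = defaultdict(list)
    for r, row in enumerate(class_map):
        for c, label in enumerate(row):
            if label is not None:  # None marks pixels outside every mask
                pixels_by_class[label].append((r, c))
    return {
        label: rng.sample(pixels, min(n_per_class, len(pixels)))
        for label, pixels in pixels_by_class.items()
    }


# Toy labelled grid (hypothetical classes).
toy_map = [
    ["cropland", "cropland", "paddy", None],
    ["cropland", "forest", "paddy", "paddy"],
    ["forest", "forest", "urban", "urban"],
]
samples = stratified_random_points(toy_map, n_per_class=2)
```

Sampling each class separately keeps small classes represented in the training and validation sets, which simple random sampling over the whole grid would not guarantee.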

References:

Waldner, F., Fritz, S., Di Gregorio, A., & Defourny, P. (2015). Mapping priorities to focus cropland mapping activities: Fitness assessment of existing global, regional and national cropland maps. Remote Sensing, 7(6), 7959-7986.

Matton, N., Canto, G. S., Waldner, F., Valero, S., Morin, D., Inglada, J., ... & Defourny, P. (2015). An automated method for annual cropland mapping along the season for various globally-distributed agrosystems using high spatial and temporal resolution time series. Remote Sensing, 7(10), 13208-13232.

11.    Line 179 were instead of was

Response 11: We have amended as suggested. Please see Line 213.

12.    Figure 4: Please consider changing the color using colorbrewer website and increase font size

Response 12: We are sorry that we are not able to change the colors. The colors in Figures 4(a) and 4(b) result automatically from viewing the MVC-NDVI as RGB composites and correspond to the season and the magnitude of the MVC-NDVI at each pixel location. Land cover with high NDVI but minimal variation across the three seasons in each image stack, e.g. forest, appears off-white. Pixels with low NDVI across all seasons, such as urban and water, are black or grey. Paddy fields tend to have their maximum in the summer, thus appearing blue in the Winter-Spring-Summer stack and green in the Spring-Summer-Fall stack. Croplands have a mix of colors depending on the number of plantings within the year and the time of planting. This visualization helped in formulating the rules for the raster math applied in order to obtain the land cover reference data masks, as shown in the maps provided in Comment [10].

We have increased the size for better visualization and hope that this addresses the problem with the font.
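The seasonal behaviour described in Response 12 lends itself to simple per-pixel rules. The sketch below only illustrates the idea of such raster math; the thresholds, class names, and function name are hypothetical and are not the rules used in the manuscript.

```python
def classify_pixel(
    winter: float,
    spring: float,
    summer: float,
    low: float = 0.2,      # hypothetical "low NDVI" threshold
    high: float = 0.6,     # hypothetical "high NDVI" threshold
    stable: float = 0.1,   # hypothetical tolerance for "minimal variation"
) -> str:
    """Assign a coarse class from a Winter-Spring-Summer MVC-NDVI triplet."""
    values = (winter, spring, summer)
    if max(values) < low:
        # Low NDVI in every season: dark pixels in the RGB composite.
        return "urban_or_water"
    if min(values) > high and max(values) - min(values) < stable:
        # High NDVI with little seasonal variation: off-white pixels.
        return "forest"
    if summer - max(winter, spring) > stable:
        # Clear summer maximum: blue pixels in the Winter-Spring-Summer stack.
        return "paddy"
    return "other_vegetation"


labels = [
    classify_pixel(0.10, 0.10, 0.15),
    classify_pixel(0.80, 0.82, 0.85),
    classify_pixel(0.20, 0.30, 0.80),
    classify_pixel(0.50, 0.70, 0.40),
]
```

In practice the thresholds would be chosen by inspecting the composites, and upland croplands would need additional rules because their seasonal peaks vary with planting dates.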

 

3. Results

13.    Figure 6. It was not clear to me why the correlation scatter points include a change in color (blue to red). A clarification might be helpful

Response 13: Thank you for this comment. We apologize for the inadequate explanation. As the scatterplots are a pixel-by-pixel comparison of the Fusion/ Synthetic versus Observed/ Original NDVI images, an image to image comparison would lead to overplotting. We, therefore, used a color gradient from blue (low density) to red (high density) to represent the density of observations within a particular range of values and it is essentially a histogram extension. This reveals patterns in the data and by extension land cover phenology behavior, which we discussed in section 3.1: Fusion results. We have added a counts legend and context in lines 268-273.
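The density coloring described in Response 13 amounts to a two-dimensional histogram: each point is colored by the number of points that share its bin. A minimal sketch follows, assuming NDVI values in the range -1 to 1; the function name, bin count, and sample values are assumptions for illustration, not the authors' plotting code.

```python
from collections import Counter
from typing import List, Sequence


def point_density(
    observed: Sequence[float],
    predicted: Sequence[float],
    n_bins: int = 10,
    lo: float = -1.0,
    hi: float = 1.0,
) -> List[int]:
    """For each (observed, predicted) pair, count the pairs in the same 2D bin."""
    width = (hi - lo) / n_bins

    def bin_index(value: float) -> int:
        # Clamp so that out-of-range values and value == hi land in an edge bin.
        return min(max(int((value - lo) / width), 0), n_bins - 1)

    bins = [(bin_index(o), bin_index(p)) for o, p in zip(observed, predicted)]
    counts = Counter(bins)
    return [counts[b] for b in bins]


# Hypothetical pixel values: three similar pixels and one outlier.
observed_ndvi = [0.10, 0.11, 0.12, 0.85]
fused_ndvi = [0.10, 0.11, 0.12, 0.25]
density = point_density(observed_ndvi, fused_ndvi)
```

Mapping low counts to blue and high counts to red then reproduces the kind of gradient described, with the counts legend giving the number of pixels per bin.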

14.    Tables 1 and 2 cite the kappa coefficient, but there is no mention in the text of how this information can be used to describe the two confusion matrices

Response 14: We added Section 2.6: Accuracy assessment to describe the accuracy evaluation metrics used in this study.
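As background to this exchange, overall accuracy and Cohen's kappa can both be computed directly from a confusion matrix using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below uses that textbook definition, which may differ in presentation from Section 2.6 of the manuscript; the matrix values are invented for illustration.

```python
from typing import List, Sequence


def overall_accuracy(matrix: Sequence[Sequence[int]]) -> float:
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohens_kappa(matrix: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa: agreement corrected for chance agreement."""
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    row_totals: List[int] = [sum(row) for row in matrix]
    col_totals: List[int] = [sum(col) for col in zip(*matrix)]
    # Chance agreement from the products of the marginal totals.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical two-class confusion matrix (rows and columns in the same class order).
toy_matrix = [
    [45, 5],
    [10, 40],
]
accuracy = overall_accuracy(toy_matrix)
kappa = cohens_kappa(toy_matrix)
```

For this toy matrix the overall accuracy is 0.85 while kappa is 0.70, which shows how kappa discounts the agreement that two balanced classes would reach by chance.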

4. Validation

15.    Ln 331 Consider revising ‘Our results does overestimate’

Response 15: We have amended as recommended to ‘Our result overestimated…’ Please see Line 385.

5. Discussion

16.    Can the authors provide some insight into the possible sources of uncertainty and how they may impact the modeling results and validation?

Response 16: Thank you for this question. We believe that the main source of uncertainty with respect to classification is spatial heterogeneity in the study area. In lines 385-389 we highlight the misclassification of urban land cover as cropland and that this phenomenon has been observed in other regional small-scale maps. Since the spectral and temporal response of urban and cropland classes are significantly different, the main cause of the observed misclassification would have to be spatial heterogeneity and patch size. Smith et al. (2002) found that classification accuracy decreases as land cover heterogeneity increases and patch size decreases. Ecological landscape metrics analysis would have to be carried out in order to confirm this.

Reference: Smith, J. H., Wickham, J. D., Stehman, S. V., & Yang, L. (2002). Impacts of patch size and land-cover heterogeneity on thematic image classification accuracy. Photogrammetric Engineering and Remote Sensing, 68(1), 65-70.

17.    Please move line 381-383 (starting from ‘In addition’) to ‘data and methods’ section

Response 17: We have revised the Introduction and Methodology sections. Reference to the software used is now made in lines 89-92.


 

 


Author Response File: Author Response.pdf

Reviewer 2 Report

In general, the manuscript needs a grammatical review. Specifically, there were several instances of missing punctuation (ex. Line 21 “,respectively”; Line 30 “There is, therefore, an…”), incorrect word choice (ex. Line 22: “abounding” most commonly means abundant while “bounding” involves creating a boundary or enclosing), misplaced punctuation (Line 42 “…temporally continuous data, [6].”) etc.

There are several instances in the Introduction where you refer to high resolution data. What do you consider high resolution? Less than a meter? Without an example of a high resolution sensor, I am not sure what resolution you consider to be “high”. It would be helpful if with the first mention of “higher spatial resolutions” (Line 39) you defined it. Ex:”….higher spatial resolutions (1 meter or less)…” This would also be useful for your discussion of high temporal resolution (daily, once a week, etc.). I assume, by high resolution you mean collecting at one spot every 1 to 2 days (i.e. MODIS)?

The Introduction and Methods need to be restructured, reorganized and expanded. There isn’t enough background presented in the Introduction or explanation in the Methods to justify the objectives. Your objectives are interesting. Your Introduction is too lean to support the need for your objectives. I don’t know how unique your methodology is in respect to the larger existing literature. Your methods section is too vague to determine whether your methodology is legitimate or unique.

The paper fails to clearly and concisely explain who? what? where? and why? leaving the reader to do all of the heavy lifting. There is just a lack of detail throughout especially in the Introduction and Methods. With those sections improved, the paper would be more interesting and comprehensive but it’s impossible to assess the Results because the methodology is so confusing and vague.

Lines 39-40. You discuss high resolution imagery and “big data” but that’s not what is being used here. Why not, instead, include a discussion of medium to low resolution data successfully used for crop assessment? Provide more background and provide insight into how your methodology is unique.

Line 43. Clouds or haze are not “impediments” to image acquisition. The imagery will still be collected regardless of the atmospheric condition. It makes that imagery acquired more challenging to use.

Lines 61-62. “Consequently, spatially and temporally heterogeneous agricultural landscapes, such as those prevalent in urban and peri-urban areas, requiring high spatial and temporal resolution, cannot be adequately mapped and monitored using data reconstructed using either of the two broad satellite data reconstruction methods independently.” Consider revising this for clarity. I have read it multiple times and am still not clear what you are getting at. I understand it’s ideal to have high spatial resolution imagery in a complex urban area. Are you interested in cropland changes over time? Is that why you require high temporal resolution? From what is presented here, it reads that an assumption is being made that ALL imagery that would be appropriate for a cropland assessment will ALWAYS have clouds. I don’t believe that assumption and I don’t think that is what you mean. If the images that you deemed appropriate for your cropland assessment did have clouds and you did need to make corrections for the resulting missing data, then maybe this discussion for how to do that belongs in the methods section. For clarity, more effort needs to be made at improving the Introduction section including a review of other available literature on urban cropland assessments using imagery, sources of imagery appropriate for this type of assessment, and clear definitions of “high” resolution (spatial and temporal) imagery.

Lines 63-79. Consider moving this paragraph to after paragraph 1. This section helps define the purpose of your inquiry. It sets the stage for what is coming up and provides a little more information on why there is a need for the presented cropland assessment.

Line 99. Figure 1 shows the municipalities and the area but not the population as stated here.

Figure 1. Your map has a graticule; therefore, the north arrow isn’t necessary. I assume that Bing Satellite is the source of the background imagery on your map. Please make that clear (i.e., “Imagery source: Bing Satellite”).

Line 102. “Forests”, “Evergreen” and “Deciduous” don’t need to be capitalized.

Lines 104-106. Please check grammar and punctuation.

Line 116-117. Why were these two imagery sources chosen? What is the time period for the MODIS imagery used? You mention that you need daily imagery but from when? Month, day, year?  I assume that it’s imagery captured during leaf-on but nowhere is that stated.  

Line 128. “Daily NDVI images were desired for this study in order to allow for flexibility and computational efficiency during the fusion process as shown in Figure 3”  Figure 3 doesn’t show this. Do you mean Figure 2? Figure 3 shows how the two imagery sources line up. It says nothing about computation. And why? What part of the “fusion process” requires this? There isn’t any information about the fusion process for another two paragraphs.

Line 132-134. How many images are “sufficient”? What is considered adequate seasonal distribution? There is no discussion about when the imagery is captured for either MODIS or Landsat imagery. It’s only illustrated in Figure 3.

Line 135-137. “The distribution of available Landsat images also demonstrates why a daily MODIS data set was necessary as opposed to the 8-day or 16-day….” No. I am still not clear what it is you’re doing or why you are doing it. There is no clear explanation of why these two image sources are being used. There is no clear explanation of what year of MODIS data was used or what days.

Line 140. Why? Why was IB implemented?

There simply isn’t enough explanation in the methodology up to this point to provide a thorough review of the Results or a complete comprehension of the Methods. It is unclear why decisions were made, processes chosen, data used, etc. The manuscript needs a complete rewrite for grammatical clarity, methodological clarity, and literature review expansion. It’s a very interesting topic but in the manuscript’s current state, it’s impossible to assess the methodological rigor.


Author Response

 

Review 2

Comments and Suggestions for Authors

1.        In general, the manuscript needs a grammatical review. Specifically, there were several instances of missing punctuation (ex. Line 21 “, respectively”; Line 30 “There is, therefore, an…”), incorrect word choice (ex. Line 22: “abounding” most commonly means abundant while “bounding” involves creating a boundary or enclosing), misplaced punctuation (Line 42 “…temporally continuous data, [6].”) etc.

Response 1: Thank you for highlighting these issues. Grammatical and punctuation errors have been addressed through a grammatical and punctuation check using Grammarly premium version.

2.        There are several instances in the Introduction where you refer to high resolution data. What do you consider high resolution? Less than a meter? Without an example of a high resolution sensor, I am not sure what resolution you consider to be “high”. It would be helpful if with the first mention of “higher spatial resolutions” (Line 39) you defined it. Ex:”….higher spatial resolutions (1 meter or less)…” This would also be useful for your discussion of high temporal resolution (daily, once a week, etc.). I assume, by high resolution you mean collecting at one spot every 1 to 2 days (i.e. MODIS)?

Response 2: Thank you for this comment. We have revised the introduction as suggested and incorporated the correction. We have restricted ourselves to moderate and coarse resolution data. Please see Lines 29-31 and 71-72. High temporal resolution in the context of this study is daily data.

3.        The Introduction and Methods need to be restructured, reorganized and expanded. There isn’t enough background presented in the Introduction or explanation in the Methods to justify the objectives. Your objectives are interesting. Your Introduction is too lean to support the need for your objectives. I don’t know how unique your methodology is in respect to the larger existing literature. Your methods section is too vague to determine whether your methodology is legitimate or unique.

Response 3: We have revised the introduction as suggested. Below is an overview of how the objectives relate to the Introduction.

The motivation of this study was to characterize UPA at a local scale using a methodology that seeks to address the UPA-related spatial information needs of the non-remote sensing user community, towards promoting a better understanding of UPA, while also validating the available cropland-related spatial data sets. The study characterizes UPA through:

i.              Intra-annual cropland mapping, distinguishing upland horticultural croplands from paddy rice and other land cover types.

ii.              Estimation of cropping intensity in the identified croplands intra-annually

iii.              Experiment on using post-harvest practices information as reference data for distinguishing a horticultural crop (peanuts) from other crops

The objectives, derived from gaps identified in the current state of provision of UPA specific spatial data, are necessary for a better understanding of UPA drivers, dynamics and impacts. The introduction, therefore, contains the information detailed and structured as below:

I.         Importance of UPA:

(a)      Increasing urban population and the consequent need for the provision of food for urban dwellers – Food and nutrition security

(b)     Rapid urbanization and the need for information on UPA production units for planning and management of urban and peri-urban spaces

II.      Challenges (from a remote sensing perspective) to availing UPA-related spatial information due to:

(a)      Complex spatial structure of UPA (small-holdings and heterogeneous land cover)

(b)     Dynamic nature of UPA due to differences in crop types, varieties, cropping intensities, and management practices

(c)      The spectral similarity of croplands with other land cover types, that is, similar to other vegetation classes such as grasslands during growth and similar to bare land after harvest or when fallow

(d)     Focus on global-scale cropland mapping and field-crops such as wheat, rice and maize/ corn

(e)      Cropland mapping needs timely and reliable a priori information. The most reliable information is in situ information, whose accumulation (collection and ingestion) is time-consuming and expensive

Addressing these challenges would provide data for the fields of agriculture, environmental sciences, ecology, business economics, geography, urban planning, and food science technology, which are some of the primary users of UPA spatial data.

III.   How to address the challenges mentioned above:

(a)      Due to the dynamic nature of urban and peri-urban landscapes, a higher update frequency is necessary, and the time and cost limitations of in situ data collection methods preclude their applicability. Temporally specific remotely-sensed data (specific to the season or year under consideration) can be used to generate reference datasets.

(b)     High spatial and temporal resolution datasets are needed to capture UPA production units. High spatial resolution is needed because of the small size of production units and land cover heterogeneity, and high temporal resolution because of the dynamic nature of UPA. Spatio-temporal image fusion provides high spatial and temporal resolution data. Data availability, the intended application or use of the data, and available computational resources and expertise dictate the choice of spatiotemporal fusion methods.

4.        The paper fails to clearly and concisely explain who? what? where? and why? leaving the reader to do all of the heavy lifting. There is just a lack of detail throughout especially in the Introduction and Methods. With those sections improved, the paper would be more interesting and comprehensive but it’s impossible to assess the Results because the methodology is so confusing and vague.

Response 4: Thank you for noting this. We have revised the Introduction and Methods to provide more details on:

                        i.              The need for spatiotemporal fusion with respect to the available data, objectives and localized nature of Urban and Peri-urban Agriculture (UPA)

                      ii.              The various types of fusion and why we used the ESTARFM fusion algorithm

                    iii.              The method used to obtain temporally specific (intra-annual) reference training and validation sample data

5.        Lines 39-40. You discuss high resolution imagery and “big data” but that’s not what is being used here. Why not, instead, include a discussion of medium to low resolution data successfully used for crop assessment? Provide more background and provide insight into how your methodology is unique.

Response 5: We have excluded the mention of “big data” and discussed medium and coarse (low) resolution data vis a vis the challenges of accurately classifying UPA croplands using these datasets, as recommended. Please see Lines 29-31 and 71-72.

6.        Line 43. Clouds or haze are not “impediments” to image acquisition. The imagery will still be collected regardless of the atmospheric condition. It makes that imagery acquired more challenging to use.

Response 6: The comment is well noted, and we have amended to “At any one time, approximately 35% of the global land surface is under cloud cover thus limiting information retrieval and meaningful interpretation of optical satellite data”. Please see Lines 59-60.

7.        Lines 61-62. “Consequently, spatially and temporally heterogeneous agricultural landscapes, such as those prevalent in urban and peri-urban areas, requiring high spatial and temporal resolution, cannot be adequately mapped and monitored using data reconstructed using either of the two broad satellite data reconstruction methods independently.” Consider revising this for clarity. I have read it multiple times and am still not clear what you are getting at. I understand it’s ideal to have high spatial resolution imagery in a complex urban area. Are you interested in cropland changes over time? Is that why you require high temporal resolution? From what is presented here, it reads that an assumption is being made that ALL imagery that would be appropriate for a cropland assessment will ALWAYS have clouds. I don’t believe that assumption and I don’t think that is what you mean. If the images that you deemed appropriate for your cropland assessment did have clouds and you did need to make corrections for the resulting missing data, then maybe this discussion for how to do that belongs in the methods section. For clarity, more effort needs to be made at improving the Introduction section including a review of other available literature on urban cropland assessments using imagery, sources of imagery appropriate for this type of assessment, and clear definitions of “high” resolution (spatial and temporal) imagery.

Response 7: Thank you for highlighting this issue. To address the concerns raised, we have expounded on the need for time series data for cropland mapping. Please see Lines 42-46. In connection to time series analysis and classification, we have highlighted some of the challenges, including missing data due to cloud cover and provided references on the same. Please see Lines 57-74. In the introduction, we have discussed why irregular time series due to missing data is a challenge in time series analysis (Lines 57-58) and mentioned the general methods used to handle missing data Lines 62-63. The advancement of image reconstruction techniques is connected to spatio-temporal fusion in Lines 64-66 and Lines 67-74 give a brief overview of blending. A detailed explanation of spatio-temporal fusion is provided in section 2.3 of the Methodology. We hope that this clarifies why we implemented the methodology as described.

8.        Lines 63-79. Consider moving this paragraph to after paragraph 1. This section helps define the purpose of your inquiry. It sets the stage for what is coming up and provides a little more information on why there is a need for the presented cropland assessment.

Response 8: Thank you for this comment and suggestion. We have amended the introduction as recommended. Please see Lines 38 – 56.

9.        Line 99. Figure 1 shows the municipalities and the area but not the population as stated here.

Response 9: This is well noted, and we have restructured the sentences. Please see Lines 95-98.

10.    Figure 1. Your map has a graticule; therefore, the north arrow isn’t necessary. I assume that Bing Satellite is the source of the background imagery on your map. Please make that clear (i.e., “Imagery source: Bing Satellite”).

Response 10: We have changed this figure to address concerns raised by Reviewer 2 on coherence with the Bing satellite image as the base map. We have altered it to show the geographic bounds of the seven municipalities more clearly as recommended and provided two insets for context.

11.    Line 102. “Forests”, “Evergreen” and “Deciduous” don’t need to be capitalized.

Response 11: We have changed accordingly. Please, see Line 101.

12.    Lines 104-106. Please check grammar and punctuation.

Response 12: We have amended to read:

Grasslands in the area consist of two types, that is, natural and managed. On the one hand, natural grasslands contain grass and shrubs growing naturally and include abandoned croplands and paddy fields. On the other hand are the managed grasslands, such as golf courses, which are numerous due to proximity to Tokyo.

Please see Lines 103-106.

13.    Line 116-117. Why were these two imagery sources chosen? What is the time period for the MODIS imagery used? You mention that you need daily imagery but from when? Month, day, year?  I assume that it’s imagery captured during leaf-on but nowhere is that stated.  

Response 13: Thank you for the questions. We have provided a more detailed description of the data needs assessment and evaluation in the methodology in lines 118-137. We hope that the changes address the issues of:

i. Why time series classification was deemed to be more suitable than single-date classification

ii. Why multisensor spatiotemporal fusion was found to be necessary as opposed to using single sensor image reconstruction

We have also provided the dates of the MODIS imagery. Please see Lines 138-140.

The images acquired were for one calendar year, without consideration for the crop growth stages. The reason for this is the varied crop types, varieties and management practices, which mean that even for the same crop type, the vegetative phase is not reached at the same time and some parcels remain bare while others have crops. Therefore, in our understanding, the entire year is leaf-on, but this varies from parcel to parcel depending on the farmer. Please also see the attached crop calendar in Figure 1. It provides an idea of how diverse the crop varieties and planting are in the study area. The periods that are shaded indicate when the product is available in the market and the inverse may be assumed to be the growing period. However, this does not apply to all crops, and especially the vegetable crops, for which farmers may stagger planting throughout the year. Chiba’s weather and availability of water mean that the farmers can plant at any time; therefore, the entire year needs to be factored into any analysis on crop distribution.

14.    Line 128. “Daily NDVI images were desired for this study in order to allow for flexibility and computational efficiency during the fusion process as shown in Figure 3”  Figure 3 doesn’t show this. Do you mean Figure 2? Figure 3 shows how the two imagery sources line up. It says nothing about computation. And why? What part of the “fusion process” requires this? There isn’t any information about the fusion process for another two paragraphs.

Response 14: We have excluded this sentence and the figure because, as you rightly point out, it does not communicate the point we wish to make. We have instead added Table 1, showing all the available images for 2015 and their corresponding land cloud cover. We chose to show all images in order to communicate the earlier point on the loss of information due to cloud cover and therefore the need for fusion. In addition, we have explained within the text the need for coinciding Landsat–MODIS images, which has been found in other studies as well.

Li et al. (2017), cited below, used Blend-then-Index (BI) STARFM fusion of Landsat and MODIS 8-day composite data and found that ESTARFM outperforms STARFM. In addition, since they used 8-day composite MODIS data, it does not line up with the available Landsat images and, due to day-to-day changes in MODIS viewing geometry, this could weaken the fusion results. They reported correlation coefficients of 0.66 for the NIR band and 0.55 for the Red band. Please see lines 168-171. Jarihani et al. (2014) found that Index-then-Blend (IB) is best because it introduces fewer errors, since fusion is applied to index images rather than the individual bands from which the index is computed. Please see lines 165-168.

In view of this, in our study, in order to ensure correspondence between the Landsat and MODIS images, we used the MODIS daily images, which allowed us the flexibility alluded to in the earlier draft in choosing the start of the time series, as shown in the table below. In our case, the first available image is on 10th January 2015; therefore, in order to get an 8-day interval, the first date predicted is 2nd January 2015. Please see Lines 173-175.

 

References:

[1] Li, L., Zhao, Y., Fu, Y., Pan, Y., Yu, L., & Xin, Q. (2017). High resolution mapping of cropping cycles by fusion of Landsat and MODIS data. Remote Sensing, 9(12), 1232.

[2] Jarihani, A. A., McVicar, T. R., Van Niel, T. G., Emelyanova, I. V., Callow, J. N., & Johansen, K. (2014). Blending Landsat and MODIS data to generate multispectral indices: A comparison of “Index-then-Blend” and “Blend-then-Index” approaches. Remote Sensing, 6(10), 9213-9238.

The table below shows exactly how the Landsat and MODIS images line-up in the full time series, from the 23 Landsat images, to the MODIS fusion dates after selecting 2nd January 2015 as the start date, and the standard 8-day composite which would not match the Landsat dates.
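
The date alignment described in Response 14 can also be checked with a short sketch. It assumes a 16-day Landsat 8 revisit counted from the first available image on 10 January 2015, and that the standard MODIS 8-day composite periods begin on 1 January; the function and variable names are illustrative and do not come from the manuscript.

```python
from datetime import date, timedelta


def date_series(start, step_days, year):
    """Return all dates from `start`, every `step_days` days, within `year`."""
    dates = []
    current = start
    while current.year == year:
        dates.append(current)
        current += timedelta(days=step_days)
    return dates


# Assumed 16-day Landsat 8 revisit from the first available image.
landsat_dates = date_series(date(2015, 1, 10), 16, 2015)
# 8-day fusion series starting on 2 January 2015, as chosen by the authors.
fusion_dates = date_series(date(2015, 1, 2), 8, 2015)
# Standard 8-day composite periods, assumed to start on 1 January.
composite_dates = date_series(date(2015, 1, 1), 8, 2015)

# Landsat dates that have a matching date in each 8-day series.
matched_in_fusion = [d for d in landsat_dates if d in fusion_dates]
matched_in_composite = [d for d in landsat_dates if d in composite_dates]

print(len(landsat_dates), len(matched_in_fusion), len(matched_in_composite))
```

Under these assumptions, every one of the 23 Landsat dates falls on a fusion date, while none coincides with the start of a standard composite period.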


15.    Line 132-134. How many images are “sufficient”? What is considered adequate seasonal distribution? There is no discussion about when the imagery is captured for either MODIS or Landsat imagery. It’s only illustrated in Figure 3.

Response 15: Thank you for this comment. We have omitted reference to “sufficient” number of images and “adequate” seasonal distribution, since as you rightly note, there is no standard by which to measure sufficiency or adequacy, and the number of good quality images may vary from year to year. We have also added information on the MODIS and Landsat acquisition dates as suggested. Please see Tables 1 and 2 for information on the Landsat images, and Lines 138-140 for information on the MODIS image acquisition dates.

16.    Line 135-137. “The distribution of available Landsat images also demonstrates why a daily MODIS data set was necessary as opposed to the 8-day or 16-day….” No. I am still not clear what it is your doing or why you are doing it. There is no clear explanation of why these two image sources are being used. There is no clear explanation of what year of MODIS data was used or what days.

Response 16: Thank you for this comment, the observation is well noted and we have amended the methodology to include details on why daily images as opposed to 8-day composite were used. We have provided details on the same in Response 14.

17.    Line 140. Why? Why was IB implemented?

Response 17: Thank you for the question. We have detailed the reasons behind the decision to use IB as explained in Response 14. We hope that the explanation provided answers this question.

18.    There simply isn’t enough explanation in the methodology up to this point to provide a thorough review of the Results or a complete comprehension of the Methods. It is unclear why decisions were made, processes chosen, data used, etc. The manuscript needs a complete rewrite for grammatical clarity, methodological clarity, and literature review expansion. It’s a very interesting topic but in the manuscript’s current state, it’s impossible to assess the methodological rigor.

Response 18: Thank you very much for the thorough review and constructive criticism of the manuscript. We have re-written the Introduction, methodology and added details in the Results and Discussion. We hope that the revisions provide the much needed clarity.


 

Figure 1: Chiba Prefecture Market Availability Crop Calendar




Author Response File: Author Response.docx

Reviewer 3 Report

General Comments: The authors did a good job to generate the synthetic NDVI at the Landsat resolution using a fusion technique and distinguish croplands from other land cover types. In general, the authors provided scientific reasons to justify why the results of this study are important for the remote sensing community. The introduction had a coherent study and I just added a couple of comments to improve the introduction section. The methodology could be written more clearly. I highlighted some parts that need to be considered by the authors in the next round. The results section was fine. However, the authors have to check the statistics presented for Fig 6.  I raised several questions particularly for Fig 7. The answers for them must be added in the manuscript as well.

 

Major:


Introduction

Line 20-21: “Regional food security is threatened by uncertain global climatic conditions and commodity price fluctuations which have resulted in decreased yields.” It is not clear how those factors particularly price fluctuation can affect the crop yield. 


Line 37-39: “Cropland and crop-type mapping activities have been around for a long time but have recently gained momentum due to advancements in data collection and ingestion technologies that have resulted in ’big data’.” Big Data must be clearly defined in a scientific manner.


Line 39-40: “higher spatial resolutions are now possible with higher revisit frequency” The authors mean CubeSat? I’m asking this question because in most of the cases increasing the spatial resolution leads to decreasing revisit frequency (e.g. Landsat versus MODIS).


Data and Methods:

Line 101-104: “It has a vastly heterogeneous landscape comprised of urban or built-up areas, Forests  (Evergreen and Deciduous), grasslands (land covered with grass or shrubs), paddy fields, croplands  (also described as upland cropland) and water bodies.” I don’t know how difficult it is but if it is possible, add a figure showing the land cover of the study area using NLCD 2011 or NLCD 2016.

 

Line 165-167: “Following the fusion process, the 8-day interval image series was smoothed and filtered using the Savitzky-Golay filter to modulate the effects of noise and missing data inherent in the original MODIS 250 m NDVI dataset” Add a short description of Savitzky-Golay filter.

 

Since the authors use a fusion technique based on MODIS and Landsat and these have a different spectral response in different wavelengths, I’m wondering to know if it is necessary to harmonize the spectral response of MODIS with Landsat before applying resampling technique or not.


Line 193-195:  “From aerial and satellite images, it is impossible to distinguish with certainty, one crop (e.g. peanuts) from another (e.g. carrots) during the growing season.” I agree with the authors if they just said “Satellite”.  But aerial imagery in some cases has enough pixel resolution to detect the crop one-by-one. Also, we can access the CubeSat imagery using Planet data server that has 1 m resolution.

  

Line 216-217 “The model tuning parameters are the number of samples per class, the number of levels for each tuning parameter and the number of cross-validation samples.” It is important for readers to understand how the authors calibrate or tune the random forest parameters. Did the authors use a trial and error technique? Or just used suggestions from the literature.


Results:


Fig 6: The r-square reported here belongs to all datasets mapped in each sub figures or is related to non-blue points. R2 0.9 for subfigure 1 is skeptical since the variation and uncertainty around 1-1 line seems too high. Please check those numbers.


Fig 7: It seems the smooth fusion technique works very well to simulate the pattern of the time series. However, there is a disagreement between observed datasets and fusion records. How can the authors justify the results of this section? Can we conclude that the performance of the fusion technique is highly sensitive to the crop type? What about the growing stage or phenological stage of the crop? Is there any relationship between the results obtained from these graphs and this statement mentioned in line 154-156 “Working on the assumption that for a heterogeneous landscape, the changes in reflectance within a mixed pixel are representative of the weighted sum of changes for each land cover type, and that these changes do not change significantly over a short period of time, the relationship then, can be inferred from the reflectance of the fine resolution pixels”

 

Fig 8: The fairly constant values of NDVI for Urban and Forest make sense to me. However, I’m wondering why there is an order between the peaks of the NDVI time series of the other classes (1st Crop, 2nd Grass, 3rd Paddy)?

 

Line 334-335: “Using other metrics for the same one-year data-set such as the NDVI index or shape and texture features may solve this” It requires a proper citation.

 

Fig 12 and Fig 13: If it is possible, put Fig 13  over Fig 12 as a transparent layer. It would help readers compare the accuracy of the approach presented by the authors with recent published HRLULC map.

 

Minor

Line 47-50: “These techniques can be broadly categorized into spectral methods which reconstruct a singular image based on all the spectral information available in it in order to differentiate cloud cover from other features and temporal interpolation methods which utilize information retrieved from a series of images related to the behaviour of biophysical phenomena” It is too long. Break it into simple and short sentences.

 

Line 88-89: “classification of peanuts versus other crops via classification of a dense regular image time series.” It is not clear to me why this objective is important and considered in the study.

 

Fig 3. The information of this figure mentioned in the context is not sufficient. I highly recommend to the authors to add a detailed information to make it more clear. Also, please define each subsection in the figure such as “Scaling”, Reprojection etc.


Line “The MODIS NDVI images were first resampled to 30m and cropped to match the extent of the Landsat 8 NDVI images …” Using which technique the authors resampled the coarse resolution to the higher resolution imagery pixel size? (KNN?), Also, add the formulation of IB in the manuscript.


Author Response

Review 3

Comments and Suggestions for Authors

General Comments: The authors did a good job to generate the synthetic NDVI at the Landsat resolution using a fusion technique and distinguish croplands from other land cover types. In general, the authors provided scientific reasons to justify why the results of this study are important for the remote sensing community. The introduction had a coherent study and I just added a couple of comments to improve the introduction section. The methodology could be written more clearly. I highlighted some parts that need to be considered by the authors in the next round. The results section was fine. However, the authors have to check the statistics presented for Fig 6. I raised several questions particularly for Fig 7. The answers for them must be added in the manuscript as well.

 Major:

Introduction

1.        Line 20-21: “Regional food security is threatened by uncertain global climatic conditions and commodity price fluctuations which have resulted in decreased yields.” It is not clear how those factors particularly price fluctuation can affect the crop yield. 

Response 1: This comment is well noted. We have altered this part of the introduction to highlight other threats to global and regional food security. Please see Lines 19-21.

2.        Line 37-39: “Cropland and crop-type mapping activities have been around for a long time but have recently gained momentum due to advancements in data collection and ingestion technologies that have resulted in ’big data’.” Big Data must be clearly defined in a scientific manner.

Response 2: Thank you for this comment. We have excluded any mention of ‘big data’ and instead discussed medium and coarse resolution datasets as recommended by Reviewer 3. Please see Lines 29-31 and 71-72.

3.        Line 39-40: “higher spatial resolutions are now possible with higher revisit frequency” The authors mean CubeSat? I’m asking this question because in most of the cases increasing the spatial resolution leads to decreasing revisit frequency (e.g. Landsat versus MODIS).

Response 3: We appreciate this observation. The statement “higher spatial resolutions are now possible with higher revisit frequency” refers to high spatial resolution satellites such as Sentinel 2 that have high revisit frequency, achieved by having a constellation. We have, however, removed this statement from the manuscript.

Data and Methods:

4.        Line 101-104: “It has a vastly heterogeneous landscape comprised of urban or built-up areas, Forests  (Evergreen and Deciduous), grasslands (land covered with grass or shrubs), paddy fields, croplands  (also described as upland cropland) and water bodies.” I don’t know how difficult it is but if it is possible, add a figure showing the land cover of the study area using NLCD 2011 or NLCD 2016.

Response 4: Thank you for this comment. We apologize if we have misunderstood the comment, but to the best of our knowledge, the NLCD 2011 or NLCD 2016 datasets cover only the United States. We used the Japan Aerospace Exploration Agency’s (JAXA) High-Resolution Land-Use and Land-Cover map of Japan (HRLULC Ver.18.03). We have provided a reference for it. Please see Lines 203-206 and reference [50].

5.        Line 165-167: “Following the fusion process, the 8-day interval image series was smoothed and filtered using the Savitzky-Golay filter to modulate the effects of noise and missing data inherent in the original MODIS 250 m NDVI dataset” Add a short description of Savitzky-Golay filter.

Response 5: Thank you for this comment. In the previous draft, we apologize that we failed to specify that we applied the Savitzky-Golay filter only for a test region where the sample points used to generate the NDVI temporal profiles shown in Figures 7 and 8 are located. This was done because the process of smoothing and filtering in R is memory intensive and could not handle the study area. In addition, the classification results for the unsmoothed and unfiltered fusion series are satisfactory.
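
For readers unfamiliar with the filter, the following is a minimal sketch of Savitzky-Golay smoothing on an NDVI series. The authors performed the smoothing in R, and the response does not state the window length or polynomial order, so the 5-point quadratic window used here is an assumption, as are the function name and the sample values. The weights (-3, 12, 17, 12, -3)/35 are the standard coefficients for that window.

```python
# Convolution weights of a 5-point quadratic Savitzky-Golay smoother.
SG_WEIGHTS_5 = (-3, 12, 17, 12, -3)
SG_NORM_5 = 35


def savgol_5pt(series):
    """Smooth a sequence with a 5-point quadratic Savitzky-Golay window.

    The first and last two values are returned unchanged, because a full
    window is not available at the edges (an assumption of this sketch).
    """
    values = list(series)
    smoothed = values[:]
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        weighted = sum(w * v for w, v in zip(SG_WEIGHTS_5, window))
        smoothed[i] = weighted / SG_NORM_5
    return smoothed


# Hypothetical NDVI profile with one noisy drop (e.g. residual cloud).
noisy_ndvi = [0.5, 0.5, 0.5, 0.1, 0.5, 0.5, 0.5]
smoothed_ndvi = savgol_5pt(noisy_ndvi)
```

Each smoothed value is a local least-squares polynomial fit evaluated at the window centre, so isolated drops are damped while smooth seasonal curvature is largely preserved.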

6.        Since the authors use a fusion technique based on MODIS and Landsat and these have a different spectral response in different wavelengths, I’m wondering to know if it is necessary to harmonize the spectral response of MODIS with Landsat before applying resampling technique or not.

Response 6: Thank you very much for this question. Relative radiometric normalization was not carried out prior to resampling. The reason for this is that the pre-processing of the MODIS and Landsat data were completely independent of each other up until the fusion process. It was assumed that since we used Index-then-Blend (IB), rather than Blend-then-Index, and the MODIS and Landsat image pairs were temporally coincident, any radiometric disparities in the spectral bands would not have a major impact on NDVI. We have discussed the importance of having matching MODIS and Landsat dates for the reference image pairs. Please see Lines 165-168.
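
The difference between the two orders of operation can be sketched as follows. The `toy_blend` function is a deliberately simplified stand-in (it adds the coarse-resolution temporal change to the fine-resolution reference) and is not ESTARFM; all names and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def toy_blend(fine_ref, coarse_ref, coarse_pred):
    """Toy fusion step: fine reference plus the coarse temporal change.

    This is an illustrative placeholder, not the ESTARFM algorithm.
    """
    return fine_ref + (coarse_pred - coarse_ref)


def index_then_blend(fine, coarse_ref, coarse_pred):
    """Compute NDVI for each input first, then fuse the NDVI values."""
    return toy_blend(ndvi(*fine), ndvi(*coarse_ref), ndvi(*coarse_pred))


def blend_then_index(fine, coarse_ref, coarse_pred):
    """Fuse each band separately, then compute NDVI from the fused bands."""
    nir = toy_blend(fine[0], coarse_ref[0], coarse_pred[0])
    red = toy_blend(fine[1], coarse_ref[1], coarse_pred[1])
    return ndvi(nir, red)


# Hypothetical (NIR, red) reflectance for one pixel location.
fine_reference = (0.40, 0.10)     # fine-resolution image on the reference date
coarse_reference = (0.35, 0.12)   # coarse-resolution image on the reference date
coarse_prediction = (0.45, 0.10)  # coarse-resolution image on the prediction date

ib_value = index_then_blend(fine_reference, coarse_reference, coarse_prediction)
bi_value = blend_then_index(fine_reference, coarse_reference, coarse_prediction)
```

With IB the fusion step runs once on the index, whereas BI runs it once per band and then combines two separately estimated bands, which is one way errors can accumulate.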

7.        Line 193-195:  “From aerial and satellite images, it is impossible to distinguish with certainty, one crop (e.g. peanuts) from another (e.g. carrots) during the growing season.” I agree with the authors if they just said “Satellite”.  But aerial imagery in some cases has enough pixel resolution to detect the crop one-by-one. Also, we can access the CubeSat imagery using Planet data server that has 1m resolution.

Response 7: Thank you for highlighting this. We have changed it to read “From moderate resolution satellite images, it is impossible to distinguish with certainty, one crop (e.g., peanuts) from another (e.g., carrots) during the growing season.”. Please see Lines 232-233.

8.        Line 216-217 “The model tuning parameters are the number of samples per class, the number of levels for each tuning parameter and the number of cross-validation samples.” It is important for readers to understand how the authors calibrate or tune the random forest parameters. Did the authors use a trial and error technique? Or just used suggestions from the literature.

Response 8: Thank you for this suggestion. We used recommendations from the package’s vignette as well as trial and error. We have included an explanation as suggested. Please see Lines 257-260.

Results:

9.        Fig 6: The r-square reported here belongs to all datasets mapped in each sub figures or is related to non-blue points. R2 0.9 for subfigure 1 is skeptical since the variation and uncertainty around 1-1 line seems too high. Please check those numbers.

Response 9: Thank you for this observation. We re-checked this result, and the reported R2 belongs to a random sample of 1 million points from both the observed and synthetic NDVI images. To avoid overplotting, we used the 2D density scatter plot. We have added a legend to the scatterplots as recommended by Reviewer 2 to explain the color range from blue to red and provided context in Lines 278-287.
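
A minimal sketch of computing R2 on a random sample of pixel pairs is given below. The response does not state how R2 was computed, so the squared Pearson correlation is assumed here; the function names, the seed and the sample data are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def sampled_r_squared(observed, synthetic, n_samples, seed=0):
    """R2 on a random sample of matching positions from two flattened images."""
    rng = random.Random(seed)
    size = min(n_samples, len(observed))
    indices = rng.sample(range(len(observed)), size)
    return r_squared([observed[i] for i in indices],
                     [synthetic[i] for i in indices])


# Hypothetical flattened NDVI images; the synthetic one is a damped copy.
observed_ndvi = [i / 100 for i in range(100)]
synthetic_ndvi = [0.8 * v + 0.05 for v in observed_ndvi]
sample_r2 = sampled_r_squared(observed_ndvi, synthetic_ndvi, 50)
```

Because this measure rewards any linear relationship, a high value does not by itself mean the points lie on the 1:1 line, which is relevant to the reviewer's question.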

10.    Fig 7: It seems the smooth fusion technique works very well to simulate the pattern of the time series. However, there is a disagreement between observed datasets and fusion records. How can the authors justify the results of this section? Can we conclude that the performance of the fusion technique is highly sensitive to the crop type? What about the growing stage or phenological stage of the crop? Is there any relationship between the results obtained from these graphs and this statement mentioned in line 154-156 “Working on the assumption that for a heterogeneous landscape, the changes in reflectance within a mixed pixel are representative of the weighted sum of changes for each land cover type, and that these changes do not change significantly over a short period of time, the relationship then, can be inferred from the reflectance of the fine resolution pixels”

Response 10: Thank you for the insightful questions. Indeed, there is a difference in terms of the amplitude of the mean observed with respect to the mean synthetic NDVI, especially in the grassland and paddy classes. This difference can be attributed to the fact that all spatiotemporal data fusion methods use spatial information from the fine-resolution images and temporal information from the coarse-resolution images, which, due to the resolution, have mixed pixels, as explained by Zhu et al. (2016). Also, the ESTARFM algorithm assumes that no significant changes in the land cover occur between the dates of the reference image pairs. The effect of this is that the resulting mean temporal evolution of fusion NDVI appears to be modulated or muted compared to the observed NDVI. However, as the configuration of the synthetic NDVI temporal profiles is similar to the observed NDVI time series, the integrity of the estimation of biophysical changes is maintained as stated in Lines 194-196. This is also related to the statement cited in the last part of your question. With regard to the sensitivity of the algorithm to crop type, without reference data on crop types, we cannot commit to a conclusion. However, looking at the temporal profiles for several individual cropland sample points, as shown in the graph below, one can infer that the cropland class has high intra-class variability and this may be due to, among other factors, crop type. The phenology and growing stages can also be deduced from the fusion dataset since the regular sampling frequency maintains temporal continuity which informs the phenological traits as mentioned in Lines 154-155.

Reference: Zhu, X., Helmer, E. H., Gao, F., Liu, D., Chen, J., & Lefsky, M. A. (2016). A flexible spatiotemporal method for fusing satellite images with different resolutions. Remote Sensing of Environment, 172, 165-177.
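
The weighted-sum assumption quoted by the reviewer, and the muting effect described in Response 10, can be illustrated with a small sketch. The class fractions and NDVI changes are hypothetical, and the function name is not from the manuscript.

```python
def coarse_pixel_change(fractions, changes):
    """Change seen by a mixed coarse pixel as the area-weighted sum of the
    changes of the land cover classes inside it (linear mixing assumption)."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("class fractions must sum to 1")
    return sum(fractions[name] * changes[name] for name in fractions)


# Hypothetical composition of one 250 m pixel and per-class NDVI change
# between two dates.
class_fractions = {"cropland": 0.5, "grassland": 0.3, "urban": 0.2}
class_changes = {"cropland": 0.20, "grassland": 0.10, "urban": 0.00}

mixed_change = coarse_pixel_change(class_fractions, class_changes)
```

In this example the cropland NDVI rises by 0.20, but the mixed pixel registers a rise of only 0.13, which is consistent with the damped amplitude of the fused profiles.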

11.    Fig 8: The fairly constant values of NDVI for Urban and Forest make sense to me. However, I’m wondering why there is an order between the peaks of the NDVI time series of the other classes (1st Crop, 2nd Grass, 3rd Paddy)?

Response 11: Thank you for the question and interesting observation. The observed sequence may be attributed to management practices, that is planting time in the case of cropland and paddy, and seasonal weather conditions. For UPA cropland, planting can begin as soon as the temperatures begin to rise in spring (from around DOY 91), and since crops reach the vegetative phase faster than natural vegetation, the cropland will peak first [Please see reference 47]. Grasslands will then peak second and finally paddy rice since transplanting takes place around May (DOY 121 ~ 151) and is at its peak in the summer between July and August (from DOY 182). It's an interesting observation that requires further investigation, especially for grasslands since not all grasslands are natural.
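
For readers less familiar with day-of-year (DOY) notation, the DOY values mentioned in Response 11 can be converted to calendar dates with a short sketch. The year 2015 is taken from the image series used in the study; the function name is illustrative.

```python
from datetime import date, timedelta


def doy_to_date(year, doy):
    """Convert a 1-based day of year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


# DOY values mentioned in the response, interpreted for 2015.
for doy in (91, 121, 151, 182):
    print(doy, doy_to_date(2015, doy).isoformat())
# 91 2015-04-01
# 121 2015-05-01
# 151 2015-05-31
# 182 2015-07-01
```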

12.    Line 334-335: “Using other metrics for the same one-year data-set such as the NDWI index or shape and texture features may solve this” It requires a proper citation.

Response 12: Thank you for highlighting this. We have added a reference as suggested. Please see Lines 388-389.

13.    Fig 12 and Fig 13: If it is possible, put Fig 13 over Fig 12 as a transparent layer. It would help readers compare the accuracy of the approach presented by the authors with recent published HRLULC map.

Response 13: Thank you for this suggestion. We attempted to overlay as suggested, but since there is an inevitable change in color in the regions of overlap, which cannot be included in the legend, we created side-by-side maps for comparison. Please see Figures 12 and 14. Also, we created a zoomed in version showing two regions in the study area where inconsistencies were detected especially in the cropland and urban classes. Please see Figure 13.

 Minor

14.    Line 47-50: “These techniques can be broadly categorized into spectral methods which reconstruct a singular image based on all the spectral information available in it in order to differentiate cloud cover from other features and temporal interpolation methods which utilize information retrieved from a series of images related to the behaviour of biophysical phenomena” It is too long. Break it into simple and short sentences.

Response 14: Thank you for this comment. We have changed the sentence to read, “Shen [26], broadly classified these methods into spatial, spectral, temporal and hybrid categories, which vary by the type of images they can be applied to and the sources of information used to fill the missing data.” Please see Lines 62-63.

15.    Line 88-89: “classification of peanuts versus other crops via classification of a dense regular image time series.” It is not clear to me why this objective is important and considered in the study.

Response 15: Thank you for this comment. We considered peanuts in this study for two reasons. The first is that in this area, peanuts are actively cultivated for their economic value, especially in Yachimata-shi as stated in Lines 231-232. However, as pointed out in the 2009 GAINS report on Japanese peanut production [54], peanut farms average about 1 ha and are mostly family run. Also, there is no foreign investment, government subsidy or compensation program for peanut production. It is, therefore, a perfect example of horticultural production in the UPA that is contributing to sustainable livelihoods. The second reason is that crop-type reference data is difficult to collect for UPA croplands since farmers change crops from year to year. The post-harvest practices of Japanese farmers which enable detection of cultivation from Google Earth images are an example of a creative way of acquiring reference data based on knowledge of local practices as stated in Lines 54-56. Also, there is no spatially explicit data on peanut production.

16.    Fig 3. The information of this figure mentioned in the context is not sufficient. I highly recommend to the authors to add a detailed information to make it more clear. Also, please define each subsection in the figure such as “Scaling”, Reprojection etc.

Response 16: Thank you for pointing this issue out. We have provided details on the MODIS pre-processing steps, as recommended. Please see Lines 142-151.

17.    Line “The MODIS NDVI images were first resampled to 30 m and cropped to match the extent of the Landsat 8 NDVI images …” Using which technique the authors resampled the coarse resolution to the higher resolution imagery pixel size? (KNN?), Also, add the formulation of IB in the manuscript.

Response 17: Thank you for the comments. We have added information on the resampling method and details on IB. Please see Lines 177-178 and Lines 165-168 respectively.

 

 

 


Author Response File: Author Response.pdf

Round  2

Reviewer 1 Report

I would like to thank the authors for their effort in addressing the review suggestions. However, if possible, please clarify how the authors addressed Response 14. I could not see the section 2.6 in the manuscript as mentioned. Please include it in the next version. 

        "Response 14: We added Section 2.6: Accuracy assessment to describe the accuracy  evaluation metrics used in this study."


Author Response

I would like to thank the authors for their effort in addressing the review suggestions. However, if possible, please clarify how the authors addressed Response 14. I could not see the section 2.6 in the manuscript as mentioned. Please include it in the next version. 

        "Response 14: We added Section 2.6: Accuracy assessment to describe the accuracy evaluation metrics used in this study."

Response 1: Thank you very much for the review and for noting this omission. We sincerely apologize for the oversight. Section 2.6 can now be found at lines 274 – 283.


Author Response File: Author Response.pdf

Reviewer 2 Report

Well, this manuscript has greatly improved. I appreciate the effort put in to revising and adding to the content presented. The Introduction and Methods are much more concise and clear. 

There are still some minor issues with language clarity. For instance, in several tables and figures, DOY appears without DOY (Day of Year) being defined anywhere. A reader new to imagery may not be familiar with that term. On Line 315, Land Use Land Cover doesn't need to be capitalized. Also, later in the paragraph the author uses "land use / land cover" and later in the Discussion section they use "Land-use / land cover". Same with "producer's" and "user's" accuracy. In the Results section it's capitalized and in the Discussion it is lower case. Be consistent.


Line 210: Seeing as you are using this land cover map for validation, a small bit of attention should be provided to the reader on how it was produced instead of "Details on its production are available at..." Just enough so the reader can determine its value as a reference source. The only time real issues found in various versions of this land cover database are discussed is in the Discussion section. This makes me wonder if this is the best dataset to use as a reference dataset. 


Line 225: What does "cleaned-up" mean? I assume that involved an edit to polygon boundaries to more accurately represent the extent of each mask? To correct for data loss due to the conversion process?


Figure 4. The imagery (c) is not necessary. 


Line 241-250: Super interesting. Filing that away for future image classification projects.


Line 252-260: A more detailed description of how this classifier works would be useful. 


Line 353-357: "The overall accuracy was 67.1%, and the PA and UA for the peanuts class were 63.2% and 71.2% respectively. Given the limited amount of reference data, we found this classification accuracy to be sufficient. However, further research is necessary for the determination of the spectral-spatial-temporal characteristics of peanuts as they are an important crop in this area." What do you think this could be attributed to? the classification method? A short discussion on this would be helpful. 
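
As a reminder of how the overall accuracy, producer's accuracy (PA) and user's accuracy (UA) quoted above relate to a confusion matrix, a minimal sketch follows. The matrix values are hypothetical and are not the study's results; the layout with reference classes in rows and mapped classes in columns is an assumption of this sketch.

```python
def accuracy_metrics(matrix):
    """Overall, producer's and user's accuracy from a square confusion matrix.

    Rows are reference classes and columns are mapped (predicted) classes.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n_classes))
    overall = correct / total
    # Producer's accuracy: correct pixels divided by the reference total.
    producers = [matrix[i][i] / sum(matrix[i]) for i in range(n_classes)]
    # User's accuracy: correct pixels divided by the mapped total.
    users = [
        matrix[j][j] / sum(matrix[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return overall, producers, users


# Hypothetical two-class example: class 0 = peanuts, class 1 = other crops.
example_matrix = [
    [50, 10],  # reference peanuts: 50 mapped as peanuts, 10 as other
    [5, 35],   # reference other: 5 mapped as peanuts, 35 as other
]
overall_acc, producers_acc, users_acc = accuracy_metrics(example_matrix)
```

PA reflects omission errors (reference pixels the map missed), while UA reflects commission errors (mapped pixels that are wrong on the ground), so the two can differ for the same class.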


Author Response

There are still some minor issues with language clarity. For instance, in several tables and figures, DOY appears without DOY (Day of Year) being defined anywhere. A reader new to imagery may not be familiar with that term. On Line 315, Land Use Land Cover doesn't need to be capitalized. Also, later in the paragraph the author uses "land use / land cover" and later in the Discussion section they use "Land-use / land cover". Same with "producer's" and "user's" accuracy. In the Results section it's capitalized and in the Discussion it is lower case. Be consistent.

Response 2: Thank you for this observation. We have made amendments throughout the manuscript for consistency of terms and also the definition of the DOY acronym for the imagery novice reader. We have used ‘land use/ land cover’ and provided the full phrase ‘Day of Year’ along with the acronym in the tables.

Line 210: Seeing as you are using this land cover map for validation, a small bit of attention should be provided to the reader on how it was produced instead of "Details on its production are available at..." Just enough so the reader can determine its value as a reference source. The only time real issues found in various versions of this land cover database are discussed is in the Discussion section. This makes me wonder if this is the best dataset to use as a reference dataset. 

Response 3: Thank you for pointing this out. We have added information on the production of the Japan Aerospace Exploration Agency’s (JAXA) Japan High Resolution Land Use Land Cover (Version 18.03) including the input data, the classification method, land use/ land cover categories and the reported accuracy for the cropland (upland cropland) class. The JHRLULC map series is important as it is freely available, has regular updates and distinguishes between upland cropland and paddy field classes. Please see lines 203 – 215.

 

Line 225: What does "cleaned-up" mean? I assume that involved an edit to polygon boundaries to more accurately represent the extent of each mask? To correct for data loss due to the conversion process?

Response 4: Thank you for the question. “Cleaned-up” does indeed refer to polygon editing of the vectorized masks. The generation of the land cover masks involved the use of raster math based on the temporal evolution of NDVI to estimate the land cover class. As such, some polygons were incorrectly labeled. For instance, since there are croplands whose phenology for the year closely matched that of paddy rice fields, we deleted these polygons. The loss of these polygons was compensated for by generating the dense point cloud thus ensuring that we had enough reference data points.
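
For readers unfamiliar with this kind of raster math, the following is a minimal sketch of how a candidate land cover mask might be derived from the temporal evolution of NDVI. It is not the procedure used in the manuscript: the function name `candidate_mask`, the two-date rule (low NDVI early in the season, high NDVI at the seasonal peak) and both thresholds are hypothetical choices made only for illustration.

```python
def candidate_mask(ndvi_early, ndvi_peak, early_max=0.2, peak_min=0.6):
    """Return a boolean grid marking pixels whose NDVI is low on the early
    date and high on the peak date.

    ndvi_early, ndvi_peak: 2-D lists of NDVI values with the same shape.
    early_max, peak_min: hypothetical thresholds, not values from the paper.
    """
    mask = []
    for row_early, row_peak in zip(ndvi_early, ndvi_peak):
        mask.append([
            early <= early_max and peak >= peak_min
            for early, peak in zip(row_early, row_peak)
        ])
    return mask


# Hypothetical 2 x 2 NDVI grids for two acquisition dates.
ndvi_early = [[0.10, 0.45], [0.15, 0.05]]
ndvi_peak = [[0.80, 0.75], [0.30, 0.70]]

mask = candidate_mask(ndvi_early, ndvi_peak)
for row in mask:
    print(row)
```

A rule of this kind labels every pixel that follows the expected NDVI trajectory, so fields of another class with a similar phenology would pass the same test. That is the sort of mislabeling that manual polygon editing is meant to remove.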

Figure 4. The imagery (c) is not necessary. 

Response 5: This is well noted, thank you. We have left out Figure 4(c) as recommended. Please see

Line 241-250: Super interesting. Filing that away for future image classification projects.

Response 6: We greatly appreciate that you found this to be interesting. We believe that to enhance crop mapping and monitoring using remote sensing data, there is a compelling case for the creation of a database on tillage and post-harvest practices or other crop-identifying features. Such a database would include, for various crops in different parts of the world, characteristics or features identifiable from the high-resolution satellite imagery made available by platforms such as Google Earth and Microsoft’s Bing Aerial. As an example, we provide a Google Maps link that shows a post-harvest practice on maize/corn fields in Kenya, which results in distinct clumps on the ground, visible from Google Earth (https://goo.gl/maps/CrtcyyuHKjv). With investment in, and exploitation of, technology such as UAV mapping, such information can be used by governments and food agencies to map crops within the season and predict yields, thus allowing for planning, especially in the event that harvests are impacted by drought or pests and diseases.

Line 252-260: A more detailed description of how this classifier works would be useful. 

Response 7: Thank you for this suggestion. We have added information about the RF classifier as implemented in the RStoolbox package. Please see lines 263 – 273.

Line 353-357: "The overall accuracy was 67.1%, and the PA and UA for the peanuts class were 63.2% and 71.2% respectively. Given the limited amount of reference data, we found this classification accuracy to be sufficient. However, further research is necessary for the determination of the spectral-spatial-temporal characteristics of peanuts as they are an important crop in this area." What do you think this could be attributed to? The classification method? A short discussion on this would be helpful.

Response 8: Thank you for this recommendation. We have highlighted that RF requires numerous training datasets to train the model and that phenological similarity between peanuts and other crops cultivated at the same time, such as carrots (please see Table 5), could have an effect on the classification accuracy. Please see lines 377 – 383.
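
As a reminder of how the three figures quoted in this comment relate to one another, the sketch below computes overall accuracy, producer's accuracy (PA) and user's accuracy (UA) from a confusion matrix. The function name `accuracy_metrics`, the two-class layout and the counts are hypothetical; they are not taken from the manuscript and do not reproduce its reported values.

```python
def accuracy_metrics(matrix, labels):
    """Compute overall, producer's and user's accuracy.

    matrix[i][j] is the number of reference samples of class i that the
    map assigns to class j (rows = reference, columns = map).
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    overall = correct / total

    producers = {}
    users = {}
    for i, label in enumerate(labels):
        reference_total = sum(matrix[i])                  # row sum
        mapped_total = sum(matrix[r][i] for r in range(n))  # column sum
        # PA: share of reference samples of this class that were mapped correctly.
        producers[label] = matrix[i][i] / reference_total
        # UA: share of pixels mapped as this class that are correct.
        users[label] = matrix[i][i] / mapped_total
    return overall, producers, users


# Hypothetical counts for illustration only.
labels = ["peanuts", "other"]
matrix = [
    [12, 8],   # reference peanuts: 12 mapped as peanuts, 8 as other
    [4, 16],   # reference other: 4 mapped as peanuts, 16 as other
]

overall, pa, ua = accuracy_metrics(matrix, labels)
print(f"OA = {overall:.3f}, PA(peanuts) = {pa['peanuts']:.3f}, UA(peanuts) = {ua['peanuts']:.3f}")
```

With few reference samples per class, each misclassified sample shifts PA and UA by several percentage points, which is one reason limited reference data makes such figures hard to interpret.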


Author Response File: Author Response.pdf

Reviewer 3 Report

The manuscript has been significantly improved. There are only a few things that need to be considered by the authors.


(1): In line 19, it is still not clear how commodity price fluctuations can affect regional food security.


(2): The legend of figure 2 is too small.


(3): The scale of figure 7 must be checked. It seems they are stretched.


Author Response

(1): In line 19, it is still not clear how commodity price fluctuations can affect regional food security.

Response 9: Thank you for the vital question, we hope that the explanation below provides some clarity.

According to Brown and Funk (2008) [Reference 1 in this manuscript], “Since the 1990s, rising commodity prices and declining per capita cultivated area have led to decreases in food production, eroding food security in many communities. Many regions that lack food security rely on local agricultural production to meet their food needs. Primarily tropical and subtropical, these regions are substantially affected by both global climate variations and global commodity price fluctuations.”

Kalkuhl, Von Braun and Torero (2016) is an excellent treatise on the complex and sensitive subject of food prices and their implications for food security and policy. In this book, they state that:

“Food markets cannot be considered in isolation: Spatially separated markets are linked through trade; food markets are influenced by commodity, asset, and financial markets; and these, in turn, influence trading and allocation decisions of actors that also engage in food markets. Because of the complex interlinkages and interactions between several actors and economic sectors, food prices are not the mere result of farmers supply and consumers demand, and price volatility is not solely determined by harvest and income shocks.”

The diagram below summarises the above statement.

Figure 1: Conceptual framework of the causal impacts of price volatility. Source: Kalkuhl, M., Von Braun, J., & Torero, M. (2016)

References:

1.        Brown, M. E., & Funk, C. C. (2008). Food security under climate change.

2.        Kalkuhl, M., Von Braun, J., & Torero, M. (Eds.). (2016). Food price volatility and its implications for food security and policy.

(2): The legend of figure 2 is too small.

Response 10: Thank you for noting this. We have increased the font size of Figure 2 items.

(3): The scale of figure 7 must be checked. It seems they are stretched.

Response 11: Thank you very much, this is well noted. We have rectified the diagram to eliminate the height stretch. 


Author Response File: Author Response.pdf
