Peer-Review Record

A GIS Pipeline to Produce GeoAI Datasets from Drone Overhead Imagery

by John R. Ballesteros 1,*, German Sanchez-Torres 2 and John W. Branch-Bedoya 1
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
ISPRS Int. J. Geo-Inf. 2022, 11(10), 508; https://doi.org/10.3390/ijgi11100508
Submission received: 30 August 2022 / Revised: 27 September 2022 / Accepted: 28 September 2022 / Published: 30 September 2022

Round 1

Reviewer 1 Report

The authors proposed a pipeline for generating datasets from drone imagery. Overall, the topic is important for creating high-quality data for deep learning applications. Using the pixel value standard deviation to determine the buffer distance is also interesting. However, there are several issues in the manuscript. The authors need to express their ideas more clearly and in a better-organized manner, and the research method needs some improvement as well.

 

Major issues:

1.  Confusing research design

I feel confused about the research purpose and design of this work. The authors propose a dataset and, at the same time, justify the dataset's quality by a model's performance (Section 3.3). However, it is hard to infer the quality of a dataset directly from a model's performance: a dataset can be high quality yet yield low accuracy because of its complexity or difficulty. It is interesting that the authors used the pixel value standard deviation to set the mask buffer, and this can be useful, because sometimes we care more about the road structure than its exact extent, as long as the buffer is reasonable. However, the authors applied U-Net and concluded that 1 m is the best buffer because it gives the best mIoU, which does not make much sense. If that logic held, why not choose 5 m as the buffer size, which brings an even higher mIoU? It is therefore incorrect to justify the buffer size by mIoU (or by mIoU alone). In Section 3.3, the authors also mention that "The use of 90-degree mirroring data augmentation and data fusion for the road dataset using a buffer distance of 100 cm increased model performance." It seems the authors applied data augmentation and data fusion to improve the model's performance, but how does that make the dataset better? I suggest the authors separate the dataset and the model (perhaps into two papers). For the dataset paper, the authors can focus on analyzing the dataset's quality and provide a benchmark of the accuracies of different models on the dataset. For the model paper, they can analyze different fusion and augmentation strategies.
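To make the metric concern concrete, consider a minimal sketch (illustrative only, not the authors' pipeline; the ground sampling distance and buffer widths are my assumptions): when mIoU is computed against the buffered label itself, widening the buffer changes the score even though the annotation and the model prediction are both fixed.

```python
# Illustrative sketch: IoU against ground-truth masks buffered at several
# widths. The score moves with the buffer, not with annotation quality.
import numpy as np
from scipy.ndimage import binary_dilation

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 0.0

centerline = np.zeros((256, 256), dtype=bool)
centerline[126:130, :] = True                      # a thin "road" annotation
pred = binary_dilation(centerline, iterations=6)   # one fixed model prediction

# Buffer in pixels; at an assumed GSD of ~10 cm/px, 10 px corresponds to ~1 m.
for buffer_px in (2, 5, 10, 30, 50):
    gt = binary_dilation(centerline, iterations=buffer_px)
    print(f"buffer {buffer_px:>2} px -> IoU {iou(pred, gt):.3f}")
```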

 

Minor issues:

1. Introduction of related works

The authors introduced several works related to datasets, model evaluation, data fusion, and the imbalance issue. Instead of simply introducing each work, it would be better if the authors briefly described the relations between this work and the others: what do they have in common, how do they differ, and what are their open issues? This does not have to be long (a summary is enough), but it would give readers a better idea of what you are going to propose or improve. For the dataset part, I also wonder whether there are similar works using drones to collect data; introducing those would be more directly related to your topic.

 

2. Data augmentation before data split

In the dataset generation pipeline (Figure 1), the authors applied data augmentation before the data split. Is there a reason to do so? The evaluation is invalid if augmented copies of training images end up in the testing set, and the final accuracy may then be overestimated. I did not see this discussed in the paper.
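A minimal sketch of the ordering being asked for (illustrative, not the authors' code; `augment_90` is a stand-in for the paper's 90-degree mirroring step): split first, then augment only the training subset, so no transformed copy of a test tile can leak into training.

```python
import random
import numpy as np

def augment_90(pair):
    """Stand-in for 90-degree mirroring of an (image, mask) pair."""
    image, mask = pair
    return np.rot90(image), np.rot90(mask)

def split_then_augment(pairs, test_frac=0.2, seed=42):
    rng = random.Random(seed)
    pairs = list(pairs)                      # copy before shuffling
    rng.shuffle(pairs)
    n_test = int(len(pairs) * test_frac)
    test, train = pairs[:n_test], pairs[n_test:]
    train += [augment_90(p) for p in train]  # augment training tiles ONLY
    return train, test
```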

 

3. Imbalance check 

As the authors mentioned, class imbalance is a common problem that affects a model's performance, so they ran an imbalance check and filtered out some imbalanced images. However, I would argue that a dataset should reflect real-world conditions; the imbalance issue should instead be resolved on the model side or through data preprocessing strategies. If a model is trained on a dataset from which (extremely) imbalanced images have been removed, can we expect the same performance when the model is applied in the real world? Can the authors justify why they adopted the imbalance check step?
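For reference, a hedged sketch of what the imbalance check presumably does, based on the paper's description (the threshold values are my assumptions): filtering image–mask pairs whose foreground ratio falls outside a band.

```python
import numpy as np

def imbalance_ratio(mask: np.ndarray) -> float:
    """Fraction of mask pixels belonging to the target class."""
    return float((mask > 0).mean())

def filter_balanced(pairs, min_ratio=0.01, max_ratio=0.99):
    # Tiles outside the band are dropped -- exactly the hard, heavily
    # imbalanced cases that real-world test imagery will still contain.
    return [(img, m) for img, m in pairs
            if min_ratio <= imbalance_ratio(m) <= max_ratio]
```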

 

4. Equation 4 

I do not understand the purpose of Equation 4. After fusing the DSM into the RGB channels, you would apply rescaling to the images, but why did you apply the rescaling to the DSM instead of to HRGB, HLRGB, or RGDSM? I also did not see where the resulting NDSM is used.
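For illustration, here is one plausible reading of Equation 4 as a min–max rescaling of the DSM to the 8-bit range of the RGB channels before fusion; this is my assumption, not confirmed by the text, and `fuse_rgdsm` replacing the blue channel is likewise a guess at the RGDSM variant.

```python
import numpy as np

def rescale_dsm(dsm: np.ndarray) -> np.ndarray:
    """Assumed Equation 4: NDSM = 255 * (DSM - min) / (max - min), as 8-bit."""
    lo, hi = float(np.nanmin(dsm)), float(np.nanmax(dsm))
    if hi == lo:
        return np.zeros(dsm.shape, dtype=np.uint8)
    return (255.0 * (dsm - lo) / (hi - lo)).astype(np.uint8)

def fuse_rgdsm(rgb: np.ndarray, dsm: np.ndarray) -> np.ndarray:
    """Guessed RGDSM variant: blue channel replaced by the rescaled DSM."""
    fused = rgb.copy()
    fused[..., 2] = rescale_dsm(dsm)
    return fused
```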

 

5. Text in figures

Some text in the figures is too small to read (Figures 6 and 10).

 

6. Figure 7

Figure 7 is meant to demonstrate that the 100 cm buffer is an appropriate selection. Why, then, did the authors show the RGB pixel distribution for 150 cm, with statistics for only the blue channel? Readers would expect the distribution for 100 cm and statistics for all three channels.
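What readers would expect is something like the following sketch (illustrative, assuming 8-bit RGB tiles): per-channel histograms and statistics of the pixels inside the buffered mask, produced for each candidate buffer distance including 100 cm.

```python
import numpy as np
import matplotlib.pyplot as plt

def channel_stats(rgb: np.ndarray, mask: np.ndarray):
    """Histogram and mean/std of each RGB band within the buffered mask."""
    fig, axes = plt.subplots(1, 3, figsize=(9, 3), sharey=True)
    for ax, band, name in zip(axes, range(3), ("red", "green", "blue")):
        vals = rgb[..., band][mask > 0]
        ax.hist(vals, bins=50)
        ax.set_title(f"{name}: mean={vals.mean():.1f}, std={vals.std():.1f}")
    plt.tight_layout()
    return fig
```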

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

 Recommendation: Minor revision.

 

This manuscript presents an interesting study on constructing a GIS pipeline for producing GeoAI datasets from drone imagery for object detection. The reviewer thinks the study is well conceived, conducted, and presented. There are no major concerns regarding the study, but a few minor comments are given below.

 

1. Introduction: One single paragraph for the whole Introduction? It is too difficult to read through. Please break it into several paragraphs for better readability.

2. Line 69: What is DSM? It should be defined the first time it is used.

3. Table 1: Those numbers are confusing; I cannot discern which is xmin, ymin, xmax, or ymax. Also, it is mentioned earlier that the WGS84 geographic coordinate system is used, but these numbers are not geographic coordinates (lat, lon). Please clarify and revise (see the placeholder sketch after this list for the form readers would expect).

4. Appendix A

This is not accessible: Drone Road dataset: https://zenodo.org/deposit/7019074

The link is missing: Drone Orthomosaics and Raster Masks.

Or is it the same as the “HAGDAVS dataset”?
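Regarding comment 3 above, if Table 1 is truly in WGS84, readers would expect labeled decimal-degree values of roughly the following form (the numbers below are invented placeholders, not taken from the paper):

```python
# Invented placeholder coordinates (NOT from the paper), showing the
# labeled (xmin, ymin, xmax, ymax) form expected for WGS84 lat/lon.
bbox = {
    "xmin": -75.58,  # westernmost longitude, decimal degrees
    "ymin": 6.21,    # southernmost latitude, decimal degrees
    "xmax": -75.55,  # easternmost longitude, decimal degrees
    "ymax": 6.24,    # northernmost latitude, decimal degrees
}
```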

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

Thank you for giving me this opportunity to read the manuscript entitled "A GIS pipeline for the production of GeoAI datasets from drone imagery". I enjoyed reading this manuscript. The topic is interesting and the manuscript is well written. I think some minor issues still need to be addressed before it can be considered for publication in Drones.

1. Please replace the keywords that already appear in the manuscript's title with close synonyms or other keywords; this will also make your paper easier for potential readers to find.

2. A scale bar, compass, and legend should be added to the maps in the figures.

3. Lines 28-43: Appropriate and sufficient references need to be cited here to support the points and statements made in the section. For example, for the statement in Lines 31-32, "It is an active area of research these days that have applications on many fields like disaster management, urban planning, logistics, retail, solar, and many others.", a newly published paper titled "Dynamic assessments of population exposure to urban greenspace using multi-source big data", which uses GeoAI to address urban planning issues, could be used as a reference.

4. Limitations should be added as a sub-section of the Discussion section.

5. Some grammatical errors exist in the manuscript; a critical review of the manuscript's language will improve readability.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thanks for the authors' reply, which helped make the article clearer to me. However, I still have some questions and hope the authors can clarify them in the article.

 

1. Imbalance check. The authors mentioned that the imbalance check is optional and is applied only to the training and validation sets. But I wonder what the imbalance check does when it is applied to the data. The authors mentioned:

“The pipeline includes a step to check the data imbalance of image–mask pairs and produce a Gaussian-like distribution of pixels that guarantees less presence of misclassified pixels.”

“The buffer distance is a “tradeoff” between getting imbalanced thin masks with no misclassifications, and wider masks with more pixels of mixed classes.”

“Instead of calculating the imbalance on the whole raster mask, the imbalance check may be applied on every mask using a threshold; that is, every image–mask pair corresponding to a balanced mask is saved as a whole image (of 2^n × 2^(n−1) pixels, for instance 512×256 pixels).”

So, I guess the pipeline either removes some of the data or increases the buffer in order to raise the imbalance parameter. Either way, it does not make much sense to me, since the check is not applied to the testing set: it is as if the hard samples were removed from the training and validation sets but kept in the testing set.

 

2. Data fusion. Data fusion has a similar issue to the imbalance check: Figure 1 shows it applied only to the training and validation sets. In Table 2, the authors show the performance obtained with different data fusions; I wonder whether those results were evaluated on the original testing set without data fusion. Unlike data augmentation, data fusion is normally applied to all of the data. It does not seem right for a dataset to provide only fused training and validation data but no fused testing data.
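In sketch form (illustrative only; `fuse_fn` is a placeholder for whichever fusion variant is used, e.g., stacking a rescaled DSM onto RGB), the expected convention is:

```python
# A fusion transform is part of the input representation, so every split
# must receive the same transform -- unlike augmentation, which is
# restricted to training data.
def fuse_split(split, fuse_fn):
    """Apply one fusion function to every (image, dsm, mask) triple."""
    return [(fuse_fn(img, dsm), mask) for (img, dsm, mask) in split]

# train_fused = fuse_split(train, fuse_fn)
# val_fused   = fuse_split(val, fuse_fn)
# test_fused  = fuse_split(test, fuse_fn)   # never train/val only
```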

 

3. Conclusion. In the Conclusion, the authors state: “The imbalance check is closely related to performance; imbalance values below 1% do not generate segmentation results with the U-Net employed, while values above 10% created segmentation results with mIoU above 50%.” Where does this conclusion come from? Which dataset? I only saw imbalance values mentioned in Figure 6, but they are not consistent: only the 3 m buffer and the full-size mask have imbalance values larger than 10%, yet the mIoU in Figure 10(b) shows a different result.

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx
