Article
Peer-Review Record

A Hybrid Image Segmentation Method for Accurate Measurement of Urban Environments

Electronics 2023, 12(8), 1845; https://doi.org/10.3390/electronics12081845
by Hyungjoon Kim 1, Jae Ho Lee 2 and Suan Lee 1,*
Submission received: 22 March 2023 / Revised: 10 April 2023 / Accepted: 11 April 2023 / Published: 13 April 2023
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

Dear Authors,

 

The development of the model is well designed but needs to be explained in more detail. You should provide a specific comparison of your model to other models. Comments and recommendations for improving the work are as follows:

Comment 1:

I think that the abstract should be expanded because it should be sufficient by itself to understand what was done in the scientific work.

Comment 2:

Line 24 – I think it should be explained what semantic segmentation is.

Comment 3:

Line 54 – Explain in more detail what “intersection over union (IoU)” represents.

Comment 4:

Line 189 – How was this achieved? “differences between the cityscapes dataset and the collected GSV image dataset.”

 

Comment 5:

Line 204 – “256 by 256.” Are the units pixels?

Comment 6:

Explain the photo cropping and the number of photos (e.g., 2688) in more detail. The existing numbers are confusing.

Comment 7:

Line 232 – What does this mean? “equal depth in both the encoder and decoder.”

Comment 8:

Line 254 – Should it say “human and car” instead of “human”?

Comment 9:

Line 309 – In the text, explain the individual numbers from Table 2. For example, does +0.007 mean that Hybrid had better results than SegNet by 0.007? You partially stated this in the text, but an example would explain all the numbers better.

Comment 10:

Could you provide a comparison of your hybrid model with some of the other models available? Is it feasible?

Comment 11:

Line 328 – Could you describe the differences with specific examples? I can see the differences in the pictures you attached, but it would be good if you stated them in the text.

Comment 12:

Line 332 – Why did you decide on this way of evaluating and scoring the results?

Comment 13:

Line 354 – What are the “inaccurate regions”?

Comment 14:

Because of those inaccurate regions, could you please suggest some ways to correct these shortcomings and what are the directions for further research?

Comment 15:

I noticed that greenery, in the sense of measurement through the model, is mentioned only briefly in the text, even though it appears in the title as “green coverage.”

 

Author Response

  1. I think that the abstract should be expanded because it should be sufficient by itself to understand what was done in the scientific work.

    Thank you for your feedback on our paper. We appreciate your suggestion to revise the abstract and have carefully reviewed and modified it accordingly. As per your advice, we have made certain amendments to the abstract and highlighted the changes in yellow for easy reference. We believe that these modifications have strengthened the clarity and accuracy of our paper.

  2. Line 24 – I think it should be explained what semantic segmentation is.

    Thank you for your review of our paper. We have carefully considered your feedback and have made the necessary revisions to the introduction section. Specifically, we have modified the first sentence to provide a more thorough explanation of semantic segmentation, and have highlighted these changes in yellow within the manuscript for easy reference. We believe that this modification will better contextualize our study for readers who may not be familiar with this particular technique.

  3. Line 54 – Explain in more detail what “intersection over union (IoU)” represents.

    Thank you for your valuable feedback on our paper. We have carefully reviewed your comments and have made the necessary revisions to the introduction section. Specifically, we have added a clear and concise explanation of what Intersection over Union (IoU) is to the third paragraph of the introduction section, and have highlighted these additions in yellow within the manuscript for easy reference. We believe that this addition will enhance the clarity and comprehensibility of our paper for readers who may be unfamiliar with this particular metric.
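    For readers' reference, the IoU metric discussed above can be sketched with a minimal NumPy example. The masks and values here are purely illustrative and are not drawn from the paper:

```python
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over Union for two boolean masks of the same shape."""
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(intersection) / float(union) if union > 0 else 0.0

# Two 4x4 masks overlapping in 2 pixels; the union covers 6 pixels.
pred = np.zeros((4, 4), dtype=bool)
target = np.zeros((4, 4), dtype=bool)
pred[0, 0:4] = True      # 4 predicted pixels
target[0, 2:4] = True    # 2 labeled pixels overlapping the prediction
target[1, 0:2] = True    # 2 labeled pixels missed by the prediction
print(round(iou(pred, target), 4))  # 2 / 6 ≈ 0.3333
```

    A score of 1.0 means the predicted and ground-truth regions coincide exactly, while 0.0 means they do not overlap at all.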

  4. Line 189 – How was this achieved? “differences between the cityscapes dataset and the collected GSV image dataset.”

    We would like to express our appreciation for your insightful review of our paper. We have taken into consideration your comments and have made the necessary revisions to the manuscript. Specifically, we have added a brief explanation that GSV images have a 1:1 aspect ratio and are smaller in size than cityscape images. To ensure consistency in our pre-processing, we have also included information regarding the cropping of cityscape images to a 1:1 aspect ratio and unification of size. These modifications have been highlighted in yellow within the manuscript for easy reference. We believe that these additions will provide valuable context and enhance the overall clarity of our research.

  5. Line 204 – “256 by 256.” Are the units pixels?

    We appreciate your insightful comments on our paper and have carefully reviewed and made the necessary revisions to the manuscript. Specifically, in response to your suggestion, we have added a detailed explanation regarding the pre-processing steps undertaken for the cityscapes dataset. As the size of the dataset we collected was 256 by 256, we standardized this size and cropped the cityscapes dataset images in various shapes, while ensuring a 1:1 aspect ratio was maintained. We believe that this information will provide readers with a better understanding of our methodology and further enhance the quality and rigor of our research.

  6. Explain the photo cropping and the number of photos (e.g., 2688) in more detail. The existing numbers are confusing.

    We would like to express our gratitude for your insightful feedback on our paper. Based on your suggestions, we have revised the description of the image size and the number of additional images that can be obtained through the repeated application of the aforementioned operation as follows:
    In the case of ½ size, the image size is 1024 by 512 pixels in width and height. If the above operation is repeated on this image under the same conditions, 512 additional images can be obtained. Similarly, in the case of ¼ size, 128 additional images can be obtained. As a result, a total of 2688 cropped images are created from one cityscapes image.
    We believe that this revised explanation provides a clearer understanding of our methodology and will enhance the overall quality of our research.
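    The counting logic behind such sliding-window cropping can be sketched as follows. Note that the response does not state the stride actually used, so the stride below is an assumption for illustration only and does not reproduce the paper's exact counts of 512 or 128:

```python
def count_crops(width: int, height: int, crop: int = 256, stride: int = 64) -> int:
    """Number of crop x crop windows that fit in a width x height image
    when sliding by `stride` pixels in each direction."""
    if width < crop or height < crop:
        return 0
    nx = (width - crop) // stride + 1  # window positions along x
    ny = (height - crop) // stride + 1  # window positions along y
    return nx * ny

# Half-size cityscapes image (1024 x 512); stride 64 is illustrative.
print(count_crops(1024, 512))  # 13 * 5 = 65
```

    Smaller strides produce more (heavily overlapping) crops, which is one common way to expand a training set from a limited number of source images.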

  7. Line 232 – What does this mean? “equal depth in both the encoder and decoder.”

    We appreciate your thoughtful review of our paper and have carefully considered your comments. Specifically, in our methodology, the encoder and decoder have a symmetrical structure, which means that they contain the same number of layers and the same architecture. As an example, in the case of the Vgg16-based SegNet, the encoder comprises 13 layers, excluding the fully-connected layer of Vgg16, and the decoder also comprises 13 layers, resulting in a symmetrical structure.

  8. Line 254 – Should it say “human and car” instead of “human”?

    As you suggest, it is correct to write "human and car" instead of "human". There was a typo and this part has been corrected. We apologize for any confusion or inconvenience caused by this error and are grateful for your valuable feedback in improving the quality and accuracy of our research.

  9. Line 309 – In the text, explain the individual numbers from Table 2. For example, does +0.007 mean that Hybrid had better results than SegNet by 0.007? You partially stated this in the text, but an example would explain all the numbers better.

    To clarify this part, we have added a description of the table as follows:
    The Hybrid column in Table 2 shows the IoU scores of our proposed Hybrid model, while the Compare to SegNet and Compare to DeepLabv3+ columns indicate how much the IoU has changed when the hybrid model is compared to SegNet or DeepLabv3+, respectively. Here, the + and - signs indicate that IoU has increased or decreased relative to the SegNet or DeepLabv3+ result, respectively.
    We believe that this revised explanation provides a clearer understanding of our experiments.

  10. Could you provide a comparison of your hybrid model with some of the other models available? Is it feasible?

    We appreciate your review of our paper and have taken your comments into careful consideration. Our primary objective in this study was to investigate the effectiveness of the hybrid model in improving the performance of semantic segmentation. We have successfully demonstrated the efficacy of this approach through our experimental results. However, we also acknowledge the importance of further research and development in this field. In particular, we plan to train our dataset on a new model in future studies, as the construction of a new dataset requires the model to be trained from scratch to achieve optimal performance. We believe that this approach will enable us to further enhance the accuracy and efficiency of semantic segmentation. We thank you for your valuable feedback, which has helped us to improve the quality and impact of our research.

  11. Line 328 – Could you describe the differences with specific examples? I can see the differences in the pictures you attached, but it would be good if you stated them in the text.

    We would like to express our gratitude for your review of our paper. In response to your comments, we have carefully examined our manuscript and have added further explanations for some particularly noteworthy examples. We believe that these additional details will provide a better understanding of the key findings of our study and highlight the potential applications of our approach. We hope that these clarifications have addressed any concerns you may have had, and we appreciate your valuable feedback in improving the quality of our research.

  12. Line 332 – Why did you decide on this way of evaluating and scoring the results?

    Thank you for your valuable feedback on our paper. We would like to clarify that since there were images in the dataset that did not have correct answers, we conducted a user study to evaluate the segmentation results, rather than using a general evaluation method such as IoU. As a result, the subjective opinions of the participants became the basis for our evaluation, and it was necessary to quantify their feedback. Therefore, we used a method of converting the order of the results of the three models into a score, which allowed us to obtain a quantitative measure of the subjective opinions expressed by the participants. We hope that this clarification helps to further elucidate the methodology used in our study, and we appreciate your comments in helping us to improve the quality of our research.
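    The rank-to-score conversion described here can be sketched as a Borda-style count. The point scale and the participant votes below are hypothetical, since the response does not specify the exact scoring used:

```python
from collections import defaultdict

def rank_to_scores(rankings, points=(3, 2, 1)):
    """Borda-style scoring: each participant orders the models from best
    to worst; each place earns the corresponding point value, summed per model."""
    totals = defaultdict(int)
    for order in rankings:  # one participant's ordering, best to worst
        for place, model in enumerate(order):
            totals[model] += points[place]
    return dict(totals)

# Three hypothetical participants ranking the three models best-to-worst.
votes = [
    ["Hybrid", "DeepLabv3+", "SegNet"],
    ["Hybrid", "SegNet", "DeepLabv3+"],
    ["DeepLabv3+", "Hybrid", "SegNet"],
]
print(rank_to_scores(votes))  # {'Hybrid': 8, 'DeepLabv3+': 6, 'SegNet': 4}
```

    Summing rank points in this way turns each participant's subjective ordering into a single quantitative score per model.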

  13. Line 354 – What are the “inaccurate regions”?

    We have revised the description of Figure 5 to clarify that there were some inaccurately predicted regions, such as walls and roads, in the second and fourth examples. The intent was to provide a more accurate and precise explanation of the regions that were not correctly predicted by the model.
    The following sentence has been added to the paper: “It can be confirmed that the resulting image has some inaccurately predicted regions, such as a wall or road, but visually plausible segmentation for important classes.”
    Thank you for bringing this to our attention.

  14. Because of those inaccurate regions, could you please suggest some ways to correct these shortcomings and what are the directions for further research?

    In our future research, we aim to construct a more comprehensive hybrid model by incorporating various models. By utilizing a more updated model during this process, we anticipate that we will be able to accurately detect areas that are currently exhibiting inaccurate predictions.

  15. I noticed that greenery, in the sense of measurement through the model, is mentioned only briefly in the text, even though it appears in the title as “green coverage.”

    Based on our technology, we conducted an analysis of the green area ratio in GSV images, and we are currently engaged in a study aimed at providing recommendations for optimal walking paths using this information. These details have been included in the conclusion section of our paper.
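    The green-area-ratio computation mentioned here can be sketched as follows. The class id and the label map are hypothetical, since the paper's label encoding is not given in this response:

```python
import numpy as np

VEGETATION = 2  # hypothetical class id for vegetation in the label map

def green_coverage(labels: np.ndarray) -> float:
    """Fraction of pixels assigned to the vegetation class."""
    return float((labels == VEGETATION).sum()) / labels.size

# 4x4 label map with 4 vegetation pixels -> 25% green coverage.
labels = np.zeros((4, 4), dtype=np.int64)
labels[0:2, 0:2] = VEGETATION
print(green_coverage(labels))  # 0.25
```

    Applied to a segmented street-view image, this per-pixel ratio gives a simple measure of visible greenery along a given route.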

Author Response File: Author Response.pdf

Reviewer 2 Report

Recommendation: Major Revision

 

This is a paper on a hybrid image segmentation method for accurate measurement of urban environments. It proposes a method to accurately analyze unlabeled images by combining different segmentation models, which can provide important information for the analysis of urban environments.

 

1. Which segmentation models are combined in the hybrid model proposed in this paper?

2. What data have you collected to accurately analyze unlabeled images?

3. What information can be obtained in the urban environment analysis using this method?

Author Response

We would like to express our sincere gratitude for taking the time to review our paper. Your feedback and constructive criticism were invaluable in improving the quality of our research. We appreciate your expertise and insights, which helped us to identify the strengths and weaknesses of our study. Your comments and suggestions have been carefully considered, and we have made the necessary changes to the manuscript accordingly. Your contribution has greatly enhanced the value of our research and improved its readability.

 

  1. Which segmentation models are combined in the hybrid model proposed in this paper?

    We would like to express our gratitude for providing us with your valuable feedback on our paper. We employed a fusion of InceptionResNetv2-based DeepLabv3+ and Vgg16-based SegNet for our approach. We have revised the introduction and abstract to more clearly convey the scope and focus of our research.

  2. What data have you collected to accurately analyze unlabeled images?

    We appreciate your careful review of our manuscript and your insightful comments. In this study, a dataset of Google Street View images from Yongsan-gu, Seoul, South Korea was collected and subsequently segmented.

  3. What information can be obtained in the urban environment analysis using this method?

    Hybrid models, the image segmentation techniques employed in this study, offer valuable insights into the urban environment. These models can be applied to images obtained by ground-based cameras mounted on cars, drones, or smartphones, enabling the acquisition of detailed information about urban landscape characteristics. Such information may not be readily obtainable using conventional methods such as aerial or satellite image analysis. For example, segmentation can identify and map urban vegetation, including trees and shrubs, facilitating the estimation of green space in urban areas.

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Dear Authors,

 

I am satisfied with most of your responses to my comments. I think that some answers could have been of better quality. For example, I think you could have corrected the abstract even better. Also, some explanations, like the explanation of semantic segmentation, could be of better quality and expanded.

 

Author Response

  1. I am satisfied with most of your responses to my comments. I think that some answers could have been of better quality. For example, I think you could have corrected the abstract even better. Also, some explanations, like the explanation of semantic segmentation, could be of better quality and expanded.

    Thank you for your feedback on our paper. We appreciate your constructive criticism and have taken your comments into consideration to improve the quality of our manuscript. We have carefully reviewed and revised the abstract and segmentation introduction section, paying particular attention to the quality and clarity of our explanations. We have also highlighted the revised parts of the manuscript in yellow to ensure they are easily identifiable.
    We strive to deliver a high-quality paper that meets the expectations of our readers, and your feedback has been instrumental in achieving this goal. We hope that you will find our revisions satisfactory and that they improve the overall quality of our manuscript.

Author Response File: Author Response.pdf

Reviewer 2 Report

Recommendation: Accept

Author Response

We would like to express our sincere gratitude for accepting our paper. Your insightful feedback and constructive criticism were instrumental in improving the quality of our work. We appreciate the time and effort you invested in reviewing our manuscript. We hope that our work will inspire further research and development in this area.
