Article
Peer-Review Record

An OSM Data-Driven Method for Road-Positive Sample Creation

Remote Sens. 2020, 12(21), 3612; https://doi.org/10.3390/rs12213612
by Jiguang Dai 1,2,3,4,5,6, Chengcheng Li 1,2,*, Yuqiang Zuo 4 and Haibin Ai 6
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 18 September 2020 / Revised: 30 October 2020 / Accepted: 2 November 2020 / Published: 3 November 2020

Round 1

Reviewer 1 Report

I appreciate changes that have been made in order to improve the manuscript. In my opinion the necessary explanations and supplementations are in some cases not sufficient enough. That is why I would like to list some minor remarks concerning parts of article that were partially covered with further explanations of the type of changes that are expected.

Minor remarks:

1) lines 492 – 496 – You refer to Table 1, while – in my opinion - there should be a reference to Table 3.

2) lines 497 – 510 – As you have explained: ‘The image shown for Experiment 3 is a remote sensing image of the GF2 optical satellite covering the suburban area of Huludao’ and you call it ‘urban’ in lines 506-508. It is misleading and should be described more clearly.

3) Have you noticed any improvements in geometric accuracy of OSM data over time as it seems that your experiments come from different time periods? If there are any it is better to discuss them.

 

Author Response

Comment

Response

Lines: 492 – 496

Thank you very much for your valuable time and for the comments you provided. Indeed, this is our mistake. According to your suggestion, we have changed "Table 1" to "Table 3" in line 531.

Comment 1:

You refer to Table 1, while – in my opinion - there should be a reference to Table 3.

Lines: 497 – 510

Thank you very much for your comment. This is our mistake. We are very sorry for not being clear. According to your suggestion, we have revised "urban" to "suburban" in line 546.

 

Comment 2:

As you have explained: ‘The image shown for Experiment 3 is a remote sensing image of the GF2 optical satellite covering the suburban area of Huludao’ and you call it ‘urban’ in lines 506-508. It is misleading and should be described more clearly.

Lines: Section 2

Thank you very much for your comments. We have not analyzed how the geometric accuracy of OSM data changes over time; this needs further study. Although our experimental images come from different time periods, the OSM data were all downloaded in essentially the same time period, so we did not carry out such an analysis. According to your comment, we have added "Their corresponding OSM data were downloaded at the same time from https://www.openstreetmap.org/."

Thank you for your time in reviewing this manuscript again. Your suggestions were very helpful in improving the quality of the manuscript. I wish you a happy life and work.

Comment 3:

Have you noticed any improvements in geometric accuracy of OSM data over time as it seems that your experiments come from different time periods? If there are any it is better to discuss them.


Reviewer 2 Report

The paper proposes a method to create positive road network samples from OpenStreetMap data in order to ease the application of deep learning classification algorithms.

Overall Comments

The authors have greatly improved their paper (structure and figures). They demonstrate that the proposed method provides results similar to those which can be obtained manually and better than other known algorithms. The document is highly technical and scientifically sound, the literature review seems comprehensive.

Major Comments

Lines

Comment

all

Leaving aside the technical aspects of your paper, the text is sometimes difficult to read. Proofread your text and make sure it will be understandable to most of the journal’s readers.

all

Some sections of the text require a serious English proofreading, both in terms of wording and syntax. I have identified many of them in minor comments but I suggest you find a native English speaker to review the whole document.

Minor Comments

Lines

Comment

24-27

Could the sentence be rephrased to make it clearer?

24

The term “fracture” should be replaced by “gaps between lines”.

53-55

Rephrase, the content repeats itself.

58-60

Rephrase.

63-64

Replace “ … measurement, and … accuracy [20]” by “ … measurement. Consequently, these data may not necessarily meet professional standards in terms of accuracy [20]”.

68-70

Rephrase. This is a key sentence, it could be made easier to read.

77

Replace “there are” by “their”.

77-78

Replace “The experiment … vs2013.” by “The algorithms were implemented in C ++ using Visual Studio 2013 platform.”

78-80

Since you don’t seem to know the accuracy of the ortho-photos, it would be sufficient to identify only their supplier.

82-86

Very important, the sentence could be made clearer.

87

Replace “Experiment 1” by “Experiment 1 (urban area)”.

89-92

Remove or rephrase because some of the words make no sense.

94-96

Sentences are incomplete. Information is missing to properly understand what you mean.

101

Replace “Experiment 2” by “Experiment 2 (rural area)”.

103-105

Remove or rephrase because some of the words used make no sense.

106

Replace “Experiment 3” by “Experiment 3 (suburban area)”.

108-110

Remove or rephrase because some of the words make no sense.

112-114

What about replacing the text by: “Inconsistencies in orientation and position appear when OSM roads are superimposed on the ortho-photos. These inconsistencies cannot be ignored as they will reduce the reliability of positive samples for roads”?

114-121

Rephrase.

125-126

Replace “The directions” by “Since the direction” and remove “so”.

127-131

Rephrase.

128

“in the article”? Do you mean “in the image” or “in the proposed method”?

128

“other features”? Do you mean “other image features”?

133

Leave a space between previous paragraphs and new sections (1, 2, …). The same comment applies to all new (sub) sections.

134-137

The method used is not clearly defined. After multiple readings I understand that you first assessed the discrepancy between OSM road network and the images (3 m) and then you used Equation 1 to create rectangular buffers along OSM road segments. You can add that in your approach, a road segment is defined by two adjacent OSM nodes. A reader must understand all that on a first reading.

 

Replace “nodes and connecting lines” by “connected nodes”.

135

“In this article…” should be replaced by “In the proposed method” or “In the proposed approach” and it should be used only when the context may create some confusion. The same comment applies over the whole text.

Figure 2

Line segments (red lines) are missing from the legend.

162

Replace “roads have” by “roads generally have”.

163-165

What about replacing “In the actual scene … linear characteristics” by “However, in an image, roads may be hidden by vehicles (traffic noise), shadow from surrounding buildings or by other phenomena”?

168-170

Rephrase.

191-193

It has already been said. I suggest stating this just once (see comment for lines 163-165). You could label this phenomenon as “obstructed areas” or “occlusion” and refer to this term when you really need to elsewhere in the text.

395

Remove “the blue”, only refer to OSM data. Readers will look at the legend.

397-399

This should have been said in the Method section.

475-483

Do not repeat what already appears in surrounding tables. Use these lines to discuss the results.

492-496

Same comment

492

Do you really refer to Table 1?
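
The buffer construction summarized in the comment on lines 134-137 (a rectangular buffer built along each OSM road segment, where a segment is defined by two adjacent OSM nodes) can be illustrated with a minimal sketch. The function names and the `half_width` parameter are hypothetical; the manuscript's Equation 1, which sets the actual buffer size, is not reproduced in this record, so the width used here is only a placeholder.

```python
import math

def segment_buffer(p0, p1, half_width):
    """Return the four corners of a rectangle centred on the segment p0-p1.

    half_width is a hypothetical parameter standing in for the buffer size
    that the manuscript derives from its Equation 1 (not shown here).
    """
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("nodes coincide; the segment has no direction")
    # Unit normal to the segment, scaled to the buffer half-width.
    ox = -dy / length * half_width
    oy = dx / length * half_width
    return [
        (x0 + ox, y0 + oy),
        (x1 + ox, y1 + oy),
        (x1 - ox, y1 - oy),
        (x0 - ox, y0 - oy),
    ]

def polyline_buffers(nodes, half_width):
    """Build one rectangular buffer per pair of adjacent nodes."""
    return [segment_buffer(a, b, half_width) for a, b in zip(nodes, nodes[1:])]
```

For example, `polyline_buffers([(0, 0), (10, 0), (10, 5)], 3.0)` yields two rectangles, one for each pair of adjacent nodes.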

Author Response

Dear reviewer:

First, thank you very much for taking time out of your busy schedule to read and review our article. Thank you for your valuable comments. You have provided comprehensive guidance on the content of our article, which plays a very important role in improving its quality. We would like to express our heartfelt thanks to you for giving us the opportunity to revise and improve our article.

We have carefully read the reviewer's comments and revised the article according to each one.

Comment

Response

Major comment

Thank you very much for your comments. Based on this comment and each of the following comments, we have revised the article to ensure that it is understandable to most of the journal’s readers.

Leaving aside the technical aspects of your paper, the text is sometimes difficult to read. Proofread your text and make sure it will be understandable to most of the journal’s readers.

Major comment

Thank you very much for your comments. We agree that there are some problems in the wording and syntax of our article, and we invited English-speaking experts to polish it. According to your suggestion, we again invited experts to refine our article in the revised draft.

Some sections of the text require a serious English proofreading, both in terms of wording and syntax. I have identified many of them in minor comments but I suggest you find a native English speaker to review the whole document.

Lines 24-27

Before modification

Response

Comment 1:

Could the sentence be rephrased to make it clearer?

Finally, the local texture self-similarity (LTSS) model is implemented to determine the road width, and the centerpoint autocorrection model and the random sample consensus(RANSAC) algorithm are used to extract the road centerline to complete the creation of road positive samples.

Thank you very much for your comments. We did not express ourselves clearly before. According to your comment, the original sentence has been restated in lines 24-28 and amended as follows: “Finally, a local texture self-similarity (LTSS) model is implemented to determine the road width, and a centerpoint autocorrection model and the random sample consensus(RANSAC) algorithm are used to extract the road centerline; and road width and road centerline are used to complete the creation of road positive samples.”

Lines 24

 

Thank you very much for your comment. According to your comment, we have replaced "fracture" with "gaps between the road lines" in line 24.

Comment 2:

The term “fracture” should be replaced by “gaps between lines”.

Lines 53-55

At present, the creation of sample sets is usually done manually, which greatly increases the difficulty in practical applications of deep networks, thus, the creation of sample sets is a bottleneck that limits the deep network applications.

Thank you very much for your comment. According to your comment, in lines 59-61, the original sentence has been rephrased to “At present, the sample creating is usually performed manually, which often leads to a shortage of sample sets, and reduces the performance of deep learning.”

Comment 3:

Rephrase, the content repeats itself.

Lines 58-60

Among the VGI projects, one of the most influential and far-reaching projects is OpenStreetMap (OSM), in which volunteers contribute and share map data from all over the world through crowdsourcing.

Thank you very much for your comment. According to your comment, in lines 82-83, the original sentence has been modified to “OSM is the most influential and far-reaching project in VGI, where volunteers share map data all over the world by using crowdsourcing.”

Comment 4:
Rephrase.

Lines 63-64

However, it is worth noting that the volunteers who participate in the project do not necessarily have the professional qualifications or background in the field of geographic data collection or measurement, and it is difficult to meet the professional requirements in terms of accuracy[20].

Thank you very much for your comment. According to your comment, in lines 85-86, the second part of the original sentence has been replaced by “Consequently, these data may not necessarily meet professional standards in terms of accuracy.”

Comment 5:

Replace “ … measurement, and … accuracy [20]” by “ … measurement. Consequently, these data may not necessarily meet professional standards in terms of accuracy [20]”.

Lines 68-70

Inspired by the help of OSM data to extract the road, to overcome the aforementioned shortcomings of the existing road positive sample creating methods, to enhance the universality of deep learning, this article presents a road positive sample creation method using OSM data.

Thank you very much for your comment. According to your comment, in lines 91-93, the original sentence has been modified to “Therefore, inspired by the road extraction that uses OSM data, to solve the problems of current road sample creation described above, we propose an OSM data-driven road positive sample creating method.”

Comment 6:

Rephrase. This is a key sentence, it could be made easier to read.

Lines 77

 

Thank you very much for your comment. According to your comment, in line 100 of the article, “there are” has been replaced by “their.”

Comment 7:

Replace “there are” by “their”.

Lines 77-78

The experiments are completed in C + + on the platform of vs2013.

Thank you very much for your comment. According to your comment, in lines 100-101 of the article, the original sentence has been replaced by “The algorithms were implemented in C ++ using Visual Studio 2013 platform.”

Comment 8:

Replace “The experiment … vs2013.” by “The algorithms were implemented in C ++ using Visual Studio 2013 platform.”

Lines 78-80

 

Thank you very much for your comments. According to your comment, in lines 103-106 of the article, the following has been added: “The orthophoto images in Experiment 1 and 2 were provided by Beijing Longyufangyuan Information Technology Co., Ltd, and the orthophoto images in Experiment 3 were provided by Beijing Guocexinghui Information Technology Co., Ltd.”

Comment 9:

Since you don’t seem to know the accuracy of the ortho-photos, it would be sufficient to identify only their supplier.

Lines 82-86

To verify the effectiveness and universality of the method, considering the relationship between OSM data and population density[24], the OSM data drawn in densely populated areas are usually more accurate than those in sparsely populated areas, and the road grades between urban and rural areas are different. The following three ortho corrected images covering urban, rural, and suburban areas were selected for the experiments.

Thank you very much for your comments. According to your comment, in lines 108-111, the original sentence has been modified to “To verify the effectiveness and universality of the method, considering that the OSM data in densely populated areas is generally more accurate than that in sparsely populated areas [36], and the road types between urban and rural areas are different. The following three ortho corrected images covering urban, rural, and suburban areas were selected for the experiments.”

Comment 10:

Very important, the sentence could be made clearer.

Lines 87

 

Thank you very much for your comment. According to your comment, in line 112 of the article, “Experiment 1” has been replaced by “Experiment 1 (urban area)”.

Comment 11:

Replace “Experiment 1” by “Experiment 1 (urban area)”.

Lines 89-92

 

Thank you very much for your comment. According to your comment, in lines 115-116 of the article, the original sentence has been rephrased to “The roads in the image may be hidden by a small amount of vehicle noise, shadow shading, and the local road width has continuous change phenomenon.”

Comment 12:

Remove or rephrase because some of the words make no sense.

Lines 94-96

Sample enhancement is a process from 1 to more. The purpose of this method is to create road positive samples automatically,that is a process from 0 to 1, which aims to achieve the automatic creation of positive road samples.

Thank you very much for your comment. According to your comment, in lines 118-120 of the article, the original sentence has been rephrased to “It is mainly considered that the current methods are usually based on prior samples for sample enhancement. However, our method is based on no prior samples to create samples automatically, which leads to less comparison methods.”

Comment 13:

Sentences are incomplete. Information is missing to properly understand what you mean.

Lines 101

 

Thank you very much for your comment. According to your comment, in line 125 of the article, “Experiment 2” has been replaced by “Experiment 2 (rural area)”.

Comment 14:

Replace “Experiment 2” by “Experiment 2 (rural area)”.

Lines 103-105

The road type in the image is a rural road, the road grade is the one-way lane, the road includes a small amount of vehicle noise, shadow shading, and the local road curvature is large.

Thank you very much for your comments. According to your comment, in lines 128-129 of the article, the original sentence has been rephrased to “Compared with urban roads, the characteristics of rural roads in the image show the larger curvature of local roads.”

Comment 15: Remove or rephrase because some of the words used make no sense.

Lines 106

 

Thank you very much for your comments. According to your comment, in line 130 of the article, “Experiment 3” has been replaced by “Experiment 3 (suburban area)”.

Comment 16:

Replace “Experiment 3” by “Experiment 3 (suburban area)”.

Lines 108-110

The image covers the suburban area, and the road types include urban road and provincial road. The road grade is divided into one-way lane and two-way road. A large amount of vehicle noise can be observed in the image.

Thank you very much for your comments. According to your comment, in line 128 of the article, the original sentence has been rephrased to “The image contains two different types of roads: urban road and provincial road.”

Comment 17:

Remove or rephrase because some of the words make no sense.

Lines 112-114

The misleading errors in OSM data can not be ignored, as they will reduce the reliability of road positive samples. The geometric accuracy of Orthophoto is high. There are orientation and position errors when the OSM data and orthophoto are superimposed.

Thank you very much for your comments. According to your comment, in lines 136-137 of the article, the original sentence has been replaced by “Inconsistencies in orientation and position appear when OSM is superimposed on the orthophotos. These inconsistencies cannot be ignored as they will reduce the reliability of the road positive samples.”

Comment 18:

What about replacing the text by: “Inconsistencies in orientation and position appear when OSM roads are superimposed on the ortho-photos. These inconsistencies cannot be ignored as they will reduce the reliability of positive samples for roads”?

Lines 114-121

Hence, to obtain more reliable road positive samples, with regard to the method of the article, (a)in section 2.2.1, a LSOH model is constructed to determine the local road direction, (b)in section 2.2.2, a road homogeneity constraint rule,a road texture feature statistical model, and a polar coordinate constraint rule are used to extract local road line set, (c)in section 2.2.3, the iterative interpolation algorithm is used to connect the local road lines on both sides of the fracture, (d)in section 2.2.4, a LTSS model, the centerpoint autocorrection model and RANSAC algorithm are used to create road positive samples. Figure 1 provides the flow chart of the proposed method.

Thank you very much for your comment. According to your comment, in lines 138-145 of the article, the original sentence has been rephrased to “Hence, to obtain more reliable road positive samples, with regard to the proposed method of the article, (a) in section 2.2.1, we propose a LSOH model to determine the local road direction; (b) in section 2.2.2, we propose the road homogeneity constraint rules, road texture feature statistical model and polar constraint rule to extract the local road line set; (c)in section 2.2.3, the iterative interpolation algorithm is used to connect the local road lines on both sides of the gaps between the road lines; (d)in section 2.2.4, a LTSS model, a centerpoint autocorrection model and the RANSAC algorithm are used in turn to extract road width and road centerline to complete the creation of road positive samples. Figure 1 shows a flow chart of the proposed method.”

Comment 19: Rephrase

 

Lines 125-126

 

Thank you very much for your comment. According to your comment, first, in line 149 of the article, “The directions” has been replaced by “Since the direction”; second, in line 150 of the article, “so” has been deleted.

Comment 20: Replace “The directions” by “Since the direction” and remove “so”.

Lines 127-131

The line segment between a pair of adjacent nodes of OSM data is defined as the local OSM vector, and the corrected direction of local OSM vector is defined as the local road direction in the article.The edges of roads and other features can indicate the direction of the road. For example, the edge of the indication line in the road, the edge of the motor vehicle, the edge of the separation zone, and the edge of both sides of the building can be considered.

Thank you very much for your comment. According to your comment, in lines 150-155 of the article, the original sentence has been rephrased to “The line segment between a pair of adjacent nodes of OSM data is defined as the local OSM vector, and the local road direction is consistent with the corrected direction of local OSM vector in the proposed method. Generally, there is a certain relationship between the road direction and the edge information in the road neighborhood. For example, the edge information of the indication line in the road, the motor vehicle, the separation zone, and buildings is consistent with the road direction.”

Comment 21: Rephrase.

Lines 128

 

Thank you very much for your comment. Our earlier expression was not sufficiently clear, which resulted in ambiguity. According to your comment, in line 152 of the article, “in the article” has been replaced by “in the proposed method.”

Comment 22:

“in the article”? Do you mean “in the image” or “in the proposed method”?

Lines 128

The edges of roads and other features can indicate the direction of the road.

Thank you very much for your comment. Our earlier expression was not sufficiently clear, which resulted in ambiguity. According to your comment, in lines 152-153 of the article, the original sentence has been replaced by “Generally, there is a certain relationship between the road direction and the edge information in the road neighborhood.”

Comment 23:

“other features”? Do you mean “other image features”?

Lines 133

 

Thank you very much for your comment. According to your comment, we have left a space between the previous paragraph and the new section, and all new (sub) sections of the manuscript have been modified.

Comment 24: Leave a space between previous paragraphs and new sections (1, 2, …). The same comment applies to all new (sub) sections.

Lines 134-137

a pair of adjacent nodes is used as the buffer axis.

Thank you very much for your comment. According to your comment, in line 161 of the article, the original sentence has been rephrased to “the local OSM vector is used as the buffer axis”.

Comment 25: The method used is not clearly defined. After multiple readings I understand that you first assessed the discrepancy between OSM road network and the images (3 m) and then you used Equation 1 to create rectangular buffers along OSM road segments. You can add that in your approach, a road segment is defined by two adjacent OSM nodes. A reader must understand all that on a first reading.

Lines 134

nodes and connecting lines.

Thank you very much for your comment. According to your comment, in line 160 of the article, “nodes and connecting lines” has been replaced by “connected nodes.”

Comment 26:

Replace “nodes and connecting lines” by “connected nodes”.

Lines 135

 

Thank you very much for your comment. According to your comment, in line 161 of the article, “In this article” has been replaced by “In the proposed method”, and the full text has been revised accordingly.

Comment 27:

“In this article…” should be replaced by “In the proposed method” or “In the proposed approach” and it should be used only when the context may create some confusion. The same comment applies over the whole text.

Figure 2

 

Thank you very much for your comment. According to your comment, in line 166 of the article, we have described the red line in the legend. The modified picture is as follows:

Comment 28:

Line segments (red lines) are missing from the legend.

Lines 162

 

Thank you very much for your comment. According to your comment, in line 188 of the article, “roads have” has been replaced by “roads generally have”.

Comment 29: Replace “roads have” by “roads generally have”.

Lines 163-165

In the actual scene … linear characteristics

Thank you very much for your comments. According to your comment, in lines 189-190 of the article, “In the actual scene … linear characteristics” has been replaced by “However, in an image, roads may be hidden by vehicles (traffic noise), shadow from surrounding buildings or other phenomena.”

Comment 30: What about replacing “In the actual scene … linear characteristics” by “However, in an image, roads may be hidden by vehicles (traffic noise), shadow from surrounding buildings or by other phenomena”?

Lines 168-170

A road segment in the image corresponding to the local OSM vector after direction correction is defined as the local road line,and    a road segment set composed of several adjacent local road lines with the same direction is defined as the local road line set in this article.

Thank you very much for your comment. According to your comment, in lines 193-195 of the article, the original sentence has been replaced by “A road segment in the image corresponding to the local OSM vector is defined as the local road line, and several adjacent local road lines with the same direction constitute the local road line set.”

Comment 31: Rephrase.

Lines 191-193

Sometimes, there are vehicles and other noises in the local area of road

Thank you very much for your comment. According to your comment, in line 218 of the article, “Sometimes, there are vehicles and other noises in the local area of road” has been replaced by “Sometimes, in an image, local roads may be hidden by noises.”

Comment 32: It has already been said. I suggest stating this just once (see comment for lines 163-165). You could label this phenomenon as “obstructed areas” or “occlusion” and refer to this term when you really need to elsewhere in the text.

Lines 395

 

Thank you very much for your comment. According to your comment, in line 438 of the article, "The blue" has been removed.

Comment 33: Remove “the blue”, only refer to OSM data. Readers will look at the legend.

Lines 397-399

vehicle interference and tree shadow shading are difficult problems in road sample creation. To fully verify the effectiveness of this method, the proposed method is compared with the CNN model and Unet model in this article.

Thank you very much for your comment. According to your comment, we have described it in lines 115-116 of the first section.

Comment 34: This should have been said in the Method section.

Lines 475-483

The specific data are shown in Table 2. Compared with the 96% integrity rate of the method in this article, the results of road extraction using the UNet network are unsatisfactory at, approximately 81%, while the results of road extraction using the CNN network model are even lower at, approximately 68%. In the accuracy rate evaluation, the extraction results of the two deep learning models can reach approximately 85%, compared with the 97% accuracy rate of the proposed method, indicating there is still a certain gap. In terms of extraction quality, the CNN network model road extraction results show the lowest extraction quality at, approximately 68%, which is 12% lower than the UNet network results and approximately 30% lower than the results of the proposed method.

Thank you very much for your comment. According to your comment, in lines 517-524 of the article, the original sentence has been revised to “The specific data are shown in Table 2. In terms of the integrity rate, accuracy rate and extraction quality, the UNet network and CNN network are lower than the proposed method. This is because deep learning is a supervised learning method. In the process of network training, the network parameters are updated by iterating the network training based on the existing samples, so that the network model can effectively extract and represent the deep features and complete the complex feature mapping task. Therefore, the number, quality, type of samples and training model will impact the results, which makes the performance of a network model limited and fails to fully reflect the advantages of the deep learning method.”

Comment 35: Do not repeat what already appears in surrounding tables. Use these lines to discuss the results.

Lines 492-496

As shown in Table 3, the extraction results of Experiment 1 showed high integrity of 96.58%, accuracy of 97.08% and extraction quality of 93.85%, while the extraction accuracy of Experiment 2 is relatively low,with a value of 97.02%, and the integrity and extraction quality are in the middle level, The integrity and extraction quality of Experiment 3 are relatively low, but they can still reach approximately 85%.

Thank you very much for your comment. According to your comment, in lines 531-533 of the article, the original sentence has been revised to “As shown in Table 3, the results of experiments show that the extraction results of the proposed method suggest high integrity. Although Experiment3 is greatly disturbed by noise, the integrity and extraction quality still reach approximately 85%.”

Comment 36:

Same comment

Lines 492

 

Thank you very much for your valuable time and for the comments that you provided. This is our mistake. According to your suggestion, we have changed "Table 1" to "Table 3" in line 531.

Comment 37: Do you really refer to Table 1?
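
The three evaluation figures discussed in the responses to Comments 35 and 36 (integrity rate, accuracy rate and extraction quality) can be illustrated with a short sketch. It assumes, without confirmation from this record, that they correspond to the completeness, correctness and quality measures commonly used in road extraction; the function names and the TP/FP/FN inputs are hypothetical. Under that assumption, the Experiment 1 values quoted above (96.58%, 97.08% and 93.85%) are mutually consistent.

```python
def extraction_metrics(tp, fp, fn):
    """Completeness, correctness and quality from matched road lengths or pixels.

    tp: extracted road that matches the reference
    fp: extracted road with no match in the reference
    fn: reference road that was not extracted
    Assumed mapping: integrity ~ completeness, accuracy ~ correctness.
    """
    completeness = tp / (tp + fn)
    correctness = tp / (tp + fp)
    quality = tp / (tp + fp + fn)
    return completeness, correctness, quality

def quality_from_rates(completeness, correctness):
    """Quality implied by the other two rates under the same definitions."""
    return 1.0 / (1.0 / completeness + 1.0 / correctness - 1.0)

# Consistency check against the Experiment 1 figures quoted in the record.
implied_quality = quality_from_rates(0.9658, 0.9708)
print(f"{implied_quality:.4f}")
```

Because quality penalizes both missed and spurious roads, it can never exceed either of the other two rates, which matches the ordering of the reported values.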

 

 


Reviewer 3 Report

In this paper, a method of generating target positive samples is proposed, which aims to achieve a high accuracy positive sample labeling method. However, the comparison methods are the conventional deep learning classification and semantic segmentation, so the so-called positive sample generation in this paper is a method to predict the target at pixel level, and compared with the prediction results of deep learning.

In fact, in the field of deep learning, there are many papers on sample labeling. Most of the researches have realized automatic sample labeling by weak supervision or semi supervision. Therefore, if this paper is committed to the realization of the automatic annotation of samples, then the comparison with such methods should be the focus of this paper.

Secondly, considering that deep learning is a data-driven algorithm model, training samples and training methods will affect the prediction results, but this paper does not give a detailed description, especially what kind of network structure CNN refers to.

An algorithm should be examined not only for accuracy but also for efficiency, which is not addressed in this paper. In addition, the English expression of the full text needs to be improved.

There are still some problems to be clarified.

  1. In line 39, the expression "deep network methods" is not rigorous. From the point of view of this paper, the author only explains some methods of deep convolution neural network, so it is more accurate to use “deep convolution neural network (DCNN)”. After all, networks other than DCNNs can also be called deep networks.

  2. In line 52, the author commented that "when the test image is different from the sample set, the performance of the deep network is greatly affected". However, from the full text, it is unclear whether the method proposed in this paper can solve this problem.

  3. The determination of some parameters depends on the statistical results of the data set, such as the angle threshold in LSOH. Then, does another data set follow the same parameters? Do the fixed parameters affect the generalization ability of this method?

  4. Timeliness problem. In the road homogeneity constraint rule, the author proposes to use “a template” to find the centerpoint pixel by pixel. Obviously, this step alone will take a lot of time. This exhaustive traversal method of feature point extraction has high time complexity.

  5. In Experiment 1, the author said that, due to tree occlusion, there were errors in the CNN road extraction results, namely overfitting. But I don't think it's CNN's problem. The concept of "road under the tree" may be an a posteriori problem. It is difficult for neural networks to learn experience beyond the pixel information of images. This is caused by the mechanism of deep learning image interpretation. It is precisely because the CNN can accurately extract the pixels belonging to the "road" category that it does not infer the trees in the image as roads. The same phenomenon also exists in the results of UNet. If we use the proposed method to judge the tree as a part of the road, then the purity of the positive samples will be affected. For the learning machine, the positive samples collected in this way will be ambiguous, which will inevitably reduce the accuracy of learning.

  6. The comparison of experiments is not fair. The premise of this paper is to use OSM data as a priori knowledge to guide the following geometric methods to realize road marking, which is equivalent to using all the data for supervised learning, although the method in this paper is not a learning system. The OSM data, especially the road route line, has no auxiliary effect on the learning of the deep neural network. In other words, the deep neural network can only learn with the help of the marked pixel information, cannot introduce the deterministic road line information, and only uses part of the data for training. Deterministic prior knowledge is more conducive to target extraction than the knowledge obtained by learning part of the data.

Author Response

Dear reviewer:

First, thank you very much for taking time out of your busy schedule to read and review our article. Thank you for your valuable comments. You have provided comprehensive guidance on the content of our article, which plays a very important role in improving its quality. We would like to express our heartfelt thanks to you for giving us the opportunity to revise and improve our article.

We have carefully read the reviewer's comments and revised the article in response to each comment.

Comment

After modification

Major Comment:

In this paper, a method of generating target positive samples is proposed, which aims to achieve high-accuracy positive sample labeling. However, the comparison methods are conventional deep learning classification and semantic segmentation, so the so-called positive sample generation in this paper is a method that predicts the target at the pixel level and is compared with the prediction results of deep learning.

In fact, in the field of deep learning, there are many papers on sample labeling. Most of this research has realized automatic sample labeling by weak supervision or semi-supervision. Therefore, if this paper is committed to the automatic annotation of samples, then the comparison with such methods should be the focus of this paper.

Secondly, considering that deep learning is a data-driven algorithm model, training samples and training methods will affect the prediction results, but this paper does not give a detailed description of them, especially of what kind of network structure CNN refers to.

An algorithm should be examined not only for accuracy but also for efficiency, which is not addressed in this paper. In addition, the English expression of the full text needs to be improved.

 

Thank you very much for your comment. According to your comment, we have added a description of the sample production method, the CNN network structure, and the time consumption. The specific modifications are as follows:

1. In lines 62-73 of the article, we have added a description of the sample production method, as follows: “At present, there are three types of sample creating methods, namely, strong supervised learning, weak supervised learning, and semi supervised learning. Among them, strong supervised learning simplifies the problems in the real scene, which means that it does not hold true in many real scenes. What we get in real scenes is relatively weak supervision information, so weak supervised learning and semi supervised learning have been widely considered by researchers [21-23]. Generally, weak supervision can be classified into incomplete supervision, imprecise supervision and inaccurate supervision. Semi supervised learning uses a small number of labeled samples and a large number of unlabeled samples to train the classification model. Semi supervised learning is mainly divided into the four categories of a generative-based method [24,25], low density segmentation-based method [26,27], divergence-based method [28,29], and graph-based method [21-23]. The above methods are based on the existing samples, but the problem of creating samples without prior samples has not been discussed in depth.”

2. In lines 368-372 of the article, we have added a description of the CNN network structure, as follows: “An improved CNN structure based on the road [46] first uses the first 13 convolution layers of VGG [45] to extract levels instead of manual features in previous road extraction methods. Second, three additional convolution layers are used to adapt to the road structure. Then, a deconvolution and fusion layer are combined, and a cross entropy loss function with road structure constraints is proposed.”

3. In lines 527-528 of the article, we have added information on algorithm efficiency, as follows: “Furthermore, because the deep learning method requires to obtain the sample set in advance, the proposed thus method has obvious advantages in time use.” In line 534 of the article, we have added: “Moreover, the three experiments took different times, among which Experiment1 took the longest time but was still within 10 minutes.”

Table 2. Results of different methods

Method | Com (%) | Cor (%) | Q (%) | Time (min)
The proposed method | 96.58 | 97.08 | 93.85 | 9
CNN network | 71.34 | 89.51 | 68.85 | 366
U Net network | 81.25 | 86.57 | 80.79 | 352

Table 3. Results of the proposed method

Experiment | Com (%) | Cor (%) | Q (%) | Time (min)
Experiment 2 | 94.21 | 97.02 | 91.56 | 4
Experiment 3 | 85.22 | 97.71 | 83.55 | 7

4. Thank you very much for your comments. We invited English-speaking experts to polish the article. According to your suggestion, we have again invited experts to refine the revised draft.
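The Com, Cor and Q columns in Tables 2 and 3 can be related by a short calculation. The sketch below assumes the definitions commonly used in road extraction, which this response does not spell out: completeness Com = TP/(TP+FN), correctness Cor = TP/(TP+FP) and quality Q = TP/(TP+FP+FN). The function names are illustrative only and are not taken from the article.

```python
from typing import Tuple


def road_metrics(tp: float, fp: float, fn: float) -> Tuple[float, float, float]:
    """Return (Com, Cor, Q) in percent from matched/unmatched road lengths.

    Assumed definitions (not stated in the response itself):
      Com = TP / (TP + FN), Cor = TP / (TP + FP), Q = TP / (TP + FP + FN).
    """
    com = 100.0 * tp / (tp + fn)
    cor = 100.0 * tp / (tp + fp)
    q = 100.0 * tp / (tp + fp + fn)
    return com, cor, q


def quality_from(com_pct: float, cor_pct: float) -> float:
    """Quality (%) implied by completeness and correctness (%).

    Under the assumed definitions, 1/Q = 1/Com + 1/Cor - 1.
    """
    com = com_pct / 100.0
    cor = cor_pct / 100.0
    return 100.0 / (1.0 / com + 1.0 / cor - 1.0)


# Example with the proposed-method row of Table 2.
q_table2 = quality_from(96.58, 97.08)
print(round(q_table2, 2))
```

Under these assumed definitions, Com = 96.58 and Cor = 97.08 give a quality of about 93.85, which agrees with the proposed-method row of Table 2.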

Comment 1:

In line 39, the expression "deep network methods" is not rigorous. From the point of view of this paper, the author only explains some methods of deep convolution neural network, so it is more accurate to use “deep convolution neural network (DCNN)”. After all, networks other than DCNNs can also be called deep networks.

Thank you very much for your comment. According to your comment, in line 40 of the article, the original phrase "deep network" has been revised to “deep convolution neural network.”

Comment 2:

In line 52, the author commented that "when the test image is different from the sample set, the performance of the deep network is greatly affected". However, from the full text, it is unclear whether the method proposed in this paper can solve this problem.

 

Thank you very much for your comment. The expression in our article may not be sufficiently clear. Our method concerns traditional methods for sample creation, which do not involve a “deep convolution neural network”. Therefore, the problem that "when the test image is different from the sample set, the performance of the deep network is greatly affected" needs to be solved by further research on deep learning.

 

Comment 3:

The determination of some parameters depends on the statistical results of the data set, such as the angle threshold in LSOH. Then, does another data set follow the same parameters? Do the fixed parameters affect the generalization ability of this method?

Thank you very much for your excellent comment. The parameter setting usually depends on the statistical results of the data set. When other data sets are used, our method applies the same parameters, which will affect the generalization ability of the method. For example, when the road is covered by large shadows or motor vehicles, or the road edge is blurred, the generalization of our method will certainly decline. At present, the premise of our method is that the road is clear and the interference is relatively small. In a complex scene, the generalization of sample creation is indeed low, which is research work that we need to carry out in the future.

Comment 4:

Timeliness problem. In the road homogeneity constraint rule, the author proposes to use “a template” to find the centerpoint pixel by pixel. Obviously, this step alone will take a lot of time. This exhaustive traversal method of feature point extraction has high time complexity.

Thank you very much for your comment. It is true, as the reviewer said, that using templates to find the centerpoint pixel by pixel has a high time complexity, but this is the best method we have come up with so far. The problem of time complexity requires follow-up research in the future.
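To make the cost of a pixel-by-pixel template search concrete, the following is a minimal sketch of a generic sliding-window homogeneity scan. It is not the template or the homogeneity rule used in the article; the function name, the variance criterion and the `max_var` threshold are assumptions chosen only to show why the work grows with image height × image width × template area.

```python
def homogeneity_scan(image, half, max_var):
    """Scan every interior pixel with a (2*half+1) x (2*half+1) window.

    Returns the centres whose window variance is <= max_var, together with
    the number of pixel reads, which grows as H * W * (2*half+1)**2.
    The variance test is a hypothetical stand-in for a homogeneity rule.
    """
    height, width = len(image), len(image[0])
    hits = []
    reads = 0
    for r in range(half, height - half):
        for c in range(half, width - half):
            # Collect the grey values covered by the template at (r, c).
            window = [image[rr][cc]
                      for rr in range(r - half, r + half + 1)
                      for cc in range(c - half, c + half + 1)]
            reads += len(window)
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            if var <= max_var:
                hits.append((r, c))
    return hits, reads


# Toy image: a noisy first column next to a uniform band of grey value 100.
toy = [[0 if r % 2 == 0 else 255] + [100] * 5 for r in range(5)]
centres, reads = homogeneity_scan(toy, half=1, max_var=1.0)
print(len(centres), reads)
```

Even on this 5 × 6 toy image, the 12 interior centres each require 9 reads; on a full remote sensing scene the same nested loops dominate the running time, which is the concern raised in the comment.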

Comment 5:

In Experiment 1, the author said that, due to tree occlusion, there were errors in the CNN road extraction results, namely overfitting. But I don't think it's CNN's problem. The concept of "road under the tree" may be an a posteriori problem. It is difficult for neural networks to learn experience beyond the pixel information of images. This is caused by the mechanism of deep learning image interpretation. It is precisely because the CNN can accurately extract the pixels belonging to the "road" category that it does not infer the trees in the image as roads. The same phenomenon also exists in the results of UNet. If we use the proposed method to judge the tree as a part of the road, then the purity of the positive samples will be affected. For the learning machine, the positive samples collected in this way will be ambiguous, which will inevitably reduce the accuracy of learning.

Thank you very much for your comments. It is true, as the reviewer says, that if we use the proposed method to judge the tree as part of a road, it will affect the purity of positive samples. Therefore, it is difficult to detect the road under the tree. In response to this problem, in lines 442-443 of the article, the original sentence has been revised to “there are errors in road extraction, which are caused by the deep learning mechanism.”

Comment 6:

The comparison of experiments is not fair. The premise of this paper is to use OSM data as a priori knowledge to guide the following geometric methods to realize road marking, which is equivalent to using all the data for supervised learning, although the method in this paper is not a learning system. The OSM data, especially the road route line, has no auxiliary effect on the learning of the deep neural network. In other words, the deep neural network can only learn with the help of the marked pixel information, cannot introduce the deterministic road line information, and only uses part of the data for training. Deterministic prior knowledge is more conducive to target extraction than the knowledge obtained by learning part of the data.

Thank you very much for your comments. In fact, we also found this problem when we chose the comparison method. First, from the perspective of prior knowledge, the accuracy of OSM data in our method is not high, whereas the samples of a neural network are accurate. Second, our method is completely different from a neural network: our method is based on geometry and texture analysis, whereas a neural network is a data-driven model.

However, there is no perfect automatic sample creating method for comparison. Following the suggestions given by reviewers before, we use two neural network models for comparison. In the future, we will continue to track the literature and attempt to select a fairer method for comparison.

Author Response File: Author Response.pdf

Reviewer 4 Report

1. The article’s title “Road Positive Sample Creation Method Combined with OSM Data” seems awkward. I suggest that the authors change the title of this article.

2. The literature review is insufficient. It must be extended. There is generally no review of prior studies, so the authors must add a literature review showing who dealt with the same topic, who did research projects on that topic and what the current knowledge in this field is. What are the prior studies? How will your article fill the gap in our knowledge?

3. The authors merely combine the Open Street Data with widely used methods, but originality or novelty of this article is lacking. The authors need to clarify the article's novelty and value in the introduction section, especially the novelty in using Road Positive Sample Creation Method Combined with OSM (OpenStreetMap) Data, because this method seems to be used with other geographic data.

4. I don’t know why the words from line 87-110 are in red.

5. In the final conclusions you must add the implications for further studies. What topics should be undertaken by you and other researchers in the future?

Author Response

Dear reviewer:

First, thank you very much for taking time out of your busy schedule to read and review our article. Thank you for your valuable comments. You have provided comprehensive guidance on the content of our article, which plays a very important role in improving its quality. We would like to express our heartfelt thanks to you for giving us the opportunity to revise and improve our article.

We have carefully read the reviewer's comments and revised the article in response to each comment.

Comment

After modification

Comment 1:

The article’s title “Road Positive Sample Creation Method Combined with OSM Data” seems awkward. I suggest that the authors change the title of this article.

Thank you very much for your comment. According to your comment, we have revised the title to “An OSM Data-driven Road Positive Sample Creating Method” to emphasize the importance of OSM data in sample creation.

Comment 2:

The literature review is insufficient. It must be extended. There is generally no review of prior studies, so the authors must add a literature review showing who dealt with the same topic, who did research projects on that topic and what the current knowledge in this field is. What are the prior studies? How will your article fill the gap in our knowledge?

Thank you very much for your comments. According to your comments, in the Introduction, we have added current knowledge in this field, prior studies, etc., with specific revisions as follows:

1. First, we have added a description of prior studies, as follows. In lines 43-44 of the article: “Furthermore, Kass et al. [9] proposed a snake model, which fully uses the road geometric features and extracts roads by solving the extreme value of the energy function in a certain region.” In lines 46-47 of the article: “In this method, the road is regarded as a region with a certain geometric regularity and texture homogeneity. The road region is divided by segmentation, and the road is extracted by classification and post-processing methods.” In lines 54-58 of the article: “For example, Teerapong et al. [16] proposed an enhanced deep convolution neural network framework. The activation function of the exponential linear unit is embedded into the neural network to extract roads, and then, a method based on landscape measurement and a conditional random field is used to improve the accuracy of road extraction.” In lines 76-80 of the article: “For example, Cao and Sun [31] determined the candidate road seed points through an analysis of GPS points, ascertained the segmentation peak perpendicular to the road by filtering and difference, established the road centerpoint through the gray mean value and segmentation size, and connected the road centerpoints to form the road centerline.” In line 88 of the article: “For example, Li et al. [32] proposed a method for extracting highway networks from OSM data.”

2. Second, in lines 62-71 of the article, we have added a description of current sample creating methods, as follows: “At present, there are three types of sample creating methods, namely, strong supervised learning, weak supervised learning, and semi supervised learning. Among them, strong supervised learning simplifies the problems in the real scene, which means that it does not hold true in many real scenes. What we get in real scenes is relatively weak supervision information, so weak supervised learning and semi supervised learning have been widely considered by researchers [21-23]. Generally, weak supervision can be classified into incomplete supervision, imprecise supervision and inaccurate supervision. Semi supervised learning uses a small number of labeled samples and a large number of unlabeled samples to train the classification model. Semi supervised learning is mainly divided into the four categories of a generative-based method [24,25], low density segmentation-based method [26,27], divergence-based method [28,29], and graph-based method [21-23].”

Comment 3:

The authors merely combine the Open Street Data with widely used methods, but originality or novelty of this article is lacking. The authors need to clarify the article's novelty and value in the introduction section, especially the novelty in using Road Positive Sample Creation Method Combined with OSM (OpenStreetMap) Data, because this method seems to be used with other geographic data.

Thank you very much for your comment. We propose an OSM Data-driven Road Positive Sample Creating Method. At present, there are some studies on extracting roads with geographic data, but compared with other geographic data, OSM data cover a wide range and are available free of charge; therefore, we use OSM as auxiliary data. There is currently no perfect method to create road samples by using OSM data. Therefore, our method is valuable. According to your comment, in the Introduction, we have added a description of the novelty and value of the research.

1. In lines 71-73 of the article, we have added a description of the novelty of the research: “The above methods are based on the existing samples, but the problem of creating samples without prior samples has not been discussed in depth.”

2. In lines 80-82 of the article, the following has been added: “The above methods can extract roads from geographic data, but the extracted roads are misclassified or fractured, which leads to the low accuracy of extraction and sometimes requires manual participation.”

3. In lines 89-90 of the article, the following has been added: “However, due to the complexity of road structures, this method easily identifies non-roads as multi-lane roads.”

4. In lines 90-93 of the article, the following has been added: “Compared with other geographic data, OSM data cover a wide range and are easy to obtain. Therefore, inspired by the road extraction that uses OSM data, to solve the problems of current road sample creation described above, we propose an OSM data-driven road positive sample creating method.”

Comment 4:

I don’t know why the words from line 87-110 are in red.

Thank you very much for your comment. Because of our negligence, the font color was set to red. This was our mistake. According to your comment, in lines 130-134 of the article, we have revised the color of the font to black.

Comment 5:

In the final conclusions you must add the implications for further studies. What topics should be undertaken by you and other researchers in the future?

Thank you very much for your comment. According to your comment, in the Conclusion, we have added comments about future research. Specifically, the following is added: “However, the proposed method in this article still has shortcomings. For example, the template is used to find the center point pixel by pixel, and the parameter setting usually depends on the statistical results of the data set, which will affect the timeliness and generalization ability of the method. In addition, the premise of creating road positive samples in this article is that the road is clear, the interference is relatively small, and the robustness of creating samples in complex scenes is low. In view of the above problems, we will further study in the future to improve the robustness, generalization and timeliness of the method.”

 

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

For the comparison of time consumption, the author needs to provide the configuration of the computing hardware. Secondly, the time of CNN and UNET is significantly longer than that of the proposed method. However, the author does not indicate whether the time consumption of the DCNN methods is training time or prediction time, or the sum of the two. In fact, the prediction efficiency of deep learning methods should not be so low. Experiments show that an NVIDIA GPU can process 10000 * 10000 images in about 15 minutes. Therefore, the time information provided in this paper is not objective.

 

As the author said, in other data sets, the proposed method, using fixed parameters for processing, may bring bad results, which is obviously a disadvantage of the proposed method compared with the deep learning method. An in-depth analysis of the boundary conditions of the method would be of great help to its applicability.

 

It is difficult for the author to find a suitable comparison method, which leaves the method without sufficient evidence. It is possible that the method using artificial features such as geometry and texture is superior to deep learning in some aspects. However, in the field of road extraction using artificial features, there are many comparison methods to choose from, such as road tracking, snake, image gradient method, etc., which can be used as effective comparison methods.

Author Response

Dear reviewer:

First, thank you very much for taking time out of your busy schedule to read and review our article. Thank you for your valuable comments. You have provided comprehensive guidance on the content of our article, which plays a very important role in improving its quality. We would like to express our heartfelt thanks to you for giving us the opportunity to make minor revisions and improve our article.

We have carefully read the reviewer's comments and revised the article in response to each comment.

 

The following is the response to Reviewer 3:

 

Comment

Response

For the comparison of time consumption, the author needs to provide the configuration of the computing hardware. Secondly, the time of CNN and UNET is significantly longer than that of the proposed method. However, the author does not indicate whether the time consumption of the DCNN methods is training time or prediction time, or the sum of the two. In fact, the prediction efficiency of deep learning methods should not be so low. Experiments show that an NVIDIA GPU can process 10000 * 10000 images in about 15 minutes. Therefore, the time information provided in this paper is not objective.

Thank you very much for your comment. According to your comment, first, we have added hardware configuration in lines 122-123 as follows: "The proposed method is implemented using a PC with an NVIDIA GTX 1060TI and 8 GB of onboard memory."

Secondly, as you said, the time we provided before is the sum of the training time and the prediction time. According to your comment, in lines 530-531, we have explained the time provided in the paper as follows: "As shown in Table 2, the total training and prediction time of CNN and U Net were 366 minutes and 352 minutes, respectively. "
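The distinction between training time and prediction time raised by the reviewer can be reported explicitly by timing the two phases separately. The following is a minimal sketch using only the Python standard library; `train` and `predict` are hypothetical stand-ins, not the CNN or U Net code used in the article.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def train(samples):
    # Hypothetical training step: the "model" is just the mean grey value.
    return sum(samples) / len(samples)


def predict(model, pixels):
    # Hypothetical prediction step: label pixels brighter than the model value.
    return [p > model for p in pixels]


model, train_s = timed(train, [10, 20, 30])
labels, predict_s = timed(predict, model, [5, 25, 40])

# Report the phases separately as well as their sum.
report = {"train_s": train_s, "predict_s": predict_s,
          "total_s": train_s + predict_s}
print(labels, report)
```

Reporting the two figures side by side makes it clear whether a total such as the 366 minutes in Table 2 is dominated by training or by prediction.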

As the author said, in other data sets, the proposed method, using fixed parameters for processing, may bring bad results, which is obviously a disadvantage of the proposed method compared with the deep learning method. An in-depth analysis of the boundary conditions of the method would be of great help to its applicability.

Thank you very much for your comment. According to your comment, we have added the boundary condition analysis of this method in the Conclusion. The specific contents are as follows: “In addition, the premise of creating road positive samples in this article is that the road is clear, the interference is relatively small, it is not suitable for images with noise accounting for more than 20% of the total road area or without corresponding OSM data.”
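The boundary condition quoted above can be read as a simple applicability check. The sketch below is only an illustration of that reading: the function name and arguments are hypothetical, and how the noise area is measured is an assumption, since the response does not define it.

```python
def is_applicable(noise_area, road_area, has_osm, max_noise_ratio=0.20):
    """Return True when an image falls inside the stated boundary conditions.

    noise_area and road_area are in the same units (for example pixels).
    The 20% limit comes from the quoted sentence; the way noise is
    measured is an assumption of this sketch.
    """
    if not has_osm or road_area <= 0:
        # No corresponding OSM data (or no road area) means the method
        # has nothing to start from.
        return False
    return noise_area / road_area <= max_noise_ratio


print(is_applicable(15, 100, True))   # noise is 15% of the road area
print(is_applicable(25, 100, True))   # noise is 25% of the road area
print(is_applicable(5, 100, False))   # no OSM data available
```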

It is difficult for the author to find a suitable comparison method, which leaves the method without sufficient evidence. It is possible that the method using artificial features such as geometry and texture is superior to deep learning in some aspects. However, in the field of road extraction using artificial features, there are many comparison methods to choose from, such as road tracking, snake, image gradient method, etc., which can be used as effective comparison methods.

 

Thank you very much for your comments. We have done some research on this aspect before, but unfortunately, compared with deep learning methods, these methods have shortcomings in accuracy and automation. For example, the road tracking method requires manual participation in point filling, the snake method needs the initial contour to be determined manually, and the gradient method has lower accuracy than the deep learning method. After consideration, and in combination with the previous peer review recommendations, we chose the deep learning methods, with their better accuracy and degree of automation, for the comparative experiment.

Thank you for your time in reviewing this manuscript again. Your comments were very helpful in improving the quality of the manuscript. I wish you a happy life and work.

Author Response File: Author Response.pdf

Reviewer 4 Report

This new version of the paper is a large improvement over the previously submitted version. The authors still need to proofread and polish the manuscript.

Author Response

Dear reviewer:

First, thank you very much for taking time out of your busy schedule to read and review our article. Thank you for your valuable comments. You have provided comprehensive guidance on the content of our article, which plays a very important role in improving its quality. We would like to express our heartfelt thanks to you for giving us the opportunity to make minor revisions and improve our article.

 

The following is the response to Reviewer 4:

Comment

Response

This new version of the paper is a large improvement over the previously submitted version. The authors still need to proofread and polish the manuscript.

Thank you very much for your comment. According to your comment, we have proofread and polished the article. Thank you for your time in reviewing this manuscript again. Your comments were very helpful in improving the quality of the manuscript. I wish you a happy life and work.

Author Response File: Author Response.pdf
