Article
Peer-Review Record

The Cost of Urban Renewal: Annual Construction Waste Estimation via Multi-Scale Target Information Extraction and Attention-Enhanced Networks in Changping District, Beijing

Remote Sens. 2024, 16(11), 1889; https://doi.org/10.3390/rs16111889
by Lei Huang 1, Shaofu Lin 1, Xiliang Liu 1,*, Shaohua Wang 2, Guihong Chen 3, Qiang Mei 4 and Zhe Fu 5
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 10 April 2024 / Revised: 21 May 2024 / Accepted: 23 May 2024 / Published: 24 May 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript addresses the issue of ‘Waste Estimation’ by proposing a dataset and a multi-task attention-enhancing network. Identifying construction waste in urban areas is a significant task, and there is a shortage of relevant datasets. The paper has contributed by open-sourcing a related dataset. Generally, proposing a reasonable algorithm could meet publication requirements. However, there are several questionable aspects in the design of the algorithm and the experimental setup. I would like to offer some suggestions that may be helpful to the authors:

 

Major

 

1. The authors have introduced DS-ASPP and MT-AENet, which are designed to extract features at both the local and global levels of the image. The DS-ASPP structure in particular is intended to address the aforementioned issues. However, this contradicts the conclusion drawn from the ablation study, where DS-ASPP did not enhance the model's performance but only reduced its running time. I would like to request that the authors explain this discrepancy.

 

2. In the introduction, the authors discuss various algorithms associated with urban waste. However, for comparative analysis, they only use some conventional semantic segmentation algorithms, which were not mentioned in the introduction. This experimental design seems unusual. Typically, if there are algorithms related to urban waste, one would expect these to be included in the comparative study. I suggest that the authors either revise the introduction to include these comparative algorithms or adjust the set of comparison algorithms to match those discussed in the introduction.

 

3. This model is used for waste estimation, and its results can also be applied to “Engineering Waste”, “Demolition Waste”, and “Landfill Waste”. Similarly, the outcomes of Land Use and Land Cover (LULC) classification can interface with a multitude of applications. However, this does not imply that the underlying algorithm is inherently multi-task. The term "multi-task" in the context of machine learning typically refers to a model's capability to simultaneously learn and optimize for multiple related tasks, often within the same training framework. I recommend not using "Multi-Task" in the model's name. The title "Multi-task Attention Enhancing Network" is ambiguous and might not be suitable.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

This paper proposes a multi-task information extraction and attention-enhancing network for construction waste estimation and applies it in Changping District, Beijing. The research theme is significant, the idea is interesting, and the experiments are very systematic. However, there are some points of confusion and suggestions for improvement, as follows.

 

1. Writing. The abstract needs further revision to clarify the motivation and core idea of this work, e.g., this paper focuses on the problem of construction waste estimation, but the first contribution is proposing the MT-AENet to capture multi-scale contextual information ...

In my opinion, the MT-AENet is one step in construction waste estimation. What is the relationship between MT-AENet and the research topic?

 

2. There are a great number of building extraction studies. What are the differences between the proposed model and previous work?

 

3. Experiments. The authors claim that building the new dataset is an important contribution, and for that reason, I propose moving the construction of this dataset and its introduction into a separate chapter.

Comments on the Quality of English Language

N/A

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

This paper first proposes a multi-task attention enhancement network (MT-AENet) based on high-resolution remote sensing images to meet the needs of dynamic identification of buildings and tracking of changes in construction waste disposal sites. This network can effectively address the difficulty traditional networks have in extracting irregularly shaped and fragmented construction waste, as well as their substandard performance in extracting multi-scale buildings. In the experiments, MT-AENet was used to extract buildings and construction waste in different periods in Changping District, Beijing. The actual generation and landfill volume of construction waste were calculated based on regional changes, and the resource conversion rate of urban construction waste was indirectly measured. The experimental results show that, compared with other baseline networks, MT-AENet achieves better results in extracting buildings and construction waste. This paper has great research value for the extraction of buildings and construction waste and for estimating the resource conversion rate of urban construction waste, and it provides good suggestions for the future sustainable development of cities. However, the paper has several shortcomings in content and format. I would suggest that it be accepted after the following questions are addressed.

Detailed comments come as follows:

1. Title 2.2.2. There is an extra colon after the title 2.2.2. The Position Attention Module. It is recommended to delete the colon.

2. Section 2.2.4, the DAMM module should be added to the MT-AENet network. The introduction in Figure 2 is not detailed. It is recommended to add a short text to briefly introduce the DAMM module.

3. Figure 11. In Figure 11, the comparative statistical histogram of various administrative districts in Changping District in 2019 and 2020 does not show specific values and units in the legend. It is recommended to add them to the figures.

4. Figure 11. In Figure 11, for the color rendering of the building area of each administrative district in Changping District, it is recommended to use gradient ribbons of the same color to make the representation more intuitive.

5. Table 11. The unit corresponding to the "resource conversion rate" in Table 11 is percentage, not tons, and the unit of the table is tons. It is recommended to revise the table.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

The paper presents a new approach to estimating construction waste in the context of urban renewal. It introduces the Multi-Task Attention-Enhanced Network (MT-AENet) which utilizes high-resolution remote sensing images to dynamically identify buildings and track changes in construction waste disposal sites. Experiments show improvements over traditional methods in terms of accuracy and efficiency.

Pros:

1. The introduction of MT-AENet with its encoder-decoder architecture, including dual attention mechanisms and depthwise separable-atrous spatial pyramid pooling, significantly advances handling complex image segmentation tasks.

2. Figure 1 clearly illustrates the overall system framework.

3. A thorough evaluation of the proposed method against various baselines is presented, demonstrating superior performance in key metrics like precision, recall, F1 score, and IoU.

4. The method has clear practical implications for urban planning and waste management, offering a more efficient way to estimate and manage construction waste.
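For readers unfamiliar with the metrics named in item 3, the following is a minimal sketch of how precision, recall, F1 score, and IoU are commonly computed for a binary segmentation mask. It is not the authors' evaluation code; the function name `segmentation_metrics` and the toy masks are assumptions made purely for illustration.

```python
def segmentation_metrics(pred, truth):
    """Compute precision, recall, F1 and IoU for two flat binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 pixel labels
    (hypothetical inputs; real masks would be flattened 2-D arrays).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0

    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy example: 5 pixels, 2 true positives, 1 false positive, 1 false negative.
metrics = segmentation_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(metrics)
```

IoU penalizes false positives and false negatives jointly, so it is never higher than F1 on the same mask.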

Cons:

1. The proposed model, while innovative, may require substantial computational resources, potentially limiting its applicability in resource-constrained environments. Please add more computational complexity analysis.

2. While the paper provides a detailed technical description, it could benefit from a deeper analysis of the environmental impact and practical applications of the findings. In addition, ablation studies are missing: I think the authors can provide more ablations to verify the improvement contributed by each part.

3. Only Changping District is considered. Can the proposed method be applied to other regions or types of land?

4. Incorrect punctuation usage in the formula: at the end of the formula, a comma or a period should be used (for almost all the equations).

5. Some parts of Fig. 2 look blurred (e.g., the Position Attention Module part). Consider using high-resolution figures.
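As a rough illustration of the kind of complexity analysis requested in item 1, the sketch below compares the weight count of a standard convolution with that of a depthwise separable convolution, the building block named in the DS-ASPP module. The kernel size and channel counts are assumed values chosen only for the example, not figures from the manuscript; biases are ignored, and the atrous (dilation) rate is omitted because it does not change the number of weights.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Assumed layer shape for illustration only.
k, c_in, c_out = 3, 256, 256

standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {separable / standard:.3f}")
```

The ratio equals 1/c_out + 1/k², so for a 3 x 3 kernel with many output channels the separable layer needs roughly one ninth of the weights. A full analysis would also report FLOPs and measured inference time on the target hardware.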

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

After the authors' revisions, the conclusions and expressions of the paper are more reasonable and logical. The authors have addressed all my concerns, and I have no further questions.

Reviewer 3 Report

Comments and Suggestions for Authors

All my concerns have been addressed. I don't have any more comments.
