Article
Peer-Review Record

SFRE-Net: Scattering Feature Relation Enhancement Network for Aircraft Detection in SAR Images

Remote Sens. 2022, 14(9), 2076; https://doi.org/10.3390/rs14092076
by Peng Zhang, Hao Xu, Tian Tian *, Peng Gao and Jinwen Tian
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 14 March 2022 / Revised: 19 April 2022 / Accepted: 23 April 2022 / Published: 26 April 2022

Round 1

Reviewer 1 Report

The abstract is too long, the issue needs to be contextualized better. Remove everything about the state of the art, insert a small brief about experimental results and performance. 

Figure 2 is poorly understood. You need to make it self-consistent, then contextualize it better with what is written in the article.

Same thing for Figure 3.

Same thing for all other figures.

The ground truths are missing. Try some experiments on the airplane graveyard in Tucson, there are ground truths there.

The performances need to be better explained.

You also have to tell how the method behaves if done with lower resolution images (so varying resolution) and varying signal to noise ratio SNR.

Then do a survey of detection performance at varying spatial resolution (Rayleigh distance) and SNR, with simulated data.

Author Response

Comment 1:

The abstract is too long, the issue needs to be contextualized better. Remove everything about the state of the art, insert a small brief about experimental results and performance.

 

Response:

Thank you for your positive comments and valuable suggestions to improve the quality of our manuscript. We have reduced the length of the abstract and inserted a small brief about experimental results and performance. The revised abstract is as follows:

 

Aircraft detection in synthetic aperture radar (SAR) images is a challenging task due to the discreteness of aircraft scattering characteristics, the diversity of aircraft size, and the interference of complex backgrounds. To address these problems, we propose a novel Scattering Feature Relation Enhancement Network (SFRE-Net) in this paper. Firstly, a cascade Transformer Block (TRsB) structure is adopted to improve the integrity of aircraft detection results by modeling the correlation between feature points. Secondly, a feature adaptive fusion pyramid structure (FAFP) is proposed to aggregate features of different levels and scales, enable the network to autonomously extract useful semantic information, and improve the multi-scale representation ability of the network. Thirdly, a context attention enhancement module (CAEM) is designed to improve the positioning accuracy in complex backgrounds. Considering the discreteness of scattering characteristics, the module uses a dilated convolution pyramid structure to enlarge the receptive field and then captures the position of the aircraft target through the coordinate attention mechanism. Experiments on the Gaofen-3 dataset demonstrate the effectiveness of SFRE-Net with a precision rate of 94.4% and a recall rate of 94.5%. Our code is available at https://github.com/hust-rslab/SFRE-Net.
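
For readers less familiar with the two rates quoted in the revised abstract, the following minimal sketch shows how precision and recall are usually computed from detection counts and how they combine into an F1 score. The function names and the example counts are hypothetical, and the F1 value is derived here only for illustration; it is not a figure reported in the response.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from detection counts.

    tp: detections matched to a real aircraft
    fp: detections with no matching aircraft (false alarms)
    fn: real aircraft that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from_rates(precision, recall)


def f1_from_rates(precision, recall):
    """Harmonic mean of precision and recall."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# Rates quoted in the revised abstract (94.4% precision, 94.5% recall).
f1 = f1_from_rates(0.944, 0.945)
print(f"F1 = {f1:.4f}")

# Hypothetical counts, only to show the count-based form.
p, r, f = precision_recall_f1(tp=90, fp=10, fn=10)
```

Because F1 is a harmonic mean, it always lies between the two rates, so with precision and recall this close together it is nearly equal to both.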

 

Comment 2:

Figure 2 is poorly understood. You need to make it self-consistent, then contextualize it better with what is written in the article. Same thing for Figure 3. Same thing for all other figures.

 

Response:

Thank you very much for your valuable comments, which help make the manuscript read more naturally and smoothly. We restate the idea behind Figure 2 here. Figure 2 shows the overall structure of our algorithm and is intended to convey its design concept. The implementation details of the individual modules are given in the corresponding method subsections. The aim of this paper is to alleviate the difficulties in SAR aircraft detection, so the paper focuses on the three proposed modules: Transformer Block (TRsB), Feature Adaptive Fusion Pyramid (FAFP) and Context Attention Enhancement Module (CAEM). Components that are not the main contribution of this paper are also explained in the manuscript to make the model easier to follow. For example, for the backbone used in SFRE-Net, we state in the manuscript that it adopts the same CSPDarknet53 as YOLOv5, which is a widely known structure.

We have carefully considered your comments and revised the relevant content. First, we moved Figure 2 so that it fits the surrounding context and is introduced more naturally by the text. We also added a detailed explanatory description beneath Figure 2 so that readers can clearly understand the components and overall flow of our algorithm. Second, following your suggestion, we revised the method chapter so that the module figures fit the context of the manuscript better. For example, for Figure 3, we now introduce the design concept of TRsB before the figure and describe the detailed method below it, which makes Figure 3 more readable. The other figures, such as Figures 4, 5 and 6, have been revised in the same way. Thank you again for your valuable comments; we hope the improvements to this manuscript meet your requirements.

 

Comment 3:

The ground truths are missing. Try some experiments on the airplane graveyard in Tucson, there are ground truths there.

 

Response:

Thank you very much for your careful comments and valuable suggestions. The dataset used in this paper is the only publicly available SAR aircraft detection or classification dataset. It provides seven different types of aircraft targets, which can be used for fine-grained classification or aircraft detection. This paper focuses on the detection of SAR aircraft. The website link given in the manuscript follows the reference format required by the publisher of the dataset. To download the dataset, you need to register on the website and contact the dataset's official staff. Due to conflicts of interest, we cannot release this dataset in our own name. The official website address is given here; you can contact the official staff to download the dataset at http://gaofen-challenge.com/challenge/dataset/4.

Thank you very much for your valuable suggestions. Testing our algorithm on Tucson is a very good idea and would enrich the experiments in our manuscript. We are very willing to carry out this experiment, but we could not find an open-source SAR aircraft dataset of Tucson on the Internet. The dataset used in our manuscript is currently the only publicly accessible SAR aircraft dataset. If you have access to a Tucson dataset, could you send us a link? We would gladly conduct this experiment, which would further improve the quality of the manuscript. Thank you.

 

Comment 4:

The performances need to be better explained. You also have to tell how the method behaves if done with lower resolution images (so varying resolution) and varying signal to noise ratio SNR.

 

Response:

Thank you very much for your comments. Your comments are always so constructive. Multi-resolution and various signal-to-noise ratio (SNR) data are of great significance for SAR aircraft target detection, but the above suggestions have little relevance to the research content of this paper. The reasons are as follows:

1) The main purpose of this paper is to improve aircraft detection performance in SAR images in the presence of discrete aircraft scattering characteristics, diverse aircraft sizes, and interference from complex backgrounds. Extensive experiments have demonstrated the effectiveness of the proposed SFRE-Net for aircraft detection in SAR images.

2) The SAR dataset used in this paper is the only publicly available dataset for aircraft detection. It contains aircraft with discrete scattering characteristics, various aircraft sizes and clutter interference, which allows us to fully verify our research purpose. Comparisons with other target detection methods on this dataset demonstrate the value of SFRE-Net in SAR aircraft detection.

3) SAR aircraft images covering a range of resolutions and SNRs are still lacking. Furthermore, detection under varying resolution or varying SNR differs considerably from the main purpose of this paper.

For the reasons above, experiments with varying resolution and varying SNR have little bearing on the work in this paper, and no such experiments are added in this revision.

However, experiments with varying resolution and varying SNR are still valuable for SAR aircraft target detection, and we plan to pursue them as independent work in our future research.
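
To make concrete what a resolution and SNR sweep of the kind the reviewer describes could involve, here is a minimal sketch that degrades an image by block averaging and by adding noise at a chosen SNR. It is not an experiment from the manuscript. The helper names, the toy 8×8 image and the sweep values are all hypothetical, and additive Gaussian noise is a simplifying assumption (SAR speckle is usually modeled as multiplicative).

```python
import math
import random


def downsample(image, factor):
    """Block-average a 2-D list of intensities by an integer factor."""
    rows = len(image) // factor
    cols = len(image[0]) // factor
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            block = [image[r * factor + i][c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def add_noise_at_snr(image, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to a target SNR in dB.

    SNR is taken as mean signal power over noise power (an assumption).
    """
    pixels = [p for row in image for p in row]
    signal_power = sum(p * p for p in pixels) / len(pixels)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [[p + rng.gauss(0.0, sigma) for p in row] for row in image]


rng = random.Random(0)  # fixed seed so the sweep is repeatable
image = [[float((r * c) % 7) for c in range(8)] for r in range(8)]  # toy data

sweep = {}
for factor in (1, 2, 4):          # coarser resolution as factor grows
    for snr_db in (20, 10, 0):    # noisier as snr_db falls
        degraded = add_noise_at_snr(downsample(image, factor), snr_db, rng)
        # A detector would be evaluated on `degraded` at this point.
        sweep[(factor, snr_db)] = degraded
```

Each entry of `sweep` holds one degraded copy of the image; evaluating a detector on every entry would give detection performance as a function of resolution and SNR.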

 

Comment 5:

Then do a survey of detection performance at varying spatial resolution (Rayleigh distance) and SNR, with simulated data.

 

Response:

 

Thank you again for your constructive comments, which always inspire us. However, our research focuses on improving the detection accuracy of SAR aircraft targets in real scenes with complex backgrounds, rather than on simulated data. To address the difficulties of SAR aircraft detection in real scenes, we propose the TRsB, FAFP and CAEM modules according to the scattering characteristics of real SAR aircraft targets, which significantly improve the detection performance for SAR aircraft. There is a large gap between simulated data and real data, and simulated data contributes little to current SAR aircraft detection in real scenes.

In short, a survey of detection performance at varying spatial resolution and SNR with simulated data has little relation to our proposed model, and no such survey is added in this revision.

 

We are very grateful for all your valuable comments, which are of great help in improving the quality of our manuscript. However, some of the comments have little relevance to the research content of this paper and are not suitable for inclusion in our manuscript. Nevertheless, we are deeply inspired by them and are willing to carry out new research based on these suggestions in future work. We have tried our best to revise the manuscript and hope you are satisfied. Thank you.


Reviewer 2 Report

The authors proposed a scattering feature relationship enhancement network (SFRE-Net) to improve the problem of SAR aircraft detection.  The paper is overall clearly written and technically sound, however, some revisions are required. The author should define all the acronyms and abbreviations the first time they use them. For Example, CFAR (line 29 p 1), CNN (line 40 p 2), DETR (line 135 p 4), and so on.

Line 40, add a sentence like "CNN is one of the most widely used architectures in deep learning"

Line 118: substitute ",which" with "and". Moreover, there is no need for a new paragraph.

Line 126: What is the reference for Yolov5?

Line 159: Substitute "depth" with "deep"

Author Response

Response to Reviewer #2

Comment 1:

The authors proposed a scattering feature relationship enhancement network (SFRE-Net) to improve the problem of SAR aircraft detection.  The paper is overall clearly written and technically sound, however, some revisions are required. The author should define all the acronyms and abbreviations the first time they use them. For Example, CFAR (line 29 p 1), CNN (line 40 p 2), DETR (line 135 p 4), and so on.

 

Response:

We are very grateful for your professional review of our article. As you point out, several problems needed to be addressed. Following your suggestions, we have corrected our previous draft and now define all acronyms and abbreviations the first time they are used in the revised draft. The revised content is highlighted in yellow and is as follows:

The constant false alarm rate (CFAR)

cell-averaging constant false alarm rate (CA-CFAR)

variability index constant false alarm rate (VI-CFAR)

Convolutional Neural Network (CNN)

End-to-End Object Detection with Transformers (DETR)

Bidirectional Feature Pyramid Network (BiFPN)

Squeeze-and-Excitation (SE)

Bottleneck Attention Module (BAM)

Convolutional Block Attention Module (CBAM)

 

Comment 2:

Line 40, add a sentence like "CNN is one of the most widely used architectures in deep learning"

 

Response:

Thank you again for your positive comments and valuable suggestions to improve the quality of our manuscript. The revised content is as follows:

Recently, deep learning has developed rapidly. The Convolutional Neural Network (CNN) is one of the most widely used architectures in deep learning with strong feature description ability, which has made outstanding contributions in many fields [23], [24].

 

Comment 3:

Line 118: substitute ",which" with "and". Moreover, there is no need for a new paragraph.

 

Response:

We sincerely thank the reviewer for careful reading. As suggested by the reviewer, we have made corrections in the revised draft.

 

Comment 4:

Line 126: What is the reference for Yolov5?

 

Response:

Thank you for pointing this out. We have added references for YOLOv5 in the revised draft.

Comment 5:

Line 159: Substitute "depth" with "deep"

 

Response:

We are sorry for our careless mistake. Thank you for the reminder. Our correction is as follows:

At present, SAR aircraft target detection based on deep CNN has attracted extensive attention.

 

Thank you very much for your positive comments on our work. If there are any other modifications we could make, we would be glad to make them. We really appreciate your help.

 


Reviewer 3 Report

Dear Authors,

Introducing the presented article describes SAR and the ability of past and current target detection, which shows the reader the progress in the current issue of distinguishing targets with the help of the mentioned technologies. The authors are deeply focused only on the detection and no solution in the introductory chapter on usefulness at airports or on flying devices such as autonomous flying vehicles. In the beginning, it would be appropriate for the authors to prove that their research has a relevant basis for the need for simultaneous detection of ground targets or aircraft on the ground.

In Chapter 3, Materials and Methods, the authors return to the description of the important design of SFRE-Net and highlight the complications of SAR. It is not possible for the same problem to be discussed again in this chapter but in a unique form. It is necessary for the authors in this chapter to show the ability to determine the method of implementation of the SFRE-Net system design. And what they used for the solution. I ask the authors to comment on this.

Figure 3 authors used Dot-Product. My question is whether it is a vector product. When they explain mathematical sequence as a vector product, just below this figure.

I ask the authors for a correct English interpretation of “it mainly composed The Transformer encoder block of Multi-Head Self-Attention (MSA) layer and fully connected layer (MLP)” as (multi-layer perception).

Figure 9 - I ask the authors to attach an actual picture of the detected aircraft to the picture so that the reader completes the actual picture that these are detected aircraft.

Figure 10 - I take this picture as a key compared to detection with other technologies. However, I do not see any constant pattern by which I can compare the considered improvement.

 

Conclusion:

The authors set partial goals in the article, which were partially fulfilled in the article. It is necessary to set fewer goals and focus only on what shows the relevance of the results. The article in the introductory chapters is well described, but it is already clear in the results that the proposals are only in terms of theoretical improvement, with little added practical value in improving the overall capabilities of target detection at the airport.

Therefore, I recommend the editor review the article.

In overall quality, however, the article has no shortcomings in the scientific manuscript.

 

 

 

 

Author Response

Response to Reviewer #3

Comment 1:

Introducing the presented article describes SAR and the ability of past and current target detection, which shows the reader the progress in the current issue of distinguishing targets with the help of the mentioned technologies. The authors are deeply focused only on the detection and no solution in the introductory chapter on usefulness at airports or on flying devices such as autonomous flying vehicles. In the beginning, it would be appropriate for the authors to prove that their research has a relevant basis for the need for simultaneous detection of ground targets or aircraft on the ground.

 

Response:

Thank you for your positive comments and valuable suggestions to improve the quality of our manuscript. This paper is a scientific paper, which mainly focuses on the problem of SAR aircraft detection in complex background, not an engineering paper. Therefore, our previous manuscript did not introduce the engineering scheme. However, we agree with you that we should highlight the significance of our research at the beginning of the introduction chapter, which will improve the research background and significance of this paper. Therefore, according to your suggestion, we have modified the beginning of the introduction chapter, as shown below:

 

Synthetic aperture radar has all-weather and all-day observation capability. With the unique imaging mechanism, SAR plays a crucial role in many fields [1-6], such as target detection, strategic reconnaissance, and terrain detection. Automatic target recognition (ATR) is one of the most important applications in SAR, which aims to locate and identify potential targets and has been studied for decades [7-10]. Aircraft is an important target in SAR, and the detection of it has important application value in airport management, military reconnaissance and other fields. With the development of SAR imaging technology, SAR aircraft detection has attracted extensive attention and has become an independent research direction [11-14].

 

Comment 2:

In Chapter 3, Materials and Methods, the authors return to the description of the important design of SFRE-Net and highlight the complications of SAR. It is not possible for the same problem to be discussed again in this chapter but in a unique form. It is necessary for the authors in this chapter to show the ability to determine the method of implementation of the SFRE-Net system design. And what they used for the solution. I ask the authors to comment on this.

 

Response:

We are very grateful for your professional review of our article. We fully agree with your comment that Chapter 3 should focus on the design of the method and should not repeat the discussion from Chapter 1. Your valuable suggestions will greatly improve the quality of the manuscript. Following your suggestion, we revised Chapter 3 in the revised draft: we removed the repeated discussion of the same problem, focused on the design of the method, and highlighted the changes in yellow.

 

Comment 3:

Figure 3 authors used Dot-Product. My question is whether it is a vector product. When they explain mathematical sequence as a vector product, just below this figure.

 

Response:

We sincerely thank the reviewer for the careful reading. This paper uses the self-attention mechanism to calculate the correlation between feature points. The principle of self-attention is well known and is not our contribution, so we did not explain it in the manuscript. We describe the calculation process of self-attention in detail below.

Firstly, each feature point in the feature map is encoded to obtain an embedding vector. Then, the embedding vector of each feature point generates the vectors q_i, k_i and v_i through three learnable parameter matrices W_Q, W_K and W_V. Here q_i (q_i ∈ R^(d×1)) is the query vector of the feature point at the i-th position after the input feature map is flattened. The vectors of all feature points are stacked to form the matrices Q (q_1, q_2, q_3, …), K (k_1, k_2, k_3, …) and V (v_1, v_2, v_3, …). Q and K are then multiplied to obtain the correlation matrix W(i,j). The output formula is as follows.

Thank you for your careful observation. Strictly speaking, it is a multiplication in matrix form. However, what we intend to express is the similarity between feature points, for example q_1 · k_1.

We have carefully considered your proposal and changed the Dot-Product in Figure 3 to matrix multiplication to make it more consistent with the mathematical representation. Thank you again for your proposal, which will make our manuscript more rigorous.
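
The relation between the per-point dot products and the matrix multiplication discussed in this response can be illustrated with a minimal pure-Python sketch of single-head self-attention. This is the standard scaled dot-product form, not the authors' implementation: the scaling by the square root of d, the softmax normalization, the row-vector layout (one feature point per row) and all function names are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over feature points stored as rows of x."""
    q = matmul(x, w_q)  # row i is the query q_i
    k = matmul(x, w_k)  # row j is the key k_j
    v = matmul(x, w_v)  # row j is the value v_j
    d = len(q[0])
    # Entry (i, j) of Q K^T is exactly the dot product q_i . k_j.
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Three hypothetical feature points with d = 2 and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

The single product Q K^T computes every pairwise dot product at once, which is why the figure can be labeled as a matrix multiplication while still expressing similarity between individual feature points.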

 

Comment 4:

I ask the authors for a correct English interpretation of “it mainly composed The Transformer encoder block of Multi-Head Self-Attention (MSA) layer and fully connected layer (MLP)” as (multi-layer perception).

Response:

Thanks for your careful checks. We are sorry for our carelessness. Based on your comments, we have made the corrections in the revised draft, as follows:

The Transformer encoder block is mainly composed of a Multi-Head Self-Attention (MHSA) layer and a Multi-Layer Perceptron (MLP).

 

Comment 5:

Figure 9 - I ask the authors to attach an actual picture of the detected aircraft to the picture so that the reader completes the actual picture that these are detected aircraft.

 

Response:

Thank you for your constructive suggestion. As you suggest, we should attach pictures of the real detected aircraft to improve the readability of the manuscript. We have modified Figure 9 to add real pictures of the detected aircraft, as shown below.

 

Comment 6:

Figure 10 - I take this picture as a key compared to detection with other technologies. However, I do not see any constant pattern by which I can compare the considered improvement.

 

Response:

Thank you for your professional comments and careful review of our paper. Please allow us to make a statement on this issue. Firstly, to ensure the fairness of the comparative experiment, we randomly selected several pictures from the test set to visualize the detection results of the different algorithms. Secondly, the algorithms compared in this paper are among the most advanced target detection algorithms at present. They all have excellent detection ability, but a comparison with the SFRE-Net proposed in this paper still reveals their shortcomings in SAR aircraft detection. For example, compared with RetinaNet, SFRE-Net improves the completeness with which aircraft targets are detected. Compared with CenterNet and Faster R-CNN, SFRE-Net has stronger anti-interference ability and can effectively distinguish between targets and background interference. Compared with FCOS, SFRE-Net detects targets of different sizes better. In Figure 10, comparing the other algorithms with SFRE-Net shows that SFRE-Net better alleviates the problems of discrete targets, diverse target sizes and complex background interference in SAR aircraft detection. Thirdly, Figure 10 allows readers to see the detection results of SFRE-Net more intuitively, and Table 2 demonstrates the superiority of SFRE-Net in SAR aircraft detection.

 

 

 

Thank you very much for your positive comments on our work. If there are any other modifications we could make, we would be glad to make them. We really appreciate your help.


Round 2

Reviewer 1 Report

Accepted

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.

