Article
Peer-Review Record

CloudSatNet-1: FPGA-Based Hardware-Accelerated Quantized CNN for Satellite On-Board Cloud Coverage Classification

Remote Sens. 2022, 14(13), 3180; https://doi.org/10.3390/rs14133180
by Radoslav Pitonak 1, Jan Mucha 2,*, Lukas Dobis 1, Martin Javorka 1 and Marek Marusin 1
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 3 June 2022 / Revised: 24 June 2022 / Accepted: 27 June 2022 / Published: 2 July 2022

Round 1

Reviewer 1 Report

I found this paper very clear and well-written. It is sound, and the scientific results are reliable.

I have a few questions, mainly regarding the structure of the Neural Network (NN) used.

- In the introduction, maybe more detailed hints about hyperspectral images can be included

- The authors employed a NN with 10 convolutional layers and 2 Fully Connected layers: did they explore other structures?

- Regarding the training, did the authors try other values for alpha?

- I am no expert on FPGA technology, but the manuscript may benefit from giving more details on why the NN is trained first with floating-point and then retrained, rather than directly trained with lower bit widths.

Once the authors give more insights about the above observations, I suggest the publication of this work.

Author Response

Thank you for a very positive review and your valuable comments. We will address them one by one. 

# Reviewer 1, Comment 1

In the introduction, maybe more detailed hints about hyperspectral images can be included

 

# Answer 1.1

We would like to thank the reviewer for this comment. Nevertheless, we would like to point out that our study focuses on RGB images, not hyperspectral ones, due to the nature of CubeSats: they are intended as low-cost, easy-to-deploy satellites, and a hyperspectral imager is several times more expensive than an RGB imager. We do, however, plan to analyze hyperspectral images in the future, and we added the following text to the Introduction.

# Action 1.1

We added the following into the 3rd paragraph of Section 1. 

Generally, remote-sensing satellites can be equipped with a palette of sensors providing information in various bands: from the simplest (RGB imagery), through multispectral imagery (usually a combination of RGB and a near-infrared (NIR) band), to hyperspectral imagery providing a complex spectrum of the sensed area [25,26].

 

[25] Schwartz, C., et al. On-board satellite data processing to achieve smart information collection. In: Optics, Photonics and Digital Technologies for Imaging Applications VII. SPIE, 2022, pp. 121-131.

 

[26] ElMasry, G.; Sun, D.-W. Principles of hyperspectral imaging technology. In: Hyperspectral Imaging for Food Quality Analysis and Control. Academic Press, 2010, pp. 3-43.



# Reviewer 1, Comment 2

The authors employed a NN with 10 convolutional layers and 2 Fully Connected layers: did they explore other structures? 

 

# Answer 1.2

Thank you for this question. The straightforward answer is no, we did not. The decision was made in line with the scope of the paper and its major goal: to propose an FPGA-based, hardware-accelerated, quantized CNN for satellite on-board cloud coverage classification. Our inspiration comes from the most competitive work, namely the CloudScout cloud detection method proposed by Giuffrida et al. [34] and later extended by Rapuano et al. [20]. We designed the NN in line with their proposal so that we could compare performance. However, their solution processes hyperspectral images, whereas our simpler solution works with RGB data only.



# Reviewer 1, Comment 3

Regarding the training, did the authors try other values for alpha?

 

# Answer 1.3

We would like to clarify that all optimized hyperparameters are shown in Table 2, where it can be seen that the false-positive penalizer (alpha) was optimized over a uniform distribution ranging from 1 to 4. This range was selected in line with [34], where the authors showed a decrease in the number of FP errors while keeping accuracy at an acceptable value when the parameter alpha was set to 2, as mentioned at the end of Section 2.3.2 of the manuscript.
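As an illustration of how such a false-positive penalizer can enter the training objective, the sketch below weights the negative-class term of a binary cross-entropy by alpha. This is an assumption for illustration only, not the exact loss form used in the paper (the labels, where cloudy is treated as the positive class, are also hypothetical).

```python
import numpy as np

def fp_penalized_bce(y_true, y_pred, alpha=2.0, eps=1e-7):
    """Binary cross-entropy where the negative-class term (true label 0)
    is scaled by alpha, penalizing confident false positives more heavily."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    loss = -(y_true * np.log(y_pred)
             + alpha * (1 - y_true) * np.log(1 - y_pred))
    return loss.mean()

# Toy batch: sample 0 is a confident false positive (label 0, prediction 0.9).
y_true = np.array([0.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.1, 0.8, 0.6])
print(fp_penalized_bce(y_true, y_pred, alpha=1.0))  # plain BCE
print(fp_penalized_bce(y_true, y_pred, alpha=2.0))  # FP error weighs more
```

With alpha in the 1-4 range, the optimizer trades a higher overall loss on negatives for fewer confident false positives, matching the behavior reported for alpha = 2.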



# Reviewer 1, Comment 4

I am no expert on FPGA technology, but maybe the manuscript may benefit from giving more details on why the NN is trained first with floating-point and then retrained and not directly trained with lower bit widths.

 

# Answer 1.4

We thank the reviewer for this comment; there was misleading information that we would like to clarify. Training of the baseline 32-bit model is independent of quantization-aware training and is done only to quantify the accuracy distortion caused by quantized weights and activations. The misconception was probably introduced by the connecting arrow from 32-bit baseline training to quantization-aware training in Figure 8.
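To illustrate the rounding error that quantization-aware training exposes during the forward pass, here is a generic symmetric quantize-dequantize sketch. It is a conceptual illustration only and does not assume the specific toolchain or quantization scheme used in the paper.

```python
import numpy as np

def fake_quantize(x, bits=4):
    """Symmetric uniform quantize-dequantize: maps values onto a
    2^(bits-1)-1 level grid and back, exposing the quantization error."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit signed
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0  # one scale per tensor
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

w = np.array([0.71, -0.33, 0.05, -0.92])  # hypothetical weight values
print(fake_quantize(w, bits=4))   # coarse 4-bit grid, visible distortion
print(fake_quantize(w, bits=8))   # finer grid, much smaller error
```

During quantization-aware training the network sees these distorted values in the forward pass and adapts its weights to them, whereas the 32-bit baseline is trained without any such distortion and serves purely as an accuracy reference.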

 

# Action 1.4

We removed the misleading connecting arrow from 32-bit baseline training to quantization-aware training in Figure 8.

Reviewer 2 Report

The authors propose an application based on artificial intelligence techniques for processing satellite images covered by clouds.

Both software and hardware solutions have been developed and tested.

The paper includes an impressive number of details related to the design, implementation and testing of the proposed solutions. In order to increase the readability of the paper, a short synthetic paragraph could be included in each (sub) chapter, without technical or numerical details, in which to explain the general ideas related to that (sub) chapter.

It would also be interesting if a basic justification could be introduced, for different practical applications, in relation to the parameters that have priority in improving performance: calculation time, error rate, cost, etc.

 

Author Response

Thank you for a very positive review and your valuable comments. We will address them one by one. 

 

# Reviewer 2, Comment 1

The paper includes an impressive number of details related to the design, implementation and testing of the proposed solutions. In order to increase the readability of the paper, a short synthetic paragraph could be included in each (sub) chapter, without technical or numerical details, in which to explain the general ideas related to that (sub) chapter.

 

# Answer 2.1

We would like to thank the reviewer for this note. We respectfully disagree that a short synthetic paragraph should be included in every (sub)chapter, as the reviewer suggests. However, we included one in two sub-chapters of the methodology. We believe the remaining (sub)chapters convey their main idea within the first few sentences.

 

# Action 2.1

We added the following to the beginning of Section 2.3.1.

The main idea of this section is to introduce the quantization of CNNs and its implementation for the purposes of this study.

 

We added the following to the beginning of Section 2.3.2.

In the following paragraph, the proposed CNN architecture and loss function used during the training period are described.



# Reviewer 2, Comment 2

It would also be interesting if a basic justification could be introduced, for different practical applications, in relation to the parameters that have priority in improving performance: calculation time, error rate, cost, etc.

 

# Answer 2.2

We thank the reviewer for this comment and will try to justify it. These parameters can be tuned to the requirements of a specific mission. For example, if false positives are the biggest issue (because of a very limited downlink), false-positive penalization should be prioritized. If a real-time trigger is needed (e.g., turning on a very-high-resolution camera when there are no clouds in the field of view), FPS should be prioritized. The goal of this study was to demonstrate this flexibility and the available trade-off options.

Reviewer 3 Report

I really appreciated the amount of detail and the clarity of the manuscript. The background of the work was extensively discussed as well as the design architecture and process. Moreover, the results are very complete and clearly presented. It is not easy to find an accurate manuscript like this one.

However, I have just two small amendments:

- In Section 2.3.6 it would be useful to add the total latency in terms of clock cycles and time for every implementation. This is very important data, as are the already provided FPS figures.

- In Section 4 there are several comparisons with the state of the art. Although they are very detailed, some tables and/or figures would undoubtedly add more clarity to the presentation of the results.

Author Response

Thank you for a very positive review and your valuable comments. We will address them one by one. 

# Reviewer 3, Comment 1

In Section 2.3.6 it would be useful to add the total latency in terms of clock cycles and time for every implementation. This is very important data, as are the already provided FPS figures.

 

# Answer 3.1

We would like to thank the reviewer for this comment. However, we may not have understood it completely. Here are our thoughts:

  1. If the reviewer means the total latency per model configuration, this is reflected by the inverse of the FPS, and we see no motivation to add this metric to the table, as it is easy to derive from the FPS.
  2. If the reviewer means the total latency per network layer, this would be difficult to calculate. Total latency depends on the batch size and the data stream; the latency differs when images are already flowing through the pipeline compared with single-image inference.
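The first point above is a one-line conversion; the snippet below sketches it with hypothetical FPS values (the configuration names and numbers are illustrative, not taken from the paper's tables).

```python
# Per-image steady-state latency as the inverse of throughput.
# The FPS values below are hypothetical, for illustration only.
fps_values = {"config_A": 15.0, "config_B": 57.0}
for name, fps in fps_values.items():
    latency_ms = 1000.0 / fps  # milliseconds per image
    print(f"{name}: {fps:.0f} FPS -> {latency_ms:.1f} ms per image")
```

Note that this holds only for steady-state throughput; as the second point explains, the latency of a single image through a pipelined accelerator is a different quantity.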

 

# Reviewer 3, Comment 2

In Section 4 there are several comparisons with the state of the art. Although they are very detailed, some tables and/or figures would undoubtedly add more clarity to the presentation of the results.

 

# Answer 3.2

We thank the reviewer for this note. We agree that such a table adds clarity, and we added Table 10 comparing these methods.

 

# Action 3.2

We added Table 10 at the end of Section 4 and the following sentence at the end of Subsection 4.1.

The comparison of these methods is summarized in Table 10.
