Article
Peer-Review Record

RCA-PixelCNN: Residual Causal Attention PixelCNN for Pulsar Candidate Image Lossless Compression

Appl. Sci. 2023, 13(19), 10941; https://doi.org/10.3390/app131910941
by Jiatao Jiang 1,2,3, Xiaoyao Xie 1,3,*, Xuhong Yu 1,3, Ziyi You 4 and Qian Hu 5
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 2 September 2023 / Revised: 25 September 2023 / Accepted: 28 September 2023 / Published: 3 October 2023

Round 1

Reviewer 1 Report

The paper is well organized; however, it is missing some technical details.

The reason for not selecting the horizontal feature needs to be clarified. Is there any strong support for using vertical features only?

Numbers in the first paragraph of the introduction section should be supported with proper evidence/citation. 

Citations are missing in many places. 

The concept is quite simple and lacks significant contribution.

Author Response

Questions and Suggestions from Reviewer 1

1. Numbers in the first paragraph of the introduction section should be supported with proper evidence/citation.

Reply:

Your opinion is very good, and I sincerely accept it. The data in the previous manuscript version came from statistics in our daily work, which may not be persuasive. I have modified this section's data and used data from papers published by the Chinese National Astronomical Observatory as support.

(Di Li, 2018) [1] pointed out that the volume of data generated by pulsar observations is one of the significant technical challenges in the FAST radio astronomical survey. For 8-bit sampling of the FAST backends, 100 μs time sampling, 4 K channels, 3 polarizations, 19 beams, the data rate will amount to 1.6 GB/s, 5.8 TB/h, and 144 TB/day. The annual data volume will depend upon the operational conditions and the time allocation between surveys and PI-led programs. If we only consider 200 observation days per year, the data volume for FAST would still amount to 28 PB.

2. The reason for not selecting the horizontal feature needs to be clarified. Is there any strong support for using vertical features only?

Reply: When introducing the pixel-to-pixel correlation of pulsar candidate images in my manuscript, I emphasized that the time-phase subplots and frequency-phase subplots of pulsar candidate images intuitively show this vertical correlation. The phase of the pulse signal radiated from the same point source is identical, so these two subplots exhibit a prominent stripe. Of course, there is also horizontal correlation in the images. To capture the spatial structural correlations in the images, we introduce an RCA-PixelCNN model. To explain the research motivation more clearly, I have made the following modifications to the introduction section:

There is a significant spatial correlation within the pixel structure of pulsar candidate images. Especially in the time-phase subfigures and frequency-phase subfigures, we can visually observe this vertical correlation. This is because the pulse signals radiated from the same point source share the same phase, resulting in a prominent stripe in these two subfigures. The subfigures of pulsar candidates have a size of 32×32, and traditional convolution operations may easily overlook distant information. To effectively model pixel density, our model structure must fully leverage the spatial structural correlation of these images, capturing both local feature dependencies and distant correlated feature information.

 

3. Citations are missing.

Reply: Yes, I missed citations 1, 6, and 7. I have now added them. Thank you for your reminder!

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper concerns broad aspects of the compression and storage of the new FAST data in pulsar searches. The authors propose a new method, RCA-PixelCNN, for working with the deep and intense sky survey named HTRU1. The method has an advantage in terms of the average negative log-likelihood value. The generated positive pulsar candidate samples are good examples of the use of the method. Only one remark: could you give 1-2 examples of the astronomical results for pulsars detected with the FAST antenna using the method?

I have no comments about the English, but the syntax is sometimes questionable, for example excess commas.

Author Response

1. Give 1-2 examples of the astronomical results for the pulsars detected with FAST using the method.

Reply: I will add the following text to the introduction section:

Guizhou Normal University FAST Early Data Center participated in all pulsar search work for CRAFTS observations with FAST, including the first pulsar discovered by FAST, PSR J1900-0134, and the first millisecond pulsar, PSR J0318+0253. Our methods were employed in the data processing stages of these significant scientific discoveries.

Author Response File: Author Response.pdf

Reviewer 3 Report

This paper introduces a scheme to address the compression of pulsar candidate images. However, there are some issues, which are listed below.

 

1. Line 13. This study focuses on.

2. Line 24. “model pulsar data. In”, space after data!

3. Line 59. “in a complex and a high-dimensional”. Adding “a”.

4. Line 128. The author does not mention the labels of other subfigures in Figures 1, 5, and above.

5. Line 152. “information, and a simply”. “a” after simply.

6. Line 162. It is better to refer to (a) and (b) in the caption of Figure 2.

7. Line 179. It is recommended to write “The Suggested Method” or “The Proposed Method” instead of “Our Method”.

8. Figure 3, the filter bank (number of filters) for each conv layer should be clarified. In addition, the activation layers should be mentioned inside the figure.

9. There should be a space after each period “.”, lines 179, 202, and 209!

10. Line 235. What is the number of filters in each conv layer in Figure 4? “the number of channels decreases by a factor as indicated by the downward-pointing arrow in Figure 4.” Not clear!

11. Line 257. “obtain prediction features”, I think the authors mean “obtain extracted features”

12. Line 350. Again, how many filters (filter bank) were used in each convolutional layer in the proposed method?

13. Lines 370-384, and Table 1. The author does not clarify why the NLL was used while the application is not a logistic regression model. The negative-log-likelihood (NLL) function is used to estimate the parameters of a logistic regression model.

14. Figure 5. The orange and red lines are not the test loss for the base model and the suggested model, respectively. These two lines represent the validation loss, since their values change with each epoch. This figure should be changed. Also, the single value of the test loss should be mentioned.

15. Table 3. The same note as for 13.

There are some issues related to the quality of the English language, such as (some of them):

Line 59. “in a complex and a high-dimensional”. Adding “a”.

Line 152. “information, and a simply”. “a” after simply.

Line 179. It is recommended to write “The Suggested Method” or “The Proposed Method” instead of “Our Method”.

There should be a space after each period “.”, lines 179, 202, and 209!

Line 257. “obtain prediction features”, I think the authors mean “obtain extracted features”

Author Response

Questions and Suggestions from Reviewer 3

 

Thank you to the reviewer for the thorough review of my manuscript and for providing so many questions and suggestions. I will address the following issues:

 

  1. Line 13. “The study focuses on” was replaced by “This study focuses on” in the abstract.

  2. Line 24. I have added a space after “model pulsar data.”

  3. Line 59. “complex and high-dimensional data.” was replaced by “a complex and a high-dimensional data.”

  4. The labels of the other subfigures were not mentioned.

Reply: Yes, the processing of those subfigures is not the main focus of this paper. My other work deals with them. I have added the following introduction to Section 2.1:

Here, our primary focus is on the compression processing of subfigures (2), (3), and (4), while for the other subfigures, we will convert them into binary images using WBS (white block skipping) encoding.
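The response does not say which variant of white block skipping is meant. As a rough illustration only, the following sketch shows the textbook one-dimensional form: the binary pixels are cut into fixed-size blocks, an all-white block is stored as a single `0` bit, and any other block is stored as a `1` flag followed by its raw pixels. The function names, the block size of 4, and the convention that white is `0` are assumptions made for this example, not details taken from the manuscript.

```python
def wbs_encode(bits, block=4, white=0):
    """Encode a flat sequence of binary pixels with white block skipping.

    Hypothetical helper: block size and the 'white' value are assumptions.
    """
    code = []
    for start in range(0, len(bits), block):
        chunk = list(bits[start:start + block])
        if all(b == white for b in chunk):
            code.append(0)          # an all-white block costs a single bit
        else:
            code.append(1)          # flag bit, then the raw pixels of the block
            code.extend(chunk)
    return code


def wbs_decode(code, n_pixels, block=4, white=0):
    """Invert wbs_encode; n_pixels is needed because the last block may be short."""
    pixels, pos = [], 0
    while len(pixels) < n_pixels:
        size = min(block, n_pixels - len(pixels))
        flag = code[pos]
        pos += 1
        if flag == 0:
            pixels.extend([white] * size)
        else:
            pixels.extend(code[pos:pos + size])
            pos += size
    return pixels


# One mostly white row of 16 pixels: 16 raw bits shrink to 8 coded bits.
row = [0] * 8 + [0, 1, 1, 0] + [0] * 4
encoded = wbs_encode(row)
restored = wbs_decode(encoded, len(row))
```

The scheme is lossless, and it only pays off when most blocks are entirely white, which is presumably why it is applied to the binarized subfigures.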

 

  5. Line 152. “information, and simply” was replaced by “information, and a simply”.

  6. The caption of Figure 2 does not refer to (a) and (b).

I have added the explanations for subfigures (a) and (b) of Figure 2 to Section 2.2:

In Figure 2(a), Type A convolution masks the lower-right part of the convolutional kernel information. By using Type B convolution multiple times, as shown in Figure 2(b), we can expand the receptive field and extract information from the left and upper positions of the current location.
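To make the two mask types concrete, here is a minimal sketch of how such masks are commonly built for a single-channel PixelCNN kernel. It follows the usual convention in which Type A hides the centre pixel and everything after it in raster order, while Type B additionally keeps the centre. The function name `causal_mask` is hypothetical, and Figure 2 of the manuscript may draw the masks differently.

```python
def causal_mask(kernel_size, mask_type):
    """Return a kernel_size x kernel_size list of 0/1 weights.

    1 = the kernel weight is kept, 0 = the weight is zeroed out.
    Assumes an odd kernel size and a single input channel.
    """
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    centre = kernel_size // 2
    mask = [[0] * kernel_size for _ in range(kernel_size)]
    for row in range(kernel_size):
        for col in range(kernel_size):
            above = row < centre                       # rows above the current pixel
            left = row == centre and col < centre      # same row, to the left
            itself = row == centre and col == centre and mask_type == "B"
            if above or left or itself:
                mask[row][col] = 1
    return mask


mask_a = causal_mask(3, "A")   # used once, in the first layer
mask_b = causal_mask(3, "B")   # used in every later layer
```

In a network the mask would be multiplied element-wise with the convolution weights before each forward pass. Stacking several Type B layers after one Type A layer then widens the receptive field without ever exposing the current pixel or any pixel that follows it.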

 

  7. Line 179. “Our method” was replaced by “The Proposed Method”.

  8. Figure 3: the filter bank (number of filters) for each conv layer should be clarified. In addition, the activation layers should be mentioned inside the figure.

We have made modifications to Figure 3, briefly introduced the modules in the text, and provided the technical implementation details of the entire model in Section 4.4.1.

  9. I have added a space in lines 179, 202, and 209.

  10. Figure 4 does not show the number of channels of each convolutional layer, and the statement that the number of channels decreases by a factor is not clear.

  11. Line 257. “obtain prediction features”

It is now “obtain the parameters of pixels density distribution”.

  12. The number of filters is not mentioned.

I introduced the implementation details of the model structure in Section 4.4.1.

  13. It is not clarified why the NLL was used while the application is not a logistic regression model.
  • Why use NLL to evaluate the models?

I have introduced it in Section 4.3.

  • Why not use a logistic regression model?

Reply: The joint density of the image pixels can be decomposed into a product of conditional distributions, one for each pixel. We use discrete distribution models, such as the categorical distribution or the multinomial distribution, to fit the conditional distribution of each pixel. The choice of not using continuous density models, such as the logistic regression model or the Gaussian distribution, is based on the following considerations:

(1) Fitting continuous distribution models would require first dequantizing the discrete pixel values into continuous variables and then learning the continuous distribution model.

(2) When used for compression encoding, it is necessary to discretize the continuous variables and, at the same time, discretize the corresponding continuous density model to calculate the probability mass for each pixel.

Quantization and inverse quantization introduce errors and also increase the complexity of the work.
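The link between this factorization and the NLL metric can be sketched in a few lines. Under the chain rule, p(x) is the product over pixels of p(x_i | x_<i), so the negative log-likelihood in base 2 is a sum of per-pixel terms, and its average is the number of bits per pixel that an ideal entropy coder driven by the model would need. The function name and the hand-written probability tables below are placeholders for illustration; in the paper these conditionals would come from the trained network.

```python
import math


def nll_bits_per_pixel(pixels, cond_probs):
    """Average negative log2-likelihood of an image under a categorical model.

    pixels      : list of integer pixel values in raster-scan order
    cond_probs  : cond_probs[i][v] is the assumed p(x_i = v | x_<i)
    """
    total_bits = 0.0
    for value, probs in zip(pixels, cond_probs):
        total_bits += -math.log2(probs[value])   # ideal code length of this pixel
    return total_bits / len(pixels)


# A model that knows nothing about 8-bit pixels assigns 1/256 to every value,
# so every pixel costs exactly 8 bits and nothing is saved.
uniform = [[1.0 / 256] * 256 for _ in range(4)]
baseline = nll_bits_per_pixel([0, 17, 200, 255], uniform)

# A sharper (placeholder) model on 2-level pixels costs less than 1 bit per pixel.
sharp = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
better = nll_bits_per_pixel([0, 0, 1], sharp)
```

Because the categorical model already assigns a probability mass to each discrete pixel value, no quantization step is needed between the density model and the entropy coder, which is the point made in considerations (1) and (2).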

 

  14. About Figure 5.

The curves labeled as test loss should have been labeled as validation loss. I have modified the figure.

  15. The same note as for 13.

Yes, my response is the same as for question 13. I have explained why NLL is used to evaluate the models in Section 4.3.

Author Response File: Author Response.pdf

Reviewer 4 Report

In this work, the authors present a study focusing on lossless compression of FAST pulsar search data. A model named PixelCNN achieves results that make it one of the best image density estimators (according to the authors).

The paper is in general well presented, and the results could be considered innovative and interesting for a general audience. The presented algorithms could be considered an improvement.

As general observations:

1. Consider improving figure 1. Add markers indicating what to look for, or improve description of the figure. 

Indicate the features of information employed in the figure.

 

2. At the beginning of page 5. As you present the issues with the model, try to bring up details on how such issues could be addressed, or provide a better contrast with its benefits.

3. It seems like at least the beginning of section 4.3.1 should be in the methods section. 

 

4. In section 5. Establish a clear comparison between the different available and similar models. Discuss figures 5-6. Provide better details for discussing the improvement of your model and which is the main contribution of your work.

Check for several typos in the text.

Author Response

  1. Consider improving figure 1. Add markers indicating what to look for, or improve description of the figure.

Indicate the features of information employed in the figure.

Reply: I introduced more image features as motivation for the research, as follows:

In the time-phase subfigure and frequency-phase subfigure of positive samples in the pulsar candidate set, we can intuitively observe a prominent stripe. This is because the phase of the pulsar signal from the same point source radiation is the same, resulting in significant vertical correlation in the images. The center of the period-dispersion subfigure is a bright point that radiates outward in all directions. These image pixels exhibit structural correlations, which can be a key factor in image modeling.

  2. At the beginning of page 5. As you present the issues with the model try to bring up details on how such issues could be addressed. Or provide a better contrast with the benefits of it.

Reply: I agree with your point. Our goal is to expand the receptive field while effectively utilizing global features. In Section 4, the first sentence states, “To expand the receptive field and effectively utilize global image information,” followed by an overview of the proposed method.

  3. It seems like at least the beginning of section 4.3.1 should be in the methods section.

Reply: I introduced the methods in Section 3, and in Section 4.3.1, I provided implementation details, such as parameter configurations.

  4. In section 5. Establish a clear comparison between the different available and similar models. Discuss figures 5-6. Provide better details for discussing the improvement of your model and which is the main contribution of your work.

Reply: Thank you for your suggestions. I have made revisions to Section 5. In this section, we analyzed the spatial correlation features of pulsar candidate images and identified shortcomings in the baseline models. For each issue, we proposed our own solutions and compared them with the baseline models to better highlight the innovation in our work. The specific details are as follows:

 

We analyzed the distinctive features of various subfigures in the pulsar candidate diagnostic images and identified the limitations of the baseline models. In the time-phase subfigure and frequency-phase subfigure of positive samples in the pulsar candidate set, we can intuitively observe a prominent stripe. The center of the period-dispersion subfigure is a bright point that radiates outward in all directions. These image pixels exhibit structural correlations, which can be a key factor in image modeling. However, GMM and STM can only model 1D sequences, and flattening the image disrupts its structural information. PixelCNN uses convolutional layers to extract features, effectively preserving spatial structural characteristics. However, the local nature of convolutional layers makes the PixelCNN model more likely to focus on nearby information and ignore important information at greater distances. To expand the receptive field, PixelCNN must stack multiple convolutional layers. Unfortunately, deepening the network can affect information propagation and lead to issues such as gradient vanishing.

 

We proposed an RCA-PixelCNN model for pulsar candidate image compression. The core module proposed in this model, the Residual Causal Attention block, defines a masked weight matrix to assess the importance of each pixel's position in the image. This block not only breaks the local constraints of convolution operations but also preserves positional dependencies in autoregressive networks. Additionally, the residual connection captures crucial image details, preventing feature information from being overwhelmed by noise. RCA-PixelCNN utilizes multiple Residual Causal Attention layers and Residual Causal layers to expand the information receptive field, effectively capturing the global structural information of the image. We conducted experiments using the HTRU1 dataset to evaluate the RCA-PixelCNN model, calculating its average negative log-likelihood score. The results demonstrate that the negative log-likelihood score of RCA-PixelCNN is better than that of existing prominent density models such as GMM, STM, and PixelCNN.
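The idea of a masked weight matrix combined with a residual connection can be illustrated with a generic causal self-attention step over pixels in raster-scan order. This is a simplified sketch and not the authors' Residual Causal Attention block: it uses the input features directly as queries, keys, and values, has no learned projections, and lets each position attend to itself and to earlier positions only. The function name and the toy feature vectors are assumptions made for the example.

```python
import math


def residual_causal_attention(features):
    """Generic causal attention with a residual connection (illustrative only).

    features: list of N feature vectors (lists of floats), one per pixel,
              ordered in raster-scan order.
    """
    n, dim = len(features), len(features[0])
    outputs = []
    for i in range(n):
        # Masked weights: position i scores only positions 0..i, never later ones.
        scores = [
            sum(a * b for a, b in zip(features[i], features[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        top = max(scores)                                  # stabilise the softmax
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        attended = [
            sum(weights[j] / norm * features[j][k] for j in range(i + 1))
            for k in range(dim)
        ]
        # Residual connection: add the input back so local detail is preserved.
        outputs.append([attended[k] + features[i][k] for k in range(dim)])
    return outputs


feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = residual_causal_attention(feats)

# Changing a later pixel must not change the outputs of earlier pixels.
changed = residual_causal_attention([[1.0, 0.0], [0.0, 1.0], [5.0, -3.0]])
```

Unlike a stack of small convolutions, a single step of this kind lets every position draw on all earlier positions at once, while the mask keeps the autoregressive ordering intact.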

Author Response File: Author Response.pdf
