Peer-Review Record

Multimodal Biometrics Recognition Using a Deep Convolutional Neural Network with Transfer Learning in Surveillance Videos

Computation 2022, 10(7), 127; https://doi.org/10.3390/computation10070127
by Hsu Mon Lei Aung 1, Charnchai Pluempitiwiriyawej 2,*, Kazuhiko Hamamoto 3 and Somkiat Wangsiripitak 4
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 April 2022 / Revised: 31 May 2022 / Accepted: 6 June 2022 / Published: 21 July 2022

Round 1

Reviewer 1 Report

Table 4 should be kept on a single page. Please improve the quality of Figure 1 and Figure 4. This article seems to me to be of a good standard, and it is pleasant to read and understand. I recommend publication after revision.

Author Response

We have adjusted the layout so that the whole of Table 4 is on page 11, and we have replaced Figures 1 and 4 with higher-resolution images.

Reviewer 2 Report

This article proposes a multimodal biometrics recognition network for surveillance videos, based on a deep convolutional neural network with transfer learning. In my opinion, the results are interesting and encouraging in that transfer learning works well for multimodal tasks. However, the following questions need to be addressed:

  1. The fourth affiliation is not labelled in the author list.
  2. There are many grammar mistakes in the manuscript; the English needs to be improved.
  3. The novelty is not clear in the manuscript. What is the major difference compared with other methods or networks? Other studies also use transfer learning and multimodal analysis. The authors should explain their novelty and advantages more clearly in the manuscript.
  4. The method part is hard to follow. It would be better to add more details about the network and how it works.
  5. The authors show that the proposed network is smaller than other networks and takes less training time, which is expected since the network is based on transfer learning. What about the inference time of the proposed method compared with other networks? The prediction time of the pre-trained base model should also be considered; otherwise, it is not fair to compare the training time of transfer learning with that of other models.
  6. In Table 7, the authors compare accuracy with other multimodal methods. The proposed transfer-learning network shows accuracy comparable to the highest in the table. It would be better to add some discussion of the advantages of the proposed model over Derbel's method, which shows the highest accuracy.

Author Response

1. We have added the label for the fourth affiliation.
2. We have revised the English grammar throughout the manuscript to the best of our ability.
3. Our proposed method uses a simplified SOBS method; the GE images are then extracted from horizontally centre-aligned and size-normalized silhouette images. This overcomes the misalignment problem caused by inaccurate human body region segmentation, which exists in other GE-based methods for gait recognition.
        Secondly, most previous studies performed LR face recognition using HR face images by synthetically generating the corresponding LR images and a mapping function. We instead use the RetinaFace detection technique to directly detect the LR face region (no HR face images needed) as an input to the recognition process, which reduces processing time and complexity.
   These two explanations are in the last paragraph on page 14.
4. We have revised and added more details on our proposed method in Section 3.
5. An explanation of inference time has been added on page 11, lines 354-358. Since the number of parameters and the complexity of our model's prediction process are also small, as shown in Table 3 on page 11 and the explanation below it, the inference time of our model is small as well.
6. The discussion has been added from the bottom of page 12 to the beginning of page 13.
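For readers unfamiliar with the construction mentioned in reply 3, a gait energy (GE) image is typically obtained by averaging aligned, size-normalized binary silhouettes over a gait cycle. The sketch below is a minimal illustration of that idea only: the function names and the simple centroid-based horizontal alignment are our own assumptions, not the authors' SOBS-based implementation.

```python
import numpy as np

def center_align(sil):
    """Shift a binary silhouette so its horizontal centroid sits at the image centre.
    (Illustrative stand-in for the paper's horizontal centre-alignment step.)"""
    cols = sil.sum(axis=0)                      # foreground mass per column
    if cols.sum() == 0:
        return sil                              # empty frame: nothing to align
    centroid = (cols * np.arange(sil.shape[1])).sum() / cols.sum()
    shift = sil.shape[1] // 2 - int(round(centroid))
    return np.roll(sil, shift, axis=1)

def gait_energy_image(silhouettes):
    """Average a stack of aligned, size-normalized silhouettes over a gait cycle."""
    frames = np.stack([center_align(s) for s in silhouettes]).astype(float)
    return frames.mean(axis=0)                  # pixel value = fraction of frames covered
```

Pixels that are foreground in every frame take the value 1.0, yielding the familiar grey-level GE template that is robust to noise in any single silhouette.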

 

Reviewer 3 Report

The paper presents an interesting study, but some revisions, such as the following, are recommended:

- in lines 233-244, please revise "The details of the proposed CNN model architecture is described in Table 1."; the meaning is unclear;

- the quality of some figures (e.g., Figures 1, 2, 3, 4, etc.) should be improved at the text level;

- please be consistent when referring to tables (e.g., "Table 4" in line 365 vs. "table 5" in line 367, etc.);

- in lines 357-358: "4.3.1.1 Unimodal Recognition using Multi-class Classification Machine Learning Algorithms"; numbering a paragraph this way is appropriate only when others follow (e.g., 4.3.1.2, and so on); please revise;

- please pay attention to capitalization and spacing throughout the paper (e.g., "Deep 50 learning (DL),   new subcategory..." in lines 50-51, "Our proposed method is  a combination of..." in line 93, " Although The classification..." in line 381, etc.);

- the conclusion section could be extended.

 

Author Response

1. We have changed it to "The detailed architecture of our proposed CNN model is shown in Table 1." (lines 237-238).
2. We have replaced all figures with higher-resolution versions.
3. We have changed all table references to "Table #" with a capital letter.
4. Heading 4.3.1.1 has been removed; its content is now simply a paragraph under Section 4.3.1.
5. We have adjusted the spacing and capitalization accordingly throughout the paper. Thank you.
6. We have revised the conclusion section.

 

Round 2

Reviewer 2 Report

I have no other comments on this manuscript.
