Article
Peer-Review Record

A Transfer Learning and Optimized CNN Based Maritime Vessel Classification System

Appl. Sci. 2023, 13(3), 1912; https://doi.org/10.3390/app13031912
by Mostafa Hamdy Salem 1,*, Yujian Li 2, Zhaoying Liu 1 and Ahmed M. AbdelTawab 3
Reviewer 1:
Reviewer 2:
Reviewer 3:
Submission received: 31 October 2022 / Revised: 5 December 2022 / Accepted: 7 December 2022 / Published: 1 February 2023
(This article belongs to the Special Issue Recent Advances in Deep Learning for Image Analysis)

Round 1

Reviewer 1 Report

This is a transfer learning-based ship detection study that uses well-known methods from the literature together. The method presented by the authors is not very innovative, but it is valuable as an adaptation study.

The subject does not contain innovation in terms of machine learning. In terms of implementation, whether the hybrid approach is really necessary, and how it performs, is open to discussion. The study discusses performance improvement in terms of algorithms, and in this respect its advantages have been revealed. The tables, references, and drawings are sufficient in my opinion.

 

The method proposed in the paper consists of combining and using many known methods. I think it is an ordinary study on deep learning, with no major scientific contribution; it is just another application of DL. The general flow, narration, and presentation of the paper are smooth, and there are no serious glaring flaws. The paper can be accepted in its present form.

 

Author Response

Paper number: applsci-2036554

Paper title: A Transfer Learning and Optimized CNN Based Maritime Vessel Classification System. Authors: Mostafa Hamdy Salem, Yujian Li, Zhaoying Liu, and Ahmed M. AbdelTawab.

The authors would like to thank the area editor and the reviewers for their precious time and invaluable comments. We have carefully addressed all the comments. The corresponding changes and refinements made in the revised paper are summarized in our response below.

Reviewer 1

This is a transfer learning-based ship detection study that uses well-known methods from the literature together. The method presented by the authors is not very innovative, but it is valuable as an adaptation study.

Response 1: We thank the reviewer for their valuable comments. These comments are very constructive and will help us improve the manuscript, specifically in clarifying our methodology and the goal of this paper. We address the reviewer’s concerns in this letter, and the corresponding changes will be made to improve the manuscript.

The subject does not contain innovation in terms of machine learning. In terms of implementation, whether the hybrid approach is really necessary, and how it performs, is open to discussion. The study discusses performance improvement in terms of algorithms, and in this respect its advantages have been revealed. The tables, references, and drawings are sufficient in my opinion.

 Response 2: We thank the reviewer for this positive feedback. The reviewer understood the main goal of our manuscript and the implications of our method. 

The method proposed in the paper consists of combining and using many known methods. I think it is an ordinary study on deep learning, with no major scientific contribution; it is just another application of DL. The general flow, narration, and presentation of the paper are smooth, and there are no serious glaring flaws. The paper can be accepted in its present form.

Response 3: We revised the manuscript to emphasize our contribution. The references are rearranged and listed in order of their appearance in the text, and Figures 2 and 3 are redrawn.

Author Response File: Author Response.docx

Reviewer 2 Report

The paper presents another application of transfer learning. Although the application seems useful and the results are promising, the paper’s organization and writing still require considerable improvement for publication, especially Sections 2 and 3.

 

- Lines 43-46: I would write the introduction about CNNs in a different way to highlight their importance and the types of applications they are used for. Moreover, the motivation that training requires a lot of work is weak; please revise.

- Better organization and categorization of the related work is required. Also, the section is too long; please try to make it shorter and more focused.

- Include citations to the CNN models references.  

- I would suggest referencing more recent articles involving transfer learning; the following articles would provide a good starting point. Also, these papers will help the authors with the proper way of describing and reporting deep learning TL approaches and the performance results, which is lacking in their manuscript.

 

Detection of developmental dysplasia of the hip in X-ray images using deep transfer learning. BMC Med Inform Decis Mak 22, 216 (2022).

On the Automatic Detection and Classification of Skin Cancer Using Deep Transfer Learning. Sensors 2022, 22, 4963. https://doi.org/10.3390/s22134963

Using deep transfer learning to detect scoliosis and spondylolisthesis from x-ray images. PLoS ONE 17(5): e0267851. https://doi.org/10.1371/journal.pone.0267851

 

- The section numbering is wrong: you move from Section 2 to 3.1 immediately. Section 3 should be Materials and Methods and should start with a description of the dataset after introducing Figure 1.

- What is the number of images per category? If imbalanced, then the Matthews correlation coefficient would be a better indicator of accuracy than the F-measure.

- The equation for the accuracy was not included. Also, you are reporting the validation accuracy not the testing accuracy and there is no mention of the training strategy or the data split percentages. 

- Figure 3, not all models require 224x224. For example, Xception requires 299x299. 

- Line 481-482, 

- Grammatical error, line 299. In general section 3.3.A is poorly written. For example, the sentence on line 301 is improper. 

- What is dataset used for pretraining the models? Is it Imagenet? please mention.

- Did the augmentation increase the size of the dataset? or did you discard the originals?

- Typo, line 307, in --> In

- Line 312, mention the figure number rather than the following figure. 

- Equation 4, remove the apostrophe from FP.

- Line 563 97.75 percent --> 97.75%

- Line 564, RLU --> ReLU.

- Line 644, section head, different --> Different. I would put the title as Comparison to Related Works, or Related Literature.

- Line 716, "categorized using cutting-edge equipment", what does this mean and how does it relate to TL!!!

- Table of abbreviations as required by the journal template is missing.

 

Author Response

Paper number: applsci-2036554

Paper title: A Transfer Learning and Optimized CNN Based Maritime Vessel Classification System. Authors: Mostafa Hamdy Salem, Yujian Li, Zhaoying Liu, and Ahmed M. AbdelTawab.

The authors would like to thank the area editor and the reviewers for their precious time and invaluable comments. We have carefully addressed all the comments. The corresponding changes and refinements made in the revised paper are summarized in our response below.

Reviewer 2

The paper presents another application of transfer learning. Although the application seems useful and the results are promising, the paper’s organization and writing still require considerable improvement for publication, especially Sections 2 and 3.

Response 1: We thank the reviewer for their valuable comments. These comments are very constructive and will help us improve the manuscript, specifically in clarifying our methodology and the goal of this paper. Sections 2 and 3 are reorganized and rewritten as suggested.

- Lines 43-46: I would write the introduction about CNNs in a different way to highlight their importance and the types of applications they are used for. Moreover, the motivation that training requires a lot of work is weak; please revise.

Response 2: We added a paragraph about CNNs, highlighting their importance and the types of applications they are used for. The weak “a lot of work” motivation is removed.

- Better organization and categorization of the related work is required. Also, the section is too long; please try to make it shorter and more focused.

Response 3: The related work section is reorganized and shortened as suggested.

- Include citations to the CNN models references.  

Response 4: Citations to the CNN models’ references are included: Refs. [5], [27], [28], [29], [30], [31], [32], [33], and [34].

- I would suggest referencing more recent articles involving transfer learning; the following articles would provide a good starting point. Also, these papers will help the authors with the proper way of describing and reporting deep learning TL approaches and the performance results, which is lacking in their manuscript.

Detection of developmental dysplasia of the hip in X-ray images using deep transfer learning. BMC Med Inform Decis Mak 22, 216 (2022).

On the Automatic Detection and Classification of Skin Cancer Using Deep Transfer Learning. Sensors 2022, 22, 4963. https://doi.org/10.3390/s22134963

Using deep transfer learning to detect scoliosis and spondylolisthesis from x-ray images. PLoS ONE 17(5): e0267851. https://doi.org/10.1371/journal.pone.0267851

Response 5: We rearranged the references of the manuscript; the references are now listed in order of their appearance in the text.

- The section numbering is wrong: you move from Section 2 to 3.1 immediately. Section 3 should be Materials and Methods and should start with a description of the dataset after introducing Figure 1.

Response 6: The Section 3 heading is added: 3. Materials and Methods.

- What is the number of images per category? If imbalanced, then the Matthews correlation coefficient would be a better indicator of accuracy than the F-measure.

Response 7: The number of images per category in our dataset is balanced, so we use the F-measure as the accuracy indicator.
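For illustration, the trade-off the reviewer raises can be made concrete: the sketch below computes both the F1 score and the Matthews correlation coefficient from binary confusion-matrix counts. It is a toy example, not the authors' evaluation code; the counts are invented.

```python
# Illustrative only: F1 score and Matthews correlation coefficient (MCC)
# computed from binary confusion-matrix counts (TP, FP, FN, TN).
import math

def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall; it ignores TN."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mcc(tp, fp, fn, tn):
    """MCC uses all four cells, making it more robust under class imbalance."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Toy balanced case: 90 of each class correct, 10 of each wrong.
print(round(f1_score(90, 10, 10), 3))  # 0.9
print(round(mcc(90, 10, 10, 90), 3))   # 0.8
```

On balanced data the two metrics broadly agree, which is the premise of the response; on imbalanced data they can diverge sharply, which is the reviewer's point.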

- The equation for the accuracy was not included. Also, you are reporting the validation accuracy not the testing accuracy and there is no mention of the training strategy or the data split percentages.

Response 8: The accuracy equation is added as Equation (4), and the data split percentages are mentioned in Section 4.1, with a ratio of 70 to 30.

- Figure 3, not all models require 224x224. For example, Xception requires 299x299. 

- Line 481-482,

Response 9: The resizing is illustrated in Figure 3.
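As an aside on the reviewer's point, the required input resolution does differ per backbone. The sketch below is a hypothetical lookup table (the sizes follow common Keras defaults, e.g. 299x299 for Xception), not the authors' implementation.

```python
# Hypothetical sketch: per-backbone input resolutions for transfer learning.
# The mapping is illustrative; check each backbone's documentation.
INPUT_SIZE = {
    "vgg16": (224, 224),
    "resnet50": (224, 224),
    "inception_v3": (299, 299),
    "xception": (299, 299),  # Xception expects 299x299, not 224x224
}

def resize_for(model_name, default=(224, 224)):
    """Return the (height, width) a given backbone expects."""
    return INPUT_SIZE.get(model_name.lower(), default)

print(resize_for("Xception"))  # (299, 299)
```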

- Grammatical error, line 299. In general section 3.3.A is poorly written. For example, the sentence on line 301 is improper. 

Response 10: We rewrote and modified it.

- What is dataset used for pretraining the models? Is it Imagenet? please mention.

Response 11: Yes, it is ImageNet; this is now mentioned in Section 3.3A.

- Did the augmentation increase the size of the dataset? or did you discard the originals?

Response 12: If a class has fewer than a predefined maximum number of samples, augmented images are created for that class. The augmented images are then merged with the original training images to produce a new training set that has exactly the maximum number of samples in each class, thus creating a balanced training set.
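A minimal sketch of the balancing idea described above, under the assumption that over-represented classes are trimmed to a cap and under-represented ones are topped up with augmented copies (a placeholder "flip" here); the class names, counts, and image stand-ins are illustrative only, not the authors' data or code.

```python
# Hedged sketch: cap-based class balancing with augmentation top-up.
import random

random.seed(0)

def balance(classes, cap):
    """classes: dict mapping class name -> list of image stand-ins."""
    balanced = {}
    for name, images in classes.items():
        if len(images) > cap:
            balanced[name] = images[:cap]          # trim over-represented classes
        else:
            # Top up with augmented copies of randomly chosen originals.
            extra = [("flipped", random.choice(images))
                     for _ in range(cap - len(images))]
            balanced[name] = images + extra
    return balanced

toy = {"cargo": ["img"] * 1484, "cruise": ["img"] * 582}
out = balance(toy, cap=900)
print({k: len(v) for k, v in out.items()})  # every class now holds 900 samples
```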

- Typo, line 307, in --> In

Response 13:  corrected.

- Line 312, mention the figure number rather than the following figure. 

Response 14:  corrected.

- Equation 4, remove the apostrophe from FP.

Response 15:  the apostrophe from FP is removed.

- Line 563 97.75 percent --> 97.75%

Response 16:  corrected.

- Line 564 RLU --? ReLU.

Response 17:   corrected throughout the manuscript. 

- Line 644, section head, different --? Different. I would put the title as Comparison to related works, or related literature. 

Response 18: The title is changed to “Comparison to Related Works”.

- Line 716, "categorized using cutting-edge equipment", what does this mean and how does it relate to TL!!!

Response 19: “Cutting-edge equipment” is changed to “state-of-the-art technology”.

- Table of abbreviations as required by the journal template is missing.

Response 20: We reviewed the journal template; a table of abbreviations is not required.

Author Response File: Author Response.docx

Reviewer 3 Report

  1. The references cited in the manuscript are out of order, for example, [8], [17], [19], etc. Ensure that all the cited references are in order.
  2. There are certain abbreviations without their full forms. For example: ATC. Authors should mention the full forms when they are used for the first time.
  3. SVM full form was mentioned differently at different places in the manuscript.
  4. The existing training models presented in Table 1 do not have references. 
  5. Where is Section 3 starting? Its heading is missing.
  6. Authors have presented the existing models like HPO, PSO, Ensemble Learning etc., Instead the authors are instructed to elaborate on their proposed Methods.
  7. High-level representation of the ship classification system is presented in Figure 1 as the proposed method. It is a generalized representation for any classification problem. Authors are instructed to present their detailed proposed models, methodology.
  8. Authors are instructed to highlight their contributions towards Methodology.

  9. Figures 1, 3, and 6 are not clear; ensure that the citation is mentioned if the figures are taken from other sources.

 

 

Author Response

1- The references cited in the manuscript are out of order. For example, [8] [17] [19] etc., Ensure that all the cited references are in the order. 

 

Response 1: We revised the manuscript to emphasize our contribution. The references are rearranged and listed in order of their appearance in the text.

 

2- There are certain abbreviations without their full forms. For example: ATC. Authors should mention the full forms when they are used for the first time.

 

Response 2: All abbreviations are reviewed, and their full forms are mentioned at first use.

 

3- SVM full form was mentioned differently at different places in the manuscript.

 

Response 3: All abbreviations are reviewed, and the SVM full form is unified.

 

4- The existing training models presented in Table 1 do not have references.

 

Response 4: The references for the existing training models presented in Table 1 are added in Table 1 and throughout the manuscript.

 

5- Where is Section 3 starting? Its heading is missing.

 

Response 5: The Section 3 heading is added (3. Materials and Methods).

 

6- Authors have presented the existing models like HPO, PSO, Ensemble Learning etc., Instead the authors are instructed to elaborate on their proposed Methods.

 

Response 6: We have briefly covered how we apply these three techniques to improve our models: HPO and ensemble learning, with PSO used to further improve performance by optimizing the hyperparameters. The modified model was subjected to all of these techniques.
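To make the PSO step concrete, here is a hedged, minimal particle swarm sketch in which each particle is a candidate hyperparameter value scored by an objective; the objective below is a toy quadratic standing in for validation loss, not the authors' actual model or tuning code.

```python
# Minimal PSO sketch for a 1-D hyperparameter search (illustrative only).
import random

def pso(objective, lo, hi, n_particles=20, iters=60,
        w=0.5, c1=1.5, c2=1.5, seed=1):
    rng = random.Random(seed)
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                         # each particle's personal best
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]   # swarm-wide best

    for _ in range(iters):
        for i in range(n_particles):
            # Velocity update: inertia + pull toward personal and global bests.
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (best_pos[i] - pos[i])
                      + c2 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], lo), hi)  # clamp to bounds
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val

# Toy objective: pretend validation loss is minimized at learning rate 0.01.
best_lr, best_loss = pso(lambda x: (x - 0.01) ** 2, lo=0.0001, hi=0.1)
print(round(best_lr, 4))
```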

7- High-level representation of the ship classification system is presented in Figure 1 as the proposed method. It is a generalized representation for any classification problem. Authors are instructed to present their detailed proposed models, methodology.

 

Response 7: We have added some detail about our proposed model in Figure 1 to illustrate our method.

 

8- Authors are instructed to highlight their contributions towards Methodology.

 

Response 8: We added an explanation of the purpose of each stage and of the effect of each stage on the overall performance.

 

9- Figures 1, 3, and 6 are not clear, ensure that the citation is mentioned if the figures are taken from other sources.

 

Response 9: All Figures are redrawn with high resolution.

 

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

- The authors ignored replying properly to most of my comments. For example, for "'categorized using cutting-edge equipment', what does this mean and how does it relate to TL!!!", their response is not related. Another example: "Also, you are reporting the validation accuracy not the testing accuracy and there is no mention of the training strategy or the data split percentages."; their response addresses only a small part of the comment, and so on. In another comment, "Did the augmentation increase the size of the dataset?", their response of "produce a new training set that has exactly maximum samples images ..." is unclear: what does this mean?

- Duplicating the data through augmentation will lead to data leakage and false, superficial results. Simply stating that the data is balanced and the F1 score was used is ill-informed, as the F1 score is useful for imbalanced datasets.

- The performance results should be based on actual testing and cross-validation strategies. 

Author Response

- What is the number of images per category? If imbalanced, then the Matthews correlation coefficient would be a better indicator of accuracy than the F-measure.

Response 7:

  • The number of images per category in our dataset and the training strategy are illustrated in the table below. We trim the classes with large numbers of training images so that the different classes contribute more equally to the training procedure. As is clear from the training strategy, there is no complete balancing; we only limit the class with a large number of samples.

Class      No. of images   70% Training   Training after trimming   15% Testing   15% Validation
Cargo      2120            1484           900                       318           318
Tankers    1217            851            851                       183           183
Military   1167            817            817                       175           175
Carrier    916             642            642                       137           137
Cruise     832             582            582                       125           125
Total      6252            4376           3792                      938           938

 

- Also, you are reporting the validation accuracy not the testing accuracy and there is no mention of the training strategy or the data split percentages.

Response 8: Yes, we are reporting the validation accuracy. For the training strategy, the data split percentages are mentioned in Section 4.1, with a ratio of 70 for training to 30 for validation and testing, i.e., 15% each.
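The split described above can be sketched as a shuffled index partition. The sample count matches the table, but the code is illustrative rather than the authors' pipeline, and exact per-split counts depend on rounding.

```python
# Illustrative 70/15/15 split by shuffled index partition.
import random

def split_indices(n, train=0.70, val=0.15, seed=42):
    """Return (train, val, test) index lists covering all n samples."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

tr, va, te = split_indices(6252)
print(len(tr), len(va), len(te))  # 4376 937 939
```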

- Did the augmentation increase the size of the dataset? or did you discard the originals?

Response 12:  

Augmentation does not increase the dataset size; we discard the originals. There is no duplication of the data through augmentation.

- Line 716, "categorized using cutting-edge equipment", what does this mean and how does it relate to TL!!!

Response 19: A modern vessel classification system requires fast processing and small data samples. These requirements can be achieved by applying TL.

We rewrote it as:

“In further studies, the proposed method will be improved so that ships in different weather conditions may be classified for the modern vessel classification system that requires fast processing and small data samples.”

 

Author Response File: Author Response.docx

Round 3

Reviewer 2 Report

- Reporting the validation accuracy as opposed to the testing accuracy is misleading. This is especially true considering that in Figure 7, it is clear that there is overfitting in the model training. Moreover, using the hold-out method is insufficient, and the authors should have reported the testing results using cross-validation.

 

- The authors give contradicting statements regarding the dataset. In the first review they mentioned that " If some classes have less than predefined maximum samples number, then augmented images are created for that class. To create creating a balanced training set, the augmented images is then merged with the original trained images to produce a new training set that has exactly maximum samples images in each class thus creating a balanced training set." and when I pressed them on the issue they replied with "Augmentation does not increase the dataset size; we discard the originals. There is no duplication of the data through augmentation" and "...We trim the training data images with large numbers to make different classes contribute equally to the training procedure..."

 

- Line 276: ImageNet is a dataset, not models as mentioned.

Author Response

1- Reporting the validation accuracy as opposed to the testing accuracy is misleading. This is especially true considering that in Figure 7, it is clear that there is overfitting in the model training. Moreover, using the hold-out method is insufficient, and the authors should have reported the testing results using cross-validation.

Response 1: We followed the same procedure as the reference we compare against in order to standardize the testing process. However, in future work, we will consider testing using five-fold cross-validation.
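The five-fold scheme mentioned above can be sketched as an index generator in which each fold serves once as the held-out test set while the remainder trains; this is illustrative, not the authors' code.

```python
# Illustrative k-fold cross-validation index generator.
def k_fold_indices(n, k=5):
    """Yield (train_idx, test_idx) pairs covering all n samples."""
    fold_size, rem = divmod(n, k)
    start = 0
    for fold in range(k):
        # Early folds absorb the remainder so sizes differ by at most one.
        stop = start + fold_size + (1 if fold < rem else 0)
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n))
        yield train_idx, test_idx
        start = stop

folds = list(k_fold_indices(10, k=5))
print([t for _, t in folds])  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```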

- The authors give contradicting statements regarding the dataset. In the first review they mentioned that " If some classes have less than predefined maximum samples number, then augmented images are created for that class. To create creating a balanced training set, the augmented images is then merged with the original trained images to produce a new training set that has exactly maximum samples images in each class thus creating a balanced training set." and when I pressed them on the issue they replied with "Augmentation does not increase the dataset size; we discard the originals. There is no duplication of the data through augmentation" and "...We trim the training data images with large numbers to make different classes contribute equally to the training procedure..."

Response 2: Because some classes contain many samples, which may affect the training process, we tried to balance them without making them exactly equal. First, we determined the maximum sample size before performing augmentation. During augmentation, we kept the sample number the same, since we discarded the originals. There is no overfitting, since the data is not duplicated and the augmentation amounts to only a few percent.

 

3- Line 276: ImageNet is a dataset, not models as mentioned.

Response 3: Corrected to “dataset”; in the rest of the manuscript it is stated as a dataset.

 

Author Response File: Author Response.docx
