Article
Peer-Review Record

Performance Analysis of Deep Convolutional Network Architectures for Classification of Over-Volume Vehicles

Appl. Sci. 2023, 13(4), 2549; https://doi.org/10.3390/app13042549
by S. Sofana Reka 1, Venkata Dhanvanthar Murthy Voona 2, Puvvada Venkata Sai Nithish 2, Dornadula Sai Paavan Kumar 2, Prakash Venugopal 2,* and Visvanathan Ravi 2
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 18 December 2022 / Revised: 8 February 2023 / Accepted: 10 February 2023 / Published: 16 February 2023

Round 1

Reviewer 1 Report

Based on the fact that high-volume trucks may cause some safety issues to drivers and other vehicles, it is important to check whether a truck is over volume or not. By carefully collecting and preprocessing some related data, the paper compared the performance of several deep CNN models to classify trucks into different classes. It is concluded that EfficientNet-B3 performs best in terms of several metrics.

The discussed topic is interesting and the approaches are useful. Overall, the paper is well organized and the language is easy to understand. However, there are some places that need further improvement.

1. Some sentences are incomplete, which affects readers' understanding of the paper to some extent. For example, the sentences at lines 90, 117, and 123, among others.

2. The authors state that they used transfer learning to train the models. However, in the experiments it is unclear which weights in the compared CNN models are frozen and which are fine-tuned using the current data. The authors should add some explanation of these details.

3. As shown in Table 4, different models require different input image sizes. I wonder how the collected vehicle images are adjusted so that they are suitable to be fed into each considered model.

4. In conclusion, it is mentioned that the models are evaluated in terms of several metrics such as ROC curve, confusion matrix and so on. But from the results reported in the paper, I have not found any comparative results about confusion matrices and mean square errors of the compared models. Do the authors omit the corresponding results in the paper?

Author Response

Response to Reviewer 1 Comments

 

We would like to thank the reviewer for their valuable comments and for their efforts towards improving our manuscript.

The discussed topic is interesting and the approaches are useful. Overall, the paper is well organized and the language is easy to understand. However, there are some places that need further improvement.

 

 Reviewer Comment 1

Some sentences are incomplete, which affects readers' understanding of the paper to some extent. For example, the sentences at lines 90, 117, and 123, among others.

 

Authors Response

The authors thank the reviewer for the valuable comment. As per the reviewer's comment, the necessary changes have been made in the revised manuscript, and the entire manuscript has been rechecked.

 Reviewer Comment 2

 

The authors state that they used transfer learning to train the models. However, in the experiments it is unclear which weights in the compared CNN models are frozen and which are fine-tuned using the current data. The authors should add some explanation of these details.

 

Authors Response

The authors thank the reviewer for the comments. In search of better performance, we implemented fine-tuning on the deep CNN models discussed in this paper. The number of layers unfrozen and trained can be seen in Table 1. Unfreezing the complete neural network and training it on a small-scale dataset would overfit the model. Considering the depth and complexity of the state-of-the-art models, we unfroze the final quarter of each network and trained these layers on the collected dataset, whereas the frozen layers retain the weights trained on the ImageNet dataset.

Table 1. Fine-tuning of Deep CNN models.

| S. No. | Model | Tot. No. of Layers | Unfrozen Layers |
| --- | --- | --- | --- |
| 1 | MobileNetV2 | 158 | 120 |
| 2 | ResNet50 | 178 | 134 |
| 3 | VGG19 | 25 | 18 |
| 4 | EfficientNetB0 | 240 | 180 |
| 5 | EfficientNetB3 | 387 | 290 |
| 6 | EfficientNetB4 | 477 | 360 |

Tot. No. of Layers – total number of layers, consisting of input, zero-padding, convolutional, and fully connected layers; Unfrozen Layers – each model is trained from this nth layer up to the fully connected layer to predict the output.
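The nth-layer boundary described above can be sketched as follows. This is a minimal illustration, assuming the roughly one-quarter-trainable rule stated in the response; the helper name and fraction parameter are our own, not the authors' code.

```python
def first_trainable_layer(total_layers, trainable_fraction=0.25):
    """Return the index of the first layer left unfrozen when roughly
    the final `trainable_fraction` of a network is fine-tuned.
    The remaining layers stay frozen with their pretrained weights."""
    return total_layers - round(total_layers * trainable_fraction)

# EfficientNetB0 has 240 layers; freezing all but the final quarter
# leaves layers 180..239 trainable, matching Table 1.
boundary = first_trainable_layer(240)
print(boundary)  # 180
```

In a Keras-style workflow, such a boundary would typically be applied by setting `layer.trainable = False` for every layer below it before compiling the model; the exact indices in Table 1 are the authors' choices and only approximately follow this rule for some models.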

 

 

Reviewer Comment 3

 

As shown in Table 4, different models require different input image sizes. I wonder how the collected vehicle images are adjusted so that they are suitable to be fed into each considered model.

 

Authors Response

The authors thank the reviewer for the comment. The size of the collected images was adjusted with the help of the ImageDataGenerator utility. The increase in input dimensions from EfficientNetB0 to EfficientNetB3 and B4 is due to the increase in depth of the EfficientNet neural network from B0 to B4. The deep CNN models were trained on input dimensions of 224, 240, 300, and 380, and the best-performing dimensions are presented in Table 4 (numbering with reference to the manuscript).
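As a sketch of what this resizing step does: Keras's ImageDataGenerator performs the adjustment internally when a target size is requested, so the nearest-neighbour routine below is purely an illustrative stand-in for that behaviour, not the library's code.

```python
import numpy as np

def resize_nearest(img, target_h, target_w):
    """Nearest-neighbour resize of an (H, W, ...) image array, showing
    how inputs of arbitrary size can be mapped to a model's expected
    dimensions (e.g. 224, 240, 300, or 380)."""
    h, w = img.shape[:2]
    # Map each output pixel back to its nearest source pixel.
    rows = np.arange(target_h) * h // target_h
    cols = np.arange(target_w) * w // target_w
    return img[rows][:, cols]

img = np.zeros((480, 640, 3), dtype=np.uint8)
resized = resize_nearest(img, 224, 224)
print(resized.shape)  # (224, 224, 3)
```

In practice the library also offers bilinear and other interpolation modes, which generally give smoother results than nearest-neighbour sampling.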

 

 Reviewer Comment 4

In conclusion, it is mentioned that the models are evaluated in terms of several metrics such as ROC curve, confusion matrix and so on. But from the results reported in the paper, I have not found any comparative results about confusion matrices and mean square errors of the compared models. Do the authors omit the corresponding results in the paper?  

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's suggestion, the necessary modification has been made in the revised manuscript on page 16. These details are mentioned here for your reference.

The MSE (mean squared error) values are updated in Table 4.

 

 

Table 4. Analysis of performance measures for different deep learning models.

| S. No. | Neural Network | Input Dimensions | Test Accuracy | MSE | FLOPs | AUC Score |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | MobileNetV2 | 224 x 224 | 86.71 | 0.056 | 1.3B | 0.9001 |
| 2 | VGG19 | 226 x 226 | 83.38 | 0.081 | 19.6B | 0.875 |
| 3 | ResNet50 | 224 x 224 | 96.01 | 0.017 | 16.4B | 0.9701 |
| 4 | EfficientNetB0 | 240 x 240 | 94.01 | 0.028 | 0.7B | 0.9554 |
| 5 | EfficientNetB3 | 300 x 300 | 96.013 | 0.017 | 1.8B | 0.9700 |
| 6 | EfficientNetB4 | 380 x 380 | 96.34 | 0.014 | 4.2B | 0.9726 |

 

 

 

The 3 x 3 confusion matrix for the best model [EfficientNetB4] is presented in Figure 1, with three rows and three columns corresponding to the three vehicle classes. The rows represent the actual labels and the columns represent the predicted labels. From the confusion matrix, we can see that the EfficientNetB4 model handles non-over-volume goods carriers and non-goods carriers better than over-volume goods carriers.

 

 

Figure 1. The 3 x 3 confusion matrix for the best model [EfficientNetB4]. Over Vol. – over-volume goods carriers; Non-over Vol. – non-over-volume goods carriers; Non-goods – non-goods carriers.
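A 3 x 3 confusion matrix of this kind, with rows as actual labels and columns as predicted labels, can be tallied as in the minimal sketch below; the integer class encoding is our own illustrative assumption, not taken from the manuscript.

```python
def confusion_matrix(y_true, y_pred, n_classes=3):
    """Tally a confusion matrix: rows are actual labels, columns are
    predicted labels. Assumed encoding for illustration:
    0 = over-volume, 1 = non-over-volume, 2 = non-goods carrier."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

# Tiny worked example: one over-volume carrier misclassified
# as non-over-volume; all other samples classified correctly.
m = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2])
print(m)  # [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
```

Libraries such as scikit-learn provide an equivalent `confusion_matrix` function; the hand-rolled version here just makes the row/column convention explicit.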

Author Response File: Author Response.docx

Reviewer 2 Report

The topic of the article is interesting and important. However, the manuscript has the following deficiencies.

- a very poor review of the literature

- I recommend significantly expanding and separating such a section from the Introduction section;

- section 2. Materials and Methods - too general description, i.e. lack of presentation of basic mathematical models;

- there is no precise presentation of the authors' own proposed method;

- even though I don't feel up to checking the English language, it seems to me that the text needs major corrections in this area.

Author Response

Response to Reviewer 2 Comments

 

We would like to thank the reviewer for their valuable comments and for their efforts towards improving our manuscript.

 

The topic of the article is interesting and important. However, the manuscript has the following deficiencies.


 Reviewer Comment 1

A very poor review of the literature

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's suggestion, the literature review has been revised, suitable references have been added, and a more elaborate discussion of the work has been provided. The Introduction and Related Works are now two separate sections in the revised manuscript, starting on page 2.

 Reviewer Comment 2

 I recommend significantly expanding and separating such a section from the Introduction section

 

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's suggestion, the Introduction and Related Works have been separated into two sections in the revised manuscript, starting on page 2.

 Reviewer Comment 3

 Section 2. Materials and Methods - too general description, i.e. lack of presentation of basic mathematical models;

 

Authors Response

The authors thank the reviewer for the valuable comments. In the revised manuscript, the necessary changes have been made and more explanation is provided on pages 3 to 6.

A few sections were added for more clarity and understanding of the work, as mentioned below.

The number of layers unfrozen and trained can be seen in Table 1. Unfreezing the complete neural network and training it on a small-scale dataset would overfit the model. Considering the depth and complexity of the state-of-the-art models, we unfroze the final quarter of each network and trained these layers on the collected dataset, whereas the frozen layers retain the weights trained on the ImageNet [12] dataset.

Table 1. Fine-tuning of Deep CNN models.

| S. No. | Model | Tot. No. of Layers | Unfrozen Layers |
| --- | --- | --- | --- |
| 1 | MobileNetV2 | 158 | 120 |
| 2 | ResNet50 | 178 | 134 |
| 3 | VGG19 | 25 | 18 |
| 4 | EfficientNetB0 | 240 | 180 |
| 5 | EfficientNetB3 | 387 | 290 |
| 6 | EfficientNetB4 | 477 | 360 |

Tot. No. of Layers – total number of layers, consisting of input, zero-padding, convolutional, and fully connected layers; Unfrozen Layers – each model is trained from this nth layer up to the fully connected layer to predict the output.

The proposed methodology begins with data collection and data augmentation of images related to goods carriers, followed by splitting the data into train, test, and validation sets. In the subsequent stage, transfer learning was implemented on the state-of-the-art DCNNs, where the convolutional layers were frozen with ImageNet [12] weights and only the classifier layers (i.e., the fully connected layers) were trained on the collected dataset. Furthermore, fine-tuning was carried out by unfreezing the final quarter of each network for model training. Finally, the best-performing model was chosen based on different performance metrics. Figure 1 provides a pictorial overview of the proposed system.

 Reviewer Comment 4

There is no precise presentation of the authors' own proposed method;

 

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's comments, the entire proposed-method section has been revised in the manuscript from page 3 for more clarity, and more suitable sections have been added along with the results.

 

 Reviewer Comment 5

Even though I don't feel up to checking the English language, it seems to me that the text needs major corrections in this area.

 

Authors Response

The authors thank the reviewer for the comment. As per the reviewer's valuable comments, the entire manuscript has been reworked and checked with English proofreading; all the necessary corrections have been made for readability, and all the changes are shown in the revised manuscript. Thank you for the suggestion.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have addressed my comments in the revised manuscript and I have no further comments at present.

Reviewer 2 Report

The authors' explanations as well as corrections and additions to the text of the manuscript are satisfactory.
