Article
Peer-Review Record

Performance Analysis of Deep Convolutional Network Architectures for Classification of Over-Volume Vehicles

Appl. Sci. 2023, 13(4), 2549; https://doi.org/10.3390/app13042549
by S. Sofana Reka 1, Venkata Dhanvanthar Murthy Voona 2, Puvvada Venkata Sai Nithish 2, Dornadula Sai Paavan Kumar 2, Prakash Venugopal 2,* and Visvanathan Ravi 2
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 18 December 2022 / Revised: 8 February 2023 / Accepted: 10 February 2023 / Published: 16 February 2023

Round 1

Reviewer 1 Report

Based on the fact that high-volume trucks may cause some safety issues to drivers and other vehicles, it is important to check whether a truck is over volume or not. By carefully collecting and preprocessing some related data, the paper compared the performance of several deep CNN models to classify trucks into different classes. It is concluded that EfficientNet-B3 performs best in terms of several metrics.

The discussed topic is interesting and the approaches are useful. Overall, the paper is well organized and the language is easy to understand. However, there are some places that need further improvement.

1. Some sentences are incomplete, which affects readers' understanding of the paper to some extent. For example, the sentences at lines 90, 117, and 123, among others.

2. The authors state that they used transfer learning to train the models. However, in the experiments it is unclear which weights in the compared CNN models are frozen and which are fine-tuned using the current data. The authors should add some explanation of these details.

3. As shown in Table 4, different models require different input image sizes. I wonder how the collected vehicle images are adjusted so that they are suitable to be fed into each considered model.

4. In conclusion, it is mentioned that the models are evaluated in terms of several metrics such as ROC curve, confusion matrix and so on. But from the results reported in the paper, I have not found any comparative results about confusion matrices and mean square errors of the compared models. Do the authors omit the corresponding results in the paper?

Author Response

Response to Reviewer 1 Comments

 

We would like to thank the reviewer for their valuable comments and for their efforts towards improving our manuscript.

The discussed topic is interesting and the approaches are useful. Overall, the paper is well organized and the language is easy to understand. However, there are some places that need further improvement.

 

 Reviewer Comment 1

Some sentences are incomplete, which affects readers' understanding of the paper to some extent. For example, the sentences at lines 90, 117, and 123, among others.

 

Authors Response

The authors thank the reviewer for the valuable comment. As per the reviewer's comment, the necessary changes have been made in the revised manuscript, and the entire manuscript has been rechecked.

 Reviewer Comment 2

 

The authors state that they used transfer learning to train the models. However, in the experiments it is unclear which weights in the compared CNN models are frozen and which are fine-tuned using the current data. The authors should add some explanation of these details.

 

Authors Response

The authors thank the reviewer for the comments. In search of better performance, we implemented fine-tuning on the deep CNN models discussed in this paper. The number of layers unfrozen and trained can be seen in Table 1. Unfreezing the complete neural network and training it on a small-scale dataset would overfit the model. Considering the depth and complexity of the state-of-the-art models, we unfroze the final quarter of each network and trained these layers on the collected dataset, whereas the frozen layers retain the weights trained on the ImageNet dataset.

Table 1. Fine-tuning of Deep CNN models.

| S. No. | Model | Tot. No. of Layers | Unfrozen Layers |
| --- | --- | --- | --- |
| 1 | MobileNetV2 | 158 | 120 |
| 2 | ResNet50 | 178 | 134 |
| 3 | VGG19 | 25 | 18 |
| 4 | EfficientNetB0 | 240 | 180 |
| 5 | EfficientNetB3 | 387 | 290 |
| 6 | EfficientNetB4 | 477 | 360 |

Tot. No. of Layers – total number of layers, consisting of input, zero-padding, convolutional, and fully connected layers; Unfrozen Layers – each model is trained from this nth layer up to the fully connected layer to predict the output.
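The nth-layer boundary described above can be sketched as follows. This is a minimal illustration, assuming the roughly one-quarter-trainable rule stated in the response; the helper name and fraction parameter are our own, not the authors' code.

```python
def first_trainable_layer(total_layers, trainable_fraction=0.25):
    """Return the index of the first layer left unfrozen when roughly
    the final `trainable_fraction` of a network is fine-tuned.
    The remaining layers stay frozen with their pretrained weights."""
    return total_layers - round(total_layers * trainable_fraction)

# EfficientNetB0 has 240 layers; freezing all but the final quarter
# leaves layers 180..239 trainable, matching Table 1.
boundary = first_trainable_layer(240)
print(boundary)  # 180
```

In a Keras-style workflow, such a boundary would typically be applied by setting `layer.trainable = False` for every layer below it before compiling the model; the exact indices in Table 1 are the authors' choices and only approximately follow this rule for some models.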

 

 

Reviewer Comment 3

 

As shown in Table 4, different models require different input image sizes. I wonder how the collected vehicle images are adjusted so that they are suitable to be fed into each considered model.

 

Authors Response

The authors thank the reviewer for the comment. The size of the collected images was adjusted with the help of the ImageDataGenerator utility. The increase in input dimensions from EfficientNetB0 to EfficientNetB3 and B4 is due to the increase in depth of the EfficientNet neural network from B0 to B4. The deep CNN models were trained on input dimensions of 224, 240, 300, and 380, and the best-performing dimensions are presented in Table 4 (numbering with reference to the manuscript).
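As a sketch of what this resizing step does: Keras's ImageDataGenerator performs the adjustment internally when a target size is requested, so the nearest-neighbour routine below is purely an illustrative stand-in for that behaviour, not the library's code.

```python
import numpy as np

def resize_nearest(img, target_h, target_w):
    """Nearest-neighbour resize of an (H, W, ...) image array, showing
    how inputs of arbitrary size can be mapped to a model's expected
    dimensions (e.g. 224, 240, 300, or 380)."""
    h, w = img.shape[:2]
    # Map each output pixel back to its nearest source pixel.
    rows = np.arange(target_h) * h // target_h
    cols = np.arange(target_w) * w // target_w
    return img[rows][:, cols]

img = np.zeros((480, 640, 3), dtype=np.uint8)
resized = resize_nearest(img, 224, 224)
print(resized.shape)  # (224, 224, 3)
```

In practice the library also offers bilinear and other interpolation modes, which generally give smoother results than nearest-neighbour sampling.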

 

 Reviewer Comment 4

In conclusion, it is mentioned that the models are evaluated in terms of several metrics such as ROC curve, confusion matrix and so on. But from the results reported in the paper, I have not found any comparative results about confusion matrices and mean square errors of the compared models. Do the authors omit the corresponding results in the paper?  

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's suggestion, the necessary modification has been made in the revised manuscript on page 16. These details are mentioned here for your reference.

The MSE (mean squared error) values are updated in Table 4.

 

 

Table 4. Analysis of performance measures for different deep learning models.

| S. No. | Neural Network | Input Dimensions | Test Accuracy | MSE | FLOPs | AUC Score |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | MobileNetV2 | 224 x 224 | 86.71 | 0.056 | 1.3B | 0.9001 |
| 2 | VGG19 | 226 x 226 | 83.38 | 0.081 | 19.6B | 0.875 |
| 3 | ResNet50 | 224 x 224 | 96.01 | 0.017 | 16.4B | 0.9701 |
| 4 | EfficientNetB0 | 240 x 240 | 94.01 | 0.028 | 0.7B | 0.9554 |
| 5 | EfficientNetB3 | 300 x 300 | 96.013 | 0.017 | 1.8B | 0.9700 |
| 6 | EfficientNetB4 | 380 x 380 | 96.34 | 0.014 | 4.2B | 0.9726 |

 

 

 

The 3 x 3 confusion matrix for the best model [EfficientNetB4] is presented in Figure 1, with three rows and three columns corresponding to the three vehicle classes. The rows represent the actual labels and the columns represent the predicted labels. From the confusion matrix, we can see that the EfficientNetB4 model handles non-over-volume goods carriers and non-goods carriers better than over-volume goods carriers.

 

 

Figure 1. The 3 x 3 confusion matrix for the best model [EfficientNetB4]. Over Vol. – over-volume goods carriers; Non-over Vol. – non-over-volume goods carriers; Non-goods – non-goods carriers.
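A 3 x 3 confusion matrix of this kind, with rows as actual labels and columns as predicted labels, can be tallied as in the minimal sketch below; the integer class encoding is our own illustrative assumption, not taken from the manuscript.

```python
def confusion_matrix(y_true, y_pred, n_classes=3):
    """Tally a confusion matrix: rows are actual labels, columns are
    predicted labels. Assumed encoding for illustration:
    0 = over-volume, 1 = non-over-volume, 2 = non-goods carrier."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

# Tiny worked example: one over-volume carrier misclassified
# as non-over-volume; all other samples classified correctly.
m = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2])
print(m)  # [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
```

Libraries such as scikit-learn provide an equivalent `confusion_matrix` function; the hand-rolled version here just makes the row/column convention explicit.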

Author Response File: Author Response.docx

Reviewer 2 Report

The topic of the article is interesting and important. However, the manuscript has the following deficiencies.

- a very poor review of the literature

- I recommend significantly expanding and separating such a section from the Introduction section;

- section 2. Materials and Methods - too general description, i.e. lack of presentation of basic mathematical models;

- there is no precise presentation of the authors' own proposed method;

- even though I don't feel up to checking the English language, it seems to me that the text needs major corrections in this area.

Author Response

Response to Reviewer 2 Comments

 

We would like to thank the reviewer for their valuable comments and for their efforts towards improving our manuscript.

 

The topic of the article is interesting and important. However, the manuscript has the following deficiencies.


 Reviewer Comment 1

A very poor review of the literature

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's suggestion, the literature review has been revised, suitable references have been added, and a more elaborate discussion of the work has been provided. The Introduction and Related Works are now two separate sections in the revised manuscript, starting on page 2.

 Reviewer Comment 2

 I recommend significantly expanding and separating such a section from the Introduction section

 

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's suggestion, the Introduction and Related Works have been separated into two sections in the revised manuscript, starting on page 2.

 Reviewer Comment 3

 Section 2. Materials and Methods - too general description, i.e. lack of presentation of basic mathematical models;

 

Authors Response

The authors thank the reviewer for the valuable comments. In the revised manuscript, the necessary changes have been made and more explanation is provided on pages 3 to 6.

A few sections were added for more clarity and understanding of the work, as mentioned below.

The number of layers unfrozen and trained can be seen in Table 1. Unfreezing the complete neural network and training it on a small-scale dataset would overfit the model. Considering the depth and complexity of the state-of-the-art models, we unfroze the final quarter of each network and trained these layers on the collected dataset, whereas the frozen layers retain the weights trained on the ImageNet [12] dataset.

Table 1. Fine-tuning of Deep CNN models.

| S. No. | Model | Tot. No. of Layers | Unfrozen Layers |
| --- | --- | --- | --- |
| 1 | MobileNetV2 | 158 | 120 |
| 2 | ResNet50 | 178 | 134 |
| 3 | VGG19 | 25 | 18 |
| 4 | EfficientNetB0 | 240 | 180 |
| 5 | EfficientNetB3 | 387 | 290 |
| 6 | EfficientNetB4 | 477 | 360 |

Tot. No. of Layers – total number of layers, consisting of input, zero-padding, convolutional, and fully connected layers; Unfrozen Layers – each model is trained from this nth layer up to the fully connected layer to predict the output.

The proposed methodology begins with data collection and data augmentation of images related to goods carriers, followed by splitting the data into train, test, and validation sets. In the subsequent stage, transfer learning was implemented on the state-of-the-art DCNNs, where the convolutional layers were frozen with ImageNet [12] weights and only the classifier layers (i.e., the fully connected layers) were trained on the collected dataset. Furthermore, fine-tuning was carried out by unfreezing the final quarter of each network for model training. Finally, the best-performing model was chosen based on different performance metrics. Figure 1 provides a pictorial overview of the proposed system.

 Reviewer Comment 4

There is no precise presentation of the authors' own proposed method;

 

Authors Response

The authors thank the reviewer for the valuable comments. As per the reviewer's comments, the entire proposed-method section has been revised in the manuscript from page 3 for more clarity, and more suitable sections have been added along with the results.

 

 Reviewer Comment 5

Even though I don't feel up to checking the English language, it seems to me that the text needs major corrections in this area.

 

Authors Response

The authors thank the reviewer for the comment. As per the reviewer's valuable comments, the entire manuscript has been reworked and checked with English proofreading; all the necessary corrections have been made for readability, and all the changes are shown in the revised manuscript. Thank you for the suggestion.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have addressed my comments in the revised manuscript and I have no further comments at present.

Reviewer 2 Report

The authors' explanations as well as corrections and additions to the text of the manuscript are satisfactory.
