Article
Peer-Review Record

Classification of Malicious URLs by CNN Model Based on Genetic Algorithm

Appl. Sci. 2022, 12(23), 12030; https://doi.org/10.3390/app122312030
by Tiefeng Wu, Yunfang Xi *, Miao Wang and Zhichao Zhao
Submission received: 21 October 2022 / Revised: 21 November 2022 / Accepted: 22 November 2022 / Published: 24 November 2022

Round 1

Reviewer 1 Report

The proposed solution is very simple and trivial. Subsections 2.1.2 and 2.2 are not needed; in particular, the formulas in subsection 2.1.2 are not explained clearly. The experiments are very small.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

This work proposes a convolutional neural network model optimized with a genetic algorithm for the detection of malicious URLs, and the authors evaluate the proposed model on a URL dataset. The topic is interesting, but the contribution of this work is not clear. For this reason, some issues need to be addressed.

1) The authors should clearly state the originality of the work and the difference with other research works.

2) In Section 3, the authors do not provide details about the dataset. I also recommend that the authors add a link to a public repository with the dataset and the code of the proposed model in the revised version of the manuscript.

3) The experimental evaluation of the performance of the proposed model is the most important part of the research. However, the authors do not provide an experimental comparison with other research works or with models by other researchers.

4) The evaluation of the proposed model was based on the accuracy and loss metrics. Why? The authors need to explain this choice.

5) The authors should provide a list of concrete conclusions and contributions in a conclusion section. Finally, the authors could also outline some future extensions of this work.
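
As an illustration of comment 4, the sketch below shows why metrics beyond accuracy are often requested for a malicious-versus-benign classifier: accuracy alone can look good even when most malicious samples are missed. This is a minimal, standard-library sketch; the function name `binary_metrics` and the convention that label 1 means "malicious" are assumptions made here and do not come from the manuscript.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    `positive` is the label treated as the malicious class (an assumption
    of this sketch, not something stated in the reviewed paper).
    """
    pairs = list(zip(y_true, y_pred))
    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

With toy labels in which 3 of 10 URLs are malicious and a classifier flags only one of them (and raises no false alarms), accuracy is 0.8 while recall is only about 0.33, which is the kind of gap a justification of the metric choice would need to address.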

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

This paper mainly compares malicious URLs with benign URLs, visually observes the data distribution, and reduces the dimension of the features with a genetic algorithm to extract 20 features. A convolutional neural network is then used to classify and identify these 20 features.

For the identification of malicious URLs, combined with the currently popular machine learning algorithms to train the model, a convolutional neural network model based on genetic algorithm optimization is proposed.

Compared with other published material, the combination of a CNN (Convolutional Neural Network) with a genetic algorithm is what this work adds to the subject.

However, the authors should consider whether this methodology can be implemented in real time.

Algorithms other than CNN could be analysed at the next stage.
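
The feature-reduction step summarized in this report (a genetic algorithm that narrows the feature set to 20 features before CNN classification) could be sketched roughly as follows. This is not the authors' implementation: the encoding (a tuple of k distinct feature indices), tournament selection, union-based crossover, swap mutation, elitism, and every name and default value below are assumptions chosen for illustration. The `fitness` callable is a hypothetical placeholder; in a setting like the paper's it might be the validation accuracy of a classifier trained on the selected columns.

```python
import random


def ga_select_features(n_features, k, fitness, pop_size=30, generations=40,
                       mutation_rate=0.2, seed=0):
    """Pick k of n_features indices with a simple genetic algorithm.

    `fitness` maps a sorted tuple of k indices to a score (higher is better).
    All operators and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Each individual is a sorted tuple of k distinct feature indices.
    population = [tuple(sorted(rng.sample(range(n_features), k)))
                  for _ in range(pop_size)]

    def tournament():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            # Crossover: draw k indices from the union of both parents.
            pool = sorted(set(p1) | set(p2))
            child = set(rng.sample(pool, k))
            # Mutation: drop one selected index and add a random index
            # that is not currently in the child.
            if rng.random() < mutation_rate and k < n_features:
                child.remove(rng.choice(sorted(child)))
                unused = [i for i in range(n_features) if i not in child]
                child.add(rng.choice(unused))
            children.append(tuple(sorted(child)))
        population = children
        best = max(population, key=fitness)
    return best
```

A call such as `ga_select_features(n_features=60, k=20, fitness=my_score)` (where `my_score` and the feature count 60 are hypothetical) returns 20 distinct column indices. Because each fitness evaluation may involve training a model, the cost of this loop is one practical consideration behind the reviewer's question about real-time use.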

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report

The authors propose a model of convolutional neural network based on genetic algorithm optimization combined with the machine learning algorithm for the identification of malicious URLs. The paper seems original and interesting results are presented. This reviewer’s suggestions, questions, and concerns for the paper are listed below:

1.       The abstract does not highlight the specifics of the proposed research or findings. The abstract is a little noisy and should be more straightforward for the reader regarding the proposed method and its motivation. The abstract should present the main points for the readers, such as the main contribution, the proposed method, the main problem, the obtained results, the benchmark test, and the comparative methods. The quantitative results should be clearly defined and reported in the abstract.

2.       Please, provide a paragraph with three to five clear positive impacts of the proposed algorithm.

3.       It is not clear why an intelligent genetic algorithm is used for the problem in focus. Is the search space too large? Has a mathematical model not been derived?

4.       Why is there a need to optimize the deep network, given that such networks are self-sufficient in adapting to any level of complexity?

5.       In general, the literature review is not sufficient. It is more of the type “researcher X did Y” rather than an authoritative synthesis assessing the current state-of-the-art. Where do we stand today? What approaches are there in the literature to model the problem? What are the main differences between them? What are their weaknesses and strengths?

6.       There is not a clear categorization of related works. Intelligent optimization methods are categorized into nine different classes according to the papers entitled “Comparative Assessment of Light-Based Intelligent Search and Optimization Algorithms” and “Plant intelligence based metaheuristic optimization algorithms”, etc. These papers should be considered by citing them in order to prevent confusion and to briefly show different methods and their feasibility for applying to the studied subject.

7.       The authors are recommended to refine the technical contribution of this work by stating the problem that arises and the solution approach, including the scope, the significance of the research, and the potential outcomes.

8.       How far is the proposed algorithm from the existing methods? There is no comparative study at all.

9.       Are the simulation results obtained under equal conditions? This is not discussed.

10.   The paper does not describe the running environment, including software and hardware. The analysis and configurations of the experiments should be presented in detail for reproducibility. This makes it convenient for other researchers to redo your experiments and makes your work easier to accept.

11.   Add further details on how the simulations were conducted. Similarly, resource and system characteristics could be added to the tables for clarity. A table with the parameter settings for the experimental results and analysis should be included in order to describe them clearly.

12.   For the experimental results, it will be good to present a statistical test in the comparison of the results with other published methods. This can help to support the claim on improved results obtained with the selection methods studied.

13.   The intelligent genetic algorithm used in the paper is stochastic, so different results may be obtained in different runs. That is why standard deviations should be given. Statistical test results should also be given. Furthermore, a t-distribution test may be used to explain the results. However, parametric tests should be avoided unless it can be shown that the samples are drawn from normal populations; otherwise, a non-parametric test should be performed. Of course, in this case the number of runs should be increased. Using the parametric t-test when the assumption of a normal population is not satisfied is dangerous and can lead to inference errors. For smaller samples, the test relies even more heavily on this assumption. A valid test should be performed when the samples are not necessarily drawn from a normal population.

14.   Findings should respond to the purpose of the study, and should be presented systematically.

15.   Discuss your findings in terms of what was previously known and not known about the focus of your research. Did your findings cohere and/or contrast with previous research on similar groups, locations, people, etc.?

16.   To have an unbiased view in the paper, discuss the limitations of your study. These limitations can be organized around simple distinctions of the choices you made in your study regarding who, what, where, when, why, and how. Show the advantages, disadvantages, and weaknesses of the studied works. Discuss your position on the generalizability of your results. Clarifying the study’s limitations allows the readers to better understand under which conditions the results should be interpreted. A clear description of the limitations of a study also shows that the researcher has a holistic understanding of his/her study. However, the authors fail to demonstrate this in their paper. The authors should clarify the pros and cons of the methods. What are the limitation(s) of the methodology(ies) adopted in this work? Please indicate practical advantages, and discuss research limitations.

17.   Some more recommendations and conclusions should be discussed in light of the experimental results. The Conclusion section is weak. Furthermore, there is no discussion section about the results. The conclusion section needs significant revisions. It should briefly describe the findings of the study and some more directions for further research. The authors should describe academic implications, major findings, shortcomings, and directions for future research in the conclusion section. The conclusion in its current form is confusing in general. The Conclusion section would be better titled "Conclusions and Future Research", and it is strongly suggested to include the future research for this manuscript. What will happen next? What should we expect from future papers? So rewrite it and consider the following comments:

- Highlight your analysis and reflect only the important points for the whole paper.

- Mention the benefits

- Mention the implication in the last of this section.
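
Comment 13 above asks for standard deviations over repeated runs and for a non-parametric test when normality cannot be shown. One way this could look is sketched below, using a two-sided Mann-Whitney U test with a normal approximation, written with the standard library only. The choice of this particular test, the function names, and the absence of a tie correction in the variance are assumptions of the sketch; the reviewer does not name a specific test, and the normal approximation is rough for small numbers of runs.

```python
import math
import statistics


def summarize_runs(scores):
    """Mean and sample standard deviation of per-run scores (needs >= 2 runs)."""
    return statistics.mean(scores), statistics.stdev(scores)


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p). Ties receive average ranks; the variance is not
    tie-corrected, so p is approximate when many ties are present.
    """
    # Pool both samples, remembering which group each value came from.
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        # Extend j over a block of tied values.
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for t in range(i, j + 1):
            ranks[t] = avg_rank
        i = j + 1

    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u = min(u_a, n1 * n2 - u_a)

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p
```

Here `a` and `b` would be, for example, the test accuracies of two methods over independent runs with different random seeds. Reporting `summarize_runs` for each method alongside the test result covers both the dispersion and the significance that the comment requests.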

 

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

It looks like some improvements have been made to the paper in terms of presentation quality. However, further explanation of the parameters is still needed in the calculation of the fitness function. It would be good to include a nomenclature for all symbols used.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

After careful reading, the contribution of the revised paper is now clear. Furthermore, the comments of the previous review have been addressed.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report

Thank you for the revised version. All of the concerns, questions, and suggestions have been correctly addressed. This reviewer thinks that the manuscript can now be accepted in its current form.

Author Response

Please see the attachment.

Author Response File: Author Response.docx
