Article
Peer-Review Record

A Streamlined Framework of Metamorphic Malware Classification via Sampling and Parallel Processing

Electronics 2023, 12(21), 4427; https://doi.org/10.3390/electronics12214427
by Jian Lyu 1, Jingfeng Xue 1, Weijie Han 2, Qian Zhang 1,* and Yufen Zhu 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 15 September 2023 / Revised: 15 October 2023 / Accepted: 24 October 2023 / Published: 27 October 2023
(This article belongs to the Special Issue AI-Driven Network Security and Privacy)

Round 1

Reviewer 1 Report

The paper needs thorough proofreading, especially regarding reference management. The citations do not appear properly. For example, on page 2, line 54, the citation shows up as 278.

In the results that compare the proposed model with other models, emphasis needs to be placed on the performance of the other models as well, and on why some of them achieve higher accuracy than the proposed model.

The organization of the paper is very clear and is well written. No major changes are needed.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

In this paper, the authors study the metamorphic malware classification problem and, on that basis, design a streamlined framework for fast processing and classification on personal computers. In particular, efficiency is improved by devising a sampling method and a parallel training and inference method. The two proposed methods benefit from the binary property of malware, especially metamorphic malware implemented as opcode sequences. Finally, the authors run experiments on testbeds and compare against 4 baselines in terms of feature size, feature set, accuracy, and time cost. Some experiments show that the proposed methods outperform the baselines in some aspects.

 

The merits of this work

1.     The authors propose a sampling method that reduces the model workload by using a lightweight eigenvector, which lowers the complexity of feature engineering while still meeting similar classification requirements.

2.     The authors propose a parallel processing approach for commonly available hardware that utilizes multi-core collaboration and active recommendation. This approach allows a personal computer to handle complex malware classification problems.

3.     Some experiments show that the time cost of the proposed method is lower than that of the selected baselines.

 

Weaknesses of this paper

1.     For the sampling process on the massive dataset, the authors give the sampling criteria in an equation (which should be properly numbered) but do not explain why the equation was designed this way. Are the parameters in the equation experimentally or theoretically derived? Please add more description of the sampling process.

2.     Both Algorithm 1 and Algorithm 2 should be reorganized into a formal algorithm presentation. The current algorithms are hard to read and understand.

3.     One thing that confuses me is the feature matrix generation. Since the sampling process has already picked a subset of the data, why do the authors say in Section 3.2 that “we need to generate feature matrices for all massive sample sets. In practical use, we first construct the training feature matrix from the labeled sample sets and then generate the actual detection feature matrix for the unlabeled sample sets”? Is it necessary to perform this operation on all data points, given that it increases the time cost?

4.     The figures should be adjusted to an appropriate size and shape. For example, the descriptions in the cells of Fig. 7 are hard to read.

5.     In Figs. 8 and 9, what is the meaning of the red solid line? If Figs. 8 and 9 could be combined, the ablation of sampling would be easier to observe.

6.     The 4 baselines are somewhat outdated; some newer malware detection methods from the last 4 years may deserve consideration.

7.     The time efficiency of the proposed SELMAL seems to be related to the feature size after pre-processing. The feature size of 300 is much smaller than that of the other 3 baselines, yet the time saved does not change significantly with the feature size. If the baselines used a smaller feature set, would they achieve better time efficiency?

The presentation of this paper needs substantial improvement.

Most of the citations are used incorrectly. The authors should check and correct each one carefully.

Some tables use red to emphasize results, but there is no description of the content marked in red. Please add a note explaining it.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors introduce a simplified framework to classify metamorphic malware based on sampling and parallel processing.

The article begins with a fairly complete introduction explaining, among other things, techniques for extracting features from malware.

Code obfuscation techniques and metamorphic and parallel processing techniques are then discussed.

 

The method proposed by the authors is then explained and motivated in detail. Results are presented and broadly compared with other similar methods, and the conclusions of the study are given.

Although this is a very complete work, the authors are requested to apply the following recommendations to improve its structure and readability:

 

- Several well-known classifiers are discussed and compared, but the choice of the hyperparameters used to train them is not explained.

- Table 4, on page 16, should be moved down so that it begins on the next page and its header is not separated from the table body.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
