**5. Conclusions**

Numerous attempts have been made to tackle the problem of identifying malicious domains. However, many fail to classify them correctly in realistic environments, where an adversary can manipulate the features to make the model misclassify malicious domains. This research used a large empirical dataset that was crawled over a significant amount of time, at different hours of the day, and that captures traffic generated in various countries and continents. Based on this rich dataset, this paper tackled the case where an attacker has access to the model (i.e., a set of features or output for a given input) and tampers with the domain properties. This tampering has a catastrophic effect on the model's performance. As a countermeasure, we propose two feature-based mechanisms: (I) an intelligent feature selection procedure that is robust to adversarial manipulation, in which we evaluated the robustness of each feature, taking into account both the difficulty of changing its value and the effect of such a manipulation on the classifier; and (II) a novel and robust feature engineering process, in which, based on the domains' properties, we engineered a set of four features that are robust to adversarial manipulation and, together with the common features, improve the classifiers' performance.
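The second half of the robustness criterion, the effect of a manipulation on the classifier, can be illustrated with a minimal sketch. It is not the procedure used in this paper: the feature names (`domain_length`, `ttl`), the toy threshold classifier, and the attack model (the adversary overwrites one feature of every malicious sample with the benign mean) are all assumptions chosen for illustration, and the cost of changing a feature is not modeled.

```python
from statistics import mean


def f1_score(y_true, y_pred):
    """F1 for the positive (malicious = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def manipulate(rows, labels, feature):
    """Assumed attack: set `feature` of each malicious row to the benign mean."""
    benign_mean = mean(r[feature] for r, y in zip(rows, labels) if y == 0)
    return [
        dict(r, **{feature: benign_mean}) if y == 1 else r
        for r, y in zip(rows, labels)
    ]


def f1_drop_per_feature(classify, rows, labels, features):
    """F1 lost when each feature is tampered with, one feature at a time."""
    clean = f1_score(labels, [classify(r) for r in rows])
    drops = {}
    for feature in features:
        tampered = manipulate(rows, labels, feature)
        drops[feature] = clean - f1_score(labels, [classify(r) for r in tampered])
    return drops


# Hypothetical toy data: two benign (0) and two malicious (1) domains.
rows = [
    {"domain_length": 10, "ttl": 300},
    {"domain_length": 12, "ttl": 600},
    {"domain_length": 30, "ttl": 60},
    {"domain_length": 28, "ttl": 30},
]
labels = [0, 0, 1, 1]


def classify(row):
    """Hypothetical classifier that relies on a single feature."""
    return 1 if row["domain_length"] > 20 else 0


drops = f1_drop_per_feature(classify, rows, labels, ["domain_length", "ttl"])
print(drops)
```

In this toy setting, tampering with `domain_length` removes all of the classifier's F1, whereas tampering with `ttl` has no effect because the classifier never reads it. A feature whose manipulation causes a large drop and is also cheap for the attacker to change would be a poor candidate for a robust feature set.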

We empirically evaluated the common feature set as well as our novel features using a large dataset that includes both malicious and benign domains. To broaden the evaluation, we used a wide set of well-known machine learning algorithms. The evaluation showed that models trained using the robust features achieve better results on manipulated data while maintaining good results on clean data.

From the industry perspective, our solution can be easily adopted in any organization's DPI center solution, firewall, load balancer, or behavioral analytics, or as a client agent that will query a cloud-service dataset. Further research is needed to create models that classify malicious domains into malicious attack types, either in terms of a more extensive list of models or by sampling data in a stratified way, validating the amount of data for any feature value. Another promising direction would be to cluster a set of malicious domains into one cyber campaign.

**Figure 7.** The F1-Score by feature sets and models.

**Author Contributions:** Conceptualization, C.H., N.H. and A.D.; Data Curation, N.H.; Formal Analysis, C.H. and N.H.; Funding Acquisition, C.H.; Investigation, N.H.; Methodology, C.H. and N.H.; Project Administration, C.H.; Software, N.H.; Supervision, C.H. and A.D.; Validation, C.H. and A.D.; Visualization, N.H. and A.D.; Writing—Original Draft Preparation, C.H., N.H. and A.D.; Writing— Review and Editing, C.H. and A.D. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Ariel University and Holon Institute of Technology (RA1900000614).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Publicly available datasets were analyzed in this study. The data can be found at https://github.com/nitayhas/robust-malicious-url-detection (accessed on 20 March 2022).

**Acknowledgments:** This work was supported by the Ariel Cyber Innovation Center in conjunction with the Israel National Cyber Directorate of the Prime Minister's Office. The authors express special thanks to Nissim Harel of Holon Institute of Technology and Asaf Nadler of Akamai Technologies for the fruitful discussions and their insights.

**Conflicts of Interest:** The authors declare no conflict of interest.
