MDPI - Publisher of Open Access Journals

20 pages, 4173 KB

Open AccessArticle

AI-Based Phishing Detection and Student Cybersecurity Awareness in the Digital Age

by Zeinab Shahbazi, Rezvan Jalali and Maryam Molaeevand

Big Data Cogn. Comput. 2025, 9(8), 210; https://doi.org/10.3390/bdcc9080210 - 15 Aug 2025

Viewed by 778

Phishing attacks are an increasingly common cybersecurity threat and are characterized by deceiving people into giving out their private credentials via emails, websites, and messages. An insight into students’ challenges in recognizing phishing threats can provide valuable information on how AI-based detection systems [...] Read more.

Phishing attacks are an increasingly common cybersecurity threat and are characterized by deceiving people into giving out their private credentials via emails, websites, and messages. An insight into students’ challenges in recognizing phishing threats can provide valuable information on how AI-based detection systems can be improved to enhance accuracy, reduce false positives, and build user trust in cybersecurity. This study focuses on students’ awareness of phishing attempts and evaluates AI-based phishing detection systems. Questionnaires were circulated amongst students, and responses were evaluated to uncover prevailing patterns and issues. The results indicate that most college students are knowledgeable about phishing methods, but many do not recognize the dangers of phishing. Because of this, AI-based detection systems have potential but also face issues relating to accuracy, false positives, and user faith. This research highlights the importance of bolstering cybersecurity education and ongoing enhancements to AI models to improve phishing detection. Future studies should include a more representative sample, evaluate AI detection systems in real-world settings, and assess longer-term changes in phishing-related awareness. By combining AI-driven solutions with education a safer digital world can created. Full article

(This article belongs to the Special Issue Big Data Analytics with Machine Learning for Cyber Security)

► Show Figures

Figure 1

19 pages, 1339 KB

Open AccessFeature PaperArticle

Convolutional Graph Network-Based Feature Extraction to Detect Phishing Attacks

by Saif Safaa Shakir, Leyli Mohammad Khanli and Hojjat Emami

Future Internet 2025, 17(8), 331; https://doi.org/10.3390/fi17080331 - 25 Jul 2025

Viewed by 639

Abstract

Phishing attacks pose significant risks to security, drawing considerable attention from both security professionals and customers. Despite extensive research, the current phishing website detection mechanisms often fail to efficiently diagnose unknown attacks due to their poor performances in the feature selection stage. Many [...] Read more.

Phishing attacks pose significant risks to security, drawing considerable attention from both security professionals and customers. Despite extensive research, the current phishing website detection mechanisms often fail to efficiently diagnose unknown attacks due to their poor performances in the feature selection stage. Many techniques suffer from overfitting when working with huge datasets. To address this issue, we propose a feature selection strategy based on a convolutional graph network, which utilizes a dataset containing both labels and features, along with hyperparameters for a Support Vector Machine (SVM) and a graph neural network (GNN). Our technique consists of three main stages: (1) preprocessing the data by dividing them into testing and training sets, (2) constructing a graph from pairwise feature distances using the Manhattan distance and adding self-loops to nodes, and (3) implementing a GraphSAGE model with node embeddings and training the GNN by updating the node embeddings through message passing from neighbors, calculating the hinge loss, applying the softmax function, and updating weights via backpropagation. Additionally, we compute the neighborhood random walk (NRW) distance using a random walk with restart to create an adjacency matrix that captures the node relationships. The node features are ranked based on gradient significance to select the top k features, and the SVM is trained using the selected features, with the hyperparameters tuned through cross-validation. We evaluated our model on a test set, calculating the performance metrics and validating the effectiveness of the PhishGNN dataset. Our model achieved a precision of 90.78%, an F1-score of 93.79%, a recall of 97%, and an accuracy of 93.53%, outperforming the existing techniques. Full article

(This article belongs to the Section Cybersecurity)

► Show Figures

Graphical abstract

17 pages, 2890 KB

Open AccessArticle

Detecting Phishing URLs Based on a Deep Learning Approach to Prevent Cyber-Attacks

by Qazi Emad ul Haq, Muhammad Hamza Faheem and Iftikhar Ahmad

Appl. Sci. 2024, 14(22), 10086; https://doi.org/10.3390/app142210086 - 5 Nov 2024

Cited by 7 | Viewed by 14005

Abstract

Phishing is one of the most widely observed types of internet cyber-attack, through which hundreds of clients using different internet services are targeted every day through different replicated websites. The phishing attacker spreads messages containing false URL links through emails, social media platforms, [...] Read more.

Phishing is one of the most widely observed types of internet cyber-attack, through which hundreds of clients using different internet services are targeted every day through different replicated websites. The phishing attacker spreads messages containing false URL links through emails, social media platforms, or messages, targeting people to steal sensitive data like credentials. Attackers generate phishing URLs that resemble those of legitimate websites to gain these confidential data. Hence, there is a need to prevent the siphoning of data through the duplication of trustworthy websites and raise public awareness of such practices. For this purpose, many machine learning and deep learning models have been employed to detect and prevent phishing attacks, but due to the ever-evolving nature of these attacks, many systems fail to provide accurate results. In this study, we propose a deep learning-based system using a 1D convolutional neural network to detect phishing URLs. The experimental work was performed using datasets from Phish-Tank, UNB, and Alexa, which successfully generated 200 thousand phishing URLs and 200 thousand legitimate URLs. The experimental results show that the proposed system achieved 99.7% accuracy, which was better than the traditional models proposed for URL-based phishing detection. Full article

(This article belongs to the Collection Innovation in Information Security)

► Show Figures

Figure 1

18 pages, 4563 KB

Open AccessArticle

Kashif: A Chrome Extension for Classifying Arabic Content on Web Pages Using Machine Learning

by Malak Aljabri, Hanan S. Altamimi, Shahd A. Albelali, Maimunah Al-Harbi, Haya T. Alhuraib, Najd K. Alotaibi, Amal A. Alahmadi, Fahd Alhaidari and Rami Mustafa A. Mohammad

Appl. Sci. 2024, 14(20), 9222; https://doi.org/10.3390/app14209222 - 11 Oct 2024

Cited by 1 | Viewed by 1804

Abstract

Search engines are significant tools for finding and retrieving information. Every day, many new web pages in various languages are added. The threats of cyberattacks are expanding rapidly with this massive volume of data. The majority of studies on the detection of malicious [...] Read more.

Search engines are significant tools for finding and retrieving information. Every day, many new web pages in various languages are added. The threats of cyberattacks are expanding rapidly with this massive volume of data. The majority of studies on the detection of malicious websites focus on English-language websites. This necessitates more studies on malicious detection on Arabic-content websites. In this research, we aimed to investigate the security of Arabic-content websites by developing a detection tool that analyzes Arabic content based on artificial intelligence (AI) techniques. We contributed to the field of cybersecurity and AI by building a new dataset of 4048 Arabic-content websites. We created and conducted a comparative performance evaluation for four different machine-learning (ML) models using feature extraction and selection techniques: extreme gradient boosting, support vector machines, decision trees, and random forests. The best-performing model was then integrated into a Chrome plugin, created based on a random forest (RF) model, and utilized the features selected via the chi-square technique. This produced plugin tool attained an accuracy of 92.96% for classifying Arabic-content websites as phishing, suspicious, or benign. To our knowledge, this is the first tool designed specifically for Arabic-content websites. Full article

(This article belongs to the Special Issue Data Mining and Machine Learning in Cybersecurity)

► Show Figures

Figure 1

18 pages, 552 KB

Open AccessArticle

An Enhanced K-Means Clustering Algorithm for Phishing Attack Detections

by Abdallah Al-Sabbagh, Khalil Hamze, Samiya Khan and Mahmoud Elkhodr

Electronics 2024, 13(18), 3677; https://doi.org/10.3390/electronics13183677 - 16 Sep 2024

Cited by 8 | Viewed by 3649

Abstract

Phishing attacks continue to pose a significant threat to cybersecurity, employing increasingly sophisticated techniques to deceive victims into revealing sensitive information or downloading malware. This paper presents a comprehensive study on the application of Machine Learning (ML) techniques for identifying phishing websites, with [...] Read more.

Phishing attacks continue to pose a significant threat to cybersecurity, employing increasingly sophisticated techniques to deceive victims into revealing sensitive information or downloading malware. This paper presents a comprehensive study on the application of Machine Learning (ML) techniques for identifying phishing websites, with a focus on enhancing detection accuracy and efficiency. We propose an approach that integrates the CfsSubsetEval attribute evaluator with the K-Means Clustering algorithm to improve phishing detection capabilities. Our method was evaluated using datasets of varying sizes (2000, 7000, and 10,000 samples) from a publicly available repository. Simulation results demonstrate that our approach achieves an accuracy of 89.2% on the 2000-sample dataset, outperforming the traditional kernel K-Means algorithm, which achieved an accuracy of 51.5%. Further analysis using precision, recall, and F1-score metrics corroborates the effectiveness of our method. We also discuss the scalability and real-world applicability of our approach, addressing limitations and proposing future research directions. This study contributes to the ongoing efforts to develop robust, efficient, and adaptable phishing detection systems in the face of evolving cyber threats. Full article

(This article belongs to the Special Issue Artificial Intelligence and Applications—Responsible AI)

► Show Figures

Figure 1

24 pages, 7013 KB

Open AccessArticle

Comparative Analysis of Nature-Inspired Metaheuristic Techniques for Optimizing Phishing Website Detection

by Thomas Nagunwa

Analytics 2024, 3(3), 344-367; https://doi.org/10.3390/analytics3030019 - 6 Aug 2024

Cited by 3 | Viewed by 2105

Abstract

The increasing number, frequency, and sophistication of phishing website-based attacks necessitate the development of robust solutions for detecting phishing websites to enhance the overall security of cyberspace. Drawing inspiration from natural processes, nature-inspired metaheuristic techniques have been proven to be efficient in solving [...] Read more.

The increasing number, frequency, and sophistication of phishing website-based attacks necessitate the development of robust solutions for detecting phishing websites to enhance the overall security of cyberspace. Drawing inspiration from natural processes, nature-inspired metaheuristic techniques have been proven to be efficient in solving complex optimization problems in diverse domains. Following these successes, this research paper aims to investigate the effectiveness of metaheuristic techniques, particularly Genetic Algorithms (GAs), Differential Evolution (DE), and Particle Swarm Optimization (PSO), in optimizing the hyperparameters of machine learning (ML) algorithms for detecting phishing websites. Using multiple datasets, six ensemble classifiers were trained on each dataset and their hyperparameters were optimized using each metaheuristic technique. As a baseline for assessing performance improvement, the classifiers were also trained with the default hyperparameters. To validate the genuine impact of the techniques over the use of default hyperparameters, we conducted statistical tests on the accuracy scores of all the optimized classifiers. The results show that the GA is the most effective technique, by improving the accuracy scores of all the classifiers, followed by DE, which improved four of the six classifiers. PSO was the least effective, improving only one classifier. It was also found that GA-optimized Gradient Boosting, LGBM and XGBoost were the best classifiers across all the metrics in predicting phishing websites, achieving peak accuracy scores of 98.98%, 99.24%, and 99.47%, respectively. Full article

► Show Figures

Figure 1

15 pages, 771 KB

Open AccessArticle

PhishTransformer: A Novel Approach to Detect Phishing Attacks Using URL Collection and Transformer

by Sultan Asiri, Yang Xiao and Tieshan Li

Electronics 2024, 13(1), 30; https://doi.org/10.3390/electronics13010030 - 20 Dec 2023

Cited by 10 | Viewed by 3897

Abstract

Phishing attacks are a major threat to online security, resulting in millions of dollars in losses. These attacks constantly evolve, forcing the cyber security community to improve detection systems. One major problem with current detection systems is that they cannot detect new phishing [...] Read more.

Phishing attacks are a major threat to online security, resulting in millions of dollars in losses. These attacks constantly evolve, forcing the cyber security community to improve detection systems. One major problem with current detection systems is that they cannot detect new phishing attacks, such as Browser in the Browser (BiTB) and malvertising attacks. These attacks hide behind legitimate Uniform Resource Locators (URLs) and can evade detection systems that only analyze a web page URL without exploring the page content. To address this problem, we propose PhishTransformer, a deep-learning model that can detect phishing attacks by analyzing URLs and page content. We propose only using URLs embedded within a webpage, such as hyperlinks and JFrames, to train PhishTransformer. This helps reduce the number of features that need to be extracted from the page content, which makes training the model more efficient. PhishTransformer combines convolutional neural networks and transformer encoders to extract features from website URLs and page content. These features are then used to train a classifier that can distinguish between phishing attacks and legitimate websites. We tested PhishTransformer on a dataset of 10,000 URLs. Our results show that PhishTransformer can achieve an F1-score of 99%, precision of 99%, and recall of 99%. This result suggests that PhishTransformer is a promising new approach to phishing detection. Full article

(This article belongs to the Special Issue Advancements in Cross-Disciplinary AI: Theory and Application—2nd Edition)

► Show Figures

Figure 1

13 pages, 534 KB

Open AccessArticle

Improved Phishing Attack Detection with Machine Learning: A Comprehensive Evaluation of Classifiers and Features

by Sibel Kapan and Efnan Sora Gunal

Appl. Sci. 2023, 13(24), 13269; https://doi.org/10.3390/app132413269 - 15 Dec 2023

Cited by 14 | Viewed by 12854

Abstract

In phishing attack detection, machine learning-based approaches are more effective than simple blacklisting strategies, as they can adapt to new types of attacks and do not require manual updates. However, for these approaches, the choice of features and classifiers directly influences detection performance. [...] Read more.

In phishing attack detection, machine learning-based approaches are more effective than simple blacklisting strategies, as they can adapt to new types of attacks and do not require manual updates. However, for these approaches, the choice of features and classifiers directly influences detection performance. Therefore, in this work, the contributions of various features and classifiers to detecting phishing attacks were thoroughly analyzed to find the best classifier and feature set in terms of different performance metrics including accuracy, precision, recall, F1-score, and classification time. For this purpose, a brand-new phishing dataset was prepared and made publicly available. Using an exhaustive strategy, every combination of the feature groups was fed into various classifiers to detect phishing websites. Two existing benchmark datasets were also used in addition to ours for further analysis. The experimental results revealed that the features based on the uniform resource locator (URL) and hypertext transfer protocol (HTTP), rather than all features, offered the best performance. Also, the decision tree classifier surpassed the others, achieving an F1-score of 0.99 and being one of the fastest classifiers overall. Full article

(This article belongs to the Section Computing and Artificial Intelligence)

► Show Figures

Figure 1

27 pages, 3161 KB

Open AccessArticle

A Hybrid Approach for Alluring Ads Phishing Attack Detection Using Machine Learning

by Muhammad Waqas Shaukat, Rashid Amin, Muhana Magboul Ali Muslam, Asma Hassan Alshehri and Jiang Xie

Sensors 2023, 23(19), 8070; https://doi.org/10.3390/s23198070 - 25 Sep 2023

Cited by 27 | Viewed by 5581

Abstract

Phishing attacks are evolving with more sophisticated techniques, posing significant threats. Considering the potential of machine-learning-based approaches, our research presents a similar modern approach for web phishing detection by applying powerful machine learning algorithms. An efficient layered classification model is proposed to detect [...] Read more.

Phishing attacks are evolving with more sophisticated techniques, posing significant threats. Considering the potential of machine-learning-based approaches, our research presents a similar modern approach for web phishing detection by applying powerful machine learning algorithms. An efficient layered classification model is proposed to detect websites based on their URL structure, text, and image features. Previously, similar studies have used machine learning techniques for URL features with a limited dataset. In our research, we have used a large dataset of 20,000 website URLs, and 22 salient features from each URL are extracted to prepare a comprehensive dataset. Along with this, another dataset containing website text is also prepared for NLP-based text evaluation. It is seen that many phishing websites contain text as images, and to handle this, the text from images is extracted to classify it as spam or legitimate. The experimental evaluation demonstrated efficient and accurate phishing detection. Our layered classification model uses support vector machine (SVM), XGBoost, random forest, multilayer perceptron, linear regression, decision tree, naïve Bayes, and SVC algorithms. The performance evaluation revealed that the XGBoost algorithm outperformed other applied models with maximum accuracy and precision of 94% in the training phase and 91% in the testing phase. Multilayer perceptron also worked well with an accuracy of 91% in the testing phase. The accuracy results for random forest and decision tree were 91% and 90%, respectively. Logistic regression and SVM algorithms were used in the text-based classification, and the accuracy was found to be 87% and 88%, respectively. With these precision values, the models classified phishing and legitimate websites very well, based on URL, text, and image features. This research contributes to early detection of sophisticated phishing attacks, enhancing internet user security. Full article

(This article belongs to the Special Issue Security and Privacy in Cloud Computing Environment)

► Show Figures

Figure 1

17 pages, 6302 KB

Open AccessArticle

Binary Hunter–Prey Optimization with Machine Learning—Based Cybersecurity Solution on Internet of Things Environment

by Adil O. Khadidos, Zenah Mahmoud AlKubaisy, Alaa O. Khadidos, Khaled H. Alyoubi, Abdulrhman M. Alshareef and Mahmoud Ragab

Sensors 2023, 23(16), 7207; https://doi.org/10.3390/s23167207 - 16 Aug 2023

Cited by 10 | Viewed by 1886

Abstract

Internet of Things (IoT) enables day-to-day objects to connect with the Internet and transmit and receive data for meaningful purposes. Recently, IoT has resulted in many revolutions in all sectors. Nonetheless, security risks to IoT networks and devices are persistently disruptive due to [...] Read more.

Internet of Things (IoT) enables day-to-day objects to connect with the Internet and transmit and receive data for meaningful purposes. Recently, IoT has resulted in many revolutions in all sectors. Nonetheless, security risks to IoT networks and devices are persistently disruptive due to the growth of Internet technology. Phishing becomes a common threat to Internet users, where the attacker aims to fraudulently extract confidential data of the system or user by using websites, fictitious emails, etc. Due to the dramatic growth in IoT devices, hackers target IoT gadgets, including smart cars, security cameras, and so on, and perpetrate phishing attacks to gain control over the vulnerable device for malicious purposes. These scams have been increasing and advancing over the last few years. To resolve these problems, this paper presents a binary Hunter–prey optimization with a machine learning-based phishing attack detection (BHPO-MLPAD) method in the IoT environment. The BHPO-MLPAD technique can find phishing attacks through feature selection and classification. In the presented BHPO-MLPAD technique, the BHPO algorithm primarily chooses an optimal subset of features. The cascaded forward neural network (CFNN) model is employed for phishing attack detection. To adjust the parameter values of the CFNN model, the variable step fruit fly optimization (VFFO) algorithm is utilized. The performance assessment of the BHPO-MLPAD method takes place on the benchmark dataset. The results inferred the betterment of the BHPO-MLPAD technique over compared approaches in different evaluation measures. Full article

(This article belongs to the Special Issue Cybersecurity and Privacy in Smart Environments: Current Research Trends and Future Directions)

► Show Figures

Figure 1

14 pages, 2901 KB

Open AccessArticle

Hybrid Phishing Detection Based on Automated Feature Selection Using the Chaotic Dragonfly Algorithm

by Gharbi Alshammari, Majdah Alshammari, Tariq S. Almurayziq, Abdullah Alshammari and Mohammad Alsaffar

Electronics 2023, 12(13), 2823; https://doi.org/10.3390/electronics12132823 - 26 Jun 2023

Cited by 7 | Viewed by 2793

Abstract

Due to the increased frequency of phishing attacks, network security has gained the attention of researchers. In addition to this, large volumes of data are created every day, and these data include inappropriate and unrelated features that influence the accuracy of machine learning. [...] Read more.

Due to the increased frequency of phishing attacks, network security has gained the attention of researchers. In addition to this, large volumes of data are created every day, and these data include inappropriate and unrelated features that influence the accuracy of machine learning. There is therefore a need for a robust method of detecting phishing threats and improving detection accuracy. In this study, three classifiers were applied to improve the accuracy of a detection algorithm: decision tree, k-nearest neighbors (KNN), and support vector machine (SVM). Selecting the relevant features improves the detection accuracy for a target class and determines the class label with the greatest probability. The proposed work clearly describes how feature selection using the Chaotic Dragonfly Algorithm provides more accurate results than all other baseline classifiers. It also indicates the appropriate classifier to be applied when detecting phishing websites. Three publicly available datasets were used to evaluate the method. They are reliable datasets for training the model and measuring prediction accuracy. Full article

(This article belongs to the Section Artificial Intelligence)

► Show Figures

Figure 1

17 pages, 4295 KB

Open AccessArticle

A Lightweight Multi-View Learning Approach for Phishing Attack Detection Using Transformer with Mixture of Experts

by Yanbin Wang, Wenrui Ma, Haitao Xu, Yiwei Liu and Peng Yin

Appl. Sci. 2023, 13(13), 7429; https://doi.org/10.3390/app13137429 - 22 Jun 2023

Cited by 16 | Viewed by 4760

Abstract

Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches are not [...] Read more.

Phishing poses a significant threat to the financial and privacy security of internet users and often serves as the starting point for cyberattacks. Many machine-learning-based methods for detecting phishing websites rely on URL analysis, offering simplicity and efficiency. However, these approaches are not always effective due to the following reasons: (1) highly concealed phishing websites may employ tactics such as masquerading URL addresses to deceive machine learning models, and (2) phishing attackers frequently change their phishing website URLs to evade detection. In this study, we propose a robust, multi-view Transformer model with an expert-mixture mechanism for accurate phishing website detection utilizing website URLs, attributes, content, and behavioral information. Specifically, we first adapted a pretrained language model for URL representation learning by applying adversarial post-training learning in order to extract semantic information from URLs. Next, we captured the attribute, content, and behavioral features of the websites and encoded them as vectors, which, alongside the URL embeddings, constitute the website’s multi-view information. Subsequently, we introduced a mixture-of-experts mechanism into the Transformer network to learn knowledge from different views and adaptively fuse information from various views. The proposed method outperforms state-of-the-art approaches in evaluations of real phishing websites, demonstrating greater performance with less label dependency. Furthermore, we show the superior robustness and enhanced adaptability of the proposed method to unseen samples and data drift in more challenging experimental settings. Full article

► Show Figures

Figure 1

27 pages, 2738 KB

Open AccessArticle

A Deep Learning-Based Innovative Technique for Phishing Detection in Modern Security with Uniform Resource Locators

by Eman Abdullah Aldakheel, Mohammed Zakariah, Ghada Abdalaziz Gashgari, Fahdah A. Almarshad and Abdullah I. A. Alzahrani

Sensors 2023, 23(9), 4403; https://doi.org/10.3390/s23094403 - 30 Apr 2023

Cited by 49 | Viewed by 13108

Abstract

Organizations and individuals worldwide are becoming increasingly vulnerable to cyberattacks as phishing continues to grow and the number of phishing websites grows. As a result, improved cyber defense necessitates more effective phishing detection (PD). In this paper, we introduce a novel method for [...] Read more.

Organizations and individuals worldwide are becoming increasingly vulnerable to cyberattacks as phishing continues to grow and the number of phishing websites grows. As a result, improved cyber defense necessitates more effective phishing detection (PD). In this paper, we introduce a novel method for detecting phishing sites with high accuracy. Our approach utilizes a Convolution Neural Network (CNN)-based model for precise classification that effectively distinguishes legitimate websites from phishing websites. We evaluate the performance of our model on the PhishTank dataset, which is a widely used dataset for detecting phishing websites based solely on Uniform Resource Locators (URL) features. Our approach presents a unique contribution to the field of phishing detection by achieving high accuracy rates and outperforming previous state-of-the-art models. Experiment results revealed that our proposed method performs well in terms of accuracy and its false-positive rate. We created a real data set by crawling 10,000 phishing URLs from PhishTank and 10,000 legitimate websites and then ran experiments using standard evaluation metrics on the data sets. This approach is founded on integrated and deep learning (DL). The CNN-based model can distinguish phishing websites from legitimate websites with a high degree of accuracy. When binary-categorical loss and the Adam optimizer are used, the accuracy of the k-nearest neighbors (KNN), Natural Language Processing (NLP), Recurrent Neural Network (RNN), and Random Forest (RF) models is 87%, 97.98%, 97.4% and 94.26%, respectively, in contrast to previous publications. Our model outperformed previous works due to several factors, including the use of more layers and larger training sizes, and the extraction of additional features from the PhishTank dataset. Specifically, our proposed model comprises seven layers, starting with the input layer and progressing to the seventh, which incorporates a layer with pooling, convolutional, linear 1 and 2, and linear six layers as the output layers. These design choices contribute to the high accuracy of our model, which achieved a 98.77% accuracy rate. Full article

(This article belongs to the Special Issue AI & Blockchain Assisted Innovative Techniques & Solutions to Modern CPS Security for Industry 4.0/5.0)

► Show Figures

Figure 1

26 pages, 3846 KB

Open AccessArticle

Analysis of the Performance Impact of Fine-Tuned Machine Learning Model for Phishing URL Detection

by Saleem Raja Abdul Samad, Sundarvadivazhagan Balasubaramanian, Amna Salim Al-Kaabi, Bhisham Sharma, Subrata Chowdhury, Abolfazl Mehbodniya, Julian L. Webber and Ali Bostani

Electronics 2023, 12(7), 1642; https://doi.org/10.3390/electronics12071642 - 30 Mar 2023

Cited by 98 | Viewed by 5369

Abstract

Phishing leverages people’s tendency to share personal information online. Phishing attacks often begin with an email and can be used for a variety of purposes. The cybercriminal will employ social engineering techniques to get the target to click on the link in the [...] Read more.

Phishing leverages people’s tendency to share personal information online. Phishing attacks often begin with an email and can be used for a variety of purposes. The cybercriminal will employ social engineering techniques to get the target to click on the link in the phishing email, which will take them to the infected website. These attacks become more complex as hackers personalize their fraud and provide convincing messages. Phishing with a malicious URL is an advanced kind of cybercrime. It might be challenging even for cautious users to spot phishing URLs. The researchers displayed different techniques to address this challenge. Machine learning models improve detection by using URLs, web page content and external features. This article presents the findings of an experimental study that attempted to enhance the performance of machine learning models to obtain improved accuracy for the two phishing datasets that are used the most commonly. Three distinct types of tuning factors are utilized, including data balancing, hyper-parameter optimization and feature selection. The experiment utilizes the eight most prevalent machine learning methods and two distinct datasets obtained from online sources, such as the UCI repository and the Mendeley repository. The result demonstrates that data balance improves accuracy marginally, whereas hyperparameter adjustment and feature selection improve accuracy significantly. The performance of machine learning algorithms is improved by combining all fine-tuned factors, outperforming existing research works. The result shows that tuning factors enhance the efficiency of machine learning algorithms. For Dataset-1, Random Forest (RF) and Gradient Boosting (XGB) achieve accuracy rates of 97.44% and 97.47%, respectively. Gradient Boosting (GB) and Extreme Gradient Boosting (XGB) achieve accuracy values of 98.27% and 98.21%, respectively, for Dataset-2. Full article

(This article belongs to the Section Computer Science & Engineering)

► Show Figures

Figure 1

15 pages, 10137 KB

Open AccessArticle

A Novel Phishing Website Detection Model Based on LightGBM and Domain Name Features

by Jingxian Zhou, Haibin Cui, Xina Li, Wenjin Yang and Xi Wu

Symmetry 2023, 15(1), 180; https://doi.org/10.3390/sym15010180 - 7 Jan 2023

Cited by 13 | Viewed by 3819

Abstract

Phishing attacks have evolved in terms of sophistication and have increased in sheer number in recent years. This has led to corresponding developments in the methods used to evade the detection of phishing attacks, which pose daunting challenges to the privacy and security [...] Read more.

Phishing attacks have evolved in terms of sophistication and have increased in sheer number in recent years. This has led to corresponding developments in the methods used to evade the detection of phishing attacks, which pose daunting challenges to the privacy and security of the users of smart systems. This study uses LightGBM and features of the domain name to propose a machine-learning-based method to identify phishing websites and maintain the security of smart systems. Domain name features, often known as symmetry, are the property wherein multiple domain-name-generation algorithms remain constant. The proposed model of detection is first used to extract features of the domain name of the given website, including character-level features and information on the domain name. The features are filtered to improve the model’s accuracy and are subsequently used for classification. The results of experimental comparisons showed that the proposed model of detection, which integrates two types of features for training, significantly outperforms the model that uses a single type of feature. The proposed method also has a higher detection accuracy than other methods and is suitable for the real-time detection of many phishing websites. Full article

(This article belongs to the Special Issue Symmetry and Asymmetry in Cryptography)

► Show Figures

Figure 1

Search Results (37)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Saved Queries

Search Filter Reset All

Years

Feature Papers

Subjects

Journals

Article Types

Countries / Regions

Search Results (37)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI