1. Introduction
Cloud computing is a robust physical or virtual infrastructure that spans various devices and servers and allows users to access data online instead of on an internal hard drive [1]. The concept is linked to distributed systems [2] because, as in a distributed system, the data are stored in different locations (servers) and can be retrieved from any place where these servers are reachable. Cloud computing is thus a technology for accessing data globally [3]. It has four deployment models: public, private, community, and hybrid [4]. A public cloud is managed by a third party and offers its services to the general public. Amazon Web Services (AWS), Oracle Cloud, and Microsoft Azure are among the most common public cloud providers. A private cloud is designed for a specific organization and is accessible only within that organization. A hybrid cloud combines features of public and private clouds and suits an organization that wants to use both. A community cloud gives a set of organizations shared access to systems and services. For 2021, cloud adoption was reported as follows [5]:
73% of organizations turned to the private cloud;
92% of organizations turned to the public cloud;
69% of organizations turned to a hybrid cloud.
On average, 5 to 30 percent of companies have moved their organizations to the cloud [6]. Cloud computing has grown two- to threefold since 2013 and provides more than 1400 services [7]. The rapid advancement of internet technology has brought many conveniences, but security issues have increased just as rapidly [8,9]. According to a 2021 survey, the number of ransomware attacks rose to 52.5 million [10], while Distributed Denial of Service (DDoS) attacks rose from 1392 per day to 2043 per day [11]. As the number of cloud services increases, so does the number of attacks on the cloud. Attackers have designed various algorithms to break cloud security and distribute some of them to end users inside free applications. These applications can contain bugs and intrusions that allow attackers to attack cloud servers quickly [12]. Many data centers have external security, yet attacks on data centers come primarily from the inside [13]. This is because security mechanisms can detect and block external attacks, whereas internal attacks are difficult to detect. The growth rate of malware attacks is shown in Table 1.
Despite the various security mechanisms implemented on the internal systems of cloud computing, information leakage has not been prevented and internal attacks have not been reduced [14]. Network security is essential because the Internet is so widespread [15]. Researchers have developed many algorithms that protect cloud servers from attacks and have designed rules and policies that can be applied logically and physically to keep intrusions out of the cloud. So far, however, only a few solutions have been fully implemented on a cloud server [16].
The growth rate of malware attacks increases by 20 to 30 times yearly [17], as shown in Figure 1. Overall, malware infections have been increasing over the last ten years.
In 2009, the malware count was 12.4 million;
In 2010, the malware count increased from 12.4 million to 29.97 million;
In 2011, the malware count increased from 29.97 million to 48.17 million;
In 2012, the malware count increased from 48.17 million to 82.62 million;
In 2013, the malware count increased from 82.62 million to 165.81 million;
In 2014, the malware count increased from 165.81 million to 308.96 million;
In 2015, the malware count increased from 308.96 million to 452.93 million;
In 2016, the malware count increased from 452.93 million to 580.40 million;
In 2017, the malware count increased from 580.40 million to 702.06 million;
In 2018, the malware count increased from 702.06 million to 812.67 million.
Cloud security is needed to protect the cloud from intrusions and attacks. An intruder may not be able to access cloud data easily or quickly but can still attempt to access it without permission. To address this, firewalls were deployed on the Personal Computers (PCs) in the network. Firewalls apply rules and policies to protect these PCs from attacks. They can stop some intrusions, but firewalls alone are not enough to protect the network [18].
The best way to protect the cloud from intrusions is an Intrusion Detection System (IDS) [19]. An IDS monitors the network for malicious activity and protects the cloud from such activity [20]. IDSs have attracted many researchers. An IDS can protect a network and its components [21]. When cloud servers are attacked, the biggest concerns are the integrity, maintenance, and availability of the IDS [22]. Different techniques are used to protect cloud networks from attacks, including general threat reporting, signature matching, and anomaly detection and prevention systems [23]. Cloud servers have data encryption mechanisms to protect data from unauthorized access or misuse [24]. With sensitive data encryption, data are stored in encrypted form and can be accessed only after decryption [25]. Many further mechanisms are available to secure the network, including data encryption, access control systems, data backups, authentication, and permissions.
The contributions of this paper are as follows:
We developed a tool to protect the cloud from internal and external attacks, using a Signature-Based Intrusion Detection System (SIDS) and network sensors as a gateway to protect the cloud from the outside;
We applied the SIDS and the network sensors in parallel to monitor all malicious activity outside the network;
We applied user verification through an Access Control System (ACS) for cases where someone attacks from inside the cloud server or tries to misuse the cloud shell;
We used Cloud Shell with “rwx-mode” to provide internal security, so that each user’s access differs from that of other users;
We protected the cloud servers from inside and outside attacks and compared the approach with existing techniques.
2. Literature Review
Aryachandra et al. [26] proposed a network-based architecture built around the Snort tool. The authors used two cloud servers in this network, Virtual Machine Based Rootkit-1 (VMBR1) and Virtual Machine Based Rootkit-2 (VMBR2). VMBR1 and VMBR2 connect cloud server-1 with cloud server-2 through switch-2. The authors placed an IDS in three scenarios: outside the cloud server, inside the cloud server, and on both sides of the cloud servers. The interfaces eth-0 and eth-1 are used to detect escalating attacks. After placement, the IDS was attacked several times in each scenario, and truth tables were created to show its performance. Finally, the authors checked the effect on the Central Processing Unit (CPU) and Random Access Memory (RAM) during the execution of Snort.
Narwal et al. [27] surveyed various Internet of Things (IoT) papers and discussed how quickly the mobile industry and the Internet have grown over the years. These technologies have brought users much convenience, but they have also increased the chances of attacks on user data. To address this issue, intrusion detection systems are classified into network-based (NIDS) and host-based (HIDS) systems. An architecture was then developed for a set of techniques, including the stand-alone Intrusion Detection Support System, the Mutual Interference Detection Support System, and the Decentralized Intrusion Detection Support System. The security of each phase must be considered when installing an Intrusion Detection System in cloud computing. If the gateway of the Intrusion Detection System is strengthened, the number of attacks will be reduced.
Tuan et al. [28] analyzed the efficiency of various machine learning algorithms for detecting inside botnet attacks. The algorithms used are Support Vector Machine (SVM), Naïve Bayes, and unsupervised learning algorithms. The tcpdump packet analyzer is used to capture packets and extract their features. Two datasets, UNSW-NB15 and KDD99, are used to evaluate the algorithms. The reported results are 94.78% for unsupervised learning, 84.32% for SVM, and 71.63% for Naïve Bayes, leading to the conclusion that unsupervised learning algorithms detect attacks with higher accuracy.
Tadapaneni et al. [29] examined growing cloud security issues and how to protect the cloud from the initial level. The article explores the security threats, vulnerabilities, and security models of cloud servers. The cloud can store all kinds of data and be accessed from any node; however, data security is one of its most significant and growing issues. According to the author, each technology has two sides: one leads to success and the other to challenges. Cloud computing is no exception, and cloud end users face a variety of security issues arising from data leaks, non-standardization, malware injections, and data breaches. The paper closes by discussing the significant cloud computing issues faced by end users and some threats that can damage cloud servers.
Jyoti [30] surveyed various papers, collected and compared the datasets and algorithms used in them, and argued that an intrusion detection system is an effective tool for monitoring and preventing malicious activity on the network. The author used the Pareto principle to secure the cloud from the outside and applied three intrusion detection techniques: the Signature-Based Intrusion Detection System (SIDS), the Anomaly-Based Intrusion Detection System (AIDS), and the Network-Based Intrusion Detection System (NIDS). According to the author, attack rates will be lower if cloud servers are secured externally. The reported results are 80.21% for SIDS, 20.54% for AIDS, and 61.23% for NIDS, leading to the conclusion that signature-based systems provide better security than anomaly-based and network-based ones.
Wang et al. [31] used a prevention technique to protect the cloud from intrusion and keep it as safe as possible. The article applies three algorithms, C-Kmeans, Spatial and Channel-wise Attention in Convolutional Neural Network (SCA-SNN), and Principal Component Analysis (PCA), to the lymphography and glass datasets to overcome anomaly-based detection problems. The reported results are 77% for Kmeans, 88% for SCA-SNN, and 67% for PCA. The authors conclude that SCA-SNN overcomes the problems of an Anomaly-Based Intrusion Detection System better than the other algorithms.
In 2021 [32], researchers developed a tool to protect cloud servers from inside and outside attacks. First, a cloud server security architecture was designed that uses a separate router for each country and assigns a unique IP address to each router. Second, an intrusion detection system was placed outside the cloud server, and various attacks, such as brute force and pattern matching, were carried out. Multiple techniques were discussed to protect the cloud servers from these attacks. Third, intrusion detection systems were set up inside the cloud server, and various DDoS attacks were carried out. The results were then evaluated with a confusion matrix: the External Detection Rate (EDR) was 92%, the Internal Detection Rate (IDR) was 89%, and the Both-side Detection Rate (BDR) was 85%. It was concluded that an efficient architecture that protects a cloud server internally and externally can reduce the chances of an attack.
When data points are collected into clusters, clustering algorithms classify them into different groups and divide them into categories. The data points within a group behave almost alike, while each group behaves differently from the others. Clustering techniques are unsupervised. No additional data may be used to improve the outcomes of the clustering process. This study reviews articles on optimization-based semi-supervised clustering from 2013 to 2020. A four-step process is used to conduct this review, in which the application domain, the classification of supervised clustering, and optimization techniques are explored.
4. Experiment
4.1. Detection of Attacks outside the Cloud Server Using Semi-Supervised Clustering
Gateway security is the best way to protect cloud servers from attacks. Users must verify themselves whenever they attempt to access a cloud server. When users enter their login keys, the keys are stored in the cluster and checked against the network sensor rules. If the keys are in the correct format, the network sensor forwards them to the Signature-Based Intrusion Detection System (SIDS), which matches them against the existing keys. If the arriving keys match, the user is considered valid and is sent to the label clustering, which grants access to the cloud shell. If they do not match, the user is considered invalid and is sent to the unlabeled clustering. In Figure 3, the user accesses the cloud server with valid keys and is sent to the label clustering, which grants access to the cloud shell.
When the user enters the wrong keys on the first attempt, the server provides a second chance to re-enter them. If the user enters the correct keys on the second attempt, the user must then verify the account. If verification succeeds, the user is sent to the label clustering; if it fails, the IP address is blocked for a few hours. If the user again enters wrong keys from the same IP address after the block is lifted, that IP address is permanently blocked.
In Figure 4, the user entered invalid keys on the first attempt, and the cloud server provided a second chance to re-enter them. When the user entered the correct keys on the second attempt, the server verified the account and then granted the user access to the cloud shell.
When the user enters incorrect keys and cannot verify the account, the PC’s IP address “192.168.10.4” is temporarily blocked for two hours, as shown in Figure 5.
When the IP address was unblocked two hours later, the attacker again entered the wrong keys. At that point, the IP address was permanently blocked by the implemented detection mechanism, as shown in Figure 6.
In this paper, we block IP addresses temporarily because an attacker develops a new attack algorithm for each attack on the cloud server, and defending against it requires ever more efficient countermeasures. By the time an attack prevention algorithm is ready, the attacker will have developed a new algorithm and attacked the cloud server again. Therefore, it is better to block IP addresses, first temporarily and then permanently. Even if the attacker prepares such an algorithm, the attacker’s IP address will be permanently blocked before the algorithm can be applied.
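The gateway flow described above can be sketched as a small state machine. This is only an illustrative sketch, not the tool itself: the names `Gateway`, `IpState`, and `well_formed`, the format rule inside `well_formed`, and the `account_verified` flag are assumptions introduced here, and a second failed attempt (wrong keys or failed verification) is assumed to trigger the two-hour block.

```python
from dataclasses import dataclass

TEMP_BLOCK_SECONDS = 2 * 60 * 60  # two-hour temporary block, as in the text


@dataclass
class IpState:
    failed_once: bool = False         # the first wrong attempt has been used
    blocked_until: float = 0.0        # end of the temporary block (seconds)
    was_temp_blocked: bool = False    # a temporary block was imposed before
    permanently_blocked: bool = False


class Gateway:
    """Hypothetical gateway: format check (sensor) plus key match (SIDS)."""

    def __init__(self, known_keys):
        self.known_keys = set(known_keys)
        self.states = {}
        self.labeled = set()    # IPs of users treated as valid
        self.unlabeled = set()  # IPs of users treated as invalid

    @staticmethod
    def well_formed(key):
        # Stand-in for the network sensor rules (assumed format rule).
        return len(key) >= 8 and key.isprintable() and " " not in key

    def login(self, ip, key, now, account_verified=False):
        state = self.states.setdefault(ip, IpState())
        if state.permanently_blocked:
            return "permanently_blocked"
        if now < state.blocked_until:
            return "temporarily_blocked"

        key_ok = self.well_formed(key) and key in self.known_keys
        # Correct keys on the first try, or on the retry with a verified account.
        if key_ok and (not state.failed_once or account_verified):
            state.failed_once = False
            self.unlabeled.discard(ip)
            self.labeled.add(ip)
            return "granted"

        self.unlabeled.add(ip)
        if state.was_temp_blocked:
            # Failure after a temporary block has expired: block for good.
            state.permanently_blocked = True
            return "permanently_blocked"
        if not state.failed_once:
            state.failed_once = True
            return "retry"
        # Second failure: impose the temporary block.
        state.failed_once = False
        state.was_temp_blocked = True
        state.blocked_until = now + TEMP_BLOCK_SECONDS
        return "temporarily_blocked"
```

Calling `login` with a wrong key, then with the right key but without verification, yields `"retry"` and then `"temporarily_blocked"`; a further wrong key after the block expires yields `"permanently_blocked"`. Passing `now` explicitly keeps the sketch deterministic and easy to test.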
Prevention Techniques from External Attacks
Cloud servers can be protected from external attacks in three more ways. The first is to set login limits on the cloud server. The second is to protect the keys with salted hashing and store only the hashed values on the cloud server, making it difficult for an attacker to steal a key. The third is to require One-Time Password (OTP) based authentication when logging in to the cloud server.
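As a hedged illustration of the second technique, the sketch below stores a salted hash of a login key instead of the key itself. It uses PBKDF2 from the Python standard library; the choice of PBKDF2, the iteration count, and the function names `hash_key` and `verify_key` are assumptions made for this example and are not taken from the tool described in the paper.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune for the deployment


def hash_key(key: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a login key using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for every key
    digest = hashlib.pbkdf2_hmac("sha256", key.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_key(candidate: str, salt: bytes, stored_digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_key(candidate, salt)
    return hmac.compare_digest(digest, stored_digest)
```

Only the salt and the digest would be kept on the server. Because each key gets its own random salt, identical keys produce different stored values, and the constant-time comparison avoids leaking how many leading bytes matched.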
4.2. Detection of Attacks inside the Cloud Server Using Semi-Supervised Clustering
This article focuses on the command-line interface rather than the Graphical User Interface (GUI) to protect the cloud server from inside attacks and provide a secure environment for data. Each user is given limited access. The label clustering contains valid users, but intruders can still end up in it by mistake. To address this, a set of queries was created, and each query was assigned a function that protects cloud data. Once a user enters the label clustering, the user can access organizations, countries, users, and files through Cloud Shell.
4.2.1. Queries for Shell
Different queries are designed for the cloud shell, as shown in Figure 7. The “show_*org” query is used when the user wants to see the details of all organizations: their details, their status, the protocols they use, and the country with which each is affiliated, as shown in Figure 8.
Once the organization details are displayed, the user can execute four queries on them. The first query is “exit_all”, which exits the cloud server, as shown in Figure 9.
The second query is “cls”, which clears the screen so that the organization details disappear from it. The third query is “show_*data”, which shows the data of all organizations, as shown in Figure 10.
To access a file, the user must first enable the access mode with the “enable_access mode” query; the files then become accessible, as shown in Figure 11.
Some queries are allowed to users on the cloud server and others are denied, so that users can only access data and cannot change other users’ files, as shown in Figure 12.
The “show_*ctn” query is used when a user wants to view all the countries connected to the cloud server, as shown in Figure 13.
When a user issues a wrong query on a cloud server, the response is treated as a negative response. Such a response can be triggered by an invalid user or by a valid user who tries to misuse the queries or use queries that are not allowed. In this case, the user is alerted, and repeated use of incorrect queries causes the user to be blocked, as shown in Figure 14.
This article uses Cloud Shell to protect the cloud server from internal attacks. If an attacker gains access to the cloud server but does not know the exact queries, the attacker can be easily identified. If a valid user tries to tamper with other users’ data, that user’s account is blocked. Whenever mechanisms were created to secure the cloud server, attackers devised more attacks than there were security mechanisms. Cloud Shell is an efficient technique in which every user is given limited access and must stay within those limits: a user neither knows other users’ queries nor can destroy anyone else’s data.
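The query handling described in this subsection can be sketched as an allow-list dispatcher that alerts on unknown queries and blocks the session after repeated misuse. The query names come from the text; the class name `CloudShellSession`, the returned strings, and the threshold of three invalid queries are assumptions, since the paper only says that “repeated” incorrect queries lead to a block.

```python
# Queries named in the text, mapped to a short description of their effect.
ALLOWED_QUERIES = {
    "show_*org": "list organization details",
    "show_*data": "show data of all organizations",
    "show_*ctn": "list connected countries",
    "enable_access mode": "enable file access mode",
    "cls": "clear the screen",
    "exit_all": "exit the cloud server",
}
MAX_INVALID_QUERIES = 3  # assumed threshold for "repeated" misuse


class CloudShellSession:
    """Hypothetical per-user shell session with an allow-list of queries."""

    def __init__(self, user):
        self.user = user
        self.invalid_count = 0
        self.blocked = False
        self.access_enabled = False

    def run(self, query):
        if self.blocked:
            return "blocked"
        query = query.strip()
        if query not in ALLOWED_QUERIES:
            # Unknown or denied query: alert first, block after repeats.
            self.invalid_count += 1
            if self.invalid_count >= MAX_INVALID_QUERIES:
                self.blocked = True
                return "blocked"
            return "alert: unknown or denied query"
        if query == "enable_access mode":
            self.access_enabled = True
        return ALLOWED_QUERIES[query]
```

An allow-list is used rather than a deny-list so that any query the administrator has not explicitly defined counts as misuse, which matches the idea that an attacker who does not know the exact queries is quickly identified.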
4.2.2. Prevention Techniques from Internal Attacks
There are three ways to protect the cloud server from internal attacks. The first is to use the cloud shell on the server in “rwx” mode, as in this paper. The second is to store the data in encrypted form, so that a user can access data only with the decryption key for their own data; the advantage is that no user can access another user’s data. The third is to block malicious attempts to gain access to the cloud through Cloudflare.
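A minimal sketch of a per-user “rwx” check is given below, assuming that the mode is a three-character string in the Unix style (for example “rw-”). The `Perm` flag type, the `ACCESS_TABLE` contents, and the user and file names are hypothetical and serve only to illustrate how each user’s access can differ from that of other users.

```python
from enum import Flag


class Perm(Flag):
    R = 4  # read
    W = 2  # write
    X = 1  # execute


def parse_mode(mode: str) -> Perm:
    """Parse a string such as 'rw-' or 'r-x' into a Perm value."""
    if len(mode) != 3:
        raise ValueError("mode must have exactly three characters")
    perm = Perm(0)
    for char, flag in zip(mode, (Perm.R, Perm.W, Perm.X)):
        if char == flag.name.lower():
            perm |= flag
        elif char != "-":
            raise ValueError(f"unexpected character {char!r} in mode {mode!r}")
    return perm


# Hypothetical access table: (user, file) -> mode string.
ACCESS_TABLE = {
    ("alice", "org_a.dat"): "rw-",
    ("bob", "org_a.dat"): "r--",
}


def is_allowed(user: str, filename: str, needed: Perm) -> bool:
    """Return True only if the user holds every permission in `needed`."""
    mode = ACCESS_TABLE.get((user, filename), "---")  # default: no access
    return needed in parse_mode(mode)
```

Users without an entry in the table fall back to “---”, so access is denied by default.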
4.3. Implementation of Semi-Supervised Clustering Results in the Confusion Matrix
The performance of the proposed semi-supervised clustering technique is evaluated on the dataset shown in Table 2 using 5-fold cross-validation. The parameters of the confusion matrix are as follows:
TP (True Positive): the arriving packets are intrusive, and the tool correctly predicts them as intrusive;
FN (False Negative): the arriving packets are intrusive, but the tool does not predict them as intrusive;
TN (True Negative): the arriving packets are not intrusive, and the tool correctly declares them not intrusive;
FP (False Positive): the arriving packets are not intrusive, but the tool flags them as intrusive.
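For readers who want to reproduce such an evaluation, the sketch below computes the usual measures from the four counts defined above. It uses the standard textbook definitions of sensitivity, specificity, precision, accuracy, and error rate, which may differ in detail from the formulas used for the paper’s tables; the function name and the example counts are purely illustrative and are not taken from Table 2.

```python
def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard measures derived from a binary confusion matrix."""
    total = tp + fn + tn + fp
    if total == 0:
        raise ValueError("at least one sample is required")

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # intrusive packets that were caught
        "specificity": ratio(tn, tn + fp),  # benign packets that were passed
        "precision": ratio(tp, tp + fp),    # flagged packets that were intrusive
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
    }


# Illustrative counts only (not results from the paper).
metrics = confusion_metrics(tp=40, fn=10, tn=45, fp=5)
```

With these illustrative counts, accuracy and error rate sum to one, as they always do under these definitions.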
4.3.1. Evaluation
The results obtained were entered into the confusion matrix to evaluate the tool’s performance, in particular its accuracy.
Positive Prediction
The Positive Prediction (PP) indicates the positive probability of an outcome, that is, how likely the tool is to detect an intrusion from intrusive packets.
Negative Prediction
The Negative Prediction (NP) indicates the negative probability of an outcome, that is, the probability that incoming packets are authentic and contain no intrusion.
From the positive and negative predictions, the tool’s intrusion detection rate is 96.87% when intrusive activities are performed on the cloud server, while its authenticity rate is 91.67% when authentic packets are transmitted on the cloud server.
K-Fold Cross Validation
In this article, 100 samples were taken from the datasets and evaluated with K-fold cross-validation with K = 5, giving five iterations. The tool’s performance was then computed with the following measures, as shown in Table 3: Sensitivity (Se), Precision (Pr), Accuracy (Acc), Specificity (Sp), and Error Rate (ER).
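The 5-fold procedure on 100 samples can be sketched as follows. The helper name `k_fold_indices`, the shuffling step, and the fixed seed are assumptions made for this illustration; the paper does not state how the samples were assigned to folds.

```python
import random


def k_fold_indices(n_samples: int, k: int, seed: int = 0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random fold assignment
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 100 samples and K = 5: each iteration trains on 80 and tests on 20.
splits = list(k_fold_indices(100, 5))
```

Every sample appears in exactly one test fold, so each of the five iterations evaluates the tool on data it was not fitted to, and the per-fold measures can then be averaged.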
These results indicate that data can be stored in clustered form with the help of semi-supervised clustering, and that the clustered data can be divided into categories by semi-supervised mechanisms. This paper uses semi-supervised clustering to protect data from internal and external attacks. Depending on a user’s behaviour, different mechanisms can be applied when someone tries to access a cloud server. The attack rate could be determined more completely if more algorithms were used. The most significant advantage of semi-supervised clustering is that each clustered value can be stored: all users’ records, valid or invalid, are kept and divided into categories based on these records. A cloud server can be secured internally and externally if semi-supervised clustering is used efficiently and a suitable mechanism is applied to each user behaviour.
5. Comparative Analysis
Many researchers have designed algorithms and developed tools to secure cloud servers and reduce the rate of attacks on them. Some have surveyed the literature and tried to implement the best techniques, but as the number of cloud users has increased, so has the attack rate. Whenever a cloud administrator designs an encryption algorithm to secure cloud data, attackers create a more efficient algorithm to decrypt it. As a result, the attack rate has increased rather than decreased. A comparative analysis of the latest work is shown in Table 4.
In 2016 [26], researchers placed an Intrusion Detection System inside and outside the cloud server to protect it from attacks from both sides. They used Virtual Machine Based Rootkit-1 and Virtual Machine Based Rootkit-2 to connect the cloud server to different devices and worked on a tool to implement an Intrusion Detection System. They confirmed that an IDS could be used to prevent cloud server attacks and concluded that, when an Intrusion Detection System is activated on the server, it will be only 2.5%. In 2019 [27], researchers worked on three algorithms to protect the cloud server from botnet attacks, used the tcpdump packet analyzer to extract features from botnet packets, and obtained 94.78% for unsupervised learning, 84.32% for the support vector machine, and 71.63% for Naive Bayes. A comparison of previous results with the latest results is shown in Table 5.
In 2020 [28], authors surveyed various papers, collected all the data sets used in these papers, and discussed whether the cloud server can be protected from attacks if all these techniques are implemented in the intrusion detection system. In 2021 [29], the authors used outlier detection and semi-supervised clustering algorithms to secure cloud servers and applied these algorithms to some real data sets. The lymphography and glass data sets were then used to perform simulations, and the algorithms’ novelty was observed. However, no paper has experimentally shown how a cloud server can be protected once it is attacked, nor how a mechanism used by an attacker on a cloud server can be countered.
5.1. Statistical Analysis
We carried out a one-way ANOVA test, shown in Table 6, to determine the statistical significance of the observed performance results. The test was applied at the α = 0.05 significance level. The hypotheses are:
H0: The performance is the same among the five algorithms across the datasets;
Ha: The performance of at least one algorithm differs significantly from that of the other algorithms.
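To make the test concrete, the sketch below computes the one-way ANOVA F statistic from per-algorithm score lists using only the standard library. The function name and the sample scores are hypothetical and are not the values behind Table 6; in practice, the resulting F value would be compared with the critical value of the F distribution at α = 0.05, or a library routine such as SciPy’s `scipy.stats.f_oneway` could be used to obtain the p-value directly.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical accuracy scores for three algorithms on three datasets.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

H0 is rejected when the computed F exceeds the critical value for (df_between, df_within) degrees of freedom.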
5.2. Validation Test
This article discusses various techniques for protecting cloud servers from attacks and implements their functionality in the tool.
Before access to the cloud server is granted, the user should be thoroughly checked by the Signature-Based Intrusion Detection System and the network sensors.
If an invalid user gets past the gateway, Cloud Shell should be used to prevent that user from going further into the cloud server.
Attacks can be minimized with Cloud Shell because only a correct query gives the user access, and the boundaries of each query should be defined. When a user issues an incorrect query or crosses these boundaries, the user can be checked via the ACS.
If each mechanism is implemented inside and outside the cloud server as discussed in this article, the chances of a successful attack on the cloud server may be reduced.
5.3. Novelty of Proposed Work
This paper protects the cloud server internally and externally through efficient clustering techniques and prevents attacks from both sides. If an attacker gains access to a cloud server through an algorithm or a key-snatching method, the biggest obstacle for the attacker is that the exact queries must be known. When invalid keys are entered, it is verified whether the person accessing the cloud is legitimate. If a valid user uses queries that the database administrator has denied, or attempts to modify data, that user is also blocked from the cloud servers. If an attacker attacks the cloud server from outside, the attacker’s IP address is blocked according to the defined rules. The most significant advantage is that the attacker’s IP address is blocked as soon as the attacker misuses the cloud server. The confusion matrix was then used to evaluate the tool’s performance, giving a positive prediction of 96.87%, a negative prediction of 90.62%, and an accuracy of 94.79%. That is, the tool’s ability to detect unauthorized, intrusive incoming packets is 96.87%, its ability to recognize authentic packets is 90.62%, and its overall performance is 94.79%. Based on the confusion matrix, the approach protects the cloud server from internal and external attacks, replaces the graphical user interface with a command-line interface, and detects an attack as soon as it occurs so that the server is protected immediately.
6. Conclusions
The development and testing of the tool have shown that an intrusion detection system monitors malicious activities in the cloud and helps prevent attacks on it. A cloud server can only be secured when it is given an initial level of security, every phase of the cloud server is secured, and all devices connected to it are secured. When an attacker tries to attack a cloud server, the first target is the gateway. If an attacker nevertheless gains access to the cloud server, it is also essential to protect it internally. The best way to do so is a Command Line Interface (CLI): a person who uses invalid queries gets stuck automatically, and the performance of cloud computing is not affected.
In this paper, we designed a security architecture and developed a tool to implement it, which provides security to the internal and external sides of the cloud. We observed the tool’s performance based on Confusion Matrix and concluded that internal attacks could be reduced if the cloud algorithm is externally secured. This is because external attacks are always attempted by an unauthorized person who uses attacking techniques and tries to reach the cloud.
In the future, we will modify the semi-supervised algorithm to develop a more secure and efficient semi-supervised clustering algorithm that protects cloud servers against attacks from both sides. This paper proposed an algorithm that protects the cloud server from outside attacks through a detection mechanism and an IP blocking mechanism, and from inside attacks through the cloud shell, the ACS mechanism, and a detection mechanism. A limitation of the proposed method is that it cannot protect the cloud server from attacks such as brute force, DDoS, and replay attacks. In future work, the improved algorithm is intended to detect and prevent attacks from both inside and outside. We will then compare this algorithm with other algorithms to identify the best among them. Furthermore, we will check the effect on the Random Access Memory and Central Processing Unit while running the semi-supervised clustering algorithm.