1. Introduction
Cloud computing shifts traditional local computing to modern remote computing, bringing the benefits of access to large-scale storage resources, centralized computing resources and abundant service resources [1,2]. Meanwhile, the frequent data downloading and uploading of cloud services brings enormous security challenges to both cloud users and cloud service providers, such as identity theft, data leakage and service hijacking [3,4,5]. To solve this problem, VPN technology is widely applied to guarantee reliable and stable communication between clients and clouds [6] or among interconnected clouds [7,8].
Traditional VPNs usually work in the form of software services on stationary devices, such as computers, routers, gateways and servers [9]. When directly applied as a software gateway at a network convergence node under the circumstances of cloud computing, the quality of the VPN service is usually unsatisfactory, leading to low-throughput, high-latency and unstable VPN connections. Existing improvements of network performance range from hardware-accelerated VPN gateways to optimizations of software VPN gateways. Hardware-accelerated VPN gateways usually rely on dedicated devices [10] or drivers [11,12], which are inflexible and inapplicable to various scenarios. Meanwhile, the existing optimizations of software VPN gateways are fragmentary because they do not take generality and high performance into consideration simultaneously [13].
To improve the performance of VPN gateways without sacrificing their generality, this paper studies a generic performance improvement method for VPN gateways. From the perspective of generality, we first propose a graph-based generic VPN communication model to abstract communication entities, links and, more importantly, the core duties of a VPN gateway. Then, we formulate three generic VPN core technologies: VPN session, VPN routing and VPN NAT, and propose their core algorithms: the VPN session matching algorithm, the VPN routing searching algorithm and the VPN NAT algorithm, respectively. Finally, we re-design a generic VPN core framework based on the work above. From the perspective of performance, we analyze the fundamentals of VPN gateways and identify three performance bottlenecks: packet receiving and sending, the kernel protocol stack and the virtual network interface card. Then, we propose a three-layer generic high-performance architecture (GHPA) to address these three bottlenecks, consisting of a DPDK-based [14] VPN packet processing layer, a user space basic protocol stack and user space VPN NAT. In addition, we implement two prototype systems based on the proposed GHPA and on traditional methods, respectively. The experimental results show that our proposed GHPA is superior to traditional implementations in RTT, system throughput, packet forwarding rate and jitter. Meanwhile, GHPA is extensible and applicable to other VPN gateways.
Our main contributions in this paper are as follows:
Formulation of a generic VPN communication model with three VPN core components: VPN session, VPN routing and VPN NAT.
Refactoring of a DPDK-based generic high-performance architecture (GHPA) for VPN gateways.
Implementations of a user space VPN gateway based on GHPA and a kernel-space VPN gateway, separately.
Performance evaluations of software VPN gateways.
The rest of this paper is organized as follows. Section 2 provides the related work. Section 3 introduces the fundamentals and performance bottlenecks of VPN gateways. Section 4 describes the modeling, formulation and algorithms of the proposed generic VPN core framework and the proposed GHPA for VPN gateways. Section 5 presents two prototype systems, and Section 6 reports performance tests on our prototype systems and other common VPNs. Section 7 concludes this paper. All acronyms mentioned in this paper are listed in Table A1.
2. Related Work
Research on VPN gateways can be classified into three categories: solutions to network interconnections [15,16,17], assurance of security issues [18,19] and evaluations and improvements of network performance. Evaluations and improvements of network performance range from evaluations of VPN gateways [20,21,22,23,24] and hardware-accelerated VPN gateways [11,12,25,26] to improvements of software VPN gateways [13,27,28,29].
With respect to evaluations of VPN gateways, Zakaria et al. [20] examine the current deployment of VPN connections in IoT gateways, discussing their characteristics, benefits and drawbacks. Lawas et al. [21] evaluate the performance of the SSTP and IKEv2 protocols by measuring throughput, jitter and delay, finding that IKEv2 performs significantly better than SSTP in their test-bed environment. Redzovic et al. [22] test the performance of an IPsec VPN router implemented with the Quagga (https://goo.gl/NXbOfL accessed on 20 May 2024) and Strongswan (https://www.strongswan.org/ accessed on 1 October 2016) open-source software tools, and provide an optimal VPN configuration according to their test results. Ismoyo and Wardhani [23] carry out performance tests on OpenVPN gateways with the ATHS3 block cipher algorithm and the VEA stream cipher algorithm, confirming that the ATHS3 algorithm has better performance and efficiency. Kotuliak et al. [24] conduct comparative performance tests on OpenVPN and IPsec VPN with different encryption algorithms; the results show that OpenVPN and IPsec VPN each have advantages in different situations.
In terms of hardware-accelerated VPN gateways, Yi et al. [25] propose a GPU acceleration framework to achieve both high generality and high throughput under skewed flow size distributions. Raumer et al. [26] implement a VPN security gateway application, based on MoonGen and IPsec ESP tunnel mode, to demonstrate and evaluate the VPN throughput and energy-saving performance of an improved device driver. Heinemann et al. [12] utilize a GPU to improve the encryption and decryption efficiency of IPsec VPN, with the GPU performing encryption and decryption 2.9 to 8.7 times faster than a CPU. Turan et al. [11] design a hardware-based coprocessor to accelerate the cryptographic operations of an open-source software VPN called SigmaVPN, which reduces the time overhead of encryption by 93% compared to the software-only solution and increases TCP and UDP bandwidth by factors of 4.36 and 5.36, respectively, for 1024-byte Ethernet frames.
In terms of improvements in software VPN gateways, Wei et al. [27] propose a scheme that adaptively adjusts the size of a thread pool based on a self-learning algorithm to improve the processing performance of a VPN gateway. Pudelko et al. [13] evaluate existing open-source VPN projects, OpenVPN, Linux IPsec and WireGuard, by implementing WireGuard-compatible VPN benchmarking with open-source solutions; they then propose a WireGuard pipeline architecture on top of DPDK that achieves 6.2 Mpps and 40 Gbit/s, the fastest of all the evaluated VPN implementations. Li [28] establishes a high-performance software router based on the DPDK framework that approaches the line rate of a 40 Gb network interface card (NIC). Zhang et al. [29] propose an IPsec thumbnail protocol to accelerate IPsec with a pure software method, achieving obvious improvements in IPsec transmission speed.
The most relevant pieces of research for this paper are [13,28], which similarly utilize a DPDK packet processing platform to overcome the performance bottlenecks of the Linux kernel and NIC [30]. Related user space packet processing frameworks are netmap [31] and PF_RING ZC [32]. DPDK has been proven to show better performance in throughput and latency [33,34]; therefore, our proposed GHPA is based on DPDK. The limitation of the research in [13] is that it is built from open-source projects and serves as a testbed for different software architectures rather than a VPN gateway meant for production use. In addition, Li [28] verifies that DPDK-based software routers are able to approach the line rate of a NIC, but not in the context of VPNs.
To summarize, the existing improvements in software VPN gateways are fragmentary because they do not take generality and high performance into consideration simultaneously. Therefore, this paper conducts comprehensive research on VPN gateways by formulating a VPN core communication model, modularizing core components of VPN sessions, routes and NAT and refactoring a DPDK-based generic high-performance architecture.
3. Background
In this section, we first discuss the fundamentals of traditional VPN gateways. Then, performance bottlenecks of traditional VPN gateways are analyzed according to their implementations and working mechanisms. Finally, our research work is clarified to solve the existing performance bottlenecks.
3.1. Introduction to Fundamentals of Traditional VPN Gateways
This section discusses the fundamentals of traditional VPN gateways from the perspectives of classification, working mechanisms and performance evaluation standards.
3.1.1. Classification
VPN gateways can be classified into different types according to different classification standards. From a gateway type perspective, VPN gateways can be divided into hardware VPN gateways and software VPN gateways. Hardware VPN gateways are usually customized devices equipped with hardware accelerators to provide a high-performance VPN service for large-scale enterprises or governments. In contrast, software VPN gateways work in the form of software services on general servers or gateway devices. Using network topologies as a classification standard, VPN gateways can be categorized into centralized gateways and peer-to-peer (P2P) gateways. The former comply with a client/server (C/S) service mode by acting as centralized switches in VPN topologies, while the latter communicate directly with other gateways. Considering the different protocols applied by VPN gateways, VPN gateways can be classified into PPTP VPN gateways, L2TP VPN gateways, IPsec VPN gateways, SSL VPN gateways and other VPN gateways. This paper researches the performance bottlenecks of centralized software VPN gateways and provides a generic high-performance architecture for VPN gateways (which refers to centralized software VPN gateways for the remainder of this paper).
3.1.2. Working Mechanisms
To better understand the working mechanisms of VPN gateways, we first discuss the core duties of VPN gateways in a VPN topology from a packet’s perspective. A VPN gateway mainly processes two kinds of packets: VPN packets from VPN clients and normal packets from target servers. A VPN packet encapsulates an inner packet in an outer packet. The inner packet, carrying a real data payload of the applications of VPN clients, is encrypted and encapsulated as the VPN payload of the outer packet. A normal packet is a standard packet that carries either requests from VPN clients to target servers or responses from target servers back to VPN clients. Therefore, a VPN gateway plays the role of a bridge in VPN topological structure by delivering packets to destinations, ranging from VPN clients inside its virtual network to target servers outside its virtual network. The former mode is applied for interconnecting hosts from different local area networks (LANs), so we name it the relay mode. The latter mode is called the agent mode because a VPN gateway forwards requests from VPN clients to target servers and delivers responses from target servers back to VPN clients.
Figure 1 depicts the working mechanism of a traditional VPN gateway in both relay mode and agent mode. Traditional VPN gateways receive VPN packets through the NIC driver, the kernel protocol stack and the Linux socket, sequentially. Then, the VPN gateway process decrypts the VPN payload with the corresponding session key to obtain the inner packet and checks its destination address. If the address belongs to another VPN client, the VPN gateway works in relay mode: it encrypts the inner packet with the peer session key and sends it to the peer VPN client. If not, the inner packet is written to the virtual network interface card (VNIC) through a character device driver. Afterwards, the packet is routed to the kernel protocol stack, where source network address translation (SNAT) is executed, and it is then sent to the target server.
3.1.3. Performance Evaluation Standards
In general, the performance of a network application is measured by round-trip time (RTT), system throughput, packet forwarding rate and jitter. Considering that the protocols of inner packets in VPN tunnels are likely to affect the performance of VPN tunnels, the system throughput of a VPN gateway should be evaluated according to different transmission protocols: UDP or TCP. Therefore, performance evaluation standards of VPN gateways include RTT, UDP throughput (referring to system throughput via UDP transmission), UDP packet forwarding rate, UDP jitter and TCP throughput (referring to system throughput via TCP transmission). Generally speaking, a high-performance VPN gateway is supposed to provide a high-performance VPN service with low latency, high throughput, a high packet forwarding rate and low jitter.
3.2. Analysis of Performance Bottlenecks of Traditional VPN Gateways
This section analyzes three performance bottlenecks of traditional VPN gateways: packet receiving and sending, the kernel protocol stack and VNIC. The details are as follows.
3.2.1. Packet Receiving and Sending
As shown in
Figure 1, the packet receiving and sending of traditional VPN gateways is implemented on top of a Linux socket. Taking packet reception as an example, the whole process can be divided into three stages: reception by the NIC and its driver, decoding by the kernel protocol stack, and reception by the VPN gateway process. When the NIC receives ingress packets and triggers hardware interrupts, the NIC driver copies the packets from the RX queues to the ring buffer in kernel space. The kernel protocol stack then performs protocol validation, firewall filtering and IP routing, and copies the packets from the ring buffer to socket buffers via software interrupts. Finally, the VPN gateway process receives the packets from the socket buffers.
The process of packet receiving and sending causes system interrupts and memory copies several times. Because of the resource management of operating systems, expensive system calls, context switches, and memory allocations and releases occur along with these interrupts and copies. When massive numbers of packets arrive at the NIC, the situation becomes even worse: packets are discarded by the NIC because system resources are exhausted. Therefore, the traditional Linux socket method of packet receiving and sending is one of the most important performance bottlenecks of VPN gateways.
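For reference, the fragment below sketches the traditional socket-based receive loop that this analysis refers to; it is a minimal illustration (the UDP socket, port number and buffer size are our own choices), not code from any of the gateways discussed here.

/* Minimal sketch of the traditional socket-based receive path.
 * Each recvfrom() call crosses the user/kernel boundary and implies
 * the interrupt handling and buffer copies described above. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);          /* UDP socket */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(4500);                       /* arbitrary VPN listening port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return 1;

    char buf[2048];
    for (;;) {
        /* One system call and at least one kernel-to-user copy per packet. */
        ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
        if (n < 0)
            break;
        /* ... decrypt the VPN payload and dispatch it (omitted) ... */
    }
    close(fd);
    return 0;
}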
3.2.2. Kernel Protocol Stack
A kernel protocol stack mainly functions as both a protocol decoder and a protocol encoder conforming to the OSI seven-layer architecture. In addition, it provides IP routing and Netfilter services for VPN gateways. IP routing is used to route ingress packets from the kernel protocol stack to the upper VPN gateway process and to route egress packets from the VPN gateway process to VNIC, which is necessary for both relay-mode and agent-mode VPN gateways. The Netfilter service, which executes SNAT for egress packets and destination NAT (DNAT) for ingress packets, is compulsory for agent-mode VPN gateways only.
In terms of functionality, the kernel protocol stack is feature-rich but largely redundant for a VPN gateway, incurring excessive memory consumption and computational overhead. In terms of performance, the efficiency of passing a packet from user space to kernel space through VNIC and then executing NAT in the kernel Netfilter service is unsatisfactory. Hence, the kernel protocol stack becomes a performance bottleneck of VPN gateways in a high-speed network environment.
3.2.3. Virtual Network Interface Card
VNIC is widely used to establish a virtual network device in most implementations of agent-mode VPN gateways. VNIC mainly consists of two parts: a tun/tap device and a character device driver. The tun/tap device is a virtual network interface that reads packets from the VPN gateway process when requests are sent to target servers and writes packets to the VPN gateway process when responses are received from target servers. The character device driver manipulates the packet reading and writing between user space and kernel space.
Indeed, VNIC simplifies the implementation of agent-mode VPN gateways by utilizing the system IP routing and the NAT functions of the kernel Netfilter service. However, it increases the packet processing time spent in kernel space and occupies more system resources when switching packets between user space and kernel space. In addition, the packet reading and writing interfaces provided by the character device driver suffer from the same weaknesses as the Linux socket under fast packet processing. Thus, VNIC is another performance bottleneck of VPN gateways.
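To make the VNIC data path concrete, the sketch below opens a Linux tun device and reads inner packets from it in the way a typical agent-mode user space VPN would; the device name and the omission of error handling are our simplifications, not details of the gateways evaluated in this paper.

/* Minimal sketch of reading inner packets from a Linux tun device.
 * Every read()/write() moves one packet between user space and the kernel,
 * which is the per-packet cost analyzed above. */
#include <fcntl.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

static int tun_open(const char *dev) {
    struct ifreq ifr;
    int fd = open("/dev/net/tun", O_RDWR);    /* character device of the VNIC */
    if (fd < 0)
        return -1;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TUN | IFF_NO_PI;       /* raw IP packets, no extra header */
    strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);
    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {      /* attach to (or create) the device */
        close(fd);
        return -1;
    }
    return fd;
}

int main(void) {
    char buf[2048];
    int fd = tun_open("tun0");                 /* hypothetical device name */
    if (fd < 0)
        return 1;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));   /* one inner IP packet per read */
        if (n <= 0)
            break;
        /* ... encrypt and forward through the VPN tunnel (omitted) ... */
    }
    close(fd);
    return 0;
}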
3.3. Summary
Through this introduction to the fundamentals and this analysis of the performance bottlenecks of traditional VPN gateways, we find that the performance of VPN gateways is limited by the traditional method of packet receiving and sending, the kernel protocol stack and VNIC. Therefore, our improvement work is to apply a DPDK-based fast packet processing method, to design a user space protocol stack and to implement user space NAT instead of using VNIC with the Netfilter service, respectively. To provide a generic high-performance architecture for VPN gateways, re-designing a generic VPN core framework that provides general VPN functions is equally important.
4. Proposed Generic High-Performance Architecture
Aiming to address the existing performance bottlenecks of traditional VPN gateways mentioned in
Section 3.2, this section introduces a generic high-performance method for VPN gateways. We start by modeling a generic VPN communication model, formulating generic VPN technologies and designing corresponding core algorithms. Then, we present the proposed three-layer generic high-performance architecture (GHPA), which includes a DPDK-based VPN packet processing layer, a user space basic protocol stack and a generic VPN core framework. The symbols in this section are described in
Table 1.
4.1. Generic VPN Communication Model
To re-design a generic VPN core framework, the duties of a generic VPN gateway must be clarified first, so a graph-based generic VPN communication model is proposed to abstract the communication entities in a VPN topology.
As shown in Figure 2, this model abstracts a common VPN topology into a graph. The vertex set of the graph represents the communication entities in the VPN topology, which are composed of the VPN clients, the VPN gateway and the target servers, as in Expressions (1) and (2). The edge set of the graph represents the communication links in the VPN topology, so the connectivity of the graph can be expressed by the adjacency matrix in Expression (3), where a value of 0 means the corresponding pair of vertices is mutually unreachable and a value of 1 means it is reachable. As all network traffic in a VPN topology is switched by the centralized VPN gateway, the paths of the graph can be expressed by the sequences in Expressions (4) and (5).
Paths between a VPN client and the VPN gateway represent direct communication links; a path from one VPN client to another via the VPN gateway is a relay communication link; and a path from a VPN client to a target server via the VPN gateway is an agent communication link. The connectivity of paths in the graph is as important to the model as the management of communication links is to a VPN gateway. A VPN gateway mainly manages three kinds of communication links during its life cycle: handshake communication links between the gateway and its clients, relay-mode data communication links between two VPN clients via the gateway, and agent-mode data communication links between a VPN client and a target server via the gateway.
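Because the displayed equations are not reproduced in this text, the block below gives one plausible formalization of Expressions (1)-(5); every symbol (V, E, C, T, g, A) is a placeholder of ours and may differ from the notation listed in Table 1.

% A plausible reconstruction of Expressions (1)-(5); all symbols are placeholders.
% (1)-(2): the vertices are the VPN clients, the gateway and the target servers.
V = C \cup \{g\} \cup T, \qquad E \subseteq V \times V
% (3): connectivity expressed as an adjacency matrix over the n vertices.
A = [a_{ij}]_{n \times n}, \qquad a_{ij} = \begin{cases} 1, & (v_i, v_j) \in E \\ 0, & \text{otherwise} \end{cases}
% (4)-(5): relay paths (client-gateway-client) and agent paths (client-gateway-server).
p_{relay} = \langle c_i, g, c_j \rangle, \qquad p_{agent} = \langle c_i, g, t_j \rangle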
Through the abstraction of a generic VPN communication model, the duties of a generic VPN gateway are clarified to include handshake communication, relay-mode data communication and agent-mode data communication from the perspective of VPN communication procedures. Then, generic theoretical and algorithmic details of the three key points above will be explored in
Section 4.2 and
Section 4.3, respectively.
4.2. Generic VPN Core Technologies
In this section, three VPN core technologies are introduced in formal language to support the three communication procedures of VPN gateways mentioned in
Section 4.1, including VPN session, VPN routing and VPN NAT.
4.2.1. Formulation of VPN Session
A VPN session is used to process VPN packets during the lifecycle of a VPN tunnel, covering the identification of VPN connections and the driving of VPN protocols.
In terms of connection identification, a VPN tunnel is a TCP-based or UDP-based persistent connection maintained by a VPN gateway over a public link, which can be identified by a unique 4-tuple: source network address, source port, destination network address and destination port. In particular, VPN gateways usually utilize fixed network addresses and ports to listen for requests from remote VPN clients, so a VPN connection can be simplified from a 4-tuple to a 2-tuple consisting of the network address and port of the VPN client, as in Expression (6).
The driving of VPN protocols is an equally important task of a VPN session. To formulate a protocol-independent VPN session, VPN communication procedures can be divided into two stages: the handshake stage and the transmission stage. In the handshake stage, VPN gateways utilize identity authentication protocols to authenticate VPN clients and key negotiation protocols to generate session keys for the next stage. Then, VPN gateways use the session keys to encrypt and decrypt packets in the transmission stage of either relay mode or agent mode. Therefore, the procedures of a VPN connection can be expressed by the procedure set in Expression (7), and its session keys by the corresponding key set.
To summarize, a VPN session stores the connection, communication procedure and session key information of a VPN tunnel, as in Expression (8). As such, a VPN gateway manages a set of VPN session information (referring to set S) during its runtime, as in Expression (9).
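For illustration only, the structure below shows how the connection 2-tuple, the communication procedure state and the session keys described by Expressions (6)-(9) might be grouped into a single session record; the field names and key sizes are assumptions of ours, not definitions from the paper.

/* Hypothetical layout of one VPN session record (cf. Expression (8)):
 * connection 2-tuple + communication procedure state + session keys. */
#include <stdint.h>

enum vpn_procedure {
    VPN_PROC_HANDSHAKE = 1,   /* identity authentication and key negotiation */
    VPN_PROC_RELAY     = 2,   /* relay-mode data transmission */
    VPN_PROC_AGENT     = 3    /* agent-mode data transmission */
};

struct vpn_connection {       /* simplified 2-tuple of Expression (6) */
    uint32_t client_addr;     /* network address of the VPN client */
    uint16_t client_port;     /* port of the VPN client */
};

struct vpn_session {          /* one element of the session set S */
    struct vpn_connection conn;
    enum vpn_procedure procedure;   /* current stage of the tunnel */
    uint8_t tx_key[32];             /* session key for encryption (assumed 256-bit) */
    uint8_t rx_key[32];             /* session key for decryption (assumed 256-bit) */
    struct vpn_session *next;       /* hash-chain links used by Algorithm 1 */
    struct vpn_session *prev;
};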
4.2.2. Formulation of VPN Routing
Differing from traditional network routing, VPN routing in relay-mode VPN gateways is required to find the session key of the peer VPN client according to the destination virtual address of an inner packet. After decrypting the payload of an ingress VPN packet, a VPN gateway decodes the inner packet to obtain its destination virtual address. This virtual address is then used to find the session key of the corresponding communication link. Afterward, the VPN gateway encrypts the inner packet with that session key, encapsulates it with a VPN header again and sends it to the peer VPN client.
In short, a VPN routing policy is a 2-tuple that associates a virtual address with a session key, as in Expression (10). Therefore, a VPN gateway manages a set of VPN routing policies (referring to set R) during its runtime, as in Expression (11).
4.2.3. Formulation of VPN NAT
VPN NAT is applied in agent-mode VPN gateways to translate the virtual address of an inner packet to the gateway network address and to replace the application port with a gateway port. In essence, VPN NAT is a mapping strategy between the applications of VPN clients and the services of target servers. A VPN gateway executes SNAT for an inner request packet from a VPN client to a target server and executes DNAT for a response packet in the opposite direction. The two NAT strategies above can be expressed by the SNAT mapping and the DNAT mapping in Expressions (12) and (13). The SNAT mapping takes the 2-tuple of the VPN client (its virtual address and application source port), the 2-tuple of the target server (its destination network address and destination port) and the communication protocol between them, and maps them to the 2-tuple of the VPN gateway (its network address and an available port); the DNAT mapping is the inverse.
In conclusion, a VPN gateway manages a set of SNAT and DNAT strategies (referring to set N) during its runtime, as in Expression (14).
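Similarly, the SNAT and DNAT mappings of Expressions (12) and (13) can be pictured as key/value records; the hypothetical layout below uses field names of our own choosing.

/* Hypothetical SNAT/DNAT strategy records for agent-mode forwarding.
 * SNAT: (client virtual addr, app port, server addr, server port, proto)
 *        -> (gateway addr, gateway port)
 * DNAT: (gateway addr, gateway port, server addr, server port, proto)
 *        -> (client virtual addr, app port)                              */
#include <stdint.h>

struct nat_endpoint {
    uint32_t addr;                /* IPv4 address */
    uint16_t port;                /* transport-layer port */
};

struct snat_key {                 /* lookup key for egress (client-to-server) packets */
    struct nat_endpoint client;   /* virtual address and application source port */
    struct nat_endpoint server;   /* destination address and destination port */
    uint8_t proto;                /* TCP or UDP */
};

struct nat_strategy {             /* one paired entry of the set N */
    struct snat_key key;
    struct nat_endpoint gateway;  /* translated source: gateway address and port */
    uint64_t last_used;           /* timestamp used by the shared time chain */
};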
4.3. Generic VPN Core Algorithms
On the basis of the description and formulation of the VPN session, VPN routing and VPN NAT, this section further discusses the core algorithms of these three technologies.
4.3.1. VPN Session Matching Algorithm
A VPN gateway manages a VPN session set (referring to Expression (9)) of all connected clients and utilizes specific session information to drive a specific VPN tunnel for a specific client during its runtime. Therefore, we propose a session matching algorithm based on a chained hash table to efficiently match real-time VPN packets with VPN sessions.
A session hash table consists of two kinds of chains: a two-way hash chain, which stores the VPN sessions and resolves hash collisions, and a free chain, which is a pre-allocated session chain used to store new sessions without the overhead of dynamic memory allocation. The hash index of a hash bucket is computed from the connection 2-tuple with the crc32 hash algorithm.
Figure 3 depicts the storage structure of the VPN session table.
Furthermore, our proposed VPN session matching algorithm, driven by the communication procedures and based on the chained hash table, is described in Algorithm 1. When an ingress VPN packet is received, the hash index ix of its connection 2-tuple is computed and the VPN session table is searched with ix and the 2-tuple. If no session node is found, an insertion process is activated. First, a free session node from the free chain is used to store the connection information according to Expression (8). Then, hash index ix is used to locate the corresponding hash bucket. Afterwards, the new session node is inserted at the tail of the hash chain and pointed to by the corresponding tail pointer. After finding or inserting a session node, a VPN session is matched with the ingress packet and its VPN header is then decoded to drive the different communication procedures.
Algorithm 1 Algorithm to match VPN sessions.
Input: The connection information (2-tuple) of an ingress packet; the VPN session hash table.
1: compute hash index ix from the 2-tuple;
2: search the session table with ix and the 2-tuple;
3: if no session node is found then
4:   insert a new session node into the table;
5:   initialize the session node according to Expression (8);
6: end if
7: decode the VPN header to get communication procedure s;
8: switch (s)
9: case 1:
10:   drive the VPN handshake procedure;
11: case 2:
12:   decrypt the packet payload with the session keys;
13:   execute VPN routing searching with Algorithm 2;
14:   drive the relay-mode transmission procedure;
15: case 3:
16:   decrypt the packet payload with the session keys;
17:   execute VPN NAT with Algorithm 3;
18:   drive the agent-mode transmission procedure;
19: default:
20:   ignore invalid packets;
21: end switch
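A condensed user space sketch of the hash-table part of Algorithm 1 is shown below; the hash function is a trivial stand-in for crc32, head insertion replaces the tail-pointer insertion and free chain of the paper, and all identifiers are illustrative rather than taken from the prototype.

/* Sketch of the chained-hash session lookup/insert of Algorithm 1. */
#include <stdint.h>
#include <stdlib.h>

#define SESSION_BUCKETS 4096

struct session {
    uint32_t addr;            /* client network address (2-tuple, part 1) */
    uint16_t port;            /* client port            (2-tuple, part 2) */
    struct session *next;     /* hash chain */
};

static struct session *buckets[SESSION_BUCKETS];

static uint32_t hash_2tuple(uint32_t addr, uint16_t port) {
    uint32_t h = addr * 2654435761u ^ port;    /* placeholder for crc32 */
    return h % SESSION_BUCKETS;
}

/* Find the session for an ingress packet, inserting a new node if absent. */
struct session *session_match(uint32_t addr, uint16_t port) {
    uint32_t ix = hash_2tuple(addr, port);
    struct session *s;
    for (s = buckets[ix]; s != NULL; s = s->next)
        if (s->addr == addr && s->port == port)
            return s;                          /* existing session matched */
    s = calloc(1, sizeof(*s));                 /* calloc stands in for the free chain */
    if (s == NULL)
        return NULL;
    s->addr = addr;
    s->port = port;
    s->next = buckets[ix];                     /* head insertion for brevity */
    buckets[ix] = s;
    return s;                                  /* caller initializes it per Expression (8) */
}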
4.3.2. VPN Routing Searching Algorithm
A VPN gateway manages VPN routing policies set R (referring to Expression (11)) to find a specific session key for a specific virtual address in relay-mode data transmission. In view of network address storage and searching methods in the kernel routing table, we propose a VPN routing searching algorithm based on a Patricia tree to provide real-time matching for VPN routing policies.
A Patricia tree is a space-optimized trie applied in the traditional network routing algorithm of the 4.4 BSD-Lite kernel. It is a binary tree with two types of nodes: internal nodes and leaf nodes. Each node stores a bit sequence, a comparative bit and pointers to its parent and child nodes, while only leaf nodes store keyword information. Therefore, the storage structure of our proposed VPN routing table is shown in Figure 4, which takes the storage of three routing policies as an example. A parent node is split by the value of its comparative bit: a key descends to the left child when the comparative bit equals 0 and to the right child when it equals 1.
Hence, our proposed VPN routing searching algorithm is described in Algorithm 2. The whole process can be divided into three steps. First, the string-formatted destination virtual address is converted into an address a in bit-sequence format. Second, the routing table T is searched recursively from its root node to a leaf node according to the comparative bits. Third, a is compared with the bit sequence b of the leaf node and, if they match, the session key associated with that address is returned.
Algorithm 2 Algorithm to search VPN routing policies.
Input: The destination virtual address of an inner packet; the VPN routing table, T.
Output: The session key of the corresponding communication link, or null.
1: if T is null then
2:   return null;
3: end if
4: transfer the destination address to bit sequence a;
5: let p point to the root node of T;
6: while p is not a leaf node do
7:   read the comparative bit position cb of p;
8:   if bit cb of a equals 0 then
9:     let p point to the left child of p;
10:  else
11:    let p point to the right child of p;
12:  end if
13: end while
14: read bit sequence b of p;
15: if a equals b then
16:   return the session key of p;
17: else
18:   return null;
19: end if
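The lookup of Algorithm 2 can be sketched as follows, assuming IPv4 virtual addresses and a simplified node layout; the identifiers and helpers are ours, not those of the prototype.

/* Sketch of the routing lookup of Algorithm 2: descend a Patricia-style
 * binary trie by comparative bits, then confirm the full key at the leaf. */
#include <stdint.h>
#include <stddef.h>

struct rt_node {
    int bit;                       /* comparative bit position; -1 marks a leaf */
    uint32_t key;                  /* full virtual address, leaf nodes only */
    const uint8_t *session_key;    /* routing payload, leaf nodes only */
    struct rt_node *left, *right;  /* children for bit value 0 / 1 */
};

static int test_bit(uint32_t addr, int bit) {
    return (addr >> (31 - bit)) & 1u;          /* most significant bit is bit 0 */
}

/* Return the session key bound to virtual address va, or NULL if absent. */
const uint8_t *vpn_route_lookup(const struct rt_node *root, uint32_t va) {
    const struct rt_node *p = root;
    if (p == NULL)
        return NULL;
    while (p->bit >= 0)                        /* step 2: walk down to a leaf */
        p = test_bit(va, p->bit) ? p->right : p->left;
    if (p->key == va)                          /* step 3: confirm the whole key */
        return p->session_key;
    return NULL;
}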
4.3.3. VPN NAT Algorithm
A VPN gateway manages VPN NAT strategies set N (referring to Expression (14)) to execute SNAT for egress packets and execute DNAT for ingress packets in agent-mode data transmission. We propose a VPN SNAT algorithm and a VPN DNAT algorithm based on chained hash tables.
VPN NAT hash tables consist of an SNAT table Ns and a DNAT table Nd. Each table has the same two kinds of chains and the same hash function as the session hash table. In addition, tables Ns and Nd share a two-way time chain.
Figure 5 illustrates the storage structures of two NAT tables, especially omitting the less important tail pointers and emphasizing the structure of the shared two-way time chain. Algorithm 3 describes the process of executing SNAT for egress packets, which can be interpreted as inserting SNAT and DNAT strategies for later packets’ use. Similarly, the process of executing DNAT for ingress packets is easier, which can be considered as searching DNAT strategies. As such, the details of the VPN DNAT algorithm will not be discussed any further.
Algorithm 3 Algorithm to execute SNAT for egress packets.
Input: The 5-tuple of an egress packet (source virtual address, source port, destination address, destination port, protocol); the SNAT strategy table referring to Expression (12), Ns; the DNAT strategy table referring to Expression (13), Nd.
1: if the 5-tuple is null then
2:   return null;
3: end if
4: if a matching SNAT strategy exists in Ns then
5:   use the matched strategy to do SNAT for the packet;
6: else
7:   find an available 2-tuple (address, port) of the VPN gateway;
8:   add the new SNAT strategy to Ns;
9:   add the paired DNAT strategy to Nd;
10:  use the gateway 2-tuple to do SNAT for the packet;
11: end if
Compared with inserting and searching NAT strategies, removing them is relatively difficult for a VPN gateway. As an SNAT strategy and a DNAT strategy are always inserted as a pair, we utilize a time chain to connect all the strategy pairs. Therefore, a VPN gateway can remove timed-out NAT strategy pairs by periodically checking the shared time chain.
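The fragment below sketches the pairing and time-chain bookkeeping of Algorithm 3 in a much simplified form: the two hash tables are collapsed into a single linear list and every identifier is a placeholder of ours.

/* Simplified sketch of Algorithm 3: on an egress packet, reuse or create a
 * paired SNAT/DNAT strategy and keep it on a shared time chain so that stale
 * pairs can be aged out.  Data structures are deliberately minimal. */
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

struct nat_pair {
    uint32_t va;  uint16_t sport;      /* client virtual address + app port  */
    uint32_t da;  uint16_t dport;      /* target server address + port       */
    uint8_t  proto;
    uint32_t ga;  uint16_t gport;      /* translated gateway address + port  */
    time_t last_used;                  /* refreshed on every hit             */
    struct nat_pair *time_next;        /* shared time chain (singly linked   */
};                                     /* here for brevity)                  */

static struct nat_pair *time_chain;    /* stands in for the tables Ns and Nd */

struct nat_pair *snat_for_egress(uint32_t va, uint16_t sport, uint32_t da,
                                 uint16_t dport, uint8_t proto,
                                 uint32_t gw_addr, uint16_t free_port) {
    for (struct nat_pair *p = time_chain; p != NULL; p = p->time_next)
        if (p->va == va && p->sport == sport && p->da == da &&
            p->dport == dport && p->proto == proto) {
            p->last_used = time(NULL);          /* existing strategy reused  */
            return p;
        }
    struct nat_pair *p = calloc(1, sizeof(*p)); /* create the SNAT/DNAT pair */
    if (p == NULL)
        return NULL;
    p->va = va; p->sport = sport; p->da = da; p->dport = dport; p->proto = proto;
    p->ga = gw_addr; p->gport = free_port;      /* available gateway 2-tuple */
    p->last_used = time(NULL);
    p->time_next = time_chain;                  /* link onto the time chain  */
    time_chain = p;
    return p;
}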
4.4. Proposed Generic High-Performance Architecture
This section introduces our proposed generic high-performance architecture (GHPA) in light of the existing performance bottlenecks of VPN gateways.
Figure 6 presents the overall architecture of the three-layer GHPA. On the bottom layer, a DPDK-based VPN packet processing layer is proposed to replace the traditional Linux socket approach and relieve the bottleneck of packet receiving and sending mentioned in
Section 3.2.1. On the middle layer, a user space basic protocol stack is designed and applied instead of using the redundant and inefficient kernel protocol stack referred to in
Section 3.2.2. In addition, a generic VPN core framework on the top layer is presented to enclose the generic core technologies and algorithms of VPN gateways, in which a user space NAT implementation avoids the bottlenecks of VNIC and the kernel Netfilter service according to
Section 3.2.3. The details of each layer are discussed as follows.
4.4.1. DPDK-Based VPN Packet Processing Layer
The design of the DPDK-based VPN packet processing layer follows DPDK practices for fast packet receiving and sending in user space. This layer is mainly composed of four parts: the IGB_UIO kernel module, the DPDK PMD, the DPDK EAL and the DPDK memory management libraries. The IGB_UIO kernel module takes over control of the NIC by unbinding the traditional IGB kernel driver and registering the NIC device. The DPDK PMD directly accesses packets from the RX and TX descriptors without any interrupts, so packets are received and sent quickly in user space. The DPDK EAL provides an easy-to-use programming interface, which is utilized to load and launch the DPDK working environment, reserve system memory and configure and initialize the NIC. The Ring, Mempool and Mbuf libraries make up the core of the DPDK memory management libraries. Among them, Mbuf is the basic data structure that stores link-layer packets to be received by upper layers or sent by the NIC.
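As a rough illustration of how such a layer is bootstrapped, the sketch below initializes DPDK and polls one RX queue with the PMD; the port, queue and pool sizes are arbitrary choices, and the fragment is not taken from our prototype.

/* Minimal sketch of bootstrapping the DPDK packet processing layer and
 * polling one RX queue with the poll-mode driver.  Error handling is
 * reduced to early returns. */
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>

#define RX_RING_SIZE 1024
#define TX_RING_SIZE 1024
#define BURST_SIZE   32

int main(int argc, char **argv) {
    struct rte_eth_conf port_conf = { 0 };
    struct rte_mbuf *bufs[BURST_SIZE];
    uint16_t port = 0;

    if (rte_eal_init(argc, argv) < 0)                /* load and launch the EAL */
        return -1;
    struct rte_mempool *pool = rte_pktmbuf_pool_create("MBUF_POOL", 8192, 256, 0,
                                   RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
    if (pool == NULL)
        return -1;
    if (rte_eth_dev_configure(port, 1, 1, &port_conf) < 0)   /* 1 RX, 1 TX queue */
        return -1;
    if (rte_eth_rx_queue_setup(port, 0, RX_RING_SIZE, rte_socket_id(), NULL, pool) < 0)
        return -1;
    if (rte_eth_tx_queue_setup(port, 0, TX_RING_SIZE, rte_socket_id(), NULL) < 0)
        return -1;
    if (rte_eth_dev_start(port) < 0)
        return -1;

    for (;;) {
        /* The PMD polls the RX descriptors directly: no interrupts, no syscalls. */
        uint16_t nb = rte_eth_rx_burst(port, 0, bufs, BURST_SIZE);
        for (uint16_t i = 0; i < nb; i++) {
            /* ... hand bufs[i] to the user space basic protocol stack ... */
            rte_pktmbuf_free(bufs[i]);
        }
    }
    return 0;
}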
4.4.2. User Space Basic Protocol Stack
Packets received by the underlying packet processing layer cannot be directly used by upper applications. Therefore, the user space basic protocol stack is designed to bridge the gap between packet processing and upper applications, while the inefficient kernel protocol stack is abandoned in this implementation. This basic protocol stack provides protocol validation, packet encapsulation, packet decapsulation and checksum computation, and can be regarded as a minimal subset of the kernel protocol stack that still implements all core functions of a VPN gateway. As shown in
Figure 6, Ethernet II, the address resolution protocol (ARP), IPv4, TCP and UDP are required to guarantee normal communication between a VPN gateway and its clients in both relay mode and agent mode. Ethernet II, the carrier of the upper protocols, plays a vital role in packet delivery. To ensure network reachability during the work cycle of a VPN gateway, ARP is utilized to periodically announce the physical address and network address of the VPN gateway to its default gateway device. In addition, IPv4, TCP and UDP are essential to carry the upper VPN protocols over a public network link.
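As one small example of the work such a stack performs, the helper below computes the standard IPv4 header checksum using RFC 1071 one's-complement summation; it is illustrative and not code from the prototype.

/* One's-complement checksum over an IPv4 header, as a user space protocol
 * stack must compute for every outgoing packet.  The checksum field of the
 * header must be zeroed before calling, and hdr is assumed 16-bit aligned. */
#include <stdint.h>
#include <stddef.h>

uint16_t ipv4_header_checksum(const void *hdr, size_t len_bytes) {
    const uint16_t *p = hdr;
    uint32_t sum = 0;
    /* Sum the header as 16-bit words (an IPv4 header length is always even). */
    for (size_t i = 0; i < len_bytes / 2; i++)
        sum += p[i];
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}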
4.4.3. Generic VPN Core Framework
A generic VPN core framework is re-designed on the basis of the modeling, formulation and algorithms design of the generic VPN core technologies from
Section 4.1,
Section 4.2 and
Section 4.3. As shown in
Figure 6, this framework mainly covers three components: the VPN session component, the VPN routing component and the VPN NAT component, which are compulsory to guarantee the core functions of traditional VPN gateways. The VPN session component provides generic topology management and generic tunnel management for VPN gateways. The former refers to monitoring and managing the real-time working state of the communication entities and communication links of the VPN topology. The latter refers to matching VPN sessions (referring to set S) in real time and driving the communication procedures according to different VPN protocols, which is extensible and applicable to other common VPNs. The VPN routing component manages the VPN routing policies (referring to set R) and is applied exclusively to meet the demands of data transmission via relay-mode VPN gateways, such as interconnecting hosts from different LANs and building interconnected private clouds through public network links. The VPN NAT component is specialized for managing VPN NAT strategies (referring to set N) for agent-mode VPN gateways, and is used for secure remote access and bypassing network censorship. In addition, the VPN NAT component provides a highly efficient method for executing SNAT for egress packets and DNAT for ingress packets in user space, rather than combining VNIC with the kernel Netfilter service to implement NAT as in traditional methods.
6. Performance Tests and Evaluations
After implementation of the two prototype systems above, we conduct performance tests on the implemented VPN gateways and other common VPN gateways to prove our proposed GHPA has superior performance. The details are as follows.
6.1. Test Objects
Besides HP-VPN and T-VPN, we chose four other open-source VPN projects, PPTP, Accel-ppp, Strongswan and OpenVPN, as our test objects, which basically cover the common VPN protocols, such as PPTP VPN, IPsec VPN and SSL VPN.
6.2. Test Environment
Our performance tests are carried out in a LAN environment. As shown in
Figure 8, the topology consists of one gateway device and two PCs. Details of these device configurations are shown in
Table 2. In our experiments, the VPN servers of test objects are deployed on the gateway while VPN clients are configured or installed on the PCs.
6.3. Test Process
According to
Section 3.1.3, RTT, UDP throughput, UDP packets per second, UDP jitter and TCP throughput are supposed to be tested during the testing process. Meanwhile, considering the two communication modes of VPN gateways, we use Nping and Iperf, two open-source network test tools, to conduct two performance tests for both relay-mode VPN gateways and agent-mode VPN gateways. Each performance test has six rounds for six test objects, respectively. The test process of each round is described as follows.
The test of the relay-mode VPN gateways starts when a VPN server and the corresponding VPN clients are successfully started. First, we execute command ping [address] -c [times] in PC 1 to test RTT and record the test results. Then, we execute command iperf -u -c [address] -l [length] -b [bandwidth] -t [interval] in PC 1, observe and record the maximum UDP throughput, UDP packets per second and UDP jitter of different payload lengths on the premise of a zero packet loss rate. Finally, we test TCP throughput by executing iperf -c [address] -M [mss] -t [interval] in PC 1 and recording the test results.
In addition, the process of testing agent-mode VPN gateways is similar to the process above, so further details are not stated. The only difference between the two test processes is the value of the [address] parameter. In the relay-mode tests, [address] is set to the virtual address of PC 2. In the agent-mode tests, [address] is set to an address that is not the virtual address of PC 2, so the corresponding route and DNAT rules must be set on the gateway and PC 2 according to the specific situation.
6.4. Test Results and Analyses
The test results of the relay-mode VPN gateways and agent-mode VPN gateways are shown in
Figure 9 and
Figure 10, respectively.
According to the test results of the relay-mode VPN gateways in
Figure 9, HP-VPN performs the best in all the evaluating standards mentioned in
Section 3.1.3. The details are as follows.
As shown in
Figure 9a, the RTT of HP-VPN is obviously lower than that of other VPN gateways. It is further calculated that the average RTTs of the six VPN gateways in ascending order are 0.712 ms, 0.896 ms, 0.978 ms, 1.139 ms, 1.201 ms and 1.599 ms for HP-VPN, Strongswan, T-VPN, Accel-ppp, OpenVPN and PPTP, respectively, which proves HP-VPN provides a better low-latency VPN service than other VPN gateways.
Figure 9b depicts the changes in maximum UDP throughput under the circumstances of different data payload lengths. It is clear that the UDP throughput of HP-VPN is far higher than the other five VPN gateways. In view of the overall trends of all curves, maximum UDP throughput rises with increase in data payload length while curve gradient decreases with increase in data payload length. Therefore, a shorter data payload challenges system throughput more. Simply speaking, maximum performance differences among the six VPN gateways can be calculated when testing with the shortest data payload, which will be discussed in
Section 6.5.
Figure 9c shows the changes in UDP packet forwarding rate under the conditions of different data payload lengths. The packet forwarding rate of HP-VPN is obviously higher than the other VPN gateways. Meanwhile, the packet forwarding rate decreases with increase in data payload length and the overall trends of all curves are contrary to
Figure 9b.
Figure 9d displays the changes in UDP jitter under the circumstances of different data payload lengths. The curves of HP-VPN and PPTP show fewer fluctuations than those of the other VPN gateways. It is further calculated that the average jitter of the six VPN gateways in ascending order is 0.042 ms, 0.045 ms, 0.073 ms, 0.087 ms, 0.090 ms and 0.125 ms for HP-VPN, PPTP, OpenVPN, T-VPN, Accel-ppp and Strongswan, respectively, which verifies that HP-VPN provides a more stable VPN service than the other VPN gateways.
Figure 9e shows the maximum TCP throughput of the six VPN gateways. There is no doubt that the maximum TCP throughput of HP-VPN is far higher than the other five VPN gateways.
Similarly, HP-VPN performs the best in all the evaluating standards in the performance tests of agent-mode VPN gateways as well, ranging from the RTT test results in
Figure 10a to the maximum TCP throughput test result in
Figure 10e. Considering that the test process and evaluation standards for the two modes are the same, the detailed analyses are not repeated. However, it is worth mentioning that all VPN gateways show better performance in agent mode than in relay mode, because relay-mode forwarding requires additional computational and time overhead to encrypt and encapsulate the VPN tunnel packets one more time.
Ordinary one-way analysis of variance (ANOVA) is performed on the test results in Figure 9 and Figure 10. The F and p values for each group are listed in
Table 3. The calculation results verify that the differences between the means of each group are statistically significant.
6.5. Evaluations
It is observed that the performance of the HP-VPN gateway based on GHPA is superior to that of the other five VPN gateways in RTT, UDP throughput, UDP packet forwarding rate, UDP jitter and TCP throughput in both relay mode and agent mode. Considering that the PPTP VPN gateway shows the worst overall performance, we set all evaluation standards of PPTP VPN to index 1 and then calculate the relative performance of all test objects in relay mode and agent mode, which are shown in
Table 4 and
Table 5, respectively.
From the perspective of horizontal comparisons between HP-VPN based on GHPA and the other VPN gateways implemented with traditional methods, the HP-VPN gateway shows the best performance in all evaluation standards in both relay mode and agent mode. From the perspective of vertical comparisons between HP-VPN and T-VPN, which use the same VPN protocol, using GHPA significantly improves the performance of a VPN gateway as well. When working in relay mode, the average RTT of HP-VPN is 0.73 times that of T-VPN; the UDP throughput of HP-VPN is 2.50 to 4.29 times that of T-VPN; the UDP packet forwarding rate of HP-VPN is 2.50 to 4.29 times that of T-VPN; the average UDP jitter of HP-VPN is 0.49 times that of T-VPN; and the TCP throughput of HP-VPN is 2.47 times that of T-VPN. When working in agent mode, the average RTT of HP-VPN is 0.72 times that of T-VPN; the UDP throughput of HP-VPN is 3.34 to 5.04 times that of T-VPN; the UDP packet forwarding rate of HP-VPN is 3.34 to 5.04 times that of T-VPN; the average UDP jitter of HP-VPN is 0.38 times that of T-VPN; and the TCP throughput of HP-VPN is 2.90 times that of T-VPN. Refactoring the VPN gateway in user space with the DPDK-based packet processing layer and the user space basic protocol stack thus proves to overcome the performance bottlenecks of VPN gateways mentioned in Section 3.2. Quantitative analysis of user space VPN implementations will be conducted in our future work.
In conclusion, GHPA for VPN gateways is superior to traditional methods of implementing VPN gateways in both performance and generality. In terms of performance, GHPA provides a high-throughput, low-latency and stable VPN service. In terms of generality, GHPA is extensible and applicable to other common VPN gateways, making it easier to implement a high-performance gateway when implementing a VPN protocol to meet specific needs.