1. Introduction
In cryptography, the security of a cryptosystem is often based on the hardness of a well-studied problem that is believed to be hard, such as integer factorization, the discrete logarithm problem, or Learning With Errors (LWE). Some of these problems could be solved by algorithms running on large-scale quantum computers. A typical example is Shor’s algorithm [1], which could break the most widely used public key cryptosystems, such as RSA [2] and Elliptic Curve Cryptography (ECC) [3,4].
Introduced independently by Koblitz [3] and Miller [4] in 1985, ECC is a subfield of asymmetric cryptography. It uses the algebraic properties of elliptic curves over finite fields, and its security is based on the hardness of the Elliptic Curve Discrete Logarithm Problem (ECDLP). ECC supports key exchange [5], encryption and decryption [6], digital signatures [7], and random number generation [8], and it requires smaller key sizes than other asymmetric systems such as RSA. ECC is used in industrial applications such as the Bitcoin digital currency [9], the security of the transport layer [10], and various communication services.
The use of machine learning techniques in cryptography and security is still a rapidly evolving topic. Nevertheless, machine learning has already been deployed in certain applications, mainly for security issues. In recent years, machine learning algorithms have been used to implement and enhance the efficiency and security of various cryptographic systems. These algorithms are applied to analyze cryptosystems, detect intrusions, test the security of systems, and perform cryptanalysis.
The connection between machine learning (ML) and cryptography was first discussed by Rivest [11] in 1991. Since then, various intersections between the two fields have been extensively studied, covering both cryptography and cryptanalysis, the two subfields of cryptology. In cryptography, the schemes proposed in [12,13,14] are based on neural network models, while the schemes proposed in [15,16] are based on deep learning.
ML is employed to select optimal secret keys for encryption and decryption in a symmetric system, as well as optimal public keys for encryption in an asymmetric system [17,18,19,20]. ML is also used to observe the algebraic properties of encrypted data and to test the vulnerabilities of cryptographic systems [21]. Furthermore, it helps to understand weaknesses in security and privacy mechanisms and to develop resilient defenses [22]. Various machine learning algorithms are also leveraged to build effective intrusion detection software packages, targeting both intrusions and attacks [23,24].
In cryptanalysis applications, Alani [25] introduced an attack on DES and Triple-DES based on a neural network. In 2015, Maghrebi et al. [26] proposed a method to apply deep learning in side-channel attacks.
In the ECC field, there are many schemes for which both implementation and security are challenging tasks. In [27], Tellez and Ortíz presented a study of possible applications of the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), two artificial intelligence (AI) algorithms, to generate strong parameters for ECC. In [28], Villegas and Cordero presented an experimental evaluation of the resistance of ECC to simple power attacks using ML models. In [29], Weissbart et al. presented several attacks on the Edwards Digital Signature Algorithm (EdDSA) using machine learning techniques. In [30], Wøien et al. presented a neural network model for asymmetric encryption, focusing on algorithms in ECC. In [31], the execution time, energy consumption, and memory usage of the encryption/decryption algorithms of several lightweight cryptographic systems are studied using machine learning models.
In this paper, the main objective is to study how Elliptic Curve Cryptography can be performed with the support of machine learning.
Section 2 provides an overview of the main concepts of artificial intelligence and machine learning.
Section 3 introduces the arithmetical theory of elliptic curves.
Section 4 examines elliptic curve cryptography.
Section 5 discusses the main attacks on ECC.
Section 6 explores the application of machine learning in the field of ECC.
Section 7 summarizes and concludes this paper.
2. Artificial Intelligence and Machine Learning
AI is a combination of science and technology. It is based on several disciplines in engineering and mathematics, such as algebra, statistics, probability, and chaos theory. Other fields, including biology, computer science, information theory, and linguistics, also contribute to AI. Today, AI is applied across various fields such as computer vision, gaming, finance and banking, healthcare, language processing and recognition, self-driving vehicles, pharmaceutical discovery, chatbots, robotics, data analysis, and cybersecurity.
2.1. Overview of Machine Learning
ML is a subfield of AI focused on creating, testing, and adapting computer procedures, algorithms, and programs that can automatically improve by learning from past experiences. It is used in various applications, such as financial fraud detection, healthcare report analysis, agricultural optimization, information dissemination, financial investment optimization, traffic prediction, and language translation.
There are three categories of machine learning algorithms: supervised, reinforcement, and unsupervised.
- Supervised learning. In supervised learning, the machine is under the supervision of an operator. The input and output datasets are labeled, known to the operator, and given to the algorithm implemented in the machine. The task of the algorithm is to find a link between the input and the output datasets. To this end, the algorithm must identify patterns in the input dataset, learn from previous statistical occurrences, and propose predictions. If the predictions are far from correct, then the parameters of the algorithm are adjusted. This process continues until the predictions are acceptable and the errors are sufficiently minimized. Typical supervised tasks include classification, linear regression, and forecasting. The ultimate goal is that the algorithm makes correct predictions on unseen data. A typical application of supervised learning is fraud detection: fraudulent and suspicious transactions can be detected by the algorithm using stored data.
- Reinforcement learning. In this category of machine learning, the algorithm is trained to take appropriate actions. This is accomplished by rewarding good actions and penalizing bad ones. More precisely, the algorithm learns from experience how to achieve a goal in an optimal way through interactions with the environment. The algorithm has to discover which actions are desirable and which are not. A typical example of reinforcement learning is autonomous driving. A reliable autonomous driver must analyze various situations and make several decisions, such as finding an optimal path, avoiding dense traffic, predicting travel time, and driving safely.
- Unsupervised learning. In unsupervised learning, the machine is independent of any human operator. The machine learning algorithm analyzes and clusters unlabeled datasets without the need for human help or intervention. Clustering permits the discovery of hidden patterns and groups in unlabeled datasets based on their categories, similarities, and differences. The goal of unsupervised learning is to group the data into well-organized clusters with an optimal number of classes. A typical application of unsupervised learning is customer segmentation by commercial companies. They can use an unsupervised learning algorithm to identify their customers’ common needs and cluster them into categories in order to propose their products to potential buyers.
2.2. Overview of Perceptron and Multilayer Perceptron
The perceptron is a basic supervised learning algorithm and the simplest type of artificial neural network, invented by Rosenblatt in 1958 [32]. There are two families of perceptrons: single-layer perceptrons, which can only separate linearly separable data, and multilayer perceptrons, which can also handle nonlinearly separable data by means of nonlinear activation functions.
A single-layer perceptron is designed to classify a list of inputs and give one binary output, generally 0 or 1. It is composed of several basic components, including an input layer, weights, a bias, an activation function, and a single output (see Figure 1). The perceptron starts by taking the bias and a list of scalar input features. A weight is assigned to each input, and a linear combination of all pairs (input, weight) is computed. The result of the linear combination is added to the bias and passed to the activation function, which decides to which category the input features belong. Typically, if the input features are $x_1, \ldots, x_m$, the weights are $w_1, \ldots, w_m$, the bias is $b$, and the activation function is $f$, then the output is
$$y = f\left(\sum_{i=1}^{m} w_i x_i + b\right).$$
A multilayer perceptron is an artificial neural network that can process all kinds of data, including nonlinearly separable data. It is composed of an input layer, one or more hidden layers, and one output layer. The input layer is composed of one or more nodes where the initial input data is introduced. The hidden layers are also composed of one or more nodes. Each node in a hidden layer receives inputs from all the nodes of the previous layer. The information is processed and passed to the nodes of the next layer. At the end, the output layer receives the final inputs and produces the final output. The number of nodes in the output layer represents the number of possible classes of the input data (see Figure 2).
Multilayer perceptrons are used in various applications such as speech and image recognition, banking, e-commerce, and travel.
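To make the perceptron output rule concrete, the following Python sketch computes the output as an activation applied to the weighted sum of the inputs plus the bias, and trains the weights with the classical perceptron update rule. The step activation, the learning rate, the number of epochs, and the AND dataset are assumptions chosen for illustration; they are not taken from the references above.

```python
def step(z):
    """Step activation (an illustrative choice): 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def perceptron_output(x, w, b, f=step):
    """Output f(w_1*x_1 + ... + w_m*x_m + b) of a single-layer perceptron."""
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_perceptron(samples, n_inputs, lr=1, epochs=20):
    """Perceptron learning rule: shift weights and bias by lr * error * input."""
    w, b = [0] * n_inputs, 0
    for _ in range(epochs):
        for x, target in samples:
            err = target - perceptron_output(x, w, b)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


# Logical AND is linearly separable, so a single-layer perceptron can learn it.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND, n_inputs=2)
predictions = [perceptron_output(x, w, b) for x, _ in AND]
```

A dataset that is not linearly separable, such as XOR, cannot be fitted by this single-layer model, which is the motivation for adding hidden layers.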
2.3. Overview of Artificial Neural Networks
Neural Networks are modern algorithms at the heart of machine learning, inspired by the human brain. They mimic the functioning of biological neurons to analyze tasks and propose solutions. A neural network is composed of a sequence of layers of nodes, namely, input layers, hidden layers, and output layers (see Figure 2). The data is introduced in the input layers and is processed in the hidden layers using activation functions. Finally, predictions are made by the output layers.
The nodes in two adjacent layers are connected, and the connections are guided by weights. Moreover, each node has an associated bias. The weights and biases are adjusted during the training phase of the neural network through feedforward and backpropagation. These adjusted weights and biases enable each node to optimize its computations.
There are various types of neural networks such as Generative Adversarial Networks (GANs), Convolutional Neural Networks (CNNs), Feedforward Neural Networks (FNNs), and Recurrent Neural Networks (RNNs).
5. Security of ECC
In this section, we present the most powerful attacks on ECC systems. Most of the attacks are designed to solve the Elliptic Curve Discrete Logarithm Problem (ECDLP).
5.1. Pollard’s Rho Algorithm
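Let $n$ be the order of the subgroup $\langle P \rangle$, and $Q = kP$ with $0 \le k < n$. Pollard’s rho method tries to find a collision, that is, two pairs of integers $(a, b)$, $(a', b')$ such that $(a, b) \neq (a', b')$ and $aP + bQ = a'P + b'Q$. Equivalently, this is $(a - a')P = (b' - b)Q = (b' - b)kP$, from which one deduces $a - a' \equiv (b' - b)k \pmod{n}$. If $\gcd(b' - b, n) = 1$, then
$$k \equiv (a - a')(b' - b)^{-1} \pmod{n}.$$
If the pairs $(a, b)$ and $(a', b')$ are selected randomly in $[0, n-1]^2$, the expected running time is $O(\sqrt{n})$, and the storage of the triples $(a, b, aP + bQ)$ requires $O(\sqrt{n})$ cells, which is infeasible if $n$ is large. Nevertheless, some variants of Pollard’s rho method solve the ECDLP with the same running time, but with much less storage. The following variant is one of them. It proceeds as in Algorithm 3, where the following functions are used for a point $R = aP + bQ$ and a partition $S_1, S_2, S_3$ of $\langle P \rangle$:
$$f(R) = \begin{cases} R + P & \text{if } R \in S_1, \\ 2R & \text{if } R \in S_2, \\ R + Q & \text{if } R \in S_3, \end{cases} \quad g(R, a) = \begin{cases} a + 1 \bmod n & \text{if } R \in S_1, \\ 2a \bmod n & \text{if } R \in S_2, \\ a & \text{if } R \in S_3, \end{cases} \quad h(R, b) = \begin{cases} b & \text{if } R \in S_1, \\ 2b \bmod n & \text{if } R \in S_2, \\ b + 1 \bmod n & \text{if } R \in S_3. \end{cases}$$

Algorithm 3 Pollard’s rho algorithm for the ECDLP
Require: An elliptic curve $E$, a base point $P \in E$, the order $n$ of $P$, a point $Q \in \langle P \rangle$.
Ensure: The integer $k$ such that $Q = kP$.
1: Partition $\langle P \rangle$ in three sets of almost equal size, namely $S_1, S_2, S_3$.
2: Choose two random integers $a_0, b_0 \in [0, n-1]$.
3: Compute $R_0 = a_0 P + b_0 Q$.
4: Compute $R_1 = f(R_0)$, $a_1 = g(R_0, a_0)$, $b_1 = h(R_0, b_0)$.
5: Compute $R_2 = f(R_1)$, $a_2 = g(R_1, a_1)$, $b_2 = h(R_1, b_1)$.
6: Set $i = 1$.
7: while $R_i \neq R_{2i}$ do
8:   Compute $R_{i+1} = f(R_i)$, $a_{i+1} = g(R_i, a_i)$, $b_{i+1} = h(R_i, b_i)$.
9:   Compute $R_{2i+2} = f(f(R_{2i}))$, and $a_{2i+2}$, $b_{2i+2}$ by applying $g$ and $h$ twice in the same way.
10:  Set $i = i + 1$.
11: end while
12: if $\gcd(b_{2i} - b_i, n) = 1$ then
13:   Compute $k = (a_i - a_{2i})(b_{2i} - b_i)^{-1} \bmod n$.
14: else
15:   Go to step 2.
16: end if
17: Return $k$.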
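The collision search can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it uses Floyd cycle detection, a three-way partition based on the residue of the x-coordinate modulo 3, and a toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $(5, 1)$ of prime order 19. The function names and the partition rule are hypothetical choices made for the example.

```python
import math
import random


def ec_add(P, Q, a, p):
    """Add affine points on y^2 = x^3 + a*x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def ec_mul(k, P, a, p):
    """Left-to-right double and add (not constant-time; for illustration only)."""
    R = None
    for bit in bin(k)[2:]:
        R = ec_add(R, R, a, p)
        if bit == "1":
            R = ec_add(R, P, a, p)
    return R


def pollard_rho_ecdlp(P, Q, n, a, p, rng=random):
    """Return k with Q = k*P, where P has order n, by looking for a collision."""

    def walk(R, u, v):
        # Invariant: R = u*P + v*Q. The partition uses x mod 3 (an assumption).
        s = 0 if R is None else R[0] % 3
        if s == 0:
            return ec_add(R, P, a, p), (u + 1) % n, v
        if s == 1:
            return ec_add(R, R, a, p), (2 * u) % n, (2 * v) % n
        return ec_add(R, Q, a, p), u, (v + 1) % n

    while True:
        u0, v0 = rng.randrange(n), rng.randrange(n)
        R0 = ec_add(ec_mul(u0, P, a, p), ec_mul(v0, Q, a, p), a, p)
        slow = walk(R0, u0, v0)            # one step per iteration
        fast = walk(*walk(R0, u0, v0))     # two steps per iteration
        while slow[0] != fast[0]:
            slow = walk(*slow)
            fast = walk(*walk(*fast))
        (_, a1, b1), (_, a2, b2) = slow, fast
        d = (b2 - b1) % n
        if math.gcd(d, n) == 1:            # otherwise restart with new u0, v0
            return (a1 - a2) * pow(d, -1, n) % n


# Toy parameters (assumed for the example): y^2 = x^3 + 2x + 2 over F_17.
A, PRIME, BASE, ORDER = 2, 17, (5, 1), 19
secret = 7
recovered = pollard_rho_ecdlp(BASE, ec_mul(secret, BASE, A, PRIME), ORDER, A, PRIME)
```

Only two triples are kept in memory at any time, which is the storage advantage over storing all visited triples.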
Several variants have been proposed to improve Pollard’s rho method [41,42,43]. Moreover, there exists a parallelized variant of Pollard’s rho method (see [40], Section 4.1.2), which can be run on $M$ processors, with an expected running time of $O(\sqrt{n}/M)$.
5.2. The Pohlig–Hellman Algorithm
The Pohlig–Hellman attack on the discrete logarithm problem was first presented in [44]. It applies optimally when the order $n$ of the base point is divisible only by small prime factors. It reduces the ECDLP in $\langle P \rangle$ to the computation of discrete logarithms in subgroups of prime order.
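Let $n$ be the order of the group $\langle P \rangle$. Suppose that $n = \prod_{i=1}^{r} p_i^{e_i}$. Let $Q = kP$ with $0 \le k < n$. The goal of the Pohlig–Hellman method is to find $k$ using the Chinese Remainder Theorem by solving the system
$$k \equiv k_i \pmod{p_i^{e_i}}, \quad i = 1, \ldots, r,$$
for which the unique solution in $[0, n-1]$ is
$$k = \sum_{i=1}^{r} k_i N_i M_i \bmod n, \quad N_i = \frac{n}{p_i^{e_i}}, \quad M_i = N_i^{-1} \bmod p_i^{e_i}.$$
The values $k_i$, $i = 1, \ldots, r$, are computed recursively. Set $k_i = \sum_{j=0}^{e_i - 1} z_j p_i^j$ with $0 \le z_j < p_i$. Also, set
$$P_0 = \frac{n}{p_i} P, \quad Q_0 = \frac{n}{p_i} Q.$$
Then, since $p_i P_0 = nP = \mathcal{O}$ (the point at infinity), and $k = z_0 + s p_i$ for some integer $s$, $Q_0$ satisfies
$$Q_0 = \frac{n}{p_i} kP = k P_0.$$
Then
$$Q_0 = z_0 P_0 + s\,(p_i P_0) = z_0 P_0.$$
Hence, $z_0$ can be computed by solving the discrete logarithm $Q_0 = z_0 P_0$ in $\langle P_0 \rangle$.
Using $z_0$, we set
$$Q_1 = \frac{n}{p_i^2}\left(Q - z_0 P\right),$$
which satisfies
$$Q_1 = \frac{n}{p_i^2}(k - z_0)P = z_1 P_0.$$
Again, $z_1$ can be computed by solving the discrete logarithm $Q_1 = z_1 P_0$ in $\langle P_0 \rangle$.
This procedure is repeated recursively and leads to the computation of $z_j$ by solving the discrete logarithm $Q_j = z_j P_0$ in $\langle P_0 \rangle$ where
$$Q_j = \frac{n}{p_i^{j+1}}\left(Q - \left(z_0 + z_1 p_i + \cdots + z_{j-1} p_i^{j-1}\right)P\right).$$
The Pohlig–Hellman method can be summarized in Algorithm 4.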
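Algorithm 4 Pohlig–Hellman algorithm for the ECDLP
Require: An elliptic curve $E$, a base point $P \in E$, the order $n$ of $P$, a point $Q \in \langle P \rangle$.
Ensure: The integer $k$ such that $Q = kP$.
1: Factor $n$ as $n = \prod_{i=1}^{r} p_i^{e_i}$.
2: Set $k = 0$.
3: for $i$ from 1 to $r$ do
4:   Set $k_i = 0$.
5:   Compute $P_0 = \frac{n}{p_i} P$.
6:   Compute $Q_0 = \frac{n}{p_i} Q$.
7:   for $j$ from 0 to $e_i - 1$ do
8:     Compute $z$ such that $Q_j = z P_0$.
9:     Compute $k_i = k_i + z p_i^j$.
10:    Compute $Q_{j+1} = \frac{n}{p_i^{j+2}}\left(Q - k_i P\right)$ if $j < e_i - 1$.
11:  end for
12:  Compute $N_i = n / p_i^{e_i}$.
13:  Compute $M_i = N_i^{-1} \bmod p_i^{e_i}$.
14:  Compute $k = k + k_i N_i M_i \bmod n$.
15: end for
16: Return $k$.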
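The reduction to prime-order subgroups followed by a Chinese Remainder Theorem recombination can be sketched in Python as follows. The sketch is illustrative only and rests on several assumptions: a toy curve $y^2 = x^3 + x + 1$ over $\mathbb{F}_{23}$ with base point $(0, 1)$ of order $28 = 2^2 \cdot 7$, trial-division factoring, and exhaustive search for the small discrete logarithms. All function names are hypothetical.

```python
def ec_add(P, Q, a, p):
    """Add affine points on y^2 = x^3 + a*x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def ec_neg(P, p):
    return None if P is None else (P[0], (-P[1]) % p)


def ec_mul(k, P, a, p):
    R = None
    for bit in bin(k)[2:]:
        R = ec_add(R, R, a, p)
        if bit == "1":
            R = ec_add(R, P, a, p)
    return R


def factorize(n):
    """Trial division; adequate only for toy-sized orders."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def small_dlog(G, H, q, a, p):
    """Find z in [0, q) with z*G == H by exhaustive search (q is a small prime)."""
    R = None
    for z in range(q):
        if R == H:
            return z
        R = ec_add(R, G, a, p)
    raise ValueError("H is not in the subgroup generated by G")


def pohlig_hellman(P, Q, n, a, p):
    """Return k in [0, n) with Q = k*P, where P has order n."""
    k = 0
    for q, e in factorize(n).items():
        G = ec_mul(n // q, P, a, p)          # point of prime order q
        k_i = 0                              # k mod q^e, built digit by digit
        for j in range(e):
            T = ec_add(Q, ec_neg(ec_mul(k_i, P, a, p), p), a, p)
            H = ec_mul(n // q ** (j + 1), T, a, p)
            k_i += small_dlog(G, H, q, a, p) * q ** j
        N = n // q ** e
        k = (k + k_i * N * pow(N, -1, q ** e)) % n   # CRT recombination
    return k


# Toy parameters (assumed for the example): y^2 = x^3 + x + 1 over F_23.
A, PRIME, BASE, ORDER = 1, 23, (0, 1), 28
recovered = pohlig_hellman(BASE, ec_mul(17, BASE, A, PRIME), ORDER, A, PRIME)
```

Each inner discrete logarithm lives in a subgroup of order 2 or 7 only, which is why a group order with only small prime factors offers little protection.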
The complexity of the Pohlig–Hellman method is expressed in the form $O\left(\sum_{i=1}^{r} e_i\left(\log n + \sqrt{p_i}\right)\right)$, but for most values of $n$, the complexity is $O(\sqrt{q})$ where $q$ is the largest prime factor of $n$. As a consequence, to maximize the resistance of the ECDLP to the Pohlig–Hellman method, the order $n$ should be divisible by a large prime number.
5.3. The Side-Channel Attacks
To test the security of a cryptosystem, several notions of security are considered, such as provable security and side-channel security. While provable security is more theoretical, side-channel security is devoted to practical implementations of cryptographic systems. Attacks that exploit the implementation are called side-channel attacks. A naive and direct implementation of some public key systems such as RSA, DH, and ECC can leak information about their private keys, which may allow recovery of the entire key. A typical example is the modular exponentiation in RSA and DH, as well as the double and add procedure for scalar multiplication of points on elliptic curves.
In 1996, Kocher [45] presented timing analysis, the first side-channel attack. Since then, various types of side-channel attacks have been proposed for practical use. Some are based on implementation issues such as simple power analysis [45], differential power analysis [46], fault attacks [47], and timing attacks [45].
If the addition of two points $P$ and $Q$ is naively implemented, then it is possible to guess whether it is computed for $P \neq Q$ or $P = Q$. Similarly, if the scalar multiplication $kP$ is simply implemented using the double and add method, then one can guess all the bits of the binary decomposition of $k$. This is feasible by measuring the time taken to perform the computation for each bit. When the bit is 1, one has to compute an addition on the elliptic curve as in Steps 5–7 of Algorithm 5, while no addition is needed when the bit is 0. As a consequence, processing a bit equal to 1 takes longer than processing a bit equal to 0.
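Algorithm 5 Left to right double and add method
Require: An elliptic curve $E$, a point $P \in E$, an integer $k$.
Ensure: The point $Q = kP$.
1: Decompose $k = \sum_{i=0}^{l-1} k_i 2^i$, $k_i \in \{0, 1\}$, $k_{l-1} = 1$.
2: Set $Q = \mathcal{O}$ (the point at infinity).
3: for $i$ from $l - 1$ down to 0 do
4:   Compute $Q = 2Q$.
5:   if $k_i = 1$ then
6:     Compute $Q = Q + P$.
7:   end if
8: end for
9: Return $Q$.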
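The leakage of the double and add method can be illustrated with a short Python sketch that records the sequence of group operations instead of measuring time. The operation trace is a stand-in for what an attacker would observe through timing or power measurements. The toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$, the base point $(5, 1)$, and the function names are assumptions made for the example.

```python
def ec_add(P, Q, a, p):
    """Add affine points on y^2 = x^3 + a*x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def double_and_add_trace(k, P, a, p):
    """Left-to-right double and add, returning k*P and the operation trace.

    'D' marks a doubling and 'A' an addition; an addition occurs only for 1-bits.
    """
    Q, trace = None, []
    for bit in bin(k)[2:]:
        Q = ec_add(Q, Q, a, p)
        trace.append("D")
        if bit == "1":
            Q = ec_add(Q, P, a, p)
            trace.append("A")
    return Q, "".join(trace)


def bits_from_trace(trace):
    """Recover the scalar bits: 'DA' encodes a 1-bit, a lone 'D' encodes a 0-bit."""
    return trace.replace("DA", "1").replace("D", "0")


point, trace = double_and_add_trace(13, (5, 1), 2, 17)
leaked_scalar = int(bits_from_trace(trace), 2)
```

For the scalar 13, written 1101 in binary, the trace is `DADADDA`, and the attacker recovers the scalar from the trace alone without ever seeing it directly.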
Several algorithms for scalar multiplication have been proposed against timing attacks [48]. They make the scalar multiplication regular and constant-time. A typical example is the double and add always method, as presented in Algorithm 6.
A yet more regular and more resistant way to perform the scalar multiplication on elliptic curves is the Montgomery ladder [35]. This algorithm was originally specified for Montgomery’s elliptic curves and was later generalized to any elliptic curve in Weierstrass form, independently by Brier and Joye in [49], and Izu and Takagi in [50].
Another known side-channel attack is the fault attack [47,51]. It consists of injecting a fault during the arithmetic operations and exploiting the output to guess part of or even the whole private key. The basic idea is to inject a fault in the regular computation on the original curve $E$ to force it to be performed on a weaker curve $E'$ where the ECDLP is easy to solve. To avoid fault attacks, several countermeasures have been proposed. The basic countermeasure is to check whether the output is still a point of $E$. Another countermeasure is to use a less sensitive scalar multiplication method, such as Montgomery’s ladder method, as presented in Algorithm 7.
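Algorithm 6 Double and add always method
Require: An elliptic curve $E$, a point $P \in E$, an integer $k$.
Ensure: The point $Q = kP$.
1: Decompose $k = \sum_{i=0}^{l-1} k_i 2^i$, $k_i \in \{0, 1\}$, $k_{l-1} = 1$.
2: Set $Q = \mathcal{O}$ (the point at infinity).
3: for $i$ from $l - 1$ down to 0 do
4:   Compute $Q_0 = 2Q$.
5:   Compute $Q_1 = Q_0 + P$.
6:   if $k_i = 1$ then
7:     Set $Q = Q_1$.
8:   else
9:     Set $Q = Q_0$.
10:  end if
11: end for
12: Return $Q$.

Algorithm 7 Montgomery’s ladder
Require: An elliptic curve $E$, a point $P \in E$, an integer $k$.
Ensure: The point $kP$.
1: Decompose $k = \sum_{i=0}^{l-1} k_i 2^i$, $k_i \in \{0, 1\}$, $k_{l-1} = 1$.
2: Set $R_0 = \mathcal{O}$ (the point at infinity).
3: Set $R_1 = P$.
4: for $i$ from $l - 1$ down to 0 do
5:   Compute $R_{1-k_i} = R_0 + R_1$.
6:   Compute $R_{k_i} = 2R_{k_i}$.
7: end for
8: Return $R_0$.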
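The two regular methods can be sketched in Python as follows. Both perform one doubling and one addition per scalar bit, whatever the value of the bit. This is only a sketch of the control flow: Python branches and big-integer arithmetic are not constant-time, so the code illustrates the structure of the countermeasures and does not provide side-channel protection. The toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ and the function names are assumptions made for the example.

```python
def ec_add(P, Q, a, p):
    """Add affine points on y^2 = x^3 + a*x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def double_and_add_always(k, P, a, p):
    """Compute k*P with a doubling and an addition for every bit of k."""
    Q = None
    for bit in bin(k)[2:]:
        Q0 = ec_add(Q, Q, a, p)       # always double
        Q1 = ec_add(Q0, P, a, p)      # always add, even if the result is discarded
        Q = Q1 if bit == "1" else Q0
    return Q


def montgomery_ladder(k, P, a, p):
    """Compute k*P while keeping the invariant R1 - R0 = P."""
    R0, R1 = None, P
    for bit in bin(k)[2:]:
        if bit == "1":
            R0, R1 = ec_add(R0, R1, a, p), ec_add(R1, R1, a, p)
        else:
            R0, R1 = ec_add(R0, R0, a, p), ec_add(R0, R1, a, p)
    return R0


# Toy parameters (assumed for the example): y^2 = x^3 + 2x + 2 over F_17.
A, PRIME, BASE = 2, 17, (5, 1)
same_result = (double_and_add_always(11, BASE, A, PRIME)
               == montgomery_ladder(11, BASE, A, PRIME))
```

In the ladder, no computed value is ever discarded, which is one reason it is considered less sensitive to fault injection than the double and add always method.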
5.4. Shor’s Algorithm
In 1994, Shor [1,52] presented a quantum algorithm to factor large composite numbers and to solve the discrete logarithm problem in a finite field of prime order. Shor’s algorithm was extended to solve the elliptic curve discrete logarithm problem by Proos and Zalka [53] in 2003. Run on a large-scale quantum computer, it would undermine the security of the most popular public key systems such as RSA, DH, ElGamal, and ECC. If $E$ is an elliptic curve over $\mathbb{F}_p$, then Shor’s algorithm can be used to solve the elliptic curve discrete logarithm problem in a running time that is polynomial in $\log p$ (see [54], Theorem 1.2). A detailed description of Shor’s algorithm for the ECDLP is proposed in [55].
5.5. Other Attacks
Several other attacks have been presented to solve the ECDLP. Some are less efficient than Pollard’s rho method, and some are more efficient for specific types of elliptic curves.
- The baby-step–giant-step algorithm was invented by Shanks [56] in 1971. While its running time is approximately the same as Pollard’s rho method, it requires approximately $O(\sqrt{n})$ space for storing values. The idea behind this method is to choose an integer $m \ge \sqrt{n}$, to compute $mP$, to compute and store all values of $iP$ (the baby steps) and $Q - jmP$ (the giant steps) for $0 \le i, j < m$, and to compare the stored lists. If one match is found, then $aP = Q - bmP$ for some integers $a$ and $b$. This gives $Q = (a + bm)P$, and $k = a + bm \bmod n$.
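- The MOV attack, due to Menezes, Okamoto, and Vanstone [57], is efficient when the elliptic curve is supersingular, that is $\#E(\mathbb{F}_p) = p + 1$. It is based on the Weil pairing that maps two points of order dividing $n$ in $E(\mathbb{F}_{p^k})$ to an element in $\mathbb{F}_{p^k}^*$. The integer $k$ is the embedding degree associated with the elliptic curve $E(\mathbb{F}_p)$. It is the smallest integer $k \ge 1$ such that $n$ divides $p^k - 1$. If $P$, $Q = rP$, $R$ are three given points in $E(\mathbb{F}_{p^k})$ with an unknown $r$, and $e$ is the Weil pairing, then one can compute $u = e(P, R)$, and $v = e(Q, R)$. Hence, $v = e(rP, R) = e(P, R)^r$, that is $v = u^r$. This reduces to the discrete logarithm problem in $\mathbb{F}_{p^k}^*$. For supersingular curves, $k \le 6$ is sufficiently small, and the discrete logarithm problem can be easily solved over $\mathbb{F}_{p^k}$. If the elliptic curve is not supersingular, it is required that $k$ be large enough for the discrete logarithm problem in $\mathbb{F}_{p^k}^*$ to remain infeasible.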
- The elliptic curves such that $\#E(\mathbb{F}_p) = p$ are called anomalous and are weak against the attacks presented in [58,59,60]. In such curves, the ECDLP can be reduced to the discrete logarithm problem in the additive group of $\mathbb{F}_p$, which is easy to solve.
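Among the attacks listed above, the baby-step–giant-step method is the simplest to sketch in code. The following Python example stores the baby steps in a dictionary and then walks through the giant steps until a match is found. It is a minimal illustration on a toy curve $y^2 = x^3 + 2x + 2$ over $\mathbb{F}_{17}$ with base point $(5, 1)$ of order 19; the parameters and function names are assumptions made for the example.

```python
import math


def ec_add(P, Q, a, p):
    """Add affine points on y^2 = x^3 + a*x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def ec_neg(P, p):
    return None if P is None else (P[0], (-P[1]) % p)


def bsgs_ecdlp(P, Q, n, a, p):
    """Return k in [0, n) with Q = k*P, or None if Q is not a multiple of P."""
    m = math.isqrt(n) + 1                 # ensures m * m > n
    baby, R = {}, None
    for i in range(m):                    # baby steps: store i*P -> i
        baby.setdefault(R, i)
        R = ec_add(R, P, a, p)
    minus_mP = ec_neg(R, p)               # after the loop, R == m*P
    G = Q
    for j in range(m):                    # giant steps: Q - j*m*P
        if G in baby:
            return (baby[G] + j * m) % n
        G = ec_add(G, minus_mP, a, p)
    return None


# Toy parameters (assumed for the example): y^2 = x^3 + 2x + 2 over F_17.
A, PRIME, BASE, ORDER = 2, 17, (5, 1), 19
target = None
for _ in range(11):                       # target = 11 * BASE by repeated addition
    target = ec_add(target, BASE, A, PRIME)
recovered = bsgs_ecdlp(BASE, target, ORDER, A, PRIME)
```

The dictionary holds about $\sqrt{n}$ points, which is the storage cost that makes this method impractical for cryptographic sizes.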
5.6. Robust Elliptic Curves for Cryptography
To avoid the attacks described above, it is crucial to choose robust elliptic curves for use in cryptography. We list here a few criteria for this purpose.
- The size of $p$, as well as the size of the order $n$ of the base point, should be large enough to resist the attacks whose running time or storage depends on $n$, such as Pollard’s rho method, Pohlig–Hellman’s method, and the baby-step–giant-step method.
- Both $\#E(\mathbb{F}_p)$ and $n$ should have a dominant large prime factor. This property ensures that Pollard’s rho attack and Pohlig–Hellman’s attack will be ineffective.
- The curve $E$ should not be anomalous, that is, the order $\#E(\mathbb{F}_p)$ should not be equal to $p$. When the curve is anomalous, the ECDLP in $E$ can be reduced to the additive discrete logarithm problem in $\mathbb{F}_p$, which is trivial to solve [58,59,60].
- The curve $E$ should not be supersingular, that is, the order $\#E(\mathbb{F}_p)$ should not be equal to $p + 1$. This requirement follows the work of Menezes, Okamoto, and Vanstone [57], and the work of Frey and Rück [61]. Both works show that, for an elliptic curve $E$ over $\mathbb{F}_p$, the ECDLP can be transferred from $E(\mathbb{F}_p)$ to the Discrete Logarithm Problem (DLP) in the multiplicative group $\mathbb{F}_{p^k}^*$ for some positive integer $k$. If $k$ is small, then the DLP in $\mathbb{F}_{p^k}^*$ can be attacked by a standard method, such as the baby-step–giant-step [56], Pollard’s method [62], Pohlig–Hellman’s method [44], or the index calculus method [63]. To avoid a MOV attack, it is required to check that $n$ does not divide the integers $p^k - 1$ for small values of $k$.
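Some of these criteria are simple arithmetic tests on the curve parameters and can be sketched in Python. The function below checks that the curve is neither anomalous nor of order $p + 1$, and that the order $n$ of the base point does not divide $p^k - 1$ for small $k$. The bound of 20 on $k$ is an assumption chosen for the example, the function names are hypothetical, and the checks do not replace a full curve validation (size requirements and factorization of the order are not covered).

```python
def embedding_degree(n, p, max_k=100):
    """Smallest k in [1, max_k] such that n divides p^k - 1, or None if there is none."""
    t = 1
    for k in range(1, max_k + 1):
        t = (t * p) % n
        if t == 1:
            return k
    return None


def basic_curve_checks(p, curve_order, n, max_k=20):
    """Run three arithmetic checks for a curve over F_p (p prime, p >= 5).

    curve_order is the number of points of the curve, n the order of the
    base point. max_k is an illustrative bound, not a standardized value.
    """
    return {
        "not_anomalous": curve_order != p,
        "not_supersingular": curve_order != p + 1,
        "mov_resistant": embedding_degree(n, p, max_k) is None,
    }


# Toy example: y^2 = x^3 + 2x + 2 over F_17 has 19 points, base point of order 19.
report = basic_curve_checks(p=17, curve_order=19, n=19)
```

On this toy curve the embedding degree is 9, so the last check fails, which is expected for parameters this small.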
We notice that several tools are devoted to selecting safe elliptic curves. A typical example is [64], where the security of almost all popular cryptographic elliptic curves is discussed.