2.1. Homomorphic Encryption
Homomorphic encryption (HE) is a scheme that allows arbitrary computations on encrypted data. It relies on the property that the result of an operation between ciphertexts corresponds to the result of the matching operation between plaintexts, so the server can operate on the client's data without decrypting it. If the following equation holds, then the encryption scheme is called homomorphic over the ∘ operation:

Enc(m₁) ★ Enc(m₂) = Enc(m₁ ∘ m₂), ∀ m₁, m₂ ∈ M,

where Enc is an encryption algorithm, M is a set of plaintexts, and ★ is the corresponding operation between ciphertexts.
In 1978, the concept of privacy homomorphism was originally proposed as a modification to the RSA cryptosystem, describing the idea of preserving computations between ciphertexts, as presented by Rivest et al. [6]. This concept led researchers around the world to numerous attempts to find a homomorphic scheme supporting various sets of operations. However, the main hindrance researchers faced for about 30 years was the limit on the number of operations that could be evaluated. During arithmetic, the noise level mounts, and at some point it becomes so large that it is impossible to proceed with further operations without losing correctness.
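The privacy homomorphism observed by Rivest et al. can be seen directly in textbook RSA, where multiplying two ciphertexts yields an encryption of the product of the plaintexts. The following is a minimal sketch with tiny, deliberately insecure parameters; it illustrates the property only and is not the scheme used in this paper:

```python
# Toy textbook-RSA demo of the multiplicative privacy homomorphism:
# Enc(m1) * Enc(m2) mod n decrypts to (m1 * m2) mod n.
# Tiny illustrative parameters only -- NOT secure.
p, q = 61, 53
n = p * q                      # modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent (Python 3.8+)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

m1, m2 = 7, 9
c_prod = (enc(m1) * enc(m2)) % n        # computed on ciphertexts only
print(dec(c_prod))                      # 63 == (7 * 9) % n
```

Note that textbook RSA is homomorphic only over multiplication; supporting an arbitrary set of operations is exactly the problem that remained open for decades.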
In 2009, Craig Gentry presented the idea of a fully homomorphic encryption (FHE) scheme based on NTRU (N-th degree Truncated polynomial Ring Units), a lattice-based cryptosystem that is considered somewhat homomorphic encryption (SHE). An SHE scheme can support only a limited number of operations, that is, a few multiplications and many additions. A public FHE scheme is composed of a 4-tuple of probabilistic polynomial-time protocols and algorithms, namely (KeyGen, Enc, Eval, Dec):
KeyGen(λ) → (pk, evk, sk): KeyGen is a key generation algorithm that takes a security parameter λ and outputs a public encryption key pk, a public evaluation key evk, and a secret decryption key sk.
Enc(pk, m) → c: The encryption algorithm takes pk and a vector of plaintext messages m ∈ M, and outputs a ciphertext c.
Eval(evk, f, <c₁, ⋯, c_k>) → c_f: The evaluation algorithm takes evk, a k-input function f : Mᵏ → M, and a set of ciphertexts c₁, ⋯, c_k. It then outputs a new ciphertext c_f that is an encryption of f(m₁, ⋯, m_k), where c_i ← Enc(pk, m_i) for i = 1, ⋯, k.
Dec(sk, c) → m or ⊥: The decryption algorithm Dec takes sk and a ciphertext c, and outputs a message m ∈ M if c is the result of Enc(pk, m) and sk matches pk; otherwise, it outputs ⊥.
The key idea proposed in Gentry’s work is bootstrapping, the process of refreshing a ciphertext to keep its noise level low and produce a new ciphertext on which more homomorphic operations can be evaluated. However, the bootstrapping procedure required adding many ciphertexts to the public key and encrypting the individual bits (or coefficients) of the secret key. This resulted in very large public keys, and the plaintext had to be encrypted bit by bit, with the size of the ciphertext growing at each step, making the scheme too expensive in terms of computation.
2.2. Our Novel Approach Based on the TFHE Library
Various HE libraries (e.g., References [7,8,9,10,11]) have been suggested based on Gentry’s scheme [12]. Chillotti et al. released the TFHE (Fast Fully Homomorphic Encryption over the Torus) library [13], which was designed from FHEW (Fastest Homomorphic Encryption in the West). It is based on both the learning with errors (LWE) assumption [14] and its ring variant [15]. It significantly improves the performance of the bootstrapping operation, taking less than 0.1 s, and facilitates an unlimited number of operations through a gate-by-gate bootstrapping procedure. Moreover, the bootstrapping key size was scaled down from 1 GB to 24 MB, maintaining the security level while reducing the overhead caused by noise.
The library supports the homomorphic evaluation of the binary gates (AND, OR, XOR, NOR, NAND, etc.), as well as the negation and the MUX gate, which can be used for various operations. Additionally, there is no restriction on the number of gates or their compositions, making it possible to perform any computations over encrypted data.
2.2.1. Bitwise Representation of Number
We performed homomorphic encryption of plaintext bits; that is, each bit within the plaintext is encrypted with the same key. Therefore, using our notation, given a plaintext m, the encryption stage yields a ciphertext c, which is an array containing the encrypted bits. This differs from the mainstream lattice-based approach, which starts from the input’s integer values. Specifically, integer-based FHE requires an encoding process that rounds plaintexts to convert real-number inputs, and this rounding introduces an error into the outcome.
Our approach is designed to solve this real-number-to-integer conversion problem more accurately. In general, a floating-point number system covers a broader range of inputs; however, it also leads to more complex algorithms under an HE scheme. HE gate operations take a significant amount of time compared to plaintext gate operations, so using floating point is less efficient. On the other hand, the major advantage of a fixed-point representation is that it reduces to simple integer arithmetic with much less logic than floating point, which improves performance by reducing the bootstrapping procedures performed on every encrypted bit. Since reducing the number of gate operations is crucial in FHE, we adopt only a fixed-point number system in this paper.
We assigned a fixed number of bits to the integer part and the fractional part, and 1 bit to the sign, respectively. The encryption of each bit is placed in the same position as in the plaintext; consequently, the values of a plaintext and its corresponding ciphertext are precisely equal. Assigning a longer input guarantees higher accuracy, with the tradeoff that execution time increases dramatically. Typically, HE gate operations take a significant amount of time compared to plaintext gate operations, due to the heavy load of noise accumulated through HE gates. Therefore, realistically, we set the length of the input to 8, 16, and 32 bits for experimentation.
Figure 1 shows an example of an 8-bit fixed-point representation of number 6.75 using our approach.
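The encoding behind Figure 1 can be sketched as follows. The split of 1 sign bit, 4 integer bits, and 3 fractional bits for the 8-bit case is our assumption for illustration (the exact allocation is a design parameter), and the sign is shown as a plain sign bit, omitting the 2's complement handling of negatives used later in the paper:

```python
def to_fixed_bits(x, int_bits=4, frac_bits=3):
    """Encode x as [sign, MSB..LSB] fixed-point bits (assumed bit split)."""
    sign = 1 if x < 0 else 0
    mag = round(abs(x) * (1 << frac_bits))       # scale fraction to integer
    width = int_bits + frac_bits
    bits = [(mag >> (width - 1 - i)) & 1 for i in range(width)]  # MSB first
    return [sign] + bits

print(to_fixed_bits(6.75))  # 6.75 = 0110.110 -> [0, 0, 1, 1, 0, 1, 1, 0]
```

Under the HE scheme, each of these eight bits would then be encrypted individually with the same key, preserving its position.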
2.2.2. Bitwise Operation
We illustrate some of the critical concepts, such as our basic scheme and functions, to aid understanding. However, to fully grasp our approach, interested readers should refer to our previous works [4,5,16,17]. For the sake of the flow, we introduce only the very basics of our approach.
The goal of HE operations is to construct an HE function that yields an encrypted result matching the corresponding plaintext operation. For instance, the result of a plaintext addition should match the decryption of the HE addition of the corresponding ciphertexts. These basic HE operations are built from combinations of HE gate operations. We provide a simple illustration of our method by constructing the HE absolute value operation, to show the difference between the plaintext and ciphertext settings.
Suppose we have a plaintext value a, which is converted to a fixed-point number consisting of 0s and 1s. To derive the absolute value of a in the plaintext setting, it is not hard to see that, using the 2’s complement method, one can easily obtain the absolute value depending on the sign of a. In the encrypted state, however, the sign bit of the ciphertext c (the bit in position r−1, where r is the length of the input) is encrypted and thus not known, so we have to consider both the case where a is positive and the case where it is negative. Consider the case when a is positive; then the sign bit is an encryption of 0, and applying a NOT gate to it yields an encryption of 1. If we perform AND gate operations between this negated sign bit and every bit of c, we obtain c itself when a is positive. If a is negative, the negated sign bit is an encryption of 0, which, in the same manner, yields an encryption of zero after executing the AND gate operations on every bit of c.
For the negative case, we also perform the 2’s complement operation on c and denote the outcome of the operation by c′. Likewise, AND gate operations between every bit of c′ and the sign bit, which is an encryption of 1 in the negative case, provide the result of the negative case. Therefore, by adding the two results of the positive and negative cases, one can obtain an encrypted result of the absolute value operation starting from the encrypted number a. In this way, we constructed various HE operations from HE bootstrapping gates. Now, we provide the basic notation for the gates and operations used in this work.
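To make the construction concrete, here is a plaintext mirror of the gate-level procedure: both the positive and the negative branch are computed with NOT/AND gates and then added, exactly because the encrypted sign bit cannot be inspected. The bit layout (MSB-first 2's complement) is an assumption for illustration:

```python
# Plaintext mirror of the HE absolute-value construction. Bit lists are
# MSB-first two's complement of width r; bits[0] is the sign bit.
def bitwise_abs(bits):
    s = bits[0]                                  # sign bit
    ns = 1 - s                                   # NOT gate on the sign bit
    pos = [ns & b for b in bits]                 # AND: keeps a if positive
    # two's complement: invert every bit, then add 1 (ripple carry, LSB first)
    inv, carry = [], 1
    for b in reversed(bits):
        t = (1 - b) + carry
        inv.append(t & 1)
        carry = t >> 1
    neg = [s & b for b in reversed(inv)]         # AND: keeps -a if negative
    # add the two masked results (exactly one of them is all zeros)
    out, carry = [], 0
    for p, n in zip(reversed(pos), reversed(neg)):
        t = p + n + carry
        out.append(t & 1)
        carry = t >> 1
    return out[::-1]

# -3 in 4-bit two's complement is 1101; |-3| = 3 = 0011
print(bitwise_abs([1, 1, 0, 1]))  # [0, 0, 1, 1]
```

In the encrypted domain, every `&`, `1 - b`, and bit addition above becomes a bootstrapped HE gate, which is why avoiding data-dependent branches is essential.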
Table 1 demonstrates our homomorphic operations and homomorphic functions used in this study.
2.2.3. HE Bitwise Logarithm
We introduced the absolute value function in the previous section to demonstrate that additional measures beyond the plaintext algorithm are needed to obtain the same result in the encrypted domain. We now present another key HE function, the logarithm, which is used in the derivation of the evidence function. The details of this HE function are well explained in Reference [16]. In this paper, we briefly explain the method and show an example to illustrate our approach.
To design the logarithm function in plaintext, we derive log₂(x) in the first stage. Next, we obtain the general form of the logarithm with a different base. We assume x and y to be real numbers s.t. 1 ≤ x < 2 and y = log₂(x), so that 0 ≤ y < 1. Then y can be written as

y = Σ_{i=1}^{∞} b_i 2^{-i},

where the binary representation of y is 0.b₁b₂b₃⋯. Since x = 2^y can be rewritten as x = 2^{b₁2^{-1} + b₂2^{-2} + ⋯}, we obtain the nested form of the equation as follows.
By squaring x, the formula becomes x² = 2^{2y} = 2^{b₁ + 0.b₂b₃⋯}. Since b₁ is either 0 or 1, if x² ≥ 2 then b₁ is equal to 1, and otherwise 0. In the case x² ≥ 2, we divide x² by 2. Following this procedure recursively provides the bits of the fractional part of y.
So far, the above procedure obtains the fractional part of y, that is, the bits of y below 1. If x ≥ 2, we perform the above procedure on the fractional part of y, while we simply count the position index of the most significant bit of x to derive the integer part of y. In conclusion, the summation of the two outcomes is the result of log₂(x).
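The squaring procedure above can be checked in plaintext in a few lines; each iteration produces one fractional bit of y = log₂(x):

```python
# Plaintext version of the squaring-based binary logarithm: for 1 <= x < 2,
# repeated squaring yields the fractional bits of y = log2(x) one at a time.
def log2_bits(x, n_bits=8):
    assert 1 <= x < 2
    bits = []
    for _ in range(n_bits):
        x = x * x            # x^2 = 2^(2y): shifts the bits of y left by one
        if x >= 2:
            bits.append(1)   # leading fractional bit was 1
            x /= 2           # renormalize back into [1, 2)
        else:
            bits.append(0)
    return bits

# log2(1.5) ~= 0.58496..., whose binary expansion begins 0.10010101
print(log2_bits(1.5))  # [1, 0, 0, 1, 0, 1, 0, 1]
```

Only a comparison against 2 and a halving (a shift in fixed point) are needed per bit, which is what makes the method attractive in the encrypted domain.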
This plaintext mechanism suits our scheme in the encrypted domain, since the process of obtaining the fractional bits mainly involves two fast operations: comparison and shift. However, the approach to the problem must differ. Given the problem of computing log₂(x), our initial task is to obtain a normalized x̄ s.t. 1 ≤ x̄ < 2. This is not an easy task compared to the plaintext situation, where we can easily shift the bits of x so that it lies between 1 and 2. Instead, we make a detour using our HE functions to perform the normalization of a ciphertext. This is the first step that differs from the plaintext domain. Algorithm 1 is the process of normalization of the ciphertext.
Algorithm 1 HomNorm
1: for … to … do
2:   Ciphertext arrays … where …, else …
3:   Ciphertext arrays … where …
4:   …
5:   … by …
6: end for
7: … ← Add all the elements of …
8: … ← Subtract … by …
9: … ← HomEqualCompare(…)
10: … ← Bitwise …
11: … ← Add all the elements of …
12: return …
Next, we proceed to obtain the fractional bits of y from the output of Algorithm 1. The problem in the encrypted domain is to decide whether the square of the normalized ciphertext is greater than or equal to 2. We use a homomorphic comparison function on two ciphertexts that returns an encryption of 1 if the former value is larger than the latter. With this ciphertext, we can then decide whether to shift by 1 bit or not. Our method for obtaining the encrypted fractional bits of y is listed in Algorithm 2.
Algorithm 2 HomSquareShift
1: for … to … do
2:   …
3:   …
4:   …
5:   Left shift 1 bit of …
6:   …
7:   …
8:   …
9: end for
10: return …
Through Algorithms 1 and 2, we obtain the fractional bits of y. Since the integer part of y is the position of the most significant bit, it is obtained at line 8 of Algorithm 1. Therefore, the sum of the two is the result of the HE logarithm log₂(x). For the general result log_b(x), we first derive log₂(b) in order to perform log₂(x)/log₂(b), which is equal to log_b(x).
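As a plaintext sanity check of the whole pipeline, the sketch below normalizes x by counting its most significant bit position (the integer part of log₂(x)) and then recovers the fractional bits by squaring; the encrypted-domain versions of these two steps are Algorithms 1 and 2. The change of base log_b(x) = log₂(x)/log₂(b) then follows:

```python
# Plaintext analogue of the two-stage pipeline: MSB counting for the
# integer part of log2(x), repeated squaring for the fractional bits.
def log2_fixed(x, frac_bits=8):
    assert x >= 1
    k = 0
    while x >= 2:            # count the MSB position: integer part of log2(x)
        x /= 2
        k += 1
    frac = 0.0               # now 1 <= x < 2; extract fractional bits
    for i in range(1, frac_bits + 1):
        x = x * x
        if x >= 2:
            frac += 2.0 ** -i
            x /= 2
    return k + frac

print(log2_fixed(6.75))  # ~= 2.7539 (true log2(6.75) ~= 2.7549)
```

The truncation error is bounded by 2 to the power of minus the number of fractional bits, mirroring the accuracy/time tradeoff of the bit-length choice discussed earlier.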
2.2.4. Time Complexity of Designed Homomorphic Operation
The experimental environment was configured as follows. All computations ran on a computer with 32 GB RAM and an Intel Core i7-8700 CPU at 3.2 GHz (Intel, Santa Clara, CA, USA) under Ubuntu 18.04, and we used TFHE library version 1.0.1. We measured the computational speeds of 1-bit basic gate operations 1000 times each in TFHE. All bootstrapping HE gates except the NOT and MUX gates took about 10.8 ms to evaluate. The MUX gate took 20.9 ms, and the NOT gate took 0.000154 ms, which is significantly lower than the other gates; therefore, we ignored the speed of the NOT gate when calculating execution times. We denote the time complexity of all binary gates by a single constant, except for the MUX gate, which is denoted separately.
Table 2 shows performance time for each operation in detail.
As more bits are assigned to the input value, the execution time increases dramatically, because the operations involved, including addition, subtraction, and comparison, grow linearly with the length of the data and interact with one another.
2.3. Model Selection
Most statistical inference approaches for analyzing data aim to make “good” predictions. However, “good” predictions cannot be obtained if a proper model is not chosen in the first place. Worse, no single perfect model is generally suitable for all data. Therefore, a crucial step in data analysis is to consider a set of candidate models and then select the most appropriate one. This is called model selection, one of the most important and essential steps for obtaining stably accurate results in data analysis. Model selection can be divided into two main branches, because the meaning of the ‘model’ is interpreted differently across fields.
A model regarded as an algorithm: it selects the most appropriate procedure among different machine learning approaches, for example, support vector machine (SVM), KNN, logistic regression, and so forth.
A model regarded as a complexity: it selects among different hyperparameters or sets of features for the same machine learning approach, for example, determining the order among polynomial models in linear regression.
In this study, we focus on model selection among various complexities, adapting numerical solutions for regression analysis.
Polynomial regression is a type of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an Mth degree polynomial:

y(x, w) = w₀ + w₁x + w₂x² + ⋯ + w_M x^M. (3)

In Equation (3), w = (w₀, w₁, ⋯, w_M) is the set of polynomial coefficients, determined by fitting the polynomial to the training data so as to minimize the errors. M is the order of the polynomial, so we need to estimate the optimal order to obtain the best prediction results.
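For concreteness, Equation (3) can be evaluated for a given coefficient vector as follows; the coefficients here are arbitrary illustrative values, not fitted ones:

```python
# Evaluate the degree-M polynomial of Equation (3) for coefficients w,
# using Horner's rule: w0 + x*(w1 + x*(w2 + ...)).
def poly_eval(w, x):
    y = 0.0
    for c in reversed(w):
        y = y * x + c
    return y

w = [1.0, -2.0, 0.5]          # w0 + w1*x + w2*x^2
print(poly_eval(w, 2.0))      # 1 - 4 + 2 = -1.0
```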
Generally, model selection criteria are based on an estimator of generalization performance evaluated over the data. Models indexed by M can be evaluated through their estimated errors, which decompose into bias and variance. If the model is too simple to describe the data, there is a high probability of high bias and low variance, called under-fitting. On the contrary, over-fitting occurs when a complex model has low bias and high variance. In machine learning, an over-fitted model may fit the training data perfectly but may not be suitable for new data.
To avoid under-fitting and over-fitting, one must select an appropriate model with optimal complexity. Thus, we need a way to decide among models of different complexity. A considerable number of selection procedures have been proposed in the literature, for example, the AIC (Akaike Information Criterion) method [18], the Cp method [19], the BIC (Bayesian Information Criterion) method [20], the Cross-Validation (CV) method [21], Bayesian evidence methods [22], and so forth. In this study, we developed two model selection algorithms that can work in the encrypted domain: Cross-Validation (CV) and Bayesian model selection.
2.3.1. Cross Validation
CV is one of the most widely used methods for evaluating the predictive performance of candidate models in model selection. CV estimates the expected error and does not require the models to be parametric. It makes full use of the data without leaking information into the training phase. Regarding data splitting, part of the data is used to fit each model being compared, and the rest is used to measure the predictive performance of the models via validation errors. Through this process, the model with the best overall performance is selected.
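As a plaintext illustration of this procedure (our encrypted-domain CV algorithm is not shown here), the following sketch selects a polynomial order by K-fold cross-validation on toy synthetic data, using an ordinary least-squares fit via the normal equations; all names and data are illustrative:

```python
# K-fold CV for polynomial order selection: fit on K-1 folds, score on the
# held-out fold, pick the order with the smallest average validation error.
import random

def fit_poly(xs, ys, M):
    # Solve (A^T A) w = A^T y for the Vandermonde matrix A by elimination.
    n = M + 1
    A = [[x ** j for j in range(n)] for x in xs]
    G = [[sum(A[k][i] * A[k][j] for k in range(len(xs))) for j in range(n)]
         for i in range(n)]
    b = [sum(A[k][i] * ys[k] for k in range(len(xs))) for i in range(n)]
    for i in range(n):                          # Gaussian elimination
        p = max(range(i, n), key=lambda r: abs(G[r][i]))
        G[i], G[p] = G[p], G[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, n):
            f = G[r][i] / G[i][i]
            for c in range(i, n):
                G[r][c] -= f * G[i][c]
            b[r] -= f * b[i]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(G[i][j] * w[j] for j in range(i + 1, n))) / G[i][i]
    return w

def cv_error(xs, ys, M, K=5):
    folds = [list(range(i, len(xs), K)) for i in range(K)]
    err = 0.0
    for hold in folds:
        tr = [i for i in range(len(xs)) if i not in hold]
        w = fit_poly([xs[i] for i in tr], [ys[i] for i in tr], M)
        for i in hold:
            pred = sum(c * xs[i] ** j for j, c in enumerate(w))
            err += (pred - ys[i]) ** 2
    return err / len(xs)

random.seed(0)
xs = [i / 19 for i in range(20)]
ys = [2 * x ** 2 - x + random.gauss(0, 0.05) for x in xs]  # true order 2
best = min(range(1, 6), key=lambda M: cv_error(xs, ys, M))
print(best)
```

The validation error, not the training error, drives the choice, which is what guards against the over-fitting described above.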
2.3.2. Bayesian Model Selection
Bayesian model selection is also a well-known approach for choosing an appropriate model. The Bayesian paradigm offers a principled approach that addresses model choice by considering the posterior probability given a model. The Bayesian view of model comparison involves the consistent application of the sum and product rules of probability, as well as the use of probabilities to represent uncertainty in model comparison [23]. More precisely, suppose that the models being compared can be enumerated and indexed by the set {M_i : i = 1, ⋯, L}, where each model M_i represents a probability distribution over the observed data D. The posterior distribution for a set of model parameters w is

p(w | D, M_i) = p(D | w, M_i) p(w | M_i) / p(D | M_i),

where p(D | w, M_i) is the likelihood and p(w | M_i) represents the prior distribution of the parameters of M_i. The model evidence for model M_i, obtained from the product and sum rules, is

p(D | M_i) = ∫ p(D | w, M_i) p(w | M_i) dw.

It represents the preference shown by the data for the different models, and it is also called the marginal likelihood because it can be viewed as a likelihood function over the space of models in which the parameters have been marginalized out.
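The marginalization in the evidence integral can be illustrated numerically for two toy one-parameter models. We assume, for illustration only, a standard normal prior on w and Gaussian observation noise, and approximate the integral over w on a grid:

```python
# Numerical sketch of the model evidence p(D|M_i) as a marginal likelihood:
# the parameter w is integrated out on a grid for two one-parameter models.
import math

def log_lik(w, model, data, sigma=0.3):
    # Gaussian log-likelihood of the data under parameter w
    return sum(-0.5 * ((y - model(w, x)) / sigma) ** 2
               - math.log(sigma * math.sqrt(2 * math.pi))
               for x, y in data)

def evidence(model, data, w_grid):
    # p(D|M) ~= sum over grid of p(D|w, M) p(w|M) dw, with an assumed
    # standard normal prior p(w|M)
    dw = w_grid[1] - w_grid[0]
    total = 0.0
    for w in w_grid:
        prior = math.exp(-0.5 * w * w) / math.sqrt(2 * math.pi)
        total += math.exp(log_lik(w, model, data)) * prior * dw
    return total

# Quadratic toy data: the evidence should favor the quadratic model
data = [(x / 4, 1.3 * (x / 4) ** 2) for x in range(-4, 5)]
linear = lambda w, x: w * x
quad = lambda w, x: w * x * x
grid = [i / 100 for i in range(-400, 401)]
ev_lin, ev_quad = evidence(linear, data, grid), evidence(quad, data, grid)
print(ev_quad > ev_lin)   # the marginal likelihood prefers the quadratic
```

No separate validation split is needed here: integrating the parameters out automatically penalizes models that fit the data only for narrow parameter ranges.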