1. Introduction
The name internet of things (IoT), coined by the MIT researcher Kevin Ashton [
1], usually refers to smart objects connected through the internet to other sensors, devices and servers, with which they collect and/or share data to improve their functionality. IoT can also be combined with other technologies, for example with cloud computing [
2]. For example, it is possible to create a sustainable smart home that reduces resource consumption, or to develop specific applications in the medical field [
3] such as wearable devices which monitor our physical condition, specific devices used to check patients with chronic illnesses, and so on. Data collected by IoT devices need to be (a) processed to extract information by applying, for example, data mining techniques [
4]; (b) evaluated in order to make decisions by adopting agent-based models [
5,
6], Bayesian decision models [
7], fuzzy logic [
8] and so on; (c) protected from attacks, failures and leaks during communication [
2].
Several issues have to be faced in securing IoT applications. An important example is given by the intrinsic constraints of the devices [
9], which usually have a small amount of memory and cannot perform heavy computations. Such devices are very likely to be deployed in unprotected environments, where an adversary can access them and perform attacks. In particular, she/he can perform an analysis of the controlled binary [
10] or perform differential fault analysis [
11,
12]. Moreover, since these devices are connected, compromising one of them can open the way to botnet attacks [
2,
3]. We can observe that we are exactly in a white-box framework, and white-box cryptography [
13] was first developed to cope with the scenario in which an attacker can physically interact both with the implementation of the cryptographic algorithm in use and with the device on which the encryption/decryption operations are executed. The usually studied scenario, namely black-box, in which the execution can be neither observed nor modified by the attacker, is not always suitable for IoT applications [
14]. Consider, for example, the context of digital rights management, where discovering the key makes it possible to distribute digital content to people who have not paid for it.
The effort of researchers towards white-box cryptographic schemes materialized with [
13,
15] where white-box versions of AES and DES have been implemented. Nevertheless, it is important to remark that these implementations have been attacked via algebraic attacks [
16] (improved by [
17]), [
18,
19,
20]. Moreover, the attack by Jacob et al. [
21] easily breaks Chow’s implementations.
The need for white-box algorithms in practical applications led to the development of dedicated ciphers. Examples of block ciphers developed to be employed in the framework of white-box cryptography are ASASA [
22] and SPACE [
23]. However, these ciphers are not free of drawbacks or weaknesses. In particular, decomposition attacks can affect ASASA’s security while SPACE is heavy from a computational point of view [
24]. An important step forward for white-box cryptography was the development of SPNbox [
24], another block cipher, based on a substitution–permutation network (SPN), that relies on internal block ciphers with the aim of reducing the computation time. In [
9], the problem of intrinsic constraints on computational power and memory of IoT devices in unprotected environments is addressed. The authors refer to smart objects with limited computational power and memory that may contain sensitive data and can be easily lost or stolen. Unlike the AES/DES white-box implementations, the authors do not adapt a well-known cipher to the new framework, but develop a new one, relying on a modification of the Lai-Massey structure. The crucial point is that only the encryption is meant to be done on the constrained IoT device, while the decryption phase is supposed to be done on a computer or server in a black-box scenario. In [
25] the authors refer to embedded distributed devices which collect and securely send information to centralized servers. Subsequently, these servers decrypt and process all the information. As previously mentioned, the collected information may be sensitive and it is possible for an attacker to get control of the whole device. The scheme proposed in [
25] is lightweight and suitable for constrained devices. In particular, such a new design has the following peculiarities:
the employed operations are very simple; they essentially consist of lookup tables and bit operations;
the lookup tables and the structure containing sensitive data are small in memory;
the provided security is medium-level (∼) and protection is ensured for a reasonable amount of time;
it is possible to update the key at small costs.
The scheme is based on a Feistel structure, but it adds two bijections as a defence against attacks. Moreover, to cope with structural cryptanalysis [
25,
26], components of different sizes are used.
This paper extends a previous work entitled “White-box Cryptography: A Time-security Trade-off for the SPNbox Family” [
27], presented by F. Cioschi, N. Fornari and A. Visconti at the 2nd International Conference on Wireless, Intelligent and Distributed Environment for Communication (WIDECOM 2019). In this paper, we (a) introduce the white-box approach in the IoT context, explaining the importance of protecting data in an environment where attackers have full control over the whole system; (b) explain the importance of having a fast black-box implementation of a white-box cipher; (c) summarize our previous idea [
27], explaining how to modify the internal block ciphers of the SPNbox family in order to increase the size of the key space; (d) measure the performance of a black-box implementation (server-side) on 32- and 64-bit architectures and by encrypting/decrypting 10,000 payloads of a lightweight messaging protocol (MQTT), which contain the data sent over the internet.
The remainder of the paper is organized as follows. In
Section 2, block ciphers are introduced. In
Section 3, we present several white-box implementations and related attacks published in literature. In
Section 4 and
Section 5, we summarize two block ciphers’ families, namely, SPACE and SPNbox, which are white-box friendly by design. In
Section 6, we explain the importance of increasing the number of bits of the key used in each round. In
Section 7, the testing activities are presented. Finally,
Section 8 is devoted to discussion and conclusions.
3. The White-Box Approach
The white-box approach aims to avoid key recovery attacks by embedding the cryptographic key into a robust representation of the cipher. Consider a block cipher ϕ. We compute a map ψ such that, given a key k, it holds ψ(x) = ϕ(k, x) for every input x. Even if an attacker knows both ϕ and ψ, it should be very hard for her/him to find out the key.
Example 1. Let ϕ and ψ be defined as follows: If , we can consider ψ as a white-box implementation of ϕ, by representing ψ as a lookup table.
The first white-box AES implementation was proposed by Chow et al. in [
13]. The authors suggest that key extraction can be avoided by a careful use of lookup tables. In particular, given a secret key and a block cipher, it is possible to create a lookup table which maps the plaintext in a corresponding ciphertext. In some cases, this lookup table may be huge and unusable due to its dimension. Therefore, a block cipher
can be represented as a network of smaller lookup tables (see
Figure 1) that have to be read in a particular order [
13]. Unfortunately, in the white-box framework an adversary has full access to these tables, exposing the cipher to possible attacks. Since there is no reason to make an attacker’s life easier, tables can be protected by means of internal encodings [
13]. This means that a bijective map is composed after table i and its inverse before table i+1, so that the two cancel and the ciphertext is left unchanged. However, internal encoding does not protect against code-lifting attacks. Indeed, an attacker may recover the tables of the cipher and understand their concatenation order. By doing so, she/he is able to decrypt messages even without recovering the secret key. Therefore, another protection is required: external encoding. Internal and external encodings are also discussed in [
30], while a different approach, based on polynomial algebra techniques [
31], gave rise to a perturbed white-box implementation of AES [
32], broken by [
33] in 2010.
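The cancellation of internal encodings can be illustrated with a toy example. The following is a minimal sketch, not Chow's actual construction: the two byte-wide tables and the random encoding g are hypothetical stand-ins for the tables of a real white-box network, chosen only to show that composing g after one table and its inverse before the next leaves the overall map unchanged while hiding the original tables.

```python
import random

def random_bijection(rng, size=256):
    """Return a random permutation of range(size) and its inverse, as lists."""
    perm = list(range(size))
    rng.shuffle(perm)
    inv = [0] * size
    for x, y in enumerate(perm):
        inv[y] = x
    return perm, inv

rng = random.Random(2020)  # fixed seed so the sketch is reproducible

# Toy "cipher" on one byte: two key-dependent tables read in sequence.
# (Hypothetical stand-ins for the tables of a real white-box network.)
table_1, _ = random_bijection(rng)
table_2, _ = random_bijection(rng)

def plain_network(x):
    return table_2[table_1[x]]

# Internal encoding: a secret bijection g is composed after table 1 and
# its inverse before table 2, so the two cancel out.
g, g_inv = random_bijection(rng)
encoded_table_1 = [g[table_1[x]] for x in range(256)]
encoded_table_2 = [table_2[g_inv[x]] for x in range(256)]

def encoded_network(x):
    return encoded_table_2[encoded_table_1[x]]

# The encoded tables differ from the originals, yet the output is identical.
same_output = all(plain_network(x) == encoded_network(x) for x in range(256))
```

An adversary who copies both encoded tables can still evaluate the network, which is exactly the code-lifting weakness mentioned above.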
Chow’s work is a milestone for white-box cryptography and its framework has also been used by some subsequent works such as [
34,
35]. However, researchers found attacks also for these new approaches:
White-Box AES Implementation: Chow [
13]; Attack: [
16]; Work Factor:
;
White-Box AES Implementation: Karroumi [
34]; Attack: [
18,
20]; Work Factor:
;
White-Box AES Implementation: Xiao Lai [
35]; Attack: [
18]; Work Factor:
;
White-Box AES Implementation: Xiao Lai [
35] generic linear version; Attack: [
18]; Work Factor:
;
White-Box AES Implementation: Xiao Lai [
35] affine/non-affine version; Attack: [
19]; Work Factor: at least
.
The attacks listed above may require knowledge of the internal data representation, which sometimes entails a significant reverse engineering effort. An improved AES implementation is given in [
36]. This implementation is immune to attacks described in [
16,
18], but not to the one presented in [
37].
The first paper aiming to break all white-box implementations belonging to the framework introduced in [
13] is [
19], but it has the weakness of requiring some additional hypotheses. In contrast, Derbez et al. [
38] break all the implementations in Chow’s framework by solving the affine equivalence problem (see [
39,
40]). Chow’s framework has also been used by [
41] and subsequently attacked by [
42].
A significant advance from the attacker’s point of view became feasible by shifting the focus from the attacks previously described to side channel attacks [
43]. In particular, new approaches to verify the security of a white-box implementation have been proposed in [
44] where Bos et al. present differential fault analysis (DFA) and differential computational analysis (DCA) attacks (further information on fault-injection and differential power analysis attacks can be found in [
45,
46] respectively). In addition, in [
47,
48] the authors explain more formally why DCA is effective against linear and nibble encodings; Rivain and Wang [
43] provide an extensive analysis of the effectiveness of DCA; finally, Biryukov and Udovenko [
49] give a general protection method for white-box implementations against DCA.
Obfuscation techniques or the randomization of the location of the lookup tables can be used to enhance security of white-box algorithms [
50], while [
51] examines how these techniques are successful against both DCA and differential power attacks (DPA). The paper [
52] exploits noncommutative groups to obfuscate operations that would normally be performed on commutative ones, and it is employed in the IoT framework. Finally, an evaluation of software protections for white-box implementations is provided by [
51].
Some improvements to DCA have been developed by [
43,
53]. The first one extends DCA to successfully address implementations using masking and shuffling techniques [
53]. The second one provides a DCA-like collision attack with a good complexity [
43].
Some papers, such as [
54,
55,
56] address the problem of incompressibility or code hardness. The idea is that an attacker in the white-box framework should not be able to rewrite the code of some implementation in order to decrease the code-hardness. In [
54] two incompressible white-box schemes called “WhiteKey” and “WhiteBlock” are introduced and one instance for each scheme is provided (called PuppyCipher and CoureurDesBois respectively), [
55] describes the concept of code-hardness, time-hardness and memory-hardness, while [
56] provides a new incompressible white-box implementation based on the assumption of one-way permutations.
We conclude our extensive analysis of implementations and attacks by citing a white-box signature scheme [
57] and the methods [
58] used to attack the most resistant implementation submitted to the white-box competition called “CHES 2017 CTF Challenge”.
In the sequel, we will analyze in detail two families of white-box ciphers called SPACE (
Section 4) and SPNbox (
Section 5).
4. SPACE: A Block Cipher
SPACE is a block cipher developed by Bogdanov and Isobe in [
23], that is based on a Feistel network. This cipher is designed so that security against key extraction in the white-box context reduces to the well studied problem of key recovery for block ciphers in the standard black-box setting.
SPACE is a generalized Feistel network [
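29]. Given a message $m \in \mathbb{F}_2^n$ and a secret key $k$, it encrypts $m$ to a ciphertext $c \in \mathbb{F}_2^n$. In describing SPACE, three quantities are often employed: $n$, $n_a$ and $n_b$. In particular, in [23] $n = 128$, $n_a \in \{8, 16, 24, 32\}$ and $n_b = n - n_a$.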
We summarize here the encryption procedure:
At each encryption round, the Feistel function takes the first $n_a$ bits of the state as input. Then its output is added (bitwise XOR) to the rest of the state, i.e., to the remaining $n_b$ bits. The first $n_b$ bits of the new state are given by the result of this operation. The last $n_a$ are filled with the bits that were fed to the Feistel function.
Now, let $\pi$ be the projection of Definition 2 and $F^r$ the Feistel function used by SPACE, specified in Definition 4.
Definition 4. Let $E_k$ be a block cipher with $n$-bit blocks and $r$ the round number represented in binary with $n_b$ digits (so we see it as an element of $\mathbb{F}_2^{n_b}$). The Feistel function $F^r : \mathbb{F}_2^{n_a} \to \mathbb{F}_2^{n_b}$ is defined as
$$F^r(x) = \pi\left(E_k(0^{n-n_a} \,\|\, x)\right) \oplus r.$$
We give a specific notation for the round independent part of $F^r$.
Definition 5. The round independent part of the Feistel function is
$$F(x) = \pi\left(E_k(0^{n-n_a} \,\|\, x)\right).$$
Notice that, differently from traditional Feistel networks, SPACE does not use round keys. There is one secret key $k$ used by $E_k$. This secret key cannot be hardcoded, hence $F$ is implemented as a look-up table. The reader might wonder why SPACE is designed on top of another block cipher $E_k$ when $E_k$ could be directly implemented as a look-up table. It turns out that this second option is not viable. If we were to implement $E_k$ as a look-up table we would need $n \cdot 2^n$ bits of space. For $n = 128$ the construction of such a look-up table is practically impossible. Therefore Bogdanov and Isobe propose to truncate the output of $E_k$, computed over a smaller domain (see Figure 2).
Since the first $n - n_a$ zeros are used as padding in order to form an n-bit input to provide to $E_k$, it is completely useless to store them, hence the look-up table implementation needs $2^{n_a} (n - n_a)$ bits. Thus, the size of the tables for different values of $n_a$, written SPACE-($n_a$, R) where R is the suggested number of rounds, is the following:
SPACE-(8,300); Table: 3.84 KB
SPACE-(16,128); Table: 918 KB
SPACE-(24,128); Table: 218 MB
SPACE-(32,128); Table: 51.5 GB
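The sizes listed above can be checked with a few lines of arithmetic. The sketch below assumes that the table has one entry per $n_a$-bit input and that each entry stores $n - n_a$ bits with $n = 128$; these assumptions reproduce the listed figures when decimal units (1 KB = 1000 bytes) are used.

```python
N = 128  # block size in bits (assumption consistent with the sizes listed)

def space_table_bytes(n_a, n=N):
    """Size in bytes of a table with 2^n_a entries of (n - n_a) bits each."""
    return (2 ** n_a) * (n - n_a) // 8

sizes = {n_a: space_table_bytes(n_a) for n_a in (8, 16, 24, 32)}
for n_a, size in sizes.items():
    print(f"n_a = {n_a:2d}: {size:,} bytes")
```

The four results are 3,840 bytes, 917,504 bytes, 218,103,808 bytes and 51,539,607,552 bytes, i.e., about 3.84 KB, 918 KB, 218 MB and 51.5 GB.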
Notice that (1) the AES white-box implementations of Chow et al. [13] and Xiao and Lai [35] have tables of 752 KB and 20.5 MB, respectively; (2) not all $n_a$ values are suitable: for $n_a = 24$ and $n_a = 32$ the size of the table is not suitable for practical use. On the contrary, for $n_a = 16$ the table has roughly the same size as that described in [13].
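To make the round structure concrete, here is a minimal sketch of a SPACE-like target-heavy Feistel network for $n_a = 16$. It rests on several assumptions that are not taken from [23]: SHA-256 is used only as a stand-in for the underlying block cipher, the round counter is XORed into the table output, and the names `build_table`, `encrypt` and `decrypt` are hypothetical. It is meant to show how a single look-up table drives every round and why the construction is invertible, not to be a faithful or secure implementation.

```python
import hashlib

N, N_A = 128, 16
N_B = N - N_A

def build_table(key: bytes):
    """Round-independent part of the Feistel function as a 2^N_A-entry table.
    SHA-256 is only a stand-in for the underlying block cipher."""
    table = []
    for x in range(2 ** N_A):
        digest = hashlib.sha256(key + x.to_bytes(N_A // 8, "big")).digest()
        table.append(int.from_bytes(digest, "big") >> (256 - N_B))  # keep N_B bits
    return table

def encrypt(state, table, rounds):
    for r in range(1, rounds + 1):
        x_a = state >> N_B                    # first N_A bits of the state
        x_b = state & ((1 << N_B) - 1)        # remaining N_B bits
        front = x_b ^ table[x_a] ^ r          # table output, round counter, rest
        state = (front << N_A) | x_a          # last N_A bits are filled with x_a
    return state

def decrypt(state, table, rounds):
    for r in range(rounds, 0, -1):
        x_a = state & ((1 << N_A) - 1)
        front = state >> N_A
        x_b = front ^ table[x_a] ^ r          # undo the XOR of the forward round
        state = (x_a << N_B) | x_b
    return state

table = build_table(b"demo key")
message = 0x00112233445566778899AABBCCDDEEFF
ciphertext = encrypt(message, table, rounds=128)
recovered = decrypt(ciphertext, table, rounds=128)
```

Note that decryption uses the same table as encryption, so the inverse of the underlying cipher is never needed.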
5. The SPNbox Family
The SPACE family of space-hard block ciphers [
23] benefits from the Feistel structure from a security point of view, but this structure prevents parallel execution (see
Section 4). However, as suggested in [
24], using an SPN-type design it is possible to satisfy the requirement of parallelism while maintaining a suitably high level of space hardness. Thus, Bogdanov et al. described the SPNbox family of space-hard block ciphers [
24]. Let us briefly explain their idea.
SPNbox-$n_{in}$ is a substitution-permutation network (SPN) with a block length of n bits, a k-bit secret key, and based on $n_{in}$-bit substitution boxes.
State:
The state of SPNbox-$n_{in}$ is representable as a vector of $t = n / n_{in}$ elements of $n_{in}$ bits each: $X = (X_0, \ldots, X_{t-1})$.
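Key Schedule:
The k-bit master key is expanded into $R_{n_{in}} + 1$ round keys of $n_{in}$ bits, by means of a Key Derivation Function (KDF), e.g., PBKDF2 [59,60,61,62], ARGON2 [63], Scrypt [64], and so on:
$$(k_0, \ldots, k_{R_{n_{in}}}) = \mathrm{KDF}\left(k, \; n_{in} \cdot (R_{n_{in}} + 1)\right).$$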
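Round Transformation:
We encrypt a plaintext $X^0$ and we get a ciphertext $X^R$, by using the following R transformations (e.g., $R = 10$):
$$X^R = \left(\bigcirc_{r=1}^{R} \left(\sigma^r \circ \theta \circ \gamma\right)\right)(X^0).$$
The nonlinear layer $\gamma$ is a substitution layer where t identical bijective $n_{in}$-bit S-boxes depending on the key are applied to the state:
$$\gamma : (X_0, \ldots, X_{t-1}) \mapsto \left(S_{n_{in}}(X_0), \ldots, S_{n_{in}}(X_{t-1})\right).$$
These identical S-boxes are constituted by an internal small block cipher of block length $n_{in}$ bits.
The linear layer $\theta$, a diffusion layer, applies a $t \times t$ MDS matrix $M_{n_{in}}$ to the state:
$$\theta : (X_0, \ldots, X_{t-1}) \mapsto (X_0, \ldots, X_{t-1}) \cdot M_{n_{in}}.$$
The affine layer $\sigma^r$ takes the state and adds round-dependent constants to it:
$$\sigma^r : (X_0, \ldots, X_{t-1}) \mapsto (X_0 \oplus C_0^r, \ldots, X_{t-1} \oplus C_{t-1}^r),$$
with $C_i^r = (r - 1) \cdot t + i + 1$ for $0 \le i \le t - 1$.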
The Underlying Small Block Ciphers:
The identical $n_{in}$-bit S-boxes in the $\gamma$ layer (which depend on the key) are block ciphers. They are based on the round transformation of AES and they are formed by $R_{n_{in}}$ rounds operating on a state $x$ of l bytes, where $l = n_{in}/8$:
$$x^{R_{n_{in}}} = \left(\bigcirc_{i=1}^{R_{n_{in}}} \left(AK_i \circ MC_{n_{in}} \circ SB\right)\right)\left(AK_0(x^0)\right),$$
where SB, MC and AK indicate the AES transformations SubBytes, MixColumns and AddRoundKey, respectively. Notice that (a) the numbers of rounds $R_{n_{in}}$ that [24] suggests are $R_{32}$ = 16, $R_{24}$ = 20, $R_{16}$ = 32 and $R_8$ = 64; (b) different matrices are employed in the $MC_{n_{in}}$ round transformation. More precisely, for $n_{in}$ = 32 we use the $MC_{32}$ matrix of AES, while in the other cases a sub-matrix of $MC_{32}$ is used. If $n_{in}$ = 8, $MC_8$ is the identity map’s matrix. Note that, as for the Feistel function in SPACE, in the white-box setting the small block ciphers $S_{n_{in}}$ are implemented as lookup tables.
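As an illustration of the smallest member of the family, the sketch below builds the 8-bit small block cipher, for which MixColumns is the identity, and tabulates it as a 256-entry lookup table, which is what a white-box implementation would ship. Several details are assumptions made only for this example: PBKDF2-HMAC-SHA256 as the KDF, the salt and iteration count, the all-zero master key, and the helper names `aes_sbox`, `round_keys` and `small_cipher_8`.

```python
import hashlib

def aes_sbox():
    """AES S-box: multiplicative inverse in GF(2^8) followed by the affine map."""
    def gf_mul(a, b):
        p = 0
        for _ in range(8):
            if b & 1:
                p ^= a
            a <<= 1
            if a & 0x100:
                a ^= 0x11B                      # reduce by the AES polynomial
            b >>= 1
        return p

    def rotl8(x, s):
        return ((x << s) | (x >> (8 - s))) & 0xFF

    box = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        box.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                   ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return box

R_8 = 64  # number of rounds suggested in [24] for 8-bit S-boxes

def round_keys(master_key: bytes, rounds=R_8):
    # One byte per round plus one for the initial AddRoundKey.
    # Salt and iteration count are illustrative assumptions.
    return hashlib.pbkdf2_hmac("sha256", master_key, b"spnbox-demo", 1000,
                               dklen=rounds + 1)

def small_cipher_8(x, keys, sbox):
    x ^= keys[0]              # initial AddRoundKey
    for k in keys[1:]:
        x = sbox[x]           # SubBytes
        # MixColumns is the identity for an 8-bit state
        x ^= k                # AddRoundKey
    return x

SBOX = aes_sbox()
KEYS = round_keys(bytes(16))
# White-box form: the key-dependent cipher tabulated over all 256 inputs.
S8_TABLE = [small_cipher_8(x, KEYS, SBOX) for x in range(256)]
```

Since every step is a bijection on bytes, the resulting table is itself a permutation of the 256 byte values.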
6. Issues and Possible Solutions
Although the white-box implementation of the cipher is very important, it may have some limitations due to the key embedded into the device. If several devices have to communicate with a server and such devices do not support Transport Layer Security (TLS) protocol due to insufficient resources, the server needs to manage a number of keys (pre-shared or not) in order to decrypt the messages. In a white-box context this means having a number of different implementations that run on our server and this is not a good idea. Therefore, the server will be provided with a fast black-box implementation of the cipher.
Figure 3 helps us to visualize this idea, where a white-box implementation runs on a number of devices and a fast black-box implementation runs on our server.
In order to design a fast black-box implementation of a white-box cipher, we modify the inner round described in
Section 5, increasing the number of bits of the key used in each round. In particular, we replace the AES ShiftRows transformation, omitted by [
24], with a key-dependent circular bit shift transformation (see
Figure 4 and
Figure 5).
If we are shifting eight bits of the state, i.e., $n_{in}$ = 8, three bits are required to execute the circular shift. Thus, we use 11 bits of the key in each round i: eight of them for the AK transformation and three for the BitShift transformation. If the state is doubled, tripled, or quadrupled, i.e., $n_{in}$ = 16, 24, 32, the bits of the key used are 22, 33 and 44, respectively.
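A key-dependent circular bit shift of this kind can be sketched as follows. The example assumes that each byte of the state is rotated independently by an amount given by three key bits, and that an 11-bit slice of the round key is laid out as eight AddRoundKey bits followed by three shift bits; both the layout and the helper names are hypothetical and are not taken from the figures of the paper.

```python
def rotl8(byte, shift):
    """Rotate an 8-bit value left by shift positions (shift taken modulo 8)."""
    shift &= 7
    return ((byte << shift) | (byte >> (8 - shift))) & 0xFF

def rotr8(byte, shift):
    """Inverse of rotl8, used in the decryption direction."""
    shift &= 7
    return ((byte >> shift) | (byte << (8 - shift))) & 0xFF

def split_round_key(round_key, state_bytes):
    """Split a round key of 11 * state_bytes bits into AK bytes and shift amounts.
    The bit layout is an assumption made for this sketch."""
    ak_bytes, shifts = [], []
    for _ in range(state_bytes):
        ak_bytes.append(round_key & 0xFF)   # eight bits for AddRoundKey
        round_key >>= 8
        shifts.append(round_key & 0x7)      # three bits for BitShift
        round_key >>= 3
    return ak_bytes, shifts

def bit_shift(state, shifts):
    return [rotl8(b, s) for b, s in zip(state, shifts)]

def bit_shift_inverse(state, shifts):
    return [rotr8(b, s) for b, s in zip(state, shifts)]

ak, sh = split_round_key(0b101_11110000_011_00001111, state_bytes=2)
shifted = bit_shift([0x81, 0x01], sh)
restored = bit_shift_inverse(shifted, sh)
```

Because a rotation is a bijection for every shift amount, the transformation adds key material to each round without affecting invertibility.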
Notice that the implementation of [24] employs the AES-NI instructions, while the idea described in this paper does not. In the encryption phase ($n_{in}$ = 32, 24, 16), the matrices involved in the computation of the MixColumns transformation ($MC_{32}$, $MC_{24}$ and $MC_{16}$ for short) are sub-matrices of that used in AES ($MC_{32}$). On the contrary, in the decryption phase, we need to invert $MC_{24}$ and $MC_{16}$. Since their inverse matrices are not sub-matrices of $MC_{32}^{-1}$ and the decryption instruction of AES-NI is based only on $MC_{32}^{-1}$, for $n_{in}$ = 24, 16 we cannot use the AES-NI instructions. In any case, in the IoT context, the impossibility of using AES-NI instructions is not a problem in itself, because not all IoT devices support this instruction set.
7. Testing Activities
The testing activity reported is twofold. In the first part, we measure the performance of the internal nonlinear layer $\gamma$ (see Algorithm 1); the external part (the linear layer $\theta$ and the affine layer $\sigma$) is exactly the same as in [24], so it would be pointless to evaluate it.
Algorithm 1: Layer $\gamma$ with BitShift transformation.
We run our code on laptops with different hardware configurations. More precisely, our laptops are equipped with
Intel® Core™ i3-330M, 2.13 GHz processor with 3 MB SmartCache, 8 GB RAM and Ubuntu 18.04.1 LTS 64-bit. The source code has been compiled with GCC 7.3.0 with -O3 optimization enabled (see Table 1);
Intel® Core™ i3-350M, 2.26 GHz processor with 3 MB SmartCache, 8 GB RAM and Ubuntu 18.04.2 LTS 64-bit. The source code has been compiled with GCC 7.4.0 with -O3 optimization enabled (see Table 2);
Intel® Core™ i7-2860QM, 2.50/3.60 GHz processor with 8 MB SmartCache, 16 GB RAM and Kubuntu 18.10 64-bit. The source code has been compiled with GCC 7.3.0 with -O3 optimization enabled (see Table 3);
Intel® Core™ i7-5500U, 2.40/3.00 GHz processor with 4 MB Cache, 8 GB RAM and Ubuntu 18.04.2 LTS 64-bit. The source code has been compiled with GCC 7.4.0 with -O3 optimization enabled (see Table 4);
Intel® Core™ i7-8550U CPU, 1.80/4.00 GHz processor with 8 MB SmartCache, 32 GB RAM and Ubuntu 18.04.2 LTS 64-bit. The source code has been compiled with GCC 7.4.0 with -O3 optimization enabled (see Table 5);
Intel® Core™ i3-350M, 2.26 GHz processor with 3 MB SmartCache, 4 GB RAM and Debian GNU/Linux 9 32-bit. The source code has been compiled with GCC 6.3.0 with -O3 optimization enabled (see Table 6).
In the second part, as explained in
Section 6, we examine the cipher in the IoT context, where black-box and white-box implementations are involved.
7.1. 32/64-Bit Architectures
We compared the performance of the internal layer $\gamma$ (yellow rectangles of Figure 5) with and without the BitShift transformation (green rectangles of Figure 5) for different $n_{in}$ sizes. We exclude the operations involved in the $\theta$ and $\sigma$ layers.
Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6 show the time required to encrypt/decrypt one million different plaintexts (fixed size of 128 bits) using the same key (randomly chosen). Notice that in addition to the key bits needed for the initial AddRoundKey $AK_0$, SPNbox layer $\gamma$ uses 512 key bits, i.e., 512 bits = 16 rounds × 32 bits ($n_{in}$ = 32), or 512 bits = 32 rounds × 16 bits ($n_{in}$ = 16), or 512 bits = 64 rounds × 8 bits ($n_{in}$ = 8). Therefore, we set to 512 the minimum amount of key bits to be used in our solution. In particular, we will execute: 12 rounds ($R_{32}$ = 12), using 528 key bits ($n_{in}$ = 32); 24 rounds ($R_{16}$ = 24), using 528 key bits ($n_{in}$ = 16); and finally 47 rounds ($R_8$ = 47), using 517 key bits ($n_{in}$ = 8).
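The round counts above follow from a short calculation, sketched here under the assumption that each round consumes 11 key bits per state byte (eight for AddRoundKey and three for BitShift) and that the number of rounds is the smallest one reaching at least 512 key bits.

```python
import math

MIN_KEY_BITS = 512  # key bits used by the original layer, initial AddRoundKey excluded

def rounds_and_key_bits(n_in):
    """Smallest round count whose total key material reaches MIN_KEY_BITS."""
    bits_per_round = 11 * (n_in // 8)       # 8 AK bits + 3 BitShift bits per byte
    rounds = math.ceil(MIN_KEY_BITS / bits_per_round)
    return rounds, rounds * bits_per_round

results = {n_in: rounds_and_key_bits(n_in) for n_in in (32, 16, 8)}
for n_in, (rounds, bits) in results.items():
    print(f"n_in = {n_in:2d}: {rounds} rounds, {bits} key bits")
```

This gives 12 rounds and 528 bits for 32-bit S-boxes, 24 rounds and 528 bits for 16-bit S-boxes, and 47 rounds and 517 bits for 8-bit S-boxes, compared with 16, 32 and 64 rounds in the original design.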
Our testing activities show that implementations with BitShift are generally faster than those without it. In particular, several cases show that the improvement in the execution time exceeds 20%. Only in Table 5, for $n_{in}$ = 8 (both encryption and decryption), do we find a different result.
7.2. IoT Environment
The testing activity has been performed using MQTT [
65], a lightweight communication protocol designed for small sensors and mobile devices in low bandwidth environments. By default data are sent in clear text over the internet, thus we encrypt data contained in the payload. We measure the performance of layer $\gamma$ as described in Algorithm 2. More precisely, we compare the performance with and without the BitShift transformation for different $n_{in}$ (sizes of 32, 16, and 8 bits), encrypting one million different plaintexts (sizes of 16, 64, 128, and 1024 bytes) using the same key. Then, we send one hundred MQTT messages, each of which contains 10,000 encrypted payloads. Finally, adopting the same approach, the server collects and decrypts the same number of MQTT messages with encrypted payloads.
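The batching step can be sketched as follows. This is only an illustration of the bookkeeping: how the encrypted payloads are actually packed into an MQTT message is not specified in the text, so plain concatenation and the helper name `batch_payloads` are assumptions, and no MQTT client is involved.

```python
def batch_payloads(encrypted_payloads, per_message):
    """Concatenate consecutive encrypted payloads into message bodies."""
    return [
        b"".join(encrypted_payloads[i:i + per_message])
        for i in range(0, len(encrypted_payloads), per_message)
    ]

MESSAGES, PER_MESSAGE = 100, 10_000
total_payloads = MESSAGES * PER_MESSAGE      # one million payloads overall
message_bytes_16 = PER_MESSAGE * 16          # body size with 16-byte payloads
message_bytes_1024 = PER_MESSAGE * 1024      # body size with 1024-byte payloads

# Small demonstration: 30 dummy 16-byte payloads, 10 per message.
demo = batch_payloads([bytes([i]) * 16 for i in range(30)], per_message=10)
```

With 1024-byte payloads each message body is about 10 MB, which is within the maximum message size allowed by MQTT 3.1.1.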
Algorithm 2: MQTT: testing activity executed for each payload (16, 64, 128, and 1024 bytes).
Our testing activity has been executed on a machine equipped with an Intel® Core™ i7-6500U CPU @ 2.50 GHz × 4 processor, with 12 GB SDRAM DDR4-2133, Intel® HD Graphics 520 (Skylake GT2) GPU and operating system Ubuntu™ 18.04.2 LTS. We used Eclipse Mosquitto™ [
66] version 1.4.15, which implements the MQTT protocol version 3.1.1. The source code has been compiled with GCC 7.4.0, “-O3” optimization enabled.
Table 7 summarizes the results obtained.
In particular, for the encryption phase, we got the highest gain (23.680%) in the case of a 128-byte payload and , while the highest loss () in the case of a 64-byte payload and . For the decryption phase, the highest gain () is obtained with a 16-byte payload and , and the highest loss () with a 128-byte payload and . Notice that the case $n_{in}$ = 8 turned out to be the worst one.
8. Conclusions
In the era of the internet of things, the involved devices are usually lightweight, so they can neither perform heavy computations nor store a huge amount of data. In addition, these data might be sensitive (energy consumption, medical records, and so on) and could be sent in an unprotected environment. In a white-box scenario, an attacker could easily read these data because she/he has full access to the whole execution platform; white-box cryptography can be used to secure data in this specific context.
Considering the effectiveness of side-channel attacks, new ciphers have been designed with the white-box attack model in mind. In this paper, we focused on the SPNbox family [
24], suggesting how to increase the number of key bits used in each round and showing that this improvement affects the performance of the cipher. The introduction of a key-dependent circular bit shift transformation helped us to increase the keyspace and to reduce the number of rounds of the cipher, reducing the execution time too.
We described and analyzed the performance of the modified cipher in the IoT context, where both white-box and black-box implementations may be required. In particular, we measured its performance (a) on 32/64-bit architectures and (b) encrypting the payload of an IoT messaging protocol. Our testing activities have been executed on consumer laptops. The results obtained encrypting and decrypting one million different 128-bit plaintexts on 32/64-bit architectures showed that the execution time for layer $\gamma$ is reduced by up to 22%, while the highest loss is about 8%.
Moreover, the testing activities performed with the lightweight protocol MQTT showed a highest gain of about 23% and 22% (encryption and decryption phase, respectively) and a highest loss of about 9% and 5%. In all our testing activities the case $n_{in}$ = 8 turns out to be the worst one.
Possible future work includes (a) understanding in detail why the current implementation fails for $n_{in}$ = 8 and (b) implementing a communication protocol based on Transport Layer Security pre-shared key ciphersuites (TLS-PSK) in order to compare the performance of white-box implementations with those of lightweight ciphers.