2.2.5. SHA-3

SHA-3 [34] is the newest hash standard issued by NIST. Unlike previous SHA algorithms, it is based on the *sponge construction* [35] instead of the Merkle–Damgård structure [36]. SHA-3 is in fact a slightly modified *Keccak* algorithm [37], the winner of the NIST contest. Like SHA-2, SHA-3 can produce hashes of four lengths: 224, 256, 384, and 512 bits, depending on the configuration of the underlying sponge construction.

Keccak has an internal state, a *b*-bit string *S*, which can also be represented as a three-dimensional array (named *A*, Figure 5) with the mapping given in Equation (5). For SHA-3, *b* = 1600, and two helper variables are derived from this value: *w* = *b*/25 = 64 and *l* = log₂(*w*) = 6.

$$A[x, y, z] = S[w(5y + x) + z] \tag{5}$$
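The mapping in Equation (5) can be illustrated with a minimal Python sketch. The function names `string_to_array` and `array_to_string` are hypothetical helpers introduced only for this illustration, and the state is modeled as a plain list of bits rather than any optimized representation.

```python
B = 1600                  # state width b in bits
W = B // 25               # lane length w = 64
L = W.bit_length() - 1    # l = log2(w) = 6

def string_to_array(S):
    """Map a flat list S of b bits to A[x][y][z] following Equation (5)."""
    return [[[S[W * (5 * y + x) + z] for z in range(W)]
             for y in range(5)]
            for x in range(5)]

def array_to_string(A):
    """Inverse mapping: flatten A[x][y][z] back into the b-bit list S."""
    S = [0] * B
    for x in range(5):
        for y in range(5):
            for z in range(W):
                S[W * (5 * y + x) + z] = A[x][y][z]
    return S
```

With this layout, the *w* bits that share the same (*x*, *y*) coordinates are stored contiguously in *S*, which is why implementations commonly handle each such group as a single 64-bit word.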

**Figure 5.** SHA-3 state as three-dimensional array *A*.

In Figure 5:


An SHA-3 round consists of five step mappings denoted *θ*, *ρ*, *π*, *χ*, and *ι* (Equation (6)). Each of these mappings takes the state array *A* as an input and returns an updated state array *A'*. The *ι* mapping also takes the round index *i_r* as an argument.

$$\mathrm{Rnd}(A, i_r) = \iota(\chi(\pi(\rho(\theta(A)))), i_r) \tag{6}$$

A detailed explanation of every step mapping can be found in [34], and the descriptions below will give a brief idea of how each of these works.

The effect of *θ* is to XOR (⊕) each bit in the state with the parities of two columns in the array. The *ρ* operation shifts the *z* coordinate of every bit in each *lane* by an offset (modulo the lane size) that depends on the fixed *x* and *y* coordinates of that lane. The *π* operation rearranges the positions of the lanes in every state array slice. In the *χ* operation, each bit of the state array is XORed (⊕) with a non-linear function of two other bits in its row. The effect of the *ι* operation is to modify some of the bits in *Lane(0,0)* (the exact center of the state array slice) in a way that depends on the round index *i_r*. *Lane(0,0)* is XORed (⊕) with a *w*-bit string in which most of the bits are "0", but a selected few are the result of the *rc(x)* transformation, which depends on the round index *i_r*.
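The step mappings described above can be sketched in Python as follows. This is an illustrative software model, not the implementation used in the probe. It assumes that each lane is held as a 64-bit integer `A[x][y]` whose bit *z* has weight 2^*z*, that *ρ* and *π* are merged into a single helper (`rho_pi`, a hypothetical name), and that the round constants are generated with a compact 8-bit LFSR formulation of *rc* instead of a lookup table. The 24-round loop corresponds to the *f* transformation of the sponge.

```python
MASK64 = (1 << 64) - 1

def rotl64(v, n):
    """Rotate a 64-bit lane left by n positions (modulo the lane size)."""
    n %= 64
    return ((v << n) | (v >> (64 - n))) & MASK64

def theta(A):
    # C[x] holds the parity of column x; D[x] combines two neighbouring columns.
    C = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
    D = [C[(x - 1) % 5] ^ rotl64(C[(x + 1) % 5], 1) for x in range(5)]
    return [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]

def rho_pi(A):
    # rho rotates each lane by a fixed offset; pi moves it to a new (x, y).
    out = [[0] * 5 for _ in range(5)]
    out[0][0] = A[0][0]                      # Lane(0,0) is neither rotated nor moved
    x, y = 1, 0
    for t in range(24):
        nx, ny = y, (2 * x + 3 * y) % 5
        out[nx][ny] = rotl64(A[x][y], (t + 1) * (t + 2) // 2)
        x, y = nx, ny
    return out

def chi(A):
    # Each lane is XORed with a non-linear function of two other lanes in its row.
    return [[A[x][y] ^ (~A[(x + 1) % 5][y] & A[(x + 2) % 5][y])
             for y in range(5)] for x in range(5)]

def round_constants():
    """Generate the 24 round constants; only bits 2^j - 1 (j = 0..6) can be set."""
    constants, R = [], 1
    for _ in range(24):
        rc = 0
        for j in range(7):
            R = ((R << 1) ^ ((R >> 7) * 0x71)) % 256
            if R & 2:
                rc ^= 1 << ((1 << j) - 1)
        constants.append(rc)
    return constants

def keccak_round(A, rc):
    """One round: iota(chi(pi(rho(theta(A)))), i_r), with rc the constant for i_r."""
    A = chi(rho_pi(theta(A)))
    A[0][0] ^= rc                            # iota modifies Lane(0,0) only
    return A

def keccak_f1600(A):
    """Apply all 24 rounds to a 5x5 array of 64-bit lanes."""
    for rc in round_constants():
        A = keccak_round(A, rc)
    return A
```

Working on whole lanes rather than single bits is a common software choice because the five mappings reduce to XOR, AND, NOT, and rotations of 64-bit words.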

Before the message is fed into the sponge construction, a two-bit suffix "01" is appended to its end. It supports *domain separation* and allows us to distinguish the SHA-3 hash function from other algorithms. The message is then padded so that its length is a multiple of the *rate* parameter (*r*), which essentially is the SHA-3 block width. SHA-3 utilizes the *pad10\*1* padding scheme, which generates a bit string that starts and ends with "1" and is filled in between with the appropriate number of "0" bits (hence the asterisk, which in regular expression notation indicates *zero* or more).

Figure 6 presents the principle of operation of the SHA-3 sponge construction. At the beginning, the SHA-3 state is initialized with a 1600-bit (*b* = 1600) string of zeros. In the phase called *absorption*, the padded message is divided into a series of *r*-bit blocks. Each block is XORed (⊕) into the first *r* bits of the state vector, and then the *f* transformation, which consists of 24 SHA-3 rounds, is applied to the state. This process is repeated until the whole message is absorbed. In the second stage, the actual hash is *squeezed* from the sponge. For all SHA-3 hash lengths, the hash can be obtained without applying the *f* transformation again: the required number of bits is taken directly from the state vector, as *r* is always greater than the hash length (Table 2). Variable *c* is the *capacity* of the sponge, and for SHA-3 it is double the hash length *d* (*c* = 2*d*). As variables *r* and *c* satisfy the relation *r* + *c* = *b*, the selection of capacity determines the block width of the SHA-3 algorithm.
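The relations *c* = 2*d* and *r* + *c* = *b*, together with the absorb and squeeze phases, can be sketched as follows. The names `sponge_parameters` and `sponge` are hypothetical, the state is modeled as a list of bits, and `f` stands for any callable that maps a *b*-bit list to a *b*-bit list; the sketch does not include the permutation itself and covers only the single-squeeze case that applies to SHA-3.

```python
B = 1600  # state width b in bits

def sponge_parameters(d):
    """Return (capacity c, rate r) in bits for a d-bit SHA-3 hash."""
    c = 2 * d
    return c, B - c

def sponge(f, r, padded_bits, d):
    """Absorb an already padded bit list in r-bit blocks, then squeeze d bits.

    f is a stand-in for the 24-round transformation (assumed, not defined here).
    """
    assert len(padded_bits) % r == 0, "input must be padded to a multiple of r"
    assert d <= r, "a single squeeze is only valid when d does not exceed r"
    state = [0] * B                          # state starts as all zeros
    for i in range(0, len(padded_bits), r):
        block = padded_bits[i:i + r]
        for k in range(r):                   # XOR the block into the first r bits
            state[k] ^= block[k]
        state = f(state)                     # apply f after every absorbed block
    return state[:d]                         # hash is read directly from the state

for d in (224, 256, 384, 512):
    c, r = sponge_parameters(d)
    print(f"SHA3-{d}: c = {c}, r = {r}")
```

The loop yields the pairs (*c*, *r*) = (448, 1152), (512, 1088), (768, 832), and (1024, 576) for hash lengths of 224, 256, 384, and 512 bits, respectively, so *r* exceeds the hash length in every case.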

**Figure 6.** Sponge construction, which is the basis of SHA-3.

**Table 2.** Capacity *c* and rate *r* of SHA-3 algorithms in relation to hash length.


In the network probe, SHA-3 is applied to a 104-bit string that consists of 32-bit IP source and destination addresses, 8-bit IP protocol information, and 16-bit source and destination ports for TCP/UDP. The SHA-3 digest is trimmed to the 32 most significant bits, which are considered the flow hash.
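A software model of this flow hash might look like the sketch below, which uses the `hashlib`, `ipaddress`, and `struct` modules of the Python standard library. Several details are assumptions made for illustration only: the SHA3-256 variant, the field order (source address, destination address, protocol, source port, destination port), network byte order, and the interpretation of the 32 most significant bits as the first four bytes of the digest. The function name `flow_hash` is likewise hypothetical.

```python
import hashlib
import ipaddress
import struct

def flow_hash(src_ip, dst_ip, protocol, src_port, dst_port):
    """Return a 32-bit flow hash of the 104-bit (13-byte) flow key."""
    key = struct.pack(
        "!4s4sBHH",                               # assumed field order, big-endian
        ipaddress.IPv4Address(src_ip).packed,     # 32-bit source address
        ipaddress.IPv4Address(dst_ip).packed,     # 32-bit destination address
        protocol,                                 # 8-bit IP protocol number
        src_port,                                 # 16-bit source port
        dst_port,                                 # 16-bit destination port
    )
    assert len(key) * 8 == 104
    digest = hashlib.sha3_256(key).digest()       # SHA3-256 is an assumption
    return int.from_bytes(digest[:4], "big")      # keep the leading 32 bits

print(hex(flow_hash("10.0.0.1", "10.0.0.2", 6, 1234, 80)))
```

Because the 104-bit key is shorter than the rate of every SHA-3 variant, it fits into a single padded block, and the digest is obtained after one application of the *f* transformation.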

#### *2.3. Implementation and Verification*
