2.2.4. SHA-1

SHA-1 is a cryptographic hash function created in 1995 and described in [28,29]. It is now considered deprecated because it is vulnerable to a variety of attacks. In 2015, a group of researchers found a *freestart collision*, in which the attackers choose the SHA-1 initialization vector themselves [30], and soon afterwards the full SHA-1 algorithm was also broken [31–33].

An organized crime syndicate with a budget of tens of thousands of dollars can create a SHA-1 collision in about two months and use it, for instance, to forge an SSL certificate. For this reason, major vendors such as Microsoft, Google, and Mozilla have abandoned the SHA-1 algorithm; however, it may still be useful in real-time applications such as network monitoring.

The SHA-1 function produces a 160-bit hash. It can hash messages of up to 2<sup>64</sup> − 1 bits, which are divided into 512-bit blocks that are processed one by one.

The first step of the algorithm is *padding*, because the length of the processed message must be a multiple of 512 bits. The padding bit string consists of exactly one "1" bit, an appropriate number of "0" bits, and the original message length encoded in 64 bits (hence the message length limit). The number of "0" bits is chosen so that, when the padding bit string is appended to the message, the total length is a multiple of 512 bits. The intermediate value of the hash is stored in five 32-bit variables *H*, initialized as in Listing 1.
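The padding step can be illustrated with a minimal Python sketch. It assumes a byte-aligned message (the standard also allows arbitrary bit lengths), and the function name `sha1_pad` is hypothetical, chosen only for this example.

```python
import struct


def sha1_pad(message: bytes) -> bytes:
    """Return the message followed by SHA-1 padding (byte-aligned input only)."""
    bit_length = len(message) * 8
    if bit_length >= 1 << 64:
        raise ValueError("message too long for the 64-bit length field")
    # 0x80 is the single "1" bit followed by seven "0" bits.
    padded = message + b"\x80"
    # Add "0" bytes until exactly 8 bytes remain free in the last 64-byte block.
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the original length in bits as a 64-bit big-endian integer.
    return padded + struct.pack(">Q", bit_length)
```

Note that a message whose length already leaves fewer than 65 free bits in its last block (for example, 56 bytes) is extended by a whole additional block.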

**Listing 1.** Initial values of *H* variables in SHA-1 algorithm.

H\_0 (0) = 0x67452301
H\_1 (0) = 0xEFCDAB89
H\_2 (0) = 0x98BADCFE
H\_3 (0) = 0x10325476
H\_4 (0) = 0xC3D2E1F0

Every block of the message is processed through 80 rounds according to the scheme in Figure 4.

**Figure 4.** SHA-1 algorithm round scheme.

Variables *A* to *E* are assigned the values of the corresponding *H* registers from the previous block, or *H(0)* for the first block. The *W* array is then generated: the first 16 words are the 32-bit chunks of the processed block, and the subsequent words are calculated with Equation (2).

$$W(i) = \operatorname{ROTL}^{1}\big(W(i-3) \oplus W(i-8) \oplus W(i-14) \oplus W(i-16)\big) \quad \text{for } 16 \le i \le 79 \tag{2}$$

Here, ROTL<sup>1</sup> denotes a left rotation (circular shift) of a 32-bit word by one bit.
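As an illustration of Equation (2), the following Python sketch expands one 512-bit block into the 80-word *W* array. It follows the standard SHA-1 definition, in which each new word is rotated left by one bit and the block is read as big-endian 32-bit words; the names `rotl32` and `message_schedule` are assumptions made for this example.

```python
import struct


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def message_schedule(block: bytes) -> list:
    """Expand one 64-byte block into the 80 words W(0)..W(79)."""
    if len(block) != 64:
        raise ValueError("a SHA-1 block is exactly 64 bytes")
    # The first 16 words are the big-endian 32-bit chunks of the block.
    w = list(struct.unpack(">16I", block))
    # The remaining 64 words are derived from earlier ones.
    for i in range(16, 80):
        w.append(rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    return w
```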

Function *F* and the value of variable *K* depend on the current round number as in Equations (3) and (4).

$$F(i) = \begin{cases} (B \,\&\, C) \mid ((\sim B) \,\&\, D) & \text{for } 0 \le i \le 19\\ B \oplus C \oplus D & \text{for } 20 \le i \le 39\\ (B \,\&\, C) \mid (B \,\&\, D) \mid (C \,\&\, D) & \text{for } 40 \le i \le 59\\ B \oplus C \oplus D & \text{for } 60 \le i \le 79 \end{cases} \tag{3}$$

$$K(i) = \begin{cases} \text{0x5A827999} & \text{for } 0 \le i \le 19\\ \text{0x6ED9EBA1} & \text{for } 20 \le i \le 39\\ \text{0x8F1BBCDC} & \text{for } 40 \le i \le 59\\ \text{0xCA62C1D6} & \text{for } 60 \le i \le 79 \end{cases} \tag{4}$$
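The round-dependent selection in Equations (3) and (4) can be sketched in Python as follows. The function names `round_function` and `round_constant` are hypothetical, and the third constant is written with its standard SHA-1 value, 0x8F1BBCDC.

```python
MASK32 = 0xFFFFFFFF


def round_function(i: int, b: int, c: int, d: int) -> int:
    """Return F(i) for 32-bit inputs B, C, D."""
    if not 0 <= i <= 79:
        raise ValueError("round number must be in 0..79")
    if i <= 19:
        # Choose: take bits of C where B is 1, bits of D where B is 0.
        return ((b & c) | (~b & d)) & MASK32
    if i <= 39 or i >= 60:
        # Parity of the three inputs (rounds 20..39 and 60..79).
        return (b ^ c ^ d) & MASK32
    # Majority of the three inputs (rounds 40..59).
    return ((b & c) | (b & d) | (c & d)) & MASK32


def round_constant(i: int) -> int:
    """Return K(i); the constant changes every 20 rounds."""
    if not 0 <= i <= 79:
        raise ValueError("round number must be in 0..79")
    return (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)[i // 20]
```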

After 80 rounds for the given block, the *H* registers are updated as in Listing 2, where the addition is performed modulo 2<sup>32</sup>. When all blocks of the message have been processed, the hash can be read as a concatenation of the *H* variables.

**Listing 2.** Update of *H* variables when block was processed in the SHA-1 algorithm.

H\_0 (i) = H\_0 (i − 1) + A
H\_1 (i) = H\_1 (i − 1) + B
H\_2 (i) = H\_2 (i − 1) + C
H\_3 (i) = H\_3 (i − 1) + D
H\_4 (i) = H\_4 (i − 1) + E
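To show how the steps fit together, the following Python sketch processes a byte-aligned message block by block and updates the *H* registers as in Listing 2. It is a reference-style illustration, not the hardware design discussed in this paper. Because the round scheme of Figure 4 is not reproduced in the text, the per-round update of *A* to *E* is an assumption taken from the standard SHA-1 definition, and the result is compared with Python's `hashlib` as a sanity check.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF
H_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1(message: bytes) -> bytes:
    """Compute the SHA-1 digest of a byte-aligned message."""
    # Padding: one "1" bit, "0" bits, then the 64-bit message length.
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data) % 64) % 64)
    data += struct.pack(">Q", len(message) * 8)

    h = list(H_INIT)
    for offset in range(0, len(data), 64):
        # Message schedule for this 512-bit block.
        w = list(struct.unpack(">16I", data[offset:offset + 64]))
        for i in range(16, 80):
            w.append(rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))

        a, b, c, d, e = h
        for i in range(80):
            if i <= 19:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif i <= 39:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif i <= 59:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            # Standard SHA-1 round update (assumed to match Figure 4).
            temp = (rotl32(a, 5) + f + e + k + w[i]) & MASK32
            a, b, c, d, e = temp, a, rotl32(b, 30), c, d

        # Update of the H registers, addition modulo 2^32.
        h = [(x + y) & MASK32 for x, y in zip(h, (a, b, c, d, e))]

    # The hash is the concatenation of the five H registers.
    return struct.pack(">5I", *h)


if __name__ == "__main__":
    sample = b"abc"
    print(sha1(sample).hex())
    print(sha1(sample) == hashlib.sha1(sample).digest())
```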

In the proposed network probe, SHA-1 is applied to a 104-bit string that consists of the 32-bit IP source and destination addresses, the 8-bit IP protocol field, and the 16-bit TCP/UDP source and destination ports. The 160-bit hash is then reduced to a single 32-bit word by XORing (⊕) all five *H* registers together.
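A software sketch of this flow hashing is given below. The field order and big-endian packing of the 104-bit key are assumptions made for illustration, since the exact bit layout used in the probe is not specified here; the function names `pack_flow_key` and `flow_hash32` are likewise hypothetical.

```python
import hashlib
import ipaddress
import struct


def pack_flow_key(src_ip: str, dst_ip: str, protocol: int,
                  src_port: int, dst_port: int) -> bytes:
    """Pack the IPv4 5-tuple into a 13-byte (104-bit) key (assumed layout)."""
    return struct.pack(
        ">IIBHH",
        int(ipaddress.IPv4Address(src_ip)),
        int(ipaddress.IPv4Address(dst_ip)),
        protocol,
        src_port,
        dst_port,
    )


def flow_hash32(src_ip: str, dst_ip: str, protocol: int,
                src_port: int, dst_port: int) -> int:
    """Hash the 5-tuple with SHA-1 and XOR-fold the digest to 32 bits."""
    key = pack_flow_key(src_ip, dst_ip, protocol, src_port, dst_port)
    digest = hashlib.sha1(key).digest()
    result = 0
    # The five 32-bit words of the digest correspond to the H registers.
    for word in struct.unpack(">5I", digest):
        result ^= word
    return result


if __name__ == "__main__":
    # Example flow using documentation-range addresses; TCP is protocol 6.
    print(hex(flow_hash32("192.0.2.1", "198.51.100.7", 6, 49152, 443)))
```

Because the XOR fold discards most of the digest, the 32-bit result is suitable only as a flow identifier or hash-table index, not as a cryptographic value.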
