2.1. Hardware Trojan
Hardware trojan attacks involve injecting malicious circuitry into integrated circuits (ICs) at any stage of the manufacturing process or supply chain [19]. These trojans remain dormant until they are triggered, at which point they can carry out various malicious activities, such as altering functionality, leaking sensitive data, or completely disrupting device operation [1]. Their infrequent activation makes these trojans difficult to detect, and their potential consequences, especially for critical systems, pose a significant threat to hardware security and integrity.
The growing use of FPGAs in security-critical applications has intensified concerns about their susceptibility to hardware trojans. Like application-specific integrated circuits (ASICs), FPGAs are vulnerable to malicious modifications throughout their design and fabrication lifecycle [13]. Attackers have various avenues to introduce trojans into FPGAs, including manipulating the hardware description language (HDL) code, altering the FPGA fabric, modifying physical parameters, tampering with bitstreams, or exploiting vulnerabilities in FPGA computer-aided design (CAD) tools. For example, a trojan circuit could be stealthily embedded to monitor the FPGA’s internal nodes, logic modules, and look-up tables (LUTs). Upon activation, this trojan could cause malfunctions by altering LUT values, disrupting configuration cells to induce routing errors, or writing erroneous data into block random access memory (BRAM) [20].
2.2. Bitstream Architecture
The bitstream is a binary file that contains the configuration information to program the internal logic of the FPGA. In essence, a bitstream can be considered analogous to a stream of bytes arranged in layers that resemble a protocol stack for computer networking [21]. In this work, we focus on Xilinx 7-series devices, and the corresponding layers in the bitstream protocol stack are the physical interface, configuration packets, and configuration memory frames, as shown in Figure 1.
Moving up the bitstream protocol stack gives successively more information about the configuration of the physical resources on the FPGA. At the bottom layer, the physical interface indicates which configuration interface, such as JTAG, SPI (serial peripheral interface), BPI (byte peripheral interface), or SelectMAP, is used to program the FPGA. Because multiple configuration interfaces are available, the generated bitstream can take multiple file formats, as shown in Figure 2. In the current work, the bitstream was generated with the bit file extension.
The bitstreams generated for Xilinx 7-series FPGAs consist of instructions for modifying the configuration logic along with configuration data. In particular, a 7-series bitstream consists of three sections [22]: the bus width auto-detection pattern, the sync word, and the FPGA configuration data.
The availability of multiple physical interfaces for FPGA programming gives rise to both parallel and serial modes of configuration. The bus width auto-detection pattern, incorporated at the beginning of every bitstream, is used to set the configuration bus width; it is ignored in serial configuration modes. After this pattern has been detected, a sync word aligns the configuration logic to a 32-bit word boundary and signals the start of packet processing. For Xilinx 7-series FPGAs, the sync word is 0xAA995566, and in parallel configuration modes the bus width detection must occur before the sync word is detected [22].
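As a minimal sketch of how the sync word separates the leading part of a bitstream from the packet data, the following Python snippet searches a byte string for 0xAA995566 and splits the remainder into 32-bit words. The function name `split_at_sync`, the big-endian word order, and the synthetic input bytes are assumptions made for illustration; they are not taken from the tool flow used in this work.

```python
SYNC_WORD = bytes.fromhex("AA995566")  # 7-series sync word quoted in the text


def split_at_sync(bitstream: bytes):
    """Return (pre_sync, words) for a raw bitstream.

    pre_sync holds every byte before the sync word (for example, a file
    header, padding, or the bus width auto-detection pattern). words holds
    the 32-bit words after the sync word, read as big-endian integers
    (the byte order is an assumption of this sketch).
    """
    pos = bitstream.find(SYNC_WORD)
    if pos < 0:
        raise ValueError("sync word not found")
    body = bitstream[pos + len(SYNC_WORD):]
    usable = len(body) - len(body) % 4  # ignore a trailing partial word
    words = [int.from_bytes(body[i:i + 4], "big") for i in range(0, usable, 4)]
    return bitstream[:pos], words


# Synthetic input for illustration only (not a real bitstream):
# three words of leading data, the sync word, then two 32-bit words.
demo = bytes.fromhex(
    "FFFFFFFF" "000000BB" "11220044" "AA995566" "20000000" "30008001"
)
pre_sync, words = split_at_sync(demo)
print(len(pre_sync), [hex(w) for w in words])
```

Everything returned in `pre_sync` corresponds to data that the configuration logic does not treat as packets; only the words after the sync word are candidates for packet decoding.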
The physical interface layer can thus be considered to pass a sequence of packets containing a series of read/write operations for the configuration logic of the FPGA. In Xilinx 7-series FPGA devices, this sequence of packets goes through the FPGA configuration process shown in Figure 3. As Figure 3 shows, the steps in the configuration process can be grouped into two stages: setup and bitstream loading [22].
In the setup stage, the device first powers up and supplies power to the pins corresponding to the various physical interfaces. It then sequentially clears the configuration memory and samples the mode pins on the rising edge of the configuration clock to determine whether the serial or the parallel mode of configuration has been selected [22].
After the setup stage, the bitstream loading stage commences, and it is similar for both the serial and the parallel configuration modes. In the first step of bitstream loading, synchronization occurs by means of a special 32-bit synchronization word (0xAA995566), which signals the device to align the configuration data with the internal configuration logic. For parallel configuration modes, such as BPI, Slave SelectMAP, and Master SelectMAP, bus width auto-detection must occur before synchronization, whereas the bus width auto-detection pattern is ignored for serial configuration modes, such as Slave Serial, Master Serial, SPI, and JTAG. The data on the input pins prior to the synchronization word are ignored, except for bus width auto-detection. After synchronization is complete, a device ID check is carried out and must pass before the configuration data frames can be loaded; this check ensures that the bitstream is intended for the device being programmed. In the following step, the configuration data frames are loaded. In the final step of the bitstream loading stage, a cyclic redundancy check (CRC) is performed on the configuration data packets. Xilinx 7-series devices use a 32-bit CRC to detect errors in the transmission of the bitstream, and if the CRC fails, configuration is aborted. When the loading of the configuration frames is complete, the device enters the startup sequence as per the command contained in the bitstream [22].
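The ordering of the bitstream loading steps can be summarized with a toy model. The sketch below is not a description of the device's internal logic; the function `bitstream_loading` and its boolean inputs are hypothetical and only encode the sequence described above (synchronization, device ID check, frame loading, CRC, startup).

```python
def bitstream_loading(sync_found: bool, id_matches: bool, crc_ok: bool) -> list:
    """Return the loading steps that complete, in order (toy model only)."""
    completed = []
    if not sync_found:
        return completed  # data before the sync word is ignored
    completed.append("synchronization")
    if not id_matches:
        return completed  # frames are not loaded for the wrong device
    completed.append("device ID check")
    completed.append("load configuration frames")
    if not crc_ok:
        return completed  # a failed CRC aborts configuration
    completed.append("CRC")
    completed.append("startup sequence")
    return completed


print(bitstream_loading(True, True, True))
print(bitstream_loading(True, False, True))
```

The second call illustrates that a failed device ID check stops the process before any configuration frame is written.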
Note that only after detection of the sync word does the configuration logic process each 4-byte data word on the interface as a configuration packet or as a component of a multiple-word configuration packet [22]. The final layer of the bitstream protocol stack comprises the configuration memory frames. For Xilinx 7-series FPGA devices, the configuration memory is organized into frames that are distributed as tiles across the physical layout of the device. A frame is the smallest addressable segment of the configuration memory of a Xilinx 7-series FPGA, and all operations, therefore, act on whole configuration frames [22]. The bitstream is in a packetized form, from which the configuration memory frame data must be extracted. The bitstream consists of two packet types: Type 1 and Type 2. Through these packets, all 7-series FPGA bitstream commands are executed as read/write operations on the configuration registers, which govern the overall programming of the FPGA. Type 1 packets, shown in Figure 4, are used for configuration register reads and writes [22]. The opcode of a Type 1 packet, illustrated in Figure 5, specifies the intended operation, such as a read or a write. Type 2 packets, shown in Figure 6, follow a Type 1 packet and are used to write longer blocks of data. Among the configuration registers, the frame address register (FAR) and the frame data register, input (FDRI) are used to configure frame data [22]. All 7-series Xilinx FPGAs are divided into two halves, top and bottom, and every frame has a length of 3232 bits (101 32-bit words) [22]. Correspondingly, the FAR is divided into five fields, as given in Figure 7: block type, top/bottom bit, row address, column address, and minor address. Together, these fields specify the frame address at which frame data are written.
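To make the packet and frame-address structure concrete, the following Python sketch decodes a 32-bit packet header and a FAR value. The bit positions, the opcode encoding, and the two register addresses used here are assumptions based on our reading of the 7-series configuration documentation [22] and should be checked against Figures 4 to 7; the names `decode_header` and `decode_far` are hypothetical.

```python
from typing import NamedTuple

# Assumed encodings (verify against Figures 4-7 before relying on them).
OPCODES = {0b00: "NOP", 0b01: "read", 0b10: "write", 0b11: "reserved"}
REGISTERS = {0b00001: "FAR", 0b00010: "FDRI"}  # subset of register addresses
FRAME_WORDS = 101  # 3232 bits per frame / 32 bits per word


class PacketHeader(NamedTuple):
    packet_type: int
    opcode: str
    register: str  # empty for Type 2, which reuses the preceding Type 1 target
    word_count: int


def decode_header(word: int) -> PacketHeader:
    """Decode a packet header word (assumed layout: type in bits 31:29)."""
    packet_type = (word >> 29) & 0x7
    opcode = OPCODES[(word >> 27) & 0x3]
    if packet_type == 1:
        addr = (word >> 13) & 0x1F  # assumed 5-bit register address
        name = REGISTERS.get(addr, f"reg{addr}")
        return PacketHeader(1, opcode, name, word & 0x7FF)
    if packet_type == 2:
        return PacketHeader(2, opcode, "", word & 0x7FFFFFF)
    raise ValueError(f"unknown packet type {packet_type}")


class FrameAddress(NamedTuple):
    block_type: int
    top_bottom: int
    row: int
    column: int
    minor: int


def decode_far(far: int) -> FrameAddress:
    """Split a FAR value into its five fields (assumed bit positions)."""
    return FrameAddress(
        block_type=(far >> 23) & 0x7,
        top_bottom=(far >> 22) & 0x1,
        row=(far >> 17) & 0x1F,
        column=(far >> 7) & 0x3FF,
        minor=far & 0x7F,
    )


header = decode_header(0x500000CA)  # Type 2 write carrying 202 words
print(header, header.word_count // FRAME_WORDS, "frames")
print(decode_far(0x440283))
```

Under these assumptions, dividing the word count of an FDRI write by 101 gives the number of whole frames carried by the packet, which is how frame data can be located once the packet headers have been decoded.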
2.3. Recurrent Neural Networks (RNN)
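Recurrent neural networks (RNNs) are a category of neural networks that can process sequential data, where the sequences can be of varying length. The term recurrent refers to the fact that previous outputs can be fed back as inputs while hidden states are maintained. The architecture of a conventional RNN is shown in Figure 8 [23], where $x^{\langle t \rangle}$ is the input vector, $a^{\langle t \rangle}$ is the activation vector, and $y^{\langle t \rangle}$ is the output vector.
In each time step, the activation $a^{\langle t \rangle}$ and output $y^{\langle t \rangle}$ are computed by Equations (1) and (2), respectively:
$$a^{\langle t \rangle} = g_1\left(W_{aa}\, a^{\langle t-1 \rangle} + W_{ax}\, x^{\langle t \rangle} + b_a\right) \tag{1}$$
$$y^{\langle t \rangle} = g_2\left(W_{ya}\, a^{\langle t \rangle} + b_y\right) \tag{2}$$
In the equations, $W_{ax}$, $W_{aa}$, $W_{ya}$, $b_a$, $b_y$ are temporally shared coefficients and $g_1$, $g_2$ are the activation functions. The construction of the hidden layer for a conventional RNN is given in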
Figure 9, which grants it the ability to store historical information. However, a conventional RNN finds it increasingly challenging to retain information from far in the past, because the gradient can shrink or grow exponentially with the number of layers (time steps) through which it is propagated. As a result, the conventional RNN encounters the problems known as the vanishing and exploding gradient. A modification to the RNN architecture, in which the hidden layer is replaced by a cell, as shown in Figure 10, mitigates the vanishing gradient issue. The resulting RNN is called long short-term memory (LSTM); it retains only the most important historical information for generating the output and discards the rest. It is therefore far less affected by the gradient problems encountered in a conventional RNN.
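The recurrence and the gradient behavior described above can be illustrated with a short pure-Python sketch. It assumes the usual formulation of a conventional RNN, in which the activation is g1(W_aa a_prev + W_ax x_t + b_a) and the output is g2(W_ya a_t + b_y), with tanh chosen for g1 and the identity for g2; these choices, the function names, and the toy weights are assumptions for illustration rather than the configuration used in this work.

```python
import math


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rnn_step(x_t, a_prev, W_ax, W_aa, W_ya, b_a, b_y):
    """One time step of a conventional RNN (g1 = tanh, g2 = identity)."""
    pre = [p + q + b for p, q, b in zip(matvec(W_aa, a_prev), matvec(W_ax, x_t), b_a)]
    a_t = [math.tanh(z) for z in pre]
    y_t = [s + b for s, b in zip(matvec(W_ya, a_t), b_y)]
    return a_t, y_t


def run_rnn(xs, a0, params):
    """Apply the same (temporally shared) parameters at every time step."""
    a, ys = a0, []
    for x_t in xs:
        a, y = rnn_step(x_t, a, *params)
        ys.append(y)
    return a, ys


def gradient_scale(w, steps):
    """Factor multiplying a gradient sent back through `steps` time steps
    of the scalar linear recurrence a_t = w * a_{t-1}."""
    return w ** steps


# Toy 1-dimensional parameters: W_ax, W_aa, W_ya, b_a, b_y.
params = ([[1.0]], [[0.5]], [[2.0]], [0.0], [0.5])
final_a, outputs = run_rnn([[1.0], [0.0], [-1.0]], [0.0], params)
print(final_a, outputs)
print(gradient_scale(0.5, 50), gradient_scale(1.5, 50))
```

In the scalar case, a recurrent weight below 1 in magnitude shrinks the back-propagated gradient geometrically with the number of time steps, while a weight above 1 makes it grow, which is the behavior that motivates the LSTM cell.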