#### 2.2.1. Side-Channel Detection

Side-channel analysis is a well-studied detection method that draws on physical side channels such as temporal (propagation delay), thermal, and electrical (current, EMI, voltage, charge) channels. Side-channel attack (SCA) analysis uses a runtime hardware characteristic of a cryptographic device, such as power, to evaluate whether the device leaks secret information or reveals its encryption behavior. Unlike attacks that exploit software bugs, such attacks on hardware components do not depend on buggy hardware. Side-channel attacks can be categorized into simple power analysis [29], differential power analysis [29], and correlation power analysis [30]. Because correlation power analysis requires far fewer traces to recover the key than simple power analysis or differential power analysis, correlation power analysis, which retrieves the key by analyzing the correlation between the data being computed and the measured power consumption, has become the most popular side-channel attack for breaking cryptographic implementations [31,32]. Awareness of the potential of EM side-channel attacks against many kinds of targets is also growing [33–39]. The attacker is typically interested in emanations resulting from data processing operations, such as state changes and current flows in the CMOS circuits. These currents produce EM emanations, sometimes in unintended ways. Such emanations carry information about the data or clock rates. They provide multiple views of the events unfolding within the device at each clock cycle because each active component produces and induces various types of emanations, which increases the device's vulnerability to hacking or exploitation.

However, the sensing approaches in much of the literature on current and EM side channels are generally not isolated from the circuit under test [20], and they require components to be placed in the circuit itself to detect changes in the waveform. In the case of EM side-channel analysis, the unit under test must be within a particular test setup for those verifying the chip to discover the trojan. Once the chip leaves that setting, a trojan that went undetected cannot be discovered until malicious events occur. Hence, this study proposes EM side-channel analysis of IC power interconnects via magnetic tunnel junction sensors.

#### 2.2.2. IC Current Sensing for Hardware Trojan Detection

Various IC current sensing methods have been proposed, including built-in current sensors (BICSs) [40] and magnetoresistance sensors [41,42]. BICSs have been shown to be able to detect trojans [40], and other research has indicated that MTJ sensors can be used for anomaly detection [43]. Many conventional current sensing schemes rely on invasive series components, such as a series resistor, a power MOSFET (to observe its on-resistance), or even an integrator [44–47]. These schemes cause high power dissipation and have many limitations, including process dependency, control difficulty, and high complexity. The major issue is that the inserted components change the characteristics of the overall circuit unless only a small resistance is inserted in the loop. Although a small resistance reduces the risk of degrading performance, it makes the signal harder to sense accurately. In this study, novel on-chip non-invasive EM sensors are used to collect EM emanations for (1) observing whether the device reveals detectable patterns and (2) monitoring whether the device is under attack, which may result in unusual activity. To enable on-chip security detection for mobile devices, we propose an EM-sensing system that monitors critical signals with non-invasive sensors. This avoids inserting new components in the signal path, so the sensing circuits do not modify the system characteristics.
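The trade-off for a series sense resistor can be illustrated with a short worked example. The supply current *I* = 10 mA and the two resistor values below are hypothetical numbers chosen only for illustration; they are not taken from the cited schemes. With sensed voltage *V* = *IR* and dissipated power *P* = *I*<sup>2</sup>*R*:

$$\begin{aligned}
R = 1\ \Omega &: \quad V = 10\ \text{mV}, \quad P = (10\ \text{mA})^{2}\cdot 1\ \Omega = 0.1\ \text{mW} \\
R = 10\ \text{m}\Omega &: \quad V = 0.1\ \text{mV}, \quad P = (10\ \text{mA})^{2}\cdot 10\ \text{m}\Omega = 1\ \mu\text{W}
\end{aligned}$$

Reducing the resistance by a factor of 100 reduces the loading and dissipation by the same factor, but it also shrinks the signal to be measured by a factor of 100, which is the accuracy difficulty noted above.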

This study proposes to utilize MTJ sensors for current sensing, and machine learning models to develop an on-chip, isolated current sensor that will enable hardware trojan detection for protection of RF transceivers. Thus, this study not only focuses on the physics of hardware transceivers, computationally light-weight machine learning models, and MTJ sensors, but also develops a number of potential hardware trojans that cover the vulnerabilities pointed out in previous research, such as added components and injected noise.

#### **3. On-Chip Magnetic Tunnel Junction (MTJ) Based Sensors for Instant Device Power/Current and EM Emission Monitoring**

The basic MTJ structure consists of two ferromagnetic layers separated by an insulator layer. The pinned layer has a fixed magnetization direction, while the magnetization direction of the free layer can be changed. Conventionally, MTJ devices have been used as oscillators or memory [48–52]. MTJ devices can be fabricated monolithically over CMOS circuits. An e-beam-based nanofabrication process was developed to fabricate an MTJ-based spin torque oscillator over the Metal-4 layer of CMOS circuits. In this study, we directly changed the resistance of the MTJ devices with an external magnetic field. Hence, the MTJ devices are used as non-invasive current sensors whose resistance is a function of the external magnetic field. The MTJ devices can be exploited as EM sensors placed near the critical signal paths.

Figure 2a shows the on-chip non-invasive current sensor, which consists of a magnetic flux guide and concentrator along with a magnetic tunnel junction that converts magnetization rotation into a voltage change. A current along the power line of the chip generates magnetization rotation in the magnetic layer above it, with a rotation magnitude linearly proportional to the current amplitude. The patterned, planar, funnel-shaped magnetic film amplifies the rotation angle as the magnetization flux travels along the strip. An MTJ is placed at the end of the strip with its free layer exchange coupled to the flux guide. An MgO-based tunnel barrier is used to obtain a high magnetoresistance ratio (MR) of more than 300%. The reference magnetic layer on the other side of the tunnel barrier has its magnetization pinned in the direction orthogonal to the flux propagation direction by an antiferromagnetic layer deposited above it. The resistance of the MTJ depends on the relative magnetization orientation of the two magnetic layers sandwiching the tunnel barrier, i.e., the angle *θ* in the figure on the left. The resistance can be computed as

$$R(\theta) = \frac{R_{\perp}}{1 + p^{2}\cos\theta}$$

where *R*<sub>⊥</sub> is the resistance when *θ* = 90° and *p* is the polarization factor. The minimum and maximum resistances are

$$R_{min} = R_{0^{\circ}} = \frac{R_{\perp}}{1 + p^{2}}, \qquad R_{max} = R_{180^{\circ}} = \frac{R_{\perp}}{1 - p^{2}}$$

Therefore,

$$MR = \frac{R_{max} - R_{min}}{R_{min}} = \frac{\dfrac{R_{\perp}}{1 - p^{2}} - \dfrac{R_{\perp}}{1 + p^{2}}}{\dfrac{R_{\perp}}{1 + p^{2}}} = \frac{2p^{2}}{1 - p^{2}}$$

For today's typical MTJ, *p* is 0.70 to 0.75 and MR is approximately 200%. The resulting resistance-area product (*R*<sub>⊥</sub>*A*) is around 1 kΩ·μm<sup>2</sup> to 1 MΩ·μm<sup>2</sup>. The analysis shows that a millivolt-level signal voltage is expected for a milliampere-level current change. Here, the bridge sensing structure [53] in Figure 2b is used to eliminate any response to external stray field disturbances, such as the effect of the earth's field.

**Figure 2.** (**a**) The on-chip non-invasive current sensor that consists of the magnetic flux guide and concentrator along with magnetic tunnel junction, and (**b**) the bridge sensing structure to eliminate any response to field disturbance.
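The angular resistance model and the MR expression above can be checked numerically. The following is a minimal sketch, not the authors' analysis code: the function names and the value `R_PERP = 1000.0` (1 kΩ) are assumptions chosen for illustration, while the formulas are the ones given in the text.

```python
import math


def mtj_resistance(theta_rad, r_perp, p):
    """R(theta) = R_perp / (1 + p^2 * cos(theta)), with theta in radians."""
    return r_perp / (1.0 + p * p * math.cos(theta_rad))


def mr_ratio(p):
    """Closed-form magnetoresistance ratio MR = 2 p^2 / (1 - p^2)."""
    return 2.0 * p * p / (1.0 - p * p)


R_PERP = 1000.0  # ohms; hypothetical resistance at theta = 90 degrees

for p in (0.70, 0.75):
    r_min = mtj_resistance(0.0, R_PERP, p)      # parallel magnetizations
    r_max = mtj_resistance(math.pi, R_PERP, p)  # antiparallel magnetizations
    mr_from_resistances = (r_max - r_min) / r_min
    print(f"p={p:.2f}  R_min={r_min:.1f}  R_max={r_max:.1f}  "
          f"MR={mr_from_resistances:.3f}  closed form={mr_ratio(p):.3f}")
```

For *p* = 0.70 and *p* = 0.75 the closed form gives MR of about 1.92 and 2.57 (192% and 257%), which brackets the approximately 200% figure quoted for typical devices.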

The entire MTJ-based sensor structure can be directly fabricated on top of the top metal layer of the semiconductor chip/circuit with two potential methods. The first one is the chemical mechanical polishing (CMP) process that will be performed over the top metal layer with deposition of the magnetic flux guide and MTJ film stack using the sputtering technique. An e-beam/optical lithography with an ion-mill process will be employed to fabricate the sensor structure along with contacting pads and connection to the circuit underneath. The other method is to adopt the dry etch to remove the top passivation layers for chip protection from the electrode areas. The silicon dioxide can be further thinned down by an optional dielectric reactive etch in order to enhance the coupling efficiency and the minimum detectable resolution.

#### **4. Machine Learning Algorithms for Real-Time Threat-and-Vulnerability Detection**

A typical side-channel signal analysis involves pre-processing to reduce dimensionality, followed by a comparison of the measured traces with predicted leakage using distinguishing algorithms. The most common technique is correlation computation [11]. For example, the Pearson correlation coefficient, *ρ*, for the information component, *t*, of all measured traces between the predicted leakage, *L<sub>p</sub>*, and the measured leakage, *L<sub>m</sub>*(*t*), is defined as follows:

$$\rho(t) = \frac{\mathrm{Cov}\left(L_p, L_m(t)\right)}{\sqrt{\mathrm{Var}\left(L_p\right)\cdot \mathrm{Var}\left(L_m(t)\right)}}$$

where Cov denotes covariance and Var denotes variance. Pre-processing is adopted to reduce the set of points in the trace and to remove high-order signals. However, it is still computationally expensive to implement pre-processing and correlation computation on energy-constrained RF/analog devices. To eliminate the need for pre-processing the data, Bayesian neural networks (BNNs) are exploited in the proposed research to process the data directly and extract the features.
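The correlation step can be illustrated on simulated data. The sketch below is a toy example under stated assumptions and is not the setup used in this study: the traces are synthetic, the leakage model is the Hamming weight of a 4-bit S-box output (the PRESENT cipher S-box is borrowed only as a convenient nonlinear target), and all names and parameters (`simulate_traces`, `cpa_attack`, `TRUE_KEY`, `LEAK_INDEX`, the noise level) are hypothetical. It computes *ρ*(*t*) for every key guess and every time sample and reports the guess with the highest correlation.

```python
import math
import random

# 4-bit S-box of the PRESENT cipher, used only as a toy nonlinear target (assumption).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x):
    return bin(x).count("1")


def pearson(xs, ys):
    """Pearson correlation: Cov(x, y) / sqrt(Var(x) * Var(y))."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def simulate_traces(key, n_traces, n_samples, leak_index, noise_sigma, rng):
    """Synthetic traces: Gaussian noise plus Hamming-weight leakage at one sample."""
    plaintexts = [rng.randrange(16) for _ in range(n_traces)]
    traces = []
    for pt in plaintexts:
        trace = [rng.gauss(0.0, noise_sigma) for _ in range(n_samples)]
        trace[leak_index] += hamming_weight(SBOX[pt ^ key])
        traces.append(trace)
    return plaintexts, traces


def cpa_attack(plaintexts, traces):
    """Return (key guess, time sample, rho) with the highest correlation."""
    n_samples = len(traces[0])
    columns = [[tr[t] for tr in traces] for t in range(n_samples)]
    best = (None, None, -1.0)
    for guess in range(16):
        predicted = [hamming_weight(SBOX[pt ^ guess]) for pt in plaintexts]
        for t, measured in enumerate(columns):
            rho = pearson(predicted, measured)
            if rho > best[2]:
                best = (guess, t, rho)
    return best


rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_KEY = 0xA           # hypothetical 4-bit secret
LEAK_INDEX = 7           # hypothetical time sample where the leakage occurs
pts, trs = simulate_traces(TRUE_KEY, 400, 20, LEAK_INDEX, 0.5, rng)
guess, t_peak, rho_peak = cpa_attack(pts, trs)
print(f"best guess={guess:#x}, peak sample={t_peak}, rho={rho_peak:.3f}")
```

Even in this small example the search evaluates 16 × 20 correlations over 400 traces, which gives a sense of why the full computation is costly on energy-constrained devices.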

#### *4.1. Bayesian Neural Networks*

Bayesian neural networks (BNNs) have been investigated as a computationally lightweight yet robust approach to the classification of electrical signals. In particular, a previous work [26] investigated the use of BNNs to classify power amplifiers (PAs) based upon variational differences due to process corners. This study also investigated the use of Bayesian neural networks to classify side-channel signals sensed by the MTJ sensors, such as the integrated circuit (IC) supply current. BNNs are based upon Bayes' theorem, which states that the probability of a hypothesis *h* given a set of data *D* is equal to the probability of *D* given *h*, multiplied by the probability that *h* is true, divided by the probability of *D*:

$$P(h|D) = \frac{P(D|h)P(h)}{P(D)}\tag{1}$$

In this case, *P*(*h*|*D*) is the posterior probability of *h* because it reflects the confidence that *h* holds after seeing *D*. Bayesian concept learning rests on the assumption that the BNN is trained using a sequence of training examples *D*, consisting of a set of instances *X*, each of which is mapped to a label *y*, such that:

$$D = [(X_n, y_n) \mid n = 1, 2, \dots, N] \tag{2}$$

For each *n*, *Xn* is a vector of points corresponding to an IC current signal sensed through an MTJ resistive sensor, and *yn* is a vector of assigned class labels corresponding to a set of *K* classes. Given a model with parameters *θ* and prior distribution Pr(*θ*), the posterior distribution of the parameters is as follows:

$$\Pr(\theta|\mathbf{X}_{tr}, y_{tr}) = \frac{\Pr(\theta)\Pr(y_{tr}|\mathbf{X}_{tr}, \theta)}{\int \Pr(\theta)\Pr(y_{tr}|\mathbf{X}_{tr}, \theta)d\theta} \tag{3}$$

In classifying a test set *Xnew*, the predictive distribution of the classification set *Ynew* becomes:

$$\Pr(\mathbf{Y}_{new} | \mathbf{X}_{new}, \mathbf{X}_{tr}, y_{tr}) = \int \Pr(\mathbf{Y}_{new} | \mathbf{X}_{new}, \theta) \Pr(\theta | \mathbf{X}_{tr}, y_{tr}) d\theta \tag{4}$$

Because the integral in Equation (4) is intractable, numerical methods, such as the computationally heavy Markov chain Monte Carlo method, must be applied to estimate the predictive distribution. In this study, variational inference [54], a computationally lighter method, is used to approximate the integral. The variational posterior is assumed to be a Gaussian distribution, and samples of the weights are obtained by shifting unit Gaussian variables by a mean *μ* and scaling them by a standard deviation *σ*, where *σ* = ln(1 + exp(*ρ*)) so that *σ* remains positive for any real *ρ*. Thus, each sample of the weights can be expressed as:

$$w = \mu + \sigma \circ \epsilon \tag{5}$$

where "◦" denotes element-wise multiplication and *ε* is a vector drawn from the standard normal distribution *N*(0, 1), which introduces variance to the weights of the Bayesian neural network, as shown in Figure 3.

**Figure 3.** The weights of the Bayesian neural network are sampled from probability distributions.
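The sampling rule in Equation (5) can be sketched in a few lines. This is an illustrative example using only the Python standard library rather than the PyTorch code used in the study; the helper names and the values of `mu` and `rho` are hypothetical. It draws many weight samples and compares their empirical mean and standard deviation with *μ* and *σ* = ln(1 + exp(*ρ*)).

```python
import math
import random


def softplus(rho):
    """sigma = ln(1 + exp(rho)); positive for any real rho (moderate rho assumed)."""
    return math.log1p(math.exp(rho))


def sample_weights(mu, rho, rng):
    """One draw of w = mu + sigma * eps with eps ~ N(0, 1), element by element."""
    return [m + softplus(r) * rng.gauss(0.0, 1.0) for m, r in zip(mu, rho)]


rng = random.Random(0)
mu = [0.5, -1.0, 2.0]    # hypothetical variational means
rho = [-3.0, 0.0, 1.0]   # hypothetical unconstrained scale parameters

draws = [sample_weights(mu, rho, rng) for _ in range(20000)]
empirical = []
for i in range(len(mu)):
    column = [d[i] for d in draws]
    mean = sum(column) / len(column)
    std = math.sqrt(sum((c - mean) ** 2 for c in column) / len(column))
    empirical.append((mean, std))
    print(f"w[{i}]: mean={mean:.3f} (mu={mu[i]}), "
          f"std={std:.3f} (sigma={softplus(rho[i]):.3f})")
```

Because the randomness enters only through *ε*, the sample is a deterministic function of *μ* and *ρ* for a given *ε*, which is what allows both parameters to be trained by gradient descent.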

#### *4.2. BNN Architecture and Optimization*

The BNN for this study was a network of two hidden layers with thirty-two nodes per layer, as shown in Figure 4. The BNN was trained using the Python library PyTorch. The Cadence simulation data were quantized and classified to train the BNN. The BNN was trained and tested with data from a number of different hardware trojans. We first tested the ability of the system to classify individual trojans, the details of which are discussed in a later section. Certain trojans were easier to detect than others, and some could be detected with nearly 100% accuracy. We also trained and tested the BNN with all the different hardware trojans combined into one dataset. We trained the BNN on equal amounts of normal and abnormal data and found that, because of similarities between the normal and abnormal data, the accuracy for the combined dataset was around 90%. We chose the structure of the BNN to maximize accuracy on the combined trojan dataset. A network with 32 hidden neurons per layer produced nearly 6% higher accuracy after 1000 training epochs than a network with 16 neurons, and showed only a statistically insignificant loss of accuracy relative to a network with 64 neurons trained for the same amount of time. Thus, we used 32 neurons per layer to minimize resource usage while preserving accuracy.

**Figure 4.** The optimized architecture of the lightweight Bayesian neural network for classification of the sensed EM signals.
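The way such a network approximates the predictive distribution in Equation (4) can be sketched as a Monte Carlo average over sampled weights. The example below is a standard-library illustration, not the PyTorch model used in the study: the network is untrained, and the input length (`N_INPUTS = 16`), the number of classes (`N_CLASSES = 2`), the ReLU activation, the initial values of *μ* and *ρ*, and all function names are assumptions. Only the two hidden layers of 32 nodes come from the text.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_CLASSES = 16, 32, 2  # input size and class count are assumptions


def softplus(x):
    return math.log1p(math.exp(x))


def init_layer(n_in, n_out, rng):
    """Variational parameters (mu, rho) per weight; the last column is the bias."""
    mu = [[rng.gauss(0.0, 0.3) for _ in range(n_in + 1)] for _ in range(n_out)]
    rho = [[-3.0] * (n_in + 1) for _ in range(n_out)]
    return mu, rho


def sample_layer(layer, rng):
    """Draw w = mu + softplus(rho) * eps for every weight in the layer."""
    mu, rho = layer
    return [[m + softplus(r) * rng.gauss(0.0, 1.0) for m, r in zip(m_row, r_row)]
            for m_row, r_row in zip(mu, rho)]


def dense(weights, x, relu):
    out = []
    for row in weights:
        z = row[-1] + sum(w * xi for w, xi in zip(row[:-1], x))
        out.append(max(0.0, z) if relu else z)
    return out


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(layers, x, n_samples, rng):
    """Average the class probabilities over n_samples weight draws."""
    avg = [0.0] * N_CLASSES
    for _ in range(n_samples):
        h = x
        for i, layer in enumerate(layers):
            is_hidden = i < len(layers) - 1
            h = dense(sample_layer(layer, rng), h, relu=is_hidden)
        avg = [a + p / n_samples for a, p in zip(avg, softmax(h))]
    return avg


rng = random.Random(0)
layers = [init_layer(N_INPUTS, N_HIDDEN, rng),   # hidden layer 1 (32 nodes)
          init_layer(N_HIDDEN, N_HIDDEN, rng),   # hidden layer 2 (32 nodes)
          init_layer(N_HIDDEN, N_CLASSES, rng)]  # output layer
x = [rng.uniform(-1.0, 1.0) for _ in range(N_INPUTS)]  # stand-in for a quantized trace
probs = predict(layers, x, 50, rng)
print("class probabilities:", [round(p, 3) for p in probs])
```

Each prediction costs one forward pass per weight sample, so both the layer width and the number of samples trade accuracy against resource usage.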
