1. Introduction
Scan-based testing is a widely used design-for-test (DfT) scheme that facilitates the detection and diagnosis of faults in integrated circuits (ICs) through the high controllability and observability it provides [1]. By replacing ordinary flip-flops with scan flip-flops and connecting them serially, DfT gives access to internal nets and allows test responses to be shifted out in sequence. DfT and scan chain architecture therefore contribute significantly to the efficiency and effectiveness of IC testing and to yield improvement. For example, in pre-silicon design, Synopsys TetraMAX [2] is widely used for automatic test pattern generation (ATPG) and testability analysis, automating the generation of test patterns that expose potential defects in digital ICs. In post-silicon testing, the JTAG [3] and Nexus standards are widely adopted, using ATPG test patterns to perform boundary scan tests and debugging.
While the scan chain architecture applies DfT principles to improve IC testing, the controllability provided by scan flip-flops also exposes on-chip security primitives to non-invasive attacks, and the observability of scan chain outputs may create paths that leak information, introducing further security risks. Scan-based attacks, a type of side-channel attack, aim to extract secret keys by analyzing scan data obtained from scan chains. In prior research, scan-based attacks against advanced encryption standard (AES), data encryption standard (DES), and Rivest–Shamir–Adleman (RSA) crypto modules have been proposed [4]. Scan-based attacks are commonly classified into two groups: differential scan-based attacks and signature scan-based attacks. Differential scan-based attacks (DSAs) exploit a specific characteristic of the round function of block ciphers, where two distinct inputs can lead to output vectors that exhibit a distinctive Hamming distance after a single encryption round [5]. Yang et al. [6,7] detail a two-phase procedure that first determines the position of the intermediate registers within the scan chain and then applies DSAs to retrieve the round key. Signature scan-based attacks, on the other hand, require generating a set of signatures by simulating the encryption of chosen plaintexts under a range of candidate keys. Kodera et al. [8] describe a procedure that observes changes in the scan data while providing particular plaintexts to form the scan signature matrix, thus reducing the key candidate number from
to 512.
To secure crypto-chips against scan-based attacks, multiple countermeasures have been proposed. They fall mainly into two strategies: scan chain obfuscation and scan I/O restriction. Scan chain obfuscation aims to prevent attackers from controlling the scan chain by modifying the scan structure, inserting obfuscation gates, or adding sub-chains alongside the original scan chain [9]. Agrawal et al. [10] proposed an obfuscated scan chain structure that incorporates XOR gates at random points in the scan chain. Atobe et al. [11] proposed the state-dependent scan flip-flop (SDSFF), which replaces scan flip-flops at random points to prevent attackers from identifying the correct scan timing. Lee et al. [12] proposed sub-chain modification techniques that provide Lock and Key controls and scan order obfuscation to prevent attackers from accessing the scan structure. Gaikwad et al. [13] proposed the InvisibleScan architecture, which uses FSM obfuscation with access control mechanisms to protect the scan chain by preventing direct access to the SI, SO, and SE logic. The dynamically obfuscated scan chain (DOSC) [14] integrates the permutation of scan chains with XOR gates and employs logic-locking techniques using dynamic keys. Moreover, the DOSC incorporates a shadow chain that prevents the dynamic keys from leaking to the scan output. Both obfuscation and scan I/O restriction are therefore applied in the DOSC, providing a robust defense against unauthorized access and attacks.
In this paper, we propose DOSCrack, which stands for Deobfuscation using Oracle-guided Symbolic Execution and Clustering of Binary Security Keys. In addition, we also propose a modification to the DOSC that substantially increases the computational effort required to break it. Our contributions are listed below.
We propose DOSCrack, a novel deobfuscation framework that incorporates structural analysis, symbolic execution, key candidate clustering, and sequential equivalence checking to non-invasively recover the DOSC’s seed—the foundation of its security.
The DOSCrack framework is designed to target the DOSC architecture while being equally effective at breaking static scan chain obfuscation mechanisms by exploiting the scrambled scan chain through symbolic execution.
We experimentally apply our framework on implementations of the DOSC with different seed bit-lengths. The framework demonstrates scalability, with the recorded runtime exhibiting a proportional increase as the seed size grows.
We propose a countermeasure against our deobfuscation framework that incorporates a nonlinear feedback shift register (NLFSR) to improve the DOSC’s robustness against symbolic execution modeling.
The NLFSR countermeasure achieves this enhanced security with minimal design overhead, making it a practical solution for industrial DfT/D applications such as the BIST (Built-In Self-Test) architecture.
We analyze the complexity of NLFSR-based countermeasures by examining the growth in the number of clauses, as well as the time complexity for NLFSR compared to LFSR in the original DOSC architecture.
Our experiments demonstrate that the NLFSR-based countermeasure effectively defends against the DOSCrack deobfuscation attack when the seed size exceeds 32 bits, with the attack timing out after 10 days.
The rest of the paper is organized as follows. Section 2 gives the necessary background on the techniques used, as well as the structure of the dynamically obfuscated scan chain. Section 3 introduces DOSCrack, our proposed deobfuscation framework, and describes each step. Section 4 discusses how DOSCrack applies to alternative scan chain obfuscation techniques. Section 5 provides a countermeasure and explains how it improves the DOSC. Section 6 analyzes the results of applying our framework to DOSC benchmarks. Finally, Section 7 concludes the paper with key takeaways and future work.
2. Background and Preliminary Concepts
In this section, we give a concise overview of the DOSC architecture and then introduce the fundamental concepts of symbolic execution and binary clustering. Finally, we present the threat model of our DOSCrack framework, outlining the potential security risks from the attacker’s perspective and the assumptions of the DOSCrack attack.
2.1. DOSC Architecture
The DOSC [14] architecture is shown in Figure 1. It consists of four parts: the control unit, the LFSR (linear feedback shift register), the shadow chain, and the obfuscated scan chain. The control unit generates the signals that load the seed from non-volatile memory and regulates the clock frequency of the shadow chain. The LFSR then uses the seed to generate the obfuscation key sequence, and the shadow chain protects the obfuscation key from potential differential attacks. In the DOSC, the seed must be kept secret from the attacker. Otherwise, the attacker can generate test patterns and interpret test responses. If the DOSC is used to protect logic-locked circuits, knowing the DOSC’s seed would allow an attacker to perform Boolean satisfiability (SAT) attacks [15] against the locked functional circuit.
Previously, the SAT attack was performed against the DOSC architecture itself in an attempt to obtain the LFSR’s seed. To do so, sequential circuit unrolling was used [16]. The DOSC architecture was found to be robust against SAT attacks because such unrolling inevitably results in scalability issues for Boolean SAT solvers. As a result, the SAT attack targeting the DOSC seed timed out after 20 days.
2.2. Symbolic Execution
Symbolic execution is a program analysis technique used in testing, debugging, and verification. Instead of executing a program with concrete input values, symbolic execution operates on symbolic values and expressions. These symbolic values represent input variables, and the program’s execution paths are explored by tracking how the symbolic values change. Symbolic execution engines are often combined with simulation to generate feasible execution paths and then use satisfiability (SAT) or satisfiability modulo theories (SMT) solvers to find input patterns that exercise those paths. In prior research, Ahmed et al. [17] proposed a framework that combines symbolic execution with simulation data to generate test vectors that activate rarely triggered hardware Trojans. Vafaei et al. [18] proposed another framework that uses random simulations in symbolic models to identify critical execution branches for potential hardware Trojans. In our deobfuscation framework, we leverage the built-in symbolic execution engine from EISec [19] to convert our target netlist into C code. The generated C code then undergoes symbolic modeling.
2.3. Binary Clustering
Binary clustering refers to grouping binary data into clusters based on pattern similarity or distance measures such as Hamming distance (HD) or Euclidean distance. Binary data take only two possible states, so this concept is well suited to hardware applications, where logic values are limited to 1s and 0s. In hardware security, binary clustering is often employed to identify patterns in binary vectors that indicate malicious behavior or hardware Trojans. For example, SCOAP [20] and COTD [21] test for malicious signals using clustering analysis based on testability reports. In [20], Zhao et al. introduced a hardware Trojan detection method that clusters the value distributions of gates and registers using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. He et al. [22] proposed Fuzzy C-Means clustering in conjunction with fusion distance algorithms to detect anomalies indicative of hardware Trojans in integrated circuits. The choice of clustering algorithm depends on the specific characteristics of the data and the goals of the analysis. Commonly used binary clustering algorithms include hierarchical clustering, k-means clustering, and DBSCAN. In this paper, we use binary clustering to group similar candidate keys, which allows us to rule out similar keys (or their associated seeds) as a group.
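As a minimal illustration of the distance measure used for binary data, the sketch below computes the Hamming distance between candidate keys written as bit strings. The key values and the helper names `hamming` and `nearest` are hypothetical and chosen only for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of bit positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest(key: str, others: list) -> str:
    """Return the candidate closest to `key` in Hamming distance."""
    return min(others, key=lambda k: hamming(key, k))


# Hypothetical 8-bit key candidates (illustrative values only).
keys = ["10110010", "10110011", "01001100", "01001110"]

print(hamming(keys[0], keys[1]))   # candidates differing in one bit
print(nearest(keys[0], keys[1:]))  # closest neighbour of the first key
```

Keys that differ in only a few bits end up close together, which is what allows a whole group of similar candidates to be treated as one cluster.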
2.4. Threat Model
In this section, we briefly review the threat model of design-for-test/debug (DfT/D) and present the assumptions of our DOSCrack attack framework. In the semiconductor supply chain, the design house typically dispatches the DfT/D-inserted netlist to contract third-party fabs/foundries for chip production. These contract foundries therefore have full access to the scan chains embedded in the design. This accessibility presents a risk, as it allows these facilities to conduct scan-based attacks to extract sensitive information or compromise the security of the fabricated devices. Additionally, the assembly and test facilities, which are responsible for performing scan testing (e.g., JTAG) on fabricated chips, have knowledge of the DfT/D insertion techniques. As assumed in many other logic-locking papers, it is also possible for the design to be reverse-engineered or leaked by untrusted parties. Therefore, end users and debuggers may also have access to this information and to the scan chains. Based on this context, our DOSCrack framework assumes that potential attackers know the DOSC architecture and have access to the scan chain. However, they do not know the value of the LFSR seed. Obtaining the seed is therefore the goal of their attack. This threat model also aligns with Kerckhoffs’s principle, which states that the security of a cryptosystem must lie only in the secrecy of its keys and that everything else should be considered public knowledge.
While many scan-based attacks (e.g., differential scan-based attacks) require some understanding of the chip’s functional logic to be effective, our method relies exclusively on running the unlocked chip (or simulating an unlocked design) in scan mode and does not require any knowledge of the functional logic. This feature sets our DOSCrack framework apart from many other oracle-guided attacks. Additionally, DOSCrack performs a non-invasive attack, preserving the integrity of the oracle chip throughout the process. Note that once DOSCrack has recovered the seed, an attacker can proceed with SAT attacks against the functional circuit through the scan chain to recover its logic-locking key.
3. DOSCrack Framework
An overview of DOSCrack, the proposed DOSC deobfuscation framework, is illustrated in Figure 2. The inputs to the framework are the target obfuscated oracle and the netlist of the stand-alone DOSC architecture. Using this information, the framework generates the minimized set of obfuscation key candidates as output, which can subsequently be mapped to the corresponding seeds. Our objective is to narrow down the candidate seed space to a manageable size, where we can eliminate incorrect obfuscation keys until only a single valid seed remains. Our framework is composed of four integral components:
1. We use an unlocked chip (or equivalently simulate an unlocked chip’s DOSC and scan chain) to act as an oracle;
2. We employ structural analysis to distinguish between the linear feedback shift register (LFSR) and scan chains in the netlist;
3. We conduct symbolic execution followed by employing an SMT solver to rule out keys and their associated seeds;
4. We utilize clustering algorithms to efficiently categorize the remaining key candidates, find distinguishing patterns, and iteratively rule out more keys/seeds.
3.1. Structural Analysis
As explained in the threat model (Section 2.4), we assume that there is access to an open-source DOSC architecture and that the attacker’s goal is to obtain the LFSR seed. Structural analysis takes the netlist of the DOSC architecture as input and identifies the LFSR, shadow chain, and scan chains so that the symbolic engine can model each part separately. The LFSR is typically implemented as a finite state machine (FSM): it transitions between a finite number of states based on its current state and input. Scan chains, on the other hand, are implemented using datapath registers. These registers store and shift data through the scan chain during testing or debugging, which allows test data to be loaded and shifted sequentially. The FSM structure suits LFSRs because it efficiently handles the linear feedback mechanism that defines the LFSR’s behavior. Structural analysis therefore distinguishes the LFSR from the scan chains on the basis of their distinct physical implementations and functional roles.
Algorithm 1 shows the detailed steps of our structural analysis. The interaction analysis tool EXERT [23] is employed to identify FSMs and datapaths (line 1). This identification is based on the characteristic feature of the FSM that matches the expected behavior of an LFSR, primarily its feedback network, which is typically constructed from XOR gates (lines 2 to 6). This process separates the LFSR from the scan chains (line 8), allowing for more precise modeling with symbolic engines.
Algorithm 1 Structural Analysis
Input: DOSC netlist N;
Output: LFSR netlist, scan chain netlist;
1: DOSC netlist → EXERT interaction analysis
2: if feedback nets detected then
3:     FSM registers ← LFSR
4:     feedback nets connectivity ← XOR gate inputs
5: else if feedback nets not detected then
6:     datapath registers ← scan chains
7: end if
8: return LFSR netlist, scan chain netlist; feedback nets connectivity
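The separation step of Algorithm 1 can be sketched on a toy netlist. The snippet below is only an illustration under stated assumptions: it does not reproduce EXERT or its interface, and the dictionary-based netlist format, the register names, and the function `split_registers` are all hypothetical. Registers that feed back into themselves are treated as the LFSR, and the remaining registers are treated as scan cells.

```python
def split_registers(netlist):
    """Split registers into (lfsr, scan, taps) by detecting feedback loops.

    `netlist` maps each register to the gate driving it and that gate's
    inputs (a hypothetical format used only for this sketch).
    """
    def reaches_self(start):
        # Depth-first walk over the fan-in; True if `start` feeds itself.
        stack, seen = list(netlist[start]["inputs"]), set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node in seen or node not in netlist:
                continue  # already visited, or a primary input such as SI
            seen.add(node)
            stack.extend(netlist[node]["inputs"])
        return False

    lfsr = [r for r in netlist if reaches_self(r)]
    scan = [r for r in netlist if r not in lfsr]
    # Feedback connectivity: inputs of the XOR gates inside the feedback loop.
    taps = {r: netlist[r]["inputs"] for r in lfsr if netlist[r]["gate"] == "XOR"}
    return lfsr, scan, taps


# Toy design: a 3-bit LFSR (l0..l2) and two scan cells (s0, s1).
toy = {
    "l0": {"gate": "XOR", "inputs": ["l1", "l2"]},
    "l1": {"gate": "BUF", "inputs": ["l0"]},
    "l2": {"gate": "BUF", "inputs": ["l1"]},
    "s0": {"gate": "XOR", "inputs": ["SI", "l2"]},
    "s1": {"gate": "BUF", "inputs": ["s0"]},
}

print(split_registers(toy))
```

Note that the scan cell `s0` depends on an LFSR bit but is not part of any loop, so it is still classified as a datapath register.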
3.2. Oracle Interaction in Test Mode
To apply the proposed deobfuscation framework, random input sequences are applied to the target oracle that already has an LFSR activation seed for its scan chain. These patterns/responses are provided/collected in test mode, where the random inputs with a certain number of sequences are fed into the scan input (SI) port, while an equivalent number of patterns are shifted out from the scan out (SO) port after certain clock cycles. We utilize the unlocked chip (oracle) in test mode, which ensures that the resulting patterns do not incorporate functional logic. This also improves the performance of our deobfuscation framework by focusing the attack only on the scan chain (DOSC).
3.3. Symbolic Execution Engine and Symbolic Equation System
In this section, we delve into the methodology employed by our symbolic execution engine, with a specific focus on the process of recovering the symbolically assigned n-bit seed, with bits denoted as from the symbolic equation system. This system is constructed through the symbolic modeling of both the LFSR and the scan chain.
3.3.1. Symbolic Execution Engine
The symbolic execution engine models the LFSR and the scan chains by converting the target netlist to functionally equivalent C code, where every bit of the unknown seed is represented as a symbolic variable. As our structural analysis identifies the LFSR and the scan chains, the symbolic execution engine proceeds to model the LFSR and the scan chains separately. An LFSR is most often a shift register whose input bit is driven by the XOR of some bits of the overall shift register value. Structural analysis provides detailed information about the connectivity of the feedback nets, which represent the locations within the LFSR register where the XOR gate receives its inputs. To symbolically represent LFSR outputs, we denote these two inputs as
.
Figure 3 shows an example of a 5-bit LFSR with
and
.
We describe the
i-th bit output of the LFSR in cycle
t as
. Thus, a general LFSR output at cycle
t is represented as:
This equation models the LFSR outputs and can be described as follows. The first bit of the LFSR is always connected to the output of the XOR gate. This means that at every clock cycle, the first bit of the LFSR is updated based on the output from the XOR gate, with the inputs of the XOR gate identified through structural analysis as being at positions within the LFSR. For the rest of the bits in the LFSR, from position 1 to (excluding the first bit which is directly updated from the XOR gate), the shifting process occurs when each bit shifts from its previous state to the next position with each clock cycle.
Table 1 shows the cycled output based on seed
for the example in
Figure 3. At cycle 0, the seed
is loaded into the LFSR cells and the output of any cycle
t can be computed based on the connectivity information
obtained from structural analysis. Thus, we define the function
f which represents the LFSR output given the seed and cycle
t as
, where
denotes the LFSR output in a vector form.
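The symbolic LFSR model described above can be sketched as follows. This is a minimal illustration, not the engine used in the paper: each LFSR bit is stored as the set of seed-bit indices whose XOR gives that bit, so XOR of two expressions is a symmetric set difference. The function names and the tap positions (2, 4) are assumptions made for the example and are not claimed to match Figure 3 or Table 1.

```python
def lfsr_symbolic(n, taps, cycles):
    """Return the symbolic LFSR state for cycles 0..cycles.

    Each bit is a frozenset of seed-bit indices; the bit's value is the
    XOR of those seed bits.
    """
    a, b = taps
    state = [frozenset([i]) for i in range(n)]  # cycle 0: bit i is seed bit i
    history = [state]
    for _ in range(cycles):
        feedback = state[a] ^ state[b]          # XOR of two linear expressions
        state = [feedback] + state[:-1]         # first bit updated, rest shift
        history.append(state)
    return history


def evaluate(expr, seed):
    """Evaluate one symbolic bit for a concrete seed given as a list of 0/1."""
    value = 0
    for i in expr:
        value ^= seed[i]
    return value


history = lfsr_symbolic(n=5, taps=(2, 4), cycles=3)
for t, state in enumerate(history):
    print(t, [sorted(bit) for bit in state])
```

Because every bit stays a linear expression in the seed bits, the output at any cycle can be written directly as a function of the seed and the cycle number.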
The symbolic engine models the shadow chain and the scan chains together. Both scan chains and shadow chains can be conceptualized as cascading datapath registers that receive inputs from the scan-in (SI) port of the scan chain and from the obfuscation key input, and whose outputs are directed to the scan-out (SO) port. The symbolic engine then models the transformed C code, treating it as a symbolic equation that represents the relationship between the inputs and the outputs.
Table 2 shows an example of conversion from netlist to C code to symbolic equation for a single scan chain cell. This scan chain cell is located at the end of the scan chain that connects directly to the SO port. By generating every symbolic equation for all scan cells, the symbolic equation that connects the SI port to the SO port is formulated and represented as
where
N denotes the length of scan chains and
T denotes the cycle when scan in patterns are shifted into the scan chain. The
denotes the continuous XOR operation of
and is derived by symbolically modeling the scan chains.
Figure 4 shows an example of scan chain modeling under this equation with
, where
. This equation captures the relationship between the SI and SO patterns, effectively encapsulating the dynamics of how the obfuscation key values are processed within the scan chain. This encapsulation is conducted using our symbolic execution engine. In the conversion process to C code, we have integrated a flag variable, labeled
m. This variable increases each time the code representing the scan flip-flop behavior is executed. The primary purpose of this flag variable is to keep track of the location of the scan patterns. Given the tracking of SI and SO patterns with correlated obfuscation keys, we define the function
g where
which is derived such that the scan-out pattern is symbolically represented with a corresponding scan-in pattern with a certain sequence of XOR operation on the obfuscation key.
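To make the relation between SI, SO, and the obfuscation key concrete, the sketch below uses a simplified toy chain. It assumes, purely for illustration, that a bit shifted in at cycle T passes scan cell i at cycle T + i and is XORed there with LFSR bit i. The actual DOSC gate placement and shadow chain are not reproduced, and the names `lfsr_step`, `key_mask`, and `scan_out` are hypothetical.

```python
def lfsr_step(state, taps):
    """One LFSR cycle; works on ints (0/1) and on frozensets (symbolic bits)."""
    a, b = taps
    return [state[a] ^ state[b]] + state[:-1]


def key_mask(n, taps, T):
    """Seed-bit indices whose XOR equals SI ^ SO for a bit shifted in at cycle T.

    Toy assumption: scan cell i XORs the passing bit with LFSR bit i
    at cycle T + i.
    """
    state = [frozenset([i]) for i in range(n)]
    for _ in range(T):
        state = lfsr_step(state, taps)
    mask = frozenset()
    for i in range(n):
        mask ^= state[i]                 # accumulate the running XOR of key bits
        state = lfsr_step(state, taps)
    return mask


def scan_out(seed, taps, T, si_bit):
    """Concrete counterpart of key_mask for a known seed (toy oracle)."""
    state = list(seed)
    for _ in range(T):
        state = lfsr_step(state, taps)
    out = si_bit
    for i in range(len(seed)):
        out ^= state[i]
        state = lfsr_step(state, taps)
    return out


print(sorted(key_mask(5, (2, 4), T=0)))
```

In this toy setting the scan-out bit is the scan-in bit XORed with a fixed combination of seed bits that depends only on the cycle, which mirrors the "continuous XOR" form described above.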
3.3.2. Symbolic Equation System and Solver
In the previous section, the obfuscation keys are symbolically represented in
Table 1, and the scan chains are modeled with symbolic equations in
Table 2 in our running example. The two symbolic equations then form simultaneous equations
f and
g:
As discussed in
Section 3.2, the SI and SO patterns are generated from test mode interactions within the oracle. Upon providing a sequence of
m-bit SI patterns, we collect the corresponding
m-bit SO patterns from the oracle. By putting
m pairs of these corresponding SI and SO patterns into the simultaneous equations, we construct a symbolic equation system that encompasses
m equations, each representing their relationship and transformation. As a result, in the symbolic equation system that we have developed, the assigned seed constitutes the only set of symbolic variables. By applying an SMT solver to this system of equations, we can effectively solve for possible solutions of these variables and recover candidate seeds.
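The idea of constraining the seed with observed SI/SO pairs can be sketched on the same kind of toy chain. The paper applies an SMT solver to the symbolic equation system; the snippet below swaps in exhaustive enumeration, which is only feasible for very small seeds, and uses a simplified chain model (scan cell i XORs the passing bit with LFSR bit i). All names, the tap positions, and the secret seed are assumptions for illustration.

```python
from itertools import product


def lfsr_step(state, taps):
    """One LFSR cycle on a list of 0/1 values."""
    a, b = taps
    return [state[a] ^ state[b]] + state[:-1]


def scan_response(seed, taps, si_bits):
    """Toy oracle: SO bit for each SI bit, one SI bit entering per cycle."""
    n = len(seed)
    states = [list(seed)]
    for _ in range(len(si_bits) + n):
        states.append(lfsr_step(states[-1], taps))
    out = []
    for T, bit in enumerate(si_bits):
        for i in range(n):
            bit ^= states[T + i][i]      # key bit seen by the bit in cell i
        out.append(bit)
    return out


def candidate_seeds(n, taps, si_bits, so_bits):
    """All seeds consistent with the observed SI/SO pairs (brute force)."""
    return [s for s in product((0, 1), repeat=n)
            if scan_response(s, taps, si_bits) == so_bits]


secret = [1, 0, 1, 1, 0]                 # hypothetical seed held by the oracle
taps = (2, 4)                            # hypothetical feedback taps
si = [1, 0, 0, 1, 1, 0, 1, 0]
so = scan_response(secret, taps, si)

print(len(candidate_seeds(5, taps, si[:2], so[:2])))  # few equations
print(len(candidate_seeds(5, taps, si, so)))          # more equations
```

Each additional SI/SO pair can only shrink the candidate set, but pairs that yield dependent equations leave it unchanged, which is why several candidates may survive.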
Note that as our SI and SO patterns are generated randomly, there is no assurance that all symbolic equations in the above system are independent. This issue might result in a large number of potential solutions provided by the SMT solver. Despite this, these initial results serve as a starting point for applying the clustering algorithm to explore similarities in the solution space and utilize sequential equivalence checking (SEC) to produce distinct SI and SO patterns. Future work can focus on more careful selection of SI patterns.
3.4. Obfuscation Key Clustering
Our system uses the SMT solver to generate solutions for the symbolic equation system, which in turn produces the space of obfuscation keys and the corresponding seed values. To narrow down the possible seed space, it is essential to introduce more symbolic equations. Each symbolic equation is built from SO and SI patterns through the functions f and g produced by the symbolic execution engine. As f and g are fixed by the circuit implementation, more SO and SI patterns must be obtained through further queries to the oracle in order to produce more symbolic equations. However, the effectiveness of generating SO and SI patterns from random simulation diminishes over time. This occurs because the remaining candidates for the obfuscation keys become increasingly similar, making it difficult for random SI vectors to differentiate between them based on SO patterns. As a result, random simulation becomes less capable of exposing differences among potential obfuscation keys. To address this problem, we apply the K-means clustering algorithm to categorize the existing obfuscation key candidates into distinct groups. Following this categorization, we construct a miter circuit for SEC, which generates distinguishing patterns tailored to each group of key candidates.
The overall workflow is illustrated in Figure 5. After symbolic execution and equation solving, we obtain a set of key candidates as a starting point. We apply the K-means clustering algorithm to categorize the space of key candidates, using the Hamming distance (HD) metric because it is the most suitable for binary data. As the obfuscation key changes dynamically over time, we select and group the first four cycles of the obfuscation key to produce the clustering data, as shown in the first step in Figure 5. From two distinct clusters, we select centroid keys and map them back to their corresponding symbolic seeds. The two seeds are then loaded into two copies of the DOSC architecture, which are compared by the miter circuit. The miter is constructed with an XOR gate that compares the two copies loaded with different seeds. This setup, shown in the second step of our process, forms the basis for further analysis. Next, the miter circuit is fed to the Jasper SEC engine [24], which generates an input sequence that can differentiate between the two DOSC copies. We then query the oracle with this tailored input sequence to form a new symbolic equation, which is added to our symbolic equation system and solvers.
This approach effectively addresses the limitations of random simulation. The generated input sequence is specifically designed to produce distinct outputs, which implies that the corresponding symbolic equation can eliminate one of the two key candidates. Furthermore, due to the similarity of keys within the same cluster, this equation is likely to rule out even more keys in that same cluster. Thus, this method efficiently solves the issue of diminishing returns in random simulation, ensuring a more effective elimination of potential key candidates. By iteratively selecting different keys from various clusters, we can formulate more equations, effectively reducing the key candidate space to a size suitable for brute force.
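The clustering step can be sketched with a small Hamming-distance clustering routine. The paper states that K-means with the Hamming distance is used; the snippet below swaps in a k-medoids variant so that each cluster center is an actual candidate key that could be mapped back to a seed. The key strings are hypothetical, and the miter and SEC step is not modeled here.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def k_medoids(keys, k, iters=10):
    """Cluster bit strings around k medoids under Hamming distance."""
    # Farthest-first initialisation keeps the starting medoids apart.
    medoids = [keys[0]]
    while len(medoids) < k:
        medoids.append(max(keys, key=lambda x: min(hamming(x, m) for m in medoids)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for key in keys:
            nearest = min(range(k), key=lambda j: hamming(key, medoids[j]))
            clusters[nearest].append(key)
        # New medoid: the member with the smallest total distance to its cluster.
        updated = [min(c, key=lambda m: sum(hamming(m, x) for x in c)) if c else medoids[j]
                   for j, c in enumerate(clusters)]
        if updated == medoids:
            break
        medoids = updated
    return medoids, clusters


# Hypothetical key candidates (e.g., key bits collected over a few cycles).
candidates = ["0000", "0001", "0010", "1111", "1110", "1101"]
medoids, clusters = k_medoids(candidates, k=2)
print(medoids)
print(clusters)
```

The two medoids play the role of the centroid keys: one from each cluster would be mapped back to its seed and loaded into the two DOSC copies compared by the miter.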
Re-clustering is needed because, as obfuscation keys are progressively ruled out, the margins between clusters become less distinct. Figure 6 shows the flowchart of the conditions under which re-clustering is invoked. It also outlines the decision points for abandoning the clustering approach and switching to ruling out keys by brute force. During the iterative process of selecting clusters for generating distinguishing patterns with Jasper SEC, we incorporate two checkpoints that assess the silhouette score. The silhouette score is a common metric for evaluating the quality of clustering results: it measures how similar objects are to others in their own cluster compared with objects in other clusters. It ranges from −1 to +1, where a high value indicates that the clustering parameters (number of clusters, among others) are appropriate. The first check occurs right after K-means clustering completes. At this point, we evaluate the silhouette score to gauge the effectiveness of the clustering, ensuring that the clusters formed are distinct and meaningful. If the silhouette score is below our threshold of 0.6, we conclude that the clustering was not successful and switch to ruling out keys by brute force. The second check occurs during the iterations, after a certain number of keys have been eliminated. At this point, we recalculate the silhouette score using the reduced key sets under the initial cluster formations. If the score remains above our threshold, the existing clustering still maintains distinct margins and the generation of distinguishing patterns is still efficient. Another iteration of selecting clusters is then performed to feed the miter and generate distinguishing patterns. If the score drops below our threshold, the clustering structure has deteriorated, and re-clustering is needed to re-establish distinct groups. These two silhouette checkpoints give us a dynamic and responsive way to generate effective and distinct patterns that rule out groups of keys in each iteration.
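The two silhouette checkpoints can be sketched as a small decision routine. The threshold of 0.6 is taken from the text; the function names, the example clusters, and the treatment of a score exactly at the threshold as passing are assumptions made for this illustration.

```python
THRESHOLD = 0.6  # silhouette threshold stated in the text


def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def silhouette(clusters):
    """Mean silhouette score of a clustering under Hamming distance."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)       # convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster.
            a = sum(hamming(p, q) for q in cluster) / (len(cluster) - 1)
            # b: smallest mean distance to the members of another cluster.
            b = min(sum(hamming(p, q) for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci and other)
            scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)


def next_action(clusters, first_check):
    """Decide the next step at a checkpoint (first or in-iteration check)."""
    if silhouette(clusters) >= THRESHOLD:
        return "generate distinguishing patterns"
    return "brute force" if first_check else "re-cluster"


tight = [["00000000", "00000001", "00000010"],
         ["11111111", "11111110", "11111101"]]
loose = [["0000", "0001", "0010"], ["1111", "1110", "1101"]]

print(round(silhouette(tight), 3), next_action(tight, first_check=True))
print(round(silhouette(loose), 3), next_action(loose, first_check=False))
```

Well-separated clusters keep the iteration going, while a low score leads either to brute force (first check) or to re-clustering (second check).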
4. Exploiting Alternative Scan Chain Obfuscation Techniques with DOSCrack
In addition to targeting dynamically obfuscated scan chains, DOSCrack also serves as an effective method for breaking existing static scan chain obfuscation frameworks such as SeqL [25] and XOR-obscured scan chains [10]. Both static obfuscation methods insert MUX or XOR gates within the scan chain to scramble its architecture, as illustrated in Figure 7. Static scan chain obfuscation relies on the security assumption that the keys and the locations of the scrambling gates remain undiscoverable by attackers. DOSCrack leverages symbolic execution as a reverse engineering technique to uncover the locations of the scrambling gates, which makes it effective for breaking static obfuscation. Once the scan chain architecture is modeled using the DOSCrack symbolic execution engine, the XOR scrambling mechanism is effectively breached. For MUX-based scrambled scan chains, the select bits of the MUXes are treated as the scrambling key inputs (sck0 and sck1). Breaking this type of scan chain involves determining the correct scrambling keys, which DOSCrack does by constructing symbolic equations and solving them to retrieve the correct key values. An example of exploiting the scan chain architecture of the static obfuscation techniques is shown in Table 3.
Recent research has further advanced the concept of the DOSC architecture by proposing enhanced designs that incorporate additional layers of security. Lee et al. [26] proposed a scan chain architecture that integrates dynamic keys with PUF keys, along with bypass mechanisms and fake response generators, creating a more robust and secure scan chain architecture. DOSCrack also has the potential to compromise this enhanced scan chain architecture. If the PUF key-based authentication is successfully bypassed using a developed method, the dynamic key generation mechanism can be attacked independently with DOSCrack.
6. Experimental Results and Evaluation
In this section, we first synthesize four DOSC benchmarks with different LFSR seed sizes (8-bit, 16-bit, 32-bit, and 64-bit) to evaluate DOSCrack. Our framework is tested exclusively on test mode simulation of a seed-loaded oracle netlist, which allows us to synthesize only the scan chains by leaving the functional input port of each scan flip-flop open. To further simplify the application of DOSCrack, we establish a single scan chain for each benchmark, and the number of scan cells equals the seed size in every benchmark. In our approach, the seed values are synthesized as static inputs, and we assume that these seeds are not reachable or accessible during normal operation, as would be the case with a physical oracle chip.
The overview of our synthesized DOSC is shown in Figure 8. The test mode simulation is performed by Synopsys VCS with 500 clock cycles of random SI patterns. The benchmarks are synthesized with Synopsys Design Compiler. We used an Intel(R) Xeon E5-2450L 32-core CPU with 128 GB memory operating at 1.80 GHz for the synthesis of DOSC benchmarks, test mode simulation, and running of DOSCrack. The runtime results are shown in Table 6 and illustrated in Figure 9. We discuss the DOSCrack results on 8-bit, 16-bit, 32-bit, and 64-bit seeds for the LFSR in Section 6.1, Section 6.2, Section 6.3, and Section 6.4, respectively, while the results for the DOSC containing the NLFSR are discussed in Section 6.6. Section 6.7 discusses the overhead of our countermeasure under different assumptions.
6.1. DOSC with 8-Bit Seed
Our testing of the DOSCrack framework starts with a DOSC benchmark configured with an 8-bit seed and a corresponding 8-bit LFSR. The scan chain and the shadow chain each comprise eight scan cells, matching the 8-bit seed size. Given the modest seed size, our symbolic execution engine operates efficiently and the SMT solver generates just three potential key candidates, so no clustering is needed to manage the obfuscation key candidate space. The small candidate count supports the scalability of our stand-alone symbolic execution modeling approach. The three remaining seeds were trivially brute-forced to identify the DOSC’s actual LFSR seed. The total runtime is 18.5 min, but this includes 8.3 min of simulation time for the SI/SO patterns. Since querying an oracle chip would be much faster than simulation, the real runtime for symbolic execution and SMT solving is only 10.2 min. The break time for a traditional SAT attack on the 8-bit DOSC benchmark exceeds 5 h, so DOSCrack achieves a roughly 30-fold reduction in runtime compared to previous approaches.
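The idea of keeping only the seeds that agree with the observed SI/SO behavior can be illustrated with a minimal Python sketch. This is not the DOSCrack implementation. The chain model (`scan_out`), the LFSR taps, the per-clock key update, and the secret seed `0xA7` are all assumptions made for illustration, and exhaustive enumeration stands in for the symbolic execution and SMT steps, which is feasible here only because the seed is 8 bits.

```python
import random

def lfsr_step(state, n, taps):
    """Advance an n-bit Fibonacci LFSR by one clock (taps are bit indices)."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & ((1 << n) - 1)

def scan_out(seed, si_bits, n=8, taps=(7, 5, 4, 3)):
    """Toy obfuscated chain: after each shift, cell i is XORed with key bit i.

    The key is the current LFSR state and changes every clock; both choices
    are assumptions of this sketch, not the DOSC specification.
    """
    state = seed
    chain = [0] * n
    so_bits = []
    for bit in si_bits:
        so_bits.append(chain[-1])              # observe SO before the shift
        shifted = [bit] + chain[:-1]           # plain scan shift
        chain = [c ^ ((state >> i) & 1) for i, c in enumerate(shifted)]
        state = lfsr_step(state, n, taps)
    return so_bits

def consistent_seeds(si_bits, so_bits, n=8):
    """Return every seed whose simulated SO stream matches the oracle's."""
    return [s for s in range(1 << n) if scan_out(s, si_bits, n) == so_bits]

rng = random.Random(0)
secret_seed = 0xA7                              # hypothetical oracle seed
si = [rng.randint(0, 1) for _ in range(64)]     # random SI pattern
so = scan_out(secret_seed, si)                  # oracle response
candidates = consistent_seeds(si, so)
print(len(candidates), secret_seed in candidates)
```

The true seed always survives this filter. How many other seeds survive depends on the model and on the patterns applied, which is why a small residual candidate set may still need a final brute-force check.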
6.2. DOSC with 16-Bit Seed
We further tested a DOSC benchmark configured with a 16-bit seed, a 16-bit LFSR, and sixteen scan cells. With the same number of simulated SI/SO patterns, our SMT solver generated 356 potential key (seed) candidates. This quantity is suitable for clustering analysis. The initial clustering identifies 14 clusters with a silhouette score of 0.68, indicating a successful clustering outcome. By carefully choosing combinations of clusters and inputting their centroid keys into the Jasper SEC system, we generated 91 distinguishing patterns. The equations derived from these patterns significantly enhanced the efficacy of our analysis: each newly generated equation ruled out approximately four potential key candidates. The possible obfuscation key candidates are reduced to 23 without re-clustering. The corresponding seed candidates are then ruled out by brute force, and the total runtime is 37.9 min.
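The way a distinguishing pattern rules out candidates can be sketched in Python as follows. The paper generates such patterns with Jasper SEC from cluster centroid keys. The sketch below replaces that step with a toy parity model (`model`) in which a distinguishing pattern can be written down directly, so the model, the candidate list, and the secret key are all hypothetical.

```python
def parity(v):
    """Return the XOR of all bits of v."""
    return bin(v).count("1") & 1

def model(key, pattern):
    """Toy keyed circuit (an assumption): output is parity(key AND pattern)."""
    return parity(key & pattern)

def distinguishing_pattern(k1, k2):
    """Return a pattern on which k1 and k2 respond differently (k1 != k2)."""
    diff = k1 ^ k2
    return diff & -diff            # lowest bit position where the keys differ

def prune(candidates, oracle):
    """Query the oracle with distinguishing patterns until one key is left."""
    cands = list(candidates)
    queries = 0
    while len(cands) > 1:
        x = distinguishing_pattern(cands[0], cands[1])
        expected = oracle(x)       # one oracle query per pattern
        cands = [k for k in cands if model(k, x) == expected]
        queries += 1
    return cands, queries

secret = 0xA7                                     # hypothetical correct key
oracle = lambda x: model(secret, x)
remaining, used = prune([0x3C, 0xA7, 0x5A, 0xF0, 0x0F], oracle)
print(remaining, used)
```

Each query removes at least one of the two keys used to build the pattern, and often more. This is the same effect as the per-equation elimination reported above.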
6.3. DOSC with 32-Bit Seed
Under the same configuration of DOSC, we perform the deobfuscation on a 32-bit seed benchmark. Unlike the previous test, we increase the number of initial SI/SO patterns generated from simulation as we aim to explore the diminishing effect issues explained in
Section 3.4. We generated 400 SI/SO patterns to build symbolic equations and then generated another 100 patterns to compare the number of solutions from the SMT solver. The SMT solver produced 50,354 solutions with 400 patterns and 50,288 solutions with 500 patterns. The additional 100 patterns, a 25% increase in simulation patterns, therefore reduced the number of possible key candidates by only 66. This result validates the diminishing effect and the necessity of using clustering and SEC miter circuits to produce distinguishing patterns. With clustering, the possible obfuscation key candidates were reduced from 50,288 to 327, with a successful clustering into 107 groups.
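As a quick check on the figures above, the relative reduction in candidates obtained from the extra 100 patterns is:

$$\frac{50{,}354 - 50{,}288}{50{,}354} = \frac{66}{50{,}354} \approx 0.13\%$$

A 25% increase in patterns therefore removes roughly one candidate in 760.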
Figure 10 shows the HD distribution between clusters in three re-clustering iterations. As keys are selectively ruled out in the clustering process, there is a noticeable linear decrease in the HD between clusters. This trend indicates that, over time, the remaining keys within each cluster become more similar to each other.
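For reference, the Hamming distance between keys and a silhouette score over key clusters can be computed as in the following minimal Python sketch. It applies the standard silhouette definition with Hamming distance as the metric. The clustering algorithm used in the paper is not reproduced here, and the example clusters are made up for illustration.

```python
def hamming(a, b):
    """Number of bit positions in which keys a and b differ."""
    return bin(a ^ b).count("1")

def silhouette(clusters):
    """Mean silhouette score over all keys, using Hamming distance.

    clusters is a list of at least two lists of integer keys.
    Keys in singleton clusters score 0, following the usual convention.
    """
    scores = []
    for ci, cluster in enumerate(clusters):
        for key in cluster:
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other keys in the same cluster
            a = sum(hamming(key, o) for o in cluster) / (len(cluster) - 1)
            # b: smallest mean distance to the keys of any other cluster
            b = min(
                sum(hamming(key, o) for o in other) / len(other)
                for cj, other in enumerate(clusters) if cj != ci
            )
            denom = max(a, b)
            scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical 4-bit key clusters, chosen to be well separated
example = [[0b0000, 0b0001], [0b1111, 0b1110]]
print(hamming(0b1010, 0b0110), round(silhouette(example), 3))
```

Scores near 1 indicate tight, well-separated clusters. Scores near 0 mean the clusters overlap.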
6.4. DOSC with 64-Bit Seed
The largest benchmark we tested uses a 64-bit seed with DOSC. We keep the number of SI/SO patterns obtained from the simulation at 500, as we presume that the diminishing effect will occur and that further increasing the number of patterns would not contribute to our symbolic modeling. The initial outcome from the SMT solver reveals approximately 150,000 obfuscation key candidates, which equates to around $2^{17}$ solutions. Clustering results in 2440 groups, and we re-cluster whenever the silhouette score falls below 0.6.
Figure 9b,d show the runtime results as a bar chart and the clustering distribution as a histogram, respectively. The 64-bit DOSC takes five iterations of clustering, and the key distribution becomes narrower with each iteration. The final phase of our analysis involved pruning the remaining 6508 key candidates using a brute-force approach. The average time taken to brute-force test 1000 random keys, which we refer to as the ’pruning time’, is approximately 2.5 h. The entire attack takes almost 4 days to find the seed.
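As a rough estimate, assuming the measured pruning time scales linearly with the number of keys tested, the final brute-force phase over the remaining candidates costs about:

$$6508 \times \frac{2.5\ \text{h}}{1000} \approx 16.3\ \text{h}$$

Under the same assumption, exhaustively testing the full 64-bit seed space would require:

$$2^{64} \times \frac{2.5\ \text{h}}{1000} \approx 4.6 \times 10^{16}\ \text{h}$$

This is why the candidate space must be reduced before brute force becomes practical.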
6.5. Comparisons to Other Attacks
DOSC architectures have proven to be robust against traditional SAT attacks, which time out after 20 days. State-of-the-art attacks on logic-locked scan chains with dynamic keys rely on customized SAT-based techniques [
37,
38] designed to target scan chains. In these works, the attack framework was tested on various benchmarks and recovered the seed within an hour. However, these attacks assume that the functional logic is not logic-locked, thereby allowing access to the primary I/O pairs needed to generate DIPs [
39]. This contrasts with the original threat model of the DOSC [
16], where the functional logic is logic-locked and therefore cannot be exploited to attack the scan chain. Our DOSCrack framework instead adopts a stricter and more realistic threat model, focusing exclusively on performing attacks in scan mode. By performing an oracle-based attack that operates solely within the scan chain, DOSCrack eliminates the dependency on functional logic access.
6.6. Trivium-Based DOSC Architecture
To assess the effectiveness of our proposed NLFSR-based countermeasures, we apply the DOSCrack framework to Trivium-based DOSC architectures of different bit sizes. We build Trivium-based DOSC benchmarks with different bit-lengths based on Algorithm 2, which shows an example of our 8-bit Trivium and the original 288-bit Trivium architecture. In our evaluation, the total time taken to crack the 8-bit Trivium-based DOSC using the DOSCrack framework was 52.8 min. Compared to the runtime observed with the LFSR-based system, the 8-bit Trivium-based DOSC is approximately five times more expensive to attack in terms of runtime. This increase is consistent with the complexity analysis in
Section 5.
In the case of the 288-bit Trivium architecture, our approach maintains the full 288-bit state while loading only an 8-bit seed into the system. The remaining bits of the state, which are not occupied by the 8-bit seed, are set to 0. Solving the symbolic equation system using CryptoMiniSAT times out, even when the time limit is extended to 6 days, similar to the other benchmarks shown in
Figure 11. The area and power consumption of the DOSC benchmarks are compared in
Table 7. In the case of symbolic modeling, the NLFSR introduces a higher degree of nonlinearity and complexity in the relationships between the key bits and the output.
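The difference between linear and nonlinear feedback can be made concrete with a small Python sketch. It uses a toy 8-bit register with made-up taps, not Trivium's actual update function. With purely XOR feedback, the output stream obeys superposition over GF(2), which keeps the symbolic equations linear in the seed bits. Adding a single AND term breaks this property.

```python
def run(seed, nonlinear, n=8, steps=16):
    """Return the output bits of a toy n-bit feedback shift register.

    The tap positions (7, 2) and the AND term on bits (4, 5) are
    illustrative assumptions, not taken from Trivium.
    """
    state = [(seed >> i) & 1 for i in range(n)]
    out = []
    for _ in range(steps):
        out.append(state[-1])                 # output the last cell
        fb = state[7] ^ state[2]              # linear (XOR) feedback
        if nonlinear:
            fb ^= state[4] & state[5]         # one AND term makes it an NLFSR
        state = [fb] + state[:-1]
    return out

def superposition_holds(a, b, nonlinear):
    """Check whether run(a) XOR run(b) equals run(a XOR b)."""
    xored = [x ^ y for x, y in zip(run(a, nonlinear), run(b, nonlinear))]
    return xored == run(a ^ b, nonlinear)

# Linear feedback: superposition holds for every seed pair checked
linear_ok = all(
    superposition_holds(a, b, nonlinear=False)
    for a in range(0, 256, 5) for b in range(0, 256, 3)
)
# Nonlinear feedback: seeds setting bit 4 and bit 5 separately expose the AND
nonlinear_ok = superposition_holds(1 << 4, 1 << 5, nonlinear=True)
print(linear_ok, nonlinear_ok)
```

In the linear case, each output bit is an XOR of seed bits. In the nonlinear case, output bits contain products of seed bits, so the degree of the symbolic expressions can grow as the register is clocked.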
The mathematical and runtime complexity comparisons between the LFSR- and Trivium-based DOSC of the other size benchmarks are shown in
Figure 11. The number of clauses generated by both the LFSR and Trivium methods increases exponentially with bit-length. However, our symbolic execution approach reduces the mathematical complexity by approximately 75%. The runtime results also demonstrate the effectiveness of the countermeasures when using Trivium for the DOSC. Both SAT solvers time out after 24 h when attempting traditional SAT-based attacks on 64-bit benchmarks. In contrast, DOSCrack, enhanced by our symbolic execution, achieves a greatly reduced runtime, completing in 49.6 min with the CryptoMiniSAT solver and 54.7 min with the Grasp solver. It is important to note that this runtime reflects only the time required to solve the symbolic equation system once and derive the initial candidate key space. To fully execute DOSCrack, multiple iterations are needed to generate additional symbolic equations, which will further increase the runtime.
6.7. Combining DOSC and BIST
Figure 12 illustrates an architecture that integrates Built-In Self-Test (BIST) with DOSC functionality. In industrial DfT practice, BIST and scan chain insertion are typically implemented concurrently. Since BIST requires a test pattern generator to apply input stimuli and a response analyzer to compare the outputs, we can optimize the design by sharing the NLFSR between the test pattern generator and the key generator for DOSC obfuscation (the seeds used by the BIST and DOSC should be different for security purposes).
Table 8 provides the area overhead associated with our integrated design. The results show that incorporating the DOSC architecture into the BIST results in an area overhead of only 1.1%, demonstrating the efficiency of our approach.
7. Summary and Future Work
In this paper, we present DOSCrack [
40], an oracle-guided attack that utilizes symbolic execution and binary clustering to break the DOSC. By applying DOSCrack to benchmarks of different sizes, our framework successfully reduced the number of key candidates and broke the 64-bit seed DOSC in 3 days and 23 h, which is much more efficient than traditional SAT attacks, which reach the time-out threshold of 10 days. Additionally, in our testing, the brute-force effort to rule out 1000 keys took approximately 2.5 h. This duration highlights the significantly greater cost of the brute-force method compared to our approach. To evaluate our framework from the designers’ perspective, we proposed a countermeasure to symbolic execution modeling by introducing the NLFSR-based Trivium cipher as a random bit generator. This approach significantly improves the security of the DOSC. As a result of this integration, our Trivium-based DOSC demonstrates higher complexity compared to a DOSC with a traditional LFSR.
Future work in this area could concentrate on developing more advanced algorithms for intelligently selecting different clusters to generate distinguishing patterns. Currently, our approach relies on iterating through various combinations of keys from different clusters to create these patterns. By enhancing the method of choosing clusters, we could significantly improve the distinctiveness of the patterns generated, thereby more effectively ruling out obfuscation keys. One promising direction for this enhancement is the development of a reinforcement learning algorithm. Such an algorithm would learn and adapt over time, based on the success of previous pattern generations, to make more informed decisions about which clusters to select. Another area for improvement is the generation of independent SI/SO patterns, rather than giving random ones to the oracle, during the symbolic equation system step of the framework.