Secure ECDSA SRAM-PUF Based on Universal Single/Double Scalar Multiplication Architecture
Abstract
1. Introduction
1.1. Background
1.2. Related Works
1.3. Motivation
1.4. Contributions and Structure
- An RM code soft-decision attack algorithm for SRAM-PUFs is proposed. The attacker simply needs to modify the parameter in the ROM to clone the SRAM-PUF.
- We propose a secure ECDSA SRAM-PUF based on custom signature and verification schemes. The custom schemes omit the computationally expensive modular inversion required by standard ECDSA and make the proposed RM code soft-decision attack more difficult to carry out.
- We propose a universal computing hardware architecture for ECSM and elliptic curve double scalar multiplication (ECDSM) based on the differential addition chains (DAC) to enhance the overall performance of the design.
- A secure ECDSA SRAM-PUF architecture is proposed in this paper. The hardware architecture for RM code soft-decision emphasizes lightweight design, while the ECDSA architecture is performance-oriented. Our design is implemented on both an ASIC and FPGA to compare with the existing literature in terms of bit error rate (BER), reliability, uniqueness, and area–time product (ATP).
2. Security Problems Existing in Soft-Decision RM Codes
Algorithm 1 The Proposed Attack Algorithm for SRAM-PUFs Based on Soft-Decision RM Code
Require: w and y. Ensure: Replicated y.
Return: y.
3. The Proposed Secure SRAM-PUF Scheme Based on Custom ECDSA
3.1. Parameter Selection for RM Code
3.2. The Proposed Secure SRAM-PUF Scheme
Algorithm 2 The Proposed Secure SRAM-PUF Scheme
Require: SRAM, a private key d, a public key H, and the order of the elliptic curve n. Ensure: A stable SRAM-PUF response y.
Return: y.
3.3. The Proposed Universal Algorithm for ECSM and ECDSM
- When , the choice is the same as the previous iteration;
- When , the choice is the opposite of the previous iteration;
- When , the choice is ;
- When , the choice is .
Algorithm 3 The Proposed Two-Dimensional DAC Generation Algorithm
Require: Ensure:
Return: .
Algorithm 4 The Proposed Flag Signal Generation Algorithm
Require: Ensure: .
Return: .
Algorithm 5 The Proposed ECDSM Algorithm
Require: . Ensure: .
Return: .
4. Hardware Architecture
4.1. The Overall Architecture of ECDSA SRAM-PUF
- SRAM: Provides the initial response of the PUF. The data in it become invalid after the response is read, and it is then used as the data RAM to cache the intermediate variable of the RM code.
- response_temp: Caches the initial value of SRAM. To improve the entropy of the PUF, four segments of 256-bit SRAM initial values are selected and repeated four times to obtain a 1024-bit response. The RM code structure therefore needs to be multiplexed and calculated four times. After each calculation completes, its intermediate results overwrite the initial values in the SRAM. response_temp is added to cache the other three initial values of the SRAM so that they are preserved.
- ROM: Stores the w and P of the auxiliary data algorithm as well as the signature values , , required for signature verification. Because the RM code structure is multiplexed four times, the ROM outputs the w and P values that belong to the current pass each time the RM code is calculated.
4.2. The Architecture of Soft-Decision RM Code
- XOR gates: Two XOR gates at the input of the ALU are utilized to perform XOR operations during the recovery phase. Specifically, they are involved in XORing operations for obtaining and .
- Comparator with XOR gate: The comparator, coupled with an XOR gate at the ALU’s output, converts the likelihood estimation obtained after calculations into a code word c. The XOR gate in this context contributes to the computation of .
- LUT: The LUT is employed to calculate the likelihood estimation using the formula . Hardware implementation of this logarithmic calculation can be complex. However, due to the limited set of possible error probability values P derived from 100 instances of power on and off during the registration phase, a pre-calculated LUT is used. The LUT helps obtain the corresponding for each P, significantly saving computational time. Additionally, the LUT’s corresponding relationship can be randomized to enhance the difficulty of attacker interference.
- ALU: Performs calculations related to the F function, G function, and accumulation, as specified in Algorithm A3. Facilitates operations such as passing data directly to the next module.
- Address generator: Generates the current corresponding SRAM read and write addresses based on the algorithm’s requirements.
- Stack: Functions as a cache unit that stores parameters and local variables for each round of recursion. Enables the implementation of a software-driven approach to realize hardware recursion. This approach reduces the complexity of the state machine by offloading certain control aspects to the stack.
- State machine: Serves as the core control logic for the entire recursive module. Utilizes a software-driven approach, allowing certain steps of the recursion to be expanded into a large state machine. Manages the control of the current recursive round, while the recursion of other rounds is controlled by parameters stored in and retrieved from the stack.
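The LUT idea described in this list can be sketched in a few lines of Python. The likelihood formula itself is not legible in this text, so the sketch assumes the common log-likelihood ratio ln((1 − P)/P); the clipping value `LLR_CLIP` and the function names are hypothetical and serve only as illustration. With 100 power cycles, P can take only 101 distinct values, which is why a small table is sufficient.

```python
import math

POWER_CYCLES = 100   # number of registration power cycles, from the text
LLR_CLIP = 8.0       # assumed saturation value for P = 0 or P = 1

def build_llr_lut(cycles=POWER_CYCLES, clip=LLR_CLIP):
    """Precompute one log-likelihood ratio per possible error count k (P = k / cycles)."""
    lut = []
    for k in range(cycles + 1):
        p = k / cycles
        if p == 0.0:
            llr = clip                     # perfectly stable bit: saturate
        elif p == 1.0:
            llr = -clip                    # always-wrong bit: saturate with opposite sign
        else:
            llr = math.log((1.0 - p) / p)  # assumed formula, see lead-in
            llr = max(-clip, min(clip, llr))
        lut.append(llr)
    return lut

LLR_LUT = build_llr_lut()

def soft_value(bit, error_count):
    """Signed soft input for the decoder: positive favours 0, negative favours 1."""
    magnitude = LLR_LUT[error_count]
    return magnitude if bit == 0 else -magnitude
```

At run time the decoder only indexes `LLR_LUT`, so no logarithm has to be evaluated in hardware. Randomizing the mapping, as the text suggests, would correspond to permuting the table index.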
4.3. The Architecture of ECDSA
1. The value m is processed by the SHA256 module to derive the digest of the message. A fixed random interception is employed to capture 163 bits of the digest. The interception positions must be hardwired in the circuit so that attackers cannot alter the hash value by reconfiguring them. The hash interception may result in entropy loss; hence, entropy augmentation is performed in subsequent steps to restore the lost entropy.
2. Compute in the signature verification algorithm. In the context of , the values and directly feed into the modular multiplier for the computation of . In contrast, during standard ECDSA signature verification, the value s requires modular inversion using Fermat’s little theorem. Hence, a shift register is incorporated in the structure to sequentially output each bit of , controlling the input to the modular multiplier and buffering the intermediate result in the register.
3. Calculate the ECSM of by the base point G. The result of the ECSM is cached in the register. Following the pipeline idea, the modular multiplier calculates simultaneously, and the calculation of is similar to the calculation of . Since is already obtained when calculating , performing a modular inversion is unnecessary.
4. The output of the modular multiplier transitions from to . Subsequently, the strobe signal of the input selector is altered to input the public key H, initiating the ECSM of the public key H by .
5. Compute the sum of points and , generating distinct outputs based on the ECDSA module. For standard ECDSA signature verification, compute ; for , compute X.
6. For standard ECDSA signature verification, compare with r. The module outputs 1 for identical results and 0 for different ones. For , the module obtains four X values after four iterations. These 163-bit X values need to be extended to four 256-bit X values to match the number of RM code soft-decisions and restore the lost entropy. The extension method must be the same as the one used to extend stored in the ROM; otherwise, X and do not expand to the same value and the signature verification cannot complete because Equation (1) is not satisfied.
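The verification steps above note that standard ECDSA inverts s with Fermat’s little theorem, driven bit by bit from a shift register. The following Python sketch shows that bit-serial pattern under the assumption that the exponent is n − 2 for a prime group order n (the exponent symbol is not legible in this text). The function name `fermat_inverse` is hypothetical, and the loop is a software stand-in for the shift register and modular multiplier, not the hardware itself.

```python
def fermat_inverse(s, n):
    """Return s^(n-2) mod n, which equals the inverse of s when n is prime."""
    if s % n == 0:
        raise ValueError("s has no inverse modulo n")
    exponent = n - 2
    result = 1
    # Scan the exponent from its most significant bit, as a shift register would.
    for i in range(exponent.bit_length() - 1, -1, -1):
        result = (result * result) % n      # one squaring per exponent bit
        if (exponent >> i) & 1:
            result = (result * s) % n       # extra multiplication when the bit is 1
    return result
```

For a 163-bit order this costs on the order of 163 squarings plus up to as many multiplications, which illustrates why a scheme that avoids the inversion saves time.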
5. Implementation Results
5.1. ASIC Results
5.2. FPGA Results
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Soft-Decision SRAM-PUF Based on Reed–Muller Codes
- Registration stage: Power-cycle the SRAM 100 times and record the error probabilities P, which give the likelihood that each bit differs from the corresponding bit of the standard value y. These probabilities later serve as reliability information during response generation. Subsequently, randomly select an RM code word c and calculate the auxiliary data w by XORing c with y. The auxiliary data w and the error probabilities P are both stored in the ROM. The registration stage must be completed before the chips leave the factory: the statistical analysis of P and the storage of w and P are performed during the manufacturing or testing phases of the chip.
- Recovery stage: After the transaction, the customer obtains a noisy response, that is, y with some bit errors, when the SRAM is powered on. XORing this noisy response with w gives a noisy code word. The noisy code word and the error probabilities P are then fed to the RM code soft-decision decoder to execute error correction. P enhances the error correction capability by providing additional reliability information to the RM code. Decoding produces the corrected code word c, and XORing c with w yields the SRAM-PUF response y. This stage is completed on the chip by customers.
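The two stages can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: a length-5 repetition code stands in for the RM code, a linear reliability weight 1 − 2P replaces the likelihood estimation, and the names `enroll`, `recover`, `REP`, and the toy data are hypothetical. The data are built deterministically so that the outcome can be checked by hand.

```python
REP = 5  # assumed repetition length; a repetition code stands in for the RM code

def enroll(power_ups, code_bits):
    """Registration: derive y and P from repeated power-ups, then w = y XOR c."""
    runs = len(power_ups)
    ones = [sum(col) for col in zip(*power_ups)]           # times each cell read as 1
    y = [1 if 2 * k >= runs else 0 for k in ones]          # majority value per cell
    p = [min(k, runs - k) / runs for k in ones]            # chance a read differs from y
    c = [bit for bit in code_bits for _ in range(REP)]     # chosen code word
    w = [yi ^ ci for yi, ci in zip(y, c)]                  # auxiliary (helper) data
    return y, w, p

def recover(noisy_y, w, p):
    """Recovery: soft-decode (noisy_y XOR w), then XOR the code word with w."""
    noisy_c = [yi ^ wi for yi, wi in zip(noisy_y, w)]
    c = []
    for start in range(0, len(w), REP):
        score = 0.0
        for i in range(start, start + REP):
            weight = 1.0 - 2.0 * p[i]                      # 1 = stable cell, 0 = coin flip
            score += weight if noisy_c[i] == 0 else -weight
        c.extend([0 if score >= 0 else 1] * REP)
    return [ci ^ wi for ci, wi in zip(c, w)]

# Deterministic toy data: cell i shows the wrong value in its first flips[i] reads.
base = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
flips = [0, 40, 45, 40, 0, 0, 0, 45, 40, 42]
power_ups = [[b ^ (1 if r < f else 0) for b, f in zip(base, flips)] for r in range(100)]

y, w, p = enroll(power_ups, code_bits=[1, 0])
noisy = [b ^ (1 if f > 0 else 0) for b, f in zip(base, flips)]   # every unstable cell flips
recovered = recover(noisy, w, p)
```

In this toy run three of the five cells in each block are wrong, so a hard majority vote would decode the wrong code bit, while the weighted vote still returns the enrolled y because the flipped cells carry little weight. This is the sense in which P supplies additional information to the decoder.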
Algorithm A1 Soft-Decision SRAM-PUF Algorithm Based on Reed–Muller Codes
Require: SRAMs. Ensure: A stable response y.
Return: y.
Algorithm A2 Soft-Decision Repeat Code
Require: L and n. Ensure: .
Return: .
Algorithm A3 Soft-Decision RM Code
Require: L, r, and m. Ensure: .
Return: X and Z.
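As a rough illustration of recursive soft-decision RM decoding, the Python sketch below uses the Plotkin (u | u + v) decomposition with min-sum F and G functions. The steps of Algorithm A3 are not reproduced in this text, so the definitions of `f_func` and `g_func`, the sign convention (positive favours 0), and the function names are assumptions taken from the standard recursive decoder rather than from the authors' algorithm.

```python
def f_func(a, b):
    """Min-sum soft value for the XOR of two bits (used for the v branch)."""
    sign = 1.0 if (a >= 0) == (b >= 0) else -1.0
    return sign * min(abs(a), abs(b))

def g_func(a, b, v):
    """Soft value for u once v is known: the second half is flipped when v = 1."""
    return a + (b if v == 0 else -b)

def decode_rm(llr, r, m):
    """Decode 2**m soft values to the nearest code word of RM(r, m)."""
    n = len(llr)
    assert n == 1 << m
    if r == 0:
        # Repetition code: accumulate all soft values and decide once.
        bit = 0 if sum(llr) >= 0 else 1
        return [bit] * n
    if r == m:
        # Whole space: every bit is decided on its own.
        return [0 if x >= 0 else 1 for x in llr]
    half = n // 2
    first, second = llr[:half], llr[half:]
    # v lies in RM(r-1, m-1) and is estimated from the XOR of both halves.
    v = decode_rm([f_func(a, b) for a, b in zip(first, second)], r - 1, m - 1)
    # u lies in RM(r, m-1) and combines both halves once v is fixed.
    u = decode_rm([g_func(a, b, vi) for a, b, vi in zip(first, second, v)], r, m - 1)
    return u + [ui ^ vi for ui, vi in zip(u, v)]
```

For example, `decode_rm([4, -4, -1, -4, 4, -4, 4, -4], 1, 3)` returns `[0, 1, 0, 1, 0, 1, 0, 1]`, correcting the weakly received third bit. The Python call stack plays the role that the explicit stack and state machine play in the hardware described in Section 4.2.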
Appendix B. Elliptic Curve Cryptography Arithmetic
Algorithm A4 The Montgomery Ladder ECSM Algorithm
Require: with , . Ensure: .
Return: Q.
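The ladder structure can be sketched in Python as follows. This is a toy under explicit assumptions: it uses affine arithmetic on the small prime-field curve y² = x³ + 2x + 3 over GF(97) with the point (3, 6), whereas the design in this paper targets 163-bit arithmetic with its own point addition formulas. The names `add`, `montgomery_ladder`, `P_MOD`, and `A` are hypothetical, and `pow(x, -1, p)` requires Python 3.8 or later.

```python
P_MOD, A = 97, 2      # toy curve y^2 = x^3 + 2x + 3 over GF(97), assumed for illustration
INF = None            # point at infinity

def add(p, q):
    """Affine point addition (and doubling) on the toy curve."""
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF                                    # p + (-p), or doubling with y = 0
    if p == q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def montgomery_ladder(k, p):
    """Compute k*p with one addition and one doubling per scalar bit."""
    r0, r1 = INF, p                                   # invariant: r1 - r0 == p
    for i in range(k.bit_length() - 1, -1, -1):
        if (k >> i) & 1:
            r0, r1 = add(r0, r1), add(r1, r1)
        else:
            r0, r1 = add(r0, r0), add(r0, r1)
    return r0
```

Every iteration performs the same two operations regardless of the key bit, and the two running points always differ by the input point. That fixed difference is what makes the ladder the simplest example of a differential addition chain.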
Algorithm A5 The Single PA Algorithm
Require: and . Ensure: and .
Return: .
Appendix C. Differential Addition Chain
Appendix D. Custom ECDSA Signature and Verification
Algorithm A6 Custom ECDSA Signature
Require: d, m and n. Ensure: and .
Return: and .
Algorithm A7 Custom ECDSA Verification
Require: , , , m, H. Ensure: Verification results, X, Z.
Return: X and Z.
References
1. Arora, H.; Soni, G.K.; Kushwaha, R.K.; Prasoon, P. Digital image security based on the hybrid model of image hiding and encryption. In Proceedings of the 2021 6th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 8–10 July 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1153–1157.
2. Matted, S.; Shankar, G.; Jain, B.B. Enhanced image security using stenography and cryptography. In Computer Networks and Inventive Communication Technologies; Springer: Berlin/Heidelberg, Germany, 2021; pp. 1171–1182.
3. Halak, B.; Zwolinski, M.; Mispan, M.S. Overview of PUF-based hardware security solutions for the Internet of Things. In Proceedings of the 2016 IEEE 59th International Midwest Symposium on Circuits and Systems (MWSCAS), Abu Dhabi, United Arab Emirates, 16–19 October 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–4.
4. Mall, P.; Amin, R.; Das, A.K.; Leung, M.T.; Choo, K.K.R. PUF-based authentication and key agreement protocols for IoT, WSNs and smart grids: A comprehensive survey. IEEE Internet Things J. 2022, 9, 8205–8228.
5. Holcomb, D.E.; Burleson, W.P.; Fu, K. Power-up SRAM state as an identifying fingerprint and source of true random numbers. IEEE Trans. Comput. 2008, 58, 1198–1210.
6. van Dijk, M.; Rührmair, U. Protocol attacks on advanced PUF protocols and countermeasures. In Proceedings of the 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, 24–28 March 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 1–6.
7. Rührmair, U.; van Dijk, M. PUFs in security protocols: Attack models and security evaluations. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, Berkeley, CA, USA, 19–22 May 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 286–300.
8. Rührmair, U.; Jaeger, C.; Algasinger, M. An attack on PUF-based session key exchange and a hardware-based countermeasure: Erasable PUFs. In Proceedings of the International Conference on Financial Cryptography and Data Security, Gros Islet, Saint Lucia, 28 February–4 March 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 190–204.
9. Karakoyunlu, D.; Sunar, B. Differential template attacks on PUF enabled cryptographic devices. In Proceedings of the 2010 IEEE International Workshop on Information Forensics and Security, Seattle, WA, USA, 12–15 December 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1–6.
10. Merli, D.; Schuster, D.; Stumpf, F.; Sigl, G. Side-channel analysis of PUFs and fuzzy extractors. In Proceedings of the International Conference on Trust and Trustworthy Computing, Pittsburgh, PA, USA, 22–24 June 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 33–47.
11. Patterson, D.A.; Hennessy, J.L. Computer Organization and Design ARM Edition: The Hardware Software Interface; Morgan Kaufmann: Burlington, MA, USA, 2016.
12. Lohrke, H.; Tajik, S.; Krachenfels, T.; Boit, C.; Seifert, J.P. Key extraction using thermal laser stimulation: A case study on Xilinx UltraScale FPGAs. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2018, 2018, 573–595.
13. Hankerson, D.; Menezes, A.J.; Vanstone, S. Guide to Elliptic Curve Cryptography; Springer Science & Business Media: Berlin, Germany, 2006.
14. Rashid, M.; Imran, M.; Jafri, A.R.; Al-Somani, T.F. Flexible architectures for cryptographic algorithms—A systematic literature review. J. Circuits Syst. Comput. 2019, 28, 1930003.
15. Li, J.; Li, Z.; Cao, S.; Zhang, J.; Wang, W. Speed-Oriented Architecture for Binary Field Point Multiplication on Elliptic Curves. IEEE Access 2019, 7, 32048–32060.
16. Khan, Z.U.A.; Benaissa, M. Throughput/Area-efficient ECC Processor Using Montgomery Point Multiplication on FPGA. IEEE Trans. Circuits Syst. II Express Briefs 2015, 62, 1078–1082.
17. Khan, Z.U.A.; Benaissa, M. High-Speed and Low-Latency ECC Processor Implementation Over GF(2^m) on FPGA. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2017, 25, 165–176.
18. Imran, M.; Rashid, M.; Jafri, A.R.; Najam-Ul-Islam, M. ACryp-Proc: Flexible asymmetric crypto processor for point multiplication. IEEE Access 2018, 6, 22778–22793.
19. Harb, S.; Jarrah, M. FPGA implementation of the ECC over GF(2^m) for small embedded applications. ACM Trans. Embed. Comput. Syst. TECS 2019, 18, 1–19.
20. Kiyan, T.; Lohrke, H.; Boit, C. Comparative assessment of optical techniques for semi-invasive SRAM data read-out on an MSP430 microcontroller. In Proceedings of the ISTFA 2018: Proceedings from the 44th International Symposium for Testing and Failure Analysis, Phoenix, AZ, USA, 28 October–1 November 2018; ASM International: Novelty, OH, USA, 2018; p. 266.
21. Faraj, M.; Gebotys, C. Quiescent photonics side channel analysis: Low cost SRAM readout attack. Cryptogr. Commun. 2021, 13, 363–376.
22. Shifman, Y.; Miller, A.; Keren, O.; Weizmann, Y.; Shor, J. A Method to Improve Reliability in a 65-nm SRAM PUF Array. IEEE Solid-State Circuits Lett. 2018, 1, 138–141.
23. Satpathy, S.; Mathew, S.K.; Suresh, V.; Anders, M.A.; Kaul, H.; Agarwal, A.; Hsu, S.K.; Chen, G.; Krishnamurthy, R.K.; De, V.K. A 4-fJ/b delay-hardened physically unclonable function circuit with selective bit destabilization in 14-nm trigate CMOS. IEEE J. Solid-State Circuits 2017, 52, 940–949.
24. Alvarez, A.; Zhao, W.; Alioto, M. 14.3 15fJ/b static physically unclonable functions for secure chip identification with <2% native bit instability and 140× Inter/Intra PUF hamming distance separation in 65 nm. In Proceedings of the 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–3.
25. Yang, K.; Dong, Q.; Blaauw, D.; Sylvester, D. 8.3 A 553F² 2-transistor amplifier-based Physically Unclonable Function (PUF) with 1.67% native instability. In Proceedings of the 2017 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 5–9 February 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 146–147.
26. Yamamoto, D.; Sakiyama, K.; Iwamoto, M.; Ohta, K.; Takenaka, M.; Itoh, K. Variety enhancement of PUF responses using the locations of random outputting RS latches. J. Cryptogr. Eng. 2013, 3, 197–211.
27. Zhang, F.; Yang, S.; Plusquellic, J.; Bhunia, S. Current based PUF exploiting random variations in SRAM cells. In Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, 14–18 March 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 277–280.
28. Kumar, S.S.; Guajardo, J.; Maes, R.; Schrijen, G.J.; Tuyls, P. The butterfly PUF protecting IP on every FPGA. In Proceedings of the 2008 IEEE International Workshop on Hardware-Oriented Security and Trust, Anaheim, CA, USA, 9 June 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 67–70.
29. Bossuet, L.; Ngo, X.T.; Cherif, Z.; Fischer, V. A PUF based on a transient effect ring oscillator and insensitive to locking phenomenon. IEEE Trans. Emerg. Top. Comput. 2013, 2, 30–36.
30. Itoh, T.; Tsujii, S. A fast algorithm for computing multiplicative inverses in GF(2^m) using normal bases. Inf. Comput. 1988, 78, 171–177.
31. Khan, Z.U.A.; Benaissa, M. High speed ECC implementation on FPGA over GF(2^m). In Proceedings of the International Conference on Field Programmable Logic and Applications, London, UK, 2–4 September 2015; pp. 1–6.
32. Imran, M.; Rashid, M.; Jafri, A.R.; Kashif, M. Throughput/area optimised pipelined architecture for elliptic curve crypto processor. IET Comput. Digit. Tech. 2019, 13, 361–368.
33. Nguyen, T.T.; Lee, H. Efficient algorithm and architecture for elliptic curve cryptographic processor. J. Semicond. Technol. Sci. 2016, 16, 118–125.
34. Imran, M.; Shafi, I.; Jafri, A.R.; Rashid, M. Hardware design and implementation of ECC based crypto processor for low-area-applications on FPGA. In Proceedings of the 2017 International Conference on Open Source Systems & Technologies (ICOSST), Lahore, Pakistan, 18–20 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 54–59.
Research Work | ASIC Process (nm) | Area Per Bit (μm²/bit) | GE Per Bit (GE/bit) | BER |
---|---|---|---|---|
[22] | 65 | 49.12 | 25.58 | |
[23] | 14 | 1.84 | 11.83 | |
[24] | 65 | 50.7 | 26.40 | |
[25] | 65 | 17.9 | 9.32 | |
This work | 40 | 15.40 | 21.45 | |
Research Work | Device or Process | Reliability Intra-Chip Variation | Uniqueness Inter-Device Variation | Temperature | Voltage Supply |
---|---|---|---|---|---|
Latch-PUF [26] | Spartan-3E | M = 2.4% SD = 0.75% | M = 46% SD = 3.8% | 0 °C–85 °C | — * |
Latch-PUF [26] | Spartan-6 | M = 0.86% SD = 0.54% | M = 49% SD = 3.9% | — | 1.14 V–1.26 V |
SRAM-PUF [27] | 45 nm | M = 0.72% SD = 10% | M = 49.97% SD = 15% | 10 °C–85 °C | — |
Butterfly-PUF [28] | 65 nm | = 6% — | M = % — | −20 °C–80 °C | — |
TERO-PUF [29] | Cyclone II | M = 1.7% — | M = 48% — | 28 °C | 1.5 V |
Delay-Hardened-PUF [23] | 14 nm | M = 3.4% — | M = 48.6% — | 25 °C–110 °C | 0.55 V–0.75 V |
Amplifier-PUF [25] | 180 nm | M = 0.07% SD = 0.32% | M = 49.89% SD = 6.24% | −40 °C–120 °C | 0.8 V–1.8 V |
This work | 40 nm | M = 3.17% SD = 1.63% | M = 49.44% SD = 2.44% | 0 °C–85 °C | 0.8 V–1.2 V |
Design | Clock Cycle | Freq. (MHz) | LUTs | Slices | Latency (μs) | ATP |
---|---|---|---|---|---|---|
[15] | 547 | 320.5 | 28,911 | 8460 | 3.413 | 28,878 |
[16] | 4168 | 397 | 4271 | 1476 | 20.997 | 30,992 |
[17] | 450 | 159 | 41,090 | 11,657 | 5.660 | 65,983 |
[31] | 780 | 223 | 27,105 | 8736 | 6.995 | 61,113 |
[32] | 3960 | 369 | 9965 | 2207 | 21.463 | 47,370 |
[18] | 3960 | 351 | 10,955 | 3107 | 22.564 | 70,107 |
[19] | 13,000 | 320.8 | 6169 | 2201 | 81.047 | 178,385 |
[33] | 52,012 | 800 | − * | 4665 | 130.03 | 606,590 |
[34] | 3426 | 135 | − | 3657 | 50.076 | 185,613 |
This Work | 1958 | 296 | 13,912 | 3902 | 6.615 | 25,812 |
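The latency and ATP figures in the last table can be reproduced with simple arithmetic, sketched below in Python. The relations used are inferred from the numbers and are not stated in this text: latency is taken as clock cycles divided by frequency in MHz, and ATP as the second resource column multiplied by the latency. The reference rows appear consistent with twice that latency, presumably because a single-scalar design needs two passes for a double scalar multiplication; that reading is an assumption. The helper names are hypothetical.

```python
def latency_us(cycles, freq_mhz, passes=1):
    """Latency in microseconds for the given cycle count and clock frequency."""
    return passes * cycles / freq_mhz

def area_time_product(area, latency):
    """Area-time product: resource count multiplied by latency."""
    return area * latency

# Row "This Work": 1958 cycles at 296 MHz, second resource column 3902.
this_work_latency = latency_us(1958, 296)
this_work_atp = area_time_product(3902, this_work_latency)

# Row [15]: 547 cycles at 320.5 MHz, second resource column 8460, two passes assumed.
ref15_latency = latency_us(547, 320.5, passes=2)
ref15_atp = area_time_product(8460, ref15_latency)
```

Both rows agree with the tabulated latency to three decimals and with the tabulated ATP to within rounding.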
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, J.; Chen, Z.; He, X.; Liu, K.; Hao, Y.; Ma, M.; Wang, W.; Dang, H.; Li, X. Secure ECDSA SRAM-PUF Based on Universal Single/Double Scalar Multiplication Architecture. Micromachines 2024, 15, 552. https://doi.org/10.3390/mi15040552