Designing a Scalable and Area-Efficient Hardware Accelerator Supporting Multiple PQC Schemes
Abstract
1. Introduction
2. Background
2.1. Post-Quantum Cryptography
2.2. Dilithium
Algorithm 1 Main algorithm of Dilithium
Keygen()
  …
  … // Use NTT for faster multiplication
  …
  return …
SignSignature(sk, M)
  … // Generate masking vector
  …
  while … do
    …
    if … or … then
      …
  return …
Verify
  …
  return …
2.3. Kyber
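- Kyber.CPAPKE.KeyGen. Key generation first generates a matrix $\mathbf{A}$ by sampling from the output of the SHAKE128 hash function. It then samples the secret vector $\mathbf{s}$ and the error vector $\mathbf{e}$ from a centered binomial distribution driven by SHAKE256. The public key is computed as $\mathbf{t} = \mathbf{A}\mathbf{s} + \mathbf{e}$, and the secret key is $\mathbf{s}$.
- Kyber.CPAPKE.Encryption. Encryption first regenerates the matrix $\mathbf{A}$ by sampling from the SHAKE128 hash function. Then, with freshly sampled small polynomials $\mathbf{r}$, $\mathbf{e}_1$, and $e_2$, it computes $\mathbf{u} = \mathbf{A}^{T}\mathbf{r} + \mathbf{e}_1$ and $v = \mathbf{t}^{T}\mathbf{r} + e_2 + \mathrm{Decompress}_q(m, 1)$. The ciphertext c is computed as $c = (\mathrm{Compress}_q(\mathbf{u}, d_u), \mathrm{Compress}_q(v, d_v))$.
- Kyber.CPAPKE.Decryption. Decryption uses the secret key $\mathbf{s}$ and the ciphertext c to restore $\mathbf{u}$ and $v$ by decompressing the ciphertext. The original message is computed as $m = \mathrm{Compress}_q(v - \mathbf{s}^{T}\mathbf{u}, 1)$.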
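The three CPAPKE steps above can be illustrated with a toy Python sketch. It is not the Kyber specification: Python's `random` module stands in for the SHAKE-based sampling, compression is omitted, and polynomial multiplication is done by schoolbook negacyclic convolution rather than NTT. All function names are hypothetical.

```python
import random

Q, N, K, ETA = 3329, 256, 2, 2  # Kyber512-like sizes; ETA simplified to 2 everywhere


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def poly_mul(a, b):
    """Schoolbook multiplication in Z_q[x]/(x^N + 1), i.e. x^N wraps to -1."""
    res = [0] * N
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                res[k] = (res[k] + ai * bj) % Q
            else:
                res[k - N] = (res[k - N] - ai * bj) % Q
    return res


def dot(u, v):
    """Inner product of two vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(u, v):
        acc = poly_add(acc, poly_mul(a, b))
    return acc


def cbd(rng):
    """Small coefficients from a centered binomial distribution (stand-in sampler)."""
    return [(sum(rng.getrandbits(1) for _ in range(ETA))
             - sum(rng.getrandbits(1) for _ in range(ETA))) % Q for _ in range(N)]


def uniform(rng):
    return [rng.randrange(Q) for _ in range(N)]


def keygen(rng):
    A = [[uniform(rng) for _ in range(K)] for _ in range(K)]
    s = [cbd(rng) for _ in range(K)]
    e = [cbd(rng) for _ in range(K)]
    t = [poly_add(dot(A[i], s), e[i]) for i in range(K)]  # t = A s + e
    return (A, t), s


def encrypt(pk, msg_bits, rng):
    A, t = pk
    r = [cbd(rng) for _ in range(K)]
    e1 = [cbd(rng) for _ in range(K)]
    e2 = cbd(rng)
    # u = A^T r + e1 (column j of A is row j of A^T)
    u = [poly_add(dot([A[i][j] for i in range(K)], r), e1[j]) for j in range(K)]
    m = [bit * ((Q + 1) // 2) for bit in msg_bits]  # encode each bit as 0 or ~q/2
    v = poly_add(poly_add(dot(t, r), e2), m)        # v = t^T r + e2 + m
    return u, v


def decrypt(s, ct):
    u, v = ct
    w = poly_sub(v, dot(s, u))                      # w = v - s^T u = m + small noise
    return [1 if Q // 4 < c < 3 * Q // 4 else 0 for c in w]
```

Decryption works because $v - \mathbf{s}^{T}\mathbf{u}$ equals the encoded message plus a noise term built only from small polynomials, which stays well below $q/4$ for these sizes.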
Algorithm 2 Main algorithm of Kyber
Keygen()
  …
  return …
Encrypt(Secretkey sk, Message M)
  …
  return …
Decrypt
  …
  if c = c′ then
    return …
  else
    return …
  return K
2.4. Falcon
Algorithm 3 Main algorithm of Falcon
Keygen()
  …
  return …
SignSignature(M, sk, …)
  …
  while … do
    for … do
      …
  return …
Verify()
  …
  if … then
    return accept
  else
    return reject
2.5. SPHINCS+
Algorithm 4 Main algorithm of SPHINCS+
Keygen()
  …
  return ((SK.seed, SK.prf, PK.seed, PK.root), (PK.seed, PK.root))
SignSignature()
  …
  // Compute message digest and index
  …
  tmp_md = first floor((ka + 7)/8) bytes of digest
  tmp_idx_tree = next floor((h - h/d + 7)/8) bytes of digest
  tmp_idx_leaf = next floor((h/d + 7)/8) bytes of digest
  md = first ka bits of tmp_md
  …
  // FORS sign
  …
  // Get FORS public key
  …
  // Sign FORS public key with Hyper Tree
  …
  return SIG
Verify()
  // Init
  …
  // Compute message digest and index
  …
  tmp_md = first floor((ka + 7)/8) bytes of digest
  tmp_idx_tree = next floor((h - h/d + 7)/8) bytes of digest
  tmp_idx_leaf = next floor((h/d + 7)/8) bytes of digest
  md = first ka bits of tmp_md
  idx_tree = first h - h/d bits of tmp_idx_tree
  idx_leaf = first h/d bits of tmp_idx_leaf
  // Compute FORS public key
  …
  // Verify Hyper Tree signature
  …
  return …
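The digest-splitting step of Algorithm 4 can be sketched in Python as follows. The function names are hypothetical, and "first n bits" is interpreted here as the most significant bits of the big-endian byte string, which is an assumption about bit ordering. The example parameters (k = 22, a = log t = 14, h = 64, d = 8) are those of the SPHINCS+-256s row in the parameter table.

```python
def first_bits(data: bytes, nbits: int) -> int:
    """Return the leading nbits of data as an integer (big-endian, MSB first)."""
    return int.from_bytes(data, "big") >> (8 * len(data) - nbits)


def split_digest(digest: bytes, k: int, a: int, h: int, d: int):
    """Split a message digest into (md, idx_tree, idx_leaf)."""
    tree_bits = h - h // d
    leaf_bits = h // d
    md_len = (k * a + 7) // 8          # floor((ka + 7) / 8) bytes
    tree_len = (tree_bits + 7) // 8    # floor((h - h/d + 7) / 8) bytes
    leaf_len = (leaf_bits + 7) // 8    # floor((h/d + 7) / 8) bytes
    if len(digest) < md_len + tree_len + leaf_len:
        raise ValueError("digest too short for these parameters")

    tmp_md = digest[:md_len]
    tmp_idx_tree = digest[md_len:md_len + tree_len]
    tmp_idx_leaf = digest[md_len + tree_len:md_len + tree_len + leaf_len]

    md = first_bits(tmp_md, k * a)
    idx_tree = first_bits(tmp_idx_tree, tree_bits)
    idx_leaf = first_bits(tmp_idx_leaf, leaf_bits)
    return md, idx_tree, idx_leaf
```

For the 256s parameters this consumes 39 + 7 + 1 = 47 bytes of digest: 308 bits of `md`, a 56-bit tree index, and an 8-bit leaf index.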
3. Related Works and Motivation
1. Versatility across applications and environments. A single solution can be adapted to different security and performance requirements.
2. Reduced need for multiple specialized accelerators. Unifying HW resources reduces overall cost and complexity.
3. Simplified maintenance and updates. Changes can be applied uniformly across all supported schemes, which makes it easier to adapt to evolving cryptographic standards.
4. Enhanced flexibility and longevity of the HW design. Compatibility with future PQC standards extends the useful life of the HW investment.
4. Design Methodology
4.1. Performance Profiling
- Keccak. Predominantly in SPHINCS+ (99%) and significantly in Dilithium (43%).
- NTT. Major component in Falcon as FFT operations (60.4%).
- Polynomial operations. Present in Falcon as floating-point operations, indicating heavy computational load (30.8%).
- Reduction. Notable in Kyber (25%) and Dilithium (21%) due to the use of Montgomery reduction.
- P1: Polynomial operations are commonly used, but data types differ. Three schemes (Dilithium, Kyber, and Falcon) all operate on polynomial data. Dilithium and Kyber operate over polynomials with coefficients in integer rings, requiring variants of Montgomery reduction [25] and the Number Theoretic Transform (NTT). Both functions are frequently used and represent significant hotspots in their execution profiles. Falcon also operates over polynomial data but uses polynomials with floating-point coefficients and performs Fast Fourier Transforms (FFTs) instead of NTTs, eliminating the need for modular reductions.
- P2: Dissimilar proportion of Keccak. Keccak is another hotspot function present in all four schemes, but it accounts for varying proportions of execution time: 43% for Dilithium, 19% for Kyber, 3.7% for Falcon, and 99% for SPHINCS+. Since there is no clear preference for any specific scheme (no usage statistics are available, as PQC schemes have so far been developed but not widely deployed in industry), we assumed that all four schemes will be used equally.
- P3: Distinct high-level operation sequence. Although the schemes share common functions, their high-level operation sequences differ significantly. This is due not only to the inherent differences between the algorithms but also to the different polynomial lengths used across parameter sets. For instance, using parallel butterfly modules to compute the NTT [26] requires a different number of stages, and each stage takes a different number of cycles depending on the number of butterfly modules instantiated.
4.2. Proposed Design
4.2.1. Keccak Acceleration Module (KAM)
1. KAM-Small. Optimized for minimal area consumption, it uses 5 KALUs (Keccak ALUs) to compute each step of the Keccak permutation, taking 5 cycles per step.
2. KAM-Large. A mid-range solution balancing area and performance, it has 25 KALUs, allowing each Keccak permutation step to be computed in a single cycle.
3. KAM-FP. For maximum performance, this variant has a fully pipelined datapath that computes each round of the permutation in a single cycle.
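For reference, the permutation that all three KAM variants accelerate can be written as a compact software model. The sketch below follows the structure of Keccak-f[1600] (24 rounds, each consisting of the θ, ρ, π, χ, and ι step mappings over a 5 × 5 array of 64-bit lanes) and wraps it in a minimal SHA3-256 sponge so that it can be checked against Python's `hashlib`. It is a functional model only and does not reflect how the KALUs partition the work.

```python
import hashlib

MASK64 = (1 << 64) - 1


def rol64(value: int, shift: int) -> int:
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to lanes[x][y]."""
    lfsr = 1  # generates the round-constant bits for iota
    for _ in range(24):
        # theta: mix each column parity into the neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear mixing along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR the round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def sha3_256(message: bytes) -> bytes:
    """Minimal SHA3-256 sponge (rate 136 bytes) built on keccak_f1600."""
    rate = 136
    padded = bytearray(message)
    padded.append(0x06)                 # SHA-3 domain separation and first pad bit
    while len(padded) % rate:
        padded.append(0x00)
    padded[-1] ^= 0x80                  # final pad bit
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = keccak_f1600(lanes)
    return b"".join(lanes[x][0].to_bytes(8, "little") for x in range(4))
```

If "step" in the list above is read as one of the five step mappings, which is an interpretation rather than something the text states, an ideal permutation would take about 24 × 5 × 5 = 600 cycles on KAM-Small, 24 × 5 = 120 on KAM-Large, and 24 on KAM-FP, before any load/store overhead.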
4.2.2. Joint Polynomial Arithmetic Unit (JPAU)
4.2.3. Control Unit
- Sample_polynomial. The UPCU initiates and manages the polynomial sampling process. This includes setting up necessary registers and handling data flow for efficient sampling.
- Polynomial_multiplication. The UPCU controls the sequence of multiplication and accumulation operations, coordinating data flow and setting up operands for the computation.
- NTT_INTT. The UPCU manages the NTT and INTT operations, controlling the butterfly units and Montgomery reduction units. It ensures efficient operations by adjusting control signals and managing data flow through various stages, utilizing the Twiddle factor ROM for different schemes.
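The arithmetic that the NTT_INTT operation sequences can be sketched in software. The example below shows a textbook Montgomery reduction and the two butterfly forms (Cooley-Tukey for the forward transform, Gentleman-Sande for the inverse), using Kyber's modulus q = 3329 with R = 2^16. The function names and the unsigned formulation are assumptions for illustration; the hardware units may use a signed or otherwise different variant.

```python
Q = 3329                  # Kyber modulus
R_BITS = 16
R = 1 << R_BITS           # Montgomery radix, R > Q and gcd(R, Q) = 1
Q_NEG_INV = (-pow(Q, -1, R)) % R   # -Q^{-1} mod R (needs Python 3.8+)


def montgomery_reduce(a: int) -> int:
    """Return a * R^{-1} mod Q for 0 <= a < Q * R, without dividing by Q."""
    m = (a * Q_NEG_INV) & (R - 1)  # m = -a / Q mod R
    t = (a + m * Q) >> R_BITS      # a + m*Q is divisible by R
    return t - Q if t >= Q else t  # single conditional subtraction


def to_mont(x: int) -> int:
    """Map x into the Montgomery domain (x * R mod Q)."""
    return (x * R) % Q


def mont_mul(a_mont: int, b: int) -> int:
    """Multiply a Montgomery-domain value by a plain value; result is plain."""
    return montgomery_reduce(a_mont * b)


def ct_butterfly(a: int, b: int, w_mont: int):
    """Cooley-Tukey butterfly: (a + w*b, a - w*b) mod Q."""
    t = mont_mul(w_mont, b)
    return (a + t) % Q, (a - t) % Q


def gs_butterfly(a: int, b: int, w_inv_mont: int):
    """Gentleman-Sande butterfly: (a + b, (a - b) * w^{-1}) mod Q."""
    return (a + b) % Q, mont_mul(w_inv_mont, (a - b) % Q)
```

Keeping the twiddle factors in the Montgomery domain, as a twiddle ROM could, means that each butterfly needs only one reduction. Applying `gs_butterfly` with the inverse twiddle to the output of `ct_butterfly` returns `(2a, 2b)`, which is why an inverse NTT ends with a scaling by the inverse of the transform length.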
5. Implementation
6. Evaluation
6.1. Dilithium and Kyber
6.2. Falcon
6.3. SPHINCS+
6.4. Power Consumption
7. Discussions
7.1. Architectural Differences against Others
7.2. Security and Reliability
7.3. Limitations and Future Works
8. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
AVX | Advanced Vector eXtension |
ASIC | Application Specific Integrated Circuit |
ALU | Arithmetic and Logical Unit |
DSE | Design Space Exploration |
DSA | Digital Signature Algorithm |
XMSS | eXtended Merkle Signature Scheme |
FFT | Fast Fourier Transform |
FPGA | Field Programmable Gate Array |
FSM | Finite-State Machine |
GE | Gate Equivalent |
GPV | Gentry–Peikert–Vaikuntanathan |
HW | Hardware |
IoT | Internet of Things |
INTT | Inverse Number Theoretic Transform |
JPAU | Joint Polynomial Arithmetic Unit |
KALU | Keccak ALU |
KAM | Keccak Acceleration Module |
KEA | Key Exchange Algorithm |
NIST | National Institute of Standards and Technology |
NTRU | Number Theory Research Unit |
NTT | Number Theoretic Transform |
PQC | Post-Quantum Cryptosystem |
pk | Public Key |
sk | Secret Key |
SW | Software |
UPCU | Unified Polynomial Control Unit |
WOTS | Winternitz One-Time Signature |
References
- Carracedo, J.M.; Milliken, M.; Chouhan, P.K.; Scotney, B.; Lin, Z.; Sajjad, A.; Shackleton, M. Cryptography for Security in IoT. In Proceedings of the 2018 Fifth International Conference on Internet of Things: Systems, Management and Security, Valencia, Spain, 15–18 October 2018; pp. 23–30. [Google Scholar] [CrossRef]
- Katzenbeisser, S.; Polian, I.; Regazzoni, F.; Stöttinger, M. Security in Autonomous Systems. In Proceedings of the 2019 IEEE European Test Symposium (ETS), Baden-Baden, Germany, 27–31 May 2019; pp. 1–8. [Google Scholar] [CrossRef]
- Muzikant, P.; Willemson, J. Deploying Post-quantum Algorithms in Existing Applications and Embedded Devices. In Proceedings of the Ubiquitous Security; Wang, G., Wang, H., Min, G., Georgalas, N., Meng, W., Eds.; Springer: Singapore, 2024; pp. 147–162. [Google Scholar]
- Kim, D.; Choi, H.; Seo, S.C. Parallel Implementation of SPHINCS+ With GPUs. IEEE Trans. Circuits Syst. I Regul. Pap. 2024, 71, 2810–2823. [Google Scholar] [CrossRef]
- Shor, P.W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Rev. 1999, 41, 303–332. [Google Scholar] [CrossRef]
- Aikata, A.; Mert, A.C.; Imran, M.; Pagliarini, S.; Roy, S.S. KaLi: A Crystal for Post-Quantum Security Using Kyber and Dilithium. IEEE Trans. Circuits Syst. I Regul. Pap. 2023, 70, 747–758. [Google Scholar] [CrossRef]
- Avanzi, R.; Bos, J.; Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schanck, J.M.; Schwabe, P.; Seiler, G.; Stehlé, D. CRYSTALS-Kyber: Algorithm Specifications and Supporting Documentation, Submission to the NIST Post-Quantum Project. 2021. Available online: https://pq-crystals.org/kyber/data/kyber-specification-round3-20210131.pdf (accessed on 7 August 2024).
- Ducas, L.; Kiltz, E.; Lepoint, T.; Lyubashevsky, V.; Schwabe, P.; Seiler, G.; Stehlé, D. CRYSTALS-Dilithium: Algorithm Specifications and Supporting Documentation, Submission to the NIST Post-Quantum Project. 2021. Available online: https://pq-crystals.org/dilithium/data/dilithium-specification-round3-20210208.pdf (accessed on 7 August 2024).
- Fouque, P.A.; Hoffstein, J.; Kirchner, P.; Lyubashevsky, V.; Pornin, T.; Prest, T.; Ricosset, T.; Seiler, G.; Whyte, W.; Zhang, Z. Falcon: Fast-Fourier Lattice-Based Compact Signatures over NTRU, Specification v1.2. 2020. Available online: https://falcon-sign.info/falcon.pdf (accessed on 7 August 2024).
- Aumasson, J.P.; Bernstein, D.J.; Beullens, W.; Dobraunig, C.; Eichlseder, M.; Fluhrer, S.; Gazdag, S.L.; Hülsing, A.; Kampanakis, P.; Kölbl, S.; et al. SPHINCS+ Specification. Submission to the NIST Post-Quantum Project. 2020. Available online: https://sphincs.org/data/sphincs+-r3.1-specification.pdf (accessed on 7 August 2024).
- NIST. Selected Algorithms 2022, July 2022. Available online: https://csrc.nist.gov/projects/post-quantum-cryptography/selected-algorithms-2022 (accessed on 7 August 2024).
- Seo, E.Y.; Kim, Y.S.; Lee, J.W.; No, J.S. Peregrine: Toward Fastest FALCON Based on GPV Framework. Cryptology ePrint Archive. 2022. Available online: https://eprint.iacr.org/2022/1495 (accessed on 7 August 2024).
- Bernstein, D.J.; Hopwood, D.; Hülsing, A.; Lange, T.; Niederhagen, R.; Papachristodoulou, L.; Schneider, M.; Schwabe, P.; Wilcox-O’Hearn, Z. SPHINCS: Practical Stateless Hash-Based Signatures. In Proceedings of the Advances in Cryptology–EUROCRYPT 2015; Oswald, E., Fischlin, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2015; pp. 368–397. [Google Scholar]
- Aikata, A.; Mert, A.C.; Jacquemin, D.; Das, A.; Matthews, D.; Ghosh, S.; Roy, S.S. A Unified Cryptoprocessor for Lattice-Based Signature and Key-Exchange. IEEE Trans. Comput. 2023, 72, 1568–1580. [Google Scholar] [CrossRef]
- Basso, A.; Bermudo Mera, J.M.; D’Anvers, J.P.; Karmakar, A.; Sinha Roy, S.; Van Beirendonck, M.; Vercauteren, F. SABER: Mod-LWR Based KEM (Round 3 Submission) SABER Submission Package for Round 3. 2017. Available online: https://www.esat.kuleuven.be/cosic/pqcrypto/saber/files/saberspecround3.pdf (accessed on 7 August 2024).
- Lee, J.; Kim, W.; Kim, J.H. A Programmable Crypto-Processor for National Institute of Standards and Technology Post-Quantum Cryptography Standardization Based on the RISC-V Architecture. Sensors 2023, 23, 9408. [Google Scholar] [CrossRef]
- Nguyen, T.H.; Kieu-Do-Nguyen, B.; Pham, C.K.; Hoang, T.T. High-Speed NTT Accelerator for CRYSTAL-Kyber and CRYSTAL-Dilithium. IEEE Access 2024, 12, 34918–34930. [Google Scholar] [CrossRef]
- Lee, Y.; Youn, J.; Nam, K.; Jung, H.H.; Cho, M.; Na, J.; Park, J.Y.; Jeon, S.; Kang, B.G.; Oh, H.; et al. An Efficient Hardware/Software Co-Design for FALCON on Low-End Embedded Systems. IEEE Access 2024, 12, 57947–57958. [Google Scholar] [CrossRef]
- Gupta, N.; Jati, A.; Chattopadhyay, A.; Jha, G. Lightweight Hardware Accelerator for Post-Quantum Digital Signature CRYSTALS-Dilithium. IEEE Trans. Circuits Syst. I Regul. Pap. 2023, 70, 3234–3243. [Google Scholar] [CrossRef]
- Wagner, A.; Oberhansl, F.; Schink, M. To Be, or Not to Be Stateful: Post-Quantum Secure Boot using Hash-Based Signatures. In Proceedings of the 2022 Workshop on Attacks and Solutions in Hardware Security, Los Angeles, CA, USA, 11 November 2022; ASHES’22. pp. 85–94. [Google Scholar] [CrossRef]
- Mandal, S.; Roy, D.B. KiD: A Hardware Design Framework Targeting Unified NTT Multiplication for CRYSTALS-Kyber and CRYSTALS-Dilithium on FPGA. In Proceedings of the 2024 37th International Conference on VLSI Design and 2024 23rd International Conference on Embedded Systems (VLSID), Kolkata, India, 6–10 January 2024; pp. 455–460. [Google Scholar] [CrossRef]
- Beckwith, L.; Nguyen, D.T.; Gaj, K. Hardware Accelerators for Digital Signature Algorithms Dilithium and FALCON. IEEE Des. Test 2023, 1. [Google Scholar] [CrossRef]
- Bisheh-Niasar, M.; Azarderakhsh, R.; Mozaffari-Kermani, M. A Monolithic Hardware Implementation of Kyber: Comparing Apples to Apples in PQC Candidates. In Progress in Cryptology–LATINCRYPT 2021, Proceedings of the 7th International Conference on Cryptology and Information Security in Latin America, Bogotá, Colombia, 6–8 October 2021; Longa, P., Ràfols, C., Eds.; Springer: Cham, Switzerland, 2021; pp. 108–126. [Google Scholar]
- Intel Inc. Intel Vtune Profiler. 2023. Available online: https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html (accessed on 7 August 2024).
- Montgomery, P.L. Modular multiplication without trial division. Math. Comput. 1985, 44, 519–521. [Google Scholar] [CrossRef]
- Richard, T.; Chao, L.; Myoung, A. Algorithms for Discrete Fourier Transform and Convolution; Springer: Cham, Switzerland, 2021. [Google Scholar] [CrossRef]
- SYNOPSYS Inc. Synopsys Design Compiler. Available online: https://www.synopsys.com/implementation-and-signoff/rtl-synthesis-test/dc-ultra.html (accessed on 7 August 2024).
- Martins, M.; Matos, J.M.; Ribas, R.P.; Reis, A.; Schlinker, G.; Rech, L.; Michelsen, J. Open Cell Library in 15nm FreePDK Technology. In Proceedings of the 2015 Symposium on International Symposium on Physical Design, Monterey, CA, USA, 29 March–1 April 2015; ISPD ’15. pp. 171–178. [Google Scholar] [CrossRef]
- Soni, D.; Basu, K.; Nabeel, M.; Aaraj, N.; Manzano, M.; Karri, R. Hardware Architectures for Post-Quantum Digital Signature Schemes; Springer: Cham, Switzerland, 2021. [Google Scholar] [CrossRef]
- Alharbi, A.R.; Hazzazi, M.M.; Jamal, S.S.; Aljaedi, A.; Aljuhni, A.; Alanazi, D.J. DCryp-Unit: Crypto Hardware Accelerator Unit Design for Elliptic Curve Point Multiplication. IEEE Access 2024, 12, 17823–17835. [Google Scholar] [CrossRef]
- Amiet, D.; Leuenberger, L.; Curiger, A.; Zbinden, P. FPGA-based SPHINCS+ Implementations: Mind the Glitch. In Proceedings of the 2020 23rd Euromicro Conference on Digital System Design (DSD), Kranj, Slovenia, 26–28 August 2020; pp. 229–237. [Google Scholar] [CrossRef]
- Kocher, P.C. Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems. In Proceedings of the Advances in Cryptology—CRYPTO ’96; Koblitz, N., Ed.; Springer: Berlin/Heidelberg, Germany, 1996; pp. 104–113. [Google Scholar]
- Bogdanov, A. Improved Side-Channel Collision Attacks on AES. In Proceedings of the Selected Areas in Cryptography; Adams, C., Miri, A., Wiener, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 84–95. [Google Scholar]
- Ji, Y.; Wang, R.; Ngo, K.; Dubrova, E.; Backlund, L. A Side-Channel Attack on a Hardware Implementation of CRYSTALS-Kyber. In Proceedings of the 2023 IEEE European Test Symposium (ETS), Venezia, Italy, 22–26 May 2023; pp. 1–5. [Google Scholar] [CrossRef]
- Xagawa, K.; Ito, A.; Ueno, R.; Takahashi, J.; Homma, N. Fault-Injection Attacks Against NIST’s Post-Quantum Cryptography Round 3 KEM Candidates. In Proceedings of the Advances in Cryptology–ASIACRYPT 2021; Tibouchi, M., Wang, H., Eds.; Springer: Cham, Switzerland, 2021; pp. 33–61. [Google Scholar]
- Zhao, Y.; Pan, S.; Ma, H.; Gao, Y.; Song, X.; He, J.; Jin, Y. Side Channel Security Oriented Evaluation and Protection on Hardware Implementations of Kyber. IEEE Trans. Circuits Syst. I Regul. Pap. 2023, 70, 5025–5035. [Google Scholar] [CrossRef]
- Bos, J.W.; Gourjon, M.O.; Renes, J.; Schneider, T.; Vredendaal, C.V. Masking Kyber: First- and higher-order implementations. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2021, 173–214. [Google Scholar] [CrossRef]
- Migliore, V.; Gérard, B.; Tibouchi, M.; Fouque, P.A. Masking Dilithium. In Proceedings of the Applied Cryptography and Network Security; Deng, R.H., Gauthier-Umaña, V., Ochoa, M., Yung, M., Eds.; Springer: Cham, Switzerland, 2019; pp. 344–362. [Google Scholar]
- Cerdeira, D.; Martins, J.; Santos, N.; Pinto, S. ReZone: Disarming TrustZone with TEE Privilege Reduction. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 2261–2279. [Google Scholar]
- Ryan, K. Hardware-Backed Heist: Extracting ECDSA Keys from Qualcomm’s TrustZone. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; CCS ’19. pp. 181–194. [Google Scholar] [CrossRef]
Scheme | Kyber | Dilithium | FALCON | SPHINCS+ |
---|---|---|---|---|
Algorithm Type | Key Exchange (KEA) | Digital Signature (DSA) | Digital Signature (DSA) | Digital Signature (DSA) |
Based Approach | Lattice-Based | Lattice-Based | Lattice-Based | Hash-Based |
Parameter | Dilithium2 | Dilithium3 | Dilithium5 |
---|---|---|---|
q | 8,380,417 | 8,380,417 | 8,380,417 |
N | 256 | 256 | 256 |
(k, l) | (4, 4) | (6, 5) | (8, 7) |
$\gamma_2$ | (q − 1)/88 | (q − 1)/32 | (q − 1)/32 |

Parameter | Kyber512 | Kyber768 | Kyber1024 |
---|---|---|---|
q | 3329 | 3329 | 3329 |
N | 256 | 256 | 256 |
k | 2 | 3 | 4 |
$\eta_1$ | 3 | 2 | 2 |
$\eta_2$ | 2 | 2 | 2 |
$(d_u, d_v)$ | (10, 4) | (10, 4) | (11, 5) |
Parameter | Falcon512 | Falcon1024 | Peregrine * 512 | Peregrine * 1024 |
---|---|---|---|---|
q | 12,289 | 12,289 | 12,289 | 12,289 |
N | 512 | 1024 | 512 | 1024 |
$\lfloor \beta^2 \rfloor$ | 34,034,726 | 70,265,242 | 34,034,726 | 150,700,176 |
Parameter | n | h | d | log(t) | k | w | NIST Security Level |
---|---|---|---|---|---|---|---|
SPHINCS+-256s | 32 | 64 | 8 | 14 | 22 | 16 | 5 |
SPHINCS+-256s robust | 32 | 64 | 8 | 14 | 22 | 16 | 5 |
Mnemonic | Opcode | Description |
---|---|---|
NOP | 0000 | No operation, Do nothing |
ADD | 0001 | Result[i]←vec_a[i]+vec_b[i] |
SUB | 0010 | Result[i]←vec_a[i]-vec_b[i] |
CADDQ | 0011 | Result[i]←(vec_a[i] <0) ? vec_a[i] +Q : vec_a[i] |
MULT | 0100 | Result[i]←vec_a[i] × vec_b[i] |
SHIFT | 0101 | Result[i]←vec_a[i] <<SHIFT_AMOUNT |
REDUCE | 0110 | Result[i]←MontgomeryReduction(vec_a[i]) |
AND | 0111 | Result[i]←vec_a[i] AND vec_b[i] |
OR | 1000 | Result[i]←vec_a[i] OR vec_b[i] |
XOR | 1001 | Result[i]←vec_a[i] XOR vec_b[i] |
NTT_BUTTERFLY | 1010 | Result←Butterfly(vec_a) |
INTT_BUTTERFLY | 1011 | Result←InvButterfly(vec_a) |
COMP | 1100 | Comp_result[i] ← COMPARE(vec_a[i], vec_b[i]) |
RESERVED | 1101–1111 | - |
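The element-wise instructions in the table above can be mimicked by a small software interpreter, which may help when writing test vectors. The sketch covers only the instructions whose semantics are fully given in the table (ADD, SUB, CADDQ, MULT, SHIFT, AND, OR, XOR). REDUCE, the butterfly instructions, and COMP are left out because the table does not define them in enough detail. The function name, the default modulus, and the use of unbounded Python integers instead of a fixed hardware word width are assumptions.

```python
import operator

# Opcodes of the two-operand, element-wise instructions from the table.
BINARY_OPS = {
    0b0001: operator.add,   # ADD
    0b0010: operator.sub,   # SUB
    0b0100: operator.mul,   # MULT
    0b0111: operator.and_,  # AND
    0b1000: operator.or_,   # OR
    0b1001: operator.xor,   # XOR
}
CADDQ = 0b0011
SHIFT = 0b0101


def execute(opcode, vec_a, vec_b=None, q=3329, shift_amount=0):
    """Apply one vector instruction and return the result vector."""
    if opcode in BINARY_OPS:
        if vec_b is None or len(vec_a) != len(vec_b):
            raise ValueError("binary instructions need two equal-length vectors")
        op = BINARY_OPS[opcode]
        return [op(a, b) for a, b in zip(vec_a, vec_b)]
    if opcode == CADDQ:
        # Conditionally add q to bring negative coefficients into [0, q).
        return [a + q if a < 0 else a for a in vec_a]
    if opcode == SHIFT:
        return [a << shift_amount for a in vec_a]
    raise NotImplementedError(f"opcode {opcode:04b} is not modeled in this sketch")
```

For example, `execute(0b0010, [1, 5], [3, 2])` followed by `execute(0b0011, ...)` on its result reproduces the usual subtract-then-correct pattern of modular subtraction.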
Design | Technology | Clock Frequency | Area (mm²) | kGE | Target kGE | Target Scheme |
---|---|---|---|---|---|---|
Ours_Baseline | 15 nm | 1000 MHz | 0.056 | 284.939 | - | Dilithium, Kyber, SPHINCS+, Falcon(Peregrine) |
Ours_S | 15 nm | 1000 MHz | 0.062 | 315.743 | 300 | Dilithium, Kyber, SPHINCS+, Falcon(Peregrine) |
Ours_M | 15 nm | 1000 MHz | 0.115 | 584.624 | 613.849 | Dilithium, Kyber, SPHINCS+, Falcon(Peregrine) |
Ours_L | 15 nm | 1000 MHz | 0.120 | 611.389 | 613.849 | Dilithium, Kyber, SPHINCS+, Falcon(Peregrine) |
Gupta et al. [19] | 65 nm | 1176 MHz | 0.227 | 157.000 | - | Dilithium |
Aikata et al. [14] | 65 nm | 400 MHz | 0.317 | 220.000 | - | Dilithium, Saber |
Aikata et al. [6] | 28 nm | 1000 MHz | 0.263 | 747.000 | - | Dilithium, Kyber |
Wagner et al. [20] | 120 nm | 250 & 500 MHz | 0.560 | 84.000 | - | SPHINCS+ |
Wagner et al. [20] extended | 120 nm | 250 & 500 MHz | 0.476 | 98.800 | - | SPHINCS+ |
Lee et al. [18] | 28 nm | 300 MHz | 0.038 | 98.729 | - | Falcon(Verification) (1) |
Soni et al. [29] 512 | 65 nm | 122 MHz | 0.387 | 184.300 | - | Falcon(Signing) (2) |
Soni et al. [29] 1024 | 65 nm | 173 MHz | 0.380 | 181.120 | - | Falcon(Signing) (2) |
Bisheh-Niasar et al. [23] | 65 nm | 200 MHz | N/A | 93 | - | Kyber |
Operation | Parameter | Gupta et al. [19] Thrpt. | Gupta et al. [19] FoM | Aikata et al. [14] Thrpt. | Aikata et al. [14] FoM | Aikata et al. [6] Thrpt. | Aikata et al. [6] FoM | Ours_S Thrpt. | Ours_S FoM | Ours_M Thrpt. | Ours_M FoM | Ours_L Thrpt. | Ours_L FoM |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Keygen | Dilithium2 | - | - | 0.52 | 0.75 | 1.27 | 0.54 | 1.00 | 1.00 | 1.74 | 0.94 | 2.09 | 1.08 |
Keygen | Dilithium3 | - | - | 0.57 | 0.82 | 1.39 | 0.59 | 1.00 | 1.00 | 1.76 | 0.95 | 2.08 | 1.07 |
Keygen | Dilithium5 | 1.11 | 2.23 | 0.61 | 0.88 | 1.50 | 0.63 | 1.00 | 1.00 | 1.77 | 0.96 | 2.08 | 1.07 |
Keygen | Kyber512 | - | - | - | - | 4.66 | 1.97 | 1.00 | 1.00 | 1.04 | 0.56 | 2.18 | 1.12 |
Keygen | Kyber768 | - | - | - | - | 3.47 | 1.47 | 1.00 | 1.00 | 1.04 | 0.56 | 2.26 | 1.17 |
Keygen | Kyber1024 | - | - | - | - | 3.08 | 1.30 | 1.00 | 1.00 | 1.04 | 0.56 | 2.31 | 1.19 |
Sign | Dilithium2 | - | - | 0.96 | 1.38 | 2.31 | 0.98 | 1.00 | 1.00 | 1.90 | 1.03 | 2.01 | 1.04 |
Sign | Dilithium3 | - | - | 1.08 | 1.55 | 2.63 | 1.11 | 1.00 | 1.00 | 1.92 | 1.04 | 2.01 | 1.04 |
Sign | Dilithium5 | 2.39 | 4.81 | 1.35 | 1.94 | 3.30 | 1.40 | 1.00 | 1.00 | 1.93 | 1.04 | 2.01 | 1.04 |
Sign | Kyber512 | - | - | - | - | 2.85 | 1.20 | 1.00 | 1.00 | 1.07 | 0.58 | 2.74 | 1.42 |
Sign | Kyber768 | - | - | - | - | 2.57 | 1.09 | 1.00 | 1.00 | 1.07 | 0.58 | 2.71 | 1.40 |
Sign | Kyber1024 | - | - | - | - | 2.34 | 0.99 | 1.00 | 1.00 | 1.07 | 0.58 | 2.68 | 1.39 |
Verify | Dilithium2 | - | - | 0.83 | 1.19 | 2.02 | 0.85 | 1.00 | 1.00 | 1.83 | 0.99 | 2.00 | 1.03 |
Verify | Dilithium3 | - | - | 0.85 | 1.22 | 2.08 | 0.88 | 1.00 | 1.00 | 1.85 | 1.00 | 2.01 | 1.04 |
Verify | Dilithium5 | 1.71 | 3.44 | 0.86 | 1.24 | 2.11 | 0.89 | 1.00 | 1.00 | 1.86 | 1.01 | 2.03 | 1.05 |
Verify | Kyber512 | - | - | - | - | 2.26 | 0.95 | 1.00 | 1.00 | 1.11 | 0.60 | 2.65 | 1.37 |
Verify | Kyber768 | - | - | - | - | 1.96 | 0.83 | 1.00 | 1.00 | 1.11 | 0.60 | 2.61 | 1.35 |
Verify | Kyber1024 | - | - | - | - | 2.10 | 0.89 | 1.00 | 1.00 | 1.11 | 0.60 | 2.57 | 1.33 |
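The FoM definition is not stated in the tables themselves. The tabulated values are, however, consistent with normalized throughput divided by gate count normalized to Ours_S, that is, a throughput-per-area metric. The Python sketch below encodes that inferred relation using the kGE values from the area table. The formula is an inference from the numbers, not a definition taken from the text, and the function and dictionary names are hypothetical.

```python
# Gate counts (kGE) copied from the area comparison table.
KGE = {
    "Ours_S": 315.743,
    "Ours_M": 584.624,
    "Ours_L": 611.389,
    "Gupta et al. [19]": 157.000,
    "Aikata et al. [14]": 220.000,
    "Aikata et al. [6]": 747.000,
}


def figure_of_merit(norm_throughput: float, design: str, baseline: str = "Ours_S") -> float:
    """Throughput (normalized to the baseline) per unit of normalized area."""
    norm_area = KGE[design] / KGE[baseline]
    return norm_throughput / norm_area
```

For Dilithium2 key generation this gives 1.74 / (584.624 / 315.743) ≈ 0.94 for Ours_M and 1.27 / (747 / 315.743) ≈ 0.54 for Aikata et al. [6], matching the table. A larger design therefore scores above 1.00 only if its throughput grows faster than its area.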
Operation | Parameter | CPU(AVX) Thrpt. | Lee et al. [18] Thrpt. | Lee et al. [18] FoM | Soni et al. [29] Thrpt. | Soni et al. [29] FoM | Ours_S Thrpt. | Ours_S FoM | Ours_M Thrpt. | Ours_M FoM | Ours_L Thrpt. | Ours_L FoM |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Keygen | Falcon512 | 0.15 | - | - | - | - | 1.00 | 1.000 | 1.82 | 0.983 | 1.82 | 0.940 |
Keygen | Falcon1024 | 0.11 | - | - | - | - | 1.00 | 1.000 | 1.84 | 0.992 | 1.84 | 0.948 |
Sign | Falcon512 | 0.24 | 0.002 | 0.006 | - | - | 1.00 | 1.000 | 1.60 | 0.863 | 1.60 | 0.826 |
Sign | Falcon1024 | 0.25 | 0.002 | 0.006 | - | - | 1.00 | 1.000 | 1.62 | 0.874 | 1.62 | 0.836 |
Verify | Falcon512 | 0.12 | - | - | 0.01 | 0.015 | 1.00 | 1.000 | 1.98 | 1.072 | 2.00 | 1.034 |
Verify | Falcon1024 | 0.13 | - | - | 0.01 | 0.015 | 1.00 | 1.000 | 1.99 | 1.076 | 2.00 | 1.033 |
Operation | Parameter | CPU(AVX) Thrpt. | Wagner et al. [20] Thrpt. | Wagner et al. [20] FoM | Wagner et al. [20] Extended Thrpt. | Wagner et al. [20] Extended FoM | Amiet et al. [31] Thrpt. | Amiet et al. [31] FoM | Ours_S Thrpt. | Ours_S FoM | Ours_M Thrpt. | Ours_M FoM | Ours_L Thrpt. | Ours_L FoM |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Keygen | 256s-simple | 0.05 | - | - | - | - | - | - | 1.00 | 1.000 | 1.00 | 0.540 | 2.95 | 1.522 |
Keygen | 256s-robust | 0.04 | - | - | - | - | - | - | 1.00 | 1.000 | 1.00 | 0.540 | 2.97 | 1.535 |
Sign | 256s-simple | 0.03 | - | - | - | - | 0.82 | - | 1.00 | 1.000 | 1.00 | 0.540 | 2.95 | 1.522 |
Sign | 256s-robust | 0.03 | - | - | - | - | 0.83 | - | 1.00 | 1.000 | 1.00 | 0.540 | 2.97 | 1.535 |
Verify | 256s-simple | 0.01 | 0.03 | 0.104 | 0.04 | 0.135 | 0.06 | - | 1.00 | 1.000 | 1.00 | 0.540 | 2.58 | 1.330 |
Verify | 256s-robust | 0.01 | 0.02 | 0.077 | 0.04 | 0.131 | 0.08 | - | 1.00 | 1.000 | 1.00 | 0.540 | 2.59 | 1.336 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jung, H.; Oh, H. Designing a Scalable and Area-Efficient Hardware Accelerator Supporting Multiple PQC Schemes. Electronics 2024, 13, 3360. https://doi.org/10.3390/electronics13173360