A Variation-Aware Binary Neural Network Framework for Process Resilient In-Memory Computations
Abstract
1. Introduction
- We develop mathematical models to quantify the impact of process variations on SRAM-based BNN CIM circuits. In these circuits, the current of an SRAM cell represents the product of the stored weight and the input activation. Parametric process variations cause this cell current to fluctuate, which degrades the accuracy of the analog computation and, consequently, the BNN inference accuracy. Our model interprets these fluctuations as variations in the BNN weights. To model the weight variations, we use the distribution of SRAM cell currents obtained from Monte Carlo (MC) simulations in 28 nm FD-SOI technology. Our method is therefore applicable to SRAM-CIM circuits that employ current-based analog computation.
- Based on the derived model, we present a variation-aware BNN training framework that produces variation-resilient trained networks. During training, BNNs are treated as bi-polar neural networks because of the weight variations described above. We demonstrate the efficacy of the framework through extensive simulations.
- We optimize the biasing voltages of word lines (WLs) and bit lines (BLs) in SRAM to achieve a balance between maintaining acceptable accuracy and minimizing power consumption.
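
As a rough illustration of the first contribution, the sketch below treats cell-current fluctuation as a multiplicative perturbation of the ideal +1/−1 weights and compares a variation-free MAC with one Monte Carlo sample. The Gaussian shape, the value of `sigma`, and the function name `sample_effective_weights` are assumptions made for illustration only; the paper takes its distribution from MC circuit simulations.

```python
import numpy as np

def sample_effective_weights(w_binary, sigma, rng):
    """Return the weights as seen by the analog MAC under cell-current variation.

    Assumption (hypothetical model): each cell current deviates from its
    nominal value by a relative Gaussian error with standard deviation `sigma`.
    """
    rel_error = rng.normal(loc=0.0, scale=sigma, size=w_binary.shape)
    return w_binary * (1.0 + rel_error)

rng = np.random.default_rng(seed=0)
w = rng.choice([-1.0, 1.0], size=(128, 4))  # ideal bi-polar weights
x = rng.choice([-1.0, 1.0], size=128)       # bi-polar input activations

ideal_sum = x @ w                                      # variation-free MAC
noisy_sum = x @ sample_effective_weights(w, 0.1, rng)  # one MC sample
```

Setting `sigma` to zero recovers the ideal binary weights, so the same code path can serve both the variation-free baseline and the variation-aware case.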
2. Preliminaries
2.1. Binary Neural Network
2.2. The Architecture of SRAM-Based CIM
2.3. Mapping DNNs onto SRAM-Based CIM Arrays
2.3.1. Input Splitting
2.3.2. Mapping
2.4. In-Memory Batch Normalization
3. A Variation-Aware Binary Neural Network Framework
3.1. Variation-Aware Models for SRAM-Based BNN CIM
3.2. Variation-Aware Framework for Bi-Polar Neural Networks
Algorithm 1. Training a reconstructed L-layer BNN with variation-aware weights and activations.

Require: a minibatch of inputs and targets, previous weights W, previous BatchNorm parameters, weight initialization coefficients from [30], and the previous learning rate.

Ensure: updated weights, updated BatchNorm parameters, and updated learning rate.

1. Computing the parameter gradients:

1.1 Forward propagation:

- for … to L do
  - … // Input size per array
  - … // Number of groups
  - while … do … end while
  - … // Input splitting
  - for … to nGroups do … end for
  - if … then … end if
- end for

1.2 Backward propagation:

- Compute … knowing … and …
- for … to 1 do
  - if … then … (STE) end if
  - for … to nGroups do … end for
- end for

2. Accumulating the parameter gradients:

- for … to L do
  - for … to nGroups do … end for
- end for
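
The following sketch illustrates the two ingredients named in Algorithm 1: input splitting in the forward pass and the straight-through estimator (STE) in the backward pass. It is not the authors' implementation. It assumes that each group's partial sum is binarized before the groups are accumulated, it omits BatchNorm, and the names `binarize`, `ste_backward`, and `split_forward` are hypothetical.

```python
import numpy as np

def binarize(x):
    """Map values to +1/-1 (zero is mapped to +1)."""
    return np.where(x >= 0, 1.0, -1.0)

def ste_backward(pre_activation, grad_output):
    """Straight-through estimator: pass the gradient only where |x| <= 1."""
    return grad_output * (np.abs(pre_activation) <= 1.0)

def split_forward(x, w_eff, n_groups):
    """Forward pass of one layer mapped onto `n_groups` arrays.

    x:     (n_in,) bi-polar inputs
    w_eff: (n_in, n_out) effective (variation-affected) weights
    Assumption: each array's partial sum is binarized, then groups are summed.
    """
    x_parts = np.split(x, n_groups)              # requires n_in % n_groups == 0
    w_parts = np.split(w_eff, n_groups, axis=0)
    partial = [binarize(xp @ wp) for xp, wp in zip(x_parts, w_parts)]
    return np.sum(partial, axis=0)

out = split_forward(np.ones(8), np.ones((8, 2)), n_groups=2)
grad = ste_backward(np.array([0.5, -2.0]), np.array([1.0, 1.0]))
```

In this toy call, both groups produce a positive partial sum for each output, so the accumulated result is 2 per output, and the STE blocks the gradient for the pre-activation whose magnitude exceeds 1.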
3.3. Optimization of Biasing Voltages
3.4. Modeling of IR Drop
4. Validation of Our Framework
4.1. Experimental Setting
4.2. Results and Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
CIM | Computation in memory |
DNNs | Deep neural networks |
BA | Binary activation |
BNNs | Binary neural networks |
ADCs | Analog-to-digital converters |
CSA | Current sense amplifier |
MAC | Multiply-and-accumulate |
eNVM | Emerging non-volatile memories |
MC | Monte Carlo |
BN | Batch normalization |
FC | Fully connected |
BLs | Bit lines |
WLs | Word lines |
References
1. Wan, W.; Kubendran, R.; Schaefer, C.; Eryilmaz, S.B.; Zhang, W.; Wu, D.; Deiss, S.; Raina, P.; Qian, H.; Gao, B.; et al. A compute-in-memory chip based on resistive random-access memory. Nature 2022, 608, 504–512.
2. Wen, T.-H.; Hsu, H.-H.; Khwa, W.-S.; Huang, W.-H.; Ke, Z.-E.; Chin, Y.-H.; Wen, H.-J.; Chang, Y.-C.; Hsu, W.-T.; Lo, C.-C.; et al. 34.8 A 22nm 16Mb Floating-Point ReRAM Compute-in-Memory Macro with 31.2TFLOPS/W for AI Edge Devices. In Proceedings of the 2024 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 18–22 February 2024; pp. 580–582.
3. Antolini, A.; Lico, A.; Zavalloni, F.; Scarselli, E.F.; Gnudi, A.; Torres, M.L.; Canegallo, R.; Pasotti, M. A Readout Scheme for PCM-Based Analog In-Memory Computing With Drift Compensation Through Reference Conductance Tracking. IEEE Open J. Solid-State Circuits Soc. 2024, 4, 69–82.
4. Khaddam-Aljameh, R.; Stanisavljevic, M.; Mas, J.F.; Karunaratne, G.; Braendli, M.; Liu, F.; Singh, A.; Müller, S.M.; Egger, U.; Petropoulos, A.; et al. HERMES Core—A 14 nm CMOS and PCM-based In-Memory Compute Core using an array of 300ps/LSB Linearized CCO-based ADCs and local digital processing. In Proceedings of the 2021 Symposium on VLSI Circuits, Kyoto, Japan, 13–19 June 2021; pp. 1–2.
5. Courbariaux, M.; Hubara, I.; Soudry, D.; El-Yaniv, R.; Bengio, Y. Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or −1. arXiv 2016, arXiv:1602.02830.
6. Rastegari, M.; Ordonez, V.; Redmon, J.; Farhadi, A. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks. arXiv 2016, arXiv:1603.05279v4.
7. Kim, H.; Kim, K.; Kim, J.; Kim, J.-J. BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations. In Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia, 30 April 2020; Available online: https://openreview.net/forum?id=r1x0lxrFPS (accessed on 25 September 2024).
8. Yin, S.; Jiang, Z.; Seo, J.-S.; Seok, M. XNOR-SRAM: In-Memory Computing SRAM Macro for Binary/Ternary Deep Neural Networks. IEEE J. Solid-State Circuits 2020, 6, 1733–1743.
9. Liu, R.; Peng, X.; Sun, X.; Khwa, W.-S.; Si, X.; Chen, J.-J.; Li, J.-F.; Chang, M.-F.; Yu, S. Parallelizing SRAM arrays with customized bit-cell for binary neural networks. In Proceedings of the 55th Annual Design Automation Conference (DAC), San Francisco, CA, USA, 24–28 June 2018.
10. Kim, H.; Oh, H.; Kim, J.-J. Energy-efficient XNOR-free in-memory BNN accelerator with input distribution regularization. In Proceedings of the 39th International Conference on Computer-Aided Design (ICCAD), Virtual Event, 2–5 November 2020.
11. Choi, W.H.; Chiu, P.-F.; Ma, W.; Hemink, G.; Hoang, T.T.; Lueker-Boden, M.; Bandic, Z. An In-Flash Binary Neural Network Accelerator with SLC NAND Flash Array. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Seville, Spain, 12–14 October 2020.
12. Angizi, S.; He, Z.; Awad, A.; Fan, D. MRIMA: An MRAM-Based In-Memory Accelerator. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2020, 5, 1123–1136.
13. Saha, G.; Jiang, Z.; Parihar, S.; Xi, C.; Higman, J.; Karim, M.A.U. An Energy-Efficient and High Throughput in-Memory Computing Bit-Cell With Excellent Robustness Under Process Variations for Binary Neural Network. IEEE Access 2020, 8, 91405–91414.
14. Kim, J.; Koo, J.; Kim, T.; Kim, Y.; Kim, H.; Yoo, S.; Kim, J.-J. Area-Efficient and Variation-Tolerant In-Memory BNN Computing using 6T SRAM Array. In Proceedings of the Symposium on VLSI Circuits, Kyoto, Japan, 9–14 June 2019.
15. Oh, H.; Kim, H.; Ahn, D.; Park, J.; Kim, Y.; Lee, I.; Kim, J.-J. Energy-efficient charge sharing-based 8T2C SRAM in-memory accelerator for binary neural networks in 28nm CMOS. In Proceedings of the IEEE Asian Solid-State Circuits Conference (A-SSCC), Busan, Republic of Korea, 7–10 November 2021.
16. Bhunia, S.; Mukhopadhyay, S.; Roy, K. Process Variations and Process-Tolerant Design. In Proceedings of the 20th International Conference on VLSI Design Held Jointly with 6th International Conference on Embedded Systems (VLSID’07), Bangalore, India, 6–10 January 2007.
17. Yi, W.; Kim, Y.; Kim, J.-J. Effect of Device Variation on Mapping Binary Neural Network to Memristor Crossbar Array. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 25–29 March 2019.
18. Laborieux, A.; Bocquet, M.; Hirtzlin, T.; Klein, J.-O.; Nowak, E.; Vianello, E.; Portal, J.-M.; Querlioz, D. Implementation of Ternary Weights With Resistive RAM Using a Single Sense Operation Per Synapse. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 1, 138–147.
19. Sun, X.; Peng, X.; Chen, P.-Y.; Liu, R.; Seo, J.-s.; Yu, S. Fully parallel RRAM synaptic array for implementing binary neural network with (+1, −1) weights and (+1, 0) neurons. In Proceedings of the 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jeju, Republic of Korea, 22–25 January 2018.
20. Sun, X.; Yin, S.; Peng, X.; Liu, R.; Seo, J.-s.; Yu, S. XNOR-RRAM: A scalable and parallel resistive synaptic architecture for binary neural networks. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, Germany, 19–23 March 2018.
21. Liu, B.; Li, H.; Chen, Y.; Li, X.; Wu, Q.; Huang, T. Vortex: Variation-aware training for memristor X-bar. In Proceedings of the 52nd Annual Design Automation Conference (DAC), San Francisco, CA, USA, 8–12 June 2015.
22. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
23. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556v6.
24. Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. Technical Report. 2009. Available online: https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf (accessed on 25 September 2024).
25. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
26. Kim, Y.; Kim, H.; Kim, J.-J. Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators. arXiv 2019, arXiv:1811.02187.
27. Sari, E.; Belbahri, M.; Nia, V.P. How Does Batch Normalization Help Binary Training? arXiv 2020, arXiv:1909.09139v3.
28. Kim, H.; Kim, Y.; Kim, J.-J. In-memory batch-normalization for resistive memory based binary neural network hardware. In Proceedings of the 24th Asia and South Pacific Design Automation Conference (ASPDAC), Tokyo, Japan, 21–24 January 2019.
29. Chen, T.; Gielen, G.G.E. A 14-bit 200-MHz Current-Steering DAC With Switching-Sequence Post-Adjustment Calibration. IEEE J. Solid-State Circuits 2007, 42, 2386–2394.
30. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015.
31. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980v9.
32. Arandilla, C.D.C.; Alvarez, A.B.; Roque, C.R.K. Static Noise Margin of 6T SRAM Cell in 90-nm CMOS. In Proceedings of the UkSim 13th International Conference on Computer Modelling and Simulation (UKSIM), Cambridge, UK, 30 March–1 April 2011.

| | IEEE Access’20 [13] | VLSI’19 [14] | ASSCC’21 [15] | DAC’18 [9] |
|---|---|---|---|---|
| SRAM bit cell | 10T + BEOL MOM cap | 6T (split WL) | 8T2C | 6T |
| Technology | 22 nm | 28 nm | 28 nm | 65 nm |

| | VLSI’19 [14] | ASSCC’21 [15] | This Work |
|---|---|---|---|
| Technique to mitigate process variations | In-memory calibration by using some biasing rows. | Charge-based computation. | Software framework. |
| Advantages | Robust to deterministic noise. | Robust to random cell variations. | No hardware penalty. |
| Disadvantages | Difficulty in coping with random cell variations. The area and power overhead of the biasing rows. | Large cell area compared to 6T-SRAM cell. | Many training processes. |

Network | Dataset | Full Precision | BNN (+1/−1) | BNN (1/0) | Split BNN 128 (+1/−1) | Split BNN 128 (1/0) | Split BNN 256 (+1/−1) | Split BNN 256 (1/0) | Split BNN 512 (+1/−1) | Split BNN 512 (1/0) |
---|---|---|---|---|---|---|---|---|---|---|
CONVNET [25] | MNIST [25] | 99.43 | 99.29 | 99.33 | 98.85 | 98.92 | 98.89 | 99.17 | 99.13 | 99.22 |
RESNET-18 [22] | CIFAR-10 [24] | 91.17 | 82.82 | 83.06 | 67.68 | 78.87 | 78.30 | 81.02 | 78.23 | 81.85 |
VGG-9 [23] | CIFAR-10 [24] | 93.71 | 89.77 | 91.36 | 86.94 | 87.24 | 87.63 | 88.58 | 88.35 | 88.79 |

Layer | Input Count per Output | Array Size 128 | Array Size 256 | Array Size 512 |
---|---|---|---|---|
1 | 3 × 3 × 1 | - | - | - |
2 | 3 × 3 × 32 | 3 | 2 | 1 |
3 | 3 × 3 × 32 | 3 | 2 | 1 |
4 | 3 × 3 × 32 | 3 | 2 | 1 |
5 | 1568 | 14 | 7 | 4 |
6 | 512 | - | - | - |

Layer | Input Count per Output | Array Size 128 | Array Size 256 | Array Size 512 |
---|---|---|---|---|
1 | 3 × 3 × 3 | - | - | - |
2→7 | 3 × 3 × 16 | 2 | 1 | 1 |
8→13 | 3 × 3 × 32 | 3 | 2 | 1 |
14→19 | 3 × 3 × 64 | 6 | 3 | 2 |
20 | 64 | - | - | - |

Layer | Input Count per Output | Array Size 128 | Array Size 256 | Array Size 512 |
---|---|---|---|---|
1 | 3 × 3 × 3 | - | - | - |
2 | 3 × 3 × 128 | 9 | 6 | 3 |
3 | 3 × 3 × 128 | 9 | 6 | 3 |
4 | 3 × 3 × 256 | 18 | 9 | 6 |
5 | 3 × 3 × 256 | 18 | 9 | 6 |
6 | 3 × 3 × 512 | 36 | 18 | 9 |
7 | 8192 | 64 | 32 | 16 |
8 | 1024 | 8 | 4 | 2 |
9 | 1024 | - | - | - |
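
The group counts in the three layer-mapping tables above are consistent with a simple rule: use the smallest number of equal-sized groups such that each group fits within one array. The sketch below implements that rule. It is inferred from the tabulated values rather than quoted from the text, and `num_groups` is a hypothetical name.

```python
import math

def num_groups(n_inputs, array_size):
    """Smallest group count that divides n_inputs with each group <= array_size."""
    n = math.ceil(n_inputs / array_size)  # lower bound on the number of arrays
    while n_inputs % n != 0:              # grow until the groups are equal-sized
        n += 1
    return n

# Spot checks against tabulated entries (input count per output, array size).
checks = {
    (3 * 3 * 32, 128): 3,
    (1568, 128): 14,
    (3 * 3 * 64, 128): 6,
    (3 * 3 * 128, 256): 6,
    (3 * 3 * 256, 512): 6,
    (8192, 512): 16,
}
results = {key: num_groups(*key) for key in checks}
```

For example, a 3 × 3 × 64 layer has 576 inputs per output. Five groups would fit a 128-row array, but 576 is not divisible by 5, so six groups of 96 inputs are used.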
Software Implementation | Hardware Implementation |
---|---|
where |
0.1 | 0.2 | 0.3 | 0.4 | ||
---|---|---|---|---|---|
0 | 0 | 0 | 0 | ||
0 | 0 | 0 | 0 | ||
138 | 0 | 0 | 0 | ||
Flipped | 311 | 0 | 0 | ||
Flipped | Flipped | 514 | 0 | ||
Flipped | Flipped | Flipped | 807 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Le, M.-S.; Pham, T.-N.; Nguyen, T.-D.; Chang, I.-J. A Variation-Aware Binary Neural Network Framework for Process Resilient In-Memory Computations. Electronics 2024, 13, 3847. https://doi.org/10.3390/electronics13193847