Conductance-Aware Quantization Based on Minimum Error Substitution for Non-Linear-Conductance-State Tolerance in Neural Computing Systems
Abstract
1. Introduction
- We observe that, when a pair of differential ReRAM devices is used to map a weight, non-linear conductance levels can yield more distinct conductance representations than linear conductance levels, which helps preserve the software inference accuracy after the weights are quantized.
- The weight quantization criteria are generated from the non-linear conductance values of a pair of differential ReRAM devices. Minimum error substitution (MES) is employed in the quantization process to determine the quantized weight locations in the software network, which provides a universal quantization method for different conductance value distributions.
- The proposed MES-based conductance-aware quantization is evaluated on LeNet, AlexNet, and VGG16, taking device variation into account.
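The first two contributions can be illustrated with a minimal sketch. It is not the paper's Algorithm 1: the conductance levels below are hypothetical integers in arbitrary units, and the helper names `differential_values` and `mes_quantize` are assumptions introduced only for this example. The sketch enumerates the values a differential pair can represent as G+ - G-, then replaces each weight with the representable value of smallest absolute error.

```python
from bisect import bisect_left


def differential_values(levels):
    """Sorted distinct values G+ - G- reachable by a pair of devices."""
    return sorted({gp - gn for gp in levels for gn in levels})


def mes_quantize(weight, values):
    """Replace `weight` with the entry of sorted `values` minimizing |value - weight|."""
    i = bisect_left(values, weight)
    # Only the two neighbours around the insertion point can be the closest.
    neighbours = values[max(i - 1, 0):i + 1]
    return min(neighbours, key=lambda v: abs(v - weight))


# Hypothetical 4-level devices (arbitrary integer conductance units).
linear_levels = [0, 3, 6, 9]
nonlinear_levels = [0, 1, 4, 9]  # assumed spacing, for illustration only

linear_diffs = differential_values(linear_levels)
nonlinear_diffs = differential_values(nonlinear_levels)

# Hypothetical weights already scaled to the same units as the conductances.
weights = [-7.2, -0.4, 2.6, 8.7]
quantized = [mes_quantize(w, nonlinear_diffs) for w in weights]

print(len(linear_diffs), len(nonlinear_diffs))
print(quantized)
```

With these assumed levels, the linear pair yields 7 distinct differential values and the non-linear pair yields 13, because unevenly spaced levels produce fewer coinciding differences. How the weights are scaled to the conductance range is left out of the sketch and would depend on the device model.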
2. Preliminary
2.1. The Characteristics of the ReRAM
2.2. ReRAM-Based DNN
2.2.1. RCA-Based MAC Operation
2.2.2. Accelerating the MAC Operations in DNN Based on RCA Hardware
- (1) Extend the sign and range of the conductance of ReRAM
- (2) Quantization of the weights in DNN
- (3) Strategies of the conductance-aware quantization of weights
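As a rough illustration of step (1), the sketch below computes an idealized crossbar MAC in which each signed weight is stored as the difference of two non-negative conductances. It assumes the standard relation I_j = Σ_i V_i (G+_ij - G-_ij) and ignores wire resistance, device variation, and peripheral circuits; the function name `differential_mac` and all numeric values are hypothetical.

```python
def differential_mac(voltages, g_pos, g_neg):
    """Column outputs I_j = sum_i V_i * (G+_ij - G-_ij) for an idealized crossbar."""
    n_cols = len(g_pos[0])
    return [
        sum(v * (gp_row[j] - gn_row[j])
            for v, gp_row, gn_row in zip(voltages, g_pos, g_neg))
        for j in range(n_cols)
    ]


# Hypothetical 2x2 example in arbitrary units.
voltages = [1, 2]          # one input voltage per row
g_pos = [[4, 0], [1, 9]]   # conductances of the "positive" devices
g_neg = [[0, 1], [0, 4]]   # conductances of the "negative" devices

# Effective signed weights are g_pos - g_neg = [[4, -1], [1, 5]].
currents = differential_mac(voltages, g_pos, g_neg)
print(currents)
```

Using two devices per weight lets non-negative conductances encode negative weights and widens the representable range, at the cost of doubling the number of devices.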
3. Proposed Method
3.1. The Characteristic of the Differential Pair ReRAMs with Non-Linear Distribution of Conductance
3.2. Conductance-Aware Quantization Based on Minimum Error Substitution
Algorithm 1: MES-CAQ.
4. Simulation and Results
4.1. ReRAM Non-Linear Conductance Models and Fitting Functions
4.2. The Simulation Results of the MES-CAQ
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Parameters | a | ||
---|---|---|---|
Values | 0.10 | , 2, 3 | , 2, 3 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Huang, C.; Xu, N.; Wang, W.; Hu, Y.; Fang, L. Conductance-Aware Quantization Based on Minimum Error Substitution for Non-Linear-Conductance-State Tolerance in Neural Computing Systems. Micromachines 2022, 13, 667. https://doi.org/10.3390/mi13050667