Enhancing Power Efficiency in Branch Target Buffer Design with a Two-Level Prediction Mechanism
Abstract
1. Introduction
2. Background and Related Work
2.1. Branch Prediction and Its Significance
2.2. Historical Development of BTB
2.3. Power Optimization and Capacity
2.4. Innovations in Power Reduction
2.5. The Two-Level BTB Structure
3. The Proposed Structure
3.1. Structure Overview
3.1.1. The M-BTB Module
3.1.2. The V-BTB Module
Algorithm 1: Way prediction algorithm.
3.2. Experiment Settings
4. Experiment Evaluation
4.1. Analysis of Accesses
4.2. Analysis of Power
4.3. Analysis of Area
4.4. Analysis of Performance
4.5. Way Selection Mechanism
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Component | Parameter | Value |
---|---|---|
Processor Core | Data Path Width | 4 inst. per cycle |
Processor Core | Load/Store Queue | 8 entries |
Processor Core | RUU | 16 entries |
Processor Core | Function units | 4 IALU, 1 IMULT/IDIV, 1 FMULT/FDIV/FSQRT, 1 FMULT/FDIV/FSQRT |
Branch Predictor | Branch Predictor | Bimodal Predictor |
Branch Predictor | BTB | 64-entry, 512-entry, 4-way |
Branch Predictor | RAS | 16-entry |
Memory Hierarchy | L1 I/DCache | 32 KB, 4-way, 32 B blocks, LRU |
Memory Hierarchy | L2 UCache | 2 MB, 8-way, 64 B blocks, LRU |
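The cache and BTB rows of the configuration table above imply specific set counts, which in turn fix how many address bits index each structure. The following is a minimal sketch of that arithmetic, not part of the original experimental setup. The helper name `num_sets` is hypothetical, and the BTB line assumes that the 4-way associativity listed in the table applies to the 512-entry structure.

```python
KB = 1024
MB = 1024 * KB


def num_sets(capacity_bytes: int, ways: int, block_bytes: int) -> int:
    """Number of sets in a set-associative cache (hypothetical helper)."""
    lines = capacity_bytes // block_bytes  # total cache lines
    return lines // ways                   # lines are grouped into sets of `ways`


# L1 I/D cache: 32 KB, 4-way, 32 B blocks
l1_sets = num_sets(32 * KB, ways=4, block_bytes=32)
# L2 unified cache: 2 MB, 8-way, 64 B blocks
l2_sets = num_sets(2 * MB, ways=8, block_bytes=64)
# BTB: assumption that the 512-entry BTB is the 4-way structure
btb_sets = 512 // 4

# Index bits for a power-of-two set count
l1_index_bits = l1_sets.bit_length() - 1
l2_index_bits = l2_sets.bit_length() - 1
btb_index_bits = btb_sets.bit_length() - 1

print(l1_sets, l2_sets, btb_sets)
print(l1_index_bits, l2_index_bits, btb_index_bits)
```

Under these assumptions the L1 caches have 256 sets, the L2 cache has 4096 sets, and a 512-entry 4-way BTB has 128 sets.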
Scheme | Level | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|---|
2-level BTB | 1-BTB (Bank) | - | - | - | - | 100% |
2-level BTB | 2-BTB (Way) | - | - | - | - | 100% |
Our method | M-BTB (Bank) | 0% | 100% | 0% | 0% | 0% |
Our method | V-BTB (Way) | 39.8% | 58.0% | 1.74% | 0.1% | 0.4% |
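One way to read the distribution table above is as an average number of banks or ways touched per lookup. The sketch below computes that weighted mean, assuming that the column headers 0 to 4 denote the number of banks (or ways) accessed in a single lookup; the table itself does not state this, so treat it as an interpretation. The function name `mean_units_accessed` is hypothetical. Because the V-BTB percentages are rounded and sum to slightly more than 100%, the code normalizes by the row total.

```python
def mean_units_accessed(distribution: dict[int, float]) -> float:
    """Weighted mean of units accessed per lookup.

    `distribution` maps a unit count (assumed meaning of the column
    header) to the percentage of lookups with that count.
    """
    total = sum(distribution.values())  # normalizes rounded percentages
    return sum(count * pct for count, pct in distribution.items()) / total


# Rows copied from the table; "-" cells are treated as 0%.
conventional_2_btb = {4: 100.0}                          # 2-BTB (Way)
m_btb = {0: 0.0, 1: 100.0, 2: 0.0, 3: 0.0, 4: 0.0}       # M-BTB (Bank)
v_btb = {0: 39.8, 1: 58.0, 2: 1.74, 3: 0.1, 4: 0.4}      # V-BTB (Way)

baseline_mean = mean_units_accessed(conventional_2_btb)
m_btb_mean = mean_units_accessed(m_btb)
v_btb_mean = mean_units_accessed(v_btb)

print(f"2-BTB ways per lookup: {baseline_mean:.2f}")
print(f"M-BTB banks per lookup: {m_btb_mean:.2f}")
print(f"V-BTB ways per lookup: {v_btb_mean:.2f}")
```

Under this reading, the V-BTB touches roughly 0.63 ways per lookup on average, against 4 for the conventional second-level BTB. This is an access-count comparison only; it does not by itself translate into a power figure.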
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Nian, J.; Liu, H.; Gao, X.; Zhang, S.; Yang, M. Enhancing Power Efficiency in Branch Target Buffer Design with a Two-Level Prediction Mechanism. Electronics 2024, 13, 1185. https://doi.org/10.3390/electronics13071185