ADPO: Adaptive DRAM Controller for Performance Optimization
Abstract
1. Introduction
- Unlike traditional DRAM controllers, we employ an architecture that decouples the scheduler from the protocol, which enables a scenario-based adaptive DRAM controller with dynamic adjustments.
- By accounting for the read–write mode switching penalty, we design a dynamic read and write mode arbiter that improves DDR utilization.
- We also evaluate the performance benefit of dynamic threshold control strategies, which reduce average read latency by approximately 10 to 150 ns.
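Why the switching penalty matters for utilization can be illustrated with a small calculation. The sketch below assumes a simple model in which the controller alternates between one read batch and one write batch of equal size and pays a fixed turnaround cost in each direction; the function name and the default cycle counts are illustrative placeholders, not values taken from the controller described here.

```python
def bus_utilization(batch_size, burst_cycles=8,
                    rd_to_wr_penalty=8, wr_to_rd_penalty=61):
    """Fraction of cycles that carry data over one read batch plus one write batch.

    All default cycle counts are hypothetical and only serve the illustration.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Cycles spent transferring data for both batches.
    data_cycles = 2 * batch_size * burst_cycles
    # Cycles lost to the two mode switches (read->write and write->read).
    turnaround_cycles = rd_to_wr_penalty + wr_to_rd_penalty
    return data_cycles / (data_cycles + turnaround_cycles)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, round(bus_utilization(n), 3))
```

Under this model, larger batches amortize the turnaround cost over more data transfers, which is the effect a mode arbiter tries to exploit without letting the other direction wait too long.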
2. DRAM Adaptive Control Arbiter
2.1. Theoretical Basis
2.2. Read and Write Switching Engine
2.3. Adaptive Scheduling Engine
3. DRAM Adaptive Control Arbiter Implementation
- (1) If read–write switching works well enough, the number of transactions in the four modes should, in theory, be balanced. In practice, certain modes may still accumulate a relatively large number of commands, for the following reasons.
- (2) Accesses to different ranks are imbalanced, and so is the ratio of reads to writes. This may be because the upstream hash or interleave is not good enough, or because the access pattern during a certain period has spatial locality. Overall read–write switching, for both ACT and RW commands, is more inclined toward reads. Write commands are relatively more difficult to arbitrate, which may lead to a backlog of write commands. In such a situation, the controller needs to switch to the corresponding mode to eliminate the imbalance.
- (3) The timing parameters are unbalanced. For example, the WR-to-RD delay (TCCD) within the same rank is significantly longer than the other timing parameters, which naturally makes it difficult for a rank to switch from write operations to read operations. Rank-level parallelism is therefore used to reduce the bubbles when read–write turnarounds have to be initiated.
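The backlog-driven switching described in the list above can be sketched as a small arbiter. This is a minimal illustration under stated assumptions, not the implementation evaluated here: the four mode names (two ranks times two directions), the `ModeArbiter` class, and the `switch_threshold` parameter are all hypothetical, since the four modes are not enumerated in this section.

```python
from dataclasses import dataclass, field

# Hypothetical mode names: two ranks x two directions (an assumption).
MODES = ("rank0_read", "rank0_write", "rank1_read", "rank1_write")


@dataclass
class ModeArbiter:
    # Hypothetical backlog gap that justifies paying a turnaround.
    switch_threshold: int = 8
    current: str = "rank0_read"
    pending: dict = field(default_factory=lambda: {m: 0 for m in MODES})

    def enqueue(self, mode, count=1):
        """Record `count` newly arrived commands for `mode`."""
        self.pending[mode] += count

    def select(self):
        """Keep the current mode unless it is empty or another mode's
        backlog exceeds it by at least `switch_threshold`."""
        busiest = max(MODES, key=lambda m: self.pending[m])
        if self.pending[busiest] == 0:
            return self.current  # nothing to do anywhere
        backlog_here = self.pending[self.current]
        gap = self.pending[busiest] - backlog_here
        if backlog_here == 0 or gap >= self.switch_threshold:
            self.current = busiest
        return self.current

    def issue(self):
        """Issue one command from the selected mode, or None if idle."""
        mode = self.select()
        if self.pending[mode] > 0:
            self.pending[mode] -= 1
            return mode
        return None
```

The threshold acts as hysteresis: a small imbalance does not trigger a switch, while a growing write backlog eventually forces one. A dynamic variant would adjust `switch_threshold` at run time.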
4. Results
4.1. Experimental Methodology
4.2. Performance Results
4.3. Synthesis Results
5. Related Work
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
SR/DR(6400) | Read | Write | Activate |
---|---|---|---|
READ | 8/8+r2r_gap a2 | 16/8+r2w_gap a1 | NA |
WRITE | 69/8+w2r_gap a1 | 8/8+w2w_gap a2 | NA |
ACTIVATE | NA | NA | 6/0 |
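To see what the table implies for command ordering, the sketch below sums the gaps along a command sequence. It assumes that the first number in each cell is the same-rank (SR) spacing between consecutive commands and that the unit is clock cycles; both readings are inferred from the "SR/DR" header rather than stated in the table. The names `SAME_RANK_GAP` and `sequence_cycles` are illustrative.

```python
# Same-rank (SR) spacing between consecutive commands, taken from the
# first number in each table cell. The unit (cycles) is an assumption.
SAME_RANK_GAP = {
    ("R", "R"): 8,
    ("R", "W"): 16,
    ("W", "R"): 69,
    ("W", "W"): 8,
}


def sequence_cycles(commands):
    """Sum the same-rank gaps between consecutive commands, e.g. 'RRWW'."""
    return sum(SAME_RANK_GAP[(prev, nxt)]
               for prev, nxt in zip(commands, commands[1:]))


if __name__ == "__main__":
    print("alternating RWRW:", sequence_cycles("RWRW"))
    print("grouped     RRWW:", sequence_cycles("RRWW"))
```

With these values, the alternating order costs 16 + 69 + 16 = 101 while the grouped order costs 8 + 16 + 8 = 32, so the write-to-read entry dominates whenever directions are interleaved within a rank.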
Item | Configuration |
---|---|
ddrc comm num | 8 |
ddrc comm period | 18,760 |
ddrc comm timeout | 1600 |
ddrc comm data width | 32 |
ddrc data rate | 4266 |
ddrc comm split byte | 64 |
ddrc comm aw outstanding | 256 |
ddrc comm ar outstanding | 256 |
ddrc aw/wd/br/ar/rd delay | 1 |
ddrc comm aw depth | 2 |
ddrc comm wd depth | 256 |
ddrc comm br depth | 32 |
ddrc comm ar depth | 2 |
ddrc comm rd depth | 2048 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, Z.; Li, Y.; Zeng, X. ADPO: Adaptive DRAM Controller for Performance Optimization. Micromachines 2025, 16, 409. https://doi.org/10.3390/mi16040409