Article

FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit

1 School of Electrical Engineering, Korea University, Seoul 02841, Korea
2 School of Electronic and Electrical Engineering, Hongik University, Seoul 04066, Korea
* Author to whom correspondence should be addressed.
Electronics 2021, 10(22), 2859; https://doi.org/10.3390/electronics10222859
Submission received: 15 September 2021 / Revised: 16 November 2021 / Accepted: 17 November 2021 / Published: 19 November 2021
(This article belongs to the Special Issue Energy-Efficient Processors, Systems, and Their Applications)

Abstract

Convolutional neural networks (CNNs) are widely used in modern applications because of their versatility and high classification accuracy. Field-programmable gate arrays (FPGAs) are well suited to CNN acceleration owing to their high performance, rapid development cycle, and reconfigurability. Although many studies have proposed methods for implementing high-performance CNN accelerators on FPGAs using optimized data types and algorithmic transformations, accelerators can be optimized further through more efficient use of FPGA resources. In this paper, we propose an FPGA-based CNN accelerator that uses multiple approximate accumulation units based on a fixed-point data type. We implemented the LeNet-5 CNN architecture, which classifies handwritten digits from the MNIST dataset. The proposed accelerator was implemented with a high-level synthesis tool on a Xilinx FPGA and applies an optimized fixed-point data type and loop parallelization to improve performance. The approximate operation units are implemented using FPGA logic resources instead of high-precision digital signal processing (DSP) blocks, which are inefficient for low-precision data. Compared with a floating-point design, our accelerator model achieves 66% less memory usage and approximately 50% lower network latency, and its resource utilization is optimized to use 78% fewer DSP blocks compared with general fixed-point designs.
Keywords: convolutional neural network; FPGA; high-level synthesis; accelerator

Share and Cite

MDPI and ACS Style

Cho, M.; Kim, Y. FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit. Electronics 2021, 10, 2859. https://doi.org/10.3390/electronics10222859

AMA Style

Cho M, Kim Y. FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit. Electronics. 2021; 10(22):2859. https://doi.org/10.3390/electronics10222859

Chicago/Turabian Style

Cho, Mannhee, and Youngmin Kim. 2021. "FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit" Electronics 10, no. 22: 2859. https://doi.org/10.3390/electronics10222859

APA Style

Cho, M., & Kim, Y. (2021). FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit. Electronics, 10(22), 2859. https://doi.org/10.3390/electronics10222859

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
