Article

AI-Driven Performance Modeling for AI Inference Workloads

Max Sponner, Bernd Waschneck and Akash Kumar
1 Infineon Technologies Dresden GmbH & Co. KG, 01099 Dresden, Germany
2 Center for Advancing Electronics Dresden (CFAED), Technical University (TU) Dresden, 01062 Dresden, Germany
* Author to whom correspondence should be addressed.
Electronics 2022, 11(15), 2316; https://doi.org/10.3390/electronics11152316
Submission received: 30 June 2022 / Revised: 18 July 2022 / Accepted: 21 July 2022 / Published: 26 July 2022
(This article belongs to the Section Computer Science & Engineering)

Abstract

Deep Learning (DL) workloads are increasingly deployed not only in cloud datacenters, but also on local devices. Although these deployments are mostly limited to inference tasks, they still widen the range of possible target architectures significantly. These new targets usually offer drastically reduced compute performance and memory capacity compared to traditionally used architectures, and, as they often run on batteries, shift the key optimization focus to efficiency. To help developers quickly estimate the performance of a neural network during its design phase, performance models can be used. However, such models are expensive to implement, as they require in-depth knowledge of both the hardware architecture and the algorithms used. Although AI-based solutions exist, they either require large datasets that are difficult to collect on low-performance targets, and/or are limited to a small number of target platforms and metrics. Our solution exploits the block-based structure of neural networks, as well as the high similarity of the layer configurations typically used across neural networks, enabling the training of accurate models on significantly smaller datasets. In addition, our solution is not limited to a specific architecture or metric. We showcase its feasibility on a set of seven devices from four different hardware architectures, with up to three performance metrics per target, including power consumption and memory footprint. Our tests have shown that the solution achieves an error of less than 1 ms (2.6%) in latency, 0.12 J (4%) in energy consumption, and 11 MiB (1.5%) in memory allocation for whole-network inference prediction, while being up to five orders of magnitude faster than a benchmark run.
Keywords: performance modeling; machine learning; regression models
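The approach the abstract describes can be illustrated with a minimal sketch: train a regression model on per-layer configurations, then estimate the whole-network cost as the sum of per-layer predictions. Because the same layer configurations recur across networks, a small per-layer dataset can cover many models. The feature set, the synthetic data, and the choice of a random-forest regressor below are illustrative assumptions, not the authors' implementation.

# Minimal sketch of per-layer performance modeling (assumptions, not the
# paper's implementation): fit a regressor on per-layer measurements, then
# sum per-layer predictions to estimate whole-network latency.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic training data: one row per benchmarked layer configuration.
# Assumed features: [input_size, output_channels, kernel_size, stride].
X_train = rng.integers(1, 256, size=(200, 4)).astype(float)
# Synthetic per-layer latencies in milliseconds (stand-in for real
# measurements collected on the target device).
y_train = 0.01 * X_train[:, 0] * X_train[:, 1] / X_train[:, 3] \
    + rng.normal(0.0, 1.0, 200)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# A "network" is a sequence of layer configurations.
network = np.array([
    [224.0, 64.0, 7.0, 2.0],
    [112.0, 128.0, 3.0, 1.0],
    [56.0, 256.0, 3.0, 1.0],
])

# Whole-network estimate = sum of per-layer predictions.
per_layer_ms = model.predict(network)
print(f"Per-layer latency estimates (ms): {per_layer_ms}")
print(f"Estimated network latency: {per_layer_ms.sum():.2f} ms")

Once trained, such a model answers in milliseconds what a full on-device benchmark would take orders of magnitude longer to measure, which is the speedup the abstract reports.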

