Journal of Low Power Electronics and Applications

Open Access Article

Energy Efficiency Effects of Vectorization in Data Reuse Transformations for Many-Core Processors—A Case Study †

by Abdullah Al Hasib, Lasse Natvig, Per Gunnar Kjeldsberg and Juan M. Cebrián

J. Low Power Electron. Appl. 2017, 7(1), 5; https://doi.org/10.3390/jlpea7010005 - 22 Feb 2017

Cited by 5 | Viewed by 8282

Thread-level and data-level parallel architectures have become the design of choice in many of today’s energy-efficient computing systems. However, these architectures put substantially higher requirements on the memory subsystem than scalar architectures, making memory latency and bandwidth critical in their overall efficiency. Data reuse exploration aims at reducing the pressure on the memory subsystem by exploiting the temporal locality in data accesses. In this paper, we investigate the effects on performance and energy from a data reuse methodology combined with parallelization and vectorization in multi- and many-core processors. As a test case, a full-search motion estimation kernel is evaluated on Intel® Core™ i7-4700K (Haswell) and i7-2600K (Sandy Bridge) multi-core processors, as well as on an Intel® Xeon Phi™ many-core processor (Knights Landing) with Streaming Single Instruction Multiple Data (SIMD) Extensions (SSE) and Advanced Vector Extensions (AVX) instruction sets. Results using a single-threaded execution on the Haswell and Sandy Bridge systems show that performance and EDP (Energy Delay Product) can be improved through data reuse transformations on the scalar code by a factor of ≈3× and ≈6×, respectively. Compared to scalar code without data reuse optimization, the SSE/AVX2 version achieves ≈10×/17× better performance and ≈92×/307× better EDP, respectively. These results can be improved by 10% to 15% using data reuse techniques. Finally, the most optimized version using data reuse and AVX512 achieves a speedup of ≈35× and an EDP improvement of ≈1192× on the Xeon Phi system. While single-threaded execution serves as a common reference point for all architectures to analyze the effects of data reuse on both scalar and vector codes, scalability with thread count is also discussed in the paper.

(This article belongs to the Special Issue Emerging Network-on-Chip Architectures for Low Power Embedded Systems)

Open Access Article

A Novel Design Flow for a Security-Driven Synthesis of Side-Channel Hardened Cryptographic Modules

by Sorin A. Huss and Oliver Stein

J. Low Power Electron. Appl. 2017, 7(1), 4; https://doi.org/10.3390/jlpea7010004 - 08 Feb 2017

Cited by 9 | Viewed by 8157

Abstract

Over the last few decades, computer-aided engineering (CAE) tools have been developed and improved in order to ensure a short time-to-market in the chip design business. To date, these design tools do not support an integrated design strategy for the development of side-channel-resistant hardware implementations. In order to close this gap, a novel framework named AMASIVE (Adaptable Modular Autonomous SIde-Channel Vulnerability Evaluator) was developed. It supports the designer in implementing devices hardened against power attacks by exploiting novel security-driven synthesis methods. This article is the second of two contributions that address the AMASIVE framework. While the first one describes how the framework automatically detects vulnerabilities against power attacks, the second one explains how a design can be hardened automatically by means of appropriate countermeasures, which are tailored to the identified weaknesses. In addition to the theoretical introduction of the fundamental concepts, we demonstrate an application to the hardening of a complete hardware implementation of the block cipher PRESENT.

(This article belongs to the Special Issue Hardware Security – Threats and Countermeasures at the Circuit and Logic Levels)

Open Access Article

Completing the Complete ECC Formulae with Countermeasures

by Łukasz Chmielewski, Pedro Maat Costa Massolino, Jo Vliegen, Lejla Batina and Nele Mentens

J. Low Power Electron. Appl. 2017, 7(1), 3; https://doi.org/10.3390/jlpea7010003 - 01 Feb 2017

Cited by 11 | Viewed by 7805

Abstract

This work implements and evaluates the recent complete addition formulae for the prime order elliptic curves of Renes, Costello and Batina on an FPGA platform. We implement three different versions: (1) an unprotected architecture; (2) an architecture protected through coordinate randomization; and (3) an architecture with both coordinate randomization and scalar splitting in place. The evaluation is done through timing analysis and test vector leakage assessment (TVLA). The results show that applying an increasing level of countermeasures leads to an increasing resistance against side-channel attacks. This is the first work looking into side-channel security issues of hardware implementations of the complete formulae.

(This article belongs to the Special Issue Hardware Security – Threats and Countermeasures at the Circuit and Logic Levels)

Open Access Article

On Improving Reliability of SRAM-Based Physically Unclonable Functions

by Arunkumar Vijayakumar, Vinay C. Patil and Sandip Kundu

J. Low Power Electron. Appl. 2017, 7(1), 2; https://doi.org/10.3390/jlpea7010002 - 12 Jan 2017

Cited by 20 | Viewed by 10823

Abstract

Physically unclonable functions (PUFs) have been touted for their inherent resistance to invasive attacks and low cost in providing a hardware root of trust for various security applications. SRAM PUFs in particular are popular in industry for key/ID generation. Due to intrinsic process variations, SRAM cells, ideally, tend to have the same start-up behavior. SRAM PUFs exploit this start-up behavior. Unfortunately, not all SRAM cells exhibit reliable start-up behavior due to noise susceptibility. Hence, design enhancements are needed for improving reliability. Some of the proposed enhancements in literature include fuzzy extraction, error-correcting codes and voting mechanisms. All enhancements involve a trade-off between area/power/performance overhead and PUF reliability. This paper presents a design enhancement technique for reliability that improves upon previous solutions. We present simulation results to quantify improvement in SRAM PUF reliability and efficiency. The proposed technique is shown to generate a 128-bit key in ≤0.2 μs at an area estimate of 4538 μm$^{2}$ with error rate as low as $10^{-6}$ for intrinsic error probability of 15%.

(This article belongs to the Special Issue Hardware Security – Threats and Countermeasures at the Circuit and Logic Levels)

Open Access Editorial

Acknowledgement to Reviewers of Journal of Low Power Electronics and Applications in 2016

by Journal of Low Power Electronics and Applications Editorial Office

J. Low Power Electron. Appl. 2017, 7(1), 1; https://doi.org/10.3390/jlpea7010001 - 10 Jan 2017

Cited by 1 | Viewed by 5921

Abstract

The editors of Journal of Low Power Electronics and Applications would like to express their sincere gratitude to the following reviewers for assessing manuscripts in 2016. [...]

J. Low Power Electron. Appl., Volume 7, Issue 1 (March 2017) – 5 articles
