Hybrid Full Adders: Optimized Design, Critical Review and Comparison in the Energy-Delay Space

Giustolisi, Gianluca; Palumbo, Gaetano

doi:10.3390/electronics11193220


by Gianluca Giustolisi * and Gaetano Palumbo

Dipartimento di Ingegneria Elettrica Elettronica e Informatica (DIEEI), Università degli Studi di Catania, Viale A. Doria 6, 95125 Catania, Italy

* Author to whom correspondence should be addressed.

Electronics 2022, 11(19), 3220; https://doi.org/10.3390/electronics11193220

Submission received: 15 September 2022 / Revised: 27 September 2022 / Accepted: 3 October 2022 / Published: 8 October 2022

(This article belongs to the Special Issue Feature Papers in Circuit and Signal Processing)


Abstract

In this paper, we design and compare seven meaningful hybrid one-bit full adder topologies that are optimized in terms of energy-delay trade-offs to operate in multibit ripple carry adders. The goal is to provide the designer with a simple and powerful approach for choosing the best topology for a given power budget, speed performance, or any combination of both. The design and comparison deal with 4-bit and 8-bit ripple carry adders and exploit the derivation of the energy-efficient curves in the energy-delay space. To do so, we first define the procedures to obtain energy consumption and propagation delay by simulating a ripple carry adder designed at the transistor level. Then, we introduce a design methodology to optimize a ripple carry adder by minimizing some significant figures of merit in terms of energy-delay trade-offs. The comparison of the energy-efficient curves allows us to make a simple and effective comparison as well as to identify the best one-bit full adder topologies.

Keywords: CMOS digital integrated circuits; ripple carry adders; full adders; energy-efficient curve; energy-delay space; VLSI

1. Introduction

The one-bit full adder (OBFA) is a key element for performing calculation operations, as it is the main circuit used to implement an n-bit adder, the basic arithmetic circuit of every digital system [1,2]. Hence, the choice of a specific transistor-level OBFA topology and its corresponding design must be pursued with particular attention and care. Indeed, in a digital system, the OBFA determines the performance of the parts dedicated to the execution of arithmetic operations, thus strongly affecting the overall performance of the system in terms of speed and energy consumption.

Among the numerous topologies adopted to implement an OBFA, two general design strategies can be identified. The natural (and more trivial) way makes use of only one logic style [1,2]. In contrast, the hybrid strategy considers different logic styles that are shared and combined together to gain performance benefits [3,4].

Compared to conventional approaches based on a single logic style, hybrid topologies seem to be particularly interesting. For example, in OBFAs the benefits of transmission gate [1] or transmission function [5] logic styles can be nullified by their intrinsic lack of driving capability, a drawback that can be solved by merging in a second, different logic style. For this reason, in recent years, the hybrid approach has gained greater interest, as demonstrated by the several topologies proposed in the literature [3,4,6,7,8,9,10].

Considering the assortment of OBFAs, further extended with the introduction of hybrid structures, the comparison of different topologies at the transistor-level becomes a typical, but not trivial, target. In this scenario, the lack of effective comparison criteria makes it difficult for the designer to select the best topology for a given power budget, speed performance or any combination of both. Some comparison approaches were proposed initially in [11,12,13] and, more recently, in [3,4,6,7,8,9,10,14,15,16,17,18], where new topologies were also presented. All these approaches compared the OBFA topologies with regard to speed, energy consumption, power-delay product, or a mixture of other equivalent factors. In general, the comparison considered only a couple of design points instead of inspecting the whole energy-delay space (EDS).

Over the past two decades, the EDS has become the primary domain for comparing digital circuits and systems fairly. Indeed, exploiting the generalized figure of merit (FOM), $E^{i} D^{j}$, introduced in [19,20,21,22,23], we can fully understand the energy-delay trade-offs of a digital circuit. The generalized FOM can also be used to compare different basic arithmetic circuits or blocks in terms of energy-delay properties but, in the case of full adders, it was only used in [24] and [25]. In the first article, the comparison was conducted at the architectural level, since different structures of n-bit full adders were analyzed. In the second work, the comparison considered different OBFA topologies at the transistor level and, with the exception of two hybrid circuits, it focused on traditional and more consolidated structures. Therefore, a clear and reliable comparison of a meaningful set of hybrid OBFAs in the entire EDS is still missing.

In this paper, we compare the most promising and meaningful hybrid OBFAs in the whole EDS. The goal is to provide the designer with a simple and powerful approach for choosing the best topology for a given power budget, speed performance or any combination of both. The comparison also includes the most recent topologies and concerns the following hybrid structures:

Low-Power (LP) full adder [26];
New Hybrid Pass Static CMOS (NHPSC) full adder [4];
Hybrid full adder with 22 Transistors (HFA-22T) [18];
Hybrid full adder with Buffer and 26 Transistors (HFA-B-26T) [18];
High-Speed Hybrid Full Adder Design-2 (HSHFA-D2) [9];
High-Speed Hybrid Full Adder Design-4 (HSHFA-D4) [9];
Scalable Low-Power Hybrid Full Adder (SLPHFA) [10].

In addition, the conventional mirror CMOS (CMC) full adder is also included in the comparison so as to have a well-known and consolidated topology as a reference structure.

The comparison exploits the design methodology presented in [25], where the transistors of OBFAs were sized by minimizing the generalized FOM, $E^{i} D^{j}$, for some specific values of i and j. The procedure allowed one to build the energy-efficient curve (EEC) that was subsequently used to compare different structures. Unlike [25], where the design methodology and the comparison considered the single OBFA cell, in this case the design (and the transistor sizing) is focused on multibit ripple carry adders consisting of the hybrid OBFAs mentioned above.

The paper is structured as follows. Section 2 outlines how the EEC can be used to compare digital circuits. Section 3 briefly describes the considered OBFA topologies. Section 4 reports the adopted simulation strategies. Section 5 illustrates the optimized design strategy through the energy-efficient curve determination. Section 6 summarizes the OBFA comparison. Finally, Section 7 reports conclusions and final remarks.

2. Overview on the Energy-Delay Space

The product of energy and delay ($E D$) measures the trade-off between speed and dissipation and is a well-known figure of merit (FOM) typically used in the analysis and design of digital circuits and systems. A more general approach was proposed in the early 2000s, when the parameter $E^{i} D^{j}$ (or the interchangeable $E D^{\eta}$, with $\eta = j / i$) was introduced in [19] to account for any speed-dissipation trade-off. For this parameter, the cases of minimum energy dissipation and minimum delay are represented by $j / i = 0$ ($\eta = 0$) and $j / i = \infty$ ($\eta = \infty$), respectively. In the same paper, assuming a defined power supply of the circuit or system under consideration, the energy-efficient curve (EEC) was also introduced.

The EEC is composed of the points where the circuit or system has the minimum delay for a given energy consumption or, equivalently, the minimum energy consumption for a given delay. It is apparent that any design point above the EEC is inefficient because either more energy than necessary is dissipated at the same delay or a longer delay is incurred at the same energy.

As shown in [20], the EEC can be modeled by:

$$\left(\frac{E}{E_{0}} - 1\right) \left(\frac{D}{D_{0}} - 1\right) = 1, \tag{1}$$

where $E_{0}$ and $D_{0}$ are the asymptotes that represent the two theoretical minima for the energy and the delay, respectively. When plotted in the energy-delay space (EDS), the EEC follows the hyperbolic function shown in Figure 1. Of course, real circuits may diverge from (1) and, to model realistic cases, the parameter $\gamma$ ($0 < \gamma \leq 1$) was introduced in [21,22] to correct relationship (1) into:

$$\left(\frac{E}{E_{0}} - 1\right) \left(\frac{D}{D_{0}} - 1\right) = \gamma . \tag{2}$$

It can be demonstrated that the EEC is composed of the points that minimize the generic FOM, $E^{i} D^{j}$, as parameters i and j vary [20,24]. In other words, every optimal transistor sizing that minimizes $E^{i} D^{j}$ lies on the EEC.
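The link between the model EEC of (2) and the generalized FOM can be illustrated numerically. The following is a minimal sketch, not the procedure used in this work: it scans the hyperbola of (2) in normalized coordinates and returns the point that minimizes $E^{i} D^{j}$. The function names, the scan range `d_max` and the number of steps are assumptions chosen only for illustration.

```python
def eec_energy(d_norm, gamma=1.0):
    """Normalized energy E/E0 on the model EEC of Eq. (2) for a given D/D0 > 1."""
    if d_norm <= 1.0:
        raise ValueError("D/D0 must exceed 1 because D0 is an asymptote")
    return 1.0 + gamma / (d_norm - 1.0)


def min_fom_point(i, j, gamma=1.0, d_max=50.0, steps=200_000):
    """Scan the model EEC and return (D/D0, E/E0) that minimizes E^i * D^j.

    d_max and steps are illustrative scan settings, not values from the paper.
    """
    best = None
    for k in range(1, steps + 1):
        d = 1.0 + (d_max - 1.0) * k / steps
        e = eec_energy(d, gamma)
        fom = (e ** i) * (d ** j)
        if best is None or fom < best[0]:
            best = (fom, d, e)
    return best[1], best[2]


if __name__ == "__main__":
    for name, (i, j) in {"ED^2": (1, 2), "ED": (1, 1), "E^2D": (2, 1)}.items():
        d, e = min_fom_point(i, j)
        print(f"{name}: D/D0 = {d:.3f}, E/E0 = {e:.3f}")
```

With $\gamma = 1$, the minimum of $E D$ falls at the symmetric point of the hyperbola, while delay-weighted FOMs such as $E D^{2}$ move the optimum toward the delay asymptote and energy-weighted FOMs such as $E^{2} D$ move it toward the energy asymptote.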

The EEC represents a useful and simple tool for comparing similar digital circuits, whether they are transistor-level topologies or system architectures. Consider for example the two EECs shown in Figure 2, which refer to two different digital circuits, namely A (red) and B (blue). Assume also that the two circuits are designed in the points $P_{0}$ and $Q_{0}$ shown in the figure. Limiting the comparison to these two points only, we can conclude that circuit A performs better than B in terms of delay and, on the other hand, that circuit B is better than A in terms of dissipation.

Focusing on the two EECs, which describe the energy-delay trade-offs of their respective circuits, it becomes evident that, with respect to circuit A designed in point $P_{0}$, circuit B can be designed at the same speed but with a lower energy consumption (point $Q_{1}$) or with a lower delay at the same energy consumption (point $Q_{2}$). Conversely, with respect to circuit B designed in point $Q_{0}$, circuit A can be designed with a lower delay at the same energy consumption (point $P_{1}$) or at the same delay but with a lower energy consumption (point $P_{2}$). More generally, the two EECs in Figure 2 divide the EDS into two regions. If speed is the main specification, circuit B can be designed to perform better than A. On the other hand, if power dissipation is the main goal, circuit A can be designed to perform better than B. The best solution is achieved by considering the design points that lie on the two solid lines that belong to both circuits and by discarding the design points on the dashed lines.

3. OBFA Topologies

The OBFA structures that will be compared in the following are briefly discussed in this section. All of them receive three input bits, A, B and $C_{i}$ (i.e., the bits to be summed and the Carry Input from the previous stage), and generate two output bits, the Sum and the Carry Output, defined by

$$S = A \oplus B \oplus C_{i} \tag{3a}$$

$$C_{o} = (A + B) C_{i} + A B \tag{3b}$$

where $X \oplus Y = X \bar{Y} + \bar{X} Y$ is the xor operator.

The low-power (LP) full adder is shown in Figure 3. It is one of the first hybrid topologies and was presented in [26]. It is based on the low-power xor and xnor gates, introduced in [27], that produce the intermediate Propagate signals, $P = A \oplus B$ and $\bar{P} = \overline{A \oplus B}$. Then, the two outputs are generated by:

$$S = (A \oplus B) \bar{C_{i}} + (\overline{A \oplus B}) C_{i} \tag{4a}$$

$$C_{o} = (A \oplus B) C_{i} + (\overline{A \oplus B}) A, \tag{4b}$$

which are formally equivalent to (3). The current drawn from the power line is very small, as only two transistors are directly connected to $V_{DD}$ and GND. This greatly reduces the power dissipation. However, the current flowing through the input terminals must be taken into account, as it represents the main contribution to the energy consumption of the cell. The LP full adder belongs to the group of OBFAs with no driving capability, as its outputs are not decoupled from the inputs of the next cell. This poses a serious drawback in the propagation of the Carry signal in n-bit adders. In fact, as analyzed in [13], when cells of this type are cascaded to compose an n-bit adder, the overall propagation delay of the carry signal becomes proportional to $n^{2}$, a value that can be prohibitive when a large number of bits have to be summed.
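The formal equivalence between (3) and (4) can be checked exhaustively over the eight input combinations. The snippet below is only a behavioural sketch of the Boolean equations (the function names are illustrative) and says nothing about the transistor-level circuit of Figure 3.

```python
from itertools import product


def full_adder_ref(a, b, ci):
    """Equations (3a)-(3b): reference Sum and Carry Output."""
    s = a ^ b ^ ci
    co = ((a | b) & ci) | (a & b)
    return s, co


def full_adder_lp(a, b, ci):
    """Equations (4a)-(4b): outputs selected by the propagate signal P."""
    p = a ^ b          # P = A xor B
    p_bar = 1 - p      # complement of P (xnor)
    s = (p & (1 - ci)) | (p_bar & ci)
    co = (p & ci) | (p_bar & a)
    return s, co


# Collect every input combination on which the two formulations disagree.
mismatches = [abc for abc in product((0, 1), repeat=3)
              if full_adder_ref(*abc) != full_adder_lp(*abc)]

if __name__ == "__main__":
    print("mismatching input combinations:", mismatches)
```

The check also makes the role of the propagate signal explicit: when $P = 1$ the carry output copies $C_{i}$, whereas when $P = 0$ it copies $A$ (which then equals $B$).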

The new hybrid pass static CMOS (NHPSC) full adder is shown in Figure 4 and was proposed in [4]. The cell is a mixture of pass-transistor, transmission-gate, and standard CMOS logic. As CMOS inverters decouple the NHPSC full adder cell from the subsequent stages, it belongs to the group of OBFAs with driving capability. In the original paper, it turned out to be the best among the topologies considered, so it was included in our comparison.

The hybrid full adder with 22 transistors (HFA-22T) and the hybrid full adder with buffer and 26 transistors (HFA-B-26T) are depicted in Figure 5 and Figure 6, respectively, and were both introduced in [18]. They exploit a new simultaneous XOR-XNOR circuit that avoids glitches in the output nodes of the full adder. After introducing the XOR-XNOR circuit, the original paper also presented six new full adder topologies. Among these, we have chosen for the comparison the HFA-22T cell, which is the most promising topology with no driving capability, and the HFA-B-26T cell, which is the most promising topology with driving capability.

The high-speed hybrid full adder design-2 (HSHFA-D2), shown in Figure 7, and the high-speed hybrid full adder design-4 (HSHFA-D4), shown in Figure 8, were presented in [9] in a group of four new full adders. In this case, the full adders are also based on a new high-speed, low-power 10-T XOR–XNOR circuit, which provides simultaneous full-swing outputs and improved delay performance. Among the four topologies presented in the original paper, we have chosen for the comparison the HSHFA-D2 cell, which is the most promising topology with driving capability, and the HSHFA-D4 cell, which is the most promising topology with no driving capability.

The scalable low-power hybrid full adder (SLPHFA), shown in Figure 9, was presented in [10]. Unlike the other topologies, the carry generation section is made up of a new AND-OR module with transmission gates and complementary pass transistor logic and does not use an intermediate propagate signal. The sum section exploits two XOR modules implemented using transmission gates and pass transistors. The circuit has no driving capability.

As we mentioned in the introduction, we included the conventional mirror CMOS (CMC) full adder in the comparison so as to have a well-known and consolidated topology as a reference structure. The CMC topology is shown in Figure 10. In this circuit, the Sum output is produced by exploiting the relationship $S = A B C_{i} + \bar{C_{o}} (A + B + C_{i})$ [2]. It still remains one of the most used full adder topologies and is easily available in industrial libraries of standard cells.

Since the CMC cell produces the complemented signals, $\bar{S}$ and $\bar{C_{o}}$, multibit adders are implemented as depicted in Figure 11, where the carry path does not require any signal inversion. This implementation exploits the properties of the sum function [2]:

$$\bar{S} = S (\bar{A}, \bar{B}, \bar{C_{i}}) \tag{5a}$$

$$\bar{C_{o}} = C_{o} (\bar{A}, \bar{B}, \bar{C_{i}}) . \tag{5b}$$
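Both the mirror-adder Sum relationship and the inversion property (5) are Boolean identities, so they can be verified by enumeration. The following behavioural sketch (function names are illustrative) checks them on all eight input combinations.

```python
from itertools import product


def carry_out(a, b, ci):
    """Carry Output of Eq. (3b)."""
    return ((a | b) & ci) | (a & b)


def sum_xor(a, b, ci):
    """Sum of Eq. (3a)."""
    return a ^ b ^ ci


def sum_mirror(a, b, ci):
    """Sum obtained from the carry: S = A B Ci + not(Co) (A + B + Ci)."""
    co = carry_out(a, b, ci)
    return (a & b & ci) | ((1 - co) & (a | b | ci))


def check_identities():
    """Return True if the mirror Sum and Eqs. (5a)-(5b) hold for all inputs."""
    for a, b, ci in product((0, 1), repeat=3):
        na, nb, nci = 1 - a, 1 - b, 1 - ci
        if sum_mirror(a, b, ci) != sum_xor(a, b, ci):
            return False
        if 1 - sum_xor(a, b, ci) != sum_xor(na, nb, nci):        # Eq. (5a)
            return False
        if 1 - carry_out(a, b, ci) != carry_out(na, nb, nci):    # Eq. (5b)
            return False
    return True


if __name__ == "__main__":
    print("identities hold:", check_identities())
```

Because complementing all inputs complements both outputs, a cell that delivers $\bar{S}$ and $\bar{C_{o}}$ can feed the next cell directly, provided that the next cell receives complemented operands.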

4. Comparison Strategy and Simulations

To extract the features of the OBFAs, we examined multi-bit ripple-carry adders (RCAs) and, in particular, we focused the comparison on 4-bit and 8-bit structures. However, this is not a limitation since the same approach can be applied to any multi-bit RCA.

The multibit RCA, whose basic cell is the OBFA, is designed in a 28-nm FD-SOI CMOS technology from STMicroelectronics and, to simplify our investigation, we used regular threshold transistors only ($V_{T} \sim 350$ mV). The process allows a minimum channel size of $W_{\min} / L_{\min} = 80\ \mathrm{nm} / 30\ \mathrm{nm}$ with a nominal supply voltage of 1 V.

Each OBFA cell is described through suitable design parameters whose values are chosen with the help of the Cadence optimization tool (i.e., ADE Assembler). The design parameter values are constrained in limited ranges that are explored by the optimization tool through transient simulations of the entire multi-bit RCA. The target of the optimization process is to minimize a FOM defined with regard to delay and energy of the multi-bit RCA. As a consequence, by defining some suitable FOMs in terms of propagation delay and energy dissipation per clock cycle, the optimization tool can sweep the EDS and is able to find the EEC of the multi-bit RCA.

The schematic of the test-bench used for transient simulations is shown in Figure 12, where the n-bit RCA is realized by connecting the carry output of the ($j - 1$)-th cell to the carry input of the j-th cell [2]. In the following, to simplify the notation, $C_{j}$ refers to the carry input of the j-th cell.

4.1. Carry Propagation Delay

In RCAs, the main speed limitation is determined by the carry signal that propagates through the cells of the multi-bit structure. Therefore, we are interested in measuring the interval of time between the simultaneous application of the input signals ($A_{j}, B_{j}$ in Figure 12) and the carry output of the last cell ($C_{n}$ in Figure 12). This interval of time is measured from 50% of the input transition to 50% of the corresponding output transition and defines the Carry Propagation Delay (CPD). Obviously, we are interested in measuring the worst CPD, which occurs when all the cells of the RCA propagate the carry signal.

With the exception of the first cell, FA$_{0}$, the worst CPD is to be found among those input combinations that make the carry of every cell toggle only on the basis of the carry of the previous one. This happens when the propagate signal of such cells is high, that is for $P_{j} = A_{j} \oplus B_{j} = 1$ or, in other words, when

$$A_{j} = \bar{B_{j}} \quad (\forall j = 1, 2, \dots, n - 1) . \tag{6}$$

Under the constraint set by (6), let us define $\tau_{A}$ as the propagation delay of the carry in the generic j-th cell when $A_{j} = 1$ and $\tau_{B}$ as the propagation delay when $A_{j} = 0$. If $\tau_{A} > \tau_{B}$, it is simple to show that the worst propagation delay between the first and the last carry output (i.e., between $C_{1}$ and $C_{n}$ in Figure 12) is observed when $A_{j} = 1$ for any $j = 1, 2, \dots, n - 1$. Conversely, if $\tau_{B} > \tau_{A}$, the worst delay is observed when $A_{j} = 0$ for any $j = 1, 2, \dots, n - 1$. Both conditions must therefore be explored.

In addition, two cases of carry propagation can be distinguished. In the first case, all carry signals start from zero and, when the cell FA$_{0}$ sets $C_{1} = 1$, all subsequent carry signals propagate to one. In the second case, all carry signals start from one and, when the cell FA$_{0}$ sets $C_{1} = 0$, all subsequent carry signals propagate to zero. Since $C_{1} = A_{0} B_{0}$, the first case occurs when the product $A_{0} B_{0}$ toggles from ‘0’ to ‘1’, while the second case occurs when the product $A_{0} B_{0}$ toggles from ‘1’ to ‘0’.

Based on the discussion above, the worst CPD can be found by examining the four transitions reported in Table 1, denoted as T1, T2, T3 and T4.
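The propagation condition discussed above can be reproduced with a purely behavioural model of the RCA. The sketch below is an illustration only: it has no notion of delay, it assumes that the carry input of FA$_{0}$ is tied to zero (consistent with $C_{1} = A_{0} B_{0}$), and the function names are hypothetical. It confirms that, under constraint (6), the last carry $C_{n}$ follows the product $A_{0} B_{0}$ for both choices of $A_{j}$.

```python
def ripple_carries(a_bits, b_bits, c_in=0):
    """Return [C_1, ..., C_n] of a behavioural n-bit RCA (index 0 is FA_0)."""
    carries = []
    c = c_in
    for a, b in zip(a_bits, b_bits):
        c = ((a | b) & c) | (a & b)   # carry output of Eq. (3b)
        carries.append(c)
    return carries


def last_carry(n, a0, b0, a_rest):
    """C_n when cells 1..n-1 satisfy Eq. (6) with A_j = a_rest, B_j = not a_rest."""
    a_bits = [a0] + [a_rest] * (n - 1)
    b_bits = [b0] + [1 - a_rest] * (n - 1)
    return ripple_carries(a_bits, b_bits)[-1]


if __name__ == "__main__":
    for a_rest in (1, 0):
        for a0, b0 in ((0, 0), (0, 1), (1, 0), (1, 1)):
            print(f"A_j={a_rest}  A0={a0} B0={b0}  ->  C_n={last_carry(8, a0, b0, a_rest)}")
```

Combining the two choices of $A_{j}$ with the two directions of the $A_{0} B_{0}$ toggle gives four cases, which matches the number of transitions listed in Table 1.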

The test-bench in Figure 12 uses digital (ideal) patterns, pat-A$_{j}$ and pat-B$_{j}$, as input signals. The ideal patterns are shaped by two cascaded inverters (FO1 and FO4) to provide realistic inputs, $A_{j}$ and $B_{j}$, to the multi-bit RCA. The ideal patterns are sketched in Figure 13. The first 6-bit sequences provide the four transitions defined in Table 1 and are used for the evaluation of the CPD. Subsequent bitstreams are randomly generated and, as detailed in the next subsection, are used to estimate the energy consumption.

In the evaluation of the propagation delay we consider $\bar{C_{n}}$ as the output signal. This is because, when the RCA is made up of OBFAs with no driving capability, the last carry, $C_{n}$, is not perfectly shaped and the identification of the 50% point of the output transition may fail. Reading the output after an inverter gate allows us to read a correct output signal at the price of a negligible added delay. Moreover, since the carry propagation depends on the product $A_{0} B_{0}$, one of these two waveforms is used as the input signal. Specifically, in transitions T1 and T3 we use $B_{0}$ and in transitions T2 and T4 we consider $A_{0}$.

Hence, the worst CPD is evaluated by:

$$\mathrm{CPD} = \max \left[ \mathrm{del}(\bar{C_{n}}, B_{0})_{T1},\ \mathrm{del}(\bar{C_{n}}, A_{0})_{T2},\ \mathrm{del}(\bar{C_{n}}, B_{0})_{T3},\ \mathrm{del}(\bar{C_{n}}, A_{0})_{T4} \right], \tag{7}$$

where $\mathrm{del}(Y, X)_{T}$ represents the operator that evaluates the propagation delay between two digital signals, Y and X, at transition T.
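As an illustration of the delay operator in (7), the sketch below measures a 50%-to-50% delay on sampled waveforms by linear interpolation and then takes the maximum over a set of measured delays. It is not the measurement script used in this work: the function names are hypothetical, the waveforms are assumed to share one time axis, and each window is assumed to contain only the transition of interest.

```python
def crossing_time(t, v, level):
    """First time at which the sampled waveform v(t) crosses `level`.

    Linear interpolation is used between the two samples that bracket the crossing.
    """
    for k in range(1, len(t)):
        v0, v1 = v[k - 1], v[k]
        if (v0 - level) * (v1 - level) <= 0 and v0 != v1:
            return t[k - 1] + (level - v0) * (t[k] - t[k - 1]) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def delay_50(t, v_out, v_in, vdd=1.0):
    """Delay from 50% of the input transition to 50% of the output transition."""
    return crossing_time(t, v_out, vdd / 2) - crossing_time(t, v_in, vdd / 2)


def worst_cpd(delays):
    """Eq. (7): the worst CPD is the maximum over the measured transitions."""
    return max(delays)


if __name__ == "__main__":
    # Toy waveforms (arbitrary time units): input rises, inverted output falls later.
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    v_in = [0.0, 0.0, 1.0, 1.0, 1.0]
    v_out = [1.0, 1.0, 1.0, 0.0, 0.0]
    print("delay:", delay_50(t, v_out, v_in))
```

In practice, one such delay would be extracted for each of the transitions T1 to T4 and passed to `worst_cpd`.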

4.2. Energy Consumption

The energy consumption accounts for the average energy dissipated in the multi-bit RCA per clock cycle. It is computed by simulating the circuit in Figure 12 using random bitstreams for the digital patterns, pat-A$_{j}$ and pat-B$_{j}$. Random bitstreams are generated using the Matlab code in Figure 13. Each bitstream sequence stems from a specific seed (set by the rng function) so that the patterns are reproducible and can be used as a reference for future tests. From Matlab, the patterns stored in the A and B matrices are saved into a text file that is easily read by Cadence Virtuoso for transient simulations.
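The idea of seeded, reproducible bitstreams can be sketched as follows. This is not the Matlab code referred to above and it does not reproduce the streams generated by Matlab's rng function; it is a Python illustration of the same principle. The function names and the one-row-per-bit text layout are assumptions, since the file format actually read by the simulator is not specified here.

```python
import random


def random_patterns(n_bits, stream_length, seed):
    """Return two n_bits x stream_length matrices of random bits (A and B).

    A dedicated generator seeded with `seed` makes the patterns reproducible.
    """
    rng = random.Random(seed)
    a = [[rng.randint(0, 1) for _ in range(stream_length)] for _ in range(n_bits)]
    b = [[rng.randint(0, 1) for _ in range(stream_length)] for _ in range(n_bits)]
    return a, b


def to_text(rows):
    """Hypothetical text layout: one line of 0/1 characters per input bit."""
    return "\n".join("".join(str(bit) for bit in row) for row in rows)


if __name__ == "__main__":
    a, b = random_patterns(n_bits=4, stream_length=16, seed=1)
    print(to_text(a))
    print()
    print(to_text(b))
```

Calling `random_patterns` twice with the same seed returns identical matrices, which is the property exploited to reuse the same stimuli across all the compared topologies.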

The following simulation setup is used to evaluate the energy dissipation per clock cycle, E. We simulate the multi-bit RCA over the time window, $T_{W}$, driving it with the above-mentioned digital bitstreams of length $L_{S}$ (i.e., the variable streamLength in the Matlab code). The switching frequency, $f_{s}$, establishes the interval between two consecutive transitions ($T_{s} = 1 / f_{s}$) and is low enough to allow the signals to settle properly. Obviously, the simulation window is related to $L_{S}$ and $f_{s}$ by:

$$T_{W} = \frac{L_{S}}{f_{s}} . \tag{8}$$

For any power supply and input terminal, q, we record the overall power that enters the multi-bit RCA as $p_{q}(t) = v_{q}(t)\, i_{q}(t)$ and evaluate the corresponding average value:

$$\langle p_{q}(t) \rangle = \frac{1}{T_{W}} \int_{0}^{T_{W}} v_{q}(t)\, i_{q}(t)\, dt . \tag{9}$$

Multiplying (9) by the switching period, $T_{s} = 1 / f_{s}$, we obtain the energy dissipated across terminal q per clock cycle:

$$E_{q} = \frac{\langle p_{q}(t) \rangle}{f_{s}} = \frac{1}{L_{S}} \int_{0}^{T_{W}} v_{q}(t)\, i_{q}(t)\, dt . \tag{10}$$

Finally, summing all the terminal contributions together yields:

$$E = \frac{1}{f_{s}} \sum_{q} \langle p_{q}(t) \rangle, \tag{11}$$

which represents the desired energy consumption per clock cycle.
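Equations (9) to (11) translate directly into a post-processing step on sampled waveforms. The following is a minimal sketch under stated assumptions: each terminal is described by sampled time, voltage and current lists, the integral is approximated with the trapezoidal rule, and the function names are illustrative rather than taken from any simulator interface.

```python
def average_power(t, v, i):
    """Eq. (9): trapezoidal estimate of the mean of v(t) * i(t) over the window."""
    total = 0.0
    for k in range(1, len(t)):
        p0 = v[k - 1] * i[k - 1]
        p1 = v[k] * i[k]
        total += 0.5 * (p0 + p1) * (t[k] - t[k - 1])
    return total / (t[-1] - t[0])


def energy_per_cycle(terminals, f_s):
    """Eq. (11): sum of the average terminal powers divided by f_s.

    `terminals` is an iterable of (t, v, i) tuples, one per supply or input terminal.
    """
    return sum(average_power(t, v, i) for t, v, i in terminals) / f_s


if __name__ == "__main__":
    # Toy data: a 1 V terminal drawing a constant 1 uA over a 128-cycle window at 1 GHz.
    f_s = 1e9
    t = [k * 1e-9 for k in range(129)]
    v = [1.0] * len(t)
    i = [1e-6] * len(t)
    print("E per cycle [J]:", energy_per_cycle([(t, v, i)], f_s))
```

Since $T_{W} = L_{S} / f_{s}$, dividing the average power by $f_{s}$ is the same as dividing the integrated energy by the number of cycles $L_{S}$, as stated by (10).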

In our specific case, we set a 1-GHz switching frequency and a bitstream length, $L_{S}$, of 128. As shown in Figure 13, the simulation patterns also include the 6-bit headers for the CPD evaluation. However, they were cut off for evaluating the energy consumption.

4.3. Normalization

To compare different digital structures independently of the technology adopted, we normalize energy and delay to specific reference quantities [25]. As the reference delay, $D_{0}$, we define the average propagation delay exhibited by a symmetrical FO4 NOT gate loaded by a symmetrical FO16 inverter and driven by a minimum-size one. This is the delay of a gate in a tapered FO4 chain buffer. The reference energy, $E_{0}$, is that dissipated in an unloaded and minimum-size symmetrical inverter (i.e., $W_{p} = 2 W_{n} = 2 W_{\min}$, $L = L_{\min}$ and $C_{L} = 0$ F) during one clock cycle. The reference values in our 28-nm FD-SOI CMOS technology result in:

$$E_{0} = 183.4 \times 10^{-18}\ \mathrm{J} \tag{12a}$$

$$D_{0} = 13.5 \times 10^{-12}\ \mathrm{s} . \tag{12b}$$

5. Determination of the Energy-Efficient Curves

For each of the OBFAs described in Section 3, we designed 4-bit and 8-bit RCAs in the Cadence environment. The designs were conducted by determining the EECs using the definitions of energy and delay in Section 4.

Each OBFA is partitioned into individual sections and each section is identified by a distinct parameter (i.e., $K_{C1}$, $K_{C2}$, …) that sets the relative transistor size in the section with respect to the minimum value allowed by the process, $W_{\min} / L_{\min}$. In any section, transistors are also marked with a multiplication factor that accounts for their relative sizes in that section. So, for example, referring to the section denoted by $K_{inv}$ in Figure 4, the size of PMOS transistors is $2 \times K_{inv} \times W_{\min} / L_{\min}$, while the size of NMOS transistors is $1 \times K_{inv} \times W_{\min} / L_{\min}$. This multiplication factor allows us to specify the ratio between PMOS and NMOS transistors in the same section.
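The sizing rule can be written down in a few lines. The sketch below assumes that the channel length is kept at $L_{\min}$, so that the section parameter and the multiplication factor scale the channel width only; this is an interpretation of the aspect-ratio expression above, and the function name and dictionary keys are illustrative.

```python
W_MIN_NM = 80.0   # minimum channel width quoted in Section 4
L_MIN_NM = 30.0   # minimum channel length quoted in Section 4


def section_sizes(k_section, pmos_factor=2.0, nmos_factor=1.0):
    """Transistor dimensions of one section for a given section parameter K.

    Assumption: L stays at L_min, so only the width scales with factor * K.
    """
    return {
        "pmos_w_nm": pmos_factor * k_section * W_MIN_NM,
        "nmos_w_nm": nmos_factor * k_section * W_MIN_NM,
        "l_nm": L_MIN_NM,
    }


if __name__ == "__main__":
    # Example: a section with K = 2.5 and the 2:1 PMOS/NMOS ratio of the K_inv section.
    print(section_sizes(2.5))
```

With a single parameter per section, the optimizer changes the drive strength of the whole section while the PMOS-to-NMOS ratio fixed by the multiplication factors is preserved.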

Once the RCA is set out, the points of the EEC are determined by searching for those parameters that minimize a set of corresponding FOMs defined in the form $E^{i} D^{j}$. Specifically, the procedure regards seven FOMs that correspond to as many design points, namely $D_{\min}$, $E D^{3}$, $E D^{2}$, $E D$, $E^{2} D$, $E^{3} D$ and $E_{\min}$. The minimization process of the FOMs is executed by means of the Cadence Virtuoso optimization tool. The tool can minimize a target function by adjusting specific parameters of the OBFA (i.e., $K_{C1}$, $K_{C2}$, …) defined in suitable design intervals.
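The search for the seven design points can be illustrated with an exhaustive grid search. The sketch below is not the algorithm of the Cadence optimization tool, whose internals are not described here: it simply evaluates every parameter combination on a 0.5-step grid. The function `toy_model` is a made-up analytic stand-in for the transient simulation, with arbitrary coefficients, and all names are hypothetical.

```python
from itertools import product

# Exponents (i, j) of E^i D^j; D_min and E_min are the limiting cases.
DESIGN_POINTS = {
    "D_min": (0, 1), "ED^3": (1, 3), "ED^2": (1, 2), "ED": (1, 1),
    "E^2D": (2, 1), "E^3D": (3, 1), "E_min": (1, 0),
}


def grid(lo, hi, step=0.5):
    """Values from lo to hi (inclusive) with the given step."""
    n = int(round((hi - lo) / step))
    return [lo + k * step for k in range(n + 1)]


def toy_model(k_x, k_c):
    """Hypothetical stand-in for the simulation: returns (E/E0, D/D0)."""
    energy = 10.0 + 2.0 * k_x + 3.0 * k_c   # larger transistors cost energy
    delay = 5.0 + 8.0 / k_x + 20.0 / k_c    # larger transistors reduce delay
    return energy, delay


def optimize(model, ranges, i, j):
    """Return (fom, params, E, D) minimizing E^i * D^j over the parameter grid."""
    best = None
    for params in product(*ranges):
        e, d = model(*params)
        fom = (e ** i) * (d ** j)
        if best is None or fom < best[0]:
            best = (fom, params, e, d)
    return best


def eec_points(model, ranges):
    """One optimal design point per FOM, in the order of DESIGN_POINTS."""
    return {name: optimize(model, ranges, i, j)
            for name, (i, j) in DESIGN_POINTS.items()}


if __name__ == "__main__":
    ranges = [grid(1, 10), grid(1, 10)]     # e.g., K_X and K_C from 1 to 10
    for name, (_, params, e, d) in eec_points(toy_model, ranges).items():
        print(f"{name:6s} K = {params}  E/E0 = {e:6.2f}  D/D0 = {d:6.2f}")
```

Plotting the resulting $(D / D_{0}, E / E_{0})$ pairs, from $D_{\min}$ to $E_{\min}$, traces the EEC of the modelled circuit. An exhaustive scan is affordable only for a toy model; with transient simulations and several parameters, a dedicated optimizer is needed.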

The LP full adder, depicted in Figure 3, consists of three sections which correspond to three parameters that can be explored by the optimization tool. However, since the propagation delay of the Sum signal does not affect the CPD (which represents the main speed limitation of a multi-bit RCA), we can set $K_{S} = 1$, as this does not affect the propagation delay but reduces the energy dissipation of the RCA. As a consequence, the optimization process regards only $K_{X}$ and $K_{C}$, defined between 1 and 10 with a 0.5-step. Table 2 and Table 3 show the data returned by the minimization process for the 4-bit and the 8-bit RCA, respectively. The first two columns identify the OBFA topology and the design points, in terms of FOMs, minimized by the optimization process. The next two columns show the normalized CPD, $D / D_{0}$, and the normalized energy per clock cycle, $E / E_{0}$. The remaining columns report the design parameters that were explored by the optimization tool. The data of the two tables will be used in the following to trace the EECs of the analyzed structures; therefore, the related discussion will be conducted on subsequent figures.

The NHPSC full adder, shown in Figure 4, is made up of seven sections. Parameter $K_{inv}$ defines the section that deals with the inversion of input signals. The section dedicated to the xor/xnor operation is identified by $K_{X1}$ and $K_{X2}$. The Carry section is identified by $K_{C1}$, $K_{C2}$ and $K_{Cinv}$. Finally, $K_{S}$ identifies the Sum section. We set $K_{S} = 1$ and explore the EDS with the remaining parameters to find the EEC. Parameters $K_{inv}$, $K_{X1}$, $K_{C1}$ and $K_{Cinv}$ range from 1 to 10 while $K_{X2}$ and $K_{C2}$ range from 1 to 3, all with a 0.5-step. The results of the minimization process are reported in Table 2 and Table 3.

The HFA-22T and the HFA-B-26T full adders are depicted in Figure 5 and Figure 6 and are partitioned into four and five sections, respectively. In both circuits $K_{inv}$ identifies the section that deals with the inversion of input signals. Parameters $K_{X}$, $K_{C}$ and $K_{S}$ identify the xor/xnor section, the Carry section and the Sum section, respectively. The HFA-B-26T full adder includes a driving stage in the Carry section, identified by $K_{Cinv}$. In both circuits we set $K_{S} = 1$ and explore the EDS to find the EECs. For the HFA-22T full adder, $K_{inv}$ and $K_{X}$ range from 1 to 5 while $K_{C}$ ranges from 1 to 10, all with a 0.5-step. For the HFA-B-26T full adder, $K_{inv}$, $K_{X}$, $K_{C}$ and $K_{Cinv}$ range from 1 to 10 with a 0.5-step. The results of the minimization process are summarized in Table 2 and Table 3.

The HSHFA-D2 and the HSHFA-D4 full adders are depicted in Figure 7 and Figure 8 and are partitioned into five and three sections, respectively. In both circuits $K_{X}$, $K_{C}$ and $K_{S}$ identify the xor/xnor section, the Carry section and the Sum section, respectively. In the HSHFA-D2 full adder, $K_{inv}$ identifies the section that inverts input signals and $K_{Cinv}$ identifies the driving stage in the Carry section. Setting $K_{S} = 1$, we explore the EDS to find the EECs. For the HSHFA-D2 full adder, $K_{inv}$, $K_{C}$ and $K_{Cinv}$ range from 1 to 15 while $K_{X}$ ranges from 1 to 5, all with a 0.5-step. For the HSHFA-D4 full adder, $K_{X}$ and $K_{C}$ range from 1 to 10 with a 0.5-step. The minimization process results are reported in Table 2 and Table 3.

The SLPHFA full adder, shown in Figure 9, is partitioned into five sections. Section $K_{inv}$ deals with the inversion of input signals, sections $K_{C1}$ and $K_{C2}$ are dedicated to the Carry, section $K_{X}$ to the xor operation and $K_{S}$ to the Sum. We set $K_{S} = 1$ and, since the xor section is used to generate the Sum signal only, we can also set $K_{X} = 1$ without affecting the CPD of the RCA. The EDS is explored by ranging $K_{inv}$ and $K_{C2}$ from 1 to 5, and $K_{C1}$ from 1 to 15, all with a 0.5-step. Table 2 and Table 3 report the results of the minimization process.

The CMC full adder, shown in Figure 10, is partitioned into four sections. Parameters $K_{S1}$ and $K_{S2}$ were set to 1 so that the minimization process concerned only the parameters related to the carry section. $K_{C1}$ and $K_{C2}$ were defined from 1 to 15 and from 1 to 10, respectively, both with a step of 0.5. Table 2 and Table 3 show the results of the minimization process.

An example of the optimization process performed by the Cadence Virtuoso simulation environment is shown in Figure 14 for the two RCAs built with the HFA-B-26T one-bit full adder cell. The graph shows two energy-efficient curves, for a 4-bit and an 8-bit RCA. The optimization tool looks for seven design points by minimizing the corresponding FOM defined in terms of $E^{i} D^{j}$. Gray dots in the figure are the design points scanned by the minimization algorithm. Blue dots are the optimal design points found by the minimization algorithm, as reported in Table 2 and Table 3. Similar plots are obtained for all the remaining seven OBFA topologies but were not included because they do not provide any additional information.

6. Comparison

The data collected in Table 2 and Table 3 allow us to make an effective comparison among the various RCAs implemented with different OBFAs.

Figure 15 shows the EECs of the 4-bit RCAs implemented with the OBFAs analyzed in the previous sections. In the figure, filled symbols (squares, circles, triangles) are used to identify OBFAs with driving capability while OBFAs without driving capability are identified by empty symbols. The figure is particularly useful since it allows the designer to choose the best topology at a glance in terms of energy-delay trade-offs. Moreover, it makes clear which OBFAs should be avoided.

The figure reveals that, among the hybrid solutions, the OBFAs with driving capability (NHPSC, HFA-B-26T and HSHFA-D2) and the recent SLPHFA have similar EECs and do not provide any advantage in terms of Energy-Delay trade-offs. In fact, with respect to other solutions, the corresponding 4-bit RCAs are slower and dissipate a higher amount of energy. The CMC has better performance and is the best among the OBFAs with driving capabilities. The HFA-22T, the HSHFA-D4 and, particularly, the LP OBFA are the best choice in terms of energy-delay trade-offs for any value of energy and delay.

Figure 16 shows the EECs of the 8-bit RCAs implemented with the various OBFAs. As in Figure 15, filled symbols (squares, circles, triangles) identify OBFAs with driving capability, while empty symbols identify those without driving capability. As expected, both energy consumption and delay are higher than in the 4-bit case, since they increase with the number of bits of the RCA, n.

In the 8-bit case, among the hybrid topologies, the OBFAs with driving capability (NHPSC, HFA-B-26T and HSHFA-D2) and the recent SLPHFA are confirmed to have the worst behavior in the energy-delay space, while the HFA-22T, the HSHFA-D4 and the LP OBFA behave better since they achieve the same speed performance at the cost of a lower dissipation. However, as predicted in [13], the delays of these three latter topologies increase with $n^{2}$ so that, if n is high and speed is of primary concern, the traditional CMC OBFA still remains the best solution over any hybrid topology in the range $[D_{\min}, E D]$. Conversely, the LP OBFA confirms its superior performance if energy is the main constraint to be minimized.
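Both statements about the 8-bit case can be checked against Table 3. The following sketch assumes one possible reading of the range $[D_{\min}, E D]$, namely the delays between the CMC $D_{\min}$ and $E D$ design points, and compares them with the fastest point of every hybrid topology. The variable names are hypothetical; all numbers are copied from Table 3.

```python
# D/D0 at the D_min design point of each 8-bit RCA, copied from Table 3.
d_min_8bit = {
    "LP": 14.4, "NHPSC": 13.9, "HFA-22T": 15.6, "HFA-B-26T": 14.5,
    "HSHFA-D2": 13.9, "HSHFA-D4": 15.3, "SLPHFA": 16.3, "CMC": 10.5,
}

# E/E0 at the E_min design point of each 8-bit RCA, copied from Table 3.
e_min_8bit = {
    "LP": 32.3, "NHPSC": 69.9, "HFA-22T": 41.0, "HFA-B-26T": 62.6,
    "HSHFA-D2": 66.0, "HSHFA-D4": 39.6, "SLPHFA": 50.3, "CMC": 48.5,
}

# D/D0 of the CMC at its E*D design point, copied from Table 3.
cmc_ed_delay = 13.1

# Smallest delay reachable by any hybrid (non-CMC) topology.
fastest_hybrid = min(d for name, d in d_min_8bit.items() if name != "CMC")

# Topology with the lowest minimum energy.
lowest_energy = min(e_min_8bit, key=e_min_8bit.get)

print(f"CMC delays from D_min to E*D: {d_min_8bit['CMC']} to {cmc_ed_delay}")
print(f"Fastest hybrid delay:         {fastest_hybrid}")
print(f"Lowest-energy topology:       {lowest_energy}")
```

Under this reading, the whole CMC segment from $D_{\min}$ to $E D$ sits at delays that no hybrid topology reaches, while the LP OBFA has the lowest minimum energy.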

Finally, for the eight OBFAs analyzed, Figure 17 reports the 4-bit and the 8-bit EECs normalized to the number of bits, n. The figure confirms the behavior forecast in [13] for the topologies without driving capability. In OBFAs with driving capability (CMC, NHPSC, HFA-B-26T and HSHFA-D2) and in the SLPHFA, the 4-bit and 8-bit EECs normalized to the number of bits remain almost unchanged since both energy and delay increase proportionally to n. In HFA-22T, HSHFA-D4 and LP OBFAs the delay increases with $n^{2}$ so that the 8-bit EEC is right-shifted with respect to the corresponding 4-bit curve. It is worth noting that, even if the SLPHFA has no driving capability, it behaves as an OBFA with driving capability. This is because the input carry signal drives only transistor gates and the path between $C_{i}$ and $C_{o}$ is not modeled with an $R C$ network.
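The two scaling regimes are visible in the tabulated delays. The sketch below computes, for each OBFA, the ratio between the 8-bit and 4-bit normalized delays at the $D_{\min}$ design point (Table 3 and Table 2, respectively). Doubling n gives a ratio close to 2 for a delay proportional to n and up to 4 for a delay proportional to $n^{2}$. This is only a rough check, because the optimal transistor sizes differ between the 4-bit and 8-bit designs; the set name `quadratic_delay` is hypothetical.

```python
# D/D0 at the D_min design point, copied from Table 2 (4-bit) and Table 3 (8-bit).
d_min_4bit = {
    "LP": 4.04, "NHPSC": 7.28, "HFA-22T": 4.44, "HFA-B-26T": 7.55,
    "HSHFA-D2": 7.33, "HSHFA-D4": 4.65, "SLPHFA": 8.48, "CMC": 5.84,
}
d_min_8bit = {
    "LP": 14.4, "NHPSC": 13.9, "HFA-22T": 15.6, "HFA-B-26T": 14.5,
    "HSHFA-D2": 13.9, "HSHFA-D4": 15.3, "SLPHFA": 16.3, "CMC": 10.5,
}

# Topologies whose delay the text describes as increasing with n^2.
quadratic_delay = {"LP", "HFA-22T", "HSHFA-D4"}

# Delay ratio when the number of bits doubles from 4 to 8.
ratios = {name: d_min_8bit[name] / d_min_4bit[name] for name in d_min_4bit}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    trend = "n^2-like" if name in quadratic_delay else "n-like"
    print(f"{name:10s} 8-bit/4-bit delay ratio = {ratio:.2f} ({trend})")
```

The three topologies named in the text show ratios above 3, whereas the CMC, the OBFAs with driving capability and the SLPHFA stay below 2.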

7. Conclusions

We designed and compared seven meaningful topologies of hybrid OBFAs that were optimized in terms of energy-delay trade-offs to operate in multibit RCAs. The design and comparison dealt with 4-bit and 8-bit RCAs and exploited the derivation of the EECs in the EDS. The well-known and consolidated CMC full adder was also included in the comparison as a reference structure.

Since the derivation of the EEC implies the determination of energy consumption and propagation delay, we defined the procedures for getting them through circuit simulations. Then, we presented a design methodology to optimize an RCA by minimizing some significant FOMs (e.g., $E^{i} D^{j}$). Specifically, we partitioned each OBFA of the RCA into a few sections, each identified by an individual parameter that stood for the relative transistor sizes in the section with respect to the minimum size of the technology process, $W_{\min} / L_{\min}$. The numerical optimizer of the circuit simulator was used to minimize the FOMs and find the EEC of the RCA.

All the EECs were plotted in two different graphs, one for the 4-bit case and one for the 8-bit case. This allowed us to make a simple and effective comparison as well as to identify the best OBFA topologies. For the 4-bit case, the comparison revealed that HFA-22T, HSHFA-D4 and LP OBFAs are the best solutions, both among the hybrid topologies and compared to the traditional CMC full adder. For the 8-bit case, the comparison showed that, with respect to other hybrid topologies, HFA-22T, HSHFA-D4 and LP OBFAs behave better since they achieve the same speed performance at the cost of a lower dissipation. However, if n is large and speed is of primary concern, the traditional CMC OBFA still remains the best solution over any hybrid topology analyzed.

Author Contributions

Conceptualization, G.G. and G.P.; methodology, G.G. and G.P.; software, G.G. and G.P.; validation, G.G. and G.P.; formal analysis, G.G. and G.P.; investigation, G.G. and G.P.; resources, G.G. and G.P.; data curation, G.G. and G.P.; writing—original draft preparation, G.G. and G.P.; writing—review and editing, G.G. and G.P.; visualization, G.G. and G.P.; supervision, G.G. and G.P.; project administration, G.G. and G.P.; funding acquisition, G.G. and G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by Università degli Studi di Catania through the Project “Programma Ricerca di Ateneo UNICT 2020–22 linea 2”.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the non-disclosure agreement signed with the owner of the technology process.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

CMC	Conventional Mirror CMOS
CPD	Carry Propagation Delay
EDS	Energy-Delay Space
EEC	Energy-Efficient Curve
FOM	Figure of Merit
HFA-22T	Hybrid Full Adder with 22 Transistors
HFA-B-26T	Hybrid Full Adder with Buffer and 26 Transistors
HSHFA-D2	High-Speed Hybrid Full Adder Design-2
HSHFA-D4	High-Speed Hybrid Full Adder Design-4
LP	Low-Power
NHPSC	New Hybrid Pass Static CMOS
OBFA	One-Bit Full Adder
RCA	Ripple Carry Adder
SLPHFA	Scalable Low-Power Hybrid Full Adder

References

1. Weste, N.; Eshraghian, K. Principles of CMOS VLSI Design: A Systems Perspective, 2nd ed.; Addison-Wesley: Boston, MA, USA, 1993.
2. Rabaey, J.; Chandrakasan, A.; Nikolic, B. Digital Integrated Circuits, 2nd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2003.
3. Shams, A.; Darwish, T.; Bayoumi, M. Performance analysis of low-power 1-bit CMOS full adder cells. IEEE Trans. VLSI Syst. 2002, 10, 20–29.
4. Goel, S.; Kumar, A.; Bayoumi, M.A. Design of Robust, Energy-Efficient Full Adders for Deep-Submicrometer Design Using Hybrid-CMOS Logic Style. IEEE Trans. VLSI Syst. 2006, 14, 1309–1321.
5. Zhuang, N.; Wu, H. A new design of the CMOS full adder. IEEE J. Solid-State Circuits 1992, 27, 840–844.
6. Lin, J.F.; Hwang, Y.T.; Sheu, M.H.; Ho, C.C. A Novel High-Speed and Energy Efficient 10-Transistor Full Adder Design. IEEE Trans. Circuits Syst. I 2007, 54, 1050–1059.
7. Hassoune, I.; Flandre, D.; O’Connor, I.; Legat, J.D. ULPFA: A New Efficient Design of a Power-Aware Full Adder. IEEE Trans. Circuits Syst. I 2010, 57, 2066–2074.
8. Basireddy, H.R.; Challa, K.; Nikoubin, T. Hybrid Logical Effort for Hybrid Logic Style Full Adders in Multistage Structures. IEEE Trans. VLSI Syst. 2019, 27, 1138–1147.
9. Kandpal, J.; Tomar, A.; Agarwal, M.; Sharma, K.K. High-Speed Hybrid-Logic Full Adder Using High-Performance 10-T XOR–XNOR Cell. IEEE Trans. VLSI Syst. 2020, 28, 1413–1422.
10. Hasan, M.; Hossein, M.J.; Hossain, M.; Zaman, H.U.; Islam, S. Design of a Scalable Low-Power 1-Bit Hybrid Full Adder for Fast Computation. IEEE Trans. Circuits Syst. II 2020, 67, 1464–1468.
11. Callaway, T.; Swartzlander, E. Low Power Arithmetic Components. In Low Power Design Methodologies; Rabaey, J., Pedram, M., Eds.; Kluwer Academic Publishers: Norwell, MA, USA, 1996.
12. Zimmermann, R.; Fichtner, W. Low-power logic styles: CMOS versus pass-transistor logic. IEEE J. Solid-State Circuits 1997, 32, 1079–1090.
13. Alioto, M.; Palumbo, G. Analysis and Comparison on Full Adder Block in Submicron Technology. IEEE Trans. VLSI Syst. 2002, 10, 806–823.
14. Chang, C.H.; Gu, J.; Zhang, M. A review of 0.18-μm full adder performances for tree structured arithmetic circuits. IEEE Trans. VLSI Syst. 2005, 13, 686–695.
15. Aguirre-Hernandez, M.; Linares-Aranda, M. CMOS Full-Adders for Energy-Efficient Arithmetic Applications. IEEE Trans. VLSI Syst. 2011, 19, 718–721.
16. Purohit, S.; Margala, M. Investigating the Impact of Logic and Circuit Implementation on Full Adder Performance. IEEE Trans. VLSI Syst. 2012, 20, 1327–1331.
17. Bhattacharyya, P.; Kundu, B.; Ghosh, S.; Kumar, V.; Dandapat, A. Performance Analysis of a Low-Power High-Speed Hybrid 1-bit Full Adder Circuit. IEEE Trans. VLSI Syst. 2015, 23, 2001–2008.
18. Naseri, H.; Timarchi, S. Low-Power and Fast Full Adder by Exploring New XOR and XNOR Gates. IEEE Trans. VLSI Syst. 2018, 26, 1481–1493.
19. Pénzes, P.I.; Martin, A.J. Energy-Delay Efficiency of VLSI Computations. In Proceedings of the ACM 12th Great Lakes Symposium on VLSI, New York, NY, USA, 18–19 April 2002; pp. 104–111.
20. Zyuban, V.; Strenski, P. Unified methodology for resolving power-performance tradeoffs at the microarchitectural and circuit levels. In Proceedings of the IEEE International Symposium on Low Power Electronics and Design, Monterey, CA, USA, 14 August 2002; pp. 166–171.
21. Zyuban, V.; Strenski, P.N. Balancing hardware intensity in microprocessor pipelines. IBM J. Res. Dev. 2003, 47, 585–598.
22. Alioto, M.; Consoli, E.; Palumbo, G. General Strategies to Design Nanometer Flip-Flops in the Energy-Delay Space. IEEE Trans. Circuits Syst. I 2010, 57, 1583–1596.
23. Alioto, M.; Consoli, E.; Palumbo, G. From energy-delay metrics to constraints on the design of digital circuits. Int. J. Circ. Theor. Appl. 2012, 40, 815–834.
24. Oklobdzija, V.; Zeydel, B.; Dao, H.; Mathew, S.; Krishnamurthy, R. Comparison of high-performance VLSI adders in the energy-delay space. IEEE Trans. VLSI Syst. 2005, 13, 754–758.
25. Giustolisi, G.; Palumbo, G. Analysis and Comparison in the Energy-Delay Space of Nanometer CMOS One-Bit Full-Adders. IEEE Access 2022, 10, 75482–75494.
26. Shams, A.; Bayoumi, M. A novel high-performance CMOS 1-bit full-adder cell. IEEE Trans. Circuits Syst. II 2000, 47, 478–481.
27. Wang, J.M.; Fang, S.C.; Feng, W.S. New efficient designs for XOR and XNOR functions on the transistor level. IEEE J. Solid-State Circuits 1994, 29, 780–786.

Figure 1. Plot of the energy-efficient curve (EEC) from (1). Optimal design points, $E^{i} D^{j}$, are reported in red dots. Other less efficient points are reported in gray dots.

Figure 2. Comparison of two digital circuits by means of their energy-efficient curves (EECs). Red line: EEC of Circuit A. Blue line: EEC of Circuit B.

Figure 3. Low-power (LP) full adder. Propagate signals, $P = A \oplus B$ and $\bar{P} = \overline{A \oplus B}$, are generated using pass-transistor logic. Subsequent transmission gates produce the outputs, S and $C_{o}$.

Figure 4. New hybrid pass static CMOS (NHPSC) full adder. The cell generates internally complementary input signals, $\bar{A}$ and $\bar{C_{i}}$. Propagate signals, $P = A \oplus B$ and $\bar{P} = \overline{A \oplus B}$, are used to produce the output signals, S and $C_{o}$. The cell is decoupled from subsequent stages by CMOS inverters.

Figure 5. Hybrid full adder with 22 transistors (HFA-22T). The cell generates internally complementary input signals, $\bar{A}$ and $\bar{C_{i}}$. Propagate signals are generated using an improved version of the circuits used in the LP full adder. Output signals, S and $C_{o}$, are generated using transmission gates.

Figure 6. Hybrid full adder with buffer and 26 transistors (HFA-B-26T). The cell generates internally complementary input signals, $\bar{A}$ and $\bar{C_{i}}$. Propagate signals are generated using the same module used in HFA-22T. Output signals, $\bar{S}$ and $\bar{C_{o}}$, are generated using transmission gates. The outputs are also decoupled from subsequent stages using CMOS inverters.

Figure 7. High-speed hybrid full adder design-2 (HSHFA-D2). Complementary input signals, $\bar{A}$ and $\bar{C_{i}}$, are internally generated. Propagate signals are simultaneously generated using a new high-speed circuit. Output signals, S and $C_{o}$, are generated using transmission gates and are decoupled from subsequent stages using CMOS inverters.

Figure 8. High-speed hybrid full adder design-4 (HSHFA-D4). Propagate signals are simultaneously generated using the same module used in HSHFA-D2. Mixed logic is used to produce the output signals, S and $C_{o}$.

Figure 9. Scalable low-power hybrid full adder (SLPHFA). Complementary input signals, $\bar{A}$ and $\bar{C_{i}}$, are internally generated. Only one propagate signal is generated ($P = A \oplus B$), which is used to produce the sum signal only. The carry signal is generated using a new AND-OR module with transmission gates and complementary pass transistor logic.

Figure 10. Conventional mirror CMOS (CMC) full adder. The Sum signal is produced from the Carry one. The cell generates complemented output signals, $\bar{S}$ and $\bar{C_{o}}$.

Figure 11. CMC multibit adders are implemented with no signal inversion in the carry path.

Figure 12. Simulation schematic for testing the multi-bit RCA.

Figure 13. Input patterns of the RCA. The patterns allow us to (1) evaluate the carry propagation delay and (2) estimate the energy consumption.

Figure 14. Optimization process for two RCAs built with the HFA-B-26T one-bit full adder cell. The graph shows two energy-efficient curves, for a 4-bit and an 8-bit RCA. For each RCA, the optimization tool looks for seven design points by minimizing the corresponding FOM defined in terms of $E^{i} D^{j}$. Gray dots are those explored by the minimization algorithm. Blue dots are the optimal values found by the optimization process.

Figure 15. Comparison of 4-bit RCAs. Curves are obtained by plotting the data in columns $D / D_{0}$ and $E / E_{0}$ from Table 2. Dashed lines are used for 4-bit RCAs. Filled symbols (squares, circles, triangles) refer to OBFAs with driving capability. Empty symbols refer to OBFAs without driving capability.

Figure 16. Comparison of 8-bit RCAs. Curves are obtained by plotting the data in columns $D / D_{0}$ and $E / E_{0}$ from Table 3. Solid lines are used for 8-bit RCAs. Filled symbols (squares, circles, triangles) refer to OBFAs with driving capability. Empty symbols refer to OBFAs without driving capability.

Figure 17. Comparison between 4-bit and 8-bit EECs for the eight OBFAs analyzed. The EECs are normalized to the number of bits of the RCA, n.

Table 1. Transitions for determining the worst carry propagation delay.

	$A_{j} = 1; B_{j} = 0$	$A_{j} = 0; B_{j} = 1$
	$(j = 1, 2, \dots, n - 1)$	$(j = 1, 2, \dots, n - 1)$
	T1	T2
$A_{0} B_{0} = 0 \to 1$	$A = 00 \dots 0 \to 11 \dots 1$	$A = 00 \dots 0 \to 00 \dots 1$
	$B = 00 \dots 0 \to 00 \dots 1$	$B = 00 \dots 0 \to 11 \dots 1$
	T3	T4
$A_{0} B_{0} = 1 \to 0$	$A = 11 \dots 1 \to 11 \dots 1$	$A = 00 \dots 1 \to 00 \dots 0$
	$B = 00 \dots 1 \to 00 \dots 0$	$B = 11 \dots 1 \to 11 \dots 1$

Table 2. 4-bit RCAs: EEC design points (FOMs), normalized delay, normalized energy and relative design parameters.

OBFA	FOM	$D / D_{0}$	$E / E_{0}$	$K_{inv}$	$K_{X|X1}$	$K_{X2}$	$K_{C|C1}$	$K_{C2}$	$K_{Cinv}$
LP	$D_{\min}$	4.04	29.7		5.0		6.0
LP	$E D^{3}$	4.15	24.6		3.5		4.0
LP	$E D^{2}$	4.24	23.1		3.0		3.5
LP	$E D$	4.56	20.6		2.5		2.5
LP	$E^{2} D$	5.64	17.6		1.5		1.5
LP	$E^{3} D$	7.16	16.1		1.0		1.0
LP	$E_{\min}$	7.16	16.1		1.0		1.0
NHPSC	$D_{\min}$	7.28	73.5	6.5	5.5	2.0	7.0	2.0	5.5
NHPSC	$E D^{3}$	7.58	56.7	5.0	3.5	1.0	5.0	2.0	3.5
NHPSC	$E D^{2}$	8.05	45.3	3.0	2.0	1.0	3.5	1.0	2.5
NHPSC	$E D$	8.52	41.8	2.5	1.5	1.0	3.0	1.0	2.0
NHPSC	$E^{2} D$	9.82	36.9	1.5	1.5	1.0	2.0	1.0	1.5
NHPSC	$E^{3} D$	11.7	34.1	1.0	1.5	1.0	1.5	1.0	1.0
NHPSC	$E_{\min}$	12.2	33.6	1.0	1.5	1.0	1.0	1.0	1.0
HFA-22T	$D_{\min}$	4.44	36.0	1.5	3.5		7.5
HFA-22T	$E D^{3}$	4.61	28.9	1.0	2.0		7.0
HFA-22T	$E D^{2}$	4.61	28.9	1.0	2.0		6.5
HFA-22T	$E D$	4.81	27.1	1.0	1.5		4.5
HFA-22T	$E^{2} D$	5.75	23.4	1.0	1.0		2.5
HFA-22T	$E^{3} D$	6.30	22.5	1.0	1.0		2.0
HFA-22T	$E_{\min}$	9.09	20.7	1.0	1.0		1.0
HFA-B-26T	$D_{\min}$	7.55	74.7	8.0	2.0		7.5		6.0
HFA-B-26T	$E D^{3}$	8.44	41.9	3.5	1.0		2.5		2.5
HFA-B-26T	$E D^{2}$	8.44	41.9	3.5	1.0		2.5		2.5
HFA-B-26T	$E D$	9.59	33.9	1.0	1.0		1.5		1.5
HFA-B-26T	$E^{2} D$	10.9	30.7	1.5	1.0		1.0		1.0
HFA-B-26T	$E^{3} D$	11.8	29.7	1.0	1.0		1.0		1.0
HFA-B-26T	$E_{\min}$	11.8	29.7	1.0	1.0		1.0		1.0
HSHFA-D2	$D_{\min}$	7.33	79.6	8.5	2.5		8.0		6.0
HSHFA-D2	$E D^{3}$	7.81	49.4	4.5	1.0		3.5		3.0
HSHFA-D2	$E D^{2}$	8.16	43.7	3.5	1.0		2.5		2.5
HSHFA-D2	$E D$	9.32	35.7	2.0	1.0		1.5		1.5
HSHFA-D2	$E^{2} D$	10.6	32.4	1.5	1.0		1.0		1.0
HSHFA-D2	$E^{3} D$	11.5	31.3	1.0	1.0		1.0		1.0
HSHFA-D2	$E_{\min}$	11.5	31.3	1.0	1.0		1.0		1.0
HSHFA-D4	$D_{\min}$	4.65	33.4		3.0		5.0
HSHFA-D4	$E D^{3}$	4.78	27.9		1.5		4.0
HSHFA-D4	$E D^{2}$	4.78	27.9		1.5		4.0
HSHFA-D4	$E D$	5.19	24.8		1.0		3.0
HSHFA-D4	$E^{2} D$	5.89	22.6		1.0		2.0
HSHFA-D4	$E^{3} D$	6.67	21.5		1.0		1.5
HSHFA-D4	$E_{\min}$	8.23	20.5		1.0		1.0
SLPHFA	$D_{\min}$	8.48	48.7	3.0			8.5	3.5
SLPHFA	$E D^{3}$	8.87	36.3	2.0			4.5	2.5
SLPHFA	$E D^{2}$	9.00	35.1	2.0			4.0	2.5
SLPHFA	$E D$	10.0	30.0	1.5			2.5	2.0
SLPHFA	$E^{2} D$	12.5	25.1	1.0			1.5	1.0
SLPHFA	$E^{3} D$	13.9	23.8	1.0			1.0	1.0
SLPHFA	$E_{\min}$	13.9	23.8	1.0			1.0	1.0
CMC	$D_{\min}$	5.84	54.3				10.0	8.0
CMC	$E D^{3}$	6.11	39.9				6.0	4.5
CMC	$E D^{2}$	6.31	36.7				5.0	4.0
CMC	$E D$	7.25	29.9				3.0	2.5
CMC	$E^{2} D$	8.56	26.4				2.0	1.5
CMC	$E^{3} D$	9.57	25.2				1.5	1.5
CMC	$E_{\min}$	12.1	23.6				1.0	1.0

Table 3. 8-bit RCAs: EEC design points (FOMs), normalized delay, normalized energy and relative design parameters.

OBFA	FOM	$D / D_{0}$	$E / E_{0}$	$K_{inv}$	$K_{X|X1}$	$K_{X2}$	$K_{C|C1}$	$K_{C2}$	$K_{Cinv}$
LP	$D_{\min}$	14.4	51.7		6.5		3.0
LP	$E D^{3}$	14.8	42.7		3.0		2.5
LP	$E D^{2}$	15.3	39.5		2.5		2.0
LP	$E D$	16.3	36.3		2.0		1.5
LP	$E^{2} D$	16.6	35.6		1.5		1.5
LP	$E^{3} D$	21.3	32.3		1.0		1.0
LP	$E_{\min}$	21.3	32.3		1.0		1.0
NHPSC	$D_{\min}$	13.9	137	5.5	5.0	1.0	4.5	1.0	4.5
NHPSC	$E D^{3}$	14.4	116	4.5	3.0	1.0	4.5	1.0	3.0
NHPSC	$E D^{2}$	14.8	108	3.5	2.5	1.0	4.5	1.0	3.0
NHPSC	$E D$	17.4	85.2	2.0	2.0	1.0	2.5	1.0	2.0
NHPSC	$E^{2} D$	20.3	76.1	1.5	1.5	1.0	1.5	1.0	1.5
NHPSC	$E^{3} D$	24.5	69.9	1.0	1.5	1.0	1.0	1.0	1.0
NHPSC	$E_{\min}$	24.5	69.9	1.0	1.5	1.0	1.0	1.0	1.0
HFA-22T	$D_{\min}$	15.6	56.9	1.0	3.0		3.5
HFA-22T	$E D^{3}$	15.7	49.4	1.0	1.5		3.0
HFA-22T	$E D^{2}$	16.1	46.0	1.0	1.0		2.5
HFA-22T	$E D$	16.1	46.0	1.0	1.0		2.5
HFA-22T	$E^{2} D$	17.4	44.2	1.0	1.0		2.0
HFA-22T	$E^{3} D$	17.4	44.2	1.0	1.0		2.0
HFA-22T	$E_{\min}$	27.5	41.0	1.0	1.0		1.0
HFA-B-26T	$D_{\min}$	14.5	173	7.5	7.0		7.0		6.0
HFA-B-26T	$E D^{3}$	14.9	107	5.0	1.0		3.5		3.5
HFA-B-26T	$E D^{2}$	16.1	86.4	3.0	1.0		2.5		2.5
HFA-B-26T	$E D$	18.4	71.9	2.0	1.0		1.5		1.5
HFA-B-26T	$E^{2} D$	21.2	64.9	1.5	1.0		1.0		1.0
HFA-B-26T	$E^{3} D$	23.0	62.6	1.0	1.0		1.0		1.0
HFA-B-26T	$E_{\min}$	23.0	62.6	1.0	1.0		1.0		1.0
HSHFA-D2	$D_{\min}$	13.9	201	10.0	1.5		10.5		8.0
HSHFA-D2	$E D^{3}$	15.0	114	5.0	1.0		4.0		3.5
HSHFA-D2	$E D^{2}$	16.5	90.5	3.0	1.0		2.5		2.5
HSHFA-D2	$E D$	18.7	75.7	2.0	1.0		1.5		1.5
HSHFA-D2	$E^{2} D$	21.4	68.4	1.5	1.0		1.0		1.0
HSHFA-D2	$E^{3} D$	23.2	66.0	1.0	1.0		1.0		1.0
HSHFA-D2	$E_{\min}$	23.2	66.0	1.0	1.0		1.0		1.0
HSHFA-D4	$D_{\min}$	15.3	64.1		4.0		3.5
HSHFA-D4	$E D^{3}$	15.7	49.5		1.5		3.0
HSHFA-D4	$E D^{2}$	16.2	45.4		1.0		2.5
HSHFA-D4	$E D$	16.8	43.4		1.0		2.0
HSHFA-D4	$E^{2} D$	18.1	41.5		1.0		1.5
HSHFA-D4	$E^{3} D$	18.1	41.5		1.0		1.5
HSHFA-D4	$E_{\min}$	22.5	39.6		1.0		1.0
SLPHFA	$D_{\min}$	16.3	124	3.0			13.0	4.0
SLPHFA	$E D^{3}$	17.2	85.3	2.0			6.0	3.0
SLPHFA	$E D^{2}$	17.7	78.4	2.0			5.0	2.5
SLPHFA	$E D$	19.8	65.5	1.5			3.0	2.0
SLPHFA	$E^{2} D$	26.4	52.6	1.0			1.5	1.0
SLPHFA	$E^{3} D$	29.3	50.3	1.0			1.0	1.0
SLPHFA	$E_{\min}$	29.3	50.3	1.0			1.0	1.0
CMC	$D_{\min}$	10.5	120				12.0	6.5
CMC	$E D^{3}$	11.1	82.9				6.5	3.5
CMC	$E D^{2}$	11.5	76.2				5.5	3.0
CMC	$E D$	13.1	63.2				3.5	2.0
CMC	$E^{2} D$	16.2	54.2				2.0	1.5
CMC	$E^{3} D$	18.7	51.1				1.5	1.0
CMC	$E_{\min}$	23.7	48.5				1.0	1.0

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Giustolisi, G.; Palumbo, G. Hybrid Full Adders: Optimized Design, Critical Review and Comparison in the Energy-Delay Space. Electronics 2022, 11, 3220. https://doi.org/10.3390/electronics11193220
