Article

Standard Cell Sizing for Worst-Case Performance Optimization Considering Process Variation in Subthreshold Region

1
National ASIC System Engineering Center, Southeast University, Nanjing 210096, China
2
College of Integrated Circuit Science and Engineering, Nanjing University Posts and Telecommunications, Nanjing 210023, China
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(22), 4477; https://doi.org/10.3390/electronics13224477
Submission received: 3 October 2024 / Revised: 11 November 2024 / Accepted: 14 November 2024 / Published: 14 November 2024
(This article belongs to the Section Microelectronics)

Abstract

Ultra-low-voltage design brings considerable outcomes in power reduction and energy efficiency improvement at the cost of performance degradation and uncertainty. Conventional standard cell design methodology cannot guarantee optimal performance for subthreshold operations due to the lack of consideration of process variation. In this paper, an effective subthreshold cell sizing method is proposed to minimize the worst-case propagation delay by deriving the optimal pMOS-to-nMOS width ratio (β) analytically, which reveals the relation between the minimal worst-case delay and the process parameters and provides distinct guidance for standard cell library design. The proposed method demonstrated good agreement with the Monte Carlo SPICE simulation results and was validated at the cell level and the circuit level. At the cell level, the logic cells designed with the proposed method show at least 8.6% and 7.4% improvement, on average, for worst-case delay and energy-delay product (EDP), respectively, with an additional 3.2% energy overhead compared to the prior approaches. At the circuit level, the proposed method improves the worst-case performance and worst-case EDP of the ring oscillator by at least 15.5% and 15.0%, respectively, with a 0.9% energy penalty. Moreover, the ISCAS’89 and OpenCores circuits synthesized with the optimized cells achieve at least 6.6% worst-case performance enhancement, 6.9% power reduction, and 9.4% area saving.

1. Introduction

State-of-the-art ultra-low-voltage design decreases the supply voltage down to threshold voltage as a promising candidate to meet stringent power budgets for many applications [1,2]. However, due to the small gate voltage drive, subthreshold circuits face severe challenges in terms of over 500~1000× performance degradation [3] and uncertainty compared with super-threshold operation, which could be mitigated with customized standard cells. Commercial cell libraries are designed and characterized for super-threshold voltage operations [4,5], which require special modifications to improve performance and reduce power consumption, as well as variability for the subthreshold region.
Plenty of research has been presented to deal with subthreshold cell design [6,7,8,9,10,11,12,13,14]. The minimum-width cell design was proposed in [6,7] by breaking wider transistors into multiple fingers to mitigate the impact of the inverse narrow width effect (INWE) or the narrow width effect (NWE) for performance improvement. The optimal pMOS-to-nMOS width ratio (β) for the subthreshold domain was reevaluated in [8] to achieve equal rise and fall times. The concept of logical effort was adopted in [9,10] to perform transistor sizing for standard cells with stacking structure, which diverges from the situation in the super-threshold region when the subthreshold operation is performed. However, neither the impact of process variation nor the statistical delay distribution is considered. An analytical expression was derived in [11] to find the optimal pMOS-to-nMOS width ratio in the subthreshold region, but without the consideration of process variation. The work in [12] introduced a subthreshold cell sizing methodology by balancing the mean value of the pMOS and nMOS transistor currents, but the variance of the current distribution is neglected. In [13], although the optimization solution was finally verified with Monte Carlo (MC) simulations, the impact of process variation is not considered during cell sizing. In [14], a digital cell library was presented in the near-threshold region to obtain both high energy efficiency and optimal performance with an asymmetric gate length scheme and a forward body biasing technique. A multi-threshold-voltage and multi-channel-length standard cell library was developed in [15] to enable the fine granularity of driving strength for near-threshold and subthreshold circuit design at minimal power and area overhead. The impact of the Reverse Short Channel Effect (RSCE) and the Inverse Narrow Width Effect (INWE) on the device I-V characteristics under the subthreshold region was studied by [16] for standard cell library design.
The best switching efficiency was used as the indicator in [17] for the optimal channel length design targeting ultra-low voltages. In [18], the standard cell pMOS-to-nMOS width ratio was sized to maximize the performance with the constraint of a full diffusion layout structure to improve the circuit performance at the cost of higher energy consumption.
In most prior cell sizing methods for ultra-low-voltage design, the cell delay variation due to process mismatch is not taken into consideration, leading to a suboptimal solution for cell sizing. To demonstrate the impact of delay variation on the cell sizing solution, the trends of the nominal delay and the worst-case delay as β varies are plotted in Figure 1, based on MC simulation results of an inverter cell driving an identical one under the TSMC 28 nm process. The worst-case delay is defined as the 3σ percentile point of the delay distribution. In the super-threshold region (Figure 1a), the nominal delay reaches its minimum at nearly the same β as the worst-case delay. However, in the subthreshold region (Figure 1b), the optimal β for the nominal delay deviates from that for the worst case, so sizing for the nominal delay cannot guarantee the minimal worst-case delay and suffers from a 26.2% performance degradation.
In this work, a standard cell sizing technique is proposed to derive the optimal pMOS-to-nMOS width ratio (β) analytically for worst-case performance optimization in the subthreshold domain by considering process variation with random variables.
The main contributions of this work are summarized as follows:
  • The optimal β targeting worst-case performance was derived analytically by minimizing the 3σ percentile of the propagation delay distribution, and it has been validated under various process technologies, showing good agreement with MC SPICE simulation results.
  • The analytical expression of the optimal β reveals the relation between the optimal worst-case cell delay and the process parameters with physical insight. To be precise, the ratio of mobility, as well as the ratios of mean and variance of threshold voltage for nMOS and pMOS transistors, determine the optimal β for minimal worst-case cell delay, which provides distinct guidance for standard cell design for specific processes without time-consuming MC SPICE simulations.
  • The standard logic cells designed by the proposed optimization method were validated under the TSMC 28 nm process, and they outperform the competitive approaches with significant worst-case performance improvement and worst-case energy-delay product (EDP) reduction at both the cell level and the circuit level.
This paper is organized as follows: Section 2 derives the subthreshold worst-case delay model analytically considering process variation, and the optimal β for minimal worst-case delay is derived in Section 3. Validation results are given and compared in Section 4. Section 5 draws the conclusions.

2. Subthreshold Worst-Case Propagation Delay Model

The propagation delay (tp) for the subthreshold region can be modeled by an inverter driving an identical cell, as shown in Figure 2, where the channel widths of the nMOS and pMOS transistors are denoted as Wn and Wp, respectively, and the channel lengths of all transistors are equal to L. The ratio of the pMOS-to-nMOS width is defined as (1).
$$\beta = \frac{W_p}{W_n} \tag{1}$$
The load capacitance for the first-stage inverter in Figure 2 is denoted as CL, which represents all capacitances at node ZN, including the total drain and gate capacitances associated with all nMOS and pMOS transistors, Cn and Cp, and the wire capacitance, Cw. Since Cn and Cp are both proportional to the transistor channel area, i.e., transistor channel width, the value of Cp is β times that of Cn, and CL can be expressed as
$$C_L = C_n + C_p + C_w = (1+\beta)\,C_n + C_w \tag{2}$$
The propagation delay of the first inverter in Figure 2 can be expressed by [4].
$$t_p = \frac{t_{pHL} + t_{pLH}}{2} = \frac{V_{DD}\,C_L}{4}\left(\frac{1}{I_n} + \frac{1}{I_p}\right) \tag{3}$$
where tpHL and tpLH are the delays of high-to-low and low-to-high voltage transitions of the ZN node, and VDD is the supply voltage. In and Ip are the subthreshold drain currents of the nMOS and pMOS transistors of the first inverter, which are proportional to the ratio of channel width and length and exponentially related to threshold voltage, which can be expressed as [11].
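$$I_n = I_0\,\mu_n\,\frac{W_n}{L}\,e^{\frac{V_{gs}-V_{thn}}{n\phi_t}}\left(1-e^{-\frac{V_{ds}}{\phi_t}}\right),\qquad I_p = I_0\,\mu_p\,\frac{W_p}{L}\,e^{\frac{V_{gs}-V_{thp}}{n\phi_t}}\left(1-e^{-\frac{V_{ds}}{\phi_t}}\right) \tag{4}$$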
with
$$I_0 = C_{ox}\,(n-1)\,\phi_t^{2} \tag{5}$$
where I0 is a process-dependent parameter, Cox is the gate oxide capacitance per unit area, n is the subthreshold slope factor, Vgs and Vds are the gate-source and drain-source voltages, respectively, μn (μp) is the charge carrier mobility, Vthn (Vthp) is the threshold voltage, and ϕt is the thermal voltage.
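As a rough illustration of (4) and (5), the following Python sketch evaluates the subthreshold drain current. The function name, the argument names, and every numeric value (mobility, oxide capacitance, dimensions, n = 1.5, ϕt = 25.85 mV) are illustrative assumptions and are not parameters of any process used in this work.

```python
import math

PHI_T = 0.02585  # thermal voltage near 300 K in volts (assumed value)


def subthreshold_current(mu, c_ox, w, l, v_gs, v_ds, v_th, n=1.5, phi_t=PHI_T):
    """Drain current following (4) and (5), using voltage magnitudes."""
    i0 = c_ox * (n - 1.0) * phi_t ** 2           # process-dependent prefactor, (5)
    gate_term = math.exp((v_gs - v_th) / (n * phi_t))
    drain_term = 1.0 - math.exp(-v_ds / phi_t)   # close to 1 once V_ds >> phi_t
    return i0 * mu * (w / l) * gate_term * drain_term


# Hypothetical device numbers, chosen only to exercise the formula.
base = dict(mu=0.03, c_ox=0.02, w=100e-9, l=30e-9, v_gs=0.35, v_ds=0.35, v_th=0.40)
i_ref = subthreshold_current(**base)
i_wide = subthreshold_current(**{**base, "w": 200e-9})
i_high_vth = subthreshold_current(**{**base, "v_th": 0.40 + 1.5 * PHI_T * math.log(10)})

print(i_wide / i_ref)       # current is linear in channel width
print(i_ref / i_high_vth)   # one decade of current per n*phi_t*ln(10) of V_th shift
```

The second ratio shows why threshold voltage variation dominates in this region: a shift of roughly 90 mV under these assumed values changes the current by a factor of ten.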
By substituting the subthreshold drain current in (4) into (3) with a step input signal (Vgs = VDD) and approximating the term $1 - e^{-V_{ds}/\phi_t}$ as 1, the propagation delay for the subthreshold region can be written as
$$t_p = k \times \left[(1+\beta)\times\alpha_n + \left(1+\frac{1}{\beta}\right)\times\Lambda\times\alpha_p\right] \tag{6}$$
where the related parameters are defined as
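$$\alpha_n = e^{\frac{V_{thn}}{n\phi_t}},\quad \alpha_p = e^{\frac{V_{thp}}{n\phi_t}},\quad \Lambda = \frac{\mu_n}{\mu_p},\quad k = \frac{V_{DD}\,e^{-\frac{V_{DD}}{n\phi_t}}}{4\,I_0\,\frac{W_n}{L}\,\mu_n}\left(C_n + \frac{C_w}{1+\beta}\right) \tag{7}$$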
With process-related parameters, including αn/αp and Λ, it can be seen from (6) that the propagation delay for the subthreshold region is closely related to the pMOS-to-nMOS width ratio (β).
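The step from (3) and (4) to the compact form (6) can be checked numerically. The sketch below, with hypothetical function names and made-up device values, evaluates the delay once from the two drain currents and once from (6) with the parameters of (7); the two results should agree to rounding error for any β.

```python
import math

PHI_T = 0.02585  # thermal voltage in volts (assumed)
N = 1.5          # subthreshold slope factor (assumed)


def delay_from_currents(beta, vdd, vthn, vthp, mu_n, mu_p, i0, wn_over_l, c_n, c_w):
    """Delay from (2)-(4) with V_gs = V_DD and the drain term approximated as 1."""
    c_load = (1.0 + beta) * c_n + c_w
    i_n = i0 * mu_n * wn_over_l * math.exp((vdd - vthn) / (N * PHI_T))
    i_p = i0 * mu_p * beta * wn_over_l * math.exp((vdd - vthp) / (N * PHI_T))
    return vdd * c_load / 4.0 * (1.0 / i_n + 1.0 / i_p)


def delay_compact(beta, vdd, vthn, vthp, mu_n, mu_p, i0, wn_over_l, c_n, c_w):
    """Same delay written in the compact form (6) with the parameters of (7)."""
    alpha_n = math.exp(vthn / (N * PHI_T))
    alpha_p = math.exp(vthp / (N * PHI_T))
    lam = mu_n / mu_p
    k = (vdd * math.exp(-vdd / (N * PHI_T)) / (4.0 * i0 * wn_over_l * mu_n)
         * (c_n + c_w / (1.0 + beta)))
    return k * ((1.0 + beta) * alpha_n + (1.0 + 1.0 / beta) * lam * alpha_p)


# Hypothetical values, for illustration only.
params = dict(vdd=0.35, vthn=0.40, vthp=0.43, mu_n=0.03, mu_p=0.012,
              i0=1e-5, wn_over_l=3.0, c_n=1e-16, c_w=2e-17)
for beta in (1.0, 1.8, 2.6, 3.5):
    print(beta, delay_from_currents(beta, **params), delay_compact(beta, **params))
```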
As claimed in prior publications [19,20], the fluctuations of current and propagation delay are dominated by the threshold voltage variation at the subthreshold voltage, which is associated with the parameters αn and αp in (6). Since the threshold voltages Vthn and Vthp are Gaussian-distributed [8,12], the random variables αn and αp follow log-normal (LN) distributions, whose means and variances can be expressed as
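$$E(\alpha_n) = e^{E(V'_{thn}) + \frac{D(V'_{thn})}{2}},\quad D(\alpha_n) = \left(e^{D(V'_{thn})} - 1\right)E^{2}(\alpha_n),\qquad E(\alpha_p) = e^{E(V'_{thp}) + \frac{D(V'_{thp})}{2\beta}},\quad D(\alpha_p) = \left(e^{\frac{D(V'_{thp})}{\beta}} - 1\right)E^{2}(\alpha_p) \tag{8}$$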
with
$$E(V'_{thn}) = \frac{E(V_{thn})}{n\phi_t},\quad E(V'_{thp}) = \frac{E(V_{thp})}{n\phi_t},\quad D(V'_{thn}) = \frac{D(V_{thn})}{(n\phi_t)^{2}},\quad D(V'_{thp}) = \frac{D(V_{thp})}{(n\phi_t)^{2}} \tag{9}$$
where E(Vthp)/E(Vthn) and D(Vthp)/D(Vthn) are the mean and variance of the threshold voltage of minimum-sized pMOS/nMOS transistors, respectively. The variance of the threshold voltage of the pMOS transistor is inversely proportional to β according to Pelgrom’s law [21]. Therefore, the mean and variance of tp can be analytically derived as
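$$\begin{aligned} E(t_p) &= k(1+\beta)\,e^{E(V'_{thn}) + \frac{D(V'_{thn})}{2}} + k\left(1+\frac{1}{\beta}\right)\Lambda\,e^{E(V'_{thp}) + \frac{D(V'_{thp})}{2\beta}} \\ D(t_p) &= k^{2}(1+\beta)^{2}\left(e^{D(V'_{thn})} - 1\right)e^{2E(V'_{thn}) + D(V'_{thn})} + k^{2}\left(1+\frac{1}{\beta}\right)^{2}\Lambda^{2}\left(e^{\frac{D(V'_{thp})}{\beta}} - 1\right)e^{2E(V'_{thp}) + \frac{D(V'_{thp})}{\beta}} \end{aligned} \tag{10}$$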
which indicates that both the mean and variance of tp are highly dependent on β, as well as process-related parameters.
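The moments in (10) can be cross-checked against direct sampling. The sketch below is only an illustration: the function names and the normalized threshold statistics are made up, k is treated as a constant, and the nMOS and pMOS threshold voltages are assumed to be independent, which is what allows the two variance terms in (10) to be added.

```python
import math
import random


def lognormal_moments(m, v):
    """Mean and variance of exp(X) when X is Gaussian with mean m and variance v."""
    mean = math.exp(m + v / 2.0)
    return mean, (math.exp(v) - 1.0) * mean ** 2


def delay_moments(beta, e_n, d_n, e_p, d_p, lam, k=1.0):
    """E(t_p) and D(t_p) as in (10); the pMOS variance is scaled by 1/beta."""
    mean_n, var_n = lognormal_moments(e_n, d_n)
    mean_p, var_p = lognormal_moments(e_p, d_p / beta)
    a = k * (1.0 + beta)
    b = k * (1.0 + 1.0 / beta) * lam
    return a * mean_n + b * mean_p, a ** 2 * var_n + b ** 2 * var_p


def delay_moments_mc(beta, e_n, d_n, e_p, d_p, lam, k=1.0, trials=100_000, seed=1):
    """Sample-based estimate of the same two moments, for comparison only."""
    rng = random.Random(seed)
    a = k * (1.0 + beta)
    b = k * (1.0 + 1.0 / beta) * lam
    samples = [a * math.exp(rng.gauss(e_n, math.sqrt(d_n)))
               + b * math.exp(rng.gauss(e_p, math.sqrt(d_p / beta)))
               for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / (trials - 1)
    return mean, var


# Hypothetical normalized statistics E(V'_th), D(V'_th) and mobility ratio.
args = dict(beta=2.0, e_n=10.3, d_n=0.25, e_p=10.8, d_p=0.36, lam=2.5)
analytic = delay_moments(**args)
sampled = delay_moments_mc(**args)
print(analytic, sampled)
```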
By approximating the propagation delay in (6) to follow the LN distribution, the worst-case propagation delay in terms of the 3σ percentile point of the delay distribution can be represented as
$$t_p^{max} = e^{\mu_{t_p} + 3\sigma_{t_p}} \tag{11}$$
where the distribution parameters μ and σ can be expressed as (12) and (13), respectively, in terms of E(tp) and D(tp) in (10) by considering E(V′thn) ≫ D(Vthn) ≈ 0 and E(V′thp) ≫ D(Vthp) ≈ 0:
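$$\mu_{t_p} = \ln\frac{E(t_p)}{\sqrt{1 + \frac{D(t_p)}{E^{2}(t_p)}}} = \ln k + \ln e^{E(V'_{thn})} + \ln\left(\beta + \frac{\Lambda\Gamma}{\beta} + 1 + \Lambda\Gamma\right) \tag{12}$$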
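$$\sigma_{t_p} = \sqrt{\ln\left(1 + \frac{D(t_p)}{E^{2}(t_p)}\right)} = \sqrt{\ln\left(1 + D(V'_{thn})\,\frac{\beta^{2} + \frac{\Lambda^{2}\Gamma^{2}\Psi^{2}}{\beta}}{(\beta + \Lambda\Gamma)^{2}}\right)} \tag{13}$$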
with
$$\Gamma = \frac{e^{E(V'_{thp})}}{e^{E(V'_{thn})}},\qquad \Psi = \sqrt{\frac{D(V'_{thp})}{D(V'_{thn})}} \tag{14}$$
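The log-normal fit behind (11) to (13) is a moment-matching step, which can be sketched in a few lines of Python. The function names and the example mean and variance are hypothetical; the code uses the exact expressions on the left-hand sides of (12) and (13) and does not apply the small-variance simplification.

```python
import math


def lognormal_params(mean, var):
    """Moment matching: (mu, sigma) of a log-normal with the given mean and variance."""
    ratio = var / mean ** 2
    mu = math.log(mean / math.sqrt(1.0 + ratio))
    sigma = math.sqrt(math.log(1.0 + ratio))
    return mu, sigma


def worst_case_delay(mean, var):
    """3-sigma point of the fitted log-normal, as in (11)."""
    mu, sigma = lognormal_params(mean, var)
    return math.exp(mu + 3.0 * sigma)


mean_tp, var_tp = 2.0, 1.0   # hypothetical E(t_p), D(t_p) in arbitrary units
mu, sigma = lognormal_params(mean_tp, var_tp)
print(mu, sigma, worst_case_delay(mean_tp, var_tp))
```

Feeding the fitted μ and σ back into the log-normal mean and variance formulas should return the original E(tp) and D(tp), which is a quick sanity check on the fit.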

3. Optimization Method for Subthreshold Worst-Case Propagation Delay

According to the worst-case propagation delay model in (11), the minimal worst-case delay is achieved when μ + 3σ is minimal, which gives the optimal βopt by setting the derivative of μ + 3σ with respect to β to zero.
$$\left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}} = 0 \tag{15}$$
However, due to the complicated relations between μ + 3σ and β, as shown in (12) and (13), it is almost impossible to differentiate μ + 3σ with respect to β and solve for the optimal βopt analytically. To simplify the problem, the goal of minimizing μ + 3σ is replaced by solving for the optimal $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$ that minimize μ and σ, respectively, as formulated in (16) and derived in Section 3.1 and Section 3.2. With the optimal $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$, the optimal βopt is proved in Section 3.3 to lie between them and is estimated as their average, as shown in (17).
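$$\left.\frac{\partial\mu}{\partial\beta}\right|_{\beta=\beta_{opt}^{\mu}} = 0,\qquad \left.\frac{\partial\sigma}{\partial\beta}\right|_{\beta=\beta_{opt}^{\sigma}} = 0 \tag{16}$$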
$$\beta_{opt} = \frac{\beta_{opt}^{\mu} + \beta_{opt}^{\sigma}}{2} \tag{17}$$

3.1. Optimal β Derivation for Minimal μ of Delay Distribution

In order to achieve the minimal μ, it can be found from (12) that β affects μ only through the factor k and the last term. Minimizing μ is therefore equivalent to minimizing the product of the β-dependent factor of k in (7) and the argument of the last logarithm, which is represented as fμ(β) in (18).
$$f_{\mu}(\beta) = \left(C_n + \frac{C_w}{1+\beta}\right)\left(\beta + \frac{\Lambda\Gamma}{\beta} + 1 + \Lambda\Gamma\right) \tag{18}$$
Through (18), the optimal β for the minimal μ can be solved by taking the derivative of fμ(β) and setting it to zero, as follows:
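$$\left.\frac{\partial f_{\mu}(\beta)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\mu}} = 0 \;\Rightarrow\; \beta_{opt}^{\mu} = \sqrt{\Lambda\Gamma\left(1 + \frac{C_w}{C_n}\right)} \tag{19}$$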
It is worth noting that the derived $\beta_{opt}^{\mu}$ for the minimal μ is the same as that derived in [11], where it was used to minimize the nominal delay without considering process variation. It can be found from (19) that the total wire capacitance Cw increases the optimal β for the minimal μ. If Cw is negligible compared to Cn, the optimal β for the minimal μ simplifies to
$$\beta_{opt}^{\mu} = \sqrt{\Lambda\Gamma} \tag{20}$$
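The closed form in (19) and (20) can be compared with a brute-force minimization of fμ(β). In the sketch below, fμ is divided by Cn so that only the ratio Cw/Cn is needed; the function names, the grid search, and the value ΛΓ = 3.25 are assumptions made for illustration.

```python
import math


def f_mu(beta, lam_gamma, cw_over_cn=0.0):
    """f_mu of (18) divided by C_n, with lam_gamma = Lambda * Gamma."""
    return (1.0 + cw_over_cn / (1.0 + beta)) * (beta + lam_gamma / beta + 1.0 + lam_gamma)


def beta_opt_mu(lam_gamma, cw_over_cn=0.0):
    """Closed-form minimizer of f_mu, (19); reduces to (20) when C_w = 0."""
    return math.sqrt(lam_gamma * (1.0 + cw_over_cn))


def grid_argmin(func, lo=0.5, hi=6.0, step=1e-3):
    """Brute-force minimizer on a uniform grid, used only as a cross-check."""
    count = int(round((hi - lo) / step)) + 1
    return min((lo + i * step for i in range(count)), key=func)


lam_gamma = 3.25  # hypothetical value of Lambda * Gamma
for ratio in (0.0, 0.2):
    closed = beta_opt_mu(lam_gamma, ratio)
    numeric = grid_argmin(lambda b: f_mu(b, lam_gamma, ratio))
    print(ratio, closed, numeric)
```

A larger Cw/Cn ratio moves the minimizer to a larger β, in line with the remark on (19).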

3.2. Optimal β Derivation for Minimal σ of Delay Distribution

It can be observed from (13) that minimizing σ is equivalent to the minimization of fσ(β) as
$$f_{\sigma}(\beta) = \frac{\beta^{2} + \frac{\Lambda^{2}\Gamma^{2}\Psi^{2}}{\beta}}{(\beta + \Lambda\Gamma)^{2}} \tag{21}$$
Through (21), the optimal β for the minimal σ is the solution of the following equation, obtained by taking the derivative of fσ(β) and setting it to zero:
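$$\left.\frac{\partial f_{\sigma}(\beta)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\sigma}} = 0 \;\Rightarrow\; g_{\sigma}\left(\beta_{opt}^{\sigma}\right) = h_{\sigma}\left(\beta_{opt}^{\sigma}\right) \tag{22}$$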
where
$$g_{\sigma}(\beta) = \beta^{3},\qquad h_{\sigma}(\beta) = \frac{\Lambda\Gamma\Psi^{2}}{2}\left(3\beta + \Lambda\Gamma\right) \tag{23}$$
It can be seen from (23) that the optimal $\beta_{opt}^{\sigma}$ for minimizing σ can be obtained by finding the intersection of the cubic curve gσ(β) and the straight line hσ(β), where gσ(β) is a process-independent function of β, while hσ(β) depends on the process-dependent parameters Λ, Γ, and Ψ.
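Since (22) has no convenient closed form, the intersection can be found numerically. The sketch below uses plain bisection on gσ(β) − hσ(β); this solver choice, the function names, and the parameter values are assumptions for illustration and are not taken from this work. For positive ΛΓ and Ψ, the residual is negative at β = 0 and the cubic has a single positive root, so bisection is sufficient.

```python
import math


def f_sigma(beta, lam_gamma, psi):
    """f_sigma of (21), with lam_gamma = Lambda * Gamma."""
    return (beta ** 2 + (lam_gamma * psi) ** 2 / beta) / (beta + lam_gamma) ** 2


def beta_opt_sigma(lam_gamma, psi, iterations=200):
    """Positive root of g(beta) = h(beta) in (23), found by bisection."""
    def residual(beta):
        return beta ** 3 - 0.5 * lam_gamma * psi ** 2 * (3.0 * beta + lam_gamma)

    lo, hi = 0.0, 1.0
    while residual(hi) < 0.0:      # expand until the cubic overtakes the line
        hi *= 2.0
    for _ in range(iterations):    # the root stays bracketed by [lo, hi]
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_gamma, psi = 3.2, 1.2          # hypothetical process ratios
root = beta_opt_sigma(lam_gamma, psi)
print(root, f_sigma(root, lam_gamma, psi))
```

As a hand check, choosing ΛΓ = 4 and Ψ² = 0.4 makes β = 2 an exact solution of (22), since 2³ = (4 × 0.4 / 2)(3 × 2 + 4) = 8.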

3.3. Proof of Estimation of Optimal β for Worst-Case Delay with Optimal β for μ and σ of Delay Distribution

Since the derivative of μ + 3σ is a continuous function of β, the optimal βopt for the minimal worst-case delay is certain to lie between $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$ if the derivative takes opposite signs at these two points, as shown in (24) and (25).
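$$\left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\mu}} > 0 \quad\text{and}\quad \left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\sigma}} < 0 \tag{24}$$
$$\left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\mu}} < 0 \quad\text{and}\quad \left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\sigma}} > 0 \tag{25}$$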
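The values of $\partial(\mu + 3\sigma)/\partial\beta$ at $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$ can be represented as follows, with Cw neglected as in (20) and with σn² and σp² denoting D(V′thn) and D(V′thp), respectively:
$$\begin{aligned} \left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\mu}} &= \left[2 - \Psi^{2}\left(\beta_{opt}^{\mu} + 3\right)\right] \times \frac{3\sigma_n^{2}}{2\left(1 + \beta_{opt}^{\mu}\right)} \times \frac{1}{\left[\left(1 + \beta_{opt}^{\mu}\right)^{2} + \sigma_n^{2} + \sigma_p^{2}\beta_{opt}^{\mu}\right]\sqrt{\ln\left(1 + \frac{\sigma_n^{2} + \sigma_p^{2}\beta_{opt}^{\mu}}{\left(1 + \beta_{opt}^{\mu}\right)^{2}}\right)}} \\ \left.\frac{\partial(\mu + 3\sigma)}{\partial\beta}\right|_{\beta=\beta_{opt}^{\sigma}} &= \frac{1 - \left(\frac{\beta_{opt}^{\mu}}{\beta_{opt}^{\sigma}}\right)^{2}}{\beta_{opt}^{\sigma} + \frac{\left(\beta_{opt}^{\mu}\right)^{2}}{\beta_{opt}^{\sigma}} + 1 + \left(\beta_{opt}^{\mu}\right)^{2}} \end{aligned} \tag{26}$$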
By observing (26), the signs of the derivatives at $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$ are the same as the signs of Sμ and Sσ defined in (27), respectively, because all remaining factors are positive. These two signs can be proven to be opposite by analyzing the relation between gσ(β) and hσ(β), as demonstrated in Figure 3.
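$$S_{\mu} = 2 - \Psi^{2}\left(\beta_{opt}^{\mu} + 3\right),\qquad S_{\sigma} = 1 - \left(\frac{\beta_{opt}^{\mu}}{\beta_{opt}^{\sigma}}\right)^{2} \tag{27}$$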
Figure 3 plots the β-related functions hσ(β) and gσ(β) as a blue line and a red cubic curve, respectively. By comparing (23) and (27), it can be noticed that the analytical expressions of Sμ and Sσ have forms similar to those of hσ(β) and gσ(β); thus, Figure 3 illustrates that the signs of Sμ and Sσ, i.e., the signs of the derivatives at $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$, are always opposite. To demonstrate the relative relations between hσ(β) and gσ(β) under various process-dependent parameters, including Λ, Γ, and Ψ, three blue lines are drawn in Figure 3 to cover all cases, in which hσ(β) is larger than, equal to, or smaller than gσ(β) at β = $\beta_{opt}^{\mu}$. Taking the upper blue line for hσ(β) as an example, which is larger than gσ(β) at $\beta_{opt}^{\mu}$, i.e., $h_{\sigma}(\beta_{opt}^{\mu}) > g_{\sigma}(\beta_{opt}^{\mu})$, the signs of Sμ and Sσ can be proven to be negative and positive, respectively, as follows.
First, the sign of Sμ can be proven to be negative when $h_{\sigma}(\beta_{opt}^{\mu}) > g_{\sigma}(\beta_{opt}^{\mu})$. By substituting the expressions of hσ(β) and gσ(β) in (23), together with $\beta_{opt}^{\mu} = \sqrt{\Lambda\Gamma}$ from (20), into this condition, it can be deduced that $2/\Psi^{2} < \beta_{opt}^{\mu} + 3$, indicating that Sμ is negative according to (27).
Second, the sign of Sσ can be proven to be positive when $h_{\sigma}(\beta_{opt}^{\mu}) > g_{\sigma}(\beta_{opt}^{\mu})$. It can be seen in Figure 3 that, in this case, the x-coordinate of the intersection of hσ(β) and gσ(β), i.e., $\beta_{opt}^{\sigma}$ as defined in (22), is certain to be larger than $\beta_{opt}^{\mu}$, indicating that Sσ is positive according to (27).
Similarly, the signs of Sμ and Sσ can be proven to be positive and negative, respectively, by taking the lower blue line for hσ(β) as an example. In all, the signs of the derivatives at $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$ are proven to be opposite, so the β for the minimal worst-case delay is certain to lie between $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$, or to coincide with both of them in the case of the middle blue line; thus, it can be estimated as (17).
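The bracketing argument can also be checked numerically. The sketch below evaluates the β-dependent parts of (12) and (13) with Cw neglected, locates the three minimizers on a grid, and reports the average used in (17). The function names, the grid search, and the values of ΛΓ, Ψ, and D(V′thn) are hypothetical; how close the average comes to the true minimizer depends on these values.

```python
import math


def mu_term(beta, a):
    """beta-dependent part of mu in (12) with C_w neglected; a = Lambda * Gamma."""
    return math.log(beta + a / beta + 1.0 + a)


def sigma_term(beta, a, psi, d_n):
    """sigma in (13); d_n is the normalized nMOS threshold variance D(V'_thn)."""
    f = (beta ** 2 + (a * psi) ** 2 / beta) / (beta + a) ** 2
    return math.sqrt(math.log(1.0 + d_n * f))


def objective(beta, a, psi, d_n):
    """mu + 3*sigma up to beta-independent constants."""
    return mu_term(beta, a) + 3.0 * sigma_term(beta, a, psi, d_n)


def argmin_on_grid(func, lo=0.3, hi=10.0, step=1e-3):
    """Brute-force minimizer on a uniform grid."""
    count = int(round((hi - lo) / step)) + 1
    return min((lo + i * step for i in range(count)), key=func)


a, psi, d_n = 3.2, 1.2, 0.5          # hypothetical process ratios
beta_mu = argmin_on_grid(lambda b: mu_term(b, a))
beta_sigma = argmin_on_grid(lambda b: sigma_term(b, a, psi, d_n))
beta_best = argmin_on_grid(lambda b: objective(b, a, psi, d_n))
estimate = 0.5 * (beta_mu + beta_sigma)   # the estimate of (17)
print(beta_mu, beta_sigma, beta_best, estimate)
```

With these values Sμ is negative, so the minimizer of σ lies above that of μ; a small Ψ such as 0.3 reverses the order, corresponding to the lower blue line in Figure 3.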
Several useful conclusions could be drawn based on the above analytical derivation to reveal the relation between the optimal βopt and process parameters with physical insight.
Firstly, whether the optimal βopt for minimal worst-case propagation delay is larger or smaller than $\beta_{opt}^{\mu}$ is determined by the ratio of the standard deviations of the threshold voltages of the pMOS and nMOS transistors, i.e., Ψ. As can be seen in (27), the magnitude of Ψ affects the signs of Sμ and Sσ, as well as the relative relation between βopt and $\beta_{opt}^{\mu}$.
Secondly, Ψ also sets the slope and intercept of hσ(β), and thereby determines the impact of process variation on the optimal βopt. Specifically, the smaller Ψ is, the smaller the slope and intercept of hσ(β) are, and the larger the deviation of the optimal βopt from $\beta_{opt}^{\mu}$.
Thirdly, the optimal βopt for worst-case propagation delay is only dependent on the ratio of mobility, as well as the ratios of mean and variance of threshold voltage for nMOS and pMOS transistors. In other words, it is independent of supply voltage and valid for any corners in the subthreshold domain.

4. Validation Results and Discussion

4.1. Validation of the Proposed Method at Gate Level

The analytically derived optimal βopt for the worst-case subthreshold operation was validated by MC SPICE simulation results under various process technologies. Compared with the competitive approaches in [4,11,18], which neglect the impact of the process variation in the subthreshold region, the optimal βopt derived in this work is highly consistent with the MC simulation results for all validated processes, as shown in Table 1. For all processes, 10K trials of MC SPICE simulations were performed with the HSPICE tool at the TT corner with a supply voltage of 0.35 V and a temperature of 25 °C to evaluate the worst-case propagation delay of the inverter for each specific β, which was swept upward from an initial value of 1.0. It can be seen that for most processes, a higher β is required by the proposed standard cell sizing solution to compensate for the impact of process variation in the subthreshold region. Only for the TSMC 40 nm process is the optimal βopt smaller than in the case of subthreshold optimization without the consideration of process variation [11], indicating that cell area can be saved while minimizing the worst-case propagation delay. The optimal $\beta_{opt}^{\mu}$ and $\beta_{opt}^{\sigma}$ for the minimal μ and σ are also compared with the optimal βopt in Table 1, where the former is adopted as the optimal solution in [11]. It was found that the divergence between the optimal βopt and the optimal $\beta_{opt}^{\mu}$ / $\beta_{opt}^{\sigma}$ ranges between 19% and 33% for the various processes.
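The sweep procedure can be mimicked without SPICE by sampling the analytical delay model (6) directly. The sketch below is only a stand-in for the HSPICE flow described above: it neglects Cw, expresses the delay in units of k, takes the 3σ point as the 99.865th percentile (an assumption about how the percentile is defined), and uses made-up threshold statistics. All function names are hypothetical.

```python
import math
import random


def percentile(values, q):
    """Smallest sample whose empirical CDF reaches q (0 < q <= 1)."""
    ordered = sorted(values)
    return ordered[max(0, math.ceil(q * len(ordered)) - 1)]


def sweep_beta(betas, e_n, s_n, e_p, s_p, lam, trials=10_000, seed=0, q=0.99865):
    """Worst-case delay of model (6) per beta, in units of k (C_w neglected).

    e_*, s_* are the mean and standard deviation of the normalized threshold
    voltage V'_th = V_th / (n * phi_t) for a minimum-sized device.
    """
    rng = random.Random(seed)
    # Common random numbers: the same standard-normal draws are reused for every beta.
    draws = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(trials)]
    worst = {}
    for beta in betas:
        s_p_beta = s_p / math.sqrt(beta)   # Pelgrom scaling of the pMOS sigma
        delays = [(1.0 + beta) * math.exp(e_n + s_n * zn)
                  + (1.0 + 1.0 / beta) * lam * math.exp(e_p + s_p_beta * zp)
                  for zn, zp in draws]
        worst[beta] = percentile(delays, q)
    return worst


betas = [round(1.0 + 0.1 * i, 1) for i in range(31)]      # 1.0, 1.1, ..., 4.0
worst = sweep_beta(betas, e_n=10.3, s_n=0.5, e_p=10.8, s_p=0.8, lam=2.5)
beta_best = min(worst, key=worst.get)
print(beta_best, worst[beta_best])
```

Reusing the same random draws for every β keeps the sweep curve smooth, so the location of the minimum is less affected by sampling noise than it would be with independent draws per point.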
The proposed subthreshold cell sizing method was applied to standard cell design under the process of TSMC 28 nm, as well as the approaches in [3,10,11]. For all designed cells, the transistor channel lengths were kept at the minimum, and the consistent layout area constraint was applied for each cell to make a fair comparison in terms of the worst-case propagation delay, energy consumption, and energy-delay product (EDP).
In order to validate the improvement in the optimal βopt derived in this work for various logic structures of cells, Table 2 shows the validation results for the standard cells using different methods at 0.35 V, 25 °C, and TT corner with 10K MC SPICE simulations, where Ave. Incr. in the last row indicates the average increase in our method compared with others. Compared with the method derived for the super-threshold region [4], the proposed statistical optimization method reduces the worst-case propagation delay, energy consumption, and EDP by 15.7%, 10.5%, and 26.6% on average, respectively. Compared with the method for the subthreshold region without considering process variation [11], the proposed method shows an average of 8.6% and 7.4% reduction in terms of worst-case propagation delay and EDP, with a slight increase in energy consumption of 2.2%. Compared with the method by balancing the mean of the pMOS and nMOS transistor current distributions in [12] for the subthreshold region, the proposed method reduces the worst-case propagation delay and worst-case EDP by 12.1% and 11.9% at the cost of an additional 3.2% worst-case energy consumption. Compared with the method in [18] to improve the circuit performance with the constraint of a full diffusion layout structure, the proposed method still reduces the worst-case propagation delay, energy consumption, and EDP by 5.6%, 15.8%, and 26.7% on average, respectively.
In order to validate the improvement in the optimal βopt derived in this work for various subthreshold corners with different voltages and temperatures, the standard cells designed with different methods are further compared at other corners by MC SPICE simulation with a supply voltage between 0.25 V and 0.35 V and temperatures ranging from −40 °C to 125 °C, as shown in Table 3. It can be seen that the proposed method outperforms the others in terms of worst-case propagation delay, similar to the results at the 0.35 V and 25 °C corner.

4.2. Validation of the Proposed Method at Circuit Level

The standard logic cells designed under the process of TSMC 28 nm technology by different optimization methods were validated and compared at the circuit level by a ring oscillator and several ISCAS’89 benchmark circuits.
The ring oscillator was implemented with nine identically sized inverters, whose worst-case period, worst-case energy consumption, and worst-case EDP are listed in Table 4. It shows a similar tendency as the results for standard cells. In detail, compared with [4,11,12,18], the worst-case period (worst-case EDP) of the ring oscillator using the cells by this work can be reduced by 21.6% (25.2%), 15.5% (15.0%), 25.8% (22.9%), and 5.2% (16.3%), respectively, indicating significant performance improvement compared to prior solutions when considering the nontrivial impact due to process variation in the worst case. Moreover, 4.5% reduction, 0.9% penalty, and 1.1% and 11.6% reduction for the worst-case energy consumption can be observed compared with [4,11,12,18], showing that the energy overhead paid for the optimal βopt is acceptable.
The standard cell libraries were validated and compared in terms of frequency, power, and area with the synthesis results of ISCAS’89 and OpenCores benchmark circuits, as shown in Table 5, where the number of cells (# Cells) in the synthesized circuit netlist indicates the complexity of each circuit. Ave. Impr. in the last row indicates the average improvement in our method compared with others by increasing frequency and decreasing power and area. It was found that the proposed subthreshold cell sizing method outperforms the competitive methods with at least 6.6% performance improvement, 6.9% power reduction, and 9.4% area reduction on average, indicating the overall performance, power, and area (PPA) enhancement of standard cells optimized with the proposed sizing solution. Owing to the standard cell library designed with the proposed method, the synthesized circuits demonstrate a good balance among performance, power, and area, leading to performance improvement for the subthreshold circuit, as well as power and area cost savings compared with prior methods.

5. Conclusions

Improving the worst-case performance is critical for subthreshold standard cell and circuit design when the impact of process variation cannot be neglected. With the consideration of process variation, the optimal βopt is derived analytically to minimize the 3σ percentile point of the delay distribution, which reveals the relation between the optimal worst-case cell delay and the process parameters with physical insight. Validation results show significant improvement in worst-case delay, energy, and EDP at the gate and circuit levels. In future work, the statistical impact of more layout-dependent effects, such as the Reverse Short Channel Effect (RSCE) and the Inverse Narrow Width Effect (INWE), will be considered in depth for the robustness of standard cell design in the subthreshold domain.

Author Contributions

P.C. and J.G. organized this work. P.C. and J.G. performed the modeling, simulation, and experiment work. The manuscript was written and edited by P.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant (62174031), in part by the Natural Science Foundation of Jiangsu Province (BK20240637) and in part by the Fundamental Research Funds for the Central Universities.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Paul, S.; Honkote, V.; Kim, R.; Majumder, T.; Aseron, P.; Grossnickle, V.; Sankman, R.; Mallik, D.; Jain, S.; Vangal, S.; et al. An energy harvesting wireless sensor node for IoT systems featuring a near-threshold voltage IA-32 microcontroller in 14 nm tri-gate CMOS. In Proceedings of the 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), Honolulu, HI, USA, 15–17 June 2016; pp. 1–2. [Google Scholar]
  2. Shi, W.; Pan, A.; Yu, S.; Choy, C.-S. A Subthreshold Baseband Processor Core Design With Custom Modules and Cells for Passive RFID Tags. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2018, 37, 159–167. [Google Scholar] [CrossRef]
  3. Dreslinski, R.G.; Wieckowski, M.; Blaauw, D.; Sylvester, D.; Mudge, T. Near-threshold computing: Reclaiming Moore’s law through energy efficient integrated circuits. Proc. IEEE 2010, 98, 253–266. [Google Scholar] [CrossRef]
  4. Rabaey, J.M.; Chandrakasan, A.; Nikolic, B. Digital Integrated Circuits: A Design Perspective; Prentice-Hall: Upper Saddle River, NJ, USA, 2003. [Google Scholar]
  5. Singh, K.; De Gyvez, J.P. Twenty years of near/sub-threshold design trends and enablement. IEEE Trans. Circuits Syst. II: Express Briefs 2020, 68, 5–11. [Google Scholar] [CrossRef]
  6. Muker, M.; Shams, M. Designing digital subthreshold CMOS circuits using parallel transistor stacks. Electron. Lett. 2011, 47, 372. [Google Scholar] [CrossRef]
  7. Zhou, J.; Jayapal, S.; Busze, B.; Huang, L.; Stuyt, J. A 40 nm Dual-Width Standard Cell Library for Near/Sub-Threshold Operation. IEEE Trans. Circuits Syst. I Regul. Pap. 2012, 59, 2569–2577. [Google Scholar] [CrossRef]
  8. Liu, B.; Ashouei, M.; Gemmeke, T.; de Gyvez, J.P. Sub-threshold custom standard cell library validation. In Proceedings of the Fifteenth International Symposium on Quality Electronic Design, Santa Clara, CA, USA, 3–5 March 2014; pp. 257–262. [Google Scholar]
  9. Keane, J.; Eom, H.; Kim, T.-H.; Sapatnekar, S.; Kim, C. Subthreshold logical effort: A systematic framework for optimal subthreshold device sizing. In Proceedings of the 43rd Annual Conference on Design Automation, San Francisco, CA, USA, 24–28 July 2006; p. 425. [Google Scholar]
  10. Lin, X.; Wang, Y.; Pedram, M. Joint sizing and adaptive independent gate control for FinFET circuits operating in multiple voltage regimes using the logical effort method. In Proceedings of the 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, USA, 18–21 November 2013; pp. 444–449. [Google Scholar]
  11. Nabavi, M.; Ramezankhani, F.; Shams, M. Optimum PMOS-to-NMOS Width Ratio for Efficient Subthreshold CMOS Circuits. IEEE Trans. Electron Devices 2016, 63, 916–924.
  12. Liu, B.; Ashouei, M.; Huisken, J.; De Gyvez, J.P. Standard cell sizing for subthreshold operation. In Proceedings of the 49th Annual Design Automation Conference, San Francisco, CA, USA, 3–7 June 2012; p. 962.
  13. Kim, T.-H.; Keane, J.; Eom, H.; Kim, C.H. Utilizing Reverse Short-Channel Effect for Optimal Subthreshold Circuit Design. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2007, 15, 821–829.
  14. Jun, J.; Song, J.; Kim, C. A Near-Threshold Voltage Oriented Digital Cell Library for High-Energy Efficiency and Optimized Performance in 65nm CMOS Process. IEEE Trans. Circuits Syst. I Regul. Pap. 2018, 65, 1567–1580.
  15. Zhang, H.; He, W.; Sun, Y.; Seok, M.M. An energy-efficient logic cell library design methodology with fine granularity of driving strength for near- and sub-threshold digital circuits. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea, 22–28 May 2021; pp. 1–5.
  16. Sasipriya, P. Design and Characterization of Standard Cell Libraries for Optimal Subthreshold Circuits. In Proceedings of the Innovations in Power and Advanced Computing Technologies (i-PACT), Kuala Lumpur, Malaysia, 8–10 December 2023; pp. 1–5.
  17. Chen, Y.; Nie, Y.; Jiao, H. An ultralow-power 65-nm standard cell library for near/subthreshold digital circuits. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2022, 30, 676–680.
  18. Lim, Y.W.; Kamsani, N.A.; Sidek, R.M.; Hashim, S.J.; Rokhani, F.Z. Energy-Performance Optimization via P/N Ratio Sizing With Full Diffusion Layout Structure and Standard Cell Height Tuning in Near-Threshold Voltage Operation. IEEE Access 2022, 11, 12536–12546.
  19. Zhai, B.; Hanson, S.; Blaauw, D.; Sylvester, D. Analysis and mitigation of variability in subthreshold design. In Proceedings of the 2005 International Symposium on Low Power Electronics and Design (ISLPED ’05), San Diego, CA, USA, 8–10 August 2005; pp. 20–25.
  20. Drego, N.; Chandrakasan, A.; Boning, D. Lack of Spatial Correlation in MOSFET Threshold Voltage Variation and Implications for Voltage Scaling. IEEE Trans. Semicond. Manufact. 2009, 22, 245–255.
  21. Pelgrom, M.J.M.; Duinmaijer, A.C.J.; Welbers, A.P.G. Matching properties of MOS transistors. IEEE J. Solid-State Circuits 1989, 24, 1433–1439.
Figure 1. SPICE simulation results of the nominal and worst-case propagation delay for inverter under TSMC 28 nm (a) super-threshold region (1.1 V) and (b) subthreshold region (0.35 V).
Figure 2. Inverter driving an identical inverter.
Figure 3. Derivation of the signs of Sμ and Sσ by the relation of hσ(β) and gσ(β).
Table 1. Comparison of optimal β between analytical models and MC SPICE simulation results for various process technologies.

| β | TSMC 28 nm | TSMC 40 nm | SMIC 40 nm | TSMC 65 nm |
|---|---|---|---|---|
| MC SPICE Sim. | 2.6 (−2%) | 1.7 (−5%) | 2.7 (−2%) | 2.2 (3%) |
| [4] | 1.25 (−53%) | 1.51 (−16%) | 2.06 (−27%) | 1.58 (−26%) |
| $\beta_{opt}^{\mu}$ [11] | 1.81 (−31%) | 2.38 (33%) | 1.98 (−30%) | 1.72 (−19%) |
| $\beta_{opt}^{\sigma}$ | 3.47 (31%) | 1.20 (−33%) | 3.66 (30%) | 2.54 (19%) |
| [18] | 1.51 (−43%) | 1.40 (−22%) | 1.72 (−39%) | 1.35 (−37%) |
| This work | 2.64 | 1.79 | 2.82 | 2.13 |
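The percentages in parentheses in Table 1 appear to be each model's deviation from the β in the "This work" row. The minimal Python sketch below recomputes them under that assumption; the dictionary keys and the helper name `deviation_pct` are illustrative and do not come from the paper. The MC SPICE row is left out because its β values are printed to one decimal only.

```python
# β values transcribed from Table 1.
# Column order: TSMC 28 nm, TSMC 40 nm, SMIC 40 nm, TSMC 65 nm.
this_work = [2.64, 1.79, 2.82, 2.13]
models = {
    "[4]": [1.25, 1.51, 2.06, 1.58],
    "beta_opt_mu [11]": [1.81, 2.38, 1.98, 1.72],
    "beta_opt_sigma": [3.47, 1.20, 3.66, 2.54],
    "[18]": [1.51, 1.40, 1.72, 1.35],
}


def deviation_pct(value, reference):
    """Signed deviation of `value` from `reference`, rounded to a whole percent."""
    return round(100.0 * (value - reference) / reference)


# Assumption: the reference for every percentage is the "This work" β.
deviations = {
    name: [deviation_pct(v, r) for v, r in zip(values, this_work)]
    for name, values in models.items()
}

for name, row in deviations.items():
    print(name, row)
```

Under this reading the recomputed values match the printed percentages for all four model rows.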
Table 2. Comparison of worst-case propagation delay, energy consumption, and energy-delay product for standard logic cells operating at 0.35 V, 25 °C, TT corners under TSMC 28 nm process.

Worst-case propagation delay (ps):

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 76.4 | 71.0 | 71.3 | 68.3 | 64.0 |
| NAND2 | 98.6 | 93.7 | 102.2 | 96.4 | 90.6 |
| NOR2 | 198.1 | 177.8 | 167.0 | 162.4 | 155.0 |
| AOI21D | 215.6 | 198.0 | 202.2 | 195.9 | 183.6 |
| OAI21D | 93.7 | 85.1 | 99.6 | 81.1 | 77.0 |
| Ave. Incr. (%) | 15.7 | 8.6 | 12.1 | 5.6 | 0.0 |

Worst-case energy consumption (fJ):

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 0.211 | 0.185 | 0.186 | 0.213 | 0.189 |
| NAND2 | 0.206 | 0.179 | 0.177 | 0.222 | 0.182 |
| NOR2 | 0.223 | 0.192 | 0.181 | 0.231 | 0.193 |
| AOI21D | 0.341 | 0.297 | 0.291 | 0.349 | 0.302 |
| OAI21D | 0.087 | 0.078 | 0.081 | 0.102 | 0.082 |
| Ave. Incr. (%) | 10.5 | −2.2 | −3.2 | 15.8 | 0.0 |

Worst-case energy-delay product (fJ × ps):

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 15.7 | 12.7 | 13.1 | 14.9 | 11.6 |
| NAND2 | 20.0 | 16.4 | 18.0 | 21.9 | 16.0 |
| NOR2 | 42.3 | 31.6 | 31.6 | 38.4 | 28.0 |
| AOI21D | 71.9 | 56.7 | 62.5 | 69.7 | 53.4 |
| OAI21D | 7.7 | 6.1 | 6.3 | 8.5 | 5.6 |
| Ave. Incr. (%) | 26.6 | 7.4 | 11.9 | 26.7 | 0.0 |
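Table 3. Comparison of worst-case propagation delay for standard logic cells at corners under TSMC 28 nm process (unit: ps).

0.35 V, −40 °C:

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 287 | 268 | 265 | 247 | 228 |
| NAND2 | 346 | 320 | 333 | 304 | 282 |
| NOR2 | 972 | 892 | 853 | 724 | 756 |
| AOI21D | 1010 | 903 | 959 | 806 | 767 |
| OAI21D | 456 | 405 | 426 | 377 | 343 |
| Ave. Incr. (%) | 22.0 | 14.5 | 16.1 | 4.9 | 0.0 |

0.35 V, 125 °C:

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 33 | 34 | 34 | 35 | 32 |
| NAND2 | 39 | 39 | 39 | 38 | 36 |
| NOR2 | 191 | 171 | 177 | 169 | 154 |
| AOI21D | 217 | 196 | 205 | 194 | 174 |
| OAI21D | 97 | 89 | 99 | 86 | 80 |
| Ave. Incr. (%) | 13.5 | 8.8 | 12.7 | 8.2 | 0.0 |

0.25 V, −40 °C:

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 4914 | 4844 | 4954 | 4928 | 4791 |
| NAND2 | 8898 | 8307 | 8897 | 6926 | 6877 |
| NOR2 | 22,939 | 21,278 | 20,797 | 18,194 | 17,212 |
| AOI21D | 19,177 | 17,248 | 18,517 | 14,716 | 14,310 |
| OAI21D | 8106 | 7350 | 7669 | 6267 | 6119 |
| Ave. Incr. (%) | 20.0 | 14.2 | 17.2 | 2.8 | 0.0 |

0.25 V, 125 °C:

| Cell | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| INV | 154 | 153 | 156 | 158 | 152 |
| NAND2 | 129 | 118 | 126 | 121 | 112 |
| NOR2 | 2023 | 1855 | 1919 | 1505 | 1589 |
| AOI21D | 1682 | 1542 | 1584 | 1400 | 1340 |
| OAI21D | 716 | 644 | 760 | 609 | 554 |
| Ave. Incr. (%) | 15.8 | 9.4 | 14.7 | 3.8 | 0.0 |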
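The "Ave. Incr." rows of Table 3 are consistent with averaging, over the five cells, the relative delay reduction (d_prior − d_ours) / d_prior. This definition is inferred from the tabulated numbers rather than quoted from the text. The sketch below checks it for the 0.35 V, −40 °C block; the names `mean_reduction_pct`, `delays`, and `reported` are illustrative. Because the delays are printed as integers, a recomputed average can differ from the table by about 0.1.

```python
# Worst-case delays (ps) at 0.35 V, -40 °C, transcribed from Table 3.
# Row order: INV, NAND2, NOR2, AOI21D, OAI21D.
delays = {
    "[4]": [287, 346, 972, 1010, 456],
    "[11]": [268, 320, 892, 903, 405],
    "[12]": [265, 333, 853, 959, 426],
    "[18]": [247, 304, 724, 806, 377],
}
ours = [228, 282, 756, 767, 343]

# "Ave. Incr. (%)" values as printed in the table.
reported = {"[4]": 22.0, "[11]": 14.5, "[12]": 16.1, "[18]": 4.9}


def mean_reduction_pct(prior, proposed):
    """Average of (prior - proposed) / prior over all cells, in percent."""
    ratios = [(p - q) / p for p, q in zip(prior, proposed)]
    return 100.0 * sum(ratios) / len(ratios)


# Assumption: the average is taken over per-cell ratios, not over summed delays.
recomputed = {name: mean_reduction_pct(vals, ours) for name, vals in delays.items()}

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.2f}%, reported {reported[name]:.1f}%")
```

A per-cell term can be negative, as for NOR2 against [18], where the proposed sizing is slower at this corner.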
Table 4. Comparison of worst-case period, energy consumption, and energy-delay product for ring oscillator operating at 0.35 V, 25 °C, TT corners under TSMC 28 nm process.

| Ring Oscillator | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| Worst-case period (ns) | 4.64 (21.6%) | 4.31 (15.5%) | 4.91 (25.8%) | 3.84 (5.2%) | 3.64 |
| Worst-case energy consumption (fJ) | 1.12 (4.5%) | 1.06 (−0.9%) | 1.08 (1.1%) | 1.21 (11.6%) | 1.07 |
| Worst-case energy-delay product (ns × fJ) | 5.08 (25.2%) | 4.47 (15.0%) | 4.93 (22.9%) | 4.53 (16.3%) | 3.80 |
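Table 5. Comparison of frequency, power consumption, and area for benchmark circuits operating at 0.35 V, 25 °C, TT corner under TSMC 28 nm process.

Frequency (MHz):

| Ckt | # Cells | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|---|
| s27 | 19 | 117 | 129 | 120 | 115 | 142 |
| s382 | 179 | 109 | 114 | 112 | 110 | 122 |
| s5378 | 1294 | 96 | 101 | 97 | 100 | 106 |
| s13207 | 1219 | 84 | 89 | 85 | 87 | 99 |
| s38417 | 8278 | 81 | 83 | 81 | 78 | 87 |
| s38584 | 8324 | 80 | 82 | 80 | 80 | 86 |
| aes_ip | 20,795 | 93 | 109 | 97 | 98 | 111 |
| tv80 | 7161 | 103 | 105 | 103 | 104 | 114 |
| vga_lcd | 124,031 | 119 | 121 | 120 | 122 | 128 |
| Ave. Impr. (%) | - | 12.7 | 6.6 | 11.1 | 11.2 | 0.0 |

Power (µW):

| Ckt | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| s27 | 0.38 | 0.37 | 0.37 | 0.37 | 0.36 |
| s382 | 4.95 | 4.53 | 4.73 | 4.33 | 4.01 |
| s5378 | 35.7 | 33.2 | 35.10 | 32.90 | 30.4 |
| s13207 | 102.2 | 100.1 | 100.90 | 98.10 | 95.2 |
| s38417 | 365.6 | 324.0 | 332.40 | 311.39 | 277.8 |
| s38584 | 373.7 | 345.7 | 367.90 | 321.34 | 297.2 |
| aes_ip | 220.3 | 210.50 | 215.80 | 186.84 | 171.90 |
| tv80 | 109.5 | 106.30 | 105.40 | 85.26 | 81.00 |
| vga_lcd | 2786.2 | 2675.1 | 2690.1 | 2638.35 | 2375.2 |
| Ave. Impr. (%) | 17.0 | 12.1 | 14.2 | 6.9 | 0.0 |

Area (µm²):

| Ckt | [4] | [11] | [12] | [18] | Ours |
|---|---|---|---|---|---|
| s27 | 9.99 | 9.61 | 9.8 | 9.4 | 9.21 |
| s382 | 206.6 | 174.4 | 178.9 | 167.4 | 151.3 |
| s5378 | 1381.7 | 1317.2 | 1342 | 1298 | 1140.2 |
| s13207 | 3575.9 | 3363.2 | 3427 | 3286 | 3138.1 |
| s38417 | 13,542 | 11,479 | 11,501 | 10,685 | 9605 |
| s38584 | 13,685 | 11,945 | 12,501 | 11,204 | 10,138 |
| aes_ip | 16,924 | 14,409 | 15,809 | 14,417 | 12,286 |
| tv80 | 9698 | 8330 | 8893 | 7878 | 7000 |
| vga_lcd | 140,459 | 120,708 | 128,375 | 115,247 | 103,291 |
| Ave. Impr. (%) | 19.9 | 12.7 | 15.9 | 9.4 | 0.0 |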
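The "Ave. Impr." row of Table 5 seems to use two normalizations, both taken relative to the prior approach: frequency gain as (f_ours − f_prior) / f_prior, and power or area saving as (x_prior − x_ours) / x_prior. These definitions are inferred from the numbers, not stated in the table. The sketch below checks them on the smallest entry of each metric group ([11] for frequency, [18] for power and area); the function names `mean_gain_pct` and `mean_saving_pct` are illustrative.

```python
# Values transcribed from Table 5. Row order: s27, s382, s5378, s13207,
# s38417, s38584, aes_ip, tv80, vga_lcd.
freq_11 = [129, 114, 101, 89, 83, 82, 109, 105, 121]          # MHz, ref. [11]
freq_ours = [142, 122, 106, 99, 87, 86, 111, 114, 128]        # MHz, proposed
power_18 = [0.37, 4.33, 32.90, 98.10, 311.39, 321.34, 186.84, 85.26, 2638.35]
power_ours = [0.36, 4.01, 30.4, 95.2, 277.8, 297.2, 171.90, 81.00, 2375.2]
area_18 = [9.4, 167.4, 1298, 3286, 10685, 11204, 14417, 7878, 115247]
area_ours = [9.21, 151.3, 1140.2, 3138.1, 9605, 10138, 12286, 7000, 103291]


def mean_gain_pct(prior, proposed):
    """Average of (proposed - prior) / prior in percent (higher is better)."""
    return 100.0 * sum((q - p) / p for p, q in zip(prior, proposed)) / len(prior)


def mean_saving_pct(prior, proposed):
    """Average of (prior - proposed) / prior in percent (lower is better)."""
    return 100.0 * sum((p - q) / p for p, q in zip(prior, proposed)) / len(prior)


# Assumption: averages are taken over per-circuit ratios.
freq_gain = mean_gain_pct(freq_11, freq_ours)
power_saving = mean_saving_pct(power_18, power_ours)
area_saving = mean_saving_pct(area_18, area_ours)

print(f"frequency gain vs [11]: {freq_gain:.2f}% (table: 6.6)")
print(f"power saving vs [18]:   {power_saving:.2f}% (table: 6.9)")
print(f"area saving vs [18]:    {area_saving:.2f}% (table: 9.4)")
```

With these definitions the three recomputed averages agree with the tabulated 6.6%, 6.9%, and 9.4% after rounding to one decimal.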
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cao, P.; Guo, J. Standard Cell Sizing for Worst-Case Performance Optimization Considering Process Variation in Subthreshold Region. Electronics 2024, 13, 4477. https://doi.org/10.3390/electronics13224477