Article

A Methodology to Design Static NCL Libraries

1
Department of Electronics Engineering, Faculty of Electricals and Electronics Engineering, Ho Chi Minh City University of Technology (HCMUT), 268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City 700000, Vietnam
2
Vietnam National University Ho Chi Minh City, Linh Trung Ward, Thu Duc District, Ho Chi Minh City 700000, Vietnam
3
Department of Electronics Engineering, Faculty of Electricals and Electronics Engineering, Ho Chi Minh City University of Food Industry (HUFI), 140 Le Trong Tan Street, Tay Thanh Ward, Tan Phu District, Ho Chi Minh City 700000, Vietnam
*
Author to whom correspondence should be addressed.
J. Low Power Electron. Appl. 2022, 12(2), 31; https://doi.org/10.3390/jlpea12020031
Submission received: 20 December 2021 / Revised: 3 March 2022 / Accepted: 20 March 2022 / Published: 6 June 2022

Abstract

The Null Convention Logic (NCL) based asynchronous design technique has attracted researchers because it overcomes disadvantages of the synchronous technique, such as noise, glitches, clock skew and power. However, the NCL-based asynchronous design method is difficult for university students and researchers to adopt because standard NCL cell libraries are lacking. Therefore, in this paper, a novel flow is proposed to design NCL cell libraries, which are then used to synthesize NCL-based asynchronous designs. We chose the static NCL cell library to illustrate the proposed design solution because it is one of the most basic NCL libraries. The static NCL cells in this library are designed in a 45 nm Process Design Kit (PDK) technology and are implemented with the Virtuoso and Design Compiler (DC) tools. In addition, Ocean scripts and the Electronic Design Automation (EDA) environment are used to support design and simulation. A complete library of 27 NCL cells was designed for study and research. We also synthesized NCL full adders using this library and compared our synthesis results with those of other authors. The comparison indicated that our results improve power consumption by 20%.

1. Introduction

Synchronous circuits have played a significant role and have dominated the semiconductor industry [1]. This industry has continuously shrunk wire and transistor dimensions. As a result, billions of transistors are integrated into a single chip, and low-power, high-performance circuits will be created in the following technology generations. However, synchronous circuits use a clock signal to synchronize their operations. Therefore, the semiconductor industry must face clock-related issues, including clock skew, power consumption, noise, electromagnetic interference, and the complexity of clock network layout. These issues are considered future technological challenges for the semiconductor industry [2].
In contrast to synchronous circuit paradigms, asynchronous circuit paradigms synchronize their operations through a local handshake protocol. Therefore, asynchronous circuitry may eliminate the clock issues mentioned above [3]. Among asynchronous circuit paradigms, NCL is a quasi-delay-insensitive (QDI) logic paradigm that is used in commercial applications and is a common choice for designing asynchronous circuits [4]. Many studies of NCL-based asynchronous circuits have been carried out, such as the CMOS circuit design of threshold gates with hysteresis [5] and comparisons of NCL threshold gate implementations [3]; other relevant studies can be found in [6,7,8,9,10,11,12]. In most of these studies, the authors synthesized their designs using one of three approaches. The first approach uses tools to convert synchronous designs to asynchronous designs [13]; this approach makes it hard to optimize large-scale designs. The second approach uses a full-custom design flow to synthesize NCL-based designs; this flow is not suitable for complex designs. The last approach uses a conventional synchronous cell library to synthesize NCL-based designs [14]; this approach may prevent NCL-based asynchronous designs from achieving optimal power. Therefore, the lack of NCL asynchronous cell libraries has caused many difficulties in research and development.
In state-of-the-art research, several flows have been suggested to design NCL asynchronous cell libraries [15,16]. These flows are complex, and their authors used some of their own tools, which makes it difficult for other researchers to reuse and extend the work. Therefore, we propose a simple flow to design the NCL cell library using only mainstream commercial tools. This flow also helps researchers to create NCL cell libraries themselves.
In this flow, cell schematic design, cell symbol generation, and cell simulation to determine leakage power and input capacitance are implemented using Virtuoso. Static NCL cells designed in a 45 nm PDK technology are chosen as a case study. In addition, cell characterization is assisted by Ocean scripts to determine parameters such as cell rise delay, cell fall delay, rise transition, fall transition, rise power and fall power. The Ocean scripts save researchers time when adopting the method.
The rest of this paper comprises three sections: Section 2 presents an overview of NCL, the proposed flow, and cell characterization. Section 3 provides results and discussions. Finally, Section 4 gives conclusions of our methodology to design the NCL cell library.

2. Materials and Methods

2.1. Null Convention Logic

NCL is a delay-insensitive asynchronous logic paradigm. To achieve delay insensitivity, NCL circuits must satisfy two rules: input-completeness and observability [17]. Input-completeness imposes two conditions: the outputs may not transition from NULL to a complete set of DATA until the inputs are completely DATA, and the outputs may not transition from DATA to a complete set of NULL until the inputs are completely NULL. Observability requires that no orphans propagate through a gate. Orphans that stay on wires can be ignored through the isochronic fork assumption (wire delays are less than gate delays within a component) [17]. The observability condition ensures that every gate transition is observable at the output: each transition occurring at a gate must cause a transition on at least one of the outputs.
The NCL-based circuit design method does not use a clock signal and is aimed at asynchronous circuits [5]. These asynchronous circuits execute correctly regardless of component and wire delays [18,19]. To achieve this delay insensitivity, NCL circuits use dual-rail logic [18]. A conventional logic signal is carried by a single rail, whereas a dual-rail signal is formed by two rails. Table 1 shows the conversion of a conventional logic signal to a dual-rail signal [20]. The value '11' is illegal because the A0 and A1 rails are mutually exclusive.
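The dual-rail conversion described above can be illustrated with a minimal Python sketch. It assumes the usual NCL convention that rail 0 asserted means DATA0, rail 1 asserted means DATA1, and neither rail asserted means NULL; the function names and the (rail0, rail1) tuple order are illustrative choices, not part of the original work.

```python
from typing import Optional, Tuple

NULL = (0, 0)  # neither rail asserted: no data present (assumed encoding)


def encode(bit: Optional[int]) -> Tuple[int, int]:
    """Return (rail0, rail1) for a Boolean value, or NULL for None."""
    if bit is None:
        return NULL
    if bit not in (0, 1):
        raise ValueError("bit must be 0, 1 or None")
    return (1, 0) if bit == 0 else (0, 1)


def decode(rails: Tuple[int, int]) -> Optional[int]:
    """Inverse of encode; rejects the illegal state where both rails are 1."""
    rail0, rail1 = rails
    if rail0 and rail1:
        raise ValueError("illegal dual-rail state: both rails asserted")
    if not rail0 and not rail1:
        return None  # NULL
    return 1 if rail1 else 0
```

Because the two rails are mutually exclusive, `decode` treats the pair (1, 1) as an error rather than as a value.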
Unlike conventional asynchronous circuits, NCL-based circuits use a set of twenty-seven threshold gates [18,19,20]. A general symbol of the thmn threshold gate is illustrated in Figure 1a, where n is the total number of inputs and m is the threshold value, meaning that at least m of the n inputs must be in the '1' state before the output becomes '1'. Another type of threshold gate, the weighted threshold gate, is denoted thmnWz1z2…zm, where z1, z2, …, zm are the input weights. For example, th23w2 is shown in Figure 1b, where both the weight of input A and the threshold value are two. Therefore, when input A becomes '1', the output is asserted.
As presented above, NCL-based asynchronous circuits are built from threshold gates. The general structure of a static CMOS threshold gate with hysteresis consists of five function blocks (set, reset, hold data, hold null and an output inverter), as shown in Figure 2. In this structure, the reset block and the hold data block are complementary to each other and have standard structures, which are depicted in Figure 3 [4,5]. The reset block is active when all inputs are in the '0' state, while the hold data block is active when at least one input is in the '1' state. Their structures depend only on the number of cell inputs. Therefore, threshold gates with the same number of inputs have the same reset block and hold data block. Similarly, the set block and the hold null block complement each other, but their structures depend on both the number of inputs and the threshold value.
Similar to the static threshold gate, the general structure of the semi-static threshold gate comprises three function blocks (a reset block, a set block, and a weak feedback inverter at the output) [3,5]. In this structure, when both the set and reset blocks are off, the logic level on node Y is held by the weak inverter. However, if this inverter is too small, the level on node Y becomes susceptible to noise.
In many applications related to real-time computing, such as signal processing, the input data flow continuously at or above a minimum rate. In these cases, a feedback mechanism is not essential to maintain the state information. Therefore, the weak feedback inverter can be removed from the semi-static structure. The resulting paradigm is called dynamic threshold gates [3].
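The set, reset and hold behaviour described above can be captured in a small behavioural model. The following Python sketch is only an illustration of the logical function of a threshold gate with hysteresis, not of the transistor-level blocks; the class name `ThresholdGate` and its interface are hypothetical.

```python
from typing import Sequence


class ThresholdGate:
    """Behavioural model of a (weighted) threshold gate with hysteresis."""

    def __init__(self, threshold: int, weights: Sequence[int]) -> None:
        self.threshold = threshold
        self.weights = list(weights)
        self.output = 0  # assumption: the gate starts in the reset (NULL) state

    def step(self, inputs: Sequence[int]) -> int:
        """Apply one set of input values and return the new output."""
        if len(inputs) != len(self.weights):
            raise ValueError("one input value is required per weight")
        weighted_sum = sum(w for w, x in zip(self.weights, inputs) if x)
        if weighted_sum >= self.threshold:
            self.output = 1  # set: the threshold is reached
        elif not any(inputs):
            self.output = 0  # reset: all inputs are '0'
        # otherwise the previous output is held (hysteresis)
        return self.output


th22 = ThresholdGate(threshold=2, weights=[1, 1])
th23w2 = ThresholdGate(threshold=2, weights=[2, 1, 1])  # input A has weight 2
```

In this model, `th22` keeps its output at '1' after one input returns to '0' and only resets once both inputs are '0', while `th23w2` is asserted by input A alone.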

2.2. The Proposed Flow Chart to Design Standard NCL Cell Libraries

In this section, we present the proposed flow chart to design NCL cell libraries. This flow comprises ten steps depicted in Figure 4. Firstly, the cell specification analysis step is implemented to form the basis for the schematic circuit design step. The next step is to create the cell symbol to carry out the testbench circuit. This circuit is simulated to check the cell operations. If the cell operation test results are not good, we will go back to the schematic circuit design step. Otherwise, we will go to the simulation step at the corners. This step is implemented to measure leakage power and the input capacitance.
Pin capacitance can be specified in all inputs and outputs. In most cases, it is only determined at input pins. Thus, the cell output capacitance is equal to zero [21]. The input capacitance value is computed by Equation (1), which represents the relationship of capacitance, voltage, and current.
$$I = C \frac{dV}{dt} \tag{1}$$
By applying a voltage pulse to the input pin and measuring the current at the same point, the input capacitance is calculated with Equation (2):
$$C_{\text{input}} = \frac{\int_{t}^{t+\Delta t} I(t)\,dt}{\int_{t}^{t+\Delta t} dV} \tag{2}$$
where I is the current at the input pin, which is created by charging and discharging the input capacitance.
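As a numerical illustration of Equation (2), the sketch below integrates a sampled input current with the trapezoidal rule and divides the resulting charge by the voltage swing. The waveform is synthetic (an ideal 2 fF capacitor driven by a linear ramp to 1.25 V); the function names and sample values are assumptions made for the example and do not come from the simulations in this work.

```python
from typing import Sequence


def trapezoid(y: Sequence[float], x: Sequence[float]) -> float:
    """Integrate samples y over x with the trapezoidal rule."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k]) for k in range(len(x) - 1))


def input_capacitance(time: Sequence[float], current: Sequence[float],
                      voltage: Sequence[float]) -> float:
    """Equation (2): charge delivered to the pin divided by the voltage change."""
    charge = trapezoid(current, time)
    delta_v = voltage[-1] - voltage[0]
    if delta_v == 0.0:
        raise ValueError("the input voltage must change over the window")
    return charge / delta_v


# Synthetic check: an ideal capacitor driven by a linear voltage ramp.
C_TRUE = 2e-15   # 2 fF, hypothetical value
VDD = 1.25       # V
RAMP = 100e-12   # 100 ps ramp, hypothetical value
N = 101
time = [RAMP * k / (N - 1) for k in range(N)]
voltage = [VDD * t / RAMP for t in time]
current = [C_TRUE * VDD / RAMP] * N  # I = C dV/dt is constant during the ramp
c_est = input_capacitance(time, current, voltage)
```

For this ideal ramp the estimate `c_est` recovers the capacitance used to generate the current, which is a quick sanity check before applying the same function to simulated waveforms.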
Most cells consume dynamic power only when the output changes. However, power is also dissipated while a cell is powered but inactive, because the leakage current is not zero. The leakage is caused by the sub-threshold current and by the tunneling current through the gate oxide of MOS devices [21]. The leakage power is determined according to Equation (3):
$$P_{\text{Leakage}} = I_{\text{Leakage}} \, V_{DD} \tag{3}$$
To determine the leakage power, we first list all input combinations of the cell and then compute the leakage power for each combination, connecting each input to ground when it is low or to VDD when it is high. The leakage power of a standard cell is the average over all combinations. In parallel with the simulation step at the corners, we perform cell characterization to determine the timing and power models. As ADE has no options or commands for executing repetitive tasks, Ocean scripts are used to automate cell characterization, which is described in detail in Section 2.3. The parameters mentioned above, including leakage power, input capacitances, timing model, and power model, are used to form the *.lib file [22], which complies with the Synopsys standard. Subsequently, we use the Synopsys Library Compiler tool to convert the *.lib file to a *.db file with the read_lib and write_lib commands. The command “read_lib <your path>/library.lib” reads and compiles the library file. If the compilation is successful, the program returns 1, and the *.db file is created with the command “write_lib library_name -f db -o <your path>/library.db” [23]. The *.db file is one of the crucial files in the library: it contains the essential cell parameters and is used to synthesize NCL-based asynchronous circuits with the Synopsys DC tool.
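The leakage averaging described at the start of this paragraph can be sketched in a few lines of Python. The sketch applies Equation (3) to each input combination and averages the results; the leakage currents in the example are hypothetical placeholders, not measured values, and the function names are illustrative.

```python
from itertools import product
from typing import Dict, List, Tuple


def input_combinations(n_inputs: int) -> List[Tuple[int, ...]]:
    """All 2**n static input patterns of an n-input cell."""
    return list(product((0, 1), repeat=n_inputs))


def average_leakage_power(leakage_current: Dict[Tuple[int, ...], float],
                          vdd: float) -> float:
    """Average of I_leakage * VDD over every input combination."""
    n_inputs = len(next(iter(leakage_current)))
    missing = set(input_combinations(n_inputs)) - set(leakage_current)
    if missing:
        raise ValueError(f"missing input combinations: {sorted(missing)}")
    powers = [i_leak * vdd for i_leak in leakage_current.values()]
    return sum(powers) / len(powers)


# Hypothetical leakage currents (in amperes) for a two-input cell.
example_currents = {
    (0, 0): 1.0e-9,
    (0, 1): 2.0e-9,
    (1, 0): 3.0e-9,
    (1, 1): 2.0e-9,
}
p_leak = average_leakage_power(example_currents, vdd=1.25)
```

The explicit check for missing patterns mirrors the requirement that all input combinations are listed before the average is taken.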
Finally, the synthesis step is carried out to check if the cell library works properly. In this step, we will write a piece of RTL code and synthesize it at the gate level. If the synthesis results are good (i.e., the design can be synthesized successfully) the NCL cell library will be completed. The complete library comprises 27 cells.
To illustrate the proposed flow, we choose one of the twenty-seven cells in the library, namely the th22 cell. As described in Section 2.1, the reset and hold data blocks have standard forms that depend only on the total number of inputs. Therefore, the reset block is formed by two PMOS transistors in series, and the hold data block is formed by two NMOS transistors in parallel. To construct the schematic circuits of the set block and the hold null block, we analyze their function. The set block is active only when both the A and B inputs go high. Under this condition, the switching expression of the set block also describes the function of the threshold gate and is given by Equation (4). The hold null block complements the set block, so its switching expression is obtained by complementing Equation (4), which gives Equation (5). The circuits that implement the set and hold null blocks with NMOS and PMOS transistor networks are illustrated in Figure 5. The remaining steps are continued in Section 2.3.
$$F(A, B) = AB \tag{4}$$
$$\overline{F(A, B)} = \overline{A} + \overline{B} \tag{5}$$

2.3. NCL Cell Characterization

Cell characterization is one of the most important steps in the flow because it determines the cell timing and power models that form the library. We characterize all cells for the following quantities: cell fall delay, cell rise delay, fall transition, rise transition, rise power, fall power, input capacitance and leakage power. In this section, all 27 cells are characterized with the same seven load capacitances Cload (1.4 fF, 2.54 fF, 4.61 fF, 8.37 fF, 15.2 fF, 27.6 fF, 50.0 fF) and the same seven fall and rise times of the input Vpulse waveform (0.01 ns, 0.0192 ns, 0.0368 ns, 0.0707 ns, 0.136 ns, 0.261 ns, 0.5 ns), which gives 7 × 7 = 49 cases per cell. Cload and slope values are determined based on Equation (6) and the Elmore delay, respectively [24]. We measured Cload and slope values for different drive strengths to obtain the range that cells can fall into.
$$C_{load} = (W_n \times L_n \times C_{ox}) + (W_p \times L_p \times C_{ox}) \tag{6}$$
where:
Cox: Gate oxide capacitance;
Wn: width of the NMOS transistor;
Ln: length of the NMOS transistor;
Wp: width of the PMOS transistor;
Lp: length of the PMOS transistor.
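The seven load values and seven slope values listed above appear to be approximately geometrically spaced between their end points (1.4 fF to 50.0 fF and 0.01 ns to 0.5 ns). This is an observation made here from the listed numbers, not a statement from the original flow. Under that assumption, the following Python sketch regenerates both index lists and enumerates the 49 characterization cases; the names `geometric_grid`, `LOADS_FF`, `SLOPES_NS` and `SWEEP` are illustrative.

```python
from itertools import product
from typing import List


def geometric_grid(start: float, stop: float, points: int) -> List[float]:
    """Return `points` values from start to stop with a constant ratio."""
    if points < 2 or start <= 0 or stop <= 0:
        raise ValueError("need at least 2 points and positive end points")
    ratio = (stop / start) ** (1.0 / (points - 1))
    return [start * ratio ** k for k in range(points)]


LOADS_FF = geometric_grid(1.4, 50.0, 7)    # load capacitance index, in fF
SLOPES_NS = geometric_grid(0.01, 0.5, 7)   # input transition index, in ns

# Every (slope, load) pair is one characterization run: 7 x 7 = 49 cases.
SWEEP = list(product(SLOPES_NS, LOADS_FF))

# Values quoted in the text, used only for comparison.
LISTED_LOADS_FF = [1.4, 2.54, 4.61, 8.37, 15.2, 27.6, 50.0]
LISTED_SLOPES_NS = [0.01, 0.0192, 0.0368, 0.0707, 0.136, 0.261, 0.5]
```

The generated values agree with the listed ones to within rounding, so the same helper could be reused to build index lists for a different number of table points.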
Simulating all the cases from the graphical user interface would have to be done manually, because ADE offers no options or commands for repetitive tasks, which is one of its greatest drawbacks. In addition, it provides no way to characterize a cell automatically. Hence, in this subsection, the Ocean scripting language is used to automate the simulations within Cadence. The .ocn file consists of three main parts: part 1 assigns the Cload and slope values to two lists; part 2 creates nested loops that simulate the 49 cases; and part 3 measures rise transition, fall transition, cell rise delay, cell fall delay, rise power and fall power based on Figure 6, Figure 7 and Figure 8 and Equation (7).
The general structure of the .ocn file:
loadlist = list("L1" "L2" "L3" … "Ln")
slopelist = list("S1" "S2" "S3" … "Sn")
foreach(slopevar slopelist
  foreach(loadvar loadlist
    ; measure cell rise delay, cell fall delay, … for this (slope, load) pair
  )
)
where:
Ln: load value
Sn: slope value
Besides the Ocean script, the Virtuoso calculator is also used to perform cell characterization. The parameters mentioned above, namely the fall time and rise time of the input voltage and the load capacitance, must be defined in the *.ocn file to run the simulation 49 times and calculate the timing and dynamic power models, including fall transition, rise transition, cell rise delay, and cell fall delay. We do not use the Ocean script to measure the leakage power and the input capacitance, because the script is intended for sweeping a range of values. The testbench circuit of the th22 gate is shown in Figure 9.
The cell timing models provide accurate timing for the various cell instances in the design environment. Non-linear delay models are used to create the *.lib file because they remain accurate even for submicron technologies [21]. The timing models are computed for every timing arc of the cell, and both delay and transition time are expressed as table models, which must be specified for all the cells in the library. The table models capture the delay through the cell and the transition time at the output pin for different combinations of the input transition time at the cell input and the total output capacitance at the cell output pin [21]. Figure 6, Figure 7 and Figure 8 show how the time values of the timing models (delay and transition time) are calculated. The percentages (30%, 70%, 10%, 90%) are threshold values, which are specified in the liberty file [25].
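To make the role of the threshold percentages concrete, the sketch below measures a transition time on a sampled waveform by linear interpolation between samples. The text does not state which percentages apply to which measurement, so the 30% and 70% defaults used here are an assumption, and the function names are illustrative.

```python
from typing import Sequence


def crossing_time(time: Sequence[float], voltage: Sequence[float],
                  level: float) -> float:
    """First time at which the waveform crosses `level` (either direction)."""
    for k in range(len(time) - 1):
        v0, v1 = voltage[k], voltage[k + 1]
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            fraction = (level - v0) / (v1 - v0)  # linear interpolation
            return time[k] + fraction * (time[k + 1] - time[k])
    raise ValueError("waveform never crosses the requested level")


def transition_time(time: Sequence[float], voltage: Sequence[float], vdd: float,
                    low: float = 0.3, high: float = 0.7) -> float:
    """Time spent between the low and high thresholds (assumed 30% and 70%)."""
    t_low = crossing_time(time, voltage, low * vdd)
    t_high = crossing_time(time, voltage, high * vdd)
    return abs(t_high - t_low)


# Synthetic 0 V to 1 V ramp over 1 ns, sampled every 0.1 ns (time in ns).
ramp_time = [0.1 * k for k in range(11)]
rising = [0.1 * k for k in range(11)]
falling = list(reversed(rising))
```

Passing `low=0.1, high=0.9` to `transition_time` gives the 10% to 90% measurement on the same waveform.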
Dynamic power consists of rise power and fall power. Rise power is calculated when the output changes from low to high. Similarly, fall power is calculated when the output changes from high to low. The dynamic power is determined by Equation (7), where V0 is the supply voltage source and i(/V0/PLUS) is the current through its positive terminal.
$$P_{\text{dynamic}} = \frac{V_{DD}}{T} \int_{t}^{t+\Delta t} i_{(/V0/PLUS)}(t)\,dt \tag{7}$$
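Equation (7) can be evaluated numerically on a sampled supply-current waveform. The following Python sketch does this with the trapezoidal rule. The text does not define T explicitly, so it is passed in as a separate `period` argument here, and the constant-current waveform is a synthetic example rather than simulation data.

```python
from typing import Sequence


def trapezoid(y: Sequence[float], x: Sequence[float]) -> float:
    """Integrate samples y over x with the trapezoidal rule."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    return sum(0.5 * (y[k] + y[k + 1]) * (x[k + 1] - x[k]) for k in range(len(x) - 1))


def dynamic_power(time: Sequence[float], supply_current: Sequence[float],
                  vdd: float, period: float) -> float:
    """Equation (7): VDD / T times the integral of the supply current."""
    if period <= 0.0:
        raise ValueError("period must be positive")
    charge = trapezoid(supply_current, time)
    return vdd * charge / period


# Synthetic check: 1 uA drawn for 1 ns at VDD = 1.25 V, with T = 1 ns.
window = [1e-10 * k for k in range(11)]
constant_current = [1e-6] * len(window)
p_dyn = dynamic_power(window, constant_current, vdd=1.25, period=1e-9)
```

Rise power and fall power follow from the same function by choosing the integration window around a rising or a falling output transition.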

3. Results and Discussions

In this section, results of the processes in Section 2 are presented and discussed, including the cell function test, cell characterization and test synthesis for RTL code. In addition, we also make comparisons between our synthesis results and the results of other authors.

3.1. Function Test Results

In our proposed flow, the function of each cell must be checked, because if not all possible input combinations are verified, the subsequent cell characterization results may be wrong. The simulation results of the functional test of the th22 gate are shown in Figure 10, Figure 11 and Figure 12. In theory, when all inputs transition to low, the output becomes low, and when both inputs transition to high, the th22 gate output becomes high. Figure 10, Figure 11 and Figure 12 indicate that the circuit works correctly.

3.2. Cell Characterization Results

To implement cell characterization, the Ocean script is used to measure the 49 cases mentioned in Section 2.3. Figure 13 shows the simulation result of those 49 cases with pin A driven by Vpulse and pin B connected to VDD. Similarly, Figure 14 shows the simulation results with pin A driven by Vpulse and pin B connected to GND. With the support of the Ocean script, parameters such as cell fall, cell rise, rise transition, fall transition, rise power and fall power are measured quickly and accurately. These parameters are shown in Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7, where the timing parameters are in ns and the power is in pW.
Finally, Monte Carlo simulations under mismatch variations of cell rise, cell fall, rise transition, fall transition, rise power and fall power are shown in Figure 15, Figure 16, Figure 17, Figure 18, Figure 19 and Figure 20. These simulations use a 50 pF load and a 500 ps slope. The results are satisfactory because the distributions resemble a Gaussian distribution within ±3 sigma.

3.3. The Synthesis Results of The RTL Code

In this section, we use the full adder model [26] as an example for testing the library that we generated based on our proposed flow. This model comprises two th23 gates and two th34w2 gates, as shown in Figure 21, and its output equations are as follows:
$$C_{out1} = A_1 B_1 + A_1 C_{in1} + B_1 C_{in1} \tag{8}$$
$$C_{out0} = A_0 B_0 + A_0 C_{in0} + B_0 C_{in0} \tag{9}$$
$$S_1 = A_0 B_0 C_{in1} + A_0 B_1 C_{in0} + A_1 B_0 C_{in0} + A_1 B_1 C_{in1} = C_{out0} A_1 + C_{out0} B_1 + C_{out0} C_{in1} + A_1 B_1 C_{in1} \tag{10}$$
$$S_0 = A_0 B_0 C_{in0} + A_0 B_1 C_{in1} + A_1 B_0 C_{in1} + A_1 B_1 C_{in0} = C_{out1} A_0 + C_{out1} B_0 + C_{out1} C_{in0} + A_0 B_0 C_{in0} \tag{11}$$
Since Cout is not input-complete with respect to any input, S must be input-complete with respect to all inputs [17], which means the equations for S must be in canonical form. Equations (10) and (11) show that every product term of the S output contains all the inputs (A, B, and Cin). Therefore, the full adder satisfies the input-completeness condition.
In Figure 21, the two Th23 gates are connected to the two Th34w2 output gates, and the outputs for the Th23 gates are Cout1 and Cout0. The Cout1 includes the product terms of the inputs (A, B, Cin) with rail1, and Cout0 includes the product terms of the inputs (A, B, Cin) with rail0. When inputs are asserted, the Th23 gates are asserted, and the Cout is asserted. Similar to the Cout output, S0 includes the product terms of the inputs (A0, B0, Cin0 and Cout1), and S1 includes the product terms of the inputs (A1, B1, Cin1 and Cout0). If Cout output is asserted, and only one of the three inputs is asserted, the S output will be asserted. For instance, if the inputs (A, B, Cin) are DATA0, DATA1, and DATA0, respectively, the Th23 gates are asserted. As a result, Cout and S are DATA0 and DATA1, respectively. Therefore, the circuit shown in Figure 21 satisfies the observability conditions.
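The gate-level structure described above (two th23 gates for Cout and two th34w2 gates for S) can be checked with a short behavioural model. The Python sketch below evaluates one DATA wavefront starting from NULL, so every gate output starts at '0' and hysteresis does not need to be modelled. Signals are (rail0, rail1) pairs, and all function names are illustrative rather than taken from the synthesized netlist.

```python
from itertools import product
from typing import Sequence, Tuple

Rails = Tuple[int, int]  # (rail0, rail1)


def threshold(inputs: Sequence[int], weights: Sequence[int], m: int) -> int:
    """Threshold gate output for a DATA wavefront that starts from NULL."""
    return int(sum(w for w, x in zip(weights, inputs) if x) >= m)


def th23(a: int, b: int, c: int) -> int:
    return threshold((a, b, c), (1, 1, 1), 2)


def th34w2(a: int, b: int, c: int, d: int) -> int:
    # The first input carries weight 2, the remaining inputs weight 1.
    return threshold((a, b, c, d), (2, 1, 1, 1), 3)


def ncl_full_adder(a: Rails, b: Rails, cin: Rails) -> Tuple[Rails, Rails]:
    """Return (sum, carry) as dual-rail pairs."""
    a0, a1 = a
    b0, b1 = b
    c0, c1 = cin
    cout1 = th23(a1, b1, c1)
    cout0 = th23(a0, b0, c0)
    s1 = th34w2(cout0, a1, b1, c1)
    s0 = th34w2(cout1, a0, b0, c0)
    return (s0, s1), (cout0, cout1)


def rails(bit: int) -> Rails:
    """Dual-rail encoding of a Boolean value: 0 -> (1, 0), 1 -> (0, 1)."""
    return (1 - bit, bit)


# Exhaustive comparison with ordinary binary addition.
ALL_CASES_OK = all(
    ncl_full_adder(rails(a), rails(b), rails(c))
    == (rails((a + b + c) % 2), rails((a + b + c) // 2))
    for a, b, c in product((0, 1), repeat=3)
)
```

For the example in the text (A = DATA0, B = DATA1, Cin = DATA0), the model returns Cout = DATA0 and S = DATA1, and with all inputs NULL every output stays NULL.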
We synthesize this by using the Design Compiler. The typical parameters of the library are temperature (25 °C), voltage (1.25 V), and process (ff). The netlist file after synthesis is shown in Figure 22.
The synthesis results for area, power and delay, depicted in Figure 23, Figure 24 and Figure 25, respectively, show that the NCL-based design is synthesized successfully. Based on our proposed solution, many other cells can be designed to create a full set of NCL cell libraries. This work contributes to the research and development of NCL-based asynchronous circuits.
The comparison between our work and [20] is given in Table 8. In terms of area, the full adders in [20] are much smaller than ours because the adders P-FA-L0, P-FA-L1, and P-FA-L2 are strong-indication adders that use a common split-end reset and hysteresis mechanism at the circuit level instead of designing each rail separately [20]. These adders share transistors between rails in three configurations, logic block 0 (LB0), logic block 1 (LB1), and logic block 2 (LB2), which results in the P-FA-L0, P-FA-L1, and P-FA-L2 models. The area of the adder P-FA-L2 is the smallest because it shares transistors among four rails. As a result, short paths between VDD and GND through the dP transistors are formed in the DATA state [20]. However, the additional area reduction means that these short paths are not static and consume high power. That is why the power consumption of the P-FA-L2 adder is the highest, approximately 1.28 times our result. The power of P-FA-L0 and P-FA-L1 is lower than ours because transistors are shared between the rails. As for delay, there is a significant difference between the results in [20] and ours, because we calculated the delay with the Design Compiler tool, which helps optimize the design, while the adders in [20] were measured using the Cadence tool. Another reason is the influence of the technology node: we used 45 nm technology in our work, while the adders in [20] were simulated with 65 nm technology. Therefore, the delay comparison is only relative.
Finally, our static NCL library is compared with the static NCL library in [15]. Both works use the static structure of the NCL cells. The library in [15] was implemented using the authors' own tools together with commercial tools. Hence, any problems during installation and use would be difficult for readers to overcome. In contrast, our static NCL library was implemented with commercial tools only, and the flow to implement it is also simpler than that in [15]. In addition, we synthesized the 4-bit full adder using our NCL library and the NCL library in [15]. The results synthesized by the DC tool are shown in Table 9. In terms of power, the synthesis result using our library is smaller than that using the library in [15]. The reason for the difference in power could be that our library was implemented in the pre-layout stage, whereas the library in [15] was implemented in the post-layout stage. In addition, the delay obtained with our library is larger than that obtained with the library in [15], because the library in [15] was optimized by many of the authors' own tools and implemented in the post-layout stage.

4. Conclusions

In this paper, a methodology to design the NCL cell library was presented through the proposed flow. All blocks of this flow were explained in detail, and examples were given. Our proposed flow can be used for research at universities. It addresses the lack of a standard NCL cell library, which hinders students and researchers, and it helps them save time and effort. The complete cell library includes 27 cells, which were designed in 45 nm CMOS technology and were used to synthesize NCL-based asynchronous designs with the Design Compiler tool from Synopsys.

Author Contributions

Conceptualization, methodology, T.L.T. and T.H.; software, data curation, L.T.T.; investigation, T.L.T. and L.T.T.; writing—original draft preparation, T.L.T.; writing—review and editing, supervision, T.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We acknowledge the support of time and facilities from Ho Chi Minh City University of Technology (HCMUT), VNU-HCM for this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nowick, S.M.; Singh, M. Asynchronous design-part 1: Overview and recent advances. IEEE Des. Test 2015, 32, 5–18. [Google Scholar] [CrossRef]
  2. Wu, J. Null Convention Logic Applications of Asynchronous Design in Nanotechnology and Cryptographic Security. Ph.D. Thesis, the Missouri University of Science and Technology, Rolla, MO, USA, 2012. [Google Scholar]
  3. Haulmark, K.; Khalil, W.; Bouillon, W.; Di, J. Comprehensive comparison of null convention logic threshold gate implementations. In Proceedings of the 2018 New Generation of CAS (NGCAS), Valletta, Malta, 20–23 November 2018; Volume 1, pp. 37–40. [Google Scholar]
  4. Sakib, A.A.; Smith, S.C. Implementation of Static NCL Threshold Gates Using Emerging CNTFET Technology. In Proceedings of the 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Glasgow, UK, 23–25 November 2020; pp. 1–4. [Google Scholar] [CrossRef]
  5. Sobelman, G.E.; Fant, K. CMOS circuit design of threshold gates with hysteresis. In Proceedings of the 1998 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, USA, 31 May–3 June 1998; Volume 2, pp. 61–64. [Google Scholar]
  6. Parsan, F.A.; Smith, S.C. CMOS implementation comparison of NCL gates. In Proceedings of the 2012 IEEE 55th International Midwest Symposium on Circuits and Systems (MWSCAS), Boise, ID, USA, 5–8 August 2012; pp. 394–397. [Google Scholar] [CrossRef]
  7. Huy, N.L.; Beckett, P. Null convention logic primitive element architecture for ultralow power high performance portable digital systems. In Proceedings of the 2017 IEEE Regional Symposium on Micro and Nanoelectronics (RSM), Batu Ferringhi, Malaysia, 23–25 August 2017; pp. 167–170. [Google Scholar]
  8. Metku, P.; Kim, K.K.; Kim, Y.B.; Choi, M. Low-Power Null Convention Logic Multiplier Design Based on Gate Diffusion Input Technique. In Proceedings of the 2018 International SoC Design Conference (ISOCC), Daegu, Korea, 12–15 November 2018; pp. 233–234. [Google Scholar] [CrossRef]
  9. Metku, P.; Kim, K.K.; Choi, M. Novel area-efficient null convention logic based on cmos and gate diffusion input (Gdi) hybrid. J. Semicond. Technol. Sci. 2020, 20, 127–134. [Google Scholar] [CrossRef]
  10. Huy, N.L.; Holland, A.S.; Beckett, P. Silicon on insulator null convention logic based asynchronous circuit design for high performance low power digital systems. In Proceedings of the 2018 2nd International Conference on Recent Advances in Signal Processing, Telecommunications & Computing (SigTelCom), Ho Chi Minh City, Vietnam, 29–31 January 2018; pp. 111–115. [Google Scholar]
  11. Moreira, M.T.; Beerel, P.A.; Sartori, M.L.L.; Calazans, N.L.V. NCL synthesis with conventional EDA tools: Technology mapping and optimization. IEEE Trans. Circuits Syst. I Regul. Pap. 2018, 65, 1981–1993. [Google Scholar] [CrossRef]
  12. Guazzelli, R.A.; Moreira, M.T.; Calazans, N.L.V. A comparison of asynchronous QDI templates using static logic. In Proceedings of the 2017 IEEE 8th Latin American Symposium on Circuits & Systems (LASCAS), Bariloche, Argentina, 20–23 February 2017; pp. 1–4. [Google Scholar]
  13. Reese, R.B.; Smith, S.C.; Thornton, M.A. Uncle—An RTL Approach to Asynchronous Design. In Proceedings of the 2012 IEEE 18th International Symposium on Asynchronous Circuits and Systems, Kongens Lyngby, Denmark, 7–9 May 2012; pp. 65–72. [Google Scholar]
  14. Oliveira, D.L.; Verducci, O.; Faria, L.A.; Curtinhas, T. A novel Κ convention logic (NCL) gates architecture based on basic gates. In Proceedings of the 2017 IEEE XXIV International Conference on Electronics, Electrical Engineering and Computing (INTERCON), Cusco, Peru, 15–18 August 2017; pp. 1–4. [Google Scholar]
  15. Oliveira, C.H.M.; Moreira, M.T.; Guazzelli, R.A.; Calazans, N.L.V. ASCEnD-FreePDK45: An open source standard cell library for asynchronous design. In Proceedings of the 2016 IEEE International Conference on Electronics, Circuits and Systems (ICECS), Monte Carlo, Monaco, 11–14 December 2016; pp. 652–655.
  16. Moreira, M.T.; Calazans, N.L.V. Design of Standard-Cell Libraries for Asynchronous Circuits with the ASCEnD Flow. In Proceedings of the 2013 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Natal, Brazil, 5–7 August 2013; pp. 217–218.
  17. Smith, S.C.; Di, J. Designing Asynchronous Circuits using NULL Convention Logic (NCL). Synth. Lect. Digit. Circuits Syst. 2009, 4, 1–96.
  18. Albert, A.J.; Ramachandran, S. Static implementation of a null convention logic based exponent adder. Int. J. Appl. Eng. Res. 2015, 10, 7601–7614.
  19. Caberos, A.; Huang, S.C.; Cheng, F.C. Area-efficient CMOS implementation of NCL gates for XOR-AND/OR dominated circuits. In Proceedings of the 2017 IEEE Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics (PrimeAsia), Kuala Lumpur, Malaysia, 31 October–2 November 2017; pp. 37–40.
  20. Fawzy, B.G.; Abutaleb, M.M.; Eladawy, M.I.; Ghoneima, M. Strong Indication Full-Adder Circuit for NULL Convention Logic Automation Flows. In Proceedings of the 2018 18th International Symposium on Communications and Information Technologies (ISCIT), Bangkok, Thailand, 26–29 September 2018; pp. 416–421.
  21. Bhasker, J.; Chadha, R. Static Timing Analysis for Nanometer Designs: A Practical Approach, 2009th ed.; Springer: New York, NY, USA, 2009.
  22. Charafeddine, K.; Ouardi, F. Novel methodology to determine leakage power in standard cell library design. Heliyon 2020, 6, e04168.
  23. VLSI Tutorial. Available online: https://personal.utdallas.edu/~xxx110230/lc/ (accessed on 7 January 2022).
  24. Naresh, A. Design and Characterization of a Standard Cell Library for the FREEPDK45 Process. Master’s Thesis, Oklahoma State University, Stillwater, OK, USA, December 2010.
  25. Synopsys. Liberty User Guides and Reference Manual Suite; Synopsys: Mountain View, CA, USA, 2017; Volume 2, pp. 1069–1072.
  26. Vakil, A.; Jayadev, K.P.; Hegde, S.; Koppad, D. Comparitive analysis of null convention logic and synchronous CMOS ripple carry adders. In Proceedings of the 2017 Second International Conference on Electrical, Computer and Communication Technologies (ICECCT), Coimbatore, India, 22–24 February 2017; pp. 1–5.
Figure 1. The primary threshold gates: (a) thmn; (b) th23w2.
Figure 2. General structure of a static CMOS threshold gate.
Figure 3. General structure of reset block and hold data block: (a) reset block; (b) hold data block.
Figure 4. The proposed NCL cell library design flow chart.
Figure 5. Threshold gate th22.
Figure 6. Transition time at output pin. (a) Rise transition. (b) Fall transition.
Figure 7. Cell rise delay. (a) Timing arc is negative unate. (b) Timing arc is positive unate.
Figure 8. Cell fall delay. (a) Timing arc is negative unate. (b) Timing arc is positive unate.
Figure 9. The testbench circuit.
Figure 10. Function test results of th22 with A connected to VDD and B supplied with Vpulse.
Figure 11. Function test results of th22 with A supplied with Vpulse and B connected to VDD.
Figure 12. Function test results of th22 with A and B both supplied with Vpulse.
Figure 13. The simulation result with pin A supplied with Vpulse and pin B connected to VDD.
Figure 14. The simulation result with pin A supplied with Vpulse and pin B connected to GND.
Figure 15. The Monte Carlo simulation of cell rise.
Figure 16. The Monte Carlo simulation of rise transition.
Figure 17. The Monte Carlo simulation of cell fall.
Figure 18. The Monte Carlo simulation of fall transition.
Figure 19. The Monte Carlo simulation of rise power.
Figure 20. The Monte Carlo simulation of fall power.
Figure 21. NCL full adder.
Figure 22. The netlist file after synthesis.
Figure 23. The area report result.
Figure 24. The power report result.
Figure 25. The delay report result.
Table 1. Dual-rail signal.
Boolean Logic   Dual-Rail Logic   Code A1   Code A0
0               DATA0             0         1
1               DATA1             1         0
(none)          NULL              0         0
(none)          ILLEGAL           1         1
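The encoding in Table 1 can be sketched in a few lines of Python. This is only an illustrative model: the names `DualRail`, `encode`, and `decode` are hypothetical and do not come from the library or tools described in this paper.

```python
from enum import Enum


class DualRail(Enum):
    """Dual-rail states as (A1, A0) rail pairs, following Table 1."""
    NULL = (0, 0)
    DATA0 = (0, 1)
    DATA1 = (1, 0)
    ILLEGAL = (1, 1)


def encode(bit):
    """Map a Boolean value (0 or 1) to its dual-rail DATA state."""
    return DualRail.DATA1 if bit else DualRail.DATA0


def decode(a1, a0):
    """Return the Boolean value of a rail pair, or None while the pair is NULL."""
    state = DualRail((a1, a0))  # look the state up by its rail values
    if state is DualRail.ILLEGAL:
        raise ValueError("both rails asserted: illegal dual-rail state")
    if state is DualRail.NULL:
        return None
    return 1 if state is DualRail.DATA1 else 0
```

Returning `None` for NULL keeps "no data yet" distinct from a Boolean 0, which is the point of the dual-rail code.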
Table 2. Cell rise delay.
T (ns) \ C (fF)   1.4        2.54       4.61       8.37       15.2       27.6       50.0
0.0100            0.031719   0.034713   0.039622   0.048466   0.063802   0.091584   0.142036
0.0192            0.035519   0.038487   0.043428   0.052211   0.067709   0.095355   0.145774
0.0368            0.042939   0.045776   0.050817   0.059452   0.074892   0.102274   0.153379
0.0707            0.056999   0.059810   0.064762   0.073485   0.089004   0.117078   0.167156
0.1360            0.080912   0.083845   0.088798   0.097410   0.112959   0.140649   0.191456
0.2610            0.120889   0.123863   0.128831   0.137477   0.152992   0.180382   0.229623
0.5000            0.187969   0.191297   0.196570   0.205400   0.220554   0.248453   0.298070
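A table of this kind is indexed by input transition time T and load capacitance C, so a value between grid points has to be interpolated. The sketch below uses plain bilinear interpolation on the top-left corner of Table 2. Bilinear interpolation is an assumption chosen for illustration, not a statement of how the tools used in this work interpolate, and the names `T_AXIS`, `C_AXIS`, `CELL_RISE`, `lower_index`, and `interpolate` are hypothetical.

```python
from bisect import bisect_right

# Top-left corner of Table 2 (cell rise delay): rows follow the input
# transition T (ns), columns follow the load capacitance C (fF).
T_AXIS = [0.0100, 0.0192, 0.0368]
C_AXIS = [1.4, 2.54, 4.61]
CELL_RISE = [
    [0.031719, 0.034713, 0.039622],
    [0.035519, 0.038487, 0.043428],
    [0.042939, 0.045776, 0.050817],
]


def lower_index(axis, x):
    """Index of the lower grid point of the cell used for x.

    The index is clamped, so a point outside the axis reuses the edge
    cell and is extrapolated linearly from it.
    """
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def interpolate(t, c, t_axis=T_AXIS, c_axis=C_AXIS, table=CELL_RISE):
    """Bilinearly interpolate the table at transition t and load c."""
    i = lower_index(t_axis, t)
    j = lower_index(c_axis, c)
    # Fractional position of (t, c) inside the selected grid cell.
    u = (t - t_axis[i]) / (t_axis[i + 1] - t_axis[i])
    v = (c - c_axis[j]) / (c_axis[j + 1] - c_axis[j])
    return ((1 - u) * (1 - v) * table[i][j]
            + (1 - u) * v * table[i][j + 1]
            + u * (1 - v) * table[i + 1][j]
            + u * v * table[i + 1][j + 1])
```

At a grid point the function returns the tabulated value itself, and at the centre of a cell it returns the mean of the four surrounding entries.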
Table 3. Rise transition.
T (ns) \ C (fF)   1.4        2.54       4.61       8.37       15.2       27.6       50.0
0.0100            0.014191   0.017611   0.023507   0.034664   0.055245   0.093438   0.162108
0.0192            0.014231   0.017607   0.023582   0.034607   0.054907   0.092951   0.162408
0.0368            0.014317   0.017439   0.023722   0.034508   0.055121   0.093275   0.160903
0.0707            0.014551   0.017747   0.023710   0.035047   0.055584   0.092349   0.162379
0.1360            0.015606   0.018867   0.024525   0.035743   0.056056   0.093671   0.160805
0.2610            0.017896   0.021076   0.026662   0.037100   0.056643   0.093562   0.160940
0.5000            0.021648   0.024589   0.029991   0.039966   0.059128   0.094819   0.160774
Table 4. Cell fall delay.
T (ns) \ C (fF)   1.4        2.54       4.61       8.37       15.2       27.6       50.0
0.0100            0.040239   0.042706   0.046741   0.053411   0.064899   0.084751   0.120116
0.0192            0.043811   0.046269   0.050280   0.057040   0.068285   0.088151   0.123954
0.0368            0.050510   0.052918   0.056919   0.063691   0.074968   0.094818   0.130357
0.0707            0.063376   0.065755   0.069815   0.076515   0.087776   0.107599   0.142960
0.1360            0.086937   0.089327   0.093424   0.100122   0.111618   0.131466   0.167641
0.2610            0.127368   0.129957   0.134214   0.141081   0.152620   0.172618   0.208160
0.5000            0.196405   0.199268   0.203878   0.211073   0.222948   0.243133   0.279049
Table 5. Fall transition.
T (ns) \ C (fF)   1.4        2.54       4.61       8.37       15.2       27.6       50.0
0.0100            0.012919   0.015210   0.019491   0.027197   0.041450   0.067562   0.115309
0.0192            0.012907   0.015205   0.019663   0.027368   0.041394   0.067405   0.114845
0.0368            0.012835   0.015253   0.019609   0.027360   0.041525   0.067584   0.115338
0.0707            0.012921   0.015414   0.019568   0.027387   0.041563   0.067599   0.115411
0.1360            0.013734   0.016063   0.020303   0.028008   0.042103   0.067605   0.115320
0.2610            0.015538   0.017841   0.022038   0.029759   0.043457   0.068085   0.115567
0.5000            0.018421   0.021066   0.025241   0.032553   0.046080   0.070092   0.116487
Table 6. Fall power.
T (ns) \ C (fF)   1.4         2.54        4.61        8.37        15.2        27.6        50.0
0.0100            −0.000976   −0.000986   −0.000996   −0.001010   −0.001025   −0.001038   −0.001044
0.0192            −0.000949   −0.000962   −0.000975   −0.000991   −0.001005   −0.001017   −0.001025
0.0368            −0.000922   −0.000932   −0.000954   −0.000969   −0.000985   −0.001001   −0.001010
0.0707            −0.000905   −0.000919   −0.000920   −0.000946   −0.000949   −0.000961   −0.000989
0.1360            −0.000902   −0.000907   −0.000912   −0.000926   −0.000945   −0.000965   −0.000967
0.2610            −0.000889   −0.000889   −0.000898   −0.000910   −0.000927   −0.000946   −0.000961
0.5000            −0.000905   −0.000909   −0.000916   −0.000926   −0.000940   −0.000958   −0.000976
Table 7. Rise power.
T (ns) \ C (fF)   1.4         2.54        4.61        8.37        15.2        27.6        50.0
0.0100            −0.000926   −0.001105   −0.001442   −0.002045   −0.003127   −0.005081   −0.008588
0.0192            −0.000909   −0.001097   −0.001433   −0.002036   −0.003118   −0.005068   −0.008578
0.0368            −0.000900   −0.001086   −0.001420   −0.002022   −0.003105   −0.005057   −0.008568
0.0707            −0.000885   −0.001072   −0.001405   −0.002007   −0.003090   −0.005045   −0.008558
0.1360            −0.000892   −0.001078   −0.001408   −0.002001   −0.003088   −0.005047   −0.008559
0.2610            −0.000902   −0.001083   −0.001405   −0.002012   −0.003096   −0.005039   −0.008560
0.5000            −0.000950   −0.001131   −0.001459   −0.002059   −0.003141   −0.005093   −0.008610
Table 8. 1-bit full adder comparison results (without registers).
Design          Area (transistor)   Power (µW)   Delay (ns)
Ours            92                  6.17         0.13
P-FA-L0 [20]    74                  3.57         137.44
P-FA-L1 [20]    66                  3.77         137.9
P-FA-L2 [20]    60                  7.93         138.66
Table 9. 4-bit full adder comparison results with two different libraries.
Design                   Power (mW)   Delay (ns)
Ours                     0.1245       1.13
Using library in [15]    0.1571       0.59
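The relative differences implied by Table 9 can be checked with a short calculation. This sketch assumes that the power improvement of about 20% quoted in the abstract refers to these two power figures; the function name `relative_reduction` is hypothetical.

```python
def relative_reduction(reference, ours):
    """Percentage by which `ours` is lower than `reference`."""
    return (reference - ours) / reference * 100.0


# Values copied from Table 9 (4-bit full adder).
power_ref_mw, power_ours_mw = 0.1571, 0.1245
delay_ref_ns, delay_ours_ns = 0.59, 1.13

power_gain = relative_reduction(power_ref_mw, power_ours_mw)
delay_ratio = delay_ours_ns / delay_ref_ns

print(f"power reduction: {power_gain:.2f} %")
print(f"delay ratio (ours / reference): {delay_ratio:.2f}")
```

The same two rows show the trade-off: the power figure is lower with the proposed library, while the delay figure is higher.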
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Thanh, T.L.; Tri, L.T.; Hoang, T. A Methodology to Design Static NCL Libraries. J. Low Power Electron. Appl. 2022, 12, 31. https://doi.org/10.3390/jlpea12020031
