*NTV Clocking Architecture*

A calibrated ring oscillator (CRO) serves as a low-power, on-chip, MHz-range clock source for the 14-nm NTV-MCU. The CRO is a frequency-locked loop (Figure 11a) that uses an RTC as a reference to generate a MHz clock output. Internally, the CRO tracks the oscillation frequency of a ring oscillator and generates a delay code that adjusts that frequency to closely match the target frequency derived from the reference clock. The CRO can operate in (1) closed-loop mode, where it accurately tracks the target frequency, and (2) open-loop mode at ultralow voltages, where it produces a clock of tens of kHz, sufficient for always-on (AON) sensing operation on the MCU. Silicon characterization data for the CRO is presented in Figure 11b. The on-die CRO locks to a wide range of target frequencies from 1 V down to 0.4 V. The CRO dissipates 60 μW at 450 mV while generating a 16-MHz output to clock the MCU at VOPT. In the open-loop condition, the CRO is functional down to a deep sub-threshold voltage of 128 mV, dissipating 3.8 μW while generating a 7-kHz clock output. The CRO achieves a measured clock period jitter of 4.6 ps at 400-MHz operation.
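The closed-loop behavior described above can be illustrated with a minimal behavioral sketch. It is not the CRO implementation: the 32.768-kHz RTC frequency, the counting window, the one-step-per-window (bang-bang) update rule, and the `ring_freq_hz` oscillator model are all assumptions chosen only to show how a delay code can be steered toward a target frequency by counting oscillator cycles against a slow reference.

```python
RTC_HZ = 32_768  # assumed reference frequency; the text only says "RTC"


def ring_freq_hz(delay_code, f_max_hz=40e6, step=0.01):
    """Hypothetical oscillator model: each code step adds a fixed
    fraction of the minimum period, so a larger code means a slower clock."""
    return f_max_hz / (1.0 + step * delay_code)


def lock(target_hz, window_ref_cycles=32, code_bits=10, max_iters=2000):
    """Count oscillator cycles inside a window of reference cycles and
    move the delay code one step toward the target count."""
    target_count = round(target_hz * window_ref_cycles / RTC_HZ)
    max_code = (1 << code_bits) - 1
    code = 0
    for _ in range(max_iters):
        # Whole oscillator cycles observed during the reference window.
        count = int(ring_freq_hz(code) * window_ref_cycles / RTC_HZ)
        if count > target_count:      # too fast: add delay
            code = min(code + 1, max_code)
        elif count < target_count:    # too slow: remove delay
            code = max(code - 1, 0)
        else:                         # counts match: locked
            break
    return code, ring_freq_hz(code)


code, freq = lock(16e6)
print(f"delay code {code}, output {freq / 1e6:.3f} MHz")
```

In this sketch, open-loop operation would correspond to holding `code` fixed and skipping the comparison, so the output frequency simply follows whatever the oscillator produces at the applied voltage.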

The low-VDD global clock distribution network on the NTV-CPU (Figure 11c) uses low-VT devices to minimize clock skew at crossings between the logic and memory voltage domains over the entire operating voltage range, and its design accounts for the effect of random variations. The clock tree incorporates two-stage level shifters and programmable delay buffers in the clock path. The level shifters in the clock path track the delay of the data-path level shifters. In addition, programmable lookup-table-based delay buffers can be tuned to compensate for any inter-block skew variations. SSTA (6σ) variation analysis shows a 50% skew reduction at 0.5 V from clock delay tuning (Figure 11d).
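The idea of voltage-specific delay tuning through a lookup table can be sketched as follows. The block names, arrival times, tap sizes, and tap count are illustrative placeholders rather than measured or simulated data from the NTV-CPU, and the tuning rule (pad every block toward the latest arrival) is an assumption used only to show how a per-voltage table of buffer settings can reduce inter-block skew.

```python
# Hypothetical clock arrival times (ps) per block at each supply voltage.
# Illustrative values only, not data from the paper.
ARRIVAL_PS = {
    0.5: {"core": 120.0, "cache": 310.0, "io": 200.0},
    1.0: {"core": 32.0, "cache": 75.0, "io": 51.0},
}
# Assumed delay added by one buffer tap (ps) at each voltage.
STEP_PS = {0.5: 40.0, 1.0: 10.0}


def build_lut(arrival_ps, step_ps, max_taps=15):
    """For each voltage, pick the tap count that moves every block's
    arrival as close as possible to the latest-arriving block."""
    lut = {}
    for vdd, arrivals in arrival_ps.items():
        latest = max(arrivals.values())
        lut[vdd] = {
            blk: min(max_taps, round((latest - t) / step_ps[vdd]))
            for blk, t in arrivals.items()
        }
    return lut


def skew_ps(arrivals, taps, step):
    """Skew is the spread of arrival times after the taps are applied."""
    tuned = [arrivals[b] + taps[b] * step for b in arrivals]
    return max(tuned) - min(tuned)


lut = build_lut(ARRIVAL_PS, STEP_PS)
for vdd, arrivals in ARRIVAL_PS.items():
    untuned = skew_ps(arrivals, {b: 0 for b in arrivals}, STEP_PS[vdd])
    tuned = skew_ps(arrivals, lut[vdd], STEP_PS[vdd])
    print(f"{vdd} V: skew {untuned:.0f} ps -> {tuned:.0f} ps, taps {lut[vdd]}")
```

Keeping a separate table row per voltage reflects the fact that buffer and level-shifter delays scale differently as the supply drops, so a setting that balances the tree at 1 V need not balance it at 0.5 V.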

**Figure 11.** Multi-voltage global clock generation and distribution: (**a**) Calibrated Ring Oscillator (CRO) used in the 14-nm NTV-MCU; (**b**) CRO operating range in open- and closed-loop modes with μW power consumption; (**c**) The global clock distribution in the 32-nm NTV-CPU uses multi-stage level shifters; (**d**) Simulated clock skew reduction benefit from voltage-specific delay tuning using lookup tables.

## **5. Key Results from Experimental NTV Prototypes**
