

Article



# A Novel Word Line Driver Circuit for Compute-in-Memory Based on the Floating Gate Devices

Xiaofeng Gu, Rao Che, Yating Dong and Zhiguo Yu \*

Engineering Research Center of Internet of Things Technology Applications (Ministry of Education), Department of Electronic Engineering, Jiangnan University, Wuxi 214122, China

\* Correspondence: yuzhiguo@jiangnan.edu.cn

**Abstract:** In floating gate compute-in-memory (CIM) chips, due to the gate equivalent capacitance of the large-scale array and the parasitic capacitance of the long-distance transmission wire, it is difficult to balance the switching speed and area of the word line driver circuit (WLDC). The difference among the multiple voltage domains required for floating gate CIM devices also far exceeds the withstand voltage range of a single transistor in the WLDC. This paper proposes a novel WLDC based on the working principle of the CIM array. A multi-level pre-processing voltage control method is adopted to carry out an optional hierarchical transmission of multiple high voltages, significantly reducing the propagation delay. The proposed WLDC is based on the Wilson current mirror structure, which substantially reduces the physical design area. The simulation results show that the circuit can convert a 1.2 V low-voltage domain input signal with a frequency of 10 MHz into a high-voltage domain output voltage, and the output voltage range of a single WLDC can reach -10 V to 10 V. With a capacitive load within 5 pF, the transmission delay is less than 10 ns. The layout area is 594.88 $\mu\text{m}^2$, which is suitable for a large-scale CIM array.

**Keywords:** compute-in-memory (CIM); multiple voltage domains; word line driver circuit; Wilson current mirror



Citation: Gu, X.; Che, R.; Dong, Y.; Yu, Z. A Novel Word Line Driver Circuit for Compute-in-Memory Based on the Floating Gate Devices. *Electronics* 2023, *12*, 1185. https://doi.org/10.3390/electronics12051185

Academic Editors: Gerard Ghibaudo and Francis Balestra

Received: 9 January 2023 Revised: 23 February 2023 Accepted: 25 February 2023 Published: 1 March 2023



**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

# 1. Introduction

The rapid development of emerging digital technologies represented by artificial intelligence, cloud computing, and the Internet of Things has put forward unprecedented high requirements for data storage, transportation, and processing. Most modern computers use the von Neumann architecture (the separation of memory and computing architecture), which suffers from problems such as "memory wall" and "power consumption wall" [1]. In recent years, with memory logic operations and vector-matrix multiplication [2,3] showing optimization of time efficiency and power, the integration of compute-in-memory (CIM) has come into being [4–6], helping to prolong the life of Moore's Law and expand the application field of neural networks [7].

To realize the memory and operation functions of a CIM array based on floating gate devices, the peripheral driver circuit must generate, switch, and transmit various high voltages for the word, bit, and source lines. Commonly used structures fall into two types. In the first, a level shifter circuit [8–10] cooperates with switching metal-oxide-semiconductor (MOS) transistors to realize high-voltage control and transmission [11]. Since the level shifter circuit cannot directly generate high voltage, charge pumps are needed to generate the various high voltages. However, a level shifter circuit works in one voltage domain and can only switch between one high voltage and ground (GND), whereas the operating voltages of a floating gate CIM array include positive high voltage (PHV), negative high voltage (NHV), GND, and floating. The extensive use of level shifter circuits and switch arrays therefore consumes considerable chip area. In addition, with the advancement of technology, the gate-source and gate-drain withstand voltages of complementary metal-oxide-semiconductor (CMOS) devices continue to decline, and the frequent switching between positive and negative voltage domains in the switch array reduces the reliability of the circuit. In the second type, multiple voltages are selected and switched directly through digital-to-analog converters (DACs) [12] or decoders [13,14]. However, as CIM arrays continue to grow to serve more application scenarios, a large number of DACs are required, which hinders high integration when on-chip area is tight.

The driver circuit needs to provide the operating voltage of the CIM unit in the CIM array so that the CIM unit can complete the processes of writing, reading, and erasing [14–16]. As shown in Figure 1, the CIM array driver circuit can be divided into two critical modules according to function: a high-voltage control circuit and a charge pump [17]. Because floating gate devices require high voltage to switch the working mode, the driver circuit controls the port potentials of the CIM array to complete the memory and calculation operations of the floating gate devices.



Figure 1. The fundamental architecture of the compute-in-memory (CIM) array driver circuit.

In this work, we analyze the advantages and disadvantages of the two conventional level shifter topologies and, taking the working principle of the CIM array into account, design a novel word line driver circuit (WLDC) based on the Wilson current mirror. This design separates the working mode from the working state and transmits multiple voltages in stages, significantly reducing the transmission delay. In addition, using the clamping principle of a single MOS transistor, the voltage drop of a single floating gate device port in the WLDC is reduced to half of the original, which solves the withstand-voltage and high-voltage switching problems of the WLDC and dramatically improves its reliability.

The rest of the paper is organized as follows. Section 2 describes the proposed WLDC for the CIM array in detail. Section 3 presents the simulation results and discussion. The final conclusion is drawn in Section 4.

# 2. Proposed WLDC for CIM

## 2.1. Architecture of the CIM Array WLDC

According to the characteristics of floating gate devices and the requirements of artificial intelligence algorithms, all the word lines (WLs) of the CIM array are in the same operation mode when working, and each WL implements individual control and switching. Figure 2 highlights the framework of the CIM array high-voltage driver circuit based on floating gate devices where the WLDC part on the left consists of a shift register module, charge pump, and WL driver circuits. The specific CIM array WL driving voltage requirements are shown in Table 1.

To switch the WLDC between the working modes and working states in Table 1, this paper proposes a multi-level pre-processing voltage control method. According to the working principle of the floating gate CIM array, the WLDC operates in three working stages: weight writing, convolution operation, and weight erasing. In the weight writing stage, high-speed switching between HVINPUT and HVGND is realized. In the convolution operation stage, high-speed switching between HVREAD and HVGND is realized. In the weight erasing stage, high-speed switching between HVRESET and HVGND is realized. With this method, the WL never has to swing across the full 20 V difference between HVINPUT and HVRESET, so the charging and discharging time of the load and parasitic capacitances is reduced and the transmission delay of the WL voltage is shortened effectively. The design also avoids the excessive power consumption caused by repeated mode switching in the high-voltage circuit and significantly improves voltage transmission efficiency and stability.



Figure 2. The architecture of the CIM array high-voltage driver circuit.

| Operation Mode        | Operation State | Signal  | Explanation                            |
|-----------------------|-----------------|---------|----------------------------------------|
| weight writing        | work            | HVINPUT | weight input positive voltage          |
|                       | stop            | HVGND   | high-voltage ground                    |
| convolution operation | work            | HVREAD  | convolution operation positive voltage |
|                       | stop            | HVGND   | high-voltage ground                    |
| weight erasing        | work            | HVRESET | weight erase negative voltage          |
|                       | stop            | HVGND   | high-voltage ground                    |

Table 1. Compute-in-memory (CIM) array word line voltage demands.
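Table 1 can be read as a simple lookup from operation mode and state to word line voltage. The following is a minimal sketch, not part of the design itself: it uses the voltage values reported in Section 3 (HVINPUT = 10 V, HVREAD = 5 V, HVRESET = -10 V, HVGND = 0 V), and the names `LEVELS`, `WL_SIGNAL`, `wl_voltage`, and `stage_swing` are hypothetical.

```python
# Voltage levels in volts, as reported in Section 3 of the paper.
LEVELS = {"HVINPUT": 10.0, "HVREAD": 5.0, "HVRESET": -10.0, "HVGND": 0.0}

# Table 1: (operation mode, operation state) -> word line signal.
WL_SIGNAL = {
    ("weight writing", "work"): "HVINPUT",
    ("weight writing", "stop"): "HVGND",
    ("convolution operation", "work"): "HVREAD",
    ("convolution operation", "stop"): "HVGND",
    ("weight erasing", "work"): "HVRESET",
    ("weight erasing", "stop"): "HVGND",
}

MODES = ("weight writing", "convolution operation", "weight erasing")


def wl_voltage(mode: str, state: str) -> float:
    """Return the word line voltage for a given mode and state."""
    return LEVELS[WL_SIGNAL[(mode, state)]]


def stage_swing(mode: str) -> float:
    """Voltage swing seen on the WL when toggling work/stop within one stage."""
    return abs(wl_voltage(mode, "work") - wl_voltage(mode, "stop"))


if __name__ == "__main__":
    for mode in MODES:
        print(f"{mode}: work = {wl_voltage(mode, 'work')} V, "
              f"swing = {stage_swing(mode)} V")
    # Span between the most positive and most negative level.
    print("full span:", max(LEVELS.values()) - min(LEVELS.values()), "V")
```

With these values, toggling between work and stop inside any single stage involves a swing of at most 10 V, whereas the span between the most positive and most negative levels is 20 V.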

## 2.2. Conventional WLDC Based on Level Shifter

Conventional voltage level shifters can be categorized into two main types. Figure 3a shows the topology of the differential cascode voltage switch (DCVS). The DCVS architecture is based on cross-coupled pull-up PMOS transistors (PM1 and PM2) and two NMOS transistors (NM1 and NM2) driven by the complementary inputs IN and INB. When the voltage of IN is high, NM1 is on and NM2 is off. NM1 then pulls down node Q1, causing PM2 to turn on. Because node Q2 then rises to $V_{DDH}$, PM1 turns off, and OUT decreases to VSS. Note that the voltage of OUT is determined by the drive currents of the pull-up transistor PM2 and the pull-down transistor NM2. If $V_{DDH}$ is much larger than $V_{DDL}$, the pull-down transistors, whose gates are driven only from the low-voltage domain, cannot overpower the pull-up transistors, and the output fails to flip. The size of the pull-down transistors can then be increased, but this in turn increases power dissipation, load capacitance, and area [18].

Figure 3. Two main categories of conventional level shifters and traditional word line driver circuits. (a) Differential cascode voltage switch; (b) Current mirror-based architecture; (c) Three different mode circuits to transmit three operating voltages.

Figure 3b shows a level shifter that adopts a basic current mirror as the pull-up network. This architecture has little contention between the pull-up and pull-down networks and between the left and right branches of the circuit. The sizing of the pull-down network is therefore more relaxed, and the circuit also operates relatively faster, so it is usually preferred to its DCVS counterparts in terms of circuit delay and area. However, it has a reasonably large standby power, mainly due to the quiescent current flowing through one of the circuit branches, depending on the input state [19].

Figure 3c shows the traditional WLDC evolved from Figure 3a. It consists of three level shifters (LSs) that transmit HVINPUT, HVRESET, and HVREAD, respectively. It requires too many MOS transistors and occupies too much area, which is not conducive to high on-chip integration. It is worth noting that both conventional level shifter circuits share a fatal problem: each transmits only a single voltage, which can be changed only by changing $V_{DDH}$. This cannot meet the requirements of CIM arrays and cannot achieve voltage switching in the negative voltage domain. Given the relative merits of the two architectures, this paper chooses the level shifter circuit based on the current mirror structure as the research basis, since it offers both low area and low delay.

## 2.3. Proposed Novel WLDC

To solve the problems of the traditional WLDC, we propose a novel area-efficient WLDC. The simplified schematic of the proposed WLDC is shown in Figure 4. The design is a hybrid structure composed of a Wilson current mirror level shifter (WCMLS), an inverter, and a MOS switch. It uses half as many MOS transistors as the traditional WLDC. We integrate the transmission of three different voltages into one circuit structure, which not only reduces the area but also benefits the high-density layout of the large-scale CIM array. At the same time, we reduce the number of control signals to one, which avoids a tedious switching process and greatly improves the operability of the circuit.



Figure 4. Proposed novel word line driver circuit.

Compared with the simple current mirror level shifter, the proposed WLDC based on the WCMLS significantly reduces power consumption. Another reason for adopting the WCMLS is that its area is much smaller than that of a DCVS level shifter. The adopted technique relaxes the sizing requirements on NM1, NM2, PM4, and PM5, which can all take smaller width and length values than in a DCVS design. This decreases the impact of parasitic capacitance and thus improves the power-delay product. The proposed WLDC is designed in 65 nm CMOS technology and sized as reported in Table 2.

Table 2. Transistor sizes.

| Transistor | W/L (μm) | Transistor | W/L (μm) |
|------------|----------|------------|----------|
| PM1        | 0.7/1    | NM1        | 2/0.6    |
| PM2        | 0.7/0.65 | NM2        | 2/0.6    |
| PM3        | 0.7/0.65 | NM3        | 0.7/0.6  |
| PM4        | 1/0.65   | NM4        | 0.7/0.6  |
| PM5        | 1/0.65   | NM5        | 0.7/0.6  |
| PM6        | 2/0.65   | NM6        | 2/0.6    |
| PM7        | 10/0.65  | NM7        | 8/0.6    |
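To compare the relative sizing of the devices, the aspect ratios implied by Table 2 can be computed directly. This is only an illustrative sketch that restates the table; the names `SIZES_UM` and `aspect_ratio` are hypothetical, and W/L is used here merely as a rough first-order indicator of drive strength.

```python
# Transistor sizes from Table 2 as (width, length) in micrometres.
SIZES_UM = {
    "PM1": (0.7, 1.0),  "NM1": (2.0, 0.6),
    "PM2": (0.7, 0.65), "NM2": (2.0, 0.6),
    "PM3": (0.7, 0.65), "NM3": (0.7, 0.6),
    "PM4": (1.0, 0.65), "NM4": (0.7, 0.6),
    "PM5": (1.0, 0.65), "NM5": (0.7, 0.6),
    "PM6": (2.0, 0.65), "NM6": (2.0, 0.6),
    "PM7": (10.0, 0.65), "NM7": (8.0, 0.6),
}


def aspect_ratio(name: str) -> float:
    """Return the W/L ratio of the named transistor."""
    width, length = SIZES_UM[name]
    return width / length


if __name__ == "__main__":
    # List devices from the largest to the smallest W/L ratio.
    for name in sorted(SIZES_UM, key=aspect_ratio, reverse=True):
        print(f"{name}: W/L = {aspect_ratio(name):.2f}")
```

PM7 and NM7 have by far the largest ratios, which fits their role as the devices that pass the selected voltage to the WL.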

The five high-voltage inputs $V_{PHV}$, $V_{LV}$, $V_{NHV}$, $V_{INPUT\_READ}$, and $V_{RESET}$ determine whether the circuit is in the working stage of weight writing, convolution operation, or weight erasing; the input IN then determines whether the working state of the word line is work or stop. With this hierarchical control of the working mode and the working state of the CIM array, the transmission process is optimized, and the switching speed between WL work and stop within a fixed working stage is greatly improved.

## 2.3.1. The Principle of Weight Writing and Convolution Operation

Table 3 lists the voltage at every port of the WLDC. At the stages of weight writing and convolution operation, the control signals of the WLDC are as follows: $V_{PHV}$ is the maximum of HVINPUT and HVREAD, $V_{LV}$ and $V_{NHV}$ are HVGND, $V_{INPUT\_READ}$ is HVINPUT or HVREAD, and $V_{RESET}$ is HVGND. In these stages, the WLDC works in the HVGND and positive high-voltage domains.

| $V_{PHV}$ | $V_{LV}$ | $V_{NHV}$ | $V_{INPUT\_READ}$ | $V_{RESET}$ | IN  | OUT     |
|-----------|----------|-----------|-------------------|-------------|-----|---------|
| HVINPUT   | HVGND    | HVGND     | HVINPUT           | HVGND       | VDD | HVINPUT |
| HVINPUT   | HVGND    | HVGND     | HVINPUT           | HVGND       | GND | HVGND   |
| HVINPUT   | HVGND    | HVGND     | HVREAD            | HVGND       | VDD | HVREAD  |
| HVINPUT   | HVGND    | HVGND     | HVREAD            | HVGND       | GND | HVGND   |
| VDD       | VDD      | HVRESET   | HVGND             | HVRESET     | VDD | HVRESET |
| VDD       | VDD      | HVRESET   | HVGND             | HVRESET     | GND | HVGND   |

Table 3. Word line driver circuit port voltages.
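The port biasing can also be expressed as a small behavioural model. The sketch below is an assumption-laden illustration, not the authors' simulation setup: it uses the voltage values from Section 3, models only the steady-state output (no timing or device behaviour), and follows the description in Sections 2.3.1 and 2.3.2, where a low input sends HVGND to the WL in every stage. The names `BIAS` and `wl_output` are hypothetical.

```python
VDD = 1.2  # low-voltage supply in volts (Section 3)
HV = {"HVINPUT": 10.0, "HVREAD": 5.0, "HVRESET": -10.0, "HVGND": 0.0}

# Port biasing per working stage in volts, following Table 3.
BIAS = {
    "weight writing": {
        "V_PHV": HV["HVINPUT"], "V_LV": HV["HVGND"], "V_NHV": HV["HVGND"],
        "V_INPUT_READ": HV["HVINPUT"], "V_RESET": HV["HVGND"],
    },
    "convolution operation": {
        "V_PHV": HV["HVINPUT"], "V_LV": HV["HVGND"], "V_NHV": HV["HVGND"],
        "V_INPUT_READ": HV["HVREAD"], "V_RESET": HV["HVGND"],
    },
    "weight erasing": {
        "V_PHV": VDD, "V_LV": VDD, "V_NHV": HV["HVRESET"],
        "V_INPUT_READ": HV["HVGND"], "V_RESET": HV["HVRESET"],
    },
}


def wl_output(stage: str, in_high: bool) -> float:
    """Steady-state WL voltage for a stage and a logic level on IN."""
    bias = BIAS[stage]
    if stage == "weight erasing":
        # IN high: V_RESET (HVRESET) reaches the WL.
        # IN low: the V_INPUT_READ port, biased at HVGND, reaches the WL.
        return bias["V_RESET"] if in_high else bias["V_INPUT_READ"]
    # Positive stages. IN high: V_INPUT_READ reaches the WL.
    # IN low: the V_RESET port, biased at HVGND, reaches the WL.
    return bias["V_INPUT_READ"] if in_high else bias["V_RESET"]


if __name__ == "__main__":
    for stage in BIAS:
        print(stage, wl_output(stage, True), wl_output(stage, False))
```

In this model the stage is fixed by the five bias ports, and IN alone selects between the work voltage and HVGND, which mirrors the hierarchical control described above.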

The schematic of the proposed WLDC, which works at the stages of weight writing or convolution operation, is shown in Figure 5. For a rising transition at the input, the operation of the circuit is as follows.



Figure 5. The principle of positive high-voltage transmission.

At the initial moment, when the input is low, the voltages of nodes Q1 and Q2 are high, whereas Q3 is low. NM1 and PM2 are on, while NM2, PM1, and PM3 are off. A rising edge of the input turns NM2 on and turns NM1 off, causing the parasitic capacitance of node Q2 to start discharging simultaneously. While the voltage of Q2 is coming down, PM3 gradually turns on. Therefore, a transient current flows through the left branch and pulls Q3 up. This current is mirrored to the right branch and causes contention between the pull-up and pull-down currents in node Q2. However, by increasing the voltage of Q3, PM1 gradually turns off and prohibits the static current flowing through the left branch. Node Q2 will discharge to GND without the current mirrored by the left branch.

Because the gate of PM6 is connected to -VDD, the low voltage of node Q2 is transmitted to Q7 through PM6. When Q7 is GND, PM7 is turned on, and $V_{INPUT\_READ}$ (HVINPUT or HVREAD) is transmitted to the WL output terminal. Note that PM7 and NM7 need to be MOS transistors with a high threshold voltage so that the leakage current and power consumption of the output node are smaller.

Conversely, for a high-to-low transition, NM1 is on and NM2 is off, pulling nodes Q1 and Q3 down. When the voltage at Q1 reaches $V_{PHV}-V_{th}$, PM1 gradually turns on, allowing the static current to flow through PM1. This current is mirrored to the right branch and increases the voltage of node Q2. As the voltage at node Q2 is pulled up, PM3 turns off and blocks the static current. Similarly, the voltage of Q2 is transmitted to node Q7 through PM6. When Q7 is at a high voltage, NM7 is turned on, and $V_{RESET}$, which is HVGND in these stages, is transferred to the WL.

This circuit realizes the word line voltage conversion from GND-VDD to output HVGND-HVINPUT or HVREAD. The gate voltage of NM6 is 2VDD, which avoids the influence on the NMOS current mirror when the PMOS current mirror works.

## 2.3.2. The Principle of Weight Erasing

At the stage of weight erasing, the control signals listed in Table 3 are as follows: $V_{PHV}$ and $V_{LV}$ are VDD, $V_{NHV}$ is HVRESET, $V_{INPUT\_READ}$ is HVGND, and $V_{RESET}$ is HVRESET. In this stage, the WLDC works in the HVGND and negative high-voltage domains.

Figure 6 shows the schematic of the proposed WLDC working at the stage of weight erasing. For a rising transition at the input, the operation of the circuit is as follows. At the initial moment, when the input is low, the voltages of nodes Q4 and Q5 are high, whereas Q6 is low. PM5 and NM3 are on, while PM4, NM4, and NM5 are off. A rising edge of the input turns PM4 on and turns PM5 off, causing the parasitic capacitances of nodes Q4 and Q5 to start discharging simultaneously. When the voltage of Q5 reaches $V_{NHV}-V_{th}$, NM4 is turned on, and a transient current flows through the left branch. This current is mirrored to the right branch and pulls up the voltage of Q6, which gradually turns off NM3 and blocks the static current through the left branch. Because the gate of NM6 is connected to a positive 2VDD, the voltage of node Q4 is transmitted to Q7 through NM6. When Q7 is VDD, NM7 is turned on, and $V_{RESET}$ is transmitted to the WL output terminal.



Figure 6. The principle of negative high-voltage transmission.

On the other hand, for a high-to-low transition, PM4 is off and PM5 is on, pulling node Q6 down. When the voltage at Q6 drops, NM3 gradually turns on, allowing the static current to flow through NM4 and increasing the voltage of node Q4. This current is mirrored to the right branch and causes contention between the pull-up and pull-down currents at node Q6. As the voltages at nodes Q4 and Q5 increase, NM3 turns off, so the pull-up current of NM5 can no longer resist the pull-down current of PM5, and Q6 discharges to VDD. Similarly, the voltage of Q4 is transmitted to Q7 through NM6. When Q7 is at a negative high voltage, PM7 is turned on, and HVGND is sent to the WL.

This circuit realizes the word line voltage conversion from GND-VDD to output HVGND-HVRESET. The gate voltage of PM6 is -VDD, which avoids the influence on the PMOS current mirror when the NMOS current mirror works.

# 3. Results and Discussion

The proposed WLDC is designed in 65 nm CMOS technology, where VDD is 1.2 V and HVGND is 0 V. In the CIM array based on floating gate devices, the weight writing voltage HVINPUT is 10 V, the convolution operation voltage HVREAD is 5 V, and the weight erasing voltage HVRESET is -10 V. The results shown here are post-layout simulation results, obtained for an input signal with a frequency of 10 MHz, an amplitude of 1.2 V, and input rise and fall times of 1 ns, with a load capacitance of 1 pF at the output of the proposed WLDC. The input and output simulation waveforms of the three states are shown in Figure 7.



Figure 7. Transient simulation results of the proposed word line driver circuit (WLDC).

Across the simulated capacitive loads, the delay of all three operating modes stays within 10 ns, as shown in Figure 8. The WLDC designed in this work therefore has a low delay and a strong driving ability when driving large, picofarad-level loads, and it can meet the high-speed switching requirements of the CIM array word line.



Figure 8. Switching delay of the proposed WLDC with different load capacitance.
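To give a feel for what driving a picofarad-level load means, a back-of-envelope estimate of the average charging current is $I = C\,\Delta V / t$. The sketch below is not taken from the simulations in this paper: it assumes an ideal capacitive load charged at constant current, and it assumes, purely for illustration, that a full 10 V swing would have to complete within 10 ns. The function name `avg_slew_current` is hypothetical.

```python
def avg_slew_current(c_load_f: float, delta_v: float, t_s: float) -> float:
    """Average current (A) needed to move a capacitor by delta_v in t_s."""
    return c_load_f * delta_v / t_s


if __name__ == "__main__":
    swing_v = 10.0      # HVGND to HVINPUT, from Section 3
    window_s = 10e-9    # illustrative assumption: swing completes in 10 ns
    for c_pf in (1, 2, 5):
        current_a = avg_slew_current(c_pf * 1e-12, swing_v, window_s)
        print(f"{c_pf} pF: {current_a * 1e3:.1f} mA average")
```

Under these assumptions the required average current grows linearly with the load, from about 1 mA at 1 pF to about 5 mA at 5 pF.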

Although the proposed WLDC improves on the static power consumption of a simple current mirror LS, some static power consumption relative to the DCVS structure is unavoidable because the main structure is still a current mirror LS. Figure 9 shows the static power consumption of the three operating modes at the tt/ff/ss process corners and at different temperatures. The static power consumption is relatively higher at high temperatures and at the ff process corner. Figure 10 presents the total power consumption of the three operating modes at the tt/ff/ss process corners and at different temperatures. At room temperature, the static power consumption of the proposed WLDC is 102 pW, and the total power consumption is 249 nW. We sacrifice some power consumption for the advantages in area and speed, which is within our tolerance.



Figure 9. Static power of the proposed WLDC at all process corners versus temperature.



Figure 10. Total power of the proposed WLDC at all process corners versus temperature.
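The reported room-temperature power figures can be put in perspective with two simple derived quantities: the share of static power in the total, and the energy per input cycle. This is a minimal sketch that takes the reported numbers at face value and assumes the total power is the average power at the 10 MHz input frequency used in the simulations; the function names are hypothetical.

```python
STATIC_POWER_W = 102e-12  # reported static power at room temperature
TOTAL_POWER_W = 249e-9    # reported total power at room temperature
INPUT_FREQ_HZ = 10e6      # input frequency used in the simulations


def static_fraction(static_w: float, total_w: float) -> float:
    """Fraction of the total power that is static."""
    return static_w / total_w


def energy_per_cycle(total_w: float, freq_hz: float) -> float:
    """Average energy (J) consumed per input period."""
    return total_w / freq_hz


if __name__ == "__main__":
    share = static_fraction(STATIC_POWER_W, TOTAL_POWER_W)
    energy = energy_per_cycle(TOTAL_POWER_W, INPUT_FREQ_HZ)
    print(f"static share: {share * 100:.3f} %")
    print(f"energy per cycle: {energy * 1e15:.1f} fJ")
```

On these figures, static power is a very small share of the total, so the total is dominated by switching activity.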

Table 4 compares the proposed circuit with previously reported designs. The WLDC designed in this work has good symmetry: the PMOS current mirror is used in the positive high-voltage domain, and the NMOS current mirror is used in the negative high-voltage domain. Good matching can be achieved in the layout design stage by increasing the reuse of n-wells and p-wells, which also saves layout area. The proposed WLDC is therefore suitable for large-scale CIM arrays. Although its power-delay product (PDP) is slightly larger than that of [11], the proposed WLDC makes up for this by being able to transmit both positive and negative voltages at high speed. The novel WLDC is implemented in a standard 65 nm 5 V CMOS process with deep n-well devices and measures $42.13\,\mu\text{m} \times 14.12\,\mu\text{m} = 594.88\,\mu\text{m}^2$.

| Parameter                        | [8]     | [9]      | [10]      | [11]  | [17]     | This Work      |
|----------------------------------|---------|----------|-----------|-------|----------|----------------|
| Input voltage (V)                | 0.2     | 1.8      | 3-8.5     | 0.3   | 1.8      | 1.2            |
| Output voltage (V)               | 3       | 9.8-12.8 | 5.35-12.4 | 1.2   | 4.5-13.5 | 10/5/-10       |
| Operating frequency (MHz)        | 1       | 1.25     | 10        | 1     | 10       | 10             |
| Rise/Fall time (ns)              | 10.01   | 40-50    | 1.8       | 25    | 2.5-8    | 3.01/2.39/4.89 |
| Output load (pF)                 | -       | 15       | -         | -     | 0.015    | 1              |
| Static power (nW)                | 0.6     | 774      | -         | 2.5   | 0.37     | 0.1            |
| Total power (nW)                 | 11,233  | -        | 26,500    | 22.4  | -        | 249            |
| Power-delay product (nW·ns)      | 113,453 | -        | 47,700    | 560   | -        | 749.5          |
| Layout area (μm<sup>2</sup>) *   | 651     | 2628     | 1043      | 469   | 960      | 595            |
| Negative voltage support         | No      | No       | No        | No    | No       | Yes            |
| Multi-voltages transmission      | No      | No       | Yes       | No    | Yes      | Yes            |
| Process technology               | 45 nm   | 0.18 µm  | 0.18 μm   | 65 nm | 0.11 μm  | 65 nm          |

Table 4. Summary of comparisons between this work and previously reported results.

\* Layout area is the simulation result in a 65 nm process.
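The derived figures for this work in Table 4 can be reproduced from the reported values. The sketch below assumes that the power-delay product is the total power multiplied by the first listed rise/fall time (3.01 ns), which matches the tabulated 749.5 nW·ns, and that the layout dimensions are in micrometres. The function names are hypothetical.

```python
def power_delay_product(total_power_nw: float, delay_ns: float) -> float:
    """Power-delay product in nW*ns."""
    return total_power_nw * delay_ns


def layout_area(width_um: float, height_um: float) -> float:
    """Rectangular layout area in square micrometres."""
    return width_um * height_um


if __name__ == "__main__":
    pdp = power_delay_product(249.0, 3.01)
    area = layout_area(42.13, 14.12)
    print(f"PDP (this work): {pdp:.1f} nW*ns")
    print(f"Layout area: {area:.2f} um^2")
    # Same product for two reference designs from Table 4.
    print(f"PDP [10]: {power_delay_product(26500.0, 1.8):.0f} nW*ns")
    print(f"PDP [11]: {power_delay_product(22.4, 25.0):.0f} nW*ns")
```

The same product also reproduces the tabulated PDP values for [10] and [11], which supports reading the PDP row as total power times delay.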

# 4. Conclusions

In this work, by analyzing the problems of the conventional WLDC and considering the voltage requirements of the floating gate CIM array, we proposed a multi-level pre-processing voltage control method that can effectively reduce the transmission delay. On this basis, we designed a novel WLDC for the floating gate CIM array. Compared with the traditional solution, the area is reduced by nearly half. The simulation results indicate that the delay is less than 10 ns with large, picofarad-level loads. Meanwhile, we used the clamp voltage divider structure to reduce the 20 V voltage drop between the positive and negative voltage domains at a single device port to 10 V, solving the withstand-voltage and high-voltage switching problems of the WLDC. The proposed WLDC was designed in 65 nm CMOS technology and shows significant improvements in transmission delay, static power, and area, making it a promising candidate for the floating gate CIM array.

**Author Contributions:** Conceptualization, X.G. and Z.Y.; methodology, X.G. and R.C.; software, R.C. and Y.D.; validation, R.C. and Z.Y.; formal analysis, R.C. and Y.D.; investigation, X.G. and R.C.; resources, X.G. and Z.Y.; data curation, X.G. and R.C.; writing—original draft preparation, X.G. and R.C.; writing—review and editing, X.G. and Z.Y.; visualization, R.C. and Y.D.; supervision, X.G. and Z.Y.; project administration, X.G. and Z.Y.; funding acquisition, X.G. and Z.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Fundamental Research Funds for the Central Universities (Grant No. JUSRP51510), the Key R&D Program of Jiangsu Province (Grant No. BE2019003-2) and the Joint Project of Yangtze River Delta Community of Sci-Tech Innovation (Grant No. 2022CSJGG0400).

**Acknowledgments:** We thank Zuoqin Zhao for the helpful discussion.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### References

1. Sze, V.; Chen, Y.H.; Emer, J.; Suleiman, A.; Zhang, Z. Hardware for Machine Learning: Challenges and Opportunities. In Proceedings of the 2017 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX, USA, 30 April–3 May 2017; pp. 1–8.
2. Sahay, S.; Bavandpour, M.; Mahmoodi, M.R.; Strukov, D. Energy-Efficient Moderate Precision Time-Domain Mixed-Signal Vector-by-Matrix Multiplier Exploiting 1T-1R Arrays. *IEEE J. Explor. Solid-State Comput. Devices Circuits* **2020**, *6*, 18–26.
3. Rizzo, T.; Strangio, S.; Iannaccone, G. Time Domain Analog Neuromorphic Engine Based on High-Density Non-Volatile Memory in Single-Poly CMOS. *IEEE Access* **2022**, *10*, 49154–49166.
4. Shukla, P.; Muralidhar, A.; Iliev, N.; Tulabandhula, T.; Fuller, S.B.; Trivedi, A.R. Ultralow-Power Localization of Insect-Scale Drones: Interplay of Probabilistic Filtering and Compute-in-Memory. *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* **2022**, *30*, 68–80.
5. Kim, S.; Baek, M.H.; Hwang, S.; Jang, T.; Park, K.; Park, B.G. A Novel Vector-matrix Multiplication (VMM) Architecture based on NAND Memory Array. *J. Semicond. Technol. Sci.* **2020**, *20*, 242–248.
6. Paliy, M.; Strangio, S.; Ruiu, P.; Rizzo, T.; Iannaccone, G. Analog Vector-Matrix Multiplier Based on Programmable Current Mirrors for Neural Network Integrated Circuits. *IEEE Access* **2020**, *8*, 203525–203537.
7. Yan, B.; Yang, Q.; Chen, W.H.; Chang, K.T.; Su, J.W.; Hsu, C.H.; Li, S.H.; Lee, H.Y.; Sheu, S.S.; Ho, M.S.; et al. RRAM-based Spiking Nonvolatile Computing-In-Memory Processing Engine with Precision-Configurable In Situ Nonlinear Activation. In Proceedings of the 2019 Symposium on VLSI Technology, Kyoto, Japan, 9–14 June 2019; pp. T86–T87.
8. Ma, C.; Ji, Y.; Qiao, C.; Zhou, T.; Qi, L.; Li, Y. An Energy-Efficient Level Shifter Using Time Borrowing Technique for Ultra Wide Voltage Conversion from Sub-200mV to 3.0V. In Proceedings of the 2021 IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea, 22–28 May 2021; pp. 1–5.
9. Cha, H.K.; Zhao, D.N.; Cheong, J.H.; Guo, B.; Yu, H.B.; Je, M. A CMOS High-Voltage Transmitter IC for Ultrasound Medical Imaging Applications. *IEEE Trans. Circuits Syst. II-Express Briefs* **2013**, *60*, 316–320.
10. Palomeque, M.D.; Rodriguez, V.A.; Delgado, R.M. A High-Voltage Floating Level Shifter for A Multi-Stage Charge-Pump in A Standard 1.8 V/3.3 V CMOS Process. *Aeu-Int. J. Electron. Commun.* **2022**, *156*, 154389.
11. Zhao, W.; Alvarez, A.B.; Ha, Y. A 65-nm 25.1-ns 30.7-fJ Robust Subthreshold Level Shifter With Wide Conversion Range. *IEEE Trans. Circuits Syst. II Express Briefs* **2015**, *62*, 671–675.
12. Xie, S.; Ni, C.; Sayal, A.; Jain, P.; Hamzaoglu, F.; Kulkarni, J.P. 16.2 eDRAM-CIM: Compute-In-Memory Design with Reconfigurable Embedded-Dynamic-Memory Array Realizing Adaptive Data Converters and Charge-Domain Computing. In Proceedings of the 2021 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 13–22 February 2021; pp. 248–250.
13. Rana, V.; Pasotti, M.; Carissimi, M. Row Decoder for Embedded Phase Change Memory Using Low Voltage Transistors. *Microelectron. J.* **2018**, *81*, 117–122.
14. Monga, K.; Chaturvedi, N.; Gurunarayanan, S. A Dual-Mode In-Memory Computing Unit Using Spin Hall-Assisted MRAM for Data-Intensive Applications. *IEEE Trans. Magn.* **2021**, *57*, 1–10.
15. Hung, J.M.; Li, X.; Wu, J.; Chang, M.F. Challenges and Trends in Developing Nonvolatile Memory-Enabled Computing Chips for Intelligent Edge Devices. *IEEE Trans. Electron Devices* **2020**, *67*, 1444–1453.
16. Yantır, H.E.; Eltawil, A.M.; Salama, K.N. IMCA: An Efficient In-Memory Convolution Accelerator. *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.* **2021**, *29*, 447–460.
17. Rana, V.; Sinha, R. Stress Relaxed Multiple Output High-Voltage Level Shifter. *IEEE Trans. Circuits Syst. II Express Briefs* **2018**, *65*, 176–180.
18. Osaki, Y.; Hirose, T.; Kuroki, N.; Numa, M. A Low-Power Level Shifter With Logic Error Correction for Extremely Low-Voltage Digital CMOS LSIs. *IEEE J. Solid-State Circuits* **2012**, *47*, 1776–1783.
19. Zhou, J.; Wang, C.; Liu, X.; Zhang, X.; Je, M. An Ultra-Low Voltage Level Shifter Using Revised Wilson Current Mirror for Fast and Energy-Efficient Wide-Range Voltage Conversion from Sub-Threshold to I/O Voltage. *IEEE Trans. Circuits Syst. I Regul. Pap.* **2015**, *62*, 697–706.

**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.