**2. Design Methods of 3-D Synapse Array Architecture**

In general, achieving high performance on artificial intelligence tasks requires a large artificial neural network with many synaptic weights and neuron layers. For the ImageNet classification challenge, state-of-the-art deep neural network (DNN) architectures have between 5 M and 155 M synaptic weight parameters [16]. To implement a large artificial neural network efficiently on a hardware chip of limited size, we proposed the 3-D stacked synapse array structure (Figure 1) in our previous work [11].

**Figure 1.** 3-D synapse array structure [11]. (**a**) 3-D stacked synapse device; (**b**) Unit synapse cell structure.

The unit synapse cell is composed of two CTF devices with two drain nodes (D(+), D(−)) and a common source node (S). The D(+) part is connected to the output neuron circuit to increase the membrane potential, acting as an excitatory synapse. The D(−) part is connected to the output neuron circuit to decrease the membrane potential, acting as an inhibitory synapse. With this configuration, a single cell can represent both positive and negative weights. As summarized in Table 1, the CTF device has several advantages over other non-volatile memory devices. First, it does not need an additional selector device because the three-terminal MOSFET-based unit cell has a built-in selection operation. Second, it is fully CMOS compatible. Third, linear and incremental modulation of the weight (conductance) is easier to achieve because the conductance is determined by the number of trapped charges. Fourth, it has good retention reliability. On the other hand, the drawback of the CTF device is its large power consumption during the program operation. Therefore, CTF devices are best suited to off-chip learning-based neuromorphic systems, where frequent weight updates do not occur.
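The excitatory/inhibitory pairing described at the start of this paragraph can be illustrated with a minimal sketch. It assumes a simple differential model in which the signed weight is proportional to the difference between the D(+) and D(−) conductances and each branch current is the input voltage times the conductance; this model, the function names, and the numerical values are illustrative assumptions, not taken from the source.

```python
from typing import Sequence


def effective_weight(g_plus: float, g_minus: float) -> float:
    """Signed weight of one unit cell, ASSUMED proportional to G(+) - G(-).

    g_plus / g_minus are the conductances (in siemens) of the D(+) and D(-)
    CTF devices. The differential model itself is an assumption.
    """
    return g_plus - g_minus


def net_neuron_current(v_in: Sequence[float],
                       g_plus: Sequence[float],
                       g_minus: Sequence[float]) -> float:
    """Excitatory current minus inhibitory current, summed over all inputs.

    A positive result would raise the membrane potential of the output
    neuron, a negative result would lower it.
    """
    if not (len(v_in) == len(g_plus) == len(g_minus)):
        raise ValueError("all inputs must have the same length")
    i_exc = sum(v * g for v, g in zip(v_in, g_plus))   # through D(+)
    i_inh = sum(v * g for v, g in zip(v_in, g_minus))  # through D(-)
    return i_exc - i_inh


if __name__ == "__main__":
    # Hypothetical conductances: the first cell is net excitatory,
    # the second is net inhibitory.
    print(effective_weight(3e-6, 1e-6))
    print(effective_weight(1e-6, 3e-6))
```

Under this assumption, a cell with G(+) larger than G(−) behaves as a positive weight and a cell with G(−) larger than G(+) as a negative weight, without any change to the input signal.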


**Table 1.** Comparison between non-volatile memory devices for neuromorphic hardware systems.

The proposed 3-D stacked synapse array structure is based on the word-line stacking method, which is similar to that of commercialized V-NAND flash memory. It can therefore take advantage of the stable, established process methods used for V-NAND flash memory.

A key issue in the design of the 3-D stacked synapse array architecture is the metal interconnection. For example, a 4-layer stacked synapse array has four times as many word lines as a 2-D synapse array. If the word-line (WL) decoder is connected by a conventional metal interconnection method, the vertical length of the WL decoder (HWL\_Decoder) increases as illustrated in Figure 2, resulting in an enormous loss of area efficiency at the full-chip level. To solve this issue, we proposed a layer select decoder with 3-D metal line connection in our previous work [11]. As shown in Figure 3a, the area of the WL decoder does not increase, and a layer select decoder is added to selectively operate each stacked layer. The layer select decoder delivers the gate voltages generated by the WL decoder to the WLs of the selected layer. It is important to note that the vertical length of the layer select decoder is the same as that of the WL decoder, and its horizontal length is only 4F × N, where F is the minimum feature size and N is the number of stacked layers. The specific structure of the transistors and metal interconnects is described in our previous paper [11].
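The scaling argument above can be made concrete with a short calculation. The sketch below uses only the two relations stated in the text (the word-line count grows with the number of stacked layers, and the layer select decoder is 4F × N wide); the function names and the example values (128 word lines per layer, F = 50 nm) are hypothetical and chosen purely for illustration.

```python
def total_word_lines(wl_per_layer: int, n_layers: int) -> int:
    """Word lines in an N-layer stack: N times those of one 2-D layer."""
    return wl_per_layer * n_layers


def layer_select_decoder_width(feature_size: float, n_layers: int) -> float:
    """Horizontal length of the layer select decoder, 4F x N.

    The result is in the same unit as feature_size.
    """
    return 4 * feature_size * n_layers


if __name__ == "__main__":
    wl_per_layer = 128   # hypothetical array size
    feature_nm = 50.0    # hypothetical minimum feature size in nm
    for n in (1, 2, 4, 8):
        print(n,
              total_word_lines(wl_per_layer, n),
              layer_select_decoder_width(feature_nm, n))
```

Note that the decoder width depends only on F and N, not on the number of word lines per layer, which is why the added area stays small even for large arrays.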

The top-view layout of the 3-D synapse array architecture is illustrated in Figure 4. The layer select decoder is composed of pass transistors. The pass transistors are arranged next to each word line and are connected one-to-one with each WL contact. The gate nodes of the pass transistors are vertically connected to form a layer select line (LSL) that is controlled by the LSL control circuit. With this configuration, each stacked layer can be selectively operated while maintaining a compact full-chip configuration. For example, if the turn-on voltage is applied to L4 and the turn-off voltages are applied to L1~L3, the pass transistors corresponding to L = 4 are activated. Consequently, the WL voltages generated in the WL decoder are transferred to the fourth-layer WLs (L = 4).
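The layer selection described above amounts to a one-hot pattern on the layer select lines. The following is a minimal behavioral sketch, not a circuit model: the function name and the voltage values `V_ON` and `V_OFF` are assumptions for illustration, since the source does not give the actual bias levels.

```python
from typing import List

V_ON = 3.0   # hypothetical turn-on voltage for the pass transistors (V)
V_OFF = 0.0  # hypothetical turn-off voltage (V)


def lsl_voltages(n_layers: int, selected_layer: int) -> List[float]:
    """Return the LSL bias for layers L1..LN (index 0 is L1).

    Only the selected layer receives the turn-on voltage, so only its
    pass transistors connect the WL decoder outputs to that layer's WLs.
    """
    if not 1 <= selected_layer <= n_layers:
        raise ValueError("selected_layer must be in 1..n_layers")
    return [V_ON if layer == selected_layer else V_OFF
            for layer in range(1, n_layers + 1)]


if __name__ == "__main__":
    # Example from the text: select L4 in a 4-layer stack.
    print(lsl_voltages(4, 4))
```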

**Figure 2.** Metal interconnection scheme of synapse array architecture. (**a**) 2-D neuromorphic system architecture; (**b**) 3-D neuromorphic system architecture (a bad design example).

**Figure 3.** Schematic of the proposed 3-D synapse array architecture. (**a**) Metal interconnection of a full-chip architecture; (**b**) Each synapse layer configuration to implement artificial neural network.

**Figure 4.** Top view image of the revised synapse array architecture.

In this paper, we propose an improved architecture design over the previous work by adding a ground select decoder, as shown in Figure 4. If there is only a layer select decoder, the WLs of the unselected stacked layers are in a floating state because they are not connected to the WL decoder. In this case, the potential of the WLs of the unselected layers varies due to the capacitive coupling between the stacked WLs. In the worst case, the WLs of the unselected layers located directly above or below (L = *n* − 1 or L = *n* + 1) the selected layer (L = *n*) may be boosted together when a high voltage is applied to the selected WLs. To remove this inherent risk of the previous architecture, a ground select decoder that applies a turn-off voltage (0 V) to the WLs of the unselected layers is added to the right side of the main 3-D stacked synapse array, as shown in Figure 4.
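The difference between the previous and the revised architecture can be summarized with a small behavioral sketch. It models only the outcome stated in the text (unselected WLs float without the ground select decoder and are held at 0 V with it); the function name, the use of `None` to mark a floating node, and the example voltages are assumptions for illustration, and the capacitive coupling itself is not modeled.

```python
from typing import List, Optional, Sequence


def wl_states(wl_decoder_out: Sequence[float],
              n_layers: int,
              selected_layer: int,
              ground_select: bool = True) -> List[List[Optional[float]]]:
    """Return the voltage driven onto every WL of layers L1..LN.

    None marks a floating WL, i.e. a node that is not driven and may be
    disturbed by coupling from neighbouring layers.
    """
    if not 1 <= selected_layer <= n_layers:
        raise ValueError("selected_layer must be in 1..n_layers")
    states: List[List[Optional[float]]] = []
    for layer in range(1, n_layers + 1):
        if layer == selected_layer:
            # Layer select decoder passes the WL decoder voltages through.
            states.append(list(wl_decoder_out))
        elif ground_select:
            # Revised design: ground select decoder holds the WLs at 0 V.
            states.append([0.0] * len(wl_decoder_out))
        else:
            # Previous design: unselected WLs are left floating.
            states.append([None] * len(wl_decoder_out))
    return states


if __name__ == "__main__":
    v_wl = [5.0, 0.0]  # hypothetical WL decoder outputs (V)
    print(wl_states(v_wl, 4, 2, ground_select=False))
    print(wl_states(v_wl, 4, 2, ground_select=True))
```

In the revised case every WL in the stack has a defined potential, which is the property the ground select decoder is added to guarantee.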

The detailed manufacturing process of the 3-D synapse array was described in our previous paper [11]. The revised synapse array architecture can be fabricated with the same process. Since the newly added ground select decoder has the same structure as the layer select decoder, it can be made simply by adding the same layout as the layer select decoder.

To validate the synaptic operations of the designed CTF-based synapse device, we used technology computer-aided design (TCAD) device simulation (Synopsys Sentaurus [17]). The specific device parameters are summarized in Table 2. The electrical characteristics of the designed synapse device are discussed in the next section.

**Table 2.** Physical parameters of the device used for electrical simulation.

