1. Introduction
Over the past decade, the spiking neural network (SNN) has become one of the most popular architectures for simulating the neural network of the brain. The SNN is considered the third-generation neural network, while the artificial neural network (ANN) is considered the second generation. Compared with the ANN, the SNN is more biologically plausible and achieves better performance in pattern recognition tasks [1]. The ANN often uses nearly perfect integrators and a non-linear activation function, whereas cortical neurons behave as leaky integrators that use conductance-based synapses. Furthermore, the standard training method in the ANN is backpropagation, in which each neuron is fed its specific error signal for updating the weight matrix during training. However, this kind of learning based on neuron-specific error signals is unlikely to occur in the cerebral cortex, where learning is closer to unsupervised methods such as the spike-timing-dependent plasticity (STDP) mechanism [2]. In the SNN, the neural information is stored in the neuron in the form of spike trains. When an external signal arrives, the neuron integrates the input and the leakage, while the weight of the synapse connecting each pair of neurons is updated based on the STDP mechanism.
Realizing STDP-based learning involves two stages. The first is the adjustment of the connection strengths (i.e., synaptic weights) based on the relative timing of a particular neuron’s output and input states. The second is the hardware implementation of the SNN trained by STDP. In the implementation, the neuron is needed to generate the spiking signal, and the synapse is needed to adjust the weight in real time. The SNN offers significant benefits owing to its asynchronous processing and massively parallel architecture [1,2]. Recent developments in neuromorphics aim to implement the SNN in hardware to fully exploit its potential in terms of low energy consumption. Nevertheless, general-purpose computing platforms and custom hardware architectures implemented using standard CMOS technology cannot rival the power efficiency of the human brain. Hence, there is a need for novel nanoelectronic devices that can efficiently model the neurons and synapses constituting the SNN.
As an emerging non-volatile memory, magnetic random-access memory (MRAM) has many advantages, such as non-volatility, low power consumption, high integration density, high endurance, compatibility with the CMOS process, and radiation hardness, and is considered one of the most promising next-generation memories [3,4,5,6,7,8]. MRAM opens the door to a new computing paradigm that differs from the traditional von Neumann architecture. As the core device of MRAM, the magnetic tunnel junction (MTJ) device shows promising properties [9]. At present, it is applied in many fields, including memory, sensors, and neural networks [8,9,10,11,12,13,14].
Current digital implementations of neuromorphic computing rely on large numbers of CMOS transistors, which commonly occupy a large area and consume a lot of energy. A neuromorphic circuit with a more efficient architecture is therefore highly desirable. For example, the sheer number of synapses in even a few-node neural network requires intricate connections and routing, which would be relatively expensive with a CMOS-only solution. The MTJ device can be used to represent the biological neuron and the synapse on a one-to-one basis to mimic the computational dynamics of the human brain [15]. In this sense, the MTJ device offers a compact and energy-efficient alternative to the traditional CMOS-based neural network [16,17].
In this paper, a dynamic model of the MTJ device is established first. Based on the operation mechanism of the MTJ device, a high resemblance is shown between the magnetization dynamics of the MTJ device and the STDP mechanism observed in biological synapses. A high resemblance is also shown between the magnetization dynamics of the MTJ device and the characteristics observed in biological neurons. Finally, a demo SNN based on the MTJ synapse and the MTJ neuron is constructed and used to solve an image recognition problem. Compared with other works, a different neuron structure is adopted, a detailed explanation of how to convert pixel information into presynaptic spikes is given, and a corresponding neuron reset circuit is designed to implement STDP in hardware.
The rest of the paper is organized as follows. Section 2 introduces the modeling strategy for the MTJ device. Section 3 presents the similarities between the MTJ device and the biological synapse. Section 4 presents the similarities between the MTJ device and the biological neuron. In Section 5, the SNN is mimicked by the MTJ device and is applied to distinguish two types of handwritten digit images. Finally, conclusions are drawn in Section 6.
2. Dynamic Model of the STT-MTJ Device for Simulation
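The model of the STT-MTJ device was constructed based on the Landau–Lifshitz–Gilbert (LLG) equation with a spin-transfer torque term (the LLGS equation) [15,16,17]:

$$\frac{\partial \mathbf{m}}{\partial t} = -\gamma\, \mathbf{m} \times \mathbf{H}_{eff} + \alpha\, \mathbf{m} \times \frac{\partial \mathbf{m}}{\partial t} + \boldsymbol{\tau}_{STT} \quad (1)$$

where $\mathbf{m}$ is the unit magnetic vector of the free layer; $\mathbf{m}_p$ is the unit magnetic vector of the pinned layer; JSTT is the current density of the MTJ device and JSTT = ISTT/Area; $\mathbf{H}_{eff}$ is the effective magnetic field, including the demagnetization field; and $\boldsymbol{\tau}_{STT}$ is the STT term, which is proportional to JSTT and to $\mathbf{m} \times (\mathbf{m} \times \mathbf{m}_p)$. Here, γ is the gyromagnetic ratio and α is the Gilbert damping constant. Other parameters are listed in Table 1 [18].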
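The deterministic part of these dynamics can be illustrated with a small macrospin integration. The following is a minimal sketch, not the simulation setup of the paper: it assumes dimensionless units (time in units of 1/(γHk), fields in units of an anisotropy field Hk), keeps only a uniaxial perpendicular anisotropy in the effective field, omits thermal noise, and uses a hypothetical signed torque amplitude `a_j` in place of the physical prefactor of the STT term. The values of `alpha` and `a_j` are illustrative and are not taken from Table 1.

```python
import numpy as np

def llgs_rhs(m, p, alpha, a_j):
    """LLGS right-hand side in explicit (Landau-Lifshitz) form, dimensionless units.

    Assumption: the effective field contains only a uniaxial anisotropy along z.
    With this sign convention, a_j > 0 pulls m toward p and a_j < 0 pushes it away.
    """
    h_eff = np.array([0.0, 0.0, m[2]])
    m_x_h = np.cross(m, h_eff)
    m_x_m_x_h = m * np.dot(m, h_eff) - h_eff   # m x (m x h) for |m| = 1
    m_x_p = np.cross(m, p)
    m_x_m_x_p = m * np.dot(m, p) - p           # m x (m x p) for |m| = 1
    dm = -m_x_h - alpha * m_x_m_x_h - a_j * m_x_m_x_p + alpha * a_j * m_x_p
    return dm / (1.0 + alpha**2)

def simulate(theta0_deg, a_j, alpha=0.02, t_end=80.0, dt=0.02):
    """Integrate with Heun's method and renormalize m after every step."""
    theta0 = np.radians(theta0_deg)
    m = np.array([np.sin(theta0), 0.0, np.cos(theta0)])
    p = np.array([0.0, 0.0, 1.0])              # reference layer fixed at (0, 0, 1)
    for _ in range(int(round(t_end / dt))):
        k1 = llgs_rhs(m, p, alpha, a_j)
        m_pred = m + dt * k1
        m_pred /= np.linalg.norm(m_pred)
        k2 = llgs_rhs(m_pred, p, alpha, a_j)
        m = m + 0.5 * dt * (k1 + k2)
        m /= np.linalg.norm(m)
    return m

m_small = simulate(5.0, a_j=-0.005)   # weak torque: damping wins, m relaxes to +z
m_large = simulate(5.0, a_j=-0.2)     # strong torque: m reverses to -z
```

In this reduced setting the torque must exceed the damping (roughly |a_j| > alpha) for the magnetization to reverse, which mirrors the qualitative difference between a small and a large write current.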
Considering the probabilistic switching behavior of the MTJ device, two stochastic aspects are included in the MTJ model. The first one is the angle between the stochastic initial magnetization vector and the easy axis. The second aspect is the stochastic thermal fluctuation field caused by thermal noise. For these two additional stochastic effects of the MTJ device, two corresponding stochastic terms are included in the LLGS equation to simulate the influence on the probabilistic characteristics of the MTJ device.
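To simulate the first stochastic term, the initial value of $\mathbf{m}$ is set in polar coordinates as follows:

$$\mathbf{m}_0 = (\sin\theta_0 \cos\varphi_0,\ \sin\theta_0 \sin\varphi_0,\ \cos\theta_0)$$
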
In most cases,
.
$\theta_0$ is the initial angle, which follows a Gaussian distribution as follows [19]:
where
and
is the average value of
[20,21,22].
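The other stochastic term is the random thermal fluctuation field, $\mathbf{H}_{th}$. The three components of $\mathbf{H}_{th}$ in the x, y, and z directions follow a zero-mean Gaussian distribution as follows [23]:

$$H_{th,i} \sim \mathcal{N}(0, \sigma_{th}^2), \quad i \in \{x, y, z\}$$

where the standard deviation $\sigma_{th}$ depends on the temperature, the damping, the free-layer volume, and $\Delta t$, which is the time step of the simulation. The LLGS equation with stochastic terms is generally named the SLLGS equation.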
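The two stochastic ingredients can be sketched as two sampling routines. This is a minimal illustration under stated assumptions: the mean and spread of the initial angle, the uniform azimuth, and the value of `sigma_th` are all hypothetical placeholders, not parameters from the paper.

```python
import numpy as np

def sample_initial_magnetization(rng, theta_mean_deg=5.0, theta_std_deg=1.0):
    """Draw a unit vector tilted from the easy (z) axis by a Gaussian-distributed angle."""
    theta0 = np.radians(rng.normal(theta_mean_deg, theta_std_deg))
    phi0 = rng.uniform(0.0, 2.0 * np.pi)   # azimuth drawn uniformly (assumption)
    return np.array([np.sin(theta0) * np.cos(phi0),
                     np.sin(theta0) * np.sin(phi0),
                     np.cos(theta0)])

def sample_thermal_field(rng, sigma_th=0.01):
    """Draw independent zero-mean Gaussian x, y, z components of the thermal field."""
    return rng.normal(0.0, sigma_th, size=3)

rng = np.random.default_rng(seed=0)
m0 = sample_initial_magnetization(rng)
h_th = sample_thermal_field(rng)
```

In a stochastic simulation, a fresh thermal field sample would be drawn at every time step and added to the effective field, while the initial magnetization is drawn once per run.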
The typical sandwich structure of the STT-MTJ device is shown in Figure 1a, including the free layer, the oxide layer, and the reference layer [3]. The magnetization direction of the reference layer is fixed at (0, 0, 1), and the magnetization of the free layer stores the information. As shown in Figure 1, with $\theta_0$ = 5°, when ISTT = 5 μA, the spin-transfer torque is not large enough and $\mathbf{m}$ remains at (0, 0, 1) due to the damping term. In this case, the MTJ device is in the P state. When ISTT = 200 μA, the MTJ device is switched to the AP state, and $\mathbf{m}$ changes to (0, 0, −1).
3. Design of the STT-MTJ-Based Synapse
There exists a high resemblance between the STT-MTJ device and the biological synapse. In biology, the synapse acts as the bridge between neurons. The neuron emitting a signal is called the presynaptic neuron (PRE), and the neuron receiving the signal is called the postsynaptic neuron (POST). The synapse connects the PRE with the POST. Based on the STDP mechanism, the update of the synaptic weight depends on the relative spike timing of the PRE and the POST. If the PRE spike precedes the POST spike, the synaptic weight is increased. On the contrary, if the PRE spike lags behind the POST spike, the synaptic weight is decreased accordingly. The mathematical expression of the STDP mechanism is as follows [2]:

$$\Delta w = \begin{cases} A_{+} \exp(-\Delta t / \tau_{+}), & \Delta t > 0 \\ -A_{-} \exp(\Delta t / \tau_{-}), & \Delta t < 0 \end{cases}$$

where $\Delta w$ is the relative change of the synaptic weight and A+, A−, τ+, and τ− are constants. $\Delta t$ is the time difference between the PRE and the POST spikes, $\Delta t = t_{POST} - t_{PRE}$, where tPRE is the moment when the PRE is activated and tPOST is the moment when the POST is activated.
The synaptic weight is increased for positive time windows (Δt > 0); this is called long-term potentiation (LTP). For negative time windows (Δt < 0), the synaptic weight is decreased, which is called long-term depression (LTD). According to the STDP mechanism, the synaptic weight can be programmed in situ based on the spike timing information transmitted between the spiking neurons.
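The exponential STDP window can be written as a short function. This is a minimal sketch assuming the common form with potentiation for Δt > 0 and depression for Δt < 0; the default amplitudes and time constants are illustrative values, not fitted parameters.

```python
import math

def stdp_weight_change(delta_t, a_plus=1.0, a_minus=1.0, tau_plus=20.0, tau_minus=20.0):
    """Relative weight change for delta_t = t_POST - t_PRE (arbitrary time units)."""
    if delta_t > 0:      # PRE fires before POST: long-term potentiation (LTP)
        return a_plus * math.exp(-delta_t / tau_plus)
    if delta_t < 0:      # PRE fires after POST: long-term depression (LTD)
        return -a_minus * math.exp(delta_t / tau_minus)
    return 0.0           # simultaneous spikes: no change (assumption)

ltp = stdp_weight_change(10.0)
ltd = stdp_weight_change(-10.0)
```

The magnitude of the change decays as the spikes move further apart in time, so closely timed spike pairs dominate the weight update.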
A similar adjustment mechanism is also observed in the MTJ device in terms of the device conductance. For the perpendicular MTJ device (P-MTJ), its conductance can be adjusted by controlling the pulse width of the voltage. With the positive write current flowing from the free layer to the pinned layer, the resistance of the MTJ is increased. On the contrary, with the negative write current flowing from the pinned layer to the free layer, the resistance is decreased.
Figure 2 shows the designed synapse based on the STT-MTJ device. The structure is shown in Figure 2a. The 1T-1MTJ cell is used as the synaptic connection between the PRE and the POST. The gate voltage of the NMOS, VG, is controlled by the membrane potential of the PRE, while the node voltage at the top of the MTJ, VT, is controlled by the membrane potential of the POST. Figure 2b shows the schematic of the time sequences of VG and VT under the 1T-1MTJ structure, which are controlled by the PRE and the POST, respectively. When Δt > 0, VT+ overlaps with VG. In this condition, the internal current flows from the fixed layer to the free layer in the MTJ device, driving the MTJ to switch to the P state. So, the conductance of the MTJ device is increased.
On the contrary, when Δt < 0, VT− is overlapped with VG. The current in the MTJ device flows from the free layer to the fixed layer, driving the MTJ to the AP state. So, the conductance of the MTJ device is decreased.
Since the MTJ device in the AP state has a low conductance, while the MTJ in the P state has a high conductance, the MTJ device in the AP state is used to mimic the synapse with a weight of ‘zero’, and the P state is used to mimic the synapse with a weight of ‘one’. Based on the designed structure in Figure 2, the MTJ device can be used to simulate the STDP mechanism.
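The resulting binary update rule can be sketched as follows. This is a simplified, deterministic illustration: the `window` parameter (the maximum spike separation for which the VG and VT pulses still overlap) is a hypothetical quantity, and the probabilistic nature of MTJ switching is ignored here.

```python
WEIGHT = {"AP": 0, "P": 1}   # AP: low conductance, P: high conductance

def update_mtj_synapse(state, t_pre, t_post, window=1.0):
    """Return the new MTJ state after one PRE/POST spike pair."""
    delta_t = t_post - t_pre
    if 0 < delta_t <= window:     # VT+ overlaps VG: current drives the MTJ to P
        return "P"
    if -window <= delta_t < 0:    # VT- overlaps VG: current drives the MTJ to AP
        return "AP"
    return state                  # no overlap: the non-volatile state is kept

potentiated = update_mtj_synapse("AP", t_pre=0.0, t_post=0.5)
depressed = update_mtj_synapse("P", t_pre=0.5, t_post=0.0)
unchanged = update_mtj_synapse("P", t_pre=0.0, t_post=5.0)
```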
Next, the behavior of the MTJ synapses was investigated using handwritten digit images from the MNIST database [24,25]. Figure 3 shows one of the images used in the paper. As shown in Figure 3a, the image of the handwritten digit “4” is a 28 × 28 matrix, with 784 pixels in total. In the SNN field, the image is transformed into a current pulse sequence, which is named a presynaptic pulse sequence. During the conversion, the basic principle is that a pixel in the pure black area is denoted as ‘0’, while a pixel in the pure white area is denoted as ‘1’. As shown in Figure 3b, these pixels are converted to a series of current pulses, where the pixels close to ‘0’ are converted into negative current pulses and the pixels close to ‘1’ are converted into positive current pulses.
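The pixel-to-pulse conversion can be sketched in a few lines. This is a minimal illustration assuming pixel values normalized to [0, 1]; the 0.5 decision threshold and the unit pulse amplitude `i_amp` are assumptions, since the text only states that pixels close to ‘0’ and close to ‘1’ map to negative and positive pulses.

```python
import numpy as np

def pixels_to_pulses(image, i_amp=1.0, threshold=0.5):
    """Map a 2-D image with values in [0, 1] to a flat sequence of signed pulses."""
    flat = np.asarray(image, dtype=float).reshape(-1)
    # bright pixels -> positive pulse, dark pixels -> negative pulse
    return np.where(flat >= threshold, i_amp, -i_amp)

image = np.zeros((28, 28))
image[10:18, 12:16] = 1.0          # a small white patch on a black background
pulses = pixels_to_pulses(image)
```

For a 28 × 28 image this yields 784 pulses, one per synapse.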
For the generated presynaptic pulse sequence in Figure 3b, the reconstructed synapse is shown in Figure 4b. It can be seen that the reconstructed image is hard to read. The information of the 784 pixels in the image is transformed into the corresponding 784 current pulses (the presynaptic pulse sequence). As shown in Figure 4b, the 784 initial magnetic vector angles, θ, are generated randomly from a uniform distribution. This means that the initial state of each MTJ synapse is random, making it difficult to distinguish the image’s content. To improve the quality, repetitive training is needed. As shown in Figure 4c, 10 training steps were conducted on the 784 synapses, i.e., the presynaptic sequence was written repeatedly 10 times. It can be seen that the results gradually stabilized, with the digit becoming clear and readable. In the corresponding handwritten digit image, the pixels in the black area are close to zero and correspond to the MTJ device in the AP state with a weight of 0. On the contrary, the pixels in the white area correspond to the MTJ device in the P state with a weight of 1.
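The effect of repeated writing can be illustrated with a toy model of 784 binary synapses. This is a sketch under strong assumptions: each synapse starts in a random binary state (instead of a random angle), and every write pulse switches a mismatched synapse to its target state with a fixed hypothetical probability `p_switch`.

```python
import numpy as np

def train_synapses(pulses, n_steps=10, p_switch=0.5, seed=0):
    """Repeatedly write the presynaptic pulses; return weights and error history."""
    rng = np.random.default_rng(seed)
    target = (pulses > 0).astype(int)            # positive pulse -> P state -> weight 1
    weights = rng.integers(0, 2, size=pulses.size)
    history = [float(np.mean(weights != target))]
    for _ in range(n_steps):
        switched = rng.random(pulses.size) < p_switch   # stochastic switching events
        weights = np.where(switched, target, weights)
        history.append(float(np.mean(weights != target)))
    return weights, history

pulses = np.where(np.arange(784) % 3 == 0, 1.0, -1.0)   # toy pulse pattern
weights, history = train_synapses(pulses)
```

Because a synapse that already matches its pulse polarity is left unchanged, the fraction of mismatched synapses can only shrink from one write to the next, which is why the image sharpens with repetition.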
4. STT-MTJ-Based Neuron
The similar properties of the STT-MTJ device and the biological neuron are studied in this section. Figure 5 shows a schematic diagram of the membrane potential of a biological neuron, in which the input spikes and the leakage are integrated together. The neuron is activated when the membrane potential exceeds the threshold voltage [7].
Similar characteristics of the MTJ device were shown by micromagnetic simulation based on the MTJ model in Section 2. The stochastic magnetic simulation was carried out on the P-MTJ device with $\mathbf{m}_p$ = (0, 0, −1). The state of the MTJ in the magnetic dynamic model can be characterized by mz, which is the z-component of $\mathbf{m}$, with mz = cosθ. The term mz can be used to represent the membrane potential of the biological neuron in the STT-MTJ-based SNN structure.
The first two terms on the right-hand side of the LLGS equation in Equation (1) act as the leakage of the membrane potential in the magnetization dynamics, while the last term corresponds to the input pulse applied to the MTJ neuron.
Figure 6 shows the integration process and the activation process of mz in the STT-MTJ device. As shown in Figure 6, a pulse train with a 1 ns period and a 0.55 ns pulse width is adopted as the input signal of the neuron. The precession of mz is simulated, and four periods are shown in Figure 6a. It can be seen that mz exhibits an integration function, showing the accumulation effect of the multiple inputs and the leakage. The pulse has an obvious influence on the value of mz, which increases during a pulse and decreases without a pulse.
The activation of the neuron occurs when the membrane potential exceeds the threshold. In the MTJ neuron, the activation corresponds to the switching behavior of the MTJ device and is also represented by mz. As shown in Figure 6b, mz is switched from −1 to +1 successfully with six pulse cycles as the input, which means that the MTJ neuron can be activated successfully. Due to the non-volatile property of the STT-MTJ device, mz remains at +1 even without the input pulse. Therefore, a reset circuit must be designed to reset the activated MTJ neuron. The reset period is similar to the refractory period observed in the biological neuron: the neuron cannot be activated again for a short time after being activated.
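The integrate, leak, and fire behavior of mz can be mimicked with a one-dimensional toy model. This is a sketch and not the SLLGS simulation of the paper: it assumes an axially symmetric macrospin, for which mz obeys d(mz)/dt = (1 − mz²)(leak·mz + s(t)), where the damping-like `leak` term pulls mz back toward −1 while mz < 0 and the pulsed drive s(t) pushes it toward +1. The `drive` and `leak` rates are hypothetical values in units of 1/ns; only the 1 ns period and 0.55 ns pulse width follow the text.

```python
import math

def simulate_mz(n_cycles, drive=3.0, leak=1.0, period=1.0, width=0.55, dt=0.001):
    """Euler integration of d(mz)/dt = (1 - mz**2) * (leak * mz + s(t)).

    s(t) equals `drive` during the first `width` of every period and 0 otherwise.
    """
    mz = math.cos(math.radians(175.0))   # start slightly tilted away from mz = -1
    trace = [mz]
    steps = int(round(n_cycles * period / dt))
    for k in range(steps):
        t_in_period = (k * dt) % period
        s = drive if t_in_period < width else 0.0
        mz += dt * (1.0 - mz * mz) * (leak * mz + s)
        trace.append(mz)
    return trace

one_cycle = simulate_mz(1)            # mz rises during the pulse, then leaks back
fired = simulate_mz(10)               # repeated pulses accumulate until mz switches
quiet = simulate_mz(10, drive=0.0)    # without input, mz relaxes toward -1
```

Once mz crosses zero, the leak term changes sign and pulls mz toward +1, so the switched state persists without further input. This mirrors the non-volatility that makes an explicit reset necessary.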
The operation of the MTJ neuron can be divided into three stages, namely, the write stage, the read stage, and the reset stage. As shown in Figure 7, in the write stage, VWRITE is high. The input synaptic current, I, is transmitted through the heavy metal layer and drives the MTJ neuron. The state of the MTJ device is switched from the P state to the AP state, so the neuron is activated. In the read stage, VREAD is high, and the state of the MTJ device is determined from the node voltage, VSPIKE, between the reference MTJ and the MTJ neuron. A low VSPIKE corresponds to the MTJ in the P state, while a high VSPIKE corresponds to the MTJ in the AP state. In the reset stage, if VSPIKE is high, a reset operation is initiated. A reverse current flows through the heavy metal layer, switching the MTJ neuron from the AP state back to the P state and terminating the activation state.
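The write, read, and reset sequence can be summarized as a small state machine. This is a behavioral sketch only: the deterministic current threshold `i_threshold` is a hypothetical simplification of the probabilistic switching, and the circuit-level details of Figure 7 are not modeled.

```python
def neuron_cycle(state, input_current, i_threshold=1.0):
    """Run one write/read/reset cycle; return (state_after_reset, v_spike)."""
    # Write stage: a sufficiently large input current switches P -> AP (activation).
    if state == "P" and input_current >= i_threshold:
        state = "AP"
    # Read stage: VSPIKE is high for the AP state and low for the P state.
    v_spike = 1 if state == "AP" else 0
    # Reset stage: a high VSPIKE triggers the reverse current, switching AP -> P.
    if v_spike == 1:
        state = "P"
    return state, v_spike

state_a, spike_a = neuron_cycle("P", input_current=2.0)   # strong input: spike, then reset
state_b, spike_b = neuron_cycle("P", input_current=0.1)   # weak input: no spike
```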
Besides the integration of the input and the leakage, probabilistic activation is another characteristic of the biological neuron. The probabilistic nature of neuron dynamics [26] mainly comes from the randomness of ion channel switching and of neurotransmitter release. The switching behavior of the MTJ device is also probabilistic in nature. The switching probability of the MTJ (from AP to P or from P to AP) increases with the magnitude of the input current. Therefore, the switching probability of the MTJ device can be mapped to the activation probability of the biological neuron [27]. The activation probability of the biological neuron typically varies non-linearly with the input synaptic current [2,28], which is similar to the non-linear variation of the switching probability of the MTJ device with the applied current.
The switching probability of the MTJ neuron can be adjusted by many factors. As shown in Figure 8, the switching probability changes with the MTJ free layer thickness, tFL, and the duration of the input current, tpw. The applied input current, I, is a square wave signal with a width of tpw. It can be seen that the switching probability of the MTJ device decreases with increasing tFL (as shown in Figure 8a) and decreases with decreasing tpw (as shown in Figure 8b). By controlling tFL, tpw, or other factors, the activation function of the MTJ neuron can be adjusted through the switching probability. So, the MTJ neuron can be designed to be sensitive to specific inputs and to remain inactive for other inputs.
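The qualitative trends above can be captured with a purely phenomenological curve. The sigmoid below is an illustrative assumption, not the device model of the paper: the midpoint current `i_c = k * t_fl / t_pw`, the constant `k`, and the relative `width` are hypothetical and only chosen so that the probability rises with the current, falls with a thicker free layer, and falls with a shorter pulse.

```python
import math

def switching_probability(i_in, t_fl, t_pw, k=1.0, width=0.2):
    """Hypothetical sigmoid activation of an MTJ neuron (arbitrary units)."""
    i_c = k * t_fl / t_pw                  # assumed midpoint (50 %) current
    return 1.0 / (1.0 + math.exp(-(i_in - i_c) / (width * i_c)))

p_ref = switching_probability(1.0, t_fl=1.0, t_pw=1.0)     # at the midpoint current
p_thick = switching_probability(1.0, t_fl=1.5, t_pw=1.0)   # thicker free layer
p_short = switching_probability(1.0, t_fl=1.0, t_pw=0.5)   # shorter pulse
```

Shifting the midpoint in this way is one simple picture of how a neuron can be tuned to respond to one class of inputs while staying inactive for another.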
5. MTJ Mimics the SNN with Application in Image Recognition
Figure 9 shows the application scenario of the SNN for handwritten digit image recognition. Only the connections for one neuron are shown in the illustration in Figure 9.
The information of the input handwritten digit images is transferred to the neurons through the synapses. The neurons receive the postsynaptic current pulses. The synapses and the excitatory neurons are mimicked by the MTJ devices, as introduced in Section 3 and Section 4. The role of the inhibitory neuron is equivalent to that of the peripheral reset circuit, which prevents the neuron from being activated repeatedly within a short period of time, simulating the refractory period of biological neurons.
Figure 10 shows ten images of handwritten digits, including five images of “1” and five images of “0”. The images were used as the input samples for the MTJ-based image recognition.
Figure 11 and Figure 12 show the recognition processes for the handwritten digits “1” and “0”, respectively, based on the MTJ synapses and the MTJ neuron. Each set of figures includes the original image, the random initial synapses with the training process for the synapses repeated ten times, the postsynaptic current pulse, and the mz of the MTJ neuron. The same MTJ neuron is used in Figure 11 and Figure 12. The switching probability function of the MTJ neuron is adjusted so that it is sensitive to the input of the handwritten digit “0”, while it is not activated when images of the handwritten digit “1” are input. In this way, the recognition function is achieved.
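One plausible reading of this two-class recognition can be sketched with toy images. Everything below is an assumption made for illustration: the synthetic ring and bar stand in for MNIST digits, the postsynaptic current is taken as the number of P-state synapses times a unit current `i_unit`, and the neuron is reduced to a deterministic threshold rather than a probabilistic MTJ.

```python
import numpy as np

def toy_digit(kind):
    """Return a synthetic 28 x 28 binary image: a ring for "0", a bar for "1"."""
    yy, xx = np.mgrid[0:28, 0:28]
    if kind == "0":
        r = np.hypot(yy - 13.5, xx - 13.5)
        return ((r >= 6) & (r <= 10)).astype(float)
    img = np.zeros((28, 28))
    img[4:24, 13:15] = 1.0
    return img

def postsynaptic_current(image, i_unit=1.0):
    """Sum the contributions of trained synapses (white pixel -> P state -> weight 1)."""
    weights = (image.reshape(-1) >= 0.5).astype(int)
    return i_unit * float(weights.sum())

def neuron_fires(current, threshold=100.0):
    """Deterministic stand-in for a neuron tuned to respond to large input currents."""
    return current > threshold

fires_on_zero = neuron_fires(postsynaptic_current(toy_digit("0")))
fires_on_one = neuron_fires(postsynaptic_current(toy_digit("1")))
```

In this toy setting the ring activates far more synapses than the bar, so a single threshold separates the two classes.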