# This document is an author-formatted work. The definitive version for citation appears as:

R. Zand, A. Roohi and R. F. DeMara, "Energy-Efficient and Process-Variation-Resilient Write Circuit Schemes for Spin Hall Effect MRAM Device," in *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 25, no. 9, pp. 2394-2401, Sept. 2017. doi: 10.1109/TVLSI.2017.2699579

http://ieeexplore.ieee.org/abstract/document/7927416/

# Energy-Efficient and Process Variation-Resilient Write Circuit Schemes for Spin Hall Effect MRAM

Ramtin Zand, Arman Roohi, and Ronald F. DeMara, Senior Member, IEEE

*Abstract*—In this paper, various energy-efficient write schemes are proposed for switching operation of spin Hall Effect (SHE)-based magnetic tunnel junctions (MTJs). A transmission gate (TG)-based write scheme is proposed which provides a symmetric and energy-efficient switching behavior. We have modeled SHE-MTJ using precise physics equations, and then leveraged the model in SPICE circuit simulator to verify the functionality of our designs. Simulation results show the TG-based write scheme advantages in terms of device count and switching energy. In particular, it can operate at 12% higher clock frequency while realizing at least 13% reduction in energy consumption compared to the most energy-efficient write circuits. We have analyzed the performance of the implemented write circuits in presence of process variation in the transistors' threshold voltage and SHE-MTJ dimensions. Results show that the proposed TG-based design is the second most process variation-resilient write circuit scheme for SHE-MTJs among the implemented designs. Finally, we have proposed the 1TG-1T-1R SHE-based magnetic random access memory (MRAM) bit cell based on the TG-based write circuit. Comparisons with several of the most energy-efficient and variation-resilient SHE-MRAM cells indicate that 1TG-1T-1R delivers reduced energy consumption with 43.9% and 10.7% energy-delay product (EDP) improvement, while incurring low area overhead.

*Index Terms*— Magnetic random access memory (MRAM), magnetic tunnel junction (MTJ), spin-based memory cell, spin Hall Effect (SHE) MRAM, write energy, process variation.

# I. INTRODUCTION

Recently, magnetic tunnel junctions (MTJs) have attracted considerable attention as an alternative to CMOS in both logic and memory [1-3]. An MTJ consists of two ferromagnetic (FM) layers, called the *fixed layer* and the *free layer*, separated by a thin oxide barrier [4]. The fixed layer is magnetically pinned and serves as a reference layer, while the magnetic orientation of the free layer can be modified using various switching approaches. The FM layers have two magnetization configurations, *parallel* (P) and *antiparallel* (AP), in which the MTJ resistance is low or high, respectively.

In [5], the spin transfer torque (STT) switching technique is proposed for changing the MTJ state. Despite its advantages, the main challenge of the STT switching approach is its high dynamic energy consumption. Reducing the energy consumption of the STT-MTJ write operation has been widely researched in recent years [6, 7]. More recently, the spin Hall effect-based MTJ (SHE-MTJ) has been introduced as an alternative to the STT-MTJ; it provides separate paths for read and write operations while expending significantly less switching energy [8-11]. In this paper, we concentrate on different write mechanisms for SHE-MTJ devices. In particular, we first implement and analyze five different write circuits. Then, we investigate their performance in the presence of process variation.

# II. FUNDAMENTALS AND MODELING OF SHE-MTJ

In [12], the physical equations that describe the behavior of the three-terminal SHE-MTJ device are provided. Fig. 1 shows the SHE-MTJ, in which the MTJ free layer is directly connected to a heavy metal (HM). The MTJ logic state is determined by the direction of the charge current applied to the write terminals. The ratio of the injected spin current to the applied charge current, the spin Hall injection efficiency (SHIE), is defined as:

$$\mathrm{SHIE} = \frac{I_{sz}}{I_{cx}} = \frac{\pi \cdot MTJ_{width} \cdot MTJ_{length}}{4 \cdot HM_{thick} \cdot HM_{width}} \, \theta_{SHE} \left[ 1 - \operatorname{sech}\left(\frac{HM_{thick}}{\lambda_{sf}}\right) \right] \quad (1)$$

where $\lambda_{sf}$ is the spin flip length in the HM, and $\theta_{SHE}$ is the spin Hall angle [12]. The critical spin current required for switching the free layer magnetization orientation is expressed by (2), where $V_{MTJ}$ is the MTJ free layer volume [13]. Thus, the SHE-MTJ critical charge current can be calculated using (1) and (2).

$$I_{S,critical} = \frac{2 q \alpha M_S V_{MTJ} (H_k + 2\pi M_S)}{\hbar} \quad (2)$$
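As a numerical illustration of (1), the following minimal sketch evaluates the SHIE with the device dimensions and material parameters listed in Table I. The assignment of 60 nm to the MTJ length and 30 nm to its width follows the order of the Table I entry and does not affect the product. The critical spin current is not evaluated from (2) directly, because the free layer thickness needed for $V_{MTJ}$ is not listed in Table I; instead, the sketch assumes the critical charge current of $108\mu A$ quoted in Section III and multiplies it by the SHIE to obtain the implied spin current.

```python
import math

# Device dimensions and material parameters from Table I (SI units).
MTJ_LENGTH = 60e-9   # m
MTJ_WIDTH = 30e-9    # m
HM_WIDTH = 60e-9     # m
HM_THICK = 3e-9      # m
THETA_SHE = 0.3      # spin Hall angle
LAMBDA_SF = 1.5e-9   # spin flip length, m


def spin_hall_injection_efficiency():
    """Evaluate Eq. (1): injected spin current over applied charge current."""
    geometry = (math.pi * MTJ_WIDTH * MTJ_LENGTH) / (4 * HM_THICK * HM_WIDTH)
    sech = 1.0 / math.cosh(HM_THICK / LAMBDA_SF)
    return geometry * THETA_SHE * (1.0 - sech)


shie = spin_hall_injection_efficiency()

# Assumption: critical charge current taken from Section III, not from Eq. (2).
I_C_CHARGE = 108e-6                # A
i_s_critical = shie * I_C_CHARGE   # implied critical spin current, A

print(f"SHIE = {shie:.2f}")
print(f"Implied critical spin current = {i_s_critical * 1e6:.1f} uA")
```

With these values the SHIE exceeds one, i.e., each unit of charge current in the HM injects more than one unit of spin current into the free layer.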



Fig. 1. (a) SHE-MTJ structure. A charge current along +x induces a spin current injected in the +z direction, producing the spin torque that aligns the magnetization of the free layer along the +y/-y direction. (b) SHE-MTJ top view.

Equation (3) relates the SHE-MTJ switching time to the voltage $v$ applied to the HM terminals, where the critical voltage $v_c$ is given by (4).

$$\tau_{SHE} = \frac{\tau_0 \ln(\pi / 2\theta_0)}{(v / v_c) - 1}, \quad \tau_0 = \frac{M_s \cdot HM_{Volume} \cdot q}{I_c \cdot P \cdot \mu_B} \quad (3)$$

$$v_c = 8 \rho I_c \left\{ \theta_{SHE} \left[ 1 - \operatorname{sech}\left(\frac{HM_{thick}}{\lambda_{sf}}\right) \right] \pi \, HM_{length} \right\}^{-1} \quad (4)$$
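The dependence of the switching time on the applied voltage in (3) can be illustrated with a short sketch. Since the initial angle $\theta_0$ is not listed among the parameters, the prefactor $\tau_0 \ln(\pi/2\theta_0)$ is lumped into a single constant here (a hypothetical `prefactor` argument), so only ratios of delays are meaningful. The two overdrive levels are arbitrary example values, not operating points from this paper.

```python
def switching_delay(overdrive_ratio, prefactor=1.0):
    """Eq. (3) with tau0 * ln(pi / (2 * theta0)) lumped into `prefactor`.

    overdrive_ratio is v / v_c and must exceed 1 for switching to occur.
    """
    if overdrive_ratio <= 1.0:
        raise ValueError("v must exceed v_c for the free layer to switch")
    return prefactor / (overdrive_ratio - 1.0)


# Relative delay at two hypothetical drive levels; the prefactor cancels.
slow = switching_delay(1.5)
fast = switching_delay(2.0)
delay_ratio = slow / fast
print(delay_ratio)
```

The delay is inversely proportional to the overdrive $(v/v_c) - 1$, which is why a write circuit that produces unequal currents in the two directions also exhibits unequal switching delays.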

where $I_c$ is the critical charge current for spin-torque-induced switching. Modeling the SHE-MTJ also requires the HM resistance, which is expressed by (5), where $\rho_{HM}$ is the electrical resistivity of the HM, i.e., $\beta$-tungsten [14].

$$R_{HM} = \frac{\rho_{HM} \cdot HM_{length}}{HM_{width} \cdot HM_{thick}} \quad (5)$$
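For reference, (5) can be evaluated with the Table I values as in the sketch below. The voltage drop at the end is simply Ohm's law applied to the HM at the critical charge current of $108\mu A$ quoted in Section III; it is an illustrative figure and not a result reported in this paper.

```python
# HM parameters from Table I, converted to SI units.
RHO_HM = 200e-6 * 1e-2   # 200 uOhm*cm expressed in Ohm*m
HM_LENGTH = 100e-9       # m
HM_WIDTH = 60e-9         # m
HM_THICK = 3e-9          # m


def hm_resistance(rho, length, width, thick):
    """Eq. (5): resistance of the heavy-metal strip along its length."""
    return rho * length / (width * thick)


r_hm = hm_resistance(RHO_HM, HM_LENGTH, HM_WIDTH, HM_THICK)

# Illustrative only: voltage across the HM at the critical charge current.
I_C_CHARGE = 108e-6      # A, value quoted in Section III
v_drop = I_C_CHARGE * r_hm

print(f"R_HM = {r_hm:.1f} Ohm")
print(f"Voltage across HM at I_c = {v_drop * 1e3:.0f} mV")
```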

In this paper, we have utilized the approach proposed in [15] to model the behavior of SHE-MTJ device, in which a Verilog-AMS model is developed using the aforementioned equations. Then, the model is leveraged in SPICE circuit simulator to validate the functionality of the designed circuits using experimental parameters listed in Table I.

Fig. 2 shows the conventional 2T-1R SHE-MRAM cell [16, 17], in which one read transistor and one write transistor connect the cell's bit line to the MTJ and the HM, respectively. The three-terminal design of the SHE device provides separate current paths that isolate its read and write operations. This reduces the vulnerability of the MTJ tunneling oxide barrier to breakdown, since the higher-magnitude write current does not flow through the oxide.

# III. DESIGN AND ANALYSIS OF SHE-MTJ WRITE CIRCUITS

In this section, various write circuits are investigated for switching the states of SHE-MTJ devices. A comprehensive comparison of the different write circuits developed and examined in this paper is provided in Table II. The implemented write circuits are either proposed herein or inspired by previously proposed write circuits that are modified in this work to be capable of (1) operating with an input clock signal, and (2) producing a bidirectional current to enable switching the SHE-MTJ state from *P* to *AP* and vice versa. All write circuits are simulated in SPICE using a 90 nm library at a 1.2 V nominal voltage. Herein, to provide a fair comparison, the minimum technology feature size is used for the gate channel width of the transistors. The critical charge current, $I_c$, of the SHE-MTJ is $108\mu A$, which is obtained using (1) and (2).

## A. Current Mirror Circuit Approaches

Herein, we have designed two different current mirror circuits based on the designs introduced in [18] and [19], which are shown in Fig. 3(a) and 3(b), respectively. Simulation results listed in Table II show that the current mirror designs have asymmetric switching behavior, i.e., the current amplitude produced for switching from the P to the AP state differs from that generated for AP to P switching. Therefore, in the current mirror-based designs, the clocking scheme must be chosen for the worst-case condition to ensure correct switching. Moreover, the implemented current mirror-based write circuits, CM-1 and CM-2, cannot produce the critical current required to ensure switching, i.e., $108\mu A$. Two solutions for these drawbacks are enlarging the channel width of the transistors and adding a reference current source to the circuit structure. Both approaches can significantly increase the switching power consumption. Thus, these solutions are more appropriate for larger designs such as cache line drivers [20].

Fig. 2. (a) Conventional 2T-1R bitcell. (b) 2T-1R bitcell layout view.

TABLE I: PARAMETERS OF SHE-MTJ.

| Parameter      | Description                                           | Value                                 |
|----------------|-------------------------------------------------------|---------------------------------------|
| $HM_{Volume}$  | $HM_{Length} \times HM_{Width} \times HM_{Thickness}$ | 100 nm × 60 nm × 3 nm                 |
| $MTJ_{Area}$   | $MTJ_{Length} \times MTJ_{Width} \times \pi/4$        | 60 nm × 30 nm × $\pi/4$               |
| $\alpha$       | Gilbert Damping Factor                                | 0.007                                 |
| $P$            | Spin Polarization                                     | 0.52                                  |
| $M_s$          | Saturation Magnetization                              | 7.8e5 A·m<sup>-1</sup>                |
| $H_k$          | Anisotropy Field                                      | 80 Oe                                 |
| $\theta_{SHE}$ | Spin Hall Angle                                       | 0.3                                   |
| $\mu_B$        | Bohr Magneton                                         | 9.27e-24 J·T<sup>-1</sup>             |
| $\rho_{HM}$    | HM Resistivity                                        | 200 μΩ·cm                             |
| $q$            | Electric Charge                                       | 1.602e-19 C                           |
| $\lambda_{sf}$ | Spin Flip Length                                      | 1.5 nm                                |
| $\hbar$        | Reduced Planck's Constant                             | 6.626e-34/2π J·s                      |

Fig. 3. Developed current mirror write circuits (a) CM-1 [18], and (b) CM-2 [19].

Fig. 4. Energy-aware write circuits inspired by the designs proposed by (a) Ben-Romdhane et al. in [18], and (b) Gupta et al. in [21].

Fig. 5. (a) Transmission gate-based write circuit. (b) TG-based write circuit layout view.

## B. Previous Energy-Aware Write Circuits

Fig. 4(a) shows a SHE-MTJ write circuit implementation inspired by the switching circuit proposed in [18] for Racetrack memory, which is equipped with a clocking mechanism herein. The advantages of this circuit are its fully symmetric behavior and the small number of transistors utilized in its structure. However, as listed in Table II, the produced current amplitude is $54.08\mu A$, which is smaller than the critical current. Thus, this write scheme cannot ensure SHE-MTJ switching when its transistors have the minimum feature size.

As listed in Table II, the conventional 2T-1R SHE-MRAM layout with a single write transistor produces $I_{P-AP}=90.41\mu A$ and $I_{AP-P}=79.43\mu A$ using the minimum transistor geometries possible in the 90 nm MOSFET technology. Since these write currents are smaller than the SHE-MTJ critical current, we have added another NMOS transistor to increase the write current, as inspired by the 2T-1R layout proposed by Gupta et al. in [21] for STT-MRAM. Fig. 4(b) depicts a write circuit that utilizes two NMOS transistors connected in parallel, both of which are ON during the write operation, leading to a high switching current. The drawback of this write circuit is its highly asymmetric behavior. As listed in Table II, the current amplitude produced for switching from *P* to *AP* ($I_{P-AP}=177.8\mu A$) differs from the current amplitude generated while switching from *AP* to *P* ($I_{AP-P}=139.9\mu A$). Consequently, the clocking scheme must always be designed for the worst-case scenario to ensure complete switching, which increases the average energy consumption.
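The two criteria applied in this subsection, namely whether the weaker of the two write currents exceeds the critical current and how far apart the two currents are, can be checked with a short sketch over the Table II currents quoted above. The relative asymmetry measure used here (current difference divided by the larger current) is an illustrative choice and is not a metric defined in this paper.

```python
I_CRITICAL_UA = 108.0  # SHE-MTJ critical charge current, in microamperes

# (I_P_to_AP, I_AP_to_P) in microamperes, copied from Table II.
write_currents = {
    "2T-1R [16, 17]": (90.41, 79.43),
    "Ben-Romdhane et al. [18]": (54.08, 54.08),
    "Gupta et al. [21]": (177.8, 139.9),
}


def summarize(i_p_ap, i_ap_p):
    """Return (switching ensured in both directions, relative asymmetry)."""
    worst_case = min(i_p_ap, i_ap_p)
    asymmetry = abs(i_p_ap - i_ap_p) / max(i_p_ap, i_ap_p)
    return worst_case > I_CRITICAL_UA, asymmetry


summary = {name: summarize(*pair) for name, pair in write_currents.items()}
for name, (switches, asymmetry) in summary.items():
    print(f"{name}: switches={switches}, asymmetry={asymmetry:.1%}")
```

Of these three circuits at minimum transistor size, only the two-transistor design inspired by [21] clears the critical current in both directions, and it is also the one with the largest gap between its two write currents.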

## C. TG-based Write Circuit

In this paper, we propose utilizing transmission gates (TGs) in the SHE-MTJ write circuit, as shown in Fig. 5(a). The asymmetry in the preceding write circuits is caused primarily by the different drive strengths of the PMOS and NMOS transistors. TGs are characterized by near-optimal full-swing switching behavior, since both the NMOS and PMOS transistors are ON during the write operation and contribute to the drive strength. We have leveraged this feature of TGs within the write circuit to obtain a symmetric switching operation without the design complexity of matching PMOS and NMOS drive strengths. The results provided in Table II show that the TG-based write circuit provides symmetric switching with a high-amplitude current that ensures high-speed switching. Fig. 5(b) shows the TG-based write circuit layout view. Although TG-based designs require both the CLK signal and its complement CLK', it is reasonable to assume that both are available within typical integrated circuits.

TABLE II: SWITCHING CHARACTERISTICS IN ABSENCE OF CLOCKING LIMITATIONS (SHE-MTJ CRITICAL CURRENT = $108\mu A$).

| Transition | Feature      | 2T-1R [16, 17] | CM-1 [18] | CM-2 [19] | Ben-Romdhane et al. [18] | Gupta et al. [21] | TG-based Write Circuit |
|------------|--------------|----------------|-----------|-----------|--------------------------|-------------------|------------------------|
| P to AP    | Current (µA) | 90.41          | 49.57     | 12.92     | 54.08                    | 177.8             | 157.2                  |
| P to AP    | Delay (ns)   | NA*            | NA*       | NA*       | NA*                      | 1.65              | 1.9                    |
| P to AP    | Power (µW)   | 108.5          | 120.3     | 15.49     | 64.9                     | 213.38            | 188.68                 |
| AP to P    | Current (µA) | 79.43          | 55.58     | 40.47     | 54.08                    | 139.9             | 157.1                  |
| AP to P    | Delay (ns)   | NA*            | NA*       | NA*       | NA*                      | 2.18              | 1.9                    |
| AP to P    | Power (µW)   | 95.31          | 135.2     | 48.56     | 64.9                     | 167.89            | 188.56                 |
|            | Symmetric    | NO             | NO        | NO        | YES                      | NO                | YES                    |

\* Produced current amplitude is smaller than critical current, thus SHE-MTJ state transduction cannot be ensured.

TABLE III: WRITE CHARACTERISTICS WITH CLOCKING REQUIREMENTS.

| Features                    | Transition | Gupta et al. [21] | 1TG-1R  |
|-----------------------------|------------|-------------------|---------|
| Maximum CLK Frequency       |            | 220 MHz           | 250 MHz |
| Switching Energy (fJ)       | P to AP    | 484.95            | 377.36  |
| Switching Energy (fJ)       | AP to P    | 381.57            | 377.12  |
| Average Energy Improvement  |            | -                 | 13%     |

Table III provides a comparison between the TG-based write circuit and the write scheme inspired by the Gupta et al. [21] 2T-1R layout, both of which can produce a bidirectional current with an amplitude greater than the SHE-MTJ critical current. Assuming a typical 50% duty cycle, the maximum clock frequency at which each circuit can ensure complete switching of the MTJ state is listed in Table III. The 1TG-1R write circuit can operate at a 12% higher clock frequency while realizing at least 13% average energy reduction compared to the write circuit inspired by Gupta et al. [21].
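The relation between the clocking requirement and the switching energy can be sketched as follows. The sketch assumes that the write current flows for the half period during which the clock is active (50% duty cycle) and that the switching energy is the Table II write power multiplied by that pulse width. Under this assumption the Table III energies are reproduced from the Table II powers, which suggests, but does not confirm, that this is how they were obtained. The dictionary layout and function names are illustrative.

```python
def pulse_width_ns(freq_mhz, duty=0.5):
    """Active part of the clock period, in nanoseconds."""
    return duty * 1e3 / freq_mhz


def switching_energy_fj(power_uw, freq_mhz):
    """Energy of one write pulse; microwatts times nanoseconds gives femtojoules."""
    return power_uw * pulse_width_ns(freq_mhz)


# Clock frequency from Table III; (power in uW, delay in ns) from Table II.
circuits = {
    "Gupta et al. [21]": {"freq_mhz": 220.0,
                          "P_to_AP": (213.38, 1.65), "AP_to_P": (167.89, 2.18)},
    "1TG-1R": {"freq_mhz": 250.0,
               "P_to_AP": (188.68, 1.9), "AP_to_P": (188.56, 1.9)},
}

energies = {}
for name, c in circuits.items():
    width = pulse_width_ns(c["freq_mhz"])
    # The pulse must be at least as long as the slower of the two transitions.
    assert width >= max(c["P_to_AP"][1], c["AP_to_P"][1])
    energies[name] = [switching_energy_fj(c[t][0], c["freq_mhz"])
                      for t in ("P_to_AP", "AP_to_P")]
    print(name, [round(e, 2) for e in energies[name]])

avg = {name: sum(e) / len(e) for name, e in energies.items()}
improvement = 1.0 - avg["1TG-1R"] / avg["Gupta et al. [21]"]
print(f"Average energy improvement: {improvement:.1%}")
```

Because the pulse width is set by the slower transition, the asymmetric circuit spends longer than necessary on its faster transition, which is the source of its higher average energy.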

# IV. PROCESS VARIATION ANALYSIS

In this section, the effect of process variation (PV) on the proposed SHE-MTJ write circuits is investigated. The results shown in Section III are obtained using transistors with the minimum feature size enabled by the 90 nm technology, while higher switching currents can be generated by enlarging the write circuit transistors at the expense of higher power consumption. Since the focus of this section is on PV effects, we have enlarged the transistor widths of the developed write circuits sufficiently to generate the current amplitude required for complete switching. Figures 6(a) and 6(b) show the produced write current versus transistor size for AP to P and P to AP switching, respectively. Moreover, Table IV lists the write delay and power consumption as a function of transistor size, both of which are important metrics for the investigated write schemes.

Herein, we have modeled the two types of process variation that have the most impact on the produced write current: (1) $\sigma HM$, variations in the HM dimensions, and (2) $\sigma V_{th}$, fluctuations in the threshold voltages (Vth) of the transistors. Figures 7(a), 7(b), and 7(c) show the produced write current fluctuation versus $\sigma HM$ for a given $\sigma V_{th}$. As can be seen in Fig. 7(a), although the Gupta et al. [21] write circuit is the second most energy-efficient design introduced herein, it is significantly susceptible to variations in the HM dimensions. As shown in Fig. 7(c), for $\sigma V_{th}=5\%$ the highest write current variation is associated with the current mirror write circuit CM-2, while its produced write current varies insignificantly for different $\sigma HM$ values. Thus, it can be inferred that the high write current variation of the CM-2 write circuit is mainly induced by the *Vth* variations of the MOS transistors, making CM-2 the write circuit design most susceptible to $\sigma V_{th}$.

To examine the impact of the Vth variations on the implemented write circuits, the write current amplitude fluctuations for various $\sigma V_{th}$ values are measured for a given $\sigma HM$, as shown in Fig. 7(d-g). The significant increase in the write current variations of the current mirror circuits, CM-1 and CM-2, shows their higher susceptibility to the *Vth* variations, which is mainly caused by the larger number of transistors utilized in their design. Moreover, the Ben-Romdhane et al. [18] and proposed TG-based write circuits are shown to be the designs most resilient to the *Vth* fluctuations. Fig. 7(g) shows the write current variation for the worst-case scenario investigated herein, i.e., $\sigma HM=20\%$ and $\sigma V_{th}=5\%$. The proposed TG-based write circuit shows 4.33% worst-case variation in the produced write current, making it the second most variation-resilient SHE-MTJ write circuit design after the Ben-Romdhane et al. [18] write circuit, which has 2.31% worst-case variation.

TABLE IV: PERFORMANCE OF THE WRITE SCHEMES AS A FUNCTION OF TRANSISTOR SIZE (WIDTH/90 nm RATIO).

| Transition | Design                   | 1× Power (µW) | 1× Delay (ns) | 2× Power (µW) | 2× Delay (ns) | 3× Power (µW) | 3× Delay (ns) | 4× Power (µW) | 4× Delay (ns) |
|------------|--------------------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|---------------|
| P to AP    | CM-1 [18]                | 120.3         | NA*           | 181.47        | NA*           | 238.2         | 2.04          | 288.85        | 1.53          |
| P to AP    | CM-2<sup>Ψ</sup> [19]    | 15.49         | NA*           | 48.05         | NA*           | 65.05         | NA*           | 82.02         | NA*           |
| P to AP    | Ben-Romdhane et al. [18] | 64.9          | NA*           | 134.54        | 2.89          | 200.84        | 1.78          | 264.7         | 1.3           |
| P to AP    | Gupta et al. [21]        | 213.38        | 1.65          | 430.24        | 0.76          | 627.2         | 0.51          | 796.28        | 0.4           |
| P to AP    | TG-based Write Circuit   | 188.68        | 1.9           | 342.86        | 0.98          | 463.3         | 0.71          | 562.17        | 0.57          |
| AP to P    | CM-1 [18]                | 55.58         | NA*           | 201.4         | 2.9           | 262.58        | 1.84          | 316.6         | 1.39          |
| AP to P    | CM-2 [19]                | 40.47         | NA*           | 98.17         | NA*           | 146.3         | 2.57          | 192.9         | 1.85          |
| AP to P    | Ben-Romdhane et al. [18] | 64.9          | NA*           | 134.54        | 2.89          | 200.84        | 1.78          | 264.7         | 1.3           |
| AP to P    | Gupta et al. [21]        | 167.89        | 2.18          | 280.1         | 1.22          | 353.5         | 0.95          | 405.94        | 0.81          |
| AP to P    | TG-based Write Circuit   | 188.56        | 1.9           | 334.77        | 1             | 442.56        | 0.74          | 525.6         | 0.62          |

\* Produced current amplitude is smaller than critical current, thus SHE-MTJ state transduction cannot be ensured.

<sup>Ψ</sup> The size of the transistors within the CM-2 write scheme should be enlarged 10-fold to produce a current amplitude greater than the critical current (139.6 µA > $I_c$), resulting in 183.3 µW write power and 2.19 ns write delay.

Fig. 6. Produced switching current versus the size of the transistors' width for (a) AP to P switching, and (b) P to AP switching.

Thus far, we have considered two types of variation that have a substantial impact on the produced write current; variations in the MTJ free layer ($\sigma MTJ$) can also influence the switching performance. Variations in the MTJ free layer do not affect the produced write current and only impact the switching delay, regardless of the write circuit utilized. Therefore, without loss of generality, we investigate the effect of $\sigma MTJ$ on the switching delay using the proposed TG-based write circuit. MTJ free layer variation affects the spin injection efficiency and the critical spin current according to (1) and (2), respectively, which can alter the switching delay. Fig. 8(a) shows that the switching delay varies linearly with $\sigma MTJ$ with a mild slope when $\sigma HM$ is fixed at zero. Moreover, HM variation has a significant effect on the critical switching current as well as on the produced write current. Fig. 8(b) depicts the fluctuations in switching delay versus $\sigma HM$ without any variation in the MTJ free layer. Finally, Fig. 9 depicts the switching delay variations for various $\sigma MTJ$ and $\sigma HM$ values ranging from 0% to 20%. As can be seen in the figure, the switching delay can approximately double in the worst-case scenario considered herein, i.e., $\sigma MTJ=20\%$ and $\sigma HM=20\%$.

Fig. 7. Write current variations: (a) versus $\sigma HM$ for $\sigma V_{th}=0\%$, (b) versus $\sigma HM$ for $\sigma V_{th}=1\%$, (c) versus $\sigma HM$ for $\sigma V_{th}=5\%$, (d) versus $\sigma V_{th}$ for $\sigma HM=5\%$, (e) versus $\sigma V_{th}$ for $\sigma HM=10\%$, (f) versus $\sigma V_{th}$ for $\sigma HM=15\%$, (g) versus $\sigma V_{th}$ for $\sigma HM=20\%$.

Fig. 8. Switching delay variations: (a) versus $\sigma MTJ$ for $\sigma HM=0\%$, (b) versus $\sigma HM$ for $\sigma MTJ=0\%$.

Fig. 9. Switching delay variations versus $\sigma MTJ$ and $\sigma HM$.

Fig. 10. Monte-Carlo simulation of TG-based write circuit for switching SHE-MTJ device (top) from AP to P, and (bottom) from P to AP.

To assess the transient behavior of SHE-MTJ switching in the presence of process variations in *HM* size and *Vth*, a Monte-Carlo simulation is performed in SPICE along with the SHE-MTJ model developed by Camsari et al. in [22]. Fig. 10 shows the Monte-Carlo simulation of the 1TG-1R write circuit switching the SHE-MTJ device in the presence of 5% and 20% process variation in the transistors' Vth and the HM dimensions, respectively. The results are obtained for 10,000 simulation points. The effect of the process variation on the write current amplitude, and consequently on the switching delay, is visible in the same figure.

TABLE V. WRITE CHARACTERISTICS FOR VARIOUS SHE-MRAM BIT CELLS

| Transition | Features                                   | 7T-1R  | 3T-1R  | 1TG-1T-1R |
|------------|--------------------------------------------|--------|--------|-----------|
| P to AP    | Current (µA)                               | 166.2  | 358.16 | 284.3     |
| P to AP    | Delay (ns)                                 | 1.78   | 0.76   | 0.98      |
| P to AP    | Power (µW)                                 | 199.7  | 430.04 | 359.85    |
| AP to P    | Current (µA)                               | 166.2  | 241.18 | 271.9     |
| AP to P    | Delay (ns)                                 | 1.78   | 1.17   | 1.03      |
| AP to P    | Power (µW)                                 | 199.7  | 289.66 | 344.94    |
|            | Maximum CLK Frequency (MHz)                | 280    | 425    | 485       |
|            | Average Energy (fJ)                        | 356.6  | 423.2  | 363       |
|            | Energy-Delay Product (EDP) (fJ×ns)         | 634.75 | 408.39 | 364.81    |
|            | Average EDP Improvement Compared to 7T-1R  | -      | 35.6%  | 43.9%     |
|            | Average EDP Improvement Compared to 3T-1R  | -      | -      | 10.7%     |
|            | Normalized Area Compared to 2T-1R          | 10.1   | 1.25   | 2.88      |
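The Monte-Carlo results in this section are obtained from SPICE simulations. As a much simpler stand-in that can be run without a circuit simulator, the sketch below samples only the HM dimensions and propagates them through (5) to show how dimensional variation spreads the HM resistance, one of the quantities that sets the write current. Several choices here are assumptions and not taken from this paper: the Gaussian distribution, the independence of the three dimensions, the interpretation of the 20% figure as a 3-sigma bound, and the fixed random seed. Transistor Vth variation is not modeled.

```python
import random
import statistics

R_HM_NOMINAL = 1111.1  # Ohm, from Eq. (5) with the Table I values


def sample_hm_resistance(rng, sigma):
    """Draw one R_HM with independent Gaussian errors on the HM dimensions."""
    length = 1.0 + rng.gauss(0.0, sigma)
    width = 1.0 + rng.gauss(0.0, sigma)
    thick = 1.0 + rng.gauss(0.0, sigma)
    return R_HM_NOMINAL * length / (width * thick)


def monte_carlo(sigma, n=10_000, seed=0):
    """Return the sample mean and standard deviation of R_HM over n draws."""
    rng = random.Random(seed)
    samples = [sample_hm_resistance(rng, sigma) for _ in range(n)]
    return statistics.mean(samples), statistics.stdev(samples)


# Assumption: 20% variation is treated as a 3-sigma bound on each dimension.
mean_r, std_r = monte_carlo(sigma=0.20 / 3)
print(f"mean = {mean_r:.0f} Ohm, std = {std_r:.0f} Ohm")
```

For a fixed drive voltage, a spread in the HM resistance translates into a spread in write current, which in turn affects the switching delay through (3).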

# V. SPIN-HALL EFFECT MRAM BASED MEMORY

Up to this point, we have investigated various SHE-MTJ write schemes, which can be utilized for both logic and memory applications. The obtained results have provided a meaningful comparison between the introduced write circuits. However, additional attributes should be considered for leveraging the write circuits within a SHE-based magnetic random access memory (SHE-MRAM) bit cell. Therefore, in this section, we focus on the bit cell circuit and layout design considerations, as well as the effect of the Source Line (SL), Bit Line (BL), and Word Line (WL) drivers on the write performance. Herein, we have utilized a chain of four inverters to drive the BL, SL, and WL, in which each successive inverter is twice as large as the previous one. We have leveraged only the three most energy-efficient and variation-resilient write circuits examined in Sections III and IV. The current mirror circuits are excluded from the analyses due to their high energy consumption and susceptibility to *Vth* variations. Table V provides a detailed comparison of the SHE-MRAM cells developed and examined in this paper, in which the effect of the line drivers is included for a meaningful comparison. The normalized area of the proposed SHE-MRAM cells compared to the conventional 2T-1R cell is also listed in the last row of Table V.

Fig. 11 shows a 7T-1R bitcell that is designed based on the write circuit introduced by Ben-Romdhane et al. [18], requiring two read transistors and five write transistors. As shown in Fig. 6, the write transistors must be tripled in size in this design to produce a write current greater than the critical switching current, leading to a significant area overhead, as shown in Table V. A 3T-1R SHE-MRAM bitcell structure, inspired by the write circuit proposed by Gupta et al. [21], is shown in Fig. 12. The schematic and layout of our proposed TG-based SHE-MRAM bitcell are shown in Fig. 13; it includes one TG for the write operation and one transistor for the read operation (1TG-1T-1R). To provide a comparable configuration, we have also tripled the size of the transistors utilized in the 3T-1R and 1TG-1T-1R structures to obtain the results listed in Table V.

As shown in Fig. 11, the 7T-1R bitcell has a complete current path from VDD to GND via the transistors and the HM. Since the BL and the SL are electrically isolated from this current path, the strengths of the BL and SL drivers do not need to be considered for the write operation. In contrast, the 3T-1R and proposed 1TG-1T-1R structures do not have a complete current path within the cell. Since the polarities of the BL and SL need to be changed, additional write drivers must be connected to both the BL and the SL. Hence, as listed in Table V, the 7T-1R structure is one of the most energy-efficient designs, although it incurs significant area overhead. Moreover, delivering VDD to each bitcell within the memory array could be challenging, which is ignored in our analysis. The qualitative comparison provided in Table VI shows that our proposed 1TG-1T-1R bit cell is one of the energy-efficient designs with improved energy-delay product (EDP) values, as listed in the EDP rows of Table V, while being the second most variation-resilient design after the 7T-1R cell and the second most area-efficient design after the 3T-1R cell. Fig. 14 shows a 128×128 memory array constructed from the proposed 1TG-1T-1R structure.

Fig. 11. (a) 7T-1R bitcell. (b) 7T-1R bitcell layout view.

Fig. 12. (a) 3T-1R bitcell. (b) 3T-1R bitcell layout view.

Fig. 13. (a) 1TG-1T-1R bitcell. (b) 1TG-1T-1R bitcell layout view. (c) Required signaling for 1TG-1T-1R SHE-MRAM cell.

Fig. 14. 128×128 memory array constructed by the 1TG-1T-1R structure.

TABLE VI. QUALITATIVE COMPARISON OF THE SHE-MRAM BIT CELLS

| Parameter                    | 7T-1R                  | 3T-1R                  | 1TG-1T-1R              |
|------------------------------|------------------------|------------------------|------------------------|
| Energy Efficiency            | $\checkmark\checkmark$ | $\checkmark$           | $\checkmark\checkmark$ |
| Process Variation Resiliency | $\checkmark\checkmark$ | -                      | $\checkmark$           |
| Area Efficiency              | -                      | $\checkmark\checkmark$ | $\checkmark$           |
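The EDP entries of Table V can be cross-checked with the sketch below. It assumes that the EDP is the listed average energy multiplied by the mean of the P to AP and AP to P delays; this assumption reproduces the tabulated EDP values to within rounding, but the exact procedure is not spelled out in the text. The relative improvement is shown only for the 1TG-1T-1R cell against the 3T-1R cell, and the helper names are illustrative.

```python
# Average energy (fJ) and (P-to-AP, AP-to-P) delays (ns) copied from Table V.
cells = {
    "7T-1R": {"energy_fj": 356.6, "delays_ns": (1.78, 1.78)},
    "3T-1R": {"energy_fj": 423.2, "delays_ns": (0.76, 1.17)},
    "1TG-1T-1R": {"energy_fj": 363.0, "delays_ns": (0.98, 1.03)},
}


def energy_delay_product(cell):
    """Assumed definition: average energy times the mean of the two delays."""
    mean_delay = sum(cell["delays_ns"]) / len(cell["delays_ns"])
    return cell["energy_fj"] * mean_delay


def relative_improvement(reference, candidate):
    """Fractional EDP reduction of `candidate` with respect to `reference`."""
    return 1.0 - candidate / reference


edp = {name: energy_delay_product(cell) for name, cell in cells.items()}
for name, value in edp.items():
    print(f"{name}: EDP = {value:.2f} fJ*ns")

gain_vs_3t = relative_improvement(edp["3T-1R"], edp["1TG-1T-1R"])
print(f"1TG-1T-1R vs 3T-1R: {gain_vs_3t:.1%}")
```

Although the 1TG-1T-1R cell consumes slightly more average energy than the 7T-1R cell, its shorter delays give it the lowest EDP of the three.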

# VI. CONCLUSION

In this paper, we developed a symmetric, energy-efficient TG-based write scheme for SHE-based MTJ devices. A SHE-MTJ Verilog-A behavioral model was leveraged in SPICE circuit simulations to validate the functionality of the designed circuit using experimental parameters. Various write schemes were developed and equipped with a clocking mechanism to produce the bidirectional current required for SHE-MTJ switching. Simulation results exhibited the symmetric behavior of the proposed TG-based write circuit. Comparisons with various write schemes indicated that the TG-based design excels in terms of switching delay and energy. In particular, the proposed TG-based write scheme was shown to operate at a 12% higher clock frequency and to achieve approximately 13% energy improvement compared to the next most energy-efficient design. We investigated the functionality and performance of the implemented write circuits in the presence of process variation in the transistors' threshold voltage and the SHE-MTJ dimensions. The obtained results showed that the proposed TG-based design is the second most process variation-resilient SHE-MTJ write circuit among the implemented designs, allowing appropriate energy versus PV tradeoffs. Finally, we leveraged three of the most energy-efficient and PV-resilient write circuits within a memory bit cell and investigated their energy and area tradeoffs. The obtained results show that our proposed 1TG-1T-1R SHE-MRAM bit cell excels in terms of energy-delay product (EDP) while incurring low area overhead.

#### References

- [1] Z. Sun, X. Bi, H. Li, W. F. Wong, and X. Zhu, "STT-RAM Cache Hierarchy With Multiretention MTJ Designs," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 22, pp. 1281-1293, 2014.
- [2] R. Zand, A. Roohi, S. Salehi, and R. DeMara, "Scalable Adaptive Spintronic Reconfigurable Logic using Area-Matched MTJ Design," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. PP, pp. 1-1, 2016.
- [3] A. Roohi, R. Zand, and R. F. DeMara, "A Tunable Majority Gate-Based Full Adder Using Current-Induced Domain Wall Nanomagnets," *IEEE Transactions on Magnetics*, vol. 52, pp. 1-7, 2016.
- [4] B. Behin-Aein, J.-P. Wang, and R. Wiesendanger, "Computing with spins and magnets," *MRS Bulletin*, vol. 39, pp. 696-702, 2014.
- [5] J. C. Slonczewski, "Current-driven excitation of magnetic multilayers," *Journal of Magnetism and Magnetic Materials*, vol. 159, pp. L1-L7, 1996.
- [6] H. Farkhani, A. Peiravi, and F. Moradi, "Low-Energy Write Operation for 1T-1MTJ STT-RAM Bitcells With Negative Bitline Technique," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 24, pp. 1593-1597, 2016.
- [7] R. Patel, X. Guo, Q. Guo, E. Ipek, and E. G. Friedman, "Reducing Switching Latency and Energy in STT-MRAM Caches With Field-Assisted Writing," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 24, pp. 129-138, 2016.
- [8] L. Liu, C.-F. Pai, D. Ralph, and R. Buhrman, "Magnetic oscillations driven by the spin Hall effect in 3-terminal magnetic tunnel junction devices," *Physical Review Letters*, vol. 109, p. 186602, 2012.
- [9] L. Liu, T. Moriyama, D. Ralph, and R. Buhrman, "Spin-torque ferromagnetic resonance induced by the spin Hall effect," *Physical Review Letters*, vol. 106, p. 036601, 2011.
- [10] Z. Wang, W. Zhao, E. Deng, J.-O. Klein, and C. Chappert, "Perpendicular-anisotropy magnetic tunnel junction switched by spin-Hall-assisted spin-transfer torque," *Journal of Physics D: Applied Physics*, vol. 48, p. 065001, 2015.
- [11] W. Zhao, X. Zhao, B. Zhang, K. Cao, L. Wang, W. Kang, et al., "Failure Analysis in Magnetic Tunnel Junction Nanopillar with Interfacial Perpendicular Magnetic Anisotropy," *Materials*, vol. 9, p. 41, 2016.
- [12] S. Manipatruni, D. E. Nikonov, and I. A. Young, "Energy-delay performance of giant spin Hall effect switching for dense magnetic memory," *Applied Physics Express*, vol. 7, p. 103001, 2014.
- [13] S. Rakheja and A. Naeemi, "Graphene nanoribbon spin interconnects for nonlocal spin-torque circuits: comparison of performance and energy per bit with CMOS interconnects," *IEEE Transactions on Electron Devices*, vol. 59, pp. 51-59, 2012.
- [14] C.-F. Pai, L. Liu, Y. Li, H. Tseng, D. Ralph, and R. Buhrman, "Spin transfer torque devices utilizing the giant spin Hall effect of tungsten," *Applied Physics Letters*, vol. 101, p. 122404, 2012.
- [15] R. Zand, A. Roohi, D. Fan, and R. F. DeMara, "Energy-Efficient Nonvolatile Reconfigurable Logic Using Spin Hall Effect-Based Lookup Tables," *IEEE Transactions on Nanotechnology*, vol. 16, pp. 32-43, 2017.
- [16] A. Aziz, W. Cane-Wissing, M. S. Kim, S. Datta, V. Narayanan, and S. K. Gupta, "Single-Ended and Differential MRAMs Based on Spin Hall Effect: A Layout-Aware Design Perspective," in *2015 IEEE Computer Society Annual Symposium on VLSI*, 2015, pp. 333-338.
- [17] J. Kim, B. Tuohy, C. Ma, W. H. Choi, I. Ahmed, D. Lilja, et al., "Spin-Hall effect MRAM based cache memory: A feasibility study," in *2015 73rd Annual Device Research Conference (DRC)*, 2015, pp. 117-118.
- [18] N. Ben-Romdhane, W. Zhao, Y. Zhang, J.-O. Klein, Z. Wang, and D. Ravelosona, "Design and analysis of racetrack memory based on magnetic domain wall motion in nanowires," in *Proceedings of the 2014 IEEE/ACM International Symposium on Nanoscale Architectures*, 2014, pp. 71-76.
- [19] D. Lee and K. Roy, "Energy-delay optimization of the STT MRAM write operation under process variations," *IEEE Transactions on Nanotechnology*, vol. 13, pp. 714-723, 2014.
- [20] S. Motaman, S. Ghosh, and N. Rathi, "Impact of process-variations in STTRAM and adaptive boosting for robustness," in *Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition*, 2015, pp. 1431-1436.
- [21] S. K. Gupta, S. P. Park, N. N. Mojumder, and K. Roy, "Layout-aware optimization of STT MRAMs," in *Proceedings of the Conference on Design, Automation and Test in Europe*, 2012, pp. 1455-1458.
- [22] K. Y. Camsari, S. Ganguly, and S. Datta, "Modular approach to spintronics," *Scientific Reports*, vol. 5, 2015.



**Ramtin Zand** received the B.Sc. degree in Electrical Engineering in 2010 from IKIU, Qazvin, Iran. He also received the M.Sc. degree in Digital Electronics from Sharif University of Technology, Tehran, Iran, in 2012. He is currently working toward the Ph.D. degree in Computer Engineering at the University of Central Florida, Orlando, USA. His research interests are in Reconfigurable and Adaptive Computing Architectures with emphasis on spintronic devices.



**Arman Roohi** received the B.Sc. degree in computer engineering in 2008 from Shiraz University, Shiraz, Iran. He also received the M.Sc. degree in computer architecture from the Department of Computer Engineering, Science and Research Branch of IAU, Tehran, Iran, in 2011. He is currently working toward the Ph.D. degree in computer engineering at the University of Central Florida, Orlando, USA. His research interests include nanoelectronics with emphasis on spintronics, QCA, computer arithmetic, and interconnection network design.



**Ronald F. DeMara** (S'87-M'93-SM'05) received the Ph.D. degree in Computer Engineering from the University of Southern California in 1992. Since 1993, he has been a full-time faculty member at the University of Central Florida where he is a Professor of Electrical and Computer Engineering, and joint faculty of Computer Science, and has served as Associate Chair, ECE Graduate Coordinator, and Computer Engineering Program Coordinator.

His research interests are in computer architecture with emphasis on reconfigurable logic devices, evolvable hardware, and emerging devices, on which he has published approximately 200 articles and holds one patent. He received IEEE's Joseph M. Biedenbach Outstanding Engineering Educator Award in 2008.

He is a Senior Member of IEEE and has served on the Editorial Boards of *IEEE Transactions on VLSI Systems*, *Journal of Circuits, Systems, and Computers*, the journal of *Microprocessors and Microsystems*, and as Associate Guest Editor of *ACM Transactions on Embedded Computing Systems*, as well as a Keynote Speaker of the International Conference on Reconfigurable Computing and FPGAs (ReConFig). He is lead Guest Editor of *IEEE Transactions on Computers* joint with *IEEE Transactions on Emerging Topics in Computing* 2017 Special Section on Innovation in Reconfigurable Computing Fabrics: from Devices to Architectures. He is currently an Associate Editor of *IEEE Transactions on Computers*, and serves on various IEEE conference program committees, including ISVLSI and SSCI.