This document is an author-formatted work. The definitive version for citation appears as:

M. Krishna Gopi Krishna, A. Roohi, R. Zand and R. F. DeMara, "Heterogeneous energy-sparing reconfigurable logic: spin-based storage and CNFET-based multiplexing," in *IET Circuits, Devices & Systems*, vol. 11, no. 3, pp. 274-279, 5 2017. doi: 10.1049/iet-cds.2016.0216

http://ieeexplore.ieee.org/document/7939010/ http://digital-library.theiet.org/content/journals/10.1049/iet-cds.2016.0216

# Heterogeneous Energy-Sparing Reconfigurable Logic: Spin-based Storage and CNFET-based Multiplexing

Mohan Krishna Gopi Krishna, Arman Roohi, Ramtin Zand, and Ronald F. DeMara\*

Computer Architecture Lab, Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL, 32816-2362, USA \*demara@mail.ucf.edu

Abstract: Field Programmable Gate Array (FPGA) attributes of logic configurability, bitstream storage, and dynamic signal routing can be realized by leveraging the complementary benefits of emerging devices with Complementary Metal Oxide Semiconductor (CMOS)-based devices. In this paper, a novel Carbon/Magnet Lookup Table (CM-LUT) is developed and evaluated by trading off a range of mixed heterogeneous technologies to balance energy, delay, and reliability attributes. Herein, magnetic spintronic devices are employed in the configuration memory to contribute non-volatility and high scalability. Meanwhile, Carbon-Nanotube Field-Effect Transistors (CNFETs) provide desirable conductivity, low delay, and low power consumption. The proposed CM-LUT offers ultra-low power and high-speed operation while maintaining high-endurance re-programmability with increased immunity to radiation-induced soft errors. The proposed 4-input 1-output CM-LUT utilizes 41 CNFETs and 20 Magnetic Tunnel Junctions (MTJs) for read operations and 35 CNFETs to perform write operations, and achieves a 9.3% reconfiguration Power Delay Product (PDP) improvement in comparison with spin-based Look-up Tables (LUTs). Finally, additional hybrid technology designs are considered to balance performance with the demands of energy consumption for near-threshold operation.

## 1. Introduction

As we continue to advance towards CMOS technology-scaling limits, new and innovative strategies are sought to realize reconfigurable logic fabrics such as FPGA devices. In conventional FPGA designs, Static Random Access Memory (SRAM) bit-cells are the major source of standby power consumption [1]. Emerging devices such as MTJs offer new alternatives providing non-volatility, near-zero leakage power, high integration density, and increased immunity to radiation-induced faults [2]. These characteristics have motivated the use of spin-based devices for memory functionality. Meanwhile, alternatives to CMOS-based switching networks for logic implementations are also sought, for which CNFETs have garnered recent interest due to their high electron mobility [3].

Emerging devices could advance new transformative opportunities for exploiting technology-specific advantages, which we refer to as Technology Heterogeneity. Technology heterogeneity recognizes the cooperating advantages of CNFET devices for their rapid and low energy switching capabilities, while simultaneously leveraging spin-based devices for their non-volatility, near-zero standby power, high integration density, and radiation-hardness. Thus, our main contribution in this work is utilizing and aggregating the proven advantages of CNFET devices to complement spin-based devices. The simulation results verified that the combination of these devices in a LUT circuit can realize improved performance and energy consumption compared to CMOS/MTJ hybrid technology LUT designs. Realization of technology heterogeneity in this work can also provide a platform for extension to other possible combinations of emerging switching devices and storage elements.

In this paper, a novel heterogeneous LUT has been realized using MTJs as storage elements, and CNFETs as logic switching elements, as well as the MTJ write circuitry, as shown in Fig. 1. The proposed 4-Input 1-Output CM-LUT is designed using 41 CNFETs to perform the read operation, 20 MTJs to store bit information and 34 CNFETs to implement the write circuit.



Fig. 1. Heterogeneous LUT design.

The following progression of designs is pursued to elaborate technology-specific advantages for reconfigurable fabrics:

- utilization of a Magnetic Look-Up Table (M-LUT) providing MTJs in place of SRAM cells to reduce leakage power consumption and achieve in-place configuration storage within the LUT,
- design of a new low-energy CNFET-based Pre-Charge Sense Amplifier (PCSA) to read the state of the MTJs with a compact footprint and low switching energy,
- design of a novel CM-LUT, using CNFETs in place of a CMOS-based logic select tree. This leverages the preferred switching characteristics of Carbon NanoTubes (CNTs) within a mixed technology design,
- design of a high-speed CNFET-based MTJ write circuit, utilizing the significant drive capabilities of CNFETs to deliver fast MTJ switching.

The remainder of the paper is organized as follows. Section 2 provides basic background to MTJ and CNFET device operations. This section also identifies previous works of hybrid LUT designs spanning emerging technologies. Section 3 deals with the design of different components required to realize a CM-LUT element. Read and write circuits comprising CM-LUT are parametrized and verified in Section 4. Section 5 concludes the paper with PDP estimates for the M-LUT and CM-LUT designs of a 4-input 1-bit output capacity reconfigurable logic element.

## 2. Background

#### 2.1. Previously Proposed Emerging Device Look-Up Tables

The performance of various LUT designs using popular technologies such as memristors [4], Phase Change Memories (PCM) [5], and Carbon Nanotubes (CNT) [6, 7] is compared with the results of the proposed design in Table 1. Among the LUT designs using emerging memory devices, spintronic devices offer several advantages such as higher endurance, higher integration density, and high read speed. In particular, Zhao et al. [8] proposed the Spin LUT, which utilizes MTJ devices as storage elements and a localized PCSA [9] to read each MTJ's state. The select tree is an NMOS switching structure used to select one of the multiple PCSAs, leading to the logic output of the LUT. This design uses complementary sensing and hence requires $2^{n+1}$ MTJs to realize an *n*-input LUT. The write circuit is designed to modify a single bit of configuration logic, corresponding to the state change of the two MTJs. Based on the LUT approach introduced by Zhao et al., extensions using a current-induced Domain Wall (DW) shift register [10] or Racetrack Memories (RM) [11] as the storage element have the potential for higher integration density at the expense of high reconfiguration energy due to the excessive shift operations. These designs employ a global PCSA instead of localized PCSAs. Since both designs use a complementary sensing structure, they also require $2^{n+1}$ MTJs. In addition to the complementary write circuit from the previous designs, a shift current is required to propagate the bit information through the nanowire. The RM-based LUT [11] has localized read heads for each constriction, hence the need for shift operations is eliminated.

| Research Work               | Technology       | Power        | Delay     |
|-----------------------------|------------------|--------------|-----------|
| Memristor LUT (mrLUT) [4]   | CMOS + Memristor | Medium       | Ultra-low |
| PCM LUT [5]                 | CMOS + PCM       | Not reported | Very high |
| DW-LUT [10]                 | CMOS + DW        | High         | Very high |
| CNFET-LUT [6]               | CNFET + SRAM     | Low          | Low       |
| NEMS-CMOS LUT [7]           | CMOS + NEMS      | Very low     | Low       |
| Baseline STT-MRAM based LUT | CMOS + MTJ       | High         | Medium    |
| Proposed Work               | CNFET + MTJ      | Very low     | Very low  |

Table 1: Selected related works in emerging-device-based LUTs

#### 2.2. Fundamentals of MTJ Operation

An MTJ consists of two ferromagnetic (FM) layers, called the *pinned layer* and the *free layer*, separated by an oxide barrier. The free layer can be aligned in two different configurations, parallel (P) and antiparallel (AP) with respect to the pinned layer, which result in a low-resistance or high-resistance characteristic, respectively [12-14]. The parallel and antiparallel states can reach up to a 600% difference in resistance [15] due to the Tunnelling Magnetoresistance (TMR) effect. This difference between the states can be readily distinguished using a sense amplifier.

A spin-polarized current flowing through the free layer of the MTJ exerts an angular momentum that changes the magnetic orientation of the free layer. This effect, called Spin Transfer Torque (STT), was proposed in [16]. It can be formulated by the Landau-Lifshitz-Gilbert (LLG) equation [17, 18],

$$\frac{\partial \vec{m}}{\partial t} = -\gamma \vec{m} \times \vec{H}_{eff} + \alpha \vec{m} \times \frac{\partial \vec{m}}{\partial t} - \beta J \left(\vec{m} \times (\vec{m} \times \vec{M})\right) \tag{1}$$

where $\gamma$ is the gyromagnetic ratio, $\vec{M}$ and $\vec{m}$ are the unit vectors of the pinned and free layers, respectively, $\vec{H}_{eff}$ is the effective field exerted on the device, $\alpha$ is the damping constant, $\beta$ is the STT coefficient, which depends on the spin polarization and the geometric configuration of the spin torque efficiency, and *J* is the current density.

#### 2.3. Carbon Nanotube Field Effect Transistor (CNFET)

CNTs are cylindrical carbon molecules with unique electrical and thermal properties. CNTs are essentially a single sheet of graphene rolled into a cylinder, with diameters ranging from 0.6 to 5 nm. Their high room-temperature mobility and scattering velocity make them suitable candidates for nano-electronics. Based on the chiral vector, their electrical behaviour can be classified as follows [19-22]:

- If *n* = *m*, the CNT exhibits metallic properties, or
- If *n* − *m* = 3*i*, where *i* is an integer, the CNT acts as a semiconductor with a small band gap.
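The chirality rule can be illustrated with a minimal Python sketch. The function name `classify_cnt` and the returned labels are hypothetical. The final branch (all remaining chiralities behave as ordinary semiconductors) is not stated in the list above; it is added here as an assumption taken from standard CNT theory.

```python
def classify_cnt(n: int, m: int) -> str:
    """Classify a CNT by its chiral indices (n, m), following the rule above."""
    if n < 0 or m < 0 or (n == 0 and m == 0):
        raise ValueError("chiral indices must be non-negative and not both zero")
    if n == m:
        # Equal indices: metallic tube.
        return "metallic"
    if (n - m) % 3 == 0:
        # n - m is a non-zero multiple of 3: small band gap.
        return "small-gap semiconductor"
    # Assumption (standard CNT theory, not stated in the text): everything else
    # is an ordinary semiconducting tube.
    return "semiconductor"


if __name__ == "__main__":
    for indices in [(10, 10), (9, 0), (19, 0)]:
        print(indices, classify_cnt(*indices))
```

The three example chiralities are arbitrary illustrations and are not taken from the simulated devices.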

The integers *n* and *m* are the chiral indices of the tube. Their relative values determine the conductive properties of CNTs, i.e. the tubes can be configured to exhibit either metallic or semiconducting properties. Metal-Oxide Semiconductor Field-Effect Transistor (MOSFET)-like CNFET devices have a structure similar to the conventional MOSFET, except that the channel is constructed using multiple CNTs and its conduction is controlled electrostatically by the potential applied at the gate terminal. The CNTs, which act as the channel, are placed on a silicon substrate, such as SiO<sub>2</sub>. The gate terminal and the CNTs are separated by a high-K dielectric material, such as HfO<sub>2</sub>, to insulate the channel from the gate. The source and drain terminals are doped with impurities at the contacts to improve electron and hole transport, resulting in a unipolar conduction device. Since a MOSFET-like CNFET possesses a low subthreshold slope and a low OFF current, it is an ideal candidate for low-power, high-performance circuit design [19]. Throughout this paper, CNFET refers to a MOSFET-like CNFET for simplicity.

The number of tubes is a major determinant of the drain current generated by a MOSFET-like CNFET. In particular, the drain current of CNFET ( $I_{CNFET}$ ) is given by,

$$I_{CNFET} = \frac{n \times g_{CNT} \times (V_{supply} - V_{th,CNT})}{1 + (g_{CNT} \times L_{s,CNT} \times \rho_{s,CNT})} \tag{2}$$

where *n* is the number of tubes present in the channel, $g_{CNT}$ is the transconductance per CNT, $V_{supply}$ is the supply voltage, $V_{th,CNT}$ denotes the threshold voltage of a semiconducting CNT with a diameter of 1.51 nm from [2], $L_{s,CNT}$ is the length of the doped source region of the CNT, and $\rho_{s,CNT}$ is the resistance per unit length of the doped source CNT region. Within these parameters, CNTs exhibit ballistic electronic conduction, resilience to electromigration, the capability to withstand high current densities, and operation in a low voltage range [23].
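As a rough illustration of Eq. (2), the following Python sketch evaluates the drain current for a given number of tubes. The numerical values of `g_cnt`, `v_th`, `l_s`, and `rho_s` are placeholders chosen only to exercise the formula; they are assumptions and are not the parameters of the device model used in this work. The tube count (5) and supply voltages (1.1 V and 0.7 V) follow the simulation setup in Section 4.

```python
def cnfet_drain_current(n_tubes, g_cnt, v_supply, v_th, l_s, rho_s):
    """Evaluate Eq. (2): drain current of a MOSFET-like CNFET in amperes.

    n_tubes  : number of CNTs in the channel
    g_cnt    : transconductance per CNT (S)
    v_supply : supply voltage (V)
    v_th     : CNT threshold voltage (V)
    l_s      : length of the doped source region (m)
    rho_s    : resistance per unit length of the doped source region (ohm/m)
    """
    numerator = n_tubes * g_cnt * (v_supply - v_th)
    denominator = 1.0 + g_cnt * l_s * rho_s
    return numerator / denominator


if __name__ == "__main__":
    # Placeholder device values (assumptions, for illustration only).
    params = dict(g_cnt=30e-6, v_th=0.3, l_s=32e-9, rho_s=3.3e9)
    for v_supply in (1.1, 0.7):
        i_d = cnfet_drain_current(n_tubes=5, v_supply=v_supply, **params)
        print(f"V_supply = {v_supply} V -> I_CNFET = {i_d * 1e6:.1f} uA")
```

Because the number of tubes appears only in the numerator, doubling *n* doubles the current under this expression, which is why the tube count is described as a major determinant of drive strength.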

## 3. Proposed CNFET-based MTJ-based LUT Design

A functional block diagram of the CNFET MTJ-based LUT is shown in Fig. 2. Functions are abstracted into different components to realize a 4-input 1-output LUT, including 16 MTJs to store configuration bits, a multiplexer (MUX) to select the logic function output, and a sense amplifier with associated reference resistances. Each component has been implemented using the appropriate technology as indicated in Fig. 2. In the read circuit, each 4-input LUT requires 41 CNFETs, sixteen MTJ-based storage cells, and four reference MTJ cells. The write circuit consists of 35 CNFETs to change the state of the storage cells, plus eight CNFETs required for setting the reference MTJ structure. Each component is described below.

Fig. 2. Functional Block Diagram of 4-input 1-output CM-LUT, which utilizes CNFETs for read and write circuits.

#### 3.1. Read Control Circuit

Since CNFETs offer faster drive capability and lower power consumption than CMOS [2], the entire sensing operation of the LUT is designed using CNFETs.

3.1.1. *CNFET Pre-Charge Sense Amplifier:* The PCSA converts the magnetization configurations of resistive memory devices into CMOS-compatible signalling levels. Since the sensing speed of the PCSA is on the order of a few picoseconds, it can realize the desirable *instant ON* feature, which is useful to minimize standby energy in large LUT-based fabrics, as proposed by Zhao et al. [9]. The operation of the PCSA is as follows. As shown in Fig. 3(a), the CNFET-based PCSA consists of four P-channel and three N-channel CNFETs. The P-channel CNFETs labelled MP1 and MP4 pre-charge the outputs Q and Q' to the supply voltage VDD while the clock is low. During this period, no current flows through the MTJs, as the N-channel CNFETs MN1 and MN2 are still OFF. During clock-high intervals, the pre-charged nodes start discharging through MN1 and MN2, which are now turned ON because their gates are connected to the Q and Q' output nodes. Also during clock high, the discharge N-channel CNFET MN3 is turned ON, and the voltages at Q and Q' start discharging through MTJ1 and MTJ2, respectively.



Fig. 3. (a) CNFET-based Pre-Charge Sense Amplifier (PCSA) MTJ read circuit. (b) Reducing the number of MTJs required per bit using a reference MTJ structure realizing $(R_{AP}+R_P)/2$.

Because a node discharges faster through a lower resistance, the branch containing the MTJ in the P state discharges faster than the branch containing the MTJ in the AP state. For example, in Fig. 3(a), MTJ1 is illustrated in the AP state and MTJ2 in the P state. Thus, Q discharges to ground faster than Q', turning ON the P-channel CNFET MP2 and pulling the output Q' high. Note also that, since Q' controls the P-channel CNFET MP3 and the N-channel CNFET MN2, a high Q' maintains the discharge of Q to ground via MN2 while blocking MP2 from charging back to the supply voltage. In this case, Q is produced as a low output and Q' as a high output; the operation is reversed when the states of the MTJs are reversed. The state is retained until the next clock-low period, when the outputs are charged back to the supply voltage [9].
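The race between the two discharging branches can be sketched with a first-order behavioural model in Python. This is a simplification under stated assumptions, not the simulated circuit: each output node is treated as a capacitor `c_node` discharging through a single lumped branch resistance, transistor resistances are ignored, and the latch is assumed to resolve as soon as the faster node crosses `v_trip`. The names `pcsa_sense`, `c_node`, and `v_trip` and all numerical values are hypothetical.

```python
import math


def pcsa_sense(r_q, r_qbar, c_node=1e-15, vdd=1.1, v_trip=None):
    """Behavioural PCSA model: returns (Q, Q', time for the faster node to trip).

    r_q, r_qbar : lumped resistance of the branch discharging Q and Q' (ohm)
    c_node      : capacitance of each pre-charged output node (F), assumed value
    vdd         : pre-charge voltage (V)
    v_trip      : voltage at which the latch is assumed to resolve (V)
    """
    if r_q <= 0 or r_qbar <= 0:
        raise ValueError("branch resistances must be positive")
    if r_q == r_qbar:
        raise ValueError("equal branch resistances give no sense margin")
    if v_trip is None:
        v_trip = vdd / 2.0
    # RC discharge: V(t) = VDD * exp(-t / (R * C)); the smaller R wins the race.
    tau_fast = min(r_q, r_qbar) * c_node
    t_trip = tau_fast * math.log(vdd / v_trip)
    if r_q < r_qbar:
        return 0, 1, t_trip  # Q latches low, Q' is pulled high
    return 1, 0, t_trip      # Q' latches low, Q is pulled high


if __name__ == "__main__":
    q, q_bar, t = pcsa_sense(r_q=3e3, r_qbar=4.5e3)
    print(f"Q={q}, Q'={q_bar}, resolved after about {t * 1e12:.2f} ps")
```

In this model the output depends only on which branch has the lower resistance, while the resolution time scales with that resistance and the node capacitance.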

3.1.2. *CNFET Pass Transistor (PT) Select Tree:* A select tree that accommodates $2^n$ bits consists of $2^{(n+1)}-2$ transistors; thus, to accommodate 16 bits of data, a 16:1 MUX is formed from 30 pass transistors. Fig. 2 shows the implementation of a 4-input 1-output LUT, in which the select tree consists of 30 N-channel CNFETs. Moreover, a reference stack accompanies the LUT design to compensate for the active resistance of the select tree and ensure correct sensing operation of the PCSA. The storage cells are selected through the select tree according to the input signals A, B, C, and D. For instance, if the input signal is ABCD = "0000", MTJ<sub>0</sub> is selected to provide the corresponding binary logic output.
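The select-tree sizing and cell selection described above can be sketched in Python as follows. The helper names are hypothetical, and treating A as the most significant bit is an assumption; it is consistent with ABCD = "0000" selecting MTJ<sub>0</sub> and ABCD = "1111" selecting MTJ<sub>15</sub>, but the text does not fix the ordering of the intermediate inputs.

```python
def select_tree_transistors(n_inputs: int) -> int:
    """Pass transistors in a binary select tree addressing 2**n_inputs cells."""
    return 2 ** (n_inputs + 1) - 2


def selected_cell(inputs) -> int:
    """Index of the storage cell addressed by the inputs (first input = MSB, assumed)."""
    index = 0
    for bit in inputs:
        if bit not in (0, 1):
            raise ValueError("inputs must be 0 or 1")
        index = (index << 1) | bit
    return index


def read_lut(config_bits, inputs) -> int:
    """Logic output of the LUT: the configuration bit at the addressed cell."""
    if len(config_bits) != 2 ** len(inputs):
        raise ValueError("need 2**n configuration bits for n inputs")
    return config_bits[selected_cell(inputs)]


if __name__ == "__main__":
    print(select_tree_transistors(4))           # transistor count for a 4-input LUT
    and4 = [0] * 15 + [1]                       # 4-input AND: only cell 15 stores "1"
    print(read_lut(and4, (0, 0, 0, 0)), read_lut(and4, (1, 1, 1, 1)))
```

The transistor count doubles (plus two) with each added input, whereas the depth of the tree, and hence the series resistance seen by the sense amplifier, grows only linearly with the number of inputs.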

3.1.3. *Symmetrical Reference MTJ Structure:* The reference circuit is designed to alleviate the need for complementary sensing MTJs. In the conventional complementary-style MTJ read/write circuit, storage of a single bit of information requires changing the state of two MTJs [2, 8]. Hence, complementary circuitry occupies additional memory footprint per bit and elevates the write power consumption. In contrast, the size of the symmetrical reference circuit proposed herein is independent of the number of LUT inputs: four reference MTJs suffice to implement an arbitrarily-sized LUT. The design of the symmetrical reference circuit is illustrated in Fig. 3(b). To ensure the correct operation of the sense amplifier and to increase the sense margin, a reference stack consisting of two MTJs in the P configuration and two MTJs in the AP configuration is implemented in our design. This provides an effective resistance between $R_P$ and $R_{AP}$ of the sensed MTJs, as given by the following expression:

$$R_{Reference \, stack} = (R_P + R_{AP}) \parallel (R_P + R_{AP}) = (R_P + R_{AP})/2 \tag{3}$$
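Eq. (3) can be checked numerically with a short Python sketch. The values of `r_p` and `tmr` below are placeholders chosen for illustration and are assumptions, not the parameters of the simulated MTJ; the relation $R_{AP} = R_P(1 + TMR)$ is the usual definition of the TMR ratio.

```python
def reference_stack_resistance(r_p: float, r_ap: float) -> float:
    """Two series (P + AP) branches in parallel, as in Eq. (3)."""
    branch = r_p + r_ap
    return (branch * branch) / (branch + branch)


if __name__ == "__main__":
    r_p = 3e3                  # assumed parallel-state resistance (ohm)
    tmr = 1.0                  # assumed TMR ratio of 100%
    r_ap = r_p * (1.0 + tmr)   # antiparallel-state resistance
    r_ref = reference_stack_resistance(r_p, r_ap)
    print(f"R_ref = {r_ref:.0f} ohm")
    print(f"margin to P:  {r_ref - r_p:.0f} ohm")
    print(f"margin to AP: {r_ap - r_ref:.0f} ohm")
```

Under this expression the reference lies exactly midway between the two states, so the resistance margin is the same, $(R_{AP} - R_P)/2$, whichever state is being sensed.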

#### 3.2. CNFET-based Write Control Circuit

An STT switching approach requires bidirectional current for changing the MTJ states. Herein, the utilization of CNFETs for providing this current is investigated. The required bidirectional current is supplied by two CNFET-based transmission gates (TGs) connected to the MTJs' terminals, as shown in Fig. 4. TGs are characterized by their near-optimal full-swing switching behaviour at an acceptable area cost, which results in high-speed switching [24]. In total, an *n*-input LUT requires $2^{(n+1)} + 2$ CNFETs to change the state of $2^n$ MTJs.

**Fig. 4.** Bit lines BL and BL' carry complementary input voltages, providing the bidirectional current required for MTJ switching. WL and WL' are the word lines, and WE and WE' are the write enables controlling the operation of the transmission gate. The magnetic orientation of the MTJ is switched when the current flowing through the MTJ is greater than the critical current Ic.
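The write behaviour can be summarized with a small behavioural sketch in Python. It is a simplification under stated assumptions: the write path is modelled as one lumped resistance `r_path`, a single critical current `i_critical` is used for both switching directions (real devices generally differ between P to AP and AP to P), and switching is treated as instantaneous. The polarity convention (driving BL writes P, driving BL' writes AP) follows the description of the write simulations in Section 4. All names and numerical values are hypothetical.

```python
def write_cnfet_count(n_inputs: int) -> int:
    """CNFETs in the write circuit of an n-input LUT, per the expression above."""
    return 2 ** (n_inputs + 1) + 2


def write_mtj(state, bl, bl_bar, v_write, r_path, i_critical):
    """Return the MTJ state ('P' or 'AP') after one write attempt.

    bl, bl_bar : complementary bit-line levels (0 or 1)
    v_write    : voltage applied across the write path (V)
    r_path     : lumped resistance of transmission gates plus MTJ (ohm), assumed
    i_critical : critical switching current (A), assumed equal in both directions
    """
    if state not in ("P", "AP"):
        raise ValueError("state must be 'P' or 'AP'")
    if bl == bl_bar:
        return state                      # no net current through the MTJ
    current = v_write / r_path
    if current <= i_critical:
        return state                      # below Ic: the state is retained
    return "P" if bl else "AP"            # direction of the current sets the state


if __name__ == "__main__":
    print(write_cnfet_count(4))
    print(write_mtj("AP", bl=1, bl_bar=0, v_write=1.1, r_path=5e3, i_critical=100e-6))
    print(write_mtj("P", bl=0, bl_bar=1, v_write=0.7, r_path=20e3, i_critical=100e-6))
```

The last call illustrates the failure mode in which a weak driver at reduced supply voltage cannot deliver a current above Ic, so the stored state is left unchanged.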

## 4. Experimental Results

In this Section, comparison results and detailed analyses are provided using: (a) 45nm CMOS technology library obtained from Arizona State University [25, 26], (b) 32nm CNFET model provided by Stanford University [27], and (c) the MTJ compact model developed by Purdue University [28]. The proposed 4-input LUT is simulated using SPICE circuit simulation with voltages ranging from 1.1V to 0.7V. The obtained comparison results are extracted utilizing the experimental parameters provided in Table 2.

The read circuit is verified through exhaustive implementation of Boolean logic functions of 4 inputs. Fig. 5 exhibits the transient behaviour of the proposed LUTs, where v(Q CNFET) is the output of the CNFET-based read circuit while v(Q CMOS) is the output of the CMOS-based design. Moreover, v(CLK) denotes the input clock, which runs at 1 GHz. Fig. 2 shows the schematic of the proposed CM-LUT implementing a 4-input AND function, in which P and AP configurations represent "0" and "1", respectively.

| Device | Parameter                            | Value                      |
|--------|--------------------------------------|----------------------------|
| MTJ    | Boltzmann Constant ($K_B$)           | 1.38×10<sup>-23</sup> J/K  |
|        | Gilbert Damping factor ($\alpha$)    | 0.028                      |
|        | Width (W)                            | 25 nm                      |
|        | Length (L)                           | $\pi \times 25 \ nm$       |
|        | Height of free layer ($T_{sl}$)      | 1.4 nm                     |
| CNFET  | Number of tubes (n)                  | 5                          |
|        | Interconnect Capacitance             | 0.22 fF/µm                 |
|        | Work Function: CNT ($\phi_{CNT}$)    | 4.5 eV                     |
|        | Work Function: contact ($\phi_M$)    | 4.5 eV                     |
|        | G/S/D Length CNT (Lg)                | 32 nm                      |
|        | Oxide Thickness ($T_{OX}$)           | 4 nm                       |
|        | Gate Dielectric ($K_{OX}$)           | 16                         |

Table 2 Fixed simulation parameters for different technologies

In the first clock cycle, the input signals A, B, C, and D are set to (0, 0, 0, 0), which selects MTJ<sub>0</sub> and results in the output Q being pulled down to ground, as shown in Fig. 5. In the second clock cycle, the input is changed to (1, 1, 1, 1); thus MTJ<sub>15</sub>, which is in the AP configuration, is selected and the output is charged to VDD. Table 3 compares the CNFET- and CMOS-based designs at nominal and near-threshold voltages, along with their PDP values. It can be observed from Table 3 that the CM-LUT reads the state of the MTJ 3- to 8-fold faster than the M-LUT as the operating voltage ranges from nominal to low voltage, as indicated by the steeper curve of v(Q CNFET) compared with v(Q CMOS) in Fig. 5. The power consumption is 4.5- to 4.2-fold lower, favoring CNFET-based designs at both operating voltages.

Fig. 5. Transient Analysis of CM-LUT to verify the circuit design.

The transient analyses verifying the CNFET- and CMOS-based write circuits are shown in Fig. 6 in terms of the magnetization direction (Mdir). The Mdir signal indicates the magnetization orientation of the free layer, which is either "$\pi$" or "0" radians, indicating the AP or P state, respectively. When a positive voltage is applied to the BL signal, the MTJ is switched to the P configuration and consequently Mdir drops to "0" radians. Conversely, when a positive voltage is applied to BLbar, the state of the MTJ changes to AP ($\pi$ rad), as illustrated in Fig. 6. Table 4 lists the switching characteristics for each MTJ state when the circuit is operating at nominal and near-threshold voltages. The CNFET-based MTJ write circuit can reduce the write time by roughly 40% compared to a CMOS-based write circuit at the expense of increased power consumption, due to the relatively large driving current delivered by CNFETs. Since the current produced by the CMOS-based write circuit operating at near-threshold voltage is lower than the critical current required for the MTJ transition from P to AP, switching cannot be achieved. Thus, the obtained results exhibit the improvement of the proposed CM-LUT hybrid design. A reconfiguration PDP improvement of 9.3% at nominal voltage and an average read PDP improvement of 95% across nominal and near-threshold voltages are achieved relative to the M-LUT.



Fig. 6. Performance comparison between CNFET and CMOS based write circuit.

- a Mdir signal switching from AP to P at nominal voltage (1.1V)
- b Mdir signal switching from P to AP at nominal voltage (1.1V)
- c Mdir signal switching from AP to P at near-threshold voltage (0.7V)
- d Mdir signal switching from P to AP at near-threshold voltage (0.7V)

Table 3 Transient Analysis of read circuit

| Metric                     | MTJ state | M-LUT, nominal (1.1 V) | CM-LUT, nominal (1.1 V) | M-LUT, low voltage (0.7 V) | CM-LUT, low voltage (0.7 V) |
|----------------------------|-----------|------------------------|-------------------------|----------------------------|-----------------------------|
| Power (µW)                 | P         | 7.37                   | 1.58                    | 2.44                       | 0.56                        |
|                            | AP        | 6.93                   | 1.55                    | 2.19                       | 0.52                        |
| Delay (ps)                 | P         | 116                    | 38                      | 394                        | 41                          |
|                            | AP        | 83                     | 16                      | 171                        | 22                          |
| Average PDP (µW·ps)        |           | 711                    | 42                      | 660                        | 17                          |
| PDP improvement over M-LUT |           | -                      | 94%                     | -                          | 97%                         |

Table 4 Transient Analysis of write circuit

| Metric                     | Transition         | M-LUT, nominal (1.1 V) | CM-LUT, nominal (1.1 V) | M-LUT, low voltage (0.7 V) | CM-LUT, low voltage (0.7 V) |
|----------------------------|--------------------|------------------------|-------------------------|----------------------------|-----------------------------|
| Power (µW)                 | $P \rightarrow AP$ | 77.99                  | 123.64                  | 16.373                     | 47.36                       |
|                            | $AP \rightarrow P$ | 77.20                  | 117.64                  | 16.39                      | 42.42                       |
| Delay (ns)                 | $P \rightarrow AP$ | 3.90                   | 2.15                    | NA\*                       | 4.16                        |
|                            | $AP \rightarrow P$ | 2.70                   | 1.70                    | 15.06                      | 3.20                        |
| Average PDP (µW·ns)        |                    | 256.06                 | 232.23                  | NA                         | 165.12                      |
| PDP improvement over M-LUT |                    | -                      | 9.3%                    | -                          | NA                          |

\* MTJ state does not switch due to the low write current
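The PDP figures can be re-derived from the tabulated power and delay values with a short Python sketch. The way the average is formed is an inference from the numbers rather than something stated in the text: most average PDP entries in Tables 3 and 4 match, to within rounding, the product of the mean power and the mean delay over the two MTJ states. The function names are hypothetical.

```python
def average_pdp(powers, delays):
    """Mean power times mean delay over the two MTJ states (inferred definition)."""
    mean_power = sum(powers) / len(powers)
    mean_delay = sum(delays) / len(delays)
    return mean_power * mean_delay


def improvement_percent(baseline, proposed):
    """Relative PDP reduction of the proposed design with respect to the baseline."""
    return 100.0 * (1.0 - proposed / baseline)


if __name__ == "__main__":
    # Read circuit at nominal voltage (Table 3): power in uW, delay in ps.
    read_mlut = average_pdp([7.37, 6.93], [116, 83])
    read_cmlut = average_pdp([1.58, 1.55], [38, 16])
    # Write circuit at nominal voltage (Table 4): power in uW, delay in ns.
    write_mlut = average_pdp([77.99, 77.20], [3.90, 2.70])
    write_cmlut = average_pdp([123.64, 117.64], [2.15, 1.70])

    print(f"read PDP:  {read_mlut:.0f} vs {read_cmlut:.0f} uW*ps, "
          f"improvement {improvement_percent(read_mlut, read_cmlut):.0f}%")
    print(f"write PDP: {write_mlut:.2f} vs {write_cmlut:.2f} uW*ns, "
          f"improvement {improvement_percent(write_mlut, write_cmlut):.1f}%")
```

Averaging the per-state products instead (power times delay for P and for AP, then the mean) gives slightly different numbers, so the choice of definition matters when comparing against other published PDP values.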

## 5. Conclusion

In this paper, we have developed a reconfigurable LUT employing technologies chosen by considering the characteristics of non-volatile storage cells as well as energy consumption demands. The CM-LUT designed herein employs MTJs as storage elements to achieve near-zero standby and leakage power. To deal with the demands of active power consumption in the LUT select tree, CNFETs provide a favourable option relative to highly-scaled CMOS while eliminating leakage power. Furthermore, a high-performance CNFET-based sense amplifier and MUX select tree are designed, which significantly reduce the energy consumption of read operations. In 32nm CNFET technology, the simulation results indicated a 4.5-fold reduction in power consumption during read operations at nominal voltage and a 4.2-fold decrease at near-threshold voltage, which incurs an acceptable delay increase. Additionally, favourable energy consumption during write operations can be achieved using CNFET-based TGs to drive the MTJ-based storage cells. Results exhibit a 9.3% improvement in PDP compared to that of the CMOS-based write circuits at comparable technology nodes. Given the prevalence of dramatically larger reconfigurable fabrics, spintronic storage elements can contribute significant standby energy benefits. However, the write current requirements for their switching at acceptable delays may be onerous for scaled-CMOS devices. Thus, in hybrid CM-LUTs, the use of CNFETs offers a promising solution.

## 6. Acknowledgment

The authors would like to thank PavanSuta Hosaagrahara Dakshinamurthy for helping with simulation of the LUTs and Sindhu Muttineni for proofreading the document.

## 7. References

#### 7.1. Journal Articles

[2] W. Zhao, E. Deng, J.-O. Klein, *et al.*: 'A radiation hardened hybrid spintronic/CMOS nonvolatile unit using magnetic tunnel junctions', Journal of Physics D: Applied Physics, vol. 47, p. 405003, 2014.

[8] W. Zhao, E. Belhaire, C. Chappert, *et al.*: 'New non-volatile logic based on spin-MTJ', physica status solidi (a), vol. 205, pp. 1373-1377, 2008.

[12] Behin-Aein, Behtash, Jian-Ping Wang, *et al.*: 'Computing with spins and magnets.' MRS Bulletin 39.08 (2014): 696-702.

[13] A. K. Dwivedi, A. Islam: 'Design of magnetic tunnel junction-based tunable spin torque oscillator at nanoscale regime', IET Circuits, Devices & Systems, 2015.

[14] B. Liu, L. Cai, J. Zhu, Q. Kang, *et al.:* 'On-chip readout circuit for nanomagnetic logic', Circuits, Devices & Systems, IET, vol. 8, pp. 65-72, 2014.

[16] Slonczewski, John C.: 'Current-driven excitation of magnetic multilayers', Journal of Magnetism and Magnetic Materials 159.1 (1996): L1-L7.

[17] J. Sun.: 'Spin-current interaction with a monodomain magnetic body: A model study', Physical Review B, vol. 62, p. 570, 2000.

[18] J. Xiao, A. Zangwill, M. Stiles.: 'Macrospin models of spin transfer dynamics', Physical Review B, vol. 72, p. 014446, 2005.

[22] M. Jasemi, R. Faghih Mirzaee, K. Navi, *et al.*: 'Voltage mirror circuit by carbon nanotube field effect transistors for mirroring dynamic random access memories in multiple-valued logic and fuzzy logic', Circuits, Devices & Systems, IET, vol. 9, pp. 343-352, 2015.

[23] M. Radosavljević, J. Lefebvre, A. Johnson.: 'High-field electrical transport and breakdown in bundles of single-wall carbon nanotubes', Physical Review B, vol. 64, p. 241307, 2001.

[26] W. Zhao, Y. Cao.: 'New generation of predictive technology model for sub-45 nm early design exploration', Electron Devices, IEEE Transactions on, vol. 53, pp. 2816-2823, 2006.

#### 7.2. Transactions

[3] J. Deng, H. P. Wong.: 'A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application—Part I: Model of the intrinsic channel region', Electron Devices, IEEE Transactions on, vol. 54, pp. 3186-3194, 2007.

[9] W. Zhao, C. Chappert, V. Javerliac, *et al.*: 'High speed, high stability and low power sensing amplifier for MTJ/CMOS hybrid logic circuits', Magnetics, IEEE Transactions on, vol. 45, pp. 3784-3787, 2009.

[10] W. Zhao, D. Ravelosona, J. Klein, *et al.*: 'Domain wall shift register-based reconfigurable logic', Magnetics, IEEE Transactions on, vol. 47, pp. 2966-2969, 2011.

[15] Zhang, Yue, et al.: 'Compact modeling of perpendicular-anisotropy CoFeB/MgO magnetic tunnel junctions' Electron Devices, IEEE Transactions on 59.3 (2012): 819-826.

[24] R. Zand, A. Roohi, S. Salehi, *et al.*: 'Scalable Adaptive Spintronic Reconfigurable Logic using Area-Matched MTJ Design', IEEE Transactions on Circuits and Systems II, vol. in-press, Jan 2016.

#### 7.3. Conference Paper

[1] Y. Zhou, S. Thekkel, S. Bhunia.: 'Low power FPGA design using hybrid CMOS-NEMS approach', Proceedings of the 2007 international symposium on Low power electronics and design, pp 14-19, 2007.

[4] T. N. Kumar, H. A. Almurib, F. Lombardi.: 'A novel design of a memristor-based look-up table (LUT) for FPGA', in Circuits and Systems (APCCAS), 2014 IEEE Asia Pacific Conference on, 2014, pp. 703-706.

[5] C.-Y. Wen, J. Li, S. Kim, M. Breitwisch, *et al.:* 'A non-volatile look-up table design using PCM (phase-change memory) cells', in VLSI Circuits (VLSIC), 2011 Symposium on, 2011, pp. 302-303.

[6] K.-S. Han, D.-I. Jeon, K.-S. Chung.: 'Ultra low power and high speed FPGA design with CNFET', in Communications and Information Technologies (ISCIT), 2012 International Symposium on, 2012, pp. 828-833.

[7] Y. Zhou, S. Thekkel, S. Bhunia.: 'Low power FPGA design using hybrid CMOS-NEMS approach', in Proceedings of the 2007 international symposium on Low power electronics and design, 2007, pp. 14-19.

[11] W. Zhao, N. Ben Romdhane, Y. Zhang, *et al.:* 'Racetrack memory based reconfigurable computing', in Faible Tension Faible Consommation (FTFC), 2013 IEEE, 2013, pp. 1-4.

[20] M. Ouyang, J.-L. Huang, C. L. Cheung, *et al.:* 'Energy gaps in "metallic" single-walled carbon nanotubes', Science, vol. 292, pp. 702-705, 2001.

[21] M. Dresselhaus, G. Dresselhaus, R. Saito.: 'Carbon fibers based on C 60 and their symmetry', Physical Review B, vol. 45, p. 6234, 1992.

#### 7.4. Thesis

[19] R. Ashraf.: 'Robust Circuit & Architecture Design in the Nanoscale Regime', PhD thesis, Portland State University, 2011

#### 7.5. Websites

[25] 'ASU. Predictive Technology Model', Available: http://ptm.asu.edu/.

[27] 'Stanford University CNFET Model', Available: https://nano.stanford.edu/stanford-cnfet-model

[28] X. Fong, S. H. Choday, P. Georgios, *et al.*: '*Purdue Nanoelectronics Research Laboratory Magnetic Tunnel Junction Model*', Available: https://nanohub.org/publications/16/1, 2014.

#### 7.6. Patents

[29] Y. D. Chih, C. Y. Huang, C. J. Lin, K. C. Lin, and H. C. Yu: 'Adjusting reference resistances in determining MRAM resistance states.' U.S. Patent 8,902,641, issued December 2, 2014.

[30] X. Li, X. Zhu, and W. Hao: 'Fast MTJ Switching Write Circuit For MRAM Array.' U.S. Patent Application 13/193,689, filed January 31, 2013.