# Estimating the Energy Consumption of a System with Novel Non-Volatile Memory

Andrew Obeso-Silva

Department of Electrical and Computer Engineering, University of Central Florida, Orlando, FL 32816-2362

Abstract—This document estimates the energy usage of the existing DSH-MRAM, SHE-iMTJ, SHE-pMTJ, and RRAM non-volatile memory designs by simulating a MIPS system running a simple program. The simulated system runs an optimized token search program once with a long string as its primary input, and the dynamic instruction counts are used to estimate the program's energy usage. The program reads four characters from memory at a time, which saves time and energy. The results show that the DSH-MRAM bit-cell is the best of the four designs, with the lowest energy consumption and write delay and the lowest write voltage (shared with the SHE-iMTJ and SHE-pMTJ designs).

Keywords—memory bit-cell, non-volatile memory, SHE-MTJ, SRAM, token search

### I. INTRODUCTION

The goal of this project is to estimate the energy consumption of several novel non-volatile memory bit-cells by simulating a MIPS machine that uses such memory while running a simple program. The program is written in assembly language with as few instructions as possible, and with very few memory access instructions, to conserve time and energy.

The program has two inputs: a string of text, and a query word to search for in that text. The text has a maximum size of 1023 characters, and the query has a maximum size of 10 characters. The program outputs the number of times the query was found in the text and the word indices at which each occurrence of the query was found.

#### A. Project Design

The program first receives both inputs, converts any newline character in the query into a null character, and sets the current word index to 1. To minimize the number of memory operations, the program reads 4 characters at a time: each string has a register that holds 4 of its characters, the next character is taken from that register, and the next 4 characters are loaded only when the character index for that string is divisible by 4. The least significant byte of each such register is copied to another register as the character for comparison. If the text character is null, the program produces its output and ends. If the query character is null, the program stores the word index in the array of indices, increments the index into that array by one, sets the query character index to 0, reads the first 4 characters of the query into a register, and gets the first character from that register.
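The word-at-a-time read described above can be sketched in Python as follows. This is a minimal illustration, not the author's assembly: the function names `pack_words` and `read_text` are hypothetical, and a little-endian byte order is assumed so that the first character of each group sits in the least significant byte.

```python
def pack_words(text: str) -> list[int]:
    """Pack ASCII text into 32-bit words (little-endian assumed), null-terminated."""
    data = text.encode("ascii") + b"\x00"      # null terminator
    data += b"\x00" * (-len(data) % 4)         # pad to a multiple of 4 bytes
    return [int.from_bytes(data[i:i + 4], "little") for i in range(0, len(data), 4)]


def read_text(words: list[int]) -> tuple[str, int]:
    """Walk the string one character at a time, loading a word every 4th character.

    Returns the recovered text and the number of simulated memory loads.
    """
    chars = []
    loads = 0
    index = 0   # character index into the string
    reg = 0     # register holding up to 4 characters
    while True:
        if index % 4 == 0:            # only touch memory on every 4th character
            reg = words[index // 4]
            loads += 1
        ch = reg & 0xFF               # least significant byte is the current character
        if ch == 0:                   # null terminator: stop
            return "".join(chars), loads
        chars.append(chr(ch))
        reg >>= 8                     # shift the next character into the low byte
        index += 1


words = pack_words("the cat")
print(read_text(words))   # ('the cat', 2)
```

For a 7-character string, the sketch performs 2 loads instead of the 8 that a byte-at-a-time loop (including the terminator) would need.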

Both characters are then converted to lowercase by ORing each of them with 0x20. This maps uppercase letters to lowercase and leaves lowercase letters, digits, and punctuation in the range 0x20 to 0x3F unchanged, because bit 5 is already set in those codes. If the text and query characters match, the character indices of both strings are incremented by 1 and both 4-character registers are shifted right by 8 bits. Otherwise, the text character index is incremented and the text register is shifted right by 8 bits, while the query character index is reset to 0. Finally, if the text character is a space, the word index is incremented by 1.
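The control flow of the search can be sketched at a higher level in Python. This is a hedged reading of the description above, not the original MIPS code: `token_search` is a hypothetical name, the 4-character registers are abstracted away, and the order of the checks (end of text first, then end of query) is taken literally from the text.

```python
def token_search(text: str, query: str) -> tuple[int, list[int]]:
    """Return (number of matches, word indices of the matches)."""
    query = query.split("\n", 1)[0]   # a newline in the query acts as its terminator
    if not query:
        raise ValueError("query must not be empty")

    word_index = 1      # words are numbered from 1
    q = 0               # query character index
    hits = []           # array of word indices
    for ch in text:     # the loop ends where the program would read the text's null
        if q == len(query):           # query terminator reached: full match
            hits.append(word_index)
            q = 0                     # restart from the first query character
        # Case-insensitive comparison by ORing both characters with 0x20.
        if (ord(ch) | 0x20) == (ord(query[q]) | 0x20):
            q += 1
        else:
            q = 0
        if ch == " ":                 # a space starts the next word
            word_index += 1
    return len(hits), hits


print(token_search("The cat saw the dog there now", "the"))   # (3, [1, 4, 6])
```

Two consequences of this literal reading are worth noting. The search matches the query anywhere inside a word, so "there" counts as a hit for "the". A match that ends exactly at the end of the text is not recorded, because the end-of-text check comes before the end-of-query check; for example, `token_search("see the", "the")` returns `(0, [])` in this sketch.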

#### B. Test Cases

Three strings of different sizes (short, medium, and long), each containing the same word multiple times, are used to test this program. In each test the program is tasked with searching for the word "the". The input strings and correct outputs are shown in Figure 2. The longest one is: "UCF, its athletic program, and the university's alumni and sports fans are sometimes jointly referred to as the UCF Nation, and are represented by the mascot Knightro. The Knight was chosen as the university mascot in 1970 by student election. The Knights of Pegasus was a submission put forth by students, staff, and faculty, who wished to replace UCF's original mascot, the Citronaut, which was a mix between an orange and an astronaut. The Knights were also chosen over Vincent the Vulture, which was a popular unofficial mascot among students at the time. In 1994, Knightro debuted as the Knights official athletic mascot."

### II. MEMORY BIT-CELLS

The memory bit-cells discussed in this document combine static random-access memory (SRAM) with unique forms of non-volatile memory. An SRAM cell contains two NOT gates connected to each other in a loop, and before each NOT gate is a wire that can be used to set that half of the circuit high or low. Those two wires are called bit lines (BL, $\overline{BL}$), and they are always in opposite states. BL and $\overline{BL}$ are used together to read or write one bit in the pair of NOT gates. The word line (WL) controls the connection of the bit lines to an external circuit. When WL is enabled, BL and $\overline{BL}$ are each connected to an external wire. To read a bit from the SRAM, WL is enabled and the voltages on BL and $\overline{BL}$ are read. To write a bit to the SRAM, WL is enabled, BL is forced to the desired voltage, and $\overline{BL}$ is forced to the opposite state.
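The read and write behavior described above can be captured in a small logic-level model. This is only a sketch: the class name `SramCell` and its methods are hypothetical, voltages are reduced to 0 and 1, and all analog behavior (sense amplifiers, precharge, timing) is ignored.

```python
class SramCell:
    """Logic-level model of one SRAM bit-cell (two cross-coupled NOT gates)."""

    def __init__(self, bit: int = 0):
        self.q = bit                   # state of one half of the inverter pair

    @property
    def q_bar(self) -> int:
        return 1 - self.q              # the other half always holds the opposite state

    def write(self, wl: int, bl: int) -> None:
        """Force BL to `bl` and BL-bar to its complement while WL is enabled."""
        if wl:                         # with WL disabled the cell is isolated
            self.q = bl

    def read(self, wl: int):
        """Return (BL, BL-bar) when WL is enabled, otherwise None."""
        if not wl:
            return None
        return self.q, self.q_bar


cell = SramCell()
cell.write(wl=1, bl=1)
print(cell.read(wl=1))   # (1, 0)
cell.write(wl=0, bl=0)   # ignored: word line is disabled
print(cell.read(wl=1))   # (1, 0)
```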

The non-volatile part of the memory bit-cell designs in [1], [2], and [3] reduces stand-by energy consumption by keeping the stored data while the system power is off. The designs proposed in [1] and [2] are based on the Spin-Hall Effect Magnetic Tunnel Junction (SHE-MTJ) device. The SHE magnetic RAMs (SHE-MRAM) in [1] and [2] use 1.2 V for writing, but the DSH-MRAM from [1] has a write delay of 1 ns, while the SHE-iMTJ and SHE-pMTJ from [2] have delays of 2 ns and 2.5 ns, respectively. The memory bit-cell proposed in [3], called the Rnv8T cell, uses RRAM instead, which stores data in memristors. It uses 1.8 V and -1.6 V for writing, with a delay of 10 to 80 ns. The write energy consumption of each non-volatile memory bit-cell is listed in Table I.

| Design         | Write Energy per Bit-Cell |
|----------------|---------------------------|
| DSH-MRAM [1]   | 121.51 fJ                 |
| SHE-iMTJ [2]   | 189.7 fJ                  |
| SHE-pMTJ [2]   | 252.4 fJ                  |
| Rnv8T [3]      | 836.2 fJ                  |

Table I: Energy consumption for a single bit-cell write operation in the designs provided in [1-3].

### III. RESULTS AND DISCUSSION

To determine the energy consumption of the program described in Section I using only the non-volatile component of each memory bit-cell listed in Table I, the program is executed in a simulator and the dynamic instruction counts are measured for five types of instructions: ALU, branch, jump, memory, and other. Each ALU instruction is assumed to use 1 fJ of energy, each branch instruction 3 fJ, each jump instruction 2 fJ, each memory instruction 1 fJ plus the write energy of the memory bit-cell design, and every other instruction 5 fJ. The simulation uses the longest string given in Section I as the first input and the word "the" as the second input.
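The energy model in the preceding paragraph amounts to a weighted sum of the instruction counts. A minimal sketch follows; the function name `program_energy_fj` is hypothetical, and the instruction counts in the example are made-up placeholders, not the counts measured in the simulation.

```python
# Per-instruction energy in fJ, as assumed in the text.
COST_FJ = {"alu": 1.0, "branch": 3.0, "jump": 2.0, "memory": 1.0, "other": 5.0}


def program_energy_fj(counts: dict[str, int], write_energy_fj: float) -> float:
    """Total energy in fJ for the given dynamic instruction counts.

    Each memory instruction costs 1 fJ plus the bit-cell write energy.
    """
    total = sum(COST_FJ[kind] * n for kind, n in counts.items())
    total += counts.get("memory", 0) * write_energy_fj
    return total


# Placeholder counts for illustration only (NOT the measured values).
example_counts = {"alu": 10, "branch": 2, "jump": 1, "memory": 3, "other": 1}
print(round(program_energy_fj(example_counts, 121.51), 2))   # 390.53
```

With these placeholder counts and the 121.51 fJ write energy of the design in [1], the total is 10 + 6 + 2 + 3 × (1 + 121.51) + 5 = 390.53 fJ.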

The results of the energy calculations for each memory bit-cell are shown in Table II. The DSH-MRAM bit-cell from [1] is the most energy-efficient design, with 110.4319 pJ of total energy usage, while the RRAM-based Rnv8T bit-cell from [3] is the least energy-efficient, with 641.4466 pJ.
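The text does not report the dynamic instruction counts themselves, but the totals can be checked for internal consistency. Under the stated energy model, only the memory-instruction term differs between designs, so the difference between two totals divided by the difference between the two write energies should give the number of memory instructions. The sketch below performs that check on the published values; the resulting count and baseline are inferred here and are not figures stated by the author.

```python
# Values copied from Table I (fJ) and Table II (pJ).
WRITE_FJ = {"DSH-MRAM [1]": 121.51, "SHE-iMTJ [2]": 189.7,
            "SHE-pMTJ [2]": 252.4, "Rnv8T [3]": 836.2}
TOTAL_PJ = {"DSH-MRAM [1]": 110.4319, "SHE-iMTJ [2]": 161.0971,
            "SHE-pMTJ [2]": 207.6832, "Rnv8T [3]": 641.4466}


def implied_memory_count(a: str, b: str) -> float:
    """Memory-instruction count implied by the totals of designs a and b."""
    delta_total_fj = (TOTAL_PJ[b] - TOTAL_PJ[a]) * 1000.0
    delta_write_fj = WRITE_FJ[b] - WRITE_FJ[a]
    return delta_total_fj / delta_write_fj


names = list(WRITE_FJ)
counts = [implied_memory_count(a, b) for a, b in zip(names, names[1:])]
n_mem = round(counts[0])

# Energy that does not depend on the bit-cell design (includes 1 fJ per memory instruction).
baselines = [TOTAL_PJ[n] * 1000.0 - n_mem * WRITE_FJ[n] for n in names]

print([round(c, 2) for c in counts])
print(n_mem, [round(b, 1) for b in baselines])
```

Every pair of designs implies the same count of about 743 memory instructions, and every design leaves the same design-independent remainder of about 20150 fJ, so the four totals are mutually consistent with the model.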

| Design       | Total Energy Consumption |
|--------------|--------------------------|
| DSH-MRAM [1] | 110.4319 pJ              |
| SHE-iMTJ [2] | 161.0971 pJ              |
| SHE-pMTJ [2] | 207.6832 pJ              |
| Rnv8T [3]    | 641.4466 pJ              |

Table II: Total energy consumption for the assembly program using the designs provided in [1-3].

### IV. CONCLUSION

The memory bit-cell from [1] is the best of the compared candidates for non-volatile storage: it requires the least energy to write, its 1.2 V write voltage is the lowest (shared with the designs in [2]), and its 1 ns write delay is the shortest. The design from [3] is the worst, with the highest write voltages (1.8 V and -1.6 V), the highest write energy, and the longest write delay (10 to 80 ns). SRAM, non-volatile storage, SHE-MTJ, energy efficiency, and token searching are the topics discussed in this document. The memory bit-cell designs with SHE-MTJs are the fastest and most energy-efficient designs discussed here, and further research should be done with SHE-MTJ technology to achieve better performance.

### REFERENCES

- [1] S. Salehi and R. F. DeMara, "BGIM: Bit-Grained Instant-on Memory Cell for Sleep Power Critical Mobile Applications," 2018 IEEE 36th International Conference on Computer Design (ICCD), Orlando, FL, USA, 2018, pp. 342-345.
- [2] W. Kang, W. Lv, Y. Zhang and W. Zhao, "Low Store Power High-Speed High-Density Nonvolatile SRAM Design With Spin Hall Effect-Driven Magnetic Tunnel Junctions," in IEEE Transactions on Nanotechnology, vol. 16, no. 1, pp. 148-154, Jan. 2017.
- [3] P. Chiu et al., "Low Store Energy, Low VDDmin, 8T2R Nonvolatile Latch and SRAM With Vertical-Stacked Resistive Memory (Memristor) Devices for Low Power Mobile Applications," in IEEE Journal of Solid-State Circuits, vol. 47, no. 6, pp. 1483-1496, June 2012.