# Methodology For Improving Performance & Reliability In Low Voltage On-Chip Memories

HAMED MOVAHEDI

MASTER'S THESIS

DEPARTMENT OF ELECTRICAL AND INFORMATION TECHNOLOGY | FACULTY OF ENGINEERING | LTH | LUND UNIVERSITY

Hamed Movahedi, hamed.movahedi@xenergic.com

Department of Electrical and Information Technology, Lund University

Supervisors: Henrik Sjöland (Henrik.sjoland@eit.lth.se) and Babak Mohammadi (babak.mohammadi@xenergic.com)

Examiner: Pietro Andreani (pietro.andreani@eit.lth.se)

September 11, 2019

### **Abstract**

Recent surveys show that, on average, about 70% of the area budget of a system on chip (SoC)<sup>1</sup> is occupied by Static Random Access Memory (SRAM), with capacities ranging from a few kilobits to tens of megabits. SRAM (static RAM) is a type of electronic memory that uses bistable latching circuitry (a flip-flop) to store each bit. SRAM is volatile in the conventional sense that data is eventually lost when the memory is not powered. The term static differentiates SRAM from DRAM (dynamic random-access memory), which must be periodically refreshed. SRAM is faster and more expensive than DRAM; it is typically used for CPU cache, while DRAM is used for a computer's main memory.

One critical issue in modern SoCs that use such a large amount of on-chip memory is area efficiency. The other vital factor is power consumption. A desirable approach in memory design is to reduce power consumption by lowering the supply voltage while paying as little as possible in silicon area overhead. Low-power memory design is a challenging field in which designers need to push energy and area efficiency to their limits. The main goal of this thesis is to investigate approaches and techniques for maintaining the functionality of low-power SRAM, in terms of read/write (R/W) reliability, under low-voltage operation.
One useful technique is to temporarily boost the voltage locally at selected nodes in the circuit without raising the operational SRAM supply voltage. A specially designed in-memory charge pump would be beneficial for this, provided that it exploits the large intrinsic parasitic capacitances already present in the memory, so that the area overhead stays minimal. Another limitation in low-voltage SRAMs is the sense amplifier. These circuit elements generally work well at the nominal SRAM voltages verified for a given technology node. However, special considerations are required when the intention is to reduce the supply voltage. This calls for a fresh look at sense amplifier implementations and for a more suitable approach to these elements of SRAM memories at low voltages. It might also require introducing new circuit elements into the memory design. For example, if single-ended sense amplifiers prove beneficial under certain conditions, circuits may be needed to provide the reference voltages they require. Consequently, the following key steps are used as milestones or goals for the thesis:

- Investigating the possibility of using in-memory charge pump circuitry to provide local voltage boosting at selected memory nodes, with the intention of maintaining acceptable R/W reliability in low-power memories. Care must be taken to avoid introducing unnecessary area overhead to the memory layout.
- Investigating and proposing sense amplifier circuitry that is more suitable for low-voltage SRAM memories.
- Implementing a voltage reference generator to be used in sense amplifier diversity schemes that improve detection reliability, e.g. using single-ended sense amplifier circuits in combination with a differential one to enhance memory cell read reliability.

<sup>1</sup>An SoC is an integrated circuit that integrates all components of an electronic system on a substrate.

# Popular Science Summary

Static Random Access Memory (static RAM or SRAM) is a type of RAM that holds data in static form: the data stays available as long as the memory is supplied with power. SRAM is well suited for uses such as the CPU's fast cache memory and register storage, disk caches in hard drives, printers, modems, routers, and digital cameras. SRAM stores one bit of data on four transistors that form two cross-coupled inverters, called a cell. Cells are arranged in a matrix of rows and columns to make a bigger memory. The position of a cell is given by its row and column, which together make up the address of the memory cell. Access to a cell is obtained by decoding the address with the decoder and multiplexer; selecting the associated row and column provides access to the chosen bit-cell for read and write operations. Write means putting data into a specific cell, and read means detecting the content of a certain cell.

The need for more data buffering capacity drives a trend toward higher on-chip memory density, which in turn drives efforts to scale down the bit-cell size [1], made possible by technology improvements. On the other hand, scaling is limited by how far the threshold voltage Vt can be scaled [3], [4]. Size scaling contributes to a higher density of cells, and a lower supply voltage gives lower power consumption, which is the center of attention in IoT applications [2]. Reducing the supply voltage affects the read and write operations [4]. Several techniques have been developed to compensate the read and write operation parameters at lower voltages. One of these techniques is adjusting the voltage of critical nodes to improve the performance of the SRAM [1]. Peripheral circuits facilitate the read and write operations on the bit-cell. During the read operation, the content of the bit-cell is detected by a sense amplifier.
A sense amplifier is a circuit that amplifies a small differential input voltage to a large rail-to-rail voltage difference at the output. A column of cells is connected to the sense amplifier, and during the read operation the sense amplifier reports the content of the cell selected by the addressing procedure. If the sense amplifier fails, the whole column of bit-cells fails to work. Enhancing the sense amplifier function therefore saves a major part of the memory cells. This thesis concentrates on peripheral circuits to improve SRAM read and write operations in low-voltage conditions.

# Contents

- 1.1 Thesis Motivation
- 1.2 Project Specification
- 1.3 Thesis Organization
- 2 Background Research
- 2.1 Memory Structure
- 2.2 Decoder
- 2.3 Charge Pump
- 2.4 Voltage Divider
- 2.5 Assist Techniques
- 2.6 Sense Amplifier
- 3 Circuit Design
- 3.1 Voltage Reducing
- 3.2 Voltage Boosting
- 3.3 Sense Amplifier
- 4 Layout
- 4.1 Word-line Voltage Reducing And Increasing
- 4.2 Sense Amplifier
- 5 System Integration And Simulation
- 5.1 Voltage Reducing
- 5.2 Voltage Boosting
- 5.3 Sense Amplifier
- 6 Conclusion And Future Works
- Bibliography

# List of Figures

| Figure | Caption |
|------|------|
| 2.1 | A SRAM 6T bit-cell schematic |
| 2.2 | SRAM Memory Architecture |
| 2.3 | $3 \times 8$ decoder |
| 2.4 | Function of a $3 \times 8$ decoder |
| 2.5 | Voltage doubler, state 1 |
| 2.6 | Voltage doubler, state 2 |
| 2.7 | Simple voltage doubler with load |
| 2.8 | Voltage divider, state 1 |
| 2.9 | Voltage divider, state 2 |
| 2.10 | RNM versus WL and BL voltage |
| 3.1 | Voltage Reduction System |
| 3.2 | Wordline and decoder |
| 3.3 | Simple model of decoder and WL parasitic capacitances |
| 3.4 | Decoder parasitics charging |
| 3.5 | Schematic of charge pump circuit |
| 3.6 | Block diagram of voltage boosting system |
| 3.7 | StrongArm schematic |
| 3.8 | Dual strongArm topology |
| 3.9 | Strongarm input section phase |
| 3.10 | Gain versus supply voltage for input stage of single strongArm topology. Load is modeled as capacitor |
| 3.11 | Voting topology |
| 4.1 | WL voltage adjusting floor plan |
| 4.2 | Sense amplifier first stage layout |
| 5.1 | Word line voltage reduction signals |
| 5.2 | Output voltage for reducing system in all corners |
| 5.3 | Word-line voltage versus different capacitance load |
| 5.4 | Boosting system main signals |
| 5.5 | Output voltage for boosting system in all corners |
| 5.6 | Top: read operation and signal condition of SA. Bottom: output of SA |
| 5.7 | Error rate versus supply voltage for single strongArm topology. Read time is long enough |
| 5.8 | Error rate versus differential input voltage for single strongArm topology |
| 5.9 | Error rate versus supply voltage for both single and voting topology. Input differential is 20 mV |
| 5.10 | Error rate versus supply voltage for voting topology in case SAETOP is active 200 ps after SAE. Input differential voltage is 20 mV |
| 5.11 | Error rate versus SAETOP delay for supply voltage 0.48 [a.u] |
| 5.12 | Error rate versus supply voltage for voting topology which selects from five inputs. Input differential voltage is 20 mV. FSTP is 500 ps |
| 5.13 | Error rate versus supply voltage for single and voting topology which selects from three and five inputs. Input differential voltage is 20 mV. FSTP is 500 ps |
| 5.14 | Comparing NOE associated with single strongArm and voting topology by increasing the differential pairs size. Supply voltage is 1 and differential voltage is 20 mV |
| 5.15 | Comparison of the error rate versus supply voltage for voting topology which selects from three and five inputs. Input differential voltage is 20 mV. FSTP is 500 ps |

# List of Tables

## 1.1 Thesis Motivation

An outstanding characteristic of modern society is the powerful flow of knowledge and information across different fields of human activity. Data are often called the lifeblood of modern civilization. Digital systems implemented on silicon chips (SoCs) play a key role in processing data and handling its flow. One of the major parts of such digital systems is SRAM, which is used as the CPU's fast cache memory and for register storage. Most of the area and power budget of today's SoCs is consumed by the memory [3]. Among the different kinds of memories, SRAMs are ideal for electronic devices, especially Internet of Things (IoT) appliances, thanks to their higher power efficiency [5]. With the spread of IoT appliances, more effort has gone into improving the main features of SRAM, such as power consumption and silicon area. Power consumption translates directly into the battery life of the device, while silicon area is an important factor in the total cost of the IoT device.

One popular method to reduce the power consumption associated with memories is to lower the supply voltage [6]. As long as the expected performance is met, this method provides a significant power saving. On the downside, voltage scaling affects the reliability of memory operation, which means a significant increase in the chance of a failure during read or write (R/W) operations. One cause of failure is the bit-cell itself [7].
Memory bit-cells are designed to work under the nominal voltage of a certain technology node. However, in reduced-voltage conditions the stable state of a bit-cell might unintentionally change during a simple read operation. In the same way, while writing to some target bit-cells, the state of other bit-cells might unintentionally change. To maintain the functional reliability of memories in reduced-voltage conditions, some form of read/write (R/W) assist technique is commonly used. One popular technique is to boost or reduce the voltage of only the relevant nodes with the help of R/W assist circuitry [8]. Different methods are available to control the voltage of a node, but most of these solutions add area to the memory. Unlike some previous works, the aim here is to propose a solution with the least silicon area overhead.

One of the most effective approaches to boost the voltage of certain nodes is to implement some form of charge pump circuitry [8]. However, a conventional charge pump requires significant area overhead due to the capacitors used in its architecture. To address this issue, note that SRAM arrays are built by connecting hundreds or thousands of small memory cells, each with a small parasitic capacitance. The large number of small parasitic capacitances associated with the bit-cells sums up to a fairly considerable total parasitic capacitance. If these freely available parasitic capacitances can be used to redistribute electric charge to the desired nodes via a charge pump circuit, a more reliable read or write operation can be achieved with minimal area overhead.

The effort to maintain SRAM functional reliability continues with an analysis and study of the Sense Amplifier (SA) circuitry. The sense amplifier is another source of failure in memory operation [9]. It is the module that decides whether the state of a bit-cell is 0 or 1 during the read operation.
The performance and reliability of the SA under reduced supply voltage should therefore be studied. The performance of a sense amplifier is how fast it can detect the memory content. The reliability of a sense amplifier relates to producing the correct output across process variations and ambient conditions. Most of the time, the SA itself can be the bottleneck in reducing the memory operating voltage even further. At near-threshold voltage, SRAM performance is heavily limited by sense amplifier operation, and both the performance and the reliability of the sense amplifier degrade at near-threshold supply voltage [10].

Among the different factors associated with performance, the input offset of the sense amplifier has a considerable impact on the reliability of the SA and on the power wasted in the memory [11]. In other words, to detect the bit-line voltage correctly, the minimum differential voltage on the bit-lines, in the worst case, should be greater than the SA's input-referred offset. A smaller voltage difference on the bit-lines leads to lower power consumption thanks to less bit-line discharging, so a sense amplifier with a lower input-referred offset is desired. At the same time, a lower input differential voltage decreases sense amplifier reliability [11]. Reliability is related to the probability that the sense amplifier fails to detect the correct state of a bit-cell. A large number of bit-cells are connected to one sense amplifier. Therefore, degraded sense amplifier reliability means failed read operations for a large number of bit-cells.

This thesis aims to suggest a circuit or approach that is most suitable for SRAM memories operating at lower than nominal supply voltage. Circuit simulations, analysis, and evaluations are provided to accompany the proposed or studied techniques. Moreover, a novel technique to increase sense amplifier reliability is introduced in this thesis.
## 1.2 Project Specification

As mentioned before, the main purpose of this project is to improve SRAM read and write operations by applying techniques to the existing peripheral circuits with as few changes as possible. Previous studies of SRAM show that the decoder and the sense amplifiers are at the center of attention.

The decoder applies voltage to the word-line, which is a critical node for improving the operation of the SRAM [12]. Boosting or reducing the word-line voltage affects the operating parameters of the SRAM. Implementing some form of charge pumping inside the decoder makes it possible to change the word-line voltage without changing the operational SRAM supply voltage. This technique faces at least two main challenges. The first challenge is keeping the area overhead as small as possible. Existing charge pumps use large capacitances, approximately ten times the load capacitance, to boost the voltage of the output node. A configuration with smaller capacitance should therefore be designed to minimize the area overhead. The other concern is that the operation of the charge pump circuitry must not interfere with the operation of the memory itself. Hence, a control unit should manage the operation of the charge pump in coordination with the decoder. The control circuit should take into account all the related timing issues and the performance of the memory.

Another circuit module that limits the minimum supply voltage of an SRAM memory is the sense amplifier. The sense amplifier parameters should be designed for near-threshold voltage. Improvement methods and techniques will then be used to enhance sense amplifier functionality and parameters. The main challenge regarding the sense amplifier is the limited area. The room for the sense amplifier on chip is restricted by the width of the cells and the length of the memory. For that reason, a sense amplifier with large transistors is not suitable in smaller technologies.
## 1.3 Thesis Organization

The rest of the thesis is organized as follows. Chapter 2 introduces the SRAM blocks and peripheral circuits, an analysis method to evaluate and optimize the performance of integrated charge pumps, and the application of charge pumps in low-voltage circuits. Chapter 3 presents the analysis method and proposes techniques to improve SRAM operation at lower voltage, together with the design calculations. Chapter 4 covers the layout and the silicon techniques used to reach the desired results. Chapter 5 presents the simulation results, discusses them, and compares them with theory. Chapter 6 puts everything together and rounds up the work.

# Background Research

## 2.1 Memory Structure

The smallest data storage unit is the one-bit memory cell, which consists of a simple latch circuit with two stable operating points. Depending on the preserved state of the two-inverter latch circuit, the data held in the memory cell is interpreted either as logic '0' or as logic '1'. Figure 2.1 shows a schematic of a bit-cell.

Figure (2.1) A SRAM 6T bit-cell schematic

Transistors $M_1$ to $M_4$ form two back-to-back inverters (the latch), and $M_5$ and $M_6$ provide access to the data contained in the memory cell via the bit-lines (BL). The gates of the access switches are controlled by the corresponding word-line (WL), as shown in figure 2.1. Arranging bit-cells (BC) next to each other in rows and columns forms a matrix of cells that makes a bigger memory. Figure 2.2 shows a block diagram of the memory block.

Figure (2.2) SRAM Memory Architecture

The position of each bit-cell is determined by its row and column in the matrix. Word-lines are parallel wires that provide access to the bit-cells in a row. Bit-lines are vertical parallel wires connected to the bit-cells and form the columns of the matrix. A multiplexer (MUX) selects the column of the matrix.
Applying an address to the decoder and MUX selects a certain word-line and bit-line, which provides access to a bit-cell for writing and reading. The word-lines are connected to the decoder at one end. The decoder (DEC) activates one of the word-lines based on the input address. The bit-lines are connected to the MUX, and the output of the MUX is connected to the sense amplifiers. As mentioned, the address of each bit-cell is determined by the position of the bit-cell in the memory matrix. This address consists of two parts. The first part selects the row of the matrix: it is applied to the decoder, which selects one of the word-lines accordingly. The second part is used by the MUX to select the column (bit-lines).

SRAM memories are attractive thanks to their low power consumption and high performance. The static power dissipation of a bit-cell is very small; essentially, it is limited by a small leakage current. Another advantage of SRAM is the ability to operate at a lower supply voltage. The major disadvantage of this topology is the larger cell size.

## 2.2 Decoder

The decoder is a logic combination of transistors that provides access to a row of bit-cells according to its input address.

Figure (2.3) $3 \times 8$ decoder

A brief description of the function of the decoder used in this project is depicted in figure 2.4. The high-order address bit selects the first stage and the low-order bits select among the second stages. In this way the 3-bit input address can activate one of 8 outputs and connect that output line to $V_{dd}$.

Figure (2.4) Function of a $3 \times 8$ decoder

## 2.3 Charge Pump

A Charge Pump (CP) is an electronic circuit that converts the supply voltage $V_{dd}$ to an output voltage $V_{out}$ that is higher or lower than $V_{dd}$. Unlike traditional DC-DC converters, which employ inductors, CPs are made only of capacitors and switches, which allows integration on silicon. CPs were originally used in nonvolatile memories.
To see how a CP works, consider the simple circuit consisting of a single capacitor and three switches shown in figure 2.5 and figure 2.6. During phase 1, switches $S_1$ and $S_3$ are closed and the capacitor is charged to the supply voltage $V_{dd}$. In the next phase, switch $S_2$ is closed and the left-side plate of the capacitor assumes a potential of $V_{dd}$, while the capacitor maintains its charge of $V_{dd} \cdot C$ from the previous phase. This means that after phase 1 and phase 2, in the absence of a D.C. load, an output voltage has been generated that is twice the supply voltage.

$$(V_{out} - V_{dd}) \cdot C = V_{dd} \cdot C \tag{2.1}$$

$$V_{out} = 2 \cdot V_{dd} \tag{2.2}$$

To accommodate a load at the output, the circuit is modified by adding an output capacitance, as shown in figure 2.7.

Figure (2.7) Simple voltage doubler with load

For this practical voltage doubler, with pump capacitor $C_s$ and load capacitor $C_L$, the ideal output voltage is given by equation 2.3.

$$V_{out} = 2 \cdot V_{dd} \cdot \frac{C_s}{C_s + C_L} \tag{2.3}$$

## 2.4 Voltage Divider

Figure 2.8 shows two capacitors that share their charge when the switches change from state 1 to state 2. During state 1, switch $S_1$ is closed and $C_1$ charges to $V_1$. When the switch conditions change to state 2, $S_2$ is closed and the two capacitors form a parallel configuration. Since the voltage on the plate of $C_1$ is higher than the voltage on the plate of $C_2$, charge is transferred from $C_1$ to $C_2$ until both plates reach the same voltage. With $C_2$ initially discharged, the final voltage is given by equation 2.4.

$$V_2 = V_1 \cdot \frac{C_1}{C_1 + C_2} \tag{2.4}$$

This method is used to adjust the voltage of the word-line, as discussed in the following chapters.

## 2.5 Assist Techniques

To guarantee the functionality of the bit-cell and the memory, certain techniques are applied to control the voltage level of important nodes such as WL, BL, and BLC. These techniques are known as assist techniques in the literature and in industry.
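As a rough numerical illustration of how a capacitive circuit can raise a node voltage for such an assist, the loaded doubler of equation 2.3 can be evaluated for a few capacitor ratios. This is only a sketch based on ideal charge conservation, reading the right-hand side of equation 2.3 as $2 \cdot V_{dd} \cdot C_s / (C_s + C_L)$ with an initially discharged load. The function name `doubler_output` and all numeric values are hypothetical and are not taken from the thesis design.

```python
def doubler_output(vdd, c_s, c_l):
    """Ideal output of the loaded voltage doubler (equation 2.3).

    Assumes C_s is precharged to vdd, C_L starts at 0 V, and there are
    no switch losses or leakage (charge conservation only).
    """
    return 2.0 * vdd * c_s / (c_s + c_l)


if __name__ == "__main__":
    vdd = 1.0  # supply voltage in arbitrary units (assumed value)
    c_l = 1.0  # load capacitance, normalised to 1 (assumed value)
    for ratio in (1, 3, 10):  # C_s / C_L ratios chosen for illustration
        v_out = doubler_output(vdd, ratio * c_l, c_l)
        print(f"C_s = {ratio:2d} x C_L  ->  V_out = {v_out:.3f} x V_dd")
```

Under these assumptions the output only approaches $2 \cdot V_{dd}$ when $C_s$ is much larger than $C_L$, which is why a conventional pump capacitor is costly in area.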
One of these assist techniques concentrates on the voltage of the word-line. The word-line voltage level is crucial to SRAM functionality. Figure 2.10 shows how the read noise margin (RNM) depends on the word-line voltage<sup>1</sup>. As the word-line voltage increases, RNM decreases by around 40 percent of its initial value.

<sup>1</sup>The data is based on a study done at Xenergic.

Figure (2.10) RNM versus WL and BL voltage.

Figure 2.10 also shows how RNM depends on the bit-line voltage. As the bit-line voltage increases, RNM stays roughly constant. This comparison shows that the voltage level on the word-line is a critical parameter that should be controlled appropriately.

## 2.6 Sense Amplifier

The sense amplifier amplifies the differential voltage at its inputs and regenerates a latched state at the output. The two inputs of the sense amplifier are connected to the bit-line (BL) and its complement (BLC). The output of the sense amplifier is a latch that, after regeneration, holds the content of the memory cell. The sense amplifier has a great effect on the operating parameters of the memory, such as performance, reliability, and power consumption.

The performance of the sense amplifier is tied to the access time needed to generate the output. The access time comprises two parts: first, the time until the input of the SA is ready, and second, the time until the output latch of the SA settles and is ready to be read. The first part of the access time can be reduced by increasing the sensitivity of the sense amplifier, that is, its ability to detect a smaller differential voltage at the input. Better sensitivity clearly results in less bit-line discharging, leading to lower power consumption. The second part of the access time depends on the ability of the sense amplifier to regenerate the output latch. This part is also connected with the power consumption and performance of the sense amplifier. Furthermore, temperature and manufacturing process variation may cause the sense amplifier to fail.
A sense amplifier that gives a reliable output under these different conditions is therefore desired. Since a large number of bit-cells are connected to one sense amplifier, a failing sense amplifier makes a whole group of bit-cells unusable, which means a loss of area.

Many attempts have been made to improve SA performance and reliability. One study compared the current-mode sense amplifier and the voltage-mode sense amplifier in terms of low power and high performance, using statistical models and Monte Carlo simulations [16]. The results of this study help to select a proper configuration for the SA. Another work proposes a process control monitor for the SA offset in recent CMOS technology. The monitoring system provides accurate measurement of the SA offset, and analysis of the extracted data shows the relation of the SA offset voltage to process variations [9]. Another attempt uses body biasing to control the SA offset at low supply voltage. Biasing the body of the transistors with multiple voltages is a technique to control the mismatch of the transistor threshold voltages. Simulation results in the related technology show that the proposed calibration technique can reduce the standard deviation of the offset voltage compared with a conventional SA [18]. Another work implements low-voltage techniques on various kinds of sense amplifiers (voltage-mode, current-mode, and charge-transfer sense amplifiers) in a certain CMOS technology. Some of these techniques are useful for our effort to design a sense amplifier at near-threshold voltage [11].

The sense amplifier offset has also been the subject of many improvement attempts. The offset is mostly the result of transistor mismatch that affects the threshold voltage, so controlling the threshold voltage of the transistors leads to a lower offset [17]. Unfortunately, attempts to reduce the input offset voltage of the SA often lead to an increase in SA area and power consumption or to a prolonged resolution time.
Hence, the trade-off between SA input offset voltage, area, and power-delay is a major design challenge, especially for nanoscale CMOS technology [17].

Creatively combining the current and voltage sensing schemes provides much better performance in terms of both sensing speed and power consumption. The first block senses the current flowing from the bit-line to the bit-cell and transforms it into a voltage. This is followed by a cross-coupled inverter block that turns the small output voltage of the first stage into a rail-to-rail voltage at the output. This design can operate over a wide supply voltage range with minimal performance degradation. Moreover, this method helps to cope with the variation of parasitic capacitances due to the layout and fabrication processes [28].

Improving sense amplifier operation at low voltage would be another favorable achievement. Since the available on-chip area for the sense amplifier is limited, using circuits with many transistors to improve sense amplifier operation might not be a good solution. Consequently, circuits with the minimum number of transistors are preferable. If pre-layout simulation results show that the goal is not met, or that the cost of implementation is too high, a detailed explanation of this outcome will be given instead of proceeding to layout. Furthermore, the reasoning behind any suggestion or justification will be documented for future work or reference. Finally, a report containing the detailed design process will be submitted.

Moreover, as mentioned before, at near-threshold voltage SRAM performance is heavily limited by sense amplifier operation. Sense amplifier performance and reliability degrade at near-threshold supply voltage. The performance of the sense amplifier is how fast it can detect the bit-line voltage. It is related to the speed and power consumption of the memory.
Among the different factors associated with performance, the input offset of the sense amplifier has a considerable impact on the performance (power and speed) of the sense amplifier. In other words, to detect the bit-line voltage correctly, the minimum differential voltage on the bit-lines, in the worst case, should be greater than the SA's input-referred offset. A smaller voltage difference on the bit-lines leads to lower power consumption, so a sense amplifier with a lower input-referred offset is desired.

The other important property of the sense amplifier is reliability. Reliability is related to the probability that the sense amplifier fails to detect the correct state of a bit-cell. A large number of bit-cells are connected to one sense amplifier. Therefore, degraded sense amplifier reliability means failed read operations for a large number of bit-cells. An SA with a lower offset can detect a lower differential voltage on the bit-lines, which prevents wasting energy. Moreover, other power reduction techniques for sense amplifiers, such as threshold control, will be considered. If these techniques appear promising for a low-power memory architecture, more detailed studies will follow.

# Circuit Design

## 3.1 Voltage Reducing

The word-line under-drive method (WLUD) is an assist technique that increases RNM by reducing the voltage level of the word-line. Figure 2.10 shows the RNM variation versus WL voltage. Hence, the goal is to reduce the word-line voltage. Many solutions can be used to reduce a voltage. One solution is a dedicated voltage supply circuit. However, applying a voltage regulator imposes an area overhead that adds to the memory cost. Another solution is to apply techniques to the available components and resources to adjust the voltage of the word-line. Section 2.4 introduced the basic concept of dividing a voltage between two parallel capacitances. Figure 3.1 shows the block diagram of the voltage reduction system.
Figure (3.1) Voltage Reduction System

In this technique, the decoder parasitic capacitance $C_{dec}$ is first charged to $V_{dd}$. During the second phase, $C_{dec}$ is disconnected from $V_{dd}$ and connected to the load, and the charge on $C_{dec}$ is transferred to the word-line parasitics. The output voltage takes a value determined by $C_{dec}$ and $C_{wl}$. To fine-tune the WL voltage, $C_{dec}$ and $C_{wl}$ should be adjusted by another block that adds small capacitances to them. This block switches between small capacitors to fine-tune the output voltage.

This technique needs exact timing to charge the decoder and discharge it into the word-line. To explain the timing, suppose that the address is applied to the decoder at the rising edge of the clock cycle. The decoder is a combinational circuit, so after a delay one of the word-lines is selected and its voltage reaches the supply voltage. In this technique, the clock cycle is divided into two phases. In the first phase, the decoder parasitics are charged to $V_{dd}$. In the second phase, the decoder is disconnected from $V_{dd}$, connected to the selected word-line, and discharged into it. Figure 3.2 shows a high-level view of the major memory components.

Figure (3.2) Wordline and decoder

The decoder is depicted as a parasitic capacitor and ideal switches. A logic circuit closes a certain switch based on the input address, the path from the supply voltage to the WL is connected, and the WL is charged to $0.63 V_{dd}$. The word-line is a long wire that connects to the gates of the access transistors in the bit-cells. For example, for a 64-bit SRAM the word-line connects to 128 gates (two access transistors per bit-cell), each with a parasitic capacitance. Since all of these capacitors are in parallel, their capacitances add up and form a larger capacitance. A simpler configuration of the decoder and word-line capacitors is shown in figure 3.3.
Figure (3.3) Simple model of decoder and WL parasitic capacitances

If the charge-sharing technique is applied in this configuration, the voltage $V_2$ is determined by the size of the parasitics. Suppose the clock period is divided into two phases. During phase 1 $(t_1)$, the decoder capacitor $C_{DEC}$ is charged to $V_{dd}$, and in phase 2 $(t_2)$, $C_{DEC}$ is disconnected from the source and connected to the word-line. Hence the charge of $C_{DEC}$ is shared between both capacitors and, with the word-line initially discharged, the voltage $V_2$ settles to

$$V_2 = V_{dd} \cdot \frac{C_{DEC}}{C_{DEC} + C_{WL}} \tag{3.1}$$

$C_{DEC}$ is the total parasitic capacitance that is charged during $t_1$. It is calculated by extracting all parasitic capacitances of the available decoder. The same applies to the word-line parasitic capacitance $C_{WL}$.

As mentioned earlier, in the first phase the decoder is isolated from the word-lines and charged to $V_{dd}$. To provide this condition, an edge detector generates a pulse at the rising edge of the clock. This pulse is applied to the first stage of the decoder and disables the decoder during the first phase. The decoder then acts as an isolated capacitor that is charged to $V_{dd}$. After the first phase, a switch disconnects $V_{dd}$ and, simultaneously, the input address is applied to the decoder, a word-line is selected, and the charge stored in the decoder parasitics is discharged to the word-line.

The decoder parasitics charge to $V_{dd}$ during $t_1$. This time should be long enough for $C_{dec}$ to charge fully to $V_{dd}$, which is one of the problems this system faces at higher frequencies. To compensate for a shorter charging time, larger switch transistors should be used. However, a larger switch transistor increases the parasitic capacitance on the decoder side, $C_{dec}$, which then needs a longer time to charge. Consequently, increasing the switch size improves the charging time only to some extent, and simulation shows that the operating frequency of this system is restricted to 250 MHz.
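As a rough numerical check of the charge-sharing relation in equation 3.1, the following Python sketch computes the shared voltage from charge conservation. The capacitance values are hypothetical placeholders chosen only for illustration (they are not extraction results from this work), the switches are assumed ideal, and `shared_voltage` and `cwl_for_target` are illustrative helper names.

```python
def shared_voltage(vdd, c_dec, c_wl, v_wl_initial=0.0):
    """Voltage after C_DEC (charged to vdd) is connected to C_WL.

    Charge conservation: Q = C_DEC*vdd + C_WL*v_wl_initial is spread over
    C_DEC + C_WL. With v_wl_initial = 0 this reduces to equation 3.1.
    """
    if c_dec <= 0 or c_wl < 0:
        raise ValueError("c_dec must be positive and c_wl non-negative")
    total_charge = c_dec * vdd + c_wl * v_wl_initial
    return total_charge / (c_dec + c_wl)


def cwl_for_target(c_dec, ratio):
    """Word-line capacitance that yields V2 = ratio * Vdd (0 < ratio <= 1)."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return c_dec * (1.0 - ratio) / ratio


# Hypothetical values, for illustration only.
VDD = 1.0        # normalized supply voltage
C_DEC = 63e-15   # assumed decoder parasitic capacitance in farads
C_WL = 37e-15    # assumed word-line parasitic capacitance in farads

v2 = shared_voltage(VDD, C_DEC, C_WL)
print(f"V2 = {v2:.3f} Vdd")

# Load needed to land on a given fraction of Vdd, e.g. for a trim block.
c_needed = cwl_for_target(C_DEC, 0.63)
print(f"C_WL for 0.63 Vdd = {c_needed * 1e15:.1f} fF")
```

Under these assumptions the word-line voltage depends only on the capacitance ratio, which is why a block that switches small capacitors in or out can trim the output voltage.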
$$\frac{W}{L} = \frac{C_{DEC} \cdot V_{dd}}{t_1 \cdot k \cdot V_{ov}^2} \tag{3.2}$$

Equation 3.2 estimates the aspect ratio $W/L$ of the switch transistor needed to deliver the charge $C_{DEC} \cdot V_{dd}$ within $t_1$. Figure 3.4 shows an equivalent simple model of the charging part.

Figure (3.4) Decoder parasitics charging

#### 3.2 Voltage Boosting

The previous section concentrated on decreasing the word-line voltage to improve the RNM. However, decreasing the word-line voltage affects the reliability of the SRAM due to process variation. For example, process variation changes the threshold voltage of the access transistors and causes weak operation of these transistors at low voltage, which increases the failure rate of the cells. In conditions where higher performance and yield are the main objectives, increasing the word-line voltage is an effective tool to improve these features [2]. The most widely used circuit to boost the voltage is the charge pump. A simple model of the charge pump is a capacitor in which the voltage on one plate is reversed within a clock cycle. Figure 3.5 shows the proposed topology for the charge pump.

Figure (3.5) Schematic of charge pump circuit

Before the rising edge of the clock, $M_1$ and $M_3$ are on and $V_{out}$ is equal to $V_{dd}$. When clk rises to logic 1, $M_2$ and $M_4$ turn on and $V_{out}$ increases. If $C_s$ is sufficiently larger than $C_L$, $V_{out}$ approaches twice $V_{dd}$. Figure 3.6 shows a complete block diagram of the boosting system implemented in the decoder. In this configuration, $C_L$ is the total parasitic capacitance of the word-line and decoder, and $C_1$ and $C_2$ are the charge pump capacitors. The main goal is to make $C_1$ and $C_2$ as small as possible while the output voltage is maintained around $1.5V_{dd}$. For this purpose, instead of charging the load from zero to $V_{boost}$, $C_L$ is first charged to $V_{dd}$ and the charge pump then boosts the voltage from $V_{dd}$ to $V_{boost}$.

Figure (3.6) Block diagram of voltage boosting system

## 3.3 Sense Amplifier

Sense Amplifiers (SA) amplify a small-signal differential voltage on the bit-lines to '0' and '1' logic levels at the output.
SA comprises a differential pair at the input stage and a bi-stable latch at the output stage. Figure 3.7 shows a simple SA topology.

Figure (3.7) StrongArm schematic

The differential pair detects the voltage difference of the bit-lines at the input and drives a latch at the output stage. The latch then regenerates to a state determined by the bit-line voltage difference. Among the different sense amplifier topologies, the StrongArm latch has characteristics that make it especially suitable for sense amplifier applications. The term "StrongArm" commemorates the use of this circuit in Digital Equipment Corporation's StrongArm microprocessor, but the basic structure was originally introduced by Toshiba's Kobayashi [19]. The features that make the StrongArm latch suited for use as a sense amplifier are as follows:

- 1) It consumes zero static power,
- 2) It directly produces rail-to-rail outputs,
- 3) Its input-referred offset arises primarily from one differential pair [19].

Figure 3.8 shows a dual strongArm topology [21].

Figure (3.8) Dual strongArm topology

Suppose that the bit-cell holds logic zero and logic one at the nodes connected to BL and BLC, respectively. BL and its complement BLC connect to the inputs of the sense amplifier. In a read operation, the BL, BLC, DP, DN, MP and MN nodes are precharged to $V_{dd}$. Right after the precharge phase, BL and BLC apply a differential voltage to the gates of $M_1$ and $M_2$. When the Sense Amplifier Enable (SAE) signal turns on transistor $M_7$, currents $I_1$ and $I_2$ pass through $M_1$ and $M_2$, respectively. Since the voltages of BL and BLC are different, the currents through $M_1$ and $M_2$ are different, and the DP and DN nodes discharge to different voltages. These nodes are also connected to the sources of $M_4$ and $M_3$, respectively, while their gates are charged to $V_{dd}$. Consequently, $V_{gs}$ of both transistors $M_3$ and $M_4$ increases.
Whichever transistor reaches $V_{th}$ first turns on and, thanks to positive feedback, regenerates the latch state. During simulation, it was observed that the gain of the strongArm SA increases as the supply voltage decreases. To investigate this, a simple model of the circuit, shown in figure 3.9, is used in the test bench. Figure 3.10 shows the gain of the SA versus supply voltage.

Figure (3.9) Strongarm input section phase

$$V_{DP} - V_{DN} = \frac{1}{C_{DP}} \cdot (I_{M1} - I_{M2}) \cdot t_1 \tag{3.3}$$

$$V_{DP} - V_{DN} = \frac{1}{C_{DP}} \cdot g_{M1} \cdot (V_{BL} - V_{BLC}) \cdot t_1 \tag{3.4}$$

During this time, the first stage has a gain equal to

$$gain = \frac{g_{M1,2}}{C_{DP}} \cdot t_1 \tag{3.5}$$

**Figure (3.10)** Gain versus supply voltage for the input stage of the single strongArm topology. The load is modeled as a capacitor.

The other factor is the time needed for the output logic value to become ready. This time is expressed by the regeneration time constant $\tau_{reg}$.

$$\tau_{reg} = \frac{C_{MP}}{g_{M4} \cdot (1 - \frac{C_{MP}}{C_{DP}})} \tag{3.6}$$

Circuit operation on the top side is the same in concept, with a small difference in voltage levels. $M_8$ and $M_9$ form another differential pair. DN and DP are connected to the gates of $M_8$ and $M_9$, respectively. The drains of $M_8$ and $M_9$ are discharged to ground during the precharge time. When DN and DP start to fall, XP and XN, which are the sources of $M_{10}$ and $M_{11}$, rise toward $V_{dd}$ at different speeds. The gates of $M_{10}$ and $M_{11}$ are discharged to ground during the precharge stage. The first transistor that reaches $V_{th}$ then turns on and the output of the latch is evaluated.

Technology scaling makes it difficult to control the fabrication process, leading to variation in the process parameters, which causes unpredictability in the performance of SAs.
For example, variation in oxide thickness and in the number of dopant atoms in the transistor channel leads to variation in the threshold voltage $V_{th}$ and other device parameters. It is therefore extremely important to keep this aspect in perspective while designing an SA and estimating its performance in terms of access time, offset, power and area. Monte Carlo analysis of the StrongArm topology shows better results compared to existing SAs. However, to improve SA operation at lower differential voltages, a voting topology is proposed in this thesis. Figure 3.11 shows the voting topology.

Figure (3.11) Voting topology

The first stage has a major role in the failure of the sense amplifier to detect the correct value. Suppose that the three first stages have failure probabilities (FP) $a_1$, $a_2$ and $a_3$. The voting system fails when at least two of the three input stages fail. Assuming that the stages fail independently of each other, the failure probability of the whole system is expected to be

$$FP = a_1 \cdot a_2 + a_2 \cdot a_3 + a_3 \cdot a_1 - 2 \cdot a_1 \cdot a_2 \cdot a_3 \tag{3.7}$$

Each pair product such as $a_1 \cdot a_2$ already includes the case in which all three stages fail, so the triple term is subtracted twice to count that case only once. If $a_1 = a_2 = a_3 = a$, then

$$FP = 3 \cdot a^2 - 2 \cdot a^3 \tag{3.8}$$

and for $a < 1/2$ this is smaller than $a$, so the FP of the voting system is smaller than the FP of the single topology.

Chapter 4

# Layout

#### 4.1 Word-line Voltage Reducing And Increasing

The main challenge in implementing the voltage reduction or boosting techniques is the capacitances. As mentioned, implementing capacitances needs an extra mask and area. Using the available parasitic capacitors is therefore a solution. To gain a comprehensive understanding of the available memory parasitics, parasitic extraction is performed on the memory components. The results show that the parasitics available on a decoder with 64 outputs can charge a word-line connected to 64 bit-cells.
As mentioned, to build a larger memory we can duplicate the memory blocks and the decoder. By adding more decoders, the capacity of the decoder to charge a larger word-line increases, while the extra circuits added to the decoder are shared in larger memories. The load control unit requires a number of small capacitors. To avoid an extra mask in the design, a dummy word-line is added to the memory and its parasitic capacitance is used for the load control circuit.

For the word-line voltage boosting technique, the main problem is that the charge pump capacitance needs access to both plates. A large portion of the parasitics lies between a node and the substrate, where the node is connected to the semiconductor. For example, the gates connected to the word-line contribute a large portion of the word-line capacitance. Accessing the metal side is easy, but accessing the other plate of the parasitics is not. In this case, we use a MOS transistor as a capacitor by shorting its drain and source. One plate is then the gate, which is accessible, and the other plate is the drain-source connection, which is also accessible. To reduce the area overhead, this kind of capacitance should be as small as possible, as explained in the charge pump design section.

Figure (4.1) WL voltage adjusting floor plan

### 4.2 Sense Amplifier

The layout of the sense amplifier should be large enough to reduce the impact of process variation. However, the sense amplifier should also be as small as possible compared to the size of the bit-cells. The solution is discussed in detail in the design chapter. Figure 4.2 shows the block diagram of the voting system.

Figure (4.2) Sense amplifier first stage layout

Chapter 5

# System Integration And Simulation

### 5.1 Voltage Reducing

In chapter 3, the concept of voltage reduction and the required circuits were explained. Figure 3.1 shows the voltage reducing circuit. As mentioned earlier, the concept is voltage division between two capacitors.
To save area, we use the parasitic capacitors of the decoder and word-line. The technique is carried out in three steps. First, the decoder is isolated from the word-lines and charged to $V_{dd}$. Second, the address is applied to the decoder and one word-line is selected. Finally, the charge on the decoder parasitic capacitance is shared with the word-line parasitic capacitance. Figure 5.1 shows the different steps. Figure 5.1.a shows the signal that isolates the decoder from the word-line and, simultaneously, figure 5.1.b shows the signal that charges the decoder. The resulting reduced word-line voltage is shown in figure 5.1.c.

Figure (5.1) Word line voltage reduction signals

Figure 5.2 shows the word-line voltage in different corners. It can be seen that the SS corner is the worst condition.

Figure (5.2) Output voltage for reducing system in all corners.

Since voltage division happens between the parasitic capacitances of the decoder and the word-line, and the word-line parasitic capacitance changes with the size of the word-line, a control system is needed to adjust the word-line voltage. Figure 5.3 shows the word-line voltage versus word-line capacitance.

Figure (5.3) Word-line voltage versus different capacitance load.

#### 5.2 Voltage Boosting

Another part of the project is designing an assist technique to increase the voltage of the word-line. Figure 3.6 shows the system that increases the word-line voltage in each clock cycle. As mentioned in chapter 3, the main challenge in voltage boosting is that the source capacitor should be as small as possible. The technique is therefore to first charge the word-line to $V_{dd}$ and then boost it to the desired voltage with the charge pump. By applying this technique, the size of the capacitor in the charge pump is reduced. Figure 5.4 shows the major signals. Figures 5.4.a and 5.4.b show the signals that connect the decoder supply node to $V_{dd}$ and to the charge pump, respectively.
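To see why precharging the word-line to $V_{dd}$ lets the pump capacitor shrink, the sketch below applies charge conservation to an idealized single-capacitor pump. This is a simplified model and not the proposed circuit itself: it assumes ideal switches, no parasitic loss, a pump capacitor `c_s` charged to $V_{dd}$ whose bottom plate is then lifted from ground to $V_{dd}$, and a load `c_l` standing for the word-line and decoder parasitics. The function names are illustrative.

```python
def boosted_voltage(vdd, c_s, c_l, v_load_initial):
    """Final voltage after an ideal pump capacitor is stacked onto the load.

    Before boosting: c_s holds c_s*vdd, c_l holds c_l*v_load_initial.
    After the bottom plate of c_s is lifted to vdd and the top plate is
    tied to the load: (V - vdd)*c_s + V*c_l = c_s*vdd + c_l*v_load_initial.
    """
    if c_s <= 0 or c_l <= 0:
        raise ValueError("capacitances must be positive")
    total_charge = c_s * vdd + c_l * v_load_initial
    return (total_charge + c_s * vdd) / (c_s + c_l)


def pump_cap_for_target(c_l, target_ratio, precharge_ratio):
    """Pump capacitance needed so that V_out = target_ratio * Vdd.

    precharge_ratio is the initial load voltage divided by Vdd.
    An ideal single stage cannot reach 2*Vdd with a finite capacitor.
    """
    if not precharge_ratio <= target_ratio < 2.0:
        raise ValueError("need precharge_ratio <= target_ratio < 2")
    return c_l * (target_ratio - precharge_ratio) / (2.0 - target_ratio)


C_L = 1.0  # load capacitance, normalized (assumption)

# Target of 1.5*Vdd, with and without precharging the load to Vdd.
cs_precharged = pump_cap_for_target(C_L, 1.5, precharge_ratio=1.0)
cs_from_zero = pump_cap_for_target(C_L, 1.5, precharge_ratio=0.0)
print(f"C_s with precharge:    {cs_precharged:.2f} x C_L")
print(f"C_s without precharge: {cs_from_zero:.2f} x C_L")
print(f"check: {boosted_voltage(1.0, cs_precharged, C_L, 1.0):.2f} Vdd")
```

In this idealized model, starting the load at $V_{dd}$ instead of zero reduces the pump capacitance required for a $1.5V_{dd}$ target by a factor of three. Real circuits will need somewhat more because of switch losses and parasitics on the pump node.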
Figure 5.4.c shows the boosted voltage of the word-line.

Figure (5.4) Boosting system main signals.

Figure 5.5 shows the word-line voltage in different corners. The system works well in all corners.

**Figure (5.5)** Output voltage for boosting system in all corners.

### 5.3 Sense Amplifier

In this section, the simulation results of the StrongArm and voting topologies are presented and investigated. Fig 3.8 shows the strongArm topology. First, the precharge system applies a voltage equal to $V_{dd}$ to the DP, DN, MP, MN, BL and BLC nodes. In the second step, a differential voltage slope is applied to the bit-lines. The Sense Amplifier Enable (SAE) signal activates the sense amplifier when the differential voltage between BL and BLC reaches 20 mV and keeps the SA active for 200 ps. During this period, the SA regenerates the output latch based on the differential input. Fig 5.6-top shows the major signals of the StrongArm sense amplifier. BL and BLC are the differential inputs. BL is constant, and BLC drops gradually from $V_{dd}$. When the SAE signal activates the SA, the input differential is equal to 20 mV and the output regenerates based on the input condition. Figure 5.6-bottom shows the SA output signal.

**Figure (5.6)** Top: read operation and signal conditions of the SA. Bottom: output of the SA.

The reliability of the SA is the probability that the SA generates the correct output under different conditions. Monte Carlo simulation is used to examine circuit behavior under these conditions. In this study, the output should be logic 1, so outputs lower than half of $V_{dd}$ are wrong answers. Fig 5.7 shows the error rate versus supply voltage. The number of errors (NOE) drops from 47 at 0.66 [a.u] to 31 at 0.47 [a.u]. For voltages lower than 0.47 [a.u], the SA error increases drastically since the circuit does not work properly.

Figure (5.7) Error rate versus supply voltage for the single strongArm topology. The read time is long enough.

This improvement is due to the gain increase as the supply voltage decreases, as shown in figure 3.10.
In this topology, reducing the supply voltage therefore improves the reliability of the SA. However, for voltages lower than 0.488 Vdd the NOE increases dramatically. Note that SA operation at lower supply voltage is slow, and results are generated only after a couple of nanoseconds. Hence, in the simulations the output is read after enough time to remove the effect of delay and to obtain a clear view of the effect of voltage on the number of errors, without the impact of timing. The other parameter that affects reliability is the input differential voltage. Figure 5.8 shows the error rate as the differential input changes.

**Figure (5.8)** Error rate versus differential input voltage for single strongArm topology.

Figure 5.8 compares the error rate for the voting and single topologies. The supply voltage is adjusted for the best condition in each topology. In the single topology, when the supply voltage is fixed at 0.48 [a.u], the number of errors drops from 32 at 20 mV to 1 at 30 mV. Compare this with a supply voltage of around 0.67 [a.u], where the NOE drops from 47 to 1. For the voting topology with 3 voters, the best supply voltage is around 0.53 [a.u], where the NOE is less than 5. Comparing the curves in figure 5.8 reveals that the number of errors is related to both the supply voltage and the input differential voltage. The voting system shows better behavior at lower supply voltages and lower input differential voltages.

The strongArm topology shows good results when the supply voltage is scaled. However, reducing the input differential voltage decreases the reliability of the StrongArm SA. To rectify this problem, a voting methodology is implemented in the first stage of the SA. The proposed topology was discussed and depicted in figure 3.11. Fig 5.9 shows the error rate versus supply voltage when the input differential voltage is fixed at 20 mV. The number of errors drops from 47 in the single strongArm topology to around 7 in the voting topology.
The number of errors reaches its minimum value at 0.5-0.53 [a.u]. As in the single topology, the NOE increases when the supply voltage drops below 0.48 [a.u]. The first outcome of the voting system is a reduction of the NOE by around 85% under the same conditions as the single topology. The second outcome is a reduced dependency of the NOE on the supply voltage. As can be seen in fig 5.9, the NOE stays around 7 for supply voltages between 0.56 and 0.66 [a.u].

**Figure (5.9)** Error rate versus supply voltage for both single and voting topology. Input differential is 20 mV.

Two enable signals are applied to the SA in the voting configuration. The SAE signal is applied to the first stage and the SAETOP signal is applied to the second stage. To understand the need to separate these signals, first suppose that both are the same. The NOE of the sense amplifier under this condition is depicted in figure 5.10.

**Figure (5.10)** Error rate versus supply voltage for voting topology when SAETOP is activated 200 ps after SAE. Input differential voltage is 20 mV.

The number of errors increases below 0.56 [a.u]. This is due to the delay in the first stage: at low voltage, the first stage needs more time to reach a final, stable, correct value. If the second stage reads the values from the first stage before that time, wrong results are generated at the output of the second stage. To overcome this problem, the second-stage enable signal SAETOP is applied after SAE. We call this time difference the first stage preparation time (FSPT).

**Figure (5.11)** Error rate versus SAETOP delay for supply voltage 0.48 [a.u].

Figure 5.11 shows the NOE for different delay times between SAE and SAETOP. It is clear that with a delay of 400 ps or more, the SA shows optimum results at the output. Reducing the FSPT would increase the speed of the SA and reduce the power consumption of the memory, but it increases the NOE of the sense amplifier.
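The effect of majority voting on the failure probability can be estimated with a standard binomial model. The sketch below is only an idealized illustration: it assumes that every first stage fails independently with the same probability `a`, that the second stage never fails, and that timing effects such as the FSPT play no role. The function name and the sample value of `a` are hypothetical.

```python
from math import comb


def majority_failure_probability(a, voters):
    """Probability that more than half of the voters fail.

    Each of the `voters` first stages is assumed to fail independently
    with probability `a`; the vote is wrong when a majority is wrong.
    """
    if voters < 1 or voters % 2 == 0:
        raise ValueError("voters must be a positive odd number")
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must be a probability")
    majority = voters // 2 + 1
    return sum(
        comb(voters, k) * a**k * (1.0 - a) ** (voters - k)
        for k in range(majority, voters + 1)
    )


# Hypothetical per-stage failure probability, for illustration only.
a = 0.047
for n in (1, 3, 5):
    fp = majority_failure_probability(a, n)
    print(f"{n} voter(s): FP = {fp:.5f}")
```

If the 47 errors of the single topology were counted over 1,000 Monte Carlo runs (an assumption, since the run count is not stated for that experiment), `a` would be 0.047 and this model gives roughly 0.006 for three voters, which is in the same range as the reported NOE of around 7. The model only helps when `a` is below 0.5; above that, voting makes the result worse.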
To improve the reliability of the sense amplifier even further, the number of voters in the first stage can be increased. The voting system then selects among more options, which reduces the probability of error to around zero. In the following, the first stage is expanded to five voters, so the second-stage output selects among five inputs. Figure 5.12 shows the result of 1,000 Monte Carlo analyses. As expected, the number of errors is reduced.

**Figure (5.12)** Error rate versus supply voltage for the voting topology that selects from five inputs. Input differential voltage is 20 mV. FSPT is 500 ps.

So far, the StrongArm input stage has been duplicated and a voting system at the output stage determines the final result. By voting among three, the area becomes approximately 3 times larger than a single strongArm, and the number of errors at the output drops from 47 to 7, an improvement of around 85%. The next step is to reduce the size of the voting transistors and investigate the relation between error rate and transistor size. Figure 5.13 shows the NOE versus the size of the differential pairs. In this simulation, the voting system has three voters. Each voter is one third of the size of the single StrongArm transistors, so the single SA and the voting SA have approximately the same size. The results show that for very large transistors both methods give the same results. As the transistor size decreases, the NOE of the voting method becomes lower than that of the single SA. This means that for the same total transistor size, the voting method gives better results than the single SA. For example, the best value occurs when the transistor size is 1.8 [a.u], where the voting method reduces the errors from 30 to 12, around 60%. Based on the above, for the voting method with a first stage three times larger than the single StrongArm the NOE improvement is about 85%, while for the voting SA with a first stage of the same size as the single SA the improvement in NOE is 60%.
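The improvement percentages quoted above follow from the error counts by simple arithmetic. A minimal sketch of that calculation is given below; the error counts are the ones reported in the text, while the helper name `relative_reduction` and the case labels are illustrative.

```python
def relative_reduction(errors_before, errors_after):
    """Fractional reduction in the number of errors (NOE)."""
    if errors_before <= 0:
        raise ValueError("errors_before must be positive")
    return (errors_before - errors_after) / errors_before


# (NOE of single SA, NOE of voting SA) as reported in the text.
cases = {
    "three voters, first stage 3x larger": (47, 7),
    "three voters, same total size": (30, 12),
}

for label, (before, after) in cases.items():
    print(f"{label}: {relative_reduction(before, after):.1%} fewer errors")
```

The comparison makes the area trade-off explicit: most of the benefit is kept even when the three voters together occupy the same area as the single sense amplifier.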
Figure (5.13) Error rate versus supply voltage for single and voting topology which select from three and five inputs. Input differential voltage is 20 mV. FSPT is 500 ps.

**Figure (5.14)** Comparison of the NOE of the single strongArm and voting topologies as the differential pair size increases. Supply voltage is 1 and differential voltage is 20 mV.

To compare the single and the voting system, the error reduction ability of the voting system is depicted in figure 5.14. In this comparison, error reductions are normalized to the maximum error reduction. As shown in the figure, the maximum error reduction occurs when the input transistor size is 1.8 [a.u]. In this condition, the number of errors drops from 30 to 14, an improvement of about 53% with the same transistor sizes.

**Figure (5.15)** Comparison of the error rate versus supply voltage for voting topologies which select from three and five inputs. Input differential voltage is 20 mV. FSPT is 500 ps.

Figure 5.15 compares the NOE of the voting system with five voters. The NOE decreases rapidly as the transistor size increases. For transistors larger than 2.1 normalized size, the voting system with 5 voters gives a 50% improvement in NOE. As with the 3-voter system, the NOE reaches zero at 4 normalized size. All in all, the system with five voters shows better results between 2 and 3 normalized size. However, instead of adding 2 voters to the system, the same result can be achieved by a 30-percent increase in the size of the 3 voters.

## Conclusion And Future Works

This project provides solutions for maintaining the expected performance of SRAM at low supply voltage. Voltage scaling affects the reliability of memory operation, which means a significant increase in the chance of failure during read or write (R/W) operations. One cause of failure is the bit-cell itself. Memory bit-cells are designed to work under the nominal voltage of a certain technology node.
However, under reduced voltage the stable state of a bit-cell might unintentionally change during a simple read operation. In the same way, when writing to some target bit-cells, the state of other bit-cells might unintentionally change, or the content of the target bit-cell might be written incorrectly. To maintain the functional reliability of memories under reduced voltage, one technique is to boost or reduce the voltage of only the desired nodes with the help of R/W assist circuitry. Previous studies show that the word-line is a critical node whose voltage level affects the performance and reliability of the memory. This work proposes a solution to boost the word-line voltage with minimal silicon area overhead. To boost the voltage of certain nodes based on the concept of charge pumping, a novel charge pump topology is designed that uses fewer transistors and achieves good efficiency. The capacitor ratio of the proposed charge pump is lower than that of existing designs. However, a charge pump circuit normally requires significant area overhead due to the capacitors used in its architecture. In this thesis, the freely available parasitic capacitances are used to redistribute electric charge to the desired nodes via a charge pump circuit. This technique makes it possible to control the word-line voltage for a more reliable read or write operation with minimal area overhead.

Another source of memory failure is the sense amplifier. It is the module that decides whether the state of a bit-cell is 0 or 1 during the read operation. The reliability of the SA decreases when the supply voltage is reduced. Most of the time, the SA itself could be the bottleneck in reducing the memory operating voltage even further. A novel, state-of-the-art topology for the sense amplifier, based on voting, is introduced in this thesis. Voting among multiple results reduces the number of failures in Monte Carlo simulation.
Also, the input offset of the sense amplifier is improved in comparison with existing designs.

As mentioned before, the main purpose of this project is to improve SRAM read and write operations by applying techniques to existing peripheral circuits with as few changes as possible. Previous studies of SRAM show that the decoder and sense amplifiers are at the center of attention. The decoder applies voltage to the word-line, which is a critical node for improving the operation of the SRAM. Boosting or reducing the word-line voltage affects the operating parameters of the SRAM. Implementing some form of charge pumping inside the decoder makes it possible to change the word-line voltage without changing the operational SRAM supply voltage. This technique faces at least two main challenges. The first is keeping the area overhead as small as possible. Existing charge pumps use large capacitances, approximately ten times the load capacitance, to boost the voltage of the output node, so a configuration with smaller capacitance should be designed to minimize area overhead. The other concern is that the operation of the charge pump circuitry must not interfere with the operation of the memory itself. Hence, a control unit should manage the operation of the charge pump in coordination with the decoder. The control circuit should consider all the related timing issues and the performance of the memory.

Another circuit module that limits the minimum supply voltage of an SRAM is the sense amplifier. Sense amplifier parameters should be designed for near-threshold voltage. Improvement methods and techniques can then be used to enhance sense amplifier functionality and parameters. The main challenge regarding the sense amplifier is the limited area. The room for the sense amplifier on chip is restricted by the width of the cells and the length of the memory. For that reason, a sense amplifier should be designed to be as small as possible.
Further work on this thesis subject includes designing a decoder with built-in voltage boosting or reducing, by implementing the charge pump concept in the decoder architecture instead of using an external charge pump. Another improvement regarding the sense amplifier is to study further the division of the sense amplifier into multiple smaller ones, to study the pattern of errors across operating parameters, and to design a decision-making system with optimum correction ability. Other directions for the sense amplifier include using multiple input differential transistors, and designing a test that detects the impact of process variation on the sense amplifier based on the results under different conditions and reduces this effect by correcting the imbalance between the two sides through adding or removing transistors on each side.

# Bibliography

- [1] Hiroyuki Yamauchi, A Discussion on SRAM Circuit Design Trend in Deeper Nanometer-Scale Technologies, in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 18, No. 5, May 2010. https://ieeexplore.ieee.org/document/5067000
- [2] Tony T. Kim, Bo Wang, and Anh Tuan Do, High Energy Efficient Ultra-low Voltage SRAM Design: Device, Circuit, and Architecture, in 2012 International SoC Design Conference (ISOCC). https://ieeexplore.ieee.org/document/6407117
- [3] Hiroyuki Yamauchi, Embedded SRAM Trend in Nano-Scale CMOS, in 2007 IEEE International Workshop on Memory Technology, Design and Testing. https://ieeexplore.ieee.org/abstract/document/4547608
- [4] Kaushik Roy, Saibal Mukhopadhyay, and Hamid Mahmoodi-Meimand, Leakage Current Mechanisms and Leakage Reduction Techniques in Deep-Submicrometer CMOS Circuits, in Proceedings of the IEEE, Volume: 91, Issue: 2, Feb. 2003. https://ieeexplore.ieee.org/document/1182065
- [5] P. Upadhyay, Jeet Sen Sharma, R. Kar, D. S. P. Ghoshal.
A Novel 8T SRAM Cell with Low Swing Voltage for Portable Devices, in 2018 15th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology. https://www.researchgate.net/publication/330537381\_A\_Novel\_8T\_SRAM\_Cell\_with\_Low\_Swing\_Voltage\_for\_Portable\_Devices
- [6] Tony T. Kim, Bo Wang, and Anh Tuan Do, High Energy Efficient Ultra-low Voltage SRAM Design: Device, Circuit, and Architecture, in 2012 International SoC Design Conference (ISOCC). https://ieeexplore.ieee.org/document/6407117
- [7] Nahid Rahman, B. P. Singh, Static-Noise-Margin Analysis of Conventional 6T SRAM Cell at 45nm Technology, in International Journal of Computer Applications (0975-8887), Volume 66, No. 20, March 2013.
- [8] Babak Mohammadi, Oskar Andersson, Joseph Nguyen, Lorenzo Ciampolini, Andreia Cathelin, and Joachim Neves Rodrigues, A 128 kb 7T SRAM Using a Single-Cycle Boosting Mechanism in 28-nm FD-SOI, in IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 65, No. 4, April 2018. https://ieeexplore.ieee.org/document/8048540
- [9] Mohamed H. Abu-Rahma, Ying Chen, Wing Sy, Wee Ling Ong, Leon Yeow Ting, Sei Seung Yoon, Michael Han, Esin Terzioglu, Characterization of SRAM Sense Amplifier Input Offset for Yield Prediction in 28nm CMOS, in 2011 IEEE Custom Integrated Circuits Conference (CICC). https://ieeexplore-ieee-org.ludwig.lub.lu.se/document/6055315
- [10] V. K. Tomar, Ashish Sachdeva, Implementation and analysis of low power reduction techniques in sense amplifier, in the 2nd International Conference on Electronics, Communication and Aerospace Technology (ICECA 2018).
- [11] Disha Arora, Anil K. Gundu, Mohammad S.
Hashmi, A High Speed Low Voltage Latch Type Sense Amplifier for Non-Volatile Memory, in 20th International Symposium on VLSI Design and Test (VDAT), 2016. https://ieeexplore.ieee.org/document/8064841
- [12] Babak Mohammadi, Oskar Andersson, Joseph Nguyen, Lorenzo Ciampolini, Andreia Cathelin, and Joachim Neves Rodrigues, A 128 kb 7T SRAM Using a Single-Cycle Boosting Mechanism in 28-nm FD-SOI, in IEEE Transactions on Circuits and Systems I: Regular Papers (Volume: 65, Issue: 4, April 2018). https://ieeexplore.ieee.org/document/8048540
- [13] Babak Mohammadi, Ultra-low Power Design Approaches in Memories and Assist Techniques, Lund University, 2017. https://lup.lub.lu.se/search/publication/95956fa0-793e-4577-bff6-6baa20fd0ab5
- [14] Babak Mohammadi, Oskar Andersson, Xiao Luo, Masoud Nouripayam, and Joachim Neves Rodrigues, An Area Efficient Single-Cycle xVDD sub-Vth On-Chip Boost Scheme in 28 nm FD-SOI, in 2016 IEEE Asian Solid-State Circuits Conference (A-SSCC).
- [15] Tahseen Shakir and Manoj Sachdev, A Word-Line Boost Driver Design for Low Operating Voltage 6T-SRAM, in Journal of Computers, Vol. 3, No. 5, May 2008. https://ieeexplore-ieee-org.ludwig.lub.lu.se/document/6291950
- [16] Baker Mohammad, Percy Dadabhoy, Ken Lin, Paul Bassett, Comparative Study of Current Mode and Voltage Mode Sense Amplifier used for 28nm SRAM, in 2012 24th International Conference on Microelectronics (ICM). https://ieeexplore-ieee-org.ludwig.lub.lu.se/document/6471396
- [17] B. S. Reniwal, P. Singh, V. Vijayvargiya, S. K.
Vishvakarma, A New Sense Amplifier Design with Improved Input Referred Offset Characteristics for Energy-Efficient SRAM, in 2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems (VLSID). https://ieeexplore.ieee.org/document/7884800
- [18] Bingyan Liu, Jiangzheng Cai, Jia Yuan, and Yong Hei, A Low-Voltage SRAM Sense Amplifier With Offset Canceling Using Digitized Multiple Body Biasing, in IEEE Transactions on Circuits and Systems II: Express Briefs (Volume: 64, Issue: 4, April 2017). https://ieeexplore.ieee.org/document/7465818
- [19] Behzad Razavi, A Circuit for All Seasons: The StrongARM Latch, in IEEE Solid-State Circuits Magazine, Volume: 7, Issue: 2, Spring 2015. https://ieeexplore.ieee.org/document/7130773
- [20] Parita Patel, Sameena Zafar, and Hemant Soni, Process Variation Induced Mismatch Analysis In Sense Amplifiers, in International Journal of Research in Computer and Communication Technology, Vol 3, Issue 5, May 2014.
- [21] Aikaterini Papadopoulou, Vladimir Milovanović, and Borivoje Nikolić, A Low-Voltage Low-Offset Dual Strong-Arm Latch Comparator, in 2017 IEEE Asian Solid-State Circuits Conference (A-SSCC). https://ieeexplore.ieee.org/abstract/document/8240271
- [22] Younis Allasasmeh, Analysis, Design, and Implementation of Integrated Charge Pumps with High Performance. https://pdfs.semanticscholar.org/6dac/fa99e91061647f5208d93ebbe7d4aaf9f70a.pdf
- [23] Tony T. Kim, Bo Wang, and Anh Tuan Do, High energy efficient ultra-low voltage SRAM design: Device, circuit, and architecture, in 2012 International SoC Design Conference (ISOCC).
- [24] Dhruv Patel and Manoj Sachdev, 0.23-V Sample-Boost-Latch-Based Offset Tolerant Sense Amplifier, in IEEE Solid-State Circuits Letters (Volume: 1, Issue: 1, Jan.
2018).
- [25] Naveen Verma, Ultra-Low-Power SRAM Design In High Variability Advanced CMOS. https://www.researchgate.net/publication/42539304\_Ultra-low-power\_SRAM\_design\_in\_high\_variability\_advanced\_CMOS
- [26] Bram Rooseleer and Wim Dehaene, A 40 nm, 454 MHz 114 fJ/bit Area-Efficient SRAM Memory with Integrated Charge Pump, in 2013 Proceedings of the ESSCIRC (ESSCIRC). https://ieeexplore.ieee.org/document/6649107
- [27] Aikaterini Papadopoulou, Vladimir Milovanović, and Borivoje Nikolić, A Low-Voltage Low-Offset Dual Strong-Arm Latch Comparator, in 2017 IEEE Asian Solid-State Circuits Conference (A-SSCC).
- [28] Do Anh-Tuan, Kong Zhi-Hui, and Yeo Kiat-Seng, Hybrid-Mode SRAM Sense Amplifiers: New Approach on Transistor Sizing, in IEEE Transactions on Circuits and Systems II: Express Briefs (Volume: 55, Issue: 10, Oct. 2008). https://ieeexplore.ieee.org/document/4653505

Series of Master's theses, Department of Electrical and Information Technology, LU/LTH-EIT 2019-727, http://www.eit.lth.se