## E-Band Transmitter Design 71G - 86GHz

ANDREI ANUFRIJEV MASTER'S THESIS DEPARTMENT OF ELECTRICAL AND INFORMATION TECHNOLOGY FACULTY OF ENGINEERING | LTH | LUND UNIVERSITY




Andrei Anufrijev (an0253an-s)

Department of Electrical and Information Technology Faculty of Engineering, LTH, Lund University SE-221 00 Lund, Sweden

Supervisor: Baktash Behmanesh (LTH)

Per-Olof Brandt (BeammWave)

Meshach Milon Paul Pears Pradeep Kumar (BeammWave)

Examiner: Pietro Andreani (LTH)





LUND UNIVERSITY

2023 - 2024

# Abstract

The goal of this Master's thesis is to design a power amplifier as part of a transmitter operating at 71-86 GHz in a 22 nm technology. The primary objective is to design a narrowband power amplifier for the E-band at 80 GHz. The target output power is 14-20 dBm with a power-added efficiency of around 14-18%. An additional aim is to investigate which new circuit concepts are needed at 80 GHz compared to 40 GHz, and to design the power amplifier, and potentially also the modulator, while keeping the efficiency sufficiently high.

The majority of IEEE publications refer to the Doherty amplifier because of the efficiency it achieves through power-combining techniques. However, an additional area of interest is to observe the difference between 40 GHz and 80 GHz amplifiers. Therefore, the amplifier for 80 GHz follows the same approach as the one for 40 GHz: the stacked topology.

The output stage is the main concern because it handles high input and output power. The transistors and components become large in order to withstand this power, which forces trade-offs between operation, efficiency, size and design solutions.

Numerous problems are introduced by frequency-dependent components, including non-linearity, power dissipation and component size. These effects become more visible as the operating frequency increases.

The research indicates that, although the same structure is used, the PA schematics for 40 GHz and 80 GHz differ considerably.

Firstly, the load-line approach helps to determine the minimum number of transistors, but it is not well applicable at 80 GHz because of higher losses (e.g. the impact of parasitics, transconductance and reflection).

Secondly, in order to minimise the signal power loss and make the transistor structure suitable for PDK inductors, the default versions of the transistor structure are rebuilt to deliberately change the capacitive and resistive parasitics.

Thirdly, the inductance present in the gate and ground nets influences the overall performance, which is sensitive to its value.

Fourthly, the Q factor of components such as capacitors and inductors should be greater than 17, and the Q factor of nets on the signal path (e.g. between components) should be at least 10.

A schematic PA is used in the PA design process, and layout extractions are used to replace components one at a time for the post-layout simulation. Modified transistors, a cascade, EMX-extracted nets (Vdd, Vss, back gate nets, between transistors and components), EMX-extracted designed capacitors, an output MN, and RF pads are all included in the PA configuration.

However, the required efficiency could not be obtained because of the high current, which causes a large DC power, and the low voltage swings, which are phase-shifted relative to each other.

Only the output-stage PA has finalised results, owing to the design complexity at a frequency as high as 80 GHz.

# Acknowledgements

I would like to thank Per-Olof Brandt, my industrial supervisor at BeammWave, for this thesis opportunity and guidance. Despite his hectic schedule during the project, I am grateful for his helpful criticism and tremendous encouragement.

I express my gratitude to Baktash Behmanesh, my academic supervisor, for his unwavering tolerance and trust during the assignment.

Additionally, I would like to extend my gratitude to Meshach Milon for his constant support, oversight and collaboration during the project.

# Popular science summary

The growing demand for data exchange makes the E-band a target for higher data rates and higher-frequency operation with low power consumption. The E-band covers 71-76 GHz and 81-86 GHz. It is part of the millimeter-wave bands, is suitable for commercial use and exhibits an atmospheric attenuation of less than 0.5 dB/km at sea level [14, 18].

Millimeter-wave bands enable a growing range of applications such as broadband wireless communication and automotive radar. Commonly, 16-QAM and 64-QAM modulations are used in E-band radio links, together with beamforming techniques and narrowband power amplifiers to overcome the losses between transmitter and receiver (e.g. mobile to base station (BS) and BS to BS) [14, 19].

The power amplifier is the most power-consuming block in a mm-wave transceiver, and its design becomes problematic at high frequencies.

Various approaches exist for designing efficient power amplifiers, such as a stacked PA or a transformer-based Doherty PA. They are also made possible by the semiconductor industry's tremendous efforts in shrinking transistors, which raise the cut-off and maximum oscillation frequencies [11-20].

The 22 nm FD-SOI CMOS technology presents unique opportunities for the industry as a cheaper alternative to FinFETs that still allows higher integration density than SiGe. Nevertheless, there are several key challenges in implementing a mm-wave PA in 22 nm FD-SOI, such as a low breakdown voltage and large back-end parasitics compared to older CMOS nodes [14].

IEEE publications report power-combining methods that use auxiliary amplifiers for back-off power, achieving an output power of 14-18 dBm and a peak efficiency of 14-19.2% [11-17].

However, stacked power amplifiers have not lost their relevance: they achieve competitive results [12, 14, 16, 20] (e.g. 18 dBm of output power with 24% PAE [20]) and compete with other approaches in terms of complexity, efficiency and area.

# Contents

- Abstract
- Acknowledgements
- Popular science summary
- The list of figures
- The list of tables
- Abbreviations
- Introduction
- Chapter 1 - Theory
- Basic Concept
- Power Amplifier Structure
- Matching Network
- Q-factor & Bandwidth
- CP and IIP3
- Stability
- Expected Problems
- Chapter 2 - Schematic
- DC Part
- AC Part
- The Output PA Stage
- Output of PA
- Inter-Matching
- Input Matching
- Schematic Test Bench with Results
- Chapter 3 - Layout
- 22FDSOI MOSFET Transistor
- Customized Transistor
- Drain & Source
- Gate
- Single Transistor Layout
- Trade-off between R and C Influences
- Stacked 10 Modified Transistors
- Modified Capacitors & Nets
- Cascade
- Post Layout Results
- Output PA Stage
- Post Layout Results
- RF pads
- Output PA Stage with Output MN to RF pads
- Chapter 4 - Issues with RLCK extraction using Quantus
- Discussion and Conclusion
- Outcomes of Output PA Stage Results
- Comparison Results between Quantus RCLK and EMX extractions
- Comparison with "TX40" Power Amplifier for 40GHz
- Comparison with IEEE articles
- Comparison with the State of Art
- Future Work
- Reference

# The list of figures

- Figure 1: Harmonics impact for various PA classes
- Figure 2: Power Amplifier Structure
- Figure 3: Fundamental Gain & IM<sub>3</sub>
- Figure 4: Stacked PA
- Figure 5: Vds vs Ids sweep
- Figure 6: Impact of grounded capacitor in the gate of transistor on the output load
- Figure 7: Impact of grounded inductor in the gate of transistor on the output load
- Figure 8: Common Lg1 net
- Figure 9: Separated Lg1 net
- Figure 10: Case 1: Theoretical output of PA
- Figure 11: Case 2: Shifted output of PA
- Figure 12: Additional impact of LC on the output load
- Figure 13: Impact of Lg2 net on the PA performance
- Figure 14: Impact of Cg2 capacitor on the PA performance
- Figure 15: Inter-matching between transistors using Lg2 and Ls1
- Figure 16: Inter-matching between transistors using Lp1
- Figure 17: Common gate net Lg1 with 2 caps Cg1
- Figure 18: Common gate net Lg1 with 1 cap Cg1
- Figure 19: Separated gate nets
- Figure 20: Ideal case
- Figure 21: Simple matching for the test bench
- Figure 22: General matching approach
- Figure 23: Additional matching way
- Figure 24: Schematic test bench of the PA
- Figure 25: PDK transistor with M1 metallisation layer
- Figure 26: PDK transistor with C3 metallisation layer
- Figure 27: PDK transistor with JA metallisation layer
- Figure 28: Transistor test bench
- Figure 29: Close view: Drain/Source connections
- Figure 30: Close view: Gate connection
- Figure 31: Modified Drain/Source position
- Figure 32: Modified Drain/Source common connections
- Figure 33: Modified gates circled connection
- Figure 34: Modified transistor
- Figure 35: M1 - M2 long and C1 - C5 short
- Figure 36: M1 - C3 long and C4 - C5 short
- Figure 37: Gate connection M1 - QB
- Figure 38: Drain & Source connections M1 - JA
- Figure 39: 10 stacked modified transistors
- Figure 40: Schematic core of custom capacitor
- Figure 41: Layout of custom capacitor
- Figure 42: Schematic core of custom inductor
- Figure 43: Layout of custom inductor
- Figure 44: Common Layout View
- Figure 45: 3D Layout View
- Figure 46: Schematic core of cascade
- Figure 47: Layout of cascade
- Figure 48: Cascade test bench
- Figure 49: Vds and output swings in the cascade
- Figure 50: Spectrum of the output signal
- Figure 51: HB-tstat and transient spectrums of output signal
- Figure 52: The low frequency component spotting
- Figure 53: Case 1 - Splitted Vss nets
- Figure 54: Case 2 - Long Vss nets
- Figure 55: Case 3 - Long Vss nets (Fully shorted with Vdd)
- Figure 56: HBSP - S11 and S22 (Smith Chart) at 10dBm input power
- Figure 57: HBSP - S11 and S22 at 10dBm input power (no peak within 30G-40GHz)
- Figure 58: $|\Delta|$ and $\Delta$ plots with definition of issue
- Figure 59: K, mu and mu-prime factors
- Figure 60: Final OP1dB and IP1dB
- Figure 61: IIP3
- Figure 62: Common & Differential Modes
- Figure 63: Oscillations in Vdd & Vss nets for 300ns
- Figure 64: Differential swings in the Cascade
- Figure 65: Differential swings in the PA
- Figure 66: Output signal spectrum of the cascade
- Figure 67: Output signal spectrum of the PA
- Figure 68: SP/HBSP plotting options
- Figure 69: Delta plot indicates the same problem as the transient simulation
- Figure 70: Schematic view of considered components
- Figure 71: Spectrum of low power output signal
- Figure 72: Impact of fault (yellow) and correct (red) EMX extractions on the output signal
- Figure 73: RF pads with Output MN to 100 Ohm
- Figure 74: Final layout of the PA with RF pads
- Figure 75: Fault results in RCLK extraction
- Figure 76: Original output net from the PA with low inductance
- Figure 77: Case 1 - Pins are located on the maximum distance
- Figure 78: Case 2 - Pins are located on the original distance as in the output net
- Figure 79: Vds spread & Output Swing of the PA
- Figure 80: Gain & PAE using Quantus RCLK extraction
- Figure 81: Gain & PAE using EMX extraction for both with and without RF pads

# The list of tables

- Table 1: Conduction angle, PA Classes, Efficiency
- Table 2: PA stages with suggested parameters
- Table 3: Expected values for PA stages
- Table 4: Outcomes of Q factor comparison
- Table 5: Biasing and Vds values
- Table 6: Impact of Lg1 on the gain
- Table 7: Y and Z parameters for M1, C3, JA metallisation layers
- Table 8: Comparison between PDK and Modified transistors
- Table 9: PDK and custom capacitors comparison
- Table 10: Impact of PDK and custom capacitors on the PA gain
- Table 11: Output net views
- Table 12: Output nets parameters
- Table 13: Differential voltages spread in the cascade
- Table 14: Comparison Cascade and PA post-layouts results
- Table 15: Vss influence on the PA performance
- Table 16: Overview the difference between cascade and PA results
- Table 17: The influence of the output MN and RF pads
- Table 18: Overview of performance drop through implementations
- Table 19: Final power gain, output power and PAE
- Table 20: Gain, Pout, PAE using EMX extraction
- Table 21: Comparison between the state of art with the current work

# Abbreviations

- SP - Scattering Parameter
- HB(A) - Harmonic Balance (Analysis)
- Vds - Voltage between Drain and Source
- Vgs - Voltage between Gate and Source
- Vdg - Voltage between Drain and Gate
- Vdif - Differential Voltage
- Cg - Gate Capacitance
- Cgs - Gate to Source Capacitance
- Cgd - Gate to Drain Capacitance
- MN - Matching Network
- PA - Power Amplifier
- CS Amp - Common Source Amplifier
- CG Amp - Common Gate Amplifier
- Q (factor) - Quality Factor
- Vdd - Supply Voltage of the PA
- GND - Ground
- CP - Compression Point
- BW<sub>3dB</sub> - Bandwidth (3 dB from the peak)
- Psat - Saturation Power
- IP1dB - Input Power at Compression Point
- OP1dB - Output Power at Compression Point
- IM - Intermodulation
- IIP3 - Input third-order intercept point
- OIP3 - Output third-order intercept point
- PAE - Power Added Efficiency

# Introduction

Today, the primary focus of wireless networks is on the quantity and quality of data transmission. Power amplifiers, which amplify data signals carried on a high-frequency carrier and deliver them over a sufficient distance using, for example, a beamforming technique, are primarily responsible for the successful delivery of data.

Two fundamental topologies can be used to design the output power amplifier stage at a frequency as high as 80 GHz: the stacked topology and the power-combining topology.

A cascade-configured (stacked) amplifier can reach a high output power while allowing some degree of output power variation and avoiding voltage breakdown across gate-drain and gate-source. However, because the stacked power amplifier handles a large current, its transistors become much larger than in the earlier stages, which adds parasitic effects.

The power-combining configuration uses two or more amplifiers with a common balun to combine the power of each amplifier at the output load. It inherits all the previously mentioned problems of the basic amplifier and adds the difficulty of designing the balun. The power-combining balun should have both a high coupling factor (between 0.7 and 1) and sufficient bandwidth while avoiding self-resonances, whose number equals the number of amplifiers used.

The implementation of stacked power amplifiers is taken into consideration for the thesis in order to better observe the differences between PA for 40GHz and 80GHz.

The goal is to achieve comparable outcomes for the PA operating at 80 GHz, meaning that the peaking efficiency must be at least 14% and the intended output power must fall between 14 and 16 dBm.

# Chapter 1 - Theory

### **Basic Concept**

Power amplifiers handle large amounts of power and are consequently expected both to perform well and to meet modern requirements in terms of power consumption, recent technology nodes, frequency of interest, cooling limitations and other areas.

Power amplifiers are classified by their characteristics and performance, which relate to the fraction of time during which current flows through the active device. The portion of the RF cycle for which current flows in the device is defined as the conduction angle. In simple terms, the current swing defines how long the PA is on, and this determines its class. For example, a current swing that always stays above zero means that the device is on constantly (the conduction angle is $2\pi$), and such a PA is classified as class A. A higher class is defined if the current falls to zero for a certain part of the cycle. For instance, if the device conducts for half of the cycle (a conduction angle of $\pi$), it is assigned to class B. [5]

| Conduction angle ($\alpha$)  | $2\pi$ | $\pi < \alpha < 2\pi$ | $\pi$ | $0 < \alpha < \pi$ |
|------------------------------|--------|-----------------------|-------|--------------------|
| Class                        | A      | AB                    | B     | C                  |
| Theoretical efficiency limit | 50%    | 64%                   | 78.5% | ≈ 100%             |

The conduction angles and the corresponding classes [5] are listed in Table 1.

Table 1: Conduction angle, PA Classes, Efficiency

The longer the current flows in the device, the lower the efficiency of the amplifier. The RF current waveform can be decomposed into a DC component:

$$I_{dc} = \frac{I_{max}}{2\pi} \cdot \frac{2\sin(\alpha/2) - \alpha \cdot \cos(\alpha/2)}{1 - \cos(\alpha/2)}$$
[5]

and the magnitude of the first harmonic:

$$I_1 = \frac{I_{max}}{2\pi} \cdot \frac{\alpha - \sin(\alpha)}{1 - \cos(\alpha/2)}$$
[5]

It can be seen that the DC component decreases monotonically as the conduction angle is reduced, as the comparison between class A and class B shows:

- The DC component for class A ( $\alpha = 2\pi$ ):  $I_{dc}(Class A) = I_{max}/2$  [5]
- The DC component for class B ( $\alpha = \pi$ ):  $I_{dc}(Class B) = I_{max}/\pi$  [5]
- The fundamental component for class B ( $\alpha = \pi$ ):  $I_1(Class B) = I_{max}/2$  [5]
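As an illustrative check of the expressions above, the following Python sketch evaluates $I_{dc}$ and $I_1$ as functions of the conduction angle and derives the ideal efficiency from them. It assumes a full voltage swing, that is, a fundamental voltage amplitude equal to $V_{dc}$, which is an idealisation not stated in the text. The function names are hypothetical.

```python
import math


def reduced_angle_currents(alpha, i_max=1.0):
    """Return (I_dc, I_1) for a conduction angle alpha in radians (0 < alpha <= 2*pi)."""
    denom = 2.0 * math.pi * (1.0 - math.cos(alpha / 2.0))
    i_dc = i_max * (2.0 * math.sin(alpha / 2.0) - alpha * math.cos(alpha / 2.0)) / denom
    i_1 = i_max * (alpha - math.sin(alpha)) / denom
    return i_dc, i_1


def ideal_efficiency(alpha):
    """Ideal P_1 / P_dc, ASSUMING the fundamental voltage amplitude equals V_dc."""
    i_dc, i_1 = reduced_angle_currents(alpha)
    # P_1 = (V_dc / sqrt(2)) * (I_1 / sqrt(2)) and P_dc = V_dc * I_dc
    return i_1 / (2.0 * i_dc)


for name, alpha in [("A", 2.0 * math.pi), ("B", math.pi)]:
    i_dc, i_1 = reduced_angle_currents(alpha)
    eff = ideal_efficiency(alpha)
    print(f"Class {name}: I_dc = {i_dc:.3f}, I_1 = {i_1:.3f}, efficiency = {eff:.1%}")
```

Under this assumption the sketch reproduces the class A and class B efficiency limits of Table 1 (50% and $\pi/4 \approx 78.5\%$).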

Fig. 1 illustrates how the influence of non-linear loads on PA performance becomes more visible for higher PA classes.

The main concern is the first, or fundamental, harmonic, but the other harmonics also have an impact on the output of the PA.

The effect of the second harmonic is more noticeable in class B and above.

Higher PA classes are more strongly influenced by harmonics above the second one.

Special-purpose amplifiers operate in classes beyond C and are not considered here.



Figure 1: Harmonics impact for various PA classes

It is also notable that the fundamental current remains at nearly the same level between $2\pi$ and $\pi$, which indicates that the load does not differ much between classes A and B. This eases the problem of mismatch between the optimum and the actual load.

For the best power transfer, the output power and gain should be defined over the output load, which should ideally match the PA's internal load. A load-pull technique helps to find the optimum load, which permits maximum amplification. However, because an increase in output power comes at the expense of gain, there is always a trade-off between the two.

The overall performance is defined by the efficiency, which is the ratio of the fundamental power to the consumed DC power:  $\eta = \frac{P_1}{P_{dc}}$  [5]

The fundamental power is defined by the voltage and current swings of the first harmonic in the device,  $P_1 = V_{rms} \cdot I_{rms}$ , and the DC power is  $P_{dc} = V_{dc} \cdot I_{dc}$ . [5]

A commonly used measure of efficiency is the power-added efficiency:  $PAE = \frac{P_1 - P_{IN}}{P_{dc}}$  [5]

which also accounts for the input signal power  $(P_{IN})$  and therefore depends on the gain.

The power gain should be at least 10 dB to evaluate the PAE properly [5], which requires including at least the drive stage.
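The definitions of $\eta$ and PAE can be sketched numerically as follows. The function names and the operating point (output power, input power, supply voltage and current) are hypothetical values chosen only for illustration; they are not results from this work.

```python
def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** (p_dbm / 10.0) / 1000.0


def efficiency(p_out_dbm, v_dc, i_dc):
    """Efficiency: fundamental output power over DC power."""
    return dbm_to_watt(p_out_dbm) / (v_dc * i_dc)


def pae(p_out_dbm, p_in_dbm, v_dc, i_dc):
    """Power-added efficiency: (P_1 - P_IN) / P_dc."""
    return (dbm_to_watt(p_out_dbm) - dbm_to_watt(p_in_dbm)) / (v_dc * i_dc)


# Hypothetical operating point: 16 dBm out, 8 dBm in, 1.6 V supply, 100 mA
eta = efficiency(16.0, 1.6, 0.1)
added = pae(16.0, 8.0, 1.6, 0.1)
print(f"efficiency = {eta:.1%}, PAE = {added:.1%}")
```

Since $PAE = \eta \cdot (1 - 1/G)$ for a linear power gain $G$, a gain of 10 dB already brings the PAE to 90% of $\eta$, which is one way to read the 10 dB guideline.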

Nonetheless, there is a notable difference between the theoretical example and the practical outcomes for the designed PA.

This can be explained by the frequency-dependent elements present in all components of the PA, which introduce mismatches, power dissipation, non-linearity, additional resonance points, low-frequency components and other issues.

These differences are presented in the following chapters.

### Power Amplifier Structure

In order to attain high gain, output power, and PAE, three amplifier stages are utilised.

The input, driving, and output stages are illustrated in Fig. 2.

The output stage is the most important amplifier stage: it amplifies a sufficiently high input power (7-12 dBm) to the required output power level (e.g. 14-18 dBm). The required output power is set by the expected distance between the radio transmitter and the radio receiver, so that the isotropic path loss is overcome. The design of the output PA stage largely determines the efficiency of the whole PA.

Between the input and output stages lies an intermediate stage called the driving stage. Its function is to amplify the signal from the input stage so that the output stage is driven close to its compression point. For the drive stage, the input power range is -5 to 5 dBm, while the output power range is 8 to 12 dBm. The compression point of the output stage serves as the reference for the design of the driving stage.

The input stage consists of one or more amplifiers that handle small input signals, between -20 and -15 dBm, from the transmitter's upconverting mixer. Its goal is to amplify this small input signal sufficiently to reach the 0 dBm to 5 dBm range required by the drive stage.



Figure 2: Power Amplifier Structure

Each stage handles a different amount of current and has a limited power gain. For comparison, Table 2 lists the parameters of the PA designed for 80 GHz.

| Stage        | Gain (dB) | Pout (dBm) | Compression point (dBm) | Number of transistors | Width of single transistor (µm) |
|--------------|-----------|------------|-------------------------|-----------------------|---------------------------------|
| Input stage  | 15 - 20   | 0 - 5      | -15                     | 1                     | 30                              |
| Drive stage  | 8 - 14    | 7 - 12     | 0 - 4                   | 3 - 5                 | 75                              |
| Output stage | 6 - 10    | 14 - 18    | 8 - 12                  | 8 - 10                | 75                              |

Table 2: PA stages with suggested parameters

The most challenging task is maintaining the appropriate trade-off between gain and output power while synchronising the voltage swings and the resonance point in the output matching and inter-matching at the drains of the transistors.

Only the output-stage PA has finalised results in this report, owing to the design complexity at a frequency as high as 80 GHz.

### Matching Network

Each stage is built from a different number and size of transistors.

As a result, each stage has a different input and output load.

Increasing the total width lowers the load value, as shown in Table 3.

| Stage        | Number of transistors | Width of single transistor (µm) | Input load (Ohms) | Output load (Ohms) |
|--------------|-----------------------|---------------------------------|-------------------|--------------------|
| Input stage  | 1                     | 30                              | 18 - 28           | 16 - 63            |
| Drive stage  | 3 - 5                 | 75                              | 14                | 33                 |
| Output stage | 8 - 10                | 75                              | 4                 | 24                 |

Table 3: Expected values for PA stages

The mismatch in loads between stages causes reflections, and the transferred power drops.

The reflection coefficient is expressed through the loads as  $\Gamma = \frac{Z_{load} - Z_{ref}}{Z_{load} + Z_{ref}}$  [1], which gives the relation between the reflected and incident waves.

The incident wave travels forward, while the reflected wave travels in the opposite direction.

Because of the mismatch in loads, the incident wave cannot be completely absorbed by the load and is partly reflected back, reducing the transferred power.

By bringing the reflection close to zero and thereby minimising the power loss, the matching network equalises the loads on both sides between stages, that is, the output of the previous stage and the input of the subsequent stage.
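To give a feeling for the size of the effect, the sketch below computes the reflection coefficient and the resulting mismatch loss for an unmatched interface. It assumes purely resistive loads, which is a simplification, and uses the 33 Ohm drive-stage output and 4 Ohm output-stage input from Table 3 as example values. The function names are hypothetical.

```python
import math


def reflection_coefficient(z_load, z_ref):
    """Gamma = (Z_load - Z_ref) / (Z_load + Z_ref); real or complex impedances."""
    return (z_load - z_ref) / (z_load + z_ref)


def mismatch_loss_db(gamma):
    """Power lost to reflection in dB: -10 * log10(1 - |Gamma|^2)."""
    return -10.0 * math.log10(1.0 - abs(gamma) ** 2)


# ASSUMPTION: both loads are treated as purely resistive
gamma = reflection_coefficient(4.0, 33.0)
print(f"|Gamma| = {abs(gamma):.3f}, mismatch loss = {mismatch_loss_db(gamma):.2f} dB")
```

With these values more than half of the available power would be reflected, which illustrates why inter-stage matching networks are needed.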

The matching network consists of ideally lossless components such as inductors and capacitors. Depending on the situation, a  $\Pi$  or T circuit can be used, with the LC components placed in series or in parallel, to transfer the power without loss.

Theoretically, no power drop is expected from a matching network. In practice, however, a drop of up to 0.75 dB is observed even with ideal components from the "analoglib" library.

This drop is attributed to the quality factor of the components and to the number of LC sections used in the matching network.

### Q-factor & Bandwidth

The quality factor represents the ratio of stored to dissipated energy, where reactive components such as inductors and capacitors store energy without loss and a resistor dissipates it.

 $Q = 2\pi \cdot \frac{\text{maximum instantaneous energy stored}}{\text{energy dissipated per cycle}}$  [1]

Therefore, the Q-factor is expressed as reactance over resistance or as susceptance over conductance, with  $\omega_0 = 2\pi \cdot f_0$ :

$$Q_{serial} = \frac{1}{\omega_0 \cdot C_{serial} \cdot R_{serial}} = \frac{\omega_0 \cdot L_{serial}}{R_{serial}}$$
[1]

$$Q_{parallel} = \omega_0 \cdot C_{parallel} \cdot R_{parallel} = \frac{R_{parallel}}{\omega_0 \cdot L_{parallel}}$$
[1]
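A minimal sketch of these four expressions is given below. The component values (a 50 pH inductor with 1 Ohm series resistance at 80 GHz) are hypothetical and serve only to show the order of magnitude; the function names are likewise assumptions.

```python
import math


def q_series_inductor(f0, l_series, r_series):
    """Q of an inductor with series loss: omega_0 * L / R."""
    return 2.0 * math.pi * f0 * l_series / r_series


def q_series_capacitor(f0, c_series, r_series):
    """Q of a capacitor with series loss: 1 / (omega_0 * C * R)."""
    return 1.0 / (2.0 * math.pi * f0 * c_series * r_series)


def q_parallel_capacitor(f0, c_parallel, r_parallel):
    """Q of a capacitor with parallel loss: omega_0 * C * R."""
    return 2.0 * math.pi * f0 * c_parallel * r_parallel


def q_parallel_inductor(f0, l_parallel, r_parallel):
    """Q of an inductor with parallel loss: R / (omega_0 * L)."""
    return r_parallel / (2.0 * math.pi * f0 * l_parallel)


f0 = 80e9          # operating frequency in Hz
l_example = 50e-12  # hypothetical inductance in H
r_example = 1.0     # hypothetical series resistance in Ohm
print(f"Q = {q_series_inductor(f0, l_example, r_example):.1f}")
```

The two forms in each equation coincide only at resonance, where $\omega_0^2 \cdot L \cdot C = 1$.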

The impact of the inductor quality on the output of the PA is shown in Table 4.

|                  |      |         |        |
|------------------|------|---------|--------|
| PDK 50 3p        | 70m  | 15.5dBm | 24 Ohm |
| Customised 16 3p | 192m | 14.5dBm | 18 Ohm |

Table 4: Outcomes of Q factor comparison

The bandwidth is the frequency span ( $\Delta f$ ) around the resonance point within which the gain stays within 3 dB of its peak. The peak is located at the resonance point. A 3 dB drop from the peak means that the power has fallen to half of its maximum value.

The bandwidth can be expressed as  $BW_{3dB} = 2\pi\Delta f = \frac{1}{RC}$  [1],

or through its ratio to the Q factor,  $\frac{\omega_0}{BW} = \omega_0 \cdot CR = Q$  [1].

Consequently, the quality of the components used influences the bandwidth.

Therefore, the main concern is the quality of the capacitors, inductors and nets through which the signal flows.
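The relation between Q and bandwidth can be sketched as follows. The Q values of 10 and 17 are the thresholds quoted in the abstract for nets and components; evaluating them as loaded Q values of a single resonator at 80 GHz is an assumption made purely for illustration, and the function names are hypothetical.

```python
import math


def bandwidth_from_rc_hz(r_ohm, c_farad):
    """3 dB bandwidth in Hz from BW = 2*pi*delta_f = 1 / (R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def bandwidth_from_q_hz(f0_hz, q):
    """3 dB bandwidth in Hz from omega_0 / BW = Q, i.e. delta_f = f0 / Q."""
    return f0_hz / q


f0 = 80e9
for q in (10.0, 17.0):
    print(f"Q = {q:4.1f}: bandwidth = {bandwidth_from_q_hz(f0, q) / 1e9:.2f} GHz")
```

A higher Q therefore means lower loss but also a narrower band around the resonance point.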

#### CP and IIP3

The compression point (CP) is located where the actual, non-linear output power falls 1 dB below the extrapolated 1:1 slope of the fundamental gain.

Third-order intermodulation (IM) products are the result of the nonlinear behavior of an amplifier, generated by frequency-dependent components (e.g. Cg, Cgd, Cgs, Cds).

As shown in Fig. 3, an intercept point is the intersection between the extrapolated 1:1 slope of the fundamental gain and the 3:1 slope of the third-order IM products. [5]

The parameters of interest are IP1dB, which defines the input power limit, OP1dB, which gives the peak output power, and IIP3, the input power at which the extrapolated  $IM_3$  product reaches the fundamental power of the carrier signal with data.



Figure 3: Fundamental Gain & IM<sub>3</sub>
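The intercept point can also be estimated from a single two-tone measurement instead of a graphical extrapolation. The sketch below assumes ideal 1:1 and 3:1 slopes and a measurement taken well below compression; the function name and the power levels are hypothetical.

```python
def intercept_points_dbm(pin_dbm, pfund_dbm, pim3_dbm):
    """Estimate (IIP3, OIP3) in dBm from one two-tone measurement.

    With ideal 1:1 and 3:1 slopes, the gap between the fundamental and the
    IM3 product closes by 2 dB per 1 dB of input power, so the two lines
    meet (gap / 2) dB above the measured point.
    """
    gap_db = pfund_dbm - pim3_dbm
    iip3 = pin_dbm + gap_db / 2.0
    oip3 = pfund_dbm + gap_db / 2.0
    return iip3, oip3

# Assumed example: -10 dBm per tone in, 0 dBm fundamental out, -40 dBm IM3 out
iip3, oip3 = intercept_points_dbm(-10.0, 0.0, -40.0)
print(iip3, oip3)
```

The difference between the two returned values equals the gain of the assumed example (10dB).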

#### Stability

Stability analysis is required to determine whether the amplifier is unconditionally or conditionally stable.

If the input and output resistances are positive or, equivalently, the magnitudes of the input and output reflections are less than 1 for all passive source and load terminations, then the two-port network is unconditionally stable.

$$\Gamma_{in} = S_{11} + S_{12} \cdot S_{21} \cdot \frac{\Gamma_{load}}{1 - S_{22} \cdot \Gamma_{load}}$$

$$\Gamma_{out} = S_{22} + S_{12} \cdot S_{21} \cdot \frac{\Gamma_{source}}{1 - S_{11} \cdot \Gamma_{source}}$$
[1]

The first alternative check is $K_f > 1$ and $|\Delta| < 1$, where $S_{11}$ and $S_{22}$ refer to the input and output reflections:

$$K_f = \frac{1 - |S_{11}|^2 - |S_{22}|^2 + |\Delta|^2}{2|S_{12} \cdot S_{21}|}$$
[1]

$$\Delta = S_{11} \cdot S_{22} - S_{12} \cdot S_{21} \tag{1}$$

The second alternative check is to ensure that the $\mu$ and $\mu'$ factors are greater than 1.

The $\mu$-factor refers to the load stability:

$$\mu = \frac{1 - |S_{11}|^2}{|S_{22} - S_{11}^* \cdot \Delta| + |S_{21} \cdot S_{12}|}$$

The $\mu'$-factor refers to the source stability:

$$\mu' = \frac{1 - |S_{22}|^2}{|S_{11} - S_{22}^* \cdot \Delta| + |S_{21} \cdot S_{12}|}$$

In the conditionally stable case, the magnitude of the input reflection, the output reflection, or both exceeds 1 for some terminations. The stable regions can be found by plotting the stability circle on the Smith chart from its centre and radius; the circle is the boundary on which the reflection magnitude equals unity in the $\Gamma$-plane.

The stability circle may or may not cross the region of the Smith chart where the reflection magnitude is lower than 1. The location of the stable region depends on the reflection value.

Radii of the input ($r_s$) and output ($r_l$) stability circles:

$$r_s = \frac{|S_{12} \cdot S_{21}|}{\left||S_{11}|^2 - |\Delta|^2\right|} \qquad r_l = \frac{|S_{12} \cdot S_{21}|}{\left||S_{22}|^2 - |\Delta|^2\right|}$$
[1]

Centres of the input ($\Gamma_{so}$) and output ($\Gamma_{lo}$) stability circles:

$$\Gamma_{so} = \frac{S_{11}^* - \Delta^* \cdot S_{22}}{|S_{11}|^2 - |\Delta|^2} \qquad \Gamma_{lo} = \frac{S_{22}^* - \Delta^* \cdot S_{11}}{|S_{22}|^2 - |\Delta|^2}$$
[1]

A quick way to decide which region to consider is to check the magnitude of the input/output S-parameter (|S11| or |S22|).

If $\Gamma_{load} = 0$ then $\Gamma_{in} = S_{11}$.

If $\Gamma_{source} = 0$ then $\Gamma_{out} = S_{22}$.

If the |S| parameter is less than 1, the stable region lies outside the stability circle on the Smith chart. If the |S| parameter is more than 1, the stable region is the part of the Smith chart that lies inside the stability circle. If there is no such overlap in the latter case, the PA is unstable and unusable.
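The stability circles and the quick |S| check can be sketched in a few lines. This is only an illustration under assumed S-parameters (the same hypothetical two-port is used for easy hand checking); the function names are not taken from any simulator.

```python
def stability_circles(s11, s12, s21, s22):
    """Return ((centre, radius) of the input circle, (centre, radius) of the output circle)."""
    s11, s12, s21, s22 = (complex(s) for s in (s11, s12, s21, s22))
    delta = s11 * s22 - s12 * s21
    den_s = abs(s11) ** 2 - abs(delta) ** 2
    den_l = abs(s22) ** 2 - abs(delta) ** 2
    r_s = abs(s12 * s21) / abs(den_s)
    r_l = abs(s12 * s21) / abs(den_l)
    c_s = (s11.conjugate() - delta.conjugate() * s22) / den_s
    c_l = (s22.conjugate() - delta.conjugate() * s11) / den_l
    return (c_s, r_s), (c_l, r_l)

def gamma_in(s11, s12, s21, s22, gamma_load):
    """Input reflection for a given load reflection coefficient."""
    return s11 + s12 * s21 * gamma_load / (1 - s22 * gamma_load)

# Assumed example S-parameters
(c_s, r_s), (c_l, r_l) = stability_circles(0.5, 0.1, 2.0, 0.5)

# Quick check: a matched load (Gamma_load = 0) gives Gamma_in = S11
origin_stable = abs(gamma_in(0.5, 0.1, 2.0, 0.5, 0.0)) < 1
# The output circle lies entirely outside the Smith chart if |centre| - radius > 1
circle_misses_chart = abs(c_l) - r_l > 1
print(c_l, r_l, origin_stable, circle_misses_chart)
```

In this assumed case the centre of the chart is stable and the output stability circle does not touch the unit circle, so every passive load is stable.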

### **Expected Problems**

The PA design has a number of issues that arose throughout the design phase and may be the reason for its subpar performance.

The output stage of the PA is the most troublesome for several reasons: the sensitivity of the outcomes (e.g. CP, matching, Q factors, the optimum load value); the need to meet several objectives at once (such as a high CP and PAE, sufficient gain and output power); the handling of large signals (such as differential voltage swings, breakdown limitations, high input power); and the ensuing issues (e.g. stability, oscillations, Q factors, early compression, power dissipation).

First of all, the output PA stage must provide the final, efficient amplification of the large input signal.

Therefore, several goals are set initially: power gain, relatively low biasing of the input transistors to achieve class AB or B operation, a high compression point, a high IIP3, high output power at resonance at the required frequency, and staying within the breakdown voltage limits. Most of these targets are related to the transconductance of the transistors at the frequency considered. The transconductance describes how much the output current of the PA changes in response to a voltage change at the gate. Gm is mainly determined by the current; therefore, it can be increased by:

- increasing the gate voltage;
- increasing the supply voltage (Vdd) of the whole output PA stage, which raises the theoretical limit of output power;
- making the width-to-length ratio of the transistor larger;
- tuning the output of the transistor to the resonance point.

However, each solution has side effects which may make the situation worse:

- increasing the gate voltage decreases the compression point, and this drop is faster than the growth of the power gain;
- increasing the supply voltage increases the power consumption, which lowers the efficiency and brings the dc differential voltages, such as Vds, Vgd and Vgs, closer to the breakdown limit;
- making the width-to-length ratio of the transistor larger increases the capacitive parasitics, decreases the input/output load values, increases the sensitivity of the transistors to resistance in nets and components, and narrows the range of inductor values that can be used for resonance.

Secondly, the power loss. A signal experiences a power loss when its energy is partially or completely dispersed throughout a component or network before it reaches its intended destination, or when mismatches cause the signal to be reflected back.

Thirdly, the existence of RLC parasitics. Parasitics, mismatches, out-of-resonance operation and low-quality-factor components can all cause power loss. But occasionally, giving up some power is necessary to guarantee that the amplifier operates correctly (e.g. stability). Additional parasitics may show up due to the configuration of components and even their position with respect to each other. C parasitics appear if layers are close to each other and/or wide in terms of area. L and R parasitics appear if a layer is long and narrow.

For reference, the resonance formula is $2\pi f_0 = 1/\sqrt{LC}$ [1]. It shows the inverse relation between L and C: for example, a parasitic capacitance of around 1pF resonates with an inductor of only about 3.5pH, which is quite small considering a W/L ratio of 10um/15um and a sufficient spacing between components of around 10um.
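The resonance relation can be used in both directions, as in the short sketch below. The function names are hypothetical; the 1pF and 3.5pH values are the ones quoted in the text.

```python
import math

def resonant_inductance(f0_hz, c_farad):
    """Inductance that resonates with c_farad at f0 (from 2*pi*f0 = 1/sqrt(L*C))."""
    w0 = 2 * math.pi * f0_hz
    return 1.0 / (w0 ** 2 * c_farad)

def resonant_frequency(l_henry, c_farad):
    """Resonance frequency in Hz of an LC pair."""
    return 1.0 / (2 * math.pi * math.sqrt(l_henry * c_farad))

l_80g = resonant_inductance(80e9, 1e-12)     # inductance for 1 pF at 80 GHz
f_res = resonant_frequency(3.5e-12, 1e-12)   # resonance of 3.5 pH with 1 pF
print(f"L = {l_80g * 1e12:.2f} pH, f0 = {f_res / 1e9:.1f} GHz")
```

A 1pF capacitance needs roughly 4pH to resonate at 80GHz, and the quoted 3.5pH places the resonance near 85GHz, inside the E-band.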

# Chapter 2 - Schematic

#### DC Part

A stacked PA, illustrated in Fig. 4, is used for the output stage.

The schematic amplifier is the ideal case, in which parasitics, the area of connections and nets, the mutual inductance, and the capacitance arising from component placement are disregarded. As previously mentioned, RLC parasitics cause both power dissipation and gain loss.

To compensate for these later losses, the schematic amplifier should therefore target higher values. For the final power amplifier, the maximum output power is between 14dBm and 16dBm and the maximum gain is between 4dB and 6dB.

The targets of the schematic case are an output power of 16–18 dBm and a minimum gain of 6–9 dB.

The first step is to look at the transfer characteristics, illustrated in Fig. 5, to determine the size and number of transistors needed to reach the desired output power limit.

The maximum desired output power is 18dBm, or 64mW, and a voltage swing of 0.5V to 0.6V is anticipated over the drain and source.

Figure 4: Stacked PA
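The conversion between dBm and milliwatts used for the output power target can be written as two small helpers. This is a minimal sketch with hypothetical function names; it shows that 18dBm corresponds to about 63mW, close to the rounded figure quoted in the text.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (p_dbm / 10.0)

def mw_to_dbm(p_mw):
    """Convert a power level from milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)

p_target_mw = dbm_to_mw(18.0)   # output power target of the schematic case
p_64mw_dbm = mw_to_dbm(64.0)    # the rounded value expressed back in dBm
print(f"{p_target_mw:.1f} mW, {p_64mw_dbm:.2f} dBm")
```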

To increase the gain and the output power, three transistors should be stacked. Consequently, a dc current between 20mA and 35mA and a supply voltage of 1.5V to 1.8V should be used.

The back gate of the 22nm FD-SOI CMOS transistor can be used to lower the threshold voltage. The threshold voltage is 250 mV when 0 V is applied to the back gate and 160 mV when 1 V is applied.

The transfer characteristic plot is illustrated in Fig. 5 and it is based on the scenario where the transistor width is 75um at minimum length and the threshold voltage is 250 mV.

When the gate voltage is near the threshold value, a single transistor produces 7mA.

In order to get saturated output power close to 18dBm, three (for Vds 0.6V) to five (for Vds 0.5V) transistors in parallel are sufficient.

The overall load is the sum of the loads of the individual transistors in the stack.

To determine the biasing values that distribute Vds equally over each transistor, apply the formula below.

$$V_{g,i} = \frac{i - 1}{m} \cdot V_{dd} + V_{gs,i}, \qquad i = 2, 3, \ldots, m$$
[2]
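The gate bias formula can be evaluated for the three-transistor stack as sketched below. The function name is hypothetical, and the Vgs values of 0.35V and 0.40V are assumptions chosen for illustration; with them the formula gives gate voltages of 0.95V and 1.6V for Vdd = 1.8V.

```python
def stack_gate_bias(vdd, m, vgs):
    """Return {i: Vg_i} for i = 2..m using Vg_i = ((i - 1) / m) * Vdd + Vgs_i.

    vgs maps the transistor index i to its assumed gate-source voltage.
    """
    return {i: (i - 1) / m * vdd + vgs[i] for i in range(2, m + 1)}

# Assumed gate-source voltages for the middle (i = 2) and output (i = 3) transistors
bias = stack_gate_bias(vdd=1.8, m=3, vgs={2: 0.35, 3: 0.40})
for i, vg in bias.items():
    print(f"Vg{i} = {vg:.2f} V")
```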

Once the HB simulation has been run, the ac Vds should be checked at high output power to ensure that the differential voltages do not exceed the breakdown voltage limit.



The differential voltages are drain-to-source, drain-to-gate and gate-to-source, and the breakdown voltage is 1.2V.

The biasing for the input transistor is defined with respect to the transfer characteristics and the wanted gain after HB analysis.

The biasing for the middle transistor defines Vds over the input transistor.

The biasing for the output transistor defines Vds over the middle and the output transistors.

The Vds over the output transistor may be set slightly lower than the others to leave a safety margin before it reaches its differential voltage limit (Case 2).

| Transistor (Vdd = 1.8V) | Case 1: V gate | Case 1: V drain to source | Case 2: V gate | Case 2: V drain to source |
|-------------------------|----------------|---------------------------|----------------|---------------------------|
| output                  | 1.6V           | 0.6V                      | 1.8V           | 0.5V                      |
| middle                  | 0.95V          | 0.6V                      | 1.1V           | 0.65V                     |
| input                   | 0.25V          | 0.6V                      | 0.25V          | 0.65V                     |

The biasing plan is given in Table 5.

Table 5: Biasing and Vds values

#### AC Part

When input/output matching networks are suitably set, additional actions can be taken.

Transistors' gates ought to be equipped with resistors and capacitors.

To prevent the RF signal from leaking into the biasing voltage sources, and vice versa, a resistor should be included at the gate of each transistor.

The capacitors serve several functions, depending on how they are connected.

The first purpose is to short the higher harmonics produced by the PA to ground. The capacitors at the gates are therefore grounded, and they are only necessary for the output and middle transistors. Because the input MN, or the MN from the preceding stage, attenuates higher harmonics, the input transistor does not require a grounded capacitor.

The capacitor at the gate of the middle and output transistors also defines the output load at each transistor output (for the CG Amp, the load is located at the drain).

Both middle transistors share the same gate capacitor. This reduces the differential voltage variation at the gate, so that the first CG Amp after the CS Amp sees a larger load and can increase the gain of the stack.

The output transistor has its own capacitor. Its purpose is to keep some voltage fluctuation at the gate so that Vds adapts to the output signal; this reduces the voltage differences, allows a high Vout and avoids the breakdown limit of 1.2V. Therefore, a high Q value is desirable for these capacitors.

The formula for the external gate capacitance value is given below.

$$C_i = \frac{C_{gs,i} + C_{gd,i} \cdot (1 + g_{m,i} \cdot R_{opt})}{(i - 1) \cdot g_{m,i} \cdot R_{opt} - 1}, \qquad i = 2, 3, \ldots, m$$
[2]

The influence of the capacitor at the gate of the transistor is illustrated in Fig. 6.
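The external gate capacitance formula can be evaluated as sketched below. All device values (Cgs, Cgd, gm, Ropt) are assumed round numbers for illustration only and do not come from the PDK transistor; the function name is hypothetical.

```python
def stack_gate_capacitance(i, cgs, cgd, gm, ropt):
    """External gate capacitor of the i-th stacked transistor (i >= 2).

    Implements Ci = (Cgs + Cgd * (1 + gm * Ropt)) / ((i - 1) * gm * Ropt - 1).
    """
    gm_r = gm * ropt
    denominator = (i - 1) * gm_r - 1
    if denominator <= 0:
        raise ValueError("(i - 1) * gm * Ropt must exceed 1 for a positive capacitance")
    return (cgs + cgd * (1 + gm_r)) / denominator

# Assumed example device values: Cgs = 50 fF, Cgd = 20 fF, gm = 100 mS, Ropt = 30 Ohm
c2 = stack_gate_capacitance(2, 50e-15, 20e-15, 0.1, 30.0)
c3 = stack_gate_capacitance(3, 50e-15, 20e-15, 0.1, 30.0)
print(f"C2 = {c2 * 1e15:.1f} fF, C3 = {c3 * 1e15:.1f} fF")
```

For identical device values, the formula gives a smaller external capacitor for transistors higher in the stack, because the denominator grows with the index i.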

The value of the capacitor at the output transistor changes the output load of the whole PA proportionally. However, its useful range is bounded: beyond a certain value the output load stops increasing and moves down into the capacitive area of the Smith chart, which only forces a pointless decrease of the output inductor.



Figure 6: Impact of grounded capacitor in the gate of transistor on the output load

Similarly, the inductance in the net between the output capacitor and the gate of the output transistor pulls the load up with a similar pattern illustrated in Fig. 7.



Figure 7: Impact of grounded inductor in the gate of transistor on the output load

The performance of the PA is determined by the value and quality of the output inductance, which functions as a resonator. Table 4 in the Q-factor section shows an example of the impact of inductor quality.

The output of the whole PA is taken at the drain of the final transistor, between the stack and the output inductor. Because the transistor output is capacitive, the inductor acts as a susceptive load: increasing its value moves the load through the capacitive area of the Smith chart, up to the original position of the transistor alone.

The Vdd and Vss nets of the PA must be shorted by a capacitor to confine the large generated signals inside the PA and avoid disturbing or damaging other components. However, this can be left as the final step.

After everything is configured, the load-pull simulation can be used to determine the ideal load position to obtain the necessary output power, power gain, and stability, and the SP simulation can be used to determine whether the PA is stable.

As mentioned before, the optimum load region may lie outside the Smith chart in case of instability. A neutralization capacitor can be used to improve the situation. It creates a negative Cgd that neutralizes the Cgd of the transistor, improving linearity at the cost of gain. The neutralization capacitor is cross-connected between the differential input and output as negative feedback, and it compensates for the reverse feedback effect (S12) created by the gate-to-drain capacitance inside the transistor.

Generally, C neutralization is useful if the PA consists of 1 or 2 transistors in the stack (a PA with 3 stacked transistors is stable enough to avoid the use of C neutralization).

### The Output PA Stage

A stacked amplifier is used for the thesis to observe the parasitics impact.

Such amplifiers have good reverse isolation and stability, and both sufficiently high gain and output power (because a higher supply voltage can be applied).

The problems lie in the number of transistors connected in parallel, the applicable inductor value, the capacitance impact, and the Vss nets needed to reach each ground point (at the source of the input transistor and at the grounded capacitors). Additionally, inter-matching between transistors plays an important role in the design of a PA above 60GHz.

Besides the influence of the components themselves, every net (such as that between a transistor and other components) and the position of components relative to one another (such as that between a transistor and a biasing resistor) also introduces changes. The input and output ports of the PA are illustrated as Rin and Rout.

Basic stacked PA configurations are illustrated in Fig. 8 and 9, which depict the main places of impact.



Figure 8: Common gate net for middle transistors [2,3,6]

Figure 9: Split gates of middle transistors

Each place of interest in the PA is described separately below, with its outcomes.

#### Output of PA

The most crucial area to be concerned with is the output load (Rout), which is found at the PA's output. The load-pull analysis can determine the ideal load, although it might not be easily attained. It is possible to adjust the resonance point and the output load across the LC circuit at the output transistor's source, gate, and drain. However, there are benefits and drawbacks to each place where LC circuits are located.



Figure 10: Case 1: Theoretical output of PA



The ideal case is shown in Fig. 10; it can be reached if the net inductance between the drain port and Lout is taken into account.

The actual inductor/balun value is expected to be low. The generated power is maximum in this case.

Some power and gain are expected to drop over Ls2, which is illustrated in Fig. 11.

Lout will be considerably larger than in the first case (e.g. Lout = 5pH for (1) and Lout = 20pH for (2)). However, the limits in terms of Gp and Pout are set by the first case.



Lg2 and Cg2, which define the output load value, are illustrated in Fig. 12.

The higher the output load, the higher Gp. However, Rout cannot be increased indefinitely: Zout eventually moves away towards its initial capacitive position (in the worst case, stability may be lost). The output load position at 80GHz and 85GHz for swept values of the LC components at the gate is depicted below, together with a list of their influence on the PA.



Lg2 impact: 1pH - 15pH (Q = 20)

Figure 13: Impact of Lg2 net on the PA performance

Cg2 impact: 200fF - 1pF

The sweep of the inductance of Lg2 and its impact on the output load are illustrated in Fig. 13.

The gain change is rapid:

- 2dB - 6dB with 1pH - 8pH
- 1dB growth after 8pH
- May lead to instability (e.g. at 11pH)

Increasing the inductance up to 7pH improves the gain while preserving the CP. However, a further increase causes both the CP and the power gain to drop (only the CP dropped in the schematic case).



The sweep of the capacitance of Cg2 and its impact on the output load are illustrated in Fig. 14.

The gain change stays within 1dB and the output load value moves along the conductance circle contour mainly.

Figure 14: Impact of Cg2 capacitor on the PA performance

Overall, the inductor Lg2 has a significant impact on the output in terms of both matching and gain.

#### Inter-Matching

An inter-matching between the output and the middle transistors can be set by the gate inductance Lg2, which is shown in Fig. 15.



Figure 15: Inter-matching between transistors using Lg2 and Ls1

However, the value of Lg2 also changes the output power, the power gain and the output matching, as has already been shown.

A serial inductor Ls1 can help to set the resonance point for the inter-matching, but increasing Ls1 lowers the output characteristic impedance, which decreases the gain.



Figure 16: Inter-matching between transistors using Lp1

As shown in Fig. 16, a parallel inductor Lp1 makes it possible to set the resonance point at a higher characteristic impedance (the opposite of Ls1), and it is slightly less dependent on the Lg2 inductor.

Additionally, the Lp1 inductor pulls the output impedance (Rout) up into the inductive area of the Smith chart. This allows the use of a higher-value output inductor (Lout), improving the Q factor.

#### Inductance Impact at the Gate of Middle Transistors

The common net "Lg1" between gates of middle transistors is illustrated in Fig. 17 & 18.

An inductance of more than 400fH in the Lg1 net shifts the inter-stage load to its maximum extent into the capacitive area of the Smith chart. The output load drops by a factor of 2 and, as a result, the gain drops drastically, as shown in Table 6.



Figure 17: Common gate net Lg1 with 2 caps Cg1



Figure 18: Common gate net Lg1 with 1 cap Cg1

Furthermore, this inductance dominates the inter-matching: no further matching approach, using either a serial or a parallel inductor, makes any difference as long as the Lg1 net is common.

A physically realistic inductance for the common net is within 10pH - 25pH.

| Case                          | Gain   |
|-------------------------------|--------|
| No inductance in Lg1 (Ideal)  | 6dB    |
| Presence of inductance in Lg1 | -6.5dB |

Table 6: Impact of Lg1 on the gain

The Lg1 net must be split to solve the problem of the inductance impact.

However, splitting still prevents maximum amplification, because the common net between the gates allowed the differential signals at the gates to cancel out, giving the maximum drain-to-source swing.

The split-gates case is illustrated in Fig. 19. It leads to the same approach as for the output transistors, using Lg1 and Cg1 to increase the gain.



The ideal setting places the characteristic impedance of the inter-matching between the middle and input transistors at the lowest resistance and as close as possible to the resonance point (ideally exactly at resonance, although this is not required). The Lg1 inductor can set this position. Cg1 is 1.4 times larger than Cg2. Ls0 should be as low as feasible.

#### Input Matching

The input matching, like the output matching, defines the input bandwidth, reflection and stability. The problem description and several approaches are described below.





Figure 21: Simple matching for the test bench



Figure 22: General matching approach



Figure 23: Additional matching way

Capacitance at the gate (Cgs, Cgd, Cg) creates mismatch for purely resistive load at the resonance. The original impedance position is located in the capacitive area on the Smith Chart, referring to Fig. 20.

As illustrated in Fig. 21, Lin is connected in parallel to the gates.

It creates the resonance, helps to define the actual input load value (Rin) as the target for the future MN, and simplifies the test bench settings.

In the general matching approach of Fig. 22:

- Rin represents the optimum load from the previous stage.
- Lmn is part of both the resonance circuit of the previous stage and the lumped-component MN.
- Cmn and Lmn transform Rin into the optimum input load of the next stage.
- The Cmn and Lmn components will have relatively low values (e.g. Cmn  $\in$  [10fF; 100fF]).

In the additional matching approach of Fig. 23:

- Rin represents the optimum load from the previous stage.
- Lmn is part of both the resonance circuit of the previous stage and the lumped-component MN.
- Lin helps to set the optimum input load, allowing the use of high-value components (e.g. Cmn  $\in$  [1pF; 10pF]).
- Cmn and Lmn transform Rin into the optimum input load of the next stage.

The third and fourth cases are suitable for connecting the previous stage (e.g. a drive stage) to the next one (e.g. an output stage).

The second case is useful for designing a stage separately and defining its optimum input load.
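How a Cmn/Lmn pair transforms one resistance into another can be illustrated with a generic two-element L-section. The sketch below is an assumption-laden example: it uses a low-pass arrangement (shunt capacitor on the higher-resistance side, series inductor towards the lower one), which may differ from the actual topology of the figures, and the 50 Ohm and 10 Ohm values are hypothetical.

```python
import math

def lowpass_l_match(r_high, r_low, f0_hz):
    """Return (series L, shunt C) matching r_high down to r_low at f0.

    Assumed topology: shunt C across r_high, followed by a series L.
    """
    if r_high <= r_low:
        raise ValueError("r_high must be larger than r_low")
    w0 = 2 * math.pi * f0_hz
    q = math.sqrt(r_high / r_low - 1)   # loaded Q of the L-section
    l_series = q * r_low / w0
    c_shunt = q / (w0 * r_high)
    return l_series, c_shunt

def matched_impedance(r_high, l_series, c_shunt, f0_hz):
    """Impedance seen through the series L into (shunt C parallel to r_high)."""
    w0 = 2 * math.pi * f0_hz
    z_parallel = 1 / (1 / r_high + 1j * w0 * c_shunt)
    return z_parallel + 1j * w0 * l_series

# Assumed example: transform 50 Ohm into 10 Ohm at 80 GHz
l_mn, c_mn = lowpass_l_match(50.0, 10.0, 80e9)
z_seen = matched_impedance(50.0, l_mn, c_mn, 80e9)
print(f"Lmn = {l_mn * 1e12:.1f} pH, Cmn = {c_mn * 1e15:.1f} fF, Z = {z_seen:.3f}")
```

For this assumed case the capacitor comes out near 80fF, which falls inside the low-value range quoted for Cmn in the general matching approach.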

### Schematic Test Bench with Results

The list of used simulations is:

- DC for the biasing, voltage spreads (e.g. Vds) and drops;
- SP for the input/inter/output matching, the stability (Kf, delta, mu, mu-prime), the gain (S21, S12, h21, Gmax, Gmsg), the noise figure;
- HB for the compression point, the transconductance, the power gain, PAE, a load-pull, a voltage, HBSP, HB-tstat;
- Transient for the long-term signal observation.

The test bench of the main circuit with the biasing circuit is illustrated in Fig. 24.



Figure 24: Schematic test bench of the PA

The schematic results are Gain = 12dB, IP1dB = 9.48dBm and OP1dB = 20.5dBm. These values are deliberately high in order to compensate for the later power dissipation in non-ideal components with relatively low Q factor (e.g. capacitors).
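The three reported numbers can be cross-checked against each other: at the compression point the gain is 1dB below its small-signal value, so OP1dB = IP1dB + Gain - 1dB. The sketch below assumes that the quoted gain is the small-signal gain; the function name is hypothetical.

```python
def op1db_dbm(ip1db_dbm, small_signal_gain_db):
    """Output-referred 1 dB compression point from the input-referred one."""
    return ip1db_dbm + small_signal_gain_db - 1.0

# Schematic results quoted in the text
op1db_schematic = op1db_dbm(9.48, 12.0)
print(f"OP1dB = {op1db_schematic:.2f} dBm")
```

The result, 20.48dBm, agrees with the reported OP1dB of 20.5dBm.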

# Chapter 3 - Layout

### 22FDSOI MOSFET Transistor

A Super Low Threshold Voltage NFET (SLVTNFET) transistor was used within the scope of the thesis because of its ability to deliver from 5uA to 90uA with 200mV - 600mV applied to the gate, and the possibility of varying the threshold voltage from 250mV to 160mV by applying 0V - 1V to the back gate. The width of the SLVTNFET used is 75um (the upper limit is 80um).

The slvtnfet transistor has 3 metallisation options, which bring the drain, source and gate up to M1 in Fig. 25, C3 (gate at C2) in Fig. 26, and JA (gate at C1 in this case) in Fig. 27. The metallisation hierarchy is M1, M2, C1, C2, C3, C4, C5, JA, QA, QB, LB, where M1 is the lowest and LB is the highest. The C5 - LB metals are used for most connections between components and, therefore, the ports of the transistors should be able to reach at least the JA metal level.





Figure 25: PDK transistor with M1 metallisation layer



Figure 26: PDK transistor with C3 metallisation layer



Figure 27: PDK transistor with JA metallisation layer

The positions of metal layers create resistive and capacitive parasitics.

The test bench is illustrated in Fig. 28. The main quantities of interest are imY11 (port 56, connected between gate and source), imY22 (port 57, connected between drain and source) and imY12 for the capacitive contribution, and reZ22 and reZ11 for the drain-to-source and gate-to-source resistance, respectively.



Figure 28: Transistor test bench

The transistor parasitics (resistance and susceptance at 80GHz) are listed in Table 7.

| Parameter | M1 metal level | C3 metal level | JA metal level |
|-----------|----------------|----------------|----------------|
| imY11     | 57.8mS         | 64.26mS        | 61.25mS        |
| imY12     | -18.63mS       | -21.28mS       | -20.64mS       |
| imY21     | -83.53mS       | -89.32mS       | -84.91mS       |
| imY22     | 51.06mS        | 95.51mS        | 114.06mS       |
| reZ11     | 12.91Ohm       | 12.4Ohm        | 12.35Ohm       |
| reZ22     | 14.64Ohm       | 7.92Ohm        | 6.03Ohm        |

 Table 7: Y and Z parameters for M1, C3, JA metallisation layers
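The susceptances of Table 7 are easier to compare with inductor and capacitor values when converted into an equivalent capacitance, C = Im(Y) / (2 pi f). The sketch below applies this to the imY22 row at 80GHz; treating the susceptance as a single lumped capacitance is a simplifying assumption, and the function name is hypothetical.

```python
import math

def susceptance_to_capacitance(im_y_siemens, f_hz):
    """Equivalent capacitance of a purely capacitive susceptance at f_hz."""
    return im_y_siemens / (2 * math.pi * f_hz)

# imY22 values from Table 7, in siemens
im_y22 = {"M1": 51.06e-3, "C3": 95.51e-3, "JA": 114.06e-3}
c_out_ff = {name: susceptance_to_capacitance(b, 80e9) * 1e15 for name, b in im_y22.items()}
for name, c in c_out_ff.items():
    print(f"{name}: {c:.1f} fF")
```

Under this assumption the equivalent drain capacitance roughly doubles from the M1 option (about 100fF) to the JA option (about 230fF).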

The cutoff frequency is the frequency at which the short-circuit current gain $h_{21} = \frac{|Y_{21}|}{|Y_{11}|}$ reaches unity.

The maximum oscillation frequency is the frequency at which the maximum available gain

$$MAG = \frac{|Y_{21}|}{|Y_{12}|} \cdot \left(K - \sqrt{K^2 - 1}\right)$$

reaches unity.

The cutoff frequency is 387GHz and the maximum oscillation frequency is 173GHz.

The PDK transistor allows the metallisation layers of the drain and source ports to be set from M1 to JA.

Going to higher layers lowers reZ22, because of the number of vias, and increases imY22, because the drain and source via stacks are placed in parallel with a large area and a small separation between them.

The further problems of using the C3 and JA PDK transistors are:

- 1) The capacitance impact increases gradually at the drain port. It means that the inductor value needed for resonance should be lower than in the M1 case, and PDK inductors or baluns may not offer such a low inductance with a good Q factor (and coupling factor, for a balun).
- 2) The C1 metal layer is used for the gate in the JA case because it has a lower resistance (around 1.5Ohm) than the M2-C3 layers, and in an attempt to reduce the capacitive value (61mS), which had started to grow in the C3 case (64mS).

### Customized Transistor

#### Drain & Source

In the original PDK transistor, the metal layers and vias form stacks of identical size that fully cover M1 - C4. These rectangular blocks are arranged like "dominoes" in Fig. 29, and this position and shape cause the capacitance increase, which is illustrated in Fig. 30. The drain and source pins are located on the JA layer and they are connected over the C5 layer to C3-C4.

Figure 29: Close view: Drain/Source connections

Figure 30: Close view: Gate connection

Therefore, the metal layers were created manually for the drain, source and gate ports, as shown in Fig. 31.

The M2 - C5 vias are placed in a "sawtooth" shape to decrease the facing area between drain and source, at the cost of higher resistance.

M1 is covered by M2 in an attempt to limit the jump in resistance.



Figure 31: Modified Drain/Source position

As illustrated in Fig. 32, the C5 drain vias and the C5 source vias are each connected to their own common, wide horizontal C5 line, which is in turn connected to a  $\Pi$ -shaped JA line where the pins are located.



Figure 32: Modified Drain/Source common connections

In the case of stacked transistors, a horizontal JA layer can be replaced and vertical layers be extended.

#### Gate

Resistance in the gate causes a partial loss of the signal.

For the stack case, a gate connection from only one side might not be enough.

Furthermore, a connection to the gate is wanted at the C5 layer for the capacitor, or at the QB layer for the input of the PA stage.

Therefore, the core transistor should have minimum gate resistance and gate connections close to the C5 layer.

The ring connecting the gates is illustrated in Fig. 33. It is made of the C1-C3 layers, which are filled with vias to reduce resistance.



Figure 33: Modified gates circled connection

Additionally, the M2 layer is extended to bridge the narrow M1 layer (0.22um) and the extended C1 layer (0.714um), which decreases the internal resistance slightly further.

#### Single Transistor Layout

The final layout for the single transistor is illustrated in Fig. 34.



Figure 34: Modified transistor

The RC extraction of the customized transistor is compared with the PDK transistor cases in Table 8:

| Parameter | M1 metal level | C3 metal level | JA metal level | Customized transistor |
|-----------|----------------|----------------|----------------|-----------------------|
| imY11     | 57.8mS         | 64.26mS        | 61.25mS        | 61.46mS               |
| imY12     | -18.63mS       | -21.28mS       | -20.64mS       | -20.37mS              |
| imY21     | -83.53mS       | -89.32mS       | -84.91mS       | -84.57mS              |
| imY22     | 51.06mS        | 95.51mS        | 114.06mS       | 60.64mS               |
| reZ11     | 12.91Ohm       | 12.4Ohm        | 12.35Ohm       | 13.17Ohm              |
| reZ22     | 14.64Ohm       | 7.92Ohm        | 6.03Ohm        | 13.21Ohm              |

Table 8: Comparison between PDK and Modified transistors

The parameters of the customized transistor compare as follows:

- both imY11 and imY22 are close to the M1 PDK case.
- both reZ11 and reZ22 are close to the M1 PDK case.
- imY12 is lower than C3 and JA PDK cases.
- imY21 is close to the JA PDK case.
- Both ft and fmax are close to the C3 PDK case.

Overall, the results of the RC-extracted modified single transistor should be fairly similar to those of the PDK transistor schematic with M1 metallisation.

#### Trade-off between R and C Influences

The modified transistors "M1 - M2 long and C1 - C5 short" in Fig. 35 and "M1 - C3 long and C4 - C5 short" in Fig. 36 are simulated in the same test bench to compare the influence of R and C and to decide which transistor case is more suitable for further use.

Metal Hierarchy is M1 - M2 - C1 - C2 - C3 - C4 - C5 - JA - QA - QB – LB, starting from the lowest metal layer M1 to the highest layer LB.

Performance comparison between different via lengths for the drain/source extension:

(M1 has the same size as M2 and it is hidden for the picture clarity)



Figure 35: M1 - M2 long and C1 - C5 short

Outcome results (M1 - M2 long and C1 - C5 short):

- Gain = 10.4 dB
- CP = 10 dBm
- Pout = 19.53dBm
- Lout = 13.4pH

Outcome results (M1 - C3 long and C4 - C5 short):

- Gain = 8.44 dB
- CP = 9dBm
- Pout = 16.49dBm
- Lout = 9.6 pH

The M1 - C3 long and C4 - C5 short case has lower results:

- Introduces more capacitance but less resistance;
- Requires a smaller output inductor for the resonance position;
- The output power and the gain dropped.

The M1 - M2 long and C1 - C5 short case has better results:

- Introduces more resistance but less capacitance;
- A larger output inductor can be used, which improves the Q factor;
- The output power, the gain and the compression point have sufficient values to proceed to the stacked transistors layout creation.

### Stacked 10 Modified Transistors

According to simulation results, 10 transistors connected in parallel make it possible to overcome the drop in transconductance due to parasitics and to reach sufficient gain, compression point and output power. The gates are connected to each other at the C1-C3 and QB layers, as shown in Fig. 37. The C1-C3 layers are filled with vias and surround each transistor to minimise the resistance and to distribute the signal equally among all transistors.

Fig. 38 illustrates how the drains and sources are connected: a vertical JA layer joins 5 transistors, and the global drain and source ports are formed by connecting the two groups of 5 transistors in parallel using the JA layer.



Figure 37: Gate connection M1 - QB

Figure 38: Drain & Source connections M1 - JA

Figure 39: 10 stacked modified transistors

The final view of 10 stacked transistors is illustrated in Fig. 39.

The schematic and post-layout results are compared below:

- Schematic view:
  - Gain = 12dB
  - IP1dB = 9.48dBm
  - OP1dB = 20.5dBm
- RC extraction view:
  - Gain = 10.4dB
  - IP1dB = 10dBm
  - OP1dB = 19.53dBm

RC extraction was used for all further simulations.

### Modified Capacitors & Nets

It was found that the PDK APMOM capacitors have a problem with their Q-factor.

The reason is that PDK APMOM capacitors use thin metal fingers with the minimum allowed width (e.g. 44nm), which causes a higher resistance.

Additionally, if the fingers (the "brush") are too long (e.g. >5um), self-resonance may occur, or the capacitor may behave as an inductor because of the inductance of the nets.

Therefore, a custom capacitor with wider nets (0.2um) was created for the output stage. It is illustrated in Fig. 41, with its schematic view in Fig. 40.



Figure 40: Schematic core of custom capacitor

The bulk pin degrades the reactance by only 0.1 Ohm. The PDK and custom capacitors are compared in Table 9.

| Width x Length x Net Width          | Z11 real (Ohm) | Z11 imag (Ohm) |
|-------------------------------------|----------------|----------------|
| PDK APMOM (20um x 7.5um x 0.044um): |                |                |
| schematic                           | 337.7m         | -2             |
| RC                                  | 3.159          | -1.477         |
| EMX                                 | 3.4            | -2.96          |
| Custom (20um x 10um x 0.2um):       |                |                |
| EMX                                 | 324.8m         | -1.75          |

Table 9: PDK and custom capacitors comparison
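The Q-factor difference behind Table 9 can be made explicit with the usual one-port definition Q = |Im(Z11)| / Re(Z11). The sketch below is illustrative only: it assumes the Z11 entries are in ohms (the table does not state the unit), and the function and variable names are hypothetical.

```python
def capacitor_q(z11_real_ohm: float, z11_imag_ohm: float) -> float:
    """Quality factor of a one-port capacitor: Q = |Im(Z11)| / Re(Z11)."""
    return abs(z11_imag_ohm) / z11_real_ohm


# (Re, Im) of Z11 copied from Table 9, assumed to be in ohms.
table_9 = {
    "PDK schematic": (0.3377, -2.0),
    "PDK RC": (3.159, -1.477),
    "PDK EMX": (3.4, -2.96),
    "Custom EMX": (0.3248, -1.75),
}

q_values = {name: capacitor_q(re, im) for name, (re, im) in table_9.items()}
for name, q in q_values.items():
    print(f"{name:14s} Q = {q:.2f}")
```

Under these assumptions the extracted PDK capacitor ends up with a Q below 1, while the EMX-extracted custom capacitor keeps a Q of roughly 5, close to the schematic value of the PDK device.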

Three custom capacitors were created: 700fF and 1.1pF for the PA, and 68.5fF for the output MN.

The RC extraction of the PDK APMOM capacitor and the custom capacitor are used in the PA design, which at this stage is based only on the RC extraction of the transistors.

The impact of the PDK APMOM and custom capacitors on the gain is shown in Table 10.

| Usage of            | Gain (dB) |
|---------------------|-----------|
| PDK APMOM capacitor | 4         |
| Modified capacitor  | 10        |

Table 10: Impact of PDK and custom capacitors on the PA gain

Custom capacitors are used instead of PDK APMOM capacitors on the signal path (e.g. in the matching network, or as grounded capacitors at the gates of transistors).

Custom nets are used to set a particular inductance and Q factor, to avoid an accidental shift of the resonance point for the inter-matching between transistors.

Both the schematic and layout views are illustrated in Fig. 42 and Fig. 43.







Figures 42 and 43: Schematic and layout views of the custom net



The net consists of the stacked top layers QA and QB, which give a low resistance (170mOhm) and a specific inductance (3.2pH and 3.8pH), and of C5 - QB vias to reach the capacitor. The achieved Q factor is 10.

Using a single layer, a low-level layer (e.g. C5), or a net completely filled with vias raises the resistance to 200mOhm - 400mOhm. This degrades the PA performance through an additional power drop.

The output net affects both the value of the output inductor for the output matching and the power dissipation. While the other nets between transistors only need to be wide enough to have low resistance and inductance, the output net must also fit the drain of the output transistor, the output inductor and the following output MN. The drain pin of the transistor is located in the JA layer. The output MN and the output inductor can be connected over the QB layer.

Table 11 shows the layout views of the output nets and Table 12 lists their parameters for the trade-off between the lowest resistance and inductance.

| Description                     | Common Layout View | 3D Layout View |
|---------------------------------|--------------------|----------------|
| 1 JA-QA-QB surrounded by vias   | (image)            | (image)        |
| 2 JA-QA-QB fulfil vias          | (image)            | (image)        |
| 3 QA-QB surrounded by vias      | (image)            | (image)        |
| 4 JA-QA-QB surrounded vias      |                    |                |
| 5 JA-QB without QA vias at pins | (image)            | (image)        |



Table 11: Output net views

| Case                            | C (F) | L (H) | Q    | R (Ohm) |
|---------------------------------|-------|-------|------|---------|
| 1 JA-QA-QB surrounded by vias   | 2.36p | 1.62p | 7.14 | 115m    |
| 2 JA-QA-QB fulfil vias          | 2.38p | 1.6p  | 6.96 | 116.8m  |
| 3 QA-QB surrounded by vias      | 2.1p  | 1.8p  | 7.07 | 129.7m  |
| 4 JA-QB surrounded by vias      | 2.3p  | 1.67p | 7.56 | 111.8m  |
| 5 JA-QB without QA vias at pins | 2.3p  | 1.67p | 7.5  | 112.7m  |
| 6 only QB                       | 1.7p  | 2.25p | 7.59 | 149.8m  |
| 7 JA-QA-QB rectangle            | 4.59p | 810f  | 3.96 | 102.6m  |
| 8 JA-QA-QB leaned rectangle     | 4.7p  | 788f  | 3.93 | 100.6m  |
| 9 JA-QA-QB leaned x2 rectangle  | 2.75p | 1.41p | 6.98 | 101.5m  |

Table 12: Output nets parameters
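The Q values in Table 12 are consistent with the series-model relation Q = 2πfL / R. The sketch below checks a few rows; the 80GHz evaluation frequency is an assumption (it is the design frequency, but the table does not state it), and all names are illustrative.

```python
import math

F0_HZ = 80e9  # assumed evaluation frequency (the PA design frequency)


def inductor_q(l_henry: float, r_ohm: float, f_hz: float = F0_HZ) -> float:
    """Series-model quality factor Q = 2*pi*f*L / R."""
    return 2 * math.pi * f_hz * l_henry / r_ohm


# (L in henry, R in ohm, Q reported in Table 12) for a few cases.
cases = {
    1: (1.62e-12, 0.115, 7.14),
    6: (2.25e-12, 0.1498, 7.59),
    7: (810e-15, 0.1026, 3.96),
    9: (1.41e-12, 0.1015, 6.98),
}

for case, (l, r, q_reported) in cases.items():
    print(f"case {case}: Q = {inductor_q(l, r):.2f} (table: {q_reported})")
```

With this assumption the computed values agree with the table to within 0.1, which shows how directly the trade-off is driven by the L/R ratio: cases 7 and 8 have the lowest resistance but also roughly half the inductance, so their Q is about half that of the other cases.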

The 9th case, "JA-QA-QB leaned x2 rectangle", is large enough to attach a transmission line with room for spacing variations, as in cases 1-6, but has lower resistance and inductance.

The net between output and middle transistors with the pin to a transmission line is shown in Fig. 44 & 45:



Figure 44: Common Layout View



Figure 45: 3D Layout View

#### Cascade

The cascade consists of 3 transistor blocks, each made of 10 sub-transistors (in the layout) connected in parallel (effectively increasing the width by a factor of 10). The blocks are connected to one another via the QA and QB layers. Every net has an EMX extraction and is prepared so that transmission lines can be attached later without additions.

Mxlsv resistors are used to separate pins for future components. The pins at the drains of the transistors are separated from one another to avoid LVS short-circuit errors. The impedance for the matching/inter-matching is observed in the testbench through the pins at the drains. It is possible to attach ports directly inside the instance schematic, but then it becomes unclear where the resonance point is, and LVS errors about undefined instances appear. The output net (TLout) and the inter-net between the output and middle transistors, with its pin to the transmission line (Lm), are designed to have as low a parasitic impact as possible.

The schematic view of the cascade is shown in Fig. 46.

The cascade layout contains the RC extraction of stacked transistors and EMX extractions of modified capacitors and nets from previous sections. The final layout of the cascade is illustrated in Fig. 47.



Pins are placed at the most important locations, so that further components can later be connected directly without additions, and so that external ports can be attached to observe the matching results without interrupting the process, while still passing LVS.

The post-layout simulation results are given further below.



Figure 47: Layout of cascade

#### Post Layout Results

All EMX extractions of the components are used in the test bench shown in Fig. 48. Each component and net changes the results; they can be switched individually in the "Virtuoso Hierarchy Editor" (config) to examine the influence of each component separately.

The results are expected to change only slightly after the common layout is created.



The post-layout simulation results for the stacked transistors only and for the entire cascade with EMX extracted components are quite similar:

- RC extraction of stacked transistors: Gain = 10.4dB; CP = 10dBm; Pout = 19.53dBm
- Cascade: Gain = 9.1dB; CP = 10.91dBm; Pout = 19dBm

Also, as shown in Table 13, the differential voltage swings from drain to source, drain to gate and gate to source do not exceed the breakdown limit.

| V peak | Top   | Middle | Bot   |
|--------|-------|--------|-------|
| Vds    | 930mV | 817mV  | 815mV |
| Vgs    | 695mV | 582mV  | 473mV |
| Vdg    | 584mV | 571mV  | 574mV |

Table 13: Differential voltages spread in the cascade

However, several issues are observed as well:

Firstly, the values of the output inductor and the inter-matching inductor have decreased significantly and are almost at the lower limit of what is physically realisable. This is caused by the inductance of the nets and by the pin spacing needed to avoid overlapping the bulk metallisation layers of the transmission lines. A further decrease of about 0.7pH is still possible by reducing the spacing in the transmission lines from 5um to 2.41um.

Secondly, the voltage swings are out of phase and do not cross Vth or 0, as shown in Fig. 49. This points to a problem with the efficiency of the PA, since the amplifier is constantly on.

The phase problem is linked to the values of the LC components at the gates of both the output and middle transistors.

As mentioned in the 2nd chapter, the LC value at the gate of a transistor affects both the power gain, by changing the output load, and the output matching at the drain of the transistor.

Thirdly, a low frequency component is present at the output of the cascade. The low frequency components are illustrated in Fig. 50.

The low frequency component peaks at 36GHz, with further harmonics at 72GHz and 101GHz. This problem was detected only in the transient simulation, by observing the output signal over 100ns.
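A spectrum computed from a transient waveform resolves frequencies in steps of 1/T, i.e. 10MHz for a 100ns observation window, and is not restricted to a predefined set of tones. The sketch below illustrates the idea with a single-bin DFT on a synthetic waveform. The waveform is a stand-in, not simulation data: the 400GS/s sample rate and the amplitude of the 36GHz component are assumptions chosen for illustration.

```python
import cmath
import math


def tone_amplitude(samples, fs_hz, f_hz):
    """Amplitude of the component at f_hz via a single-bin DFT."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * f_hz * k / fs_hz)
              for k, x in enumerate(samples))
    return 2 * abs(acc) / n


# Synthetic stand-in for a transient output waveform (NOT simulation data):
# an 80 GHz carrier plus a weaker, hypothetical 36 GHz component.
FS = 400e9        # sample rate, assumed
T_OBS = 100e-9    # observation window, as in the text
N = round(FS * T_OBS)
wave = [1.0 * math.sin(2 * math.pi * 80e9 * k / FS)
        + 0.2 * math.sin(2 * math.pi * 36e9 * k / FS)
        for k in range(N)]

resolution_hz = 1 / T_OBS                # frequency resolution of the window
amp_80 = tone_amplitude(wave, FS, 80e9)
amp_36 = tone_amplitude(wave, FS, 36e9)
amp_50 = tone_amplitude(wave, FS, 50e9)  # a frequency with no tone

print(f"resolution: {resolution_hz / 1e6:.0f} MHz")
print(f"80 GHz: {amp_80:.3f}, 36 GHz: {amp_36:.3f}, 50 GHz: {amp_50:.3f}")
```

Because the window contains a whole number of cycles of each tone, both amplitudes are recovered without leakage. With real simulator output, a window function and a full FFT would normally be used instead of probing individual frequencies.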



Figure 49: Vds and output swings in the cascade



Figure 50: Spectrum of the output signal

Furthermore, as illustrated in Fig. 51, HB analysis is not able to spot this problem because it only considers predefined frequencies, even with the transient-aided option run for a long time.



Figure 51: HB-tstat and transient spectrums of output signal

It is possible that a load-pull effect accidentally occurred at 40GHz at the output of the middle transistor, where the inductors of the 80GHz inter-matching network are located. (Here, load-pull effect means a dramatic increase of the signal power compared to expectations.)

The place where the lower frequency component is generated in the cascade was located using the long-term transient simulation and is illustrated in Fig. 52.



Figure 52: The low frequency component spotting

The inter-matching between transistors is set by both the LC components at the gates of the transistors and a differentially connected inductor between the cascades. The green circle marks the place where this inductor is attached.

As can be observed, the low frequency component is generated at the location of the differentially connected inductor (the green plot), while the matching is set for 80GHz resonance at the drain of each transistor.

## Output PA Stage

The PA layout is based on the cascade layout, which includes all main parasitics (R, L, C) in the nets. All attachments, such as capacitors, the nets connecting the capacitors to the cascade, transmission lines, and supply and ground nets, are extracted separately. Each component is connected without additions to avoid faulty results.

A weak spot of these results is that some mutual capacitance between components is not included. The transmission lines, used as almost ideal inductors, set the output resonance point and form part of the inter-matching without power dissipation.

The results for the cascade only and for the PA without and with the short Vss nets are shown in Table 14:

| Case                         | Gain (dB) | CP (dBm) | Pout (dBm) |
|------------------------------|-----------|----------|------------|
| Cascade only                 | 9.1       | 10.91    | 19         |
| Cascade (with LC components) | 7.6       | 10.87    | 17.55      |
| PA without Vss               | 7.24      | 11.23    | 17.48      |
| PA with 2 short Vss          | 5.93      | 11.6     | 16.32      |

Table 14: Comparison of cascade and PA post-layout results

The Vss net has a significant impact on the performance of the PA:

- Inductance at the source of the input transistor degrades the gain.
- Extra inductance at the gate of the middle transistor detunes the inter-matching between the middle and input transistors. The situation returns to the case with inductance in the common net between the gates of the middle transistors. The outcome is a significant gain drop.
- Extra inductance at the gate of the output transistor detunes the output matching. This shift also lowers the output power, since the capacitors and their nets had already been set to the optimum trade-off between maximum gain and output power.

PA layouts with different ground nets shorted to the supply nets are illustrated below.

Three cases of Vss net position, with a fixed Vdd net and large shorting capacitors, are illustrated in Fig. 53, 54 and 55 respectively.

The Vss net is drawn as a pink line on the LB metallisation layer.

The HB results of the 3 cases are shown in Table 15.

| PA EMX case | Gain (dB) | CP (dBm) | Pout (dBm) |
|-------------|-----------|----------|------------|
| 1 (Fig. 53) | 5.9       | 11.6     | 16.3       |
| 2 (Fig. 54) | -5        | 17.33    | 11.33      |
| 3 (Fig. 55) | 4.43      | 11.25    | 14.68      |

Table 15: Vss influence on the PA performance

Case 1, with the two split Vss nets, is used for the output MN and RF pad tests for better comparison.



Figure 53: Case 1 - Split Vss nets



Figure 54: Case 2 - Long Vss nets



Figure 55: Case 3 - Long Vss nets (Fully shorted with Vdd)

#### Post Layout Results

#### Stability 1G-90GHz

Stability factors such as the K-factor, mu and mu-prime are based on the S-parameters.

SP simulation is a small-signal simulation, applicable while the signal has only a minor impact on the linearity.

The SP results start to differ from the HBSP results above 5dBm input power, where a reflection shift of 0.1 into the inductive area appears at the inner part of the curve.

The HBSP plot of S11 and S22 in Fig. 56 shows that both the input and output loads of the PA are located inside the Smith chart from 1GHz to 100GHz at 10dBm input power.



Figure 56: HBSP - S11 and S22 (Smith Chart) at 10dBm input power

Returning to the problem with the low frequency component, the signal at 30G-40GHz is located far from the center of the Smith chart, meaning that it is expected to be attenuated by reflections. However, the unexpected amplification at 30G-40GHz can be explained by the load-pull effect.

As can be seen in Fig. 57, both S22 and S11 are located inside the Smith chart, which is the main indication of stability. The frequency sweep is from 100MHz to 100GHz.

The HBSP results below are shown at 10dBm input power, close to the compression point. The resonance point is close enough to the center of the Smith chart to have a low reflection impact.



Figure 57: HBSP - S11 and S22 at 10dBm input power (no peak within 30G-40GHz)

As mentioned before, the main check is that the PA is unconditionally stable if the K factor is greater than 1 and the magnitude of delta is lower than 1.
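The check can be written down directly from the two-port S-parameters. The sketch below uses the standard textbook definitions of the Rollett K factor, delta, and the mu and mu-prime factors; the S-parameter values are hypothetical single-frequency numbers chosen for illustration, not results from the PA.

```python
def stability_factors(s11: complex, s12: complex, s21: complex, s22: complex):
    """Return (K, |delta|, mu, mu_prime) for one frequency point."""
    delta = s11 * s22 - s12 * s21
    loop = abs(s12 * s21)
    k = (1 - abs(s11) ** 2 - abs(s22) ** 2 + abs(delta) ** 2) / (2 * loop)
    mu = (1 - abs(s11) ** 2) / (abs(s22 - delta * s11.conjugate()) + loop)
    mu_prime = (1 - abs(s22) ** 2) / (abs(s11 - delta * s22.conjugate()) + loop)
    return k, abs(delta), mu, mu_prime


def unconditionally_stable(s11, s12, s21, s22) -> bool:
    """Rollett test: K > 1 and |delta| < 1."""
    k, delta_mag, _, _ = stability_factors(s11, s12, s21, s22)
    return k > 1 and delta_mag < 1


# Hypothetical S-parameters at a single frequency (illustrative only).
s = dict(s11=0.3 + 0.1j, s12=0.02 + 0.0j, s21=2.0 + 0.5j, s22=0.4 - 0.2j)
k, delta_mag, mu, mu_prime = stability_factors(**s)
print(f"K = {k:.2f}, |delta| = {delta_mag:.3f}, "
      f"mu = {mu:.2f}, mu' = {mu_prime:.2f}")
print("unconditionally stable:", unconditionally_stable(**s))
```

Evaluating the same function at every point of a frequency sweep gives the K, delta and mu curves; mu > 1 alone is an equivalent single-parameter condition for unconditional stability.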

The delta plotted in Fig. 58 is lower than 1. However, it has 2 minima, within 30G-40GHz and 80G-90GHz. While the 80G-90GHz range is acceptable, the 30G-40GHz range might cause problems even if the PA is stable.



For clarity, the stability factors are plotted between 25GHz and 90GHz in Fig. 59. Below 30GHz, the mu and mu-prime factors tend to unity but do not cross it, since S11 and S22 stay inside the Smith chart.

Below 30GHz, the K-factor jumps significantly due to the position of S11 and S22 at the lower frequencies [4].







The reached output load is 5 Ohm, and the overall trade-off between the gain and the output power using the LC components at the gates leads to the result illustrated in Fig. 60.

The peak output power is 16.4dBm at 11.5dBm input power. The influence of $IM_3$ is illustrated in Fig. 61, and the IIP3 is 20dBm.



#### Common Mode & Differential Mode

Mixed in/out mode was used in the SP simulation to check the common and differential modes between 1GHz and 100GHz.

This simulation shows the gain when both the input and the output are kept differential (Sdd21), or when either the output (Sdc21) or the input (Scd21) is shorted.

The differential-to-differential gain should be the same as previously achieved, and the common-to-differential and differential-to-common gains are expected to be low, to avoid oscillations that would make the PA unstable in the long-term presence of a signal.

The plot of gains is illustrated in Fig. 62.



Figure 62: Common & Differential Modes

The differential gain has the same results as previously achieved.

The differential input and the common output as well as the common input and the differential output gains are low enough (-30dB) to avoid amplification of signals at the common points in the case of mismatches.
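For reference, the forward mixed-mode terms can be derived from single-ended 4-port S-parameters. The sketch below assumes a particular port pairing (ports 1 and 3 as the input pair, ports 2 and 4 as the output pair) and the response-then-stimulus naming convention; the simulator setup used in this work may label ports and modes differently, and the numeric values are hypothetical.

```python
def mixed_mode_gains(s21, s23, s41, s43):
    """Forward mixed-mode terms from single-ended 4-port S-parameters.

    Assumed port pairing: ports 1 and 3 form the input pair, ports 2 and 4
    the output pair. Naming follows S<response><stimulus>21, e.g. Sdc21 is
    the differential output response to a common-mode input.
    """
    sdd21 = 0.5 * (s21 - s23 - s41 + s43)
    scd21 = 0.5 * (s21 - s23 + s41 - s43)
    sdc21 = 0.5 * (s21 + s23 - s41 - s43)
    scc21 = 0.5 * (s21 + s23 + s41 + s43)
    return {"Sdd21": sdd21, "Scd21": scd21, "Sdc21": sdc21, "Scc21": scc21}


# Hypothetical values: two perfectly matched halves, then a small
# gain imbalance between the two halves.
symmetric = mixed_mode_gains(s21=2.0, s23=0.0, s41=0.0, s43=2.0)
imbalanced = mixed_mode_gains(s21=2.0, s23=0.0, s41=0.0, s43=1.9)
print(symmetric)
print(imbalanced)
```

In the matched case the mode-conversion terms vanish exactly; a mismatch between the two halves is what produces non-zero Scd21 and Sdc21, which is why these terms are checked against mismatches.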

#### Oscillations Lock

An amplifier generates high-power signals, which may damage other components.

Therefore, it is necessary to lock oscillations inside the PA by using large capacitors to short the Vdd net to the Vss net.

Transient analysis was run with moderate and cautious precision settings at 11dBm of input power.



Figure 63: Oscillations in Vdd & Vss nets for 300ns

The measurements were taken between the Vdd and Vss pins of the PA, with wires of Q-factor 50 - 100 and inductance 1pH - 10pH attached to the 1.8V supply and the ground.

As seen in Fig. 63, the largest peak-to-peak amplitude of the oscillations is 6mV - 8mV, and they fade after 75ns.

#### Power Amplifier vs Cascade

The main differences between the cascade and the PA post-layout results are shown in Table 16.

|                     | Gain (dB) | CP (dBm) | Pout (dBm) |
|---------------------|-----------|----------|------------|
| Cascade only        | 9.1       | 10.91    | 19         |
| PA with 2 short Vss | 5.93      | 11.6     | 16.32      |

Table 16: Overview of the difference between cascade and PA results

The power gain drops significantly due to the impact of the Vss net, even after modifying the LC components for the PA. Fig. 64 & 65 show that the output swing over the output transistor has decreased, which is where the main gain drop occurs.

Fig. 66 & 67 show that the efficiency is expected to be even lower because of the output power drop as well.





Spectrum of the output differential signal



The peaks at 72GHz and 101GHz have decreased significantly, but the lower frequency component has shifted from 36GHz to 31GHz and is not filtered out.

The transient simulation is very useful for spotting problems (e.g. peaking, oscillations, fading), whereas the most commonly used simulations, such as SP, STB, HB, HB-tstat and HBSP, cover only a predefined range of results, which limits them.

However, some parameters may warn about the problem. The "delta" parameter, used for the conversion between S, Z, Y and H parameters and in the definition of stability, can be plotted manually using the formula $\Delta = S_{11} S_{22} - S_{12} S_{21}$ [1,4,5]. Unfortunately, the Cadence tool has no plotting option for it, as shown in Fig. 68.

Function options: SP, ZP, YP, HP, GD, VSWR, NFmin, Gmin, Rn, rn, NF, Kf, B1f, GT, GA, GP, Gmax, Gmsg, Gumx, ZM, NC, GAC, GPC, LSB, SSB, Mu, Mu_prime

Figure 68: SP/HBSP plotting options

In the IEEE articles [2,3,6,11-17] and books [4,5] used for comparison, the delta curve has a single minimum within the frequency range of interest.

Here, the delta curve drops to around 0 at the frequencies where the spectrum has noticeable peaks, as illustrated in Fig. 69.



Figure 69: Delta plot indicates the same problem as the transient simulation

The research showed that the low frequency components are mainly generated by the combination of LC values in components such as transistors, original inductors (Lout, Lp1), nets (Lg2, Lg1, Ls1, Ls0) and capacitors (Cg2, Cg1). These components are illustrated in Fig. 70.

Fig. 71 shows the spectrum of a low power signal (-60dBm) with the previously seen low frequency peaks (around 40GHz), which are lower than the original signal at 80GHz.





Figure 71: Spectrum of low power output signal



However, the EMX extraction of the Cg1 capacitor lifted the whole spectrum, gradually amplifying the existing low frequency peak and creating oscillation inside the PA.

Replacing the EMX extraction of the Cg1 capacitor (1.1pF) with another, additionally prepared EMX extracted capacitor (0.9pF) solved the oscillation problem, which was caused by the large amplitude of the low frequency component.

The impact of the faulty (1.1pF) and correct (0.9pF) EMX extractions on the output signal is illustrated in Fig. 72.



Figure 72: Impact of faulty (yellow) and correct (red) EMX extractions on the output signal

### RF pads

RF pads lead the amplified signal out of the chip to the external PCB.

The pads are made of the wide LB metal layer, which has R = 18 mOhm, C = 55 pF, L = 46 fH.

A custom series capacitor (68.5fF, 330mOhm) and a parallel inductor built from a double transmission line form the output matching network between the PA and the pads, transforming the optimum load (4.5 Ohm differentially) to the differential 100 Ohm.

The pads with the output MN are illustrated in Fig. 73.



Figure 73: RF pads with Output MN to 100 Ohm

Case 1 of the PA with short Vss nets is used to clearly see the impact of the pads and the MN on the output power. The results for the initial case, the ideal case and the case including parasitics are shown in Table 17.

|                                        | Power Gain (dB) | Output Power (dBm) | Compression Point (dBm) |
|----------------------------------------|-----------------|--------------------|-------------------------|
| Only PA                                | 5.93            | 16.32              | 11.51                   |
| Schematic MN<br>Schematic pads         | 5.56            | 16.03              | 11.47                   |
| Schematic MN<br>EMX extracted pads     | 4.61            | 14.03              | 11.42                   |
| EMX extracted MN<br>EMX extracted pads | 3.09            | 12.41              | 10.32                   |

Table 17: The influence of the output MN and RF pads

The RLC parasitics in the components, vias and nets cause a power drop even when the precise center of the Smith chart is reached using the same approach as in the ideal (schematic) cases.

## Output PA Stage with Output MN to RF pads

The output stage with the MN to the RF pads is illustrated in Fig. 74.



Characteristics of PA without RF pads:

- Power gain = 5.9 dB
- Compression point = 11.6dBm
- Output power = 16.3dBm
- IIP3 = 20dBm
- Noise figure = 4.5dB
- BW = 34GHz (53GHz - 87GHz)
- Psat = 19.7dBm

However, the gain and the output power are insufficient to overcome the power loss in the output MN and RF pads.

Characteristics of PA with RF pads:

- Power gain = 3.09 dB
- Compression point = 10.32dBm
- Output power = 12.41dBm
- IIP3 = 18dBm
- Noise figure = 4.5dB
- BW = 16.5GHz (70GHz - 86.5GHz)
- Psat = 17dBm

Figure 74: Final layout of the PA with RF pads

The original plan was to achieve an output power and gain high enough to compensate for the inevitable losses in the output MN and RF pads, as well as the power dissipation across components.

The RF pads cause a 1.2dB decrease in power, while the output MN causes a 1.14dB drop (the ideal MN has a 0.75dB drop).

The effect of the Vss net on the overall performance of the output PA is too great.

The LC components at the transistor gates in the cascade cause a considerable phase shift in the differential voltage swings between the input transistor and the remaining transistors.

The low frequency component at 30GHz is not sufficiently attenuated by the MN, while its harmonics at 62GHz and 93GHz are suppressed.

# Chapter 4 - Issues with RCLK extraction using Quantus

The first problem concerns the extraction of two identical instances.

The RCLK extraction results of two identical cascades are illustrated in Fig. 75.

Pins are attached to the nets around the transistors, and the yellow text summarises the RCLK extraction of a particular net.



Figure 75: Fault results in RCLK extraction

Although the two cascades are identical, their RCLK extraction results differ.

The total parasitic capacitance is the same for the left and right cascades, but the total parasitic inductance differs by a factor of two.

The impact of such an extraction on the post-layout simulation is:

- The result is mostly determined by the lowest value, which may be inaccurate.
- The result is more stable. The net "out" of the left cascade is extracted as 12.54pH, and further results stay close to it. The net "out" of the right cascade can be 2 or 3 times larger than 12pH, but the resulting shift is minor. This can be clearly observed on the Smith chart: the impedance stays at the resonance with the left net at 12pH and the right net at either 12pH or 24pH, but it shifts by -j1 if both nets are set to 24pH.

The second problem concerns the extracted values.

The "out" net is again taken as an example; it is illustrated with its pins in Fig. 76.



Figure 76: Original output net from the PA with low inductance

The net "out" measures 13.5um by 38.8um; the pin sizes are 10um for the transmission line connection and 33.5um for the common connection of the stacked transistors.

According to the documentation on RCL and RCLK extraction using Quantus, the preferred fraction width for frequencies higher than 50GHz is 50-100.

The RCLK extraction gives a lowest value of 12pH and a highest value of 24pH.

The EMX extraction result is 1.4pH.

The results differ by roughly a factor of 10, which is too large to continue with.

Two EMX extractions of a rectangle on the QB layer with dimensions 10um by 40um are presented below to illustrate a possible reason for the faulty result.

In the first test, the pins are placed at the maximum distance from each other, as illustrated in Fig. 77.



Figure 77: Case 1 - Pins are located on the maximum distance

The EMX extraction result is 18pH when the pins are 10um wide and 40um apart. Notably, 18pH is the mean of the 12pH and 24pH values from the RCLK extraction.

In the second test, the pins are placed in a similar position to those of the net "out", as illustrated in Fig. 78.



Figure 78: Case 2 - Pins are located on the original distance as in the output net

The EMX extraction result is 1.6pH when the pin sizes are 10um and 33.5um. The same value has already been presented in the table in the section on the output net.

Overall, the EMX extraction gives more precise results for complex connections.

# Discussion and Conclusion

## Outcomes of Output PA Stage Results

Structure of PA:

- The gates of the middle transistors are separated to avoid the gain degradation caused by the inductance of the common net (p. 25, table 6);
- A differentially connected inductor is placed between the output and middle transistors to set the inter-matching and to move the output load into the inductive area. This creates room for LC components at the gate of the output transistor, which both set the resonance point and increase the gain (pp. 23-24).

Transistors:

- The metallisation layers of the PDK transistor create a significant capacitive load, which lowers the inductor value and causes linearity problems (earlier CP and worse transconductance, for example).
- Metallisation layers for the drain, the source and the gate are rebuilt.
- The modified transistor's parameters (p. 32) are as follows:
  - both imY11 and imY22 are close to the M1 PDK case.
  - both reZ11 and reZ22 are close to the M1 PDK case.
  - imY12 is lower than C3 and JA PDK cases.
  - imY21 is close to the JA PDK case.
- The redesigned transistor sets the drain and source to JA and the gate to QB layers, improving the gain (2dB), CP (1dBm), output power (3dBm), and inductor range (extra 4pH) for the PA (p. 33).

The main adjustments were identified from the impact of inductance on the PA's overall performance.

The PDK capacitors are replaced by modified ones because of the impact of the Q factor on the results (pp. 37-39, table 11 & 12).

Cascade:

- The pins for observing the inter-matching and matching points in the test bench are located at the drains of the transistors;
- All nets have EMX extraction for a more precise definition of the RLC values;
- The output net received separate attention, because its quality affects the gain and the output power, and its inductance defines the value of the output inductor;
- The post-layout results differ only slightly between the cascade and the transistors only (p. 41);
- Generation of low frequency components was noted at the differentially connected inter-matching inductor after the transient simulation (p. 44, Fig. 52).

The output PA stage:

- The initial strategy was to reach an output power and gain high enough to overcome the unavoidable loss in the output MN and RF pads as well as the power dissipation over components.
- The differential voltage swings have a significant phase shift between the input transistor and the rest of the transistors, due to the impact of the LC components at the transistor gates in the cascade.
- The stacked power amplifier is fully stable.
- The delta curve drops to around 0 at the same frequencies (p. 54 Fig. 69) where the spectrum of the output signal shows peaks in the long-term transient simulation (between 50ns and 300ns); HB-tstat is not able to show everything.
- The delta curve indicates the signal peaks (both at the fundamental and at the lower frequency component) (p. 54 Fig. 69), while no other parameter from the SP/HB simulations (S, Kf, mu, gains, HB-tstat, HB spectrums) showed any pattern (p. 49 Fig. 59).
- The low frequency component at 40GHz was generated by an accidentally faulty EMX extraction of the 1.1pF Cg1 capacitor, and was fixed using an alternative EMX extraction of a 0.9pF Cg1 capacitor (p. 55 Fig. 72).
- The low frequency component at 30GHz is not sufficiently attenuated by the MN, but its harmonics at 62GHz and 93GHz are suppressed.
- The power drop over the output MN is 1.14dB (the ideal MN has a 0.75dB drop) and the RF pads add a 1.2dB drop (p. 56, table 17).
- The Vss net has too large an impact on the overall performance of the output PA (p. 47, table 16), causing a dramatic drop of the gain, the output power and the CP.

The development of the PA performance during the thesis is shown in Table 18.

|                                       | Rin (dif) | Rout (dif) | Gp (dB) | CP (dBm) | Pout (dBm) |
|---------------------------------------|-----------|------------|---------|----------|------------|
| Schematic                             | 4         | 15         | 12      | 9.48     | 20.5       |
| Custom transistors (RC)               | 4         | 15         | 10.7    | 9.56     | 19.53      |
| Cascade (only)                        | 2.7       | 9          | 9.1     | 10.55    | 19         |
| Cascade (with LC components)          | 2.7       | 9          | 7.6     | 10.87    | 17.55      |
| PA out (split Vss), case 1            | 2.4       | 5          | 5.9     | 11.6     | 16.3       |
| PA out (global Vss), case 3           | 2.3       | 6          | 4.43    | 11.25    | 14.68      |
| PA out with MN to pads (using case 1) | 2         | 100        | 3.09    | 10.32    | 12.41      |

Table 18: Overview of performance drop through implementations

Two input stages and one drive stage, each using an RC extraction of a customised transistor, were added in the test bench to raise the gain and check the PAE. Compared with the output stage PA, these stages handle lower input/output powers, so their PAs are smaller and operate in class A, since the overall performance is defined by the output stage PA.

The result for the whole PA chain is shown in Table 19.

|      | Without RF pads | With RF pads |
|------|-----------------|--------------|
| Gain | 33.17dB         | 30.3dB       |
| Pout | 15.73dBm        | 12.78dBm     |
| PAE  | 6.56%           | 3.11%        |

Table 19: Final power gain, output power and PAE
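To relate the PAE figures to the DC consumption, the standard definition PAE = (Pout - Pin) / Pdc can be solved for Pdc. The sketch below is an estimate under an assumption: the gain, output power and PAE of the PA chain are taken to belong to the same operating point, which this section does not state explicitly. The function names are hypothetical helpers.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def implied_dc_power_mw(gain_db, pout_dbm, pae_percent):
    """Solve PAE = (Pout - Pin) / Pdc for Pdc, with Pin = Pout - Gain (in dB)."""
    pout_mw = dbm_to_mw(pout_dbm)
    pin_mw = dbm_to_mw(pout_dbm - gain_db)
    return (pout_mw - pin_mw) / (pae_percent / 100)


# Values of the whole PA chain (gain, Pout, PAE).
pdc_no_pads = implied_dc_power_mw(33.17, 15.73, 6.56)
pdc_with_pads = implied_dc_power_mw(30.3, 12.78, 3.11)

print(f"Implied DC power without RF pads: {pdc_no_pads:.0f} mW")
print(f"Implied DC power with RF pads:    {pdc_with_pads:.0f} mW")
```

Under this assumption both cases imply a DC power on the order of 0.6W, so the low PAE is dominated by DC consumption rather than by the input drive power.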

The PAE is much lower than expected. This can be explained by the following:

- The biasing of the input transistor is too high, chosen in an attempt to increase the transconductance.
- Fig. 74 shows small voltage swing amplitudes. The DC level is 0.6V, while the mean amplitude is within 0.25V only. This means that the PA is constantly on, and the large DC current causes a high DC power consumption.
- The output swing of the input transistor has almost the opposite phase, causing the total output swing to be <sup>1</sup>/<sub>3</sub> times lower, as shown in Fig. 79.



- The first common gate amplifier (the middle transistor) creates a phase shift. A common gate amplifier is a current follower and has a positive voltage gain, so by itself it should not invert the signal. The most probable reason for the shift of the swing is the influence of the Vss net and the LC parasitics on the signal path between the transistors.
- The inter-matching using differentially connected inductors between the differential amplifiers may cause unwanted amplification of lower frequency components.
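The effect of a phase-shifted swing on the total output swing of a stack can be illustrated with a phasor sum. This is a simplified sketch, not a simulation result: it assumes three stacked transistors with equal, unit-amplitude drain-source swings, and the function `stacked_swing` is a hypothetical helper.

```python
import cmath
import math


def stacked_swing(amplitudes, phases_deg):
    """Magnitude of the sum of sinusoidal swings given as amplitude/phase pairs."""
    total = sum(
        a * cmath.exp(1j * math.radians(p)) for a, p in zip(amplitudes, phases_deg)
    )
    return abs(total)


# Assumed: three equal swings (illustrative amplitudes, not simulated values).
in_phase = stacked_swing([1.0, 1.0, 1.0], [0, 0, 0])
# The input transistor swings in opposite phase, the other two stay aligned.
opposed = stacked_swing([1.0, 1.0, 1.0], [180, 0, 0])

ratio = opposed / in_phase
print(f"in phase: {in_phase:.2f}, one opposed: {opposed:.2f}, ratio: {ratio:.2f}")
```

In this idealized case a single opposed swing cancels one of the aligned swings, so only one third of the in-phase total remains.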

However, this work shows that it is possible to reach sufficient power and gain using a stacked power amplifier structure.

During the work it was observed that:

- A large number of transistors were used and modified to compensate for the transconductance loss caused by the presence of non-linear components.
- The required inductor values are within 3pH - 10pH, and PDK baluns cannot reach such low values.
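The link between such small inductor values and the operating frequency follows from the resonance condition f = 1 / (2π√(LC)). The sketch below assumes a simple lumped parallel LC resonance at 80GHz; the capacitance values it prints are derived from that assumption and are not taken from the thesis.

```python
import math


def resonant_capacitance(l_henry, f_hz):
    """Capacitance that resonates with inductance l_henry at frequency f_hz."""
    return 1 / ((2 * math.pi * f_hz) ** 2 * l_henry)


def resonant_inductance(c_farad, f_hz):
    """Inductance that resonates with capacitance c_farad at frequency f_hz."""
    return 1 / ((2 * math.pi * f_hz) ** 2 * c_farad)


c_for_10ph = resonant_capacitance(10e-12, 80e9)
c_for_3ph = resonant_capacitance(3e-12, 80e9)
# The same capacitance at 40GHz tolerates a four times larger inductor.
l_at_40ghz = resonant_inductance(c_for_10ph, 40e9)

print(f"10 pH resonates at 80 GHz with {c_for_10ph * 1e15:.0f} fF")
print(f" 3 pH resonates at 80 GHz with {c_for_3ph * 1e15:.0f} fF")
print(f"Same capacitance at 40 GHz needs {l_at_40ghz * 1e12:.0f} pH")
```

Under this assumption, inductors of 3pH - 10pH correspond to node capacitances of several hundred femtofarads up to more than a picofarad, and halving the frequency relaxes the inductor value by a factor of four.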

LC parasitics can be used to improve the gain and the output power of the PA.

#### Comparison Results between Quantus RCLK and EMX extractions

The first result was achieved using Quantus RCLK extraction.

The result in Fig. 80 shows that the main goal, sufficient output power and efficiency, was achieved as initially targeted: the power gain of the output PA stage is around 7dB, the output power is 14.4dBm and the PAE is 14.4%.

However, it is suspicious that the surveyed IEEE articles refer to the Doherty amplifier rather than the stacked one.



Figure 80: Gain & PAE using Quantus RCLK extraction

The mentioned problem was spotted when the work on the output stage was almost finished, and the rechecking and rebuilding work was restarted from the transistor stage (literally from the beginning). The EMX extractions showed that a number of problems had gone undiscovered, related to new inductance values, the LC impact on the output load and performance, and the Q factor of components (nets and capacitors). The full range of problems encountered is described in chapters 2 and 3.

|      | Without RF pads | With RF pads |
|------|-----------------|--------------|
| Gain | 33.17dB         | 30.3dB       |
| Pout | 15.73dBm        | 12.78dBm     |
| PAE  | 6.56%           | 3.11%        |

The final results using the rebuilt output PA stage based on the EMX extractions are shown in Table 20.

The final gain, the output power and the PAE are illustrated in Fig. 81. The values are considerably lower for the PA both with and without RF pads.



Figure 81: Gain & PAE using EMX extraction for both with and without RF pads

The result using EMX extractions is notably worse than with Quantus RCLK extractions, which shows how strongly the outcome depends on the extraction method.

Table 20: Gain, Pout, PAE using EMX extraction
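The cost of the RF pads in Table 20 can be expressed as a fraction of the output power. The sketch below only converts the dB differences between the two columns of the table; the function name is a hypothetical helper.

```python
def db_drop_to_fraction_lost(drop_db):
    """Fraction of power lost for a given drop in dB."""
    return 1 - 10 ** (-drop_db / 10)


# Differences between the "Without RF pads" and "With RF pads" columns.
pout_drop_db = 15.73 - 12.78
gain_drop_db = 33.17 - 30.3
fraction_lost = db_drop_to_fraction_lost(pout_drop_db)
pae_ratio = 3.11 / 6.56

print(f"Pout drop: {pout_drop_db:.2f} dB, gain drop: {gain_drop_db:.2f} dB")
print(f"Share of output power lost: {fraction_lost * 100:.0f} %")
print(f"PAE with pads relative to without: {pae_ratio:.2f}")
```

A drop of about 3dB means that roughly half of the output power is lost on the way to the pads, which matches the PAE falling to roughly half of its value without pads.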

#### Comparison with "TX40" Power Amplifier for 40GHz

The differences with respect to the power amplifier "TX40" for 40GHz are:

- The area of the output PA stage for 80GHz is 210um x 290um, due to a large number of transistors but small inductors, while the area of the PA for 40GHz is 102um x 160um, due to a small number of transistors but a large balun.
- The output stage for 40GHz has Gain = 10dB, Pout = 14dBm and PAE = 18%, while the output PA for 80GHz has Gain = 5.9dB, Pout = 15.73dBm and PAE = 6.56%, with a further drop of the gain (3dB) and the output power (12.41dBm) over the output matching network (1.14dB drop) and the RF pad (1.2dB). Such an extreme drop in efficiency is caused both by incomplete drain-to-source swings over the transistors, which result in constant operation, and by the phase shift, which reduces the final output swing. An incomplete swing might be caused by the impact of parasitics (e.g. discharge time or additional resistance), as seen in the comparison of the cascade and the PA post-layout results in chapter 3. The phase shift is defined mainly by the LC components at the gates of the middle and output transistors, which were used to set the power gain, the output power, the matching at the output of the PA and the inter-matching between transistors.
- The TX40 PA is more flexible in this respect due to its more stable and sufficiently high transconductance, which overcomes mismatches between transistors and allows the drain-to-source swings to be in phase.

(The cascade of TX40 is extracted as RC only, which disregards thin long gate and ground nets. This may have an unwanted impact.)

- PDK components lose their Q factor as the frequency grows. Baluns are too large for the 80GHz PA, so transmission lines were used as inductors. The Q factor of APMOM capacitors degrades dramatically with frequency, and PDK capacitors in particular were causing the power drop. The APMOM capacitor is designed using the minimum width of the metal layers, which causes, firstly, resistive losses and, secondly, a dependence on orientation: a capacitor with dimensions (length x width) of 5um x 20um is capacitive, while one of 20um x 5um is rather inductive.
- It is notable that both TX40 and the PA for 80GHz have good post-layout results using RCLK Quantus extraction. However, the post-layout result is considerably worse using EMX extractions of all nets in the PA.
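The loss of capacitor Q factor with frequency can be sketched with the series model Q = 1 / (ωCR). The component values below (100fF, 1 Ohm) are hypothetical and chosen only for illustration; they are not APMOM data from the PDK.

```python
import math


def capacitor_q(c_farad, r_series_ohm, f_hz):
    """Q factor of a capacitor modelled as C in series with a resistance."""
    return 1 / (2 * math.pi * f_hz * c_farad * r_series_ohm)


# Hypothetical values for illustration only (not taken from the PDK).
C_EXAMPLE = 100e-15
R_EXAMPLE = 1.0

q_40 = capacitor_q(C_EXAMPLE, R_EXAMPLE, 40e9)
q_80 = capacitor_q(C_EXAMPLE, R_EXAMPLE, 80e9)

print(f"Q at 40 GHz: {q_40:.1f}, Q at 80 GHz: {q_80:.1f}")
```

With a constant series resistance the Q factor halves when the frequency doubles from 40GHz to 80GHz. Any increase of the resistance with frequency would reduce it further, which this simple model does not capture.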

#### Comparison with IEEE articles

The main difference with respect to the works in the IEEE articles is that efficiency enhancement techniques are used there to amplify a signal efficiently at frequencies as high as 71GHz - 76GHz and 81GHz - 86GHz. Most papers refer to the Doherty amplifier. The main idea is to combine the output power of the main and an auxiliary amplifier using a combining balun. The amplifiers generally contain 2 stacked differential amplifiers with a supply voltage within 1V - 1.5V. There are some additional problems, such as the efficiency drop due to the additional amplifier or avoiding self-resonance peaks in the design of the transformer, but this approach resolves the problem of growing parasitics.

Basic differences are:

- The DC current varies around the familiar 22mA, pointing to 14dBm - 16dBm. However, this brings back the initial question of the impact of nonlinearity on the transconductance, which the main part of the thesis report addresses. The IEEE papers do not contain information about any problems with MAG.
- According to the operating current (e.g. 22mA), the possible number of stacked transistors is 3. This number of transistors has a considerably lower total capacitance, which allows the use of a large inductor value, improving the Q factor. The circuit illustrations in the IEEE articles depict a differentially connected capacitor at the outputs of the PA, which points both to a sufficiently large inductor value and to a resonance point that is not exactly at the drain of the output transistor. In this thesis, the resonance point at the output is wanted as close to the drain as possible, because inductance in the net between the drain and the observed output port causes a gain drop. The IEEE papers leave room for simplifications/assumptions about the resonance points, up to a -10dB S parameter for matching.
- The Doherty amplifiers in the IEEE articles use neutralization capacitors to improve stability by creating a negative capacitance that cancels Cgd, which is part of the reverse gain. The power amplifier of the current work is implemented as a cascade amplifier, which is stable, so no neutralization capacitors are needed. However, it should be mentioned that a layout of the PA with neutralization capacitors was created earlier in an attempt to improve the transconductance, and the inductance of the cross-connection net between the drain of the transistor and the capacitor influences both the input matching and the inter-matching between transistors.

#### Comparison with the State of the Art

| Ref.         | Technology  | Topology | Freq<br>(GHz) | BW <sub>3dB</sub><br>(GHz) | Psat<br>(dBm) | OP1dB<br>(dBm) | PAE<br>peak<br>(%) | Area<br>(mm <sup>2</sup> ) |
|--------------|-------------|----------|---------------|----------------------------|---------------|----------------|--------------------|----------------------------|
| [11]         | 90nm CMOS   | Doherty  | 74            | 5                          | 15.4          | 11.7           | 30                 | 1.53                       |
| [12]         | 22nm FD-SOI | Cascade  | 85.2          | 14.7                       | 17.4          | 14.6           | 18                 | 0.2                        |
| [13]         | 40nm CMOS   | Doherty  | 77            | 12                         | 16            | 15.3           | 12                 | 0.1                        |
| [14]         | 22nm FD-SOI | Cascade  | 76            | 15                         | 17.8          | 13.3           | 17.3               | 0.02                       |
| [16]         | 40nm CMOS   | Cascade  | 80            | 80                         | 15            | -              | -                  | 0.31                       |
| [17]         | 130nm SiGe  | Doherty  | 75            | 20                         | 14.4          | 11.4           | 19.2               | 0.72                       |
| This<br>work | 22nm FD-SOI | Cascade  | 82            | 16.5                       | 17            | 12.78          | 3.11               | 0.06                       |

The comparison of the thesis work with the state of the art is given in Table 21.

Table 21: Comparison of the state of the art with the current work
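The area entry for this work in Table 21 can be cross-checked against the layout dimensions quoted in the TX40 comparison. The sketch below assumes that the table entry refers to the output PA stage (210um x 290um), which the table itself does not state.

```python
def area_mm2(width_um, height_um):
    """Rectangle area in square millimetres from dimensions in micrometres."""
    return width_um * height_um * 1e-6


area_80ghz = area_mm2(210, 290)  # output PA stage for 80GHz
area_40ghz = area_mm2(102, 160)  # TX40 PA for 40GHz
ratio = area_80ghz / area_40ghz

print(f"80 GHz output stage: {area_80ghz:.4f} mm^2")
print(f"40 GHz PA (TX40):    {area_40ghz:.4f} mm^2")
print(f"Area ratio 80 GHz / 40 GHz: {ratio:.2f}")
```

The 80GHz output stage comes out at about 0.06 mm², consistent with the table entry under the stated assumption, and close to four times the area of the TX40 PA.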

The list of problems, solutions and trade-offs, together with the results from chapter 3, leads to the conclusion that this work is successful in terms of the bandwidth, the saturation power and the area. However, the main target parameters, the output power and the PAE, are lower than expected, and reasons for these shortfalls are suggested in the main part of the report.

# Future Work

- 1) Reconsider how to decrease the number of transistors without losing gain.
- 2) Power drop: sufficient output power (close to 20dBm) is required to overcome the power loss at the RF pads and the input/output matching networks, and the power dissipation in components and nets.
- 3) Design a custom balun with a high Q factor, at least above 25, to overcome the power dissipation caused by the lower resistance limit of around 170mOhm, in addition to the basic problems with the coupling factor and self-resonance regions.

(PDK inductors such as a balun or a transmission line have a resistance within 30mOhm - 80mOhm, while a custom inductor is generally above 130mOhm.)

- 4) Start the PA design by including the Vss EMX extractions, to get an early estimate of the losses, and choose the capacitor and net inductance values with respect to the Vss net.
- 5) Use a balun with extended inductance for the output matching, based on the chosen resonance point from the original PA output.
- 6) Reconsider the LC values at the gates of the middle and output transistors to synchronize the voltage swings without losing either the gain or the output power.
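The Q factor target in item 3 can be related to the resistance figures with the series model Q = ωL / R. The sketch below assumes this simple model at 80GHz and uses the resistance values quoted in this chapter (170mOhm and 130mOhm); the function names are hypothetical helpers.

```python
import math

F0_HZ = 80e9  # assumed operating frequency


def min_inductance_for_q(q_target, r_ohm, f_hz=F0_HZ):
    """Smallest inductance that reaches q_target with series resistance r_ohm."""
    return q_target * r_ohm / (2 * math.pi * f_hz)


def inductor_q(l_henry, r_ohm, f_hz=F0_HZ):
    """Q factor of an inductor modelled as L in series with a resistance."""
    return 2 * math.pi * f_hz * l_henry / r_ohm


l_min_170 = min_inductance_for_q(25, 0.170)
l_min_130 = min_inductance_for_q(25, 0.130)
q_3ph_170 = inductor_q(3e-12, 0.170)

print(f"Q > 25 with 170 mOhm needs L >= {l_min_170 * 1e12:.2f} pH")
print(f"Q > 25 with 130 mOhm needs L >= {l_min_130 * 1e12:.2f} pH")
print(f"A 3 pH inductor with 170 mOhm has Q = {q_3ph_170:.1f}")
```

Under this model, a Q factor above 25 is only reachable at the upper end of the 3pH - 10pH range unless the series resistance is reduced, which is why the resistance limit matters for a custom balun.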

# References

- 1. L. Sundström, G. Jönsson, H. Börjeson, "Radio Electronics", LTH, Electrical and Information Technology.
- 2. Hayg-Taniel Dabag, Bassel Hanafi, Fatih Golcuk, Amir Agah, James F. Buckwalter, Peter M. Asbeck, "Analysis and Design of Stacked-FET Millimeter-Wave Power Amplifiers", IEEE Transactions on Microwave Theory and Techniques, Vol. 61, No. 4, April 2013.
- 3. Mohsin M. Tarar, Renato Negra, "Design and Implementation of Wideband Stacked Distributed Power Amplifier in 0.13um CMOS Using Uniform Distributed Topology", IEEE Transactions on Microwave Theory and Techniques, Vol. 65, No. 12, Dec 2017.
- 4. David del Rio, Ainhoa Rezola, Juan F. Sevillano, Igone Velez, Roc Berenguer, "Digitally Assisted, Fully Integrated, Wideband Transmitters for High-Speed Millimeter-Wave Wireless Communication Links", ACSP, Analog Circuits and Signal Processing.
- 5. Steve C. Cripps, "RF Power Amplifiers for Wireless Communications", Second Edition, 2006 ARTECH HOUSE, INC.
- 6. Sataporn Pornpromlikit, Jinho Jeong, "A Watt-Level Stacked-FET Linear Power Amplifier in Silicon-on-Insulator CMOS", IEEE Transactions on Microwave Theory and Techniques, vol. 58, 1 Jan 2010.
- 7. Janne P. Aikio, Mikko Hietanen, Nuutti Tervo, "Ka-Band 3-Stack Power Amplifier with 18.8dBm Psat and 23.4% PAE using 22nm CMOS FDSOI Technology", 2019 IEEE Topical Conference on RF/Microwave Power Amplifiers for Radio and Wireless Applications.
- 8. Mohsin M. Tarar, Renato Negra, "Design and Implementation of Wideband Stacked Distributed Power Amplifier in 0.13um CMOS using Uniform Distributive Topology", IEEE Transactions on Microwave Theory and Techniques, vol. 65, 12 Dec 2017.
- 9. Hayg-Taniel Dabag, Bassel Hanafi, Fatih Golcuk, "Analysis and Design of Stacked-FET Millimeter-Wave Power Amplifiers", IEEE Transactions on Microwave Theory and Techniques, vol. 61, 4 Apr 2013.
- 10. Mariano Ercoli, Daniela Dragomirescu, Robert Plana, "Small Size High Isolation Wilkinson Power Splitter for 60GHz Wireless Sensor Network Applications", 2011 IEEE, pp. 85-88.
- 11. Stefan Shopov, Rony E. Amaya, John W. M. Rogers, "Adapting the Doherty Amplifier for Millimeter-Wave CMOS Applications", 2011 IEEE, pp. 229-232.
- 12. Songhui Li, Mengqi Cui, Xin Xu, Laszlo Szilagi, "An 80GHz Power Amplifier with 17.4dBm Output and 18% PAE in 22nm FD-SOI CMOS for Binary Phase Modulation Radars", 2020 IEEE, pp. 278-280.
- 13. Ercan Kaymaksut, Dixian Zhao, Patrick Reynaert, "E-band Transformer-based Doherty Power Amplifier in 40-nm CMOS", 2014 IEEE Radio Frequency Integrated Circuits Symposium, pp. 167-170.
- 14. Umut Celik, Patrick Reynaert, "An E-band Compact Power Amplifier for Future Array-based Backhaul Networks in 22nm FD-SOI", 2019 IEEE Radio Frequency Integrated Circuits Symposium, pp. 187-190.
- 15. Maryam Fathi, David K. Su, Bruce A. Wooley, "A Stacked 6.5GHz 29.6dBm Power Amplifier in Standard 65nm CMOS", 2010 IEEE.

- 16. Po-Han Chen, Kuang-Sheng Yeh, Jui-Chih Kao, Huei Wang, "A High-Performance DC-80GHz Distributed Amplifier in 40nm CMOS Digital Process", 2014 IEEE.
- 17. Md Najmussadat, Raja Ahamed, Dristy Parveg, "Design of an E-band Doherty Power Amplifier", 2018 IEEE, pp. 145-148.
- 18. Dixian Zhao, Patrick Reynaert, "An E-Band Power Amplifier with Broadband Parallel-Serial Power Combiner in 40nm CMOS", IEEE Transactions on Microwave Theory and Techniques, vol. 63, 2, Feb 2015.
- 19. Ercan Kaymaksut, Dixian Zhao, Patrick Reynaert, "Transformer-Based Doherty Power Amplifiers for mm-Wave Applications in 40nm CMOS", IEEE Transactions on Microwave Theory and Techniques, vol. 63, 4, Apr 2015.
- 20. Jefy A. Jayamon, James F. Buckwalter, Peter M. Asbeck, "A PMOS mm-wave power amplifier at 77 GHz with 90 mW output power and 24% efficiency", 2016 IEEE Radio Frequency Integrated Circuits Symposium, 3 Feb 2024.



Series of Master's theses Department of Electrical and Information Technology LU/LTH-EIT 2024-967 http://www.eit.lth.se