<span id="page-0-0"></span>Design of Power-Efficient High-Purity Phase-locked Frequency Synthesis

#### DISS. ETH NO. 26967

## **Design of Power-Efficient High-Purity Phase-locked Frequency Synthesis**

A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH ZURICH (Dr. sc. ETH Zurich)

presented by

LIANBO WU

MSc EE, Delft University of Technology, the Netherlands

born on December 3rd, 1989

citizen of Tianjin, P.R. China

accepted on the recommendation of

Prof. Dr. Qiuting Huang, examiner

Prof. Dr. R.B. Staszewski, co-examiner

2020

*"Simplicity is the ultimate sophistication."*

Leonardo Da Vinci

*"Great Truths Are Always Simple."*

Lao Tzu, Taoism

## <span id="page-4-0"></span>**Acknowledgements**

Time flies, and like a river it flows on and waits for nobody! I still have vivid memories of the first evening I arrived in the Netherlands about 8 years ago, the second time in my life that I left China. I can still remember my first day at ETH J93 as if it were yesterday. This thesis marks the end of a long journey and could not have been completed without the support and help I received from my advisors, friends, colleagues, and family. I would like to express my gratitude to everyone who has contributed to this doctoral work.

My utmost gratitude goes to my advisor, Prof. Qiuting Huang, for offering me the opportunity to pursue my Ph.D. in such a great team, for the freedom I was granted, and for his patient guidance, confidence as well as encouragement throughout these years. His profound knowledge and exceptional technical curiosity in the field have challenged me many times and have improved my works significantly.

I am also indebted to my co-examiner, Prof. R.B. Staszewski from University College Dublin, for co-examining this dissertation. Prof. Staszewski was also the advisor of my M.Sc. dissertation at TU Delft, under whose supervision I gained a great deal of professional knowledge and stepped into the field of integrated circuit design. Thank you!

Special thanks go to Dr. Thomas Burger, who supported me at several important and difficult moments throughout this journey and inspired me a lot in countless technical discussions. I sincerely thank Thomas Kleier for his support in PCB design as well as measurement.

The same gratitude goes to Rossi Aldo for bonding dozens of chips. I also appreciate the help from Hansjörg Gisler regarding measurement.

I feel so lucky to have had so many great people as fellow student teammates (past and present) at IIS, and I would like to thank them one by one: Luca Bettini and Benjamin Sporrer for making my first two years in J93 so wonderful as well as making the MRI project a fruitful research journey; Philipp Schönle for his advice and many discussions. I would like to thank Mattia Bonomi, Christian Geiger, Serdar Yonar, Jiawei Liao, Danny Luu, Schekeb Fateh, Giovanni Rovere, Florian Glaser, Matthias Korb, Harald Kröll, Benjamin Weber, Sandro Belfanti, Mauro Salomon, David Bellasi, Pascale Meier, Noé Brun, and Anne Ziegler as well.

I am deeply indebted to the design center and IT staff for their patient and professional help with maintaining the CAD tools, guaranteeing an excellent design environment. I particularly appreciate Christoph Wicki, Beat Muheim, and especially Frank Gürkaynak, with whom I had many pleasant discussions about our common enthusiasm: history. I also much appreciate the administrative support of Monika Zwahlen.

Many thanks also go to Advanced Circuit Pursuit AG for helping with the production of the various test chips, particularly to Jürgen Rogin for his patient technical guidance, Illian Kouchev for numerous discussions, and David Tschopp. A special *thank you* goes to Jiajia Liu, for a continuous friendship that dates back to 2010.

What is more, life in Switzerland and Europe would have been colorless and lacking in joy without my dear Chinese fellows and friends, to whom I want to convey my sincere gratitude: I thank Xu Han for many social evenings and pleasant times shared in Switzerland; Bojun Cheng for the inspiring debates from USTC all the way to ETH; Jian Wang, Wen Li and Prof. Jibo Wei for the unforgettable happy memories in Zurich and the pleasant reunions in China; I want to thank Wei Wang, Qilong Liu, Peng Chen, Zhirui Zong, Dr. Xin He, Sheng Zou, Jiaji Pan, Yu Liu, Zhiwei Zong, Bangan Liu, Ting Gong, Yuming He, Zisong Wang, Taiyun Chi, Xuan Li, Yitong Li, Yuanyuan Huang, Rong Song, and many others. Cheers to our long-time friendships, my dear friends!

My deepest and highest gratitude and appreciation go to my dear parents, Ruichang and Yulan. I am very much indebted to you for your unconditional support and encouragement throughout my life. Your love is the most robust strength that always backs me up and keeps motivating me to move forward. I am looking forward to starting the new chapter of my life, with all the love, support, knowledge, lessons, trust, experiences, and philosophies gathered along my Ph.D. journey!

Zürich, July 2020

Lianbo Wu

## <span id="page-7-0"></span>**Abstract**

This thesis contributes to research on the design and implementation of power-efficient phase-locked frequency synthesizers with high spectral purity. With the unending evolution of wireless communications and the unceasing growth of the RF system-on-chip (SoC) market, power-efficient frequency synthesis solutions with higher spectral purity are more crucial than ever. Dense constellations impose increasingly stringent specifications on integrated phase noise (IPN) and spur levels in order to meet requirements such as transmitter error vector magnitude (EVM), receiver sensitivity, and blocker tolerance. On the other hand, for battery-operated SoC devices, the power budget is limited while the performance demands are ever-increasing. Therefore, high-performance phase-locked loops (PLLs) with an improved power-jitter trade-off (higher power efficiency) are required. This dissertation explores alternative PLL architectures towards this goal. It investigates both the opportunities and the design challenges embedded within conventional analog and digital solutions. The fundamental limitations on achieving high-performance fractional-N operation are discussed and analyzed. To overcome these obstacles, solutions at both the architecture and circuit levels are presented. To prove the proposed concepts, three prototypes have been implemented in 130 nm CMOS, targeting cellular applications and MRI on-coil receiver arrays. The measured results show that the proposed FDVPD-based DPLL achieves the state-of-the-art jitter-power trade-off among all reported sub-10 GHz PLLs, further paving the way for DPLLs to be applied in high-performance RF SoCs.

## <span id="page-8-0"></span>**Zusammenfassung**

This doctoral thesis is a research contribution to the design and implementation of energy-efficient frequency synthesis with high spectral purity. Given the ongoing development of wireless communication and the unceasing market growth for radio-frequency systems-on-chip (SoC), energy-efficient frequency synthesis solutions with high spectral purity are more important than ever. Increasingly strict specifications with complex modulation constellations are set for integrated phase noise (IPN) and spectral masks in order to meet the associated requirements, such as transmitter error vector magnitude (EVM), receiver sensitivity, and blocker tolerance. On the other hand, the power budget of battery-operated SoC-based devices is limited, while the performance requirements rise constantly.

Therefore, high-performance phase-locked loops (PLLs) with improved power-jitter behavior (higher energy efficiency) are required. This dissertation explores alternative, novel PLL architectures to reach this goal. It investigates both the opportunities and the design challenges that conventional analog and digital solutions entail. The fundamental limitations on achieving high-performance fractional-N frequency generation are discussed and analyzed. To overcome these obstacles, proposed solutions at both the architecture and circuit levels are presented.

To implement the proposed concepts and demonstrate their effectiveness, three prototypes were realized in 130 nm CMOS, targeting cellular applications and magnetic-resonance on-coil receiver arrays. The measured results show that the proposed differential-voltage-based digital PLL achieves the best jitter-power behavior among all published PLLs below 10 GHz, which further paves the way for the use of such digital PLLs in high-performance RF SoCs.




## <span id="page-14-0"></span>**Chapter 1**

## **Introduction**

The explosive growth of wireless communication has significantly changed our world over the past century. Marked by the first transatlantic radio signal sent by Marconi in 1901, an "empire of the air" was founded. Enabled by the revolutionary frequency modulation (FM) radio invented by Armstrong in 1933, a person could, for the first time in history, share information with millions of others via merely a microphone. Nowadays, thanks to cellular and wireless local area network (WLAN) technologies, catalyzed by breakthroughs in the modern semiconductor industry, we can share not only voices but also images and videos with our friends via compact handsets, anytime and anywhere.

Behind this fact lies the unending evolution towards higher data rates and better spectral efficiency, which comes with denser constellations in congested spectra, as shown in Fig. [1.1\(](#page-15-1)a) [\[1\]](#page-175-1). The ever-increasing data rates impose stringent requirements on the performance of the frequency synthesizers employed in these mobile terminals. Meanwhile, power consumption is another important concern, as these handsets operate on batteries. As indicated by Fig. [1.1\(](#page-15-1)b), the RF front-end consumes significant power in typical use cases of mobile terminals such as phone calls and web surfing [\[2\]](#page-175-2). Furthermore, the frequency synthesizers, used as local oscillators (LOs) in RF transceivers, consume a significant share of the total power budget [\[3,](#page-175-3) [4,](#page-175-4) [5,](#page-175-5) [6\]](#page-175-6), as shown in Fig. [1.2.](#page-16-1) Therefore, the main goal of this thesis is to seek an alternative and innovative way of frequency synthesis by phaselock that demonstrates better performance at a lower cost and with lower power.

<span id="page-15-1"></span>

Figure 1.1: (a) Evolution of data rates for wireless LAN, cellular, and wireline short links over time [\[1\]](#page-175-1). (b) Power usage in a smartphone [\[2\]](#page-175-2).

### <span id="page-15-0"></span>**1.1 Frequency Synthesizers for Wireless Systems**

As an electronic block that generates a range of frequencies from one or a set of input reference clocks, the frequency synthesizer is an indispensable part of many integrated circuits (ICs). There are three major conventional frequency synthesis techniques [\[7\]](#page-175-7). The first is direct analog synthesis (mix-filter-divide architecture), which can offer a clean output but at the expense of the many references required to cover a broad frequency band. The second is direct digital frequency synthesis (DDS), whose output frequency is limited to at most half of its input reference clock frequency, making it unsuitable for RF applications. The third, indirect synthesis by a phase-locked loop (PLL), can achieve excellent performance with relative simplicity and low cost. Therefore, phaselock synthesis is chosen for this thesis.

Emerging wireless applications set increasingly aggressive requirements on the synthesizers used as system LOs, especially regarding integration with digital processors, on-die area, power consumption, and robustness against process-voltage-temperature (PVT) variations. Meanwhile, numerous additional challenges are imposed by RF systems-on-chip (SoCs), which offer limited area/pad resources and contain noisy digital circuitry. Worse still, additional obstacles are presented by the continued scaling of silicon technologies.

<span id="page-16-1"></span>

Figure 1.2: Comparison of power consumption between frequency synthesizers and receivers.

### <span id="page-16-0"></span>**1.2 Technology Scaling**

Technology scaling reduces the size and power consumption of digital circuits and increases their speed. For instance, the propagation delay of a CMOS inverter is proportional to its load capacitance and to the on-resistance of the transistors. Thus, the gate delay shrinks roughly in proportion to the scaling factor of the geometrical dimensions. This can be seen from Fig. [1.3\(](#page-17-0)a), where three common CMOS technology nodes are compared (130 nm, 65 nm, and 28 nm). On the other hand, the design of analog and RF circuits faces many difficulties in more advanced CMOS technologies. One worrying fact concerns the scaling of the supply voltage in deep-submicron technology nodes: since the threshold voltage does not scale with the geometry, the supply voltage cannot be scaled in proportion to the transistor channel length either. The trend is depicted in Fig. [1.3\(](#page-17-0)b). Migrating from 130 nm to 28 nm, the supply voltage is scaled down by only 1.2× while the inverter gate delay is reduced by almost 4×.

<span id="page-17-0"></span>

Figure 1.3: (a) Scaling of an inverter propagation delay; and (b) scaling of supply and threshold voltage.

This fact partially explains the trend over the past two decades of replacing the more conventional analog PLL by emerging digital PLLs (DPLLs) [\[8\]](#page-176-0). Countless efforts have been made to demonstrate that DPLLs can not only meet GSM-level noise performance [\[9\]](#page-176-1) but also satisfy emerging standards such as 5G [\[10\]](#page-176-2). Compared to a conventional analog PLL, its digital counterpart offers many advantages, such as higher reconfigurability and easier bandwidth control. One of the most noticeable advantages is that the bulky passive components within the loop filter (LF) are completely replaced by compact digital circuits. The resulting area savings are critical for achieving low-cost solutions, and the overall PLL implementation is more readily scaled down in size as new fabrication processes are utilized. While the benefits of a digital PLL approach are obvious to many, there remain basic questions regarding the attainable performance. In particular, can such structures achieve low jitter comparable to analog approaches? Can a high PLL bandwidth be achieved to more easily support wide-bandwidth modulation and fast settling? Can traditional voltage-controlled oscillators (VCOs) be efficiently leveraged in such systems? These questions are discussed in this thesis.

### <span id="page-18-0"></span>**1.3 Motivation**

Based on the discussions above, this thesis is dedicated to exploring an innovative phase-locked frequency synthesis structure that achieves lower jitter at less power than conventional structures. Particular attention is paid to the concerns listed below.

- *Overcome the main restrictions in deep-submicron CMOS* The low cost of digital realization enabled by technology scaling is a great motivation to push as much signal processing as possible into the digital domain. On the one hand, this triggers a trend of applying digitally assisted design methodologies to many PLL applications, from prospective 5G transceivers to wireless sensor networks (WSN) for the Internet of things (IoT); on the other hand, the digital circuitry may degrade the in-band phase noise and lead to worse fractional spurs. These facts make digitally intensive solutions a controversial topic. In this project, we aim at proposing circuit techniques that mitigate these drawbacks, so that PLL solutions with intensive digital assistance can fully benefit from technology scaling.
- *Identify methods to fulfill the noise performance requirements at low power and area cost* Area means cost in ultra-scaled process nodes, while power consumption reduces the available battery life. Therefore, a decent noise performance should not merely be achieved, but achieved in a cost-effective way, which means the proposed solutions have to be low-power and compact in area. This research aims at identifying solutions, at both the architecture and circuit levels, that improve the cost-effectiveness of frequency synthesizer design over state-of-the-art solutions.

### <span id="page-19-0"></span>**1.4 Thesis Organization**

The present dissertation is organized as follows:

**Chapter [2](#page-20-0)** provides a general overview regarding frequency synthesizers. The basic metrics, and corresponding requirements for cellular and MRI on-coil receiver applications are discussed. In the end, the concept of phaselock-based synthesis (PLL) is introduced.

**Chapter [3](#page-44-0)** bridges the brief introduction and the detailed experimental implementations. Firstly, the operation of four representative PLL structures is discussed. This is followed by a dedicated noise analysis of the different structures, with a focus on the phase detector (PD) block. To compare different PD designs, a benchmark is derived and discussed. Beyond the noise performance, a general discussion of fractional spurs resulting from PD operation is given afterwards. Then, design methods for achieving both high spectral purity and low power are discussed, and an alternative solution is proposed. Finally, the design of the output oscillator is briefly discussed with regard to optimizing its noise-power product.

**Chapter [4](#page-111-0)** presents experimental designs implemented in 130nm based on the proposed method. At first, block-level implementation of the whole loop is introduced and discussed in detail. Variations of design are then introduced, discussed and compared. This chapter ends with measurement results as well as a comparison to state-of-the-art designs.

**Chapter [5](#page-154-0)** presents a clocking solution for the MRI application, implemented in 130 nm. A 2-stage cascaded-PLL structure is introduced to ensure the long-term frequency stability of an on-coil receiver, followed by the corresponding noise analysis. Measurement results of the loop inside a commercial 3T MRI scanner, as well as scanned image results, are shown at the end.

**Chapter 6** closes this dissertation, summarizing the main research contributions, and drawing conclusions.

## <span id="page-20-0"></span>**Chapter 2**

## **Phaselock Frequency Synthesis Fundamentals**

Frequency synthesizers are deployed in an ever-wider variety of applications to generate different operating frequencies according to diverse specifications. Integrated synthesizers based on PLLs are popular because of their excellent potential performance, relative simplicity, and low cost. In this chapter, the basic characteristics of frequency synthesizers are briefly introduced, followed by a detailed discussion of specific applications. Two applications demanding high spectral purity are chosen, namely cellular communication and MRI, where frequency synthesizers are employed as LOs. Finally, as the most prevalent form of frequency synthesizer realization, the PLL is discussed in terms of its basic concepts, evolution history, and essential classifications.

### <span id="page-20-1"></span>**2.1 General RF Synthesizer Considerations**

As seen from the transceiver level, an RF frequency synthesizer is simply the building block that provides a periodic local oscillator signal with a programmable output frequency, enabling up/down-conversion. It has to cover the entire required frequency range while meeting all required specifications.

Therefore, certain considerations have to be taken into account regarding the synthesizer design metrics. These considerations can generally be classified into the following **four** categories: 1) frequency programmability; 2) spectral purity; 3) switching speed; and 4) synthesizer cost.

### <span id="page-21-0"></span>**2.1.1 Frequency Programmability.**

Programmability is the essential characteristic of a synthesizer as a certain application-dependent frequency range has to be covered by the synthesizer, based on one fixed input frequency.

- *frequency range* The maximum programmable range of the synthesizer has to be large enough to cover not only the required operation range but also additional margins for circuit-based variations, such as PVT.
- *frequency resolution* The minimum programmable output step has to be fine enough to fulfill the specification defined by the targeted application.

### <span id="page-21-1"></span>**2.1.2 Spectral purity.**

The spectral purity of the synthesizer is important and is always impaired by two types of imperfections, i.e., phase noise and spurious tones:

- *phase noise* Unlike voltage noise, phase noise represents the random (noisy) fluctuations in the phase of a periodic waveform, described in the frequency domain. It usually appears as a continuous distribution in the spectrum whose level depends on the offset frequency.
- *spurious tone* Any unwanted, relatively strong periodic components contained in the synthesized output spectrum appear as well-defined tones, located at certain frequency offsets from the carrier signal.

#### **Phase Noise, Phase Error, and Phase Jitter**

Assume an ideal periodic synthesizer output $y(t) = A \cdot \cos(\omega_c t + \phi)$, where $\omega_c$ is the angular frequency of the carrier, $A$ denotes the amplitude, and $\phi$ is an arbitrary but fixed phase quantity. In the time domain, all zero-crossings occur at exact integer multiples of $T_c = 2\pi/\omega_c$. Correspondingly, in the frequency domain, all the signal power concentrates at a single frequency, $\omega_c$, as illustrated in Fig. [2.1.](#page-22-0) In a practical case, however, both the phase $\phi$ and the amplitude $A$ are time-varying, due to disturbances brought by various noise sources. For simplicity, the amplitude disturbance is ignored here, as it can be removed by a subsequent limiter circuit. In the time domain, the signal with random phase perturbation can then be written as $y(t) = A \cdot \cos(\omega_c t + \phi_n(t))$, where $\phi_n(t)$ is a small, random, time-varying phase quantity that perturbs the signal zero-crossings in the time domain and correspondingly spreads the signal energy in the spectrum into a decaying skirt around the fundamental tone at $\omega_c$. This phase error term $\phi_n(t)$ is characterized here by its power spectral density (PSD), denoted as $S_{\phi}(\omega)$.

<span id="page-22-0"></span>

Figure 2.1: Spectrum showing the modulation model and the resulting PSD of a practical synthesizer output.

Based on $\phi_n$, **phase noise** can be further illustrated. Noise can essentially be understood as a superposition of harmonic signals of random amplitude; e.g., white noise can be regarded as composed of harmonic tones of constant amplitude. Therefore, the quantity $\phi_n(t)$ can be considered here as a sinusoidal phase modulation (PM) signal with amplitude $\Phi_p$, denoted as $\Phi_p \cdot \sin(\omega_m t)$ with $\omega_m = 2\pi f_m$, corresponding to a PSD of $S_\phi(\omega_m) = \frac{\Phi_p^2}{2}$. Thus, the synthesizer output can be rewritten as

<span id="page-23-1"></span>
$$
y(t) = A \cdot \cos(\omega_c t + \Phi_p \cdot \sin(\omega_m t)) \tag{2.1}
$$

For a small phase fluctuation, $|\Phi_p| \ll 1$ rad[<sup>a</sup>](#page-23-0), Eq. [\(2.1\)](#page-23-1) can be simplified to

$$
y(t) = A \cdot \cos(\omega_c t) + A \cdot \frac{\Phi_p}{2} \cdot \cos((\omega_c + \omega_m)t) - A \cdot \frac{\Phi_p}{2} \cdot \cos((\omega_c - \omega_m)t) \tag{2.2}
$$

As a result, there are three tones in the output spectrum, and the voltage PSD is

$$
S_v(\omega) = \frac{A^2}{2} \cdot \delta(\omega - \omega_c) + \frac{A^2 \cdot \Phi_p^2}{8} \cdot \delta(\omega - \omega_c - \omega_m) + \frac{A^2 \cdot \Phi_p^2}{8} \cdot \delta(\omega - \omega_c + \omega_m) \tag{2.3}
$$

Therefore, the upper single-sideband-to-carrier ratio (SSCR), or phase noise is

$$
\mathcal{L}(\omega_m) = \frac{\text{power in 1-Hz bandwidth at } (\omega_c + \omega_m)}{\text{carrier power}} \tag{2.4}
$$

<span id="page-23-2"></span>
$$
=\frac{S_v(\omega_c \pm \omega_m)}{A^2/2} \tag{2.5}
$$

$$
=\frac{\Phi_p^2}{4} \tag{2.6}
$$

with units of dBc/Hz, indicating that phase noise is essentially the ratio between a modulation-induced sideband and the carrier. Recalling that the PSD of the PM signal is $S_{\phi}(\omega_m) = \frac{\Phi_p^2}{2}$, it follows that

$$
\mathcal{L}(\omega_m) = \frac{1}{2} S_{\phi}(\omega_m) \tag{2.7}
$$

<span id="page-23-0"></span><sup>a</sup>This narrow-band FM approximation holds for small angle modulation, e.g., noise.

indicating a 3 dB difference between the two quantities when expressed in decibels. The quantity in Eq. [\(2.4\)](#page-23-2) is also referred to as spot phase noise, as it characterizes the phase noise at a single offset frequency $\omega_m$. The region near the carrier is called "close-in" phase noise and the region far from the carrier is called "far-out" phase noise, although the border between the two is vague, as shown in Fig. [2.1.](#page-22-0) Within the scope of this thesis, "far-out" phase noise refers to the region beyond 20 MHz frequency offset.
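The small-angle result can be checked numerically. The following Python sketch is not part of the original derivation: it samples the phase-modulated signal of Eq. [\(2.1\)](#page-23-1) over an integer number of carrier and modulation periods, evaluates single DFT bins at the carrier and at the upper sideband, and compares the measured sideband-to-carrier power ratio with $\Phi_p^2/4$. The function names, the bin numbers, and the value $\Phi_p = 0.01$ rad are illustrative assumptions.

```python
import cmath
import math


def tone_amplitude(samples, k):
    """Amplitude of the tone in DFT bin k of a real signal (single-bin DFT)."""
    n = len(samples)
    acc = sum(y * cmath.exp(-2j * math.pi * k * i / n) for i, y in enumerate(samples))
    return 2.0 * abs(acc) / n


def sideband_to_carrier(phi_p, carrier_bin=50, mod_bin=5, n=2000):
    """Upper-sideband-to-carrier power ratio of cos(w_c t + phi_p sin(w_m t)).

    carrier_bin and mod_bin are the (assumed) numbers of carrier and
    modulation cycles inside the n-sample window, so both tones fall
    exactly on DFT bins and no window function is needed.
    """
    samples = [
        math.cos(2 * math.pi * carrier_bin * i / n
                 + phi_p * math.sin(2 * math.pi * mod_bin * i / n))
        for i in range(n)
    ]
    carrier = tone_amplitude(samples, carrier_bin)
    sideband = tone_amplitude(samples, carrier_bin + mod_bin)
    return (sideband / carrier) ** 2


phi_p = 0.01                    # peak phase deviation in rad (illustrative)
measured = sideband_to_carrier(phi_p)
predicted = phi_p ** 2 / 4      # Eq. (2.6): L = Phi_p^2 / 4
s_phi = phi_p ** 2 / 2          # PSD of the PM tone, S_phi = Phi_p^2 / 2

print("measured  L (dBc):", 10 * math.log10(measured))
print("predicted L (dBc):", 10 * math.log10(predicted))
print("S_phi - L (dB):   ", 10 * math.log10(s_phi / predicted))
```

The measured ratio deviates from $\Phi_p^2/4$ only through the higher-order terms neglected in the narrow-band approximation, and the last line reproduces the 3 dB offset between $S_\phi$ and $\mathcal{L}$.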

These distributed spots compose the full phase noise profile together while an integral quantity can be defined to capture the wideband noise contribution, namely the integrated phase noise (IPN)

<span id="page-24-0"></span>
$$
IPN = 10 \log_{10} \left( \int_{f_l}^{f_h} 10^{\mathcal{L}(f_m)/10} \, df_m \right) \quad \text{(dBc)} \tag{2.8}
$$

The integration range is determined by the application, so that the effect of the imperfection on the system performance is well captured. For example, when the synthesizer is adopted as the LO for telecommunication, the upper limit $f_h$ should be high enough to cover the channel bandwidth, while the lower limit $f_l$ should be low enough to include the long-term noise accumulated according to the frequency hopping rate. This is conceptually illustrated in Figure [2.2\(](#page-25-0)a), and further explanation can be found in Sec. [2.2.](#page-29-1)

Additionally, the time-domain phase uncertainty, $\sqrt{\overline{\phi_n^2(t)}}$, can be derived from its frequency-domain counterpart (Eq. [\(2.8\)](#page-24-0)), linked by Parseval's theorem ( $\int_{-\infty}^{+\infty} \phi_n^2(t) dt = \int S_\phi(\omega_m) df$ ). Since $S_\phi = 2\mathcal{L}$ according to Eq. (2.7), the integral of $S_\phi$ is twice the IPN on a linear scale. This quantity, the root-mean-square (rms) residual phase error, is hence expressed as

<span id="page-24-1"></span>
$$
\phi_{e,rms} = \sqrt{2 \cdot 10^{IPN/10}} \quad \text{(rad)} \tag{2.9}
$$

<span id="page-25-0"></span>

Figure 2.2: (a) PN spectrum of a typical oscillator, with illustrated IPN. (b) Impact of phase error on a constellation diagram of QPSK.

In the time domain, spectral impurity results in jitter, namely the timing deviation from the expected periodic signal. This quantity is of more interest for wireline applications, such as clock and data recovery (CDR). Although there are several ways to link phase noise with a certain type of jitter [\[11\]](#page-176-3), rms phase jitter is adopted within this work for simplicity, and it is defined as

<span id="page-25-1"></span>
$$
\sigma_{t,rms} = \frac{\phi_{e,rms}}{2 \cdot \pi f_c} \tag{2.10}
$$
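As a hedged illustration of how Eqs. (2.8) to (2.10) chain together, the Python sketch below integrates a phase noise profile with the trapezoidal rule and converts the result into rms phase error and rms jitter. The flat profile of -100 dBc/Hz over a 1 MHz span and the 2 GHz carrier are made-up example values, not measurements from this work, and the function names are assumptions.

```python
import math


def integrated_phase_noise_dbc(offsets_hz, pn_dbc_hz):
    """Eq. (2.8): integrate L(f_m), converted to linear scale, over the offsets.

    Uses the trapezoidal rule on the given (offset, dBc/Hz) points.
    """
    linear = [10 ** (level / 10) for level in pn_dbc_hz]
    total = 0.0
    for i in range(len(offsets_hz) - 1):
        df = offsets_hz[i + 1] - offsets_hz[i]
        total += 0.5 * (linear[i] + linear[i + 1]) * df
    return 10 * math.log10(total)


def rms_phase_error_rad(ipn_dbc):
    """Eq. (2.9): the factor 2 accounts for S_phi = 2 * L."""
    return math.sqrt(2 * 10 ** (ipn_dbc / 10))


def rms_jitter_s(phi_rms_rad, f_c_hz):
    """Eq. (2.10): rms phase error referred to the carrier period."""
    return phi_rms_rad / (2 * math.pi * f_c_hz)


# Illustrative (assumed) profile: flat -100 dBc/Hz from 10 kHz to 1.01 MHz.
offsets = [10e3, 1.01e6]
profile = [-100.0, -100.0]

ipn = integrated_phase_noise_dbc(offsets, profile)
phi = rms_phase_error_rad(ipn)
jitter = rms_jitter_s(phi, 2e9)

print(f"IPN         = {ipn:.2f} dBc")
print(f"phase error = {phi:.5f} rad ({math.degrees(phi):.3f} deg)")
print(f"rms jitter  = {jitter * 1e12:.3f} ps")
```

For a measured profile, the two-point lists would be replaced by the sampled offsets and phase noise values; a logarithmically spaced grid generally calls for a finer integration scheme than this simple trapezoid.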

#### **Response to Frequency Scaling**

Note that the expression for jitter (Eq. [\(2.10\)](#page-25-1)) contains the carrier frequency $f_c$, which appears in neither of its frequency-domain counterparts, IPN (Eq. [\(2.8\)](#page-24-0)) and phase error (Eq. [\(2.9\)](#page-24-1)). How do these quantities differ if the signal experiences a noiseless scaling in frequency? In short, jitter is independent of frequency scaling, while quantities such as IPN experience a frequency-normalization gain.

This can be illustrated by revisiting the aforementioned FM model. Assume the signal experiences a noiseless frequency scaling by a factor of $M$ (e.g., by an ideal frequency divider or multiplier). Eq. [\(2.1\)](#page-23-1) can then be rewritten as

$$
y(t) = A \cdot \cos[M(\omega_c t + \Phi_p \cdot \sin(\omega_m t))] \tag{2.11}
$$

which, for $|M\Phi_p| \ll 1$ rad, simplifies to

$$
y(t) = A \cdot \cos(M\omega_c t) + A \cdot \frac{M\Phi_p}{2} \cos((M\omega_c + \omega_m)t) - A \cdot \frac{M\Phi_p}{2} \cos((M\omega_c - \omega_m)t) \tag{2.12}
$$

Accordingly, the scaled phase noise is

$$
\mathcal{L}(\omega_m) = \frac{M^2 \Phi_p^2}{4} \tag{2.13}
$$

with a **frequency gain of** $20 \log_{10}(M)$ dB. This is essentially because the frequency scaling factor applies to the modulation index and thus changes the sideband-to-carrier ratio. The same scaling applies to spurs as well. However, since jitter (Eq. [\(2.10\)](#page-25-1)) is inversely proportional to the carrier frequency, and both the rms phase error and the carrier frequency scale by $M$, the impact of scaling on jitter cancels out.
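A short numerical sketch, under the assumption of an ideal (noiseless) multiplier or divider, makes the two behaviors explicit: spot phase noise shifts by $20 \log_{10}(M)$, whereas the jitter computed from the scaled phase error and scaled carrier is unchanged. The function names and the numbers used (a 1 GHz carrier with 0.01 rad rms phase error and -120 dBc/Hz spot phase noise) are illustrative only.

```python
import math


def scale_phase_noise_dbc(pn_dbc_hz, m):
    """Spot phase noise (or spur level) after ideal frequency scaling by m."""
    return pn_dbc_hz + 20.0 * math.log10(m)


def scale_signal(phi_rms_rad, f_c_hz, m):
    """Ideal scaling multiplies both the phase error and the carrier by m."""
    return m * phi_rms_rad, m * f_c_hz


def rms_jitter_s(phi_rms_rad, f_c_hz):
    """Eq. (2.10)."""
    return phi_rms_rad / (2.0 * math.pi * f_c_hz)


phi_rms, f_c = 0.01, 1e9            # illustrative starting point
for m in (2, 0.25):                 # ideal doubler, ideal divide-by-4
    phi_m, f_m = scale_signal(phi_rms, f_c, m)
    print(
        f"M = {m}: phase noise {scale_phase_noise_dbc(-120.0, m):.2f} dBc/Hz, "
        f"jitter {rms_jitter_s(phi_m, f_m) * 1e12:.4f} ps "
        f"(before scaling: {rms_jitter_s(phi_rms, f_c) * 1e12:.4f} ps)"
    )
```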

#### **Spurious Tones**

Unlike phase noise, spurs are clearly observable discrete tones in the output spectrum, as qualitatively indicated in Fig. [2.1.](#page-22-0) Their mechanism can be explained by the FM/PM model used previously. Assume here that the phase error $\phi_n(t)$ is a periodic signal $\beta \sin(\omega_m t)$, and hence characterized by a discrete tone in the spectrum. A phase-modulated signal can be set up as in Eq. [\(2.1\)](#page-23-1), i.e., $y(t) = A \cdot \cos(\omega_c t + \beta \cdot \sin(\omega_m t))$. This signal can be rewritten as a set of cosines weighted by Bessel functions of the modulation index $\beta$, without any phase/frequency modulation components, as

$$
y(t) = A \sum_{k=-\infty}^{\infty} J_k(\beta) \cos(\omega_c t + k\omega_m t) \tag{2.14}
$$

where, for small $\beta$, the Bessel function can be approximated by $J_k(\beta) \approx \frac{\beta^k}{2^k k!}$. Under the assumption of narrow-band FM ( $\beta \ll 1$ rad), this simplifies further to $J_0(\beta) \simeq 1$, $J_1(\beta) \simeq \frac{\beta}{2}$, and $J_k(\beta) \simeq 0$ for $k \geq 2$. Like phase noise, a spur is specified relative to the carrier power and is therefore expressed here as

$$
P_{spur}(\omega_m) = \frac{\text{Sideband}}{\text{Carrier}}\tag{2.15}
$$

$$
= 10 \log_{10} \left[ \frac{J_1(\beta)}{J_0(\beta)} \right]^2 \tag{2.16}
$$

$$
=10\log_{10}\left(\frac{\beta}{2}\right)^2\tag{2.17}
$$

in units of dBc, the same unit as IPN.

Stronger discrete periodic components contained within the phase error $\phi_n(t)$ result in spurious tones in the output spectrum. From another perspective, spurs can be considered as undesired, obtrusive spot phase noise points, and thus a spur has the same response to frequency scaling. Spurs are undesired for two reasons: 1) having the same unit as IPN, they contribute to the phase error directly; 2) as stronger discrete tones, they affect the performance in the same way as spot phase noise. Further discussion of the potential sources of spurs can be found in Chapter [3.](#page-44-0)
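To give a feel for the numbers, the following Python sketch compares the spur level from the Bessel-function ratio of Eq. (2.16) with the narrow-band approximation of Eq. (2.17). It is only an illustration: the Bessel function is evaluated with its standard power series (truncated to a fixed number of terms, an implementation choice made here to avoid external libraries), and the modulation indices are arbitrary example values.

```python
import math


def bessel_j(k, beta, terms=20):
    """Bessel function of the first kind J_k(beta), integer k >= 0.

    Evaluated with the truncated power series
    sum_m (-1)^m / (m! (m+k)!) * (beta/2)^(2m+k), adequate for small beta.
    """
    return sum(
        (-1) ** m / (math.factorial(m) * math.factorial(m + k))
        * (beta / 2.0) ** (2 * m + k)
        for m in range(terms)
    )


def spur_dbc_exact(beta):
    """Eq. (2.16): first sideband relative to the carrier."""
    return 20.0 * math.log10(bessel_j(1, beta) / bessel_j(0, beta))


def spur_dbc_narrowband(beta):
    """Eq. (2.17): narrow-band approximation, valid for beta << 1 rad."""
    return 20.0 * math.log10(beta / 2.0)


for beta in (0.002, 0.02, 0.2):     # illustrative modulation indices in rad
    print(
        f"beta = {beta}: exact {spur_dbc_exact(beta):.3f} dBc, "
        f"narrow-band {spur_dbc_narrowband(beta):.3f} dBc"
    )
```

The two expressions agree closely for small $\beta$ and drift apart as $\beta$ grows, which is where the higher-order sidebands $J_k(\beta)$, $k \geq 2$, stop being negligible.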

#### **Effects of Spectral Impurity**

When a frequency synthesizer is adopted as the LO in a cellular transceiver, its spectral purity affects the system performance in two ways.

Firstly, at the receiver side, spot phase noise or spurs at certain offsets degrade the receiver's noise figure via reciprocal mixing with in- and out-of-band blockers; similarly, at the transmitter side, they can violate the emission mask or increase the transmitter noise in the receiver band due to limited isolation/filtering (e.g., in the case of Global System for Mobile Communications (GSM)). Secondly, at the receiver side, IPN or spurs within the signal bandwidth decrease the signal-to-noise ratio (SNR) of the receiver and thus limit the sensitivity; similarly, at the transmitter side, this integrated phase error pollutes the constellation of the (de-)modulated signals, causing error vector magnitude (EVM) degradation and hence an increased bit error rate (BER). Phase error (Eq. [2.9\)](#page-24-1) is a prevalent source of EVM, among others such as LO feed-through or IQ imbalance. This is simply because phase error is indistinguishable from phase modulation, and thus the mixing of the signal with a noisy LO in the TX or RX path corrupts the information carried by the signal. The quality of the modulated signal can be quantitatively evaluated by the EVM, which communication standards use to measure the error between the transmitted signal and the reference signal. If the amplitude error is small, the relation between phase error and EVM can be simplified as [\[12\]](#page-176-4)

$$
EVM = \sqrt{\frac{1}{SNR} + 2 - 2\exp[-\frac{\phi_{e,rms}^2}{2}]} \quad (\%) \tag{2.18}
$$

<span id="page-28-1"></span>
$$
\approx \phi_{e,rms} \tag{2.19}
$$

where Eq. [2.19](#page-28-1) is derived under the assumption of a large SNR and a small phase error.
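The relation between Eq. 2.18 and its limit Eq. 2.19 can be checked numerically with the short Python sketch below. It assumes that the SNR is given in dB, that the rms phase error is in radians, and that the EVM is returned as a fraction (multiply by 100 for percent); the function name and the example values are hypothetical.

```python
import math

def evm_from_phase_error(snr_db: float, phi_rms_rad: float) -> float:
    """Eq. 2.18: EVM (as a fraction) from SNR in dB and rms phase error in rad."""
    snr_linear = 10 ** (snr_db / 10)
    return math.sqrt(1 / snr_linear + 2 - 2 * math.exp(-phi_rms_rad ** 2 / 2))

# Hypothetical operating points: the first is dominated by phase error,
# the second also carries a visible SNR contribution.
for snr_db, phi in [(100.0, 0.01), (40.0, 0.01)]:
    evm = evm_from_phase_error(snr_db, phi)
    print(f"SNR = {snr_db:5.1f} dB, phi_rms = {phi} rad -> EVM = {100 * evm:.3f} %")
```

With a very large SNR the result collapses onto the rms phase error, as Eq. 2.19 states; at moderate SNR the additive noise term raises the EVM above that limit.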

In a broader sense, when a synthesizer is adopted for wireline applications or as the sampling clock for data converters, jitter, the time-domain representation of spectral impurity, degrades the system performance as well. For a sinusoidal input signal, the theoretical limit on SNR imposed by sampling clock jitter is

$$
SNR = -20\log(2\pi f_{in}\sigma_t) \tag{2.20}
$$

For high-resolution or high-speed applications, the jitter resulting from spectral impurity dominates the sampling SNR. E.g., to achieve 11 ENOB (70 dB SNR) at $f_{in} = 300$ MHz for MRI applications, a sampling jitter of no more than 200 fs is required.
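Eq. 2.20 can be evaluated in both directions with the following Python sketch, which assumes a base-10 logarithm, an input frequency in Hz and an rms jitter in seconds. The function names are hypothetical and the numbers are only those of the example above.

```python
import math

def jitter_limited_snr_db(f_in_hz: float, sigma_t_s: float) -> float:
    """Eq. 2.20: SNR ceiling set by rms sampling-clock jitter for a sine input."""
    return -20 * math.log10(2 * math.pi * f_in_hz * sigma_t_s)

def max_jitter_s(f_in_hz: float, snr_db: float) -> float:
    """Eq. 2.20 solved for the rms jitter that just reaches snr_db."""
    return 10 ** (-snr_db / 20) / (2 * math.pi * f_in_hz)

f_in = 300e6  # Hz, input frequency of the example above
print(f"jitter for 70 dB: {max_jitter_s(f_in, 70.0) * 1e15:.0f} fs")
print(f"SNR at 200 fs:    {jitter_limited_snr_db(f_in, 200e-15):.1f} dB")
```

Evaluated this way, a jitter of 200 fs lands slightly below 70 dB, so the 200 fs figure is best read as a rounded bound for a resolution of about 11 bits.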

#### <span id="page-28-0"></span>**2.1.3 Frequency accuracy and switching speed**

The dynamic behavior of the frequency synthesizer is of great interest as well. It is usually characterized by the switching speed/settling time, i.e., the time necessary for the synthesizer to switch from one output frequency to another. This transition time is crucial for modern communication systems, which leverage channel or frequency hopping against different impairments (e.g., multipath-induced fading, co-channel interference). The speed requirement is set by the corresponding use cases, which will be discussed in Sec. [2.2.](#page-29-1) In addition to a fast transition time, the accuracy of the synthesized frequency is of great significance and has to be fine enough to meet system requirements.

#### <span id="page-29-0"></span>**2.1.4 Cost**

In addition to those aforementioned important performance metrics, the implementation cost is equally critical for the design of a synthesizer. This cost can be generally evaluated from the following aspects:

- *power consumption* The amount of power consumed is crucial for battery-powered systems, e.g., mobile communication terminals.
- *economic cost* First, the size/area of the integrated synthesizer as a block is important for mass production. Second, the external/off-chip components involved (e.g., crystal oscillator (XO)) have to be considered as well, as they contribute to the bill of materials (BOM).
- *implementation cost* Design portability and integrability are vital to making the design procedure shorter and simpler. Generally, more digital-intensive designs are preferred, as the digital part can be easily migrated from one application to another and from one process to another (portability). Besides, being more digital-intensive, such designs are easier to integrate with the digital baseband and application processor (integrability).

### <span id="page-29-1"></span>**2.2 Application Oriented Concerns**

Regarding the frequency synthesizer design, there are always certain design trade-offs between the implementation cost and different performance metrics. Therefore a cost-effective synthesizer design needs to be optimized to meet the system specification, based on the corresponding application scenario. As high-purity frequency synthesizer design is the topic of interest within the scope of this work, two typical spectral purity-demanding applications are chosen as target scenarios for this thesis. They are cellular communications and MRI. Brief background information and corresponding synthesizer design considerations are discussed below.

#### <span id="page-30-0"></span>**2.2.1 Frequency Synthesis for Cellular Applications**

Cellular communications enjoy the largest production volume among all kinds of consumer electronic products nowadays, with global revenue of \$87.7 billion in 2019 [\[13\]](#page-176-5). The high demand set by the market has driven the fast evolution of mobile technologies ever since the 1990s, thanks to the digital revolution brought by GSM, which is considered a second-generation (2G) mobile wireless technology. After 30 years of development, mobile communication is currently embracing 5G New Radio (NR), while 6G is already around the corner. This fast-developing and complex cellular radio system thus sets various design specifications for the corresponding LOs (synthesizers). On the other hand, cellular communications represent probably the most sophisticated communication system in the world, as the information signal travels in a hostile environment, against noise, interference, Doppler effects and multipath fading. Thus a robust and high-purity frequency synthesizer is required not only in the RF front-end as the LO but also in the analog-to-digital converter (ADC) for sampling clock generation.

For these two reasons, we need to look further into what specifications a fully integrated frequency synthesizer should reach to support a multi-standard cellular radio application, as a consequence of the rapid growth of modern wireless communication. Regarding the required frequency range, Table [2.1](#page-34-0) briefly summarizes the allocated frequency bands from 2G to 5G NR Frequency Range 1 (FR1) (sub-6 GHz). One thing to notice is that, in terms of center frequency, all these bands can be categorized into several groups: 1) around 900 MHz, 2) around 1800 MHz, 3) around 2700 MHz, 4) around 3500 MHz, and 5) around 4800 MHz. This gives a clear insight into the required frequency range, which has to be covered by as few synthesizers as possible with additional programmable dividers, via a clever frequency plan. Regarding the required synthesizer resolution, the finest tuning step of the synthesizer has to meet at least the specified channel resolution (a.k.a. channel raster). The finest raster specified so far is 5 kHz, as required by 5G NR. Therefore, to design a multi-standard synthesizer, a programmable step of 5 kHz is necessary, considering the overlap of frequency bands.

Regarding the required spectral purity, both spot and integrated phase noise and spurs have to be considered as discussed previously. Some of the known critical specifications are listed in Table [2.1.](#page-34-0)

On the receiver side, spot phase noise or spurs at a certain frequency offset limit the SNR via reciprocal mixing with blockers, according to

$$
SNR|_{dB} = (P_S|_{dBm} - P_B|_{dBm}) - \mathcal{L}(\omega_m)|_{dBc/Hz} - 10 \cdot \log_{10} B \tag{2.21}
$$

which is defined for spot phase noise at a certain offset $\omega_m$, while the corresponding relation between the SNR and a spur at a certain offset is

$$
SNR|_{dB} = (P_S|_{dBm} - P_B|_{dBm}) - P_{spur}|_{dBc} \tag{2.22}
$$

where  $P_S$  and  $P_B$  are the powers of the desired signal and blocker respectively, and B is the signal bandwidth.

With a given blocker template, the corresponding spur or spot phase noise requirement can thus be derived. In addition, at the receiver side, IPN or spurs within the signal bandwidth restrict the maximum SNR the receiver can achieve.
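The derivation of a requirement from a blocker template can be sketched in Python by solving Eqs. 2.21 and 2.22 for the phase noise and the spur level. The signal power, blocker power, target SNR and bandwidth used below are purely hypothetical placeholders, not values from any standard or from this work, and the function names are illustrative.

```python
import math

def required_spot_phase_noise_dbc_hz(p_signal_dbm: float, p_blocker_dbm: float,
                                     snr_db: float, bandwidth_hz: float) -> float:
    """Eq. 2.21 solved for the spot phase noise that just meets snr_db."""
    return (p_signal_dbm - p_blocker_dbm) - snr_db - 10 * math.log10(bandwidth_hz)

def required_spur_dbc(p_signal_dbm: float, p_blocker_dbm: float, snr_db: float) -> float:
    """Eq. 2.22 solved for the spur level that just meets snr_db."""
    return (p_signal_dbm - p_blocker_dbm) - snr_db

# Hypothetical blocker scenario (illustrative numbers only).
p_s, p_b, snr_target, bw = -99.0, -43.0, 9.0, 200e3
print(f"phase noise <= {required_spot_phase_noise_dbc_hz(p_s, p_b, snr_target, bw):.1f} dBc/Hz")
print(f"spur        <= {required_spur_dbc(p_s, p_b, snr_target):.1f} dBc")
```

The sketch attributes the whole SNR budget to reciprocal mixing; a real budget would reserve part of it for thermal noise and other impairments.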

Correspondingly, at the transmitter side, spot phase noise or spurs at a certain offset might simply violate the modulation mask or increase the transmitter noise in the receiver band. E.g., in the case of the GSM standard, the critical 400 kHz mask of -60 dBc usually dictates the synthesizer design. According to 3GPP, the 400 kHz mask is measured with a 30 kHz resolution bandwidth (RBW), which leads to the spot phase noise requirement of

$$
\mathcal{L}(@400\,\text{kHz}) = -60 - 9 - 10\log_{10}(30\,\text{kHz}) = -113.8\,\text{dBc/Hz} \tag{2.23}
$$

where the 9 dB is an approximate adjustment of the reference power [\[14\]](#page-176-6). Therefore a specification with margin is normally set at −115 dBc/Hz, as shown in Table [2.1.](#page-34-0) Besides, due to the relatively narrow channel bandwidth in the case of GSM, a very stringent transmitted noise level of -79 dBm in the corresponding receive band, measured with a 100 kHz RBW, is specified in order not to desensitize a nearby receiving mobile handset. This corresponds to the famous critical phase noise specification of -162 dBc/Hz at a worst-case offset of 20 MHz at the full power level of $+33$ dBm, as

$$
\mathcal{L}(@20\,\text{MHz}) = -79 - 33 - 10 \log_{10}(100\,\text{kHz}) = -162\,\text{dBc/Hz} \tag{2.24}
$$

Given the additional noise contribution from building blocks such as power amplifiers (PAs), this specification should be made with further margin in practice.
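The arithmetic of Eqs. 2.23 and 2.24 can be reproduced with a few lines of Python. The function names are hypothetical; the numerical inputs are the ones quoted in the text above.

```python
import math

def phase_noise_from_mask(mask_dbc: float, ref_adjust_db: float, rbw_hz: float) -> float:
    """Eq. 2.23: spot phase noise (dBc/Hz) from a modulation mask measured in rbw_hz."""
    return mask_dbc - ref_adjust_db - 10 * math.log10(rbw_hz)

def phase_noise_from_rx_band_noise(noise_dbm: float, tx_power_dbm: float, rbw_hz: float) -> float:
    """Eq. 2.24: spot phase noise (dBc/Hz) from an absolute noise limit in rbw_hz."""
    return noise_dbm - tx_power_dbm - 10 * math.log10(rbw_hz)

# GSM 400 kHz mask: -60 dBc in a 30 kHz RBW with a 9 dB reference adjustment.
print(f"L(400 kHz) = {phase_noise_from_mask(-60.0, 9.0, 30e3):.1f} dBc/Hz")
# GSM receive-band noise: -79 dBm in a 100 kHz RBW at +33 dBm output power.
print(f"L(20 MHz)  = {phase_noise_from_rx_band_noise(-79.0, 33.0, 100e3):.1f} dBc/Hz")
```

Both functions only normalize a measured level to a 1 Hz bandwidth, so any design margin has to be added on top of the printed values.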

As channel bandwidth keeps increasing for higher data throughput, the requirements for synthesizer design change correspondingly. As the channel bandwidth has increased significantly from 2G to 4G, certain specifications, such as far-out phase noise due to transmitter noise in the receiver band, are no longer as serious a concern as in GSM, while the IPN is becoming more critical due to not only the larger channel bandwidth but also the more complex modulation schemes for higher data rates. As depicted in Figure [2.2\(](#page-25-0)b), each point on the constellation diagram corresponds to a different symbol, which can represent multiple bits. As the number of symbols is increased, the bandwidth efficiency increases. In other words, higher-order modulation can transfer more bits per symbol (e.g., 1024 QAM has five times the data rate of QPSK within the same bandwidth). However, this also means that the system becomes more susceptible to noise, which can be measured by the EVM of the transmitter as well as the SNR of the receiver. The obvious bottleneck now becomes the IPN or spurs within the signal bandwidth, as discussed above. The specification of IPN can be derived from the corresponding EVM via Eq. [2.19,](#page-28-1) and becomes more critical in advanced standards such as long-term evolution (LTE) (256 QAM), Wi-Fi 6 (1024 QAM), and 5G. For instance, an IPN of -48 dBc is required to support 256-QAM and 4x4 multiple-input multiple-output (MIMO) under non-ideal channel conditions [\[15\]](#page-176-7), significantly tougher than the -28 dBc required by 2G. Overall, the need for backward compatibility dictated by the 3GPP cellular standard, combined with strict marketing requirements to avoid expensive and bulky external SAW filters, requires virtually any cellular frequency synthesizer on the market to feature ultra-low out-of-band phase noise. As for the IPN, the challenging requirements stem from the complex modulation schemes needed to support 4G/5G high data rates at good spectral efficiency in bandwidth-constrained channels.
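The bandwidth-efficiency comparison between modulation orders follows directly from the number of bits carried per symbol. A minimal Python sketch, with a hypothetical helper name and assuming constellations whose size is a power of two, is:

```python
def bits_per_symbol(constellation_points: int) -> int:
    """Bits carried by one symbol of a constellation with a power-of-two size."""
    if constellation_points < 2 or constellation_points & (constellation_points - 1):
        raise ValueError("constellation size must be a power of two >= 2")
    return constellation_points.bit_length() - 1

# QPSK has 4 points; the QAM orders are the ones named in the text.
for name, points in [("QPSK", 4), ("256 QAM", 256), ("1024 QAM", 1024)]:
    ratio = bits_per_symbol(points) / bits_per_symbol(4)
    print(f"{name:>8}: {bits_per_symbol(points):2d} bits/symbol, {ratio:.0f}x QPSK")
```

At an equal symbol rate the ratio of bits per symbol is also the ratio of raw data rates, which gives the factor of five between 1024 QAM and QPSK.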

As the carrier frequency of a communication system, the synthesized frequency has to be extremely accurate. For instance, as defined by 3GPP, the mobile terminal in the GSM standard must transmit signals with a frequency accuracy better than 0.1 parts per million (ppm) (e.g., 100 Hz for a 1 GHz carrier). This value is far beyond the performance of available commercial XOs. Therefore, the ultimate synchronization is usually done by comparing the local carrier frequency with an accurate frequency reference received from the base station.

On the other hand, frequency hopping is becoming more and more popular in modern communication systems as a countermeasure against the aforementioned impairments (Sec. [2.1.3\)](#page-28-0), and this has consistently put tougher requirements on the synthesizer switching time, i.e., the minimum time required by the synthesizer to switch and settle from one frequency to another. The corresponding specification varies between communication standards. E.g., in the case of GSM or Enhanced Data rates for GSM Evolution (EDGE), the transceiver works in a time-division duplex (TDD) mode, bringing the potential benefit of sharing one synthesizer, however, at the cost of less time allocated for the locking operation. Therefore a settling time much less than 577 µs is necessary for robust operation [\[16\]](#page-177-0). In some other cases, such as the universal mobile telecommunications system (UMTS), where frequency-division duplex (FDD) mode is adopted, the locking requirement is more relaxed.

Overall, a high spectral-purity frequency synthesis solution is highly desired to support the unending growth of data rates and spectral efficiency. Yet it has to be power-efficient enough that battery life does not shrink proportionally as data rates soar, given a limited capacity budget. In addition, as high-performance RF SoCs are commonly required nowadays to increase the level of hardware integration, flexibility and scalability, the hostile interference environment imposed on analog/RF blocks by digital-intensive parts makes it challenging to render the synthesizer design robust and keep it low-noise. One solution for the realization of a frequency synthesizer with high spectral purity, low power and yet strong robustness is discussed in Chapter [3](#page-44-0) and an experimental implementation is explained in Chapter [4.](#page-111-0)

<span id="page-34-0"></span>

Figure 2.3: Overview of an MR setup. MR scanner, coil array, commercial in-bore receiver shown.

#### <span id="page-35-0"></span>**2.2.2 Frequency synthesis for MRI On-Coil Receiver**

#### **On-Coil RX for MRI**

MRI is a critical tomographic method by which cross-sections of a patient can be acquired without any invasive exploratory surgery. Compared to other conventional imaging techniques, namely X-ray computed tomography (CT) and positron emission tomography (PET), which make use of hazardous ionizing radiation to form an image, MRI relies solely on harmless electromagnetic waves in the radio frequency (RF) spectrum.

<span id="page-36-0"></span>

Figure 2.4: Array coils with in-field (left) and on-coil (right) receivers [\[17\]](#page-177-0).

MR imaging is based on the effect of nuclear magnetic resonance (NMR). By employing a strong static magnetic field, ranging from 1.5 T up to 10.5 T for (ultra-)high-field MRI, the magnetic spin states of the hydrogen nuclei $({}^{1}H)$ in the body are split. By means of an external RF excitation, transitions and coherence among the populations of those states are induced, resulting in an RF signal emitted from the nuclei themselves around their Larmor frequency $(f_L)$, which is field-strength dependent and given by

$$
f_L = \gamma \cdot B_0 \tag{2.25}
$$

where $B_0$ represents the strength of the externally applied field in tesla (T), and $\gamma$ is the reduced gyromagnetic ratio, a constant specific to <sup>1</sup>H that is equal to $42.58$ MHz/T.
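Eq. 2.25 is straightforward to tabulate. The Python sketch below uses the reduced gyromagnetic ratio quoted above; the function name and the list of example field strengths are illustrative choices.

```python
GAMMA_MHZ_PER_T = 42.58  # reduced gyromagnetic ratio of 1H, as quoted in the text

def larmor_frequency_mhz(b0_tesla: float) -> float:
    """Eq. 2.25: 1H resonance (Larmor) frequency in MHz for a static field in tesla."""
    return GAMMA_MHZ_PER_T * b0_tesla

# Illustrative static field strengths in tesla.
for b0 in (1.0, 1.5, 3.0, 7.0, 11.0):
    print(f"B0 = {b0:4.1f} T -> f_L = {larmor_frequency_mhz(b0):7.2f} MHz")
```

Because the relation is linear, the synthesizer output range required for a given family of scanners scales directly with the range of field strengths to be supported.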

In addition to the DC field, gradient fields are superimposed to modulate the ${}^{1}H$ resonance frequency, which enables spatially distinguishable signals to be picked up by RF receive coils placed around the target anatomy. From the received signal, a map of the ${}^{1}$H concentration within the body can be reconstructed. A general MRI setup is illustrated in Fig. [2.3.](#page-35-0) As depicted, due to the coaxial RF connectors and comparably large electronics, coil arrays are usually rigid and large in size. The rigid coaxial cabling required to bring the weak and vulnerable analog signals of multiple coils to the receiver array outside the MRI scanner impacts image quality by distorting the magnetic fields within the scanner. Unless the cables are properly decoupled by means of RF traps and filters, the several tens of kilowatts of the RF excitation field may induce excessive heat in the cables and potentially harm the patient.

In-bore receiver arrays, shown in Fig. [2.4,](#page-36-0) have been proposed [\[18\]](#page-177-1) in order to minimize the length, and therefore the impact, of the array cabling by shifting the receivers directly inside the MRI fields. To completely avoid any RF cabling within the array coils, receivers need to be placed directly onto the individual coils of the arrays (right of Fig. [2.4\)](#page-36-0). Due to its vicinity to the tissue itself, such an on-coil RX must not be shielded, unlike most in-bore receivers. Even so, it needs to withstand the high-power magnetic fields without introducing distortions in the received signal. The most promising way to achieve this is to integrate the entire receive chain (i.e., from the LNA to digitization and filtering) into a silicon microchip, which is small, non-magnetic and low-power. This allows the receiver to be placed directly on-coil, eliminating long and bulky coaxial cables and hence improving patient safety and comfort.

### **Frequency Synthesis for On-Coil RX**

Frequency synthesis is critical for such an on-coil RX for MRI applications, as it is required both as the LO for the front-end and as the sampling clock for the A/D conversion. On the other hand, the hostile time- and space-varying MR field also makes the design of the frequency synthesizer unique and challenging.

### *Required output frequency*

A high-field MRI provides better sensitivity and resolution but increases the ${}^{1}H$ resonance frequency proportionally. According to Eq. [2.29,](#page-38-0) the desired range can thus be derived roughly as listed in Table [2.2.](#page-38-1)

| DC Field Strength (T) | Resonance Frequency (MHz) |
|-----------------------|---------------------------|
| 1                     | 42.58                     |
| 1.5                   | 63.87                     |
| 3                     | 127.74                    |
| 7                     | 298.06                    |
| 11                    | 468.38                    |

<span id="page-38-1"></span>Table 2.2: Common MRI Field Strengths and Corresponding Operating Frequencies

### *Spectral purity considerations*

A typical MR scan may contain over 4 million voxels, each generating its own MR signal. However, with only a static homogeneous field $B_0$, these MR resonances are indistinguishable in space [\[19\]](#page-177-2). Therefore, to acquire a real image of the distribution of proton density $\rho_{1_H}(\mathbf{r})$ in the body, the spatial information has to be encoded within the signal. For this purpose, magnetic field gradients are applied that alter the DC field in a spatially specific pattern, changing the frequency/phase of the local resonance. Similar to the cellular protocols, these added spatial encoding patterns are usually complex in practice. Thus, a simplified example is illustrated in Fig. [2.5](#page-39-0) to give a general idea of the importance of the synthesizer's spectral purity. Along the x-axis, a time-varying gradient field $G_x(t)$ is applied for frequency encoding, while a time-varying gradient field $G_y(t)$ is applied for phase encoding along the y-axis (both in units of $T/m$). Thus, at time t and location $\mathbf{r} = (x, y)$, the local field strength is

$$
B(\mathbf{r},t) = B_0 + x \cdot G_x(t) + y \cdot G_y(t) \tag{2.26}
$$

$$
=B_0 + \mathbf{G}(t) \cdot \mathbf{r} \tag{2.27}
$$

which leads to the local Larmor frequency

<span id="page-38-0"></span>
$$
f(\mathbf{r},t) = \gamma (B_0 + \mathbf{G}(t) \cdot \mathbf{r}) \tag{2.28}
$$

Thus the voxels (A,B,C) can be distinguished from (D,E,F) in Fig. [2.5](#page-39-0), while the corresponding local phase can be expressed as

$$
\phi(\mathbf{r},t) = \gamma \cdot B_0 \cdot t + \gamma \int \mathbf{G}(t) \cdot \mathbf{r} \, dt \tag{2.29}
$$

by assuming the initial phase of each local resonance to be zero. With this additional phase difference, each voxel can thus be identified. These encoded local resonances are emitted and recorded by the array coils. They correspond to the spatial two-dimensional Fourier transform of the actual desired image, the so-called k-space representation [\[19\]](#page-177-2). The k-space is similar to the constellation diagram (Fig. [2.2\)](#page-25-0) in that the denser the diagram is, the more information can be carried, while both of them are impacted by the integrated phase error.

<span id="page-39-0"></span>

Figure 2.5: A conceptual diagram showing a simple case of applying both frequency encoding and phase encoding to encode horizontal (x-axis) and vertical (y-axis) spatial information of a brain, together with the resulting MRI scan of the human head.
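To give a feeling for the size of the frequency encoding described by Eq. 2.28, the following Python sketch evaluates the local Larmor frequency at two positions under a static gradient. The 3 T field, the 10 mT/m gradient and the 10 cm offset are hypothetical example values, and the function name is illustrative.

```python
GAMMA_HZ_PER_T = 42.58e6  # reduced gyromagnetic ratio of 1H in Hz/T

def local_larmor_hz(b0_t: float, gradient_t_per_m: tuple, position_m: tuple) -> float:
    """Eq. 2.28: local Larmor frequency gamma * (B0 + G . r) at a given instant."""
    g_dot_r = sum(g * r for g, r in zip(gradient_t_per_m, position_m))
    return GAMMA_HZ_PER_T * (b0_t + g_dot_r)

b0 = 3.0                 # T, hypothetical scanner field
gradient = (10e-3, 0.0)  # T/m, hypothetical gradient along x only
center = local_larmor_hz(b0, gradient, (0.0, 0.0))
offset = local_larmor_hz(b0, gradient, (0.1, 0.0))  # 10 cm away along x
print(f"frequency shift across 10 cm: {(offset - center) / 1e3:.2f} kHz")
```

Under these assumptions the spatial information occupies only tens of kHz around a carrier above 100 MHz, which illustrates why close-in phase error of the synthesizer matters for image reconstruction.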

Therefore, the quality of an MRI image depends critically on the phase accuracy and the SNR of the acquired samples. Thus, the jitter/IPN contributed by the frequency synthesizer is of great significance, in the sense that the phase accuracy, the receiver NF, and the effective number of bits (ENOB) of the A/D conversion are all related to IPN. For an acceptable MRI scan quality, sub-picosecond jitter must be ensured, as shown in [\[18\]](#page-177-1). However, what makes frequency synthesis design for on-coil MRI substantially more challenging than for cellular transceivers is the extremely hostile, strong magnetic field. A frequency reference directly derived from an on-coil XO lacks the long-term stability required by the wide scanning receive windows (50 to 100 ms) necessary to hold down the number of required RF excitations and save scan time. Its short-term stability, on the other hand, suffers from strong modulation by the intensive gradient fields, which have a typical change rate of 200 T/m/s. This results from the fact that the XO package usually contains magnetic materials such as nickel and iron, and thus an in-bore XO is sensitive to the magnetic field. A highly stable OCXO, necessarily outside the bore, can provide the required reference, but its phase characteristics get corrupted if supplied via a noisy fiber link to the on-coil receiver. This challenge remained unresolved until our proposed solution [\[20\]](#page-177-3), which will be elaborated in Chapter [5.](#page-154-0)

Although the specifications on spectral purity, especially the long-term stability, are more challenging in on-coil MRI than in advanced mobile applications, the power budget is much more relaxed in the former, as it does not operate from a handset terminal.

### **2.3 Phaselock-based Frequency Synthesis**

<span id="page-40-0"></span>Originally introduced in the 1930s to solve the LO frequency drift issue, frequency synthesis by PLL is simply a feedback control system, similar to a unity-gain voltage buffer in the voltage domain. A simple example of a PLL is depicted in Figure [2.6.](#page-40-0) A basic PLL is a closed-loop system composed of three elements: a phase detector (PD), a loop filter (LF) and an externally controlled oscillator (CO)<sup>[b](#page-40-1)</sup>. The error between the input phase $(\phi_{in})$ and the output phase $(\phi_{out})$ is detected by the PD, processed by the LF, applied to the output by the CO, and regulated by closing the negative feedback loop. In this way, the phase of the CO eventually gets "locked" to an external input, so that the long-term stability of $\phi_{out}$ is determined by the input.

Figure 2.6: A simple phase-control feedback loop.

<span id="page-40-1"></span><sup>b</sup>Without the oscillator, an alternative structure can be made, known as DLL.
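The loop behavior described above can be illustrated with a minimal phase-domain simulation in Python. This is only a sketch under stated assumptions: the loop filter is taken to be a proportional-integral filter, phases are counted in cycles, and all names, gains and frequencies are hypothetical choices made for this illustration rather than parameters of any design in this work.

```python
def simulate_pll(f_ref_hz: float, f_free_hz: float, kp: float, ki: float,
                 steps: int, dt_s: float):
    """Phase-domain PLL sketch: PD -> PI loop filter -> controlled oscillator.

    Phases are in cycles; kp (Hz/cycle) and ki (Hz/cycle/s) are assumed gains.
    Returns the phase-error history and the final oscillator frequency.
    """
    phi_in = phi_out = integral = 0.0
    errors = []
    f_out = f_free_hz
    for _ in range(steps):
        err = phi_in - phi_out              # PD: phase error
        integral += ki * err * dt_s         # LF: integral path
        f_out = f_free_hz + kp * err + integral  # CO: tuned frequency
        phi_in += f_ref_hz * dt_s           # reference phase advances
        phi_out += f_out * dt_s             # CO integrates frequency into phase
        errors.append(err)
    return errors, f_out

# Hypothetical example: the free-running CO is 1 kHz away from a 1 MHz reference.
errors, f_locked = simulate_pll(1e6, 1.001e6, kp=1e5, ki=2.5e9, steps=1000, dt_s=1e-6)
print(f"final phase error: {errors[-1]:.3e} cycles, CO frequency: {f_locked:.3f} Hz")
```

With these gains the phase error decays towards zero within the simulated millisecond and the oscillator settles at the reference frequency, i.e. the long-term frequency of the output is set by the input rather than by the free-running oscillator.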

**However, what does "phase-lock" essentially mean here?**

Assume the phase of the input reference is $\phi_{in}(t) = 2\pi f_{ref}t$ while the output phase is $\phi_{out}(t) = 2\pi f_o t + \phi_n(t) + \phi_{off}$. The term $\phi_n(t)$ represents unwanted random fluctuations in phase, i.e., the time-domain representation of phase noise here. Meanwhile, $\phi_{off}$ is a constant offset, which could be zero.

The relative phase difference between output and input is thus composed of some colored/white noise around a certain constant offset (not necessarily zero). Phase lock means that the output phase always "tracks" the input reference, and thus two simple criteria are used here:

- the frequency error between the two signals must be zero (referred to here as "frequency lock"). As phase is the integral of frequency over time, any frequency error would cause an unbounded phase error.
- the random fluctuation part,  $\phi_n(t)$ , should be properly regulated within an application-defined range.

### **2.3.1 Freq. Multiplying: Integer v.s. Fractional**

A simple PLL (Fig. [2.6](#page-40-0)) is enough for the generation of an LO output whose phase and frequency always track those of the stable input reference. However, frequency programmability is required when a PLL is used for frequency synthesis (Sec. [2.1\)](#page-20-0), which requires a tunable ratio N between the output frequency and the input reference. This is commonly realized by a divider in the feedback path of the PLL, as shown in Fig. [2.7](#page-42-0)<sup>[c](#page-41-0)</sup>. Based on this ratio N (which is treated the same as the frequency command word (FCW) within this thesis), PLLs for synthesizers can be classified into two types: the integer-N PLL and the fractional-N PLL.

An integer-N PLL is simple in terms of operation yet characterized by some intrinsic limitations. The main one is that the output frequency is forced to be an integer multiple of the reference frequency (FREF), so the output frequency resolution equals FREF. In this case, if a fine output frequency resolution is needed, e.g., the 5 kHz channel raster planned in 5G, a correspondingly low FREF has to be chosen. This forces the loop bandwidth to be extremely narrow, since it must be much lower than FREF for loop stability reasons; a factor of 10 is usually taken [\[21\]](#page-177-4) as a rule of thumb. The immediate consequence of the loop bandwidth reduction is an increase of the locking time and of the channel switching time. Moreover, a low reference frequency requires a high feedback division ratio to synthesize the desired output frequency, thus causing a considerable noise gain for the PD noise (Ch. [3\)](#page-44-0). A fractional-N synthesizer can avoid these limitations, thus achieving fast locking, agile channel switching, potentially arbitrary output frequency resolution, and more freedom in the choice of reference frequency. This is accomplished thanks to its fractional division capability, obtained by varying the feedback division ratio between different integers, usually using a multi-modulus divider. The division ratio is dynamically programmed by a control pattern whose average value corresponds to the fractional FCW. Despite the potential costs, i.e., the issue of fractional spurs, the freedom offered by the fractional-N PLL, which capitalizes on the decoupling between loop bandwidth and choice of reference, is of significant value in fulfilling advanced communication standards. Therefore, the fractional-N architecture has been chosen as the topic of this thesis, while the issue of fractional spurs will be discussed later and demonstrated to be well-regulated.

<span id="page-41-0"></span><sup>c</sup>This is a general argument, while cases of divider-less frequency multiplication by PLL will be discussed later.

<span id="page-42-0"></span>

Figure 2.7: Frequency-multiplying PLL.
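The contrast between the two approaches can be made concrete with a small Python sketch. The first function quantifies the integer-N penalties for a given channel raster, assuming the factor-of-10 bandwidth rule quoted above and the textbook in-band noise gain of $20\log_{10}N$. The second generates a division-ratio pattern with a first-order accumulator, which is the simplest possible control pattern and is used here only for illustration; it is not claimed to be the modulator used in this work. All names and the example numbers (3.5 GHz output, FCW of 100.25) are hypothetical.

```python
import math
from fractions import Fraction

def integer_n_budget(f_out_hz: float, f_ref_hz: float, bw_ratio: float = 10.0) -> dict:
    """Integer-N consequences of setting FREF equal to the channel raster."""
    n = f_out_hz / f_ref_hz
    return {
        "N": n,
        "max_loop_bw_hz": f_ref_hz / bw_ratio,  # rule-of-thumb bandwidth limit
        "noise_gain_db": 20 * math.log10(n),    # assumed in-band gain for PD noise
    }

def fractional_division_pattern(fcw: Fraction, cycles: int) -> list:
    """Per-cycle integer division ratios whose average equals fcw (first-order accumulator)."""
    n_int = fcw.numerator // fcw.denominator
    frac = fcw - n_int
    acc = Fraction(0)
    pattern = []
    for _ in range(cycles):
        acc += frac
        if acc >= 1:            # accumulator overflow: divide by N + 1 once
            acc -= 1
            pattern.append(n_int + 1)
        else:
            pattern.append(n_int)
    return pattern

print(integer_n_budget(3.5e9, 5e3))
pattern = fractional_division_pattern(Fraction(401, 4), 8)  # FCW = 100.25
print(pattern, "average =", float(Fraction(sum(pattern), len(pattern))))
```

The periodicity of the accumulator pattern is visible in the printed sequence; such periodic patterns are the origin of the fractional spurs mentioned above, which is why practical designs shape or randomize the control pattern.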

### <span id="page-42-1"></span>**2.3.2 Architecture: Analog v.s. Digital**

Considering its input and output signals, the essence of a PLL is apparently analog. Early PLLs were also realized in a fully analog way, where the input reference is usually multiplied with the feedback voltage-controlled oscillator (VCO) signal via a mixer to generate a low-frequency signal, which is proportional to their phase error [\[22\]](#page-177-5)[\[23\]](#page-177-6). However, from a historical perspective, a mature and optimized PLL solution always comes with certain assistance from digital implementation. For instance, the popular tri-state digital PFD was already proposed in the early 1970s [\[24\]](#page-177-7); it is usually associated with charge pumps to drive an analog filter [\[25\]](#page-177-8)[\[21\]](#page-177-4), known as the CPPLL nowadays.

What is the boundary that distinguishes a digital PLL from an analog one? The answer is vague according to the numerous publications over the past six decades. Within the scope of this thesis, it is the format of the signals carried in the paths in Fig. [2.7,](#page-42-0) which bridge the LF with the PD and the CO, that defines whether a specific PLL is analog or digital. In other words, a PLL is classified as a DPLL as long as the LF is realized digitally. Compared to its analog counterpart, a digital loop filter (DLF) offers the PLL higher programmability, more benefits from technology scaling, better portability and greater integrability. Thus, the continued exploration of higher PLL performance and the simultaneous reduction in size and cost of ICs have resulted in strong interest in implementing the PLL in the digital domain. From the first attempt made in 1960 [\[26\]](#page-178-0), where a sample-and-hold circuit is inserted between the passive filter and a digitally-controlled oscillator (DCO), to the epoch-making all-digital phase-locked loop (ADPLL) presented in 2004 [\[8\]](#page-176-0), which is considered the catalyst for the great surge of progress in PLLs over the past two decades, countless DPLL works have been published or patented. However, a digital-intensive implementation introduces inevitable deterioration of spectral purity, due to additional quantization noise and nonlinear distortion. This is reflected in the research of recent years, where the focus has zeroed in on better digitization performance [\[27\]](#page-178-1)[\[28\]](#page-178-2)[\[29\]](#page-178-3)[\[30\]](#page-178-4)[\[31\]](#page-178-5). Based on the previous discussion, a digital-intensive fractional-N approach has been chosen within this thesis work, while an alternative path will be explored to further solve the above-mentioned issues.

## <span id="page-44-0"></span>**Chapter 3**

# **Towards Power-Efficient High-Purity Phaselock Frequency Synthesis**

In this chapter, both the evolution of PLLs and representative PLL structures are covered briefly at the beginning. Based on previous discussions, we know that a phase-locked frequency synthesizer is composed of three function blocks plus one additional frequency multiplication path. Therefore four representative PLL architectures (two analog: CPPLL, sub-sampling PLL; two digital: multi-modulus frequency divider (MMDIV)-based and divider-less) are discussed in detail regarding their noise performance and dis-/advantages. The discussion then turns to key building blocks such as the PD and the controlled oscillator (CO), based on which an alternative power-efficient high-purity solution is proposed. A general treatment of the fractional spur issue is also included in this chapter.

### **3.1 Evolution Journey of the PLL**

The concept of PLL originated almost at the same time as the birth of modern wireless communications. It all began in the 1920s when British researchers tried to develop an alternative to Armstrong's superheterodyne receiver with fewer components. However, the phenomenon of LO frequency drift prohibited the implementation of this homodyne solution. In 1932, French engineer Henri de Bellescize solved this issue by introducing the concept of PLL, leveraging the stability of an input frequency and a negative feedback loop [\[32\]](#page-178-6) to correct the frequency shift.

Based on this original solution proposed for frequency synthesis, the PLL has seen great developments over the following decades. In the 1960s, the appearance of monolithic circuits made PLL integration possible [\[23\]](#page-177-6). In the 1970s and 1980s, the demands of modern communication systems led to significant progress in PLL research [\[33\]](#page-178-7) [\[34\]](#page-178-8) [\[35\]](#page-179-0) [\[36\]](#page-179-1). Since the 1990s, the rapid growth of CMOS RF for wireless applications has significantly propelled the advancement of PLLs towards lower power, more robustness, less noise, and better overall performance. In addition to frequency synthesis, PLLs are nowadays also found in many other applications, such as CDR for wireline communication, configurable clock generation in processors, etc. [\[21\]](#page-177-4).

On the one hand, the development and benefits of digital PLL realizations have grown rapidly with technology scaling ever since the 1960s (section [2.3.2\)](#page-42-1). On the other hand, appreciable progress has still been made in optimizing analog PLL realizations, even just over the past decade, with emerging techniques such as injection locking [\[37\]](#page-179-2), sub-sampling [\[38\]](#page-179-3) and reference sampling [\[39\]](#page-179-4). Therefore, a brief review of several classical PLLs of both analog and digital implementation is given in the following section, as the necessary background for further discussion and analysis of the proposed solution.

### **3.2 Classical PLL Architectures**

Among the three main blocks of a simple PLL (Fig. [2.6\)](#page-40-0), the PD<sup>a</sup> determines the essential operation of the loop significantly, as it is the PD that distinguishes the difference in phase between input and output,

<span id="page-45-0"></span><sup>a</sup>The PD mentioned here generally includes concepts of both phase and frequency error detection (PFD).

<span id="page-46-0"></span>

Figure 3.1: XOR-based PD (a) schematic; (b) timing diagram; (c) transfer function

as well as closes the feedback loop. Therefore, a short introduction to the PD is given first. Then four classical PLL architectures are discussed: the CPPLL and the SSPLL as representative analog realizations, and the ∆Σ-modulator divider-based DPLL and the counter-assisted DPLL as typical digital realizations.

### **3.2.1 Phase Detector**

A phase detector is a circuit that generates an output whose average value is (usually linearly) proportional to the phase difference of its two periodic inputs when they are at the same frequency. Within the context of a PLL, these two inputs are the input reference (REF) and the feedback variable clock (CKV). There are two classes of PD implementation in terms of operating scheme. The first type is referred to here as the analog-alike PD, as it is usually driven by the analog form of the two inputs; it was also the choice for early PLLs. For example, the analog mixer adopted as PD in [\[23\]](#page-177-6) generates a low-frequency voltage proportional to the input phase error (PE). This type of PD is still popular nowadays, e.g., the sub-sampling phase detector (SSPD) [\[38\]](#page-179-3)[\[40\]](#page-179-5) uses the squared REF to sub-sample a high-frequency sinusoidal oscillator output directly, so that a DC voltage is generated in proportion to the PE (when the PE is small). The second type is referred to here as the more digital-alike PD, as it is commonly realized by digital logic and driven by square waveforms at its inputs, and is sensitive only to the relative timing difference of their edges. A good example is the phase frequency detector (PFD), which was developed in the 1970s [\[36\]](#page-179-1)[\[24\]](#page-177-7) and later evolved into the classic CPPLL. The output/input characteristic of the PD is defined as the "gain" of the PD, usually denoted by $K_{PD}$.

For further illustration, a simple XOR-based PD, which works as a digital mixer, is depicted in Figure [3.1](#page-46-0) as an example. Despite its simple realization, its drawbacks are often unacceptable. The most criticized one is its limited detection range. As depicted in Figure [3.1\(](#page-46-0)c), this PD has a transfer function that is symmetrical about 0 PE, which means it cannot capture the frequency error properly. Besides, a linear gain $K_{PD}$ of $2\pi$ can only be achieved within a limited linear range of $\pi$. When there is a frequency offset or a disturbance in phase, the output simply repeats the same behavior at a certain beat rate (the difference of the input frequencies), as highlighted in red in Figure [3.1\(](#page-46-0)c); this is usually referred to as cycle slip. Without additional measures, a PLL might never re-lock once such a phenomenon happens. These drawbacks are overcome by the later proposed tri-state PFD [\[36\]](#page-179-1), which is illustrated in Figure [3.2.](#page-48-0) The three states are: 1) a positive output $e(t)$ triggered by an earlier (leading) REF; 2) a negative $e(t)$ triggered by a later (lagging) REF; 3) zero output after locking in the ideal case. In a narrow sense, this is a real PFD instead of a simple PD, as the transfer function is asymmetrical about 0 PE. This means that a positive average DC value is generated when the PE is positive (Figure [3.2\(](#page-48-0)c)) and *vice versa*. Even though this gain becomes nonlinear once a frequency error occurs (when the PE exceeds the range of $[-2\pi, +2\pi]$), it helps to eliminate cycle slips. The summation of phase error in Figure [3.2\(](#page-48-0)a) is nowadays usually realized by a charge pump that drives a passive loop filter [\[25\]](#page-177-8). In such an implementation, an additional delay $\tau/\tau_{PFD}$ is usually introduced to avoid the linearity penalties caused by the PFD gate delay and, especially, by the switch-on time of the current sources in the charge pump shown in Figure [3.2\(](#page-48-0)a) [\[41\]](#page-179-6).
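The symmetry of the XOR characteristic described above can be checked numerically. The following Python sketch averages the XOR of two ideal 50 % duty-cycle square waves over one period; the function name, the sample count and the normalization of the output to the range [0, 1] are assumptions made for this illustration only.

```python
import math


def xor_pd_average(phase_error, samples=20000):
    """Average XOR output of two ideal square waves (hypothetical helper).

    phase_error is the phase offset of CKV relative to REF in radians.
    The output is normalized so that 0 means "always low" and 1 "always high".
    """
    shift = phase_error / (2.0 * math.pi)  # offset as a fraction of one period
    high = 0
    for k in range(samples):
        t = k / samples                     # normalized time within one period
        ref = (t % 1.0) < 0.5               # REF is high in the first half period
        ckv = ((t - shift) % 1.0) < 0.5     # CKV is a shifted copy of REF
        high += ref != ckv                  # XOR of the two logic levels
    return high / samples
```

Evaluating the function at $+\pi/2$ and $-\pi/2$ gives the same average, which is the reason the XOR-based PD cannot tell a leading REF from a lagging one.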

<span id="page-48-0"></span>

Figure 3.2: A conventional tri-state PFD (a) schematic; (b) timing diagram; (c) transfer function

### **3.2.2 Classical Architectures**

In preparation for further discussion of optimized PLL design, the operating schemes of four classical PLLs are briefly covered in this part.

### **Charge-Pump PLL**

Based on the tri-state PFD just discussed, a classic $\Delta\Sigma$ modulator (DSM)-divider-based fractional-N CPPLL is conceptually shown in Figure [3.3.](#page-49-0) A charge pump (CP) converts the PFD-detected PE into a current that drives the analog loop filter, with the two current sources representing the lead/lag information separately in a single-ended way. Thus, any mismatch between the two current sources results in strong reference spurs, as the mismatch current is pumped into the VCO control line at every REF cycle. An ideal lock case is shown as well, for an integer-N channel. On the other hand, the lock case for a fractional-N channel depends on the order of the DSM, which was introduced in 1993 [\[42\]](#page-179-7) to alleviate the

<span id="page-49-0"></span>

Figure 3.3: Conceptual diagram of a classic DSM-divider-based fractional-N CPPLL.

issue of fractional spurs. With the DSM-assisted MMDIV, fractional spurs can be greatly reduced at the cost of more hardware and larger power consumption, as well as more injected current noise. However, not only does the MMDIV introduce additional noise, but the DSM also contributes quantization noise. The latter is usually mitigated by a noise-cancellation DAC, which takes its control input from the DSM [\[43\]](#page-179-8).

### **Sub-Sampling PLL**

Based on the scheme of the aforementioned analog-alike PD, an SSPLL was proposed in 2009 [\[38\]](#page-179-3), where the feedback divider is completely removed from the phase-locking path, so that its noise contribution is eliminated. The SSPD works based on a simple fact: the phase locks as long as the rising edge of REF is synchronous with a *certain point*<sup>b</sup> of the sinusoidal output from the LC-based RF harmonic oscillator. The noise contribution from the PFD and CP is greatly reduced, as will be reviewed later in this chapter, at the cost of much less robust locking. This degradation can be

<span id="page-49-1"></span><sup>b</sup>This point can be set to any DC voltage if the SSPD is realized in a single-ended way, or better set to the cross-over point of the differential VCO as done in [\[38\]](#page-179-3).

explained by two facts. Firstly, the SSPD is similar to the mixer-based PDs, and a linear PD gain is obtained only when the PE is sufficiently small. As the PE becomes larger, the PD transfer function follows a sinusoidal curve, as shown in Fig. [3.5.](#page-51-0) This means a potential cycle-slip issue when the PE is larger than one VCO cycle. Secondly, although the cycle-slip dilemma is removed by introducing one additional frequency-locked loop (FLL) path at the cost of more hardware [\[38\]](#page-179-3), the effective combined PFD gain is still nonlinear, resulting in a potential penalty in locking time. These are reflected in Figure [3.5.](#page-51-0) In addition to the reduced robustness, other issues cannot be ignored either. For instance, the direct sampling of the sinusoidal LC-VCO (without the isolation of an output buffer) could lead to a much stronger reference spur due to leakage and modulation of the load of the VCO LC-tank. Another issue is that the original sub-sampling idea only works in integer-N mode, and its extension to fractional-N mode has been shown to bring back additional quantization and noise issues [\[44\]](#page-180-0)[\[45\]](#page-180-1)[\[28\]](#page-178-2)[\[46\]](#page-180-2). A DTC-assisted example is given in Figure [3.4](#page-51-1) [\[45\]](#page-180-1). The fractional PE, which periodically accumulates up to one $T_{ckv}$, is compensated by the programmed delay in the REF path. Despite these imperfections, the superior noise performance (especially in integer-N mode) has attracted countless investigations from both industry and academia in recent years. A detailed noise analysis is given later.
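The small-PE linearity and the cycle ambiguity of the SSPD can be illustrated with a minimal Python sketch. It assumes an ideal sinusoidal oscillator output sampled around its zero crossing, with the PE expressed in radians of the oscillator period; both function names are hypothetical.

```python
import math


def sspd_sample(phase_error, amplitude=1.0):
    """Voltage sampled from an ideal sinusoid at the given phase error (assumed model)."""
    return amplitude * math.sin(phase_error)


def sspd_gain_ratio(phase_error):
    """Sampled voltage relative to the linear prediction amplitude * phase_error."""
    return math.sin(phase_error) / phase_error
```

For a PE of 0.1 rad the ratio stays very close to one, whereas at a quarter of a cycle it has already dropped to about 0.64. A PE shifted by one full oscillator period yields the same sample, which is the origin of the cycle-slip ambiguity mentioned above.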

#### **DSM-divider-based DPLL**

Even though the terms "ADPLL" and "DPLL" were already used in IEEE publications as early as the 1970s [\[48\]](#page-180-3)[\[49\]](#page-180-4), a mature structure was not formed until the work published in 2004 [\[8\]](#page-176-0). Numerous investigations have been carried out on a better digital realization of the PLL. As discussed before, phase-locking requires the frequency error to be completely eliminated. Thus, depending on the way the frequency error is eliminated (referred to as frequency-lock here), two types of classical DPLL can be distinguished. The first is the divider-based DPLL, analogous to the classical CPPLL mentioned above. The other is the counter-assisted divider-less one, which was innovatively proposed in [\[8\]](#page-176-0), partially inspired by [\[50\]](#page-180-5).

<span id="page-51-1"></span>

Figure 3.4: Illustrative diagram of a conventional SSPLL.

<span id="page-51-0"></span>

Figure 3.5: The effective SSPD gain[\[47\]](#page-180-6)

A typical divider-based structure is conceptually depicted in Figure [3.6.](#page-53-0) Similar to the CPPLL, a DSM-based MMDIV is leveraged to assist both frequency-locking and suppression of potential fractional spurs. Instead of a PFD-CP, various types of time-to-digital converter (TDC) are used here to quantize the PE between the DIV and REF signals. Different from the PFD-based CPPLL, modern DPLLs generally work in a sampling scheme, which means that one of REF and the feedback CKV is used to sample the other. A basic TDC is shown in Figure [3.7\(](#page-53-1)a). The divided CKV (DIV) is fed into a delay line composed of a number (K) of unit delay cells, each of which adds a delay of $\Delta t$ and is usually realized by an inverter. The incrementally delayed DIV is then compared (sampled) by REF, and thus the PE is digitally resolved in a way similar to a resistor-ladder-based flash ADC [\[51\]](#page-181-0). Therefore, similar to a flash ADC, a trade-off exists between resolution and detection range, imposed by the implementation cost. The situation gets worse once a higher-order DSM is employed to suppress potential fractional spurs caused by the MMDIV. This is because a higher-order DSM results in a larger dynamic range (more than one $T_{CKV}$) after locking, increasing the cost of time-to-digital conversion. In addition to the noise-power trade-off brought by the DSM-MMDIV path, the effective PD gain is also not linear once a frequency error exists, as the TDC's range is exceeded. This imperfection stems from the fact that, due to the TDC's resolution-range trade-off, the TDC range is normally designed to cover only several $T_{CKV}$, which means the TDC works like a bang-bang phase detector (BBPD) [\[52\]](#page-181-1)[\[53\]](#page-181-2) once out of range. This results in an unnecessarily long locking time, which is undesired. A noise-cancellation digital-to-time converter (DTC) can be added to reduce the DSM-introduced quantization noise [\[54\]](#page-181-3), as circled by the dashed line in Figure [3.6.](#page-53-0)
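The resolution-range trade-off and the saturation of a delay-line TDC can be sketched behaviorally as follows. This is a simplified model that assumes ideal, identical unit delays and integer picosecond inputs; the function name and the convention that the code counts the delayed DIV edges arriving no later than the REF edge are assumptions of this illustration.

```python
def delay_line_tdc(time_diff_ps, unit_delay_ps, num_cells):
    """Thermometer-code count of an ideal delay-line TDC (hypothetical helper).

    time_diff_ps  : time from the DIV edge to the sampling REF edge, in ps
    unit_delay_ps : delay of one cell (the resolution), in ps
    num_cells     : number K of unit delay cells
    """
    code = 0
    for k in range(1, num_cells + 1):
        # The k-th tap carries DIV delayed by k unit delays; it has already
        # toggled when REF samples it only if that delay fits in time_diff_ps.
        if k * unit_delay_ps <= time_diff_ps:
            code += 1
    return code
```

With 8 cells of 10 ps the linear range is only 80 ps. Any larger time difference returns the same full-scale code, so the converter then conveys little more than the sign of the error, similar to a bang-bang detector.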

#### **Counter-assisted divider-less DPLL**

The nonlinear PD gain issue found in both CPPLLs and divider-based DPLLs stems from their divider-based scheme. In such a scheme, the PE at the PD input is unbounded once a frequency offset exists, which is difficult to resolve with a PD that has only a limited linear detection range (usually a few $T_{ckv}$). When the frequency error is not small enough, the TDC only offers 1-bit information,

<span id="page-53-0"></span>

Figure 3.6: Conceptual diagram of one DSM-divider-based fractional-N DPLL.

<span id="page-53-1"></span>

Figure 3.7: (a) A simple delay-line-based TDC and (b) its transfer function.

while the PFD gives not much more feedback. This dilemma is solved by the true phase-domain method proposed in [\[8\]](#page-176-0), referred to here as the counter-assisted divider-less DPLL. A typical structure is sketched in Figure [3.8,](#page-54-0) where the divider path is removed and replaced by a counter-assisted frequency-lock path. The operation of the PD is further elaborated in Figure [3.9.](#page-55-0) As shown in Figure [3.9\(](#page-55-0)a), the key concept is to digitize the accumulated phase of both REF and CKV without any division involved and then sum the two digitized phases to get the digitized PE [\[7\]](#page-175-0). As the PE between a REF edge and its neighboring CKV edge cannot be larger than one $T_{ckv}$, a TDC with a limited linear range is sufficient. Any PE larger than one $T_{ckv}$ can always be tracked by the two counters/accumulators (REF and CKV). Therefore, an effective PD with linear gain over unbounded PE is realized, and its transfer function is shown in Figure [3.9\(](#page-55-0)c). As

<span id="page-54-0"></span>

Figure 3.8: Conceptual diagram of a counter-assisted fractional-N DPLL.

long as the fractional PE (normalized to one $T_{ckv}$) is perfectly captured by an ideal TDC with infinitely fine resolution, the equivalent PD gain is then $1/2\pi$. In addition to the intrinsically linear PD gain, the noise-power trade-off imposed by the divider path is also eliminated, even in the presence of frequency error. The counter-assisted path can be roughly viewed as an auxiliary FLL, although it actually works in the "true phase domain" [\[7\]](#page-175-0). Analogous to the FLL path in the SSPLL, this counter-assisted path does not contribute any noise after lock-in, but it is intrinsically much more robust.

Different from any general ADC application, the fractional PE input of the TDC within a DPLL contains a predictable pattern, which periodically ramps up to one $T_{ckv}$ with an incremental step of $T_{ckv}/2^{frac}$ every $2^{frac}$ cycles (Figure [3.9\(](#page-55-0)b)) [\[55\]](#page-181-4), where $frac$ is the effective number of fractional bits of the FCW. Therefore, a phase-prediction DTC can be leveraged to alleviate the resolution-range trade-off in TDCs [\[56\]](#page-181-5)[\[57\]](#page-181-6). This is much more power-efficient, just as a digital-to-analog converter (DAC) consumes less power, takes less hardware and is much easier to calibrate than an ADC, which includes not only a DAC but also associated logic as well as comparators.
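The predictable ramp of the fractional PE can be reproduced with a few lines of Python. The sketch below accumulates the FCW once per REF cycle and keeps the fractional part of the accumulated phase in units of $T_{ckv}$; exact rational arithmetic is used to avoid rounding. The function name is hypothetical, and whether a real design uses this fractional part or its complement to one is an implementation choice that is not taken from the text.

```python
from fractions import Fraction


def fractional_phase_ramp(fcw, cycles):
    """Fractional part of the accumulated reference phase per REF cycle.

    fcw    : frequency command word as a Fraction (integer part + fraction)
    cycles : number of REF cycles to simulate
    Returns a list of Fractions in [0, 1), in units of one CKV period.
    """
    phase = Fraction(0)
    ramp = []
    for _ in range(cycles):
        phase += fcw                      # reference phase accumulator
        ramp.append(phase - (phase // 1)) # keep only the fractional part
    return ramp
```

With an assumed FCW of 80.25, i.e. two fractional bits, the sequence is 1/4, 1/2, 3/4, 0 and then repeats every $2^{2} = 4$ cycles, which is the pattern a phase-prediction DTC can cancel ahead of the TDC.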

<span id="page-55-0"></span>



Based on the above discussion, counter-assisted divider-less DPLLs show the potential to achieve a more power-efficient RF frequency synthesis. However, further noise analysis is still required and is carried out in the next section.

### **3.3 Frequency Response**

The s-domain model is often used for PLL analysis and is also adopted here for the analysis and comparison of different PLLs in terms of noise performance. However, since the s-domain requires a valid linear time-invariant (LTI) continuous-time approximation of the system, where is the boundary up to which a simple but accurate model still holds? Early fully analog PLLs fulfill this precondition without difficulty. The CPPLL, however, although usually treated as an analog PLL, is discrete in time and nonlinear by nature. Both the nonlinearity and the sampling effect are caused by the PFD, due to its digital nature. The same issue is shared by the SSPD and by the TDC used in divider-based DPLLs. For the counter-based DPLL, even though the effective PD gain is ideally linear over the full phase domain, the sampling effect cannot be ignored. Obviously, neglecting nonlinearity and sampling affects the model accuracy. The nonlinear behavior caused by the nonlinear PD gain during lock acquisition can simply be avoided by restricting the phase error to the linear region of the PD, which is valid once an RF PLL has achieved lock. Sampling, on the other hand, brings discretization in time and potential aliasing, and is therefore briefly reviewed below.

### **Sampling effect**

Sampling of the PE generally reduces loop stability. Worse still, the PFD adopted in the CPPLL is further impeded by a PE-dependent sampling rate, which results from the fact that both REF and DIV can trigger the PFD. A constant sampling rate determined by the period of REF occurs only if the CP is activated by REF (UP). Otherwise, non-periodic sampling results from charge pump (CP) activation by the non-periodic DIV, which is modulated by the DSM. Fortunately, the lack of a constant sampling rate can most often be ignored without major loss of accuracy [\[21\]](#page-177-4) [\[58\]](#page-181-7). A pseudo-continuous approximation that includes the sampling effect can now be investigated.

Consider the PE signal $\phi_e(t)$, which is sampled with a period $T_{REF}$ and converted to an impulse sequence $\hat{\phi}_e(t)$ that can be expressed as

$$
\hat{\phi}_e(t) = \sum_{k=-\infty}^{\infty} \phi_e(kT_{REF})\delta(t - kT_{REF}) \tag{3.1}
$$

The spectrum of the sampled PE can thus be found by taking the Fourier transform of the above equation, leading to

$$
\hat{\Phi}_e(f) = \frac{1}{T_{REF}} \sum_{k=-\infty}^{\infty} \Phi_e\left(f - \frac{k}{T_{REF}}\right) \tag{3.2}
$$

This result shows that the Fourier transform of $\hat{\phi}_e(t)$, i.e., $\hat{\Phi}_e(f)$, is composed of multiple replicas of the Fourier transform of $\phi_e(t)$, $\Phi_e(f)$, which are scaled by $1/T_{REF}$ and shifted in frequency from one another with a spacing of $1/T_{REF}$. It is valid to assume that the frequency content of $\Phi_e(f)$ is confined between $-1/(2T_{REF})$ and $1/(2T_{REF})$, so that aliasing can be neglected. Since $\hat{\phi}_e(t)$ is fed into the low-pass LF, whose bandwidth is much lower than $1/T_{REF}$, the replica spectral contents are significantly attenuated. This can be intuitively explained by the fact that as long as the loop response is slow enough with respect to the REF rate, the loop simply reacts to the average value of the discrete PD output and thus behaves as a continuous-time system. Gardner showed that the continuous-time approximation works well if the closed-loop bandwidth is narrower than $1/10$ of $f_{REF}$ [\[25\]](#page-177-8). This holds within the scope of this thesis, and therefore, an s-domain analysis is adopted below.

#### **Order and Type**

The order of the loop (the degree of the transfer-function polynomial) usually draws more attention; however, the loop type matters just as much for the properties of a PLL. As a term borrowed from control system theory, the type refers to the number of integrators within the loop. Since each integrator contributes one pole to the transfer function, the order can never be less than the type. Additional non-integrating filtering is often used, contributing additional poles and increasing the order with no effect on the type. Besides, considering the inherent integration in the output oscillator, a PLL is at least type 1. However, an additional pole is usually placed at DC to drive the static PE to zero, making type-1 structures rare.

#### **General Definitions**

Within this section, the quantities $\phi_{out}$ and $\phi_R$ are defined as the output phase and the reference phase, respectively, in radians for convenience. The quantity N is the frequency-division ratio between the output frequency and the input reference of the PLL, which is the same as the FCW. The colored quantities in Fig. [3.10](#page-59-0) represent various noise contributions, where different colors denote different characteristics<sup>[c](#page-58-0)</sup>.

### **3.3.1 Noise Analysis of CPPLL**

Fig. [3.10](#page-59-0) shows the various noise sources injected into the loop of a CPPLL. They are $\phi_{n,R}$ from the reference, $\phi_{n,PFD}$ from the PFD gates, $\phi_{n,CP}$ from the CP, $\phi_{n,divider}$ from the MMDIV, $\phi_{n,LF}$ from the LF, and $\phi_{n,V}$ from the VCO. By combining the gain of the tri-state PFD ($K_{PFD} = \frac{1}{2\pi}$) and the gain of the charge pump stage ($I_{CP}$), the effective PD gain can be defined as $\frac{I_{CP}}{2\pi}$ (A/rad). Meanwhile, the transfer function of the loop filter is denoted simply as $LF(s)$ here (current-to-voltage transfer). Examples of 1st-, 2nd- and 3rd-order RC filters are given as well, which lead to 2nd-, 3rd- and 4th-order loops due to the pole contributed by the VCO. Besides, the VCO is characterized by an integration $K_v/s$ due to its frequency-to-phase conversion, with its gain $K_v$ given in rad/(s·V). As with other feedback systems, the PLL can be analyzed by its open-loop gain and closed-loop transfer functions. The open-loop gain $G_{CP}(s)$ between input and feedback can be defined as

<span id="page-58-0"></span><sup>c</sup>Red: low-pass; green: band-pass; blue: high-pass.

<span id="page-59-0"></span>

Figure 3.10: S-domain transfer function of a typical Charge-Pump PLL.

<span id="page-59-2"></span>
$$
G_{CP}(s) = \frac{I_{CP}}{2\pi} LF(s) \frac{K_v}{N \cdot s} \tag{3.3}
$$

If the open-loop gain $G_{path}(s)$ from any node of the loop to the output $\phi_{out}$ is known, the corresponding closed-loop transfer function can be derived as

<span id="page-59-1"></span>
$$
H(s) = \frac{G_{path}(s)}{1 + G_{CP}(s)}\tag{3.4}
$$

Applying Equation [3.4,](#page-59-1) the contribution from different noise sources to the PLL output can be analyzed as below.

Once locked,  $G_{path}(s)$  of the noise sources present at the reference input before the PD, mainly  $\phi_{n,R}$ ,  $\phi_{n,PFD}$  and  $\phi_{n,divider}$ , can be simplified as

$$
G_{path, reference}(s) = N \cdot G_{CP}(s) \tag{3.5}
$$

and therefore their transfer function to the output can be derived as

<span id="page-60-0"></span>
$$
H_R(s) = N \cdot \frac{G_{CP}(s)}{1 + G_{CP}(s)}\tag{3.6}
$$

Overall, the transfer function to the output for each of the noise sources highlighted in red in Fig. [3.10](#page-59-0) can be derived by first referring the source to the input of the PFD and then multiplying by Eqn. [3.6.](#page-60-0) Therefore, the transfer function for $\phi_{n,CP}$ is expressed as

<span id="page-60-1"></span>
$$
H_{CP}(s) = \frac{2\pi N}{I_{CP}} \cdot \frac{G_{CP}(s)}{1 + G_{CP}(s)} \tag{3.7}
$$

The same method can be applied to the noise contributed by the loop filter, i.e., thermal noise from the passive R or noise from the active devices in an active-filter implementation.

$$
H_{LF}(s) = \frac{1}{1 + G_{CP}(s)} \cdot \frac{K_v}{s} \tag{3.8}
$$

Similarly, since $G_{path}(s)$ for the VCO noise is 1 in Fig. [3.10,](#page-59-0) the transfer function to the output for the VCO-referred noise $\phi_{n,V}$ is

$$
H_V(s) = \frac{1}{1 + G_{CP}(s)} \tag{3.9}
$$
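The transfer functions of Eqs. 3.3, 3.6, 3.8 and 3.9 can be evaluated numerically to see their filtering behavior. The Python sketch below assumes the common 2nd-order passive filter topology (R in series with $C_1$, with $C_2$ in parallel to that branch) for $LF(s)$; the component values in `EXAMPLE` and all function names are hypothetical and chosen only for illustration.

```python
import math


def loop_filter_impedance(s, r, c1, c2):
    """Impedance of R in series with C1, shunted by C2 (assumed topology)."""
    return (1 + s * r * c1) / (s * (c1 + c2) * (1 + s * r * c1 * c2 / (c1 + c2)))


def cppll_transfer_magnitudes(omega, i_cp, r, c1, c2, k_v, n):
    """Return (|H_R|, |H_LF|, |H_V|) at the angular frequency omega in rad/s."""
    s = 1j * omega
    # Open-loop gain, Eq. 3.3, with k_v in rad/(s*V)
    g = i_cp / (2 * math.pi) * loop_filter_impedance(s, r, c1, c2) * k_v / (n * s)
    h_r = n * g / (1 + g)        # Eq. 3.6: reference-side sources
    h_lf = (k_v / s) / (1 + g)   # Eq. 3.8: loop-filter noise
    h_v = 1 / (1 + g)            # Eq. 3.9: VCO noise
    return abs(h_r), abs(h_lf), abs(h_v)


# Hypothetical example values, not taken from the thesis
EXAMPLE = dict(i_cp=100e-6, r=10e3, c1=1e-9, c2=50e-12,
               k_v=2 * math.pi * 50e6, n=80)
```

Evaluating the function well below, around and well above the loop bandwidth shows $|H_R|$ approaching N at low frequencies, $|H_V|$ approaching one at high frequencies, and $|H_{LF}|$ peaking in between.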

Notice here that the term $A(s)$

<span id="page-60-2"></span>
$$
A(s) = \frac{G_{CP}(s)}{1 + G_{CP}(s)} \tag{3.10}
$$

contained in $H_R$ and $H_{CP}$ always has a DC gain of 1, as $|G_{CP}(s)| \gg 1$ within the PLL bandwidth, while $H_V$ is equivalent to $(1 - A(s))$. It is therefore interesting to observe that all the noise sources highlighted in red in Fig. [3.10](#page-59-0) experience a low-pass transfer function. Meanwhile, the VCO noise, colored in blue, is high-pass filtered, and the LF noise, colored in green, has a band-pass characteristic. Qualitatively, this means that outside the loop bandwidth, the phase

<span id="page-61-0"></span>

Figure 3.11: A typical CPPLL PN profile with noise sources qualitatively shown.

noise is dominated by the VCO contribution, while noise from the reference, the divider, the CP and the PFD dominates in-band. This is qualitatively illustrated in Fig. [3.11](#page-61-0), and it generally applies to other PLLs as well, as shown later. To optimize the IPN performance, the loop bandwidth is usually chosen at the point where the VCO and non-VCO noise intersect, as depicted in Fig. [3.11,](#page-61-0) which is usually called the optimal loop bandwidth, marked here as $f_{c,opt}$. It can be mathematically proven that the IPN contributions from VCO and non-VCO noise are equal when the loop bandwidth equals $f_{c,opt}$ (see the appendix of [\[59\]](#page-182-0)). In practice, the choice of loop bandwidth usually has to accommodate system specifications, e.g., locking time and modulation bandwidth. For the CPPLL, if the LF noise contribution is negligible, all other non-VCO sources are raised by $20\log_{10}(N)$ dB, which is referred to as **frequency gain** in this thesis. For instance, a phase noise floor of -150 dBc/Hz above 10 kHz offset from a 40 MHz carrier is commonly found in mid-grade commercial XOs, and this results in a corresponding phase noise floor of -112 dBc/Hz referred to a 3.2 GHz output carrier.
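The frequency-gain arithmetic of the example above can be reproduced with a short Python sketch. The function names are hypothetical; the calculation is simply the reference floor plus $20\log_{10}(N)$ with $N = f_{out}/f_{ref}$.

```python
import math


def frequency_gain_db(f_out_hz, f_ref_hz):
    """Frequency gain 20*log10(N) in dB, with N = f_out / f_ref."""
    return 20.0 * math.log10(f_out_hz / f_ref_hz)


def referred_floor_dbc_hz(ref_floor_dbc_hz, f_out_hz, f_ref_hz):
    """In-band floor at the output carrier caused by the reference floor."""
    return ref_floor_dbc_hz + frequency_gain_db(f_out_hz, f_ref_hz)
```

For a 40 MHz reference and a 3.2 GHz output, N is 80 and the frequency gain is about 38 dB, so a -150 dBc/Hz reference floor appears at roughly -112 dBc/Hz at the output.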

We now zero in on the in-band noise contribution of the CP, one of the major contributors, for a later comparison of different PLL structures. When both the UP and DOWN current sources in the CP (Fig. [3.3](#page-49-0)) are on, the PSD of the output thermal noise current is given by

$$
S'_{iCP,n} = 4kT\gamma \cdot 2g_m \tag{3.11}
$$

assuming the same $g_m$ for the UP and DN current sources.

In principle, the charge pump does not contribute noise at all to the PLL once it is locked as no current source is on. However, both current sources are turned on during a time  $\tau_{PFD}$  per  $T_{ref}$  cycle to avoid the deadzone effect. Therefore the minimum current noise floor contributed from CP, in theory, is (**in an ideal integer-N channel**)

$$
S_{iCP,n} = 4kT\gamma \cdot 2g_m \cdot \frac{\tau_{PFD}}{T_{ref}} \tag{3.12}
$$

Based on Eqs. [3.7](#page-60-1) and [3.10](#page-60-2) and $A(0)=1$, the resulting in-band noise floor at the CPPLL output can be derived as

$$
\mathcal{L}_{CP}(\Delta f) = 0.5 \cdot S_{iCP,n} \cdot \left(\frac{2\pi N}{I_{CP}}\right)^2 \tag{3.13}
$$

<span id="page-62-0"></span>
$$
=\frac{4\pi^2 f_{out}^2}{f_{ref}} \frac{4kT\gamma g_m}{I_{CP}^2} \tau_{PFD} \tag{3.14}
$$

Besides the CP noise contribution, the MMDIV noise cannot be ignored either, and suppressing it sufficiently requires more power. Moreover, in fractional-N channels, the quantization noise from the DSM and the periodically longer switch-on time of the CP raise the output in-band noise floor above the one derived above.
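As a numerical cross-check of Eqs. 3.12 to 3.14, the following Python sketch evaluates the CP-induced in-band floor in both forms. The default values for $\gamma$ and the temperature, as well as the function names, are assumptions for illustration; $1/T_{ref}$ is written as $f_{ref}$.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def cp_floor_eq_3_13(f_out, f_ref, i_cp, g_m, tau_pfd, gamma=1.0, temp=300.0):
    """In-band floor (linear, 1/Hz) from Eqs. 3.12 and 3.13."""
    s_i = 4 * K_B * temp * gamma * 2 * g_m * tau_pfd * f_ref  # Eq. 3.12
    n = f_out / f_ref
    return 0.5 * s_i * (2 * math.pi * n / i_cp) ** 2           # Eq. 3.13


def cp_floor_eq_3_14(f_out, f_ref, i_cp, g_m, tau_pfd, gamma=1.0, temp=300.0):
    """Same floor written in the closed form of Eq. 3.14."""
    return (4 * math.pi ** 2 * f_out ** 2 / f_ref
            * 4 * K_B * temp * gamma * g_m / i_cp ** 2 * tau_pfd)


def to_dbc_hz(linear):
    """Convert a linear single-sideband noise ratio to dBc/Hz."""
    return 10.0 * math.log10(linear)
```

Both functions return the same value for any parameter set, and doubling $f_{out}$ at a fixed reference raises the floor by about 6 dB, in line with the $N^2$ dependence.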

#### **Loop Bandwidth**

We can now make a rough estimate of the factors that determine the closed-loop bandwidth. Taking the widely used 2nd-order RC filter (depicted in Fig. [3.10\)](#page-59-0) as an example, we have a 3rd-order PLL to analyze. Compared to the basic 1st-order RC filter, the additional $C_2$ is introduced to further reduce the filter impedance at high frequencies outside the PLL bandwidth, resulting in two poles and one zero. Therefore $C_2$ is usually chosen to be far smaller than $C_1$. At low frequencies, the filter impedance is roughly $1/(s(C_1 + C_2))$; at intermediate frequencies (around the loop bandwidth), it is practically equal to R; at high frequencies, it reduces to $1/(sC_2)$. In addition, the unity-gain frequency $\omega_u$ of the open-loop gain, Eq. [3.3,](#page-59-2) can be used to estimate the closed-loop bandwidth, as

$$
|G_{CP}(j\omega_u)| \approx \frac{I_{CP}}{2\pi} R \frac{K_v}{N \cdot \omega_u} = 1 \tag{3.15}
$$

which leads to a rough estimation of the CPPLL loop bandwidth as

<span id="page-63-0"></span>
$$
\omega_u \approx \frac{I_{CP}R \cdot K_v}{2\pi \cdot N} \tag{3.16}
$$

This estimate, albeit rough, clearly indicates the dependence of the bandwidth on parameters of the PD (CP), the LF and the VCO, which are sensitive to PVT variations. In addition, with such a structure, the choice of bandwidth is coupled with factors, e.g., $I_{CP}$, that determine the noise performance as well (Eq. [3.13\)](#page-62-0). This may lead to design difficulties in cases where both a low in-band noise floor and a low bandwidth are wanted.
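The bandwidth estimate of Eq. 3.16 is easy to evaluate. The Python sketch below uses hypothetical component values and a hypothetical function name, with $K_v$ expressed in rad/(s·V).

```python
import math


def unity_gain_frequency_rad(i_cp, r, k_v, n):
    """Rough loop bandwidth from Eq. 3.16, in rad/s."""
    return i_cp * r * k_v / (2 * math.pi * n)


def rad_to_hz(omega):
    """Convert an angular frequency in rad/s to Hz."""
    return omega / (2 * math.pi)
```

For example, with an assumed $I_{CP}$ of 100 µA, R of 10 kΩ, $K_v$ of $2\pi \cdot 50$ Mrad/(s·V) and N of 80, the estimate is 625 krad/s, i.e. roughly 100 kHz. Doubling N halves it, which shows the coupling between channel, charge-pump current and bandwidth.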

### **3.3.2 Noise Analysis of SSPLL**

Here we stick to the original SSPLL implementation, and therefore only the integer-N case is considered [\[38\]](#page-179-3). As briefly shown in Fig. [3.12,](#page-64-0) the noise contributions from the LF and the VCO are no different from those in a typical CPPLL. Even though the divide-by-N divider and its noise are completely removed, the reference noise still experiences a virtual multiplication due to the auxiliary FLL path, so its transfer function is the same as Eqn. [3.6.](#page-60-0) This can also be explained by the scaling effect discussed in Sec. [2.1.2.](#page-25-1) The real game changer is the effective feedback gain. As seen from Eqn. [3.13,](#page-62-0) the equivalent gain for CP noise is $\frac{2\pi N}{I_{CP}}$, where the PFD-CP gain is divided by N and thus the noise gets multiplied by $N^2$ at the output. As shown in Fig. [3.4](#page-51-1) and [\[38\]](#page-179-3), the equivalent PD gain of the SSPLL involves no factor of N and is expressed as

$$
K_{SSPD} = 2A_{VCO} \cdot g_m \frac{\tau_{on}}{T_{REF}} \tag{3.17}
$$

<span id="page-64-0"></span>

Figure 3.12: S-domain transfer function of a SSPLL.

where $A_{VCO}$ is the oscillator amplitude, $g_m$ is the transconductance of the CP in the SSPLL, and $\tau_{on}$ is the on-time per REF cycle introduced for gain reduction [\[38\]](#page-179-3), which is generated by the pulse generator shown in Fig. [3.4.](#page-51-1) The equivalent current thermal noise floor from the CP can thus be derived as

$$
S_{iSSCP,n} = 4kT\gamma \cdot 2g_m \cdot \frac{\tau_{on}}{T_{REF}} \tag{3.18}
$$

and its final contribution to the output phase noise floor is therefore,

$$
\mathcal{L}_{SSCP}(\Delta f) = 0.5 \cdot S_{iSSCP,n} / K_{SSPD}^2 \tag{3.19}
$$

$$
=\frac{kT\gamma}{A_{VCO}^2 \cdot g_m} \cdot \frac{T_{REF}}{\tau_{on}}\tag{3.20}
$$

By carefully designing the on-time ratio $\frac{T_{REF}}{\tau_{on}}$ set by the pulse generator, the noise floor due to the CP in the SSPLL can be made significantly lower than the one in the CPPLL case, as there is no frequency gain N anymore.

This fact is visually shown in Fig. [3.13,](#page-65-0) under the assumption that other noise contributions are sufficiently lower compared to CP. In such a case, the output in-band noise floor can be enormously reduced, indicating a higher  $f_{c,out}$  is possible, which means the possibility of a higher modulation rate as well as faster locking speed.

Again, the bandwidth analysis can be derived following the methods applied to the CPPLL, and it still depends on PVT sensitive factors such as  $A_{VCO}$ ,  $g_m$  as well as R, C and  $K_{VCO}$ .
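As a rough numerical illustration of Eqs. 3.17 to 3.20, the sketch below evaluates the SSPD gain, the CP current noise and the resulting in-band floor. The operating point (0.4 V amplitude, 1 mS transconductance, 10% on-time ratio, $\gamma = 1$, 300 K) is an assumption chosen only for illustration and is not taken from the text; all function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def sspd_gain(a_vco, g_m, duty):
    """Eq. 3.17: K_SSPD = 2 * A_VCO * g_m * (tau_on / T_REF), in A/rad."""
    return 2.0 * a_vco * g_m * duty


def sscp_current_noise(g_m, duty, temp_k=300.0, gamma=1.0):
    """Eq. 3.18: S_i = 4kT*gamma * 2*g_m * (tau_on / T_REF), in A^2/Hz."""
    return 4.0 * K_B * temp_k * gamma * 2.0 * g_m * duty


def sscp_noise_floor_dbc(a_vco, g_m, duty, temp_k=300.0, gamma=1.0):
    """Eq. 3.19: L = 0.5 * S_i / K_SSPD^2, returned in dBc/Hz."""
    s_i = sscp_current_noise(g_m, duty, temp_k, gamma)
    k_pd = sspd_gain(a_vco, g_m, duty)
    return 10.0 * math.log10(0.5 * s_i / k_pd**2)


# Assumed operating point (illustrative only, not from the text).
A_VCO, G_M, DUTY = 0.4, 1e-3, 0.1  # V, S, tau_on / T_REF
floor_db = sscp_noise_floor_dbc(A_VCO, G_M, DUTY)

# Closed form of Eq. 3.20, used as a cross-check of the algebra.
closed_form_db = 10.0 * math.log10(K_B * 300.0 / (A_VCO**2 * G_M * DUTY))
print(f"SSCP in-band floor: {floor_db:.1f} dBc/Hz")
```

Note that no multiplication factor N appears anywhere in these expressions, which is the point made in the text.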

### **3.3.3 Noise Analysis of Counter-based DPLL**

Other than the removal of noise from the DSM and MMDIV, the counter-based DPLL has a transfer function similar to that of the divider-based one.

<span id="page-65-0"></span>

Figure 3.13: CP noise contribution is not multiplied by  $N^2$  in SSPLL, as compared to that in CPPLL.

For simplicity, the divider-based one is therefore skipped here. The s-domain transfer function with noise sources in a counter-based DPLL can be found in Fig. [3.14,](#page-66-0) where $\phi_{n, TDC}$ represents both the thermal noise and the quantization noise contributed by the TDC/DTC, which are not present in an analog realization. In addition, in order to decouple the PVT-sensitive oscillator gain $K_o$ from the loop properties, gain normalization is usually included in a DPLL so that $\frac{K_o}{\hat{K}_o}$ is one. The loop filter in Fig. [3.14](#page-66-0) shows the transfer function of the digital filter in the s-domain: it can be type-I, with only a proportional gain $\alpha$ (during fast acquisition), type-II, with both proportional ($\alpha$) and integral ($\rho$) paths, or of higher order with infinite impulse response (IIR) filters turned on. These LF parameters are programmable and can be dynamically reconfigured during regular PLL operation. For the following frequency response analysis, type-II operation is assumed due to its generality (type-I can be considered as type-II with $\rho = 0$).

Now regarding the DPLL, the open-loop gain  $G_{DPLL}$  can be derived as

$$
G_{DPLL}(s) = \frac{\phi_{out}}{N \cdot \phi_R} \tag{3.21}
$$

$$
= \left(\alpha + \frac{\rho f_R}{s}\right) \cdot \frac{f_R}{\hat{K}_o} \cdot \frac{K_o}{s} \tag{3.22}
$$

<span id="page-66-0"></span>

Figure 3.14: S-domain transfer function of a counter-based ADPLL.

$$
= \left(\alpha + \frac{\rho f_R}{s}\right) \cdot \frac{f_R}{s} \tag{3.23}
$$

with a perfect DCO gain estimation. Here an additional pre-division of M is included simply for convenience of later discussions. Now, considering the IIR filter in the proportional path of the LF, the transfer function of a one-stage IIR filter in the s-domain is

$$
H_{IIR}(s) = \frac{1 + s/f_R}{1 + s/\lambda f_R} \tag{3.24}
$$

K cascaded, independently controlled IIR stages can be inserted to further attenuate the reference and PD noise. Each IIR stage has its own attenuation factor ($\lambda_i < 1$), and the open-loop transfer function becomes

$$
G_{DPLL,IIR}(s) = \left(\alpha + \frac{\rho f_R}{s}\right) \cdot \frac{f_R}{s} \cdot \prod_{i=1}^{K} \frac{1 + s/f_R}{1 + s/\lambda_i f_R} \tag{3.25}
$$

Following Eqn. [3.4,](#page-59-1) the noise transfer function for the reference noise $\phi_{n,R}$ is

<span id="page-67-0"></span>
$$
H_R(s) = N \cdot \frac{G_{DPLL}(s)}{1 + G_{DPLL}(s)} \tag{3.26}
$$

which is the same as in the CPPLL and SSPLL, and again can be explained by the frequency scaling in Sec. [2.1.2.](#page-25-1)

One interesting fact is that the PD (in the form of a TDC here) has the same noise transfer function as an SSPD, because the divider is removed from the phase loop while the DCO phase is effectively "subsampled" in the TDC. Thus, its noise transfer function is

$$
H_{TDC}(s) = \frac{G_{DPLL}(s)}{1 + G_{DPLL}(s)}\tag{3.27}
$$
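The transfer functions in Eqs. 3.23 to 3.27 can be evaluated numerically along $s = j2\pi f$. The following is a minimal sketch, assuming perfect DCO gain normalization; the loop settings ($\alpha = 2^{-5}$, $\rho = 2^{-12}$, a 40 MHz reference and N = 75) and the function names are illustrative assumptions rather than values from the text.

```python
import math


def open_loop_gain(f_offset, alpha, rho, f_ref, lambdas=()):
    """Eq. 3.23, optionally extended by the IIR stages of Eq. 3.25."""
    s = 1j * 2.0 * math.pi * f_offset
    g = (alpha + rho * f_ref / s) * f_ref / s
    for lam in lambdas:
        # One IIR stage as written in Eq. 3.24.
        g *= (1.0 + s / f_ref) / (1.0 + s / (lam * f_ref))
    return g


def h_tdc(f_offset, alpha, rho, f_ref, lambdas=()):
    """Eq. 3.27: TDC noise sees G / (1 + G), without the factor N."""
    g = open_loop_gain(f_offset, alpha, rho, f_ref, lambdas)
    return g / (1.0 + g)


def h_ref(f_offset, n, alpha, rho, f_ref, lambdas=()):
    """Eq. 3.26: reference noise sees N * G / (1 + G)."""
    return n * h_tdc(f_offset, alpha, rho, f_ref, lambdas)


# Assumed loop settings (illustrative only).
ALPHA, RHO, F_REF, N = 2**-5, 2**-12, 40e6, 75
for f in (1e3, 1e5, 1e7):
    print(f"{f:>10.0f} Hz  |H_R| = {abs(h_ref(f, N, ALPHA, RHO, F_REF)):8.3f}"
          f"  |H_TDC| = {abs(h_tdc(f, ALPHA, RHO, F_REF)):6.4f}")
```

Both responses are low-pass; they differ only by the constant factor N, which is why the TDC behaves like an SSPD in terms of noise transfer.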

Next, the noise contributed by the TDC itself has to be examined. As with any A/D converter, both physical thermal noise and quantization noise degrade the SNR. For a general discussion, we take the basic delay-line-based flash TDC, as depicted in Fig. [3.7,](#page-53-1) for analysis. In such a structure, the resolution is denoted $t_{res}$ and is equal to one inverter/buffer delay, while its full range has to cover one $T_{CKV}$. In practice, the full range has to be much more than one $T_{CKV}$ to account for the substantial delay-line change due to PVT variation.

#### **TDC Quantization Noise**

Since the TDC digitizes the analog PE, it introduces quantization noise. Similar to the noise calculation in A/D converters [\[51\]](#page-181-0), a least-significant-bit (LSB) size of $t_{res}$ translates to a total noise power of $t_{res}^2/12$ in the time domain, denoted $\sigma_t^2$, and thus leads to a phase noise (in $rad^2$) at the output of

$$
\sigma_{\phi}^2 = (2\pi)^2 \left(\frac{\sigma_t}{T_V}\right)^2 \tag{3.28}
$$

where $T_V$ is the nominal period of the feedback oscillator signal. As the PE is sampled by the TDC at the REF rate, the quantization noise exhibits a PSD of $t_{res}^2/(12f_{REF})$ spread uniformly from DC to the Nyquist frequency, leading to a phase noise floor at the PLL output due to quantization of

$$
\mathcal{L} = \frac{(2\pi)^2}{12} \left(\frac{t_{res}}{T_V}\right)^2 T_R \tag{3.29}
$$
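To get a feeling for the numbers in Eq. 3.29, the sketch below evaluates the quantization-limited floor. The 8 ps resolution, 3 GHz output and 40 MHz reference are example values assumed for illustration; the function name is hypothetical.

```python
import math


def tdc_quantization_floor_dbc(t_res, f_out, f_ref):
    """Eq. 3.29 with T_V = 1/f_out and T_R = 1/f_ref, in dBc/Hz."""
    linear = (2.0 * math.pi) ** 2 / 12.0 * (t_res * f_out) ** 2 / f_ref
    return 10.0 * math.log10(linear)


# Assumed example: 8 ps and 4 ps resolution, 3 GHz output, 40 MHz reference.
floor_8ps = tdc_quantization_floor_dbc(8e-12, 3e9, 40e6)
floor_4ps = tdc_quantization_floor_dbc(4e-12, 3e9, 40e6)
print(f"8 ps: {floor_8ps:.1f} dBc/Hz, 4 ps: {floor_4ps:.1f} dBc/Hz")
```

Under these assumptions, halving $t_{res}$ lowers the floor by about 6 dB, and doubling the reference frequency lowers it by about 3 dB.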

#### **TDC Thermal Noise**

In addition to quantization noise, jitter due to thermal noise in an inverter has to be considered as well. According to the analysis model proposed in [\[11\]](#page-176-1), two sources of white noise dominate. The first is thermal current noise: as the capacitance $C_l$ [\[11\]](#page-176-1) at the output node of an inverter gets discharged or charged while the NMOS/PMOS transistor is in the saturation region, the current noise is integrated, similar to the CP noise analyzed before. The other contribution stems from the digital switching behavior of the inverter, resulting in $kT/C$ noise. The propagation delay is thus jittered by noise in both the pull-up and pull-down processes. Assume that these noise events are uncorrelated, that the charge and discharge currents are equal and denoted $I_d$, and that both transitions share the same absolute threshold $V_{th}$, so that the rise and fall times are equal and denoted $t_d$. The accumulated jitter due to thermal noise from one inverter can then be written as

$$
\sigma_{t,inv}^2 = 2\left(\frac{4kT\gamma t_d^2}{C_l \cdot 0.5V_{DD}(V_{DD} - V_{th})} + \frac{4kT \cdot t_d^2}{C_lV_{DD}^2}\right) \tag{3.30}
$$

where the first term corresponds to the current noise injected during both the rising and the falling edge, while the second term corresponds to the $kT/C$ noise. For an M-stage inverter chain, the accumulated jitter due to thermal noise at the output is simply

$$
\sigma_{t,tm}^2 = M\sigma_{t,inv}^2 \tag{3.31}
$$

The corresponding in-band noise floor at output can be derived as

$$
\mathcal{L}_{TDC,thermal} = \left(\frac{2\pi\sigma_{t,tm}}{T_V}\right)^2 T_R \tag{3.32}
$$
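Eqs. 3.30 to 3.32 can be chained into a small calculation. The sketch below does so for an assumed inverter (8 ps delay, 2 fF load, 1 V supply, 0.4 V threshold, $\gamma = 1$, 300 K) and an assumed 42-stage chain; none of these values come from the text, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def inverter_jitter_var(t_d, c_l, v_dd, v_th, temp_k=300.0, gamma=1.0):
    """Eq. 3.30: thermal jitter variance (s^2) of one inverter."""
    kt = K_B * temp_k
    current_term = 4.0 * kt * gamma * t_d**2 / (c_l * 0.5 * v_dd * (v_dd - v_th))
    ktc_term = 4.0 * kt * t_d**2 / (c_l * v_dd**2)
    return 2.0 * (current_term + ktc_term)


def chain_jitter_var(m, sigma_inv_sq):
    """Eq. 3.31: uncorrelated jitter adds up linearly over M stages."""
    return m * sigma_inv_sq


def tdc_thermal_floor_dbc(sigma_tm_sq, f_out, f_ref):
    """Eq. 3.32 with T_V = 1/f_out and T_R = 1/f_ref, in dBc/Hz."""
    return 10.0 * math.log10((2.0 * math.pi * f_out) ** 2 * sigma_tm_sq / f_ref)


# Assumed inverter and chain parameters (illustrative only).
sigma_inv_sq = inverter_jitter_var(t_d=8e-12, c_l=2e-15, v_dd=1.0, v_th=0.4)
sigma_tm_sq = chain_jitter_var(42, sigma_inv_sq)
thermal_floor = tdc_thermal_floor_dbc(sigma_tm_sq, 3e9, 40e6)
print(f"sigma_inv = {math.sqrt(sigma_inv_sq) * 1e15:.1f} fs, "
      f"thermal floor = {thermal_floor:.1f} dBc/Hz")
```

With these assumed values the per-stage jitter is on the order of tens of femtoseconds, far below the 8 ps quantization step.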

As in A/D converter design, any optimal high-performance TDC designed for a high-purity DPLL should have its thermal noise dominate over its quantization noise. As for the DCO's phase noise, it shares the same characteristic as that of the VCO in analog PLLs, i.e., a high-pass transfer function to the output.

#### **Loop Bandwidth**

The closed-loop transfer function $H_R$ can be compared to a classical two-pole system transfer function

<span id="page-69-0"></span>
$$
H(s) = N \frac{2\xi\omega_n s + \omega_n^2}{s^2 + 2\xi\omega_n s + \omega_n^2} \tag{3.33}
$$

where $\xi$ is the damping factor and $\omega_n$ is the undamped natural frequency. The zero lies at $\omega_z = -\omega_n/2\xi$. From the analogy between Eq. [3.26](#page-67-0) and Eq. [3.33](#page-69-0), it follows that

$$
\omega_n = \sqrt{\rho} \cdot f_R \tag{3.34}
$$

and

$$
\xi = \frac{1}{2} \cdot \frac{\alpha}{\sqrt{\rho}}\tag{3.35}
$$

For a type-I loop, the closed-loop transfer function simplifies to

$$
H_{cl}(s) = N \cdot \frac{\alpha f_R}{s + \frac{\alpha f_R}{M}} \tag{3.36}
$$

and the 3-dB bandwidth of the loop is $f_{BW} = \alpha f_R/(2\pi M)$.

This, compared to Eq. [3.16,](#page-63-0) shows one advantage of the DPLL regarding bandwidth: the loop bandwidth can be decoupled from the factors that also determine the noise performance. Although a wrongly estimated gain of the PD or the DCO (due to either design or PVT variations) would lead to deviations from the expected bandwidth, numerous methods of gain estimation and correction can be implemented digitally.
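The loop parameters can be computed directly from the digital filter coefficients. The sketch below matches the closed loop built from Eq. 3.23, $(\alpha f_R s + \rho f_R^2)/(s^2 + \alpha f_R s + \rho f_R^2)$, against Eq. 3.33 term by term, which gives $\omega_n = \sqrt{\rho} f_R$ and $\xi = \alpha/(2\sqrt{\rho})$. The coefficient values and function names are assumptions for illustration only.

```python
import math


def type2_loop_parameters(alpha, rho, f_ref):
    """Natural frequency (rad/s) and damping factor of the type-II loop,
    obtained by matching omega_n^2 = rho * f_ref^2 and
    2 * xi * omega_n = alpha * f_ref."""
    omega_n = math.sqrt(rho) * f_ref
    xi = 0.5 * alpha / math.sqrt(rho)
    return omega_n, xi


def type1_bandwidth(alpha, f_ref, m=1):
    """3-dB bandwidth (Hz) of the type-I loop as given after Eq. 3.36."""
    return alpha * f_ref / (2.0 * math.pi * m)


# Assumed settings (illustrative): alpha = 2^-5, rho = 2^-12, 40 MHz reference.
ALPHA, RHO, F_REF = 2**-5, 2**-12, 40e6
omega_n, xi = type2_loop_parameters(ALPHA, RHO, F_REF)
f_bw_type1 = type1_bandwidth(ALPHA, F_REF)
print(f"omega_n = {omega_n:.0f} rad/s, xi = {xi:.2f}, "
      f"type-I BW = {f_bw_type1 / 1e3:.1f} kHz")
```

Only digital coefficients and the reference frequency enter these expressions, which reflects the decoupling of bandwidth from analog, PVT-sensitive quantities discussed above.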

### **3.3.4 Brief Summary over Comparisons in S-domain**

Four facts are clear from the discussion above:

- 1. Be it a VCO or a DCO, in either a DPLL or an analog PLL, the contribution of the CO block to the output purity is the same across the different structures. Therefore, the detailed analysis regarding the optimization of the CO design is done in the last part of this chapter, as it is generally independent of the loop structure.
- 2. The noise from an RC filter can be almost ignored compared to other sources, so there is no difference between an RC filter and a DLF in terms of noise contribution to the output. On the other hand, the DLF's power consumption is not dominant when compared to other consumers within a high-purity PLL, and it reduces with technology scaling.
- 3. It is the PD block that distinguishes different PLLs in terms of their power efficiency for high-purity frequency synthesis.
- 4. Regarding the inherent frequency multiplication path required, the counter-based scheme is better than any scheme involving a DSM-MMDIV, as additional noise sources are removed from the loop. Therefore, noise can be traded off for less power consumption. In addition, the choice of loop bandwidth, as well as loop order, is more flexible, since there is no need to filter $\Delta\Sigma$ noise (especially from high-order modulators).

Therefore a benchmark of PD will be derived in the next section for comparing different PLLs.

### **3.4 A Simple Benchmark of PD**

In order to develop a power-efficient, high-spectral-purity PLL, a simple noise-power benchmark FOM that evaluates the PD jitter performance in relation to the consumed power is defined in this section. For simplicity's sake, we focus on the main differences between the previously discussed PLLs, while neglecting the similar parts. Besides, the frequency-lock path is neglected as well, as we are focusing on "Phase Detector" benchmarking here. In other words:

- **1. For a CPPLL, the CP is taken as the focus of the benchmark analysis.** In contrast, the PFD contribution is ignored here, as it contains only limited logic and a few D-type flip-flops (DFFs). This simplification is fair, as a similar number of gates, which are mostly triggered by sharp edges at the reference rate, can be found in other PLLs as well.
- **2. For a SSPLL, the SSPD/CP is taken as the focus of the benchmark analysis.** The pulse generator is ignored here as it contains only a few gates, similar to the PFD.
- **3. For a DPLL, a conventional delay-line-based TDC is taken as the focus.** In a counter-based DPLL, some REF edge-based clock gating logic is commonly adopted to save power, as the TDC only needs to detect the phase error at the reference frequency. Again, these gates are ignored for the same reasons as mentioned above.

Please keep in mind again that the frequency error is assumed to be zero here, as we are focusing on the comparison of different PDs. However, frequency locking does not come for free. In terms of the noise-power trade-off, divider-less PLLs (counter-based DPLLs, SSPLLs, and injection-locking PLLs [\[37\]](#page-179-2)) are generally more efficient than divider-based PLLs (CPPLLs and divider-based DPLLs). The essential reason is that the noisy frequency-locking path is decoupled from the main phase-locking path in the former.

### **3.4.1 IPN-Power Product**

As our motivation is to design a high-purity PLL with high power efficiency, we need to achieve as low an in-band PN floor as possible with minimum power consumption. Now, for simplicity, we can determine the minimum power consumption of a CP as

$$
P_{CP} = I_{CP} \cdot V_{DD} \frac{\tau_{PFD}}{T_R} \tag{3.37}
$$

Besides, for a saturated metal-oxide-semiconductor field-effect transistor (MOSFET), we have the relation

<span id="page-72-2"></span><span id="page-72-0"></span>
$$
g_m = \frac{2I_{drain}}{V_{gs} - V_{th}} \tag{3.38}
$$

Combining Eqn. [3.37,](#page-71-0) Eqn. [3.38](#page-72-0) and Eqn. [3.13,](#page-62-0) we obtain

$$
\mathcal{L}_{CP} \cdot P_{CP} = f_{out}^2 \tau_{PFD}^2 \frac{32\pi^2 k T \gamma V_{DD}}{|V_{gs} - V_{th}|} \tag{3.39}
$$

The quantity on the left side of the equation, the power-IPN product, is our target to reduce. On the right side, $f_{out}$ is the PLL output frequency, which is fixed. Meanwhile, $V_{DD}$, $(V_{gs} - V_{th})$ and $\tau_{PFD}$ are adjustable within some range to obtain a smaller product, while the other parameters are related to a given process node. Regarding the SSPD/CP, we can define the minimum power consumption of a CP to be

<span id="page-72-3"></span><span id="page-72-1"></span>
$$
P_{SSCP} = I_{CP} \cdot V_{DD} \frac{\tau_{on}}{T_R} \tag{3.40}
$$

Combining Eqn. [3.40,](#page-72-1) Eqn. [3.38](#page-72-0) and Eqn. [3.19,](#page-64-0) we have

$$
\mathcal{L}_{SSCP} \cdot P_{SSCP} = \frac{kT\gamma V_{DD} \mid V_{gs} - V_{th} \mid}{2A_{VCO}^2} \tag{3.41}
$$
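The two power-IPN products of Eq. 3.39 and Eq. 3.41 can be compared numerically. The sketch below also rebuilds the SSCP product from Eq. 3.20, Eq. 3.38 and the duty-cycled CP power as a cross-check of the derivation. All parameter values (200 ps PFD on-time, 0.2 V overdrive, 0.4 V amplitude, 100 µA CP current and so on) are assumptions picked for illustration, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def cp_power_ipn_product(f_out, tau_pfd, v_dd, v_ov, temp_k=300.0, gamma=1.0):
    """Eq. 3.39: L_CP * P_CP in W/Hz (linear units)."""
    return (f_out**2 * tau_pfd**2 * 32.0 * math.pi**2
            * K_B * temp_k * gamma * v_dd / abs(v_ov))


def sscp_power_ipn_product(a_vco, v_dd, v_ov, temp_k=300.0, gamma=1.0):
    """Eq. 3.41: L_SSCP * P_SSCP in W/Hz (linear units)."""
    return K_B * temp_k * gamma * v_dd * abs(v_ov) / (2.0 * a_vco**2)


def sscp_product_from_parts(a_vco, i_cp, v_dd, v_ov, duty,
                            temp_k=300.0, gamma=1.0):
    """Rebuild the SSCP product from its ingredients as a cross-check."""
    g_m = 2.0 * i_cp / v_ov                                    # Eq. 3.38
    l_sscp = K_B * temp_k * gamma / (a_vco**2 * g_m * duty)    # Eq. 3.20
    p_sscp = i_cp * v_dd * duty                                # duty-cycled CP power
    return l_sscp * p_sscp


# Assumed example values (illustrative only).
cp_prod = cp_power_ipn_product(f_out=3e9, tau_pfd=200e-12, v_dd=1.0, v_ov=0.2)
ss_prod = sscp_power_ipn_product(a_vco=0.4, v_dd=1.0, v_ov=0.2)
print(f"CP / SSCP product ratio: {10.0 * math.log10(cp_prod / ss_prod):.1f} dB")
```

The SSCP product takes no output frequency as input, whereas the CP product grows with $f_{out}^2$, which is the difference highlighted in the discussion of the two structures.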

Similar to the CPPLL, $V_{DD}$, $(V_{gs} - V_{th})$ as well as $A_{VCO}$ are adjustable within limits here. However, different from the CPPLL, there is no $f_{out}$ term, which decouples the output frequency from the power-IPN product and is advantageous for synthesizing higher output frequencies. Nevertheless, as soon as fractional-N operation is required, be it with a DTC [\[45\]](#page-180-0), phase interpolation or other methods [\[44\]](#page-180-1), $f_{out}$ will always come back into play, since the $T_{frac}$ excursion of up to one $T_{CKV}$ has to be quantized somehow. For circuits involving a time-domain TDC or DTC, this can be analyzed as follows. Assume that, in a conventional delay-line-based TDC, the jitter power contributed by the quantization noise is R times the one contributed by the thermal noise, which can be expressed as

$$
t_{res} = \sqrt{12MR}\sigma_{t,inv} = 2\sqrt{3MR}\sigma_{t,inv} \tag{3.42}
$$

In practice, the $t_{res}$ of a gate-delay-based flash TDC is limited to $t_d$, which is still as high as 8∼10 ps in an advanced complementary metal-oxide-semiconductor (CMOS) node such as 28 nm. To make it lower than the thermal jitter, here we simply assume that a Vernier delay-chain-based structure [60] is adopted, and thus the minimum required number<sup>[d](#page-73-0)</sup> of stages, M, to cover one $T_{out}$ simply scales up when $t_{res}$ scales down, written as

<span id="page-73-2"></span>
$$
M_{(min)} = \lceil \frac{T_{out}}{t_{res}} \rceil \tag{3.43}
$$

and we assume for simplicity that 3M stages in total are required, accounting for the two delay lines as well as the DFFs required in one Vernier TDC. Secondly, we can estimate the minimum power of a flash inverter-line-based TDC covering one $T_{out}$ as

$$
P_{TDC} = 3M \frac{1}{2} f_R C_l \cdot V_{DD}^2 = 3 \frac{T_{out}}{2t_{res}} f_R C_l \cdot V_{DD}^2 = \frac{3T_{out}}{4\sqrt{3MR} \sigma_{t,inv}} f_R C_l \cdot V_{DD}^2 \tag{3.44}
$$

where only dynamic power consumption is considered<sup>[e](#page-73-1)</sup>. Combining Eqn. [3.44,](#page-73-2) Eqn. [3.30](#page-68-0) and Eqn. [3.32,](#page-68-1) we have

$$
\mathcal{L}_{TDC} \cdot P_{TDC} = 48\pi^2 f_{out}^2 \cdot (1+R)M^2 \frac{C_l V_{DD}^2}{2} \sigma_{t,inv}^2 \tag{3.45}
$$

<span id="page-73-3"></span>
$$
\approx 48\pi^2 f_{out}^2 \cdot (1+R)M^2 t_d^2 \cdot f(k,T,\gamma,V_{th}) \tag{3.46}
$$

Here we see three interesting facts about a basic Vernier TDC-based PD:

- 1. The power-IPN product of this PD benefits from technology scaling, due to the scaling of $t_d^2$.
- 2. A finer resolution does not improve the power-IPN product further; however, it does improve the in-band noise floor.
- 3. For a given resolution ($M_{min}$), the smaller R is, the smaller the power-IPN product will be. In other words, the optimum product lies on the low-power side rather than the low-IPN side.

<span id="page-73-1"></span><span id="page-73-0"></span><sup>d</sup>A much larger number is usually adopted to cover potential PVT variations.

<sup>e</sup>Assuming the necessary clock gating is adopted, so that the TDC works at the reference frequency.

In a given CMOS technology node (fixed $t_d$, $V_{th}$, $\gamma$), only the necessary number of stages, M, should be chosen to keep the right side of Eqn. [3.46](#page-73-3) at its minimum. Interestingly, a conventional time-domain-based PD becomes less power efficient as the target spectral purity increases.
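The stage count of Eq. 3.43 and the first form of the power estimate in Eq. 3.44 can be sketched as follows. The 3 GHz output, 40 MHz reference, 2 fF load and 1 V supply are assumed example values, not data from the text, and the function names are hypothetical.

```python
import math


def min_stages(f_out, t_res):
    """Eq. 3.43: M_min = ceil(T_out / t_res), with T_out = 1/f_out."""
    return math.ceil(1.0 / (f_out * t_res))


def vernier_tdc_power(m, f_ref, c_l, v_dd):
    """First form of Eq. 3.44: 3M stages, dynamic power only, clocked at REF."""
    return 3 * m * 0.5 * f_ref * c_l * v_dd**2


# Assumed example: 3 GHz output, 40 MHz reference, 2 fF load, 1 V supply.
m_8ps = min_stages(3e9, 8e-12)
m_1ps = min_stages(3e9, 1e-12)
p_8ps = vernier_tdc_power(m_8ps, 40e6, 2e-15, 1.0)
p_1ps = vernier_tdc_power(m_1ps, 40e6, 2e-15, 1.0)
print(f"8 ps: M = {m_8ps}, P = {p_8ps * 1e6:.2f} uW")
print(f"1 ps: M = {m_1ps}, P = {p_1ps * 1e6:.2f} uW")
```

Under these assumptions the stage count, and with it the dynamic power, grows inversely with $t_{res}$, which illustrates why finer resolution alone does not improve the power-IPN product.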

## **3.4.2 IPN-Power FOM of PD**

Based on the discussion above, we can define here a benchmarking FOM to compare different PDs in terms of their power efficiency in generating an output of the same spectral purity at the same output frequency. Based on Eqn. [3.39,](#page-72-2) Eqn. [3.41](#page-72-3) and Eqn. [3.46,](#page-73-3) it is clear that they can all be rearranged into the form

<span id="page-74-0"></span>
$$
\frac{\mathcal{L}_{PD} \cdot P_{PD}}{f_{out}^2} = X \tag{3.47}
$$

where X is the FOM factor (the smaller, the better) used to compare different PDs. Expressed in dB, the FOM is thus defined as

$$
\text{FOM}_{PD} = 10 \log \mathcal{L}_{in-band} + 10 \log \left[ \left( \frac{1Hz}{f_{out}} \right)^2 \cdot \frac{P}{1mW} \right] \tag{3.48}
$$
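Eq. 3.48 translates directly into a one-line function. The sketch below evaluates it for an assumed PD with a -110 dBc/Hz in-band floor at 3 GHz consuming 1 mW; these numbers and the function name are illustrative assumptions.

```python
import math


def fom_pd_db(l_inband_dbc_hz, f_out_hz, power_w):
    """Eq. 3.48: in-band floor normalized to 1 Hz output and 1 mW power."""
    return l_inband_dbc_hz + 10.0 * math.log10(
        (1.0 / f_out_hz) ** 2 * power_w / 1e-3
    )


# Assumed example: -110 dBc/Hz floor, 3 GHz output, 1 mW and 2 mW.
fom_1mw = fom_pd_db(-110.0, 3e9, 1e-3)
fom_2mw = fom_pd_db(-110.0, 3e9, 2e-3)
print(f"FOM_PD at 1 mW: {fom_1mw:.1f} dB, at 2 mW: {fom_2mw:.1f} dB")
```

Doubling the power at an unchanged noise floor worsens the FOM by about 3 dB, so a PD only keeps its FOM if the extra power buys a correspondingly lower floor.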

#### **Limitation of FOM**<sub>*PD*</sub>

A similar FOM definition for the PLL loop is also adopted in [\[59\]](#page-182-1). However, such a FOM has some intrinsic limitations. In principle, although the impact of the output frequency is already included, **the value of the reference frequency is not fully decoupled**. In fact, a REF-frequency-dependent switching power is always assumed in the discussion above. This assumption applies not only to the digital gates but also to the duty-cycled CPs, which can be found

<span id="page-75-0"></span>

Figure 3.15: Benchmark for different PDs.

in Eqn. [3.37](#page-71-0) and Eqn. [3.40.](#page-72-1) For any detailed structure that includes significant power consumption from non-duty-cycled biasing, it is clear that **the higher the REF frequency, the smaller the FOM**<sub>*PD*</sub> **(better)**.

## <span id="page-75-2"></span>**3.4.3 An Illustrative Comparison**

Based on the discussion above, an illustrative comparison is shown in Fig. [3.15](#page-75-0)<sup>[f](#page-75-1)</sup>. A general case of a 40 MHz input reference and a 3 GHz output is taken, leading to a frequency gain of 75, i.e., 37.5 dB in Eqn. [3.48.](#page-74-0) In Fig. [3.15,](#page-75-0) the output in-band PN floor is plotted against power consumption on a log-log scale. An indication line in black presents the potential in-band noise contribution from a good commercial XO with a PN floor of -160 dBc/Hz at offsets beyond 10 kHz from its 40 MHz carrier. One thing that has to be clarified is

<span id="page-75-1"></span><sup>f</sup>A typical set of parameters is adopted, so the results are more for qualitative information rather than a quantitative conclusion.

that the analysis above, especially regarding the CP and SSPD, is based on an ideal case in which the current sources can be perfectly duty-cycled and therefore only consume power during the limited on-time once locked. This leads to the red line in the far left corner, far below the reference noise contribution, meaning that it is more than sufficient even at a power consumption of 100 µW. However, for CP designs in real life, especially those used in high-performance CPPLLs, duty-cycled current pumps might not be adopted, and additional auxiliary circuits may even be required [\[61\]](#page-182-2)[\[62\]](#page-182-3)[\[63\]](#page-182-4). If the CP is not duty-cycled, the drawback of the PD FOM, namely its dependency on the REF frequency, becomes apparent. This is visually indicated by the 3 dB difference between the solid blue line (CP w/o duty cycle) and the dashed blue line (CP w/o duty cycle but with doubled REF frequency, i.e., 80 MHz here). What is more, as $I_{CP}$ is coupled with the CPPLL's loop bandwidth, a larger passive LF is required to keep a proper bandwidth once $I_{CP}$ is increased for lower IPN. The SSPD-based CP noise is significantly reduced, even much lower than the red CP profile, and is thus not shown in the figure. However, as soon as it is extended into a fractional-N mode, by methods such as DTC assistance [\[45\]](#page-180-0), the FOM gets degraded. Fig. [3.15](#page-75-0) also clearly shows the disadvantage of time-domain quantization compared to the CP, as all of its scenarios sit towards the upper right corner. Among these lines, the flat regions represent the regime dominated by quantization noise, which does not improve with more power consumed, while the non-flat regions are where the thermal jitter contribution dominates. This confirms the previous conclusion that the conventional time-domain-based PD enjoys a better PD FOM in low-power, high-IPN (coarse resolution) applications. Therefore, it does not fit our target, which is the power-efficient generation of a high-purity fractional-N DPLL for RF frequency synthesis.

## **3.4.4 Short Summary So Far**

Based on the discussions above, some basic conclusions can be drawn here as guidance in the search for a better alternative solution.

• 1. Even though both the TDC and the DCO introduce additional quantization noise, a DLF is the best candidate for modern wireless communications. In terms of programmability, it can support an ultra-wide range of loop settings without involving any bulky passive RC components, especially for applications that require an ultra-narrow loop bandwidth. On the other hand, the loop bandwidth of a DPLL can be well controlled (if not using a BBPD [\[52\]](#page-181-0)[\[53\]](#page-181-1)), as neither the PD gain nor the DCO gain appears in the transfer function once correctly normalized in the digital domain [\[7\]](#page-175-0). This results in a purely digitally controlled loop bandwidth, which is independent of any analog factors that might be sensitive to PVT variations. This is not the case in a CPPLL, as already discussed above. In an analog realization, be it CPPLL or SSPLL, noise performance and loop bandwidth are coupled, which is not desired at all.

• 2. The conventional time-domain method will not be considered at all as it does not fit our scenario. Besides, in terms of noise performance, a PD solution that can compete with a conventional CP in terms of PD FOM is good enough considering the input reference noise contribution, as clearly indicated in Fig. [3.15.](#page-75-0)

Therefore, a counter-based multiplication DPLL architecture is chosen based on the above discussions. Before going deeper into the discussion of an alternative solution, fractional spurs resulting from the counter-based DPLL operation will be analyzed next, serving later discussions.

# **3.5 Fractional Spurious Tones in a Counter-Assisted Fractional-N DPLL**

As discussed in Ch. [2,](#page-20-0) spurs are unwanted as they may cause mixing of unwanted blocker signals, violation of the emission mask, and, worse still, deterioration of the IPN due to both in-band spurs and folding of out-of-band ones. These undesired tones can come from both a DPLL's external interference sources and its internal operations. In a modern SoC environment, there are many other noise sources, such as digital baseband processors and clock buffers from other voltage domains, that can interfere with a DPLL's output spectrum. These disturbances can provoke spurs at the DPLL output via coupling over different paths, e.g., the silicon substrate, bondwires and supply lines, through a path with a poor common-mode rejection ratio (CMRR) or power supply rejection ratio (PSRR). Spurs originating from the aforementioned sources are referred to as external spurs here. They can be mitigated by proper supply regulation, careful layout and a better CMRR in key blocks (clock buffers, PD) of a DPLL.

Other spurs, not resulting from those external sources, are all referred to as internal spurs here. Since a DPLL is normally updated at its reference rate ($f_{ref}$), any internal spur with its fundamental tone located at $f_{ref}$ is further classified as a reference spur here. Compared with a CPPLL, the reference spur issue in a DPLL is already relaxed by replacing the leaky passive LF with a DLF and by avoiding charge pump mismatches. However, potentially poor isolation between the output oscillator, the buffer and the input reference path, switching of the DLF, as well as operations of the PD, may still contribute to non-negligible reference spurs. To further relax the reference spur in a DPLL, better isolation of the aforementioned paths and avoiding certain sub-sampling PDs [\[38\]](#page-179-0) would help.

Internal spurs with fundamental tones located at fractions of $f_{ref}$ are classified as fractional spurs here. They originate from the DPLL's fractional-N operation and the associated non-ideal behaviors [\[55\]](#page-181-2)[\[64\]](#page-182-5)[\[65\]](#page-182-6)[\[66\]](#page-182-7)[\[67\]](#page-183-0). The channel raster is becoming ever more deeply fractional w.r.t. the input reference, e.g., a resolution of 5 kHz needs to be achieved in 5G, which means the step of the fractional part of the FCW has to be finer than $2^{-12}$ even at a typical low reference rate such as 26 MHz. This results in numerous in-band fractional spurs that cannot be attenuated by the DLF, which directly contribute to a high IPN in the end and thus violate the corresponding standard. Therefore, a brief analysis is carried out in this section to get a deeper understanding of fractional spurs, with the goal of reducing them further. In the remainder of this section, fractional spurs are discussed in three parts according to their major origins within a counter-assisted DPLL structure: limited resolution, nonlinearity and gain estimation error.

### **3.5.1 Spurs Introduced by Limited Resolution**

For any DPLL, the limited quantization resolution of the PD results not only in a noise floor at the DPLL output (Eq. [\(3.29\)](#page-68-2)), but potentially also in fractional spurs. The word "limited" here refers to two facts: 1) the absolute resolution might not be fine enough, restricted by factors such as the minimal gate delay set by a specific process; 2) the relative resolution is not enough: as the PD has to resolve the fractional phase error that results from the accumulation of the fractional FCW over cycles of the reference, the nominal number of bits of the PD is lower than the number of effective fractional bits of the FCW. Regarding the former fact, a conventional time-domain-based TDC/DTC [\[8\]](#page-176-0) is directly limited by the CMOS process due to the dependence of its resolution on the logic gate delay. For instance, even when an 8 ps inverter delay can be achieved in 28 nm CMOS technology, this only translates to a sub-6-bit TDC for a 2.4 GHz ISM band PLL. Meanwhile, ADC-assisted solutions have been proven to offer much finer resolution in the voltage domain with acceptable power consumption [\[28\]](#page-178-0)[\[29\]](#page-178-1)[\[46\]](#page-180-2)[\[68\]](#page-183-1), greatly relaxing this limitation. However, the fractional part of the FCW is desired to have more than 16 bits for a DPLL to achieve a programmable 1 kHz resolution with a 40 MHz reference input<sup>[g](#page-79-0)</sup>. This example reflects the second fact mentioned above. As a TDC with acceptable power consumption cannot be easily realized with such a fine resolution, the associated fractional spurs are inevitable in deep fractional channels, i.e., with deep fractional bits turned on in a given FCW.
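The two bit counts quoted above can be reproduced with a short calculation: the effective TDC bits over one oscillator period, and the fractional FCW bits needed for a given channel step. This is a minimal sketch with hypothetical function names; the input values are the ones used in the text.

```python
import math


def tdc_effective_bits(f_out, t_res):
    """Bits needed to span one T_ckv = 1/f_out with steps of t_res."""
    return math.log2(1.0 / (f_out * t_res))


def fcw_fractional_bits(f_ref, f_step):
    """Smallest number of fractional FCW bits with f_ref / 2^bits <= f_step."""
    return math.ceil(math.log2(f_ref / f_step))


bits_tdc = tdc_effective_bits(2.4e9, 8e-12)   # 8 ps delay, 2.4 GHz ISM band
bits_fcw = fcw_fractional_bits(40e6, 1e3)     # 1 kHz step, 40 MHz reference
print(f"TDC: {bits_tdc:.2f} bits, FCW fractional part: {bits_fcw} bits")
```

The gap between the two numbers is the relative-resolution problem described as the second fact.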

Even though the DSM-divider-based DPLL faces a similar challenge [\[69\]](#page-183-2) regardless of the DSM's order, only counter-assisted divider-less DPLLs are considered here for simplicity. In such a DPLL, the PD is supposed to distinguish the fractional phase difference ($T_{frac}$) between the REF and its neighboring CKV edge. This phase excursion, $T_{frac}$, ramps up periodically from 0 (aligned) to the full range $T_{DCO}$ (next aligned edge) at an incremental step of $FCW_{frac} \cdot T_{ckv}$ per reference cycle, as depicted in Figure [3.16.](#page-80-0) Therefore, a PD has not only to cover a range of at least one $T_{ckv}$, but also to resolve it with a resolution as fine as possible. As discussed previously, any fine step increment of $T_{frac}$

<span id="page-79-0"></span><sup>g</sup>For mobile communication, a requirement of 20-bit fractional part is not uncommon.

<span id="page-80-0"></span>

Figure 3.16: (a) PE waveform resulted from limited resolution; (b) resulted spur tones illustrated in spectrum.

that cannot be resolved by the PD simply accumulates until the PE reaches one LSB of the PD, as shown in Figure [3.16,](#page-80-0) where $nq$ denotes the PD's effective resolution bits and $frac$ denotes the effective fractional bits of the FCW. Thus, a sawtooth-like pattern with a period of $2^{frac - nq} T_{REF}$ and a peak of $2^{-nq} T_{ckv}$ is formed and is finally fed into the DCO through the DLF. The generation of the corresponding spurs can be explained by recalling Eq. [\(2.17\)](#page-27-0), following the same PM model. Then, what are $\omega_m$ and $\beta$ here? The dominant modulation comes from the fundamental tone of the Fourier series of the sawtooth, written as

$$
x_{sawtooth}(t) = \frac{t_{res}}{2} - \frac{t_{res}}{\pi} \sum_{n=1}^{\infty} \frac{\sin(2\pi nft)}{n} \tag{3.49}
$$

where $t_{res}$ stands for the effective time-domain resolution of the PD and $f = f_{REF}/2^{frac-nq}$ is the repetition frequency of the sawtooth. Therefore, the fundamental PM tone is $\frac{2t_{res}}{T_{ckv}} \cdot \sin\left(2\pi \frac{f_{REF}}{2^{frac-nq}} t\right)$. According to Eq. [\(2.17\)](#page-27-0), this results in a spur of

$$
P_{spur}\left(\frac{f_{REF}}{2^{frac-nq}}\right) = 20\log_{10}\left(\frac{t_{res}}{T_{ckv}}\right) \, \text{dBc} \tag{3.50}
$$

which is depicted in Figure [3.16.](#page-80-0) According to the analysis in Chapter 2, fractional spurs at offsets of multiples of $\frac{f_{REF}}{2^{frac-nq}}$ will also show up.

Fractional spurs within the loop bandwidth are not attenuated by the PLL and directly degrade the output spectral purity, while those outside the loop bandwidth can increase the in-band noise floor via noise folding. Overall, improving the resolution of the PD quantizer is the only way to resolve such an issue.
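The level predicted by Eq. 3.50 and the offset $f_{REF}/2^{frac-nq}$ of the fundamental spur can be sketched as below. The 8 ps resolution, 3 GHz output, 16 fractional FCW bits and 5 effective PD bits are assumed example values, and the function names are hypothetical.

```python
import math


def resolution_spur_dbc(t_res, f_out):
    """Eq. 3.50: 20*log10(t_res / T_ckv), with T_ckv = 1/f_out."""
    return 20.0 * math.log10(t_res * f_out)


def resolution_spur_offset(f_ref, frac_bits, nq_bits):
    """Offset of the fundamental spur: f_REF / 2^(frac - nq), in Hz."""
    return f_ref / 2 ** (frac_bits - nq_bits)


# Assumed example: 8 ps PD resolution at 3 GHz, 16 FCW bits, 5 PD bits.
spur_level = resolution_spur_dbc(8e-12, 3e9)
spur_offset = resolution_spur_offset(40e6, 16, 5)
print(f"spur of {spur_level:.1f} dBc at {spur_offset / 1e3:.2f} kHz offset")
```

In this example the spur lands at an offset of about 20 kHz, which would sit inside most practical loop bandwidths and therefore would not be filtered.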

## **3.5.2 Spurs Introduced by Non-linearity**

Unfortunately, the worst-case fractional spurs for most channels are not caused by limited resolution, but usually by the non-linearity of the PD. As in any D/A converter, the nonlinear conversion characteristic produces distortion tones that ultimately appear as spurs in the output spectrum. In general, evaluating the fractional spurs due to TDC nonlinearity is not straightforward, as the nonlinearity pattern cannot be well predicted. However, for simplicity, a TDC with a typical nonlinearity pattern and infinite resolution is assumed here as an example, and its waveforms are shown in Figure [3.17.](#page-82-0) Even with an infinite number of bits, the PD cannot completely resolve the accumulated fractional $T_{frac}$, as a nonzero residual error due to nonlinearity is still present, which modulates the DCO and results in fractional spurs. With the assumed half-sinusoidal nonlinearity pattern, a residual timing error with peak $t_{INL}$ and period $2^{frac} T_{REF}$ can be represented as a Fourier series

$$
| t_{INL} \sin(2\pi \frac{f_{REF}}{2^{frac+1}}t)| = \frac{2t_{INL}}{\pi} - \frac{4t_{INL}}{\pi} \sum_{n=1}^{\infty} \frac{\cos(2\pi n \cdot 2^{-frac}f_{REF}t)}{4n^2 - 1} \tag{3.51}
$$

and the fundamental tone can be written as  $\frac{4t_{INL}}{3\pi} \cos(2\pi \cdot \frac{f_{REF}}{2^{frac}}t)$. According to Eq.  $(2.17)$ , this results in a spur of

<span id="page-81-0"></span>
$$
P_{spur,INL}(\frac{f_{REF}}{2^{frac}}) = 20 \log_{10}(\frac{4t_{INL}}{3T_{ckv}}) dBc \qquad (3.52)
$$

For instance, an effective INL as large as 1ps in the above case would already result in a fractional spur as large as -48dBc at a 3GHz output. Overall, the fractional spur issue caused by non-linearity can

<span id="page-82-0"></span>

Figure 3.17: (a) PE waveform resulting from a typical INL pattern; (b) resulting spur tones illustrated in the spectrum.

be very complex in practice, but Eq. [3.52](#page-81-0) can still serve as a sufficient estimate. Considering its prevailing impact, it is no wonder that quite a few attempts to suppress the nonlinear imperfections within the PD have been reported over the past decade [\[27\]](#page-178-2)[\[55\]](#page-181-2). Notable methods include dithering at the input of or within the TDC (PD) [\[70\]](#page-183-3), which sacrifices noise floor for a lower spur level; noise shaping of the PD, e.g., the gated-ring-oscillator-based TDC [\[71\]](#page-183-4)[\[72\]](#page-183-5), which brings an additional bandwidth-spur trade-off; and feedforward cancellation in the digital domain [\[64\]](#page-182-5). Even though a better calibration algorithm might help, at the cost of additional power and hardware, a PD scheme with intrinsically better linearity is of course the best solution.
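The INL spur estimate of Eq. 3.52 rests on the fundamental Fourier coefficient of a rectified sine. The sketch below checks that coefficient numerically and then evaluates the spur level for the 1 ps example quoted above. The helper names and the simple midpoint integration are illustrative choices, not part of the original analysis.

```python
import math


def abs_sine_cos_coeff(n: int, steps: int = 100_000) -> float:
    """Numerically estimate the cos(2*n*x) Fourier coefficient of |sin(x)|.

    |sin(x)| has period pi, so the coefficient is (2/pi) times the integral
    of |sin(x)| * cos(2*n*x) over [0, pi], evaluated with the midpoint rule.
    """
    dx = math.pi / steps
    acc = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        acc += abs(math.sin(x)) * math.cos(2 * n * x) * dx
    return 2.0 / math.pi * acc


def inl_spur_dbc(t_inl: float, t_ckv: float) -> float:
    """Fractional spur level in dBc for a half-sinusoidal INL of peak t_inl (Eq. 3.52)."""
    return 20.0 * math.log10(4.0 * t_inl / (3.0 * t_ckv))


if __name__ == "__main__":
    # The series predicts -4 / (pi * (4 n^2 - 1)) per unit peak amplitude.
    for n in (1, 2, 3):
        predicted = -4.0 / (math.pi * (4 * n * n - 1))
        print(n, round(abs_sine_cos_coeff(n), 6), round(predicted, 6))
    print(f"1 ps INL at 3 GHz: {inl_spur_dbc(1e-12, 1.0 / 3e9):.1f} dBc")
```

For n = 1 the magnitude is $4/(3\pi)$ per unit $t_{INL}$, which is the amplitude used for the fundamental tone in the text.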

## **3.5.3 Spurs Introduced by the PD Gain Error**

In analogy to an A/D converter, gain error is also present in the PD of a DPLL, and it usually comes from three sources: 1) the PVT-sensitive transfer function of the PD, which might deviate from the expected trace due to PVT variations; 2) mismatch between coarse and fine segments, which occurs in any PD implementation with an effective coarse-fine arrangement; 3) wrong estimation of the effective LSB resolution of the PD. Even though the

<span id="page-83-0"></span>

Figure 3.18: (a) PE waveform resulting from gain error; (b) resulting spur tones illustrated in the spectrum.

last source is generally easy to tackle, the phase modulation and spur issues introduced by gain error are not negligible. A simplified example is shown in Figure [3.18,](#page-83-0) where a PD with infinite resolution as well as perfect linearity is assumed. However, due to a wrongly estimated gain, a residual error is generated, resulting in a sawtooth-like error tuning the DCO as long as the error source is slow enough compared to the loop response. In the illustrated example, the sawtooth has a longer period than the spur introduced by the resolution limitation, as large as  $2^{frac}T_{REF}$, with a peak amplitude of  $t_{error}$  that depends on the corresponding case of gain error. The resulting spur level is

$$
P_{spur}\left(\frac{f_{REF}}{2^{frac}}\right) = 20\log_{10}\left(\frac{t_{error}}{T_{ckv}}\right) dBc \tag{3.53}
$$

Although the PD in a DPLL can be realized in many different forms, e.g., a simple TDC [\[7\]](#page-175-0), a DTC-assisted coarse-fine TDC [\[56\]](#page-181-3), ADC-based sampling [\[28\]](#page-178-0), or even simply a conventional PFD-CP followed by an ADC [\[68\]](#page-183-1), the analysis derived above is still generally valid. Therefore, to draw a conclusion, a PD that digitizes the input PE

<span id="page-84-0"></span>

Figure 3.19: Simulated resolution of a flash TDC without optimization in 28 nm.

within a DPLL with not only fine-enough resolution but also good linearity, together with correct gain calibration, is essential to achieve acceptably small fractional spurs.

# **3.6 An Alternative Path to Time-Domain**

## **3.6.1 T-domain vs. Conventional Analog Domains**

Conventional inverter-based quantization of the PE signal (P/D conversion) in a DPLL can be simple and low power, and is expected to benefit significantly from technology scaling, as it depends on the propagation delay of a CMOS inverter. Meanwhile, for other analog domains, e.g., charge-domain, current-domain and voltage-domain, the design headroom is a concern because supplies shrink as technology scales down. However, the supply voltage and the dimension-determined inverter delay do not scale by the same factor, which is known as *generalized scaling*[\[73\]](#page-183-6). For instance, a typical inverter delay scales down from about 30ps at the 130nm node to 8ps at the 28nm node, by a factor of almost 4, while the supply shrinks only from 1.2V to 1V, by a factor of 1.2, as depicted in Fig. [1.3.](#page-17-0) This means, at least, that other analog domains are still attractive for the P/D conversion.

#### **Quantization Noise**

Even a TDC with a resolution of 8ps (in 28nm) amounts to only about 5 bits in a 3GHz PLL, leading to a -102dBc/Hz in-band noise floor and large fractional spurs as high as -33 dBc in deep fractional channels. This -102dBc/Hz noise floor is unacceptable for high-purity applications, as an in-band phase noise floor of -122dBc/Hz at the RF output can be achieved from a commercial 40 MHz XO. If the PE can be resolved in other analog domains with 10b resolution, then the in-band noise floor contributed by the quantization is only -132dBc/Hz, which is 10dB lower than the reference contribution. In addition, a 10b quantization can be achieved with quite affordable power. This implies that methods in other analog domains, rather than the conventional time-domain, can be leveraged to reduce quantization noise substantially.
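The noise-floor figures above can be approximated with the widely used uniform-quantization estimate for a time quantizer, $\mathcal{L} = \frac{(2\pi)^2}{12}\left(\frac{t_{res}}{T_{ckv}}\right)^2\frac{1}{f_{REF}}$. Using this expression here is an assumption (it is not restated in this section), and the 40 MHz reference is taken from the XO mentioned in the text. Under these assumptions the sketch lands within roughly 1 dB of the quoted values.

```python
import math


def quantization_noise_dbc_hz(t_res: float, f_ckv: float, f_ref: float) -> float:
    """In-band phase noise (dBc/Hz) of a uniform time quantizer with step t_res.

    Assumes white quantization noise of variance t_res^2 / 12 spread over
    the reference frequency, referred to the output phase of a carrier at f_ckv.
    """
    t_ckv = 1.0 / f_ckv
    linear = (2.0 * math.pi) ** 2 / 12.0 * (t_res / t_ckv) ** 2 / f_ref
    return 10.0 * math.log10(linear)


if __name__ == "__main__":
    f_ckv, f_ref = 3e9, 40e6
    gate_delay = quantization_noise_dbc_hz(8e-12, f_ckv, f_ref)
    ten_bit = quantization_noise_dbc_hz(1.0 / f_ckv / 2 ** 10, f_ckv, f_ref)
    print(f"8 ps TDC : {gate_delay:.1f} dBc/Hz")
    print(f"10 bit PD: {ten_bit:.1f} dBc/Hz")
```

The gap of about 28 dB between the two cases follows directly from the ratio of the two quantization steps.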

To break the process-set resolution limitation in the time-domain, a number of techniques have been proposed, e.g., improved Vernier delay-line TDCs [\[74\]](#page-183-7)[\[75\]](#page-183-8), ring-oscillator-based ones [\[72\]](#page-183-5)[\[76\]](#page-184-0) or time-amplifier-based ones [\[77\]](#page-184-1). However, fine-resolution TDCs usually suffer from PVT variations that result in ambiguity over their full range, as shown in Fig. [3.19.](#page-84-0) Worse still, although the resolution can be improved by these emerging techniques, the intrinsically poor matching of unit delays as well as the thermal jitter (Eqn. [3.30](#page-68-0)) contributed by each delay element can only be reduced with larger sizing or high-order noise-shaping[\[71\]](#page-183-4), significantly increasing power consumption.

The reduction of quantization noise requires more bits of resolution, which has proven to be expensive in terms of power and linearity in the time-domain. Due to this considerable cost, DPLLs with a time-domain-based PD are generally limited by quantization noise (a state-of-the-art resolution of just over one picosecond amounts to only 8 bits for a 3GHz output). Thus, in such a DPLL, the thermal noise floor is far from being reached. In contrast, increasing the resolution in other analog domains, such as voltage-domain and charge-domain, simply requires more passive components. Thus, reducing quantization noise is more power efficient in conventional analog domains.

#### **Thermal Noise**

The reduction of thermal noise usually requires more power consumption, which sets the theoretical minimum noise-power product of a PD, as seen in Fig. [3.15.](#page-75-0) For CMOS inverter-based delay-line structures, the conversion from thermal noise into timing jitter roughly follows Eq. [3.30.](#page-68-0) Even though the thermal jitter is reduced with technology scaling, as  $t_d^2$  shrinks slightly more than  $C_l \cdot V_{DD}^2$  does, the reduction is neither substantial nor flexible. Worse still, more cells are required for higher resolution, which adds more thermal jitter. On the other hand, the conversion from thermal noise into timing jitter in other analog domains is flexible and can be chosen almost freely. For instance, some state-of-the-art designs leverage ADCs for low quantization noise [\[28\]](#page-178-0), and thus a slope generator is required to convert  $\Delta t$  into  $\Delta V$. As long as the slew rate  $\Delta V/\Delta t$  is high, the conversion of thermal noise into timing jitter is relaxed.

#### **Other Advantages of Conventional Analog Domains**

In addition, conventional analog domains have some attractive features for achieving a more power-efficient high-resolution quantization of the PE signal.

- **More freedom for power-efficient structures.** In a conventional analog domain, there are many different architectures for A/D conversion other than a flash converter. Other types, such as successive approximation register (SAR) or  $\Delta\Sigma$  converters, usually trade off conversion time for either lower power or better ENOB. However, a similar degree of freedom does not exist in the time-domain. This is partially because it is generally inconvenient to restore timing information in the time-domain, and thus trading off conversion time for lower power is not easy. This is also why most time-domain TDCs are realized as flash converters. Meanwhile, once timing error information gets converted into the analog domains, power-efficient quantization can be realized much more easily.

- **Better-defined reference.** Precise voltage references can be generated easily. Based on this, both an accurate dynamic range and the LSB size can be well defined in domains such as voltage or charge. However, the time-domain does not enjoy this type of reference at all, as it usually relies on a PVT-sensitive gate delay, let alone the full dynamic range with variations and mismatches accumulated from the LSB cells.
- **Easier realization with high CMRR/PSRR.** In analog domains, countless fully differential structures can be found to enhance a block's CMRR, which is highly appreciated for an RF SoC that supports advanced mobile applications. On such an SoC, intensive digital aggressors may contribute to modulations either over the supply line or via the substrate, leading to undesired spurs and degradation of IPN. A conventional inverter-based delay line operates as a digital circuit and thus has a poor CMRR; it is also self-interfering during operation due to its periodic peak current. Even though large decoupling caps can be employed to reduce high-frequency interference/noise, at the cost of a large area, attenuating undesired low-frequency interference with decoupling caps is too expensive to afford.

## **3.6.2 Towards a Power-Efficient High-Purity Solution**

To summarize so far, the discussions above have almost led to a solution that supports our goals.

- A counter-based frequency multiplication solution is chosen, so as to remove the redundant noise-power trade-off introduced by the DSM-MMDIV and the concerns about loop bandwidth and order.
- Instead of the conventional gate-delay-based time-domain method, it is more power efficient to have a high-ENOB PD realized in conventional analog domains.

<span id="page-88-0"></span>

Figure 3.20: Basic CP links analog domains together.

- A DPLL structure is chosen due to the advantages brought by the DLF, as discussed previously.

Now the only obstacle left is how to shift the PE signal into the conventional analog domain to complete the solution. Recall that the CP is actually a phase-to-current converter, while an RC filter is adopted for current-to-voltage conversion. This implies that a CP-based ramp generator is a good converter that links quantities in the *time-domain* (t), *charge-domain* (Q), *current-domain* (I) and *voltage domain* (V) together, following the basic relation below

<span id="page-88-1"></span>
$$
\Delta Q = I \cdot \Delta t = C \cdot \Delta V \tag{3.54}
$$

This implies many possible realizations and is conceptually illustrated in Fig. [3.20.](#page-88-0) For instance, if the dis-/charging current I and the dis-/charging capacitor C are kept constant, forming a constant-slope generator, then the information from the time-domain is transferred linearly into the voltage-domain, where it can be further resolved by a power-efficient A/D conversion, such as a SAR ADC. Based on Eq. [3.54](#page-88-1) and Fig. [3.20,](#page-88-0) two more points can be added for the sought solution:

<span id="page-89-0"></span>

Figure 3.21: Conceptual diagram of the proposed fully differential analog-domain DPLL.

- As a large dynamic range of PE up to one  $T_{CKV}$  only needs to be covered in fractional channels, with a known pattern that repeats periodically based on  $\Sigma FCW_{frac}$, a DAC can be inserted to reduce the power consumption of the ADC. This saves the unnecessary cost of comparators as well as  $A/D$  logic. In addition, similar to the phase-prediction-based DTC in a DPLL design [\[56\]](#page-181-3), the quantization realized by the DAC+ADC is equivalent to a coarse-fine digitization and thus achieves a high resolution in a power-efficient way. However, the drawback is the potential for fractional spurs introduced by gain error, resulting from coarse-fine gain mismatch [\[67\]](#page-183-0).
- A differential structure can be leveraged not only to improve the CMRR of the PD block, but also to increase the dynamic range of the analog domain, and thus to reduce the conversion of thermal noise into timing jitter.

The proposed DPLL is conceptually shown in Fig. [3.21.](#page-89-0) In addition to the fact that a high-resolution quantization can be achieved power-efficiently in the analog-domain, the proposed solution enjoys one additional advantage compared to other state-of-the-art designs [\[28\]](#page-178-0): the unpolluted REF and clock-gated CKV edges are directly used for voltage sampling. In other designs, however, the  $\Delta\Sigma$-modulated dividers or the DTC used to compensate the large fractional PE  $T_{frac}$  will pollute either CKV, with noise from the divider chain and  $\Delta\Sigma$-modulator, or REF, due to DTC modulation, impacting its noise and leading to high PVT sensitivity and susceptibility to external interference in an RF SoC.

Here, the proposed DPLL addresses  $T_{frac}$  by adding the accumulated fractional FCW pattern in the analog-domain in front of the ADC, so that the excursion produced by  $T_{frac}$  through the dv/dt conversion is canceled by the DAC output, making the ADC simple and very low power. With a differential implementation, the available dynamic range is further doubled, alleviating the concern about voltage headroom in scaled technology nodes. Besides, to obtain a linear  $K_{PD}$, a constant dv/dt generator based on a CP is adopted here. Thus the slew rate is

$$
\Delta V / \Delta t = \frac{2I}{C} \tag{3.55}
$$

assuming the charge pump current of I for comparison. Different from a CPPLL, as a sampling scheme is adopted, the mismatch between the CP current sources will lead only to a static gain error rather than any reference spurs. Based on Eq. [3.54,](#page-88-1) the DAC can be either implemented in the voltage-domain and added before the ramp generation, or added in the charge-domain after the sampling triggered by CKV. Detailed implementations are discussed and compared in Chapter [4.](#page-111-0)
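To make the time-to-voltage conversion of Eq. 3.54 and Eq. 3.55 concrete, the minimal sketch below computes the differential ramp slope, the voltage excursion produced by a given time interval, and the timing jitter equivalent to a given voltage noise. The 1 mA current, 1 pF capacitance and 300 µV noise are hypothetical values picked only for illustration, as are the function names.

```python
import math


def ramp_slew_rate(i_cp: float, c_ramp: float) -> float:
    """Differential ramp slope dV/dt = 2*I/C in V/s (Eq. 3.55)."""
    return 2.0 * i_cp / c_ramp


def time_to_voltage(delta_t: float, i_cp: float, c_ramp: float) -> float:
    """Voltage excursion produced by a time interval delta_t on the ramp."""
    return ramp_slew_rate(i_cp, c_ramp) * delta_t


def voltage_noise_to_jitter(v_n_rms: float, i_cp: float, c_ramp: float) -> float:
    """RMS timing jitter equivalent to an rms voltage noise v_n_rms on the ramp."""
    return v_n_rms / ramp_slew_rate(i_cp, c_ramp)


if __name__ == "__main__":
    i_cp, c_ramp = 1e-3, 1e-12  # hypothetical: 1 mA into 1 pF
    t_ckv = 1.0 / 3e9
    print(f"slope     : {ramp_slew_rate(i_cp, c_ramp) / 1e9:.2f} V/ns")
    print(f"one T_CKV : {time_to_voltage(t_ckv, i_cp, c_ramp):.3f} V")
    print(f"300 uV rms: {voltage_noise_to_jitter(300e-6, i_cp, c_ramp) * 1e15:.0f} fs")
```

A steeper ramp maps the same voltage noise to less jitter, but it also consumes more of the available voltage range per CKV period, which is the headroom concern that the DAC subtraction and the differential structure are meant to relieve.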

### **3.6.3 Noise Analysis**

Regarding the proposed DPLL, the in-band noise floor can also be obtained in a similar way by normalizing the noise sources first back to the right input of the PD.

#### **Noise from the Ramp Generator**

Assuming the dis-/charging NMOS and PMOS current sources have the same current  $I_{SR}$  and the same  $g_m$  for simplicity, the current noise  $\phi_{n,SR}$  contributed by the slope-generating current sources has an equivalent average density of

<span id="page-91-0"></span>

Figure 3.22: A representative case of the  $t_{on}$  time for  $I_R$  noise contribution with only fractional bit  $N_q$  on.

<span id="page-91-1"></span>
$$
S_{iSR,n} = 4kT\gamma \cdot 2g_m \cdot \frac{t_{on}}{T_{ref}} \tag{3.56}
$$

In the locked state of the proposed DPLL, the ramp is turned on for a short period of  $t_{on}$, during which a current of  $2I_R$  is integrated onto the capacitors of the ADC.  $I_R$  and  $I_{SR}$  are both used in this thesis to refer to the same current quantity, i.e., the dis-/charging NMOS and PMOS currents. The white current noise  $i_{R,n}(t)$  is thus injected into the loop through a gating window  $g(t)$  [\[78\]](#page-184-2). This window function is ideally zero in an integer-N channel. In a fractional-N channel, the window width follows a periodic pattern, which is determined by the deepest turned-on fractional bit in the FCW. For simplicity, we assume that only the fractional bit  $N_q$  is turned on. Thus the window width is a sequence that increments by  $\frac{T_{CKV}}{2^{N_q}}$  every REF cycle and repeats every  $2^{N_q}$  reference cycles, as illustrated in Fig. [3.22.](#page-91-0) Within one period  ($2^{N_q}$  REF cycles), the window width at the i-th cycle equals  $\frac{i}{2^{N_q}} \cdot T_{CKV}$. The average value of  $t_{on}$  is ideally

$$
\bar{t}_{on} = \begin{cases}
0, & \text{integer-N} \\
\frac{1}{2^{frac}}\sum_{i=0}^{2^{frac}-1} i \cdot \frac{T_{ckv}}{2^{frac}} = \frac{T_{ckv}}{2} - \frac{T_{ckv}}{2^{frac+1}}, & \text{fractional-N}
\end{cases} \tag{3.57}
$$

where again,  $frac$  in  $2^{frac}$  is the deepest turned-on fractional bit in the FCW (the same as the abovementioned  $N_q$). This indicates that in a



Figure 3.23: S-domain transfer function of the proposed DPLL.

deep-fractional-N channel, the average on-time is almost half of the CKV period. The feedback gain here is

$$
K_{I_R} = \frac{\hat{i}_e}{\hat{\phi}_e} = \frac{2I_R t_e f_{REF}}{2\pi f_{CKV} t_e} = \frac{I_R}{\pi N} \tag{3.58}
$$

Being multiplied by a gain of  $\left(\frac{\pi N}{I_R}\right)^2$ , Eq. [3.56](#page-91-1) can be normalized to the phase noise power at the input of the PD.
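The average on-time of Eq. 3.57 and the feedback gain of Eq. 3.58 can be checked numerically. The sketch below averages the window widths $i \cdot T_{ckv}/2^{frac}$ over one repetition period and compares the result with the closed form; the function names and the example values (3 GHz output, 40 MHz reference, 1 mA ramp current) are assumptions for illustration.

```python
import math


def average_on_time(t_ckv: float, frac: int) -> float:
    """Average ramp on-time over one period of 2^frac reference cycles."""
    n_cycles = 2 ** frac
    widths = (i * t_ckv / n_cycles for i in range(n_cycles))
    return sum(widths) / n_cycles


def average_on_time_closed_form(t_ckv: float, frac: int) -> float:
    """Closed form of Eq. 3.57 for a fractional-N channel."""
    return t_ckv / 2.0 - t_ckv / 2 ** (frac + 1)


def ramp_feedback_gain(i_r: float, n_div: float) -> float:
    """Phase-to-current gain K_IR = I_R / (pi * N) in A/rad (Eq. 3.58)."""
    return i_r / (math.pi * n_div)


if __name__ == "__main__":
    t_ckv = 1.0 / 3e9
    for frac in (1, 4, 10):
        avg = average_on_time(t_ckv, frac)
        ref = average_on_time_closed_form(t_ckv, frac)
        print(f"frac = {frac:2d}: {avg * 1e12:.2f} ps (closed form {ref * 1e12:.2f} ps)")
    print(f"K_IR = {ramp_feedback_gain(1e-3, 3e9 / 40e6):.3e} A/rad")
```

As `frac` grows, the average approaches $T_{ckv}/2$, in line with the statement that the on-time in a deep fractional channel is almost half of the CKV period.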

#### **Other thermal noise contributions**

The periodical sampling operation contributes the  $kT/C$  noise. In the implemented structure, it is dominated by

$$
v_{n,kTC}^2 = 2\frac{kT}{C_{SAR}}\tag{3.59}
$$

where the factor of 2 accounts for the differential contributions. This noise is transferred to the output, after division by the  $dv/d\phi$ ($dv/dt$) conversion gain  $K_{SR}$, which is

$$
K_{SR} = \frac{\Delta v}{\Delta \phi_{CKV}} = \frac{2I_R}{2\pi f_{CKV} C_{SAR}}\tag{3.60}
$$

to be

$$
\mathcal{L}_{IB,kTC} = 0.5 \frac{v_{n,kTC}^2}{K_{SR}^2 \cdot 0.5 f_{REF}} = 2\pi^2 \frac{f_{CKV}^2}{f_{REF}} \frac{kTC_{SAR}}{I_R^2} \tag{3.61}
$$

Unlike the intuitive impression that a large capacitance is required for a minimized jitter, the primary consideration in deciding the capacitance size is rather about matching property instead of noise.

Therefore, there is no  $kT/C$-restricted power-noise trade-off in this architecture. Instead, the smaller the  $C_{SAR}$, the lower the switching power dissipation and the smaller the  $I_R$  current required to sustain a specific dv/dt conversion gain. The downside is the reduced matching of the capacitor-DAC (CDAC) of the ADC with small unit capacitors, where a trade-off has to be made. Further phase noise contributions from the voltage noise sources of the comparator and the DAC (marked as  $v_{TN}$) and from quantization noise (marked as  $v_{\alpha}$) can be derived in the same way as for the kT/C noise. Further elaboration on these noise sources is given in Ch. 4.
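The chain from Eq. 3.59 to Eq. 3.61 can be evaluated step by step and compared with the closed form. In the sketch below, the capacitance, ramp current and temperature are hypothetical values chosen only to exercise the formulas; the function names are likewise illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def ktc_noise_stepwise(c_sar, i_r, f_ckv, f_ref, temp=300.0):
    """In-band kT/C phase noise (dBc/Hz) built from Eq. 3.59 and Eq. 3.60."""
    v_n_sq = 2.0 * K_B * temp / c_sar                   # Eq. 3.59, differential
    k_sr = 2.0 * i_r / (2.0 * math.pi * f_ckv * c_sar)  # Eq. 3.60, V/rad
    return 10.0 * math.log10(0.5 * v_n_sq / (k_sr ** 2 * 0.5 * f_ref))


def ktc_noise_closed_form(c_sar, i_r, f_ckv, f_ref, temp=300.0):
    """Closed form on the right-hand side of Eq. 3.61, in dBc/Hz."""
    linear = 2.0 * math.pi ** 2 * f_ckv ** 2 / f_ref * K_B * temp * c_sar / i_r ** 2
    return 10.0 * math.log10(linear)


if __name__ == "__main__":
    f_ckv, f_ref, i_r = 3e9, 40e6, 1e-3  # hypothetical operating point
    for c_sar in (2e-12, 1e-12, 0.5e-12):
        a = ktc_noise_stepwise(c_sar, i_r, f_ckv, f_ref)
        b = ktc_noise_closed_form(c_sar, i_r, f_ckv, f_ref)
        print(f"C_SAR = {c_sar * 1e12:.1f} pF: {a:.2f} dBc/Hz (closed form {b:.2f})")
```

At a fixed ramp current, halving $C_{SAR}$ lowers this contribution by about 3 dB, which is the counter-intuitive trend noted in the text.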

#### **Overall In-band phase noise due to thermal sources**

Above all, we have the total equivalent phase noise at the input of the PD (in fractional-N channel) as

<span id="page-93-0"></span>
$$
\mathcal{L}(\Delta f) = \frac{\pi^2 f_{out}^2}{f_{ref}} \left( \frac{2kT\gamma g_m}{I_{SR}^2} T_{out} + \frac{2kTC}{I_{SR}^2} + \frac{v_{TN}^2 C^2}{I_{SR}^2} + \frac{1}{12} \left( \frac{T_{out}}{2^N} \right)^2 \right) \tag{3.62}
$$

Some interesting facts can be observed regarding the comparison between Eq. [3.62](#page-93-0) and Eq. [3.13,](#page-62-0) as:

- Different from the CP used in a single-ended manner in a CPPLL, the CP here is leveraged to form a differential ramp, and thus the feedback gain is  $2I$  instead of  $I$, while the current noise sources are essentially the same. This offers an additional 6dB reduction of the in-band noise floor.
- The second and third terms are the  $kT/C$  noise resulting from sampling and the effective thermal voltage noise contributed by the data converters involved. They show that the larger the C used for ramp generation, the larger the final timing jitter. Therefore, a small value of C is preferred for the sake of power saving, as long as the matching condition is acceptable.
- Compared to the minimum CPPLL phase noise floor in one integer-N channel, the noise contributed from CP current sources in a fractional-N channel has an on-time of  $T_{ckv}/2$  instead of  $\tau_{on}$

<span id="page-94-1"></span>

Figure 3.24: Benchmark of PD with proposed scheme added.

(zero in integer channels), which is rather small, especially for high frequencies (e.g., 167ps for 3GHz). However, the necessary on-time for avoiding deadzone in CPPLL is usually 0.5ns to 1ns.
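To get a feeling for the relative weight of the four terms in Eq. 3.62, the sketch below evaluates each term separately. All parameter values (1 mA ramp current, 10 mS transconductance, $\gamma = 2/3$, 1 pF, 300 µV converter noise, 10 bits, 300 K) are hypothetical and serve only as an illustration; they are not design values taken from this thesis.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def inband_noise_terms(f_out, f_ref, i_sr, g_m, c, v_tn, n_bits,
                       gamma=2.0 / 3.0, temp=300.0):
    """Return the four bracketed terms of Eq. 3.62 (in s^2) and the total in dBc/Hz."""
    t_out = 1.0 / f_out
    terms = {
        "ramp current noise": 2.0 * K_B * temp * gamma * g_m / i_sr ** 2 * t_out,
        "kT/C sampling": 2.0 * K_B * temp * c / i_sr ** 2,
        "converter thermal": v_tn ** 2 * c ** 2 / i_sr ** 2,
        "quantization": (t_out / 2 ** n_bits) ** 2 / 12.0,
    }
    prefactor = math.pi ** 2 * f_out ** 2 / f_ref
    total_db = 10.0 * math.log10(prefactor * sum(terms.values()))
    return terms, total_db


if __name__ == "__main__":
    terms, total_db = inband_noise_terms(
        f_out=3e9, f_ref=40e6, i_sr=1e-3, g_m=10e-3,
        c=1e-12, v_tn=300e-6, n_bits=10)
    for name, value in terms.items():
        share = 100.0 * value / sum(terms.values())
        print(f"{name:20s}: {share:5.1f} % of the bracket")
    print(f"total: {total_db:.1f} dBc/Hz")
```

With these assumed numbers the converter thermal noise dominates, which illustrates why a smaller C (for a given current) and a low-noise converter matter more than further quantizer bits at this operating point.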

To visualize the comparison, we can put the proposed architecture into the same scenario as discussed in section [3.4.3.](#page-75-2) To make the comparison practical and simple, two additional assumptions are made. The first one is that the product of the effective thermal voltage noise  ($v_n$)  resulting from the data converters and the power consumed ($P_{conv}$) is constant at a specific sampling frequency [h](#page-94-0)

<span id="page-94-0"></span><sup>h</sup> E.g., the PSD of the noise voltage of a MOSFET is of the form  $\frac{8kT}{3g_m}$; multiplied by its consumed static power  $I \cdot V_{DD}$, this is a constant when biased at a specific  $\frac{g_m}{I_D}$. Likewise, the  $\frac{kT}{C}$  noise multiplied by its consumed switching power  $f \cdot CV^2$  is a constant as well, given a specific dynamic range.

$$
v_n^2 \cdot P_{conv} = \text{constant} = (300\,\mu\text{V})^2 \cdot 0.1\,\text{mW} \tag{3.63}
$$

The second assumption is that the dynamic range is fixed at 0.6V, and thus the capacitance C used in the model scales proportionally to the charge pump current.

The result is shown in Fig. [3.24,](#page-94-1) from which we can draw the following conclusions:

- The proposed structure has the best PD FOM for a fractional-N PLL realization. (Actually, it only loses to the ideal duty-cycled CP, whose FOM does not include the  $\Delta\Sigma$-modulated MMDIV, which brings additional contributions of noise and power.)
- With the assumed thermal noise model and setup, the difference resulting from quantization noise is almost negligible for converters with a resolution of more than 11 bits within a power budget of 10mW. Within an affordable power budget (1 mW), the total noise reduction from 9 bits to 10 bits is significant.

Further implementation details, discussions and analysis can be found in Chapter [4.](#page-111-0)

# **3.7 Controlled Oscillators**

While the PD closes the loop and dictates most of the loop characteristics, the CO is the soul of a PLL, as it generates the output RF carrier. Be it voltage-controlled (VCO) or digitally controlled (DCO), it is always challenging to achieve high spectral purity at relatively low power consumption, especially for GSM TX, where the spot phase noise must be less than -162dBc/Hz at 20MHz offset from 915 MHz, as discussed in Sec. [2.2.1.](#page-30-0) Beyond its noise contribution, the RF oscillator also occupies a substantial portion of the total synthesizer's power budget, which usually accounts for more than 30% of cellular receiver power consumption, as shown in Fig. [1.2.](#page-16-0) As an RF high-purity synthesizer is required, we restrict our choice to LC-tank-based oscillators here, due to their effectively higher Q in the RF range (no XOs are available) compared to other typical structures, such as ring-oscillator and active-inductor-based ones. In addition, to search for a power-efficient and straightforward structure, a general review of typical structures is given below. A more in-depth analysis of phase noise optimization, as well as the detailed implementation, follows later.

#### **Noise-Power Benchmark of CO**

To make the comparison simple, a basic phase noise formula based on Leeson's LTI model [\[79\]](#page-184-3) is adopted here. Interestingly, similar conclusions are also reached in the more complicated analyses of [\[80\]](#page-184-4) and [\[78\]](#page-184-2), which are discussed in detail later. The thermal-to-phase-noise upconversion (20 dB/dec) can be expressed as

$$
\mathcal{L}_{CO,dB}(\Delta f) = 10 \cdot \log_{10} \left( \frac{R_t kT}{2Q_t^2 V_{osc}^2} \cdot (F+1) \cdot \left( \frac{f_0}{\Delta f} \right)^2 \right) \tag{3.64}
$$

<span id="page-96-0"></span>
$$
= 10 \log_{10} \left( \frac{kT}{2Q_t^2 \alpha_I \alpha_V P_{DC}} \cdot (F+1) \cdot \left( \frac{f_0}{\Delta f} \right)^2 \right) \tag{3.65}
$$

where  $R_t$  is the equivalent parallel resistance of the tank,  $k$  is Boltzmann's constant, and  $T$  is the absolute temperature, while  $F$  is the excess noise factor, which accounts for all additional noise upconverted from sources other than the loss of the tank.  $\alpha_V = \frac{V_{osc}}{V_{DD}}$  and  $\alpha_I = \frac{I_{\omega_0}}{I_{DC}}$  are the voltage and current efficiency factors, respectively.

The voltage efficiency measures the ratio between the oscillation amplitude and the supply, while the current efficiency measures the ratio between the fundamental harmonic current across the tank and the DC current used for driving the tank. The higher the efficiency, the less phase noise is generated from the same noise sources.

Similar to the FOM for the PD derived in Sec. [3.4.1,](#page-71-1) Eqn. [3.65](#page-96-0) can be rearranged into

$$
\mathcal{L}_{CO}(\Delta f) \cdot P_{DC} = \frac{kT}{2Q_t^2 \alpha_I \alpha_V} \cdot (F+1) \cdot \left(\frac{f_0}{\Delta f}\right)^2 \tag{3.66}
$$

$$
\mathcal{L}_{CO}(\Delta f) \cdot P_{DC} \cdot \left(\frac{\Delta f}{f_0}\right)^2 = \frac{kT}{2Q_t^2 \alpha_I \alpha_V} \cdot (F+1) \tag{3.67}
$$

Similar to the PD, a constant power-noise product is observed here, which is largely determined by the design strategy ($F$, $\alpha_V$ and $\alpha_I$) and by process-dependent factors such as  $Q_t$. Based on this, a noise-power FOM has been used by academia, which is defined (in dB) as

$$
\text{FOM}_{osc} = 10 \log_{10} (\mathcal{L}_{CO}(\Delta f) \cdot \frac{P_{DC}}{1mW} \cdot (\frac{\Delta f}{f_0})^2) \tag{3.68}
$$

<span id="page-97-0"></span>
$$
\text{FOM}_{osc} = 10 \log_{10} \left( \frac{kT}{2Q_t^2 \alpha_I \alpha_V 10^{-3}} \cdot (F+1) \right) \tag{3.69}
$$

Eqn. [3.69](#page-97-0) clearly emphasizes the importance of a high Q for the power efficiency of an RF oscillator design, which is mostly restricted by the given process. However, improving the design-dependent factors, i.e., reducing  $F$  and increasing  $\alpha_I$  and  $\alpha_V$  (the ratio between the oscillation amplitude and the supply voltage), can still optimize the final power efficiency of an RF oscillator design. On the other hand, this FOM definition also has many limitations. For example, it does not properly characterize the tuning range of an oscillator, which is usually enlarged at the cost of a degraded  $Q_t$, leading to a degradation of the FOM. Furthermore, this FOM only captures the 20dB/dec phase noise region, while the phase noise from  $1/f$  noise up-conversion, which is a considerable jitter contributor in narrow-bandwidth PLLs, is completely ignored.
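The FOM of Eq. 3.68 is easily evaluated from a measured or specified spot phase noise. The sketch below applies it to the GSM TX requirement quoted earlier (-162 dBc/Hz at 20 MHz offset from 915 MHz); the 10 mW power figure is a hypothetical value chosen only to complete the example, and the function name is illustrative. With the sign convention of Eq. 3.68, a more negative FOM is better.

```python
import math


def fom_osc_db(pn_dbc_hz: float, p_dc_w: float, f0_hz: float, offset_hz: float) -> float:
    """Oscillator noise-power FOM in dB following Eq. 3.68.

    pn_dbc_hz is the spot phase noise at offset_hz from the carrier f0_hz,
    and p_dc_w is the DC power in watts (normalized to 1 mW).
    """
    return (pn_dbc_hz
            + 10.0 * math.log10(p_dc_w / 1e-3)
            + 20.0 * math.log10(offset_hz / f0_hz))


if __name__ == "__main__":
    # GSM TX spot-noise requirement with a hypothetical 10 mW oscillator.
    print(f"FOM_osc = {fom_osc_db(-162.0, 10e-3, 915e6, 20e6):.1f} dB")
```

Doubling the power at the same phase noise worsens this FOM by 3 dB, which is why the metric rewards designs that reach a given purity at lower power.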

#### **Noise-Power FoM of PLL**

Based on, and only on, the assumption that an optimal bandwidth  $f_{opt}$  is chosen for the loop, i.e., the output IPN contributions from the PD/loop and the VCO/DCO are equal (Sec. [3.3.1\)](#page-58-0), a FOM for the whole PLL's noise-power product can be defined as below:

$$
\text{FOM}_{PLL} = 10 \log_{10} \left[ \left( \frac{\sigma_{t,PLL}}{1s} \right)^2 \frac{P_{PLL}}{1mW} \right] \tag{3.70}
$$

where  $\sigma_{t,PLL}$  is the integrated phase jitter of the PLL and  $P_{PLL}$  accounts for the overall power consumption. The underlying improvements of  $\text{FOM}_{PLL}$  come from  $\text{FOM}_{PD}$  and  $\text{FOM}_{osc}$. However, Eq. [3.69](#page-97-0) can be leveraged to further predict the limiting value of  $\text{FOM}_{osc}$, based on the assumption that the excess noise factor  $F$  approaches zero while the maximum power efficiency  ($\alpha_I \alpha_V$)  approaches 1, leading to

$$
\text{FOM}_{osc} = 10 \log_{10} \left( \frac{kT}{2Q_t^2 \alpha_I \alpha_V 10^{-3}} \cdot (F+1) \right) \tag{3.71}
$$

$$
= -176.8 - 20\log_{10}(Q) \tag{3.72}
$$

which shows the strong restriction imposed by the process ($Q$) and thus leaves little headroom for a significant breakthrough. Therefore, a substantial advance of the loop FOM should result from an improvement of the PD FOM.
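The limiting value in Eq. 3.72 and the PLL FOM of Eq. 3.70 can be reproduced with a few lines. The sketch below assumes a temperature of 300 K, which yields the -176.8 dB constant; the tank Q values, the 100 fs jitter and the 10 mW power are hypothetical inputs used only for illustration, and the function names are not from the original text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def fom_osc_limit_db(q_t: float, temp: float = 300.0, noise_factor: float = 0.0,
                     alpha_i: float = 1.0, alpha_v: float = 1.0) -> float:
    """FOM_osc of Eq. 3.69; the defaults give the ideal limit of Eq. 3.72."""
    linear = K_B * temp / (2.0 * q_t ** 2 * alpha_i * alpha_v * 1e-3) * (noise_factor + 1.0)
    return 10.0 * math.log10(linear)


def fom_pll_db(sigma_t_s: float, p_pll_w: float) -> float:
    """PLL jitter-power FOM of Eq. 3.70 (jitter in seconds, power in watts)."""
    return 20.0 * math.log10(sigma_t_s / 1.0) + 10.0 * math.log10(p_pll_w / 1e-3)


if __name__ == "__main__":
    for q_t in (1.0, 10.0, 20.0):  # hypothetical tank quality factors
        print(f"Q_t = {q_t:4.0f}: limit = {fom_osc_limit_db(q_t):.1f} dB")
    print(f"FOM_PLL (100 fs, 10 mW) = {fom_pll_db(100e-15, 10e-3):.1f} dB")
```

Because the limit improves by only 6 dB per doubling of $Q_t$, and $Q_t$ is set largely by the process, the oscillator leaves limited room for improvement, consistent with the conclusion above.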

### **3.7.1 Methods of Phase Noise Analysis**

Even though the Leeson model [\[79\]](#page-184-3) gives an essential prediction as well as understanding of phase noise in an LC oscillator, it has many restrictions. Limited by its LTI assumption, the Leeson model cannot explain well the mechanism of noise conversion, which is caused by non-linear behavior in an oscillator. For instance, the model is powerless when it comes to understanding the conversion from 1/f noise to phase noise. Therefore, the key question becomes how noise sources at frequencies of  $\omega_m$,  $\omega_0 \pm \omega_m$,  $2\omega_0 \pm \omega_m$  and higher harmonics end up as phase noise at a certain offset  $\omega_m$  from the carrier frequency  $\omega_0$. Over the past two decades, significant progress has been witnessed in the understanding of noise in LC-based oscillators. During this period, academia has moved on to analysis methods that more appropriately capture the time-varying and large-signal nature of practical oscillators. Two analysis methods stand out: the impulse-sensitivity-function (ISF)-based approach proposed by Hajimiri and Lee [\[80\]](#page-184-4) and the phasor-based one by Huang [\[78\]](#page-184-2). Central to the former work, which is a linear time-variant model under the assumption of automatic gain control (AGC), is the derivation of the ISF (usually by extensive simulations), which characterizes how the phase disturbance produced by a current impulse depends on the time at which the impulse is injected. Under the assumption of AGC, only phase disturbance (PM) finally translates into phase noise, with the narrowband FM approximation as used in Ch.2. Limited by this AGC assumption, phase noise resulting from AM-PM conversion cannot be predicted at all, which could be a problem in situations such as high-swing voltage-biased VCOs with varactors involved. [i](#page-99-0) The latter looks at the sideband noise generation scheme instead, with far fewer intuition-based assumptions, at the cost of involving both time-domain analysis and sideband balance derivation in the frequency-domain. The main drawback of the latter approach is its sole assumption of a nearly sinusoidal LC oscillator. Originally proposed for a Colpitts oscillator, this method was further elaborated and applied to other differential LC oscillator designs by D. Murphy and Abidi [\[81\]](#page-184-5).

## **3.7.2 Review of Classical LC-oscillator Topologies**

The differential switching pair (be it NMOS, PMOS or CMOS), which provides the negative resistance necessary to replenish the energy lost in the LC tank and sustain the oscillation, can be viewed as an amplifier as well. Thus, the classification used in power amplifier design is borrowed, according to the time during which the active device conducts current, expressed as a fraction of the period of the signal waveform applied to the input.

#### **Class-B Oscillator Topology**

The conventional current-biased class-B oscillator, shown in Fig. [3.25,](#page-101-0) has been widely adopted in wireless transceivers due to its simplicity, robustness, and superior performance over the single-ended Colpitts oscillator [\[82\]](#page-184-6). The ideal noise factor of a class-B structure is equal to $1 + \gamma$ [\[81\]](#page-184-5) under the assumption that the tail current transistor $M_T$ is an ideal current source. Under this assumption, the current source not only contributes no phase noise but also provides an

<span id="page-99-0"></span>i The ISF method has difficulty predicting 1/f noise upconversion in a nearly sinusoidal oscillator.

infinite impedance at the common source of the $g_m$ transistors which, as will be explained later, is beneficial for phase noise reduction. Let us investigate how the performance of this oscillator topology can be improved, based on Eqn. [3.69.](#page-97-0) As mentioned above, increasing the tank's quality factor reduces the phase noise. The tank's quality factor, $Q_t$, is determined by both the inductive and capacitive quality factors as

$$
\frac{1}{Q_t} = \frac{1}{Q_L} + \frac{1}{Q_C} \tag{3.73}
$$

The inductor's quality factor, $Q_L$, which usually limits $Q_t$, is mostly technology dependent and does not improve with CMOS technology scaling. The capacitive quality factor, $Q_C$, on the other hand, depends on the tuning range of the oscillator. A typical switched-capacitor structure [\[83\]](#page-184-7)[\[84\]](#page-185-0), shown in Fig. [3.26,](#page-102-0) is nowadays used to tune oscillators (both DCOs and VCOs), especially to cover large tuning ranges. When $M_{switch}$ is on, $C_{on} = \frac{C}{2}$, and the switch on-resistance, $R_{on}$, defines $Q_C = \frac{1}{2\omega R_{on}C}$. To improve $Q_C$, $R_{on}$ should decrease and consequently the size of $M_{switch}$ should increase. However, a larger $M_{switch}$ adds to the parasitic capacitance $C_{par}$ and consequently increases the capacitance when $M_{switch}$ is off:

$$
C_{off} = \frac{CC_{par}}{2(C + C_{par})} \tag{3.74}
$$

which restricts the available tuning range. Therefore, $Q_t$ is primarily limited by the technology and the oscillator's tuning range, and is rarely a flexible design parameter for improving phase noise. Another approach to improving the phase noise is to reduce the tank inductance while maintaining its quality factor. Doing so reduces $R_t = \omega L Q_t$; however, for a given oscillation amplitude it increases the power consumption, $P_{DC} \propto V_{osc}^2 / R_t$, at the same rate, and thus the FOM is not improved. Furthermore, as the inductor becomes smaller, the tank interconnection losses become more critical and ultimately limit the tank quality factor.
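To make the tuning-range trade-off concrete, the following minimal Python sketch evaluates Eq. 3.73 and Eq. 3.74 together with $C_{on} = C/2$. The function names and all component values are illustrative assumptions, not values from this design. Under these relations the on/off capacitance ratio reduces to $1 + C/C_{par}$, so a larger switch (larger $C_{par}$) directly shrinks the achievable tuning range.

```python
def tank_q(q_l: float, q_c: float) -> float:
    """Combined tank quality factor from 1/Qt = 1/QL + 1/QC (Eq. 3.73)."""
    return 1.0 / (1.0 / q_l + 1.0 / q_c)


def c_on(c: float) -> float:
    """Capacitance of the switched-capacitor cell when the switch is on."""
    return c / 2.0


def c_off(c: float, c_par: float) -> float:
    """Capacitance of the cell when the switch is off (Eq. 3.74)."""
    return c * c_par / (2.0 * (c + c_par))


def on_off_ratio(c: float, c_par: float) -> float:
    """Ratio C_on / C_off, which simplifies to 1 + C / C_par."""
    return c_on(c) / c_off(c, c_par)


# Hypothetical values: 200 fF unit capacitor, growing switch parasitic.
C_UNIT = 200e-15
for c_par in (20e-15, 50e-15, 100e-15):
    ratio = on_off_ratio(C_UNIT, c_par)
    print(f"C_par = {c_par * 1e15:5.0f} fF -> C_on/C_off = {ratio:.1f}")

# Hypothetical quality factors: the lower one dominates the tank Q.
print(f"Qt for QL=15, QC=60: {tank_q(15.0, 60.0):.1f}")
```

The loop illustrates why $Q_C$ and tuning range pull in opposite directions: widening the switch to lower $R_{on}$ raises $C_{par}$ and lowers the on/off ratio.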

<span id="page-101-0"></span>

Figure 3.25: Class-B oscillator (a) schematic; (b) oscillation amplitude vs. tail current; (c) ideal and real drain current waveforms; (d) oscillation voltage waveforms.

<span id="page-102-0"></span>

Figure 3.26: The switched-capacitor tuning circuit in on and off states.

The class-B oscillator reaches its best performance when the oscillation amplitude is increased up to $V_{DD}$ [\[85\]](#page-185-1) and consequently $\alpha_V = 1$. Beyond this point, for a typical oscillator with a tail current source $M_T$, the growth of the oscillation amplitude tapers off (see Fig. [3.25\(](#page-101-0)b)) while the power consumption still increases linearly with the tail current, thus degrading the FOM. This point is sometimes also referred to as the separation point between the voltage (supply)-limit regime and the current-limit regime. The drain current of transistors $M_{1,2}$ is almost a square wave when the tail current source is ideal, and $\alpha_I = \frac{2}{\pi}$ (see Fig. [3.25\(](#page-101-0)c)). However, in a realistic scenario, the non-ideal current source brings certain issues and limitations. First of all, the transistor $M_T$ contributes to the phase noise and increases the noise factor above $(1 + \gamma)$. The minimum tail node voltage, $V_T$, is also limited by the margin needed to keep $M_T$ in saturation. Consequently, the maximum oscillation amplitude reduces to $V_{DD} - V_{sat}$ and $\alpha_V < 1$ ($\alpha_V \approx 0.8$). The capacitance at the drain of $M_T$ tends to keep this node voltage at a constant level; consequently, for large oscillation amplitudes, $M_{1,2}$ enter the triode region, and the ideal square wave of the $M_{1,2}$ drain currents exhibits a dimple, as shown in Fig. [3.25\(](#page-101-0)c). As a result, $\alpha_I$ drops from its ideal value of $\frac{2}{\pi}$, thus increasing phase noise. In addition, when $M_1$ or $M_2$ enters the triode region for a portion of the oscillation period, it presents a low-impedance path. Furthermore, the equivalent parasitic capacitance at node T creates a low-impedance path from T to ground. The tank therefore finds a discharge path to ground whenever either of these transistors is in the triode region; consequently, its quality factor drops, increasing the oscillator's phase noise. This phenomenon is called **Q-degradation**. The transistor $M_T$ is usually relatively large to reduce its flicker noise; consequently, the parasitic capacitor at node T is large enough to provide such a low-impedance path. However, it also helps by partially filtering the thermal noise of $M_T$.

Various solutions have been proposed in the literature to improve the phase noise of the class-B topology or the trade-off between its phase noise and power consumption, and new classes of oscillators have consequently been introduced. One of the most effective techniques for improving the class-B oscillator is the popular noise filtering technique [\[86\]](#page-185-2). In this technique, the thermal noise of $M_T$ is filtered by a relatively large capacitor, while a high-impedance path is inserted between the core transistors and $M_T$ to block the discharge path of the tank. Although this technique is very effective, the high-impedance path is realized by another resonator and thus requires additional die area. In addition, the capacitive component of the tail-filtering tank differs from that of the main resonant LC tank, which weakens the filtering effect when a larger tuning range is covered.

### **Class-C Oscillator Topology**

The class-C structure [\[87\]](#page-185-3) is shown in Fig. [3.27\(](#page-104-0)a). In this class of operation, the core transistors are kept in saturation, and consequently they present a high impedance during the entire oscillation period. The tank does not find a discharge path to ground and its quality factor is thus preserved. This structure also saves 36% of the power consumption for the same phase noise by changing the square pulses of the $M_{1,2}$ drain current in class-B operation into narrow and tall pulses with $\alpha_I = 1$. To ensure operation in the saturation region, the gates of $M_{1,2}$ are decoupled from the oscillation voltage and are biased at a value well below $V_{DD}$. A large capacitor in parallel with the $M_T$ current source enables the sharp and narrow class-C-like current pulses of the $M_{1,2}$ transistors, as highlighted in Fig. [3.27\(](#page-104-0)a). However, the maximum oscillation amplitude is limited in this topology. If the oscillation amplitude gets large enough to push $M_{1,2}$ into the

<span id="page-104-0"></span>

Figure 3.27: (a) A class-C oscillator schematic; and (b) its voltage waveforms.

triode region, not only would the tank's quality factor drop heavily due to the large $C_T$, but the drain currents of $M_{1,2}$ would also no longer feature sharp and narrow pulses, so $\alpha_I$ would drop dramatically. Consequently, although the phase noise and power efficiency are improved at low oscillation amplitudes compared to a class-B oscillator with the same amplitude, the best achievable phase noise performance is limited. The class-C swing can be increased by removing the current source transistor $M_T$ and generating $V_{bias}$ with an adaptable current mirror circuit [\[88\]](#page-185-4). This oscillator topology also suffers from a trade-off between robust start-up and the maximum oscillation voltage in steady state [\[89\]](#page-185-5): $V_{bias}$ should be relatively large to facilitate start-up, but large $V_{bias}$ values limit the steady-state oscillation amplitude. As high purity is the primary goal, this topology is not adopted.
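The 36% power saving quoted for class-C operation can be reproduced with a short calculation. The sketch below assumes that $\alpha_I$ is the ratio of the fundamental tank current to the DC supply current, and that both oscillators use the same supply voltage and deliver the same fundamental current to the tank; the function names are hypothetical.

```python
import math

ALPHA_I_CLASS_B = 2.0 / math.pi  # ideal square-wave drain current
ALPHA_I_CLASS_C = 1.0            # narrow, tall drain current pulses


def dc_current(i_fundamental: float, alpha_i: float) -> float:
    """DC current needed for a given fundamental current.

    Assumes alpha_i = I_fundamental / I_DC.
    """
    return i_fundamental / alpha_i


def class_c_power_saving() -> float:
    """Fractional DC power saved by class-C relative to class-B.

    Same supply voltage and same fundamental current are assumed, so the
    power ratio equals the DC current ratio.
    """
    i_fund = 1.0  # normalized fundamental current
    i_dc_b = dc_current(i_fund, ALPHA_I_CLASS_B)
    i_dc_c = dc_current(i_fund, ALPHA_I_CLASS_C)
    return 1.0 - i_dc_c / i_dc_b


print(f"Class-C power saving: {100.0 * class_c_power_saving():.1f} %")
```

The result equals $1 - 2/\pi$, which matches the 36% figure and makes clear that the saving comes entirely from the improved current efficiency.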

<span id="page-105-0"></span>

Figure 3.28: (a) A class-D oscillator schematic; and (b) its voltage waveforms.

#### **Class-D Oscillator Topology**

The schematic of a class-D oscillator is shown in Fig. [3.28.](#page-105-0) The tail transistor is removed, thus eliminating the voltage headroom it requires to stay in saturation. Furthermore, the transistors $M_{1,2}$ are sized large enough to act as almost ideal switches. The relative oscillation voltage amplitude is maximized in this structure, reaching almost $3V_{DD}$ [\[90\]](#page-185-6). Consequently, transistors $M_{1,2}$ are pushed into the deep triode region (even more than in the class-B structure), which by itself would considerably degrade the phase noise. However, as demonstrated in Fig. [3.28\(](#page-105-0)b), the oscillation voltages $V_1$ and $V_2$ are forced to ground for almost half the period. $V_1$ ($V_2$) is mostly grounded when $M_1$ ($M_2$) is in the triode region, so the noise injected during most of this period is almost zero, preventing the generation of upconverted phase noise. The high oscillation amplitude of this structure makes it suitable for low-voltage, low-phase-noise applications. The product of the drain current and drain voltage of the MOS switches is almost zero across the oscillation period, and hence the power efficiency of this structure can exceed 90% [\[90\]](#page-185-6). This oscillator structure not only can but in fact must operate from a low supply voltage; otherwise the transistors $M_{1,2}$, which should be thin-oxide devices to guarantee nearly perfect switching, would face breakdown. Another limitation of the class-D structure is its relatively severe low-frequency noise upconversion and strong supply frequency pushing. An on-chip LDO was used in [\[91\]](#page-185-7) to mitigate this problem, at additional cost.

### **Short Summary**

In this subsection, we briefly introduced various oscillator structures and discussed their benefits and drawbacks. We gave an overview of the non-idealities that the traditional class-B oscillator faces and reviewed how each structure tries to overcome them. The class-C oscillator improves phase noise at a given power consumption, but only when its oscillation amplitude is low enough to keep the core transistors in saturation. Thus it trades off the maximum achievable phase noise performance for power efficiency. The class-D oscillator reaches low phase noise in the thermal noise region without requiring large supply voltages. On the other hand, it is limited to low supply voltages due to reliability concerns. In recent years, more innovative structures have been proposed, such as the class-F oscillator [\[92\]](#page-185-8) [\[93\]](#page-186-0); however, they are more complex to design and need simultaneous tuning of multiple capacitor banks.

All these oscillator structures attempt to improve the thermal (20dB/dec) phase noise, which dominates the final PLL out-of-band phase noise. Based on the above discussion, the class-B topology is chosen as our starting point due to its robustness and its balanced performance in terms of power consumption and phase noise. In the next subsection, LC oscillator structures with low 1/f noise upconversion are covered.

## **3.7.3 Basics about 1/f Noise Upconversion**

### **Negative Impact over Output Phase Noise**

The close-in spectra of RF oscillators are degraded by flicker ($1/f$) noise upconversion (the $1/f^3$, i.e., 30dB/dec, region). However, the corner frequency $f_{corner}$ which separates the $1/f^3$ and $1/f^2$ regions is not necessarily equal to the corner frequency between $1/f$ noise and thermal noise [\[80\]](#page-184-4); it is rather determined by the oscillator design itself, as will be covered later. The resulting low-frequency phase

<span id="page-107-0"></span>

Figure 3.29: Conceptual s-domain PLL model showing the importance of  $1/f^3$  phase noise.

noise fluctuations can be mitigated as long as $f_{corner}$ falls well within the loop bandwidth of a PLL. However, the PLL loop bandwidths in cellular transceivers can vary from less than a few tens to a few hundreds of kHz [\[94\]](#page-186-1), which is below the typical $1/f^3$ PN corner of CMOS oscillators [\[90\]](#page-185-6). Consequently, a considerable amount of the oscillator's low-frequency noise cannot be filtered by the loop and will adversely affect the synthesizer's spectral purity. This can be explained conceptually, as shown in Fig. [3.29.](#page-107-0) Two different VCOs are compared, distinguished by their $f_{corner}$, i.e., a high and a low $1/f^3$ corner. The reference and PD noise contributions are minimized so that we can focus on the contribution of the VCO, i.e., the high-pass filtered oscillator phase noise. The loop bandwidth is set as narrow as 100kHz, which is commonly seen. With a type-1 loop, the in-band filtering of the oscillator phase noise is merely 20dB/dec, and thus a 10dB/dec phase noise residue is left at the output, heavily
<span id="page-108-0"></span>

Figure 3.30: Current harmonics path in an LC oscillator [\[95\]](#page-186-0).

polluting the purity. On the other hand, even with a type-2 loop, the VCO with the higher $f_{corner}$ is still not fully filtered by the narrow loop bandwidth and thus leads to much higher in-band phase noise at the output.
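The in-band slopes discussed above can be checked numerically. The sketch below is a simplified continuous-time model and not the loop of this work: the type-1 loop is assumed to act on the oscillator noise as a first-order high-pass filter, the type-2 loop as a second-order high-pass filter with damping factor $1/\sqrt{2}$, and the oscillator is assumed to sit entirely in its $1/f^3$ region below the 100kHz bandwidth. The reference level of the phase noise is an arbitrary placeholder.

```python
import math


def vco_pn_db(f: float, l_ref_db: float = -100.0, f_ref: float = 1e5) -> float:
    """Oscillator phase noise in its 1/f^3 region (-30 dB/dec).

    l_ref_db at f_ref is an arbitrary, hypothetical anchor point.
    """
    return l_ref_db - 30.0 * math.log10(f / f_ref)


def hp_type1_db(f: float, f_bw: float) -> float:
    """First-order high-pass |H|^2 in dB (assumed type-1 loop response)."""
    return 10.0 * math.log10(f**2 / (f**2 + f_bw**2))


def hp_type2_db(f: float, f_n: float) -> float:
    """Second-order high-pass |H|^2 in dB with damping 1/sqrt(2).

    For this damping, |H|^2 simplifies to f^4 / (f^4 + f_n^4).
    """
    return 10.0 * math.log10(f**4 / (f**4 + f_n**4))


def inband_slope_db_per_dec(hp, f_bw: float = 100e3,
                            f_lo: float = 1e3, f_hi: float = 1e4) -> float:
    """Slope of the filtered oscillator noise between f_lo and f_hi."""
    out_lo = vco_pn_db(f_lo) + hp(f_lo, f_bw)
    out_hi = vco_pn_db(f_hi) + hp(f_hi, f_bw)
    return (out_hi - out_lo) / math.log10(f_hi / f_lo)


print(f"type-1 in-band slope: {inband_slope_db_per_dec(hp_type1_db):+.1f} dB/dec")
print(f"type-2 in-band slope: {inband_slope_db_per_dec(hp_type2_db):+.1f} dB/dec")
```

Under these assumptions the type-1 loop leaves a residue that still rises towards the carrier at about 10dB/dec, whereas the type-2 loop turns the slope positive, so the remaining penalty is set by how high the $1/f^3$ noise is at the loop bandwidth.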

#### **Basic Mechanism**

We now examine the mechanisms that raise $f_{corner}$ in an RF oscillator. There are two major 1/f noise upconversion mechanisms [\[95,](#page-186-0) [96\]](#page-186-1). In the first, the flicker noise of the tail transistor modulates the oscillation voltage amplitude and is then upconverted to phase noise via AM-PM conversion through the nonlinear parasitic capacitances of active devices, varactors and switchable capacitors [\[97\]](#page-186-2)[\[98\]](#page-186-3). This mechanism can be minimized by forming an auxiliary resonance at $2f_0$ at the tail of the VCO [\[86\]](#page-185-0), which offers a high impedance to reject the tail current modulation. In addition, a direct solution is to enlarge the tail transistor, thus reducing the source of $1/f$ noise.

Yet another mechanism of 1/f noise upconversion is due to the Groszkowski effect [\[99\]](#page-186-4)[\[100\]](#page-186-5). In a harmonically rich tank current, the fundamental component, $I_{H1}$, flows into the equivalent parallel resistance of the tank, $R_p$. The other components, however, mainly take the capacitive path due to its lower impedance, as depicted in Fig. [3.30.](#page-108-0) Compared to the case with only the fundamental component, the capacitive reactive energy is increased by the higher harmonics flowing into the capacitors. This makes the tank's reactive energy unbalanced. The oscillation frequency shifts down from the tank's natural resonance frequency, $f_0$, to increase the inductive reactive energy and restore the energy equilibrium of the tank. This frequency shift is given by [\[99\]](#page-186-4)

<span id="page-109-0"></span>
$$
\frac{\Delta\omega}{\omega_0} = -\frac{1}{Q^2} \sum_{n=2}^{\infty} \frac{n^2}{n^2 - 1} \cdot \left| \frac{I_{Hn}}{I_{H1}} \right|^2 \tag{3.75}
$$

where $I_{Hn}$ is the *n*-th harmonic component of the tank current. Although the original literature treats this shift as static, any fluctuation in $\frac{I_{Hn}}{I_{H1}}$ due to 1/f noise modulates $\Delta \omega$ and shows up as $1/f^3$ phase noise [\[100\]](#page-186-5).
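As a numerical illustration of Eq. 3.75, the following sketch evaluates the relative frequency shift for a given set of harmonic ratios. The function name, the tank quality factor, and the harmonic levels are hypothetical values chosen only to show the order of magnitude and the sensitivity to a small change in harmonic content.

```python
def groszkowski_shift(q: float, harmonic_ratios: dict) -> float:
    """Relative frequency shift (delta_omega / omega_0) per Eq. 3.75.

    harmonic_ratios maps the harmonic index n (n >= 2) to |I_Hn / I_H1|.
    """
    total = 0.0
    for n, ratio in harmonic_ratios.items():
        if n < 2:
            raise ValueError("only harmonics with n >= 2 contribute")
        total += n**2 / (n**2 - 1) * abs(ratio) ** 2
    return -total / q**2


# Hypothetical tank: Q = 10, 2nd harmonic at 10 %, 3rd harmonic at 5 %.
Q_TANK = 10.0
nominal = groszkowski_shift(Q_TANK, {2: 0.10, 3: 0.05})
# A 1 % fluctuation of the 2nd-harmonic ratio, e.g. caused by 1/f noise.
perturbed = groszkowski_shift(Q_TANK, {2: 0.101, 3: 0.05})

print(f"nominal shift   : {nominal:.3e}")
print(f"shift variation : {perturbed - nominal:.3e}")
# Weighting factor n^2 / (n^2 - 1) for the first few harmonics.
print([round(n**2 / (n**2 - 1), 3) for n in (2, 3, 4, 5)])
```

The shift is always negative, and the printed variation is the quantity that 1/f noise modulates; multiplied by the carrier frequency it gives the frequency fluctuation that appears as $1/f^3$ phase noise.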

Several solutions have been proposed in the literature to reduce the 1/f noise upconversion due to the Groszkowski effect. They mostly linearize the system to reduce the level of the current harmonics, either by limiting the oscillation amplitude with an AGC [\[101\]](#page-186-6) or by linearizing the $g_m$-devices [\[102\]](#page-187-0). However, the resulting 1/f noise improvements come at the expense of the oscillator's start-up margin, degraded phase noise in the $1/f^2$ (20dB/dec) region, and potentially higher power consumption.

One technique worth mentioning is to form auxiliary resonances [\[95\]](#page-186-0) [\[103\]](#page-187-1). As shown in Fig. [3.30,](#page-108-0) the oscillation frequency $\omega_{osc}$ fluctuates around the tank's natural resonant frequency $\omega_0$ due to the flow of the higher harmonics of the current $I_{D1,2}$ into the capacitive part of the tank. Odd harmonics of the tank current are differential-mode (DM) signals; hence, they can flow into both differential and single-ended capacitors. Even harmonics of the tank current, on the other hand, are common-mode (CM) signals and can only flow into single-ended (SE) capacitors. If the tank possesses further resonances that coincide with these higher harmonics, these components can find their respective resistive paths to flow into. Consequently, the capacitive reactive energy would not be disturbed and the oscillation frequency shift $\Delta\omega$ would be minimized according to Eq. [3.75.](#page-109-0) Such a composite tank should exhibit the fundamental natural resonance at the targeted $\omega_0$ and auxiliary CM and DM resonances at the even- and odd-order harmonics, respectively. Minimizing the frequency shift $\Delta\omega$ weakens the underlying Groszkowski mechanism; however, realizing auxiliary resonances at higher harmonics can be area inefficient and can also degrade the PN performance. Consequently, the auxiliary resonance frequencies have to be chosen wisely. Eq. [3.75](#page-109-0) indicates that all the contributing current harmonics $I_{Hn}$ are weighted almost equally, since the factor $n^2/(n^2-1)$ stays close to unity. This means that, in practice, the stronger current harmonics contribute more to the frequency shift. Consequently, we can narrow down the required auxiliary resonances to the commonly stronger harmonics, i.e., the 2nd and 3rd harmonics. State-of-the-art implementations offer such a solution either with transformers, which requires the tuning of multiple capacitor banks [\[95\]](#page-186-0), or by realizing only a few of the desired auxiliary resonances [\[103\]](#page-187-1). An alternative solution is implemented in this work and will be covered in Chapter [4](#page-111-0) and Chapter [5.](#page-154-0)

# <span id="page-111-0"></span>**Chapter 4**

# **Fractional-N DPLLs with PD Accomplished in Fully-Differential Analog-Domain**

This chapter presents two implementations of fractional-N DPLLs based on the proposed solution discussed in Chapter [3.](#page-44-0) The first one resolves the phase error (PE) in a fully differential voltage (FDV) domain, where power-efficient PE detection is accomplished with higher CMRR/PSRR, lower PVT sensitivity, finer resolution, and better linearity compared to a gate-delay-dependent time-domain solution. It covers the fractional-N operation with a differential 10b current DAC, which realizes a voltage proportional to the fractional phase difference. A differential dv/dt ramp is employed to linearly transfer the fractional-N phase difference into a small-range voltage error, which is digitized by a narrow-range but fine-resolution 7b ADC. This design is fabricated in 130nm CMOS and achieves an integrated RMS jitter of 101fs with a -56dBc worst-case fractional spur while consuming 9.2mW, which translates to an FOM of -250.3dB. The second implementation covers the full dynamic range of the fractional-N operation with a 10b DAC in the charge domain, applied after the sampling of the differential ramp and before the conversion. This not only eliminates one considerable noise source but also saves about 15% power [a](#page-112-0) while leaving a larger dynamic range for a higher dv/dt gain, which leads to lower phase noise at the output.
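The quoted FOM can be cross-checked with the widely used jitter-power figure of merit, $FOM = 10\log_{10}\left[(\sigma_t / 1\,\mathrm{s})^2 \cdot (P / 1\,\mathrm{mW})\right]$. That this is the definition applied here is an assumption; the sketch below only shows that it reproduces the stated number from the stated jitter and power.

```python
import math


def jitter_power_fom_db(jitter_rms_s: float, power_w: float) -> float:
    """Jitter-power FOM in dB, referenced to 1 s and 1 mW (assumed definition)."""
    return 10.0 * math.log10((jitter_rms_s / 1.0) ** 2 * (power_w / 1e-3))


# Values quoted in the text for the first implementation.
fom = jitter_power_fom_db(jitter_rms_s=101e-15, power_w=9.2e-3)
print(f"FOM = {fom:.1f} dB")
```

Because the jitter enters squared, halving the RMS jitter improves this FOM by 6dB, whereas halving the power improves it by only 3dB.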

## **4.1 State-of-the-Art PD Techniques**

Proposed by R.B. Staszewski [\[7\]](#page-175-0), the TDC-based ADPLL has enjoyed rapid development over the past two decades. Numerous attempts have been made to optimize the ADPLL and reduce its power-jitter product. As discussed in Chapter [3,](#page-44-0) the key improvements lie in the PD block and in the $1/f^3$ phase noise of the oscillator, given that the FOM of the VCO thermal phase noise is strongly restricted by the process and tuning range. Thus, to reduce the in-band phase noise, the resolution of state-of-the-art TDCs has been pushed down to around the picosecond level [\[71\]](#page-183-0) [\[77\]](#page-184-0) [\[29\]](#page-178-0) [\[68\]](#page-183-1). A Vernier TDC formed by two delay chains with slightly different delays can achieve sub-gate-delay resolution at the cost of poor linearity, large area, and substantially more power consumption. Spiral/2-D/ring-based Vernier TDCs achieve fine resolution and reduced power consumption, yet suffer from linearity issues due to PVT variations and unresolved mismatch [\[74\]](#page-183-2)[\[75\]](#page-183-3). Time-amplifier TDCs can achieve fine time resolution; however, their power consumption makes them almost prohibitive for covering the full range of a $T_{ckv}$ [\[104\]](#page-187-2). The gated-ring-oscillator TDC achieves fine resolution with intrinsic first-order noise shaping, while its nonlinearity is still a drawback due to device leakage [\[72\]](#page-183-4). State-of-the-art DPLL architectures usually realize the TDC by converting $\Delta t$ to a voltage by way of a slope generator and then converting the voltage to a digital code with a high-resolution ADC [\[29\]](#page-178-0)[\[46\]](#page-180-0)[\[15\]](#page-176-0)[\[28\]](#page-178-1)[\[68\]](#page-183-1). Within such a structure, a large slope gain is required to sufficiently reduce the conversion of voltage-domain noise into timing jitter. However, due to the large dynamic range imposed by $T_{frac}$ wrapping, as well as the even larger range added by the $\Delta\Sigma$ modulation in divider-based structures ([\[28\]](#page-178-1)), the design of the ADC has to face two major

<span id="page-112-0"></span><sup>a</sup>According to simulation in 130nm CMOS.

challenges: linearity and a large dynamic range. For a high-resolution ADC, solving these two challenges costs substantially more power. However, a DPLL intrinsically does not need a high-dynamic-range phase-to-digital converter. After the fractional-N PLL has locked, a large part of the pattern of the difference between the reference and feedback phases is predictable [\[57\]](#page-181-0) based on $\Sigma FCW_{frac}$. In other words, only a small noise-induced quantity is left uncertain and needs to be detected. Therefore, leveraging a D/A converter to cover the predictable part makes the phase detector more power efficient, as the fractional phase difference pattern is pre-determined to a large extent. This prediction technique has been proven in the time domain to offer higher power efficiency [\[56\]](#page-181-1)[\[105\]](#page-187-3), where it is known as DTC-based phase prediction. Nevertheless, since the DTC is normally required to cover a range of at least one variable clock (CKV) cycle, its nonlinearity is the main source of spurs, adding a linearity vs. power trade-off even though the resolution-power trade-off is relaxed. To cope with this issue, the constant-slope-based DTC [\[106\]](#page-187-4) is becoming more and more popular due to its intrinsically higher linearity, both for phase prediction [\[105\]](#page-187-3) in counter-based DPLLs and for $\Delta\Sigma$ noise reduction in divider-based DPLLs [\[107\]](#page-187-5). Regardless of the different implementation details [\[106\]](#page-187-4)[\[107\]](#page-187-5)[\[105\]](#page-187-3), the constant-slope DTC essentially creates a ramp, which linearly maps a ΣFCW-controlled quantity from the voltage domain ($\Delta v$) into the time domain ($\Delta t$). For power efficiency, an inverter is commonly used as the comparator to translate the controlled quantity from the voltage domain into the time domain. However, the inverter's switching threshold $V_{TH,inv}$ is sensitive to PVT variations, as well as to other external disturbances (such as ground bounce). In addition, this V-to-T domain transition is also prone to picking up external noise (e.g., supply noise) due to the intrinsically low CMRR of the inverter. Nevertheless, DTCs are employed to assist both high-resolution time-domain TDC designs and high-resolution ADC-based TDC designs, alleviating the dynamic range requirement.

As previously mentioned in Chapter [3,](#page-44-0) we propose to add the fractional control in analog form before the A/D for power saving and to avoid any controlled delay line, as will be covered in depth in the next section.

<span id="page-114-0"></span>

Figure 4.1: Implemented DPLL with FDVPD operation shown conceptually.

## **4.2 Implementation 1: Fully Differential Voltage Domain PD DPLL**

## **4.2.1 Concept of Operations**

An overview of the implemented DPLL is conceptually depicted in Fig. [4.1.](#page-114-0) A low-power counter-assisted frequency locking path is added to ensure proper frequency locking. One additional divide-by-2 is added to ease the power vs. speed trade-off in the counter design, at the cost of an increased in-band noise floor. Based on the model-based analysis depicted in Fig. [3.24,](#page-94-0) the quantization noise is sufficiently suppressed relative to its thermal counterpart only if more than 10b of quantization is ensured. Thanks to the sub-ranging-like quantizer structure realized in the proposed fully differential voltage domain PD (FDVPD), the effective resolution is much higher than 10b, so the quantization noise is negligible.

The FDVPD linearly maps the phase error from the time domain into the voltage domain for power-efficient quantization with high CMRR, as discussed in Section [3.6.](#page-84-0) The transistor-level implementation of the FDVPD is shown in Figure [4.2,](#page-116-0) which includes 1) a biasing branch, from which the reference current is derived for both the current array (D/A converter) and the ramp generator; 2) a 10b current-array-based DAC; 3) a complementary pair of current sources, which defines the slew rate of the differential ramp; and 4) a narrow-range 7b self-timed SAR ADC, which digitizes the phase error with high resolution while offering its sampling capacitors $C_{SAR}$ as part of the ramp generator. Instead of relying on a DTC-assisted fine-resolution narrow-range TDC, which is restricted by a hard trade-off among resolution, linearity, power, and immunity to external interference due to its intrinsically PVT-sensitive single-ended inverters, the proposed FDVPD processes both the fractional-N prediction and the quantization in the fully differential voltage domain, where quantization is cheap and CMRR is intrinsically high. The operation of the FDVPD is conceptually illustrated in Figure [4.1.](#page-114-0) The proposed FDVPD digitizes the phase error between two input phases, REF and CKVdg (the clock-gated feedback CKV), with very fine resolution, yet at relatively low power consumption. A DAC, controlled by the fractional part of FCW, differentially pre-charges the encoded voltage onto the sampling capacitors $C_{SAR}$ of the SAR ADC. This operation is finished before the rising edge of REF and relaxes the resolution vs. dynamic range trade-off of the ADC design for a given power budget. The rising edge of REF instantly triggers the differential ramp generation, realized by the complementary current pair $I_R$ and the corresponding ADC input capacitors $C_{SAR}$. This operation lasts until the arrival of the rising edge of CKVdg, when the ADC sampling is triggered and the final differential voltage is held on the input capacitors $C_{SAR}$. As the sampled voltage is expected to include a large constant part and a small noise part after locking, an extension operation is inserted to roughly remove the constant part

<span id="page-116-0"></span>



before the ADC starts digitization, as will be discussed later. In this way, the dynamic range requirement of the fine resolution ADC is significantly relaxed; hence the power consumption is reduced.

The phase error quantization of a fractional-N PLL is accomplished in 5 steps, time-to-voltage conversion (steps 1-3) and digitization (steps 4-5), with the detailed procedure elaborated in Figure [4.3.](#page-119-0) In step 1 (before the rising edge of REF), the fractional-N operation is encoded into a differential output voltage $V_{DM}$ according to the following relations,

$$
V_{DM} = \frac{2I_R}{C_{SAR}}T_{frac} = \frac{2I_R}{C_{SAR}}(1 - \{\Sigma FCW\}_{frac})T_{ckv} \tag{4.1}
$$

$$
V_1 = V_{CM} - 0.5V_{OS} \tag{4.2}
$$

$$
V_2 = V_{CM} + 0.5V_{OS} \tag{4.3}
$$

$$
V_{OS} = I_R R_D \tag{4.4}
$$

where $T_{frac}$ represents the varying phase relation between REF and CKVdg ($T_{frac} < T_{ckv}$) due to the fractional-N operation after locking. During this step, the encoded steered currents set the wanted DM voltage via the effective differential resistance $R_D$, with a constant offset $V_{OS}$ added by the complementary branch of $I_R$. The CM voltage $V_{CM}$, determined by the resistance $R_{CM}$ and $I_{DAC}$, is chosen to be 0.55V, about half of the supply, for linearity considerations. In step 2, a rising REF edge disconnects the DAC from the rest of the circuit, and $V_{DM}$ then decreases at a constant rate K until the rising edge of the DCO feedback CKVdg stops the operation by sampling the differential voltage onto $C_{SAR}$, which marks step 3. The nominal K is assumed to be $\frac{2I_R}{C_{SAR}}$, while mismatch between the top and bottom current sources only slightly changes K and $V_{OS}$, adding a static gain error. This is very different from the case of a CPPLL, where the mismatch between the charge pump currents contributes serious spurs because the phase error information is presented separately in a single-ended way (lead or lag). In contrast, the FDVPD is based on the assumption that, after locking, the CKVdg phase always lags REF by a certain margin; therefore, the two current sources can both be used to represent the same lagging information, and hence the mismatch only adds a static gain error, which can be easily calibrated.
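Steps 1 to 3 can be summarized in a small behavioral sketch. It pre-charges $V_{DM}$ from the accumulated fractional FCW according to Eq. 4.1, discharges it with the nominal slope $K = 2I_R/C_{SAR}$, and returns what is left at the CKVdg edge. All numerical values and function names are hypothetical, the constant offset $V_{OS}$ is ignored, and ideal matched current sources are assumed.

```python
def ramp_slope(i_r: float, c_sar: float) -> float:
    """Nominal differential ramp slope K = 2 * I_R / C_SAR in V/s."""
    return 2.0 * i_r / c_sar


def predicted_vdm(fcw_frac: float, n_cycles: int,
                  i_r: float, c_sar: float, t_ckv: float) -> list:
    """Pre-charged V_DM per reference cycle, following Eq. 4.1."""
    k = ramp_slope(i_r, c_sar)
    acc = 0.0
    voltages = []
    for _ in range(n_cycles):
        acc = (acc + fcw_frac) % 1.0       # fractional part of the FCW sum
        t_frac = (1.0 - acc) * t_ckv       # predicted REF-to-CKVdg delay
        voltages.append(k * t_frac)
    return voltages


def residual_after_ramp(v_dm: float, t_actual: float,
                        i_r: float, c_sar: float) -> float:
    """Voltage left on C_SAR after ramping down for the actual delay."""
    return v_dm - ramp_slope(i_r, c_sar) * t_actual


# Hypothetical numbers: 200 uA ramp current, 0.5 pF, 400 ps CKV period.
I_R, C_SAR, T_CKV = 200e-6, 0.5e-12, 400e-12
v_pre = predicted_vdm(0.3, 5, I_R, C_SAR, T_CKV)
print([round(v, 4) for v in v_pre])

# First cycle with 1 ps of extra delay on CKVdg (e.g. DCO jitter).
t_meas = 0.7 * T_CKV + 1e-12
print(f"residual: {residual_after_ramp(v_pre[0], t_meas, I_R, C_SAR) * 1e3:.2f} mV")
```

The pre-charged voltage spans most of the available range from cycle to cycle, while the residual depends only on the timing error, which is why a narrow-range ADC suffices after locking.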

In step 3, the differential voltage is sampled for the subsequent A/D conversion, triggered by the rising edge of CKVdg. From this point on, the time error (expected vs. actual CKVdg) is represented by a differential voltage error. As the large excursion of $T_{frac}$ is canceled by the DAC in the voltage domain, the ADC only needs to resolve a small error voltage resulting from the residual noise. A low-power voltage buffer maintains the operating points of the $I_R$ current sources around $V_{CM}$ until the next REF cycle, preventing possible interference between cycles. Overall, the conversion of the time error into a voltage error is completed at the end of step 3 in a linear, robust and power-efficient way. In step 4, marked by a delayed version of CKVdg (*Extension*), a flipping capacitor $C_{exten}$ is switched to compensate for the large constant part of $V_{input}$ and narrow down $\Delta V_{final}$ after lock-in, equivalently extending the dv range. Step 5, the conversion phase, is the final step of the FDVPD operation: a further delayed *Extension* triggers the self-timed SAR ADC conversion to resolve the final residual error. During the initial phase locking, the voltage (time) error lies outside the limited ADC range, resulting in bang-bang behavior until the time error is regulated into the linear ADC range. However, this "blind" period is greatly shortened by the counter-based frequency locking path. In addition, the 7b ADC covers a range wider than the $6\sigma$ peak-to-peak DCO jitter in order to speed up locking. Besides, speed can be traded against resolution through different settings.
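The bang-bang behavior outside the ADC range and the resulting time resolution can be illustrated with an idealized quantizer model. This is a sketch only: the full-scale range, the ramp slope, and the function names are hypothetical, and the real self-timed SAR ADC is not modeled beyond ideal rounding and clipping.

```python
import math


def sar_code(v_err: float, v_fs: float, n_bits: int = 7) -> int:
    """Ideal signed quantizer with saturation.

    v_fs is the assumed differential full-scale range centered on zero.
    """
    levels = 2 ** n_bits
    lsb = v_fs / levels
    code = math.floor(v_err / lsb)
    # Outside the range only the sign survives: bang-bang-like behavior.
    return max(-levels // 2, min(levels // 2 - 1, code))


def time_lsb(v_fs: float, slope: float, n_bits: int = 7) -> float:
    """Time resolution implied by one ADC LSB for a given ramp slope."""
    return v_fs / 2 ** n_bits / slope


V_FS = 0.064      # hypothetical 64 mV differential ADC range
SLOPE = 8e8       # hypothetical ramp slope in V/s

for v in (-1.0, -0.0012, 0.0, 0.0012, 1.0):
    print(f"v_err = {v:+.4f} V -> code {sar_code(v, V_FS):+d}")
print(f"time LSB = {time_lsb(V_FS, SLOPE) * 1e15:.0f} fs")
```

Large errors during acquisition map to the extreme codes and therefore only convey the sign, while small residual errors are resolved linearly; a steeper ramp shrinks the time LSB for the same ADC.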

#### **4.2.2 10b DAC in Voltage Domain**

A 10b DAC is needed to cover the required dynamic range, and it must settle within half a REF cycle, e.g., 12.5ns at a reference rate of 40MHz, to avoid settling-induced spurs. Besides settling speed, it has to be linear: because it covers the large voltage dynamic range corresponding to $T_{frac}$, the nonlinearity of the DAC is the dominant source of fractional spurs, following the discussion in Section [3.5.](#page-77-0) Last but not least, it has to be power efficient as well.

<span id="page-119-0"></span>

Figure 4.3: FDVPD operations

So in which topology should it be realized? An R-2R-ladder-based DAC could be a power-efficient choice. However, its drawbacks outweigh its advantages. As settling speed is not only essential to ensure proper operation but also important to avoid the linearity degradation caused by incomplete settling, the LSB value of a 10b R-ladder realization should be chosen small. This worsens the impact of wiring resistance as well as of the ladder's impedance variations, and leads to strong spurs. Worse still, the settling can be further slowed down by the output buffer stage. Therefore, despite its advantage in power consumption, the loss of speed and linearity excludes the R-DAC option.

An I-DAC, on the other hand, can be much faster in terms of conversion, at the cost of more noise. Besides, an I-DAC can incorporate the CP current branch, so that the CP is properly set up before ramp generation. There is then no dead-zone issue, which removes the need for an unnecessarily large $\tau_{on}$ (Eq. [3.13\)](#page-62-0), during which a significant amount of current noise is integrated onto the sampling capacitance. Thus, a 10b current-steering D/A converter is adopted, with the ramp currents incorporated as depicted in Fig. [4.2](#page-116-0), to realize the DAC function in the voltage domain.

The implemented DAC is based on a 10b segmented current array. Despite its simplicity, a binary-controlled current array requires extremely precise matching at the codes where the most significant bits flip. The mid-code transition is very delicate and prone to large glitches. A thermometer-controlled current array, on the other hand, has a much-alleviated glitch issue, and the monotonicity of the converter is guaranteed by design. However, this implementation is not practical for 10b since it requires a large thermometer decoder and complex wiring.

Therefore, the final implementation is realized in a segmented way. The 4b MSB array consists of 15 identical unit cascode current sources, which are controlled by a thermometer-decoded control word based on $\Sigma FCW_{frac}$, to ensure the matching of the most critical parts. The 6b LSB array consists of a set of binary-scaled current sources directly controlled by the LSBs of the input word. This arrangement is a compromise between linearity and cost. A passive Pi-network of resistors is leveraged as the output stage, realizing the I-to-V conversion while properly incorporating the CP branch. Compared to an active transimpedance stage, this sacrifices linearity for power and conversion speed. Moreover, the fully differential realization offers more rejection of supply disturbances and is less affected by even-order harmonic distortion.
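The segmentation can be illustrated with a short behavioural sketch. It is only a sketch under stated assumptions: the function names, the ideal unit weights and the way the 10b word is split onto the two arrays are illustrative and are not taken from the implemented decoder.

```python
def segment_code(code: int):
    """Split a 10b word into a 15-line thermometer part (4 MSBs)
    and a 6b binary part (LSBs). Hypothetical helper, ideal behaviour."""
    if not 0 <= code < 1024:
        raise ValueError("code must fit in 10 bits")
    msb = code >> 6                      # upper 4 bits, 0..15
    lsb = code & 0x3F                    # lower 6 bits, 0..63
    thermo = [1 if i < msb else 0 for i in range(15)]
    binary = [(lsb >> b) & 1 for b in range(6)]   # LSB first
    return thermo, binary


def output_units(thermo, binary):
    """Ideal output in LSB current units: each MSB unit cell weighs 64 LSBs."""
    return 64 * sum(thermo) + sum(bit << b for b, bit in enumerate(binary))


if __name__ == "__main__":
    for code in (0, 63, 64, 511, 512, 1023):
        thermo, binary = segment_code(code)
        print(code, sum(thermo), binary, output_units(thermo, binary))
```

At the mid-code transition from 511 to 512, only one additional thermometer cell turns on while the six binary-scaled sources turn off, so the largest element involved in any transition is a single MSB unit cell rather than half of the array.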

As an alternative to arrays of current sources, D/A converters based on capacitors were also considered, which add the $T_{frac}$ compensation in the charge domain. This leads to an alternative implementation, which is covered in a later part of this chapter.

<span id="page-121-0"></span>

Figure 4.4: Non-linearity curves of the FDVPD based on post-layout simulation.

#### **4.2.3 Conversion of dv-to-dt**

The ramp generator, based on the current sources and the sampling capacitors of the ADC, largely determines the linearity of the conversion between V-domain and T-domain. In this implementation, the ramp is triggered by the REF edge, with a particular nonlinear starting behavior due to switching, and is stopped by the CKVg edge, at which the final differential voltage is sampled by the ADC. Therefore, switch design is crucial and will be covered in the next section. In addition, two other strategies are adopted to alleviate the nonlinearities resulting from channel length modulation as well as the distortion from switching.

The first one is that the dynamic range is chosen to be 300mV per side, around a CM level of about 600mV, so that the CP is not heavily stretched over different channels. This also alleviates the switching distortion from S1, as the starting point does not vary much. Moreover, high linearity is only required for the periodically varying $T_{frac}$ section. Regarding the constant part $T_{const.}$, as long as the dv/dt conversion has a constant shape (not slope!), the overall linearity is not degraded. The second trick is that a high-output-swing current mirror is used to realize the CP, which shares the same reference branch as the current arrays in the DAC. As shown in Fig. [4.2,](#page-116-0) the current mirror transistors are nested with the cascode transistors instead of being stacked. The resistance R in the biasing branch (not the one in the RC noise filter) provides a voltage shift for the sources of the cascode transistors that is large enough to accommodate the *Vds* of the tail current transistors and keep them in the saturation region.

As the SAR ADC is leveraged to diminish the quantization noise level, linearity is dictated primarily by the dv/dt conversion and the DAC. Careful layout and design can be applied to improve the overall linearity. The characterized INL and DNL based on post-layout simulation are shown in Fig. [4.4.](#page-121-0) They mainly reflect the impact of system nonlinearities due to layout and settling, as well as nonidealities associated with the ramp and the switches. Nevertheless, the overall result is promising and ensures a worst-case in-band spur lower than -58 dBc in theory, according to the analysis in Section [3.5.](#page-77-0) In addition to the satisfactory linearity, the equivalent timing resolution is much better defined, which makes estimation and calibration of the PD gain easier than in a delay-based structure. This can be seen from the following expression,

$$
t_{\rm res, DAC} = \frac{C_{SAR}}{I_R} \Delta V_{LSB} \tag{4.6}
$$

$$
=\frac{C_{SAR}}{I_R}\alpha I_{ref} \cdot \frac{2R_{CM}R_{DM}}{2R_{CM}+R_{DM}}\tag{4.7}
$$

$$
=\frac{\alpha}{\beta}\mathbf{RC}\tag{4.8}
$$
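The step from Eq. 4.7 to Eq. 4.8 can be spelled out as follows. This is a sketch under assumptions that are not stated explicitly in this section: the ramp current is taken to be a scaled copy of the DAC reference current, $I_R = \beta I_{ref}$ (both are derived from the same reference branch), $R$ abbreviates the equivalent resistance of the Pi-network, and $C = C_{SAR}$.

$$
t_{\rm res, DAC} = \frac{C_{SAR}}{\beta I_{ref}} \cdot \alpha I_{ref} \cdot R = \frac{\alpha}{\beta} R C, \qquad R = \frac{2R_{CM}R_{DM}}{2R_{CM}+R_{DM}}
$$

Under these assumptions the reference current cancels, so the timing resolution depends only on a current ratio and an RC product.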

#### **4.2.4 Switches in FDVPD**

Switches, marked as S1 and S2 in Fig. [4.2,](#page-116-0) play a significant role in the proposed FDVPD, as they link the timing edges with differential voltages. In addition, they benefit from scaling, being usually sized at the minimum channel length allowed by the technology. For a switch, the key parameters are its on-resistance and its parasitic capacitance (related to charge injection), both of which improve with technology scaling. This also demonstrates the applicability of the FDVPD to advanced process nodes.

#### **CMOS Switch for S1**

The linear equation governing the transistor current in the triode region is valid for both long- and short-channel transistors. Therefore, the on-resistance *Ron* of a simple NMOS/PMOS switch such as the one depicted in Fig. [4.5\(](#page-124-0)a) can always be written as

$$
R_{on} = \frac{1}{\mu_n C_{ox} \frac{W}{L} (V_{DD} - v_i - V_{TH,n})}, \tag{4.9}
$$

where $v_i$ is the input signal applied to one side of the switch. One drawback of a simple NMOS/PMOS switch is that its on-resistance depends on the input signal level, because the gate-source voltage of the switch transistor is a function of the input voltage.

<span id="page-124-0"></span>

<span id="page-124-2"></span>Figure 4.5: (a) NMOS switch. (b) CMOS switch.

Figure 4.6: CMOS switch with body biasing and dummy switches.

In order to mitigate the dependency of the on-resistance on the input level, complementary CMOS switches (also referred to as transmission gates), such as the one depicted in Fig. [4.5\(](#page-124-0)b), are vital in SC circuits. Assuming the switch to be sized with $\mu_n(W/L)_n = \mu_p(W/L)_p$ and neglecting the body effect for simplicity, the on-resistance is, to a first approximation, independent of the input signal level [\[108\]](#page-187-6) and results in

<span id="page-124-1"></span>
$$
R_{on} = \frac{1}{\mu_n C_{ox}(W/L)_n (V_{DD} - V_{TH,n} - |V_{TH,p}|)}. \tag{4.10}
$$
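The difference between Eq. 4.9 and Eq. 4.10 can be checked numerically with a first-order sketch. All device values below (supply, thresholds, the factor `k` standing for $\mu C_{ox} W/L$, and the input range) are hypothetical and chosen only for illustration; the body effect is ignored, as in the equations.

```python
def ron_nmos(vi, vdd, vth_n, k_n):
    """Eq. 4.9: triode on-resistance of an NMOS switch with its gate at VDD."""
    vov = vdd - vi - vth_n
    if vov <= 0:
        return float("inf")      # the transistor is off
    return 1.0 / (k_n * vov)


def ron_cmos(vi, vdd, vth_n, vth_p_abs, k):
    """Transmission gate with matched k = mu*Cox*W/L for both devices.
    The two channel conductances add; their sum reproduces Eq. 4.10
    whenever both devices conduct."""
    g_n = k * max(vdd - vi - vth_n, 0.0)
    g_p = k * max(vi - vth_p_abs, 0.0)
    g = g_n + g_p
    return 1.0 / g if g > 0 else float("inf")


if __name__ == "__main__":
    VDD, VTH, K = 1.2, 0.35, 5e-3            # hypothetical values
    for vi in (0.45, 0.55, 0.65, 0.75):      # assumed input range
        print(vi, ron_nmos(vi, VDD, VTH, K), ron_cmos(vi, VDD, VTH, VTH, K))
```

In this idealized model the NMOS-only resistance grows as the input approaches $V_{DD} - V_{TH,n}$, while the transmission-gate resistance stays constant over the range in which both devices conduct.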

From Eq. [4.10,](#page-124-1) *Ron* seems to be signal independent. However, when the modulation of the threshold voltage by the input signal due to the body effect is considered, the on-resistance is still signal dependent. With the design strategy of restricting the dynamic range per side to around 300mV, this impact is not severe, as confirmed by the post-layout simulation of nonlinearity. Nevertheless, two more concerns have to be considered if this CMOS switch is adopted as S1 in Fig. [4.2,](#page-116-0) which triggers the ramp generation.

The first concern is the switching speed. This is simply an RC time constant that is mainly determined by *Ron* and the load capacitors. To reduce $R_{on}$ without up-sizing the switches, which would significantly increase their parasitic capacitance, body biasing techniques are adopted. The switching time is critical in the sense that it should not be considerably larger than $T_{ckv}$, and it is covered by $T_{const}$. Any longer switching time therefore requires a longer CP injection time and thus adds more noise.

The second concern is charge injection. As each rising REF edge opens S1, its channel charge is injected into $C_{SAR}$. As the amount of injected charge is signal dependent, it introduces additional nonlinearity into the charge-domain operation of the ramp generation and increases the spurs. Therefore, dummy switches of half the size of the CMOS switches are adopted to compensate for the charge injection, as shown in Fig. [4.6.](#page-124-2)

#### **Bootstrap Switch for S2**

A pair of bootstrap switches is leveraged as S2 in Fig. [4.2,](#page-116-0) which samples the differential voltage into the ADC when the rising edge of CKVg arrives. After lock-in, the final sampling voltage is always around a certain level, as shown in Fig. [4.1,](#page-114-0) so charge injection is less serious than the signal-dependent *Ron*. This fact justifies the choice we make. In sub-1 V CMOS technologies, the overdrive of the input sampling switch is limited to a few hundred millivolts. The time constant formed by the sampling capacitor and the input switch on-resistance cannot be made negligible compared to a clock period of several hundred picoseconds. Therefore, the signal dependency of the on-resistance results in large harmonics at the ADC output. To prevent this, bootstrapping of the input switch [\[109,](#page-187-7) [110\]](#page-188-0) is a popular technique employed to make the on-resistance of the input switch signal independent by biasing the switch with a constant gate-source voltage equal to $V_{DD}$. The circuit performing this operation has been proposed in [\[110\]](#page-188-0), and is shown in Fig. [4.7.](#page-126-0) A conventional voltage doubler [\[111\]](#page-188-1) made of transistors $M_1$–$M_2$ and capacitors $C_1$–$C_2$ generates a boosted version of the clock phase *CKVdg* that charges capacitor $C_3$ completely to $V_{DD}$. During the on-phase *CKVdgb*, $C_3$ serves as a battery with its bottom terminal tied to the input signal and its top plate controlling the gate voltage $V_g$ of the sampling switch. Therefore, $V_g$ ranges between ground and $(V_{in} + V_{DD})$, ensuring $V_{gs} = V_{DD}$ for the sampling switch. For high-speed operation, it is crucial that the rise and fall times of the bootstrapped phase are kept negligible compared to the short clock period. Thus, with respect to the original version of the bootstrap circuit, an additional NMOS transistor $M_{11}$ is added in order to make the rising edge of the bootstrapped gate voltage faster [\[112\]](#page-188-2). At the beginning of the on-phase *CKVdgb*, the gate of the sampling switch rises because of the charge provided by the battery capacitor $C_3$. This in turn requires that $M_5$ and $M_8$ are turned on. To speed up this process, the addition of transistor $M_{11}$ ensures that $V_g$ can start to rise immediately at the beginning of the on-phase, because the charge is provided directly by the supply. $M_{11}$ automatically switches off when $V_g$ reaches $(V_{DD} - V_{TH,n})$.

<span id="page-126-0"></span>

Figure 4.7: Bootstrap switch circuit.

<span id="page-127-0"></span>

Figure 4.8: SAR ADC for the Step 4, 5 conversion of FDVPD operation.

## **4.2.5 Self-timed Low Power SAR ADC**

#### **Extension and Conversion**

The last 2 steps of the FDVPD are carried out by a low-power, small-input-range, self-timed SAR ADC with a dynamic comparator, as shown in Fig. [4.8.](#page-127-0) Unlike the synchronous SAR ADC with a conventional switching scheme [\[28\]](#page-178-1), top-plate sampling with a significantly more power-efficient monotonic switching scheme [\[113\]](#page-188-3) is implemented here instead of conventional bottom-plate sampling, for two reasons. Firstly, the differential voltage is charged/discharged against ground on the top plate in step 2, which reduces parasitic capacitance and improves mismatch. Secondly, a single supply-independent voltage reference can thus be used. A 0.25 mW low-power self-timed logic is implemented for simplicity of design (with 80MHz reference rate), saving power when the conversion terminates early. As shown in Fig. [4.9,](#page-129-0) during the enable period (CKVdg high), a ready comparison result from the dynamic comparator always pulls Z high through an XOR gate, which in turn triggers the comparator reset as well as the C-DAC conversion via a low-power dynamic DFF-based sequential logic, as shown in Fig. [4.10.](#page-130-0) A reset always pulls Z down after a certain delay, triggering the next comparison clock. Once the conversion is finished (marked by valid in Fig. [4.10\)](#page-130-0), the self-timing logic is terminated automatically. Different from a common SAR ADC, a special step for dynamic range extension is introduced in step 4; otherwise, the ramp would need to be stopped near the cross-over point, limiting the available dynamic range. This is realized by flipping the falling ramp up by *Vexten* through *Cexten*, triggered by the delayed CKVdg, i.e., Extension in Fig. [4.9.](#page-129-0) In principle, this constant extension voltage is expected to approach

$$
V_{exten} = \frac{2I_R}{C_{SAR}} T_{const} + V_{OS}. \tag{4.11}
$$

However, in practice, *Vexten* does not need to accurately match a certain fixed value, as it corresponds to a constant phase difference, which is regulated by the DPLL operation itself. In other words, a fixed offset in *Vexten* leads to a fixed phase offset, which does not impact the spectral purity. During the initial phase locking, the voltage (time) error will be outside the limited ADC range, resulting in bang-bang behavior until the time error is regulated into the ADC's linear range. Therefore, a 7b ADC is chosen to cover the required peak-to-peak range and speed up linear settling. Besides, speed can be traded against resolution by switching the additional $C_{gain}$, which can be leveraged for gain calibration. The SAR A/D conversion further reduces the coarse quantization noise of the I-DAC, so that the final output jitter is dominated by thermal noise.

#### **Monotonic Switching Scheme**

<span id="page-129-0"></span>

Figure 4.9: Self-timed SAR logic.

The C-DAC array of the SAR ADC is attenuated by an adjustable $C_{attn}$, giving another degree of freedom for gain calibration, as shown in Fig. [4.8.](#page-127-0) Therefore, the unit cell capacitance does not need to be so small that parasitics and mismatch become relatively too large, although the linearity of the ADC is not as decisive as that of the DAC. The switching C-DAC, as mentioned above, adopts the monotonic mechanism proposed in [\[113\]](#page-188-3). Compared to the conventional bottom-plate sampling adopted in [\[28\]](#page-178-1), the MSB can be derived directly by comparing the sampled input without switching any capacitor, in addition to the aforementioned advantage of a simpler reference voltage requirement. In the first phase of the algorithm, the input is sampled on the top plates of the capacitive arrays, while the bottom plates are connected to a well-defined ground. The MSB is directly obtained by the comparator. Depending on whether the MSB is '0' or '1', the MSB capacitor of the bottom array or of the top array is connected to $V_{ref}$, respectively. An identical procedure is carried out for the following bits, reducing the differential voltage towards zero while the output bits are extracted from the comparator output. However, one potential issue is the varying common-mode voltage due to the monotonic switching, as *VCM* changes every cycle. This would modulate the offset of the comparator and thus the final phase relation, leading to additional spurs. Fortunately, this is not serious in our scheme: as the dynamic range of the SAR ADC is only about 10mV, the impact on the phase offset modulation can be neglected.
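The monotonic search can be sketched behaviourally as follows. This is not the implemented logic: the sketch assumes ideal binary-weighted capacitors with one dummy unit per side (so the step after bit `i` is `vref / 2**(i + 1)`), no attenuation capacitor, a noiseless comparator, and that the side with the lower voltage is the one raised towards `vref`. The function name and the numeric values are hypothetical.

```python
def monotonic_sar(vp, vn, vref, nbits=7):
    """Behavioural model of a top-plate-sampled monotonic SAR conversion.

    Returns (code, residual differential voltage, common-mode trace)."""
    bits = []
    vcm_trace = []
    for i in range(nbits):
        vcm_trace.append(0.5 * (vp + vn))
        bit = 1 if vp > vn else 0        # first decision needs no switching
        bits.append(bit)
        if i < nbits - 1:
            step = vref / 2 ** (i + 1)   # ideal binary-weighted step
            if bit:
                vn += step               # raise the lower (negative) side
            else:
                vp += step               # raise the lower (positive) side
    code = int("".join(str(b) for b in bits), 2)
    return code, vp - vn, vcm_trace


if __name__ == "__main__":
    VREF = 5e-3                          # hypothetical: about 10 mV total range
    code, resid, vcm = monotonic_sar(0.6 + 1.0e-3, 0.6 - 1.0e-3, VREF)
    print(code, resid, vcm[0], vcm[-1])
```

Only one side is switched per bit, so the differential voltage converges towards zero while the common-mode level drifts in one direction during the conversion, which is the effect discussed above.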

<span id="page-130-0"></span>

Figure 4.10: Sequential logic leveraged in the SAR logic.

## **4.2.6 Comparator of the SAR ADC**

Compared to the flash ADC used in [\[29\]](#page-178-0) and most flash TDCs, only one comparator is required in a SAR ADC. However, it is still a vital block, as the power-noise trade-off can be further degraded by it, given that the offset (linearity) is almost constant over a narrow input range. Thus, we search for a solution that features a low power-noise product.

#### **A Simple Dynamic Latched Comparator**

To avoid static current consumption, fully-dynamic voltage-mode sense amplifiers have become widely used as comparators in A/D converters. A popular implementation that combines the input pair with the latch stage is the one proposed in [\[114\]](#page-188-4) and depicted in Fig. [4.11.](#page-131-0) This sense-amplifier-based topology is also widely adopted in many flash TDCs [\[7\]](#page-175-0)[\[9\]](#page-176-1). The input signal is transformed into a differential current that is then injected into the latch-type structure composed of transistors M3–M6. The positive feedback amplifies the signal difference, driving the outputs to full swing. Such a topology eliminates any static current, but it suffers from high kickback noise, as the large variations at the output nodes couple through parasitic capacitances to the comparator input [\[115\]](#page-188-5). In addition, it requires the stacking of 4 transistors, which reduces the voltage headroom available for the latch stage, and therefore its speed. Finally, the speed of the dynamic latched comparator is strongly dependent on the input common-mode voltage level [\[116\]](#page-188-6).

#### **Adopted Double-Tail Latched (DTL) Comparator**

The kickback and headroom-restricted speed issues are mitigated by separating the pre-amplifier stage from the latch, as presented in [\[116\]](#page-188-6). This separation allows for a larger input common-mode range, as well as an extra degree of freedom by providing separate tail transistors (double-tail). Therefore, a fully-dynamic DTL has been chosen as the comparator for the implemented SAR ADC [\[117\]](#page-188-7) [\[118\]](#page-188-8)[\[119\]](#page-189-0), as depicted in Fig. [4.12.](#page-132-0)

<span id="page-131-0"></span>

Figure 4.11: Dynamic latched comparator proposed in [\[114\]](#page-188-4).

Combined with the inverter, the dynamic pre-amplifier stage provides a certain gain, so the input-referred offset is reduced. In addition, the kickback noise of the latch is isolated from the capacitor array. The clock generated by the self-timed logic (CK in Fig. [4.11,](#page-131-0) comparator clk in Fig. [4.9\)](#page-129-0) is used to control the reset phase (CK low) and the latch phase (CK high). In each comparison, the input-referred noise of the comparator can thus be derived as

$$
(\sigma_{vn})^2 \approx 2 \cdot 4kT \frac{\gamma}{g_{m,in}} B_n = \frac{2kT\gamma}{C_d} \tag{4.12}
$$

where $g_{m,in}$ is the transconductance of the input transistors at the latch toggling point, and $B_n$ is the noise bandwidth of the input stage. The input pair sizing and $C_d$ are chosen so that the power-noise product is within the target budget, as $C_d$ also determines the energy per comparison. ($C_d$ is the equivalent capacitance seen at the drains of M1 and M2.)
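The second equality in Eq. 4.12 implies $B_n = g_{m,in}/(4C_d)$, so the noise is set by $C_d$ alone. The sketch below evaluates this relation numerically. The excess noise factor, the temperature and the capacitance used in the example are hypothetical values, not design parameters of the implemented comparator.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def comparator_noise_rms(c_d, gamma=1.0, temp=300.0):
    """RMS input-referred noise from Eq. 4.12: sigma^2 = 2*k*T*gamma / C_d."""
    return math.sqrt(2.0 * K_B * temp * gamma / c_d)


def cd_for_noise(sigma, gamma=1.0, temp=300.0):
    """Capacitance needed at the pre-amplifier drains for a target RMS noise."""
    return 2.0 * K_B * temp * gamma / sigma ** 2


if __name__ == "__main__":
    c_d = 200e-15                        # hypothetical value
    sigma = comparator_noise_rms(c_d)
    print(f"sigma = {sigma * 1e6:.1f} uV for C_d = {c_d * 1e15:.0f} fF")
    print(f"C_d for half the noise: {cd_for_noise(sigma / 2) * 1e15:.0f} fF")
```

Halving the noise requires four times the capacitance, and since the text notes that $C_d$ also determines the energy per comparison, the power cost rises accordingly.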

<span id="page-132-0"></span>

Figure 4.12: Implemented DTL comparator.

| Parameter          | Value       | Parameter          | Value       |
|--------------------|-------------|--------------------|-------------|
| Nominal $I_R$      | 240 $\mu$A  | $v_n$              | 220 $\mu$V  |
| Sampling $C_{SAR}$ | 550 fF      | Lowest $f_{out}/2$ | 1.5 GHz     |
| $gm/id$ of CP      |             |                    |             |

<span id="page-133-0"></span>Table 4.1: Relevant design parameters for noise modeling.

## **4.2.7 Brief Summary of the FDVPD**

#### **Noise Performance**

As seen from Eq. [3.62,](#page-93-0) the major noise contributors are the integrated current noise, the thermal voltage noise $v_n$ resulting from the comparator and the DAC, the quantization noise and the $kT/C$ noise. For the implemented design, some notable points are summarized below, with relevant parameters listed in Table [4.1](#page-133-0) based on simulation.

- Thanks to the differential structure, the equivalent dv/dt gain is almost 0.8 GV/s, which sufficiently attenuates the contribution from $v_n$.
- With the equivalent number of bits of resolution adjusted to be sufficiently high, the quantization noise can be almost ignored, as shown in Fig. [3.24](#page-94-0).
- The noise contribution from the CP is now dominant, as predicted by Fig. [3.24,](#page-94-0) yet it is still more power efficient than a conventional CPPLL.
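The attenuation of $v_n$ by the ramp slope can be estimated with a few lines. The sketch assumes an ideal differential ramp with slope $2I_R/C_{SAR}$ (the same factor that appears in Eq. 4.11) and refers the voltage noise to time by dividing it by the slope. The ideal slope computed this way comes out slightly above the quoted figure; the text does not explain the difference, which might, for instance, be due to parasitic capacitance.

```python
def differential_slew(i_r, c_sar):
    """Ideal differential ramp slope in V/s: both sides ramp with I_R / C_SAR."""
    return 2.0 * i_r / c_sar


def voltage_noise_to_time(v_n, slew):
    """Refer an RMS voltage noise to an RMS timing error via the ramp slope."""
    return v_n / slew


if __name__ == "__main__":
    I_R, C_SAR, V_N = 240e-6, 550e-15, 220e-6    # values from Table 4.1
    ideal = differential_slew(I_R, C_SAR)
    print(f"ideal slope: {ideal / 1e9:.2f} GV/s")
    for slew in (ideal, 0.8e9):                  # ideal and quoted slope
        t_n = voltage_noise_to_time(V_N, slew)
        print(f"{slew / 1e9:.2f} GV/s -> {t_n * 1e15:.0f} fs rms")
```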

Therefore, based on the noise analysis in Chapter 3 (Eq. [3.62\)](#page-93-0), the output phase noise can be well modeled and predicted. Regarding the $v_n$ term, we assume that

$$
v_n^2 = v_{n,DAC}^2 + v_{n,comp}^2 \tag{4.13}
$$

As well-studied in [\[120\]](#page-189-1)[\[121\]](#page-189-2), the overall contribution of the comparator noise to the input of a SAR ADC is less than its original value when the comparator noise dominates over the quantization noise. Therefore, as a conservative estimate, we assume that the comparator noise referred to the input of the SAR ADC equals the comparator noise itself. Results predicted by the simple s-domain model are depicted in Fig. [4.13,](#page-134-0) compared with a measured case in an integer-N channel. The further breakdown of the in-band jitter is visualized in Fig. [4.14](#page-135-0), and it is clearly seen that, among the PD noise contributors, the noise from the CP is dominant in this design, followed by the thermal noise contribution, while the quantization noise is almost negligible here. The result predicted by the s-domain model matches the measurement result quite well. The main difference lies in the region between 1kHz and 100kHz, which is clearly caused by the flicker noise contributed by the DAC and pump generators, as well as the input reference buffer<sup>[b](#page-134-1)</sup>.

<span id="page-134-0"></span>

Figure 4.13: Phase noise predicted by the s-domain model, with comparison to measurement result.

## **4.2.8 Nonlinearity of the FDVPD**

Any nonlinear behavior of the PD in a PLL results in spurs (fractional and reference) and degrades the IPN, which can be worsened further by the noise-folding effect. The linearity of the FDVPD here is mainly degraded by:

<span id="page-134-1"></span><sup>b</sup>These flicker noise sources are not characterized in the s-domain model for simplicity.

<span id="page-135-0"></span>

Figure 4.14: Breakdown of in-band jitter contributors.

- 1. The linearity of the I-DAC in step 1. This is quantified by the maximum integral nonlinearity (INL) and differential nonlinearity (DNL). The INL and DNL measure the deviation of the converter's input-output relation from a perfectly straight transfer function. In addition, the passive Pi-R network further degrades the linearity. This is because of the finite output impedance of the unit current sources, which leads to a nonlinear conversion from current to voltage. Besides this static performance, the settling behavior, especially code-dependent settling, would cause fractional spurs as well.
- 2. The nonlinearity of the CP. As mentioned above, the fully differential structure helps to keep the dynamic range per side small while sustaining a slew rate high enough to reduce the noise contribution. Moreover, the mismatch between the two CP current sources does not matter as much as in a CPPLL. This stems from the fact that a sampling PD scheme is adopted in the proposed DPLL structure, which means that only lag information is characterized. Therefore, mismatch between the CP sources only results in a static deviation from the expected slew rate, which can be easily calibrated. This also substantially simplifies the design of the current sources for FDVPDs.
- 3. The switching distortion. Both S1 and S2 would contribute signal/code-dependent charge interference, potentially increasing the spurs. This can be alleviated by the switch designs mentioned above. Besides, any code-independent interference does not cause fractional spurs.
- 4. The C-DAC within the SAR ADC. This is not a dominant source; however, special care is still taken in the layout and wiring to ensure good matching. Again, the settling speed is also crucial for the C-DAC, and it is improved by the chosen monotonic switching scheme, as discussed above.
- 5. The comparator's offset due to input CM voltage variations. Although this source is unavoidable, it is negligible as the SAR ADC works with a narrow dynamic range.
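The INL and DNL figures used to quantify item 1 can be extracted from a simulated or measured transfer curve as sketched below. The endpoint-fit definition is one common convention and is an assumption here, as is the synthetic bowed transfer curve used in the example; neither is taken from the post-layout data.

```python
import math


def dnl_inl(levels):
    """Endpoint-fit DNL and INL, in LSB, from a list of output levels
    (one level per input code, in ascending code order)."""
    n = len(levels)
    if n < 2:
        raise ValueError("need at least two levels")
    lsb = (levels[-1] - levels[0]) / (n - 1)      # average step size
    dnl = [(levels[i + 1] - levels[i]) / lsb - 1.0 for i in range(n - 1)]
    inl = [(levels[i] - levels[0]) / lsb - i for i in range(n)]
    return dnl, inl


if __name__ == "__main__":
    # Synthetic 10b curve with a half-LSB bow (hypothetical data).
    curve = [i + 0.5 * math.sin(math.pi * i / 1023) for i in range(1024)]
    dnl, inl = dnl_inl(curve)
    print(f"max |DNL| = {max(abs(d) for d in dnl):.4f} LSB")
    print(f"max |INL| = {max(abs(v) for v in inl):.4f} LSB")
```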

#### **4.2.9 DCO implementation**

Even with a large loop bandwidth suppressing the in-band portion of the DCO noise, the DCO, if not optimized, may still make an appreciable contribution to the resulting jitter, as already shown in Fig. [3.29.](#page-107-0) As shown in [\[95\]](#page-186-0), a stronger 2nd-order harmonic content of the tank voltage results in a more asymmetrical voltage and thus more 1/f noise up-conversion. This results in more jitter when the 1/f up-conversion corner frequency is not negligible compared to the loop bandwidth. Therefore, a harmonic-shaping LC-tank-based DCO is implemented, as shown in Fig. [4.15.](#page-137-0) In addition to the resonance at $\omega_0$ and a DM short at $2\omega_0$ formed by the inserted series LC path ($C_{bank}$ and $L_s$), a CM open is formed at $2\omega_0$ by the single-ended caps ($C_c$) and the inductor L, further reducing the 1/f noise up-conversion. This leads to a much lower 1/f up-conversion corner. Besides, the parallel LC shows inductive impedance over $\omega_0$ while the series LC path shows capacitive impedance over $2\omega_0$. This fact is leveraged to form one additional open at $3\omega_0$, which results in a more square-wave-like swing, as depicted in Fig. [4.15.](#page-137-0) The sharpened transition slope helps to reduce noise up-conversion, as zero crossings represent the most vulnerable part for such a mechanism. This also helps with power reduction of the following divider/buffer stage. Comparison of the simulated noise spectra shows substantial improvement for our chosen version with the series resonance. Overlaid with the simulations is the measured phase noise of the free-running DCO, as depicted in Fig. [4.16](#page-138-0).

<span id="page-137-0"></span>

Figure 4.15: (a) Implemented DCO with harmonics shaping; (b) simulated spectral contents; (c) simulated oscillation waveform

## **4.2.10 Power-Efficient High-Speed Counter**

<span id="page-138-0"></span>

Figure 4.16: Simulated phase noise with comparison to measured result.

There are two types of counter in terms of the toggling clock, namely the asynchronous counter and the synchronous one. Even though the former is known for much lower power, it brings a potential robustness issue in terms of output update and thus increases the risk of losing lock in a counter-based PLL. Even though this issue can be relaxed if the asynchronous counter is leveraged in an FLL [\[122\]](#page-189-3) for frequency lock, a mixed-mode counter is adopted here for a better balance between power and robustness. As shown in Figure [4.17,](#page-139-0) the LSB facing the high-frequency CKV is handled by a DFF-based divide-by-2 counter to save power, whose output $\text{obj}[0]$ is used to clock the remaining 7 most significant bits (MSBs) synchronously at half the rate of CKV. The synchronous counter for the 7 MSBs is essentially composed of cascaded toggle logic and registers, which scale down the input toggling rate by 2. Grayed in Figure [4.17,](#page-139-0) the unit toggle register is made of a DFF whose Q output is fed back to an XOR gate, which is triggered by the level change of the input signal. Each of the DFFs is realized by the same customized dynamic true single-phase clock (TSPC) logic, and thus the power is greatly reduced.
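The counting behaviour of this mixed-mode arrangement can be modelled in a few lines. The sketch below is purely behavioural: the class and method names are hypothetical, the active edge that clocks the MSB stage is an assumption, and propagation delays, metastability and the TSPC implementation are not modelled.

```python
class MixedModeCounter:
    """Divide-by-2 LSB stage clocking a 7b synchronous toggle-register stage."""

    def __init__(self, msb_bits=7):
        self.lsb = 0
        self.msb = [0] * msb_bits        # least significant MSB bit first

    def _sync_increment(self):
        # Each register toggles (Q xor T) when all lower bits are 1.
        toggle = 1
        nxt = []
        for q in self.msb:
            nxt.append(q ^ toggle)
            toggle &= q
        self.msb = nxt

    def ckv_rising_edge(self):
        prev = self.lsb
        self.lsb ^= 1                    # divide-by-2 DFF toggles every CKV edge
        if prev == 1 and self.lsb == 0:  # assumed active edge for the MSB stage
            self._sync_increment()

    @property
    def value(self):
        msb_val = sum(bit << i for i, bit in enumerate(self.msb))
        return (msb_val << 1) | self.lsb


if __name__ == "__main__":
    counter = MixedModeCounter()
    for _ in range(10):
        counter.ckv_rising_edge()
    print(counter.value)
```

In this model the 7 MSB registers are updated only on every second CKV edge, which is where the power saving relative to a fully synchronous 8b counter comes from.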

Within the implemented DPLL, a high-speed counter is deployed to sample the integer cycles of the high-speed clock CKV using the retimed reference clock (CKR). However, considering that this counter contains asynchronously toggled DFFs, great care must be taken to guarantee proper sampling. Due to metastability, PVT variations and misaligned settling times between the MSBs and the LSB, the true phase of CKV could be sampled with abrupt errors. This results in a catastrophic failure of locking. Therefore, an asynchronous sampling scheme is adopted here to tackle this asynchronous issue, as done in [\[123\]](#page-189-4). The readout circuit and timing scheme are shown in Figure [4.18](#page-140-0) and Figure [4.17.](#page-139-0) The key here is to sample the Q of the DFF with enough margin after it toggles. The DFF counting the LSB is triggered by CKV, while its complement CKVb is used to re-sample the phase of CKR, generating the sampling clock cks0 for the LSB. In this way, there is almost half a CKV period of margin left for the readout, assuming all the DFFs have the same propagation delay. The rising edge of cks0 samples the supply VDD, generating the sampling clock cks1 for the synchronous counter. Overall, this scheme ensures that the triggering edge of the sampling clock always arrives about half a CKV period after the corresponding counter bit settles, enhancing the counter's robustness.

<span id="page-139-0"></span>

Figure 4.17: Block diagram of the implemented high-speed CKV phase counter.

## **4.2.11 Digital Loop Filter**

<span id="page-140-0"></span>

Figure 4.18: Timing sequence of the implemented high-speed counter.

The DLF determines the dynamics and frequency response of the DPLL. To cope with the trade-off between tuning range and resolution in an area-efficient way, the frequency tuning of the DCO is realized by three capacitance banks in a coarse-fine arrangement, as discussed in Section [3.7](#page-95-0) and [\[7\]](#page-175-0) [\[124\]](#page-189-5) [\[67\]](#page-183-5). For the sake of simplicity, the implemented DLF is composed of three separate paths that process different sections of the input phase error (PE), generating the control words for the corresponding DCO tuning bank (Fig. [4.19\)](#page-141-0). For instance, the PVT bank control words are derived from the MSBs of the PE, while the tracking bank words respond to the LSBs. Additionally, considering that the two coarse banks are mainly adopted to reach the target frequency as soon as possible, rather than to filter more phase noise and spurs, only proportional processing is deployed for them, realizing a type-I DLF for wide loop bandwidth and fast locking. In contrast to the type-I loop filter, a type-II loop filters noise sources more strongly (Chapter 3) and is therefore leveraged in the DLF for the tracking bank control words. An additional switchable IIR filter is utilized as well for extra programmability of the noise attenuation. Besides, gain normalization is applied to avoid the impact of the DCO gain on the loop transfer function. The functions of zero-phase restart and control-word freezing during bank switching are implemented in the illustrated CTRL block.
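The difference between the proportional (type-I) paths and the proportional-integral (type-II) tracking path can be sketched as follows. This is a generic textbook structure under stated assumptions, not the implemented RTL: the class names, the coefficient names `kp`, `alpha`, `rho` and `iir_lambda`, the single-pole form of the IIR stage and its position ahead of the PI section are all hypothetical, and gain normalization is omitted.

```python
class TypeOnePath:
    """Proportional-only path, as used for the coarse banks."""

    def __init__(self, kp):
        self.kp = kp

    def step(self, phase_error):
        return self.kp * phase_error


class TypeTwoPath:
    """Proportional-integral path with an optional single-pole IIR stage."""

    def __init__(self, alpha, rho, iir_lambda=None):
        self.alpha = alpha               # proportional gain
        self.rho = rho                   # integral gain
        self.iir_lambda = iir_lambda     # None disables the IIR stage
        self.integrator = 0.0
        self.iir_state = 0.0

    def step(self, phase_error):
        x = phase_error
        if self.iir_lambda is not None:
            # y[n] = y[n-1] + lambda * (x[n] - y[n-1])
            self.iir_state += self.iir_lambda * (x - self.iir_state)
            x = self.iir_state
        self.integrator += self.rho * x
        return self.alpha * x + self.integrator


if __name__ == "__main__":
    coarse = TypeOnePath(kp=0.5)
    fine = TypeTwoPath(alpha=0.5, rho=0.25)
    print([coarse.step(1.0) for _ in range(4)])
    print([fine.step(1.0) for _ in range(4)])
```

For a constant phase error the type-I output stays constant, whereas the type-II output keeps ramping until the error is driven to zero, which is why the integral path is reserved for the tracking bank.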

## **4.3 Measurement of Implementation 1**

To explore if the architectural innovations will pave the way for substantial jitter improvement towards breaking higher spectral-purity

<span id="page-141-0"></span>

Figure 4.19: Simplified Diagram of the implemented DLF.

with optimized power-efficiency, an experimental fully differential DPLL as outlined above has been implemented in 130 nm CMOS, occupying an active area of 0.27 mm<sup>2</sup>, as depicted in Fig. [4.20.](#page-141-1)

<span id="page-141-1"></span>

Figure 4.20: Chip micrograph of the FDVPD DPLL implementation.

Under a reference rate of 80MHz, the whole DPLL consumes a power of 9.2 mW with its detailed breakdown shown in Fig. [4.21.](#page-142-0) The DCO

is running at a supply of 1.5 V while the remaining blocks of the chip are all supplied by 1.2 V.

<span id="page-142-0"></span>

Figure 4.21: Estimated power breakdown at 80 MHz reference.

Fig. [4.22](#page-143-0) shows the measured phase noise spectrum for an integer-N channel with a reference input of 80MHz. A sub-90fs integrated jitter is achieved, which corresponds to an IPN of -58dBc, integrated from 10kHz to 40MHz.

Fig. [4.23](#page-143-1) shows the measured phase noise spectrum for a deep fractional-N channel with a reference input of 80MHz. A 101fs RMS phase jitter is achieved, which corresponds to an IPN of -56dBc, integrated from 10kHz to 40MHz.

With the measured power and jitter, the design is benchmarked in Fig. [4.24,](#page-144-0) showing that the proposed solution has broken the -250dB FOM barrier. The benchmark also shows a clear trend: the most power-efficient (high FOM), high-purity PLLs are either analog (voltage domain) or leverage the assistance of an ADC, as shown in Fig. **??**.
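The FOM figure can be cross-checked from the numbers quoted above. The sketch below assumes the commonly used jitter-power definition, FOM = 10 log10[(σ / 1 s)² · (P / 1 mW)]; the function name is hypothetical, and the sub-90 fs integer-N jitter is entered as its 90 fs bound.

```python
import math


def jitter_fom_db(rms_jitter_s: float, power_w: float) -> float:
    """Jitter-power FOM in dB: 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10((rms_jitter_s ** 2) * (power_w / 1e-3))


frac_n = jitter_fom_db(101e-15, 9.2e-3)  # fractional-N channel, 101 fs at 9.2 mW
int_n = jitter_fom_db(90e-15, 9.2e-3)    # integer-N channel, using the 90 fs bound
print(f"fractional-N FOM: {frac_n:.1f} dB")
print(f"integer-N FOM: better than {int_n:.1f} dB")
```

Under this definition the fractional-N result evaluates to roughly -250.3 dB, consistent with the statement that the -250dB line is crossed.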

The reference spur is measured to be -78.8dBc, which compares well with conventional CPPLLs. The residual reference spur in the implemented chip is mostly due to insufficient isolation between the input and output paths, as well as the switching activity of the DCO at the reference rate.

<span id="page-143-0"></span>

Figure 4.22: Measured phase noise of an integer-N channel, with input of 80MHz.

<span id="page-143-1"></span>

Figure 4.23: Measured phase noise of a fractional-N channel, with input of 80MHz.


Figure 4.24: Benchmarking state-of-the-art fractional-N PLLs in terms of noise-power FOM.



Figure 4.25: Measured reference spur.

The fractional spur is measured to be -56.4dBc, which is enough for most challenging protocols, according to Table [2.1.](#page-34-0) The result is shown in Fig. [4.26.](#page-145-0)

<span id="page-145-0"></span>

Figure 4.26: Measured worst in-band fractional spur.

To further show the advantage of the fully differential structure in terms of PSRR, an external sinusoidal noise is added to the supply of the PD at 80kHz (thus, within the PLL bandwidth after upconversion) while the correspondingly generated spurs at 80kHz offset from the carrier are recorded as shown in Fig. [4.27](#page-146-0) over different peak-peak modulation amplitude levels.

The overall performance has been summarized in Table [4.2.](#page-153-0)

## **4.4 Implementation 2: Fully Differential Charge-Domain PD DPLL**

Instead of covering the large fractional-N operation in a fully differential voltage domain, charge-domain D/A conversion is attractive as a more power-efficient solution can be made with less jitter, naturally leading to a better implementation. Thus, one additional

<span id="page-146-0"></span>

Figure 4.27: Measured spur level at 80kHz offset vs. different noise levels (peak-peak).

experimental design is implemented in the same technology, together with an oscillator with a larger tuning range.

#### **4.4.1 PD in Charge-Domain**

Although the design has achieved state-of-the-art performance, implementation 1 still has several unsatisfactory characteristics:

- The I-DAC adopted for the coarse conversion has different PVT variation characteristics from the C-DAC used in the SAR ADC, which is adopted for the fine conversion. This makes calibration more difficult.
- Due to the current array in the I-DAC, the ramp has to work within a limited range to keep all current sources in deep saturation, so that the overall linearity is not heavily degraded. This limits the maximum achievable $dv/dt$ gain and thus hinders further phase noise reduction.

Evidently, these are all related to the I-DAC, and these issues can be alleviated by shifting the D/A conversion step into the charge domain. One possible implementation is conceptually illustrated in Fig. [4.28.](#page-147-0) Different from the FDVPD design, there is no pre-setting I-DAC, and hence the consumed power is primarily duty-cycled (switching) with no static part. As shown in the bottom part of Fig. [4.28,](#page-147-0) the differential ramp is turned on and pre-charged to the supply and ground separately, eliminating any settling-introduced non-linearity

<span id="page-147-0"></span>

Figure 4.28: Conceptual diagram of DPLL with differential charge domain PD.

shortly before the rising edge of REF, as step 1. Triggered by the rising edge of REF, a differential ramp then starts from rail to rail. This gives the proposed method a large dv range (2VDD), which is valuable considering technology scaling. In addition, the pump generators can work in their deep saturation regions, resulting in not only less phase noise (large dv/dt gain) but also better linearity. After the CKVd2/CKVd sampling, the time-domain difference has already been converted into the charge domain ($I \Delta t = \Delta Q$). Step 4, which is the same as the encoding phase in implementation 1, encodes the pre-known pattern into a $\Delta Q$, so that only a small residue error is left afterwards. Step 5 is the conversion phase, where the phase error finally gets digitized, which is the same as the final step in implementation 1. Even though it is essentially the same as implementation 1, the charge-domain operation can potentially result in less power (about 15%) and even less phase noise.
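The time-to-charge conversion and the benefit of a steeper ramp can be put into numbers with a small sketch. The relations used are $\Delta Q = I \Delta t$, a ramp slope of $dv/dt = I/C$, and the first-order translation of voltage noise into timing error through the slope. All component values below are hypothetical and are not taken from the implemented design.

```python
def sampled_charge(i_amp: float, dt_s: float) -> float:
    """Charge collected while the ramp current flows for dt: dQ = I * dt."""
    return i_amp * dt_s


def ramp_slope(i_amp: float, c_farad: float) -> float:
    """Ramp slope on a capacitor charged by a constant current: dv/dt = I / C."""
    return i_amp / c_farad


def noise_to_jitter(v_noise_rms: float, slope_v_per_s: float) -> float:
    """First-order translation of voltage noise into timing error: t = v / (dv/dt)."""
    return v_noise_rms / slope_v_per_s


# Hypothetical operating point, for illustration only
I_RAMP = 1e-3    # 1 mA ramp current
C_RAMP = 1e-12   # 1 pF ramp capacitor
V_NOISE = 100e-6 # 100 uV rms voltage noise at the sampling instant

slope = ramp_slope(I_RAMP, C_RAMP)        # 1 V/ns
dq = sampled_charge(I_RAMP, 10e-12)       # a 10 ps time error becomes 10 fC
jitter = noise_to_jitter(V_NOISE, slope)  # 100 fs
jitter_steeper = noise_to_jitter(V_NOISE, 2 * slope)
print(slope, dq, jitter, jitter_steeper)
```

Doubling the slope halves the timing error caused by the same voltage noise, which is why the larger rail-to-rail swing is attractive.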

#### **4.4.2 The Coarse-fine C-DAC**

The charge-domain operation of implementation 2 is depicted in Fig. [4.29.](#page-149-0) Compared to Fig. [4.8,](#page-127-0) the right part of Fig. [4.29](#page-149-0) (fine conversion step) uses the same monotonic switching scheme as the SAR ADC realized in implementation 1. The difference lies in the left part, which replaces the previous I-DAC. For each of the two 10b CDACs, half of the capacitor array is charged against Vref, while the other half is charged against VSS during the ramp generation.<sup>c</sup> After sampling (ckvdg rising edge), the 10b CDAC switches first to cover the large fractional-N operation, bringing the differential voltage to the locking point as in the previous scheme.

#### **4.4.3 The PSRR of PD and its impact on spurs**

Supply noise can easily find its way via the PD to the output of the PLL, leading to phase noise degradation and spurs. This can be roughly discussed based on the PSRR of the PD block as follows. First, we assume that a supply modulation $v_{in,n}$ at a certain

<span id="page-148-0"></span><sup>c</sup>They are pre-set to VDD and VSS before the REF rising edge (ramp) arrives.

<span id="page-149-0"></span>

Figure 4.29: Dynamic range compensated in charge domain.

frequency $f_m$ leads to a signal at the output of the PD with an amplitude of $v_{o,n}$, with the relation

$$
PSRR = 20 \log_{10}\left(\frac{v_{in,n}}{v_{o,n}}\right) \tag{4.14}
$$

Then we assume, for simplicity, that the translation from output voltage noise ($v_{o,n}$) to timing jitter ($t_{o,n}$) follows a simple dv/dt relation. According to Eq. [3.52,](#page-81-0) a fundamental tone will appear at $f_m$ offset from the carrier at the output.

$$
\mathcal{L}(f_m) = 20 \log_{10}\left(\frac{t_{o,n}}{T_{out}}\right) \tag{4.15}
$$

$$
= 20 \log_{10}\left(\frac{v_{o,n}}{dv/dt \cdot T_{out}}\right) \tag{4.16}
$$

$$
= 20 \log_{10}\left(\frac{v_{in,n}}{dv/dt \cdot T_{out}}\right) - PSRR \tag{4.17}
$$

This rough analysis simply shows that, for a given input modulation and output frequency, the output spurs can be reduced by a sharp transition between the voltage and time domains, as well as by a large PSRR. The PSRR of the proposed PD is more than 50dB better than that of a conventional inverter, as shown in Fig. [4.30,](#page-151-0) which gives a significant advantage in attenuating the impact of supply noise. This saves additional supply headroom and power/area cost, as an inverter-based PD relies on additional local LDOs for a better effective PSRR.
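Eq. (4.17) can be evaluated directly to see how a PSRR difference maps onto the spur level. The sketch below is only a numerical restatement of that equation; the function name and all operating-point values (supply tone amplitude, ramp slope, output period, and the two PSRR figures) are hypothetical.

```python
import math


def supply_spur_dbc(v_in: float, slope_v_per_s: float, t_out_s: float, psrr_db: float) -> float:
    """Eq. (4.17): spur at f_m offset caused by a supply tone of amplitude v_in."""
    return 20.0 * math.log10(v_in / (slope_v_per_s * t_out_s)) - psrr_db


# Hypothetical operating point, for illustration only
V_IN = 10e-3    # 10 mV supply tone
SLOPE = 1e9     # 1 V/ns voltage-to-time slope
T_OUT = 0.25e-9 # output period of a 4 GHz carrier

spur_low_psrr = supply_spur_dbc(V_IN, SLOPE, T_OUT, psrr_db=20.0)
spur_high_psrr = supply_spur_dbc(V_IN, SLOPE, T_OUT, psrr_db=70.0)
print(f"{spur_low_psrr:.1f} dBc vs {spur_high_psrr:.1f} dBc")
```

With everything else fixed, a 50 dB higher PSRR lowers the predicted spur by the same 50 dB, and doubling the slope lowers it by a further 6 dB.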

#### **4.4.4 Hybrid DCO for a Larger Tuning Range**

As an experimental implementation, the FDVPD-based DPLL uses a switched-capacitor-based DCO. Albeit popular and benefiting from technology scaling, this method still has several drawbacks. Besides requiring fine capacitor values and dynamic-element-matching algorithms, the poor isolation between such a complex switched-capacitor network and the oscillator core might easily bring additional noise and spurs due to the digital

<span id="page-151-0"></span>

Figure 4.30: Simulated PSRR of the proposed PD and an inverter stage.

circuitry. Therefore, another chip is fabricated with a DCO realized in a hybrid method. It leverages a switched-capacitor array for frequency band selection and an analog varactor for fine tuning, leading to a wide tuning range with excellent phase noise. The implementation is conceptually shown in Fig. [4.31.](#page-152-0) Cbank stands for the switched-capacitor bank, which is controlled by the coarse paths of the DLF (PVT and AB, shown in Fig. [4.19,](#page-141-0) fixed after locking), while the varactor is controlled by the fine bank path. This method lets the DCO achieve both fine resolution and a large tuning range more easily, however, at the cost of additional DAC noise and area.

<span id="page-152-0"></span>

Figure 4.31: Implemented DCO controlled by DAC+varactor.



<span id="page-153-0"></span>

## <span id="page-154-0"></span>**Chapter 5**

# **Frequency Synthesis Solution for MRI Application**

In this chapter, a robust 2.9 GHz to 3.8 GHz two-stage cascaded phase-locked loop (PLL)-based clocking system with minimized integrated jitter for an MRI on-coil receiver is presented. The clocking system consists of a first-stage jitter-cleaning DPLL and a second-stage frequency-multiplying CPPLL. A harmonic reshaping technique (HRT) is applied to the CPPLL LC VCO for $1/f^3$ phase noise (PN) suppression over a 27% tuning range, reducing the $1/f^3$ corner to less than 60 kHz. This helps the out-of-bore CPPLL RMS jitter to be optimized to 480 fs, integrated from 1 kHz to 500 kHz. The proposed system generates the local-oscillator (LO) signal for the integrated RX, covering the required Larmor frequency range for 1.5 to 10.5 T MRI field strength. The clocking system has been fabricated in a standard 130 nm CMOS process. Measured inside a commercial 3 T MRI scanner in the presence of strong magnetic gradient (200 T/m/s) modulation, the in-bore jitter is 3 ps integrated from 1 Hz to 500 kHz, bringing a 100x improvement over a typical single-stage PLL-based clock solution. Therefore, this is a key step in the hardware evolution

towards a multi-channel wearable MRI.

## **5.1 Motivation: A Robust On-coil Clocking System**

MRI has become one of the most important medical imaging techniques nowadays. Ever since its introduction in the 1980s, efforts have been made on both circuit and system levels to provide better image quality while reducing scanning time. Using multi-channel coil-arrays (MCA) placed near the tissue increases the received signal strength [\[125\]](#page-189-0), [\[18\]](#page-177-0), but introduces bulky RF cables to carry the analog signal to the out-of-field receiver array. To make the MCA MRI set-up low-cost and wearable, a fully integrated on-coil MRI RX has been proposed in [\[20\]](#page-177-1), which places the RX IC directly on the coil to allow the digitized coil signals to be transmitted via thin and flexible optical cables. Unlike palm-held nuclear magnetic resonance (NMR) devices [\[126\]](#page-189-1) and conventional MRI scenarios [\[18\]](#page-177-0), the on-coil MRI RX is exposed to the high magnetic field (1.5-to-10.5 T static field and strong modulating gradient fields) of the scanner (Fig. [5.1\)](#page-156-0). Therefore the requirements for the corresponding on-coil clocking system become considerably more demanding. To guarantee coherent acquisition during the long scans, the PN has to be minimized to preserve the high SNR of the coil signals, and therefore the image quality. A sufficiently clean clock for the on-coil RX is thus an essential requirement. To make the on-coil PCB compact for wearable use, the clocking system ideally should be integrated within the RX itself.

In this chapter, which is a further expansion based on [\[20\]](#page-177-1), [\[17\]](#page-177-2), we present the first highly integrated clocking system for an on-coil receiver (Fig. [5.1](#page-156-0)).

## **5.2 Overview of the Clocking System**

Based on the mechanism of NMR, MRI splits the spin states of the hydrogen nuclei in the human body with a strong static magnetic

<span id="page-156-0"></span>

Figure 5.1: Illustration of the on-coil RX set-up with zoom-in view of the proposed integrated clocking system.

field. After being excited with an additional RF-field at the resonance frequency of the nuclei, the relaxation of this resonance is picked up by coils around the tissue. The spatial information of the nuclei is then encoded in phase and frequency into a signal with gradient fields applied. The image is reconstructed offline from the received signal. Based on the phase and frequency modulated signal, the seconds-long scan procedure sets demanding requirements for the long-term stability, and especially the low offset PN of the on-coil RX LO signal [\[18\]](#page-177-0). Furthermore, any component with a typical package containing magnetic metals (e.g., iron and nickel) such as the XO, can pick up modulation from the magnetic field easily. This makes the clocking system design more challenging as compared to LO in wireless communication systems.

<span id="page-157-0"></span>



The proposed solution is shown in the system level diagram of Fig. [5.2.](#page-157-0) Like in most conventional MRI clocking systems, the stability of the clocking systems is derived from a central reference source that is placed outside the MRI field to avoid undesired distortions. The rigid RF cables, which are commonly used to convey the external reference into the bore, are not an option for the on-coil RX set up due to their cost, volume and susceptibility to the magnetic field. Instead much thinner, cheaper and more flexible fiber links are adopted at the cost of large added noise [\[18\]](#page-177-0). According to measurements, the fiber link dominates the PN at frequencies above 500 Hz carrier offset. Therefore, a DPLL with an ultra-narrow bandwidth is employed to sufficiently remove the added noise from the fiber link, considering this filtering would be too costly to be realized by a passive analog loop filter. An XO has to be used to satisfy the stringent PN requirement set by the sub-kHz bandwidth of such a low-noise DPLL, as the quality factor (Q) of the XO is generally 1000x better than the Q of an on-die LC tank. In order to keep the field modulation pick-up effect to a minimum, a small package size voltage controlled XO (VCXO) is adopted at the cost of PN performance. A CPPLL is chosen as the successive stage of the DPLL to realize the final frequency multiplication as well as a large tuning range. A 3 GHz LC-VCO is adopted as the output stage of the CPPLL, sitting far away from the Larmor frequency to get rid of potential interferences. At the output of the LC-VCO, a programmable low PN divider chain is adopted to offer the desired LO (from 64 MHz to 450 MHz for 1.5 to 10.5 T MRI).
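The quoted LO range follows from the proton Larmor relation $f = \frac{\gamma}{2\pi} B_0$ with $\gamma/2\pi \approx 42.577$ MHz/T. The sketch below reproduces the 64 MHz to 450 MHz span and lists divide ratios that would place the VCO inside the 2.9 GHz to 3.8 GHz range. Restricting the divider chain to integer ratios is an assumption made for illustration, and the function names are hypothetical.

```python
import math

GAMMA_H_MHZ_PER_T = 42.577  # proton gyromagnetic ratio divided by 2*pi, in MHz/T


def larmor_mhz(b0_tesla: float) -> float:
    """Proton Larmor frequency in MHz for a static field B0 in tesla."""
    return GAMMA_H_MHZ_PER_T * b0_tesla


def divider_options(f_lo_mhz: float, vco_min_mhz: float = 2900.0, vco_max_mhz: float = 3800.0):
    """Integer divide ratios N with vco_min <= N * f_lo <= vco_max (integer-only is an assumption)."""
    n_low = math.ceil(vco_min_mhz / f_lo_mhz)
    n_high = math.floor(vco_max_mhz / f_lo_mhz)
    return list(range(n_low, n_high + 1))


for b0 in (1.5, 3.0, 7.0, 10.5):
    f_lo = larmor_mhz(b0)
    print(f"B0 = {b0:4.1f} T -> LO = {f_lo:6.1f} MHz, N in {divider_options(f_lo)}")
```

The number of usable ratios shrinks as the field strength rises, which illustrates why a wide VCO tuning range is needed to cover all field strengths.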

## **5.3 Circuit Implementation**

#### **5.3.1 Cascaded PLLs**

As shown in Fig. [5.2](#page-157-0) (left), the first stage fully synthesized counterbased DPLL is designed with integer-N mode due to its simplicity and robustness. The input clock first goes through a programmable divider to save power and to provide reconfigurability for the DPLL. The frequency error is derived by a counter-based feedback loop, while the phase error is detected by a binary bang-bang phase detector (BBPD) instead of using a multi-bit TDC. A BBPD is sufficient for integer-N

operation and brings lower power consumption and fewer jitter and spur issues than a TDC. The phase and frequency errors are then fed into a reconfigurable type-II 4th-order IIR loop filter, making the bandwidth easily configurable from 100 Hz to 10 kHz.
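The behaviour of a binary phase detector can be shown with a deliberately simplified loop. The sketch below replaces the type-II 4th-order IIR loop filter with a single proportional correction, so it only demonstrates the one-bit early/late decision and the resulting bounded limit cycle. Function names, time units (picoseconds) and step size are assumptions.

```python
def bbpd(ref_edge_ps: float, fb_edge_ps: float) -> int:
    """Binary phase detector: +1 if the feedback edge is late, otherwise -1."""
    return 1 if fb_edge_ps > ref_edge_ps else -1


def run_loop(initial_offset_ps: float, step_ps: float, n_cycles: int):
    """First-order bang-bang loop: each decision moves the feedback edge by one step."""
    offset = initial_offset_ps
    history = []
    for _ in range(n_cycles):
        decision = bbpd(0.0, offset)   # the reference edge defines time zero
        offset -= decision * step_ps   # proportional-only correction (simplification)
        history.append(offset)
    return history


trace = run_loop(initial_offset_ps=10.5, step_ps=1.0, n_cycles=20)
print(trace[-4:])  # the loop ends up toggling around zero
```

Because the detector only reports the sign of the error, the locked state is a small limit cycle whose amplitude is set by the correction step rather than a static zero.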

A fractional-N mode is adopted for the second stage $\Delta\Sigma$ modulator-based CPPLL to achieve the required frequency tuning (Fig. [5.2,](#page-157-0) right). In addition to the common PLL noise contributors (charge pump, reference, and $\Delta\Sigma$ modulator), spurs caused by gradient field modulation, located between 10 kHz and 400 kHz, also have to be filtered sufficiently. Therefore a narrow bandwidth of 100 kHz is chosen according to system simulations.

<span id="page-159-0"></span>

Figure 5.3: Simplified schematic and tank impedance of (a) a current biased VCO with a typical LC tank, (b) a current biased VCO with a classic tail filtering technique applied.

#### **5.3.2 $1/f^3$ Phase Noise Improved VCO for 2nd Stage CPPLL**

The 100 kHz bandwidth is well below the typical $1/f^3$ PN corner of a modern CMOS LC VCO, and therefore the CPPLL cannot sufficiently filter the VCO jitter contribution. This results in a strong motivation for the suppression of the upconverted flicker noise in the VCO design to minimize the VCO PN contribution from 1 kHz to 100 kHz. Various methods and techniques for $1/f^3$ PN suppression have been developed over the past years [\[127\]](#page-189-2), [\[96\]](#page-186-0), [\[95\]](#page-186-1). However, these resulted either in

<span id="page-160-0"></span>

Figure 5.4: Proposed HRT-VCO and equivalent tank models at different harmonic frequencies.

degradation of $1/f^2$ PN [\[96\]](#page-186-0) or the additional effort of manual fine-tuning across the tuning range [\[95\]](#page-186-1). None of these approaches is favourable for the scenario of the proposed CPPLL.

There are two main 1/f PN upconversion mechanisms [\[95\]](#page-186-1), [\[96\]](#page-186-0). The first is faced by all current-biased VCO topologies, because the oscillation amplitude gets modulated by flicker noise from the tail current ($i_{\text{fn,tail}}$ in Fig. [5.3\(](#page-159-0)a)) via AM-FM conversion. Although this can be avoided by choosing a voltage-biased VCO with no tail current source, as done in [\[90\]](#page-185-0) at the cost of "Q-degradation", a widely used equivalent solution is to form another resonance at the tail node of the VCO at $2\omega_0$, offering high impedance to reject the tail current modulation (Fig. [5.3\(](#page-159-0)b)). The other mechanism stems from the Groszkowski effect [\[95\]](#page-186-1): the resonance frequency gets shifted in the presence of harmonic components from the active devices ($i_{\text{fn},d1}$ and $i_{\text{fn},d2}$ in Fig. [5.3\(](#page-159-0)a)). The capacitive path of the LC tank shows much lower impedance at multiples of $\omega_0$. Therefore the harmonic

components will be injected into the tank, breaking the LC fundamental resonance equilibrium via the low-impedance capacitive path. A new equilibrium state will be reached with a frequency shift $\Delta\omega$ from $\omega_0$ to compensate for the disturbance. As the $1/f$ noise content can modulate $\Delta\omega$ via the Groszkowski effect, a frequency modulation (FM) occurs and the upconversion of the $1/f$ noise results in $1/f^3$ PN degradation. According to the analysis recently reported [\[95\]](#page-186-1), the dominating FM sources are the lower harmonics, especially the 2nd and 3rd order. As even harmonics are known to excite the common mode (CM) path of the tank and odd harmonics the differential mode (DM) path, additional resonances can be created to suppress the corresponding harmonics. Here the desired additional resonances are a DM resonance at $3\omega_0$ and a CM resonance at $2\omega_0$. Considering that the tail filtering technique is equivalent to creating an additional 2nd order harmonic CM resonance, these additional resonances can suppress not only the Groszkowski effect but also the tail current modulation mechanism in principle.

The proposed HRT-VCO solution of Fig. [5.4](#page-160-0) is composed of a parallel LC tank (inductor $L$ and varactor $C_v$), an additional series LC tank (ST) ($L_s$ and the switched capacitor-array), together with the additional adjustable differential capacitor $C_{\text{diff}}$ and the single-ended $C_{\text{SE}}$. The ST has its resonance at $2\omega_0$. At frequencies lower than $2\omega_0$, it is equivalent to a capacitor, while it shows inductive behavior at frequencies above $2\omega_0$. Therefore, as shown in Fig. [5.4,](#page-160-0) a $2\omega_0$ DM short is realized by resonating the ST, and the additional resonance at $3\omega_0$ is created based on the inductive ST behavior above $2\omega_0$. Different from tail filtering and [\[95\]](#page-186-1), no separate tuning of the additional tank is required, as the switched capacitor-array is essentially shared by both $L$ and $L_s$ and will shift the ST resonance together with the fundamental tank resonance (formed by $L$, $C_v$, the capacitance presented by the ST, $C_{\text{SE}}$ and $C_{\text{diff}}$). Furthermore, as the CM impedance at $2\omega_0$ depends on all single-ended capacitors (normally from transistor parasitics) and the CM inductance of $L$, the additional CM resonance at $2\omega_0$ is created by fine-tuning the ratio between $C_{\text{SE}}$ and $C_{\text{diff}}$.
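The frequency-dependent behaviour of the series tank can be checked with an ideal, lossless series LC branch whose reactance is $X(\omega) = \omega L_s - 1/(\omega C_s)$. The sketch below picks a hypothetical $L_s$ and derives $C_s$ so that the branch resonates at $2\omega_0$; the component values and the 3 GHz fundamental are illustrative assumptions, and losses as well as the rest of the tank are ignored.

```python
import math


def series_lc_reactance(omega: float, l_s: float, c_s: float) -> float:
    """Reactance of an ideal series LC branch: X = w*L - 1/(w*C)."""
    return omega * l_s - 1.0 / (omega * c_s)


F0 = 3.0e9                        # fundamental frequency in Hz (illustrative)
W0 = 2.0 * math.pi * F0
L_S = 0.5e-9                      # hypothetical series inductance
C_S = 1.0 / ((2.0 * W0) ** 2 * L_S)  # chosen so the branch resonates at 2*w0

x_w0 = series_lc_reactance(W0, L_S, C_S)         # negative: capacitive
x_2w0 = series_lc_reactance(2.0 * W0, L_S, C_S)  # about zero: DM short
x_3w0 = series_lc_reactance(3.0 * W0, L_S, C_S)  # positive: inductive

# Equivalent capacitance that the branch presents at the fundamental
c_eq_w0 = -1.0 / (W0 * x_w0)
print(x_w0, x_2w0, x_3w0, c_eq_w0 / C_S)
```

At the fundamental, the branch looks like a capacitor of $\tfrac{4}{3} C_s$ in this idealized model, so switching $C_s$ moves the fundamental and the $2\omega_0$ resonance together.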

<span id="page-162-0"></span>

Figure 5.5: Die photo of the proposed clocking system.

## **5.4 Measurement Results**

The proposed clocking system was implemented in a standard 130 nm CMOS process as shown in Fig. [5.5,](#page-162-0) occupying an area of 0.88 mm<sup>2</sup>. All measurements were conducted in-field as shown in Fig. [5.1.](#page-156-0) The large additive noise contribution from the fiber link can be seen from Fig. [5.6.](#page-163-0) The long-term integrated jitter (1 Hz to 500 kHz) of the clean input reference clock gets degraded from 1.6 ps to 8.97 ps, which is not sufficient for use as the system reference clock. This issue is solved by the implemented system, as the jitter is now 3.3 ps at the LO output in the presence of a strong 200 T/m/s gradient field. This represents a 100x improvement over a conventional one-stage PLL solution.

Another main contribution to the clock jitter is the 2nd stage CPPLL. Fig. [5.7](#page-164-0) shows a comparison of the measured PN between the proposed HRT and the tail-filtering technique, using the same inductors and NMOS pairs. The $1/f^3$ PN corner is reduced from 219 kHz to 59 kHz (with 100 kHz CPPLL bandwidth), which means that the proposed technique is effective in suppressing the $1/f^3$ PN as desired, and the CPPLL has an optimized RMS jitter of 480 fs integrated from 1 kHz to 500 kHz (this is the upper bandwidth of

<span id="page-163-0"></span>

Figure 5.6: Measured PN of the divided RX LO from 1 Hz offset.

interest for the MCA MRI application) which is comparable with recently reported low noise designs [\[128\]](#page-189-3).
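Integrated jitter figures such as the ones above are obtained by integrating the single-sideband phase noise over the offset range of interest, $\sigma_t = \sqrt{2\int \mathcal{L}(f)\,df}\,/\,(2\pi f_0)$. The sketch below implements this with a plain trapezoidal rule. The flat -100 dBc/Hz profile and the 3 GHz carrier are made-up inputs for illustration, not the measured data.

```python
import math


def rms_jitter_s(offsets_hz, pn_dbc_hz, f_carrier_hz):
    """RMS jitter from SSB phase noise points via trapezoidal integration."""
    linear = [10.0 ** (p / 10.0) for p in pn_dbc_hz]  # dBc/Hz to 1/Hz
    area = 0.0
    for i in range(1, len(offsets_hz)):
        width = offsets_hz[i] - offsets_hz[i - 1]
        area += 0.5 * (linear[i] + linear[i - 1]) * width
    phase_rms_rad = math.sqrt(2.0 * area)  # factor 2 accounts for both sidebands
    return phase_rms_rad / (2.0 * math.pi * f_carrier_hz)


# Made-up flat profile from 1 kHz to 500 kHz, for illustration only
offsets = [1e3, 1e4, 1e5, 5e5]
profile = [-100.0, -100.0, -100.0, -100.0]
sigma = rms_jitter_s(offsets, profile, f_carrier_hz=3.0e9)
print(f"{sigma * 1e15:.0f} fs")
```

With measured data, a denser set of offset points (ideally log-spaced) would be needed for the trapezoidal rule to track the steep close-in slope accurately.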

As the long-term drift over seconds can directly be linked to the final image quality, the long-term phase coherence is also measured and shown in Fig. [5.8,](#page-165-0) where a sinusoidal input from a signal generator is fed directly into the RX while a gradient field of 200 T/m/s is applied. Phase drift is then extracted from the digitized RX output, showing a 0.03 vs. 40 rad (peak-peak) improvement within a 10 s measurement window. Images of a bottle of water acquired with the proposed system in a one-channel set-up are shown in Fig. [5.8](#page-165-0) as a proof of concept. One image of a human wrist acquired with a two-channel array coil set-up is shown in Fig. [5.9.](#page-165-1) A summary of the measured performance of the entire system is shown in Table [5.1.](#page-166-0) The primary concern for the on-coil RX is the LO phase noise between 1 Hz and 500 kHz, while power is not a primary issue. As the close-in PN is the key motivation of the $1/f^3$ phase noise suppression, relevant benchmarks are listed in the lower part of the table.

<span id="page-164-0"></span>

Figure 5.7: Measured PN of the CPPLL and the VCO from 1 kHz offset.

## **5.5 Conclusion**

In this chapter, a low jitter integrated cascaded PLL-based clocking system is reported, with a wide tuning range and minimized close-in PN VCO. The circuit has been measured in a commercial 3 T MRI scanner and shows an excellent long-term clock stability as well as sufficient robustness in the presence of strong gradient field. This MRI on-coil clocking system therefore represents a key step towards multi-channel wearable MRI systems.

<span id="page-165-0"></span>

Figure 5.8: Measured in-bore long-term phase drift (top) and acquired sectional view of a bottle of water (bottom).

<span id="page-165-1"></span>

Figure 5.9: Two-channel array-coil for verification within a 3 T MRI unit and acquired image.



#### Measured Overall System Performance Discrete MRI Clocking

<span id="page-166-0"></span>Table 5.1: Measured performance and comparison.

\*\* includes 8.97 ps contribution from polluted reference via fiber link

\*1 Hz to 500 kHz is the bandwidth of interest for this MRI application

#### Benchmark Comparison of Relevant VCO Performance



(1) With carrier frequency normalized to 128 MHz

(2) FOM = |PN| + 20 log(ω/Δω) − 10 log($P_{p}$/1 mW)

## **Chapter 6**

# **Conclusions and Outlook**

This final chapter summarizes the main contributions discussed in the present dissertation. This work can be structured into three parts: the identification of the research target problem (Chapter [1](#page-14-0) and Chapter [2\)](#page-20-0), the analysis and investigation of the best solutions to the problem (Chapter [3\)](#page-44-0), and finally the implementations of the proposed solution towards the specified targets (Chapter [4](#page-111-0) and Chapter [5\)](#page-154-0).

## **6.1 The Problem**

Power-efficient generation of high spectral purity RF carriers is highly desirable for RF SoCs that support advanced mobile applications. For such emerging applications, dense constellations (e.g., 256 QAM in LTE, 5G) are developed to meet the unending quest for a higher data rate. As constellations evolve, stringent requirements are imposed on PLL integrated phase noise (IPN) and spur level to fulfill the related requirements on transmitter error vector magnitude (EVM), receiver sensitivity as well as blocker tolerance. Therefore, the goal of this dissertation was to reduce both jitter (the time-domain equivalent of IPN) and power in a frequency synthesizer design. Meanwhile, although DPLLs outperform the analog ones

in terms of configurability and integration level, the conventional T-domain PE digitization is limited by a trade-off between nonlinearity, power and quantization noise, due to its dependence on inverter delay. Worse still, the intrinsic poor CMRR and high PVT sensitivity make T-domain PE detection-based DPLLs vulnerable in the RF SoC environment. This dissertation seeks to explore an alternative path to reduce jitter at low power while keeping the design robust in an RF SoC environment.

#### **6.2 The Analysis**

The success of a power-efficient high-purity frequency synthesizer design comes from a deep understanding of the PLL fundamentals and essential mechanisms. Which type of PLL is more power-efficient, a digital one or an analog one? Which block is more crucial to realize an optimized synthesizer design: the PD, the CO, or the LF? How does the oscillator's phase noise arise, and which topology better fits the thesis goal? How can a PD be designed that generates fewer and smaller spurs while being more power-efficient? To address these questions, both operation mechanisms and noise analyses have been carried out for four representative PLL architectures. It is concluded that, since the integrated inductor's Q value primarily limits the power-noise trade-off of an LC oscillator, the realization of frequency locking and phase error detection is the critical barrier towards a more optimized design. Based on both quantitative and qualitative analysis as well as modeling, it is shown that a counter-based DPLL with its phase error detection realized in a differential analog domain can lead to a more power-efficient high-purity design. The essential reasons are straightforward. A physical frequency multiplication path based on feedback control is always more robust than one without an FLL path. Even an SSPLL has a separate physical FLL path, realized by either counters or a dummy loop with a larger dead-zone. By contrast, structures with frequency multiplication based on an injection-locking mechanism face more robustness issues, such as limited locking range and spurs. On the other hand, compared to an inverter-chain-based method, analog domains offer the PD more design freedom to trade for

lower power and less jitter, while being more robust against aggressors from supplies and substrate.

## **6.3 The Solution**

Two DPLL variations based on the proposed method were implemented, with experimental versions fabricated in 130nm CMOS. These explorations lead to a substantial jitter-power reduction towards breaking the -250dB FOM barrier for fractional-N PLLs. This not only validates the theoretical contribution of this dissertation in silicon, but also paves the way for DPLLs to be applied in high-performance RF SoCs.

## **6.4 The Outlook**

To support the unending evolution of wireless communications, high-purity frequency synthesis remains one of the key challenges, and it can be pushed towards higher power efficiency and lower cost along the directions listed below.

## **6.4.1 Wide Output Range by more Agile Frequency Locking Scheme**

In terms of frequency synthesis, multi-standard operation primarily demands an LO with an ultra-wide tuning range. Although the core of the answer lies in the LO design itself, a better frequency-locking scheme can help, especially considering that a transceiver has to cover frequency ranges from sub-GHz to more than 60 GHz. For instance, with an optimized frequency plan and arrangement, some higher-frequency channels can be covered from a lower-frequency PLL path via direct frequency locking, greatly reducing the hardware cost.

#### **6.4.2 Better A/D and D/A converter Designs for Higher Efficiency and Smaller Spurs**

As analog domains (charge, current, voltage) offer more design headroom for optimization, more ADC and DAC structures can be leveraged to improve the performance of the PD, compared to a conventional time-domain method. Besides, realizations based on these analog domains are usually much easier to calibrate with respect to spurs caused by PVT variations as well as mismatch.

#### **6.4.3 More Efficient CO designs**

Even though the Q factor of an LC-tank primarily limits the noise-power trade-off of an LC oscillator, better topologies have been proven to push the designs to be more efficient. To support a wide-range and efficient frequency synthesizer design, such an optimized oscillator is indispensable.









## **Bibliography**

- [1] S. Kosonocky, "Through the Looking Glass The 2020 Edition," in *2013 IEEE International Electron Devices Meeting*, Dec 2018.
- [2] G. Yeap, "Smart mobile SoCs driving the semiconductor industry: Technology trend, challenges and opportunities," in *2013 IEEE International Electron Devices Meeting*, Dec 2013, pp. 1.3.1–1.3.8.
- [3] M. Ingels *et al.*, "A 5 mm<sup>2</sup> 40 nm LP CMOS Transceiver for a Software-Defined Radio Platform," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 12, pp. 2794–2806, Dec 2010.
- [4] H. Darabi *et al.*, "A Quad-Band GSM/GPRS/EDGE SoC in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 4, pp. 870–882, April 2011.
- [5] J. Borremans, G. Mandal, V. Giannini, T. Sano, M. Ingels, B. Verbruggen, and J. Craninckx, "A 40nm CMOS highly linear 0.4-to-6GHz receiver resilient to 0dBm out-of-band blockers," in *2011 IEEE International Solid-State Circuits Conference*, Feb 2011, pp. 62–64.
- [6] D. Murphy, H. Darabi, and H. Xu, "A Noise-Cancelling Receiver Resilient to Large Harmonic Blockers," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 6, pp. 1336–1350, June 2015.
- [7] R. B. Staszewski, *All-Digital Frequency Synthesizer in Deep-Submicron CMOS*. John Wiley & Sons, Ltd, 2005.
- [8] R. B. Staszewski, C.-M. Hung, K. Maggio, J. Wallberg, D. Leipold, and P. T. Balsara, "All-digital phase-domain TX frequency synthesizer for Bluetooth radios in 0.13 µm CMOS," in *2004 IEEE International Solid-State Circuits Conference (IEEE Cat. No.04CH37519)*, Feb 2004.
- [9] F. Kuo, M. Babaie, H. R. Chen, L. Cho, C. Jou, M. Chen, and R. B. Staszewski, "An All-Digital PLL for Cellular Mobile Phones in 28-nm CMOS with -55 dBc Fractional and -91 dBc Reference Spurs," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 65, no. 11, pp. 3756–3768, Nov 2018.
- [10] J. L. et al., "A Sub-6-GHz 5G New Radio RF Transceiver Supporting EN-DC With 3.15-Gb/s DL and 1.27-Gb/s UL in 14-nm FinFET CMOS," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 12, pp. 3541–3552, Dec 2019.
- [11] A. A. Abidi, "Phase Noise and Jitter in CMOS Ring Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 8, pp. 1803–1816, Aug 2006.
- [12] Y. Choi, Y. Seong, Y. Yoo, S. Lee, M. Velazquez Lopez, and H. Yoo, "Multi-Standard Hybrid PLL With Low Phase-Noise Characteristics for GSM/EDGE and LTE Applications," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, no. 10, pp. 3254–3264, Oct 2015.
- [13] I. T. Press, "5G's rise set to break the semiconductor market's fall in 2020," *IHS Technology Press Releases*, Oct 2019.
- [14] H. Darabi, "Chapter 2 CMOS Transceivers for Modern Cellular Terminals," in *Advances in Analog and RF IC Design for Wireless Communication Systems*. Oxford: Academic Press, 2013, pp. 7–33.
- [15] C. Y. et al., "A 14-nm 0.14-psrms Fractional-N Digital PLL With a 0.2-ps Resolution ADC-Assisted Coarse/Fine-Conversion Chopping TDC and TDC Nonlinearity Calibration," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 12, pp. 3446–3457, Dec 2017.
- [16] X. Chen, "Design of a highly-integrated frequency synthesizer for multi-standard mobile communications," Ph.D. dissertation, ETH Zurich, 2007.
- <span id="page-177-2"></span>[17] B. Sporrer, L. Wu, L. Bettini *et al.*, "A Fully Integrated Dual-Channel On-Coil CMOS Receiver for Array Coils in 1.5-10.5 T MRI," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 11, no. 6, pp. 1245–1255, Dec 2017.
- <span id="page-177-0"></span>[18] J. Reber, J. Marjanovic, D. Brunner *et al.*, "In-bore broadband array receivers with optical transmission," vol. 22, Oct. 2014, p. 619.
- [19] Z.-P. Liang and P. C. Lauterbur, *Principles of magnetic resonance imaging*. IEEE Press, Series in Biomedical engineering, 2000.
- <span id="page-177-1"></span>[20] B. Sporrer, L. Wu, L. Bettini *et al.*, "A sub-1dB NF dual-channel on-coil CMOS receiver for Magnetic Resonance Imaging," in *2017 IEEE International Solid-State Circuits Conference (ISSCC)*, Feb 2017, pp. 454–455.
- [21] F. M. Gardner, *Phaselock Techniques*. John Wiley & Sons, Ltd, 2005.
- [22] R. Jaffe and E. Rechtin, "Design and performance of phaselock circuits capable of near-optimum performance over a wide range of input signal and noise levels," *IRE Transactions on Information Theory*, vol. 1, no. 1, pp. 66–76, March 1955.
- [23] A. Grebene and H. Camenzind, "Phase locking as a new approach for tuned integrated circuits," in *1969 IEEE International Solid-State Circuits Conference. Digest of Technical Papers*, vol. XII, Feb 1969.
- [24] J. I. Brown, "A digital phase and frequency-sensitive detector," *Proceedings of the IEEE*, vol. 59, no. 4, pp. 717–718, April 1971.
- [25] F. Gardner, "Charge-Pump Phase-Lock Loops," *IEEE Transactions on Communications*, vol. 28, no. 11, pp. 1849–1858, November 1980.
- [26] P. Westlake, "Digital Phase Control Techniques," *IRE Transactions on Communications Systems*, vol. 8, no. 4, pp. 237–246, December 1960.
- [27] E. Temporiti, C. Weltin-Wu, D. Baldi, R. Tonietto, and F. Svelto, "A 3 GHz Fractional All-Digital PLL With a 1.8 MHz Bandwidth Implementing Spur Reduction Techniques," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 3, pp. 824–834, March 2009.
- [28] X. Gao, O. Burg, H. Wang *et al.*, "A 2.7-to-4.3 GHz, 0.16psrms-jitter, -246.8dB-FOM, digital fractional-N sampling PLL in 28nm CMOS," in *2016 IEEE International Solid-State Circuits Conference (ISSCC)*, Jan 2016, pp. 174–175.
- [29] T. Siriburanon, S. Kondo, K. Kimura *et al.*, "A 2.2 GHz -242 dB-FOM 4.2 mW ADC-PLL Using Digital Sub-Sampling Architecture," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 6, pp. 1385–1397, June 2016.
- [30] D. Tasca, M. Zanuso, G. Marzin, S. Levantino, C. Samori, and A. L. Lacaita, "A 2.9-4.0-GHz Fractional-N Digital PLL With Bang-Bang Phase Detector and 560-fsrms Integrated Jitter at 4.5-mW Power," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 12, pp. 2745–2758, Dec 2011.
- [31] S. Levantino, G. Marzin, and C. Samori, "An Adaptive Pre-Distortion Technique to Mitigate the DTC Nonlinearity in Digital PLLs," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 8, pp. 1762–1772, Aug 2014.
- [32] H. de Bellescize, *La réception synchrone*. E. Chiron, 1932.
- [33] S. C. Gupta, "Phase-locked loops," *Proceedings of the IEEE*, vol. 63, no. 2, Feb 1975.
- [34] W. C. Lindsey and C. M. Chie, "A survey of digital phase-locked loops," *Proceedings of the IEEE*, vol. 69, no. 4, pp. 410–431, April 1981.
- [35] C. P. Reddy and S. C. Gupta, "A Class of All Digital Phase Locked Loops: Modeling and Analysis," *IEEE Transactions on Industrial Electronics and Control Instrumentation*, vol. IECI-20, no. 4, Nov 1973.
- [36] C. A. Sharpe, *A 3-State Phase Detector Can Improve Your Next PLL Design*. EDN, Sep. 1976, pp. 55–59.
- [37] W. Deng, T. Siriburanon, A. Musa, K. Okada, and A. Matsuzawa, "A Sub-Harmonic Injection-Locked Quadrature Frequency Synthesizer With Frequency Calibration Scheme for Millimeter-Wave TDD Transceivers," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 7, pp. 1710–1720, July 2013.
- [38] X. Gao, E. A. M. Klumperink, M. Bohsali, and B. Nauta, "A Low Noise Sub-Sampling PLL in Which Divider Noise is Eliminated and PD/CP Noise is Not Multiplied by *N*<sup>2</sup>," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 12, pp. 3253–3263, Dec 2009.
- [39] J. Sharma and H. Krishnaswamy, "A 2.4-GHz Reference-Sampling Phase-Locked Loop That Simultaneously Achieves Low-Noise and Low-Spur Performance," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 5, pp. 1407–1424, May 2019.
- [40] D. Lee and P. P. Mercier, "AMASS PLL: An Active-Mixer-Adopted Sub-Sampling PLL Achieving an FOM of -255.5dB and a Reference Spur of -66.6dBc," in *2018 IEEE Symposium on VLSI Circuits*, June 2018, pp. 181–182.
- [41] A. L. Lacaita, S. Levantino, and C. Samori, *Integrated Frequency Synthesizers for Wireless Systems*. Cambridge University Press, 2007.
- [42] T. A. D. Riley, M. A. Copeland, and T. A. Kwasniewski, "Delta-sigma modulation in fractional-N frequency synthesis," *IEEE Journal of Solid-State Circuits*, vol. 28, no. 5, pp. 553–559, May 1993.
- [43] M. Gupta and B.-S. Song, "A 1.8GHz Spur-Cancelled Fractional-N Frequency Synthesizer with LMS-Based DAC Gain Calibration," in *2006 IEEE International Solid State Circuits Conference - Digest of Technical Papers*, Feb 2006, pp. 1922–1931.

- [44] J. Tao and C.-H. Heng, "A 2.2-GHz 3.2-mW DTC-free Sampling ΔΣ Fractional-N PLL with -110 dBc/Hz In-band phase noise and -246dB FoM and -83dBc Reference Spur," *2019 Symposium on VLSI Circuits*, pp. C162–C163, 2019.
- [45] N. Markulic, K. Raczkowski, E. Martens, P. E. Paro Filho, B. Hershberg, P. Wambacq, and J. Craninckx, "A DTC-Based Subsampling PLL Capable of Self-Calibrated Fractional Synthesis and Two-Point Modulation," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 12, pp. 3078–3092, Dec 2016.
- [46] Z. Chen, Y. Wang, J. Shin, Y. Zhao, S. A. Mirhaj, Y. Kuan, H. Chen, C. Jou, M. Tsai, F. Hsueh *et al.*, "Sub-sampling all-digital fractional-N frequency synthesizer with -111dBc/Hz in-band phase noise and an FOM of -242dB," in *2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers*, Feb 2015, pp. 1–3.
- [47] X. Gao, E. Klumperink, and B. Nauta, "Sub-sampling PLL techniques," in *2015 IEEE Custom Integrated Circuits Conference (CICC)*, 2015, pp. 1–8.
- [48] J. Holmes, "Performance of a First-Order Transition Sampling Digital Phase-Locked Loop Using Random-Walk Models," *IEEE Transactions on Communications*, vol. 20, no. 2, pp. 119–131, April 1972.
- [49] A. Weinberg and B. Liu, "Discrete Time Analyses of Nonuniform Sampling First- and Second-Order Digital Phase Lock Loops," *IEEE Transactions on Communications*, vol. 22, no. 2, pp. 123–137, February 1974.
- [50] A. Kajiwara and M. Nakagawa, "A new PLL frequency synthesizer with high switching speed," *IEEE Transactions on Vehicular Technology*, vol. 41, no. 4, pp. 407–413, Nov 1992.
- [51] M. J. M. Pelgrom, *Analog-to-Digital Conversion*. Springer-Verlag New York, 2013.
- [52] N. Da Dalt, "A design-oriented study of the nonlinear dynamics of digital bang-bang PLLs," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 52, no. 1, pp. 21–31, Jan 2005.
- [53] N. Da Dalt, "Linearized analysis of a digital bang-bang PLL and its validity limits applied to jitter transfer and jitter generation," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 55, no. 11, pp. 3663–3675, Dec 2008.
- [54] A. Elkholy, T. Anand, W. Choi, A. Elshazly, and P. K. Hanumolu, "A 3.7 mW Low-Noise Wide-Bandwidth 4.5 GHz Digital Fractional-N PLL Using Time Amplifier-Based TDC," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 4, pp. 867–881, April 2015.
- [55] C. Ho and M. S. Chen, "A Digital PLL With Feedforward Multi-Tone Spur Cancellation Scheme Achieving <-73 dBc Fractional Spur and <-110 dBc Reference Spur in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 12, pp. 3216–3230, Dec 2016.
- [56] V. K. Chillara, Y. Liu, B. Wang, A. Ba, M. Vidojkovic, K. Philips, H. de Groot, and R. B. Staszewski, "An 860*µ*W 2.1-to-2.7GHz all-digital PLL-based frequency modulator with a DTC-assisted snapshot TDC for WPAN (Bluetooth Smart and ZigBee) applications," in *2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, Feb 2014, pp. 172–173.
- [57] J. Zhuang and R. B. Staszewski, "A low-power all-digital PLL architecture based on phase prediction," in *2012 19th IEEE International Conference on Electronics, Circuits, and Systems (ICECS 2012)*, Dec 2012, pp. 797–800.
- [58] M. H. Perrott, M. D. Trott, and C. G. Sodini, "A modeling approach for Σ-Δ fractional-N frequency synthesizers allowing straightforward noise analysis," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 8, pp. 1028–1038, Aug 2002.
- [59] X. Gao, E. A. M. Klumperink, P. F. J. Geraedts, and B. Nauta, "Jitter Analysis and a Benchmarking Figure-of-Merit for Phase-Locked Loops," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 56, no. 2, pp. 117–121, Feb 2009.
- [60] P. Dudek, S. Szczepanski, and J. V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 2, pp. 240–247, Feb 2000.
- [61] S. Pamarti, L. Jansson, and I. Galton, "A wideband 2.4-GHz delta-sigma fractional-N PLL with 1-Mb/s in-loop modulation," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 1, pp. 49–62, Jan 2004.
- [62] A. Swaminathan, K. J. Wang, and I. Galton, "A Wide-Bandwidth 2.4 GHz ISM Band Fractional-*N* PLL With Adaptive Phase Noise Cancellation," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 12, pp. 2639–2650, Dec 2007.
- [63] K. J. Wang, A. Swaminathan, and I. Galton, "Spurious Tone Suppression Techniques Applied to a Wide-Bandwidth 2.4 GHz Fractional-N PLL," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2787–2797, Dec 2008.
- [64] C. Ho and M. S. Chen, "A Fractional-N DPLL With Calibration-Free Multi-Phase Injection-Locked TDC and Adaptive Single-Tone Spur Cancellation Scheme," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 8, pp. 1111–1122, Aug 2016.
- [65] M. Zanuso, S. Levantino, C. Samori, and A. L. Lacaita, "A Wideband 3.6 GHz Digital ΔΣ Fractional-N PLL With Phase Interpolation Divider and Digital Spur Cancellation," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 3, pp. 627–638, March 2011.
- [66] S. Levantino and C. Samori, "Nonlinearity cancellation in digital PLLs (Invited paper)," in *Proceedings of the IEEE 2013 Custom Integrated Circuits Conference*, Sep. 2013, pp. 1–8.
- [67] L. Wu, *An Ultra-Low-Power ADPLL for BLE Applications*. Master Thesis at Delft University of Technology, 2014.
- [68] Z. Xu, M. Miyahara, K. Okada, and A. Matsuzawa, "A 3.6 GHz Low-Noise Fractional-N Digital PLL Using SAR-ADC-Based TDC," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 10, pp. 2345–2356, Oct 2016.
- [69] S. Levantino and C. Samori, "Nonlinearity cancellation in digital PLLs (Invited paper)," in *Proceedings of the IEEE 2013 Custom Integrated Circuits Conference*, Sep. 2013, pp. 1–8.
- [70] E. Temporiti, C. Weltin-Wu, D. Baldi, M. Cusmai, and F. Svelto, "A 3.5 GHz Wideband ADPLL With Fractional Spur Suppression Through TDC Dithering and Feedforward Compensation," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 12, pp. 2723–2736, Dec 2010.
- [71] Y. Wu, M. Shahmohammadi, Y. Chen, P. Lu, and R. B. Staszewski, "A 3.5-6.8-GHz Wide-Bandwidth DTC-Assisted Fractional-N All-Digital PLL With a MASH ΔΣ-TDC for Low In-Band Phase Noise," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 7, pp. 1885–1903, July 2017.
- [72] M. Z. Straayer and M. H. Perrott, "A Multi-Path Gated Ring Oscillator TDC With First-Order Noise Shaping," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, April 2009.
- [73] G. Baccarani, M. R. Wordeman, and R. H. Dennard, "Generalized scaling theory and its application to a ¼ micrometer MOSFET design," *IEEE Transactions on Electron Devices*, vol. 31, no. 4, pp. 452–462, April 1984.
- [74] L. Vercesi, A. Liscidini, and R. Castello, "Two-Dimensions Vernier Time-to-Digital Converter," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 8, pp. 1504–1512, Aug 2010.
- [75] H. Wang, F. F. Dai, and H. Wang, "A Reconfigurable Vernier Time-to-Digital Converter With 2-D Spiral Comparator Array and Second-Order ΔΣ Linearization," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 3, pp. 738–749, March 2018.

- [76] C. Hsu, M. Z. Straayer, and M. H. Perrott, "A Low-Noise Wide-BW 3.6-GHz Digital ΔΣ Fractional-N Frequency Synthesizer With a Noise-Shaping Time-to-Digital Converter and Quantization Noise Cancellation," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2776–2786, Dec 2008.
- [77] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps Resolution Coarse-Fine Time-to-Digital Converter in 90 nm CMOS that Amplifies a Time Residue," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, April 2008.
- [78] Q. Huang, "Phase noise to carrier ratio in LC oscillators," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 47, no. 7, pp. 965–980, July 2000.
- [79] D. B. Leeson, "A simple model of feedback oscillator noise spectrum," *Proceedings of the IEEE*, vol. 54, no. 2, pp. 329–330, Feb 1966.
- [80] A. Hajimiri and T. H. Lee, "A general theory of phase noise in electrical oscillators," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 2, pp. 179–194, Feb 1998.
- [81] D. Murphy, J. J. Rael, and A. A. Abidi, "Phase Noise in LC Oscillators: A Phasor-Based Analysis of a General Result and of Loaded *Q*," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 57, no. 6, pp. 1187–1203, June 2010.
- [82] P. Andreani, X. Wang, L. Vandi, and A. Fard, "A study of phase noise in Colpitts and LC-tank CMOS oscillators," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 5, pp. 1107–1118, 2005.
- [83] B. Hershberg, K. Raczkowski, K. Vaesen, and J. Craninckx, "A 9.1-12.7 GHz VCO in 28nm CMOS with a bottom-pinning bias technique for digital varactor stress reduction," in *ESSCIRC 2014 - 40th European Solid State Circuits Conference (ESSCIRC)*, 2014, pp. 83–86.
- [84] P. Andreani, K. Kozmin, P. Sandrup, M. Nilsson, and T. Mattsson, "A TX VCO for WCDMA/EDGE in 90 nm RF CMOS," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 7, pp. 1618–1626, 2011.
- [85] H. D. et al., "A Quad-Band GSM/GPRS/EDGE SoC in 65 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 4, pp. 870–882, April 2011.
- [86] E. Hegazi, H. Sjoland, and A. A. Abidi, "A filtering technique to lower LC oscillator phase noise," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 12, pp. 1921–1930, Dec 2001.
- [87] A. Mazzanti and P. Andreani, "Class-C Harmonic CMOS VCOs, With a General Result on Phase Noise," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 12, pp. 2716–2729, Dec 2008.
- [88] M. Tohidian, A. Fotowat-Ahmadi, M. Kamarei, and F. Ndagijimana, "High-swing class-C VCO," in *2011 Proceedings of the ESSCIRC (ESSCIRC)*, Sep. 2011, pp. 495–498.
- [89] L. Fanori and P. Andreani, "Highly Efficient Class-C CMOS VCOs, Including a Comparison With Class-B VCOs," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 7, pp. 1730–1740, July 2013.
- [90] L. Fanori and P. Andreani, "Class-D CMOS Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 12, pp. 3105–3119, Dec 2013.
- [91] L. Fanori, T. Mattsson, and P. Andreani, "A Class-D CMOS DCO with an on-chip LDO," in *ESSCIRC 2014 - 40th European Solid State Circuits Conference (ESSCIRC)*, Sep. 2014, pp. 335–338.
- [92] M. Babaie and R. B. Staszewski, "A Class-F CMOS Oscillator," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 12, pp. 3120–3133, Dec 2013.
- [93] M. Babaie and R. B. Staszewski, "An Ultra-Low Phase Noise Class-F<sub>2</sub> CMOS Oscillator With 191 dBc/Hz FoM and Long-Term Reliability," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 3, pp. 679–692, March 2015.
- [94] H. Darabi, H. Jensen, and A. Zolfaghari, "Analysis and Design of Small-Signal Polar Transmitters for Cellular Applications," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 6, pp. 1237–1249, June 2011.
- [95] M. Shahmohammadi, M. Babaie, and R. B. Staszewski, "A 1/f Noise Upconversion Reduction Technique for Voltage-Biased RF CMOS Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 11, pp. 2610–2624, Nov 2016.
- [96] S. Levantino *et al.*, "Suppression of flicker noise upconversion in a 65nm CMOS VCO in the 3.0-to-3.6 GHz band," in *ISSCC*, 2010, pp. 50–51.
- [97] B. Soltanian and P. Kinget, "AM-FM conversion by the active devices in MOS LC-VCOs and its effect on the optimal amplitude," in *IEEE Radio Frequency Integrated Circuits (RFIC) Symposium, 2006*, June 2006, pp. 4 pp.–108.
- [98] E. Hegazi and A. A. Abidi, "Varactor characteristics, oscillator tuning curves, and AM-FM conversion," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 6, pp. 1033–1039, June 2003.
- [99] J. Groszkowski, "The Interdependence of Frequency Variation and Harmonic Content, and the Problem of Constant-Frequency Oscillators," *Proceedings of the Institute of Radio Engineers*, vol. 21, no. 7, pp. 958–981, July 1933.
- [100] J. J. Rael and A. A. Abidi, "Physical processes of phase noise in differential LC oscillators," in *Proceedings of the IEEE 2000 Custom Integrated Circuits Conference (Cat. No.00CH37044)*, May 2000, pp. 569–572.
- [101] M. A. Margarit, J. L. Tham, R. G. Meyer, and M. J. Deen, "A low-noise, low-power VCO with automatic amplitude control for wireless applications," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 6, pp. 761–771, June 1999.

- [102] A. Jerng and C. G. Sodini, "The impact of device type and sizing on phase noise mechanisms," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 2, pp. 360–369, Feb 2005.
- [103] D. Murphy, H. Darabi, and H. Wu, "Implicit Common-Mode Resonance in LC Oscillators," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 3, pp. 812–821, March 2017.
- [104] A. Elkholy, T. Anand, W. Choi, A. Elshazly, and P. K. Hanumolu, "A 3.7 mW Low-Noise Wide-Bandwidth 4.5 GHz Digital Fractional-N PLL Using Time Amplifier-Based TDC," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 4, pp. 867–881, April 2015.
- [105] H. Liu, Z. Sun, H. Huang, W. Deng, T. Siriburanon, J. Pang, Y. Wang, R. Wu, T. Someya, A. Shirane *et al.*, "A 265*µ*W Fractional-N Digital PLL with Seamless Automatic Switching Subsampling/Sampling Feedback Path and Duty-Cycled Frequency-Locked Loop in 65nm CMOS," in *2019 IEEE International Solid-State Circuits Conference - (ISSCC)*, Feb 2019.
- [106] J. Z. Ru, C. Palattella, P. Geraedts, E. Klumperink, and B. Nauta, "A High-Linearity Digital-to-Time Converter Technique: Constant-Slope Charging," *IEEE Journal of Solid-State Circuits*, vol. 50, no. 6, pp. 1412–1423, June 2015.
- [107] H. Liu, D. Tang, Z. Sun, W. Deng, H. C. Ngo, and K. Okada, "A Sub-mW Fractional-*N* ADPLL With FOM of -246 dB for IoT Applications," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 12, pp. 3540–3552, Dec 2018.
- [108] B. Razavi, *Design of Analog CMOS Integrated Circuits*. McGraw-Hill, 2001.
- [109] A. M. Abo and P. R. Gray, "A 1.5-V, 10-bit, 14.3-MS/s CMOS Pipeline Analog-to-Digital Converter," *IEEE J. Solid-State Circuits*, vol. 34, no. 5, pp. 599–606, May 1999.
- [110] A. M. Abo, *Design for reliability of low-voltage, switched-capacitor circuits*. Ph.D. dissertation, University of California, Berkeley, 1999.
- [111] Y. Nakagome, H. Tanaka, K. Takeuchi *et al.*, "An Experimental 1.5-V 64-Mb DRAM," *IEEE J. Solid-State Circuits*, vol. 26, no. 4, pp. 465–472, Apr. 1991.
- [112] G. Huang and P. Lin, "A Fast Bootstrapped Switch for High-Speed High-Resolution A/D Converter," in *Proc. IEEE Asia Pacific Conf. Circuits Syst.*, Sep. 2006, pp. 500–503.
- [113] C. Liu, S. Chang, G. Huang, and Y. Lin, "A 10-bit 50-MS/s SAR ADC With a Monotonic Capacitor Switching Procedure," *IEEE Journal of Solid-State Circuits*, vol. 45, no. 4, pp. 731–740, April 2010.
- [114] B. Nikolic, V. G. Oklobdzija, V. Stojanovic *et al.*, "Improved Sense-Amplifier-Based Flip-Flop: Design and Measurements," *IEEE J. Solid-State Circuits*, vol. 35, no. 6, pp. 876–884, Jun. 2000.
- [115] P. M. Figueiredo and J. C. Vital, "Kickback Noise Reduction Techniques for CMOS Latched Comparators," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 53, no. 7, pp. 541–545, Jul. 2006.
- [116] D. Schinkel, E. Mensink, E. Klumperink *et al.*, "A Double-Tail Latch-Type Voltage Sense Amplifier with 18 ps Setup+Hold Time," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2007, pp. 314–315.
- [117] S. Fateh, P. Schönle, L. Bettini, G. Rovere, L. Benini, and Q. Huang, "A Reconfigurable 5-to-14 bit SAR ADC for Battery-Powered Medical Instrumentation," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 11, pp. 2685–2694, Nov 2015.
- [118] H. Jeon and Y. Kim, "A CMOS low-power low-offset and high-speed fully dynamic latched comparator," in *23rd IEEE International SOC Conference*, Sep. 2010, pp. 285–288.
- [119] M. Miyahara, Y. Asada, D. Paik, and A. Matsuzawa, "A low-noise self-calibrating dynamic comparator for high-speed ADCs," in *2008 IEEE Asian Solid-State Circuits Conference*, Nov 2008, pp. 269–272.
- [120] W. P. Zhang and X. Tong, "Noise Modeling and Analysis of SAR ADCs," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 23, no. 12, pp. 2922–2930, 2015.
- [121] Y. Luo, A. Jain, J. Wagner, and M. Ortmanns, "Input Referred Comparator Noise in SAR ADCs," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 66, no. 5, pp. 718–722, 2019.
- [122] D. E. Bellasi and L. Benini, "Smart Energy-Efficient Clock Synthesizer for Duty-Cycled Sensor SoCs in 65 nm/28nm CMOS," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 9, pp. 2322–2333, Sep. 2017.
- [123] Z. Zong, P. Chen, and R. B. Staszewski, "A Low-Noise Fractional-*N* Digital Frequency Synthesizer With Implicit Frequency Tripling for mm-Wave Applications," *IEEE Journal of Solid-State Circuits*, vol. 54, no. 3, pp. 755–767, March 2019.
- [124] W. Wu, R. B. Staszewski, and J. R. Long, *Millimeter-Wave Digitally Intensive Frequency Generation in CMOS*.
- [125] P. B. Roemer *et al.*, "The NMR phased array," *Mag. Res. in Med.*, vol. 16, no. 2, pp. 192–225, 1990.
- [126] N. Sun *et al.*, "Palm NMR and one-chip NMR," in *ISSCC*, 2010, pp. 488–489.
- [127] L. Romano *et al.*, "5-GHz oscillator array with reduced flicker up-conversion in 0.13-*µ*m CMOS," *IEEE JSSC*, vol. 41, no. 11, pp. 2457–2467, Nov 2006.
- [128] C. P. Wang *et al.*, "A technique for in-band phase noise reduction in fractional-N frequency synthesizers," in *A-SSCC*, 2016, pp. 273–276.