## Nyquist-rate A/D Converters

## Pietro Andreani

Dept. of Electrical and Information Technology Lund University, Sweden

- Introduction
- Timing accuracy
- Flash converters
- Sub-ranging and two-step converters
- Folding and interpolating converters
- Time-interleaved converters
- Successive-approximation converters
- Pipeline converters
- Other architectures

## Introduction - II

We have to consider the feedback factor $\beta$ around the op-amp. With the open-loop gain approximated as $A(s) \approx \omega_{T} / s$, the closed-loop gain is:


$$
\begin{aligned}
A_{C L}(s)= & \frac{V_{\text {out }}(s)}{V_{\text {in }}(s)}=\frac{A(s)}{1+\beta A(s)}= \\
& \frac{\omega_{T} / s}{1+\beta \omega_{T} / s}=\frac{\omega_{T}}{s+\beta \omega_{T}}
\end{aligned}
$$

The (linear) time response to an input voltage step is:

$$
V_{\text {out }}(t)=V_{\text {in }}\left(1-e^{-t / \tau}\right) ; \quad \tau=\frac{1}{2 \pi \beta f_{T}}
$$

## Introduction - III

## Introduction - IV

An $n$-bit ADC needs an accuracy better than $2^{-(n+1)}$, which means that the settling time must be

$$
t_{\text {sett }}>\tau \cdot(n+1) \cdot \ln (2)
$$

The time allowed for settling is half a clock cycle; thus

$$
\begin{aligned}
& \frac{1}{2 f_{C K}}>t_{\text {sett }} \rightarrow \quad f_{C K}<\frac{1}{2 t_{\text {sett }}}=\frac{\pi \beta f_{T}}{(n+1) \ln (2)} \\
& \gamma=\frac{f_{T}}{f_{C K}}>\frac{(n+1) \ln (2)}{\pi \beta}
\end{aligned}
$$
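The settling bound above can be turned into a quick numerical check. The following is a minimal Python sketch, not part of the original slides; the function names and the example values ($f_T = 1$ GHz, $\beta = 0.5$, $n = 10$) are illustrative assumptions.

```python
import math

def settling_time(tau, n_bits):
    """Time needed for the exponential error to drop below 2**-(n_bits + 1)."""
    return tau * (n_bits + 1) * math.log(2)

def max_clock_frequency(f_t, beta, n_bits):
    """Upper bound on f_CK when half a clock period is available for settling."""
    tau = 1.0 / (2 * math.pi * beta * f_t)
    return 1.0 / (2 * settling_time(tau, n_bits))

def min_gamma(beta, n_bits):
    """Minimum ratio f_T / f_CK for the given feedback factor and resolution."""
    return (n_bits + 1) * math.log(2) / (math.pi * beta)

# Illustrative values (assumptions, not taken from the slides)
f_t, beta, n_bits = 1e9, 0.5, 10
f_ck = max_clock_frequency(f_t, beta, n_bits)
print(f"f_CK,max = {f_ck / 1e6:.1f} MHz, gamma_min = {min_gamma(beta, n_bits):.2f}")
```

Note that each extra bit adds one $\ln(2)$ time constant to the settling time, so the maximum clock frequency falls only as $1/(n+1)$.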

## Timing accuracy

Recall the error due to sampling jitter:

$$
\delta V_{i n}=\delta T_{j i} \frac{d V_{i n}}{d t}
$$

The error caused by the jitter must be less than $1 / 2$ LSB $=V_{F S} / 2^{n+1}$; for example, a 12-bit ADC with a 1 V signal at 20 MHz requires a jitter below 1 ps.
Phase generation and distribution is also an issue, because of the distributed time constants in the metal wires.
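The jitter bound quoted above follows from dividing $1/2$ LSB by the maximum slope of the input sine wave. A minimal Python sketch is given below; the function name is hypothetical, and since the slides do not say whether "1 V" is the amplitude or the peak-to-peak value, both readings are evaluated.

```python
import math

def max_jitter(n_bits, v_fs, amplitude, f_in):
    """Largest sampling-time error that keeps the voltage error below 1/2 LSB
    at the steepest point of the sine wave amplitude * sin(2*pi*f_in*t)."""
    half_lsb = v_fs / 2 ** (n_bits + 1)
    max_slope = 2 * math.pi * f_in * amplitude  # V/s, reached at the zero crossing
    return half_lsb / max_slope

# 12-bit ADC, V_FS = 1 V, 20 MHz input; the amplitude convention is an assumption
for amplitude in (1.0, 0.5):
    dt = max_jitter(12, 1.0, amplitude, 20e6)
    print(f"A = {amplitude} V -> jitter below {dt * 1e12:.2f} ps")
```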


For instance, if we have a 1.6 GHz process, with $\alpha=2$ and $\beta=0.5$, we have

$$
f_{T}=f_{\text {tech }} / \alpha=800 \mathrm{MHz} ; \quad \tau=1 /\left(2 \pi \beta f_{T}\right)=0.398 \mathrm{~ns}
$$

For a 9-bit ADC requiring two clock periods per conversion we have

$$
\begin{gathered}
t_{\text {sett }}=0.398 \mathrm{~ns} \cdot(9+1) \cdot 0.693 \approx 2.76 \mathrm{~ns} \\
f_{C K}=\frac{1}{2 t_{\text {sett }}} \approx 181 \mathrm{MHz} \quad f_{\text {sampling }}=\frac{f_{C K}}{2} \approx 90.6 \mathrm{MHz}
\end{gathered}
$$

With an anti-aliasing filter margin of one octave, the signal band is

$$
\frac{f_{s}-f_{B}}{f_{B}}=2 \quad \rightarrow \quad f_{B}=\frac{f_{s}}{3} \approx 30.2 \mathrm{MHz}
$$

## Timing accuracy - II

Assume a step with a finite slope, $\quad e_{i}(t)=\left\{\begin{array}{cc}\frac{E}{T_{R}} t & \text { for } 0 \leq t \leq T_{R} \\ E & \text { for } t>T_{R}\end{array}\right.$

The response to this input is given by $e_{\text {out }}(t)=V_{r}(t)-V_{r}\left(t-T_{R}\right)$, where

$$
V_{r}(t)=\frac{E}{T_{R}}\left\{(t+\tau / 2) \operatorname{erfc}\left(\sqrt{\frac{\tau}{4 t}}\right)-\sqrt{\frac{\tau t}{\pi}} e^{-\frac{\tau}{4 t}}\right\}
$$

with

$$
\tau=R_{u} C_{u} L^{2} ; \quad \operatorname{erfc}(x)=1-\operatorname{erf}(x)=\frac{2}{\sqrt{\pi}} \int_{x}^{\infty} e^{-y^{2}} d y
$$

The delay is roughly estimated to be $T_{D}=\tau / 4$, and it is easy to see that it does not depend on the width of the interconnection (if larger, $\mathrm{C}_{u}$ increases and $R_{u}$ decreases by the same amount). Typically, $R_{u} C_{u}$ is in the order of a few $10^{-17} s / \mu m^{2}$, leading to delays of 1 ps over a few hundreds of microns of interconnection.
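The delay estimate $T_{D}=\tau / 4$ is easy to evaluate numerically. The sketch below is a minimal Python illustration with a hypothetical function name; it uses $R_{u}=40 \mathrm{~m} \Omega / \square$ and $C_{u}=0.25 \mathrm{fF} / \mu \mathrm{m}^{2}$, the values quoted in the example of these slides, and the chosen wire lengths are arbitrary.

```python
def wire_delay(r_u, c_u, length_um):
    """Rough delay T_D = tau / 4 of a distributed RC line, tau = R_u * C_u * L**2.

    r_u: sheet resistance in ohm/square, c_u: capacitance in F/um^2,
    length_um: wire length in um. The wire width cancels out of the product.
    """
    tau = r_u * c_u * length_um ** 2
    return tau / 4

r_u, c_u = 40e-3, 0.25e-15  # R_u * C_u = 1e-17 s/um^2
for length_um in (100, 300, 600, 1000):
    print(f"L = {length_um:4d} um -> T_D = {wire_delay(r_u, c_u, length_um) * 1e12:.3f} ps")
```

Because the delay grows with $L^{2}$, doubling the length of a clock wire quadruples its contribution to the phase skew.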

## Example

$$
R_{u}=40 \mathrm{~m} \Omega / \square ; \quad C_{u}=0.25 \mathrm{fF} / \mu \mathrm{m}^{2} ; \quad T_{R}=50 \mathrm{fs}
$$

A noise of $40 \mathrm{mV}_{\text {FS }}$ in the crossing threshold (0.5 normalized amplitude) results in a jitter of 40.8 fs and 142.8 fs, respectively; e.g.

$$
\delta_{j i t}=\frac{\Delta V}{\text { slope }}=\frac{40 \cdot 10^{-3}}{0.28}=142.8 \mathrm{fs}
$$

## Metastability error

If the input $\mathrm{V}_{\text {in }}$ is not large enough, the comparator output may be undefined at the end of the latch phase $\rightarrow$ error in the output code
Input signal $\rightarrow$ pre-amplified during Sample and positive-feedback regeneration during Latch


Positive feedback with time constant

$$
\tau_{L}=C_{p} / g_{m}
$$

The metastability error probability is $P_{E} \approx \frac{V_{o}}{V_{o u t, d}}=\frac{V_{o}}{A_{o} V_{i n}} e^{-\frac{t_{r}}{\tau_{L}}}$
$V_{o}$ is the voltage swing for valid logic levels; $t_{r}$ is the duration of the latch phase, typically equal to $1 /\left(2 f_{s}\right)$

## Metastability error - II

At reasonably low frequencies, the error probability is low. The comparator must ensure an almost certain output for $\mathrm{V}_{\text {in }}>1 / 2 \mathrm{LSB} \rightarrow$ $\mathrm{P}_{\mathrm{E}}$ must be lower than a given maximum for $V_{i n}=V_{F S} / 2^{n+1}$, leading to

$$
f_{s} \ln \left(\frac{V_{o} \cdot 2^{n+1}}{P_{E, \max } A_{o} V_{F S}}\right) \leq \frac{1}{2 \tau_{L}}
$$

For example, if

$$
V_{o} \approx V_{F S}, \quad n=8, \quad \tau_{L}=2 \cdot 10^{-10} \mathrm{~s}, \quad A_{o}=10^{3}, \quad P_{E, \max }=10^{-4}
$$

the maximum sampling frequency becomes

$$
f_{s}=\frac{1}{17 \tau_{L}}=293 \mathrm{MHz}
$$
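The numbers of this example can be reproduced with a few lines of Python. This is a minimal sketch with hypothetical function names; it simply evaluates the two expressions given above, with $V_{o} / V_{F S}$ passed as a ratio.

```python
import math

def metastability_probability(v_o, a_0, v_in, t_r, tau_l):
    """P_E ~ V_o / (A_o * V_in) * exp(-t_r / tau_L)."""
    return v_o / (a_0 * v_in) * math.exp(-t_r / tau_l)

def max_sampling_frequency(n_bits, tau_l, a_0, p_e_max, v_o_over_v_fs=1.0):
    """Largest f_s satisfying f_s * ln(V_o 2^(n+1) / (P_E,max A_o V_FS)) <= 1 / (2 tau_L)."""
    arg = v_o_over_v_fs * 2 ** (n_bits + 1) / (p_e_max * a_0)
    return 1.0 / (2 * tau_l * math.log(arg))

f_s = max_sampling_frequency(n_bits=8, tau_l=2e-10, a_0=1e3, p_e_max=1e-4)
print(f"f_s,max = {f_s / 1e6:.0f} MHz")
```

The logarithm evaluates to about 8.5, so $2 \ln (\cdot) \approx 17$, which is the factor appearing in the result above.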

## Resistor values

Each comparator presents a time-varying load to the ladder $\rightarrow$ the unit resistance value must be low enough to allow the ladder to pull back all voltage values with an error below $1 / 2$ LSB before the latch phase $\rightarrow$ simulations determine the optimal value for minimum power consumption


## Distortion and INL

A linear resistivity gradient in the resistance ladder causes INL; the reduced cell pitch in $b$ ) is better than $a$ ), and even better is the folded layout in c)

Matching in modern technologies is $0.1-0.05 \% \rightarrow 10-11$ bits without trimming
Temperature drift across the ladder can cause INL as well (temperature coefficients as high as $10,000 \mathrm{ppm} /{ }^{\circ} \mathrm{C}$ are possible)


## Offset in comparators

The different offset at the input of two contiguous comparators changes the respective quantization interval from $\Delta$ to $\Delta_{i}$ :

$$
\Delta \rightarrow \Delta_{i}=V_{t h r, i}-V_{t h r, i-1}=\Delta-\left(V_{o s, i}-V_{o s, i-1}\right)
$$

For example, an 8-bit flash with $1 \mathrm{~V}_{\mathrm{FS}}$ requires a $\sigma_{V_{o s}}<0.6 \mathrm{mV}$ for a $99.9 \%$ yield (i.e., $1 \mathrm{~V} / 2^{8+1}>3.3 \sigma_{V_{o s}}$ )
The offset is mainly caused by the pre-amplifier in the comparator; if the transistors in the input differential pair are near-minimum length, the mismatches in threshold voltage and device length are much larger than those in width and conductivity; the input-referred total offset (usually, a few mV ) becomes

$$
V_{o s}=\sqrt{\frac{A_{V_{t h}}^{2}}{W L}+\left(\frac{V_{o d}}{2} \frac{\Delta L}{L}\right)^{2}} \quad \text { with } \quad \Delta V_{t h}=\frac{A_{V_{t h}}}{\sqrt{W L}}
$$

and $\quad \frac{I_{d s}}{g_{m}}=\frac{V_{g s}-V_{t h r}}{2}=\frac{V_{o d}}{2} \quad$ (long-channel expression)
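The two offset expressions above can be evaluated as follows. This is a minimal Python sketch with hypothetical function names; the device values in the second call (matching coefficient, dimensions, overdrive, relative length mismatch) are purely illustrative assumptions and do not come from the slides.

```python
import math

def max_offset_sigma(n_bits, v_fs, n_sigma=3.3):
    """Largest sigma(V_os) such that n_sigma * sigma stays below 1/2 LSB."""
    return v_fs / 2 ** (n_bits + 1) / n_sigma

def input_offset(a_vth, w, l, v_od, dl_over_l):
    """Input-referred offset from threshold mismatch and length mismatch.

    a_vth in V*um, w and l in um, v_od in V, dl_over_l is the relative
    length mismatch Delta L / L (dimensionless).
    """
    vth_term = a_vth / math.sqrt(w * l)
    length_term = 0.5 * v_od * dl_over_l
    return math.hypot(vth_term, length_term)  # root-sum-square of the two terms

print(f"8-bit, 1 V_FS: sigma < {max_offset_sigma(8, 1.0) * 1e3:.2f} mV")
# Hypothetical device: A_Vth = 3 mV*um, W = 10 um, L = 0.2 um, V_od = 0.2 V, dL/L = 1 %
print(f"V_os = {input_offset(3e-3, 10.0, 0.2, 0.2, 0.01) * 1e3:.2f} mV")
```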

## Offset auto-zeroing

Offset is usually too large $\rightarrow$ offset cancellation is needed


Two-phase approach $\rightarrow$ during $\Phi_{\mathrm{az}}$, opamp in unity-gain configuration $\rightarrow C_{\text {os }}$ is charged to $V_{o s}-V_{1}$; during the second phase, we have

$$
V_{\text {diff }, \text { in }}=V_{+}-V_{-}=V_{o s}\left(\bar{\Phi}_{a z}\right)-V_{2}-\left(V_{o s}\left(\Phi_{a z}\right)-V_{1}\right)
$$

If the offset does not change $\rightarrow$ perfect cancellation; $1 / f$ noise is removed as well (white noise is instead doubled, because the two samples are uncorrelated)

Charge injection is an issue (the switch in the feedback path is critical); however, offsets of a fraction of a mV are possible (with fully differential architectures)

## Example - ladder settling

Simplified example ( $\mathrm{n}=8$ ) $\rightarrow$ tap \#80 takes $\sim 5.6 \mathrm{~ns}$ to settle to within $1 / 2 \mathrm{LSB}$ of its nominal level of 0.3125 V after converting an input of $0.5625 \mathrm{~V} \rightarrow$ if this is too long, and the next input is close to 0.3125 V , there could be a code error. Notice also that tap \#144 is not constant, even though 0.5625 V is its nominal value: this is because the ladder must (dis)charge all other nodes as well

$$
R_{i}=25 \Omega, \quad C=6.25 \mathrm{fF}
$$




## Practical limits

The speed and resolution of flash converters depend on a number of practical limits:

1) Small unit resistors in the resistive divider $\rightarrow$ improve resolution and speed, but require higher power and a voltage reference with low impedance from DC to $f_{s} \rightarrow$ very difficult to realize
2) Exponential increase in complexity and power consumption with \# bits (especially power cannot grow beyond budget)
3) Effectiveness of comparators, in terms of metastability probability error $\rightarrow$ at high frequencies, the gain of the pre-amp (i.e., ratio of output signal (at the end of pre-amp phase) and input signal) does not have enough time to reach its low-frequency value

## Practical limits - II

## Practical limits - III

At high frequencies, the "dynamic" gain of the pre-amp is determined by the current delivered by the pre-amp to $\mathrm{C}_{\mathrm{p}}$, yielding a voltage ramp across $\Phi_{\text {amp }}$ (without reaching its maximum value) $\rightarrow$ the final value of the output voltage is

$$
\begin{gathered}
V_{\text {out }, \text { pre-amp }}=\left(V_{i n} g_{m, A}\right) \frac{1}{C_{p}} \cdot \frac{T_{c k}}{2}=\frac{V_{i n} g_{m, A}}{2 f_{c k} C_{p}} \rightarrow \\
\rightarrow A_{0}=\frac{g_{m, A}}{2 f_{c k} C_{p}}
\end{gathered}
$$

More bits $\rightarrow$ more gain needed $\rightarrow$ more $\mathrm{g}_{\mathrm{m}, \mathrm{A}} \rightarrow$ much more current ( $\times 4$ for 1 more bit, or for doubling the sampling speed)

Example: 7-bit 500 MHz flash that needs a pre-amp gain of 20; MOS overdrive $\mathrm{V}_{\mathrm{od}}=200 \mathrm{mV}, \mathrm{C}_{\mathrm{p}}=0.4 \mathrm{pF}$. Since $g_{m}=2 I_{d s} / V_{o d}$, we obtain for the diff-pair

$$
I_{d i f f-p a i r}=2 I_{d s}=g_{m} V_{o d}=2 A_{0} f_{c k} C_{p} \cdot V_{o d}=1.6 \mathrm{~mA}
$$

which means that about 200 mA ( $2^{7}-1=127$ pre-amps at 1.6 mA each) are needed in the pre-amp stages alone
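The arithmetic of this example can be checked with a short script. The Python sketch below uses a hypothetical function name and assumes one pre-amp per comparator, i.e. $2^{n}-1$ pre-amps in total.

```python
def preamp_current(gain, f_ck, c_p, v_od):
    """Differential-pair current needed for a given dynamic pre-amp gain.

    Uses A_0 = g_m / (2 f_ck C_p) and I_diff-pair = g_m * V_od
    (long-channel expression).
    """
    g_m = 2 * gain * f_ck * c_p
    return g_m * v_od

i_pair = preamp_current(gain=20, f_ck=500e6, c_p=0.4e-12, v_od=0.2)
n_preamps = 2 ** 7 - 1  # assumption: one pre-amp per comparator in a 7-bit flash
print(f"per pre-amp: {i_pair * 1e3:.1f} mA, total: {n_preamps * i_pair * 1e3:.0f} mA")
```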

Another important limitation is due to the capacitive load on the input S\&H from the parasitic capacitances of all comparators, $2^{n} C_{p}$
The charge on this capacitance after sampling the reference is $2^{n} C_{p} V_{\text {ref }} / 2$
A full-scale input voltage drains an equal amount of charge from the S\&H; this charge must be provided in a fraction $\alpha$ of the sampling period; the peak current that the S\&H must deliver is

$$
I_{S \& H, p k}>\frac{f_{s} 2^{n} C_{p} \Delta V_{i n, \max }}{2 \alpha}
$$

which is easily larger than 10 mA .
To summarize, it is impractical to design an 8 -bit flash with $>500 \mathrm{MS} / \mathrm{s}$, or a 6 -bit flash with $>2$ GS/s (these numbers improve along with the CMOS technology, of course)

## Sub-ranging / Two-step converter



## Advantages

The number of comparators is much reduced: for 8 bits and $\mathrm{M}=\mathrm{N}=4$, we need $2 \cdot\left(2^{4}-1\right)=30$ comparators instead of $2^{8}-1=255$ in the full flash!
The spared area and power are much more than what is needed in the DAC and residue generator
If $\mathrm{K}=2^{\mathrm{M}}$, the dynamic range of the amplified residue equals that of the input signal $\rightarrow$ coarse and fine ADCs can share the same reference voltage
S\&H loaded only by $2^{M}-1$ comparators
Conversion rate is in principle reduced, as 2-3 clock periods are needed per conversion

However, since the speed of the S\&H is the bottleneck in medium-resolution full-flash ADCs, the clock of the sub-ranging converter can actually end up running at a higher frequency, as the reduced capacitive loading enables a faster S\&H
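The saving in comparators is easy to tabulate. A minimal Python sketch, with hypothetical function names, comparing the full flash with an M + N two-step split:

```python
def flash_comparators(n_bits):
    """A full flash needs one comparator per threshold."""
    return 2 ** n_bits - 1

def two_step_comparators(m_bits, n_bits):
    """Coarse M-bit flash plus fine N-bit flash."""
    return (2 ** m_bits - 1) + (2 ** n_bits - 1)

for m_bits, n_bits in ((4, 4), (5, 5), (6, 6)):
    total_bits = m_bits + n_bits
    print(f"{total_bits:2d} bits: flash {flash_comparators(total_bits):5d}, "
          f"two-step {two_step_comparators(m_bits, n_bits):4d}")
```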

## Accuracy requirements

For medium resolutions, the quantization step is 1 mV or more $\rightarrow$ the capacitance required to keep $\mathrm{kT} / \mathrm{C}$ smaller than $1 / 2$ LSB is not very large
For instance, if $\mathrm{C}=0.5 \mathrm{pF}, \sqrt{k T / C}=90 \mu V \rightarrow$ up to 10 b , the input capacitance is not a problem in the $\mathrm{S} \& \mathrm{H}$ design
The residue is given by

$$
V_{\text {res }}\left(V_{\text {in }}\right)=K\left(V_{\text {in }}-V_{D A C}(i)\right) \text { for } V_{\text {coarse }}(i-1)<V_{\text {in }}<V_{\text {coarse }}(i)
$$

Ideally, the residue is a saw-tooth with amplitude between 0 and $V_{F S} \cdot K / 2^{M}$.
Below is instead the residue with a real coarse ADC and an ideal DAC
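The ideal saw-tooth residue can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the circuit itself: the coarse ADC and the DAC are ideal, the input lies between 0 and $V_{F S}$, and the function name and default gain $K=2^{M}$ are choices made here.

```python
def two_step_residue(v_in, m_bits, v_fs=1.0, gain=None):
    """Ideal coarse conversion: returns (coarse code i, amplified residue).

    The DAC level is the lower edge of the coarse interval containing v_in,
    so the residue spans 0 ... v_fs * gain / 2**m_bits.
    """
    if gain is None:
        gain = 2 ** m_bits  # K = 2^M restores the full dynamic range
    step = v_fs / 2 ** m_bits
    code = min(int(v_in / step), 2 ** m_bits - 1)  # clamp v_in = v_fs to the top code
    v_dac = code * step
    return code, gain * (v_in - v_dac)

for v_in in (0.10, 0.30, 0.55, 0.95):
    code, residue = two_step_residue(v_in, m_bits=2)
    print(f"V_in = {v_in:.2f} -> coarse code {code}, residue {residue:.2f}")
```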

## Accuracy requirements - II

A residue outside the range of the LSB ADC results in all zeros (ones) until the input re-enters the boundaries

The DAC generates the subtractive term $\rightarrow$ a DAC error alters the residue as in the figure below (where the ADC is ideal)
Important: shift caused by DAC lasts an entire "tooth" $\rightarrow$ accuracy demands on the DAC are more stringent than on the ADC (whose errors can be corrected, as they are localized around the break points - either by inserting extra thresholds outside the $0-\mathrm{V}_{\mathrm{FS}}$ region, or with the same technique that will be discussed later within pipeline ADCs)



## Overall picture

ADC errors (middle curve) $\rightarrow$ only around MSB transitions; after 1-2 LSB from MSB transitions, response is again on the interpolating line
DAC errors $\rightarrow$ affect whole LSB range


Data Converters

## Two-step as a non-linear conversion

The generation of the residual is equivalent to the action of a non-linear block


Non-linear operations alter the signal spectrum by generating extra tones $\rightarrow$ in data-sampled systems, a narrow-band input can spread over the Nyquist range $\rightarrow$ amplifier and LSB flash ADC must work well above Nyquist in order to avoid degradation in the LSB conversion
Spectrum of the residue is only weakly correlated with the input (provided the amplitude is a few MSBs at least) $\rightarrow$ distortion in the LSB blocks looks more like white noise than tones $\rightarrow$ linearity of the residue generator is not critical $\rightarrow$ SNDR and SFDR set by first ADC and DAC

## Folding

Two-step $\rightarrow$ dynamic range divided into MSBs, with linear input-output relations inside each MSB

An equivalent non-linear transformation is the folding of a straight line shown below $\rightarrow$ folding once around $\mathrm{V}_{\mathrm{FS}} / 2$ : 1 bit; twice around $\mathrm{V}_{\mathrm{FS}} / 4$ : 2 bits; three times around $\mathrm{V}_{\mathrm{FS}} / 8$ : 3 bits; and so on


The number of intervals required to quantize the folded signal diminishes accordingly $\rightarrow$ after an M -bit folding, only $2^{\mathrm{n}-\mathrm{M}}-1$ comparators are needed for the $n$-bit conversion

Obviously, it is necessary to know from which folded segment the input is coming, to determine the MSBs
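The repeated folding described above can be modelled as follows. This is a minimal Python sketch of an ideal folder with infinitely sharp corners; the function names are hypothetical, and identifying the segment by a plain binary index is an assumption made here for simplicity.

```python
def fold(v_in, m_bits, v_fs=1.0):
    """Fold the input around V_FS/2, V_FS/4, ... (m_bits times).

    Returns (segment index, folded voltage); the folded voltage lies
    between 0 and v_fs / 2**m_bits.
    """
    folded = v_in
    for k in range(1, m_bits + 1):
        pivot = v_fs / 2 ** k
        folded = pivot - abs(folded - pivot)
    width = v_fs / 2 ** m_bits
    segment = min(int(v_in / width), 2 ** m_bits - 1)
    return segment, folded

def unfold(segment, folded, m_bits, v_fs=1.0):
    """Recover the input: odd segments have a reversed (falling) slope."""
    width = v_fs / 2 ** m_bits
    offset = folded if segment % 2 == 0 else width - folded
    return segment * width + offset

for v_in in (0.10, 0.30, 0.60, 0.90):
    segment, folded = fold(v_in, m_bits=2)
    print(f"V_in = {v_in:.2f} -> segment {segment}, folded {folded:.2f}, "
          f"unfolded {unfold(segment, folded, 2):.2f}")
```

The segment index plays the role of the MSBs, while the folded voltage is what the fine quantizer has to resolve.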

## Folding - II

The M-bit folder produces two signals: the analog folded output, and the M-bit code identifying the segment used in the folded response

The gain stage possibly boosts the dynamic range of the analog folded output to $\mathrm{V}_{\mathrm{FS}}$
The N-bit ADC determines the LSBs, and finally the logic block combines MSBs and LSBs to deliver $n=N+M$ bits


## Real folding

No circuit implementing folding is able to achieve infinitely sharp transitions: corners are always more or less rounded
The error is estimated by unfolding the folded response; monotonicity is guaranteed, but INL may be high

In general, working on different segments results in different delays
Folding is normally used for high conversion rates and medium-high resolutions $\rightarrow$ finite bandwidth and slew-rate in the folding block are crucial

## Double folding

Double folding avoids the use of the non-linear regions present in simple folding $\rightarrow$ higher linearity, but also complexity

Two transfer characteristics, out-of-phase by $1 / 4$ of the folding period $\rightarrow$ one is always in the linear region
The logic must decide which of the two characteristics should be used

## Interpolation

Interpolation $\rightarrow$ value intermediate between two other values
With voltages $\rightarrow$ implemented with resistors, or capacitors (in data sampling) - a) and b) below

$$
V_{\text {inter }}=\frac{V_{1} R_{2}+V_{2} R_{1}}{R_{1}+R_{2}} \quad \text { (a) }
$$

$$
V_{\text {inter }}\left(\Phi_{2}\right)=\frac{V_{1} C_{1}+V_{2} C_{2}}{C_{1}+C_{2}} \quad \text { (b) }
$$

With currents $\rightarrow$ implemented with current mirrors - c)

$$
\frac{(W / L)_{1, i}}{(W / L)_{1}}=\alpha ; \quad \frac{(W / L)_{2, i}}{(W / L)_{2}}=1-\alpha
$$

$$
I_{\text {inter }}=\alpha I_{1}+(1-\alpha) I_{2} \quad \text { (c) }
$$

Accuracies of $0.1 \%$ can be expected with good layout

## Interpolation in flash converters

Comparators substituted with pre-amps; reduces \# pre-amps and ref. voltages (but not \# latches) $\rightarrow$ cap. load on the S\&H diminishes $\rightarrow$ less power consumption, higher speed; fewer ref. voltages $\rightarrow$ less "charge-pumping" effect $\rightarrow$ less stringent settling limit

Often used with 4 or 8 resistors (not just 2 as here)


## Interpolation in flash converters - II

To preserve linearity, the pre-amp output should saturate for an input higher than the closest upper threshold, and lower than the closest lower threshold; the overlapped non-saturated regions, interpolated by the resistors, determine zero crossings mid-way between the pre-amps' zero crossings; far from the zero crossings, the slope of the interpolated curve diminishes, but there the differential signal is already large enough for the latch.


## Interpolation with folding

Multiple interpolators take the place of the fine flash converter $\rightarrow$ (a) shows the interpolation between two folded responses ( $\mathrm{V}_{\mathrm{F} 1}$ and $\mathrm{V}_{\mathrm{F} 2}$ ), shifted by half segment; the shape of the interpolated segment is more rounded than the generating signals; but, again, it is important that it is linear only close to the zero crossing

(b) $\rightarrow$ interpolating $\mathrm{V}_{\mathrm{F} 1}$ and $-\mathrm{V}_{\mathrm{F} 2}$ yields a second set of zero crossings, detected by additional comparators
(c) $\rightarrow$ if $R_{1}=3 R_{2}$, the zero crossing is at $1 / 4$ distance from the zero crossing of $\mathrm{V}_{\mathrm{F} 2} \rightarrow$ multiple interpolations can yield a sufficient number of crossings to obtain the LSB conversion $\rightarrow$ avoids the use of an explicit flash ADC, possibly increasing the conversion rate
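The position of an interpolated zero crossing follows directly from the resistive-divider expression $V_{\text {inter }}=\left(V_{1} R_{2}+V_{2} R_{1}\right) /\left(R_{1}+R_{2}\right)$. The Python sketch below assumes that both signals are linear near their zero crossings, with equal slopes; the function names and the normalised crossing positions are illustrative.

```python
def interpolate(v1, v2, r1, r2):
    """Tap voltage of a resistive divider: R1 connects to V1, R2 to V2."""
    return (v1 * r2 + v2 * r1) / (r1 + r2)

def interpolated_crossing(x1, x2, r1, r2):
    """Zero crossing of the interpolated signal for v1 = x - x1 and v2 = x - x2."""
    return (x1 * r2 + x2 * r1) / (r1 + r2)

# Two signals crossing zero at x1 = 0 and x2 = 1 (normalised input axis)
x1, x2 = 0.0, 1.0
for r1, r2 in ((1.0, 1.0), (3.0, 1.0), (1.0, 3.0)):
    x0 = interpolated_crossing(x1, x2, r1, r2)
    print(f"R1/R2 = {r1 / r2:.2f} -> crossing at {x0:.2f}, "
          f"check: {interpolate(x0 - x1, x0 - x2, r1, r2):.1e}")
```

With $R_{1}=3 R_{2}$ the crossing lands a quarter of the way from the zero crossing of the second signal, in agreement with case (c).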

## Interpolation and linearity

Interpolation resistances load the pre-amps $\rightarrow$ if the output resistance is not much lower, the voltage drop alters the generated voltages $\rightarrow$ however, this is in fact an advantage! First, the voltage error depends on the generated voltage, and is zero for zero output voltage $\rightarrow$ it has no influence at the zero crossing, which is where it really matters!


Besides this, interpolating resistors + pre-amps with finite output resistance average the offsets of the pre-amps, improving the overall linearity
The current flowing into the i-th pre-amp is

$$
I_{P, i}=\frac{V_{I, i+1}+V_{I, i-1}-2 V_{I, i}}{2 R_{\mathrm{int}}}
$$

$\rightarrow$ if the i-th interpolated voltage is the average of its neighbors, its current is zero $\rightarrow$ ideal output voltages are not affected by the interpolating network $\rightarrow$ we need to consider only the effect of the errors

## Interpolation and linearity - II

We assume an error on the output voltage of the i-th pre-amp: $V_{P, i} \rightarrow V_{P, i}+\varepsilon_{i}$
The error at the output $i$ becomes

$$
\varepsilon_{\text {out }, i}=\varepsilon_{i} \frac{1}{1+R_{\text {out }} / R_{T, i}}=\varepsilon_{i} T_{i, i}<\varepsilon_{i} ; \quad R_{T, i}=R_{U, i} \| R_{L, i}
$$


where $T_{i, i}$ is a damping factor introduced by the finite output resistance of the pre-amp
Using the circuit on the left, it is possible to estimate the error induced by $\varepsilon_{i}$ at a node $j ; T_{i, j}$ diminishes with $|i-j|$ increasing, and is usually negligible for $|i-j|>4$. In general, we must superpose the effects of all errors:

$$
\varepsilon_{\text {tot, } i}=\sum_{j=1}^{2^{n}-1} \varepsilon_{j} T_{i, j}
$$

The above averaging effect benefits short-distance errors $\rightarrow$ reduces the DNL

## Time-interleaved converters

Increase the conversion rate by having N converters working in parallel $\rightarrow$ the equivalent conversion rate is N times that of the single converter; the S\&H must work at full rate; alternatively, there can be N S\&Hs working at the reduced rate, but with a very high precision in the distribution of the phases.
Important remark: gain and offset errors on each ADC, which are usually of minor weight, are very important here, because they become dynamic errors!


## Accuracy requirements

Clock misalignment across the ADCs $\rightarrow$ looks like clock jitter, but with $\mathrm{N} \cdot \mathrm{T}_{\mathrm{S}}$ periodicity. If the clock misalignment between the K-th and 1st channel is $\delta_{K}$, the error introduced is

$$
\varepsilon_{c k, K}(n T)=\left.\delta_{K} \frac{d V_{i n}}{d t}\right|_{n T} ; n=i N+K
$$

which, for the sinusoidal input $V_{i n}=A_{i n} \sin \left(\omega_{i n} t\right)$, becomes

$$
\varepsilon_{c k, K}(n T)=\delta_{K} A_{i n} \omega_{i n} \cos \left(\omega_{i n} n T\right)
$$

The power of the error due to clock misalignment becomes, as a function of the input power and N :

$$
P_{\varepsilon_{\delta_{K}}}=P_{i n} \delta_{K}^{2} \omega_{i n}^{2} / N
$$

Furthermore, because of the down-sampling of the input with a clock error at frequency $f_{s} / N$, the same clock misalignment gives rise to images located at

$$
\frac{k f_{s}}{N} \pm f_{i n}
$$

## Accuracy requirements - gain error

Identical gain errors in all ADCs have no impact; gain mismatch between the channels causes tones $\rightarrow$ worst case if gains alternate between $\left(1+\varepsilon_{G}\right)$ and $\left(1-\varepsilon_{G}\right)$, which gives rise to an error equal to the multiplication of the input signal with a square wave of amplitude $2 \varepsilon_{G}$ at $f_{s} / 2$ (or its submultiples)
Largest spur tone occurs at $f_{s} / 2 \pm f_{\text {in }}\left(\right.$ or $\left.\mathrm{f}_{\mathrm{s}} / 4 \pm \mathrm{f}_{\mathrm{in}}, \ldots\right)$, and has amplitude

$$
A_{\text {spur }}=\frac{4}{\pi} \varepsilon_{G} A_{i n}
$$

resulting in

$$
S F D R=20 \log \frac{\pi}{4 \varepsilon_{G}}
$$

Thus, even with $\varepsilon_{G}$ as low as $0.1 \%$, the SFDR is not higher than 58 dB.
Bottom line: time-interleaved ADCs need trimming/calibration if high resolution is desired
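The SFDR limit set by gain mismatch can be evaluated, and inverted, with the short Python sketch below. The function names are hypothetical; the formula is the worst-case expression given above.

```python
import math

def sfdr_gain_mismatch_db(eps_g):
    """Worst-case SFDR = 20 log10(pi / (4 eps_G)) for alternating gains 1 +/- eps_G."""
    return 20 * math.log10(math.pi / (4 * eps_g))

def max_gain_error(sfdr_db):
    """Largest eps_G compatible with a target SFDR (inverse of the expression above)."""
    return math.pi / (4 * 10 ** (sfdr_db / 20))

print(f"eps_G = 0.1 % -> SFDR = {sfdr_gain_mismatch_db(1e-3):.1f} dB")
for target_db in (60, 70, 80):
    print(f"SFDR = {target_db} dB needs eps_G < {max_gain_error(target_db) * 100:.4f} %")
```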

## Successive-approximation converters

1 bit converted per clock cycle (+ 1 or 2 cycles for signal sampling + settling) $\rightarrow$ binary search; reduced complexity and power, but lower conversion rate
The MSB distinguishes between input signals that are above or below $\mathrm{V}_{\mathrm{FS}} / 2$; depending on this result, the threshold for determining the 2nd bit is either $\mathrm{V}_{\mathrm{FS}} / 4$ or $3 \mathrm{~V}_{\mathrm{FS}} / 4$; and so on (below: 3-b conversion)
The voltages used in the comparisons are generated by a DAC driven by a successive-approximation register (SAR)


## SA - timing diagram

After signal S\&H, the SAR sets the MSB to 1; if the comparator confirms, the 1 is retained, otherwise it is reset to zero; then the 2nd bit is processed, and so on until all bits have been generated; then a new signal sample is taken


## Algorithm with error

If there is an error in a bit evaluation, the error propagates along all successive steps. For instance, this may happen if $\mathrm{V}_{\text {DAC }}$ changes from a level well below $\mathrm{V}_{\text {S\&H }}$ to a level just above $\mathrm{V}_{\mathrm{S} \& \mathrm{H}}$ ( $4^{\text {th }}$ clock period below): if the comparator recovery from overdrive is not fast enough, an error may occur. In the case below, we end up with a conversion error of 2 LSBs.

Occurs typically at the beginning of the conversion, when overdrive is large
Error correction techniques expand the search range near the end, to accommodate for initial inaccuracies $\rightarrow$ however, extra clock cycles needed


The name of the algorithm comes from the fact that the voltage from the DAC is an improving approximation of the voltage from the S\&H - occasionally, the error can be larger than in the previous step (as from $4^{\text {th }}$ to $5^{\text {th }}$ bit below), but it is certainly never larger than the full-scale amplitude successively divided by 2

Example below: search path for $\mathrm{V}_{\mathrm{S} \& \mathrm{H}}=0.4296875 \mathrm{~V}_{\mathrm{FS}}$
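The search path for this input can be reproduced with a small model of the binary search. The Python sketch below assumes an ideal DAC and comparator; the function name and the choice of 7 bits are assumptions made for illustration (the input happens to be an exact 7-bit fraction of $V_{F S}$).

```python
def sar_convert(v_in, n_bits, v_fs=1.0):
    """Ideal successive-approximation search.

    Returns (output code, list of DAC voltages tried, MSB first).
    """
    code = 0
    trials = []
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)           # tentatively set the current bit
        v_dac = v_fs * trial / 2 ** n_bits
        trials.append(v_dac)
        if v_in >= v_dac:                   # comparator confirms: keep the 1
            code = trial
    return code, trials

code, trials = sar_convert(0.4296875, n_bits=7)
print(f"code = {code:07b}")
for step, v_dac in enumerate(trials, start=1):
    print(f"step {step}: V_DAC = {v_dac:.7f}, |error| = {abs(v_dac - 0.4296875):.7f}")
```

Running the loop shows the behaviour noted above: the error after the 5th trial is larger than after the 4th, although the bound on the error still halves at every step.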


## Charge-redistribution SA-ADC

Charge sampled at the beginning of the conversion is redistributed on the capacitor array, to obtain a top-plate voltage close to zero at the end

Binary-weighted capacitances + a comparator $\rightarrow$ only one active-device block, which is not even particularly critical $\rightarrow$ very attractive for nanometer CMOS processes

During $\Phi_{\mathrm{S}}$, sampling $\rightarrow$ array connected between $\mathrm{V}_{\text {in }}$ and ground - total charge becomes:

$$
Q_{\text {tot }}=2^{n} C_{u} V_{i n}
$$



## Charge-redistribution SA-ADC - conversion

MSB $\rightarrow$ bottom plate of largest capacitance $\left(2^{n-1} C_{u}\right)$ to $\mathrm{V}_{\text {ref }}$, rest of the array to ground $\rightarrow$ superposition yields the voltage on the top plate:

$$
V_{\text {comp }}(1)=\frac{V_{\text {ref }}}{2}-V_{i n}
$$

This voltage is the difference between MSB and input $\rightarrow$ only necessary to compare it to zero = ground, very convenient; if $\mathrm{MSB}=1$, the connection of the largest cap to $\mathrm{V}_{\text {ref }}$ is kept during the $2^{\text {nd }}$ comparison; otherwise it is restored to ground


## SA-ADC conversion - II

Thus, during the second comparison, the top plate voltage becomes

$$
V_{\text {comp }}(2)=\frac{V_{\text {ref }}}{2} \cdot M S B+\frac{V_{r e f}}{4}-V_{i n}
$$

which is used to find the $2^{\text {nd }}$ bit; and so on for all bits
The parasitic capacitance at the top plate attenuates the generated voltage by a factor $\alpha$ :

$$
\alpha=\frac{C_{u} 2^{n}}{C_{u} 2^{n}+C_{p}}
$$
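The effect of the top-plate parasitic on the conversion can be checked with a charge-conservation model of the array. This Python sketch is an idealisation: the function names are hypothetical, capacitances are expressed in units of $C_{u}$, the array is assumed to total $2^{n} C_{u}$, and the comparator is ideal.

```python
def top_plate_voltage(code, v_in, n_bits, v_ref=1.0, c_u=1.0, c_p=0.0):
    """Top-plate voltage when capacitors worth code * C_u are tied to V_ref.

    Charge conservation on the floating top plate gives
    V_top = alpha * (V_ref * code / 2^n - V_in), alpha = C_tot / (C_tot + C_p).
    """
    c_total = 2 ** n_bits * c_u
    c_ref = code * c_u
    return (c_ref * v_ref - c_total * v_in) / (c_total + c_p)

def charge_redistribution_sar(v_in, n_bits, v_ref=1.0, c_u=1.0, c_p=0.0):
    """Binary search that only looks at the sign of the top-plate voltage."""
    code = 0
    for bit in range(n_bits - 1, -1, -1):
        trial = code | (1 << bit)
        if top_plate_voltage(trial, v_in, n_bits, v_ref, c_u, c_p) <= 0:
            code = trial                    # DAC level not above input: keep the bit
    return code

for c_p in (0.0, 10.0, 100.0):              # parasitic in units of C_u
    print(f"C_p = {c_p:5.1f} C_u -> code {charge_redistribution_sar(0.3, 8, c_p=c_p)}")
```

The output code is the same for every value of the parasitic, since the attenuation scales the comparator input without changing its sign.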



$\alpha$ reduces the voltage value, but not its sign, which is the relevant info; this is a consequence of pre-charging the top plate to zero: the top-plate voltage is zero at sampling, and almost zero at the end of the conversion
The input common-mode range of the comparator is zero without using op-amps or OTAs

Only the comparator and charging/discharging the array determine the power consumption $\rightarrow$ very power efficient (however, this does not take into account the generation of $\mathrm{V}_{\mathrm{REF}}$ )
Auto-zeroing of the comparator to avoid offset errors $\rightarrow$ comparator is connected as a unity-gain buffer and used to pre-charge the top plate during sampling
Capacitive attenuation can be used to limit the capacitive spread in the array

## Example of SA-ADC

## A Programmable 10b up-to-6MS/s SA-ADC Featuring Constant FoM with On-Chip Reference Voltage Buffers


| Parameter | Value |
| :---: | :---: |
| CMOS technology | $0.13 \mu \mathrm{m}$, 1P6M |
| Core area [$\mathrm{mm}^{2}$] | 0.75 |
| Power supply [V] | 1.2 |
| Power consumption [mW] | 3.2 |
| Sampling frequency [MS/s] | 5.5 |
| INL/DNL [LSB] | $0.6 / 0.55$ |
| ENOB [bits] | 9.2 |
| HD3 $\approx$ IM3 [dB$_{\mathrm{FS}}$] | 72 |
| FoM [pJ/conversion] | 1 |

## Pipeline ADC

Cascade of individual stages, each performing one of the elementary functions required by a sequential algorithm

The pipeline unwinds over space what would be performed over time by a sequential scheme

The simplest sequential scheme is the two-step algorithm, which uses two clock periods, one for MSBs and one for LSBs $\rightarrow$ the pipeline version delivers MSBs and LSBs in one clock period, in this way: the first stage yields the MSBs of the current sample, and the second stage yields the LSBs of the previous sample - but, in general, pipeline is multi-step


## Pipeline ADC - II

SA with pipeline $\rightarrow$ each stage of the pipeline generates 1 bit, plus the difference between input and internal DAC voltage - accuracy of analog signal must comply with the desired \# of bits (each stage may deliver more than 1 bit, and/or different stages may deliver different \# of bits)

Digital logic combines the bits coming from K stages $\rightarrow$ outputs at full rate, but with a delay (latency) of $\mathrm{K}+1$ clock cycles $\rightarrow$ not a problem, unless the ADC is inside a feedback loop


We see here a timing example for a 10-b, 5-stage, 2-b per stage pipeline ADC - the $6^{\text {th }}$ clock cycle is used by the digital logic to combine the 10 bits and make them available

## Generic pipeline stage

Architectures using digital correction techniques have a DAC resolution lower than the ADC's
The subtraction of $V_{\text {DAC }}$ from $V_{\text {in }}$ gives the quantization error, which, after amplification, determines the new residue voltage

$$
V_{\text {res }}(j)=K_{j}\left\{V_{\text {res }}(j-1)-V_{D A C}\left(b_{j}\right)\right\}
$$

The dynamic range of the residue equals that of the input if, for an $n_{j}$-bit DAC, the gain is $2^{n_{j}} \rightarrow$ this is frequently used, as it allows the use of the same reference voltages in all stages

(a) $\rightarrow$ input between $-\mathrm{V}_{\mathrm{R}}$ and $\mathrm{V}_{\mathrm{R}}, 1$ bit $\rightarrow$ the DAC subtracts $-1 / 2 \mathrm{~V}_{\mathrm{R}}$ when the input is negative, and $1 / 2 \mathrm{~V}_{\mathrm{R}}$ when the input is positive; with $\mathrm{K}=2$, the residue varies again between $-\mathrm{V}_{\mathrm{R}}$ and $\mathrm{V}_{\mathrm{R}}$
(b) $\rightarrow 3$ bits $\rightarrow$ the amplitude of the quantization error is at most $\pm \mathrm{V}_{\mathrm{R}} / 8$, and $\mathrm{K}=8$ makes the residue vary again between $-V_{R}$ and $V_{R}$

## Residue generation



## Accuracy requirements

Accuracy requirements are of course greater in the first few stages - most demanding is the input $\mathrm{S} \& \mathrm{H}$
Non-idealities of ADC, DAC, and K cause limitations similar to those studied in the two-step ADC - threshold errors in the ADC cause the residue to be either greater or lower than full scale at the break points


However, the residue generator can still correctly provide the difference between the analog input and the quantized signal from the DAC - the error will be generated by the ADC in the next stage, since this ADC will not be able to correctly convert residues outside $\pm \mathrm{V}_{\text {ref }}$ (this observation is the basis of the digital error correction technique to be treated next)

## Accuracy requirements - II

We observed that in the two-step ADC the DAC errors modified the residue over a whole LSB segment, impacting the INL $\rightarrow$ also in the pipeline ADC the accuracy of any DAC, referred to the input of the pipeline, must be better than the required INL, and lower than 1LSB to ensure monotonicity (after a few stages, DAC linearity is of minor concern)
If the interstage gain has an error, $G=2^{n_{j}}(1+\delta G)$, the slope of the residue is either increased or decreased, causing an error that is zero in the middle, and maximum at the ends of the segments - at the break points, the error inverts its sign, giving rise to a step change

$$
\Delta V=2 V_{r e f} \delta G
$$

which, referred to the input, must again comply with INL requirements and monotonicity

## Digital error correction

We observed that the dynamic range of the residue can exceed the limits, but no error is made until the following ADC cannot convert correctly an out-of-range signal
Reduce the interstage gain? Difficult, as an attenuation different from 1/2 is difficult to account for

Add extra levels to the ADC of the stage? Yes, because the redundant levels 1) avoid generating out-of-range residues, and 2) provide info to the digital domain (hence the name of the technique) to fully compensate for the ADC error
Consider a 1-b DAC, which usually requires an ADC with 1 threshold - to ensure enough redundancy, at least 2 thresholds are needed - since a 1-b ADC needs 1 threshold and a 2-b ADC needs 3, the use of 2 thresholds is referred to as a 1.5-b conversion (very popular)

It is possible to apply the same approach to multi-bit DACs

## Digital error correction - II

The input is divided into 3 regions: one below the lower threshold $\left(\mathrm{V}_{\mathrm{th}, \mathrm{L}}\right)$, one between the two thresholds, and one above the upper threshold $\left(\mathrm{V}_{\mathrm{th}, \mathrm{H}}\right)$ - if the separation of the two thresholds is large enough, a signal below $\mathrm{V}_{\mathrm{th}, \mathrm{L}}$ is "certainly negative", and above $\mathrm{V}_{\mathrm{th}, \mathrm{H}}$ is "certainly positive"
Uncertainty arises in the middle region $\rightarrow$ close to the zero crossing, an error may lead to a residue exceeding the limits

The residue generator adds $\mathrm{V}_{\text {ref }} / 2$ if the ADC provides a certain 0 , and subtracts $\mathrm{V}_{\text {ref }} / 2$ if the ADC provides a certain 1 - but does not do anything in the uncertainty region $\rightarrow$ here the residue is the simple amplification by 2 of the input, shown below with correct (left) and incorrect thresholds


## Digital error correction - III

Errors $\delta_{\mathrm{th}, \mathrm{L}}, \delta_{\mathrm{th}, \mathrm{H}}$ on the thresholds change the value of the residue at the break points, but the residue remains within $\pm \mathrm{V}_{\text {ref }}$ if $\delta_{\mathrm{th}, \mathrm{L}}\left(\delta_{\mathrm{th}, \mathrm{H}}\right)$ is lower than $\mathrm{V}_{\mathrm{th}, \mathrm{L}}\left(\mathrm{V}_{\mathrm{th}, \mathrm{H}}\right)$ - in the case below, the input range for "certain" 0 diminishes, and increases for "certain" 1
The info provided by the 1.5-b converter is a flag in LSB position, which is set to 1 when the uncertainty region is detected $\rightarrow$ 00 is certain 0, 10 is certain 1, and 01 is uncertainty.
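
The tolerance to threshold errors can be checked numerically. The sketch below is a behavioral model only: it assumes nominal thresholds at $\pm \mathrm{V}_{\text {ref }} / 4$ (the values used in the numerical example of these notes), an ideal gain of 2, and hypothetical function names (`residue_1p5b`, `max_abs_residue`). It sweeps the input across $\pm \mathrm{V}_{\text {ref }}$ and reports the largest residue magnitude for given threshold shifts.

```python
def residue_1p5b(v_in, v_th_l, v_th_h, v_ref=1.0):
    """Residue of an ideal 1.5-b stage with gain 2 (illustrative model)."""
    if v_in < v_th_l:
        # "certain 0": add V_ref/2, then amplify by 2
        return 2.0 * (v_in + v_ref / 2.0)
    if v_in > v_th_h:
        # "certain 1": subtract V_ref/2, then amplify by 2
        return 2.0 * (v_in - v_ref / 2.0)
    # uncertainty region: plain amplification by 2
    return 2.0 * v_in


def max_abs_residue(delta_l, delta_h, v_ref=1.0, points=4001):
    """Largest |residue| over an input sweep, for threshold shifts delta_l, delta_h."""
    v_th_l = -v_ref / 4.0 + delta_l   # assumed nominal lower threshold: -V_ref/4
    v_th_h = +v_ref / 4.0 + delta_h   # assumed nominal upper threshold: +V_ref/4
    worst = 0.0
    for k in range(points):
        v_in = -v_ref + 2.0 * v_ref * k / (points - 1)
        worst = max(worst, abs(residue_1p5b(v_in, v_th_l, v_th_h, v_ref)))
    return worst
```

In this model, shifts smaller than $\mathrm{V}_{\text {ref }} / 4$ keep the residue within $\pm \mathrm{V}_{\text {ref }}$, while a shift such as `max_abs_residue(0.0, 0.3)` pushes it out of range.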


## Digital error correction - IV

The digital logic sums the outputs, taking into account the weight of each stage - the gain of each stage is $2 \rightarrow$ the weight of the uncertainty flag is equal to the MSB in the next stage (adders are needed)
In the example below, notice that the uncertainty in the $3^{\text {rd }}$ stage is corrected by the 10 of the next two stages, while the 01 of the $4^{\text {th }}$ and $5^{\text {th }}$ stages are fixed to 0
The last stage does not have an extra threshold, as its uncertainty cannot be resolved by further comparisons
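
The weighted sum described above can be written compactly. As a sketch, assume $N$ stages, where the first $N-1$ are 1.5-b stages with codes $d_{i} \in\{0,1,2\}$ (for 00, 01, 10) and the last is a 1-b stage with output $b_{N}$. The symbols $d_{i}$, $b_{N}$ and $D$ are introduced here for illustration only:

$$
D=\sum_{i=1}^{N-1} d_{i} \cdot 2^{N-1-i}+b_{N}
$$

For instance, with $N=6$, the stage codes 01, 00, 01, 10, 00 and a last bit of 1 give $D=16+0+4+4+0+1=25$, i.e. 011001.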


## Digital error correction - example

Assume $\pm \mathrm{V}_{\text {ref }}= \pm 1$, and $\mathrm{V}_{\text {in }}=2 \mathrm{~V}_{\text {ref }}\left(1 / 4+1 / 8+1 / 64+10^{-3}\right)-\mathrm{V}_{\text {ref }}=-0.21675$, whose ideal 6-bit code is 011001
If $\mathrm{V}_{\mathrm{th}, \mathrm{L}}=-0.25, \mathrm{~V}_{\mathrm{th}, \mathrm{H}}=0.25$, we have the following residue sequence: $-0.21675,-0.4335,0.133,0.266,-0.468,0.064$, corresponding to the binary outputs: $01,00,01,10,00,1 \rightarrow 011001$, as expected
If the thresholds are now shifted to $\mathrm{V}_{\mathrm{th}, \mathrm{L}}=-0.30, \mathrm{~V}_{\mathrm{th}, \mathrm{H}}=0.10$, the sequence becomes $-0.21675,-0.4335,0.133,-0.734,-0.468,0.064$, corresponding to the binary outputs: $01,00,10,00,00,1 \rightarrow$ still 011001 !
With thresholds $\mathrm{V}_{\mathrm{th}, \mathrm{L}}=-0.10, \mathrm{~V}_{\mathrm{th}, \mathrm{H}}=0.30$, the sequence becomes $-0.21675,0.5665,0.133,0.266,0.532,0.064$, corresponding to the binary outputs: $00,10,01,01,10,1 \rightarrow$ again 011001 !
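
The three cases above can be reproduced with a short behavioral model. This is a minimal sketch under stated assumptions: ideal gain of 2, ideal DAC levels, five 1.5-b stages followed by a 1-b last stage with its threshold at 0, and a hypothetical function name (`pipeline_1p5b`). Stage codes 0, 1, 2 stand for 00, 01, 10.

```python
def pipeline_1p5b(v_in, v_th_l, v_th_h, n_stages=6, v_ref=1.0):
    """Behavioral 1.5-b/stage pipeline with digital error correction (sketch)."""
    codes = []   # per-stage code: 0 -> 00, 1 -> 01 (uncertainty), 2 -> 10
    v = v_in
    for _ in range(n_stages - 1):
        if v < v_th_l:
            code, v = 0, 2.0 * v + v_ref    # certain 0: 2 * (v + V_ref/2)
        elif v > v_th_h:
            code, v = 2, 2.0 * v - v_ref    # certain 1: 2 * (v - V_ref/2)
        else:
            code, v = 1, 2.0 * v            # uncertainty: plain x2
        codes.append(code)

    # Last stage: single threshold (assumed at 0), no redundancy
    last_bit = 1 if v > 0.0 else 0

    # Digital correction: the flag (LSB) of each stage has the same
    # weight as the MSB of the next stage, so the codes overlap by one bit.
    word = last_bit
    for i, code in enumerate(codes):
        word += code << (n_stages - 2 - i)
    return codes, last_bit, word


if __name__ == "__main__":
    for th_l, th_h in [(-0.25, 0.25), (-0.30, 0.10), (-0.10, 0.30)]:
        codes, last_bit, word = pipeline_1p5b(-0.21675, th_l, th_h)
        print(th_l, th_h, codes, last_bit, format(word, "06b"))
```

The stage codes differ between the three threshold settings, but the summed word is the same in each case.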

## Dynamic performances

Depend on slew-rate and bandwidth of S\&H and residue generator - a step with final amplitude $\bar{V}_{\text {out }}$ becomes a ramp during slewing, and turns into an exponential when feedback takes over - the equations are (as already derived in an earlier lecture)

$$
\begin{array}{ll}
V_{\text {out }}(t)=S R \cdot t & t<t_{\text {slew }} \\
V_{\text {out }}(t)=\bar{V}_{\text {out }}-\Delta V \cdot e^{-\left(t-t_{\text {slew }}\right) / \tau} & t>t_{\text {slew }} \\
\Delta V=S R \cdot \tau ; \quad t_{\text {slew }}=\bar{V}_{\text {out }} / S R-\tau ; & \tau=1 /\left(\beta \omega_{T}\right)
\end{array}
$$

The next stage $\mathrm{S} \& \mathrm{H}$ samples after $\mathrm{T}_{\mathrm{S}} / 2$, and the (non-linear) settling error becomes

$$
\Delta V_{\text {err }}=\Delta V \cdot e^{-\left(T_{S} / 2-t_{\text {slew }}\right) / \tau}
$$

It is important to have a small error in the first stages, but in later stages it is less important, because the input-referred error is divided by the gain of all preceding stages $\rightarrow$ op-amps in later stages do not need to be as high-performing $\rightarrow$ it is possible to save power
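
The settling equations above can be evaluated with a few lines of code. The sketch below uses hypothetical function names (`settling_error`, `input_referred`) and adds two cases that the equations do not cover, both labeled as assumptions in the comments: a step too small to cause slewing, and a sampling instant that falls before the end of slewing.

```python
import math


def settling_error(v_step, slew_rate, tau, t_sample):
    """Error remaining at t_sample for an output step v_step (sketch)."""
    delta_v = slew_rate * tau            # Delta V = SR * tau
    if v_step <= delta_v:
        # Assumption: no slewing at all, purely exponential settling
        return v_step * math.exp(-t_sample / tau)
    t_slew = v_step / slew_rate - tau    # t_slew = V_out / SR - tau
    if t_sample <= t_slew:
        # Assumption: sampled while still slewing, output is SR * t
        return v_step - slew_rate * t_sample
    # Delta V_err = Delta V * exp(-(T_S/2 - t_slew) / tau)
    return delta_v * math.exp(-(t_sample - t_slew) / tau)


def input_referred(error, n_preceding_stages, stage_gain=2.0):
    """Divide an error by the gain of all preceding stages."""
    return error / stage_gain ** n_preceding_stages


if __name__ == "__main__":
    # Illustrative numbers only: 1 V step, SR = 1 V/ns, tau = 0.2 ns, T_S/2 = 2.5 ns
    err = settling_error(1.0, 1e9, 0.2e-9, 2.5e-9)
    print(err, input_referred(err, 3))
```

With `t_sample` set to $T_{S} / 2$, the same error is divided by 8 when it occurs after three gain-of-2 stages, which is why later stages can use slower op-amps.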

## Behavioral simulations

Much faster than transistor-level simulations, give a good starting point for the design - here, the achievable SFDR and SNR vs. the slew-rate of the key blocks are estimated - the different impact of the different blocks is also very easy to determine


## Residue generator

Capacitors are used for both sampling and D/A conversion; the residue is

$$
V_{\text {res }}=N V_{\text {in }}-\sum_{i=1}^{N} V_{D A C}(i)
$$

where the DAC control bits are the thermometer-coded output of the ADC
An amplification of $2^{n}$ gives rise to a feedback factor $\beta=1 /\left(2^{n}+1\right) \rightarrow$ to keep the same $\tau$, the unity-gain frequency of the op-amp must grow as $2^{n}$ $\rightarrow$ more than 3 bits per stage are not suitable for high conversion rates
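
The growth of the required unity-gain frequency can be made concrete with a small calculation. The sketch below combines $\beta=1 /\left(2^{n}+1\right)$ with the relation $\tau=1 /\left(2 \pi \beta f_{T}\right)$ used earlier in these notes; the function name `required_f_t` and the example value of $\tau$ are illustrative assumptions.

```python
import math


def required_f_t(n_bits_stage, tau):
    """Unity-gain frequency needed for a time constant tau (sketch)."""
    beta = 1.0 / (2 ** n_bits_stage + 1)        # feedback factor for a gain of 2^n
    return 1.0 / (2.0 * math.pi * beta * tau)   # from tau = 1 / (2 pi beta f_T)


if __name__ == "__main__":
    tau = 0.4e-9   # illustrative value only
    for n in range(1, 5):
        ratio = required_f_t(n, tau) / required_f_t(1, tau)
        print(n, required_f_t(n, tau), ratio)
```

Relative to a 1-b stage, the required $f_{T}$ scales as $\left(2^{n}+1\right) / 3$, i.e. by factors of 1, 5/3, 3 and 17/3 for $n=1$ to 4.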


## Cyclic (or algorithmic) converter

Uses the same cell for converting one bit per clock cycle $\rightarrow n+1$ clock periods to convert $n$ bits; the voltage at the output of OTA 1 is the same as the residue of the first stage of a 1-b per stage pipeline ($\times 2$) - speed can be improved with the flip-around circuit shown below, where $\beta=1 / 2$, since input and feedback capacitance are equal (however, is the op-amp really open-loop during sampling?)
If $C_{U}(1)=C_{U}(2)$, the input voltage is doubled (one cap changes polarity compared to the other), but the gain for $\mathrm{V}_{\mathrm{DAC}}$ is only $-1 \rightarrow$ it is necessary to double the value of the references.
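
The cyclic conversion can be sketched behaviorally as a loop that reuses one multiply-by-2 residue cell. This is an idealized model, not a description of the circuit: it assumes a bipolar input within $\pm \mathrm{V}_{\text {ref }}$, a comparator threshold at 0, an exact gain of 2, and a hypothetical function name (`cyclic_adc`).

```python
def cyclic_adc(v_in, n_bits, v_ref=1.0):
    """Ideal 1-b per cycle algorithmic conversion, MSB first (sketch)."""
    bits = []
    v = v_in
    for _ in range(n_bits):
        bit = 1 if v > 0.0 else 0            # 1-b decision, threshold assumed at 0
        bits.append(bit)
        # Residue: double the input, then subtract +V_ref or -V_ref
        v = 2.0 * v - (v_ref if bit else -v_ref)
    return bits


if __name__ == "__main__":
    print(cyclic_adc(-0.21675, 6))
```

For the input $-0.21675$ used in the digital error correction example, six cycles of this model return the bits 0, 1, 1, 0, 0, 1.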


