Clock-Deskew Buffer Using a SAR-Controlled Delay-Locked Loop
Abstract
A successive approximation register-controlled
delay-locked loop (SARDLL) has been fabricated in a 0.25-µm
standard n-well DPTM CMOS process to realize a fast-lock
clock-deskew buffer for long distance clock distribution. This
DLL adopts a binary search method to shorten lock time while
maintaining tight synchronization between input and output
clocks. The measured lock time of the proposed SARDLL is within
30 clock cycles at 100-MHz clock input. The power dissipation is
3.3 mW (not including off-chip drivers) at a 1.1-V supply voltage
while the measured rms and peak-to-peak jitter are 11.3 ps and
95 ps, respectively.
Index Terms—Clock skew, lock time, PVTL, SARDLL, static
phase error.
I. INTRODUCTION
With the rapid advances in semiconductor technologies,
modern digital systems operating at several hundred
megahertz have been successfully developed for many years.
Since more and more IC modules are integrated on the
same printed-circuit board (PCB), the clock-skew problem
becomes significant and is one of the bottlenecks
for high-performance systems. The clock-skew problem exists
in several different situations. For example, the input clock
driver in any chip introduces an uncertain time delay
between the internal clock and the external clock. Thus, the
internal clocks in a multi-chip system become asynchronous,
and problems occur when data transfer between chips is
needed. This phenomenon will become more serious for
chips operating at gigahertz frequencies in the future. In addition, on a large
PCB, the lengths of the traces between different chips and the clock
generator may differ from each other. Hence, the clock-skew
problem will inevitably exist. Similar problems also happen in
a multi-board system in which different boards are connected
through cables. In brief, the clock-skew problem
comes from the different propagation delays of system clocks on
the board or in a chip, and it usually depends on process,
voltage, temperature, and loading (PVTL), which makes it a
complicated issue.
Manuscript received October 26, 1999; revised February 1, 2000.
This work was supported by the National Science Council under Grant
NSC89-2215-E-002-024.
G.-K. Dehng, C.-Y. Yang, and S.-I. Liu are with the Department of Electrical
Engineering, National Taiwan University, Taipei 106, Taiwan, R.O.C. (e-mail:
lsi[at]cc.ee.ntu.edu.tw).
J.-M. Hsu was with the Department of Electrical Engineering, National
Taiwan University, Taipei 106, Taiwan, R.O.C. He is now with the Electronics
Research and Service Organization, Industrial Technology Research Institute,
Hsinchu 310, Taiwan, R.O.C. (e-mail: jmhsu[at]erso.itri.org.tw).
Publisher Item Identifier S 0018-9200(00)06432-5.
Reducing the clock skew can not only further increase
system clock frequency but also avoid system malfunction.
Phase-locked loops (PLL’s) and delay-locked loops (DLL’s)
have been widely adopted to solve the clock-skew problem.
Such circuits are called clock-deskew buffers. A DLL
consists of a phase detector (PD) or a phase comparator (PC),
a variable delay line, and a controller to convert the PD’s
output signal to a control signal for the delay line. It detects the
phase error between the input clock and its output clock and
automatically tunes the delay line to insert an optimal delay
time between them for clock synchronization.
The design of DLL’s can be classified into two types: analog
and digital. The delay line in an analog DLL is controlled
through a loop filter, as shown in Fig. 1 [1]. Two off-chip
transmission lines are involved in the clock-deskew system.
One connects the clock-deskew buffer’s output node with one
of the PC’s input nodes. The other connects the clock-deskew
buffer’s output node with the remote chip. Assume that the
two transmission lines, as well as the clock input buffers in
front of the PC, are identical. A parameter called loop delay
[2] can be defined as the time delay for the input clock signal
to propagate through the input clock buffer, the variable delay
line, the output clock buffer, and the feedback transmission
line. It can be given as
$$T_{loop} = t_{in} + t_{dl} + t_{out} + t_{fb} \qquad (1)$$
where $t_{in}$, $t_{dl}$, $t_{out}$, and $t_{fb}$ denote the delay time of
the input clock buffer, the variable delay line, the output clock
buffer, and the feedback transmission line, respectively. Another
propagation delay, $T_{remote}$, is defined as the time delay for the
input clock signal to propagate through the input clock buffer,
the variable delay line, the output clock buffer, and the forward
transmission line. It is given as
$$T_{remote} = t_{in} + t_{dl} + t_{out} + t_{fwd} \qquad (2)$$
where $t_{fwd}$ is the clock delay time of the forward transmission
line. As long as the input capacitance of the clock-deskew buffer
and the remote chip are close to each other, $t_{fwd}$ is close to $t_{fb}$.
In other words, $T_{remote}$ is close to $T_{loop}$. When the DLL is
locked, $T_{loop}$, as well as $T_{remote}$, becomes an integer multiple
of the clock period and the two input signals of the PC are synchronous.
At the same time, the remote clock will also coincide
with the input clock.
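The lock condition can be illustrated with a small numeric sketch. It treats
the loop delay as the sum of the input-buffer, delay-line, output-buffer, and
feedback-line delays, and solves for the delay-line value that makes this sum
an integer number of clock periods. The function name and all delay values
below are hypothetical and are not taken from the measurements in this paper.

```python
def required_line_delay(period_ps, t_in_ps, t_out_ps, t_fb_ps):
    """Return (delay-line delay in ps, N) so that the loop delay is N clock periods.

    All arguments are integers in picoseconds. The names are illustrative only.
    """
    fixed_ps = t_in_ps + t_out_ps + t_fb_ps
    # Smallest integer N with N * period >= fixed delay (integer ceiling division).
    n_periods = -(-fixed_ps // period_ps)
    return n_periods * period_ps - fixed_ps, n_periods


# Hypothetical example: 100-MHz clock (10 000 ps period), assumed buffer and
# transmission-line delays of 800 ps, 1 200 ps, and 3 500 ps.
line_ps, n = required_line_delay(10_000, 800, 1_200, 3_500)
print(line_ps, n)  # 4500 1
```

With these assumed numbers, the delay line has to contribute 4500 ps so that
the loop delay is exactly one clock period. If the forward and feedback lines
have equal delay, the remote clock is then aligned with the input clock too.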
Since the delay line in an analog DLL is adjusted in a continuous
manner, the analog approach can result in smaller static
phase error than the digital approach. Besides, low-power operation
and small chip area can be achieved in an analog DLL
[3]. However, it is not suitable for future low-voltage applications
because it cannot provide enough delay range under low
supply voltages. Moreover, it is more susceptible to process
variations and less immune to power-supply noise. In contrast,
the digital approach [1], [2], [4]–[6] provides more robust
operation against power-supply noise and PVTL effects. Besides,
it has low standby current consumption and can exhibit shorter
lock time than the analog one, at the expense of an inherent quantization
phase error.
Fig. 1. Analog DLL.
Fig. 2. Register-controlled DLL.
In this paper, a low-voltage all-digital DLL-based clock-deskew
buffer is designed in a 0.25-µm standard n-well
double-poly triple-metal (DPTM) CMOS process for long
distance clock distribution. Two conventional architectures of
digital DLL’s are reviewed and a successive approximation
register-controlled DLL (SARDLL) is introduced in
Section II. Section III describes the building blocks of the
proposed SARDLL. Experimental results are presented in
Section IV and Section V gives the conclusions.
II. CONVENTIONAL DIGITAL DLL’S AND SARDLL
Fig. 2 shows the block diagram of the register-controlled DLL
(RDLL) [1], [4], [5]. The feedback clock signal is the delayed
version of the input clock signal, and the shift register controls
the amount of the delay time. The PC compares the phases of
the input clock signal and the output clock signal. The outputs of
the PC, Fast, Just, and Slow, are used to control the shift register.
The input clock signal is a common input for every delay stage.
At any time, only one bit of the shift register is active to select
a point of entry of the delay line for the input clock signal. The
number of the delay stages which the input clock signal goes
through determines total amount of delay. When Fast is active,
the feedback clock leads the input clock, and the high bit in the
shift register will be shifted left to increase the delay time. When
Slow is active, the situation is opposite and the high bit in the
shift register will be shifted right to decrease the delay time.
When Just is active, the phase error between the input clock and
the output clock is within one unit delay, and the data in the shift
register will be held. Under this circumstance, the loop is locked
and will not alter until the phase error exceeds the unit delay
again. The resolution of the RDLL is determined by the unit
delay of the delay line and the total delay time of the delay line
determines the DLL’s deskew range and the lowest operation
frequency. Wider deskew range or lower operation frequency
can be achieved by adding more delay stages in the delay line.
However, larger chip area would be the penalty.
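The RDLL behavior described above can be sketched as a small behavioral
model. This is only an illustration under stated assumptions: delays are
expressed in unit delays, the shift register is modeled as a single integer
position (the number of stages the clock passes through), and the control
moves one position per input clock period. The function names are
hypothetical.

```python
def pc_output(needed_units, actual_units):
    """Phase comparator model: returns 'Fast', 'Just', or 'Slow'."""
    error = needed_units - actual_units
    if abs(error) < 1:          # phase error within one unit delay
        return "Just"
    # Too little delay: the feedback clock leads the input clock.
    return "Fast" if error > 0 else "Slow"


def rdll_lock(needed_units, stages=64, start=32):
    """Step the one-hot position until 'Just'; return (position, cycles)."""
    position, cycles = start, 0
    while True:
        flag = pc_output(needed_units, position)
        if flag == "Just":
            return position, cycles      # register contents are held
        if flag == "Fast" and position < stages - 1:
            position += 1                # shift to increase the delay
        elif flag == "Slow" and position > 0:
            position -= 1                # shift to decrease the delay
        else:
            raise ValueError("required delay is outside the deskew range")
        cycles += 1


print(rdll_lock(40.3))  # (40, 8)
```

In this model the residual phase error after lock is always smaller than one
unit delay, which is the resolution limit mentioned in the text.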
In order to solve the problem stated above, a counter-controlled
DLL (CDLL) has been proposed [2], [6]. Fig. 3
shows the block diagram of the CDLL. It is similar to the
RDLL, except that an up/down counter substitutes for the shift
register to control the delay line. In addition, the delay line is
designed in a binary-weighted manner and no longer consists
of delay stages with equal delay time. The n-bit control word
from the up/down counter in Fig. 3 determines whether the
input clock goes through the delay stage or bypasses it. The
principle of operation is similar to that of a RDLL except
that the n-bit up/down counter counts up or down to control
the delay line. Compared with the RDLL, if 64 delay stages
are required in a RDLL, only 6 delay stages are required in
a CDLL. Besides, the 64-bit shift register in a RDLL can be
replaced by a 6-bit up/down counter. Assume that the 64-bit
shift register in a RDLL and the 6-bit up/down counter in a
CDLL occupy equal chip area. If both the RDLL and the CDLL
achieve fine resolution through interpolation, then the delay
line of the RDLL will have a larger offset delay time than that of the CDLL
and occupy larger chip area while the tuning ranges
are the same. Using a CDLL, the chip area of the DLL could
possibly be reduced while maintaining the same deskew range
and having the same limitation for low-frequency operation as
in a RDLL.
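The stage-count saving of the binary-weighted line can be checked with a
short sketch. It assumes an idealized line in which stage i contributes
2^i unit delays when its control bit is 1 and nothing when it is bypassed.
Offset delays and interpolation are ignored, and the function names are
hypothetical.

```python
def stage_weights(bits=6):
    """Unit-delay weight of each binary-weighted stage, LSB first."""
    return [1 << i for i in range(bits)]   # [1, 2, 4, 8, 16, 32] for 6 bits


def line_delay(word, bits=6):
    """Total delay in unit delays for a given control word."""
    total = 0
    for i, weight in enumerate(stage_weights(bits)):
        if (word >> i) & 1:     # bit set: clock goes through this stage
            total += weight     # bit clear: stage is bypassed
    return total


print(len(stage_weights(6)))    # 6
print(line_delay(0b100101))     # 37
```

Six stages therefore cover every delay from 0 to 63 unit delays, the same
64 settings that the equal-delay line reaches with 64 stages.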
1130 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 35, NO. 8, AUGUST 2000
Fig. 3. Counter-controlled DLL.
Fig. 4. Flowchart for weighing sequence.
Besides the cost of chip area, another important parameter
to evaluate the performance of DLL’s is the lock time. In the
case of the RDLL, if the point of entry of the delay line is
chosen in the middle at first, the longest lock time will be 32
input clock periods for the 64-stage delay line. In the case of
the CDLL, if the initial value of the counter is set as 32, the
longest lock time will also be 32 input clock periods when a
6-bit up/down counter is used. Both of the digital DLL’s mentioned
above exhibit the same lock time. Shorter lock time is
preferable, especially in the design of high-speed memory applications.
A principle called synchronous mirror delay (SMD)
has been proposed [7] to achieve fast-lock operation within two
input clock periods. More or less, it behaves like a measure-controlled
DLL [1]. However, tight synchronization between input
clock and output clock is not provided in this case since SMD
is an open-loop system, not a closed-loop system.
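The 32-period figure quoted above for the RDLL and the CDLL can be reproduced
with simple arithmetic, assuming the control moves exactly one step per input
clock period and that any of the 64 settings may be the target. The helper
name is hypothetical.

```python
def worst_case_lock_cycles(settings=64, start=32):
    """Largest number of single steps needed to reach any setting from start."""
    return max(abs(target - start) for target in range(settings))


print(worst_case_lock_cycles(64, 32))  # 32
print(worst_case_lock_cycles(64, 0))   # 63
```

Starting in the middle halves the worst case compared with starting at one
end, but the lock time still grows linearly with the number of settings.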
The operation mechanism of the general DLL’s can be treated
as a searching sequence. The closed loop tends to search for an
optimal delay and tries to insert it between the input clock and
output clock. From this point of view, the binary search algorithm
can be applied to reduce the searching time. The flowchart
in Fig. 4 illustrates the algorithm when a 3-bit binary-weighted
delay line is used [8]. In the beginning, the most significant bit
(MSB) of the control word is set to 1, and all the other bits are
set to 0. The PC examines whether the output clock leads the
input clock or not. If so, the MSB remains high. If not, it is set
to be low and held constant. In this way, the MSB is determined.
The process is then repeated for each following bit until the least
significant bit (LSB) is determined. The binary search algorithm
is the main operation principle of the proposed SARDLL. Theoretically,
when a 6-bit binary-weighted delay line is used, a
SARDLL can achieve lock time of six clock periods, which is
faster than a RDLL or a CDLL. In this paper, the SARDLL is
developed to shorten the lock time while maintaining tight synchronization
for the purpose of long distance clock distribution.
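The binary search described above can be sketched in a few lines. This is a
behavioral illustration, not the circuit in this paper: the phase comparator
is modeled as a hypothetical callback `output_leads(word)` that returns True
when the output clock still leads the input clock for the trial control word,
meaning the inserted delay is not yet too long and the bit under test is kept.

```python
def sar_search(output_leads, bits=3):
    """Successive approximation from MSB to LSB; returns (word, trace)."""
    word = 0
    trace = []
    for bit in reversed(range(bits)):
        trial = word | (1 << bit)     # tentatively set the current bit to 1
        if output_leads(trial):       # output clock leads: keep the bit high
            word = trial
        # otherwise the bit stays low and is held constant
        trace.append(word)
    return word, trace


# Assumed example: the ideal delay is 5.4 unit delays (3-bit line).
print(sar_search(lambda w: w <= 5.4, bits=3))
# (5, [4, 4, 5])

# Assumed example: the ideal delay is 37.2 unit delays (6-bit line).
print(sar_search(lambda w: w <= 37.2, bits=6))
# (37, [32, 32, 32, 36, 36, 37])
```

Each bit takes one comparison, so a 6-bit word is resolved in six decisions,
compared with up to 32 single steps for a linear search over the same range.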
III. CIRCUIT DESCRIPTION
The block diagram of the SARDLL is shown in Fig. 5. As in the
conventional digital DLL’s, it contains a PC and a digital-controlled
delay line. In addition, it has a frequency divider, an
initial circuit (INCKT), and a 6-bit SAR [8], [9] that provides the
control word for the delay line. Again, two off-chip transmission
lines are involved in the clock-deskew system. Each block
in the SARDLL is described as follows.
A. Digital-Controlled Delay Line
Fig. 6(a) shows the entire binary-weighted digital-controlled
delay line. The number on the top of each stage denotes the
number of unit delay it provides. The delay line is divided into
two sections, a coarse section and a fine section. The coarse section
consists of 7 delay stages, and each stage provides 8 unit
delays. Instead of implementing the 16-unit and 32-unit delays
as two single stages with longer delay time, the 16-unit delay stage
is composed of two 8-unit delay stages and
the 32-unit delay stage is composed of four 8-unit
delay stages. This is because the longer the delay time
per stage is, the more difficult it is for the rise time and fall time
of the delay stage to match each other. Moreover, delay
stages with longer delay time suffer from larger variations
in the clock duty cycle, especially at low supply voltages.
To avoid such potential variations, more stages with equal
delay time are cascaded in the design at the cost of slightly
increased chip area.
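The stage count of the coarse section follows from simple arithmetic, sketched
below. Two points are assumptions made only for illustration: that the three
upper bits of the 6-bit control word select the 8-, 16-, and 32-unit delays,
and that the fine section supplies the remaining 0 to 7 unit delays. The
function names are hypothetical.

```python
COARSE_STAGE_UNITS = 8  # every coarse stage provides 8 unit delays


def coarse_stage_counts(weights=(8, 16, 32)):
    """Number of cascaded 8-unit stages used to build each coarse weight."""
    return {w: w // COARSE_STAGE_UNITS for w in weights}


def split_control_word(word):
    """Return (coarse stages in the path, fine unit delays, total unit delays)."""
    coarse_stages = word >> 3     # assumed: upper three bits drive the coarse section
    fine_units = word & 0b111     # assumed: lower three bits drive the fine section
    return coarse_stages, fine_units, coarse_stages * COARSE_STAGE_UNITS + fine_units


print(coarse_stage_counts())                # {8: 1, 16: 2, 32: 4}
print(sum(coarse_stage_counts().values()))  # 7
print(split_control_word(37))               # (4, 5, 37)
```

The counts 1 + 2 + 4 give the 7 coarse stages stated in the text, and every
clock edge passes only through stages of the same 8-unit design.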

V. CONCLUSION

In this paper, a low-voltage clock-deskew buffer fabricated in
a 0.25-µm standard CMOS technology is presented. The clock-deskew
buffer utilizes a SARDLL technique. This architecture
uses a binary-search method to quickly find an optimal delay
time between the input and output clocks for clock synchronization,
and achieves short lock time.