20-11-2012, 12:13 PM
Clock Distribution
Defining Clock Skew and Jitter
Clock skew
The deterministic (knowable) difference in clock arrival times at each flip-flop
Caused mainly by imperfect balancing of clock tree/mesh
Can be deliberately introduced using delay blocks in order to time-borrow
Accounted for in STA by calculating the clock arrival times at each flip-flop
Clock jitter
The random (unknowable, except for its distribution) difference in clock arrival times at each flip-flop
Caused by on-die process, Vdd, and temperature variation; PLL jitter; crosstalk; and the limited accuracy of static timing analysis (STA) and layout parameter extraction (LPE)
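The definitions above can be tied to timing checks with a small sketch. It uses the common textbook setup and hold inequalities, which the slides do not spell out; the function names, the sign convention (skew = capture arrival minus launch arrival) and the numbers are illustrative assumptions, and treating jitter as a flat penalty is a pessimistic simplification.

```python
def setup_slack(period, clk_to_q, logic_max, t_setup, skew=0.0, jitter=0.0):
    """Setup slack in ps (negative means a violation).

    Assumed convention: skew = capture clock arrival - launch clock arrival,
    so positive skew gives the data more time (time borrowing).
    Jitter is unknowable per cycle, so it is subtracted as a margin.
    """
    return period + skew - jitter - (clk_to_q + logic_max + t_setup)


def hold_slack(clk_to_q_min, logic_min, t_hold, skew=0.0, jitter=0.0):
    """Hold slack in ps; the same positive skew that helps setup hurts hold."""
    return clk_to_q_min + logic_min - (t_hold + skew + jitter)


# Hypothetical path: 1000 ps period, 100 ps of jitter.
no_skew = setup_slack(1000, 100, 800, 50, jitter=100)
borrowed = setup_slack(1000, 100, 800, 50, skew=60, jitter=100)
hold_after_borrow = hold_slack(60, 40, 30, skew=60)
print(no_skew, borrowed, hold_after_borrow)
```

In this example the path fails setup without skew and passes once 60 ps of skew is deliberately added, at the cost of a thinner hold margin on the same flip-flop pair.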
Background
Technology scaling results in:
higher clock frequencies possible and requested by users
prominence of wiring parasitics (R,L,C) in electrical behavior
increasing noise impact on delays
increasing on-chip process variation impact on delays
Existing ASIC clock synthesis flows
Use tree architectures, which are not the best choice for low skew, low jitter, or tolerance to variations
Don't properly address noise issues
Rely on STA to calculate the delays through clock networks
Use inaccurate wiring models
Use noise-sensitive clock circuit topologies
Ignore or crudely estimate process/voltage/temperature variations
Don’t have tight integration of physical synthesis & clock synthesis
Result
Predictability of clock delay is poor: clock uncertainty (i.e., skew + jitter) of 400 ps is not uncommon
Maximum attainable clock frequency is impaired
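To give a feel for how much frequency a 400 ps uncertainty can cost, here is a minimal sketch. It assumes the worst case, where the whole uncertainty is added to the critical path delay; the 1600 ps path delay and the function name are hypothetical, not figures from the slides.

```python
def max_frequency_mhz(path_delay_ps, uncertainty_ps=0.0):
    """Highest clock frequency in MHz if the full uncertainty is budgeted
    on top of the critical path delay (clk-to-q + logic + setup)."""
    min_period_ps = path_delay_ps + uncertainty_ps
    return 1e6 / min_period_ps  # 1 / ps expressed in MHz


ideal = max_frequency_mhz(1600)            # no clock uncertainty
impaired = max_frequency_mhz(1600, 400)    # 400 ps of skew + jitter
print(ideal, impaired)
```

Under these assumptions a quarter of the ideal cycle time goes to clock uncertainty rather than logic.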
Mesh
Sizing of clock distribution networks for high performance CPU chips
Desai et al., DEC [DAC 1996]
goal: size grid interconnect segments with constraints on clock latency and average current
assume: initial grid and interconnect sizes
treating widths as explicit variables => non-linear program; practical only for small networks/trees.
instead, treat widths as implicit and solve using a sequence of network problems.
Results: applied to the clock networks of two actual processors, the DC21064A and DC21164. Results for the DC21064A:
275MHz clock
grid has 1 million edges, 15.5K drivers, 81K receivers
16% reduction in capacitance - without increasing clock latency.
Runtime: 3 days.
Optimal Wire and Transistor Sizing for Circuits with Non-tree Topology
Vandenberghe et al., Stanford University [ICCAD 1997]
RC circuit with tree topology => sizing problem is convex optimization
meshes have R loops; use dominant time constant as measure of delay
solve using semidefinite programming (the dominant time constant is a quasi-convex function of the sizes)
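The dominant time constant used above as the delay measure can be evaluated numerically for a small network. The sketch below is not the semidefinite-programming sizing method of the paper; it only computes the measure for a fixed RC network with grounded capacitors, as the largest tau satisfying C x = tau G x, using plain power iteration. The helper names, the node numbering (node -1 stands for the driver) and the 1 ohm / 1 F values are illustrative assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def conductance_matrix(n, resistors):
    """Build G from (node_a, node_b, ohms) tuples; node -1 is the driver,
    which acts as ground for the natural response."""
    g = [[0.0] * n for _ in range(n)]
    for a, b, ohms in resistors:
        y = 1.0 / ohms
        for p, q in ((a, b), (b, a)):
            if p >= 0:
                g[p][p] += y
                if q >= 0:
                    g[p][q] -= y
    return g


def dominant_time_constant(n, resistors, caps, iters=500):
    """Largest eigenvalue of inv(G) * C, found by power iteration."""
    g = conductance_matrix(n, resistors)
    x = [1.0] * n
    tau = 0.0
    for _ in range(iters):
        y = solve(g, [caps[i] * x[i] for i in range(n)])
        tau = max(abs(v) for v in y)   # eigenvalue estimate
        x = [v / tau for v in y]       # renormalise the eigenvector
    return tau


# A three-node RC chain (tree) versus the same chain with one extra
# resistor closing a loop between nodes 0 and 2 (a minimal "mesh").
chain = [(-1, 0, 1.0), (0, 1, 1.0), (1, 2, 1.0)]
tau_tree = dominant_time_constant(3, chain, [1.0, 1.0, 1.0])
tau_mesh = dominant_time_constant(3, chain + [(0, 2, 1.0)], [1.0, 1.0, 1.0])
print(tau_tree, tau_mesh)
```

Adding the loop resistor lowers the dominant time constant here, which is the kind of non-tree behaviour that a tree-only delay model cannot express. The example ignores the extra wire capacitance a real loop segment would add.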
Summary of Processor Clock Design
Three basic routing structures for global clock
H-tree
low skew, smallest routing capacitance, low power
Floorplan flexibility is poor
Grid or mesh
low skew, increases routing capacitance, worse power
Alpha uses global clock grid and regional clock grids
Spine
Small RC delay because of large spine width
Spine has to balance delays; difficult problem
Routing cap lower than grid but may be higher than H-tree.
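The low skew of the H-tree listed above comes from its symmetry: every leaf is reached through the same total wire length. A minimal sketch of an idealized H-tree makes this concrete. The function name, the alternating horizontal/vertical convention and the dimensions are assumptions for illustration; buffers, loads and blockages are ignored.

```python
def h_tree_leaves(x, y, half, levels, length=0.0, horizontal=True):
    """Return (leaf_x, leaf_y, wire_length_from_root) for an ideal H-tree.

    Each level branches in two directions by `half`; the direction
    alternates, and the branch length halves after every vertical step.
    """
    if levels == 0:
        return [(x, y, length)]
    next_half = half if horizontal else half / 2.0
    leaves = []
    for sign in (-1.0, 1.0):
        nx = x + sign * half if horizontal else x
        ny = y if horizontal else y + sign * half
        leaves += h_tree_leaves(nx, ny, next_half, levels - 1,
                                length + half, not horizontal)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, 4.0, 4)
lengths = {leaf[2] for leaf in leaves}
print(len(leaves), lengths)
```

With four levels the sketch yields 16 leaves on a regular grid, all at the same wire length from the root, so the nominal wire delay to each leaf is identical. The rigid geometry this requires is also why floorplan flexibility suffers.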