26-05-2012, 04:48 PM
Power Efficient Arithmetic Circuits for
Application Specific Processors
Abstract
This thesis presents a study of RT level power optimization techniques in terms of their
applicability to data-flow intensive datapath designs and their efficiency.
The dynamic power management techniques of clock gating and operand isolation are
investigated and their efficiency is evaluated on sample designs. Although clock gating by
itself offers significant power savings at low overhead in sequential blocks, hold conditions
cannot always be extracted when input registers are shared among several resources.
Latch-based operand isolation was also found quite effective, though its savings are
offset by the high overhead; they are evened out in the case of the gate-based
implementation for 32-bit adder/subtractor units.
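
As a rough illustration of why operand isolation reduces switching activity, the sketch below counts bit toggles at the inputs of a shared adder under three schemes: no isolation, gate-based isolation (inputs forced to zero when idle) and latch-based isolation (inputs held when idle). This is a behavioral toy model, not the thesis's experimental setup: the function names, the `mode` parameter and the operand values are all assumptions made for illustration, and toggle counts are only a proxy for dynamic power.

```python
def toggles(prev, cur):
    """Number of bit positions that differ between two consecutive values."""
    return bin(prev ^ cur).count("1")


def adder_input_activity(operands, active, mode="none", width=32):
    """Count bit toggles seen at the inputs of a shared adder.

    operands: list of (a, b) pairs present on the input buses each cycle
    active:   list of booleans, True when the adder result is actually used
    mode:     "none"  - operands always reach the adder
              "gate"  - idle cycles force both inputs to zero (AND gating)
              "latch" - idle cycles hold the previous inputs (latches)
    """
    mask = (1 << width) - 1
    prev_a = prev_b = 0
    total = 0
    for (a, b), enabled in zip(operands, active):
        a &= mask
        b &= mask
        if not enabled:
            if mode == "gate":
                a = b = 0
            elif mode == "latch":
                a, b = prev_a, prev_b
        total += toggles(prev_a, a) + toggles(prev_b, b)
        prev_a, prev_b = a, b
    return total


# Hypothetical operand stream: the adder result is used only in cycles 1 and 4.
stream = [(0xFFFF, 0x0F0F), (0x0000, 0xF0F0), (0xAAAA, 0x5555), (0x5555, 0xAAAA)]
used = [True, False, False, True]

for mode in ("none", "gate", "latch"):
    print(mode, adder_input_activity(stream, used, mode))
```

In this toy stream both isolation schemes lower the input activity, and the gate-based variant still pays for forcing the inputs to zero and back. The model ignores the area and power of the isolation logic itself, which is the overhead the abstract refers to.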
Fine clock gating is proposed as an approach that merges the merits of both methods
and yields the highest power savings and the least performance degradation, for the same
overhead.
The static RTL power optimization methods proposed are: power sensitive implementation
selection and retiming.
The use of carry-save arithmetic to eliminate carry propagation in datapaths is deployed
to improve timing slack and provide larger margins for the performance-power trade-off in
other parts of the design.
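
A minimal sketch of the carry-save idea, assuming non-negative integer operands: each 3:2 compression step works bit by bit with no carry rippling across the word, and a single carry-propagate addition is needed only at the very end. The function names are illustrative and not taken from the thesis.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: reduce three operands to a (sum, carry) pair.

    Every bit position is computed independently, so no carry
    propagates across the word at this stage.
    """
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return partial_sum, carry


def multi_operand_sum(operands):
    """Sum non-negative integers with carry-save stages plus one final add."""
    s, c = 0, 0
    for x in operands:
        s, c = carry_save_add(s, c, x)
    return s + c  # the only carry-propagating addition


print(multi_operand_sum([13, 7, 22, 5]))
```

In hardware, the final `s + c` corresponds to the one adder with a full carry chain. The preceding stages have a delay that does not depend on the word length, which is where the extra timing slack comes from.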
The proposed methods are accompanied by sample design examples to illustrate their efficiency.
Further, by closely controlling unnecessary switching activity, the overhead of sharing
resources among operations of varying complexity is reduced.
The methods proposed are suitable for a synthesis-based design flow and achieve performance
comparable to custom application specific processors.
Introduction
For the last three decades, the semiconductor industry has seen a monotonic improvement
in technology size, performance and cost, as predicted by G. Moore back in 1965. Ever since,
circuits of increasing complexity and performance have been produced at affordable costs.
At the very early stages of this phenomenal progress, an increasing gap between
technology and design productivity was noticed. This brought about the first Computer
Aided Design (CAD) tools that would translate the circuit description, schematic at that
time, into the lithographic masks necessary for the production phase. Today tools are taking
over from the designer as early as the Register-Transfer (RT) level, while tools for
behavioral synthesis have long been a topic of research.
Along this evolution, the optimization goals have undergone changes. Performance will
always be a metric that cannot be neglected. Power dissipation has attained significant
importance as it can easily become the bottleneck of current designs, both because of cooling
requirements and because of the battery life of portable equipment. It is understood that
power minimization will always come with some performance degradation, hence new metrics
that capture this trade-off, such as the power-delay product, are coming into play. This shift
has also resulted in incorporating power awareness both in the CAD tools and in the systems'
architectures.
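
To make the power-delay product concrete, the sketch below compares two design points. The numbers are hypothetical and chosen only for illustration; they are not results from the thesis.

```python
def power_delay_product(power_w, delay_s):
    """Power-delay product in joules: average power times delay."""
    return power_w * delay_s


# Hypothetical design points (illustrative values only).
baseline = power_delay_product(10e-3, 2.0e-9)   # 10 mW, 2.0 ns
optimized = power_delay_product(7e-3, 2.2e-9)   # 7 mW, 2.2 ns

print(baseline, optimized, optimized < baseline)
```

Under these assumed figures the optimized design is 10% slower but draws 30% less power, so its power-delay product is lower even though its raw performance is worse.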
Power Reduction at the RT Level
The RT level of abstraction has been created as an intermediate level between the logic and
the architecture levels to facilitate manageability of large designs. It relieves designers
of the tedious and error-prone tasks of capturing functionality at the gate level, resulting
in considerable improvement in the productivity-design quality product.
In contrast to the system/architecture level of abstraction, characterized by general
specifications and inaccurate power consumption models for design components, the RT level
contains enough implementation details to be used for constrained design space exploration
and more precise power estimation. The local optimization techniques applicable at the
logic level, as mentioned in the previous chapter, not only limit the expected gain but also
incur very high computation requirements. This is due to the incremental nature of the
algorithms used and the need to propagate the effect of a local change and re-evaluate the
overall power consumption of the design [6]. In conclusion, the coarser RTL description of
a design provides for efficient power optimization and estimation algorithms.
Conclusions
The aim of this thesis was to investigate the design of power efficient arithmetic circuits for
application specific processors. The application domain of ASPs is specific in the sense that
products have a limited time horizon. As large development costs cannot be amortized
over next generation products, a synthesis-based design flow is followed, in order to meet
the tight performance and time-to-market constraints. Such a flow is characterized by RTL
description of functionality and synthesis based on available IP libraries.