20-10-2012, 06:02 PM
An Efficient Twin-Precision Multiplier
Abstract
We present a twin-precision multiplier that in normal operation
mode efficiently performs N-b multiplications. For
applications where the demand on precision is relaxed, the
multiplier can perform N/2-b multiplications while expending
only a fraction of the energy of a conventional N-b
multiplier. For applications with high demands on throughput,
the multiplier is capable of performing two independent
N/2-b multiplications in parallel. A comparison between
two signed 16-b multipliers, where both perform single
8-b multiplications, shows that the twin-precision multiplier
has 72% lower power dissipation and 15% higher
speed than the conventional one, while only requiring 8%
more transistors.
Introduction
Recent developments at the microarchitecture level
show that there is an increasing interest in datapath
components that are capable of performing computations
with variable operand size, e.g. adders capable
of doing both N and N/2-b additions [1]. By using
only a part of the datapath component for computation,
it has been demonstrated [2] that reductions in the
total power dissipation can be achieved. Datapath components
that can perform either one N-b, a single N/2-b, or two
N/2-b operations give the designer the opportunity to design
a system which can adapt to changing modes, such
as low-power, high-throughput, or high-precision operation.
Such a datapath component could be used for dynamic
power reduction in the same way as described by Abdollahi
et al. [2]; by using the same kind of logic for
detecting if the effective bit rate is within N/2-b precision,
it is possible to control at what precision the
datapath component should be operating.
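As an illustration of the detection step described above, the following
minimal Python sketch checks whether both operands fit within N/2 bits
and selects an operating mode accordingly. The function names, the mode
labels, and the sign-extension test for signed operands are assumptions
made for this example; they are not taken from the paper or from [2].

```python
def fits_in_half_precision(x: int, n: int, signed: bool = False) -> bool:
    """Return True if the n-bit word x is representable in n/2 bits."""
    half = n // 2
    x &= (1 << n) - 1            # view x as an n-bit pattern
    upper = x >> half            # the n/2 most significant bits
    if not signed:
        return upper == 0        # unsigned: upper half must be all zeros
    # Signed (two's complement): the upper half must be a pure sign
    # extension of the top bit of the lower half.
    sign = (x >> (half - 1)) & 1
    return upper == ((1 << half) - 1 if sign else 0)


def select_precision(a: int, b: int, n: int, signed: bool = False) -> str:
    """Pick a hypothetical operating mode: 'half' (N/2-b) or 'full' (N-b)."""
    if fits_in_half_precision(a, n, signed) and fits_in_half_precision(b, n, signed):
        return "half"
    return "full"
```

For example, with N = 16 the unsigned operand pair (5, 200) would select
the N/2-b mode, whereas (5, 300) would require the full N-b mode.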
Tree Multiplier
Until now we have implicitly used the array multiplier
to demonstrate the twin-precision feature. The array multiplier
is, however, slower and dissipates more power than a
logarithmic tree multiplier. The implementation of
the twin-precision feature in an N-b tree multiplier is similar
to that of the array multiplier; all that is needed is to set
the partial-product bits not being used to zero and to partition
the partial-product bits of the two multiplications into
the respective LSP and MSP of the tree. To reduce the critical
path for the N/2-b multiplications, the partial-product
bits used during the computation are moved as far down the
tree as possible (Figure 2). In this paper we use a tree multiplier
with regular connectivity [7].
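The partitioning described above can be sketched with a bit-level
behavioural model in Python. This is a simplified model for unsigned
operands only: it sums the partial-product bits directly and says nothing
about the structure or timing of the reduction tree. The function name and
the mode labels ("full", "lsp", "msp", "dual") are assumptions for this
example, and LSP and MSP are taken here to mean the least and most
significant parts of the partial-product matrix.

```python
def twin_precision_multiply(a: int, b: int, n: int, mode: str) -> int:
    """Unsigned behavioural model of an n-bit twin-precision multiplier."""
    if mode not in ("full", "lsp", "msp", "dual"):
        raise ValueError(f"unknown mode: {mode}")
    h = n // 2
    total = 0
    for i in range(n):          # bit index of operand a
        for j in range(n):      # bit index of operand b
            in_lsp = i < h and j < h        # low halves of both operands
            in_msp = i >= h and j >= h      # high halves of both operands
            active = (
                mode == "full"
                or (mode == "lsp" and in_lsp)
                or (mode == "msp" and in_msp)
                or (mode == "dual" and (in_lsp or in_msp))
            )
            if active:
                # Partial-product bit a_i AND b_j, weighted by 2^(i+j);
                # inactive bits are forced to zero by being skipped.
                total += (((a >> i) & 1) & ((b >> j) & 1)) << (i + j)
    return total


# Two independent 8-b multiplications packed into one 16-b operation.
a = (200 << 8) | 13     # high half: 200, low half: 13
b = (7 << 8) | 250      # high half: 7,   low half: 250
result = twin_precision_multiply(a, b, 16, "dual")
low_product = result & 0xFFFF    # 13 * 250
high_product = result >> 16      # 200 * 7
```

In "dual" mode the cross terms between the low half of one operand and the
high half of the other are zeroed, so the two products occupy disjoint
halves of the 2N-b result and cannot interfere with each other.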
Simulation Setup and Results
To evaluate delay and power dissipation, simulations
have been performed in a commercially available 0.13-μm
technology. The simulated circuit is a 16-b twin-precision
tree multiplier, which is capable of performing two 8-b multiplications
in parallel, and which uses the fast final adder of
Section 3. As references we use conventional 8-b and 16-b
multipliers, both of which use a Kogge-Stone adder as the final
adder. For signed multiplication the Baugh-Wooley algorithm
[8] has been implemented for both the conventional
and the twin-precision multiplier. All simulations have been
done using Spice transistor netlists including estimated wire
capacitances. All logic has been implemented as static logic
and designed to resemble what could be expected to be
found in a standard-cell library. The implemented version of
the multiplier cancels the inactive partial products by forcing
the AND gates to zero. The impact of using sleep-mode
techniques on power and delay has not been investigated.
For power simulation, 50 random input vectors were applied
in HSpice at 500 MHz, with a supply voltage of 1.2 V and an
operating temperature of 25 °C. Delay was obtained using
PathMill.
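For readers unfamiliar with the Baugh-Wooley scheme mentioned above, the
following Python sketch models the commonly used modified form for an
N-b two's-complement multiplication: partial-product bits that involve
exactly one sign bit are inverted, and the constants 2^N and 2^(2N-1) are
added. Whether the simulated multipliers use exactly this arrangement is
an assumption; the function name is hypothetical.

```python
def baugh_wooley_multiply(a: int, b: int, n: int) -> int:
    """Signed n-bit multiplication via the modified Baugh-Wooley scheme."""
    mask = (1 << n) - 1
    a &= mask                   # n-bit two's-complement patterns
    b &= mask
    total = 0
    for i in range(n):
        for j in range(n):
            pp = ((a >> i) & 1) & ((b >> j) & 1)
            # Invert the partial-product bit when exactly one of the two
            # indices addresses a sign bit.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            total += pp << (i + j)
    # Correction constants of the modified scheme.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1             # keep the 2n-bit result
    # Reinterpret the 2n-bit pattern as a signed integer.
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

Because all partial-product bits then carry positive weight, the same
unsigned reduction tree and final adder can be reused for signed operands.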