22-08-2012, 02:40 PM
Ad x N Booth Encoded Multiplier Generator Using Optimized Wallace Trees
Ad x N Booth Encoded.pdf (Size: 503.69 KB / Downloads: 76)
Abstract
In this paper the architecture and the design method
of an M-bit by N-bit Booth encoded parallel multiplier generator
are discussed. A new algorithm for reducing the delay hide the
branches of the Wallace tree section is explained. The final stage
of adding two N + M - 1-bit numbers is done by an optimal
carry select adder stage. The algorithm for optimal partitioning
of the N + M - 1-bit adder is elso presented.
INTRODUCTION
FA ST multipliers are essential parts of digital signal pro- cessing systems. General purpose DSP’s can afford to
have customized high performance multipliers (16 by 16 < 25
ns in 0.9 mm CMOS). Applications specific DSP’s (ASDSP),
on the other hand, implement algorithms as direct realization
of their flow graphs, thus having several multipliers, possibly
each one with different specifications. It is not economical
to have custom layout for every conceivable multiplier. Besides,
experience shows that in ASDSP systems the desired
parts are not actually multipliers; rather more often they are
composition of multipliers, and adders, and pipeline registers.
Therefore, manipulation of prefabricated custom layouts for
accommodating any such requirements tums out to be another
“custom” task.
MULTIPLIERA RCIUTECTURE
The multiplier is composed of three blocks: the modified
Booth encoder and multiplicand selector block for formation
of the partial products; next is the Wallace tree section, which
adds all of the partial products simultaneously to eventually
produce two numbers; the third block is the carry select
adder section which adds the two numbers, obtained from the
Wallace tree section, as fast as possible. The block diagram of
a 12 x 12 parallel multiplier is shown in Fig. 3.
GENERATOARL GORITHMS
In this section we explain the method of generating the
logic description of an M-bit by N-bit Wallace tree multiplier.
However, we will first provide some techniques for estimating
the speed of the multiplier, in terms of the delays of full adders
used. In order to estimate the speed of the Booth encoded
Wallace tree section of the multiplier, we need to identify the
column © with the most number of bits in it (nc) including
the carry bits from the previous columns. The worst delay
in generating the results of the Wallace tree adders is then
proportional to DW = [log3(nc)l. The worst case delay of
the final result for an M-bit by N-bit multiplication, excluding
the fixed overheads, will be proportional to DW+ M+ N - C.
This can be contrasted to a rough proportionality constant of
N + M/2 for the CSA architectures.
Wallace Tree Adders Optimization
The Wallace tree section of the generator accepts as input
variables the list of partial products for each column, and the
delay associated with each bit. At the present time, rather than
keeping track of both the fall- and the rise-time delays of the
bits, only an average delay is calculated, and saved at any
given step. It is assumed that all of the partial product bits
arrive at the Wallace tree section simultaneously
CONCLUSION
The techniques for doing fast M by N multiplication using
Wallace tree were presented in this paper. New heuristic
algorithms for formation of the Wallace tree adders, and the
subsequent carry select adder section were detailed. Several
instances of this M-bit by N-bit multiplier generator were
simulated, and results presented. Currently, many instances
of this generator are used in customer specified designs.
This program has created fast ASIC (standard cell based)
multipliers that have been easily modified into fast data paths.
The method can equally well be applied to other technologies
such as gate arrays.