22-06-2013, 12:19 PM
Optimized Design of Carry-Bypass Adders
Abstract
In this paper, a simple and systematic
procedure to design Carry Bypass Adders (CBA) is
proposed. It allows the designer to choose the block sizes of a CBA
so as to minimize the adder delay, and can be used for
pencil-and-paper design. Since it derives from a rigorous
analysis of CBAs, it is general and provides an intuitive
understanding of the optimum block size.
Compared with the optimum results reported in the
literature, the proposed optimization procedure leads to
a delay that is minimum in realistic cases and very close
to the optimum (within 7%) even in unrealistic cases.
Introduction
The adder is the fundamental block in any
arithmetic unit, and is often the speed-limiting circuit
in a digital system. Hence, many parallel adder
architectures have been proposed to increase speed,
with reasonable area and power dissipation features.
One of the fastest and most efficient architectures in terms
of area and power dissipation is the Carry Bypass
Adder (CBA) [1].
An N-bit CBA is made up of N full adders,
which are grouped into blocks whose size
(i.e., the number of full adders per block) has to be
properly chosen to minimize the time needed for a
computation [2]-[3]. The CBA architecture can be
derived from that of a simple Ripple Carry Adder
(obtained by cascading N full adders [3]) by observing that,
when some contiguous full adders work in propagate mode
(i.e., each of them has a carry output equal to its
carry input), they can be bypassed to evaluate the
carry output of the last one, since it is equal to the
carry input of the first.
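The bypass idea described above can be checked with a small bit-level model. The following is only a functional sketch, not the circuit from the paper: the function name carry_bypass_add and its arguments are hypothetical, and the model captures logic values only, not delays.

```python
def carry_bypass_add(a, b, block_sizes, carry_in=0):
    """Add two unsigned integers with a carry-bypass structure.

    block_sizes lists the number of full adders per block, starting
    from the least significant block. Returns (sum_bits, carry_out).
    """
    total = 0
    carry = carry_in
    pos = 0
    for size in block_sizes:
        block_carry_in = carry
        ripple = block_carry_in      # carry rippling inside the block
        all_propagate = True
        for i in range(pos, pos + size):
            x = (a >> i) & 1
            y = (b >> i) & 1
            p = x ^ y                # propagate signal of this full adder
            total |= (p ^ ripple) << i
            ripple = (x & y) | (p & ripple)
            all_propagate = all_propagate and p == 1
        # Multiplexer: bypass the block when every full adder propagates,
        # otherwise take the carry produced by the full-adder chain.
        carry = block_carry_in if all_propagate else ripple
        pos += size
    return total, carry
```

Because a block in which every full adder propagates passes its carry input through unchanged, the bypass never changes the result; it only shortens the path the carry has to travel.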
Timing analysis of the CBA
Assume the j-th block of the CBA in Fig. 1 is made
up of Mj cascaded full adders and a multiplexer, and
the number of blocks is Q. The input signals of the j-th
multiplexer are I0j and I1j: the former is the output
of the full-adder chain which forms the j-th block,
and it is selected when the carry does not propagate
in the block; the latter is the carry output of the
previous block. When all the full adders of the j-th
block work in propagate mode, its block carry output (or
equivalently the carry input of the next block) is taken
from the second multiplexer input, i.e., the carry output
of the previous block.
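To make the effect of the block sizes on timing concrete, the sketch below estimates the worst-case carry path under a deliberately simplified delay model. The model is an assumption made for illustration and is not the set of equations derived in the paper: each full adder the carry passes through costs one time unit, each multiplexer costs alpha units (the multiplexer delay normalized to the full-adder carry delay), and the delays of the propagate logic and of the final sum are ignored. The function name worst_case_delay is hypothetical.

```python
def worst_case_delay(block_sizes, alpha):
    """Worst-case carry delay under the simplified model stated above.

    A carry generated in the first full adder of block i ripples through
    the M_i adders of that block, crosses the multiplexers of blocks
    i .. j-1 (alpha each), then ripples through M_j - 1 adders of block j
    to reach the carry input of its last full adder.
    """
    q = len(block_sizes)
    # Carry generated and consumed inside a single block.
    worst = float(max(block_sizes) - 1)
    for i in range(q):
        for j in range(i + 1, q):
            path = block_sizes[i] + (j - i) * alpha + (block_sizes[j] - 1)
            worst = max(worst, path)
    return worst
```

With alpha = 1, this model gives a 16-bit adder split into four blocks of 4 bits a worst-case delay of 10 units, whereas the sizes 1, 2, 3, 4, 3, 2, 1 give 7 units, which illustrates why non-uniform block sizes are worth optimizing.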
Optimization of the CBA block size
As explained in the previous section, an optimum
CBA can be designed by first adding blocks whose size
increases by α, according to (4), and then
blocks whose size decreases by α, according to (5),
until the required number of bits, N, is reached.
Hence, a symmetrical block size distribution can be
assumed. Unfortunately, MQ and M1 are unknown,
and will be evaluated in the following.
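The rise-then-fall shape of the block sizes can be sketched as follows. This is an illustrative helper only: the paper's expressions for M1 and MQ are not reproduced in this excerpt, so the starting size m1 is simply a parameter here. Rounding to integer sizes with floor, and trimming surplus bits from the largest block, are also assumptions of this sketch rather than steps taken from the paper.

```python
import math

def symmetric_block_sizes(n_bits, alpha, m1=1):
    """Integer block sizes that grow by about alpha per block, then shrink.

    m1 is an assumed size for the first and last block; it is a free
    parameter of this sketch, not the value derived in the paper.
    """
    q = 1
    while True:
        # Ideal size of block k: m1 plus alpha per step from the nearer end.
        sizes = [math.floor(m1 + alpha * min(k, q - 1 - k)) for k in range(q)]
        if sum(sizes) >= n_bits:
            break
        q += 1
    # Trim surplus bits from the largest block until exactly n_bits remain.
    while sum(sizes) > n_bits:
        sizes[sizes.index(max(sizes))] -= 1
    return sizes
```

For example, with n_bits = 16, alpha = 1 and m1 = 1 the helper returns the sizes 1, 2, 3, 4, 3, 2, 1. When trimming is needed, the result is only approximately symmetric.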
Validation and conclusions
The results obtained with the proposed procedure
were compared to those in [6], which are optimal.
More specifically, we compared the delays, to check whether
the procedure actually provides optimum results.
To this end, we considered different values of N (32
and 64) and α (ranging from 0.1 to 2, with a step of
0.1). Moreover, we considered the examples reported
in [6]. For the sake of compactness, Table I
reports only the cases with α<1 in which the
procedure leads to block sizes different from those of [6].
From inspection of Table I, the proposed procedure
leads to a delay greater than that obtained with [6]
only for α≥0.85, while for lower values of α the same
delay is achieved, even with different block sizes.
Therefore, the proposed CBA design strategy leads to
optimum speed performance in realistic cases. Even
in unrealistic cases, the CBA delay obtained with
the proposed optimization strategy differs by at most
7% from that of [6].