06-10-2012, 05:28 PM
VHDL IMPLEMENTATION OF 32-BIT INTERLOCK COLLAPSING ALU
ABSTRACT
An important area in computer architecture is parallel
processing. Machines employing parallel processing are called parallel machines. A
parallel machine executes multiple instructions in one cycle. However, parallel machines
have a limitation: they cannot execute interlocked instructions concurrently. Such
instructions are executed serially, as on any serial machine, so executing them takes
more than one cycle and degrades performance. In addition, the hardware is
underutilized as a result of serial execution in a parallel machine.
The solution requires a special kind of device called an “Interlock
Collapsing ALU” (ICALU). The Interlock Collapsing ALU, unlike conventional 2-1 ALUs, is a
3-1 ALU. The proposed device executes interlocked instructions in a single instruction
cycle, unlike other parallel machines, resulting in higher performance. The resulting
implementation demonstrates that the proposed 3-1 Interlock Collapsing ALU can be
designed to outperform existing ICALU schemes by a factor of at least two. The
ICALU is implemented in VHDL, and its functionality is verified through simulation.
INTRODUCTION:
BACKGROUND:
Interlocked instructions, that is, instructions with dependencies, cannot be executed
concurrently in a parallel machine, which degrades the performance of the machine. The
thesis investigates a solution, called “interlock collapsing”, to execute these
interlocked instructions concurrently. The solution requires a special kind of device
called the Interlock Collapsing ALU (ICALU). The Interlock Collapsing ALU, unlike
conventional 2-1 ALUs, is a 3-1 ALU.
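The difference between a 2-1 and a 3-1 ALU can be sketched in software. The Python model below is only an illustration, not the thesis's VHDL design: the instruction format (a dict with `dest`, `op`, `src1`, `src2`), the operation set, and all function names are assumptions made for this example, and the dependent operand is assumed to be the second instruction's `src1`.

```python
import operator

MASK32 = 0xFFFFFFFF  # keep every result within 32 bits

# Assumed operation set for the sketch
OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}

def alu2(op, a, b):
    """Conventional 2-1 ALU: two operands in, one result out."""
    return OPS[op](a, b) & MASK32

def alu3(op1, op2, a, b, c):
    """3-1 ALU sketch: computes (a op1 b) op2 c as a single operation."""
    return OPS[op2](OPS[op1](a, b) & MASK32, c) & MASK32

def is_interlocked(first, second):
    """True if `second` reads the register that `first` writes."""
    return first["dest"] in (second["src1"], second["src2"])

def collapse(first, second, regs):
    """Execute an interlocked pair in one step.

    Assumes second["src1"] is the register produced by `first`.
    """
    a, b = regs[first["src1"]], regs[first["src2"]]
    c = regs[second["src2"]]
    return alu3(first["op"], second["op"], a, b, c)

# r1 = r2 + r3 ; r4 = r1 - r5  (the second instruction depends on the first)
regs = {"r2": 5, "r3": 7, "r5": 10}
first = {"dest": "r1", "op": "add", "src1": "r2", "src2": "r3"}
second = {"dest": "r4", "op": "sub", "src1": "r1", "src2": "r5"}
if is_interlocked(first, second):
    r4 = collapse(first, second, regs)  # same value as two serial alu2 calls
```

A serial machine would need two `alu2` calls, one after the other, to get the same value of `r4`; the 3-1 form takes all three source operands at once.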
The proposed ALU, in addition to collapsing these interlocks, should also be
implemented in the same number of stages as a conventional ALU. A functional model of the
ICALU is assumed initially. The functional model is then optimized block by block. The
design and optimization of each block are discussed in separate chapters.
Finally, two parallel machines, one with and one without the ICALU, are compared
with regard to their execution times. The effect of varying the percentage of interlocks in a
given code on the execution times and the percentage speed ratio of the two machines
is studied.
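To make the comparison concrete, here is a deliberately simplified cycle model in Python. It is an assumption for illustration, not the thesis's measurement method: instructions are issued in pairs, an interlocked pair costs two cycles without the ICALU and one cycle with it, and every other effect (memory, branches) is ignored. The function names are hypothetical.

```python
def cycles(num_instructions, interlock_fraction, has_icalu):
    """Cycle count for a two-issue machine under the simplified pairing model."""
    pairs = num_instructions / 2
    if has_icalu:
        # Every pair, interlocked or not, completes in one cycle
        return pairs
    # An interlocked pair is serialized and takes two cycles
    return pairs * (1 + interlock_fraction)

def speed_ratio(interlock_fraction, num_instructions=1000):
    """Execution time without the ICALU divided by execution time with it."""
    without = cycles(num_instructions, interlock_fraction, has_icalu=False)
    with_icalu = cycles(num_instructions, interlock_fraction, has_icalu=True)
    return without / with_icalu

# Sweep the fraction of interlocked pairs from 0% to 100%
sweep = {p / 10: speed_ratio(p / 10) for p in range(0, 11)}
```

Under these assumptions the ratio grows linearly with the fraction of interlocked pairs; the figures in the thesis come from its own model and may differ.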
CONTROL UNIT ::
The Control Unit determines the order in which instructions should be executed.
It interprets the machine instructions. The execution of each instruction is determined by
a sequence of control signals produced by the control unit. In other words, the control
unit governs the flow of information through the system by issuing control signals to
different components. For example, to perform an addition, it sends the
appropriate signals to the components involved so that an addition results.
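The mapping from an instruction to its control signals can be pictured as a lookup table. The Python sketch below is illustrative only: the opcodes and the signal names (`alu_op`, `reg_write`, `mem_read`) are hypothetical and are not taken from the thesis.

```python
# Hypothetical control-signal table; the source does not name its signals.
CONTROL_TABLE = {
    "add":  {"alu_op": "add", "reg_write": True, "mem_read": False},
    "sub":  {"alu_op": "sub", "reg_write": True, "mem_read": False},
    "load": {"alu_op": "add", "reg_write": True, "mem_read": True},
}

def decode(opcode):
    """Return the control signals asserted for the given opcode."""
    try:
        return CONTROL_TABLE[opcode]
    except KeyError:
        raise ValueError(f"unknown opcode: {opcode}") from None

signals = decode("add")  # selects the ALU's add function and enables register write
```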
ALU ::
The Arithmetic and Logic Unit (ALU) is arguably the most important part of the
CPU. The ALU performs arithmetic operations and logical (decision-making)
operations. Arithmetic operations include addition, subtraction,
multiplication and division. Logic operations include AND, OR,
XOR, and so on. There are a variety of techniques for designing these functions. Of all
the components of the computer, the ALU is the most complex to design, and it also
contributes most of the delay. Thus, the design of the ALU is critical to the speed of
the computer.
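A behavioural sketch of a conventional 32-bit 2-1 ALU, written in Python for illustration. The operation names and the zero and carry flags are assumptions made for this example; multiplication and division are left out, and the operands are assumed to be unsigned values in the range 0 to 2^32 - 1.

```python
MASK32 = 0xFFFFFFFF

def alu(op, a, b):
    """Return (result, zero_flag, carry_flag) for a 32-bit operation."""
    if op == "add":
        full = a + b
    elif op == "sub":
        # Two's-complement subtraction: a + (~b) + 1
        full = a + ((~b) & MASK32) + 1
    elif op == "and":
        full = a & b
    elif op == "or":
        full = a | b
    elif op == "xor":
        full = a ^ b
    else:
        raise ValueError(f"unsupported op: {op}")
    result = full & MASK32   # truncate to 32 bits
    carry = full > MASK32    # carry out of bit 31
    return result, result == 0, carry
```

The zero flag is what makes the ALU useful for decisions: subtracting two equal values sets it, which a branch instruction can then test.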
REGISTER ARRAY ::
The Register Array consists of a number of temporary storage locations or
registers. Because the registers are often on the same chip and directly connected to the
control unit, they can be accessed faster than memory. The ALU and the register array are
together called the ‘dataflow’ of the computer.
PARALLEL MACHINES ::
An important area in computer architecture is parallel processing. Machines
(computers) employing parallel processing are called parallel machines. A
parallel machine executes multiple instructions in parallel, in one cycle, whereas
a serial machine (discussed so far) can execute only one instruction per cycle. Thus a
parallel machine is faster than a serial machine.
In a parallel machine, a number of execution units (ALUs) are connected in
parallel, so that each unit can handle an instruction. For practical reasons,
the number of units is limited to two. With two such units in the
processor, two instructions can be handled concurrently, resulting in faster
execution. Fig 2.5 shows a simple block diagram of a parallel machine unit.
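How a two-unit machine issues instructions can be sketched as a small cycle counter. This Python model is an assumption for illustration, not the design in Fig 2.5: each instruction is a hypothetical `(dest, src1, src2)` tuple of register names, only a dependency between adjacent instructions is checked, and a dependent pair is issued one instruction at a time.

```python
def count_cycles(program):
    """Count cycles on a two-unit machine that serializes interlocked pairs.

    Each instruction is a (dest, src1, src2) tuple of register names.
    """
    cycles = 0
    i = 0
    while i < len(program):
        cycles += 1
        if i + 1 < len(program):
            dest = program[i][0]
            nxt = program[i + 1]
            if dest not in (nxt[1], nxt[2]):
                i += 2  # independent pair: both units work this cycle
                continue
        i += 1  # interlocked pair (or last instruction): issue one only
    return cycles

independent = [("r1", "r2", "r3"), ("r4", "r5", "r6")]
interlocked = [("r1", "r2", "r3"), ("r4", "r1", "r6")]
```

In this model the independent pair finishes in one cycle, while the interlocked pair needs two because the second instruction reads `r1`, which the first one writes.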