VLIW Processors

VLIW (“very long instruction word”) processors

instructions are scheduled by the compiler
a fixed number of operations are formatted as one big instruction (called a bundle)
today usually LIW ("long instruction word"), i.e., 3 operations per bundle
change in the instruction set architecture, i.e., 1 program counter points to 1 bundle (not 1 operation)
operations in a bundle issue in parallel
fixed format, so operations can be decoded in parallel
enough functional units (FUs) for the types of operations that can issue in parallel
pipelined FUs
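
The compiler-side scheduling described above can be sketched roughly as follows. This is a minimal illustration, not a real compiler pass: the Op record, the bundle width of 3, and the greedy in-order packing are assumptions made for the example, and operation latencies and functional-unit types are ignored.

```python
from collections import namedtuple

# Hypothetical operation record: a name, a destination register, source registers.
Op = namedtuple("Op", ["name", "dest", "srcs"])

BUNDLE_WIDTH = 3  # assumed number of operation slots per bundle
NOP = Op("nop", None, ())


def pack_bundles(ops, width=BUNDLE_WIDTH):
    """Greedily pack ops, in program order, into fixed-width bundles.

    An op starts a new bundle when the current one is full, or when it reads
    or rewrites a register written by an op already in the current bundle
    (operations in one bundle issue in parallel, so they must be independent).
    Unused slots are padded with NOPs to keep the format fixed.
    """
    bundles, current = [], []
    for op in ops:
        written = {prev.dest for prev in current if prev.dest is not None}
        depends = any(src in written for src in op.srcs) or op.dest in written
        if len(current) == width or depends:
            bundles.append(current + [NOP] * (width - len(current)))
            current = []
        current.append(op)
    if current:
        bundles.append(current + [NOP] * (width - len(current)))
    return bundles


program = [
    Op("add", "r1", ("r2", "r3")),
    Op("mul", "r4", ("r5", "r6")),
    Op("sub", "r7", ("r1", "r4")),  # needs the results of add and mul
    Op("ld", "r8", ("r9",)),
]

# The program counter would step through this list one bundle at a time.
for pc, bundle in enumerate(pack_bundles(program)):
    print(pc, [op.name for op in bundle])
```

In this sketch the sub operation is pushed into a second bundle because it depends on the first two operations, and each bundle is padded to the fixed width.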

Goal of the hardware design:

reduce hardware complexity
to shorten the cycle time for better performance
to reduce power requirements

How VLIW designs reduce hardware complexity:

less multiple-issue hardware
no dependence checking for instructions within a bundle
there can be fewer paths between instruction issue slots & FUs
simpler instruction dispatch
no out-of-order execution, no instruction grouping
ideally no structural hazard checking logic
Reduction in hardware complexity affects cycle time & power consumption
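
To give a rough sense of the dependence-checking hardware that is avoided, the sketch below counts the register comparisons a conventional multiple-issue processor would need within one issue group. It assumes each operation has two source registers and that only read-after-write checks against earlier operations in the same group are counted; the function name is hypothetical and real designs differ.

```python
def intra_group_raw_comparators(issue_width, srcs_per_op=2):
    """Comparisons needed to check each op's sources against the
    destinations of all earlier ops in the same issue group."""
    # Op i (0-based) has i earlier ops, and each of its sources is
    # compared against each of their destinations.
    return sum(i * srcs_per_op for i in range(issue_width))


for width in (2, 3, 4, 8):
    print(width, intra_group_raw_comparators(width))
```

Under these assumptions the count is n(n-1) for issue width n, so it grows quadratically. A VLIW design drops this logic because the compiler only places independent operations in a bundle.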

IA-64 EPIC

Template
the compiler schedules for functional unit availability (i.e., template types) & latencies
implications for hardware:
no instruction grouping
potentially fewer paths between issue slots & functional units
potentially no structural hazard checks
hardware does not have to determine intra-bundle data dependences
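
As a concrete illustration of a bundle that carries a template, the sketch below splits a 128-bit IA-64 bundle into a 5-bit template field and three 41-bit instruction slots. The layout assumed here (template in the five low-order bits, then slots 0, 1 and 2) is the commonly documented IA-64 encoding and should be checked against the architecture manual; the function and constant names are illustrative only.

```python
TEMPLATE_BITS = 5
SLOT_BITS = 41
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1
SLOT_MASK = (1 << SLOT_BITS) - 1


def split_bundle(bundle):
    """Split a 128-bit bundle (given as an int) into (template, [slot0, slot1, slot2])."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    template = bundle & TEMPLATE_MASK
    slots = [
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
        for i in range(3)
    ]
    return template, slots


def make_bundle(template, slots):
    """Inverse of split_bundle; used here only to build example values."""
    value = template & TEMPLATE_MASK
    for i, slot in enumerate(slots):
        value |= (slot & SLOT_MASK) << (TEMPLATE_BITS + i * SLOT_BITS)
    return value


example = make_bundle(0x10, [0x1, 0x2, 0x3])
print(split_bundle(example))
```

Because the template field sits at a fixed position, the hardware can read it without first decoding the three slots, which is what allows the slot-to-functional-unit routing to stay simple.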
Branch support
full predicated execution
branch prediction instruction
PC of branch instruction
branch prediction
target forecasting
new wrinkle on confidence
hierarchy of branch prediction structures in different pipeline stages
4-target BTB for repeatedly executed taken branches
an instruction puts a specific target in it (exposed to the architecture)
0-cycle execution if the branch is predicted taken & the prediction is correct
larger back-up BTB
2-level branch prediction for hard-to-predict branches
instruction hint: branches that are statically easy to predict should not be placed in it
private history registers, 4 history bits, shared PHTs
separate 2-level structure for multiway branches
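
The two-level scheme listed above (private history registers, 4 history bits, shared pattern history tables) can be sketched as below. This is a generic two-level adaptive predictor, not a description of any particular IA-64 implementation: the 2-bit saturating counters, the single 16-entry table, the initial counter value, and the class and method names are all assumptions for illustration.

```python
class TwoLevelPredictor:
    HISTORY_BITS = 4

    def __init__(self):
        # First level: a private history register per branch PC,
        # holding the last 4 outcomes (1 = taken).
        self.history = {}
        # Second level: a pattern history table shared by all branches,
        # one 2-bit saturating counter per history pattern.
        self.pht = [1] * (1 << self.HISTORY_BITS)  # start weakly not-taken

    def predict(self, pc):
        """Return True if the branch at pc is predicted taken."""
        pattern = self.history.get(pc, 0)
        return self.pht[pattern] >= 2

    def update(self, pc, taken):
        """Train the counter for the current pattern, then shift in the outcome."""
        pattern = self.history.get(pc, 0)
        counter = self.pht[pattern]
        self.pht[pattern] = min(3, counter + 1) if taken else max(0, counter - 1)
        mask = (1 << self.HISTORY_BITS) - 1
        self.history[pc] = ((pattern << 1) | int(taken)) & mask


predictor = TwoLevelPredictor()
branch_pc = 0x400  # hypothetical branch address
outcomes = [i % 2 == 0 for i in range(40)]  # alternating taken / not taken

correct = 0
for taken in outcomes:
    correct += predictor.predict(branch_pc) == taken
    predictor.update(branch_pc, taken)
print(correct, "of", len(outcomes), "predicted correctly")
```

An alternating branch like this defeats a single 2-bit counter but becomes predictable here once the history register has filled, which is the kind of hard-to-predict branch the two-level structure is meant for.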