28-06-2010, 11:01 AM
TigerSHARC processor .ppt (Size: 753 KB / Downloads: 197)
• SHARC
'S'uper 'H'arvard 'ARC'hitecture
Presented By:
• Nagendra Doddapaneni
• Overview
• Harvard Architecture
• Super Harvard Architecture
• TigerSHARC processor
• Outline
• Background
• Harvard Architecture
- Why
- What
• Modern CPU Chip Design
• Super Harvard Architecture
• TigerSHARC Processor
• Background
• von Neumann Architecture
- Single storage for instructions and data
• Digital Signal Processors
- Specialized microprocessors designed specifically for digital signal processing, generally in real time
• Why Harvard Architecture
• von Neumann bottleneck
('memory bound')
• DSP applications
• In a von Neumann architecture, the CPU at any given moment is
- Either reading an instruction
- Or reading/writing data from/to memory
• What is Harvard Architecture
• Physically separate storage and signal pathways for instructions and data
• The next instruction can be fetched while the current instruction is executing
• Program memory can be small and wide
• Data memory can be large and narrower
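The bottleneck argument above can be made concrete with a toy cycle count. This is only a sketch under a stated assumption (every memory access takes exactly one cycle, and the instruction mix is made up); it does not model any real processor.

```python
# Toy cycle-count model (an illustration, not a model of any real CPU).
# Assumption: every instruction fetch and every data access costs one cycle.

def von_neumann_cycles(data_accesses):
    # One shared memory path: the fetch and the data accesses happen in turn.
    return sum(1 + d for d in data_accesses)

def harvard_cycles(data_accesses):
    # Separate instruction and data paths: the fetch overlaps the data accesses.
    return sum(max(1, d) for d in data_accesses)

# Hypothetical program: number of data accesses needed by each instruction.
program = [1, 1, 0, 1]
print(von_neumann_cycles(program))  # 7
print(harvard_cycles(program))      # 4
```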
• Modern CPU chip design
• Incorporates features from both architectures
• 'On-chip' cache memory is divided into an instruction cache and a data cache.
Harvard architecture is used when the CPU accesses cache memory.
• On a cache miss, 'off-chip' main memory is accessed using von Neumann architecture.
Main memory is not separated into data and instruction sections.
• Super Harvard Architecture
• A cache stores instructions, leaving both the instruction bus and the data bus free to fetch operands
• Harvard Architecture + cache = Extended Harvard Architecture or Super Harvard Architecture
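Why the cache helps can be sketched with a multiply-accumulate loop, the core of many DSP filters. Each pass needs two operands (a sample and a coefficient) plus the instruction itself. The model below assumes a one-instruction loop body, one transfer per bus per cycle, and a coefficient stored in program memory; all of these are simplifying assumptions for illustration only.

```python
# Toy model of a one-instruction multiply-accumulate loop.
# Assumptions: the coefficient travels over the instruction bus, the sample
# over the data bus, and each bus moves one item per cycle.

def mac_loop_cycles(taps, instruction_cache):
    cycles = 0
    cached = False  # has the loop instruction been captured by the cache?
    for _ in range(taps):
        instr_bus = 1  # coefficient fetch
        data_bus = 1   # sample fetch
        if not (instruction_cache and cached):
            instr_bus += 1  # the instruction must also cross the instruction bus
            cached = True
        cycles += max(instr_bus, data_bus)
    return cycles

print(mac_loop_cycles(8, instruction_cache=False))  # 16
print(mac_loop_cycles(8, instruction_cache=True))   # 9
```

With the cache, only the first pass pays for the instruction fetch; every later pass fetches both operands in a single cycle.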
• TigerSHARC Processor
• TigerSHARC
Instruction Parallelism and SIMD Operation
• The core can simultaneously execute one to four 32-bit instructions encoded in a single instruction line (VLIW).
• How many instructions can execute in parallel depends on:
- The resources each instruction in the line requires
- The source and destination registers used
• Supports SIMD operations through the use of both Computational Blocks in parallel.
• Each Computational Block can execute four 16-bit or eight 8-bit SIMD computations in parallel.
• TigerSHARC
Integer ALU
• 31 32-bit general registers + 1 status register + 8 dedicated registers for circular buffers
• Performs integer ALU operations and data addressing
• ALU instructions: ADD, SUB, ARS, LRS (right shifts only), ROT (left and right), AND NOT, NOT, OR, XOR, ABS, MIN, MAX, CMP
• Status flags: zero (Z), negative (N), overflow (V), carry (C)
• Instruction conditions: EQ, LT, LE, NEQ, NLT, NLE
• Instruction options: unsigned (U), circular buffer (CB), bit reverse (BR), computed jump (CJMP)
• Address-related operations: data address generation, circular buffers, bit reverse, UREG moves, DAB control.
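Circular-buffer and bit-reversed addressing are general DSP techniques, so they can be sketched without reference to the actual register set. The function names and parameters below are hypothetical and do not mirror the TigerSHARC instruction syntax.

```python
# Generic sketches of two addressing modes named above.
# The names and argument layout are illustrative assumptions.

def circular_next(index, step, base, length):
    """Advance an address by `step`, wrapping inside [base, base + length)."""
    return base + (index - base + step) % length

def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value` (used for FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

addr = 100
visited = []
for _ in range(6):
    visited.append(addr)
    addr = circular_next(addr, 3, base=100, length=8)

print(visited)                                # [100, 103, 106, 101, 104, 107]
print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Doing the wrap or the bit reversal in the address generator means a filter or FFT loop needs no extra instructions for index bookkeeping.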
• TigerSHARC Computational Blocks
X and Y Register File
• Register File Syntax
- Each block has 32 x 32-bit data registers
- Each register can store 4 x 8-bit, 2 x 16-bit or 1 x 32-bit words.
- Registers can be combined into dual or quad groups. These groups can store 8-, 16-, 32-, 40- or 64-bit words.
- Volatile registers in each block
• 24 volatile data registers in each block
- XR0 to XR23
- YR0 to YR23
• 2 ALU summation registers in each block
- XPR0, XPR1, YPR0, YPR1
• 5 MAC accumulate registers in each block
- XMR0 to XMR3, YMR0 to YMR3
- XMR4, YMR4: overflow registers
• TigerSHARC
X and Y ALU
• 2 x 64-bit input paths
• 2 x 64-bit output paths
• 8-, 16-, 32- or 64-bit addition/subtraction (fixed-point)
• 32- or 64-bit logical operations (fixed-point)
• 32- or 40-bit floating-point operations
• Sample ALU Instruction
• Example of 16-bit addition
• XYSR1:0 = R31:30 + R25:24
• Performs the addition in both the X and Y Compute Blocks
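A register pair such as R31:30 holds 64 bits, which is four 16-bit lanes. The sketch below imitates a lane-wise 16-bit addition on such a pair in plain Python. It assumes wraparound on overflow; whether the real instruction wraps or saturates depends on instruction options that are not modeled here, and the helper names are hypothetical.

```python
# Lane-wise 16-bit addition on a 64-bit word (four lanes).
# Assumption: each lane wraps around on overflow; saturation is not modeled.

MASK16 = 0xFFFF

def pack16(lanes):
    """Pack 16-bit values into one integer, lane 0 in the lowest bits."""
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & MASK16) << (16 * i)
    return word

def unpack16(word, count=4):
    """Split an integer back into `count` 16-bit lanes."""
    return [(word >> (16 * i)) & MASK16 for i in range(count)]

def simd_add16(a, b):
    """Add lane by lane; a carry never crosses into the next lane."""
    return pack16([(x + y) & MASK16 for x, y in zip(unpack16(a), unpack16(b))])

a = pack16([0xFFFF, 2, 3, 4])
b = pack16([1, 20, 30, 40])
print(unpack16(simd_add16(a, b)))           # [0, 22, 33, 44]
print(unpack16((a + b) & (2**64 - 1)))      # [0, 23, 33, 44]
```

The second line shows what an ordinary 64-bit addition would do: the carry out of lane 0 leaks into lane 1, which is exactly what a SIMD add avoids.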
• TigerSHARC
Multiplier
• Operates on fixed-point, floating-point and complex numbers.
• Fixed-point numbers
- 32x32-bit with 32- or 64-bit results
- Four 16x16-bit multiplies with 4x16-bit or 4x32-bit results
• Floating-point numbers
- 32x32-bit with 32-bit result
- 40x40-bit with 40-bit result
• Complex numbers
- 32x32-bit with 32-bit result
- Fixed-point only
• Results stored in MR register
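A fixed-point complex multiply can be sketched as follows, treating a 32-bit word as a 16-bit real part and a 16-bit imaginary part. The packing order (real part in the low half) is an assumption made for this example, and the result is returned at full precision; how the hardware scales, rounds or saturates to its result width is not modeled.

```python
# Complex multiply on packed 16-bit real/imaginary parts.
# Assumptions: real part in the low 16 bits, imaginary part in the high 16
# bits, two's-complement values, full-precision result with no rounding.

def to_signed16(x):
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def pack_complex(re, im):
    return ((im & 0xFFFF) << 16) | (re & 0xFFFF)

def unpack_complex(word):
    return to_signed16(word), to_signed16(word >> 16)

def complex_mul(a, b):
    ar, ai = unpack_complex(a)
    br, bi = unpack_complex(b)
    # (ar + j*ai) * (br + j*bi): four real multiplies and two additions
    return ar * br - ai * bi, ar * bi + ai * br

print(complex_mul(pack_complex(3, -2), pack_complex(1, 4)))  # (11, 10)
```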
• TigerSHARC
Shifter
• Operates on one 64-bit, one or two 32-bit, two or four 16-bit, and four or eight 8-bit fixed-point operands
• Shifts and rotates bits
• Bit manipulation operations such as bit set, clear, toggle and test
• Bit FIFO operations to support bit streams
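The four bit manipulation operations named above reduce to simple mask operations. The helper names below are hypothetical and the word is assumed to be 32 bits wide; this is a generic sketch, not the shifter's instruction syntax.

```python
# Generic bit manipulation on a 32-bit word (helper names are illustrative).

MASK32 = 0xFFFFFFFF

def bit_set(word, n):
    return (word | (1 << n)) & MASK32

def bit_clear(word, n):
    return word & ~(1 << n) & MASK32

def bit_toggle(word, n):
    return (word ^ (1 << n)) & MASK32

def bit_test(word, n):
    return (word >> n) & 1 == 1

w = 0b1010
print(bin(bit_set(w, 0)))     # 0b1011
print(bin(bit_clear(w, 1)))   # 0b1000
print(bin(bit_toggle(w, 3)))  # 0b10
print(bit_test(w, 3))         # True
```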
• TigerSHARC Processor
• Processor Architecture
• Integer ALU
• Computational blocks
- X and Y Register File
- X and Y ALU
- Multiplier
- Shifter
- CLU
• Program Sequencer
• J and K data buses
• I bus: instruction bus
• TigerSHARC CLU
• CLU instructions are designed to support algorithms used in communications applications
• Algorithms supported are
- Viterbi decoding (minimum-distance decoding algorithm)
- Turbo-code decoding (variant of Viterbi decoding)
- De-spreading for Code Division Multiple Access (CDMA) systems (used to recover a signal spread over a wide pseudo-noise bandwidth)
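De-spreading itself is a correlation: the received chips are multiplied by the known pseudo-noise code and summed over one symbol. The sketch below shows the idea in plain Python with a made-up 8-chip code and +1/-1 signalling; it illustrates the algorithm only and says nothing about how the CLU instructions implement it.

```python
# Direct-sequence spreading and de-spreading with +1/-1 values.
# The 8-chip code below is hypothetical, chosen only for illustration.

def spread(bits, code):
    """Multiply every data bit by the whole chip sequence."""
    return [b * c for b in bits for c in code]

def despread(chips, code):
    """Correlate each symbol's worth of chips with the code and decide."""
    n = len(code)
    decided = []
    for i in range(0, len(chips), n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        decided.append(1 if corr >= 0 else -1)
    return decided

code = [1, -1, 1, 1, -1, 1, -1, -1]
bits = [1, -1, -1, 1]
received = spread(bits, code)
received[3] = -received[3]       # flip one chip to imitate channel noise
print(despread(received, code))  # [1, -1, -1, 1]
```

Because the decision uses the sum over all chips, a single corrupted chip does not change the recovered bit.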
• TigerSHARC
Program Sequencer
• Supplies instruction addresses to memory
• The instruction alignment buffer (IAB) caches up to five fetched instruction lines waiting to execute
• Extracts an instruction line from the IAB and distributes it to the appropriate core component for execution
• Determines flow control for instructions like JMP, CALL
• Reduces branch delays using branch prediction and a branch target buffer (BTB)
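The general idea of a branch target buffer can be sketched as a table that remembers where a branch went last time. The class below is a generic teaching model with an unbounded table and a "predict the last taken target" rule; the size, organization and prediction scheme of the actual TigerSHARC BTB are not represented.

```python
# Generic branch target buffer sketch (not the TigerSHARC organization).
# Assumption: predict the last taken target, fall through when no entry exists.

class BranchTargetBuffer:
    def __init__(self):
        self.targets = {}  # branch address -> last taken target

    def predict(self, pc, fallthrough):
        return self.targets.get(pc, fallthrough)

    def update(self, pc, taken, target):
        if taken:
            self.targets[pc] = target
        else:
            self.targets.pop(pc, None)

btb = BranchTargetBuffer()
# Hypothetical loop: the branch at address 8 jumps back to 4 three times,
# then falls through to address 9.
outcomes = [(True, 4), (True, 4), (True, 4), (False, 9)]
mispredicts = 0
for taken, actual_next in outcomes:
    if btb.predict(8, fallthrough=9) != actual_next:
        mispredicts += 1
    btb.update(8, taken, actual_next)

print(mispredicts)  # 2
```

Only the first pass and the loop exit are mispredicted; every other iteration fetches the correct target without a branch delay.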
• TigerSHARC
architecture at a glance
• TigerSHARC Buses
• DRAM divided into 6 blocks of 4 Mbits
• The 6 blocks connect to four 128-bit wide internal buses through a crossbar connection
• Internal bus architecture provides a total memory bandwidth of 32 Gbytes/sec
• Per cycle, the core and I/O can access
- twelve 32-bit data words
- four 32-bit instructions
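The bus figures above are consistent with each other, which a few lines of arithmetic confirm. The clock rate computed here is only what the quoted numbers imply (assuming decimal gigabytes); it is an inference from the slide, not a quoted specification.

```python
# Consistency check on the bus figures quoted above.
# Assumption: "32 Gbytes/sec" means 32e9 bytes per second.

BUSES = 4
BUS_WIDTH_BITS = 128
BLOCKS = 6
BLOCK_MBITS = 4
BANDWIDTH_BYTES_PER_S = 32e9

bytes_per_cycle = BUSES * BUS_WIDTH_BITS // 8                # 64 bytes
words_per_cycle = BUSES * BUS_WIDTH_BITS // 32               # 16 words = 12 data + 4 instruction
implied_clock_hz = BANDWIDTH_BYTES_PER_S / bytes_per_cycle   # inferred, not quoted
total_mbits = BLOCKS * BLOCK_MBITS                           # on-chip memory in Mbits

print(bytes_per_cycle)         # 64
print(words_per_cycle)         # 16
print(implied_clock_hz / 1e6)  # 500.0
print(total_mbits)             # 24
```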
• TigerSHARC
DMA Controller
• On-chip, with 14 DMA channels
• Provides zero-overhead data transfers
• Operates independently of, and invisibly to, the DSP's core
• Summary
• What is Harvard Architecture
• What is Super Harvard Architecture
• TigerSHARC processor architecture
• How TigerSHARC is 'faster' for targeted DSP applications
• Questions
Thank You.