
Intelligent RAM : IRAM


Problem Description: Processor-Memory Performance Gap

The development of processor and memory devices has proceeded independently. Advances in process technology, circuit design, and processor architecture have led to a near-exponential increase in processor speed and memory capacity. However, memory latencies have not improved as dramatically.
These technological trends have produced a large and growing performance gap between CPU and DRAM.

Von Neumann Model

Semiconductor industry divides into microprocessor and memory camps
Separate chips, separate packages
Memory chips: larger capacity, lower power cost
CPU chips: higher speed, higher power cost
Desktop: 1-2 CPUs, 4-32 DRAM chips
Server: 2-16 CPUs, 32-256 DRAM chips

Advantages of the Von Neumann Model

Fabrication lines can be tailored to a device
Packages are tailored to the pinout and power of a device
The number of memory chips in a computer is independent of the number of processors

Disadvantages of the Von Neumann Model

Performance gap: CPU speed grows ~60% each year vs. DRAM ~7% each year
Memory-gap penalty: ever-larger caches (up to 60% of on-chip area, 90% of transistors)
Caches are purely performance-enhancement mechanisms; correctness does not depend on them
The number of DRAM chips in a PC configuration is shrinking
In the future a PC may need only a single DRAM chip
The required minimum memory size (application plus OS memory use) grows at only 50-75% of the rate of DRAM capacity

Problem Description: Off-chip Memory Bandwidth Limitation

Pin bandwidth will be a critical consideration for future microprocessors. Many of the techniques used to tolerate growing memory latencies do so at the expense of increased bandwidth requirements. Reducing memory-latency overhead aggravates the bandwidth requirement for two reasons:
First, many of the techniques that reduce latency-related stalls increase the total traffic between main memory and the processor.
Second, reducing memory-latency overhead increases the processor bandwidth – the rate at which the processor consumes and produces operands – by reducing total execution time.

Limitation of Present Solutions

Huge caches:
Slow, and work well only if the working set fits in the cache and the access pattern has some locality
Prefetching
Hardware prefetching
Cannot be tailored to each application
Predictions are based only on past and present execution-time behavior
Software prefetching
Must ensure the overheads of prefetching do not outweigh the benefits
Hard to insert prefetches for irregular access patterns
SMT (simultaneous multithreading)
Enhances utilization and throughput at the thread level

Intelligent RAM

Unify processing and memory on a single chip
Use DRAM rather than SRAM
DRAM is 25-50 times denser (its cells use a 3D structure)
Thus the on-chip memory can be much larger
Reasons for IRAM:
Memory speed limits application performance
The processor uses only about 1/3 of the die; the remaining area is big enough for a Gbit of DRAM, large enough to hold whole programs
Today's DRAM processes offer more metal layers and faster transistors, making DRAM nearly as fast and dense as a conventional logic process

IRAM: Summary

IRAM: Merging a microprocessor and DRAM on the same chip
Performance:
Reduces memory latency by a factor of 5-10
Increases memory bandwidth by a factor of 50-100
Energy efficiency:
Saves energy by a factor of 2-4
Cost:
Removes off-chip memory and reduces board area
IRAM is limited by the amount of memory that fits on the chip
Potential for the network computer
Could change the nature of the semiconductor industry