21-05-2014, 04:23 PM
Extending the Unified Parallel Processing Speedup Model
Computer architectures already take advantage of low-level parallelism through multiple pipelines.
The next generations of integrated circuits will continue to support increasing numbers of transistors.
How can the additional transistors be used efficiently?
Answer: Parallelism beyond multiple pipelines: adding multiple processors or processing components in a single chip or single package.
The performance gained at each level of parallelism suffers from the diminishing returns described by Amdahl's law.
Incorporating multiple levels of parallelism results in higher overall performance and efficiency.
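The two claims above can be checked with a small numerical sketch. It assumes the classic form of Amdahl's law for a single level and, as a simplifying assumption not taken from the presentation, treats the overall speedup of several levels as the product of the per-level speedups. The parallel fraction of 0.9 and the unit counts are illustrative values only, and hardware cost is counted simply as the total number of lowest-level units.

```python
def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Amdahl's law: speedup of one level with `units` parallel units."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


def multilevel_speedup(levels) -> float:
    """Overall speedup as the product of per-level Amdahl speedups.

    Assumption: the levels are independent, so their speedups multiply.
    `levels` is a list of (parallel_fraction, units) pairs.
    """
    total = 1.0
    for parallel_fraction, units in levels:
        total *= amdahl_speedup(parallel_fraction, units)
    return total


P = 0.9  # assumed parallel fraction, illustrative only

# Diminishing returns at a single level: each 4x increase in hardware buys less.
for units in (4, 16, 64):
    print(f"single level, {units:2d} units: speedup {amdahl_speedup(P, units):.2f}")

# Same hardware budget (16 units) spent at one level versus two levels of 4.
single = amdahl_speedup(P, 16)
two_level = multilevel_speedup([(P, 4), (P, 4)])
single_efficiency = single / 16            # speedup per unit of hardware
two_level_efficiency = two_level / (4 * 4)

print(f"one level of 16:  speedup {single:.2f}, efficiency {single_efficiency:.2f}")
print(f"two levels of 4:  speedup {two_level:.2f}, efficiency {two_level_efficiency:.2f}")
```

Under these assumptions the two-level arrangement gives both a higher speedup and a higher efficiency for the same unit count; different parallel fractions per level would shift the numbers.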
Presentation Summary
Architects/Chip-Manufacturers are integrating additional levels of parallelism.
Multiple levels of parallelism achieve higher speedups and greater efficiencies than adding hardware at a single parallel level.
A balanced approach would allocate hardware resources so that each level of parallelism delivers its speedup at about the same efficiency, that is, about the same speedup per unit of hardware cost.
Numerous architectural approaches are possible, each with different trade-offs and performance returns.
Current technology is integrating DSP processing with microcontroller functionality - achieving up to three levels of parallelism.
Algorithm/Thread Level Parallelism
Example: Algorithms to compute Fast Fourier Transform (FFT) used in Digital Signal Processing (DSP)
Many separate computations in parallel (High Degree Of Parallelism)
Large exchange of data - much communication between processors
Fine-Grained Parallelism
Communication time (latency) may be a consideration if multiple processors are combined on a board or motherboard
Large communication load (fine-grained parallelism) can force the algorithm to become bandwidth-bound rather than computation-bound.
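A rough way to see when a parallel FFT becomes bandwidth-bound is to compare an estimated computation time with an estimated communication time. The sketch below is a back-of-the-envelope model, not the presentation's own: it assumes a radix-2 FFT with about 5 N log2(N) floating-point operations, and assumes that log2(P) stages require each processor to exchange its whole local block of N/P points. The function names, processor rate, and link bandwidths are all hypothetical values chosen for illustration.

```python
import math


def fft_time_estimate(n_points, processors, flops_per_sec, bytes_per_sec,
                      bytes_per_point=16):
    """Return (compute_time, comm_time) in seconds for one processor.

    Assumptions (illustrative): radix-2 FFT, n_points and processors are
    powers of two, 16 bytes per complex point, work divided evenly.
    """
    stages = int(math.log2(n_points))
    # Conventional operation count for a radix-2 complex FFT.
    compute_ops = 5 * n_points * stages
    compute_time = compute_ops / (processors * flops_per_sec)

    # Assumed: log2(P) stages need remote data, and in each of them a
    # processor exchanges its entire local block of N/P points.
    exchange_stages = int(math.log2(processors))
    comm_bytes = exchange_stages * (n_points // processors) * bytes_per_point
    comm_time = comm_bytes / bytes_per_sec
    return compute_time, comm_time


def bound(compute_time, comm_time):
    """Name the dominant cost under this simple model."""
    return "bandwidth-bound" if comm_time > compute_time else "computation-bound"


# Hypothetical system: 16 processors at 1e9 operations per second each.
slow_link = fft_time_estimate(2**20, 16, 1e9, 1e8)    # 100 MB/s between processors
fast_link = fft_time_estimate(2**20, 16, 1e9, 1e10)   # 10 GB/s between processors

print("slow link:", bound(*slow_link), slow_link)
print("fast link:", bound(*fast_link), fast_link)
```

With these made-up figures the same algorithm moves from bandwidth-bound to computation-bound only because the link gets faster, which is why communication cost matters more once processors sit on separate chips rather than in one package.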