10-04-2012, 12:02 PM
Introduction to MMX
MMX Technology
MMX is a single instruction, multiple data (SIMD) instruction set designed by Intel. It was announced in 1996 and first shipped in January 1997 with Intel's P5-based Pentium line of microprocessors, designated as "Pentium with MMX Technology".
SIMD - Single Instruction, Multiple Data
One instruction can process multiple data items
Useful when large amounts of regularly organized data are processed
Example: Matrix and vector calculations
This is the basis of MMX and of the later SSE extensions, which use the XMM registers
MISD
MISD: Multiple instructions process one data item.
MIMD
MIMD: Multiple instructions process multiple data items
Potential Applications of MMX
graphics
MPEG video/image processing
music synthesis
speech compression/recognition
video conferencing
matrix and vector calculations
The floating point registers
Floating point arithmetic is performed in eight 80-bit registers ST(0), ST(1), …ST(7) in the floating point unit.
When doing floating point arithmetic, these registers are organized in a stack.
Programming floating point is quite different from programming integer arithmetic.
Advantages of using the floating point registers in MMX.
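The registers already exist: the eight MMX registers MM0 to MM7 are the low 64 bits of the floating point registers. Only logic had to be added to the chip.
The operating system already knows about the floating point registers, so it saves and restores them on a task switch without modification.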
New data types for MMX
8 one byte integers
4 two byte integers
2 four byte integers
1 eight byte integer
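The four packed layouts above are four ways of reading the same 64 bits. A minimal sketch in Python, assuming little-endian lane order as on x86; the helper name unpack_lanes is hypothetical and used only for illustration:

```python
import struct

def unpack_lanes(value, lane_bytes):
    """Split a 64-bit value into unsigned lanes, lowest lane first.

    lane_bytes is 1, 2, 4, or 8, matching the four packed data types.
    (Hypothetical helper for illustration; not an MMX instruction.)
    """
    formats = {1: "8B", 2: "4H", 4: "2I", 8: "1Q"}
    if lane_bytes not in formats:
        raise ValueError("lane width must be 1, 2, 4, or 8 bytes")
    raw = struct.pack("<Q", value)  # 8 bytes, little-endian as on x86
    return list(struct.unpack("<" + formats[lane_bytes], raw))

reg = 0x0807060504030201
print(unpack_lanes(reg, 1))                     # 8 one byte integers
print([hex(x) for x in unpack_lanes(reg, 2)])   # 4 two byte integers
print([hex(x) for x in unpack_lanes(reg, 4)])   # 2 four byte integers
print([hex(x) for x in unpack_lanes(reg, 8)])   # 1 eight byte integer
```

The register contents never change; only the instruction chosen decides how many lanes the 64 bits are treated as.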
New instructions
Process the new data types 8, 4, or 2 data items at a time in the 64-bit MMX registers (16, 8, 4, or 2 items at a time in the 128-bit XMM registers that came later).
Types of instructions:
Add / Subtract
Multiply / Multiply and add
Shift
Logical (AND, AND NOT, OR, XOR)
Pack and unpack
Saturation
Handling overflow when adding 16, 8, 4, or 2 values at a time is a problem. Programmers can specify that when overflow occurs, the “sum” should be replaced by the maximum legal value.
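Example: unsigned byte addition 80h + A0h = 120h, which overflows a byte. Instead, the machine stores FFh.
Likewise when subtracting: a result below the minimum legal value is replaced by that minimum (00h for unsigned bytes).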
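The difference between wraparound and saturation can be sketched in plain Python. This only emulates the per-byte behavior (what the MMX mnemonics PADDB, PADDUSB and PSUBUSB do to each lane); the function names are hypothetical and the lists stand in for a packed register:

```python
def add_unsigned_bytes(a, b, saturate):
    """Add two equal-length lists of unsigned bytes lane by lane.

    saturate=True clamps each sum at FFh; saturate=False keeps only
    the low 8 bits (wraparound).
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    out = []
    for x, y in zip(a, b):
        s = x + y
        out.append(min(s, 0xFF) if saturate else s & 0xFF)
    return out

def sub_unsigned_bytes(a, b, saturate):
    """Subtract b from a lane by lane; saturate=True clamps at 00h."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    out = []
    for x, y in zip(a, b):
        d = x - y
        out.append(max(d, 0) if saturate else d & 0xFF)
    return out

# The slide's example: 80h + A0h = 120h does not fit in a byte.
print([hex(v) for v in add_unsigned_bytes([0x80], [0xA0], saturate=False)])
print([hex(v) for v in add_unsigned_bytes([0x80], [0xA0], saturate=True)])
```

For pixel data, saturation is usually the wanted behavior: a brightened pixel that overflows stays white instead of wrapping around to a dark value.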
Also included in MMX
Intel doubled the L1 cache size (from 16 KB to 32 KB) when MMX was introduced (necessary for SIMD machines)
Programs run faster on MMX machines even if the SIMD instructions are not used
Excellent marketing:
Programs run faster on MMX machines
People want/buy MMX
Example 1: Calculating Dot Products
Approximate algorithm (Conclusion)
Add the two sums together in EAX to get the final sum.
Intel claims that standard Pentiums would require 40 instructions to carry this out. Using MMX technology, only 13 instructions are needed. Speed improves by an even greater ratio.
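The earlier steps of the algorithm are not reproduced in this post, so the following is only a sketch of one way a dot product can end in "two sums" that are added together. It assumes the multiply-and-add instruction (PMADDWD), which multiplies four pairs of signed 16-bit values and adds adjacent products into two 32-bit results. The Python names are hypothetical, and 32-bit overflow is not modeled:

```python
def pmaddwd(a, b):
    """Emulate multiply-and-add on four signed 16-bit lanes.

    Returns two sums: a0*b0 + a1*b1 and a2*b2 + a3*b3.
    (Assumption: inputs are small enough that 32-bit overflow is ignored.)
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each operand must hold four lanes")
    return [a[0] * b[0] + a[1] * b[1], a[2] * b[2] + a[3] * b[3]]

def dot_product(u, v):
    """Dot product of two vectors whose length is a multiple of 4."""
    if len(u) != len(v) or len(u) % 4 != 0:
        raise ValueError("vectors must match and have a length divisible by 4")
    low_sum = 0   # running total of the low 32-bit lane
    high_sum = 0  # running total of the high 32-bit lane
    for i in range(0, len(u), 4):
        low, high = pmaddwd(u[i:i + 4], v[i:i + 4])
        low_sum += low
        high_sum += high
    # Final step: add the two partial sums to get the result.
    return low_sum + high_sum

print(dot_product([1, 2, 3, 4, 5, 6, 7, 8], [8, 7, 6, 5, 4, 3, 2, 1]))  # 120
```

Each call to pmaddwd stands in for one instruction doing four multiplications and two additions, which is where the reduction in instruction count comes from.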
Example 2: 24-bit color video blending
Suppose we are displaying 640 by 480 pixel video that uses 24-bit color: 8 bits for red, 8 for green, and 8 for blue.
Suppose we are currently showing one picture which we want to fade out and replace by “fading” in a second picture.
Suppose that we want to do the fade out/in in 255 steps.
For each step, for each of the 3 colors, and for each of the 640 by 480 pixels we must calculate:
Result_pixel = NewPicture_pixel * (i/255) + OldPicture_pixel * (1 - (i/255))
where "i" is the step counter.
This formula must be calculated 640 * 480 * 3 * 255 = 235,008,000 times on 8-bit data!
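The blending formula can be written with integer arithmetic only, which is the form that suits 8-bit SIMD data. A minimal scalar sketch in Python, assuming rounding to nearest and a step counter i running from 0 to 255; the function names are hypothetical:

```python
def blend_channel(old, new, i, steps=255):
    """Integer form of new * (i/steps) + old * (1 - i/steps).

    old and new are 8-bit color values (0..255); i is the step counter.
    Adding steps // 2 before dividing rounds to the nearest integer.
    """
    return (new * i + old * (steps - i) + steps // 2) // steps

def blend_pixels(old_pixels, new_pixels, i):
    """Blend two equal-length sequences of 8-bit channel values."""
    if len(old_pixels) != len(new_pixels):
        raise ValueError("pictures must have the same size")
    return [blend_channel(o, n, i) for o, n in zip(old_pixels, new_pixels)]

old = [200, 0, 255]   # one pixel of the old picture: R, G, B
new = [100, 255, 0]   # the same pixel in the new picture
print(blend_pixels(old, new, 0))     # start of the fade: old picture
print(blend_pixels(old, new, 255))   # end of the fade: new picture

# Work for the whole fade, as counted on the slide.
print(640 * 480 * 3 * 255)
```

A scalar loop like this evaluates the formula once per channel value; an MMX version would process several 8-bit or 16-bit values per instruction.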