24-08-2012, 02:38 PM
Floating Point Arithmetic
Although integers provide an exact representation for numeric values, they suffer
from two major drawbacks: the inability to represent fractional values and a limited
dynamic range. Floating point arithmetic solves these two problems at the expense of
accuracy and, on some processors, speed. Most programmers are aware of the speed loss
associated with floating point arithmetic; however, they are blithely unaware of the problems
with accuracy.
For many applications, the benefits of floating point outweigh the disadvantages.
However, to properly use floating point arithmetic in any program, you must learn how
floating point arithmetic operates. Intel, understanding the importance of floating point
arithmetic in modern programs, provided support for floating point arithmetic in the earliest
designs of the 8086 with the 80x87 FPU (floating point unit, or math coprocessor). However,
on processors earlier than the 80486 (or on the 80486sx), the floating point processor
is an optional device; if this device is not present, you must simulate it in software.
This chapter contains four main sections. The first section discusses floating point
arithmetic from a mathematical point of view. The second section discusses the binary
floating point formats commonly used on Intel processors. The third discusses software
floating point and the math routines from the UCR Standard Library. The fourth section
discusses the 80x87 FPU chips.
The Mathematics of Floating Point Arithmetic
A big problem with floating point arithmetic is that it does not follow the standard
rules of algebra. Nevertheless, many programmers apply normal algebraic rules when
using floating point arithmetic. This is a source of bugs in many programs. One of the primary
goals of this section is to describe the limitations of floating point arithmetic so you
will understand how to use it properly.
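As a rough illustration of the point above, the following Python sketch shows two algebraic identities that fail in floating point. It assumes Python's float is an IEEE 754 double, which is true on common platforms but is not something the original text specifies; the function names are made up for this example.

```python
# Sketch: floating point addition is not associative.
# Assumes Python floats are IEEE 754 doubles (hypothetical helper names).

def left_to_right(a, b, c):
    """Compute (a + b) + c."""
    return (a + b) + c

def right_to_left(a, b, c):
    """Compute a + (b + c)."""
    return a + (b + c)

# In algebra these two results are equal; in floating point they differ.
print(left_to_right(0.1, 0.2, 0.3))   # 0.6000000000000001
print(right_to_left(0.1, 0.2, 0.3))   # 0.6

# A small term can vanish entirely when added to a much larger one.
big, small = 1.0e16, 1.0
lost = (big + small) - big   # small is rounded away before the subtraction
kept = (big - big) + small   # same terms, different order
print(lost, kept)            # 0.0 1.0
```

The order of operations changes the answer because each intermediate result is rounded to the available precision before the next operation begins.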
Normal algebraic rules apply only to infinite precision arithmetic. Consider the simple
statement x:=x+1, where x is an integer. On any modern computer this statement follows the
normal rules of algebra as long as overflow does not occur. That is, this statement is valid
only for certain values of x (minint <= x < maxint). Most programmers do not have a problem with
this because they are well aware of the fact that integers in a program do not follow the
standard algebraic rules.
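The integer case can be sketched in Python as well. Because Python integers never overflow, this example simulates a 16-bit two's complement register; the 16-bit width, the constant names, and the function name are assumptions chosen for illustration and do not come from the original text.

```python
# Sketch: x := x + 1 on a simulated 16-bit two's complement integer.
# The 16-bit width is an assumption; MININT, MAXINT and increment_16bit
# are hypothetical names.

MININT, MAXINT = -32768, 32767

def increment_16bit(x):
    """Return x + 1, wrapping around the way a 16-bit register would."""
    return (x + 1 - MININT) % 65536 + MININT

# For minint <= x < maxint the result matches ordinary algebra.
print(increment_16bit(5))        # 6
print(increment_16bit(MININT))   # -32767

# At x = maxint the result wraps around, so x + 1 > x no longer holds.
print(increment_16bit(MAXINT))   # -32768
```

The statement behaves algebraically everywhere except at the single value maxint, which is exactly the restriction given in the text.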
The UCR Standard Library Floating Point Routines
In most assembly language texts that bother to cover floating point arithmetic, this
section would normally describe how to design your own floating point routines for addition,
subtraction, multiplication, and division. This text will not do that for several reasons.
First, designing a good floating point library requires a solid background in numerical
analysis, a prerequisite this text does not assume of its readers. Second, the UCR Standard
Library already provides a reasonable set of floating point routines in source code form;
why waste space in this text when the sources are readily available elsewhere? Third,
floating point units are quickly becoming standard equipment on all modern CPUs or
motherboards; it makes no more sense to describe how to manually perform a floating
point computation than it does to describe how to manually perform an integer computation.