IEEE Arithmetic, Appendix C IEEE Arithmetic – Motorola DSP96002 User Manual
MOTOROLA
DSP96002 USER’S MANUAL
APPENDIX C
IEEE ARITHMETIC
C.1 FLOATING-POINT NUMBER STORAGE AND ARITHMETIC
C.1.1 General
The IEEE standard for binary floating-point arithmetic provides for the compatibility of floating-point numbers
across all implementations that use the standard by defining the bit-level encoding of floating-point numbers.
Accuracy with respect to roundoff errors is maximized by optimally scaling floating-point numbers using a
normalized exponential notation. The standard guarantees error bounds for the basic mathematical operations
(add, subtract, multiply, divide, square root, round to nearest integer, conversion to and from integers, and
conversion to and from decimal strings). The standard also defines error handling for five floating-point
exceptions: invalid operation, divide by zero, overflow, underflow, and inexact result.
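The guaranteed error bound can be checked numerically. The following is a minimal sketch, not part of the manual: it assumes a host on which Python's `float` is an IEEE double (true on most platforms) and Python 3.9 or later for `math.ulp`. The helper name `rounding_error_in_ulps` is hypothetical. It compares a rounded floating-point sum against the exact rational sum and expresses the difference in units in the last place (ulps); under the default round-to-nearest mode, the error should not exceed half an ulp.

```python
from fractions import Fraction
import math


def rounding_error_in_ulps(a, b):
    """Return the error of the floating-point sum a + b, measured in ulps.

    Hypothetical helper for illustration only. Fraction(x) converts a
    float to its exact rational value, so `exact` carries no rounding.
    """
    computed = a + b                        # rounded floating-point result
    exact = Fraction(a) + Fraction(b)       # exact sum of the two operands
    error = abs(Fraction(computed) - exact)
    return error / Fraction(math.ulp(computed))


err = rounding_error_in_ulps(0.1, 0.2)
print(err <= Fraction(1, 2))   # the bound holds for this pair of operands
print(err > 0)                 # the sum is inexact, so the error is nonzero
```

A nonzero error here corresponds to the condition that the standard reports as the inexact result exception.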
The standard defines two data storage formats that are identical across implementations (the basic formats):
Single Precision (SP) and Double Precision (DP). It also specifies two implementation-dependent
encodings (the extended formats): Single Extended Precision (SEP) and Double Extended Precision (DEP), on
which it places only some general constraints and for which bit-level encodings are not defined. The
extended formats are consequently implementation-dependent and should never be used to represent
numbers that are to be shared across different processors (i.e., stored).
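Because the basic formats are fixed at the bit level, a stored SP value can be taken apart the same way on any conforming machine. The sketch below is illustrative only and is not taken from the manual: the field widths (1 sign bit, 8 biased-exponent bits, 23 fraction bits) come from the IEEE 754 single-precision definition rather than from this section, and the function name `decode_single` is hypothetical. It uses Python's standard `struct` module to obtain the 32-bit encoding.

```python
import struct


def decode_single(x):
    """Split x, rounded to single precision, into its three bit fields.

    Returns (sign, biased_exponent, fraction). Field widths are those of
    the IEEE 754 single-precision format: 1, 8 and 23 bits.
    """
    # Pack as a big-endian 32-bit float, then reread the same bytes
    # as an unsigned 32-bit integer to expose the raw encoding.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, biased_exponent, fraction


print(decode_single(1.0))    # (0, 127, 0)
print(decode_single(-2.5))   # (1, 128, 2097152)
```

No comparable sketch is possible for the extended formats, since the standard leaves their bit-level encoding to the implementation.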
Each format provides representation of the following elements:
1. Floating-point numbers of the form:

$$X = (-1)^{s} \, 2^{E} \, (b_0 . b_1 b_2 \ldots b_{p-1})$$

where:
$s$ = 0 or 1
$E$ = an integer between $E_{min}$ and $E_{max}$, inclusive
$b_i$ = 0 or 1
2. Infinities: +∞ and -∞
3. Not-a-Numbers (NaNs): NaNs are special symbolic elements encoded in the floating-point
format. They can appear as operands and/or as results of arithmetic operations. The standard
provides two types of NaNs:
4. Quiet NaNs (QNaNs): encodings of information regarding meaningless or invalid results.
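The number form in item 1 and the special elements in items 2 and 3 can be illustrated with a short sketch. It is not part of the manual and assumes a host on which Python's `float` is an IEEE double. The helper `value_from_fields` is hypothetical and evaluates the form X = (-1)^s 2^E (b0 . b1 b2 ... b(p-1)) directly from a sign, an exponent, and a list of significand bits. The remaining lines show an infinity and a NaN produced by an invalid operation; that this NaN is a quiet one is the usual behavior on such hosts, not something Python itself guarantees.

```python
import math


def value_from_fields(s, E, bits):
    """Evaluate (-1)^s * 2^E * (b0 . b1 b2 ... b(p-1)).

    `bits` is the list [b0, b1, ..., b(p-1)]; bit i carries weight 2^-i.
    Hypothetical helper for illustration only.
    """
    significand = sum(b * 2.0 ** -i for i, b in enumerate(bits))
    return (-1) ** s * 2.0 ** E * significand


# s = 1, E = 1, significand 1.01 (binary) = 1.25, so X = -2.5
x = value_from_fields(1, 1, [1, 0, 1])

pos_inf = math.inf
nan = pos_inf - pos_inf        # infinity minus infinity is an invalid operation

print(x)                       # -2.5
print(math.isnan(nan + 1.0))   # True: the NaN propagates through arithmetic
print(nan != nan)              # True: a NaN compares unequal even to itself
```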