IEEE Arithmetic, Appendix C IEEE Arithmetic – Motorola DSP96002 User Manual


MOTOROLA

DSP96002 USER’S MANUAL


APPENDIX C

IEEE ARITHMETIC

C.1 FLOATING-POINT NUMBER STORAGE AND ARITHMETIC

C.1.1 General

The IEEE standard for binary floating-point arithmetic provides compatibility of floating-point numbers across all implementations that use the standard by defining the bit-level encoding of floating-point numbers. Maximum mathematical accuracy, with respect to roundoff errors, is achieved by optimally scaling floating-point numbers using a normalized exponential notation. The standard guarantees error bounds for the basic mathematical operations (add, subtract, multiply, divide, square root, round to nearest integer, conversion to and from integers, and conversion to and from decimal strings). The standard also defines error handling for five floating-point exceptions: invalid operation, divide by zero, overflow, underflow, and inexact result.
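As an illustration of the guaranteed error bounds, the following sketch measures the rounding error of one addition and one multiplication in units in the last place (ulp). It is not taken from the manual: it assumes a host on which Python's `float` is an IEEE double-precision number operating in the default round-to-nearest mode, and the helper name `rounding_error_in_ulps` is hypothetical. `math.ulp` requires Python 3.9 or later.

```python
from fractions import Fraction
import math


def rounding_error_in_ulps(computed: float, exact: Fraction) -> Fraction:
    """Return |computed - exact| in units in the last place of `computed`."""
    # Fraction(float) converts the stored binary value exactly, so no
    # additional rounding is introduced by the measurement itself.
    return abs(Fraction(computed) - exact) / Fraction(math.ulp(computed))


a, b = 0.1, 0.2

# Exact sum and product of the two stored operands, as rational numbers.
exact_sum = Fraction(a) + Fraction(b)
exact_product = Fraction(a) * Fraction(b)

err_add = rounding_error_in_ulps(a + b, exact_sum)
err_mul = rounding_error_in_ulps(a * b, exact_product)

# Under round-to-nearest, each result should lie within half an ulp
# of the exact value.
print(float(err_add), float(err_mul))
```

The comparison is made against the exact rational result of the stored operands, not against the decimal literals, because the bound applies to each individual operation on values that are already representable.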

The standard defines two data storage formats that are identical across implementations (the basic formats): Single Precision (SP) and Double Precision (DP). It also specifies two extended formats, Single Extended Precision (SEP) and Double Extended Precision (DEP), on which it places only some general constraints and for which bit-level encodings are not defined. The extended formats are consequently implementation-dependent and should never be used to represent numbers that are to be shared across different processors (i.e., stored).
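Because the basic formats are fixed at the bit level, the same value yields the same bit pattern on any conforming implementation. The sketch below illustrates this with Python's `struct` module, whose `f` and `d` format codes pack IEEE single- and double-precision values when an explicit byte order such as `>` is given. The helper name `encoding_hex` is hypothetical and is not part of the manual.

```python
import struct


def encoding_hex(value: float, fmt: str) -> str:
    """Return the big-endian IEEE encoding of `value` as a hex string.

    `fmt` is a struct format code: "f" for single precision (SP),
    "d" for double precision (DP).
    """
    return struct.pack(">" + fmt, value).hex()


sp_one = encoding_hex(1.0, "f")         # "3f800000"
dp_one = encoding_hex(1.0, "d")         # "3ff0000000000000"
sp_minus_two = encoding_hex(-2.0, "f")  # "c0000000"

print(sp_one, dp_one, sp_minus_two)
```

No comparable example can be given for SEP or DEP, since their encodings differ from one implementation to another.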

Each format provides representation of the following elements:

Floating-point numbers of the form:

$(-1)^s \, 2^E \, (b_0 . b_1 b_2 \ldots b_{p-1})$

where:

$s$ = 0 or 1

$E$ = an integer between $E_{min}$ and $E_{max}$, inclusive

$b_i$ = 0 or 1
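The form (-1)^s 2^E (b_0 . b_1 b_2 ... b_{p-1}) can be evaluated directly from its fields. The following is a minimal sketch, not taken from the manual: the function name `value_from_fields` is hypothetical, the digit b_i is given the weight 2^-i (so b_0 sits to the left of the binary point), and exact rational arithmetic is used so that the result is not itself subject to rounding.

```python
from fractions import Fraction
from typing import Sequence


def value_from_fields(s: int, E: int, bits: Sequence[int]) -> Fraction:
    """Evaluate (-1)^s * 2^E * (b_0 . b_1 b_2 ... b_{p-1}) exactly.

    `bits` holds b_0 through b_{p-1}, each 0 or 1.
    """
    # b_i contributes b_i * 2^-i to the significand.
    significand = sum(Fraction(b, 2 ** i) for i, b in enumerate(bits))
    return (-1) ** s * Fraction(2) ** E * significand


# +1.101 (binary) * 2^2 = 1.625 * 4 = 6.5
six_and_a_half = value_from_fields(0, 2, [1, 1, 0, 1])

# -1.1 (binary) * 2^-2 = -1.5 / 4 = -0.375
minus_three_eighths = value_from_fields(1, -2, [1, 1])

print(six_and_a_half, minus_three_eighths)  # 13/2 -3/8
```

The sketch does not check that E lies between E_min and E_max, since those limits depend on the format in use.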

Infinities: ∞ and -∞

Not-a-Numbers (NaNs): NaNs are special symbolic elements encoded in the floating-point format. They can appear as operands and/or as results of arithmetic operations. The standard provides two types of NaNs:

Quiet NaNs (QNaNs): encodings of information regarding meaningless or invalid results.