Explain IEEE 754 standards for Floating Point number representation.

written 8.7 years ago by

teamques10 ★ 69k

• modified 8.7 years ago

The Institute of Electrical and Electronics Engineers (IEEE) has developed standards for representing floating point numbers in 32 bits and 64 bits, referred to as the IEEE 754 standards. The figure shows these IEEE standard formats.

[Figure: IEEE 754 floating point formats. (a) 32-bit single precision; (b) 64-bit double precision.]

The 32-bit standard representation shown in Fig. (a) is called a single precision representation because it occupies a single 32-bit word. The 32 bits are divided into three fields as shown below:

(Field 1) Sign ⟵ 1 bit

(Field 2) Exponent ⟵ 8 bits

(Field 3) Mantissa ⟵ 23 bits

The sign of the number is given in the first bit, followed by a representation of the exponent (to the base 2) of the scale factor, and then the mantissa.

Instead of the signed exponent $E$, the value actually stored in the exponent field is $E' = E + \text{bias}$. In the 32-bit floating point system (single precision), the bias is 127.

Hence $E' = E + 127$. This representation of the exponent is also called the excess-127 format.
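The field layout and the excess-127 exponent can be checked with a small Python sketch. It assumes the standard library `struct` module is used to reinterpret the bytes of a single precision value; the function name `decode_single` is illustrative and not part of any standard.

```python
import struct

def decode_single(x):
    """Split x, rounded to IEEE 754 single precision, into its three fields."""
    # Pack as a big-endian 32-bit float, then read the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                      # field 1: 1 bit
    stored_exponent = (bits >> 23) & 0xFF  # field 2: 8 bits, this is E'
    mantissa = bits & 0x7FFFFF             # field 3: 23 bits
    return sign, stored_exponent, mantissa

sign, e_stored, mantissa = decode_single(-6.5)
# -6.5 = -1.101 (binary) * 2**2, so E = 2 and E' = 2 + 127 = 129
print(sign, e_stored, e_stored - 127, format(mantissa, "023b"))
# prints: 1 129 2 10100000000000000000000
```

Subtracting 127 from the stored field recovers the signed exponent $E$.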

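The end values of $E'$, namely 0 and 255, are reserved for special values in single precision: $E' = 0$ with a zero mantissa represents exact zero, and $E' = 255$ with a zero mantissa represents infinity. (With a nonzero mantissa, these exponent values encode denormalized numbers and NaN, respectively.)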

Thus the range of $E'$ for normal values in single precision is $0 \lt E' \lt 255$. This means that for the 32-bit representation the actual exponent $E$ is in the range $-126 \le E \le 127$.
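As a rough check of these limits, the sketch below reads the stored exponent of the smallest and largest normal single precision values, and of zero and infinity. The helper name `stored_exponent_single` is hypothetical, and the two boundary values are computed on the assumption that normal numbers have the form $1.M \times 2^{E}$.

```python
import struct

def stored_exponent_single(x):
    """Return the 8-bit exponent field E' of x in single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return (bits >> 23) & 0xFF

smallest_normal = 2.0 ** -126                      # E = -126, mantissa all zeros
largest_normal = (2.0 - 2.0 ** -23) * 2.0 ** 127   # E = 127, mantissa all ones

print(stored_exponent_single(smallest_normal))  # E' = -126 + 127
print(stored_exponent_single(largest_normal))   # E' = 127 + 127
print(stored_exponent_single(0.0))              # reserved end value for zero
print(stored_exponent_single(float("inf")))     # reserved end value for infinity
```

The first two lines give the ends of the normal range, 1 and 254; the last two give the reserved end values, 0 and 255.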

The 64-bit standard representation shown in Fig. (b) is called a double precision representation because it occupies two 32-bit words. The 64 bits are divided into three fields as shown below:

(Field 1) Sign ⟵ 1 bit

(Field 2) Exponent ⟵ 11 bits

(Field 3) Mantissa ⟵ 52 bits

In the double precision format, the value actually stored in the exponent field is given as

$E' = E + 1023$

Here, the bias value is 1023, and hence it is also called the excess-1023 format. The end values of $E'$, namely 0 and 2047, are used to indicate the floating point values of exact zero and infinity, respectively.

Thus the range of $E'$ for normal values in double precision is $0 \lt E' \lt 2047$. This means that for the 64-bit representation the actual exponent $E$ is in the range

$-1022 \le E \le 1023$.
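The same kind of check can be sketched for double precision. Python's built-in `float` is normally an IEEE 754 double, so its bits can be inspected directly; `decode_double` is an illustrative name, and the field widths used are 1, 11, and 52 bits.

```python
import struct
import sys

def decode_double(x):
    """Split a double precision value into sign, stored exponent E', and mantissa."""
    # Pack as a big-endian 64-bit float, then read the same 8 bytes as an unsigned int.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                       # 1 bit
    stored_exponent = (bits >> 52) & 0x7FF  # 11 bits, this is E'
    mantissa = bits & ((1 << 52) - 1)       # 52 bits
    return sign, stored_exponent, mantissa

sign, e_stored, _ = decode_double(-6.5)
# -6.5 = -1.101 (binary) * 2**2, so E = 2 and E' = 2 + 1023 = 1025
print(sign, e_stored, e_stored - 1023)
# prints: 1 1025 2

# Ends of the normal range: E' = 1 and E' = 2046
print(decode_double(2.0 ** -1022)[1], decode_double(sys.float_info.max)[1])
```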
