The standards for representing floating point numbers in 32 bits and 64 bits have been developed by the Institute of Electrical and Electronics Engineers (IEEE) and are referred to as the IEEE 754 standard. The figure shows these IEEE standard formats.
The 32-bit standard representation shown in Fig. (a) is called a single precision representation because it occupies a single 32-bit word. The 32 bits are divided into three fields as shown below:
(field 1) Sign ⟵ 1 bit
(field 2) Exponent ⟵ 8 bits
(field 3) Mantissa ⟵ 23 bits
The sign of the number is given in the first bit, followed by a representation for the exponent (to the base 2) of the scale factor.
Instead of the signed exponent, E, the value actually stored in the exponent field is E' = E + bias. In the 32-bit floating point system (single precision), bias is 127.
Hence $E' = E + 127$. This representation of exponent is also called the excess-127 format.
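As a quick worked check of the excess-127 rule, take two exponent values chosen here purely for illustration, $E = 2$ and $E = -3$:

$$E = 2 \;\Rightarrow\; E' = 2 + 127 = 129 = 10000001_2$$

$$E = -3 \;\Rightarrow\; E' = -3 + 127 = 124 = 01111100_2$$

In both cases the stored value $E'$ is a non-negative integer that fits in the 8-bit exponent field, so no separate sign bit is needed for the exponent.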
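In single precision, the end values of E', namely 0 and 255, are reserved for special values: E' = 0 with a zero mantissa represents exact zero, and E' = 255 with a zero mantissa represents infinity. (With a nonzero mantissa, these two exponent values encode denormalized numbers and NaN, respectively.)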
Thus the range of E' for normal values in single precision is $0 \lt E' \lt 255$. This means that for the 32-bit representation the actual exponent E is in the range $-126 \le E \le 127$.
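The single precision layout described above can be inspected with a short Python sketch. This is only an illustration, not part of the standard: the function name `decode_single` and the sample value -6.5 are assumptions made for the example, and the standard library `struct` module is used to obtain the raw 32-bit pattern.

```python
import struct

def decode_single(x):
    """Split a float, rounded to single precision, into (sign, E', mantissa).

    Hypothetical helper for illustration only.
    """
    # Pack as a big-endian 32-bit float, then reinterpret the same
    # 4 bytes as an unsigned 32-bit integer to get the raw bit pattern.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # field 1: 1 bit
    stored_exp = (bits >> 23) & 0xFF  # field 2: 8 bits, this is E'
    mantissa = bits & 0x7FFFFF        # field 3: 23 bits
    return sign, stored_exp, mantissa

sign, e_stored, mantissa = decode_single(-6.5)
e_actual = e_stored - 127             # undo the excess-127 bias: E = E' - 127
print(sign, e_stored, e_actual, format(mantissa, "023b"))
```

For -6.5 the stored exponent is 129, so the actual exponent is E = 129 - 127 = 2. Passing `0.0` or `float("inf")` returns the reserved end values 0 and 255 in the exponent field.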
The 64-bit standard representation shown in Fig. (b) is called a double precision representation because it occupies two 32-bit words. The 64 bits are divided into three fields as shown below:
(field 1) Sign ⟵ 1 bit
(field 2) Exponent ⟵ 11 bits
(field 3) Mantissa ⟵ 52 bits
In the double precision format, the value actually stored in the exponent field is given as
E' = E + 1023
Here, the bias value is 1023, and hence this representation is also called the excess-1023 format. The end values of E', namely 0 and 2047, are used to indicate the floating point values of exact zero and infinity, respectively.
Thus the range of E' for normal values in double precision is $0 \lt E' \lt 2047$. This means that for the 64-bit representation the actual exponent E is in the range $-1022 \le E \le 1023$.
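The same kind of check can be sketched for double precision. As before, this is an illustrative assumption rather than a reference implementation: `decode_double` is a hypothetical helper name, and the field widths (1, 11, and 52 bits) and the bias of 1023 follow the description above.

```python
import struct

def decode_double(x):
    """Split a Python float (a 64-bit double) into (sign, E', mantissa).

    Hypothetical helper for illustration only.
    """
    # Pack as a big-endian 64-bit float, then reinterpret the same
    # 8 bytes as an unsigned 64-bit integer to get the raw bit pattern.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                  # field 1: 1 bit
    stored_exp = (bits >> 52) & 0x7FF  # field 2: 11 bits, this is E'
    mantissa = bits & ((1 << 52) - 1)  # field 3: 52 bits
    return sign, stored_exp, mantissa

sign, e_stored, mantissa = decode_double(-6.5)
e_actual = e_stored - 1023             # undo the excess-1023 bias: E = E' - 1023
print(sign, e_stored, e_actual, format(mantissa, "052b"))
```

For -6.5 the stored exponent is 1025, giving the same actual exponent E = 2 as in single precision; only the bias and the field widths differ.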