Assume the following representation for a floating-point number: 1 sign bit, 4 exponent bits, 3 significand bits, and an exponent bias of 7 (there is no implied 1 as in IEEE).
a) What is the largest number (in binary) that can be stored? Estimate it in decimal.
b) What is the smallest positive number (closest to 0) that can be stored, in binary? Estimate it in decimal.
c) Describe the steps for adding two floating point numbers.
d) Describe the steps for multiplying two floating point numbers.

Answer:

Detailed answers are given in the explanation.

Explanation:

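a)

max exponent field = 1111 = 15 (the stored, biased value)

true exponent = 15 - 7 = 8

max significand = .111 = (0.875)10, since there are only 3 significand bits and no implied 1 (assuming the radix point sits to the left of the significand)

max value = 0.111 * 2^8 = 11100000 in binary = 224 in decimal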
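
The largest value can be checked with a small decoder. This is only a sketch that reads the problem statement literally: 3 significand bits, no implied 1, and (an assumption, since the question does not say) the radix point to the left of the significand. The name `decode` and the bit-string layout sign | exponent | significand are illustrative choices, not part of the question.

```python
BIAS = 7        # exponent bias from the problem statement
EXP_BITS = 4    # width of the exponent field
SIG_BITS = 3    # width of the significand field

def decode(bits: str) -> float:
    """Decode an 8-character bit string laid out as sign | exponent | significand."""
    assert len(bits) == 1 + EXP_BITS + SIG_BITS
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent = int(bits[1:1 + EXP_BITS], 2) - BIAS   # remove the bias
    # Assumption: the significand is the fraction 0.f1f2f3 (no hidden bit).
    fraction = int(bits[1 + EXP_BITS:], 2) / (1 << SIG_BITS)
    return sign * fraction * 2.0 ** exponent

largest = decode("01111111")   # sign 0, exponent 1111, significand 111
print(largest)                 # 224.0
print(bin(int(largest)))       # 0b11100000
```

Under this reading the largest value is 0.111 * 2^8, which is 224 in decimal.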

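b) min exponent field = 0000 = 0, so the true exponent = 0 - 7 = -7

min value = 0.001 * 2^(-7) = 2^(-10), about 0.00098 in decimal, because with no implied 1 the smallest nonzero significand is .001. If the significand is required to be normalized (leading bit 1), the minimum is instead 0.100 * 2^(-7) = 2^(-8), about 0.0039.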
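
A brute-force check of the smallest positive value is easy because the format has only 16 * 8 exponent/significand combinations. The sketch below assumes the significand is the fraction 0.f1f2f3 with no hidden bit; the helper name `value` is hypothetical. It reports the minimum both with and without a normalization requirement, since the question does not say which applies.

```python
BIAS, EXP_BITS, SIG_BITS = 7, 4, 3

def value(exp_field: int, sig_field: int) -> float:
    """Value of a positive number given its raw exponent and significand fields."""
    # Assumption: significand is 0.f1f2f3 (no hidden bit), exponent is biased by 7.
    return (sig_field / 2 ** SIG_BITS) * 2.0 ** (exp_field - BIAS)

# Every positive value: any exponent field, any nonzero significand field.
smallest = min(value(e, s)
               for e in range(2 ** EXP_BITS)
               for s in range(1, 2 ** SIG_BITS))

# Same search restricted to normalized significands (leading bit 1: 100..111).
smallest_normalized = min(value(e, s)
                          for e in range(2 ** EXP_BITS)
                          for s in range(2 ** (SIG_BITS - 1), 2 ** SIG_BITS))

print(smallest)             # 0.0009765625  (2**-10)
print(smallest_normalized)  # 0.00390625    (2**-8)
```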

c) Addition: Suppose we want to add two floating-point numbers, X and Y.

Steps to add floating-point numbers:

Make the exponents of the two numbers the same, if they differ. We do this by rewriting the number with the smaller exponent: shift its mantissa right by the difference between the exponents.

Add the mantissas of X and Y together, taking the signs into account.

If the sum is not in normalized form, shift the radix point and adjust the exponent until it is. (These steps use the convention of a single 1 bit immediately left of the radix point.)

Convert back to the one-byte floating-point representation, truncating bits if needed.
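
The addition steps can be sketched in code as follows. This is a minimal illustration, not a definitive implementation: a number is a hypothetical `(sign, exponent, sig)` triple whose value is (-1)^sign * (sig/8) * 2^exponent, the exponent is the true (unbiased) one, and bits shifted out are simply truncated. Because the question's format has no implied 1, the sketch normalizes to the form 0.1xx rather than 1.xx, which is an assumption of this example.

```python
SIG_BITS = 3
MIN_EXP, MAX_EXP = -7, 8   # true exponent range for a 4-bit field with bias 7

def to_float(x):
    """Numeric value of a (sign, exponent, sig) triple, for checking results."""
    sign, exp, sig = x
    return (-1) ** sign * (sig / 2 ** SIG_BITS) * 2.0 ** exp

def fp_add(x, y):
    (sx, ex, mx), (sy, ey, my) = x, y
    # Step 1: align exponents by shifting the smaller-exponent mantissa right.
    e = max(ex, ey)
    mx >>= e - ex              # shifted-out bits are truncated
    my >>= e - ey
    # Step 2: add the mantissas as signed integers.
    total = (-mx if sx else mx) + (-my if sy else my)
    sign, m = (1, -total) if total < 0 else (0, total)
    if m == 0:
        return (0, MIN_EXP, 0)
    # Step 3: normalize so the mantissa fits in 3 bits with a leading 1.
    while m >= 1 << SIG_BITS:                        # carry out: shift right
        m >>= 1
        e += 1
    while m < 1 << (SIG_BITS - 1) and e > MIN_EXP:   # leading zeros: shift left
        m <<= 1
        e -= 1
    # Step 4: check that the result still fits the exponent field.
    if e > MAX_EXP:
        raise OverflowError("exponent too large for the format")
    return (sign, e, m)

x = (0, 2, 0b110)   # 0.110 * 2^2 = 3.0
y = (0, 1, 0b100)   # 0.100 * 2^1 = 1.0
print(fp_add(x, y), to_float(fp_add(x, y)))   # (0, 3, 4) 4.0
```

Here Y is rewritten as 0.010 * 2^2, the mantissa sum 0.110 + 0.010 = 1.000 carries out of the field, and one right shift gives 0.100 * 2^3 = 4.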

d) Multiplication:

Suppose you want to multiply two floating-point numbers, X and Y.

Steps to multiply floating-point numbers:

Add the two exponents. Call this result z. (If you add the stored, biased exponents, subtract the bias once so that it is not counted twice.)

Multiply the mantissa of X by the mantissa of Y. Call this result m.

If m does not have a single 1 left of the radix point, then adjust the radix point so it does, and adjust the exponent z to compensate.

Add the sign bits, mod 2, to get the sign of the product.

Convert back to the one-byte floating-point representation, truncating bits if needed.
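
The multiplication steps can be sketched in code as follows, under stated assumptions: a number is a hypothetical `(sign, exponent, sig)` triple with value (-1)^sign * (sig/8) * 2^exponent, the exponent is already unbiased, and extra product bits are truncated rather than rounded. As the question's format has no implied 1, this sketch normalizes to 0.1xx instead of 1.xx.

```python
SIG_BITS = 3
MIN_EXP, MAX_EXP = -7, 8   # true exponent range for a 4-bit field with bias 7

def fp_mul(x, y):
    (sx, ex, mx), (sy, ey, my) = x, y
    z = ex + ey        # add the (unbiased) exponents
    m = mx * my        # product of two 3-bit fractions: 6 fraction bits
    sign = sx ^ sy     # adding the sign bits mod 2 is an XOR
    if m == 0:
        return (0, MIN_EXP, 0)
    # Normalize: shift left until the top bit of the 6-bit product is 1.
    while m < 1 << (2 * SIG_BITS - 1):
        m <<= 1
        z -= 1
    m >>= SIG_BITS     # keep the top 3 bits, truncating the rest
    if not MIN_EXP <= z <= MAX_EXP:
        raise ArithmeticError("exponent out of range for the format")
    return (sign, z, m)

three = (0, 2, 0b110)        # 0.110 * 2^2 = 3.0
minus_three = (1, 2, 0b110)  # -3.0
print(fp_mul(three, minus_three))   # (1, 4, 4), i.e. -0.100 * 2^4 = -8.0
```

The exact product is -9 = -0.1001 * 2^4, but only 3 significand bits survive, so the stored result is -8. This is the truncation mentioned in the last step.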