# Assembly Language Programming

```
Floating Point
CPSC 252 Computer Organization
Ellen Walker, Hiram College

Representing Non-Integers
  - Often represented in decimal format
  - Some require infinitely many digits to represent exactly
  - With a fixed number of digits (or bits), many numbers are approximated
  - Precision is a measure of the degree of approximation

Scientific Notation (Decimal)
• Format: m.mmmm x 10^eeeee
  - Normalized = exactly 1 nonzero digit before the decimal point
• Mantissa (m) represents the significant digits
  - Precision limited by number of digits in mantissa
• Exponent (e) represents the magnitude
  - Magnitude limited by number of digits in exponent
  - Exponent < 0 for numbers between 0 and 1

Scientific Notation (Binary)
• Format: 1.mmmm x 2^eeeee
  - Normalized = 1 before the binary point
• Mantissa (m) represents the significant bits
  - Precision limited by number of bits in mantissa
• Exponent (e) represents the magnitude
  - Magnitude limited by number of bits in exponent
  - Exponent < 0 for numbers between 0 and 1

Binary Examples
• 1/16
  1.0 x 2^-4 (mantissa 1.0, exponent -4)
• 32.5
  1.000001 x 2^5 (mantissa 1.000001, exponent 5)

Quick Decimal-to-Binary Conversion (Exact)
1. Multiply the number by a power of 2 big enough to get an integer
2. Convert this integer to binary
3. Place the binary point the appropriate number of bits (based on the power of 2 from step 1) from the right of the number

Conversion Example
• Convert 32.5 to binary
  1. Multiply 32.5 by 2 (result is 65)
  2. Convert 65 to binary (result is 1000001)
  3. Place the binary point (in this case 1 bit from the right) (result is 100000.1)
• Convert to binary scientific notation (result is 1.000001 x 2^5)

Floating Point Representation
• Mantissa - m bits (unsigned)
• Exponent - e bits (signed)
• Sign (separate) - 1 bit
• Total = 1+m+e bits
  - Tradeoff between precision and magnitude
  - Total bits fit into 1 or 2 full words

Implicit First Bit
• Remember the mantissa must always begin with "1."
• Therefore, we can save a bit by not actually representing the 1 explicitly.
• Example:
  - Mantissa bits 0001
  - Mantissa: 1.0001

Offset Exponent
• Exponent can be positive or negative, but it's cleaner (for sorting) to use an unsigned representation
• Therefore, represent exponents as unsigned, but add a bias of -((2^(bits-1))-1)
• Examples: 8 bit exponent (bias is -127)
  - 00000001 = 1 + (-127) = -126
  - 10000000 = 128 + (-127) = 1

IEEE 754 Floating Point Representation (Single)
• Sign (1 bit), Exponent (8 bits), Fraction (23 bits)
  - What is the largest value that can be represented?
  - What is the smallest positive value that can be represented?
  - How many "significant bits" can be represented?
• Values can be sorted using integer comparison
  - Sign first
  - Exponent next (sorted as unsigned)
  - Fraction last (also unsigned)

Double Precision
• Floating point number takes 2 words (64 bits)
• Sign is 1 bit
• Exponent is 11 bits (vs. 8)
• Fraction is 52 bits (vs. 23)
  - Last 32 bits of the fraction are in the second word

Floating Point Errors
• Overflow
  - A positive exponent becomes too large for the exponent field
• Underflow
  - A negative exponent becomes too large in magnitude for the exponent field
• Rounding (not actually an error)
  - The result of an operation has too many significant bits for the fraction field

Special Values
• Infinity
  - Result of dividing a non-zero value by 0
  - Can be positive or negative
  - Infinity +/- any finite value = Infinity
• Not A Number (NaN)
  - Result of an invalid mathematical operation, e.g. 0/0 or Infinity-Infinity

Representing Special Values in IEEE 754
• Exponent ≠ 00, Exponent ≠ FF
  - Ordinary floating point number
• Exponent = 00, Fraction = 0
  - Number is 0
• Exponent = 00, Fraction ≠ 0
  - Denormalized (subnormal) number
• Exponent = FF, Fraction = 0
  - Infinity (+ or -, depending on sign)
• Exponent = FF, Fraction ≠ 0
  - Not a Number (NaN)

Double Precision in MIPS
• Each even register can be considered a register pair for double precision
  - High-order bits in even register
  - Low-order bits in odd register

Floating Point Arithmetic in MIPS
  - Single and double precision addition / subtraction
  - rd = rs +/- rt
• 32 floating point registers $f0 - $f31
  - Use in pairs for double precision
  - Registers for add.d (etc.) must be even numbers

Why Separate Floating Point Registers?
• Twice as many registers using the same number of instruction bits
• Integer & floating point operations usually on distinct data
• Increased parallelism possible
• Customized hardware possible

Number
• lwc1: 32-bit word to FP register
• swc1: FP register to 32-bit word
• ldc1: 2 words to FP register pair
• sdc1: FP register pair to 2 words
• (Note: the last character is the digit 1)

• Align the binary points (make exponents equal)
• Add the aligned mantissas
• Normalize the sum

Changing Exponents for Alignment and Normalization
• To keep the number the same:
  - Left shift mantissa by 1 bit and decrement exponent
  - Right shift mantissa by 1 bit and increment exponent
• Align by right-shifting smaller number
• Normalize by
  - Round result to correct number of significant bits
  - Shift result to put 1 before binary point

Add 1.101 x 2^4 + 1.101 x 2^5 (26+52)
• Align binary points
  1.101 x 2^4 = 0.1101 x 2^5
• Add
     0.1101 x 2^5
   + 1.1010 x 2^5
   --------------
    10.0111 x 2^5
• Normalize:
  10.0111 x 2^5 = 1.00111 x 2^6 (78)
• Round to 3-bit mantissa:
  1.00111 x 2^6 ~= 1.010 x 2^6 (80)

Rounding
• At least 1 bit beyond the last bit is needed
• Rounding up could require renormalization
  - Example: 1.1111 -> 10.000
• For multiplication, 2 extra bits are needed in case the product's first bit is 0 and it must be left shifted (guard, round)
• For complete generality, add a "sticky bit" that is set whenever additional bits to the right would be > 0

Round to Nearest Even
• Most common rounding mode
• If the actual value is halfway between two values, round to an even result
• Examples:
  - 1.0011 -> 1.010
  - 1.0101 -> 1.010
• If the sticky bit is set, round up because the value isn't really halfway between!

[Figure: floating point addition hardware. The sign, exponent, and fraction fields of the two operands feed a small ALU (exponent difference), a shift-right unit, a big ALU, an increment-or-decrement unit, a shift-left-or-right unit, and rounding hardware, all directed by a control unit, producing the sign, exponent, and fraction of the result.]
[Figure: floating point addition flowchart]
Start
1. Compare the exponents of the two numbers. Shift the smaller number to the right until its exponent would match the larger exponent
2. Add the significands
3. Normalize the sum, either shifting right and incrementing the exponent or shifting left and decrementing the exponent
   Overflow or underflow? Yes: Exception. No: continue
4. Round the significand to the appropriate number of bits
   Still normalized? Yes: Done. No: return to step 3

Floating Point Multiplication
1. Calculate new exponent by adding exponents together
2. Multiply the significands (using shift & add)
3. Normalize the product
4. Round
5. Set the sign
• Remember that exponents are biased
  (exp1 + 127) + (exp2 + 127) = (exp1 + exp2 + 254)
• Therefore, subtract the bias from the sum and the result is a correctly biased value

Multiplication Example
• Convert 2.25 x 1.5 to binary floating point (3 bits exponent, 3 bits mantissa)
• 2.25 = 10.01 * 2^0 = 1.001 * 2^1
• Exp = 100 (because bias is 3)
• 2.25 = 0 100 001
• 1.5 = 1.100 * 2^0
• Exp = 011, Mantissa: 100
• 1.5 = 0 011 100

0 100 001 x 0 011 100
1. Add Exponents (and subtract bias)
   100 + 011 - 011 = 100

2. Multiply Significands
0 100 001 x 0 011 100
• Remember to restore the leading 1
• Remember that the number of binary places doubles
     1.001
   x 1.100
   ---------
     .100100
    1.001000
   ---------
    1.101100 x 2^1

Finish Up
• Product is 1.1011 * 2^1
• But, too many bits, so we need to round
• Nearest even number (up) is 1.110
• Result: 0 100 110
• Value is 1.75 * 2 = 3.5

Types of Errors
• Overflow
  - Exponent too large for the number of bits allotted
• Underflow
  - Negative exponent is too small to fit in the number of bits allotted
• Rounding error
  - Mantissa has too many bits

Overflow and Underflow
• Addition
  - Overflow is possible when adding two positive or two negative numbers
• Multiplication
  - Overflow is possible when multiplying two numbers with large absolute values
  - Underflow is possible when multiplying two numbers very close to 0

Limitations of Finite Floating Point Representations
• Gap between 0 and the smallest nonzero number
• Gaps between values when the last bit of the mantissa changes
• Fixed number of values between 0 and 1
• Significant effects of rounding in mathematical operations

Implications for Programmers
• Mathematical rules are not always followed
  - (a / b) * b does not always equal a
  - (a + b) + c does not always equal a + (b + c)
• Use inequality comparisons instead of directly comparing floating point numbers (with ==)
  - if ((x > -epsilon) && (x < epsilon)) instead of if (x == 0)
  - Epsilon can be set based on problem or knowledge of representation (e.g. single vs. double precision)

The Pentium Floating Point Bug
• To speed up division, a table was used
• It was assumed that 5 elements of the table would never be accessed (and the hardware was optimized to make them 0)
• These table elements occasionally caused errors in bits 12 to 52 of floating point significands
• (see Section 3.8 for more)

A Marketing Error
• July 1994 - Intel discovers the bug, decides not to halt production or recall chips
• September 1994 - A professor discovers the bug, posts to Internet (after attempting to inform Intel)
• November 1994 - Press articles, Intel says it will affect "maybe several dozen people"
• December 1994 - IBM disputes claim and halts shipment of Pentium-based PCs.
• Late December 1994 - Intel apologizes

The "Big Picture"
• Bits in memory have no inherent meaning. A given sequence can contain
  - An instruction
  - An integer
  - A string of characters
  - A floating point number
• All number representations are finite
• Finite arithmetic requires compromises
```
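
The single-precision layout described in the slides (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits with an implicit leading 1) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original lecture: the function names `float_to_bits`, `decode_single`, `classify`, and `value_of_normal` are hypothetical, and the code assumes Python's standard `struct` module to obtain the bit pattern of a value.

```python
import struct

BIAS = 127  # single-precision exponent bias: 2^(8-1) - 1


def float_to_bits(x):
    """Return the 32-bit IEEE 754 single-precision pattern of x as an int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def decode_single(bits):
    """Split a 32-bit pattern into (sign, exponent field, fraction field)."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF   # stored (biased) exponent
    fraction = bits & 0x7FFFFF       # 23 stored bits, leading 1 is implicit
    return sign, exponent, fraction


def classify(bits):
    """Apply the special-value table: exponent 00 or FF is not ordinary."""
    _, exponent, fraction = decode_single(bits)
    if exponent == 0xFF:
        return "infinity" if fraction == 0 else "nan"
    if exponent == 0x00:
        return "zero" if fraction == 0 else "denormalized"
    return "normal"


def value_of_normal(bits):
    """Rebuild the value of an ordinary number: (-1)^s * 1.f * 2^(e - bias)."""
    sign, exponent, fraction = decode_single(bits)
    mantissa = 1 + fraction / 2**23  # restore the implicit leading 1
    return (-1) ** sign * mantissa * 2.0 ** (exponent - BIAS)


if __name__ == "__main__":
    bits = float_to_bits(32.5)       # 32.5 = 1.000001 x 2^5
    print(hex(bits))                 # 0x42020000
    print(decode_single(bits))       # (0, 132, 131072)
    print(classify(bits), value_of_normal(bits))
```

For 32.5 the stored exponent is 132 (5 + 127) and the fraction field holds only the bits after the binary point, 000001 followed by zeros, which matches the 1.000001 x 2^5 example from the slides.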