Video 1: Intro to Floating point
(Unsigned) Fixed-point representation
The numbers are stored with a fixed number of bits for the integer part and a fixed number of bits for the fractional part. Suppose we have 8 bits to store a real number, where 5 bits store the integer part and 3 bits store the fractional part:
Place values: $2^4 \; 2^3 \; 2^2 \; 2^1 \; 2^0 \; . \; 2^{-1} \; 2^{-2} \; 2^{-3}$
Example: $(10111.011)_2$
Smallest number: $(00000.001)_2 = (0.125)_{10}$
Largest number: $(11111.111)_2 = (31.875)_{10}$
(Unsigned) Fixed-point representation
Suppose we have 64 bits to store a real number, where 32 bits store the integer part and 32 bits store the fractional part:
$$(a_{31} \dots a_2 a_1 a_0 \,.\, b_1 b_2 b_3 \dots b_{32})_2 = \sum_{k=0}^{31} a_k 2^k + \sum_{k=1}^{32} b_k 2^{-k}$$
$$= a_{31} \times 2^{31} + a_{30} \times 2^{30} + \dots + a_0 \times 2^0 + b_1 \times 2^{-1} + b_2 \times 2^{-2} + \dots + b_{32} \times 2^{-32}$$
Smallest number: $(00 \dots 0.00 \dots 01)_2 = 2^{-32} \approx 2.3 \times 10^{-10}$
Largest number: $(11 \dots 1.11 \dots 1)_2 = 2^{32} - 2^{-32} \approx 4.3 \times 10^{9}$
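The two fixed-point examples above can be checked with a short sketch. The helper names `fixed_point_value` and `fixed_point_limits` are hypothetical (they are not part of the slides), and exact rational arithmetic is used so that the 64-bit case is not rounded.

```python
from fractions import Fraction

def fixed_point_value(bits: str, frac_bits: int) -> Fraction:
    """Value of an unsigned bit string whose last `frac_bits` bits are fractional."""
    return Fraction(int(bits, 2), 2 ** frac_bits)

def fixed_point_limits(int_bits: int, frac_bits: int):
    """Smallest positive and largest value for the given bit split."""
    smallest = Fraction(1, 2 ** frac_bits)   # only the last fractional bit set
    largest = 2 ** int_bits - smallest       # all bits set
    return smallest, largest

# (10111.011)_2 written without the binary point, 3 fractional bits
print(float(fixed_point_value("10111011", 3)))           # 23.375
print([float(v) for v in fixed_point_limits(5, 3)])      # [0.125, 31.875]

# 32 integer bits and 32 fractional bits
smallest, largest = fixed_point_limits(32, 32)
print(float(smallest))   # about 2.3e-10
```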
Fixed-point representation
How can we decide where to locate the binary point? More bits on the integer part? More bits on the fractional part?
For example, with 5 bits laid out as $a_0.b_1b_2b_3b_4$, the smallest positive number is $2^{-4} = 0.0625$.
(Unsigned) Fixed-point representation
Range: difference between the largest and smallest numbers possible.
More bits for the integer part ⟶ increase range
Precision: smallest possible difference between any two numbers.
More bits for the fractional part ⟶ increase precision
$a_2a_1a_0.b_1b_2b_3$ OR $a_1a_0.b_1b_2b_3b_4$
Wherever we put the binary point, there is a trade-off between the amount of range and precision. It can be hard to decide how much you need of each!
Fix: Let the binary point “float”
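To make the trade-off concrete, the sketch below tabulates the largest value and the spacing for every placement of the binary point. The function `layout_summary` is a hypothetical helper, and the 6-bit width is an assumption chosen to match the two layouts shown above.

```python
from fractions import Fraction

def layout_summary(total_bits: int, int_bits: int):
    """Largest value and spacing for an unsigned fixed-point layout."""
    frac_bits = total_bits - int_bits
    spacing = Fraction(1, 2 ** frac_bits)   # precision: gap between neighbors
    largest = 2 ** int_bits - spacing       # range grows with the integer bits
    return largest, spacing

TOTAL_BITS = 6  # assumption: a_2a_1a_0.b_1b_2b_3 and a_1a_0.b_1b_2b_3b_4 both use 6 bits
for int_bits in range(TOTAL_BITS + 1):
    largest, spacing = layout_summary(TOTAL_BITS, int_bits)
    print(f"{int_bits} integer bits, {TOTAL_BITS - int_bits} fractional bits: "
          f"largest = {float(largest)}, spacing = {float(spacing)}")
```

Moving one bit from the integer part to the fractional part halves both the spacing and (roughly) the largest value, which is the trade-off described above.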
Scientific Notation
In scientific notation, a number can be expressed in the form $x = \pm r \times 10^m$, where $r$ is a coefficient in the range $1 \le r < 10$ and $m$ is the exponent.
$1165.7 = 1.1657 \times 10^{3}$
$0.0004728 = 4.728 \times 10^{-4}$
Note how the decimal point “floats”!
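A minimal sketch of the same normalization in code, assuming the number is given as a decimal string. The function `to_scientific` is a hypothetical name; it relies on the standard `decimal` module so the digits are not disturbed by binary rounding.

```python
from decimal import Decimal

def to_scientific(text: str):
    """Return (sign, r, m) with value = sign * r * 10**m and 1 <= r < 10."""
    d = Decimal(text)
    if d == 0:
        raise ValueError("zero has no normalized scientific form")
    sign = -1 if d < 0 else 1
    m = d.adjusted()            # exponent of the most significant digit
    r = abs(d).scaleb(-m)       # shift the decimal point by m places
    return sign, r, m

print(to_scientific("1165.7"))      # (1, Decimal('1.1657'), 3)
print(to_scientific("0.0004728"))   # (1, Decimal('4.728'), -4)
```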
Floating-point numbers
A floating-point number can represent numbers of different orders of magnitude (very large and very small) with the same fixed number of digits. In general, in the binary system, a floating-point number can be expressed as
$x = \pm q \times 2^m$
$q$ is the significand, normally a fractional value in the range $[1.0, 2.0)$
$m$ is the exponent
$m \in [L, U]$, for example $m \in [-4, 4]$
Floating-point numbers
Numerical Form: $x = \pm q \times 2^m = \pm b_0.b_1b_2b_3 \dots b_n \times 2^m$, with $b_i \in \{0,1\}$
Exponent range: $m \in [L, U]$
Precision: $p = n + 1$
$b_0$ is the leading bit and $b_1b_2b_3 \dots b_n$ is the fractional part of the significand ($n$ digits).
Video 2: Normalized floating point representation
Converting floating points
Convert $(39.6875)_{10} = (100111.1011)_2$ into floating point representation:
$(100111.1011)_2 = 1.001111011 \times 2^5 = 0.1001111011 \times 2^6$
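The conversion above can be reproduced with a short sketch. `normalize` and `binary_digits` are hypothetical helper names; `math.frexp` is the standard-library call that splits a float into a mantissa in $[0.5, 1)$ and a power of two, which is rescaled here to a significand in $[1, 2)$.

```python
import math

def normalize(x: float):
    """Return (q, m) with x = q * 2**m and 1 <= q < 2, for x > 0."""
    mantissa, exponent = math.frexp(x)   # x = mantissa * 2**exponent, 0.5 <= mantissa < 1
    return mantissa * 2, exponent - 1

def binary_digits(q: float, n: int) -> str:
    """First n fractional bits of a significand q in [1, 2), as '1.bbb...'."""
    frac = q - 1
    bits = []
    for _ in range(n):
        frac *= 2
        bit = int(frac)        # next binary digit is the integer part
        bits.append(str(bit))
        frac -= bit
    return "1." + "".join(bits)

q, m = normalize(39.6875)
print(binary_digits(q, 9), "x 2^", m)   # 1.001111011 x 2^ 5
```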
Normalized floating-point numbers
Normalized floating point numbers are expressed as
$x = \pm 1.b_1b_2b_3 \dots b_n \times 2^m = \pm 1.f \times 2^m$
where $f$ is the fractional part of the significand, $m$ is the exponent and $b_i \in \{0,1\}$.
Hidden bit: the leading bit is always 1, so it does not need to be stored. A significand with precision $p = 5$, $1.b_1b_2b_3b_4$, needs only the 4 bits $b_1b_2b_3b_4$.
- Exponent range: $m \in [L, U]$
- Precision: $p = n + 1$
- Smallest positive normalized FP number: $1.00 \dots 0 \times 2^L = 2^L$
- Largest positive normalized FP number: $1.11 \dots 1 \times 2^U = 2^{U+1}(1 - 2^{-p})$
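The formulas in the list above can be evaluated directly. This is a sketch with a hypothetical function name, `normalized_limits`; the comparison against `sys.float_info` assumes the platform's Python floats are IEEE 754 double precision ($n = 52$, $L = -1022$, $U = 1023$), which is not stated in the slides.

```python
import sys

def normalized_limits(n: int, L: int, U: int):
    """Precision, smallest and largest positive normalized numbers."""
    p = n + 1
    smallest = 2.0 ** L
    # (2 - 2**-n) * 2**U equals 2**(U+1) * (1 - 2**-p); written this way
    # so that 2**(U+1) is never formed (it would overflow for doubles).
    largest = (2.0 - 2.0 ** -n) * 2.0 ** U
    return p, smallest, largest

# Small example: n = 2 fractional bits, m in [-4, 4]
print(normalized_limits(2, -4, 4))   # (3, 0.0625, 28.0)

# Assumed IEEE 754 double-precision parameters
p, smallest, largest = normalized_limits(52, -1022, 1023)
print(smallest == sys.float_info.min, largest == sys.float_info.max)
```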
Normalized floating point number scale
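[Figure: number line from $-\infty$ to $+\infty$ for $x = \pm 1.f \times 2^m$ with $m \in [L, U]$ and $p = n + 1$. Handwritten labels mark overflow at both ends and underflow around zero, and ask about the size of the gaps between neighboring numbers.]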
Floating-point numbers: Simple example
A "toy" number system can be represented as $x = \pm 1.b_1b_2 \times 2^m$
for $m \in [-4,4]$ and $b_i \in \{0,1\}$.
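$(1.00)_2 \times 2^{-4} = 0.0625$   $(1.01)_2 \times 2^{-4} = 0.078125$   $(1.10)_2 \times 2^{-4} = 0.09375$   $(1.11)_2 \times 2^{-4} = 0.109375$
$(1.00)_2 \times 2^{-3} = 0.125$   $(1.01)_2 \times 2^{-3} = 0.15625$   $(1.10)_2 \times 2^{-3} = 0.1875$   $(1.11)_2 \times 2^{-3} = 0.21875$
$(1.00)_2 \times 2^{-2} = 0.25$   $(1.01)_2 \times 2^{-2} = 0.3125$   $(1.10)_2 \times 2^{-2} = 0.375$   $(1.11)_2 \times 2^{-2} = 0.4375$
$(1.00)_2 \times 2^{-1} = 0.5$   $(1.01)_2 \times 2^{-1} = 0.625$   $(1.10)_2 \times 2^{-1} = 0.75$   $(1.11)_2 \times 2^{-1} = 0.875$
$(1.00)_2 \times 2^{0} = 1$   $(1.01)_2 \times 2^{0} = 1.25$   $(1.10)_2 \times 2^{0} = 1.5$   $(1.11)_2 \times 2^{0} = 1.75$
$(1.00)_2 \times 2^{1} = 2$   $(1.01)_2 \times 2^{1} = 2.5$   $(1.10)_2 \times 2^{1} = 3.0$   $(1.11)_2 \times 2^{1} = 3.5$
$(1.00)_2 \times 2^{2} = 4.0$   $(1.01)_2 \times 2^{2} = 5.0$   $(1.10)_2 \times 2^{2} = 6.0$   $(1.11)_2 \times 2^{2} = 7.0$
$(1.00)_2 \times 2^{3} = 8.0$   $(1.01)_2 \times 2^{3} = 10.0$   $(1.10)_2 \times 2^{3} = 12.0$   $(1.11)_2 \times 2^{3} = 14.0$
$(1.00)_2 \times 2^{4} = 16.0$   $(1.01)_2 \times 2^{4} = 20.0$   $(1.10)_2 \times 2^{4} = 24.0$   $(1.11)_2 \times 2^{4} = 28.0$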
The same steps are performed to obtain the negative numbers. For simplicity, we will show only the positive numbers in this example.
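The positive numbers of the toy system can also be generated programmatically. The sketch below uses a hypothetical function, `toy_numbers`, with defaults matching $n = 2$ and $m \in [-4, 4]$; it also collects the distinct gaps between neighboring numbers, which double each time the exponent increases by one.

```python
from fractions import Fraction

def toy_numbers(n: int = 2, L: int = -4, U: int = 4):
    """All positive numbers 1.b_1...b_n x 2^m for m in [L, U], in increasing order."""
    values = []
    for m in range(L, U + 1):
        for k in range(2 ** n):
            significand = 1 + Fraction(k, 2 ** n)   # 1.00, 1.01, 1.10, 1.11 for n = 2
            values.append(float(significand * Fraction(2) ** m))
    return values

values = toy_numbers()
print(len(values), values[0], values[-1])   # 36 0.0625 28.0

# Distinct gaps between consecutive numbers
gaps = sorted({b - a for a, b in zip(values, values[1:])})
print(gaps)
```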
$x = \pm 1.b_1b_2 \times 2^m$ for $m \in [-4,4]$ and $b_i \in \{0,1\}$
- Smallest normalized positive number: $2^L = 2^{-4} = 0.0625$
- Largest normalized positive number: $2^{U+1}(1 - 2^{-p}) = 2^{5}(1 - 2^{-3}) = 28$, with $U = 4$ and $p = n + 1 = 3$
Machine epsilon
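- Machine epsilon ($\epsilon_m$) is defined as the distance (gap) between 1 and the next larger floating point number.
For the toy system $x = \pm 1.b_1b_2 \times 2^m$: $\epsilon_m = 1.25 - 1 = 0.25$
In general, for $x = \pm 1.f \times 2^m$ with $n$ fractional bits:
$(1)_{10} = 1.00 \dots 00 \times 2^0$, and the next larger number is $1.00 \dots 01 \times 2^0$
$\epsilon_m = 2^{-n} \times 2^0 = 2^{-n}$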
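A quick numerical check of $\epsilon_m = 2^{-n}$, as a sketch. The function names are hypothetical, and the comparison assumes Python floats are IEEE 754 doubles with $n = 52$ fractional bits, which goes beyond what the slides state.

```python
import sys

def toy_epsilon(n: int) -> float:
    """Machine epsilon for a significand with n fractional bits."""
    return 2.0 ** -n

def machine_epsilon_by_halving() -> float:
    """Halve eps until 1 + eps/2 is no longer distinguishable from 1."""
    eps = 1.0
    while 1.0 + eps / 2 > 1.0:
        eps /= 2
    return eps

print(toy_epsilon(2))                                             # 0.25
print(machine_epsilon_by_halving() == sys.float_info.epsilon)     # expected True for IEEE doubles
```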
Range of integer numbers
Suppose you have the following normalized floating point representation:
$x = \pm 1.b_1b_2 \times 2^m$ for $m \in [-4,4]$ and $b_i \in \{0,1\}$
What is the range of integer numbers that you can represent exactly?
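$(1)_{10} = (1)_2 = 1.00 \times 2^0$
$(2)_{10} = (10)_2 = 1.00 \times 2^1$
$(3)_{10} = (11)_2 = 1.10 \times 2^1$
$(8)_{10} = (1000)_2 = 1.00 \times 2^3$
$(9)_{10} = (1001)_2 = 1.001 \times 2^3$ needs three fractional bits, so it cannot be represented.
$(10)_{10} = (1010)_2 = 1.01 \times 2^3$
Every integer $k$ with $1 \le |k| \le 8 = 2^p$ is represented exactly; 9 is the first integer that is not.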
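The integer range can be confirmed by brute force. This is a sketch with hypothetical helper names (`toy_values`, `largest_contiguous_integer`); it enumerates the positive numbers of the toy system exactly and counts upward from 1 until an integer is missing.

```python
from fractions import Fraction

def toy_values(n: int = 2, L: int = -4, U: int = 4):
    """Exact set of positive numbers 1.b_1...b_n x 2^m for m in [L, U]."""
    return {(1 + Fraction(k, 2 ** n)) * Fraction(2) ** m
            for m in range(L, U + 1)
            for k in range(2 ** n)}

def largest_contiguous_integer(n: int = 2, L: int = -4, U: int = 4) -> int:
    """Largest K such that every integer 1, 2, ..., K is representable."""
    values = toy_values(n, L, U)
    k = 0
    while Fraction(k + 1) in values:
        k += 1
    return k

print(largest_contiguous_integer())     # 8
print(Fraction(9) in toy_values())      # False
print(Fraction(10) in toy_values())     # True
```

Larger integers such as 10, 12 and 14 are still representable, but the contiguous run of exactly represented integers stops at $2^p = 8$.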