Video 1: Intro to Floating Point


  1. Video 1: Intro to Floating point

  2. (Unsigned) Fixed-point representation. The numbers are stored with a fixed number of bits for the integer part and a fixed number of bits for the fractional part. Suppose we have 8 bits to store a real number, where 5 bits store the integer part and 3 bits store the fractional part. For example, the bit pattern 10111.011 carries the place values $2^4\ 2^3\ 2^2\ 2^1\ 2^0\ .\ 2^{-1}\ 2^{-2}\ 2^{-3}$, so $(10111.011)_2 = 16 + 4 + 2 + 1 + 0.25 + 0.125 = (23.375)_{10}$.
     Smallest number: $(00000.001)_2 = (0.125)_{10}$
     Largest number: $(11111.111)_2 = (31.875)_{10}$
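
The place-value sum on this slide can be checked with a few lines of Python. This is a minimal sketch; the helper name `fixed_point_value` and the string input format are illustrative assumptions, not part of the slides.

```python
def fixed_point_value(bits: str) -> float:
    """Value of an unsigned fixed-point bit string such as '10111.011'."""
    int_part, _, frac_part = bits.partition(".")
    value = 0.0
    # Integer bits carry weights 2^0, 2^1, ... counted from the binary point leftwards.
    for i, bit in enumerate(reversed(int_part)):
        value += int(bit) * 2.0 ** i
    # Fractional bits carry weights 2^-1, 2^-2, ... counted from the binary point rightwards.
    for j, bit in enumerate(frac_part, start=1):
        value += int(bit) * 2.0 ** -j
    return value


print(fixed_point_value("10111.011"))  # 23.375
print(fixed_point_value("00000.001"))  # 0.125  (smallest nonzero value)
print(fixed_point_value("11111.111"))  # 31.875 (largest value)
```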

  3. (Unsigned) Fixed-point representation. Suppose we have 64 bits to store a real number, where 32 bits store the integer part and 32 bits store the fractional part:
     $a_{31} \dots a_2 a_1 a_0 \,.\, b_1 b_2 b_3 \dots b_{32}$
     $(x)_{10} = a_{31} \times 2^{31} + a_{30} \times 2^{30} + \dots + a_0 \times 2^{0} + b_1 \times 2^{-1} + b_2 \times 2^{-2} + \dots + b_{32} \times 2^{-32}$
     Smallest number: $(00\dots0.00\dots01)_2 = 2^{-32} \approx 2.3 \times 10^{-10}$
     Largest number: $(11\dots1.11\dots1)_2 = 2^{32} - 2^{-32} \approx 4.3 \times 10^{9}$

  4. Fixed-point representation. How can we decide where to locate the binary point? More bits on the integer part? More bits on the fractional part?

  5. (Unsigned) Fixed-point representation.
     Range: difference between the largest and smallest numbers possible. More bits for the integer part ⟶ increase range.
     Precision: smallest possible difference between any two numbers. More bits for the fractional part ⟶ increase precision.
     $a_2 a_1 a_0 \,.\, b_1 b_2 b_3$ OR $a_1 a_0 \,.\, b_1 b_2 b_3 b_4$
     Wherever we put the binary point, there is a trade-off between the amount of range and precision. It can be hard to decide how much you need of each! Fix: let the binary point "float".
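
To make the trade-off concrete, the sketch below tabulates the largest value and the precision for every possible position of the binary point in an 8-bit unsigned format. The function name `fixed_point_limits` and the 8-bit total are assumptions chosen for illustration.

```python
def fixed_point_limits(int_bits, frac_bits):
    """Return (largest value, precision) of an unsigned fixed-point format."""
    precision = 2.0 ** -frac_bits          # spacing between neighbouring numbers
    largest = 2.0 ** int_bits - precision  # all bits set to 1
    return largest, precision


TOTAL_BITS = 8  # assumed word size for this illustration
for int_bits in range(TOTAL_BITS + 1):
    frac_bits = TOTAL_BITS - int_bits
    largest, precision = fixed_point_limits(int_bits, frac_bits)
    print(f"{int_bits} integer + {frac_bits} fractional bits: "
          f"largest = {largest}, precision = {precision}")
```

Each bit moved from the fractional part to the integer part roughly doubles the largest value and doubles the spacing between neighbouring numbers.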

  6. Scientific notation. In scientific notation, a number can be expressed in the form $x = \pm r \times 10^{m}$, where $r$ is a coefficient in the range $1 \le r < 10$ and $m$ is the exponent. E.g. $1165.7 = 1.1657 \times 10^{3}$ and $0.0004728 = 4.728 \times 10^{-4}$. Note how the decimal point "floats"!
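
The two examples can be reproduced with Python's `decimal` module, which keeps the decimal digits exact. This is only a sketch; the helper name `scientific` and the choice to pass numbers as strings are assumptions.

```python
from decimal import Decimal


def scientific(text: str):
    """Split a positive decimal number into (r, m) with x = r * 10**m and 1 <= r < 10."""
    d = Decimal(text)
    m = d.adjusted()     # exponent of the most significant digit
    r = d.scaleb(-m)     # shift the decimal point so one digit precedes it
    return r, m


print(scientific("1165.7"))     # (Decimal('1.1657'), 3)
print(scientific("0.0004728"))  # (Decimal('4.728'), -4)
```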

  7. Floating-point numbers. A floating-point number can represent numbers of different orders of magnitude (very large and very small) with the same fixed number of digits. In general, in the binary system, a floating-point number can be expressed as $x = \pm q \times 2^{m}$, where $q$ is the significand, normally a fractional value in the range $[1.0, 2.0)$, and $m$ is the exponent, with $m \in [L, U]$.

  8. Floating-point numbers. Numerical form: $x = \pm q \times 2^{m} = \pm b_0 . b_1 b_2 b_3 \dots b_n \times 2^{m}$, where $b_0$ is the leading bit, $b_1 b_2 b_3 \dots b_n$ is the fractional part of the significand ($n$ digits), and $b_i \in \{0, 1\}$.
     Exponent range: $m \in [L, U]$
     Precision: $p = n + 1$

  9. Video 2: Normalized floating point representation

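  10. Converting floating points. Convert $(39.6875)_{10} = (100111.1011)_2$ into floating point representation. Moving the binary point gives $1.001111011 \times 2^{5}$ (leading bit equal to 1) or, equivalently, $0.1001111011 \times 2^{6}$.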
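
Both forms of the conversion can be checked in Python. The standard-library function `math.frexp` returns a significand in $[0.5, 1)$, which corresponds to the $0.1\ldots \times 2^{6}$ form; rescaling by 2 gives the $1.\ldots \times 2^{5}$ form. The helper names `normalize` and `binary_digits` are illustrative assumptions.

```python
import math


def normalize(x: float):
    """Return (q, m) with x = q * 2**m and 1 <= q < 2, for x > 0."""
    mantissa, exponent = math.frexp(x)  # x = mantissa * 2**exponent, 0.5 <= mantissa < 1
    return mantissa * 2.0, exponent - 1


def binary_digits(q: float, digits: int) -> str:
    """Binary expansion of q in [0, 2) with the given number of fractional digits."""
    whole = int(q)
    frac = q - whole
    out = []
    for _ in range(digits):
        frac *= 2            # shift the next binary digit in front of the point
        bit = int(frac)
        out.append(str(bit))
        frac -= bit
    return f"{whole}." + "".join(out)


q, m = normalize(39.6875)
print(binary_digits(q, 9), m)                # 1.001111011 5
mantissa, exponent = math.frexp(39.6875)
print(binary_digits(mantissa, 10), exponent)  # 0.1001111011 6
```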

  11. Normalized floating-point numbers. Normalized floating point numbers are expressed as $x = \pm 1.b_1 b_2 b_3 \dots b_n \times 2^{m} = \pm 1.f \times 2^{m}$, where $f$ is the fractional part of the significand, $m \in [L, U]$ is the exponent and $b_i \in \{0, 1\}$. Because the leading bit is always 1, it does not need to be stored (the "hidden" bit): for example, a significand $b_0 b_1 b_2 b_3 b_4$ has precision $p = 5$, but only the 4 bits $b_1 b_2 b_3 b_4$ are stored.

  12. Normalized floating-point numbers. $x = \pm q \times 2^{m} = \pm 1.b_1 b_2 b_3 \dots b_n \times 2^{m} = \pm 1.f \times 2^{m}$
     • Exponent range: $m \in [L, U]$
     • Precision: $p = n + 1$
     • Smallest positive normalized FP number: $1.00\dots0 \times 2^{L} = 2^{L}$
     • Largest positive normalized FP number: $1.11\dots1 \times 2^{U} = 2^{U+1}\,(1 - 2^{-p})$
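
The two closed-form expressions can be written as small Python functions. This is a sketch; the function names are assumptions, and the largest value is computed in the algebraically equivalent form $2^{U}(2 - 2^{-n})$ so that the intermediate $2^{U+1}$ is never formed.

```python
def smallest_normalized(L: int) -> float:
    """Smallest positive normalized number: 1.00...0 x 2^L."""
    return 2.0 ** L


def largest_normalized(U: int, n: int) -> float:
    """Largest positive normalized number: 1.11...1 x 2^U.

    Equal to 2^(U+1) * (1 - 2^-p) with p = n + 1, rewritten as 2^U * (2 - 2^-n).
    """
    return 2.0 ** U * (2.0 - 2.0 ** -n)


# Example parameters (assumed for illustration): n = 2 fraction bits, m in [-4, 4].
print(smallest_normalized(-4))    # 0.0625
print(largest_normalized(4, 2))   # 28.0
```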

  13. Normalized floating point number scale. [Figure: number line from $-\infty$ to $+\infty$ for $x = \pm 1.f \times 2^{m}$, $m \in [L, U]$. Overflow regions lie beyond the largest normalized numbers at both ends; an underflow gap surrounds 0, between $-2^{L}$ and $2^{L}$.]

  14. Floating-point numbers: Simple example. A "toy" number system can be represented as $x = \pm 1.b_1 b_2 \times 2^{m}$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$, so $n = 2$. For each exponent $m = -4, -3, \dots, 4$ the significand runs from $1.00$ to $1.11$; for example, with $m = 0$ the numbers range from $1.00 \times 2^{0} = 1$ to $1.11 \times 2^{0} = 1.75$.

  15. Floating-point numbers: Simple example. A "toy" number system can be represented as $x = \pm 1.b_1 b_2 \times 2^{m}$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$. The positive numbers, listed for significands $(1.00)_2$, $(1.01)_2$, $(1.10)_2$, $(1.11)_2$:
     m = -4: 0.0625, 0.078125, 0.09375, 0.109375
     m = -3: 0.125, 0.15625, 0.1875, 0.21875
     m = -2: 0.25, 0.3125, 0.375, 0.4375
     m = -1: 0.5, 0.625, 0.75, 0.875
     m = 0: 1, 1.25, 1.5, 1.75
     m = 1: 2, 2.5, 3.0, 3.5
     m = 2: 4.0, 5.0, 6.0, 7.0
     m = 3: 8.0, 10.0, 12.0, 14.0
     m = 4: 16.0, 20.0, 24.0, 28.0
     The gap between consecutive numbers grows with the exponent: 0.125 for $m = -1$, 2.0 for $m = 3$, and 4.0 for $m = 4$.
     The same steps are performed to obtain the negative numbers. For simplicity, only the positive numbers are shown in this example.
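
The whole toy system is small enough to enumerate. The following sketch generates every positive number of the form $1.b_1 b_2 \times 2^{m}$; the function name `toy_numbers` and its default parameters are assumptions that mirror the slide.

```python
from itertools import product


def toy_numbers(n=2, L=-4, U=4):
    """All positive normalized numbers 1.b1...bn x 2^m with L <= m <= U, sorted."""
    values = []
    for m in range(L, U + 1):
        for bits in product((0, 1), repeat=n):
            # Fractional part f = b1 * 2^-1 + b2 * 2^-2 + ...
            f = sum(b * 2.0 ** -(i + 1) for i, b in enumerate(bits))
            values.append((1.0 + f) * 2.0 ** m)
    return sorted(values)


numbers = toy_numbers()
print(len(numbers))             # 36 (9 exponents x 4 significands)
print(numbers[0], numbers[-1])  # 0.0625 28.0
print(numbers[16:20])           # [1.0, 1.25, 1.5, 1.75]
```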

  16. $x = \pm 1.b_1 b_2 \times 2^{m}$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$
     • Smallest normalized positive number: $2^{L} = 2^{-4} = 0.0625$
     • Largest normalized positive number: $2^{U+1}\,(1 - 2^{-p}) = 2^{5}\,(1 - 2^{-3}) = 28$, with $U = 4$ and $p = n + 1 = 3$

  17. Machine epsilon. Machine epsilon ($\epsilon_m$) is defined as the distance (gap) between 1 and the next larger floating point number.
     For the toy system $x = \pm 1.b_1 b_2 \times 2^{m}$ with $m \in [-4, 4]$ and $b_i \in \{0, 1\}$: $1 = 1.00 \times 2^{0}$ and the next larger number is $1.01 \times 2^{0} = 1.25$, so $\epsilon_m = 0.25 = 2^{-2}$.
     In general, for $x = \pm 1.b_1 b_2 \dots b_n \times 2^{m}$: $\epsilon_m = 1.00\dots01 \times 2^{0} - 1.00\dots00 \times 2^{0} = 2^{-n}$.
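
The formula $\epsilon_m = 2^{-n}$ can be compared with what a real machine reports. The sketch below assumes that Python's `float` is an IEEE 754 double with $n = 52$ fraction bits (true on common platforms, but not stated in the slides); the function name `machine_epsilon` is also an assumption.

```python
import sys


def machine_epsilon(n: int) -> float:
    """Gap between 1 and the next larger number when the significand has n fraction bits."""
    return 2.0 ** -n


print(machine_epsilon(2))  # 0.25, the toy system

# Empirical search in the machine's own float format (assumed IEEE 754 double):
# halve eps for as long as adding the halved value to 1 still changes the result.
eps = 1.0
while 1.0 + eps / 2 > 1.0:
    eps /= 2

print(eps == sys.float_info.epsilon)  # True
print(eps == machine_epsilon(52))     # True under the double-precision assumption
```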

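  18. Range of integer numbers. Suppose you have the following normalized floating point representation: $x = \pm 1.b_1 b_2 \times 2^{m}$ for $m \in [-4, 4]$ and $b_i \in \{0, 1\}$. What is the range of integer numbers that you can represent exactly?
     $7 = (111)_2 = 1.11 \times 2^{2}$ and $8 = (1000)_2 = 1.00 \times 2^{3}$ are representable, but $9 = (1001)_2 = 1.001 \times 2^{3}$ needs three fractional bits and is not; the next representable integer is $10 = (1010)_2 = 1.01 \times 2^{3}$. Every integer from 1 up to $8 = 2^{p}$ (and its negative) can therefore be represented exactly.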
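
A short search confirms where the run of exactly representable integers ends. This is a sketch for the toy system only; the helper name `is_representable` is an assumption.

```python
def is_representable(k: int, n: int = 2, L: int = -4, U: int = 4) -> bool:
    """True if the positive integer k equals 1.b1...bn x 2^m for some L <= m <= U."""
    m = k.bit_length() - 1  # exponent of the leading 1 bit
    if not L <= m <= U:
        return False
    # Only n bits follow the leading 1; any lower-order bits of k must be zero.
    dropped = m - n
    return dropped <= 0 or k % (1 << dropped) == 0


k = 1
while is_representable(k):
    k += 1
print(k)  # 9, the first positive integer that cannot be represented

print([j for j in range(1, 33) if is_representable(j)])
# [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28]
```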
