
Real numbers, e.g. 1/7, or double float 32 bit - PowerPoint PPT Presentation

Real numbers, e.g. 1/7, or float (32 bit) and double (64 bit). "By default, the x87 processors all use 80-bit double-extended precision internally" (wikipedia.org/wiki/X87). Algorithms using real numbers.


  1. Real numbers, e.g. 1/7 ≈ 0.142857…, or float (32 bit) and double (64 bit). “By default, the x87 processors all use 80-bit double-extended precision internally” (wikipedia.org/wiki/X87)

  2. Algorithms using “real” numbers. Notes ch. 4

  3. double x = 0.0; assert ( x == 0.0 ); // C++ f( ) if ( x == 0.0) f( ) << do something smart >> else << do something stupid >> Because of odd compiler optimizations and the way real numbers are represented, “something stupid” can actually happen.

  4. double x = 1.0; assert ( x == 0.0 ); // C++ f( ) if ( x == 0.0) f( ) << do something smart >> else << do something stupid >> Because of odd compiler optimizations and the way real numbers are represented, “something stupid” can actually happen.

  5. Algorithms using ”real” numbers • Type double contains only finitely many values double d = 10; for (int i=0; i<10; i++) { System.out.println(d); d = d*d; }

  6. Algorithms using ”real” numbers • Type double contains only finitely many values double d = .10; for (int i=0; i<10; i++) { System.out.println(d); d = d*d; }

  7. Algorithms using ”real” numbers • Type double contains only finitely many values System.out.println( (1.0 / 49) * 49 );
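
     The three snippets on slides 5 to 7 can be collected into one runnable sketch. The class name `FiniteDoubles` and the helper `squareRepeatedly` are illustrative names, not part of the slides. The slide loops print the value before each squaring, so the tenth printed value is the result of nine squarings.

```java
public class FiniteDoubles {
    // Squares d the given number of times and returns the final value.
    static double squareRepeatedly(double d, int times) {
        for (int i = 0; i < times; i++) {
            d = d * d;
        }
        return d;
    }

    public static void main(String[] args) {
        // 10^(2^9) is far above Double.MAX_VALUE, so the value overflows.
        System.out.println(squareRepeatedly(10, 9));   // Infinity
        // 0.1^(2^9) is far below Double.MIN_VALUE, so the value underflows.
        System.out.println(squareRepeatedly(0.10, 9)); // 0.0
        // 1/49 has no finite binary expansion, so the product misses 1.
        System.out.println((1.0 / 49) * 49);           // 0.9999999999999999
    }
}
```

     After eight squarings both sequences are still finite and nonzero; the ninth squaring leaves the range of double.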

  8. Algorithms using “real” numbers • Representable numbers • Rounding/truncation • IEEE standard and Java • Summation order • Newton iteration • More iteration • Formula rewriting

  9. Limited Precision
     • A computer can handle only a finite subset of the reals directly in hardware
     • These numbers make up a floating point number system $F(\beta, s, m, M)$ characterized by
       – a base $\beta \in \mathbb{N} \setminus \{1\}$
       – a number of digits $s \in \mathbb{N}$
       – a smallest exponent $m \in \mathbb{Z}$
       – a largest exponent $M \in \mathbb{Z}$
     • Each floating point number has the form $\pm d \cdot \beta^{e}$ with $m \le e \le M$ and $0 \le d_i \le \beta - 1$. The mantissa $d = d_1.d_2 d_3 \ldots d_s$ represents $d_1 + d_2 \beta^{-1} + d_3 \beta^{-2} + \cdots + d_s \beta^{-s+1}$. The exponent part is $\beta^{e}$
     • The system includes $0 = 0.0 \ldots 0$
     • All other numbers are normalised: $1 \le d_1 \le \beta - 1$

  10. Limited Precision
      • Example: the number system F(2, 3, −2, 1) contains 33 numbers
      • In decimal, the smallest positive number is 0.25 = (1 + 0/2 + 0/4) / 4 and the largest is 3.5 = (1 + 1/2 + 1/4) * 2
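
      The count of 33 can be checked by enumerating the system. The following is a minimal sketch, assuming a normalised mantissa with a nonzero leading digit; `ToyFloatSystem` and `enumerate` are illustrative names, not from the slides.

```java
import java.util.TreeSet;

public class ToyFloatSystem {
    // Enumerates every value of F(beta, s, m, M): zero plus all numbers
    // with a normalised s-digit mantissa (leading digit >= 1) and an
    // exponent e with m <= e <= M, in both signs.
    static TreeSet<Double> enumerate(int beta, int s, int m, int M) {
        TreeSet<Double> values = new TreeSet<>();
        values.add(0.0);
        int lowest = (int) Math.pow(beta, s - 1);  // digits 1 0 ... 0
        int highest = (int) Math.pow(beta, s) - 1; // all digits beta - 1
        for (int digits = lowest; digits <= highest; digits++) {
            // Place the point after the first digit.
            double mantissa = digits / Math.pow(beta, s - 1);
            for (int e = m; e <= M; e++) {
                double value = mantissa * Math.pow(beta, e);
                values.add(value);
                values.add(-value);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        TreeSet<Double> f = enumerate(2, 3, -2, 1);
        System.out.println(f.size());      // 33
        System.out.println(f.higher(0.0)); // 0.25
        System.out.println(f.last());      // 3.5
    }
}
```

      There are 4 normalised mantissas, 4 exponents and 2 signs, which gives 32 nonzero numbers plus zero.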

  11. QUIZ Limited precision
      A number x in the number system F(2, 3, −2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal (base 10) system? Poll results are given in parentheses.
      1. 1.5 (8%)
      2. 3 (62%)
      3. 12 (14%)
      4. 11 (10%)
      5. I don’t know (7%)
      Mantissa 110: binary number 1.10, decimal number 1 + 1/2 + 0/4 = 1.5
      Mantissa 110 and exponent 1: binary number 11.0, decimal number 1*2 + 1 + 0/2 = 3

  12. QUIZ Limited precision
      A number x in the number system F(2, 3, −2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal (base 10) system?
      1. 1.5
      2. 3
      3. 12
      4. 11
      5. I don’t know

  13. QUIZ Limited precision
      A number x in the number system F(2, 3, −2, 1) has the mantissa 110 and the exponent 1. What number is x in the usual decimal (base 10) system?
      1. 1.5
      2. 3
      3. 12
      4. 11
      5. I don’t know
      Mantissa 110: binary number 1.10, decimal number 1 + 1/2 + 0/4 = 1.5
      Mantissa 110 and exponent 1: binary number 11.0, decimal number 1*2 + 1 + 0/2 = 3

  14. Algorithms using “real” numbers • Representable numbers • Rounding/truncation • IEEE standard and Java • Summation order • Newton iteration • More iteration • Formula rewriting

  15. Rounding/truncation
      • Standard demo system is F(10, 4, −99, 99)
      • 1.573 and 0.1824 are representable, but the following are not:
        $1.573 + 0.1824 = 1.7554$
        $1.573 - 0.1824 = 1.3906$
        $1.573 \times 0.1824 = 0.2869152$
        $1.573 / 0.1824 = 8.6239035\ldots$
      • Exact arithmetic on a finite subset of reals is not possible
      • A strategy for rounding/truncating to a representable number is needed

  16. Rounding/truncation
      • Rounding/truncation is defined by a function $\mathrm{fl} : \mathbb{R} \to M$, where $M$ is the set of machine numbers
      • Machine arithmetic operations $\oplus, \ominus, \otimes, \oslash$ are defined by $x \oplus y = \mathrm{fl}(x + y)$ etc.
      • In the demo system F(10, 4, −99, 99) define $\mathrm{fl}$ by truncation, e.g.
        $1.573 \oplus 0.1824 = \mathrm{fl}(1.573 + 0.1824) = \mathrm{fl}(1.7554) = 1.755$
        $1.573 \ominus 0.1824 = \mathrm{fl}(1.573 - 0.1824) = \mathrm{fl}(1.3906) = 1.390$
        $1.573 \otimes 0.1824 = \mathrm{fl}(1.573 \times 0.1824) = \mathrm{fl}(0.2869152) = 0.2869$
        $1.573 \oslash 0.1824 = \mathrm{fl}(1.573 / 0.1824) = \mathrm{fl}(8.6239035\ldots) = 8.623$
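
      The demo system can be imitated in Java with `BigDecimal` and a `MathContext` of four significant digits and `RoundingMode.DOWN`. This is only a sketch: the class and method names are illustrative, and the exponent bounds −99 and 99 are assumed never to be reached, so they are not enforced.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class TruncatedArithmetic {
    // fl: keep four significant decimal digits and cut off the rest.
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);

    // Each machine operation is the exact operation followed by fl.
    static BigDecimal add(String x, String y) {
        return new BigDecimal(x).add(new BigDecimal(y), FL);
    }

    static BigDecimal sub(String x, String y) {
        return new BigDecimal(x).subtract(new BigDecimal(y), FL);
    }

    static BigDecimal mul(String x, String y) {
        return new BigDecimal(x).multiply(new BigDecimal(y), FL);
    }

    static BigDecimal div(String x, String y) {
        return new BigDecimal(x).divide(new BigDecimal(y), FL);
    }

    public static void main(String[] args) {
        System.out.println(add("1.573", "0.1824")); // 1.755
        System.out.println(sub("1.573", "0.1824")); // 1.390
        System.out.println(mul("1.573", "0.1824")); // 0.2869
        System.out.println(div("1.573", "0.1824")); // 8.623
    }
}
```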

  17. Rounding/truncation
      • Algebraic laws are invalid:
        $(1.418 \oplus 2937) \ominus 2936 = 2938 \ominus 2936 = 2.000$
        $1.418 \oplus (2937 \ominus 2936) = 1.418 \oplus 1.000 = 2.418$
        $1.418 \otimes (2001 \ominus 2000) = 1.418 \otimes 1.000 = 1.418$
        $(1.418 \otimes 2001) \ominus (1.418 \otimes 2000) = 2837 \ominus 2836 = 1.000$
      • The usual associative and distributive laws are not valid for machine arithmetic. Sometimes:
        $(x \oplus y) \ominus z \ne x \oplus (y \ominus z)$
        $x \otimes (y \ominus z) \ne (x \otimes y) \ominus (x \otimes z)$
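
      The four computations on this slide can be replayed with truncating four-digit decimal arithmetic. A minimal sketch, assuming `BigDecimal` with `RoundingMode.DOWN` as a stand-in for F(10, 4, −99, 99) without exponent limits; all names are illustrative.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class BrokenLaws {
    // Four significant digits, truncated.
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);

    static BigDecimal n(String s) {
        return new BigDecimal(s);
    }

    // (x (+) y) (-) z
    static double addThenSub(String x, String y, String z) {
        return n(x).add(n(y), FL).subtract(n(z), FL).doubleValue();
    }

    // x (+) (y (-) z)
    static double subThenAdd(String x, String y, String z) {
        return n(x).add(n(y).subtract(n(z), FL), FL).doubleValue();
    }

    // x (*) (y (-) z)
    static double factored(String x, String y, String z) {
        return n(x).multiply(n(y).subtract(n(z), FL), FL).doubleValue();
    }

    // (x (*) y) (-) (x (*) z)
    static double expanded(String x, String y, String z) {
        BigDecimal xy = n(x).multiply(n(y), FL);
        BigDecimal xz = n(x).multiply(n(z), FL);
        return xy.subtract(xz, FL).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(addThenSub("1.418", "2937", "2936")); // 2.0
        System.out.println(subThenAdd("1.418", "2937", "2936")); // 2.418
        System.out.println(factored("1.418", "2001", "2000"));   // 1.418
        System.out.println(expanded("1.418", "2001", "2000"));   // 1.0
    }
}
```

      In both pairs the second digit group of 1.418 is lost as soon as it is combined with a number of magnitude 2000 or more, which is why the order of operations matters.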

  18. QUIZ Rounding / truncation
      What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4, −99, 99) using truncation?
      1. 0.9995
      2. 0.9996
      3. 0.9997
      4. 0.9998
      5. I don’t know

  19. QUIZ Rounding / truncation
      What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4, −99, 99) using truncation?
      1. 0.9995
      2. 0.9996
      3. 0.9997
      4. 0.9998
      5. I don’t know
      (0.9996 ⊕ 0.9998) ⊘ 2 = fl(0.9996 + 0.9998) ⊘ 2 = fl(1.9994) ⊘ 2 = 1.999 ⊘ 2 = fl(1.999 / 2) = fl(0.9995) = 0.9995
      Better expression for computing the average: 0.9996 ⊕ (0.9998 ⊖ 0.9996) ⊘ 2 = 0.9997

  20. QUIZ Rounding / truncation
      What is the result of computing (0.9996 ⊕ 0.9998) ⊘ 2 in the number system F(10, 4, −99, 99) using truncation? Poll results are given in parentheses.
      1. 0.9995 (77%)
      2. 0.9996 (8%)
      3. 0.9997 (8%)
      4. 0.9998 (5%)
      5. I don’t know (3%)
      (0.9996 ⊕ 0.9998) ⊘ 2 = fl(0.9996 + 0.9998) ⊘ 2 = fl(1.9994) ⊘ 2 = 1.999 ⊘ 2 = fl(1.999 / 2) = fl(0.9995) = 0.9995
      Better expression for computing the average: 0.9996 ⊕ (0.9998 ⊖ 0.9996) ⊘ 2 = 0.9997
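
      Both ways of computing the average can be compared in a small sketch. It assumes `BigDecimal` with four significant digits and `RoundingMode.DOWN` as a model of F(10, 4, −99, 99) without exponent limits; `naive` and `stable` are illustrative names.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class TruncatedAverage {
    // Four significant digits, truncated.
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);
    static final BigDecimal TWO = BigDecimal.valueOf(2);

    // (a (+) b) (/) 2: the sum needs a fifth digit, which is cut off.
    static double naive(String a, String b) {
        BigDecimal sum = new BigDecimal(a).add(new BigDecimal(b), FL);
        return sum.divide(TWO, FL).doubleValue();
    }

    // a (+) ((b (-) a) (/) 2): the small difference is halved exactly.
    static double stable(String a, String b) {
        BigDecimal x = new BigDecimal(a);
        BigDecimal half = new BigDecimal(b).subtract(x, FL).divide(TWO, FL);
        return x.add(half, FL).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(naive("0.9996", "0.9998"));  // 0.9995
        System.out.println(stable("0.9996", "0.9998")); // 0.9997
    }
}
```

      The naive result 0.9995 is smaller than both inputs, so it does not even lie between the two numbers being averaged.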

  21. QUIZ Rounding / truncation
      m,s ∊ F(10,4,−99,99)
      int k = 10;
      m = 1;
      for (int i=0; i<k; i++) m = 2*m;
      s = 1 + 1/m;
      for (int i=0; i<k; i++) s = s*s;
      print s;
      It is a fact that $\lim_{n \to \infty} (1 + 1/n)^{n} = e$. The algorithm computes an approximation to $e$ corresponding to $n = 2^{10} = 1024$. What is the result of execution in the number system F(10,4,−99,99) with truncation, where $\mathrm{fl}(e) = 2.718$?
      1. 0
      2. 1.000
      3. 2.591
      4. 2.718
      5. I don’t know

  22. QUIZ Rounding / truncation
      m,s ∊ F(10,4,−99,99)
      int k = 10;
      m = 1;
      for (int i=0; i<k; i++) m = 2*m;
      s = 1 + 1/m;
      for (int i=0; i<k; i++) s = s*s;
      print s;
      It is a fact that $\lim_{n \to \infty} (1 + 1/n)^{n} = e$. The algorithm computes an approximation to $e$ corresponding to $n = 2^{10} = 1024$. What is the result of execution in the number system F(10,4,−99,99) with truncation, where $\mathrm{fl}(e) = 2.718$?
      1. 0
      2. 1.000
      3. 2.591
      4. 2.718
      5. I don’t know
      Trace of the first loop: $m = 2^{1} = 2$, $m = 2^{2} = 4$, …, $m = 2^{9} = 512$, $m = 2^{10} = 1024$
      Then $s = 1 + 1/m = 1 \oplus 1 \oslash 1024 = 1 \oplus 0.0009765 = 1.000$
      Trace of the second loop: $s = s * s = 1.000$, …, $s = s * s = 1.000$

  23. QUIZ Rounding / truncation
      m,s ∊ F(10,4,−99,99)
      int k = 10;
      m = 1;
      for (int i=0; i<k; i++) m = 2*m;
      s = 1 + 1/m;
      for (int i=0; i<k; i++) s = s*s;
      print s;
      It is a fact that $\lim_{n \to \infty} (1 + 1/n)^{n} = e$. The algorithm computes an approximation to $e$ corresponding to $n = 2^{10} = 1024$. What is the result of execution in the number system F(10,4,−99,99) with truncation, where $\mathrm{fl}(e) = 2.718$? Poll results are given in parentheses.
      1. 0 (7%)
      2. 1.000 (86%)
      3. 2.591 (2%)
      4. 2.718 (3%)
      5. I don’t know (2%)
      Trace of the first loop: $m = 2^{1} = 2$, $m = 2^{2} = 4$, …, $m = 2^{9} = 512$, $m = 2^{10} = 1024$
      Then $s = 1 + 1/m = 1 \oplus 1 \oslash 1024 = 1 \oplus 0.0009765 = 1.000$
      Trace of the second loop: $s = s * s = 1.000$, …, $s = s * s = 1.000$
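
      The quiz algorithm can be run both in a four-digit truncating model and in ordinary double arithmetic. A minimal sketch: `BigDecimal` with `RoundingMode.DOWN` stands in for F(10,4,−99,99) without exponent limits, and the names `truncated` and `inDouble` are illustrative.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class TruncatedE {
    // Four significant digits, truncated.
    static final MathContext FL = new MathContext(4, RoundingMode.DOWN);

    // The quiz algorithm with every operation followed by truncation.
    static double truncated(int k) {
        BigDecimal m = BigDecimal.ONE;
        for (int i = 0; i < k; i++) {
            m = m.multiply(BigDecimal.valueOf(2), FL);
        }
        // 1/1024 truncates to 0.0009765, and 1.0009765 truncates to 1.000.
        BigDecimal s = BigDecimal.ONE.add(BigDecimal.ONE.divide(m, FL), FL);
        for (int i = 0; i < k; i++) {
            s = s.multiply(s, FL);
        }
        return s.doubleValue();
    }

    // The same algorithm in double precision, for comparison.
    static double inDouble(int k) {
        double m = 1;
        for (int i = 0; i < k; i++) {
            m = 2 * m;
        }
        double s = 1 + 1 / m;
        for (int i = 0; i < k; i++) {
            s = s * s;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(truncated(10)); // 1.0
        System.out.println(inDouble(10));  // close to 2.717
        System.out.println(Math.E);
    }
}
```

      With four digits the term 1/1024 vanishes in the addition, so squaring can never recover it; with 53 binary digits the same loop lands within about 0.002 of e.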

  24. Algorithms using “real” numbers • Representable numbers • Rounding/truncation • IEEE standard and Java • Summation order • Newton iteration • More iteration • Formula rewriting

  25. IEEE standard and Java
      • The IEEE standard describes two number systems, approximately
        – Single precision F(2, 24, −126, 127)
        – Double precision F(2, 53, −1022, 1023)
      • The IEEE systems have more numbers:
        – they close the representational gap around zero with extra numbers of the form $0.d_2 d_3 \ldots d_{53} * 2^{-1022}$
        – they have representations for $\pm\infty$ and NaN (= Not a Number)
      • The IEEE system has detailed rules for rounding to representable numbers, e.g.
        – overflow: $1/0$ or $2^{1023} * 2$ has result $\infty$
        – underflow: $1/\infty$ or $2^{-1022-52} / 2$ has result $0$
        – $0/0$ or $\infty - \infty$ has result NaN
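
      The overflow, underflow and NaN cases can be observed directly with Java's `double`. A short sketch; the class and method names are illustrative, and `Double.MIN_VALUE` is the smallest positive double, $2^{-1022-52}$.

```java
public class IeeeSpecialValues {
    // 2^1023 * 2 exceeds the largest double and becomes infinity.
    static double overflow() {
        return Math.pow(2, 1023) * 2;
    }

    // Half of the smallest positive double rounds to zero.
    static double underflow() {
        return Double.MIN_VALUE / 2;
    }

    // 0/0 has no meaningful value and yields NaN.
    static double invalid() {
        return 0.0 / 0;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        System.out.println(1.0 / 0);     // Infinity
        System.out.println(overflow());  // Infinity
        System.out.println(1.0 / inf);   // 0.0
        System.out.println(underflow()); // 0.0
        System.out.println(invalid());   // NaN
        System.out.println(inf - inf);   // NaN
    }
}
```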
