IC220 SlideSet #5: More Arithmetic, Floating Point, & More

Multiplication
• More complicated than addition
  – accomplished via shifting and addition
• Example: grade-school algorithm

        0010   (multiplicand)
      x 1011   (multiplier)

• Multiply n * n bits. How wide (in bits) should the product be?

Multiplication: Simple Implementation

Multiplication with Signed Numbers?

        1110          1110
      x 0010        x 1111
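The grade-school algorithm above can be sketched in a few lines of Python. This is an illustrative model, not the hardware design from the slides: the function names `shift_add_multiply`, `sign_extend`, and `signed_multiply` are made up for this example, and handling signed operands by multiplying magnitudes and fixing the sign afterwards is only one simple approach among several.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Multiply two unsigned n-bit values using only shifts and adds."""
    mask = (1 << n) - 1
    multiplicand &= mask
    multiplier &= mask
    product = 0
    for _ in range(n):
        if multiplier & 1:
            # Low multiplier bit is 1: add the (shifted) multiplicand.
            product += multiplicand
        multiplicand <<= 1  # the next partial product is worth twice as much
        multiplier >>= 1    # move on to the next multiplier bit
    return product          # always fits in 2n bits


def sign_extend(value: int, n: int = 4) -> int:
    """Interpret the low n bits of value as a two's-complement number."""
    value &= (1 << n) - 1
    return value - (1 << n) if value & (1 << (n - 1)) else value


def signed_multiply(a: int, b: int, n: int = 4) -> int:
    """Signed n-bit multiply: multiply the magnitudes, then restore the sign."""
    sa, sb = sign_extend(a, n), sign_extend(b, n)
    magnitude = shift_add_multiply(abs(sa), abs(sb), n)
    return -magnitude if (sa < 0) != (sb < 0) else magnitude
```

With the operands from the slide, `shift_add_multiply(0b0010, 0b1011)` returns 22. The largest 4-bit product, 15 * 15 = 225, still fits in 8 bits, which is why an n-by-n multiply needs a product 2n bits wide. For the signed examples, 1110 read as an unsigned value is 14, but as a two's-complement value it is -2, so `signed_multiply(0b1110, 0b0010)` gives -4 while the unsigned product is 28.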

ARM Multiplication
• Suppose x, y are in x0, x1
• Compute x = y * 23
• To get basic result (first 64 bits):
  – Does not work with an immediate value
  – Does not set condition codes
• To get next 64 bits of result:
• Why is it (usually) okay to use only the lower 64 bits?

ARM Division
• Implementation can actually use same shift/add hardware!
  – See textbook for details
• Instructions

      sdiv x0, x1, x2    // x0 = x1 / x2 (signed)
      udiv x0, x1, x2    // x0 = x1 / x2 (unsigned)

• How to compute remainder?
  – e.g. inches = length % 12
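The two questions above (where the upper 64 bits of a product go, and how to get a remainder when only a quotient is available) can be modeled in Python. This is a sketch of the arithmetic only, not of any particular instruction sequence. The helper names `mul_low_high`, `truncating_div`, and `remainder` are hypothetical, and the sketch assumes the signed divide rounds toward zero, as `sdiv` does. It does not model division by zero.

```python
MASK64 = (1 << 64) - 1


def mul_low_high(a: int, b: int) -> tuple[int, int]:
    """Unsigned 64 x 64 multiply, returned as (low 64 bits, high 64 bits)."""
    full = (a & MASK64) * (b & MASK64)  # the full product is up to 128 bits
    return full & MASK64, full >> 64


def truncating_div(a: int, b: int) -> int:
    """Signed division that rounds toward zero (Python's // rounds down)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def remainder(a: int, b: int) -> int:
    """Remainder from a quotient: divide, multiply back, subtract."""
    return a - truncating_div(a, b) * b
```

For y = 5, `mul_low_high(5, 23)` returns `(115, 0)`: the upper half is zero, which is the usual case when the operands are small relative to 64 bits. The upper half only matters for large operands, for example `mul_low_high(1 << 63, 4)` returns `(0, 2)`. For the slide's example, a length of 29 inches gives `remainder(29, 12) == 5`.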

Floating Point
• We need a way to represent
  – numbers with fractions, e.g., 3.1416
  – very small numbers, e.g., .000000001
  – very large numbers, e.g., 3.15576 × 10^23
• Representation:
  – All numbers converted to binary floating point
  – sign, exponent, significand:
    • (–1)^sign × significand × 2^exponent (some power)
  – IEEE standard defines two common sizes:
    • float (32 bits)
    • double (64 bits)

IEEE754 Standard
Single Precision (float): 8 bit exponent, 23 bit significand

      bit 31: S | bits 30-23: Exponent (8 bits) | bits 22-0: Significand (23 bits)

Double Precision (double): 11 bit exponent, 52 bit significand

      first word:  bit 31: S | bits 30-20: Exponent (11 bits) | bits 19-0: Significand (20 bits)
      second word: bits 31-0: More Significand (32 more bits)

ARM Floating Point Instructions
• Separate set of registers provides 32 ‘doubles’:
  – D0, D1, …, D31
• ‘float’ values stored in same physical registers, but only use half of each register
  – S0, S1, …, S31
  – S0 actually is the ‘lower half’ of D0, etc.
• Arithmetic

      FADD D0, D1, D2
      FSUB D0, D1, D2
      FMUL D0, D1, D2
      FDIV D0, D1, D2

  or with ‘float’

      FADD S0, S1, S2

• Do we need to look for “upper half” of results after FMUL?
• We will always use ‘double’ vice ‘float’ (to keep it simple)

Floating point and memory
• Suppose X0 points to base of an array of doubles
• D0 = array[2]
• array[3] = D1
• Can also do loads / stores with S0, S1, … registers
  – But how many bytes needed for float?
• LDUR / STUR choose right opcode based on first register
• Note that *integer* registers used for indexing
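To make the bit layouts above concrete, the following Python sketch pulls the sign, exponent, and significand fields out of a float and a double, and computes the byte offset of an array element. The function names are illustrative and not from the slides. One fact not stated on the slides is assumed here: IEEE 754 stores the exponent with a bias (127 for single precision, 1023 for double), so the stored field is not the power of two itself.

```python
import struct


def decode_float(x: float) -> tuple[int, int, int]:
    """Split a 32-bit float into (sign, biased exponent, 23-bit fraction)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF       # 8-bit field, bias 127
    fraction = bits & ((1 << 23) - 1)    # low 23 bits
    return sign, exponent, fraction


def decode_double(x: float) -> tuple[int, int, int]:
    """Split a 64-bit double into (sign, biased exponent, 52-bit fraction)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11-bit field, bias 1023
    fraction = bits & ((1 << 52) - 1)    # low 52 bits
    return sign, exponent, fraction


def element_offset(index: int, fmt: str) -> int:
    """Byte offset of array[index]; fmt is "<d" for double or "<f" for float."""
    return index * struct.calcsize(fmt)
```

For example, -0.75 is -1.1 (binary) × 2^-1, so `decode_float(-0.75)` gives a sign of 1, a stored exponent of 126 (that is, -1 + 127), and a fraction with only its top bit set. `element_offset(2, "<d")` is 16, which is the byte offset a load of array[2] would use when the base register points at an array of doubles; the same index into an array of floats is at offset 8, since each float takes 4 bytes.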

Example: print product, sum of two doubles

      bl   helperGetDouble      // result in D0
      stur d0, [sp, #8]
      bl   helperGetDouble      // result in D0
      stur d0, [sp, #16]
      // Reload values where I want them
      ldur d2, [sp, #8]
      ldur d3, [sp, #16]
      // Print product
      fmul d0, d2, d3
      bl   helperPrintDouble    // prints value in d0
      // Reload values, print sum
      ldur d2, [sp, #8]
      ldur d3, [sp, #16]
      fadd d0, d2, d3
      bl   helperPrintDouble    // prints value in d0

Stored Program Concept
1. Instructions represented in binary, just like data
2. Instructions and data stored in memory

Implications:
• Programs can operate on programs
  – e.g., compilers, linkers, …
• Binary compatibility allows compiled programs to work on different computers
  – Standardized ISAs

Alternative Architectures
• ARM philosophy – small number of fast, simple operations
  – Name:
  – Others: ARM, Alpha, RISC-V, SPARC
• Design alternative:
  – Name:
  – provide more powerful operations
  – goal is to reduce number of instructions executed
  – Example VAX: minimize code size, make assembly language easy
    instructions from 1 to 54 bytes long!
  – Others: x86, Motorola 68000
  – Danger?
• Virtually all new instruction sets since 1982 have been

A dominant architecture: x86
• See your textbook for a more detailed description
• Complexity:
  – Instructions from 1 to 15 bytes long
  – one operand must act as both a source and destination
  – one operand can come from memory
  – complex addressing modes
    e.g., “base or scaled index with 8 or 32 bit displacement”
• Saving grace:
  – Hardware: the most frequently used instructions are…
  – Software: compilers avoid the portions of the architecture…

“what the x86 lacks in style it rectifies with market size, making it beautiful from the right perspective” (pg 162)


