Fast Arithmetic Philipp Koehn 27 September 2019 Philipp Koehn Computer Systems Fundamental: Fast Arithmetic 27 September 2019

arithmetic

Addition (Immediate)
• Load an immediate value ($s0 = 2)
    li $s0, 2
• Add 4 ($s1 = $s0 + 4 = 6)
    addi $s1, $s0, 4
• Subtract 3 ($s2 = $s1 - 3 = 3)
    addi $s2, $s1, -3

Addition (Register)
• Load an immediate value ($s0 = 2)
    li $s0, 2
• Add value from $s5 ($s1 = $s0 + $s5)
    add $s1, $s0, $s5
• Subtract value from $s6 ($s2 = $s1 - $s6)
    sub $s2, $s1, $s6

Overflow
• Signed integer operations: add, addi, and sub
  – overflow triggers an exception
  – similar to an interrupt
  – the EPC register (read with the mfc0 instruction) holds the address of the instruction that caused the exception
• Unsigned integer operations: addu, addiu, and subu
  – no overflow handling (as in the C programming language)

Code for Detecting Overflow
• Overflow in unsigned integer operations can be detected from the result
• The actual detection code is a bit intricate
• If you are interested → consult Section 3.2 in the Patterson/Hennessy textbook
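To make "detected from the result" concrete, here is a minimal Python sketch. It is not the textbook's MIPS code: the function names are hypothetical, and operands are assumed to be 32-bit patterns in the range 0 to 2^32 - 1.

```python
MASK32 = 0xFFFFFFFF

def addu32(a, b):
    """Unsigned 32-bit add: returns (wrapped sum, overflow flag).

    Assumes 0 <= a, b <= MASK32.
    """
    total = (a + b) & MASK32
    # The wrapped sum is smaller than an operand exactly when a carry left bit 31.
    return total, total < a

def add32_signed_overflow(a, b):
    """Signed overflow test on 32-bit two's-complement bit patterns."""
    total = (a + b) & MASK32
    sign_a, sign_b, sign_t = a >> 31, b >> 31, total >> 31
    # Overflow: both operands share a sign and the result has the other sign.
    return sign_a == sign_b and sign_t != sign_a
```

The unsigned test needs only the result and one operand, which is why no exception mechanism is required for addu.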

fast addition

Recall: N-Bit Addition
    operand     011
    operand    + 11
               ----
    carry       110
               ----
    sum         110
• Bit 0: 1+1 = 0, carry the 1
• Bit 1: 1+1+1 = 1, carry the 1
• Bit 2: copy carry bit
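The sequential procedure recalled here can be sketched in Python as a chain of full adders. The names `full_adder` and `ripple_add` are illustrative only, and bits are stored least significant first.

```python
def full_adder(a, b, carry_in):
    """Add three bits; return (sum bit, carry out)."""
    total = a + b + carry_in
    return total & 1, total >> 1

def ripple_add(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first).

    Each position must wait for the carry of the position before it,
    which is what makes the process sequential.
    """
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    out.append(carry)  # the final carry becomes the top bit
    return out

# 011 + 011, written least significant bit first
result = ripple_add([1, 1, 0], [1, 1, 0])
```

Reading `result` from the last element to the first gives 0110, that is, 110 with a leading zero from the final carry.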

Fast Addition
• We defined n-bit adding as a sequential process
• More bits → addition takes longer
• 32 bit addition gets very slow
• Faster addition: Carry Lookahead

Problem: Carry Propagation
• 1+1 addition always causes a carry
    1+1 + carry 1 = 1, carry 1
    1+1 + carry 0 = 0, carry 1
• 0+0 addition never causes a carry
    0+0 + carry 1 = 1, carry 0
    0+0 + carry 0 = 0, carry 0
• 0+1 and 1+0 addition may cause a carry
    0+1 + carry 1 = 0, carry 1
    0+1 + carry 0 = 1, carry 0

Generate and Propagate
• Compute for each bit whether it generates or propagates a carry
• Example
    Operand A    0100 1111
    Operand B    0110 0001
    Generate     0100 0001
    Propagate    0110 1111
    Carry        1001 111-
• Generate: a_i and b_i
• Propagate: a_i or b_i
• Carry: ?
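The example rows can be reproduced with a short Python sketch. The function name `carry_rows` is hypothetical, and the incoming carry c_0 (the "-" in the carry row) is assumed to be 0. The carry is filled in with the rule that a position passes a carry on if it generates one, or if it propagates one that arrives from the right.

```python
def carry_rows(a, b, width, c0=0):
    """Return (generate, propagate, carry_in) rows for a + b.

    Bit i of carry_in is the carry arriving at position i.
    """
    generate = a & b
    propagate = a | b
    carry_in = 0
    c = c0
    for i in range(width):
        carry_in |= c << i
        g_i = (generate >> i) & 1
        p_i = (propagate >> i) & 1
        c = g_i | (p_i & c)  # carry leaving position i
    return generate, propagate, carry_in

g, p, c = carry_rows(0b01001111, 0b01100001, 8)
rows = [format(x, "08b") for x in (g, p, c)]
```

With c_0 = 0, `rows` holds the generate, propagate and carry rows of the example, and A xor B xor carry gives the sum 1011 0000.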

4-Bit Adder
• First compute generate and propagate for all bits
  – generate: g_i = a_i and b_i
  – propagate: p_i = a_i or b_i
• Compute carries for each bit
  – c_1 = g_0 or (p_0 and c_0)
  – c_2 = g_1 or (p_1 and g_0) or (p_1 and p_0 and c_0)
  – c_3 = g_2 or (p_2 and g_1) or (p_2 and p_1 and g_0) or (p_2 and p_1 and p_0 and c_0)
  – c_4 = g_3 or (p_3 and g_2) or (p_3 and p_2 and g_1) or (p_3 and p_2 and p_1 and g_0) or (p_3 and p_2 and p_1 and p_0 and c_0)
• The carry computations require no recursion, but use a lot of gates
• We may want to stop at 4 bits with this idea
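A Python sketch of the 4-bit carry-lookahead adder follows, with every carry written as a flat expression over the generate and propagate bits and c_0. The name `cla4` is illustrative. Python evaluates the expressions one after another, whereas the hardware evaluates them side by side.

```python
def cla4(a, b, c0=0):
    """4-bit carry-lookahead add; returns (4-bit sum, carry out c4)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]  # propagate

    # Each carry depends only on g, p and c0, never on another carry.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        s = ((a >> i) ^ (b >> i) ^ carries[i]) & 1  # sum bit i
        total |= s << i
    return total, c4
```

Checking all 512 input combinations against ordinary addition is a quick way to confirm that the indices in the carry equations are right.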

16-Bit Adder
• Combine four 4-bit adders
• For each 4-bit adder, compute
  – "super" propagate P = p_0 and p_1 and p_2 and p_3
  – "super" generate G = g_3 or (p_3 and g_2) or (p_3 and p_2 and g_1) or (p_3 and p_2 and p_1 and g_0)
• Compute super carry C_j from super propagate P_j and super generate G_j
• Use C_j as input carry to the 4-bit adders
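The two-level scheme can be sketched in Python as below. The names `block_pg` and `add16` are hypothetical. For brevity the super carries are computed with the rule C_{j+1} = G_j or (P_j and C_j) in a loop, where hardware would expand them into flat expressions, and each 4-bit adder is stood in for by ordinary addition.

```python
def block_pg(a4, b4):
    """Super propagate P and super generate G of one 4-bit block."""
    g = [(a4 >> i) & (b4 >> i) & 1 for i in range(4)]
    p = [((a4 >> i) | (b4 >> i)) & 1 for i in range(4)]
    P = p[0] & p[1] & p[2] & p[3]
    G = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    return P, G

def add16(a, b, c0=0):
    """16-bit add built from four 4-bit blocks; returns (sum, carry out)."""
    blocks = [((a >> 4 * j) & 0xF, (b >> 4 * j) & 0xF) for j in range(4)]
    super_carry = [c0]
    for a4, b4 in blocks:
        P, G = block_pg(a4, b4)
        super_carry.append(G | (P & super_carry[-1]))
    total = 0
    for j, (a4, b4) in enumerate(blocks):
        # Stand-in for a 4-bit adder that receives C_j as its input carry.
        total |= ((a4 + b4 + super_carry[j]) & 0xF) << (4 * j)
    return total, super_carry[4]
```

Each block needs only its own input carry, so the four block additions do not have to wait for one another once the super carries are known.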

Cycles
1. compute propagate p_i and generate g_i
2. compute carry c_i
   compute super propagate P_j and super generate G_j
3. compute super carry C_j
4. carry out all bitwise additions

Trade-Off
• Higher n in n-bit adders
  – more gates in circuit
  – faster computation
• Modern CPUs can pack more gates on a chip
  ⇒ speed-up at same clock speed

multiplication

Recall Method
• Elementary school multiplication:
        10101 x 1101
    ----------------
            10101
           0
          10101
         10101
    ----------------
        100010001
  (in decimal: 21 x 13 = 273)
• Idea
  – shift second operand to right (get last bit)
  – if carry: add first operand to sum
  – shift first operand to left (multiply by binary 10)
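The three-step idea can be written as a short Python loop. The function name `multiply` is illustrative, and the operands are assumed to be non-negative.

```python
def multiply(first, second):
    """Shift-and-add multiplication of two non-negative integers."""
    total = 0
    while second:
        carry = second & 1  # bit that falls out when shifting right
        second >>= 1
        if carry:
            total += first
        first <<= 1         # multiply by binary 10
    return total

product = multiply(0b10101, 0b1101)
```

For the operands 10101 and 1101, `product` is 100010001 in binary.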

Multiplication in Hardware
[Figure: multiplier datapath with a 64 bit multiplicand register (shift left), a 32 bit multiplier register (shift right), an adder, a 64 bit product register (write), and a control unit]
• Control unit runs microprogram
    loop 32 times:
        if lowest bit of multiplier = 1
            add multiplicand to product
        shift multiplicand left
        shift multiplier right
• Speed
  – 32 iterations
  – 3 operations each (add + shift + shift)
  → almost 100 operations
• Note: multiplying 32 bit numbers may result in a 64 bit product

Parallelize the 3 Operations
• The 3 operations in each loop affect different registers
  – add: product
  – shift left: multiplicand
  – shift right: multiplier
⇒ These can be executed in parallel (note: read is executed before write)
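A Python sketch of the 32-iteration loop with all three registers updated in one step follows. The name `mult32` is hypothetical. The tuple assignment models "read before write": the whole right-hand side is evaluated from the old register values before any register is overwritten.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def mult32(multiplicand, multiplier):
    """Unsigned 32 x 32 -> 64 bit multiplication, one step per iteration."""
    mcand = multiplicand & 0xFFFFFFFF  # held in a 64 bit register
    mplier = multiplier & 0xFFFFFFFF   # held in a 32 bit register
    product = 0                        # 64 bit register
    for _ in range(32):
        # All three new values are computed from the old ones, then written.
        product, mcand, mplier = (
            (product + mcand if mplier & 1 else product) & MASK64,
            (mcand << 1) & MASK64,
            mplier >> 1,
        )
    return product
```

With the three operations merged into one step, the loop takes 32 steps instead of almost 100.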

Parallelize the Iterations
• Sum of 32 independently computed values
• More adders → some summing can be done in parallel
• Binary tree → log2 32 = 5 cycles
[Figure: copies of the multiplicand, each combined by AND with one multiplier bit obtained by shifting right (31, 30, 29, 28, …, 3, 2, 1), feed a binary tree of adders that outputs the product]
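The tree summation can be sketched in Python as follows. The name `tree_multiply` is hypothetical, and `width` is assumed to be a power of two. Within one level the additions do not depend on each other, so hardware could perform them all in the same cycle, while Python simply runs them in sequence.

```python
def tree_multiply(a, b, width=32):
    """Multiply by summing partial products in a binary tree.

    Returns (product, number of adder levels).
    """
    # One partial product per multiplier bit: the shifted multiplicand or zero.
    partials = [(a << i) if (b >> i) & 1 else 0 for i in range(width)]
    levels = 0
    while len(partials) > 1:
        # Pairwise sums: every adder on this level works independently.
        partials = [partials[i] + partials[i + 1]
                    for i in range(0, len(partials), 2)]
        levels += 1
    return partials[0], levels
```

For `width=32` the list shrinks 32 → 16 → 8 → 4 → 2 → 1, which gives the 5 levels named on the slide.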

MIPS Instructions
• 32 bit multiplication results in 64 bit product
• Special 64 bit register holds result
  – hi: high word
  – lo: low word
• Low word has to be retrieved by another instruction
    mult $s1, $s2
    mflo $s0
• Since this is the typical usage, there is a pseudo-instruction
    mul $s0, $s1, $s2
  More on that later
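How a 64 bit product splits into the hi and lo words can be modelled in Python. This is only a model of the register contents, not of the processor. The name `mult_hi_lo` is hypothetical, and the inputs are assumed to be Python integers in the signed 32 bit range.

```python
def mult_hi_lo(s1, s2):
    """Model of mult: return the (hi, lo) words of the signed product."""
    product = s1 * s2
    bits = product & 0xFFFFFFFFFFFFFFFF  # 64 bit two's-complement pattern
    hi = bits >> 32                      # high word
    lo = bits & 0xFFFFFFFF               # low word, what mflo would read
    return hi, lo
```

When the product fits in 32 bits, hi holds only sign bits and lo alone carries the value, which matches the typical usage mentioned on the slide.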

division

Elementary School Division
    1011 / 10 = 101
    10
    --
     01
     011
      10
      --
       1   Remainder
• Algorithm
  1. shift divisor sufficiently to the left
  2. check if subtraction is possible
     yes → add result bit 1, carry out subtraction
     no → add result bit 0
  3. pull down bit from dividend
  4. shift divisor to the right
     not possible → done, note remainder
     otherwise go to step 2

Algorithm Refinement
1. Shift divisor sufficiently to the left
   • hard for machine to determine → shift to maximum left
   • 32 bit division: use 64 bit register, push 32 positions
2. Check if subtraction is possible
   yes → add result bit 1, carry out subtraction
   no → add result bit 0
   • we always carry out subtraction
   • if overflow, do not use result
3. Pull down bit from dividend
4. Shift divisor to the right
   not possible → done, note remainder
   otherwise go to step 2
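The refined algorithm can be sketched in Python for unsigned operands. The name `divide32` is hypothetical. Two choices here are assumptions rather than slide content: the loop runs 33 times because the divisor starts 32 positions to the left and must end at position 0, and a negative trial result plays the role of the overflow that tells us not to use the subtraction.

```python
def divide32(dividend, divisor):
    """Unsigned 32 bit division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = dividend & 0xFFFFFFFF     # sits in a 64 bit register
    shifted = (divisor & 0xFFFFFFFF) << 32  # divisor pushed 32 positions left
    quotient = 0
    for _ in range(33):
        trial = remainder - shifted       # always carry out the subtraction
        if trial >= 0:
            remainder = trial             # accept it: result bit 1
            quotient = (quotient << 1) | 1
        else:
            quotient = quotient << 1      # reject it: result bit 0
        shifted >>= 1                     # shift divisor to the right
    return quotient, remainder
```

In this sketch the dividend already sits in the wide register, so "pull down bit from dividend" happens implicitly as the divisor moves right.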

Division in Hardware
• Operations similar to multiplication
  – shift divisor
  – subtraction
  – indication if subtraction should be accepted
• These operations can be parallelized
• But: iterations cannot be parallelized the same way
  (sophisticated prediction methods guess outcome of subtractions)

MIPS Instructions
• 32 bit division results in 32 bit quotient and 32 bit remainder
  – hi: remainder
  – lo: quotient
• Quotient has to be retrieved by another instruction
    div $s1, $s2
    mflo $s0
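A Python model of what ends up in hi and lo after a signed division is given below. The name `div_hi_lo` is hypothetical. The sketch assumes the quotient truncates toward zero and the remainder takes the sign of the dividend, as in C. Python's own `//` rounds toward negative infinity, so it is not used directly on signed values.

```python
MASK32 = 0xFFFFFFFF

def div_hi_lo(s1, s2):
    """Model of div: return (hi, lo) = (remainder, quotient) as 32 bit words.

    Assumes signed 32 bit inputs, s2 != 0, truncation toward zero.
    """
    quotient = abs(s1) // abs(s2)
    if (s1 < 0) != (s2 < 0):
        quotient = -quotient
    remainder = s1 - quotient * s2  # same sign as the dividend
    return remainder & MASK32, quotient & MASK32
```

For example, `div_hi_lo(7, 2)` puts the remainder 1 in hi and the quotient 3 in lo.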
