Computer Algorithms CISC4080 CIS, Fordham Univ. Instructor: X. - - PowerPoint PPT Presentation
Outline
- Introduction to algorithm analysis: fibonacci seq
calculation
- counting number of “computer steps”
- recursive formula for running time of recursive
algorithm
- math help: mathematical induction
- Asymptotic notations
- Algorithm running time classes: P, NP
2
Last class
- Review some algorithms learned in previous
classes
- idea => pseudocode => implementation
- Correctness of sorting algorithms:
- insertion sort: gradually expand the “sorted
sub array/list”
- bubble sort: bubble largest number to the
right, also expand “sorted sub array/list”
- Algorithm time efficiency:
- via measurement: implement & instrument the
code, run it on a computer…
3
Example (Fib1: recursive)
Measured running time of fib1 (recursive), selected rows (for n ≥ 47 the printed F(n) is garbage, apparently because F(n) overflows a 32-bit signed integer):

n    T(n) of Fib1 (s)   F(n)
10   3e-06              55
15   1.7e-05            610
20   0.000198           6765
25   0.002261           75025
30   0.02429            832040
35   0.269394           9227465
40   2.9964             102334155
45   33.2363            1134903170
46   53.8073            1836311903
47   86.9213            -1323752223 (overflow)
48   140.995            512559680 (overflow)
49   227.948            -811192543 (overflow)
50   368.435            -298632863 (overflow)
4
Running time seems to grow exponentially as n increases. How long would it take to calculate F(100)?
Example (Fib2: iterative)
Measured running time of fib2 (iterative): at most about 1e-06 s for every n from 10 to 44 (often below the timer's resolution, reported as 0), and only 8e-06 s even for n = 1000.
5
Running time increases very slowly as n increases.
Take-away
- It’s possible to perform model fitting to find T(n):
the running time of the algorithm as a function of input size
- Pros of measurement-based studies?
- Cons:
- time consuming, and the answer may come too late
- does not explain why
6
Analytic approach
- Is it possible to find out how running time grows
when input size grows, analytically?
- Does running time stay constant, increase linearly,
logarithmically, quadratically, … exponentially?
- Yes: analyze the pseudocode/code, count the total
number of steps as a function of the input size, and study its
order of growth
- results are general: not specific to language, run time
system, caching effect, other processes sharing computer
- shed light on effects of larger problem size, faster CPU, …
7
8
Running time analysis
- Given an algorithm in pseudocode or actual program
- When the input size is n, how many computer steps are
executed in total?
- Size of input: size of an array, polynomial degree, # of
elements in a matrix, vertices and edges in a graph, or # of bits in the binary representation of input
- Computer steps: arithmetic operations, data movement, control,
decision making (if, while), comparison,…
- each step takes a constant amount of time
- Ignore: overhead of function calls (call stack frame allocation, passing
parameters, and return values)
- Let T(n) be number of computer steps needed to compute fib1(n)
- T(0)=1: when n=0, first step is executed
- T(1)=2: when n=1, first two steps are executed
- For n >1, T(n)=T(n-1)+T(n-2)+3: first two steps are executed,
fib1(n-1) is called (with T(n-1) steps), fib1(n-2) is called (T(n-2) steps), return values are added (1 step)
- Can you see that T(n) > Fn ?
- How big is T(n)?
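As a sanity check, here is a small Python sketch (Python is my choice here; the slides use only pseudocode) of fib1 together with its step-count recurrence T(n):

```python
def fib1(n):
    """Naive recursive Fibonacci (the slides' fib1)."""
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fib1(n - 1) + fib1(n - 2)

def T(n):
    """Step count of fib1(n) per the recurrence: T(0)=1, T(1)=2, else T(n-1)+T(n-2)+3."""
    if n == 0:
        return 1
    if n == 1:
        return 2
    return T(n - 1) + T(n - 2) + 3

# T(n) exceeds F(n) itself, so T grows at least as fast as the Fibonacci numbers
for n in range(2, 15):
    assert T(n) > fib1(n)
```

Python's actual per-operation costs differ from this idealized step model, but the shape of the recurrence is what drives the exponential growth.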
Case Studies: Fib1(n)
9
- Let T(n) be number of computer steps to compute fib1(n)
- T(0)=1
- T(1)=2
- T(n)=T(n-1)+T(n-2)+3, n>1
- Analyze running time of recursive algorithm
- first, write a recursive formula for its running time
- then, recursive formula => closed formula, asymptotic result
- How fast does T(n) grow? Can you see that T(n) > Fn ?
- How big is T(n)?
Running Time analysis
10
Mathematical Induction
- F0=0, F1=1, Fn=Fn-1+Fn-2
- We will show that Fn ≥ 2^(n/2) for n ≥ 6, using the
strong mathematical induction technique
- Intuition of basic mathematical induction
- it’s like Domino effect
- if one pushes the 1st card, all cards fall, because
1) the 1st card is pushed down, and 2) every card is close enough to the next card, so that when one card falls, the next one falls too
11
Mathematical Induction
- Sometimes, we need multiple previous cards
to knock down the next card…
- Intuition of strong mathematical induction
- it’s like the Domino effect: if one pushes the first two
cards, all cards fall, because the combined weight of two falling cards knocks down the next card
- Generalization: 2 => k
12
Fibonacci numbers
- F0=0, F1=1, Fn=Fn-1+Fn-2
- show that Fn ≥ 2^(n/2) for all integers n ≥ 6, using
strong mathematical induction
- basis step: show it’s true when n=6, 7
- inductive step: show that if it’s true for n=k-1 and n=k,
then it’s true for n=k+1
- given Fk ≥ 2^(k/2) and Fk-1 ≥ 2^((k-1)/2):
  Fk+1 = Fk-1 + Fk ≥ 2^((k-1)/2) + 2^(k/2) ≥ 2^((k-1)/2) + 2^((k-1)/2) = 2 × 2^((k-1)/2) = 2^(1+(k-1)/2) = 2^((k+1)/2)
13
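The bound Fn ≥ 2^(n/2) for n ≥ 6 can also be checked numerically; the sketch below (Python, as an illustration) verifies it over a range of n and shows that the basis really matters, since the bound fails at n = 5:

```python
# Check the induction's claim F_n >= 2^(n/2) for n >= 6
# (and that it fails just below the basis, at n = 5).
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in range(6, 60):
    assert fib(n) >= 2 ** (n / 2)

assert fib(5) < 2 ** 2.5   # F_5 = 5 < 2^2.5 ≈ 5.66
```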
Fibonacci numbers
- F0=0, F1=1, Fn=Fn-1+Fn-2
- Fn is lower bounded by Fn ≥ 2^(n/2) = 2^(0.5n)
- In fact, there is a tighter lower bound: Fn ≥ 2^(0.694n)
- Recall T(n): number of computer steps to compute
fib1(n):
- T(0)=1
- T(1)=2
- T(n)=T(n-1)+T(n-2)+3, n>1
- Therefore T(n) > Fn ≥ 2^(0.694n)
14
Exponential running time
- Running time of Fib1: T(n) > 2^(0.694n)
- Running time of Fib1 is exponential in n
- to calculate F200, it takes at least 2^138 computer steps
- On NEC Earth Simulator (fastest computer 2002-2004)
- executes 40 trillion (4×10^13) steps per second, i.e., 40 teraflops
- assuming each step takes the same amount of time as a
“floating point operation”
- time to calculate F200: at least 2^92 seconds, i.e., about
1.57×10^20 years
- Can we throw more computing power to the problem?
- Moore’s law: computer speeds double about every 18
months (or 2 years according to newer version)
15
Exponential algorithms
- Moore’s law (computer speeds double about every
two years) can sustain for 4-5 more years…
16
Exponential running time
- Running time of Fib1: T(n) > 2^(0.694n) = 1.6177^n
- Moore’s law: computer speeds double about
every 18 months (or 2 years according to newer version)
- If it takes this year’s fastest CPU 6 minutes to
calculate F50,
- the fastest CPU two years from today can still
only calculate F52 in 6 minutes
- Algorithms with exponential running time are not
efficient, not scalable
17
Fastest Supercomputer
- June 2017 ranking
- Sunway TaihuLight, 93 petaflops
- Tianhe-2 (Milky Way-2), 33.9 petaflops
- Cray XC50, Swiss, 19.6 petaflops
- Titan, Cray XK7, US, 17.6 petaflops
- Petaflop: one thousand million million (10^15)
floating-point operations per second
- Need parallel algorithms to take full advantage
of these computers
18
Big numbers
19
Can we do better?
- Draw recursive function call tree for fib1(5)
- Observation: wasteful repeated calculation
- Idea: Store solutions to subproblems in an array (the key idea of Dynamic Programming)
20
Running time fib2(n)
- Analyze running time of iterative (non-recursive) algorithm:
T(n) = 1        // if n=0 return 0
     + n        // create an array f[0…n]
     + 2        // f[0]=0, f[1]=1
     + (n-1)    // for loop: repeated n-1 times
     = 2n + 2
- T(n) is a linear function of n, or fib2(n) has linear running time
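A direct transcription of fib2 into Python (one possible rendering of the slide's pseudocode):

```python
def fib2(n):
    """Iterative Fibonacci using a table f[0..n], as in the slides' fib2."""
    if n == 0:
        return 0
    f = [0] * (n + 1)            # create array f[0...n]
    f[1] = 1                     # f[0] = 0, f[1] = 1
    for i in range(2, n + 1):    # loop body runs n-1 times
        f[i] = f[i - 1] + f[i - 2]
    return f[n]
```

Each line of the loop executes once per iteration, so the total step count is linear in n, matching the 2n+2 count above.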
21
Alternatively…
- How long does it take for fib2(n) to finish?
T(n) = 1000 + 200n + 2×60 + (n-1)×800 = 1000n + 320 // in units of µs
- Again: T(n) is a linear function of n
- Constants are not important: different on different computers
- System effects (caching, OS scheduling) make it pointless to do
such fine-grained analysis anyway!
- Algorithm analysis focuses on how running time grows as
problem size grows (constant, linear, quadratic, exponential?)
- not the actual real world time
22
Estimation based upon CPU timings: the first step takes 1000µs, creating the array takes 200n µs, each assignment takes 60µs, and each addition-with-assignment takes 800µs…
23
Summary: Running time analysis
- Given an algorithm in pseudocode or actual program
- When the input size is n, how many computer steps are executed
in total?
- Size of input: size of an array, polynomial degree, # of elements in
a matrix, vertices and edges in a graph, or # of bits in the binary representation of input
- Computer steps: arithmetic operations, data movement, control,
decision making (if, while), comparison,…
- each step takes a constant amount of time
- Ignore:
- Overhead of function calls (call stack frame allocation, passing
parameters, and return values)
- Different execution time for different steps
Time for exercises/examples
- 1. Reading algorithms in pseudocode
- 2. Writing algorithms in pseudocode
- 3. Analyzing algorithms
24
25
Algorithm Analysis: Example
- What’s the running time of MIN?
Algorithm/Function: MIN(a[1…n])
input: an array of numbers a[1…n]
output: the minimum number among a[1…n]

    m = a[1]
    for i = 2 to n:
        if a[i] < m:
            m = a[i]
    return m
- How do we measure the size of input for this algorithm?
- How many computer steps when the input’s size is n?
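One possible Python rendering of MIN (indices shifted from the slide's 1-based pseudocode to Python's 0-based lists):

```python
def MIN(a):
    """Linear scan for the minimum; input size n = len(a), about n steps."""
    m = a[0]                       # m = a[1] in the 1-based pseudocode
    for i in range(1, len(a)):     # for i = 2 to n
        if a[i] < m:
            m = a[i]
    return m
```

The loop body runs n-1 times and each iteration is a constant number of steps, so MIN takes linear time in n.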
26
Algorithm Analysis: bubble sort
Algorithm/Function: bubblesort(a[1…n])
input: a list of numbers a[1…n]
output: a sorted version of this list

    for endp = n down to 2:
        for i = 1 to endp-1:
            if a[i] > a[i+1]:
                swap(a[i], a[i+1])
    return a
- How do you choose to measure the size of input?
- length of list a, i.e., n
- the longer the input list, the longer it takes to sort it
- Problem instance: a particular input to the algorithm
- e.g., a[1…6]={1, 4, 6, 2, 7, 3}
- e.g., a[1…6]={1, 4, 5, 6, 7, 9}
27
Algorithm Analysis: bubble sort
Algorithm/Function: bubblesort(a[1…n])
input: an array of numbers a[1…n]
output: a sorted version of this array

    for endp = n down to 2:
        for i = 1 to endp-1:
            if a[i] > a[i+1]:
                swap(a[i], a[i+1])
    return a
- endp=n: inner loop (for j=1 to endp-1) repeats for n-1 times
- endp=n-1: inner loop repeats for n-2 times
- endp=n-2: inner loop repeats for n-3 times
- …
- endp=2: inner loop repeats once
- Total # of steps: T(n) = (n-1)+(n-2)+(n-3)+…+1, where each
inner-loop iteration costs a constant number of computer steps
28
How big is T(n)?
T(n) = (n-1)+(n-2)+(n-3)+…+1
- Can you write big-sigma notation for T(n)? (T(n) = Σ_{i=1}^{n-1} i)
- Can you write a simplified formula for T(n)? (T(n) = n(n-1)/2)
- Can you prove the above using mathematical induction?
1) when n=2: left-hand side = 1, and right-hand side = 2×1/2 = 1
- 2) if the equation is true for n=k, then it’s also true for n=k+1
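The counting argument can be verified empirically; this Python sketch (an illustration, not part of the slides) instruments bubble sort and checks that the comparison count equals n(n-1)/2:

```python
def bubblesort_count(a):
    """Bubble sort as in the slide; returns (sorted list, number of comparisons)."""
    a = list(a)                        # copy so the caller's list is untouched
    n = len(a)
    comparisons = 0
    for endp in range(n, 1, -1):       # endp = n down to 2
        for i in range(endp - 1):      # i = 1 to endp-1 (0-based here)
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a, comparisons

# the comparison count matches T(n) = (n-1)+(n-2)+...+1 = n(n-1)/2
for n in range(2, 20):
    _, c = bubblesort_count(list(range(n, 0, -1)))
    assert c == n * (n - 1) // 2
```

Note the count is the same for every input of length n: bubble sort's comparison count depends only on n, not on the particular problem instance.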
29
Algorithm Analysis: Binary Search
Algorithm/Function: search(a[L…R], value)
input: a list of numbers a[L…R] sorted in ascending order, a number value
output: the index of value in list a (if value is in it), or -1 if not found

    if (L>R): return -1
    m = (L+R)/2
    if (a[m]==value): return m
    else:
        if (a[m]>value): return search(a[L…m-1], value)
        else: return search(a[m+1…R], value)
- What’s the size of input in this algorithm?
- length of list a[L…R]
30
Algorithm Analysis: Binary Search
Algorithm/Function: search(a[L…R], value)
input: a list of numbers a[L…R] sorted in ascending order, a number value
output: the index of value in list a (if value is in it), or -1 if not found

    if (L>R): return -1
    m = (L+R)/2
    if (a[m]==value): return m
    else:
        if (a[m]>value): return search(a[L…m-1], value)
        else: return search(a[m+1…R], value)
- Let T(n) be the number of steps to search a list of size n
- best case (value is at the middle point): T(n)=3
- the worst case (when value is not in the list) provides an upper
bound
31
Algorithm Analysis: Binary Search
Algorithm/Function: search(a[L…R], value)
input: a list of numbers a[L…R] sorted in ascending order, a number value
output: the index of value in list a (if value is in it), or -1 if not found

    if (L>R): return -1
    m = (L+R)/2
    if (a[m]==value): return m
    else:
        if (a[m]>value): return search(a[L…m-1], value)
        else: return search(a[m+1…R], value)
- Let T(n) be the number of steps to search a list of size n in the worst case
- T(0)=1 // base case, when L>R
- T(n)=3+T(n/2) // general case: reduce problem size by half
- Next chapter: the master theorem solves this recurrence, giving T(n) = O(log2 n)
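A Python rendering of the recursive binary search (integer division for the midpoint, 0-based indices):

```python
def search(a, L, R, value):
    """Recursive binary search on a[L..R] (sorted ascending); -1 if absent."""
    if L > R:
        return -1
    m = (L + R) // 2               # midpoint, rounding down
    if a[m] == value:
        return m
    if a[m] > value:
        return search(a, L, m - 1, value)   # recurse on left half
    return search(a, m + 1, R, value)       # recurse on right half
```

Each recursive call does a constant amount of work and halves the range, which is exactly what the recurrence T(n) = 3 + T(n/2) captures.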
mini-summary
- Running time analysis
- identifying input size n
- counting, and recursive formula: number of
steps in terms of n
- Chapter 2: Solving recursive formula
- Next:
- Asymptotic running time
32
33
Order (Growth Rate) of functions
- f(x)=2x: constant growth rate (slope is 2)
- : growth rate increases as x
increases (see figure above)
- : growth rate decreases as x
increases
f(x) = 2x
f(x) = log2x
- Growth rate: How
fast f(x) increases as x increases
- slope (derivative)
f(x + ∆x) − f(x) ∆x
34
Order (Growth Rate) of functions
- Asymptotic growth rate: the growth rate of a function as x → ∞
- slope (derivative) when x is very big
- the larger the asymptotic growth rate, the larger f(x) as x → ∞
- e.g., f(x) = 2x: asymptotic growth rate is 2
- f(x) = 2^x: asymptotic growth rate is very big!

(Asymptotic) growth rates of functions of n, from low to high:
log(n) < n < n·log(n) < n^2 < n^3 < n^4 < … < 1.5^n < 2^n < 3^n
- Two sorting algorithms:
- yours: 50n·log2(n) steps
- your friend’s: 2n^2 steps
- Which one is better (for large problem size)?
- Compare the ratio when n is large
35
Compare two running times
50n·log2(n) / (2n^2) = 25·log2(n) / n → 0, as n → ∞
For large n, the running time of your algorithm (50n·log2 n) is much smaller than that of your friend’s (2n^2).
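The limit can also be observed numerically; a quick Python check (illustrative):

```python
import math

def ratio(n):
    """Ratio of 50·n·log2(n) to 2·n^2, which simplifies to 25·log2(n)/n."""
    return (50 * n * math.log2(n)) / (2 * n * n)

values = [ratio(10 ** k) for k in range(1, 7)]
assert all(x > y for x, y in zip(values, values[1:]))  # strictly decreasing
assert values[-1] < 0.001                              # tiny already at n = 10^6
```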
Rules of thumb
- if f(n)/g(n) → 0 as n → ∞, we say g(n) dominates f(n)
- n^a dominates n^b if a > b
- e.g., n^2 dominates n
- any exponential dominates any polynomial
- e.g., 1.1^n dominates n^20
- any polynomial dominates any logarithm
- e.g., n dominates (log n)^2
36
37
Algorithm Efficiency vs. Speed
E.g.: sorting n numbers
- Your friend’s computer: 10^9 instructions/second; your friend’s algorithm: 2n^2 computer steps
- Your computer: 10^7 instructions/second; your algorithm: 50n·log(n) computer steps
- To sort n = 10^6 numbers:
- your friend: 2×(10^6)^2 steps / 10^9 steps per second = 2000 seconds
- you: 50×10^6×log2(10^6) steps / 10^7 steps per second ≈ 100 seconds
Your algorithm finishes 20 times faster! More importantly, the ratio becomes larger with larger n!
- Two sorting algorithms:
- yours: 2n^2 + 100n
- your friend’s: 2n^2
- Which one is better (for large arrays)?
- Compare the ratio when n is large
38
Compare Growth Rate of functions (2)
(2n^2 + 100n) / (2n^2) = 1 + 100n/(2n^2) = 1 + 50/n → 1, as n → ∞
They are the same! In general, the lower-order term can be dropped.
- Two sorting algorithms:
- yours: 100n^2
- your friend’s: n^2
- Your friend’s wins.
- Ratio of the two functions:
39
Compare Growth Rate of functions (3)
100n^2 / n^2 = 100, as n → ∞
The ratio is a constant as n increases. => They scale in the same way: your algorithm always takes 100× the time, no matter how big n is.
- In answering “How fast does T(n) grow as n grows?”, leave
out
- lower-order terms
- constant coefficients: not reliable information (the count of
computer steps is somewhat arbitrary), and hardware differences make them unimportant
- Note: you still want to optimize your code to bring down
constant coefficients; it’s only that they don’t affect the “asymptotic growth rate”
- for example: bubble sort executes n(n-1)/2
steps to sort a list of n elements
- We say bubble sort’s running time is quadratic in n, i.e.,
T(n) grows like n^2.
Focus on Asymptotic Growth Rate
40
Big-O notation
- Let f(n) and g(n) be two functions
from positive integers to positive reals.
- f=O(g) means: f grows no faster
than g; g is an asymptotic upper bound of f
- f = O(g) if there is a constant c>0
such that for all n, f(n) ≤ c·g(n)
- or, as in the reference textbook (CLR): there are constants c>0 and n0
such that f(n) ≤ c·g(n) for all n>n0
- Most books use the notation f(n) ∈ O(g(n)),
where O(g) denotes the set of all functions T(n) for which there is a constant c>0 such that T(n) ≤ c·g(n) for all n
41
Big-O notations: Examples
- f=O(g) if there is a constant c>0 such that for all
n, f(n) ≤ c·g(n)
- e.g., f(n)=100n^2, g(n)=n^3: take c=100; then 100n^2 ≤ 100n^3 for all n ≥ 1
- so f(n)=O(g(n)), or 100n^2=O(n^3)
- Exercises: show 100n^2+8n=O(n^2),
n·log(n)=O(n^2),
and 2^n = O(3^n)
42
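For the first exercise, one valid witness constant (an assumed choice; many others work) is c = 108, since 8n ≤ 8n^2 whenever n ≥ 1. A quick Python check:

```python
# Claim: 100n^2 + 8n = O(n^2). Witness: c = 108,
# because 8n <= 8n^2 for n >= 1, so 100n^2 + 8n <= 108n^2.
def f(n):
    return 100 * n * n + 8 * n

def g(n):
    return n * n

assert all(f(n) <= 108 * g(n) for n in range(1, 10000))
```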
- Let f(n) and g(n) be two
functions from positive integers to positive reals.
- f=Ω(g) means: f grows no
slower than g; g is an asymptotic lower bound of f
- f = Ω(g) if and only if
g=O(f)
- or, if and only if there is
a constant c>0, such that f(n) ≥ c·g(n) for all n
- Big-Ω notations
43 (Equivalent definition in CLR: there are constants c>0 and n0 such that f(n) ≥ c·g(n) for all n>n0)
- f=Ω(g) means: f grows no slower than g; g is an
asymptotic lower bound of f
- f = Ω(g) if and only if g=O(f)
- or, if and only if there is a constant c>0, such that
f(n) ≥ c·g(n) for all n
- e.g., f(n)=100n^2, g(n)=n: take c=1; then 100n^2 ≥ n for all n ≥ 1,
so f(n)=Ω(g(n)), or 100n^2=Ω(n)
- Exercises: show 100n^2+8n=Ω(n^2)
and 2^n=Ω(n^8)
Big-Ω notations
44
- f = Θ(g) means: f grows
no slower and no faster than g; f grows at the same rate as g asymptotically
- f = Θ(g) if and only if
f = O(g) and f = Ω(g)
- i.e., there are
constants c1, c2>0, s.t.,
c1·g(n) ≤ f(n) ≤ c2·g(n) for all n
- Function f can be
sandwiched between g by two constant factors
Big-Θ notations
45
- Show that
log2(n) = Θ(log10(n))
Big-Θ notations
46
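One way to see this: log2(n) = log10(n) / log10(2), so the two functions differ only by the constant factor 1/log10(2) ≈ 3.32, which is exactly what Θ permits. A quick numerical check (Python, illustrative):

```python
import math

# log2(n) = log10(n) / log10(2): the ratio is constant, so log2 n = Θ(log10 n)
c = 1 / math.log10(2)   # ≈ 3.3219
for n in (10, 100, 10 ** 6, 10 ** 12):
    assert abs(math.log2(n) / math.log10(n) - c) < 1e-9
```

The same argument shows that all logarithms, whatever their base, are Θ of each other, which is why the base is usually omitted inside O(·).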
mini-summary
- in analyzing running time of algorithms, what’s
important is scalability (performing well for large inputs)
- therefore, constants are not important
(counting “if … then …” as 1 or 2 steps does not matter)
- focus on the higher-order term, which dominates the
lower-order parts
- e.g., a three-level nested loop dominates a
single-level loop
- In algorithm implementation, constants matter!
47
48
Typical Running Time Functions
- 1 (constant running time):
– Instructions are executed once or a few times
- log(n) (logarithmic), e.g., binary search
– A big problem is solved by cutting original problem in smaller sizes, by a constant fraction at each step
- n (linear): linear search
– A small amount of processing is done on each input element
- n log(n): merge sort
– A problem is solved by dividing it into smaller problems, solving them independently, and combining the solutions
49
Typical Running Time Functions
- n2 (quadratic): bubble sort
- Typical for algorithms that process all pairs of data items (double
nested loops)
- n3 (cubic)
– Processing of triples of data (triple nested loops)
- n^k (polynomial)
- 2^(0.694n) (exponential): Fib1
- 2n (exponential):
– Few exponential algorithms are appropriate for practical use
– 3n (exponential), …
Review of logarithmic functions
- Definition
- Rules
- Applications to CS
50
NP-Completeness
- A problem that has a polynomial running time
algorithm is considered tractable (feasible to solve efficiently)
- Not all problems are tractable
– Some problems cannot be solved by any computer, no matter how much time is provided (e.g., Turing’s Halting problem); such problems are called undecidable
– Some problems can be solved in exponential time, but have no known polynomial time algorithm
- Can we tell if a problem can be solved in polynomial
time? – NP, NP-complete, NP-hard
51
Summary
- This class focused on algorithm running time
analysis
- start with running time function, expressing
number of computer steps in terms of input size
- Focus on very large problem size, i.e.,
asymptotic running time
- big-O notations => focus on dominating terms
in running time function
- Constant, linear, polynomial, exponential time
algorithms …
- NP, NP complete problem
52
Coming up?
- Algorithm analysis is only one aspect of the class
- We will look at different algorithm design
paradigms, using problems from a wide range of domains (numbers, encryption, sorting, searching, graphs, …)
53
54
Readings
- Chapter 1