ANALYSIS OF ALGORITHMS (Feb. 16, 2017)

BBM 202 - ALGORITHMS
DEPT. OF COMPUTER ENGINEERING
ANALYSIS OF ALGORITHMS
Feb. 16, 2017
Acknowledgement: The course slides are adapted from the slides prepared by R. Sedgewick and K. Wayne of Princeton University.

TODAY
‣ Analysis of Algorithms
‣ Observations
‣ Mathematical models
‣ Order-of-growth classifications
‣ Dependencies on inputs
‣ Memory

Cast of characters
• Programmer needs to develop a working solution.
• Client wants to solve problem efficiently.
• Theoretician wants to understand.
• Basic blocking and tackling is sometimes necessary. [this lecture]
• Student might play any or all of these roles someday.

Running time
"As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise: By what course of calculation can these results be arrived at by the machine in the shortest time?" (Charles Babbage, 1864)
[Figure: Analytic Engine. How many times do you have to turn the crank?]

Reasons to analyze algorithms
• Predict performance.
• Compare algorithms. [this course (BBM 202)]
• Provide guarantees.
• Understand theoretical basis. [Analysis of algorithms (BBM 408)]
Primary practical reason: avoid performance bugs.
[client gets poor performance because programmer did not understand performance characteristics]

Some algorithmic successes
Discrete Fourier transform.
• Break down waveform of N samples into periodic components.
• Applications: DVD, JPEG, MRI, astrophysics, ….
• Brute force: $N^2$ steps.
• FFT algorithm: $N \log N$ steps, enables new technology. [Friedrich Gauss, 1805]
• sFFT: Sparse Fast Fourier Transform algorithm (Hassanieh et al., 2012)
  - A faster Fourier Transform: $k \log N$ steps (with k sparse coefficients)
[Figure: time (8T to 64T) vs. size (1K to 8K) for quadratic, linearithmic, and linear growth]

Some algorithmic successes
N-body simulation.
• Simulate gravitational interactions among N bodies.
• Brute force: $N^2$ steps.
• Barnes-Hut algorithm: $N \log N$ steps, enables new research. [Andrew Appel, PU '81]
[Figure: time (8T to 64T) vs. size (1K to 8K) for quadratic, linearithmic, and linear growth]

The challenge
Q. Will my program be able to solve a large practical input?
   Why is my program so slow? Why does it run out of memory?
Key insight. [Knuth 1970s] Use scientific method to understand performance.
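To make the gap between the brute-force and $N \log N$ step counts above concrete, here is a minimal sketch that tabulates both for the input sizes shown on the plots (1K to 8K). The class name `GrowthGap` and the choice of a base-2 logarithm are assumptions made for illustration; the slides do not fix the base.

```java
// Sketch: compare quadratic and linearithmic step counts for doubling N.
class GrowthGap {
    // Step count of a brute-force (quadratic) algorithm on input size n.
    static double quadratic(long n) {
        return (double) n * n;
    }

    // Step count of a linearithmic algorithm; base-2 logarithm is an assumption.
    static double linearithmic(long n) {
        return n * (Math.log(n) / Math.log(2));
    }

    public static void main(String[] args) {
        System.out.printf("%8s %14s %14s %8s%n", "N", "N^2", "N lg N", "ratio");
        for (long n = 1024; n <= 8192; n *= 2) {
            double q = quadratic(n);
            double l = linearithmic(n);
            // The ratio grows with N, which is why the quadratic curve pulls away.
            System.out.printf("%8d %14.0f %14.0f %8.1f%n", n, q, l, q / l);
        }
    }
}
```

The ratio column is N / lg N, so it keeps growing as the input doubles.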

Scientific method applied to analysis of algorithms
A framework for predicting performance and comparing algorithms.
Scientific method.
• Observe some feature of the natural world.
• Hypothesize a model that is consistent with the observations.
• Predict events using the hypothesis.
• Verify the predictions by making further observations.
• Validate by repeating until the hypothesis and observations agree.
Principles. Experiments must be reproducible. Hypotheses must be falsifiable.
Feature of the natural world = computer itself.

ANALYSIS OF ALGORITHMS
‣ Observations
‣ Mathematical models
‣ Order-of-growth classifications
‣ Dependencies on inputs
‣ Memory

Example: 3-sum
3-sum. Given N distinct integers, how many triples sum to exactly zero?

    % more 8ints.txt
    8
    30 -40 -20 -10 40 0 10 5

    % java ThreeSum 8ints.txt
    4

         a[i]   a[j]   a[k]   sum
    1     30    -40     10     0
    2     30    -20    -10     0
    3    -40     40      0     0
    4    -10      0     10     0

Context. Deeply related to problems in computational geometry.

3-sum: brute-force algorithm

    public class ThreeSum
    {
       public static int count(int[] a)
       {
          int N = a.length;
          int count = 0;
          for (int i = 0; i < N; i++)
             for (int j = i+1; j < N; j++)
                for (int k = j+1; k < N; k++)
                   // check each triple; for simplicity, ignore integer overflow
                   if (a[i] + a[j] + a[k] == 0)
                      count++;
          return count;
       }

       public static void main(String[] args)
       {
          int[] a = In.readInts(args[0]);
          StdOut.println(count(a));
       }
    }
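The brute-force 3-sum code above depends on the `In` and `StdOut` helpers and, as the slide notes, ignores integer overflow. The following is a minimal self-contained sketch of the same triple loop that widens the sum to `long` so overflow cannot produce a false match. The class name `ThreeSumSketch` and the hard-coded input (the contents of `8ints.txt` as shown on the slide) are choices made here for illustration, not part of the course code.

```java
// Sketch: brute-force 3-sum without stdlib.jar, with an overflow-safe sum.
class ThreeSumSketch {
    // Count triples i < j < k with a[i] + a[j] + a[k] == 0.
    static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    // Casting the first operand to long makes the whole sum a long,
                    // so three int values can never wrap around to zero.
                    if ((long) a[i] + a[j] + a[k] == 0)
                        count++;
        return count;
    }

    public static void main(String[] args) {
        int[] a = { 30, -40, -20, -10, 40, 0, 10, 5 }; // values from 8ints.txt
        System.out.println(count(a)); // 4, matching the slide's example run
    }
}
```

With plain `int` arithmetic, an input such as `{Integer.MAX_VALUE, Integer.MAX_VALUE, 2}` would wrap around to zero and be miscounted as a triple; the widened sum avoids that.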

Measuring the running time
Q. How to time a program?
A. Manual.

    % java ThreeSum 1Kints.txt
    70
    % java ThreeSum 2Kints.txt
    528
    % java ThreeSum 4Kints.txt
    4039

[Figure: a stopwatch ticks while each run executes; each larger input takes many more ticks]

Measuring the running time
Q. How to time a program?
A. Automatic.

    public class Stopwatch (part of stdlib.jar)
       Stopwatch()             create a new stopwatch
       double elapsedTime()    time since creation (in seconds)

client code:

    public static void main(String[] args)
    {
       int[] a = In.readInts(args[0]);
       Stopwatch stopwatch = new Stopwatch();
       StdOut.println(ThreeSum.count(a));
       double time = stopwatch.elapsedTime();
    }

implementation (part of stdlib.jar):

    public class Stopwatch
    {
       private final long start = System.currentTimeMillis();

       public double elapsedTime()
       {
          long now = System.currentTimeMillis();
          return (now - start) / 1000.0;
       }
    }

Empirical analysis
Run the program for various input sizes and measure running time.

         N    time (seconds)
       250         0.0
       500         0.0
     1,000         0.1
     2,000         0.8
     4,000         6.4
     8,000        51.1
    16,000          ?
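The empirical analysis above (run the program for various input sizes and record the time) can be sketched as a small self-contained harness. This is an illustration under stated assumptions, not the course's code: the class name `DoublingTimer`, the seed, and the value range are invented here; it uses `System.nanoTime()` instead of the `Stopwatch` class so that it runs without `stdlib.jar`; and it generates pseudo-random integers, which may contain duplicates, unlike the distinct integers in the 3-sum definition.

```java
import java.util.Random;

// Sketch: time brute-force 3-sum on inputs of doubling size.
class DoublingTimer {
    // Same triple loop as the brute-force algorithm, with an overflow-safe sum.
    static int count(int[] a) {
        int n = a.length;
        int count = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                for (int k = j + 1; k < n; k++)
                    if ((long) a[i] + a[j] + a[k] == 0)
                        count++;
        return count;
    }

    // Run count() once on n pseudo-random ints and return the elapsed seconds.
    static double timeTrial(int n, long seed) {
        Random random = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++)
            a[i] = random.nextInt(2_000_001) - 1_000_000; // range [-1e6, 1e6] is an assumption
        long start = System.nanoTime();
        int result = count(a);
        long elapsed = System.nanoTime() - start;
        if (result < 0) throw new IllegalStateException("count cannot be negative");
        return elapsed / 1e9;
    }

    public static void main(String[] args) {
        System.out.printf("%6s %10s%n", "N", "seconds");
        for (int n = 250; n <= 2000; n *= 2)
            System.out.printf("%6d %10.3f%n", n, timeTrial(n, 42L));
    }
}
```

Measured times depend on the machine and the JVM warm-up, so the numbers will not match the table above; only the growth pattern is expected to be similar.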

Data analysis
Standard plot. Plot running time T(N) vs. input size N.
[Figure: standard plot of running time T(N) (0 to 50 seconds) vs. problem size N (1K to 8K)]

Data analysis
Log-log plot. Plot running time T(N) vs. input size N using log-log scale.
[Figure: log-log plot of lg(T(N)) vs. lg N (1K to 8K); the points lie on a straight line of slope 3]

    $\lg(T(N)) = b \lg N + c$
    $b = 2.999$ (slope)
    $c = -33.2103$
    $T(N) = a N^b$, where $a = 2^c$ (power law)

Regression. Fit straight line through data points: $a N^b$.
Hypothesis. The running time is about $1.006 \times 10^{-10} \times N^{2.999}$ seconds.

Prediction and validation
Hypothesis. The running time is about $1.006 \times 10^{-10} \times N^{2.999}$ seconds.
["order of growth" of running time is about $N^3$ [stay tuned]]
Predictions.
• 51.0 seconds for N = 8,000.
• 408.1 seconds for N = 16,000.
Observations.

         N    time (seconds)
     8,000        51.1
     8,000        51.0
     8,000        51.1
    16,000       410.8

validates hypothesis!

Doubling hypothesis
Doubling hypothesis. Quick way to estimate b in a power-law relationship.
Run program, doubling the size of the input.

         N    time (seconds)    ratio    lg ratio
       250         0.0            -         -
       500         0.0           4.8       2.3
     1,000         0.1           6.9       2.8
     2,000         0.8           7.7       2.9
     4,000         6.4           8.0       3.0
     8,000        51.1           8.0       3.0

seems to converge to a constant $b \approx 3$
Hypothesis. Running time is about $a N^b$ with b = lg ratio.
Caveat. Cannot identify logarithmic factors with doubling hypothesis.
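The regression step on the log-log slide (fit a straight line $\lg T(N) = b \lg N + c$, then recover $a = 2^c$) can be sketched with an ordinary least-squares fit. The class name `LogLogFit` is invented for illustration, and the input is assumed to be the four nonzero rows of the timing table (N = 1,000 to 8,000); the slides do not say exactly which points were used, although this choice should give a slope and intercept close to the quoted b = 2.999 and c = -33.2103.

```java
// Sketch: least-squares fit of lg T = b lg N + c on timing data.
class LogLogFit {
    static double lg(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Returns {b, c}: slope and intercept of the best-fit line in log-log space.
    static double[] fit(double[] n, double[] t) {
        int m = n.length;
        double sumX = 0, sumY = 0;
        for (int i = 0; i < m; i++) {
            sumX += lg(n[i]);
            sumY += lg(t[i]);
        }
        double meanX = sumX / m, meanY = sumY / m;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < m; i++) {
            double dx = lg(n[i]) - meanX;
            sxy += dx * (lg(t[i]) - meanY);
            sxx += dx * dx;
        }
        double b = sxy / sxx;          // slope = exponent of the power law
        double c = meanY - b * meanX;  // intercept; a = 2^c
        return new double[] { b, c };
    }

    public static void main(String[] args) {
        double[] n = { 1000, 2000, 4000, 8000 };
        double[] t = { 0.1, 0.8, 6.4, 51.1 }; // seconds, from the table above
        double[] bc = fit(n, t);
        double a = Math.pow(2, bc[1]);
        System.out.printf("b = %.3f, c = %.4f, a = %.3e%n", bc[0], bc[1], a);
    }
}
```

Rows with a measured time of 0.0 are left out because their logarithm is undefined.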

Doubling hypothesis
Doubling hypothesis. Quick way to estimate b in a power-law hypothesis.
Q. How to estimate a (assuming we know b)?
A. Run the program (for a sufficiently large value of N) and solve for a.

         N    time (seconds)
     8,000        51.1
     8,000        51.0
     8,000        51.1

    $51.1 = a \times 8000^3 \Rightarrow a = 0.998 \times 10^{-10}$

Hypothesis. Running time is about $0.998 \times 10^{-10} \times N^3$ seconds.
[almost identical hypothesis to one obtained via linear regression]

Experimental algorithmics
System independent effects. [determines exponent b in power law]
• Algorithm.
• Input data.
System dependent effects. [determines constant a in power law]
• Hardware: CPU, memory, cache, …
• Software: compiler, interpreter, garbage collector, …
• System: operating system, network, other applications, …
Bad news. Difficult to get precise measurements.
Good news. Much easier and cheaper than other sciences. [e.g., can run huge number of experiments]

In practice, constant factors matter too!
Q. How long does this program take as a function of N?

    String s = StdIn.readString();
    int N = s.length();
    ...
    for (int i = 0; i < N; i++)
       for (int j = 0; j < N; j++)
          distance[i][j] = ...
    ...

    Jenny                  Kenny
        N    time              N    time
    1,000    0.11            250    0.5
    2,000    0.35            500    1.1
    4,000    1.6           1,000    1.9
    8,000    6.5           2,000    3.9
    ~ $c_1 N^2$ seconds    ~ $c_2 N$ seconds

ANALYSIS OF ALGORITHMS
‣ Observations
‣ Mathematical models
‣ Order-of-growth classifications
‣ Dependencies on inputs
‣ Memory
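The doubling hypothesis can be applied mechanically to the timing tables above: take the lg of each ratio between consecutive times to estimate b, then solve $T = a N^b$ for a. The sketch below does this for the Jenny and Kenny measurements and for the 3-sum observation at N = 8,000. The class and method names (`DoublingRatio`, `lgRatios`, `estimateA`) are hypothetical, introduced only for this illustration.

```java
// Sketch: estimate b from doubling ratios and a from a single measurement.
class DoublingRatio {
    static double lg(double x) {
        return Math.log(x) / Math.log(2);
    }

    // times[i] is the running time for an input twice as large as times[i-1].
    // Returns lg(times[i] / times[i-1]) for each consecutive pair.
    static double[] lgRatios(double[] times) {
        double[] out = new double[times.length - 1];
        for (int i = 1; i < times.length; i++)
            out[i - 1] = lg(times[i] / times[i - 1]);
        return out;
    }

    // Solve t = a * n^b for a, given one measurement (n, t) and a known exponent b.
    static double estimateA(double n, double t, double b) {
        return t / Math.pow(n, b);
    }

    public static void main(String[] args) {
        double[] jenny = { 0.11, 0.35, 1.6, 6.5 }; // N = 1,000 to 8,000
        double[] kenny = { 0.5, 1.1, 1.9, 3.9 };   // N = 250 to 2,000
        for (double r : lgRatios(jenny)) System.out.printf("Jenny lg ratio: %.2f%n", r);
        for (double r : lgRatios(kenny)) System.out.printf("Kenny lg ratio: %.2f%n", r);
        System.out.printf("a for 3-sum with b = 3: %.3e%n", estimateA(8000, 51.1, 3));
    }
}
```

Jenny's last lg ratio is close to 2 and Kenny's is close to 1, consistent with the quadratic and linear labels under the two tables. The early ratios are noisier because the measured times are small.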
