Analysis of Algorithms Data Structures and Algorithms for CL III, WS - - PowerPoint PPT Presentation


Department of General and Computational Linguistics Analysis of Algorithms Data Structures and Algorithms for CL III, WS 2019-2020 Corina Dima corina.dima@uni-tuebingen.de MICHAEL GOODRICH Data Structures & Algorithms in Python ROBERTO T


slide-1
SLIDE 1

Corina Dima corina.dima@uni-tuebingen.de

Department of General and Computational Linguistics

Data Structures and Algorithms for CL III, WS 2019-2020

Analysis of Algorithms

slide-2
SLIDE 2

Analysis of Algorithms | 2

Data Structures & Algorithms in Python

MICHAEL GOODRICH ROBERTO TAMASSIA MICHAEL GOLDWASSER

  • 1. Python Primer
  • 2. Object-Oriented Programming
slide-3
SLIDE 3

Don’t forget to register – registration closes tonight!

Analysis of Algorithms | 3

https://dsacl3-2019.github.io/

slide-4
SLIDE 4

Analysis of Algorithms | 4

Data Structures & Algorithms in Python

MICHAEL GOODRICH ROBERTO TAMASSIA MICHAEL GOLDWASSER

  • 3. Algorithm Analysis

      • experimental studies
      • seven functions
      • asymptotic analysis

slide-5
SLIDE 5

Analysis of Algorithms | 5

[Figure: Input → Algorithm → Output]

slide-6
SLIDE 6

Running Time

  • The running time of an algorithm typically grows with the input size.
  • But may also vary for inputs of the same size
  • Running time is influenced by the hardware and software environment

Analysis of Algorithms | 6

[Figure: running time plotted against input size, with best-case, average-case and worst-case curves]

slide-7
SLIDE 7

Experimental Study

  • Write a program implementing the algorithm
  • Run the program with inputs of varying size and composition, recording the time needed
  • Analyze the results

Analysis of Algorithms | 7

[Figure: time (ms) plotted against input size]

  • Measure the elapsed time, e.g. using clock() or the timeit module
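
The experimental procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code from the original slide: `build_list` is a hypothetical workload, and `time.perf_counter()` stands in for `time.clock()`, which was removed in Python 3.8.

```python
import timeit
from time import perf_counter


def build_list(n):
    """Hypothetical workload: build a list of the first n squares."""
    return [i * i for i in range(n)]


def time_once(func, n):
    """Return the elapsed wall-clock seconds for one call of func(n)."""
    start = perf_counter()
    func(n)
    return perf_counter() - start


def run_experiment(func, sizes, repeats=3):
    """Return (n, best_time) pairs, keeping the fastest of several runs."""
    results = []
    for n in sizes:
        best = min(time_once(func, n) for _ in range(repeats))
        results.append((n, best))
    return results


if __name__ == "__main__":
    for n, seconds in run_experiment(build_list, [1000, 2000, 4000, 8000]):
        print(f"n = {n:5d}  time = {seconds * 1000:.3f} ms")
    # The timeit module automates the repetition for small snippets.
    print(timeit.timeit(lambda: build_list(1000), number=100))
```

Taking the minimum over several runs reduces noise from other processes, but the numbers still depend on the machine and the interpreter.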
slide-11
SLIDE 11

Limitations of Experiments

Analysis of Algorithms | 11

Action: Write a program implementing the algorithm
Challenge: The algorithm must be fully implemented before performing an experimental study

Action: Run the program with inputs of varying size and composition, recording the time needed
Challenge: Experiments can only be done on a limited set of inputs

Action: Analyze the results
Challenge: Experimental runs of two different algorithms are difficult to compare directly unless the experiments are performed in the same hardware and software environments

slide-12
SLIDE 12

Beyond Experimental Analysis

  • An approach to analyzing the efficiency of algorithms that:
      1. Can be used to evaluate the relative efficiency of two algorithms independently of the hardware and software environment
      2. Can be performed by studying a high-level description of the algorithm (pseudocode), without actually implementing it
      3. Takes into account all possible inputs
      4. Characterizes running time as a function of the input size, n

Analysis of Algorithms | 12

slide-13
SLIDE 13

Theoretical Analysis

  • Perform the analysis directly on a high-level description of the algorithm
  • Count the number of primitive operations that are executed, and use this number, t, as a measure of the running time of the algorithm

Analysis of Algorithms | 13

slide-14
SLIDE 14

Primitive Operations

  • Basic computations performed by an algorithm
  • Identifiable in pseudocode
  • Largely independent from the programming language
  • Assumed to take a constant amount of time in the RAM model

Analysis of Algorithms | 14

slide-15
SLIDE 15

Examples of Primitive Operations

  • Assigning an identifier to an object
  • Determining the object associated with an identifier
  • Performing an arithmetic operation (e.g. adding two numbers)
  • Comparing two numbers
  • Accessing a single element of a list by index
  • Calling a function
  • Returning from a function

Analysis of Algorithms | 15

slide-16
SLIDE 16

Focusing on Worst-Case Input

  • An algorithm might run faster on some inputs than it does on others of the same size
  • Average-case analysis: express the running time of an algorithm as a function of the input size obtained by taking the average over all possible inputs of the same size
      • Challenging: requires defining a probability distribution over the set of inputs
  • Solution: characterize running times in terms of the worst case, as a function of the input size, n, of the algorithm
      • Easier: only need to identify the worst-case input
      • Plus: performing well on the worst-case input means that the algorithm needs to do well on every input

Analysis of Algorithms | 16

slide-17
SLIDE 17
  • Associate, with each algorithm, a function f(n) that characterizes the number of primitive operations that are performed as a function of the input size n

Analysis of Algorithms | 17

slide-18
SLIDE 18

Seven Important Functions in Algorithm Analysis

  • 1. Constant: $f(n) = c$
  • 2. Logarithmic: $f(n) = \log_b n$, $b > 1$
  • 3. Linear: $f(n) = n$
  • 4. N-log-N: $f(n) = n \log n$
  • 5. Quadratic: $f(n) = n^2$
  • 6. Cubic, other polynomials: $f(n) = n^3$
  • 7. Exponential: $f(n) = b^n$, $b > 0$

Analysis of Algorithms | 18

slide-19
SLIDE 19

The Constant Function

  • $f(n) = c$, for some fixed constant $c$
  • Regardless of the value of n, the function assigns the value c
  • c is a constant, e.g. c = 5, c = 27, c = 256
  • But we will typically use $g(n) = 1$, given that any other constant function $f(n) = c$ can be written as $f(n) = c\,g(n)$
  • Simple, but helps characterize the number of steps needed to do a basic operation like adding or comparing two numbers

Analysis of Algorithms | 19

slide-20
SLIDE 20

The Logarithm Function

  • $f(n) = \log_b n$, $b > 1$
  • Defined as: $x = \log_b n$ if and only if $b^x = n$
  • By definition, $\log_b 1 = 0$
  • $b$ is called the base of the logarithm
  • The most commonly used base is 2: a common operation is to repeatedly divide the input in half
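
The link between repeated halving and the base-2 logarithm can be checked with a short sketch. The function name `count_halvings` is an illustrative assumption; it counts how often a positive integer can be halved (with integer division) before reaching 1, which equals $\lfloor \log_2 n \rfloor$.

```python
def count_halvings(n):
    """Count the integer halvings needed to reduce n (n >= 1) to 1."""
    steps = 0
    while n > 1:
        n //= 2       # divide the remaining input in half
        steps += 1
    return steps


if __name__ == "__main__":
    for n in (1, 8, 1000, 1_000_000):
        print(n, count_halvings(n))
```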

Analysis of Algorithms | 20

slide-21
SLIDE 21

The Linear Function

  • $f(n) = n$
  • Given an input value n, assigns the value itself
  • Arises in algorithm analysis any time we have to do a single operation for each of n elements, e.g.
      • Comparing a number x to each element of a sequence of size n
      • Counting the number of elements in a sequence

Analysis of Algorithms | 21

slide-22
SLIDE 22

The N-log-N Function

  • $f(n) = n \log n$
  • Base 2 logarithm
  • Also called the linearithmic function (Sedgewick & Wayne, 2011)
  • Grows a little more rapidly than the linear function, and a lot less rapidly than the quadratic function
  • An n-log-n algorithm is usually preferable to a quadratic algorithm

Analysis of Algorithms | 22

slide-25
SLIDE 25

The Quadratic Function

  • $f(n) = n^2$
  • Given an input n, the function assigns the product of n with itself
  • Appears in the analysis of algorithms because of nested loops, where the inner loop performs a linear number of operations, and the outer loop is performed a linear number of times
  • Also appears in nested loops where the first iteration uses one operation, the second two operations, the third three operations etc., where the number of operations is

$$\sum_{i=1}^{n} i = 1 + 2 + 3 + \dots + (n - 2) + (n - 1) + n = \frac{n(n + 1)}{2}$$

Analysis of Algorithms | 25

Carl Friedrich Gauss, 1777 - 1855
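
The nested-loop pattern behind this sum can be sketched as follows. The function name `count_triangular_ops` is an illustrative assumption; it counts one operation per inner-loop step and can be compared against the closed form $n(n+1)/2$.

```python
def count_triangular_ops(n):
    """Count operations when outer iteration i performs i inner steps."""
    count = 0
    for i in range(1, n + 1):   # outer loop runs n times
        for _ in range(i):      # iteration i performs i operations
            count += 1
    return count


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, count_triangular_ops(n), n * (n + 1) // 2)
```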

slide-26
SLIDE 26

The Cubic Function and Other Polynomials

  • $f(n) = n^3$
  • $f(n) = a_0 + a_1 n + a_2 n^2 + a_3 n^3 + \dots + a_d n^d$, where
      • $a_0, a_1, a_2, \dots, a_d$ are constants called the coefficients of the polynomial, and $a_d \neq 0$.
      • $d$ indicates the highest power of the polynomial and is called the degree of the polynomial
  • Examples
      • $f(n) = 2 + 5n + n^2$
      • $f(n) = 1 + n^3$

Analysis of Algorithms | 26

slide-28
SLIDE 28

The Exponential Function

  • $f(n) = b^n$, $b > 0$
  • $b$ is called the base, $n$ is called the exponent
  • $f(n)$ assigns to the input n the value obtained by multiplying the base b by itself a total of n times
  • Appears in the analysis of algorithms where we have a loop that starts by performing one operation and then e.g. doubles the number of operations performed with each iteration – at the nth iteration the number of operations performed is $2^n$.

$$\sum_{i=0}^{n} 2^i = 1 + 2 + 2^2 + \dots + 2^n = \frac{2^{n+1} - 1}{2 - 1}$$
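
The doubling loop described above can be sketched as follows. The function name `count_doubling_ops` is an illustrative assumption; iterations are numbered $i = 0, \dots, n$ so that iteration $i$ performs $2^i$ operations, matching the sum.

```python
def count_doubling_ops(n):
    """Total operations when iteration i (i = 0..n) performs 2**i operations."""
    total = 0
    ops = 1                   # the first iteration performs one operation
    for _ in range(n + 1):
        total += ops
        ops *= 2              # each iteration doubles the work
    return total


if __name__ == "__main__":
    for n in (0, 3, 10):
        print(n, count_doubling_ops(n), 2 ** (n + 1) - 1)
```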

Analysis of Algorithms | 28

slide-29
SLIDE 29

Comparing Growth Rates

  • constant: $1$
  • logarithm: $\log n$
  • linear: $n$
  • n-log-n: $n \log n$
  • quadratic: $n^2$
  • cubic: $n^3$
  • exponential: $2^n$

Analysis of Algorithms | 29

slide-30
SLIDE 30

Comparing Growth Rates

[Figure: growth rates of the seven functions: $f(n) = 1$ (constant), $f(n) = \log n$, $f(n) = n$ (linear), $f(n) = n \log n$ (linearithmic), $f(n) = n^2$ (quadratic), $f(n) = n^3$ (cubic), $f(n) = 2^n$ (exponential)]

Analysis of Algorithms | 30

slide-31
SLIDE 31

Comparing Growth Rates

Analysis of Algorithms | 31

slide-32
SLIDE 32

Comparing Growth Rates

Analysis of Algorithms | 32

slide-33
SLIDE 33

Better Hardware?

Running Time    New Maximum Problem Size
$400n$          $256m$
$2n^2$          $16m$, because $16^2 = 256$
$2^n$           $m + 8$, because $2^8 = 256$

Analysis of Algorithms | 33

  • The importance of a good algorithm goes beyond what can be solved effectively on a given computer
  • Suppose a hardware speedup of 256 times – algorithms with the given running times run 256 times faster on the new computer
  • $m$ is the previous maximum problem size
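
The effect of the speedup can be reproduced with a small sketch. The helper `max_problem_size` and the concrete budgets are illustrative assumptions: the budget is the number of operations the old machine can afford, and the new machine affords 256 times as many.

```python
def max_problem_size(running_time, budget):
    """Largest n >= 0 with running_time(n) <= budget.

    Assumes running_time is nondecreasing; uses a simple linear scan.
    """
    n = 0
    while running_time(n + 1) <= budget:
        n += 1
    return n


if __name__ == "__main__":
    # Linear, 400n: old maximum m = 100, new maximum 256 * m.
    print(max_problem_size(lambda n: 400 * n, 256 * 400 * 100))
    # Quadratic, 2n^2: old maximum m = 100, new maximum 16 * m.
    print(max_problem_size(lambda n: 2 * n * n, 256 * 2 * 100 * 100))
    # Exponential, 2^n: old maximum m = 10, new maximum m + 8.
    print(max_problem_size(lambda n: 2 ** n, 256 * 2 ** 10))
```

The faster the running time grows, the less a constant-factor hardware speedup helps.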
slide-34
SLIDE 34

Asymptotic Algorithm Analysis

  • “big-picture approach”: it is often enough just to know that the running time of an algorithm grows proportionally to n
  • Analyze algorithms using a mathematical notation for functions that disregards constant factors
  • Characterize running times of algorithms by using functions that map the size of the input, n, to values that correspond to the main factor that determines the growth rate in terms of n
  • Analyze an algorithm by estimating the number of primitive operations executed up to a constant factor

Analysis of Algorithms | 34

slide-35
SLIDE 35

Counting Primitive Operations

Analysis of Algorithms | 35

  • Step 1: 2 ops
  • Step 3: 2 ops
  • Step 4: 2n ops
  • Step 5: 2n ops
  • Step 6: 0 to n ops
  • Step 7: 1 op

slide-36
SLIDE 36

Constant Factors

  • The growth rate is not affected by
  • Constant factors
  • Lower-order terms

Analysis of Algorithms | 36

[Figure: plot of $y = x^2$ and $y = 2x^2 + 7$, and of $y = x$ and $y = 3x + 1$, on logarithmic axes]

slide-38
SLIDE 38

Big-Oh Notation

  • Given functions $f(n)$ and $g(n)$, we say that $f(n)$ is $O(g(n))$ if there is a real constant $c > 0$ and an integer constant $n_0 \geq 1$ such that $f(n) \leq c\,g(n)$ for $n \geq n_0$
  • Example: $2n + 10$ is $O(n)$
      • $2n + 10 \leq cn$
      • $(c - 2)n \geq 10$
      • $n \geq \frac{10}{c - 2}$
      • Pick $c = 3$ and $n_0 = 10$
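
The chosen constants can be sanity-checked numerically. The helper `holds_from` is an illustrative assumption; it only tests the inequality on a finite range, so it can refute a bad choice of $c$ and $n_0$ but does not replace the algebraic argument above.

```python
def holds_from(f, g, c, n0, upto=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= upto."""
    return all(f(n) <= c * g(n) for n in range(n0, upto + 1))


def f(n):
    return 2 * n + 10


def g(n):
    return n


if __name__ == "__main__":
    print(holds_from(f, g, c=3, n0=10))  # the witness pair from the example
    print(holds_from(f, g, c=3, n0=9))   # fails at n = 9, since 28 > 27
```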

Analysis of Algorithms | 38

[Figure: plot of $n$, $2n + 10$ and $3n$ against $n$, on logarithmic axes]

slide-39
SLIDE 39

Big-Oh Notation

Analysis of Algorithms | 39

slide-41
SLIDE 41

Big-Oh Example

  • Example: $n^2$ is not $O(n)$
      • $n^2 \leq cn$
      • $n \leq c$
      • The above inequality cannot be satisfied since $c$ must be a constant

Analysis of Algorithms | 41

[Figure: plot of $n$, $10n$, $100n$ and $n^2$ against $n$, on logarithmic axes]

slide-43
SLIDE 43

More Big-Oh Examples

  • $7n - 2$ is $O(n)$
      • Need $c > 0$ and $n_0 \geq 1$ such that $7n - 2 \leq cn$ for $n \geq n_0$.
      • $7n - 2 \leq 7n$; this is true for $c = 7$ and $n_0 = 1$.
  • $3n^3 + 20n^2 + 5$ is $O(n^3)$
      • Need $c > 0$ and $n_0 \geq 1$ such that $3n^3 + 20n^2 + 5 \leq cn^3$ for $n \geq n_0$
      • $3n^3 + 20n^2 + 5 \leq 3n^3 + 20n^3 + 5n^3 = (3 + 20 + 5)n^3$; this is true for $c = 28$ and $n_0 = 1$.
  • $3 \log n + 5$ is $O(\log n)$
      • Need $c > 0$ and $n_0 \geq 1$ such that $3 \log n + 5 \leq c \log n$ for $n \geq n_0$
      • $3 \log n + 5 \leq 8 \log n$; this is true for $c = 8$ and $n_0 = 2$ ($n_0 = 1$ does not work because $\log 1 = 0$)

Analysis of Algorithms | 43

slide-44
SLIDE 44

Big-Oh and Growth Rate

  • The big-Oh notation gives an upper bound on the growth rate of a function
  • The statement f(n) is O(g(n)) means that the growth rate of f(n) is no more than the growth rate of g(n)
  • We can use the big-Oh notation to rank functions according to their growth rate

Analysis of Algorithms | 44

                     f(n) is O(g(n))    g(n) is O(f(n))
g(n) grows more      Yes                No
f(n) grows more      No                 Yes
Same growth          Yes                Yes

slide-45
SLIDE 45

Big-Oh Rules

  • If $f(n)$ is a polynomial of degree $d$, $f(n) = a_0 + a_1 n + a_2 n^2 + a_3 n^3 + \dots + a_d n^d$, then $f(n)$ is $O(n^d)$, i.e.
      • Drop lower-order terms
      • Drop constant factors
  • Use the smallest possible class of functions
      • Say "$2n$ is $O(n)$" instead of "$2n$ is $O(n^2)$"
  • Use the simplest expression of the class
      • Say "$3n + 5$ is $O(n)$" instead of "$3n + 5$ is $O(3n)$"

Analysis of Algorithms | 45

slide-46
SLIDE 46

Asymptotic Algorithm Analysis

  • The asymptotic analysis of an algorithm determines the running time in big-Oh notation
  • To perform the asymptotic analysis
      • We find the worst-case number of primitive operations executed as a function of the input size
      • We express this function with big-Oh notation
  • Example:
      • We say that algorithm find_max runs in O(n) time
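
The code for find_max is not preserved in this transcript. A minimal sketch of what such a function might look like, assuming a nonempty sequence of comparable values, is given below; the loop body executes once per element, which is where the O(n) bound comes from.

```python
def find_max(data):
    """Return the maximum element of a nonempty sequence."""
    biggest = data[0]          # constant number of operations
    for val in data:           # the loop runs n times
        if val > biggest:      # one comparison per iteration
            biggest = val      # executed between 0 and n - 1 times
    return biggest


if __name__ == "__main__":
    print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))
```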

Analysis of Algorithms | 46

slide-47
SLIDE 47

Example: Computing Prefix Averages

  • Given a sequence $S$ consisting of $n$ numbers, compute a sequence $A$ such that $A[j]$ is the average of elements $S[0], \dots, S[j]$, for $j = 0, \dots, n - 1$:

$$A[j] = \frac{\sum_{i=0}^{j} S[i]}{j + 1} = \frac{S[0] + S[1] + \dots + S[j]}{j + 1}$$

  • $A[j]$ is the $j$-th prefix average of $S$

Analysis of Algorithms | 47

index   0    1    2    3    4    5
S       20   10   3    3    14   4
A       20   15   11   9    10   9

slide-48
SLIDE 48

Prefix Averages 1

Analysis of Algorithms | 48

  • What is the running time of the following algorithm for computing prefix averages?
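
The listing this slide refers to is not preserved in the transcript. A sketch of a straightforward version, assuming the name `prefix_average1` and the definition of $A[j]$ given earlier, recomputes every prefix sum from scratch with an inner loop:

```python
def prefix_average1(S):
    """Return list A such that A[j] is the average of S[0], ..., S[j]."""
    n = len(S)
    A = [0] * n
    for j in range(n):
        total = 0
        for i in range(j + 1):       # inner loop sums j + 1 elements
            total += S[i]
        A[j] = total / (j + 1)
    return A


if __name__ == "__main__":
    print(prefix_average1([20, 10, 3, 3, 14, 4]))
```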

slide-49
SLIDE 49

Prefix Averages 1: Analysis

  • The running time of the algorithm is $O(1 + 2 + 3 + \dots + n)$
  • The sum of the first n integers is $\frac{n(n + 1)}{2} = \frac{n^2 + n}{2} = \frac{1}{2}n^2 + \frac{1}{2}n$
  • prefix averages 1 runs in $O(n^2)$ time

Analysis of Algorithms | 49

index                          0    1    2    3    4    5
S                              20   10   3    3    14   4
sum over how many elements?    1    2    3    4    5    6

slide-50
SLIDE 50

Prefix Averages 2: Using sum()

  • Use a Python function to simplify the code
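
The slide's listing is not preserved here. One way this simplification could look, assuming the name `prefix_average2`, replaces the inner loop with Python's built-in `sum()` over a slice. The code is shorter, but `sum(S[0:j + 1])` still touches $j + 1$ elements, so the running time remains quadratic.

```python
def prefix_average2(S):
    """Return list A such that A[j] is the average of S[0], ..., S[j]."""
    n = len(S)
    A = [0] * n
    for j in range(n):
        # Building and summing the slice takes time proportional to j + 1.
        A[j] = sum(S[0:j + 1]) / (j + 1)
    return A


if __name__ == "__main__":
    print(prefix_average2([20, 10, 3, 3, 14, 4]))
```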

Analysis of Algorithms | 50

slide-52
SLIDE 52

Prefix Averages 3: Linear Time

  • The following algorithm computes prefix averages by keeping a running sum
  • This algorithm runs in $O(n)$ time
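
The listing itself is missing from the transcript. A sketch of the running-sum idea, assuming the name `prefix_average3`, performs a constant amount of work per element:

```python
def prefix_average3(S):
    """Return list A such that A[j] is the average of S[0], ..., S[j]."""
    n = len(S)
    A = [0] * n
    total = 0
    for j in range(n):
        total += S[j]              # update the running sum in constant time
        A[j] = total / (j + 1)
    return A


if __name__ == "__main__":
    print(prefix_average3([20, 10, 3, 3, 14, 4]))
```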

Analysis of Algorithms | 52

slide-53
SLIDE 53

Relatives of Big-Oh

  • big-Oh notation (O)
      • Provides an asymptotic way of saying that a function is “less than or equal to” another function
  • big-Omega notation (Ω)
      • Provides an asymptotic way of saying that a function grows at a rate that is “greater than or equal to” that of another.
  • big-Theta notation (Θ)
      • Allows us to say that two functions “grow at the same rate” up to constant factors

Analysis of Algorithms | 53

slide-55
SLIDE 55

Big-Omega (Ω)

  • Let $f(n)$ and $g(n)$ be functions mapping positive integers to positive real numbers
  • $f(n)$ is $\Omega(g(n))$ if $g(n)$ is $O(f(n))$, that is, there is a real constant $c > 0$ and an integer constant $n_0 \geq 1$ such that $f(n) \geq c\,g(n)$ for $n \geq n_0$
  • Example: $3n \log n - 2n$ is $\Omega(n \log n)$
      • $3n \log n - 2n = n \log n + 2n \log n - 2n = n \log n + 2n(\log n - 1) \geq n \log n$ for $n \geq 2$; hence $c = 1$ and $n_0 = 2$.

Analysis of Algorithms | 55

slide-57
SLIDE 57

Big-Theta (Θ)

  • $f(n)$ is $\Theta(g(n))$ if $f(n)$ is $O(g(n))$ and $f(n)$ is $\Omega(g(n))$, that is, there are real constants $c' > 0$ and $c'' > 0$ and an integer constant $n_0 \geq 1$ such that $c' g(n) \leq f(n) \leq c'' g(n)$, for $n \geq n_0$
  • Example: $3n \log n + 4n + 5 \log n$ is $\Theta(n \log n)$
      • $3n \log n \leq 3n \log n + 4n + 5 \log n \leq (3 + 4 + 5)\,n \log n$, for $n \geq 2$, hence $c' = 3$, $c'' = 12$, $n_0 = 2$.

Analysis of Algorithms | 57

slide-58
SLIDE 58

Intuition for Asymptotic Notation

  • big-Oh
      • $f(n)$ is $O(g(n))$ if $f(n)$ is asymptotically less than or equal to $g(n)$
  • big-Omega
      • $f(n)$ is $\Omega(g(n))$ if $f(n)$ is asymptotically greater than or equal to $g(n)$
  • big-Theta
      • $f(n)$ is $\Theta(g(n))$ if $f(n)$ is asymptotically equal to $g(n)$

Analysis of Algorithms | 58

slide-59
SLIDE 59

Beware of Large Constants

  • The function $f(n) = 10^{100} n$ is $O(n)$
  • If we were to compare it to $10 n \log n$, we should prefer the $O(n \log n)$-time algorithm, although the linear time algorithm is asymptotically faster
  • $10^{100}$ = one googol
  • If the asymptotic notations hide very large constants, they can be misleading

Analysis of Algorithms | 59

slide-60
SLIDE 60

Is It Efficient?

  • Any algorithm running in $O(n \log n)$ time (with a reasonable constant factor) should be considered efficient
  • An $O(n^2)$ algorithm may be fast in some contexts
  • An algorithm running in $O(2^n)$ time should never be considered efficient

Analysis of Algorithms | 60

slide-61
SLIDE 61

More Examples of Algorithm Analysis

  • len(data), data[j] - where data is an instance of Python’s list class - constant-time operations, both run in $O(1)$ time

Analysis of Algorithms | 61

slide-62
SLIDE 62

Three-Way Set Disjointness

  • Suppose three sequences of numbers, A, B and C;
  • no individual sequence contains duplicate values – but there may be some numbers that are in two or three of the sequences
  • Determine if the intersection of the three sequences is empty – namely, that there is no element $x$ such that $x \in A$, $x \in B$ and $x \in C$

Analysis of Algorithms | 62

slide-63
SLIDE 63

Three-Way Set Disjointness

Analysis of Algorithms | 63

slide-64
SLIDE 64

Three-Way Set Disjointness

Analysis of Algorithms | 64

  • Worst-case running time is $O(n^3)$, because it loops through each possible triple of values from the three sets to see if the values are equivalent
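
The listing for this slide is missing from the transcript. A sketch of the triple-loop approach, assuming the name `disjoint1`, makes the cubic bound visible: if each sequence has $n$ elements, the innermost comparison runs up to $n^3$ times.

```python
def disjoint1(A, B, C):
    """Return True if there is no element common to all three sequences."""
    for a in A:
        for b in B:
            for c in C:
                if a == b == c:
                    return False   # found a value present in A, B and C
    return True


if __name__ == "__main__":
    print(disjoint1([1, 2, 3], [3, 4, 5], [5, 6, 3]))
    print(disjoint1([1, 2], [2, 3], [3, 1]))
```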

slide-66
SLIDE 66

Three-Way Set Disjointness: Take 2

  • Observation: once inside the body of the loop over B, if selected elements $a$ and $b$ do not match each other, it does not make sense to iterate through the values of C looking for a matching triple
  • Worst-case running time is $O(n^2)$
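
A sketch of the improved version, assuming the name `disjoint2`, only enters the loop over C when the elements from A and B already match. Because no sequence contains duplicates, at most $n$ of the $n^2$ pairs can match, so the loop over C contributes at most $n \cdot n$ further steps.

```python
def disjoint2(A, B, C):
    """Return True if there is no element common to all three sequences."""
    for a in A:
        for b in B:
            if a == b:             # only check C when a and b match
                for c in C:
                    if a == c:     # here a == b == c
                        return False
    return True


if __name__ == "__main__":
    print(disjoint2([1, 2, 3], [3, 4, 5], [5, 6, 3]))
    print(disjoint2([1, 2], [2, 3], [3, 1]))
```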

Analysis of Algorithms | 66

slide-69
SLIDE 69

Element Uniqueness

  • Given a sequence $S$ with $n$ elements, are all elements distinct from each other?
  • $(n - 1) + (n - 2) + \dots + 2 + 1 = \frac{n(n - 1)}{2}$;
  • worst-case running time is $O(n^2)$

Analysis of Algorithms | 69

outer loop, j    0      1      2      …    n-2    n-1
inner loop, k    n-1    n-2    n-3    …    1      0
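
The listing for this slide is not preserved. A sketch of the pairwise comparison, assuming the name `unique1` and the loop variables j and k used above, compares each element only with the elements after it:

```python
def unique1(S):
    """Return True if there are no duplicate elements in sequence S."""
    n = len(S)
    for j in range(n):                 # outer loop over positions 0..n-1
        for k in range(j + 1, n):      # inner loop runs n - 1 - j times
            if S[j] == S[k]:
                return False           # found a duplicate pair
    return True


if __name__ == "__main__":
    print(unique1([4, 8, 15, 16, 23, 42]))
    print(unique1([4, 8, 15, 8]))
```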

slide-71
SLIDE 71

Element Uniqueness: Using Sorting

  • Idea: sort the sequence first; any duplicates are then guaranteed to be next to each other
  • Sorting: $O(n \log n)$ - details next week
  • Once the sequence is sorted, a single loop is needed to find duplicates – which runs in $O(n)$ time
  • Therefore the entire algorithm runs in $O(n \log n)$. Better?
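
A sketch of the sorting-based approach, assuming the name `unique2`, uses Python's built-in `sorted()` and then a single pass over adjacent pairs:

```python
def unique2(S):
    """Return True if there are no duplicate elements in sequence S."""
    temp = sorted(S)                   # sorted copy; S itself is unchanged
    for j in range(1, len(temp)):
        if temp[j - 1] == temp[j]:     # duplicates end up adjacent
            return False
    return True


if __name__ == "__main__":
    print(unique2([4, 8, 15, 16, 23, 42]))
    print(unique2([4, 8, 15, 8]))
```

Working on a sorted copy costs extra memory but leaves the caller's sequence untouched.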

Analysis of Algorithms | 71

slide-72
SLIDE 72

$O(n \log n)$ better than $O(n^2)$

Analysis of Algorithms | 72

slide-73
SLIDE 73

Binary Search (review from Java 2)

  • One of the most important computer algorithms
  • Locate a target value within a sorted sequence of $n$ elements
  • If the sequence is unsorted, the standard approach is to use a loop to examine each element – sequential search, linear time, $O(n)$
  • If the sequence is sorted and indexable, there is a much more efficient algorithm
  • Intuition: think of how you look up a word in a dictionary
      • Open at a certain page; if the word is on that page, stop
      • If the word should come earlier in lexicographic order, continue looking in the first half
      • Otherwise continue looking in the second half

Analysis of Algorithms | 73

slide-74
SLIDE 74

Binary Search
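
The listing for this slide is missing from the transcript. A sketch of a recursive binary search, assuming the name `binary_search` and the variable names low, high and mid that the analysis uses, could look like this:

```python
def binary_search(data, target, low, high):
    """Return True if target occurs in the sorted slice data[low..high]."""
    if low > high:
        return False                   # no candidates left
    mid = (low + high) // 2
    if target == data[mid]:
        return True
    elif target < data[mid]:
        # Continue in the portion to the left of mid.
        return binary_search(data, target, low, mid - 1)
    else:
        # Continue in the portion to the right of mid.
        return binary_search(data, target, mid + 1, high)


if __name__ == "__main__":
    data = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
    print(binary_search(data, 22, 0, len(data) - 1))
    print(binary_search(data, 23, 0, len(data) - 1))
```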

Analysis of Algorithms | 74

slide-75
SLIDE 75

Binary Search: Analysis

  • Proposition: The binary search algorithm runs in $O(\log n)$ time for a sorted sequence with $n$ elements.
  • Justification
      • With each recursive call the number of candidate entries still to be searched is given by the value $high - low + 1$
      • The number of remaining candidates is reduced by at least one half with each recursive call
      • Initially, $low = 0$, $high = n - 1$, $mid = \lfloor (low + high)/2 \rfloor$
      • The number of candidates to be searched at the next recursive call is either

$$(mid - 1) - low + 1 = \left\lfloor \frac{low + high}{2} \right\rfloor - low \leq \frac{high - low + 1}{2}$$

        or

$$high - (mid + 1) + 1 = high - \left\lfloor \frac{low + high}{2} \right\rfloor \leq \frac{high - low + 1}{2}$$

Analysis of Algorithms | 75

slide-76
SLIDE 76

Binary Search: Analysis (cont’d)

  • The initial number of candidates is $n$;
  • After the 1st call in a binary search, it is at most $\frac{n}{2} = \frac{n}{2^1}$
  • After the 2nd call, it is at most $\frac{n}{4} = \frac{n}{2^2}$
  • In general, after the jth call, it is at most $\frac{n}{2^j}$
  • In the worst case (target not found), binary search stops when there are no more candidate entries
  • The maximum number of recursive calls is the smallest integer $r$ such that $\frac{n}{2^r} < 1$, therefore $r > \log_2 n$
  • Thus $r = \lfloor \log_2 n \rfloor + 1$, so binary search runs in $O(\log_2 n)$ time.

Analysis of Algorithms | 76

slide-77
SLIDE 77

Binary Search: Analysis (cont’d)

  • $O(\log n)$ binary search - much better than $O(n)$ sequential search
  • Consider $n$ = 1,000,000,000
  • $\log_2 n \approx 29.897$, so about 30 steps suffice where sequential search may need a billion

Analysis of Algorithms | 77

slide-78
SLIDE 78

Math You May Need to Review

  • Summations
  • Logarithms and Exponents
  • See Appendix B.
  • Extra resource:
  • https://www.khanacademy.org/math/algebra2/x2ec2f6f830c9fb89:logs

Analysis of Algorithms | 78

  • Properties of logarithms
      • $\log_b (xy) = \log_b x + \log_b y$
      • $\log_b \frac{x}{y} = \log_b x - \log_b y$
      • $\log_b x^a = a \log_b x$
      • $\log_b a = \frac{\log_x a}{\log_x b}$
  • Properties of exponentials
      • $a^{b + c} = a^b a^c$
      • $a^{bc} = (a^b)^c$
      • $\frac{a^b}{a^c} = a^{b - c}$
      • $b^{\log_c a} = a^{\log_c b}$
slide-79
SLIDE 79

Thank you.