
CPSC 320: Intermediate Algorithm Design and Analysis July 28, 2014



  1. CPSC 320: Intermediate Algorithm Design and Analysis July 28, 2014

  2. Course Outline
     • Introduction and basic concepts
     • Asymptotic notation
     • Greedy algorithms
     • Graph theory
     • Amortized analysis
     • Recursion
     • Divide-and-conquer algorithms
     • Randomized algorithms
     • Dynamic programming algorithms
     • NP-completeness

  3. Dynamic Programming

  4. Dynamic Programming Components
     • Analyse the structure of an optimal solution
       • Separate one choice (usually the last) from a subproblem
       • Phrase the value of a choice as a function of the choice and the subproblem
       • Phrase an optimal solution as the value of the best choice
         • Usually a max/min result
     • Implement the calculation of the optimal value
       • Memoization: save optimal values as we compute them
       • Bottom-up: evaluate smaller problems and use them for bigger problems
       • Top-down: evaluate the big problem by calling smaller problems recursively and saving the result
     • Keep a record of the choice made at each level
     • Rebuild the optimal solution from the optimal value result
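
The difference between the top-down and bottom-up strategies can be sketched in Python. This is only an illustration under stated assumptions: it uses the value of a 0/1 knapsack as the example problem, and the function names `best_value_top_down` and `best_value_bottom_up` are made up for this sketch.

```python
from functools import lru_cache


def best_value_top_down(weights, values, limit):
    """Top-down: recurse from the big problem and memoize each result."""

    @lru_cache(maxsize=None)
    def best(i, m):
        # Best value using only the first i items with capacity m.
        if i == 0:
            return 0
        skip = best(i - 1, m)
        if weights[i - 1] > m:
            return skip
        take = best(i - 1, m - weights[i - 1]) + values[i - 1]
        return max(skip, take)

    return best(len(weights), limit)


def best_value_bottom_up(weights, values, limit):
    """Bottom-up: fill the table from the smallest subproblems upwards."""
    n = len(weights)
    table = [[0] * (limit + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for m in range(limit + 1):
            table[i][m] = table[i - 1][m]  # choice: skip item i
            if weights[i - 1] <= m:        # choice: take item i
                take = table[i - 1][m - weights[i - 1]] + values[i - 1]
                table[i][m] = max(table[i][m], take)
    return table[n][limit]
```

Both functions evaluate the same recurrence. The top-down version only visits the subproblems it actually needs, while the bottom-up version avoids recursion and fills every table cell.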

  5. Knapsack Problem
     Algorithm Knapsack(w, p, M)    -- w is array of weights, p is array of values, M is limit
       r[0, m] ← 0, l[0, m] ← false for m = 0, 1, 2, …, M
       For i ← 1 To |w| Do
         For m ← 0 To M Do
           If w[i] > m Or r[i − 1, m] > r[i − 1, m − w[i]] + p[i] Then
             r[i, m] ← r[i − 1, m], l[i, m] ← l[i − 1, m]
           Else
             r[i, m] ← r[i − 1, m − w[i]] + p[i], l[i, m] ← i
       s ← ∅, x ← |w|
       While M > 0 And l[x, M] is not false Do
         x ← l[x, M], s ← s ∪ {x}, M ← M − w[x], x ← x − 1
       Return s
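
A minimal Python sketch of the knapsack algorithm on this slide, assuming positive integer weights and a non-negative integer limit. The names `best` (value table) and `last_taken` (record of the last item chosen) are illustrative. Items are reported with 1-based indices to match the pseudocode.

```python
def knapsack(weights, values, limit):
    """Return (set of chosen 1-based item indices, best total value)."""
    n = len(weights)
    # best[i][m]: best value using the first i items with capacity m.
    best = [[0] * (limit + 1) for _ in range(n + 1)]
    # last_taken[i][m]: index of the last item taken in that solution, or None.
    last_taken = [[None] * (limit + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        w, p = weights[i - 1], values[i - 1]
        for m in range(limit + 1):
            if w > m or best[i - 1][m] > best[i - 1][m - w] + p:
                # Item i does not fit, or skipping it is strictly better.
                best[i][m] = best[i - 1][m]
                last_taken[i][m] = last_taken[i - 1][m]
            else:
                best[i][m] = best[i - 1][m - w] + p
                last_taken[i][m] = i

    # Rebuild the chosen set by following the recorded choices backwards.
    chosen = set()
    i, m = n, limit
    while m > 0 and last_taken[i][m] is not None:
        i = last_taken[i][m]
        chosen.add(i)
        m -= weights[i - 1]
        i -= 1
    return chosen, best[n][limit]
```

For example, `knapsack([1, 3, 4, 5], [1, 4, 5, 7], 7)` selects items 2 and 3 (weights 3 and 4) for a total value of 9.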

  6. Knapsack Algorithm - Complexity
     • What is the time complexity of the knapsack algorithm?
       • O(nW) (number of items n times the weight limit W)
     • This algorithm is called pseudo-polynomial
       • Time complexity is based on the value of the input, not just the size
     • There is no known polynomial algorithm to solve the knapsack problem
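
To see why a bound proportional to the weight limit is not polynomial in the input size, it may help to rewrite it in terms of the number of bits needed to write the limit down. The symbol b is introduced here for illustration only, and the derivation assumes W ≥ 1.

```latex
b = \lfloor \log_2 W \rfloor + 1
\quad\Longrightarrow\quad
2^{b-1} \le W < 2^{b}
\quad\Longrightarrow\quad
n \cdot 2^{b-1} \le nW < n \cdot 2^{b}
```

The running time is therefore polynomial in the numeric value W but exponential in b, the length of its encoding.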

  7. Algorithm Strategies - Review
     • Dynamic programming algorithms:
       • Choice is made based on evaluation of all possible results
       • Time and space complexity are usually higher
     • Greedy algorithms:
       • Choice is made based on a locally optimal solution
       • Usually faster, but may not result in a globally optimal solution
     • Divide and conquer algorithms:
       • Choice of input division is made based on the assumption that merging the results of subproblems is optimal

  8. Global Sequence Alignment Problem
     • Problem: given two sequences, analyse how similar they are
       • Allow both gaps and mismatches
     • Application:
       • Finding suggestions for misspelled words (comparing strings)
       • Comparing files (diff)
       • Analyse if two pieces of DNA match
     • Example: “ocurrance” vs “occurrence”
       • There is a letter “c” missing (gap)
       • An “a” was used instead of an “e” (mismatch)
       • Mismatches may be seen as gaps in both sides
         • “oc-urra-nce” vs “occurr-ence”

  9. Formal Definition
     • We represent a gap with a hyphen “−”
     • A sequence alignment of (X, Y) is a pair (X′, Y′) of sequences, such that:
       • X′ minus the gaps is X, Y′ minus the gaps is Y
       • |X′| = |Y′| (the size is the same for both sides)
       • If X′_i = −, then Y′_i ≠ − (you can’t have gaps on both sides)
     • A parameter δ > 0 defines the gap penalty (penalty if one side has a gap)
     • A parameter α_pq defines the mismatch penalty of matching p and q (α_pp = 0)
     • The cost of a matching (X′, Y′) is $\sum_{i=1}^{|X'|} \mathrm{pen}(x'_i, y'_i)$, where pen(p, q) is δ if p or q is a gap and α_pq otherwise
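
The cost of a given alignment can be computed directly from this definition. The sketch below makes some assumptions: gaps are written as an ASCII hyphen, and the mismatch penalty is passed in as a function. The names `alignment_cost` and `unit_mismatch` are hypothetical.

```python
def alignment_cost(x_aligned, y_aligned, gap_penalty, mismatch_penalty):
    """Sum the penalties of an alignment, position by position."""
    if len(x_aligned) != len(y_aligned):
        raise ValueError("both sides of an alignment must have the same length")
    total = 0
    for p, q in zip(x_aligned, y_aligned):
        if p == "-" and q == "-":
            raise ValueError("a position cannot have gaps on both sides")
        if p == "-" or q == "-":
            total += gap_penalty             # one side has a gap
        else:
            total += mismatch_penalty(p, q)  # zero when p == q
    return total


def unit_mismatch(p, q):
    """Example penalty: 0 for equal characters, 1 for any mismatch."""
    return 0 if p == q else 1
```

With a gap penalty of 1 and `unit_mismatch`, the alignment “oc-urra-nce” vs “occurr-ence” costs 3 (three gaps), while “oc-urrance” vs “occurrence” costs 2 (one gap and one mismatch).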

  10. Finding the Best Alignment
     • What is the choice to be made?
       • Last character could be a gap on either side, or a potential mismatch
     • Assume F(i, j) is the penalty for the best alignment of x_1..x_i and y_1..y_j

       $$F(i, j) = \begin{cases} j \cdot \delta & i = 0 \\ i \cdot \delta & j = 0 \\ \min\{F(i-1, j-1) + \alpha_{x_i y_j},\; F(i-1, j) + \delta,\; F(i, j-1) + \delta\} & \text{otherwise} \end{cases}$$

  11. Algorithm (Needleman-Wunsch)
     Algorithm NeedlemanWunsch(X, Y, δ, α)
       For i ← 0 To |X| Do F[i, 0] ← i ⋅ δ, H[i, 0] ← “gap in Y”
       For j ← 1 To |Y| Do F[0, j] ← j ⋅ δ, H[0, j] ← “gap in X”
       For i ← 1 To |X| Do
         For j ← 1 To |Y| Do
           m ← F[i − 1, j − 1] + α[X[i], Y[j]]    -- matching cost
           g_x ← F[i, j − 1] + δ, g_y ← F[i − 1, j] + δ    -- gap penalty in X, Y
           If m ≤ g_x And m ≤ g_y Then F[i, j] ← m, H[i, j] ← “match”
           Else If g_x ≤ g_y Then F[i, j] ← g_x, H[i, j] ← “gap in X”
           Else F[i, j] ← g_y, H[i, j] ← “gap in Y”

  12. Algorithm (cont.)
       …
       X′ ← “”, Y′ ← “”
       i ← |X|, j ← |Y|
       While i > 0 Or j > 0 Do
         If H[i, j] = “match” Then
           X′ ← X[i] . X′, Y′ ← Y[j] . Y′
           i ← i − 1, j ← j − 1
         Else If H[i, j] = “gap in X” Then
           X′ ← − . X′, Y′ ← Y[j] . Y′
           j ← j − 1
         Else
           X′ ← X[i] . X′, Y′ ← − . Y′
           i ← i − 1
       Return X′, Y′, F[|X|, |Y|]
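
The table fill and the traceback from the last two slides can be combined into one runnable Python sketch. Some details are assumptions made for this sketch: the mismatch penalty is passed as a function, gaps are written as an ASCII hyphen, and the first row and column of the choice table are marked as gaps so that the traceback can reach the origin. The function name `align` is illustrative.

```python
def align(x, y, gap_penalty, mismatch_penalty):
    """Return (x_aligned, y_aligned, cost) for a minimum-cost global alignment."""
    m, n = len(x), len(y)
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    choice = [[None] * (n + 1) for _ in range(m + 1)]

    # Boundaries: aligning a prefix against the empty sequence needs only gaps.
    for i in range(1, m + 1):
        cost[i][0], choice[i][0] = i * gap_penalty, "gap in Y"
    for j in range(1, n + 1):
        cost[0][j], choice[0][j] = j * gap_penalty, "gap in X"

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            match = cost[i - 1][j - 1] + mismatch_penalty(x[i - 1], y[j - 1])
            gap_x = cost[i][j - 1] + gap_penalty
            gap_y = cost[i - 1][j] + gap_penalty
            if match <= gap_x and match <= gap_y:
                cost[i][j], choice[i][j] = match, "match"
            elif gap_x <= gap_y:
                cost[i][j], choice[i][j] = gap_x, "gap in X"
            else:
                cost[i][j], choice[i][j] = gap_y, "gap in Y"

    # Traceback: collect characters from the end, then reverse.
    xs, ys = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if choice[i][j] == "match":
            xs.append(x[i - 1]); ys.append(y[j - 1])
            i, j = i - 1, j - 1
        elif choice[i][j] == "gap in X":
            xs.append("-"); ys.append(y[j - 1])
            j -= 1
        else:
            xs.append(x[i - 1]); ys.append("-")
            i -= 1
    return "".join(reversed(xs)), "".join(reversed(ys)), cost[m][n]
```

With a gap penalty of 1 and a mismatch penalty of 1 for unequal characters, aligning “ocurrance” with “occurrence” gives a total cost of 2. Both the table and the running time are proportional to |X| ⋅ |Y|.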

  13. Longest Common Subsequence
     • Subsequence: any sequence of items that is contained in the original sequence in the same order (but not necessarily consecutively)
       • Example: B, C, D, B is a subsequence of A, B, C, B, D, A, B (take the elements at positions 2, 3, 5 and 7)
     • Problem: Given two sequences X and Y, find the longest common subsequence of X and Y
     • Application:
       • Find common DNA sequences in different organisms
       • Video compression (inter-frame comparison)
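
The definition of a subsequence can be checked with a single left-to-right scan. A small Python sketch follows; the function name `is_subsequence` is illustrative.

```python
def is_subsequence(candidate, sequence):
    """True if candidate's items appear in sequence in the same order."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to and including the
    # match, so later items of candidate must be found further to the right.
    return all(item in remaining for item in candidate)
```

For instance, `is_subsequence("BCDB", "ABCBDAB")` holds, while `is_subsequence("BDCB", "ABCBDAB")` does not, because no C follows the D.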

  14. Characterizing the LCS
     • Define X_i as the sequence X limited to the first i elements
     • Given two sequences X = x_1, .., x_m and Y = y_1, .., y_n, let Z = z_1, .., z_k be a longest common subsequence (LCS) of X and Y
       • If x_m = y_n, then z_k = x_m = y_n, and Z_{k−1} is an LCS of X_{m−1} and Y_{n−1}
       • If x_m ≠ y_n, then Z is either an LCS of X_m and Y_{n−1}, or an LCS of X_{m−1} and Y_n
     • Define the length of the LCS of X_i and Y_j as:

       $$c[i, j] = \begin{cases} 0 & i = 0 \lor j = 0 \\ c[i-1, j-1] + 1 & i, j > 0 \land x_i = y_j \\ \max\{c[i, j-1],\; c[i-1, j]\} & \text{otherwise} \end{cases}$$

  15. Algorithm
     Algorithm LongestCommonSubsequence(X, Y)
       For i ← 0 To |X| Do c[i, 0] ← 0
       For j ← 1 To |Y| Do c[0, j] ← 0
       For i ← 1 To |X| Do
         For j ← 1 To |Y| Do
           If X[i] = Y[j] Then c[i, j] ← c[i − 1, j − 1] + 1, g[i, j] ← “+”
           Else If c[i − 1, j] > c[i, j − 1] Then c[i, j] ← c[i − 1, j], g[i, j] ← “X”
           Else c[i, j] ← c[i, j − 1], g[i, j] ← “Y”
       PrintLCS(g, X, |X|, |Y|)
       Return c[|X|, |Y|]

  16. Algorithm (cont.)
     Algorithm PrintLCS(g, X, i, j)
       If i = 0 Or j = 0 Then Return
       If g[i, j] = “+” Then
         PrintLCS(g, X, i − 1, j − 1)
         Print X[i]
       Else If g[i, j] = “X” Then PrintLCS(g, X, i − 1, j)
       Else PrintLCS(g, X, i, j − 1)
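
A compact Python sketch of the LCS computation. Unlike the pseudocode, it does not store a separate table of choices; it re-derives each choice from the length table during an iterative traceback, which is a simplification made for this sketch. The function name `lcs` is illustrative.

```python
def lcs(x, y):
    """Return one longest common subsequence of x and y as a list."""
    m, n = len(x), len(y)
    # length[i][j]: LCS length of the first i items of x and first j of y.
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                length[i][j] = max(length[i - 1][j], length[i][j - 1])

    # Walk back from the bottom-right corner, collecting matched items.
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            result.append(x[i - 1])
            i, j = i - 1, j - 1
        elif length[i - 1][j] > length[i][j - 1]:
            i -= 1
        else:
            j -= 1
    result.reverse()
    return result
```

Filling the table takes time proportional to |X| ⋅ |Y|. The traceback adds at most |X| + |Y| steps.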

  17. NP Complexity

  18. Time Complexity for Decision Problems
     • From this point on we analyse time complexity for problems, not algorithms
       • We want to know what is the best possible complexity for the problem
     • Our focus now is on decision problems, not optimization problems
       • Decision problems: Yes/No answer
       • Optimization: “find best”, “find maximum”, “find minimum”
     • We also need to distinguish “finding” and “checking” a solution

  19. Time Complexity - Classes
     • A problem is solvable in polynomial time if there is an algorithm that solves it and runs in O(n^k), where k is a constant (k ∈ Θ(1)) and n is the size of the input representation
       • Example: sort (O(n log n) ⊂ O(n^2)), select (O(n)), longest common subsequence (O(n^2)), matrix multiplication (O(n^3) or better)
     • P: set of all decision problems that are solvable in polynomial time
     • NP (non-deterministic polynomial time): set of all decision problems for which a given certificate of a “yes” answer can be checked in polynomial time

  20. Example: Hamiltonian Path
     • Problem: given a graph, is there a path that goes through every node exactly once?
       • Decision problem: answer is yes or no
       • Optimization problem: find a path with minimum cost, etc.; not required
     • Is this problem in NP?
       • Given a path, can we verify that the path is correct in polynomial time?
     • Is this problem in P?
       • Can we solve it in polynomial time?
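
Checking a proposed path is the easy direction, and a short Python sketch can show it. The graph representation is an assumption made here: a dictionary mapping each node to the set of its neighbours. The function name `is_hamiltonian_path` is illustrative.

```python
def is_hamiltonian_path(graph, path):
    """Check a certificate: does path visit every node of graph exactly once?"""
    # Same length plus same node set means every node appears exactly once.
    if len(path) != len(graph) or set(path) != set(graph):
        return False
    # Every pair of consecutive nodes must be joined by an edge.
    return all(v in graph[u] for u, v in zip(path, path[1:]))
```

The check does an amount of work proportional to the number of nodes (with constant-time set lookups), so verifying a certificate takes polynomial time. That says nothing about how hard it is to find such a path in the first place.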
