
CS3000: Algorithms & Data • Jonathan Ullman • Lecture 8: Dynamic Programming: RNA Folding, Practice



  1. CS3000: Algorithms & Data Jonathan Ullman Lecture 8: Dynamic Programming: RNA Folding, Practice • Feb 3, 2020

  2. RNA Folding

  3. DNA • DNA is a string of four bases {A,C,G,T} • Two complementary strands of DNA stick together and form a double helix • A—T and C—G are complementary pairs

  4. RNA Folding • RNA is a string of four bases {A,C,G,U} • A single RNA strand sticks to itself and folds into complex structures • A—U and C—G are complementary pairs

  5. RNA Folding • RNA strand will try to minimize energy (form the most bonds) subject to constraints

  6. RNA Folding • RNA is a string of bases $b_1, \dots, b_n \in \{A, C, G, U\}$ • The structure is given by a set of bonds $S$ consisting of pairs $(i, j)$ with $i < j$ • (Complements) Only $A$-$U$ or $C$-$G$ can be paired • (Matching) No base $b_i$ is in two pairs in $S$ • (No Sharp Turns) If $(i, j) \in S$, then $i < j - 4$ • (Non-Crossing) If $(i, j), (k, \ell) \in S$, then it cannot be the case that $i < k < j < \ell$

  7. RNA Folding • Input: RNA sequence $b_1, \dots, b_n \in \{A, C, G, U\}$ • Output: A set of pairs $S \subseteq \{1, \dots, n\} \times \{1, \dots, n\}$ • Goal: maximize the size of $S$ • (Complements) Only $A$-$U$ or $C$-$G$ can be paired • (Matching) No base $b_i$ is in two pairs in $S$ • (No Sharp Turns) If $(i, j) \in S$, then $i < j - 4$ • (Non-Crossing) If $(i, j), (k, \ell) \in S$, then it cannot be the case that $i < k < j < \ell$

  8. Dynamic Programming • Let $O$ be the optimal set of pairs for $b_1 \cdots b_n$ • Case 1: $O$ does not include any pair involving $n$ • Case 2: $O$ pairs $n$ with some $t < n - 4$

  9. Dynamic Programming • Let $O_{i,j}$ be the optimal set of pairs for $b_i \cdots b_j$ • Case 1: $O_{i,j}$ does not include any pair involving $j$ • Case 2: $O_{i,j}$ pairs $j$ with some $t < j - 4$

  10. Dynamic Programming • Let $\mathrm{OPT}(i, j)$ be the optimal number of pairs for $b_i \cdots b_j$ • Case 1: $j$ pairs with nothing • Case 2: $j$ pairs with $t < j - 4$

  11. Dynamic Programming • Let $\mathrm{OPT}(i, j)$ be the optimal number of pairs for $b_i \cdots b_j$ • Case 1: $j$ pairs with nothing • Case 2: $j$ pairs with $t < j - 4$ • Recurrence: $\mathrm{OPT}(i, j) = \max\{\mathrm{OPT}(i, j-1),\ \max_t \{1 + \mathrm{OPT}(i, t-1) + \mathrm{OPT}(t+1, j-1)\}\}$ • The inner maximum is over all $t$ such that $i \le t < j - 4$ and $b_t, b_j$ are compatible bases • Base Cases: $\mathrm{OPT}(i, j) = 0$ if $i \ge j - 4$
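The recurrence on slide 11 can be sketched in Python as follows. This is a minimal illustration rather than the lecture's own code: the function names `rna_opt_table` and `max_pairs` are hypothetical, the table uses 1-based indices to match the slides, and subproblems are filled in order of increasing interval length so that every value on the right-hand side is already available.

```python
def rna_opt_table(bases):
    """Fill OPT(i, j) for all intervals of the RNA string `bases` (1-based indices)."""
    n = len(bases)
    compatible = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    # opt[i][j] holds OPT(i, j). One row/column of padding on each side keeps
    # OPT(i, t-1) with t = i and OPT(t+1, j-1) inside the table; those entries
    # stay 0, which matches the base case OPT(i, j) = 0 when i >= j - 4.
    opt = [[0] * (n + 2) for _ in range(n + 2)]
    for length in range(6, n + 1):          # shortest interval with i < j - 4
        for i in range(1, n - length + 2):
            j = i + length - 1
            best = opt[i][j - 1]            # Case 1: j pairs with nothing
            for t in range(i, j - 4):       # Case 2: i <= t < j - 4
                if (bases[t - 1], bases[j - 1]) in compatible:
                    best = max(best, 1 + opt[i][t - 1] + opt[t + 1][j - 1])
            opt[i][j] = best
    return opt


def max_pairs(bases):
    """Return OPT(1, n), the maximum number of bonds in a valid folding."""
    n = len(bases)
    return rna_opt_table(bases)[1][n] if n else 0


print(max_pairs("ACCGGUAGU"))  # prints 2
```

There are O(n^2) intervals and each takes O(n) work for the inner maximum, which gives cubic time and quadratic space.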

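  12. Filling the Table • Sequence: ACCGGUAGU • Recurrence: $\mathrm{OPT}(i, j) = \max\{\mathrm{OPT}(i, j-1),\ \max_{i \le t < j-4} \{1 + \mathrm{OPT}(i, t-1) + \mathrm{OPT}(t+1, j-1)\}\}$ • Table with only the base cases filled in:
              j = 6   7   8   9
        i = 4     0   0   0
        i = 3     0   0
        i = 2     0
        i = 1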

  13. RNA Folding Summary • Compute the optimal RNA folding in time $O(n^3)$ and space $O(n^2)$ • Dynamic Programming: • Decide on an optimal pair $b_t$-$b_n$ • Remaining RNA is two non-overlapping pieces • Adding variables: one subproblem for each interval • Non-crossing is critical • Think about how the dynamic programming algorithm changes if we remove each of the conditions

  14. Dynamic Programming Practice

  15. Midterm I Review

  16. Midterm I Topics • Fundamentals: • Induction • Asymptotics • Recurrences • Stable Matching • Divide and Conquer • Dynamic Programming

  17. Topics: Induction • Proof by Induction: • Mathematical formulas, e.g. $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ • Spot the bug • Solutions to recurrences • Correctness of divide-and-conquer algorithms • Good way to study: • Lehman-Leighton-Meyer, Mathematics for CS • Review divide-and-conquer in Kleinberg-Tardos

  18. Practice Question: Induction • Suppose you have an unlimited supply of 3 and 7 cent coins. Prove by induction that you can make any amount $n \ge 12$.
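One possible proof uses strong induction with base cases 12, 13, and 14 and the step "to make n, make n - 3 and add a 3 cent coin". The sketch below mirrors that argument as a recursive function. The name `coins_for` is hypothetical, and this is only one of several valid proof structures.

```python
def coins_for(n):
    """Return (threes, sevens) with 3 * threes + 7 * sevens == n, for n >= 12."""
    if n < 12:
        raise ValueError("the claim only covers amounts n >= 12")
    # Base cases: 12 = 3+3+3+3, 13 = 3+3+7, 14 = 7+7.
    base = {12: (4, 0), 13: (2, 1), 14: (0, 2)}
    if n in base:
        return base[n]
    # Inductive step: n - 3 >= 12, so it can be made; add one 3 cent coin.
    threes, sevens = coins_for(n - 3)
    return threes + 1, sevens


print(coins_for(20))  # prints (2, 2)
```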

  19. Topics: Asymptotics • Asymptotic Notation • $o, O, \omega, \Omega, \Theta$ • Relationships between common function types • Good way to study: • Kleinberg-Tardos Chapter 2

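  20. Topics: Asymptotics • Columns: Notation … means … Think … E.g. • $f(n) = O(g(n))$ means $\exists c > 0, n_0 > 0, \forall n \ge n_0: 0 \le f(n) \le c\,g(n)$; think “at most” (“≤”); e.g. $100n^2 = O(n^3)$ • $f(n) = \Omega(g(n))$ means $\exists c > 0, n_0 > 0, \forall n \ge n_0: 0 \le c\,g(n) \le f(n)$; think “at least” (“≥”); e.g. $2^n = \Omega(n^{100})$ • $f(n) = \Theta(g(n))$ means $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$; think “equals” (“=”); e.g. $\log(n!) = \Theta(n \log n)$ • $f(n) = o(g(n))$ means $\forall c > 0, \exists n_0 > 0, \forall n \ge n_0: 0 \le f(n) < c\,g(n)$; think “less than” (“<”); e.g. $n^2 = o(2^n)$ • $f(n) = \omega(g(n))$ means $\forall c > 0, \exists n_0 > 0, \forall n \ge n_0: 0 \le c\,g(n) < f(n)$; think “greater than” (“>”); e.g. $n^2 = \omega(\log n)$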

  21. Topics: Asymptotics • Constant factors can be ignored • $\forall C > 0$: $Cn = O(n)$ • Smaller exponents are Big-Oh of larger exponents • $\forall a > b$: $n^b = O(n^a)$ • Any logarithm is Big-Oh of any polynomial • $\forall a, \varepsilon > 0$: $\log_2^a n = O(n^\varepsilon)$ • Any polynomial is Big-Oh of any exponential • $\forall a > 0, b > 1$: $n^a = O(b^n)$ • Lower order terms can be dropped • $n^2 + n^{3/2} + n = O(n^2)$

  22. Practice Question: Asymptotics • Put these functions in order so that $f_i = O(f_{i+1})$ • 𝑜 PQt u v • $8^{\log_2 n}$ • $2^{3 \log_2 \log_2 n}$ • 2 PQt u @ u @ • $\sum_{i=1}^{n} i$ • 𝑜 W log W 𝑜

  23. Practice Question: Asymptotics • Suppose $f_1 = O(g)$ and $f_2 = O(g)$. Prove that $f_1 + f_2 = O(g)$.

  24. Topics: Recurrences • Recurrences • Representing running time by a recurrence • Solving common recurrences • Master Theorem • Good way to study: • Erickson book • Kleinberg-Tardos divide-and-conquer chapter

  25. Practice Question: Recurrences
        F(n):
          For i = 1, …, n^2: Print “Hi”
          For i = 1, …, 3: F(n/3)
      • Write a recurrence for the running time of this algorithm. • Write the asymptotic running time given by the recurrence.
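One way to check an answer to this question is to count the prints directly. The sketch below assumes that n/3 means integer division and that F does nothing once n drops below 1; neither detail is stated on the slide, and `count_prints` is a hypothetical name. Under those assumptions the count satisfies T(n) = 3T(n/3) + n^2, and the top-level n^2 term dominates, so the count stays between n^2 and 1.5 n^2.

```python
def count_prints(n):
    """Number of times F(n) prints "Hi", i.e. T(n) = n^2 + 3 * T(n // 3)."""
    if n < 1:                 # assumed base case: nothing left to do
        return 0
    return n * n + 3 * count_prints(n // 3)


for k in range(1, 6):
    n = 3 ** k
    # The ratio T(n) / n^2 approaches 1.5, consistent with Theta(n^2).
    print(n, count_prints(n), count_prints(n) / (n * n))
```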

  26. Topics: Recurrences • Consider the recurrence $T(n) = \sqrt{n} \cdot T(\sqrt{n}) + n$ with $T(1) = 1$. Solve using a recursion tree.

  27. Topics: Divide-and-Conquer • Divide-and-Conquer • Writing pseudocode • Proving correctness by induction • Analyzing running time via recurrences • Examples we’ve studied: • Mergesort, Binary Search, Karatsuba’s, Selection • Good way to study: • Example problems from Kleinberg-Tardos or Erickson • Practice, practice, practice!

  28. Topics: Dynamic Programming • Dynamic Programming • Identify sub-problems • Write a recurrence, $\mathrm{OPT}(n) = \max\{v_n + \mathrm{OPT}(n - 6),\ \mathrm{OPT}(n - 1)\}$ • Fill the dynamic programming table • Find the optimal solution • Analyze running time • Good way to study: • Example problems from Kleinberg-Tardos or Erickson • Practice, practice, practice!

  29. Practice Question • Design an $O(n)$-time algorithm that takes an array $A[1:n]$ and returns a sorted array containing the smallest $\sqrt{n}$ elements of $A$

  30. Practice Question • Consider the following sorting algorithm, where A[1:n] is a global array:
        SillySort(1,n):
          if (n <= 2): put A in order
          else:
            SillySort(1,2n/3)
            SillySort(n/3,n)
            SillySort(1,2n/3)
      • Prove that it is correct • Analyze its running time
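A runnable sketch of SillySort can help when experimenting with the correctness argument. The slide leaves the rounding of 2n/3 and n/3 unspecified; the version below assumes each recursive call covers n - floor(n/3) elements, so the first and last pieces overlap by at least a third. The name `silly_sort` and the 0-based half-open ranges are choices made for this illustration.

```python
def silly_sort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place with three overlapping recursive calls."""
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n <= 1:
        return
    if n == 2:                       # "put A in order"
        if a[lo] > a[lo + 1]:
            a[lo], a[lo + 1] = a[lo + 1], a[lo]
        return
    k = n - n // 3                   # assumed rounding: about 2n/3, rounded up
    silly_sort(a, lo, lo + k)        # first two thirds
    silly_sort(a, hi - k, hi)        # last two thirds
    silly_sort(a, lo, lo + k)        # first two thirds again


data = [5, 2, 9, 1, 5, 6, 0, 3]
silly_sort(data)
print(data)  # prints [0, 1, 2, 3, 5, 5, 6, 9]
```

Each call makes three recursive calls on pieces of roughly two thirds the size plus constant extra work, which is the recurrence to analyze for the running time.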

  31. Dynamic Programming Practice

  32. Chocolate Bar Splitting • Input: A chocolate bar with $n \times m$ pieces • Output: The minimum number of cuts needed to divide the block into perfect squares
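A possible dynamic program for this problem has one subproblem per bar size. The sketch below assumes that each cut is a straight line across a single piece, splitting it into two smaller rectangles; the slide does not spell out the cutting rule, and `min_cuts` is a hypothetical name.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_cuts(n, m):
    """Minimum number of straight cuts that split an n x m bar into squares."""
    if n == m:
        return 0                                  # already a square
    best = float("inf")
    # Cut across the first dimension; by symmetry only half the positions matter.
    for a in range(1, n // 2 + 1):
        best = min(best, 1 + min_cuts(a, m) + min_cuts(n - a, m))
    # Cut across the second dimension.
    for b in range(1, m // 2 + 1):
        best = min(best, 1 + min_cuts(n, b) + min_cuts(n, m - b))
    return best


print(min_cuts(2, 3))  # prints 2
print(min_cuts(3, 5))  # prints 3
```

With memoization there are n·m subproblems and each tries O(n + m) cut positions.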

  33. Chocolate Bar Splitting

  34. Vankin’s Mile • Input: An $n \times n$ board of numbers • Rules: • Place a chip on the board • Keep moving the chip down or right until you fall off • Score = sum of the numbers your chip visited • Output: The best possible strategy
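One way to set up the dynamic program is to let best(i, j) be the highest score achievable when the chip is currently on square (i, j): the value of that square plus the better of moving down or right, where falling off the board is worth 0. The sketch below computes only the best score, not the moves themselves; `best_score` is a hypothetical name and the board is assumed to be non-empty.

```python
def best_score(board):
    """Best total over all starting squares on a non-empty n x n board."""
    n = len(board)
    # best[i][j] = best score for a chip standing on (i, j).
    # The extra row and column of zeros represent falling off the board.
    best = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            best[i][j] = board[i][j] + max(best[i + 1][j], best[i][j + 1])
    # The chip may start anywhere, so take the best starting square.
    return max(best[i][j] for i in range(n) for j in range(n))


print(best_score([[-1, 7], [-8, 10]]))  # prints 17
```

Recording which of the two moves achieved each maximum would recover the strategy itself. The table has n^2 entries and each takes constant time.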
