CSC373 Algorithm Design, Analysis & Complexity
373F20 - Nisarg Shah 1
Introduction
Instructor: Nisarg Shah (cs.toronto.edu/~nisarg, nisarg@cs, SF 2301C), LEC 0101 and 0102
TAs: Too
➢ Nisarg Shah
➢ First online version of the course, so expect a bumpy ride at the start, but
hopefully, we’ll get through together
➢ Use any of the feedback mediums (email, Piazza, …) to let me know if you have
any suggestions for improvement
Totally useless this semester!
➢ All the information below is in the course information sheet, available on
Piazza
➢ Link will be distributed after about a week or two
➢ LaTeX preferred, scans are OK!
➢ Delivered live
➢ 10 minute break after every 50 minutes of lecture
➢ Students can ask questions using Zoom’s chat feature
➢ One TA will be present to continuously answer questions
➢ I might also answer questions once in a while
➢ Delivered live by TAs
➢ Problem sets will be posted early on the course webpage
➢ Please try them before coming to the tutorials
➢ TAs will explain the problems, allow you to discuss them in breakout rooms, and then go over key parts of the solutions
➢ Solutions will be posted later on the course webpage
➢ Each section is divided into three parts (A, B, C)
➢ Students divided by birth month: A = Jan-Apr, B = May-Aug, C = Sep-Dec
➢ Feel free to attend a different tutorial than the one you’re assigned
➢ If the attendance is low, the number of tutorials per section may be reduced
➢ Do you have conflicts with these slots? Poll!
➢ I will conduct them
➢ Use the “raise hand” feature
➢ I will call upon the raised hands in order
➢ When called upon, unmute and ask the question
➢ Always phrase your question in a way that doesn’t give away your solutions or approach to an assignment problem
➢ Need to be able to attend live!
➢ I’m considering using part of the Tue 4-5pm lecture slot to give you more time
➢ Open book, closed internet
➢ You may be asked to join a Zoom link and keep your video on
➢ If you have a question, you can “raise hand”, and I or a TA can take you to a breakout room to answer your question
➢ Upload scanned answer sheet at the end (we’ll do a mock run of this)
➢ In groups of up to three students
➢ Best way to learn is for each member to try each problem
➢ May need to mull them over for several days; do not expect to start and finish the assignment on the same day!
➢ May include bonus questions
➢ May need to compress the PDF
* 10% = 30%
* 20% = 40%
* 30% = 30%
➢ Assignment 1:
➢ Assignment 2:
➢ Assignment 3:
➢ Assignment 4:
➢ Midterm 1:
➢ Midterm 2:
➢ The tests are during the tutorial slot, so there should ideally be no conflict
➢ That said, if you think you’ll have a conflict, let me know as early as possible
➢ [CLRS] Cormen, Leiserson, Rivest, Stein: Introduction to Algorithms.
➢ [DPV] Dasgupta, Papadimitriou, Vazirani: Algorithms.
➢ [KT] Kleinberg, Tardos: Algorithm Design.
➢ Free to discuss with classmates or read online material
➢ Must write solutions in your own words
➢ For each question, must cite the peer (write the name) or the online sources (provide links), if you obtained a significant insight directly pertinent to the question
➢ Failing to do this is plagiarism!
➢ Borrowed from: Prof. Allan Borodin (citation!)
not know how to approach this question”)
➢ 4 total late days across all 4 assignments
➢ Managed by MarkUs
➢ At most 2 late days can be applied to a single assignment
➢ Already covers legitimate reasons such as illness, university activities, etc.
➢ Polls (already tried)
➢ Chat
➢ Reactions
➢ Raise hand
➢ Yes/No
➢ Breakout rooms
Muhammad ibn Musa al-Khwarizmi
➢ Ubiquitous in the real world
➢ Important to be able to design and analyze algorithms
➢ For some problems, good algorithms are hard to find
➢ Algorithms in specialized environments or using advanced techniques
➢ Other concerns with algorithms
➢ …mostly beyond the scope of this course
➢ Divide and Conquer
➢ Greedy
➢ Dynamic programming
➢ Network flow
➢ Linear programming
➢ NP-completeness (not really an algorithm design paradigm)
➢ Approximation algorithms (if time permits)
➢ Randomized algorithms (if time permits)
➢ A very interesting question!
➢ Subject of much ongoing research…
➢ Proof of correctness
➢ Proof of running time
➢ Polynomial time
➢ It should use at most poly(n) steps on any n-bit input
➢ If the input to an algorithm is a number $x$, the number of bits of input is $\log x$
➢ How much is too much?
➢ Try to prove that the problem is hard
➢ Formally establish complexity results
➢ NP-completeness, NP-hardness, …
may suddenly become hard
➢ Minimum spanning tree (MST) vs bounded degree MST
➢ 2-colorability vs 3-colorability
➢ You’ll be working with abstract notations, proving correctness of algorithms,
analyzing the running time of algorithms, designing new algorithms, and proving complexity results.
➢ If you’re somewhat scared going into the course
➢ If you’re already comfortable with the proofs, and want challenging problems
➢ CSC473: Advanced Algorithms
➢ CSC438: Computability and Logic
➢ CSC463: Computational Complexity and Computability
➢ CSC304: Algorithmic Game Theory and Mechanism Design (self promotion!)
➢ CSC384: Introduction to Artificial Intelligence
➢ CSC436: Numerical Algorithms
➢ CSC418: Computer Graphics
➢ Mergesort - $O(n \log n)$
➢ Karatsuba algorithm for fast multiplication - $O(n^{\log_2 3})$ rather than $O(n^2)$
➢ Largest subsequence sum in $O(n)$
➢ …
➢ Maybe in CSC236/CSC240 and/or CSC263/CSC265
➢ Write “yes”/“no” in chat
➢ Break (a large chunk of) a problem into two smaller subproblems of the same
type
➢ Solve each subproblem recursively and independently
➢ At the end, quickly combine solutions from the two subproblems and/or solve any remaining part of the original problem
conquer…
➢ Useful for analyzing divide-and-conquer running time
➢ If you haven’t already seen it, please spend some time understanding it
Intuition: Compare $f(n)$ with $n^{\log_b a}$
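The statement of the theorem itself did not survive in this transcript. For reference, a commonly used textbook form (as in [CLRS]) is sketched below; the notation $a$, $b$, $f$, $d$, $\epsilon$, $c$ is an assumption and may differ from the version on the slide.

```latex
% Master theorem, textbook form (assumed notation).
% Recurrence: T(n) = a T(n/b) + f(n), with constants a >= 1 and b > 1.
% Let d = \log_b a.
\begin{align*}
\text{Case 1: } & f(n) = O(n^{d-\epsilon}) \text{ for some } \epsilon > 0
   &&\Rightarrow\ T(n) = \Theta(n^{d}) \\
\text{Case 2: } & f(n) = \Theta(n^{d})
   &&\Rightarrow\ T(n) = \Theta(n^{d} \log n) \\
\text{Case 3: } & f(n) = \Omega(n^{d+\epsilon}) \text{ for some } \epsilon > 0,
   \text{ and } a\,f(n/b) \le c\,f(n) \text{ for some } c < 1
   &&\Rightarrow\ T(n) = \Theta(f(n))
\end{align*}
```

For example, mergesort has $a = 2$, $b = 2$, $f(n) = \Theta(n)$, so $d = 1$ and Case 2 gives $\Theta(n \log n)$.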
➢ Given an array $a$ of length $n$, count the number of pairs $(i, j)$ such that $i < j$ but $a[i] > a[j]$
➢ Voting theory
➢ Collaborative filtering
➢ Measuring the “sortedness” of an array
➢ Sensitivity analysis of Google's ranking function
➢ Rank aggregation for meta-searching on the Web
➢ Nonparametric statistics (e.g., Kendall's tau distance)
➢ Count $(i, j)$ such that $i < j$ but $a[i] > a[j]$
➢ Check all $\Theta(n^2)$ pairs
➢ Divide: break array into two equal halves $x$ and $y$
➢ Conquer: count inversions in each half recursively
➢ Combine:
➢ Induction on $n$ is usually very helpful
➢ Allows you to assume correctness of subproblems
➢ Suppose $T(n)$ is the running time for inputs of size $n$
➢ Our algorithm satisfies $T(n) = 2T(n/2) + O(n)$
➢ Master theorem says this is $T(n) = O(n \log n)$
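The details of the combine step are not reproduced in this transcript. A minimal sketch of one standard way to get a linear-time combine, assuming we sort each half as a side effect and count cross-half inversions while merging (the name `count_inversions` is illustrative, not from the slides):

```python
def count_inversions(a):
    """Return (sorted copy of a, number of pairs i < j with a[i] > a[j])."""
    n = len(a)
    if n <= 1:
        return list(a), 0
    mid = n // 2
    # Conquer: count inversions inside each half (and get each half sorted).
    left, inv_left = count_inversions(a[:mid])
    right, inv_right = count_inversions(a[mid:])
    # Combine: merge the sorted halves, counting pairs that straddle them.
    merged = []
    cross = 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] is smaller than every remaining element of left,
            # and all of those appear before it in the original array.
            cross += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + cross
```

The merge does $O(n)$ work per call, which matches the recurrence $T(n) = 2T(n/2) + O(n)$.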
Let’s say $T(n) = 2T(n/2) + 2n$
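The worked expansion from the slide is not in this transcript. A sketch of unrolling this recurrence, assuming $n$ is a power of 2 and $T(1)$ is a constant (both are assumptions made here for simplicity):

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + 2n \\
     &= 4\,T(n/4) + 2n + 2n \\
     &= 8\,T(n/8) + 2n + 2n + 2n \\
     &\;\;\vdots \\
     &= 2^{k}\,T(n/2^{k}) + 2kn \\
     &= n\,T(1) + 2n\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{align*}
```

Each of the $\log_2 n$ levels of recursion contributes $2n$ in total, which is where the $n \log n$ term comes from.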
➢ Given $n$ points of the form $(x_i, y_i)$ in the plane, find the closest pair of points.
➢ Basic primitive in graphics and computer vision
➢ Geographic information systems, molecular modeling, air traffic control
➢ Special case of nearest neighbor
➢ Sort and check!
➢ Find closest points by x coordinate
➢ Find closest points by y coordinate
➢ No two points have the same x or y coordinate
➢ Find closest points by x or y coordinate
➢ Doesn’t work!
[Figure: counterexample, with distances labeled $1 + \epsilon$, $1$, $1 + \epsilon$, $1$, $2$]
➢ Divide: points in equal halves by drawing a vertical line $L$
➢ Conquer: solve each half recursively
➢ Combine: find closest pair with one point on each side of $L$
➢ Return the best of 3 solutions
Seems like $\Omega(n^2)$
➢ We can restrict our attention to points within $\delta$ of $L$ on each side, where $\delta$ = best of the solutions in two halves
➢ Only need to look at points within $\delta$ of $L$ on each side
➢ Sort points on the strip by $y$ coordinate
➢ Only need to check each point with next 11 points in sorted list!
Wait, what? Why 11?
➢ If two points are at least 12 positions apart in the sorted list, their distance is at least $\delta$
➢ No two points lie in the same $\delta/2 \times \delta/2$ box
➢ Two points that are more than two rows apart are at distance at least $\delta$
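➢ Divide each integer into two parts
$x = x_1 \cdot 10^{n/2} + x_2$, $y = y_1 \cdot 10^{n/2} + y_2$
$xy = (x_1 y_1) \cdot 10^{n} + (x_1 y_2 + x_2 y_1) \cdot 10^{n/2} + (x_2 y_2)$
➢ Four $n/2$-digit multiplications can be replaced by three
$x_1 y_2 + x_2 y_1 = (x_1 + x_2)(y_1 + y_2) - x_1 y_1 - x_2 y_2$
➢ Running time
$T(n) = 3T(n/2) + O(n) \Rightarrow T(n) = O(n^{\log_2 3})$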
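A minimal sketch of the three-multiplication trick, assuming non-negative Python integers and base-10 splitting as on the slide (the function name `karatsuba` and the single-digit base case are choices made here for illustration):

```python
def karatsuba(x, y):
    """Multiply non-negative integers x and y using three recursive products."""
    if x < 10 or y < 10:
        return x * y  # base case: a single-digit factor
    n = max(len(str(x)), len(str(y)))
    half = n // 2
    base = 10 ** half
    x1, x2 = divmod(x, base)  # x = x1 * 10^half + x2
    y1, y2 = divmod(y, base)  # y = y1 * 10^half + y2
    high = karatsuba(x1, y1)
    low = karatsuba(x2, y2)
    # (x1 + x2)(y1 + y2) - x1*y1 - x2*y2 equals x1*y2 + x2*y1
    cross = karatsuba(x1 + x2, y1 + y2) - high - low
    return high * base * base + cross * base + low
```

Python's built-in integer multiplication is already fast, so this is only meant to make the recurrence $T(n) = 3T(n/2) + O(n)$ concrete.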
multiplying two $n \times n$ matrices
➢ Call $n$ the “size” of the problem
$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$
➢ Naively, this requires 8 multiplications of size $n/2$
➢ Strassen’s insight: replace 8 multiplications by 7
$T(n) = 7T(n/2) + O(n^2) \Rightarrow T(n) = O(n^{\log_2 7})$
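The seven products themselves are not reproduced in this transcript. The sketch below uses the commonly cited set of Strassen products; it assumes square matrices given as lists of lists whose side length is a power of 2, and the helper names (`mat_add`, `mat_sub`, `quarter`, `strassen`) are illustrative.

```python
def mat_add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def mat_sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def quarter(M):
    """Split M into its four h x h blocks: M11, M12, M21, M22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def strassen(A, B):
    """Multiply n x n matrices (n a power of 2) with 7 recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = quarter(A)
    B11, B12, B21, B22 = quarter(B)
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22))
    M2 = strassen(mat_add(A21, A22), B11)
    M3 = strassen(A11, mat_sub(B12, B22))
    M4 = strassen(A22, mat_sub(B21, B11))
    M5 = strassen(mat_add(A11, A12), B22)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12))
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22))
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)
    # Stitch the four blocks back into one matrix.
    top = [r + s for r, s in zip(C11, C12)]
    bottom = [r + s for r, s in zip(C21, C22)]
    return top + bottom
```

The additions and subtractions cost $O(n^2)$ per call, which is the non-recursive term in $T(n) = 7T(n/2) + O(n^2)$.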
➢ Given array $A$ of $n$ comparable elements, find $k$th smallest
➢ $k = 1$ is min, $k = n$ is max, $k = (n+1)/2$ is median
➢ $O(n)$ is easy for min/max
➢ $O(nk)$ by modifying bubble sort
➢ $O(n \log n)$ by sorting
➢ $O(n + k \log n)$ using min-heap
➢ $O(k + n \log k)$ using max-heap
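The two heap-based bounds can be illustrated with Python's `heapq` module. This is a sketch under stated assumptions: elements are numbers (so the max-heap can be simulated by negation, since `heapq` only provides a min-heap), $1 \le k \le n$, and the function names are made up here.

```python
import heapq

def kth_smallest_min_heap(a, k):
    """O(n + k log n): heapify all n elements, then pop the k - 1 smallest."""
    heap = list(a)
    heapq.heapify(heap)          # O(n)
    for _ in range(k - 1):
        heapq.heappop(heap)      # O(log n) each
    return heap[0]

def kth_smallest_max_heap(a, k):
    """O(k + n log k): keep the k smallest elements seen so far in a max-heap."""
    heap = [-x for x in a[:k]]   # negated values simulate a max-heap
    heapq.heapify(heap)          # O(k)
    for x in a[k:]:
        if x < -heap[0]:
            # x beats the largest of the current k smallest; swap it in.
            heapq.heapreplace(heap, -x)  # O(log k)
    return -heap[0]
```

The min-heap version is preferable when $k$ is small relative to $n$; the max-heap version uses only $O(k)$ extra space.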
➢ $A_{less}$ = elements $\le p$, $A_{more}$ = elements $> p$
➢ If $|A_{less}| \ge k$, return $k$th smallest in $A_{less}$, otherwise return $(k - |A_{less}|)$th smallest in $A_{more}$
➢ If pivot is close to the min or the max, then we basically get $T(n) \le T(n-1) + O(n)$, which only gives $T(n) = O(n^2)$
➢ Want to reduce $n - 1$ to a fraction of $n$ (like $n/2$, $5n/6$, etc.)
$n/5$ groups of 5 each
➢ Out of $n/5$ medians, $n/10$ are $> p^*$
$n/10$ of the $n/5$ medians are $\le p^*$
➢ For each such median, there are 3 elements $\le p^*$
➢ So there can be at most $7n/10$ elements that can be $> p^*$
$|A_{more}| \le 7n/10$
➢ Similarly, $|A_{less}| \le 7n/10$
➢ (These are rough calculations…)
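$n/5$ groups of 5 each
$n/5$ medians
$T(n) \le T(n/5) + T(7n/10) + O(n)$
$n/5 + 7n/10 = 9n/10$
➢ Only a fraction of $n$, so in the spirit of the Master theorem (the precise argument is an induction on $n$, since the two subproblems have different sizes), $T(n) = O(n)$
[Figure annotations: step costs $O(n)$, $O(n)$, $T(n/5)$, $T(7n/10)$]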
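Putting the pieces together, here is a sketch of selection with a median-of-medians pivot. It is illustrative rather than the slides' own pseudocode: the name `select` is made up, $k$ is 1-indexed, and elements equal to the pivot are set aside in a three-way split (a small deviation from the two-way $A_{less}$ / $A_{more}$ split) so that duplicates cannot cause the recursion to stall.

```python
def select(a, k):
    """Return the k-th smallest element of list a (1-indexed, 1 <= k <= len(a))."""
    if len(a) <= 5:
        return sorted(a)[k - 1]
    # Split into groups of 5 (the last group may be smaller) and take medians.
    groups = [a[i:i + 5] for i in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Pivot p* = median of the medians, found recursively: the T(n/5) term.
    p = select(medians, (len(medians) + 1) // 2)
    # Partition around p*: O(n).
    less = [x for x in a if x < p]
    more = [x for x in a if x > p]
    num_equal = len(a) - len(less) - len(more)
    # Recurse on one side only: roughly the T(7n/10) term.
    if k <= len(less):
        return select(less, k)
    if k <= len(less) + num_equal:
        return p
    return select(more, k - len(less) - num_equal)
```

For example, `select(data, (len(data) + 1) // 2)` returns the (lower) median of `data` without fully sorting it.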
➢ Typically hard to determine
➢ We still don’t know best algorithms for multiplying two $n$-digit integers or two $n \times n$ matrices
➢ Usually, we design an algorithm and then analyze its running time
➢ Sometimes we can do the reverse:
$T(n) = 4T(n/2) + O(n^2)$
$O(n^2)$ computation to combine
➢ For much of this analysis, we are assuming random access to elements of input
➢ So we’re ignoring underlying data structures (e.g. doubly linked list, binary tree, etc.)
➢ We’re only counting the number of comparison or arithmetic operations
➢ So we’re ignoring issues like how real numbers are stored in the closest pair problem
➢ When we get to P vs NP, representation will matter
➢ Can be any reasonable parameter of the problem
➢ E.g., for matrix multiplication, we used $n$ as the size
➢ But an input consists of two matrices with $n^2$ entries
➢ It doesn’t matter whether we call $n$ or $n^2$ the size of the problem
➢ The actual running time of the algorithm won’t change