SLIDE 1
Lecture 01 – Part 01 Algorithms
SLIDE 2
SLIDE 3
Recall DSC 40A...
▶ How do we formalize learning from data?
▶ How do we turn it into something a computer can do?
SLIDE 4
Example: Predicting Salary
SLIDE 5
Example: Predicting Salary
SLIDE 6
The End
$\vec{w} = (X^T X)^{-1} X^T \vec{b}$
SLIDE 7
Wait...
▶ We actually need to compute the answer...
▶ We need an algorithm.
SLIDE 9
An Algorithm?
>>> import numpy as np
>>> w = np.linalg.solve(X.T @ X, X.T @ b)
▶ Will it work for 1,000,000 data points?
▶ What about for 1,000,000 features?
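A minimal sketch of the one-liner above on synthetic data. The data, the sizes `n` and `d`, and the name `true_w` are assumptions made for illustration and are not from the lecture. It solves the normal equations $(X^T X)\vec{w} = X^T\vec{b}$ without forming an explicit inverse and compares the result with NumPy's least squares routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): n points, d features.
n, d = 1_000, 3
X = rng.normal(size=(n, d))
true_w = np.array([2.0, -1.0, 0.5])
b = X @ true_w + 0.01 * rng.normal(size=n)

# Solve the normal equations (X^T X) w = X^T b; no explicit matrix inverse.
w = np.linalg.solve(X.T @ X, X.T @ b)

# np.linalg.lstsq solves the same least squares problem directly from X and b.
w_lstsq, *_ = np.linalg.lstsq(X, b, rcond=None)

print(w)
print(w_lstsq)
```

Forming `X.T @ X` costs roughly $n d^2$ operations and the solve roughly $d^3$, which is one way to start reasoning about the two questions above.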
SLIDE 10
Example: Minimize Error
▶ Goal: summarize a collection of numbers, $x_1, \dots, x_n$.
▶ Idea: find the number $M$ minimizing the total absolute error:
$\sum_{i=1}^{n} |M - x_i|$
SLIDE 11
Example: Minimize Error
▶ Solution: The median of $x_1, \dots, x_n$.
▶ But how do we actually compute the median?
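A small check of the claim, as a sketch. The numbers in `xs` and the helper `total_absolute_error` are made up for illustration. It computes the median with the standard library and compares it against a grid of candidate values of $M$.

```python
import statistics

def total_absolute_error(M, xs):
    """Sum of |M - x_i| over all numbers in xs."""
    return sum(abs(M - x) for x in xs)

xs = [3, 1, 7, 10, 4]  # illustrative data, not from the lecture

# statistics.median returns the middle value of the sorted data.
median = statistics.median(xs)

# Try candidates 0.0, 0.1, ..., 12.0 and keep the one with the smallest error.
candidates = [c / 10 for c in range(0, 121)]
best = min(candidates, key=lambda M: total_absolute_error(M, xs))

print(median, best, total_absolute_error(median, xs))
```

Sorting and taking the middle element is one straightforward way to compute the median. Whether it is the fastest way is exactly the kind of question an algorithms course asks.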
SLIDE 12
Lecture 01 – Part 02 Example: Clustering
SLIDE 13
Clustering
▶ Given a pile of data, discover similar groups.
▶ Examples:
    ▶ Find political groups within social network data.
    ▶ Given data on COVID-19 symptoms, discover groups that are affected differently.
    ▶ Find the similar regions of an image (segmentation).
▶ Most useful when data is high dimensional...
SLIDE 14
Example: Old Faithful
SLIDE 15
Example: Old Faithful
SLIDE 16
Clustering
▶ Goal: have the computer identify the two groups in the data.
SLIDE 17
Example: Old Faithful
SLIDE 18
Clustering
▶ How do we turn this into something a computer can do?
▶ DSC 40A says: “Turn it into an optimization problem”.
▶ Idea: develop a way of quantifying the “goodness” of a clustering; find the best.
SLIDE 19
SLIDE 20
Quantifying Separation
Define the “separation” $\delta(B, R)$ to be the smallest distance between a blue point and a red point.
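The definition can be sketched directly in code. The function name `separation`, the use of Euclidean distance, and the sample points are assumptions for illustration.

```python
import math
from itertools import product

def separation(blue, red):
    """Smallest Euclidean distance between a blue point and a red point."""
    return min(math.dist(p, q) for p, q in product(blue, red))

# Illustrative 2-D points, not from the lecture.
blue = [(0.0, 0.0), (1.0, 0.0)]
red = [(4.0, 4.0), (5.0, 0.0)]

print(separation(blue, red))
```

This checks every blue/red pair, so it does work proportional to the number of blue points times the number of red points.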
SLIDE 21
The Problem
▶ Given: $n$ points $\vec{x}^{(1)}, \dots, \vec{x}^{(n)}$.
▶ Find: an assignment of points to clusters $R$ and $B$ so as to maximize $\delta(B, R)$.
SLIDE 22
The End
SLIDE 23
The “Brute Force” Algorithm
▶ There are finitely many possible clusterings.
▶ Algorithm: Try each possible clustering and return the one with the largest separation, $\delta(B, R)$.
▶ This is called a brute force algorithm.
SLIDE 24
best_separation = float('-inf')  # Python for "negative infinity"
best_clustering = None
for clustering in all_clusterings(data):
    sep = calculate_separation(clustering)
    # We are maximizing, so keep a clustering only if its separation is larger.
    if sep > best_separation:
        best_separation = sep
        best_clustering = clustering
print(best_clustering)
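The loop relies on two helpers, `all_clusterings` and `calculate_separation`, that the slide does not define. The versions below are hypothetical implementations, written only to make the brute force idea runnable on a tiny data set. They assume points are tuples and distance is Euclidean.

```python
import math
from itertools import product

def all_clusterings(data):
    """Yield every split of data into two non-empty clusters (blue, red)."""
    n = len(data)
    # Fix the first point as blue: swapping the colors gives the same clustering.
    for rest in product("BR", repeat=n - 1):
        labels = ("B",) + rest
        blue = [p for p, c in zip(data, labels) if c == "B"]
        red = [p for p, c in zip(data, labels) if c == "R"]
        if red:  # skip the split that leaves the red cluster empty
            yield blue, red

def calculate_separation(clustering):
    """Smallest distance between a blue point and a red point."""
    blue, red = clustering
    return min(math.dist(p, q) for p in blue for q in red)

def brute_force(data):
    """Return the clustering with the largest separation."""
    return max(all_clusterings(data), key=calculate_separation)

# Illustrative points: three near the origin, two far away.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
blue, red = brute_force(data)
print(blue, red)
```

With five points this examines only 15 clusterings. The slides that follow look at what happens as the number of points grows.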
SLIDE 25
The End
SLIDE 26
Wait...
▶ How long will this take to run if there are $n$ points?
▶ How many clusterings of $n$ things are there?
SLIDE 27
Combinatorics
▶ How many ways are there to assign R or B to $n$ objects?
▶ Two choices¹ for each object: $2 \times 2 \times \dots \times 2 = 2^n$.
¹ Small nitpick: the actual color doesn’t matter, so there are really $2^{n-1}$ distinct clusterings.
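Both counts can be checked by enumeration for small $n$. This is a sketch: the function names are hypothetical, and a labeling that puts every object in one color is counted as a clustering, which matches the counts on the slide.

```python
from itertools import product

def count_labelings(n):
    """Number of ways to assign R or B to n objects."""
    return sum(1 for _ in product("RB", repeat=n))

def count_distinct_clusterings(n):
    """Number of labelings when swapping R and B is treated as the same clustering."""
    seen = set()
    for labels in product("RB", repeat=n):
        swapped = tuple("B" if c == "R" else "R" for c in labels)
        # Keep one canonical representative of each {labels, swapped} pair.
        seen.add(min(labels, swapped))
    return len(seen)

for n in range(1, 8):
    print(n, count_labelings(n), count_distinct_clusterings(n))
```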
SLIDE 28
Time
▶ Suppose it takes at least 1 nanosecond to check a single clustering.
    ▶ One billionth of a second.
▶ If there are $n$ points, it will take at least $2^n$ nanoseconds to check all clusterings.
SLIDE 29
Time Needed
$n$     Time
1       1 nanosecond
10      1 microsecond
20      1 millisecond
30      1 second
40      18 minutes
50      13 days
60      36 years
70      37,000 years
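The larger entries can be reproduced from the $2^n$ nanosecond estimate. This is a sketch: the helper names and the choice of a 365.25-day year are assumptions, and the slide's figures are rounded.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed convention for "one year"

def brute_force_seconds(n, ns_per_clustering=1.0):
    """Lower bound, in seconds, on the time to check 2**n clusterings."""
    return (2 ** n) * ns_per_clustering * 1e-9

def describe(seconds):
    """Express a duration in the largest unit that keeps the value at least 1."""
    units = [("years", SECONDS_PER_YEAR), ("days", 86_400), ("minutes", 60),
             ("seconds", 1), ("milliseconds", 1e-3), ("microseconds", 1e-6)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:,.1f} {name}"
    return f"{seconds * 1e9:,.1f} nanoseconds"

for n in (10, 20, 30, 40, 50, 60, 70):
    print(n, describe(brute_force_seconds(n)))
```

Every additional 10 points multiplies the running time by $2^{10} = 1024$, which is why the units in the table change at each row.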
SLIDE 37
Example: Old Faithful
▶ The Old Faithful data set has 270 points.
▶ The brute force algorithm will finish in about $6 \times 10^{64}$ years.
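The figure for 270 points can be checked with exact integer arithmetic. This is a sketch, assuming 1 nanosecond per clustering and a 365-day year.

```python
# Nanoseconds in one 365-day year (an assumed convention).
NS_PER_YEAR = 10**9 * 60 * 60 * 24 * 365

n_points = 270

# Python integers have arbitrary precision, so 2**270 is computed exactly.
years = 2**n_points // NS_PER_YEAR

print(f"{years:.1e} years")
```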
SLIDE 39
Algorithm Design
▶ Often, the most obvious algorithm is unusably slow.
▶ Does this mean our problem is too hard?
▶ We’ll see an efficient solution by the end of the quarter.
SLIDE 42