Wentworth Institute of Technology COMP4050 – Machine Learning | Fall 2015 | Derbinsky
Introduction to Machine Learning
Lecture 1
September 2, 2015 Introduction to Machine Learning 1
Position                  | Salary*
--------------------------|---------
Data Scientist            | $118,709
Machine Learning Engineer | $112,500
Software Engineer         | $90,374

*glassdoor.com, national average as of August 24, 2015

"A data scientist is someone who knows more statistics than a computer scientist and more computer science than a statistician." – Josh Blumenstock (UW)

"Data Scientist = statistician + programmer + coach + storyteller + artist" – Shlomo Argamon (Ill. Inst. of Tech)
Terminology:
- Example (instance): the unit of input, composed of features (or attributes)
  - e.g., representing each digit via raw pixels: a 28x28 = 784-pixel vector of greyscale values in [0-255]
- Dimensionality: the number of features per instance (|vector|)
- Other representations are possible, and might be advantageous; feature selection is challenging
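The flattening step above can be sketched in a few lines; the image here is synthetic (not a real digit), just to show how a 28x28 greyscale grid becomes one 784-dimensional feature vector.

```python
# Sketch: representing a 28x28 greyscale image as a single 784-dimensional
# feature vector with values in [0, 255] (synthetic pixel data).
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]

# Flatten rows into one instance: dimensionality = 28 * 28 = 784
features = [pixel for row in image for pixel in row]

print(len(features))                   # 784
print(min(features), max(features))    # 0 255
```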
[Figure: a data table — each row an Instance, each column a Feature]
Feature types:
- Nominal: equality, containment (e.g., hair color, part of speech)
- Ordinal: supports ranking (e.g., Likert scale, true/false)
- Interval: degree of difference is meaningful (e.g., Celsius)
- Ratio: has a meaningful zero, so ratios have meaning (e.g., Kelvin)
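The distinctions can be made concrete with toy values (all hypothetical): each scale supports strictly more operations than the one before it.

```python
# Sketch: which operations are meaningful at each measurement scale.

hair = ["brown", "blond"]       # nominal: only equality makes sense
print(hair[0] == hair[1])       # False

likert = [1, 3, 5]              # ordinal: ranking is meaningful
print(likert[0] < likert[2])    # True

celsius = [10.0, 20.0]          # interval: differences are meaningful,
print(celsius[1] - celsius[0])  # 10.0 ... but 20C is NOT "twice as hot"

kelvin = [150.0, 300.0]         # ratio: a true zero, so ratios work
print(kelvin[1] / kelvin[0])    # 2.0
```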
http://www.mymarketresearchmethods.com/types-of-data-nominal-ordinal-interval-ratio/
Person1 | Person2 | Relationship
--------|---------|-------------
Ann     | Bob     | Friend
Ann     | Sally   | Friend
Ann     | Billy   | Sibling
Bob     | Billy   | Friend
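Relational data like this table doesn't fit a flat fixed-length feature vector; one option (a sketch, assuming the relations are symmetric) is a labeled graph built as an adjacency structure.

```python
# Sketch: the relationship table as a labeled graph.
relations = [
    ("Ann", "Bob", "Friend"),
    ("Ann", "Sally", "Friend"),
    ("Ann", "Billy", "Sibling"),
    ("Bob", "Billy", "Friend"),
]

# Adjacency structure: person -> list of (other person, relationship)
graph = {}
for p1, p2, rel in relations:
    graph.setdefault(p1, []).append((p2, rel))
    graph.setdefault(p2, []).append((p1, rel))  # assume symmetry

print(len(graph["Ann"]))   # 3
```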
Processes:
- Deterministic: every output can be uniquely determined by a set of inputs; the process evolves the same way for a given set of initial conditions
- Stochastic (probabilistic): randomness is present, and variable states are not described by unique values, but rather by probability distributions
- Often modeled as: deterministic process + hypothesized distribution of noise
- States/variables are either directly measured (observable) or inferred from data (latent variables can reduce problem dimensionality)
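The "deterministic process + hypothesized noise" view can be sketched directly; the function and Gaussian noise below are invented for illustration.

```python
import random

# Sketch: a stochastic process modeled as a deterministic process plus
# Gaussian noise (hypothesized distribution).
random.seed(42)

def deterministic(x):
    return 3.0 * x + 2.0                 # uniquely determined by x

def observed(x, sigma=0.1):
    return deterministic(x) + random.gauss(0.0, sigma)

# Repeated measurements at the same x differ: a distribution, not a value,
# but the sample mean recovers the deterministic part.
samples = [observed(1.0) for _ in range(1000)]
mean = sum(samples) / len(samples)
print(round(mean, 1))                    # 5.0
```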
[Figure: a Training Set of labeled instances (α, β, β, …, γ) and a Testing Set of unlabeled instances (?, …)]
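The split pictured above can be sketched as shuffling labeled data and holding out a fraction for testing (the 80/20 ratio here is a common but arbitrary choice).

```python
import random

# Sketch: split labeled data into a training set (to fit the model) and a
# held-out testing set (to estimate generalization).
random.seed(7)

data = [(i, "alpha" if i % 2 == 0 else "beta") for i in range(100)]

random.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]

print(len(train), len(test))   # 80 20
```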
SepalLength | SepalWidth | PetalLength | PetalWidth | Species
------------|------------|-------------|------------|--------
5.1         | 3.5        | 1.4         | 0.2        | setosa
4.9         | 3.0        | 1.4         | 0.2        | setosa
4.7         | 3.2        | 1.3         | 0.2        | setosa
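In code, each row above becomes an instance: a 4-dimensional feature vector plus a species label.

```python
# Sketch: the iris rows as (feature vector, label) instances.
instances = [
    ([5.1, 3.5, 1.4, 0.2], "setosa"),
    ([4.9, 3.0, 1.4, 0.2], "setosa"),
    ([4.7, 3.2, 1.3, 0.2], "setosa"),
]

features, labels = zip(*instances)
print(len(features[0]))   # 4  (dimensionality)
print(set(labels))        # {'setosa'}
```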
Common algorithm families:
- Instance-based: Nearest Neighbor (kNN)
- Decision trees: ID3, C4.5
- Function approximation: linear/logistic regression, support vector machines (SVM)
- Probabilistic: Naïve Bayes
- Neural networks: backpropagation, deep learning
– kNN classifies a query instance via a distance function: a majority vote among its k nearest training examples
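A minimal kNN sketch, using Euclidean distance and toy 2-D data (invented for illustration, not the iris set):

```python
import math
from collections import Counter

# Sketch: kNN classification — majority vote among the k training
# instances closest to the query under Euclidean distance.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    neighbors = sorted(train, key=lambda inst: euclidean(inst[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [([1.0, 1.0], "a"), ([1.2, 0.8], "a"),
         ([5.0, 5.0], "b"), ([4.8, 5.2], "b"), ([5.1, 4.9], "b")]
print(knn_predict(train, [1.1, 0.9], k=3))   # a
```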
Explicit knowledge representation vs. implicit
Key terms: objective function, kernel trick
Key terms: perceptron, linear classifier, gradient descent, backpropagation, feedforward vs. recurrent, deep architectures, vanishing gradient
Model y = f(x) as f̂(x):

Err(x) = E[(Y − f̂(x))²] = Bias² + Variance + Irreducible Error

Bias = E[f̂(x)] − f(x)
Variance = E[(f̂(x) − E[f̂(x)])²]
Irreducible Error = σ²
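The decomposition can be checked empirically by Monte Carlo; the setup below is hypothetical (a deliberately biased estimator that shrinks the sample mean toward zero), chosen so every term is nonzero.

```python
import random

# Monte-Carlo illustration of Err(x) = Bias^2 + Variance + sigma^2 at a
# single query point (hypothetical target function and estimator).
random.seed(0)

SIGMA = 0.5                # std-dev of the irreducible noise
X0 = 1.0                   # query point

def true_f(x):             # the unknown target function f(x)
    return 2.0 * x

def train_and_predict():
    # "Training set": 10 noisy observations of f at X0; the estimator
    # returns 0.8 * sample mean — biased, but lower variance.
    ys = [true_f(X0) + random.gauss(0.0, SIGMA) for _ in range(10)]
    return 0.8 * sum(ys) / len(ys)

preds = [train_and_predict() for _ in range(20000)]
mean_pred = sum(preds) / len(preds)

bias = mean_pred - true_f(X0)                                    # E[f^] - f
variance = sum((p - mean_pred) ** 2 for p in preds) / len(preds)

# Expected squared error against fresh noisy observations Y at X0
err = sum((true_f(X0) + random.gauss(0.0, SIGMA) - p) ** 2
          for p in preds) / len(preds)

decomposed = bias ** 2 + variance + SIGMA ** 2
print(round(err, 2), round(decomposed, 2))   # the two (nearly) agree
```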
[Figure: the reinforcement learning loop — the Agent, in state s_t, sends action a_t to the Environment; the Environment returns reward r_{t+1} and next state s_{t+1}]
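The loop in the figure can be sketched with tabular Q-learning on a hypothetical 1-D chain world (states 0..4, actions -1/+1, reward 1.0 for reaching state 4 — all invented for illustration).

```python
import random

# Sketch: agent-environment loop via tabular Q-learning on a toy chain.
random.seed(1)

def step(state, action):
    # environment dynamics: returns s_{t+1} and r_{t+1}
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward

Q = {(s, a): 0.0 for s in range(5) for a in (-1, 1)}
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

def greedy(s):
    # prefer +1 on ties so early episodes make progress
    return 1 if Q[(s, 1)] >= Q[(s, -1)] else -1

for episode in range(200):
    s = 0
    while s != 4:
        # epsilon-greedy action selection
        a = random.choice((-1, 1)) if random.random() < EPS else greedy(s)
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        target = r + GAMMA * max(Q[(s2, -1)], Q[(s2, 1)])
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

policy = {s: greedy(s) for s in range(4)}
print(policy)   # the learned policy moves right: {0: 1, 1: 1, 2: 1, 3: 1}
```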
Choosing an approach depends on the data and setting:
- Parametric algorithm: model does not grow with data size
- Scale: MB vs. GB vs. TB
- Certain vs. uncertain
- Static vs. real-time
- Homogeneous vs. heterogeneous
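The parametric distinction can be made concrete (a sketch with hypothetical data): a 1-D least-squares line stores two numbers no matter how much data it sees, while kNN stores every training example.

```python
# Sketch: parametric vs. non-parametric model size.

def fit_line(xs, ys):
    # ordinary least squares for y = w*x + b
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return (w, my - w * mx)        # 2 parameters, regardless of n

def fit_knn(xs, ys):
    return list(zip(xs, ys))       # stores all n examples

xs = [float(i) for i in range(1000)]
ys = [2.0 * x + 1.0 for x in xs]

print(len(fit_line(xs, ys)))   # 2
print(len(fit_knn(xs, ys)))    # 1000
```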
Summary:
- Datasets are composed of instances, each composed of k-dimensional feature vectors
- Learning tasks include supervised (classification, regression), unsupervised, and reinforcement
- Learning algorithms seek an ideal tradeoff between model bias and variance
- Machine learning in practice spans data collection/preprocessing/analysis, training/evaluation, and eventual deployment