Machine Learning: Basic Principles Teaching demonstration Kalle - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Machine Learning: Basic Principles

Teaching demonstration Kalle Palomäki Department of Signal Processing and Acoustics Aalto University

slide-2
SLIDE 2

Content

  • 1. Goal
  • 2. Machine learning: definition
  • 3. Classification: an important machine learning approach
  • 4. A machine learning problem
      • Hands-on problem solving
      • Demonstration
  • 5. Summary
slide-3
SLIDE 3

Goal

  • Part of introductory sessions adjusted to 20 minutes
  • 4th year students with no background in machine learning
  • Start building understanding of machine learning by
      • Concrete examples
      • Solving simple hands-on problems

slide-4
SLIDE 4

Machine learning - definition

Wikipedia: “Machine learning deals with the construction and study of systems that can learn from data, rather than follow only explicitly programmed instructions”

slide-5
SLIDE 5

http://oldentech.files.wordpress.com/2010/07/1028528_29880053.jpg http://www.paranormalpeopleonline.com/boskop-man-big-brains-and-increased-intelligence/

Common sense definition: machines that learn a little like brains do

slide-6
SLIDE 6

Internet and machine learning - far beyond a single brain's capacity

http://www.slate.com/blogs/future_tense/2014/10/24/internet_sleep_new_research_from_usc_shows_internet_activity_changes_in.html

slide-7
SLIDE 7

Machine learning categories

  • Supervised learning
      • Classification
  • Unsupervised learning
      • Clustering
  • Reinforcement learning

slide-8
SLIDE 8

Classifier

slide-9
SLIDE 9

Classifier

slide-10
SLIDE 10

http://upload.wikimedia.org/wikipedia/commons/3/39/Leonardo_da_Vinci_043-mod.jpg

http://ecx.images-amazon.com/images/I/51f9cnKx90L._SY300_.jpg

Problem

Lisa is a tailor...

slide-11
SLIDE 11

Lisa makes uniforms

Salvation Army uniforms: men wear trousers, women wear skirts

http://www.bilerico.com/2009/03/Army%20Uniforms.jpg

slide-12
SLIDE 12

Sometimes she makes mistakes

These should be skirts.

slide-13
SLIDE 13

Once she made a skirt for Prince Charles!

http://i.dailymail.co.uk/i/pix/2009/05/21/article-1186234-050B9CB2000005DC-834_224x423.jpg

slide-14
SLIDE 14

Figure labels: Hip, Waist (shown on two figures)

slide-15
SLIDE 15

Here is Lisa’s data

waist (cm)   hip (cm)   gender
29.6         34.4       Female
28.9         34.4       Female
31.3         34.5       ???
30.8         33.7       Male
29.8         34.5       ???
32.5         33.6       Male
30.6         34.4       ???
.....        .....      .......

slide-16
SLIDE 16

Plot legend: female samples in red, male samples in blue, samples with missing gender information marked with *

slide-17
SLIDE 17

Some help to Lisa?

  • Discuss in pairs for 2 min:
      • How would you approach this problem?
      • What kind of algorithm would you design?
      • Try to come up with some ideas, please!
      • Use the picture provided to assist your discussion

slide-18
SLIDE 18

K-nearest neighbours algorithm

  • 1. Determine K = number of nearest neighbours
  • 2. Calculate the distance between the test sample and all the training samples
      • Use the Euclidean distance measure
  • 3. Sort the distances and determine the nearest neighbours
  • 4. Gather the categories of the nearest neighbours
  • 5. Use majority voting to predict the class of the test sample

http://people.revoledu.com/kardi/tutorial/KNN/
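
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the slides: the function name `knn_predict` is hypothetical, ties are resolved in favour of the label seen first, and the training rows are only the four labelled rows visible in Lisa's truncated table, so the printed prediction is illustrative only.

```python
from collections import Counter
import math


def knn_predict(test_sample, training_samples, labels, k=3):
    """Classify test_sample by majority vote among its k nearest training samples."""
    # Step 2: Euclidean distance from the test sample to every training sample.
    distances = [math.dist(test_sample, sample) for sample in training_samples]
    # Step 3: sort training sample indices by distance and keep the k nearest.
    nearest = sorted(range(len(training_samples)), key=lambda i: distances[i])[:k]
    # Step 4: gather the categories of the nearest neighbours.
    votes = [labels[i] for i in nearest]
    # Step 5: majority vote (ties go to the label encountered first).
    return Counter(votes).most_common(1)[0][0]


# Labelled rows copied from the visible part of Lisa's table: (waist, hip).
training_samples = [(29.6, 34.4), (28.9, 34.4), (30.8, 33.7), (32.5, 33.6)]
labels = ["Female", "Female", "Male", "Male"]

# One of the rows with missing gender information.
prediction = knn_predict((31.3, 34.5), training_samples, labels, k=3)
print(prediction)
```

Step 1 (choosing K) is left to the caller through the `k` argument; an odd K avoids ties when there are two classes.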

slide-19
SLIDE 19

Plot legend: female samples in red, male samples in blue, samples with missing gender information marked with *

slide-20
SLIDE 20

Plot legend: female samples in red, male samples in blue, samples with missing gender information marked with *

K = 3

slide-21
SLIDE 21

K-nearest neighbours algorithm

  • 1. Determine K = number of nearest neighbours
  • 2. Calculate the distance between the test sample and all the training samples
      • Use the Euclidean distance measure
  • 3. Sort the distances and determine the nearest neighbours
  • 4. Gather the categories of the nearest neighbours
  • 5. Use majority voting to predict the class of the test sample

http://people.revoledu.com/kardi/tutorial/KNN/

slide-22
SLIDE 22

Euclidean distance

Annotated parts of the formula: test sample, training samples, Euclidean distance

http://people.revoledu.com/kardi/tutorial/KNN/

slide-23
SLIDE 23

Euclidean distance

Annotated parts of the formula: test sample, training samples, Euclidean distance, training sample index

http://people.revoledu.com/kardi/tutorial/KNN/

slide-24
SLIDE 24

Euclidean distance

Annotated parts of the formula: data dimension, test sample, training samples, Euclidean distance, training sample index, dimension index

http://people.revoledu.com/kardi/tutorial/KNN/

slide-25
SLIDE 25

Euclidean distance

Annotated parts of the formula: data dimension M=2, test sample, training samples, Euclidean distance, training sample index, dimension index

http://people.revoledu.com/kardi/tutorial/KNN/
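
The formula itself is missing from this transcript. Going by the annotations on these slides, it is presumably the standard Euclidean distance. The symbol names below are assumptions, not the slide's own notation: x is the test sample, y_i the i-th training sample, j the dimension index, M the data dimension and N the number of training samples.

```latex
d_i = \sqrt{\sum_{j=1}^{M} \left( x_j - y_{ij} \right)^2}, \qquad i = 1, \dots, N
```

For the waist and hip data, M = 2, so the sum has two terms:

```latex
d_i = \sqrt{\left( x_1 - y_{i1} \right)^2 + \left( x_2 - y_{i2} \right)^2}
```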

slide-26
SLIDE 26

Plot legend: test sample (*), female samples of training data, male samples of training data. Euclidean distance: d1

slide-27
SLIDE 27

Plot legend: test sample (*), female samples of training data, male samples of training data. Euclidean distance: d2

slide-28
SLIDE 28

Plot legend: test sample (*), female samples of training data, male samples of training data. Euclidean distance: d3

slide-29
SLIDE 29

Plot legend: test sample (*), female samples of training data, male samples of training data. Euclidean distance: d4

slide-30
SLIDE 30

Plot legend: test sample (*), female samples of training data, male samples of training data. Euclidean distance: d5

slide-31
SLIDE 31

Plot legend: test sample (*), female samples of training data, male samples of training data. Euclidean distance: d6

slide-32
SLIDE 32

K-nearest neighbours algorithm

  • 1. Determine K = number of nearest neighbours
  • 2. Calculate the distance between the test sample and all the training samples
      • Use the Euclidean distance measure
  • 3. Sort the distances and determine the nearest neighbours
  • 4. Gather the categories of the nearest neighbours
  • 5. Use majority voting to predict the class of the test sample

http://people.revoledu.com/kardi/tutorial/KNN/

slide-33
SLIDE 33

Plot legend: test sample (*), female samples of training data, male samples of training data, 3 nearest neighbours

slide-34
SLIDE 34

K-nearest neighbours algorithm

  • 1. Determine K = number of nearest neighbours
  • 2. Calculate the distance between the test sample and all the training samples
      • Use the Euclidean distance measure
  • 3. Sort the distances and determine the nearest neighbours
  • 4. Gather the categories of the nearest neighbours
  • 5. Use majority voting to predict the class of the test sample

http://people.revoledu.com/kardi/tutorial/KNN/

slide-35
SLIDE 35

Plot legend: test sample (*), female samples of training data, male samples of training data, 3 nearest neighbours

All 3 neighbours are male, so the class is male.

slide-36
SLIDE 36

Plot legend: test sample (*), female samples of training data, male samples of training data, 3 nearest neighbours

slide-37
SLIDE 37

Plot legend: test sample (*), female samples of training data, male samples of training data, 3 nearest neighbours

2 neighbours are female and 1 is male. There are more females than males, so the class is female.

slide-38
SLIDE 38

Classification problem

Lisa has lost the gender information of one of her customers and does not know whether to make a skirt or trousers. She is planning to toss a coin. Can you help her make a better decision?

The customer who is missing gender information: Gender ------, Waist 28, Hip 34

gender   waist (cm)   hip (cm)
Male     28           32
Male     33           35
Female   27           33
Female   31           36

http://www.dcs.gla.ac.uk/~srogers/firstcourseml/matlab/chapter5/knnexample.html#1

Molarius A, Seidell JC, Sans S, Tuomilehto J, Kuulasmaa K. (1999) "Waist and hip circumferences, and waist-hip ratio in 19 populations of the WHO MONICA Project", International Journal of Obesity and Related Metabolic Disorders: Journal of the International Association for the Study of Obesity, 23:116-125.

slide-39
SLIDE 39

Solution

Test sample: waist 28, hip 34

Gender   waist (cm)   hip (cm)   distance (squared)
Male     28           32         (28-28)^2 + (34-32)^2 = 4
Male     33           35         (28-33)^2 + (34-35)^2 = 26
Female   27           33         (28-27)^2 + (34-33)^2 = 2
Female   31           36         (28-31)^2 + (34-36)^2 = 13

slide-40
SLIDE 40

Solution

Test sample: waist 28, hip 34

Gender   waist (cm)   hip (cm)   distance (squared)
Male     28           32         (28-28)^2 + (34-32)^2 = 4
Male     33           35         (28-33)^2 + (34-35)^2 = 26
Female   27           33         (28-27)^2 + (34-33)^2 = 2
Female   31           36         (28-31)^2 + (34-36)^2 = 13
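
The distance column can be reproduced with a short Python sketch. Note that the table lists squared distances, with no square root taken; this does not change which samples are nearest, because the square root preserves ordering. The variable names are illustrative only.

```python
# Test sample and training data from the slides: (waist cm, hip cm).
test_sample = (28, 34)
training_data = [
    ("Male", 28, 32),
    ("Male", 33, 35),
    ("Female", 27, 33),
    ("Female", 31, 36),
]

# Squared Euclidean distance, as in the table (no square root taken).
squared_distances = [
    (test_sample[0] - waist) ** 2 + (test_sample[1] - hip) ** 2
    for _gender, waist, hip in training_data
]

# Print one row per training sample: gender, waist, hip, squared distance.
for (gender, waist, hip), d2 in zip(training_data, squared_distances):
    print(f"{gender:<6} {waist} {hip} {d2}")
```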

slide-41
SLIDE 41

Solution

Test sample: waist 28, hip 34

Gender   waist (cm)   hip (cm)   distance (squared)           rank
Male     28           32         (28-28)^2 + (34-32)^2 = 4    2
Male     33           35         (28-33)^2 + (34-35)^2 = 26   4
Female   27           33         (28-27)^2 + (34-33)^2 = 2    1
Female   31           36         (28-31)^2 + (34-36)^2 = 13   3
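
The rank column corresponds to step 3 of the algorithm, sorting the distances. A minimal sketch, assuming the four squared distances from the table are given in row order:

```python
squared_distances = [4, 26, 2, 13]  # values from the table, in row order

# Indices of the rows ordered from nearest to farthest.
order = sorted(range(len(squared_distances)), key=lambda i: squared_distances[i])

# ranks[i] is the 1-based position of row i in that ordering.
ranks = [0] * len(squared_distances)
for position, row in enumerate(order, start=1):
    ranks[row] = position

print(ranks)
```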

slide-42
SLIDE 42

Solution

Test sample: waist 28, hip 34

Gender   waist (cm)   hip (cm)   distance (squared)           rank   belongs to the neighbourhood (Yes or No)
Male     28           32         (28-28)^2 + (34-32)^2 = 4    2      Yes
Male     33           35         (28-33)^2 + (34-35)^2 = 26   4      No
Female   27           33         (28-27)^2 + (34-33)^2 = 2    1      Yes
Female   31           36         (28-31)^2 + (34-36)^2 = 13   3      Yes

slide-43
SLIDE 43

Solution

Test sample: waist 28, hip 34

Gender   waist (cm)   hip (cm)   distance (squared)           rank   belongs to the neighbourhood (Yes or No)   gender if in neighbourhood
Male     28           32         (28-28)^2 + (34-32)^2 = 4    2      Yes                                        Male
Male     33           35         (28-33)^2 + (34-35)^2 = 26   4      No                                         -----
Female   27           33         (28-27)^2 + (34-33)^2 = 2    1      Yes                                        Female
Female   31           36         (28-31)^2 + (34-36)^2 = 13   3      Yes                                        Female

Male: 1, Female: 2
Number of Female > Number of Male
Class: Female
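
The last two columns and the final vote can be sketched as follows. K = 3 is assumed here, as the three "Yes" rows in the table imply; the variable names are illustrative only.

```python
from collections import Counter

K = 3  # neighbourhood size, assumed from the three "Yes" rows
genders = ["Male", "Male", "Female", "Female"]
ranks = [2, 4, 1, 3]  # distance ranks from the table, in row order

# A row belongs to the neighbourhood when its rank is at most K.
in_neighbourhood = [rank <= K for rank in ranks]

# Count the genders of the rows inside the neighbourhood.
votes = Counter(g for g, inside in zip(genders, in_neighbourhood) if inside)

# Majority vote: the most common gender among the K nearest neighbours.
predicted = votes.most_common(1)[0][0]

print(in_neighbourhood, dict(votes), predicted)
```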

slide-44
SLIDE 44
slide-45
SLIDE 45

Summary

  • We briefly addressed the principles of machine learning
      • 1. Defined machine learning
      • 2. Introduced classification as an important machine learning task
      • 3. Solved a hands-on classification problem using the K-nearest neighbour algorithm
  • Check out my website for
      • These slides
      • Exercise
      • The code for the decision border calculations in the previous slides

http://users.spa.aalto.fi/kpalomak/demonstration_session

slide-46
SLIDE 46

What next

  • Supervised learning
      • Classification
  • Unsupervised learning
      • Clustering
  • Reinforcement learning

slide-47
SLIDE 47

http://cs.nyu.edu/~roweis/data.html

Face recognition

slide-48
SLIDE 48

Speech recognition

  • Spectrum over time for “cat”
  • Phoneme labels: k a t

slide-49
SLIDE 49

Searches