


CMPSCI 370: Intro. to Computer Vision

The machine learning framework

University of Massachusetts, Amherst
April 5/7, 2016
Instructor: Subhransu Maji

  • Homework 4 due this Thursday, April 07
  • Additional OH today (3-4 pm, CS 274)
  • Honors section will meet today
  • We will discuss how the Microsoft Kinect works
  • Non HH students are welcome (CS 142, 4:00-5:00 pm)

Administrivia


  • The machine learning framework
  • Common datasets in computer vision
  • An example: decision tree classifiers

Today and tomorrow

3 Subhransu Maji (UMASS) CMPSCI 370


How would you write a program to distinguish a picture of me from a picture of someone else?

  • Provide example pictures of me and pictures of other people, and let a classifier learn to distinguish the two.

How would you write a program to determine whether a sentence is grammatical or not?

  • Provide examples of grammatical and ungrammatical sentences, and let a classifier learn to distinguish the two.

How would you write a program to distinguish cancerous cells from normal cells?

  • Provide examples of cancerous and normal cells, and let a classifier learn to distinguish the two.

Classification


Example dataset:

Data (“weather” prediction)


Three principal components

  • 1. Class label (aka “label”, denoted by y)
  • 2. Features (aka “attributes”)
  • 3. Feature values (aka “attribute values”, denoted by x)

➡ Feature values can be binary, nominal or continuous

A labeled dataset is a collection of (x, y) pairs
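As a concrete sketch of this representation, a labeled dataset is just a collection of (x, y) pairs, where x holds the feature values and y is the class label. The weather-style feature names and values below are made up for illustration:

```python
# A labeled dataset as a list of (x, y) pairs.
# x maps feature names to feature values (binary, nominal, or continuous);
# y is the class label. The features below are illustrative, not the
# actual dataset from the slides.
dataset = [
    ({"outlook": "sunny", "humidity": 0.9, "windy": False}, "no-play"),
    ({"outlook": "overcast", "humidity": 0.6, "windy": False}, "play"),
    ({"outlook": "rain", "humidity": 0.7, "windy": True}, "no-play"),
]

features = sorted(dataset[0][0])
labels = [y for _, y in dataset]
print(features)  # -> ['humidity', 'outlook', 'windy']
print(labels)    # -> ['no-play', 'play', 'no-play']
```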

Subhransu Maji (UMASS) CMPSCI 370

Example dataset:

Data (“weather” prediction)


Task: predict the class of this "test" example. Doing so requires us to generalize from the training data.


What is a good representation for images?

Data (face recognition)


Pixel values? Edges?


Whole idea: Inject your knowledge into a learning system

Ingredients for classification


Sources of knowledge:

  • 1. Feature representation
➡ Not typically a focus of machine learning; typically seen as "problem specific"
➡ However, it's hard to learn from bad representations
  • 2. Training data: labeled examples
➡ Often expensive to label lots of data
➡ Sometimes data is available for "free"
  • 3. Model
➡ No single learning algorithm is always good ("no free lunch")
➡ Different learning algorithms work with different ways of representing the learned classifier


Regression is like classification, except the labels are real-valued.

Regression

Example applications:

  • Stock value prediction
  • Income prediction
  • CPU power consumption
  • Your grade in CMPSCI 689


Structured prediction



Two types of clustering

  • 1. Clustering into distinct components
  • 2. Hierarchical clustering

Unsupervised learning: Clustering


➡ How many clusters are there?
➡ What is important? Person? Expression? Lighting?


Unsupervised learning: Clustering


Example hierarchy: north american birds → perching birds → jays, magpies, crows

http://vision.ucsd.edu/~gvanhorn/


Two types of clustering

  • 1. Clustering into distinct components

➡ How many clusters are there? ➡ What is important? Person? Expression? Lighting?

  • 2. Hierarchical clustering

➡ What is important? ➡ How will we use this?

Unsupervised learning: Clustering


Apply a prediction function to a feature representation of the image to get the desired output:

f([image of an apple]) = "apple"
f([image of a tomato]) = "tomato"
f([image of a cow]) = "cow"

Learning to recognize


y = f(x)

Training: given a training set of labeled examples {(x1, y1), …, (xN, yN)}, estimate the prediction function f by minimizing the prediction error on the training set.

Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x).

The machine learning framework

(diagram: y is the output/prediction, f the prediction function, x the image feature)

Steps

Training: training images → image features, which together with the training labels are used to train a learned model.
Testing: test image → image features → learned model → prediction.

Slide credit: D. Hoiem
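The training/testing pipeline above can be sketched end-to-end in code. This is a minimal sketch, not the method from the slides: it uses raw flattened pixels as the feature representation and a nearest-class-centroid rule as the learned model, and the toy 2x2 "images" are illustrative.

```python
# Minimal sketch of the train/test pipeline: extract features, fit a
# model on labeled training data, then predict on unseen test data.
# The nearest-centroid "model" is an illustrative choice.

def extract_features(image):
    # Stand-in feature representation: raw pixel values, flattened.
    return [p for row in image for p in row]

def train(images, labels):
    # Learned "model": the mean feature vector (centroid) per class.
    per_class = {}
    for img, y in zip(images, labels):
        per_class.setdefault(y, []).append(extract_features(img))
    return {y: [sum(col) / len(xs) for col in zip(*xs)]
            for y, xs in per_class.items()}

def predict(model, image):
    # Testing: assign the class whose centroid is nearest in feature space.
    x = extract_features(image)
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: dist(model[y]))

# Toy 2x2 "images": a bright class vs. a dark class.
train_images = [[[9, 9], [8, 9]], [[1, 0], [0, 1]]]
train_labels = ["bright", "dark"]
model = train(train_images, train_labels)
print(predict(model, [[7, 8], [9, 7]]))  # -> bright
```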

Features (examples)

  • Raw pixels (and simple functions of raw pixels)
  • Histograms, bags of features
  • GIST descriptors
  • Histograms of oriented gradients (HOG)

more on this topic next week…

Classifiers: Nearest neighbor

f(x) = label of the training example nearest to x

All we need is a distance function for our inputs. No training is required!

(figure: a test example among training examples from class 1 and class 2)
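The nearest-neighbor rule is easy to state in code. A minimal sketch, where the squared Euclidean distance and the toy 2-D points are illustrative assumptions; any distance function over the inputs would do:

```python
def nearest_neighbor_classify(train_examples, x):
    """f(x) = label of the training example nearest to x.

    train_examples: list of (feature_vector, label) pairs.
    No training step is required: the classifier just stores the data.
    """
    def squared_euclidean(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, nearest_label = min(
        train_examples, key=lambda ex: squared_euclidean(ex[0], x))
    return nearest_label

# Toy example: class 1 clusters near (0, 0), class 2 near (5, 5).
train_examples = [((0, 0), 1), ((1, 0), 1), ((5, 5), 2), ((4, 5), 2)]
print(nearest_neighbor_classify(train_examples, (0.5, 0.5)))  # -> 1
print(nearest_neighbor_classify(train_examples, (4.6, 4.2)))  # -> 2
```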


Classifiers: Linear

Find a linear function to separate the classes: f(x) = sgn(w ⋅ x + b)
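The decision rule f(x) = sgn(w ⋅ x + b) can be written down directly. In this sketch the weight vector and bias are hand-chosen for illustration; the slides do not cover here how w and b are learned (the perceptron, SVMs, and logistic regression are standard choices):

```python
def linear_classify(w, b, x):
    # f(x) = sgn(w . x + b): +1 on one side of the hyperplane, -1 on the other.
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hand-chosen separating line x1 + x2 = 5, for illustration only.
w, b = (1.0, 1.0), -5.0
print(linear_classify(w, b, (4, 4)))  # -> 1  (above the line)
print(linear_classify(w, b, (1, 1)))  # -> -1 (below the line)
```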

Classifiers: Decision trees

Play tennis?


Generalization

How well does a learned model generalize from the data it was trained on to a new test set?

Training set (labels known) Test set (labels unknown)

Diagnosing generalization ability

Training error: how well does the model perform at prediction on the data on which it was trained?
Test error: how well does it perform on a never-before-seen test set?

Training and test error are both high: underfitting

  • Model does an equally poor job on the training and the test set
  • Either the training procedure is ineffective or the model is too "simple" to represent the data

Training error is low but test error is high: overfitting

  • Model has fit irrelevant characteristics (noise) in the training data
  • Model is too complex or the amount of training data is insufficient
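The diagnosis above can be turned into a small decision rule. A sketch under assumed thresholds: the cutoffs `high` and `gap` below are illustrative, not values from the slides.

```python
def diagnose(train_error, test_error, high=0.30, gap=0.20):
    """Classify the regime from training and test error.
    The thresholds `high` and `gap` are illustrative assumptions."""
    if train_error >= high and test_error >= high:
        return "underfitting"        # equally poor on both sets
    if test_error - train_error >= gap:
        return "overfitting"         # fit noise in the training data
    return "good generalization"

print(diagnose(0.40, 0.45))  # both errors high -> underfitting
print(diagnose(0.02, 0.35))  # low train, high test -> overfitting
print(diagnose(0.05, 0.08))  # -> good generalization
```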


Underfitting and overfitting

(figure: underfitting vs. overfitting vs. good generalization)

  • The machine learning framework
  • Common datasets in computer vision
  • An example: decision tree classifiers

Today and tomorrow


Caltech 101 & 256

Caltech-101: Fei-Fei, Fergus, Perona, 2004
http://www.vision.caltech.edu/Image_Datasets/Caltech101/

Caltech-256: Griffin, Holub, Perona, 2007
http://www.vision.caltech.edu/Image_Datasets/Caltech256/

Caltech-101: Intra-class variability


The PASCAL Visual Object Classes Challenge (2005-2012)

  • Challenge classes:


Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

  • Dataset size (by 2012): 11.5K training/validation images, 27K bounding boxes, 7K segmentations

http://pascallin.ecs.soton.ac.uk/challenges/VOC/

PASCAL competitions

Classification: for each of the twenty classes, predicting the presence/absence of an example of that class in the test image

Detection: predicting the bounding box and label of each object from the twenty target classes in the test image

http://pascallin.ecs.soton.ac.uk/challenges/VOC/

PASCAL competitions

Segmentation: generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise

Person layout: predicting the bounding box and label of each part of a person (head, hands, feet)

http://pascallin.ecs.soton.ac.uk/challenges/VOC/

PASCAL competitions

Action classification (10 action classes)

http://pascallin.ecs.soton.ac.uk/challenges/VOC/


Russell, Torralba, Murphy, Freeman, 2008

LabelMe Dataset

http://labelme.csail.mit.edu/

ImageNet

http://www.image-net.org/

  • The machine learning framework
  • Common datasets in computer vision
  • An example: decision tree classifiers

Today and tomorrow


20 questions game


A classic and natural model of learning. Question: will an unknown user enjoy an unknown course?

  • You: Is the course under consideration in Systems?
  • Me: Yes
  • You: Has this student taken any other Systems courses?
  • Me: Yes
  • You: Has this student liked most previous Systems courses?
  • Me: No
  • You: I predict this student will not like this course.

Goal of learner: Figure out what questions to ask, and in what order, and what to predict when you have answered enough questions

The decision tree model of learning



Recall that one of the ingredients of learning is training data

  • I'll give you (x, y) pairs, i.e., a set of (attributes, label) pairs
  • We will simplify the problem by binarizing the ratings:

➡ Treat {0, +1, +2} as "liked"
➡ Treat {-1, -2} as "hated"

Here:

  • Questions are features
  • Responses are feature values
  • Rating is the label

There are lots of possible trees to build. Can we find a good one quickly?

Learning a decision tree


Course ratings dataset


If I could ask one question, what question would I ask?

  • You want the feature that is most useful in predicting the rating of the course
  • A useful way of thinking about this is to look at the histogram of the labels for each feature

Greedy decision tree learning


What attribute is useful?

Attribute = Easy?
  • # correct on the two branches: 6 and 6, for a total of 12

Attribute = Sys?
  • # correct on the two branches: 10 and 8, for a total of 18

Picking the best attribute

Scoring every attribute this way gives # correct values of 12, 12, 18, 13, 14, and 15; the best attribute is the one with the highest score (18).
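The counting above can be written down directly: for each attribute, split the examples by the attribute's value, predict the majority label on each side, and count how many training labels that gets right. A minimal sketch over a tiny made-up ratings table (the data is illustrative, not the course dataset from the slides):

```python
from collections import Counter

def score_attribute(examples, attr):
    """# correct if we split on `attr` and predict the majority
    label on each side of the split."""
    correct = 0
    for value in set(x[attr] for x, _ in examples):
        labels = [y for x, y in examples if x[attr] == value]
        correct += Counter(labels).most_common(1)[0][1]  # majority count
    return correct

# Tiny made-up dataset: (features, label).
examples = [
    ({"easy": "Y", "sys": "N"}, "liked"),
    ({"easy": "Y", "sys": "Y"}, "hated"),
    ({"easy": "N", "sys": "N"}, "liked"),
    ({"easy": "N", "sys": "Y"}, "hated"),
]
scores = {a: score_attribute(examples, a) for a in ("easy", "sys")}
print(scores)  # "sys" perfectly predicts the label here, so it scores 4
```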

Training procedure:
1. Find the feature that leads to the best prediction on the data
2. Split the data into two sets: {feature = Y} and {feature = N}
3. Recurse on the two sets (go back to Step 1)
4. Stop when some criterion is met

When to stop?

  • When the data is unambiguous (all the labels are the same)
  • When there are no questions remaining
  • When the maximum depth is reached (e.g., a limit of 20 questions)

Testing procedure:

  • Traverse down the tree to a leaf node
  • Predict the majority label at that leaf

Decision tree training
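The training and testing procedures translate almost line-for-line into a recursive sketch. The dataset and the Y/N feature encoding are illustrative assumptions; the structure (greedy split, recurse, stop, then traverse and take the majority label) follows the slides.

```python
from collections import Counter

def majority_label(examples):
    return Counter(y for _, y in examples).most_common(1)[0][0]

def train_tree(examples, features, max_depth):
    labels = [y for _, y in examples]
    # Stopping criteria from the slides: unambiguous data,
    # no questions remaining, or maximum depth reached.
    if len(set(labels)) == 1 or not features or max_depth == 0:
        return {"leaf": majority_label(examples)}
    # Step 1: greedily pick the feature whose majority-vote split
    # predicts the most training labels correctly.
    def score(f):
        return sum(
            Counter(y for x, y in examples if x[f] == v).most_common(1)[0][1]
            for v in ("Y", "N") if any(x[f] == v for x, _ in examples))
    best = max(features, key=score)
    # Steps 2-3: split into {feature = Y} / {feature = N} and recurse.
    rest = [f for f in features if f != best]
    def branch(v):
        subset = [(x, y) for x, y in examples if x[best] == v]
        return (train_tree(subset, rest, max_depth - 1) if subset
                else {"leaf": majority_label(examples)})
    return {"feature": best, "Y": branch("Y"), "N": branch("N")}

def predict(tree, x):
    # Testing: traverse down to a leaf, answer the majority label there.
    while "leaf" not in tree:
        tree = tree[x[tree["feature"]]]
    return tree["leaf"]

# Tiny made-up dataset (features are Y/N answers).
data = [
    ({"easy": "Y", "sys": "Y"}, "hated"),
    ({"easy": "N", "sys": "Y"}, "hated"),
    ({"easy": "Y", "sys": "N"}, "liked"),
    ({"easy": "N", "sys": "N"}, "liked"),
]
tree = train_tree(data, ["easy", "sys"], max_depth=2)
print(predict(tree, {"easy": "Y", "sys": "N"}))  # -> liked
```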


Decision tree train


Decision tree test


Decision trees:

  • Underfitting: an empty decision tree

➡ Test error: ?

  • Overfitting: a full decision tree

➡ Test error: ?

Underfitting and overfitting


Model: decision tree
Parameters: learned by the algorithm
Hyperparameter: the depth of the tree to consider

  • A typical way of setting this is to use validation data
  • Usually split the data into 2/3 training and 1/3 testing

➡ Split the training set into 1/2 training and 1/2 validation
➡ Estimate the optimal hyperparameters on the validation data

Model, parameters and hyperparameters

(figure: the data split into training, validation, and testing portions)
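The split-and-validate recipe (2/3 training, 1/3 testing; training further split 1/2 train, 1/2 validation; pick the hyperparameter minimizing validation error) can be sketched end-to-end. The one-dimensional stand-in learner and the synthetic noisy data below are illustrative assumptions, not the course's decision tree implementation.

```python
import random
from collections import Counter

def fit(train, depth):
    # Stand-in depth-limited learner: partition [0, 1) into 2**depth
    # equal bins and predict the majority training label per bin.
    bins = {}
    for x, y in train:
        bins.setdefault(int(x * 2 ** depth), []).append(y)
    default = Counter(y for _, y in train).most_common(1)[0][0]
    table = {b: Counter(ys).most_common(1)[0][0] for b, ys in bins.items()}
    return depth, table, default

def error(model, data):
    depth, table, default = model
    wrong = sum(table.get(int(x * 2 ** depth), default) != y
                for x, y in data)
    return wrong / len(data)

# Synthetic labeled data: y depends on whether x > 0.5, with label noise.
rng = random.Random(0)
data = [(x, int((x > 0.5) != (rng.random() < 0.1)))
        for x in (rng.random() for _ in range(300))]

# 2/3 training, 1/3 testing; training split 1/2 train, 1/2 validation.
train_all, test = data[:200], data[200:]
train, validation = train_all[:100], train_all[100:]

# Pick the depth (hyperparameter) minimizing validation error,
# then retrain on all training data and report test error.
best_depth = min(range(1, 9), key=lambda d: error(fit(train, d), validation))
test_error = error(fit(train_all, best_depth), test)
print(best_depth, round(test_error, 3))
```

Note that the test set is touched only once, at the very end; tuning the depth on the test set instead would give an optimistically biased error estimate.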


Application: Face detection [Viola & Jones, 01]

  • Features: detect light/dark rectangles in an image

DTs in action: Face detection


Early proponents of random forests: “Joint Induction of Shape Features and Tree Classifiers”, Amit, Geman and Wilder, PAMI 1997

DTs in action: Digits classification

Features: arrangements of tags

  • A subset of all the 62 tags; common 4x4 patterns
  • Arrangements: 8 angles
  • # features: 62 x 62 x 8 = 30,752

Results: a single tree achieves 7.0% error; a combination of 25 trees achieves 0.8% error


Human pose estimation from depth in the Kinect sensor [Shotton et al. CVPR 11]

DT in action: Kinect pose estimation


Training: 3 trees, 20 deep, 300k training images per tree, 2000 training example pixels per image, 2000 candidate features θ, and 50 candidate thresholds τ per feature (Takes about 1 day on a 1000 core cluster)


(figure: ground truth vs. inferred body parts (most likely) for 1, 3, and 6 trees; average per-class accuracy improves from about 40% to 55% as the number of trees grows from 1 to 6)


Train invariance to: …

  • Record mocap: 500k frames distilled to 100k poses
  • Retarget to several models
  • Render (depth, body parts) pairs


Classify digits 3 vs. 8

  • Decision node: is pixel (x, y) 0 or 1?

Homework 5


  • Decision tree learning material is based on the CIML book by Hal Daumé III (http://ciml.info/dl/v0_9/ciml-v0_9-ch01.pdf)
  • Bias-variance figures: https://theclevermachine.wordpress.com/tag/estimator-variance/
  • Figures for the random forest classifier on the MNIST dataset: Amit, Geman and Wilder, PAMI 1997 (http://www.cs.berkeley.edu/~malik/cs294/amitgemanwilder97.pdf)
  • Figures for Kinect pose: "Real-Time Human Pose Recognition in Parts from Single Depth Images", J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, R. Moore, A. Kipman, A. Blake, CVPR 2011
  • Credit for many of these slides goes to Alyosha Efros, Svetlana Lazebnik, Hal Daumé III, Alex Berg, and others

Slide credits