ECE 6504: Advanced Topics in Machine Learning
Probabilistic Graphical Models and Large-Scale Learning
Dhruv Batra Virginia Tech
What is this class about?
Some of the most exciting developments in Machine Learning, AI, Statistics & related fields in the last 3 decades
(C) Dhruv Batra 2
First Caveat
– This should not be your first introduction to ML
– You will need a formal class; not just self-reading/Coursera
– If you took ECE 4984/5984, you’re in the right place
– If you took ECE 5524 or equivalent, see the list of topics taught in ECE 4984/5984.
Topics Covered in Intro to ML&P
Networks, Support Vector Machines, Kernels
What is this class about?
Exciting Developments
– Directed: Bayesian Networks (Bayes Nets)
– Undirected: Markov/Conditional Random Fields
– Structured Prediction

– Online learning
– Distributed learning

– Convolutional Nets
– Distributed backprop
– Dropout
Not covered in this class
What is Machine Learning?
– automatically detect patterns in data
– use the uncovered patterns to predict future data or other

– improve their performance (P)
– at some task (T)
– with experience (E)
Tasks
Supervised Learning
– Classification: x → y, y discrete
– Regression: x → y, y continuous
Unsupervised Learning
– Clustering: x → c, c a discrete ID
– Dimensionality Reduction: x → z, z continuous
Classification
x → y, y discrete
Speech Recognition
Slide Credit: Carlos Guestrin
Machine Translation
Figure Credit: Kevin Gimpel
Object/Face detection
– Canon, Sony, Fuji, …
Slide Credit: Noah Snavely, Steve Seitz, Pedro Felzenszwalb
Reading a noun (vs verb)
[Rustandi et al., 2005]
Slide Credit: Carlos Guestrin
Regression
x → y, y continuous
Stock market
Weather Prediction
Temperature
Slide Credit: Carlos Guestrin
Need for Joint Prediction
Handwriting recognition
Character recognition, e.g., kernel SVMs
Handwriting recognition 2
[Smyth et al., 1994]
Local Ambiguity
slide credit: Fei-Fei Li, Rob Fergus & Antonio Torralba
Joint Prediction
– Classification: x1, x2, …, xn → y1, y2, …, yn (discrete)
– Regression: x1, x2, …, xn → y1, y2, …, yn (continuous)
How many parameters?
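To make the question concrete: an unrestricted joint table over n discrete outputs with k states each has k^n − 1 free parameters, while a model that treats all outputs as independent needs only n(k − 1). The sketch below just evaluates these two standard counts. The function names are illustrative and are not taken from the course materials.

```python
def full_joint_params(n: int, k: int = 2) -> int:
    """Free parameters in an unrestricted joint table over n variables, k states each."""
    # One probability per joint configuration, minus 1 because they sum to 1.
    return k ** n - 1


def independent_params(n: int, k: int = 2) -> int:
    """Free parameters if all n variables are modeled as mutually independent."""
    # Each variable needs k - 1 numbers for its own marginal.
    return n * (k - 1)


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, full_joint_params(n), independent_params(n))
```

The gap between the two counts grows exponentially in n, which is one way to see why structure between these two extremes is attractive.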
Probabilistic Graphical Models
AI in the last 10-20 years
– Graph Theory + Probability
probability distributions
– Exploit conditional independencies
– naïve Bayes
– logistic regression
– Many more …
Types of PGMs
Graphical Models
– Directed: Directed Factor Graph; Bayesian Networks
  – Dynamic Bayes nets: Markov chains, HMM, LDS
  – Latent variable models: discrete (mixture models, clustering); continuous (dimensionality reduction)
– Chain Graphs
– Undirected Graphs: Markov network
  – Input dependent: CRF
  – Pairwise: Boltzmann machine (discrete), Gaussian process (continuous)
– Clique Graphs: Junction tree, Clique tree
– Factor Graphs
Image Credit: David Barber
Main Issues in PGMs
– How do we store P(X1, X2, …, Xn)?
– What does my model mean/imply/assume? (Semantics)

– How do I answer questions/queries with my model? For example:
  – Marginal Estimation: P(X5 | X1, X4)
  – Most Probable Explanation: argmax P(X1, X2, …, Xn)

– How do we learn the parameters and structure of P(X1, X2, …, Xn) from data?
– Which model is right for my data?
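As a rough illustration of the two query types named above, the sketch below answers a conditional marginal and a most-probable-explanation query by brute force on a dense joint table. Everything here is an assumption made for illustration: the table covers three binary variables with random entries, and the helper names are hypothetical. It is not the inference machinery taught in the course, only the naive baseline.

```python
import numpy as np

# Hypothetical joint distribution over 3 binary variables (X1, X2, X3),
# stored as a dense table. The numbers are random, for illustration only.
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2))
joint /= joint.sum()  # normalize so the entries sum to 1


def conditional_marginal(joint, query, evidence):
    """Return P(X_query | evidence) by slicing and summing the joint table.

    query: axis index of the query variable (must not appear in evidence).
    evidence: dict mapping axis index -> observed value.
    """
    # Fix the observed variables at their observed values.
    index = tuple(evidence.get(axis, slice(None)) for axis in range(joint.ndim))
    sliced = joint[index]
    # Axes that survive the slicing, in their original order.
    remaining = [axis for axis in range(joint.ndim) if axis not in evidence]
    # Sum out every remaining variable except the query.
    sum_axes = tuple(i for i, axis in enumerate(remaining) if axis != query)
    unnormalized = sliced.sum(axis=sum_axes)
    return unnormalized / unnormalized.sum()


def most_probable_explanation(joint):
    """Return the single most probable joint assignment (argmax of the table)."""
    return np.unravel_index(np.argmax(joint), joint.shape)


if __name__ == "__main__":
    print(conditional_marginal(joint, query=2, evidence={0: 1}))  # P(X3 | X1 = 1)
    print(most_probable_explanation(joint))
```

Both functions touch every entry of a table whose size is exponential in the number of variables, so this approach only works for very small models.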
Key Ingredient
– Encoded in the graph structure
Application: Evolutionary Biology
[Friedman et al.]
Application: Computer Vision
Chain model (hidden Markov model) Interpreting sign language sequences
Image Credit: Simon JD Prince
Application: Speech
Application: Sensor Network
Image Credit: Carlos Guestrin & Erik Sudderth
Application: Medical Diagnosis
Image Credit: Erik Sudderth
Application: Coding
Observed Bits True Bits Parity Constraints
Application: Protein Folding
– http://youtu.be/bTlNNFQxs_A?t=175
– http://www.youtube.com/watch?v=lGYJyur4FUA
Application: Computer Vision
Image Credit: Simon JD Prince
Tree model Parsing the human body
Application: Computer Vision
Image Credit: Simon JD Prince
Grid model Markov random field (blue nodes) Semantic segmentation
Application: Computer Vision
– [Hoiem et al. IJCV ’07], [Hoiem et al. CVPR ’08], [Saxena PAMI ’08], [Ramalingam et al. CVPR ’08].
– [Berg et al. CVPR ’04, PhD thesis ’07], [Gallagher et al. CVPR ’08].
Mildred and Lisa
Lisa Mildred
Application: Computer Vision
[Figure: Probability of Birth Year. x-axis: Birth Year (1900 to 2000); y-axis: Probability (0.01 to 0.07); curves for the names Mildred, Lisa, Nora, Peyton, Linda]
– [Berg et al. CVPR ’04, PhD thesis ’07], [Gallagher et al. CVPR ’08].
President George W. Bush makes a statement in the Rose Garden while Secretary of Defense Donald Rumsfeld looks on, July 23, 2003. Rumsfeld said the United States would release graphic photographs of the dead sons of Saddam Hussein to prove they were killed by American troops. Photo by Larry Downing/Reuters
British director Sam Mendes and his partner actress Kate Winslet arrive at the London premiere of ’The Road to Perdition’, September 18, 2002. The film stars Tom Hanks as a Chicago hit man who has a separate family life and co-stars Paul Newman and Jude Law. REUTERS/Dan Chung
Application: Computer Vision
And many
many
Course Information
– dbatra@vt
– Office Hours: Fri 1-2pm
– Location: 468 Whittemore
Syllabus
– Representation: Directed Acyclic Graphs (DAGs), Conditional Probability Tables (CPTs), d-Separation, v-structures, Markov Blanket, I-Maps
– Parameter Learning: MLE, MAP, EM
– Structure Learning: Chow-Liu, Decomposable scores, hill climbing
– Inference: Marginals, MAP/MPE, Variable Elimination

– Representation: Junction trees, Factor graphs, treewidth, Local Markov Assumptions, Moralization, Triangulation
– Inference: Belief Propagation, Message Passing, Linear Programming Relaxations, Dual-Decomposition, Variational Inference, Mean Field
– Parameter Learning: MLE, gradient descent
– Structured Prediction: Structured SVMs, Cutting-Plane training

– Online learning: perceptrons, stochastic (sub-)gradients
– Distributed Learning: Dual Decomposition, Alternating Direction Method
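To give a flavor of the online-learning topic listed above, here is a minimal sketch of the classic perceptron, which processes one example at a time. Its update can be read as a stochastic subgradient step with step size 1 on the loss max(0, −y(w·x + b)). The function name, the toy data, and the epoch limit are assumptions for illustration and do not come from the course materials.

```python
import numpy as np


def perceptron(X, y, epochs=10):
    """Online perceptron. X has shape (m, d); y holds labels in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            # Update only when the current example is misclassified
            # (or lies exactly on the decision boundary).
            if y_i * (w @ x_i + b) <= 0:
                w += y_i * x_i
                b += y_i
                mistakes += 1
        if mistakes == 0:  # a full pass with no mistakes: stop early
            break
    return w, b


if __name__ == "__main__":
    # Toy linearly separable data, made up for this example.
    X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    w, b = perceptron(X, y)
    print(w, b, np.sign(X @ w + b))
```

Because each update looks at a single example, the same loop works when data arrives as a stream rather than as a fixed array.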
Syllabus
and then some.
and implementations
Prerequisites
– Classifiers, regressors, loss functions, MLE, MAP
– Matrix multiplication, eigenvalues, positive semi-definiteness…
– Nodes, edges, trees, cycles, depth-first search
– Dynamic programming, basic data structures, complexity…
– Matlab for HWs. Your language of choice for project
Textbook
– We will assign readings from online/free books, papers, etc
– [On Library Reserve] Probabilistic Graphical Models: Principles and Techniques. Daphne Koller and Nir Friedman.
– [Free PDF from author] Bayesian Reasoning and Machine Learning. David Barber. http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.HomePage
– [Free PDF from authors] Graphical Models, Exponential Families, and Variational Inference. Martin J. Wainwright and Michael I. Jordan.
Grading
– First one goes out Jan 30
Start early, Start early, Start early, Start early
– Projects done individually, or groups of two students
– Take home
– 3-5 days

– Contribute to class discussions on Scholar
– Ask questions, answer questions
– Reading assigned papers
Re-grading Policy
– Within 3 days of receiving grades: see me
– The goal is understanding the material and making progress towards our research.
Homeworks
– Due in 2 weeks via Scholar (Assignments tool)
– Theory + Implementation (similar format as 4984/5984)
– HW1 out 1/30
– 5 late days for the semester
– After late days are used up:
Project
– Chance to try Graphical Models
– Encouraged to apply to your research (computer vision, communication, UAVs, computational biology…)
– Must be done this semester. No double counting.
– Extra credit for shooting for a publication
– Application/Survey
interest
– Formulation/Development
– Theory
– We will give a list of ideas and pointers to datasets/algorithms/code
– Mentor teams and give feedback.
Spring 2013 Projects
– Gordon Christie & Ujwal Krothpalli, Grad Students
– http://youtu.be/VFPAHY7th9A
Spring 2013 Projects
– Vireshwar Kumar & Dhiraj Amuru, Grad Students
Collaboration Policy
– Only on HW and project (not allowed in exams).
– You may discuss the questions
– Each student writes their own answers
– List on your homework everyone with whom you collaborated
– Each student must write their own code for the programming part

– Neither ethical nor in your best interest
– Always credit your sources
– Don’t cheat. We will find out.
Audit / Sit in
– ECE Audit Request form
– Deadline: Jan 27
– Talk to instructor.
Communication Channels
– No direct emails to Instructor unless private information
– Instructor can mark/provide answers to everyone
– Class participation credit for answering questions!
– No posting hints/answers. We will monitor.

– https://scholar.vt.edu/portal/site/s14ece6504
– https://filebox.ece.vt.edu/~s14ece6504/
Other Relevant Classes
– Instructor: X Deng – Offered: Spring
– Instructor: BM Fraticelli – Offered: Spring
– Instructor: MH Farhood – Offered: Spring
– Instructor: Devi Parikh – Offered: Spring
Guest Lectures
– Graphical Models for Neuroscience
– Variational Inference
Misc Notes
– Slides + notes available on scholar
– On par with Spring 2013 4984/5984
– Significantly more than Fall 2013 4984
– More than Fall 2013 5984

– Focus on depth, not breadth
– We will go as slow as necessary and bearable :)
Plan for Today
Todo
– Probability Refresher: Barber Chap 1
– Graph Theory Refresher: Barber Chap 2