Department of Computer Science CSCI 5622: Machine Learning Chenhao - PowerPoint PPT Presentation

Mar 16, 2024

Department of Computer Science
CSCI 5622: Machine Learning
Chenhao Tan
Lecture 18: Clustering
Slides adapted from Jordan Boyd-Graber, Chris Ketelsen

Learning objectives
• Learn about general clustering
• Learn about the K-Means algorithm
• Learn about Gaussian Mixture Models

Supervised learning: Data: X; Labels: Y
Unsupervised learning: Data: X; Latent structure: Z

Clustering
• One important unsupervised method is clustering
• Goal: Organize data into classes

Clustering applications – Microarray Gene Expression data
From: "Skin layer-specific transcriptional profiles in normal and recessive yellow (Mc1re/Mc1re) mice" by April and Barsh in Pigment Cell Research (2006)

Clustering applications – Medical Imaging

Clustering applications – Community detection

News Media

Clustering
• One important unsupervised method is clustering
• Goal: Organize data into classes
• Classes are hard to define
• Different data representations may lead to different clusterings
• Data have high in-class similarity
• Data have low out-of-class similarity

Clustering - Similarity

K-Means
• Simplest clustering method
• Iterative in nature
• Reasonably fast
• Very popular in practice (though with more bells and whistles)
• Requires real-valued data
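
The slides that walk through the algorithm are figures and did not survive in this transcript, so the following is only a minimal sketch of the standard K-Means iteration (alternate between assigning points to the nearest centroid and moving each centroid to the mean of its points). The function names, the toy data, and the choice to initialise from randomly sampled data points are assumptions made for illustration, not taken from the lecture.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two real-valued points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iters=100, seed=0):
    rng = random.Random(seed)
    # Assumption: initialise centroids as k distinct data points.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing changed, so the iteration has converged
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Loss: sum of squared distances from each point to its centroid.
    loss = sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignments))
    return centroids, assignments, loss

# Hypothetical toy data: two well-separated groups in the plane.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, assignments, loss = kmeans(points, k=2)
print(assignments, round(loss, 3))
```

The update step takes a mean, which is why the method needs real-valued data.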

More K-means
• Animations: http://shabal.in/visuals/kmeans/4.html

K-Means in numbers

K-Means
• Weaknesses
    • Doesn't really work with categorical data
    • Usually only converges to a local minimum
    • Have to determine the number of clusters
    • Can be sensitive to outliers
    • Only generates convex clusters

K-means - Weaknesses
• Doesn't really work with categorical data
• Fix: Do K-Modes instead
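
The slide only names K-Modes, so here is a rough sketch of the usual idea under stated assumptions: distance becomes the number of mismatching attributes, and each cluster centre becomes the per-attribute most frequent category (the mode) instead of the mean. The helper names, the toy records, and the deterministic "first k distinct records" initialisation are illustrative choices, not something the lecture specifies.

```python
from collections import Counter

def mismatches(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))

def kmodes(records, k, n_iters=100):
    # Assumption: start from the first k distinct records so the example is
    # reproducible; random initialisation is the more common choice.
    modes = list(dict.fromkeys(records))[:k]
    assignments = None
    for _ in range(n_iters):
        # Assignment step: nearest mode under the mismatch count.
        new_assignments = [
            min(range(k), key=lambda j: mismatches(r, modes[j])) for r in records
        ]
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: per-attribute most frequent category replaces the mean.
        for j in range(k):
            members = [r for r, a in zip(records, assignments) if a == j]
            if members:
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0] for column in zip(*members)
                )
    return modes, assignments

# Hypothetical categorical records: (colour, size, shape).
records = [
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "round"),
    ("green", "large", "square"),
]
modes, assignments = kmodes(records, k=2)
print(modes)        # [('red', 'small', 'round'), ('blue', 'large', 'square')]
print(assignments)  # [0, 1, 0, 0, 1, 1]
```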

K-means - Weaknesses
• Usually only converges to a local minimum
• Fix: Do several runs with random initializations and choose the best
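
One way to read "several runs ... choose the best" is to rerun K-Means from different random seeds and keep the run with the lowest loss. The sketch below assumes the loss is the sum of squared distances to the assigned centroid; `kmeans_once`, `kmeans_best_of`, the restart count, and the data are all hypothetical names and values chosen for the example.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_once(points, k, seed, n_iters=100):
    """One K-Means run from a random initialisation; returns (loss, centroids)."""
    centroids = random.Random(seed).sample(points, k)
    assignments = None
    for _ in range(n_iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new == assignments:
            break
        assignments = new
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    loss = sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignments))
    return loss, centroids

def kmeans_best_of(points, k, n_restarts=10):
    # Each restart uses a different seed, hence a different initialisation.
    runs = [kmeans_once(points, k, seed) for seed in range(n_restarts)]
    best = min(runs, key=lambda run: run[0])  # lowest loss wins
    return best, [run[0] for run in runs]

# Hypothetical data: three small groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 0.0), (10.0, 1.0), (11.0, 0.0),
          (5.0, 9.0), (5.0, 10.0), (6.0, 9.0)]
(best_loss, best_centroids), all_losses = kmeans_best_of(points, k=3)
print("losses seen across restarts:", sorted(set(round(l, 3) for l in all_losses)))
print("best loss:", round(best_loss, 3))
```

If the restarts land in different local minima, the printed set contains more than one value, and the smallest one is what gets kept.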

K-means - Weaknesses
• Have to determine the number of clusters
• Fix: Use the elbow method: run K-Means for different values of k and look at the loss function
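
The elbow method can be sketched as a loop over k that records the loss for each value. This version assumes the loss is the within-cluster sum of squared distances and takes the best of a few random restarts per k so that a bad initialisation does not distort the curve; the names `kmeans_loss` and `elbow_curve`, the range of k, and the data are hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_loss(points, k, seed, n_iters=100):
    """Run K-Means once from a random initialisation and return the final loss."""
    centroids = random.Random(seed).sample(points, k)
    assignments = None
    for _ in range(n_iters):
        new = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        if new == assignments:
            break
        assignments = new
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignments))

def elbow_curve(points, max_k, n_restarts=10):
    # For each k, keep the lowest loss seen over a few random restarts.
    return {
        k: min(kmeans_loss(points, k, seed) for seed in range(n_restarts))
        for k in range(1, max_k + 1)
    }

# Hypothetical data: three small groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 0.0), (10.0, 1.0), (11.0, 0.0),
          (5.0, 9.0), (5.0, 10.0), (6.0, 9.0)]
curve = elbow_curve(points, max_k=6)
for k, loss in curve.items():
    print(k, round(loss, 2))
```

The "elbow" is the value of k after which the printed loss stops dropping sharply; reading it off is a judgment call rather than a formula.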

Gaussian Mixture Models
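
The content of the Gaussian Mixture Model slides is not preserved in this transcript. Assuming the standard generative story (first pick a component from the mixing weights, then draw the observation from that component's Gaussian), a one-dimensional sketch might look like the following. The function name, the two-component setup, and all parameter values are made up for illustration.

```python
import random

def sample_gmm(n, weights, means, stds, seed=0):
    """Draw n (component, value) pairs from a 1-D Gaussian mixture."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Step 1: choose the latent component z according to the mixing weights.
        z = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw x from the Gaussian that belongs to component z.
        x = rng.gauss(means[z], stds[z])
        samples.append((z, x))
    return samples

# Hypothetical mixture: 30% of the mass near -5, 70% near +5.
samples = sample_gmm(5000, weights=[0.3, 0.7], means=[-5.0, 5.0], stds=[1.0, 1.0])
share_of_component_1 = sum(z for z, _ in samples) / len(samples)
print(round(share_of_component_1, 2))
```

In clustering, only the x values are observed; the component z plays the role of the latent structure Z from the supervised vs. unsupervised slide.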

Recap
• K-means is the most commonly used clustering algorithm
• We learned the Gaussian Mixture Model's generative story
• We will learn the EM algorithm next week
