

SLIDE 1

EM, K-Means, GNG

4-6-16

SLIDE 2

Reading Quiz

Which of the following can be considered an instance of the EM algorithm?

a) Agglomerative clustering
b) Divisive clustering
c) K-means clustering
d) Growing neural gas

SLIDE 3

EM algorithm

E step: “expectation” … terrible name

  • Classify the data using the current model.

M step: “maximization” … slightly less terrible name

  • Generate the best model using the current classification of the data.

Initialize the model, then alternate E and M steps until convergence.
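In code, the alternation is just a loop. Below is a minimal sketch (not from the slides): e_step and m_step are hypothetical placeholders for the problem-specific steps, and convergence is detected when the E step stops changing the classification. It assumes e_step returns a plain list so an equality test works.

    # Minimal sketch of the generic EM loop described above.
    # `e_step` classifies the data under the current model;
    # `m_step` refits the model to that classification.
    def em(data, model, e_step, m_step, max_iters=100):
        assignment = None
        for _ in range(max_iters):
            new_assignment = e_step(data, model)   # E: classify with current model
            if new_assignment == assignment:       # converged: classification stable
                break
            assignment = new_assignment
            model = m_step(data, assignment)       # M: best model for the classification
        return model, assignment

The k-means and Gaussian-mixture slides that follow are concrete instances of exactly this loop.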

SLIDE 4

K-means algorithm

Model: k clusters, each represented by a centroid.

E step:

  • Assign each point to the closest centroid.

M step:

  • Move each centroid to the mean of the points assigned to it.

Convergence: an E step in which no point's assignment changed.
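A minimal NumPy sketch of these E/M steps, assuming data is an (n, d) array and centroids is a (k, d) float array that has already been initialized (initialization options are on the next slide):

    import numpy as np

    def kmeans(data, centroids, max_iters=100):
        assignment = None
        for _ in range(max_iters):
            # E step: assign each point to the closest centroid.
            dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
            new_assignment = dists.argmin(axis=1)
            if assignment is not None and np.array_equal(new_assignment, assignment):
                break  # convergence: an E step changed no assignments
            assignment = new_assignment
            # M step: move each centroid to the mean of its assigned points.
            for j in range(len(centroids)):
                members = data[assignment == j]
                if len(members) > 0:  # leave a centroid alone if its cluster is empty
                    centroids[j] = members.mean(axis=0)
        return centroids, assignment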

SLIDE 5

Initializing k-means

Reasonable options:

1) Start with a random E step.

  • Randomly assign each point to a cluster in {1, 2, …, k}.

2) Start with a random M step.

  a) Pick random centroids within the maximum range of the data.
  b) Pick random data points to use as initial centroids.
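Sketches of all three options, assuming NumPy and a numpy.random.Generator named rng; the function names are illustrative, not from the slides:

    import numpy as np

    def random_e_step(n_points, k, rng):
        # Option 1: randomly assign each point to a cluster in {0, ..., k-1}.
        return rng.integers(0, k, size=n_points)

    def random_centroids_in_range(data, k, rng):
        # Option 2a: random centroids within the range of the data.
        lo, hi = data.min(axis=0), data.max(axis=0)
        return rng.uniform(lo, hi, size=(k, data.shape[1]))

    def random_data_points(data, k, rng):
        # Option 2b: k distinct data points as initial centroids.
        idx = rng.choice(len(data), size=k, replace=False)
        return data[idx].astype(float)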

SLIDE 6

K-means in action

SLIDE 7

Other examples of EM

  • Naive Bayes soft clustering (from the reading)
  • Gaussian mixture model clustering
SLIDE 8

Gaussian mixture models

A Gaussian distribution is a multivariate generalization of a normal distribution (the classic bell curve).

A Gaussian mixture is a distribution comprised of several independent Gaussians. If we model our data as a Gaussian mixture, we’re saying that each data point was a random draw from one of several Gaussian distributions (but we may not know which).

SLIDE 9

EM for Gaussian mixture models

Model: data drawn from a mixture of k Gaussians.

E step:

  • Compute the (log) likelihood of the data.
    ○ Each point’s probability of being drawn from each Gaussian.

M step:

  • Update the mean and covariance of each Gaussian.
    ○ Weighted by how responsible that Gaussian was for each data point.
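A condensed sketch of one E/M round, using SciPy's multivariate_normal for the per-Gaussian densities. It assumes data is an (n, d) array and means, covs, and weights are already initialized (lists of length k); numerical safeguards such as covariance regularization are omitted.

    import numpy as np
    from scipy.stats import multivariate_normal

    def em_round(data, means, covs, weights):
        n, k = len(data), len(means)
        # E step: each point's weighted density under each Gaussian.
        resp = np.zeros((n, k))
        for j in range(k):
            resp[:, j] = weights[j] * multivariate_normal.pdf(data, means[j], covs[j])
        log_likelihood = np.log(resp.sum(axis=1)).sum()
        resp /= resp.sum(axis=1, keepdims=True)  # normalize to responsibilities
        # M step: refit each Gaussian, weighted by its responsibilities.
        for j in range(k):
            r = resp[:, j]
            weights[j] = r.mean()
            means[j] = (r[:, None] * data).sum(axis=0) / r.sum()
            diff = data - means[j]
            covs[j] = (r[:, None, None] * diff[:, :, None] * diff[:, None, :]).sum(axis=0) / r.sum()
        return means, covs, weights, log_likelihood

Repeating em_round until the returned log-likelihood stops improving gives the convergence check for this instance of EM.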

SLIDE 10

How do we pick k?

There’s no hard rule.

  • Sometimes the application for which the clusters will be used dictates k.
  • If k can be flexible, then we need to consider the tradeoffs:

    ○ Higher k will always decrease the error (increase the likelihood).
    ○ Lower k will always produce a simpler model.
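A quick illustration of the first tradeoff, reusing the kmeans and random_data_points sketches from the earlier slides (the data here is synthetic, purely for demonstration):

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 2))
    for k in range(1, 8):
        centroids, assignment = kmeans(data, random_data_points(data, k, rng))
        error = ((data - centroids[assignment]) ** 2).sum()
        print(k, error)  # error shrinks as k grows, while the model gets more complex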

SLIDE 11

Growing neural gas

0) Start with two random connected nodes, then repeat 1...9:

1) Pick a random data point.
2) Find the two closest nodes to the data point.
3) Increment the age of all edges from the closest node.
4) Add the squared distance to the error of the closest node.
5) Move the closest node and all of its neighbors toward the data point.
   • Move the closest node more than its neighbors.
6) Connect the two closest nodes or reset their edge age.
7) Remove old edges; if a node is isolated, delete it.
8) Every λ iterations, add a new node.
   • Between the highest-error node and its highest-error neighbor.
9) Decay all errors.
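A compact sketch of steps 0-9. The parameter values (eps_b, eps_n, max_age, lam, alpha, decay) are conventional choices, not from the slides, and the node-deletion part of step 7 is omitted to keep node indices stable; treat the whole routine as illustrative.

    import numpy as np

    def gng(data, n_iters=10000, eps_b=0.05, eps_n=0.006,
            max_age=50, lam=100, alpha=0.5, decay=0.995, max_nodes=100):
        rng = np.random.default_rng(0)
        nodes = [data[rng.integers(len(data))].astype(float) for _ in range(2)]  # 0)
        error = [0.0, 0.0]
        edges = {(0, 1): 0}  # maps an edge (i, j) with i < j to its age
        for it in range(1, n_iters + 1):
            x = data[rng.integers(len(data))]                  # 1) random data point
            d = [np.sum((node - x) ** 2) for node in nodes]
            s1, s2 = (int(i) for i in np.argsort(d)[:2])       # 2) two closest nodes
            for e in edges:                                    # 3) age s1's edges
                if s1 in e:
                    edges[e] += 1
            error[s1] += d[s1]                                 # 4) accumulate error
            nodes[s1] += eps_b * (x - nodes[s1])               # 5) move the winner...
            for a, b in edges:
                if s1 in (a, b):
                    nb = b if a == s1 else a
                    nodes[nb] += eps_n * (x - nodes[nb])       # ...its neighbors less
            edges[tuple(sorted((s1, s2)))] = 0                 # 6) connect / reset age
            edges = {e: age for e, age in edges.items() if age <= max_age}  # 7)
            if it % lam == 0 and len(nodes) < max_nodes:       # 8) insert a node
                q = max(range(len(nodes)), key=lambda i: error[i])
                nbrs = [b if a == q else a for a, b in edges if q in (a, b)]
                if nbrs:
                    f = max(nbrs, key=lambda i: error[i])
                    nodes.append((nodes[q] + nodes[f]) / 2)    # midpoint of q and f
                    error[q] *= alpha
                    error[f] *= alpha
                    error.append(error[q])
                    r = len(nodes) - 1
                    edges.pop(tuple(sorted((q, f))), None)
                    edges[(q, r)] = 0
                    edges[(f, r)] = 0
            error = [e * decay for e in error]                 # 9) decay all errors
        return np.array(nodes), edges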

SLIDE 12

Adjusting nodes based on one data point

SLIDE 13

Adjusting nodes based on one data point

(Figure annotations: these edges get aged; this node’s error increases.)

SLIDE 14

Every λ iterations, add a new node

(Figure annotations: highest-error node; highest-error neighbor.)

SLIDE 15

Growing neural gas

0) Start with two random connected nodes, then repeat 1...9:

1) Pick a random data point.
2) Find the two closest nodes to the data point.
3) Increment the age of all edges from the closest node.
4) Add the squared distance to the error of the closest node.
5) Move the closest node and all of its neighbors toward the data point.
   • Move the closest node more than its neighbors.
6) Connect the two closest nodes or reset their edge age.
7) Remove old edges.
8) Every λ iterations, add a new node.
   • Between the highest-error node and its highest-error neighbor.
9) Decay all errors.

SLIDE 16

Growing neural gas in action

SLIDE 17

Discussion question

What unsupervised learning problem is growing neural gas solving? Is it clustering? Is it dimensionality reduction? Is it something else?