SLIDE 1

Supervising Unsupervised Learning

Vikas K. Garg & Adam Kalai


SLIDE 2

Clustering problem in isolation:

[Figure: toy 1-D datasets — temperature readings (1˚, 2˚, 14˚, 15˚, 34˚, 65˚) and sound levels (0 dB, 3 dB)]

Clustering repository: How many clusters?


SLIDE 3

Contributions

- Introduce a principled framework to evaluate unsupervised settings
- Show how to transfer knowledge across heterogeneous datasets
  - different sizes, dimensions, representations, domains...
- Design provably efficient algorithms
  - select the clustering algorithm and the number of clusters, determine the threshold in single-linkage clustering, remove outliers, recycle problems
- Make good meta-clustering possible
  - introduce the meta-scale-invariance property; show how to circumvent Kleinberg’s impossibility result
- Automate deep feature learning across very small datasets
  - encode diverse small data effectively into big data; perform non-trivial zero-shot learning


SLIDE 4

General approach

- Define a meta-distribution µ over all problems in the universe.
- Each training sample is a dataset drawn i.i.d. from µ.
- Learn a mapping from an intrinsic measure to an extrinsic measure.
- The intrinsic measure avoids labels and abstracts away heterogeneity.
- Each test problem is drawn from µ, but its labels are hidden.
- Compute the intrinsic measure on the test problem and predict its extrinsic quality.
- Encode the covariance of small datasets for deep zero-shot learning.

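The recipe above can be sketched in a few lines. This is an illustrative toy, not the paper's exact construction: the synthetic `make_problem` generator, k-means as the clustering routine, silhouette as the intrinsic measure, and a linear regressor as the learned mapping are all assumptions for concreteness.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.metrics import adjusted_rand_score, silhouette_score

rng = np.random.default_rng(0)

def make_problem(n_clusters, dim, n=60):
    """Draw one labeled clustering problem (one 'sample' from µ)."""
    centers = rng.normal(scale=5.0, size=(n_clusters, dim))
    y = rng.integers(n_clusters, size=n)
    X = centers[y] + rng.normal(size=(n, dim))
    return X, y

# Training repository: datasets of different sizes and dimensions.
intrinsic, extrinsic = [], []
for k, d in [(2, 2), (3, 5), (4, 3), (5, 4)]:
    X, y = make_problem(k, d)
    pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    intrinsic.append([silhouette_score(X, pred)])   # label-free measure
    extrinsic.append(adjusted_rand_score(y, pred))  # uses the hidden labels

mapping = LinearRegression().fit(intrinsic, extrinsic)

# Test problem drawn from µ: labels are hidden, so we can only compute
# the intrinsic measure and predict the extrinsic quality from it.
X_test, _ = make_problem(3, 6)
pred = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_test)
predicted_quality = mapping.predict([[silhouette_score(X_test, pred)]])[0]
```

The key point the sketch preserves: the intrinsic measure never touches labels, so it transfers to test problems whose labels are hidden.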

SLIDE 5

Number of clusters

Summary:
- Run the k-means algorithm with different k on each training dataset.
- Use the Silhouette Index (SI) as the intrinsic measure.
- Use the Adjusted Rand Index (ARI) as the extrinsic measure.

[Figure: "Selecting the number of clusters" — average ARI (≈0.1–0.12) vs. number of training datasets (40–300); curves: Silhouette baseline vs. Ours]
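A minimal toy version of this summary, assuming a linear regressor from (SI, k) to ARI; the `blobs` generator and the candidate range k ∈ {2, …, 5} are invented for illustration and are not from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.metrics import adjusted_rand_score, silhouette_score

rng = np.random.default_rng(1)

def blobs(k, n=90, d=2):
    """Toy labeled clustering problem with k Gaussian blobs."""
    y = rng.integers(k, size=n)
    return rng.normal(scale=4.0, size=(k, d))[y] + rng.normal(size=(n, d)), y

feats, targets = [], []
for true_k in (2, 3, 4):              # labeled training repository
    X, y = blobs(true_k)
    for k in range(2, 6):             # candidate numbers of clusters
        pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        feats.append([silhouette_score(X, pred), k])    # intrinsic (SI)
        targets.append(adjusted_rand_score(y, pred))    # extrinsic (ARI)
reg = LinearRegression().fit(feats, targets)

def predicted_ari(X, k):
    pred = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    return reg.predict([[silhouette_score(X, pred), k]])[0]

# Test problem with hidden labels: pick k maximizing the predicted ARI.
X_test, _ = blobs(3)
best_k = max(range(2, 6), key=lambda k: predicted_ari(X_test, k))
```

Selecting k by predicted ARI rather than raw silhouette is what lets the training repository correct the intrinsic measure's biases.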

SLIDE 6

Clustering algorithm (assume fixed k for simplicity)

Summary:
- Run different algorithms to obtain k clusters and compute SI.
- Form a feature vector from SI and dataset-specific features (e.g., max and min singular values, size, dimensionality).
- Use the Adjusted Rand Index (ARI) as the extrinsic measure.

[Figure: "Performance of different algorithms" — ARI vs. number of training datasets (40–220); curves: Ours, KMeans, KMeans-N, Ward, Ward-N, Average, Average-N, Complete, Complete-N, Spectral, Spectral-N]
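A hedged sketch of the same idea with only two candidate algorithms (k-means and Ward linkage) and a ridge regressor; the feature set mirrors the slide's examples (SI, extreme singular values, size, dimensionality), but the generator and everything else is illustrative.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.linear_model import Ridge
from sklearn.metrics import adjusted_rand_score, silhouette_score

rng = np.random.default_rng(2)
K = 3  # fixed number of clusters, as assumed on the slide

ALGOS = {
    "kmeans": lambda X: KMeans(n_clusters=K, n_init=10,
                               random_state=0).fit_predict(X),
    "ward": lambda X: AgglomerativeClustering(n_clusters=K).fit_predict(X),
}

def features(X, pred):
    """SI plus simple dataset-specific features."""
    s = np.linalg.svd(X, compute_uv=False)
    return [silhouette_score(X, pred), s.max(), s.min(), len(X), X.shape[1]]

def problem(d, n=80):
    y = rng.integers(K, size=n)
    return rng.normal(scale=4.0, size=(K, d))[y] + rng.normal(size=(n, d)), y

feats, targets = [], []
for d in (2, 3, 4, 5):                      # labeled training repository
    X, y = problem(d)
    for run in ALGOS.values():
        pred = run(X)
        feats.append(features(X, pred))
        targets.append(adjusted_rand_score(y, pred))
reg = Ridge().fit(feats, targets)

# Labels hidden at test time: choose the algorithm with highest predicted ARI.
X_test, _ = problem(3)
chosen = max(ALGOS, key=lambda name:
             reg.predict([features(X_test, ALGOS[name](X_test))])[0])
```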

SLIDE 7

Fraction of outliers

Summary:
- Remove points with large norms, cluster the remaining points, and compute SI.
- Put the removed points back into clusters, and compute ARI.
- Find the candidate fraction that performs best on the test set.
- Extensions are possible that customize the fraction for each test set.

[Figure: "Performance with outlier removal" — average ARI vs. number of training datasets (50–300); curves: candidate outlier fractions 0%–5%]
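The first two steps above might be sketched like this, under simplifying assumptions (2-D data, k = 2, a single candidate fraction); putting the removed points back is done here by nearest-center assignment, which is one plausible reading of the slide.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(size=(95, 2)),
               rng.normal(scale=15.0, size=(5, 2))])  # ~5% heavy outliers

def cluster_without_outliers(X, frac, k=2):
    """Drop the fraction `frac` of points with the largest norms,
    cluster the rest, then assign every point to its nearest center."""
    norms = np.linalg.norm(X, axis=1)
    keep = np.argsort(norms)[: int(round(len(X) * (1 - frac)))]
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[keep])
    # Put every point, including the removed ones, back into a cluster.
    return km.predict(X)

labels = cluster_without_outliers(X, frac=0.05)
```

In the full pipeline, this function would be run once per candidate fraction, with SI on the retained points as the intrinsic measure.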

SLIDE 8

Deep learning a binary similarity function

Summary:
- Sample pairs of examples from each small dataset.
- For each pair, also include covariance features specific to its dataset.
- Label a pair 1 if it comes from the same cluster, 0 otherwise.
- Train a deep net classifier on all the pairs together.
- Predict whether a test pair comes from the same cluster or not.

[Figure: "Average binary similarity prediction accuracy" (≈0.5–0.8) on the internal test (IT) and external test (ET) sets; bars: Ours vs. Majority baseline]
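A rough stand-in for this pipeline, with scikit-learn's `MLPClassifier` playing the role of the deep net and raw covariance entries as the dataset-specific features; the generator and all sizes are invented for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(4)

def small_dataset(d=2, n=30, k=2):
    """One small labeled dataset with k clusters."""
    y = rng.integers(k, size=n)
    return rng.normal(scale=6.0, size=(k, d))[y] + rng.normal(size=(n, d)), y

pairs, labels = [], []
for _ in range(20):                        # many small labeled datasets
    X, y = small_dataset()
    cov = np.cov(X, rowvar=False).ravel()  # dataset-specific covariance
    for _ in range(30):                    # sample pairs within the dataset
        i, j = rng.integers(len(X), size=2)
        pairs.append(np.concatenate([X[i], X[j], cov]))
        labels.append(int(y[i] == y[j]))   # 1 iff same cluster

# A small MLP as a stand-in for the deep net.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000,
                    random_state=0).fit(pairs, labels)

# Zero-shot use on a dataset never seen in training.
X_new, _ = small_dataset()
cov = np.cov(X_new, rowvar=False).ravel()
same = clf.predict([np.concatenate([X_new[0], X_new[1], cov])])[0]
```

Appending each dataset's covariance is what lets one classifier serve many heterogeneous small datasets, including unseen ones.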

SLIDE 9

See you...

Tue Dec 4th 05:00 – 07:00 PM Room 210 & 230 AB Poster #164
