
3D Deep Clustering

a clustering framework for unsupervised learning of 3D object feature descriptors

LAMBDA Workshop on Retrieval and Shape analysis

Alexandros Georgiou

M.Sc. in “Algorithms, Logic and Discrete Mathematics”

Problem

Construct feature descriptors for 3D objects. The construction is application-dependent.

Local feature descriptors assign to each point on the shape a vector in some multi-dimensional descriptor space representing the local structure of the shape around that point.

⊲ Used in higher-level tasks such as establishing correspondence between shapes or shape retrieval.

Global descriptors describe the whole shape and are often produced by aggregating local descriptors, e.g. the bag-of-features paradigm.

⋆ Use convolutional neural networks (defined on non-Euclidean domains) and learn task-specific features.
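The bag-of-features aggregation mentioned above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the talk: the codebook here is random, whereas in practice it would be learned (e.g. by k-means over many local descriptors).

```python
import numpy as np

def bag_of_features(local_desc, codebook):
    """Aggregate per-point local descriptors (n_points x d) into one global
    descriptor: a normalized histogram of nearest-codeword counts."""
    # Squared distances from every descriptor to every codeword (k x d codebook).
    d2 = ((local_desc[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)               # hard-assign each point to a codeword
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()                # normalize so shapes of any size compare

# Toy example: 100 random 8-D local descriptors, a 16-word codebook.
rng = np.random.default_rng(0)
desc = rng.normal(size=(100, 8))
codebook = rng.normal(size=(16, 8))
g = bag_of_features(desc, codebook)
print(g.shape, round(g.sum(), 6))  # (16,) 1.0
```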

non-Euclidean CNNs

3D objects are modeled as compact Riemannian manifolds.

⊲ The operation of convolution between functions on Euclidean spaces is taken for granted.
⊲ But it is not well defined for manifolds.

Define convolution between functions on manifolds.

⊲ Spectral methods.
⊲ Spatial methods.

Other non-Euclidean domains, e.g. graphs.

non-Euclidean CNNs

MoNet [Monti et al.]

⊲ x is a point on a manifold, y ∈ N(x) are points in the neighborhood of x.
⊲ u(x, y) is a d-dimensional vector of pseudo-coordinates.
⊲ Weighting functions (kernels) wΘ(u) = (w1(u), . . . , wn(u)).
⊲ Patch operator

  Di(x)f = Σ_{y ∈ N(x)} wi(u(x, y)) f(y)

⊲ Define convolution as

  (f ⋆ g)(x) = Σ_{i=1}^{n} gi Di(x)f
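The patch operator and convolution above can be sketched directly. MoNet parameterizes the kernels wi as Gaussians; this toy version assumes fixed isotropic Gaussians with hand-picked means and widths, and a scalar signal f:

```python
import numpy as np

def monet_conv(f, neighbors, pseudo_coords, mus, sigmas, g):
    """MoNet-style convolution at every point, following the slide's definitions,
    with Gaussian kernels w_i(u) = exp(-||u - mu_i||^2 / (2 sigma_i^2)).
    f:             (n_points,) scalar signal on the shape
    neighbors:     list of index arrays, neighbors[x] = points y in N(x)
    pseudo_coords: list of (|N(x)|, d) arrays holding u(x, y)
    mus, sigmas:   kernel parameters (part of Theta), n kernels
    g:             (n,) filter coefficients g_i
    """
    out = np.zeros(len(f))
    for x in range(len(f)):
        u = pseudo_coords[x]                       # u(x, y) for all y in N(x)
        fy = f[neighbors[x]]                       # f(y) for all y in N(x)
        for i in range(len(mus)):
            # w_i(u(x, y)) for every neighbor y
            w = np.exp(-((u - mus[i]) ** 2).sum(-1) / (2 * sigmas[i] ** 2))
            Dif = (w * fy).sum()                   # patch operator D_i(x) f
            out[x] += g[i] * Dif                   # (f * g)(x) = sum_i g_i D_i(x) f
    return out

# Tiny example: 3 points on a line, 2-D pseudo-coordinates, 2 kernels.
f = np.array([1.0, 2.0, 3.0])
neighbors = [np.array([1]), np.array([0, 2]), np.array([1])]
pseudo_coords = [np.array([[1.0, 0.0]]),
                 np.array([[-1.0, 0.0], [1.0, 0.0]]),
                 np.array([[-1.0, 0.0]])]
mus = np.array([[0.0, 0.0], [1.0, 0.0]])
sigmas = np.array([1.0, 1.0])
g = np.array([0.5, 0.5])
out = monet_conv(f, neighbors, pseudo_coords, mus, sigmas, g)
print(out.shape)  # (3,)
```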

How do we learn the parameters?

Supervised learning (classification)

⊲ FΘ(x) = gΘ ◦ fΘ(x) = y ∈ Y on input x ∈ X.
⊲ fΘ is the convnet mapping and fΘ(x) is the vector of features.
⊲ gΘ is a classifier that predicts the correct labels on top of the features fΘ(x).
⊲ Training set X = {x1, . . . , xn} with associated labels y1, . . . , yn ∈ Y.
⊲ Learn Θ by solving the optimization problem

  min_Θ Σ_{i=1}^{n} ℓ(gΘ ◦ fΘ(xi), yi)

for some loss function ℓ.
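A minimal numeric sketch of this objective, under loud assumptions: fΘ is stood in for by fixed random features and gΘ by a linear softmax classifier W (hypothetical stand-ins, since the convnet is not specified here), with ℓ the cross-entropy loss. One small gradient step on W decreases the summed loss:

```python
import numpy as np

def total_loss(W, feats, labels):
    """sum_i l(g_Theta(f_Theta(x_i)), y_i) with l = softmax cross-entropy."""
    z = feats @ W
    z = z - z.max(axis=1, keepdims=True)              # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].sum()

# Toy setup: f_Theta(x_i) are fixed features, g_Theta is a linear classifier W.
rng = np.random.default_rng(1)
feats = rng.normal(size=(20, 4))       # f_Theta(x_i): 20 inputs, 4-D features
labels = rng.integers(0, 3, size=20)   # y_i in Y = {0, 1, 2}
W = np.zeros((4, 3))
loss_before = total_loss(W, feats, labels)

# One gradient-descent step on min_Theta sum_i l(g_Theta o f_Theta(x_i), y_i).
z = feats @ W
probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
W -= 0.01 * feats.T @ (probs - np.eye(3)[labels])    # dL/dW for cross-entropy
loss_after = total_loss(W, feats, labels)
print(loss_after < loss_before)  # True: the step decreases the objective
```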

What about unsupervised learning?

How it works

1. Produce the vector of features fΘ(xi) through the convnet.
2. Assign a pseudo-label yi to each fΘ(xi) using a clustering method, e.g. k-means.
3. Update the parameters of the convnet, just as in the supervised case, by predicting these pseudo-labels.
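These three steps can be sketched as a single training epoch. The k-means here is a plain NumPy version, and `convnet` and `train_step` are hypothetical placeholders for the feature extractor fΘ and the supervised update, not the actual implementation:

```python
import numpy as np

def kmeans_pseudo_labels(X, k, iters=20, seed=0):
    """Plain k-means; returns a pseudo-label for each row of X.
    An empty cluster is re-seeded on the worst-represented point."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                centers[j] = X[d2.min(axis=1).argmax()]
            else:
                centers[j] = members.mean(axis=0)
    return labels

def deepcluster_epoch(inputs, convnet, train_step, k):
    """One epoch of the loop: features -> pseudo-labels -> supervised update."""
    feats = convnet(inputs)                    # step 1: f_Theta(x_i)
    pseudo = kmeans_pseudo_labels(feats, k)    # step 2: y_i via k-means
    train_step(inputs, pseudo)                 # step 3: predict the pseudo-labels
    return pseudo

# Toy run: identity "convnet", no-op training step.
rng = np.random.default_rng(2)
data = rng.normal(size=(60, 5))
pseudo = deepcluster_epoch(data, lambda x: x, lambda x, y: None, k=4)
print(pseudo.shape)  # (60,)
```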

How it works: Implementation details

This method is prone to trivial solutions.

⊲ Empty clusters. An optimal decision is to assign all of the inputs to a single cluster; this is caused by the absence of any mechanism that prevents empty clusters.
⋆ Automatically reassign empty clusters during the k-means optimization.

⊲ Trivial parameterization. If the vast majority of inputs is assigned to a few clusters, the parameters will exclusively discriminate between them. In supervised classification, for example, this happens when the number of inputs per class is highly unbalanced.
⋆ Sample inputs based on a uniform distribution over the classes, or pseudo-labels.
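The second fix can be sketched as a sampler that draws training inputs uniformly over pseudo-labels rather than uniformly over inputs. This is a minimal illustration; real implementations typically build a whole epoch-length index list at once rather than sampling one index at a time:

```python
import numpy as np

def uniform_over_classes(pseudo_labels, n_samples, seed=0):
    """Sample input indices so each pseudo-label is drawn with equal probability,
    countering highly unbalanced cluster sizes."""
    rng = np.random.default_rng(seed)
    classes = np.unique(pseudo_labels)
    picks = []
    for _ in range(n_samples):
        c = rng.choice(classes)                       # pick a class uniformly...
        members = np.flatnonzero(pseudo_labels == c)
        picks.append(rng.choice(members))             # ...then a member of it
    return np.array(picks)

# Unbalanced pseudo-labels: 90 inputs in cluster 0, only 10 in cluster 1.
labels = np.array([0] * 90 + [1] * 10)
idx = uniform_over_classes(labels, 1000)
frac_minority = (labels[idx] == 1).mean()
print(0.4 < frac_minority < 0.6)  # True: close to 0.5 despite the 9:1 imbalance
```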

Our goals

Problem

Construct task-specific feature descriptors for 3D objects.

⋆ Construct a non-Euclidean CNN, FΘ = gΘ ◦ fΘ, that classifies 3D objects so that fΘ produces good task-specific features, and learn the parameters Θ in an unsupervised setting as in DeepCluster.

⊲ The structure of fΘ is crucial.
⊲ Other possible clustering methods besides k-means.
⊲ Find the right tasks/datasets.
⊲ Evaluate the method.
⊲ What about other non-Euclidean domains such as graphs?

Our goals

Applications

⊲ Invariant descriptors
⊲ Shape correspondence
⊲ Shape retrieval

Datasets

⊲ ShapeNet
⊲ FAUST
⊲ ABC


The end

Thank you!


Bibliography

Caron, Mathilde; Bojanowski, Piotr; Joulin, Armand; Douze, Matthijs. Deep Clustering for Unsupervised Learning of Visual Features.

Monti, Federico; Boscaini, Davide; Masci, Jonathan; Rodolà, Emanuele; Svoboda, Jan; Bronstein, Michael. Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs.

Bronstein, Michael; Bruna, Joan; LeCun, Yann; Szlam, Arthur; Vandergheynst, Pierre. Geometric Deep Learning: Going beyond Euclidean Data.