Riemannian Geometry and Machine Learning for Non-Euclidean Data - PowerPoint PPT Presentation


SLIDE 1

Riemannian Geometry and Machine Learning for Non‐Euclidean Data

Frank C. Park and C.J. Jang Seoul National University

SLIDE 2

Carl Friedrich Gauss (1777‐1855)

SLIDE 3

15th Century Mapmaking

SLIDE 4

SLIDE 5

It would be nice if straight lines on maps were also shortest paths on the sphere, but in most cases they're not.

SLIDE 6

Google Maps (Mercator projection)

SLIDE 7

Mercator maps are very accurate for countries near the equator (e.g., Brazil)

SLIDE 8

Greenland vs Africa: Sizes on Mercator Map

SLIDE 9

Greenland vs Africa: Actual Size Comparison

SLIDE 10

SLIDE 11

Mercator Map

SLIDE 12

Gall‐Peters Map

SLIDE 13

Gall‐Peters Map: Greenland vs Africa

Relative areas are accurate, but shapes are now distorted

SLIDE 14

National Geographic Map (Winkel map)

SLIDE 15

SLIDE 16

David Hilbert (1862‐1943)

SLIDE 17
A Hierarchy of Mappings

  • Isometry (distortion-free)
  • Area-preserving
  • Geodesic-preserving
  • Angle-preserving (conformal)
  • ...

SLIDE 18

Calculus on the Sphere

The unit two-sphere is parametrized in spherical coordinates as

x = cos θ sin φ, y = sin θ sin φ, z = cos φ.

Other coordinate parametrizations are possible, e.g., stereographic projection:

x = 2u / (1 + u² + v²), y = 2v / (1 + u² + v²), z = (1 - u² - v²) / (1 + u² + v²).
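Both parametrizations can be checked numerically. A minimal sketch (NumPy, not part of the slides) verifying that both maps land on the unit sphere:

```python
import numpy as np

def spherical(theta, phi):
    """Spherical-coordinate parametrization of the unit two-sphere."""
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)])

def stereographic(u, v):
    """Stereographic-projection parametrization of the unit two-sphere."""
    d = 1.0 + u**2 + v**2
    return np.array([2 * u / d, 2 * v / d, (1 - u**2 - v**2) / d])

# Any parameter values should give points of unit norm.
p = spherical(0.7, 1.2)
q = stereographic(0.3, -0.8)
print(np.linalg.norm(p), np.linalg.norm(q))  # both 1.0
```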

SLIDE 19

Calculus on the Sphere

Given a curve (θ(t), φ(t)) on the sphere, its incremental arclength is

ds² = sin²φ dθ² + dφ².

The matrix

G = [ sin²φ  0 ]
    [ 0      1 ]

is called the first fundamental form in classical differential geometry (we'll call it the Riemannian metric).

SLIDE 20
Calculus on the Sphere

Calculating lengths and areas on the sphere using spherical coordinates:

  • Length of a curve: ∫ √( sin²φ (dθ/dt)² + (dφ/dt)² ) dt
  • Area of a region: ∫∫ sin φ dθ dφ

Note that the area element is sin φ dθ dφ.
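These formulas are easy to sanity-check by discretization. The sketch below (NumPy, illustrative only) recovers the length of a quarter of the equator (π/2) and the total area of the sphere (4π):

```python
import numpy as np

def curve_length(theta, phi):
    """Arclength of a discretized curve on the sphere,
    using ds^2 = sin(phi)^2 dtheta^2 + dphi^2."""
    phi_mid = 0.5 * (phi[:-1] + phi[1:])
    ds = np.sqrt(np.sin(phi_mid)**2 * np.diff(theta)**2 + np.diff(phi)**2)
    return ds.sum()

# Quarter of the equator: phi = pi/2 fixed, theta in [0, pi/2].
t = np.linspace(0.0, np.pi / 2, 10001)
L = curve_length(t, np.full_like(t, np.pi / 2))

# Total area: integrate the area element sin(phi) over the coordinate domain.
phi = np.linspace(0.0, np.pi, 10001)
A = 2 * np.pi * np.sum(0.5 * (np.sin(phi[:-1]) + np.sin(phi[1:])) * np.diff(phi))

print(L, A)  # ~pi/2 = 1.5708 and ~4*pi = 12.566
```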

SLIDE 21
Calculus on the Sphere: The Setup So Far

  • Local coordinates: (θ, φ).
  • The Riemannian metric: G(θ, φ) = diag( sin²φ, 1 ).
  • Note 1: Other local coordinates are possible.
  • Note 2: Other choices of Riemannian metric are also possible by defining G differently, e.g., choose any symmetric positive-definite 3x3 matrix H(x, y, z) on the ambient space and set G = Jᵀ H J.
SLIDE 22

Calculus on Riemannian Manifolds

A differentiable manifold is a space that is locally diffeomorphic* to Euclidean space (e.g., a multidimensional surface); points on the manifold are described by local coordinates x.

* Invertible with a differentiable inverse. Essentially, one can be smoothly deformed into the other.

SLIDE 23

Calculus on Riemannian Manifolds

  • A Riemannian metric is an inner product defined on each tangent space that varies smoothly over M; in local coordinates it is represented by a symmetric positive-definite matrix G(x) ∈ Rⁿˣⁿ.

SLIDE 24

Calculus on Riemannian Manifolds

  • Volume of a subset U of M: Volume(U) = ∫_U √det G(x) dx
  • Length of a curve x(t), t ∈ [0, 1], on M (in local coordinates): ∫₀¹ √( ẋᵀ G(x) ẋ ) dt
SLIDE 25

Mappings Between Riemannian Manifolds

SLIDE 26

Given two manifolds M and N, the mapping f: M → N is an isometry if it preserves distances and angles everywhere: d(p, q) = d(f(p), f(q)) for all p, q in M. M and N are then said to be isometric to each other; M can be transformed into N without any stretching or tearing.

(figure: an original surface, an isometric deformation of it, and a surface not isometric to it)

SLIDE 27

Isometry: Mathematical Formulation

  • f: M → N, with local coordinates x and metric G on M; local coordinates y and metric H on N.
  • Isometry ⟺ G(x) = J(x)ᵀ H(f(x)) J(x), where J = ∂f/∂x is the Jacobian of f.
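The pullback condition G = Jᵀ H J can be tested numerically. The sketch below (NumPy, illustrative; finite-difference Jacobian) computes the metric that stereographic coordinates pull back from the Euclidean ambient space (H = I) and shows it is conformal, i.e., a scalar multiple of the identity, so the map preserves angles but not distances:

```python
import numpy as np

def stereographic(u, v):
    """Stereographic-projection parametrization of the unit two-sphere."""
    d = 1.0 + u**2 + v**2
    return np.array([2 * u / d, 2 * v / d, (1 - u**2 - v**2) / d])

def pullback_metric(f, x, h=1e-6):
    """G(x) = J(x)^T J(x): the metric induced on the coordinate domain
    by an embedding f into Euclidean space (ambient metric H = I)."""
    x = np.asarray(x, dtype=float)
    cols = [(f(*(x + h * e)) - f(*(x - h * e))) / (2 * h)
            for e in np.eye(len(x))]
    J = np.column_stack(cols)
    return J.T @ J

u, v = 0.4, -0.2
G = pullback_metric(stereographic, [u, v])
# Stereographic coordinates are conformal: G = (4 / (1 + u^2 + v^2)^2) * I.
print(G)
print(4.0 / (1 + u**2 + v**2)**2)
```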

SLIDE 28

Isometries and Gaussian Curvature

  • There is no isometry between manifolds of different Gaussian curvature. What's the best one can do in this case?
SLIDE 29

Finding Nearly Isometric Maps

  • f: M → N, with local coordinates x and metric G on M; local coordinates y and metric H on N.
  • Note: the "distance" from isometry must be coordinate-invariant.

SLIDE 30

Coordinate-Invariance

This is Spinal Tap (1984)

SLIDE 31

Coordinate-Invariant Functionals

A coordinate-invariant functional of f: M → N has the general form

∫ σ(λ₁, ..., λₙ) √det G dx

where σ(·) is any symmetric function, and λ₁, ..., λₙ are the roots of det( Jᵀ H J - λ G ) = 0.

(M: local coordinates x₁, ..., xₙ, Riemannian metric G; N: local coordinates y₁, ..., yₘ, Riemannian metric H.)

SLIDE 32
Harmonic Maps

  • Intuition: Take M to be made of elastic (e.g., rubber) and N to be rigid (e.g., made of steel).
  • Wrap the elastic M so that it covers all of N, and let it settle to its elastic equilibrium state. This is the harmonic map solution [Eells and Sampson 1964].

SLIDE 33
Harmonic Maps: Formulation

  • Take σ(λ₁, ..., λₙ) = Σᵢ λᵢ, with boundary conditions on f.
  • The harmonic mapping functional is then

∫ Tr( G⁻¹ Jᵀ H J ) √det G dx₁ ⋯ dxₙ

  • Variational equations (summation over repeated indices):

(1/√det G) ∂/∂x_i ( √det G g^{ij} ∂f^a/∂x_j ) + Γ^a_{bc} g^{ij} (∂f^b/∂x_i)(∂f^c/∂x_j) = 0

where g^{ij} is the (i, j) entry of G⁻¹ and the Γ^a_{bc} are the Christoffel symbols of the second kind of the metric H.

SLIDE 34

Examples of Harmonic Maps

Finding the minimum-distortion map from the unit interval [0,1] to itself:

  • Find the mapping f(x) that maps the interval [0,1] onto [0,1] so as to minimize ∫₀¹ (df/dx)² dx.
  • The variational equation is d²f/dx² = 0, which corresponds to the straight line f(x) = x.
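This one-dimensional case is easy to verify by discretization. A minimal sketch (NumPy, illustrative only): setting the gradient of the discretized energy Σᵢ (f[i+1] - f[i])² to zero gives f[i] = (f[i-1] + f[i+1])/2, and iterating this averaging with the boundary held fixed converges to the line f(x) = x:

```python
import numpy as np

n = 51
f = np.zeros(n)
f[-1] = 1.0  # boundary conditions: f(0) = 0, f(1) = 1

# Jacobi iteration for the discrete equation f'' = 0.
for _ in range(20000):
    f[1:-1] = 0.5 * (f[:-2] + f[2:])

x = np.linspace(0.0, 1.0, n)
print(np.max(np.abs(f - x)))  # ~0: the minimizer is the line f(x) = x
```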

SLIDE 35

Examples of Harmonic Maps

Geodesics: Given two points on the Riemannian manifold M, find the path of shortest distance connecting these two points:

  • Find the mapping x(t), t ∈ [0, 1], with endpoints specified, that minimizes ∫₀¹ ẋᵀ G(x) ẋ dt.
  • Variational equations: ẍ^k + Γ^k_{ij} ẋ^i ẋ^j = 0 (the geodesic equation).

SLIDE 36

Examples of Harmonic Maps

Harmonic functions: Find the equilibrium temperature distribution over a planar region with the boundary temperatures specified:

  • Find the mapping f(x, y), with values specified on the boundary of the region, that minimizes ∫∫ ‖∇f‖² dx dy.
  • Variational equation: ∇²f = 0 (Laplace's equation).
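Laplace's equation discretizes to "each interior value is the average of its four neighbours". A minimal sketch (NumPy, illustrative, Jacobi-style iteration) for a square plate with one edge held at temperature 1:

```python
import numpy as np

n = 41
u = np.zeros((n, n))
u[0, :] = 1.0  # boundary: top edge at temperature 1, other edges at 0

# Iterate the discrete Laplace equation (average of the 4 neighbours).
for _ in range(20000):
    u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                            u[1:-1, :-2] + u[1:-1, 2:])

# By symmetry, rotating the hot edge around the square and summing the
# four solutions gives the constant field 1, so the centre value is 1/4.
print(u[n // 2, n // 2])  # ~0.25
```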
SLIDE 37

Manifold Learning Revisited

SLIDE 38
Manifold Learning

  • Find a lower-dimensional, minimum-distortion, Euclidean representation of high-dimensional data (usually the embedding dimension is much smaller than the data dimension).
  • Examples from locally linear embedding (LLE) (Roweis et al. 2000): mapping 3-dim data to 2-dim space; face images mapped into 2-dim space.

SLIDE 39
Riemannian Manifold Learning

  • Recall the general setup of our global distortion measure: ∫ σ(λ₁, ..., λₙ) √det G dx.
SLIDE 40

Riemannian Manifold Learning

Choices need to be made:

  • Manifolds M and N
  • Metric G in M *
  • Metric H in N
  • Integrand function σ
  • Constraints, boundary conditions
  • Discretization method

* G can be estimated from the data points using a Laplace-Beltrami operator based method.

This provides a classification scheme for existing manifold learning algorithms, and a roadmap for finding new manifold learning methods and algorithms (for example, the harmonic mapping distortion).

SLIDE 41
Example: Harmonic Mapping Distortion Details

  • Discretized objective function: a sum over the data points of ½ Tr(·) terms, minimized over the embedding points and the embedding of the boundary points.
  • Given the boundary embedding, solve for the remaining embedding points.
  • If the boundary is unspecified, it can be optimized with respect to other global distortion measures.
SLIDE 42

A Taxonomy of Manifold Learning Algorithms (1)

Each algorithm is characterized by its inverse pseudo-metric, its volume element, and its constraint:

  • LLE (Locally Linear Embedding) (Roweis et al. 2000): rank-one matrix ΔΔᵀ, where Δ is the local reconstruction error obtained when running the algorithm.
  • LE (Laplacian Eigenmap) (Belkin et al. 2003): kernel-weighted covariance matrix; here the metric H represents the projected metric from the ambient space.
  • DM (Diffusion Map) (Coifman et al. 2006): projected metric from the ambient space, with a √det G volume weighting.

Manifold learning algorithms such as LLE, LE, and DM share a similar objective to harmonic maps, while imposing an equality constraint to avoid the trivial solution.

SLIDE 43
A Taxonomy of Manifold Learning Algorithms (2)

  • RR (Riemannian Relaxation) (McQueen et al. 2016): projected metric from the ambient manifold (G is estimated using a Laplace-Beltrami operator based method), with a max-type distortion measure.
  • LS (Least-squares spectral distortion): same metric as above, with a least-squares distortion measure.
  • PD (P(n) distance metric distortion): same metric as above, with a log-based distortion measure.
  • HM (Harmonic mapping distortion): same metric as above.

  • LS and PD can be thought of as variants of RR with different integrand functions σ.
  • For HM, further optimization is possible when the boundary is not specified.

SLIDE 44

Example: Swiss Roll

  • Swiss roll data (a 2-dim manifold in 3-dim space), with the flattened Swiss roll as ground truth.
  • Embeddings compared: diffusion map, Isomap, Riemannian distortion, and harmonic mapping with the boundary specified.
  • The minimum-distortion results are closer to the flattened Swiss roll.

(figure: the data points and the compared embeddings)

SLIDE 45
Example: Faces

  • Face images shown at their corresponding two-dim embeddings; embeddings compared: diffusion map, Isomap, Riemannian distortion, and harmonic mapping with the boundary specified.
  • Variations in the face heading angle and mouth shape can be observed along the horizontal and vertical axes, respectively.

SLIDE 46

Machine Learning for Non‐Euclidean Data

SLIDE 47

Examples of Non-Euclidean Data

  • Shape spaces: Kendall's shape space, M-Rep (built from SO(3) and SO(2) factors), Lie-group shape representations.
  • Rotations SO(3), rigid body motions SE(3), general linear transformations GL(n) and their various subgroups, etc.: geometry and distance metrics are now well-established (but still not widely known or used by the community).

SLIDE 48
Examples of Non-Euclidean Data: P(n), the Space of Symmetric Positive-Definite Matrices

  • Inertial parameters of a rigid body: φ = (m, h_x, h_y, h_z, I_xx, I_yy, I_zz, I_xy, I_yz, I_zx) ∈ R¹⁰ (m: mass, h ∈ R³: first moment, I ∈ R³ˣ³: moments of inertia).
  • 4x4 symmetric matrix representation of φ, the pseudo-inertia matrix: P(φ) = [ ½ Tr(I)·1₃ - I   h ; hᵀ   m ].
  • P(φ) should be positive definite, i.e., P(φ) ∈ P(4).

SLIDE 49

Natural Distance on P(n)

  • Affine-invariant metric on P(n) at P: ⟨X, Y⟩_P = Tr( P⁻¹ X P⁻¹ Y ).
  • Geodesic distance on P(n): d(P₁, P₂) = ‖ log( P₁^(-1/2) P₂ P₁^(-1/2) ) ‖_F = ( Σᵢ log² λᵢ )^(1/2), where the λᵢ are the eigenvalues of P₁⁻¹ P₂.
  • Well-defined on the positive-definite matrix manifold P(φ) ∈ P(4).
  • Invariant to reference frames and physical units; dimensionless.
  • Better encodes the natural distance between positive mass distributions.

(figures: geodesic path on P(4); geodesic distances between pairs of inertial parameters)
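The geodesic distance reduces to an eigenvalue computation on P₁⁻¹P₂. A minimal sketch (NumPy, illustrative only), together with a check of the affine invariance P → A P Aᵀ that makes the distance independent of reference frames and units:

```python
import numpy as np

def spd_distance(P1, P2):
    """Affine-invariant geodesic distance on P(n):
    sqrt(sum_i log(lambda_i)^2), lambda_i the eigenvalues of P1^-1 P2."""
    lam = np.real(np.linalg.eigvals(np.linalg.solve(P1, P2)))
    return np.sqrt(np.sum(np.log(lam) ** 2))

P1 = np.diag([1.0, 2.0, 3.0])
P2 = np.diag([2.0, 2.0, 3.0])
d = spd_distance(P1, P2)
print(d)  # log(2) ~ 0.693

# Invariance under congruence P -> A P A^T (change of frame / units):
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [3.0, 0.0, 1.0]])
print(spd_distance(A @ P1 @ A.T, A @ P2 @ A.T))  # same value
```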

SLIDE 50

Example: Human Dynamic Modeling

  • T. Lee, P. M. Wensing, and F. C. Park, "Geometric Robot Dynamic Identification: A Convex Programming Approach," submitted to TRO, 2018.
  • T. Lee and F. C. Park, "A Geometric Algorithm for Robust Multibody Inertial Parameter Identification," RA-Letters, 2018.
  • Challenges: high-dimensional system; insufficient, noisy measurements.

(figure: geometric method vs. existing vector-space methods)

SLIDE 51

Each voxel is a 3D multivariate normal distribution: the mean indicates the position, while the covariance indicates the direction of diffusion of water molecules. Segmentation of a DTI image requires a metric on the manifold of multivariate Gaussian distributions.

Examples of Non‐Euclidean Data

Diffusion tensor images (DTI)

SLIDE 52

Using the standard approach of calculating distances on the means and covariances separately, and summing the two for the total distance, results in dist(a, b) = dist(b, c), which is unsatisfactory. In this example, water molecules are able to move more easily in the x-axis direction; therefore, diffusion tensors (b) and (c) are closer than (a) and (b).

Geometry of DTI Segmentation

SLIDE 53

Geometry of Statistical Manifolds

An n-dimensional statistical manifold is a set of probability distributions parametrized by some smooth, continuously-varying parameter θ:

S = { p(x | θ) : θ ∈ Θ ⊆ Rⁿ }
SLIDE 54
Geometry of Statistical Manifolds

  • The Fisher information defines a Riemannian metric on a statistical manifold:

g_ij(θ) = E_{x ~ p(·|θ)} [ (∂ log p(x|θ)/∂θ_i) (∂ log p(x|θ)/∂θ_j) ]

  • Connection to KL divergence:

KL( p(·|θ) || p(·|θ + δθ) ) ≈ ½ δθᵀ G(θ) δθ
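This relation is easy to check numerically in one dimension. For N(θ, 1) the Fisher information of the mean parameter is g(θ) = 1, so the KL divergence between nearby distributions should be close to ½ δθ². A sketch (NumPy, illustrative only) computing the KL divergence by numerical integration:

```python
import numpy as np

theta, dtheta = 0.0, 0.05
x = np.linspace(-10.0, 10.0, 200001)

# Densities of N(theta, 1) and the perturbed N(theta + dtheta, 1).
p = np.exp(-(x - theta) ** 2 / 2) / np.sqrt(2 * np.pi)
q = np.exp(-(x - theta - dtheta) ** 2 / 2) / np.sqrt(2 * np.pi)

# KL(p || q) by numerical integration, vs. the quadratic Fisher approximation.
kl = np.sum(p * np.log(p / q)) * (x[1] - x[0])
print(kl, 0.5 * dtheta ** 2)  # both ~0.00125
```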

SLIDE 55
Geometry of Gaussian Distributions

  • The manifold of Gaussian distributions: N = { N(μ, Σ) : μ ∈ Rⁿ, Σ ∈ P(n) }.
  • Fisher information metric on N: ds² = dμᵀ Σ⁻¹ dμ + ½ Tr( (Σ⁻¹ dΣ)² ).
  • Euler-Lagrange equations for geodesics on N (coupled equations in μ(t) and Σ(t)):

d²μ/dt² = (dΣ/dt) Σ⁻¹ (dμ/dt)
d²Σ/dt² = (dΣ/dt) Σ⁻¹ (dΣ/dt) - (dμ/dt)(dμ/dt)ᵀ

SLIDE 56
Geometry of Gaussian Distributions

(figure: geodesic path on N between two Gaussian distributions, shown as a sequence of interpolated densities)

SLIDE 57
Restriction to Covariances

  • Fisher information metric on N with fixed mean: ds² = ½ Tr( (Σ⁻¹ dΣ)² ). This is the affine-invariant metric on P(n).
  • Invariant under the general linear group action Σ → A Σ Aᵀ, A ∈ GL(n), which implies coordinate invariance.
  • Closed-form geodesic distance: d(Σ₁, Σ₂) = (1/√2) ‖ log( Σ₁^(-1/2) Σ₂ Σ₁^(-1/2) ) ‖_F.
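This distance resolves the DTI example from the earlier slide. Below, a sketch (NumPy; the diffusion tensors are hypothetical values chosen for illustration, not taken from the slides): three tensors progressively elongated along the x-axis are equally spaced in the Euclidean sense, but the geodesic distance correctly makes (b) and (c) closer than (a) and (b). The constant 1/√2 factor is omitted since it does not affect the comparison.

```python
import numpy as np

def cov_distance(S1, S2):
    """Geodesic distance between covariances (up to a constant factor):
    the Frobenius norm of log(S1^(-1/2) S2 S1^(-1/2)),
    computed from the eigenvalues of S1^-1 S2."""
    lam = np.real(np.linalg.eigvals(np.linalg.solve(S1, S2)))
    return np.sqrt(np.sum(np.log(lam) ** 2))

# Hypothetical diffusion tensors elongating along the x-axis:
Sa = np.diag([1.0, 1.0])
Sb = np.diag([4.0, 1.0])
Sc = np.diag([7.0, 1.0])

print(np.linalg.norm(Sb - Sa), np.linalg.norm(Sc - Sb))  # Euclidean: 3.0, 3.0
print(cov_distance(Sa, Sb), cov_distance(Sb, Sc))        # geodesic: 1.39 > 0.56
```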

SLIDE 58

Using covariance and Euclidean distance vs. using the MND distance

Results of Segmentation for Brain DTI

SLIDE 59
Example: Human Mass-Inertia Data

  • Manifold learning for human mass-inertia data: principal geodesic analysis (PGA) vs. vector-space principal component analysis (PCA), shown along PC 1 and PC 2 at ± standard deviations.
  • The vector-space PCA results include infeasible inertial parameters.
  • Body thickness is captured along PC 1; height and upper-body thickness are captured along PC 2.

SLIDE 60

Concluding Remarks

SLIDE 61
  • ML for non-Euclidean data is receiving greater attention from the ML research community: applications to autoencoders; CNNs for geometric data.
  • Many problems in engineering are analogous to trying to fit a square peg into a round hole: often the things we work with are not vectors, but elements of a manifold.
  • The geometric methods and distortion measures described in this talk can be helpful in addressing such problems.