ISOMAP and LLE
Fisher 1922
... the objective of statistical methods is the reduction of data. A quantity of data... is to be replaced by relatively few quantities which shall adequately represent ... the relevant information contained in the original data. Since the number of independent facts supplied in the data is usually far greater than the number of facts sought, much of the information supplied by an actual sample is irrelevant. It is the object of the statistical process employed in the reduction of data to exclude this irrelevant information, and to isolate the whole of the relevant information contained in the data. – R. A. Fisher
Python scikit-learn Manifold Learning Toolbox
http://scikit-learn.org/stable/modules/manifold.html
- PCA / MDS (SMACOF algorithm, not a spectral method)
- ISOMAP / LLE (+ MLLE)
- Hessian Eigenmap
- Laplacian Eigenmap
- LTSA
- tSNE
Matlab Dimensionality Reduction Toolbox
- http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html
- Math.pku.edu.cn/teachers/yaoy/Spring2011/matlab/drtoolbox
- Principal Component Analysis (PCA), Probabilistic PCA
- Factor Analysis (FA), Sammon mapping, Linear Discriminant Analysis (LDA)
- Multidimensional Scaling (MDS), Isomap, Landmark Isomap
- Locally Linear Embedding (LLE), Laplacian Eigenmaps, Hessian LLE, Conformal Eigenmaps
- Local Tangent Space Alignment (LTSA), Maximum Variance Unfolding (extension of LLE)
- Landmark MVU (LandmarkMVU), Fast Maximum Variance Unfolding (FastMVU)
- Kernel PCA
- Diffusion maps
- …
Recall: PCA
- Principal Component Analysis (PCA)
[Figure: one-dimensional manifold]
Data matrix X_{p×n} = [x_1, x_2, . . . , x_n]
Recall: MDS
- Given pairwise distances D, where D_ij = d_ij², the squared distance between points i and j
- Convert the pairwise distance matrix D (c.n.d.) into the dot-product matrix B (p.s.d.)
  - B(a) = -0.5 H(a) D H(a)', with centering matrix H(a) = I - 1a', where a'1 = 1
  - a = 1_k (the k-th coordinate vector): B_ij = -0.5 (D_ij - D_ik - D_jk)
  - a = (1/n) 1: double centering
- Eigendecomposition of B = YY' gives the embedding Y
If we preserve the pairwise Euclidean distances, do we preserve the structure?
B_ij = −(1/2) [ D_ij − (1/N) Σ_{s=1}^{N} D_sj − (1/N) Σ_{t=1}^{N} D_it + (1/N²) Σ_{s,t=1}^{N} D_st ]
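The double-centering recipe above can be sketched in a few lines of NumPy (`classical_mds` is an illustrative helper name, not a library function; it takes the squared-distance matrix D):

```python
import numpy as np

def classical_mds(D, d=2):
    """Classical MDS: double-center the squared-distance matrix, then eigendecompose."""
    N = D.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N   # centering matrix H = I - (1/N) 1 1'
    B = -0.5 * H @ D @ H                  # dot-product matrix B, p.s.d. if D is Euclidean
    evals, evecs = np.linalg.eigh(B)      # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:d]     # keep the top-d eigenpairs
    L = np.sqrt(np.maximum(evals[idx], 0.0))
    return evecs[:, idx] * L              # N x d embedding Y with B ≈ Y Y'
```

For truly Euclidean D, the embedding reproduces the original pairwise distances exactly (up to a rigid motion).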
Nonlinear Manifolds..
Unfold the manifold
PCA and MDS see the Euclidean distance. What is important is the geodesic distance.
Intrinsic Description..
- To preserve the intrinsic structure, preserve the geodesic distance, not the Euclidean distance.
Manifold Learning
Learning when data ∼ M ⊂ R^N
- Clustering: M → {1, . . . , k}, e.g. connected components, min cut
- Classification/Regression: M → {−1, +1} or M → R, i.e. P on M × {−1, +1} or P on M × R
- Dimensionality Reduction: f : M → R^n, n << N
M unknown: what can you learn about M from data? e.g. dimensionality, connected components, holes, handles, homology, curvature, geodesics
All you wanna know about differential geometry but were afraid to ask, in 9 easy slides
Embedded (sub-)Manifolds
M^k ⊂ R^N
Locally (not globally) looks like Euclidean space.
Example: S² ⊂ R³
Tangent Space
T_pM^k ⊂ R^N: a k-dimensional affine subspace of R^N.
Tangent Vectors and Curves
A curve φ(t) : R → M^k has velocity dφ(t)/dt = v.
Tangent vectors <———> curves.
Riemannian Geometry
Norms and angles in tangent space: the inner product ⟨v, w⟩ defines the norm of v and the angle between v and w.
Geodesics
For a curve φ(t) : [0, 1] → M^k, the length is l(φ) = ∫₀¹ ‖dφ/dt‖ dt.
Can measure length using the norm in tangent space. Geodesic — shortest curve between two points.
Tangent Vectors vs. Derivatives
For f : M^k → R and a curve φ(t) : R → M^k with velocity v, the composition f(φ(t)) : R → R gives the directional derivative df/dv = d f(φ(t))/dt.
Tangent vectors <———> directional derivatives.
Gradients
For f : M^k → R, the gradient ∇f is defined by ⟨∇f, v⟩ ≡ df/dv.
Tangent vectors <———> directional derivatives. The gradient points in the direction of maximum change.
Exponential Maps
exp_p : T_pM^k → M^k maps tangent vectors to points: exp_p(v) = r, exp_p(w) = q.
Concretely, exp_p(v) = φ(1) for the geodesic φ(t) with φ(0) = p and dφ/dt(0) = v.
Laplace-Beltrami Operator
For f : M^k → R and exp_p : T_pM^k → M^k,
Δ_M f(p) ≡ Σ_i ∂²f(exp_p(x)) / ∂x_i²,
evaluated in an orthonormal coordinate system on the tangent space.
Generative Models in Manifold Learning
Spectral Geometric Embedding
Given x_1, . . . , x_n ∈ M ⊂ R^N, find y_1, . . . , y_n ∈ R^d where d << N.
- ISOMAP (Tenenbaum, et al., 2000)
- LLE (Roweis, Saul, 2000)
- Laplacian Eigenmaps (Belkin, Niyogi, 2001)
- Local Tangent Space Alignment (Zhang, Zha, 2002)
- Hessian Eigenmaps (Donoho, Grimes, 2002)
- Diffusion Maps (Coifman, Lafon, et al., 2004)
- Related: Kernel PCA (Schoelkopf, et al., 1998)
Meta-Algorithm
- Construct a neighborhood graph
- Construct a positive semi-definite kernel
- Find the spectral decomposition
Kernel Spectrum
Two Basic Geometric Embedding Methods: Science 2000
- Tenenbaum-de Silva-Langford Isomap Algorithm
– Global approach. In a low-dimensional embedding:
- Nearby points should be nearby.
- Faraway points should be faraway.
- Roweis-Saul Locally Linear Embedding Algorithm
– Local approach
- Nearby points nearby
Isomap
- Estimate the geodesic distance between faraway points.
- For neighboring points, the Euclidean distance is a good approximation to the geodesic distance.
- For faraway points, estimate the distance by a series of short hops between neighboring points.
  – Find shortest paths in a graph with edges connecting neighboring data points.
Once we have all pairwise geodesic distances, use classical metric MDS.
Isomap - Algorithm
- Construct a neighborhood graph on the n points
  – connecting points whose distances are within a fixed radius, or
  – a K-nearest-neighbor graph
- Compute the shortest-path (geodesic) distances between nodes: D
  – Floyd's Algorithm (O(N³))
  – Dijkstra's Algorithm (O(kN² log N))
- Construct a lower-dimensional embedding
  – Classical MDS: B = -0.5 H D H', eigendecomposition B = U S U'
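The three steps can be sketched with NumPy and SciPy's shortest-path routine (`isomap_embed` is an illustrative name, and the dense kNN graph is for clarity only; a production version is `sklearn.manifold.Isomap`):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap_embed(X, n_neighbors=8, d=2):
    """Isomap sketch: kNN graph -> graph shortest paths -> classical MDS."""
    E = squareform(pdist(X))                           # Euclidean distances
    N = E.shape[0]
    G = np.full((N, N), np.inf)                        # inf = no edge (dense csgraph convention)
    nn = np.argsort(E, axis=1)[:, 1:n_neighbors + 1]   # k nearest neighbors, skipping self
    for i in range(N):
        G[i, nn[i]] = E[i, nn[i]]
    G = np.minimum(G, G.T)                             # symmetrize the neighborhood graph
    DG = shortest_path(G, method='D', directed=False)  # Dijkstra: geodesic estimates
    D = DG ** 2                                        # classical MDS on squared distances
    H = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * H @ D @ H                               # double centering
    evals, evecs = np.linalg.eigh(B)
    idx = np.argsort(evals)[::-1][:d]
    return evecs[:, idx] * np.sqrt(np.maximum(evals[idx], 0.0))
```

On data sampled along a line, the geodesic estimates equal the true distances, so the 1-D embedding is monotone along the line (up to sign).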
Isomap
Example…
Residual Variance vs. Intrinsic Dimension
[Figure: residual variance curves for face images, Swiss roll, hand images, and handwritten "2"s]
- Fig. 2. The residual variance of PCA (open triangles), MDS [open triangles in (A) through (C); open circles in (D)], and Isomap (filled circles) on four data sets (42). (A) Face images varying in pose and illumination (Fig. 1A). (B) Swiss roll data (Fig. 3). (C) Hand images varying in finger extension and wrist rotation (20). (D) Handwritten "2"s (Fig. 1B). In all cases, residual variance decreases as the dimensionality d is increased. The intrinsic dimensionality of the data can be estimated by looking for the "elbow" at which this curve ceases to decrease significantly with added dimensions. Arrows mark the true or approximate dimensionality, when known. Note the tendency of PCA and MDS to overestimate the dimensionality, in contrast to Isomap.
ISOMAP on Alanine-dipeptide
ISOMAP 3D embedding with RMSD metric on 3900 Kcenters
Convergence of ISOMAP
- ISOMAP has provable convergence guarantees.
- Given that {x_i} is sampled sufficiently densely, the graph shortest-path distance will closely approximate the original geodesic distance as measured in the manifold M.
- But ISOMAP may suffer from nonconvexity, such as holes in manifolds.
Two step approximations
Convergence Theorem [Bernstein, de Silva, Langford, Tenenbaum 2000]
Main Theorem
Theorem 1: Let M be a compact submanifold of R^n and let {x_i} be a finite set of data points in M. We are given a graph G on {x_i} and positive real numbers λ₁, λ₂ < 1 and δ, ε > 0. Suppose:
1. G contains all edges (x_i, x_j) of length ‖x_i − x_j‖ ≤ ε.
2. The data set {x_i} satisfies a δ-sampling condition – for every point m ∈ M there exists an x_i such that d_M(m, x_i) < δ.
3. M is geodesically convex – the shortest curve joining any two points on the surface is a geodesic curve.
4. ε < (2/π) r₀ √(24λ₁), where r₀ is the minimum radius of curvature of M – 1/r₀ = max_{γ,t} ‖γ''(t)‖, where γ varies over all unit-speed geodesics in M.
5. ε < s₀, where s₀ is the minimum branch separation of M – the largest positive number for which ‖x − y‖ < s₀ implies d_M(x, y) ≤ π r₀.
6. δ < λ₂ ε / 4.
Then the following is valid for all x, y ∈ M:
(1 − λ₁) d_M(x, y) ≤ d_G(x, y) ≤ (1 + λ₂) d_M(x, y).
Probabilistic Result
- So, short Euclidean-distance hops along G approximate well the actual geodesic distance as measured in M.
- What were the main assumptions we made? The biggest one was the δ-sampling density condition.
- A probabilistic version of the Main Theorem can be shown where each point x_i is drawn from a density function. Then the approximation bounds will hold with high probability. Here is a truncated version of what the theorem looks like now:
Asymptotic Convergence Theorem: Given λ₁, λ₂, µ > 0, then for density function α sufficiently large,
1 − λ₁ ≤ d_G(x, y) / d_M(x, y) ≤ 1 + λ₂
will hold with probability at least 1 − µ for any two data points x, y.
Shortcomings of ISOMAP
- One needs to compute the pairwise shortest paths between all sample pairs (i, j)
  – Global
  – Non-sparse
  – Cubic complexity O(N³)
Landmark ISOMAP: Nyström Extension Method
- ISOMAP out of the box is not scalable. Two bottlenecks:
  - All-pairs shortest paths - O(kN² log N).
  - MDS eigenvalue calculation on a full N×N matrix - O(N³).
  - For contrast, LLE is limited by a sparse eigenvalue computation - O(dN²).
- Landmark ISOMAP (L-ISOMAP) idea:
  - Use n << N landmark points from {x_i} and compute an n × N matrix of geodesic distances, D_n, from each data point to the landmark points only.
  - Use a new procedure, Landmark-MDS (LMDS), to find a Euclidean embedding of all the data – it utilizes an idea of triangulation similar to GPS.
  - Savings: L-ISOMAP will have a shortest-paths calculation of O(knN log N) and an LMDS eigenvalue problem of O(n²N).
Landmark Choice
- Random
- MiniMax: k-center
- Hierarchical landmarks: cover-tree
- Nyström extension method
Nyström Method
- We are going to find the top-k eigenvector decomposition of K.
- Let
K = [ A, B ; B', C ] ⪰ 0  ⟹  K = [ X'X, X'Y ; Y'X, Y'Y ]
- where A = X'X and B = X'Y.
Nyström Approximation
- Take A = UΓU' and X = Γ_[k]^{1/2} U_[k]', where the subscript [k] indicates the submatrices corresponding to the eigenvectors with the k largest positive eigenvalues. The coordinates corresponding to B can be derived as Y = X^{−T} B = Γ_[k]^{−1/2} U_[k]' B.
- Nyström approximates K by
K̃ = [ A, B ; B', B'A⁻¹B ]
- with approximation error ‖C − B'A⁻¹B‖.
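A minimal NumPy sketch of this approximation, assuming the first m rows/columns of K form the landmark block A (`nystrom_approx` is an illustrative name):

```python
import numpy as np

def nystrom_approx(K, m, k):
    """Nystrom sketch: approximate p.s.d. K from its first m landmark columns.
    A = K[:m,:m], B = K[:m,m:]; Ktilde = [[A, B], [B', B' A^+ B]]."""
    A, B = K[:m, :m], K[:m, m:]
    evals, U = np.linalg.eigh(A)
    idx = np.argsort(evals)[::-1][:k]       # k largest positive eigenvalues of A
    G, Uk = evals[idx], U[:, idx]
    X = np.sqrt(G)[:, None] * Uk.T          # X = Gamma_[k]^{1/2} U_[k]',  A ≈ X'X
    Y = (Uk / np.sqrt(G)).T @ B             # Y = Gamma_[k]^{-1/2} U_[k]' B
    Z = np.hstack([X, Y])                   # k-dim coordinates of all points
    return Z.T @ Z                          # Ktilde, with error ||C - B' A^+ B||
```

When K has exact rank k and the landmarks span that rank, the approximation is exact, which makes for a simple sanity check.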
Locally Linear Embedding
"A manifold is a topological space which is locally Euclidean."
Fit Locally, Think Globally
We expect each data point and its neighbors to lie on or close to a locally linear patch of the manifold. Each point can be written as a linear combination of its neighbors. The weights are chosen to minimize the reconstruction error. (Derivation on board.)
Fit Locally…
Important property...
- The weights that minimize the reconstruction errors are invariant to rotation, rescaling, and translation of the data points.
  – Invariance to translation is enforced by adding the constraint that the weights sum to one.
- The same weights that reconstruct the data points in D dimensions should reconstruct them in the manifold in d dimensions.
  – The weights characterize the intrinsic geometric properties of each neighborhood.
Think Globally…
LLE Algorithm: Local Fit
(1) Construct a neighborhood graph G = (V, E) such that V = {x_i : i = 1, . . . , n} and E = {(i, j) : j is a neighbor of i, i.e. j ∈ N_i}, e.g. k-nearest neighbors or ε-neighbors.
(2) Local Fit: Pick a point x_i and its neighbors N_i. Compute the local fitting weights
min_{Σ_{j∈N_i} w_ij = 1} ‖x_i − Σ_{j∈N_i} w_ij x_j‖²,
which is equivalent to
min_{Σ_{j∈N_i} w_ij = 1} ‖Σ_{j∈N_i} w_ij (x_j − x_i)‖²,
that is, finding a linear combination (possibly not unique!) for the subspace spanned by {(x_j − x_i) : j ∈ N_i}.
LLE Algorithm: Local Fit (II)
- This can be done by the Lagrange multiplier method, i.e. solving
min_{w_ij} (1/2) ‖Σ_{j∈N_i} w_ij (x_j − x_i)‖² + λ(1 − Σ_{j∈N_i} w_ij).
Let w_i = [w_{ij_1}, . . . , w_{ij_k}]' ∈ R^k, X̄_i = [x_{j_1} − x_i, . . . , x_{j_k} − x_i], and the local Gram (covariance) matrix C_i(j, k) = ⟨x_j − x_i, x_k − x_i⟩, whence the weights are
w_i = λ C_i† 1,   (5)
where the Lagrange multiplier equals the following normalization parameter
λ = 1 / (1' C_i† 1),   (6)
and C_i† is a Moore-Penrose (pseudo-)inverse of C_i. Note that C_i is often ill-conditioned, and to find its Moore-Penrose inverse one can use a regularization method (C_i + µI)⁻¹ for some µ > 0.
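The regularized local fit can be sketched in NumPy (`lle_weights` is an illustrative name; scaling µ by tr(C_i) is a common convention, an assumption beyond the slides):

```python
import numpy as np

def lle_weights(X, i, neighbors, mu=1e-3):
    """Reconstruction weights for x_i from its neighbors, via the regularized Gram matrix."""
    k = len(neighbors)
    Z = X[neighbors] - X[i]                    # rows: x_j - x_i
    C = Z @ Z.T                                # local Gram matrix C_i
    C = C + mu * np.trace(C) * np.eye(k)       # regularize: C_i + mu*tr(C_i) I
    w = np.linalg.solve(C, np.ones(k))         # solve C w = 1 (up to the multiplier lambda)
    return w / w.sum()                         # normalize so the weights sum to one
```

For a point that is exactly the midpoint of its two neighbors, the Gram matrix is singular; the regularization still recovers the symmetric weights (1/2, 1/2).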
LLE Algorithm: Global Alignment
- Define an n-by-n weight matrix W:
W_ij = w_ij if j ∈ N_i, and 0 otherwise.
- Compute the global d-by-n embedding matrix Y:
min_Y Σ_i ‖y_i − Σ_{j=1}^{n} W_ij y_j‖² = tr(Y (I − W)' (I − W) Y')
  – (Kernel) Construct a positive semi-definite matrix K = (I − W)'(I − W) and find the d + 1 smallest eigenvectors of K, v_0, v_1, . . . , v_d, associated with the smallest eigenvalues λ_0, . . . , λ_d. Drop the smallest eigenvector, which is the constant vector explaining the translation degree of freedom, and set Y = [v_1/√λ_1, . . . , v_d/√λ_d]'.
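The global alignment step can be sketched as follows (`lle_embed` is an illustrative name; this sketch returns the raw bottom eigenvectors, whereas the slide's convention additionally rescales each v_j by 1/√λ_j):

```python
import numpy as np

def lle_embed(W, d):
    """Global alignment: bottom eigenvectors of K = (I-W)'(I-W), dropping the constant one."""
    n = W.shape[0]
    M = np.eye(n) - W
    K = M.T @ M                          # p.s.d. alignment kernel
    evals, evecs = np.linalg.eigh(K)     # eigenvalues in ascending order
    return evecs[:, 1:d + 1].T           # drop the constant 0-eigenvector; d x n embedding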
Remarks on LLE
- Searching k-nearest neighbors is of O(kN)
- W is sparse: kN/N² = k/N fraction of nonzeros
- W might be negative; an additional nonnegativity constraint can be imposed
- B = (I − W)'(I − W) is positive semi-definite (p.s.d.)
- Open problem: exact reconstruction condition?
Grolliers Encyclopedia
Issues of LLE
Recall the local fit: pick a point x_i and its neighbors N_i, and compute the local fitting weights
min_{Σ_{j∈N_i} w_ij = 1} ‖x_i − Σ_{j∈N_i} w_ij x_j‖² = min_{Σ_{j∈N_i} w_ij = 1} ‖Σ_{j∈N_i} w_ij (x_j − x_i)‖²,
solved via
min_{w_ij} (1/2) ‖Σ_{j∈N_i} w_ij (x_j − x_i)‖² + λ(1 − Σ_{j∈N_i} w_ij),
w_i = λ C_i† 1,  λ = 1 / (1' C_i† 1),  with local Gram matrix C_i(j, k) = ⟨x_j − x_i, x_k − x_i⟩.
Is this problem ill-posed or ill-conditioned?
Issues of LLE
- Low-pass filter of the constant 1-vector:
  – preserves projections on bottom eigenvectors associated with small eigenvalues
  – suppresses projections on top eigenvectors associated with large eigenvalues
- If the 1-vector is not well spread over the null eigenspace, instability and missing directions appear as µ goes down!
w_i(µ) = λ (C_i + µI)⁻¹ 1 = λ Σ_j 1/(λ_j^(i) + µ) · v_j v_j' 1,
where the local PCA gives C_i = V Λ V' (Λ = diag(λ_j^(i)), V = [v_j]).
The weight w_i(µ) is dominated by the components with λ_j^(i) ≪ µ.
Modified LLE (MLLE)
- Modified Locally Linear Embedding (MLLE) remedies the issue using multiple weight vectors projected from the orthogonal complement of local PCA.
- MLLE replaces the weight vector w_i (w_i' 1_{k_i} = 1) above by a weight matrix W_i ∈ R^{k_i×s_i}, a family of s_i weight vectors using the bottom s_i eigenvectors of C_i, V_i = [v_{k_i−s_i+1}, . . . , v_{k_i}] ∈ R^{k_i×s_i}, such that
W_i = (1 − α_i) w_i(µ) 1'_{s_i} + V_i H_i',   (1)
where α_i = ‖V_i' 1_{k_i}‖₂ / √s_i and H_i = I_{s_i} − 2uu' (‖u‖₂ = 1 or 0) is a Householder matrix (H_i := I_{s_i} if u = 0) such that H_i V_i' 1_{k_i} = α_i 1_{s_i}.
- Hence W_i' 1_{k_i} = 1_{s_i}: every column of W_i is a legal weight vector.
MLLE (II)
- u: one can choose u in the direction of V_i' 1_{k_i} − α_i 1_{s_i}.
- s_i: an adaptive choice of s_i is based on the trade-off between residual variation and explained variation.
  – For each x_i and its neighbors N_i (k_i = |N_i|), let C_i = V Λ V' be its eigenvalue decomposition, where Λ = diag(λ_1, . . . , λ_{k_i}) with λ_1 ≥ · · · ≥ λ_{k_i}.
  – Find the dimension of the almost-normal subspace s_i as the maximal size such that the ratio of the residual eigenvalue sum over the principal eigenvalue sum is below a threshold, i.e.
s_i = max { l : l ≤ k_i − d, (Σ_{j=k_i−l+1}^{k_i} λ_j) / (Σ_{j=1}^{k_i−l} λ_j) ≤ η },
where η is a parameter, such as the median of the ratios of residual eigenvalue sum over principal eigenvalue sum.
MLLE (III)
- Equipped with this weight matrix, one can set the objective function by simultaneously minimizing the residue over all reconstruction weights:
min_Y Σ_i Σ_{l=1}^{s_i} ‖y_i − Σ_{j∈N_i} W_i(j, l) y_j‖² := Σ_i ‖Y Ŵ_i‖²_F = tr[ Y (Σ_i Ŵ_i Ŵ_i') Y' ],
where Ŵ_i is the embedding of W_i ∈ R^{k_i×s_i} into R^{n×s_i}:
Ŵ_i(j, :) = 1'_{s_i} if j = i; W_i(j, :) if j ∈ N_i; 0 otherwise.   (2)
MLLE Algorithm
- Step 1 (local fitting): for each x_i and its neighbors N_i, solve
min_{Σ_{j∈N_i} w_ij = 1} ‖x_i − Σ_{j∈N_i} w_ij x_j‖²
by ŵ_i(µ) = (C_i + µI)⁻¹ 1 for some regularization parameter µ > 0 and w_i = ŵ_i / (ŵ_i' 1). This is the same as LLE.
- Step 2 (local residue PCA): get W_i as above.
- Step 3 (global alignment): define the kernel matrix K = Ŵ Ŵ' = U Λ U' with Λ = diag(λ_1, . . . , λ_n), where λ_1 ≥ λ_2 ≥ . . . ≥ λ_{n−1} > λ_n = 0; choose the bottom d + 1 eigenvalues and drop the smallest one (the 0-constant), so that U_d = [u_{n−d}, . . . , u_{n−1}] and Λ_d = diag(λ_{n−d}, . . . , λ_{n−1}). Return the embedding Y_d = U_d Λ_d^{1/2}.
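In practice, scikit-learn implements MLLE as `LocallyLinearEmbedding` with `method='modified'`; a minimal usage sketch on the Swiss roll (the parameter values are illustrative):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# 500 points sampled from the Swiss roll in R^3
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# MLLE: multiple weight vectors per neighborhood, method='modified'
mlle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method='modified')
Y = mlle.fit_transform(X)   # 500 x 2 embedding
```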
Issues of MLLE
- MLLE computes bottom eigenvectors of the local Gram (covariance) matrix, which is expensive in computation and sensitive to noise.
- How about only using top eigenvectors in local PCA?
  – LTSA
  – Hessian LLE
Local Tangent Space Alignment
Find a good approximation of the tangent space of a curve using discrete samples. — Principal curve/manifold (Hastie-Stuetzle'89, Zha-Zhang'02)
Local PCA
- For each x_i in R^p with neighborhood N_i of size |N_i| = k_i − 1, let X^(i) = [x_{j_1}, x_{j_2}, . . . , x_{j_{k_i}}] ∈ R^{p×k_i} be the coordinate matrix.
- Consider the local SVD (PCA)
X̃^(i) = [x_{i_1} − µ_i, . . . , x_{i_{k_i}} − µ_i]_{p×k_i} = X^(i) H = Ũ^(i) Σ̃ (Ṽ^(i))', where H = I − (1/k_i) 1_{k_i} 1'_{k_i}.
  – Left singular vectors {Ũ_1^(i), . . . , Ũ_d^(i)} give an orthonormal basis of the approximate d-dimensional tangent space at x_i.
  – Right singular vectors (Ṽ_1^(i), . . . , Ṽ_d^(i)) ∈ R^{k_i×d} give the d coordinates of the k_i samples with respect to the tangent space basis.
LTSA
- Let Y_i ∈ R^{d×k_i} be the embedding coordinates of the samples in R^d.
- Let L_i ∈ R^{p×d} be an estimated basis of the tangent space at x_i in R^p.
- Let Θ_i = Ũ_d^(i) Σ̃_d (Ṽ_d^(i))' ∈ R^{p×k_i} be the truncated SVD using the top d components.
- LTSA looks for the minimizer of the following problem:
min_{Y,L} Σ_i ‖E_i‖² = Σ_i ‖ Y_i (I − (1/k_i) 1 1') − L_i' Θ_i ‖².   (3)
LTSA
- One can estimate L_i' = Y_i (I − (1/k_i) 1 1') Θ_i†. Hence the problem reduces to
min_Y Σ_i ‖E_i‖² = Σ_i ‖ Y_i (I − (1/k_i) 1 1')(I − Θ_i† Θ_i) ‖²,   (4)
where I − Θ_i† Θ_i is the projection onto the normal space at x_i.
LTSA Kernel
G_i = [k_i^{-1/2} 1_{k_i}, Ṽ_1^(i), . . . , Ṽ_d^(i)] ∈ R^{k_i×(d+1)},
W_i = I − G_i G_i' ∈ R^{k_i×k_i},
K_{n×n} = Φ = Σ_{i=1}^{n} S_i W_i W_i' S_i',
where the selection matrix S_i ∈ R^{n×k} satisfies [x_{i_1}, . . . , x_{i_k}] = [x_1, . . . , x_n] S_i.
1) The constant vector is an eigenvector corresponding to the 0 eigenvalue.
2) So choose the d + 1 smallest eigenvectors for the embedding and drop the constant one.
LTSA Algorithm (Zha-Zhang’02)
Algorithm 6: LTSA Algorithm
Input: A weighted undirected graph G = (V, E) such that
1 V = {x_i ∈ R^p : i = 1, . . . , n}
2 E = {(i, j) : if j is a neighbor of i, i.e. j ∈ N_i}, e.g. k-nearest neighbors
Output: Euclidean d-dimensional coordinates Y = [y_i] ∈ R^{d×n} of the data.
3 Step 1 (local PCA): Compute the local SVD on the neighborhood of x_i, x_{i_j} ∈ N(x_i),
X̃^(i) = [x_{i_1} − µ_i, . . . , x_{i_k} − µ_i]_{p×k} = Ũ^(i) Σ̃ (Ṽ^(i))', where µ_i = (1/k) Σ_{j=1}^{k} x_{i_j}. Define G_i = [k^{-1/2} 1_k, Ṽ_1^(i), . . . , Ṽ_d^(i)] ∈ R^{k×(d+1)};
4 Step 2 (tangent space alignment): Alignment (kernel) matrix
K_{n×n} = Σ_{i=1}^{n} S_i W_i W_i' S_i',  W_i = I − G_i G_i' ∈ R^{k×k},
where the selection matrix S_i ∈ R^{n×k} satisfies [x_{i_1}, . . . , x_{i_k}] = [x_1, . . . , x_n] S_i;
5 Step 3: Find the smallest d + 1 eigenvectors of K and drop the smallest eigenvector; the remaining d eigenvectors give rise to a d-embedding.
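A NumPy sketch of Algorithm 6, assuming k-nearest neighborhoods that include the point itself (`ltsa_embed` is an illustrative name; `sklearn.manifold.LocallyLinearEmbedding(method='ltsa')` is a library version):

```python
import numpy as np

def ltsa_embed(X, n_neighbors=8, d=2):
    """LTSA sketch: local PCA coordinates aligned via K = sum_i S_i W_i W_i' S_i'."""
    n = X.shape[0]
    E = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    nbrs = np.argsort(E, axis=1)[:, :n_neighbors]      # neighborhoods include the point itself
    K = np.zeros((n, n))
    for i in range(n):
        idx = nbrs[i]
        Xi = X[idx] - X[idx].mean(axis=0)              # rows: centered neighborhood points
        U, _, _ = np.linalg.svd(Xi, full_matrices=False)
        # With points as rows, U[:, :d] plays the role of the right singular
        # vectors V^(i) of the p x k matrix in the slides.
        G = np.hstack([np.full((n_neighbors, 1), 1 / np.sqrt(n_neighbors)), U[:, :d]])
        W = np.eye(n_neighbors) - G @ G.T              # W_i = I - G_i G_i'
        K[np.ix_(idx, idx)] += W @ W.T                 # accumulate S_i W_i W_i' S_i'
    evals, evecs = np.linalg.eigh(K)
    return evecs[:, 1:d + 1].T                         # drop the constant 0-eigenvector
```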
Comparisons on Swiss Roll
https://nbviewer.jupyter.org/url/math.stanford.edu/~yuany/course/data/plot_compare_methods.ipynb
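A minimal version of such a comparison can be run directly with scikit-learn, applying Isomap, standard LLE, and LTSA to the Swiss roll (the parameter values are illustrative):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

# Sample the Swiss roll in R^3
X, color = make_swiss_roll(n_samples=800, random_state=0)

# Three of the spectral embedding methods discussed above
methods = {
    'Isomap': Isomap(n_neighbors=10, n_components=2),
    'LLE': LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='standard'),
    'LTSA': LocallyLinearEmbedding(n_neighbors=10, n_components=2, method='ltsa'),
}
embeddings = {name: est.fit_transform(X) for name, est in methods.items()}
```

Plotting each 2-D embedding colored by the roll parameter (`color`) shows how well each method unrolls the manifold.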
Summary..
ISOMAP:
- Does MDS on the geodesic distance matrix.
- Global approach: O(N³) (but see L-ISOMAP).
- Might not work for nonconvex manifolds with holes.
- Extensions: Landmark, Conformal & Isometric ISOMAP.
LLE:
- Models local neighborhoods as linear patches and then embeds them in a lower-dimensional manifold.
- Local approach: O(N²).
- Handles nonconvex manifolds with holes.
- Extensions: MLLE, LTSA, Hessian LLE, Laplacian Eigenmaps, etc.
Both need the manifold to be finely sampled.
Reference
- Tenenbaum, de Silva, and Langford. A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science 290:2319-2323, 22 Dec. 2000.
- Roweis and Saul. Nonlinear Dimensionality Reduction by Locally Linear Embedding. Science 290:2323-2326, 22 Dec. 2000.
- M. Bernstein, V. de Silva, J. Langford, and J. Tenenbaum. Graph Approximations to Geodesics on Embedded Manifolds. Technical Report, Department of Psychology, Stanford University, 2000.
- V. de Silva and J.B. Tenenbaum. Global versus local methods in nonlinear dimensionality reduction. Neural Information Processing Systems 15 (NIPS 2002), pp. 705-712, 2003.
- V. de Silva and J.B. Tenenbaum. Unsupervised learning of curved manifolds. Nonlinear Estimation and Classification, 2002.
- V. de Silva and J.B. Tenenbaum. Sparse multidimensional scaling using landmark points. Available at: http://math.stanford.edu/~silva/public/publications.html
- Zhenyue Zhang and Jing Wang. MLLE: Modified Locally Linear Embedding Using Multiple Weights. NIPS 2006.
- Zhenyue Zhang and Hongyuan Zha. Principal Manifolds and Nonlinear Dimension Reduction via Local Tangent Space Alignment. SIAM Journal on Scientific Computing, 2002.
Acknowledgement
- Slides stolen from Ettinger, Vikas C. Raykar, Vin de Silva.