
On the Computation of Distances between 2D-Histograms by Minimum Cost Flows

Introduction Formulations Results Conclusions References On the Computation of Distances between 2D-Histograms by Minimum Cost Flows Stefano Gualandi, Federico Bassetti and Marco Veneroni, Università di Pavia, Dipartimento di Matematica


  1. On the Computation of Distances between 2D-Histograms by Minimum Cost Flows. Stefano Gualandi, Federico Bassetti and Marco Veneroni, Università di Pavia, Dipartimento di Matematica. Email: stefano.gualandi@unipv.it, Twitter: @famo2spaghi

  3. Let $X = \{x_1, \dots, x_n\}$ and $Y = \{y_1, \dots, y_m\}$ be two discrete spaces. Let $\mu = (\mu(x_1), \dots, \mu(x_n))$ be a probability vector on $X$, $\nu = (\nu(y_1), \dots, \nu(y_m))$ a probability vector on $Y$, and $c : X \times Y \to \mathbb{R}_+$ a cost function.

     Definition 1 (Kantorovich-Rubinshtein Functional (San15; LS17)). The Kantorovich-Rubinshtein functional in the discrete setting is the following LP problem (a special case of the Hitchcock Problem):

     $$
     \begin{aligned}
     W_c(\mu, \nu) = \min \quad & \sum_{x \in X} \sum_{y \in Y} c(x, y)\, \pi(x, y) \\
     \text{s.t.} \quad & \sum_{y \in Y} \pi(x, y) = \mu(x) && \forall x \in X \\
     & \sum_{x \in X} \pi(x, y) = \nu(y) && \forall y \in Y \\
     & \pi(x, y) \geq 0.
     \end{aligned}
     $$
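
The LP in Definition 1 can be handed directly to a generic LP solver for small instances. The following is a minimal sketch, assuming SciPy's `linprog` with the HiGHS backend is available; the function name `kantorovich_lp` and the row-major flattening of $\pi$ are illustrative choices, not taken from the slides.

```python
import numpy as np
from scipy.optimize import linprog


def kantorovich_lp(mu, nu, cost):
    """Solve the discrete Kantorovich-Rubinshtein LP; return (value, plan).

    Hypothetical helper: pi is flattened row-major, so pi(x, y) sits at
    index x * m + y of the LP variable vector.
    """
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape
    # Row constraints: sum_y pi(x, y) = mu(x) for every x
    a_rows = np.kron(np.eye(n), np.ones((1, m)))
    # Column constraints: sum_x pi(x, y) = nu(y) for every y
    a_cols = np.kron(np.ones((1, n)), np.eye(m))
    res = linprog(
        cost.ravel(),
        A_eq=np.vstack([a_rows, a_cols]),
        b_eq=np.concatenate([mu, nu]),
        bounds=(0, None),  # pi(x, y) >= 0
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x.reshape(n, m)


# Two points on a line at positions 0 and 1, with cost |x - y|
value, plan = kantorovich_lp([0.5, 0.5], [0.25, 0.75], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy instance a quarter of the mass has to move across one unit of distance. The dense constraint matrix has $(n + m) \cdot nm$ entries, so the sketch is only meant for very small $n$ and $m$.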

  9. Definition 2 (Wasserstein distance (San15)). When $X = Y$ and $c(x, y) = d^p(x, y)$, where $d$ is a ground distance on $X$, we define the Wasserstein distance of order $p$ as

     $$W_p(\mu, \nu) := W_{d^p}(\mu, \nu)^{\min(1/p,\, 1)},$$

     which is a distance on the simplex of probability vectors on $X$.

     OUR CONTRIBUTION: For Wasserstein distances of order $p = 1$ with the following ground distances:

     $d_1(x, y) = \|x - y\|_1$ → Exact method (easy)

     $d_2(x, y) = \|x - y\|_2$ → Exact and approximation methods (tricky)

     $d_\infty(x, y) = \|x - y\|_\infty$ → Exact method (easy)

     All the methods rely on solving an Uncapacitated Min Cost Flow problem (but on different networks).
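
For concreteness, the three ground distances can be written out as plain functions on coordinate tuples. This is a small sketch using only the standard library; the function names `d1`, `d2` and `dinf` simply mirror the notation above and are otherwise illustrative.

```python
import math


def d1(x, y):
    """Manhattan distance ||x - y||_1."""
    return sum(abs(a - b) for a, b in zip(x, y))


def d2(x, y):
    """Euclidean distance ||x - y||_2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dinf(x, y):
    """Chebyshev distance ||x - y||_inf."""
    return max(abs(a - b) for a, b in zip(x, y))


# Example pair of bin centers
p, q = (0, 0), (3, 4)
distances = (d1(p, q), d2(p, q), dinf(p, q))
```

On integer bin coordinates $d_1$ and $d_\infty$ always take integer values, while $d_2$ is generally irrational.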

  10. 2D Histograms. 2D histograms can be seen as discrete measures on a finite set of points in $\mathbb{R}^2$. To represent 2D histograms with $N \times N$ equally spaced bins, we take

      $$X = L_N := \{\, i = (i_1, i_2) : i_1 = 0, \dots, N - 1,\ i_2 = 0, \dots, N - 1 \,\}.$$

      One can think of each point $(i_1, i_2)$ as the center of a bin.
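
As an illustration of this representation, the sketch below enumerates $L_N$ and normalizes an $N \times N$ table of bin counts into a probability vector indexed by lattice points. The helper names `lattice` and `histogram_to_measure` are hypothetical, and the nested-list input format is an assumption made for the example.

```python
def lattice(N):
    """Enumerate L_N = {(i1, i2) : 0 <= i1, i2 <= N - 1} in row-major order."""
    return [(i1, i2) for i1 in range(N) for i2 in range(N)]


def histogram_to_measure(counts):
    """Normalize an N x N nested list of non-negative counts to a measure.

    Returns a dict mapping each bin center (i1, i2) to its probability mass.
    """
    N = len(counts)
    total = sum(sum(row) for row in counts)
    if total <= 0:
        raise ValueError("histogram must contain positive total mass")
    return {(i1, i2): counts[i1][i2] / total for (i1, i2) in lattice(N)}


# A 2 x 2 histogram with 8 observations in total
measure = histogram_to_measure([[1, 3], [0, 4]])
```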

  13. Computing Wasserstein Distance of order 1. Computing $W_1$ distances between 2D histograms with $n = N^2$ bins reduces to an Uncapacitated Min Cost Flow problem on a bipartite graph with $2n$ nodes and $n^2$ arcs, and can be solved in $O(n^3 \log n)$ time (Orl93; GTT89). Related to Earth Mover Distance (RTG00; PW09; Cut13).

      Ground distance: $d_2(i, j) = \sqrt{(i_1 - j_1)^2 + (i_2 - j_2)^2}$

  18. $W_1$ distances: Recomputation from the Literature (SSG17). Best results with the Network Simplex of the Lemon Graph Library v1.3.1. Similar results with CPLEX 12.7 and Gurobi 7.0.

      | | 32x32 | 64x64 | 128x128 | 256x256 | 512x512 |
      |---|---|---|---|---|---|
      | vertices | 2 048 | 8 192 | 32 768 | 131 072 | 524 288 |
      | arcs | 1 048 576 | 16 777 216 | 268 435 456 | 4 294 967 296 | 68 719 476 736 |
      | runtime | 0.4 s | 10.9 s | out-of-memory | out-of-memory | out-of-memory |
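
The vertex and arc counts reported above follow from the bipartite formulation, which has $2n$ nodes and $n^2$ arcs for $n = N^2$ bins. A short sketch of that arithmetic (the function name `bipartite_size` is illustrative):

```python
def bipartite_size(N):
    """Return (vertices, arcs) of the bipartite network for N x N histograms."""
    n = N * N  # number of bins
    return 2 * n, n * n


# Reproduce the size rows for the image sizes listed above
sizes = {N: bipartite_size(N) for N in (32, 64, 128, 256, 512)}
for N, (vertices, arcs) in sizes.items():
    print(f"{N}x{N}: vertices={vertices:,} arcs={arcs:,}")
```

Doubling $N$ multiplies the arc count by 16, which is consistent with the solvers running out of memory from 128x128 onward.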

  21. Min Cost Flow Formulation on $K_n$. For computing $W_{d_h}$ distances between probability vectors $\mu, \nu$ on $X$, we define the complete flow network $K_n = (V, A)$:

      Nodes: $V = L_N$

      Arcs: $A = \{(i, j) \mid i, j \in V,\ i \neq j\}$

      Arc costs: $c_{ij} = d_{ij} = \|i - j\|_h$, $\forall (i, j) \in A$

      Flow balance: $b_i = \mu(x_i) - \nu(y_i)$, $\forall i \in V$
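
To make the formulation on $K_n$ concrete, here is a minimal sketch that builds the network for a tiny histogram and solves the flow problem as an LP. It swaps in SciPy's generic `linprog` solver in place of a network simplex code, and it assumes the sign convention "outflow minus inflow equals $b_i$"; the function name `w1_complete_network` is hypothetical.

```python
import itertools

import numpy as np
from scipy.optimize import linprog


def w1_complete_network(mu, nu, h=2):
    """Min cost flow on K_n for N x N histograms mu, nu; h in {1, 2, np.inf}."""
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    N = mu.shape[0]
    # Nodes V = L_N, in the same row-major order as mu.ravel()
    nodes = list(itertools.product(range(N), repeat=2))
    # Arcs A = all ordered pairs (i, j) with i != j
    arcs = list(itertools.permutations(range(len(nodes)), 2))
    # Arc costs c_ij = ||i - j||_h
    cost = [np.linalg.norm(np.subtract(nodes[i], nodes[j]), ord=h) for i, j in arcs]
    # Node-arc incidence matrix: +1 at the tail of an arc, -1 at its head
    a_eq = np.zeros((len(nodes), len(arcs)))
    for k, (i, j) in enumerate(arcs):
        a_eq[i, k] = 1.0
        a_eq[j, k] = -1.0
    # Flow balance b_i = mu(x_i) - nu(y_i) (assumed: outflow - inflow = b_i)
    b = (mu - nu).ravel()
    res = linprog(cost, A_eq=a_eq, b_eq=b, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


# All mass at bin (0, 0) versus all mass at bin (1, 1) on a 2 x 2 grid
mu = [[1.0, 0.0], [0.0, 0.0]]
nu = [[0.0, 0.0], [0.0, 1.0]]
results = {h: w1_complete_network(mu, nu, h) for h in (1, 2, np.inf)}
```

With a single unit of mass moving between opposite corners, the optimal value is just the ground distance between the two bins, which gives a quick sanity check for each choice of $h$. The dense incidence matrix limits this sketch to very small grids.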
