On Scalable and Efficient Computation of Large Scale Optimal Transport

Yujia Xie, Minshuo Chen, Haoming Jiang, Tuo Zhao, Hongyuan Zha

School of Computational Science and Engineering
H. Milton Stewart School of Industrial and Systems Engineering
Georgia Institute of Technology

Jun. 13, 2019

Thirty-sixth International Conference on Machine Learning

Optimal Transport (OT)

The OT problem aims to align data from multiple sources.

• Resource Allocation: We want to assign a set of assets to a set of receivers so that an optimal economic benefit is achieved.
• Domain Adaptation: We collect multiple datasets from different domains, and we need to learn a model from a source dataset that can be further adapted to target datasets.

Both applications can be formulated as OT problems.

Xie et al., On Scalable and Efficient Computation of Large Scale Optimal Transport

Optimal Transport Formulation

OT aims to find an optimal joint distribution $\gamma^*$ of $\mu$ and $\nu$ that minimizes the expectation of some cost function $c$, i.e.,

$$\gamma^* = \arg\min_{\gamma} \mathbb{E}_{(X,Y)\sim\gamma}[c(X,Y)], \quad \text{subject to } X \sim \mu,\ Y \sim \nu.$$

$\gamma^*$ is referred to as the optimal transport plan, suggesting the way to transport between $\mu$ and $\nu$ with minimum cost.

Existing Methods

• Discretization + Linear Programming: the number of grid cells needs to scale exponentially w.r.t. the dimension.
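To make the discretized formulation concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method from the slides. It assumes two uniform empirical measures with the same number of points, in which case the linear program has an optimal solution that is a permutation, so a tiny instance can be solved by enumeration. The names `grid_cells`, `squared_distance`, and `brute_force_ot` are hypothetical helpers introduced for this example.

```python
from itertools import permutations


def grid_cells(bins_per_axis, dim):
    """Number of cells in a regular grid with `bins_per_axis` bins on each of `dim` axes."""
    return bins_per_axis ** dim


def squared_distance(x, y):
    """Squared Euclidean cost c(x, y) between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def brute_force_ot(xs, ys, cost=squared_distance):
    """Exact OT between two uniform empirical measures of equal size.

    Assumption: uniform weights and len(xs) == len(ys), so an optimal plan
    can be taken to be a permutation. Enumeration costs O(n!) and is only
    feasible for very small n.
    Returns (expected cost under the plan, permutation mapping xs[i] -> ys[perm[i]]).
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost(xs[i], ys[perm[i]]) for i in range(n)) / n
        if total < best_cost:
            best_cost, best_perm = total, perm
    return best_cost, best_perm


xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
ys = [(0.0, 1.1), (0.1, 0.0), (1.0, 0.2)]
ot_cost, plan = brute_force_ot(xs, ys)
print(plan, ot_cost)

# Grid size for 10 bins per axis in 2 and in 10 dimensions.
print(grid_cells(10, 2), grid_cells(10, 10))
```

The second print illustrates the scaling issue named on the slide: with a fixed number of bins per axis, the number of grid cells (and hence the size of the linear program) grows exponentially in the dimension.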

SPOT

• OT: $\gamma^* = \arg\min_{\gamma} \mathbb{E}_{(X,Y)\sim\gamma}[c(X,Y)]$, s.t. $X \sim \mu$, $Y \sim \nu$.
• Approximate $\gamma^*$ by an implicit generative model $G(Z)$,

$$G(Z) = \begin{bmatrix} G_X(Z) \\ G_Y(Z) \end{bmatrix} \approx \begin{bmatrix} X \\ Y \end{bmatrix},$$

where $Z \sim \rho$, $X \sim \mu$, $Y \sim \nu$.
• Substituting $G(Z)$ into the OT problem, we can rewrite the problem as

$$\arg\min_{G} \mathbb{E}_{Z\sim\rho}[c(G_X(Z), G_Y(Z))], \quad \text{subject to } W_1(G_X(Z), \mu) = 0,\ W_1(G_Y(Z), \nu) = 0,$$

where $W_1(G_X(Z), \mu)$ denotes the standard Wasserstein metric between the random vector $G_X(Z)$ and the distribution $\mu$. Here we use the fact that $W_1(G_X(Z), \mu) = 0$ indicates $G_X(Z) \sim \mu$.
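The idea of representing a transport plan by a generator with a shared latent variable can be sketched with a toy example. This is an assumption-laden illustration, not the networks used in the talk: the maps `g_x` and `g_y` below are hand-picked hypothetical functions (a real generator would be learned), $\rho$ is taken to be a standard normal, and the cost is the squared difference. With these choices $G_X(Z) \sim N(0, 1)$ and $G_Y(Z) \sim N(2, 1)$, and every generated pair has cost 4.

```python
import random


def g_x(z):
    """Hypothetical generator head for the X marginal: N(0, 1) when z ~ N(0, 1)."""
    return z


def g_y(z):
    """Hypothetical generator head for the Y marginal: N(2, 1) when z ~ N(0, 1)."""
    return z + 2.0


def cost(x, y):
    """Squared-difference cost c(x, y)."""
    return (x - y) ** 2


def sample_pairs(n, seed=0):
    """Draw n latent samples and push each through both heads to get paired samples."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [(g_x(z), g_y(z)) for z in zs]


def expected_cost(pairs):
    """Monte Carlo estimate of E_Z[c(G_X(Z), G_Y(Z))]."""
    return sum(cost(x, y) for x, y in pairs) / len(pairs)


pairs = sample_pairs(1000)
estimate = expected_cost(pairs)
print(estimate)
```

Because both heads read the same `z`, each draw yields a coupled pair, which is what lets a single generator stand in for a joint distribution over $(X, Y)$.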

SPOT

$$\min_{G \in \mathcal{G}} \ \max_{\lambda_X \in \mathcal{F}_X^1,\ \lambda_Y \in \mathcal{F}_Y^1} \ \mathbb{E}_{Z\sim\rho}[c(G_X(Z), G_Y(Z))] + \eta \big( \lambda_X(G_X(Z), X) + \lambda_Y(G_Y(Z), Y) \big)$$

[Figure: network diagram with nodes $Z$, $G$, $G_X(Z)$, $G_Y(Z)$, $X$, $Y$, $\lambda_X$, $\lambda_Y$, $c$, and $L$.]
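A minibatch evaluation of this objective might look like the sketch below. It rests on an assumption the slide does not spell out: that $\lambda_X(G_X(Z), X)$ is shorthand for the Kantorovich-Rubinstein style gap $\mathbb{E}[\lambda_X(G_X(Z))] - \mathbb{E}[\lambda_X(X)]$ with a 1-Lipschitz critic $\lambda_X$, and likewise for $Y$. The function `spot_objective` and its arguments are hypothetical names for illustration. The code only evaluates the objective for fixed functions; it does not perform the minimization over $G$ or the maximization over the critics.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def spot_objective(zs, xs, ys, g_x, g_y, cost, critic_x, critic_y, eta):
    """Evaluate transport cost plus eta times the two critic gaps on a minibatch.

    Assumption: each critic term is E[critic(generated)] - E[critic(real)].
    """
    gen_x = [g_x(z) for z in zs]
    gen_y = [g_y(z) for z in zs]
    transport = mean([cost(a, b) for a, b in zip(gen_x, gen_y)])
    gap_x = mean([critic_x(a) for a in gen_x]) - mean([critic_x(x) for x in xs])
    gap_y = mean([critic_y(b) for b in gen_y]) - mean([critic_y(y) for y in ys])
    return transport + eta * (gap_x + gap_y)


# Toy inputs: the X head matches the real X samples, the Y head is shifted by 1.
zs = [0.0, 1.0, 2.0]
xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 2.0]
value = spot_objective(
    zs, xs, ys,
    g_x=lambda z: z,
    g_y=lambda z: z + 1.0,
    cost=lambda a, b: abs(a - b),
    critic_x=lambda t: t,  # identity is 1-Lipschitz
    critic_y=lambda t: t,
    eta=10.0,
)
print(value)
```

In this toy case the transport term is 1, the X gap is 0, and the Y gap is 1, so a large $\eta$ makes the marginal mismatch dominate the objective, which is the role the penalty plays in the formulation above.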

Computing Wasserstein Distance (WD)

WD is the expected cost of the optimal transport plan,

$$W = \mathbb{E}_{(X,Y)\sim\gamma^*}[c(X, Y)].$$

[Figure: three panels, for LR = $10^{-3}$, LR = $10^{-4}$, and LR = $10^{-5}$.]

Here, ROT is the state-of-the-art method (Seguy, 2018).
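As a small worked instance of "expected cost under the optimal plan", the one-dimensional case can be computed exactly. The sketch below assumes two equally sized samples with uniform weights and the cost $c(x, y) = |x - y|^p$ with $p \ge 1$, for which pairing sorted values is an optimal plan. It is a reference computation for intuition, not the estimator evaluated in the talk, and `wasserstein_1d` is a hypothetical helper name.

```python
def wasserstein_1d(xs, ys, p=1):
    """Expected cost |x - y|**p under the sorted (monotone) coupling.

    Assumptions: one-dimensional samples, equal sizes, uniform weights, p >= 1.
    Returns E[c(X, Y)] under that plan, without taking a p-th root.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized samples")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) ** p for a, b in paired) / len(xs)


w = wasserstein_1d([0.0, 1.0, 3.0], [5.0, 4.0, 2.0])
print(w)
```

Here the sorted pairs are (0, 2), (1, 4), and (3, 5), so the expected cost is 7/3.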

Generate Paired Samples

[Figure: paired samples, Photos-Monet.]

Domain Adaptation (DA)

New DA method: DASPOT

Setting:

Goal: predict the labels of $\{y_j\}$.

Source                        MNIST    USPS     SVHN     MNIST
Target                        USPS     MNIST    MNIST    MNISTM
ROT (Seguy, 2018)             72.6%    60.5%    62.9%    −
StochJDOT (Damodaran, 2018)   93.6%    90.5%    67.6%    66.7%
DeepJDOT (Damodaran, 2018)    95.7%    96.4%    96.7%    92.4%
DASPOT                        97.5%    96.5%    96.2%    94.9%

Thank you!
