
A Study of Nesterov's Scheme for Lagrangian Decomposition and MAP Labeling – PowerPoint PPT Presentation

  1. A Study of Nesterov's Scheme for Lagrangian Decomposition and MAP Labeling. Bogdan Savchynskyy, Jörg Kappes, Stefan Schmidt, Christoph Schnörr. Heidelberg Collaboratory for Image Processing (HCI), University of Heidelberg.

  2. MRF/MAP Inference – Applications
     $y^\ast = \operatorname*{arg\,min}_{y \in \mathcal{Y}^V} \Big[ \sum_{v \in V} \theta_v(y_v) + \sum_{vv' \in E} \theta_{vv'}(y_v, y_{v'}) \Big]$
     Applications: segmentation [Rother et al. 2004], [Nowozin, Lampert 2010]; multi-camera stereo [Kolmogorov, Zabih 2002]; stereo and motion [Kim et al. 2003]; clustering [Zabih, Kolmogorov 2004]; medical imaging [Raj et al. 2007]; pose estimation [Bergtholdt et al. 2010], [Bray et al. 2006]; ...
     Survey: R. Szeliski et al., "A comparative study of energy minimization methods for Markov random fields with smoothness-based priors," 2008.
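
     To make the objective concrete, here is a minimal brute-force sketch of MAP inference for a tiny pairwise model. The chain structure and all potential values are illustrative choices of ours, not taken from the talk.

```python
import itertools
import numpy as np

# Toy pairwise MRF: 3 nodes on a chain, 2 labels each (illustrative values only).
n_nodes, n_labels = 3, 2
unary = np.array([[0.0, 1.5],    # theta_v(y_v) for each node v and label y_v
                  [0.7, 0.2],
                  [1.0, 0.1]])
edges = [(0, 1), (1, 2)]
pairwise = {e: np.array([[0.0, 1.0],    # theta_vv'(y_v, y_v'), Potts-like penalty
                         [1.0, 0.0]]) for e in edges}

def energy(y):
    """E(theta, y) = sum_v theta_v(y_v) + sum_vv' theta_vv'(y_v, y_v')."""
    e = sum(unary[v, y[v]] for v in range(n_nodes))
    e += sum(pairwise[(u, v)][y[u], y[v]] for (u, v) in edges)
    return e

# Exhaustive MAP inference (only feasible for tiny models).
y_map = min(itertools.product(range(n_labels), repeat=n_nodes), key=energy)
print("MAP labeling:", y_map, "energy:", energy(y_map))
```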

  3. MRF/MAP Inference – Approaches
     Graph cuts [Boykov et al. 2001], [Kolmogorov, Zabih 2002], [Boykov, Kolmogorov 2004]: special types of potentials (sub-modularity).
     QPBO and roof duality [Hammer et al. 1984], [Boros, Hammer 2002], [Rother et al. 2007], [Kohli et al. 2008]: partial optimality.
     Combinatorial methods [Bergtholdt et al. 2006], [Schlesinger 2009], [Sanchez et al. 2008], [Marinescu, Dechter 2009]: exponential complexity in the worst case.

  4. MRF/MAP Inference – Approaches
     Message passing and belief propagation [Weiss, Freeman 2001], [Wainwright et al. 2002], [Kolmogorov 2005], [Globerson, Jaakkola 2007]: relaxation, dual decomposition; sub-optimal fixed points; stopping criterion?
     Sub-gradient optimization schemes [Komodakis et al. 2007], [Schlesinger, Giginyak 2007], [Kappes et al. 2010]: relaxation, dual decomposition; slow convergence; stopping criterion?
     Focus and contribution: local polytope / LP relaxation based on dual decomposition – similar to message passing and sub-gradient schemes; efficient iterations – outperforms sub-gradient; convergence to the optimum – outperforms message passing; stopping criterion based on the duality gap – novel!

  5. Dual Decomposition Approach
     $E(\theta, y) = E^1(\theta^1, y) + E^2(\theta^2, y)$, with $\theta^1 + \theta^2 = \theta$.
     $\max_{\theta^1 + \theta^2 = \theta} \Big[ \min_{y \in \mathcal{Y}^V} E^1(\theta^1, y) + \min_{y \in \mathcal{Y}^V} E^2(\theta^2, y) \Big] \le \min_{y \in \mathcal{Y}^V} E(\theta, y)$
     Simple subproblems, solvable in parallel. The resulting dual is concave, but non-smooth.
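
     The inequality holds because minimizing the two terms over separate labelings can only decrease the sum. A minimal numeric sketch of the bound; the toy potentials and the even split of the unaries are our own assumptions, not taken from the talk.

```python
import itertools
import numpy as np

# Toy 3-node chain with Potts edges, split into two single-edge subproblems.
n, K = 3, 2
unary = np.array([[0.0, 1.5], [0.7, 0.2], [1.0, 0.1]])
pair = np.array([[0.0, 1.0], [1.0, 0.0]])
labelings = list(itertools.product(range(K), repeat=n))

def sub_energy(y, theta_u, edge):
    """Energy of one subproblem: its share of the unaries plus a single edge term."""
    u, v = edge
    return sum(theta_u[w, y[w]] for w in range(n)) + pair[y[u], y[v]]

# Split the unaries evenly: theta^1 + theta^2 = theta.
theta1 = theta2 = unary / 2.0

dual_bound = (min(sub_energy(y, theta1, (0, 1)) for y in labelings) +
              min(sub_energy(y, theta2, (1, 2)) for y in labelings))
primal_opt = min(sub_energy(y, theta1, (0, 1)) + sub_energy(y, theta2, (1, 2))
                 for y in labelings)   # same y in both terms = the original energy E(theta, y)
print(dual_bound, "<=", primal_opt)
```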

  6. Large-Scale Convex Optimization
     Problem: dual decomposition → convex, large-scale, non-smooth.
     Sub-gradient schemes: [Komodakis et al. 2007], [Schlesinger, Giginyak 2007]
     Block-coordinate ascent: [Wainwright 2004], [Kolmogorov 2005], [Globerson, Jaakkola 2007]
     Smoothing + block-coordinate ascent: [Johnson et al. 2007], [Werner 2009]
     Smoothing technique + accelerated gradient methods: [Nesterov 2004, 2007]
     Proximal methods: [Combettes, Wajs 2005], [Beck, Teboulle 2009], [Ravikumar et al. 2010]
     Proximal primal-dual algorithms: [Esser et al. 2010]
     Solution direction: smooth and optimize.

  7. Smoothing Technique by Y. Nesterov
     $f(x) = \min_{y \in D} \big[ \langle Ax, y \rangle + \phi(y) \big] \;\longrightarrow\; \tilde{f}_\rho(x) = \min_{y \in D} \big[ \langle Ax, y \rangle + \phi(y) + \rho\, d(y) \big]$
     Left ($f$): concave, but non-smooth; convergence $t \approx O(1/\varepsilon^2)$.
     Right ($\tilde{f}_\rho$): Lipschitz-continuous gradient; convergence $t \approx O(1/\varepsilon)$.
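
     A common concrete instance of this smoothing is the entropic soft-min, which replaces a hard minimum over labels by a log-sum-exp. The sketch below (toy scores and rho values of our own choosing, not code from the talk) shows how the smoothed value approaches the non-smooth minimum as rho goes to zero.

```python
import numpy as np

# Entropy smoothing of a (concave, non-smooth) minimum over labels, a standard
# instance of Nesterov's technique; rho and the toy scores are illustrative choices.
def softmin(scores, rho):
    """Smooth approximation of min(scores): -rho * log(sum(exp(-scores / rho)))."""
    s = np.asarray(scores, dtype=float)
    m = s.min()                                   # shift for numerical stability
    return m - rho * np.log(np.exp(-(s - m) / rho).sum())

scores = [1.3, 0.4, 2.0]
for rho in (1.0, 0.1, 0.01):
    print(rho, softmin(scores, rho))              # approaches min(scores) = 0.4 as rho -> 0
```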

  8. Efficient Implementation of Nesterov's Method
     Basic scheme vs. our approach:
     – Stopping condition: worst-case number of steps → duality gap.
     – Smoothing selection: worst-case analysis → adaptive.
     – Lipschitz constant estimation (step-size selection): worst-case analysis → adaptive.

  9. Duality Gap and Stopping Condition
     $\min_x \max_y g(x, y) - \max_y \min_x g(x, y) \le \varepsilon$
     Dual decomposition approaches optimize the relaxed dual $\max_y \min_x g(x, y)$.
     Standard approach: estimate a non-relaxed primal (integer) solution.
     We estimate the relaxed primal $\min_x \max_y g(x, y)$ – difficult!
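
     The sketch below illustrates a gap-based stopping rule on a toy dual decomposition: plain sub-gradient ascent on the dual, with the primal bound taken from the best integer labeling seen so far (the "standard" estimate mentioned above, not the relaxed-primal estimate the talk advocates). The model, the split of the unaries, and the step sizes are our own illustrative choices.

```python
import itertools
import numpy as np

# Toy 3-node chain, 2 labels, decomposed into two single-edge subproblems.
n, K = 3, 2
unary = np.array([[0.0, 1.5], [0.7, 0.2], [1.0, 0.1]])
pair = np.array([[0.0, 1.0], [1.0, 0.0]])
labelings = list(itertools.product(range(K), repeat=n))

def full_energy(y):
    return sum(unary[v, y[v]] for v in range(n)) + pair[y[0], y[1]] + pair[y[1], y[2]]

def solve_sub(theta_u, edge):
    """Brute-force minimizer and value of one subproblem (fine for a toy model)."""
    u, v = edge
    val = lambda y: sum(theta_u[w, y[w]] for w in range(n)) + pair[y[u], y[v]]
    y_best = min(labelings, key=val)
    return y_best, val(y_best)

lam = np.zeros((n, K))      # dual variables: theta^1 = theta/2 + lam, theta^2 = theta/2 - lam
best_primal, eps = np.inf, 1e-3
for t in range(300):
    y1, d1 = solve_sub(unary / 2 + lam, (0, 1))
    y2, d2 = solve_sub(unary / 2 - lam, (1, 2))
    dual = d1 + d2                                   # lower bound on the MAP energy
    best_primal = min(best_primal, full_energy(y1), full_energy(y2))
    if best_primal - dual <= eps:                    # duality-gap stopping condition
        break
    g = np.zeros((n, K))                             # ascent direction for the concave dual
    for v in range(n):
        g[v, y1[v]] += 1.0
        g[v, y2[v]] -= 1.0
    lam += 0.5 / (1.0 + t) * g                       # diminishing-step sub-gradient ascent
print("iterations:", t + 1, "duality gap:", best_primal - dual)
```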

  10. Smoothing Selection
     Trade-off: strong smoothing gives fast optimization but low precision; weak smoothing gives slow optimization but high precision.

  11. Smoothing Selection
     Choose the smoothing so that the smoothing error $\delta$ matches the target precision: $\varepsilon = 2\delta$.
     Nesterov: worst-case estimate of $\delta$. Ours: adaptive estimate.
     (Plot: Tsukuba dataset, precision about 0.3%.)
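
     As a point of reference for the worst-case choice: with entropic smoothing over n terms the smoothing error is bounded by rho * ln(n), so setting rho = eps / (2 ln n) keeps it below delta = eps / 2. A tiny numeric check with toy scores follows; this is the conservative textbook bound that an adaptive rule improves on, not the authors' exact scheme.

```python
import numpy as np

# Worst-case smoothing choice: with entropy smoothing over n terms the approximation
# error is at most rho*log(n), so rho = eps/(2*log(n)) keeps it below delta = eps/2.
scores = np.array([1.3, 0.4, 2.0, 0.9])   # toy values, small enough that no shift is needed
eps = 0.1
rho = eps / (2 * np.log(len(scores)))
softmin = -rho * np.log(np.exp(-scores / rho).sum())
print("smoothing error:", scores.min() - softmin, "<= delta =", eps / 2)
```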

  12. Lipschitz Constant (Step-Size) Estimation
     Gradient step: $x = y + \frac{1}{L} \nabla f(y)$.
     Nesterov: worst-case estimate of $L$. Ours: adaptive estimate of $L$, without violating the theory!
     (Plot: Tsukuba dataset, precision about 3%.)
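
     A standard way to estimate L adaptively is backtracking: start from an optimistic L and double it whenever the quadratic model with that L fails to bound the objective at the trial point. The sketch below is written for minimization of a toy quadratic (the talk's dual maximization is the mirror image) and uses the generic FISTA-style rule, not necessarily the authors' exact scheme; the test function and constants are our own.

```python
import numpy as np

# Backtracking estimation of the Lipschitz constant L for a gradient step.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
F = lambda x: 0.5 * x @ A @ x - b @ x          # smooth convex quadratic, true L = lambda_max(A)
gradF = lambda x: A @ x - b

x, L = np.zeros(2), 0.1                         # start with an optimistic (too small) L
for t in range(50):
    g = gradF(x)
    while True:
        x_new = x - g / L                       # candidate gradient step with the current L
        # Accept if the quadratic model with constant L upper-bounds F at x_new.
        if F(x_new) <= F(x) + g @ (x_new - x) + 0.5 * L * np.dot(x_new - x, x_new - x):
            break
        L *= 2.0                                # estimate was too small: increase it
    x = x_new
print("estimated L:", L, "true L:", np.linalg.eigvalsh(A).max(), "minimizer:", x)
```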

  13. Comparison to Other Approaches
     (Plots: random synthetic 20x20 grid model with 5 labels, and the Tsukuba dataset.)

  14. Summary
     Contribution:
     – Improved convergence estimate: $O(1/\varepsilon)$ vs. $O(1/\varepsilon^2)$.
     – Sound stopping condition: $\min_x \max_y g(x, y) - \max_y \min_x g(x, y) \le \varepsilon$.
     – Fine-grained parallelization properties.
     – Applicable to arbitrary graphs and arbitrary potentials.
     Future work:
     – Examine the primal-dual viewpoint – EMMCVPR 2011.
     – Application in structured prediction and learning.

  15. Comparison to V. Jojic, S. Gould, and D. Koller, "Accelerated dual decomposition ...", 2010.
     (Plots: primal LP solution and primal integer solution; synthetic 20x20 grid, 5 labels.)
