The How of (Sub)Gradient


  1. The How of (Sub)Gradient. Note: the subdifferential is an intersection of infinitely many half-spaces and is therefore convex and closed. August 31, 2018, 82 / 402

  2. The How of (Sub)Gradient. Note: the subdifferential is an intersection of infinitely many half-spaces and is therefore a closed convex set even if f is NOT convex.

  3. First peek into subgradient calculus: function convexity first. The following functions are convex, but may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability? Pointwise maximum: if f_1, f_2, ..., f_m are convex, then f(x) = max{f_1(x), f_2(x), ..., f_m(x)} is convex. (In Quiz 1, Problem 1, m = 2 with f_1 = ∥x∥_1 and f_2 = ∥x∥_∞.)

  4. First peek into subgradient calculus: function convexity first. Pointwise maximum: if f_1, f_2, ..., f_m are convex, then f(x) = max{f_1(x), f_2(x), ..., f_m(x)} is also convex. For example, the sum of the r largest components of x ∈ ℜⁿ, f(x) = x_[1] + x_[2] + ... + x_[r], where x_[i] is the i-th largest component of x, is a convex function. Proof: either from first principles (invoking the convexity of f_1, ..., f_m), or by inspecting the intersection of the epigraphs of f_1, ..., f_m. Will our proof of convexity hold for an infinite (possibly even uncountable) number of indices i (which ranged over the finite set 1, ..., m above)? Answer: yes!
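The "sum of the r largest components" example can be checked numerically: it equals a pointwise maximum of (n choose r) linear functions, and therefore satisfies the convexity inequality. A minimal sketch in Python (function names are illustrative, not from the slides):

```python
import itertools
import random

def sum_r_largest(x, r):
    """f(x) = x_[1] + ... + x_[r], the sum of the r largest components of x."""
    return sum(sorted(x, reverse=True)[:r])

def as_pointwise_max(x, r):
    """The same f written as a max of (n choose r) linear functions x -> sum_{i in S} x_i."""
    return max(sum(x[i] for i in S)
               for S in itertools.combinations(range(len(x)), r))

random.seed(0)
n, r = 6, 3
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(-5, 5) for _ in range(n)]
    t = random.random()
    # f is a pointwise max of linear (hence convex) functions ...
    assert abs(sum_r_largest(x, r) - as_pointwise_max(x, r)) < 1e-9
    # ... and therefore satisfies f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y)
    z = [t * a + (1 - t) * b for a, b in zip(x, y)]
    assert sum_r_largest(z, r) <= t * sum_r_largest(x, r) + (1 - t) * sum_r_largest(y, r) + 1e-9
```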

  5. First peek into subgradient calculus: function convexity first. Pointwise supremum: if f(x, y) is convex in x for every y ∈ S, then g(x) = sup_{y ∈ S} f(x, y) is convex, by a proof similar to that on the board: S is a set of a possibly infinite number of indices, so the RHS has sup over y instead of max over i, and similarly the LHS has sup over y instead of max over i.

  6. First peek into subgradient calculus: function convexity first. Pointwise supremum: if f(x, y) is convex in x for every y ∈ S, then g(x) = sup_{y ∈ S} f(x, y) is convex. For example, the function that returns the maximum eigenvalue of a symmetric matrix X, viz. λ_max(X) = sup_{∥y∥_2 = 1} ∥Xy∥_2, is a convex function obtained as a supremum, over the infinite number of y with ∥y∥_2 = 1, of the function ∥Xy∥_2.

  7. First peek into subgradient calculus: function convexity first. Pointwise supremum: λ_max(X) = sup_{∥y∥_2 = 1} ∥Xy∥_2 is a convex function of the symmetric matrix X. If X is symmetric, the maximum eigenvalue of XᵀX is the square of the largest-magnitude eigenvalue of X.
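The λ_max example can be sanity-checked numerically. A sketch under one added assumption: X is taken symmetric positive semidefinite, so that the supremum of ∥Xy∥_2 over unit vectors coincides with the largest eigenvalue (for a general symmetric X it gives the largest-magnitude eigenvalue). The sampling-based `sup_norm_ratio` is an illustrative approximation, not part of the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

def sup_norm_ratio(X, n_samples=20000):
    """Approximate sup_{||y||_2 = 1} ||X y||_2 by sampling random unit vectors."""
    Y = rng.normal(size=(n_samples, X.shape[0]))
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)
    return np.max(np.linalg.norm(Y @ X.T, axis=1))

# Symmetric PSD X: here the supremum equals the largest eigenvalue.
A = rng.normal(size=(4, 4))
X1 = A @ A.T
lam1 = np.max(np.linalg.eigvalsh(X1))
approx = sup_norm_ratio(X1)
assert approx <= lam1 + 1e-9          # the sampled sup never exceeds lam_max
assert approx >= 0.85 * lam1          # and gets close to it with enough samples

# Convexity in X: lam_max(t*X1 + (1-t)*X2) <= t*lam_max(X1) + (1-t)*lam_max(X2)
B = rng.normal(size=(4, 4))
X2 = B @ B.T
lam2 = np.max(np.linalg.eigvalsh(X2))
for t in np.linspace(0.0, 1.0, 11):
    lhs = np.max(np.linalg.eigvalsh(t * X1 + (1 - t) * X2))
    assert lhs <= t * lam1 + (1 - t) * lam2 + 1e-9
```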

  8. Basic subgradient calculus: illustration for pointwise maximum. Finite pointwise maximum: if f(x) = max_{i=1...m} f_i(x), then ∂f(x) is: the subdifferential of f_i(x) at points x where f(x) = f_i(x) for a unique i (that is, at points with a unique/unambiguous maximizer, the subdifferential of f is the subdifferential of that unique maximizer); and, in general, the convex hull of the subdifferentials of f_i(x) over all i such that f(x) = f_i(x), which includes their union.

  9. Basic subgradient calculus: illustration for pointwise maximum. Finite pointwise maximum: if f(x) = max_{i=1...m} f_i(x), then ∂f(x) = conv( ∪_{i : f_i(x) = f(x)} ∂f_i(x) ), which is the convex hull of the union of the subdifferentials of all active functions at x. General pointwise maximum: if f(x) = max_{s ∈ S} f_s(x), then, under some regularity conditions (on S, f_s), ∂f(x) is the closure of the convex hull of the union of the subdifferentials; the closure is an additional operation that ensures the subdifferential is closed.

  10. Basic subgradient calculus: illustration for pointwise maximum. Finite pointwise maximum: if f(x) = max_{i=1...m} f_i(x), then ∂f(x) = conv( ∪_{i : f_i(x) = f(x)} ∂f_i(x) ). General pointwise maximum: if f(x) = max_{s ∈ S} f_s(x), then, under some regularity conditions (on S, f_s), ∂f(x) = cl conv( ∪_{s : f_s(x) = f(x)} ∂f_s(x) ).
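The finite pointwise-maximum rule can be illustrated in one dimension, where each ∂f_i(x) is a single slope and the convex hull of the active slopes is just an interval. A minimal sketch (the affine pieces and function names are illustrative); with pieces -x and x it reproduces f(x) = |x|, whose subdifferential at 0 is the interval [-1, 1]:

```python
import random

# 1-D illustration: f(x) = max_i f_i(x) with affine pieces f_i(x) = a_i*x + b_i.
# Here f(x) = max(-x, x) = |x|.
pieces = [(-1.0, 0.0), (1.0, 0.0)]          # (slope a_i, intercept b_i)

def f(x):
    return max(a * x + b for a, b in pieces)

def subdifferential(x, tol=1e-9):
    """Convex hull of the slopes of active pieces; in 1-D, the interval [lo, hi]."""
    fx = f(x)
    active = [a for a, b in pieces if abs(a * x + b - fx) <= tol]
    return min(active), max(active)

assert subdifferential(2.0) == (1.0, 1.0)    # unique maximizer: ordinary gradient
assert subdifferential(0.0) == (-1.0, 1.0)   # kink: whole interval [-1, 1]

# Every g in the subdifferential satisfies f(z) >= f(x) + g*(z - x) for all z.
random.seed(0)
for _ in range(1000):
    x = random.uniform(-2, 2)
    lo, hi = subdifferential(x)
    g = random.uniform(lo, hi)
    z = random.uniform(-2, 2)
    assert f(z) >= f(x) + g * (z - x) - 1e-9
```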

  11. Subgradient of ∥x∥_1. Assume x ∈ ℜⁿ. Then ∥x∥_1 is the maximum over 2ⁿ functions, each of the form sᵀx with s ∈ {−1, +1}ⁿ.

  12. Subgradient of ∥x∥_1. Assume x ∈ ℜⁿ. Then ∥x∥_1 = max_{s ∈ {−1,+1}ⁿ} xᵀs, which is a pointwise maximum of 2ⁿ functions. Let S* ⊆ {−1, +1}ⁿ be the set of s such that for each s ∈ S*, the value of xᵀs is the same maximum value. Thus ∂∥x∥_1 = conv( ∪_{s ∈ S*} {s} ).
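This pointwise-maximum view of ∥x∥_1 can be verified directly for small n by enumerating all 2ⁿ sign vectors, and any maximizing sign vector s ∈ S* is then a subgradient. A sketch in Python (function names are illustrative):

```python
import itertools
import random

def l1(x):
    return sum(abs(xi) for xi in x)

def l1_via_max(x):
    """||x||_1 = max over all 2^n sign vectors s in {-1,+1}^n of x^T s."""
    return max(sum(si * xi for si, xi in zip(s, x))
               for s in itertools.product((-1, 1), repeat=len(x)))

def l1_subgradient(x):
    """One maximizing sign vector s* (a subgradient); at x_i = 0 either sign works."""
    return [1 if xi >= 0 else -1 for xi in x]

random.seed(0)
for _ in range(200):
    # include exact zeros so points of non-differentiability are exercised
    x = [random.choice([0.0, random.uniform(-3, 3)]) for _ in range(5)]
    assert abs(l1(x) - l1_via_max(x)) < 1e-9
    # subgradient inequality: ||z||_1 >= ||x||_1 + g^T (z - x) for all z
    g = l1_subgradient(x)
    z = [random.uniform(-3, 3) for _ in range(5)]
    assert l1(z) >= l1(x) + sum(gi * (zi - xi) for gi, zi, xi in zip(g, z, x)) - 1e-9
```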

  13. More subgradient calculus: function convexity first. The following functions are again convex but, again, may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability? Nonnegative weighted sum: f = Σ_{i=1}^n α_i f_i is convex if each f_i, 1 ≤ i ≤ n, is convex and α_i ≥ 0, 1 ≤ i ≤ n. Composition with an affine function: f(Ax + b) is convex if f is convex. For example: the log barrier for linear inequalities, f(x) = −Σ_{i=1}^m log(b_i − a_iᵀx), is convex since −log(x) is convex; and any norm of an affine function, f(x) = ∥Ax + b∥, is convex. If A is m × n, then f(·) is defined on ℜᵐ whereas f(Ax + b), as a function of x, is defined on ℜⁿ.
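The affine-composition rule can be checked numerically for the norm example h(x) = ∥Ax + b∥_2 with a 4 × 3 matrix A (so h maps ℜ³ to ℜ). It also illustrates the affine chain rule for subgradients: if g ∈ ∂f(Ax + b), then Aᵀg ∈ ∂h(x); away from Ax + b = 0 the 2-norm is differentiable with gradient u/∥u∥_2. A sketch (names and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))          # A is 4 x 3, so h(x) = ||Ax + b||_2 maps R^3 -> R
b = rng.normal(size=4)

def h(x):
    return np.linalg.norm(A @ x + b)

# Convexity of the affine composition: h(t*x + (1-t)*y) <= t*h(x) + (1-t)*h(y)
for _ in range(500):
    x, y = rng.normal(size=3), rng.normal(size=3)
    t = rng.random()
    assert h(t * x + (1 - t) * y) <= t * h(x) + (1 - t) * h(y) + 1e-9

# Affine chain rule: with u = Ax + b != 0, the 2-norm has gradient u/||u||_2,
# and g = A^T (u/||u||_2) is a subgradient of h at x.
x = rng.normal(size=3)
u = A @ x + b
g = A.T @ (u / np.linalg.norm(u))
for _ in range(500):
    z = rng.normal(size=3)
    assert h(z) >= h(x) + g @ (z - x) - 1e-9
```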
