6. Approximation and fitting


1. Convex Optimization — Boyd & Vandenberghe

   6. Approximation and fitting

   • norm approximation
   • least-norm problems
   • regularized approximation
   • robust approximation

2. Norm approximation

   minimize $\|Ax - b\|$

   ($A \in \mathbf{R}^{m \times n}$ with $m \ge n$; $\|\cdot\|$ is a norm on $\mathbf{R}^m$)

   interpretations of solution $x^\star = \operatorname{argmin}_x \|Ax - b\|$:

   • geometric: $Ax^\star$ is the point in $\mathcal{R}(A)$ closest to $b$
   • estimation: linear measurement model $y = Ax + v$; $y$ are measurements, $x$ is unknown, $v$ is measurement error; given $y = b$, the best guess of $x$ is $x^\star$
   • optimal design: $x$ are design variables (input), $Ax$ is the result (output); $x^\star$ is the design that best approximates the desired result $b$

3. examples

   • least-squares approximation ($\|\cdot\|_2$): solution satisfies the normal equations $A^T A x = A^T b$ (so $x^\star = (A^T A)^{-1} A^T b$ if $\operatorname{rank} A = n$)
   • Chebyshev approximation ($\|\cdot\|_\infty$): can be solved as an LP

     minimize $t$
     subject to $-t\mathbf{1} \preceq Ax - b \preceq t\mathbf{1}$

   • sum of absolute residuals approximation ($\|\cdot\|_1$): can be solved as an LP

     minimize $\mathbf{1}^T y$
     subject to $-y \preceq Ax - b \preceq y$
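A minimal sketch of all three fits in CVXPY, assuming random stand-in data for $A$ and $b$ (the slides give no concrete data, and CVXPY itself is an assumed tool, not part of the slides); CVXPY performs the LP reductions above internally:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 30                     # m >= n, as on slide 2
A = rng.standard_normal((m, n))    # stand-in data
b = rng.standard_normal(m)

# least squares: closed form from the normal equations A^T A x = A^T b
x_ls = np.linalg.solve(A.T @ A, A.T @ b)

# Chebyshev (p = inf) and sum of absolute residuals (p = 1)
x = cp.Variable(n)
for p in [np.inf, 1]:
    cp.Problem(cp.Minimize(cp.norm(A @ x - b, p))).solve()
    print(f"p = {p}: objective {cp.norm(A @ x - b, p).value:.3f}")
```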

4. Penalty function approximation

   minimize $\phi(r_1) + \cdots + \phi(r_m)$
   subject to $r = Ax - b$

   ($A \in \mathbf{R}^{m \times n}$; $\phi : \mathbf{R} \to \mathbf{R}$ is a convex penalty function)

   examples

   • quadratic: $\phi(u) = u^2$
   • deadzone-linear with width $a$: $\phi(u) = \max\{0, |u| - a\}$
   • log-barrier with limit $a$:

     $\phi(u) = \begin{cases} -a^2 \log(1 - (u/a)^2) & |u| < a \\ \infty & \text{otherwise} \end{cases}$

   [figure: the quadratic, deadzone-linear, and log-barrier penalties plotted for $-1.5 \le u \le 1.5$]
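A sketch of the deadzone-linear case, assuming random stand-in data and an illustrative width $a = 0.5$; the other penalties are handled the same way by swapping the expression for $\phi$:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n, a = 100, 30, 0.5             # a is an illustrative width
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x = cp.Variable(n)
r = A @ x - b
# phi(u) = max{0, |u| - a}, applied elementwise and summed over residuals
cp.Problem(cp.Minimize(cp.sum(cp.pos(cp.abs(r) - a)))).solve()
```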

5. example ($m = 100$, $n = 30$): histogram of residuals for the penalties $\phi(u) = |u|$, $\phi(u) = u^2$, $\phi(u) = \max\{0, |u| - a\}$, $\phi(u) = -\log(1 - u^2)$

   [figure: four residual histograms over $-2 \le r \le 2$, labeled $p = 1$, $p = 2$, Deadzone, Log barrier]

   the shape of the penalty function has a large effect on the distribution of the residuals

6. Huber penalty function (with parameter $M$)

   $\phi_{\mathrm{hub}}(u) = \begin{cases} u^2 & |u| \le M \\ M(2|u| - M) & |u| > M \end{cases}$

   linear growth for large $u$ makes the approximation less sensitive to outliers

   [figure, left: Huber penalty for $M = 1$; right: affine function $f(t) = \alpha + \beta t$ fitted to 42 points $(t_i, y_i)$ (circles) using the quadratic (dashed) and Huber (solid) penalties]
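A sketch of the robust line fit from the right-hand figure; the 42 data points are not given on the slide, so synthetic points with injected outliers stand in for them:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(-10, 10, 42)                 # stand-in abscissas
y = 1.0 + 0.5 * t + 0.5 * rng.standard_normal(42)
y[:4] += 10.0                                # inject a few outliers

alpha, beta = cp.Variable(), cp.Variable()
r = alpha + beta * t - y
# Huber fit with M = 1; compare with a plain least-squares fit,
# which the outliers would pull much further off the true line
cp.Problem(cp.Minimize(cp.sum(cp.huber(r, M=1)))).solve()
print(alpha.value, beta.value)
```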

7. Least-norm problems

   minimize $\|x\|$
   subject to $Ax = b$

   ($A \in \mathbf{R}^{m \times n}$ with $m \le n$; $\|\cdot\|$ is a norm on $\mathbf{R}^n$)

   interpretations of solution $x^\star = \operatorname{argmin}_{Ax = b} \|x\|$:

   • geometric: $x^\star$ is the point in the affine set $\{x \mid Ax = b\}$ with minimum distance to $0$
   • estimation: $b = Ax$ are (perfect) measurements of $x$; $x^\star$ is the smallest ('most plausible') estimate consistent with the measurements
   • design: $x$ are design variables (inputs); $b$ are required results (outputs); $x^\star$ is the smallest ('most efficient') design that satisfies the requirements

8. examples

   • least-squares solution of linear equations ($\|\cdot\|_2$): can be solved via the optimality conditions

     $2x + A^T \nu = 0, \qquad Ax = b$

   • minimum sum of absolute values ($\|\cdot\|_1$): can be solved as an LP

     minimize $\mathbf{1}^T y$
     subject to $-y \preceq x \preceq y$, $\quad Ax = b$

     tends to produce a sparse solution $x^\star$

   extension: least-penalty problem

   minimize $\phi(x_1) + \cdots + \phi(x_n)$
   subject to $Ax = b$

   ($\phi : \mathbf{R} \to \mathbf{R}$ is a convex penalty function)
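A sketch of both least-norm examples, assuming random underdetermined stand-in data; eliminating $\nu$ from the optimality conditions above gives the closed form $x^\star = A^T (A A^T)^{-1} b$:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 100                              # m <= n: underdetermined
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n)              # guarantees Ax = b is feasible

# ell_2: closed form from the optimality conditions
x_l2 = A.T @ np.linalg.solve(A @ A.T, b)

# ell_1: the LP above; tends to give a sparse solution
x = cp.Variable(n)
cp.Problem(cp.Minimize(cp.norm1(x)), [A @ x == b]).solve()
print("nonzeros:", int(np.sum(np.abs(x.value) > 1e-6)), "of", n)
```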

9. Regularized approximation

   minimize (w.r.t. $\mathbf{R}^2_+$) $\;(\|Ax - b\|, \|x\|)$

   ($A \in \mathbf{R}^{m \times n}$; the norms on $\mathbf{R}^m$ and $\mathbf{R}^n$ can be different)

   interpretation: find a good approximation $Ax \approx b$ with small $x$

   • estimation: linear measurement model $y = Ax + v$, with prior knowledge that $\|x\|$ is small
   • optimal design: small $x$ is cheaper or more efficient, or the linear model $y = Ax$ is only valid for small $x$
   • robust approximation: a good approximation $Ax \approx b$ with small $x$ is less sensitive to errors in $A$ than one with large $x$

10. Scalarized problem

   minimize $\|Ax - b\| + \gamma \|x\|$

   • solution for $\gamma > 0$ traces out the optimal trade-off curve
   • other common method: minimize $\|Ax - b\|^2 + \delta \|x\|^2$ with $\delta > 0$

   Tikhonov regularization

   minimize $\|Ax - b\|_2^2 + \delta \|x\|_2^2$

   can be solved as a least-squares problem

   minimize $\left\| \begin{bmatrix} A \\ \sqrt{\delta}\, I \end{bmatrix} x - \begin{bmatrix} b \\ 0 \end{bmatrix} \right\|_2^2$

   with solution $x^\star = (A^T A + \delta I)^{-1} A^T b$
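A sketch checking the stacked least-squares formulation against the closed form; $A$, $b$, and $\delta$ are illustrative stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, delta = 100, 30, 0.1
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# stack [A; sqrt(delta) I] and [b; 0], then solve ordinary least squares
A_stacked = np.vstack([A, np.sqrt(delta) * np.eye(n)])
b_stacked = np.concatenate([b, np.zeros(n)])
x_star, *_ = np.linalg.lstsq(A_stacked, b_stacked, rcond=None)

# agrees with the closed form (A^T A + delta I)^{-1} A^T b
x_closed = np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ b)
assert np.allclose(x_star, x_closed)
```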

11. Optimal input design

   linear dynamical system with impulse response $h$:

   $y(t) = \sum_{\tau=0}^{t} h(\tau)\, u(t - \tau), \qquad t = 0, 1, \ldots, N$

   input design problem: multicriterion problem with 3 objectives

   1. tracking error with desired output $y_{\mathrm{des}}$: $\;J_{\mathrm{track}} = \sum_{t=0}^{N} (y(t) - y_{\mathrm{des}}(t))^2$
   2. input magnitude: $\;J_{\mathrm{mag}} = \sum_{t=0}^{N} u(t)^2$
   3. input variation: $\;J_{\mathrm{der}} = \sum_{t=0}^{N-1} (u(t+1) - u(t))^2$

   track the desired output using a small and slowly varying input signal

   regularized least-squares formulation

   minimize $J_{\mathrm{track}} + \delta J_{\mathrm{der}} + \eta J_{\mathrm{mag}}$

   for fixed $\delta, \eta$, a least-squares problem in $u(0), \ldots, u(N)$
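A sketch assembling the three objectives into one stacked least-squares problem; the impulse response $h(t) = 0.9^t$, the square-wave $y_{\mathrm{des}}$, and the weights are assumptions, since the slide's actual data are not given:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
h = 0.9 ** np.arange(N + 1)                  # assumed impulse response
y_des = np.sign(np.sin(np.arange(N + 1) / 20.0))  # assumed desired output
delta, eta = 0.1, 0.01                       # assumed trade-off weights

# y = H u with H lower-triangular Toeplitz: H[t, s] = h(t - s) for t >= s
H = np.array([[h[t - s] if t >= s else 0.0 for s in range(N + 1)]
              for t in range(N + 1)])
D = np.diff(np.eye(N + 1), axis=0)           # rows compute u(t+1) - u(t)

# minimize ||H u - y_des||^2 + delta ||D u||^2 + eta ||u||^2
K = np.vstack([H, np.sqrt(delta) * D, np.sqrt(eta) * np.eye(N + 1)])
rhs = np.concatenate([y_des, np.zeros(N), np.zeros(N + 1)])
u, *_ = np.linalg.lstsq(K, rhs, rcond=None)
```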

12. example: 3 solutions on the optimal trade-off surface

   (top) $\delta = 0$, small $\eta$; (middle) $\delta = 0$, larger $\eta$; (bottom) large $\delta$

   [figure: three pairs of plots of the input $u(t)$ and output $y(t)$ for $0 \le t \le 200$]

13. Signal reconstruction

   minimize (w.r.t. $\mathbf{R}^2_+$) $\;(\|\hat{x} - x_{\mathrm{cor}}\|_2, \; \phi(\hat{x}))$

   • $x \in \mathbf{R}^n$ is the unknown signal
   • $x_{\mathrm{cor}} = x + v$ is the (known) corrupted version of $x$, with additive noise $v$
   • variable $\hat{x}$ (reconstructed signal) is our estimate of $x$
   • $\phi : \mathbf{R}^n \to \mathbf{R}$ is the regularization function or smoothing objective

   examples: quadratic smoothing, total variation smoothing:

   $\phi_{\mathrm{quad}}(\hat{x}) = \sum_{i=1}^{n-1} (\hat{x}_{i+1} - \hat{x}_i)^2, \qquad \phi_{\mathrm{tv}}(\hat{x}) = \sum_{i=1}^{n-1} |\hat{x}_{i+1} - \hat{x}_i|$
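A sketch of both smoothers via scalarization (computing one point on each trade-off curve); the piecewise-constant test signal, noise level, and weight $\lambda$ are illustrative assumptions:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x_true = np.sign(np.sin(np.arange(n) / 150.0))   # piecewise-constant signal
x_cor = x_true + 0.2 * rng.standard_normal(n)    # corrupted version

for name, phi in [("quad", lambda z: cp.sum_squares(cp.diff(z))),
                  ("tv",   lambda z: cp.norm1(cp.diff(z)))]:
    xhat = cp.Variable(n)
    lam = 10.0                                   # one point on the curve
    cp.Problem(cp.Minimize(cp.sum_squares(xhat - x_cor)
                           + lam * phi(xhat))).solve()
    print(name, float(cp.norm(xhat - x_cor, 2).value))
```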

14. quadratic smoothing example

   [figure, left: original signal $x$ and noisy signal $x_{\mathrm{cor}}$; right: three solutions on the trade-off curve $\|\hat{x} - x_{\mathrm{cor}}\|_2$ versus $\phi_{\mathrm{quad}}(\hat{x})$]

15. total variation reconstruction example

   [figure, left: original signal $x$ and noisy signal $x_{\mathrm{cor}}$; right: three solutions on the trade-off curve $\|\hat{x} - x_{\mathrm{cor}}\|_2$ versus $\phi_{\mathrm{quad}}(\hat{x})$]

   quadratic smoothing smooths out both the noise and the sharp transitions in the signal

16. [figure, left: original signal $x$ and noisy signal $x_{\mathrm{cor}}$; right: three solutions on the trade-off curve $\|\hat{x} - x_{\mathrm{cor}}\|_2$ versus $\phi_{\mathrm{tv}}(\hat{x})$]

   total variation smoothing preserves the sharp transitions in the signal

17. Robust approximation

   minimize $\|Ax - b\|$ with uncertain $A$

   two approaches:

   • stochastic: assume $A$ is random, minimize $\mathbf{E}\, \|Ax - b\|$
   • worst-case: set $\mathcal{A}$ of possible values of $A$, minimize $\sup_{A \in \mathcal{A}} \|Ax - b\|$

   tractable only in special cases (certain norms $\|\cdot\|$, distributions, sets $\mathcal{A}$)

   example: $A(u) = A_0 + u A_1$

   • $x_{\mathrm{nom}}$ minimizes $\|A_0 x - b\|_2^2$
   • $x_{\mathrm{stoch}}$ minimizes $\mathbf{E}\, \|A(u) x - b\|_2^2$ with $u$ uniform on $[-1, 1]$
   • $x_{\mathrm{wc}}$ minimizes $\sup_{-1 \le u \le 1} \|A(u) x - b\|_2^2$

   [figure: $r(u) = \|A(u) x - b\|_2$ versus $u$ for the three solutions]
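A sketch of the three estimates with random stand-in data. Two facts (not stated on the slide but straightforward to verify) make the latter two problems tractable here: $\|A(u)x - b\|_2$ is convex in $u$, so its supremum over $[-1, 1]$ is attained at $u = \pm 1$; and for $u$ uniform on $[-1, 1]$, $\mathbf{E}\,\|A(u)x - b\|_2^2 = \|A_0 x - b\|_2^2 + \tfrac{1}{3}\|A_1 x\|_2^2$ since $\mathbf{E}\, u = 0$ and $\mathbf{E}\, u^2 = 1/3$:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 10
A0 = rng.standard_normal((m, n))   # stand-in data
A1 = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = cp.Variable(n)

# nominal: ignore the uncertainty
cp.Problem(cp.Minimize(cp.sum_squares(A0 @ x - b))).solve()

# stochastic: expected squared residual for u uniform on [-1, 1]
cp.Problem(cp.Minimize(cp.sum_squares(A0 @ x - b)
                       + cp.sum_squares(A1 @ x) / 3)).solve()

# worst case: supremum over [-1, 1] attained at the endpoints u = +/-1
cp.Problem(cp.Minimize(cp.maximum(cp.norm((A0 + A1) @ x - b, 2),
                                  cp.norm((A0 - A1) @ x - b, 2)))).solve()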

18. stochastic robust LS

   with $A = \bar{A} + U$, $U$ random, $\mathbf{E}\, U = 0$, $\mathbf{E}\, U^T U = P$:

   minimize $\mathbf{E}\, \|(\bar{A} + U)x - b\|_2^2$

   • explicit expression for the objective (the cross term vanishes because $\mathbf{E}\, U = 0$):

     $\mathbf{E}\, \|Ax - b\|_2^2 = \mathbf{E}\, \|\bar{A}x - b + Ux\|_2^2 = \|\bar{A}x - b\|_2^2 + \mathbf{E}\, x^T U^T U x = \|\bar{A}x - b\|_2^2 + x^T P x$

   • hence, the robust LS problem is equivalent to the LS problem

     minimize $\|\bar{A}x - b\|_2^2 + \|P^{1/2} x\|_2^2$

   • for $P = \delta I$, we get the Tikhonov regularized problem

     minimize $\|\bar{A}x - b\|_2^2 + \delta \|x\|_2^2$
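A numerical sketch of this equivalence, with an assumed diagonal $P$ and random stand-in data; the reduced problem is an ordinary regularized LS with a closed-form solution:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 30
Abar = rng.standard_normal((m, n))           # stand-in mean matrix
b = rng.standard_normal(m)
P = np.diag(rng.uniform(0.1, 1.0, n))        # assumed PSD second moment

# minimize ||Abar x - b||^2 + x^T P x  =>  (Abar^T Abar + P) x = Abar^T b
x_star = np.linalg.solve(Abar.T @ Abar + P, Abar.T @ b)

# for P = delta*I this is exactly the Tikhonov solution from slide 10
delta = 0.5
x_tik = np.linalg.solve(Abar.T @ Abar + delta * np.eye(n), Abar.T @ b)
```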
