Optimization for Machine Learning
Lecture 4: Quasi-Newton Methods
S.V.N. (vishy) Vishwanathan
Purdue University
vishy@purdue.edu
July 11, 2012
S.V.N. Vishwanathan (Purdue University) Optimization for Machine Learning 1 / 28
Classical Quasi-Newton Algorithms
Non-smooth Problems
BFGS with Subgradients
Experiments
[Figure: plot with axes x and y]
[Figure: two panels, each plotting Objective Value against CPU Seconds]
[Figure: two panels. Panel 1: BFGS Quadratic Model and Piecewise Linear Function. Panel 2: Gradient of BFGS Model and Piecewise Constant Gradient.]
[Figure: Objective Value against CPU Seconds]
[Figure: Objective Value against CPU Seconds]