SLIDE 1 Incremental Randomized Sketching for Online Kernel Learning
Xiao Zhang    Shizhong Liao∗
College of Intelligence and Computing, Tianjin University
szliao@tju.edu.cn
June 13, 2019
Xiao Zhang Shizhong Liao (TJU) ICML 2019 June 13, 2019 1 / 11
SLIDE 2 Outline
1. Introduction
2. Main Results
3. Conclusion
SLIDE 3 Introduction
New Challenges of Online Kernel Learning
(1) High computational complexities
    - Per-round time complexity depending on T [Calandriello et al., 2017b]
    - Linear space complexity [Calandriello et al., 2017a]
(2) Lack of theoretical guarantees
    - No sublinear regret guarantee for randomized sketching [Wang et al., 2016]
    - No constant lower bounds on the budget/sketch size [Lu et al., 2016]
SLIDE 4 Introduction
Main Contribution
Table 1: Comparison with existing online kernel learning approaches (1st order: existing first-order approaches; 2nd order: existing second-order approaches)
                Computational complexities        Theoretical guarantees
                Time (per round)   Space          Budget/Sketch size   Regret
    1st order   Constant           Constant       Linear               Sublinear
    2nd order   Sublinear          Linear         Logarithmic          Sublinear
    Proposed    Constant           Constant       Constant             Sublinear
SLIDE 5 Main Results
Incremental Randomized Sketching Approach
Sequence of Instances → Incremental Randomized Sketching (Matrix Sketching + Sketch Updating) → Explicit Mapping → Gradient Descent (Hypothesis Updating) → Online Prediction
Figure 1: Novel incremental randomized sketching scheme for online kernel learning
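The pipeline in Figure 1 can be illustrated with a minimal sketch: an explicit (Nyström-style) feature map built from a small set of landmark points stands in for the paper's incremental randomized sketch, and online gradient descent on the hinge loss updates the hypothesis in the sketched feature space. The landmark set, kernel bandwidth, step size, and simulated stream below are all illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, gamma=0.1):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

m, d = 20, 3                                  # sketch size and input dimension
landmarks = rng.standard_normal((m, d))       # hypothetical fixed landmark set
W = rbf(landmarks, landmarks)
U, s, _ = np.linalg.svd(W)                    # W^{-1/2} via SVD (W is symmetric PSD)
W_inv_sqrt = U @ np.diag(1.0 / np.sqrt(np.maximum(s, 1e-12))) @ U.T

def feature_map(x):
    """Explicit map: phi(x) = W^{-1/2} k(landmarks, x)."""
    return W_inv_sqrt @ rbf(landmarks, x[None, :])[:, 0]

w = np.zeros(m)        # hypothesis in the sketched feature space
eta = 0.5              # learning rate
mistakes, T = 0, 200
for t in range(T):     # simulated stream: label = sign of the first coordinate
    x = rng.standard_normal(d)
    y = 1.0 if x[0] > 0 else -1.0
    phi = feature_map(x)
    y_hat = 1.0 if w @ phi >= 0 else -1.0     # online prediction
    if y_hat != y:
        mistakes += 1
    if y * (w @ phi) < 1.0:                   # hinge-loss gradient step
        w += eta * y * phi
print("mistake rate:", mistakes / T)
```

The key point of the scheme is that the hypothesis update touches only an m-dimensional vector, so the per-round cost stays constant in T.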
SLIDE 6 Main Results
Incremental Randomized Sketching Approach
[Diagram: at round t + 1, the sketches S_p^(t) and S_m^(t) are updated to S_p^(t+1) and S_m^(t+1) using the new rows s_p^(t+1) and s_m^(t+1); the sketched feature mappings Φ_pm^(t+1) and Φ_pp^(t+1), the intermediate matrices C_m^(t+1) and Δ^(t+1), and the hypothesis ψ^(t+1) are then updated incrementally as the kernel matrix grows from K^(t) to K^(t+1).]
Figure 2: The proposed incremental randomized sketching for kernel matrix approximation at round t + 1
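A plain (non-incremental) randomized sketch of the kernel matrix conveys the idea behind Figure 2: sample a column sketch C and the corresponding intersection block W, and reconstruct K ≈ C W† Cᵀ. The uniform column sampling, toy data, and bandwidth below are illustrative assumptions; the paper's contribution is updating such a sketch incrementally rather than recomputing it.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 100, 5, 15                      # data size, dimension, sketch size
X = rng.standard_normal((n, d))

def rbf(A, B, gamma=0.2):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf(X, X)                             # exact kernel matrix, n x n
idx = rng.choice(n, size=m, replace=False)
C = K[:, idx]                             # n x m column sketch
W = K[np.ix_(idx, idx)]                   # m x m intersection block
K_approx = C @ np.linalg.pinv(W) @ C.T    # Nystroem-style approximation

rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print("relative error:", rel_err)
```

Only the m sampled columns are ever stored, which is what makes constant space complexity possible once the sketch size is fixed.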
SLIDE 7 Main Results
Incremental Randomized Sketching Theory
Figure 3: The dependence structure of the theoretical results (diagram): Low-Rank Approximation Property, Inner Product Preserving Property, Matrix Product Preserving Property, Regret Bound
Product preserving properties: statistically unbiased.
Approximation property: (1 + ε)-relative error bound.
Regret bound: O(√T) regret with constant lower bounds on the sketch sizes.
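Written out in the slide's notation (with K̃ the sketched kernel approximation, K_k the best rank-k approximation of K, and ψ^(t) the hypothesis at round t), the two guarantees take roughly the following form; the choice of Frobenius norm and the comparator class here are assumptions, not taken from the talk:

```latex
% (1+\epsilon)-relative-error low-rank approximation of the kernel matrix
\|\mathbf{K} - \widetilde{\mathbf{K}}\|_{\mathrm{F}}
  \le (1+\epsilon)\,\|\mathbf{K} - \mathbf{K}_k\|_{\mathrm{F}}

% sublinear regret against the best fixed hypothesis in hindsight,
% achievable with sketch sizes bounded below only by constants
\mathrm{Reg}(T) \;=\; \sum_{t=1}^{T} \ell_t\bigl(\psi^{(t)}\bigr)
  \;-\; \min_{\psi} \sum_{t=1}^{T} \ell_t(\psi) \;=\; O\!\left(\sqrt{T}\right)
```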
SLIDE 8 Main Results
Experimental Results
Table 2: Comparison of online kernel learning algorithms in adversarial environments
    Algorithm            german-1                       german-2
                         Mistake rate       Time        Mistake rate      Time
    FOGD                 37.493 ± 0.724     0.140       32.433 ± 0.196    0.265
    NOGD                 30.918 ± 0.003     0.405       26.737 ± 0.002    0.778
    PROS-N-KONS          27.633 ± 0.416     33.984      17.737 ± 0.900    98.873
    SkeGD (θ = 0.1)      17.320 ± 0.136     0.329       7.865 ± 0.059     0.597
    SkeGD (θ = 0.01)     17.272 ± 0.112     0.402       7.407 ± 0.086     0.633
    SkeGD (θ = 0.005)    16.578 ± 0.360     0.484       7.266 ± 0.065     0.672
    SkeGD (θ = 0.001)    16.687 ± 0.155     1.183       6.835 ± 0.136     1.856
Our incremental randomized sketching achieves better learning performance in terms of both accuracy and efficiency, even in adversarial environments.
SLIDE 9 Conclusion
Novel incremental randomized sketching for online kernel learning that meets the new challenges:
(1) (1 + ε)-relative error bound.
(2) Sublinear regret bound under constant lower bounds on the sketch size.
(3) Constant per-round computational complexities.
A sketching scheme for both online and offline large-scale kernel learning.
SLIDE 10
Main References
[Calandriello et al., 2017a] Calandriello, D., Lazaric, A., and Valko, M. (2017a). Efficient second-order online kernel learning with adaptive embedding. In Advances in Neural Information Processing Systems 30, pages 6140–6150.
[Calandriello et al., 2017b] Calandriello, D., Lazaric, A., and Valko, M. (2017b). Second-order kernel online convex optimization with adaptive sketching. In Proceedings of the 34th International Conference on Machine Learning, pages 645–653.
[Lu et al., 2016] Lu, J., Hoi, S. C., Wang, J., Zhao, P., and Liu, Z. (2016). Large scale online kernel learning. Journal of Machine Learning Research, 17:1613–1655.
[Wang et al., 2016] Wang, S., Zhang, Z., and Zhang, T. (2016). Towards more efficient SPSD matrix approximation and CUR matrix decomposition. Journal of Machine Learning Research, 17:1–49.
SLIDE 11
Thank you!