Michael Spece Departments of Machine Learning and Statistics - - PowerPoint PPT Presentation




SLIDE 1

Generalization Martingale Bounds Ongoing Work

Generalization for Streaming Data

Michael Spece

Departments of Machine Learning and Statistics Carnegie Mellon University

June 11, 2015

SLIDE 2

Learning Game/Decision Theoretic Setup

Fix $T \in \mathbb{Z}_+$.
Environment generates $T$ observations $y := y_1, \dots, y_T$.
Learner estimates $\hat{x}(y)$.

SLIDE 3

Definition of Generalization

Generalization error (a measure of overfitting):

\[
\mathbb{E}_{y}\left[ \frac{1}{T} \sum_{t=1}^{T} \ell(\hat{x}(y), y_t) - \mathbb{E}_{y_0}\, \ell(\hat{x}(y), y_0) \right]
\]
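As a rough illustration of this definition, the sketch below estimates the generalization error by Monte Carlo in a toy setting. Everything specific here is an assumption made for the example and does not come from the talk: the observations are i.i.d. standard Gaussian, the learner is the sample mean, and the loss is squared error. The names `squared_loss`, `sample_mean`, and `generalization_error` are hypothetical.

```python
import random


def squared_loss(x, y):
    """Assumed loss for this sketch: squared error."""
    return (x - y) ** 2


def sample_mean(ys):
    """Assumed learner for this sketch: the sample mean of the observations."""
    return sum(ys) / len(ys)


def generalization_error(T, n_trials=20000, seed=0):
    """Monte Carlo estimate of E_y[(1/T) sum_t loss(x_hat, y_t) - E_{y0} loss(x_hat, y0)].

    Assumes y_1, ..., y_T and y_0 are i.i.d. N(0, 1).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        ys = [rng.gauss(0.0, 1.0) for _ in range(T)]
        x_hat = sample_mean(ys)
        # Empirical risk on the same data the estimate was fit to.
        empirical = sum(squared_loss(x_hat, y) for y in ys) / T
        # Inner expectation in closed form: E_{y0}(x - y0)^2 = x^2 + 1 for y0 ~ N(0, 1).
        expected = x_hat ** 2 + 1.0
        total += empirical - expected
    return total / n_trials
```

With the sign convention of the definition (empirical risk minus expected loss), overfitting shows up as a negative value. In this toy setting the exact value is $-2/T$, so `generalization_error(10)` should land near $-0.2$.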

SLIDE 4

Online Learning Refinement (Online to Batch Conversion)

A specific way of computing the estimate (compute it online):

Fix $T \in \mathbb{Z}_+$.
For $t \in \{1, \dots, T\}$:
    Environment generates $y_t$.
    Learner “instantaneously” estimates $\hat{x}'_t(y_1, \dots, y_t)$.
Learner estimates $\hat{x} := \hat{x}'$.

SLIDE 5

Void for Generalizing from Streaming Data

Drawbacks of the batch perspective for streaming data:

The final estimate is not equal to the last sequential estimate.
The empirical risk is not equal to the actual loss suffered under sequential estimation.
Given the definition of generalization error, the batch perspective restricts the notion of cumulative loss to the mean.

SLIDE 6

Solution

Fix $T \in \mathbb{Z}_+$.
Environment generates a single observation $y := (y_1, \dots, y_T)$.
Learner estimates $\hat{x}(y)$.
Generalization error becomes

\[
\mathbb{E}_{y}\left[ \ell(\hat{x}(y), y) - \mathbb{E}_{y_0}\, \ell(\hat{x}(y), y_0) \right]
\]

SLIDE 7

Online Learning Refinement

Compute the estimate online:

Fix $T \in \mathbb{Z}_+$.
Environment generates $y$.
For $t \in \{1, \dots, T\}$:
    Learner “instantaneously” estimates $\hat{x}_t(y_1, \dots, y_t)$.
Learner estimates $\hat{x} := (\hat{x}_1, \dots, \hat{x}_T)$.
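The protocol above can be sketched as a short loop in which the estimate at step t sees only the observations up to step t, and the final estimate is the whole sequence of per-step estimates. The running-mean learner and the names `run_online` and `running_mean_update` are assumptions for illustration only; the talk does not fix a particular learner.

```python
def run_online(ys, update, init_state):
    """Run an online learner over the stream ys.

    At step t the learner sees only y_1, ..., y_t (through its state) and
    emits x_hat_t. The returned estimate is the tuple (x_hat_1, ..., x_hat_T).
    """
    state = init_state
    estimates = []
    for y_t in ys:
        state, x_hat_t = update(state, y_t)
        estimates.append(x_hat_t)
    return tuple(estimates)


def running_mean_update(state, y_t):
    """Hypothetical learner: running mean of the observations seen so far."""
    count, total = state
    count += 1
    total += y_t
    return (count, total), total / count


# Usage: the estimate is vector-valued and preserves the order of estimation.
x_hat = run_online([1.0, 3.0, 5.0], running_mean_update, (0, 0.0))
print(x_hat)  # (1.0, 2.0, 3.0)
```

Returning the full tuple rather than a single final value is what distinguishes this refinement from the batch conversion: the ordering of the sequential estimates is kept.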

SLIDE 8

Summary

Generalization error is a measure of overfitting.
Applied to streaming data (one vector-valued observation, vector-valued estimation), generalization error becomes

\[
\mathbb{E}_{y}\left[ \ell(\hat{x}(y), y) - \mathbb{E}_{y_0}\, \ell(\hat{x}(y), y_0) \right]
\]

Features                             Expanded Applications
Preserves ordering of estimations    Dynamic models
Single loss                          Non-convex cumulative losses
Minimal assumptions                  Non-stationary data

SLIDE 9

Bounding

Given an online learning algorithm, one can attempt to show that the algorithm generalizes by bounding its generalization error.
Certain functional forms and regularity conditions entail a martingale bound.
Example functional form [1]:

\[
\ell(\hat{x}(y), y) = B\bigl( \ell'_1(\hat{x}_1, y_2), \dots, \ell'_{T-1}(\hat{x}_{T-1}, y_T) \bigr)
\]

Example regularity conditions: $B$ nonnegative, subadditive, and (for better rates) smooth.

[1] This form appears in Rakhlin et al. 2010.
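To make the example functional form concrete, the sketch below evaluates a cumulative loss in which each estimate $\hat{x}_t$ is scored against the next observation $y_{t+1}$ and the per-step losses are combined by an aggregator $B$. The squared per-step loss, the use of one per-step loss for every $t$, and the choices `sum` and `max` for $B$ are assumptions for illustration (both are nonnegative and subadditive on nonnegative inputs); `cumulative_loss` is a hypothetical name.

```python
def cumulative_loss(x_hats, ys, per_step_loss, B):
    """Evaluate B(l'_1(x_hat_1, y_2), ..., l'_{T-1}(x_hat_{T-1}, y_T)).

    x_hats and ys are length-T sequences. With zero-based indexing,
    x_hats[t] is scored against ys[t + 1], giving T - 1 terms.
    """
    T = len(ys)
    if len(x_hats) != T:
        raise ValueError("x_hats and ys must have the same length")
    terms = [per_step_loss(x_hats[t], ys[t + 1]) for t in range(T - 1)]
    return B(terms)


def squared_loss(x, y):
    """Assumed per-step loss, the same for every t in this sketch."""
    return (x - y) ** 2


ys = [1.0, 3.0, 5.0]
x_hats = (1.0, 2.0, 3.0)  # e.g. running means of ys
print(cumulative_loss(x_hats, ys, squared_loss, sum))  # 13.0
print(cumulative_loss(x_hats, ys, squared_loss, max))  # 9.0
```

Swapping `sum` for `max` changes the notion of cumulative loss without touching the learner, which is the flexibility a single loss on the whole sequence is meant to provide.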

SLIDE 10

Implication

The martingale bound is in the form of a supremum of the norms of martingale difference sequences (MDSs).
Under regularity conditions, the supremum grows sublinearly in $T$, i.e., generalization holds.
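The simulation below gives some intuition for sublinear growth. It is not the bound from the talk: it only looks at the simplest bounded martingale difference sequence, i.i.d. random signs (an assumption made for this sketch), and measures the average absolute value of its partial sum. That quantity grows on the order of $\sqrt{T}$, so dividing by $T$ gives a ratio that shrinks as $T$ grows. The name `mean_abs_partial_sum` is hypothetical.

```python
import random


def mean_abs_partial_sum(T, n_trials=200, seed=0):
    """Average |d_1 + ... + d_T| over n_trials, where the d_t are i.i.d. +/-1 signs.

    Independent random signs form a bounded martingale difference sequence;
    this is an illustrative special case, not the general setting.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        s = 0.0
        for _ in range(T):
            s += rng.choice((-1.0, 1.0))
        total += abs(s)
    return total / n_trials


# Usage: the per-step ratio shrinks as T grows, i.e. growth is sublinear in T.
for T in (100, 400, 1600):
    m = mean_abs_partial_sum(T)
    print(T, round(m / T, 4))
```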

SLIDE 11

Generality to which results hold

More general results can simplify notation

SLIDE 12

Algorithmic Analysis

Generalization error can be computed for simulated data or, with additional assumptions, estimated from data.
Does generalization error help explain the improved performance of online forecasters?
