Anticipating Concept Drift in Online Learning (Michał Dereziński)


Jul 09, 2023


SLIDE 1

Anticipating Concept Drift in Online Learning

Michał Dereziński (speaker), Badri Narayan Bhaskar

Online setting: predict with $\hat\theta_t \in \Theta$, get loss $f_t(\hat\theta_t)$.

Tracking regret: compare losses to a good sequence $\theta_t$:

$$R_T(\theta) = \sum_{t=1}^{T} \big( f_t(\hat\theta_t) - f_t(\theta_t) \big)$$

(Incurred regret) ∝ (Variability of $\theta_t$)

What if the drift trajectory of $\theta_t$ can be anticipated? Use a window of past predictions $\hat\theta_{t-k}^{t}$ to get a drift estimate.
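As a rough illustration of the tracking regret above, the sketch below sums per-round loss differences between a learner and a comparator sequence. The squared loss, the drifting targets, and the names `tracking_regret`, `targets`, and `lagging` are assumptions made for this example, not part of the slides.

```python
def tracking_regret(losses, predictions, comparators):
    """Sum over rounds of f_t(prediction_t) - f_t(comparator_t)."""
    return sum(f(p) - f(c) for f, p, c in zip(losses, predictions, comparators))


# Hypothetical setup: squared loss around a target that drifts by 1 per round.
targets = [float(t) for t in range(5)]
losses = [lambda x, c=c: (x - c) ** 2 for c in targets]

# A learner that is always one step behind the comparator.
lagging = [c - 1.0 for c in targets]

print(tracking_regret(losses, lagging, targets))  # 5.0
```

Each round contributes a loss gap of 1, so a learner that lags a steadily drifting comparator pays regret that grows linearly with the number of rounds.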

SLIDE 2

Linear Drift Model

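[Figure: GD, AMGD, and Comparator positions at time steps 1 to 7]

Drift estimation: $\hat\Phi(\theta; \hat\theta_{t-k}^{t})$

AMGD: $\hat\theta_{t+1} = \hat\theta_t - \eta_t \nabla f_t(\hat\theta_t) + \frac{1}{k}(\hat\theta_t - \hat\theta_{t-k})$

Two-Track: one track without drift, $\hat\theta_t^{(1)}$, and one with drift, $\hat\theta_t^{(2)}$.

Final prediction: $\hat\theta_t = (1 - w_t)\,\hat\theta_t^{(1)} + w_t\,\hat\theta_t^{(2)}$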
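A minimal sketch of the AMGD-style update and the Two-Track combination, under stated assumptions: a scalar parameter, a squared loss around a linearly drifting target, and a drift term equal to the average per-step displacement over the last k predictions, (θ_t − θ_{t−k})/k. The function names, step size, and window length are hypothetical choices for illustration. The slides do not say how the weight w_t is chosen, so here it is simply supplied by the caller.

```python
def grad_sq(x, target):
    """Gradient of the squared loss f(x) = (x - target)^2."""
    return 2.0 * (x - target)


def run(T=200, k=5, eta=0.25, slope=0.5, use_drift=True):
    """Track target_t = slope * t; return predictions for rounds 0..T."""
    preds = [0.0]
    for t in range(T):
        x = preds[t]
        nxt = x - eta * grad_sq(x, slope * t)  # plain gradient step
        if use_drift and t >= k:
            # Assumed drift term: average displacement over the last k rounds.
            nxt += (x - preds[t - k]) / k
        preds.append(nxt)
    return preds


def two_track(theta_no_drift, theta_drift, w):
    """Convex combination of the two tracks; w in [0, 1] comes from the caller."""
    return (1.0 - w) * theta_no_drift + w * theta_drift


T, slope = 200, 0.5
gd = run(T=T, slope=slope, use_drift=False)
amgd = run(T=T, slope=slope, use_drift=True)
gd_err = gd[T] - slope * T
amgd_err = amgd[T] - slope * T
print(f"GD error: {gd_err:.3f}, AMGD error: {amgd_err:.3f}")
print("Two-Track prediction with w = 0.5:", two_track(gd[T], amgd[T], 0.5))
```

With these particular settings, plain GD settles about one unit behind the drifting target, while the drift term shrinks the gap toward zero. Other step sizes or window lengths may behave differently, including unstably.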

SLIDE 3

Simulations: Trajectories

[Figure: trajectories over time for Comparator, GD, TTND, TTMP, and AMGD]

SLIDE 4

Simulations: Losses

[Figure: loss (log scale) versus time for GD, TTND, TTMP, and AMGD]

SLIDE 5

Analysis

Regret bounds for Two-Track: $O\big(\sqrt{T}\,(1 + V_{\hat\Phi}(\theta))\big)$, where $V_{\hat\Phi}(\theta) = \sum_{t=1}^{T} \|\theta_{t+1} - \hat\Phi_t(\theta_t)\|$.

How good is the drift estimation $\hat\Phi$? We show bounds for $V_{\hat\Phi}(\theta)$ against the optimal $\Phi^*$.

Can we prove the linear convergence observed in simulations? We prove convergence for a special case of AMGD.
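To make the variability term concrete, the sketch below measures how far each comparator point lands from where a drift model predicted it, summed over consecutive pairs. It assumes scalar parameters and a single fixed drift map rather than a time-dependent one, and the names `variability` and `thetas` are hypothetical.

```python
def variability(comparators, phi):
    """Sum of |theta_{t+1} - phi(theta_t)| over consecutive comparator pairs."""
    return sum(abs(nxt - phi(cur)) for cur, nxt in zip(comparators, comparators[1:]))


# Hypothetical comparator sequence drifting linearly by 0.5 per round.
thetas = [0.5 * t for t in range(11)]

print(variability(thetas, lambda th: th))        # 5.0 (model assumes no drift)
print(variability(thetas, lambda th: th + 0.5))  # 0.0 (model matches the drift)
```

A drift model that matches the comparator's motion drives the variability to zero, which is the case where a bound of this form is tightest.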

SLIDE 6

Many Open Questions

• Full regret and convergence analysis.
• Generalization to nonlinear drift models.
• How to select the learning rate η?
• What if we use time-stamps instead of index t?
• How to best avoid instability in AMGD?