SLIDE 1
Parallel Online Learning
Daniel Hsu, Nikos Karampatziakis, John Langford
University of Pennsylvania, Cornell University, Yahoo! Research, Rutgers University
Workshop on Learning on Cores, Clusters and Clouds
Online Learning
Learner gets the
SLIDE 2
SLIDE 3
Delay
◮ Parallelizing online learning leads to delay problems.
◮ Temporally correlated or adversarial examples.
◮ We investigate no delay and bounded delay schemes.
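To make the notion of bounded delay concrete, here is a minimal sketch, not the authors' implementation. It assumes a linear predictor trained with squared loss, and a hypothetical helper `delayed_sgd` in which each gradient is computed with the current weights but applied `delay` steps later. Setting `delay=0` recovers ordinary online gradient descent.

```python
from collections import deque

def delayed_sgd(examples, dim, lr=0.1, delay=0):
    """Online squared-loss gradient descent with a fixed update delay.

    Hypothetical illustration: the gradient from step t is computed with
    the weights available at step t, but only applied at step t + delay.
    """
    w = [0.0] * dim
    pending = deque()  # gradients computed but not yet applied
    for x, y in examples:
        # Predict with the (possibly stale) current weights.
        p = sum(wi * xi for wi, xi in zip(w, x))
        # Gradient of 0.5 * (p - y)^2 with respect to w.
        grad = [(p - y) * xi for xi in x]
        pending.append(grad)
        # Apply the oldest gradient once it has waited `delay` steps.
        if len(pending) > delay:
            g = pending.popleft()
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    # Gradients still pending when the stream ends are discarded.
    return w

if __name__ == "__main__":
    stream = [([1.0], 1.0)] * 3
    print(delayed_sgd(stream, dim=1, lr=0.5, delay=0))
    print(delayed_sgd(stream, dim=1, lr=0.5, delay=1))
```

With repeated (temporally correlated) examples, as in the toy stream above, the delayed learner keeps applying gradients that were computed before earlier corrections took effect, which is one way delay can hurt.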
SLIDE 4
Tree Architectures
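[Figure: tree architecture. Inputs x_{F1}, x_{F2}, x_{F3}, x_{F4}; first-level predictions ŷ_{1,1}, ŷ_{1,2}, ŷ_{1,3}, ŷ_{1,4}; second-level predictions ŷ_{2,1}, ŷ_{2,2}; final prediction ŷ.]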
SLIDE 5
Local Updates
Each node in the tree:
◮ Computes its prediction p_{i,j} based on its weights and inputs
◮ Sends ŷ_{i,j} = σ(p_{i,j}) to its parent¹
◮ Updates its weights based on ∇ℓ(p_{i,j}, y)
No delay.
Representation power: between Naive Bayes and a centralized linear model.
¹ The nonlinearity introduced by σ has an interesting effect.
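The three per-node steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: σ is taken to be the logistic sigmoid, ℓ is taken to be logistic loss with labels y in {0, 1} (so the derivative of ℓ with respect to p is σ(p) - y), and the names `Node` and `tree_step` are hypothetical. The tree here has a single level of leaves feeding one root.

```python
import math

def sigmoid(p):
    return 1.0 / (1.0 + math.exp(-p))

class Node:
    """One learner in the tree (hypothetical sketch)."""

    def __init__(self, n_inputs, lr=0.5):
        self.w = [0.0] * n_inputs
        self.lr = lr

    def step(self, inputs, y):
        # 1. Compute the prediction from local weights and inputs.
        p = sum(wi * xi for wi, xi in zip(self.w, inputs))
        # 2. This squashed value is what gets sent to the parent.
        y_hat = sigmoid(p)
        # 3. Local update: gradient of the assumed logistic loss w.r.t. p
        #    is (sigmoid(p) - y); no feedback from the parent is needed.
        g = y_hat - y
        self.w = [wi - self.lr * g * xi for wi, xi in zip(self.w, inputs)]
        return y_hat

def tree_step(leaves, root, feature_shards, y):
    """Each leaf sees only its own feature shard; the root sees leaf outputs."""
    leaf_outputs = [leaf.step(x_f, y) for leaf, x_f in zip(leaves, feature_shards)]
    return root.step(leaf_outputs, y)

if __name__ == "__main__":
    leaves = [Node(1), Node(1)]
    root = Node(2)
    for _ in range(3):
        print(tree_step(leaves, root, [[1.0], [1.0]], y=1.0))
```

Because every node updates from the true label immediately, no node waits on any other, which matches the "no delay" property noted on the slide.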
SLIDE 6
Global Updates
◮ Local update can help or hurt.
◮ Improved representation power by more communication.
◮ Delayed global training
◮ Delayed backprop