
DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression



  1. DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression. Hanlin Tang, Xiangru Lian, Chen Yu, Tong Zhang, Ji Liu. Presenter: Xiangru Lian

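  2. Compressed SGD (existing algorithms). Diagram: Worker 1, Worker 2 and Worker 3 send their gradients $g^{(1)}$, $g^{(2)}$, $g^{(3)}$ to the Server, which sends the averaged gradient $\bar{g}$ back to each worker. Update rule: $x_{t+1} = x_t - \frac{\gamma}{n} \sum_{i=1}^{n} C_\omega[g^{(i)}]$. Compression operator $C_\omega$: 1-bit quantization, clipping, top-k sparsification.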
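The update rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the helper names (`top_k`, `one_bit`, `compressed_sgd_step`) are hypothetical, and scaling the 1-bit quantizer by the mean magnitude is an assumed variant that the slide does not specify.

```python
from typing import Callable, List

Vector = List[float]


def top_k(g: Vector, k: int) -> Vector:
    """Top-k sparsification: keep the k largest-magnitude entries, zero the rest."""
    order = sorted(range(len(g)), key=lambda i: abs(g[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(g)]


def one_bit(g: Vector) -> Vector:
    """1-bit quantization: keep only the sign of each entry.

    Scaling by the mean magnitude is an assumption made for this sketch.
    """
    scale = sum(abs(v) for v in g) / len(g)
    return [scale if v >= 0 else -scale for v in g]


def compressed_sgd_step(
    x: Vector,
    grads: List[Vector],
    gamma: float,
    compress: Callable[[Vector], Vector],
) -> Vector:
    """One step of x <- x - (gamma / n) * sum_i C[g_i], with no error compensation."""
    n = len(grads)
    compressed = [compress(g) for g in grads]
    avg = [sum(c[j] for c in compressed) / n for j in range(len(x))]
    return [xj - gamma * aj for xj, aj in zip(x, avg)]
```

Whatever the compressor discards here is simply lost, which is the error the next slide addresses.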

  3. Compressed SGD introduces error: 1.2 → 1; error = −0.2. We can do better by compensating this error: 1.2 → 1; error = −0.2. Next step: Next_Grad ← Next_Grad − error.
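The slide's scalar example can be traced over a few steps. The sketch below assumes rounding to the nearest integer as the compressor and uses the slide's sign convention (error = compressed value minus the value that was compressed); the function names are hypothetical.

```python
from typing import List


def send_naive(grads: List[float]) -> List[int]:
    """Compress each gradient independently; the rounding error is discarded."""
    return [round(g) for g in grads]


def send_compensated(grads: List[float]) -> List[int]:
    """Subtract the previous rounding error from the next gradient before compressing."""
    error = 0.0
    sent = []
    for g in grads:
        corrected = g - error      # Next_Grad <- Next_Grad - error
        c = round(corrected)       # compress (here: round to nearest integer)
        error = c - corrected      # e.g. 1 - 1.2 = -0.2 on the first step
        sent.append(c)
    return sent


grads = [1.2] * 5
naive_total = sum(send_naive(grads))              # every step sends 1
compensated_total = sum(send_compensated(grads))  # the lost 0.2 is carried forward
```

With five gradients of 1.2 the true total is 6. The naive sender transmits a total of 5, while the compensated sender's total differs from the true total only by the last residual error.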

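  4. DoubleSqueeze High Level: Compensating Error for Both Server and Workers.
     Worker $i$: $g^{(i)} \leftarrow \nabla F(x; \xi^{(i)})$, $v^{(i)} \leftarrow C_\omega[g^{(i)} + \delta^{(i)}]$, $\delta^{(i)} \leftarrow g^{(i)} + \delta^{(i)} - v^{(i)}$.
     Server: $\bar{g} \leftarrow \frac{1}{n} \sum_{i=1}^{n} v^{(i)}$, $\bar{v} \leftarrow C_\omega[\bar{g} + \bar{\delta}]$, $\bar{\delta} \leftarrow \bar{g} + \bar{\delta} - \bar{v}$.
     On All Workers (Model Update): $x \leftarrow x - \gamma \bar{v}$.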
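The three updates on this slide translate fairly directly into code. This is a single-process sketch under stated assumptions: gradients are passed in by the caller rather than sampled, top-k stands in for $C_\omega$, and the class and function names (`ErrorCompensator`, `doublesqueeze_step`) are hypothetical and not taken from the authors' implementation.

```python
from typing import Callable, List

Vector = List[float]


def top_k(g: Vector, k: int) -> Vector:
    """Stand-in compressor: keep the k largest-magnitude entries, zero the rest."""
    order = sorted(range(len(g)), key=lambda i: abs(g[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(g)]


class ErrorCompensator:
    """Holds a residual delta and applies v <- C[g + delta], delta <- g + delta - v."""

    def __init__(self, dim: int) -> None:
        self.delta: Vector = [0.0] * dim

    def squeeze(self, g: Vector, compress: Callable[[Vector], Vector]) -> Vector:
        corrected = [gj + dj for gj, dj in zip(g, self.delta)]
        v = compress(corrected)
        self.delta = [cj - vj for cj, vj in zip(corrected, v)]
        return v


def doublesqueeze_step(
    x: Vector,
    grads: List[Vector],
    workers: List[ErrorCompensator],
    server: ErrorCompensator,
    gamma: float,
    compress: Callable[[Vector], Vector],
) -> Vector:
    """One iteration: worker-side squeeze, server-side squeeze, then the model update."""
    n = len(grads)
    # First pass: each worker compresses its error-corrected gradient.
    sent = [w.squeeze(g, compress) for w, g in zip(workers, grads)]
    # Server averages the compressed gradients.
    g_bar = [sum(v[j] for v in sent) / n for j in range(len(x))]
    # Second pass: the server compresses the error-corrected average.
    v_bar = server.squeeze(g_bar, compress)
    # Every worker applies the same update x <- x - gamma * v_bar.
    return [xj - gamma * vj for xj, vj in zip(x, v_bar)]
```

The same `ErrorCompensator` class serves both roles because the worker and server updates have the same form; only the input differs (a local gradient versus the average of the workers' compressed gradients). Passing the identity function as `compress` reduces the step to plain parallel SGD.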

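  5. Convergence Rate. Assumptions: $f(x)$ is non-convex with $L$-Lipschitz gradient; $\mathbb{E}_{\xi \sim \mathcal{D}_i} \|\nabla F(x; \xi) - \nabla f_i(x)\|^2 \le \sigma^2$, $\forall i$, $\forall x$; $\|C_\omega[x] - x\|^2 \le \zeta^2$. $T$: total iterations.
     DoubleSqueeze: $\mathbb{E}\|\nabla f(x_T)\|^2 \lesssim \frac{1}{T} + \frac{\sigma}{\sqrt{nT}} + \frac{\zeta^{2/3}}{T^{2/3}}$.
     Compressed SGD: $\mathbb{E}\|\nabla f(x_T)\|^2 \lesssim \frac{1}{T} + \frac{\sigma}{\sqrt{nT}} + \zeta$.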
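To get a feel for the DoubleSqueeze bound, its three terms can be evaluated numerically. The sketch below reads the bound as $\frac{1}{T} + \frac{\sigma}{\sqrt{nT}} + \frac{\zeta^{2/3}}{T^{2/3}}$ and ignores the constants hidden by $\lesssim$; the function name and the values of $\sigma$ and $\zeta$ are illustrative assumptions, not figures from the talk.

```python
from typing import Dict


def doublesqueeze_rate_terms(T: int, n: int, sigma: float, zeta: float) -> Dict[str, float]:
    """Return the three terms of the bound, up to the constants hidden by the inequality."""
    return {
        "optimization": 1.0 / T,
        "stochastic": sigma / (n * T) ** 0.5,
        "compression": zeta ** (2.0 / 3.0) / T ** (2.0 / 3.0),
    }


# Illustrative values only: n = 8 workers, sigma = zeta = 1.
for T in (100, 10_000, 1_000_000):
    terms = doublesqueeze_rate_terms(T, n=8, sigma=1.0, zeta=1.0)
    print(T, terms)
```

Because $T^{-2/3}$ decays faster than $T^{-1/2}$, the compression term eventually falls below the stochastic term, so for large $T$ the bound is dominated by the same $\sigma/\sqrt{nT}$ term that appears without compression.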

  6. Experiments. ResNet-18 on CIFAR-10. 8 Nvidia 1080Ti GPUs. 1 GPU per worker.
     [Figure: two columns of plots, "1Bit Quantization" and "Top-k Sparsification". Top row, "Convergence Rate": Loss vs. epoch. Bottom row, "Per-Epoch Time": Seconds vs. Bandwidth (1/MB). Curves in the 1-bit column: Vanilla SGD, DoubleSqueeze, MEM-SGD, QSGD. Curves in the top-k column: Vanilla SGD, DoubleSqueeze, MEM-SGD, Top-k SGD.]

  7. Thanks. Welcome to Pacific Ballroom #99 to see the poster for more details.
