DoubleSqueeze:
Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression
Hanlin Tang, Xiangru Lian, Chen Yu, Tong Zhang, Ji Liu
Presenter: Xiangru Lian
Compressed SGD (existing algorithms)
[Diagram: a central parameter server exchanging gradients with Workers 1, 2, and 3]
$$x_{t+1} = x_t - \frac{\gamma}{n} \sum_{i=1}^{n} C_\omega\left[g^{(i)}\right]$$
Compression operator $C_\omega$: 1-bit quantization, clipping, top-k sparsification.
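As a concrete illustration (not the authors' implementation), here is a minimal NumPy sketch of two common choices of $C_\omega$, 1-bit quantization and top-k sparsification, together with the compressed-SGD update above. The function names and the mean-magnitude scaling in the 1-bit quantizer are illustrative assumptions, and gradients are treated as flat 1-D vectors.

```python
import numpy as np

def onebit_quantize(x):
    """1-bit quantization: keep only the sign of each entry, rescaled by the
    mean absolute value so the compressed vector has comparable magnitude."""
    return np.mean(np.abs(x)) * np.sign(x)

def topk_sparsify(x, k):
    """Top-k sparsification: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def compressed_sgd_step(x, worker_grads, lr, compress):
    """Plain compressed SGD: x_{t+1} = x_t - (gamma/n) * sum_i C_w[g^(i)]."""
    return x - lr * np.mean([compress(g) for g in worker_grads], axis=0)
```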
Compressed SGD introduces compression error. We can do better by compensating for this error:
[Diagram: the error from compressing the current gradient is carried over and compensated when the next gradient is compressed]
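The single-sided version of this idea (error feedback on one worker) fits in a few lines. A minimal sketch, assuming `grad_fn` returns a stochastic gradient $\nabla F(x; \xi)$ and `compress` is any compression operator $C_\omega$; both names are illustrative:

```python
import numpy as np

def error_compensated_sgd(x, lr, num_steps, grad_fn, compress):
    """Error feedback on a single worker: whatever the compressor discards is
    stored in `delta` and added back to the next gradient before compressing."""
    delta = np.zeros_like(x)
    for _ in range(num_steps):
        g = grad_fn(x)              # stochastic gradient at the current model
        v = compress(g + delta)     # compress the error-compensated gradient
        delta = (g + delta) - v     # residual: what compression threw away
        x = x - lr * v              # update with the compressed vector
    return x
```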
High Level: Compensating Error for Both Server and Workers
Worker $i$:
$$g^{(i)} \leftarrow \nabla F(x; \xi^{(i)}), \quad v^{(i)} \leftarrow C_\omega\left[g^{(i)} + \delta^{(i)}\right], \quad \delta^{(i)} \leftarrow g^{(i)} + \delta^{(i)} - v^{(i)}$$
Server:
$$\bar{g} \leftarrow \frac{1}{n} \sum_{i=1}^{n} v^{(i)}, \quad \bar{v} \leftarrow C_\omega\left[\bar{g} + \bar{\delta}\right], \quad \bar{\delta} \leftarrow \bar{g} + \bar{\delta} - \bar{v}$$
On all workers (model update):
$$x \leftarrow x - \gamma \bar{v}$$
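Putting the two sides together, here is a minimal single-process NumPy sketch of one DoubleSqueeze iteration matching the update rules above. `worker_grad_fns` (one stochastic-gradient callable per worker) and `compress` are assumed inputs; a real implementation would run the worker loop in parallel and communicate $v^{(i)}$ and $\bar{v}$ over the network.

```python
import numpy as np

def doublesqueeze_step(x, lr, worker_grad_fns, worker_deltas, server_delta, compress):
    """One DoubleSqueeze iteration: error-compensated compression on every
    worker (uplink) and on the server (downlink), then a model update."""
    # Workers: compress (gradient + local error), keep the new residual.
    vs = []
    for i, grad_fn in enumerate(worker_grad_fns):
        g = grad_fn(x)                                  # g^(i) = grad F(x; xi^(i))
        v = compress(g + worker_deltas[i])              # v^(i) = C_w[g^(i) + delta^(i)]
        worker_deltas[i] = g + worker_deltas[i] - v     # delta^(i) update
        vs.append(v)

    # Server: average the received messages, then compress with its own error buffer.
    g_bar = np.mean(vs, axis=0)                         # g_bar = (1/n) sum_i v^(i)
    v_bar = compress(g_bar + server_delta)              # v_bar = C_w[g_bar + delta_bar]
    server_delta = g_bar + server_delta - v_bar         # delta_bar update

    # All workers apply the same compressed update.
    x = x - lr * v_bar
    return x, worker_deltas, server_delta
```

Compensating on the server as well means the compression error of the broadcast (downlink) message is corrected over time, not only the error in the workers' uplink messages.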
Assumptions:
Non-convex objective $f(x)$ with $L$-Lipschitz gradient;
$\mathbb{E}_{\xi \sim \mathcal{D}_i} \left\| \nabla F(x; \xi) - \nabla f_i(x) \right\|^2 \le \sigma^2, \ \forall i, \forall x$ (bounded gradient variance);
$\left\| C_\omega[x] - x \right\|^2 \le \zeta^2$ (bounded compression error).
T: Total Iterations
[Convergence-rate bound for DoubleSqueeze] vs. [convergence-rate bound for compressed SGD]
ResNet-18 on CIFAR-10. 8 Nvidia 1080Ti GPUs. 1 GPU per worker.
1-bit Quantization:
[Plot: training loss vs. epoch for vanilla SGD, DoubleSqueeze, MEM-SGD, and QSGD]
Top-k Sparsification:
[Plot: training loss vs. epoch for vanilla SGD, DoubleSqueeze, MEM-SGD, and Top-k SGD]
[Plots: per-epoch wall-clock time (seconds) vs. bandwidth (1/MB), comparing vanilla SGD, DoubleSqueeze, MEM-SGD, Top-k SGD, and QSGD; panels are arranged in two columns, convergence rate and per-epoch time]
Welcome to Pacific Ballroom, poster #99, for more details.