PLACETO: LEARNING GENERALIZABLE DEVICE PLACEMENT ALGORITHMS FOR DISTRIBUTED MACHINE LEARNING


Sep 11, 2022


SLIDE 1

PLACETO: LEARNING GENERALIZABLE DEVICE PLACEMENT ALGORITHMS FOR DISTRIBUTED MACHINE LEARNING

Ravichandra Addanki, Shaileshh Bojja Venkatakrishnan, Shreyan Gupta, Hongzi Mao, Mohammad Alizadeh
Presented by: Obodoekwe Nnaemeka

SLIDE 2

Problem

Distributed training (GPU and CPU)
Human experts?
Reinforcement learning?

SLIDE 3

Problem

Sometimes tolerable.
Solutions do not generalize.
The optimization is done for a single graph.
Single computational graph vs. class of computational graphs

SLIDE 4

Placeto

Efficiency

Sequence of iterative placement improvements

Generalizability

A neural network architecture that uses a graph embedding to encode the structure of the computation graph in the placement policy.
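To make the graph embedding idea concrete, here is a minimal sketch of one possible message-passing scheme. It is not Placeto's actual network: the function name `embed_nodes`, the mean aggregation, and the number of rounds are all assumptions chosen for illustration. The point it shows is that the same function applies to graphs of any size, which is what lets a policy built on the embeddings be reused on graphs it has not seen.

```python
import numpy as np

def embed_nodes(features, edges, rounds=2):
    """Toy message passing (illustrative, not the paper's architecture).

    features: (num_nodes, dim) array of per-node features.
    edges:    list of (src, dst) pairs of the computation graph.
    Each round, every node adds the mean of its parents' current
    embeddings to its own embedding.
    """
    n = features.shape[0]
    parents = [[] for _ in range(n)]
    for src, dst in edges:
        parents[dst].append(src)

    h = features.astype(float)
    for _ in range(rounds):
        new_h = h.copy()
        for v in range(n):
            if parents[v]:
                # Aggregate information flowing in from upstream ops.
                new_h[v] = h[v] + np.mean(h[parents[v]], axis=0)
        h = new_h
    return h

# A 3-node chain 0 -> 1 -> 2; only node 0 starts with a non-zero feature.
chain = embed_nodes(np.array([[1.0], [0.0], [0.0]]), [(0, 1), (1, 2)])
```

After two rounds, information from node 0 has reached node 2, so each embedding reflects the node's upstream neighborhood rather than only its own features.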

SLIDE 5

Learning method

■ Markov Decision Process
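The MDP view of iterative placement improvement can be sketched as follows. This is a toy under stated assumptions, not the paper's formulation: the cost model `toy_runtime`, the fixed node visiting order, and the reward defined as the reduction in estimated runtime are all hypothetical choices made for illustration. The state is the current placement plus the node being visited, and the action is the device chosen for that node.

```python
import random

def toy_runtime(placement, edges, cost, comm=1.0):
    """Hypothetical cost model: the busiest device's total compute
    plus a fixed penalty for every edge that crosses devices."""
    load = {}
    for node, dev in enumerate(placement):
        load[dev] = load.get(dev, 0.0) + cost[node]
    cross = sum(1 for u, v in edges if placement[u] != placement[v])
    return max(load.values()) + comm * cross

def run_episode(policy, edges, cost, num_devices):
    """One pass over the nodes; each step re-places a single node."""
    placement = [0] * len(cost)           # initial state: all ops on device 0
    runtime = toy_runtime(placement, edges, cost)
    rewards = []
    for node in range(len(cost)):         # state: (placement, current node)
        placement[node] = policy(placement, node, num_devices)  # action
        new_runtime = toy_runtime(placement, edges, cost)
        rewards.append(runtime - new_runtime)  # assumed reward: runtime reduction
        runtime = new_runtime
    return placement, rewards, runtime

def random_policy(placement, node, num_devices):
    """Placeholder policy; a learned policy network would go here."""
    return random.randrange(num_devices)

placement, rewards, final_runtime = run_episode(
    random_policy, edges=[(0, 1), (1, 2), (2, 3)], cost=[1, 1, 1, 1], num_devices=2
)
```

With this reward definition the per-step rewards telescope, so their sum equals the initial runtime minus the final runtime, and maximizing the return is the same as minimizing the final placement's runtime.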

SLIDE 6

POLICY NETWORK ARCHITECTURE

SLIDE 7

GRAPH EMBEDDING

SLIDE 8

Training Details

Colocation
Simulator

SLIDE 9

Experimentation

Deep learning models (Inception-V3, NASNet, NMT)
Synthetic data (cifar10, ptb, nmt)
Baselines: Single GPU, Scotch, Human Expert, RNN-based approach

SLIDE 10

Result

Performance
Generalizability

SLIDE 11

PLACETO VS RNN

SLIDE 12

GENERALIZABILITY

SLIDE 13

GENERALIZABILITY

SLIDE 14

Placeto deep dive

Node traversal order

Alternative architectures
Simple aggregator
Simple partitioner

SLIDE 15

Critique

+ First attempt to generalize device placement using a graph embedding network
+ Really impressive performance

- Only optimizes placement decisions
- It shows generalization to unseen graphs, but they are generated artificially by architecture search for a single learning task and dataset.
- How does the framework handle failure?
- The evaluation protocol needs to be more explicit.