Distributed Deep Learning Using Hopsworks: CGI Trainee Program Workshop

SLIDE 1

Distributed Deep Learning Using Hopsworks

CGI Trainee Program Workshop
Kim Hammar, kim@logicalclocks.com

SLIDE 2

Before we start..

  • 1. Register for an account at: www.hops.site
  • 2. Follow the instructions at: http://bit.ly/2EnZQgW

SLIDE 3

DISTRIBUTED COMPUTING + DEEP LEARNING = ?

[Figure: Distributed Computing (cluster) + Deep Learning (neural network diagram)]

Why Combine the two?

[1] Chen Sun et al. "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era". In: CoRR abs/1707.02968 (2017). arXiv: 1707.02968. URL: http://arxiv.org/abs/1707.02968.
[2] Jeffrey Dean et al. "Large Scale Distributed Deep Networks". In: Advances in Neural Information Processing Systems 25. Ed. by F. Pereira et al. Curran Associates, Inc., 2012, pp. 1223-1231.

SLIDE 4

DISTRIBUTED COMPUTING + DEEP LEARNING = ?

[Figure: Distributed Computing (cluster) + Deep Learning (neural network diagram)]

Why Combine the two?

◮ We like challenging problems

[1] Chen Sun et al. "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era". In: CoRR abs/1707.02968 (2017). arXiv: 1707.02968. URL: http://arxiv.org/abs/1707.02968.
[2] Jeffrey Dean et al. "Large Scale Distributed Deep Networks". In: Advances in Neural Information Processing Systems 25. Ed. by F. Pereira et al. Curran Associates, Inc., 2012, pp. 1223-1231.

SLIDE 5

DISTRIBUTED COMPUTING + DEEP LEARNING = ?

[Figure: Distributed Computing (cluster) + Deep Learning (neural network diagram)]

Why Combine the two?

◮ We like challenging problems
◮ More productive data science
◮ Unreasonable effectiveness of data [1]
◮ To achieve state-of-the-art results [2]

[1] Chen Sun et al. "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era". In: CoRR abs/1707.02968 (2017). arXiv: 1707.02968. URL: http://arxiv.org/abs/1707.02968.
[2] Jeffrey Dean et al. "Large Scale Distributed Deep Networks". In: Advances in Neural Information Processing Systems 25. Ed. by F. Pereira et al. Curran Associates, Inc., 2012, pp. 1223-1231.

SLIDE 6

DISTRIBUTED DEEP LEARNING (DDL): PREDICTABLE SCALING [3]

[3] Jeff Dean. Building Intelligent Systems with Large Scale Deep Learning. https://www.scribd.com/document/355752799/Jeff-Dean-s-Lecture-for-YC-AI. 2018.

SLIDE 7

DISTRIBUTED DEEP LEARNING (DDL): PREDICTABLE SCALING

SLIDE 8

DDL IS NOT A SECRET ANYMORE [4]

[4] Tal Ben-Nun and Torsten Hoefler. "Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis". In: CoRR abs/1802.09941 (2018). arXiv: 1802.09941. URL: http://arxiv.org/abs/1802.09941.

SLIDE 9

DDL IS NOT A SECRET ANYMORE

Frameworks for DDL: TensorflowOnSpark, CaffeOnSpark, Distributed TF

[Figure: logos of companies using DDL]

SLIDE 10

DDL REQUIRES AN ENTIRE SOFTWARE/INFRASTRUCTURE STACK

[Figure: Distributed Training: executors e1..e4 exchanging gradients ∇]

Surrounding stack: Distributed Systems, Data Validation, Feature Engineering, Data Collection, Hardware Management, HyperParameter Tuning, Model Serving, Pipeline Management, A/B Testing, Monitoring

SLIDE 11

OUTLINE

  • 1. Hopsworks: Background of the platform
  • 2. Managed Distributed Deep Learning using HopsYARN, HopsML, PySpark, and Tensorflow
  • 3. Black-Box Optimization (Hyperparameter Tuning) using Hopsworks, Metadata Store, PySpark, and Maggy [5]
  • 4. Feature Store: data management for machine learning
  • 5. Coffee Break
  • 6. Demo: end-to-end ML pipeline
  • 7. Hands-on Workshop: try out Hopsworks on our cluster in Luleå

[5] Moritz Meister and Sina Sheikholeslami. Maggy. https://github.com/logicalclocks/maggy. 2019.

SLIDE 12

HOPSWORKS

SLIDE 13

HOPSWORKS

HopsFS

SLIDE 14

HOPSWORKS

HopsFS HopsYARN

(GPU/CPU as a resource)

SLIDE 15

HOPSWORKS

HopsFS HopsYARN

(GPU/CPU as a resource)

Frameworks

(ML/Data)

SLIDE 16

HOPSWORKS

HopsFS HopsYARN

(GPU/CPU as a resource)

Frameworks

(ML/Data)

ML/AI Assets: Feature Store, Pipelines, Experiments, Models

SLIDE 17

HOPSWORKS

HopsFS HopsYARN

(GPU/CPU as a resource)

Frameworks

(ML/Data)

ML/AI Assets: Feature Store, Pipelines, Experiments, Models

from hops import featurestore
from hops import experiment
features = featurestore.get_features([
    "average_attendance", "average_player_age"])
experiment.collective_all_reduce(features, model)

APIs

SLIDE 18

HOPSWORKS

HopsFS HopsYARN

(GPU/CPU as a resource)

Frameworks

(ML/Data)

Distributed Metadata

(Available from REST API)

ML/AI Assets: Feature Store, Pipelines, Experiments, Models

from hops import featurestore
from hops import experiment
features = featurestore.get_features([
    "average_attendance", "average_player_age"])
experiment.collective_all_reduce(features, model)

APIs

SLIDE 19

INNER AND OUTER LOOP OF LARGE SCALE DEEP LEARNING

Inner loop

[Figure: worker1, worker2, ..., workerN each train a replica of the network on their share of the data and synchronize gradients ∇1, ∇2, ..., ∇N]

SLIDE 20

INNER AND OUTER LOOP OF LARGE SCALE DEEP LEARNING

Outer loop: a Search Method proposes hyperparameters h and receives back a metric τ.

Inner loop:

[Figure: worker1, worker2, ..., workerN each train a replica of the network on their share of the data and synchronize gradients ∇1, ∇2, ..., ∇N]

SLIDE 21

INNER AND OUTER LOOP OF LARGE SCALE DEEP LEARNING

Outer loop: a Search Method proposes hyperparameters h and receives back a metric τ.

Inner loop:

[Figure: worker1, worker2, ..., workerN each train a replica of the network on their share of the data and synchronize gradients ∇1, ∇2, ..., ∇N]
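To make the relationship between the two loops concrete, here is a minimal, framework-free Python sketch; the RandomSearch class and the inner_loop stub are illustrative stand-ins, not Hopsworks APIs:

import random

class RandomSearch:
    """Illustrative outer-loop search method (not a Hopsworks API)."""
    def __init__(self):
        self.trials = []
    def suggest(self):
        # Propose hyperparameters h.
        return {"lr": 10 ** random.uniform(-4, -1),
                "num_layers": random.randint(2, 12)}
    def observe(self, h, tau):
        self.trials.append((tau, h))
    def best(self):
        return max(self.trials, key=lambda t: t[0])[1]

def inner_loop(h):
    # Stand-in for a distributed training run returning the metric τ.
    return random.random()

search = RandomSearch()
for _ in range(10):            # outer loop
    h = search.suggest()       # hyperparameters h
    tau = inner_loop(h)        # metric τ from the inner loop
    search.observe(h, tau)
print(search.best())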

SLIDE 22

INNER LOOP: DISTRIBUTED DEEP LEARNING

[Figure: training loop. Features (x1, ..., xn) → Model θ → Prediction ŷ → Loss L(y, ŷ) → Gradient ∇θ L(y, ŷ), fed back into the model]
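The loop in the figure is just gradient descent; a minimal NumPy sketch of one step for a linear model (illustrative, not Hopsworks code):

import numpy as np

np.random.seed(0)
x = np.random.randn(4, 3)                  # features, batch of 4 samples
y = np.random.randn(4)                     # targets
theta = np.zeros(3)                        # model parameters θ

y_hat = x @ theta                          # prediction ŷ
loss = np.mean((y - y_hat) ** 2)           # loss L(y, ŷ)
grad = -2 * x.T @ (y - y_hat) / len(y)     # gradient ∇θ L(y, ŷ)
theta -= 0.1 * grad                        # update step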

SLIDE 23

INNER LOOP: DISTRIBUTED DEEP LEARNING

[Figure: data-parallel training. Executors e1..e4 each hold a data partition p1..p4 and compute a gradient ∇ on it]
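Data parallelism splits the batch across executors; each computes a local gradient and the results are averaged (the all-reduce step). A small NumPy sketch, assuming equally sized partitions:

import numpy as np

np.random.seed(1)
x, y = np.random.randn(8, 3), np.random.randn(8)
theta = np.zeros(3)

# Each executor e_i computes a local gradient on its partition p_i.
partitions = np.array_split(np.arange(8), 4)
grads = [-2 * x[p].T @ (y[p] - x[p] @ theta) / len(p)
         for p in partitions]

# "All-reduce": average the local gradients; every worker then
# applies the same update, keeping the model replicas in sync.
theta -= 0.1 * np.mean(grads, axis=0)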

SLIDE 24

DISTRIBUTED DEEP LEARNING IN PRACTICE

◮ Implementation of distributed algorithms is becoming a commodity (TF, PyTorch, etc.)
◮ The hardest part of DDL is now:
  ◮ Cluster management
  ◮ Allocating GPUs
  ◮ Data management
  ◮ Operations & performance

[Figure: how do models, GPUs, data and distribution fit together?]

SLIDE 25

HOPSWORKS DDL SOLUTION

SLIDE 26

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

SLIDE 27

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

[Figure: the client API sends resource requests to the HopsYARN RM, which allocates YARN containers with GPUs as a resource]

SLIDE 28

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

[Figure: as above, now with a Spark driver in its own YARN container and Spark executors running inside the GPU containers]

SLIDE 29

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

[Figure: as above, with a conda environment shipped to the driver and to every executor container]

SLIDE 30

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

[Figure: as above; each executor reports its IP to the driver ("Here is my ip: 192.168.1.1" ... "192.168.1.4") so the training cluster can be formed]

SLIDE 31

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

[Figure: as above; the executors compute gradients ∇ while reading from and writing to the Hops Distributed File System (HopsFS)]

SLIDE 32

HOPSWORKS DDL SOLUTION

from hops import experiment
experiment.collective_all_reduce(train_fn)

[Figure: the full picture: Spark driver, GPU executors with conda envs, gradient exchange, and HopsFS underneath]

◮ Hide complexity behind a simple API (see the sketch below)
◮ Allocate resources using PySpark
◮ Allocate GPUs for Spark executors using HopsYARN
◮ Serve sharded training data to workers from HopsFS
◮ Use HopsFS for aggregating logs, checkpoints and results
◮ Store experiment metadata in the metastore
◮ Use dynamic allocation for interactive resource management
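All of the above hides behind one call; a minimal sketch of the usage, with the training body elided (the stub return value is only a placeholder):

from hops import experiment

def train_fn():
    # Build the dataset and model here (see the MNIST exercises
    # later in this deck), train, and return the tracked metric.
    return 0.0

# HopsML allocates GPU executors via HopsYARN, runs train_fn on each
# worker, and aggregates logs/checkpoints/results in HopsFS.
experiment.collective_all_reduce(train_fn)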

SLIDE 33

OUTER LOOP: BLACK BOX OPTIMIZATION

Outer loop: a Search Method proposes hyperparameters h and receives back a metric τ.

Inner loop:

[Figure: worker1, worker2, ..., workerN each train a replica of the network on their share of the data and synchronize gradients ∇1, ∇2, ..., ∇N]

SLIDE 34

OUTER LOOP: BLACK BOX OPTIMIZATION

Outer loop: a Search Method proposes hyperparameters h and receives back a metric τ.

Inner loop:

[Figure: worker1, worker2, ..., workerN each train a replica of the network on their share of the data and synchronize gradients ∇1, ∇2, ..., ∇N]

SLIDE 35

OUTER LOOP: BLACK BOX OPTIMIZATION

[Figure: training loop as before, now with Hyperparameters (η, num_layers, neurons) alongside the Features (x1, ..., xn): Features + Hyperparameters → Model θ → Prediction ŷ → Loss L(y, ŷ) → Gradient ∇θ L(y, ŷ)]

SLIDE 36

OUTER LOOP: BLACK BOX OPTIMIZATION

Example Use-Case from one of our clients:

◮ Goal: Train a One-Class GAN model for fraud detection
◮ Problem: GANs are extremely sensitive to hyperparameters, and the space of possible hyperparameters is very large.
◮ Example hyperparameters to tune: learning rates η, optimizers, layers, etc.

[Figure: GAN. Random Noise z → Generator network; the Discriminator network receives real input x or the generated sample and outputs a prediction ŷ]

SLIDE 37

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: 3D search space over neurons per layer (25-45), number of layers (2-12) and learning rate (0.00-0.10); each sampled point feeds the training loop: Features + Hyperparameters → Model θ → Prediction ŷ → Loss L(y, ŷ) → Gradient ∇θ L(y, ŷ)]

SLIDE 38

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: hyperparameter settings η1, ..., η5 sampled from the search space are placed in a Shared Task Queue and consumed by parallel workers w1..w4; each worker runs the training loop for its trial]

SLIDE 39

SLIDE 40

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: hyperparameter settings η1, ..., η5 sampled from the search space are placed in a Shared Task Queue and consumed by parallel workers w1..w4; each worker runs the training loop for its trial]

SLIDE 41

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: as before, trials η1, ..., η5 in a Shared Task Queue consumed by parallel workers w1..w4]

Which algorithm to use for search?

SLIDE 42

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: as before, trials η1, ..., η5 in a Shared Task Queue consumed by parallel workers w1..w4]

Which algorithm to use for search?
How to monitor progress?

SLIDE 43

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: as before, trials η1, ..., η5 in a Shared Task Queue consumed by parallel workers w1..w4]

Which algorithm to use for search?
How to monitor progress?
How to aggregate results?

SLIDE 44

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: as before, trials η1, ..., η5 in a Shared Task Queue consumed by parallel workers w1..w4]

Which algorithm to use for search?
How to monitor progress?
How to aggregate results?
Fault Tolerance?

SLIDE 45

OUTER LOOP: BLACK BOX OPTIMIZATION

Search Space

[Figure: as before, trials η1, ..., η5 in a Shared Task Queue consumed by parallel workers w1..w4]

Which algorithm to use for search?
How to monitor progress?
How to aggregate results?
Fault Tolerance?

This should be managed with platform support!

SLIDE 46

PARALLEL EXPERIMENTS

from hops import experiment
experiment.random_search(train_fn)
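In practice the search space has to be given to the API as well. A hedged sketch of what that can look like; the boundary_dict argument, its shape, and the samples parameter are assumptions based on hops-util-py examples, so check the hops documentation for the exact signature:

from hops import experiment

def train_fn(lr, num_layers):
    # Train with the suggested hyperparameters and return the
    # metric; the constant here is only a placeholder.
    return 0.0

# Assumed form: one [lower, upper] boundary per hyperparameter.
boundary_dict = {"lr": [1e-4, 1e-1], "num_layers": [2, 12]}
experiment.random_search(train_fn, boundary_dict, samples=20)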

SLIDE 47

ASYNCHRONOUS SEARCH WORKFLOW

[Figure: asynchronous search. A Coordinator keeps a Global Task Queue of suggested trials; a black-box optimizer (min_x f(x), x ∈ S) produces the suggestions. Each worker receives a trial's hyperparameters λ, trains a model, and reports the trial metric α back as a result. A progress plot tracks accuracy per epoch for the trials (e.g. lr=0.0021, layers=5)]

SLIDE 48

ASYNCHRONOUS SEARCH WORKFLOW

[Figure: as above; in addition to suggested tasks and results, the workers continuously send heartbeats to the coordinator]

SLIDE 49

ASYNCHRONOUS SEARCH WORKFLOW

[Figure: as above; based on the heartbeats and reported metrics, the coordinator can early-stop unpromising trials]

SLIDE 50

ASYNCHRONOUS SEARCH WORKFLOW

[Figure: as above; workers also write checkpoints, so stopped or failed trials can be resumed]
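A minimal, runnable sketch of the coordinator/worker pattern with a global task queue; heartbeats, early stopping and checkpointing are elided, and the one-line "training" is a stand-in:

import queue
import random
import threading

tasks = queue.Queue()        # global task queue of suggested trials
results = queue.Queue()

def worker():
    while True:
        trial = tasks.get()
        if trial is None:    # sentinel: no more trials
            break
        # Stand-in for training; report the trial metric back.
        metric = -(trial["lr"] - 0.01) ** 2
        results.put((trial, metric))

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()
for _ in range(9):           # coordinator suggests trials
    tasks.put({"lr": 10 ** random.uniform(-4, -1)})
for _ in workers:
    tasks.put(None)
for w in workers:
    w.join()
best = max((results.get() for _ in range(9)), key=lambda r: r[1])
print("best trial:", best)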

SLIDE 51

INNER AND OUTER LOOP OF LARGE SCALE DEEP LEARNING

Outer loop: a Search Method proposes hyperparameters h and receives back a metric τ.

Inner loop:

[Figure: worker1, worker2, ..., workerN each train a replica of the network on their share of the data and synchronize gradients ∇1, ∇2, ..., ∇N]

SLIDE 52

FEATURE STORE

SLIDE 53

FEATURE STORE

[Figure: a feature transformation ϕ(x) turns a raw data matrix and its labels (y1, ..., yn) into model inputs used to produce predictions ŷ]

SLIDE 54

FEATURE STORE

[Figure: a feature transformation ϕ(x) turns a raw data matrix and its labels (y1, ..., yn) into model inputs used to produce predictions ŷ]

[6] Uber Engineering. Scaling Machine Learning at Uber with Michelangelo. 2018.

SLIDE 55

FEATURE STORE

[Figure: a feature transformation ϕ(x) turns a raw data matrix and its labels (y1, ..., yn) into model inputs used to produce predictions ŷ]

“Data is the hardest part of ML and the most important piece to get right. Modelers spend most of their time selecting and transforming features at training time and then building the pipelines to deliver those features to production models.”

  • Uber [6]

[6] Uber Engineering. Scaling Machine Learning at Uber with Michelangelo. 2018.

SLIDE 56

FEATURE STORE

[Figure: the feature transformation ϕ(x) and its outputs are now managed by a Feature Store between the raw data and the model]

“Data is the hardest part of ML and the most important piece to get right. Modelers spend most of their time selecting and transforming features at training time and then building the pipelines to deliver those features to production models.”

  • Uber [7]

[7] Uber Engineering. Scaling Machine Learning at Uber with Michelangelo. 2018.

SLIDE 57

WHAT IS A FEATURE?

A feature is a measurable property of some data sample. A feature could be:

◮ An aggregate value (min, max, mean, sum)
◮ A raw value (a pixel, a word from a piece of text)
◮ A value from a database table (the age of a customer)
◮ A derived representation: e.g. an embedding or a cluster

Features are the fuel for AI systems:

[Figure: training loop. Features (x1, ..., xn) → Model θ → Prediction ŷ → Loss L(y, ŷ) → Gradient ∇θ L(y, ŷ)]
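For example, the aggregate features used earlier in this deck (average_attendance, average_player_age) could be computed with pandas; the games table below is made-up sample data:

import pandas as pd

games = pd.DataFrame({
    "team_id":    [1, 1, 2, 2],
    "attendance": [31000, 28000, 45000, 52000],
    "player_age": [24.1, 24.3, 27.8, 27.5],
})

# Aggregate per-team features of the kind a feature store would serve.
features = games.groupby("team_id").agg(
    average_attendance=("attendance", "mean"),
    average_player_age=("player_age", "mean"),
).reset_index()
print(features)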

SLIDE 58

FEATURE ENGINEERING IS CRUCIAL FOR MODEL PERFORMANCE

[Figure: data points plotted along a single feature, x1]

SLIDE 59

FEATURE ENGINEERING IS CRUCIAL FOR MODEL PERFORMANCE

[Figure: the same data plotted on two features, x1 and x2]

SLIDE 60

FEATURE ENGINEERING IS CRUCIAL FOR MODEL PERFORMANCE

[Figure: the same data on x1 and x2; in two dimensions the structure needed to separate the data becomes visible]

SLIDE 61

DISENTANGLE YOUR ML PIPELINES WITH A FEATURE STORE

[Figure: Data Sources (Dataset 1, Dataset 2, ..., Dataset n) → Feature Store → Models]

Feature Store: a data management platform for machine learning; the interface between data engineering and data science.

Models: models are trained using sets of features. The features are fetched from the feature store and can overlap between models.
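In code, the consumer side is the get_features call shown earlier; the producer side writes engineered features back. The insert_into_featuregroup helper name is an assumption based on the hops featurestore docs, so verify it against your version:

from hops import featurestore

# Consumer side: fetch (possibly overlapping) features for a model.
features = featurestore.get_features([
    "average_attendance", "average_player_age"])

# Producer side (assumed helper name): write an engineered feature
# dataframe back into a feature group. features_df stands for a
# Spark dataframe produced by your feature engineering job.
featurestore.insert_into_featuregroup(features_df, "team_features")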

SLIDE 62

DISENTANGLE YOUR ML PIPELINES WITH A FEATURE STORE

[Figure: Dataset 1, Dataset 2, ..., Dataset n → Feature Store → Models]

Backfilling

SLIDE 63

DISENTANGLE YOUR ML PIPELINES WITH A FEATURE STORE

[Figure: Dataset 1, Dataset 2, ..., Dataset n → Feature Store → Models]

Backfilling, Analysis

SLIDE 64

DISENTANGLE YOUR ML PIPELINES WITH A FEATURE STORE

[Figure: Dataset 1, Dataset 2, ..., Dataset n → Feature Store → Models]

Backfilling, Analysis, Versioning

SLIDE 65

DISENTANGLE YOUR ML PIPELINES WITH A FEATURE STORE

[Figure: Dataset 1, Dataset 2, ..., Dataset n → Feature Store → Models]

Backfilling, Analysis, Versioning, Documentation

SLIDE 66

SUMMARY

◮ Deep Learning is going distributed
◮ Algorithms for DDL are available in several frameworks
◮ Applying DDL in practice brings a lot of operational complexity
◮ Hopsworks is a platform for scale-out deep learning and big data processing
◮ Hopsworks makes DDL simpler by providing simple abstractions for distributed training, parallel experiments and much more

@hopshadoop www.hops.io
@logicalclocks www.logicalclocks.com
We are open source:
https://github.com/logicalclocks/hopsworks
https://github.com/hopshadoop/hops

Thanks to Logical Clocks Team: Jim Dowling, Seif Haridi, Theo Kakantousis, Fabio Buso, Gautier Berthou, Ermias Gebremeskel, Mahmoud Ismail, Salman Niazi, Antonios Kouzoupis, Robin Andersson, Alex Ormenisan, Rasmus Toivonen and Steffen Grohsschmiedt.

SLIDE 67

[Figure: demo pipeline. Raw/Structured Data → Data Lake → Feature Computation → Feature Store → Curated Features → Model]

Demo-Setting

SLIDE 68

Hands-on Workshop

  • 1. If you haven't registered, do it now on hops.site
  • 2. Cheatsheet: http://snurran.sics.se/hops/kim/workshop_cheat.txt

SLIDE 69

EXERCISE 1 (HELLO HOPSWORKS)

  • 1. Create a Deep Learning Tour Project on Hopsworks
  • 2. Start a Jupyter Notebook with the config:

◮ "Experiment" Mode
◮ 1 GPU
◮ 4000 (MB) memory for the driver (appmaster)
◮ 8000 (MB) memory for the executor
◮ Rest can be default

  • 3. Create a new “PySpark” notebook
  • 4. In the first cell, write:

print("Hello Hopsworks")

  • 5. Execute the cell (Ctrl + <Enter>)

SLIDE 70

EXERCISE 2 (DISTRIBUTED HELLO HOPSWORKS WITH GPU)

  • 1. Add a new cell with the contents:

def executor():
    print("Hello from GPU")

  • 2. Add a new cell with the contents:

from hops import experiment
experiment.launch(executor)

  • 3. Execute the two cells in order (Ctrl + <Enter>)
  • 4. Go to the Application UI

SLIDE 71

EXERCISE 2 (DISTRIBUTED HELLO HOPSWORKS WITH GPU)

SLIDE 72

EXERCISE 2 (DISTRIBUTED HELLO HOPSWORKS WITH GPU)

SLIDE 73

EXERCISE 3 (LOAD MNIST FROM HOPSFS)

  • 1. Add a new cell with the contents:

from hops import hdfs
import tensorflow as tf

def create_tf_dataset():
    train_files = [hdfs.project_path() +
                   "TestJob/data/mnist/train/train.tfrecords"]
    dataset = tf.data.TFRecordDataset(train_files)
    def decode(example):
        example = tf.parse_single_example(example, {
            'image_raw': tf.FixedLenFeature([], tf.string),
            'label': tf.FixedLenFeature([], tf.int64)})
        image = tf.reshape(
            tf.decode_raw(example['image_raw'], tf.uint8), (28, 28, 1))
        label = tf.one_hot(tf.cast(example['label'], tf.int32), 10)
        return image, label
    return dataset.map(decode).batch(128).repeat()

SLIDE 74

EXERCISE 3 (LOAD MNIST FROM HOPSFS)

  • 1. Add a new cell with the contents:

create_tf_dataset()

  • 2. Execute the two cells in order (Ctrl + <Enter>)

SLIDE 75

EXERCISE 4 (DEFINE CNN MODEL)

from tensorflow import keras

def create_model():
    model = keras.Sequential()
    model.add(keras.layers.Conv2D(filters=32, kernel_size=3,
                                  padding='same', activation='relu',
                                  input_shape=(28, 28, 1)))
    model.add(keras.layers.BatchNormalization())
    model.add(keras.layers.MaxPooling2D(pool_size=2))
    model.add(keras.layers.Dropout(0.3))
    model.add(keras.layers.Conv2D(filters=64, kernel_size=3,
                                  padding='same', activation='relu'))
    model.add(keras.layers.BatchNormalization())
    model.add(keras.layers.MaxPooling2D(pool_size=2))
    model.add(keras.layers.Dropout(0.3))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(128, activation='relu'))
    model.add(keras.layers.Dropout(0.5))
    model.add(keras.layers.Dense(10, activation='softmax'))
    return model

SLIDE 76

EXERCISE 4 (DEFINE CNN MODEL)

  • 1. Add a new cell with the contents:

create_model().summary()

  • 2. Execute the two cells in order (Ctrl + <Enter>)

SLIDE 77

EXERCISE 5 (DEFINE & RUN THE EXPERIMENT)

  • 1. Add a new cell with the contents:

from hops import tensorboard
from tensorflow.python.keras.callbacks import TensorBoard

def train_fn():
    dataset = create_tf_dataset()
    model = create_model()
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=keras.optimizers.Adam(),
                  metrics=['accuracy'])
    tb_callback = TensorBoard(log_dir=tensorboard.logdir())
    model_ckpt_callback = keras.callbacks.ModelCheckpoint(
        tensorboard.logdir(), monitor='acc')
    history = model.fit(dataset, epochs=50, steps_per_epoch=80,
                        callbacks=[tb_callback])
    return history.history["acc"][-1]

SLIDE 78

EXERCISE 5 (DEFINE & RUN THE EXPERIMENT)

  • 1. Add a new cell with the contents:

experiment.launch(train_fn)

  • 2. Execute the two cells in order (Ctrl + <Enter>)
  • 3. Go to the Application UI and monitor the training progress

SLIDE 79

REFERENCES

◮ Example notebooks: https://github.com/logicalclocks/hops-examples
◮ HopsML [8]
◮ Hopsworks [9]
◮ Hopsworks' feature store [10]
◮ Maggy: https://github.com/logicalclocks/maggy

[8] Logical Clocks AB. HopsML: Python-First ML Pipelines. https://hops.readthedocs.io/en/latest/hopsml/hopsML.html. 2018.
[9] Jim Dowling. Introducing Hopsworks. https://www.logicalclocks.com/introducing-hopsworks/. 2018.
[10] Kim Hammar and Jim Dowling. Feature Store: the missing data layer in ML pipelines? https://www.logicalclocks.com/feature-store/. 2018.