@MagnusHyttsten
Meet Robin (Guinea Pig)
An Awkward Social Experiment (that I'm afraid you need to be part of...)
ROCKS!
Input Data: "GTC"  ->  Model (Your Brain)  ->  Output: <Awkward Silence>
(Examples: Train & Test Data)
Input Data: "GTC"  ->  Model (Your Brain)  ->  Output: "Rocks"
Labels (Correct Answers): "Rocks"  ->  Loss Function  ->  Optimizer  ->  Model
(Examples: Train & Test Data)
Agenda
- Intro to Machine Learning
- Creating a TensorFlow Model
- Why are GPUs Great for Machine Learning Workloads
- Distributed TensorFlow Training
TensorFlow Architecture
- Distributed Execution Engine, running on: CPU, GPU, Android, iOS, ...
- Language frontends: Python, Java, C++, ...
- Higher-level APIs: tf.keras.layers, tf.keras, Estimator, Premade Estimators, Datasets
TensorFlow Estimator Architecture

The Estimator (tf.estimator) calls an input_fn (Datasets, tf.data) to feed the model.
Premade Estimators
- DNNClassifier / DNNRegressor
- LinearClassifier / LinearRegressor
- DNNLinearCombinedClassifier / DNNLinearCombinedRegressor
- BaselineClassifier / BaselineRegressor
estimator = ...  # any of the premade Estimators below

# Train locally
estimator.train(input_fn=..., ...)
estimator.evaluate(input_fn=..., ...)
estimator.predict(input_fn=..., ...)
Premade Estimators + Datasets
- LinearRegressor(...) / LinearClassifier(...)
- DNNRegressor(...) / DNNClassifier(...)
- DNNLinearCombinedRegressor(...) / DNNLinearCombinedClassifier(...)
- BaselineRegressor(...) / BaselineClassifier(...)
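As a minimal sketch of constructing one of these (the feature column and layer sizes below are illustrative assumptions, not from the talk):

import tensorflow as tf

# Describe the input features: a single 1-dimensional numeric feature 'x'.
feature_columns = [tf.feature_column.numeric_column('x', shape=[1])]

# Any of the premade Estimators above is constructed the same way.
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[32, 16])   # two hidden layers; binary classes by default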
Custom Models #1 - model_fn

Instead of a premade Estimator, you provide your own model_fn to Estimator (tf.estimator). The model_fn builds the model with Keras Layers (tf.keras.layers), and the Estimator still calls your input_fn (Datasets, tf.data).
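A minimal model_fn sketch, assuming a feature dict with key 'x' and a 10-class problem (both illustrative):

import tensorflow as tf

def model_fn(features, labels, mode):
    # Build the network from Keras layers.
    net = tf.keras.layers.Dense(128, activation='relu')(features['x'])
    logits = tf.keras.layers.Dense(10)(net)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={'logits': logits})

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn)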
Custom Models #2 - Keras Model

Build a Keras model (tf.keras) directly and convert it with model_to_estimator. The resulting Estimator still calls your input_fn (Datasets, tf.data).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
# input_shape added so the model is fully defined; 28x28x1 (e.g. MNIST-sized
# grayscale images) is an assumption for illustration.
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
                 input_shape=(28, 28, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Custom Models
# Convert a Keras model to tf.estimator.Estimator
estimator = tf.keras.estimator.model_to_estimator(keras_model=model, ...)

# Train locally
estimator.train(input_fn=..., ...)
estimator.evaluate(input_fn=..., ...)
estimator.predict(input_fn=..., ...)
Train/Evaluate Model: Estimator + Datasets
Summary - Use Estimators, Datasets, and Keras
- Premade Estimators (tf.estimator): use when possible
- Custom Models:
  a. model_fn in Estimator + tf.keras.layers
  b. Keras Models (tf.keras): estimator = tf.keras.estimator.model_to_estimator(...)
- Datasets (tf.data) for the input pipeline
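For the input pipeline, a minimal input_fn sketch using tf.data (the toy in-memory data and step count are illustrative; it plugs into an Estimator like the ones sketched above):

import tensorflow as tf

def train_input_fn():
    # Toy in-memory data; real pipelines would read TFRecords, CSVs, etc.
    features = {'x': [[1.0], [2.0], [3.0], [4.0]]}
    labels = [0, 1, 0, 1]
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    return dataset.shuffle(4).repeat().batch(2)

estimator.train(input_fn=train_input_fn, steps=1000)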
Agenda
- Intro to Machine Learning
- Creating a TensorFlow Model
- Why are GPUs Great for Machine Learning Workloads
- Distributed TensorFlow Training
Disclaimer...
- High-level: we look at only part of the power of GPUs
- Simple overview: more optimal designs exist
- Reduced scope: only considering fully-connected layers, etc.
Strengths of the V100 GPU
- Built for massively parallel computations
- Specific hardware & software to manage deep learning workloads (Tensor Cores, mixed-precision execution, etc.)
Tesla SXM V100
- 5376 cores (FP32)
Strengths of the V100 GPU

What are we going to do with 5376 FP32 cores?
"Execute things in parallel!"
Yes, but how exactly can we do that for ML workloads?
"Hey, that's your job - that's why we're here listening!"
Alright, let's talk about that then
- We may have a huge number of layers
- Each layer can have a huge number of neurons
- -> There may be hundreds of millions, or even billions, of multiply and add ops

All the knobs are W (weight) values that we need to tune, so that given a certain input, they generate the correct output.
"Matrix Multiplication is EATING (the computing resources of) THE WORLD"
h_i,j = [X0, X1, X2, ...] * [W0, W1, W2, ...]
      = X0*W0 + X1*W1 + X2*W2 + ...
X = [1.0, 2.0, ..., 256.0]  # Let's say we have 256 input values
W = [0.1, 0.1, ..., 0.1]    # Then we need to have 256 weight values
h0,0 = X * W                # 1*0.1 + 2*0.1 + ... + 256*0.1 == 3289.6
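A quick sanity check of that arithmetic, using NumPy for illustration:

import numpy as np

X = np.arange(1.0, 257.0)   # [1.0, 2.0, ..., 256.0]
W = np.full(256, 0.1)       # [0.1, 0.1, ..., 0.1]
print(np.dot(X, W))         # -> 3289.6 (sum(1..256) == 32896, times 0.1)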
Matmul

Single-threaded Execution

X = [1.0, 2.0, ..., 256.0]  # Let's say we have 256 input values
W = [0.1, 0.1, ..., 0.1]    # Then we need to have 256 weight values
h0,0 = X * W                # 1*0.1 + 2*0.1 + ... + 256*0.1 == 3289.6

One thread walks X and W element by element, accumulating one product per step:

1*0.1            = 0.1
0.1 + 2*0.1      = 0.3
...
3238.5 + 255*0.1 = 3264.0
3264.0 + 256*0.1 = 3289.6

Total: 256 * t (one multiply-accumulate per time step t)
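A sketch of the single-threaded version in plain Python, mirroring the step-by-step trace above:

# One multiply-accumulate at a time: 256 sequential steps
X = [float(i) for i in range(1, 257)]   # [1.0, 2.0, ..., 256.0]
W = [0.1] * 256                         # [0.1, 0.1, ..., 0.1]

acc = 0.0
for x, w in zip(X, W):
    acc += x * w
print(acc)   # -> ~3289.6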
GPU Execution - #1 Multiplication Step

X = [1.0, 2.0, ..., 256.0]  # Let's say we have 256 input values
W = [0.1, 0.1, ..., 0.1]    # Then we need to have 256 weight values
h0,0 = X * W                # 1*0.1 + 2*0.1 + ... + 256*0.1 == 3289.6

With 5376 FP32 cores (Tesla SXM V100), every product gets its own thread, and all 256 multiplications execute at once:

X1_mul_vector:
1*0.1   = 0.1
2*0.1   = 0.2
...
256*0.1 = 25.6

Multi-threaded execution (256 threads), total: 1 * t
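A sketch of the multiplication step with TensorFlow; the '/gpu:0' device and the soft-placement fallback are assumptions about the machine it runs on:

import tensorflow as tf

with tf.device('/gpu:0'):            # assumes a GPU is available
    X = tf.range(1.0, 257.0)         # [1.0, 2.0, ..., 256.0]
    W = tf.fill([256], 0.1)          # [0.1, 0.1, ..., 0.1]
    products = X * W                 # 256 independent multiplies

# allow_soft_placement falls back to CPU if no GPU is present
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(products)[:3])    # -> [0.1  0.2  0.3] (approx.)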
GPU - What about Summation?

The multiplication step leaves the 256 products in X1_mul_vector (0.1, 0.2, ..., 25.6); they still have to be added together to produce h0,0.
GPU - #2 Summation Step

X = [1.0, 2.0, ..., 256.0]  # Let's say we have 256 input values
W = [0.1, 0.1, ..., 0.1]    # Then we need to have 256 weight values
h0,0 = X * W                # 1*0.1 + 2*0.1 + ... + 256*0.1 == 3289.6

Multi-threaded execution (256 threads): the products are added pairwise in parallel, halving the number of partial sums at each level until only h0,0 remains.
Tree depth: log2 128 = 7, so the summation costs 7 * t.
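A sketch of the pairwise tree reduction in plain Python; each pass of the loop corresponds to one parallel level on the GPU:

# The 256 products from step #1 (0.1, 0.2, ..., 25.6)
products = [(i + 1) * 0.1 for i in range(256)]

while len(products) > 1:
    # On the GPU, every addition in this pass runs at the same time,
    # so each pass costs one time step and halves the value count.
    products = [products[i] + products[i + 1]
                for i in range(0, len(products), 2)]

print(products[0])   # -> ~3289.6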
Comparing - Order of Magnitude (sequences)

Single-threaded execution:     256 * t
GPU multi-threaded execution:  1 * t + 7 * t = 8 * t

Many knobs to tune, but the type of calculation we perform is very well suited to GPUs.
Summary
- GPUs == Many Threads == Great for ML Workloads
- And now you know how this works
- Fortunately, you don't need to worry about implementation details
Agenda
- Intro to Machine Learning
- Creating a TensorFlow Model
- Why are GPUs Great for Machine Learning Workloads
- Distributed TensorFlow Training
Different Services

"Between-Graph & Asynchronous Training"

Services, typically distributed to different hardware nodes:
- Workers: do ALL the processing (1 or more)
- Parameter Servers: store all the weights (1 or more - why more?)
- Chief: coordinates the whole job (exactly 1)
Distributed Training - Between-Graph / Async

- Chief & Worker #1, Worker #2, Worker #3 (CPU & GPU): operation execution
- Parameter Server #1, Parameter Server #2 (CPU): variable storage
- Parameter Servers send variables to Workers; Workers send gradient updates back to Parameter Servers
- Shared storage:
  - Checkpoint storage: Parameter Servers & Chief write & read checkpoint data
  - Training data storage: Workers read training data
# Convert a Keras model to tf.estimator.Estimator
estimator = tf.keras.estimator.model_to_estimator(keras_model=model, ...)

# Train locally
estimator.train(input_fn=..., ...)
estimator.evaluate(input_fn=..., ...)
estimator.predict(input_fn=..., ...)

Train/Evaluate Model: Estimator + Datasets
# Convert a Keras model to tf.estimator.Estimator
estimator = tf.keras.estimator.model_to_estimator(keras_model=model, ...)

# Train locally & distributed
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

Train/Evaluate Model: Estimator + Datasets
train_spec and eval_spec specify the input data, how long to train & evaluate, and the evaluation metrics.
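A minimal sketch of the two specs (the input_fns, max_steps, and steps values are illustrative):

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=10000)
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn, steps=100)

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)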
The TF_CONFIG environment variable tells each process its role in the cluster.
Starting Parameter Server #1

# To start Parameter Server #1 (run on host4):
TF_CONFIG='{
  "cluster": {
    "chief":  ["host1:2222"],
    "worker": ["host1:2222", "host2:2222", "host3:2222"],
    "ps":     ["host4:2222", "host5:2222"]
  },
  "task": {"type": "ps", "index": 0}
}'
$(host4) python <YourProgram.py>
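Every process gets the same "cluster" spec; only "task" differs. Following the same pattern (same hosts as above), Worker #2 on host2 would be started with:

# To start Worker #2 (run on host2):
TF_CONFIG='{
  "cluster": {
    "chief":  ["host1:2222"],
    "worker": ["host1:2222", "host2:2222", "host3:2222"],
    "ps":     ["host4:2222", "host5:2222"]
  },
  "task": {"type": "worker", "index": 1}
}'
$(host2) python <YourProgram.py>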
Different Modes of Distribution

Work Distribution Mode
- Between-graph replication - individual processes: each Worker runs its own process (of the same program)
- In-graph replication - single program/process: ops or variables are distributed to different nodes (or GPU cards)

Variable Update Mode
- Asynchronous: updates are applied across Workers in parallel
- Synchronous: update values are coordinated across Workers
Summary
- Workers are stateless - easy to add new ones
- >= 1 Parameter Servers - supports large variables (e.g. embeddings)
- Shared storage makes Cloud a great place to run (GCS, S3, ...)
HEY! Stop there, not so fast! What about explicit device placement:

with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c = tf.matmul(a, b)
Getting Started With TensorFlow & GPUs == void (at least in the future)
Kubeflow
https://goo.gl/2vfHcm
Summary
- Use tf.estimator, tf.data, tf.keras to define & train your models
- GPUs are great for ML workloads
- Estimators support between-graph, asynchronous training
  ■ Chief ■ Parameter Servers ■ Workers
<<<STAY TUNED>>>
@MagnusHyttsten
Thank You