TensorFlow and Recurrent Neural Networks
CSE392 - Spring 2019, Special Topic in CS

Task: ● Language Modeling (and most tasks)
How? ● Recurrent Neural Network
     ○ Implementation toolkit: TensorFlow

Language Modeling
Building a model (or system / API) that can answer the following:
    a sequence of natural language -> Trained Language Model -> What is the next word in the sequence?
    Training Corpus -> training (fit, learn) -> Trained Language Model
To fully capture natural language, models get very complex!
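
To make the "what is the next word?" question concrete, here is a minimal sketch of a count-based bigram model in plain Python. It is deliberately not the recurrent approach this lecture builds toward; the toy corpus and the function names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the training corpus."""
    follower_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follower_counts[current_word][next_word] += 1
    return follower_counts

def predict_next_word(model, sequence):
    """Return the most frequent follower of the last word, or None if unseen."""
    words = sequence.lower().split()
    if not words:
        return None
    followers = model.get(words[-1])
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy training corpus, not taken from the lecture.
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram_model(toy_corpus)

print(predict_next_word(model, "the dog sat"))   # on
print(predict_next_word(model, "sat on the"))    # cat
print(predict_next_word(model, "unseen zebra"))  # None
```

A model like this only looks one word back, which is one reason models that try to fully capture natural language get much more complex.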

Two Topics
1. A Concept in Machine Learning: Recurrent Neural Networks (RNNs)
2. A Toolkit or Data Workflow System: TensorFlow
   Powerful for implementing RNNs

TensorFlow
A workflow system catered to numerical computation.
Basic idea: defines a graph of operations on tensors.
Tensor: a multi-dimensional matrix.
    2-d: a 2-d tensor is just a matrix
    1-d: vector
    0-d: a constant / scalar
Linguistic ambiguity: the "d"s of a tensor (how many axes it has) are not the same as the dimensions of a matrix (its rows and columns).
(Image: i.stack.imgur.com)
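
The difference between the "d" of a tensor and the dimensions of a matrix can be sketched with nested Python lists. This is an illustration only; `tensor_shape` and `tensor_rank` are hypothetical helper names, not TensorFlow functions, and the sketch assumes the lists are regularly nested.

```python
def tensor_shape(value):
    """Return the shape of a regularly nested list as a tuple of axis lengths."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

def tensor_rank(value):
    """Number of axes: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    return len(tensor_shape(value))

scalar = 3.0
vector = [1.5, 2.0, 1.0, 4.2]
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(name, "rank:", tensor_rank(value), "shape:", tensor_shape(value))
# scalar rank: 0 shape: ()
# vector rank: 1 shape: (4,)
# matrix rank: 2 shape: (2, 3)
```

The matrix here is a 2-d tensor (two axes), even though as a matrix it would be described as having dimensions 2 x 3.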

TensorFlow
A workflow system catered to numerical computation.
Basic idea: defines a graph of operations on tensors.
Why? Efficient, high-level built-in linear algebra and machine learning optimization operations (i.e. transformations). This enables complex models, like deep learning.

TensorFlow
Operations on tensors are often conceptualized as graphs.
A simple example: c = mm(a, b), written in TensorFlow as
    c = tensorflow.matmul(a, b)
(graph: input nodes a and b feed a single matmul node that produces c)
A second example:
    d = b + c
    e = c + 2
    a = d * e
(Adventures in Machine Learning. Python TensorFlow Tutorial, 2017)

Ingredients of a TensorFlow
tensors*
    ○ variables - persistent mutable tensors: tf.Variable(initial_value, name)
    ○ constants - constant: tf.constant(value, dtype, name)
    ○ placeholders - from data: tf.placeholder(dtype, shape, name)
operations
    an abstract computation (e.g. matrix multiply, add), executed by device kernels
graph
    the operations on tensors, connected into a graph
session
    defines the environment in which operations run (like a Spark context)
    ● Places operations on devices
    ● Stores the values of variables (when not distributed)
    ● Carries out execution: eval() or run()
devices
    the specific devices (cpus or gpus) on which to run the session
* technically, operations that work with tensors.
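
The split between building a graph and running it in a session can be sketched without TensorFlow at all. The toy below is an assumption-laden illustration in plain Python: `Node`, `constant`, `placeholder`, and `run` are hypothetical names that only mimic the roles of the ingredients above, it has no variables or devices, and it feeds placeholders by name rather than by tensor object.

```python
import operator

class Node:
    """One graph node: a constant, a placeholder, or an operation on other nodes."""
    def __init__(self, kind, value=None, op=None, inputs=(), name=None):
        self.kind, self.value, self.op = kind, value, op
        self.inputs, self.name = inputs, name

    def __add__(self, other):
        return Node("op", op=operator.add, inputs=(self, as_node(other)))

    def __mul__(self, other):
        return Node("op", op=operator.mul, inputs=(self, as_node(other)))

def as_node(value):
    """Wrap plain Python numbers as constant nodes."""
    return value if isinstance(value, Node) else Node("constant", value=value)

def constant(value, name=None):
    return Node("constant", value=value, name=name)

def placeholder(name):
    return Node("placeholder", name=name)

def run(node, feed=None):
    """Toy 'session': evaluate a node by recursively evaluating its inputs."""
    feed = feed or {}
    if node.kind == "constant":
        return node.value
    if node.kind == "placeholder":
        return feed[node.name]
    return node.op(*(run(n, feed) for n in node.inputs))

# Building the graph computes nothing yet.
b = placeholder("b")
c = constant(3.0, name="c")
d = b + c
e = c + 2
a = d * e

# Values only appear when the graph is run with data for the placeholder.
print(run(a, feed={"b": 1.5}))  # 22.5
print(run(a, feed={"b": 0.0}))  # 15.0
```

The same graph is reused for both runs; only the data fed to the placeholder changes, which is the role placeholders play ("from data") in the list above.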

Example (working with 0-d tensors)
    import tensorflow as tf
    b = tf.constant(1.5, dtype=tf.float32, name="b")
    c = tf.constant(3.0, dtype=tf.float32, name="c")
    d = b+c  #1.5 + 3
    e = c+2  #3+2
    a = d*e  #4.5*5 = 22.5

Example: now a 1-d tensor
    import tensorflow as tf
    b = tf.constant( [1.5, 2, 1, 4.2] , dtype=tf.float32, name="b")
    c = tf.constant( [3, 1, 5, 10] , dtype=tf.float32, name="c")
    d = b+c  #[4.5, 3, 6, 14.2]
    e = c+2  #[5, 3, 7, 12]
    a = d*e  #??
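
One way to check the 1-d example by hand is to redo the arithmetic with plain Python lists. This sketch assumes that `+` and `*` on same-shaped tensors act element by element, and that adding the scalar 2 adds it to every element; values are rounded only to hide floating-point noise.

```python
b = [1.5, 2.0, 1.0, 4.2]
c = [3.0, 1.0, 5.0, 10.0]

d = [bi + ci for bi, ci in zip(b, c)]   # element-wise sum
e = [ci + 2 for ci in c]                # the scalar 2 is added to every element
a = [di * ei for di, ei in zip(d, e)]   # element-wise product, not a dot product

print([round(x, 4) for x in d])  # [4.5, 3.0, 6.0, 14.2]
print([round(x, 4) for x in e])  # [5.0, 3.0, 7.0, 12.0]
print([round(x, 4) for x in a])  # [22.5, 9.0, 42.0, 170.4]
```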

Example: now a 2-d tensor
    import tensorflow as tf
    b = tf.constant( [[...], [...]] , dtype=tf.float32, name="b")
    c = tf.constant( [[...], [...]] , dtype=tf.float32, name="c")
    d = b+c
    e = c+2
    a = tf.matmul(d,e)
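
The 2-d example switches from `d*e` to a matrix multiply, and the two give different results. The sketch below contrasts them in plain Python; the 2 x 2 values are hypothetical (the slide elides its own), and `matmul` and `elementwise_mul` are illustrative helpers, not TensorFlow calls.

```python
def matmul(left, right):
    """Matrix product of two 2-d nested lists (rows of left times columns of right)."""
    if len(left[0]) != len(right):
        raise ValueError("inner dimensions must match")
    return [
        [sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
        for row in left
    ]

def elementwise_mul(left, right):
    """Multiply matching entries of two same-shaped 2-d nested lists."""
    return [[l * r for l, r in zip(lrow, rrow)] for lrow, rrow in zip(left, right)]

# Hypothetical values standing in for the elided [[...], [...]] on the slide.
d = [[1.0, 2.0], [3.0, 4.0]]
e = [[5.0, 6.0], [7.0, 8.0]]

print(matmul(d, e))           # [[19.0, 22.0], [43.0, 50.0]]
print(elementwise_mul(d, e))  # [[5.0, 12.0], [21.0, 32.0]]
```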

Example: Logistic Regression
    X = tf.constant( [[...], [...]] , dtype=tf.float32, name="X")
    y = tf.constant( [...] , dtype=tf.float32, name="y")

    # Define our beta parameter vector
    # (featuresZ_pBias is defined outside this slide; presumably the feature matrix behind X):
    beta = tf.Variable(tf.random_uniform([featuresZ_pBias.shape[1], 1], -1., 1.), name="beta")

    # then set up the prediction model's graph
    # (the slide wrote tf.softmax; the TensorFlow 1.x op lives at tf.nn.softmax.
    #  Note: softmax normalizes across columns, so with a single-column beta every
    #  prediction is 1; tf.sigmoid is the usual choice for binary logistic regression):
    y_pred = tf.nn.softmax(tf.matmul(X, beta), name="predictions")

    # Define a *cost function* to minimize (conceptually like |y - y_pred|):
    penalizedCost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(y_pred), reduction_indices=1))

Optimizing Parameters -- derived from gradients
TensorFlow has a built-in ability to derive gradients given a cost function:
    tf.gradients(cost, [params])
(Figure source: http://rasbt.github.io/mlxtend/user_guide/general_concepts/gradient-optimization/)
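
To see what "optimizing from gradients" means, here is a minimal plain-Python sketch. It estimates the gradient numerically with a central difference, which is an assumption made for illustration; TensorFlow derives gradients from the graph itself rather than by finite differences. The one-parameter cost function is hypothetical.

```python
def cost(beta):
    """Hypothetical one-parameter cost with its minimum at beta = 3."""
    return (beta - 3.0) ** 2

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

beta = 0.0
learning_rate = 0.1
for step in range(100):
    # Step against the gradient: downhill on the cost surface.
    beta -= learning_rate * numerical_gradient(cost, beta)

print(round(numerical_gradient(cost, 0.0), 4))  # -6.0
print(round(beta, 4))                           # 3.0
```

Each step moves the parameter a small distance against the gradient, so the cost shrinks until the parameter settles near the minimum.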

Example: Logistic Regression
    import tensorflow as tf

    X = tf.constant( [[...], [...]] , dtype=tf.float32, name="X")
    y = tf.constant( [...] , dtype=tf.float32, name="y")

    # Define our beta parameter vector
    # (featuresZ_pBias is defined outside this slide; presumably the feature matrix behind X):
    beta = tf.Variable(tf.random_uniform([featuresZ_pBias.shape[1], 1], -1., 1.), name="beta")

    # then set up the prediction model's graph
    # (the slide wrote tf.softmax; the TensorFlow 1.x op lives at tf.nn.softmax.
    #  Note: softmax normalizes across columns, so with a single-column beta every
    #  prediction is 1; tf.sigmoid is the usual choice for binary logistic regression):
    y_pred = tf.nn.softmax(tf.matmul(X, beta), name="predictions")

    # Define a *cost function* to minimize:
    cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(y_pred), reduction_indices=1))

    # define how to optimize and initialize
    # (learning_rate and n_epochs are assumed to be set elsewhere):
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
    training_op = optimizer.minimize(cost)
    init = tf.global_variables_initializer()

    # iterate over optimization:
    with tf.Session() as sess:
        sess.run(init)
        for epoch in range(n_epochs):
            sess.run(training_op)
        # done training, get final beta (inside the session, so eval() has a default session):
        best_beta = beta.eval()
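
The same training loop can be sketched without TensorFlow to show what the optimizer does on each epoch. This is not the slide's code: it assumes a sigmoid prediction and the binary cross-entropy cost, writes the gradient out by hand, and uses a hypothetical toy data set (a bias column plus one centered feature).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(X, beta):
    """Predicted probability of class 1 for every row of X."""
    return [sigmoid(sum(x * w for x, w in zip(row, beta))) for row in X]

def cross_entropy(y, y_pred):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    return -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p) for t, p in zip(y, y_pred)
    ) / len(y)

def train(X, y, learning_rate=0.5, n_epochs=500):
    beta = [0.0] * len(X[0])
    for _ in range(n_epochs):
        y_pred = predict(X, beta)
        # Gradient of the mean cross-entropy: X^T (y_pred - y) / N
        gradient = [
            sum((p - t) * row[j] for p, t, row in zip(y_pred, y, X)) / len(y)
            for j in range(len(beta))
        ]
        beta = [w - learning_rate * g for w, g in zip(beta, gradient)]
    return beta

# Hypothetical toy data: first column is the bias term, second is one centered feature.
X = [[1.0, -1.75], [1.0, -1.25], [1.0, -0.75],
     [1.0, 0.75], [1.0, 1.25], [1.0, 1.75]]
y = [0, 0, 0, 1, 1, 1]

initial_cost = cross_entropy(y, predict(X, [0.0, 0.0]))
best_beta = train(X, y)
final_cost = cross_entropy(y, predict(X, best_beta))

print(final_cost < initial_cost)                   # True
print([round(p) for p in predict(X, best_beta)])   # [0, 0, 0, 1, 1, 1]
```

In the TensorFlow version, the hand-written gradient line is what `optimizer.minimize(cost)` takes care of automatically.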
