SLIDE 1

TensorFlow: neural networks lab

Paolo Dragone and Andrea Passerini

paolo.dragone@unitn.it passerini@disi.unitn.it

Machine Learning

SLIDE 2

Introduction

TensorFlow

TensorFlow is a Python package

Numerical computation using data flow graphs

Developed by Google for machine learning and deep neural network research
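As a minimal illustration of the dataflow model (my example, not from the slides; 2016-era TensorFlow 1.x API): the graph is built first, and nothing is computed until a session runs it.

import tensorflow as tf

# Build the dataflow graph; no computation happens yet.
a = tf.constant(3.0)
b = tf.constant(4.0)
c = a * b  # a multiplication node in the graph

# Launch a session to actually execute the graph.
with tf.Session() as sess:
    print(sess.run(c))  # 12.0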

Installation and Documentation

https://www.tensorflow.org/

SLIDE 3

Introduction

Outline

MNIST dataset
https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html#the-mnist-data

MNIST for ML Beginners
https://www.tensorflow.org/versions/master/tutorials/mnist/beginners/index.html

Deep MNIST for Experts
https://www.tensorflow.org/versions/master/tutorials/mnist/pros/index.html

SLIDE 4

Introduction

On the lab computers

To use TensorFlow on the lab computers open the terminal in Menu → Others → TensorFlow.

On Ubuntu 12.04

Run the script run_me_on_ubuntu1204.sh

SLIDE 5

MNIST dataset

MNIST dataset

Dataset of handwritten digits

Each image: 28 × 28 = 784 pixels

Train: 60k, test: 10k

SLIDE 6

MNIST dataset

Importing MNIST

Use the given input_data.py script

Train: 55k, validation: 5k, test: 10k

mnist is a Python object (with attributes and methods)

mnist.train.images
mnist.train.labels
mnist.train.next_batch(n)
…
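A minimal loading sketch, assuming input_data.py sits next to your script as distributed with the lab material (the "MNIST_data/" directory name is my choice):

import input_data  # the helper script provided with the lab

# Downloads MNIST on first use; one_hot=True encodes labels as 10-dim vectors.
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

print(mnist.train.images.shape)  # (55000, 784)
print(mnist.train.labels.shape)  # (55000, 10)
batch_xs, batch_ys = mnist.train.next_batch(100)  # a mini-batch of 100 examples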

SLIDE 7

MNIST dataset

Data representation

The 28 × 28 = 784 pixels are represented as a vector

The 10 classes are represented with one-hot encoding
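For example, an image of the digit 3 is paired with the one-hot label below (index 3 set to 1):

# One-hot encoding of the class "3" among the 10 digit classes.
label = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]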

SLIDE 8

Softmax regressions

Softmax regressions

Look at an image and give probabilities for it being each digit.

Evidence that an image belongs to a particular class i:

    evidence_i = Σ_j W_ij x_j + b_i    ∀i

Softmax shapes the evidence into a probability distribution over the i cases:

    softmax_i = exp(evidence_i) / Σ_j exp(evidence_j)    ∀i

SLIDE 9

Softmax regressions

Softmax regressions

Schematic view; vectorized version. Compactly: ŷ = softmax(Wx + b)

SLIDE 10

Softmax regressions

Implementation: model

Import TensorFlow

Define the placeholders

Define the variables

Define the softmax layer
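A minimal sketch of these four steps, in the style of the MNIST-for-beginners tutorial (TF 1.x API; variable names are mine):

import tensorflow as tf

# Placeholders: inputs fed at run time.
x = tf.placeholder(tf.float32, [None, 784])   # flattened 28x28 images
y_ = tf.placeholder(tf.float32, [None, 10])   # one-hot target labels

# Variables: the trainable parameters.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Softmax layer: ŷ = softmax(Wx + b).
y = tf.nn.softmax(tf.matmul(x, W) + b)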

SLIDE 11

Softmax regressions

Implementation: optimization

Define the cost (cross-entropy): H_y(ŷ) = −Σ_i y_i log ŷ_i

Define the training algorithm

Start a new session (nothing has been computed so far)

Initialize the variables

Train the model
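A sketch of these steps, continuing the snippet above (the 0.5 learning rate and 1000 steps follow the beginners tutorial, but are assumptions):

# Cross-entropy cost, averaged over the batch.
cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Training algorithm: plain gradient descent.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Start a session and initialize the variables; only now is anything computed.
# (Older TF versions used tf.initialize_all_variables().)
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Train on mini-batches of 100 examples.
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})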

SLIDE 12

Softmax regressions

Implementation: evaluation

Evaluate the accuracy (it should be around 0.91)

Plot the model weights (plotter.py)
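A sketch of the accuracy computation (plotter.py is the lab's own script, not shown here):

# A prediction is correct when the argmax of ŷ matches the one-hot label.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Evaluate on the test set; expect roughly 0.91.
print(sess.run(accuracy,
               feed_dict={x: mnist.test.images, y_: mnist.test.labels}))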

SLIDE 13

Deep convolutional net

Deep architectures

0.91 accuracy on MNIST is NOT good! State-of-the-art performance is 0.9979. Let's refine our model:

- 2 convolutional layers
- alternated with 2 max-pool layers
- a ReLU layer (with dropout)
- softmax regression

Accuracy target: 0.99

SLIDE 14

Deep convolutional net

Convolutional layer

Broadly used in image classification

Local connectivity (width, height, depth)

Spatial arrangement (stride, padding)

Parameter sharing

[Network diagram: inputs x1 … x784 → convolutional → max pool → convolutional → max pool → ReLU (dropout) → softmax → outputs ŷ0 … ŷ9]

SLIDE 15

Deep convolutional net

Max pooling layer

Commonly used after convolutional layer(s)

Reduces spatial size

Helps avoid overfitting

2 × 2 max pooling is very common


SLIDE 16

Deep convolutional net

ReLU (and Dropout)

Fully connected layer

Rectified Linear Unit (ReLU) activation

Dropout randomly excludes neurons to avoid overfitting


SLIDE 17

Deep convolutional net

Softmax layer

Look at a feature configuration (coming from the layers below) and give probabilities for it being each digit.

Evidence that a feature configuration belongs to a particular class i:

    evidence_i = Σ_j W_ij x_j + b_i    ∀i

Softmax shapes the evidence into a probability distribution over the i cases:

    softmax_i = exp(evidence_i) / Σ_j exp(evidence_j)    ∀i


SLIDE 18

Deep convolutional net

Implementation: preparation

Imports

Load data

Placeholders and data reshaping

NOTE

Reshaping is needed for convolution and max pooling
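A sketch of the reshaping step, reusing the placeholder x from the earlier snippets: conv and pooling ops expect a 4-D tensor of shape [batch, height, width, channels].

# -1 lets TensorFlow infer the batch size; 1 channel for grayscale.
x_image = tf.reshape(x, [-1, 28, 28, 1])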

SLIDE 19

Deep convolutional net

Implementation: weights initialization

Define functions to initialize variables of the model
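A sketch of these helpers in the style of the Deep MNIST tutorial: small truncated-normal noise breaks symmetry, and a slightly positive bias suits ReLU units.

def weight_variable(shape):
    # Truncated normal initialization with small standard deviation.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    # Small positive bias to keep ReLU units active initially.
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)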

SLIDE 20

Deep convolutional net

Implementation: convolution and pooling

Define functions (to keep the code cleaner)
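A sketch of the two helpers (the stride and padding choices follow the Deep MNIST tutorial):

def conv2d(x, W):
    # Stride 1 and zero padding ('SAME') preserve the spatial size.
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # 2x2 max pooling halves width and height.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')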

SLIDE 21

Deep convolutional net

Implementation: model (convolution and max pooling)

1st layer: convolutional with max pooling

2nd layer: convolutional with max pooling

Shrinking

Applying 2 × 2 max pooling we are shrinking the image

After 2 layers we have moved from 28 × 28 to 7 × 7

For each point we have 64 features
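A sketch of the two layers using the helpers above (the 5×5 filters and 32/64 feature maps follow the Deep MNIST tutorial; treat them as assumptions):

# 1st layer: 32 5x5 filters over 1 input channel, then 2x2 pooling (28x28 -> 14x14).
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# 2nd layer: 64 5x5 filters over 32 channels, then 2x2 pooling (14x14 -> 7x7).
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)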

SLIDE 22

Deep convolutional net

Implementation: model (ReLU, dropout and softmax)

3rd layer: ReLU

Reshape

We are switching back to fully connected layers, so we reshape the input as a flat vector.

3rd layer: add dropout

4th layer: softmax (output)
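A sketch of these layers, continuing the snippet above (the 1024 hidden units follow the Deep MNIST tutorial):

# 3rd layer: flatten the 7x7x64 feature maps, then a fully connected ReLU layer.
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout: keep_prob is fed at run time (<1 during training, 1.0 at test time).
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# 4th layer: softmax output over the 10 digit classes.
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)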

SLIDE 23

Deep convolutional net

Implementation: optimization and evaluation

Define the cost (cross-entropy): H_y(ŷ) = −Σ_i y_i log ŷ_i

Define the training algorithm

Start a new session (nothing has been computed so far)

Initialize the variables

Define the accuracy before training (for monitoring)
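A sketch mirroring the softmax-regression setup (the Adam optimizer with learning rate 1e-4 follows the Deep MNIST tutorial):

cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Accuracy, defined before training so progress can be monitored.
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

sess = tf.Session()
sess.run(tf.global_variables_initializer())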

SLIDE 24

Deep convolutional net

Implementation: optimization and evaluation

Train the model (may take a while)

Evaluate the accuracy (it should be around 0.99)
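A sketch of the training loop (the 20000 steps and batch size 50 follow the Deep MNIST tutorial; adjust to your hardware):

for i in range(20000):
    batch_xs, batch_ys = mnist.train.next_batch(50)
    if i % 100 == 0:  # monitor training accuracy every 100 steps
        acc = sess.run(accuracy, feed_dict={
            x: batch_xs, y_: batch_ys, keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, acc))
    sess.run(train_step, feed_dict={
        x: batch_xs, y_: batch_ys, keep_prob: 0.5})

# Final test accuracy; expect roughly 0.99.
print(sess.run(accuracy, feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))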

SLIDE 25

Assignment

Assignment

The third ML assignment is to compare the performance of the deep convolutional network when removing layers. For this assignment you need to adapt the code of the complete deep architecture. By removing one layer at a time, and keeping all the others, you can evaluate the change in performance of the neural network in classifying the MNIST dataset.

NOTE

The only coding required is to modify the shape and/or size of the input vectors of the layers. The output of each layer has to remain the same. The report has to contain a short introduction on the methodologies used in the deep architecture shown during the lab (convolution, max pooling, ReLU, dropout, softmax).

SLIDE 26

Assignment

Assignment

Steps

1. Remove the 1st layer: convolutional and max pooling
2. Train and test the network
3. Remove the 2nd layer: convolutional and max pooling
4. Train and test the network
5. Remove the 3rd layer: ReLU with dropout
6. Train and test the network

Computation

Training a model on a quad-core CPU takes 30-45 mins. You may want to use the computers in the lab.

SLIDE 27

Assignment

Assignment

After completing the assignment, submit it via email:

Send an email to paolo.dragone@unitn.it (cc: passerini@disi.unitn.it)

Subject: tensorflowSubmit2016

Attachment: id_name_surname.zip containing:

- the Python code:
  - model_no1.py (model without the 1st layer)
  - model_no2.py (model without the 2nd layer)
  - model_no3.py (model without the 3rd layer)
- the report (PDF format)

NOTE

No group work.

This assignment is mandatory in order to enroll in the oral exam.

SLIDE 28

References

References

https://www.tensorflow.org/

http://cs231n.github.io/convolutional-networks/

https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf
