L2 Regularization Technique using Keras - PowerPoint PPT Presentation



SLIDE 1

L2 Regularization Technique using Keras

INTRODUCTION TO TENSORFLOW IN R

Colleen Bobbie

Instructor

SLIDE 2

INTRODUCTION TO TENSORFLOW IN R

The overfitting challenge

Training data: Small Variance (Good!) Testing Data: Large Variance (Bad!)

SLIDE 3

INTRODUCTION TO TENSORFLOW IN R

Overcoming overfitting

To tackle an overfit model:

  • 1. Decrease overfitting by increasing the training data
  • 2. Decrease overfitting by changing the complexity of the network

      change the network structure (number of weights)
      change the network parameters (values of weights)

These techniques are known as regularization.

SLIDE 4

INTRODUCTION TO TENSORFLOW IN R

L2 Regularization ("Ridge Regression")

L2 Regularization: aims to find a model that may not fit the training data as well, but has the flexibility to fit other datasets. Result: a small amount of bias = a significant drop in variance in testing data.

https://developers.google.com/machine-learning/crash-course/regularization-for-simplicity/l2-regularization

SLIDE 5

INTRODUCTION TO TENSORFLOW IN R

L2 Regularization in Keras

In Keras: added to the model when layers are declared

model <- keras_model_sequential()
model %>%
  layer_dense(units = 15, activation = 'relu', input_shape = 8,
              kernel_regularizer = regularizer_l2(l = 0.001))

Every weight coefficient adds 0.001 * (weight coefficient value)² to the total loss.
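Extending the snippet above into a compilable model can make the idea concrete. This is only a sketch: the output layer, optimizer, loss, and metrics are illustrative assumptions, not taken from the slides.

```r
library(keras)

# L2 penalty of 0.001 on the hidden layer's kernel weights
model <- keras_model_sequential()
model %>%
  layer_dense(units = 15, activation = 'relu', input_shape = 8,
              kernel_regularizer = regularizer_l2(l = 0.001)) %>%
  layer_dense(units = 1, activation = 'sigmoid')  # assumed binary output

# optimizer, loss, and metrics are assumptions for illustration
model %>% compile(
  optimizer = 'adam',
  loss = 'binary_crossentropy',
  metrics = 'accuracy'
)
```

The penalty term is added to the loss automatically during training; nothing else in the fitting code changes.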

SLIDE 6

Let's practice!

INTRODUCTION TO TENSORFLOW IN R

SLIDE 7

Dropout technique using TFEstimators

INTRODUCTION TO TENSORFLOW IN R

Colleen Bobbie

Instructor

SLIDE 8

INTRODUCTION TO TENSORFLOW IN R

Dropout

  • One of the most popular forms of regularization
  • modifies the neural network directly

A full network diagram may look like this:

SLIDE 9

INTRODUCTION TO TENSORFLOW IN R

Dropout

The first dropped diagram may look like this:

SLIDE 10

INTRODUCTION TO TENSORFLOW IN R

Dropout

Another diagram may look like this:

SLIDE 11

INTRODUCTION TO TENSORFLOW IN R

Dropout in R

Using the Estimators API with a dnn_classifier:

ourmodel <- dnn_classifier(
  hidden_units = 6,
  feature_columns = ftr_colns,
  dropout = 0.5)

dropout probability is 0.5, or 50%: the probability that any given hidden node will be dropped during training is 50%
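To actually fit the classifier above, tfestimators feeds data through an input function. The sketch below assumes a data frame `traindata` with feature columns `x1`, `x2` and a `label` column; all three names are hypothetical, introduced only for illustration.

```r
library(tfestimators)

# hypothetical data frame and column names for illustration
ins <- input_fn(traindata,
                features = c("x1", "x2"),
                response = "label")

# dropout is applied only while training; it is disabled
# automatically at evaluation and prediction time
train(ourmodel, input_fn = ins)
```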

SLIDE 12

Let's practice!

INTRODUCTION TO TENSORFLOW IN R

SLIDE 13

Hyperparameter tuning with tfruns

INTRODUCTION TO TENSORFLOW IN R

Colleen Bobbie

Instructor

SLIDE 14

INTRODUCTION TO TENSORFLOW IN R

Hyperparameters for neural networks

number of layers
layer activations
batch sizes
and more!

SLIDE 15

INTRODUCTION TO TENSORFLOW IN R

Introduction to tfruns

Which dropout is best?

# Create your dnn_classifier model
mymodel <- dnn_classifier(feature_columns = featcols,
                          hidden_units = c(40, 60, 10),
                          n_classes = 2,
                          label_vocabulary = c("N", "Y"),
                          dropout = 0.5)

0.5? 0.2? 0.3? 0.4?

SLIDE 16

INTRODUCTION TO TENSORFLOW IN R

Tuning a run

best practice: define flags for key parameters outside of the source code.

  • 1. Create a training script

      an .R script file that contains all R code for the model
      helpful to save this in the working directory

  • 2. Identify the flags

      flags define which values you would like to test for each parameter
      for example: dropout = c(0.2, 0.3, 0.4)
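Inside the training script itself, tfruns reads these flags with flags(). A minimal sketch; the default value of 0.4 and the commented model line are illustrative assumptions:

```r
library(tfruns)

# declare the tunable parameter; 0.4 is just an assumed default
# used when no flag value is passed in
FLAGS <- flags(
  flag_numeric("dropout", 0.4)
)

# the model code then references FLAGS$dropout, e.g.
# dnn_classifier(..., dropout = FLAGS$dropout)
```

Because the script reads FLAGS rather than a hard-coded value, tuning_run() can rerun the same script once per candidate value.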

SLIDE 17

INTRODUCTION TO TENSORFLOW IN R

Tuning a run

Dropout:

runs <- tuning_run("modelsourcecode.R",
                   flags = list(dropout = c(0.2, 0.3, 0.4, 0.5)))

Dropout and Activation:

runs <- tuning_run("modelsourcecode.R",
                   flags = list(dropout = c(0.2, 0.3, 0.4, 0.5),
                                activation = c("relu", "softmax")))

SLIDE 18

INTRODUCTION TO TENSORFLOW IN R

Evaluating the run(s)

If running in interactive mode, TensorBoard will show up. Otherwise:

runs[order(runs$eval_accuracy, decreasing = TRUE), ]

Data frame: 4 x 24
                    run_dir eval_accuracy eval_accuracy_baseline eval_auc eval_auc_precision_recall
3 runs/2019-09-29T21-02-35Z        0.9927                 0.5418   0.9988                    0.9986
2 runs/2019-09-29T21-03-29Z        0.9891                 0.5855   0.9998                    0.9998
1 runs/2019-09-29T21-04-12Z        0.9564                 0.5127   0.9888                    0.9835
4 runs/2019-09-29T21-01-42Z        0.9491                 0.5673   0.9917                    0.9881
# ... with 20 more columns: steps_completed, metrics, script, start, end,
#   completed, output, source_code, context, type

SLIDE 19

INTRODUCTION TO TENSORFLOW IN R

Evaluating the run(s)

dropout = c(0.2, 0.3, 0.4, 0.5)

Data frame: 4 x 24
                    run_dir eval_accuracy eval_accuracy_baseline eval_auc eval_auc_precision_recall
3 runs/2019-09-29T21-02-35Z        0.9927                 0.5418   0.9988                    0.9986
2 runs/2019-09-29T21-03-29Z        0.9891                 0.5855   0.9998                    0.9998
1 runs/2019-09-29T21-04-12Z        0.9564                 0.5127   0.9888                    0.9835
4 runs/2019-09-29T21-01-42Z        0.9491                 0.5673   0.9917                    0.9881
# ... with 20 more columns: steps_completed, metrics, script, start, end,
#   completed, output, source_code, context, type

Dropout = 0.4!
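Rather than matching run order to flag values by eye, the flag each run used can be read from the results data frame. This sketch assumes tfruns stores each flag in a column named flag_<name> (so flag_dropout here):

```r
# sort best run first; flag_dropout holds the dropout value each run used
best <- runs[order(runs$eval_accuracy, decreasing = TRUE), ]
best[1, c("flag_dropout", "eval_accuracy")]
```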

SLIDE 20

Let's practice!

INTRODUCTION TO TENSORFLOW IN R

SLIDE 21

So long and thanks for all the fish

INTRODUCTION TO TENSORFLOW IN R

Colleen Bobbie

Instructor, signing off

SLIDE 22

INTRODUCTION TO TENSORFLOW IN R

What you've learned

Chapter 1: Introduction to TensorFlow
    core concepts
    TensorFlow syntax
    TensorBoard
Chapter 2: Learning the basics
    linear regression model using the Core API
    linear regression model using Estimators
Chapter 3: Deep learning in TensorFlow
    end-to-end workflow of a Keras DNN
    canned DNN using Estimators
Chapter 4: Model regularization
    L2 regularization ("Ridge Regression")
    Dropout
    Hyperparameter tuning

SLIDE 23

INTRODUCTION TO TENSORFLOW IN R

Learning on your own

Some ideas:

  • 1. Other canned TFEstimator models, such as dnn_linear_combined_regressor
  • 2. A text classification model using Keras, incorporating a regularization technique
  • 3. What question matters to you?

Some resources: the full RStudio TensorFlow documentation; Kaggle; FiveThirtyEight

SLIDE 24

Congratulations!

INTRODUCTION TO TENSORFLOW IN R