Let the AI do the Talk: Adventures with Natural Language Generation


SLIDE 1

Let the AI do the Talk

Adventures with Natural Language Generation

@MarcoBonzanini

PyParis 2018

SLIDE 2

SLIDE 3

PyData London Conference 12-14 July 2019 @PyDataLondon

SLIDE 4

NATURAL LANGUAGE GENERATION

SLIDE 5

Natural Language Processing

SLIDE 6

Natural Language Understanding
Natural Language Generation

Natural Language Processing

SLIDE 7

Natural Language Generation

SLIDE 8

The task of generating natural language from a machine representation

Natural Language Generation

SLIDE 9

Applications of NLG

SLIDE 10

Applications of NLG

Summary Generation

SLIDE 11

Applications of NLG

Weather Report Generation

SLIDE 12

Applications of NLG

Automatic Journalism

SLIDE 13

Applications of NLG

Virtual Assistants / Chatbots

SLIDE 14

LANGUAGE MODELLING

SLIDE 15

Language Model

SLIDE 16

Language Model

A model that gives you the probability of a sequence of words

SLIDE 17

Language Model

P(I’m going home) > P(Home I’m going)

SLIDE 18

Language Model

P(I’m going home) > P(I’m going house)

SLIDE 19

Infinite Monkey Theorem

https://en.wikipedia.org/wiki/Infinite_monkey_theorem

SLIDE 20

Infinite Monkey Theorem

from random import choice
from string import printable

def monkey_hits_keyboard(n):
    output = [choice(printable) for _ in range(n)]
    print("The monkey typed:")
    print(''.join(output))

SLIDE 21

Infinite Monkey Theorem

>>> monkey_hits_keyboard(30)
The monkey typed:
% a9AK^YKx OkVG)u3.cQ,31("!ac%
>>> monkey_hits_keyboard(30)
The monkey typed:
fWE,ou)cxmV2IZ l}jSV'XxQ**9'|

SLIDE 22

n-grams

SLIDE 23

n-grams

A sequence of N items from a given sample of text

SLIDE 24

n-grams

>>> from nltk import ngrams
>>> list(ngrams("pizza", 3))

SLIDE 25

n-grams

>>> from nltk import ngrams
>>> list(ngrams("pizza", 3))
[('p', 'i', 'z'), ('i', 'z', 'z'), ('z', 'z', 'a')]

SLIDE 26

n-grams

>>> from nltk import ngrams
>>> list(ngrams("pizza", 3))
[('p', 'i', 'z'), ('i', 'z', 'z'), ('z', 'z', 'a')]

character-based trigrams

SLIDE 27

n-grams

>>> s = "The quick brown fox".split()
>>> list(ngrams(s, 2))

SLIDE 28

n-grams

>>> s = "The quick brown fox".split()
>>> list(ngrams(s, 2))
[('The', 'quick'), ('quick', 'brown'), ('brown', 'fox')]

SLIDE 29

n-grams

>>> s = "The quick brown fox".split()
>>> list(ngrams(s, 2))
[('The', 'quick'), ('quick', 'brown'), ('brown', 'fox')]

word-based bigrams

SLIDE 30

From n-grams to Language Model

SLIDE 31

From n-grams to Language Model

  • Given a large dataset of text
  • Find all the n-grams
  • Compute probabilities, e.g. count bigrams: P(w2 | w1) = count(w1 w2) / count(w1) (a code sketch follows)
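A minimal count-based sketch of that recipe, using NLTK's ngrams (the toy corpus, function and variable names are illustrative, not from the talk):

from collections import Counter, defaultdict
from nltk import ngrams

def train_bigram_lm(words):
    # Count every bigram and every unigram in the corpus
    bigram_counts = Counter(ngrams(words, 2))
    unigram_counts = Counter(words)
    # Maximum-likelihood estimate: P(w2 | w1) = count(w1 w2) / count(w1)
    model = defaultdict(dict)
    for (w1, w2), count in bigram_counts.items():
        model[w1][w2] = count / unigram_counts[w1]
    return model

words = "the quick brown fox jumps over the lazy dog".split()
lm = train_bigram_lm(words)
print(lm['the'])  # {'quick': 0.5, 'lazy': 0.5}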

SLIDE 32

Example: Predictive Text in Mobile

SLIDE 33

Example: Predictive Text in Mobile

SLIDE 34


most likely next word

Example: Predictive Text in Mobile

SLIDE 35

Marco is …

Example: Predictive Text in Mobile
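A toy version of what the keyboard does, reusing the hypothetical train_bigram_lm / lm from the sketch above: repeatedly pick the most likely next word. As the next slide shows, greedy next-word prediction soon drifts into run-on nonsense.

def most_likely_next(model, word):
    # Highest-probability continuation of `word`, or None at a dead end
    candidates = model.get(word, {})
    return max(candidates, key=candidates.get) if candidates else None

def greedy_generate(model, seed, length=10):
    out = [seed]
    for _ in range(length):
        nxt = most_likely_next(model, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return ' '.join(out)

print(greedy_generate(lm, 'the'))
# the quick brown fox jumps over the quick brown fox jumps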

SLIDE 36

Marco is a good time to get the latest flash player is required for video playback is unavailable right now because this video is not sure if you have a great day.

Example: Predictive Text in Mobile

SLIDE 37

Limitations of LM so far

SLIDE 38

Limitations of LM so far

  • P(word | full history) is too expensive
  • P(word | previous few words) is feasible
  • … Local context only! Lack of global context

SLIDE 39

QUICK INTRO TO NEURAL NETWORKS

SLIDE 40

Neural Networks

SLIDE 41

Neural Networks

[Diagram: input layer (x1, x2), hidden layer (h1, h2, h3), output layer (y1)]

SLIDE 42

Neurone Example

SLIDE 43

Neurone Example

[Diagram: inputs x1 and x2 with weights w1 and w2 feeding a neurone; output marked ?]

SLIDE 44

Neurone Example

[Diagram: the same neurone; the output is F(w1x1 + w2x2)]
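In code, the neurone on this slide is just a weighted sum passed through an activation function F; a minimal sketch (the choice of sigmoid for F is illustrative):

import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def neurone(x1, x2, w1, w2, F=sigmoid):
    # F(w1*x1 + w2*x2), exactly as on the slide
    return F(w1 * x1 + w2 * x2)

print(neurone(0.5, -1.0, w1=0.8, w2=0.3))  # sigmoid(0.4 - 0.3) ≈ 0.525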

SLIDE 45

Training the Network

SLIDE 46

Training the Network

  • Random weight init
  • Run input through the network
  • Compute error (loss function)
  • Use error to adjust weights (gradient descent + back-propagation); a toy numeric sketch follows
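A toy numeric version of that loop for the single neurone above, using an identity activation and squared-error loss (the data point, weight init and learning rate are made up for illustration):

x1, x2, y_true = 0.5, -1.0, 1.0     # one training example
w1, w2, lr = 0.1, 0.1, 0.5          # random-ish weight init, learning rate

y_pred = w1 * x1 + w2 * x2          # run input through the network
error = y_pred - y_true             # compute error (loss = error**2)
w1 -= lr * 2 * error * x1           # adjust weights along the gradient:
w2 -= lr * 2 * error * x2           # d(loss)/dw = 2 * error * x
print(w1, w2)                       # loss drops from ~1.10 to ~0.07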

SLIDE 47

More on Training

SLIDE 48

More on Training

  • Batch size
  • Iterations and Epochs
  • e.g. with 1,000 data points and batch size = 100, we need 10 iterations to complete 1 epoch (computed below)
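The same arithmetic in one line (math.ceil covers datasets that don't divide evenly into batches):

import math

n_points, batch_size = 1000, 100
iterations_per_epoch = math.ceil(n_points / batch_size)
print(iterations_per_epoch)  # 10 iterations = 1 epoch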

SLIDE 49

RECURRENT NEURAL NETWORKS

SLIDE 50

Limitation of FFNN

SLIDE 51

Limitation of FFNN

Input and output of fixed size

SLIDE 52

Recurrent Neural Networks

SLIDE 53

Recurrent Neural Networks

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

SLIDE 54

Recurrent Neural Networks

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

SLIDE 55

Limitation of RNN

SLIDE 56

Limitation of RNN

The “vanishing gradient”: the network cannot “remember” what happened long ago

SLIDE 57

Long Short Term Memory

SLIDE 58

Long Short Term Memory

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

SLIDE 59


https://en.wikipedia.org/wiki/Long_short-term_memory

SLIDE 60

A BIT OF PRACTICE

SLIDE 61

Deep Learning in Python

SLIDE 62

Deep Learning in Python

  • Some NN support in scikit-learn
  • Many low-level frameworks: Theano, PyTorch, TensorFlow
  • … Keras!
  • Probably more

SLIDE 63

Keras

SLIDE 64

Keras

  • Simple, high-level API
  • Uses TensorFlow, Theano or CNTK as backend
  • Runs seamlessly on GPU
  • Easier to start with

SLIDE 65

LSTM Example

SLIDE 66

LSTM Example

model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars), activation='softmax'))

Define the network
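The slides assume x, y, maxlen and chars already exist. In the standard Keras character-level setup they come from one-hot encoding overlapping windows of the training text; a sketch along those lines (the corpus path, window length and step size are typical placeholders, not necessarily the talk's):

import numpy as np

text = open('corpus.txt').read()    # placeholder path to your training corpus
maxlen, step = 40, 3
chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}
indices_char = {i: c for i, c in enumerate(chars)}

# Slice the corpus into overlapping windows, each paired with the character that follows it
sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])

# One-hot encode: x has shape (samples, maxlen, vocab), y has shape (samples, vocab)
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1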

SLIDE 67

LSTM Example

optimizer = RMSprop(lr=0.01)
model.compile(
    loss='categorical_crossentropy',
    optimizer=optimizer
)

Configure the network

SLIDE 68

LSTM Example

model.fit(x, y, batch_size=128, epochs=60, callbacks=[print_callback])
model.save('char_model.h5')

Train the network

SLIDE 69

LSTM Example

for i in range(output_size):
    ...
    preds = model.predict(x_pred, verbose=0)[0]
    next_index = sample(preds, diversity)
    next_char = indices_char[next_index]
    generated += next_char

Generate text

SLIDE 70

LSTM Example

for i in range(output_size):
    ...
    preds = model.predict(x_pred, verbose=0)[0]
    next_index = sample(preds, diversity)
    next_char = indices_char[next_index]
    generated += next_char

Seed text
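sample() is never defined on the slides; in the standard Keras char-RNN example it applies temperature ('diversity') sampling to the softmax output, roughly like this:

import numpy as np

def sample(preds, temperature=1.0):
    # Re-weight the predicted distribution: low temperature -> safe, repetitive
    # choices; high temperature -> more surprising characters
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)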

SLIDE 71

Sample Output

SLIDE 72

Sample Output

are the glories it included. Now am I lrA to r ,d?ot praki ynhh kpHu ndst -h ahh umk,hrfheleuloluprffuamdaedospe aeooasak sh frxpaphrNumlpAryoaho (…)

Seed text, then output after 1 epoch

SLIDE 73

Sample Output

I go from thee: Bear me forthwitht wh, t che f uf ld,hhorfAs c c ff.h scfylhle, rigrya p s lee rmoy, tofhryg dd?ofr hl t y ftrhoodfe- r Py (…)

After ~5 epochs

SLIDE 74

Sample Output

a wild-goose flies, Unclaim'd of any manwecddeelc uavekeMw gh whacelcwiiaeh xcacwiDac w fioarw ewoc h feicucra h,h, :ewh utiqitilweWy ha.h pc'hr, lagfh eIwislw ofiridete w laecheefb .ics,aicpaweteh fiw?egp t? (…)

After 20+ epochs

SLIDE 75

Tuning

SLIDE 76

Tuning

  • More layers?
  • More hidden nodes, or fewer?
  • More data?
  • A combination?
SLIDE 77

Wyr feirm hat. meancucd kreukk? , foremee shiciarplle. My, Bnyivlaunef sough bus: Wad vomietlhas nteos thun. lore rain, Ty thee I Boe, I rue. niat


Tuning

After 1 epoch

SLIDE 78

to Dover, where inshipp'd Commit them to plean me than stand and the woul came the wife marn to the groat pery me Which that the senvose in the sen in the poor The death is and the calperits the should

Tuning

Much later

SLIDE 79

FINAL REMARKS

SLIDE 80

A Couple of Tips

SLIDE 81

A Couple of Tips

  • You’ll need a GPU
  • Develop locally on a very small dataset, then run on cloud on real data
  • At least 1M characters in input, at least 20 epochs for training
  • model.save() !!!

SLIDE 82

Summary

  • Natural Language Generation is fun
  • Simple models vs. Neural Networks
  • Keras makes your life easier
  • A lot of trial-and-error!
SLIDE 83

THANK YOU

@MarcoBonzanini speakerdeck.com/marcobonzanini

SLIDE 84

Readings & Credits
  • Brandon Rohrer on "Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM)": https://www.youtube.com/watch?v=WCUNPb-5EYI
  • Chris Olah on "Understanding LSTM Networks": http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  • Andrej Karpathy on "The Unreasonable Effectiveness of Recurrent Neural Networks": http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Pics:

  • Weather forecast icon: https://commons.wikimedia.org/wiki/File:Newspaper_weather_forecast_-_today_and_tomorrow.svg
  • Stack of papers icon: https://commons.wikimedia.org/wiki/File:Stack_of_papers_tied.svg
  • Document icon: https://commons.wikimedia.org/wiki/File:Document_icon_(the_Noun_Project_27904).svg
  • News icon: https://commons.wikimedia.org/wiki/File:PICOL_icon_News.svg
  • Cortana icon: https://upload.wikimedia.org/wikipedia/commons/thumb/8/89/Microsoft_Cortana_light.svg/1024px-Microsoft_Cortana_light.svg.png

  • Siri icon: https://commons.wikimedia.org/wiki/File:Siri_icon.svg
  • Google assistant icon: https://commons.wikimedia.org/wiki/File:Google_mic.svg
