Neural Classification of Linguistic Coherence using Long Short-Term Memories (PowerPoint Presentation)


SLIDE 1

Neural Classification of Linguistic Coherence using Long Short-Term Memories

Pashutan Modaresi, Matthias Liebeck and Stefan Conrad

SLIDE 2

Order of Sentences

It is what makes a text semantically meaningful.

➔ Hi! ➔ My name is Alan. ➔ I am a computer scientist!

  • Hi! My name is Alan. I am a computer scientist!
  • Hi! I am a computer scientist! My name is Alan.
  • I am a computer scientist! Hi! My name is Alan.
  • I am a computer scientist! My name is Alan. Hi!
  • My name is Alan. Hi! I am a computer scientist!
  • My name is Alan. I am a computer scientist! Hi!
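The six variants above are simply the 3! permutations of the three sentences, and only the first reads coherently. A quick way to enumerate them:

```python
from itertools import permutations

sentences = ["Hi!", "My name is Alan.", "I am a computer scientist!"]

# All 3! = 6 possible orderings; only one is a coherent text.
orderings = [" ".join(p) for p in permutations(sentences)]
for o in orderings:
    print(o)
```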

SLIDE 3

Humans vs. Machines

Question

Is there a need to teach all these abilities to a machine?

  • Discourse Coherence
  • Linguistic Redundancy
  • Linguistic Contradiction
  • Pragmatics

SLIDE 4

Sentence Ordering

Question

What about the sizes of m and m'? Should they be equal?

[Diagram: the sentences "Hi!" and "My name is Alan." shown as a labelled pair, with label sets {0, 1} and {-1, 0, 1}]
SLIDE 5

Many Applications!

Our focus was TEXT SUMMARIZATION in the news domain.

Question

What are the other applications of sentence ordering?
SLIDE 6

Treat the problem as a classification task

Question

Why do we use the negative log-likelihood and not the log-likelihood?

L = -(1/N) · Σ_{n=1}^{N} log p_n

where N is the number of instances and p_n is the class probability of the n-th pair.
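As to the question above: optimizers conventionally minimize, so the log-likelihood is negated to turn it into a loss. A minimal sketch of the averaged negative log-likelihood (the function name and inputs are mine, not from the slides):

```python
import math

def negative_log_likelihood(probs):
    """Mean negative log-likelihood over N instances.

    probs: the probability the model assigned to the TRUE class of
    each pair. -log(p) is 0 when p = 1 and grows without bound as
    p -> 0, so confident wrong predictions are penalized heavily.
    """
    n = len(probs)
    return -sum(math.log(p) for p in probs) / n

print(negative_log_likelihood([0.9, 0.8, 0.95]))  # confident model: small loss
print(negative_log_likelihood([0.5, 0.5, 0.5]))   # uncertain model: larger loss
```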

SLIDE 7

Deep Neural Architecture

[Architecture diagram: S1 and S2 feed into LSTM → Dropout → LSTM → Dropout → output classes +1 / -1]
SLIDE 8

Deep Neural Architecture


One-Hot Encoding
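One-hot encoding can be sketched in a few lines (the vocabulary and indices here are illustrative, not from the paper):

```python
def one_hot(index, size):
    """Return a vector of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

# A toy vocabulary mapping each word to an index.
vocab = {"hi": 0, "my": 1, "name": 2, "is": 3, "alan": 4}
print(one_hot(vocab["alan"], len(vocab)))  # [0, 0, 0, 0, 1]
```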

SLIDE 9

Deep Neural Architecture


Embedding Tip

Embedding: a simple matrix multiplication with the input vector. Initialize the matrix E.
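The tip above can be checked directly: multiplying a one-hot vector by E selects a single row of E, so an embedding lookup is just a matrix multiplication. A small numpy sketch (the dimensions and initialization are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim = 5, 3

# Initialize the embedding matrix E (here: small random values).
E = rng.normal(scale=0.1, size=(vocab_size, embed_dim))

# One-hot vector for the word with index 2.
x = np.zeros(vocab_size)
x[2] = 1.0

# x @ E picks out exactly row 2 of E: the word's embedding.
assert np.allclose(x @ E, E[2])
```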

SLIDE 10

Deep Neural Architecture


Merge Tip

Concatenate the embeddings

SLIDE 11

Deep Neural Architecture


Long Short-Term Memory Tip

LSTM: just a special kind of RNN, designed to address the difficulties of plain RNNs (such as vanishing gradients).
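For intuition, here is one LSTM time step in numpy, using the standard gate formulation (not the authors' implementation; weight shapes are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b, hidden):
    """One LSTM time step: gates control what the cell state forgets,
    stores, and exposes, which is what mitigates vanishing gradients."""
    z = W @ x + U @ h + b                  # all four gate pre-activations
    i = sigmoid(z[0 * hidden:1 * hidden])  # input gate
    f = sigmoid(z[1 * hidden:2 * hidden])  # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate values
    c_new = f * c + i * g                  # gated cell-state update
    h_new = o * np.tanh(c_new)             # hidden state passed onward
    return h_new, c_new

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
W = rng.normal(size=(4 * hidden_dim, input_dim))
U = rng.normal(size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
h, c = lstm_step(rng.normal(size=input_dim), h, c, W, U, b, hidden_dim)
print(h.shape, c.shape)  # (3,) (3,)
```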

SLIDE 12

Deep Neural Architecture


Regularization Tip

Dropout: randomly sets a subset of its inputs to zero (during training).
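A minimal sketch of inverted dropout, assuming the usual train-time rescaling so that inference needs no correction (rate and shapes are illustrative):

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of the inputs and
    rescale the survivors so the expected activation stays the same.
    At inference time the input passes through unchanged."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones(10)
y = dropout(x, 0.5, rng)
print(y)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```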

SLIDE 13

Deep Neural Architecture


Softmax Tip

Softmax: normalizes the output scores into a probability distribution over the classes.
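For reference, a numerically stable softmax sketch (the input scores are illustrative):

```python
import numpy as np

def softmax(z):
    """Normalize raw scores into a probability distribution.
    Subtracting the max before exponentiating is a standard trick to
    avoid overflow; it does not change the result."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())  # three probabilities summing to 1
```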

SLIDE 14

Data

How to collect the required data to train the network?

➔ Binary: +1 (correct order), -1 (wrong order)
➔ Ternary: +1 (correct order), -1 (wrong order), 0 (missing context)
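One plausible way to generate such labels automatically from ordered text (a sketch of the idea; the authors' exact construction may differ):

```python
import random

def make_pairs(doc_sentences, other_sentences, rng, ternary=False):
    """Build labelled sentence pairs from an ordered document.

    +1: two consecutive sentences in their original order
    -1: the same pair, swapped
     0: (ternary only) a pair drawn from unrelated contexts
    """
    pairs = []
    for a, b in zip(doc_sentences, doc_sentences[1:]):
        pairs.append((a, b, +1))          # correct order
        pairs.append((b, a, -1))          # wrong order
        if ternary:
            c = rng.choice(other_sentences)
            pairs.append((a, c, 0))       # missing context
    return pairs

doc = ["Hi!", "My name is Alan.", "I am a computer scientist!"]
other = ["The weather is nice.", "Stocks fell sharply."]
rng = random.Random(0)
pairs = make_pairs(doc, other, rng, ternary=True)
print(pairs)
```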

SLIDE 15

Baseline - SVM

          English             German
          Binary   Ternary    Binary   Ternary
SVM       0.24     0.16       0.25     0.16

SVMs: Not really appropriate for sequential modelling

SLIDE 16

Macro-Averaged F1
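Macro-averaged F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of how many instances it has. A small sketch (the helper name and example labels are mine):

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1: precision and recall are
    computed separately for each class, then the F1 scores are averaged."""
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy example with the binary labels +1 / -1.
y_true = [1, 1, -1, -1]
y_pred = [1, -1, -1, -1]
print(macro_f1(y_true, y_pred, classes=[1, -1]))
```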

SLIDE 17

Lessons Learned

  • Use appropriate tools for sequence modeling
  • RNNs are slow. First train on a subset of data
  • Train deep models with lots of data points
  • Find a way to automatically annotate data
  • Use regularization (be generous)
SLIDE 18

Thank You For Your Attention