Lecture 22: Representation Learning, Kai-Wei Chang, CS @ University of Virginia (PowerPoint presentation transcript)



Slide 1

Lecture 22: Representation Learning

Kai-Wei Chang, CS @ University of Virginia, kw@kwchang.net. Course webpage: http://kwchang.net/teaching/NLP16

CS6501-NLP 1

Slide 2

Feature Representations


[Figure: raw input → feature representation (Color_red, Shape_round, Has_leaf, …) → learning algorithm]

Slide 3

Feature Representation

- E.g., a Conditional Random Field:

  P(y | x) ∝ ∏_i exp( ∑_k μ_k g_k(y_i, x) + ∑_l θ_l h_l(y_i, y_{i+1}, x) )

  Node features g(y_i, x); edge features h(y_i, y_{i+1}, x)

[Figure: linear-chain CRF with label nodes y_1 … y_4 over observations x_1 … x_4]
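The unnormalized score inside the product above is just a weighted sum of node and edge features. A minimal numpy sketch with toy hand-set features and weights (not the lecture's model):

```python
import numpy as np

def crf_unnormalized_score(y, node_feats, edge_feats, mu, theta):
    """Unnormalized log-score of a label sequence y under a chain CRF.

    node_feats[i]: feature vector g(y_i, x) for position i
    edge_feats[i]: feature vector h(y_i, y_{i+1}, x) for transition i -> i+1
    mu, theta:     node and edge weight vectors
    (Illustrative: features are assumed precomputed for the given y.)
    """
    score = sum(mu @ node_feats[i] for i in range(len(y)))
    score += sum(theta @ edge_feats[i] for i in range(len(y) - 1))
    return score  # P(y | x) is proportional to exp(score)

# Toy example: 3 positions, 2-dim node and edge features
y = [0, 1, 1]
node_feats = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 1.0])]
edge_feats = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
mu = np.array([0.5, 1.0])
theta = np.array([0.2, 0.3])
print(crf_unnormalized_score(y, node_feats, edge_feats, mu, theta))  # 3.0
```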

Slide 4

Feature Representation

- High-order combinations: the kernel trick

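The kernel trick scores high-order feature combinations without materializing them: a degree-2 polynomial kernel (x·z + 1)² equals an inner product in the space of all degree-≤2 feature products. A small sketch verifying that equivalence on invented 2-dimensional inputs:

```python
import numpy as np

def poly2_kernel(x, z):
    """(x.z + 1)^2 implicitly scores all degree-<=2 feature combinations."""
    return (np.dot(x, z) + 1.0) ** 2

def phi(x):
    """Explicit degree-2 feature map for a 2-d input (for comparison only)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 * x1, x2 * x2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
print(poly2_kernel(x, z), np.dot(phi(x), phi(z)))  # both 25.0
```

The kernel evaluates one dot product in the original 2-d space; the explicit map needs 6 dimensions, and the gap grows combinatorially with degree.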

Slide 5

Tree Kernel

- How to measure the similarity between two parse trees?

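One classic answer (in the style of the Collins–Duffy tree kernel) counts the subtree fragments two parses share. A simplified sketch, with trees encoded as nested (label, children…) tuples of my own choosing and words as childless one-element tuples:

```python
def production(node):
    """A node's production rule: (label, tuple of child labels)."""
    return (node[0], tuple(c[0] for c in node[1:]))

def common(n1, n2):
    """Number of common subtree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0
    count = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        if len(c1) > 1:          # non-terminal child: recurse
            count *= 1 + common(c1, c2)
    return count

def nodes(t):
    """All non-terminal nodes of a tree."""
    yield t
    for c in t[1:]:
        if len(c) > 1:
            yield from nodes(c)

def tree_kernel(t1, t2):
    """K(T1, T2) = sum over node pairs of common fragment counts."""
    return sum(common(a, b) for a in nodes(t1) for b in nodes(t2))

t1 = ('S', ('NP', ('D', ('the',)), ('N', ('dog',))), ('VP', ('V', ('ran',))))
t2 = ('S', ('NP', ('D', ('the',)), ('N', ('cat',))), ('VP', ('V', ('ran',))))
print(tree_kernel(t1, t2))
```

Trees sharing more structure score higher; the count is computed in polynomial time even though the fragment space is exponential.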

Slide 6

Learning representations via NN

- Identify high-order combinations
- NN architectures for encoding language structures
- Learn hierarchical representations
  - Representations for tokens/phrases/sentences

Slide 7

How to represent words?

- Tokens, bi-grams, n-grams (one-hot features)
- Word embeddings
- Task-specific word embeddings
  - E.g., for sentiment analysis
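The contrast between one-hot and embedding representations can be made concrete: an embedding lookup is exactly a one-hot vector multiplied into a |V| × d matrix, done by indexing. A sketch with an invented toy vocabulary and random, untrained embeddings:

```python
import numpy as np

vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}

def one_hot(word):
    """Sparse symbolic representation: one dimension per vocabulary word."""
    v = np.zeros(len(vocab))
    v[vocab[word]] = 1.0
    return v

# Dense embeddings: each row of E is a (here random, untrained) d-dim vector
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 5))  # |V| x d embedding matrix

def embed(word):
    """Embedding lookup = one_hot(word) @ E, computed by row indexing."""
    return E[vocab[word]]

assert np.allclose(one_hot("great") @ E, embed("great"))
```

One-hot vectors treat every pair of words as equally dissimilar; learned embeddings place related words near each other in the d-dimensional space.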

Slide 8

How to represent phrases/sentences?

- Recursive NN [Socher, Manning, Ng 11]
- Many follow-up approaches
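A recursive NN builds a phrase vector bottom-up along the parse, composing each pair of children as p = tanh(W[c1; c2] + b). A minimal sketch with random, untrained parameters and an assumed binary parse (vocabulary and tree are invented):

```python
import numpy as np

d = 4
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(d, 2 * d))  # shared composition matrix
b = np.zeros(d)
E = {w: rng.normal(scale=0.1, size=d)       # random word vectors
     for w in ["the", "movie", "was", "great"]}

def compose(node):
    """Bottom-up recursive composition: p = tanh(W [c1; c2] + b).

    A node is either a word (str) or a pair (left, right)."""
    if isinstance(node, str):
        return E[node]
    left, right = node
    c = np.concatenate([compose(left), compose(right)])
    return np.tanh(W @ c + b)

tree = (("the", "movie"), ("was", "great"))  # assumed binary parse
vec = compose(tree)
print(vec.shape)  # (4,) -- phrase and sentence vectors share the word dimension
```

Because every node's vector lives in the same d-dimensional space, the same classifier can be applied at any level of the tree.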

Slide 9

[Figure credit: "Unsupervised Feature Learning and Deep Learning", Andrew Ng]

Slide 10

Auto-encoder and auto-decoder

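An auto-encoder learns a representation by compressing the input through a bottleneck (the encoder) and reconstructing it (the decoder), trained to minimize reconstruction error. A minimal sketch with a tanh encoder, linear decoder, and plain gradient descent on squared error; all sizes and data are toy choices:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 8))                # toy data: 20 samples, 8 dims
W_enc = rng.normal(scale=0.1, size=(8, 3))  # encoder: 8 -> 3 bottleneck
W_dec = rng.normal(scale=0.1, size=(3, 8))  # decoder: 3 -> 8

def forward(X):
    H = np.tanh(X @ W_enc)     # code: the learned compressed representation
    return H, H @ W_dec        # reconstruction of the input

def loss(X):
    return ((forward(X)[1] - X) ** 2).mean()

loss_before = loss(X)
lr = 0.5
for _ in range(300):           # full-batch gradient descent
    H, X_hat = forward(X)
    G = 2 * (X_hat - X) / X.size          # d(loss)/d(X_hat)
    G_h = (G @ W_dec.T) * (1 - H ** 2)    # backprop through tanh
    W_dec -= lr * H.T @ G
    W_enc -= lr * X.T @ G_h
print(loss_before, loss(X))    # reconstruction error drops during training
```

The 3-dimensional code H is the learned feature representation; no labels were used, which is the "unsupervised feature learning" point of the previous slide.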

Slide 11

Sequence-to-sequence models [Sutskever, Vinyals & Le 14]

- Have been shown effective in machine translation, image captioning, and many structured tasks
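A sequence-to-sequence model encodes the source into a single vector, then decodes the target one token at a time, feeding each prediction back in. A skeletal sketch with plain RNN cells and random, untrained weights (token ids and sizes are invented), showing only the architecture rather than useful output:

```python
import numpy as np

d, V = 8, 6                     # hidden size, vocab size (toy choices)
rng = np.random.default_rng(3)
p = lambda *s: rng.normal(scale=0.1, size=s)
E, W_e, U_e = p(V, d), p(d, d), p(d, d)      # embeddings + encoder RNN
W_d, U_d, W_out = p(d, d), p(d, d), p(d, V)  # decoder RNN + output layer

def encode(src):
    """Read the source left to right into a single context vector."""
    h = np.zeros(d)
    for tok in src:
        h = np.tanh(E[tok] @ W_e + h @ U_e)
    return h

def decode(h, steps):
    """Greedy decoding: feed the previous prediction back in each step."""
    out, tok = [], 0            # 0 = assumed start-of-sequence id
    for _ in range(steps):
        h = np.tanh(E[tok] @ W_d + h @ U_d)
        tok = int(np.argmax(h @ W_out))
        out.append(tok)
    return out

ids = decode(encode([1, 2, 3]), steps=4)
print(ids)                      # 4 predicted token ids in [0, V)
```

In practice the cells are LSTMs/GRUs and the weights are trained end-to-end on source–target pairs; the encode-then-decode shape is the same.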

Slide 12

Structured Prediction ∩ Representation Learning

- NLP problems are structural
  - Output variables are inter-correlated
  - Need joint predictions
- Traditional approaches
  - Graphical-model approaches, e.g., probabilistic graphical models, structured perceptron
  - Sequence of decisions, e.g., incremental perceptron, L2S, transition-based methods

SPNLP 12
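Joint prediction is what separates structured prediction from independent classification: when transition scores couple adjacent labels, Viterbi decoding can overrule a label that looks best in isolation. A sketch with invented toy scores where exactly that happens:

```python
import numpy as np

def viterbi(node_scores, trans):
    """Jointly decode the highest-scoring label sequence on a chain.

    node_scores: (T, K) per-position label scores
    trans:       (K, K) transition scores trans[a, b] for a -> b
    """
    T, K = node_scores.shape
    delta = node_scores[0].copy()          # best score ending in each label
    back = np.zeros((T, K), dtype=int)     # backpointers
    for t in range(1, T):
        cand = delta[:, None] + trans      # (K, K): previous a -> current b
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + node_scores[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Label 1 is "sticky" under trans; greedy per-position decoding would
# output [0, 0, 1], but joint decoding flips the middle label.
node = np.array([[2.0, 0.0], [0.6, 0.5], [0.0, 2.0]])
trans = np.array([[0.0, -1.0], [-1.0, 1.0]])
print(viterbi(node, trans))  # [0, 1, 1]
```

The same dynamic program underlies decoding in CRFs and structured perceptrons; only where the scores come from changes.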

Slide 13

Recent trends

- Landscape of methods in Deep ∩ Structure
  - Deep learning / hidden representations, e.g., seq2seq, RNN
  - Deep features into factors, with traditional factor-graph inference, e.g., LSTM+CRF, graph transformer networks
  - Globally optimized transition-based approaches, e.g., beam-search seq2seq, SyntaxNet
  - …

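Beam-search decoding, listed above, sits between greedy decisions and exact search: it keeps the k best partial sequences at each step instead of only the single best. A sketch over an invented toy next-token scorer chosen so that greedy decoding is short-sighted:

```python
import math

def beam_search(next_scores, steps, beam_size=2):
    """Expand the beam_size best prefixes each step; return the best result.

    next_scores(prefix) -> list of (token, log_prob) continuations."""
    beams = [((), 0.0)]
    for _ in range(steps):
        cand = [(prefix + (tok,), score + s)
                for prefix, score in beams
                for tok, s in next_scores(prefix)]
        beams = sorted(cand, key=lambda c: -c[1])[:beam_size]
    return beams[0]

def toy(prefix):
    """Token 1 looks worse at first but leads to a better continuation."""
    if not prefix:
        return [(0, math.log(0.6)), (1, math.log(0.4))]
    if prefix[-1] == 0:
        return [(0, math.log(0.5)), (1, math.log(0.5))]
    return [(1, math.log(0.9)), (0, math.log(0.1))]

seq, logp = beam_search(toy, steps=2)
print(seq)  # (1, 1): greedy would commit to token 0 and end up worse
```

Training the model so that beam search (rather than per-step likelihood) is optimized is the "globally optimized" part of the approaches on this slide.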