Recent Advances and Key Challenges Russ Salakhutdinov Machine - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Recent Advances and Key Challenges

Russ Salakhutdinov

Machine Learning Department Carnegie Mellon University Canadian Institute for Advanced Research

slide-2
SLIDE 2

Key Challenges

  • Multimodal Learning
  • Reasoning, Attention and Memory
  • Natural Language Understanding
  • Deep Reinforcement Learning
  • Unsupervised Learning / One-Shot & Transfer Learning

slide-3
SLIDE 3

Deep Learning: Image Understanding

TAGS: strangers, coworkers, conventioneers, attendants

Nearest Neighbor Sentence: people taking pictures of a crazy person

Model Samples

  • a group of people in a crowded area
  • a group of people are walking and talking
  • a group of people, standing around and talking

slide-4
SLIDE 4

Caption Generation

  • A car is parked in the middle of nowhere
  • There is a cat sitting on a shelf
  • A little boy with a bunch of friends on the street

Kiros, Salakhutdinov, Zemel, ICML 2014

slide-5
SLIDE 5

Caption Generation

  • The handlebars are trying to ride a bike rack
  • A man holding a red apple in his mouth
  • The two birds are trying to be seen in the water

Kiros, Salakhutdinov, Zemel, ICML 2014

slide-6
SLIDE 6

Caption Generation with Visual Attention

A man riding a horse in a field.

Xu et al., ICML 2015

slide-7
SLIDE 7

Caption Generation with Visual Attention

Xu et al., ICML 2015

slide-8
SLIDE 8

Key Challenges

  • Multimodal Learning
  • Reasoning, Attention and Memory
  • Natural Language Understanding
  • Deep Reinforcement Learning
  • Unsupervised Learning / One-Shot & Transfer Learning

slide-9
SLIDE 9

Who-Did-What Dataset

  • Context: “…arrested Illinois governor Rod Blagojevich and his chief of staff John Harris on corruption charges … included Blagojevich allegedly conspiring to sell or trade the senate seat left vacant by President-elect Barack Obama…”

  • Query: President-elect Barack Obama said Tuesday he was not aware of alleged corruption by X who was arrested on charges of trying to sell Obama’s senate seat.

  • Answer: Rod Blagojevich

Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP 2016

slide-10
SLIDE 10

Gated Attention Mechanism

  • Use Recurrent Neural Networks (RNNs) to encode a document and a query.

  • Use element-wise multiplication to model the interactions between document and query:

Dhingra, Liu, Yang, Cohen, Salakhutdinov, 2016
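
The element-wise interaction described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the RNN encoders have already turned every document token and every query token into a vector of the same size, and the function names `softmax` and `gated_attention` are hypothetical. For each document token it builds a token-specific query summary with soft attention, then multiplies the two vectors element by element.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_attention(doc, query):
    """Gate every document token vector with a query summary.

    doc and query are lists of equal-length vectors (assumed RNN outputs).
    """
    gated = []
    for d in doc:
        # Attention of this document token over all query tokens.
        weights = softmax([sum(a * b for a, b in zip(q, d)) for q in query])
        # Token-specific query summary: weighted average of the query vectors.
        summary = [sum(w * q[k] for w, q in zip(weights, query))
                   for k in range(len(d))]
        # Element-wise multiplication models the document-query interaction.
        gated.append([dk * sk for dk, sk in zip(d, summary)])
    return gated


doc = [[1.0, 0.0], [0.5, 2.0]]   # two toy document token vectors
query = [[1.0, 0.0]]             # one toy query token vector
gated = gated_attention(doc, query)
```

The output has one vector per document token, so the same operation can be repeated on the gated representation in a later layer.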

slide-11
SLIDE 11

Multi-hop Architecture

  • Reasoning over multiple sentences requires several passes over the context

Dhingra, Liu, Yang, Cohen, Salakhutdinov, 2016

slide-12
SLIDE 12

Reasoning and Attention

  • Context: “…arrested Illinois governor Rod Blagojevich and his chief of staff John Harris on corruption charges … included Blagojevich allegedly conspiring to sell or trade the senate seat left vacant by President-elect Barack Obama…”

  • Query: “President-elect Barack Obama said Tuesday he was not aware of alleged corruption by X who was arrested on charges of trying to sell Obama’s senate seat.”

  • Answer: Rod Blagojevich

Layer 1 Layer 2

slide-13
SLIDE 13

Memory Networks

Multiple passes over context help with sequential reasoning

Weston, Chopra, Bordes, ICLR 2015; Sukhbaatar et al., NIPS 2015
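
The idea of several passes over the context can be illustrated with a small sketch. It is a simplification under stated assumptions: the memory slots are given as fixed vectors, the same vectors serve for addressing and for reading, and the names `memory_hop` and `multi_hop` are hypothetical. Each hop attends over the memory with the current state, reads a weighted sum, and adds it to the state, so a later hop can attend to slots that only become relevant after the first read.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_hop(state, memory):
    """One pass over memory: attend, read, and update the state."""
    weights = softmax([sum(s * m for s, m in zip(state, slot)) for slot in memory])
    read = [sum(w * slot[k] for w, slot in zip(weights, memory))
            for k in range(len(state))]
    return [s + r for s, r in zip(state, read)], weights


def multi_hop(query, memory, hops=2):
    """Repeat the hop several times, feeding each result into the next pass."""
    state, weights = list(query), None
    for _ in range(hops):
        state, weights = memory_hop(state, memory)
    return state, weights


memory = [[1.0, 0.0], [0.0, 1.0]]   # two toy memory slots
state, weights = multi_hop([1.0, 0.0], memory, hops=2)
```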

slide-14
SLIDE 14

Broad-Context Language Modeling

Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.” She gave me a quick nod and turned back to X

LAMBADA dataset, Paperno et al., 2016

slide-15
SLIDE 15

Broad-Context Language Modeling

Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.” She gave me a quick nod and turned back to X

LAMBADA dataset, Paperno et al., 2016

slide-16
SLIDE 16

Broad-Context Language Modeling

Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.” She gave me a quick nod and turned back to X

X = Terry

LAMBADA dataset, Paperno et al., 2016

slide-17
SLIDE 17

Incorporating Prior Knowledge

Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.” She gave me a quick nod and turned back to X

Sources of prior knowledge:

  • Core NLP: Coreference, Dependency Parses
  • Freebase: Entity relations
  • WordNet: Word relations

Recurrent Neural Network Text Representation

Dhingra, Jin, Yang, Cohen, Salakhutdinov, NAACL 2018

slide-18
SLIDE 18

Explicit Memory

Mary got the football. She went to the kitchen. She left the ball there.

Edge types: Coreference, Hyper/Hyponymy

RNN

Dhingra, Jin, Yang, Cohen, Salakhutdinov, NAACL 2018

slide-19
SLIDE 19

Explicit Memory

Memory as Acyclic Graph Encoding (MAGE) - RNN

Figure labels: x_t, M_t, h_0, h_1, ..., h_{t-1}, e_1, ..., e_|E|, h_t, M_{t+1}, g_t

Mary got the football. She went to the kitchen. She left the ball there.

Edge types: Coreference, Hyper/Hyponymy

Dhingra, Jin, Yang, Cohen, Salakhutdinov, NAACL 2018

slide-20
SLIDE 20

Open Domain Question Answering

Bhuwan Dhingra et al. 2018

  • Finding answers to factual questions posed in Natural Language:

Who first voiced Meg in Family Guy?

  • A. Lacey Chabert

Who voiced Meg in Family Guy?

  • A. Lacey Chabert, Mila Kunis
slide-21
SLIDE 21

Text Augmented Knowledge Graphs

Bhuwan Dhingra et al. 2018

Questions

  • Who voiced Meg in Family Guy?
  • Which year was Blade Runner released?
  • Which club did Cristiano Ronaldo play for in 2011?

Answers

  • Lacey Chabert, Mila Kunis
  • 1982
  • Real Madrid

Knowledge Source

slide-22
SLIDE 22

Knowledge Base as a Knowledge Source

Bhuwan Dhingra et al. 2018

Question: Who first voiced Meg in Family Guy?

Pipeline components: Semantic Parsing, Query Graph, KB

Answer: Lacey Chabert

slide-23
SLIDE 23

Text as a Knowledge Source

Bhuwan Dhingra et al. 2018

  • Step 1 (Information Retrieval): Retrieve passages relevant to the Question using shallow methods
  • Step 2 (Reading Comprehension): Perform deep reading of passages to extract answers
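
The two-step pipeline on this slide can be sketched as follows. This is an illustration only: the slide does not say which shallow retrieval method is used, so a simple TF-IDF overlap score stands in for Step 1, and Step 2 is left as a `reader` callable supplied by the caller. The names `tokenize`, `retrieve`, and `answer`, and the second example passage, are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (word.strip(".,?!\"'").lower() for word in text.split())
    return [t for t in tokens if t]


def retrieve(question, passages, top_k=1):
    """Step 1: rank passages by a simple TF-IDF overlap with the question."""
    counts = [Counter(tokenize(p)) for p in passages]
    n = len(passages)

    def idf(term):
        # Smoothed inverse document frequency; rarer terms weigh more.
        df = sum(1 for c in counts if term in c)
        return math.log((1 + n) / (1 + df)) + 1.0

    terms = set(tokenize(question))
    scores = [sum(c[t] * idf(t) for t in terms) for c in counts]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in ranked[:top_k]]


def answer(question, passages, reader, top_k=1):
    """Step 2: hand the retrieved passages to a reading-comprehension model."""
    return reader(question, retrieve(question, passages, top_k))


passages = [
    "Meg Griffin is a character from the animated television series Family Guy.",
    "Blade Runner was released in 1982.",  # made-up passage for illustration
]
top = retrieve("Which year was Blade Runner released?", passages)
```

Keeping the reader behind a plain function argument makes the split visible: the retriever only has to narrow the text down cheaply, and the expensive model only sees the few passages that survive.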

slide-24
SLIDE 24

Text Augmented Knowledge Graph

Bhuwan Dhingra et al. 2018

Question: Who first voiced Meg in Family Guy?

Text

  • d1: Meg Griffin is a character from the animated television series Family Guy
  • d2: Originally voiced by Lacey Chabert during the first season, she has been voiced by Mila Kunis since season 2

Knowledge graph

  • Entities: Meg Griffin, Lacey Chabert, Family Guy, Mila Kunis
  • Relations: character-in, voiced-by

Components: Entity Linking, TF-IDF based sentence retrieval, Personalized PageRank
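
Personalized PageRank, one of the components named on this slide, can be sketched with plain power iteration. The slide does not state how the graph is built or which parameters are used, so everything here is an assumption: the toy graph treats the relations as undirected edges, the seed set stands for the entities linked in the question, and `alpha`, `iters`, and the function name are hypothetical.

```python
def personalized_pagerank(graph, seeds, alpha=0.85, iters=50):
    """Random walk with restart to the seed nodes.

    graph maps every node to a list of neighbours; seeds is a set of nodes.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        # With probability 1 - alpha the walk jumps back to a seed.
        new = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # Dangling node: send its mass back to the seeds.
                for s in seeds:
                    new[s] += alpha * rank[n] / len(seeds)
                continue
            share = alpha * rank[n] / len(neighbours)
            for m in neighbours:
                new[m] += share
        rank = new
    return rank


graph = {
    "Meg Griffin": ["Family Guy", "Lacey Chabert", "Mila Kunis"],
    "Family Guy": ["Meg Griffin"],
    "Lacey Chabert": ["Meg Griffin"],
    "Mila Kunis": ["Meg Griffin"],
}
rank = personalized_pagerank(graph, seeds={"Meg Griffin"})
```

Nodes close to the seed entities end up with higher scores, which is one way to select a question-specific neighbourhood of the graph.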

slide-25
SLIDE 25

Key Challenges

  • Multimodal Learning
  • Reasoning, Attention and Memory
  • Natural Language Understanding
  • Deep Reinforcement Learning
  • Unsupervised Learning / One-Shot & Transfer Learning

slide-26
SLIDE 26

Learning Behaviors

Action Observation

Learning to map sequences of observations to actions, for a particular goal

slide-27
SLIDE 27

Reinforcement Learning

Observation / State Action Reward
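
The observation, action, reward cycle can be made concrete with a toy loop. This is a sketch with invented pieces: `CorridorEnv` is a hypothetical one-dimensional environment, and the policy is any function from an observation to an action. No learning happens here; the code only shows the interaction the diagram describes.

```python
class CorridorEnv:
    """Toy environment: walk along a corridor to reach the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (observation, reward, done)."""
        self.position = max(0, min(self.length - 1, self.position + action))
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(env, policy, max_steps=20):
    """Run one episode and return the total reward collected."""
    observation = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(observation)                   # agent acts on the observation
        observation, reward, done = env.step(action)   # environment responds
        total += reward
        if done:
            break
    return total


always_right = run_episode(CorridorEnv(5), lambda observation: 1)
always_left = run_episode(CorridorEnv(5), lambda observation: -1)
```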

slide-28
SLIDE 28

Deep Reinforcement Learning

Observation / State Action Reward

h3 h2 h1 v W3 W2 W1

Deep Neural Net
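
A rough sketch of how a stack of layers can map an observation to an action, using the labels from the diagram (input v, weights W1 to W3, hidden layers h1 to h3). The tanh units, the absence of bias terms, and the greedy choice over the top layer are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def layer(weights, inputs):
    """One fully connected layer with tanh units; weights is a list of rows."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(v, W1, W2, W3):
    """Propagate observation v through three layers, returning h1, h2, h3."""
    h1 = layer(W1, v)
    h2 = layer(W2, h1)
    h3 = layer(W3, h2)
    return h1, h2, h3


def greedy_action(v, W1, W2, W3):
    """Treat the top layer h3 as action scores and pick the largest one."""
    _, _, h3 = forward(v, W1, W2, W3)
    return max(range(len(h3)), key=lambda a: h3[a])


identity = [[1.0, 0.0], [0.0, 1.0]]   # toy weights, not learned
action = greedy_action([1.0, -1.0], identity, identity, identity)
```

In deep reinforcement learning the weights would be trained from reward rather than fixed by hand as they are here.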

slide-29
SLIDE 29

Deep RL with Memory

Observation / State Action Reward

Learned External Memory

Differentiable Neural Computer, Graves et al., Nature, 2016; Neural Turing Machine, Graves et al., 2014

slide-30
SLIDE 30

Deep RL with Memory

Observation / State Action Reward

Learned Structured Memory

Parisotto, Salakhutdinov, ICLR 2018

slide-31
SLIDE 31

Random Maze with Indicator

  • Indicator: Either blue or pink
      • If blue, find the green block
      • If pink, find the red block

  • Negative reward if agent does not find correct block in N steps or goes to wrong block.
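
The task rules above can be written down as a small reward function. The slide only states when the reward is negative, so the reward values, the positive reward for reaching the correct block, and the zero reward while the episode is still running are assumptions; the names `TARGET_FOR_INDICATOR` and `episode_reward` are hypothetical.

```python
# Which block the agent must find for each indicator colour (from the slide).
TARGET_FOR_INDICATOR = {"blue": "green", "pink": "red"}


def episode_reward(indicator, block_reached, steps_taken, max_steps,
                   success_reward=1.0, failure_reward=-1.0):
    """Reward for the current step of a maze episode.

    block_reached is "green", "red", or None if no block has been reached yet.
    max_steps plays the role of N on the slide.
    """
    target = TARGET_FOR_INDICATOR[indicator]
    if steps_taken > max_steps:
        return failure_reward            # ran past the step limit
    if block_reached is None:
        # Time-out exactly at the limit, otherwise the episode continues.
        return failure_reward if steps_taken == max_steps else 0.0
    # A block was reached within the limit: correct one or wrong one.
    return success_reward if block_reached == target else failure_reward
```

Because the indicator is typically seen only at one place in the maze, the agent has to remember its colour until it meets a block, which is what makes the task a test of memory.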

slide-32
SLIDE 32

Deep RL with Structured Memory

Write

Mt

Write

Mt+1

Read with Attention

Parisotto, Salakhutdinov, 2017
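
The write and read operations in the diagram can be illustrated with a toy structured memory. This is a heavy simplification under stated assumptions: the memory is a dictionary from grid positions to vectors, a write overwrites the cell at the agent's position, and a read is a softmax-weighted sum over all cells. In the cited work the written vector and the read query would be produced by learned networks; the function names here are hypothetical.

```python
import math


def write(memory, position, vector):
    """Return M_{t+1}: a copy of M_t with the cell at `position` overwritten."""
    updated = dict(memory)
    updated[position] = list(vector)
    return updated


def read_with_attention(memory, query):
    """Soft read: attention-weighted sum of all memory cells (memory must be non-empty)."""
    cells = list(memory.values())
    scores = [sum(q * c for q, c in zip(query, cell)) for cell in cells]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * cell[k] for w, cell in zip(weights, cells))
            for k in range(len(query))]


m_t = {(0, 0): [0.0, 0.0]}                  # memory before the write
m_next = write(m_t, (0, 1), [5.0, 0.0])     # agent writes at its position
readout = read_with_attention(m_next, [2.0, 0.0])
```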

slide-33
SLIDE 33

Building Intelligent Agents

Observation / State Action Reward

Learned External Memory Knowledge Base

slide-34
SLIDE 34

Task-oriented Language Grounding

Chaplot et al., AAAI 2019

slide-35
SLIDE 35

Active Neural Localization and SLAM

Chaplot, Parisotto, Salakhutdinov, ICLR 2018

slide-36
SLIDE 36

Building Intelligent Agents

Observation / State Action Reward

Learned External Memory Knowledge Base

Learning from Fewer Examples, Fewer Experiences

slide-37
SLIDE 37

Thank you