SLIDE 1
The Natural Language of Actions
Guy Tennenholtz, Shie Mannor
ICML 2019
Technion Institute of Technology
SLIDE 2
SLIDE 3
What is Language?
‘Language is a purely human and non-instinctive method of communicating ideas, emotions, and desires by means of voluntarily produced symbols.’ Edward Sapir, 1921
SLIDE 4
What is Language?
SLIDE 5
Contextual Semantics
Contextual Representation
A word’s contextual representation is an abstract cognitive structure that accumulates from encounters with the word in various linguistic contexts.
We learn new words based on contextual cues:
∙ I saw a little yazuba sleeping behind the tree.
∙ One glass of feandra is enough to get you drunk.
SLIDE 6
Meaning of Actions?
Characteristics
∙ Actions are characterized by the company they keep
∙ Context is generated by an optimal agent
∙ Agent demonstrates acceptable behavior in the environment
SLIDE 7
Act2Vec
SLIDE 8
Act2Vec
SLIDE 9
Skip Gram (used in Word2Vec)
SLIDE 10
Skip Gram (used in Word2Vec)
SLIDE 11
Act2Vec: Drawing
∙ 70,000 human-drawn squares from the QuickDraw! dataset
∙ Strokes correspond to two operations
SLIDE 12
Act2Vec: Navigation
Act2Vec embedding trained in a 2D environment on sequences of primitive actions:
∙ Move Forward 1 unit
∙ Turn Left 15 degrees
∙ Turn Right 15 degrees
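One way to picture training on action sequences is the skip-gram setup named on the Skip Gram slides: each action is paired with the actions that appear within a window around it. The following is a minimal sketch under stated assumptions, not the authors' implementation. The helper `skipgram_pairs`, the action labels, the window size, and the demonstration sequence are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Labels for the three primitive actions listed on the slide
# (the string names themselves are an assumption).
FORWARD, LEFT, RIGHT = "forward_1", "left_15", "right_15"

def skipgram_pairs(sequence, window=2):
    """Return (center, context) pairs for every action in `sequence`.

    `window` is the number of neighbours considered on each side;
    it is an illustrative choice, not a value from the slides.
    """
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs

# Hypothetical demonstration: turn left twice, then move forward three times.
demo = [LEFT, LEFT, FORWARD, FORWARD, FORWARD]
pairs = skipgram_pairs(demo, window=1)

# Counting pairs shows which actions "keep company" with which.
context_counts = Counter(pairs)
```

A skip-gram model would then be fit so that each center action predicts its context actions; the pair counts above are only the raw co-occurrence signal such a model learns from.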
SLIDE 13
Act2Vec: Navigation
SLIDE 14
Act2Vec in Reinforcement Learning
SLIDE 15
Act2Vec in Reinforcement Learning
SLIDE 16
Act2Vec for Q-Value Approximation
SLIDE 17
Q-Embedding
1 He, Ji, et al. "Deep Reinforcement Learning with a Natural Language Action Space." Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (2016).
SLIDE 18
Act2Vec for Exploration
SLIDE 19
k-Exp
∙ Divide the action embedding space into k clusters using a clustering algorithm (e.g., k-means)
∙ Sample a cluster uniformly
∙ Given a cluster, uniformly sample an action within it
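The two sampling steps can be sketched as follows, assuming the clustering step has already been run and its labels are available. The function name `k_exp_sample`, the `cluster_of` mapping, and the example actions (including `left_30`, which does not appear in the slides) are hypothetical and used only for illustration.

```python
import random
from collections import defaultdict

def k_exp_sample(cluster_of, rng=random):
    """Pick a cluster uniformly, then an action uniformly within it.

    `cluster_of` maps each action to a cluster id, e.g. labels obtained
    by running k-means on the action embeddings (assumed done elsewhere).
    """
    clusters = defaultdict(list)
    for action, cluster_id in cluster_of.items():
        clusters[cluster_id].append(action)
    # Sort the cluster ids so that seeded runs are reproducible.
    chosen_cluster = rng.choice(sorted(clusters))
    return rng.choice(clusters[chosen_cluster])

# Hypothetical clustering: three turning actions share cluster 0,
# a single forward action sits alone in cluster 1.
cluster_of = {"left_15": 0, "right_15": 0, "left_30": 0, "forward_1": 1}

rng = random.Random(0)
samples = [k_exp_sample(cluster_of, rng) for _ in range(10_000)]
forward_share = samples.count("forward_1") / len(samples)
```

In this toy setup the lone forward action is drawn with probability 1/2, compared with 1/4 under uniform sampling over the four actions, which illustrates how sampling by cluster rebalances exploration toward actions that are rare but semantically distinct.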
SLIDE 20
Act2Vec for Domain Transfer
SLIDE 21
Navigation
Transfer to 3D environment
SLIDE 22
The Semantics of Actions in StarCraft II
SLIDE 23
StarCraft II
SLIDE 24
StarCraft II
SLIDE 25
StarCraft II
SLIDE 26
Conclusion
SLIDE 27