SLIDE 1

MIN Faculty Department of Informatics

Human-Robot Social Interactions through Multimodal Deep Attention Recurrent Q-Network

Nana Baah

University of Hamburg Faculty of Mathematics, Informatics and Natural Sciences Department of Informatics Technical Aspects of Multimodal Systems

14 December 2018

Nana Baah – Human-Robot Social Interactions through MDARQN 1 / 20

SLIDE 2

Outline

Introduction and Motivation The Proposed MDARQN Robot Actions with Attention Training Phase Results and Discussion Conclusion Reference

  • 1. Introduction and Motivation
  • 2. The Proposed MDARQN
  • 3. Robot Actions with Attention
  • 4. Training Phase
  • 5. Results and Discussion
  • 6. Conclusion

SLIDE 3

Why the need for HRSI?

◮ Human-robot social interaction (HRSI)
◮ Humans and robots coexisting
◮ Impossible to program an interpreter for complex human behavior
◮ Self-learning architecture to learn social interaction skills

SLIDE 4

Introduction and Motivation

◮ Reinforcement Learning (RL) + Deep Learning = Deep Q-Network
◮ Multimodal Deep Q-Network (MDQN)
◮ Robots augmented with MDQN learned to choose appropriate actions
◮ Still needed perceivability, because MDQN has no attention mechanism

SLIDE 5

The Proposed MDARQN

Each stream consists of:

  • 1. Convolutional Network (ConvNet)
  • 2. Long Short-Term Memory (LSTM)
  • 3. Attention Mechanism Network (G)

Multimodal Deep Attention Recurrent Q-Network Architecture [2]

SLIDE 6

Convolutional Network (Convnets)

  • 1. MDARQN inputs are 2 streams of pre-processed visual frames:

◮ Y-channel for grayscale images
◮ Depth channel for depth images

  • 2. Each ConvNet consists of 4 layers, followed by a rectified linear (ReLU) non-linearity
  • 3. The output D-dimensional feature vectors are fed to the attention network

Convolutional Network for gray-scale images as input [2]
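
As a rough illustration of the Y-channel pre-processing, the sketch below converts an RGB frame to luminance. The slides do not say which conversion was used, so the ITU-R BT.601 weights and the function name `to_luma` are assumptions made for this example only.

```python
def to_luma(frame):
    """Convert an RGB frame (rows of (r, g, b) tuples, 0-255) to a Y-channel image.

    Assumption: ITU-R BT.601 luma weights; the slides do not name the formula.
    """
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in frame
    ]


# A tiny 2x2 test frame: white, black, pure red, pure blue.
frame = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
luma = to_luma(frame)
```

The result has one value per pixel instead of three, which is the shape a grayscale ConvNet stream would consume.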

SLIDE 7

Long-Short Term Memory (LSTM)

  • 1. Recurrent neural networks (RNNs) allow information to persist
  • 2. Input gate: takes the previous hidden state (h_{t-1}), the previous memory state (c_{t-1}) and the annotation vector (z_t)
  • 3. Forget gate: old or irrelevant data is discarded
  • 4. Output gate: the output (h_t) is based on the memory state (c_t)

Long-short Term Memory Network (LSTM) [2]
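
The gate roles listed on this slide can be sketched as one step of a textbook LSTM cell. This is a minimal scalar version under stated assumptions: the weights in `params` are hypothetical, real cells use vectors and matrices, and the exact parameterization in [2] may differ.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(z_t, h_prev, c_prev, p):
    """One step of a scalar LSTM cell (standard formulation).

    z_t: annotation value from the attention network (scalar for brevity)
    h_prev, c_prev: previous hidden state and memory state
    p: hypothetical weights, gate name -> (w_z, w_h, bias)
    """
    def pre(name):
        w_z, w_h, b = p[name]
        return w_z * z_t + w_h * h_prev + b

    i_t = sigmoid(pre("input"))         # how much new information to write
    f_t = sigmoid(pre("forget"))        # how much of the old memory to keep
    o_t = sigmoid(pre("output"))        # how much of the memory to expose
    g_t = math.tanh(pre("candidate"))   # candidate memory content

    c_t = f_t * c_prev + i_t * g_t      # updated memory state
    h_t = o_t * math.tanh(c_t)          # new hidden state
    return h_t, c_t


params = {name: (0.5, 0.5, 0.0)
          for name in ("input", "forget", "output", "candidate")}
h_t, c_t = lstm_step(z_t=1.0, h_prev=0.0, c_prev=0.0, p=params)
```

Starting from an empty memory (`c_prev = 0`), the forget gate has nothing to discard, so the new memory comes entirely from the input gate and the candidate value.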

SLIDE 8

Attention Mechanism Network

  • 1. Generates the annotation vector
  • 2. Input: L feature vectors of dimension D and the previous hidden state of the LSTM
  • 3. Soft attention network:

◮ Differentiable
◮ Deterministic

Soft Attention Network [4]
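
A minimal sketch of soft attention may help here: each of the L feature vectors gets a softmax weight, and the annotation vector is their weighted sum. As an assumption for this example, a plain dot product with the previous hidden state stands in for the learned network G, which means `h_prev` must have dimension D here.

```python
import math


def soft_attention(features, h_prev):
    """Return (weights, annotation vector) for L feature vectors of size D.

    Assumption: a dot product with h_prev replaces the learned scoring
    network G, so h_prev must also have size D in this sketch.
    """
    scores = [sum(f * h for f, h in zip(vec, h_prev)) for vec in features]
    m = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]    # softmax: positive, sums to 1
    d = len(features[0])
    z_t = [sum(w * vec[k] for w, vec in zip(weights, features))
           for k in range(d)]
    return weights, z_t


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # L = 3 regions, D = 2
weights, z_t = soft_attention(features, h_prev=[2.0, 0.0])
```

Because every step is a smooth function of the inputs and no sampling is involved, the whole computation is differentiable and deterministic, matching the two properties listed on the slide.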

SLIDE 9

Q-Network

  • 1. Q-values from the two streams are normalized for the fusion
  • 2. Normalized values are averaged together to generate the output Q-values
  • 3. Greedy action is taken (highest Q-value)

Deep Q-learning for Human-robot Interaction [2]
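
The three fusion steps can be sketched as follows. The slides do not state how the Q-values are normalized, so min-max scaling is an assumption here, as are the function names and the example Q-values.

```python
def normalize(q_values):
    """Min-max scale Q-values to [0, 1] (assumed normalization scheme)."""
    lo, hi = min(q_values), max(q_values)
    if hi == lo:
        return [0.0 for _ in q_values]   # all actions tie
    return [(q - lo) / (hi - lo) for q in q_values]


def fuse_and_act(q_gray, q_depth):
    """Average the normalized Q-values of both streams, then act greedily."""
    fused = [(g + d) / 2.0
             for g, d in zip(normalize(q_gray), normalize(q_depth))]
    action = max(range(len(fused)), key=fused.__getitem__)
    return fused, action


ACTIONS = ["wait", "look towards human", "wave hand", "handshake"]
# Made-up Q-values on different scales for the grayscale and depth streams.
fused, action = fuse_and_act([0.1, 0.4, 0.2, 0.9], [2.0, 6.0, 10.0, 8.0])
```

Normalizing first keeps a stream with numerically larger Q-values from dominating the average.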

SLIDE 10

Modified Robotic System

◮ A Pepper robot was used for the project
◮ Used visual sensors
  ◮ 2-D camera and 3-D sensor (10 fps at 320x240)
◮ Modified with a Force Sensing Resistor (FSR) touch sensor
  ◮ Forms the basis for the reward function

Pepper Robot [1]

SLIDE 11

Attention Steering for non-Greedy actions

◮ Ensures perceivable HRSI
◮ Robot randomly picks an action from the set of legal actions
◮ Attention steering function
  ◮ Awareness
  ◮ Sensitive to real-world stimuli (sound and movement detection)
◮ The set of legal actions contains 4 actions:

  • 1. Wait
  • 2. Look towards human
  • 3. Wave hand
  • 4. Handshake

SLIDE 12

Reward Function

◮ Handshake detection through the touch sensor forms the baseline.
◮ Rewards
  ◮ Successful handshake: 1
  ◮ Unsuccessful handshake: -0.1
  ◮ Other actions: 0
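
The reward values above translate directly into a small function. The function name, the action strings and the boolean `handshake_detected` flag (standing in for the FSR touch sensor reading) are hypothetical names chosen for this sketch.

```python
def reward(action, handshake_detected):
    """Reward for one step, using the values given on the slide.

    handshake_detected: hypothetical flag for the touch-sensor reading.
    """
    if action != "handshake":
        return 0.0                       # wait, look, wave: neutral
    return 1.0 if handshake_detected else -0.1
```

For example, `reward("handshake", False)` gives the small penalty for a handshake attempt that nobody accepted, while `reward("wave hand", False)` stays at zero.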

SLIDE 13

Training Phase

  • 1. Agent was trained for 14 days
  • 2. Training happened by interacting with people in an uncontrolled environment
  • 3. At each time step:

◮ Environment provides an observation state o_t
◮ Agent takes an action using an ε-greedy policy
◮ Environment provides a scalar reward r_t and the next state s_{t+1}

Reinforcement model of interaction [3]
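
The ε-greedy policy used at each time step can be sketched as below: with probability ε the agent explores with a random legal action, otherwise it takes the action with the highest Q-value. The function signature and the `rng` parameter are assumptions for this illustration, and the slides do not give the value or schedule of ε.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-values.

    With probability epsilon: a uniformly random action (exploration).
    Otherwise: the action with the highest Q-value (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# epsilon = 0.0 disables exploration, so the greedy action (index 1) is chosen.
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

Passing a seeded `random.Random` instance as `rng` makes runs reproducible.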

SLIDE 14

Data Generation Phase

  • 1. Interaction experience e_t = {s_t, a_t, r_t, s_{t+1}} is stored in a replay buffer M
  • 2. Data generation cycle ends when the terminal state T is reached
  • 3. Replay buffer stores the N most recent experiences
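
A replay buffer that keeps only the N most recent experiences can be sketched with a bounded deque. The class and field names are hypothetical, and the `sample` method reflects the usual mini-batch sampling in deep Q-learning, which the slides do not describe.

```python
import random
from collections import deque, namedtuple

# One interaction experience e_t = {s_t, a_t, r_t, s_{t+1}}.
Experience = namedtuple("Experience", ["state", "action", "reward", "next_state"])


class ReplayBuffer:
    """Stores the N most recent experiences; older ones are dropped."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)   # N = capacity

    def store(self, state, action, reward, next_state):
        self.memory.append(Experience(state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Assumed usage: uniform random mini-batch for training.
        return rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.store(state=t, action=0, reward=0.0, next_state=t + 1)
```

After five stores into a buffer of capacity 3, only the experiences from the last three time steps remain.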

SLIDE 15

Evaluation Procedure

Evaluate MDARQN decisions and the impact of the attention model on HRI.

  • 1. MDARQN decisions on a data set

◮ More than 1 feasible action
◮ 3 volunteers suggested the best action for each scenario

  • 2. Evaluating the impact of the attention mechanism

◮ Robot interacted with the public under the trained Q-networks’ policy
◮ MDARQN performance was compared to MDQN

SLIDE 16

Results

Robot waits [2]
Robot looks towards human [2]
Robot waves [2]
Robot offers a handshake [2]

SLIDE 17

Discussion

◮ Interpret human walking trajectory
◮ Level of human engagement
◮ Human’s body orientation and distance
◮ People’s willingness to interact with a robot
◮ Precise selective interaction attention
◮ High penalty results in rude behavior
◮ Low penalty results in repeated handshakes

SLIDE 18

Conclusion

◮ Proposed MDARQN was trained for 14 days
◮ By interpreting and executing a responsive action
◮ Learned to infer intention
◮ Attention indication adds perceivability to robot actions
◮ Learned to make appropriate decisions in diverse interaction scenarios

SLIDE 19

Thank You
Any Questions?

SLIDE 20

References

[1] SoftBank Robotics. Pepper. https://www.softbankrobotics.com/emea/en/pepper. Accessed: 2018-12-14.
[2] Ahmed Hussain Qureshi et al. “Show, attend and interact: Perceivable human-robot social interaction through neural attention Q-network”. In: Robotics and Automation (ICRA), 2017 IEEE International Conference on. IEEE. 2017, pp. 1639–1645.
[3] Hado Van Hasselt, Arthur Guez, and David Silver. “Deep Reinforcement Learning with Double Q-Learning”. In: AAAI. Vol. 2. Phoenix, AZ. 2016, p. 5.
[4] Shiyang Yan et al. “Hierarchical Multi-scale Attention Networks for action recognition”. In: Signal Processing: Image Communication 61 (2018), pp. 73–84.
