From heuristic to optimal models in naturalistic visual search




  1. From heuristic to optimal models in naturalistic visual search. Angela Radulescu 1,2*, Bas van Opheusden 1,2*, Fred Callaway 2, Thomas Griffiths 2 & James Hillis 1. Bridging AI and Cognitive Science workshop, ICLR, April 24th, 2020

  2. An everyday problem… where are the keys?

  3. Resource allocation in visual search
  • Main contribution: frame visual search as a reinforcement learning problem
  ‣ Fixations as information-gathering actions
  ‣ Do people employ optimal strategies?

  4. Resource allocation in visual search
  • Main contribution: frame visual search as a reinforcement learning problem
  ‣ Fixations as information-gathering actions
  ‣ Do people employ optimal strategies?
  • Challenges:
  ‣ Representing the state space: the world is high-dimensional; what features does the visual system have access to?
  ‣ Finding the optimal policy: the reward function is sparse; how can the cost of sampling be balanced against performance?

  5. Naturalistic visual search in VR
  • VR + gaze tracking, fixed camera location
  • Cluttered room, 1 target among many distractors
  • “Find the target within 8 seconds”
  • 6 different rooms × 5 locations per room × 10 trials per location = 300 unique scenes (enumerated in the sketch below)
  • Some trials assisted
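
The 6 × 5 × 10 design works out to 300 unique scenes; a minimal sketch of how such a design matrix could be enumerated (the room/location/trial identifiers are illustrative placeholders, not the actual stimulus labels):

```python
from itertools import product

# Hypothetical design matrix: 6 rooms x 5 target locations x 10 trials = 300 scenes.
rooms = [f"room_{r}" for r in range(6)]
locations = range(5)
trials = range(10)

scenes = [{"room": room, "location": loc, "trial": trial}
          for room, loc, trial in product(rooms, locations, trials)]
assert len(scenes) == 300
```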


  8. [Figure slide; labels: Start, End]

  9. Meta-level Markov Decision Process (Callaway & Griffiths, 2018)

  10. Meta-level Markov Decision Process
  • Latent: {F_true, i_true}
  ‣ Scene features and target identity, unknown to the agent
  • States: {F, J, f_target}
  ‣ Mean and precision of each feature for each object
  • Actions: {o, ⊥}
  ‣ Fixate on object o, or terminate (⊥)
  • Transitions: measure X ~ N(F_true, J_meas)
  ‣ J_meas decreases with distance from o
  ‣ Integrate X into F and J with Bayesian cue combination (sketched below)
  • Rewards: R = -c for fixating o; for ⊥, R = 1 if argmax P(target | F, J) = i_true and 0 otherwise
  ‣ Reward the agent when the most probable target given the state matches the true target
  (Callaway & Griffiths, 2018)

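The transition step is standard Bayesian cue combination: each fixation returns a noisy measurement X ~ N(F_true, J_meas) of every object's features, with measurement precision J_meas falling off with distance from the fixated object, and the belief (F, J) is updated by adding precisions and precision-weighting the means. Below is a minimal NumPy sketch of one such update, plus one plausible way to read out P(target | F, J) against the known target features f_target; the array shapes, the distance-to-precision mapping, and the read-out rule are illustrative assumptions, not details taken from the talk:

```python
import numpy as np

def fixation_update(F, J, F_true, distances, base_precision=4.0, falloff=1.0):
    """One meta-MDP transition: draw a noisy measurement of every object's
    features, with precision decreasing with distance from the fixated object,
    then fold it into the belief by Bayesian cue combination.

    F, J, F_true : (n_objects, n_features) belief means, belief precisions,
                   and true scene features.
    distances    : (n_objects,) distance of each object from the fixated one.
    """
    # Assumed falloff: measurement precision shrinks with eccentricity.
    J_meas = base_precision / (1.0 + falloff * distances[:, None])

    # Noisy measurement X ~ N(F_true, 1 / J_meas)  (J_meas is a precision).
    X = F_true + np.random.randn(*F_true.shape) / np.sqrt(J_meas)

    # Cue combination: precisions add, means are precision-weighted averages.
    J_new = J + J_meas
    F_new = (J * F + J_meas * X) / J_new
    return F_new, J_new

def target_posterior(F, J, f_target):
    """One plausible P(target | F, J): score each object by the likelihood that
    its (uncertain) features equal the known target features f_target."""
    log_lik = np.sum(0.5 * np.log(J / (2 * np.pi)) - 0.5 * J * (f_target - F) ** 2, axis=1)
    log_lik -= log_lik.max()          # for numerical stability
    p = np.exp(log_lik)
    return p / p.sum()
```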

  15. Challenge I: representing the belief space
  Challenge II: finding the optimal policy

  16. Which features to include?

  17. Which features to include?
  [Figure: objects and target, with object attributes such as color and shape]
  (Treisman & Gelade, 1980; Horowitz & Wolfe, 2017)

  18. Which features to include?

  19–22. Which features to include?
  [Figure build: feature extraction pipeline, see the sketch below]
  ‣ Shape: 3D mesh → D2 distribution → PCA
  ‣ Color: 2D texture → CIELAB → PCA
  ‣ Similarity structure compared between the full feature space and a partial (3-PC) representation, for both shape and color
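
The D2 descriptor is the distribution of Euclidean distances between randomly chosen pairs of points on an object's surface, so shape reduces to a histogram and color to a summary in CIELAB space, both of which can then be compressed with PCA. A rough sketch of that pipeline follows; sampling from mesh vertices rather than the surface, the distance normalization, and the histogram/PCA settings are simplifying assumptions, not details from the talk:

```python
import numpy as np
from sklearn.decomposition import PCA

def d2_descriptor(vertices, n_pairs=10_000, n_bins=64, rng=None):
    """Approximate D2 shape distribution: histogram of distances between random
    pairs of surface points (approximated here by mesh vertices), rescaled by
    the mean distance so objects of different sizes stay comparable."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(vertices), size=n_pairs)
    j = rng.integers(0, len(vertices), size=n_pairs)
    dists = np.linalg.norm(vertices[i] - vertices[j], axis=1)
    hist, _ = np.histogram(dists / dists.mean(), bins=n_bins, density=True)
    return hist

def reduce_features(descriptors, n_components=3):
    """Project per-object descriptors (D2 histograms, or CIELAB color summaries)
    onto their first few principal components (the 'partial' representation)."""
    return PCA(n_components=n_components).fit_transform(np.asarray(descriptors))

# Usage sketch (meshes not provided here):
# meshes = [...]                       # one (n_vertices, 3) array per object
# partial_shape = reduce_features([d2_descriptor(v) for v in meshes])
```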

  23. Shape and color predict gaze

  24. Shape and color predict gaze
  [Figure: two panels, “Gaze on objects”]

  25. Shape and color predict gaze

  26. Challenge I: representing the belief space
  Challenge II: finding the optimal policy

  27. “Ideal observer” model of visual search
  ‣ Loop: calculate/update posterior probabilities → if the maximum exceeds a criterion, STOP; otherwise move the eyes to the object most likely to be the target → sample information at the fixated location → repeat (see the sketch below)
  • Can be expressed as a policy in the meta-MDP, but not necessarily an optimal one
  (Najemnik & Geisler, 2005; Yang, Lengyel & Wolpert, 2017)
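
A minimal sketch of that heuristic stopping-and-fixation rule, written against a posterior like the one returned by target_posterior above (the criterion value and the return convention are arbitrary choices for illustration):

```python
import numpy as np

def ideal_observer_step(posterior, criterion=0.95):
    """Greedy 'ideal observer' policy: stop once the MAP probability exceeds a
    criterion, otherwise fixate the object currently most likely to be the
    target. Returns None to terminate (the ⊥ action), else an object index."""
    best = int(np.argmax(posterior))
    if posterior[best] >= criterion:
        return None          # terminate and report `best` as the target
    return best              # fixate the most probable object next
```

In meta-MDP terms this policy always picks either ⊥ or the current MAP object, which is one particular policy over beliefs but, as the slide notes, not necessarily the return-maximizing one.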

  28. Optimizing meta-level return with deep reinforcement learning
  • Proximal Policy Optimization (PPO; Schulman et al., 2017), implemented with tf-agents (see the sketch below)
  • 10 replications, manually tuned hyper-parameters
  • Manual tweaking of input representation & initialization
  [Figure: network architecture; inputs (object locations, object features, target features, posterior) → dense layers → policy π and value V]
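
The slides do not show training code; the following is a rough sketch of how a PPO agent might be wired up in tf-agents for this kind of environment. MetaSearchEnv is a hypothetical stand-in for the meta-MDP (its observation layout, sizes, fixation cost, and stubbed dynamics are made-up placeholders), not the authors' implementation:

```python
import numpy as np
import tensorflow as tf
from tf_agents.agents.ppo import ppo_agent
from tf_agents.environments import py_environment, tf_py_environment
from tf_agents.networks import actor_distribution_network, value_network
from tf_agents.specs import array_spec
from tf_agents.trajectories import time_step as ts

N_OBJECTS, N_FEATURES = 20, 6  # assumed sizes, not from the talk

class MetaSearchEnv(py_environment.PyEnvironment):
    """Hypothetical meta-MDP stand-in: actions 0..N_OBJECTS-1 fixate an object,
    action N_OBJECTS terminates; the observation concatenates belief means,
    belief precisions, the per-object posterior, and the target's features."""

    def action_spec(self):
        return array_spec.BoundedArraySpec((), np.int32, minimum=0, maximum=N_OBJECTS)

    def observation_spec(self):
        dim = N_OBJECTS * (2 * N_FEATURES + 1) + N_FEATURES
        return array_spec.ArraySpec((dim,), np.float32)

    def _reset(self):
        self._obs = np.zeros(self.observation_spec().shape, np.float32)
        return ts.restart(self._obs)

    def _step(self, action):
        if action == N_OBJECTS:                           # ⊥: stop and score the guess
            return ts.termination(self._obs, reward=1.0)  # stub: assumes a correct guess
        # A real environment would apply the Bayesian cue-combination update here.
        return ts.transition(self._obs, reward=-0.01)     # fixation cost c

env = tf_py_environment.TFPyEnvironment(MetaSearchEnv())
actor_net = actor_distribution_network.ActorDistributionNetwork(
    env.observation_spec(), env.action_spec(), fc_layer_params=(128, 128))
value_net = value_network.ValueNetwork(
    env.observation_spec(), fc_layer_params=(128, 128))
agent = ppo_agent.PPOAgent(
    env.time_step_spec(), env.action_spec(),
    optimizer=tf.keras.optimizers.Adam(3e-4),
    actor_net=actor_net, value_net=value_net, num_epochs=10)
agent.initialize()
```

From here a standard tf-agents collect/train loop (driver plus replay buffer) would produce learning curves like those on the next slide.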

  29. Optimizing meta-level return with deep reinforcement learning
  • Proximal Policy Optimization (PPO; Schulman et al., 2017), implemented with tf-agents
  • 10 replications, manually tuned hyper-parameters
  • Manual tweaking of input representation & initialization
  [Figure: learning curve; reward (0–0.8) vs. simulated episodes (0–2 million)]

  30. Optimizing meta-level return with deep reinforcement learning
  • Proximal Policy Optimization (PPO; Schulman et al., 2017), implemented with tf-agents
  • 10 replications, manually tuned hyper-parameters
  • Manual tweaking of input representation & initialization
  [Figure: same learning curve, with a reference level labelled “NG” marked]

  31. Does the optimal policy match humans?
  [Figure: “Model” and “Human” panels; labels: Start, End]

  32. Does the optimal policy match humans?
  [Figure: “Model” and “Human” panels, per-object view; labels: Object, Start, End]

  33. Which features drive human search?
  [Figure: “Model” and “Human” panels; labels: Start, End]

  34. Ongoing work
  • Alternative schemes for extracting low-dimensional feature representations of objects
  ‣ Deep convolutional neural network models of the human ventral visual stream (Yamins et al., 2014; Fan et al., 2019)
  ‣ MeshNet model of 3D shape representation (Feng et al., 2018)

  35. Ongoing work
  • Alternative schemes for extracting low-dimensional feature representations of objects
  ‣ Deep convolutional neural network models of the human ventral visual stream (Yamins et al., 2014; Fan et al., 2019)
  ‣ MeshNet model of 3D shape representation (Feng et al., 2018)
  • Investigating the learned policy
  ‣ Is it optimal?

  36. Thank you!
