
Inferring User Intent for Learning by Observation
Kevin R. Dixon (krd@cs.cmu.edu)
Department of Electrical & Computer Engineering, Carnegie Mellon University
2004-01-23


  1–2. Learning Algorithm Overview
    Input: a set of demonstrations. Output: a continuous-density hidden Markov model (CDHMM) describing the demonstrations.
    Requirements:
    - Continuous-density observations
    - The CDHMM should be simple
    - Low computational complexity
    - Correctness

  3. Our Approach
    - Consider each task as a random walk through a target CDHMM
    - Assign each observation to a node in a graph [figure: observations x1, x2, x3 assigned to nodes]
    - Repeatedly merge similar nodes
    - Use fixed-topology estimation on the resulting structure
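
A minimal sketch of the merge loop, assuming a Euclidean distance threshold as the similarity test. The talk's "loose" and "strict" similarity criteria and the final fixed-topology re-estimation step are not shown, and all names here are illustrative:

```python
import numpy as np

def build_merged_graph(demonstrations, threshold):
    """demonstrations: iterable of (T_i, d) observation arrays, one per task."""
    nodes = []        # each node: {'mean': running mean, 'count': merge count}
    transitions = []  # (from_node, to_node) index pairs, with repeats

    def find_similar(x):
        for i, node in enumerate(nodes):
            if np.linalg.norm(node["mean"] - x) < threshold:
                return i
        return None

    for demo in demonstrations:
        prev = None
        for x in demo:
            i = find_similar(x)
            if i is None:                          # dissimilar: create a new node
                nodes.append({"mean": np.array(x, dtype=float), "count": 1})
                i = len(nodes) - 1
            else:                                  # similar: merge into node i
                node = nodes[i]
                node["count"] += 1
                node["mean"] += (x - node["mean"]) / node["count"]
            if prev is not None:
                transitions.append((prev, i))
            prev = i
    return nodes, transitions
```

Note that the linear scan in `find_similar` makes the loop quadratic in the number of observations in the worst case, consistent with the complexity claim on slide 7.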

  4–6. Similarity Merging in Action
    [Figure: an original 33-node graph and the graphs that result from merging it under the "loose" and "strict" definitions of similarity]

  7. Properties of the Algorithm
    - The algorithm only produces graphs in which all similar nodes are merged and all dissimilar nodes remain unmerged
    - This implies a locally minimal number of nodes, but not necessarily a globally minimal one
    - Worst-case computational complexity is quadratic in the number of observations

  8. Correctness
    - The probability of error decreases exponentially as more tasks are added [plot: probability of error vs. number of tasks]
    - However, the theorem deals with estimating individual states, not a sequence of observations

  9. Outline
    Background · Modeling the User · Predictive Robot Programming · Hypothesizing about User Actions · Learning By Observation · Conclusions and Open Issues

  10. Theory Meets Humanity
    - The properties of the learning algorithm assume ideal conditions; it remains to be seen how the algorithm works on real-world data
    - Test it in relative isolation with an application: Predictive Robot Programming

  11. Predictive Robot Programming
    - As capabilities increase, robots are performing more sophisticated tasks
    - Simple tasks take days to program; complex tasks take weeks or months
    - Significant programming time means that production may have to be halted temporarily
    - Decreasing programming time will increase the appeal of robotic automation

  12. Predictive Robot Programming
    - Most tasks can be decomposed into simpler subtasks
    - Subtasks may be repeated many times throughout a program
    - However, most robot programmers recreate the subtasks from scratch each time

  13. Visualizing Similarity
    [Figure: 3D plots of waypoints (x, y, z in meters) from Task A and Task B]
    - Waypoints describe the location and orientation of the end effector
    - The waypoints come from two different subroutines: they are different, but contain a common pattern that has been translated and rotated

  14. Predictive Robot Programming
    Key idea behind Predictive Robot Programming: learn from previous user actions, and reduce programming time by identifying and completing subtasks automatically. Word-completion programs are an analogy.

  15. Predictive Robot Programming
    - The user supplies subgoals y_0, y_1, ..., y_n; the environment is ignored
    - The system predicts only the next subgoal, ŷ_{n+1} (maximum likelihood)
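
A hedged sketch of what maximum-likelihood next-subgoal prediction could look like for a Gaussian CDHMM: run the forward (filtering) recursion over the subgoals observed so far, propagate the state belief one step through the transition matrix, and emit the mean of the most likely next state. The isotropic-Gaussian emission model and all names are assumptions for illustration, not the talk's implementation:

```python
import numpy as np

def predict_next_subgoal(A, mu, sigma, observed):
    """A: (K, K) transition matrix; mu: (K, d) state means; sigma: emission
    std. dev.; observed: (n, d) subgoals seen so far in the current task."""
    K = A.shape[0]
    belief = np.full(K, 1.0 / K)                 # uniform state prior
    for y in observed:
        lik = np.exp(-0.5 * np.sum((mu - y) ** 2, axis=1) / sigma ** 2)
        belief *= lik                            # condition on the observation
        belief /= belief.sum()                   # normalize (underflow guard)
        belief = belief @ A                      # propagate one step forward
    k = int(np.argmax(belief))                   # most likely next state
    return mu[k], belief                         # predicted subgoal and belief
```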

  16. Predictive Robot Programming
    We present two sets of results:
    - Offline programming: prediction accuracy on real-world data
    - Online programming: decrease in programming time in the laboratory

  17. Offline-Programming Context
    - 5 arc-welding programs, producing different products
    - 252–1899 waypoints and 16–196 subroutines per program
    - The programs took a professional robot programmer over 70 days to create

  18. Offline-Programming Methodology
    [Flowchart: initialize the CDHMM; for each subroutine in the program, add it to the CDHMM; for each waypoint, predict the next waypoint and, when confidence is sufficient, compute the prediction error]
    Repeat this process for each robot program.

  19. Model Complexity
    - Similarity is quantified by δ ∈ (0, 1]: δ → 0 induces simple CDHMMs, δ → 1 induces complex CDHMMs
    [Plots: number of states (about 140–190) and running time per waypoint (about 0.018–0.03 s) as functions of δ]

  20. Modeling the User
    [Plot: average median prediction error (m) vs. δ]
    - When δ is too small, the CDHMM is too simple; when δ is too large, the CDHMM overfits
    - A value of δ ∈ (0.5, 0.9) induces the "right" complexity

  21–22. Prediction Confidence
    - Regardless of the criterion, a prediction will always exist, but the system should suggest only its most accurate predictions; this requires a causal statistic
    - Compute a prediction confidence φ_n ∈ [0, 1] based on the CDHMM's entropy from observing the current task: high entropy yields low confidence, and low entropy yields high confidence
    - Filter predictions with a confidence threshold
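
One plausible reading of the confidence statistic, sketched below: take the entropy of the filtered state belief (for example, `belief` from the prediction sketch above), normalize it against the maximum (uniform) entropy, and invert. The slides do not give the exact formula, so treat this as an assumption:

```python
import numpy as np

def confidence(belief, eps=1e-12):
    """belief: filtered CDHMM state distribution (assumes more than one state)."""
    p = np.clip(belief, eps, None)
    p = p / p.sum()
    H = -np.sum(p * np.log(p))       # entropy of the state distribution
    H_max = np.log(len(p))           # maximum entropy (uniform distribution)
    return 1.0 - H / H_max           # phi_n: high entropy -> low confidence

# Suggest a prediction only when it clears a confidence threshold:
# if confidence(belief) >= 0.8: suggest(prediction)
```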

  23. Prediction Confidence
    [Plot: percentage of useful predictions and of total predictions vs. confidence threshold]
    - A useful prediction is within 1 millimeter of the target
    - Confidence correlates with prediction error at −0.89 (p ≪ 0.01)

  24. Temporal Performance
    [Plot: median prediction error (m, log scale) vs. waypoint number]
    - Error generally decreases as more waypoints are added
    - Median: 190 microns, 0.38% Cartesian error

  25. Offline-Programming Results
    - The majority of predictions are useful
    - Median errors: 30–200 microns Cartesian error (0.03%–0.2%)
    - Running time per waypoint (construct the CDHMM and predict): 30 ms

  26. Online-Programming Setup
    - The ultimate criterion is the reduction in programming time
    - We collected 44 robot programs from 3 users in a laboratory setting

  27. Online-Programming Results

    Prediction criterion | mean (sec) | std (sec) | change   | Wilcoxon conf.
    Baseline             | 292.2      | 78.61     | N/A      | N/A
    φ_n ≥ 0.8            | 193.2      | 32.07     | −33.88%  | 99.95%
    φ_n ≥ 0.5            | 178.0      | 33.39     | −39.08%  | 99.99%

    - Rows give the programming time used to complete the tasks with no predictions, high-confidence predictions, and low-confidence predictions
    - There is a statistically significant drop when using predictions, but no statistical significance between the two prediction criteria
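
The Wilcoxon confidence column could be computed along these lines. Whether the original analysis used the rank-sum or the paired signed-rank variant is not stated, so this sketch (SciPy's rank-sum test) is an assumption:

```python
from scipy.stats import ranksums

def wilcoxon_confidence(baseline_times, prediction_times):
    """Two-sided rank-sum test on per-task programming times; returns the
    confidence that the two distributions differ, e.g. p = 0.0005 -> 99.95%."""
    _, p_value = ranksums(baseline_times, prediction_times)
    return 100.0 * (1.0 - p_value)
```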

  28. Limitations
    - Indicating the location of a prediction to the user
    - Collisions are a problem
    - A monolithic CDHMM does not represent entire tasks well

  29. Outline
    Background · Modeling the User · Predictive Robot Programming · Hypothesizing about User Actions · Learning By Observation · Conclusions and Open Issues

  30. Shelter from the Storm
    - The algorithm appears to model user actions well, but these were relatively sheltered conditions
    - Next, inject more realistic factors

  31–32. Learning By Observation
    - Allow users to program mobile robots by demonstrating a trajectory
    - Consider a vacuum-cleaning robot: it is undesirable to require retraining each time the furniture is moved
    - Create a system that automates motor-skill tasks regardless of environment, occlusion, and noise (within reason, of course)

  33. Methodology
    [Flowchart: observe the user → compute subgoals → associate subgoals with the environment → map demonstrations to the same environment while more demonstrations remain → learn from the demonstrations → perform the task]
    - Predictive Robot Programming exercised only the "learn from demos" and "perform task" steps
    - We have now added sensor issues, extraction of subgoals, and environment considerations

  34. Agent Orange
    - ActivMedia Pioneer DX II
    - SICK scanning laser range finder
    - Carmen toolkit (Montemerlo et al., 2003)

  35. Observing the User
    [Figures: a map of the laboratory and a sample laser scan showing walls, computer-chair legs, human legs, a desk, and Agent Orange]
    - Object occlusion is handled by Kalman filters
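
A minimal sketch of how Kalman-filter occlusion handling can work: track each leg with a constant-velocity filter and, while the laser return is occluded, run only the predict step so the track coasts on its velocity estimate. The talk does not specify its filter model; the dynamics, noise levels, and names below are assumptions:

```python
import numpy as np

def make_cv_filter(dt, q=0.05, r=0.02):
    """Constant-velocity model with state [x, y, vx, vy]; position observed."""
    F = np.eye(4); F[0, 2] = F[1, 3] = dt        # constant-velocity dynamics
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1  # laser observes position only
    Q = q * np.eye(4)                            # process noise
    R = r * np.eye(2)                            # measurement noise
    return F, H, Q, R

def kalman_step(x, P, F, H, Q, R, z=None):
    """One filter step; pass z=None during an occlusion to predict only."""
    x = F @ x; P = F @ P @ F.T + Q               # predict
    if z is not None:                            # update (skipped when occluded)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
    return x, P
```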

  36. Hypothesizing about User Actions
    Requirements for a method of representing trajectories:
    - Emulate "what the user would have done" in novel conditions
    - The ability to be mapped to different environments
    - Incorporate multiple examples
    - Facilitate learning

  37–39. Representing Trajectories
    There has been much work in representing trajectories:
    - Cubic splines (Craig, 1989)
    - Bézier curves (Hwang et al., 2003)
    - Predicate calculus (Nicolescu & Matarić, 2001)
    - Nonlinear differential equations (Schaal et al., 2003)
    We want to store user trajectories in a generative manner. Our approach: sequenced Linear Dynamical Systems (LDS).

  40. Sequenced LDS
    - Segment the trajectory at important points, the subgoals
    - Represent each segment as a single Linear Dynamical System
    - Reconstruct the trajectory by running the LDS estimates in sequence

  41–44. LDS Example
    [Figure: a demonstrated trajectory from a start point]
    - Fit the trajectory with the least-squares LDS, then reproduce it with the estimated LDS
    - The response of the LDS to various initial conditions, or to different subgoals, represents a hypothesis of user intent

  45. LDS Example
    [Figure: responses of the estimated LDS from two different start points]

  46–48. The Building Blocks
    Suppose we have a trajectory X = {x_0, x_1, ..., x_N}, and assume the data are generated according to

        x_{n+1} = R(x_n − x_N) + x_n

    - The matrix R captures direction, curvature, and speed
    - The subgoal x_N specifies the trajectory terminus

  49–51. LDS Estimation
    Assume the subgoal is the last point in the trajectory segment. Writing X_{1:N} = [x_1 ··· x_N], X_{0:N−1} = [x_0 ··· x_{N−1}], and Γ_N = [x_N ··· x_N] (N copies of x_N), the least-squares solution for R is

        R̂ = ([x_1 ··· x_N] − [x_0 ··· x_{N−1}]) ([x_0 ··· x_{N−1}] − [x_N ··· x_N])†
           = (X_{1:N} − X_{0:N−1}) (X_{0:N−1} − Γ_N)†

    where † denotes the pseudoinverse. Yes, it is a closed-form solution.

  52. Reconstructing Trajectories
    Use the induced control law with x̂_0 = x_0 and

        x̂_{n+1} = R̂(x̂_n − x_N) + x̂_n

    We guarantee stability under reasonable conditions: trajectories are bounded and terminate at the desired subgoal.
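
The estimator and control law translate into a few lines of NumPy. The sketch below assumes the reconstruction of the formulas above (trajectory points stored one per column) and uses `np.linalg.pinv` for the pseudoinverse:

```python
import numpy as np

def fit_lds(X):
    """Closed-form LDS fit. X: (d, N+1) matrix of points x_0..x_N as columns."""
    X0, X1 = X[:, :-1], X[:, 1:]                   # X_{0:N-1} and X_{1:N}
    Gamma = np.tile(X[:, [-1]], (1, X0.shape[1]))  # N copies of the subgoal x_N
    return (X1 - X0) @ np.linalg.pinv(X0 - Gamma)  # R-hat

def rollout(R, x0, xN, steps):
    """Reconstruct a trajectory from the induced control law."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        xs.append(R @ (xs[-1] - xN) + xs[-1])      # x_{n+1} = R(x_n - x_N) + x_n
    return np.stack(xs, axis=1)
```

Running `rollout` from a new `x0` or toward a new `xN` is exactly the "response to various initial conditions or different subgoals" used as a hypothesis of user intent.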

  53–55. Improving the Generalization
    [Figures: two demonstrated trajectories with "slight counter-clockwise curvature", and the responses of the LDS estimated from them]
    Learning from more trajectories improves generalization.

  56–57. A Single LDS Is Not Enough
    - A single LDS is an extremely compact representation, but its simplicity is insufficient for LBO tasks
    - Instead, segment complicated trajectories based on predictability

  58–59. A Single LDS Is Not Enough
    - Use the LDS to predict the next observation, and segment the trajectory at poorly predicted points
    - These points mark the subgoals in the trajectory (see the sketch below)
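
A hedged sketch of predictability-based segmentation, reusing `fit_lds` from above: grow the current segment one point at a time, refit, and cut a new segment wherever the newest one-step prediction residual exceeds a threshold. The windowing, residual norm, and minimum segment length are our assumptions, not the talk's specification:

```python
import numpy as np

def segment_by_predictability(X, threshold, min_len=5):
    """X: (d, T) trajectory. Returns indices of detected subgoals."""
    subgoals, start = [], 0
    t = start + min_len
    while t < X.shape[1]:
        seg = X[:, start:t + 1]
        R = fit_lds(seg)                              # terminus = newest point
        X0, X1 = seg[:, :-1], seg[:, 1:]
        Gamma = np.tile(seg[:, [-1]], (1, X0.shape[1]))
        resid = (X1 - X0) - R @ (X0 - Gamma)          # one-step prediction errors
        if np.linalg.norm(resid[:, -1]) > threshold:  # newest point poorly predicted
            subgoals.append(t - 1)                    # cut the segment here
            start = t - 1
            t = start + min_len
        else:
            t += 1
    subgoals.append(X.shape[1] - 1)                   # the final point is a subgoal
    return subgoals
```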

  60. Trading Simplicity for Accuracy
    [Plots: average trajectory error and number of subgoals vs. prediction-error threshold, with thresholds of 0.4 and 0.8 marked]

  61. Outline
    Background · Modeling the User · Predictive Robot Programming · Hypothesizing about User Actions · Learning By Observation · Conclusions and Open Issues

  62. The Real World
    - We now have a model of user actions and hypotheses about user actions under different conditions
    - We apply these abilities to learning motor skills on mobile robots

  63. Observing the User
    Place slalom cones in the lab and track the user.

  64. Learning from the User
    - Estimating the user trajectory: 6 subgoals were extracted
    - The average error of the estimate is 20 millimeters

  65. Learning from the User
    - 10 runs of the robot
    - 5 runs where the robot was "kidnapped"

  66. Environment Changes
    "What would the user have done?"
    - Automatically associate subgoals with objects in the environment
    - The LDS estimates automatically adjust their responses

  67–68. Environment Changes
    "What would the user have done?" We asked the user to perform the same task in the changed environment; the average error of the estimate is 200 millimeters.

  69. Learning in Different Environments
    Can demonstrations from these two environments help in performing in another?

  70–73. Learning in Different Environments
    Learning from demonstrations in different environments:
    - First, map the subgoals to the same environment
    - Construct a CDHMM describing the subgoals
    - Determine the most-likely sequence of subgoals needed to complete the task (see the sketch below)
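
The slides name the most-likely subgoal sequence but not the algorithm used to find it; the Viterbi algorithm is the standard choice for this quantity on an HMM, so the sketch below is an assumption rather than the talk's code:

```python
import numpy as np

def viterbi(logA, log_emissions, log_pi):
    """logA: (K, K) log transitions; log_emissions: (T, K) per-step log
    likelihood of each observed subgoal under each state; log_pi: (K,) prior."""
    T, K = log_emissions.shape
    score = log_pi + log_emissions[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + logA              # scores via each predecessor
        back[t] = np.argmax(cand, axis=0)         # best predecessor per state
        score = cand[back[t], np.arange(K)] + log_emissions[t]
    path = [int(np.argmax(score))]
    for t in range(T - 1, 0, -1):                 # backtrace the best path
        path.append(int(back[t, path[-1]]))
    return path[::-1]                             # most-likely state sequence
```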

  74. Learning in Different Environments
    [Figure: two demonstrations and their corresponding subgoals]

  75. Learning in Different Environments
    [Figure: the CDHMM describing the subgoals, with the most-likely sequence shown in green]

  76. Learning in Different Environments
    - Individually, the average errors are 364 and 236 millimeters
    - Together, the average error is 189 millimeters

  77. LBO Discussion
    - The computational approach appears viable
    - It permits learning from multiple demonstrations and from demonstrations in different environments
    - Environment mapping is the weakest aspect: the Hungarian Method requires a bijective mapping
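
A hedged sketch of subgoal-to-object association via the Hungarian Method, using SciPy's `linear_sum_assignment` on a Euclidean cost matrix (the distance-based cost is our assumption). It also makes the slide's limitation concrete: the assignment is forced to be one-to-one, so unequal numbers of subgoals and objects cannot be handled gracefully:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def map_subgoals_to_objects(subgoals, objects):
    """subgoals: (m, d) positions; objects: (n, d) positions.
    Returns (subgoal_index, object_index) pairs of the minimum-cost matching;
    a true bijection requires m == n."""
    cost = np.linalg.norm(subgoals[:, None, :] - objects[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # Hungarian Method
    return list(zip(rows, cols))
```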
