Dealing with Uncertainty
Navigation among Moving Obstacles

A robot with imperfect sensing must reach a goal location among moving obstacles (a dynamic world).

Model, Sensing, and Control

• The robot was created at Stanford's ARL Lab to study issues in robot control and robot planning in a no-gravity space environment. It floats on an air bearing over a granite table, carries a gas tank, and is propelled by air thrusters.
• The robot and the obstacles are represented as discs moving in the plane.
• The position and velocity of each disc are measured by an overhead camera every 1/30 sec.
• The robot's configuration is q = (x, y), its state is s = (q, q'), and its control is u = (f, α): the robot controls the magnitude f (bounded above) and the orientation α of the total pushing force exerted by the thrusters. The dynamics are

    x'' = (f/m) cos α
    y'' = (f/m) sin α

Motion Planning

• The robot plans its trajectories in configuration×time space using a probabilistic roadmap (PRM) method.
• Obstacles map to cylinders in configuration×time space.
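As a concrete reading of these dynamics, here is a minimal simulation sketch. The Euler integrator, the unit mass, and the function name `step` are illustrative assumptions; only the equations of motion and the 1/30 sec camera period come from the slides.

```python
import math

def step(state, f, alpha, m=1.0, dt=1.0 / 30):
    """One Euler step of the slides' dynamics:
    x'' = (f/m) cos(alpha), y'' = (f/m) sin(alpha)."""
    x, y, vx, vy = state
    ax = (f / m) * math.cos(alpha)
    ay = (f / m) * math.sin(alpha)
    return (x + vx * dt, y + vy * dt, vx + ax * dt, vy + ay * dt)
```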

But executing this trajectory is likely to fail ...

1) The measured velocities of the obstacles are inaccurate.
2) Tiny particles of dust on the table affect trajectories and contribute further to deviation.
   → Obstacles are likely to deviate from their expected trajectories.
3) Planning takes time, and during this time, obstacles keep moving.
   → The computed robot trajectory is not properly synchronized with those of the obstacles.
   → The robot may hit an obstacle before reaching its goal.

[Robot control is not perfect but "good" enough for the task]

Planning must take both uncertainty in world state and time constraints into account.

Dealing with Uncertainty

• The robot can handle uncertainty in an obstacle's position by representing the set of all positions of the obstacle that the robot thinks possible at each time (a belief state).
• For example, this set can be a disc whose radius grows linearly with time: the initial set of possible positions at t = 0 grows to larger discs at t = T and t = 2T, and the robot must plan to be outside the disc at each time.
• The forbidden regions in configuration×time space are then cones, instead of cylinders.
• The trajectory planning method remains essentially unchanged.

Dealing with Planning Time

• Let t = 0 be the time when planning starts. A time limit δ is given to the planner.
• The planner computes the states that will be possible at t = δ and uses them as the possible initial states.
• It returns a trajectory at some t ≤ δ, whose execution will start at t = δ.
• Since the PRM planner isn't absolutely guaranteed to find a solution within δ, it computes two trajectories using the same roadmap: one to the goal, the other to any position where the robot will be safe for at least an additional δ. Since there are usually many such positions, the second trajectory is at least one order of magnitude faster to compute.
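To make the growing-disc model concrete, here is a sketch of the point-in-cone test a planner could run when checking a candidate robot position against one obstacle. The growth rate `eps` and the tuple layout of `obstacle` are assumptions for illustration, not the system's actual data structures.

```python
import math

def is_forbidden(robot_xy, t, obstacle, r_robot, eps):
    """Growing-disc belief state: an obstacle observed at position p
    with velocity v may be anywhere within eps * t of its predicted
    position p + v*t; swept over time, this disc traces a cone in
    configuration x time space."""
    (px, py), (vx, vy), r_obs = obstacle
    cx, cy = px + vx * t, py + vy * t           # predicted center at time t
    d = math.hypot(robot_xy[0] - cx, robot_xy[1] - cy)
    return d <= r_robot + r_obs + eps * t       # inside the cone -> forbidden
```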

Are we done?

• Not quite! The uncertainty model may itself be incorrect, e.g.:
  • There may be more dust on the table than anticipated.
  • Some obstacles have the ability to change trajectories.
• But if we are too careful, we will end up with forbidden regions so big that no solution trajectory will exist any more.
• So, it might be better to take some "risk": the robot must monitor the execution of the planned trajectory and be prepared to re-plan a new trajectory.
• Execution monitoring consists of using the camera (at 30 Hz) to verify that all obstacles are at positions allowed by the robot's uncertainty model. If an obstacle has an unexpected position, the planner is called back to compute a new trajectory.

Experimental Run

[Video of an experimental run; total duration: 40 sec]

Is this guaranteed to work?

Of course not:
• Thrusters might get clogged.
• The robot may run out of air or battery.
• The granite table may suddenly break into pieces.
• Etc ...
[Unbounded uncertainty]

Target-Tracking Example

• The robot must keep a target in its field of view.
• The robot has a prior map of the obstacles.
• But it does not know the target's trajectory in advance.
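A sketch of this execution-monitoring loop, assuming hypothetical `robot`, `camera`, and `planner` interfaces (none of these names come from the original system; they only stand in for the behavior the slides describe):

```python
def monitor_and_execute(robot, camera, planner, delta):
    """30 Hz execution monitoring with re-planning: every frame, each
    observed obstacle is checked against the uncertainty model; a
    violation triggers a re-plan from the states predicted to be
    possible delta seconds ahead (when execution would start)."""
    trajectory = planner.plan_to_goal(robot.predict_state(delta))
    while not robot.at_goal():
        for obstacle in camera.observe():        # new frame every 1/30 sec
            if not robot.model_allows(obstacle):
                # Two trajectories from the same roadmap: one to the
                # goal, and a much cheaper escape trajectory to a
                # position safe for at least another delta.
                start = robot.predict_state(delta)
                trajectory = (planner.plan_to_goal(start)
                              or planner.plan_to_safe_state(start, delta))
                break
        robot.follow(trajectory, duration=1 / 30)
```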

Target-Tracking Example (continued)

• Time is discretized into small steps of unit duration.
• At each time step, each of the two agents moves by at most one increment along a single axis.
• The two moves are simultaneous.
• The robot senses the new position of the target at each step.
• The target is not influenced by the robot (a non-adversarial, non-cooperative target).
[Potential collisions are ignored to simplify the presentation]

Time-Stamped States (no cycles possible)

• State = (robot-position, target-position, time)
• In each state, the robot can execute 5 possible actions: {stop, up, down, right, left}.
• Each action has 5 possible outcomes (one for each possible action of the target), with some probability distribution.
• For example, the action "right" maps state ([i,j], [u,v], t) to:
  • ([i+1,j], [u,v], t+1)
  • ([i+1,j], [u-1,v], t+1)
  • ([i+1,j], [u+1,v], t+1)
  • ([i+1,j], [u,v-1], t+1)
  • ([i+1,j], [u,v+1], t+1)

Rewards and Costs

• The robot must keep seeing the target as long as possible.
• Each state where it does not see the target is terminal.
• The reward collected in every non-terminal state is 1; it is 0 in each terminal state.
  [→ The sum of the rewards collected in an execution run is exactly the amount of time the robot sees the target]
• There is no cost for moving vs. not moving.

Expanding the state/action tree

[Figure: the state/action tree expanded from the current state, from horizon 1 to horizon h]

Estimating the utility of a leaf

• But how to estimate the utility of a leaf at horizon h?
• Compute the shortest distance d for the target to escape the robot's current field of view.
• If the maximal velocity v of the target is known, estimate the utility of the state as d/v [a conservative estimate; see the sketch below].
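A sketch of this transition model and leaf estimate. The probability table `target_dist` (mapping each target move to its probability) and the domain-specific `escape_distance` function are assumptions; the 5x5 action/outcome structure and the d/v estimate are from the slides.

```python
ACTIONS = {"stop": (0, 0), "up": (0, 1), "down": (0, -1),
           "right": (1, 0), "left": (-1, 0)}

def successors(state, target_dist):
    """Each robot action has 5 outcomes, one per possible target
    move, weighted by the (assumed) distribution target_dist."""
    (ri, rj), (ti, tj), t = state
    for action, (dx, dy) in ACTIONS.items():
        outcomes = [(((ri + dx, rj + dy), (ti + ex, tj + ey), t + 1), p)
                    for (ex, ey), p in target_dist.items()]
        yield action, outcomes

def leaf_utility(state, escape_distance, v_max):
    """Conservative estimate from the slides: the target needs at
    least d / v_max time steps to escape the field of view."""
    return escape_distance(state) / v_max
```

For a uniform target model one could pass `target_dist = {move: 0.2 for move in ACTIONS.values()}`.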

Selecting the next action

• Compute the optimal policy over the state/action tree, using the estimated utilities at the leaf nodes.
• Execute only the first step of this policy.
• Repeat everything again at t+1 (a sliding horizon).
• Real-time constraint: h is chosen so that a decision can be returned in unit time. [A larger h may result in a better decision, but one that arrives too late!]

Pure Visual Servoing

[Video demonstration]

Computing and Using a Policy

[Video demonstration]
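A minimal sketch of the sliding-horizon decision step described above: expectimax over the finite-horizon tree, returning only the first action. It reuses the assumed hooks from the previous sketch; `leaf_utility` is taken here as a one-argument closure already bound to `escape_distance` and `v_max` (e.g. via `functools.partial`), and `visible` is an assumed visibility test.

```python
def best_action(state, h, successors, target_dist, visible, leaf_utility):
    """Expectimax to depth h; execute only the returned first action,
    then re-plan at t+1.  h must be small enough that this call
    returns within one time step."""
    def value(s, depth):
        if not visible(s):
            return 0.0                    # terminal: target lost, reward 0
        if depth == 0:
            return leaf_utility(s)        # conservative d / v_max estimate
        best = max(sum(p * value(s2, depth - 1) for s2, p in outs)
                   for _, outs in successors(s, target_dist))
        return 1.0 + best                 # reward 1 in non-terminal states
    return max(successors(state, target_dist),
               key=lambda ao: sum(p * value(s2, h - 1) for s2, p in ao[1]))[0]
```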
