Hayato Kobayashi, Tsugutoyo Osaki, Tetsuro Okuyama, Akira Ishino, and Ayumi Shinohara - PowerPoint PPT Presentation

SLIDE 1

Hayato Kobayashi, Tsugutoyo Osaki, Tetsuro Okuyama, Akira Ishino, and Ayumi Shinohara Tohoku University, Japan (Team Jolly Pochie)

2008/7/17 RoboCup Symposium 2008 in Suzhou, China

SLIDE 2

We need to check the screen and the field at the same time.

Figure labels: Ball position, Robot position

https://youtu.be/mB5MuDy9GFw

SLIDE 3

Augmented reality (AR) can alleviate this difficulty.

Figure labels: Camera, Augmented environment

https://youtu.be/yGzA6hC9YY8

SLIDE 4

The augmented environment plays an intermediate role.

  • Simulated environment in “Haribote”, developed by team ARAIBO
  • Augmented environment
  • Real environment

Figure labels: Real robot, Virtual ball, Virtual robot, Perceptible, Touchable

SLIDE 5

Augmented Soccer Field System

Figure labels: Camera, Projector, Recognition Program, Virtual Application

SLIDE 6

  • Extraction of contours by a background subtraction method
  • Identification of robots’ orientation by a template matching method

Figure labels: Robot, Other object
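The background subtraction step named on this slide can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists, a fixed intensity threshold, and a bounding box standing in for a true contour. The function names and the threshold value are hypothetical and are not taken from the authors' recognition program.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image: rows of 0..255 intensities


def foreground_mask(frame: Frame, background: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(p - b) > threshold for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the foreground pixels, or None if there are none."""
    coords = [(y, x) for y, row in enumerate(mask) for x, on in enumerate(row) if on]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return min(ys), min(xs), max(ys), max(xs)


# Hypothetical data: an empty field, then the same field with a bright object on it.
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
for y in range(1, 3):
    for x in range(2, 5):
        frame[y][x] = 200

mask = foreground_mask(frame, background)
print(bounding_box(mask))  # (1, 2, 2, 4)
```

A real system would typically follow this with contour tracing and then compare the extracted region against rotated templates to estimate orientation. Those steps are omitted here.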

SLIDE 7

  • From the real environment to the virtual application: positions of real objects (e.g., real robots)
  • From the virtual application to the real environment: positions of virtual objects (e.g., virtual ball and robots)

SLIDE 8

Robots can interact with the virtual ball.
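One way such an interaction could work is sketched below, assuming the recognized robot is approximated by a circle and the virtual ball bounces off it elastically. The slides do not describe the collision model, so the function name, the circular approximation, and the reflection rule are all assumptions made for illustration.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def bounce_off_robot(ball_pos: Vec, ball_vel: Vec, robot_pos: Vec,
                     robot_radius: float, ball_radius: float) -> Vec:
    """Return the ball velocity after a possible collision with a circular robot."""
    dx = ball_pos[0] - robot_pos[0]
    dy = ball_pos[1] - robot_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist > robot_radius + ball_radius:
        return ball_vel  # no contact (or degenerate overlap): velocity unchanged
    nx, ny = dx / dist, dy / dist  # unit normal pointing from robot to ball
    approach = ball_vel[0] * nx + ball_vel[1] * ny
    if approach >= 0.0:
        return ball_vel  # ball is already moving away from the robot
    # Mirror the velocity about the contact normal: v' = v - 2 (v . n) n
    return (ball_vel[0] - 2 * approach * nx, ball_vel[1] - 2 * approach * ny)


# Hypothetical numbers: the ball touches the robot while moving toward it.
print(bounce_off_robot((10.0, 0.0), (-5.0, 1.0), (0.0, 0.0), 8.0, 3.0))  # (5.0, 1.0)
```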

SLIDE 9

  • Essential for robot soccer
  • No lost point, no lost game
  • Learning has been difficult so far

SLIDE 10

  • Human intervention
  • Time consuming
  • Motor failure

Ex. Learning of goal-saving skills in the real environment

  • Stroke its head for a successful save
  • Spank its head for a failed save

https://youtu.be/9oHA-GH9JT8
https://youtu.be/3Pluuk20xqs

SLIDE 11

  • Gap from real environments
  • Serious, especially for legged movements

Figure: a gap separates the simple simulator (without any difficulties) from the real environment (human intervention, a time-consuming process, and fear of motor failure).

SLIDE 12

  • To bridge the gap
  • Using the movements of real robots
  • To allow autonomous learning
  • Using the convenience of virtual balls

  • Easy mode: simple simulator without any difficulties
  • Normal mode: augmented environment without human intervention
  • Hard mode: real environment with many difficulties

SLIDE 13

  • Acquire a mapping from states to actions that maximizes the sum of rewards
  • Sarsa(λ) [Rummery and Niranjan 1994; Sutton 1996]
  • Tile-coding (aka CMACs [Albus 1975])

Figure: Agent (AIBO) and Environment, exchanging action a_t, state s_t, and reward r_{t+1}
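The Sarsa(λ) update can be sketched as follows, assuming linear function approximation over binary features (such as active tiles), replacing eligibility traces, and ε-greedy action selection. The class name and all hyperparameter values are hypothetical. The slides do not give the settings the authors used.

```python
import random
from collections import defaultdict
from typing import Hashable, Sequence


class SarsaLambda:
    """Sarsa(lambda) with binary features and replacing traces (illustrative sketch)."""

    def __init__(self, n_actions: int, alpha: float = 0.1, gamma: float = 0.95,
                 lam: float = 0.9, epsilon: float = 0.1, seed: int = 0) -> None:
        self.n_actions = n_actions
        self.alpha, self.gamma, self.lam, self.epsilon = alpha, gamma, lam, epsilon
        self.w = defaultdict(float)  # one weight per (feature, action) pair
        self.z = defaultdict(float)  # eligibility trace per (feature, action) pair
        self.rng = random.Random(seed)

    def q(self, features: Sequence[Hashable], action: int) -> float:
        """Action value: sum of the weights of the active features."""
        return sum(self.w[(f, action)] for f in features)

    def choose(self, features: Sequence[Hashable]) -> int:
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = [self.q(features, a) for a in range(self.n_actions)]
        return values.index(max(values))

    def start_episode(self) -> None:
        self.z.clear()  # traces do not carry over between episodes

    def update(self, features, action, reward, next_features, next_action, done) -> None:
        """Apply one on-policy TD update for the transition (s, a, r, s', a')."""
        target = reward if done else reward + self.gamma * self.q(next_features, next_action)
        delta = target - self.q(features, action)
        for f in features:
            self.z[(f, action)] = 1.0  # replacing traces
        for key, trace in list(self.z.items()):
            self.w[key] += self.alpha * delta * trace
            self.z[key] = self.gamma * self.lam * trace  # decay all traces


agent = SarsaLambda(n_actions=10)
agent.start_episode()
first_action = agent.choose(["tile-a", "tile-b"])  # feature names are placeholders
```

With several active features per state, the step size is commonly divided by the number of active features. That refinement is left out to keep the sketch short.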

SLIDE 14

State representation (r, φ, Vv, Vh, X, Y)

  • (r, φ): Ball position
  • (Vv, Vh): Ball velocity
  • (X, Y): Robot position

(We removed the orientation of the robot by considering the penalty kick (PK) situation only.)
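To show how a six-dimensional state like this one could be fed to a tile-coding approximator, here is a minimal sketch. It assumes uniform grids and evenly offset tilings. The value ranges, the number of tilings, and the tiles per dimension are hypothetical, since the slides do not state them.

```python
import math
from typing import List, Sequence, Tuple


def active_tiles(state: Sequence[float], lows: Sequence[float], highs: Sequence[float],
                 n_tilings: int = 4, tiles_per_dim: int = 8) -> List[Tuple[int, ...]]:
    """Return one active tile per tiling, each identified by (tiling, index per dimension)."""
    tiles = []
    for t in range(n_tilings):
        coords = []
        for x, lo, hi in zip(state, lows, highs):
            width = (hi - lo) / tiles_per_dim
            offset = t * width / n_tilings  # each tiling is shifted by a fraction of a tile
            idx = math.floor((x - lo + offset) / width)
            idx = min(max(idx, 0), tiles_per_dim)  # clamp; the shift adds one extra tile
            coords.append(idx)
        tiles.append((t, *coords))
    return tiles


# Hypothetical ranges for (r, phi, Vv, Vh, X, Y); units and limits are assumptions.
lows = (0.0, -math.pi, -100.0, -100.0, -100.0, -100.0)
highs = (400.0, math.pi, 100.0, 100.0, 100.0, 100.0)
state = (120.0, 0.3, -40.0, 5.0, 0.0, 10.0)
print(len(active_tiles(state, lows, highs)))  # 4, one tile per tiling
```

Nearby states share some of their active tiles, which is what lets the learned values generalize across similar ball and robot configurations.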

SLIDE 15

Actions (10 kinds)

  • 8 directional walks
  • stay: prepare for the opponent’s kicks
  • save: block the opponent’s shots on goal

Figure labels: 8 directional walk actions, stay action, save action
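The ten actions could be encoded as a simple enumeration whose integer values index the learner's action set. The names of the eight walk directions below are assumptions. The slides only say that there are eight directional walks plus stay and save.

```python
from enum import Enum


class Action(Enum):
    # Eight directional walks (direction names are hypothetical)
    WALK_FORWARD = 0
    WALK_FORWARD_RIGHT = 1
    WALK_RIGHT = 2
    WALK_BACKWARD_RIGHT = 3
    WALK_BACKWARD = 4
    WALK_BACKWARD_LEFT = 5
    WALK_LEFT = 6
    WALK_FORWARD_LEFT = 7
    # Two goalie-specific actions named on the slide
    STAY = 8
    SAVE = 9


walks = [a for a in Action if a.name.startswith("WALK_")]
print(len(Action), len(walks))  # 10 8
```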

SLIDE 16

Rewards (punishments)

  • save_reward: 0.5 when save is successful
  • save_punishment: -0.02 when save fails
  • lost_punishment: -10 when a goal is scored
  • dist_reward: 1-|ydist|/112.5 when the game is over (for getting near to the ball)
  • passive_punishment: -0.0000001 when save is not selected (for accelerating the initial phase)

One episode (game) is over when the ball is out, when a goal is scored, or when save is successful.

Aim: To stop the ball using the save action as safely as possible
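The reward scheme can be written down as a small function. This is a sketch under stated assumptions: the constants come from the slide, but the `StepOutcome` fields and the choice to sum the terms that apply in a step are hypothetical, since the slide lists the terms without saying how they are combined.

```python
from dataclasses import dataclass

SAVE_REWARD = 0.5
SAVE_PUNISHMENT = -0.02
LOST_PUNISHMENT = -10.0
PASSIVE_PUNISHMENT = -0.0000001
YDIST_SCALE = 112.5  # denominator of dist_reward on the slide


@dataclass
class StepOutcome:
    """What happened in one step (field names are hypothetical)."""
    chose_save: bool
    save_succeeded: bool
    goal_scored: bool
    ball_out: bool
    ydist: float  # signed distance used by dist_reward


def episode_over(o: StepOutcome) -> bool:
    """An episode ends when the ball is out, a goal is scored, or a save succeeds."""
    return o.ball_out or o.goal_scored or (o.chose_save and o.save_succeeded)


def reward(o: StepOutcome) -> float:
    """Sum the reward terms that apply to this step (summation is an assumption)."""
    r = 0.0
    if o.chose_save:
        r += SAVE_REWARD if o.save_succeeded else SAVE_PUNISHMENT
    else:
        r += PASSIVE_PUNISHMENT
    if o.goal_scored:
        r += LOST_PUNISHMENT
    if episode_over(o):
        r += 1.0 - abs(o.ydist) / YDIST_SCALE  # dist_reward
    return r


print(reward(StepOutcome(True, True, False, False, 0.0)))  # 1.5
```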

SLIDE 17

Initial strategy, Learned strategy

  • Success: Blue screen
  • Failure: Red screen

https://youtu.be/xCoSGsQHkRY
https://youtu.be/C2YTw6d7xPw

SLIDE 18

Figure legend (robot’s actions): save action, stay action, 8 walk actions

  • (r, φ): Ball position
  • (Vv, Vh): Ball velocity

SLIDE 19

Figure legend (robot’s actions): save action, stay action, 8 walk actions

  • (r, φ): Ball position
  • (Vv, Vh): Ball velocity
  • (X, Y): Robot position

SLIDE 20

Figure legend (robot’s actions): save action, stay action, 8 walk actions

  • (r, φ): Ball position
  • (Vv, Vh): Ball velocity
  • (X, Y): Robot position

SLIDE 21

[Figure: success rate (%) versus number of episodes, 0 to 2000; average success rate in the past 100 episodes]

1200 episodes: 95% success
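The plotted quantity, the average success rate over the past 100 episodes, can be computed with a rolling window. The sketch below assumes one boolean outcome per episode. The function name is hypothetical.

```python
from collections import deque
from typing import Iterable, List


def rolling_success_rate(outcomes: Iterable[bool], window: int = 100) -> List[float]:
    """Return the success rate (%) over the most recent `window` episodes, per episode."""
    recent = deque(maxlen=window)  # old outcomes fall out automatically
    rates = []
    for success in outcomes:
        recent.append(1 if success else 0)
        rates.append(100.0 * sum(recent) / len(recent))
    return rates


# Small window so the result is easy to check by hand.
print(rolling_success_rate([True, False, True, True], window=2))  # [100.0, 50.0, 50.0, 100.0]
```

Early values are averaged over fewer than `window` episodes, so the start of such a curve is noisier than the rest.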

SLIDE 22

  • The action of the real robot
  • The true positions of the virtual ball and robot

https://youtu.be/HVx6TlHkPgw

SLIDE 23

https://youtu.be/F3-3o2oCP14

SLIDE 24

[Figure: success rate (%) versus number of episodes, 0 to 200]

Starting from the result (95% success) in the simulator

Figure annotations: 45% success, 75% success

SLIDE 25

Gap of legged movements

[Figures: success rate (%) versus number of episodes, two panels: 2000 episodes and 200 episodes]

SLIDE 26

  • Augmented soccer field system
  • Intermediate role between simulated environments and real environments
  • Autonomous learning of goalie strategies
  • Movements of real robots
  • Convenience of virtual balls

SLIDE 27

Air hockey game using our system