The Gizmo Player: Simon Dollé, Jan Kopcsek, Alper Tunga (Dresden, 13.02.2008)

  1. Faculty: Computer Science. Field of study: Computer Science. Institute: Intelligent Systems.
     The Gizmo Player
     Simon Dollé, Jan Kopcsek, Alper Tunga
     Dresden, 13.02.2008

  2. Finding a heuristic function
     Two ways of learning a heuristic function:
     • Deductive
       – Analyzing the rules
       – Identifying common elements like game boards or pieces
       – Finding patterns
     • Inductive
       – Playing and learning from experience
       – Monte Carlo strategy
     TU Dresden, 13.02.2008 Gizmo Player Slide 2 of 10

  3. Monte Carlo Strategy
     • Play random games
     • Compute the mean score for each move
       → Use these means as a heuristic function
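The Monte Carlo strategy on this slide can be illustrated with a minimal sketch. The toy game (a single pile of stones, each player takes 1 to 3, whoever takes the last stone wins) and all names such as `monte_carlo_values` and `games_per_move` are assumptions made for illustration and do not come from the slides.

```python
import random

def legal_moves(pile):
    """Toy game (an assumption): take 1, 2 or 3 stones; taking the last stone wins."""
    return [k for k in (1, 2, 3) if k <= pile]

def random_playout(pile, mover_is_us, rng):
    """Play uniformly random moves until the pile is empty.
    Returns 1.0 if we take the last stone, otherwise 0.0. Requires pile > 0."""
    while True:
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return 1.0 if mover_is_us else 0.0
        mover_is_us = not mover_is_us

def monte_carlo_values(pile, games_per_move=2000, seed=0):
    """For each legal move, play random games and return the mean score."""
    rng = random.Random(seed)
    values = {}
    for move in legal_moves(pile):
        rest = pile - move
        total = 0.0
        for _ in range(games_per_move):
            if rest == 0:
                total += 1.0  # we just took the last stone
            else:
                total += random_playout(rest, False, rng)  # opponent moves next
        values[move] = total / games_per_move
    return values

values = monte_carlo_values(5)
best = max(values, key=values.get)  # the mean scores act as the heuristic
```

Every move receives the same number of random games here, whether or not it looks promising.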

  8. Monte Carlo Strategy
     • Problem: the same effort is spent on interesting and uninteresting moves
     • Equivalent to playing against a dummy player
     • UCT Algorithm (Upper Confidence bounds applied to Trees): an algorithm to balance:
       – Exploration of interesting parts of the graph
       – Exploration of new parts
       → Makes random games more realistic

  9. UCT Algorithm
     • As long as there are unexplored moves from our current state, explore them

  15. UCT Algorithm
      • As long as there are unexplored moves from our current state, explore them
      • Otherwise, choose the one with the highest score, using:
        – h: the heuristic value
        – n: the number of games through the parent node
        – n_i: the number of games through the node
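The score formula itself is not preserved in this transcript. A common choice in UCT implementations is the UCB1 form h + C * sqrt(ln(n) / n_i); the sketch below assumes that form. The constant `c`, the function names, and the `stats` layout are hypothetical, and n is approximated by the sum of the children's visit counts.

```python
import math

def uct_score(h, n, n_i, c=math.sqrt(2)):
    """UCB1-style score (assumed form): mean value plus an exploration bonus.
    h: heuristic value, n: games through the parent, n_i: games through the node."""
    return h + c * math.sqrt(math.log(n) / n_i)

def select_move(stats, c=math.sqrt(2)):
    """stats maps move -> (h, n_i).
    Unexplored moves (n_i == 0) are tried first; otherwise the move
    with the highest score is chosen."""
    for move, (_, n_i) in stats.items():
        if n_i == 0:
            return move
    n = sum(n_i for _, n_i in stats.values())  # approximates the parent's count
    return max(stats, key=lambda m: uct_score(stats[m][0], n, stats[m][1], c))

# A rarely tried move can outrank a better-looking, well-explored one.
choice = select_move({"a": (0.6, 50), "b": (0.5, 2)})
```

With `c=0` the rule degenerates to picking the highest heuristic value; larger values of `c` favour moves that have been tried less often.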

  20. UCT Algorithm
      • Which move to play?
        → The one with the highest heuristic value
      • In multiplayer games:
        → Store the heuristic value for each player
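A compact sketch can tie these points together: each node stores one summed score per player, selection uses the value of the player to move, and the final move is the one with the highest mean value. The toy game (one pile, take 1 to 3 stones, last stone wins), the UCB1-style bonus, the constant `c`, and every name below are assumptions for illustration, not the authors' implementation.

```python
import math
import random

class Node:
    def __init__(self, pile, to_move, n_players=2):
        self.pile = pile                 # stones left (toy game state)
        self.to_move = to_move           # index of the player to move
        self.children = {}               # move -> Node
        self.visits = 0
        self.totals = [0.0] * n_players  # summed scores, one entry per player

    def mean(self, player):
        return self.totals[player] / self.visits if self.visits else 0.0

def legal_moves(pile):
    return [k for k in (1, 2, 3) if k <= pile]

def playout(pile, to_move, rng):
    """Random game to the end; returns one score per player."""
    if pile == 0:                        # previous player took the last stone
        winner = 1 - to_move
    else:
        while True:
            pile -= rng.choice(legal_moves(pile))
            if pile == 0:
                winner = to_move
                break
            to_move = 1 - to_move
    return [1.0 if p == winner else 0.0 for p in (0, 1)]

def simulate(node, rng, c):
    """One iteration starting at node; returns the game's score vector."""
    if node.pile == 0:
        scores = playout(0, node.to_move, rng)
    else:
        untried = [m for m in legal_moves(node.pile) if m not in node.children]
        if untried:                      # explore unexplored moves first
            move = rng.choice(untried)
            child = Node(node.pile - move, 1 - node.to_move)
            node.children[move] = child
            scores = playout(child.pile, child.to_move, rng)
            child.visits += 1
            for p, s in enumerate(scores):
                child.totals[p] += s
        else:                            # highest score for the player to move
            move = max(node.children, key=lambda m:
                       node.children[m].mean(node.to_move)
                       + c * math.sqrt(math.log(node.visits)
                                       / node.children[m].visits))
            scores = simulate(node.children[move], rng, c)
    node.visits += 1
    for p, s in enumerate(scores):
        node.totals[p] += s
    return scores

def best_move(pile, iterations=3000, c=1.4, seed=0):
    rng = random.Random(seed)
    root = Node(pile, to_move=0)
    for _ in range(iterations):
        simulate(root, rng, c)
    # Final choice: highest mean value for the player to move, no bonus.
    return max(root.children, key=lambda m: root.children[m].mean(root.to_move))
```

Calling `best_move(5)` searches from a pile of five stones with player 0 to move. Storing a score vector rather than a single number means the same tree serves every player: each one maximises its own entry when it is its turn.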

  21. Good points
      • Heuristic directly linked to the final score
      • Heuristic converges to minimax values
      • Time scalable
      • Easily parallelisable

  22. Problems
      • Simultaneous moves:
        – What rule to choose to explore the nodes?
        – Which move to play?
      • Long games and loops:
        – Depth-first search problem

  23. Thank you for your attention, and good luck to your players
