

  1. LeelaChessZero Open Source Community (F. Huizinga)

  2. Overview ● What is Lc0? ● The GameTree and A0 in a nutshell ● Contribute ● Useful links ● Technical details

  3. What is Lc0? ● 2016 DeepMind’s AlphaGo ● 2017 AlphaZero ● 2017 LeelaZero ● 2018 LeelaChessZero

  4. The Game Tree

  5. Why care? ● General approach, no domain knowledge required (Go, Chess, Shogi, …) ● Visual interpretation of the game allows for a deep positional and material understanding, obtained from selfplay ● Fascinating gameplay, see YouTube videos on AlphaZero/LeelaChessZero

  6. LeelaChessZero ● Initially missing details on the neural network architecture ● Variable compute budget ● Obtain dedicated hardware for training ● Always looking for contributors ○ Developers ○ Computational help ○ Testers/Elo estimators ○ Enthusiasts

  7. Links ● lczero.org ● testtraining.lczero.org ● github.com/LeelaChessZero ● discord.gg/pKujYxD

  8. Thanks to ● DeepMind ● Gian-Carlo Pascutto ● Leela Developers ● Lc0 Developers ● Testers ● Chess enthusiasts

  9. Minimax Algorithm [diagram: a small game tree in which terminal values (+1 win, 0 draw, -1 loss) are backed up through alternating min and max layers to the root]
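The backed-up values in the diagram can be computed with a plain recursive minimax. Below is a minimal sketch; the game interface (`is_terminal`, `terminal_value`, `legal_moves`, `apply`) is a hypothetical stand-in for any two-player zero-sum game, with values given from the maximizing player's perspective as in the slide.

```python
def minimax(state, maximizing):
    """Exact minimax value of `state`, searched all the way to terminal nodes.

    Values are from the maximizing player's perspective
    (+1 win, 0 draw, -1 loss), matching the slide's game tree.
    """
    if state.is_terminal():
        return state.terminal_value()                  # +1 / 0 / -1

    values = [minimax(state.apply(move), not maximizing)
              for move in state.legal_moves()]
    return max(values) if maximizing else min(values)
```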

  10. Evaluation Function ● Minimax unable to reach terminal nodes given time constraints ● Approximate minimax value of subtree ● Must evaluate non-terminal nodes ● Centuries of human chess understanding to properly define this function

  11. Minimax + Eval [diagram: the same kind of game tree, but with heuristic evaluation scores at the search frontier backed up through the min and max layers instead of true terminal values]
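Slides 10 and 11 describe cutting the search off early and scoring the frontier with a hand-crafted evaluation. Here is a minimal sketch of that idea, reusing the hypothetical game interface from above; `evaluate` stands in for the human-designed evaluation function the slide refers to.

```python
def minimax_eval(state, depth, maximizing, evaluate):
    """Depth-limited minimax: below `depth` plies the subtree value is
    approximated by a heuristic `evaluate(state)` instead of searching on."""
    if state.is_terminal():
        return state.terminal_value()
    if depth == 0:
        return evaluate(state)                         # e.g. a material count

    values = [minimax_eval(state.apply(move), depth - 1, not maximizing, evaluate)
              for move in state.legal_moves()]
    return max(values) if maximizing else min(values)
```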

  12. AlphaZero Main objective: prune the game tree. Learn the evaluation function (value) and the most promising moves (policy) iteratively from selfplay data.

  13. Neural Network [diagram: a tic-tac-toe position (X / O board) fed into a neural network, which outputs an expected outcome (here 1) and a move distribution over the squares]
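As a rough illustration of the two-headed network in the diagram, here is a toy policy/value model in PyTorch. The sizes, layer choices, and the flat board encoding are all assumptions for the tic-tac-toe example; Lc0's actual network is a much deeper residual convolutional network.

```python
import torch
import torch.nn as nn

class PolicyValueNet(nn.Module):
    """Toy two-headed network: a board position goes in, a value (expected
    outcome in [-1, 1]) and a policy (distribution over moves) come out.
    Illustrative only; not Lc0's real architecture."""

    def __init__(self, board_size=9, num_moves=9, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(board_size, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, num_moves)   # move logits
        self.value_head = nn.Linear(hidden, 1)            # scalar outcome

    def forward(self, board):
        h = self.trunk(board)
        policy = torch.softmax(self.policy_head(h), dim=-1)
        value = torch.tanh(self.value_head(h)).squeeze(-1)
        return policy, value
```

A call like `policy, value = net(board_tensor)` then yields the move distribution and expected outcome shown on the slide.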

  14. Training Data ● Result targets: Win +1, Loss -1, Draw 0 ● Obtain data through selfplay [diagram: a game state fed through the neural network to produce a policy (move distribution) and an expected outcome]

  15.–20. (MCT) Search [diagram sequence: a step-by-step walkthrough of one Monte Carlo Tree Search iteration, guided by the network's policy and value]
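The search the animation steps through is AlphaZero-style MCTS: each node keeps a prior from the policy head and accumulates value estimates, and children are chosen with the PUCT rule. A minimal sketch follows (the class layout and the c_puct constant are illustrative; the real engine adds batching, virtual loss, and other refinements):

```python
import math

class Node:
    """One search-tree node: prior P from the policy head, visit count N,
    and total value W accumulated from value-head evaluations."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the network policy
        self.visits = 0           # N(s, a)
        self.value_sum = 0.0      # W(s, a); Q(s, a) = W / N
        self.children = {}        # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """PUCT selection: pick the child maximising Q + U, where U favours
    moves the policy likes and children that have been visited rarely."""
    total_visits = sum(child.visits for child in node.children.values())

    def puct(child):
        u = c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=lambda kv: puct(kv[1]))
```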

  21. Records of data: (State₁, Policy₁, Result₁), (State₂, Policy₂, Result₂), …, (Stateₙ, Policyₙ, Resultₙ), where n is the total number of moves in the game played.
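One way such records can be assembled from a finished selfplay game is sketched below. The convention that `final_result` is given from the first player's perspective and sign-flipped for the side to move is an assumption for illustration; the slide only states that each record pairs a state with a policy and the game result.

```python
def game_to_records(states, search_policies, final_result):
    """Turn one finished selfplay game into (State, Policy, Result) records.

    `states[i]` is the position before move i, `search_policies[i]` the search's
    move distribution at that position, and `final_result` the outcome from the
    first player's perspective (+1 win, -1 loss, 0 draw). Each record's result
    is flipped so it is always from the side to move (assumed convention).
    """
    records = []
    for i, (state, policy) in enumerate(zip(states, search_policies)):
        result = final_result if i % 2 == 0 else -final_result
        records.append((state, policy, result))
    return records
```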
