
A Generalized Linear Evaluation Model

Michael Buro. Outline: evaluation function construction; GLEM: building pattern-based evaluations.


  1. A Generalized Linear Evaluation Model
     Michael Buro
     Department of Computing Science, University of Alberta
     mburo@cs.ualberta.ca
     http://www.cs.ualberta.ca/~mburo
     10/16/02

     Outline
     - Evaluation function construction
     - GLEM: building pattern-based evaluations
     - Application: Othello
     - Future work

     Examples
     - Chess: count pieces (fast!)
       - Material, mobility, King safety, pawn structure ...
       - Add weighted features
       - w(delta-pawns) = 100
       - w(delta-queens) = 990 ...
     - Othello: evaluate parts of the board (fast!)
       - Add 51 pre-computed pattern values
     - Rubik's Cube: admissible heuristic
       - Databases for solving sub-problems (lower bound on solution length)

     Evaluation Function Construction
     - Evaluation functions (EFs) are used in look-ahead search to assign heuristic values to leaf nodes if no perfect classification is available
     - EFs are correlated with the optimization objective, e.g.
       - Expected/minimal distance to goal state
       - Probability of winning (even in deterministic games? Yes!)
       - Expected payoff
     - Classic approach: add weighted features
     - Trade-off: evaluation accuracy vs. speed
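     The "add weighted features" approach from the chess example can be sketched as below. This is a minimal illustration, not the evaluation of any particular program: only the two weights quoted on the slide are used, and the `position` dictionary layout and the helper `count_delta` are hypothetical names introduced here.

```python
# Weights quoted on the slide; all other feature weights are omitted because
# the slide does not give them.
WEIGHTS = {"delta_pawns": 100, "delta_queens": 990}


def count_delta(position, piece):
    """Own piece count minus opponent piece count.

    `position` is a hypothetical dict of the form
    {"own": {piece: count}, "opp": {piece: count}}.
    """
    return position["own"].get(piece, 0) - position["opp"].get(piece, 0)


def evaluate(position):
    """Classic linear evaluation: sum of weight * feature value."""
    features = {
        "delta_pawns": count_delta(position, "pawn"),
        "delta_queens": count_delta(position, "queen"),
    }
    return sum(WEIGHTS[name] * value for name, value in features.items())


# Toy position: two pawns down, one queen up.
position = {"own": {"pawn": 6, "queen": 1}, "opp": {"pawn": 8, "queen": 0}}
score = evaluate(position)  # 100 * (-2) + 990 * 1 = 790
```

     The evaluation is a handful of additions and multiplications, which is the "fast" end of the accuracy vs. speed trade-off named on the slide.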

  2. Two Problems
     - Where do features come from?
       - Usually provided by human experts
       - What if there are no experts?
       - What if the expert can't explain the feature s/he is using?
       - What if human experts are weak in absolute terms?
     - How to combine features?
       - Linear, non-linear? What structure?
       - How to assign weights to features?

     Genetic Programming
     - Breed LISP expressions (trees)
     - Atoms refer to state representation or provided features
     - Maintain a pool of expressions
     - Let the best ones generate offspring ("cross-over", "mutation")
     - Remove weak performers
     - Iterate
     - Search in function space: very hard!

     GLEM
     - Start with binary features (as simple as "Is a black King on A1?")
     - Grow feature conjunctions
     - Combine relevant features linearly
     - Apply monotone squashing function to model saturation
     - Optimize feature weights using linear regression

     Hybrid Approach
     - Start with (simple) features (could be raw state representation)
     - Select evaluation model (e.g. linear, ANN, decision trees)
     - Grow new features by combining previously generated features
     - Select new relevant features
     - Optimize numerical parameters
     - Iterate if not satisfied
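     The GLEM steps listed above (binary features, conjunctions, linear combination, monotone squashing) can be sketched as follows. This is an illustration under assumptions: the logistic function is just one possible monotone squashing function, the state representation as a set of (piece, square) facts is hypothetical, and the weights are made up rather than fitted by regression.

```python
import math


def conjunction(atoms):
    """Build a binary feature that is 1 only when every atomic feature holds."""
    def feature(state):
        return int(all(atom(state) for atom in atoms))
    return feature


def squash(x):
    """Logistic function, used here as one example of a monotone squashing function."""
    return 1.0 / (1.0 + math.exp(-x))


def glem_value(state, features, weights):
    """Linear combination of binary features, followed by squashing."""
    linear = sum(w * f(state) for f, w in zip(features, weights))
    return squash(linear)


# Hypothetical atomic features over a toy state (a set of (piece, square) facts).
def black_king_on_a1(state):
    return int(("bK", "a1") in state)


def white_rook_on_a2(state):
    return int(("wR", "a2") in state)


features = [
    black_king_on_a1,
    white_rook_on_a2,
    conjunction([black_king_on_a1, white_rook_on_a2]),
]
weights = [0.0, 0.5, 1.5]  # made-up values, not fitted weights

state = {("bK", "a1"), ("wR", "a2")}
value = glem_value(state, features, weights)  # squash(0.0 + 0.5 + 1.5)
```

     Because `squash` is monotone, comparing the linear sums orders positions the same way as comparing the squashed values.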

  3. Top Level: Linear + Squashing
     - Fast evaluation
     - Efficient weight optimization (gradient-based algorithms find the global optimum)
     - No need to apply the squashing function during game-tree search: monotone!

     Conjunctions
     - Complete, can represent perfect evaluation
     - Fast evaluation
     - "Only" 2^n feature combinations
     - Natural non-linear feature interaction. E.g.
       - F1: (Black King on 8th rank)
       - F2: (White rook on 7th rank)
       - F1 not correlated with winning
       - F2 somewhat correlated with winning
       - F1 & F2 much more correlated with winning

     Generating Conjunctions
     - Over-fitting? (good fit on training data, but poor generalization)
     - Ad hoc solution: generate conjunctions that appear at least N times in the training set
       - Inductive algorithm, length 1, 2, 3, ...
     - Post-processing: remove conjunctions that are not correlated with winning
     - Future work:
       - Generate maximal conjunctions fast
       - Smarter handling of rare conjunctions

     Parameter Optimization
     - Generate lots of training samples: (state, evaluation)
     - Generate conjunctions
     - Solve large (linear) regression problem
       - Regression takes care of feature correlation!
     - Boot-strapping: iterate
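     The "appear at least N times" rule with an inductive search over lengths 1, 2, 3, ... can be sketched as a level-wise search. The slides do not spell out the algorithm, so the Apriori-style candidate generation below is an assumption, as are the function name `frequent_conjunctions` and the representation of each training state as a set of active atomic feature names.

```python
from itertools import combinations


def frequent_conjunctions(samples, min_count, max_length):
    """Return {conjunction: count} for conjunctions active in >= min_count samples.

    samples: list of sets, each holding the atomic features active in one state.
    A conjunction is a frozenset of atomic features.
    """
    # Length 1: count every atomic feature.
    counts = {}
    for sample in samples:
        for atom in sample:
            key = frozenset([atom])
            counts[key] = counts.get(key, 0) + 1
    current = {c: n for c, n in counts.items() if n >= min_count}

    result = {}
    length = 1
    while current and length <= max_length:
        result.update(current)
        # Candidates of length + 1: unions of two frequent conjunctions.
        candidates = {
            a | b for a, b in combinations(current, 2) if len(a | b) == length + 1
        }
        # Keep a candidate only if all of its sub-conjunctions are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, length))
        }
        counts = {c: sum(1 for s in samples if c <= s) for c in candidates}
        current = {c: n for c, n in counts.items() if n >= min_count}
        length += 1
    return result


# Toy training set with three atomic features A, B, C.
samples = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
found = frequent_conjunctions(samples, min_count=2, max_length=3)
```

     In this toy run, A & B and A & C each occur twice and are kept, while B & C occurs once and is dropped, so A & B & C is never counted at all.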

  4. Application: Othello

     Patterns

     Logistello's Evaluation Function
     - 13 game stages (every 4 discs)
     - Sum of 51 precomputed pattern values
     - Fast! 1.4 million evaluations/sec on Athlon 1666 MHz
     - 1.5 million weights
     - 17 million training positions
     - Least squares takes 6 hours

     Fast Evaluation
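     A sum of precomputed pattern values per game stage can be sketched as a table lookup. This is not Logistello's code: the base-3 indexing of square contents, the two toy patterns, and the table layout `tables[stage][pattern][index]` are assumptions made for illustration, and the stage is passed in directly because the slides do not give the exact disc-count-to-stage mapping.

```python
EMPTY, BLACK, WHITE = 0, 1, 2
NUM_STAGES = 13  # from the slide: 13 game stages


def pattern_index(board, squares):
    """Encode the contents of the given squares as a base-3 number."""
    index = 0
    for square in squares:
        index = index * 3 + board[square]
    return index


def evaluate(board, stage, patterns, tables):
    """Sum the precomputed value of each pattern's current configuration."""
    return sum(
        tables[stage][i][pattern_index(board, squares)]
        for i, squares in enumerate(patterns)
    )


# Two hypothetical 3-square patterns on a 64-square board (squares 0..63).
patterns = [(0, 1, 2), (7, 15, 23)]

# One table of 3**3 values per pattern and stage, all zero except two toy entries.
tables = [[[0.0] * 27 for _ in patterns] for _ in range(NUM_STAGES)]
tables[3][0][15] = 2.0   # black, white, empty on squares 0, 1, 2
tables[3][1][0] = -0.5   # all three squares of the second pattern empty

board = [EMPTY] * 64
board[0] = BLACK
board[1] = WHITE

value = evaluate(board, 3, patterns, tables)  # 2.0 + (-0.5) = 1.5
```

     Each evaluation costs one index computation and one table read per pattern, which fits the slide's emphasis on speed.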

  5. Future Work
     - Better solution for rare configurations
       - Weight bound depending on number of occurrences
     - Automated pattern search
     - Efficient implementation of large sparse patterns
     - Non-linear top-level combinations
     - Other applications: ataxx, backgammon, LOA, go ...
