Comparing Direct and Indirect Encodings Using Both Raw and Hand-Designed Features in Tetris
By Lauren Gillespie, Gabby Gonzales and Jacob Schrum
gillespl@southwestern.edu, gonzale9@alumni.southwestern.edu, schrum2@southwestern.edu
Howard Hughes Medical Institute

Introduction
• Challenge: Use less domain-specific knowledge
  ✦ Important for general agents
  ✦ Accomplished using raw inputs
  ✦ Need to be able to process them with a neural network
• Why challenging?
  ✦ Complex domains = Large input space
  ✦ Large input space = Large neural networks
  ✦ Large neural networks = Difficult to train
GECCO 2017.

Addressing Challenges
• Deep Learning applies large NNs to hard tasks †
• HyperNEAT also capable of handling large NNs
  ✦ Indirect encoding, good with geometric inputs ‡
  ✦ Compare to direct encoding, NEAT
  ✦ See if indirect encoding is advantageous
  ✦ Also compare with hand-designed features
† Mnih et al. 2013. Playing Atari with Deep Reinforcement Learning.
‡ Hausknecht et al. 2012. HyperNEAT-GGP: A HyperNEAT-based Atari General Game Player.

Direct Vs. Indirect Encoding
[Figure: Direct Encoding (NEAT): the evolved network and the agent network are the same network. Indirect Encoding (HyperNEAT): a separate evolved network generates the agent network. Diagram labels: X, Y, X, Y, Bias, UTILITY.]

Tetris Domain
• Consists of 10 x 20 game board
• Orient tetrominoes to clear lines
• Clearing multiple lines = more points
• NP-Complete domain †
• One piece controller
  ✦ Agent has knowledge of current piece only
† Breukelaar et al. 2004. Tetris is hard, even to approximate.

Previous Work
• Tetris Domain
  ✦ All use hand-designed features
  ✦ Reinforcement Learning:
    ❖ Temporal difference learning: Bertsekas et al. 1996, Genesereth & Björnsson 2013
    ❖ Policy search: Szita & Lőrincz 2006
    ❖ Approximate Dynamic Programming: Gabillon et al. 2013
  ✦ Evolutionary Computation:
    ❖ Simple EA with linear function approximator: Böhm et al. 2004
    ❖ Covariance Matrix Adaptation Evolution Strategy: Boumaza 2009
• Raw Visual Inputs
  ✦ Neuroevolution: Gauci & Stanley 2008, Verbancsics & Stanley 2010
  ✦ General video game playing in Atari: Hausknecht et al. 2012, Mnih et al. 2013
[Image: Asterix game from Atari 2600 suite]

Hand-Designed Features
• Most common input scheme for training ANNs †
• Hand-picked information of game state as input
• Pros:
  ✦ Network doesn’t deal with excess info
  ✦ Smaller input space, easier to learn
• Cons:
  ✦ Very domain-specific, not versatile
  ✦ Human expertise needed
  ✦ Useful features not always apparent
† Schrum & Miikkulainen. 2016. Discovering Multimodal Behavior in Ms. Pac-Man through Evolution of Modular Neural Networks.

Raw Features
• One feature per game state element
• Minimal input processing by user
• Pros:
  ✦ Networks less limited by domain †
  ✦ Less human expertise needed
• Cons:
  ✦ Large input space & networks
  ✦ Harder to learn, more time
† Gauci & Stanley. 2008. A Case Study on the Critical Role of Geometric Regularity in Machine Learning.

NEAT
[Figure: NEAT mutation types: Perturb Weight, Add Connection, Add Node]
• NeuroEvolution of Augmenting Topologies †
• Synaptic and structural mutations
• Direct encoding
  ✦ Network size proportional to genome size
• Crossover alignment via historical markings
• Inefficient with large input sets
  ✦ Mutations do not alter behavior effectively
† Stanley & Miikkulainen. 2002. Evolving Neural Networks Through Augmenting Topologies.
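
The three mutation types named on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Connection` and `Genome` classes, the function names, and the mutation scale are all hypothetical. It also simplifies historical markings to a single running counter, whereas NEAT proper reuses the same innovation number when the same structural change arises more than once.

```python
import random
from dataclasses import dataclass
from itertools import count

@dataclass
class Connection:
    source: int
    target: int
    weight: float
    innovation: int      # historical marking used to align genes in crossover
    enabled: bool = True

@dataclass
class Genome:
    nodes: list          # node ids
    connections: list    # Connection genes; one gene per link (direct encoding)

_innovation = count(1)   # simplified global innovation counter (assumption)

def perturb_weight(genome, rng, scale=0.1):
    """Synaptic mutation: nudge the weight of one randomly chosen link."""
    conn = rng.choice(genome.connections)
    conn.weight += rng.gauss(0.0, scale)

def add_connection(genome, source, target, rng):
    """Structural mutation: add a new link with a random weight."""
    genome.connections.append(
        Connection(source, target, rng.uniform(-1.0, 1.0), next(_innovation)))

def add_node(genome, conn):
    """Structural mutation: split an existing link with a new node.

    The old link is disabled; the link into the new node gets weight 1.0 and
    the link out of it inherits the old weight, so behavior changes little.
    """
    conn.enabled = False
    new_node = max(genome.nodes) + 1
    genome.nodes.append(new_node)
    genome.connections.append(
        Connection(conn.source, new_node, 1.0, next(_innovation)))
    genome.connections.append(
        Connection(new_node, conn.target, conn.weight, next(_innovation)))
    return new_node

# Example: start with two nodes, connect them, then split the link.
rng = random.Random(0)
genome = Genome(nodes=[0, 1], connections=[])
add_connection(genome, 0, 1, rng)
add_node(genome, genome.connections[0])
perturb_weight(genome, rng)
```

Because every link is its own gene, a 400-input network needs hundreds of genes before a single mutation can affect more than one input, which is one way to read the "inefficient with large input sets" bullet.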

HyperNEAT
[Figure labels: X, Y, X, Y, Bias, UTILITY]
• Hypercube-based NEAT †
• Extension of NEAT
• Indirect encoding
  ✦ Evolved CPPNs encode larger substrate-based agent ANNs
• Compositional Pattern-Producing Networks (CPPNs)
  ✦ CPPN queried across substrate to create agent ANN
  ✦ Inputs = neuron coordinates, outputs = link weights
• Substrates
  ✦ Layers of neurons with geometric coordinates
  ✦ Substrate layout determined by domain/experimenter
† Stanley et al. 2009. A Hypercube-based Encoding for Evolving Large-scale Neural Networks.
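
To make "CPPN queried across substrate" concrete, here is a small sketch under stated assumptions. The function `toy_cppn` is a hand-written stand-in for an evolved CPPN, the coordinate scaling to [-1, 1] and the hidden layer size are illustrative choices, and link-expression thresholds that HyperNEAT implementations commonly apply are left out.

```python
import math

def toy_cppn(x1, y1, x2, y2, bias=1.0):
    """Hypothetical stand-in for an evolved CPPN.

    Inputs are the coordinates of a source neuron (x1, y1) and a target
    neuron (x2, y2) plus a bias; the output is the weight of the link.
    """
    dx, dy = x2 - x1, y2 - y1
    return math.exp(-(dx * dx + dy * dy)) * math.sin(bias + x1 + y2)

def substrate_coords(width, height):
    """Geometric coordinates for a width x height layer, scaled to [-1, 1]."""
    def scale(i, n):
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0
    return [(scale(c, width), scale(r, height))
            for r in range(height) for c in range(width)]

def build_weights(cppn, source, target):
    """Query the CPPN once for every (source neuron, target neuron) pair."""
    return [[cppn(sx, sy, tx, ty) for (tx, ty) in target]
            for (sx, sy) in source]

inputs = substrate_coords(10, 20)   # one neuron per Tetris board cell
hidden = substrate_coords(10, 1)    # hypothetical hidden layer size
weights = build_weights(toy_cppn, inputs, hidden)
```

The point of the indirection is that the evolved genome (the CPPN) stays small while the agent network it generates has 200 x 10 links in this example; neighboring board cells get related weights because their coordinates are close.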

HyperNEAT with Tetris
• Geometric awareness: arises from indirect encoding
• CPPN encodes geometry of domain into agent via substrates
• Agent network can learn from task-relevant domain geometry
[Figure labels: Game State, Input substrates, Substrate layers (detailed view), Agent network, CPPN, X, Y, X, Y, Bias, UTILITY]

Raw Features Setup
• Board configuration:
  ✦ Two input sets
    1. Location of all blocks
       ❖ block = 1, no block = 0
    2. Location of all holes
       ❖ hole = -1, no hole = 0
• NEAT: Inputs in linear sequence
• HyperNEAT: Two 2D input substrates
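
The two raw input sets could be computed along these lines. The slide does not define a hole, so this sketch assumes a hole is an empty cell with at least one block somewhere above it in the same column; the board representation and the name `raw_features` are also illustrative.

```python
WIDTH, HEIGHT = 10, 20

def raw_features(board):
    """Return (blocks, holes) as flat lists in row-major order.

    board[r][c] is True when the cell is occupied; row 0 is the top row.
    Assumption: a hole is an empty cell with a block above it in its column.
    """
    blocks, holes = [], []
    for r in range(HEIGHT):
        for c in range(WIDTH):
            occupied = board[r][c]
            covered = any(board[above][c] for above in range(r))
            blocks.append(1.0 if occupied else 0.0)
            holes.append(-1.0 if (not occupied and covered) else 0.0)
    return blocks, holes

# Example board: a single block one row above the floor in column 0,
# leaving a hole beneath it.
board = [[False] * WIDTH for _ in range(HEIGHT)]
board[18][0] = True
blocks, holes = raw_features(board)

# NEAT sees the inputs as one linear sequence of 400 values.
neat_inputs = blocks + holes
```

For HyperNEAT the same two lists would instead be laid out as two 10 x 20 input substrates, so each value keeps its board position.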

Hand-Designed Features Setup
• Bertsekas et al. features † plus additional holes-per-column feature
• All scaled to [0,1]
  ✦ Column height
  ✦ Height difference
  ✦ Tallest column
  ✦ Number of holes
  ✦ Holes per column
[Figure: substrates labeled HEIGHTS, DIFFS, MAX HEIGHT, TOTAL HOLES, HOLES; other labels: X, Y, X, Y, Bias, UTILITY]
† Bertsekas et al. 1996. Neuro-Dynamic Programming.
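
A rough sketch of how the listed features might be computed follows. Several details are assumptions rather than facts from the slides: height differences are taken as absolute differences between adjacent columns, a hole is an empty cell below the top block of its column, and the divisors used to scale each feature into [0,1] are illustrative.

```python
WIDTH, HEIGHT = 10, 20

def column_heights(board):
    """Height of each column; board[r][c] is True if occupied, row 0 is top."""
    heights = []
    for c in range(WIDTH):
        h = 0
        for r in range(HEIGHT):
            if board[r][c]:
                h = HEIGHT - r
                break
        heights.append(h)
    return heights

def holes_per_column(board):
    """Count empty cells lying below the top block of each column."""
    counts = []
    for c in range(WIDTH):
        seen_block, n = False, 0
        for r in range(HEIGHT):
            if board[r][c]:
                seen_block = True
            elif seen_block:
                n += 1
        counts.append(n)
    return counts

def hand_designed_features(board):
    """Return 31 features, each scaled to [0, 1] (scaling is an assumption)."""
    heights = column_heights(board)
    diffs = [abs(a - b) for a, b in zip(heights, heights[1:])]
    holes = holes_per_column(board)
    return ([h / HEIGHT for h in heights]            # 10 column heights
            + [d / HEIGHT for d in diffs]            # 9 height differences
            + [max(heights) / HEIGHT]                # tallest column
            + [sum(holes) / (WIDTH * HEIGHT)]        # total number of holes
            + [n / HEIGHT for n in holes])           # 10 holes per column

board = [[False] * WIDTH for _ in range(HEIGHT)]
board[18][0] = True
features = hand_designed_features(board)
```

Compared with the 400 raw inputs, this gives the network 31 inputs that already summarize the board, which is the trade-off the Hand-Designed Features slide describes.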

Experimental Setup
• Agent networks are afterstate evaluators
• Each experiment evaluated with 30 runs
  ✦ 500 generations/run, 50 agents/generation
  ✦ Objectives averaged across 3 trials/agent
    ❖ Noisy domain, multiple trials needed
• NSGA-II objectives: game score & survival time

NEAT vs. HyperNEAT: Raw Features
[Plot: Game Score (0 to 400) vs. Generation (0 to 500); series: HyperNEAT Raw, NEAT Raw]

NEAT vs. HyperNEAT: Hand-Designed Features
[Plot: Game Score (0 to 35000) vs. Generation (0 to 500); series: HyperNEAT Features, NEAT Features]

Raw Features Champion Behavior
[Panels: NEAT with Raw Features; HyperNEAT with Raw Features]

Hand-Designed Features Behavior
[Panels: HyperNEAT with Hand-Designed Features; NEAT with Hand-Designed Features]

Visualizing Substrates
[Figure panels: Inputs, Hidden, Output, Result]

Discussion
• Raw features: HyperNEAT clearly better than NEAT
  ✦ Indirect encoding advantageous
  ✦ NEAT ineffective at evolving large networks
• Hand-Designed: HyperNEAT has less of an advantage
  ✦ Geometric awareness less important
  ✦ HyperNEAT CPPN limited by substrate topology

Future Work
• HybrID †
  ✦ Start with HyperNEAT, switch to NEAT
  ✦ Gain advantage of both encodings
• Raw feature Tetris with Deep Learning
• Raw features in other visual domains
  ✦ Video games: DOOM, Mario, Ms. Pac-Man
  ✦ Board games: Othello, Checkers
† Clune et al. 2009. HybrID: A Hybridization of Indirect and Direct Encodings for Evolutionary Computation.

Conclusion
• Raw features
  ✦ Indirect encoding HyperNEAT effective
  ✦ Geometric awareness an advantage
• Hand-designed features
  ✦ Ultimately NEAT produced better agents
• HybrID might combine strengths of both

Questions?
• Contact info: gillespl@southwestern.edu, schrum2@southwestern.edu, gonzale9@alumni.southwestern.edu
• Movies and Code: https://tinyurl.com/tetris-gecco2017

Auxiliary Slides

NSGA-II
• Pareto-based multiobjective EA optimization
• Parent population, μ, evaluated in domain
• Child population, λ, evolved from μ and evaluated
• μ + λ sorted into non-dominated Pareto fronts
• Pareto front: all individuals that no other individual in the set dominates
• $v = (v_1, \dots, v_n)$ dominates $u = (u_1, \dots, u_n)$ iff
  1. $\forall i \in \{1, \dots, n\}: v_i \ge u_i$, and
  2. $\exists i \in \{1, \dots, n\}: v_i > u_i$.
• New μ picked from highest fronts
• Tetris objectives: Game score, time
[Figure: Pareto front plotted with Game score and Time alive as axes]
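
The dominance test and the sorting into fronts can be illustrated with a short sketch. It uses a plain quadratic peel-off loop rather than the fast non-dominated sort of NSGA-II, and it omits crowding distance; the function names and example scores are made up for illustration.

```python
def dominates(v, u):
    """True if v is at least as good as u everywhere and strictly better once.

    Both objectives (game score, time alive) are maximized.
    """
    return (all(a >= b for a, b in zip(v, u))
            and any(a > b for a, b in zip(v, u)))

def pareto_fronts(scores):
    """Sort individuals (by index) into successive non-dominated fronts."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i])
                       for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts

# Hypothetical (game score, time alive) pairs for four agents.
scores = [(100, 50), (80, 60), (70, 40), (100, 50)]
fronts = pareto_fronts(scores)
```

Here agents 0, 1 and 3 form the first front, since none of them is dominated, while agent 2 is dominated by both (100, 50) and (80, 60) and falls into the second front. The new parent population would be filled from the first front onward.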

Visualizing Link Weights

Afterstate Evaluation
• Evolved agents used as afterstate evaluators
• Determine next move from state after placing piece
• All possible piece placements determined and evaluated
• Placement whose afterstate has the best evaluation is chosen
• Placements that lead to a loss are not considered
• Agent moves piece to best placement, repeats
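
The selection step described above might look like the following sketch. Enumerating the legal placements and simulating the resulting boards is assumed to happen elsewhere; `choose_placement`, the tuple layout, and the `evaluate` callback (standing in for the evolved network's utility output) are hypothetical names.

```python
def choose_placement(afterstates, evaluate):
    """Pick the placement whose afterstate the evaluator rates highest.

    afterstates: list of (placement, board_after_placing, game_over) tuples,
                 one per possible placement of the current piece.
    evaluate:    function mapping a board to a utility score.
    Placements that end the game are skipped. Returns None if none remain.
    """
    candidates = [(evaluate(board), placement)
                  for placement, board, game_over in afterstates
                  if not game_over]
    if not candidates:
        return None
    best_score, best_placement = max(candidates, key=lambda pair: pair[0])
    return best_placement

# Toy example: boards are short lists and the "network" is just sum().
afterstates = [
    ("left", [1, 1], False),
    ("right", [3, 0], False),
    ("drop", [9, 9], True),    # highest score, but it loses the game
]
best = choose_placement(afterstates, evaluate=sum)
```

In the toy example `best` is `"right"`: the `"drop"` placement would score higher but is excluded because it leads to a loss.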
