Effects of Constant Optimization by Nonlinear Least Squares Minimization in Symbolic Regression
SLIDE 1

Michael Kommenda, Gabriel Kronberger, Stephan Winkler, Michael Affenzeller, and Stefan Wagner

Effects of Constant Optimization by Nonlinear Least Squares Minimization in Symbolic Regression

Contact: Michael Kommenda
Heuristic and Evolutionary Algorithms Lab (HEAL)
Softwarepark 11, A-4232 Hagenberg
e-mail: michael.kommenda@fh-hagenberg.at
Web: http://heal.heuristiclab.com, http://heureka.heuristiclab.com

SLIDE 2

Symbolic Regression


Model a relationship between the input variables x and the target variable y without any predefined structure. The error Ξ΅ is minimized by an evolutionary algorithm, which simultaneously searches for:

  • Model structure
  • Used variables
  • Constants / weights
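Stated formally (a standard formulation, reconstructed for clarity rather than quoted from the slide):

```latex
% Symbolic regression: the EA evolves the expression f itself,
% together with its variables and constants, to minimize the error
y_i = f(x_i) + \varepsilon_i, \qquad \min_{f}\; \sum_{i=1}^{n} \varepsilon_i^{2}
```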
SLIDE 3

Research Assumption


The correct model structure is found during algorithm execution, but is not recognized due to misleading or wrong constants.

[Figure: two expression trees with identical structure over X and Y but different constant values (1.2, 5.0, 0.3, 2.0 vs. 1.2, 1.0, 0.6)]

SLIDE 4

Constants in Symbolic Regression


Ephemeral Random Constants

  • Randomly initialized constants
  • Remain fixed during the algorithm run

Evolutionary Constants

  • Updated by mutation

ο€­ π·π‘œπ‘“π‘₯ = π·π‘π‘šπ‘’ + 𝑂 0, 𝜏 ο€­ π·π‘œπ‘“π‘₯ = π·π‘π‘šπ‘’ βˆ— 𝑂 1, 𝜏

Finding correct constants

  • combination of existing values
  • mutation of constant symbol nodes

  – undirected changes to values

[Figure: expression tree with constant leaf nodes 1.2, 1.0, 0.3, 2.0]

SLIDE 5

Summary of Previous Research


  • Faster genetic programming based on local gradient search of numeric leaf values (Topchy and Punch, GECCO 2001)
  • Improving gene expression programming performance by using differential evolution (Zhang et al., ICMLA 2007)
  • Evolution Strategies for Constants Optimization in Genetic Programming (Alonso, ICTAI 2009)
  • Differential Evolution of Constants in Genetic Programming Improves Efficacy and Bloat (Mukherjee and Eppstein, GECCO 2012)

SLIDE 6

Linear Scaling


Improving Symbolic Regression with Interval Arithmetic and Linear Scaling (Keijzer, EuroGP 2003): use Pearson's RΒ² as fitness function and perform linear scaling.

  • Removes necessity to find correct offset and scale
  • Computationally efficient

Outperforms the local gradient search
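Because the optimal offset a and slope b have a closed-form least-squares solution, scaling each model before evaluation is cheap. A minimal sketch, assuming NumPy arrays of targets y and raw model outputs f:

```python
import numpy as np

def linear_scaling(y, f):
    """Offset a and slope b such that a + b*f fits y in the least-squares
    sense; scoring with R² is equivalent to evaluating the optimally
    scaled model, so GP need not evolve offset and scale itself."""
    b = np.cov(y, f, bias=True)[0, 1] / np.var(f)  # slope = cov(y, f) / var(f)
    a = np.mean(y) - b * np.mean(f)                # offset through the means
    return a, b
```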

SLIDE 7

Constant Optimization

Concept

  • Treat all constants as parameters
  • Local optimization step
  • Multidimensional optimization

Levenberg-Marquardt Algorithm

  • Least squares fitting of model parameters to empirical data
  • π‘π‘—π‘œπ‘—π‘›π‘—π‘¨π‘“ 𝑅 𝛾 =

𝑧𝑗 βˆ’ 𝑔 𝑦𝑗, 𝛾

2 𝑛 𝑗=1

  • Uses gradient and Jacobian matrix information
  • Implemented e.g. by ALGLIB
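The slides point to ALGLIB's implementation; purely as an illustration, the same fit can be sketched with SciPy's Levenberg-Marquardt backend (the model structure and data below are made up for the example):

```python
import numpy as np
from scipy.optimize import least_squares

def model(beta, x):
    # hypothetical structure found by GP; beta are its constants
    return beta[0] + beta[1] * x[:, 0] * x[:, 1]

def residuals(beta, x, y):
    return y - model(beta, x)              # r_i = y_i - f(x_i, beta)

x = np.random.rand(100, 2)
y = 1.2 + 0.5 * x[:, 0] * x[:, 1]          # synthetic target

beta0 = np.array([1.0, 1.0])               # starting point taken from the tree
fit = least_squares(residuals, beta0, args=(x, y), method="lm")
print(fit.x)                               # optimized constants, ~ [1.2, 0.5]
```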

SLIDE 8

Gradient Calculation


Transformation of symbolic expression tree

  • Extract initial numerical values (starting point)
  • Add scaling tree nodes

Automatic differentiation

  • Provided e.g. by AutoDiff
  • Numerical gradient calculation in one pass
  • Faster compared to symbolic differentiation

Update tree with optimized values

  • Optionally calculate new fitness

$\nabla f = \left(\frac{\partial f}{\partial \beta_1}, \frac{\partial f}{\partial \beta_2}, \ldots, \frac{\partial f}{\partial \beta_n}\right)$

[Figure: an expression tree with numeric leaves (3.12, 0.06, 1.0) is transformed into a parameterized tree whose constants β₁, …, β₆ include the added scaling nodes]
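AutoDiff (the .NET library named above) computes the gradient alongside the function value; as a language-neutral illustration only, here is a forward-mode sketch with dual numbers (this variant needs one pass per parameter, whereas the one-pass behaviour mentioned on the slide refers to the library's own evaluation):

```python
class Dual:
    """Dual number val + eps*dot with eps² = 0; dot carries ∂/∂β_i."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def grad(f, beta):
    """∇f = (∂f/∂β₁, …, ∂f/∂βₙ), one forward pass per parameter."""
    return [f([Dual(b, float(i == j)) for j, b in enumerate(beta)]).dot
            for i in range(len(beta))]

# f(β) = β₁ + β₂·β₃ at (3.12, 0.06, 1.0)  ->  gradient (1.0, 1.0, 0.06)
print(grad(lambda b: b[0] + b[1] * b[2], [3.12, 0.06, 1.0]))
```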

SLIDE 9

Constant Optimization Improvement

$Improvement = Quality_{optimized} - Quality_{original}$

Exemplary GP run:

  • Average and median improvement stay constantly low
  • Maximum improvement almost reaches the best quality found
  • Crossover worsens good individuals
  • The quality of a few individuals can be increased dramatically

SLIDE 10

Problems


Symbolic regression benchmarks

  • Better GP Benchmarks: Community Survey Results and Proposals (White et al., GPEM 2013)


Problem          Function                                                              Training  Test
Nguyen-7         f(x) = ln(x + 1) + ln(x² + 1)                                         20        500
Keijzer-6        f(x, y, z) = 30xz / ((x − 10)y²)                                      20        120
Vladislavleva-4  f(x₁, …, x₅) = 10 / (5 + Σᡢ (xᡢ − 3)²)                                1024      5000
Pagie-1          f(x, y) = 1 / (1 + x⁻⁴) + 1 / (1 + y⁻⁴)                               676       1000
Poly-10          f(x₁, …, x₁₀) = x₁x₂ + x₃x₄ + x₅x₆ + x₁x₇x₉ + x₃x₆x₁₀                 250       250
Friedman-2       f(x₁, …, x₁₀) = 10 sin(πx₁x₂) + 20(x₃ − 0.5)² + 10x₄ + 5x₅ + N(0, 1)  500       5000
Tower            Real-world data                                                       3136      1863

SLIDE 11

Algorithm Configurations


Genetic Programming with strict offspring selection

  • Only child individuals with better quality compared to the fitter parent are accepted in the new generation (sketched after the parameter list below)

Varying parameters

  • Population size of 500, 1000, and 5000 for runs without constant optimization
  • Probability for constant optimization 25%, 50%, and 100% (population size 500)

All other parameters were not modified

  • Maximum selection pressure of 100 was used as termination criterion
  • Size constraints of tree length 50 and depth 12
  • Mutation rate of 25%
  • Function set consists solely of arithmetic functions (except Nguyen-7)
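For context, the strict acceptance rule can be sketched as follows (a schematic only, assuming a maximized fitness; HeuristicLab's actual parent selection and selection-pressure bookkeeping are more involved):

```python
import random

def strict_os_generation(pop, crossover, mutate, fitness,
                         mutation_rate=0.25, max_sel_pressure=100):
    """One generation of strict offspring selection: a child is accepted
    only if it beats the fitter of its two parents; the ratio of created
    to accepted children acts as the termination criterion."""
    new_pop, created = [], 0
    while len(new_pop) < len(pop):
        created += 1
        p1, p2 = random.sample(pop, 2)
        child = crossover(p1, p2)
        if random.random() < mutation_rate:
            child = mutate(child)
        if fitness(child) > max(fitness(p1), fitness(p2)):
            new_pop.append(child)              # strictly better -> keep
        if created > max_sel_pressure * len(pop):
            return None                        # pressure limit -> stop the run
    return new_pop
```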

SLIDE 12

Results - Quality


Success rate (test RΒ² > 0.99)

[Bar chart: success rates (0.0–1.0) on Nguyen-7, Keijzer-6, Vladislavleva-4, Pagie-1, and Poly-10 for OSGP 500, OSGP 1000, OSGP 5000, CoOp 25%, CoOp 50%, and CoOp 100%]

SLIDE 13

Results - Quality


Noisy datasets

  • Success rate not applicable
  • RΒ² of best training solution (ΞΌ Β± Οƒ)


                 Friedman-2                       Tower
Configuration    Training        Test             Training        Test
OSGP 500         0.836 ± 0.027   0.768 ± 0.172    0.877 ± 0.007   0.876 ± 0.012
OSGP 1000        0.857 ± 0.036   0.831 ± 0.102    0.880 ± 0.006   0.877 ± 0.024
OSGP 5000        0.908 ± 0.035   0.836 ± 0.191    0.892 ± 0.006   0.890 ± 0.008
CoOp 25%         0.959 ± 0.001   0.871 ± 0.151    0.919 ± 0.006   0.916 ± 0.007
CoOp 50%         0.967 ± 0.000   0.920 ± 0.086    0.925 ± 0.005   0.921 ± 0.006
CoOp 100%        0.964 ± 0.000   0.864 ± 0.142    0.932 ± 0.005   0.927 ± 0.005

SLIDE 14

Results – LM Iterations


Constant optimization probability of 50%; varying numbers of iterations for the LM algorithm (3x, 5x, 10x)

  • Reported: success rate, or test RΒ² (ΞΌ Β± Οƒ) for the noisy datasets


Problem          OSGP 5000      CoOp 50% 3x    CoOp 50% 5x    CoOp 50% 10x
Nguyen-7         1.00           0.92           0.92           0.94
Keijzer-6        0.74           0.92           0.88           0.94
Vladislavleva-4  0.48           0.56           0.82           0.86
Pagie-1          0.20           0.26           0.52           0.74
Poly-10          0.62           0.78           0.88           0.94
Friedman-2       0.836 ± 0.191  0.946 ± 0.046  0.943 ± 0.076  0.920 ± 0.086
Tower            0.890 ± 0.009  0.902 ± 0.010  0.912 ± 0.008  0.921 ± 0.006

SLIDE 15

Results - Execution Effort


Execution effort relative to OSGP 500

[Bar chart: execution effort relative to OSGP 500 (0–35) on Nguyen-7, Keijzer-6, Vladislavleva-4, Pagie-1, Poly-10, Friedman-2, and Tower for OSGP 1000, OSGP 5000, CoOp 50% 3x, CoOp 50% 5x, and CoOp 50% 10x]

SLIDE 16

Feature Selection Problems


Artificial datasets

  • 100 input variables sampled from N(0, 1)
  • Target is a linear combination of 10 or 25 variables with weights from U(0, 10)
  • Noisy β†’ maximum achievable RΒ² = 0.90
  • Training 120 rows, test 500 rows
  • Population size 500
  • Constant optimization 50%, 5x iterations (a generation sketch follows below)
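One way such a dataset could be generated (a sketch; the exact noise model is an assumption, scaled so that the best attainable RΒ² is 0.90):

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows, n_vars, n_used = 620, 100, 10       # 120 training + 500 test rows

X = rng.standard_normal((n_rows, n_vars))   # 100 inputs ~ N(0, 1)
w = rng.uniform(0, 10, size=n_used)         # weights ~ U(0, 10)
signal = X[:, :n_used] @ w                  # linear combination of 10 variables

# R² = var(signal) / (var(signal) + var(noise)) = 0.90
# =>  var(noise) = var(signal) / 9
noise = rng.standard_normal(n_rows) * np.sqrt(signal.var() / 9)
y = signal + noise

X_train, y_train = X[:120], y[:120]
X_test,  y_test  = X[120:], y[120:]
```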

Observation

  • Constant optimization can lead to overfitting
  • Selection of correct features is also an issue

[Bar chart: training and test RΒ² (0.6–1.0) for OSGP vs. CoOp on the 10-feature and 25-feature problems]

SLIDE 17

Conclusion


Constant optimization improves the success rate and quality of models

  • Better results with smaller population size
  • Especially useful for post-processing of models

Removes the effort of evolving correct constants

  • Genetic programming can concentrate on the model structure and feature selection

Ready-to-use implementation in HeuristicLab

  • Configurable probability, iterations, random sampling
  • All experiments available for download
  • http://dev.heuristiclab.com/AdditionalMaterial

SLIDE 18

Michael Kommenda, Gabriel Kronberger, Stephan Winkler, Michael Affenzeller, and Stefan Wagner

Effects of Constant Optimization by Nonlinear Least Squares Minimization in Symbolic Regression

Contact: Michael Kommenda
Heuristic and Evolutionary Algorithms Lab (HEAL)
Softwarepark 11, A-4232 Hagenberg
e-mail: michael.kommenda@fh-hagenberg.at
Web: http://heal.heuristiclab.com, http://heureka.heuristiclab.com