Kaizen Programming
Vinícius Veloso de Melo vinicius.melo@unifesp.br Institute of Science and Technology (ICT) Federal University of São Paulo (UNIFESP)
Newfoundland & Labrador, Canada
07/14/14 GECCO 2014, VANCOUVER, BC, CANADA
The quality of the individual (fitness) is how well it solves the problem
[Figure: evolutionary cycle. Initial population → Calculate the fitness → Selection → Generate → Check stopping criteria / insert into current population → Solution]
– Each individual represents a complete solution
– Evolution is driven by natural selection (improvement of the fittest)
– Not-so-good individuals can pass to the next generation via tournament selection in order to maintain diversity in the population
– Usually random modifications
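The cycle described above can be sketched in a few lines. This is only an illustrative toy, not the setup used in the talk: the individuals are plain real numbers, and the names `tournament`, `evolve`, `toy_fitness`, along with every parameter value, are assumptions made for the example.

```python
import random


def tournament(population, fitness, k, rng):
    """Return the fittest of k randomly drawn individuals (lower is better)."""
    contestants = rng.sample(population, k)
    return min(contestants, key=fitness)


def evolve(fitness, n_pop=30, n_gen=60, k=2, sigma=0.3, seed=0):
    """Toy evolutionary cycle over real-valued individuals (assumed encoding)."""
    rng = random.Random(seed)
    # Initial population: each individual is a complete candidate solution.
    population = [rng.uniform(-10.0, 10.0) for _ in range(n_pop)]
    best = min(population, key=fitness)
    for _ in range(n_gen):  # stopping criterion: fixed generation budget
        offspring = []
        for _ in range(n_pop):
            # With a small k, weaker individuals can still win a tournament,
            # which helps keep diversity in the population.
            parent = tournament(population, fitness, k, rng)
            # Random modification (Gaussian mutation).
            offspring.append(parent + rng.gauss(0.0, sigma))
        population = offspring
        best = min(population + [best], key=fitness)
    return best


def toy_fitness(x):
    """Hypothetical problem: squared distance to 3.0 (0 is a perfect solution)."""
    return (x - 3.0) ** 2


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Note that fitness is only known for the individual as a whole; nothing in this loop says which part of a solution is responsible for its error.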
– Global optimum:
– Current best:
– GP (or similar) inserts expressions trying to reduce the error caused by the garbage expressions
– Is it easy for GP (or similar) to get to the global optimum from this current best?
Source: http://www.binaryspectrum.com/itservices/quality_assurance.html
– Evolution becomes a collaborative approach
– A team of experts is formed to propose ideas to solve a problem; the ideas are put together to become a complete solution
– The quality of the solution is how well it solves the problem
– The quality of an idea is a measurement
Now one can determine, exactly, which parts
– This property results in reduced bloat, smaller population sizes, and fewer function evaluations
– Construct a solution (build a model) using only the new ideas or new and
GP (crossover and mutation)
– Using the set of terminals and non-terminals, the new ideas (Kᵢ) are non-linear relationships among the variables, i.e.:
Squares (multiple linear regression model)
– ŷ = β₁K₁ + β₂K₂ + β₃K₃
– βᵢ are used to scale the ideas and are discovered by OLS
parameters
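As a minimal sketch of the model-building step, the block below fits ŷ = β₁K₁ + β₂K₂ + β₃K₃ by solving the OLS normal equations in pure Python. The three ideas (sin(x), x², cos(x)), the toy target, and the helper names `solve` and `ols` are assumptions chosen for illustration; they do not come from the talk.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(columns, y):
    """Least-squares betas for y ≈ sum(beta_i * K_i), via the normal equations."""
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(p * t for p, t in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)


# Hypothetical ideas K1, K2, K3: non-linear functions of the variable x.
ideas = [math.sin, lambda x: x * x, math.cos]
xs = [i / 10.0 for i in range(1, 61)]
# Toy target (assumed): only K1 and K2 are actually needed.
y = [2.0 * math.sin(x) + 0.5 * x * x for x in xs]

columns = [[idea(x) for x in xs] for idea in ideas]
betas = ols(columns, y)
y_hat = [sum(b * col[i] for b, col in zip(betas, columns)) for i in range(len(xs))]
print([round(b, 6) for b in betas])
```

Each idea only has to capture part of the target; the betas take care of the scaling, and an idea that contributes nothing ends up with a coefficient close to zero.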
– Adjusted R²: proportion of variance explained
– P-value: hypothesis test at a significance level α
– Ideas with non-significant values (or very small
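The two quality measures above can be sketched as follows. Several things here are assumptions, not statements about the talk: an intercept column is added so the usual adjusted R² formula applies, the p-values use a normal approximation instead of Student's t distribution (reasonable only for large samples), and the ideas, data, and names (`invert`, `evaluate_ideas`) are hypothetical.

```python
import math
import random
from statistics import NormalDist


def invert(a):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        d = m[col][col]
        m[col] = [v / d for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def evaluate_ideas(columns, y):
    """Return (betas, adjusted R2, approximate p-values); index 0 is the intercept."""
    n = len(y)
    design = [[1.0] * n] + columns  # assumed intercept column
    k = len(design)
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(p * t for p, t in zip(ci, y)) for ci in design]
    inv = invert(xtx)
    betas = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [t - sum(b * c[r] for b, c in zip(betas, design)) for r, t in enumerate(y)]
    ss_res = sum(e * e for e in resid)
    mean_y = sum(y) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    sigma2 = ss_res / (n - k)  # residual variance estimate
    p_values = []
    for i in range(k):
        z = betas[i] / math.sqrt(sigma2 * inv[i][i])
        # Normal approximation to the two-sided t-test (assumption).
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(z))))
    return betas, adj_r2, p_values


rng = random.Random(42)
xs = [i / 10.0 for i in range(200)]
y = [2.0 * math.sin(x) + rng.gauss(0.0, 0.1) for x in xs]  # toy noisy target
ideas = {
    "sin(x)": [math.sin(x) for x in xs],
    "cos(3x)": [math.cos(3.0 * x) for x in xs],
}
betas, adj_r2, p_values = evaluate_ideas(list(ideas.values()), y)

alpha = 0.05  # significance level
for name, b, p in zip(ideas, betas[1:], p_values[1:]):
    print(name, round(b, 3), "keep" if p < alpha else "discard")
print("adjusted R2:", round(adj_r2, 4))
```

Adjusted R² scores the complete solution, while each p-value scores a single idea, so an idea whose coefficient is not significant at level α can be singled out.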
f(x) = -(x²)/(123.91-x+tanh(10)) - 13.502*sin(x) + sqrt((5.2134³)*x)
f(x) = -13.502*sin(x)
0.0740631*(-13.502)*sin(x)
1.0*sin(x)
P-value > α P-value > α P-value < α
Check this constant!
– Standard Crossover (SC)
– No Same Mate (NSM) selection
– Semantics Aware Crossover (SAC)
– Context Aware Crossover (CAC)
– Soft Brood Selection (SBS)
– Semantic Similarity-based Crossover (SSC)
and B. Gorkemli. Artificial bee colony programming for symbolic regression. Information Sciences, 209(0):1–15, 2012.
Why so large for
– Number of experts: 8
– Maximum number of node evaluations: 1 × 10⁵
– Idea improver: 90% GP Uniform Mutation / 10% GP
– Idea sharing: one-point crossover
– 100 independent runs
solving approach in which partial solutions are put together to result in a complete solution
– Easier to understand (when compared to a single huge bloated solution generated by GP) and to interpret if necessary
component, or a sine component, etc
– The resulting ideas can be seen as features extracted from the dataset. The features have distinct accuracies and complement each other
– the guessing, and consequently the bloat,
– Change the techniques in the modules
– Test on other symbolic regression problems
– Test on real-world problems and datasets
– Perform sensitivity analysis of the parameters
– New results to be submitted soon
vinicius.melo@unifesp.br
This work was supported by CNPq (Universal) grant 486950/2013-1