Kaizen Programming - Vinícius Veloso de Melo (vinicius.melo@unifesp.br)

SLIDE 1

Kaizen Programming

Vinícius Veloso de Melo vinicius.melo@unifesp.br Institute of Science and Technology (ICT) Federal University of São Paulo (UNIFESP)

Newfoundland & Labrador, Canada

SLIDE 2

07/14/14 GECCO 2014, VANCOUVER, BC, CANADA

Summary

  • Context
  • Kaizen Programming
  • Experiments
  • Summary and Conclusions
  • Future works
SLIDE 3

Context

Evolutionary cycle: Initial population → Calculate the fitness → Selection → Generate offspring → Check stopping criteria / insert into current population → Solution

  • Each individual represents a complete solution
  • The quality of the individual (fitness) is how well it solves the problem
  • Evolution is driven by natural selection (improvement of the fittest)
  • Not-so-good individuals can pass to the next generation via tournament selection in order to maintain diversity in the population
  • Usually random modifications

SLIDE 4

Context

  • Suppose the following symbolic regression problem:

– Global optimum:

  • f(x) = sin(x)

– Current best:

  • f(x) = -(x²)/(123.91-x+tanh(10))-13.502*sin(x)+sqrt(abs((5.2134³)*x))

– GP (or similar) inserts expressions trying to reduce the error caused by the garbage expressions

  • Bloat

– Is it easy for GP (or similar) to get to the global optimum from this current best?

  • What if we could detect which parts of the expression are good and which are bad?

SLIDE 5

SLIDE 6

The Kaizen methodology

  • The Japanese word Kaizen means “Good Change,” and is adopted as a philosophy of work which means continuous improvement

  • Kaizen Event is the term given to an event consisting of a team (of workers and managers) working together for a brief period of time to find effective solutions to identified business problems

SLIDE 7

SLIDE 8

The Kaizen methodology

Plan-Do-Check-Act (PDCA)

Source: http://www.binaryspectrum.com/itservices/quality_assurance.html

SLIDE 9

Kaizen Programming

  • Kaizen Programming (KP) is a novel evolutionary tool based on the concepts of the Kaizen methodology

  • KP is a computational implementation of a Kaizen event with PDCA

  • KP individuals are not complete solutions, but parts of one (divide and conquer strategy)

– Evolution becomes a collaborative approach instead of an egocentric one

SLIDE 10

Kaizen Programming

PDCA cycle

  • A team of experts is formed to propose ideas to solve a problem; the ideas are put together to become a complete solution
  • The quality of the solution is how well it solves the problem
  • The quality of an idea is a measurement of its contribution to the solution
  • Now one can determine, exactly, which parts of the solution should be removed or improved
  • Such a property results in a reduction in bloat, smaller population sizes, and a lower number of function evaluations
  • Construct a solution (build a model) using only the new ideas, or new and old ideas at the same time
SLIDE 11

Kaizen Programming

Application in Symbolic Regression

  • The creation/modification of the ideas is performed by GP (crossover and mutation)

– Using the set of terminals and non-terminals, the new ideas (Ki) are non-linear relationships among the variables, e.g.:

  • K1=x²; K2=sin(x); K3=-x+3/x

  • The evaluation is performed by Ordinary Least Squares (multiple linear regression model)

– ŷ = β1K1+β2K2+β3K3
– βi are used to scale the ideas and are discovered by OLS

  • All models generated by KP are linear in the parameters
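The OLS step described on this slide can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sample points, the coefficients used to generate the target, and the helper names `solve_linear_system` and `fit_ols` are hypothetical and do not come from the slides. The sketch evaluates each idea Ki on the data, forms the normal equations, and solves them for the βi.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: ideas are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(ideas, xs, ys):
    """Least-squares betas for y ~ sum_i beta_i * K_i(x), without intercept."""
    columns = [[idea(x) for x in xs] for idea in ideas]
    k = len(ideas)
    gram = [[sum(p * q for p, q in zip(columns[i], columns[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * y for p, y in zip(columns[i], ys)) for i in range(k)]
    return solve_linear_system(gram, rhs)


# The three example ideas from the slide: K1 = x^2, K2 = sin(x), K3 = -x + 3/x.
ideas = [lambda x: x ** 2, math.sin, lambda x: -x + 3 / x]

# Hypothetical data: a target built from the ideas with known coefficients.
xs = [0.5 + 0.25 * i for i in range(20)]
ys = [2.0 * x ** 2 - 1.5 * math.sin(x) + 0.5 * (-x + 3 / x) for x in xs]

betas = fit_ols(ideas, xs, ys)
print(betas)
```

Because the model is linear in the βi, the fit is a single linear solve no matter how non-linear each idea is in x.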

SLIDE 12

Kaizen Programming

Application in Symbolic Regression

  • The quality of the model (containing all partial solutions) is a measure of the goodness-of-fit

– Adjusted R²: proportion of variance explained

  • The quality of each partial solution is its importance to the model, not how well it fits!

– P-value: hypothesis test at a significance level α
– Ideas with non-significant p-values (or very small β) are not useful to the model

  • The analysis of the model is used to guide the search instead of using natural selection
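For reference, the two quality measures named on this slide have the following textbook definitions. The slides do not state which exact variant KP uses, so this is a sketch under assumptions: n observations, p ideas, a design matrix K that includes an intercept column (which the slide's formula for ŷ does not show), and ν = n − p − 1 residual degrees of freedom. F is the cumulative distribution function of Student's t with ν degrees of freedom.

```latex
\begin{aligned}
R^2 &= 1 - \frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}{\sum_{j=1}^{n} (y_j - \bar{y})^2}, &
\bar{R}^2 &= 1 - (1 - R^2)\,\frac{n - 1}{\nu}, \\
\hat{\sigma}^2 &= \frac{1}{\nu} \sum_{j=1}^{n} (y_j - \hat{y}_j)^2, &
t_i &= \frac{\hat{\beta}_i}{\sqrt{\hat{\sigma}^2 \left[(K^\top K)^{-1}\right]_{ii}}}, \\
p_i &= 2\left(1 - F_{t,\nu}(|t_i|)\right), & &\text{idea } K_i \text{ is kept if } p_i < \alpha .
\end{aligned}
```

The adjusted R² scores the whole model, while each p-value scores a single idea, which matches the slide's split between solution quality and idea quality.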

SLIDE 13

Remember me?

f(x) = -(x²)/(123.91-x+tanh(10))-13.502*sin(x)+sqrt(abs((5.2134³)*x))

  • -(x²)/(123.91-x+tanh(10)): P-value > α
  • sqrt(abs((5.2134³)*x)): P-value > α
  • -13.502*sin(x): P-value < α

Only the significant idea remains, and OLS rescales it:

-13.502*sin(x) → (-0.0740631)*(-13.502)*sin(x) = 1.0*sin(x)

ONE SINGLE ITERATION!
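The single iteration on this slide can be reproduced approximately with a short, self-contained sketch. Several things here are assumptions rather than details from the slides: the sampling interval [-3, 3], the small Gaussian noise added to the target (without noise the residual variance is zero and the test statistics degenerate), the normal approximation used for the p-values in place of Student's t, α = 0.05, and the helper name `solve`. The three additive parts of the "current best" are treated as separate ideas and fitted to sin(x) by OLS.

```python
import math
import random


def solve(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# The three additive parts of the slide's "current best", as separate ideas.
ideas = [
    lambda x: -(x ** 2) / (123.91 - x + math.tanh(10)),
    lambda x: -13.502 * math.sin(x),
    lambda x: math.sqrt(abs(5.2134 ** 3 * x)),
]

random.seed(0)
xs = [-3.0 + 6.0 * i / 199 for i in range(200)]           # assumed interval
ys = [math.sin(x) + random.gauss(0.0, 0.01) for x in xs]  # assumed noise level

cols = [[idea(x) for x in xs] for idea in ideas]
p, n = len(ideas), len(xs)
gram = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(p)]
        for i in range(p)]
rhs = [sum(u * y for u, y in zip(cols[i], ys)) for i in range(p)]
beta = solve(gram, rhs)

# Residual variance and standard errors from the diagonal of the inverse Gram matrix.
residuals = [y - sum(beta[i] * cols[i][t] for i in range(p))
             for t, y in enumerate(ys)]
sigma2 = sum(r * r for r in residuals) / (n - p)
inv_diag = [solve(gram, [1.0 if r == i else 0.0 for r in range(p)])[i]
            for i in range(p)]
t_stats = [beta[i] / math.sqrt(sigma2 * inv_diag[i]) for i in range(p)]

# Two-sided p-values, normal approximation to Student's t (an assumption).
p_values = [math.erfc(abs(t) / math.sqrt(2)) for t in t_stats]

alpha = 0.05
for i in range(p):
    verdict = "keep" if p_values[i] < alpha else "drop"
    print(f"idea {i}: beta={beta[i]: .6f}  p={p_values[i]:.3g}  -> {verdict}")
```

The coefficient of the sine idea should come out close to -1/13.502, so the rescaled idea is close to 1.0*sin(x). The two garbage ideas are expected to be non-significant in most runs, although with noisy data a false positive at level α remains possible.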

SLIDE 14

Kaizen Programming

Application in Symbolic Regression

Check this constant!

SLIDE 15

Experiments: Symbolic regression

Main results: Nguyen functions

SLIDE 16

Experiments: Symbolic regression

Main results: Nguyen functions

  • Artificial Bee Colony Programming (ABCP)
  • Genetic Programming

– Standard Crossover (SC)
– No Same Mate (NSM) selection
– Semantics Aware Crossover (SAC)
– Context Aware Crossover (CAC)
– Soft Brood Selection (SBS)
– Semantic Similarity-based Crossover (SSC)

  • Results taken from D. Karaboga, C. Ozturk, N. Karaboga, and B. Gorkemli. Artificial bee colony programming for symbolic regression. Information Sciences, 209:1-15, 2012.

SLIDE 17

Experiments: Symbolic regression

Main results: Nguyen functions (Karaboga et al., 2012)

Why so large for only 2 terminals?

SLIDE 18

Experiments: Symbolic regression

Main results: Nguyen functions (Karaboga et al., 2012)

SLIDE 19

Experiments: Symbolic regression

Main results: Nguyen functions using KP

  • Kaizen Programming

– Number of experts: 8
– Maximum number of node evaluations: 1 × 10⁵
– Idea improver: 90% GP Uniform Mutation / 10% GP ERC Mutation
– Idea sharing: one-point crossover
– 100 independent runs

  • This configuration will certainly give terrible results!
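For readers who want to reproduce the setup, the settings listed on this slide could be collected in a small configuration object. The class name `KPConfig` and all field names are hypothetical and chosen only for illustration. The values are the ones on the slide, reading the evaluation budget as 1 × 10⁵.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KPConfig:
    """Experiment settings from the slide; field names are illustrative."""
    num_experts: int = 8
    max_node_evaluations: int = 100_000   # 1 x 10^5
    uniform_mutation_share: float = 0.90  # idea improver: GP uniform mutation
    erc_mutation_share: float = 0.10      # idea improver: GP ERC mutation
    idea_sharing: str = "one-point crossover"
    independent_runs: int = 100


config = KPConfig()
print(config)
```

Freezing the dataclass keeps the settings immutable across the 100 independent runs.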

SLIDE 20

Experiments: Symbolic regression

Main results: Nguyen functions using KP

SLIDE 21

Experiments: Symbolic regression

Main results: Nguyen functions

SLIDE 22

Summary and Conclusions

  • Kaizen Programming (KP) uses a collaborative problem-solving approach in which partial solutions are put together to form a complete solution

  • The final solution is a multiple linear regression model

– Easier to understand (when compared to a single huge bloated solution generated by GP) and to interpret if necessary

  • for instance: in the final model the best curve is an exponential component, or a sine component, etc.

– The resulting ideas can be seen as features extracted from the dataset. The features have distinct accuracies and complement each other

  • PCA? ICA? FFT?
SLIDE 23

Summary and Conclusions

  • KP’s methodology helps the search because it acts as a better guide than regular natural selection, in which only the best are useful

  • One can know, exactly, which ideas are useful for the next improvement cycle

– the guessing, and consequently the bloat, are decreased when compared to Genetic Programming (GP) and similar approaches

SLIDE 24

Future Works

  • For regression:

– Change the techniques in the modules

  • GP, OLS, Adjusted R², p-value

– Test in other symbolic regression problems
– Test in real-world problems and datasets
– Perform sensitivity analysis of the parameters

  • Test in other kinds of problems

– New results to be submitted soon

  • Suggestions/collaborations?
SLIDE 25

Thank you

vinicius.melo@unifesp.br

This work was supported by CNPq (Universal) grant 486950/2013-1

Newfoundland & Labrador, Canada