SAT-based Encodings for Optimal Decision Trees with Explicit Paths
Mikoláš Janota1,2, António Morgado1
1 INESC-ID/IST, Universidade de Lisboa, Portugal 2 Czech Technical University in Prague, Czech Republic
SAT 2020
Janota and Morgado 1 / 19
Example of Decision Trees
Question: Should I start writing a new paper?
Features:
A - Is the idea for the paper great and innovative?
B - Are the results bad?
C - Is there a very close deadline for the paper?
Samples:
[Table: samples over the features A, B, C with the label "Write?"; the cell values did not survive extraction.]
⇒
[Figure: a decision tree that tests A, C, and B, with leaves "Write" and "No".]
Objectives and Motivation
What and Why:
◮ Given a set of samples, find a provably smallest decision tree.
◮ Why smallest? By Occam's razor, smaller trees tend to generalize better.
◮ Why provably smallest? Standard algorithms only find small trees heuristically, with no guarantee of minimality.
How:
◮ Encode into SAT the question "is there a tree of size N?"
◮ Look for the smallest tree iteratively.
◮ Two minimization criteria are investigated: depth and size.
Minimum Depth Optimal Decision Tree
Example benchmark postoperative-patient-data-un 1-un with 50% sampling (approx. 20 features, 40 samples):
[Figure: the two decision trees found for this benchmark; node and edge labels did not survive extraction.]
sklearn: depth 11, 37 nodes
d-dtfinder: depth 6, 29 nodes
The remaining tools time out (after 1000 s).
Encoding Decision Trees into SAT
1 Current SAT approaches:
◮ model the tree as a DAG
◮ impose a tree structure
2 Our new approach:
◮ model the tree as a set of paths
◮ impose conditions on the paths so that they make up a (binary) tree
3 Advantages:
◮ explicit control over the tree's depth
◮ no need for cardinality constraints over neighbors (DAG)
◮ no need for a distinction between internal and leaf nodes
Trees as Paths
◮ A tree expands to O(n) paths.
◮ Consecutive paths overlap until they diverge.
[Figure: a tree over the nodes A to G, unrolled into the paths A B C, A B D E, A B D F, and A G; the edge labels did not survive extraction.]
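A minimal sketch of the unrolling, assuming the tree shape that the four listed paths imply (A branches to B and G, B to C and D, D to E and F). The dictionary layout and the helper names `root_to_leaf_paths` and `shared_prefix_length` are illustrative and not taken from the talk.

```python
# Tree shape inferred from the paths on the slide (an assumption):
# A -> (B, G), B -> (C, D), D -> (E, F); C, E, F, G have no children.
TREE = {"A": ("B", "G"), "B": ("C", "D"), "D": ("E", "F")}


def root_to_leaf_paths(tree, node):
    """Return all paths from `node` down to a leaf, ordered left to right."""
    if node not in tree:  # no children recorded, so this node ends a path
        return [[node]]
    left, right = tree[node]
    return [
        [node] + tail
        for child in (left, right)
        for tail in root_to_leaf_paths(tree, child)
    ]


def shared_prefix_length(p, q):
    """Number of leading nodes that two paths have in common."""
    n = 0
    for a, b in zip(p, q):
        if a != b:
            break
        n += 1
    return n


paths = root_to_leaf_paths(TREE, "A")
# How long each path overlaps with its left neighbour before diverging.
overlaps = [shared_prefix_length(p, q) for p, q in zip(paths, paths[1:])]
```

Here `paths` reproduces the four paths from the figure, and `overlaps` records how many nodes each pair of consecutive paths shares before diverging.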
Paths as Booleans
The shape of any path is modeled by three Booleans per step, for steps 1 . . . S:
◮ go right?
◮ terminate?
◮ equal to previous?
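One possible way to lay out these Booleans as SAT variables is sketched below. The function name `allocate_path_variables`, the key layout, and the DIMACS-style numbering from 1 are assumptions made for illustration; the talk does not specify how variables are numbered.

```python
from itertools import count


def allocate_path_variables(num_paths, num_steps):
    """Map (path, step, name) to a fresh positive integer variable.

    Hypothetical layout: three Booleans per step of every path,
    numbered from 1 as in the DIMACS CNF format.
    """
    fresh = count(1)
    return {
        (p, s, name): next(fresh)
        for p in range(num_paths)
        for s in range(1, num_steps + 1)
        for name in ("go_right", "terminate", "equal_to_prev")
    }


variables = allocate_path_variables(num_paths=4, num_steps=3)
```

With P paths and S steps this yields 3·P·S variables, so the bound S on the number of steps directly bounds the depth of any tree the encoding can describe.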
Paths as Booleans, example
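[Figure: a tree over the nodes A, B, C, unrolled into the paths A B and A C; the edge labels did not survive extraction.]
Path A C, step by step:
step 1: go right = T, terminate = F, equal to prev = T
step 2: go right = *, terminate = F, equal to prev = F
step 3: go right = *, terminate = T, equal to prev = F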
Shaping Paths into Trees
[Figure: a tree over the nodes N1 to N7, followed by a second tree over the nodes N1, N2, N3, N6, N7, each shown with the direction bits of its paths; the bit values did not survive extraction.]
Shaping Paths into Trees, Main Rules
1 The first path always goes to the left.
2 The last path always goes to the right.
3 Paths must not cross.
4 To avoid gaps: once a path diverges from the previous path,
◮ it has to keep going left,
◮ the previous path has to keep going right.
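The rules above can be sketched as a plain checker over a left-to-right list of paths. This is an illustration, not the SAT encoding itself: each path is assumed to be written as a string of "L"/"R" directions, and the name `forms_binary_tree` is hypothetical.

```python
def forms_binary_tree(paths):
    """Check the four path rules on a left-to-right list of L/R strings."""
    if not paths:
        return False
    # Rules 1 and 2: the first path only goes left, the last only right.
    if "R" in paths[0] or "L" in paths[-1]:
        return False
    for prev, cur in zip(paths, paths[1:]):
        # Find the step at which the two consecutive paths diverge.
        k = 0
        while k < min(len(prev), len(cur)) and prev[k] == cur[k]:
            k += 1
        if k == len(prev) or k == len(cur):
            return False  # one path is a prefix of the other
        # Rule 3: no crossing, so the previous path branches left here.
        if prev[k] != "L" or cur[k] != "R":
            return False
        # Rule 4: after diverging, prev keeps going right, cur keeps going left.
        if "L" in prev[k + 1:] or "R" in cur[k + 1:]:
            return False
    return True


ok = forms_binary_tree(["LL", "LRL", "LRR", "R"])
gap = forms_binary_tree(["LL", "R"])
```

The first call describes a four-leaf tree and satisfies every rule. The second leaves a gap (the sibling "LR" of "LL" is missing), which rule 4 rejects.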
Encoding Semantics
1 Each path has a classification class (a single Boolean).
2 Each node is assigned a feature.
3 For each path, determine which samples reach its end.
4 If a positive sample reaches the end of a path, the path must be positive.
5 If a negative sample reaches the end of a path, the path must be negative.
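These conditions can be illustrated with a small consistency check over explicit paths. The representation is an assumption: each path is a list of (feature index, go right?) steps plus a Boolean class, and a feature value of 1 is taken to send a sample to the right. The names `reaches_end` and `consistent`, as well as the sample data, are made up for the example.

```python
def reaches_end(sample, steps):
    """A sample reaches the end of a path if it agrees with every step."""
    return all(bool(sample[f]) == go_right for f, go_right in steps)


def consistent(paths, samples):
    """Every sample that reaches the end of a path must carry the path's class."""
    for features, label in samples:
        for steps, path_class in paths:
            if reaches_end(features, steps) and path_class != label:
                return False
    return True


# Hypothetical tree: test feature 0; on the right branch, test feature 1.
paths = [
    ([(0, False)], False),
    ([(0, True), (1, False)], True),
    ([(0, True), (1, True)], False),
]
samples = [((0, 0), False), ((1, 0), True), ((1, 1), False)]

fits = consistent(paths, samples)
clash = consistent(paths, samples + [((0, 1), True)])
```

The extra sample in the second call is positive but ends on a negative path, so the check fails, which mirrors rule 4.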
Optimizations
◮ Enforcing example matching
◮ Pure features
◮ Quasi-pure features
◮ Path lower/upper bounds
Topologies
Observation: the number of topologies for small trees is low.
[Figure: two different tree topologies over nodes numbered 1 to 5.]
#n   5   7   9   11   13    15    17      19      21      ...
#t   2   5   14  42   132   429   1,430   4,862   16,796  ...
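The counts in the table match the Catalan numbers: a binary tree in which every internal node has two children and which has n nodes in total has k = (n - 1) / 2 internal nodes, and there are C_k = binom(2k, k) / (k + 1) such shapes. A short sketch that reproduces the row (the function name `count_topologies` is illustrative):

```python
from math import comb


def count_topologies(nodes: int) -> int:
    """Number of full binary tree shapes with the given total node count."""
    if nodes < 1 or nodes % 2 == 0:
        return 0  # a full binary tree always has an odd number of nodes
    k = (nodes - 1) // 2  # number of internal nodes
    return comb(2 * k, k) // (k + 1)  # k-th Catalan number


row = [count_topologies(n) for n in range(5, 22, 2)]
```

The count grows exponentially in n, which is why only small trees have few topologies.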
Enumerating Topologies
◮ Split the search space for each topology . . .
◮ the solver only fills in features and categories.
◮ Essentially cube and conquer, but informed by the problem.
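A minimal sketch of how the topologies could be enumerated so that each one becomes a separate subproblem. Shapes are represented here as tuples of "L"/"R" leaf paths; this representation and the function name `topologies` are assumptions, and the talk does not say how its implementation enumerates them.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def topologies(leaves):
    """All full binary tree shapes with `leaves` leaves.

    Each shape is a tuple of L/R strings, one per leaf, ordered left to right.
    """
    if leaves == 1:
        return (("",),)  # a single node: the empty path
    shapes = []
    for left_count in range(1, leaves):
        for left in topologies(left_count):
            for right in topologies(leaves - left_count):
                shapes.append(
                    tuple("L" + p for p in left) + tuple("R" + p for p in right)
                )
    return tuple(shapes)


five_node_shapes = topologies(3)  # 3 leaves correspond to 5 nodes
```

Each returned shape fixes every direction bit of every path, so a solver called on one shape would only have to choose features and classes.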
Iterative SAT Calls
◮ Go from smaller to bigger, i.e. from UNSAT to SAT.
◮ Either iterate over depth and size, or over size only.
◮ Topology enumeration combines iterative and non-iterative SAT calls.
◮ Partial topologies are supported for larger numbers.
◮ Heuristics for the topology order were also explored.
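The smaller-to-bigger loop can be sketched as follows for the depth criterion. To keep the example self-contained, a brute-force recursive check stands in for the SAT call; it is not the encoding from the talk, and the names `exists_tree` and `min_depth` as well as the sample data are hypothetical.

```python
def exists_tree(samples, depth):
    """Stand-in for the SAT query: is there a tree of at most `depth`
    that classifies all samples correctly? Samples are (features, label)."""
    if len({label for _, label in samples}) <= 1:
        return True  # a single leaf suffices
    if depth == 0:
        return False
    for f in range(len(samples[0][0])):
        left = [(x, y) for x, y in samples if x[f] == 0]
        right = [(x, y) for x, y in samples if x[f] == 1]
        if left and right and exists_tree(left, depth - 1) and exists_tree(right, depth - 1):
            return True
    return False


def min_depth(samples, max_depth):
    """Try depths 0, 1, 2, ...; every depth before the answer was infeasible."""
    for depth in range(max_depth + 1):
        if exists_tree(samples, depth):
            return depth
    return None  # no tree within the bound


# XOR of two binary features needs both features on every path.
xor_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
best = min_depth(xor_samples, max_depth=4)
```

Because the loop only stops at the first feasible bound, the returned depth is minimal by construction: all smaller bounds were shown infeasible on the way.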
Summary of Results
[Bar chart, axis ticks 650 to 900, comparing Baseline, Path-based-min-depth, DAG-based-topologies, DAG-based, Path-based-topologies, and Path-based. Bar values: 794, 818, 797, 812, 643, 871; the pairing of values with labels is not recoverable from the extraction.]
Distribution of tree sizes
[Plot: distribution of tree sizes (ticks from 11 to 29) for mindt, dtfinder-DT1, dtfinder-DT1-T-O, dtfinder, dtfinder-T-O, ddtfinder, and ddtfinder-T-O.]
Conclusions and Future Work
◮ A novel SAT-based encoding for decision trees that natively controls both the tree's size and depth.
◮ Search-space splitting by topology enumeration.
◮ Our implementation outperforms existing work.
◮ Depth minimization allows optimizing larger instances.
◮ Future work: integrate the proposed techniques into more expressive approaches (e.g. SMT-based synthesis).
◮ Future work: integrate our tool with greedy approaches, for example:
  ◮ a hybrid between a greedy approach and an exact one,
  ◮ ensembles, where only limited depth is considered.