Ehsan Nazerfard nazerfard@eecs.wsu.edu October 11, 2011
- Introduction: Graphical Models
- Tutorial: Bayesian Networks Structure Learning Approaches
- RAI: Recursive Autonomy Identification for Bayesian Network Structure Learning
- Summary and Further Studies
- Analysis
- Next Steps
- Discussion Topics
Three main types of Graphical Models
Graphical models represent a joint probability distribution.
- Nodes: random variables
- Edges: statistical dependencies (direct influence in directed graphs)
Bayesian Networks Structure Learning: Introduction | Tutorial | RAI
Why do we need Graphical Models?
- They give an intuitive representation of the relations between variables
- They abstract out the conditional independence relations between variables
Conditional independence
- “Is A dependent on B, given the value of C?”
- $A \perp B \mid C \iff P(A \mid B, C) = P(A \mid C)$
- $A \perp B \mid C \iff P(A, B \mid C) = P(A \mid C)\,P(B \mid C)$
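Both forms of the definition can be checked numerically. The following is a minimal sketch, assuming three binary variables where C is a common cause of A and B; the probability values and helper names are made up for illustration and do not come from the slides.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical distributions: C is a common cause of A and B.
p_c = {0: F(1, 2), 1: F(1, 2)}
p_a_given_c = {0: {0: F(3, 4), 1: F(1, 4)}, 1: {0: F(1, 5), 1: F(4, 5)}}   # [c][a]
p_b_given_c = {0: {0: F(2, 3), 1: F(1, 3)}, 1: {0: F(1, 10), 1: F(9, 10)}}  # [c][b]

# Joint built so that A and B are independent given C by construction.
joint = {(a, b, c): p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b]
         for a, b, c in product((0, 1), repeat=3)}

def prob(**fixed):
    """Marginal probability of the assignment given by keyword args a, b, c."""
    names = ("a", "b", "c")
    return sum(p for key, p in joint.items()
               if all(key[names.index(n)] == v for n, v in fixed.items()))

def is_cond_independent():
    """Check both equivalent definitions of 'A independent of B given C'."""
    for a, b, c in product((0, 1), repeat=3):
        # P(A | B, C) = P(A | C)
        if prob(a=a, b=b, c=c) / prob(b=b, c=c) != prob(a=a, c=c) / prob(c=c):
            return False
        # P(A, B | C) = P(A | C) P(B | C)
        lhs = prob(a=a, b=b, c=c) / prob(c=c)
        rhs = (prob(a=a, c=c) / prob(c=c)) * (prob(b=b, c=c) / prob(c=c))
        if lhs != rhs:
            return False
    return True

def is_marginally_independent():
    """Check P(A, B) = P(A) P(B) without conditioning on C."""
    return all(prob(a=a, b=b) == prob(a=a) * prob(b=b)
               for a, b in product((0, 1), repeat=2))
```

With these numbers A and B are independent once C is known, yet dependent when C is ignored, which is why the conditioning set matters.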
Given $X = (X_1, \ldots, X_n)$, a Bayesian network is an annotated DAG that represents a unique JPD over $X$:
$$p(X_1, \ldots, X_n) = \prod_{i=1}^{n} p(X_i \mid Pa(X_i))$$
Each node $X_i$ is annotated with a CPT that represents $p(X_i \mid Pa(X_i))$.
DAG: Directed Acyclic Graph
JPD: Joint Probability Distribution
CPT: Conditional Probability Table
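The factorization can be illustrated with a small hand-written network. This is a sketch only: the three-node structure A -> C <- B, the CPT values and the function names are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical three-node network A -> C <- B with binary variables.
parents = {"A": (), "B": (), "C": ("A", "B")}

# cpt[node][parent_values][value] = p(node = value | parents = parent_values)
cpt = {
    "A": {(): {0: 0.7, 1: 0.3}},
    "B": {(): {0: 0.6, 1: 0.4}},
    "C": {
        (0, 0): {0: 0.9, 1: 0.1},
        (0, 1): {0: 0.5, 1: 0.5},
        (1, 0): {0: 0.4, 1: 0.6},
        (1, 1): {0: 0.2, 1: 0.8},
    },
}

def joint_probability(assignment):
    """Multiply one CPT entry per node: p(x) = prod_i p(x_i | Pa(x_i))."""
    result = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        result *= cpt[node][pa_values][assignment[node]]
    return result

def total_probability():
    """Sum the joint over all assignments; a valid JPD sums to 1."""
    nodes = list(parents)
    return sum(joint_probability(dict(zip(nodes, values)))
               for values in product((0, 1), repeat=len(nodes)))
```

Each full assignment costs one table lookup per node, so the network stores 1 + 1 + 4 free parameters here instead of the 7 needed for an unstructured joint over three binary variables.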
Structure Learning
- Find a structure of Bayesian Network (BN) that best describes the observed data.
Parameter Learning
- Learn the parameters when the structure is known. Generally, parameter learning is a part of structure learning.
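When the structure is known and the data are complete, one common way to learn the parameters is maximum likelihood by counting. The sketch below assumes discrete data stored as a list of dictionaries; the function name `learn_cpt` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

def learn_cpt(data, node, parent_names):
    """Estimate p(node | parents) by relative frequencies (maximum likelihood)."""
    counts = defaultdict(Counter)
    for row in data:
        pa_values = tuple(row[p] for p in parent_names)
        counts[pa_values][row[node]] += 1
    return {pa: {value: n / sum(c.values()) for value, n in c.items()}
            for pa, c in counts.items()}

# Hypothetical records over two binary variables with known structure A -> B.
data = [
    {"A": 0, "B": 0}, {"A": 0, "B": 0}, {"A": 0, "B": 1}, {"A": 0, "B": 0},
    {"A": 1, "B": 1}, {"A": 1, "B": 1}, {"A": 1, "B": 0}, {"A": 1, "B": 1},
]
cpt_a = learn_cpt(data, "A", [])
cpt_b = learn_cpt(data, "B", ["A"])
```

Parent configurations that never occur in the data get no entry in this sketch; a practical implementation would usually add smoothing or a prior.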
Goal
- Find a BN structure that best describes the observed data.
NP-Complete!!
- Naïve Bayes …
- Using domain knowledge
- Assumptions to make the problem tractable
- Textbooks (generally) assume the network is already known.
- …
Two main categories:
1) Score and Search-Based (S&S) approach
- Learning the network structures
2) Constraint-Based (CB) approach
- Learning the edges composing a structure
Three main issues with S&S approach:
- 1. Search space
- 2. Search strategy
- 3. Model selection criterion
Number of possible DAGs containing n nodes:
$$f(n) = \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i} \, 2^{i(n-i)} f(n-i), \qquad f(0) = 1$$
Curse of dimensionality
# of variables    # of possible DAGs
1                 1
2                 3
3                 25
…                 …
8                 783,702,329,343
9                 1,213,442,454,842,881
10                4,175,098,976,430,598,143
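The recursion for the number of labeled DAGs can be evaluated directly with exact integers. A minimal sketch, assuming the base case f(0) = 1; the function name `count_dags` is illustrative.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes, using the recursion with f(0) = 1."""
    if n == 0:
        return 1
    return sum((-1) ** (i + 1) * comb(n, i) * 2 ** (i * (n - i)) * count_dags(n - i)
               for i in range(1, n + 1))

# Print the table rows for 1 to 10 variables.
for n in range(1, 11):
    print(n, f"{count_dags(n):,}")
```

Python integers have arbitrary precision, so the value for 10 variables is exact rather than a floating-point approximation.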
Any search method from Artificial Intelligence:
- DFS, BFS, Best First Search, Simulated Annealing
- A-star and IDA-star
- …
How is the neighborhood defined?
- Current structure + adding, deleting or reversing
an arc.
- No cycle is allowed
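The neighborhood described above can be enumerated as follows. This is a sketch under the assumption that a structure is stored as a set of directed arcs `(u, v)`; the names `has_cycle` and `neighbors` are hypothetical.

```python
from itertools import permutations

def has_cycle(nodes, edges):
    """Detect a directed cycle with depth-first search."""
    children = {n: [] for n in nodes}
    for u, v in edges:
        children[u].append(v)
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on stack, 2 = done

    def visit(n):
        state[n] = 1
        for c in children[n]:
            if state[c] == 1 or (state[c] == 0 and visit(c)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)

def neighbors(nodes, edges):
    """All acyclic graphs reachable by adding, deleting or reversing one arc."""
    edges = frozenset(edges)
    candidates = set()
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            candidates.add(edges - {(u, v)})               # delete u -> v
            candidates.add((edges - {(u, v)}) | {(v, u)})  # reverse u -> v
        elif (v, u) not in edges:
            candidates.add(edges | {(u, v)})               # add u -> v
    return [g for g in candidates if not has_cycle(nodes, g)]
```

For the chain A -> B -> C, for example, adding C -> A is rejected because it would close a cycle, while adding A -> C is kept.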
K2 algorithm [3]
- Greedy search (total ordering is known)
Scoring function
- Evaluates how well a given network G matches the data D.
- The best BN is the one that maximizes the scoring function.
- Based on ML:
$$\hat{G} = \arg\max_{G} \, p(D \mid G, \theta_G)$$
Most frequently used: Bayesian Information Criterion (BIC) [4]
Input: Observational data set
Output: The resulting Bayesian network
Score and Search-Based approach: Pseudo code
1 Generate the initial BN (random or from domain knowledge), evaluate it and set it as the current network.
2 Evaluate the neighbors of the current BN.
3 If the best score of the neighbors is better than the score of the current BN, set the neighbor with the best score as the current network and go to step 2.
4 Else stop the learning process.
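The four steps of the pseudo code can be sketched as a greedy hill climber. Several things here are assumptions, not taken from the slides: the score is BIC in its common form (log-likelihood minus (log N / 2) times the number of free parameters), the data are discrete records stored as dictionaries, the initial network is empty, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def bic_score(data, nodes, edges):
    """BIC = log-likelihood - (log N / 2) * number of free parameters."""
    score = 0.0
    for node in nodes:
        pa = sorted(u for u, v in edges if v == node)
        joint = Counter((tuple(r[p] for p in pa), r[node]) for r in data)
        marg = Counter(tuple(r[p] for p in pa) for r in data)
        score += sum(n * math.log(n / marg[cfg]) for (cfg, _), n in joint.items())
        arity = len({r[node] for r in data})
        pa_configs = math.prod(len({r[p] for r in data}) for p in pa)
        score -= 0.5 * math.log(len(data)) * (arity - 1) * pa_configs
    return score

def is_acyclic(nodes, edges):
    """Repeatedly peel off nodes without incoming arcs (Kahn's algorithm)."""
    remaining, arcs = set(nodes), set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for _, v in arcs)}
        if not roots:
            return False
        remaining -= roots
        arcs = {(u, v) for u, v in arcs if u not in roots}
    return True

def neighbors(nodes, edges):
    """Acyclic graphs one arc addition, deletion or reversal away."""
    out = set()
    for u, v in permutations(nodes, 2):
        if (u, v) in edges:
            out.add(edges - {(u, v)})
            out.add((edges - {(u, v)}) | {(v, u)})
        elif (v, u) not in edges:
            out.add(edges | {(u, v)})
    return [g for g in out if is_acyclic(nodes, g)]

def hill_climb(data, nodes, initial_edges=frozenset()):
    """Move to the best-scoring neighbor while the score strictly improves."""
    current = frozenset(initial_edges)
    current_score = bic_score(data, nodes, current)           # step 1
    while True:
        candidates = neighbors(nodes, current)                # step 2
        if not candidates:
            return current, current_score
        best = max(candidates,
                   key=lambda g: (bic_score(data, nodes, g), sorted(g)))
        best_score = bic_score(data, nodes, best)
        if best_score <= current_score:                       # step 4
            return current, current_score
        current, current_score = best, best_score             # step 3
```

Because BIC gives the same score to A -> B and B -> A, the direction of an isolated arc returned by this sketch is arbitrary; greedy search can also stop in a local optimum.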
Two main categories:
1) Score and Search-Based approach (S&S)
- Learning the network structures
2) Constraint-Based approach (CB)
- Learning the edges composing a structure
Learning the edges of a structure
- Discovering the conditional independence (CI)
relations from the data
- Infer the structure from learned relations
“Is A dependent on B, given the value of C?”
Examples*
- child’s genes ⊥ grandparents’ genes | parents’ genes
- amount of speeding fine ⊥ type of car | speed
- lung cancer ⊥ yellow teeth | smoker
- …
* Borrowed from Dr. Zoubin Ghahramani’s GM Tutorial
Example
- Child’s genes and his grandparents' genes
- $A \perp D \mid B$
Variable B d-separates A and D
Example: Rolling two dice …
- $B \perp C \mid \emptyset$
- $B \not\perp C \mid D$
V-Structure
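The V-structure behaviour in the dice example can be verified by exact enumeration. The sketch assumes that B and C are two fair dice and that D is their sum (B -> D <- C); the slides do not state how D is defined, so this choice and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: B and C are two fair dice and D is their sum (B -> D <- C).
outcomes = [(b, c, b + c) for b, c in product(range(1, 7), repeat=2)]
weight = Fraction(1, len(outcomes))

def prob(pred):
    """Probability that a predicate over (b, c, d) holds."""
    return sum(weight for o in outcomes if pred(*o))

def independent_marginally():
    """B and C are independent if P(B, C) = P(B) P(C) for all values."""
    return all(
        prob(lambda b, c, d: b == x and c == y)
        == prob(lambda b, c, d: b == x) * prob(lambda b, c, d: c == y)
        for x, y in product(range(1, 7), repeat=2)
    )

def independent_given_d(d_value):
    """Check P(B, C | D) = P(B | D) P(C | D) for one observed sum."""
    p_d = prob(lambda b, c, d: d == d_value)
    return all(
        prob(lambda b, c, d: (b, c, d) == (x, y, d_value)) / p_d
        == (prob(lambda b, c, d: b == x and d == d_value) / p_d)
        * (prob(lambda b, c, d: c == y and d == d_value) / p_d)
        for x, y in product(range(1, 7), repeat=2)
    )
```

The two dice are independent on their own, but once the sum is observed (for example D = 7) knowing one die determines the other, so conditioning on the common effect makes the parents dependent.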
Example: Dice example …
- C : random numbers are in [1,6]
- $D \perp E \mid C$
Is B conditionally independent of C, given E ?
- $B \perp C \mid E$ ?
Conditioning on no single variable makes C and D independent.
- $C \perp D \mid \{B, E\}$
Differences between CB algorithms
- completeness and complexity
Algorithms (not limited to)
- TPDA: Three Phase Dependency Analysis, 1997
- SC: Sparse Candidate, 1999
- IC: Inductive Causation, 2000
- PC: Peter Spirtes and Clark Glymour, 2000
- MMHC: Max-Min Hill-Climbing, 2006
- RAI: Recursive Autonomy Identification, 2009
Title
Bayesian Network Structure Learning by Recursive Autonomy Identification
Authors
Raanan Yehezkel and Boaz Lerner
Journal of Machine Learning Research (2009), pp. 1527-1570
Conditional independence tests
Edge direction (orientation rule)
Structure decomposition
- Diminish the curse of dimensionality problem
d-separation resolution (X,Y )
- Size of the smallest condition set that d-separates X and Y
d-separation resolution (G )
- The highest d-separation resolution in the graph
Given an autonomous sub-structure $G^A \subset G$, any two non-adjacent nodes in $G^A$ are d-separated given nodes either included in $G^A$ or its exogenous causes.
Formally: ... $\exists\, S \subset \{V^A \cup V_{ex}\}$ s.t. $X \perp Y \mid S$
Input: Observational data set
Output: Partial DAG representing the Markov equivalence class
RAI algorithm: Pseudo code
Start from a complete undirected graph
* Repeat steps 1 to 3 from low to high graph d-separation resolution, until the stopping criterion is met (e.g. CI test threshold)
1 Test CI between nodes, then remove the edges related to independence
2 Direct edges according to orientation rules (not always possible)
3 Decompose the graph into autonomous sub-structures
* For each sub-structure, apply RAI recursively (steps 1 to 3), while increasing the order of CI testing