ReBaStaBa : handling Bayesian Network with R - - PowerPoint PPT Presentation

rebastaba
SMART_READER_LITE
LIVE PREVIEW

ReBaStaBa : handling Bayesian Network with R - - PowerPoint PPT Presentation

UseR'09 09_07_09 1/17 ReBaStaBa : handling Bayesian Network with R Jean-Baptiste.Denis@Jouy.Inra.Fr ML.Delignette@Vet-Lyon.Fr RPouillot@yahoo.Fr UseR'09 PLAN 09_07_09 2/17 Bayesian Network Defining a Bayesian Network [/bn/]


slide-1
SLIDE 1

UseR'09 09_07_09

1/17

ReBaStaBa :

handling Bayesian Network with R

Jean-Baptiste.Denis@Jouy.Inra.Fr ML.Delignette@Vet-Lyon.Fr RPouillot@yahoo.Fr

slide-2
SLIDE 2

UseR'09 09_07_09

2/17

PLAN

  • Bayesian Network
  • Defining a Bayesian Network [/bn/]
  • Computing with a /bn/
  • Some Features of rebastaba
slide-3
SLIDE 3

UseR'09 09_07_09

3/17

(1) A set of random variates

A S C B T X D

TC

  • travel to Asia ?
  • Smoker ?
  • Tuberculosis ?
  • lung Cancer ?
  • Bronchitis
  • T or C ?
  • bad X-ray?
  • bad Dysphnae test ?

Bayesian network

slide-4
SLIDE 4

UseR'09 09_07_09

4/17

Bayesian network

(2) A set of random variates with conditional independence structure

A S C B T X D

TC

(based on accepted causality)

[A,S,T,C,B,TC,X,D] = [A][S][T|A] [C|S][B|S] [TC|T,C][X|TC] [D|TC]

slide-5
SLIDE 5

UseR'09 09_07_09

5/17

Bayesian network

(3) A set of random variates with conditional independence structure and associated probability distributions.

A S C B T X D

TC

P(A=y)= 0.01 P(C=y|S=n) = 0.01 P(C=y|S=y) = 0.10 P(TC=y|T=y,C=n) = 1 .....

slide-6
SLIDE 6

UseR'09 09_07_09

6/17

Bayesian network

/Bn/s are a way to easily define a parsimonious joint probability distribution over a set of random variates.

A S C B T X D

TC

(the graph is very attractive)

slide-7
SLIDE 7

UseR'09 09_07_09

7/17

Defining a /bn/

<<Asian>> role= Just for illustration for UseR'09 <<A>> ltype= numcat lpod= y n lpara(p)= 0.01 0.99 <<S>> ltype= numcat lpod= y n lpara(p)= 0.50 0.50 <<T>> ltype= numcat lpod= y n lparent= A lpara(p)= 0.01 0.05 0.99 0.95 <<B>> ltype= numcat lpod= y n lparent= S lpara(p)= 0.60 0.30 0.40 0.70

Contents of a text file named asia.dat

slide-8
SLIDE 8

UseR'09 09_07_09

8/17

Defining a /bn/

# bn <- file2bn("asia.dat"); # print(bn,"n"); # print(bn,"l","A"); # print(bn,"l","T"); # dn <- bn2dn(bn,10000); print(dn@df[1:10,]); print(table(dn@df[,3:4])); print(grappa4mar2(bn,c("T","C"))); #

Sourcing asia.r

slide-9
SLIDE 9

UseR'09 09_07_09

9/17

# bn <- file2bn("asia.dat"); # print(bn,"n"); #

Node List ========= nb.Var ltype Parent(s) Node Child(ren) 1 1 numcat {} (A)-> {T} 2 1 numcat {} (S)=>2 {C;B} 3 1 numcat {A} ->(T)-> {TC} 4 1 numcat {S} ->(C)-> {TC} 5 1 numcat {S} ->(B)-> {D} 6 1 numcat {T;C} 2=>(TC)=>2 {X;D} 7 1 numcat {TC} ->(X) {} 8 1 numcat {B;TC} 2=>(D) {}

Sourcing asia.r

Computing with a /bn/

slide-10
SLIDE 10

UseR'09 09_07_09

10/17

# print(bn,"n"); # print(bn,"l","A"); print(bn,"l","T"); # dn <- bn2dn(bn,10000);

............................................ probabilities are multiplied by 10^2 1 A: y n 2 : 1 99 ............................................ probabilities are multiplied by 10^2 1 T: y n 2 A --- --- --- 3 y : 1 99 4 n : 5 95 ............................................ Sourcing asia.r

Computing with a /bn/

slide-11
SLIDE 11

UseR'09 09_07_09

11/17

print(bn,"l","T"); # dn <- bn2dn(bn,10000); print(dn@df); print(table(dn@df[,3:4])); print(grappa4mar2(bn,c("T","C"))); #

A S T C B TC X D >?< 1 n y n n y n n n 0 2 n y n n y n n n 0 3 n n n n y n n n 0 4 n n n n n n n n 0 5 n y n n n n n n 0 6 n y n n y n n n 0 7 n y n n y n n n 0 8 n n n n n n n n 0 9 n n n n n n n n 0 10 .................... Sourcing asia.r

Computing with a /bn/

slide-12
SLIDE 12

UseR'09 09_07_09

12/17

print(dn@df); print(table(dn@df[,3:4])); print(grappa4mar2(bn,c("T","C"))); #

Sourcing asia.r C T y n y 26 487 n 499 8988 >>> --------------> Empirical frequency table Marginal dimensions are: T(2 categories) C(2 categories) NO Conditional dimension NO Conditioned dimension probabilities are multiplied by 10^3 V1 V2 V3 V4 V5 1 C: y y n n 2 T: y n y n 3 : 3 52 47 898 >>> --------------> Computed frequency table

Computing with a /bn/

slide-13
SLIDE 13

UseR'09 09_07_09

13/17

/bn/ : Bayesian network /gn/ : graph structure /dn/ : specialized data frame /alk/ : asked node

Some Features Main S4 Objects

slide-14
SLIDE 14

UseR'09 09_07_09

14/17

<YES> for 'must be provided by the user' <yes> for 'can be provided by the user but there is a default value' < no> for 'necessary but generated by rebastaba' < NO> for 'necessary but fixed by rebastaba' < - > for 'irrelevant'

lpara lrep lnat lvar lparent lpod normal YES yes NO NO no YES uniform YES yes NO NO no YES Bernoulli YES yes NO NO no NO binomial YES yes NO NO no YES Dirac YES yes yes NO no YES multinomial YES no no yes no YES Dirichlet YES no no yes no YES numcat YES - NO yes yes YES parcat YES - NO yes no YES score YES - NO NO YES YES easyp YES yes YES yes no YES empidata - no YES YES yes YES popula - no YES YES NO YES program - YES YES YES yes YES

/alk/ : main slots

Some Features

slide-15
SLIDE 15

UseR'09 09_07_09

15/17

#----------------------------- <<X>> ltype= normal lpod= 10 20 lpara(mu)= 12 lpara(sigma)= 1 #----------------------------- <<Y>> ltype= normal lpod= 0 8 lpara(mu)= sqrt({{X}}+1) lpara(sigma)= 2 #----------------------------- # X is the parent of Y # because the expectation of Y # depends on it.

Some Features

/alk/ : main slots

slide-16
SLIDE 16

UseR'09 09_07_09

16/17

11400 lines of codes 9283 comment lines (self generation of Rd documentation) 15 S4 classes are defined 229 functions are available intensive checking with specialized functions 'check3rbsb' was called 129 times 'check4tyle' was called 73 times 'erreur ' was called 402 times 'rapport ' was called 23 times

Some Features

Programming

slide-17
SLIDE 17

UseR'09 09_07_09

17/17

Some Features

Near future

Soon on http://riskassessment.r-forge.r-project.org/

Enhancement on Bugs (Jags) transcription

Graph analysis (detection of conditional independences)

Interface with deal

. . . . .

slide-18
SLIDE 18

UseR'09 09_07_09

18/17

Thanks for your attention