

  1. Bayesian model for rare events recognition with use of logical decision functions class (RESIM 08, Rennes, France). Vladimir Berikov, Gennady Lbov. Sobolev Institute of Mathematics, Novosibirsk, Russia. {berikov, lbov}@math.nsc.ru

  2. What are the problems we are solving?
  • Pattern recognition
  • Regression analysis
  • Time series analysis
  • Cluster analysis
  — all in hard-to-formalize areas of investigation.

  3. Hard-to-formalize areas of investigation:
  • lack of knowledge about the objects under investigation, which makes it difficult to formulate a mathematical model of the objects;
  • a large number of heterogeneous (either quantitative or qualitative) features; small sample size;
  • heterogeneous types of expert knowledge;
  • nonlinear dependencies between features;

  4. • presence of unknown (missing) feature values;
  • the desire to present results in a form understandable by a specialist in the applied area.
  Peculiarities of rare events:
  • unbalanced data set;
  • non-symmetric loss function;
  • a rare event with large losses = an “extreme” event.

  5. Example of an event tree for extreme flood forecasting for rivers in Central Russia*
  [Figure: event tree. An extreme flood results from a combination of: an extremely high amount of snow; intensive snow melting in spring; large amounts of precipitation in late autumn and in spring; and a cold winter (average temperature below –5 °C, long periods of low temperature, absence of thaws) causing deep freezing of the earth below the surface.]
  * L. Kuchment, A. Gelfand. Dynamic-stochastic models of river flow formation. Nauka, 1993. (in Russian)

  6. Logical decision functions (LDF):
  If P_1 and P_2 and … and P_m, then Y = 1;
  e.g. P_j ~ “X_2 > 1”, P_j ~ “X_5 ∈ {c, d}” — one branch of a decision tree.
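  To make the form of an LDF concrete, here is a minimal Python sketch of one conjunctive branch. The predicates mirror the slide's examples (“X2 > 1”, “X5 ∈ {c, d}”); the dictionary-based interface is an assumption for illustration only.

```python
# One branch of a logical decision function: a conjunction of elementary
# predicates over heterogeneous features. The predicates mirror the
# slide's examples; the dict interface is purely illustrative.

def ldf_branch(x: dict) -> int:
    """If P1 and P2 and ... and Pm, then Y = 1; otherwise Y = 0."""
    p1 = x["X2"] > 1             # quantitative predicate
    p2 = x["X5"] in {"c", "d"}   # qualitative (unordered) predicate
    return 1 if (p1 and p2) else 0

print(ldf_branch({"X2": 3.0, "X5": "c"}))  # -> 1
print(ldf_branch({"X2": 0.5, "X5": "a"}))  # -> 0
```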

  7. Basic problems in constructing LDF:
  • How to choose the optimal complexity of an LDF? (validity of the quality criterion)
  • How to find the optimal LDF in a given family? (validity of the algorithm)

  8. Pattern recognition problem.
  [Diagram: a learning sample (a table of feature values X_1, X_2, …, X_n with class labels Y) and expert knowledge together determine a supposed class of distributions Λ and a class of decision functions Φ.]
  Goal: find f ∈ Φ having minimal risk of wrong recognition.

  9. Mathematical setting:
  • a general collection Γ of objects;
  • features X = (X_1, …, X_j, …, X_n) with domain D_X; each X_j is quantitative, or qualitative ordered, or qualitative unordered;
  • target variable Y, D_Y = {w^(1), …, w^(i), …, w^(K)}, K ≥ 2 — the number of patterns;
  • learning sample s = (a^(1), …, a^(N)), N — the sample size; x^(i) = X(a^(i)), y^(i) = Y(a^(i));
  • θ = p(x, y) — the distribution of (X, Y) (the “strategy of nature”);
  • class of distributions Λ.

  10.
  • a priori probabilities of patterns p^(1), …, p^(K);
  • decision function f: D_X → D_Y;
  • class of decision functions Φ;
  • loss function L_{i,j} (decision Y = i while actually Y = j); Y = 1 ~ “extreme event”, Y = 2 ~ “ordinary event”, L_{1,2} << L_{2,1};
  • expected losses (risk): R_f(θ) = E L_{f(X),Y};
  • optimal Bayes decision function f_B: R_{f_B}(θ) = inf_f R_f(θ);
  • learning method μ: f = μ(s).
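  As a concrete illustration of how the non-symmetric loss function shifts the Bayes decision toward the rare class, here is a small sketch. The loss values L_{1,2} = 1, L_{2,1} = 20 are those quoted later on slide 19; the posterior probabilities are invented for illustration.

```python
import numpy as np

# The Bayes decision minimizes expected losses E L_{f(X),Y}. With the
# asymmetric losses below (missing an extreme event costs 20 times more
# than a false alarm), the extreme class Y = 1 wins even at a modest
# posterior probability.

L = np.array([[0.0, 1.0],     # decide Y=1: L_{1,1}=0, L_{1,2}=1
              [20.0, 0.0]])   # decide Y=2: L_{2,1}=20, L_{2,2}=0

def bayes_decision(posterior):
    """posterior[q] = P(Y = q+1 | x); pick i minimizing sum_q L[i,q] * posterior[q]."""
    return int(np.argmin(L @ np.asarray(posterior))) + 1

print(bayes_decision([0.10, 0.90]))  # -> 1: a 10% chance of the extreme event suffices
```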

  11. The probability distribution is unknown, and the learning sample has limited size. One should therefore reach a compromise between the complexity of the class of decision functions and the accuracy of decisions on the learning sample.

  12. Complexity:
  • number of parameters of the discriminant function;
  • number of features;
  • VC dimension;
  • maximal number of leaves in a decision tree;
  • …

  13. [Figure: risk as a function of the complexity of the class Φ. Curve 1 — informativeness of the distribution; curve 2 — effect of the decision functions class Φ; curve 3 — effect of sample size. The risk is minimized at complexity M_opt.]

  14. Recognition on a finite set of events.
  [Figure: the feature space (a discrete unordered variable X_1 against X_2) is partitioned into cells c_1, c_2, …, c_{M−1}, c_M; the partition is chosen independently of the learning sample.]

  15. Bayesian approach: define a meta-distribution on the class of distributions Λ.
  For (X, Y) with D_X = {1, …, j, …, M} (cells) and D_Y = {1, …, i, …, K}:
  p_j^(i) = P(X = j, Y = i), θ = (p_1^(1), …, p_j^(i), …, p_M^(K));
  p^(i) = Σ_j p_j^(i) — the a priori probability of the i-th class;
  frequency vector s = (n_1^(1), …, n_j^(i), …, n_M^(K)), Σ_{i,j} n_j^(i) = N;
  Λ = {θ} — the family of multinomial distributions.
  A random vector Θ is defined on Λ with density
  p(θ) = (1/Z) Π_{i,j} (p_j^(i))^(d_j^(i) − 1), where d_j^(i) > 0 (Dirichlet distribution).
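  A short sketch of this meta-distribution in Python: the strategy of nature θ, a multinomial distribution over the K·M cells, is itself one draw from a Dirichlet prior with parameters d_j^(i). The sizes K and M below are arbitrary illustrations.

```python
import numpy as np

# The strategy of nature theta = (p_j^(i)) is drawn from a Dirichlet
# distribution with parameters d_j^(i) > 0. K and M are illustrative.

rng = np.random.default_rng(0)
K, M = 2, 4
d = np.ones((K, M))            # d_j^(i) = 1: the uniform prior of slide 16

theta = rng.dirichlet(d.ravel()).reshape(K, M)  # one realization of Theta
print(theta.sum())             # ~1.0: the cell probabilities p_j^(i) sum to one
print(theta.sum(axis=1))       # a priori class probabilities p^(i) = sum_j p_j^(i)
```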

  16. When d_j^(i) ≡ d = 1, the a priori distribution is uniform.
  μ — a deterministic learning method: f = μ(s); μ* — the empirical error minimization method.
  Suppose that both the sample S and the strategy of nature Θ are random; then the risk of wrong recognition is the random function R_{μ(S)}(Θ).
  Suppose also that the a priori probability of the rare event is known: (p^(1); p^(2) = 1 − p^(1)).

  17. Directions in model investigation:
  • How to set the model parameters d_j^(i)?
  • How to define the optimal complexity of the class of LDF?
  • How to substantiate the quality criterion?
  • How to get more reliable estimates of risk?
  • How to extend the model to regression analysis, time series analysis, and cluster analysis?

  18. Setting the a priori distribution with respect to the expected probability of error for the Bayes decision function f_B.
  Let K = 2 and d_j^(i) ≡ d, where d > 0.
  Proposition 1. E P_{f_B}(Θ) = I_{0.5}(d + 1, d), where I_x(p, q) is the beta distribution function.
  [Figure: E P_{f_B}(Θ) plotted as a function of d for d from 0 to about 1.7.]
  For example, if E P_{f_B} = 0.15, then d = 0.38.
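  Proposition 1 can be used in reverse to pick d from a desired expected Bayes error. Below is a sketch using SciPy, assuming the regularized incomplete beta function I_x(p, q) is what the slide calls the beta distribution function; the root-finding bracket is an assumption.

```python
from scipy.optimize import brentq
from scipy.special import betainc   # betainc(p, q, x) = I_x(p, q)

# Solve E P_{f_B}(Theta) = I_{0.5}(d + 1, d) for the prior parameter d,
# given a target expected error. The bracket [1e-6, 50] is an assumption.

def expected_bayes_error(d: float) -> float:
    return betainc(d + 1.0, d, 0.5)

d = brentq(lambda t: expected_bayes_error(t) - 0.15, 1e-6, 50.0)
print(round(d, 2))  # ~0.38, matching the slide's example
```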

  19. Expected error probability E P_{μ*(S)}(Θ) as a function of complexity (Theorem 1).
  [Figure: plot for K = 2, d = 1, p^(1) = 0.05, p^(2) = 0.95, L_{1,2} = 1, L_{2,1} = 20, L_{1,1} = L_{2,2} = 0.]

  20. A posteriori estimates of risk (the decision function f is fixed; the learning sample is given).
  Theorem 2. The a posteriori mathematical expectation of the risk function equals
  E[R_f(Θ) | s] = (1 / (N + D)) Σ_{j,q} L_{f(j),q} (n_j^(q) + d_j^(q)), where D = Σ_{i,j} d_j^(i).
  NB. The a posteriori mathematical expectation of risk is the optimal Bayes estimate of risk under a quadratic loss function (see Lehmann, E.L., Casella, G. Theory of Point Estimation. Springer Verlag, 1998).
  For K = 2: P_{f,s} ≈ (ñ + d·M̃) / N, where ñ is the error frequency and M̃ the number of cells;
  for d = 1: P_{f,s} ≈ (ñ + (K − 1)·M̃) / N (the LDF quality criterion).
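  Theorem 2's estimate is straightforward to compute from the cell counts. In the sketch below, the counts, prior parameters, and cell decisions are all invented for illustration; only the formula and the loss values (slide 19's) come from the slides.

```python
import numpy as np

# A posteriori expectation of risk (Theorem 2):
#   E[R_f(Theta) | s] = (1/(N + D)) * sum_{j,q} L[f(j), q] * (n[q, j] + d[q, j]),
# where D is the sum of all d_j^(q). Indices are 0-based here.

def posterior_risk(L, n, d, f):
    """L[i, q]: loss of deciding i when the truth is q; n[q, j]: count of
    class q in cell j; d[q, j]: Dirichlet parameters; f[j]: decision in cell j."""
    N, D = n.sum(), d.sum()
    K, M = n.shape
    total = sum(L[f[j], q] * (n[q, j] + d[q, j])
                for j in range(M) for q in range(K))
    return total / (N + D)

L = np.array([[0.0, 1.0], [20.0, 0.0]])  # asymmetric losses (slide 19's values)
n = np.array([[1, 0, 2, 0],              # invented class-1 ("extreme") counts per cell
              [5, 9, 3, 10]])            # invented class-2 counts per cell
d = np.ones_like(n, dtype=float)         # uniform prior, d = 1
f = [1, 1, 0, 1]                         # decide "extreme" (index 0) only in cell 2
print(posterior_risk(L, n, d, f))        # -> ~2.21
```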

  21. Interval estimates of risk.
  Upper risk bound over strategies of nature and over samples of size N:
  P(R_{μ(S)}(Θ) ≤ ε) ≥ η, where ε = ε(μ, N, M, η) (Theorem 3).

  22. Ordered regression problem (intermediate between pattern recognition and regression analysis):
  Y — an ordered discrete variable; loss function L_{i,q} = (i − q)^2, where i, q = 1, 2, …, K (Theorem 4).
  [Figure: the risk estimate R*_μ as a function of complexity M, for N = 10, d_0 = 0.1, K = 6.]

  23. A recursive algorithm for decision tree construction chooses the optimal number of branches and the optimal level of recursive embedding; see the sketch below.
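  The authors' exact algorithm is not spelled out on the slide, but the d = 1 criterion from slide 20 suggests the following minimal sketch: a split is accepted only when the reduction in errors outweighs the (K − 1) penalty for an extra leaf, which simultaneously bounds the number of branches and the recursion depth. Binary threshold splits and the synthetic data are assumptions, not the authors' method.

```python
import numpy as np

# Greedy recursive tree construction guided by the d = 1 criterion
# (errors + (K - 1) * leaves) / N from slide 20. Gains are compared
# unnormalized (multiplied through by N), so N itself drops out.

def errors(y):
    """Misclassifications when a leaf predicts its majority class."""
    return len(y) - np.bincount(y).max()

def grow(X, y, K):
    leaf = int(np.bincount(y).argmax())
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:   # candidate thresholds
            m = X[:, j] <= t
            # one extra leaf costs (K - 1); split only if the drop
            # in errors pays for it
            gain = errors(y) - errors(y[m]) - errors(y[~m]) - (K - 1)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, j, t, m)
    if best is None:
        return leaf                          # terminal node: majority class
    _, j, t, m = best
    return (j, t, grow(X[m], y[m], K), grow(X[~m], y[~m], K))

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = (X[:, 0] > 0.5).astype(int)              # synthetic two-class data
print(grow(X, y, K=2))                       # nested (feature, threshold, left, right)
```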

  24. Decision trees and event trees for rare events analysis.
  [Figure: a decision tree with tests “X_1 < 60” and “X_2 < 100” leading to decisions Y = 1 or Y = 0, shown side by side with the equivalent event-tree (and/or) representation of the same logical conditions.]

  25. References
  • Lbov G.S. Construction of recognition decision rules in the class of logical functions // International Journal of Imaging Systems and Technology. 1991. V. 4, Issue 1. P. 62–64.
  • Berikov V.B., Lbov G.S. Bayes estimates for recognition quality on finite sets of events // Doklady Mathematical Sciences. 2005. V. 71, No. 3. P. 327–330.
  • Berikov V.B., Lbov G.S. Choice of optimal complexity of the class of logical decision functions in pattern recognition problems // Doklady Mathematical Sciences. 2007. V. 76, No. 3/1. P. 969–971.
  • Lbov G.S., Berikov V.B. Stability of decision functions in problems of pattern recognition and analysis of heterogeneous information. Novosibirsk: Sobolev Institute of Mathematics, 2005. (in Russian)

  26. Thank you for your attention.
