Journal of Artificial Intelligence Research 12 (2000) 219-234. Submitted 5/99; published 5/00.

Randomized Algorithms for the Loop Cutset Problem

Ann Becker (anyuta@cs.technion.ac.il)
Reuven Bar-Yehuda (reuven@cs.technion.ac.il)
Dan Geiger (dang@cs.technion.ac.il)
Computer Science Department, Technion, Haifa 32000, Israel

Abstract

We show how to find a minimum weight loop cutset in a Bayesian network with high probability. Finding such a loop cutset is the first step in the method of conditioning for inference. Our randomized algorithm for finding a loop cutset outputs a minimum loop cutset after O(c 6^k kn) steps with probability at least 1 - (1 - 1/6^k)^{c 6^k}, where c > 1 is a constant specified by the user, k is the minimal size of a minimum weight loop cutset, and n is the number of vertices. We also show empirically that a variant of this algorithm often finds a loop cutset that is closer to the minimum weight loop cutset than the ones found by the best deterministic algorithms known.

1. Introduction

The method of conditioning is a well-known inference method for the computation of posterior probabilities in general Bayesian networks (Pearl, 1986, 1988; Suermondt & Cooper, 1990; Peot & Shachter, 1991), as well as for finding MAP values and solving constraint satisfaction problems (Dechter, 1999). This method has two conceptual phases: first find an optimal or close-to-optimal loop cutset, and then perform a likelihood computation for each instance of the variables in the loop cutset. This method is routinely used by geneticists via several genetic linkage programs (Ott, 1991; Lange, 1997; Becker, Geiger, & Schaffer, 1998). A variant of this method was developed by Lange and Elston (1975).

Finding a minimum weight loop cutset is NP-complete, and thus heuristic methods have often been applied to find a reasonable loop cutset (Suermondt & Cooper, 1990). Most methods in the past had no guarantee of performance and performed very badly when presented with an adversarially chosen example. Becker and Geiger (1994, 1996) offered an algorithm that finds a loop cutset for which the logarithm of the state space is guaranteed to be at most a constant factor off the optimal value. An adaptation of these approximation algorithms has been included in version 4.0 of FASTLINK, a popular software package for analyzing large pedigrees with a small number of genetic markers (Becker et al., 1998). Similar algorithms in the context of undirected graphs are described by Bafna, Berman, and Fujito (1995) and by Fujito (1996).

While approximation algorithms for the loop cutset problem are quite useful, it is still worthwhile to invest in finding a minimum loop cutset rather than an approximation, because the cost of finding such a loop cutset is amortized over the many iterations of the conditioning method. In fact, one may invest an effort of complexity exponential in the size of the loop cutset in finding a minimum weight loop cutset, because the second phase of the conditioning algorithm, which is repeated for many iterations, uses a procedure of such complexity.
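To make the success-probability guarantee from the abstract concrete: since (1 - 1/m)^{cm} <= e^{-c} for m = 6^k, the failure probability after c 6^k repetitions is at most roughly e^{-c}, essentially independent of k. The minimal Python sketch below evaluates the exact bound next to this approximation; the function name and the sample values of c and k are ours, not from the paper.

```python
import math

def success_probability_lower_bound(c: float, k: int) -> float:
    """Lower bound 1 - (1 - 1/6**k)**(c * 6**k) from the abstract on the
    probability that the randomized algorithm returns a minimum weight
    loop cutset, where k is the minimal size of such a cutset."""
    m = 6 ** k
    return 1.0 - (1.0 - 1.0 / m) ** (c * m)

if __name__ == "__main__":
    for c in (1.0, 3.0, 5.0):
        for k in (2, 5, 10):
            exact = success_probability_lower_bound(c, k)
            approx = 1.0 - math.exp(-c)   # since (1 - 1/m)**(c*m) <= e**(-c)
            print(f"c={c:.1f}  k={k:2d}  bound={exact:.4f}  1-e^-c={approx:.4f}")
```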

The same considerations also apply to constraint satisfaction problems, as well as to other problems in which the method of conditioning is useful (Dechter, 1990, 1999).

In this paper we describe several randomized algorithms that compute a loop cutset. As done by Bar-Yehuda, Geiger, Naor, and Roth (1994), our solution is based on a reduction to the weighted feedback vertex set problem. A feedback vertex set (FVS) F is a set of vertices of an undirected graph G = (V, E) such that removing F from G, along with all the edges incident with F, yields a set of trees. The Weighted Feedback Vertex Set (WFVS) problem is to find a feedback vertex set F of a vertex-weighted graph with weight function w : V -> R^+ such that \sum_{v \in F} w(v) is minimized. When w(v) \equiv 1, this problem is called the FVS problem. The decision version associated with the FVS problem is known to be NP-complete (Garey & Johnson, 1979, pp. 191-192).
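As a concrete reading of this definition, the sketch below checks whether a candidate set F is a feedback vertex set of an undirected graph (its removal must leave a forest) and computes the objective \sum_{v \in F} w(v). The graph representation and helper names are illustrative choices of ours, not code from the paper.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Vertex = Hashable

def is_feedback_vertex_set(vertices: Set[Vertex],
                           edges: Iterable[Tuple[Vertex, Vertex]],
                           fvs: Set[Vertex]) -> bool:
    """True iff removing `fvs` and its incident edges leaves a forest."""
    remaining = vertices - fvs
    parent = {v: v for v in remaining}     # union-find over surviving vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if u in fvs or v in fvs:
            continue                       # edge removed together with F
        ru, rv = find(u), find(v)
        if ru == rv:                       # this edge would close a cycle
            return False
        parent[ru] = rv
    return True

def fvs_weight(fvs: Set[Vertex], w: Dict[Vertex, float]) -> float:
    """Objective of the WFVS problem: total weight of the chosen vertices."""
    return sum(w[v] for v in fvs)

# Example: a triangle has a single cycle, so any one vertex is an FVS.
V, E = {1, 2, 3}, [(1, 2), (2, 3), (3, 1)]
assert not is_feedback_vertex_set(V, E, set())
assert is_feedback_vertex_set(V, E, {2})
print(fvs_weight({2}, {1: 1.0, 2: 2.5, 3: 1.0}))   # 2.5
```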
Our randomized algorithm for finding a WFVS, called RepeatedWGuessI, outputs a minimum weight FVS after O(c 6^k kn) steps with probability at least 1 - (1 - 1/6^k)^{c 6^k}, where c > 1 is a constant specified by the user, k is the minimal size of a minimum weight FVS, and n is the number of vertices. For unweighted graphs we present an algorithm that finds a minimum FVS of a graph G after O(c 4^k kn) steps with probability at least 1 - (1 - 1/4^k)^{c 4^k}. In comparison, several deterministic algorithms for finding a minimum FVS are described in the literature. One has complexity O((2k+1)^k n^2) (Downey & Fellows, 1995b), and others have complexity O((17k^4)! n) (Bodlaender, 1990; Downey & Fellows, 1995a).

A final variant of our randomized algorithms, called WRA, has the best performance because it utilizes information from previous runs. This algorithm is harder to analyze, and its investigation is mostly experimental. We show empirically that the actual run time of WRA is comparable to that of the Modified Greedy Algorithm (MGA), described by Becker and Geiger (1996), which is the best available deterministic algorithm for finding close-to-optimal loop cutsets, and yet the output of WRA is often closer to the minimum weight loop cutset than the output of MGA.

The rest of the paper is organized as follows. In Section 2 we outline the method of conditioning, explain the related loop cutset problem, and describe the reduction from the loop cutset problem to the WFVS problem. In Section 3 we present three randomized algorithms for the WFVS problem and their analysis. In Section 4 we experimentally compare WRA and MGA with respect to output quality and run time.

2. Background: The Loop Cutset Problem

A short overview of the method of conditioning and of definitions related to Bayesian networks is given below; see the book by Pearl (1988) for more details. We then define the loop cutset problem.

Let P(u_1, ..., u_n) be a probability distribution where each variable u_i has a finite set of possible values called the domain of u_i. A directed graph D with no directed cycles is called a Bayesian network of P if there is a one-to-one mapping between {u_1, ..., u_n} and the vertices of D, such that u_i is associated with vertex i and P can be written as follows:

    P(u_1, ..., u_n) = \prod_{i=1}^{n} P(u_i | u_{i_1}, ..., u_{i_{j(i)}})        (1)

where i_1, ..., i_{j(i)} are the source vertices of the incoming edges to vertex i in D.
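Equation (1) states that the joint distribution factors into one conditional probability table (CPT) per vertex, conditioned on the values of that vertex's parents in D. The small sketch below evaluates such a factored joint for a fully specified assignment; the dictionary-based CPT encoding and the two-variable example are our own illustration, not material from the paper.

```python
from typing import Dict, List, Tuple

def joint_probability(assignment: Dict[str, str],
                      parents: Dict[str, List[str]],
                      cpts: Dict[str, Dict[Tuple[str, ...], Dict[str, float]]]) -> float:
    """Evaluate Equation (1): P(u_1,...,u_n) = prod_i P(u_i | parents of u_i).

    cpts[v][parent_values][value] holds P(v = value | parents(v) = parent_values).
    """
    p = 1.0
    for v, value in assignment.items():
        parent_values = tuple(assignment[u] for u in parents[v])
        p *= cpts[v][parent_values][value]
    return p

# Tiny network A -> B: A is a root, B has A as its single parent.
parents = {"A": [], "B": ["A"]}
cpts = {
    "A": {(): {"t": 0.3, "f": 0.7}},
    "B": {("t",): {"t": 0.9, "f": 0.1},
          ("f",): {"t": 0.2, "f": 0.8}},
}
# P(A=t, B=f) = P(A=t) * P(B=f | A=t) = 0.3 * 0.1
print(f"{joint_probability({'A': 't', 'B': 'f'}, parents, cpts):.4f}")   # 0.0300
```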
