As Strong as the Weakest Link: Mining Diverse Cliques in Weighted Graphs
Petko Bogdanov (UC Santa Barbara),
with Ben Baumer (Smith College), Prithwish Basu (Raytheon BBN), Amotz Bar-Noy (CUNY) and Ambuj K. Singh (UC Santa Barbara)
ECML/PKDD
Gene Interaction Networks
○ Complexes - interacting functional units*
* Leemor Joshua-Tor, Structure and Function of Nucleic Acid Regulatory Complexes, http://www.hhmi.org/research/structure-and-function-nucleic-acid-regulatory-complexes
○ images
○ video
○ other complex objects with similarity function
○ stocks of companies related in a supply chain
○ brain regions co-associated in performing a task*
* Hagmann P, Cammoun L, Gigandet X, Meuli R, Honey CJ, Wedeen VJ, Sporns O (2008) Mapping the structural core of human cerebral cortex. PLoS Biology Vol. 6, No. 7, e159
○ MAX CLIQUE is NP-hard
○ Managing overlap “adds” complexity
○ higher weight means stronger association
combination
diversity via α
A higher number of distinct nodes in the solution means higher diversity
○ reduction from SET COVER
Included in solution
○ monotonic
○ submodular
○ Allows a (1-1/e)-APX
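The greedy scheme behind the (1-1/e) guarantee for monotone submodular objectives can be sketched as below. This is a minimal illustration, not the authors' implementation. The candidate cliques are assumed to be given up front, and `coverage_score` is a hypothetical stand-in objective (each covered node counts once, at the weight of the strongest selected clique containing it) chosen only because it is monotone and submodular. The names `Clique`, `coverage_score`, `greedy_select`, and the budget `m` are assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Tuple

# A candidate clique: (its nodes, its weakest-link weight). Hypothetical type.
Clique = Tuple[FrozenSet[str], float]


def coverage_score(solution: List[Clique]) -> float:
    """Stand-in objective (an assumption, not the score from the slides):
    every covered node contributes the weight of the strongest selected
    clique that contains it. This function is monotone and submodular."""
    best: Dict[str, float] = {}
    for nodes, weight in solution:
        for v in nodes:
            best[v] = max(best.get(v, 0.0), weight)
    return sum(best.values())


def greedy_select(candidates: List[Clique], m: int) -> List[Clique]:
    """Pick up to m cliques, each time adding the one with the largest
    marginal gain with respect to the current solution."""
    solution: List[Clique] = []
    remaining = list(candidates)
    for _ in range(m):
        base = coverage_score(solution)
        best_gain, best_clique = 0.0, None
        for clique in remaining:
            gain = coverage_score(solution + [clique]) - base
            if gain > best_gain:
                best_gain, best_clique = gain, clique
        if best_clique is None:  # no candidate improves the score
            break
        solution.append(best_clique)
        remaining.remove(best_clique)
    return solution


candidates = [
    (frozenset("abc"), 0.9),
    (frozenset("abd"), 0.8),  # overlaps heavily with the first clique
    (frozenset("xyz"), 0.5),  # weaker, but covers only new nodes
]
picked = greedy_select(candidates, m=2)
```

In this toy input the second pick is the weaker but non-overlapping clique, because the overlapping one adds only a single new node. That is the diminishing-return behaviour that submodularity describes.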
○ Requires greedily finding the next best clique
○ MAX CLIQUE is NP-hard to approximate within a constant factor
○ Can we develop a solution with APX guarantees that is fast? Limitations?
○ Can we develop a very fast solution of good quality?
Candidate to add to solution Diminishing return
Optimistic completion
○ Current lowest weight will be the lowest in the whole clique
○ The rest of the nodes will not overlap
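The two optimistic assumptions can be turned into an upper bound (UB) on how much a partial clique could still contribute. The sketch below is illustrative only. It assumes a target clique size `k` and a hypothetical gain function in which a node contributes the amount by which the clique's weakest-link weight exceeds the best weight already covering that node. The names `optimistic_upper_bound`, `covered`, and `k` are assumptions, not taken from the slides.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set


def min_edge_weight(nodes: Set[str], weights: Dict[FrozenSet[str], float]) -> float:
    """Weakest link among all pairs in `nodes` (needs at least two nodes)."""
    return min(weights[frozenset(pair)] for pair in combinations(sorted(nodes), 2))


def optimistic_upper_bound(
    partial: Set[str],
    weights: Dict[FrozenSet[str], float],
    covered: Dict[str, float],
    k: int,
) -> float:
    """Upper bound on the gain of any size-k clique that extends `partial`.

    Optimistic assumption 1: the current lowest weight stays the lowest,
    although adding nodes can only keep or lower the minimum.
    Optimistic assumption 2: the nodes still to be added do not overlap
    with the existing solution, so each contributes the full weight.
    """
    w_min = min_edge_weight(partial, weights)
    gain_partial = sum(max(0.0, w_min - covered.get(v, 0.0)) for v in partial)
    gain_future = (k - len(partial)) * w_min
    return gain_partial + gain_future


weights = {
    frozenset("ab"): 0.9,
    frozenset("ac"): 0.7,
    frozenset("bc"): 0.8,
}
covered = {"a": 1.0}  # node a is already in the solution with weight 1.0
ub = optimistic_upper_bound({"a", "b"}, weights, covered, k=3)
```

Under the same hypothetical gain, the only completion {a, b, c} has weakest link 0.7 and gains 0.7 each for b and c, so its true gain is below the bound. A partial clique whose bound falls below the best complete candidate found so far can be discarded without being extended.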
cliques in a thresholded graph
If a candidate has a better score contribution than the best UB, add it to the solution
and repeat
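One way to read the thresholding step: keep only edges whose weight is at least some threshold `t`, then search the remaining unweighted graph for cliques. Any clique found there has a weakest link of at least `t`. The sketch below is an assumption-laden illustration. The helper names, the fixed clique size `k`, and the plain backtracking search are choices made for this example and are not taken from the slides.

```python
from typing import Dict, FrozenSet, List, Optional, Set


def threshold_graph(
    weights: Dict[FrozenSet[str], float], t: float
) -> Dict[str, Set[str]]:
    """Adjacency lists of the graph that keeps only edges with weight >= t."""
    adj: Dict[str, Set[str]] = {}
    for edge, w in weights.items():
        if w >= t:
            u, v = tuple(edge)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def find_k_clique(adj: Dict[str, Set[str]], k: int) -> Optional[List[str]]:
    """Return some clique of size k in `adj`, or None (simple backtracking)."""

    def extend(clique: List[str], candidates: Set[str]) -> Optional[List[str]]:
        if len(clique) == k:
            return clique
        for v in sorted(candidates):
            # Only common neighbours of the clique can extend it further.
            found = extend(clique + [v], candidates & adj[v])
            if found is not None:
                return found
            candidates = candidates - {v}  # v is exhausted at this level
        return None

    return extend([], set(adj))


weights = {
    frozenset("ab"): 0.9,
    frozenset("ac"): 0.8,
    frozenset("bc"): 0.7,
    frozenset("cd"): 0.3,
}
loose = find_k_clique(threshold_graph(weights, 0.7), k=3)
strict = find_k_clique(threshold_graph(weights, 0.8), k=3)
```

At threshold 0.7 the triangle a, b, c survives. At 0.8 the edge between b and c is dropped and no 3-clique remains, so a search that starts at a high threshold and lowers it encounters the strongest cliques first.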
Already in the solution A
UB? UB?
Grow away from included nodes based on UB
* S. Bandyopadhyay and M. Bhattacharyya. Mining the largest dense vertexlet in a weighted scale-free graph. Fundam. Inform., 96(1-2):1–25, 2009
Scalable, High Quality
○ application to discovery of effective groups in collaboration
○ complexes in gene networks
○ similarity/correlation graphs
The research was supported by the Army Research Laboratory under cooperative agreement W911NF-09-2-0053 (NS-CTA).
○ frequency of clique occurrence (not score)
○ non-unique labels
○ Bandyopadhyay et al. 2009: no APX guarantees, single clique; the extended version does not achieve comparable quality
○ Steiner trees
○ Clique percolation (CFinder)
○ Edge weights are constraints and not part of score