Culprits and Islands, Jilles Vreeken (PowerPoint presentation)

Culprits and Islands
Jilles Vreeken
4 July 2014 (TADA)

Service Announcement #1
Tensors
- Introduction to tensors
- Tensors in DM
- Special topics in tensors
Introduction
- Is DM science?
- DM in action
Information Theory
- MDL + patterns
- Entropy + correlation
- MaxEnt + iterative DM
Mixed Grill
- Influence Propagation
- Redescription Mining
- <special request>

Service Announcement #1 (continued)
<special request>? Let us know (asap, by mail) what topic you would like to see discussed.

Who are the Culprits?
B. Aditya Prakash, Jilles Vreeken, Christos Faloutsos
4 July 2014 (TADA)

First question of the day
How can we find the number and location of starting points for epidemics in graphs?
Or: Who are the culprits?

Virus Propagation
Susceptible-Infected (SI) Model
Diseases over contact networks
[AJPH 2007] CDC data: Visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts
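
To make the SI model concrete, here is a minimal simulation sketch. It assumes a discrete-time variant in which every infected node infects each susceptible neighbour independently with probability `beta` per step; the function name, the parameter names, and the dict-of-lists graph format are illustrative choices, not taken from the slides.

```python
import random


def simulate_si(graph, seeds, beta, steps, rng=None):
    """Simulate a discrete-time Susceptible-Infected epidemic.

    graph: dict mapping each node to a list of its neighbours.
    seeds: initially infected nodes (the 'culprits').
    beta:  assumed per-step, per-edge infection probability.
    """
    rng = rng or random.Random(0)
    infected = set(seeds)
    for _ in range(steps):
        newly_infected = set()
        for u in infected:
            for v in graph[u]:
                # In the SI model nodes never recover, so only
                # susceptible neighbours can change state.
                if v not in infected and rng.random() < beta:
                    newly_infected.add(v)
        infected |= newly_infected
    return infected


# A path graph 0 - 1 - 2 - 3 - 4 with a single seed at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
snapshot = simulate_si(path, seeds=[0], beta=1.0, steps=2)
```

With `beta=1.0` the infection advances exactly one hop per step, so `snapshot` contains nodes 0, 1 and 2. The culprit question is the inverse problem: given only such a snapshot, recover the seeds.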

Culprits: Problem definition
2d grid
Question: Who started it?
Prior work: [Lappas et al. 2010, Shah et al. 2011]

Culprits: Exoneration

Who are the culprits
Two-step solution
1) use MDL for number of seeds
2) for a given number: exoneration = centrality + penalty
Running time linear! (in edges and nodes)
NetSleuth

Modeling using MDL
Minimum Description Length principle
- Induction by Compression
- Related to Bayesian approaches
- MDL = Model + Data
Cost of a Model: scoring the seed-set
- encoding the integer |S|
- number of possible |S|-sized sets
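
The two parts of the model cost named above (an integer for the seed-set size, plus a choice among all sets of that size) can be sketched as follows. This is a hedged illustration: it assumes Rissanen's universal code for integers and a uniform code over all k-sized subsets of the N nodes, which are the standard MDL choices but are not spelled out in the slide text. The function names are hypothetical.

```python
from math import comb, log2


def universal_integer_bits(n):
    """Bits to encode an integer n >= 1 with Rissanen's universal code.

    L(n) = log2(c0) + log2(n) + log2(log2(n)) + ...,
    summing only the positive terms, with c0 ~ 2.865064.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    bits = log2(2.865064)
    term = log2(n)
    while term > 0:
        bits += term
        term = log2(term)
    return bits


def seed_set_bits(num_nodes, k):
    """Assumed model cost: encode k, then which k of num_nodes are seeds."""
    return universal_integer_bits(k) + log2(comb(num_nodes, k))
```

Under these assumptions each extra seed makes the model more expensive, so an additional seed is only worthwhile if it shortens the description of the data by more than it costs.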

Modeling using MDL
Encoding the Data: Propagation Ripples
[Figure: Original Graph, Infected Snapshot, Ripple R1, Ripple R2]

Modeling using MDL
Ripple cost for a ripple R
- how long the ripple is
- how the 'frontier' advances
Total MDL cost
Prakash, Vreeken, Faloutsos 2012

How to optimize the score?
Two-step process
- Given k, quickly identify a high-quality set S
- Given set S, optimize the ripple R

Optimizing the score
High-quality k-seed-set: exoneration
Best single seed:
- smallest eigenvector of Laplacian sub-matrix
- analyze a Constrained SI epidemic
Exonerate neighbors
Repeat

Optimizing the score
Optimizing R: get the MLE ripple!
Finally, use the MDL score to tell us the best set
NetSleuth: linear running time in nodes and edges

Experiments
How far are they?
Evaluation functions:
- MDL based
- Overlap based (JD = Jaccard distance)
Closer to 1 the better

Experiments: # of Seeds
[Figure panels: One Seed, Two Seeds, Three Seeds]

Experiments: Quality (MDL and JD)
[Figure panels: One Seed, Two Seeds, Three Seeds; ideal = 1]
Prakash, Vreeken, Faloutsos 2012

Experiments: Quality (Jaccard Scores)
[Figure panels: One Seed, Two Seeds, Three Seeds; axes: NetSleuth vs. True; closer to the diagonal, the better]

Experiments: Scalability

Conclusion
Given: Graph and Infections
Find: Best 'Culprits'
Two-step solution
- use MDL for number of seeds
- for a given number: exoneration = centrality + penalty
NetSleuth: linear running time in nodes and edges

Connection Pathways
Leman Akoglu, Jilles Vreeken, Hanghang Tong, Polo Chau, Nikolaj Tatti, Christos Faloutsos
(Akoglu et al. SDM'13)

Question at hand
How can we use a graph to explain a few selected nodes?

Given a 'list' of authors…
What can we say? Let's use relational information.
Christos Faloutsos, Jeffrey F. Naughton, Surajit Chaudhuri, H. V. Jagadish, Hiroshi Ishii, Scott E. Hudson, David J. DeWitt, Gerhard Weikum, Shumin Zhai, Bonnie E. John, William Buxton, Abigail Sellen, Hector Garcia Molina, Raghu Ramakrishnan, Steve Benford, James A. Landay, Michael J. Carey, Ravin Balakrishnan, Brad A. Myers, Rakesh Agrawal

Using the co-authorship graph…
Any structure? Too cluttered.

The Problem
Given
- a large graph G
- a handful of nodes S marked by an external process
What can we say about S?
- are they close by?
- are they segregated?
- do they form groups?
Can we connect them?
- with simple paths?
- maybe using a few connectors?

Our approach
Use the network structure to explain S
Partition S into groups of nodes, such that
- "simple" paths in G connect the nodes in each group,
- nodes in different groups are "not easily reachable"
Use MDL to decide 'simple' and 'best' partitioning

Example
Simple connection pathways
- good connectors
- better sensemaking
[Figure: author groups labeled CHI and VLDB]

Applications
1. Graph anomaly description/summarization
e.g. gene interaction network, top-k anomalies
- Summarize top-k node anomalies by groups
- Find connections/connectors among groups

Applications
2. Query summarization
e.g. Web network, top-ranked pages
- Summarize top-k query pages by groups
- Find connections/connectors among groups

Applications
3. Understanding dynamic events in graphs
e.g. social network, affected people
- Event spread within groups explained by the network
- Event spread between groups due to external influence

Applications
4. Understanding semantic coherence
e.g. ontology network, set of words
- Summarize words by semantically coherent groups
- Find connectors (other relevant words) per group

Applications
5. Understanding segregation (social science)
e.g. school-children friendship network, students with attributes of interest
- Summarize students by their social "circles"
- Study groups (and groups within groups)

Problem: Formally
Problem Definition
Given a graph $G = (V, E)$ and a set of marked nodes $M \subseteq V$
Problem 1. Optimal partitioning
Find a coherent partitioning P of M. Find the optimal number of partitions |P|.
Problem 2. Optimal connection subgraphs
Efficiently find the minimum cost set of subgraphs connecting the nodes in each part

Objective: Informally
Our key idea is to use information theory
Imagine a sender and a receiver.
- both sender and receiver know the graph structure G,
- only the sender knows the set of marked nodes M,
- goal: transmit M using as few bits as possible.
Why would this work?
- naïve: encode the ID of each marked node with $\log |V|$ bits
- better: exploit "close-by" nodes, restart for farther nodes
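
A small worked comparison of the naïve and the "close-by" encoding is sketched below. The cost model is an assumption made for illustration only: a flight to an arbitrary node costs log2(|V|) bits and a hop from node u to one of its neighbours costs log2(degree(u)) bits. It ignores the other terms of the full objective (for example, flagging which visited nodes are marked), and the function names are hypothetical.

```python
from math import log2


def naive_bits(num_nodes, marked):
    """Bits to send each marked node as an independent ID."""
    return len(marked) * log2(num_nodes)


def tour_bits(graph, root, hops):
    """Bits for one flight to `root` plus a sequence of hops along edges.

    Assumed costs (not from the slides): a flight costs log2(|V|) bits,
    and choosing one neighbour of u costs log2(degree(u)) bits.
    """
    if root not in graph:
        raise ValueError("root is not a node of the graph")
    bits = log2(len(graph))
    for u, v in hops:
        if v not in graph[u]:
            raise ValueError(f"({u}, {v}) is not an edge")
        bits += log2(len(graph[u]))
    return bits


# A ring of 16 nodes: every node has degree 2, so a hop costs 1 bit.
ring = {i: [(i - 1) % 16, (i + 1) % 16] for i in range(16)}
marked = {0, 1, 2}
naive = naive_bits(len(ring), marked)             # 3 * log2(16) = 12.0
hopping = tour_bits(ring, 0, [(0, 1), (1, 2)])    # 4 + 1 + 1 = 6.0
```

When marked nodes sit next to each other, hopping is cheaper than sending independent IDs; when they are far apart, a fresh flight becomes the cheaper option, which is what separates the marked nodes into groups.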

Objective: Intuition
We think of encoding as
- hopping from node to node to encode close-by nodes,
- and flying to a new node to encode farther nodes,
- until all marked nodes are identified
Simplicity of connection tree T is determined by:
- the number of flights we make across the graph;
- ease of identifying the edges to follow next;
- ease of identifying the marked nodes in our tour

Objective: Formally
Minimize over P and the trees T_i [formula not recoverable from the slide; its annotated terms are listed below]
- encode #partitions
- encode each part: root node, spanning tree t of p_i, number of marked nodes in p_i, identities of marked nodes
- encoding of tree per part: #branches of node t, identities of branch nodes, recursively encode all tree nodes

Solution: Intuition
The problem is NP-hard
- Related to (reduces to) the directed Steiner tree problem
Hence, we resort to heuristics…
The general idea:
- transform G into a directed weighted graph G'
- chop G' into sub-graphs
- find low-cost minimal spanning trees per sub-graph (we give 4 efficient algorithms)

Solution: Preliminaries
Graph transformation
- given an undirected, unweighted graph G,
- we transform it into a directed, weighted graph G' [edge-weight definitions not recoverable from the slide]
Given G', the problem becomes: find the set of trees with minimum total cost on the marked nodes.
Finding bounded-length paths
- (multiple) short paths of bounded length between marked nodes in G'
- employ BFS-like expansion
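
The BFS-like expansion for bounded-length paths could look roughly like this. It is a simplified sketch: it works on hop counts rather than edge weights, finds a single shortest path per ordered pair rather than multiple paths, and treats the length bound `max_hops` as a free parameter because the slide's bound did not survive extraction. All names are hypothetical.

```python
from collections import deque


def bounded_bfs(graph, source, max_hops):
    """Breadth-first search from `source` that stops after max_hops hops."""
    dist = {source: 0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the length bound
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                parent[v] = u
                queue.append(v)
    return dist, parent


def short_paths_between_marked(graph, marked, max_hops):
    """Map (s, t) to one shortest path for marked pairs within max_hops."""
    paths = {}
    for s in marked:
        dist, parent = bounded_bfs(graph, s, max_hops)
        for t in marked:
            if t != s and t in dist:
                path = [t]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                paths[(s, t)] = path[::-1]
    return paths
```

On a path graph 0 - 1 - 2 - 3 - 4 with marked nodes 0, 2 and 4 and `max_hops=2`, nodes 0 and 2 are connected via node 1, nodes 2 and 4 via node 3, and no direct path is recorded between 0 and 4 because they are four hops apart.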
