Provenance Analytics and Visualization. Juliana Freire, VisTrails



SLIDE 1

Provenance Analytics and Visualization

Juliana Freire VisTrails Group & Web and Databases Lab

SLIDE 2

TaPP ‘11 – Provenance Analytics and Visualization Juliana Freire

Provenance Analytics: Opportunities

• Provenance beyond reproducibility
• Opportunity for knowledge discovery, sharing, and re-use
• Query information
  – Understand processes and data dependencies
  – Find useful workflows, e.g., given a piece of data or a task, which workflow should we run?
• Mine information
  – Discover interesting patterns (e.g., common workflow patterns) → recommendation systems, analogy discovery
  – Identify homogeneous workflow groups by clustering → organize collections [Santos et al., IPAW 2008]
  – Infer workflow specification from execution log [Aalst et al., TKDE 2004]

SLIDE 3

Guidance in Workflow Design


SLIDE 5

VisComplete: A Workflow Recommendation System

[Koop et al., IEEE Vis 2008]

• Mine graph fragments that co-occur in a provenance collection
• Predict sets of likely workflow additions to a given partial workflow
• Similar to a Web browser suggesting URL completions

Provenance Repository
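
The mine-then-predict idea can be illustrated with a deliberately simplified sketch. It is not the VisComplete algorithm from [Koop et al., IEEE Vis 2008]: instead of graph fragments it counts only single module-to-module connections, and the module names, the `mine_successors` and `suggest` helpers, and the sample collection are all hypothetical.

```python
from collections import Counter, defaultdict

def mine_successors(workflows):
    """Count how often module `dst` directly follows module `src` in the collection."""
    counts = defaultdict(Counter)
    for edges in workflows:
        for src, dst in edges:
            counts[src][dst] += 1
    return counts

def suggest(counts, partial_edges, last_module, k=2):
    """Rank likely additions downstream of `last_module`, skipping modules already present."""
    present = {module for edge in partial_edges for module in edge}
    ranked = counts[last_module].most_common()
    return [module for module, _ in ranked if module not in present][:k]

# Hypothetical provenance collection: each workflow is a list of
# (upstream module, downstream module) connections.
collection = [
    [("Reader", "Contour"), ("Contour", "Mapper"), ("Mapper", "Renderer")],
    [("Reader", "Contour"), ("Contour", "Smooth"), ("Smooth", "Mapper")],
    [("Reader", "Contour"), ("Contour", "Mapper")],
]

counts = mine_successors(collection)
suggestions = suggest(counts, [("Reader", "Contour")], "Contour")
print(suggestions)
```

Ranking by raw frequency mirrors the URL-completion analogy: the most common continuation in the collection is offered first.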


SLIDE 7

Querying Provenance

• Provenance is a graph
• Visual interfaces to specify queries [Beeri et al., VLDB 2006; Scheidegger et al., TVCG 2007]
  – WYSIWYQ: What You See Is What You Query
• Visual interfaces to explore the results [Ellkvist et al., KEYS 2009]

Generate descriptive snippets
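
Because provenance is a graph, a basic dependency query reduces to graph traversal. The sketch below is an illustration only, not the query interface of any cited system: it assumes provenance is stored as a hypothetical `derived_from` mapping from each item to the items it was derived from, and collects everything upstream of a data product.

```python
from collections import deque

def upstream(derived_from, target):
    """Return every node that `target` transitively depends on (breadth-first)."""
    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in derived_from.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

# Hypothetical provenance graph: item -> items it was derived from.
derived_from = {
    "plot.png": ["render"],
    "render": ["isosurface"],
    "isosurface": ["load"],
    "load": ["scan.vtk"],
    "histogram.png": ["load"],
}

lineage = upstream(derived_from, "plot.png")
print(sorted(lineage))
```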

SLIDE 8

Querying Provenance

• Provenance is a graph
• Visual interfaces to specify queries [Beeri et al., VLDB 2006; Scheidegger et al., TVCG 2007]
  – WYSIWYQ: What You See Is What You Query
• Visual interfaces to explore the results [Ellkvist et al., KEYS 2009]

Summarize collection by clustering

SLIDE 9

Comparing Results

• Ability to compare data products and corresponding workflows [Freire et al., IPAW 2006]
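
One way to picture a workflow comparison is as a diff over modules and their parameter values. The sketch below is a hypothetical illustration, not the method of [Freire et al., IPAW 2006]: it assumes each workflow is a plain dictionary mapping a module name to its parameters, and ignores connections between modules.

```python
def diff_workflows(a, b):
    """Report modules added, removed, or changed between workflow versions a and b."""
    added = sorted(set(b) - set(a))
    removed = sorted(set(a) - set(b))
    changed = {}
    for module in set(a) & set(b):
        keys = set(a[module]) | set(b[module])
        # Keep only parameters whose values differ; record (old, new) pairs.
        delta = {
            key: (a[module].get(key), b[module].get(key))
            for key in keys
            if a[module].get(key) != b[module].get(key)
        }
        if delta:
            changed[module] = delta
    return {"added": added, "removed": removed, "changed": changed}

# Hypothetical workflow versions: module name -> parameter settings.
v1 = {"Reader": {"file": "head.vtk"}, "Contour": {"isovalue": 57}}
v2 = {
    "Reader": {"file": "head.vtk"},
    "Contour": {"isovalue": 90},
    "Smooth": {"iterations": 10},
}

result = diff_workflows(v1, v2)
print(result)
```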

SLIDE 10

Mining Provenance: Challenges

• Provenance is a graph: mining is expensive
• Workflow structure is complex
  – Modules with parameters and values
  – Typed connections
• How to model provenance?
  – For clustering, a vector-space representation produced results correlated with those obtained using a more expensive structural representation [Santos et al., IPAW 2008]
• Which notions of distance and which metrics make sense for different applications and data sets?
• Which algorithms are effective and efficient? [Lauro Lins, Nivan Ferreira. Work in progress]
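
To make the vector-space idea concrete, here is a minimal sketch that assumes a bag-of-modules encoding and cosine similarity. Both choices are assumptions for illustration; the slide does not say which features or distance [Santos et al., IPAW 2008] used, and the module names are hypothetical.

```python
import math
from collections import Counter

def to_vector(modules):
    """Represent a workflow as module-occurrence counts, ignoring graph structure."""
    return Counter(modules)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical workflows given as lists of module names.
w1 = to_vector(["Reader", "Contour", "Mapper", "Renderer"])
w2 = to_vector(["Reader", "Contour", "Smooth", "Mapper", "Renderer"])
w3 = to_vector(["HTTPFetch", "Parse", "Histogram"])

sim_close = cosine(w1, w2)  # share four of five modules
sim_far = cosine(w1, w3)    # share no modules
print(round(sim_close, 3), sim_far)
```

Pairwise similarities like these can feed any standard clustering algorithm. The trade-off is that connection structure is discarded in exchange for much cheaper comparison than graph matching.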

SLIDE 11

Mining Provenance: Challenges

Understanding User Behavior

[DEFOG system, Lins et al.]

  • Need analysis/visualization tools

SLIDE 12

Acknowledgments

• This work is partially supported by the National Science Foundation grants IIS 1050422, IIS 0905385, IIS 0844572, IIS 0746500, CNS 0751152; the Department of Energy; an IBM Faculty Award; and a University of Utah Seed Grant.

SLIDE 13

Ευχαριστω Thank you Obrigada