Robert Ikeda, Jennifer Widom, Stanford University


  1. Robert Ikeda, Jennifer Widom, Stanford University

  2. Example: pipeline for sales predictions
      [Diagram: customer lists CustList 1, CustList 2, ..., CustList n‐1, CustList n (Europe, USA), ClothCo Items, and Buying Patterns, connected through the nodes Dedup, Union, Predict, and ItemAgg to the output ItemVolumes]

  3. Example
      [Diagram: sales-prediction pipeline]

  4. Example
      [Diagram: sales-prediction pipeline]
      Item | Demand
      Cowboy Hat | 3 ?

  5. Example
      [Diagram: sales-prediction pipeline]
      Name | Item
      Amelie | Cowboy Hat
      Jacques | Cowboy Hat
      Isabelle | Cowboy Hat

      Item | Demand
      Cowboy Hat | 3 ?

  6. Example
      [Diagram: sales-prediction pipeline]
      Name | Address
      Amelie | …Paris, TX
      Jacques | …Paris, TX
      Isabelle | …Paris, TX

  7. Example
      [Diagram: sales-prediction pipeline, with an X mark on the diagram]
      Name | Address
      Amelie | 65, quai d'Orsay, Paris
      Jacques | 39, rue de Bretagne, Paris
      Isabelle | 20 Rue D'orsel, Paris

      Name | Address
      Amelie | 65, quai d'Orsay, Paris, France
      Jacques | 39, rue de Bretagne, Paris, France
      Isabelle | 20 Rue D'orsel, Paris, France

  8. Example
      [Diagram: sales-prediction pipeline]
      Item | Demand
      Beret | 3

  9. Panda
      Past work tends to be… | Panda…
      1. Either data-based or process-based | Capture both: “data-oriented workflows”
      2. Focused on modeling and capturing provenance | Also provenance operators and queries
      3. Specific application domains | General-purpose

  10. Remainder of Talk
      • Processing nodes and provenance capture
      • Provenance operations
      • Provenance queries
      • System and other issues
      • Current research

  11. Processing Nodes
      [Diagram: sales-prediction pipeline]
      • Relational nodes: structured, well-understood operations
      • Opaque nodes

  12. Provenance Capture
      • Model
        ― Likely to be similar to Open Provenance Model
        ― Support provenance at a variety of granularities
      • Interface
        ― Allow processing nodes to create and manipulate provenance
        ― For relational operations, can plug in existing provenance work

  13. Provenance Operations
      • Basic operations
        ― Backward tracing
          → Where did the cowboy-hat record come from?
        ― Forward tracing
          → Which predictions did this customer contribute to?
      [Diagram: sales-prediction pipeline]

  14. Provenance Operations
      • Examples of additional functionality
        ― Forward propagation
          → Update all affected predictions after customers have moved from France to Texas
      [Diagram: sales-prediction pipeline]

  15. Provenance Operations
      • Examples of additional functionality
        ― Refresh ≈ Backward tracing + forward propagation
          → Get latest predicted volume for cowboy hat sales (only) using latest customer lists and buying patterns
      [Diagram: sales-prediction pipeline]

  16. Provenance Queries
      • Examples
        ― How many people from each country contributed to the cowboy hat prediction?
        ― Which customer list contributed the most to the top 100 predicted items?
      [Diagram: sales-prediction pipeline]

  17. Provenance Queries
      • Examples
        ― How many people from each country contributed to the cowboy hat prediction?
        ― Which customer list contributed the most to the top 100 predicted items?
      • Seamlessly combine provenance and data
      • Compact and intuitive language
      • Amenable to optimization

  18. System and Other Issues
      • Query-driven provenance capture
      • Eager vs. lazy computation and storage
      • Fine-grained vs. coarse-grained
      • Approximate provenance

  19. Current Research
      • Building up basic system infrastructure
      • Refresh
        ― Efficiently compute the up-to-date value of selected output elements
      • Theoretical challenges
        ― Optimizing provenance storage vs. recomputation

  20. System Infrastructure
      • Handles structured relational operations as well as arbitrary Python processing nodes
      • Arbitrary acyclic transformation graphs
      • Backward tracing and forward propagation

  21. Refresh
      • Problem
        ― Efficiently compute the up-to-date value of selected output elements
      • Challenges
        ― Formally defining the refresh problem
        ― Understanding when refresh can be done efficiently
        ― Supporting a wide class of transformations and workflows

  22. Future Work
      • Most everything in this talk ☺

  23. Parag Agrawal, Abhijeet Mohapatra, Raghotham Murthy, Aditya Parameswaran, Hyunjung Park, Alkis Polyzotis, Semih Salihoglu

  24. Extra Slides

  25. Running Example
      [Diagram: sales-prediction pipeline, output labeled O]

  26. PANDA

  27. Robert Ikeda, Jennifer Widom, Stanford University

  28. Panda’s Niche
      Past work | Panda
      1. Data-based or process-based | Merge data-based and process-based
      2. Modeling and capturing provenance | Provenance operators and queries
      3. Specific application domains | General-purpose

  29. Overview of Past Work
      1. Data-based or process-based
      2. Modeling and capturing provenance
      3. Specific application domains

  30. Running Example
      [Diagram: sales-prediction pipeline, output labeled O, annotated "Paris, France ?" and "Paris, Texas !"]

  31. Running Example: Pipeline for Sales Prediction
      [Diagram: sales-prediction pipeline, output labeled O]

  32. Provenance Capture
      • Processing Nodes
        ― Relational operations
        ― Opaque processing
      • Requirements
        ― Interface
        ― Model

  33. Running Example
      [Diagram: sales-prediction pipeline, annotated "Paris, France ?" and "Paris, Texas !"]

  34. Processing Nodes
      • Relational Operations
        ― Relational operations
        ― Opaque processing
      • Opaque Processing
        ― Interface
        ― Model

  35. Provenance Queries
      • Operate over provenance and data
      • Compact and intuitive
      • Amenable to efficient planning
      Considering only customers from a specific list, which items are in the highest demand?

  36. Provenance Queries
      • Seamlessly combine provenance and data
      • Compact and intuitive language
      • Amenable to optimization

  37. Provenance Query Examples
      • How many people from each country contributed to the cowboy hat prediction?
      • Which customer list contributed the most to the top 100 predicted items?

  38. Running Example
      [Diagram: sales-prediction pipeline]
      Name | Address
      Amelie | 65, quai d'Orsay, Paris
      Jacques | 39, rue de Bretagne, Paris
      Isabelle | 20 Rue D'orsel, Paris

      Name | Item
      Amelie | Cowboy Hat
      Jacques | Cowboy Hat
      Isabelle | Cowboy Hat

      Name | Address
      Amelie | …Paris, TX
      Jacques | …Paris, TX
      Isabelle | …Paris, TX

      Item | Demand
      Cowboy Hat | 3

  39. Running Example
      [Diagram: sales-prediction pipeline]
      Name | Address
      Amelie | 65, quai d'Orsay, Paris
      Jacques | 39, rue de Bretagne, Paris
      Isabelle | 20 Rue D'orsel, Paris

  40. Running Example
      [Diagram: sales-prediction pipeline]
      Name | Address
      Amelie | 65, quai d'Orsay, Paris
      Jacques | 39, rue de Bretagne, Paris
      Isabelle | 20 Rue D'orsel, Paris

  41. Running Example
      [Diagram: sales-prediction pipeline]
      Name | Address
      Amelie | 65, quai d'Orsay, Paris
      Jacques | 39, rue de Bretagne, Paris
      Isabelle | 20 Rue D'orsel, Paris

      Item | Demand
      Beret | 3

  42. Processing Nodes
      [Diagram: sales-prediction pipeline, output labeled O]
      Relational Nodes: Structured, well-understood operations
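
The backward tracing, forward tracing, and provenance-query ideas from slides 13 and 16 can be illustrated with a toy version of the sales-prediction pipeline. This is only a minimal sketch under stated assumptions, not Panda's actual model or interface: the sample data, the country-based stand-in for the Predict node, and every function name below are hypothetical, and provenance is simply stored as sets of contributing record identifiers.

```python
from collections import defaultdict

# Hypothetical input data: (customer name, country) rows per customer list.
CUST_LISTS = {
    "CustList1": [("Amelie", "France"), ("Jacques", "France")],
    "CustList2": [("Isabelle", "France"), ("Amelie", "France")],
    "CustListN": [("Bob", "USA")],
}

# Hypothetical stand-in for the Predict node: country -> predicted item.
BUYING_PATTERNS = {"France": "Beret", "USA": "Cowboy Hat"}


def run_pipeline(cust_lists, patterns):
    """Run Union + Dedup, then Predict + ItemAgg, recording provenance."""
    # Union + Dedup: one customer record per name; remember its source rows.
    customers = {}                  # name -> country
    cust_prov = defaultdict(set)    # name -> {(list name, row index)}
    for list_name, rows in cust_lists.items():
        for i, (name, country) in enumerate(rows):
            customers[name] = country
            cust_prov[name].add((list_name, i))

    # Predict + ItemAgg: count predicted purchases per item.
    volumes = defaultdict(int)      # item -> demand
    item_prov = defaultdict(set)    # item -> {customer name}
    for name, country in customers.items():
        item = patterns[country]
        volumes[item] += 1
        item_prov[item].add(name)
    return dict(volumes), customers, cust_prov, item_prov


def backward_trace(item, cust_prov, item_prov):
    """Input rows that contributed to the prediction for `item`."""
    return {src for name in item_prov.get(item, ()) for src in cust_prov[name]}


def forward_trace(source, cust_prov, item_prov):
    """Predicted items that the input row `source` contributed to."""
    names = {n for n, srcs in cust_prov.items() if source in srcs}
    return {item for item, ns in item_prov.items() if ns & names}


def contributors_by_country(item, customers, item_prov):
    """Provenance query: contributors to `item`, counted per country."""
    counts = defaultdict(int)
    for name in item_prov.get(item, ()):
        counts[customers[name]] += 1
    return dict(counts)


volumes, customers, cust_prov, item_prov = run_pipeline(CUST_LISTS, BUYING_PATTERNS)
```

With this sample data, `backward_trace("Cowboy Hat", cust_prov, item_prov)` returns the single row of `CustListN`, and `contributors_by_country("Beret", customers, item_prov)` combines provenance (who contributed) with data (each contributor's country) in the way the example queries suggest.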
