Managing Messes in Computational Notebooks

  1. Managing Messes in Computational Notebooks. Andrew Head, Fred Hohman, Titus Barik, Steven M. Drucker, and Robert DeLine. UC Berkeley · Georgia Tech · Microsoft Research

  2. Computational Notebooks: Code, Text, and Output. Callouts: rich descriptions, code, output.

  3. Notebook Programming Interfaces Abound

  4. Notebook Model of Exploratory Programming 1. Incremental execution

  5. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output

  6. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes

  7. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout

  8. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout. 1 WEEK PASSES

  9. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout

  10. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout. 1 WEEK LATER: 1. How did I produce this result?

  11. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout. Callout: "Which petal_length?" 1 WEEK LATER: 1. How did I produce this result?

  12. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout. 1 WEEK LATER: 1. How did I produce this result? 2. Didn't I have a better version of this?

  13. Notebook Model of Exploratory Programming 1. Incremental execution 2. In-situ output 3. Incremental changes 4. Control over layout. 1 WEEK LATER: 1. How did I produce this result? 2. Didn't I have a better version of this? 3. What can I get rid of?

  14. Messes in Computational Notebooks. Disappearance: deleted / overwritten code; "Notebooks contain ugly code and dirty tricks" [Rule et al. 2018]. Disorder: out-of-order execution; 1/2 of notebooks on GitHub [Rule et al. 2018]. Dispersion: too many cells; 31 / 41 surveyed participants had trouble finding prior analyses [Kery et al. 2018].
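
The "disorder" mess on slide 14 can be made concrete with a short check. This is a minimal sketch, not the analysis used by Rule et al.: it assumes the standard Jupyter `.ipynb` JSON layout, in which each code cell carries an `execution_count`, and the function name `out_of_order_cells` is hypothetical. It flags code cells whose execution count is lower than that of a code cell above them, which means the cells were not last run top to bottom.

```python
def out_of_order_cells(notebook: dict) -> list[int]:
    """Return positions of code cells that were last executed before
    some code cell that appears above them in the notebook."""
    flagged = []
    highest_seen = 0
    for position, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are never executed
        count = cell.get("execution_count")
        if count is None:
            continue  # code cell that has not been run
        if count < highest_seen:
            flagged.append(position)
        highest_seen = max(highest_seen, count)
    return flagged


# Hypothetical notebook, written inline instead of loaded with json.load().
example = {
    "cells": [
        {"cell_type": "code", "execution_count": 1, "source": "import pandas"},
        {"cell_type": "markdown", "source": "Notes"},
        {"cell_type": "code", "execution_count": 5, "source": "df = load()"},
        {"cell_type": "code", "execution_count": 3, "source": "df.plot()"},
        {"cell_type": "code", "execution_count": None, "source": "scratch"},
        {"cell_type": "code", "execution_count": 6, "source": "df.head()"},
    ]
}
print(out_of_order_cells(example))
```

A check like this only detects disorder in the saved notebook. It cannot recover deleted or overwritten code, which is why the talk later turns to an execution log.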

  15. Managing Messes in Computational Notebooks. How can tools help analysts find, recover, and compare code in messy notebooks? Outline: How messes happen (done) · CODE GATHERING TOOLS (current) · Tools in context · Implementation · Qualitative usability study

  16. CODE GATHERING TOOLS Demo. 1 WEEK PASSES

  17. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. How did I produce this?

  18. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. How did I produce this? Callouts: variables, outputs.

  19. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. How did I produce this?

  20. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. How did I produce this? Request cell subset that produced the result. 1 WEEK PASSES

  21. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. How did I produce this? Request cell subset that produced the result. 1 WEEK PASSES

  22. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. How did I produce this? Request cell subset that produced the result. The gathered code is reduced, ordered, and complete.

  23. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Didn't I have a better version of this?

  24. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Didn't I have a better version of this? Open a version browser for a result. 1 WEEK PASSES

  25. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Didn't I have a better version of this? Open a version browser for a result.

  26. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Didn't I have a better version of this? Open a version browser for a result.

  27. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Didn't I have a better version of this? Open a version browser for a result.

  28. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Didn't I have a better version of this? Open a version browser for a result. 1 WEEK PASSES

  29. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Open a version browser for a result. Task 3: Cleaning Notebook. What code can I get rid of?

  30. CODE GATHERING TOOLS Demo. Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Open a version browser for a result. Task 3: Cleaning Notebook. What code can I get rid of? ... Request cell subset that produced the result.

  31. CODE GATHERING TOOLS Demo. How can tools help analysts manage messes in their notebooks? Task 1: Recovering Code. Request cell subset that produced the result. Task 2: Comparing Versions. Open a version browser for a result. Task 3: Cleaning Notebook. ... Request cell subset that produced the result.

  32. Post-Hoc Mess Management: helping analysts clean and navigate their code whether or not they adopted a strategy to version or organize their code.

  33. Managing Messes in Computational Notebooks. How can tools help analysts find, recover, and compare code in messy notebooks? Outline: How messes happen (done) · CODE GATHERING TOOLS (done) · Tools in context (done) · Implementation (current) · Qualitative usability study

  34. Implementation: Slicing Notebooks. [Diagram: (1) Notebook, with some cells missing and some cells out-of-order, leading through an unspecified step ("?") to cleaned, ordered notebooks and versioned results.]

  35. Implementation: Slicing Notebooks. [Diagram: (1) Notebook: some cells missing, some cells out-of-order. (2) Execution Log: all cells present, in-order, arranged by execution time.]

  36. Implementation: Slicing Notebooks. [Diagram: (1) Notebook: some cells missing, some cells out-of-order. (2) Execution Log: all cells present, in-order, arranged by execution time.]

  37. Implementation: Slicing Notebooks. [Diagram: (1) Notebook: some cells missing, some cells out-of-order. (2) Execution Log: all cells present, in-order, arranged by execution time. (3) Program Slices [Weiser '81].]

  38. Implementation: Slicing Notebooks. [Diagram: (1) Notebook: some cells missing, some cells out-of-order. (2) Execution Log: all cells present, in-order, arranged by execution time. (3) Program Slices [Weiser '81], which can be used to make cleaned, ordered notebooks (preserve cell boundaries and outputs) and versioned results (slice all cell versions).]
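
The pipeline on slides 34 to 38 (notebook, execution log, program slices) can be sketched at cell granularity. This is a rough illustration under stated assumptions, not the authors' implementation: the log is assumed to be a list of `(execution_count, source)` pairs in execution order, the names `defs_and_uses` and `gather` are hypothetical, and dependencies are approximated by the names each cell assigns and reads, found with Python's standard `ast` module. It ignores state changed through method calls or subscript assignment, so a real slicer would need a finer analysis.

```python
import ast


def defs_and_uses(source: str) -> tuple[set[str], set[str]]:
    """Names a cell binds (defs) and names it reads (uses)."""
    defs, uses = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            else:
                uses.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # "import a.b" binds "a"; "import a as x" binds "x"
                defs.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            defs.add(node.name)
    return defs, uses


def gather(log: list[tuple[int, str]], target: int) -> list[tuple[int, str]]:
    """Backward slice: the log entries needed to reproduce log[target]."""
    _, needed = defs_and_uses(log[target][1])
    kept = [log[target]]
    # Walk back in time, keeping the most recent definition of each needed name.
    for entry in reversed(log[:target]):
        defs, uses = defs_and_uses(entry[1])
        if defs & needed:
            kept.append(entry)
            needed = (needed - defs) | uses
    kept.reverse()  # restore execution order
    return kept


# Hypothetical execution log: (execution count, cell source).
log = [
    (1, "import math"),
    (2, "radius = 2"),
    (3, "radius = 3"),
    (4, "unrelated = 'scratch work'"),
    (5, "area = math.pi * radius ** 2"),
    (6, "print(area)"),
]
for count, source in gather(log, target=5):
    print(f"[{count}] {source}")
```

In this example the slice keeps executions 1, 3, 5, and 6: the overwritten `radius = 2` and the unrelated scratch cell are dropped, and the rest stays in execution order. That mirrors the "reduced, ordered, complete" description from the demo, with completeness holding only for the name-level dependencies this sketch tracks.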

  39. Cleaning and Exploring Messy Notebooks: A Sample of Recent Research. Techniques shown: output recipes, artifact explorer, cell folding, tabbed browsing, cell versions, diffs of cell versions. Papers: "Interactions for Untangling Messy History in a Computational Notebook" (Kery et al., VL/HCC '18); "Towards Effective Foraging by Data Scientists to Find Past Analysis Choices" (Kery et al., CHI '19); "Aiding Collaborative Reuse of Computational Notebooks with Annotated Cell Folding" (Rule et al., CSCW '18); "Design and Use of Computational Notebooks" (Rule, Ph.D. Thesis, '18).

  40. Evaluating Code Gathering Tools. Q1. What is the meaning of "cleaning"? Q2. How do analysts use code gathering tools during exploratory data analysis?

  41. A Qualitative Study of Gathering. Participants: N = 12 professional data analysts. Cleaning Task × 2: clean a computational notebook, with and without code gathering tools. Exploration: rank movies from a movies dataset, using code gathering tools as you wish.

  42. Q1. The Meaning of "Cleaning". Picking a subset of cells [P1-P12] ... and removing the rest [P8, P10-12]. "I picked a plot that looked interesting and, if you think of a dependency tree of cells, walked backwards and removed everything that wasn’t necessary." ... And many additional stages: merging cells [P11]; writing documentation [P1, P5, P7, P10, P11]; polishing visualizations [P3, P4, P6, P12]; restructuring code [P1, P6]; integrating with version control [P7].

  43. Q2. How do analysts use code gathering tools during exploratory data analysis? [Bar chart: usefulness ratings (very useful, somewhat useful, not useful, no basis to answer) for gathering to a notebook, highlighting dependencies, and the version browser; x-axis: # participants, 0 to 12.] Participants described gathering to a notebook as "beautiful" and "amazing": it "hits the nail on the head."

  44. Some Observed Uses of Gathering Tools: gathering for multiple audiences, "finishing moves", creating personal references, lightweight branching.
