Worksheets: Percy Liang, UCI Reproducibility Symposium, September 22, 2020


  1. Worksheets. Percy Liang, UCI Reproducibility Symposium, September 22, 2020

  2. The current research process

  6. Problem 1: reproducibility

     |           | Previous method | New method   |
     |-----------|-----------------|--------------|
     | Dataset 1 | 88% accuracy    | 92% accuracy |
     | Dataset 2 | 72% accuracy    | 77% accuracy |
     | Dataset 3 | ?               | ?            |
     | Dataset 4 | ?               | ?            |
     | ...       | ...             | ...          |

  11. Problem 2: efficiency
      Step 1: come up with a good idea
      Step 2: execute on it
      • Obtain data, clean it, convert between formats
      • Try to reproduce results from previous work, email authors
      • Run experiments with different versions, keep track of provenance

  13. Tradeoff? Efficiency vs. reproducibility
      Folk wisdom: reproducibility slows down research.
      Our claim: reproducibility accelerates research (with the right tool).

  14. MLcomp.org (2008)

  17. MLcomp paradigm
      dataset + algorithm → accuracy metrics
      Problem: too rigid, doesn’t help with the efficiency problem

  18. CodaLab Worksheets (2013-present)

  19. Bundles, Worksheets

  20. Bundles
      Bundle: an arbitrary file/directory (code or data or results)
      Example identifier: 0x191aad8fa0ae4741b3123b15a8d59efa

  22. Bundles
      • Uploaded by user (code or data)
      • Derived by running an arbitrary command

  24. Bundles
      cnn.py (0x45d17c):
        #!/usr/bin/python
        import numpy as np
        ...
      mnist (0x1ba223):
        - train.dat
        - test.dat
      exp2 (0x2d4192), derived from cnn.py (as cnn.py) and mnist (as data):
        command: python cnn.py data/train.dat data/test.dat
        - cnn.py
        - data/train.dat
        - data/test.dat
        - stdout
        - stderr
        - stats.json
      exp ...
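The two kinds of bundle described above (uploaded, or derived by running a command over other bundles) can be modeled in a few lines of Python. This is only an illustrative sketch: the `Bundle` class, its field names, and the `upload` and `run` helpers are hypothetical and do not reflect CodaLab's actual schema or API. The random identifier is also an assumption about how IDs might be generated.

```python
import uuid
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class Bundle:
    """Hypothetical in-memory model of a bundle (not CodaLab's real schema)."""
    name: str
    command: Optional[str] = None  # None means the bundle was uploaded
    dependencies: Mapping[str, "Bundle"] = field(default_factory=dict)
    # Assumption: identifiers are random and rendered as 0x + 32 hex digits.
    bundle_id: str = field(default_factory=lambda: "0x" + uuid.uuid4().hex)

    @property
    def is_run(self) -> bool:
        """A bundle is a run if it was derived by executing a command."""
        return self.command is not None


def upload(name: str) -> Bundle:
    """Create an uploaded bundle (code or data)."""
    return Bundle(name=name)


def run(name: str, command: str, deps: Mapping[str, Bundle]) -> Bundle:
    """Create a derived bundle; deps maps a path inside the run to a bundle."""
    return Bundle(name=name, command=command, dependencies=dict(deps))


cnn = upload("cnn.py")
mnist = upload("mnist")
exp2 = run(
    "exp2",
    "python cnn.py data/train.dat data/test.dat",
    {"cnn.py": cnn, "data": mnist},
)
print(exp2.name, "depends on", sorted(exp2.dependencies))
```

The keys of `deps` play the role of the names under which each dependency is visible to the command, which is why the command can refer to `data/train.dat`.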

  31. Command-line Interface (CLI)
      Search for existing code and data:
        $ cl search mnist
      Upload new code or data:
        $ cl upload cnn.py
      Run experiments with arbitrary commands:
        $ cl run :cnn.py data:mnist "python cnn.py data/train.dat data/test.dat"
      Look at output of runs:
        $ cl cat exp2/stdout
      Manage runs:
        $ cl kill exp2; cl rm exp2
      Run an entire pipeline with a different dataset or newer version of your code:
        $ cl mimic mnist exp2 cifar -n exp3
      Copy from one CodaLab instance to another:
        $ cl add bundle mnist stanford::pliang-demo main::pliang-demo
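The idea behind `cl mimic mnist exp2 cifar -n exp3` is to re-derive an output with one input swapped for another. The sketch below shows one way that substitution could work on a dependency graph. It is not CodaLab's implementation: the `Bundle` class, the `mimic` function signature, and the `-mimic` naming of intermediate bundles are all assumptions made for illustration, and no commands are actually executed.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Bundle:
    """Hypothetical bundle record: deps is a tuple of (key, Bundle) pairs."""
    name: str
    command: Optional[str] = None
    deps: Tuple[Tuple[str, "Bundle"], ...] = ()


def depends_on(bundle: Bundle, target: Bundle) -> bool:
    """True if bundle is target or transitively depends on it."""
    return bundle is target or any(depends_on(d, target) for _, d in bundle.deps)


def mimic(old_input: Bundle, old_output: Bundle,
          new_input: Bundle, new_name: str) -> Bundle:
    """Rebuild old_output's derivation with old_input replaced by new_input."""
    cache: Dict[int, Bundle] = {}

    def rebuild(b: Bundle) -> Bundle:
        if b is old_input:
            return new_input
        if not depends_on(b, old_input):
            return b  # unaffected bundles are reused as they are
        if id(b) not in cache:
            new_deps = tuple((key, rebuild(dep)) for key, dep in b.deps)
            name = new_name if b is old_output else b.name + "-mimic"
            cache[id(b)] = Bundle(name, b.command, new_deps)
        return cache[id(b)]

    return rebuild(old_output)


cnn = Bundle("cnn.py")
mnist = Bundle("mnist")
cifar = Bundle("cifar")
exp2 = Bundle("exp2", "python cnn.py data/train.dat data/test.dat",
              (("cnn.py", cnn), ("data", mnist)))

exp3 = mimic(mnist, exp2, cifar, "exp3")
print(exp3.name, [(key, dep.name) for key, dep in exp3.deps])
```

Because the original bundles are never modified, `exp2` still points at `mnist` after the call, and `exp3` records the same command over `cifar`.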

  44. Modularity
      Real-world problems require efforts of entire community
      People specialize, contribute in decentralized way

  49. Intermediate tasks
      • Old way: use intermediate metrics, rhetoric
      • New way: plug in and see ramifications automatically
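"See ramifications automatically" amounts to finding every run that sits downstream of the component being swapped. A minimal sketch of that lookup follows, assuming the dependency graph is available as a plain dictionary. The function name `downstream` and the bundle names in the example (`parser`, `qa-run`, and so on) are hypothetical and not taken from the talk.

```python
from collections import deque
from typing import Dict, List, Set


def downstream(deps: Dict[str, List[str]], changed: str) -> List[str]:
    """Return every bundle that transitively depends on `changed` (BFS order)."""
    # Invert the edges: for each input, which bundles consume it?
    consumers: Dict[str, List[str]] = {}
    for bundle, inputs in deps.items():
        for inp in inputs:
            consumers.setdefault(inp, []).append(bundle)

    seen: Set[str] = set()
    order: List[str] = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for consumer in consumers.get(node, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


# Hypothetical pipeline: each key lists the bundles it was derived from.
deps = {
    "parse-run": ["parser", "corpus"],
    "qa-run": ["parse-run", "qa-model"],
    "eval-run": ["qa-run"],
    "other-run": ["corpus"],
}
affected = downstream(deps, "parser")
print(affected)
```

Replacing `parser` here would require re-running the three runs that lead to the final evaluation, while `other-run` is untouched.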

  51. Immutability
      Inspiration: Git version control system
      • All programs/datasets/runs are write-once
      • Enable collaboration without chaos
      • Capture the research process in a reproducible way
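A write-once store can be sketched in the style of Git, where the key is derived from the content so existing entries can never be overwritten. This is an assumption for illustration only: the `WriteOnceStore` class is hypothetical, and the talk does not say that bundle identifiers are content hashes.

```python
import hashlib
from typing import Dict


class WriteOnceStore:
    """Hypothetical Git-style store: entries are added, never modified."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        """Store content under a key derived from its hash and return the key."""
        key = "0x" + hashlib.sha256(content).hexdigest()[:32]
        # setdefault never replaces an existing entry, so writes are write-once.
        self._blobs.setdefault(key, content)
        return key

    def get(self, key: str) -> bytes:
        """Return the content stored under key."""
        return self._blobs[key]


store = WriteOnceStore()
v1 = store.put(b"print('version 1')")
v2 = store.put(b"print('version 2')")  # a new version gets a new key
print(v1 != v2, store.get(v1))
```

There is deliberately no update or delete method: a changed program becomes a new entry, and anything that referred to the old key still sees the old content.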

  52. Bundles, Worksheets

  55. Literacy
      Bundle graphs are about truth; what about interpretation?
      Worksheet: an arbitrary document with embedded bundles (descriptions interleaved with bundles)
      Inspiration: Mathematica notebook, Jupyter notebook

  58. A worksheet
      We now train the classifier with more data.

      | Program  | Arguments | Dataset | Error | Time     |
      |----------|-----------|---------|-------|----------|
      | SVMlight | -n 2000   | thyroid | 2.6%  | 1 second |

      Notice that the error remains the same, suggesting that we’ve saturated our model.
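The worksheet above alternates prose with a row of metadata about an embedded run. A minimal sketch of that structure is shown here, assuming a worksheet is just an ordered list of text items and bundle references. The `Text`, `BundleRef`, and `render` names are hypothetical and do not correspond to CodaLab's worksheet markup.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Text:
    """A free-text description written by the author."""
    body: str


@dataclass(frozen=True)
class BundleRef:
    """An embedded bundle, shown as selected metadata fields."""
    fields: Dict[str, str]


def render(items: List[Union[Text, BundleRef]]) -> str:
    """Render a worksheet as plain text, one line per item."""
    lines = []
    for item in items:
        if isinstance(item, Text):
            lines.append(item.body)
        else:
            lines.append(" | ".join(f"{k}: {v}" for k, v in item.fields.items()))
    return "\n".join(lines)


worksheet = [
    Text("We now train the classifier with more data."),
    BundleRef({"Program": "SVMlight", "Arguments": "-n 2000",
               "Dataset": "thyroid", "Error": "2.6%", "Time": "1 second"}),
    Text("Notice that the error remains the same."),
]
rendered = render(worksheet)
print(rendered)
```

Keeping the bundle as a reference rather than pasted numbers means the displayed values come from the run itself, while the surrounding text carries the interpretation.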

  60. Example bundle graph (figure)
      nanc-1m.txt (0xc19b66): "Two New Orleans..."
      run-count (0xd4815b): - stdout
      run1 (0xad3d69): - stdout
      run2 (0x992ced): - stdout
      Edge labels: data, data
      Values shown in the figure: 1 1, 2 4, 3 9; 415, 872
