Improving Reproducible Deep Learning Workflows with DeepDIVA



SLIDE 1

Improving Reproducible Deep Learning Workflows with DeepDIVA

M. Alberti¹*, V. Pondenkandath¹*, L. Vögtlin¹, M. Würsch¹,², R. Ingold¹, M. Liwicki¹,³

*Equal contribution

¹ DIVA Group, University of Fribourg, Switzerland
² IIT, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Switzerland
³ EISLAB Machine Learning, Luleå University of Technology, Sweden

SLIDE 2

Reproducibility Crisis: Trust or Verify?

Joelle Pineau, “Reproducible, Reusable, and Robust Reinforcement Learning”, invited talk @NeurIPS 2018, Montreal, Canada

SLIDE 3

Why Is This a Problem?

  • No possibility to verify
  • No possibility to extend
  • Lots of overhead created
  • Leads to no trust in scientific results

SLIDE 4

How To Make Steps Forward?

  • Ensure reproducibility
      • Of your own experiments
      • Of other people’s experiments
  • Promote open-source code
      • Make it easy to have “good enough” code
      • Enable code trustworthiness

SLIDE 5

How We Contribute: DeepDIVA

  • Open-source Python framework
  • Built on top of PyTorch
  • Makes your life easier for:
      • Reproducing your own and other people’s experiments
  • Provides boilerplate code for:
      • Common deep learning scenarios
      • Handling time-consuming everyday problems
  • Documentation & tutorial available

SLIDE 6

Reproducing Your Own Experiments

  • Short-term, or work in progress
  • Long-term, or finished work

SLIDE 7

Short-term Reproducibility Dangers

  • Kilometres of poor or incomplete log files
  • Stochasticity in the process

SLIDE 8

How DeepDIVA Ensures Short-term Reproducibility

  • Meaningful logging
      • Saving all run parameters and command line args
      • Providing concise coloured logs
  • Deterministic runs
      • Seeding the pseudo-random number generators: Python, NumPy and PyTorch
      • Disabling cuDNN (NVIDIA CUDA Deep Neural Network library) when necessary
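The two ideas on this slide (record every run parameter, make the random streams deterministic) can be sketched in a few lines. This is a minimal illustration and not DeepDIVA's own code: the function names `seed_everything` and `save_run_args` and the file name `args.json` are assumptions made for the example. NumPy and PyTorch are seeded only if they happen to be installed.

```python
import argparse
import json
import os
import random


def seed_everything(seed: int, disable_cudnn: bool = False) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not available; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)           # CPU generator
        torch.cuda.manual_seed_all(seed)  # silently ignored without CUDA
        if disable_cudnn:
            # cuDNN may select non-deterministic kernels; turning it off
            # trades speed for repeatability.
            torch.backends.cudnn.enabled = False
    except ImportError:
        pass  # PyTorch not available; nothing to seed


def save_run_args(args: argparse.Namespace, output_dir: str) -> str:
    """Write all parsed arguments to <output_dir>/args.json (hypothetical name)."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, "args.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(vars(args), handle, indent=2, sort_keys=True)
    return path
```

A training script would call `save_run_args(parser.parse_args(), output_dir)` first and `seed_everything(args.seed)` before building the model or the data loaders, so that the saved file and the random state describe the same run.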

SLIDE 9

Long-term Reproducibility Dangers

  • Poor (or non-existent!) use of version control
  • Bad programming habits that die hard
  • Silent data modifications

SLIDE 10

How DeepDIVA Ensures Long-term Reproducibility

  • Git status
      • Linking every run to a specific commit in Git
      • Allowing this feature to be disabled for dev purposes
  • Copy code
      • Copying the entire running code into the output folder
  • Data integrity management
      • Footprint of the data in a JSON file using SHA-1 hashes
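A data footprint of the kind mentioned above can be sketched with the standard library alone. The layout used here (a flat JSON object mapping each relative file path to its SHA-1 digest) and all function names are assumptions for illustration; the format DeepDIVA actually writes may differ.

```python
import hashlib
import json
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_footprint(data_dir: str) -> dict:
    """Map every file under data_dir (relative POSIX path) to its SHA-1."""
    root = Path(data_dir)
    return {
        path.relative_to(root).as_posix(): sha1_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_footprint(data_dir: str, json_path: str) -> dict:
    """Store the footprint as JSON; keep json_path outside data_dir."""
    footprint = build_footprint(data_dir)
    Path(json_path).write_text(
        json.dumps(footprint, indent=2, sort_keys=True), encoding="utf-8"
    )
    return footprint


def changed_files(old: dict, new: dict) -> list:
    """Names that were added, removed, or whose content hash differs."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

Comparing a stored footprint with a freshly computed one through `changed_files` is what exposes a silent data modification: any file that was edited, added, or deleted shows up in the returned list.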

SLIDE 11

Reproducing Other People’s Experiments

Given a paper, try to replicate the results and observations

SLIDE 12

Reproducing Other People’s Experiments

In order to reproduce an experiment one needs:

  • Git repository URL
  • Git commit identifier (full SHA)
  • List of command line arguments used
  • The data
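The four ingredients listed above fit naturally into one small record that can be published next to a paper. The sketch below is only an illustration: the `RunRecord` class, its field names, and the default `repro` checkout directory are hypothetical and are not part of DeepDIVA.

```python
import json
import shlex
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class RunRecord:
    repo_url: str            # Git repository URL
    commit_sha: str          # full 40-character commit identifier
    command_line: List[str]  # exact arguments of the original run
    data_location: str       # where the data can be obtained

    def __post_init__(self) -> None:
        sha = self.commit_sha.lower()
        if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
            raise ValueError("expected a full 40-character Git SHA-1")

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

    def reproduction_commands(self, checkout_dir: str = "repro") -> List[str]:
        """Shell commands that restore the code state and rerun the experiment."""
        target = shlex.quote(checkout_dir)
        return [
            f"git clone {shlex.quote(self.repo_url)} {target}",
            f"git -C {target} checkout {self.commit_sha}",
            f"cd {target} && {shlex.join(self.command_line)}",
        ]
```

Requiring the full SHA instead of a branch name or abbreviated hash is deliberate: a branch moves over time, while a full commit identifier pins the exact code that produced the result.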

SLIDE 13

Productivity Out-of-the-box

Making your life easier: do not reinvent the wheel!

SLIDE 14

“One click away” Deep Learning Scenarios

SLIDE 15

Prepare Your Data

“When the data is ready, the task is solved”

  • Download a dataset with a click
      • Natural images, medical images, historical documents, …
  • Split your dataset
      • Train, validation and test splits
  • Analyse the data
      • Mean/std and class distributions
  • Ensure data integrity
      • Compare the footprints
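The splitting and analysis steps can be illustrated with plain Python. This is a simplified sketch under stated assumptions: the function names and the 70/15/15 default ratios are invented for the example, and the statistics run over a flat list of numbers, whereas an image pipeline would typically compute mean and standard deviation per colour channel.

```python
import random
import statistics
from collections import Counter


def split_dataset(samples, train=0.7, val=0.15, seed=42):
    """Shuffle with a fixed seed, then cut into train / validation / test."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # local RNG, global state untouched
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


def class_distribution(labels):
    """Fraction of samples per class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def mean_and_std(values):
    """Mean and population standard deviation of a sequence of numbers."""
    values = list(values)
    return statistics.fmean(values), statistics.pstdev(values)
```

Because the shuffle uses its own seeded generator, calling `split_dataset` twice with the same seed yields the same three splits, which keeps the data preparation step as repeatable as the training run itself.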

SLIDE 16

Real-time Visualizations

  • TensorBoard (from TensorFlow)
  • Confusion Matrix
  • Features Visualization
  • Weight Histograms
  • Performance Evaluation

SLIDE 17

Automatic Hyper-Parameter Optimization

  • Let machine learning find the best values
  • No expensive grid or random search

SLIDE 18

Be A Part Of It

Getting Started With DeepDIVA

SLIDE 19

How To Use It

  • No Setup Time
      • From source on Ubuntu (or other flavours of Linux)
      • Docker Image Coming Soon
  • Documentation
      • Online and in the code
  • Tutorials
      • Learn new features efficiently
  • Fork It
      • Extensive and modular for easy modifications

SLIDE 20

Make Your Experiment Reproducible

bit.ly/DeepDIVA