Using Deep Learning to Solve Challenging Problems (Jeff Dean, Google)



slide-1
SLIDE 1

Using Deep Learning to Solve Challenging Problems

Jeff Dean Google Brain team g.co/brain

Presenting the work of many people at Google

slide-2
SLIDE 2

Deep learning is causing a machine learning revolution

slide-3
SLIDE 3

ML Arxiv Papers per Year

slide-4
SLIDE 4

“cat”

Deep Learning

Modern Reincarnation of Artificial Neural Networks

Collection of simple trainable mathematical units, organized in layers, that work together to solve complicated tasks

Key Benefit

Learns features from raw, heterogeneous, noisy data
No explicit feature engineering required

What’s New

new network architectures, new training math, *scale*
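
As a rough illustration of "simple trainable mathematical units, organized in layers", the sketch below builds a tiny two-layer network in plain Python and runs one forward pass. The layer sizes, the ReLU activation, and the random initial weights are assumptions chosen for illustration and do not come from the talk; training (adjusting the weights from data) is left out.

```python
import random


def make_layer(n_in, n_out, rng):
    """One layer: a weight matrix and a bias vector (the trainable parameters)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def relu(x):
    """A simple non-linearity applied by each unit."""
    return max(0.0, x)


def forward(layers, inputs):
    """Feed the inputs through each layer in turn and return the final activations."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            relu(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


rng = random.Random(0)
# Hypothetical shape: 4 inputs -> 8 hidden units -> 3 outputs.
layers = [make_layer(4, 8, rng), make_layer(8, 3, rng)]
outputs = forward(layers, [0.5, -0.2, 0.1, 0.9])
```

Each unit only computes a weighted sum followed by a non-linearity; the expressive power comes from stacking many such units and fitting their weights to data.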

slide-5
SLIDE 5

ConvNets

slide-6
SLIDE 6

Functions a Deep Neural Network Can Learn

input → output

  • Pixels → “lion”

slide-7
SLIDE 7

Functions a Deep Neural Network Can Learn

input → output

  • Pixels → “lion”
  • Audio → “How cold is it outside?”

slide-8
SLIDE 8

Functions a Deep Neural Network Can Learn

input → output

  • Pixels → “lion”
  • Audio → “How cold is it outside?”
  • “Hello, how are you?” → “Bonjour, comment allez-vous?”

slide-9
SLIDE 9

Functions a Deep Neural Network Can Learn

input → output

  • Pixels → “lion”
  • Audio → “How cold is it outside?”
  • “Hello, how are you?” → “Bonjour, comment allez-vous?”
  • Pixels → “A blue and yellow train travelling down the tracks”

slide-10
SLIDE 10

But why now?

slide-11
SLIDE 11

Chart: accuracy vs. scale (data size, model size), comparing neural networks with other approaches. Era shown: 1980s and 1990s.
slide-12
SLIDE 12

Chart: accuracy vs. scale (data size, model size), comparing neural networks with other approaches. Annotation: more compute. Era shown: 1980s and 1990s.

slide-13
SLIDE 13

Chart: accuracy vs. scale (data size, model size), comparing neural networks with other approaches. Annotation: more compute. Era shown: Now.

slide-14
SLIDE 14

2011: 26% errors
humans: 5% errors

slide-15
SLIDE 15

2011: 26% errors
humans: 5% errors
2016: 3% errors

slide-16
SLIDE 16

2008: Grand Engineering Challenges for 21st Century

  • Make solar energy affordable
  • Provide energy from fusion
  • Develop carbon sequestration methods
  • Manage the nitrogen cycle
  • Provide access to clean water
  • Restore & improve urban infrastructure
  • Advance health informatics
  • Engineer better medicines
  • Reverse-engineer the brain
  • Prevent nuclear terror
  • Secure cyberspace
  • Enhance virtual reality
  • Advance personalized learning
  • Engineer the tools for scientific discovery

www.engineeringchallenges.org/challenges.aspx

slide-17
SLIDE 17

2008: Grand Engineering Challenges for 21st Century

  • Make solar energy affordable
  • Provide energy from fusion
  • Develop carbon sequestration methods
  • Manage the nitrogen cycle
  • Provide access to clean water
  • Restore & improve urban infrastructure
  • Advance health informatics
  • Engineer better medicines
  • Reverse-engineer the brain
  • Prevent nuclear terror
  • Secure cyberspace
  • Enhance virtual reality
  • Advance personalized learning
  • Engineer the tools for scientific discovery

www.engineeringchallenges.org/challenges.aspx

I would personally add two others:

  • Communicate and access information regardless of language
  • Build flexible general purpose AI systems
slide-19
SLIDE 19

Restore & improve urban infrastructure

slide-20
SLIDE 20

https://waymo.com/tech/

slide-21
SLIDE 21

Advance health informatics

slide-22
SLIDE 22

Retinal images: Healthy vs. Diseased (Hemorrhages)

DR grades: 1 = No DR, 2 = Mild DR, 3 = Moderate DR, 4 = Severe DR, 5 = Proliferative DR

slide-23
SLIDE 23

F-score: Algorithm 0.95, Ophthalmologist (median) 0.91

“The study by Gulshan and colleagues truly represents the brave new world in medicine.”

  • Dr. Andrew Beam, Dr. Isaac Kohane, Harvard Medical School

“Google just published this paper in JAMA (impact factor 37) [...] It actually lives up to the hype.”

  • Dr. Luke Oakden-Rayner, University of Adelaide

slide-24
SLIDE 24

Completely new, novel scientific discoveries

Predicting things that doctors can’t predict from imaging
Potential as a new biomarker

  • Age: MAE 3.26 yrs
  • Gender: AUC 0.97
  • Diastolic: MAE 6.39 mmHg
  • Systolic: MAE 11.23 mmHg
  • HbA1c: MAE 1.4%

Preliminary 5-yr MACE AUC: 0.7
Can we predict cardiovascular risk? If so, this is a very nice non-invasive way of doing so.
Can we also predict treatment response?

R. Poplin, A. Varadarajan et al. Predicting Cardiovascular Risk Factors from Retinal Fundus Photographs using Deep Learning. Nature Biomedical Engineering, 2018.

slide-25
SLIDE 25

Predictive tasks for healthcare

Given a patient’s electronic medical record data, can we predict the future?
Deep learning methods for sequential prediction are becoming extremely good, e.g. recent improvements in Google Translation

slide-26
SLIDE 26

Neural Machine Translation

Chart: translation quality (axis 1 to 6; reference levels: human, perfect translation) by translation model, phrase-based (PBMT) vs. neural (GNMT), for English > Spanish, English > French, English > Chinese, Spanish > English, French > English, Chinese > English.

Closes gap between old system and human-quality translation by 58% to 87%
Enables better communication across the world

research.googleblog.com/2016/09/a-neural-network-for-machine.html
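
A "gap closed" percentage like the one quoted on this slide can be read as the share of the distance between the old system and human quality that the new system covers. The sketch below shows that arithmetic; the function name and the example ratings are hypothetical and are not values from the talk.

```python
def gap_closed(old_score, new_score, human_score):
    """Fraction of the old-system-to-human quality gap covered by the new system."""
    return (new_score - old_score) / (human_score - old_score)


# Hypothetical quality ratings, chosen only to illustrate the formula.
example = gap_closed(old_score=4.0, new_score=5.0, human_score=5.25)
```

With these made-up ratings the new system covers 1.0 of a 1.25-point gap, i.e. 80% of it.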

slide-27
SLIDE 27

Predictive tasks for healthcare

Given a large corpus of training data of de-identified medical records, can we predict interesting aspects of the future for a patient not in the training set?

  • will patient be readmitted to hospital in next N days?
  • what is the likely length of hospital stay for patient checking in?
  • what are the most likely diagnoses for the patient right now? and why?
  • what medications should a doctor consider prescribing?
  • what tests should be considered for this patient?
  • which patients are at highest risk for X in next month?

Collaborating with several healthcare organizations, including UCSF, Stanford, and Univ. of Chicago.
slide-28
SLIDE 28

Medical Records Prediction Results

24 hours earlier

https://arxiv.org/abs/1801.07860

slide-29
SLIDE 29

Engineer better medicines

and maybe...

  • Make solar energy affordable
  • Develop carbon sequestration methods
  • Manage the nitrogen cycle

slide-30
SLIDE 30

Predicting Properties of Molecules

Toxic? Bind with a given protein? Quantum properties: E,ω0, ... DFT (density functional theory) simulator

slide-32
SLIDE 32

Predicting Properties of Molecules

Toxic? Bind with a given protein? Quantum properties: E,ω0, ...

https://research.googleblog.com/2017/04/predicting-properties-of-molecules-with.html and https://arxiv.org/abs/1702.05532 and https://arxiv.org/abs/1704.01212 (latter to appear in ICML 2017)

  • State of the art results predicting output of expensive quantum chemistry calculations, but ~300,000 times faster

DFT (density functional theory) simulator

slide-33
SLIDE 33

Reverse engineer the brain

slide-34
SLIDE 34

Connectomics: Reconstructing Neural Circuits from High-Resolution Brain Imaging

slide-35
SLIDE 35

Automated Reconstruction Progress at Google

Metric: Expected Run Length (ERL), “mean microns between failure” of automated neuron tracing

Chart: expected run length (µm), log scale with ticks at 10^2, 10^4, 10^6, 10^8. Reference scales: mouse cortex (AIBS), fly (HHMI), whole mouse brain (MPI), primates, songbird [100 µm]^3 (MPI).
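
The slide describes ERL only informally. One way to make it concrete, assumed here rather than taken from the talk, is to ask how long the error-free run is around a point picked uniformly at random along the traced neurites. A point falls in a given run with probability proportional to that run's length, which gives a length-weighted mean. The run lengths below are hypothetical.

```python
def expected_run_length(run_lengths_um):
    """Expected error-free run length seen from a uniformly random point.

    A point lands in a run with probability proportional to the run's length,
    so the expectation is sum(l^2) / sum(l).
    """
    total = sum(run_lengths_um)
    if total == 0:
        return 0.0
    return sum(length * length for length in run_lengths_um) / total


# Hypothetical error-free runs in microns (not data from the talk).
erl = expected_run_length([100.0, 300.0])
```

Under this formulation a single split in a long neurite lowers the metric sharply, because long runs dominate the weighted mean.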

slide-36
SLIDE 36
New Technology: Flood Filling Networks

  • Start with a seed point
  • Recurrent neural network iteratively fills out an object based on image content and its own previous predictions

https://arxiv.org/abs/1611.00421

2d Inference
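
The control flow on this slide (start from a seed, then grow the object using both image content and earlier predictions) can be sketched on a toy 2D grid. This is not the network from the paper: `predict_membership` is a hypothetical hand-written stand-in for the trained model, and the image, threshold, and 4-neighbour rule are assumptions made for illustration.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def predict_membership(image, mask, y, x, threshold=0.5):
    """Stand-in for the trained network (hypothetical rule, not a real model).

    Combines image content (pixel intensity) with previous predictions
    (whether any neighbour is already part of the object).
    """
    h, w = len(image), len(image[0])
    supported = any(
        0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx] >= threshold
        for dy, dx in NEIGHBOURS
    )
    return image[y][x] if supported else 0.0


def flood_fill(image, seed, threshold=0.5):
    """Grow one object outward from the seed, one accepted pixel at a time."""
    h, w = len(image), len(image[0])
    mask = [[0.0] * w for _ in range(h)]
    mask[seed[0]][seed[1]] = 1.0  # start with a seed point
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] < threshold:
                score = predict_membership(image, mask, ny, nx, threshold)
                if score >= threshold:
                    mask[ny][nx] = score  # prediction feeds later predictions
                    queue.append((ny, nx))
    return [[1 if v >= threshold else 0 for v in row] for row in mask]


# Toy image: a bright connected path on the left, a separate bright blob on the right.
image = [
    [0.9, 0.9, 0.1, 0.9],
    [0.1, 0.9, 0.1, 0.9],
    [0.1, 0.9, 0.1, 0.1],
]
segment = flood_fill(image, seed=(0, 0))
```

Only the path connected to the seed is filled; the bright blob on the right is left for a separate seed, which mirrors the one-object-at-a-time behaviour described on the slide.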

slide-37
SLIDE 37

Flood Filling Networks: 3d Inference

slide-38
SLIDE 38

Flood Filling Networks: 3d Inference

~ 100 µm (10,000 voxels)

slide-39
SLIDE 39
Songbird Brain Wiring Diagram

  • Raw data produced by Max Planck Institute for Neurobiology using serial block face scanning electron microscopy
  • 10,600 × 10,800 × 5,700 voxels = ~600 billion voxels
  • Goal: Reconstruct complete connectivity and use to test specific hypotheses related to how biological nervous systems produce precise, sequential motor behaviors and perform reinforcement learning.

Courtesy Jorgen Kornfeld & Winfried Denk, MPI
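
A quick check of the volume arithmetic, using only the three dimensions given on the slide: the exact product is about 650 billion voxels, the same order of magnitude as the slide's rounded "~600 billion".

```python
# Voxel counts per axis, as stated on the slide.
dims = (10_600, 10_800, 5_700)

total_voxels = dims[0] * dims[1] * dims[2]
billions = total_voxels / 1e9
```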

slide-40
SLIDE 40

Engineer the tools for scientific discovery

slide-41
SLIDE 41

Open, standard software for general machine learning
Great for Deep Learning in particular
First released Nov 2015
Apache 2.0 license

http://tensorflow.org/ and https://github.com/tensorflow/tensorflow

slide-42
SLIDE 42

TensorFlow Goals

  • Establish common platform for expressing machine learning ideas and systems
  • Open source it so that it becomes a platform for everyone, not just Google
  • Make this platform the best in the world for both research and production use

slide-44
SLIDE 44

AutoML: Automated machine learning (“learning to learn”)

slide-45
SLIDE 45

Current: Solution = ML expertise + data + computation

slide-46
SLIDE 46

Current: Solution = ML expertise + data + computation
Can we turn this into: Solution = data + 100X computation ???

slide-47
SLIDE 47

Neural Architecture Search

Idea: model-generating model trained via reinforcement learning

(1) Generate ten models
(2) Train them for a few hours
(3) Use loss of the generated models as reinforcement learning signal

Neural Architecture Search with Reinforcement Learning, Zoph & Le, ICLR 2017, arxiv.org/abs/1611.01578
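
The generate, train, reward loop on this slide can be sketched with a very small controller. Everything concrete here is an assumption made for illustration: the two-decision search space, the softmax controller with one logit per option, the REINFORCE update with a mean baseline, and above all `train_and_evaluate`, which returns a made-up loss instead of training a real model for hours.

```python
import math
import random

# Hypothetical toy search space: each decision picks one option.
SEARCH_SPACE = {"layers": [1, 2, 3], "width": [8, 16, 32]}


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_model(logits, rng):
    """Controller step: sample one option index per decision."""
    return {
        name: rng.choices(range(len(opts)), weights=softmax(logits[name]))[0]
        for name, opts in SEARCH_SPACE.items()
    }


def train_and_evaluate(choice):
    """Stand-in for 'train for a few hours': returns a made-up loss."""
    layers = SEARCH_SPACE["layers"][choice["layers"]]
    width = SEARCH_SPACE["width"][choice["width"]]
    return abs(layers - 2) + abs(width - 32) / 32.0


def search(steps=500, models_per_step=10, lr=0.2, seed=0):
    rng = random.Random(seed)
    logits = {name: [0.0] * len(opts) for name, opts in SEARCH_SPACE.items()}
    for _ in range(steps):
        # (1) Generate ten models.
        batch = [sample_model(logits, rng) for _ in range(models_per_step)]
        # (2) "Train" them and (3) use the negative loss as the reward.
        rewards = [-train_and_evaluate(choice) for choice in batch]
        baseline = sum(rewards) / len(rewards)
        for choice, reward in zip(batch, rewards):
            advantage = reward - baseline
            for name, idx in choice.items():
                probs = softmax(logits[name])
                for k in range(len(probs)):
                    # REINFORCE: gradient of the log-probability of the sampled option.
                    grad = (1.0 if k == idx else 0.0) - probs[k]
                    logits[name][k] += lr * advantage * grad / models_per_step
    # Report the option the controller now prefers for each decision.
    return {
        name: opts[max(range(len(opts)), key=lambda k: logits[name][k])]
        for name, opts in SEARCH_SPACE.items()
    }


best = search()
```

The controller never sees the structure of the stand-in loss; it only learns which sampled choices tended to score better than the batch average, which is the sense in which the loss acts as a reinforcement learning signal.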

slide-48
SLIDE 48

CIFAR-10 Image Recognition Task

slide-49
SLIDE 49

AutoML outperforms handcrafted models

Chart: accuracy (precision @1) vs. computational cost. Labeled point: Inception-ResNet-v2.

Learning Transferable Architectures for Scalable Image Recognition, Zoph et al. 2017, https://arxiv.org/abs/1707.07012

slide-50
SLIDE 50

AutoML outperforms handcrafted models

Chart: accuracy (precision @1) vs. computational cost. Labeled point: Inception-ResNet-v2. Annotation: Years of effort by top ML researchers in the world.

Learning Transferable Architectures for Scalable Image Recognition, Zoph et al. 2017, https://arxiv.org/abs/1707.07012

slide-51
SLIDE 51

AutoML outperforms handcrafted models

Chart: accuracy (precision @1) vs. computational cost.

Learning Transferable Architectures for Scalable Image Recognition, Zoph et al. 2017, https://arxiv.org/abs/1707.07012

slide-54
SLIDE 54

https://cloud.google.com/automl/

slide-55
SLIDE 55

Early encouraging signs that we can build flexible systems that can solve new problems automatically…

slide-56
SLIDE 56

Early encouraging signs that we can build flexible systems that can solve new problems automatically…
But, we’ll need more computation

slide-57
SLIDE 57

Special computation properties

  • reduced precision ok: about 1.2 × about 0.6 = about 0.7, NOT 1.21042 × 0.61127 = 0.73989343

slide-58
SLIDE 58

Special computation properties

  • reduced precision ok: about 1.2 × about 0.6 = about 0.7, NOT 1.21042 × 0.61127 = 0.73989343
  • handful of specific operations
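
To make the "reduced precision ok" point concrete, the sketch below multiplies the slide's two numbers at full precision and again after cutting each value down to 16 bits. The 16-bit format used here, the top half of a float32 (a bfloat16-style truncation), is an assumption for illustration; the slide does not name a specific format.

```python
import struct


def to_bfloat16(x):
    """Truncate to a bfloat16-like value by keeping the top 16 bits of a float32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


a, b = 1.21042, 0.61127  # the operands shown on the slide

exact = a * b
# Truncate the inputs, multiply, then truncate the product as well.
reduced = to_bfloat16(to_bfloat16(a) * to_bfloat16(b))
relative_error = abs(reduced - exact) / exact
```

The reduced-precision product still lands at "about 0.7", with a relative error of roughly one percent.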

slide-59
SLIDE 59

Tensor Processing Unit v2

Google-designed device for neural net training and inference

  • 180 teraflops of computation, 64 GB of memory
  • Designed to be connected together
slide-60
SLIDE 60

TPU Pod

  • 64 2nd-gen TPUs
  • 11.5 petaflops
  • 4 terabytes of memory
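
The pod figures follow from the per-device numbers on the previous slide (180 teraflops and 64 GB per device), as this short check shows. The constant names are illustrative.

```python
DEVICES_PER_POD = 64
TERAFLOPS_PER_DEVICE = 180   # per-device figure from the TPU v2 slide
MEMORY_GB_PER_DEVICE = 64    # per-device figure from the TPU v2 slide

pod_petaflops = DEVICES_PER_POD * TERAFLOPS_PER_DEVICE / 1000
pod_memory_gb = DEVICES_PER_POD * MEMORY_GB_PER_DEVICE
```

That gives 11.52 petaflops and 4,096 GB, matching the slide's rounded 11.5 petaflops and 4 terabytes.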

slide-61
SLIDE 61

https://cloud.google.com/tpu/

slide-62
SLIDE 62

Making 1000 Cloud TPUs available for free to top researchers who are committed to open machine learning research
We’re excited to see what researchers will do with much more computation!
g.co/tpusignup

slide-63
SLIDE 63

Deep neural networks and machine learning are producing significant breakthroughs that are solving some of the world’s grand challenges

slide-64
SLIDE 64

Deep neural networks and machine learning are producing significant breakthroughs that are solving some of the world’s grand challenges
If you’re not considering how to use deep neural nets to solve your problems, you almost certainly should be!
Thank you!
More info: g.co/brain and tensorflow.org