Recent Trends in Computer Vision and Deep Learning Systems



SLIDE 1

Recent Trends in Computer Vision and Deep Learning Systems

Yangqing Jia

Lead Researcher and Manager of AI Platform, Facebook

SLIDE 2
SLIDE 3
SLIDE 4

Computer Vision

SLIDE 5
SLIDE 6

AlexNet

So it begins.

SLIDE 7

VGGNet

Punch it.

SLIDE 8

GoogLeNet

We must go deeper.

SLIDE 9

ResNet

And we took the word seriously

SLIDE 10

ResNet

And we took the word seriously

SLIDE 11

ResNeXt

We totally see it coming

SLIDE 12

Pushing the Performance

Model        Top-5 error (%)
ScSVM        28.2
AlexNet      16.4
VGGNet       7.3
GoogLeNet    6.7
ResNet       3.57
ResNeXt      3.03

SLIDE 13

Why is it challenging?

Gradients, as one example

[Plot: gradients versus depth (x-axis from 1 to 15); curves labeled exploding, vanishing, and ideal]
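The three regimes named on this slide can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that every layer scales the backpropagated gradient by the same constant factor; the function name and the factors 1.5, 0.5, and 1.0 are hypothetical and do not come from the talk.

```python
def gradient_scale(per_layer_gain, depth):
    """Gradient magnitude after `depth` layers, assuming each layer
    multiplies the gradient by the same factor (a toy assumption)."""
    return per_layer_gain ** depth


# Depth values follow the x-axis ticks on the slide.
depths = [1, 3, 5, 7, 9, 11, 13, 15]

# Hypothetical per-layer factors chosen to show the three regimes.
for name, gain in [("exploding", 1.5), ("vanishing", 0.5), ("ideal", 1.0)]:
    print(name, [round(gradient_scale(gain, d), 4) for d in depths])
```

Even a modest per-layer factor compounds quickly with depth, which is one reason very deep networks are hard to train without careful design.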

SLIDE 14
SLIDE 15
SLIDE 16
SLIDE 17

Deep Learning Systems

SLIDE 18

"SAP"

  • Scalability
SLIDE 19

Scalability

Run fast, run far

“How do I train on multiple GPUs and machines?”

  • Probably the most frequent question we got from Caffe users
SLIDE 20

Scalability

Run fast, run far

1.2 million =

  • # of images in ImageNet1K
  • # of new images @FB every 5 mins in 2013
  • # of AI jobs per month @FB

SLIDE 21

Scalability

Run fast, run far

[Diagram: blocks L1, L2, L3, L3b, L2b, L1b, U3, U2, U1]

SLIDE 22

Scalability

Run fast, run far

[Diagram: blocks L1, L2, L3, L3b, L2b, L1b, U3, U2, U1, R3, R2, R1]

SLIDE 23

Scalability

Run fast, run far

[Diagram: the block sequence L1, L2, L3, L3b, L2b, L1b, U3, U2, U1, R3, R2, R1, shown twice]

SLIDE 24

Scalability

Run fast, run far

[Diagram: the block sequence L1, L2, L3, L3b, L2b, L1b, U3, U2, U1, R3, R2, R1, shown twice]

SLIDE 25

The Return of MPI

"I'm your father," said Allreduce.

Allreduce

  • Tree based - O(M log N)
  • Ring based - O(M)
  • etc.
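To make the ring-based cost concrete, here is a minimal single-process simulation of a ring allreduce (sum). It is a sketch under stated assumptions, not MPI or NCCL code: `ring_allreduce` is a hypothetical helper, each worker is modeled as a plain Python list, M is taken to be the buffer length, and N the number of workers. It also counts how many elements each worker sends.

```python
def ring_allreduce(buffers):
    """Simulate a ring allreduce (sum) over equally sized lists, one per worker.

    Returns (per-worker results, number of elements each worker sent).
    """
    n = len(buffers)
    m = len(buffers[0])
    assert m % n == 0, "for simplicity, the buffer must split into n equal chunks"
    chunk = m // n
    data = [list(b) for b in buffers]
    sent = 0

    def span(c):
        # Index range covered by chunk c (taken modulo n).
        c %= n
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: in each step every worker passes one chunk to
    # its right neighbour, which adds it into its own copy.
    for step in range(n - 1):
        msgs = [[data[i][k] for k in span(i - step)] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            for k, v in zip(span(i - step), msgs[i]):
                data[dst][k] += v
        sent += chunk

    # Phase 2, allgather: fully reduced chunks travel around the ring and
    # overwrite the stale copies.
    for step in range(n - 1):
        msgs = [[data[i][k] for k in span(i + 1 - step)] for i in range(n)]
        for i in range(n):
            dst = (i + 1) % n
            for k, v in zip(span(i + 1 - step), msgs[i]):
                data[dst][k] = v
        sent += chunk

    return data, sent


workers = [[1, 2, 3, 4, 5, 6],
           [10, 20, 30, 40, 50, 60],
           [100, 200, 300, 400, 500, 600]]
result, sent_per_worker = ring_allreduce(workers)
print(result[0], sent_per_worker)
```

In this simulation each worker sends 2(N-1)/N times M elements, which stays below 2M however many workers join the ring. That is consistent with the O(M) figure on the slide.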

SLIDE 26

And so we scale

SLIDE 27

"SAP"

  • Arithmetics
SLIDE 28

Quantized Computation

Forget about float, the world is bigger

float:   8 exponent bits, 23 mantissa bits
fp16:    5 exponent bits, 10 mantissa bits
fixed16: 16 bits
fixed8:  8 bits
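The effect of the narrower formats can be seen on a single value. The sketch below uses Python's standard `struct` module (format code `e`) to round a number to half precision, and a simple linear scheme for 8-bit fixed point. The helper names and the `scale` parameter are assumptions made for illustration; the talk does not specify a quantization scheme.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_fixed8(x, scale):
    """Map x to a signed 8-bit code with step size `scale`, saturating at the ends."""
    q = round(x / scale)
    return max(-128, min(127, q))


def dequantize_fixed8(q, scale):
    """Recover the real value represented by an 8-bit code."""
    return q * scale


scale = 1 / 64  # hypothetical step size: codes cover roughly [-2, 2)
for x in (0.1, 0.5, 10.0):
    q = quantize_fixed8(x, scale)
    print(x, to_fp16(x), q, dequantize_fixed8(q, scale))
```

Values outside the representable range saturate in the fixed-point case, so the choice of step size trades range against resolution.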

SLIDE 29

Why do we care?

Battery life is life.

Operation      Cost
float add      0.9
fp16 add       0.4
fixed16 add    0.05
fixed8 add     0.03

float mul      4.0
fp16 mul       1.0
fixed8 mul     0.2
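As a rough worked example, the listed costs can be combined into the cost of one multiply-accumulate, the basic operation of a dot product. This sketch assumes the accumulation happens in the same format as the multiply, which real hardware often does not do, and it keeps the slide's numbers unitless because the slide gives no unit. The function names are illustrative.

```python
# Per-operation costs as listed on the slide (no unit is stated there).
ADD_COST = {"float": 0.9, "fp16": 0.4, "fixed16": 0.05, "fixed8": 0.03}
MUL_COST = {"float": 4.0, "fp16": 1.0, "fixed8": 0.2}


def mac_cost(fmt):
    """Cost of one multiply-accumulate: one multiply plus one add."""
    return MUL_COST[fmt] + ADD_COST[fmt]


def relative_saving(fmt, baseline="float"):
    """How many times cheaper a multiply-accumulate is than in the baseline format."""
    return mac_cost(baseline) / mac_cost(fmt)


for fmt in ("float", "fp16", "fixed8"):
    print(fmt, round(mac_cost(fmt), 2), round(relative_saving(fmt), 1))
```

Under these assumptions an 8-bit fixed-point multiply-accumulate costs about one twentieth of a float one.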

SLIDE 30

How does it perform?

Source: Nvidia https://devblogs.nvidia.com/parallelforall/mixed-precision-programming-cuda-8/

SLIDE 31

Why does it matter for cars?

  • 250 watts: 10 -> 20 TFlops
  • 10 watts: 0.7 -> 1.5 TFlops
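The figures above can be turned into throughput per watt. The slide does not say what the arrow stands for; this sketch assumes it marks a before and after pair for each power budget, and the device labels are placeholders.

```python
def tflops_per_watt(tflops, watts):
    """Throughput per unit of power."""
    return tflops / watts


# (power in watts, TFlops before the arrow, TFlops after the arrow), as on the slide.
devices = {"250 W part": (250, 10.0, 20.0), "10 W part": (10, 0.7, 1.5)}

for name, (watts, before, after) in devices.items():
    print(name,
          round(tflops_per_watt(before, watts), 3),
          round(tflops_per_watt(after, watts), 3))
```

On these numbers the 10 watt part delivers more TFlops per watt than the 250 watt part both before and after the change.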

SLIDE 32

"SAP"

  • Portability
SLIDE 33

Portable System

One software to rule them all, and...

AI Math and Algorithms Deployment Platforms

SLIDE 34
SLIDE 35

Portable System

Cloud, Mobile, IoT, Cars, Drones, Coffee makers

Model

    auto predictor = caffe2::Predictor(model_file)

    public class Predictor implements Caffe2ModelInterface;

SLIDE 36
SLIDE 37

The Land of Deep Learning System

Applications

Caffe, Torch, TF, etc...

Core Math

  • Eigen
  • CuDNN
  • NNPack
  • THNN
  • MKL

Comms

  • NCCL
  • MPI
  • ZeroMQ
  • Redis
  • ...

Low Level

  • CUDA
  • OpenGL
  • OpenCL
  • Vulkan
  • ...

Compilers

DataBases

  • LevelDB
  • RocksDB
  • Hadoop
  • Amazon S3
  • your old disk

Not as complex as a car, but still.

SLIDE 38
SLIDE 39
SLIDE 40

Thank you!

Recent Trends in Computer Vision and Deep Learning Systems

Yangqing Jia