Constructing Fast Network through Deconstruction of Convolution



SLIDE 1

Constructing Fast Network through Deconstruction of Convolution

Yunho Jeon and Junmo Kim

School of Electrical Engineering Korea Advanced Institute of Science and Technology

NeurIPS 2018

SLIDE 2

Goal

CNN has achieved outstanding accuracy with deeper and wider networks

Can we make fast CNN with smaller resources while retaining accuracy?


SLIDES 3-6

How to make a fast network

  • Reduce FLOPs
    – Grouped or depthwise convolution
    – Network pruning
    But lower FLOPs ≠ faster speed, because of memory access!
  • Reduce memory access
    – Reduce spatial convolutions
  • Maximize utilization of accessed memory
    – Use 1x1 convolutions
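The FLOP gap behind these bullets is easy to work out. A back-of-envelope sketch (the feature-map and channel sizes are illustrative assumptions, not from the slides):

```python
# Multiply-accumulate counts for one layer, assuming an illustrative
# 56x56 feature map with 64 input and 64 output channels.
H = W = 56
C_in = C_out = 64
K = 3  # spatial kernel size

standard = H * W * C_in * C_out * K * K  # full 3x3 convolution
depthwise = H * W * C_in * K * K         # 3x3 depthwise convolution
pointwise = H * W * C_in * C_out         # 1x1 (pointwise) convolution
separable = depthwise + pointwise        # depthwise-separable pair

print(f"standard 3x3    : {standard:>12,}")
print(f"depthwise + 1x1 : {separable:>12,}")
print(f"FLOP reduction  : {standard / separable:.1f}x")
```

The separable pair needs far fewer multiply-accumulates, yet, as the caveat above warns, the runtime gain can be much smaller because the fragmented memory accesses of depthwise convolutions dominate.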

Key Idea: Deconstruct spatial convolution into atomic operations

SLIDE 7

Deconstruction of convolution (1/3)

Insight

  • Spatial convolution = summation of 1x1 convolutions


SLIDE 8

Deconstruction of convolution (2/3)

Shift inputs instead of filters

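The identity on slides 7-8 can be checked numerically: a 3x3 convolution equals the sum of nine 1x1 convolutions, each applied to a copy of the input shifted by the negated kernel offset. A minimal single-channel NumPy sketch (zero padding; all names are illustrative):

```python
import numpy as np

def shift(x, dy, dx):
    """Shift a 2D map by (dy, dx); vacated entries are filled with zeros."""
    H, W = x.shape
    out = np.zeros_like(x)
    out[max(dy, 0):H + min(dy, 0), max(dx, 0):W + min(dx, 0)] = \
        x[max(-dy, 0):H + min(-dy, 0), max(-dx, 0):W + min(-dx, 0)]
    return out

def conv3x3(x, w):
    """Direct zero-padded 3x3 cross-correlation (what CNN libraries compute)."""
    xp = np.pad(x, 1)
    return sum(w[dy, dx] * xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
               for dy in range(3) for dx in range(3))

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
w = rng.standard_normal((3, 3))

# Deconstruction: shift the *input* by the negated kernel offset; the per-tap
# product w[dy, dx] * shifted_input is then exactly a 1x1 convolution.
y = sum(w[dy, dx] * shift(x, -(dy - 1), -(dx - 1))
        for dy in range(3) for dx in range(3))

assert np.allclose(y, conv3x3(x, w))
```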

SLIDES 9-12

Deconstruction of convolution (3/3)

If we can share shifted inputs,

  – we reduce FLOPs & memory access
  – but expressive power is limited if we shift in only one direction

[Figure: share shifted inputs]

Key Challenge: How to shift inputs?
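The sharing idea can be sketched in NumPy: every input channel is shifted once, and all output channels reuse those shifted maps through a single 1x1 convolution. Shapes, offsets, and names here are illustrative assumptions:

```python
import numpy as np

def shift(x, dy, dx):
    """Shift a 2D map by (dy, dx); vacated entries are filled with zeros."""
    H, W = x.shape
    out = np.zeros_like(x)
    out[max(dy, 0):H + min(dy, 0), max(dx, 0):W + min(dx, 0)] = \
        x[max(-dy, 0):H + min(-dy, 0), max(-dx, 0):W + min(-dx, 0)]
    return out

C_in, C_out, H, W = 8, 16, 6, 6
rng = np.random.default_rng(1)
x = rng.standard_normal((C_in, H, W))
offsets = rng.integers(-1, 2, size=(C_in, 2))  # one (dy, dx) per channel
w = rng.standard_normal((C_out, C_in))         # 1x1 convolution weights

# Each channel is shifted exactly once; shifting moves memory but costs no
# FLOPs, and the shifted maps are shared by every output channel.
shifted = np.stack([shift(x[c], *offsets[c]) for c in range(C_in)])

# All arithmetic is concentrated in one dense 1x1 convolution.
y = np.einsum('oc,chw->ohw', w, shifted)
```

If every channel used the same offset, the layer would collapse to a plainly shifted 1x1 convolution, which is the limited-expressiveness caveat above; per-channel offsets avoid that.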

SLIDES 13-16

Our approach

  • Active Shift Layer (ASL)
    1. Use depthwise shift
    2. Introduce new shift parameters for each channel
    3. Expand to non-integer shifts using interpolation
  • Shift values are differentiable!
    ⇒ Shift values are trained through the network itself
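The non-integer shift can be sketched as bilinear interpolation between the four neighboring integer shifts; because the interpolation weights are linear in the shift parameters, the output is differentiable with respect to them. A NumPy sketch (zero-filled borders; all names are illustrative assumptions):

```python
import numpy as np

def int_shift(x, dy, dx):
    """Zero-filled integer shift of a 2D map."""
    H, W = x.shape
    out = np.zeros_like(x)
    out[max(dy, 0):H + min(dy, 0), max(dx, 0):W + min(dx, 0)] = \
        x[max(-dy, 0):H + min(-dy, 0), max(-dx, 0):W + min(-dx, 0)]
    return out

def frac_shift(x, a, b):
    """Real-valued shift by (a, b): a bilinear mix of the four nearest
    integer shifts. The mix weights are linear in a and b, so the output
    is differentiable with respect to the shift parameters."""
    ia, ib = int(np.floor(a)), int(np.floor(b))
    fa, fb = a - ia, b - ib
    return ((1 - fa) * (1 - fb) * int_shift(x, ia,     ib)
            + fa      * (1 - fb) * int_shift(x, ia + 1, ib)
            + (1 - fa) * fb      * int_shift(x, ia,     ib + 1)
            + fa      * fb       * int_shift(x, ia + 1, ib + 1))

rng = np.random.default_rng(2)
x = rng.standard_normal((6, 6))

# At integer parameters the fractional shift reduces to a plain shift ...
assert np.allclose(frac_shift(x, 1.0, 0.0), int_shift(x, 1, 0))
# ... and halfway in between it blends the two neighboring shifts equally.
assert np.allclose(frac_shift(x, 0.5, 0.0), 0.5 * x + 0.5 * int_shift(x, 1, 0))
```

In the full layer each channel would carry its own learnable (a, b), trained by backpropagation together with the convolution weights.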

SLIDE 17

Example of Learned Shift

Enlarge receptive fields by shifting inputs


SLIDE 18

Experiment (ImageNet)

  • Better accuracy with a smaller number of parameters
  • Faster inference time at similar accuracy


SLIDE 19

Thank you

For more information, please visit our poster #22.