SLIDE 1

Deep neural architectures for structured output problems

Soufiane Belharbi (soufiane.belharbi@litislab.fr), Clément Chatelain (clement.chatelain@insa-rouen.fr), LITIS - INSA de Rouen.
Joint work with: J. Lerouge, R. Herault, S. Adam, R. Modzelewski, F. Jardin, B. Labbe.
May 19, 2015.

SLIDE 2

Plan

1. Structured Output Problems
2. Input/Output Deep Architecture (IODA)
3. Application of IODA to medical image labeling
4. Application of IODA to Facial Landmark Detection
5. Conclusion
6. Future Work on IODA

LITIS - INSA de Rouen Deep neural architectures for structured output problems 1/46

SLIDE 3
Structured Output Problems

Traditional machine learning problems: f : X → y
Inputs X ∈ R^d: any type of input.
Outputs y ∈ R for the task: classification, regression, ...

Machine learning for structured output problems: f : X → Y
Inputs X ∈ R^d: any type of input.
Outputs Y ∈ R^{d'}, d' > 1: a structured object (with dependencies).

See C. Lampert's slides [3].
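To make the distinction concrete, a tiny sketch of the two settings; the array sizes are hypothetical and chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Traditional supervised learning: one scalar target per input.
d = 2500                        # input dimension, e.g. a flattened 50x50 image
x = rng.random(d)               # X in R^d
y_scalar = 1.0                  # y in R: a class score or regression value

# Structured output learning: the target itself is high-dimensional,
# and its components are interdependent (here, a pixel-wise label map).
d_prime = 2500                  # output dimension d' > 1
y_structured = np.zeros(d_prime)

assert x.shape == (d,) and y_structured.shape == (d_prime,)
```

The point is only the shape of the target: in the structured case the model must predict all d' components jointly, respecting their dependencies.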


SLIDE 5
Structured Output Problems

Data = representation (values) + structure (dependencies)

Examples of structured data: text (part-of-speech tagging, translation), speech ⇄ text, protein folding, images.

SLIDE 6
Structured Output Problems

Approaches that deal with structured output data:

◮ Kernel-based methods: Kernel Density Estimation (KDE)
◮ Discriminative methods: structured output SVM
◮ Graphical methods: HMM, CRF, MRF, ...

Drawbacks: these approaches perform a single data transformation and have difficulty with high-dimensional data.

The ideal approach would handle:

◮ structured output problems
◮ high-dimensional data
◮ multiple data transformations (complex mapping functions)

Deep neural networks?


SLIDE 8
Structured Output Problems

[Figure: a traditional deep neural network: input layer (x1 ... x7), four hidden layers, and a small output layer (y1, y2, y3) for classes such as car, bus, bike.]

◮ High-dimensional data: OK
◮ Multiple data transformations (complex mapping functions): OK
◮ Structured output problems: NO

SLIDE 9
Input/Output Deep Architecture (IODA)

Plan

1. Structured Output Problems
2. Input/Output Deep Architecture (IODA)
3. Application of IODA to medical image labeling
4. Application of IODA to Facial Landmark Detection
5. Conclusion
6. Future Work on IODA

SLIDE 10
Input/Output Deep Architecture (IODA)

[Figure: IODA network: input layer (x1 ... x7), four hidden layers, and a full structured output layer (y1 ... y7) forming a structured object.]

IODA:
◮ incorporates the output structure by learning
◮ discovers the hidden dependencies in the outputs

SLIDE 11
Input/Output Deep Architecture (IODA)

Training IODA (animation built up over slides 11 to 19):

[Figure sequence: the network is constructed and pre-trained layer by layer. First the input-side layers are pre-trained with auto-encoders, R_in(x_i; θ_in) = x̂_i; then the output-side layers are pre-trained with auto-encoders on the targets, R_out(y_i; θ_out) = ŷ_i; finally the whole network M(x_i; θ, θ_in, θ_out) = ŷ_i is fine-tuned with back-propagation.]

SLIDE 20
Input/Output Deep Architecture (IODA)

IODA framework: min_θ L(θ, D(x, y)), with

L(θ, D(x, y)) = (1/n) Σ_{i=1}^{n} [ C(M(x_i; θ, θ_in, θ_out), y_i)    ← learns the (input → output) dependencies
                                  + ℓ_in(R_in(x_i; θ_in), x_i)        ← learns the input dependencies
                                  + ℓ_out(R_out(y_i; θ_out), y_i) ]   ← learns the output dependencies

C(·), ℓ_in(·), ℓ_out(·): chosen cost functions.

min_θ L(θ, D(x, y)) is hard to solve ⇒ split L(θ, D(x, y)).


SLIDE 22
Input/Output Deep Architecture (IODA)

Relaxed, simplified version of IODA

1. Unsupervised pre-training:
   → input dependencies:  min_{θ_in}  (1/n) Σ_{i=1}^{n} ℓ_in(R_in(x_i; θ_in), x_i)
   → output dependencies: min_{θ_out} (1/n) Σ_{i=1}^{n} ℓ_out(R_out(y_i; θ_out), y_i)
2. Standard supervised learning:
   min_{θ, θ_in, θ_out} (1/n) Σ_{i=1}^{n} C(M(x_i; θ, θ_in, θ_out), y_i)

Open source implementation: implemented with our library Crino [1] (Python/Theano-based).
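The relaxed schedule above can be sketched end to end in plain NumPy. This is a toy stand-in, not the Crino implementation: it uses linear tied-weight auto-encoders in place of the deck's non-linear ones, random data in place of a real dataset, and for brevity the supervised phase fine-tunes only the link layer; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for a structured dataset D(x, y).
n, d_x, d_y, h = 64, 20, 12, 8
X = rng.standard_normal((n, d_x))
Y = rng.standard_normal((n, d_y))


def pretrain_autoencoder(Z, h, lr=0.01, epochs=300):
    """Unsupervised step: min_W (1/n) sum_i ||R(z_i; W) - z_i||^2 for a
    tied-weight *linear* auto-encoder R(z) = W.T @ (W @ z), i.e. V = W.T."""
    n, d = Z.shape
    W = 0.1 * rng.standard_normal((h, d))
    for _ in range(epochs):
        E = Z @ W.T                 # codes, shape (n, h)
        G = E @ W - Z               # reconstruction residual, shape (n, d)
        # Gradient of the mean squared reconstruction error w.r.t. W
        # (W appears in both the coder and the tied decoder).
        W -= lr * (2.0 / n) * (E.T @ G + W @ G.T @ Z)
    return W


# 1) Unsupervised pre-training of the input-side and output-side layers.
W_in = pretrain_autoencoder(X, h)    # captures input dependencies
W_out = pretrain_autoencoder(Y, h)   # captures output dependencies

# 2) Supervised learning: the model M(x) decodes the output code predicted
#    from the pre-trained input code, trained on a mean squared error C.
A = 0.1 * rng.standard_normal((h, h))   # link layer between the two codes
P = X @ W_in.T                          # pre-trained input codes
baseline = float(np.mean((P @ A @ W_out - Y) ** 2))
for _ in range(300):
    G = P @ A @ W_out - Y               # prediction residual
    A -= 0.01 * (2.0 / n) * (P.T @ G @ W_out.T)

final_loss = float(np.mean((P @ A @ W_out - Y) ** 2))
```

The supervised loss is convex in the link layer, so the second phase strictly improves on the randomly initialized link while keeping the pre-trained input and output representations.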


SLIDE 24
Application of IODA to medical image labeling

Plan

1. Structured Output Problems
2. Input/Output Deep Architecture (IODA)
3. Application of IODA to medical image labeling
4. Application of IODA to Facial Landmark Detection
5. Conclusion
6. Future Work on IODA

SLIDE 25
Application of IODA to medical image labeling

Image labeling problems

Definition: assigning a label to each pixel of an image (also known as "semantic segmentation"). Various applications:

- document image analysis (text, images, tables, etc.)
- computer vision (road safety, natural scene understanding)
- medical imaging (organ and tumour segmentation)

SLIDE 26
Application of IODA to medical image labeling

Image labeling problems

Output dependencies:
- local dependencies (neighbouring labels are correlated)
- structural dependencies (the sky is generally above the grass)
→ Image labeling can be considered a structured output problem.

SLIDE 27
Application of IODA to medical image labeling

Application of IODA to a medical image labeling problem

- Collaboration with the Henri Becquerel Center (Quantif team).
- Sarcopenia is a critical indication for lymphoma treatment.
- It can be measured on CT images by labeling the skeletal muscle at L3 (the third lumbar vertebra).
- Manual labeling takes about 4 min/patient for a senior radiologist.

Dataset: 128 labeled L3 CT images, 512 × 512 pixels. Reference method from Chung (based on registration).

SLIDE 28
Application of IODA to medical image labeling

Input/Output Deep Architecture (IODA) for image labeling

[Figure: IODA architecture for skeletal muscle segmentation.]

SLIDE 29
Application of IODA to medical image labeling

Implementation

Architecture (optimized on the validation set):

[Figure: fully connected network: input 311 × 457 = 142127 units, two hidden layers of 1500 units each, output 311 × 457 = 142127 units.]

A few figures:
- 428 M parameters (!)
- less than an hour of training (GPU, 4 GB)
- 201.2 ms per decision
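The quoted parameter count can be checked directly from the layer sizes (weights only; the biases add only about 145k more):

```python
# Weight count of the dense 142127 -> 1500 -> 1500 -> 142127 architecture.
d = 311 * 457          # 142127 input (and output) units
h = 1500               # two hidden layers of 1500 units
params = d * h + h * h + h * d
print(params)          # 428631000, i.e. the ~428 M parameters quoted above
```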

SLIDE 30
Application of IODA to medical image labeling

Qualitative results (1/2)

[Figure: (a) CT image, (b) ground truth, (c) Chung, (d) IODA, for a non-sarcopenic patient.]

SLIDE 31
Application of IODA to medical image labeling

Qualitative results (2/2)

[Figure: (a) CT image, (b) ground truth, (c) Chung, (d) IODA, for a sarcopenic patient.]

SLIDE 32
Application of IODA to medical image labeling

Quantitative results

| Method                           | Diff. (%) | Jaccard (%) |
|----------------------------------|-----------|-------------|
| Chung (reference method)         | 10.6      | 60.3        |
| No pre-train DA                  | 0.12      | 85.88       |
| Input pre-train DA               | 0.15      | 85.91       |
| Input/Output pre-train DA (IODA) | 3.37      | 88.47       |

SLIDE 33
Application of IODA to medical image labeling

The "blank test image"

Feed the network with a blank image. [Figure: outputs of the trained network for a blank input.]

Published in Pattern Recognition [4].

SLIDE 34
Application of IODA to Facial Landmark Detection

Plan

1. Structured Output Problems
2. Input/Output Deep Architecture (IODA)
3. Application of IODA to medical image labeling
4. Application of IODA to Facial Landmark Detection
5. Conclusion
6. Future Work on IODA

SLIDE 35
Application of IODA to Facial Landmark Detection

◮ Facial landmarks: a set of facial key points with coordinates (x, y).

Task: predict the shape (the set of points) given a facial image.
⇒ Geometric dependencies ⇒ a structured output problem ⇒ apply IODA (regression task).


SLIDE 38
Application of IODA to Facial Landmark Detection

Datasets & performance measures

◮ Datasets: LFPW (~1000 samples), HELEN (~2300 samples)
◮ Performance measures:
  ◮ Normalized Root Mean Square Error (NRMSE)
  ◮ Cumulative Distribution Function: CDF_NRMSE
  ◮ Area Under the CDF Curve (AUC) **new**

Architecture (optimized on the validation set):

[Figure: fully connected network: input 50 × 50 = 2500 units, hidden layers of 1024, 512 and 64 units, output 68 × 2 = 136 landmark coordinates.]

⇒ Total training on GPU takes less than 30 min.


SLIDE 41
Application of IODA to Facial Landmark Detection

Visual results: LFPW

[Figure: predicted landmarks with no pre-train DA, input pre-train DA, and input/output pre-train DA (IODA).]

SLIDE 42
Application of IODA to Facial Landmark Detection

Visual results: HELEN

[Figure: predicted landmarks with no pre-train DA, input pre-train DA, and input/output pre-train DA (IODA).]

SLIDE 43
Application of IODA to Facial Landmark Detection

| Method                    | Config | LFPW AUC | LFPW CDF0.1 | HELEN AUC | HELEN CDF0.1 |
|---------------------------|--------|----------|-------------|-----------|--------------|
| Mean shape                |        | 66.15%   | 18.30%      | 63.30%    | 16.97%       |
| No pre-train DA           | 0-0-0  | 77.60%   | 50.89%      | 80.91%    | 69.69%       |
| Input pre-train DA        | 1-0-0  | 79.25%   | 62.94%      | 82.13%    | 76.36%       |
|                           | 2-0-0  | 79.10%   | 58.48%      | 82.39%    | 75.75%       |
|                           | 3-0-0  | 79.51%   | 65.62%      | 82.25%    | 77.27%       |
| Input/Output pre-train DA | 1-0-1  | 80.66%   | 68.30%      | 83.95%    | 83.03%       |
|                           | 1-1-1  | 81.50%   | 72.32%      | 83.51%    | 80.90%       |
|                           | 1-0-2  | 81.00%   | 71.42%      | 83.91%    | 82.42%       |
|                           | 1-1-2  | 81.06%   | 70.98%      | 83.81%    | 83.03%       |
|                           | 1-0-3  | 81.91%   | 74.55%      | 83.72%    | 80.30%       |
|                           | 2-0-1  | 81.32%   | 72.76%      | 83.61%    | 80.00%       |
|                           | 2-1-1  | 81.47%   | 70.08%      | 84.11%    | 83.33%       |
|                           | 2-0-2  | 81.35%   | 71.87%      | 83.88%    | 82.12%       |
|                           | 3-0-1  | 81.62%   | 72.76%      | 83.38%    | 78.48%       |

Performance of mean shape, NDA, IDA and IODA on LFPW and HELEN.

SLIDE 44
Application of IODA to Facial Landmark Detection

Blank image test

Feed a blank image to a trained network ⇒ what is the output?

[Figure: the outputs on LFPW for no pre-train DA (0-0-0), input pre-train DA (3-0-0), and input/output pre-train DA (1-0-3).]

Paper submitted to ECML 2015 (arXiv [2]).

SLIDE 45
Conclusion

Plan

1. Structured Output Problems
2. Input/Output Deep Architecture (IODA)
3. Application of IODA to medical image labeling
4. Application of IODA to Facial Landmark Detection
5. Conclusion
6. Future Work on IODA

SLIDE 46
Conclusion

◮ A fully neural approach
◮ Able to learn the output dependencies in high dimension
◮ Efficient on two real-world problems

SLIDE 47
Future Work on IODA

Plan

1. Structured Output Problems
2. Input/Output Deep Architecture (IODA)
3. Application of IODA to medical image labeling
4. Application of IODA to Facial Landmark Detection
5. Conclusion
6. Future Work on IODA

SLIDE 48
Future Work on IODA (1)

Embedded pre-training (draft on arXiv):

L(θ, D(x, y)) = (1/n) Σ_{i=1}^{n} [ λ_C C(M(x_i; θ, θ_in, θ_out), y_i)
                                  + λ_in ℓ_in(R_in(x_i; θ_in), x_i)
                                  + λ_out ℓ_out(R_out(y_i; θ_out), y_i) ]

[Figure: evolution of λ_C, λ_in and λ_out over training time. Relaxed IODA uses hard 0/1 phase switches; embedded IODA uses smooth schedules in which the reconstruction weights decay while the supervised weight grows.]
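The two regimes above can be contrasted with weight schedules. The exact curves in the slides are not recoverable, so these piecewise functions are hypothetical and only reproduce the qualitative shapes:

```python
def relaxed_schedule(t):
    """Relaxed IODA: disjoint 0/1 phases: input pre-training, then output
    pre-training, then pure supervision (t is training time in [0, 1])."""
    lam_in = 1.0 if t < 1 / 3 else 0.0
    lam_out = 1.0 if 1 / 3 <= t < 2 / 3 else 0.0
    lam_c = 1.0 if t >= 2 / 3 else 0.0
    return lam_c, lam_in, lam_out


def embedded_schedule(t):
    """Embedded IODA: all three costs are optimized jointly, with the
    reconstruction weights decaying as the supervised weight grows."""
    lam_c = min(1.0, 2.0 * t)
    lam_in = max(0.0, 1.0 - 2.0 * t)
    lam_out = max(0.0, 1.0 - 2.0 * t)
    return lam_c, lam_in, lam_out
```

At every step the loss is the λ-weighted sum of the three costs; the embedded variant never fully switches any cost off until late in training.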

SLIDE 49
Future Work on IODA (2)

Use of unlabeled data:

L(θ, D(x, y)) = (1/n) Σ_{i=1}^{n} λ_C C(M(x_i; θ, θ_in, θ_out), y_i)
              + (1/(n + n_in)) Σ_{i=1}^{n+n_in} λ_in ℓ_in(R_in(x_i; θ_in), x_i)
              + (1/(n + n_out)) Σ_{i=1}^{n+n_out} λ_out ℓ_out(R_out(y_i; θ_out), y_i)

n_in, n_out: potentially huge amounts of unlabeled input and output data.

SLIDE 50
Future Work on IODA (3)

Convolutional IODA: convolutional layers are efficient at feature extraction ⇒ use convolutional layers instead of auto-encoders for the input layers.

SLIDE 51

Bibliography I

[1] Crino, a neural-network library based on Theano. https://github.com/jlerouge/crino, 2014.
[2] S. Belharbi, C. Chatelain, R. Hérault, and S. Adam. Input/Output Deep Architecture for Structured Output Problems. ECML, 2015.
[3] C. H. Lampert. Slides: Learning with Structured Inputs and Outputs. http://www.di.ens.fr/willow/events/cvml2010/materials/INRIA_summer_school_2010_Christoph.pdf, 2010.
[4] J. Lerouge, R. Herault, C. Chatelain, F. Jardin, and R. Modzelewski. IODA: an input/output deep architecture for image labeling. Pattern Recognition, 48(9):2847–2858, 2015.

SLIDE 52

Questions

Thank you for your attention.


SLIDE 53
Appendix

| Set   | Train samples | Test samples |
|-------|---------------|--------------|
| LFPW  | 811           | 224          |
| HELEN | 2000          | 330          |

Number of samples in the datasets.

SLIDE 54
Appendix

Normalized Root Mean Square Error (NRMSE):

NRMSE(s_p, s_g) = (1 / (n · D)) Σ_{i=1}^{n} ||s_p,i − s_g,i||_2

where s_p, s_g are the predicted and ground-truth shapes, n the number of points, and D the inter-ocular distance of s_g.

Cumulative Distribution Function (CDF_NRMSE):

CDF_x = CARD(NRMSE ≤ x) / N

where CARD(·) is the cardinality of a set and N the number of images. E.g. CDF_0.1 = 0.4 means that 40% of the images have an NRMSE less than or equal to 0.1.

Area Under the CDF Curve (AUC) **new**, for more numerical precision: plot the CDF_NRMSE curve by varying the NRMSE threshold in [0, 0.5], then compute the area under this curve.
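A minimal NumPy sketch of the three measures above; the function names and the normalization of the AUC by the threshold range are my own choices, not from the deck:

```python
import numpy as np


def nrmse(sp, sg, interocular):
    """NRMSE(s_p, s_g) = (1/(n*D)) * sum_i ||s_p,i - s_g,i||_2, where sp and
    sg are (n, 2) landmark arrays and D is the inter-ocular distance of sg."""
    n = sp.shape[0]
    return float(np.sum(np.linalg.norm(sp - sg, axis=1)) / (n * interocular))


def cdf(errors, x):
    """CDF_x = CARD(NRMSE <= x) / N over a set of per-image errors."""
    return float(np.mean(np.asarray(errors) <= x))


def auc_cdf(errors, x_max=0.5, steps=200):
    """Area under the CDF curve for thresholds in [0, x_max] (trapezoid
    rule), normalized by x_max so a perfect detector scores 1.0."""
    xs = np.linspace(0.0, x_max, steps)
    ys = np.array([cdf(errors, x) for x in xs])
    area = np.sum((ys[1:] + ys[:-1]) * np.diff(xs)) / 2.0
    return float(area / x_max)
```

For example, `cdf([0.05, 0.2], 0.1)` returns 0.5: half of the images fall at or below the 0.1 threshold, matching the deck's reading of CDF_0.1.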


SLIDE 57
Appendix

Input layer pre-training using auto-encoders (1)

[Figure: auto-encoder with input layer (x1 ... x5), hidden code layer, and output layer (x̂1 ... x̂5).]

e = g(U × x + a)   (coder)
d = h(V × e + c)   (decoder)
x̂ ⇐ d

U = Vᵀ: a tied-weight auto-encoder; x̂ = R_in(x).
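The coder/decoder pair above transcribes directly to NumPy. The activations g and h and all shapes are placeholders (the deck does not fix them); tanh and identity are common defaults:

```python
import numpy as np


def tied_autoencoder(x, U, a, c, g=np.tanh, h=lambda z: z):
    """One auto-encoder pass: code e = g(U x + a), then reconstruction
    x_hat = h(V e + c) with tied weights V = U^T."""
    e = g(U @ x + a)         # coder
    x_hat = h(U.T @ e + c)   # decoder, tied weights V = U.T
    return e, x_hat

# Shapes: U is (code_dim, input_dim), a is (code_dim,), c is (input_dim,).
```

The same function pre-trains the output layers by feeding it y instead of x, which is exactly the symmetry IODA exploits.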


SLIDE 59
Appendix

Output layer pre-training using auto-encoders (2)

[Figure: auto-encoder with input layer (y1 ... y5), hidden code layer, and output layer (ŷ1 ... ŷ5).]

e = g(U × y + a)   (coder)
d = h(V × e + c)   (decoder)
ŷ ⇐ d

U = Vᵀ: a tied-weight auto-encoder; ŷ = R_out(y).
