GTC 2018 - Discovering Order in Unordered Datasets: Generative Markov Networks (PowerPoint PPT Presentation)

GTC 2018

Discovering Order in Unordered Datasets: Generative Markov Networks

Yao-Hung Hubert Tsai†, Han Zhao†, Nebojsa Jojic‡, and Ruslan Salakhutdinov†

†Machine Learning Department, Carnegie Mellon University; ‡Microsoft Research

1 / 32

A Novel Question

Given an unordered dataset whose instances may exhibit an implicit order, can we recover that order?

2 / 32

Motivations

Data samples are

(X) independently and identically distributed
(O) possessing an unknown order

3 / 32

Motivations (cont’d)

Photos of the same person, at the same place, in the same clothes, but on different days.

[Photos: Day 1, Day 2, Day 3, Day 4, Day 5]

The i.i.d. assumption appears justified.

4 / 32

Motivations (cont’d)

After rearrangement...

[Photos rearranged, labeled Day 1 through Day 5]

An implicit order is observed.

5 / 32

Put It Differently...

Consider a data generative process that generates data

(X) i.i.d.
(O) sequentially, parameterized with fewer parameters and learned more easily

We propose a novel Markov chain generative scheme.

6 / 32

Motivations (cont’d)

Other examples/applications:

  • Understanding slow-progressing human diseases (e.g., Alzheimer's or Parkinson's)
  • Galaxy or star evolution
  • Cellular or molecular biological processes

Recover the order from a snapshot of individual samples.

7 / 32

Dataset Sorting

Prior approaches use a predefined distance metric:

  • Euclidean distance between image pixel values
  • p-distance between DNA/RNA sequences

These metrics do not generalize well.

8 / 32

Our Proposed Method

Data are sampled from a Markov chain.

$s_1 \to s_2 \to \cdots \to s_n$, with transition operator $T(s_t \mid s_{t-1}; \theta)$

Propose a greedy batch-wise permutation scheme.

N: total number of data points; b: batch size. Complexity: O(N b log b) (b is a constant).

  • cf. O(N!), which is NP-hard
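To get a feel for the gap between the two complexities, here is a small back-of-the-envelope check added for this write-up; the values of N and b are illustrative, not from the talk:

```python
import math

N, b = 100_000, 64                      # illustrative dataset size and batch size
digits_in_N_factorial = math.lgamma(N + 1) / math.log(10)   # log10(N!)
greedy_cost = N * b * math.log(b)       # O(N b log b) transition evaluations

print(f"N! has roughly {digits_in_N_factorial:.0f} decimal digits")
print(f"N * b * log(b) is about {greedy_cost:.2e}")
```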

9 / 32

Problem Setup

π: a permutation over [n], giving the order $s_{\pi(1)} \to s_{\pi(2)} \to \cdots \to s_{\pi(n)}$

Joint log-likelihood estimation problem over:

  • θ in the transition operator $T(s' \mid s; \theta)$
  • the optimal π

10 / 32

Illustration

[Figure: simultaneously learn π (the permutation) and T (the transition operator).]

1. Train GMNs on the green circle (data without order).
2. Apply the trained transition operator to the blue circle.

11 / 32

Joint Log-Likelihood Estimation Problem (θ and π)

$$\max_{\theta,\pi}\ \log P(\{s_i\}_{i=1}^{n}, \pi; \theta)
= \max_{\theta,\pi}\ \log\Big[P(\pi)\, P^{(1)}(s_{\pi(1)}) \prod_{t=2}^{n} T(s_{\pi(t)} \mid s_{\pi(t-1)}; \theta)\Big]
\approx \max_{\theta,\pi}\ \log\Big[P(\pi) \prod_{t=2}^{n} T(s_{\pi(t)} \mid s_{\pi(t-1)}; \theta)\Big]$$

P(π): distribution over permutations
P^{(1)}(·): initial state distribution
π ∈ Π(n): the set of all possible permutations; optimizing over it exhaustively is computationally intractable (|Π(n)| = n!)
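As an editorial illustration of the approximate objective above: for a fixed permutation it is just a sum of transition log-probabilities along the proposed order. A minimal sketch, assuming a `log_T(prev, curr)` callable that returns log T(curr | prev; θ) for some trained operator (the Gaussian operator in the demo is a made-up stand-in):

```python
import numpy as np

def ordered_log_likelihood(states, order, log_T):
    """Sum of log T(s_pi(t) | s_pi(t-1); theta) along a proposed order.

    The P(pi) and initial-distribution terms are dropped, matching the
    approximation on this slide."""
    return sum(
        log_T(states[order[t - 1]], states[order[t]])
        for t in range(1, len(order))
    )

# Toy usage with a hypothetical Gaussian transition operator.
rng = np.random.default_rng(0)
states = np.cumsum(rng.normal(size=(6, 3)), axis=0)              # a noisy trajectory
gaussian_log_T = lambda s, s_next: -0.5 * np.sum((s_next - s) ** 2)
print(ordered_log_likelihood(states, [0, 1, 2, 3, 4, 5], gaussian_log_T))
```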

12 / 32

Outline

1 Motivation
2 Generative Markov Networks
3 Experiments
  • Discover Implicit Order
  • Recovering Orders in Ordered Datasets
  • One-Shot Recognition
4 Conclusion

13 / 32

I - Neural Network Parametrization

Parametrize the transition operator with a neural network:

$f_\theta(s, s') = T(s' \mid s; \theta) : \mathbb{R}^p \times \mathbb{R}^p \to [0, 1]$

For example, a Bernoulli parametrization for binary data:

$T(s' \mid s; \theta) = \prod_{i=1}^{p} g_i(s, \theta)^{s'_i} \, (1 - g_i(s, \theta))^{1 - s'_i}$

Reduces the space complexity in a unified model

  • cf. O(d²) for a tabular transition matrix, with d = 2^p states for a binary image I ∈ {0, 1}^p

Gives good estimates for unseen states

Provides a differentiable approximation to discrete structures.
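A minimal sketch of one possible Bernoulli parametrization with a single hidden layer, written in NumPy; the architecture, layer sizes, and the name `BernoulliTransition` are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

class BernoulliTransition:
    """T(s' | s; theta) = prod_i g_i(s)^{s'_i} (1 - g_i(s))^{1 - s'_i},
    where g(s) is a small neural network with sigmoid outputs."""

    def __init__(self, p, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(p, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.1, size=(hidden, p))
        self.b2 = np.zeros(p)

    def g(self, s):
        # g_i(s, theta): per-dimension Bernoulli means in (0, 1)
        h = np.tanh(s @ self.W1 + self.b1)
        return 1.0 / (1.0 + np.exp(-(h @ self.W2 + self.b2)))

    def log_prob(self, s, s_next):
        # log T(s' | s; theta) for binary s' in {0, 1}^p
        m = np.clip(self.g(s), 1e-6, 1 - 1e-6)
        return np.sum(s_next * np.log(m) + (1 - s_next) * np.log(1 - m))

# Example: score a transition between two random 28x28 binary images.
op = BernoulliTransition(p=784)
s, s_next = (np.random.rand(2, 784) > 0.5).astype(float)
print(op.log_prob(s, s_next))
```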

14 / 32

II - Coordinate Ascent Style Optimization

Iteratively optimize θ and π

$$\max_{\theta,\pi}\ \log\Big[P(\pi) \prod_{t=2}^{n} T(s_{\pi(t)} \mid s_{\pi(t-1)}; \theta)\Big]
= \max_{\theta}\ \log \prod_{t=2}^{n} T(s_{\pi^*(t)} \mid s_{\pi^*(t-1)}; \theta)$$

with

$$\pi^* = \operatorname*{argmax}_{\pi \in \Pi(n)}\ \sum_{t=2}^{n} \log T(s_{\pi(t)} \mid s_{\pi(t-1)}; \theta)$$

Backpropagation to update θ; greedy approximation of π*.
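For the θ-step, here is a hedged PyTorch sketch of one backpropagation update given a fixed order π*; `TransitionNet`, its layer sizes, and the random stand-in for π* are illustrative assumptions rather than the paper's exact model:

```python
import torch
import torch.nn as nn

class TransitionNet(nn.Module):
    """Small network producing per-dimension Bernoulli means g(s)."""
    def __init__(self, p, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(p, hidden), nn.ReLU(), nn.Linear(hidden, p))

    def forward(self, s):
        return torch.sigmoid(self.net(s))

def theta_step(model, optimizer, states, order):
    """One backpropagation update of theta for a fixed permutation `order`."""
    ordered = states[order]                      # (n, p) rows visited in order
    prev, nxt = ordered[:-1], ordered[1:]        # consecutive (s_{t-1}, s_t) pairs
    means = model(prev).clamp(1e-6, 1 - 1e-6)    # g(s_{t-1})
    nll = -(nxt * means.log() + (1 - nxt) * (1 - means).log()).sum()
    optimizer.zero_grad()
    nll.backward()
    optimizer.step()
    return nll.item()

p = 784
model = TransitionNet(p)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
states = (torch.rand(32, p) > 0.5).float()       # toy binary "images"
order = torch.randperm(32)                       # stand-in for the greedy pi*
print(theta_step(model, optimizer, states, order))
```

The π-step alternates with this update, using the greedy approximation described on the next slide.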

15 / 32

III - Greedy Approximation of the Optimal Order

Original problem:

$$\pi^* = \operatorname*{argmax}_{\pi \in \Pi(n)}\ \sum_{t=2}^{n} \log T(s_{\pi(t)} \mid s_{\pi(t-1)}; \theta)$$

Complexity: O(n!); an NP-hard problem.

Greedy approximation:

$$\pi^*(j) \leftarrow \operatorname*{argmax}_{k \notin \{\pi^*(1), \dots, \pi^*(j-1)\}}\ T(s_k \mid s_{\pi^*(j-1)}; \theta)$$

Complexity: O(n² log n)
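A hedged sketch of the greedy rule in Python; the choice of starting state and the simple per-step scan (O(n) candidates per step, so O(n²) transition evaluations overall; the slide's O(n² log n) presumably accounts for sorting-based candidate selection) are illustrative simplifications:

```python
import numpy as np

def greedy_order(states, log_T, start=0):
    """Greedy approximation of pi*: starting from `start`, repeatedly append
    the not-yet-visited state k maximizing log T(s_k | s_prev; theta)."""
    n = len(states)
    order, remaining = [start], set(range(n)) - {start}
    while remaining:
        prev = states[order[-1]]
        best = max(remaining, key=lambda k: log_T(prev, states[k]))
        order.append(best)
        remaining.remove(best)
    return order

# Toy check: try to re-order a shuffled Gaussian random-walk trajectory.
rng = np.random.default_rng(0)
path = np.cumsum(rng.normal(size=(20, 2)), axis=0)
perm = rng.permutation(20)
log_T = lambda s, s_next: -0.5 * np.sum((s_next - s) ** 2)
recovered = greedy_order(path[perm], log_T, start=int(np.argmin(perm)))
print(recovered)
```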

16 / 32

IV - Batch-Wise Permutation Training

Partition the training set into batches of size b. Effective time complexity: O(b² log b) · (n/b) = O(n b log b).

  • cf. O(n!) and O(n² log n)

Induce an ergodic Markov chain by overlapping consecutive batches.
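A minimal sketch of how overlapping batches might be formed so that the per-batch chains link together; the overlap of one sample and the helper name `overlapping_batches` are assumptions made for illustration:

```python
import numpy as np

def overlapping_batches(n, b, overlap=1, seed=0):
    """Split n shuffled indices into batches of size b; each batch re-uses the
    last `overlap` indices of the previous batch so the per-batch chains
    connect into a single chain."""
    idx = np.random.default_rng(seed).permutation(n).tolist()
    batches, pos = [], 0
    while pos < n:
        batch = idx[pos:pos + b]
        if batches:
            batch = batches[-1][-overlap:] + batch
        batches.append(batch)
        pos += b
    return batches

# Each epoch: greedily order every batch (O(b^2 log b) per batch, n/b batches),
# then run a theta update on each ordered batch.
batches = overlapping_batches(n=10_000, b=64)
print(len(batches), len(batches[0]), len(batches[1]))   # later batches share one index
```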

17 / 32

Experiments

Discover Implicit Order
Recover Order in Ordered Datasets
One-Shot Recognition

18 / 32

Datasets for Discovering Implicit Order

Horse [1]

Horse images with object-background segmentation.

MSR SenseCam [2]

  • office category

19 / 32

Implicit Order

Order the dataset according to

maximum transition probability (GMN)
minimum Euclidean distance (NN)

20 / 32

Image Propagation

Find the best next image according to

maximum transition probability (GMN)
minimum Euclidean distance (NN)

[Figure: propagation sequences produced by GMN and by NN]

  • NN gets stuck oscillating between two similar images

21 / 32

Videos for Training GMN

Horse (video link)

22 / 32

Datasets for Recovering Orders

UCF-CIL Action Dataset [3]

4 different subjects performing ballet fouetté actions

Kendall Tau-b metric for comparing two orders
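For reference, the Kendall Tau-b score between a recovered order and the true order can be computed with SciPy (kendalltau uses the tau-b variant when ties are present); the example orders below are made up:

```python
from scipy.stats import kendalltau

true_order = [0, 1, 2, 3, 4, 5, 6, 7]
recovered  = [0, 2, 1, 3, 4, 6, 5, 7]            # e.g., an order produced by the greedy scheme
tau, p_value = kendalltau(true_order, recovered)
print(f"Kendall Tau-b = {tau:.3f} (p = {p_value:.3f})")
```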

23 / 32

Qualitative Results

Ballet fouetté actions for subject 1, shown with:

  • true order
  • order recovered from NN
  • order recovered from GMNs trained on subject 2
  • order recovered from GMNs trained on subject 1

24 / 32

Quantitative Results

Kendall Tau-b metric with the true order

Applied To | GMN trained on Subject 1 | GMN trained on Subject 2 | GMN trained on Subject 3 | GMN trained on Subject 4 | NN
Subject 1  | 0.364 ± 0.058 | 0.183 ± 0.033 | 0.247 ± 0.052 | 0.200 ± 0.065 | 0.137
Subject 2  | 0.062 ± 0.091 | 0.085 ± 0.038 | 0.079 ± 0.035 | 0.074 ± 0.052 | 0.217
Subject 3  | 0.026 ± 0.072 | 0.086 ± 0.066 | 0.191 ± 0.055 | 0.092 ± 0.083 | 0.122
Subject 4  | 0.274 ± 0.056 | 0.288 ± 0.032 | 0.304 ± 0.025 | 0.355 ± 0.025 | 0.292

25 / 32

Dataset for One-Shot Recognition

miniImageNet [4]

A subset of ImageNet: 80 classes for training and 20 classes for testing; each class consists of 600 images.

Consider the 5-way 1-shot recognition task:

5 classes, 1 labeled image per class

26 / 32

Training Details

Apply the GMN trained on the training classes to the testing classes.

Generate a Markov chain from the labeled instance $s^c_0$ of each class c:

$$s^c_0 \to \tilde{s}^c_1 \to \cdots \to \tilde{s}^c_k$$

For an unlabeled instance $s_{\text{unlabeled}}$, find the best-fitting chain:

$s_{\text{unlabeled}}$ → chain for class 1 (score 0.05)
$s_{\text{unlabeled}}$ → chain for class 2 (score 0.1)
$s_{\text{unlabeled}}$ → chain for class 3 (score 0.4)
. . .
$s_{\text{unlabeled}}$ → chain for class n (score 0.05)
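A hedged sketch of this classification rule: roll out a short chain from each class's single labeled example, then assign an unlabeled instance to the class whose chain explains it best (here, scored by the best transition log-probability from any chain state). The names `generate_chain`/`classify` and the Gaussian toy operator are illustrative assumptions:

```python
import numpy as np

def generate_chain(s0, step, k=10):
    """Roll out a Markov chain s0 -> s1 -> ... -> sk with a sampler `step`."""
    chain = [s0]
    for _ in range(k):
        chain.append(step(chain[-1]))
    return np.stack(chain)

def classify(s_unlabeled, class_chains, log_T):
    """Assign the class whose chain best explains the unlabeled instance."""
    scores = {
        c: max(log_T(state, s_unlabeled) for state in chain)
        for c, chain in class_chains.items()
    }
    return max(scores, key=scores.get), scores

# Toy 5-way 1-shot demo with a Gaussian stand-in for the transition operator.
rng = np.random.default_rng(0)
log_T = lambda s, s_next: -0.5 * np.sum((s_next - s) ** 2)
step = lambda s: s + rng.normal(scale=0.1, size=s.shape)
labeled = {c: rng.normal(size=64) + 3 * c for c in range(5)}   # one example per class
class_chains = {c: generate_chain(s0, step) for c, s0 in labeled.items()}
query = labeled[3] + rng.normal(scale=0.2, size=64)
pred, scores = classify(query, class_chains, log_T)
print("predicted class:", pred)
```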

27 / 32

Quantitative Results

Model | Basic/Advanced Model | Discriminative/Generative | Parametric/Nonparametric | Accuracy
Meta-Learner LSTM (Ravi & Larochelle, 2017) | Basic | Discriminative | Parametric | 43.44 ± 0.77
Model-Agnostic Meta-Learning (Finn et al., 2017) | Basic | Discriminative | Parametric | 48.70 ± 1.84
Meta Networks (Munkhdalai & Yu, 2017) | Advanced | Discriminative | Parametric | 49.21 ± 0.96
Meta-SGD (Li et al., 2017) | Basic | Discriminative | Parametric | 50.47 ± 1.87
Temporal Convolutions Meta-Learning (Mishra et al., 2017) | Advanced | Discriminative | Parametric | 55.71 ± 0.99
Nearest Neighbor with Cosine Distance | Basic | Discriminative | Nonparametric | 41.08 ± 0.70
Matching Networks FCE (Vinyals et al., 2016) | Basic | Discriminative | Nonparametric | 43.56 ± 0.84
Siamese (Koch et al., 2015) | Basic | Discriminative | Nonparametric | 48.42 ± 0.79
mAP-Direct Loss Minimization (Triantafillou et al., 2017) | Basic | Discriminative | Nonparametric | 41.64 ± 0.78
mAP-Structural Support Vector Machine (Triantafillou et al., 2017) | Basic | Discriminative | Nonparametric | 47.89 ± 0.78
Prototypical Networks (Snell et al., 2017) | Basic | Discriminative | Nonparametric | 49.42 ± 0.78
Attentive Recurrent Comparators (Shyam et al., 2017) | Not Specified | Discriminative | Nonparametric | 49.1
Skip-Residual Pairwise Networks (Mehrotra & Dukkipati, 2017) | Advanced | Discriminative | Nonparametric | 55.2
Generative Markov Networks without fine-tuning (ours) | Basic | Generative | Nonparametric | 45.36 ± 0.94
Generative Markov Networks with fine-tuning (ours) | Basic | Generative | Nonparametric | 48.87 ± 1.10

28 / 32

Conclusion

(X) Data are i.i.d. sampled.
(O) Data have implicit orders (generated from a Markov chain).

We propose a greedy batch-wise permutation scheme to recover the order, requiring only O(N b log b) complexity (cf. O(N!)).

29 / 32

References

[1] E. Borenstein and S. Ullman, “Class-specific, top-down segmentation,” in European Conference on Computer Vision, pp. 109–122, Springer, 2002.

[2] N. Jojic, A. Perina, and V. Murino, “Structural epitome: a way to summarize one’s visual experience,” in Advances in Neural Information Processing Systems 23, pp. 1027–1035, Curran Associates, Inc., 2010.

[3] Y. Shen and H. Foroosh, “View-invariant action recognition using fundamental ratios,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–6, IEEE, 2008.

[4] S. Ravi and H. Larochelle, “Optimization as a model for few-shot learning,” in ICLR, 2017.

30 / 32

The End

31 / 32

Transitional Operator

[Appendix figure: architecture of the transition operator network. X_t is passed through layers f1, f21, f22, f3, f41, and f42, with injected noise ζ ~ N(0, 1) and concatenation of intermediate representations, producing X_t+1.]

32 / 32