Action Respecting Embedding, Michael Bowling, April 25, 2005 (PowerPoint presentation)



SLIDE 1

Action Respecting Embedding

Michael Bowling April 25, 2005 University of Alberta

SLIDE 2

Acknowledgments

  • Joint work with. . .

Ali Ghodsi and Dana Wilkinson.

  • Localization work with. . .

Adam Milstein and Wesley Loh.

  • Discussions and insight. . .

Finnegan Southey, Dale Schuurmans, Tao Wang, Dan Lizotte, Michael Littman, and Pascal Poupart.

SLIDE 3

What is a Map?

SLIDE 4

Robot Maps

  • Maps are Models.

    – Motion: P(xt+1 | xt, ut).
    – Sensor: P(zt | xt).

    xt: pose, ut: action or control, zt: observation.

  • Robot Pose: xt = (xt, yt, θt).

    – Objective representation.
    – Models relate actions (ut) and observations (zt) to this frame of reference.

  • Can a map be learned with only subjective experience?

      z1, u1, z2, u2, . . . , uT−1, zT

    Not an objective map.

SLIDE 5

ImageBot

  • A “Robot” moving around on a large image.

    – Actions: Forward, Backward, Left, Right, Turn-CW, Turn-CCW, Zoom-In, Zoom-Out.

  • Example: F × 5, CW × 8, F × 5, CW × 16, F × 5, CW × 8

SLIDE 6

ImageBot

  • Construct a map from a stream of input.

      z1 ⇒u1 z2 ⇒u1 z3 ⇒u1 z4 ⇒u1 z5 ⇒u2 z6 ⇒u2 z7 ⇒u2 z8 ⇒u2 z9 ⇒u2 z10 ⇒u2 · · ·

  • Actions are labels with no semantics.
  • No image features, just high-dimensional vectors.
  • Can a map be learned with only subjective experience?
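The ImageBot setup can be sketched as a toy data generator (the window size, step sizes, and action names below are illustrative assumptions, not the original parameters): actions shift a viewing window over a large image, and each observation is the raw pixel vector of the window.

```python
import numpy as np

# Hypothetical ImageBot sketch: a "robot" whose pose is a window over a
# large image; each action shifts the window, and the observation is the
# flattened pixel vector.  Action names and step sizes are illustrative.
ACTIONS = {
    "F": (0, -5), "B": (0, 5),   # forward/backward: vertical shifts
    "L": (-5, 0), "R": (5, 0),   # left/right: horizontal shifts
}

def step(image, pos, action, window=32):
    """Apply an action, returning the new position and observation z."""
    dx, dy = ACTIONS[action]
    x = min(max(pos[0] + dx, 0), image.shape[1] - window)
    y = min(max(pos[1] + dy, 0), image.shape[0] - window)
    z = image[y:y + window, x:x + window].ravel().astype(float)
    return (x, y), z

rng = np.random.default_rng(0)
world = rng.random((200, 200))
pos, trajectory = (100, 100), []
for u in ["F"] * 5 + ["R"] * 5:          # a short action sequence
    pos, z = step(world, pos, u)
    trajectory.append((u, z))            # (u_t, z_{t+1}) pairs
```

The stored stream carries only the action label and the pixel vector: no image features, no objective coordinates.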

SLIDE 7

Overview

  • What is a Map?
  • Subjective Mapping
  • Action Respecting Embedding (ARE)
  • Results
  • Future Work

SLIDE 8

Subjective Maps

  • What is a subjective map?

    – Allows you to do “map things”, e.g., localize and plan.
    – No models.
    – Representation (i.e., pose) can be anything.

  • Subjective mapping becomes choosing a representation.

    – What is a good representation?
    – How do we extract it from experience?
    – How do we answer our map questions with it?

SLIDE 9

Subjective Representation

  • (x, y, θ) is often a good representation. Why?

– Sufficient representation for generating observations.

    [Figure: robot pose (x, y, θ) in the plane.]

    – Low dimensional (despite high dimensional observations).
    – Actions are simple transformations:

      xt+1 = (xt + F cos θt, yt + F sin θt, θt + R)
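The pose update above can be written directly; the forward distance F and rotation R here are illustrative step sizes:

```python
import math

def move(pose, forward=1.0, rotate=0.0):
    """Pose update x_{t+1} = (x + F cos θ, y + F sin θ, θ + R)."""
    x, y, theta = pose
    return (x + forward * math.cos(theta),
            y + forward * math.sin(theta),
            theta + rotate)

pose = (0.0, 0.0, 0.0)
for _ in range(4):      # four unit moves with quarter turns trace a square
    pose = move(pose, forward=1.0, rotate=math.pi / 2)
```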

SLIDE 10

Subjective Representation

  • How do we extract a representation like (x, y, θ)?

    – Low dimensional description of observations? → Dimensionality Reduction
    – Respects actions as simple transformations?


SLIDE 11

Semidefinite Embedding (SDE)

(Weinberger & Saul, 2004)

  • Goal: Learn the kernel matrix, K, from the data.
  • Optimization Problem:

      Maximize:   Tr(K)
      Subject to: K ⪰ 0
                  Σij Kij = 0
                  ∀i, j : ηij > 0 ∨ [ηᵀη]ij > 0 ⇒ Kii − 2Kij + Kjj = ||xi − xj||²

  where η comes from k-nearest neighbors.

  • Use learned K with kernel PCA.
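The kernel PCA step can be sketched in numpy: given a centered kernel matrix K (built below from known 2-D points as a stand-in for a learned K), the embedding coordinates are the eigenvectors of K scaled by the square roots of the eigenvalues.

```python
import numpy as np

# Kernel PCA extraction (sketch): embedding = eigenvectors of K scaled by
# sqrt(eigenvalues).  K here comes from known points so the step is checkable.
rng = np.random.default_rng(0)
X = rng.random((10, 2))
Xc = X - X.mean(axis=0)               # centering corresponds to Σij Kij = 0
K = Xc @ Xc.T

w, V = np.linalg.eigh(K)
w, V = w[::-1], V[:, ::-1]            # largest eigenvalues first
Y = V * np.sqrt(np.maximum(w, 0))     # embedding; columns are dimensions

# The embedding reproduces all pairwise distances encoded in K.
D_orig = np.linalg.norm(Xc[:, None] - Xc[None, :], axis=2)
D_emb = np.linalg.norm(Y[:, None] - Y[None, :], axis=2)
```

Only the top components carry signal here; the remaining eigenvalues are numerically zero because K has rank 2.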

SLIDE 12

Semidefinite Embedding (SDE)

  • It works.
  • Reduces dimensionality reduction to a constrained optimization problem.

SLIDE 13

Dimensionality Reduction Maps

    [Figure: x over time and position on the learned manifold.]

SLIDE 14

Subjective Representation

  • How do we extract a representation like (x, y, θ)?

    – Low dimensional description of observations? → Dimensionality Reduction
    – Respects actions as simple transformations? → Action Respecting Embedding

SLIDE 15

Overview

  • What is a Map?
  • Subjective Mapping
  • Action Respecting Embedding (ARE)

    – Non-Uniform Neighborhoods
    – Action Respecting Constraints

  • Results
  • Future Work

SLIDE 16

Action Respecting Embedding (ARE)

  • Like SDE, learns a kernel matrix K through optimization.
  • Uses the kernel matrix K with kernel PCA.
  • Exploits the action data (ut) in two ways.

    – Non-uniform neighborhood graph.
    – Action respecting constraints.

SLIDE 17

Non-Uniform Neighborhoods

  • Actions only have a small effect on the robot’s pose.
  • If zt ⇒ut zt+1 (action ut takes zt to zt+1), then Φ(zt) and Φ(zt+1) should be close.

  • Set zt’s neighborhood size to include zt−1 and zt+1.


  • Neighbor graph uses this non-uniform neighborhood size.
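One plausible construction of the non-uniform neighborhood graph (the exact rule here is an assumption, not taken from the talk): start from k nearest neighbors in observation space and enlarge each point's neighborhood until its temporal neighbors zt−1 and zt+1 are included.

```python
import numpy as np

# Sketch: k-NN neighborhoods grown per point until temporal neighbors fit.
def neighborhoods(Z, k=2):
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    T = len(Z)
    eta = np.zeros((T, T), dtype=bool)
    for t in range(T):
        order = np.argsort(D[t])                 # nearest first (self at rank 0)
        temporal = [s for s in (t - 1, t + 1) if 0 <= s < T]
        # neighborhood size: at least k, and large enough to reach z_{t-1}, z_{t+1}
        rank = max(int(np.where(order == s)[0][0]) for s in temporal)
        size = max(k, rank)
        eta[t, order[1:size + 1]] = True
    return eta

# Two clusters far apart: temporal links still bridge the gap.
Z = np.array([[0., 0.], [1., 0.], [2., 0.], [10., 0.], [11., 0.]])
eta = neighborhoods(Z)
```

The point of the non-uniformity: even when zt+1 is far away in observation space (here, the jump from 2 to 10), the graph still links consecutive observations.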

SLIDE 18

Action Respecting Constraints

  • Constrain the representation so that actions must be simple transformations.

  • What’s a simple transformation?

    – Linear transformations are simple: f(x) = Ax + b.
    – Distance preserving ones are just slightly simpler: f(x) = Ax + b where AᵀA = I,
      i.e., rotation and translation, but no scaling.
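A quick numerical check of the distinction (the angle and offsets are arbitrary): a rotation plus translation leaves pairwise distances unchanged, while a scaling, though also linear, does not.

```python
import numpy as np

rng = np.random.default_rng(1)
x, xp = rng.random(2), rng.random(2)

c, s = np.cos(0.7), np.sin(0.7)
A = np.array([[c, -s], [s, c]])          # rotation: A^T A = I
b = np.array([3.0, -2.0])
f = lambda v: A @ v + b                  # distance preserving

d_before = np.linalg.norm(x - xp)
d_after = np.linalg.norm(f(x) - f(xp))   # equal to d_before

g = lambda v: 2.0 * v + b                # linear but not distance preserving
```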

SLIDE 19

Action Respecting Constraints

  • Distance preserving ⇐⇒ ||f(x) − f(x′)|| = ||x − x′||.

  • For our representation, if ui = uj = u:

      ||fu(Φ(zi)) − fu(Φ(zj))|| = ||Φ(zi) − Φ(zj)||
      ||Φ(zi+1) − Φ(zj+1)|| = ||Φ(zi) − Φ(zj)||
      K(i+1)(i+1) − 2K(i+1)(j+1) + K(j+1)(j+1) = Kii − 2Kij + Kjj
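This identity can be verified numerically: if one action acts as a fixed rotation-plus-translation in the latent space, the Gram matrix of the resulting trajectory satisfies the kernel-entry constraint exactly (the angle, offset, and starting point below are arbitrary choices):

```python
import numpy as np

# When an action is f_u(x) = Ax + b with A orthonormal, the trajectory's
# Gram matrix satisfies
#   K[i+1,i+1] - 2K[i+1,j+1] + K[j+1,j+1] = K[i,i] - 2K[i,j] + K[j,j].
theta = 0.3
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation: A^T A = I
b = np.array([1.0, 0.5])

x = np.array([2.0, -1.0])
X = [x]
for _ in range(9):                   # ten latent poses, same action throughout
    x = A @ x + b
    X.append(x)
X = np.array(X)
K = X @ X.T                          # Gram matrix of the trajectory

def violation(K, i, j):
    """Gap between the two sides of the action-respecting constraint."""
    lhs = K[i + 1, i + 1] - 2 * K[i + 1, j + 1] + K[j + 1, j + 1]
    rhs = K[i, i] - 2 * K[i, j] + K[j, j]
    return abs(lhs - rhs)

max_gap = max(violation(K, i, j) for i in range(9) for j in range(9))
```

Both sides equal the squared distance between the corresponding poses, which the isometry preserves, so the gap is zero up to floating point.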

SLIDE 20

Action Respecting Embedding

  • Optimization Problem:

      Maximize:   Tr(K)
      Subject to: K ⪰ 0
                  Σij Kij = 0
                  ∀i, j : ηij > 0 ∨ [ηᵀη]ij > 0 ⇒ Kii − 2Kij + Kjj = ||zi − zj||²
                  ∀i, j : ui = uj ⇒
                      K(i+1)(i+1) − 2K(i+1)(j+1) + K(j+1)(j+1) = Kii − 2Kij + Kjj

  where η is the non-uniform neighborhood graph.

  • Use learned K with kernel PCA to extract x1, . . . , xT.

SLIDE 21

Overview

  • What is a Map?
  • Subjective Mapping
  • Action Respecting Embedding (ARE)
  • Results
  • Future Work

SLIDE 22

ImageBot

  • Construct a map from a stream of input.

      z1 ⇒u1 z2 ⇒u1 z3 ⇒u1 z4 ⇒u1 z5 ⇒u2 z6 ⇒u2 z7 ⇒u2 z8 ⇒u2 z9 ⇒u2 z10 ⇒u2 · · ·

  • Actions are labels with no semantics.
  • No image features, just high-dimensional vectors.

SLIDE 23

Learned Representations

    [Figure: x over time and position on the learned manifold, SDE vs. ARE.]

SLIDE 24

Learned Representations

    [Figure: x over time and position on the learned manifold, SDE vs. ARE.]

SLIDE 25

Learned Representations

    [Figure: θ against the 1st and 2nd dimensions of the manifold, SDE vs. ARE.]

SLIDE 26

Learned Representations

    [Figure: (x, y) trajectory against the 1st and 2nd dimensions of the manifold, SDE vs. ARE.]

SLIDE 27

Learned Representations

    [Figure: x and z against the 1st and 2nd dimensions of the manifold, SDE vs. ARE.]

SLIDE 28

Learned Representations

    [Figure: learned representation in the 1st, 2nd, and 3rd dimensions.]

SLIDE 29

Subjective Mapping

  • Is it a map?
  • Planning

    – Extract the distance preserving operators: fu : X → X.
    – Search for an action sequence.

  • Localization

    – Extract motion and sensor models: P(xt+1|xt, ut), P(zt|xt).
    – Monte Carlo localization (MCL), a.k.a. particle filtering.
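A bare-bones MCL sketch with invented one-dimensional Gaussian motion and sensor models (all parameters are illustrative, not the models extracted from ARE):

```python
import numpy as np

# Monte Carlo localization: propagate particles through the motion model,
# weight by the sensor model, resample.
rng = np.random.default_rng(0)

def mcl_step(particles, u, z, motion_sd=0.2, sensor_sd=0.5):
    # 1. motion model P(x_{t+1}|x_t, u_t): shift by u plus Gaussian noise
    particles = particles + u + rng.normal(0, motion_sd, len(particles))
    # 2. sensor model P(z_t|x_t): Gaussian weight around the observation
    w = np.exp(-0.5 * ((z - particles) / sensor_sd) ** 2)
    w /= w.sum()
    # 3. resample in proportion to the weights
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

true_x, particles = 0.0, rng.uniform(-10, 10, 500)
for _ in range(20):
    true_x += 1.0                                   # action u = +1
    z = true_x + rng.normal(0, 0.5)                 # noisy observation
    particles = mcl_step(particles, 1.0, z)

estimate = particles.mean()
```

After a few steps the particle cloud collapses around the true position.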

SLIDE 30

Summary

  • ARE automatically extracts a representation from only subjective experience of observations and actions.
  • The representation is low-dimensional and action-respecting.
  • It has all of the qualities of a map.

    – Can be used for planning.
    – Can be used for localization.

SLIDE 31

Future Work

  • Faster ARE. Customized solver?
  • Better ARE.

    – Learning representations with walls.
    – State aliasing.
    – Continuous actions.

  • Using ARE.

    – Heuristics for planning with ARE.
    – Behavior learning with ARE representations.

  • Subjective maps for real robots.

SLIDE 32

Questions?

SLIDE 33

Extra Slides

SLIDE 34

Planning

  • Problem: from xT, find a sequence of actions that will reach x∗, where x∗ ∈ {x1, . . . , xT}.

  • Need a Model: If we take action u from xt, what is xt+1?

    – ARE guarantees the existence of a distance preserving function fu with ∀t, ut = u ⇒ fu(xt) = xt+1.
    – We just need to extract fu from the representation.

SLIDE 35

Planning with Procrustes

  • Variation on the extended orthonormal Procrustes problem.

      Minimize (over A, b):   Σt ||Axt + b − xt+1||² · 1{ut=u}
      Subject to:             AᵀA = I

  • Although not convex, it has an analytic solution!
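That solution is the standard orthogonal Procrustes recipe: center both point sets, take the SVD of the cross-covariance, and set A = UVᵀ. A sketch (the helper name and test data are illustrative):

```python
import numpy as np

# Analytic fit of an action model: find A (A^T A = I) and b minimizing
# sum_t ||A x_t + b - x_{t+1}||^2 over steps that used the same action.
def fit_action_model(X, Y):
    """X[t] -> Y[t] pairs; returns orthonormal A and offset b via SVD."""
    xm, ym = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - xm, Y - ym
    U, _, Vt = np.linalg.svd(Yc.T @ Xc)
    A = U @ Vt                        # nearest orthonormal map
    b = ym - A @ xm
    return A, b

# Recover a known rotation + translation from noiseless pairs.
rng = np.random.default_rng(3)
theta = 0.8
A_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
b_true = np.array([0.5, -1.0])
X = rng.random((20, 2))
Y = X @ A_true.T + b_true
A_hat, b_hat = fit_action_model(X, Y)
```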

SLIDE 36

Planning with Search

  • Using our action models, we can search for a plan.

    [Figure: search tree from xT, branching on actions u1, u2, u3, u4 toward candidate xT+1 states.]

  • Simple iterative deepening depth-first search.
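The search itself is a few lines; here the action models are hypothetical 2-D translations standing in for the extracted fu:

```python
import numpy as np

# Iterative-deepening DFS over action sequences: apply action models f_u
# to the current state until the goal is reached within a tolerance.
MODELS = {"F": lambda x: x + np.array([0.0, 1.0]),
          "R": lambda x: x + np.array([1.0, 0.0])}

def iddfs_plan(start, goal, tol=1e-6, max_depth=8):
    goal = np.asarray(goal, float)

    def dfs(x, depth, plan):
        if np.linalg.norm(x - goal) <= tol:
            return plan
        if depth == 0:
            return None
        for u, f in MODELS.items():
            found = dfs(f(x), depth - 1, plan + [u])
            if found is not None:
                return found
        return None

    for limit in range(max_depth + 1):      # deepen the limit gradually
        plan = dfs(np.asarray(start, float), limit, [])
        if plan is not None:
            return plan
    return None

plan = iddfs_plan([0.0, 0.0], [2.0, 1.0])
```

Iterative deepening returns a shortest plan: here three steps (one Forward, two Right, in some order).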

SLIDE 37

Planning Results

    [Figure: planned paths in the 1st and 2nd dimensions of the manifold.]

SLIDE 38

Planning Results

    [Figure: planned path in the 1st and 2nd dimensions of the manifold.]

SLIDE 39

Planning Results

    [Figure: plan F, 7 × r, 9 × r shown across the 1st, 2nd, and 3rd dimensions of the manifold.]

SLIDE 40

Localization

  • Maps are Models.

    – Motion: P(xt+1|xt, ut).
    – Sensor: P(zt|xt).

  • Extract noisy action models from experience.

    – Noise is encoded in the later principal components.
    – Procrustes extracts the mean.
    – Regression errors define the variance.

  • Extract sensor model based on correlation between image distances and distance in the representation.

  • Monte Carlo localization (MCL).

SLIDE 41

Localization Results

  • Noisy actions in IMAGEBOT.
  • No ground truth with subjective representations.
  • Choose the “closest” point in the training trajectory.

                           Mean Error (Pixels)
    Trajectory             Oracle    ARE      Random
    Straight line          4.82      10.39    86.83
    “A” with translation   3.62      14.81    104.56
    “A” with zoom          1.71      19.58    84.67
