

SLIDE 1

High dimensional computing - the upside of the curse of dimensionality

Peer Neubert, Stefan Schubert, Kenny Schlegel (TU Chemnitz)

SLIDE 2

Topic: (Symbolic) Computation with large vectors

Rough synonyms:

  • High dimensional Computing
  • Hyperdimensional Computing
  • Hypervectors
  • Vector Symbolic Architectures
  • Computing with large random vectors
  • ...

From 2D and 3D to thousands of dimensions.

Pentti Kanerva. 2009. Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors. Cognitive Computation 1, 2 (2009), 139–159. https://doi.org/10.1007/s12559-009-9009-8

Neubert, P., Schubert, S., Protzel, P. 2019. An Introduction to Hyperdimensional Computing for Robotics. KI - Künstliche Intelligenz. https://doi.org/10.1007/s13218-019-00623-z

SLIDE 5

Reasons to attend

Interest in
  – Exploiting the “curse of dimensionality”
  – Extending (deep) ANNs with symbolic processing
  – Noise robustness (and power efficiency)

Related Fields
  – Robotics
  – Vector models for NLP
  – Information retrieval
  – Quantum cognition/probability/logic
  – ...

Goals
  – Introduction to the topic
  – Intuition of underlying mathematical properties
  – Link to available approaches and implementations
  – Outline potential applications
  – Provide some first hands-on experience


SLIDE 8

What we are doing

SLIDE 9

[Diagram: Robotics, AI, and high dimensional computing; where “we” and “you” come from]

  • Our background is neither classic AI nor mathematics
  • We are very much interested in any thoughts and feedback!

SLIDE 10

Outline

14:00 Welcome
14:05 Introduction to high dimensional computing
15:05 Implementations in form of Vector Symbolic Architectures
15:30 Coffee break
16:00 Vector encodings of real world data
16:30 Applications
17:15 Discussion and conclusion

SLIDE 11

Teaser application 1: “What is the Dollar of Mexico?”

Given are 2 records:

United States of America:   Name: USA, Capital City: Washington DC, Currency: Dollar
Mexico:                     Name: Mexico, Capital City: Mexico City, Currency: Peso

Question: What is the Dollar of Mexico?

Hyperdimensional computing approach:

1. Assign a random high-dimensional vector to each entity:
   ”Name” is a random vector NAM, ”USA” is a random vector USA, ”Capital city” is a random vector CAP, …

2. Calculate a single high-dimensional vector that contains all information:
   F = (NAM*USA + CAP*WDC + CUR*DOL) * (NAM*MEX + CAP*MCX + CUR*PES)

3. Calculate the query answer:
   F*DOL ~ PES

Credits: Pentti Kanerva
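
The whole pipeline fits in a few lines. Below is a minimal sketch, assuming a MAP-style VSA over bipolar vectors {-1,1}^D where bind (*) is elementwise multiplication and bundle (+) is elementwise addition; the vector names follow the slide, and the clean-up step implied by F*DOL ~ PES is a nearest neighbor search over all known vectors.

```python
# Minimal sketch of Kanerva's "Dollar of Mexico" example, assuming a
# bipolar {-1,1}^D space with elementwise multiplication as bind (*)
# and elementwise addition as bundle (+).
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # dimensionality

# 1. Assign a random high-dimensional vector to each entity
names = ["NAM", "CAP", "CUR", "USA", "WDC", "DOL", "MEX", "MCX", "PES"]
V = {n: rng.choice([-1, 1], size=D) for n in names}

# 2. One vector that contains all information of both records
usa = V["NAM"] * V["USA"] + V["CAP"] * V["WDC"] + V["CUR"] * V["DOL"]
mex = V["NAM"] * V["MEX"] + V["CAP"] * V["MCX"] + V["CUR"] * V["PES"]
F = usa * mex

# 3. Query: F*DOL is most similar to PES (clean-up via nearest neighbor)
query = F * V["DOL"]
sims = {n: np.dot(query, V[n]) / D for n in names}
print(max(sims, key=sims.get))  # -> PES (with high probability)
```

Why it works: since bind is self-inverse here, the cross term (CUR*DOL)*(CUR*PES) inside F reduces to DOL*PES, so F*DOL contains PES exactly once plus near-orthogonal noise terms.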


SLIDE 17

Teaser application 2: Place recognition in changing environments

Problem: Visual place recognition in changing environments

Image credits: M. Milford and G. F. Wyeth. SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2012.

SLIDE 19

Teaser application 2: Place recognition in changing environments

Deep Neural Network

SLIDE 22

Approaches to place recognition

Spectrum of approaches with different
  • Complexity
  • Amount of map information
  • Robustness
  • ...

Metric SLAM  <->  In between, e.g. SeqSLAM  <->  Pairwise image comparison (each pair compared independently)

SLIDE 25

Teaser application 2: Place recognition in changing environments

Deep Neural Network descriptors X_i are combined into a sequence descriptor:

Y_i = Σ_{k=0}^{d} ( X_{i−k} ⊗ P^{−k} )

This is where the hyperdimensional magic happens
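
A minimal sketch of this sequence encoding, assuming the X_i are bipolar image descriptors and the binding with P^{−k} is realized by applying a fixed random permutation k times (a common "protect" position encoding; an assumption for illustration, not necessarily the authors' exact implementation):

```python
# Sketch: bundle the last d+1 descriptors, each "protected" by k
# applications of a fixed random permutation (stands in for X ⊗ P^-k).
import numpy as np

rng = np.random.default_rng(1)
D, d = 4096, 3                        # dimensionality, sequence depth
perm = rng.permutation(D)             # fixed random permutation P

def permute_k(x, k):
    for _ in range(k):                # apply P k times
        x = x[perm]
    return x

X = rng.choice([-1, 1], size=(100, D))   # toy route of 100 descriptors

def seq_descriptor(i):
    # Y_i = sum_{k=0..d} permute_k(X_{i-k}, k)
    return sum(permute_k(X[i - k], k) for k in range(d + 1))

Y = np.stack([seq_descriptor(i) for i in range(d, 100)])

# second traversal of the same route, with 15% of descriptor entries flipped
X2 = X * rng.choice([-1, 1], size=X.shape, p=[0.15, 0.85])
query = sum(permute_k(X2[10 - k], k) for k in range(d + 1))
print(np.argmax(Y @ query) + d == 10)    # -> True with high probability
```

The permutation keeps the k-th previous descriptor from blurring into the current one: without it, the bundle would be similar to any place that shares a single image; with it, only aligned sequences score high.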

SLIDE 27

Outline: Introduction to high dimensional computing

1) Historical note
2) High dimensional vector spaces and where they are used
3) Mathematical properties of high dimensional vector spaces
4) Vector Symbolic Architectures, or “How to do symbolic computations using vector spaces”
   including “What is the Dollar of Mexico?”

SLIDE 28

Historical note

  • Ancient Greeks: Roots of geometry
    – Plato: geometric theory of creation and elements
    – Journey and work of Aristotle
    – Euclid: “Elements of geometry”

  • Modern scientific progress: Geometry and vectors
    – 1637 Descartes: “Analytic Geometry”
    – 1844 Graßmann and 1853 Hamilton introduce vectors
    – 1936 Birkhoff and von Neumann introduce quantum logic

  • More recently: Hyperdimensional Computing
    – Kanerva: Sparse Distributed Memory, computing with large random vectors
    – Smolensky, Plate, Gayler: Vector Symbolic Architectures
    – Fields: Vector models for NLP, quantum cognition, ...

See: “Geometry and Meaning” by Dominic Widdows, 2004, CSLI Publications, Stanford, ISBN 9781575864488

SLIDE 31

Vector space

  • e.g., n-dimensional real valued vectors
  • Intuitive meaning in 1 to 3 dimensional Euclidean spaces,
    e.g., position of a book in a rack, bookshelf or library

Image: Ralf Roletschek / Roletschek.at. Science library of Upper Lusatia in Görlitz, Germany.

SLIDE 35

Where are such vectors used?

  • Feature vectors, e.g., in computer vision or information retrieval
  • (Intermediate) representations in deep ANNs
  • Vector models for natural language processing
  • Memory and storage models, e.g., Pentti Kanerva’s Sparse Distributed Memory or DeepMind’s long short-term memory
  • Computational brain models, e.g. Jeff Hawkins’ HTM or Chris Eliasmith’s SPAUN
  • Quantum cognition approaches
  • ...

SLIDE 36

Hierarchical Temporal Memory

Jeff Hawkins

Image source: https://numenta.com/

SLIDE 37

Hierarchical Temporal Memory: Neuron model

Sparse Distributed Representations
SLIDE 40

Quantum Cognition

  • Not quantum mind (= “the brain works by micro-physical quantum mechanics”)
  • A theory that models cognition with the same math that is used to describe quantum mechanics
  • Important tool: representation using vector spaces and vector operators (e.g. sums and projections)
  • Motivation: Some paradoxical or irrational judgements of humans can’t be explained using classical probability theory and logic, e.g. conjunction and disjunction errors or order effects

Busemeyer, J., & Bruza, P. (2012). Quantum Models of Cognition and Decision. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511997716

SLIDE 42

Outline: Introduction to high dimensional computing

1) Historical overview
2) High dimensional vector spaces and where they are used
3) Mathematical properties of high dimensional vector spaces
4) Vector Symbolic Architectures, or “How to do symbolic computations using vector spaces”
   including “What is the Dollar of Mexico?”

SLIDE 43

Four properties of high-dimensional vector spaces

“The good, the bad, and the ugly”: The Trivial, The Bad, The Surprising, The Good

SLIDE 44

Properties 1/4: High-dimensional vector spaces have huge capacity (The Trivial)

  • Capacity grows exponentially
  • Here: “high-dimensional” means thousands of dimensions
  • This property holds for many kinds of vector spaces:
    – Binary, e.g. {0, 1}^n, {-1, 1}^n
    – Ternary, e.g. {-1, 0, 1}^n
    – Real, e.g. [-1, 1]^n
    – Sparse or dense

SLIDE 47

Properties 1/4: High capacity

Binary vectors; sparsity d is the ratio of “1s”.

[Plot: capacity over # dimensions (500 to 2000), growing from 10^0 toward 10^80 for sparsity d = 0.01, 0.03, 0.05, 0.10 and dense (d = 1); a marker indicates the approximate number of atoms in the universe.]
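
The capacity numbers behind these curves are easy to reproduce: the number of distinct binary vectors with n dimensions and exactly round(d*n) ones is the binomial coefficient C(n, d*n), and a dense binary space has 2^n states. A quick sketch:

```python
# Order of magnitude of the capacity of sparse and dense binary vectors.
from math import comb

for n in (500, 1000, 2000):
    for d in (0.01, 0.03, 0.05, 0.10):
        k = round(d * n)                     # number of "1s"
        digits = len(str(comb(n, k))) - 1    # floor(log10(C(n, k)))
        print(f"n={n}, d={d:.2f}: capacity ~ 10^{digits}")
    print(f"n={n}, dense: capacity = 2^{n} ~ 10^{len(str(2**n)) - 1}")
```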

SLIDE 51

Properties 2/4: Nearest neighbor becomes unstable or meaningless (The Bad)

The downside of so much space. Bellman, 1961: “Curse of dimensionality”

“Algorithms that work in low dimensional space fail in higher dimensional spaces.”

We require exponential amounts of samples to represent space with statistical significance (e.g., Hastie et al. 2009).

Bellman, R. E. (1961). Adaptive Control Processes: A Guided Tour. MIT Press, Cambridge.
Hastie, Tibshirani and Friedman (2009). The Elements of Statistical Learning (2nd edition). Springer-Verlag.

SLIDE 53

Example: Sorted library

History | Novels | Geometry | Sports

  • Library contains books about 4 topics
  • We can’t infer the topic from the pose directly, only by nearby samples

[Figure: a single sample per topic and a query position]

SLIDE 56

Example: Sorted library

The more dimensions, the more samples are required to represent the shape of the clusters. Exponential growth!

[Figure: clusters History, Novels, Geometry, Sports shown in 1, 2, and 3 dimensions]

SLIDE 60

Properties 2/4: Nearest neighbor becomes unstable or meaningless

  • Beyer K, Goldstein J, Ramakrishnan R, Shaft U (1999). When is nearest neighbor meaningful? In: Database theory, ICDT’99. Springer, Berlin, Heidelberg, pp 217–235.
    “under a broad set of conditions (much broader than independent and identically distributed dimensions)”
  • Aggarwal CC, Hinneburg A, Keim DA (2001). On the surprising behavior of distance metrics in high dimensional space. In: Database theory, ICDT 2001. Springer, Berlin, Heidelberg, pp 420–434.

[Figure: distance distributions concentrating with increasing #dimensions]

SLIDE 64

Properties 3/4: Time to gamble! (The Surprising)

SLIDE 65

Properties 3/4: Experiment

  • Random vectors: uniformly distributed angles, obtained by sampling each dimension i.i.d. ~ N(0,1)
  • I bet: Given a random vector A, we can independently sample a second random vector B and it will be almost orthogonal (+/- 5°) ...
    … if we are in a 4,000 dimensional vector space.

[Figure: vectors A and B; a +/- 5° band around orthogonality to A]
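
This bet can be checked with a few lines of simulation (a sketch; 1000 sampled pairs per dimensionality):

```python
# Fraction of random vector pairs whose angle is within 90° +/- 5°,
# for entries sampled i.i.d. from N(0,1).
import numpy as np

rng = np.random.default_rng(42)

def angle_deg(a, b):
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(cos))

for dim in (3, 100, 4000):
    angles = [angle_deg(rng.standard_normal(dim), rng.standard_normal(dim))
              for _ in range(1000)]
    frac = np.mean([abs(a - 90.0) <= 5.0 for a in angles])
    print(f"dim={dim}: {100 * frac:.1f}% of pairs within 90° +/- 5°")
```

At 4,000 dimensions essentially every pair falls into the band, so the bet is safe; at 3 dimensions it would be a bad bet.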

SLIDE 72

Properties 3/4: Random vectors are very likely almost orthogonal

  • Random vectors: i.i.d., uniform

Geometric intuition: on the unit n-sphere, the surface area of the band that is almost orthogonal to a given vector grows much faster with the number of dimensions than the area of the caps of similar vectors.

[Plots: surface area of the unit n-sphere over # dimensions (5 to 25); ratio of surface areas A_almost_orthogonal / A_similar, growing to about 10^25 at 25 dimensions; probability that a randomly sampled vector is almost orthogonal, approaching 1 for a few hundred dimensions]
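
The sampling probability can also be computed directly: for a uniformly random direction on the n-sphere, the angle θ to a fixed vector has density proportional to sin^(n−2)(θ). A dependency-free numeric check of the curves above:

```python
# Probability that a random direction is within 90° +/- 5° of a fixed
# vector, via the angle density sin^(n-2)(θ) on the unit n-sphere.
import numpy as np

def band_probability(n, half_width_deg=5.0):
    theta = np.linspace(0.0, np.pi, 200_001)
    density = np.sin(theta) ** (n - 2)
    in_band = np.abs(np.degrees(theta) - 90.0) <= half_width_deg
    return density[in_band].sum() / density.sum()   # dθ cancels in the ratio

for n in (3, 25, 100, 1000, 4000):
    print(f"n={n}: P(almost orthogonal) = {band_probability(n):.3f}")
```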

SLIDE 78

Properties 4/4: Noise has low influence on nearest neighbor queries with random vectors (The Good)

  • Example 1:
    1. Database: one million random feature vectors from [0,1]^d
    2. Query: noisy measurements of the feature vectors

    What is the probability of getting the wrong query answer?

[Plot: probability of a wrong query answer over # dimensions (50 to 200) for noise levels 0.10, 0.25, 0.50. Inset: values of an example noisy vector over the dimension index.]
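
A minimal simulation of Example 1 (an assumption: the slide does not specify the noise model, so additive Gaussian noise is used here, and the database is shrunk from one million to 10,000 vectors for speed):

```python
# Probability of retrieving the wrong database vector from a noisy query.
import numpy as np

rng = np.random.default_rng(7)
n_db, trials, sigma = 10_000, 200, 0.25

for dim in (50, 100, 200):
    db = rng.random((n_db, dim))             # random feature vectors in [0,1]^d
    idx = rng.integers(0, n_db, trials)      # which vectors we "measure"
    queries = db[idx] + rng.normal(0.0, sigma, (trials, dim))
    wrong = 0
    for q, true_i in zip(queries, idx):
        d2 = ((db - q) ** 2).sum(axis=1)     # squared Euclidean distances
        wrong += int(np.argmin(d2) != true_i)
    print(f"dim={dim}: P(wrong) ~ {wrong / trials:.2f}")
```
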
slide-83
SLIDE 83

Peer Neubert, TU Chemnitz

Properties 4/4: Noise has low influence on nearest neighbor queries with random vectors

  • Example 2:

Database 1 Mio vectors

  • 1. One million random

feature vectors [0,1]d

  • 2. query: noisy measurements
  • f feature vectors

What if the noise-vector is again a database vector?

slide-84
SLIDE 84

Peer Neubert, TU Chemnitz

Properties 4/4: Noise has low influence on nearest neighbor queries with random vectors

  • Example 2:

Database 1 Mio vectors

  • 1. One million random

feature vectors [0,1]d

  • 2. query: sum of feature

vectors How many database vectors can we add (=bundle) and still get exactly all the added vectors as answer?

slide-85
SLIDE 85

Peer Neubert, TU Chemnitz

Properties 4/4: Noise has low influence on nearest neighbor queries with random vectors

  • Example 2:

How many database vectors can we add (=bundle) and still get exactly all the added vectors as answer?

200 400 600 800 1000 # dimensions 0.2 0.4 0.6 0.8 1 Probability of wrong query answer k=2 k=3 k=4 k=5

[0,1]d

slide-86
SLIDE 86

Peer Neubert, TU Chemnitz

Properties 4/4: Noise has low influence on nearest neighbor queries with random vectors

  • Example 2:

How many database vectors can we add (=bundle) and still get exactly all the added vectors as answer?

200 400 600 800 1000 # dimensions 0.2 0.4 0.6 0.8 1 Probability of wrong query answer k=2 k=3 k=4 k=5 200 400 600 800 1000 # dimensions 0.2 0.4 0.6 0.8 1 Probability of wrong query answer k=2 k=3 k=4 k=5 k=6 k=7 k=8 k=9 k=10

[0,1]d [-1,1]d
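
A small simulation of Example 2, assuming bipolar database vectors in {-1,1}^d (standing in for [-1,1]^d), bundling as the elementwise sum, and retrieval of the k most similar vectors by dot product (database again shrunk to 10,000 vectors for speed):

```python
# How often does a bundle of k database vectors fail to return exactly
# those k vectors as its k nearest neighbors?
import numpy as np

rng = np.random.default_rng(3)
n_db = 10_000

def p_wrong(db, k, trials=100):
    wrong = 0
    for _ in range(trials):
        idx = rng.choice(len(db), size=k, replace=False)
        bundle = db[idx].sum(axis=0)             # bundling = elementwise sum
        top_k = np.argsort(db @ bundle)[-k:]     # k most similar vectors
        wrong += int(set(top_k) != set(idx))
    return wrong / trials

for dim in (200, 600, 1000):
    db = rng.choice([-1, 1], size=(n_db, dim))   # bipolar database vectors
    print(f"dim={dim}: " + ", ".join(
        f"k={k}: P(wrong) ~ {p_wrong(db, k):.2f}" for k in (2, 5, 10)))
```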

SLIDE 87

Example application: Object recognition

Database query

[Plot: accuracy over the angular distance of the query to the known vectors (45° to 180°), comparing individual vectors, a bundle, and static bundles B4 and B8]

SLIDE 91

How to store structured data?

Given are 2 records; the keys (Name, Capital City, Currency) are the roles, the values are the fillers:

United States of America:   Name: USA, Capital City: Washington DC, Currency: Dollar
Mexico:                     Name: Mexico, Capital City: Mexico City, Currency: Peso

Question: What is the Dollar of Mexico?

Hyperdimensional computing approach:

1. Assign a random high-dimensional vector to each entity:
   ”Name” is a random vector NAM, ”USA” is a random vector USA, ”Capital city” is a random vector CAP, …

2. Calculate a single high-dimensional vector that contains all information:
   F = (NAM*USA + CAP*WDC + CUR*DOL) * (NAM*MEX + CAP*MCX + CUR*PES)

3. Calculate the query answer:
   F*DOL ~ PES

Credits: Pentti Kanerva

SLIDE 93

Vector Symbolic Architectures (VSA)

  • VSA = high dimensional vector space + operations
  • Operations in a VSA:
    – Bind()
    – Bundle()
    – Permute()/Protect()
  • Additionally:
    – Encoding/decoding
    – Clean-up memory (a database of the known vectors)

Term: Gayler RW (2003). Vector symbolic architectures answer Jackendoff’s challenges for cognitive neuroscience. In: Proc. of ICCS/ASCS Int. Conf. on Cognitive Science, pp 133–138, Sydney, Australia.

Pentti Kanerva. 2009. Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors. Cognitive Computation 1, 2 (2009), 139–159. https://doi.org/10.1007/s12559-009-9009-8

SLIDE 97

VSA operations

  • Bundling +
    Goal: combine two vectors, such that the result is similar to both vectors.
    Application: superpose information

  • Binding ⊗
    Goal: combine two vectors, such that
      – the result is nonsimilar to both vectors
      – one can be recreated from the result using the other
    Application: bind value “a” to variable “x” (or a “filler” to a “role” or ...)

SLIDE 99

VSA operations

  • Binding ⊗
    – Properties:
      • Associative, commutative
      • Self-inverse: X ⊗ X = I (or an additional unbind operator)
      • Result nonsimilar to the inputs
    – Example:
      • Hypervector space {-1,1}^D
      • binding can be elementwise multiplication
    – Application: bind value “a” to variable “x” (or a “filler” to a “role” or ...)
      • Bind:   x = a  →  H = X ⊗ A
      • Unbind: x = ?  →  X ⊗ H = X ⊗ (X ⊗ A) = A
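
A minimal sketch of this self-inverse binding in a bipolar {-1,1}^D space, where bind is elementwise multiplication (so X ⊗ X = I, the all-ones identity vector):

```python
# Bind a value A to a variable X and recover it again by rebinding X.
import numpy as np

rng = np.random.default_rng(5)
D = 10_000
X, A = rng.choice([-1, 1], size=(2, D))   # variable and value hypervectors

H = X * A              # bind:   H = X ⊗ A
recovered = X * H      # unbind: X ⊗ H = X ⊗ X ⊗ A = A
assert np.array_equal(recovered, A)

# the bound result is nonsimilar (near-orthogonal) to both inputs:
print(np.dot(H, X) / D, np.dot(H, A) / D)   # both close to 0
```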

SLIDE 102

VSA operations

  • Bundling +
    – Properties:
      • Associative, commutative; binding distributes over bundling
      • Result is similar to both inputs
    – Example:
      • Hypervector space [-1,1]^D
      • bundling can be the elementwise sum, elementwise normalized to [-1,1]
    – Application:
      • Bundle:  {x = a, y = b}  →  H = X ⊗ A + Y ⊗ B
      • Unbind from a bundle:
        x = ?  →  X ⊗ H = X ⊗ (X ⊗ A + Y ⊗ B) = (X ⊗ X ⊗ A) + (X ⊗ Y ⊗ B) = A + noise
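
A minimal sketch of unbinding from a bundle, again assuming bipolar vectors with elementwise multiplication as bind and the elementwise sum as bundle. The result X ⊗ H is A plus a noise term, which a clean-up memory (nearest neighbor over the known vectors) removes:

```python
# Store {x = a, y = b} in one vector H and query the value of x.
import numpy as np

rng = np.random.default_rng(9)
D = 10_000
X, Y, A, B = rng.choice([-1, 1], size=(4, D))

H = X * A + Y * B      # H = X ⊗ A + Y ⊗ B
noisy_a = X * H        # = A + X ⊗ Y ⊗ B, the second term is noise

# clean-up memory: which known vector is most similar?
known = {"A": A, "B": B, "X": X, "Y": Y}
sims = {name: np.dot(noisy_a, v) / D for name, v in known.items()}
print(max(sims, key=sims.get))   # -> "A"
```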

SLIDE 106

Teaser application 1: “What is the Dollar of Mexico?”

Given are 2 records:

United States of America:   Name: USA, Capital City: Washington DC, Currency: Dollar
Mexico:                     Name: Mexico, Capital City: Mexico City, Currency: Peso

Question: What is the Dollar of Mexico?

Hyperdimensional computing approach, now with all ingredients in place:

1. Assign a random high-dimensional vector to each entity (roles and fillers):
   ”Name” is a random vector NAM, ”USA” is a random vector USA, ”Capital city” is a random vector CAP, …

2. Calculate a single high-dimensional vector that contains all information (bind roles to fillers, bundle the pairs, bind the records):
   F = (NAM*USA + CAP*WDC + CUR*DOL) * (NAM*MEX + CAP*MCX + CUR*PES)

3. Calculate the query answer (unbind and clean up):
   F*DOL ~ PES

Credits: Pentti Kanerva