Rethinking Data for Intelligent Computing, Julie Pitt (@yakticus)



slide-1
SLIDE 1

Rethinking Data for Intelligent Computing

Julie Pitt (@yakticus)

slide-2
SLIDE 2
slide-3
SLIDE 3

how I got here

Jeff Hawkins

slide-4
SLIDE 4

the problem: build machines capable of intelligent behavior

slide-5
SLIDE 5

questions

■ what makes us intelligent?
■ how does perception work?
■ how does action work?
■ how does learning work?
■ what does this mean for AI and data?

slide-6
SLIDE 6

1. what makes us intelligent?

slide-7
SLIDE 7

The origin of the asymmetry [of time] we experience can be traced all the way back to the orderliness of the universe near the big bang.

SEAN M. CARROLL, Scientific American, June 2008

slide-8
SLIDE 8

The defining characteristic of biological systems is that they maintain their states and form in the face of a constantly changing environment.

KARL FRISTON, Nature Reviews, February 2010

slide-9
SLIDE 9

free energy principle

Karl Friston

slide-10
SLIDE 10

all possible states

intelligent agents resist entropy

homeostasis (i.e., survival)

slide-11
SLIDE 11

entropy = surprise (averaged over time)

■ low entropy: high probability, low surprise
■ high entropy: low probability, high surprise
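A minimal numeric sketch of the relationship on this slide. It assumes surprise is measured as the negative log probability of a state (the usual information-theoretic definition, which the slides do not spell out) and treats the average over states as a stand-in for the average over time. The function names and example distributions are illustrative only.

```python
import math

def surprise(p):
    """Surprise (self-information) of a state with probability p, in nats."""
    return -math.log(p)

def entropy(probs):
    """Entropy is the expected surprise; zero-probability states contribute nothing."""
    return sum(p * surprise(p) for p in probs if p > 0)

# An agent that stays in a few states (homeostasis) has low entropy...
narrow = [0.97, 0.01, 0.01, 0.01]
# ...while one that visits every state equally often has the maximum entropy.
uniform = [0.25, 0.25, 0.25, 0.25]

print(surprise(0.97) < surprise(0.01))      # a likely state is less surprising
print(entropy(narrow) < entropy(uniform))   # fewer visited states, lower entropy
```

Both comparisons print `True`: keeping to a small set of high-probability states is what keeps average surprise low.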

slide-12
SLIDE 12

intelligent agents minimize surprise?

slide-13
SLIDE 13

surprise can’t be measured* (*directly)

diagram labels: the world (outside), sensory states, model of the world (inside)

slide-14
SLIDE 14

surprise ≤ free energy

diagram labels: model of the world, sensory states, free energy, surprise

slide-15
SLIDE 15

free energy principle: intelligent systems minimize free energy, which is an upper bound on surprise

free energy ≥ surprise
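The bound can be checked numerically on a toy model. The sketch below assumes the standard variational form of free energy, F = Σ q(c) · (ln q(c) − ln p(s, c)), where q is the agent's belief about hidden causes c and s is the observed sensory state. The two-cause model and its numbers are hypothetical, chosen only to make the inequality visible.

```python
import math

def free_energy(q, joint):
    """Variational free energy F = sum_c q(c) * (ln q(c) - ln p(s, c))."""
    return sum(qc * (math.log(qc) - math.log(pj))
               for qc, pj in zip(q, joint) if qc > 0)

# Hypothetical two-cause model for one observed sensory state s.
prior = [0.5, 0.5]        # p(c)
likelihood = [0.9, 0.2]   # p(s | c)
joint = [p * l for p, l in zip(prior, likelihood)]   # p(s, c)

evidence = sum(joint)               # p(s): needs a sum over every possible cause
surprise = -math.log(evidence)
posterior = [j / evidence for j in joint]            # p(c | s)

# Free energy never drops below surprise, whatever the belief q is.
for q in ([0.5, 0.5], [0.7, 0.3], posterior):
    print(round(free_energy(q, joint), 4), ">=", round(surprise, 4))
```

Free energy depends only on the belief `q` and the model, so it can be evaluated and reduced without computing surprise itself. In this toy model it meets the bound exactly when `q` equals the posterior.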

slide-16
SLIDE 16

how do we minimize free energy?

diagram labels: the world, senses, action, model of the world, predictions, beliefs

  • 1. form predictions
  • 2. change the world
  • 3. form beliefs

slide-17
SLIDE 17

corollary to free energy principle

perception, action and learning are side effects of free energy minimization

  • 1. form predictions → perception
  • 2. change the world → action
  • 3. form beliefs → learning
slide-18
SLIDE 18

2. how does perception work?

slide-19
SLIDE 19

demonstration

slide-20
SLIDE 20
slide-21
SLIDE 21
slide-22
SLIDE 22

you perceived the dalmatian when you could explain it

diagram labels: the world, sensory input, action output, model of the world, prediction, beliefs

slide-23
SLIDE 23

the model is hierarchical

several levels of abstraction between senses and “dalmatian” prediction

diagram labels: senses, dalmatian prediction, level 0 to level N (abstraction)

slide-24
SLIDE 24

how did your brain form the prediction?

  • 1. form hypotheses
  • 2. select best hypotheses
  • 3. explain evidence
slide-25
SLIDE 25

message passing

  • 1. evidence used to form hypotheses
  • 2. inhibition used to select best hypotheses
  • 3. inferred causes used to explain evidence

diagram labels: 1. evidence, 2. inhibition, 3. inferred cause
slide-26
SLIDE 26
  • 1. form hypotheses

■ each node represents a belief
■ belief = learned coincidence
○ e.g., frequent evidence of floppy ears, four legs and spots is caused by a dalmatian

diagram labels: belief encoded in connections, level N, level N - 1

slide-27
SLIDE 27
  • 1. form hypotheses

■ beliefs invoked by evidence from below
○ more abstract (general) than evidence
○ formulates a hypothesis that the belief is true

diagram label: evidence

slide-28
SLIDE 28

  • 2. select best hypotheses

■ related beliefs share connections
○ shared connections = common features
○ leads to conflicting hypotheses

diagram label: common features

slide-29
SLIDE 29
  • 2. select best hypotheses

■ hypotheses with shared evidence compete
○ strongest evidence + prediction wins
○ winners propagate, losers do not

diagram labels: loser: 2 inputs, winner: 4 inputs
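The first two steps (form hypotheses, then select the best by inhibition) can be sketched with plain sets. This is an illustration under stated assumptions, not the speaker's implementation: the `BELIEFS` table, its feature names, and the "more supporting inputs wins" rule are hypothetical. The sketch also ignores the top-down prediction that the slide says contributes to winning.

```python
# Hypothetical beliefs: each is a learned coincidence of lower-level features.
BELIEFS = {
    "dalmatian": {"floppy ears", "four legs", "spots", "tail"},
    "cow": {"four legs", "spots", "horns", "udder"},
    "teapot": {"spout", "handle", "lid"},
}

def form_hypotheses(evidence):
    """Step 1: every belief that receives some evidence becomes a hypothesis."""
    return {name: features & evidence
            for name, features in BELIEFS.items()
            if features & evidence}

def select_best(hypotheses):
    """Step 2: hypotheses that share evidence compete; ties inhibit each other."""
    winners = []
    for name, support in hypotheses.items():
        # Rivals are the other hypotheses supported by some of the same evidence.
        rivals = [s for other, s in hypotheses.items()
                  if other != name and s & support]
        if all(len(support) > len(r) for r in rivals):
            winners.append(name)
    return winners

evidence = {"floppy ears", "four legs", "spots", "tail"}
print(select_best(form_hypotheses(evidence)))   # ['dalmatian']
```

Here "dalmatian" receives four inputs and "cow" only two, mirroring the winner and loser in the diagram. A hypothesis that shares no evidence with any other (for example, "teapot" when "handle" is also sensed) has no rival and propagates alongside the winner.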

slide-30
SLIDE 30
  • 3. explain evidence

■ selected hypotheses that were predicted become inferred causes of evidence
■ inferred causes form lower level predictions

diagram labels: 1. prediction, 2. inferred cause, 3. new predictions

slide-31
SLIDE 31

belief message flow

flowchart labels: belief node, evidence in, evidence out, inferred cause in, inferred cause out, predicted? (yes / no), update, inhibition, delete, level N - 1, level N, level N + 1

slide-32
SLIDE 32

hierarchical prediction

■ high dimensional representation
○ leads to simultaneous predictions
○ allows parallel perceptions
■ predictions fill in top to bottom
○ many tasks become subconscious

diagram label: subconscious perception

slide-33
SLIDE 33

perception & free energy

perception is a side effect of free energy minimization

■ evidence = free energy
○ only prediction error is propagated forward
■ fully explaining evidence minimizes free energy
○ prediction = explanation of the future
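A small sketch of "only prediction error is propagated forward", using sets of features. Treating the number of unexplained features as a stand-in for free energy is an assumption made for illustration. The `propagate` function and the feature names are hypothetical.

```python
def propagate(evidence, predicted):
    """Split incoming evidence into what the prediction explains and what it does not."""
    explained = evidence & predicted
    prediction_error = evidence - predicted   # only this part is sent to the level above
    return explained, prediction_error

evidence = {"spots", "four legs", "floppy ears"}

# Before "dalmatian" is predicted, nothing is explained and everything propagates.
_, error_before = propagate(evidence, predicted=set())

# Once the level above predicts a dalmatian, the same evidence is fully explained.
dalmatian_prediction = {"spots", "four legs", "floppy ears", "tail"}
_, error_after = propagate(evidence, predicted=dalmatian_prediction)

print(len(error_before), len(error_after))   # 3 0
```

The sensory input is identical in both calls. What changes is the prediction, which is why perceiving the dalmatian coincides with being able to explain it.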

slide-34
SLIDE 34

3. how does action work?

slide-35
SLIDE 35

hypothesis: action is a special case of perception

proprioception

slide-36
SLIDE 36

active inference

■ actions inferred using proprioception
■ actions generated by prediction

diagram labels: proprioception, action, motor predictions, motor state, nervous system fulfills predictions

slide-37
SLIDE 37

action plan = prediction

  • 1. hunger (evidence of “eat food” belief)
  • 2. “eat food” action plan (prediction)
  • 3. motor predictions (result in action)

diagram labels: interoceptive, proprioceptive

slide-38
SLIDE 38

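action plan unfolds over time

diagram: a plan hierarchy, most abstract level first, each level unfolding left to right over time

■ get food from fridge & eat
■ walk to fridge → get food & eat
■ get up → walk towards fridge → open door & grab food → eat
■ stretch glutes → balance → turn → walk → open door → grab food → put in mouth → chew

time axis: from “sitting in office chair, hungry” to “eating, not hungry”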

slide-39
SLIDE 39

action & free energy

action:
■ minimizes free energy by changing the world to match predictions
■ is perception of future motor states
■ takes time
○ must be able to learn causes
○ temporal proximity
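A minimal sketch of action as "changing the world to match predictions". It assumes a single scalar motor state and a fixed proprioceptive prediction. The `act` function, the `gain` parameter, and the step count are hypothetical and not taken from the talk.

```python
def act(motor_state, predicted_state, gain=0.5, steps=10):
    """Move the motor state toward the proprioceptive prediction, one step at a time."""
    errors = []
    for _ in range(steps):
        error = predicted_state - motor_state   # proprioceptive prediction error
        errors.append(abs(error))
        motor_state += gain * error             # act on the world; the prediction stays fixed
    return motor_state, errors

final_state, errors = act(motor_state=0.0, predicted_state=1.0)
print(errors[:4])     # [1.0, 0.5, 0.25, 0.125]
print(final_state)    # close to, but not exactly, 1.0
```

The error shrinks over several steps rather than vanishing at once, which is the sense in which action "takes time": the prediction precedes the motor state that eventually fulfills it.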

slide-40
SLIDE 40

4. how does learning work?

slide-41
SLIDE 41

prediction error triggers learning

■ evidence incorporated into beliefs
○ better explain the world in future
■ implemented as Hebbian learning

diagram labels: no evidence (weaken), evidence (strengthen)
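The strengthen and weaken rule can be sketched as a simple Hebbian-style update on one belief's connection weights. This is an illustration under stated assumptions, not the speaker's Scala implementation: weights are kept in [0, 1], the `rate` value is hypothetical, and learning is skipped whenever the prediction already matched the evidence.

```python
def hebbian_update(weights, evidence, rate=0.1):
    """Strengthen connections to active inputs and weaken the rest; weights stay in [0, 1]."""
    return {
        feature: (w + rate * (1.0 - w)) if feature in evidence else (w - rate * w)
        for feature, w in weights.items()
    }

def learn(weights, evidence, predicted, rate=0.1):
    """Prediction error triggers learning; a correct prediction leaves beliefs untouched."""
    if evidence == predicted:
        return weights
    return hebbian_update(weights, evidence, rate)

weights = {"spots": 0.5, "floppy ears": 0.5, "horns": 0.5}
updated = learn(weights,
                evidence={"spots", "floppy ears"},
                predicted={"spots", "horns"})
print(updated["spots"] > 0.5 > updated["horns"])   # True
```

Connections to inputs that arrived as evidence grow, while the connection to "horns", which was predicted but never observed, decays.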

slide-42
SLIDE 42

learning & free energy

■ learning alters beliefs

○ affords long term reduction of uncertainty (i.e., free energy)

■ learning can be fast or slow

○ form new beliefs quickly
○ modify existing beliefs slowly
○ explains rapid learning during childhood

slide-43
SLIDE 43

5. what does this mean for AI and data?

slide-44
SLIDE 44

will computing as we know it cease to exist?

slide-45
SLIDE 45
slide-46
SLIDE 46

we’ll still need today’s computers

■ von Neumann architectures excel at processing

○ add two floating point numbers
○ execute deterministic code
○ store and retrieve data

■ intelligent machines will use computers

slide-47
SLIDE 47

what will change

an intelligent machine interacts with its environment using its sensors and actuators...

...it learns through experience and leverages learnings to minimize free energy

slide-48
SLIDE 48

who’s the judge?

if you can construct a machine that can judge whether behavior is intelligent, you have solved the problem of intelligence

slide-49
SLIDE 49

what might machines be capable of in the future?

slide-50
SLIDE 50

go beyond human time scales

■ “stretch” out time

○ e.g., wake up once per decade
○ observe long term consequences

■ “compress” time

○ e.g., microsecond resolution
○ possess superhuman reflexes

slide-51
SLIDE 51

explore new sensory dimensions

■ live in virtual worlds, e.g.

○ sensing and reacting to internet traffic
○ controlling a video game or VR character

■ experience the world on a global scale, e.g.

○ weather patterns
○ seismic activity
○ financial markets

slide-52
SLIDE 52

do the boring work

■ with limitless attention spans, do tedious work

○ monitor a patch of sky
○ keep a lookout for intruders
○ construct detailed virtual worlds

slide-53
SLIDE 53

develop communication

■ communication will emerge from experience
○ result of learning to predict other agents
○ full-blown language requires a rich model and significant horsepower

slide-54
SLIDE 54

how does data need to change?

slide-55
SLIDE 55

data needs to be in the present

■ each sample taken “now”
○ data streams are parallel
■ action is in the present
○ can’t change the past
○ can exploit coherence in time

diagram label: time

slide-56
SLIDE 56

data needs to inspire action

■ sensory data format is free energy

○ encoding depends on the goal
○ e.g., maintain temperature range → lots of free energy when “too hot” or “too cold”
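The temperature example can be sketched as an encoder that turns a raw reading into a free-energy-like signal. The target range, the linear growth outside it, and the function name are all assumptions chosen for illustration. The slide only says the signal should be large when the agent is "too hot" or "too cold".

```python
def thermal_free_energy(temp_c, low=36.0, high=38.0):
    """Zero inside the target range, growing with distance outside it."""
    if temp_c < low:
        return low - temp_c      # too cold
    if temp_c > high:
        return temp_c - high     # too hot
    return 0.0                   # within range: nothing to act on

for reading in (30.0, 37.0, 40.0):
    print(reading, thermal_free_energy(reading))
```

A sensor encoded this way carries the goal with it: readings inside the range produce no signal, and readings outside it produce a signal that action can reduce.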

slide-57
SLIDE 57

data can be noisy

■ leave noise in naturally noisy sensors

○ machines can infer even in presence of noise

slide-58
SLIDE 58

data need not be human-readable

■ machines can have sensors and actuators that interact with APIs

○ API data expressed as free energy
○ intermediate representation (e.g., prose, visualizations) not needed

slide-59
SLIDE 59

data need not be labeled

■ learning is unsupervised

○ need learning experiences, not training data
○ e.g., explore a maze containing some reward

■ learning is online

○ no separate training period
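A minimal sketch of online, unlabeled learning: the belief is updated on every sample as it arrives, with no labels and no separate training phase. The `OnlineBelief` class, its `rate` parameter, and the sample stream are hypothetical.

```python
class OnlineBelief:
    """Tracks a running expectation of one sensor, updated on every sample."""

    def __init__(self, rate=0.2):
        self.rate = rate
        self.expected = None

    def observe(self, sample):
        """Return the prediction error for this sample, then learn from it."""
        if self.expected is None:
            self.expected = sample       # first experience: adopt it as the belief
            return 0.0
        error = sample - self.expected
        self.expected += self.rate * error
        return error

belief = OnlineBelief()
stream = [10.0, 12.0, 12.0, 12.0, 12.0]   # the world changes after the first sample
errors = [abs(belief.observe(x)) for x in stream]
print(errors)
```

After the change, each prediction error is smaller than the previous one, because every observation is both a test of the current belief and a learning experience.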

slide-60
SLIDE 60

data will flow through beliefs

■ belief = memory & processing unit

○ high dimensional representation
○ new hardware architecture needed

■ scalable intelligence

○ add belief capacity → increase intelligence
○ clone beliefs → crowdsource

slide-61
SLIDE 61

challenges

slide-62
SLIDE 62

non-determinism

■ results not reproducible

○ noise adds non-determinism
○ each experience alters beliefs
○ actions affect the world

■ disadvantage in safety critical environments

○ advantage in entertainment (e.g., gaming)

slide-63
SLIDE 63

lack of transparency

■ cause of actions not readily discernible

○ cannot set breakpoints
○ behavior may be surprising

■ telemetry needed
■ testing will give way to laboratory experiments

slide-64
SLIDE 64

concern over threat to humans

■ safeguards needed, e.g.,

○ unshakable belief that humans will not be harmed
○ harm leads to overabundance of free energy

slide-65
SLIDE 65

still a long way off

slide-66
SLIDE 66

further reading

■ selected papers by Karl Friston

○ e.g., Free Energy Principle review paper

■ toy implementation in Scala

○ Hebbian learning implementation (no prediction or action)
○ inspired by Numenta/NuPIC (open source project based on biology)

slide-67
SLIDE 67

thanks!

any questions?

@yakticus
julie@oomagnitude.com

slide-68
SLIDE 68