Learning How to Soar
Terrence Sejnowski
Salk Institute, UCSD
Migration Ecology of Birds, Ian Newton
Bird Migration
Thermal Soaring
Rayleigh-Bénard Convection
Atmospheric Turbulence
Tracking a Falcon with GPS
Humans Soar Too
Glider Aerodynamics
Control over bank angle and angle of attack
[Figure labels: bank angle; angle of attack]
Shephard & Lambertucci, 2013
1 - male condor; 2 - female condor; 3 - black vulture; 4 - caracara
- What quantities do birds sense?
- Vertical velocities, temperature, gradients, etc?
- How should the bird respond to these cues?
Physics simulations are complex and there are many variables.
How do Birds Find and Navigate Thermals?
Experiments are hard to control, and strategies are difficult to infer from limited data.
What should an optimal agent sense?
Time is Honey
Karl von Frisch
Temporal Difference Learning
TD error: $\delta_t = r_{t+1} + V(s_{t+1}) - V(s_t)$
Actions are determined by preferences: $\Pr(a_t = a \mid s_t = s) = \dfrac{e^{p(s,a)}}{\sum_b e^{p(s,b)}}$
Update the preferences: $p(s_t, a_t) \leftarrow p(s_t, a_t) + \beta\,\delta_t$
The value function update: $V(s_t) \leftarrow V(s_t) + \alpha\,\delta_t$
Sutton and Barto, 1988
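The TD error, the preference-based action rule, and the two updates on this slide can be sketched as a small tabular actor-critic. This is a minimal illustration, not the talk's implementation: the function names, the step sizes `alpha` and `beta`, the undiscounted form of the error, and the two-state toy example are all assumptions.

```python
import math

def softmax_probs(prefs):
    """Action probabilities from preferences: exp(p(s,a)) / sum_b exp(p(s,b))."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(V, p, s, a, r, s_next, alpha=0.1, beta=0.1):
    """Apply one tabular update in place and return the TD error.

    alpha and beta are assumed step sizes for the critic and the actor.
    """
    delta = r + V[s_next] - V[s]   # TD error (undiscounted, an assumption)
    V[s] += alpha * delta          # value function update (critic)
    p[s][a] += beta * delta        # preference update (actor)
    return delta

# Hypothetical toy problem: two states, two actions, one rewarded transition.
V = [0.0, 0.0]
p = [[0.0, 0.0], [0.0, 0.0]]
delta = actor_critic_step(V, p, s=0, a=1, r=1.0, s_next=1)
probs = softmax_probs(p[0])
```

After the single rewarded step, the preference for action 1 in state 0 rises, so the softmax assigns it a probability above one half.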
VUMmx1 - Octopamine
Hammer and Menzel, 1997
Montague, Dayan and Sejnowski, 1994
Temporal Difference Learning
Dopamine Neurons
Actor-Critic Model
[Diagram labels: Environment; Dopamine; Reward Prediction Error; Cerebral Cortex; Basal Ganglia]
Montague, Dayan and Sejnowski, 1996
Go Defeat, 2017
Temporal Difference Learning
Ke Jie
DeepMind
[Diagram labels: Environment; Dopamine; Reward Prediction Error; Cerebral Cortex; Basal Ganglia]
What Do Thermals Look Like?
[Figure panels: vertical velocity field; temperature field]
Rayleigh-Bénard convection
Reddy, Vergassola, Sejnowski, 2017
Sink or Soar?
[Figure panels: pre-training; post-training]
[Figure labels: +5°, 0°, -5°; vz; az (vertical acceleration); 1-2 meters]
Learned Policy
[Figure labels: vertical velocity gradient; temperature; climb rate; vertical acceleration; vz gradients; angle of attack]
Conclusions
- az and vz gradients across the wings are useful
- Control over angle of attack is not useful
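One way to picture the two cues named in the conclusions is as finite differences: az as the change in vertical velocity over time, and the vz gradient as the difference in vertical velocity between the two wingtips divided by the wingspan. The sketch below is only an assumption about how such cues could be computed. The function names, the sample values, and the 2 m wingspan are hypothetical and are not taken from the talk.

```python
def vertical_acceleration(vz_prev, vz_now, dt):
    """Finite-difference estimate of az from two vz samples dt seconds apart."""
    return (vz_now - vz_prev) / dt

def wing_gradient(vz_left, vz_right, wingspan):
    """Finite-difference estimate of the vz gradient across the wings."""
    return (vz_left - vz_right) / wingspan

# Hypothetical readings: vz in m/s, dt in s, wingspan in m.
az = vertical_acceleration(vz_prev=1.0, vz_now=1.5, dt=0.5)
grad = wing_gradient(vz_left=2.0, vz_right=1.0, wingspan=2.0)
```

With these numbers, a positive `grad` means stronger lift under the left wing than under the right.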
Field Experiments
GoPro Glider
Gautam Reddy
[Plot: bank angle (°), from -30 to 30, against time (s); desired and observed traces]
Measuring the Vertical Wind Velocity
GPS and barometer measurements give the vertical ground velocity.
We need to estimate the wind velocity.
ground vel. (GPS/baro) = wind vel. + glider’s air vel. (modeling)
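Rearranging the relation on this slide gives wind vel. = ground vel. - glider's air vel. The sketch below shows that arithmetic only. The function name, the sign convention (positive up), and the sample numbers are assumptions, and the glider's air velocity is taken as given by some model rather than computed here.

```python
def vertical_wind(ground_vz, glider_air_vz):
    """Vertical wind velocity in m/s, positive up.

    ground_vz: vertical ground velocity, e.g. from GPS and barometer.
    glider_air_vz: glider's vertical velocity relative to the air, from a model.
    """
    return ground_vz - glider_air_vz

# Hypothetical numbers: the glider climbs at 1.5 m/s over the ground while
# the model says it would sink at 1.0 m/s in still air.
w = vertical_wind(ground_vz=1.5, glider_air_vz=-1.0)
```

In this example the glider must be sitting in air that rises at 2.5 m/s.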
[Plot: pitch (°), from -8 to 8, with a 20 s time scale; labeled "Phugoid"]