Opportunities for infusing physics in AI/ML algorithms (PowerPoint presentation)


SLIDE 1

Opportunities for infusing physics in AI/ML algorithms

Animashree Anandkumar

Director of ML Research, NVIDIA Bren Professor, Caltech

SLIDE 2

1. Neural Programming

SLIDE 3

Combining Symbolic Expressions & Black-box Function Evaluations in Neural Programs

FOROUGH ARABSHAHI, SAMEER SINGH, ANIMA ANANDKUMAR

SLIDE 4

Symbolic + Numerical Input

Goal: Learn a domain of functions (sin, cos, log, …)

○ Training on numerical input-output pairs alone does not generalize.

Data augmentation with symbolic expressions:

○ Symbolic expressions efficiently encode relationships between functions.

Solution:

○ Design networks that use both symbolic and numeric inputs.
○ Leverage the observed structure of the data: hierarchical expressions.

SLIDE 5

Neural Programming

◎ Data-driven mathematical and symbolic reasoning
◎ Leverage the observed structure of the data
○ Hierarchical expressions

SLIDE 6

Applications

◎ Mathematical equation verification
○ Is sin²θ + cos²θ = 1 true?

◎ Mathematical question answering
○ sin²θ + □ = 1 (fill in the blank)

◎ Solving differential equations

SLIDE 7

Examples

2.5 = 2×10⁰ + 5×10⁻¹

Three kinds of data points: symbolic, function evaluation, and number encoding.
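The number-encoding data points above can be generated mechanically from a decimal string. A minimal sketch (the function name and the `(digit, exponent)` term format are my own, not from the talk):

```python
def decimal_encoding(x, digits=2):
    """Decompose a number into (digit, exponent) terms so that
    x == sum(d * 10**e), mirroring 2.5 = 2*10^0 + 5*10^-1."""
    int_part, frac_part = f"{x:.{digits}f}".split(".")
    terms = []
    for i, d in enumerate(int_part):
        if d != "0":
            terms.append((int(d), len(int_part) - 1 - i))
    for i, d in enumerate(frac_part):
        if d != "0":
            terms.append((int(d), -(i + 1)))
    return terms

print(decimal_encoding(2.5))  # [(2, 0), (5, -1)]
```

Each term can then become a leaf of the expression tree, which is what makes the number representation generalizable.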

SLIDE 8

Representing Mathematical Equations

◎ Grammar rules

SLIDE 9

Domain


SLIDE 10

Tree-LSTM for capturing hierarchies


ICLR’18
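A child-sum Tree-LSTM encodes an expression bottom-up, so sin²θ + cos²θ = 1 is processed as a tree rather than a flat token sequence. A minimal untrained sketch (random parameters; the hidden size, embedding table, and tuple tree format are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # hidden size (illustrative)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random parameters for a child-sum Tree-LSTM cell (untrained sketch).
W = {g: rng.standard_normal((D, D)) * 0.1 for g in "iofu"}
U = {g: rng.standard_normal((D, D)) * 0.1 for g in "iofu"}
embed = {s: rng.standard_normal(D) * 0.1
         for s in ["sin", "cos", "+", "^", "=", "theta", "1", "2"]}

def encode(tree):
    """Encode a tree bottom-up: each node's gates depend on the
    sum of its children's hidden states (child-sum Tree-LSTM)."""
    if isinstance(tree, str):
        x, children = embed[tree], []
    else:
        x = embed[tree[0]]
        children = [encode(sub) for sub in tree[1:]]
    h_sum = sum((h for h, _ in children), np.zeros(D))
    i = sigmoid(W["i"] @ x + U["i"] @ h_sum)
    o = sigmoid(W["o"] @ x + U["o"] @ h_sum)
    u = np.tanh(W["u"] @ x + U["u"] @ h_sum)
    # One forget gate per child, applied to that child's cell state.
    c = i * u + sum((sigmoid(W["f"] @ x + U["f"] @ h) * c_
                     for h, c_ in children), np.zeros(D))
    h = o * np.tanh(c)
    return h, c

h, c = encode(("=", ("+", ("^", ("sin", "theta"), "2"),
                          ("^", ("cos", "theta"), "2")), "1"))
```

The recursion follows the hierarchy of the expression exactly, which is what lets the model generalize to deeper equations.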

SLIDE 11

Dataset Generation

◎Random local changes


[Tree diagrams: the expression tree of sin²θ + cos²θ = 1 with three local edits applied: Replace Node, Shrink Node, and Expand Node.]
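All three local edits amount to swapping out a subtree at some position. A minimal sketch using nested tuples for expression trees (the representation and helper name are assumptions, not the paper's code):

```python
# Expression tree as nested tuples: (op, child, ...) or a leaf string.
EXAMPLE = ('=', ('+', ('^', ('sin', 'theta'), '2'),
                      ('^', ('cos', 'theta'), '2')), '1')

def mutate(tree, path, new_subtree):
    """Replace the subtree at `path` (a tuple of child indices) with
    `new_subtree`. 'Replace Node', 'Shrink Node', and 'Expand Node'
    are this operation with a same-size, smaller, or larger subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (mutate(tree[i], path[1:], new_subtree),) + tree[i + 1:]

# Turn sin^2(theta) + cos^2(theta) = 1 into an incorrect negative
# example by replacing the right-hand side '1' with '2'.
wrong = mutate(EXAMPLE, (2,), '2')
```

Picking the path and replacement at random yields the balanced mix of correct and incorrect equations the dataset needs.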

SLIDE 12

Summary of contributions

◎ Combine symbolic expressions and function evaluations
◎ New tasks:
○ Equation verification
○ Equation completion
○ Solving differential equations
◎ Balanced dataset generation method
◎ Generalizable representation of numbers

SLIDE 13

Dataset Generation

◎Sub-tree matching


[Tree diagrams: sub-tree matching on the expression tree of sin²θ + cos²θ = 1. A dictionary key-value pair is chosen, a node matching the key is selected, and that node is replaced with the value's pattern.]

SLIDE 14

Examples of Generated Equations


SLIDE 15

[Bar chart: Equation Verification accuracy on generalization and extrapolation for Majority Class, Sympy, LSTM:sym, TreeLSTM:sym, and TreeLSTM:sym+num.]

SLIDE 16

EQUATION VERIFICATION (accuracy)

|                | Majority Class | Sympy  | LSTM:sym | TreeLSTM:sym | TreeLSTM:sym+num |
|----------------|----------------|--------|----------|--------------|------------------|
| Generalization | 50.24%         | 81.74% | 81.71%   | 95.18%       | 97.20%           |
| Extrapolation  | 44.78%         | 71.93% | 76.40%   | 93.27%       | 96.17%           |

SLIDE 17

Equation Completion


SLIDE 18

Equation Completion


SLIDE 19

Take-aways

◎ Vastly improved numerical evaluation: 90% over the function-fitting baseline.
◎ Generalization to verifying symbolic equations of higher depth.
◎ Combining symbolic + numerical data gives better generalization on both tasks, symbolic and numerical evaluation.

Extrapolation accuracy:

| LSTM: symbolic | TreeLSTM: symbolic | TreeLSTM: symbolic + numeric |
|----------------|--------------------|------------------------------|
| 76.40%         | 93.27%             | 96.17%                       |

SLIDE 20

Solving Differential Equations

◎ Traditional methods:
○ Gather numerical data from a differential equation
○ Design a neural network for training

◎ Drawbacks:
○ The trained model can be used only for that differential equation
○ A new model must be trained for each differential equation
○ Not generalizable

SLIDE 21

Solving Differential Equations

◎ Steps:
○ Find a set of candidate solutions
○ Accept the correct candidate using the neural programmer

◎ Advantages:
○ Jointly train for many functions
○ Generalizable
○ Can be used for solving any differential equation

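The generate-and-verify loop above can be sketched with a numerical checker standing in for the trained neural programmer. Everything here is an illustrative assumption: the ODE g'' + g = 0, the candidate set, and the finite-difference verifier.

```python
import math

def satisfies_ode(g, ys, h=1e-4, tol=1e-3):
    """Numerically check g'' + g = 0 at sample points via central
    differences. This is a stand-in verifier; the talk accepts
    candidates with a trained neural programmer instead."""
    for y in ys:
        second = (g(y + h) - 2 * g(y) + g(y - h)) / h ** 2
        if abs(second + g(y)) > tol:
            return False
    return True

# A small pool of candidate solutions g(y).
candidates = {'sin': math.sin, 'exp': math.exp, 'square': lambda y: y * y}
accepted = [name for name, g in candidates.items()
            if satisfies_ode(g, [0.3, 1.1, 2.0])]
print(accepted)  # ['sin']
```

Because the verifier is shared across equations, the same trained model can screen candidates for any differential equation in the domain, which is the generalization advantage claimed above.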

SLIDE 22

Ordinary Differential Equations

◎ 𝑛-th order ODE
◎ Find 𝑔(𝑦) that satisfies it

SLIDE 23

Extension to solving differential equations


SLIDE 24

Verifying ODE Solutions

|                   | Majority Class | Sympy  | TreeLSTM |
|-------------------|----------------|--------|----------|
| Symbolic accuracy | 50.16%         | 56.45% | 80.07%   |
| ODE accuracy      | 59.78%         | 94.09% | 98.45%   |

SLIDE 25

2. Tensorized deep learning

SLIDE 26

Tensors for multi-dimensional data and higher-order moments

◎ Images: 3 dimensions; videos: 4 dimensions
◎ Pairwise correlations; triplet correlations

SLIDE 27

Operations on Tensors: Tensor Contraction

Tensor contraction extends the notion of the matrix product.

Matrix product: $Mv = \sum_j v_j M_j$ (a weighted sum of the columns $M_j$)

Tensor contraction: $T(u, v, \cdot) = \sum_{i,j} u_i v_j T_{i,j,:}$
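Both contractions can be written as einsum expressions; a small NumPy check (the shapes below are chosen arbitrarily for illustration):

```python
import numpy as np

# Matrix-vector product: (Mv)_i = sum_j v_j M_{i,j}
M = np.arange(6.0).reshape(2, 3)
v = np.array([1.0, 0.0, 2.0])
mv = np.einsum('ij,j->i', M, v)
assert np.allclose(mv, M @ v)

# Tensor contraction: T(u, v, .)_k = sum_{i,j} u_i v_j T_{i,j,k}
T = np.arange(24.0).reshape(2, 3, 4)
u = np.array([1.0, -1.0])
tc = np.einsum('i,j,ijk->k', u, v, T)
print(tc.shape)  # (4,) -- contracting two modes of a 3-way tensor leaves one
```

Contracting a 3-way tensor against two vectors leaves a vector, just as contracting a matrix against one vector does.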

SLIDE 28

Deep Neural Nets: Transforming Tensors

SLIDE 29

Deep Tensorized Networks

SLIDE 30

Space Saving in Deep Tensorized Networks
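To see where the space saving comes from, compare parameter counts for a dense 3-way weight tensor against a rank-R Tucker factorization. The sizes below are illustrative arithmetic only, not a network from the talk:

```python
# Dense 3-way weight tensor vs. rank-(8,8,8) Tucker factorization.
dims = (64, 64, 64)   # illustrative tensor dimensions
rank = (8, 8, 8)      # illustrative Tucker ranks

dense_params = 1
for d in dims:
    dense_params *= d              # 64^3 = 262144

core = 1
for r in rank:
    core *= r                      # 8^3 = 512 core entries
factors = sum(d * r for d, r in zip(dims, rank))  # 3 factor matrices
tucker_params = core + factors     # 512 + 1536 = 2048

print(dense_params, tucker_params, dense_params / tucker_params)
# 262144 2048 128.0
```

A 128x reduction at these sizes, and the gap widens as dimensions grow, which is why tensorized layers save so much space.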

SLIDE 31

Tensor Train RNNs and LSTMs

Tensors for long-term forecasting

Challenges in forecasting:
• Long-term dependencies
• High-order correlations
• Error propagation
SLIDE 32

Tensor LSTM for Long-term Forecasting

◎ Climate dataset; traffic dataset

SLIDE 33

TensorLy: High-level API for Tensor Algebra

• Python programming
• User-friendly API
• Multiple backends: flexible + scalable
• Example notebooks in the repository

SLIDE 34

Unsupervised learning of topic models through tensor methods

Topics: Justice, Education, Sports

SLIDE 35

Learning LDA Model through Tensors

◎ Topic-word matrix: P[word = i | topic = j]
◎ Topic proportions: P[topic = j | document]

Moment tensor: the co-occurrence tensor of word triplets decomposes as a sum of rank-one terms, one per topic (e.g. Crime, Sports, Education).
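A simplified empirical version of the triplet co-occurrence tensor can be sketched as follows. This naive version averages over all position triples, including repeats; the actual LDA moment construction subtracts lower-order terms and handles repeated positions more carefully.

```python
import numpy as np

def triplet_moment(docs, vocab_size):
    """Empirical third-order moment: average outer product of word
    triplets within each document (a simplified sketch of the
    co-occurrence tensor used to learn LDA via tensor decomposition)."""
    T = np.zeros((vocab_size,) * 3)
    count = 0
    for doc in docs:                  # doc = list of word ids
        for i in doc:
            for j in doc:
                for k in doc:
                    T[i, j, k] += 1
                    count += 1
    return T / count

docs = [[0, 1, 2], [0, 0, 1]]         # toy corpus, vocabulary of 3 words
T = triplet_moment(docs, vocab_size=3)
```

Decomposing this tensor into rank-one terms (e.g. with TensorLy's CP decomposition) recovers one component per topic, which is the core of the tensor method for LDA.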

SLIDE 36

Tensor-based LDA training is faster

• Mallet is an open-source framework for topic modeling
• Benchmarks on the AWS SageMaker platform
• Built into the AWS Comprehend NLP service
SLIDE 37

3. Learning to land a drone

Guanya Shi, Xichen Shi, Michael O’Connell, Rose Yu, Kamyar Azizzadenesheli, A. Anandkumar, Yisong Yue, and Soon-Jo Chung
SLIDE 38

A New Vision for Autonomy

Center for Autonomous Systems and Technologies

SLIDE 39

Physical Model for a Quadrotor drone

  • Dynamics:
  • Control:
  • Unknown forces & moments:
SLIDE 40

Challenges in landing a Quadrotor drone

• Unknown aerodynamic forces & moments.
• Example 1: ground effect when the drone is close to the ground.
• Example 2: air drag as velocity increases.
• Example 3: external wind conditions.

Wind generation in CALTECH CAST wind tunnel

SLIDE 41

Challenges in using DNNs to Learn Unknown Dynamics

• Our idea: use a DNN to learn the unknown aerodynamic forces, then design a nonlinear controller to cancel them (unknown moments are very limited in landing).
• Challenge 1: DNNs are data-hungry.
• Challenge 2: DNNs can be unstable and generate unpredictable output.
• Challenge 3: DNNs are difficult to analyze, which makes it hard to design a provably stable controller based on them.
• Our approach: use spectral normalization to control the Lipschitz properties of the DNN, then design a stable nonlinear controller (Neural-Lander).
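Spectral normalization itself is simple: estimate the largest singular value of each weight matrix and rescale. A sketch via power iteration (an approximation of the mechanism; Neural-Lander applies it per layer during training, and the example matrix here is my own):

```python
import numpy as np

def spectral_normalize(W, n_iters=50):
    """Scale W so its largest singular value (the Lipschitz constant
    of the linear map) is at most 1, estimated by power iteration."""
    u = np.ones(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v          # estimated top singular value
    return W / max(sigma, 1.0)  # leave already-contractive W unchanged

W = np.array([[3.0, 0.0], [0.0, 0.5]])
Wn = spectral_normalize(W)
print(np.linalg.norm(Wn, 2))  # 1.0
```

Bounding each layer's Lipschitz constant this way bounds the whole network's, which is what makes the stability analysis of the downstream controller tractable.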

SLIDE 42

Neural Lander Demo 1

SLIDE 43

Neural Lander Demo 2

SLIDE 44

4. Lots of efforts at NVIDIA

SLIDE 45

Exascale Deep Learning for Climate Analytics

• 3 exaops for AI
• ~27k Volta V100 GPUs
SLIDE 46


Some research leaders at NVIDIA

• Robotics: Dieter Fox
• Learning & Perception: Jan Kautz
• Graphics: Dave Luebke, Alex Keller, Aaron Lefohn
• Architecture: Steve Keckler, Dave Nellans, Mike O’Connor
• Programming: Michael Garland
• VLSI: Brucek Khailany
• Circuits: Tom Gray
• Networks: Larry Dennison
• Chief Scientist: Bill Dally
• Computer vision: Sanja Fidler
• Core ML: me!
• Applied research: Bryan Catanzaro

SLIDE 47

Conclusions

◎ Rich opportunities to infuse physical domain knowledge into AI algorithms.
◎ Jointly using symbolic and numerical data greatly helps neural-programming generalization.
◎ Tensors extend learning to any number of dimensions; tensorized neural networks capture dependencies better.
◎ Learning unknown aerodynamics using spectrally normalized DNNs.
◎ Many efforts at NVIDIA to scale AI/ML for physics applications.

SLIDE 48

Thanks!

Any questions?
