Opportunities for infusing physics in AI/ML algorithms
Animashree Anandkumar
Director of ML Research, NVIDIA Bren Professor, Caltech
1. Neural Programming

Combining Symbolic Expressions & Black-box Function Evaluations in Neural Programming
FOROUGH ARABSHAHI, SAMEER SINGH, ANIMA ANANDKUMAR
Goal: Learn a domain of functions (sin, cos, log, …)
○ Training on numerical input-output pairs alone does not generalize.
Data augmentation with symbolic expressions:
○ Efficiently encodes relationships between functions.
Solution:
○ Design networks that use both symbolic and numeric data.
○ Leverage the observed structure of the data: hierarchical expressions.
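A minimal sketch of the hierarchical (tree) structure these networks operate on: expressions as nested tuples, which can be evaluated numerically at sample points. All names here are illustrative, not from the slides.

```python
import math

# An expression is a nested tuple: (operator, child1, child2, ...)
expr = ("add",
        ("pow", ("sin", "theta"), 2),
        ("pow", ("cos", "theta"), 2))  # sin^2(theta) + cos^2(theta)

OPS = {
    "add": lambda a, b: a + b,
    "pow": lambda a, b: a ** b,
    "sin": math.sin,
    "cos": math.cos,
}

def evaluate(node, env):
    """Recursively evaluate an expression tree at a numeric point."""
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return env[node]
    op, *children = node
    return OPS[op](*(evaluate(c, env) for c in children))

# Numerical check of the identity sin^2 + cos^2 = 1 at theta = 0.3
assert abs(evaluate(expr, {"theta": 0.3}) - 1.0) < 1e-12
```

The same tree is both a symbolic data point (its structure) and a source of numeric data points (its evaluations), which is the pairing the slides exploit.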
○ Equation verification: is sin²θ + cos²θ = 1 true?
○ Equation completion: sin²θ + ▢ = 1
2.5 = 2×10⁰ + 5×10⁻¹
Three types of training data:
○ Symbolic data point
○ Function evaluation data point
○ Number encoding data point
ICLR’18
[Figure: expression trees for sin²θ + cos²θ = 1, showing equation generation via tree edits]
○ Replace Node
○ Shrink Node
○ Expand Node
Tasks:
○ Equation verification
○ Equation completion
○ Solving differential equations
[Figure: generating equations from the tree of sin²θ + cos²θ = 1 using a dictionary of identities, e.g., x² + y² ≡ y² + x²]
○ Dictionary key-value pair
○ Choose a node
○ Replace with the value's pattern
Equation Verification (accuracy):

                 Majority Class   Sympy    LSTM:sym   TreeLSTM:sym   TreeLSTM:sym+num
Generalization   50.24%           81.74%   81.71%     95.18%         97.20%
Extrapolation    44.78%           71.93%   76.40%     93.27%         96.17%
TreeLSTM trained on symbolic + numeric data outperforms the LSTM baseline, improving generalization for both tasks: symbolic and numerical evaluation.

Extrapolation accuracy:
LSTM (symbolic): 76.40%
TreeLSTM (symbolic): 93.27%
TreeLSTM (symbolic + numeric): 96.17%
Standard approach to solving differential equations with neural networks:
○ Gather numerical data from a differential equation
○ Design a neural network and train it
○ The trained model can be used only for that differential equation
○ A new model must be trained for each differential equation
○ Not generalizable
Our approach:
○ Find a set of candidate solutions
○ Accept the correct candidate using the neural programmer
○ Jointly train for many functions
○ Generalizable: can be used for solving any differential equation
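The generate-candidates-then-verify loop can be sketched as below, with a plain numerical residual check standing in for the trained neural programmer; the candidates and names are toy assumptions, not from the slides.

```python
import math

# Candidate solutions for the ODE y' = y, each stored as (y, y')
candidates = {
    "exp(x)": (math.exp, math.exp),
    "sin(x)": (math.sin, math.cos),
    "x**2":   (lambda x: x * x, lambda x: 2 * x),
}

def residual(y, dy, xs):
    """Worst-case violation of y'(x) = y(x) over sample points."""
    return max(abs(dy(x) - y(x)) for x in xs)

xs = [0.1 * i for i in range(1, 20)]
accepted = [name for name, (y, dy) in candidates.items()
            if residual(y, dy, xs) < 1e-9]
assert accepted == ["exp(x)"]
```

In the slides the acceptance step is learned (the neural programmer verifies candidates), which is what makes the scheme reusable across equations.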
Verifying ODE Solutions (accuracy):

                    Majority Class   Sympy    TreeLSTM
Symbolic accuracy   50.16%           80.07%   94.09%
ODE accuracy        56.45%           59.78%   98.45%
○ Images: 3 dimensions; videos: 4 dimensions
○ Pairwise correlations (matrices); triplet correlations (tensors)
Tensor Contraction: extends the notion of matrix product.

Matrix product: Mv = Σ_j v_j M_j (M_j is the j-th column of M)

Tensor contraction: T(u, v, ·) = Σ_{i,j} u_i v_j T_{i,j,:}
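Both contractions map directly onto `einsum`; a small NumPy check with toy shapes (all names here are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 5, 6))
u = rng.standard_normal(4)
v = rng.standard_normal(5)

# Matrix product as a contraction: Mv = sum_j v_j M[:, j]
M = rng.standard_normal((3, 5))
Mv = np.einsum("ij,j->i", M, v)
assert np.allclose(Mv, M @ v)

# Tensor contraction T(u, v, ·) = sum_{i,j} u_i v_j T[i, j, :]
# contracts the first two modes, leaving a length-6 vector.
Tuv = np.einsum("i,j,ijk->k", u, v, T)
assert Tuv.shape == (6,)
```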
Challenges in forecasting: capturing dependencies in the data.
Datasets: climate, traffic.
TensorLy: High-level API for Tensor Algebra
○ flexible + scalable
○ repository
Topic modeling:
○ Topics: Justice, Education, Sports
○ Topic-word matrix: P[word = i | topic = j]
○ Topic proportions: P[topic = j | document]

Moment Tensor: co-occurrence of word triplets.
[Figure: the moment tensor decomposes as a sum of rank-1 terms, one per topic (e.g., crime → Justice; Sports; Education)]
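How the moment tensor is assembled from the topic quantities can be sketched in NumPy; the toy sizes and weights below are assumptions, not slide data.

```python
import numpy as np

# Third-order word co-occurrences decompose as a weighted sum of
# rank-1 terms, one per topic: M3 = sum_t w_t * a_t ⊗ a_t ⊗ a_t.
rng = np.random.default_rng(0)
n_words, n_topics = 6, 3

A = rng.dirichlet(np.ones(n_words), size=n_topics).T  # P[word|topic], columns sum to 1
w = np.array([0.5, 0.3, 0.2])                          # topic proportions

# M3[i,j,k] = sum_t w_t * A[i,t] * A[j,t] * A[k,t]
M3 = np.einsum("t,it,jt,kt->ijk", w, A, A, A)
assert M3.shape == (n_words, n_words, n_words)
assert np.isclose(M3.sum(), 1.0)  # it is a joint probability tensor
```

Recovering `A` and `w` from `M3` is exactly the tensor-decomposition problem that spectral topic-modeling methods solve.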
Guanya Shi, Xichen Shi, Michael O’Connell, Rose Yu, Kamyar Azizzadenesheli,
A New Vision for Autonomy
Wind generation in the Caltech CAST wind tunnel
○ Learn the unknown aerodynamics with a DNN, then design a nonlinear controller to cancel it (unknown moments are very limited in landing)
○ Guarantee the Lipschitz property of DNNs via spectral normalization, and then design a stable nonlinear controller based on them (Neural-Lander)
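Spectral normalization, which underlies the Lipschitz guarantee, can be sketched with power iteration; this is a minimal NumPy illustration, not the Neural-Lander implementation.

```python
import numpy as np

def spectral_normalize(W, n_iters=500):
    """Scale W so its largest singular value is approximately 1.

    Power iteration estimates the top singular value sigma; dividing
    by it caps the layer's spectral norm, bounding the Lipschitz
    constant of the network layer-by-layer.
    """
    u = np.ones(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ W @ v  # estimated top singular value
    return W / sigma

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 4))
W_sn = spectral_normalize(W)
assert abs(np.linalg.svd(W_sn, compute_uv=False)[0] - 1.0) < 1e-3
```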
NVIDIA Research:
○ Robotics: Dieter Fox
○ Learning & Perception: Jan Kautz
○ Graphics: Dave Luebke, Alex Keller, Aaron Lefohn
○ Architecture: Steve Keckler, Dave Nellans, Mike O'Connor
○ Programming: Michael Garland
○ VLSI: Brucek Khailany
○ Circuits: Tom Gray
○ Networks: Larry Dennison
○ Chief Scientist: Bill Dally
○ Computer vision: Sanja Fidler
○ Core ML: Me!
○ Applied research: Bryan Catanzaro
Takeaways:
○ Combining symbolic and numeric data improves neural programming generalization.
○ Tensorized networks capture dependencies better.