ADVANCES IN TRINITY OF AI: DATA, ALGORITHMS & COMPUTE
Anima Anandkumar
Bren Professor at Caltech
Director of ML Research at NVIDIA
TRINITY FUELING ARTIFICIAL INTELLIGENCE
ALGORITHMS: OPTIMIZATION, SCALABILITY, MULTI-DIMENSIONALITY
ALGORITHMS
DATA
INFRASTRUCTURE FULL STACK FOR ML
Labeled data Unlabeled data Goal
Can it work at scale with deep learning?
Active learning heuristics:
probability (MNLP)
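A minimal sketch of two uncertainty heuristics, assuming MNLP stands for maximum normalized log-probability and LC (from the plot legend) stands for least confidence. The per-token probabilities are hypothetical model outputs for the most likely tag sequence, and treating the sequence probability as their product is a simplification made for illustration.

```python
import math

def least_confidence(token_probs):
    """LC score: 1 - probability of the most likely tag sequence (higher = less sure)."""
    seq_log_prob = sum(math.log(p) for p in token_probs)
    return 1.0 - math.exp(seq_log_prob)

def mnlp(token_probs):
    """Length-normalized log-probability of the most likely sequence (lower = less sure)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)

def select_for_annotation(pool, budget):
    """Pick the `budget` sentences with the lowest MNLP score."""
    ranked = sorted(pool, key=lambda name: mnlp(pool[name]))
    return ranked[:budget]

# Hypothetical unlabeled pool: sentence id -> per-token probabilities.
pool = {
    "short_confident": [0.99, 0.98],
    "long_confident": [0.99] * 20,
    "short_uncertain": [0.6, 0.5, 0.7],
}
print(select_for_annotation(pool, budget=1))
```

In this toy pool, LC rates the long confident sentence as less certain than the short confident one simply because it multiplies more terms; dividing by sentence length removes that bias.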
[Figure: Test F1 score vs. percent of words annotated, with one panel for English and one for Chinese. Curves: MNLP, LC, RAND. Reference lines: Best Deep Model, Best Shallow Model.]
[Diagram: images, questions ("dog?"), answers (dog / non-dog), partial labels]
active data, active questions
inactive data, inactive questions
active data, inactive questions
inactive data, active questions
[Figure: curves for Uniform, AL-ME, AQ-ERC and ALPF-ERC; horizontal axis from 0% to 100%, vertical axis from 0.1 to 0.5; annotated gain of +8%.]
Input: noisy crowdsourced annotations
Repeat:
1. Posterior of ground-truth labels given annotator quality model
2. Training with weighted loss. Use posterior as weights
3. Use trained model to infer ground-truth labels
4. MLE: update annotator quality using inferred labels from model
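The alternating procedure above can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: each sample has a single annotation, annotator quality is one accuracy value per worker with errors spread uniformly over the other classes, and a posterior-weighted nearest-centroid classifier stands in for a deep network trained with a weighted loss. All function names are hypothetical.

```python
import numpy as np

def label_posterior(ann_label, ann_worker, quality, n_classes):
    """Posterior over each sample's true label given its one noisy annotation."""
    n = len(ann_label)
    q = quality[ann_worker]                      # accuracy of each sample's annotator
    post = np.tile(((1.0 - q) / (n_classes - 1))[:, None], (1, n_classes))
    post[np.arange(n), ann_label] = q
    return post / post.sum(axis=1, keepdims=True)

def fit_weighted_centroids(x, post):
    """Stand-in for weighted-loss training: posterior-weighted class centroids."""
    return (post.T @ x) / post.sum(axis=0)[:, None]

def predict(x, centroids):
    """Infer labels with the trained model (nearest centroid)."""
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def update_quality(ann_label, ann_worker, inferred, n_workers):
    """MLE of each worker's accuracy, treating model-inferred labels as truth."""
    agree = (ann_label == inferred).astype(float)
    counts = np.bincount(ann_worker, minlength=n_workers)
    hits = np.bincount(ann_worker, weights=agree, minlength=n_workers)
    # Clip so that no posterior weight collapses to exactly 0 or 1.
    return np.clip(hits / np.maximum(counts, 1), 0.05, 0.95)

def learn_from_noisy_labels(x, ann_label, ann_worker, n_classes, n_workers, rounds=5):
    quality = np.full(n_workers, 0.7)            # initial guess (an assumption)
    for _ in range(rounds):
        post = label_posterior(ann_label, ann_worker, quality, n_classes)
        centroids = fit_weighted_centroids(x, post)
        inferred = predict(x, centroids)
        quality = update_quality(ann_label, ann_worker, inferred, n_workers)
    return centroids, quality
```

Calling `learn_from_noisy_labels(x, ann_label, ann_worker, n_classes, n_workers)` with a feature matrix, integer annotations, and integer worker ids returns the fitted centroids and the estimated per-worker accuracies.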
MS-COCO dataset. Fixed budget: 35k annotations
Theorem: Under a fixed budget, generalization error is minimized with a single annotation per sample. Assumptions:
(under no label noise).
have same quality.
5% wrt Majority rule
natural images
different from prediction.
[Diagram: category, latent variables, intermediate rendering, image. Class scores (0.5 dog, 0.2 cat, 0.1 horse, …) become a single choice (1.0 dog). Steps: choose, render, upsample, select location.]
NRM: Generation: class template, masked template, upsampled template, rendered image
CNN: Inference: image, unpooled feature map, pooled feature map, rectified feature map
Cross-Entropy Loss for Training the CNNs with Labeled Data
\[ \min_{\theta \in A_\gamma} H_{p,q}(y \mid x, z_{\max}) \;\ge\; \min_{(z_i)_{i=1}^{n},\, \theta} \frac{1}{n} \sum_{i=1}^{n} -\log p(y_i \mid x_i, z_i; \theta) \]
Max-Min Loss for Training the CNNs with Labeled Data
\[ \alpha_{\max} H_{p,q}(y \mid x, z_{\max}) + \alpha_{\min} H_{p,q}(y \mid x, z_{\min}) \]
[Diagram labels: Input Image, Max Xentropy, Min Xentropy, Max-Min Xentropy, Shared weights]
Min cross-entropy minimizes the posteriors of incorrect labels.
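As a rough illustration of the max-min loss, the sketch below combines two cross-entropy terms with weights alpha_max and alpha_min. The arrays `logits_max` and `logits_min` are hypothetical outputs of the two branches (conditioned on z_max and z_min), and the default weights of 0.5 are placeholders rather than values from the talk.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean of -log softmax(logits)[label] over the batch."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def max_min_cross_entropy(logits_max, logits_min, labels,
                          alpha_max=0.5, alpha_min=0.5):
    """alpha_max * H(y | x, z_max) + alpha_min * H(y | x, z_min)."""
    return (alpha_max * cross_entropy(logits_max, labels)
            + alpha_min * cross_entropy(logits_min, labels))

# Hypothetical batch of 3 samples and 4 classes.
labels = np.array([0, 2, 3])
logits_max = np.array([[2.0, 0.1, 0.0, -1.0],
                       [0.0, 0.5, 1.5, 0.2],
                       [0.3, 0.3, 0.3, 2.0]])
logits_min = np.zeros((3, 4))
print(max_min_cross_entropy(logits_max, logits_min, labels))
```

Setting `alpha_min=0` recovers the ordinary cross-entropy on the max branch alone.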
Rendering Path
Goal: Learn a domain of functions (sin, cos, log, add…)
Data Augmentation with Symbolic Expressions
Solution:
\[ \sin^2\theta + \cos^2\theta = 1, \qquad \sin(-2.5) = -0.6 \]
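The two kinds of training example shown here, a symbolic identity and a numeric function evaluation, can both be represented as expression trees. The sketch below is only an illustration: the nested-tuple encoding, the `OPS` table, and `evaluate` are hypothetical and not the representation used in the talk.

```python
import math

# Hypothetical operator table for a tiny function domain.
OPS = {
    "add": lambda a, b: a + b,
    "pow": lambda a, b: a ** b,
    "sin": math.sin,
    "cos": math.cos,
}

def evaluate(expr, env):
    """Recursively evaluate a nested-tuple expression tree."""
    if isinstance(expr, str):              # variable leaf
        return env[expr]
    if isinstance(expr, (int, float)):     # numeric leaf
        return expr
    op, *args = expr
    return OPS[op](*(evaluate(a, env) for a in args))

# Symbolic identity: sin^2(theta) + cos^2(theta), which should equal 1.
identity_lhs = ("add", ("pow", ("sin", "theta"), 2), ("pow", ("cos", "theta"), 2))
print(evaluate(identity_lhs, {"theta": 0.7}))

# Numeric evaluation rounded to one decimal place, as on the slide.
print(round(evaluate(("sin", -2.5), {}), 1))
```

Evaluating an identity at sampled values of theta gives a cheap numeric check on symbolic examples before they are added to a training set.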
Decimal Tree for 2.5
LSTM: Symbolic                  76.40 %
TreeLSTM: Symbolic              93.27 %
TreeLSTM: symbolic + numeric    96.17 %
With 1/2 data
P3.2x machines on AWS, ResNet-50 on ImageNet
Images: 3 dimensions
Videos: 4 dimensions
Pairwise correlations
Triplet correlations
Tensor Contraction
Extends the notion of matrix product
Matrix product: \[ Mv = \sum_j v_j M_j \]
Tensor contraction: \[ T(u, v, \cdot) = \sum_{i,j} u_i v_j T_{i,j,:} \]
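A short NumPy sketch of both operations, written to mirror the sums term by term. The shapes and random values are arbitrary choices for illustration; `M[:, j]` plays the role of the column M_j.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 4))      # matrix
T = rng.normal(size=(2, 4, 5))   # third-order tensor
u = rng.normal(size=2)
v = rng.normal(size=4)

# Matrix product as a weighted sum of columns: Mv = sum_j v_j M_j
Mv = sum(v[j] * M[:, j] for j in range(M.shape[1]))

# Contraction over the first two modes: T(u, v, .) = sum_{i,j} u_i v_j T[i, j, :]
Tuv = np.einsum("i,j,ijk->k", u, v, T)

print(Mv.shape, Tuv.shape)
```

The contraction leaves the third mode free, so the result is a vector whose length equals the size of that mode, just as the matrix product leaves the row index free.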
Tensor Train RNN and LSTMs
Challenges:
dependencies
correlations
Climate dataset Traffic dataset
flexible + scalable
repository
Robotics: Dieter Fox
Learning & Perception: Jan Kautz
Chief Scientist: Bill Dally
Graphics: Dave Luebke, Alex Keller, Aaron Lefohn
Architecture: Steve Keckler, Dave Nellans, Mike O’Connor
Programming: Michael Garland
VLSI: Brucek Khailany
Circuits: Tom Gray
Networks: Larry Dennison
Computer vision: Sanja Fidler
Core ML: Me!
Applied research: Bryan Catanzaro