
SLIDE 1

Confidential + Proprietary

Neural Architecture Search and Beyond

Barret Zoph

SLIDE 2

Progress in AI

  • Generation 1: Good Old Fashioned AI
    ○ Handcraft predictions
    ○ Learn nothing
  • Generation 2: Shallow Learning
    ○ Handcraft features
    ○ Learn predictions
  • Generation 3: Deep Learning
    ○ Handcraft algorithm (architectures, data processing, …)
    ○ Learn features and predictions end-to-end
  • Generation 4: Learn2Learn (?)
    ○ Handcraft nothing
    ○ Learn algorithm, features, and predictions end-to-end

SLIDE 3

Importance of architectures for Vision

  • Designing neural network architectures is hard
  • A lot of human effort goes into tuning them
  • There is little intuition for how to design them well
  • Can we learn good architectures automatically?

Two layers from the famous Inception V4 computer vision model.

Canziani et al., 2017; Szegedy et al., 2017

SLIDE 4

Convolutional Architectures

Krizhevsky et al, 2012

SLIDE 5

How does architecture search work?

[Diagram: the controller samples models from the search space; a trainer trains each model and returns its accuracy as a reward to the controller. The controller can be trained with reinforcement learning or evolution. The search space uses primitives found in computer vision research.]

Zoph & Le. Neural Architecture Search with Reinforcement Learning. ICLR, 2017. arxiv.org/abs/1611.01578
Real et al. Large Scale Evolution of Image Classifiers. ICML, 2017. arxiv.org/abs/1703.01041

SLIDE 6

How does architecture search work?

[Diagram: the controller proposes ML models; 20K candidate models are trained and evaluated; iterate to find the most accurate model.]
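The sample/train/reward loop can be sketched as follows. The tabular softmax controller, toy search space, and stand-in accuracy function below are illustrative simplifications of the controller RNN and child-model training described here; the update is plain REINFORCE with a moving-average baseline.

```python
import math
import random

random.seed(0)

# Toy search space: pick a filter size for each of 3 layers.
SEARCH_SPACE = [3, 5, 7]

# Controller state: one logit per (layer, choice). A real NAS controller
# is an RNN; a tabular softmax keeps the sketch self-contained.
logits = [[0.0 for _ in SEARCH_SPACE] for _ in range(3)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def sample_architecture():
    """Controller proposes a model: one choice index per layer."""
    return [random.choices(range(len(SEARCH_SPACE)), softmax(l))[0]
            for l in logits]

def train_and_evaluate(arch):
    """Stand-in for training the child model: a fake 'accuracy' equal to
    the fraction of layers that chose filter size 5."""
    return sum(1.0 for i in arch if SEARCH_SPACE[i] == 5) / len(arch)

def reinforce_update(arch, reward, baseline, lr=0.1):
    """REINFORCE: raise the log-probability of the sampled choices in
    proportion to (reward - baseline)."""
    adv = reward - baseline
    for layer, choice in enumerate(arch):
        probs = softmax(logits[layer])
        for i in range(len(SEARCH_SPACE)):
            grad = (1.0 if i == choice else 0.0) - probs[i]
            logits[layer][i] += lr * adv * grad

baseline = 0.0
for step in range(3000):
    arch = sample_architecture()
    reward = train_and_evaluate(arch)
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    reinforce_update(arch, reward, baseline)

# The most probable choice per layer after the search.
best = [SEARCH_SPACE[max(range(len(l)), key=lambda i: l[i])] for l in logits]
print(best)
```

With enough iterations the controller concentrates its probability mass on the choices that yield the highest reward, here filter size 5 in every layer.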

SLIDE 7

Example: using a reinforcement learning controller (NAS)

[Diagram: the controller RNN predicts each architecture choice with a softmax classifier; each choice is fed back into the next step as an embedding.]

Zoph & Le. Neural Architecture Search with Reinforcement Learning. ICLR, 2017. arxiv.org/abs/1611.01578
SLIDE 8

Example: using an evolutionary controller

Each worker applies one of the possible mutations:

  • Insert convolution
  • Remove convolution
  • Insert nonlinearity
  • Remove nonlinearity
  • Add skip connection
  • Remove skip connection
  • Alter strides
  • Alter number of channels
  • Alter horizontal filter size
  • Alter vertical filter size
  • Alter learning rate
  • Identity
  • Reset weights
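The evolutionary loop can be sketched as tournament selection plus one mutation per step. The architecture encoding, the subset of mutations, and the toy fitness function below are illustrative stand-ins for training real child networks.

```python
import random

random.seed(0)

# An architecture is a list of conv layers: (filter_size, channels).
def random_arch():
    return [(3, 16)]

def fitness(arch):
    """Stand-in for trained accuracy: rewards depth (capped at 4 layers)
    and wider channels, with a penalty for oversized filters."""
    depth_score = min(len(arch), 4)
    width_score = sum(c for _, c in arch) / 100.0
    penalty = sum(1 for f, _ in arch if f > 7)
    return depth_score + width_score - penalty

def mutate(arch):
    """Apply one randomly chosen mutation from the slide's list."""
    arch = list(arch)
    op = random.choice(["insert_conv", "remove_conv",
                        "alter_channels", "alter_filter", "identity"])
    if op == "insert_conv":
        arch.insert(random.randrange(len(arch) + 1), (3, 16))
    elif op == "remove_conv" and len(arch) > 1:
        arch.pop(random.randrange(len(arch)))
    elif op == "alter_channels":
        i = random.randrange(len(arch))
        f, c = arch[i]
        arch[i] = (f, random.choice([8, 16, 32, 64]))
    elif op == "alter_filter":
        i = random.randrange(len(arch))
        f, c = arch[i]
        arch[i] = (random.choice([1, 3, 5, 7]), c)
    return arch  # "identity" leaves the architecture unchanged

# Tournament evolution: sample two individuals, copy-and-mutate the
# fitter one over the less fit one.
population = [random_arch() for _ in range(20)]
for step in range(500):
    a, b = random.sample(range(len(population)), 2)
    if fitness(population[a]) < fitness(population[b]):
        a, b = b, a
    population[b] = mutate(population[a])

best = max(population, key=fitness)
print(len(best), fitness(best))
```

Because winners are copied before mutation, good architectures propagate through the population while mutations explore their neighborhood.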

SLIDE 9

ImageNet: Neural Architecture Search improvements

[Chart: Top-1 accuracy on ImageNet for architecture-search models.]

SLIDE 10

ImageNet

[Chart: architecture-search models (e.g. EfficientNet, MobileNetV3) vs. older architectures.]

Tan & Le. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019. arxiv.org/abs/1905.11946

SLIDE 11

Object detection: COCO

[Chart: architecture-search detection models on COCO.]

Ghiasi et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection, 2019. arxiv.org/abs/1904.07392

SLIDE 12

Architecture decisions for detection

[Figure: human-designed vs. machine-designed feature pyramid architectures.]

Ghiasi et al. NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection, 2019. arxiv.org/abs/1904.07392

SLIDE 13

Video classification architecture search

Learn the connections between blocks; state-of-the-art accuracy.

Ryoo et al. AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures, 2019. arxiv.org/abs/1905.13209

SLIDE 14

Translation: WMT

[Chart: architecture-search results at 256 input words + 256 output words.]

So et al. The Evolved Transformer, 2019. arxiv.org/abs/1901.11117

SLIDE 15

Architecture Decisions

Using more convolutions in earlier layers

SLIDE 16

Platform-aware search

[Diagram: the controller samples models from the search space; a trainer measures accuracy, and latency is measured on real mobile phones; both feed a multi-objective reward back to the controller, which is trained with reinforcement learning or evolution.]

Tan et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile. CVPR, 2019. arxiv.org/abs/1807.11626
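MnasNet folds the two objectives into a single scalar reward, accuracy × (latency/target)^w. A sketch of that soft-constraint form, assuming the paper's reported exponent w = -0.07; the example accuracies and latencies are made up:

```python
def mnasnet_reward(accuracy, latency_ms, target_ms, w=-0.07):
    """Soft-constraint multi-objective reward: accuracy scaled by
    (latency / target)^w. With w < 0, models slower than the target
    are penalized and faster ones receive a mild bonus."""
    return accuracy * (latency_ms / target_ms) ** w

# A slower model needs noticeably higher accuracy to win the reward.
fast = mnasnet_reward(0.75, 80.0, 80.0)   # exactly on target
slow = mnasnet_reward(0.76, 160.0, 80.0)  # 2x the target latency
print(fast, slow)
```

Here the model at the latency target keeps its raw accuracy as reward, while the twice-as-slow model's slightly higher accuracy is not enough to compensate.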

SLIDE 17

Collaboration between Waymo and Google Brain:

  • 20–30% lower latency / same quality.
  • 8–10% lower error rate / same latency.

‘Interesting’ architectures:

https://medium.com/waymo/automl-automating-the-design-of-machine-learning-models-for-autonomous-driving-141a5583ec2a

SLIDE 18

Tabular Data

The search covers normalization and transformations (log, cosine); trees vs. neural nets; number of layers; activation functions; connectivity. Models can be distilled to decision trees for interpretability.

Pipeline: automated feature engineering, automated architecture search, automated hyperparameter tuning, automated model selection, automated model ensembling, and automated model distillation and export for serving.

https://ai.googleblog.com/2019/05/an-end-to-end-automl-solution-for.html

SLIDE 19

Tabular Data: internal benchmark on Kaggle competitions

AutoML placed 2nd in a live one-day competition against 76 teams.

SLIDE 20

Problems of NAS

  • Enormous compute consumption
    ○ Requires ~10K training trials to converge, even on a carefully designed search space
    ○ Not applicable when a single trial's computation is heavy
  • Works inefficiently on arbitrary, very large search spaces
    ○ Feature selection (search space of size 2^100 with 100 features)
    ○ Per-feature transforms (search space of size c^100 with 100 features and c transform types each)
    ○ Embedding and hidden layer sizes
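To make these sizes concrete (the transform count c = 4 is an illustrative choice, e.g. {identity, log, cosine, normalize}):

```python
# Sizes of the search spaces mentioned above.
n_features = 100

# Feature selection: each feature is either kept or dropped.
subset_space = 2 ** n_features
print(subset_space)  # about 1.27e30 configurations

# Per-feature transforms: c choices per feature.
c = 4
transform_space = c ** n_features
print(transform_space // subset_space)  # 2^100 times larger again
```

Even at one trial per nanosecond, exhaustively covering 2^100 configurations is hopeless, which is why an efficient controller matters.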

SLIDE 21

Efficient NAS: addressing the efficiency

[Diagram: a big model of branching ops (Input → {Conv 3x3, Conv 5x5, Pool} → Sum → {Conv 3x3, Conv 5x5, Pool} → Sum); each path through it is a child model.]

Key idea:
1. One path inside a big model is a child model.
2. The controller selects a path inside the big model and trains it for a few steps.
3. The controller selects another path and trains it for a few steps, reusing the weights produced by the previous step.
4. Etc.

Results: can save 100x to 1000x compute.

Pham et al. Efficient Neural Architecture Search via Parameter Sharing, 2018. arxiv.org/abs/1802.03268
Related works: DARTS, SMASH, one-shot architecture search.
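A toy sketch of the weight-sharing idea: scalar "weights" stand in for the conv kernels of the big model, and the toy training objective is invented purely for illustration. The point is that each sampled path trains only its own slice of the shared bank, and later paths reuse whatever earlier paths learned.

```python
import random

random.seed(0)

# Shared weights: one parameter per candidate op at each of 2 nodes.
# In ENAS these are full conv kernels; scalars keep the sketch tiny.
shared = {(node, op): 0.0
          for node in range(2)
          for op in ("conv3x3", "conv5x5", "pool")}

def sample_path():
    """Controller picks one op per node: a child model inside the big model."""
    return [random.choice(["conv3x3", "conv5x5", "pool"]) for _ in range(2)]

def train_path(path, steps=5, lr=0.1):
    """Train only the weights on the sampled path; all other shared
    weights are untouched and will be reused by later children."""
    for _ in range(steps):
        for node, op in enumerate(path):
            target = 1.0  # toy objective: pull each used weight toward 1
            w = shared[(node, op)]
            shared[(node, op)] = w + lr * (target - w)

# The loop from the slide: sample a path, train it briefly, repeat.
for _ in range(100):
    train_path(sample_path())

print(shared)
```

After many short child-training episodes, every op's shared weights have been trained by the children that used them, so no child ever starts from scratch.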

SLIDE 22

Learning Data Augmentation Procedures

[Diagram: Data → Data Processing → Machine Learning Model. The model is the focus of machine learning research; data processing is very important but manually tuned.]

SLIDE 23

Data Augmentation

SLIDE 24

AutoAugment Search Algorithm

[Diagram: the controller proposes an augmentation policy; 20K models are trained and evaluated with the policy; iterate to find the most accurate policy.]

Cubuk et al. AutoAugment: Learning Augmentation Policies from Data, 2018. arxiv.org/abs/1805.09501

SLIDE 25

AutoAugment: Example Learned Policy

AutoAugment learns triples of (Operation, Probability of applying, Magnitude).

SLIDE 26

AutoAugment: Example Learned Policy

For each sub-policy (a policy consists of 5 sub-policies), AutoAugment learns (Operation, Probability, Magnitude) triples.
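The structure of a learned policy can be sketched as follows. To keep the sketch runnable, the operations act on lists of numbers rather than images, and the policy's triples are made up for illustration; a real AutoAugment policy uses image transforms and 5 sub-policies.

```python
import random

random.seed(0)

# Toy "augmentation" operations standing in for image transforms.
def shift(xs, m):   return [x + m for x in xs]
def scale(xs, m):   return [x * (1 + 0.1 * m) for x in xs]
def invert(xs, m):  return [-x for x in xs]  # magnitude unused

OPS = {"shift": shift, "scale": scale, "invert": invert}

# A policy is a list of sub-policies; each sub-policy is a short list of
# (operation, probability, magnitude) triples, as on the slide.
policy = [
    [("shift", 0.8, 2), ("scale", 0.6, 3)],
    [("invert", 0.4, 0), ("shift", 0.9, 1)],
]

def apply_policy(example, policy):
    """Pick one sub-policy at random, then apply each of its operations
    with its learned probability and magnitude."""
    sub_policy = random.choice(policy)
    for op_name, prob, magnitude in sub_policy:
        if random.random() < prob:
            example = OPS[op_name](example, magnitude)
    return example

print(apply_policy([1.0, 2.0, 3.0], policy))
```

The search is over which triples to put in the policy; applying a found policy at training time is just this cheap sampling step.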

SLIDE 27

AutoAugment CIFAR Results

[Table: per-model error rates with no data augmentation, standard data augmentation, and AutoAugment; state-of-the-art accuracy.]

SLIDE 28

AutoAugment ImageNet Results (Top-5 error rate)

[Table: per-model Top-5 error with no data augmentation, standard data augmentation, and AutoAugment.]

Code is open-sourced: https://github.com/tensorflow/models/tree/master/research/autoaugment

SLIDE 29

Expanded AutoAugment for Object Detection

Zoph et al. Learning Data Augmentation Strategies for Object Detection, 2019. arxiv.org/abs/1906.11172

SLIDE 30

Learned Augmentation on COCO: Results

[Chart: results with a ResNet-50 model.]

SLIDE 31

Learned Augmentation on COCO: Results

State-of-the-art accuracy at the time for a single model.

Code is open-sourced: https://github.com/tensorflow/tpu/tree/master/models/official/detection

SLIDE 32

RandAugment: Practical data augmentation with no separate search

Faster than AutoAugment, with a vastly reduced search space: only two tunable parameters, magnitude and policy length.

Cubuk et al. RandAugment: Practical data augmentation with no separate search, 2019. arxiv.org/abs/1909.13719
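In contrast to AutoAugment's learned (operation, probability, magnitude) triples, RandAugment samples N operations uniformly at random and applies them all at one shared magnitude M. A sketch with the same illustrative list-based operations (the real version uses image transforms such as rotate, shear, and color):

```python
import random

random.seed(0)

# Toy operations standing in for image transforms.
def shift(xs, m):  return [x + m for x in xs]
def scale(xs, m):  return [x * (1 + 0.1 * m) for x in xs]
def invert(xs, m): return [-x for x in xs]  # magnitude unused

OPS = [shift, scale, invert]

def rand_augment(example, n=2, m=3):
    """RandAugment: apply n uniformly sampled ops, all at the shared
    magnitude m. No policy search is needed; n and m are tuned
    directly, e.g. by a small grid search."""
    for op in random.choices(OPS, k=n):
        example = op(example, m)
    return example

print(rand_augment([1.0, 2.0, 3.0]))
```

Collapsing the search space to just (n, m) is what removes the separate, expensive search phase.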

SLIDE 33

RandAugment: Practical data augmentation with no separate search

Matches or surpasses AutoAugment at significantly lower search cost.

SLIDE 34

RandAugment: Practical data augmentation with no separate search

Regularization strength can easily be scaled as model size changes; state-of-the-art accuracy.

Code and models open-sourced: https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet