Hardware-Software Co-design of Slimmed Optical Neural Networks
Zheng Zhao¹, Derong Liu¹, Meng Li¹, Zhoufeng Ying¹, Lu Zhang², Biying Xu¹, Bei Yu², Ray Chen¹, David Pan¹
¹ The University of Texas at Austin   ² The Chinese University of Hong Kong
Introduction
› Emergence of dedicated AI accelerators
› Optical neural network processor: light in and light out
  » Speed-of-light floating-point matrix-vector multiplication
  » >100 GHz detection rate
  » Ultra-low energy consumption once configured
› Challenges: a great number of components and sensitivity to noise
[Shen+, Nature Photonics 2017]
Previous Optical Neural Network (ONN)
› SVD-decompose each layer's weight matrix: W = UΣV*
  (figure: input, hidden, and output layers; each layer computes out = σ(UΣV* · in))
› U and V* are unitary matrices
  » A unitary X satisfies XX* = I
  » Implemented by Mach-Zehnder interferometer (MZI) arrays, the most area-expensive part
› Σ is a diagonal matrix
  » Diagonal values are non-negative reals
  » Implemented by optical attenuators
› σ is the non-linear activation
  » Implemented by saturable absorbers
[Shen+, Nature Photonics 2017]
Implementing Unitary U and V*
› Mach-Zehnder interferometers (MZIs) implement U and V*
› A single MZI implements a 2-dim unitary
  (figure: coupler, internal phase shifter φ, coupler)
› An array of n(n-1)/2 MZIs implements an n-dim unitary
  (figure: MZI T_i,j acting on the i-th and j-th channels)
› Given an n-dim unitary, the φ's can be uniquely computed
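A device built from two lossless 50:50 couplers around a phase shifter is unitary for any phase setting. The sketch below uses one common MZI parameterization; the exact coupler convention is an assumption here, not necessarily the one used in the paper.

```python
import numpy as np

def mzi(phi, theta=0.0):
    """2x2 MZI transfer matrix: two 50:50 couplers around an internal
    phase shifter phi, with an optional external phase theta.
    One common parameterization; the paper's convention may differ."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)   # 50:50 coupler
    ps = np.diag([np.exp(1j * phi), 1])              # internal phase shift
    ext = np.diag([np.exp(1j * theta), 1])           # external phase shift
    return bs @ ps @ bs @ ext

U2 = mzi(0.7, 0.3)
print(np.allclose(U2 @ U2.conj().T, np.eye(2)))      # True: always unitary
```

Since each MZI contributes a 2-dim unitary, a triangular or rectangular mesh of n(n-1)/2 of them composes to an arbitrary n-dim unitary.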
Previous ONN Overview
› Layer: in (n×1) → V* (n×n) → Σ (m×n) → U (m×m) → σ → out (m×1), realizing W (m×n)
› Layer size measured by # of MZIs = m(m-1)/2 + n(n-1)/2
› Software training and hardware implementation are decoupled
  » Train W directly in software → SVD-decompose to obtain U, Σ, V*
  (flow: software training of W → SVD decomposition → optical implementation of U, Σ, V*)
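The previous flow can be sketched in a few lines of numpy; the dimensions below are illustrative, and a random matrix stands in for the trained W.

```python
import numpy as np

m, n = 10, 14                       # illustrative layer dimensions
rng = np.random.default_rng(0)
W = rng.standard_normal((m, n))     # stands in for a trained weight matrix

# SVD-decompose the trained W for optical implementation: W = U Sigma V*
U, s, Vh = np.linalg.svd(W)
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

assert np.allclose(U @ Sigma @ Vh, W)           # exact reconstruction
assert np.allclose(U @ U.conj().T, np.eye(m))   # U is unitary
assert np.all(s >= 0)                           # Sigma diagonal: non-negative reals
print(m*(m-1)//2 + n*(n-1)//2)                  # this layer's MZI count: 136
```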
Slimmed Architecture
› Layer: in (n×1) → Σ (n×n) → U (n×n) → T (m×n) → σ → out (m×1), realizing W (m×n)
› T: sparse tree network
› U: unitary network, same constraints as in the previous architecture
› Σ: diagonal network
› Uses fewer MZIs: n(n-1)/2
  » A single unitary matrix maintains the expressivity
  » An area-efficient tree network matches the dimensions
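Comparing the unitary-network MZI counts of the two architectures, counting only the MZI arrays as on this slide (the example dimensions are illustrative, not taken from the paper):

```python
def prev_unitary_mzis(m, n):
    """Previous architecture: U (m x m) plus V* (n x n) MZI arrays."""
    return m * (m - 1) // 2 + n * (n - 1) // 2

def slim_unitary_mzis(m, n):
    """Slimmed architecture: a single U (n x n) MZI array."""
    return n * (n - 1) // 2

# e.g. a layer mapping n = 196 inputs to m = 100 outputs
print(prev_unitary_mzis(100, 196))  # 24060
print(slim_unitary_mzis(100, 196))  # 19110
```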
Co-design Overview
› An arbitrary weight matrix W is not TUΣ-decomposable
› Co-design solution: training and implementation are coupled
  » T and Σ: train the device parameters directly, with the hardware constraints embedded
  » U: add a unitary regularization in training, then approximate with a true unitary
  (flow: previously, W is trained in software and SVD-decomposed into U, Σ, V* for optical implementation; here, T and Σ are trained and implemented directly, while U is trained with a unitary regularization and then approximated)
Sparse Tree Network
› A sparse tree network (T) matches the differing dimensions
  » Suppose in-dim > out-dim
  » α: linear transfer coefficients
  (figure: N inputs x_1 … x_N combined into one output y by an N×1 subtree; the 1st, 2nd, 3rd, … subtrees form T)
Sparse Tree Network Implementation
› Implemented with MZIs or directional couplers
› A 2×1 subtree (x_1, x_2 → y) can be implemented with a single-output MZI or a directional coupler, whose transfer coefficients satisfy energy conservation
  (figure: coupler, phase shifter φ, coupler, with a single output port)
Sparse Tree Network Implementation (cont.)
› Any N-input subtree with arbitrary α's satisfying energy conservation can be implemented by cascading (N-1) single-output MZIs
› Energy conservation is embedded in the training
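A minimal sketch of one subtree's computation. Reading the slides' energy-conservation note as Σᵢ αᵢ² = 1 per subtree (an assumption about the exact constraint form), the constraint can be embedded in training by normalizing unconstrained parameters:

```python
import numpy as np

def normalized(raw):
    """Map unconstrained training parameters to energy-conserving alphas."""
    raw = np.asarray(raw, dtype=float)
    return raw / np.linalg.norm(raw)

def subtree_output(x, alpha):
    """Combine N inputs into one output, as a cascade of N-1 lossless
    single-output MZIs would."""
    assert np.isclose(np.sum(alpha**2), 1.0)   # energy conservation
    return float(alpha @ x)

alpha = normalized([1.0, 2.0, 2.0])            # -> [1/3, 2/3, 2/3]
y = subtree_output(np.array([0.5, -0.2, 0.8]), alpha)
print(y)                                       # ~0.5667
```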
Unitary Network in Training
› For the unitary network U satisfying UU* = I, add the regularization
  reg = ∥UU* − I∥_F
› Training loss function: Loss = Data Loss + Regularization Loss
  » Leads to a near-implementable ONN with high accuracy
› The trained U_t is approximately unitary, but only a true unitary is implementable by MZIs
Unitary Network in Implementation
› Approximate U_t by a true unitary U_a
› SVD-decompose U_t = PSQ* → U_a = PQ*
› Claim: minimizing the regularization ⇔ finding the best approximation
  min reg ⇔ min ∥U_t − U_a∥_F
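Both steps, the Frobenius regularizer from the previous slide and the SVD-based projection onto the nearest true unitary, fit in a short numpy sketch. The names U_t and U_a follow the slides; the perturbation used to mimic a "trained" matrix is illustrative.

```python
import numpy as np

def unitary_reg(U):
    """reg = ||U U* - I||_F, the training-time regularization."""
    return np.linalg.norm(U @ U.conj().T - np.eye(U.shape[0]))

def nearest_unitary(U_t):
    """SVD-decompose U_t = P S Q* and return U_a = P Q*, the closest
    unitary to U_t in Frobenius norm."""
    P, _, Qh = np.linalg.svd(U_t)
    return P @ Qh

rng = np.random.default_rng(1)
U_t = nearest_unitary(rng.standard_normal((4, 4)))  # a true unitary...
U_t += 0.01 * rng.standard_normal((4, 4))           # ...perturbed, as after training
U_a = nearest_unitary(U_t)

print(unitary_reg(U_t))             # small but nonzero
print(unitary_reg(U_a))             # ~0: U_a is exactly implementable
print(np.linalg.norm(U_t - U_a))    # the approximation error
```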
Simulation Results
› Implemented in TensorFlow for various ONN setups:
  N1: (14×14)-100-10    N4: (14×14)-150-150-10    N7: (14×14)-150-150-150-10
  N2: (14×14)-150-10    N5: (28×28)-400-400-10    N8: (28×28)-400-400-200-10
  N3: (28×28)-400-10    N6: (28×28)-600-300-10    N9: (28×28)-600-600-300-10
› Tested on an Intel Core i9-7900X CPU and an NVIDIA Titan Xp GPU
› Evaluated on the MNIST handwritten-digit dataset
Simulation Results
› # of MZIs (N1–N9: the network configurations above)
  » Our architecture uses 15%–38% fewer MZIs
› Accuracy
  » Similar accuracy (~0 accuracy loss)
  » Maximum loss is 0.0088; average loss is 0.0058
Noise Robustness
› Better resilience due to fewer cascaded components
  (figure: accuracy vs. noise amplitude for the previous ONN and our ONN)
Training Curve
› Converged in 300 epochs
› Balances the accuracy and the unitary approximation
  (figures: regularization and accuracy vs. training epoch)
Contributions of This Work
› A new architecture for ONNs
  » Area efficiency
  » ~0 accuracy loss
  » Better robustness to noise
› A hardware-software co-design methodology
  » Hardware parameters embedded in software training
  » Hardware constraints guaranteed by software
Future Work
› Better MZI pruning methods
  » Prune ~0-phase MZIs, then recover accuracy
  » MZI-sparse unitary matrices
› Design for robustness
  » Adjust the noise distribution in training
› Online training
› ONNs for other neural network architectures
  » CNN, RNN, etc.