Learning and Vision Research Group - Shuicheng YAN, National University of Singapore


  1. Learning and Vision Research Group. Shuicheng YAN, National University of Singapore.

  2. Learning and Vision Research Group (LV): founded in early 2008; 20-30 members.

  3. Three Indicators of Excellence for Members: industry commercialization, competition awards, and high citations. Any one of these indicators is enough for a member to count as an excellent researcher.

  4. Past, Present and Future of LV. Past (revisiting the classics): subspace learning and sparsity/low-rank. Present (grounded in the present): deep learning. Future (looking ahead): smart services/devices (never-ending learning).

  5. Learning and Vision Group, Past: Subspace Learning, Sparsity/Low-rank [Block-Diagonality] [Guangcan LIU, Canyi LU, Jiashi FENG]

  6. Subspace: Graph Embedding and Extensions. Intrinsic graph: $G = \{\{x_i\}_{i=1}^{N}, S\}$ with $x_i \in \mathbb{R}^n$; penalty graph: $G^p = \{\{x_i\}_{i=1}^{N}, S^p\}$. The intrinsic graph is preserved by minimizing $\sum_{i \neq j} \|y_i - y_j\|^2 S_{ij} = \mathrm{Tr}(Y L Y^T)$, with $Y = [y_1, y_2, \ldots, y_N]$, $L = D - S$, $D_{ii} = \sum_{j \neq i} S_{ij}$; the penalty graph is suppressed by maximizing $\sum_{i \neq j} \|y_i - y_j\|^2 S^p_{ij} = \mathrm{Tr}(Y L^p Y^T)$, with $L^p = D^p - S^p$, $D^p_{ii} = \sum_{j \neq i} S^p_{ij}$. Graph Embedding and Extensions: A General Framework for Dimensionality Reduction, TPAMI'07, Yan, et al.
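  The slide stops at the two separate objectives; in the cited TPAMI'07 framework they are combined (stated here for completeness) into the graph-preserving criterion
  $$ Y^* = \arg\min_{\mathrm{Tr}(Y L^p Y^T) = d} \mathrm{Tr}(Y L Y^T), $$
  which is solved as the generalized eigenvalue problem $L y = \lambda L^p y$ (or with $L^p$ replaced by a diagonal constraint matrix for scale normalization).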

  7. Subspace: Graph Embedding and Extensions. With the intrinsic graph $G = \{\{x_i\}_{i=1}^{N}, S\}$ and penalty graph $G^p = \{\{x_i\}_{i=1}^{N}, S^p\}$ as above, the framework covers four types:
  ● Direct graph embedding: solve the graph-preserving criterion directly for $Y$. Examples: the original PCA and LDA, ISOMAP, LLE, Laplacian Eigenmap.
  ● Linearization: $y_i = W^T x_i$. Examples: PCA, LDA, LPP, LEA.
  ● Kernelization: apply the linear form after a kernel-induced mapping of the $x_i$. Examples: KPCA, KDA.
  ● Tensorization: $y_i = x_i \times_1 W_1 \times_2 W_2 \cdots \times_n W_n$. Examples: CSA, DATER.
  Graph Embedding and Extensions: A General Framework for Dimensionality Reduction, TPAMI'07, Yan, et al.
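  As a concrete illustration (not on the original slide, but standard for the cited framework), LPP arises from the linearization $y_i = w^T x_i$ with a heat-kernel affinity: taking $S_{ij} = \exp(-\|x_i - x_j\|^2 / t)$ for neighboring points and the constraint $w^T X D X^T w = 1$, the criterion becomes
  $$ \min_{w} \sum_{i \neq j} (w^T x_i - w^T x_j)^2 S_{ij} \;=\; \min_{w} w^T X L X^T w, $$
  solved by the generalized eigenvalue problem $X L X^T w = \lambda X D X^T w$.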

  8. Block-diagonality 1: Low-Rank Representation (LRR). Given data $X$, learn the affinity matrix by LRR. Theorem: the solution to LRR is block diagonal when the data are drawn from independent subspaces, i.e., it yields a block-diagonal affinity matrix. Robust recovery of subspace structures by low-rank representation, TPAMI'13, Liu, et al.
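  The LRR objective itself is an image on the original slide; for reference, the model in the cited TPAMI'13 paper is commonly written as
  $$ \min_{Z, E} \; \|Z\|_* + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = XZ + E, $$
  where $\|Z\|_*$ is the nuclear norm, $\|E\|_{2,1}$ penalizes column-wise (sample-specific) corruption, and the affinity matrix is built from a symmetrization of the optimal $Z$.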

  9. Block-diagonality 2: Unified Block-diagonal Conditions. Theorem: the solution to the problem below is block diagonal if (1) the data lie in independent subspaces, and (2) the regularizer satisfies the Enforced Block Diagonal (EBD) conditions or the solution is unique. Many known regularizers satisfy the EBD conditions (the three conditions themselves are given in the paper). Robust and efficient subspace segmentation via least squares regression, ECCV'12, Lu, et al.
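  The problem statement and the list of regularizers are images on the original slide; paraphrasing the cited ECCV'12 paper (the exact norms and conditions should be checked against it), the unified problem has the form
  $$ \min_{Z} \; f(Z) \quad \text{s.t.} \quad X = XZ, \qquad \text{or, with noise,} \qquad \min_{Z} \; \|X - XZ\|_F^2 + \lambda f(Z), $$
  and regularizers such as $\|Z\|_1$, $\|Z\|_*$, and the Frobenius norm $\|Z\|_F^2$ used by least squares regression (LSR) are among those satisfying the EBD conditions.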

  10. Block-diagonality 3: Hard Block-diagonal Constraint. Key property; the block-diagonal prior; LRR with a hard block-diagonal constraint. Robust Subspace Segmentation with Laplacian Constraint, CVPR'14, Feng, et al.
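  The formulas here are images on the original slide; the key spectral property such Laplacian-based constraints rely on (standard spectral graph theory, stated here for completeness) is that for a symmetric nonnegative affinity $W$ with Laplacian $L_W = \mathrm{Diag}(W\mathbf{1}) - W$, the multiplicity of the zero eigenvalue of $L_W$ equals the number of connected components of the graph. Enforcing $\mathrm{rank}(L_W) = N - k$ on the affinity built from the representation matrix therefore forces exactly $k$ diagonal blocks, which is the sense in which the cited CVPR'14 work turns block-diagonality into a hard constraint on an LRR-style objective.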

  11. Learning and Vision Group, Present: NUS-Purine, a Bi-graph based Deep Learning Framework [A³: Architecture, Algorithms, Applications]

  12. Deep Learning in the Learning and Vision (LV) Research Group
  ● Architecture: Purine, a general, bi-graph based deep learning framework; multi-PC and multi-CPU/GPU with approximately linear speedup; highly re-usable, bridging academia and industry.
  ● Algorithms: Network-in-Network + Computational Baby Learning; more human-brain-like network structures, learning processes, and regularizers.
  ● Applications: landing smart services/devices on cloud/embedded systems; object analytics, product search/recommendation, human analytics, and others.

  13. Deep Learning in the Learning and Vision (LV) Research Group (continued). The same Architecture / Algorithms / Applications structure as above, with selected results:
  ● Competition awards: 4 winner awards in VOC; one 2nd prize in VOC; 2nd prize in ImageNet'13; 1st prize in ImageNet'14.
  ● Best paper/demo awards: ACM MM12, ACM MM13.
  ● Other highlights: LFW 98.78% (2nd best); best human parsing performance; cross-age synthesis; face analysis with occlusions; technology also licensed.

  14. A³-I. Architecture. Purine: a Bi-graph based Deep Learning Framework [Min LIN, Xuan LUO, Shuo LI]

  15. What is "Purine"?
  ● Purine benefits from the open-source deep learning framework Caffe (http://caffe.berkeleyvision.org/).
  ● In Purine, the math functions and core computations are adapted from Caffe.
  ● The name alludes to the close molecular relationship between purine and caffeine.

  16. Difference from Caffe. [Side-by-side comparison figure: Caffe vs. Purine]

  17. Definition Graph vs. Computation Graph. [Figure: a definition graph and the corresponding computation graph of a convolutional layer]

  18. Definition Graph vs. Computation Graph. [Figure: a definition graph and the corresponding computation graph of a dropout layer]

  19. Purine Overview. Two subsystems in Purine:
  ● Interpretation: compose the network in Python and generate the computation graph in YAML (a hypothetical sketch of this step follows below).
  ● Optimization: dispatch and solve computation graphs.
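  To make the Interpretation step concrete, here is a hypothetical Python sketch of describing a small graph and emitting YAML with PyYAML; the helper names and field layout simply mirror the YAML examples on the Parallelization Implementation slide and are not Purine's actual API.

    # Hypothetical sketch: compose a tiny graph in Python, emit YAML (not Purine's real API).
    import yaml  # PyYAML

    def blob(name, size, ip="127.0.0.1", device=0):
        return {"type": "blob", "name": name, "size": size,
                "location": {"ip": ip, "device": device}}

    def op(op_type, name, inputs, outputs, ip="127.0.0.1", device=0, thread=0):
        return {"type": "op", "op_type": op_type, "name": name,
                "inputs": inputs, "outputs": outputs,
                "location": {"ip": ip, "device": device}, "thread": thread}

    graph = [
        blob("data",   [128, 3, 224, 224]),
        blob("weight", [96, 3, 11, 11]),
        blob("top",    [128, 96, 55, 55]),
        op("Conv", "conv1", ["data", "weight"], ["top"]),
    ]
    print(yaml.dump(graph, sort_keys=False))  # computation-graph description in YAML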

  20. Basic Components
  ● Blob: a tensor that contains data.
  ● Op: an operator that performs computation on blobs and outputs blobs.
  ● Ops are modular: they can be developed and packed in a shared library with some common functions exported, and Purine can then dynamically load the ops like extensions.
  ● Built-in Op types: Conv, ConvDown, ConvWeight, Inner, InnerDown, InnerWeight, Bias, BiasDown, Pool, PoolDown, Relu, ReluDown, Softmax, SoftmaxDown, SoftmaxLoss, SoftmaxLossDown, Gaussian, Bernoulli, Constant, Uniform, Copy, Merge, Slice, Sum, WeightedSum, Mul, Swap, Dumper, Loader.

  21. Sub-system 1: Interpretation. [Figure: a definition graph and the computation graph generated from it]

  22. Sub-system 2: Optimization. How is a computation graph solved?
  ● Start from the sources and stop at the sinks; this applies to any Directed Acyclic Graph (DAG).
  ● An Op computes when all its inputs are ready; a Blob is ready when all its inputs have been computed.
  ● All computations are event based and asynchronous, parallelized where possible (a minimal sketch of this scheduling rule follows below).
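  The following is a minimal, single-threaded Python sketch of the event-driven rule above (the real system is asynchronous and multi-threaded); the Blob/Op classes here are illustrative only and are not Purine's actual API.

    # Illustrative sketch of event-driven DAG execution (not Purine's real API).
    from collections import deque

    class Blob:
        def __init__(self, name):
            self.name = name
            self.pending = 0        # producer ops that have not finished yet
            self.consumers = []     # ops waiting on this blob

    class Op:
        def __init__(self, name, inputs, outputs, fn):
            self.name, self.inputs, self.outputs, self.fn = name, inputs, outputs, fn
            self.pending = len(inputs)          # input blobs that are not ready yet
            for b in inputs:
                b.consumers.append(self)
            for b in outputs:
                b.pending += 1                  # this op is one of the blob's producers

    def run_graph(source_blobs):
        """Fire ops as soon as all their inputs are ready, starting from source blobs."""
        ready = deque(b for b in source_blobs if b.pending == 0)
        while ready:
            blob = ready.popleft()
            for op in blob.consumers:
                op.pending -= 1
                if op.pending == 0:             # all inputs ready: the op can compute
                    op.fn()
                    for out in op.outputs:      # one producer of each output has finished
                        out.pending -= 1
                        if out.pending == 0:    # blob ready once all producers finished
                            ready.append(out)

    # Usage: a single Conv node, top = conv(data, weight); sources are blobs with no producers.
    x, w, y = Blob("data"), Blob("conv1_weight"), Blob("conv1_top")
    conv = Op("conv1", [x, w], [y], fn=lambda: print("running conv1"))
    run_graph([x, w])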

  23. Why a Computation Graph?
  ● Less hard coding: all tasks, both the algorithm and the parallel computing, are consistently defined in graphs.
  ● The solver and the forward and backward passes live in the same graph. With only a definition graph, concepts like the forward and backward pass have to be hard coded and alternated; with a computation graph, that logic is in the graph itself.
  ● Any scheme of parallelism can be expressed in the computation graph.

  24. Parallelization Implementation. Properties of Ops and Blobs:
  ● Location: where the blob/op resides, including the IP address of the target machine and which device it is on (CPU/GPU).
  ● Thread: a thread is needed for an op because both CPUs and GPUs can be multi-threaded (streams, in terms of NVIDIA GPUs).
  Example Blob defined in YAML:
    type: blob
    name: weight
    size: [96, 3, 11, 11]
    location:
      ip: 127.0.0.1
      device: 0
  Example Op defined in YAML:
    type: op
    op_type: Conv
    name: conv1
    inputs: [ bottom, weight ]
    outputs: [ top ]
    location:
      ip: 127.0.0.1
      device: 0
    thread: 1
    other fields ...

  25. Parallelization 1 (Pipeline). One computation graph can span multiple machines, via the special Op: Copy. Case 1, pipeline:
  ● If locations A and B are the same machine but different devices, Copy does one of: cudaMemcpyHostToDevice, cudaMemcpyDeviceToDevice, or cudaMemcpyDeviceToHost.
  ● If locations A and B are on different machines, Copy resides on both machines: the source side calls nn_send(socket, data) and the target side calls nn_receive(socket, data).
  ● Copy is executed as soon as its input blob is ready and runs in its own worker thread, so computation and data transfer are overlapped wherever possible.
  ● GPU inbound and outbound copies are in different streams, fully utilizing CUDA's dual copy engines.
  How to run this pipeline? See the next slide.

  26. Parallelization 1 (Pipeline). [Figure: Graph 1, Graph 2, Graph 3; graphs/subgraphs are replicated and iterated]

  27. Parallelization 2 (Data Parallelism). Case 2, data parallelism:
  ● Explicitly duplicate the nets at different locations.
  ● Each duplicate runs on different data.
  ● Weight gradients are gathered at a parameter server.

  28. Parallelization 2 (Data Parallelism). Overlap data transfer and computation:
  ● Higher-layer gradients are computed earlier than lower-layer gradients.
  ● A higher layer can send its gradients to the parameter server and get the updated weights back while the lower layers are still doing their computation; this is especially true for very deep networks.
  ● Data parallelism works even for fully connected layers: although FC layers have many parameters, the latency is hidden.
  ● Cross-machine (network) latency is therefore less of a problem.
  A small sketch of this overlap follows below.
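  The sketch below illustrates the overlap in plain Python, assuming a hypothetical push_and_pull() round trip to a parameter server and a layer.backward() method; none of these names are Purine's actual API.

    # Illustrative sketch: overlap gradient exchange with ongoing backprop.
    from concurrent.futures import ThreadPoolExecutor

    def backward_with_overlap(layers, push_and_pull):
        """Backprop from top to bottom; ship each layer's gradient as soon as it exists."""
        futures = []
        with ThreadPoolExecutor(max_workers=4) as pool:
            for layer in reversed(layers):      # higher layers finish first
                grad = layer.backward()         # compute this layer's gradient
                # Send the gradient and fetch updated weights in the background
                # while the lower layers keep computing.
                futures.append(pool.submit(push_and_pull, layer.name, grad))
            for f in futures:                   # wait for all parameter updates
                f.result()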

  29. Profiling Result
  ● Data transfer overlaps with computation; the parameter update of the lowest layer is visible in the trace.
  ● Note that the 8 GPUs are on different machines.
  ● 8 GPUs train GoogLeNet in 40 hours; top-5 error rate 12.67% (still tuning).
  [Bar chart: images per second (0 to 800) versus number of GPUs (1, 2, 3, 4, 8)]

  30. A³-II. Algorithms. Network-in-Network [More Human-brain-like Network Structure] [Min LIN, Qiang CHENG]
