Convolutional Prototype Ensemble Robust Stream Classification & - PowerPoint PPT Presentation

UT DALLAS Erik Jonsson School of Engineering & ComputerScience Convolutional Prototype Ensemble Robust Stream Classification & Novel Class Detection Zhuoyi Wang * , Hemeng T ao * , Swarup Changra * , Latifur Khan * * The University of T exas at Dallas, Richardson TX,USA This material is based upon work supported by FEARLESS engineering

Agenda ❑ High Dimensional Data Stream Mining and Challenges ❑ Shortcomings of Current Solutions ❑ The Proposed Approach – Novel Class Detection – Classification – Incremental Learning – Performance Analysis & Improvement ❑ Experiments ❑ Discussion 2 FEARLESS engineering

High Dimensional Stream Mining ➢ High Dimensional Data Stream: – continuous flow of high dimensional instances. – common in life ’s image recognition and text application. Scene Stream in Flow of news autonomous system Summary in Social Network. ➢ Challenge: ➢ May evolve new emerging class during stream scenario. ➢ Limited amount of labeled data. ➢ Time limited for the execution of learning methods. 3 FEARLESS engineering

Evolving new class (Novel Class) Novel class High dimensional space in real world image data set (FASHION-MNIST) Novel Class Previous work: Traditional low dimensional space of ODIN[4], Open-Set[5] IRIS dataset Previous work: ECSMiner[1], SAND[2], ECHO[3]. [1]. Al-Khateeb, T., Masud, M. M., Khan, L., Aggarwal, C., Han, J., & Thuraisingham, B. (2012, December). Stream classification with recurring and novel class detection using class- based ensemble. In 2012 ICDM. [2]. Haque, Ahsanul, Latifur Khan, and Michael Baron. "Sand: Semi-supervised adaptive novel class detection and classification o ver data stream." In AAAI 2016. [3]. Haque, Ahsanul, et al. "Efficient handling of concept drift and concept evolution over stream data." In 2016 ICDE. [4]. Liang, Shiyu, Yixuan Li, and R. Srikant. "Enhancing the reliability of out-of-distribution image detection in neural networ ks." In ICLR 2017. [5]. Bendale, Abhijit, and Terrance E. Boult. "Towards open set deep networks." In CVPR 2016. 4 FEARLESS engineering

Limitation of Time and Space ➢ Generating instances from novel/unseen class sets ➢ Incrementally training classifier ensemble from new emerge class sets. Novel Class chunk D 2 D 1 D 3 D n Class D 3 D 4 D 4 D 5 Set New coming stream instances Network Note: D i contains M 1 M 2 M 5 M 3 Models instances from novel class set i Ensemble M a Mb Mc Prediction Previous work: [1]. Han, Shizhong , et al. "Incremental boosting convolutional neural network for facial action unit recognition." In NIPS 2016. [2]. Rebuffi, Sylvestre-Alvise, et al. "icarl: Incremental classifier and representation learning." In CVPR. 2017. 5 FEARLESS engineering

Shortcomings of Current Solutions ❖ Shortcomings : – Novel Class Detection: For traditional appraoch like SAND[1], ECHO[2], they typically suiable for the low dimensional feature space, where the novel class instances farther away from clusters containing known class examples. For recent years Deep Neurual Network (DNN) based methods such as [3] and [4], they utilize the DNN with softmax output and filter threshold. However, softmax function tend to allocate the new coming samples to a known class with high confidence[5], only apply the softmax output for rejecting novelty class is not suitable enough. [1]. Haque, Ahsanul, Latifur Khan, and Michael Baron. "Sand: Semi-supervised adaptive novel class detection and classification over data stream." In AAAI 20 16. [2]. Haque, Ahsanul, et al. "Efficient handling of concept drift and concept evolution over stream data." In 2016 ICDE. [3]. Han, Shizhong , et al. "Incremental boosting convolutional neural network for facial action unit recognition." In NIPS 2016. [4]. Liang, Shiyu, Yixuan Li, and R. Srikant. "Enhancing the reliability of out-of- distribution image detection in neural networks." In ICLR 2017. [5]. C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In ICLR, 2014. 6 FEARLESS engineering

Shortcomings of Current Solutions ❖ Shortcomings : – Incremental Learning Current methods should also apply incremental learning to adapt changes along the high dimensional stream over a long period of time. Typical solution for the DNN mainly apply network ensemble [1] or fine tune, like: Incremetal ensemble of different DNN or layer Shortcomes: Incresing of either DNN structure parameters or layer embedding during continous or lifelong learning scenario would be both time and space consuming. [1]. Han, Shizhong , et al. "Incremental boosting convolutional neural network for facial action unit recognition." In NIPS 2016. 7 FEARLESS engineering

Motivation We could model the data of each (existed) class as a Gaussian mixture component, so the novel class could be regarded as a different distribution compared with existed ones, although they may bear some resemblance to the existed classes. Existed Class Potential Novel Class Distribution Distribution Novel Class Detection: could be addressed through filter out the anomaly large distance between different distributions; Incremental Learning: could be solved by adding new distribution and updating existed ones. 8 FEARLESS engineering

Proposed Approach: Prototype Ensemble Learning ➢ Prototype means class-characteristic distribution, if each class is regarded as a Gaussian mixture distribution, the prototypes would act as the mean value of each class' gaussian components Prototype2 Prototype1 Prototype3 Class: Car Prototype4 ✓ Novel Class Detection: – Similar instances of a class forms different prototypes under a certain class, outlier exmaples potentially form a new prototype associated with a novel class which is easier to be detected. ✓ Stream Classification: – Ensemble prototype as classifier to trained on different section of stream instances, and used for classification. ✓ Incremental Learning: – Create new prototypes accoring to novel class instances continuesly during stream process, then update the existing prototypes to make it adapt changes along the stream. 9 FEARLESS engineering

Overview: CPE 10 FEARLESS engineering

Prototype Establish We employ a Deep Neural Network architecture with convolutional layers. For a given input X, the output of the network is denoted by the , where f is the feature representation and θ is the correspond network parameters . For every class i, we select a small set of instances D i from D, and form the exemplar set Ɛ i . Then we form the initial prototype: Here, each prototype is denoted by 𝑞 𝑗𝑘 , i indicates a class label index in 𝑍 , and j is the prototype index. We denote the set of prototypes for each class 𝑧 𝑗 ∈ 𝑍 by 𝑄 𝑗 . 𝑞 𝑗𝑘 ∈ 𝑄 𝑗 11 FEARLESS engineering

Prototype Ensemble Loss We focus on improving local separation between prototypes. 12 FEARLESS engineering

Overall loss function for Training Similar with softmax/cross-entropy, the probability that x belongs to the prototype 𝑞 𝑗𝑘 is defined as: where C is the size of class set Y , K is the maximum number of the prototypes per class. Therefore, the probability of class label assignment for x is given: 13 FEARLESS engineering

Overall loss function for Training ➢ Overall objective function maximize the probability of x being Act as a regularization of the loss function associated with a prototype in P, could be regarded as the Cross Entropy loss for prototype. 14 FEARLESS engineering

Novel Class Detection Coming instance X Go through DNN and get Calculate and compare distance with other prototypes Threshold of prototype to Ensemble prot 𝐐 𝐣 for class i determined accept or reject: (Step 1) P i1 P ik P i2 . . . outlier outlier outlier If the distance of x to it’s nearest prototype X comes from AND (Step 2) is larger than correspond threshold, we class i False True degerming it as a novel class instance X is a potential novel class instance 15 FEARLESS engineering

Incremental Learning 16 FEARLESS engineering

Prototype Based Incremental Learning Period 1 Period 2 Period 3 Establish New Prototype Then: Apply back-prop to update parameter θ in Network model. 17 FEARLESS engineering

Prototype Based Incremental Learning Period 1 Period 2 Period 3 Update Existing Prototype Then: Apply back-prop to update parameter θ in Network model. 18 FEARLESS engineering

Complexity Time Complexity: Size of novel class candidate buffer is 𝑡 𝐶 , time complexity of calculating the gradient of one example is a constant 𝐷 , mini- batch size is 𝑡 𝑛𝑗𝑜𝑗 , epochs number is 𝑜 𝑓 , the number of classes in Stream is 𝑍 ′ . The time complexity of CPE is O ቀ 2 𝐷𝑜 𝑓 ൫ 𝑇 𝑛𝑗𝑜𝑗 + 𝑡 𝐶 𝑍 ′ ൯ ቁ . Space Complexity: It is a constant since the space used by exemplars, prototypes, buffer and network are constant. 19 FEARLESS engineering

Experiment Name of Data Number of Number of Set Instances Features FASHION-MNIST 70,000 784 SVHN 100,000 3072 CIFAR-10 60,000 3072 LSUN 80,000 4096 CINIC 106,110 4096 New-York-News 66,000 300 CPE setup: 1. DenseNet as the DNN straucture. 2. M = 2000 exemplars, K = 10 (Maximum prototype amount) 20 FEARLESS engineering

Convolutional Prototype Ensemble Robust Stream Classification & - PowerPoint PPT Presentation

UT DALLAS Erik Jonsson School of Engineering & ComputerScience Convolutional Prototype Ensemble Robust Stream Classification & Novel Class Detection Zhuoyi Wang * , Hemeng T ao * , Swarup Changra * , Latifur Khan * * The University of T

Boosting (ensemble) Module 4 - Ensemble classifiers - Objectives module 4: boosting (ensemble

Convolutional Neural Networks Convolutional neural networks One of the major kinds of ANNs in use

Convolutional Neural Networks ---- Off the shelf top notch performances Convolutional Neural

Outlier Outlier Outlier- Outlier - -robust - robust robust robust identification

? sync ref chosen as sync source by Listener Stream B: Presentation Stream C: timestamps

What is a prototype? Design Thinking + 5-Stage Process Design/ Empathize Define Ideate Test

Ensemble Methods Albert Bifet May 2012 COMP423A/COMP523A Data Stream Mining Outline 1.

Variable selection bias Bias in Ensemble Bias in Ensemble Methods Methods Variable selection

Anytime Reliability of Systematic LDPC Motivation Convolutional Codes LDPC Convolutional Codes

Convolutional Autoencoder (CAE) Prof. Seungchul Lee Industrial AI Lab. Convolutional Autoencoder

Introduction CSCE 970 CSCE 970 Lecture 4: Lecture 4: Convolutional Convolutional Neural

Convolutional Kuan-Ting Lai 2020/3/31 Neural Network Convolutional Neural Networks (CNN)

Stream Ciphers Stream Ciphers 1 Stream Ciphers Generalization of one-time pad Trade

SiPM Active Ganging cryogenic test results ING. ESTEBAN CRISTALDO Prototype V1 Prototype V2

Short Course in Supervised Learning Robust Optimization and Machine Learning Robust Supervised

Fresh water stream ecosystem Gr ov p 2 The description of stream lives Quadrat 1: Hong Kong Newt

PREserving Linked DAta: An introduc7on Carlo Meghini ISTI

Recommended Reading A Brief Introduction to OpenMP OpenMP FAQ http://openmp.org/openmp-faq.html

for Complex Analy;cal Queries Milos Nikolic, Mohammed El Seidy,

Resource-bounded functional programming on the JVM and .NET Stephen Gilmore Mobile Resource

A Novel Layer Sharing-based Incremental Learning via Bayesian Optimization Bomi Kim, Taehyeon

A constructive approach to incremental learning Mario Rosario Guarracino October 12, 2006

How to Train Your Perceptron 16-385 Computer Vision (Kris Kitani) Carnegie Mellon University

Incremental Classification: First Step into Lifelong Learning PAN Xinyu MMLab, Department of IE