SLIDE 1

Sequence Prediction Using Neural Network Classifiers

Yanpeng Zhao ShanghaiTech University

ICGI, Oct 7th, 2016, Delft, the Netherlands

SLIDE 2

Sequence Prediction

4 3 5 0 4 6 1 3 1 ?

What’s the next symbol?

SLIDE 3

Classification Perspective

[Diagram] The input sequence 4 3 5 0 4 6 1 3 1 is fed to a classifier, which outputs a multinomial distribution over the alphabet {-1, 0, 1, 2, 3, 4, 5, 6}; the most likely next symbol is the prediction.

SLIDE 4

Representation of Inputs

Continuous vector representation of the discrete symbols

King – Man + Woman ≈ Queen

Images are from: https://blog.acolyer.org/2016/04/21/the-amazing-power-of-word-vectors/

SLIDE 5

Representation of Inputs

Construct inputs for classifiers using learned word vectors

[Diagram] An input sample is a fixed-length window of the previous symbols (e.g. ... 4 3 5 0 4 6 1 3 1, left-padded with -2 where the history is shorter than the window), and its label is the next symbol (here 5).

Word vectors are concatenated or stacked. We predict the next symbol from the previous k = 15 symbols, each represented by a 30-dimensional vector.
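A minimal sketch (in Python/NumPy, not the authors' MXNet/TensorFlow code) of this input construction; the embedding table `emb`, the padding index, and the remapping of the alphabet to indices 0..7 are illustrative assumptions:

```python
import numpy as np

# Build one classifier input: look up the last K = 15 symbols in a learned
# embedding table and concatenate them into a single 450-dim vector.
K, DIM = 15, 30
N_SYMBOLS = 8            # assumed: alphabet plus the end marker, remapped to 0..7
PAD = N_SYMBOLS          # assumed: extra index reserved for padding

emb = np.random.randn(N_SYMBOLS + 1, DIM)   # stand-in for learned word vectors

def make_sample(history):
    """Concatenate the vectors of the last K symbols (left-padded)."""
    window = list(history[-K:])
    window = [PAD] * (K - len(window)) + window
    return np.concatenate([emb[s] for s in window])   # shape (K * DIM,) = (450,)

x = make_sample([4, 3, 5, 0, 4, 6, 1, 3, 1])
print(x.shape)   # (450,)
```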

SLIDE 6

Neural Network Classifiers

[Diagram] As on slide 3: the input test sequence 4 3 5 0 4 6 1 3 1 is fed to the neural network classifier, which outputs a multinomial distribution over the alphabet {-1, 0, 1, 2, 3, 4, 5, 6}; the most likely next symbol is the prediction.

SLIDE 7

Multilayer Perceptrons (MLPs)

[Diagram] A feed-forward network: Input → Hidden Layer 1 → Hidden Layer 2 → Softmax Output, with a bias unit (+1) feeding each layer. Forward pass:

z^(2) = W^(1) x + b^(1),  a^(2) = f(z^(2))
z^(3) = W^(2) a^(2) + b^(2),  a^(3) = f(z^(3))
y = softmax(W^(3) a^(3) + b^(3))

SLIDE 8

Multilayer Perceptrons (MLPs)

[Diagram] The same network as on slide 7: Input → Hidden Layer 1 → Hidden Layer 2 → Softmax Output, with bias units (+1).

|x| = 450: 15 symbols, with a 30-dimensional vector for each symbol; the hidden layer sizes are |a^(2)| = 750 and |a^(3)| = 1000.
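A minimal NumPy sketch of the forward pass described on these two slides; the activation function (ReLU), the random stand-in weights, and the output alphabet size of 8 are assumptions, not taken from the slides (the actual model was trained in MXNet):

```python
import numpy as np

# MLP forward pass: 450 -> 750 -> 1000 -> softmax over the alphabet.
def relu(z): return np.maximum(z, 0.0)           # assumed activation f

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
sizes = [450, 750, 1000, 8]                      # input, hidden 1, hidden 2, alphabet
Ws = [rng.normal(0, 0.01, (m, n)) for n, m in zip(sizes, sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]

a = rng.normal(size=450)                         # one input sample (15 * 30)
for W, b in zip(Ws[:-1], bs[:-1]):
    a = relu(W @ a + b)                          # hidden layers
p = softmax(Ws[-1] @ a + bs[-1])                 # distribution over next symbols
print(p.argmax(), p.sum())
```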

SLIDE 9

Convolutional Neural Networks (CNNs)

CNN model architecture adapted from Yoon Kim. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882, 2014.

k = 15, d = 30. Filter windows (heights) of 10, 11, 12, 13, 14, 15; 200 feature maps for each window.
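A minimal NumPy sketch of this CNN forward pass with the hyper-parameters above (the actual model was implemented in MXNet); the ReLU nonlinearity and the random stand-in filters are assumptions:

```python
import numpy as np

# Kim (2014)-style CNN: convolve h x d filter windows over the k x d input
# matrix of symbol vectors, max-pool over positions, concatenate the results.
rng = np.random.default_rng(0)
k, d, n_maps = 15, 30, 200
X = rng.normal(size=(k, d))                      # one input sample

pooled = []
for h in range(10, 16):                          # window heights 10..15
    F = rng.normal(0, 0.1, (n_maps, h * d))      # stand-in filters
    # slide each h x d window over the sequence
    windows = np.stack([X[i:i + h].ravel() for i in range(k - h + 1)])
    feats = np.maximum(windows @ F.T, 0.0)       # (positions, n_maps), ReLU assumed
    pooled.append(feats.max(axis=0))             # max over time -> (n_maps,)

z = np.concatenate(pooled)                       # 6 * 200 = 1200-dim feature vector
print(z.shape)                                   # fed to a softmax layer
```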

SLIDE 10

Long Short Term Memory Networks (LSTMs)

The standard LSTM updates at each time step t:

f_t = σ(W_f · [h_(t-1), x_t] + b_f)
i_t = σ(W_i · [h_(t-1), x_t] + b_i)
C̃_t = tanh(W_C · [h_(t-1), x_t] + b_C)
C_t = f_t ⊗ C_(t-1) + i_t ⊗ C̃_t
o_t = σ(W_o · [h_(t-1), x_t] + b_o)
h_t = o_t ⊗ tanh(C_t)

Images are from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

The number of time steps is 15, and the final hidden state h_t, of dimension 32, is fed to a logistic regression classifier.
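A minimal NumPy sketch of this setup (the actual model was implemented in TensorFlow); the random stand-in weights and the 8-class output are assumptions:

```python
import numpy as np

# One LSTM pass: 15 time steps, hidden size 32, final hidden state fed
# to a logistic-regression (softmax) classifier.
def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
T, d_in, d_h, n_classes = 15, 30, 32, 8
Wf, Wi, Wc, Wo = (rng.normal(0, 0.1, (d_h, d_h + d_in)) for _ in range(4))
bf = bi = bc = bo = np.zeros(d_h)

h = C = np.zeros(d_h)
for x_t in rng.normal(size=(T, d_in)):           # one input sequence of vectors
    z = np.concatenate([h, x_t])                 # [h_(t-1), x_t]
    f, i, o = sigmoid(Wf @ z + bf), sigmoid(Wi @ z + bi), sigmoid(Wo @ z + bo)
    C = f * C + i * np.tanh(Wc @ z + bc)         # cell state update
    h = o * np.tanh(C)                           # hidden state

logits = rng.normal(0, 0.1, (n_classes, d_h)) @ h  # classifier on final h_t
print(logits.argmax())
```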

SLIDE 11

Weighted n-Gram Model (WnGM)

[Diagram] Each n-gram model outputs a distribution over the next symbol, and the final prediction is their weighted sum, e.g. w_a · (0.11, 0.24, 0.21, 0.26, 0.18) + w_b · (0.01, 0.26, 0.29, 0.21, 0.23) + w_c · (0.28, 0.15, 0.25, 0.06, 0.26) = the mixture distribution, from which the most likely symbol is taken as the label.

We set n to 2, 3, 4, 5, and 6, with weights 0.3, 0.2, 0.2, 0.15, and 0.15, respectively.
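A minimal Python sketch of such a weighted n-gram mixture with the weights above; the count-based estimation, the skipping of unseen contexts, and the alphabet are illustrative assumptions:

```python
import numpy as np
from collections import Counter, defaultdict

# Each n-gram model predicts a next-symbol distribution from counts of
# (context, next symbol); the final prediction is the weighted sum.
ORDERS = {2: 0.3, 3: 0.2, 4: 0.2, 5: 0.15, 6: 0.15}   # n -> weight (slide 11)
ALPHABET = list(range(7))                              # assumed symbol set

def train(sequences):
    counts = {n: defaultdict(Counter) for n in ORDERS}
    for seq in sequences:
        for n in ORDERS:
            for i in range(len(seq) - n + 1):
                ctx, nxt = tuple(seq[i:i + n - 1]), seq[i + n - 1]
                counts[n][ctx][nxt] += 1
    return counts

def predict(counts, history):
    p = np.zeros(len(ALPHABET))
    for n, w in ORDERS.items():
        c = counts[n][tuple(history[-(n - 1):])]
        total = sum(c.values())
        if total:                                      # skip unseen contexts
            p += w * np.array([c[s] / total for s in ALPHABET])
    return p / p.sum() if p.sum() else np.full(len(ALPHABET), 1 / len(ALPHABET))

counts = train([[4, 3, 5, 0, 4, 6, 1, 3, 1, 5]])
print(predict(counts, [3, 1]).round(3))
```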

SLIDE 12

Overview of Experiments

  • Implementation
  • MLP & CNN were implemented in MXNet
  • LSTM was implemented in TensorFlow
  • https://bitbucket.org/thinkzhou/spice
  • System & Hardware
  • CentOS 7.2 (64-bit) server
  • Intel Xeon Processor E5-2697 v2 @ 2.70GHz & four Tesla K40m GPUs
  • Time cost
  • All models run on all datasets in less than 16 hours
SLIDE 13

Detailed Scores on Public Test Sets

The total score on the private test sets is 10.160324.

SLIDE 14

Discussion & Future Work

  • MLPs
  • make the best use of the symbol-order information
  • CNNs
  • should use a problem-specific model architecture
  • update the word vectors during training
  • train a deep averaging network (DAN) [Iyyer et al., 2015]
  • LSTMs
  • Future work
  • integrate neural networks into probabilistic grammatical models in sequence prediction

[Chart] Total scores by the different models on the public test sets:
3-Gram: 8.666, SL: 7.444, MLP: 9.802, CNN: 9.593, WnGram: 9.237, LSTM: 9.325
SLIDE 15

Thanks