Neural Networks for Machine Learning Lecture 15c: Deep autoencoders for document retrieval and visualization - PowerPoint PPT Presentation


SLIDE 1

Geoffrey Hinton

Nitish Srivastava, Kevin Swersky, Tijmen Tieleman, Abdel-rahman Mohamed

Neural Networks for Machine Learning Lecture 15c: Deep autoencoders for document retrieval and visualization

SLIDE 2

How to find documents that are similar to a query document

  • Convert each document into a bag of words (see the sketch after this list).
    – This is a vector of word counts, ignoring word order.
    – Ignore stop words (like "the" or "over").

  • We could compare the word counts of the query document with those of millions of other documents, but this is too slow.
    – So we reduce each query vector to a much smaller vector that still contains most of the information about the content of the document.

[Slide figure: an example bag-of-words count vector over words such as fish, cheese, vector, count, school, query, reduce, bag, pulpit, iraq, word.]
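As a concrete illustration of the first bullet, here is a minimal Python sketch (not from the lecture) of turning a document into a bag-of-words count vector; the stop-word list and example sentence are illustrative assumptions.

```python
from collections import Counter

# Minimal sketch: count words while ignoring order and a few stop words.
# The stop-word list and example text are made up for illustration.
STOP_WORDS = {"the", "over", "a", "of", "and", "to"}

def bag_of_words(text):
    words = (w.lower().strip(".,") for w in text.split())
    return Counter(w for w in words if w and w not in STOP_WORDS)

print(bag_of_words("The query document mentions fish and cheese, and fish swim over the school."))
# e.g. Counter({'fish': 2, 'query': 1, 'document': 1, ...})
```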

SLIDE 3

How to compress the count vector

  • We train the neural network to reproduce its input vector as its output.

  • This forces it to compress as much information as possible into the 10 numbers in the central bottleneck.

  • These 10 numbers are then a good way to compare documents (see the architecture sketch after the figure note below).

[Slide figure: autoencoder architecture. Input vector of 2000 word counts → 500 neurons → 250 neurons → 10-unit central bottleneck → 250 neurons → 500 neurons → 2000 reconstructed counts (output vector).]
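To make the bottleneck architecture concrete, here is a minimal numpy sketch of the 2000-500-250-10-250-500-2000 autoencoder described on this slide. The layer sizes come from the slide; using logistic units in every layer and small random weights are illustrative assumptions (the real network is pretrained as a stack of RBMs and then fine-tuned with backprop, as described two slides later).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Layer sizes from the slide: encoder, 10-unit code, decoder.
layer_sizes = [2000, 500, 250, 10, 250, 500, 2000]
rng = np.random.default_rng(0)
weights = [rng.normal(0, 0.01, (m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(count_vector):
    """Return (10-D code, 2000-D reconstruction) for one bag-of-words vector."""
    h = count_vector
    code = None
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = sigmoid(h @ W + b)
        if layer_sizes[i + 1] == 10:   # the central bottleneck
            code = h
    return code, h

code, reconstruction = forward(rng.random(2000))
print(code.shape, reconstruction.shape)   # (10,) (2000,)
```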

SLIDE 4

The non-linearity used for reconstructing bags of words

  • Divide the counts in a bag-of-words vector by N, where N is the total number of non-stop words in the document (see the numeric sketch after this list).
    – The resulting probability vector gives the probability of getting a particular word if we pick a non-stop word at random from the document.

  • At the output of the autoencoder, we use a softmax.
    – The probability vector defines the desired outputs of the softmax.

  • When we train the first RBM in the stack, we use the same trick.
    – We treat the word counts as probabilities, but we make the visible-to-hidden weights N times bigger than the hidden-to-visible weights, because we have N observations from the probability distribution.
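The following small numeric sketch illustrates the normalization and softmax target described above. The example counts, the stand-in decoder activations, and the scaling of the cross-entropy by N are illustrative assumptions rather than the lecture's exact training objective.

```python
import numpy as np

counts = np.array([2.0, 2.0, 2.0, 1.0, 1.0, 2.0])   # bag-of-words counts over non-stop words
N = counts.sum()                                     # total number of non-stop words
target_probs = counts / N                            # desired outputs of the softmax

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

# Stand-in for the decoder's activations feeding the softmax output layer.
logits = np.random.default_rng(0).normal(size=counts.size)
output_probs = softmax(logits)

# Cross-entropy between the target probability vector and the softmax output,
# scaled by N to treat the document as N draws from the word distribution
# (the same reasoning the slide uses for scaling the RBM weights).
cross_entropy = -N * np.sum(target_probs * np.log(output_probs))
print(cross_entropy)
```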

SLIDE 5

Performance of the autoencoder at document retrieval

  • Train on bags of 2000 words for 400,000 training cases of business documents.
    – First train a stack of RBMs. Then fine-tune with backprop.

  • Test on a separate 400,000 documents.
    – Pick one test document as a query. Rank-order all the other test documents by the cosine of the angle between their codes (see the sketch after this list).
    – Repeat this using each of the 400,000 test documents as the query (this requires 0.16 trillion comparisons).

  • Plot the number of retrieved documents against the proportion that are in the same hand-labeled class as the query document. Compare with LSA (a version of PCA).
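Below is a minimal sketch of the cosine-ranking step described above, with a small matrix of random 10-D codes standing in for the autoencoder's document codes.

```python
import numpy as np

rng = np.random.default_rng(0)
codes = rng.normal(size=(1000, 10))            # stand-in for 400,000 document codes
unit_codes = codes / np.linalg.norm(codes, axis=1, keepdims=True)

def rank_by_cosine(query_index):
    """Return indices of all other documents, best match first."""
    sims = unit_codes @ unit_codes[query_index]  # cosine similarity to the query
    sims[query_index] = -np.inf                  # exclude the query itself
    return np.argsort(-sims)

print(rank_by_cosine(0)[:5])   # five nearest documents to document 0
```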

SLIDE 6

Retrieval performance on 400,000 Reuters business news stories

SLIDE 7

First compress all documents to 2 numbers using PCA on log(1+count). Then use different colors for different categories.
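A minimal sketch of this 2-D PCA visualization step, with a random count matrix standing in for the real document data:

```python
import numpy as np

rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(500, 2000)).astype(float)  # documents x words (stand-in data)

X = np.log1p(counts)                     # apply log(1 + count)
X = X - X.mean(axis=0)                   # center before PCA

# Top two principal components via SVD; one 2-D point per document.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
coords_2d = X @ Vt[:2].T

print(coords_2d.shape)                   # (500, 2)
```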

SLIDE 8

First compress all documents to 2 numbers using the deep autoencoder. Then use different colors for different document categories.

SLIDE 9

Geoffrey Hinton

Nitish Srivastava, Kevin Swersky, Tijmen Tieleman, Abdel-rahman Mohamed

Neural Networks for Machine Learning Lecture 15d: Semantic hashing

SLIDE 10

Finding binary codes for documents

  • Train an autoencoder using 30 logistic units for the code layer.

  • During the fine-tuning stage, add noise to the inputs to the code units.
    – The noise forces their activities to become bimodal in order to resist the effects of the noise.
    – Then we simply threshold the activities of the 30 code units to get a binary code (see the sketch after the figure note below).

  • Krizhevsky discovered later that it's easier to just use binary stochastic units in the code layer during training.

[Slide figure: same autoencoder architecture as before, but with a 30-unit code layer. 2000 word counts → 500 neurons → 250 neurons → 30-unit code → 250 neurons → 500 neurons → 2000 reconstructed counts.]
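The sketch below illustrates the mechanics described above: noise is added to the inputs of the 30 logistic code units during fine-tuning, and afterwards their noise-free activities are simply thresholded to produce a 30-bit binary code. The noise level and the 0.5 threshold are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
code_inputs = rng.normal(size=30)   # stand-in for the net input to the 30 code units

# During fine-tuning: noise on the code units' inputs pushes their
# activities toward 0 or 1 so that the code can resist the noise.
noisy_activities = sigmoid(code_inputs + rng.normal(0.0, 4.0, size=30))

# After training: threshold the (noise-free) activities to get a 30-bit code.
binary_code = (sigmoid(code_inputs) > 0.5).astype(int)
print(binary_code)
```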

SLIDE 11

Using a deep autoencoder as a hash function for finding approximate matches

[Slide figure: the deep autoencoder used as a hash function that maps a document to a memory address, illustrated with a "supermarket search" analogy.]
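Here is a minimal sketch of how such a hash function could be used for approximate matching: each document's 30-bit code serves as a memory address, and nearby addresses (codes differing by a few bits) are probed to collect similar documents. The hash-table layout, probing radius, and random codes are illustrative assumptions, not the lecture's implementation.

```python
import numpy as np
from itertools import combinations

N_BITS = 30
rng = np.random.default_rng(0)
doc_codes = rng.integers(0, 2, size=(10_000, N_BITS))   # stand-in 30-bit document codes

def to_address(bits):
    return int("".join(map(str, bits)), 2)

# Build the hash table: address -> list of document ids stored there.
table = {}
for doc_id, bits in enumerate(doc_codes):
    table.setdefault(to_address(bits), []).append(doc_id)

def lookup(query_bits, max_flips=2):
    """Return ids of documents whose code is within max_flips bits of the query."""
    matches = []
    for n_flips in range(max_flips + 1):
        for positions in combinations(range(N_BITS), n_flips):
            probe = query_bits.copy()
            probe[list(positions)] ^= 1          # flip a few bits to reach nearby addresses
            matches.extend(table.get(to_address(probe), []))
    return matches

print(lookup(doc_codes[0])[:10])   # approximate matches for document 0
```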