
Deep Learning
Petr Pošík, petr.posik@fel.cvut.cz
Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Cybernetics


  1. CZECH TECHNICAL UNIVERSITY IN PRAGUE, Faculty of Electrical Engineering, Department of Cybernetics
Deep Learning
Petr Pošík, petr.posik@fel.cvut.cz
© P. Pošík 2020, Artificial Intelligence

  2. Deep Learning

  3. Question
Based on your current knowledge and intuition, which of the following options is the best characterization of deep learning (DL) and its relation to machine learning (ML)?
A. DL is any ML process that requires a deep involvement of a human designer in extracting the right features from the raw data.
B. DL is any solution to an ML problem that uses neural networks with a few, but very large, hidden layers.
C. DL is a set of ML methods allowing us not only to solve the problem at hand, but also to gain a deep understanding of the solution process.
D. DL is any method that tries to automatically transform the raw data into a representation suitable for the solution of our problem, often at multiple levels of abstraction.

  4-6. What is Deep Learning?
Conventional ML techniques:
■ Limited in their ability to process natural data in their raw form.
■ Successful applications required careful engineering and human expertise to extract suitable features.
Representation learning:
■ A set of methods allowing a machine to be fed with raw data and to automatically discover the representations suitable for correct classification/regression/modeling.
Deep learning:
■ Representation-learning methods with multiple levels of representation, at an increasing level of abstraction.
■ They compose simple, but often non-linear, modules that transform the representation at one level into a representation at a higher, more abstract level (see the sketch after this slide).
■ The layers learn to represent the inputs in a way that makes it easy to predict the target outputs.
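To make the layer-wise composition above concrete, here is a minimal NumPy sketch (not from the lecture; the layer sizes and activation choices are illustrative assumptions). Each module is just an affine map followed by a non-linearity, and stacking the modules re-represents the raw input at increasingly abstract levels:

    import numpy as np

    def layer(x, W, b, activation=np.tanh):
        """One simple non-linear module: affine map followed by a non-linearity."""
        return activation(W @ x + b)

    rng = np.random.default_rng(0)
    x = rng.normal(size=4)                    # raw input representation

    # Weights would normally be learned; random values suffice to show the flow.
    W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
    W2, b2 = rng.normal(size=(5, 8)), np.zeros(5)
    W3, b3 = rng.normal(size=(1, 5)), np.zeros(1)

    h1 = layer(x, W1, b1)       # first-level representation of x
    h2 = layer(h1, W2, b2)      # more abstract representation built from h1
    y_hat = layer(h2, W3, b3, activation=lambda z: 1 / (1 + np.exp(-z)))
    print(y_hat)                # prediction read off the last representation

In an actual network the matrices W1..W3 would be fitted by backpropagation so that the learned representations make the target easy to predict; the sketch only shows the composition.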

  7. A brief history of Neural Networks
■ 1940s: Model of the neuron (McCulloch, Pitts)
■ 1950s-60s: Modeling the brain using neural networks (Rosenblatt, Hebb, etc.)
■ 1969: Research stagnated after Minsky and Papert's book Perceptrons
■ 1970s: Backpropagation
■ 1986: Backpropagation popularized by Rumelhart, Hinton, and Williams
■ 1990s: Convolutional neural networks (LeCun)
■ 1990s: Recurrent neural networks (Schmidhuber)
■ 2006: Revival of deep networks, unsupervised pre-training (Hinton et al.)
■ 2013-: Huge industrial interest

  8-11. Terminology
■ Narrow vs wide: refers to the number of units in a layer.
■ Shallow vs deep: refers to the number of layers.
Making a deep architecture (see the sketch after this slide):
■ A classifier uses the original representation.
  [Diagram: input layer x1..x4 wired directly to the output layer ŷ1]
■ A classifier uses features which are derived from the original representation.
  [Diagram: input layer x1..x4, one hidden layer, output layer ŷ1]
■ A classifier uses features which are derived from the features derived from the original representation.
  [Diagram: input layer x1..x4, hidden layers 1 and 2, output layer ŷ1]
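The width/depth vocabulary and the three architectures above can be made concrete by writing each network as a list of layer widths. A small illustrative sketch in Python (the hidden-layer sizes are assumptions chosen for the example, not taken from the slides):

    def n_parameters(layer_sizes):
        """Weights + biases of a fully connected net with the given layer widths."""
        return sum(n_in * n_out + n_out
                   for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

    # The three architectures from the slide (inputs x1..x4, output y1):
    no_hidden  = [4, 1]        # classifier on the original representation
    one_hidden = [4, 5, 1]     # features derived from the original representation
    two_hidden = [4, 5, 3, 1]  # features derived from derived features

    # Width vs depth: a wider net grows a layer, a deeper net adds layers.
    wide_shallow = [4, 100, 1]
    narrow_deep  = [4, 5, 5, 5, 1]

    for name, arch in [("no hidden", no_hidden), ("one hidden", one_hidden),
                       ("two hidden", two_hidden), ("wide shallow", wide_shallow),
                       ("narrow deep", narrow_deep)]:
        print(f"{name:13s} {arch!s:16s} -> {n_parameters(arch):4d} parameters")

Counting parameters this way makes the distinction tangible: widening one layer and stacking several narrow layers can give similar parameter budgets while producing very different feature hierarchies (one-step vs multi-step).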

  12-13. Example: Word embeddings
Sometimes, even shallow architectures can do surprisingly well!
Representation of text (words, sentences):
■ Important for many real-world apps: search, ad recommendation, ranking, spam filtering, ...
■ Local representations (a concept is represented by a single node): N-grams, 1-of-N coding, bag of words.
  ■ Easy to construct.
  ■ Large and sparse.
  ■ No notion of similarity (synonyms, words with similar meaning); see the sketch after this slide.
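The listed drawbacks of local representations can be demonstrated in a few lines. A minimal sketch of 1-of-N coding over a toy three-word vocabulary (the vocabulary is an invented example):

    import numpy as np

    vocab = ["car", "automobile", "banana"]   # toy vocabulary, N = 3

    def one_hot(word):
        """1-of-N coding: a single 1 at the word's index, zeros elsewhere."""
        v = np.zeros(len(vocab))
        v[vocab.index(word)] = 1.0
        return v

    car, automobile = one_hot("car"), one_hot("automobile")

    # Each vector has one non-zero entry out of N (large and sparse for a real
    # vocabulary), and the synonyms "car" and "automobile" come out exactly as
    # dissimilar as any other pair: their dot product is 0.
    print(car @ automobile)    # -> 0.0

Dense embeddings (e.g., word2vec, which the deck turns to next) address exactly this shortcoming: similar words get nearby vectors.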
