SLIDE 1

Deep learning

Deep dual learning¹

Hamid Beigy

Sharif university of technology

December 21, 2019

¹Some slides are adapted from the slides of Tao Qin, Sreeja R. Thoom, et al.

Hamid Beigy | Sharif university of technology | December 21, 2019

SLIDE 2

Table of contents

1 Introduction
2 Dual learning
3 Dual Supervised Learning

SLIDE 3

Introduction

SLIDE 4

Three Pillars of Deep Learning

1 The three pillars are:

Big data: web pages, search logs, social networks, and new mechanisms for data collection such as conversation and crowd-sourcing.
Big models: 1000+ layers, tens of billions of parameters.
Big computing: CPU clusters, GPU clusters, TPU clusters, and FPGA farms, provided by Amazon, Azure, Alibaba, etc.

SLIDE 5

Some Challenges of Deep Learning

1 Big-Data Challenge

Today's deep learning relies heavily on huge amounts of human-labeled training data.

Task                  Typical training data
Image classification  Millions of labeled images
Speech recognition    Thousands of hours of annotated voice data
Machine translation   Tens of millions of bilingual sentence pairs

Human labeling is in general very expensive, and it is hard, if not impossible, to obtain large-scale labeled data for rare domains.

SLIDE 6

Machine translation

1 How do we translate from a source language to a destination language?
2 Main problems:

How do we translate words from the source language to the destination language?
How do we order words in the destination language?
How do we measure the goodness of a translation?
What type of corpus is needed? (monolingual or bilingual)
How do we build a sequence of translators? (Persian → English → French)

SLIDE 7

Neural machine translation (NMT)

1 In NMT², recurrent neural networks with LSTM or GRU units are used.

²Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." ICLR 2015.
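As a minimal illustration of the GRU units mentioned above, one forward step of a GRU cell can be sketched in numpy (the dimensions and random weights below are arbitrary toy values, not from any trained NMT system):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W_z, U_z, W_r, U_r, W_h, U_h):
    """One GRU forward step for input x and previous hidden state h."""
    z = sigmoid(W_z @ x + U_z @ h)               # update gate
    r = sigmoid(W_r @ x + U_r @ h)               # reset gate
    h_tilde = np.tanh(W_h @ x + U_h @ (r * h))   # candidate hidden state
    return (1.0 - z) * h + z * h_tilde           # blend old and candidate state

rng = np.random.default_rng(0)
d_in, d_hid = 8, 16
W = [rng.standard_normal((d_hid, d_in)) * 0.1 for _ in range(3)]
U = [rng.standard_normal((d_hid, d_hid)) * 0.1 for _ in range(3)]

# Encode a toy "sentence" of five word vectors into a final hidden state.
h = np.zeros(d_hid)
for _ in range(5):
    x_t = rng.standard_normal(d_in)
    h = gru_step(x_t, h, W[0], U[0], W[1], U[1], W[2], U[2])
print(h.shape)  # (16,)
```

An NMT encoder runs such a step over every source word; the decoder runs a second recurrent network conditioned on the encoder's output.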

SLIDE 8

Neural machine translation (NMT)

1 A critical disadvantage of the fixed-length context-vector design is its inability to remember long sentences.
2 The attention mechanism was proposed to help memorize long source sentences in NMT.
3 Another critical disadvantage of this model is the training set: we need a large bilingual corpus.
4 Dual learning was introduced to overcome the need for a large bilingual corpus.
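The attention idea in point 2 can be sketched as dot-product attention over the encoder's hidden states (a toy numpy version; the original Bahdanau et al. model uses a learned additive scoring function instead):

```python
import numpy as np

def attention(decoder_state, encoder_states):
    """Return a context vector: softmax-weighted sum of encoder states."""
    scores = encoder_states @ decoder_state   # one score per source position
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax over source positions
    return weights @ encoder_states, weights  # context vector, attention weights

rng = np.random.default_rng(1)
enc = rng.standard_normal((7, 16))  # 7 source positions, hidden size 16
dec = rng.standard_normal(16)       # current decoder state
context, w = attention(dec, enc)
print(context.shape, round(w.sum(), 6))  # (16,) 1.0
```

Because the decoder re-reads all source positions at every step, no single fixed-length vector has to carry the whole sentence.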

SLIDE 9

Dual learning

SLIDE 10

Duality in Machine Translation

1 Dual learning is an auto-encoder-like mechanism for utilizing monolingual datasets³.

³Y. Xia, D. He, T. Qin, L. Wang, N. Yu, T.-Y. Liu, and W.-Y. Ma. Dual learning for machine translation. NIPS 2016.

SLIDE 11

Duality in Speech Processing

1 Duality in speech processing (example utterance: "Welcome to Beijing!").

Primal task (g: y → z): speech recognition
Dual task (h: z → y): speech synthesis

SLIDE 12

Duality in Question Answering and Generation

1 Duality in question answering and generation.

Question: "for what purpose do organisms make peroxide and superoxide?"
Answer: "Parts of the immune system of higher organisms create peroxide, superoxide, and singlet oxygen to destroy invading microbes."

Primal task (g: y → z): question answering
Dual task (h: z → y): question generation

SLIDE 13

Duality in Search and Advertising

1 Duality in search and advertising.

Primal task (g: y → z): search (find webpages for a given query)
Dual task (h: z → y): advertising (suggest keywords for a given webpage)

SLIDE 14

Structural Duality in AI

Structural duality is very common in artificial intelligence.

AI task              X → Y                      Y → X
Machine translation  Translation from EN to CH  Translation from CH to EN
Speech processing    Speech recognition         Text to speech
Image understanding  Image captioning           Image generation
Conversation         Question answering         Question generation
Search engine        Query-document matching    Query/keyword suggestion

Currently, most machine learning algorithms do not exploit structural duality for training and inference.

SLIDE 15

Dual Learning

1 A new learning framework that leverages the symmetric (primal-dual) structure of AI tasks to obtain effective feedback or regularization signals that enhance the learning/inference process.
2 If we do not have enough labeled data for training, can we use unlabeled data?
3 Dual unsupervised learning can leverage structural duality to learn from unlabeled data.

SLIDE 16

Dual learning (Definition)

1 Let us define⁴:

D_A: corpus of language A
D_B: corpus of language B
P(.|s, θ_AB): translation model from A to B
P(.|s, θ_BA): translation model from B to A
LM_A(.): learned language model of A
LM_B(.): learned language model of B

⁴Y. Xia, D. He, T. Qin, L. Wang, N. Yu, T.-Y. Liu, and W.-Y. Ma. Dual learning for machine translation. NIPS 2016.

SLIDE 17

Dual learning (Algorithm)

1 We have a sentence s sampled from D_A.
2 Generate K translated sentences s_mid,1, s_mid,2, ..., s_mid,K from P(.|s, θ_AB).
3 Compute an intermediate reward r_1,k for each sentence as r_1,k = LM_B(s_mid,k).
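Steps 2 and 3 can be sketched with stand-in components; `translate_AB` and `lm_B` below are hypothetical placeholders for the trained translation model and language model, returning toy values:

```python
import random

random.seed(0)

def translate_AB(s, k):
    """Hypothetical stand-in sampler: produce k candidate translations of s."""
    return [f"{s} (candidate {i})" for i in range(k)]

def lm_B(sentence):
    """Hypothetical stand-in language model of B: returns a log-probability."""
    return -random.uniform(5.0, 50.0)  # toy log-prob, always negative

s = "a sentence from corpus D_A"
K = 4
s_mid = translate_AB(s, K)              # step 2: K sampled translations
r1 = [lm_B(c) for c in s_mid]           # step 3: rewards r_1,k = LM_B(s_mid,k)
print(len(r1), all(r < 0 for r in r1))  # 4 True
```

A fluent candidate gets a higher (less negative) language-model score, so r_1,k rewards fluency in language B even without any parallel data.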

SLIDE 18

Dual learning (Algorithm)

1 We have the K translated sentences from the previous step.
2 Compute a communication (reconstruction) reward r_2,k for each sentence as r_2,k = ln P(s|s_mid,k, θ_BA).
3 Set the total reward of the kth sentence to r_k = α r_1,k + (1 − α) r_2,k.

SLIDE 19

Dual learning (Algorithm)

1 We have the rewards r_1, r_2, ..., r_K from the previous step.
2 Compute the stochastic gradients with respect to θ_AB and θ_BA:

∇_θ_AB E[r] = (1/K) Σ_{k=1}^{K} r_k ∇_θ_AB ln P(s_mid,k|s, θ_AB)

∇_θ_BA E[r] = (1/K) Σ_{k=1}^{K} (1 − α) ∇_θ_BA ln P(s|s_mid,k, θ_BA)

SLIDE 20

Dual learning (Algorithm)

1 We have the gradients from the previous step.
2 Update the model parameters θ_AB and θ_BA:

θ_AB ← θ_AB + γ_1 ∇_θ_AB E[r]
θ_BA ← θ_BA + γ_2 ∇_θ_BA E[r]

SLIDE 21

Dual learning algorithm (pseudocode)

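The one-direction update described on the preceding slides can be sketched in Python. The scalar "models" below are toy stand-ins chosen purely to make the reward and update rules concrete (a real implementation would use neural translation models, their log-probability gradients, and candidates sampled from the current model):

```python
import math
import random

random.seed(0)
alpha, gamma1, gamma2, K = 0.5, 0.01, 0.01, 4

# Toy scalar "models" standing in for the translation models:
#   forward:  ln P(c|s, th_ab) = -(c - s - th_ab)^2 / 2
#   reverse:  ln P(s|c, th_ba) = -(s - c - th_ba)^2 / 2
def log_p(diff, th):
    return -0.5 * (diff - th) ** 2

def grad_log_p(diff, th):
    # derivative of log_p with respect to th
    return diff - th

th_ab, th_ba = 0.0, 0.0
corpus_A = [random.gauss(1.0, 0.5) for _ in range(100)]  # toy monolingual corpus D_A

for _ in range(200):
    s = random.choice(corpus_A)  # sample a "sentence" from D_A
    # K candidate "translations" (a fixed sampler here, for simplicity)
    cands = [s + random.gauss(0.0, 0.3) for _ in range(K)]
    r1 = [-0.5 * (c - 1.0) ** 2 for c in cands]  # toy LM_B reward, peaked at 1.0
    r2 = [log_p(s - c, th_ba) for c in cands]    # reconstruction reward ln P(s|c, th_ba)
    r = [alpha * a + (1 - alpha) * b for a, b in zip(r1, r2)]
    # stochastic gradient estimates from the slides
    g_ab = sum(rk * grad_log_p(c - s, th_ab) for rk, c in zip(r, cands)) / K
    g_ba = sum((1 - alpha) * grad_log_p(s - c, th_ba) for c in cands) / K
    th_ab += gamma1 * g_ab  # gradient ascent on the expected reward
    th_ba += gamma2 * g_ba

print(math.isfinite(th_ab) and math.isfinite(th_ba))  # True
```

The structure (sample, translate, score with LM and reconstruction, policy-gradient update) mirrors the two-agent game of the NIPS 2016 paper; in the full algorithm the same loop also runs in the B → A direction.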

SLIDE 22

Experimental results

SLIDE 23

Experimental results

1 Reconstruction performance (BLEU: geometric mean of n-gram precision).

[Figure: BLEU reconstruction-performance chart; details not recoverable]

SLIDE 24

Experimental results

1 Results for different source sentence lengths (the improvement is significant for long sentences).

[Figure: BLEU vs. source sentence length; details not recoverable]

SLIDE 25

Experimental results

1 Reconstruction examples.

[Figure: reconstruction examples; details not recoverable]

SLIDE 26

Dual Supervised Learning

SLIDE 27

Supervised learning

1 Given m training pairs {(x_1, y_1), ..., (x_m, y_m)} sampled from the space X × Y.
2 Learn the bi-directional relationship of (x, y) as two independent supervised learning tasks (primal f and dual g):

min_θ_xy (1/m) Σ_{i=1}^{m} L1(f(x_i; θ_xy), y_i)

min_θ_yx (1/m) Σ_{i=1}^{m} L2(g(y_i; θ_yx), x_i)

3 If the learned primal and dual models are perfect, then for all x and y we should have

P(x) P(y|x; θ_xy) = P(y) P(x|y; θ_yx)
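The condition in point 3 is simply the two factorizations of the joint distribution P(x, y). A tiny numeric check, using an arbitrary 2×2 joint distribution, makes this concrete:

```python
import numpy as np

# Arbitrary joint distribution over X = {0, 1} and Y = {0, 1}
P_xy = np.array([[0.1, 0.3],
                 [0.4, 0.2]])

P_x = P_xy.sum(axis=1)             # marginal P(x)
P_y = P_xy.sum(axis=0)             # marginal P(y)
P_y_given_x = P_xy / P_x[:, None]  # conditional P(y|x)
P_x_given_y = P_xy / P_y[None, :]  # conditional P(x|y)

# Both factorizations recover the same joint: P(x)P(y|x) = P(y)P(x|y)
lhs = P_x[:, None] * P_y_given_x
rhs = P_y[None, :] * P_x_given_y
print(np.allclose(lhs, rhs))  # True
```

Trained models only approximate the conditionals, so the equality generally fails to hold, which is what the next slide exploits as a training signal.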

SLIDE 28

Dual supervised learning

1 Incorporate joint distribution matching into supervised learning:

min_θ_xy (1/m) Σ_{i=1}^{m} L1(f(x_i; θ_xy), y_i)

min_θ_yx (1/m) Σ_{i=1}^{m} L2(g(y_i; θ_yx), x_i)

subject to P(x) P(y|x; θ_xy) = P(y) P(x|y; θ_yx)

2 Using the empirical marginal distributions P̂(x) and P̂(y), the duality constraint becomes a regularizer:

L_duality = (log P̂(x) + log P(y|x; θ_xy) − log P̂(y) − log P(x|y; θ_yx))²
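A minimal sketch of the regularizer, assuming it penalizes the squared gap between the two log-factorizations of the joint; the log-probability values below are made up for the example:

```python
def duality_loss(log_px, log_py_given_x, log_py, log_px_given_y):
    """Squared gap between the two log-factorizations of the joint."""
    gap = (log_px + log_py_given_x) - (log_py + log_px_given_y)
    return gap ** 2

# Made-up log-probabilities for one (x, y) pair:
#   log P_hat(x) = -2.0, log P(y|x) = -1.5, log P_hat(y) = -1.8, log P(x|y) = -1.6
loss = duality_loss(-2.0, -1.5, -1.8, -1.6)
print(round(loss, 6))  # 0.01
```

In training, this term is added (with a weight) to each model's supervised loss, so the primal and dual models regularize each other toward a consistent joint distribution.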

SLIDE 29

Dual supervised learning algorithm

SLIDE 30

Dual supervised learning algorithm results

SLIDE 31

Some extensions

1 Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, and Tie-Yan Liu. Dual Supervised Learning. ICML 2017.
2 Yijun Wang, Yingce Xia, Li Zhao, Jiang Bian, Tao Qin, Guiquan Liu, and Tie-Yan Liu. Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization. AAAI 2018.
