Lecture 4: Artificial Neural Networks
Rui Xia
Text Mining Group, Nanjing University of Science & Technology
rxia@njust.edu.cn
Brief History
- Rosenblatt (1958) created the perceptron, an algorithm for pattern recognition.
- Neural network research stagnated after the work of Minsky and Papert (1969), who identified two key issues with the computational machines that processed neural networks:
  – Basic perceptrons were incapable of processing the exclusive-or circuit.
  – Computers didn't have enough processing power to effectively handle the work required by large neural networks.
- A key trigger for the renewed interest in neural networks and learning was Paul Werbos's (1975) back-propagation algorithm.
- Both shallow and deep ANN architectures (e.g., recurrent nets) have been explored for many years.
- In 2006, Hinton and Salakhutdinov showed how a many-layered feedforward neural network could be effectively pre-trained one layer at a time.
- Advances in hardware enabled the renewed interest after 2009.
- Industrial applications of deep learning to large-scale speech recognition started around 2010.
- Significant additional impact on image and object recognition was felt from 2011 to 2012.
- Deep learning approaches have achieved very high performance across many different natural language processing tasks since 2013.
- To date, deep learning architectures such as CNNs, RNNs, LSTMs, and GANs have been applied in many fields, where they have produced results comparable to, and in some cases superior to, those of human experts.
Inspired by Biological Neural Networks
Multi-layer Neural Networks
3-layer Forward Neural Networks

- ANN Structure
- Hypothesis

$$\hat{y}_k = \sigma(\beta_k + \theta_k), \qquad \beta_k = \sum_{h=1}^{q} v_{hk}\, b_h$$

$$b_h = \sigma(\alpha_h + \gamma_h), \qquad \alpha_h = \sum_{j=1}^{d} w_{jh}\, x_j$$

where $x_j$ ($j = 1, \dots, d$) are the inputs, $b_h$ ($h = 1, \dots, q$) are the hidden-unit outputs, $\hat{y}_k$ ($k = 1, \dots, m$) are the network outputs, and $\sigma(\cdot)$ is the sigmoid activation function.
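To make the hypothesis concrete, here is a minimal NumPy sketch of the forward pass; the function and variable names are illustrative choices, not from the lecture:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W, gamma, V, theta):
    """Forward pass of the 3-layer network.

    x:     input vector, shape (d,)
    W:     input-to-hidden weights, shape (d, q)
    gamma: hidden biases, shape (q,)
    V:     hidden-to-output weights, shape (q, m)
    theta: output biases, shape (m,)
    Returns (b, y_hat): hidden activations and network outputs.
    """
    alpha = x @ W               # alpha_h = sum_j w_jh * x_j
    b = sigmoid(alpha + gamma)  # b_h = sigma(alpha_h + gamma_h)
    beta = b @ V                # beta_k = sum_h v_hk * b_h
    y_hat = sigmoid(beta + theta)
    return b, y_hat
```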
Learning Algorithm

- Training set

$$D = \left\{ (x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(n)}, y^{(n)}) \right\}, \qquad x^{(l)} \in \mathbb{R}^d, \; y^{(l)} \in \mathbb{R}^m$$

- Cost function (squared error on training example $l$)

$$E^{(l)} = \frac{1}{2} \sum_{k=1}^{m} \left( \hat{y}_k^{(l)} - y_k^{(l)} \right)^2$$

- Parameters

$$w \in \mathbb{R}^{d \times q}, \quad \gamma \in \mathbb{R}^{q}, \quad v \in \mathbb{R}^{q \times m}, \quad \theta \in \mathbb{R}^{m}$$

- Gradients to calculate

$$\frac{\partial E^{(l)}}{\partial w_{jh}}, \quad \frac{\partial E^{(l)}}{\partial \gamma_h}, \quad \frac{\partial E^{(l)}}{\partial v_{hk}}, \quad \frac{\partial E^{(l)}}{\partial \theta_k}$$
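A matching sketch of the per-example cost, using the same illustrative names as above:

```python
def cost(y_hat, y):
    """Squared-error cost: E = 0.5 * sum_k (y_hat_k - y_k)^2."""
    return 0.5 * np.sum((y_hat - y) ** 2)
```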
Gradient Calculation

- First, the gradient with respect to $v_{hk}$:

$$\frac{\partial E^{(l)}}{\partial v_{hk}} = \frac{\partial E^{(l)}}{\partial \hat{y}_k^{(l)}} \cdot \frac{\partial \hat{y}_k^{(l)}}{\partial (\beta_k + \theta_k)} \cdot \frac{\partial (\beta_k + \theta_k)}{\partial v_{hk}}$$

where

$$\frac{\partial E^{(l)}}{\partial \hat{y}_k^{(l)}} = \hat{y}_k^{(l)} - y_k^{(l)}$$

$$\frac{\partial \hat{y}_k^{(l)}}{\partial (\beta_k + \theta_k)} = \sigma'(\beta_k + \theta_k) = \sigma(\beta_k + \theta_k) \cdot \left( 1 - \sigma(\beta_k + \theta_k) \right) = \hat{y}_k^{(l)} \cdot \left( 1 - \hat{y}_k^{(l)} \right)$$

$$\frac{\partial (\beta_k + \theta_k)}{\partial v_{hk}} = b_h$$
Define the output-layer error

$$\mathrm{error}_k^{\text{OutputLayer}} = \frac{\partial E^{(l)}}{\partial (\beta_k + \theta_k)} = \frac{\partial E^{(l)}}{\partial \hat{y}_k^{(l)}} \cdot \frac{\partial \hat{y}_k^{(l)}}{\partial (\beta_k + \theta_k)} = \left( \hat{y}_k^{(l)} - y_k^{(l)} \right) \cdot \hat{y}_k^{(l)} \cdot \left( 1 - \hat{y}_k^{(l)} \right)$$

Then

$$\frac{\partial E^{(l)}}{\partial v_{hk}} = \mathrm{error}_k^{\text{OutputLayer}} \cdot b_h$$

- Second, the gradient with respect to $\theta_k$:

$$\frac{\partial E^{(l)}}{\partial \theta_k} = \frac{\partial E^{(l)}}{\partial \hat{y}_k^{(l)}} \cdot \frac{\partial \hat{y}_k^{(l)}}{\partial (\beta_k + \theta_k)} \cdot \frac{\partial (\beta_k + \theta_k)}{\partial \theta_k} = \mathrm{error}_k^{\text{OutputLayer}} \cdot 1$$
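The output-layer error and the two gradients above take only a few lines of NumPy. This is again an illustrative sketch, reusing the forward function from earlier:

```python
def output_layer_grads(x, y, W, gamma, V, theta):
    """Gradients of E w.r.t. the output-layer parameters V and theta."""
    b, y_hat = forward(x, W, gamma, V, theta)
    # error_k = (y_hat_k - y_k) * y_hat_k * (1 - y_hat_k)
    err_out = (y_hat - y) * y_hat * (1.0 - y_hat)  # shape (m,)
    dV = np.outer(b, err_out)                      # dE/dv_hk = error_k * b_h
    dtheta = err_out                               # dE/dtheta_k = error_k * 1
    return err_out, dV, dtheta
```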
- Third, the gradient with respect to $w_{jh}$:

$$\frac{\partial E^{(l)}}{\partial w_{jh}} = \sum_{k=1}^{m} \frac{\partial E^{(l)}}{\partial (\beta_k + \theta_k)} \cdot \frac{\partial (\beta_k + \theta_k)}{\partial b_h} \cdot \frac{\partial b_h}{\partial (\alpha_h + \gamma_h)} \cdot \frac{\partial (\alpha_h + \gamma_h)}{\partial w_{jh}}$$

where

$$\frac{\partial E^{(l)}}{\partial (\beta_k + \theta_k)} = \mathrm{error}_k^{\text{OutputLayer}}$$

$$\frac{\partial (\beta_k + \theta_k)}{\partial b_h} = v_{hk}$$

$$\frac{\partial b_h}{\partial (\alpha_h + \gamma_h)} = \sigma'(\alpha_h + \gamma_h) = \sigma(\alpha_h + \gamma_h) \cdot \left( 1 - \sigma(\alpha_h + \gamma_h) \right) = b_h \cdot (1 - b_h)$$

$$\frac{\partial (\alpha_h + \gamma_h)}{\partial w_{jh}} = x_j^{(l)}$$
Define the hidden-layer error

$$\mathrm{error}_h^{\text{HiddenLayer}} = \frac{\partial E^{(l)}}{\partial (\alpha_h + \gamma_h)} = \sum_{k=1}^{m} \frac{\partial E^{(l)}}{\partial (\beta_k + \theta_k)} \cdot \frac{\partial (\beta_k + \theta_k)}{\partial b_h} \cdot \frac{\partial b_h}{\partial (\alpha_h + \gamma_h)} = \sum_{k=1}^{m} \mathrm{error}_k^{\text{OutputLayer}} \cdot v_{hk} \cdot b_h \cdot (1 - b_h)$$

Then

$$\frac{\partial E^{(l)}}{\partial w_{jh}} = \mathrm{error}_h^{\text{HiddenLayer}} \cdot x_j^{(l)}$$
- Finally, the gradient with respect to $\gamma_h$:

$$\frac{\partial E^{(l)}}{\partial \gamma_h} = \sum_{k=1}^{m} \frac{\partial E^{(l)}}{\partial (\beta_k + \theta_k)} \cdot \frac{\partial (\beta_k + \theta_k)}{\partial b_h} \cdot \frac{\partial b_h}{\partial (\alpha_h + \gamma_h)} \cdot \frac{\partial (\alpha_h + \gamma_h)}{\partial \gamma_h} = \mathrm{error}_h^{\text{HiddenLayer}} \cdot 1$$
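The hidden-layer error and the remaining two gradients follow the same pattern. Another illustrative sketch, consistent with the earlier ones; err_out is the output-layer error computed above:

```python
def hidden_layer_grads(x, b, err_out, V):
    """Gradients of E w.r.t. the hidden-layer parameters W and gamma."""
    # error_h = sum_k error_k * v_hk * b_h * (1 - b_h)
    err_hid = (V @ err_out) * b * (1.0 - b)  # shape (q,)
    dW = np.outer(x, err_hid)                # dE/dw_jh = error_h * x_j
    dgamma = err_hid                         # dE/dgamma_h = error_h * 1
    return dW, dgamma
```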
Back-Propagation Algorithm

- Algorithm flowchart

Input: training set $D = \{(x^{(l)}, y^{(l)})\}_{l=1}^{n}$, learning rate $\eta$
Steps:
1: initialize all parameters within (0, 1)
2: repeat:
3:   for all $(x^{(l)}, y^{(l)}) \in D$ do:
4:     calculate $\hat{y}^{(l)}$
5:     calculate $\mathrm{error}^{\text{OutputLayer}}$
6:     calculate $\mathrm{error}^{\text{HiddenLayer}}$
7:     update $v$, $\theta$, $w$ and $\gamma$
8:   end for
9: until the stopping condition is reached
Output: trained ANN

- Weight updating

$$v_{hk} := v_{hk} - \eta \cdot \frac{\partial E^{(l)}}{\partial v_{hk}}, \qquad \theta_k := \theta_k - \eta \cdot \frac{\partial E^{(l)}}{\partial \theta_k}$$

$$w_{jh} := w_{jh} - \eta \cdot \frac{\partial E^{(l)}}{\partial w_{jh}}, \qquad \gamma_h := \gamma_h - \eta \cdot \frac{\partial E^{(l)}}{\partial \gamma_h}$$

where $\eta$ is the learning rate.
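Putting the pieces together, a per-example (stochastic) training loop might look as follows. This is a minimal sketch reusing the functions defined above; the fixed epoch count stands in for the unspecified stopping condition and is an assumption of this sketch:

```python
def train(X, Y, q, eta=0.5, epochs=1000, seed=0):
    """Train a 3-layer network with back-propagation, one sample at a time.

    X: inputs, shape (n, d); Y: targets, shape (n, m); q: hidden units.
    """
    rng = np.random.default_rng(seed)
    d, m = X.shape[1], Y.shape[1]
    # Step 1: initialize all parameters within (0, 1)
    W, gamma = rng.random((d, q)), rng.random(q)
    V, theta = rng.random((q, m)), rng.random(m)
    for _ in range(epochs):  # assumption: fixed epochs stand in for "repeat ... until"
        for x, y in zip(X, Y):
            b, y_hat = forward(x, W, gamma, V, theta)                           # step 4
            err_out, dV, dtheta = output_layer_grads(x, y, W, gamma, V, theta)  # step 5
            dW, dgamma = hidden_layer_grads(x, b, err_out, V)                   # step 6
            V, theta = V - eta * dV, theta - eta * dtheta                       # step 7
            W, gamma = W - eta * dW, gamma - eta * dgamma
    return W, gamma, V, theta
```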
Practice #1: 3-layer Forward NN with BP

- Given the following training data (see the link below):
- Implement a 3-layer forward neural network with back-propagation and report the 5-fold cross-validation performance (code it yourself; don't use TensorFlow). A cross-validation sketch follows below.
- Compare it with logistic regression and softmax regression.

http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html
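As a possible starting point for the cross-validation part of the exercise, here is a sketch of a 5-fold split reusing the train and forward functions above; it assumes one-hot encoded labels, which is an assumption of this sketch:

```python
def five_fold_accuracy(X, Y, q, eta=0.5, epochs=1000):
    """Average accuracy over 5 folds (Y assumed one-hot encoded)."""
    n = X.shape[0]
    idx = np.random.default_rng(0).permutation(n)  # shuffle before splitting
    folds = np.array_split(idx, 5)
    accs = []
    for i in range(5):
        test = folds[i]
        tr = np.concatenate([folds[j] for j in range(5) if j != i])
        W, gamma, V, theta = train(X[tr], Y[tr], q, eta, epochs)
        preds = np.array([forward(x, W, gamma, V, theta)[1] for x in X[test]])
        accs.append(np.mean(preds.argmax(1) == Y[test].argmax(1)))
    return float(np.mean(accs))
```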
Practice #2: Multi-layer Forward NN with BP

- Given the following training data (same dataset as above):
- Implement a multi-layer forward neural network with back-propagation and report the 5-fold cross-validation performance (code it yourself);
- Do it again, this time using TensorFlow (a starting-point sketch follows below);
- Tune the model using different numbers of hidden layers and hidden nodes, different activation functions, different cost functions, and different learning rates.

http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html
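One possible TensorFlow starting point, sketched with the tf.keras API; the layer sizes, activation, loss, and learning-rate defaults here are placeholders, and they are exactly the knobs the exercise asks you to tune:

```python
import tensorflow as tf

def build_model(d, m, hidden=(10,), activation="sigmoid", lr=0.5):
    """A configurable feed-forward network for the tuning exercise."""
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=(d,)))
    for units in hidden:  # vary depth and width here
        model.add(tf.keras.layers.Dense(units, activation=activation))
    model.add(tf.keras.layers.Dense(m, activation="softmax"))
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=lr),
                  loss="mean_squared_error",  # or "categorical_crossentropy"
                  metrics=["accuracy"])
    return model

# Usage (illustrative): X_train has d features, Y_train is one-hot with m classes.
# model = build_model(d=2, m=2, hidden=(16, 16), activation="relu", lr=0.1)
# model.fit(X_train, Y_train, epochs=100, verbose=0)
```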
Questions?