
SLIDE 1

Introduction to Artificial Intelligence (人工智能引论), Part 5

  • Intro. on Artificial Intelligence from the perspective of probability theory

罗智凌 (Luo Zhiling)
luozhiling@zju.edu.cn
College of Computer Science, Zhejiang University
http://www.bruceluo.net

SLIDE 2

Comparison of the two major branches

  Name      | Feed-Forward NN                        | (Stochastic) Recurrent NN
  Input     | Feature                                | Observation
  Output    | Ground truth                           | (Latent, Visible) variables
  Learning  | Supervised Learning                    | Unsupervised Learning
  Model     | Discriminative Model                   | Generative Model
  Strategy  | Loss on ground truth (diff or entropy) | Loss on observation (energy)
  Algorithm | Gradient Descent                       | (Variational) EM, Sampling
  Examples  | Perceptron, MLP, CNN                   | LSTM, Markov Field, RBM
  Hybrid    | DBN, GAN, pre-trained/two-phase learning, AutoEncoder (spans both branches)

SLIDE 3

[Figure: a taxonomy diagram relating the models covered: Feed-Forward NN (Perceptron, MLP, CNN, FRCNN), Recurrent NN (LSTM, bi-LSTM), Stochastic Model (RBM, Markov Net, Hopfield Net, LDA), and hybrids (AutoEncoder, DBN, GAN, word2vec)]

SLIDE 4

OUTLINE

  • Recurrent NN
    – Long Short-Term Memory
  • Stochastic Model in Neural Network
    – Hopfield Nets
    – Restricted Boltzmann Machine
    – Wake-Sleep Model
    – Echo-State Model
  • Hybrid Model
    – Deep Belief Network
    – AutoEncoder
    – Generative Adversarial Network


SLIDE 6

From multilayer perceptron (MLP) to Recurrent Neural Network to LSTM

  • Multi-Layer Perceptron (MLP) is by nature a feed-forward directed acyclic network.
  • An MLP consists of multiple layers and can map input data to output data via a set of nonlinear activation functions. MLP uses a supervised learning technique called backpropagation for training the network.
  • However, an MLP cannot learn mapping functions in which there are dependencies among the input data (i.e., sequential data).

[Figure: input → mapping → output]

SLIDE 7

From multilayer perceptron (MLP) to Recurrent Neural Network to LSTM

  • Recurrent Neural Network: an RNN has recurrent connections (connections to previous time steps of the same layer).
  • RNNs are powerful but can get extremely complicated. Computations derived from earlier input are fed back into the network, which gives an RNN a kind of memory.
  • Standard RNNs suffer from both exploding and vanishing gradients due to their iterative nature.

[Figure: sequence input (x0…xt) → mapping → embedding vector (ht)]
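For concreteness, a simple (Elman-style) form of this recurrence, written in notation assumed here since the slide shows only the diagram, is:

$$h_t = \tanh\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right)$$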

SLIDE 8

Recurrent Models of Visual Attention

Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu. Recurrent Models of Visual Attention. NIPS 2014.

SLIDE 9

  • Long Short-Term Memory (LSTM) Model:
    – LSTM is an RNN devised to deal with the exploding and vanishing gradient problems of standard RNNs.
    – An LSTM hidden layer consists of a set of recurrently connected blocks, known as memory cells.
    – Each memory cell is gated by three multiplicative units: the input, output, and forget gates.
    – The input to the cells is multiplied by the activation of the input gate, the output to the net is multiplied by the output gate, and the previous cell values are multiplied by the forget gate.

Sepp Hochreiter & Jürgen Schmidhuber. Long Short-Term Memory. Neural Computation, Vol. 9(8), pp. 1735-1780, MIT Press, 1997.

SLIDE 10

LSTM

[Figure: LSTM cell showing the cell state, hidden state, and input]

2 states, 3 gates, 4 layers:
  – states: cell state and hidden state
  – gates: forget, write (input), and read (output)
  – layers: 3 sigmoid and 1 tanh perceptron layers

SLIDE 11

Cell state and forget gate

[Figure: the cell state runs through time; a sigmoid function over the hidden state at t-1 and the input at t produces the forget signal]

Forget signal: 1 represents "completely keep this", 0 represents "completely forget this".
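In the now-standard notation (assumed here; the slide shows only the annotated diagram), the forget gate is:

$$f_t = \sigma\left(W_f\, [h_{t-1}, x_t] + b_f\right)$$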

SLIDE 12

Input (Write) gate

[Figure: a sigmoid over the hidden state at t-1 and the input at t produces the write signal; a tanh layer produces the content to write]

Write signal: 1 represents "completely write this", 0 represents "completely ignore this".
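In the same assumed notation, the write (input) gate and the candidate content to write are:

$$i_t = \sigma\left(W_i\, [h_{t-1}, x_t] + b_i\right), \qquad \tilde{C}_t = \tanh\left(W_C\, [h_{t-1}, x_t] + b_C\right)$$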

SLIDE 13

Update cell state

[Figure: the updated cell state combines the cell state at t-1, scaled by the forget signal, with the content to write, scaled by the write signal]
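In the same assumed notation, the cell-state update combines the two signals:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$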

SLIDE 14

Output (Read) gate

[Figure: a sigmoid over the hidden state at t-1 and the input at t produces the read signal, yielding the updated hidden state at t]

Read signal: 1 represents "completely read this", 0 represents "completely ignore this".
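In the same assumed notation, the read (output) gate and the updated hidden state are:

$$o_t = \sigma\left(W_o\, [h_{t-1}, x_t] + b_o\right), \qquad h_t = o_t \odot \tanh(C_t)$$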

SLIDE 15

Language Translation

SLIDE 16

Stock Prediction

SLIDE 17

OUTLINE

  • Recurrent NN
    – Long Short-Term Memory
  • Stochastic Model in Neural Network
    – Hopfield Nets
    – Restricted Boltzmann Machine
    – Wake-Sleep Model
    – Echo-State Model
  • Hybrid Model
    – Deep Belief Network
    – AutoEncoder
    – Generative Adversarial Network

SLIDE 18

Stochastic NN

  • Energy-based probability distribution on (latent, visible) variables:

    $$p(x) = \frac{e^{-E(x)}}{Z}, \qquad Z = \sum_x e^{-E(x)}$$

  • where Z is called the partition function (配分函数).
  • Loss function (the negative log-likelihood over the training set D of size N, written with the free energy F):

    $$\mathcal{L}(\theta, \mathcal{D}) = -\frac{1}{N} \sum_{x^{(i)} \in \mathcal{D}} \log p(x^{(i)}) = \frac{1}{N} \sum_{x^{(i)} \in \mathcal{D}} F(x^{(i)}) + \log Z$$
SLIDE 19

Hopfield Nets

  • A Hopfield net is

composed of binary threshold units with recurrent connections between them.

SLIDE 20

The energy function

  • The global energy is the sum of many contributions. Each contribution depends on one connection weight and the binary states of two neurons:

    $$E = -\sum_i s_i b_i - \sum_{i<j} s_i s_j w_{ij}$$

  • This simple quadratic energy function makes it possible for each unit to compute locally how its state affects the global energy:

    $$\text{Energy gap} = \Delta E_i = E(s_i = 0) - E(s_i = 1) = b_i + \sum_j s_j w_{ij}$$
SLIDE 21

Settling to an energy minimum

  • To find an energy minimum in this net, start from a random state and then update units one at a time in random order.
    – Update each unit to whichever of its two states gives the lowest global energy.
    – i.e., use binary threshold units.

[Figure: example net with weights 3, 2, 3, 3 and -1, -4, -1; one unit marked "?" remains to be updated; E = goodness = 3]
SLIDE 23

Settling to an energy minimum

  • To find an energy minimum in this net, start from a random state and then update units one at a time in random order.
    – Update each unit to whichever of its two states gives the lowest global energy.
    – i.e., use binary threshold units.

[Figure: the same net after one more unit update; goodness improves from 3 to 4]
SLIDE 24

A deeper energy minimum

  • The net has two triangles in which the three units mostly support each other.
    – Each triangle mostly hates the other triangle.
  • The triangle on the left differs from the one on the right by having a weight of 2 where the other one has a weight of 3.
    – So turning on the units in the triangle on the right gives the deepest minimum.

[Figure: the same net with the right-hand triangle turned on; E = goodness = 5]
SLIDE 25

A neat way to make use of this type of computation

  • Hopfield (1982) proposed that memories could be energy minima of a neural net.
    – The binary threshold decision rule can then be used to "clean up" incomplete or corrupted memories.
  • The idea of memories as energy minima was proposed by I. A. Richards in 1924 in "Principles of Literary Criticism".
  • Using energy minima to represent memories gives a content-addressable memory:
    – An item can be accessed by just knowing part of its content.
    – It is robust against hardware damage.
    – It's like reconstructing a dinosaur from a few bones.

SLIDE 26

OUTLINE

  • Recurrent NN
    – Long Short-Term Memory
  • Stochastic Model in Neural Network
    – Hopfield Nets
    – Restricted Boltzmann Machine
    – Wake-Sleep Model
    – Echo-State Model
  • Hybrid Model
    – Deep Belief Network
    – AutoEncoder
    – Generative Adversarial Network

SLIDE 27

Boltzmann Machine

  • The figure shows a Boltzmann Machine (BM): the blue nodes form the hidden layer and the white nodes form the visible (input) layer.
  • Compared with a Hopfield Net, the parameters are not fixed, and the data enter as observations of v.
  • Compared with a recurrent neural network:
    – 1. An RNN in essence learns a function, so it has the notions of input and output layers; a BM instead learns an "internal representation" of a data set, so it has no notion of an output layer.
    – 2. The nodes of an RNN are linked in directed cycles, while the nodes of a BM are connected as an undirected complete graph.

SLIDE 28

Restricted Boltzmann Machine

  • Compared with a Boltzmann Machine, a Restricted Boltzmann Machine mainly adds a "restriction".
  • The restriction is that the complete graph becomes a bipartite graph. As shown in the figure, this RBM consists of 4 visible nodes and 3 hidden nodes.
  • An RBM can be used for dimensionality reduction (make the hidden layer smaller), feature learning (the hidden-layer outputs are the features), Deep Belief Networks (stacks of RBMs), and so on.
  • Energy function (see the reconstruction below)
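The formula itself was an image that did not survive extraction; the standard RBM energy (with visible biases a_i, hidden biases b_j, and weights w_ij; the notation is assumed, not from the slide) is:

$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j$$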
SLIDE 29

Restricted Boltzmann Machines

  • About RBM:
    – Only one layer of hidden units.
    – No connections between hidden units.
  • In an RBM it only takes one step to reach thermal equilibrium when the visible units are clamped.
    – So we can quickly get the exact value of $\langle v_i h_j \rangle_v$:

    $$p(h_j = 1) = \frac{1}{1 + e^{-\left(b_j + \sum_{i \in \mathrm{vis}} v_i w_{ij}\right)}}$$

[Figure: bipartite graph with a hidden unit j above a visible unit i]

SLIDE 30

Restricted Boltzmann Machine

  • Joint Distribution
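The joint-distribution formula on this slide was an image; the standard form for an RBM (a model fact; Z is the partition function defined on slide 18) is:

$$p(v, h) = \frac{e^{-E(v, h)}}{Z}, \qquad Z = \sum_{v, h} e^{-E(v, h)}, \qquad p(v) = \frac{1}{Z} \sum_h e^{-E(v, h)}$$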
SLIDE 31

KL divergence

  • Kullback-Leibler divergence
  • A difference of entropies (cross-entropy minus entropy)
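The defining formula was lost with the slide image; the standard definition for discrete distributions P and Q is:

$$D_{\mathrm{KL}}(P \parallel Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)} = -\sum_x P(x) \log Q(x) - H(P)$$

i.e., the cross-entropy of P relative to Q minus the entropy H(P), which is the "difference of entropies" the slide refers to.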
SLIDE 32

Contrastive Divergence

  • Using Gibbs sampling during training: first map the visible vector to the hidden units, then reconstruct the visible vector from the hidden units, then map the visible vector to the hidden units again... and keep repeating these steps.

SLIDE 33

A picture of an inefficient version of the Boltzmann machine learning algorithm for an RBM

[Figure: an alternating Gibbs chain over visible units i and hidden units j, from t = 0 (the data) through t = 1, t = 2, ... to t = infinity (a "fantasy"), collecting the statistics <v_i h_j>^0 and <v_i h_j>^infinity]

    $$\Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle^0 - \langle v_i h_j \rangle^\infty \right)$$

Start with a training vector on the visible units. Then alternate between updating all the hidden units in parallel and updating all the visible units in parallel.

SLIDE 34

Contrastive divergence: a very surprising short-cut

[Figure: one step of the same chain over visible units i and hidden units j, from the data at t = 0 to the reconstruction at t = 1, collecting <v_i h_j>^0 and <v_i h_j>^1]

    $$\Delta w_{ij} = \varepsilon \left( \langle v_i h_j \rangle^0 - \langle v_i h_j \rangle^1 \right)$$

Start with a training vector on the visible units. Update all the hidden units in parallel. Update all the visible units in parallel to get a "reconstruction". Update the hidden units again. This is not following the gradient of the log likelihood, but it works well.

SLIDE 35

How to learn a set of features that are good for reconstructing images of the digit 2

[Figure: 50 binary neurons that learn features, connected to a 16 x 16 pixel image; the data (reality) on one side, the reconstruction (better than reality) on the other]

Increment the weights between an active pixel and an active feature when driven by the data; decrement the weights between an active pixel and an active feature when driven by the reconstruction.

SLIDE 36

The weights of the 50 feature detectors

We start with small random weights to break symmetry

SLIDE 37-44

[Figures: the weights of the 50 feature detectors, shown after successive sweeps of training]

SLIDE 45

The final 50 x 256 weights: Each neuron grabs a different feature

SLIDE 46

How well can we reconstruct digit images from the binary feature activations?

[Figure: Data vs. Reconstruction from activated binary features, for a new test image from the digit class that the model was trained on and for an image from an unfamiliar digit class; the network tries to see every image as a 2]

SLIDE 47

Wake-Sleep Algorithm

  • Wake phase:
    – Discriminative procedure
    – Positive phase
    – Decreasing the free energy on the observations.
  • Sleep phase:
    – Generative procedure
    – Negative phase
    – Decreasing the partition function (raising the energy of samples the model "dreams" up).

[Figure: a dream image]

SLIDE 48

Echo State Network

  • The main idea is:
    – (i) to drive a random, large, fixed recurrent neural network with the input signal, thereby inducing in each neuron within this "reservoir" network a nonlinear response signal, and
    – (ii) to combine these response signals into a desired output signal by a trainable linear combination.
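A compact NumPy sketch of these two steps (the reservoir size, spectral radius, ridge-regression readout, and all names are illustrative assumptions):

```python
import numpy as np

def train_esn(inputs, targets, n_reservoir=200, rho=0.9, ridge=1e-6,
              rng=np.random.default_rng(0)):
    """Echo State Network: fixed random reservoir + trained linear readout.

    inputs: (T, n_in) input sequence; targets: (T, n_out) desired outputs.
    """
    n_in = inputs.shape[1]
    W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_in))
    W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
    W *= rho / max(abs(np.linalg.eigvals(W)))    # set the spectral radius

    # (i) Drive the fixed reservoir with the input signal.
    states = np.zeros((len(inputs), n_reservoir))
    x = np.zeros(n_reservoir)
    for t, u in enumerate(inputs):
        x = np.tanh(W_in @ u + W @ x)            # nonlinear response signals
        states[t] = x

    # (ii) Train only the linear readout, here by ridge regression.
    A = states.T @ states + ridge * np.eye(n_reservoir)
    W_out = np.linalg.solve(A, states.T @ targets)
    return W_in, W, W_out                        # readout: y_t = W_out.T @ x_t
```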

SLIDE 49

OUTLINE

  • Recurrent NN
    – Long Short-Term Memory
  • Stochastic Model in Neural Network
    – Hopfield Nets
    – Restricted Boltzmann Machine
    – Wake-Sleep Model
    – Echo-State Model
  • Hybrid Model
    – Deep Belief Network
    – AutoEncoder
    – Generative Adversarial Network

SLIDE 50

Deep Belief Network

  • A DBN is a probabilistic generative model.
  • Its basic component is the RBM (Restricted Boltzmann Machine).
SLIDE 51

DBN on small data

SLIDE 52

Combining two RBMs to make a DBN

[Figure: train the bottom RBM (v to h1, weights W1) first; then copy the binary state of h1 for each v and train the second RBM (h1 to h2, weights W2) on those copies; finally compose the two RBM models to make a single DBN model. It's not a Boltzmann machine!]

SLIDE 53

The generative model after learning 3 layers

To generate data:
  1. Get an equilibrium sample from the top-level RBM by performing alternating Gibbs sampling for a long time.
  2. Perform a top-down pass to get states for all the other layers.

The lower-level bottom-up connections are not part of the generative model. They are just used for inference.

[Figure: a stack of layers data, h1, h2, h3 connected by weights W1, W2, W3]
slide-54
SLIDE 54

人工智能引论 2018 罗智凌

The DBN used for modeling the joint distribution

  • f MNIST digits and their labels

2000 units 500 units 500 units

28 x 28 pixel image

10 labels

  • The first two hidden layers

are learned without using labels.

  • The top layer is learned as

an RBM for modeling the labels concatenated with the features in the second hidden layer.

  • The weights are then fine-

tuned to be a better generative model using contrastive wake-sleep.

SLIDE 55

OUTLINE

  • Recurrent NN
    – Long Short-Term Memory
  • Stochastic Model in Neural Network
    – Hopfield Nets
    – Restricted Boltzmann Machine
    – Wake-Sleep Model
    – Echo-State Model
  • Hybrid Model
    – Deep Belief Network
    – AutoEncoder
    – Generative Adversarial Network

SLIDE 56

Unsupervised Learning

[Figure: input feeds a predictor; an error signal compares the prediction against... what? (Ranzato)]

SLIDE 57

SPARSE AUTO-ENCODERS

[Figure: the input feeds an encoder producing a code; a decoder maps the code back to a prediction, compared with the input by an error signal (Ranzato)]

  • input: $X$; code: $h = W^{\top} X$
  • loss: $L(X; W) = \lVert W h - X \rVert^2 + \lambda \sum_j \lvert h_j \rvert$ (reconstruction error plus a sparsity penalty; the weight $\lambda$ is implicit on the slide)

Le et al., "ICA with reconstruction cost..", NIPS 2011

SLIDE 58

The first really successful deep autoencoders

  • Images in 28 x 28.
  • Train a stack of 4 RBMs and then "unroll" them.
  • Fine-tune with gentle backprop.
SLIDE 59

How to find documents that are similar to a query document

  • Convert each document into a "bag of words" (BOW).
    – This is a vector of word counts, ignoring order.
    – Ignore stop words (like "the" or "over").
  • We could compare the word counts of the query document and millions of other documents, but this is slow.
    – So we reduce each query vector to a much smaller vector that still contains most of the information about the content of the doc.
slide-60
SLIDE 60

人工智能引论 2018 罗智凌

How to compress the count vector

  • We train the neural network

to reproduce its input vector as its output

  • This forces it to compress as

much information as possible into the 10 numbers in the central bottleneck.

  • These 10 numbers are then a

good way to compare documents.
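A minimal Keras sketch of such a bottleneck autoencoder (the layer sizes around the 10-unit bottleneck, the log(1+count) preprocessing from the next slide, and all names are assumptions; this uses modern tensorflow.keras rather than the Python 2.7 setup suggested at the end of the deck):

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_words = 2000  # size of the word-count vector (assumed)

# Encoder - bottleneck - decoder: reproduce the input at the output.
autoencoder = keras.Sequential([
    layers.Input(shape=(n_words,)),
    layers.Dense(500, activation="relu"),
    layers.Dense(250, activation="relu"),
    layers.Dense(10, activation="linear", name="bottleneck"),  # the 10 numbers
    layers.Dense(250, activation="relu"),
    layers.Dense(500, activation="relu"),
    layers.Dense(n_words, activation="linear"),
])
autoencoder.compile(optimizer="adam", loss="mse")

# counts: (n_docs, n_words) word counts; train on log(1 + count).
counts = np.random.poisson(0.1, size=(1000, n_words)).astype("float32")
x = np.log1p(counts)
autoencoder.fit(x, x, epochs=5, batch_size=64)

# Compare documents via their 10-dimensional codes.
encoder = keras.Model(autoencoder.input,
                      autoencoder.get_layer("bottleneck").output)
codes = encoder.predict(x)
```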

SLIDE 61

First compress all documents to 2 numbers using PCA on log(1+count). Then use different colors for different categories.

SLIDE 62

First compress all documents to 2 numbers using a deep autoencoder. Then use different colors for different document categories.

SLIDE 63

Using a deep autoencoder as a hash function (semantic hashing) for finding approximate matches

[Figure: a document passes through the autoencoder to produce a binary code (the hash function); nearby codes correspond to similar documents ("supermarket search")]

SLIDE 64

Binary codes for image retrieval

  • Image retrieval is typically done by using the captions.

Why not use the images too?

– Pixels are not like words: individual pixels do not tell us much about the content.

  • Maybe we should extract a real-valued vector that has

information about the content?

– Matching real-valued vectors in a big database is slow and requires a lot of storage.

  • Short binary codes are very easy to store and match.
SLIDE 65

A two-stage method

  • First, use semantic hashing with 28-bit binary codes to get a long "shortlist" of promising images.
  • Then use 256-bit binary codes to do a serial search for good matches.
    – This only requires a few words of storage per image, and the serial search can be done using fast bit-operations (see the sketch after this list).
  • But how good are the 256-bit binary codes?
    – Do they find images that we think are similar?
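A toy Python sketch of that serial search with bit operations (packing the 256 bits into 32 bytes; the names and sizes are illustrative assumptions):

```python
import numpy as np

def hamming_search(query_code, codes, k=10):
    """Serial search over packed 256-bit codes using fast bit operations.

    query_code: (32,) uint8 array (256 bits packed into 32 bytes).
    codes: (n_images, 32) uint8 array of database codes.
    Returns the indices of the k nearest codes by Hamming distance.
    """
    # XOR leaves a 1 wherever bits disagree; counting the ones (popcount)
    # gives the Hamming distance.
    diff = np.bitwise_xor(codes, query_code)
    dists = np.unpackbits(diff, axis=1).sum(axis=1)
    return np.argsort(dists)[:k]

# Usage: one query against a million random 256-bit codes.
codes = np.random.randint(0, 256, size=(1_000_000, 32), dtype=np.uint8)
query = np.random.randint(0, 256, size=32, dtype=np.uint8)
top10 = hamming_search(query, codes)
```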

SLIDE 66

Krizhevsky's deep autoencoder

[Figure: encoder layer sizes running 8192, 4096, 2048, 1024, 512 down to a 256-bit binary code, from a 32 x 32 color image input (three 1024-pixel channels)]

The encoder has about 67,000,000 parameters. There is no theory to justify this architecture. It takes a few days on a GTX 285 GPU to train on two million images.

SLIDE 67

Reconstructions of 32x32 color images from 256-bit codes

SLIDE 68

[Figure: retrievals using 256-bit codes vs. retrievals using Euclidean distance in pixel-intensity space]

SLIDE 69

[Figure: retrievals using 256-bit codes vs. retrievals using Euclidean distance in pixel-intensity space]

SLIDE 70

[Figure: the leftmost column is the search image; the other columns are the images that have the most similar feature activities in the last hidden layer]

SLIDE 71

OUTLINE

  • Recurrent NN
    – Long Short-Term Memory
  • Stochastic Model in Neural Network
    – Hopfield Nets
    – Restricted Boltzmann Machine
    – Wake-Sleep Model
    – Echo-State Model
  • Hybrid Model
    – Deep Belief Network
    – AutoEncoder
    – Generative Adversarial Network

SLIDE 72

To evaluate a generative model

  • Given the original observations, how do we decide that a generator is good enough?
  • By log likelihood
    – The probability of generating the observations
  • By KL-divergence
  • By contrastive divergence
    – The difference between the observations and the generated observations
  • By a discriminative model?
SLIDE 73

Generated Samples

SLIDE 74

Generated Samples

SLIDE 75

Min Max on expectation

  • Minimize over G the objective that D maximizes:

    $$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$

    The first term is the log likelihood on real samples; the second is the log likelihood on fake samples drawn from the noise prior $p_z$.

SLIDE 76-77

[Figures]

SLIDE 78

Deep Learning

  • Welcome to the frontier of artificial intelligence. Before you can make your mark, you need the following:
  • 1. A server with a GPU
  • 2. A programming language you are fluent in
  • 3. The heart of an Engineer/Scientist
  • Suggested setup:
    – Desktop: i7 + enough RAM + SSD + NVidia TitanX
    – Software: Ubuntu/CentOS + Python 2.7 + Tensorflow / Keras
SLIDE 79

罗智凌 (Luo Zhiling)

luozhiling@zju.edu.cn
http://www.bruceluo.net