The Shanghai Lectures 2019: HeronRobots Pathfinder Lectures, Natural and Artificial Intelligence in Embodied Physical Agents (PowerPoint PPT presentation)


SLIDE 1

The Shanghai Lectures 2019

HeronRobots Pathfinder Lectures Natural and Artificial Intelligence in Embodied Physical Agents

SLIDE 2

SLIDE 3

The ShanghAI Lectures An experiment in global teaching

  • Fabio Bonsignorio
  • The ShanghAI Lectures and Heron Robots

Welcome to "The Artificial Intelligence Lecture Series from Shanghai"!

SLIDE 4

Lecture 5. ML, DL, and Object Recognition: an Embodied AI View

Fabio Bonsignorio The ShanghAI Lectures and Heron Robots

SLIDE 5

Crash Introduction to ML

Lecture slides adapted from Deep Learning (www.deeplearningbook.org), Ian Goodfellow, 2016-09-26

SLIDE 6

Representation Matters

Figure 1.1

SLIDE 7

Depth: Repeated Composition

Figure 1.2

SLIDE 8

Computational Graphs

Figure 1.3

SLIDE 9

Machine Learning and AI

Figure 1.4

SLIDE 10

Historical Waves

Figure 1.7

SLIDE 11

Historical Trends: Growing Datasets

Figure 1.8

SLIDE 12

The MNIST Dataset

Figure 1.9

SLIDE 13

Connections per Neuron

Figure 1.10

SLIDE 14

Number of Neurons

Figure 1.11

SLIDE 15

Solving Object Recognition

Figure 1.12

SLIDE 16

Numerical Computation for Deep Learning

SLIDE 17

Numerical concerns for implementations of deep learning algorithms

  • Algorithms are often specified in terms of real numbers; real numbers cannot be implemented in a finite computer
  • Does the algorithm still work when implemented with a finite number of bits?
  • Do small changes in the input to a function cause large changes to the output?
  • Rounding errors, noise, and measurement errors can cause large changes
  • Iterative search for the best input is difficult
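The first concern above can be seen in two lines of plain Python (float64 arithmetic, no libraries); both effects below are rounding artifacts of finite binary representation:

```python
# Floating-point numbers have finite precision: adding a small number to a
# much larger one can be rounded away entirely.
absorbed = (1e16 + 1.0) - 1e16   # the +1.0 is lost to rounding, result is 0.0

# Decimal fractions like 0.1 have no exact binary representation, so
# arithmetic that is exact on paper picks up tiny rounding errors.
inexact = 0.1 + 0.2              # slightly more than 0.3
```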
SLIDE 18

Roadmap

  • Iterative Optimization
  • Rounding error, underflow, overflow
SLIDE 19

Iterative Optimization

  • Gradient descent
  • Curvature
  • Constrained optimization
SLIDE 20

Gradient Descent

Figure 4.1
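Gradient descent as pictured in Figure 4.1 can be sketched on a toy quadratic; the function, starting point, and learning rate below are illustrative choices, not taken from the slides:

```python
# Minimize f(x) = (x - 3)^2 by gradient descent; the gradient is 2 * (x - 3)
def grad(x):
    return 2.0 * (x - 3.0)

x = 0.0                 # arbitrary starting point
learning_rate = 0.1
for _ in range(100):
    x -= learning_rate * grad(x)   # step downhill along the negative gradient
# x is now very close to the minimizer, 3.0
```

Each step multiplies the distance to the minimum by (1 - 2 * learning_rate), so with a well-chosen rate the iterate converges geometrically.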

SLIDE 21

Approximate Optimization

Figure 4.3

SLIDE 22

We usually don’t even reach a local minimum

SLIDE 23

Iterative Optimization

  • Gradient descent
  • Curvature
  • Constrained optimization
SLIDE 24

Critical Points

Figure 4.2

SLIDE 25

Saddle Points

Figure 4.5

Saddle points attract Newton's method. (Gradient descent escapes; see Appendix C of "Qualitatively Characterizing Neural Network Optimization Problems".)

SLIDE 26

Curvature

Figure 4.4

SLIDE 27

Neural net visualization

(From "Qualitatively Characterizing Neural Network Optimization Problems")

At the end of learning:

  • gradient is still large
  • curvature is huge
SLIDE 28

Iterative Optimization

  • Gradient descent
  • Curvature
  • Constrained optimization
SLIDE 29

Roadmap

  • Iterative Optimization
  • Rounding error, underflow, overflow
SLIDE 30

Numerical Precision: A deep learning super skill

  • Often deep learning algorithms "sort of work"
    • Loss goes down, accuracy gets within a few percentage points of state-of-the-art
    • No "bugs" per se
  • Often deep learning algorithms "explode" (NaNs, large values)
  • The culprit is often loss of numerical precision
SLIDE 31

Rounding and truncation errors

  • In a digital computer, we use float32 or similar schemes to represent real numbers
  • A real number x is rounded to x + delta for some small delta
  • Overflow: a large x is replaced by inf
  • Underflow: a small x is replaced by 0
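A quick sketch of both failure modes, assuming NumPy is available; the constants are chosen to sit just outside float32's representable range (max finite value is about 3.4e38, smallest positive subnormal about 1.4e-45):

```python
import numpy as np

# Overflow: a value beyond float32's largest finite number becomes inf
big = np.float32(1e39)

# Underflow: a value below float32's smallest subnormal becomes exactly 0
tiny = np.float32(1e-46)
```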
SLIDE 32

Bug hunting strategies

  • If you increase your learning rate and the loss gets stuck, you are probably rounding your gradient to zero somewhere: maybe you are computing cross-entropy using probabilities instead of logits
  • For a correctly implemented loss, too high a learning rate should usually cause an explosion
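The probabilities-vs-logits pitfall above can be sketched as follows; the function names are illustrative, and the log-sum-exp trick stands in for whatever numerically stable routine a real framework provides:

```python
import numpy as np

def cross_entropy_naive(logits, label):
    # Naive: compute softmax probabilities first, then log them.
    # exp() overflows/underflows for large-magnitude logits, so the
    # probability can become 0 or nan and the loss becomes inf or nan.
    probs = np.exp(logits) / np.sum(np.exp(logits))
    return -np.log(probs[label])

def cross_entropy_stable(logits, label):
    # Stable: stay in log-space using the log-sum-exp trick.
    m = np.max(logits)
    log_z = m + np.log(np.sum(np.exp(logits - m)))
    return log_z - logits[label]

logits = np.array([1000.0, 0.0, -1000.0])  # extreme but illustrative
```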

SLIDE 33

Machine Learning Basics

SLIDE 34

Linear Regression

Figure 5.1
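Linear regression in the spirit of Figure 5.1 can be sketched as a least-squares fit; the data is synthetic and the true slope/intercept (2 and 1) are made-up illustrative values:

```python
import numpy as np

# Synthetic data from y = 2x + 1 plus a little Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.01, size=x.shape)

# Design matrix with a bias column; solve the least-squares problem
X = np.stack([x, np.ones_like(x)], axis=1)
(w, b), *_ = np.linalg.lstsq(X, y, rcond=None)
# w and b recover the generating slope and intercept up to noise
```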

SLIDE 35

Underfitting and Overfitting in Polynomial Estimation

Figure 5.2
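The capacity trade-off of Figure 5.2 can be sketched with polynomial fits; the quadratic ground truth, noise level, and degrees below are illustrative choices:

```python
import numpy as np

# 10 noisy samples from a quadratic
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 10)
y = x**2 + rng.normal(0, 0.05, size=x.shape)

def train_mse(degree):
    # Fit a polynomial of the given degree and measure training error
    coeffs = np.polyfit(x, y, degree)
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

# A degree-9 polynomial has 10 coefficients and can interpolate all
# 10 points, driving training error to ~0 -- but it fits the noise
# (overfitting), while degree 2 matches the data-generating process.
```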

SLIDE 36

Generalization and Capacity

Figure 5.3

SLIDE 37

Training Set Size

Figure 5.4

SLIDE 38

Weight Decay

Figure 5.5

SLIDE 39

Bias and Variance

Figure 5.6

SLIDE 40

Decision Trees

Figure 5.7

SLIDE 41

Principal Components Analysis

Figure 5.8
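PCA as in Figure 5.8 can be sketched via the SVD of centered data; the synthetic anisotropic cloud and all names below are illustrative:

```python
import numpy as np

# 2-D cloud with most of its variance along the first axis
rng = np.random.default_rng(2)
data = rng.normal(0, [3.0, 0.3], size=(500, 2))

# Center, then take the SVD: rows of vt are the principal directions
centered = data - data.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)

# Fraction of total variance explained by the first component
ratio = s[0] ** 2 / np.sum(s ** 2)
```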

SLIDE 42

Curse of Dimensionality

Figure 5.9

SLIDE 43

Nearest Neighbor

Figure 5.10
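A minimal 1-nearest-neighbor sketch in the spirit of Figure 5.10 (toy 2-D points with made-up labels):

```python
import numpy as np

# Toy training set: four labeled points in the plane
train_x = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
train_y = np.array([0, 1, 0, 1])

def predict(query):
    # Predict the label of the closest training point (Euclidean distance)
    dists = np.linalg.norm(train_x - query, axis=1)
    return train_y[np.argmin(dists)]
```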

SLIDE 44

Manifold Learning

Figure 5.11

SLIDE 45

Convolutional Networks

SLIDE 46

Convolutional Networks

  • Scale up neural networks to process very large images / video sequences
  • Sparse connections
  • Parameter sharing
  • Automatically generalize across spatial translations of inputs
  • Applicable to any input that is laid out on a grid (1-D, 2-D, 3-D, …)
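A back-of-the-envelope illustration of why parameter sharing matters; the layer sizes are illustrative assumptions, not from the slides:

```python
# A fully connected layer mapping a 256x256 image to one hidden unit per
# pixel needs one weight per connection; a convolutional layer reuses one
# small kernel at every spatial position, so its parameter count is
# independent of the input size.
input_pixels = 256 * 256
hidden_units = 256 * 256

dense_params = input_pixels * hidden_units   # ~4.3 billion weights
conv_params = 3 * 3                          # one shared 3x3 kernel: 9 weights
```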

SLIDE 47

Key Idea

  • Replace matrix multiplication in neural nets

with convolution

  • Everything else stays the same
  • Maximum likelihood
  • Back-propagation
  • etc.
SLIDE 48

Matrix (Dot) Product

An (m × n) matrix times an (n × p) matrix yields an (m × p) matrix; the inner dimensions (n) must match.
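The shape rule in NumPy terms:

```python
import numpy as np

# (m x n) @ (n x p) -> (m x p); the inner dimensions (n) must match
a = np.ones((2, 3))   # m=2, n=3
b = np.ones((3, 4))   # n=3, p=4
c = a @ b             # result has shape (2, 4); every entry is 3.0
```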

SLIDE 49

Edge Detection by Convolution

Input * Kernel = Output (Figure 9.6)
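A sketch of edge detection by convolution; a difference kernel [-1, 1] is assumed here (a common choice for this demonstration), implemented directly as a row-wise difference:

```python
import numpy as np

# Convolving each row with the kernel [-1, 1] computes the horizontal
# difference between neighboring pixels, so intensity jumps (edges) light up
# while flat regions map to zero.
image = np.array([[0.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0]])

edges = image[:, 1:] - image[:, :-1]   # 'valid' convolution with [-1, 1]
# Each row of `edges` is [0, 1, 0]: only the edge column responds
```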

SLIDE 50

Practical Methodology

SLIDE 51

What drives success in ML?

SLIDE 52

Example: Street View Address Number Transcription

SLIDE 53

Three Step Process

SLIDE 54

Identify Needs

SLIDE 55

Choose Metrics

SLIDE 56

End-to-end System

SLIDE 57

Deep or Not?

SLIDE 58

Choosing Architecture Family

SLIDE 59

Increasing Depth

SLIDE 60

High Test Error

SLIDE 61

Increasing Training Set Size

SLIDE 62

Tuning the Learning Rate

Figure 11.1

SLIDE 63

Monte Carlo Methods

SLIDE 64

Roadmap

  • Basics of Monte Carlo methods
  • Importance Sampling
  • Markov Chains
SLIDE 65

Randomized Algorithms

                    Las Vegas                      Monte Carlo
  Type of answer    Exact                          Random amount of error
  Runtime           Random (until answer found)    Chosen by user (longer runtime gives less error)

SLIDE 66

Estimating sums / integrals with samples

SLIDE 67

Justification

  • Unbiased:
    • The expected value for finite n is equal to the correct value
    • The value for any specific n samples will have random error, but the errors for different sample sets cancel out
  • Low variance:
    • Variance is O(1/n)
    • For very large n, the error converges "almost surely" to 0
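These properties can be sketched with a Monte Carlo estimate of a known integral; the integrand, sample sizes, and seed below are illustrative choices:

```python
import random

# Monte Carlo estimate of the integral of x^2 over [0, 1] (true value: 1/3):
# average the integrand over n uniform samples.
def mc_estimate(n, seed=0):
    rng = random.Random(seed)
    return sum(rng.random() ** 2 for _ in range(n)) / n

small = mc_estimate(100)       # noisy estimate
large = mc_estimate(100_000)   # standard error shrinks like 1/sqrt(n)
```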
SLIDE 68

For more information…

SLIDE 69

Object Categorization

Lecture slides adapted from "Object Categorization: an Overview and Two Models", Fei-Fei Li

SLIDE 70

SLIDE 71

perceptible, vision, material thing

SLIDE 72

Plato said… Ordinary objects are classified together if they 'participate' in the same abstract Form, such as the Form of a Human or the Form of Quartz. Forms are proper subjects of philosophical investigation, for they have the highest degree of reality. Ordinary objects, such as humans, trees, and stones, have a lower degree of reality than the Forms. Fictions, shadows, and the like have a still lower degree of reality than ordinary objects and so are not proper subjects of philosophical enquiry.

SLIDE 73

How many object categories are there?

SLIDE 74

  • Identification: is that Potala Palace?
  • Verification: is that a lamp?
  • Detection: are there people?

SLIDE 75

mountain tree building

SLIDE 76
SLIDE 77
SLIDE 78
SLIDE 79
SLIDE 80
SLIDE 81
SLIDE 82
SLIDE 83
SLIDE 84

Three main issues

  • Representation: how to represent an object category
  • Learning: how to form the classifier, given training data
  • Recognition: how the classifier is to be used on novel data

SLIDE 85

“Bag-of-words” models

SLIDE 86
SLIDE 87
SLIDE 88
SLIDE 89
SLIDE 90

Rethinking Robotics for the Robot Companion of the future

Hints that DL … MUST WORK

SLIDE 91

Thank you!

fabio.bonsignorio@gmail.com fabio.bonsignorio@heronrobots.com www.shanghailectures.org

SLIDE 92

The Shanghai Lectures 2020

HeronRobots Path-finder Lectures Natural and Artificial Intelligence in Embodied Physical Agents

SLIDE 93