Defining Machine Learning. Dr. Alex Williams. August 21, 2020.

SLIDE 1

Defining Machine Learning

1 COSC 425: Intro. to Machine Learning

COSC 425: Introduction to Machine Learning Fall 2020 (CRN: 44874)

  • Dr. Alex Williams

August 21, 2020

SLIDE 2

Syllabus Clarifications

  • #1: No textbook requirement. (See Daumé in Canvas.)
  • #2: Added Office Hours link to Canvas.
  • #3: Alternative Course Website: http://web.eecs.utk.edu/~acw/teaching/cosc425/

SLIDE 3

SLIDE 4

Syllabus Clarifications

#4: Modern Machine Learning → Python

  • LearnPython (http://learnpython.org)
  • PythonTutor (http://pythontutor.com)
  • Programming w/ Mosh (https://www.youtube.com/…)
  • YouTube Video → 6-hour Intro to Python.

SLIDE 5

Any Questions?

Use Zoom’s “Raise Hand” feature, and I’ll un-mute you.

SLIDE 6

Today’s Agenda

We will address:

  • 1. What is “Machine Learning” (ML)?
  • 2. How is ML operationalized?
  • 3. What are the grand challenges of modern ML?

SLIDE 7

What is Machine Learning?

SLIDE 8

How would you define “machine learning”?

Use Zoom’s “Raise Hand” feature, and I’ll un-mute you.

SLIDE 9

“At a basic level, machine learning is about predicting the future based on the past.”

  • Hal Daumé III

SLIDE 10

“Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.”

  • Arthur Samuel (1959)

SLIDE 11

“How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?”

  • Tom Mitchell (1998)

SLIDE 12

“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”

  • Tom Mitchell (1998)

SLIDE 13

(Representation + Evaluation + Optimization) = Learning

  • Pedro Domingos (2012)
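
One way to read this equation is as three small pieces of code. The sketch below is illustrative only: the toy data, the single-threshold model, and the function names are assumptions made for this example, not something taken from the lecture.

```python
# Toy data (made up for illustration): one feature per example, label 0 or 1.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]


def predict(threshold, x):
    """Representation: the model is a single threshold on x."""
    return 1 if x > threshold else 0


def accuracy(threshold, xs, ys):
    """Evaluation: fraction of examples the model labels correctly."""
    correct = sum(predict(threshold, x) == y for x, y in zip(xs, ys))
    return correct / len(xs)


def fit(xs, ys):
    """Optimization: try the midpoints between sorted inputs, keep the best."""
    points = sorted(xs)
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]
    return max(candidates, key=lambda t: accuracy(t, xs, ys))


best = fit(xs, ys)
print(best, accuracy(best, xs, ys))  # prints: 4.5 1.0
```

Swapping any one piece (a different model family, a different score, a different search) gives a different learner, which is the point of the decomposition.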

SLIDE 14

[Figure: Machine Learning shown in relation to Computer Science, Artificial Intelligence, Mathematics, and Statistics; labeled “Types of ML”.]

  • Ryan Urbanowicz (2018)

SLIDE 15

  • Someone, at some point in time.

SLIDE 16

So, what’s the right definition?

Technically: All of them.

SLIDE 17

The overarching goal of these methods is to learn a function from prior data.

Spoiler: Machine learning is (mostly) operationalized mathematics.

SLIDE 18

Terminology

tumor_size  texture  perimeter  …  |  outcome  time
18.02       27.6     117.5         |  N        31
17.99       10.38    122.8         |  N        61
20.29       14.34    135.1         |  R        27
…           …        …             |  …        …

  • Dataset: the whole table (i.e. with Input-Output Pairs)
  • Input Variables (Features): tumor_size, texture, perimeter, …
  • Output Variables (Targets): outcome, time
  • Example / Instance: a single row
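
In code, the same vocabulary usually shows up as a feature matrix and a list of targets. The sketch below uses the three rows visible on the slide; treating tumor_size, texture, and perimeter as inputs and outcome and time as outputs is read off the slide's labels, and the variable names `X` and `y` are just a common convention.

```python
# The three rows shown on the slide, one dict per example (instance).
dataset = [
    {"tumor_size": 18.02, "texture": 27.6, "perimeter": 117.5, "outcome": "N", "time": 31},
    {"tumor_size": 17.99, "texture": 10.38, "perimeter": 122.8, "outcome": "N", "time": 61},
    {"tumor_size": 20.29, "texture": 14.34, "perimeter": 135.1, "outcome": "R", "time": 27},
]

feature_names = ["tumor_size", "texture", "perimeter"]  # input variables
target_names = ["outcome", "time"]                      # output variables

# X holds the inputs of every example; y holds the matching outputs.
X = [[row[name] for name in feature_names] for row in dataset]
y = [[row[name] for name in target_names] for row in dataset]

for features, targets in zip(X, y):
    print(features, "->", targets)
```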

SLIDE 19

Training Data: Input-output Pairs (x_i, y_i) → Learning Algorithm → f

Testing Data: x → f → f(x) (compared with y)

Major Assumption: You have access to y_i (i.e. output variables).

Goal: Maximize performance for any x.

Both in Training and Test Data!
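
The train-then-test flow can be sketched in a few lines. Everything here is a stand-in chosen for brevity: the input-output pairs are made up, and 1-nearest-neighbour lookup plays the role of the learning algorithm only because it is short.

```python
# Hypothetical input-output pairs (x_i, y_i); the values are made up.
train = [(1.0, "N"), (2.0, "N"), (8.0, "R"), (9.0, "R")]
test = [(1.4, "N"), (8.6, "R"), (5.2, "N")]


def learn(pairs):
    """A stand-in learning algorithm: return f that copies the nearest label."""
    def f(x):
        nearest_x, nearest_y = min(pairs, key=lambda p: abs(p[0] - x))
        return nearest_y
    return f


def accuracy(f, pairs):
    """Performance measure: fraction of pairs where f(x) equals y."""
    return sum(f(x) == y for x, y in pairs) / len(pairs)


f = learn(train)
print("train accuracy:", accuracy(f, train))
print("test accuracy:", accuracy(f, test))
```

On this toy data f is perfect on the pairs it was trained on but misses one unseen test example, which is why performance has to be checked on both sets.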

SLIDE 20

Testing Data: x → f → f(x) (compared with y)

What does “f” look like?

Linear regression as an example.
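
For a single input variable, linear regression gives f the form f(x) = w·x + b. A minimal sketch, assuming ordinary least squares and made-up data (the slide does not say how the line is fit):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Made-up training pairs generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)


def f(x):
    """The learned function: a straight line."""
    return w * x + b


print(w, b, f(5.0))  # prints: 2.0 1.0 11.0
```

Here "learning" amounts to choosing the two numbers w and b from the training pairs; f can then be applied to an x it has never seen.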

SLIDE 21

Types of Machine Learning

SLIDE 22

Types of Machine Learning

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning

SLIDE 23

Supervised Learning: Classification

Use-Case Criteria:

  • You have output variables, i.e. y_i.
  • Your OVs are discrete / categorical.

Example: Spam Filtering

  • Goal: Learn a function for a categorical output.
  • e.g. {spam, not spam}

     isUTKEmail  HeaderKeyword  Word 1   Word 2  …  isSpam
x1   Yes         CS425          Hi       Prof    …  No
x2   Yes         Orientation    Alex     You     …  No
x3   No          urgent         Dear     Sir     …  Yes
x4   No          cash           hello    I       …  Yes
x5   No          help           are      you     …  Yes
x6   Yes         Survey         Faculty  this    …  No
…
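
To make the spam example concrete, here is a deliberately simple classifier trained on the six rows from the slide. The "one rule" approach (map each value of a single feature to its most common label) and the helper names are assumptions chosen for illustration; the slide does not prescribe a particular algorithm.

```python
from collections import Counter, defaultdict

# The six rows from the slide, keeping two of the input variables.
rows = [
    {"isUTKEmail": "Yes", "HeaderKeyword": "CS425", "isSpam": "No"},
    {"isUTKEmail": "Yes", "HeaderKeyword": "Orientation", "isSpam": "No"},
    {"isUTKEmail": "No", "HeaderKeyword": "urgent", "isSpam": "Yes"},
    {"isUTKEmail": "No", "HeaderKeyword": "cash", "isSpam": "Yes"},
    {"isUTKEmail": "No", "HeaderKeyword": "help", "isSpam": "Yes"},
    {"isUTKEmail": "Yes", "HeaderKeyword": "Survey", "isSpam": "No"},
]


def learn_rule(rows, feature, target="isSpam"):
    """Map each value of `feature` to its most common label in the rows."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[row[feature]][row[target]] += 1
    return {value: counts.most_common(1)[0][0] for value, counts in votes.items()}


rule = learn_rule(rows, "isUTKEmail")


def f(example):
    """The learned function: a categorical output, 'Yes' (spam) or 'No'."""
    return rule[example["isUTKEmail"]]


print(rule)
print(f({"isUTKEmail": "No", "HeaderKeyword": "prize"}))
```

The output variable takes one of two discrete values, which is what makes this a classification problem rather than a regression problem.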

SLIDE 24

Supervised Learning: Regression

Use-Case Criteria:

  • You have output variables, i.e. yi.
  • Your OVs are continuous.

Example: Tesla Speed Control

  • Goal: Learn a function for a continuous output.
  • e.g. {0-100 MPH}

SLIDE 25

Criticism: Output Variables are Unknown.

tumor_size  texture  perimeter  …  |  outcome  time
18.02       27.6     117.5         |  N        31
17.99       10.38    122.8         |  N        61
20.29       14.34    135.1         |  R        27
…           …        …             |  …        …

  • Input Variables (Features): tumor_size, texture, perimeter, …
  • Output Variables (Targets): outcome, time (X: crossed out)

SLIDE 26

Unsupervised Learning: Clustering

Use-Case Criteria:

  • You have no output variables.

Example: Unlabeled Data

  • Goal: Learn a function from input.
  • e.g. Organize the data!
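
One common way to "organize the data" without labels is k-means clustering. The sketch below is a minimal one-dimensional version with made-up points and hand-picked starting centers; k-means is used here only as a representative clustering method, since the slide does not name one.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on numbers: assign to nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else old
                   for c, old in zip(clusters, centers)]
    return centers, clusters


# Unlabeled, made-up inputs: there is no y_i anywhere in this example.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])

print(centers)   # prints: [1.5, 9.5]
print(clusters)  # prints: [[1.0, 1.5, 2.0], [9.0, 9.5, 10.0]]
```

The algorithm discovers two groups from the inputs alone; nothing tells it what the groups mean.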

SLIDE 27

Unsupervised Learning: Feature Selection

Long-Term Goal.

  • Figure out which inputs matter.

Feasible, but Challenging.

  • Data, data, and more data.

+2000 Citations! https://arxiv.org/pdf/1112.6209.pdf

SLIDE 28

Criticism: “Learning from Data” isn’t Learning.

SLIDE 29

Reinforcement Learning

Use-Case Criteria:

  • You have some “environment”.
  • You have some notion of “good” behavior.
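
A minimal sketch of both ingredients, assuming a made-up corridor environment and tabular Q-learning as the learning method (neither is specified on the slide). The environment is the `step` function; "good" behavior is encoded as a reward of 1 for reaching the right-hand end.

```python
import random

N_STATES = 5                 # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Hypothetical environment: move one cell, reward 1 on reaching the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random exploration policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


q = q_learning()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # prints: ['right', 'right', 'right', 'right']
```

No example ever says "go right"; the agent infers it from the rewards it collects while acting in the environment.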

SLIDE 30

Case Studies

SLIDE 31

Case #1: OCR

[Figure: a neural network, existing instances, and a new instance to classify.]

SLIDE 32

Case #1: OCR

[Figure: a scale running from Least Complex to Most Complex.]

https://en.wikipedia.org/wiki/MNIST_database

SLIDE 33

Case #1: OCR

Machines can be fooled!

Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images https://arxiv.org/abs/1412.1897

SLIDE 34

Case #2: Computer Vision

SLIDE 35

Case #2: Computer Vision

DeepFace: 97.35% vs Human: 97.53%

https://arxiv.org/pdf/1804.06655.pdf

SLIDE 36

Case #2: Computer Vision

SLIDE 37

Case #3: Image Captioning

“Two pizzas on a stove with wine.”
“Three men playing frisbee in the grass.”

SLIDE 38

Case #3: Image Captioning

“A refrigerator filled with lots of food and drinks.”
“A yellow school bus.”

SLIDE 39

Case #4: Games

  • March 2016: AlphaGo defeats Lee Sedol.
  • “AlphaGo can’t beat me.” - Ke Jie (World Champion)
  • May 2017: AlphaGo Master defeats Ke Jie.
  • “Last year, AlphaGo was still quite humanlike when it played. But this year, it became like a god of Go.” - Ke Jie
  • Oct 2017: AlphaGo Zero outperforms AlphaGo Master.
  • Key Point: No prior training based on human expertise.

SLIDE 40

Case #5: Text Generation

A Statistical Model of Language

Text Corpus
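
The simplest statistical model of language is a bigram model: count which word follows which in a corpus, then generate by following the counts. The sketch below is a toy illustration with a made-up corpus and hypothetical helper names; it is far simpler than the neural models discussed on the following slides.

```python
from collections import Counter, defaultdict

# Made-up text corpus, already split into tokens by spaces.
corpus = "the cat sat on the mat . the cat sat on the fish ."


def train_bigrams(text):
    """Count how often each word follows each other word."""
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def generate(model, start, length):
    """Greedy generation: always emit the most frequent next word."""
    out = [start]
    for _ in range(length - 1):
        options = model.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


model = train_bigrams(corpus)
print(model["the"])               # counts of words seen after "the"
print(generate(model, "the", 4))  # prints: the cat sat on
```

The model can only reproduce patterns that appear in its corpus, a limitation that applies to every model of this kind.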

SLIDE 41

Case #5: Text Generation

Generative Pre-trained Transformer 2 (GPT-2)

This example uses arXiv-NLP’s training set. Try it here: https://transformer.huggingface.co/doc/arxiv-nlp

SLIDE 42

Case #5: Text Generation

GPT-3: Text Understanding

  • OpenAI. Beta, Summer 2020.

(Not available to the public.)

Writing HTML + CSS … via text-commands?

SLIDE 43

Case #5: Text Generation

Problem: Machine learning hinges on prior data.

Qui-Gon Jinn to Jar Jar Binks. (32 BBY)

SLIDE 44

Grand Challenges

SLIDE 45

Today’s Machine Learning

Machine Learning is Modern Computer Science

  • Productivity Tools (e.g. Microsoft Word)
  • Well-Being Tools (e.g. Woebot)
  • Fraud Detection (e.g. CapitalOne, etc.)
  • Speech Recognition (e.g. “Hey Google”)

… Why is Machine Learning Everywhere?

  • Sensing + Devices → Explosion of Data
  • Hardware Advances → Explosion of Processing Capabilities
  • Democratized ML → Explosion of Resources, Frameworks, etc.
  • The Era of AI → Companies, investors, start-ups, etc.

SLIDE 46

Grand Challenge #1: Data

O(n²) algorithms are infeasible.

  • ML has largely ignored algorithmic complexity.

A Need for Democratized Supercomputers.

  • New techniques for processing large datasets.

A Need for Parallelization.

  • Existing systems generally parallelize poorly (if at all).
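
A back-of-the-envelope calculation shows why quadratic algorithms stop being practical as datasets grow. The throughput figure below is an assumption picked for round numbers, not a measurement, and constant factors are ignored.

```python
import math

OPS_PER_SECOND = 1e9  # assumed throughput, for illustration only


def seconds(operations):
    """Rough running time for a given number of basic operations."""
    return operations / OPS_PER_SECOND


for n in (10**3, 10**6, 10**9):
    linearithmic = seconds(n * math.log2(n))
    quadratic = seconds(n ** 2)
    print(f"n={n:>10}: n log n ~ {linearithmic:.3g} s, n^2 ~ {quadratic:.3g} s")
```

Under this assumption, a billion examples need about 10^9 seconds of quadratic work, roughly 32 years, while an n log n method finishes in about half a minute.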

SLIDE 47

Grand Challenge #2: End-to-End Learning

The ML pipeline is substantial.

  • Efforts to streamline learning.

Single characters → Text Classification

  • https://arxiv.org/abs/1509.01626

Pixels → Autonomous Steering

  • https://arxiv.org/pdf/1604.07316v1.pdf

SLIDE 48

Grand Challenge #3: ML Research

https://www.youtube.com/watch?v=-0G98MYUtjI

SLIDE 49

Grand Challenge #4: People

Stanford’s HAI Conference. October 7, 2020.

SLIDE 50

Today’s Agenda

You should now have answers to:

  • 1. What is “Machine Learning” (ML)?
  • 2. How is ML operationalized?
  • 3. What are the grand challenges of modern ML?

SLIDE 51

Reading

  • Daumé. Sec 1.1 + 1.2
  • Wagstaff. All of it!

SLIDE 52

Any Questions?

Use Zoom’s “Raise Hand” feature, and I’ll un-mute you.

SLIDE 53

Next Week

  • August 24th: Decision Trees
  • August 26th: Decision Trees (continued)
  • August 28th: The Limits of Learning

** All Asynchronous **