An Introduction to Machine Learning Shrey Gupta, Student at Duke - - PowerPoint PPT Presentation



slide-1
SLIDE 1

An Introduction to Machine Learning

Shrey Gupta, Student at Duke University

slide-2
SLIDE 2

Who am I?

Senior at Duke University interested in machine learning. Previously did research & engineering at Google and quantitative research at a hedge fund. Headed to work on self-driving simulation after graduation. Co-founded and now advise Duke’s first undergraduate ML student group.

slide-3
SLIDE 3
slide-4
SLIDE 4

What is machine learning?

slide-5
SLIDE 5

What is machine learning?

“Give computers the ability to learn without being explicitly programmed.” -Arthur Samuel

slide-6
SLIDE 6

What is (not) machine learning?

zip_code = input('What is your zip code? ')
# Hand-written rules: every case must be listed explicitly by the programmer.
if zip_code in LIST_OF_NC_ZIPCODES:
    print('user resides in North Carolina!')
if zip_code in LIST_OF_FL_ZIPCODES:
    print('user resides in Florida!')
...

slide-7
SLIDE 7

What is machine learning?

input (data): income, race, political affiliation, favorite grocery chain, ...

output: state of residence

slide-8
SLIDE 8

What is machine learning?

Training data: the data used to train the algorithm (i.e., to create the model).

example data point (x 1,000): income, race, political affiliation, favorite grocery chain, ...

analyze examples for patterns -> model
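The step of analyzing examples for patterns can be sketched with a very small frequency-count model. This is only an illustration, not the method used in the talk: the training examples are invented, the model looks at a single feature (favorite grocery chain), and the names `train` and `predict` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical training data: (favorite grocery chain, state of residence).
# These examples are invented for illustration only.
TRAINING_DATA = [
    ("Harris Teeter", "NC"),
    ("Harris Teeter", "NC"),
    ("Publix", "FL"),
    ("Publix", "FL"),
    ("Publix", "NC"),
    ("Food Lion", "NC"),
]


def train(examples):
    """Build a model: for each feature value, remember its most common label."""
    counts = defaultdict(Counter)
    for feature, label in examples:
        counts[feature][label] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}


def predict(model, feature, default="unknown"):
    """Look up the learned label, falling back to a default for unseen values."""
    return model.get(feature, default)


model = train(TRAINING_DATA)
print(predict(model, "Publix"))        # FL (2 of the 3 Publix examples are FL)
print(predict(model, "Trader Joe's"))  # unknown (never seen in training)
```

Unlike the hard-coded zip code rules, nothing here names a state in an `if` statement; the mapping comes entirely from the examples, so different training data yields a different model.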

slide-9
SLIDE 9

What types of algorithms are there?

Commonly grouped into two categories: supervised and unsupervised learning.

slide-10
SLIDE 10

Supervised learning: classification

Data is labeled, and we want to predict a “class” or “category” as the output.

input (data): feature #1, feature #2, ...

output: category #1 OR category #2 OR ...

slide-11
SLIDE 11

Example: classification

Given data about temperature, humidity, and wind speed, predict whether it will be sunny, cloudy, or raining.

input (data): temperature, humidity, wind speed

output: sunny OR cloudy OR raining
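One possible way to build such a classifier is k-nearest neighbors, sketched below. The slides do not name an algorithm, so this choice is an assumption, as are the made-up measurements and the function name `classify`.

```python
import math
from collections import Counter

# Hypothetical labeled examples:
# (temperature in degrees F, humidity in %, wind speed in mph) -> weather.
# The numbers are made up for illustration.
TRAINING_DATA = [
    ((85, 30, 5), "sunny"),
    ((90, 25, 8), "sunny"),
    ((88, 35, 4), "sunny"),
    ((65, 60, 10), "cloudy"),
    ((60, 65, 12), "cloudy"),
    ((62, 55, 9), "cloudy"),
    ((55, 90, 20), "raining"),
    ((50, 95, 18), "raining"),
    ((58, 88, 22), "raining"),
]


def classify(point, examples, k=3):
    """Predict the majority label among the k closest training examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(point, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((87, 28, 6), TRAINING_DATA))   # sunny
print(classify((52, 92, 19), TRAINING_DATA))  # raining
```

The features are used on their raw scales here; in practice they would usually be normalized first so that no single feature dominates the distance.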

slide-12
SLIDE 12

Example: classification

Predict whether the price of an equity will increase, decrease, or stay the same.

input (data): P/E ratio, volatility, analyst sentiment, current price

output: increase OR decrease OR stay the same

slide-13
SLIDE 13

Supervised learning: regression

Data is labeled, and we want to predict a continuous output.

input (data): feature #1, feature #2, ...

output: value

slide-14
SLIDE 14

Example: regression

Predict the percentage increase or decrease in the price of an equity.

input (data): P/E ratio, volatility, analyst sentiment, current price

output: percentage change in price

slide-15
SLIDE 15

Example: regression

Given data about square footage, age, zip code, and housing demand, predict the selling price of a house.

input (data): age, zip code, square footage, housing demand

output: selling price (dollars)
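A minimal regression sketch for the house-price example, assuming a single feature (square footage) and ordinary least squares. The slides do not specify a model, and the data below is invented so that it lies exactly on a line; `fit_line` is a hypothetical helper name.

```python
# Hypothetical training data: (square footage, selling price in dollars).
# Prices were generated from price = 150 * sqft + 50_000, so the fit is exact.
HOUSES = [(1000, 200_000), (1500, 275_000), (2000, 350_000), (2500, 425_000)]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of centered products and centered squares (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(HOUSES)
predicted = slope * 1800 + intercept  # predict the price of an 1,800 sq ft house
print(round(predicted))  # 320000
```

The output is a continuous dollar amount rather than a category, which is what distinguishes regression from classification.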

slide-16
SLIDE 16

Unsupervised learning: clustering

Data is unlabeled, and we want to cluster the data points into groups.
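Grouping unlabeled points can be sketched with k-means, one common clustering algorithm. The slides do not name an algorithm, so this is an assumption; the customer data is invented, and the evenly spaced seeding is a simplification chosen to keep the example deterministic.

```python
import math

# Hypothetical customers: (age, monthly spend in dollars). Invented for illustration.
POINTS = [
    (16, 40), (17, 55), (18, 50),
    (30, 300), (32, 320), (29, 310),
    (66, 120), (68, 110), (70, 130),
]


def k_means(points, k, iterations=10):
    """Plain k-means: alternate assigning points and recomputing centroids."""
    # Assumption: seed centroids with k evenly spaced input points
    # (real implementations usually use random or k-means++ initialization).
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(c) for coord in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters


for cluster in k_means(POINTS, k=3):
    print(cluster)
```

No labels are supplied anywhere; the three groups emerge only from how close the points are to one another.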

slide-17
SLIDE 17

Example: clustering

Given consumption data, partition the consumers into market segments.

Example segments: high school teen, college teen, having a baby, age 50+, old techies, just retired, just married

slide-18
SLIDE 18

Example: clustering

Given consumption data, partition the consumers into market segments.

what’s everybody else buying?

slide-19
SLIDE 19

Example: clustering

Given several news articles (and their text), group them based on similarity.

Example clusters: NBA, NCAA, NFL | election, Congress, Trump | flu

slide-20
SLIDE 20

Example: clustering

Given several news articles (and their text), group them based on similarity.

here are similar articles you might like!

slide-21
SLIDE 21

What is happening today in machine learning?

slide-22
SLIDE 22

Computer vision

Computer vision is a related field that involves the understanding, processing, and reconstruction of 2- and 3-dimensional images. Common computer vision tasks in machine learning include classification, localization, object detection, and landmark detection.

slide-23
SLIDE 23

classification, localization, object detection, landmark detection

slide-24
SLIDE 24

Computer vision

1998: Yann LeCun organizes the MNIST database of handwritten digits, and develops a model that can classify handwritten digits.

slide-25
SLIDE 25

Computer vision

2012: Google Brain trains a neural network on unlabeled frames from YouTube videos that learns to recognize cats.

slide-26
SLIDE 26

Computer vision

2014: Facebook’s DeepFace successfully uses neural networks to perform facial recognition with over 97% accuracy.

slide-27
SLIDE 27

Computer vision

2015: Joseph Redmon and colleagues introduce “You Only Look Once” (YOLO), which performs object detection in real time, much faster than earlier detectors.

slide-28
SLIDE 28
slide-29
SLIDE 29

Natural language processing

Natural language processing is a subset of artificial intelligence concerned with understanding natural language, including text and speech. Examples include sentiment analysis, language translation, reading comprehension, and textual question-answering.

slide-30
SLIDE 30

Natural language processing

2006: Google Translate launches, allowing translation between multiple languages for free.

slide-31
SLIDE 31

Natural language processing

2011: Siri, a natural language intelligent assistant, launches.

slide-32
SLIDE 32

Other impressive achievements

1997: IBM’s Deep Blue beats chess world champion Garry Kasparov.
2009: The Netflix Prize is awarded for the best recommender system for predicting users’ film ratings.
2011: IBM’s Watson defeats human champions in Jeopardy!

slide-33
SLIDE 33

Other impressive achievements

2014: The “Eugene Goostman” chatbot fools a third of judges in a Turing test competition.
2016: DeepMind’s AlphaGo beats Go world champion Lee Sedol. AlphaGo Zero follows the next year, along with AlphaZero, which generalizes the approach to chess and other games.
slide-34
SLIDE 34

When is machine learning useful?

slide-35
SLIDE 35

Power, complexity, and data

We have enormous amounts of data and compute power today. More complex models need lots of data. Otherwise, the model might overfit, finding patterns in the training data that don’t really exist.

slide-36
SLIDE 36

Evaluation

You need to evaluate your model carefully. Several metrics are available, such as mean absolute error for regression and accuracy and precision for classification, along with methods such as cross-validation.
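The metrics and the cross-validation split mentioned above can be sketched in a few lines. The function names are hypothetical, the labels and predictions are made up, and returning 0.0 for precision when nothing is predicted positive is an assumption rather than a standard.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted values (regression)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy(actual, predicted):
    """Fraction of predictions that match the true label (classification)."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def precision(actual, predicted, positive):
    """Of the examples predicted as `positive`, the fraction that truly are."""
    predicted_pos = [a for a, p in zip(actual, predicted) if p == positive]
    if not predicted_pos:
        return 0.0  # assumption: report 0 when nothing was predicted positive
    return sum(a == positive for a in predicted_pos) / len(predicted_pos)


def k_fold_splits(n_examples, k):
    """Yield (train_indices, test_indices) so each example is tested exactly once."""
    indices = list(range(n_examples))
    for fold in range(k):
        test = indices[fold::k]
        train = [i for i in indices if i % k != fold]
        yield train, test


# Made-up labels and predictions, for illustration only.
truth = ["rain", "sun", "rain", "sun", "rain"]
guess = ["rain", "rain", "rain", "sun", "sun"]
print(accuracy(truth, guess))                       # 0.6
print(precision(truth, guess, "rain"))              # 2 of 3 "rain" predictions are right
print(mean_absolute_error([200, 300], [210, 280]))  # 15.0
```

In cross-validation, the model is trained on each `train` split and scored on the matching `test` split, and the scores are then averaged across folds.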

slide-37
SLIDE 37

Prediction and interpretability

Machine learning models are good for prediction, but do not by themselves establish underlying causation. Complex models can be difficult to interpret.

slide-38
SLIDE 38

Algorithmic bias

Machine learning is often used for high-stakes decisions, such as determining whether to extend credit, identifying criminal and terrorist suspects through facial recognition, and predicting recidivism. Training data needs to be representative and unbiased.

slide-39
SLIDE 39
slide-40
SLIDE 40
slide-41
SLIDE 41

How do I get started?

slide-42
SLIDE 42

How do I get started?

Use online resources such as Coursera. Attend Duke’s Machine Learning Day (dukeml.org/ml-day) and MLBytes talks (dukeml.org/mlbytes). Start an ML group to gather interest in state-of-the-art tools and technologies being developed. Work on a project that uses ML!

slide-43
SLIDE 43

shreygupta.me/durham-tech-slides