A Taste of Data Science Michael Clarkson NACS Executive Leadership - - PowerPoint PPT Presentation



slide-1
SLIDE 1

A Taste of Data Science

Michael Clarkson NACS Executive Leadership Program at Cornell August 2019

Download these slides at https://tinyurl.com/TasteDataScience

slide-2
SLIDE 2

Venn diagram: Computer Science, Business, and Statistics, with Data Science at their intersection

Diagram inspired by http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

slide-3
SLIDE 3

Michael Clarkson, PhD Senior Lecturer in Computer Science Cornell University

Launched Data Science For All at Cornell

slide-4
SLIDE 4

Photo: http://100photos.time.com/photos/andreas-gursky-99-cent

slide-5
SLIDE 5

$

Image: https://pixabay.com/photos/data-computer-internet-online-www-2899901/

slide-6
SLIDE 6

Photo: https://www.flickr.com/photos/robertnelson/29699674200 https://blog.blueapron.io/forecasting-demand-at-blue-apron-ba62d6af5da2 https://www.forbes.com/sites/stevebanker/2018/01/02/data-science-and-the-meal-kit-subscription-business-model

slide-7
SLIDE 7

Photo: https://stories.starbucks.com/stories/2016/starbucks-mobile-app-launches-in-indonesia https://www.forbes.com/sites/bernardmarr/2018/05/28/starbucks-using-big-data-analytics-and-artificial-intelligence-to-boost-performance https://www.forbes.com/sites/bernardmarr/2018/04/04/how-mcdonalds-is-getting-ready-for-the-4th-industrial-revolution-using-ai-big-data-and-robotics https://www.wired.com/story/mcdonalds-big-data-dynamic-yield-acquisition/ https://www.reuters.com/article/us-mcdonalds-mobile-idUSKBN16L2RM

slide-8
SLIDE 8

Photo: https://www.uber.com/en-BE/blog/ubereats-antwerp https://www.wired.com/story/how-data-helps-deliver-your-dinner-on-time-warm https://www.eater.com/2018/10/24/18018334/uber-eats-virtual-restaurants https://venturebeat.com/2018/10/02/uber-eats-and-the-6b-bookings-run-rate-the-ai-success-story-no-one-is-talking-about

slide-9
SLIDE 9

Data science is the key to personalized consumer experience and efficient operations at scale

slide-10
SLIDE 10

What is data science?

Answering questions from data using computation

slide-11
SLIDE 11

  • Ask Question
  • Obtain Data
  • Understand Data
  • Understand World

slide-12
SLIDE 12

Example: Grocery Delivery

What questions would you ask? What data would you obtain?

slide-13
SLIDE 13

https://tech.instacart.com/3-million-instacart-orders-open-sourced-d40d29ead6f2

3,000,000 orders

200,000 customers

2017 Public Dataset

slide-14
SLIDE 14

Photo: https://pxhere.com/en/photo/940160

slide-15
SLIDE 15

Exploration: identifying patterns
Prediction: making guesses
Inference: quantifying reliability

slide-16
SLIDE 16

Exploration: identifying patterns
Prediction: making guesses
Inference: quantifying reliability

DEMO

slide-17
SLIDE 17

Data scientists…

  • Organize: collect and clean data
  • Discover and communicate: explore, program, and visualize
  • Automate: separate data files from analyses for repeatability
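
A minimal sketch of keeping data separate from analysis, assuming a hypothetical order file with `order_id` and `product` columns (these column names and the sample rows are illustrative, not taken from the Instacart dataset). The analysis is a function that accepts any file-like object, so the same code can be rerun unchanged when the data file changes:

```python
import csv
import io
from collections import Counter

# Stand-in for a data file on disk; the columns and rows are hypothetical.
SAMPLE_CSV = """order_id,product
1,bananas
1,milk
2,bananas
3,strawberries
3,bananas
4,milk
"""

def top_products(rows, n=3):
    """Return the n most frequently ordered products from dict-like rows."""
    counts = Counter(row["product"] for row in rows)
    return counts.most_common(n)

def analyze(file_obj, n=3):
    """Run the analysis on any open CSV file; the data is never hard-coded here."""
    return top_products(csv.DictReader(file_obj), n)

# The same call works for open("orders.csv") or for the in-memory sample.
print(analyze(io.StringIO(SAMPLE_CSV), n=2))  # prints [('bananas', 3), ('milk', 2)]
```

Because `analyze` only depends on the column names, repeating the analysis on next month's file is a one-line change to the input.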

slide-18
SLIDE 18

Exploration: identifying patterns
Prediction: making guesses
Inference: quantifying reliability

DEMO

slide-19
SLIDE 19

Recommendations

slide-20
SLIDE 20

Program Input Output

Image: https://en.wikipedia.org/wiki/File:Computer.svg Photo: https://www.flickr.com/photos/nihgov/23682213069 Explanation (Prof. Weinberger): http://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote01_MLsetup.html

slide-21
SLIDE 21

✔ ❌ ❌

slide-22
SLIDE 22

Data Program Input Output

✔ ❌ ❌ ✔

slide-23
SLIDE 23

Machine Learning:

Programs that improve on some task with experience

https://commons.wikimedia.org/wiki/File:Backgammon_lg.jpg https://pixabay.com/vectors/email-mail-spam-message-e-mail-29853/ https://pixabay.com/vectors/car-automobile-tesla-autonomous-2692593/

slide-24
SLIDE 24

https://www.freepik.com/free-photo/3d-render-male-head-showing-brain_1111538.htm

Artificial Intelligence Machine Learning

slide-25
SLIDE 25

Beyond bananas…

Photo: https://pixabay.com/illustrations/bananas-fruit-yellow-plant-food-3735673/

slide-26
SLIDE 26

Nearest neighbors algorithm

To make a recommendation for bananas to Alice:

  • Find the 3 “most similar” customers to Alice
  • Have them take a majority vote on whether they would recommend bananas
  • Use that decision
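
The three steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used in the talk: the customers, their purchase histories, and the choice of "similarity" (counting the products on which two customers differ) are all hypothetical.

```python
from collections import Counter

# Hypothetical data. Each customer has a purchase history
# (1 = bought, 0 = did not buy; columns: milk, bread, apples, yogurt)
# and a flag saying whether they would recommend bananas.
CUSTOMERS = {
    "Bob":   ([1, 1, 1, 0], True),
    "Carol": ([1, 1, 0, 0], True),
    "Dave":  ([1, 0, 1, 1], False),
    "Erin":  ([0, 0, 0, 1], False),
    "Frank": ([0, 1, 0, 1], False),
}

def distance(a, b):
    """Number of products on which two purchase histories differ."""
    return sum(x != y for x, y in zip(a, b))

def recommend(target, customers, k=3):
    """Majority vote among the k customers most similar to target."""
    nearest = sorted(customers.values(), key=lambda c: distance(target, c[0]))[:k]
    votes = Counter(would_recommend for _, would_recommend in nearest)
    return votes[True] > votes[False]

ALICE = [1, 1, 1, 0]
print(recommend(ALICE, CUSTOMERS))  # prints True
```

Here Alice's three nearest neighbors are Bob, Carol, and Dave, who vote 2 to 1 in favor, so bananas are recommended. Using an odd `k` avoids tied votes.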

Image: https://alliance.seas.upenn.edu/~cis520/dynamic/2016/wiki/index.php?n=Lectures.LocalLearning

slide-27
SLIDE 27

Data scientists…

  • Train computers: use data to “teach” a computer how to do a task
  • Predict: given a never-before-seen input, find the right output

slide-28
SLIDE 28

Exploration: identifying patterns
Prediction: making guesses
Inference: quantifying reliability

DEMO

slide-29
SLIDE 29

Data scientists…

  • Are skeptical: always ask, “could it be just random chance?”
  • Explain uncertainty: provide an answer plus an estimate of “how right” that answer is
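
One common way to ask “could it be just random chance?” is a permutation test; the sketch below is an illustration, not necessarily the method shown in the demo. It assumes two hypothetical groups of order sizes, shuffles the group labels many times, and reports how often a random relabeling produces a difference in means at least as large as the observed one.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, trials=10_000, seed=0):
    """Return (observed difference in means, fraction of random
    relabelings whose difference is at least as large)."""
    rng = random.Random(seed)  # fixed seed so the analysis is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return observed, extreme / trials

# Hypothetical items-per-order for customers who did and did not
# see a recommendation.
saw_recommendation = [12, 15, 14, 16, 13, 17, 15, 14]
no_recommendation = [11, 13, 12, 14, 12, 13, 11, 12]

observed, chance = permutation_test(saw_recommendation, no_recommendation)
print(f"observed difference: {observed:.2f} items per order")
print(f"share of random relabelings at least that large: {chance:.3f}")
```

A small share suggests the difference is unlikely to be chance alone; a large share means the data cannot rule chance out. Reporting that share alongside the observed difference is one way to give an answer together with an estimate of how far to trust it.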

slide-30
SLIDE 30

Exploration: identifying patterns
Prediction: making guesses
Inference: quantifying reliability

slide-31
SLIDE 31

Privacy

Photo: https://pixabay.com/illustrations/mobile-security-privacy-protected-3469818/

slide-32
SLIDE 32

Photo: http://faqhow.com/other/any/how-target-predicted-a-girls-pregnancy-before-her https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did

slide-33
SLIDE 33

Photo: https://www.wired.com/2010/03/netflix-cancels-contest

slide-34
SLIDE 34

Privacy concerns

  • Governments/businesses and individuals are sometimes at odds over how identity is used
  • Intrinsic privacy: the individual's right to be left alone
  • Informational privacy: the individual's right to determine for itself when, how, and to what extent information about it is communicated to others

slide-35
SLIDE 35

Ethical data scientists…

  • Seek consent
  • Select minimal identity
  • Limit storage
  • Avoid linking

http://www.cs.cornell.edu/fbs/publications/chptr.AuthPeople.pdf

slide-36
SLIDE 36

Venn diagram: Computer Science, Business, and Statistics, with Data Science at their intersection

Diagram inspired by http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

slide-37
SLIDE 37

Photo: https://news.cornell.edu/stories/2019/05/cornell-celebrate-151st-commencement

slide-38
SLIDE 38

Acknowledgments

CS 1380 Data Science For All, Cornell University: Michael Clarkson and Madeleine Udell
Data 8 Foundations of Data Science, University of California, Berkeley: Ani Adhikari and John DeNero

slide-39
SLIDE 39

A Taste of Data Science

Michael Clarkson NACS Executive Leadership Program at Cornell August 2019

Download these slides at https://tinyurl.com/TasteDataScience