Communicating Results of Data Analysis (Héctor Corrada Bravo)

SLIDE 1

Communicating Results of Data Analysis

Héctor Corrada Bravo

University of Maryland, College Park, USA
CMSC320: 2018-05-03

SLIDE 2

For Today

Data Analysis Deliverables:

  • Written analyses
  • R packages

SLIDE 3

https://leanpub.com/datastyle

Written analyses

  • 1. Title
  • 2. Introduction and motivation
  • 3. Description of dataset
  • 4. Description of statistical and machine learning models used (Methods)

  • 5. Results (including measures of uncertainty)
  • 6. Conclusions (including potential problems)
  • 7. References

SLIDE 4

Written analyses

Introduction and Motivation

Always lead with the question (task) you are addressing.

E.g.: "Can we use tweets about stocks to predict stock prices?"
Not: "Can we use the Random Forest algorithm to learn a classifier that predicts stock prices?"

E.g.: "What are good predictors of student performance?"
Not: "Can we use linear regression to predict student performance?"

SLIDE 5

Written analyses

Description of dataset

Size: entities and attributes.

Important: describe what you did to 1) obtain and 2) tidy the dataset.
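As a rough illustration of reporting dataset size, the sketch below counts entities (rows), attributes (columns), and missing cells per attribute. The helper name `describe_dataset`, the toy CSV, and the choice to treat empty strings and "NA" as missing are all assumptions made for this example, not part of the lecture.

```python
import csv
import io

def describe_dataset(csv_text):
    """Count entities (rows), attributes (columns), and missing cells per attribute."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    attributes = reader.fieldnames or []
    # Assumption: an empty string or the literal "NA" marks a missing value.
    missing = {a: sum(1 for r in rows if r[a] in ("", "NA")) for a in attributes}
    return {"entities": len(rows), "attributes": len(attributes), "missing": missing}

# Hypothetical toy data, made up for illustration only.
toy_csv = "student,hours,grade\nA,5,88\nB,,92\nC,3,NA\n"
summary = describe_dataset(toy_csv)
print(summary)
```

Numbers like these belong in the dataset description, next to an account of how the data was obtained and tidied.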

SLIDE 6

Written analyses

Description of data analysis methods

Be specific, use equations when appropriate:

$$W = a + bH + e$$

where $W$ is weight, $H$ is height and $e$ is an error term. When appropriate, mention distributional assumptions on $e$.
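To make the model $W = a + bH + e$ concrete, here is a minimal sketch that estimates $a$ and $b$ by ordinary least squares and returns the residuals (the estimated $e$). The function name, the toy heights and weights, and the units (cm, kg) are assumptions for illustration; the slides do not prescribe a fitting method.

```python
def fit_simple_linear_model(heights, weights):
    """Ordinary least squares fit of W = a + b*H + e for a single predictor H."""
    n = len(heights)
    mean_h = sum(heights) / n
    mean_w = sum(weights) / n
    sxx = sum((h - mean_h) ** 2 for h in heights)
    sxy = sum((h - mean_h) * (w - mean_w) for h, w in zip(heights, weights))
    b = sxy / sxx              # slope: change in W per unit of H
    a = mean_w - b * mean_h    # intercept
    residuals = [w - (a + b * h) for h, w in zip(heights, weights)]
    return a, b, residuals

# Hypothetical toy data (heights in cm, weights in kg), made up for illustration.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 87]
a, b, residuals = fit_simple_linear_model(heights, weights)
print(f"intercept a = {a:.2f} kg, slope b = {b:.2f} kg per cm")
```

Inspecting the residuals (for example, with a histogram) is one way to check whatever distributional assumption you state for $e$.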

SLIDE 7

Written analyses

Description of data analysis methods

When using ML methods, describe:

  • preprocessing (e.g., feature selection, transformations)
  • algorithm choice (why is it appropriate)
  • model selection and assessment (e.g., which classification metric and why)
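The "which classification metric and why" point can be illustrated with a small sketch. On the made-up, imbalanced labels below, a hypothetical classifier that always predicts the negative class looks good on accuracy but has zero recall. The function name and data are assumptions for this example only.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Convention used here: report 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}

# Hypothetical imbalanced labels: 2 positives out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * len(y_true)
metrics = classification_metrics(y_true, always_negative)
print(metrics)
```

If missing a positive case is costly, as it may be in a diagnostic setting, this is the kind of argument for reporting recall rather than accuracy alone.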

SLIDE 8

Written analyses

Results

  • Report estimates in the appropriate units
  • Report estimates with uncertainty

We saw confidence intervals in previous lectures, with specific advice regarding their presentation. (Note: this also applies to prediction metrics.)
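Since uncertainty also applies to prediction metrics, one possible way to attach an interval to an accuracy estimate is a percentile bootstrap, sketched below. This is a swapped-in illustration, not necessarily the interval construction used in the earlier lectures; the function name, the seed, and the toy labels are assumptions.

```python
import random

def bootstrap_accuracy_ci(y_true, y_pred, n_resamples=2000, level=0.95, seed=0):
    """Point estimate and percentile-bootstrap interval for classification accuracy."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y_true)
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]
    point = sum(correct) / n
    stats = []
    for _ in range(n_resamples):
        # Resample the per-example outcomes with replacement.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(sample) / n)
    stats.sort()
    alpha = 1 - level
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return point, lower, upper

# Hypothetical predictions: 16 of 20 correct.
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 8 + [1] * 2
point, lower, upper = bootstrap_accuracy_ci(y_true, y_pred)
print(f"accuracy = {point:.2f} (95% bootstrap interval: {lower:.2f} to {upper:.2f})")
```

Reporting the interval next to the point estimate, in the metric's own units, follows the two bullets above.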

SLIDE 9

Written analyses

Results

Important: summarize the importance of each estimate (i.e., refer back to the question you originally posed in the introduction).

Why does this estimate address your question?

SLIDE 10

Written analyses

End matter

  • Include potential problems with the analysis you carried out.
  • Include references to the analysis methods used.

SLIDE 11

Graphics

Karl Broman's presentation on effective graphics: http://tinyurl.com/graphs2017

SLIDE 12

Graphics

A few other notes on style:

  • Make titles legible
  • Annotate in the plot if possible (see the example data analysis early in the semester)
  • Include units in axis titles when appropriate (e.g., units are not appropriate in a PC scatterplot)

SLIDE 13

R packages

Case study: suppose you used data to create a classifier for diagnostic purposes. How do you share it?

R packages are a reproducible, high-visibility way of publishing these types of results:

  • Consistent organization
  • Standardized deployment

SLIDE 14

R packages

Case study: suppose you used data to create a classifier for diagnostic purposes. How do you share it?

R packages are a reproducible, high-visibility way of publishing these types of results.

  • Hadley's presentation on R packages: http://www.slideshare.net/hadley/r-packages
  • The book