Data wrangling with Tableau and Excel October 11 2016 JRNL 520H


slide-1
SLIDE 1

Data wrangling with Tableau and Excel

October 11 2016 JRNL 520H

slide-2
SLIDE 2

What is data wrangling?

Data wrangling is the process of preparing raw data for use in a data analysis or visualization software.

slide-5
SLIDE 5

What are the causes of dirty data?

  • Data entry error
  • Incompatible tables
  • Incompatible table format
slide-11
SLIDE 11

What should we look out for when cleaning data?

  • Table formatting
  • Variable type
  • Invalid character values
  • Invalid numeric values
  • Grouping data
  • Missing values
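
A few of the checks above can be sketched in plain Python. This is only an illustration: the rows, the column names (`state`, `population`) and the rules (two capital letters, no negative counts) are assumptions made up for this example, not part of the class exercise.

```python
# Hypothetical rows, as they might arrive after a careless export.
rows = [
    {"state": "MD", "population": "6016447"},
    {"state": "md ", "population": "6,016,447"},
    {"state": "M4", "population": "-5"},
    {"state": "", "population": "n/a"},
]

def check_row(row):
    """Return a list of problems found in one row (empty list means clean)."""
    problems = []
    state = row["state"]
    if state.strip() == "":
        # Missing values
        problems.append("missing state")
    elif not (state.isalpha() and state.isupper() and len(state) == 2):
        # Invalid character values (assumed rule: exactly two capital letters)
        problems.append("invalid character value in state")
    raw = row["population"]
    try:
        # Variable type: the column should hold whole numbers, not text
        value = int(raw)
    except ValueError:
        problems.append("population is not stored as a whole number")
    else:
        # Invalid numeric values (assumed rule: counts cannot be negative)
        if value < 0:
            problems.append("invalid numeric value in population")
    return problems

report = [check_row(r) for r in rows]
for row, problems in zip(rows, report):
    print(row, "->", problems or "ok")
```

Only the first row passes; each of the others trips at least one check.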
slide-12
SLIDE 12

Ideal format of data in Tableau

1. Start your data in cell A1. Remove all introductory information and footnotes.
2. Have the first row be the column headers/variable names.
3. Have every subsequent row be one observation. No cross-tabulation!
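
The three rules can be sketched in plain Python. The sheet contents, the `tidy` helper and its parameters are hypothetical, chosen only to show a cross-tabulated sheet becoming one observation per row.

```python
# Hypothetical spreadsheet contents, one inner list per row.
raw_sheet = [
    ["Enrollment by campus (hypothetical data)", "", ""],
    ["", "", ""],
    ["Campus", "2014", "2015"],
    ["North", "120", "135"],
    ["South", "80", "95"],
    ["Source: made-up figures", "", ""],
]

def tidy(sheet, header_label, footnote_prefix):
    """Drop intro rows and footnotes, then turn the cross-tab into long rows."""
    # Rule 1: skip everything above the header row.
    start = next(i for i, row in enumerate(sheet) if row[0] == header_label)
    header, body = sheet[start], sheet[start + 1:]
    # Rule 1 again: drop footnote rows below the data.
    body = [row for row in body if not row[0].startswith(footnote_prefix)]
    # Rules 2 and 3: one header row, then one observation per row.
    records = [[header[0], "Year", "Enrollment"]]
    for row in body:
        for year, value in zip(header[1:], row[1:]):
            records.append([row[0], year, int(value)])
    return records

tidy_rows = tidy(raw_sheet, "Campus", "Source:")
for record in tidy_rows:
    print(record)
```

Each campus/year pair ends up on its own row, which is the layout Tableau expects.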

slide-13
SLIDE 13

Ideal format of data in Tableau

[Figure: a table before and after reformatting]

slide-16
SLIDE 16

Data Interpreter

Tableau’s Data Interpreter feature detects sub-tables and removes some of the extraneous information, such as titles and footnotes, to help prepare your data source for analysis. Note: the Data Interpreter only works with Microsoft Excel files, not CSV or other file types.

Complete Tableau exercise

slide-17
SLIDE 17

Joins

A JOIN is a means for combining columns from one or more tables by using values common to each. There are four main join types: inner, left, right and full outer.
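
The four join types can be illustrated with a small Python sketch. The two tables (`staff` and `phones`) and the `join` helper are made up for this example, and the sketch assumes each key appears at most once per table; Tableau performs the equivalent operation on the data source page.

```python
# Hypothetical tables that share a common key (a reporter's name).
staff = {"ana": "Metro", "ben": "Sports", "cy": "Politics"}   # name -> desk
phones = {"ben": "x101", "cy": "x102", "dee": "x103"}         # name -> extension

def join(left, right, how):
    """Combine two key->value tables into (key, left_value, right_value) rows."""
    if how == "inner":
        keys = [k for k in left if k in right]       # keys found in both tables
    elif how == "left":
        keys = list(left)                            # every key from the left table
    elif how == "right":
        keys = list(right)                           # every key from the right table
    elif how == "full outer":
        keys = list(left) + [k for k in right if k not in left]  # every key from either
    else:
        raise ValueError(f"unknown join type: {how}")
    # A key missing from one table yields None on that side.
    return [(k, left.get(k), right.get(k)) for k in keys]

for how in ("inner", "left", "right", "full outer"):
    print(how, join(staff, phones, how))
```

The inner join keeps only the names present in both tables; the other three keep unmatched rows and fill the missing side with `None`, much as Tableau shows nulls.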

slide-21
SLIDE 21

Joins

Complete Tableau exercise

slide-22
SLIDE 22

Wrangling in Excel

Sometimes the Data Interpreter in Tableau isn’t able to detect all of the errors in the dataset. In cases like this, you will need to manually clean the data in Excel.

Complete Tableau exercise

slide-24
SLIDE 24

Pivot

Complete Tableau exercise

[Figure: tabular format compared with columnar format]