Exploratory Data Analysis
Paul Cohen ISTA 370 Spring, 2012
Paul Cohen ISTA 370 () Exploratory Data Analysis Spring, 2012 1 / 46
Outline
Data, revisited
The purpose of exploratory data analysis
Learning to see
Data: A Review
Exploratory Data Analysis
Learning to See
[Figure: histogram of iris$Petal.Length; y-axis: Frequency]
[Figure: density plot of ipl; y-axis: Density]
[Figure: Petal.Length by Species; visible species labels: versicolor, virginica]
[Figure: histogram of width; y-axis: Frequency]
[Figure: histogram of Test0; y-axis: Frequency]
[Figure: histogram of Train0Squared; y-axis: Frequency]
[Figure: histogram of Train0; y-axis: Frequency]
[Figure: kernel density plot, density.default(x = Train0Squared); N = 187, Bandwidth = 0.04378; y-axis: Density]
[Figure: Train0Squared plotted against NewSkills0]
[Figure: NewSkills0 plotted against Train0Squared]
[Figure, repeated: kernel density plot, density.default(x = Train0Squared); N = 187, Bandwidth = 0.04378; y-axis: Density]
[Figure: proportion of training items correct, by precocious; visible level: TRUE]
[Figure: kinect$lhand.y plotted against Time]
[Figure: histogram titled "Weight of 33 College Students"; y-axis: Frequency]
[Figure: plot by Condition (levels 1 to 11)]
[Figure: taheri$A plotted against Index]
[Figure: mpg plotted against disp]
Tips for looking at data
[Figure: histogram titled "Weight of 33 College Students"; y-axis: Frequency]
[Figure: kinect$lhand.y plotted against Time]
[Figure: plot by Condition (levels 1 to 11)]
[Figure: NewSkills0 plotted against Train0Squared]
[Figure: histogram of iris$Petal.Length; y-axis: Frequency]
[Figure: unemployment plotted against year, 1990 to 2010]