Visualizing regressions: Introduction to Data Visualization in Python



slide-1
SLIDE 1

Visualizing regressions

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

slide-2
SLIDE 2

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Seaborn

http://seaborn.pydata.org/

slide-3
SLIDE 3

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Recap: pandas DataFrames

Labelled tabular data structure
Labels on rows: index
Labels on columns: columns
Columns are pandas Series
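A minimal sketch of the recap above, using three columns from the first five rows of the tips data listed on the next slide. The name `tips_head` is made up for this illustration and does not appear in the slides.

```python
import pandas as pd

# First five rows of the tips data, as listed on the "Tips DataFrame" slide
# (only three of the seven columns are reproduced here).
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

row_labels = list(tips_head.index)       # labels on rows: the index
column_labels = list(tips_head.columns)  # labels on columns: the columns
tip_column = tips_head["tip"]            # selecting one column yields a Series

print(row_labels)
print(column_labels)
print(type(tip_column).__name__)
```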

slide-4
SLIDE 4

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Tips DataFrame

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
..        ...   ...     ...    ...  ...     ...   ...

slide-5
SLIDE 5

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Linear regression plots

95% confidence interval highlighted

slide-6
SLIDE 6

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using lmplot()

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset('tips')
sns.lmplot(x='total_bill', y='tip', data=tips)
plt.show()
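By default lmplot() draws an ordinary least-squares line through the points and shades a bootstrapped 95% confidence band around it. The following is a rough sketch of the line fit only (not the band), using numpy rather than seaborn and just the five rows shown on the "Tips DataFrame" slide, so the numbers will differ from a fit on the full dataset.

```python
import numpy as np

# Five (total_bill, tip) pairs from the rows shown on the "Tips DataFrame" slide.
total_bill = np.array([16.99, 10.34, 21.01, 23.68, 24.59])
tip = np.array([1.01, 1.66, 3.50, 3.31, 3.61])

# Degree-1 least-squares fit: tip is approximated by slope * total_bill + intercept.
slope, intercept = np.polyfit(total_bill, tip, deg=1)
fitted = slope * total_bill + intercept

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```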

slide-7
SLIDE 7

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Factors

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
..        ...   ...     ...    ...  ...     ...   ...

slide-8
SLIDE 8

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Grouping factors (same plot)

slide-9
SLIDE 9

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using hue

sns.lmplot(x='total_bill', y='tip', data=tips, hue='sex', palette='Set1')
plt.show()

slide-10
SLIDE 10

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Grouping factors (subplots)

slide-11
SLIDE 11

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using col

sns.lmplot(x='total_bill', y='tip', data=tips, col='sex')
plt.show()
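Both hue and col split the data by a factor and fit one regression per group; they differ only in whether the groups share one set of axes or get a subplot each. A hedged sketch of that per-group fit with pandas and numpy follows. The names `tips_head` and `fits` are made up here, and with only the five rows from the slides the Female group has two points, so its "fit" is simply the line through them.

```python
import numpy as np
import pandas as pd

# Rows from the "Tips DataFrame" slide; 'sex' is the grouping factor.
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
})

# One least-squares line per level of the factor.
fits = {}
for sex, group in tips_head.groupby("sex"):
    slope, intercept = np.polyfit(
        group["total_bill"].to_numpy(), group["tip"].to_numpy(), deg=1
    )
    fits[sex] = (slope, intercept)

for sex, (slope, intercept) in fits.items():
    print(f"{sex}: slope={slope:.4f}, intercept={intercept:.4f}")
```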

slide-12
SLIDE 12

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Residual plots

slide-13
SLIDE 13

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using residplot()

sns.residplot(x='total_bill', y='tip', data=tips, color='indianred')
plt.show()

Similar arguments as lmplot() but more flexible

x, y can be arrays or strings
data is DataFrame (optional)

Optional arguments (e.g., color) as in matplotlib
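residplot() regresses y on x and plots what is left over. As a sketch of the quantity on its vertical axis, the residuals can be computed by hand with numpy; this uses plain arrays and the five rows from the "Tips DataFrame" slide, so it only illustrates the idea.

```python
import numpy as np

# Five (total_bill, tip) pairs from the rows shown on the "Tips DataFrame" slide.
total_bill = np.array([16.99, 10.34, 21.01, 23.68, 24.59])
tip = np.array([1.01, 1.66, 3.50, 3.31, 3.61])

# Fit the line, then subtract its predictions from the observed tips.
slope, intercept = np.polyfit(total_bill, tip, deg=1)
residuals = tip - (slope * total_bill + intercept)

print(np.round(residuals, 3))
```

For a least-squares line with an intercept the residuals sum to zero, which is why a residual plot scatters around the horizontal zero line; visible structure in that scatter hints that a straight line is a poor model.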

slide-14
SLIDE 14

Let's practice!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

slide-15
SLIDE 15

Visualizing univariate distributions

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

slide-16
SLIDE 16

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Visualizing data

Univariate → "one variable"
Visualization techniques for sampled univariate data
Strip plots
Swarm plots
Violin plots

slide-17
SLIDE 17

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Strip plot

slide-18
SLIDE 18

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using stripplot()

sns.stripplot(y='tip', data=tips)
plt.ylabel('tip ($)')
plt.show()

slide-19
SLIDE 19

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Grouping with stripplot()

sns.stripplot(x='day', y='tip', data=tips)
plt.ylabel('tip ($)')
plt.show()

slide-20
SLIDE 20

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Spreading out strip plots

sns.stripplot(x='day', y='tip', data=tips, size=4, jitter=True)
plt.ylabel('tip ($)')
plt.show()
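Jitter spreads overlapping points by nudging each one sideways by a small random amount while leaving its value untouched. The sketch below imitates that with numpy; the uniform noise and the half-width of 0.1 are assumptions made for illustration, not values taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tips from the five rows on the "Tips DataFrame" slide; all fall on 'Sun'.
tip = np.array([1.01, 1.66, 3.50, 3.31, 3.61])

category_position = 0  # horizontal slot of the 'Sun' category
jitter_width = 0.1     # assumed half-width of the sideways noise

# Each point keeps its tip value and gets a small random horizontal offset.
x_positions = category_position + rng.uniform(
    -jitter_width, jitter_width, size=tip.size
)

print(np.round(x_positions, 3))
```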

slide-21
SLIDE 21

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Swarm plot

slide-22
SLIDE 22

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using swarmplot()

sns.swarmplot(x='day', y='tip', data=tips)
plt.ylabel('tip ($)')
plt.show()

slide-23
SLIDE 23

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

More grouping

slide-24
SLIDE 24

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

More grouping with swarmplot()

sns.swarmplot(x='day', y='tip', data=tips, hue='sex')
plt.ylabel('tip ($)')
plt.show()

slide-25
SLIDE 25

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Changing orientation

slide-26
SLIDE 26

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Changing orientation

sns.swarmplot(x='tip', y='day', data=tips, hue='sex',
              orient='h')
plt.xlabel('tip ($)')
plt.show()

slide-27
SLIDE 27

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Violin plot

slide-28
SLIDE 28

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using violinplot()

plt.subplot(1, 2, 1)
sns.boxplot(x='day', y='tip', data=tips)
plt.ylabel('tip ($)')

plt.subplot(1, 2, 2)
sns.violinplot(x='day', y='tip', data=tips)
plt.ylabel('tip ($)')

plt.tight_layout()
plt.show()

slide-29
SLIDE 29

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Combining plots

slide-30
SLIDE 30

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Combining plots

sns.violinplot(x='day', y='tip', data=tips, inner=None, color='lightgray')
sns.stripplot(x='day', y='tip', data=tips, size=4, jitter=True)
plt.ylabel('tip ($)')
plt.show()

slide-31
SLIDE 31

Let's practice!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

slide-32
SLIDE 32

Visualizing multivariate distributions

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

slide-33
SLIDE 33

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Visualizing data

Bivariate → "two variables"
Multivariate → "multiple variables"
Visualizing relationships in multivariate data
Joint plots
Pair plots
Heat maps

slide-34
SLIDE 34

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Joint plot

slide-35
SLIDE 35

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using jointplot()

sns.jointplot(x='total_bill', y='tip', data=tips)
plt.show()

slide-36
SLIDE 36

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Joint plot using KDE

slide-37
SLIDE 37

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using kind='kde'

sns.jointplot(x='total_bill', y='tip', data=tips, kind='kde')
plt.show()
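KDE stands for kernel density estimate: a smooth curve built by placing a small bump on every observation and averaging the bumps. The joint plot does this in two dimensions; the sketch below shows the one-dimensional version with Gaussian bumps in plain numpy. The function name `gaussian_kde_1d` and the bandwidth of 0.5 are hypothetical choices for this illustration; seaborn picks its own bandwidth.

```python
import numpy as np

# Tips from the five rows shown on the "Tips DataFrame" slide.
tip = np.array([1.01, 1.66, 3.50, 3.31, 3.61])
bandwidth = 0.5  # hypothetical smoothing width, chosen by hand


def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian bumps centred on each sample, evaluated on grid."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)


# Evaluate well beyond the data so the tails are included.
grid = np.linspace(tip.min() - 3.0, tip.max() + 3.0, 601)
density = gaussian_kde_1d(tip, grid, bandwidth)

# A density should integrate to about 1 (simple Riemann sum here).
area = float(np.sum(density) * (grid[1] - grid[0]))
print(f"area under the estimate: {area:.3f}")
```

The same kind of estimate, mirrored around a vertical axis, gives the outline of a violin plot.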

slide-38
SLIDE 38

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Pair plot

slide-39
SLIDE 39

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using pairplot()

sns.pairplot(tips)
plt.show()

slide-40
SLIDE 40

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using pairplot() with hue

sns.pairplot(tips, hue='sex')
plt.show()
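By default pairplot() uses only the numeric columns: each ordered pair gets a scatter panel and each column gets a univariate panel on the diagonal. A small sketch of which panels that implies for the tips columns, using pandas and the rows from the "Tips DataFrame" slide (the names `tips_head`, `numeric_columns` and `off_diagonal_panels` are made up here):

```python
from itertools import permutations

import pandas as pd

# Rows from the "Tips DataFrame" slide: three numeric columns, one factor.
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "sex": ["Female", "Male", "Male", "Male", "Female"],
    "size": [2, 3, 3, 2, 4],
})

# Non-numeric columns such as 'sex' are left off the grid
# (they can still be used for colouring via hue).
numeric_columns = list(tips_head.select_dtypes(include="number").columns)

# Every ordered pair of distinct numeric columns is one off-diagonal panel.
off_diagonal_panels = list(permutations(numeric_columns, 2))

print(numeric_columns)
print(len(off_diagonal_panels), "scatter panels,",
      len(numeric_columns), "diagonal panels")
```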

slide-41
SLIDE 41

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Covariance heat map of tips data

slide-42
SLIDE 42

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Using heatmap()

print(covariance)

            total_bill       tip      size
total_bill    1.000000  0.675734  0.598315
tip           0.675734  1.000000  0.489299
size          0.598315  0.489299  1.000000

sns.heatmap(covariance)
plt.title('Covariance plot')
plt.show()
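The slides do not show how `covariance` was computed. The printed matrix has ones on its diagonal, which is the signature of a correlation matrix; a covariance matrix carries each column's variance on its diagonal instead. The sketch below contrasts the two with pandas on the five rows from the "Tips DataFrame" slide, so its values will not match the full-data matrix printed above. The name `numeric` is made up for this illustration.

```python
import pandas as pd

# Numeric columns from the five rows shown on the "Tips DataFrame" slide.
numeric = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
    "size": [2, 3, 3, 2, 4],
})

covariance = numeric.cov()    # diagonal holds each column's variance
correlation = numeric.corr()  # diagonal is 1; off-diagonals lie in [-1, 1]

print(covariance.round(3))
print(correlation.round(3))
```

Either matrix can be handed to sns.heatmap(); the correlation version is usually easier to read because every cell is on the same scale.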

slide-43
SLIDE 43

Let's practice!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON