Making a scatter plot: Introduction to Data Science in Python



slide-1
SLIDE 1

Making a scatter plot

INTRODUCTION TO DATA SCIENCE IN PYTHON

Hillary Green-Lerman

Lead Data Scientist, Looker

slide-2
SLIDE 2

INTRODUCTION TO DATA SCIENCE IN PYTHON

Mapping Cell Phone Signals

slide-3
SLIDE 3

INTRODUCTION TO DATA SCIENCE IN PYTHON

What is a scatter plot?

slide-5
SLIDE 5

INTRODUCTION TO DATA SCIENCE IN PYTHON

Creating a scatter plot

plt.scatter(df.age, df.height)
plt.xlabel('Age (in months)')
plt.ylabel('Height (in inches)')
plt.show()
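A minimal runnable sketch of the same calls. The DataFrame contents are invented for illustration; only the column names `age` and `height` come from the slide.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical measurements, made up for this example
df = pd.DataFrame({
    "age": [6, 12, 18, 24, 30, 36],      # months
    "height": [26, 30, 32, 34, 36, 38],  # inches
})

# One marker per row: age on the x-axis, height on the y-axis
points = plt.scatter(df.age, df.height)
plt.xlabel('Age (in months)')
plt.ylabel('Height (in inches)')
plt.show()
```

Each row of the table becomes one point, so six rows give six markers.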

slide-6
SLIDE 6

INTRODUCTION TO DATA SCIENCE IN PYTHON

Keyword arguments

plt.scatter(df.age, df.height, color='green', marker='s')
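As a sketch of how positional and keyword arguments combine here: the first two arguments are matched by position (x values, then y values), while `color` and `marker` are matched by name and can appear in either order. The data values are assumed for illustration.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical data; only the column names come from the slide
df = pd.DataFrame({
    "age": [6, 12, 18, 24],
    "height": [26, 30, 32, 34],
})

# Positional arguments: x, then y. Keyword arguments: green square markers.
points = plt.scatter(df.age, df.height, marker='s', color='green')
plt.show()
```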

slide-7
SLIDE 7

INTRODUCTION TO DATA SCIENCE IN PYTHON

Changing marker transparency

plt.scatter(df.x_data, df.y_data, alpha=0.1)
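Transparency tends to matter most when many points overlap. The sketch below assumes randomly generated data (not from the course) so that the effect is visible: dense regions appear darker because several faint markers stack on top of each other.

```python
import numpy as np
from matplotlib import pyplot as plt

# Hypothetical, randomly generated points that overlap heavily near the center
rng = np.random.default_rng(0)
x_data = rng.normal(size=5000)
y_data = rng.normal(size=5000)

# alpha ranges from 0 (invisible) to 1 (opaque)
points = plt.scatter(x_data, y_data, alpha=0.1)
plt.show()
```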

slide-8
SLIDE 8

Let's practice

INTRODUCTION TO DATA SCIENCE IN PYTHON

slide-9
SLIDE 9

Making a bar chart

INTRODUCTION TO DATA SCIENCE IN PYTHON

slide-10
SLIDE 10

INTRODUCTION TO DATA SCIENCE IN PYTHON

Comparing pet crimes

precinct   pets_abducted
Farmburg   10
Cityville  15
Suburbia   9

plt.bar(df.precinct, df.pets_abducted)
plt.ylabel('Pet Abductions')
plt.show()
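A self-contained version of this chart, building the DataFrame from the table on this slide. Constructing it with `pd.DataFrame` is an assumption; the course may load it from a file instead.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Values copied from the table on this slide
df = pd.DataFrame({
    "precinct": ["Farmburg", "Cityville", "Suburbia"],
    "pets_abducted": [10, 15, 9],
})

# First argument: category labels; second argument: bar heights
bars = plt.bar(df.precinct, df.pets_abducted)
plt.ylabel('Pet Abductions')
plt.show()
```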

slide-11
SLIDE 11

INTRODUCTION TO DATA SCIENCE IN PYTHON

Horizontal bar charts

plt.barh(df.precinct, df.pets_abducted)
plt.xlabel('Pet Abductions')
plt.show()
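A sketch of the horizontal variant using the same table. With `plt.barh` the counts run along the x-axis, so the count label is set with `plt.xlabel` in this example.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Values copied from the pet abduction table
df = pd.DataFrame({
    "precinct": ["Farmburg", "Cityville", "Suburbia"],
    "pets_abducted": [10, 15, 9],
})

# Categories go on the y-axis; bar widths encode the counts
bars = plt.barh(df.precinct, df.pets_abducted)
plt.xlabel('Pet Abductions')
plt.show()
```

Horizontal bars are often easier to read when category names are long.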

slide-12
SLIDE 12

INTRODUCTION TO DATA SCIENCE IN PYTHON

Adding error bars

plt.bar(df.precinct, df.pets_abducted, yerr=df.error)
plt.ylabel('Pet Abductions')
plt.show()
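A runnable sketch with error bars. The `error` values are invented for illustration, since the slide does not list them; the counts come from the pet abduction table.

```python
import pandas as pd
from matplotlib import pyplot as plt

df = pd.DataFrame({
    "precinct": ["Farmburg", "Cityville", "Suburbia"],
    "pets_abducted": [10, 15, 9],
    "error": [2, 3, 1],  # hypothetical uncertainty for each count
})

# yerr draws a vertical line of +/- error at the top of each bar
bars = plt.bar(df.precinct, df.pets_abducted, yerr=df.error)
plt.ylabel('Pet Abductions')
plt.show()
```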

slide-16
SLIDE 16

INTRODUCTION TO DATA SCIENCE IN PYTHON

Stacked bar charts

plt.bar(df.precinct, df.dog, label='Dog')
plt.bar(df.precinct, df.cat, bottom=df.dog, label='Cat')
plt.legend()
plt.show()
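To make the role of `bottom` concrete, here is a sketch with an invented dog/cat split (the slides do not give these numbers). The cat bars start where the dog bars end, so the top of each stack is the combined count.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical split of abductions by animal, for illustration only
df = pd.DataFrame({
    "precinct": ["Farmburg", "Cityville", "Suburbia"],
    "dog": [6, 9, 4],
    "cat": [4, 6, 5],
})

# Dog bars sit on the axis; cat bars are lifted by the dog counts
dog_bars = plt.bar(df.precinct, df.dog, label='Dog')
cat_bars = plt.bar(df.precinct, df.cat, bottom=df.dog, label='Cat')
plt.legend()
plt.show()
```

Without `bottom=df.dog`, the second call would draw the cat bars over the dog bars instead of above them.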

slide-17
SLIDE 17

Let's practice

INTRODUCTION TO DATA SCIENCE IN PYTHON

slide-18
SLIDE 18

Making a histogram

INTRODUCTION TO DATA SCIENCE IN PYTHON

slide-19
SLIDE 19

INTRODUCTION TO DATA SCIENCE IN PYTHON

Tracking down the kidnapper

slide-20
SLIDE 20

INTRODUCTION TO DATA SCIENCE IN PYTHON

What is a histogram?

slide-21
SLIDE 21

INTRODUCTION TO DATA SCIENCE IN PYTHON

Histograms with matplotlib

plt.hist(gravel.mass)
plt.show()

slide-22
SLIDE 22

INTRODUCTION TO DATA SCIENCE IN PYTHON

Changing bins

plt.hist(data, bins=nbins)
plt.hist(gravel.mass, bins=40)
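A sketch of what `bins` controls, assuming randomly generated gravel masses (the real `gravel` data is not shown in the slides). `plt.hist` returns the counts and the bin edges, which makes the effect easy to check.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical sample masses, generated for illustration
rng = np.random.default_rng(0)
gravel = pd.DataFrame({"mass": rng.normal(loc=75, scale=10, size=1000)})

# 40 bins means 40 bars and therefore 41 bin edges
counts, edges, patches = plt.hist(gravel.mass, bins=40)
plt.show()

print(len(counts), len(edges))  # 40 41
```

More bins show finer detail but make each bar noisier; fewer bins smooth the shape.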

slide-23
SLIDE 23

INTRODUCTION TO DATA SCIENCE IN PYTHON

Changing range

plt.hist(data, range=(xmin, xmax))
plt.hist(gravel.mass, range=(50, 100))
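A sketch of the `range` argument, again on invented gravel masses. Values outside the range are left out of the histogram, and the bin edges span exactly the requested interval.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Hypothetical sample masses, generated for illustration
rng = np.random.default_rng(0)
gravel = pd.DataFrame({"mass": rng.normal(loc=75, scale=10, size=1000)})

# Only masses between 50 and 100 are counted
counts, edges, patches = plt.hist(gravel.mass, range=(50, 100))
plt.show()

# Count the in-range values directly to compare with the histogram total
in_range = int(((gravel.mass >= 50) & (gravel.mass <= 100)).sum())
print(int(counts.sum()) == in_range)
```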

slide-24
SLIDE 24

INTRODUCTION TO DATA SCIENCE IN PYTHON

Normalizing

Unnormalized histogram (bar heights are raw counts)

plt.hist(male_weight)
plt.hist(female_weight)

Normalized histogram (sum of bar areas = 1)

plt.hist(male_weight, density=True)
plt.hist(female_weight, density=True)
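The claim that the bar areas sum to 1 can be checked numerically. This sketch assumes two randomly generated samples of different sizes; `alpha` and `label` are additions so both histograms stay visible and are not part of the slide.

```python
import numpy as np
from matplotlib import pyplot as plt

# Hypothetical weights; the groups deliberately differ in sample size
rng = np.random.default_rng(0)
male_weight = rng.normal(loc=80, scale=10, size=2000)
female_weight = rng.normal(loc=65, scale=8, size=500)

# density=True rescales bar heights so each histogram has total area 1
m_density, m_edges, _ = plt.hist(male_weight, density=True, alpha=0.5, label='Male')
f_density, f_edges, _ = plt.hist(female_weight, density=True, alpha=0.5, label='Female')
plt.legend()
plt.show()

# Area of each histogram = sum of (bar height * bar width)
m_area = float(np.sum(m_density * np.diff(m_edges)))
f_area = float(np.sum(f_density * np.diff(f_edges)))
print(round(m_area, 6), round(f_area, 6))  # 1.0 1.0
```

Normalizing makes the two shapes comparable even though one group has four times as many observations.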

slide-25
SLIDE 25

Let's practice

INTRODUCTION TO DATA SCIENCE IN PYTHON

slide-26
SLIDE 26

Recap of the rescue

INTRODUCTION TO DATA SCIENCE IN PYTHON

slide-27
SLIDE 27

INTRODUCTION TO DATA SCIENCE IN PYTHON

You did it!

slide-28
SLIDE 28

INTRODUCTION TO DATA SCIENCE IN PYTHON

Modules and variables

Modules group functions together
Add a module using import
import happens at the beginning of a script file
Variables store data: strings or floats

import pandas as pd
import numpy as np

slide-29
SLIDE 29

INTRODUCTION TO DATA SCIENCE IN PYTHON

Using functions

Perform a task
Positional arguments
Keyword arguments

slide-30
SLIDE 30

INTRODUCTION TO DATA SCIENCE IN PYTHON

Working with tabular data

import pandas as pd

DataFrames store tabular data
Inspect data using .head() or .info()
Select rows using logic

credit_reports[credit_reports.suspect == 'Freddy Frequentist']
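A sketch of row selection with a logical condition. The table contents are invented; only the DataFrame name `credit_reports`, the column `suspect`, and the value `'Freddy Frequentist'` come from the slide.

```python
import pandas as pd

# Hypothetical records; the location column and other suspects are made up
credit_reports = pd.DataFrame({
    "suspect": ["Freddy Frequentist", "Suspect B", "Freddy Frequentist", "Suspect C"],
    "location": ["Pet Store", "Gas Station", "Clothing Club", "Groceries R Us"],
})

# The comparison yields one True/False value per row
is_freddy = credit_reports.suspect == 'Freddy Frequentist'

# Indexing with that boolean Series keeps only the True rows
freddy_rows = credit_reports[is_freddy]
print(freddy_rows.head())
```

Note that the same DataFrame name must appear both outside and inside the brackets.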

slide-31
SLIDE 31

INTRODUCTION TO DATA SCIENCE IN PYTHON

Creating line plots

from matplotlib import pyplot as plt

Use plt.plot() to create a line plot
Modify line plots with keyword arguments
Add labels and legends
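The three recap points can be sketched together as follows. The data and label text are assumptions for illustration; `color`, `linestyle`, and `label` are standard keyword arguments of `plt.plot`.

```python
from matplotlib import pyplot as plt

# Hypothetical series, made up for this example
months = [1, 2, 3, 4]
values = [10, 12, 9, 14]

# Keyword arguments change the line's appearance and its legend entry
lines = plt.plot(months, values, color='green', linestyle='--',
                 label='Hypothetical series')

# Axis labels and a legend explain what the plot shows
plt.xlabel('Month')
plt.ylabel('Value')
plt.legend()
plt.show()
```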

slide-32
SLIDE 32

INTRODUCTION TO DATA SCIENCE IN PYTHON

More plot types

plt.scatter() shows individual data points
plt.bar() creates bar charts
plt.hist() visualizes distributions

slide-33
SLIDE 33

Great job!

INTRODUCTION TO DATA SCIENCE IN PYTHON