Visualizing time series: Introduction to Data Visualization in Python



SLIDE 1

Visualizing time series

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

SLIDE 2

Datetimes & time series

type(weather), type(weather.index)
(pandas.core.frame.DataFrame, pandas.tseries.index.DatetimeIndex)

Date                     Temperature  DewPoint  Pressure
2010-01-01 00:00:00 AM   46.2         37.5      1
2010-01-01 01:00:00 AM   44.6         37.1      1
2010-01-01 02:00:00 AM   44.1         36.9      1
...                      ...          ...       ...

SLIDE 3

Plotting DataFrames

plt.plot(weather)
plt.show()

SLIDE 4

Time series

pandas time series: datetime as index
Datetime: represents periods or time-stamps
Datetime index: specialized slicing

weather.loc['2010-07-04']
weather.loc['2010-03':'2010-04']
weather.loc['2010-05']

etc.
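As a rough illustration of the date-string slicing on this slide, the sketch below builds a synthetic hourly DataFrame (an assumption standing in for the course's weather data, which is not included here) and selects a day, a two-month range, and a single month. It uses .loc, which current pandas versions require when selecting DataFrame rows by a single date string.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the weather data: one reading per hour of 2010.
index = pd.date_range("2010-01-01", "2010-12-31 23:00", freq="h", name="Date")
rng = np.random.default_rng(0)
weather = pd.DataFrame(
    {"Temperature": 60 + 10 * rng.standard_normal(len(index))},
    index=index,
)

one_day = weather.loc["2010-07-04"]             # every row on 4 July
two_months = weather.loc["2010-03":"2010-04"]   # March through the end of April
one_month = weather.loc["2010-05"]              # all of May

print(len(one_day), len(two_months), len(one_month))
```

Both endpoints of a date-string slice are inclusive, so the March to April slice runs through the last hour of 30 April.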

SLIDE 5

Slicing time series

temperature = weather['Temperature']
march_apr = temperature['2010-03':'2010-04']  # data of March & April 2010 only
march_apr.shape
(1463,)
march_apr.iloc[-4:]  # extract last 4 entries from time series
Date
2010-04-30 20:00:00    73.3
2010-04-30 21:00:00    71.3
2010-04-30 22:00:00    69.7
2010-04-30 23:00:00    68.5
Name: Temperature, dtype: float64

SLIDE 6

Plotting time series slices

SLIDE 7

Plotting time series slices

plt.plot(temperature['2010-01'], color='red', label='Temperature')
dewpoint = weather['DewPoint']
plt.plot(dewpoint['2010-01'], color='blue', label='Dewpoint')
plt.legend(loc='upper right')
plt.xticks(rotation=60)
plt.show()

SLIDE 8

Selecting & formatting dates

jan = temperature['2010-01']
dates = jan.index[::96]  # Pick every 4th day (96 hourly samples)
print(dates)
DatetimeIndex(['2010-01-01', '2010-01-05', '2010-01-09', '2010-01-13',
               '2010-01-17', '2010-01-21', '2010-01-25', '2010-01-29'],
              dtype='datetime64[ns]', name='Date', freq=None)
labels = dates.strftime('%b %d')  # Make formatted labels
print(labels)
['Jan 01' 'Jan 05' 'Jan 09' 'Jan 13' 'Jan 17' 'Jan 21' 'Jan 25' 'Jan 29']
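The date selection can be tried without the weather file. This minimal sketch assumes a complete hourly index for January 2010 (a hypothetical stand-in for jan.index) and takes every 96th timestamp, which is one timestamp every four days.

```python
import pandas as pd

# Hypothetical hourly index for January 2010, standing in for jan.index.
jan_index = pd.date_range("2010-01-01", "2010-01-31 23:00", freq="h", name="Date")

dates = jan_index[::96]            # 96 hourly samples = 4 days
labels = dates.strftime("%b %d")   # abbreviated month name plus zero-padded day

print(list(labels))
```

The month abbreviation produced by %b depends on the active locale, so the exact label text may differ on non-English systems.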

SLIDE 9

Cleaning up ticks on axis

SLIDE 10

Cleaning up ticks on axis

plt.plot(temperature['2010-01'], color='red', label='Temperature')
plt.plot(dewpoint['2010-01'], color='blue', label='Dewpoint')
plt.xticks(dates, labels, rotation=60)
plt.legend(loc='upper right')
plt.show()
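A self-contained version of this plot might look like the sketch below. The temperature and dew point series are synthetic (assumed sine-wave data, not the course's measurements), and the figure is rendered off-screen to an in-memory buffer so the example runs without a display; in an interactive session, drop the backend line and call plt.show() instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend; an assumption for non-interactive runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic hourly data for January 2010 (assumed, for illustration only).
index = pd.date_range("2010-01-01", "2010-01-31 23:00", freq="h", name="Date")
hours = np.arange(len(index))
temperature = pd.Series(50 + 8 * np.sin(2 * np.pi * hours / 24), index=index)
dewpoint = temperature - 10

dates = index[::96]                       # one tick every 4 days
labels = list(dates.strftime("%b %d"))    # short labels such as 'Jan 05'

plt.plot(temperature, color="red", label="Temperature")
plt.plot(dewpoint, color="blue", label="Dewpoint")
plt.xticks(dates, labels, rotation=60)    # explicit tick positions and labels
plt.legend(loc="upper right")

buffer = io.BytesIO()
plt.savefig(buffer, format="png")
```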

SLIDE 11

Let's practice!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

SLIDE 12

Time series with moving windows

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

SLIDE 13

Hourly data over a year

SLIDE 14

Moving windows and time series

Moving window calculations
  Averages
  Medians
  Standard deviations
Extracts information on longer time scales
See pandas courses on how to compute
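The slides defer the computation to the pandas courses. As one possible sketch, pandas' Series.rolling can produce all three statistics listed here. The data below is synthetic, and the 24-sample window (one day of hourly readings) is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic hourly temperatures for 2010 (assumed data, not the course's file).
index = pd.date_range("2010-01-01", "2010-12-31 23:00", freq="h", name="Date")
rng = np.random.default_rng(1)
hours = np.arange(len(index))
temperature = pd.Series(
    60 + 15 * np.sin(2 * np.pi * hours / (24 * 365)) + rng.normal(0, 3, len(index)),
    index=index,
    name="Temperature",
)

window = temperature.rolling(window=24)   # 24 consecutive hourly samples
daily_mean = window.mean()
daily_median = window.median()
daily_std = window.std()

# The first 23 results are NaN: a full 24-sample window is not yet available.
print(daily_mean.isna().sum())
```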

SLIDE 15

Moving averages

# smoothed computed using moving averages
smoothed.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 8759 entries, 2010-01-01 00:00:00 to 2010-12-31 23:00:00
Data columns (total 5 columns):
14d    8424 non-null float64
1d     8736 non-null float64
3d     8688 non-null float64
7d     8592 non-null float64
dtypes: float64(5)
memory usage: 410.6 KB

print(smoothed.iloc[:3,:])
                     14d  1d   3d   7d   Temperatur
Date
2010-01-01 00:00:00  NaN  NaN  NaN  NaN  46.
2010-01-01 01:00:00  NaN  NaN  NaN  NaN  44.
2010-01-01 02:00:00  NaN  NaN  NaN  NaN  44.
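The slide does not show how smoothed was built. One plausible reconstruction, offered only as a sketch, is a rolling mean per window length measured in hourly samples. The non-null counts in the info() output are consistent with that reading (for example, 8759 - 8736 = 23 missing values for a 24-sample window), but the window sizes and the synthetic data below are assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic hourly temperatures (assumed data, for illustration only).
index = pd.date_range("2010-01-01", "2010-12-31 23:00", freq="h", name="Date")
rng = np.random.default_rng(2)
temperature = pd.Series(60 + rng.normal(0, 5, len(index)), index=index)

# Window lengths in hourly samples; the keys mirror the column names on the slide.
windows = {"1d": 24, "3d": 72, "7d": 168, "14d": 336}
smoothed = pd.DataFrame(
    {name: temperature.rolling(window=size).mean() for name, size in windows.items()}
)
smoothed["Temperature"] = temperature

# Each column starts with (window size - 1) NaN values.
print(smoothed.isna().sum())
```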

SLIDE 16

Viewing 24 hour averages

# moving average over 24 hours
plt.plot(smoothed['1d'])
plt.title('Temperature (2010)')
plt.xticks(rotation=60)
plt.show()

SLIDE 17

Viewing all moving averages

SLIDE 18

Viewing all moving averages

# plot DataFrame for January
plt.plot(smoothed.loc['2010-01'])
plt.legend(smoothed.columns)
plt.title('Temperature (Jan. 2010)')
plt.xticks(rotation=60)
plt.show()

SLIDE 19

Moving standard deviations

SLIDE 20

Moving standard deviations

plt.plot(variances.loc['2010-01'])
plt.legend(variances.columns)
plt.title('Temperature deviations')
plt.xticks(rotation=60)
plt.show()

SLIDE 21

Let's practice!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

SLIDE 22

Histogram equalization in images

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

SLIDE 23

Original image

SLIDE 24

Equalized image

SLIDE 25

Image histograms

SLIDE 26

Image histograms

orig = plt.imread('low-contrast-moon.jpg')
pixels = orig.flatten()
plt.hist(pixels, bins=256, range=(0,256), density=True, color='blue', alpha=0.3)
plt.show()
minval, maxval = orig.min(), orig.max()
print(minval, maxval)
125 244

SLIDE 27

Rescaling the image

minval, maxval = orig.min(), orig.max()
print(minval, maxval)
125 244
rescaled = (255/(maxval-minval)) * (orig - minval)
print(rescaled.min(), rescaled.max())
0.0 255.0
plt.imshow(rescaled)
plt.axis('off')
plt.show()
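The rescaling step is a linear stretch that sends the darkest pixel to 0 and the brightest to 255. The sketch below applies it to a synthetic low-contrast image (an assumed stand-in for the moon photograph, with intensities confined to the same 125 to 244 range) and converts to float first so the arithmetic is not done in 8-bit integers.

```python
import numpy as np

# Synthetic low-contrast grayscale image (assumed stand-in for the moon photo).
rng = np.random.default_rng(3)
orig = rng.integers(125, 245, size=(64, 64), dtype=np.uint8)
orig[0, 0], orig[0, 1] = 125, 244   # make sure both extremes are present

minval, maxval = float(orig.min()), float(orig.max())

# Linear stretch: minval maps to 0 and maxval maps to 255.
rescaled = 255.0 * (orig.astype(float) - minval) / (maxval - minval)

print(rescaled.min(), rescaled.max())
```

The result keeps the two-dimensional shape of the input, which is what plt.imshow expects for a grayscale image.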

SLIDE 28

Rescaled image

SLIDE 29

Original and rescaled histograms

SLIDE 30

Original and rescaled histograms

plt.hist(orig.flatten(), bins=256, range=(0,255), density=True, color='blue', alpha=0.2)
plt.hist(rescaled.flatten(), bins=256, range=(0,255), density=True, color='green', alpha=0.2)
plt.legend(['original', 'rescaled'])
plt.show()

SLIDE 31

Image histogram & CDF

SLIDE 32

Image histogram & CDF

plt.hist(pixels, bins=256, range=(0,256), density=True, color='blue', alpha=0.3)
plt.twinx()
orig_cdf, bins, patches = plt.hist(pixels, cumulative=True, bins=256, range=(0,256), density=True, color='red', alpha=0.3)
plt.title('Image histogram and CDF')
plt.xlim((0, 255))
plt.show()
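The cumulative histogram can also be computed without drawing anything. This sketch uses np.histogram on synthetic pixel values (assumed data) and accumulates the densities; with unit-width bins, the result should correspond to the first value returned by a cumulative, density-normalised plt.hist call.

```python
import numpy as np

# Synthetic 8-bit pixel intensities confined to [125, 244] (assumed data).
rng = np.random.default_rng(4)
pixels = rng.integers(125, 245, size=64 * 64).astype(np.uint8)

# Normalised histogram over the 256 possible 8-bit intensities.
hist, bins = np.histogram(pixels, bins=256, range=(0, 256), density=True)

# Each bin has width 1, so the running sum of density * width is the CDF.
cdf = np.cumsum(hist * np.diff(bins))

print(cdf[124], cdf[-1])
```

The CDF stays at 0 below the darkest intensity present and reaches 1 at the brightest, which is why a low-contrast image shows a steep rise over a narrow band.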

SLIDE 33

Equalizing intensity values

new_pixels = np.interp(pixels, bins[:-1], orig_cdf*255)
new = new_pixels.reshape(orig.shape)
plt.imshow(new)
plt.axis('off')
plt.title('Equalized image')
plt.show()
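To see the equalization step end to end, the sketch below builds a synthetic low-contrast image (assumed data, not the course's photograph), computes its CDF with np.histogram rather than plt.hist, and maps every pixel through the scaled CDF using the same np.interp call pattern as the slide.

```python
import numpy as np

# Synthetic low-contrast image: intensities clustered around 185 (assumed data).
rng = np.random.default_rng(5)
orig = rng.normal(185, 15, size=(64, 64)).clip(125, 244).astype(np.uint8)
pixels = orig.flatten()

# CDF of the intensities over the 256 possible 8-bit values.
hist, bins = np.histogram(pixels, bins=256, range=(0, 256), density=True)
cdf = np.cumsum(hist * np.diff(bins))

# Map each intensity through the scaled CDF; bins[:-1] are the left bin edges.
new_pixels = np.interp(pixels, bins[:-1], cdf * 255)
new = new_pixels.reshape(orig.shape)

print(orig.min(), orig.max(), new.min(), new.max())
```

Because the CDF is non-decreasing, the mapping preserves the brightness ordering of pixels while spreading them over a wider range of values.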

SLIDE 34

Equalized histogram & CDF

plt.hist(new_pixels, bins=256, range=(0,256), density=True, color='blue', alpha=0.3)
plt.twinx()
plt.hist(new_pixels, bins=256, range=(0,256), density=True, cumulative=True, color='red', alpha=0.1)
plt.title('Equalized image histogram and CDF')
plt.xlim((0, 255))
plt.show()

SLIDE 35

Let's practice!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

SLIDE 36

Congratulations!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON

Bryan Van de Ven

Core Developer of Bokeh

SLIDE 37

Congratulations!

INTRODUCTION TO DATA VISUALIZATION IN PYTHON