Experiments with Large Volumes of Data Using Notebooks
Gilmar Souza - Data & Analytics Principal
qconsp18-gilmar http://localhost:8888/notebooks/qconsp18-gilmar.... 1 of 17 5/9/18, 11:57 AM
source: http://jupyter.readthedocs.io/en/latest/architecture/how_jupyter_ipython_work.html
REPL: Read-Eval-Print Loop
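To make the three steps concrete, here is a minimal sketch of a read-eval-print loop in plain Python. It is not how the Jupyter kernel is implemented; the helper name `run_repl` and its list-of-strings input are assumptions made only for this illustration.

```python
import io
from contextlib import redirect_stdout


def run_repl(source_lines, namespace=None):
    """Feed each line through read -> eval -> print and return the printed text.

    `run_repl` is a hypothetical helper for illustration, not a Jupyter API.
    """
    namespace = {} if namespace is None else namespace
    out = io.StringIO()
    with redirect_stdout(out):
        for line in source_lines:                      # READ: take the next input
            try:
                code = compile(line, "<repl>", "eval")  # is it an expression?
            except SyntaxError:
                # Statements (e.g. assignments) are executed and print nothing.
                exec(compile(line, "<repl>", "exec"), namespace)
                continue
            result = eval(code, namespace)              # EVAL: compute the value
            if result is not None:
                print(repr(result))                     # PRINT: echo the result
    return out.getvalue()                               # LOOP ends when input runs out


transcript = run_repl(["x = 20", "x + 1", "'read,eval,print'.split(',')"])
print(transcript)
```

A notebook cell follows the same cycle, except that the "read" step takes a whole cell rather than a single line and the "print" step can render rich output such as plots or widgets.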
In [18]:
    import time

    def paused_print(words_csv, delay):
        # Print each comma-separated word, pausing `delay` seconds between words
        words = words_csv.split(',')
        for word in words:
            print(word)
            time.sleep(delay)

    paused_print('read,eval,print', 1)

read
eval
print

Multimedia

In [2]:
    # source: https://anaconda.org/jbednar/plotting_pitfalls/notebook
    import numpy as np
    np.random.seed(42)

    import holoviews as hv
    hv.notebook_extension('matplotlib')
    %opts Points [color_index=2] (cmap="bwr" edgecolors='k' s=50 alpha=1.0)
    %opts Scatter3D [color_index=3 fig_size=250] (cmap='bwr' edgecolor='k' s=50 alpha=1.0)
    %opts Image (cmap="gray_r") {+axiswise}
    %opts RGB [bgcolor="black" show_grid=False]

    import holoviews.plotting.mpl
    holoviews.plotting.mpl.MPLPlot.fig_alpha = 0
    holoviews.plotting.mpl.ElementPlot.bgcolor = 'white'

    from holoviews.operation.datashader import datashade
    from colorcet import fire
    datashade.cmap=fire[50:]

In [3]:
    def blues_reds(offset=0.5,pts=300):
        # The right edge of the next two lines was cut off in the export.
        # The third column (class label c = -1 or +1) is an assumption,
        # needed because the Points below declare vdims=['c'].
        blues = (np.random.normal( offset,size=pts), np.random.normal( offset,size=pts), -1*np.ones(pts))
        reds  = (np.random.normal(-offset,size=pts), np.random.normal(-offset,size=pts),  1*np.ones(pts))
        return hv.Points(blues, vdims=['c']), hv.Points(reds, vdims=['c'])

    blues,reds = blues_reds()
    blues + reds + reds*blues + blues*reds

Out[3]: [figure: blues, reds, reds*blues and blues*reds panels]

source: https://anaconda.org/jbednar/plotting_pitfalls/notebook
In [4]:
    hmap = hv.HoloMap({0:blues,0.000001:reds,1:blues,2:reds}, kdims=['level'])
    hv.Scatter3D(hmap.table(), kdims=['x','y','level'], vdims=['c'])

Out[4]: [figure: 3D scatter of the points by level]

In [19]:
    from ipywidgets import interact
    from sympy import Symbol, Eq, factor
    from sympy import init_printing
    init_printing()
    x = Symbol('x')

    def factorit(n):
        return Eq(x**n-1, factor(x**n-1))

    interact(factorit, n=(2,20));

n = 2
$x^{2} - 1 = (x - 1)(x + 1)$

source: https://github.com/jupyter-widgets/ipywidgets/blob/766cad54a47c07520e9d695534c4664c3391e7ec/docs/source/examples/Factoring.ipynb
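For reference, these are the identities that `factorit` should display for the first few slider positions, up to the order in which sympy prints the factors. They are standard factorizations over the integers written out by hand, not output copied from the notebook.

```latex
\begin{align*}
x^{2} - 1 &= (x - 1)(x + 1) \\
x^{3} - 1 &= (x - 1)(x^{2} + x + 1) \\
x^{4} - 1 &= (x - 1)(x + 1)(x^{2} + 1) \\
x^{5} - 1 &= (x - 1)(x^{4} + x^{3} + x^{2} + x + 1) \\
x^{6} - 1 &= (x - 1)(x + 1)(x^{2} + x + 1)(x^{2} - x + 1)
\end{align*}
```

Each factor is a cyclotomic polynomial, which is why composite values of `n` such as 4 and 6 produce more factors than the primes 3 and 5.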
In [22]:
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.optimize import curve_fit as cf
    from ipywidgets import interactive, fixed, interact_manual
    from IPython.display import display

    N_samples = 25
    x=np.linspace(-2,2,N_samples)

    def f(x,a,mu,sigma):
        # Gaussian curve with amplitude a, centre mu and width sigma
        r=a*np.exp(-(x-mu)**2/(2*sigma**2))
        return (r)

    def func(amplitude,ideal_mu,ideal_sigma,noise_sd,noise_mean):
        # Plot the ideal Gaussian (black line), then the same curve with
        # normally distributed noise added (yellow points)
        r=amplitude*np.exp(-(x-ideal_mu)**2/(2*ideal_sigma**2))
        plt.figure(figsize=(8,5))
        plt.plot(x,r,c='k',lw=3)
        r= r+np.random.normal(loc=noise_mean,scale=noise_sd,size=N_samples)
        plt.scatter(x,r,edgecolors='k',c='yellow',s=60)
        plt.grid(True)
        plt.show()
        return (r)

In [23]:
    y=interactive(func,amplitude=[1,2,3,4,5],ideal_mu=(-5,5,0.5),
                  ideal_sigma=(0,2,0.2),
                  noise_sd=(0,1,0.1),noise_mean=(-1,1,0.2))
    display(y)

amplitude    1
ideal_mu     0.00
ideal_sigma  1.60
noise_sd     0.40
noise_mean

source: https://towardsdatascience.com/interactive-machine-learning-make-python-lively-again-a96aec7e1627
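Written as a formula, `func` appears to implement the following model. The symbols $a$, $\mu$, $\sigma$, $m$ and $s$ are shorthand introduced here for `amplitude`, `ideal_mu`, `ideal_sigma`, `noise_mean` and `noise_sd`; they do not appear in the notebook.

```latex
\begin{align*}
r(x_i) &= a \exp\!\left(-\frac{(x_i - \mu)^{2}}{2\sigma^{2}}\right),
  \qquad x_i \in [-2, 2],\; i = 1, \dots, 25 \\
y_i &= r(x_i) + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(m,\, s^{2})
\end{align*}
```

The black line in the plot is $r(x_i)$ and the yellow points are $y_i$. The slider for `ideal_sigma` starts at 0, where the exponent is undefined, so the curve is only meaningful for $\sigma > 0$.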
Reproducibility and Collaboration
In [8]:
    import io
    import base64
    from IPython.display import HTML

    # Read the video file as bytes and base64-encode it
    video = io.open('img/emr_zeppelin.mp4', 'r+b').read()
    encoded = base64.b64encode(video)

    # Embed the encoded video in an HTML5 <video> tag as a data URI
    video_data='''<video alt="test" controls>
                    <source src="data:video/mp4;base64,{0}" type="video/mp4" />
                  </video>'''.format(encoded.decode('ascii'))
In [9]:
    HTML(data=video_data)

Out[9]: [embedded video player]
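The two cells above inline the whole MP4 into the notebook as a base64 data URI. The sketch below wraps the same idea in a small helper and measures how much the payload grows. The helper name `video_tag` and the placeholder bytes are assumptions for illustration; no real video file is read.

```python
import base64


def video_tag(video_bytes, mime="video/mp4"):
    """Return an HTML5 <video> tag with the bytes inlined as a data URI.

    `video_tag` is a hypothetical helper, not part of IPython.
    """
    encoded = base64.b64encode(video_bytes).decode("ascii")
    return ('<video controls>'
            '<source src="data:{0};base64,{1}" type="{0}" />'
            '</video>').format(mime, encoded)


# 1024 placeholder bytes standing in for the contents of a real video file
sample = bytes(range(256)) * 4
tag = video_tag(sample)

# Pull the base64 payload back out of the tag and check the round trip
payload = tag.split("base64,")[1].split('"')[0]
assert base64.b64decode(payload) == sample

# base64 emits 4 characters for every 3 input bytes
overhead = len(payload) / len(sample)
print(len(sample), len(payload), round(overhead, 2))
```

Because the encoded payload is about a third larger than the file and is stored inside the `.ipynb` itself, this approach suits short clips better than large recordings.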
gilmar.souza@ifood.com.br
https://github.com/gilmar/qconsp18
http://gilmar.me