Machine Learning Fast & Slow

Suman Deb Roy, Lead Data Scientist @ betaworks

www.rundexter.com, www.poncho.is, www.digg.com, www.digg.com/messaging

The Last 10%
Art & Science, Runway

1: Poncho
• A weather cat that sends you personalized weather messages.
• Algorithms + Humans
• Not every feature in weather data has equal importance – what's actionable?

2: Digg Trending
• Ranked each day:
  – 10 million RSS feeds, 200 million tweets, 7.5 million new articles
m.me/digg

3: Digg Deeper

4: Instapaper’s InstaRank

5: Scale Model
Communities, Not Keywords

MACHINE LEARNING WAS HARD. IT'S STILL HARD.

Algorithms vs. Data
[Figure: VALUE of historical data. Labels: prediction error; varied distribution; similarity between training & test distributions (less varied distribution); impact of a more complex algorithm; historical data value.]

Moving fast and slow
• Fast:
  – Experience, similar problems, pre-existing pipelines
• Slow:
  – New type of data, bootstrap, scaling
• Main challenge:
  – How to jump between states, when to change gears.

[Figure labels: Planned; Conscious; Unconscious; Fast; Slow]

Effects of moving Fast
• Technical debt?
  – Refactoring code
  – Improving unit tests
  – Deleting dead code
  – Reducing dependencies
  – Tightening APIs
  – Improving documentation

Effects of moving Slow
• Growth debt?
  – Waiting teammates
  – Uncertain quality assurance
  – Piling up further requests
  – Hypothesis might not be feedback driven
  – Overthinking the solution

Maintenance
• Code Level
  – How researchable, reusable, deployable
• System Level
  – Eroding abstraction boundaries
• Data Level
  – Data influences ML behavior.

Data vs. Code Organization
• Snapshotting ... detects bias
• Interface at the method level; be procedural
  – Easy to execute portions of the code.
• Separate hyper-arguments from parameters
  – Parameter: how your model is specified
  – Hyper-argument: how your algorithm should run
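
The parameter / hyper-argument split above can be sketched as two separate configuration objects. This is a minimal illustration, not code from the talk: the class names, field names, and the `describe_run` helper are all hypothetical.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ModelParameters:
    """How the model is specified (hypothetical fields)."""
    n_topics: int = 50
    regularization: float = 0.1


@dataclass(frozen=True)
class HyperArguments:
    """How the algorithm should run (hypothetical fields)."""
    max_iterations: int = 200
    random_seed: int = 42
    n_workers: int = 4


def describe_run(params: ModelParameters, args: HyperArguments) -> str:
    """Serialize the two groups under separate keys so a snapshot keeps them apart."""
    record = {"parameters": asdict(params), "hyper_arguments": asdict(args)}
    return json.dumps(record, sort_keys=True)


# Changing how the algorithm runs does not touch the model specification.
snapshot = describe_run(ModelParameters(n_topics=20), HyperArguments(random_seed=7))
print(snapshot)
```

Keeping the two groups in separate immutable objects means a rerun with more workers or a different seed cannot silently change what model is being described.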

Unstable APIs
• Who owns the data stream?
• Who owns the model?
• Ownership by
  – Entire solution
  – Expertise? DB? Pipelines? Algorithms? Stats?
• Debug?
  – Frozen versioning instead of continual
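
One possible reading of "frozen versioning" is that a published model version never changes, so a bug can be reproduced against the exact artifact that produced it. The sketch below assumes that reading; `FrozenModelRegistry` and its methods are hypothetical names, and the in-memory dict stands in for whatever store a real system would use.

```python
import hashlib


class FrozenModelRegistry:
    """Maps a version label to the SHA-256 digest of the model artifact."""

    def __init__(self):
        self._digests = {}

    def freeze(self, version: str, model_bytes: bytes) -> str:
        """Record a version once; refuse to overwrite it later."""
        if version in self._digests:
            raise ValueError(f"version {version!r} is frozen and cannot be replaced")
        digest = hashlib.sha256(model_bytes).hexdigest()
        self._digests[version] = digest
        return digest

    def verify(self, version: str, model_bytes: bytes) -> bool:
        """Check that an artifact is byte-identical to the frozen version."""
        return self._digests[version] == hashlib.sha256(model_bytes).hexdigest()


registry = FrozenModelRegistry()
registry.freeze("v1", b"serialized model weights")
print(registry.verify("v1", b"serialized model weights"))
```

A continually updated model would instead overwrite "the" model in place, which makes it hard to say which artifact was live when a bad prediction was served.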

Feature Erosion
• User behavior with new model could make features of current model unimportant
• How can we detect this?
• How can we prevent this?
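
The slide leaves detection as an open question. One simple option, offered here as an assumption rather than the speaker's method, is to compare how strongly each feature correlates with the outcome before and after a new model ships, and flag the features whose correlation collapsed. The feature names, the `clicked` target, and the `min_drop` threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def eroded_features(before, after, target="clicked", min_drop=0.3):
    """Features whose |correlation with target| fell by at least min_drop."""
    flagged = []
    for name in before:
        if name == target:
            continue
        old = abs(pearson(before[name], before[target]))
        new = abs(pearson(after[name], after[target]))
        if old - new >= min_drop:
            flagged.append(name)
    return sorted(flagged)


# Toy windows: headline_length predicted clicks before, but not after.
before = {
    "headline_length": [1, 2, 3, 4, 5, 6],
    "is_breaking": [0, 0, 0, 1, 1, 1],
    "clicked": [0, 0, 0, 1, 1, 1],
}
after = {
    "headline_length": [1, 2, 3, 4, 5, 6],
    "is_breaking": [1, 0, 1, 0, 1, 0],
    "clicked": [1, 0, 1, 0, 1, 0],
}
print(eroded_features(before, after))
```

Correlation only captures linear, one-feature-at-a-time effects, so this is a cheap first alarm rather than a full importance audit.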

Predictor Variables
• Myth: If you add a few more variables, the predictor will be better.
• If the predictors have realistic priors, their coefficients could be appropriately pulled down (in expectation) and overfitting shouldn’t be such a problem
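
The "pulled down" effect can be seen in the simplest case: one predictor, no intercept, Gaussian noise with variance `noise_var`, and a zero-mean Gaussian prior with variance `prior_var` on the slope. Under those assumptions (chosen for this sketch, not stated on the slide) the posterior mean of the slope is the least-squares estimate with an extra `noise_var / prior_var` term in the denominator. The data values are made up.

```python
def ols_slope(xs, ys):
    """Least-squares slope for y = slope * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def posterior_mean_slope(xs, ys, noise_var, prior_var):
    """Posterior mean of the slope under a N(0, prior_var) prior.

    Equivalent to ridge regression with penalty noise_var / prior_var.
    """
    penalty = noise_var / prior_var
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


xs = [1.0, 2.0, 3.0]
ys = [2.5, 3.5, 7.0]

unregularized = ols_slope(xs, ys)
shrunk = posterior_mean_slope(xs, ys, noise_var=1.0, prior_var=0.5)

# A tighter prior (smaller prior_var) pulls the coefficient closer to zero.
print(unregularized, shrunk)
```

With many predictors the same idea applies coefficient by coefficient, which is why adding weakly informative variables need not cause overfitting when the priors are sensible.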

Visualizations
Any ML algorithm must be seen to be believed.

Visualizations

Research vs. Production
• Collaboration looks very different based on the end goals
• Do you need to master git, or just get by?
• How quickly can you move something from iPython to production grade?

Even the best tools...
• Let's talk about iPython notebooks:
  – Version control
  – Fragmented code is deadly for production grade.
  – Security issue: all those open ports
  – Code reviews and pull requests.

Heuristic Escape
“Heuristic is an algorithm in a clown suit. It’s less predictable, it’s more fun, and it comes without a 30-day, money-back guarantee.”
– Steve McConnell, Code Complete

Domain of Impact
• Most engineers and computer scientists will conceptualize domains as primarily a rational, evidence-based, problem-solving enterprise focused on well-defined conditions.
• But the real world is ... more complex!
• e.g., trending news algorithms

Invention vs. Innovation
• What is ML good at? Both?
• Not outside the box; instead, connect the boxes.
• Innovation = improve significantly by adjusting an ML method
• Invention = a totally new ML method

Fitting ML into the betaworks model
[Figure labels: Product; Company; Nexus; Research; A, B, C]

Code & Data Residence
• ML module transfer
  – Code transfer
    • Core module
    • Model updating component
    • Analysis component
  – Data transfer
    • Infrastructure rebuild?
    • Performance
    • Maintenance

Research-ready pipelines
Powered by deepNews

Second-order Analysis
Powered by deepNews + Scale Model

Conversational Software

HUMAN BOT INTERCONNECTION (HBI)

[Figure: spectrum from ZERO automated solutions to MANY automated solutions. Labels in order: Affective Computing; Digg (trending, topics, deeper); Topic Modeling; DBpedia; Freebase; APIs; Apps for transactional tasks.]

[Figure: spectrum from HIGH VALUE of historical data to LOW VALUE of historical data. Labels in order: LSTM?; Tone Analyzer?; Digg (trending, topics, deeper); LDA; LSA; Freebase; DBpedia; APIs; Apps for transactional tasks.]

Data Types by Company
• Digg has topic modeling / news data
• Scale Model has social graph data
• Poncho has weather data / editorialized personality
• Giphy has gifs (emotion++)
• Instapaper has reading data
• Dexter has hooks to APIs

Transfer Learning
Yosinski et al., How transferable are features in deep neural networks?, in NIPS 2014

To Sum up
• Constraints to ML solutions occur at three levels:
  – Algorithmic
  – Data
  – Humans
• These parameters lead to several oscillating cycles of fast and slow impact of ML
• What's good for you?

ML 2016
• Understood by few, hyped by some, revered by most.
• Can be the difference between a company scaling vs. closing shop.
• Almost every company can have at least 1 product feature powered by ML.
• Be careful about bias in data.

Suman Deb Roy suman@betaworks.com | @_roysd data.betaworks.com
