The Art of Predictive Analytics: More Data, Same Models [STUDY SLIDES]
Joseph Turian joseph@metaoptimize.com @turian MetaOptimize
2012.02.02
NOTE: These are the STUDY slides from my talk at the predictive analytics meetup: http://bit.ly/xVLBuS I have removed some graphics, and added some text.
Who am I?
Engineer with 20 yrs coding exp
PhD; 10 yrs exp: large-scale ML + NLP
Founded MetaOptimize
What is MetaOptimize?
Consultancy + community on:
Large-scale ML + NLP
Well-engineered solutions
“Both NLP and ML have a lot of folk wisdom about what works and what doesn't. [This site] is crucial for sharing this collective knowledge.” - @aria42
http://metaoptimize.com/qa/
“A lot of expertise in machine learning is simply developing effective biases.”
(quoted from memory)
What's a good choice of learning rate for the second layer of this neural net on image patches? [intuition] (Yoshua Bengio)
Occam's Razor is a great example of ML intuition
Without the aid of prejudice and custom, I should not be able to find my way across the room. (William Hazlitt)
It's fun to be a geek
Be an artist
How to build the world's biggest langid (langcat) model?
[removed graphic] + Vowpal Wabbit = Win
How to build the world's biggest langid (langcat) model? SOLVED.
The art of predictive analytics:
1) Know the data out there
2) Know the code out there
3) Intuition (bias)
A lot of data with one feature correlated with the label
Twitter sentiment analysis?
Awesome! RT @rupertgrintnet Harry Potter Marks Place in Film History http://bit.ly/Eusxi :)
“Distant supervision” (Go et al., 09) (Use emoticons as labels)
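A minimal sketch of that distant-supervision recipe, assuming a scikit-learn-style setup: emoticons act as noisy labels and are stripped from the text so the model cannot simply memorize them. The two tweets here are a stand-in corpus.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

POS = [":)", ":-)", ":D"]
NEG = [":(", ":-("]

def distant_label(tweet):
    """Return (label, text with emoticons removed), or None if unusable."""
    has_pos = any(e in tweet for e in POS)
    has_neg = any(e in tweet for e in NEG)
    if has_pos == has_neg:              # no emoticon, or conflicting ones
        return None
    for e in POS + NEG:
        tweet = tweet.replace(e, " ")
    return (1 if has_pos else 0, tweet)

tweets = ["Awesome! Harry Potter Marks Place in Film History :)",
          "Worst sequel ever :("]      # stand-in corpus
labeled = [lt for lt in map(distant_label, tweets) if lt is not None]
y, texts = zip(*labeled)

X = CountVectorizer(ngram_range=(1, 2)).fit_transform(texts)
clf = LogisticRegression().fit(X, y)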
Recipe: You know a lot about the problem
Smart Priors
You know a lot about the problem: Smart Priors
Yarowsky (1995), WSD:
1) One sense per collocation.
2) One sense per discourse.
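One way the second heuristic can be cashed out in code (a hedged sketch, not from the talk): smooth per-occurrence WSD predictions with a document-majority vote. The (doc_id, sense, confidence) tuples are an assumed input format.

from collections import Counter, defaultdict

def one_sense_per_discourse(predictions, threshold=0.9):
    """predictions: iterable of (doc_id, sense, confidence)."""
    by_doc = defaultdict(list)
    for doc_id, sense, conf in predictions:
        by_doc[doc_id].append((sense, conf))
    smoothed = {}
    for doc_id, preds in by_doc.items():
        # Prefer the majority among confident predictions in this discourse
        confident = [s for s, c in preds if c >= threshold]
        votes = confident if confident else [s for s, _ in preds]
        smoothed[doc_id] = Counter(votes).most_common(1)[0][0]
    return smoothed    # doc_id -> the one sense for that discourse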
Recipe: You know a lot about the problem
Create new features
You know a lot about the problem: Create new features
Error-analysis
What errors is your model making? DO SOME EXPLORATORY DATA ANALYSIS (EDA)
Andrew Ng: “Advice for applying ML” Where do the errors come from?
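A small EDA sketch in that spirit, assuming a fitted scikit-learn classifier clf and held-out X_dev, y_dev, texts_dev: print the confusion matrix, then read the model's most confident mistakes.

import numpy as np
from sklearn.metrics import confusion_matrix

pred = clf.predict(X_dev)
print(confusion_matrix(y_dev, pred))           # where do the errors come from?

conf = clf.predict_proba(X_dev).max(axis=1)    # model confidence per example
wrong = np.where(pred != y_dev)[0]
for i in wrong[np.argsort(-conf[wrong])][:20]: # 20 most confident mistakes
    print(conf[i], "gold:", y_dev[i], "pred:", pred[i], texts_dev[i])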
Recipe: You know a little about the problem
Semi-supervised learning
You know a little about the problem: Semi-supervised learning
JOINT semi-supervised learning:
Ando and Zhang (2005), Suzuki and Isozaki (2008), Suzuki et al. (2009), etc.
=> effective but task-specific
You know a little about the problem: Semi-supervised learning
Unsupervised learning, followed by Supervised learning
[Diagram: sup data → supervised training → sup model]
How can Bob improve his model?
[Diagram: the same pipeline, asking: semi-sup training?]
[Diagram: the same pipeline, asking: semi-sup training? more feats?]
[Diagram: sup data + more feats → sup models for sup task 1 and sup task 2]
More features can be used on different tasks
[Diagram: unsup data + sup data → joint semi-sup training → semi-sup model]
Joint semi-sup (the standard semi-sup setup)
[Diagram: unsup data → unsup pretraining → unsup model → semi-sup fine-tuning with sup data → semi-sup model]
Unsupervised, then supervised
[Diagram: unsup data → unsup training → unsup feats]
Use unsupervised learning to create new features
[Diagram: unsup feats + sup data → sup training → semi-sup model]
These features can then be shared with other people
[Diagram: unsup feats feeding sup task 1, sup task 2, and sup task 3]
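A minimal sketch of the "unsup training → unsup feats → sup training" pipeline above, assuming scikit-learn and synthetic arrays: cluster plentiful unlabeled data, use distances to the cluster centers as extra features, and train the same supervised model on the augmented representation.

import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X_unsup = rng.randn(10000, 20)             # lots of unlabeled data
X_sup = rng.randn(100, 20)                 # a little labeled data
y_sup = (X_sup[:, 0] > 0).astype(int)      # synthetic labels

km = MiniBatchKMeans(n_clusters=50, random_state=0).fit(X_unsup)
unsup_feats = km.transform(X_sup)          # distance to each cluster center
X_aug = np.hstack([X_sup, unsup_feats])    # original + unsupervised features

clf = LogisticRegression(max_iter=1000).fit(X_aug, y_sup)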
Recipe: You know almost nothing about the problem
Build cool generic features
Know almost nothing about problem: Build cool generic features
Word features (Turian et al., 2010)
http://metaoptimize.com/projects/wordreprs/
45
Brown clustering (Brown et al. 92)
(image from Terry Koo)
cluster(chairman) = '0010'
2-prefix(cluster(chairman)) = '00'
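A hedged sketch of turning Brown clusters into features; it assumes the paths-file format of Percy Liang's brown-cluster tool (bit-string, word, count, tab-separated).

def load_brown_clusters(path):
    clusters = {}
    with open(path) as f:
        for line in f:
            bits, word, _count = line.rstrip("\n").split("\t")
            clusters[word] = bits
    return clusters

def brown_features(word, clusters, prefixes=(2, 4, 6, 10)):
    """Prefixes of the bit-string path give features at several granularities."""
    bits = clusters.get(word)
    if bits is None:
        return []
    return ["brown_%d=%s" % (p, bits[:p]) for p in prefixes]

# e.g. cluster("chairman") = "0010" -> ["brown_2=00", "brown_4=0010", ...]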
46
50-dim embeddings: Collobert + Weston (2008)
t-SNE visualization by van der Maaten + Hinton (2008)
Know almost nothing about problem: Build cool generic features
Document features:
Document clustering
LSA/LDA
Deep model
Document features
Salakhutdinov + Hinton 06
Domain adaptation for sentiment analysis (Glorot et al. 11)
Document features example
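A minimal sketch of one such generic document feature, assuming scikit-learn: LSA as tf-idf followed by truncated SVD. An LDA model or a deep autoencoder (as in Salakhutdinov + Hinton 06 or Glorot et al. 11) would slot into the same place.

from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

docs = ["the movie was great", "terrible film, do not watch",
        "stock prices fell sharply today"]          # stand-in corpus
lsa = make_pipeline(TfidfVectorizer(), TruncatedSVD(n_components=2))
doc_feats = lsa.fit_transform(docs)                 # one dense vector per doc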
Recipe: You know a little about the problem Make more REAL training examples
Make more real training examples
Cuz you have some time
Amazon Mechanical Turk
Snow et al. 08 “Cheap and Fast – But is it Good?”
1K Turk labels per dollar
Average over ~5 Turkers to reduce noise
=> http://crowdflower.com/
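A sketch of that aggregation step, with an assumed input mapping item id -> list of Turker labels: a plain majority vote, plus the agreement rate as a cheap noise estimate.

from collections import Counter

def aggregate_turk_labels(raw):
    gold = {}
    for item_id, labels in raw.items():
        label, votes = Counter(labels).most_common(1)[0]
        gold[item_id] = (label, votes / len(labels))   # label + agreement
    return gold

# aggregate_turk_labels({"t1": ["pos", "pos", "neg", "pos", "pos"]})
# -> {"t1": ("pos", 0.8)}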
Soylent (Bernstein et al. 10)
Find-Fix-Verify: Crowd control design pattern
Soylent, a prototype...
Find a problem → Fix each problem → Verify quality
Make more real training examples
Active learning
Dualist (Settles 11) http://code.google.com/p/dualist/
Dualist (Settles 11) http://code.google.com/p/dualist/
Applications: document categorization, WSD, information extraction, Twitter sentiment analysis
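Dualist couples active learning with semi-supervised EM; the sketch below is just the active half, as plain uncertainty (margin) sampling. clf, X_pool, and the human-labeling step are assumed.

import numpy as np

def query_batch(clf, X_pool, k=10):
    """Pick the k pool instances the current model is least sure about."""
    proba = np.sort(clf.predict_proba(X_pool), axis=1)
    margin = proba[:, -1] - proba[:, -2]   # top-1 minus top-2 probability
    return np.argsort(margin)[:k]          # smallest margin = most uncertain

# for i in query_batch(clf, X_pool): labels[i] = ask_human(X_pool[i])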
You know a little about the problem: Make more training examples
FAKE training examples
FAKE training examples
Denoising autoencoder (AA), RBM
MNIST distortions (LeCun et al. 98)
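A minimal sketch of such fake examples, assuming 28x28 numpy arrays: jitter each image a pixel in each direction and reuse the label, since the class is invariant to small shifts. (np.roll wraps pixels around the border, a simplification of a true translation.)

import numpy as np

def shifted_copies(image, shifts=((0, 1), (0, -1), (1, 0), (-1, 0))):
    fakes = []
    for dy, dx in shifts:
        # Shift dy rows and dx columns; the label stays the same
        fakes.append(np.roll(np.roll(image, dy, axis=0), dx, axis=1))
    return fakes   # 4 distorted copies per original image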
No negative examples?
FAKE training examples
Multi-view / multi-modal
Multi-view / multi-modal
How do you evaluate an IR system, if you have no labels? See how good the title is at retrieving the body text.
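A label-free evaluation sketch along those lines, assuming parallel lists titles and bodies: index the bodies with tf-idf, query with the titles, and compute the mean reciprocal rank of each title's own body.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

vec = TfidfVectorizer()
B = vec.fit_transform(bodies)            # index the article bodies
Q = vec.transform(titles)                # titles act as queries
scores = (Q @ B.T).toarray()             # tf-idf rows are L2-normalized: cosine
own = scores[np.arange(len(titles)), np.arange(len(titles))]
ranks = (scores > own[:, None]).sum(axis=1) + 1
print("MRR:", (1.0 / ranks).mean())      # higher = better retrieval features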
2) KNOW THE DATA
Know the data
Labelled/structured data: ODP, Freebase, Wikipedia, DBpedia, etc.
Know the data
Unlabelled data: WaCky, ClueWeb09, CommonCrawl, Ngram corpora
Ngrams
Google, Bing, Google Books
Roll your own: Common Crawl
Know the data
Do something stupid on a lot of data
Do something stupid on a lot of data: Ngrams
Spell-checking
Phrase segmentation
Word breaking
Synonyms
Language models
See “An Overview of Microsoft Web N-gram Corpus and Applications” (Wang et al 10)
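For instance, word breaking needs nothing but unigram counts: a Norvig-style sketch, with a toy count table standing in for a web-scale ngram corpus.

import math
from functools import lru_cache

COUNTS = {"choose": 100, "spain": 80, "chooses": 30, "pain": 60}  # stand-in
TOTAL = 1e9                               # assumed corpus size

def logprob(word):
    # Penalty for unseen words, harsher for longer strings
    return math.log(COUNTS.get(word, 10.0 / 10 ** len(word)) / TOTAL)

@lru_cache(maxsize=None)
def segment(text):
    """Best segmentation of text under a unigram language model."""
    if not text:
        return ()
    splits = ((text[:i],) + segment(text[i:]) for i in range(1, len(text) + 1))
    return max(splits, key=lambda words: sum(map(logprob, words)))

print(segment("choosespain"))             # -> ('choose', 'spain')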
Do something stupid on a lot of data
Web-scale k-means for NER (Lin and Wu 09)
Do something stupid on a lot of data
Web-scale clustering
Know the data
Multi-modal learning
Multi-modal learning: images and captions
[Diagram: image features = caption features, e.g. an image tagged "facepalm"]
Multi-modal learning: titles and article body
[Diagram: article-body features = title features]
Multi-modal learning: audio and tags
[Diagram: audio features = tag features, e.g. "upbeat", "hip hop"]
3) IT'S MODELS ALL THE WAY DOWN
Break down a pipeline:
1-best (greedy), k-best, Finkel et al. (06)
Good code to build on:
Stanford NLP tools, clustering algorithms, Terry Koo's parser, etc.
Good code to build on YOUR MODEL
Eat your own dogfood:
Bootstrapping (Yarowsky 95)
Co-training (Blum + Mitchell 98)
EM (Nigam et al. 00)
Self-training (McClosky et al. 06)
Dualist (Settles 11): active learning + semi-sup learning
Eat your own dogfood
Cheap bootstrapping: one step of EM (Settles 11)
“Awesome! What a great movie!”
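A hedged sketch of that cheap bootstrapping loop, assuming a scikit-learn classifier clf, labeled sparse data (X, y), and an unlabeled pool X_u: pseudo-label, keep only the high-precision predictions, retrain.

import numpy as np
from scipy.sparse import vstack

clf.fit(X, y)                              # train on what you have
proba = clf.predict_proba(X_u)             # pseudo-label the unlabeled pool
confident = proba.max(axis=1) > 0.95       # low recall + high precision
X_aug = vstack([X, X_u[confident]])
y_aug = np.concatenate([y, clf.classes_[proba[confident].argmax(axis=1)]])
clf.fit(X_aug, y_aug)                      # one EM-ish step; repeat if it helps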
It's models all the way down
Use models to annotate
Low recall + high precision + lots of data = win
Use models to annotate Face modeling
Pose-invariant face features
It's models all the way down
Joins on large noisy data sets
Joins on large noisy data sets
ReVerb (Fader et al., 11) http://reverb.cs.washington.edu
Extractions over the entire ClueWeb09 (826 MB compressed)
ReVerb (Fader et al., 11)
Joins on noisy data sets (can clean up the data??)
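One cheap cleanup that makes such joins workable, sketched under the assumption of ReVerb-style (arg1, rel, arg2) string triples: normalize entity strings before joining, so near-duplicate surface forms line up.

import re

def norm(s):
    s = s.lower().strip()
    s = re.sub(r"[^\w\s]", "", s)          # drop punctuation
    return re.sub(r"\s+", " ", s)          # collapse whitespace

def join(extractions_a, extractions_b):
    """Join arg2 of A against arg1 of B on normalized strings."""
    index = {}
    for arg1, rel, arg2 in extractions_b:
        index.setdefault(norm(arg1), []).append((arg1, rel, arg2))
    for arg1, rel, arg2 in extractions_a:
        for match in index.get(norm(arg2), []):
            yield (arg1, rel, arg2) + match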
The art of predictive analytics:
1) Know the data out there
2) Know the code out there
3) Intuition (bias)
Summary of recipes:
Know your problem
Throw in good features
Use others' good models in your pipeline
Make more training examples
Use a lot of data
"It especially annoys me when racists are accused of 'discrimination.' The ability to discriminate is a precious facility; by judging all members of one 'race' to be the same, the racist precisely shows himself incapable of discrimination."
Other cool research to look at:
* Frustratingly easy domain adaptation (Daume 07)
* The Unreasonable Effectiveness of Data (Halevy et al. 09)
* Web-scale algorithms (search on http://metaoptimize.com/qa/)
* Self-taught learning (Raina et al. 07)
2012.02.02