Introduction to click-through rates
PREDICTING CTR WITH MACHINE LEARNING IN PYTHON
Kevin Huo, Instructor

Click-through rates
Click-through rate (CTR): # of clicks on ads / # of views of ads
Companies and marketers serving ads want to maximize click-through rate
Prediction of click-through rates is critical for companies and marketers

A classification lens
Classification: assigning categories to observations
Classifiers use training data and are evaluated on testing data
Target: a binary variable, 0/1 for non-click or click
Feature: any variable used to help predict the target

A brief look at sample data
Each row represents a particular outcome (click or no click) for a given user and a given ad
Filtering for columns can be done through .isin():
    df.loc[:, df.columns.isin(['device'])]
Assuming y is a column of clicks, CTR can be found by:
    y.sum() / len(y)
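A minimal sketch of the two operations on this slide, using a small made-up DataFrame. The data values and the banner_pos column are hypothetical and only serve to illustrate column filtering and the CTR calculation.

```python
import pandas as pd

# Hypothetical toy data: one row per ad impression.
df = pd.DataFrame({
    "device_type": [1, 1, 0, 1, 0, 1],
    "banner_pos": [0, 1, 0, 0, 1, 0],   # assumed extra feature
    "click": [0, 1, 0, 0, 1, 0],        # 1 = click, 0 = no click
})

# Boolean mask over the column labels, used to keep matching columns only.
mask = df.columns.isin(["device_type"])
subset = df.loc[:, mask]

# CTR = number of clicks / number of impressions (rows).
y = df["click"]
ctr = y.sum() / len(y)
print(list(subset.columns), round(ctr, 3))
```

Because the target is coded 0/1, y.mean() gives the same value as y.sum() / len(y).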

Analyzing features
print(df.device_type.value_counts())
    1    45902
    0     2947
print(df.groupby('device_type')['click'].sum())
    0     633
    1    7890
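The two outputs above can be combined into a per-device CTR. The sketch below copies the counts from the slide into plain dictionaries, so it does not need the original dataset.

```python
# Counts copied from the value_counts() and groupby() output on the slide.
impressions = {0: 2947, 1: 45902}
clicks = {0: 633, 1: 7890}

# CTR per device type = clicks / impressions for that device type.
ctr_by_device = {d: clicks[d] / impressions[d] for d in impressions}

for device, ctr in sorted(ctr_by_device.items()):
    print(f"device_type={device}: CTR={ctr:.3f}")
```

Device type 0 has far fewer impressions but a higher CTR (about 21.5%) than device type 1 (about 17.2%). With the full DataFrame, df.groupby('device_type')['click'].mean() would give the same ratios directly.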

Let's practice!

Overview of machine learning models

Logistic regression
Logistic regression: a linear classifier that models the relationship between a dependent variable (the target) and independent variables (the features)

Training the model
Can create the model via:
    from sklearn.linear_model import LogisticRegression
    clf = LogisticRegression()
Each classifier has a fit() method which takes in X_train and y_train:
    clf.fit(X_train, y_train)
X_train is the array of training features, y_train is the vector of training targets
The classifier should only see training data, to avoid "seeing answers beforehand"
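A minimal training sketch, assuming scikit-learn is installed. The dataset is synthetic (generated with make_classification, with roughly 83% non-clicks to mimic class imbalance); it is a stand-in for the course data, not the course data itself.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 5 features, about 83% of rows in class 0.
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.83], random_state=0
)

# Hold out 30% of rows so the classifier never sees them during fit().
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

clf = LogisticRegression()
clf.fit(X_train, y_train)

# One coefficient per feature for the single binary decision boundary.
print(clf.coef_.shape)
```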

Testing the model
Each classifier has a predict() method which takes in X_test and generates predicted labels (y_pred) as follows:
    array([0, 1, 1, ..., 1, 0, 1])
The predict_proba() method produces probability scores, one row per sample and one column per class:
    array([[0.2, 0.8], [0.4, 0.6], ..., [0.1, 0.9], [0.3, 0.7]])
The score in the second column reflects the probability of a particular ad being clicked by a particular user
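A short sketch of predict() and predict_proba() on held-out data, again on synthetic data (an assumption, not the course dataset). It shows the shape of each output and how to pull out the click probability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with imbalanced classes.
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.83], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
clf = LogisticRegression().fit(X_train, y_train)

# Hard 0/1 labels, one per test row.
y_pred = clf.predict(X_test)

# Probabilities: column 0 is P(no click), column 1 is P(click).
proba = clf.predict_proba(X_test)
click_proba = proba[:, 1]

print(y_pred[:5], np.round(click_proba[:5], 2))
```

Each row of predict_proba() sums to 1, so the second column alone is enough to rank ads by predicted click probability.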

Evaluating the model
Accuracy: the percentage of test targets correctly identified
    from sklearn.metrics import accuracy_score
    accuracy_score(y_test, y_pred)
Accuracy should not be the only metric used to evaluate a model, particularly on imbalanced datasets
CTR prediction is an example where classes are imbalanced
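To see why accuracy alone can mislead on imbalanced data, consider a hypothetical test set with 17 clicks out of 100 impressions and a "model" that always predicts no click. The numbers are made up for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical test set: 17 clicks, 83 non-clicks.
y_test = np.array([1] * 17 + [0] * 83)

# A trivial baseline that never predicts a click.
y_pred = np.zeros_like(y_test)

acc = accuracy_score(y_test, y_pred)
print(acc)
```

The baseline scores 83% accuracy while identifying none of the clicks, which is why metrics such as the ROC curve and AUC are used alongside it.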

CTR prediction using decision trees

Decision trees
Sample outcomes are shown in the table below:
    age          is_student  loan
    middle_aged              1
    youth        no          0
    youth        yes         1
First split is based on age of the applicant
For the youth group, second split is based on student status
Model provides heuristics for understanding
Nodes represent the features
Branches represent the decisions based on features
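The splits described on this slide can be written out as plain rules. The function below is a hand-coded sketch of that tree (the function name is made up, and the column name age is inferred from the slide text); a fitted DecisionTreeClassifier learns such rules from data rather than having them written by hand.

```python
def predict_loan(age, is_student=None):
    """Hand-written version of the example tree: returns 1 or 0 for loan."""
    # First split: age group.
    if age == "middle_aged":
        return 1
    # Second split, only reached for the youth group: student status.
    if age == "youth":
        return 1 if is_student == "yes" else 0
    # The slide's table covers only these two age groups.
    raise ValueError(f"age group not covered by the example: {age!r}")


print(predict_loan("middle_aged"))
print(predict_loan("youth", "no"))
print(predict_loan("youth", "yes"))
```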

Training and testing the model
Create via:
    from sklearn.tree import DecisionTreeClassifier
    clf = DecisionTreeClassifier()
Similar to logistic regression, a decision tree also involves clf.fit(X_train, y_train) for training data and clf.predict(X_test) for testing labels:
    array([0, 1, 1, ..., 1, 0, 1])
clf.predict_proba(X_test) for probability scores:
    array([[0.2, 0.8], [0.4, 0.6], ..., [0.1, 0.9], [0.3, 0.7]])
Example of randomly splitting training and testing data, where testing data is 30% of the total sample size:
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
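Putting the pieces of this slide together, a minimal end-to-end sketch might look as follows. The data is synthetic, and max_depth=3 is an arbitrary choice made here to keep the tree small; neither comes from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with imbalanced classes.
X, y = make_classification(
    n_samples=1000, n_features=5, weights=[0.83], random_state=0
)

# 70% of rows for training, 30% for testing; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# max_depth is an assumed setting; the slides use the defaults.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)          # 0/1 labels
y_score = clf.predict_proba(X_test)   # per-class probabilities
print(y_pred.shape, y_score.shape)
```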

Evaluation with ROC curve
True positive rate (Y-axis) = #(classifier predicts positive, actually positive) / #(positives)
False positive rate (X-axis) = #(classifier predicts positive, actually negative) / #(negatives)
Dotted blue line: baseline AUC of 0.5
Want orange line (AUC) to be as close to 1 as possible
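The two rates can be computed by hand from a set of labels and predictions. The small arrays below are made up for illustration; each point on an ROC curve is one such (FPR, TPR) pair at a particular decision threshold.

```python
# Hypothetical labels and predictions: 3 actual positives, 7 actual negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

pairs = list(zip(y_true, y_pred))
true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
positives = sum(y_true)
negatives = len(y_true) - positives

tpr = true_pos / positives    # share of actual positives that were caught
fpr = false_pos / negatives   # share of actual negatives flagged as positive
print(round(tpr, 3), round(fpr, 3))
```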

AUC of ROC curve
    from sklearn.metrics import roc_curve, auc
    y_score = clf.predict_proba(X_test)
    fpr, tpr, thresholds = roc_curve(y_test, y_score[:, 1])
roc_curve() inputs: test and score arrays
    roc_auc = auc(fpr, tpr)
auc() inputs: false-positive and true-positive arrays
If the model is accurate and CTR is low, you may want to reassess how the ad message is relayed and what audience it is targeted at
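A tiny worked example of roc_curve() and auc(), with four hand-picked scores in place of predict_proba() output (the values are illustrative, not from the course data).

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Four samples: true labels and the model's score for the positive class.
y_test = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve sweeps the decision threshold and returns one (fpr, tpr) per step.
fpr, tpr, thresholds = roc_curve(y_test, y_score)

# Area under the piecewise-linear ROC curve.
roc_auc = auc(fpr, tpr)
print(roc_auc)
```

Here three of the four (positive, negative) pairs are ranked correctly, so the AUC is 0.75. A model that ranks at random would land near the 0.5 baseline.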
