Introduction to click-through rates


  1. Introduction to click-through rates PREDICTING CTR WITH MACHINE LEARNING IN PYTHON Kevin Huo, Instructor

  2. Click-through rates. Click-through rate: # of clicks on ads / # of views of ads. Companies and marketers serving ads want to maximize click-through rate. Prediction of click-through rates is critical for companies and marketers.

  3. A classification lens. Classification: assigning categories to observations. Classifiers use training data and are evaluated on testing data. Target: a binary variable, 0/1 for non-click or click. Feature: any variable used to help predict the target.

  4. A brief look at sample data. Each row represents a particular outcome of click or not click for a given user for a given ad. Filtering for columns can be done through .isin(), e.g. df.loc[:, df.columns.isin(['device'])]. Assuming y is a column of clicks, CTR can be found by: y.sum() / len(y)
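
A minimal sketch of the column filtering and CTR calculation above (not from the slides; the toy DataFrame and its column values are made up for illustration):

    import pandas as pd

    # Hypothetical sample of ad impressions: click = 1 means the ad was clicked
    df = pd.DataFrame({'device': ['mobile', 'desktop', 'mobile', 'mobile'],
                       'click':  [1, 0, 0, 1]})

    # Keep only selected columns via a boolean mask over df.columns
    device_only = df.loc[:, df.columns.isin(['device'])]

    # CTR = # of clicks / # of views
    y = df['click']
    ctr = y.sum() / len(y)
    print(ctr)  # 0.5 for this toy sample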

  5. Analyzing features
     print(df.device_type.value_counts())
     1    45902
     0     2947
     print(df.groupby('device_type')['click'].sum())
     0      633
     1     7890
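
A hedged follow-up sketch (not in the slides) showing how the two outputs above can be combined into a per-device click-through rate; the toy df here only stands in for the course dataset:

    import pandas as pd

    # Made-up impressions table with the device_type and click columns used above
    df = pd.DataFrame({'device_type': [1, 1, 1, 0, 0],
                       'click':       [1, 0, 1, 0, 1]})

    # Clicks per device type divided by impressions per device type gives per-device CTR
    ctr_per_device = df.groupby('device_type')['click'].sum() / df['device_type'].value_counts()
    print(ctr_per_device)

    # Equivalent shortcut: the mean of a 0/1 click column is the CTR
    print(df.groupby('device_type')['click'].mean())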

  6. Let's practice!

  7. Overview of machine learning models PREDICTING CTR WITH MACHINE LEARNING IN PYTHON Kevin Huo, Instructor

  8. Logistic regression. Logistic regression: a linear classifier modeling the relationship between the dependent variable (click / no click) and the independent variables (features).
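
As a hedged illustration of that idea (not from the slides), the classifier passes a linear combination of the features through the logistic (sigmoid) function to obtain a click probability; the weights below are made up:

    import numpy as np

    def predict_click_probability(x, weights, bias):
        """Logistic regression scoring: sigmoid of a linear combination of features."""
        z = np.dot(x, weights) + bias      # linear part
        return 1.0 / (1.0 + np.exp(-z))    # sigmoid squashes z into (0, 1)

    # Toy example with two features and made-up weights
    print(predict_click_probability(np.array([1.0, 0.5]), np.array([0.8, -1.2]), 0.1))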

  9. Training the model. Can create the model via: clf = LogisticRegression(). Each classifier has a fit() method which takes in an X_train, y_train: clf.fit(X_train, y_train). X_train is the matrix of training features, y_train is the vector of training targets. The classifier should only see training data, to avoid "seeing the answers beforehand".
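
A minimal sketch of the fit step (assumes scikit-learn; the toy training data is invented just to make the snippet runnable):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Toy training data: two numeric features per impression, 0/1 click target
    X_train = np.array([[0, 1.0], [1, 0.2], [0, 0.5], [1, 0.9]])
    y_train = np.array([0, 1, 0, 1])

    clf = LogisticRegression()
    clf.fit(X_train, y_train)   # the classifier only ever sees the training split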

  10. Testing the model. Each classifier has a predict() method which takes in an X_test to generate predictions as follows: array([0, 1, 1, ..., 1, 0, 1]). The predict_proba() method produces probability scores: array([[0.2, 0.8], [0.4, 0.6], ..., [0.1, 0.9], [0.3, 0.7]]). The score reflects the probability of a particular ad being clicked by a particular user.
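
Continuing the hedged toy example from the previous note (repeated here so it runs on its own):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Refit the toy classifier, then score unseen rows
    X_train = np.array([[0, 1.0], [1, 0.2], [0, 0.5], [1, 0.9]])
    y_train = np.array([0, 1, 0, 1])
    clf = LogisticRegression().fit(X_train, y_train)

    X_test = np.array([[0, 0.3], [1, 0.8]])
    y_pred = clf.predict(X_test)           # hard 0/1 labels
    y_score = clf.predict_proba(X_test)    # one [P(no click), P(click)] row per test row
    print(y_pred, y_score[:, 1])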

  11. Evaluating the model. Accuracy: the percentage of test targets correctly identified: accuracy_score(y_test, y_pred). Accuracy should not be the only metric used to evaluate the model, particularly on imbalanced datasets. CTR prediction is an example where classes are imbalanced.
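
A hedged illustration (made-up numbers) of why accuracy alone can mislead when clicks are rare:

    import numpy as np
    from sklearn.metrics import accuracy_score

    # Imbalanced toy test set: only 1 click in 10 impressions
    y_test = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
    y_pred = np.zeros(10, dtype=int)       # a useless model that always predicts "no click"

    print(accuracy_score(y_test, y_pred))  # 0.9 accuracy despite never predicting a click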

  12. Let's practice!

  13. CTR prediction using decision trees PREDICTING CTR WITH MACHINE LEARNING IN PYTHON Kevin Huo, Instructor

  14. Decision trees. Sample outcomes are shown in the table below. The first split is based on the age of the applicant; for the youth group, the second split is based on student status. Nodes represent the features, branches represent the decisions based on features. The model provides heuristics for understanding.
      age          is_student   loan
      middle_aged                1
      youth        no            0
      youth        yes           1
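
A hedged sketch (not from the slides) fitting a small tree to the toy loan table above; the one-hot encoding via pd.get_dummies and the 'unknown' placeholder for the missing is_student value are assumptions made only so the snippet runs:

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    toy = pd.DataFrame({'age':        ['middle_aged', 'youth', 'youth'],
                        'is_student': ['unknown', 'no', 'yes'],   # not given for middle_aged in the table
                        'loan':       [1, 0, 1]})

    X = pd.get_dummies(toy[['age', 'is_student']])   # categorical -> 0/1 indicator columns
    y = toy['loan']

    tree = DecisionTreeClassifier().fit(X, y)
    print(tree.predict(X))   # reproduces the table's outcomes on this tiny sample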

  15. Training and testing the model. Create via: clf = DecisionTreeClassifier(). Similar to logistic regression, a decision tree also involves clf.fit(X_train, y_train) for training data and clf.predict(X_test) for testing labels: array([0, 1, 1, ..., 1, 0, 1]). clf.predict_proba(X_test) gives probability scores: array([[0.2, 0.8], [0.4, 0.6], ..., [0.1, 0.9], [0.3, 0.7]]). Example for randomly splitting training and testing data, where testing data is 30% of total sample size: train_test_split(X, y, test_size=0.3, random_state=0)
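
A compact hedged sketch of the full split / fit / predict flow described above (a random stand-in dataset is used only so the snippet runs; the real course data is not reproduced here):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((100, 3))             # stand-in feature matrix
    y = rng.integers(0, 2, size=100)     # stand-in 0/1 click targets

    # Hold out 30% of the rows as the test set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = DecisionTreeClassifier()
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    y_score = clf.predict_proba(X_test)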

  16. Evaluation with ROC curve. True positive rate (Y-axis) = #(classifier predicts positive, actually positive) / #(positives). False positive rate (X-axis) = #(classifier predicts positive, actually negative) / #(negatives). Dotted blue line: baseline AUC of 0.5. Want the orange line (AUC) to be as close to 1 as possible.
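
A hedged sketch (not from the slides) computing the two rates above directly from their definitions, using made-up labels:

    import numpy as np

    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    y_pred = np.array([1, 0, 1, 0, 1, 0, 1, 0])

    tp = np.sum((y_pred == 1) & (y_true == 1))   # predicted positive, actually positive
    fp = np.sum((y_pred == 1) & (y_true == 0))   # predicted positive, actually negative

    tpr = tp / np.sum(y_true == 1)   # true positive rate
    fpr = fp / np.sum(y_true == 0)   # false positive rate
    print(tpr, fpr)                  # 0.75 0.25 for these made-up labels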

  17. AUC of ROC curve. Y_score = clf.predict_proba(X_test); fpr, tpr, thresholds = roc_curve(Y_test, Y_score[:, 1]). roc_curve() inputs: test and score arrays. roc_auc = auc(fpr, tpr). auc() inputs: false-positive and true-positive rate arrays. If the model is accurate and CTR is low, you may want to reassess how the ad message is relayed and what audience it is targeted at.
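
A hedged end-to-end sketch of the AUC computation above, reusing the same random stand-in data as the earlier decision tree note so it runs on its own:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import roc_curve, auc

    rng = np.random.default_rng(0)
    X = rng.random((100, 3))              # stand-in feature matrix
    y = rng.integers(0, 2, size=100)      # stand-in 0/1 click targets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = DecisionTreeClassifier().fit(X_train, y_train)

    Y_score = clf.predict_proba(X_test)                     # column 1 holds P(click)
    fpr, tpr, thresholds = roc_curve(y_test, Y_score[:, 1])
    roc_auc = auc(fpr, tpr)                                 # area under the ROC curve
    print(roc_auc)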

  18. Let's practice!
