Churn prediction fundamentals

  1. Churn prediction fundamentals
     MACHINE LEARNING FOR MARKETING IN PYTHON
     Karolis Urbonas, Head of Analytics & Science, Amazon

  2. What is churn?
     - Churn happens when a customer stops buying / engaging
     - The business context could be contractual or non-contractual
     - Sometimes churn can be viewed as either voluntary or involuntary

  3. Types of churn
     Main churn typology is based on two business model types:
     - Contractual (phone subscription, TV streaming subscription)
     - Non-contractual (grocery shopping, online shopping)

  4. Modeling different types of churn
     Typically:
     - Non-contractual churn is harder to define and model, as there's no explicit customer decision
     - We will model contractual churn in the telecom business model

  5. Encoding churn
     - Typically 1/0, with 1 = Churn, 0 = No Churn
     - Could be a string (Churn / No Churn or Yes / No); best practice is to transform it to 1 and 0

         set(telcom['Churn'])

         {0, 1}
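
When the label arrives as a string, the transformation to 1 and 0 could be sketched as follows. The `telcom_raw` frame and the `churn_map` dictionary are hypothetical stand-ins, since the slides only show the already-encoded column.

```python
import pandas as pd

# Hypothetical sample; the real telcom data is not shown in the slides.
telcom_raw = pd.DataFrame({
    "customerID": ["a1", "b2", "c3", "d4"],
    "Churn": ["Yes", "No", "No", "Yes"],
})

# Map the string labels to integers: 1 = Churn, 0 = No Churn.
churn_map = {"Yes": 1, "No": 0}
telcom_raw["Churn"] = telcom_raw["Churn"].map(churn_map)

# Any label missing from the mapping becomes NaN, so check before casting.
if telcom_raw["Churn"].isna().any():
    raise ValueError("Unexpected churn label found")
telcom_raw["Churn"] = telcom_raw["Churn"].astype(int)

print(set(telcom_raw["Churn"]))
```

Using an explicit dictionary rather than a generic label encoder keeps the direction of the encoding (1 = Churn) visible in the code.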

  6. Exploring churn distribution

         telcom.groupby(['Churn']).size() / telcom.shape[0] * 100

         Churn
         0    73.421502
         1    26.578498
         dtype: float64

  7. Split to training and testing data

         from sklearn.model_selection import train_test_split
         train, test = train_test_split(telcom, test_size = .25)
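
Because only about a quarter of customers churn, a random split can leave the two sets with slightly different churn rates. One possible variation, not shown on the slide, is to stratify on the target and fix the random seed. The `telcom_demo` frame below is a hypothetical stand-in for `telcom`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for telcom: 400 customers, 25% churners.
telcom_demo = pd.DataFrame({
    "customerID": [f"c{i}" for i in range(400)],
    "tenure": list(range(400)),
    "Churn": [1] * 100 + [0] * 300,
})

# stratify keeps the churn rate (nearly) identical in both sets;
# random_state makes the split reproducible between runs.
train_demo, test_demo = train_test_split(
    telcom_demo,
    test_size=0.25,
    stratify=telcom_demo["Churn"],
    random_state=42,
)

print(train_demo["Churn"].mean(), test_demo["Churn"].mean())
```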

  8. Separate features and target variables
     Separate column names by data types

         target = ['Churn']
         custid = ['customerID']
         cols = [col for col in telcom.columns if col not in custid + target]

     Build training and testing datasets

         train_X = train[cols]
         train_Y = train[target]
         test_X = test[cols]
         test_Y = test[target]

  9. Let's go practice!

  10. Predict churn with logistic regression

  11. Introduction to logistic regression
      - Statistical classification model for binary responses
      - Models the log-odds of the probability of the target
      - Assumes a linear relationship between the log-odds of the target and the predictors
      - Returns coefficients and prediction probability
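
The log-odds relationship can be checked numerically. The sketch below fits a model on synthetic data (an assumption; the telcom data is not available here) and compares the linear score built from the coefficients with the log-odds of the predicted probability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data standing in for the churn dataset.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
model = LogisticRegression().fit(X, y)

# Linear part of the model: intercept + sum(coefficient * feature).
linear_score = X @ model.coef_.ravel() + model.intercept_[0]

# Probability of the positive class and its log-odds.
p_positive = model.predict_proba(X)[:, 1]
log_odds = np.log(p_positive / (1 - p_positive))

# The two quantities agree up to floating-point error.
print(np.max(np.abs(linear_score - log_odds)))
```

Each coefficient is therefore the change in log-odds for a one-unit change in its predictor, holding the others fixed.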

  12. Modeling steps
      1. Split data to training and testing
      2. Initialize the model
      3. Fit the model on the training data
      4. Predict values on the testing data
      5. Measure model performance on testing data

  13. Fitting the model
      Import the Logistic Regression classifier

          from sklearn.linear_model import LogisticRegression

      Initialize Logistic Regression instance

          logreg = LogisticRegression()

      Fit the model on the training data

          logreg.fit(train_X, train_Y)

  14. Model performance metrics
      Key metrics:
      - Accuracy: the % of correctly predicted labels (both Churn and non-Churn)
      - Precision: the % of the model's positive class predictions (here, predicted as Churn) that were correctly classified
      - Recall: the % of all positive class samples (all churned customers) that were correctly classified
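
The three definitions can be worked through by hand on a small example. The labels below are made up for illustration; only the formulas follow the slide.

```python
# Hypothetical labels: 1 = Churn, 0 = No Churn.
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(actual, predicted))
tp = sum(a == 1 and p == 1 for a, p in pairs)  # churners predicted as churn: 2
fp = sum(a == 0 and p == 1 for a, p in pairs)  # non-churners predicted as churn: 1
fn = sum(a == 1 and p == 0 for a, p in pairs)  # churners the model missed: 2
tn = sum(a == 0 and p == 0 for a, p in pairs)  # non-churners predicted correctly: 5

accuracy = (tp + tn) / len(pairs)   # 7 / 10
precision = tp / (tp + fp)          # 2 / 3
recall = tp / (tp + fn)             # 2 / 4

print(round(accuracy, 3), round(precision, 3), round(recall, 3))
# prints: 0.7 0.667 0.5
```

Accuracy looks reasonable here even though half of the churners are missed, which is why precision and recall are reported alongside it.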

  15. Measuring model accuracy

          from sklearn.metrics import accuracy_score
          pred_train_Y = logreg.predict(train_X)
          pred_test_Y = logreg.predict(test_X)
          train_accuracy = accuracy_score(train_Y, pred_train_Y)
          test_accuracy = accuracy_score(test_Y, pred_test_Y)
          print('Training accuracy:', round(train_accuracy, 4))
          print('Test accuracy:', round(test_accuracy, 4))

          Training accuracy: 0.8108
          Test accuracy: 0.8009

  16. Measuring precision and recall

          from sklearn.metrics import precision_score, recall_score
          train_precision = round(precision_score(train_Y, pred_train_Y), 4)
          test_precision = round(precision_score(test_Y, pred_test_Y), 4)
          train_recall = round(recall_score(train_Y, pred_train_Y), 4)
          test_recall = round(recall_score(test_Y, pred_test_Y), 4)
          print('Training precision: {}, Training recall: {}'.format(train_precision, train_recall))
          print('Test precision: {}, Test recall: {}'.format(test_precision, test_recall))

      Output as shown on the slide. Its "Test precision" value, 0.5736, is the training recall, because the slide's print call passed train_recall in that position; the actual test precision is not shown.

          Training precision: 0.6725, Training recall: 0.5736
          Test precision: 0.5736, Test recall: 0.4835

  17. Regularization
      - Introduces a penalty coefficient in the model building phase
      - Addresses overfitting (when patterns are "memorized" by the model)
      - Some regularization techniques also perform feature selection, e.g. L1
      - Makes the model more generalizable to unseen samples

  18. L1 regularization and feature selection
      - LogisticRegression from sklearn performs L2 regularization by default
      - L1 regularization, also called LASSO, can be called explicitly; this approach performs feature selection by shrinking some of the model coefficients to zero

          from sklearn.linear_model import LogisticRegression
          logreg = LogisticRegression(penalty='l1', C=0.1, solver='liblinear')
          logreg.fit(train_X, train_Y)

      - The C parameter needs to be tuned to find the optimal value (C is the inverse of regularization strength, so smaller values penalize more)
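
The shrink-to-zero effect is easy to see on synthetic data. This is a minimal sketch, assuming a generated dataset with 20 features of which 5 are informative; the feature counts and C values are illustrative choices, not taken from the slides.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data: 20 features, only 5 of which carry signal.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    n_redundant=0, random_state=0,
)
# Put features on a common scale so the penalty treats them evenly.
X = StandardScaler().fit_transform(X)

nonzero_by_C = {}
for C in [1, 0.1, 0.01]:
    model = LogisticRegression(penalty="l1", C=C, solver="liblinear")
    model.fit(X, y)
    # Count the coefficients that survived the L1 penalty.
    nonzero_by_C[C] = int(np.count_nonzero(model.coef_))

print(nonzero_by_C)
```

Lower C values should leave fewer non-zero coefficients, which is the feature-selection behavior described on the slide.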

  19. Tuning L1 regularization

          import numpy as np
          import pandas as pd

          # Candidate C values, from weakest to strongest regularization
          C = [1, .5, .25, .1, .05, .025, .01, .005, .0025]
          l1_metrics = np.zeros((len(C), 5))
          l1_metrics[:,0] = C
          for index in range(0, len(C)):
              logreg = LogisticRegression(penalty='l1', C=C[index], solver='liblinear')
              logreg.fit(train_X, train_Y)
              pred_test_Y = logreg.predict(test_X)
              l1_metrics[index,1] = np.count_nonzero(logreg.coef_)
              l1_metrics[index,2] = accuracy_score(test_Y, pred_test_Y)
              l1_metrics[index,3] = precision_score(test_Y, pred_test_Y)
              l1_metrics[index,4] = recall_score(test_Y, pred_test_Y)
          col_names = ['C','Non-Zero Coeffs','Accuracy','Precision','Recall']
          print(pd.DataFrame(l1_metrics, columns=col_names))

  20. Choosing optimal C value

  21. Choosing optimal C value

  22. Let's run some logistic regression models!

  23. Predict churn with decision trees

  24. Introduction to decision trees

  25. Modeling steps
      1. Split data to training and testing
      2. Initialize the model
      3. Fit the model on the training data
      4. Predict values on the testing data
      5. Measure model performance on testing data

  26. Fitting the model
      Import the decision tree module

          from sklearn.tree import DecisionTreeClassifier

      Initialize the Decision Tree model

          mytree = DecisionTreeClassifier()

      Fit the model on the training data

          treemodel = mytree.fit(train_X, train_Y)

  27. Measuring model accuracy

          from sklearn.metrics import accuracy_score
          pred_train_Y = mytree.predict(train_X)
          pred_test_Y = mytree.predict(test_X)
          train_accuracy = accuracy_score(train_Y, pred_train_Y)
          test_accuracy = accuracy_score(test_Y, pred_test_Y)
          print('Training accuracy:', round(train_accuracy, 4))
          print('Test accuracy:', round(test_accuracy, 4))

          Training accuracy: 0.9973
          Test accuracy: 0.7196

  28. Measuring precision and recall

          from sklearn.metrics import precision_score, recall_score
          train_precision = round(precision_score(train_Y, pred_train_Y), 4)
          test_precision = round(precision_score(test_Y, pred_test_Y), 4)
          train_recall = round(recall_score(train_Y, pred_train_Y), 4)
          test_recall = round(recall_score(test_Y, pred_test_Y), 4)
          print('Training precision: {}, Training recall: {}'.format(train_precision, train_recall))
          print('Test precision: {}, Test recall: {}'.format(test_precision, test_recall))

      Output as shown on the slide. Its "Test precision" value, 0.9906, is the training recall, because the slide's print call passed train_recall in that position; the actual test precision is not shown.

          Training precision: 0.9993, Training recall: 0.9906
          Test precision: 0.9906, Test recall: 0.4878

  29. Tree depth parameter tuning

          import numpy as np
          import pandas as pd

          # Candidate maximum depths: 2 through 14
          depth_list = list(range(2,15))
          depth_tuning = np.zeros((len(depth_list), 4))
          depth_tuning[:,0] = depth_list
          for index in range(len(depth_list)):
              mytree = DecisionTreeClassifier(max_depth=depth_list[index])
              mytree.fit(train_X, train_Y)
              pred_test_Y = mytree.predict(test_X)
              depth_tuning[index,1] = accuracy_score(test_Y, pred_test_Y)
              depth_tuning[index,2] = precision_score(test_Y, pred_test_Y)
              depth_tuning[index,3] = recall_score(test_Y, pred_test_Y)
          col_names = ['Max_Depth','Accuracy','Precision','Recall']
          print(pd.DataFrame(depth_tuning, columns=col_names))
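
The loop above scores each depth on the test set, which then also serves as the selection set. A possible alternative, swapped in here rather than taken from the slides, is to pick the depth by cross-validation on the training data only. The synthetic data and the choice of recall as the selection metric are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic training data standing in for train_X / train_Y.
X_train, y_train = make_classification(
    n_samples=600, n_features=10, n_informative=4, random_state=0,
)

depth_list = list(range(2, 15))
cv_recall = []
for depth in depth_list:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    # 5-fold cross-validated recall, computed on the training data only.
    scores = cross_val_score(tree, X_train, y_train, cv=5, scoring="recall")
    cv_recall.append(scores.mean())

# Depth with the highest mean cross-validated recall.
best_depth = depth_list[int(np.argmax(cv_recall))]
print(best_depth)
```

With this approach the test set is touched only once, to report the performance of the chosen depth.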

  30. Choosing optimal depth

  31. Choosing optimal depth

  32. Let's build a decision tree!

  33. Identify and interpret churn drivers
