Validating logistic regression results (Anurag Gupta)



SLIDE 1

DataCamp Human Resources Analytics: Predicting Employee Churn in R

Validating logistic regression results

HUMAN RESOURCES ANALYTICS: PREDICTING EMPLOYEE CHURN IN R

Anurag Gupta

People Analytics Practitioner

SLIDE 2

Turnover probability distribution of test cases

SLIDE 3

Turn probabilities into categories by using a cut-off

SLIDE 4

Turn probabilities into categories by using a cut-off

# Classify predictions using a cut-off of 0.5
pred_cutoff_50_test <- ifelse(predictions_test > 0.5, 1, 0)
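As a minimal sketch of the cut-off step, the snippet below applies the same ifelse() rule to a few made-up probabilities. The values are hypothetical and only stand in for predictions_test, which in the course comes from the fitted model.

```r
# Hypothetical predicted probabilities (not from the course data)
predictions_test <- c(0.08, 0.35, 0.50, 0.51, 0.92)

# Classify as 1 (predicted to leave) only when the probability exceeds 0.5
pred_cutoff_50_test <- ifelse(predictions_test > 0.5, 1, 0)

print(pred_cutoff_50_test)

# Count how many cases fall into each predicted class
print(table(pred_cutoff_50_test))
```

Because the comparison is strict, a probability of exactly 0.5 is classified as 0.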

SLIDE 5

What is a confusion matrix?

A confusion matrix measures the performance of a classification model by cross-tabulating the predicted classes against the actual classes.

SLIDE 6

Creating a confusion matrix

## Creating confusion matrix
table(pred_cutoff_50_test, test_set$turnover)

prediction_categories   0   1
                    0 450  22
                    1  20  94

SLIDE 7

Understanding the confusion matrix

True negatives (TN): The model correctly identified active employees
True positives (TP): The model correctly identified inactive employees
False positives (FP): The model predicted employees as inactive, but they are actually active
False negatives (FN): The model predicted employees as active, but they are actually inactive
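The four cells can be read off the table directly. The sketch below rebuilds the matrix from the counts shown on slide 6, assuming rows are predictions, columns are actual values, and 1 means inactive (the employee left). The object name conf and the dimension labels are illustrative.

```r
# Counts from the slide; matrix() fills column by column
conf <- matrix(c(450, 20, 22, 94), nrow = 2,
               dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))

TN <- conf["0", "0"]  # predicted active, actually active
FP <- conf["1", "0"]  # predicted inactive, actually active
FN <- conf["0", "1"]  # predicted active, actually inactive
TP <- conf["1", "1"]  # predicted inactive, actually inactive

print(c(TN = TN, TP = TP, FP = FP, FN = FN))
```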

SLIDE 8

Confusion matrix: accuracy

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{94 + 450}{94 + 450 + 20 + 22} = 0.9283$$
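The same arithmetic can be checked in R. This is a small sketch using the four counts from the confusion matrix; the variable names are illustrative.

```r
# Cell counts taken from the confusion matrix on the slide
TP <- 94
TN <- 450
FP <- 20
FN <- 22

# Accuracy is the share of all test cases that were classified correctly
accuracy <- (TP + TN) / (TP + TN + FP + FN)
print(round(accuracy, 4))
```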

SLIDE 9

Creating a confusion matrix

# Load library
library(caret)
# Construct a confusion matrix
# (caret expects predictions in rows and actual values in columns)
conf_matrix_50 <- confusionMatrix(table(pred_cutoff_50_test, test_set$turnover))

SLIDE 10

Output of the confusion matrix

conf_matrix_50

Confusion Matrix and Statistics

prediction_categories   0   1
                    0 450  22
                    1  20  94

               Accuracy : 0.9283
                 95% CI : (0.9044, 0.9479)
    No Information Rate : 0.802
    P-Value [Acc > NIR] : <2e-16

                  Kappa : 0.7728
 Mcnemar's Test P-Value : 0.8774

            Sensitivity : 0.9574
            Specificity : 0.8103
         Pos Pred Value : 0.9534
         Neg Pred Value : 0.8246
             Prevalence : 0.8020
         Detection Rate : 0.7679
   Detection Prevalence : 0.8055
      Balanced Accuracy : 0.8839

       'Positive' Class : 0
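Several of these statistics can be reproduced by hand from the four counts, which helps when reading the output. The sketch below uses base R only and assumes, as the output states, that class 0 (active) is the 'positive' class. The object names are illustrative.

```r
# Counts from the output above: rows are predictions, columns are actual values
conf <- matrix(c(450, 20, 22, 94), nrow = 2,
               dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))

# With class 0 as 'positive', sensitivity is the share of actual 0s predicted as 0
sensitivity <- conf["0", "0"] / sum(conf[, "0"])
# Specificity is the share of actual 1s predicted as 1
specificity <- conf["1", "1"] / sum(conf[, "1"])
# Predictive values condition on the prediction instead of the actual class
pos_pred_value <- conf["0", "0"] / sum(conf["0", ])
neg_pred_value <- conf["1", "1"] / sum(conf["1", ])
balanced_accuracy <- (sensitivity + specificity) / 2

print(round(c(sensitivity = sensitivity,
              specificity = specificity,
              pos_pred_value = pos_pred_value,
              neg_pred_value = neg_pred_value,
              balanced_accuracy = balanced_accuracy), 4))
```

Since the positive class is 0, the specificity of 0.8103 is the figure that describes how many of the employees who actually left were caught by the model.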

SLIDE 11

Resources for advanced methods

Supervised Learning in R: Classification
Machine learning in the Tidyverse

SLIDE 12

Let's practice!

SLIDE 13

Designing retention strategy

SLIDE 14

Know who may leave

# Load dplyr (for %>% and filter) and tidypredict
library(dplyr)
library(tidypredict)
# Calculate probability of turnover
emp_risk <- emp_final %>%
  filter(status == "Active") %>%
  # Add predictions using the final model
  tidypredict_to_column(final_log)
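For readers without tidypredict installed, a comparable column of probabilities can be obtained with base R's predict(). The sketch below fits a toy logistic model on simulated data; the data frame toy, the model toy_model and the simulated coefficients are all invented for illustration and are not the course's emp_final or final_log.

```r
# Simulated data, invented for illustration only
set.seed(42)
toy <- data.frame(percent_hike = runif(200, min = 0, max = 20))
toy$turnover <- rbinom(200, size = 1, prob = plogis(2 - 0.3 * toy$percent_hike))

# Fit a logistic regression on the toy data
toy_model <- glm(turnover ~ percent_hike, family = "binomial", data = toy)

# type = "response" returns probabilities rather than log-odds
toy$fit <- predict(toy_model, newdata = toy, type = "response")

print(head(toy))
```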

SLIDE 15

Know who may leave

# Look at the employee's probability of turnover
emp_risk %>%
  select(emp_id, fit) %>%
  top_n(5, wt = fit)

# A tibble: 5 x 2
  emp_id       fit
  <chr>      <dbl>
  E202   0.9694593
  E6475  0.9814252
  E6574  0.9983320
  E7105  0.9193704
  E9878  0.9371767
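Note that the five rows above are not sorted by probability. A minimal base R sketch for ranking them, using only the five values printed on the slide (the data frame name emp_top is illustrative):

```r
# The five rows shown on the slide
emp_top <- data.frame(
  emp_id = c("E202", "E6475", "E6574", "E7105", "E9878"),
  fit = c(0.9694593, 0.9814252, 0.9983320, 0.9193704, 0.9371767),
  stringsAsFactors = FALSE
)

# Order from highest to lowest predicted probability of turnover
emp_sorted <- emp_top[order(emp_top$fit, decreasing = TRUE), ]

print(emp_sorted)
```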

SLIDE 16

Classification of employees in risk buckets

SLIDE 17

Classification of employees in risk buckets

SLIDE 18

Classification of employees in risk buckets

SLIDE 19

Classification of employees in risk buckets

SLIDE 20

Classification of employees in risk buckets

SLIDE 21

Classify employees into risk buckets in R

# Create turnover risk buckets
emp_risk_bucket <- emp_risk %>%
  mutate(risk_bucket = cut(fit, breaks = c(0, 0.5, 0.6, 0.8, 1),
                           labels = c("no-risk", "low-risk",
                                      "medium-risk", "high-risk")))
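To see where the bucket boundaries fall, the sketch below runs the same cut() call on a few hypothetical probabilities. By default cut() builds intervals that are open on the left and closed on the right, so 0.5 lands in "no-risk", 0.6 in "low-risk", and a probability of exactly 0 falls outside the first interval.

```r
# Hypothetical probabilities chosen to sit on and around the breaks
fit <- c(0.10, 0.50, 0.55, 0.60, 0.80, 0.95, 0)

risk_bucket <- cut(fit, breaks = c(0, 0.5, 0.6, 0.8, 1),
                   labels = c("no-risk", "low-risk",
                              "medium-risk", "high-risk"))

# The last value (0) comes back as NA because the first interval is (0, 0.5]
print(data.frame(fit, risk_bucket))
```

If a probability of exactly 0 can occur, adding include.lowest = TRUE to cut() is one way to keep it in the "no-risk" bucket.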

SLIDE 22

Retention strategy

HIGH RISK

Immediate action planning
Inform reporting manager
Hold one-on-one conversation

MEDIUM RISK

Medium-term action planning
Keep tracking for any behavioral change
Have one-on-one or open house discussion

SLIDE 23

Retention strategy

LOW RISK

Long-term action planning
Keep tracking for any behavioral change
Have open house discussion

NO RISK

No action required

SLIDE 24

Let's practice!

SLIDE 25

Return on investment calculation

SLIDE 26

Total cost of employee turnover

Costs to off-board employee
Cost-per-hire for replacement
Transition costs, including opportunity costs

SLIDE 27

Understand the cost implication of a high turnover rate

Turnover overview            Scenario 1     Scenario 2     Reduction
Total Turnover               300            200            33%
Average Cost of Turnover**   $40,000        $40,000        0%
Total Cost of Turnover       $12,000,000    $8,000,000     $4,000,000

**source
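The figures in the table follow from multiplying the number of leavers by the average cost per leaver. A short sketch of that arithmetic, using only the numbers shown above (the variable names are illustrative):

```r
# Figures taken from the table on the slide
avg_cost <- 40000
turnover <- c(scenario_1 = 300, scenario_2 = 200)

# Total cost of turnover per scenario
total_cost <- turnover * avg_cost

# Absolute saving and percentage reduction in turnover between scenarios
saving <- total_cost[["scenario_1"]] - total_cost[["scenario_2"]]
pct_reduction <- 100 * (turnover[["scenario_1"]] - turnover[["scenario_2"]]) /
  turnover[["scenario_1"]]

print(total_cost)
print(saving)
print(round(pct_reduction))
```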

SLIDE 28

Calculating ROI

$$\text{ROI} = \frac{\text{Program Benefits}}{\text{Program Cost}}$$
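A worked example of this ratio, as a sketch only: the benefit reuses the $4,000,000 saving from the cost table on slide 27, which is an assumption about what counts as the program benefit, and the program cost is a hypothetical figure invented for illustration.

```r
# Assumed benefit: the saving between the two scenarios on slide 27
program_benefits <- 12000000 - 8000000

# Hypothetical program cost, invented purely for illustration
program_cost <- 1000000

# ROI expressed as benefits per unit of cost
roi <- program_benefits / program_cost
print(roi)
```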

             Estimate Std. Error z value Pr(>|z|)
percent_hike -0.59500    0.08134  -7.315 2.57e-13 ***
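A logistic regression coefficient is on the log-odds scale, so exponentiating it gives an odds ratio. The sketch below uses only the two numbers from the coefficient row above; the interpretation assumes the other predictors in the model are held fixed.

```r
# Estimate and standard error from the coefficient row on the slide
estimate <- -0.59500
std_error <- 0.08134

# Odds ratio: multiplicative change in the odds of turnover
# for a one-unit increase in percent_hike
odds_ratio <- exp(estimate)

# The z value is the estimate divided by its standard error
z_value <- estimate / std_error

print(round(odds_ratio, 4))
print(round(z_value, 3))
```

An odds ratio of roughly 0.55 means each additional unit of percent_hike is associated with the odds of turnover falling by about 45%.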

SLIDE 29

Turnover rate across salary hike range

SLIDE 30

Let's practice!

SLIDE 31

Course Wrap-up

SLIDE 32

Course Wrap-up

What is employee turnover?
HR data sources
Derive new variables and variable importance
Explore and validate
Predict probability of turnover
Design retention strategies

SLIDE 33

Go implement employee turnover prediction in your organization!