
Market basket introduction: MARKET BASKET ANALYSIS IN R - PowerPoint PPT Presentation

Market basket introduction. MARKET BASKET ANALYSIS IN R. Christopher Bruffaerts, Statistician. Overview of the Market Basket course: Chapter 1: Introduction to market basket analysis; Chapter 2: Metrics and techniques in market basket analysis


  1. Market basket introduction. MARKET BASKET ANALYSIS IN R. Christopher Bruffaerts, Statistician

  2. Overview: Market Basket course
     Chapter 1: Introduction to market basket analysis
     Chapter 2: Metrics and techniques in market basket analysis
     Chapter 3: Visualization in market basket analysis
     Chapter 4: Case study: Movie recommendations @ movieLens

  3. What is a basket?
     Basket = collection of items
     Examples of baskets:
       1. Your basket @ the grocery store
       2. Your Amazon shopping cart
       3. Your courses @ DataCamp
       4. The movies you watched on Netflix
     Items:
       1. Products at the supermarket
       2. Products on an online website
       3. DataCamp courses
       4. Movies watched by users

  4. Grocery store example
     What's in the store?
     What are you up for today? One bread, three pieces of cheese

  5. Grocery store example in R
     What's in the store?

         store = c("Bread", "Butter", "Cheese", "Wine")
         set.seed(1234)
         n_items = 4
         my_basket = data.frame(
           TID = rep(1, n_items),
           Product = sample(store, n_items, replace = TRUE))

     R output:

         my_basket
           TID Product
         1   1   Bread
         2   1  Cheese
         3   1  Cheese
         4   1  Cheese

  6. What's in my basket?
     My original basket: one record per item purchased

           TID Product
         1   1   Bread
         2   1  Cheese
         3   1  Cheese
         4   1  Cheese

     My adjusted basket: one record per distinct item purchased

         # A tibble: 2 x 3
             TID Product Quantity
           <dbl> <fct>      <int>
         1     1 Bread          1
         2     1 Cheese         3

  7. What's in my R basket?
     Reshaping the basket data

         library(dplyr)

         # Adjusting my basket: one row per distinct item, with its count
         my_basket = my_basket %>%
           add_count(Product) %>%
           unique() %>%
           rename(Quantity = n)

         # Number of distinct items
         n_distinct(my_basket$Product)
         2

         # Total basket size
         my_basket %>% summarize(sum(Quantity))
         4

  8. Visualizing items in my basket

         library(ggplot2)

         # Plotting items, ordered by quantity
         ggplot(my_basket,
                aes(x = reorder(Product, Quantity), y = Quantity)) +
           geom_col() +
           coord_flip() +
           xlab("Items") +
           ggtitle("Summary of items in my basket")

  9. Why are we looking at my basket?
     Question: Is there any relationship between items within a basket?
     Back to examples:
       1. Your basket @ the grocery store, e.g. Spaghetti and Tomato sauce
       2. Your Amazon shopping cart, e.g. Phone and a phone case
       3. Your courses @ DataCamp, e.g. "Introduction to R" and "Intermediate R"

  10. Happy shopping!

  11. Item combinations. Christopher Bruffaerts, Statistician

  12. Back to the grocery store
      What's in the store?
      What are you up for today? {"Bread", "Cheese", "Cheese", "Cheese"}
      Focus of market basket analysis: {"Bread", "Cheese"}
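Market basket analysis looks at which distinct items are present, not at how many of each were bought. A minimal sketch of that reduction in base R (the variable name `itemset` is introduced here for illustration only):

```r
# Basket as purchased: one entry per item, repeats included
my_basket <- c("Bread", "Cheese", "Cheese", "Cheese")

# Keep each distinct item once; quantities are ignored
itemset <- unique(my_basket)
itemset
```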

  13. Subsets and supersets
      My store - set:
        X = {"Bread", "Butter", "Cheese", "Wine"}
      Subsets of X - itemsets:
        Size 0: ∅
        Size 1: {"Bread"}, {"Wine"}, ...
        Size 2: {"Bread", "Wine"}, ...
      Supersets:
        {"Bread", "Butter"} is a superset of {"Bread"}
        {"Bread", "Butter", "Cheese", "Wine"} is a superset of {"Bread", "Butter"}
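The subset and superset relations can be checked in base R with `%in%`. This is a small sketch; the helper `is_subset()` is a hypothetical name introduced for illustration and is not part of the course material.

```r
# Store from the slides
X <- c("Bread", "Butter", "Cheese", "Wine")

# Hypothetical helper: TRUE if every item of `a` is also in `b`
is_subset <- function(a, b) all(a %in% b)

is_subset(c("Bread"), c("Bread", "Butter"))  # {"Bread"} vs. its superset
is_subset(c("Bread", "Butter"), X)           # X is a superset of {"Bread", "Butter"}
is_subset(character(0), X)                   # the empty set is a subset of any set
is_subset(c("Bread", "Milk"), X)             # "Milk" is not in the store
```

Reading `is_subset(a, b)` as "b is a superset of a" covers both directions with one function.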

  14. Itemset graph
      Question: What is the set of all possible subsets of X?
      X = {A, B, C, D}
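One way to answer the question is to list every subset explicitly. The sketch below uses base R's `combn()` for sizes 1 to 4 and adds the empty set by hand; the names `all_subsets` and `labels` are illustrative assumptions, not taken from the slides.

```r
X <- c("A", "B", "C", "D")

# Subsets of size 1..n via combn(); the empty set is added explicitly
non_empty <- unlist(
  lapply(seq_along(X), function(k) combn(X, k, simplify = FALSE)),
  recursive = FALSE
)
all_subsets <- c(list(character(0)), non_empty)

# Number of subsets in total, and per subset size
length(all_subsets)
table(lengths(all_subsets))

# Readable labels such as "{A, B}"
labels <- sapply(all_subsets,
                 function(s) paste0("{", paste(s, collapse = ", "), "}"))
labels
```

Each level of the itemset graph corresponds to one subset size in the table.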

  15. Intersections and unions
      Intersection:
        {"Bread"} ∩ {"Butter"} = ∅
        {"Bread", "Butter"} ∩ {"Butter", "Wine"} = {"Butter"}
      Union:
        {"Bread"} ∪ {"Butter"} = {"Bread", "Butter"}

          library(dplyr)
          A = c("Bread", "Butter")
          B = c("Bread", "Wine")

          intersect(A, B)
          [1] "Bread"

          union(A, B)
          [1] "Bread"  "Butter" "Wine"

  16. How many baskets of size k?
      Question: How many possible subsets of size k are there from a set of size n?
      "n choose k":
        $\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$, where $n! = n \times (n-1) \times (n-2) \times \dots \times 2 \times 1$
      Example: Number of baskets with 2 distinct items from the store:
        $\binom{4}{2} = \frac{4!}{(4-2)!\,2!} = 6$
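The factorial formula can be written out directly in R and compared with the built-in `choose()`. The function name `n_choose_k()` is a hypothetical helper used only for this sketch.

```r
# "n choose k" spelled out with factorials, as in the formula
n_choose_k <- function(n, k) factorial(n) / (factorial(n - k) * factorial(k))

n_choose_k(4, 2)  # baskets with 2 distinct items from a store of 4
choose(4, 2)      # built-in equivalent
```

In practice `choose()` is the safer option, since `factorial()` overflows to `Inf` for large `n`.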

  17. How many possible baskets?
      Question: How many possible baskets can be created from a set of size n?
      Newton's binomial theorem:
        $\sum_{k=0}^{n} \binom{n}{k} = 2^n$
      Example: Total number of baskets: $2^4 = 16$

          2^(n_items)
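The total follows from Newton's binomial theorem by setting both terms to 1. A short derivation, written as a sketch with the same symbols n and k:

```latex
(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{k} y^{n-k}
\quad\Longrightarrow\quad
\sum_{k=0}^{n} \binom{n}{k} = (1 + 1)^n = 2^n
\qquad (x = y = 1)
```

For the store with n = 4, the counts per basket size are 1 + 4 + 6 + 4 + 1 = 16.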

  18. How many baskets in R?
      Combinations in R

          n_items = 4
          basket_size = 2
          choose(n_items, basket_size)
          [1] 6

          # Looping through all possible values
          store = matrix(NA, nrow=5, ncol=2)
          for (i in 0:n_items){
            store[i+1,] = c(i, choose(n_items,i))}

      Output

          colnames(store) = c("size", "nb_combi")
          store
               size nb_combi
          [1,]    0        1
          [2,]    1        4
          [3,]    2        6
          [4,]    3        4
          [5,]    4        1

  19. Plotting number of combinations
      Get an idea of how fast the number of combinations grows

          n_items = 50
          fun_nk = function(x) choose(n_items, x)

          # Plotting
          ggplot(data = data.frame(x = 0), mapping = aes(x = x)) +
            stat_function(fun = fun_nk) +
            xlim(0, n_items) +
            xlab("Subset size") +
            ylab("Number of subsets")

  20. Are you ready to count?

  21. What is market basket analysis? Christopher Bruffaerts, Statistician

  22. Multiple baskets @ grocery store
      What's in the store?
      If 100 customers visit the grocery store, can we find associations of items that occur together?
      Example: Bread and Cheese
      Outcome: “if this, then that”
      Multiple baskets:
        Basket 1: {"Bread", "Cheese"}
        Basket 2: {"Bread", "Wine", "Cheese"}

  23. Market basket applications
      Learning from multiple baskets
      Different applications:
        E-commerce: “customers who bought this also bought this”
        Retail: items which are “bundled or placed together”
        Social media: friends and connections recommendation
        Videos and movies recommendation

  24. Multiple baskets in R
      Create a dataset containing multiple baskets!

          my_baskets = data.frame(
            "Basket" = c(1,1,1,1, 2,2,2, 3,3, 4,4,4, 5,5, 6,6, 7,7),
            "Product" = c("Bread", "Cheese", "Cheese", "Cheese",
                          "Bread", "Butter", "Wine",
                          "Butter", "Butter",
                          "Butter", "Wine", "Wine",
                          "Butter", "Cheese",
                          "Cheese", "Wine",
                          "Wine", "Wine")
          )

      A glimpse at my baskets

          head(my_baskets)
            Basket Product
          1      1   Bread
          2      1  Cheese
          3      1  Cheese
          4      1  Cheese
          5      2   Bread
          6      2  Butter

  25. What's in our baskets?
      Questions
      How many distinct items are there?

          n_distinct(my_baskets$Product)
          [1] 4

      How many baskets are there?

          n_distinct(my_baskets$Basket)
          [1] 7

      How many items are there in each basket?

          df_basket =
            my_baskets %>%
            group_by(Basket) %>%
            summarize(
              n_total = n(),
              n_items = n_distinct(Product))

      df_basket (first rows):

            Basket n_total n_items
             <dbl>   <int>   <int>
          1      1       4       2
          2      2       3       3

  26. How big are baskets?
      Average basket sizes

          # df_basket is the per-basket summary built on the previous slide
          df_basket %>%
            summarize(
              avg_total_items = mean(n_total),
              avg_dist_items = mean(n_items))

          # A tibble: 1 x 2
            avg_total_items avg_dist_items
                      <dbl>          <dbl>
          1            2.57           1.86

      Distribution of basket size

          # Distribution of distinct items
          ggplot(df_basket, aes(n_items)) +
            geom_bar()

  27. Specific products in the baskets
      Which item are you looking at?
        How many times does an item appear across all baskets?
        How many baskets contain that item?
      Example: Filtering for Cheese in R

          # Number of baskets containing Cheese
          my_baskets %>%
            filter(Product == "Cheese") %>%
            summarize(
              n_tot_items = n(),
              n_basket_item = n_distinct(Basket))

            n_tot_items n_basket_item
          1           5             3

  28. Association rule mining
      Association rule mining: finding frequent co-occurring associations among a collection of items.
      Example of rule extraction:
        {Bread} → {Butter}
        {Bread, Cheese} → {Wine}
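Before any rule is extracted, "co-occurring" can be made concrete by counting how many baskets contain each pair of items. The sketch below does this in base R on the seven baskets from the slides. It is only a counting illustration, not the rule-mining technique used later in the course, and the names `incidence` and `co_occurrence` are assumptions introduced here.

```r
# Basket data from the slides
my_baskets <- data.frame(
  Basket = c(1,1,1,1, 2,2,2, 3,3, 4,4,4, 5,5, 6,6, 7,7),
  Product = c("Bread", "Cheese", "Cheese", "Cheese",
              "Bread", "Butter", "Wine",
              "Butter", "Butter",
              "Butter", "Wine", "Wine",
              "Butter", "Cheese",
              "Cheese", "Wine",
              "Wine", "Wine")
)

# Basket-by-item incidence matrix: 1 if the basket contains the item,
# regardless of quantity
counts <- unclass(table(my_baskets$Basket, my_baskets$Product))
incidence <- (counts > 0) * 1

# Item-by-item co-occurrence: entry [i, j] is the number of baskets
# containing both item i and item j (the diagonal counts baskets per item)
co_occurrence <- crossprod(incidence)
co_occurrence
```

The diagonal entry for Cheese matches the number of baskets containing Cheese found with the `filter()` example, and off-diagonal entries such as `co_occurrence["Butter", "Wine"]` show which pairs are candidates for a rule.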

  29. So what's coming next?
      Agenda for the rest of the course:
        Chapter 2: Metrics & techniques in market basket analysis
        Chapter 3: Visualization in market basket analysis
        Chapter 4: Case study: Movie recommendations @ movieLens

  30. Let's play with baskets!
