Learned about: LSH/Similarity search & recommender systems


  1. - Learned about: LSH/Similarity search & recommender systems
     - Search: “jaguar”
     - Uncertainty about the user’s information need
       - Don’t put all eggs in one basket!
     - Relevance isn’t everything – need diversity!

     5/28/20 Tim Althoff, UW CS547: Machine Learning for Big Data, http://www.cs.washington.edu/cse547

  2. - Recommendation:
     - Summarization: “Robert Downey Jr.”
     - News Media:

  3. [Althoff et al., KDD 2015]

     [Figure: timeline for Robert Downey Jr. (born 1965) spanning 1985 to 2015; entries include Deborah Falconer, The Party's Over, Ben Stiller, Fiona Apple, Susan Downey, Iron Man 2, Iron Man 3, Robert Downey, Sr., Paramount Pictures, Chaplin, Ally McBeal, Gothika, Iron Man, The Avengers]
     - Goal: Timeline should express his relationships to other people through events (personal, collaboration, mentorship, etc.)
     - Why timelines?
       - Easier: Wikipedia article is 18 pages long
       - Context: Through relationships & event descriptions
       - Exploration: Can “jump” to other people

  4. - Given:
       - Relevant relationships
       - Events that each cover some relationships
     - Goal: Given a large set of events, pick a small subset that explains most known relationships (“the timeline”)

  5. Demo available at: http://cs.stanford.edu/~althoff/timemachine/demo.html

     [Figure: the Robert Downey Jr. timeline from slide 3]

  6. - User studies: People hate redundancy!
     - Want to see a more diverse set of relationships

     [Figure: two example timelines compared; event labels include Chaplin Academy Award ceremony, Rented Lips, Iron Man US Release, Iron Man EU Release]

  7. - Idea: Encode diversity as a coverage problem
     - Example: Selecting events for the timeline
       - Try to cover all important relationships

  8. - Q: What is being covered?
     - A: Relationships (e.g., to Captain America, Anthony Hopkins, Gwyneth Paltrow, Susan Downey)
     - Q: Who is doing the covering?
     - A: Events (e.g., “Downey Jr. starred in Chaplin together with Anthony Hopkins”)

  9. - Suppose we are given a set of events $E$
       - Each event $e$ covers a set $X_e \subseteq U$ of relationships
     - For a set of events $S \subseteq E$ we define:
       $$F(S) = \left| \bigcup_{e \in S} X_e \right|$$
     - Goal: We want to solve $\max_{|S| \le k} F(S)$ (cardinality constraint)
     - Note: $F$ is a set function: $F : 2^E \to \mathbb{N}$
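
The coverage function defined on this slide can be sketched in a few lines of Python. This is only an illustration: the event names `e1`..`e3`, the relationship names `r1`..`r4`, and the helper name `coverage` are hypothetical and do not come from the lecture.

```python
from typing import Dict, Hashable, Iterable, Set


def coverage(selected: Iterable[Hashable],
             covers: Dict[Hashable, Set[Hashable]]) -> int:
    """F(S): number of distinct relationships covered by the events in S."""
    covered: Set[Hashable] = set()
    for event in selected:
        covered |= covers[event]  # union of the X_e
    return len(covered)


# Hypothetical toy data: covers[e] plays the role of X_e.
covers = {
    "e1": {"r1", "r2"},
    "e2": {"r2", "r3"},
    "e3": {"r4"},
}

print(coverage(["e1", "e2"], covers))  # r2 is shared, so it counts once
```

Because `r2` appears in both `e1` and `e2`, selecting both events covers three relationships rather than four, which is exactly the redundancy the objective is meant to penalize.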

  10. - Given a universe of elements $U = \{u_1, \dots, u_n\}$ and sets $X_1, \dots, X_m \subseteq U$
        - $U$: all relationships
        - $X_i$: relationships covered by event $i$
      - Goal: Find a set of $k$ events $X_1, \dots, X_k$ covering most of $U$
        - More precisely: Find $k$ events $X_1, \dots, X_k$ whose union has the largest size

      [Figure: sets $X_1$, $X_2$, $X_3$, $X_4$ drawn inside the universe $U$]

  11. Simple Heuristic: Greedy Algorithm:
      - Start with $S_0 = \{\}$
      - For $i = 1 \dots k$
        - Take the event $e$ that maximizes $F(S_{i-1} \cup \{e\})$
        - Let $S_i = S_{i-1} \cup \{e\}$
      - Here $F(S) = \left| \bigcup_{e \in S} X_e \right|$
      - Example:
        - Evaluate $F(\{e_1\}), \dots, F(\{e_m\})$, pick best (say $e_1$)
        - Evaluate $F(\{e_1\} \cup \{e_2\}), \dots, F(\{e_1\} \cup \{e_m\})$, pick best (say $e_2$)
        - Evaluate $F(\{e_1, e_2\} \cup \{e_3\}), \dots, F(\{e_1, e_2\} \cup \{e_m\})$, pick best
        - And so on…
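
A minimal sketch of the greedy loop described on this slide, assuming events are stored as a dictionary from event name to the set of relationships it covers. The function name `greedy_max_coverage` and the toy data are hypothetical. Maximizing $F(S_{i-1} \cup \{e\})$ is the same as maximizing the number of newly covered elements, which is what the inner loop computes.

```python
from typing import Dict, Hashable, List, Set, Tuple


def greedy_max_coverage(covers: Dict[Hashable, Set[Hashable]],
                        k: int) -> Tuple[List[Hashable], Set[Hashable]]:
    """Pick up to k events, each time taking the largest marginal gain."""
    selected: List[Hashable] = []
    covered: Set[Hashable] = set()
    for _ in range(k):
        best, best_gain = None, 0
        for event, x_e in covers.items():
            if event in selected:
                continue
            gain = len(x_e - covered)  # elements this event would add
            if gain > best_gain:
                best, best_gain = event, gain
        if best is None:  # no remaining event adds anything new
            break
        selected.append(best)
        covered |= covers[best]
    return selected, covered


# Hypothetical toy instance.
covers = {
    "e1": {1, 2, 3, 4},
    "e2": {3, 4, 5},
    "e3": {5, 6},
    "e4": {1},
}
selected, covered = greedy_max_coverage(covers, k=2)
print(selected, sorted(covered))  # ['e1', 'e3'] [1, 2, 3, 4, 5, 6]
```

In the second round `e2` is larger than `e3`, but it adds only one new element while `e3` adds two, so greedy prefers `e3`.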

  12. - Goal: Maximize the covered area

  17. [Figure: three overlapping sets A, B, C]
      - Goal: Maximize the size of the covered area with two sets
      - Greedy first picks A and then C
      - But the optimal way would be to pick B and C
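
The slide's figure is not recoverable, but the same behavior can be reproduced with a small made-up instance. The element values below are an assumption chosen so that A is the largest set while overlapping both B and C; they are not the sets drawn in the lecture.

```python
from itertools import combinations

# Hypothetical sets: A is largest but overlaps both B and C.
sets = {
    "A": {2, 3, 4, 5, 6, 7},
    "B": {1, 2, 3, 4},
    "C": {5, 6, 7, 8, 9},
}


def covered_size(names):
    """Size of the union of the named sets."""
    covered = set()
    for name in names:
        covered |= sets[name]
    return len(covered)


def greedy(k):
    """Repeatedly add the set that maximizes the covered size."""
    chosen = []
    for _ in range(k):
        remaining = [n for n in sets if n not in chosen]
        chosen.append(max(remaining, key=lambda n: covered_size(chosen + [n])))
    return chosen


greedy_pick = greedy(2)
# Brute force over all pairs gives the true optimum on this tiny instance.
optimal_pick = max(combinations(sets, 2), key=covered_size)

print(greedy_pick, covered_size(greedy_pick))            # ['A', 'C'] 8
print(sorted(optimal_pick), covered_size(optimal_pick))  # ['B', 'C'] 9
```

Greedy commits to A because it is the largest single set, and after that neither B nor C can recover the elements that the disjoint pair B and C would have covered together.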

  18. - Bad news: Maximum Coverage is NP-hard
      - Good news: Good approximations exist
        - The problem has enough structure that even simple greedy algorithms perform reasonably well
        - Details in the 2nd half of the lecture
      - Now: Generalize our objective for timeline generation

  19. - Objective values all relationships equally:
        $$F(S) = \left| \bigcup_{e \in S} X_e \right| = \sum_{r \in R} 1 \quad \text{where } R = \bigcup_{e \in S} X_e$$
      - Unrealistic: Some relationships are more important than others
        - Use different weights (“weighted coverage function”):
          $$F(S) = \sum_{r \in R} w(r), \qquad w : R \to \mathbb{R}^{+}$$
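
The weighted coverage function can be sketched as follows. The names `weighted_coverage`, `covers`, and `weight`, and all numeric weights, are hypothetical and only serve to illustrate the formula.

```python
from typing import Dict, Hashable, Iterable, Set


def weighted_coverage(selected: Iterable[Hashable],
                      covers: Dict[Hashable, Set[Hashable]],
                      weight: Dict[Hashable, float]) -> float:
    """F(S) = sum of w(r) over all relationships r covered by S."""
    covered: Set[Hashable] = set()
    for event in selected:
        covered |= covers[event]
    return sum(weight[r] for r in covered)


# Hypothetical toy data.
covers = {"e1": {"r1", "r2"}, "e2": {"r2", "r3"}}
weight = {"r1": 0.5, "r2": 2.0, "r3": 1.0}

print(weighted_coverage(["e1"], covers, weight))        # 2.5
print(weighted_coverage(["e1", "e2"], covers, weight))  # 3.5
```

Setting every weight to 1 recovers the unweighted objective, since the sum then simply counts the covered relationships.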

  20. - Use global importance weights
        - How much interest is there?
        - Could be measured as
          - w(X) = # search queries for person X
          - w(X) = # Wikipedia article views for X
          - w(X) = # news article mentions for X

      [Figure: Captain America, Anthony Hopkins, Gwyneth Paltrow, Susan Downey]

  21. [Figure: Captain America, Susan Downey, Justin Bieber, and Tim Althoff, before and after applying global importance weights]
      - Some relationships are not (very) globally important but highly relevant to the timeline, and vice versa
      - Need relevance to the timeline instead of global relevance: w(Susan Downey | RDJr) > w(Justin Bieber | RDJr)

  22. - Can use co-occurrence statistics:
        $$w(X \mid \text{RDJr}) = \frac{\#(X \text{ and RDJr})}{\#(\text{RDJr}) \cdot \#(X)}$$
        - Similar: Pointwise mutual information (PMI)
        - How often do X and Y occur together compared to what you would expect if they were independent?
        - Accounts for popular entities (e.g., Justin Bieber)
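
A small sketch of the co-occurrence weight and its relation to PMI. All counts below are invented for illustration, and the total document count `n_total` is an extra assumption needed only to turn counts into probabilities for PMI.

```python
import math


def cooccurrence_weight(n_xy: int, n_x: int, n_y: int) -> float:
    """w(X | Y) = #(X and Y) / (#(Y) * #(X)), as on the slide."""
    return n_xy / (n_x * n_y)


def pmi(n_xy: int, n_x: int, n_y: int, n_total: int) -> float:
    """Pointwise mutual information: log of p(x, y) / (p(x) * p(y))."""
    p_xy = n_xy / n_total
    p_x = n_x / n_total
    p_y = n_y / n_total
    return math.log(p_xy / (p_x * p_y))


# Hypothetical counts (not real statistics).
n_total = 1_000_000   # assumed corpus size
n_rdj = 1_000         # mentions of RDJr
n_susan, n_susan_rdj = 200, 150
n_bieber, n_bieber_rdj = 50_000, 300

w_susan = cooccurrence_weight(n_susan_rdj, n_susan, n_rdj)
w_bieber = cooccurrence_weight(n_bieber_rdj, n_bieber, n_rdj)

print(w_susan > w_bieber)  # True, despite the lower raw co-occurrence count
print(pmi(n_susan_rdj, n_susan, n_rdj, n_total) >
      pmi(n_bieber_rdj, n_bieber, n_rdj, n_total))  # True, same ordering
```

The weight equals $\exp(\mathrm{PMI}) / N$ for corpus size $N$, so the two measures rank entities identically; dividing by #(X) is what keeps a very popular entity from dominating.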

  23. - How to differentiate between two events that cover the same relationships?
      - Example: Robert and Susan Downey
        - Event 1: Wedding, August 27, 2005
        - Event 2: Minor charity event, Nov 11, 2006
      - We need to be able to distinguish these!

  24. - Further improvement when we score not only the relationships but also the event timestamps:
        $$F(S) = \underbrace{\sum_{r \in R} w_R(r)}_{\text{relationships (as before)}} + \underbrace{\sum_{e \in S} w_T(t_e)}_{\text{timestamps}} \quad \text{where } R = \bigcup_{e \in S} X_e$$
      - Again, use co-occurrences for the weights $w_T$
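
The combined objective can be sketched with the wedding and charity events from the previous slide. The weight values and the names `timeline_objective`, `w_rel`, and `w_time` are assumptions made for illustration; only the two event dates come from the slides.

```python
from typing import Dict, Iterable, Set


def timeline_objective(selected: Iterable[str],
                       covers: Dict[str, Set[str]],
                       timestamp: Dict[str, str],
                       w_rel: Dict[str, float],
                       w_time: Dict[str, float]) -> float:
    """F(S) = sum of w_R(r) over covered r, plus sum of w_T(t_e) over e in S."""
    selected = list(selected)
    covered: Set[str] = set()
    for event in selected:
        covered |= covers[event]
    relationship_score = sum(w_rel[r] for r in covered)
    timestamp_score = sum(w_time[timestamp[e]] for e in selected)
    return relationship_score + timestamp_score


# Both events cover the same relationship but have different dates.
covers = {"wedding": {"Susan Downey"}, "charity": {"Susan Downey"}}
timestamp = {"wedding": "2005-08-27", "charity": "2006-11-11"}
w_rel = {"Susan Downey": 1.0}                       # hypothetical weight
w_time = {"2005-08-27": 0.75, "2006-11-11": 0.25}   # hypothetical weights

print(timeline_objective(["wedding"], covers, timestamp, w_rel, w_time))  # 1.75
print(timeline_objective(["charity"], covers, timestamp, w_rel, w_time))  # 1.25
```

With relationship weights alone the two events would tie; the timestamp term is what lets the objective prefer the wedding.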

  25. [Figure: example web page from marvel.com]
      - “Robert Downey Jr” and “May 4, 2012” occur together 173 times on 71 different webpages
      - May 4, 2012 is the US release date of The Avengers
      - Use MapReduce on 10B web pages (10k+ machines)
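
The counting step can be imitated on a single machine with a map phase and a reduce phase. This is a toy sketch, not the pipeline used in the lecture: it assumes entity and date mentions have already been extracted per page, and the function names and sample pages are hypothetical.

```python
from collections import defaultdict
from itertools import product


def map_page(page_id, entities, dates):
    """Map phase: emit one record per (entity, date) pair mentioned on a page."""
    for entity, date in product(entities, dates):
        yield (entity, date), page_id


def reduce_pairs(mapped):
    """Reduce phase: per pair, count occurrences and distinct pages."""
    occurrences = defaultdict(int)
    pages = defaultdict(set)
    for key, page_id in mapped:
        occurrences[key] += 1
        pages[key].add(page_id)
    return {key: (occurrences[key], len(pages[key])) for key in occurrences}


# Hypothetical pre-extracted mentions: (page id, entities, dates).
pages = [
    ("p1", ["Robert Downey Jr"], ["May 4, 2012", "May 4, 2012"]),
    ("p2", ["Robert Downey Jr"], ["May 4, 2012"]),
    ("p3", ["Robert Downey Jr"], ["Aug 27, 2005"]),
]

mapped = (record for page in pages for record in map_page(*page))
counts = reduce_pairs(mapped)
print(counts[("Robert Downey Jr", "May 4, 2012")])  # (3, 2)
```

The result mirrors the two numbers on the slide: total co-occurrences and the number of distinct pages on which the pair appears.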

  26. - Generalized the earlier coverage function to a linear combination of weighted coverage functions:
        $$F(S) = \sum_{r \in R} w_R(r) + \sum_{e \in S} w_T(t_e) \quad \text{where } R = \bigcup_{e \in S} X_e$$
      - Goal: $\max_{|S| \le k} F(S)$
      - Still NP-hard (because it generalizes an NP-hard problem)

  27. - How can we actually optimize this function?
      - What structure is there that will help us do this efficiently?
      - Any questions so far?

  28. - For this optimization problem, Greedy produces a solution $S$ s.t. $F(S) \ge (1 - 1/e) \cdot \mathrm{OPT}$, i.e., $F(S) \ge 0.63 \cdot \mathrm{OPT}$ [Nemhauser, Fisher, Wolsey ’78]
      - Claim holds for functions $F(\cdot)$ which are:
        - Submodular, Monotone, Normal, Non-negative (discussed next)
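
The bound can be checked empirically on small random coverage instances, where the optimum is still computable by brute force. This is a sanity check under assumed parameters (6 events, 12 elements, k = 3, a fixed random seed), not a proof, and all names are hypothetical.

```python
import itertools
import math
import random


def coverage(selected, covers):
    """F(S): size of the union of the selected sets."""
    covered = set()
    for event in selected:
        covered |= covers[event]
    return len(covered)


def greedy(covers, k):
    """Add, k times, the event that maximizes F(S + [e])."""
    selected = []
    for _ in range(k):
        candidates = [e for e in covers if e not in selected]
        if not candidates:
            break
        best = max(candidates, key=lambda e: coverage(selected + [e], covers))
        selected.append(best)
    return selected


def brute_force_opt(covers, k):
    """Exact optimum by enumerating all subsets of size k."""
    return max(coverage(s, covers) for s in itertools.combinations(covers, k))


rng = random.Random(0)  # fixed seed so the run is repeatable
ratios = []
for _ in range(200):
    covers = {i: set(rng.sample(range(12), rng.randint(1, 5))) for i in range(6)}
    k = 3
    ratios.append(coverage(greedy(covers, k), covers) / brute_force_opt(covers, k))

worst = min(ratios)
print(worst >= 1 - 1 / math.e)  # True: the guarantee holds on every instance
```

On instances this small greedy is usually much closer to optimal than the worst-case factor of about 0.63 suggests.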
