Towards using Cached Data Mining for Large Scale Recommender Systems
Swapneel Sheth, Gail Kaiser Department of Computer Science, Columbia University New York, NY 10027 {swapneel, kaiser}@cs.columbia.edu
1
Recommender systems are increasingly commonplace - Pandora, Amazon, Facebook
such as algorithms [10, 11] and social network implications [12, 13]
the performance of recommender systems
2
its user base will grow
with
efficiently from a large set of data
efficiently to a diverse set of users
3
4
discussing caches for recommendation systems
Caches
queries to a given query
5
with a diverse user base that requires different kinds of recommendations
might perform worse than having no cache
Cache so all recommendations (and not just neighborhood ones) can be answered by the cache
6
computational biology and bioinformatics to collaborate by sharing data and knowledge
metaphors for collaborative work
based system for integrated genomics targeted toward biomedical researchers
7
genomics data analysis and visualizations
which tools to use, the order of using the tools, etc.
the most frequently occurring workflows including a given tool or starting with the sequence of tools the user has already executed
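The frequency-based recommendation described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a workflow is an ordered list of tool names and that past workflows are held in memory. The class `WorkflowRecommender` and its methods `containing` and `startingWith` are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: recommends the most frequently occurring past workflows.
class WorkflowRecommender {

    // Counts identical workflows that pass the filter and returns the k most frequent.
    static List<List<String>> topK(List<List<String>> history,
                                   Predicate<List<String>> filter, int k) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (List<String> workflow : history) {
            if (filter.test(workflow)) {
                counts.merge(workflow, 1, Integer::sum);
            }
        }
        List<Map.Entry<List<String>, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Descending by count; the sort is stable, so ties keep first-seen order.
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    // Most frequent workflows that include the given tool anywhere.
    static List<List<String>> containing(List<List<String>> history, String tool, int k) {
        return topK(history, wf -> wf.contains(tool), k);
    }

    // Most frequent workflows that begin with the tools the user has already executed.
    static List<List<String>> startingWith(List<List<String>> history,
                                           List<String> prefix, int k) {
        return topK(history, wf -> wf.size() >= prefix.size()
                && wf.subList(0, prefix.size()).equals(prefix), k);
    }
}
```

Both queries scan the full history on every call, which is the cost a cache is meant to avoid as the amount of data grows.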
8
9
what the user has done so far
10
Dynamic Recommendations
recommendations supported
needed will be present in the cache
as, by definition, hit rate and recall are 100%
11
12
address the problem of concept drift [18] to weigh recent user data more heavily
to represent information such as: workflows including this tool, number of times this tool has been used, etc.
system and are used to provide recommendations
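One way to picture the per-tool information listed above is the sketch below. It is an illustration under stated assumptions, not the system's actual data model: `ToolEntry`, `ToolCache`, and the `halfLifeMillis` parameter are hypothetical, and the exponentially decayed counter is just one common way to weigh recent user data more heavily, not necessarily the approach taken for concept drift in [18].

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical per-tool cache entry; all field names are illustrative.
class ToolEntry {
    final List<List<String>> workflows = new ArrayList<>(); // workflows including this tool
    long useCount = 0;          // number of times this tool has been used
    double recentScore = 0.0;   // use count with exponential decay applied
    long lastUpdateMillis = 0;  // time of the last update to recentScore
}

// Hypothetical cache that is updated as users complete workflows.
class ToolCache {
    private final Map<String, ToolEntry> entries = new HashMap<>();
    private final double halfLifeMillis; // assumed tuning knob: age at which a use counts half

    ToolCache(double halfLifeMillis) {
        this.halfLifeMillis = halfLifeMillis;
    }

    // Records one completed workflow; timestamps are assumed to be non-decreasing.
    void record(List<String> workflow, long nowMillis) {
        List<String> copy = new ArrayList<>(workflow);
        Set<String> seen = new HashSet<>();
        for (String tool : workflow) {
            ToolEntry e = entries.computeIfAbsent(tool, t -> new ToolEntry());
            e.recentScore = decayed(e, nowMillis) + 1.0;
            e.lastUpdateMillis = nowMillis;
            e.useCount++;
            if (seen.add(tool)) {
                e.workflows.add(copy); // store the workflow once per distinct tool
            }
        }
    }

    // Returns the entry for a tool, or null if the tool has never been used.
    ToolEntry get(String tool) {
        return entries.get(tool);
    }

    // Recency-weighted use count of a tool as of nowMillis (0.0 if unknown).
    double recentScore(String tool, long nowMillis) {
        ToolEntry e = entries.get(tool);
        return e == null ? 0.0 : decayed(e, nowMillis);
    }

    private double decayed(ToolEntry e, long nowMillis) {
        double ageInHalfLives = (nowMillis - e.lastUpdateMillis) / halfLifeMillis;
        return e.recentScore * Math.pow(0.5, ageInHalfLives);
    }
}
```

Because every recorded workflow updates the entries of all tools it contains, a later lookup for any used tool can be answered from the cache without rescanning the history.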
13
14
15
in Java
common Windows XP machines (no non-essential system processes running; >2 GB of surplus RAM)
16
17
18
throughput and response time
recommender systems particularly as the system needs to support a diverse and large user base
19
20
21