May 8-12, 2007. Banff, Alberta, Canada
Google News Personalization: Scalable Online Collaborative Filtering
Abhinandan Das
Google Inc. 1600 Amphitheatre Pkwy, Mountain View, CA 94043
abhinandan@google.com
Mayur Datar
Google Inc. 1600 Amphitheatre Pkwy, Mountain View, CA 94043
mayur@google.com
Ashutosh Garg
Google Inc. 1600 Amphitheatre Pkwy, Mountain View, CA 94043
ashutosh@google.com
Shyam Rajaram
University of Illinois at Urbana Champaign, Urbana, IL 61801
rajaram1@ifp.uiuc.edu
ABSTRACT
Several approaches to collaborative filtering have been studied, but seldom have studies been reported for large (several million users and items) and dynamic (the underlying item set is continually changing) settings. In this paper we describe our approach to collaborative filtering for generating personalized recommendations for users of Google News. We generate recommendations using three approaches: collaborative filtering using MinHash clustering, Probabilistic Latent Semantic Indexing (PLSI), and covisitation counts. We combine recommendations from different algorithms using a linear model. Our approach is content agnostic and consequently domain independent, making it easily adaptable for other applications and languages with minimal effort. This paper will describe our algorithms and system setup in detail, and report results of running the recommendations engine on Google News.
Categories and Subject Descriptors: H.4.m [Information Systems]: Miscellaneous
General Terms: Algorithms, Design
Keywords: Scalable collaborative filtering, online recommendation system, MinHash, PLSI, MapReduce, Google News, personalization
me something interesting. In such cases, we would like to present recommendations to a user based on her interests as demonstrated by her past activity on the relevant site. Collaborative filtering is a technology that aims to learn user preferences and make recommendations based on user