Ruslan Salakhutdinov  rsalakhu@cs.toronto.edu
Andriy Mnih  amnih@cs.toronto.edu
Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G4, Canada
Abstract
Low-rank matrix approximation methods provide one of the simplest and most effective approaches to collaborative filtering. Such models are usually fitted to data by finding a MAP estimate of the model parameters, a procedure that can be performed efficiently even on very large datasets. However, unless the regularization parameters are tuned carefully, this approach is prone to overfitting because it finds a single point estimate of the parameters. In this paper we present a fully Bayesian treatment of the Probabilistic Matrix Factorization (PMF) model in which model capacity is controlled automatically by integrating over all model parameters and hyperparameters. We show that Bayesian PMF models can be efficiently trained using Markov chain Monte Carlo methods by applying them to the Netflix dataset, which consists of over 100 million movie ratings. The resulting models achieve significantly higher prediction accuracy than PMF models trained using MAP estimation.
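To make the MCMC approach concrete, the following is a minimal Python sketch, not the authors' implementation, of a Gibbs sampler for a simplified Bayesian PMF model. Here the rating precision alpha and the factor prior precision lam are held fixed, whereas the full model in the paper also places Gaussian-Wishart priors on the factor means and covariances and samples those hyperparameters as well. Each user and movie factor vector has an exact Gaussian conditional posterior, so it can be resampled in closed form, and predictions are averaged over posterior samples.

    import numpy as np

    def gibbs_bpmf(R_obs, N, M, D=10, alpha=2.0, lam=2.0, n_samples=50, seed=0):
        """Simplified Gibbs sampler for Bayesian PMF.

        R_obs: iterable of (user i, movie j, rating r) triples.
        alpha: precision of the Gaussian rating noise (held fixed here).
        lam:   precision of the spherical Gaussian prior on each factor vector.
        """
        rng = np.random.default_rng(seed)
        U = 0.1 * rng.standard_normal((N, D))   # user factor vectors
        V = 0.1 * rng.standard_normal((M, D))   # movie factor vectors

        # Index the observed ratings by user and by movie.
        by_user = [[] for _ in range(N)]
        by_movie = [[] for _ in range(M)]
        for i, j, r in R_obs:
            by_user[i].append((j, r))
            by_movie[j].append((i, r))

        pred_sum = np.zeros((N, M))
        for _ in range(n_samples):
            # Resample each user factor from its Gaussian conditional posterior.
            for i in range(N):
                js = [j for j, _ in by_user[i]]
                rs = np.array([r for _, r in by_user[i]])
                Vj = V[js]                                   # (n_i, D)
                prec = lam * np.eye(D) + alpha * Vj.T @ Vj   # posterior precision
                cov = np.linalg.inv(prec)
                mean = cov @ (alpha * Vj.T @ rs)
                U[i] = rng.multivariate_normal(mean, cov)
            # Resample each movie factor symmetrically.
            for j in range(M):
                is_ = [i for i, _ in by_movie[j]]
                rs = np.array([r for _, r in by_movie[j]])
                Ui = U[is_]
                prec = lam * np.eye(D) + alpha * Ui.T @ Ui
                cov = np.linalg.inv(prec)
                mean = cov @ (alpha * Ui.T @ rs)
                V[j] = rng.multivariate_normal(mean, cov)
            # Monte Carlo average of the predicted rating matrix over samples.
            pred_sum += U @ V.T
        return pred_sum / n_samples

Because every conditional is an exact Gaussian, no step sizes or accept/reject tests are required, which is part of what makes Gibbs sampling practical for this model even at large scale (the dense N × M prediction matrix above is only for illustration; at Netflix scale one would accumulate predictions for the test entries only).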
(Srebro & Jaakkola, 2003). Training such a model amounts to finding the best rank-D approximation to the observed N × M target matrix R under the given loss function.
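Under the common choice of sum-squared error over the observed entries, for example, this objective can be written as follows, where U_i and V_j denote the D-dimensional factor vectors for user i and movie j (the notation is an assumption here, since the surrounding fragment does not define U and V):

    \min_{U, V} \; \sum_{(i,j)\,:\,R_{ij}\ \mathrm{observed}} \bigl( R_{ij} - U_i^{\top} V_j \bigr)^2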
A variety of probabilistic factor-based models has been proposed (Hofmann, 1999; Marlin, 2004; Marlin & Zemel, 2004; Salakhutdinov & Mnih, 2008). In these models, factor variables are assumed to be marginally independent, while rating variables are assumed to be conditionally independent given the factor variables. The main drawback of such models is that inferring the posterior distribution over the factors given the ratings is intractable. Many of the existing methods therefore resort to performing MAP estimation of the model parameters; training such models amounts to maximizing the log-posterior over the parameters.
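As an illustration of this MAP approach, the sketch below (again an illustrative example, not the authors' code) trains a PMF-style model by stochastic gradient descent. Under spherical Gaussian priors on the factors and a Gaussian rating likelihood, maximizing the log-posterior is equivalent to minimizing the L2-regularized sum of squared errors over the observed entries, which is what the updates implement. The hyperparameters lam and lr are illustrative values, not those used in the paper.

    import numpy as np

    def pmf_map_sgd(R_obs, N, M, D=10, lam=0.05, lr=0.005, epochs=30, seed=0):
        """MAP training of a PMF-style model by stochastic gradient descent."""
        rng = np.random.default_rng(seed)
        U = 0.1 * rng.standard_normal((N, D))  # user factor vectors
        V = 0.1 * rng.standard_normal((M, D))  # movie factor vectors
        R_obs = list(R_obs)
        for _ in range(epochs):
            # Visit the observed ratings in a random order each epoch.
            for idx in rng.permutation(len(R_obs)):
                i, j, r = R_obs[idx]
                err = r - U[i] @ V[j]
                # Gradient steps on the regularized squared error
                # (equivalently, gradient ascent on the log-posterior).
                grad_u = err * V[j] - lam * U[i]
                grad_v = err * U[i] - lam * V[j]
                U[i] += lr * grad_u
                V[j] += lr * grad_v
        return U, V

The resulting point estimate (U, V) predicts a rating as U[i] @ V[j]; as the abstract notes, its quality depends heavily on how carefully the regularization parameter lam is tuned, which is the motivation for the fully Bayesian treatment.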
References

Hofmann, T. (1999). Probabilistic latent semantic analysis.
Lim, Y. J., & Teh, Y. W. (2007). Variational Bayesian approach to movie rating prediction.
Marlin, B. (2004). Modeling user rating profiles for collaborative filtering.
Marlin, B., & Zemel, R. S. (2004). The multiple multiplicative factor model for collaborative filtering. In Machine Learning, Proceedings of the Twenty-First International Conference (ICML 2004), Banff, Alberta, Canada.
Neal, R. M. (1993). Probabilistic inference using Markov chain Monte Carlo methods (Technical Report CRG-TR-93-1).
Nowlan, S. J., & Hinton, G. E. (1992). Simplifying neural networks by soft weight-sharing. Neural Computation, 4, 473–493.
Raiko, T., Ilin, A., & Karhunen, J. (2007). Principal component analysis for large scale problems with lots of missing values.
Rennie, J. D. M., & Srebro, N. (2005). Fast maximum margin matrix factorization for collaborative prediction. In Machine Learning, Proceedings of the Twenty-Second International Conference (ICML 2005).
Salakhutdinov, R., & Mnih, A. (2008). Probabilistic matrix factorization.
Srebro, N., & Jaakkola, T. (2003). Weighted low-rank approximations.