Class MatrixFactorization

java.lang.Object
  extended by MatrixFactorization

public class MatrixFactorization
extends java.lang.Object

Implements matrix-factorization-based collaborative filtering (CF) algorithms, including regularized SVD, NMF (Lee and Seung, NIPS 2001), PMF (NIPS 2008), and Bayesian PMF (ICML 2008).

Since:
2011. 7. 12
Version:
20110712
Author:
Joonseok Lee

Field Summary
static int BAYESIAN_PROBABLISTIC_MF
          Algorithm Code for Bayesian PMF
 int featureCount
          The number of features.
 int itemCount
          The number of items.
 SparseMatrix itemFeatures
          Item profile in low-rank matrix form.
 double learningRate
          Learning rate parameter.
 int maxIter
          Maximum number of iterations.
 int maxValue
          Maximum rating value present in the dataset.
 int minValue
          Minimum rating value present in the dataset.
 double momentum
          Momentum parameter.
static int NON_NEGATIVE_MF_FROB
          Algorithm Code for NMF, optimizing Frobenius Norm
static int NON_NEGATIVE_MF_KLD
          Algorithm Code for NMF, optimizing KL Divergence
 double offset
          Offset applied to the rating estimate.
static int PROBABLISTIC_MF
          Algorithm Code for PMF
 SparseMatrix rateMatrix
          Rating matrix with one row per user and one column per item.
static int REGULARIZED_SVD
          Algorithm Code for Regularized SVD
 double regularizer
          Regularization factor parameter.
 boolean showProgress
          Whether to show the progress of each iteration.
 SparseMatrix testMatrix
          Rating matrix for test items.
 int userCount
          The number of users.
 SparseMatrix userFeatures
          User profile in low-rank matrix form.
private  SparseMatrix validationMatrix
          Rating matrix of the items used during the validation phase.
 double validationRatio
          Proportion of the dataset used for validation.
 
Constructor Summary
MatrixFactorization(SparseMatrix rm, SparseMatrix tm, int uc, int ic, int max, int min, int fc, double lr, double r, double m, int iter)
          Construct a matrix-factorization model with the given data.
 
Method Summary
 void buildModel(int method)
          Build a model with the given data and algorithm.
 EvaluationMetrics evaluate(int method)
          Evaluate the designated algorithm with the given test data.
private  void makeValidationSet(double validationRatio)
          Moves the items to be used for validation from rateMatrix to validationMatrix.
private  void restoreValidationSet()
          Moves the items in validationMatrix back to the original rateMatrix.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

REGULARIZED_SVD

public static final int REGULARIZED_SVD
Algorithm Code for Regularized SVD

See Also:
Constant Field Values

NON_NEGATIVE_MF_FROB

public static final int NON_NEGATIVE_MF_FROB
Algorithm Code for NMF, optimizing Frobenius Norm

See Also:
Constant Field Values

NON_NEGATIVE_MF_KLD

public static final int NON_NEGATIVE_MF_KLD
Algorithm Code for NMF, optimizing KL Divergence

See Also:
Constant Field Values

PROBABLISTIC_MF

public static final int PROBABLISTIC_MF
Algorithm Code for PMF

See Also:
Constant Field Values

BAYESIAN_PROBABLISTIC_MF

public static final int BAYESIAN_PROBABLISTIC_MF
Algorithm Code for Bayesian PMF

See Also:
Constant Field Values

rateMatrix

public SparseMatrix rateMatrix
Rating matrix with one row per user and one column per item.


testMatrix

public SparseMatrix testMatrix
Rating matrix for test items. Must not be referenced during the training or validation phases.


validationMatrix

private SparseMatrix validationMatrix
Rating matrix of the items used during the validation phase. Must not be referenced during the training phase.


featureCount

public int featureCount
The number of features.


userCount

public int userCount
The number of users.


itemCount

public int itemCount
The number of items.


maxValue

public int maxValue
Maximum rating value present in the dataset.


minValue

public int minValue
Minimum rating value present in the dataset.


learningRate

public double learningRate
Learning rate parameter.


regularizer

public double regularizer
Regularization factor parameter.


momentum

public double momentum
Momentum parameter.


maxIter

public int maxIter
Maximum number of iterations.


offset

public double offset
Offset applied to the rating estimate. Usually this is the average of the ratings.
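
As an illustration of how such an offset is commonly used, the sketch below computes the mean of the observed ratings and adds it to the dot product of a user and an item feature vector. The class name OffsetSketch, the use of plain arrays instead of SparseMatrix, and the clamping of the result to [minValue, maxValue] are assumptions made for this example; the source does not state how this class applies the offset internally.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of MatrixFactorization.
class OffsetSketch {
    /** Mean of the observed ratings, a common choice for the offset. */
    static double meanRating(double[] observedRatings) {
        return Arrays.stream(observedRatings).average().orElse(0.0);
    }

    /**
     * Estimates a rating as offset + dot(userVector, itemVector).
     * Clamping to [minValue, maxValue] is an assumption of this sketch.
     */
    static double predict(double[] userVector, double[] itemVector,
                          double offset, int minValue, int maxValue) {
        double dot = 0.0;
        for (int f = 0; f < userVector.length; f++) {
            dot += userVector[f] * itemVector[f];
        }
        return Math.max(minValue, Math.min(maxValue, offset + dot));
    }
}
```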


validationRatio

public double validationRatio
Proportion of the dataset used for validation.


userFeatures

public SparseMatrix userFeatures
User profile in low-rank matrix form.


itemFeatures

public SparseMatrix itemFeatures
Item profile in low-rank matrix form.


showProgress

public boolean showProgress
Whether to show the progress of each iteration.

Constructor Detail

MatrixFactorization

public MatrixFactorization(SparseMatrix rm,
                           SparseMatrix tm,
                           int uc,
                           int ic,
                           int max,
                           int min,
                           int fc,
                           double lr,
                           double r,
                           double m,
                           int iter)
Construct a matrix-factorization model with the given data.

Parameters:
rm - The rating matrix which will be used for training.
tm - The rating matrix which will be used for testing.
uc - The number of users in the dataset.
ic - The number of items in the dataset.
max - The maximum rating value in the dataset.
min - The minimum rating value in the dataset.
fc - The number of features in the low-rank factorized matrices.
lr - The learning rate for the gradient-descent method.
r - The regularization factor.
m - The momentum parameter.
iter - The maximum number of iterations.
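
To show how the lr, r, and m parameters typically interact, the following sketch performs one stochastic gradient-descent update with momentum for a single observed rating, in the style of regularized SVD. It is a generic textbook form under stated assumptions, not necessarily the update this class performs: the class name SgdStepSketch, the method step, and the dense array representation are all hypothetical.

```java
// Hypothetical sketch of one regularized, momentum-based SGD update.
class SgdStepSketch {
    /**
     * Updates the user vector u and item vector v in place for one rating.
     * du and dv hold the previous update per feature (the momentum state).
     */
    static void step(double[] u, double[] v, double[] du, double[] dv,
                     double rating, double offset,
                     double lr, double r, double m) {
        // Current estimate and its error against the observed rating.
        double pred = offset;
        for (int f = 0; f < u.length; f++) {
            pred += u[f] * v[f];
        }
        double err = rating - pred;

        for (int f = 0; f < u.length; f++) {
            double uf = u[f];
            double vf = v[f];
            // New update = momentum * previous update + lr * regularized gradient step.
            du[f] = m * du[f] + lr * (err * vf - r * uf);
            dv[f] = m * dv[f] + lr * (err * uf - r * vf);
            u[f] += du[f];
            v[f] += dv[f];
        }
    }
}
```

A training loop would call step once per observed rating and repeat over the dataset for at most iter passes.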
Method Detail

buildModel

public void buildModel(int method)
Build a model with the given data and algorithm.

Parameters:
method - The code of the algorithm to use. Must be one of the following: REGULARIZED_SVD, NON_NEGATIVE_MF_FROB, NON_NEGATIVE_MF_KLD, PROBABLISTIC_MF, or BAYESIAN_PROBABLISTIC_MF.

evaluate

public EvaluationMetrics evaluate(int method)
Evaluate the designated algorithm with the given test data.

Parameters:
method - The code of the algorithm to evaluate. Must be one of the following: REGULARIZED_SVD, NON_NEGATIVE_MF_FROB, NON_NEGATIVE_MF_KLD, PROBABLISTIC_MF, or BAYESIAN_PROBABLISTIC_MF.
Returns:
The evaluation results, such as MAE, RMSE, and rank-score.
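
For reference, MAE and RMSE follow their standard definitions, sketched below over parallel arrays of actual and predicted ratings. The class name ErrorMetricsSketch and its methods are hypothetical; the API of EvaluationMetrics is not shown in this page, and rank-score is omitted here.

```java
// Hypothetical sketch of the standard MAE and RMSE definitions.
class ErrorMetricsSketch {
    /** Mean absolute error: average of |actual - predicted|. */
    static double mae(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(actual[i] - predicted[i]);
        }
        return sum / actual.length;
    }

    /** Root mean squared error: square root of the average squared difference. */
    static double rmse(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double diff = actual[i] - predicted[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / actual.length);
    }
}
```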

makeValidationSet

private void makeValidationSet(double validationRatio)
Moves the items to be used for validation from rateMatrix to validationMatrix.

Parameters:
validationRatio - Proportion of the dataset used for validation.

restoreValidationSet

private void restoreValidationSet()
Moves the items in validationMatrix back to the original rateMatrix.
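
The move-and-restore behavior of makeValidationSet and restoreValidationSet can be sketched as follows. This is a minimal illustration under stated assumptions: it stores ratings in lists rather than in SparseMatrix, and it selects each rating independently at random with probability equal to the ratio. How the actual class chooses items is not described in this page. The class ValidationSplitSketch and its nested Rating type are hypothetical.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; the real methods operate on rateMatrix and validationMatrix.
class ValidationSplitSketch {
    /** One observed rating: (user, item, value). */
    static final class Rating {
        final int user;
        final int item;
        final double value;

        Rating(int user, int item, double value) {
            this.user = user;
            this.item = item;
            this.value = value;
        }
    }

    /** Moves each training rating to the validation list with probability ratio. */
    static void makeValidationSet(List<Rating> train, List<Rating> validation,
                                  double ratio, Random rng) {
        Iterator<Rating> it = train.iterator();
        while (it.hasNext()) {
            Rating rating = it.next();
            if (rng.nextDouble() < ratio) {
                validation.add(rating);
                it.remove();
            }
        }
    }

    /** Moves every validation rating back into the training list. */
    static void restoreValidationSet(List<Rating> train, List<Rating> validation) {
        train.addAll(validation);
        validation.clear();
    }
}
```

Because ratings are moved rather than copied, the training and validation sets stay disjoint, which matches the requirement that validation data is not referenced during training.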