# Ranking (recommendation)¶

## CLiMF¶

CLiMF (Collaborative Less-is-More Filtering) is intended for scenarios with binary relevance data. It focuses on improving top-k recommendations through ranking by directly maximizing the Mean Reciprocal Rank (MRR).
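MRR is the average, over users, of the reciprocal rank of the first relevant item in each user's recommendation list. As a quick illustration (the helper `mean_reciprocal_rank` and the toy data below are illustrative, not part of the library):

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per user (0 if none)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for position, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / position
                break
    return total / len(ranked_lists)

# Two toy users: first hit at rank 1 and rank 2 -> MRR = (1/1 + 1/2) / 2 = 0.75
ranked = [['a', 'b', 'c'], ['x', 'y', 'z']]
relevant = [{'a'}, {'y'}]
print(mean_reciprocal_rank(ranked, relevant))  # 0.75
```

Because only the first relevant item counts, maximizing MRR pushes at least one relevant item towards the very top of each list, which is exactly the "less-is-more" goal.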

Following a technique similar to other iterative approaches, the two low-rank matrices can be randomly initialized and then optimized by maximizing an objective of the form:

$F(U,V) = \sum_{i=1}^{M} \sum_{j=1}^{N} Y_{ij} \left[ \ln g(U_i^T V_j) + \sum_{k=1}^{N} \ln \left( 1 - Y_{ik}\, g(U_i^T V_k - U_i^T V_j) \right) \right] - \frac{\lambda}{2} \left( \left\| U \right\|^2 + \left\| V \right\|^2 \right)$
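The objective can be evaluated directly for small dense matrices. A minimal NumPy sketch of the formula above, assuming $g$ is the logistic sigmoid and $Y$ is a dense binary user-item matrix (the function name and toy data are illustrative, not part of the library):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def climf_objective(U, V, Y, lmbda):
    """Evaluate F(U, V) for user factors U (M x d), item factors V (N x d),
    binary relevance matrix Y (M x N) and regularization weight lmbda."""
    F = 0.0
    scores = U @ V.T  # scores[i, j] = U_i^T V_j
    M, N = Y.shape
    for i in range(M):
        for j in range(N):
            if Y[i, j]:
                # ln g(U_i^T V_j)
                F += np.log(sigmoid(scores[i, j]))
                # sum_k ln(1 - Y_ik g(U_i^T V_k - U_i^T V_j))
                F += np.sum(np.log(1.0 - Y[i, :] * sigmoid(scores[i, :] - scores[i, j])))
    # - lambda/2 (||U||^2 + ||V||^2)
    F -= lmbda / 2.0 * (np.sum(U ** 2) + np.sum(V ** 2))
    return F

rng = np.random.RandomState(0)
U = rng.randn(3, 2) * 0.01   # 3 users, 2 latent factors
V = rng.randn(4, 2) * 0.01   # 4 items, 2 latent factors
Y = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0],
              [1, 1, 0, 1]])
print(climf_objective(U, V, Y, lmbda=0.001))
```

Stochastic gradient ascent on $F$ then updates $U_i$ and $V_j$ iteratively; since only terms with $Y_{ij} = 1$ contribute, each pass touches only observed relevant items.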

Note: Orange3 does not currently support ranking operations; therefore, this model can be used neither in cross-validation nor in the prediction module available in Orange3.

### Example¶

```python
import Orange
import numpy as np
from orangecontrib.recommendation import CLiMFLearner

# Load data
data = Orange.data.Table('epinions_train.tab')

# Train recommender
learner = CLiMFLearner(num_factors=10, num_iter=10, learning_rate=0.0001,
                       lmbda=0.001)
recommender = learner(data)

# Load test dataset
testdata = Orange.data.Table('epinions_test.tab')

# Sample users
num_users = len(recommender.U)
num_samples = min(num_users, 1000)  # max. number of users to sample
users_sampled = np.random.choice(np.arange(num_users), num_samples)

# Compute Mean Reciprocal Rank (MRR)
mrr, _ = recommender.compute_mrr(data=testdata, users=users_sampled)
print('MRR: %.4f' % mrr)
>>> MRR: 0.3975
```
class orangecontrib.recommendation.CLiMFLearner(num_factors=5, num_iter=25, learning_rate=0.0001, lmbda=0.001, preprocessors=None, optimizer=None, verbose=False, random_state=None, callback=None)[source]

CLiMF: Collaborative Less-is-More Filtering Matrix Factorization

This model uses stochastic gradient descent to find two low-rank matrices: user-feature matrix and item-feature matrix.

CLiMF is a matrix factorization method for scenarios with binary relevance data in which only a few (k) items are recommended to each user. It improves top-k recommendations through ranking by directly maximizing the Mean Reciprocal Rank (MRR).

Attributes:
num_factors: int, optional
The number of latent factors.
num_iter: int, optional
The number of passes over the training data (aka epochs).
learning_rate: float, optional
The learning rate controlling the size of update steps (general).
lmbda: float, optional
Controls the importance of the regularization term (general). Avoids overfitting by penalizing the magnitudes of the parameters.
optimizer: Optimizer, optional
Set the optimizer for SGD. If None (default), classical SGD will be applied.
verbose: boolean or int, optional
Prints information about the process according to the verbosity level. Values: False (verbose=0), True (verbose=1), or an integer for higher verbosity levels.
random_state: int, optional
Set the seed for the numpy random generator, so it makes the random numbers predictable. This is a debugging feature.

callback: callable

fit_storage(data)[source]

Fit the model according to the given training data.

Args:
data: Orange.data.Table
Returns:
self: object
Returns self.