Overview

Crab - A Recommendation Engine library for Python

Crab is a flexible, fast recommender engine for Python that integrates classic information filtering recommendation algorithms in the world of scientific Python packages (numpy, scipy, matplotlib). The engine aims to provide a rich set of components from which you can construct a customized recommender system from a set of algorithms.

Usage

For usage and instructions, check out the Crab Wiki.
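A minimal quickstart sketch, assuming the scikits.crab namespace and the sample-movies tutorial API quoted in the issues below (datasets.load_sample_movies, MatrixPreferenceDataModel, UserSimilarity, pearson_correlation, UserBasedRecommender):

    # Sketch based on the tutorial snippets quoted in the issues below
    from scikits.crab import datasets
    from scikits.crab.models import MatrixPreferenceDataModel
    from scikits.crab.metrics import pearson_correlation
    from scikits.crab.similarities import UserSimilarity
    from scikits.crab.recommenders.knn import UserBasedRecommender

    # Bundled sample movie ratings
    movies = datasets.load_sample_movies()

    # Data model, user-user similarity, and user-based recommender
    model = MatrixPreferenceDataModel(movies.data)
    similarity = UserSimilarity(model, pearson_correlation)
    recommender = UserBasedRecommender(model, similarity, with_preference=True)

    # Top recommendations for user 5
    recommender.recommend(5)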

History

The project was started in 2010 by Marcel Caraciolo as an M.Sc.-related project, and since then many interested people have joined to help with it. It is currently maintained by a team of volunteers, members of the Muriçoca Labs.

Authors

Marcel Caraciolo ([email protected])

Bruno Melo ([email protected])

Ricardo Caspirro ([email protected])

Rodrigo Alves ([email protected])

Bugs, Feedback

Please submit any bugs you encounter, as well as patches and feature requests, to the issue tracker located at GitHub.

Contributions

If you want to submit a patch to this project, that is AWESOME. Follow this guide:

  • Fork Crab
  • Create a topic branch - git checkout -b my_branch
  • Make your alterations and commit them
  • Push to your branch - git push origin my_branch
  • Create a Pull Request from your branch.
  • You just contributed to the Crab project!

Wiki

Please check our Wiki for further information on how to start developing with Crab or using it in your projects.

LICENCE (BSD)

Copyright (c) 2011, Muriçoca Labs

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of the Muriçoca Labs nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL MURIÇOCA LABS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Comments
  • Implement the Evaluation techniques

    Implement the Evaluation techniques

    Implement the recommender evaluation techniques: MAE, RMSE, F1-Score, Precision and Recall.

    References:

    Evaluating collaborative filtering recommender systems: http://dl.acm.org/citation.cfm?id=963772

    Evaluation of recommender systems: a new approach: http://ec.iem.cyut.edu.tw/drupal/sites/default/files/Evaluation%20of%20recommender%20systems%20A%20new%20approach.pdf
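    A minimal numpy sketch of the error metrics named above, MAE and RMSE over held-out ratings (the function names and flat array layout are illustrative, not the eventual Crab API):

    # Illustrative helpers, not the Crab API
    import numpy as np

    def mae(y_real, y_pred):
        # Mean absolute error between actual and predicted ratings
        y_real, y_pred = np.asarray(y_real, dtype=float), np.asarray(y_pred, dtype=float)
        return np.abs(y_real - y_pred).mean()

    def rmse(y_real, y_pred):
        # Root mean squared error between actual and predicted ratings
        y_real, y_pred = np.asarray(y_real, dtype=float), np.asarray(y_pred, dtype=float)
        return np.sqrt(((y_real - y_pred) ** 2).mean())

    print(mae([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))   # 0.5
    print(rmse([4.0, 3.0, 5.0], [3.5, 3.0, 4.0]))  # ~0.645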

    Feature 
    opened by marcelcaraciolo 4
  • Create the FileDataModel

    Create the FileDataModel

    Create the FileDataModel, which will receive a *.txt, *.csv, or any other text file as input, parse it, and store it in an internal structure.

    Use a sample database to test the FileDataModel.

    Work with the DictDataModel. The assigned developer should choose the best internal structure to store the user ratings/preferences matrix.
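    A minimal sketch of the parsing step, assuming a plain user_id,item_id,rating text file and a nested-dict internal structure like the DictDataModel's (both are assumptions for illustration):

    # Illustrative parser, not the final FileDataModel design
    import csv

    def load_ratings(path, delimiter=','):
        # Parse "user_id,item_id,rating" rows into {user_id: {item_id: rating}}
        data = {}
        with open(path) as f:
            for user_id, item_id, rating in csv.reader(f, delimiter=delimiter):
                data.setdefault(int(user_id), {})[int(item_id)] = float(rating)
        return data

    ratings = load_ratings('ratings.csv')  # hypothetical sample file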

    Feature 
    opened by marcelcaraciolo 4
  • TYPO in Documentation

    TYPO in Documentation

    In section 2.1.5 of http://muricoca.github.com/crab/tutorial.html, sample code

    >>> from crab.recommenders.knn import UserBasedRecommender
    

    should be

    >>> from scikits.crab.recommenders.knn import UserBasedRecommender
    
    opened by tkamishima 3
  • Implement Cross Validation Techniques for Evaluating the Recommenders

    Implement Cross Validation Techniques for Evaluating the Recommenders

    Check and implement several cross-validation techniques for evaluating the recommenders.

    https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py
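    A minimal numpy sketch of a k-fold split over rating indices, in the spirit of the scikit-learn module linked above (the function name and flat indexing are illustrative):

    # Illustrative helper, not the Crab API
    import numpy as np

    def kfold_indices(n_ratings, n_folds=5, seed=0):
        # Shuffle the rating indices, then yield (train, test) index arrays,
        # one pair per fold, so each rating is held out exactly once.
        rng = np.random.RandomState(seed)
        folds = np.array_split(rng.permutation(n_ratings), n_folds)
        for i in range(n_folds):
            train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
            yield train, folds[i]

    for train_idx, test_idx in kfold_indices(10, n_folds=5):
        print(len(train_idx), len(test_idx))  # 8 2 for every fold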

    Feature 
    opened by marcelcaraciolo 3
  • Pip install problem

    Pip install problem

    Currently I get this when trying to install crab:

    error in scikits.crab setup command: Distribution contains no modules or packages for namespace package 'scikits'
    

    My installation attempts included

    pip install git+git://github.com/muricoca/crab.git
    

    and

    pip install -e git+git://github.com/muricoca/crab.git#egg=crab
    
    opened by chrisgilmerproj 2
  • Add Recommender Evaluator

    Add Recommender Evaluator

    Add the implementation of the Recommender Evaluator, which will evaluate the recommender metrics using the cross-validation functions available for the data set.

    Feature 
    opened by marcelcaraciolo 2
  • Implement the representation as string (__repr__) of the DataModels

    Implement the representation as string (__repr__) of the DataModels

    The current data models lack a string representation via __repr__.

    The goal is that when you do print model, it shows a brief representation of the ratings matrix (based on the current ones), such as:

    >>> print model
    MatrixDataModel (3 by 3)
              red        orange     green
    apple     2.000000   ---        1.000000
    orange    ---        2.000000   ---
    celery    ---        ---        1.000000
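    A minimal sketch of such a __repr__ on a stand-in model holding {user: {item: rating}} preferences (the attribute names are assumptions about the DataModel interface, not the actual Crab code):

    class MatrixDataModel(object):
        """Minimal stand-in holding {user: {item: rating}} preferences."""

        def __init__(self, preferences):
            self.preferences = preferences
            self.user_ids = sorted(preferences)
            self.item_ids = sorted({i for prefs in preferences.values() for i in prefs})

        def __repr__(self):
            # Header with the matrix dimensions, then one row per user giving
            # each item's rating, or '---' where no preference exists.
            lines = ['MatrixDataModel (%d by %d)' % (len(self.user_ids), len(self.item_ids))]
            lines.append('\t'.join([''] + [str(i) for i in self.item_ids]))
            for user in self.user_ids:
                row = ['%f' % self.preferences[user][i] if i in self.preferences[user] else '---'
                       for i in self.item_ids]
                lines.append('\t'.join([str(user)] + row))
            return '\n'.join(lines)

    print(MatrixDataModel({'apple': {'red': 2.0, 'green': 1.0},
                           'orange': {'orange': 2.0},
                           'celery': {'green': 1.0}}))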

    Easy Fix 
    opened by marcelcaraciolo 2
  • Develop the Matrix Factorization  Recommender System

    Develop the Matrix Factorization Recommender System

    Based on the papers by Sarwar et al.:

    Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. (2000), Application of Dimensionality Reduction in Recommender System: A Case Study

    and

    Sarwar, B. et al., Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems
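    A minimal numpy sketch of the SVD-based idea from those papers: fill the missing entries, take a truncated SVD of the user-item matrix, and read predictions off the low-rank reconstruction (the toy matrix and the item-mean filling are illustrative choices, not the planned implementation):

    import numpy as np

    # Toy user x item ratings matrix; 0 marks a missing rating (illustration only)
    R = np.array([[5.0, 3.0, 0.0, 1.0],
                  [4.0, 0.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0, 5.0],
                  [0.0, 1.0, 5.0, 4.0]])

    # Fill each missing entry with that item's mean observed rating
    filled = R.copy()
    for j in range(R.shape[1]):
        col = R[:, j]
        mean = col[col > 0].mean() if (col > 0).any() else 0.0
        filled[:, j] = np.where(col == 0, mean, col)

    # Truncated SVD: keep only the k strongest latent features
    k = 2
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    R_hat = np.dot(U[:, :k] * s[:k], Vt[:k, :])

    # Predicted rating for user 1, item 2 (an originally missing entry)
    print(R_hat[1, 2])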

    Feature 
    opened by marcelcaraciolo 2
  • Fix non-ASCII character in setup.py

    Fix non-ASCII character in setup.py

    As setup.py does not define a file-specific encoding, its encoding defaults to ASCII (per PEP 263). The descr string in setup.py contains a non-ASCII ligature, so executing setup.py results in SyntaxError: Non-ASCII character '\xef' in file .... The attached patch fixes that by replacing the ligature with two separate glyphs.

    opened by earl 2
  • Include self.__set_params in the recommend BaseRecommender method

    Include self.__set_params in the recommend BaseRecommender method

    It is necessary to set the params in the BaseRecommender recommend method by calling self.__set_params(**params) inherited from BaseEstimator.

    How can this be done in BaseRecommender without calling super in the child class?
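    A minimal sketch of one way to achieve this, with a stand-in single-underscore parameter setter and a template-method recommend that applies the params before delegating, so child classes never call super (all class and method names here are illustrative, not the actual Crab or scikit-learn code):

    class BaseEstimator(object):
        def _set_params(self, **params):
            # Stand-in for the estimator's parameter setter: assign keyword
            # arguments as attributes on the instance.
            for key, value in params.items():
                setattr(self, key, value)

    class BaseRecommender(BaseEstimator):
        def recommend(self, user_id, how_many=None, **params):
            # Apply any estimator params here, once, so child classes only
            # implement _recommend and never need to call super().
            self._set_params(**params)
            return self._recommend(user_id, how_many)

        def _recommend(self, user_id, how_many):
            raise NotImplementedError

    class UserBasedRecommender(BaseRecommender):
        def _recommend(self, user_id, how_many):
            return []  # core recommendation logic would live here

    rec = UserBasedRecommender()
    rec.recommend(1, how_many=5, with_preference=True)
    print(rec.with_preference)  # True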

    Bug 
    opened by marcelcaraciolo 2
  • Setup.py does not install dataset files

    Setup.py does not install dataset files

    If you run python setup.py install you will not get the dataset files copied into your site-packages. This can be fixed by updating the package_data attribute in the setup.py file.
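    A minimal sketch of the kind of setup.py change suggested, assuming the dataset files live under scikits/crab/datasets/ (the exact package name and file patterns are assumptions):

    from setuptools import setup, find_packages

    setup(
        name='crab',
        packages=find_packages(),
        # Copy the bundled dataset files into site-packages at install time
        # (assumed paths and patterns, adjust to the real layout)
        package_data={'scikits.crab.datasets': ['data/*.csv', 'data/*.txt']},
        include_package_data=True,
    )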

    opened by chrisgilmerproj 1
  • docs: fix simple typo, distrance -> distance

    docs: fix simple typo, distrance -> distance

    There is a small typo in scikits/crab/metrics/pairwise.py.

    Should read distance rather than distrance.

    Semi-automated pull request generated by https://github.com/timgates42/meticulous/blob/master/docs/NOTE.md

    opened by timgates42 0
  • Python 3 compatibility + error in import

    Python 3 compatibility + error in import

    • When installing from a Colaboratory (Jupyter) notebook, I had issues coming from the file book_crossing.py. They were related to the print function (the parentheses that are mandatory in Python 3 were missing) and to the comma (",") in an except statement, which has to be replaced by the "as" keyword.

    • There was another error, in the same file, related to the path in an import: from base import Bunch instead of from .base import Bunch.

    • The errors were corrected (see the sketch below) and the import from scikits.crab import datasets worked, but there are still other issues.
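    A minimal before/after sketch of the three fixes described above (the surrounding book_crossing.py code is paraphrased, not quoted):

    # 1) print statement -> print() function (parentheses required in Python 3)
    print("fetching the dataset")  # illustrative message

    # 2) "except IOError, e:" -> "except IOError as e:"
    try:
        open("no-such-file")
    except IOError as e:
        print(e)

    # 3) implicit relative import -> explicit relative import
    # from base import Bunch    # old form; fails under Python 3
    # from .base import Bunch   # new form; works when book_crossing.py is imported as part of the package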

    opened by AIMosta 0
  • Following the steps on the official site raises an error

    Following the steps on the official site raises an error

    from scikits.crab import datasets

    movies = datasets.load_sample_movies()
    songs = datasets.load_sample_songs()

    from scikits.crab.models import MatrixPreferenceDataModel

    # Build the model
    model = MatrixPreferenceDataModel(movies.data)

    from scikits.crab.metrics import pearson_correlation
    from scikits.crab.similarities import UserSimilarity

    # Build the similarity
    similarity = UserSimilarity(model, pearson_correlation)

    from scikits.crab.recommenders.knn import UserBasedRecommender

    # Build the User based recommender
    recommender = UserBasedRecommender(model, similarity, with_preference=True)

    # Recommend items for the user 5 (Toby)
    recommender.recommend(5)

    opened by wpinchine 0
  • Something wrong about the calculation of precision and recall

    Something wrong about the calculation of precision and recall

    https://github.com/muricoca/crab/blob/beb355538acc419b82beae3c6845d1e1cff5d26b/scikits/crab/metrics/metrics.py#L319-L323

    I suspect that the calculation of precision and recall is wrong.

    The definition of precision is the proportion of the recommendation list (the y_pred) which is contained in the true list (the y_real).

    So, I think the calculation of precision should be

        precision[i] = (intersection_size / float(len(y_items_pred))) if len(y_items_pred) else 0.0

    and similarly, the calculation of recall should be

        recall[i] = (intersection_size / float(len(y_real[i]))) if len(y_real[i]) else 0.0
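    A minimal sketch of those definitions on a toy recommendation list, independent of the metrics.py code linked above:

    # Toy example of the corrected definitions
    def precision_recall(y_pred, y_real):
        # precision: fraction of recommended items that are in the true list
        # recall:    fraction of the true list that was recommended
        hits = len(set(y_pred) & set(y_real))
        precision = hits / float(len(y_pred)) if y_pred else 0.0
        recall = hits / float(len(y_real)) if y_real else 0.0
        return precision, recall

    # 2 of the 4 recommendations are relevant; 2 of the 3 relevant items were recovered
    print(precision_recall(['a', 'b', 'c', 'd'], ['a', 'c', 'e']))  # (0.5, 0.666...)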

    opened by WinChua 0
  • ImportError: No module named crab.metrics.classes

    ImportError: No module named crab.metrics.classes

    ImportError: No module named crab.metrics.classes

    Installed crab, but unable to import it.

    from pprint import pprint
    import csv
    from scikits.crab.models import MatrixPreferenceDataModel, MatrixBooleanPrefDataModel
    from scikits.crab.metrics import pearson_correlation, euclidean_distances, jaccard_coefficient, cosine_distances, manhattan_distances, spearman_coefficient
    from scikits.crab.similarities import ItemSimilarity, UserSimilarity
    from scikits.crab.recommenders.knn import ItemBasedRecommender, UserBasedRecommender
    from scikits.crab.recommenders.knn.neighborhood_strategies import NearestNeighborsStrategy
    from scikits.crab.recommenders.knn.item_strategies import ItemsNeighborhoodStrategy
    from scikits.crab.recommenders.svd.classes import MatrixFactorBasedRecommender
    from scikits.crab.metrics.classes import CfEvaluator

    # The block below (kept disabled) generated a random ratings file
    """
    import random

    fieldnames = ['user_id', 'item_id', 'star_rating']
    with open('dataset-recsys.csv', 'w') as myfile:
        # writing data to new csv file
        writer = csv.DictWriter(myfile, delimiter=',', fieldnames=fieldnames)
        writer.writeheader()

        for x in range(1, 21):
            items = random.sample(list(range(1, 41)), 20)
            for item in items:
                writer.writerow({'user_id': x, 'item_id': item, 'star_rating': random.randint(1, 5)})
    """

    # Load the ratings into {user_id: {item_id: rating}}
    # (DictReader already consumes the header row, so no manual skip is needed)
    dataset = {}
    with open('sample_movielens_data.txt') as myfile:
        reader = csv.DictReader(myfile, delimiter=',')
        for line in reader:
            if int(line['user_id']) not in dataset:
                dataset[int(line['user_id'])] = {}
            dataset[int(line['user_id'])][int(line['item_id'])] = float(line['star_rating'])

    model = MatrixPreferenceDataModel(dataset)

    # User-based similarity
    similarity = UserSimilarity(model, cosine_distances)
    neighborhood = NearestNeighborsStrategy()
    recsys = UserBasedRecommender(model, similarity, neighborhood)

    # Item-based similarity
    similarity = ItemSimilarity(model, cosine_distances)
    nhood_strategy = ItemsNeighborhoodStrategy()
    recsys = ItemBasedRecommender(model, similarity, nhood_strategy, with_preference=False)

    recsys = MatrixFactorBasedRecommender(model=model, items_selection_strategy=nhood_strategy, n_features=10, n_interations=1)

    evaluator = CfEvaluator()

    rmse = evaluator.evaluate(recsys, 'rmse', permutation=False)
    mae = evaluator.evaluate(recsys, 'mae', permutation=False)
    nmae = evaluator.evaluate(recsys, 'nmae', permutation=False)
    precision = evaluator.evaluate(recsys, 'precision', permutation=False)
    recall = evaluator.evaluate(recsys, 'recall', permutation=False)
    f1score = evaluator.evaluate(recsys, 'f1score', permutation=False)

    all_scores = evaluator.evaluate(recsys, permutation=False)
    # all_scores = evaluator.evaluate(boolean_recsys, permutation=False)

    result = evaluator.evaluate(recsys, None, permutation=False, at=10, sampling_ratings=0.7)

    # Cross validation
    result = evaluator.evaluate_on_split(recsys, 'rmse', permutation=False, at=10, cv=5, sampling_ratings=0.7)

    pprint(result)

    opened by bala17 0
Python Implementation of algorithms in Graph Mining, e.g., Recommendation, Collaborative Filtering, Community Detection, Spectral Clustering, Modularity Maximization, co-authorship networks.

Graph Mining Author: Jiayi Chen Time: April 2021 Implemented Algorithms: Network: Scraping Data, Network Construction and Network Measurement (e.g., P

Jiayi Chen 3 Mar 3, 2022
Code for Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022)

Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022) We consider how a user of a web servi

joisino 20 Aug 21, 2022
Scripts of Machine Learning Algorithms from Scratch. Implementations of machine learning models and algorithms using nothing but NumPy with a focus on accessibility. Aims to cover everything from basic to advance.

Algo-ScriptML Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The goal of this project is not t

Algo Phantoms 81 Nov 26, 2022
PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

PyTorch implementations of Top-N recommendation, collaborative filtering recommenders.

Yoonki Jeong 129 Dec 22, 2022
Numerical Methods with Python, Numpy and Matplotlib

Numerical Bric-a-Brac Collections of numerical techniques with Python and standard computational packages (Numpy, SciPy, Numba, Matplotlib ...). Diffe

Vincent Bonnet 10 Dec 20, 2021
Img-process-manual - Utilize Python Numpy and Matplotlib to realize OpenCV baisc image processing function

Img-process-manual - Opencv Library basic graphic processing algorithm coding reproduction based on Numpy and Matplotlib library

Jack_Shaw 2 Dec 12, 2022
Composable transformations of Python+NumPy programs

Chex Chex is a library of utilities for helping to write reliable JAX code. This includes utils to help: Instrument your code (e.g. assertions) Debug

DeepMind 506 Jan 8, 2023
A Real-World Benchmark for Reinforcement Learning based Recommender System

RL4RS: A Real-World Benchmark for Reinforcement Learning based Recommender System RL4RS is a real-world deep reinforcement learning recommender system

null 121 Dec 1, 2022
MLP-Numpy - A simple modular implementation of Multi Layer Perceptron in pure Numpy.

MLP-Numpy A simple modular implementation of Multi Layer Perceptron in pure Numpy. I used the Iris dataset from scikit-learn library for the experimen

Soroush Omranpour 1 Jan 1, 2022
Filtering variational quantum algorithms for combinatorial optimization

Current gate-based quantum computers have the potential to provide a computational advantage if algorithms use quantum hardware efficiently.

null 1 Feb 9, 2022
Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation, available for both PyTorch and Tensorflow.

Transformers4Rec is a flexible and efficient library for sequential and session-based recommendation, available for both PyTorch and Tensorflow.

null 730 Jan 9, 2023
The implementation of the paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implementation of the paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

null 15 Dec 27, 2022
Python implementation of cover trees, near-drop-in replacement for scipy.spatial.kdtree

This is a Python implementation of cover trees, a data structure for finding nearest neighbors in a general metric space (e.g., a 3D box with periodic

Patrick Varilly 28 Nov 25, 2022
Sequential model-based optimization with a `scipy.optimize` interface

Scikit-Optimize Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions. It implements

Scikit-Optimize 2.5k Jan 4, 2023
SciPy fixes and extensions

scipyx SciPy is a large library used everywhere in scientific computing. That's why breaking backwards-compatibility comes as a significant cost and is

Nico Schlömer 16 Jul 17, 2022
PyTorch reimplementation of the Smooth ReLU activation function proposed in the paper "Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations" [arXiv 2022].

Smooth ReLU in PyTorch Unofficial PyTorch reimplementation of the Smooth ReLU (SmeLU) activation function proposed in the paper Real World Large Scale

Christoph Reich 10 Jan 2, 2023
An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding, top-down-bottom-up, and attention (consensus between columns)

GLOM - Pytorch (wip) An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates neural fields, predictive coding,

Phil Wang 173 Dec 14, 2022
DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predicate.

DeepProbLog DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with deep learning by introducing the neural predic

KU Leuven Machine Learning Research Group 94 Dec 18, 2022