A library for preparing, training, and evaluating scalable deep learning hybrid recommender systems using PyTorch.

Overview

collie

Collie is a library for preparing, training, and evaluating implicit deep learning hybrid recommender systems, named after the Border Collie dog breed.

Collie offers a collection of simple APIs for preparing and splitting datasets, incorporating item metadata directly into a model architecture or loss, efficiently evaluating a model's performance on the GPU, and so much more. Above all else though, Collie is built with flexibility and customization in mind, allowing for faster prototyping and experimentation.

See the documentation for more details.

"We adopted 2 Border Collies a year ago and they are about 3 years old. They are completely obsessed with fetch and tennis balls and it's getting out of hand. They live in the fenced back yard and when anyone goes out there they instantly run around frantically looking for a tennis ball. If there is no ball they will just keep looking and will not let you pet them. When you do have a ball, they are 100% focused on it and will not notice anything else going on around them, like it's their whole world."

-- A Reddit thread on r/DogTraining

Installation

pip install collie

Through July 2021, this library was named collie_recs. While that version is still available on PyPI, it is no longer supported or maintained. All users of the library should install collie for the latest and greatest version of the code!
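If you are migrating from collie_recs, the only change most pipelines need is the import path. A minimal sketch, assuming the old package mirrored the current module layout used in the Quick Start examples below:

# before (collie_recs, no longer maintained):
# from collie_recs.model import MatrixFactorizationModel, CollieTrainer

# after (collie):
from collie.model import MatrixFactorizationModel, CollieTrainer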

Quick Start

Implicit Data

Creating and evaluating a matrix factorization model with implicit MovieLens 100K data is simple with Collie:

from collie.cross_validation import stratified_split
from collie.interactions import Interactions
from collie.metrics import auc, evaluate_in_batches, mapk, mrr
from collie.model import MatrixFactorizationModel, CollieTrainer
from collie.movielens import read_movielens_df
from collie.utils import convert_to_implicit


# read in explicit MovieLens 100K data
df = read_movielens_df()

# convert the data to implicit
df_imp = convert_to_implicit(df)

# store data as ``Interactions``
interactions = Interactions(users=df_imp['user_id'],
                            items=df_imp['item_id'],
                            allow_missing_ids=True)

# perform a data split
train, val = stratified_split(interactions)

# train an implicit ``MatrixFactorization`` model
model = MatrixFactorizationModel(train=train,
                                 val=val,
                                 embedding_dim=10,
                                 lr=1e-1,
                                 loss='adaptive',
                                 optimizer='adam')
trainer = CollieTrainer(model, max_epochs=10)
trainer.fit(model)
model.eval()

# evaluate the model
auc_score, mrr_score, mapk_score = evaluate_in_batches(metric_list=[auc, mrr, mapk],
                                                       test_interactions=val,
                                                       model=model)

print(f'AUC:          {auc_score}')
print(f'MRR:          {mrr_score}')
print(f'MAP@10:       {mapk_score}')
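Once trained, the model can also produce recommendations directly; a minimal sketch using get_item_predictions (the unseen_items_only keyword and the pandas return type are assumptions about the API and may differ):

# hedged sketch: rank items for a single user. ``unseen_items_only`` and a
# pandas ``Series`` return type are assumptions, not confirmed API details
user_id = 42
item_predictions = model.get_item_predictions(user_id, unseen_items_only=True)
print(item_predictions.head())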

More complicated examples of implicit pipelines for MovieLens 100K data can be viewed here, in the notebooks here, and in the documentation here.

Explicit Data

Collie also handles the situation when you instead have explicit data, such as star ratings. Note how similar the pipeline and APIs are compared to the implicit example above:

from collie.cross_validation import stratified_split
from collie.interactions import ExplicitInteractions
from collie.metrics import explicit_evaluate_in_batches
from collie.model import MatrixFactorizationModel, CollieTrainer
from collie.movielens import read_movielens_df

from torchmetrics import MeanAbsoluteError, MeanSquaredError


# read in explicit MovieLens 100K data
df = read_movielens_df()

# store data as ``ExplicitInteractions``
interactions = ExplicitInteractions(users=df['user_id'],
                                    items=df['item_id'],
                                    ratings=df['rating'])

# perform a data split
train, val = stratified_split(interactions)

# train an explicit ``MatrixFactorization`` model
model = MatrixFactorizationModel(train=train,
                                 val=val,
                                 embedding_dim=10,
                                 lr=1e-2,
                                 loss='mse',
                                 optimizer='adam')
trainer = CollieTrainer(model, max_epochs=10)
trainer.fit(model)
model.eval()

# evaluate the model
mae_score, mse_score = explicit_evaluate_in_batches(metric_list=[MeanAbsoluteError(),
                                                                 MeanSquaredError()],
                                                    test_interactions=val,
                                                    model=model)

print(f'MAE: {mae_score}')
print(f'MSE: {mse_score}')
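Because Collie models are PyTorch modules under the hood, you can also score arbitrary user-item pairs once training is done. A minimal sketch, assuming the model's forward pass accepts tensors of user and item IDs:

import torch

# hedged sketch: assumes ``model(users, items)`` returns predicted ratings
users = torch.tensor([0, 0, 1])
items = torch.tensor([10, 25, 10])
with torch.no_grad():
    predicted_ratings = model(users, items)
print(predicted_ratings)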

Comparison With Other Open-Source Recommendation Libraries

Collie is compared with Surprise, LightFM, FastAI, Spotlight, RecBole, and TensorFlow Recommenders across the following aspects:

• Implicit data support, for when we only know whether a user interacted with an item, not the explicit rating the user gave it
• Explicit data support, for when we know the explicit rating the user gave the item
• Support for side data incorporated directly into the models
• A flexible framework for new model architectures and experimentation
• A deep learning library, utilizing GPU speed-ups and able to implement new, cutting-edge deep learning algorithms
• Automatic support for multi-GPU training
• Active support and maintenance
• Type annotations for classes, methods, and functions
• Scalability to larger, out-of-memory datasets
• A model zoo with two or more model architectures implemented
• Implicit loss functions for training and metric functions for model evaluation
• Adaptive loss functions for multiple negative examples
• Loss functions with partial credit for side data
The following table shows the results of an experiment training and evaluating recommendation models from several popular implicit recommendation frameworks on the common MovieLens 10M dataset. The data was split via a 90/5/5 stratified split, and each model was trained for a maximum of 40 epochs using an embedding dimension of 32. Each model used default hyperparameters unless otherwise noted below.

Model                                    MAP@10 Score   Notes
Randomly initialized, untrained model    0.0001
Logistic MF                              0.0128         Using the CUDA implementation.
LightFM with BPR Loss                    0.0180
ALS                                      0.0189         Using the CUDA implementation.
BPR                                      0.0301         Using the CUDA implementation.
Spotlight                                0.0376         Using adaptive hinge loss.
LightFM with WARP Loss                   0.0412
Collie MatrixFactorizationModel          0.0425         Using a separate SGD bias optimizer.
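A 90/5/5 split like the one above can be produced with Collie's stratified_split; a sketch, where the val_p and test_p keyword names and the three-way return are assumptions about the API that may differ from the actual signature:

# hedged sketch: keyword names are assumptions about the API
train, val, test = stratified_split(interactions, val_p=0.05, test_p=0.05)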

At ShopRunner, we have found Collie models outperform comparable LightFM models with up to 64% improved MAP@10 scores.

Development

To run locally, begin by creating a data path environment variable:

# Define where on your local hard drive you want to store data. It is best if this
# location is not inside the repo itself. An example is below
export DATA_PATH=$HOME/data/collie

Run development from within the Docker container:

docker build -t collie .

# run the container in interactive mode, leaving port ``8888`` open for Jupyter
docker run \
    -it \
    --rm \
    -v "${DATA_PATH}:/collie/data/" \
    -v "${PWD}:/collie" \
    -p 8888:8888 \
    collie /bin/bash

Run on a GPU:

docker build -t collie .

# run the container in interactive mode, leaving port ``8888`` open for Jupyter
docker run \
    -it \
    --rm \
    --gpus all \
    -v "${DATA_PATH}:/collie/data/" \
    -v "${PWD}:/collie" \
    -p 8888:8888 \
    collie /bin/bash

Start JupyterLab

To run JupyterLab, start the container and execute the following:

jupyter lab --ip 0.0.0.0 --no-browser --allow-root

Connect to JupyterLab here: http://localhost:8888/lab

Unit Tests

Unit tests for this library should be run inside the Docker container:

# execute unit tests
pytest --cov-report term --cov=collie

Note that a handful of tests require the MovieLens 100K dataset to be downloaded (~5MB in size), meaning that either before or during test time, there will need to be an internet connection. This dataset only needs to be downloaded a single time for use in both unit tests and tutorials.

Docs

The Collie library supports Read the Docs documentation. To compile the docs locally:

cd docs
make html

# open local docs
open build/html/index.html
Comments
  • Add user metadata cont

    Describe the big picture - why are we doing this?

    To allow users to leverage user_metadata in the HybridModel and HybridPretrainedModel as requested in https://github.com/ShopRunner/collie/issues/21

    Any additional details to clarify code in the PR?

    How is this tested?

    Existing tests have been extended and additional tests have been written to make sure that the models are performing as expected.

    Pull Request Checklist

    • [X] Pull request includes a description of why we are doing this
    • [X] CHANGELOG has been updated
    • [X] Version in _version.py has been updated
    • [X] All tests in the tests folder pass with a local build
    • [ ] README has been updated (if applicable)
    • [X] Documentation in docs has been updated (if applicable)
    • [ ] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [X] Docker image can be built using docker build -t collie .

    Screenshots or GIFs?

    enhancement 
    opened by ahuds001 11
  • Including user metadata in hybrid model

    Hi,

collie_recs looks really great, so thank you for your hard work. I think it fills a very nice gap for lots of people 😄

    I was wondering if you had plans to incorporate user_metadata (in addition to item_metadata). If not, I'd be very happy to try and contribute.

If it were possible to include it, collie_recs would have functionality similar to the popular LightFM.
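For reference, a sketch of how item metadata is wired in today, which a user_metadata argument could mirror; the argument names here are assumptions, not a confirmed API:

from collie.model import HybridPretrainedModel

# hedged sketch of the item-side API that ``user_metadata`` would parallel;
# the exact signature may differ
hybrid_model = HybridPretrainedModel(train=train,
                                     val=val,
                                     item_metadata=item_metadata_df,
                                     trained_model=model)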

    enhancement 
    opened by lgpreston75 11
  • Item metadata issue

This PR fixes the issue described in https://github.com/ShopRunner/collie/issues/47, where models run successfully but generate null predictions if item_metadata contains nulls.

    Any additional details to clarify code in the PR?

I had to make changes to the Dockerfile in order to get the Docker image to build, but that code hasn't been included in this PR. The code can be seen here:

    # added these to deal with nvidia issues
    # GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease:
    # The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
    # solution from here: https://github.com/NVIDIA/nvidia-docker/issues/1631#issuecomment-1112828208
    RUN rm /etc/apt/sources.list.d/cuda.list
    RUN rm /etc/apt/sources.list.d/nvidia-ml.list
    RUN apt-key del 7fa2af80
    RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
    RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/7fa2af80.pub
    

    How is this tested?

Wrote two tests to ensure that item_metadata containing nulls now raises a ValueError: one for the HybridModel and one for the HybridPretrainedModel.

    Pull Request Checklist

    • [X] Pull request includes a description of why we are doing this
    • [X] CHANGELOG has been updated
    • [X] Version in _version.py has been updated
    • [X] All tests in the tests folder pass with a local build
    • [ ] README has been updated (if applicable)
    • [ ] Documentation in docs has been updated (if applicable)
    • [ ] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [x] Docker image can be built using docker build -t collie .
    bug 
    opened by ahuds001 9
  • Fix unit tests error, failures, and warnings

    Describe the big picture - why are we doing this?

PyTorch Lightning updated to version 1.4.0 and introduced some breaking changes that affected Collie unit tests. Those were solved by replacing max_steps -> max_epochs (which I believe is actually a bug on the PyTorch Lightning side of things - opening an issue about it in that repo soon) and by replacing max_epochs += X -> increase_max_epochs(X). Annoying, but ¯\_(ツ)_/¯ at least it works now 🎉

    Any additional details to clarify code in the PR?

    Any instances of max_epochs += X should be replaced. If I missed any, let me know!
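For downstream code, the replacement looks like this (a sketch assuming increase_max_epochs is a method on the Collie trainer, as the description above suggests):

# before (broke with PyTorch Lightning >= 1.4.0):
# trainer.max_epochs += 5

# after:
trainer.increase_max_epochs(5)
trainer.fit(model)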

    How is this tested?

    With many unit tests!

    Pull Request Checklist

    • [X] Pull request includes a description of why we are doing this
    • [X] CHANGELOG has been updated
    • [X] Version in _version.py has been updated
    • [X] All tests in the tests folder pass with a local build
    • [ ] README has been updated (if applicable)
    • [ ] Documentation in docs has been updated (if applicable)
    • [X] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [X] Docker image can be built using docker build -t collie .

    Screenshots or GIFs?

    opened by nathancooperjones 9
  • Fix small issues

    Describe the big picture - why are we doing this?

    This PR fixes many small issues including:

    • PyTorch Lightning Model Summary not working with latest version in CollieMinimalTrainer
    • Upgrade torch to 1.10
    • Before calling the model, check if the index is in-bound for get_item_predictions and item_item_similarity
    • Remove default num_workers for Interactions DataLoaders
• Better catch the common ValueError that happens in stratified splits, and add an option to force_split when users have only a single interaction
    • Change Callable type hints to reflect the actual syntax for these (e.g. Callable[[str, str], int])
    • Add in get_user_predictions and get_user_embeddings to BasePipeline and all model classes

    Any additional details to clarify code in the PR?

    How is this tested?

    The tests were run on an EC2 instance and all passed. The tutorial notebooks were re-run.

    A model was also fit using pytorch_lightning==1.4.9 to test backwards compatibility.

    Pull Request Checklist

    • [x] Pull request includes a description of why we are doing this
    • [x] CHANGELOG has been updated
    • [x] Version in _version.py has been updated
    • [x] All tests in the tests folder pass with a local build
    • [ ] README has been updated (if applicable)
    • [ ] Documentation in docs has been updated (if applicable)
    • [x] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [x] Docker image can be built using docker build -t collie .

    Screenshots or GIFs?

    My first collie and open source contribution!

    bug enhancement dependencies 
    opened by emmafgrimes 6
  • Question regarding production inference

    Hello,

    First of all thank you for the library, very useful! Second of all, I am sorry if this is a stupid question, I am getting my feet wet in recommender systems being more used to computer vision. Now to the question:

This is what my training dataset looks like before passing it to the Collie (implicit) Interactions dataset:

INDEX    BUYER_ID  PRODUCT_ID  PURCHASE_COUNT  PURCHASED
0        1         6620        24              1
1        1         14311       4               1
...      ...       ...         ...             ...
796861   84420     2098732     8               1

Unique buyer_ids [0:8676]: [1, 8, 9, 15, 19, 21, 25, 26, 27, 28, 30, 32, 33, 37, ...]
Unique product_ids [0:111122]: [6620, 14311, 56640, 56898, 77918, 527578, 767357, 794276, 798465, 867129, 1095150, 1112374, 1351118, 1404537, ...]

Again, this is before sending it to the Collie implicit Interactions dataset. Now that my model is trained, I want to get products similar to PRODUCT_ID 56640. Can I just use model.item_item_similarity(56640)? I have some doubts that I can use this directly, since looking through the code it takes row 56640 from the item embedding, which is indexed from 0 to the max item index. How can I tackle this?

The reason I ask is that when I run inference, the similar_items returned by item_item_similarity don't match any product_id.
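One way to tackle this (a sketch, not an official Collie utility): build an explicit mapping between raw PRODUCT_IDs and the contiguous indices the model was trained on, and translate in both directions around item_item_similarity. This assumes the IDs are factorized with pandas before building the Interactions dataset:

import pandas as pd

# map raw PRODUCT_IDs to contiguous indices before building ``Interactions``
codes, uniques = pd.factorize(df['PRODUCT_ID'])
df['item_id'] = codes
idx_to_product = dict(enumerate(uniques))
product_to_idx = {v: k for k, v in idx_to_product.items()}

# query by raw product ID and translate the results back; assumes
# ``item_item_similarity`` returns a pandas Series indexed by item index
similar = model.item_item_similarity(product_to_idx[56640])
similar_product_ids = [idx_to_product[i] for i in similar.index]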

    enhancement 
    opened by mhashas 5
  • [DS-3018] Add in multi-stage models

    Describe the big picture - why are we doing this?

    Okay... lots of changes here.

    As a bit of backstory, this PR started as a continuation of Hanna's hackweek, but she wasn't able to fully clean it up for a PR before she left. I took the branch over and, over the course of a straight 12 hour coding session, was able to clean it up for this PR. So here it is - a Hanna-Nate code mashup!

    The main changes here are the additions of the multi-stage models. These come in three files: one for establishing the base template class, one applying the template to a hybrid model, and another applying the template to a cold start model. Hanna initially had the cold start model for both users and items, but since we have another issue open to add user metadata to hybrid models (https://github.com/ShopRunner/collie_recs/issues/21), I decided to simplify the code a bit and just do item buckets for now.

    To make the multi-stage pipeline work with the existing CollieMinimalTrainer, I had to adjust how optimizers and learning rate schedulers work, which isn't a massive change.

    I also wanted to include bias terms into the hybrid model architectures, since Hanna did a big discussion on how important and beneficial they were to recommendations.

    So, with all that being said, here are the files I would look at:

    1. multi_stage_pipeline.py
    2. hybrid_pretrained_matrix_factorization.py
    3. hybrid_matrix_factorization.py
    4. cold_start_matrix_factorization.py
    5. Tutorial 06
    6. Everything else.

    Any additional details to clarify code in the PR?

    Let me know if you see any potential or obvious mistakes in here, I have code-writing fatigue and might be making mistakes I don't realize yet.

    How is this tested?

    With a lot of unit tests!

    Pull Request Checklist

    • [X] Pull request includes a description of why we are doing this
    • [X] CHANGELOG has been updated
    • [X] Version in _version.py has been updated
    • [X] All tests in the tests folder pass with a local build
    • [X] README has been updated (if applicable)
    • [X] Documentation in docs has been updated (if applicable)
    • [X] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [X] Docker image can be built using docker build -t collie_recs .

    Screenshots or GIFs?

    opened by nathancooperjones 5
  • [DS-3006] Add explicit data support

    Describe the big picture - why are we doing this?

    Closes https://github.com/ShopRunner/collie_recs/issues/10.

    Any additional details to clarify code in the PR?

    My hope was to introduce explicit data support with 1) no breaking changes to existing pipelines, 2) a minimal amount of API changes from the implicit counterpart, and 3) lots of data quality checks to reduce mistakes. If you see any improvements in here that I should add, please let me know!

    How is this tested?

    Pull Request Checklist

    • [x] Pull request includes a description of why we are doing this
    • [x] CHANGELOG has been updated
    • [x] Version in _version.py has been updated
    • [x] All tests in the tests folder pass with a local build
    • [x] README has been updated (if applicable)
    • [x] Documentation in docs has been updated (if applicable)
    • [x] requirements.txt and requirements-dev.txt have been recompiled (if applicable)

    Screenshots or GIFs?

    enhancement 
    opened by nathancooperjones 5
  • Hybrid model item_metadata with nulls

    Hi Collie team,

Love the work y'all are doing! I noticed when playing with the hybrid model that if I pass item_metadata containing nulls, the model trains and builds through all the stages. It generates metric scores as well, but it will only generate null predictions. This doesn't seem ideal; it may be better if one of these things happened:

    • model fails altogether with an error explaining that item_metadata columns cannot contain null values
    • model gives a warning and proceeds with some form of imputation on the columns with null values

I think the first option is the better one personally, but I'm sure y'all would know better than I would.
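Until something like that lands, a quick guard for your own pipeline (a sketch assuming item_metadata is a pandas DataFrame; this is not Collie code):

# fail fast on nulls before constructing the hybrid model
if item_metadata.isnull().any().any():
    raise ValueError('item_metadata columns cannot contain null values')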

    bug enhancement good first issue help wanted invalid 
    opened by ahuds001 4
  • Error when using stratified_split[BUG]

    Describe the bug

I'm using my own dataset based on MovieLens, and each time I try to split the data it gives me this error:

    ValueError: With n_samples=1, test_size=0.1 and train_size=None, the resulting train set will be empty. Adjust any of the aforementioned parameters.
    

To see if I was doing something else wrong, I ran the Quick Start exactly as it is; the only difference is that I used my own dataset, prepared with the following code:

    movielens_df = pd.read_csv('movielens-metadata/ratings_small.csv', delimiter=';')
    movielens_df.rename(columns={'userId':'user_id', 'movieId':'item_id'}, inplace=True)
    movielens_df['rating'] = movielens_df['rating'].transform(lambda x : float(x.replace(',','.')))
    df = movielens_df
    

But if I use read_movielens_df instead, it works perfectly. Also, this problem only occurs when using stratified_split; I tested random_split and it worked fine.
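The error suggests at least one user has only a single interaction, which stratified_split cannot place in both the train and test sets. A workaround sketch (plain pandas, applied before building the Interactions dataset):

# keep only users with at least two interactions so every user can
# appear on both sides of a stratified split
counts = df.groupby('user_id')['item_id'].transform('count')
df = df[counts >= 2].reset_index(drop=True)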

    Here is the dataset if you want to try it yourself ratings_small.csv

    bug invalid 
    opened by sgaseretto 4
  • added support for Adagrad optimizer

    Describe the big picture - why are we doing this?

Currently Collie supports multiple optimizers, but only one can be used with both sparse and dense layers. Adagrad is another recommended optimizer that can work with both sparse and dense layers.
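With this change, selecting Adagrad stays a one-line switch; a sketch mirroring the Quick Start (the string key 'adagrad' is what this PR adds):

model = MatrixFactorizationModel(train=train,
                                 val=val,
                                 embedding_dim=10,
                                 lr=1e-1,
                                 optimizer='adagrad')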

    Any additional details to clarify code in the PR?

    How is this tested?

pytest run within the Docker container

    Pull Request Checklist

    • [X] Pull request includes a description of why we are doing this
    • [X] CHANGELOG has been updated
    • [X] Version in _version.py has been updated
    • [X] All tests in the tests folder pass with a local build
    • [ ] README has been updated (if applicable)
    • [ ] Documentation in docs has been updated (if applicable)
    • [ ] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [X] Docker image can be built using docker build -t collie .

    Screenshots or GIFs?

    opened by Menna13 4
  • Bump wheel from 0.37.1 to 0.38.1

    Bumps wheel from 0.37.1 to 0.38.1.

    Changelog

    Sourced from wheel's changelog.

    Release Notes

    UNRELEASED

    • Updated vendored packaging to 22.0

    0.38.4 (2022-11-09)

    • Fixed PKG-INFO conversion in bdist_wheel mangling UTF-8 header values in METADATA (PR by Anderson Bravalheri)

    0.38.3 (2022-11-08)

    • Fixed install failure when used with --no-binary, reported on Ubuntu 20.04, by removing setup_requires from setup.cfg

    0.38.2 (2022-11-05)

    • Fixed regression introduced in v0.38.1 which broke parsing of wheel file names with multiple platform tags

    0.38.1 (2022-11-04)

    • Removed install dependency on setuptools
    • The future-proof fix in 0.36.0 for converting PyPy's SOABI into a abi tag was faulty. Fixed so that future changes in the SOABI will not change the tag.

    0.38.0 (2022-10-21)

    • Dropped support for Python < 3.7
    • Updated vendored packaging to 21.3
    • Replaced all uses of distutils with setuptools
    • The handling of license_files (including glob patterns and default values) is now delegated to setuptools>=57.0.0 (#466). The package dependencies were updated to reflect this change.
    • Fixed potential DoS attack via the WHEEL_INFO_RE regular expression
    • Fixed ValueError: ZIP does not support timestamps before 1980 when using SOURCE_DATE_EPOCH=0 or when on-disk timestamps are earlier than 1980-01-01. Such timestamps are now changed to the minimum value before packaging.

    0.37.1 (2021-12-22)

    • Fixed wheel pack duplicating the WHEEL contents when the build number has changed (#415)
    • Fixed parsing of file names containing commas in RECORD (PR by Hood Chatham)

    0.37.0 (2021-08-09)

    • Added official Python 3.10 support
    • Updated vendored packaging library to v20.9

    ... (truncated)

    Commits
    • 6f1608d Created a new release
    • cf8f5ef Moved news item from PR #484 to its proper place
    • 9ec2016 Removed install dependency on setuptools (#483)
    • 747e1f6 Fixed PyPy SOABI parsing (#484)
    • 7627548 [pre-commit.ci] pre-commit autoupdate (#480)
    • 7b9e8e1 Test on Python 3.11 final
    • a04dfef Updated the pypi-publish action
    • 94bb62c Fixed docs not building due to code style changes
    • d635664 Updated the codecov action to the latest version
    • fcb94cd Updated version to match the release
    • Additional commits viewable in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • Bump certifi from 2021.10.8 to 2022.12.7

    Bumps certifi from 2021.10.8 to 2022.12.7.


    dependencies 
    opened by dependabot[bot] 0
  • Time-aware (sequential) recommendations

    PR is TBD. Will update this description when the PR is ready for review.

    Describe the big picture - why are we doing this?

    Any additional details to clarify code in the PR?

    How is this tested?

    Pull Request Checklist

    • [ ] Pull request includes a description of why we are doing this
    • [ ] CHANGELOG has been updated
    • [ ] Version in _version.py has been updated
    • [ ] All tests in the tests folder pass with a local build
    • [ ] README has been updated (if applicable)
    • [ ] Documentation in docs has been updated (if applicable)
    • [ ] requirements.txt and requirements-dev.txt have been recompiled (if applicable)
    • [ ] Docker image can be built using docker build -t collie .

    Screenshots or GIFs?

    opened by nathancooperjones 1
  • Bump joblib from 1.1.0 to 1.2.0

    Bumps joblib from 1.1.0 to 1.2.0.

    Changelog

    Sourced from joblib's changelog.

    Release 1.2.0

    • Fix a security issue where eval(pre_dispatch) could potentially run arbitrary code. Now only basic numerics are supported. joblib/joblib#1327

    • Make sure that joblib works even when multiprocessing is not available, for instance with Pyodide joblib/joblib#1256

    • Avoid unnecessary warnings when workers and main process delete the temporary memmap folder contents concurrently. joblib/joblib#1263

    • Fix memory alignment bug for pickles containing numpy arrays. This is especially important when loading the pickle with mmap_mode != None as the resulting numpy.memmap object would not be able to correct the misalignment without performing a memory copy. This bug would cause invalid computation and segmentation faults with native code that would directly access the underlying data buffer of a numpy array, for instance C/C++/Cython code compiled with older GCC versions or some old OpenBLAS written in platform specific assembly. joblib/joblib#1254

    • Vendor cloudpickle 2.2.0 which adds support for PyPy 3.8+.

    • Vendor loky 3.3.0 which fixes several bugs including:

      • robustly forcibly terminating worker processes in case of a crash (joblib/joblib#1269);

      • avoiding leaking worker processes in case of nested loky parallel calls;

• reliably spawning the correct number of reusable workers.

    Commits
    • 5991350 Release 1.2.0
    • 3fa2188 MAINT cleanup numpy warnings related to np.matrix in tests (#1340)
    • cea26ff CI test the future loky-3.3.0 branch (#1338)
    • 8aca6f4 MAINT: remove pytest.warns(None) warnings in pytest 7 (#1264)
    • 067ed4f XFAIL test_child_raises_parent_exits_cleanly with multiprocessing (#1339)
    • ac4ebd5 MAINT add back pytest warnings plugin (#1337)
    • a23427d Test child raises parent exits cleanly more reliable on macos (#1335)
    • ac09691 [MAINT] various test updates (#1334)
    • 4a314b1 Vendor loky 3.2.0 (#1333)
    • bdf47e9 Make test_parallel_with_interactively_defined_functions_default_backend timeo...
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 0
  • Bump protobuf from 3.19.1 to 3.19.5

    Bumps protobuf from 3.19.1 to 3.19.5.

    Release notes

    Sourced from protobuf's releases.

    Protocol Buffers v3.19.5

    C++

    Protocol Buffers v3.19.4

    Python

    • Make libprotobuf symbols local on OSX to fix issue #9395 (#9435)

    Ruby

    • Fixed a data loss bug that could occur when the number of optional fields in a message is an exact multiple of 32. (#9440).

    PHP

    • Fixed a data loss bug that could occur when the number of optional fields in a message is an exact multiple of 32. (#9440).

    Protocol Buffers v3.19.3

    Python

    • Fix missing Windows wheel for Python 3.10 on PyPI

    Protocol Buffers v3.19.2

    Java


    dependencies 
    opened by dependabot[bot] 0
  • Bump oauthlib from 3.1.1 to 3.2.1

    Bumps oauthlib from 3.1.1 to 3.2.1.

    Release notes

    Sourced from oauthlib's releases.

    3.2.1

    In short

    OAuth2.0 Provider:

    • #803 : Metadata endpoint support of non-HTTPS
    • CVE-2022-36087

    OAuth1.0:

    • #818 : Allow IPv6 being parsed by signature

    General:

    • Improved and fixed documentation warnings.
    • Cosmetic changes based on isort

    What's Changed

    New Contributors

    Full Changelog: https://github.com/oauthlib/oauthlib/compare/v3.2.0...v3.2.1

    3.2.0

    Changelog

    OAuth2.0 Client:

    • #795: Add Device Authorization Flow for Web Application
    • #786: Add PKCE support for Client
    • #783: Fallback to none in case of wrong expires_at format.

    OAuth2.0 Provider:

    • #790: Add support for CORS to metadata endpoint.
    • #791: Add support for CORS to token endpoint.
    • #787: Remove comma after Bearer in WWW-Authenticate

    OAuth2.0 Provider - OIDC:

    • #755: Call save_token in Hybrid code flow
    • #751: OIDC add support of refreshing ID Tokens with refresh_id_token
    • #751: The RefreshTokenGrant modifiers now take the same arguments as the AuthorizationCodeGrant modifiers (token, token_handler, request).

    ... (truncated)

    Changelog

    Sourced from oauthlib's changelog.

    3.2.1 (2022-09-09)

    OAuth2.0 Provider:

    • #803: Metadata endpoint support of non-HTTPS
    • CVE-2022-36087

    OAuth1.0:

    • #818: Allow IPv6 being parsed by signature

    General:

    • Improved and fixed documentation warnings.
    • Cosmetic changes based on isort

    3.2.0 (2022-01-29)

    OAuth2.0 Client:

    • #795: Add Device Authorization Flow for Web Application
    • #786: Add PKCE support for Client
    • #783: Fallback to none in case of wrong expires_at format.

    OAuth2.0 Provider:

    • #790: Add support for CORS to metadata endpoint.
    • #791: Add support for CORS to token endpoint.
    • #787: Remove comma after Bearer in WWW-Authenticate

    OAuth2.0 Provider - OIDC:

    • #755: Call save_token in Hybrid code flow
    • #751: OIDC add support of refreshing ID Tokens with refresh_id_token
    • #751: The RefreshTokenGrant modifiers now take the same arguments as the AuthorizationCodeGrant modifiers (token, token_handler, request).

    General:

    • Added Python 3.9, 3.10, 3.11
    • Improve Travis & Coverage
    Commits
    • 88bb156 Updated date and authors
    • 1a45d97 Prepare 3.2.1 release
    • 0adbbe1 docs: fix typos
    • 6569ec3 docs: Fix a few typos
    • bdc486e Fixed isort imports
    • 7db45bd Fix typo in server.rst
    • b14ad85 chore: s/bode_code_verifier/body_code_verifier/g
    • b123283 Allow non-HTTPS issuer when OAUTHLIB_INSECURE_TRANSPORT. (#803)
    • 2f887b5 Docs: fix Sphinx warnings for better ReadTheDocs generation (#807)
    • d4bafd9 Merge pull request #797 from cclauss/patch-2
    • Additional commits viewable in compare view


    dependencies 
    opened by dependabot[bot] 0
Code for Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022)

Private Recommender Systems: How Can Users Build Their Own Fair Recommender Systems without Log Data? (SDM 2022) We consider how a user of a web servi

joisino 20 Aug 21, 2022
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.

NVIDIA Merlin NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA’s GPUs. It enables data scientists, machine

null 419 Jan 3, 2023
Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker This repository contai

Nikita 12 Dec 14, 2022
Hybrid CenterNet - Hybrid-supervised object detection / Weakly semi-supervised object detection

Hybrid-Supervised Object Detection System Object detection system trained by hybrid-supervision/weakly semi-supervision (HSOD/WSSOD): This project is

null 5 Dec 10, 2022
High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.

TL;DR Ignite is a high-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently. Click on the image to

null 4.2k Jan 1, 2023
An efficient PyTorch implementation of the evaluation metrics in recommender systems.

recsys_metrics An efficient PyTorch implementation of the evaluation metrics in recommender systems. Overview • Installation • How to use • Benchmark

Xingdong Zuo 12 Dec 2, 2022
StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking

StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking Datasets You can download datasets that have been pre-pr

null 25 May 29, 2022
PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

Ilya Kostrikov 3k Dec 31, 2022
A Comparative Framework for Multimodal Recommender Systems

Cornac Cornac is a comparative framework for multimodal recommender systems. It focuses on making it convenient to work with models leveraging auxilia

Preferred.AI 671 Jan 3, 2023
Open-sourcing the Slates Dataset for recommender systems research

FINN.no Recommender Systems Slate Dataset This repository accompany the paper "Dynamic Slate Recommendation with Gated Recurrent Units and Thompson Sa

FINN.no 48 Nov 28, 2022
PyTorch implementation of paper: HPNet: Deep Primitive Segmentation Using Hybrid Representations.

HPNet This repository contains the PyTorch implementation of paper: HPNet: Deep Primitive Segmentation Using Hybrid Representations. Installation The

Siming Yan 42 Dec 7, 2022
ColossalAI-Examples - Examples of training models with hybrid parallelism using ColossalAI

ColossalAI-Examples This repository contains examples of training models with Co

HPC-AI Tech 185 Jan 9, 2023
Tensorflow 2 implementation of the paper: Learning and Evaluating Representations for Deep One-class Classification published at ICLR 2021

Deep Representation One-class Classification (DROC). This is not an officially supported Google product. Tensorflow 2 implementation of the paper: Lea

Google Research 137 Dec 23, 2022
torchlm aims to build a high level pipeline for face landmarks detection; it supports training, evaluating, exporting, inference (Python/C++) and 100+ data augmentations

A high level pipeline for face landmarks detection, supports training, evaluating, exporting, inference and 100+ data augmentations, compatible with torchvision and albumentations, can easily install with pip.

DefTruth 142 Dec 25, 2022
Scalable, event-driven, deep-learning-friendly backtesting library

...Minimizing the mean square error on future experience. - Richard S. Sutton BTGym Scalable event-driven RL-friendly backtesting library. Build on

Andrew 922 Dec 27, 2022
🤖 A Python library for learning and evaluating knowledge graph embeddings

PyKEEN PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-m

PyKEEN 1.1k Jan 9, 2023
Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology

Official repository for the ICLR 2021 paper Evaluating the Disentanglement of Deep Generative Models with Manifold Topology Sharon Zhou, Eric Zelikman

Stanford Machine Learning Group 34 Nov 16, 2022