# RecoEdge: Bringing Recommendations to the Edge
A one-stop solution to build your recommendation models, train them, and deploy them in a privacy-preserving manner -- right on the users' devices.

RecoEdge integrates the phenomenal work by OpenMined and FedML so you can easily explore new federated learning algorithms and deploy them into production.
The steps to building an awesome recommendation system:

- 🔩 **Standard ML training:** Pick up any ML model and benchmark it using `BaseTrainer`
- 🎮 **Federated Learning Simulation:** Once you are satisfied with your model, explore a host of FL algorithms with `FederatedWorker`
- 🏭 **Industrial Deployment:** After all the testing and simulation, deploy easily using PySyft from OpenMined
- 🚀 **Edge Computing:** Integrate with NimbleEdge to improve FL training times by over 100x
## QuickStart
Let's train Facebook AI's DLRM on the edge. DLRM is a standard baseline for neural-network-based recommendation models.
Clone this repo and change the `datafile` argument in `configs/dlrm.yml` to point to your local copy of the Criteo dataset.
```bash
git clone https://github.com/NimbleEdge/RecoEdge
```
```yaml
model:
  name: 'dlrm'
  ...
preproc:
  datafile: "<Path to Criteo>/criteo/train.txt"
```
Install the dependencies with conda or pip:

```bash
conda env create --name recoedge --file environment.yml
conda activate recoedge
```
Run data preprocessing with `preprocess_data.py` and supply the config file. This generates a per-day split of the entire dataset as well as a processed data file:

```bash
python preprocess_data.py --config configs/dlrm.yml --logdir $HOME/logs/kaggle_criteo/exp_1
```
Begin training:

```bash
python train.py --config configs/dlrm.yml --logdir $HOME/logs/kaggle_criteo/exp_3 --num_eval_batches 1000 --devices 0
```
Run TensorBoard to view training loss and validation metrics at `localhost:8888`:

```bash
tensorboard --logdir $HOME/logs/kaggle_criteo --port 8888
```
## Federated Training
This section is still a work in progress. Reach out to us directly if you need help with FL deployment.
Now we will simulate DLRM in a federated setting. Create a data split to mimic your users. We use Dirichlet sampling to create non-IID datasets for the model.
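For intuition, here is a minimal sketch of Dirichlet-based non-IID partitioning (not the library's own implementation; `alpha`, `num_clients`, and the helper name are illustrative assumptions):

```python
import numpy as np

def dirichlet_split(labels, num_clients, alpha=0.5, seed=0):
    """Partition sample indices into non-IID client shards.

    For every class, its samples are divided among clients according to
    proportions drawn from a Dirichlet(alpha) distribution; a smaller
    alpha yields more skewed (less IID) client datasets.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Cut points splitting this class's samples between clients
        cuts = (np.cumsum(proportions)[:-1] * len(cls_idx)).astype(int)
        for client, shard in enumerate(np.split(cls_idx, cuts)):
            client_indices[client].extend(shard.tolist())
    return [np.array(idx) for idx in client_indices]

# Example: split a toy binary-label dataset across 10 simulated users
shards = dirichlet_split(np.random.randint(0, 2, size=1000), num_clients=10)
```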
Adjust the parameters for distributed training, such as the MPI settings, in the config file:
```yaml
communications:
  gpu_map:
    host1: [0, 2]
    host2: [1, 0, 1]
    host3: [1, 1, 0, 1]
    host4: [0, 1, 0, 0, 0, 1, 0, 2]
```
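As a rough mental model (an assumption, not documented behaviour: each list entry is read as the number of worker processes placed on the corresponding GPU index of that host), the map above can be expanded into a per-rank device assignment like this:

```python
# Hypothetical helper: expand a gpu_map into a (host, gpu) assignment per MPI rank,
# assuming gpu_map[host][i] is the number of processes to place on GPU i of that host.
gpu_map = {
    "host1": [0, 2],
    "host2": [1, 0, 1],
    "host3": [1, 1, 0, 1],
    "host4": [0, 1, 0, 0, 0, 1, 0, 2],
}

def expand_gpu_map(gpu_map):
    assignment = []  # position in this list corresponds to the MPI rank
    for host, procs_per_gpu in gpu_map.items():
        for gpu_id, num_procs in enumerate(procs_per_gpu):
            assignment.extend((host, gpu_id) for _ in range(num_procs))
    return assignment

for rank, (host, gpu_id) in enumerate(expand_gpu_map(gpu_map)):
    print(f"rank {rank}: {host} cuda:{gpu_id}")
```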
Implement your own federated learning algorithm. In the demo we use Federated Averaging; you just need to subclass `FederatedWorker` and implement the `run()` method.
```python
import logging

import numpy as np

# registry, FederatedWorker, InvalidStateError, and RandomContext come from
# the RecoEdge (fedrec) codebase; their imports are omitted here.


@registry.load('fl_algo', 'fed_avg')
class FedAvgWorker(FederatedWorker):
    def __init__(self, ...):
        super().__init__(...)

    async def run(self):
        '''
        `run()` updates the local model.
        Implement this method to determine how the roles interact with each
        other to produce the final updated model. For example, a worker which
        has both the `aggregator` and `trainer` roles might first train
        locally and then run a discounted `aggregate()` to get the final
        updated model.

        In the following example,
        1. The aggregator requests models from the trainers before
           aggregating and updating its model.
        2. A trainer responds to aggregators' requests after updating its
           own model by local training.

        Since standard FL forces updates from the central entity before each
        cycle, trainers always start from the global (aggregator's) model.
        '''
        for role in self.roles:
            if role == 'aggregator':
                neighbours = await self.request_models_suspendable(
                    self.sample_neighbours())
                weighted_params = self.aggregate(neighbours)
                self.update_model(weighted_params)
            elif role == 'trainer':
                # central server in this case
                aggregators = list(self.out_neighbours.values())
                global_models = await self.request_models_suspendable(aggregators)
                self.update_model(global_models[0])
                await self.train(model_dir=self.persistent_storage)
            else:
                raise InvalidStateError("unknown role for worker")
        self.round_idx += 1

    # Your aggregation strategy
    def aggregate(self, neighbour_ids):
        model_list = [
            (self.in_neighbours[id].sample_num, self.in_neighbours[id].model)
            for id in neighbour_ids
        ]
        # Each neighbour is weighted by its share of the total training samples.
        training_num = sum(sample_num for sample_num, _ in model_list)

        (num0, averaged_params) = model_list[0]
        for k in averaged_params.keys():
            for i in range(len(model_list)):
                local_sample_number, local_model_params = model_list[i]
                w = local_sample_number / training_num
                if i == 0:
                    averaged_params[k] = local_model_params[k] * w
                else:
                    averaged_params[k] += local_model_params[k] * w
        return averaged_params

    # Your sampling strategy
    def sample_neighbours(self, round_idx, client_num_per_round):
        num_neighbours = len(self.in_neighbours)
        if num_neighbours == client_num_per_round:
            selected_neighbours = list(self.in_neighbours)
        else:
            with RandomContext(round_idx):
                selected_neighbours = np.random.choice(
                    list(self.in_neighbours),
                    min(client_num_per_round, num_neighbours),
                    replace=False)
        logging.info("worker_indexes = %s" % str(selected_neighbours))
        return selected_neighbours
```
Begin the FL simulation with:

```bash
mpirun -np 20 python -m mpi4py.futures train_fl.py --num_workers 1000
```
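Here `-np 20` sets the number of MPI processes; presumably they take turns simulating the `--num_workers 1000` federated clients, so you can tune `-np` to the cores available on your machine.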
## Deploy with PySyft

## Customization

### Training Configuration
There are two ways to adjust training hyper-parameters:
- Set values in `configs/*.yml`: persistent settings that are necessary for reproducibility, e.g. the randomization seed.
- Pass them as CLI arguments: good for non-persistent and dynamic settings like the GPU device.

In case of conflict, the CLI argument supersedes the config file parameter. For further reference, check out the training config flags.
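To illustrate that precedence rule, here is a minimal sketch of layering CLI values over the YAML config (not the project's actual argument parsing; the flag names are assumptions):

```python
import argparse

import yaml

def load_config(path, cli_overrides):
    """Load the YAML config, then let non-None CLI values supersede it."""
    with open(path) as f:
        config = yaml.safe_load(f)
    for key, value in cli_overrides.items():
        if value is not None:
            config[key] = value
    return config

parser = argparse.ArgumentParser()
parser.add_argument("--config", required=True)                       # persistent settings live here
parser.add_argument("--devices", type=int, nargs="+", default=None)  # dynamic, per-run override
args = parser.parse_args()

config = load_config(args.config, {"devices": args.devices})
```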
### Model Architecture

#### Adjusting DLRM model params
Any parameter needed to instantiate the PyTorch module can be supplied by simply creating a key-value pair in the config file. For example, DLRM requires `arch_sparse_feature_size`, `arch_mlp_bot`, etc.
```yaml
model:
  name: 'dlrm'
  arch_sparse_feature_size: 16
  arch_mlp_bot: [13, 512, 256, 64]
  arch_mlp_top: [367, 256, 1]
  arch_interaction_op: "dot"
  arch_interaction_itself: False
  sigmoid_bot: "relu"
  sigmoid_top: "sigmoid"
  loss_function: "mse"
```
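One way to picture how those key-value pairs reach the model (a sketch assuming the registry simply forwards the remaining `model` keys as constructor keyword arguments; `MODEL_REGISTRY` is a hypothetical stand-in for the library's registry lookup):

```python
import yaml

with open("configs/dlrm.yml") as f:
    config = yaml.safe_load(f)

model_cfg = dict(config["model"])
name = model_cfg.pop("name")        # selects the registered class, e.g. 'dlrm'

# Hypothetical lookup: the remaining keys become constructor kwargs, e.g.
# DLRM(arch_sparse_feature_size=16, arch_mlp_bot=[13, 512, 256, 64], ...)
ModelClass = MODEL_REGISTRY[name]
model = ModelClass(**model_cfg)
```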
### Adding new models
Model architecture can only be changed via the `configs/*.yml` files. Every model declaration is tagged with an appropriate name and loaded into the registry.
```python
import torch

@registry.load('model', '<model_name>')
class My_Model(torch.nn.Module):
    def __init__(self, num):
        ...
```
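If you are curious how such a decorator can work, here is a toy registry (an illustrative sketch, not `fedrec`'s actual implementation) that stores classes under a `(kind, name)` key so that config files can refer to them by name:

```python
# Toy registry: @load('model', 'my_model') records the class so it can later
# be instantiated from the name given in the config file.
_REGISTRY = {}

def load(kind, name):
    def decorator(cls):
        _REGISTRY[(kind, name)] = cls
        return cls
    return decorator

def lookup(kind, name):
    return _REGISTRY[(kind, name)]
```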
You can define your own modules and add them under `fedrec/modules`. Finally, set the `name` flag of the `model` tag in the config file:
```yaml
model:
  name: "<model name>"
```
## Contribute
- Star, fork, and clone the repo.
- Do your work.
- Push to your fork.
- Submit a PR to NimbleEdge/RecoEdge
We welcome you to join our Discord for queries related to the library and contributions in general.