Microsoft Distributed Machine Learning Toolkit

Overview

DMTK

Distributed Machine Learning Toolkit https://www.dmtk.io Please open issues in the project below. For any technical support email to [email protected]

DMTK includes the following projects:

  • DMTK framework(Multiverso): The parameter server framework for distributed machine learning.
  • LightLDA: Scalable, fast and lightweight system for large-scale topic modeling.
  • LightGBM: LightGBM is a fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
  • Distributed word embedding: Distributed algorithm for word embedding implemented on multiverso.

Updates

2017-02-04

  • A tutorial on the latests updates of Distributed Machine Learning is presented on AAAI 2017. you can download the slides here.

2016-11-21

  • Multiverso has been officially used in Microsoft CNTK to power its ASGD parallel training.

2016-10-17

  • LightGBM has been released. which is a fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

2016-09-12

  • A talk on the latest updates of DMTK is presented on GTC China. We also described the latest research work from our team, including the lightRNN(to be appeared in NIPS2016) and DC-ASGD.

2016-07-05

  • Multiverso has been upgrade to new API.Overview
  • Deep learning framework (torch/theano) support has been added.
  • Python/Lua bidding has been supported, you can using multiverso with Python/Lua.

Microsoft Open Source Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

You might also like...
Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

Microsoft contributing libraries, tools, recipes, sample codes and workshop contents for machine learning & deep learning.

XGBoost-Ray is a distributed backend for XGBoost, built on top of distributed computing framework Ray.

XGBoost-Ray is a distributed backend for XGBoost, built on top of distributed computing framework Ray.

Distributed-systems-algos - Distributed Systems Algorithms For Python

Distributed Systems Algorithms ISIS algorithm In an asynchronous system that kee

Microsoft Machine Learning for Apache Spark
Microsoft Machine Learning for Apache Spark

Microsoft Machine Learning for Apache Spark MMLSpark is an ecosystem of tools aimed towards expanding the distributed computing framework Apache Spark

Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning
Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning

Azure Cloud Advocates at Microsoft are pleased to offer a 12-week, 24-lesson curriculum all about Machine Learning

Causal Inference and Machine Learning in Practice with EconML and CausalML: Industrial Use Cases at Microsoft, TripAdvisor, Uber

Causal Inference and Machine Learning in Practice with EconML and CausalML: Industrial Use Cases at Microsoft, TripAdvisor, Uber

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)

English | 简体中文 Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)

English | 简体中文 Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.

Light Gradient Boosting Machine LightGBM is a gradient boosting framework that uses tree based learning algorithms. It is designed to be distributed a

Distributed machine learning platform

Veles Distributed platform for rapid Deep learning application development Consists of: Platform - https://github.com/Samsung/veles Znicz Plugin - Neu

Framework and Library for Distributed Online Machine Learning

Jubatus The Jubatus library is an online machine learning framework which runs in distributed environment. See http://jubat.us/ for details. Quick Sta

🎛 Distributed machine learning made simple.
🎛 Distributed machine learning made simple.

🎛 lazycluster Distributed machine learning made simple. Use your preferred distributed ML framework like a lazy engineer. Getting Started • Highlight

Management of exclusive GPU access for distributed machine learning workloads
Management of exclusive GPU access for distributed machine learning workloads

TensorHive is an open source tool for managing computing resources used by multiple users across distributed hosts. It focuses on granting

Bark Toolkit is a toolkit wich provides Denial-of-service attacks, SMS attacks and more.
Bark Toolkit is a toolkit wich provides Denial-of-service attacks, SMS attacks and more.

Bark Toolkit About Bark Toolkit Bark Toolkit is a set of tools that provides denial of service attacks. Bark Toolkit includes SMS attack tool, HTTP

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library,  for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

eXtreme Gradient Boosting Community | Documentation | Resources | Contributors | Release Notes XGBoost is an optimized distributed gradient boosting l

Comments
  • ML functions

    ML functions

    Hi, Thanks for the great open source library. I was wondering if there are popular machine learning methods in DMTK that I can try such as random forest, SVM, KNN, PLS, NN, Tree Bagging, etc.?

    opened by jaelim 3
  • lightlda zeromq msg type error

    lightlda zeromq msg type error

    Hi, i install lightlda on CentOS6 when i run ./example/nytimes.sh, it report: [INFO] [2015-11-26 18:26:59] INFO: block = 0, the number of slice = 1 [INFO] [2015-11-26 18:26:59] Server 0 starts: num_workers=1 endpoint=inproc://server [INFO] [2015-11-26 18:26:59] Server 0: Worker registratrion completed: workers=1 trainers=1 servers=1 [INFO] [2015-11-26 18:26:59] Rank 0/1: Multiverso initialized successfully. [INFO] [2015-11-26 18:27:00] Rank 0/1: Begin of configuration and initialization. Assertion failed: check () (src/msg.cpp:248) Aborted I checked that the error seems derives from multiverso/third_party/zeromq-4.1.3/src/msg.cpp hope for help

    opened by psy2013GitHub 0
  • dmtk.io HTTPS not forced, certificate expired

    dmtk.io HTTPS not forced, certificate expired

    https://www.dmtk.io/

    Websites prove their identity via certificates, which are valid for a set time period. The certificate for www.dmtk.io expired on 10/19/2017.
     
    Error code: SEC_ERROR_EXPIRED_CERTIFICATE
    

    http://www.dmtk.io/ does not redirect/enforce to HTTPS

    opened by wesinator 0
Owner
Microsoft
Open source projects and samples from Microsoft
Microsoft
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)

English | 简体中文 Welcome to the PaddlePaddle GitHub. PaddlePaddle, as the only independent R&D deep learning platform in China, has been officially open

null 19.4k Dec 30, 2022
Distributed machine learning platform

Veles Distributed platform for rapid Deep learning application development Consists of: Platform - https://github.com/Samsung/veles Znicz Plugin - Neu

Samsung 897 Dec 5, 2022
Framework and Library for Distributed Online Machine Learning

Jubatus The Jubatus library is an online machine learning framework which runs in distributed environment. See http://jubat.us/ for details. Quick Sta

Jubatus 701 Nov 29, 2022
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

Ray provides a simple, universal API for building distributed applications. Ray is packaged with the following libraries for accelerating machine lear

null 23.2k Dec 30, 2022
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Horovod Horovod is a distributed deep learning training framework for TensorFlow, Keras, PyTorch, and Apache MXNet. The goal of Horovod is to make dis

Horovod 12.9k Dec 29, 2022
Ray provides a simple, universal API for building distributed applications.

An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

null 23.5k Jan 5, 2023
Distributed Synchronization for Python

Distributed Synchronization for Python Tutti is a nearly drop-in replacement for python's built-in synchronization primitives that lets you fearlessly

Hamilton Kibbe 4 Jul 7, 2022
A lightweight python module for building event driven distributed systems

Eventify A lightweight python module for building event driven distributed systems. Installation pip install eventify Problem Developers need a easy a

Eventify 16 Aug 18, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17k Feb 11, 2021