OneFlow is a performance-centered and open-source deep learning framework.

Overview

OneFlow


Latest News

  • Version 0.5.0 is out!
    • First-class support for eager execution. The deprecated APIs have moved to oneflow.compatible.single_client.
    • Drop-in replacement of import torch for existing PyTorch projects. You can test it by interchanging import oneflow as torch and import torch as flow.
    • Full changelog
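
The drop-in idea above can be tried without editing every import by choosing the backend module at run time. The sketch below is only an illustration: the helper name load_backend is hypothetical, and it assumes the two packages expose the same calls for the code you run afterwards.

```python
import importlib
from types import ModuleType
from typing import Sequence


def load_backend(candidates: Sequence[str]) -> ModuleType:
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # not installed, try the next candidate
    raise ImportError(f"none of {list(candidates)} could be imported")


# Hypothetical usage: prefer OneFlow, fall back to PyTorch.
# torch = load_backend(["oneflow", "torch"])
# x = torch.ones(2, 3)
```

Swapping the order of the candidate list is then enough to compare both frameworks on the same script.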

Install OneFlow

System Requirements

  • Python 3.6, 3.7, 3.8, 3.9

  • (Highly recommended) Upgrade pip

    python3 -m pip install --upgrade pip #--user
    
  • CUDA Toolkit Linux x86_64 Driver

    • The CUDA runtime is statically linked into OneFlow, so OneFlow works with the minimum supported driver version and any newer driver. For more information, please refer to the CUDA compatibility documentation.

    • Please upgrade your NVIDIA driver to version 440.33 or above and install OneFlow for CUDA 10.2 if possible.

Install with Pip Package

  • To install the latest stable release of OneFlow with CUDA support:

    python3 -m pip install -f https://release.oneflow.info oneflow==0.5.0+cu102
  • To install the nightly release of OneFlow with CUDA support:

    python3 -m pip install oneflow -f https://staging.oneflow.info/branch/master/cu102
  • To install other available builds for different variants:

    • Stable
      python3 -m pip install --find-links https://release.oneflow.info oneflow==0.5.0+[PLATFORM]
    • Nightly
      python3 -m pip install oneflow -f https://staging.oneflow.info/branch/master/[PLATFORM]
      
    • All available [PLATFORM]:

      | Platform | CUDA Driver Version | Supported GPUs |
      | --- | --- | --- |
      | cu112 | >= 450.80.02 | GTX 10xx, RTX 20xx, A100, RTX 30xx |
      | cu111 | >= 450.80.02 | GTX 10xx, RTX 20xx, A100, RTX 30xx |
      | cu110, cu110_xla | >= 450.36.06 | GTX 10xx, RTX 20xx, A100 |
      | cu102, cu102_xla | >= 440.33 | GTX 10xx, RTX 20xx |
      | cu101, cu101_xla | >= 418.39 | GTX 10xx, RTX 20xx |
      | cu100, cu100_xla | >= 410.48 | GTX 10xx, RTX 20xx |
      | cpu | N/A | N/A |

  • If you are in China, you can run this command so that pip downloads packages from a domestic PyPI mirror:

    python3 -m pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
    

    For more information, please refer to the PyPI mirror usage guide (pypi 镜像使用帮助).
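
The [PLATFORM] table in this section can be read as a lookup from driver version to the newest usable build. The sketch below hard-codes the minimum driver versions listed there; the function names are hypothetical, and falling back to cpu when no CUDA build matches is an assumption of this example, not a documented rule.

```python
from typing import Optional, Tuple

# (platform tag, minimum driver version), newest CUDA first, copied from the table.
PLATFORMS = [
    ("cu112", (450, 80, 2)),
    ("cu111", (450, 80, 2)),
    ("cu110", (450, 36, 6)),
    ("cu102", (440, 33)),
    ("cu101", (418, 39)),
    ("cu100", (410, 48)),
]


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '450.80.02' into (450, 80, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def newest_platform(driver_version: Optional[str]) -> str:
    """Pick the newest platform tag whose minimum driver is satisfied."""
    if driver_version is None:  # no NVIDIA driver present
        return "cpu"
    driver = parse_version(driver_version)
    for tag, minimum in PLATFORMS:
        if driver >= minimum:
            return tag
    return "cpu"  # assumption: driver older than every listed minimum


if __name__ == "__main__":
    print(newest_platform("440.33.01"))
```

The returned tag is what replaces [PLATFORM] in the pip commands above.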

Use docker image

docker pull oneflowinc/oneflow:nightly-cuda10.2
docker pull oneflowinc/oneflow:nightly-cuda11.1

Build from Source

Clone Source Code
  • Option 1: Clone source code from GitHub

    git clone https://github.com/Oneflow-Inc/oneflow --depth=1
  • Option 2: Download from Aliyun

    If you are in China, please download OneFlow source code from: https://oneflow-public.oss-cn-beijing.aliyuncs.com/oneflow-src.zip

    curl https://oneflow-public.oss-cn-beijing.aliyuncs.com/oneflow-src.zip -o oneflow-src.zip
    unzip oneflow-src.zip
Build OneFlow
  • Option 1: Build with Conda (recommended)

    Please refer to this repo

  • Option 2: Build in docker container (recommended)

    • Pull a docker image:

      docker pull oneflowinc/oneflow-manylinux2014-cuda10.2:0.1
      

      All available images: https://hub.docker.com/u/oneflowinc

    • In the root directory of OneFlow source code, run:

      python3 docker/package/manylinux/build_wheel.py --python_version=3.6
      

      This should produce .whl files in the wheelhouse directory.

    • If you are in China, you might need to add these flags:

      --use_tuna --use_system_proxy --use_aliyun_mirror
      
    • You can choose CUDA/Python versions of wheel by adding:

      --cuda_version=10.1 --python_version=3.6,3.7
      
    • For more useful flags, please run the script with the --help flag or refer to the source code of the script.

  • Option 3: Build on bare metal

    • Install dependencies

      • on Ubuntu 20.04, run:
        sudo apt install -y libopenblas-dev nasm g++ gcc python3-pip cmake autoconf libtool
        
      • on macOS, run:
        brew install nasm
        
    • In the root directory of OneFlow source code, run:

      mkdir build
      cd build
      
    • Configure the project inside the build directory:

      • If you are in China

        Run this to configure for CUDA:

        cmake .. -C ../cmake/caches/cn/cuda.cmake
        

        Run this to configure for CPU-only:

        cmake .. -C ../cmake/caches/cn/cpu.cmake
        
      • If you are not in China

        Run this to configure for CUDA:

        cmake .. -C ../cmake/caches/international/cuda.cmake
        

        Run this to configure for CPU-only:

        cmake .. -C ../cmake/caches/international/cpu.cmake
        
    • Build the project. Inside the build directory, run:

      make -j$(nproc)
      
    • Add oneflow to your PYTHONPATH. Inside the build directory, run:

      source source.sh
      

      Please note that this change is not permanent.

    • Simple validation

      python3 -m oneflow --doctor
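
Before running the doctor command it can help to confirm that the interpreter sees the package at all, for example after sourcing source.sh in a fresh shell. The following is a minimal sketch using only the standard library; is_installed is a hypothetical helper name.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """True when `package` can be located on the current sys.path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("oneflow importable:", is_installed("oneflow"))
```

A False result usually means PYTHONPATH was not set in the current shell, which matches the note above that the change is not permanent.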
      

Troubleshooting

Please refer to troubleshooting for common issues you might encounter when compiling and running OneFlow.

Advanced features

XRT
  • You can check this doc to obtain more details about how to use XLA and TensorRT with OneFlow.

Getting Started

3 minutes to run MNIST.
  • Clone the demo code from OneFlow documentation

    git clone https://github.com/Oneflow-Inc/oneflow-documentation.git
    cd oneflow-documentation/cn/docs/single_client/code/quick_start/
    
  • Run it in Python

    python mlp_mnist.py
    
  • OneFlow is now running and prints the training loss:

    2.7290366
    0.81281316
    0.50629824
    0.35949975
    0.35245502
    ...
    
  • For more information on this demo, please refer to the quick start documentation.

Documentation

Model Zoo and Benchmark

Communication

The Team

OneFlow was originally developed by OneFlow Inc and Zhejiang Lab.

License

Apache License 2.0

Comments
  • source op support s and fixed generator bug

    Purpose of this PR

    • [x] Make the randperm op support S0 and add unit tests
    • [x] Fix the bug where, once a random op supports S, the local tensors on all ranks are always equal. Background is recorded at https://github.com/Oneflow-Inc/oneflow/pull/7434#issuecomment-1033306931

    Global tensor consistency for random ops

    1. For the randint and rand ops, B/S support keeps the global tensor consistent by building on the utility function GetOpKernelRandomSeed(ctx). When the op uses S, GetOpKernelRandomSeed(ctx) returns a different seed on each rank, so generator->set_current_seed(ctx->Attr<int64_t>("seed") + GetOpKernelRandomSeed(ctx)) gives every rank its own seed. The uniform-style kernels then produce local tensors that follow the same distribution but hold different values. When the op uses B, GetOpKernelRandomSeed(ctx) returns the same seed on every rank, so the same set_current_seed call gives every rank an identical seed, which keeps the global tensor consistent.
    2. For the randperm and arange ops, the current plan for B/S support is to have all ranks share one seed and generate the complete tensor on every rank. Each rank then uses the inferred physical shape together with the utility function GetTensorSliceView4ParallelId(parallel_hierarchy, nd_sbp, logical_shape, parallel_id) to obtain the index range that corresponds to its rank id and physical shape, and copies the data at those positions into its local tensor.

    This plan was worked out in a meeting with xiaoyu and yinggang.

    fixed: https://github.com/Oneflow-Inc/OneTeam/issues/1167
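
The two seeding strategies described in this PR can be illustrated in plain Python. This is a sketch under stated assumptions, not the kernel code: rank_seed is a hypothetical stand-in for seed + GetOpKernelRandomSeed(ctx), the per-rank offset is simply the rank number, and the even chunking in local_randperm stands in for the slice that GetTensorSliceView4ParallelId would compute.

```python
import random
from typing import List


def rank_seed(base_seed: int, rank: int, split: bool) -> int:
    """Hypothetical stand-in for seed + GetOpKernelRandomSeed(ctx)."""
    # Split (S): the offset differs per rank. Broadcast (B): it is identical.
    offset = rank if split else 0
    return base_seed + offset


def local_uniform(base_seed: int, rank: int, n: int, split: bool) -> List[float]:
    """rand-style op: each rank draws n values from its own generator."""
    rng = random.Random(rank_seed(base_seed, rank, split))
    return [rng.random() for _ in range(n)]


def local_randperm(base_seed: int, rank: int, world_size: int, n: int) -> List[int]:
    """randperm under S0: shared seed, full permutation, then slice per rank."""
    rng = random.Random(base_seed)  # same seed on every rank
    full = list(range(n))
    rng.shuffle(full)               # identical permutation on every rank
    chunk = (n + world_size - 1) // world_size  # assumed even split along dim 0
    return full[rank * chunk:(rank + 1) * chunk]
```

Concatenating the local_randperm slices from all ranks gives back one valid permutation, which is the consistency property item 2 is after.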

    enhancement automerge op test graph global 
    opened by grybd 82
  • Dev non-contiguous view ops

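    A PR split out from the https://github.com/Oneflow-Inc/oneflow/tree/dev_contiguous_view_ops branch. It does the following:

    1. When an op is registered through ODS, a SupportNonContiguous attribute can be added to mark whether the op accepts non-contiguous input tensors. If it does not, the interpreter uniformly calls tensor->contiguous() on the inputs.
    2. ~~Export the flow._oneflow_internal.has_same_tensor_storage interface to check whether the original tensor and the view tensor share storage~~
    3. Support the following non-contiguous view ops: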

    • [x] transpose
    • [x] permute
    • [x] narrow
    • [x] expand/expand_as
    • [x] split
    • [x] chunk
    • [x] unfold_tensor
    • [x] movedim
    • [x] as_strided
    • [x] select
    • [x] swapaxes
    • [x] T/t
    • [x] hsplit/vsplit/tensor_split
    • [ ] ~~TODO (to be done in another PR): slice/slice_update~~
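
To make "non-contiguous view" concrete, here is a small pure-Python model of a strided 2-D view. It is a sketch for illustration only: View2D and its methods are hypothetical names, strides are counted in elements, and nothing here reflects OneFlow's internal tensor classes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class View2D:
    storage: List[float]      # shared buffer, never copied by view ops
    shape: Tuple[int, int]
    strides: Tuple[int, int]  # in elements, not bytes
    offset: int = 0

    def at(self, i: int, j: int) -> float:
        return self.storage[self.offset + i * self.strides[0] + j * self.strides[1]]

    def transpose(self) -> "View2D":
        # A view op: swap shape and strides, keep the same storage.
        return View2D(self.storage, self.shape[::-1], self.strides[::-1], self.offset)

    def is_contiguous(self) -> bool:
        return self.strides == (self.shape[1], 1)

    def contiguous(self) -> "View2D":
        # What an op without non-contiguous support would need first: a row-major copy.
        if self.is_contiguous():
            return self
        rows, cols = self.shape
        data = [self.at(i, j) for i in range(rows) for j in range(cols)]
        return View2D(data, self.shape, (cols, 1))
```

Transposing costs nothing here because only metadata changes; the copy happens only when contiguous() is called, which mirrors the fallback described in item 1.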
    enhancement automerge eager op api 
    opened by Flowingsun007 67
  • Fix fill_

    Fixes the slow oneflow.Tensor.fill_ reported in https://github.com/Oneflow-Inc/oneflow/issues/8278.

    The fill_ kernel is implemented in two ways:

    • If value is a Scalar, the fill primitive is used
    • If value is a Tensor, the operator's GPU and CPU logic are implemented separately

    Performance test results:

    | OP | Args | Library | Kernel Time (us, GPU) | Kernel Time (us, 1 CPU) | End-to-end Time (us, 1 CPU) | Kernel Time (us, 32 CPUs) | End-to-end Time (us, 32 CPUs) |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | Tensor.fill_ | ones(1, 8, 16, 16), 2 | OneFlow | 7 | 2.5 | 10.5 | 2.4 | 9.8 |
    | Tensor.fill_ | ones(1, 8, 16, 16), 2 | PyTorch | 1.1 | 2.4 | 7 | 1.2 | 3.7 |
    | Tensor.fill_ | ones(1000, 1000), 2 | OneFlow | 21.6 | 187.6 | 189.2 | 183 | 184.6 |
    | Tensor.fill_ | ones(1000, 1000), 2 | PyTorch | 11 | 186.4 | 191.3 | 26.4 | 30.7 |
    | Tensor.fill_ | ones(1, 8, 16, 16), tensor(2) | OneFlow | 20.4 | 3.1 | 21.5 | 3.1 | 21.8 |
    | Tensor.fill_ | ones(1, 8, 16, 16), tensor(2) | PyTorch | 1.2 | 7.8 | 9.3 | 3.6 | 5.7 |
    | Tensor.fill_ | ones(1000, 1000), tensor(2) | OneFlow | 26.7 | 180.4 | 184.4 | 175.9 | 179.8 |
    | Tensor.fill_ | ones(1000, 1000), tensor(2) | PyTorch | 11 | 184.2 | 187.8 | 23.8 | 25.9 |

    enhancement automerge eager 
    opened by zhongshsh 64
  • Graph rename v2

    This PR removes the attribute and config from Block:

    • 1. Completely avoid name clashes;
    • 2. Remove the block config.

    The implemented design:

    | | Eager original | Proxy (base class Proxy) | GraphBlock (base class GraphBlock) |
    | --- | --- | --- | --- |
    | Function | Gives access to the original eager type | Proxy execution: the call interface is the same as Module and Tensor, but the behavior has changed, e.g. it is lazy and the executed ops may have been rewritten | Corresponds to one Graph code block and stores what graph execution needs, e.g. name/scope/lazy op or tensor, plus per-module optimization switches on the graph |
    | Module | Module | ProxyModule, which holds one Module member and one GraphModule member | GraphModule |
    | Tensor | Tensor | ProxyTensor, which holds one Tensor member and one GraphTensor member | GraphTensor |

    Example

    from oneflow.nn.graph import GraphModule
    import oneflow.nn as nn

    class AGraph(nn.Graph):
        def __init__(self, module: nn.Module):
            super().__init__()

            self.m = module
            # self.m is a ProxyModule.
            # A ProxyModule has two parts: the original module and a GraphModule.
            self.m.name  # defaults to the eager module's name
            self.m.to(GraphModule).name  # the GraphModule's name
            self.m.to(nn.Module)  # the original nn.Module

            # How to reach the config on the GraphModule.
            # id and placement are placeholders to be supplied by the caller.
            self.m.to(GraphModule).set_stage(id, placement)
    
    

    Fix issue: https://github.com/Oneflow-Inc/oneflow/issues/9193

    It also supports property lookup when nn.Module is used with multiple inheritance.

    Fix issue: https://github.com/Oneflow-Inc/oneflow/issues/9345 and https://github.com/Oneflow-Inc/oneflow/issues/9186
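
The proxy layout described in this PR, one wrapper holding both the eager module and a separate graph-side object, can be modelled in a few lines. All three classes below are hypothetical stand-ins written for this sketch; they are not the OneFlow classes and only mimic the to(...) access pattern shown in the example.

```python
class Module:
    """Stand-in for an eager module (not oneflow.nn.Module)."""

    def __init__(self, name: str):
        self.name = name


class GraphModule:
    """Stand-in for graph-side state: its own name plus a stage config."""

    def __init__(self, name: str):
        self.name = name
        self.stage = None

    def set_stage(self, stage_id: int, placement: str) -> None:
        self.stage = (stage_id, placement)


class ProxyModule:
    """Holds one eager module and one GraphModule; `to` selects between them."""

    def __init__(self, origin: Module, graph_name: str):
        self._parts = {Module: origin, GraphModule: GraphModule(graph_name)}

    def to(self, kind: type):
        return self._parts[kind]

    @property
    def name(self) -> str:
        # Plain attribute access defaults to the eager module.
        return self._parts[Module].name
```

Because graph settings live on a separate object, a user module that defines its own attribute called stage or config cannot collide with them, which is the name-clash problem the PR sets out to remove.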

    enhancement automerge bug api python 
    opened by strint 60
  • add searchsorted op

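    Background: the NeRF network needs this operator. Operator description: it follows the PyTorch implementation at https://pytorch.org/docs/stable/generated/torch.searchsorted.html?highlight=searchsorted#torch.searchsorted. The interface is fully aligned with the PyTorch 1.10 implementation.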
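
For readers unfamiliar with the operator, its 1-D semantics match Python's bisect module: for each value, return the index at which it could be inserted while keeping the sequence sorted. The sketch below is a reference for the semantics only, not the kernel added by this PR.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence


def searchsorted(sorted_sequence: Sequence[float],
                 values: Sequence[float],
                 right: bool = False) -> List[int]:
    """1-D searchsorted: left-most insertion point by default, right-most if right=True."""
    find = bisect_right if right else bisect_left
    return [find(sorted_sequence, v) for v in values]


if __name__ == "__main__":
    print(searchsorted([1, 3, 5, 7, 9], [3, 6, 9]))
    print(searchsorted([1, 3, 5, 7, 9], [3, 6, 9], right=True))
```

The right flag only matters for values that already occur in the sequence, such as 3 and 9 here.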

    enhancement automerge op 
    opened by yoonlee888 59
  • Optimize slice and tensor getitem

    • [x] Tensor getitem optimization, based on the comment at https://github.com/Oneflow-Inc/OneTeam/issues/1268#issuecomment-1085433728. It benefits every network that uses the eager dataloader.
    • [x] Test case
    enhancement feature automerge eager 
    opened by Flowingsun007 57
  • Decouple vm mem and compute

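    Let the VM worker thread concentrate on OpKernel::Compute. If everything else is well optimized, eager mode can in theory reach its highest performance.

    Instruction execution is now split into two steps:

    1. Infer. Covers memory allocation and release, plus preparing the opkernel state and cache.
    2. Compute. Runs only the user_op::OpKernel::Compute function.

    The Infer phase always runs on the scheduler thread. The Compute phase runs on the worker thread by default; setting ONEFLOW_VM_WORKLOAD_ON_SCHEDULER_THREAD=1 makes it run on the scheduler thread instead.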
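
The Infer/Compute split can be pictured with a queue between two threads. This is a simplified model under several assumptions: instructions are represented as hypothetical (infer, compute) callables, the calling thread plays the scheduler, and a single worker thread drains the queue. Only the environment variable name comes from the PR text.

```python
import os
import queue
import threading
from typing import Callable, List, Tuple

# Each instruction is modelled as (infer, compute): infer prepares state,
# compute consumes it.
Instruction = Tuple[Callable[[], dict], Callable[[dict], None]]


def run(instructions: List[Instruction]) -> None:
    on_scheduler = os.environ.get("ONEFLOW_VM_WORKLOAD_ON_SCHEDULER_THREAD") == "1"
    pending: queue.Queue = queue.Queue()

    def worker() -> None:
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work
                return
            compute, state = item
            compute(state)

    thread = None
    if not on_scheduler:
        thread = threading.Thread(target=worker)
        thread.start()

    for infer, compute in instructions:
        state = infer()  # Infer always runs on the scheduler (calling) thread
        if on_scheduler:
            compute(state)
        else:
            pending.put((compute, state))  # Compute is handed to the worker

    if thread is not None:
        pending.put(None)
        thread.join()
```

With the default setting the scheduler can already be inferring instruction n+1 while the worker computes instruction n, which is the overlap the PR is aiming for.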

    This PR depends on several other PRs or branches.

    VM optimization PRs:

    1. https://github.com/Oneflow-Inc/oneflow/pull/7923 migrates the instruction implementations to ep.
    2. https://github.com/Oneflow-Inc/oneflow/pull/7623 merges InstructionMsg and Instruction.

    Call instruction optimization PRs:

    3. https://github.com/Oneflow-Inc/oneflow/pull/7617 makes StatefullOpKernel thread-safe.
    4. https://github.com/Oneflow-Inc/oneflow/tree/refactor_eager_tmp_buffer_x_merge_instruction_msg_to_instruction completely refactors how instructions handle temp storage so that Infer and Compute can work asynchronously.
    enhancement eager system 
    opened by lixinqi 55
  • Refactor MemoryCase to eliminate determine statements of device_type

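    Refactor the MemoryCase struct to remove device-specific special cases from the code.

    MemoryCase becomes an open structure, so it no longer has to be modified every time a new DeviceType enum value is added.

    With MemoryCase as an open structure, many special cases such as if (device_type == DeviceType::kGPU) and if (mem_case.has_device_cuda_mem()) can also be removed.

    After the refactor, the only check that should remain in theory is whether the device memory is host memory, because some logic has to treat host memory specially.

    The refactor does not remove every GPU-specific special case. Some of them are unrelated to mem_case; they are currently concentrated mostly in the memory-reuse logic, and the task graph may have some leftovers too. Later PRs will refactor these further.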
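
The difference between a closed, per-device structure and an open one can be shown with a small registry. This sketch is only an analogy in Python: the MemoryCase dataclass, the ALLOCATORS table and the use of bytearray as a fake allocator are all assumptions made for illustration and do not mirror the C++ or protobuf definitions.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class MemoryCase:
    device_type: str  # open-ended: any registered name, not a closed enum
    device_id: int = 0

    def is_host(self) -> bool:
        # The one check the PR expects to survive: host vs. non-host memory.
        return self.device_type == "cpu"


ALLOCATORS: Dict[str, Callable[[int], bytearray]] = {}


def register_allocator(device_type: str, fn: Callable[[int], bytearray]) -> None:
    ALLOCATORS[device_type] = fn


def allocate(mem_case: MemoryCase, size: int) -> bytearray:
    # One table lookup replaces per-device if/else branches.
    return ALLOCATORS[mem_case.device_type](size)


register_allocator("cpu", bytearray)
register_allocator("cuda", bytearray)  # stand-in; a real backend would call its device allocator
```

Adding another device is then a single register_allocator call, with no change to MemoryCase or to allocate.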

    enhancement graph need-test-distributed 
    opened by leaves-zwx 53
  • Implement oneflow.embedding op

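    Overview

    This PR completes the implementation of oneflow.nn.Embedding. The previous implementation ignored the four parameters padding_idx, max_norm, norm_type and scale_grad_by_freq, so it simply used oneflow.gather. Once these parameters are introduced, the gather op can no longer be reused directly and a custom Embedding op is needed.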
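
Why a plain gather is not enough becomes clearer when the four parameters are spelled out. The sketch below follows the behavior documented for PyTorch's nn.Embedding (rows whose norm exceeds max_norm are rescaled in place during the forward pass, padding_idx rows receive no gradient, and scale_grad_by_freq divides by the index count in the batch). It uses nested Python lists and hypothetical function names, and it is not the kernel from this PR.

```python
from collections import Counter
from typing import List, Optional


def embedding(weight: List[List[float]], indices: List[int],
              max_norm: Optional[float] = None,
              norm_type: float = 2.0) -> List[List[float]]:
    """Forward lookup; rows with norm above max_norm are renormalized in place."""
    out = []
    for idx in indices:
        row = weight[idx]
        if max_norm is not None:
            norm = sum(abs(v) ** norm_type for v in row) ** (1.0 / norm_type)
            if norm > max_norm:
                scale = max_norm / (norm + 1e-7)
                row[:] = [v * scale for v in row]  # modifies the weight table itself
        out.append(list(row))
    return out


def embedding_grad(num_rows: int, indices: List[int], grad_out: List[List[float]],
                   padding_idx: Optional[int] = None,
                   scale_grad_by_freq: bool = False) -> List[List[float]]:
    """Backward pass: scatter-add grad_out into a dense gradient for the weight."""
    dim = len(grad_out[0])
    grad = [[0.0] * dim for _ in range(num_rows)]
    counts = Counter(indices)
    for idx, g in zip(indices, grad_out):
        if idx == padding_idx:
            continue  # the padding row never receives gradient
        scale = 1.0 / counts[idx] if scale_grad_by_freq else 1.0
        for k in range(dim):
            grad[idx][k] += g[k] * scale
    return grad
```

The forward pass writes to the weight and the backward pass depends on index frequencies, and neither side effect exists in gather.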

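    PyTorch API link

    Feature checklist

    Note: all feature checkboxes are optional. If one is left unchecked, just explain why. For example: this Op is composed from Python interfaces, so there is no SetBatchAxisInferFn Op registration; or: this Op has no inputs, so there is no SetInputArgModifyFn.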

    Op

    • [ ] Op SetBatchAxisInferFn
    • [x] Op SetGetSbpFn
    • [x] Op SetInputArgModifyFn
    • [x] Op backward gradient registration

    CPU Kernel

    • [x] CPU in:float32
    • [x] CPU in:float64
    • [ ] CPU in:int32
    • [ ] CPU in:int64
    • [ ] CPU in:int8

    GPU Kernel

    • [x] GPU in:float32
    • [x] GPU in:float64
    • [ ] GPU in:int32
    • [ ] GPU in:int64
    • [x] GPU in:float16
    • [ ] GPU in:int8

    Python Wrapper

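    • [x] Python API argument checks and error messages
    • [x] Interface docstrings
    • [x] Example

    Tests

    • [x] Single-machine single-device CPU Test Case
    • [x] Single-machine single-device GPU Test Case
    • [ ] Single-machine multi-device CPU Test Case
    • [ ] Single-machine multi-device GPU Test Case
    • [ ] Distributed CPU Test Case
    • [ ] Distributed GPU Test Case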

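    GPU effective bandwidth

    For Ops with a GPU implementation, please measure the effective bandwidth following https://github.com/Oneflow-Inc/OneTeam/issues/167 and attach the test report. A sample report follows:

    Theoretical bandwidth: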

     Device to Device Bandwidth, 1 Device(s)
     PINNED Memory Transfers
       Transfer Size (Bytes)	Bandwidth(MB/s)
       33554432			250798.5
    

    Measured bandwidth:

    PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: sqrt_2 elapsed(ms): 0.196064 memory_size(Byte): 50331648 bandwidth(GB/s): 239.08
    PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: sqrt_2_grad elapsed(ms): 0.29072 memory_size(Byte): 75497472 bandwidth(GB/s): 241.856
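
The bandwidth figures in the sample log can be reproduced from the other two fields on each line. The numbers are consistent with dividing memory_size by the elapsed time and by 2^30, so the value printed as GB/s appears to be GiB/s. That reading is inferred from the arithmetic here, not taken from the profiler source, and the function name is hypothetical.

```python
def bandwidth_gib_per_s(memory_size_bytes: int, elapsed_ms: float) -> float:
    """Effective bandwidth as bytes moved per second, expressed in GiB/s."""
    seconds = elapsed_ms / 1000.0
    return memory_size_bytes / seconds / 2**30


if __name__ == "__main__":
    # Values copied from the two profiler lines above.
    print(round(bandwidth_gib_per_s(50331648, 0.196064), 2))
    print(round(bandwidth_gib_per_s(75497472, 0.29072), 3))
```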
    

    PR Checklist

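    • [x] The PR title reads smoothly, states clearly what the PR does, and is suitable for direct use as a changelog entry in a new release
    • [x] Code formatted
    • [x] Compiles locally
    • [x] Changes tested locally
    • [x] type label added: (fill in the type label name, e.g. bug, enhancement, purge, feature, documentation)
    • [x] component label added: (fill in the component label name, e.g. op, system, eager, build, xla, python, ci, test, tooling)
    • [x] Reviewed by someone before converting the Draft into a formal PR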
    enhancement automerge op 
    opened by EsdeathYZH 46
  • check graph op global test

    This PR does the following:

    • [x] Run the Graph Global tests for some ops (CUDA only).

    Some global ops still have their graph tests disabled; see https://github.com/Oneflow-Inc/oneflow/pull/8614#issuecomment-1185097594 for details.

    enhancement automerge test graph global 
    opened by lixiang007666 39
  • Implement exponential_ and multinomial

    Origin of the requirement: https://github.com/Oneflow-Inc/OneTeam/issues/1184#issuecomment-1232440993

    Todo lists

    • [x] Implement the exponential_ operator
      • [x] functor logic
      • [x] cpu kernel
      • [x] cuda kernel
      • [x] tests
    • [x] Implement the multinomial operator
      • [x] functor logic
      • [x] cpu kernel
      • [x] cuda kernel
      • [x] tests
    • [x] Add the Distribution module
      • [x] Implement Categorical
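
As background for the two operators in this list, the textbook sampling methods are short enough to show in full. The sketch uses inverse-CDF sampling for the exponential distribution and a cumulative-sum search for multinomial sampling with replacement. These are standard techniques chosen for illustration; the PR text does not say which algorithm the CPU and CUDA kernels use.

```python
import math
import random
from bisect import bisect_right
from itertools import accumulate
from typing import List, Sequence


def exponential(rng: random.Random, lambd: float = 1.0) -> float:
    # Inverse CDF: if U ~ Uniform[0, 1) then -ln(1 - U) / lambd ~ Exp(lambd).
    return -math.log(1.0 - rng.random()) / lambd


def multinomial(rng: random.Random, weights: Sequence[float],
                num_samples: int) -> List[int]:
    # Sampling with replacement from unnormalized, non-negative weights.
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    last = len(cumulative) - 1
    # min(...) guards against floating-point rounding up to `total`.
    return [min(bisect_right(cumulative, rng.random() * total), last)
            for _ in range(num_samples)]
```

A Categorical distribution object is essentially multinomial sampling over a fixed probability vector, which is presumably why the three items appear together in this list.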
    feature automerge op api python need-clean-ccache 
    opened by Ldpe2G 37
  • dev add spectral_norm

    @BBuf Fixed some bugs encountered while implementing spectral_norm

    • [x] Fix the bug where dot does not support int32 and int64 computation on CPU (caused by matmul)
    • [x] Add the basic functionality of spectral_norm
    • [x] Fix the division-by-zero bug in kaiming_uniform_ and kaiming_normal_ when the input is a 0-size tensor
    • [x] Add oneflow.linalg.multi_dot()
    • [ ] oneflow.contiguous_format
    • [ ] load_state_dict test and global test for spectral_norm, load and hook
    • [ ] Documentation for spectral_norm and multi_dot

    There seem to be some redundant header includes; I will check them later.

    feature bug op 
    opened by hhhfccz 0
  • Oneflow fails in einops CI, likely due to conflict with new numpy

    Summary

    ___________________ ERROR collecting tests/test_examples.py ____________________
    tests/test_examples.py:5: in <module>
        from tests.test_ops import imp_op_backends
    <frozen importlib._bootstrap>:1007: in _find_and_load
        ???
    <frozen importlib._bootstrap>:986: in _find_and_load_unlocked
        ???
    <frozen importlib._bootstrap>:680: in _load_unlocked
        ???
    /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/_pytest/assertion/rewrite.py:168: in exec_module
        exec(co, module.__dict__)
    tests/test_ops.py:10: in <module>
        imp_op_backends = collect_test_backends(symbolic=False, layers=False)
    tests/__init__.py:64: in collect_test_backends
        result.append(backend_type())
    einops/_backends.py:554: in __init__
        import oneflow as flow
    ../../../.local/lib/python3.9/site-packages/oneflow/__init__.py:199: in <module>
        import oneflow.framework.register_class_method_util as register_class_method_util
    ../../../.local/lib/python3.9/site-packages/oneflow/framework/register_class_method_util.py:17: in <module>
        import oneflow.framework.check_point_v2 as check_point_v2
    ../../../.local/lib/python3.9/site-packages/oneflow/framework/check_point_v2.py:30: in <module>
        import oneflow.framework.dtype as dtype_util
    ../../../.local/lib/python3.9/site-packages/oneflow/framework/dtype.py:49: in <module>
        oneflow.bool: np.bool,
    /opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/numpy/__init__.py:284: in __getattr__
        raise AttributeError("module {!r} has no attribute "
    E   AttributeError: module 'numpy' has no attribute 'bool'
    ------------------------------- Captured stderr --------------------------------
    2022-12-27 07:50:33.696556: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.9.16/x64/lib
    2022-12-27 07:50:33.696647: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.9.16/x64/lib
    2022-12-27 07:50:33.696656: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
    

    Code to reproduce bug

    See CI job for full detailed messages and configuration:

    https://github.com/arogozhnikov/einops/actions/runs/3785978910/jobs/6436456017

    System Information

    • What is your OneFlow installation (pip, source, dockerhub): pip
    • OS: linux
    • OneFlow version (run python3 -m oneflow --doctor):
    • Python version: 3.9
    • CUDA driver version: None
    • GPU models: None
    • Other info:
    bug community 
    opened by arogozhnikov 8
Releases(v0.8.0)