Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Overview

Apache MXNet (incubating) for Deep Learning



Apache MXNet (incubating) is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scaling effectively to multiple GPUs and multiple machines.
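
For illustration, here is a minimal sketch (my own, not taken from the upstream docs) of mixing the two styles with the Python binding described later on this page:

    import mxnet as mx

    # imperative style: operations execute eagerly on NDArrays
    a = mx.nd.ones((2, 3))
    b = a * 2 + 1

    # symbolic style: declare a graph first, then bind data and run it
    x = mx.sym.Variable('x')
    y = x * 2 + 1
    executor = y.bind(mx.cpu(), {'x': a})
    print(executor.forward()[0].asnumpy())

Because the dependency engine schedules both kinds of operations, imperative and symbolic calls can be freely interleaved in the same program.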

MXNet is more than a deep learning project. It is also a collection of blueprints and guidelines for building deep learning systems, and a source of interesting insights into DL systems for hackers.

Installation Guide

Install Dependencies to build MXNet for HIP/ROCm

ROCm Installation

Install Dependencies to build MXNet for HIP/CUDA

  • Install CUDA following NVIDIA's installation guide to set up MXNet with GPU support

  • Make sure to add the CUDA install path to LD_LIBRARY_PATH

  • Example - export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:$LD_LIBRARY_PATH

  • Install the dependencies hipBLAS and rocRAND from source.

Build the MXNet library

  • Step 1: Install build tools.

    sudo apt-get update
    sudo apt-get install -y build-essential
    
  • Step 2: Install OpenBLAS. MXNet uses BLAS and LAPACK libraries for accelerated numerical computations on the CPU. There are several flavors of BLAS/LAPACK libraries - OpenBLAS, ATLAS and MKL. In this step we install OpenBLAS; you can choose to install ATLAS or MKL instead.

      sudo apt-get install -y libopenblas-dev liblapack-dev libomp-dev libatlas-dev libatlas-base-dev
  • Step 3: Install OpenCV. MXNet uses OpenCV for efficient image loading and augmentation operations.
      sudo apt-get install -y libopencv-dev
  • Step 4: Download MXNet sources and build MXNet core shared library.
      git clone --recursive https://github.com/ROCmSoftwarePlatform/mxnet.git
      cd mxnet
      export PATH=/opt/rocm/bin:$PATH
  • Step 5: To compile on the HCC platform (HIP/ROCm):
      export HIP_PLATFORM=hcc

To compile on the NVCC platform (HIP/CUDA):

      export HIP_PLATFORM=nvcc
  • Step 6: To enable MIOpen for additional acceleration, set:

    USE_CUDNN=1
    
  • Step 7:

    If building on CPU:

        make -jN USE_GPU=0               (N = number of cores; for Ubuntu 16.04)
        make -jN CXX=g++-6 USE_GPU=0     (for Ubuntu 18.04)

If building on GPU:

       make -jN USE_GPU=1                (N = number of cores; for Ubuntu 16.04)
       make -jN CXX=g++-6 USE_GPU=1      (for Ubuntu 18.04)

On successful compilation, a library called libmxnet.so is created under mxnet/lib.

NOTE: The USE_CUDA and USE_CUDNN flags can be changed in make/config.mk.

To compile on HIP/CUDA, make sure to set USE_CUDA_PATH to the correct CUDA installation path in make/config.mk. In most cases it is /usr/local/cuda.

Install the MXNet Python binding

  • Step 1: Install prerequisites: python, setuptools, pip and numpy.
      sudo apt-get install -y python-dev python-setuptools python-numpy python-pip python-scipy
      sudo apt-get install python-tk
      sudo apt install -y fftw3 fftw3-dev pkg-config
  • Step 2: Install the MXNet Python binding.
      cd python
      sudo python setup.py install
  • Step 3: Run a sample example.
       cd example/
       cd bayesian-methods/

To run on a GPU, change mx.cpu() to mx.gpu() in the Python script (for example, bdk_demo.py).

       $ python bdk_demo.py
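
As a minimal sketch (an illustration, not part of the demo script itself), the device is selected through the context an NDArray is created on:

    import mxnet as mx

    # mx.cpu() runs on the host CPU; mx.gpu(0) targets the first GPU
    ctx = mx.gpu(0)   # switch back to mx.cpu() for a CPU-only run

    # arrays created with this context live, and compute, on that device
    a = mx.nd.ones((2, 2), ctx=ctx)
    print(a.asnumpy())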


Features

  • Design notes providing useful insights that can be re-used by other DL projects
  • Flexible configuration for arbitrary computation graphs
  • Mix and match imperative and symbolic programming to maximize flexibility and efficiency
  • Lightweight, memory efficient and portable to smart devices
  • Scales up to multiple GPUs and distributed settings with automatic parallelism (see the sketch after this list)
  • Support for Python, R, Scala, C++ and Julia
  • Cloud-friendly and directly compatible with S3, HDFS, and Azure
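
As an illustration (my own sketch, not from the upstream docs), multi-GPU data parallelism is exposed through the list of contexts passed to the Module API; this assumes two GPUs are visible:

    import mxnet as mx

    # a toy symbolic network
    data = mx.sym.Variable('data')
    net = mx.sym.FullyConnected(data, num_hidden=10, name='fc1')
    net = mx.sym.SoftmaxOutput(net, name='softmax')

    # passing several contexts enables data-parallel training:
    # each batch is split evenly across the listed devices
    mod = mx.mod.Module(net, context=[mx.gpu(0), mx.gpu(1)])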

License

Licensed under an Apache-2.0 license.

Reference Paper

Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems. In Neural Information Processing Systems, Workshop on Machine Learning Systems, 2015

History

MXNet emerged from a collaboration by the authors of cxxnet, minerva, and purine2. The project reflects what we have learned from those earlier projects and combines aspects of each of them to achieve flexibility, speed, and memory efficiency.

Comments
  • build mxnet failed

    OS: docker rocm/dev-ubuntu-18.04:2.7

    Build info (Required if built from source)

    Compiler (gcc/clang/mingw/visual studio): CXX=g++-7

    MXNet commit hash: git clone -b hip_port_v1.4.x --recursive https://github.com/ROCmSoftwarePlatform/mxnet.git

    Build config: HIP_PLATFORM=hcc make -j16 CXX=g++-7 USE_GPU=1

    Error Message:

    g++-7 -std=c++11 -c -D__HIP_PLATFORM_HCC__= -I/opt/rocm/include -I/opt/rocm/hcc/include -DMSHADOW_FORCE_STREAM -Wall -Wsign-compare -O3 -DNDEBUG=1 -I. -I/opt/rocm/include -I/opt/rocm/hipblas/include -I/opt/rocm/hiprand/include -I/opt/rocm/rocfft/include -I/opt/rocm/hipcub/include/ -I/opt/rocm/rocblas/include -I/opt/rocm/rocrand/include -I/home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/ -I/home/mxnet_hip_port_v1.4.x/3rdparty/dmlc-core/include -fPIC -I/home/mxnet_hip_port_v1.4.x/3rdparty/tvm/nnvm/include -I/home/mxnet_hip_port_v1.4.x/3rdparty/dlpack/include -I/home/mxnet_hip_port_v1.4.x/3rdparty/tvm/include -Iinclude -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -DMSHADOW_USE_PASCAL=0 -DMXNET_USE_OPENCV=1 -I/usr/include/opencv -fopenmp -DMXNET_USE_OPERATOR_TUNING=1 -DMXNET_USE_LAPACK -DMSHADOW_USE_MIOPEN=1 -I/opt/rocm/hipcub/include/hipcub/rocprim -DMXNET_USE_RCCL=0 -DMXNET_USE_LIBJPEG_TURBO=0 -MMD -c src/operator/tensor/elemwise_binary_broadcast_op_extended.cc -o build/src/operator/tensor/elemwise_binary_broadcast_op_extended.o
    In file included from /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/tensor.h:16:0,
                     from include/mxnet/./base.h:32,
                     from include/mxnet/operator_util.h:43,
                     from src/operator/random/./multisample_op.h:28,
                     from src/operator/random/multisample_op.cc:26:
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:374:46: error: 'HIPBLAS_R_8U' was not declared in this scope
       static const hipblasDatatype_t kCudaFlag = HIPBLAS_R_8U;
                                                  ^~~~~~~~~~~~
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:374:46: note: suggested alternative: 'HIPBLAS_R_64F'
       static const hipblasDatatype_t kCudaFlag = HIPBLAS_R_8U;
                                                  ^~~~~~~~~~~~
                                                  HIPBLAS_R_64F
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:389:46: error: 'HIPBLAS_R_8I' was not declared in this scope
       static const hipblasDatatype_t kCudaFlag = HIPBLAS_R_8I;
                                                  ^~~~~~~~~~~~
    In file included from /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/tensor.h:16:0,
                     from include/mxnet/./base.h:32,
                     from include/mxnet/operator_util.h:43,
                     from src/operator/random/./sample_multinomial_op.h:28,
                     from src/operator/random/sample_multinomial_op.cc:25:
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:374:46: error: 'HIPBLAS_R_8U' was not declared in this scope
       static const hipblasDatatype_t kCudaFlag = HIPBLAS_R_8U;
                                                  ^~~~~~~~~~~~
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:374:46: note: suggested alternative: 'HIPBLAS_R_64F'
       static const hipblasDatatype_t kCudaFlag = HIPBLAS_R_8U;
                                                  ^~~~~~~~~~~~
                                                  HIPBLAS_R_64F
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:389:46: error: 'HIPBLAS_R_8I' was not declared in this scope
       static const hipblasDatatype_t kCudaFlag = HIPBLAS_R_8I;
                                                  ^~~~~~~~~~~~
    /home/mxnet_hip_port_v1.4.x/3rdparty/mshadow/mshadow/./base.h:389:46: note: suggested alternative: 'HIPBLAS_R_64F'

    Steps to reproduce

    (Paste the commands you ran that produced the error.)

    What have you tried to solve it?

    1. I found hipblasDatatype_t in https://github.com/ROCmSoftwarePlatform/hipBLAS/blob/master-rocm-2.7/library/include/hipblas.h, and there is no HIPBLAS_R_8U in the hipblasDatatype_t enum. How could I solve this problem?

    opened by XuanBaby 5
  • compile error when try to install mxnet on rocm platform

    Environment info

    OS: $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 18.04.5 LTS
    Release:        18.04
    Codename:       bionic

    Compiler: $ hipcc --version
    HIP version: 3.3.20126-2dbba46b
    HCC clang version 10.0.0 (/data/jenkins-workspace/compute-rocm-rel-3.3/external/hcc-tot/llvm-project/clang 1ce0fe5e88b2124494b9500817b4c2c66bdfa5aa) (based on HCC 3.1.20114-6776c83f-1ce0fe5e88b )
    Target: x86_64-unknown-linux-gnu
    Thread model: posix
    InstalledDir: /opt/rocm-3.3.0/hcc/bin

    Package used (Python/R/Scala/Julia): python

    MXNet version: installed from source.

    MXNet commit hash (git rev-parse HEAD): 0cd2b0b82d269fa86e71258eb467b1e46f641b64

    Python version and distribution: $ python --version Python 3.6.9

    ROCM Version: 3.3.0

    Error Message:

    I tried to install mxnet following the guide at https://github.com/ROCmSoftwarePlatform/mxnet#installation-guide.

    $ make -j1  CXX=g++-6 USE_GPU=1              
    Makefile:346: WARNING: could not find nvcc compiler, the specified path was: hipcc
    Running CUDA_ARCH: --amdgpu-target=gfx801 --amdgpu-target=gfx802 --amdgpu-target=gfx803 --amdgpu-target=gfx900 --amdgpu-target=gfx906
    cd /disk/zhanged/code/mxnet/3rdparty/dmlc-core; make libdmlc.a USE_SSE=1 config=/disk/zhanged/code/mxnet/make/config.mk; cd /disk/zhanged/code/mxnet
    make[1]: Entering directory '/disk/zhanged/code/mxnet/3rdparty/dmlc-core'
    make[1]: 'libdmlc.a' is up to date.
    make[1]: Leaving directory '/disk/zhanged/code/mxnet/3rdparty/dmlc-core'
    hipcc -std=c++11 -Xcompiler -D_FORCE_INLINES -g -O3   --amdgpu-target=gfx801 --amdgpu-target=gfx802 --amdgpu-target=gfx803 --amdgpu-target=gfx900 --amdgpu-target=gfx906 -Xcompiler "-DMSHADOW_FORCE_STREAM -Wall -Wsign-compare -O3 -DNDEBUG=1 -I. -I/opt/rocm/include -I/opt/rocm/hipblas/include -I/opt/rocm/hiprand/include -I/opt/rocm/rocfft/include -I/opt/rocm/hipcub/include/ -I/opt/rocm/rocblas/include -I/opt/rocm/rocrand/include -I/disk/zhanged/code/mxnet/3rdparty/mshadow/ -I/disk/zhanged/code/mxnet/3rdparty/dmlc-core/include -fPIC -I/disk/zhanged/code/mxnet/3rdparty/tvm/nnvm/include -I/disk/zhanged/code/mxnet/3rdparty/dlpack/include -I/disk/zhanged/code/mxnet/3rdparty/tvm/include -Iinclude -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -DMSHADOW_USE_PASCAL=0 -DMXNET_USE_OPENCV=1 -I/usr/include/opencv -fopenmp -DMXNET_USE_OPERATOR_TUNING=1 -DMXNET_USE_LAPACK  -I/opt/rocm/hipcub/include/hipcub/rocprim -DMXNET_USE_RCCL=0 -DMXNET_USE_LIBJPEG_TURBO=0" -M -MT build/src/operator/nn/ctc_loss_gpu.o src/operator/nn/ctc_loss.cu >build/src/operator/nn/ctc_loss_gpu.d
    clang-10: warning: argument unused during compilation: '-Xcompiler' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx801' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx802' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx803' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx900' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx906' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '-Xcompiler' [-Wunused-command-line-argument]
    hipcc -c -o build/src/operator/nn/ctc_loss_gpu.o -std=c++11 -Xcompiler -D_FORCE_INLINES -g -O3   --amdgpu-target=gfx801 --amdgpu-target=gfx802 --amdgpu-target=gfx803 --amdgpu-target=gfx900 --amdgpu-target=gfx906 -Xcompiler "-DMSHADOW_FORCE_STREAM -Wall -Wsign-compare -O3 -DNDEBUG=1 -I. -I/opt/rocm/include -I/opt/rocm/hipblas/include -I/opt/rocm/hiprand/include -I/opt/rocm/rocfft/include -I/opt/rocm/hipcub/include/ -I/opt/rocm/rocblas/include -I/opt/rocm/rocrand/include -I/disk/zhanged/code/mxnet/3rdparty/mshadow/ -I/disk/zhanged/code/mxnet/3rdparty/dmlc-core/include -fPIC -I/disk/zhanged/code/mxnet/3rdparty/tvm/nnvm/include -I/disk/zhanged/code/mxnet/3rdparty/dlpack/include -I/disk/zhanged/code/mxnet/3rdparty/tvm/include -Iinclude -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -DMSHADOW_USE_PASCAL=0 -DMXNET_USE_OPENCV=1 -I/usr/include/opencv -fopenmp -DMXNET_USE_OPERATOR_TUNING=1 -DMXNET_USE_LAPACK  -I/opt/rocm/hipcub/include/hipcub/rocprim -DMXNET_USE_RCCL=0 -DMXNET_USE_LIBJPEG_TURBO=0" src/operator/nn/ctc_loss.cu
    clang-10: warning: argument unused during compilation: '-Xcompiler' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx801' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx802' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx803' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx900' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '--amdgpu-target=gfx906' [-Wunused-command-line-argument]
    clang-10: warning: argument unused during compilation: '-Xcompiler' [-Wunused-command-line-argument]
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:29:
    In file included from include/mxnet/operator_util.h:43:
    In file included from include/mxnet/./base.h:32:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/tensor.h:16:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/./base.h:29:
    In file included from ./hip-wrappers.h:8:
    In file included from /opt/rocm/include/hip/hip_runtime.h:56:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_runtime.h:105:
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:37:18: warning: comparison of integers of different signs: 'int32_t' (aka 'int') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if ((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0)) {
             ~~~~~~~ ^ ~~~~~
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:37:50: warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if ((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0)) {
                                                   ~ ^ ~~~~~~
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:54:20: warning: comparison of integers of different signs: 'int32_t' (aka 'int') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if (!((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0))) {
               ~~~~~~~ ^ ~~~~~
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:54:52: warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if (!((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0))) {
                                                     ~ ^ ~~~~~~
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:34:
    src/operator/nn/./sequence_mask-inl.h:55:5: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
        for (index_t s = lengths[batch]; s < smax; ++s)
        ^
    src/operator/nn/./sequence_mask-inl.h:51:3: note: previous statement is here
      if (batch >= bmax)
      ^
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:35:
    In file included from src/operator/nn/../sequence_op_common.h:31:
    In file included from src/operator/nn/.././operator_common.h:42:
    src/operator/nn/../../common/cuda_utils.h:246:11: warning: enumeration values 'HIPRAND_STATUS_DOUBLE_PRECISION_REQUIRED' and 'HIPRAND_STATUS_NOT_IMPLEMENTED' not handled in switch [-Wswitch]
      switch (status) {
              ^
    In file included from src/operator/nn/ctc_loss.cu:27:
    In file included from src/operator/nn/../../../3rdparty/ctc_include/detail/gpu_ctc.h:25:
    In file included from src/operator/nn/../../../3rdparty/ctc_include/detail/gpu_ctc_kernels.h:23:
    In file included from src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/ctascan.cuh:38:
    src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/deviceutil.cuh:68:13: error: no matching function for call to 'min'
            range.x += min(block, task.y);
                       ^~~
    src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/devicetypes.cuh:260:23: note: candidate function not viable: no known conversion from 'int' to 'int2' (aka 'HIP_vector_type<int, 2>') for 1st argument
    MGPU_HOST_DEVICE int2 min(int2 a, int2 b) {
                          ^
    src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/devicetypes.cuh:243:20: note: candidate template ignored: deduced conflicting types for parameter 'T' ('int' vs. 'hip_impl::Scalar_accessor<int, int __attribute__((ext_vector_type(2))), 1>')
    MGPU_HOST_DEVICE T min(T a, T b) {
                       ^
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:29:
    In file included from include/mxnet/operator_util.h:43:
    In file included from include/mxnet/base.h:32:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/./cuda/../tensor.h:16:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/base.h:29:
    In file included from ./hip-wrappers.h:8:
    In file included from /opt/rocm/include/hip/hip_runtime.h:56:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_runtime.h:57:
    In file included from /opt/rocm/include/hip/hip_runtime_api.h:348:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_runtime_api.h:44:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_texture_types.h:38:
    In file included from /opt/rocm/include/hip/hcc_detail/channel_descriptor.h:28:
    /opt/rocm/include/hip/hcc_detail/hip_vector_types.h:176:22: warning: unused variable 'r' [-Wunused-variable]
                    auto r{data[idx]};
                         ^
    /opt/rocm/rocrand/include/rocrand_philox4x32_10.h:284:26: note: in instantiation of member function 'hip_impl::Scalar_accessor<unsigned int, unsigned int __attribute__((ext_vector_type(4))), 0>::operator++' requested here
            m_state.counter.x++;
                             ^
    7 warnings and 1 error generated.
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:29:
    In file included from include/mxnet/operator_util.h:43:
    In file included from include/mxnet/./base.h:32:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/tensor.h:16:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/./base.h:29:
    In file included from ./hip-wrappers.h:8:
    In file included from /opt/rocm/include/hip/hip_runtime.h:56:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_runtime.h:105:
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:37:18: warning: comparison of integers of different signs: 'int32_t' (aka 'int') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if ((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0)) {
             ~~~~~~~ ^ ~~~~~
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:37:50: warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if ((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0)) {
                                                   ~ ^ ~~~~~~
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:54:20: warning: comparison of integers of different signs: 'int32_t' (aka 'int') and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if (!((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0))) {
               ~~~~~~~ ^ ~~~~~
    /opt/rocm/include/hip/hcc_detail/surface_functions.h:54:52: warning: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Wsign-compare]
        if (!((xOffset > width) || (xOffset < 0) || (y > height) || (y < 0))) {
                                                     ~ ^ ~~~~~~
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:34:
    src/operator/nn/./sequence_mask-inl.h:55:5: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
        for (index_t s = lengths[batch]; s < smax; ++s)
        ^
    src/operator/nn/./sequence_mask-inl.h:51:3: note: previous statement is here
      if (batch >= bmax)
      ^
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:35:
    In file included from src/operator/nn/../sequence_op_common.h:31:
    In file included from src/operator/nn/.././operator_common.h:42:
    src/operator/nn/../../common/cuda_utils.h:246:11: warning: enumeration values 'HIPRAND_STATUS_DOUBLE_PRECISION_REQUIRED' and 'HIPRAND_STATUS_NOT_IMPLEMENTED' not handled in switch [-Wswitch]
      switch (status) {
              ^
    In file included from src/operator/nn/ctc_loss.cu:27:
    In file included from src/operator/nn/../../../3rdparty/ctc_include/detail/gpu_ctc.h:25:
    In file included from src/operator/nn/../../../3rdparty/ctc_include/detail/gpu_ctc_kernels.h:23:
    In file included from src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/ctascan.cuh:38:
    src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/deviceutil.cuh:68:13: error: no matching function for call to 'min'
            range.x += min(block, task.y);
                       ^~~
    src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/devicetypes.cuh:260:23: note: candidate function not viable: no known conversion from 'int' to 'int2' (aka 'HIP_vector_type<int, 2>') for 1st argument
    MGPU_HOST_DEVICE int2 min(int2 a, int2 b) {
                          ^
    src/operator/nn/../../../3rdparty/ctc_include/detail/../contrib/moderngpu/include/device/devicetypes.cuh:243:20: note: candidate template ignored: deduced conflicting types for parameter 'T' ('int' vs. 'hip_impl::Scalar_accessor<int, int __attribute__((ext_vector_type(2))), 1>')
    MGPU_HOST_DEVICE T min(T a, T b) {
                       ^
    In file included from src/operator/nn/ctc_loss.cu:26:
    In file included from src/operator/nn/./ctc_loss-inl.h:29:
    In file included from include/mxnet/operator_util.h:43:
    In file included from include/mxnet/base.h:32:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/./cuda/../tensor.h:16:
    In file included from /disk/zhanged/code/mxnet/3rdparty/mshadow/mshadow/base.h:29:
    In file included from ./hip-wrappers.h:8:
    In file included from /opt/rocm/include/hip/hip_runtime.h:56:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_runtime.h:57:
    In file included from /opt/rocm/include/hip/hip_runtime_api.h:348:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_runtime_api.h:44:
    In file included from /opt/rocm/include/hip/hcc_detail/hip_texture_types.h:38:
    In file included from /opt/rocm/include/hip/hcc_detail/channel_descriptor.h:28:
    /opt/rocm/include/hip/hcc_detail/hip_vector_types.h:176:22: warning: unused variable 'r' [-Wunused-variable]
                    auto r{data[idx]};
                         ^
    /opt/rocm/rocrand/include/rocrand_philox4x32_10.h:284:26: note: in instantiation of member function 'hip_impl::Scalar_accessor<unsigned int, unsigned int __attribute__((ext_vector_type(4))), 0>::operator++' requested here
            m_state.counter.x++;
                             ^
    7 warnings and 1 error generated.
    Makefile:507: recipe for target 'build/src/operator/nn/ctc_loss_gpu.o' failed
    make: *** [build/src/operator/nn/ctc_loss_gpu.o] Error 1
    

    What have I tried to solve it?

    The compiler is confused about the min function, so I tried to modify the code as shown below:

    diff --git a/3rdparty/ctc_include/contrib/moderngpu/include/device/deviceutil.cuh b/3rdparty/ctc_include/contrib/moderngpu/include/device/deviceutil.cuh
    index e18807f38..fb4f08f21 100644
    --- a/3rdparty/ctc_include/contrib/moderngpu/include/device/deviceutil.cuh
    +++ b/3rdparty/ctc_include/contrib/moderngpu/include/device/deviceutil.cuh
    @@ -65,7 +65,7 @@ MGPU_HOST int2 DivideTaskRange(int numItems, int numWorkers) {
     MGPU_HOST_DEVICE int2 ComputeTaskRange(int block, int2 task) {
            int2 range;
            range.x = task.x * block;
    -       range.x += min(block, task.y);
    +       range.x += min(block, (int)task.y);
            range.y = range.x + task.x + (block < task.y);
            return range;
     }
    

    After this modification, MXNet builds successfully and the demo bdk_demo.py runs successfully on my Vega 20 card. So the cause is the int2 type or the template function min. Does someone have an idea? Thanks.

    opened by andyzhanged 0
  • updates to Makefile

    Ubuntu - compiled from source

    Issue: missing hipfft (command: 'make -j24')

    What have you tried to solve it?

    1. Changed Makefile: hipfft references to rocfft
    2. Changed opencv and openmp values to 0 in /mxnet/make/config.mk
    opened by aleksthegreat 0
  • AMD CPU mxnet performance benchmarks

    For bugs or installation issues, please provide the following information. Is it possible for someone at AMD to run benchmarks on AMD CPUs? To run benchmarks you just need to run the following:

    python example/image-classification/benchmark_score.py
    

    Can this benchmark be run on the default mxnet package and the pip mxnet-mkl package?

    There is a thread to make the mxnet-mkl package the default mxnet package, so we need to check how the performance changes for AMD users. https://lists.apache.org/thread.html/ca3c858a712a66f4b3443dfb395d84e8fdf2c86e297bcd28a3a36ebd@%3Cdev.mxnet.apache.org%3E

    Environment info

    Operating System: Linux

    Compiler:

    Package used (Python/R/Scala/Julia): Python

    MXNet version: 1.3.0 (Just pip mxnet and pip mxnet-mkl which will get 1.3.0)

    Or if installed from source:

    MXNet commit hash (git rev-parse HEAD):

    If you are using python package, please provide 1.3.0 (Just pip mxnet and pip mxnet-mkl which will get 1.3.0)

    Python version and distribution:

    If you are using R package, please provide

    R sessionInfo():

    Error Message:

    Please paste the full error message, including stack trace.

    Minimum reproducible example

    if you are using your own code, please provide a short script that reproduces the error.

    Steps to reproduce

    or if you are running standard examples, please provide the commands you have run that lead to the error.

    1. pip install mxnet
    2. Git clone repo to get benchmark script
    3. On line 73 of the example/image-classification/benchmark_score.py remove gpus from the list. (Change the devs list to devs = [mx.cpu()])
    4. Run the script python example/image-classification/benchmark_score.py
    5. Repeat this with pip install mxnet-mkl

    What have you tried to solve it?

    opened by azai91 0
  • Problems building mxnet

    For bugs or installation issues, please provide the following information. The more information you provide, the more likely people will be able to help you.

    Environment info

    Operating System: Ubuntu 16.04.3 LTS
    Compiler: hipcc / hcc (clang 6, see version output below)

    hipcc --version
    HIP version: 1.4.17494
    HCC clang version 6.0.0 (ssh://gerritgit/compute/ec/hcc-tot/clang 42ceed861a212d9bd0aef883ee7981144f3ecc02) (ssh://gerritgit/compute/ec/hcc-tot/llvm 23e086be6f627e6e983c6789d2e77da6bf85ebb6) (based on HCC 1.1.17493-2f85d8a-42ceed8-23e086b )
    Target: x86_64-unknown-linux-gnu
    Thread model: posix
    InstalledDir: /opt/rocm/hcc/bin

    Package used (Python/R/Scala/Julia):

    MXNet version:

    Or if installed from source:

    MXNet commit hash (git rev-parse HEAD): d053ae86d5327ca36315b9a0646989678fff335d

    If you are using python package, please provide

    Python version and distribution:

    If you are using R package, please provide

    R sessionInfo():

    Error Message:

    Please paste the full error message, including stack trace. The initial issue was that, with the latest rocm (1.7.60) installed from the repositories, there was a problem with rocBLAS and hcRNG was missing, so I built them from git. hcFFT was available as expected. At this point mxnet appears to compile, but multiple errors are reported. I'm attaching a build log from the second build attempt so it is less noisy.

    I am also using cuda 9.1 but I did try cuda 8 which also failed. The environment vars in both cases were: LD_LIBRARY_PATH=/usr/local/cuda/lib64 (this symlinked to 8 or 9.1 depending on what is installed) HIP_PLATFORM=hcc

    The current git version of mxnet also does not need the Makefile modification presented, since it is already there.

    build.log

    Minimum reproducible example

    if you are using your own code, please provide a short script that reproduces the error.

    Steps to reproduce

    or if you are running standard examples, please provide the commands you have run that lead to the error.

    1. make -j $(nproc)

    What have you tried to solve it?

    The first stoppage in the log...

    41 warnings and 2 errors generated. Died at /opt/rocm/bin/hipcc line 500

    ...refers to a line in the hipcc script...

    495 if ($runCmd) {
    496     if ($HIP_PLATFORM eq "hcc" and exists($hipConfig{'HCC_VERSION'}) and $HCC_VERSION ne $hipConfig{'HCC_VERSION'}) {
    497         print ("HIP ($HIP_PATH) was built using hcc $hipConfig{'HCC_VERSION'}, but you are using $HCC_HOME/hcc with version $HCC_VERSION from hipcc. Please rebuild HIP including cmake or update HCC_HOME variable.\n") ;
    498         die unless $ENV{'HIP_IGNORE_HCC_VERSION'};
    499     }
    500     system ("$CMD") and die ();
    501 }

    However, my HIP configuration appears to be good...

    hipconfig
    HIP version  : 1.4.17494

    == hipconfig
    HIP_PATH     : /opt/rocm
    HIP_PLATFORM : hcc
    CPP_CONFIG   : -D__HIP_PLATFORM_HCC__= -I/opt/rocm/include -I/opt/rocm/hcc/include

    == hcc
    HSA_PATH     : /opt/rocm/hsa
    HCC_HOME     : /opt/rocm/hcc
    HCC clang version 6.0.0 (ssh://gerritgit/compute/ec/hcc-tot/clang 42ceed861a212d9bd0aef883ee7981144f3ecc02) (ssh://gerritgit/compute/ec/hcc-tot/llvm 23e086be6f627e6e983c6789d2e77da6bf85ebb6) (based on HCC 1.1.17493-2f85d8a-42ceed8-23e086b )
    Target: x86_64-unknown-linux-gnu
    Thread model: posix
    InstalledDir: /opt/rocm/hcc/bin
    LLVM (http://llvm.org/):
      LLVM version 6.0.0svn
      Optimized build.
      Default target: x86_64-unknown-linux-gnu
      Host CPU: znver1

    Registered Targets:
      amdgcn - AMD GCN GPUs
      r600   - AMD GPUs HD2XXX-HD6XXX
      x86    - 32-bit X86: Pentium-Pro and above
      x86-64 - 64-bit X86: EM64T and AMD64
    HCC-cxxflags : -hc -std=c++amp -I/opt/rocm/hcc-1.0/include -I/opt/rocm/include
    HCC-ldflags  : -hc -std=c++amp -L/opt/rocm/hcc-1.0/lib -Wl,--rpath=/opt/rocm/hcc-1.0/lib -ldl -lm -lpthread -lunwind -lhc_am -Wl,--whole-archive -lmcwamp -Wl,--no-whole-archive

    === Environment Variables
    PATH=/opt/rocm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
    LD_LIBRARY_PATH=/usr/local/cuda/lib64
    HIP_PLATFORM=hcc

    == Linux Kernel
    Hostname : Linux 4.4.0-109-generic #132-Ubuntu SMP Tue Jan 9 19:52:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 16.04.3 LTS
    Release:        16.04
    Codename:       xenial

    ~ ~ ~

    I'm not sure what to try next. My guess is that there are some function differences between mxnet code and the larger requirements but I don't know how to resolve that.

    opened by kcperry 0